Chapter 16
Deploying Hadoop
In This Chapter
Examining the components that make up a Hadoop cluster
Designing the Hadoop cluster components
Reviewing Hadoop deployment form factors
Sizing a Hadoop cluster
At its core, Hadoop is a system for storing and processing data at massive scale using a cluster of many individual compute nodes. In this chapter, we describe the tasks involved in building a Hadoop cluster, from the hardware components in individual compute nodes, to common cluster configuration patterns, to sizing a cluster appropriately. In at least one way, Hadoop is no different from many other IT systems: If you don’t design your cluster to match your business requirements, you get bad results.
Working with Hadoop Cluster Components
While you’re getting your feet wet with Hadoop, you’re likely to limit yourself to using a pseudo-distributed cluster running in a virtual machine on a personal computer. Though this environment is a good one for testing and learning, it’s obviously inappropriate for production-level performance and scalability. In this section, ...
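To make the pseudo-distributed pattern concrete, the sketch below shows the two configuration properties that typically distinguish it from the out-of-the-box standalone mode: HDFS is pointed at the local machine, and block replication is dropped to 1 because all daemons share a single node. The property names are standard Hadoop settings; the port number and the use of `localhost` are common conventions rather than requirements.

```xml
<!-- core-site.xml: direct the default filesystem at a single-node HDFS -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- hdfs-site.xml: keep one copy of each block, since there is only one DataNode -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

A setup like this is fine for testing and learning, but the single-node assumption is exactly what makes it unsuitable for production, where replication and node count are central to both durability and performance.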