Chapter 4. Exploring HDFS Federation and Its High Availability

You are now ready to set up a Hadoop cluster using CDH5. Once the cluster is up and running, you are responsible for managing it and keeping it available at all times. In this chapter, we will look at techniques to manage HDFS efficiently and to handle the single point of failure in a Hadoop cluster. We will cover the following topics:

  • Configuring HDFS Federation
  • HDFS high availability using Quorum-based storage and using shared storage on Network File System (NFS)
  • JobTracker high availability

The heart of HDFS is the namenode. The namenode manages the locations of all data blocks in the cluster. To serve requests faster, the namenode manages all its ...
