Configuring HDFS federation
The Hadoop NameNode keeps the filesystem metadata in main memory. When the HDFS namespace becomes large, the NameNode's memory can become a bottleneck for the whole cluster. HDFS federation was introduced in the Hadoop releases that ship MRv2. It increases NameNode capacity and throughput by using multiple independent NameNodes, each of which manages part of the HDFS namespace.
Getting ready
Currently, only Hadoop MRv2 supports NameNode federation, so we assume that Hadoop MRv2 has been properly configured on all the cluster machines.
Note
We assume that all configuration changes are made in the $HADOOP_CONF_DIR/hdfs-site.xml file.
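As a rough illustration of the kind of change this involves, the following is a minimal sketch of a federated hdfs-site.xml with two NameNodes. It is not taken from the recipe: the nameservice IDs (ns1, ns2), the hostnames (nn1.example.com, nn2.example.com), and the port numbers are assumptions made for this example. The property names follow the Hadoop 2.x convention; the 0.23 releases used dfs.federation.nameservices in place of dfs.nameservices, so check the documentation for your version.

```xml
<configuration>
  <!-- Logical names for the independent namespaces (hypothetical IDs). -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1,ns2</value>
  </property>

  <!-- RPC and web UI addresses of the NameNode serving ns1 (assumed host and ports). -->
  <property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>nn1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1</name>
    <value>nn1.example.com:50070</value>
  </property>

  <!-- RPC and web UI addresses of the NameNode serving ns2 (assumed host and ports). -->
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>nn2.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns2</name>
    <value>nn2.example.com:50070</value>
  </property>
</configuration>
```

Each nameservice ID appears as a suffix on the per-NameNode properties, which is how one shared configuration file can describe several NameNodes. The same file is normally distributed to every node so that DataNodes can register with all of the NameNodes.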
How to do it...
Use the following steps to configure HDFS federation:
- Log ...