July 2017
Beginner to intermediate
378 pages
10h 26m
English
A Hadoop cluster has one MasterNode and one to many slave nodes. The MasterNode operates what is called the NameNode. The role of the NameNode is to track which other nodes are healthy and some other key information about the cluster, such as file locations. There is also the role of the ResourceManager, one per cluster, which may or may not be on the same server. You will learn more about the NameNode in the HDFS section and more about the ResourceManager in the YARN section.
The rest of the machines in the cluster act as both DataNode and NodeManager. These are the workers. This is where, you guessed it, the data is distributed and where the distributed computing happens. There are several types of slave node that serve different ...