Now that you have an understanding of how replication handles failures in DataNodes, let us
take a look at failure management for the NameNode. As you now realize, the NameNode is the
most important node in the cluster. Without the NameNode, we will have no idea which file is
stored on which DataNode. Of the mappings that the NameNode holds, the block
locations are not persistent. They are kept in memory for quick lookup, an
arrangement referred to here as block caching.
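The difference between a mapping that is persisted and one that lives only in memory can be sketched with a toy model. This is not HDFS code: `ToyNameNode`, its methods, and the JSON file are hypothetical names chosen for illustration, and the sketch assumes the file-to-block mapping is the part written to disk while block locations stay in memory.

```python
import json
import os
import tempfile


class ToyNameNode:
    """Toy model for illustration only; not how HDFS is implemented."""

    def __init__(self, metadata_path):
        self.metadata_path = metadata_path
        # file name -> list of block IDs (assumed persistent: saved to disk)
        self.file_to_blocks = {}
        # block ID -> set of DataNode names (memory only, never saved)
        self.block_locations = {}
        # On startup, reload whatever was persisted by an earlier instance.
        if os.path.exists(metadata_path):
            with open(metadata_path) as f:
                self.file_to_blocks = json.load(f)

    def add_file(self, name, block_ids):
        """Record a file's blocks and write the mapping to disk."""
        self.file_to_blocks[name] = list(block_ids)
        with open(self.metadata_path, "w") as f:
            json.dump(self.file_to_blocks, f)

    def record_location(self, block_id, datanode):
        """Remember, in memory only, that a DataNode holds a block."""
        self.block_locations.setdefault(block_id, set()).add(datanode)

    def locate(self, name):
        """Return block ID -> sorted list of DataNodes known to hold it."""
        return {
            block_id: sorted(self.block_locations.get(block_id, ()))
            for block_id in self.file_to_blocks.get(name, [])
        }


def demo():
    """Simulate a NameNode failure by building a fresh instance."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "namespace.json")
        namenode = ToyNameNode(path)
        namenode.add_file("/logs/a.txt", ["blk_1", "blk_2"])
        namenode.record_location("blk_1", "dn1")
        namenode.record_location("blk_1", "dn2")
        namenode.record_location("blk_2", "dn3")
        before = namenode.locate("/logs/a.txt")

        # A new instance sees only what was written to disk.
        restarted = ToyNameNode(path)
        after = restarted.locate("/logs/a.txt")
        return before, after
```

Calling `demo()` returns the lookup result before and after the simulated restart. In this toy model the restarted instance still knows which blocks make up the file, but every location list comes back empty.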
If the NameNode fails, then the file block locations in memory will be completely lost.
Therefore, ...