In this recipe, we will look at the memory requirements per node in the cluster, with particular attention to the memory on Datanodes in relation to their storage.
A large cluster with many Datanodes is of little use if its nodes do not have enough memory to serve requests. A Namenode keeps all of the file system metadata in memory and must also process the block reports sent by every Datanode in the cluster. The larger the cluster, the larger the block reports, and the more resources the Namenode requires.
The intent of this recipe is not to tune the cluster for memory, but to estimate the memory required per node.
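To make the relationship between cluster size and Namenode memory concrete, here is a minimal back-of-the-envelope sketch. It is not taken from this recipe: the function names are hypothetical, and the figure of roughly 1 GB of Namenode heap per million blocks is a commonly quoted rule of thumb that is assumed here, not a measured value. It also assumes that the raw capacity is filled entirely with full-size blocks, so real clusters with many small files will need more.

```python
import math


def estimate_block_count(raw_storage_tb, block_size_mb=128, replication=3):
    """Rough number of unique HDFS blocks if the raw capacity were full.

    Assumption: every block is full size; small files would raise the count.
    """
    usable_mb = raw_storage_tb * 1024 * 1024 / replication
    return math.ceil(usable_mb / block_size_mb)


def estimate_namenode_heap_gb(block_count, gb_per_million_blocks=1.0):
    """Heap estimate from the assumed rule of thumb (1 GB per million blocks)."""
    return max(1, math.ceil(block_count / 1_000_000 * gb_per_million_blocks))


if __name__ == "__main__":
    # Hypothetical cluster: 20 Datanodes with 24 TB of raw disk each.
    blocks = estimate_block_count(raw_storage_tb=20 * 24)
    print(blocks, estimate_namenode_heap_gb(blocks))
```

For the hypothetical 480 TB cluster above, this works out to about 1.3 million blocks and a 2 GB heap estimate. Treat the result as a starting point for sizing, not as a tuned value.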
To complete this recipe, the user must have completed the Disk space calculations ...