Time for action – the replication factor in action
Let's repeat the preceding process, but this time, kill two DataNodes out of our cluster of four. We will give an abbreviated walk-through of the activity as it is very similar to the previous Time for action section:
- Restart the dead DataNode and monitor the cluster until all nodes are marked as live.
- Pick two DataNodes, find the process ID of the DataNode process on each, and kill those processes.
- As done previously, wait for around 10 minutes, then actively monitor the cluster state via dfsadmin, paying particular attention to the reported number of under-replicated blocks.
- Wait until the cluster has stabilized with an output similar to the following:
Configured Capacity: 61032370176 (56.84 GB) Present Capacity: 45842373555 ...
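To make the monitoring step less manual, the under-replicated block count can be pulled out of the report text programmatically. The following is a minimal sketch, not part of the original walk-through. It assumes the report consists of `Key: value` lines with the cluster-wide summary printed before any per-DataNode sections, and that the count appears on a line labelled `Under replicated blocks`. The function names are hypothetical.

```python
def parse_dfsadmin_report(report_text):
    """Collect 'Key: value' lines from a dfsadmin report into a dict.

    Assumption: the cluster-wide summary comes first, so when a key
    repeats in later per-DataNode sections, the first value is kept.
    """
    fields = {}
    for line in report_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # separator or blank line, no "key: value" pair
        key, value = key.strip(), value.strip()
        if key and value and key not in fields:
            fields[key] = value
    return fields


def under_replicated_blocks(report_text):
    """Return the under-replicated block count, or None if not reported."""
    value = parse_dfsadmin_report(report_text).get("Under replicated blocks")
    if value is not None and value.isdigit():
        return int(value)
    return None
```

One way to use this would be to capture the output of `hadoop dfsadmin -report` (for example with Python's `subprocess.run`) every minute or so and pass it to `under_replicated_blocks`, watching how the count changes until it stops moving.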