May 2017
Intermediate to advanced
348 pages
7h 8m
English
For redundancy, it is important to have multiple copies of data. In HDFS, this is achieved by placing copies of blocks on different nodes. By default, the replication factor is 3, which means that for each block written to HDFS, there will be three copies in total on the nodes in the cluster.
It is important to make sure that the cluster is working fine and the user can perform file operations on the cluster.
Log in to any of the nodes in the cluster. It is best to use the edge node, as stated in Chapter 1, and switch to the user hadoop.
Create a simple text file named file1.txt using any of your favorite text editors, and write some content in it.
ssh to the Namenode, which in this case is ...Read now
Unlock full access