Hadoop 2.x Administration Cookbook by Gurmukh Singh

HDFS health and FSCK

The health of the filesystem directly affects data retrieval and performance. In a distributed system, it is even more important to keep the HDFS filesystem healthy, so that blocks stay properly replicated and data blocks can be streamed in near-parallel.

In this recipe, we will see how to check the health of the filesystem and make repairs where they are needed.

Getting ready

Make sure you have a running cluster that has been up for a few days and holds data. The commands also work on a new cluster, but for this lab they give more insight when run on a cluster with a large dataset.

How to do it...

  1. Connect over SSH to the NameNode master1.cyrus.com and switch to the hadoop user.
  2. To check the HDFS ...
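
A filesystem health check of this kind is commonly done with the `hdfs fsck` command. The following is a minimal sketch, not necessarily the commands this recipe uses, of how the summary at the end of an fsck report could be checked from a script. It assumes the `hdfs` binary is on the PATH of the hadoop user. The function names `run_fsck` and `parse_fsck_summary` are hypothetical, and `SAMPLE` is a hand-written approximation of the report layout used only to exercise the parser, not output captured from a real cluster.

```python
import re
import subprocess


def run_fsck(path="/"):
    """Run `hdfs fsck <path>` and return its text output.

    Assumption: the `hdfs` command is on PATH and a cluster is reachable.
    """
    result = subprocess.run(
        ["hdfs", "fsck", path],
        capture_output=True,
        text=True,
        check=False,  # inspect the report text rather than raise on a non-zero exit
    )
    return result.stdout


def parse_fsck_summary(report):
    """Extract the overall status and a few block counters from an fsck report."""
    summary = {}

    # The last line of a report states whether the path is HEALTHY or CORRUPT.
    status = re.search(r"is (HEALTHY|CORRUPT)\s*$", report, re.MULTILINE)
    summary["status"] = status.group(1) if status else "UNKNOWN"

    # Counters appear as "<label>: <number>", optionally followed by a percentage.
    for label in ("Under-replicated blocks", "Corrupt blocks", "Missing replicas"):
        match = re.search(rf"{re.escape(label)}:\s*(\d+)", report)
        if match:
            summary[label] = int(match.group(1))
    return summary


# Hand-written sample approximating the report layout (illustrative values only).
SAMPLE = """\
 Total blocks (validated):\t12 (avg. block size 1048576 B)
 Under-replicated blocks:\t2 (16.666666 %)
 Corrupt blocks:\t\t0
 Missing replicas:\t\t2 (8.333333 %)
The filesystem under path '/' is HEALTHY
"""
```

On a live cluster, `parse_fsck_summary(run_fsck("/"))` would return a dictionary such as the one produced from `SAMPLE`. A status other than `HEALTHY`, or a non-zero corrupt block count, indicates that the filesystem needs attention.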