Hadoop MapReduce v2 Cookbook - Second Edition by Thilina Gunarathne

Using multiple disks/volumes and limiting HDFS disk usage

Hadoop supports specifying multiple directories for the DataNode data directory. This lets us use multiple disks/volumes to store data blocks on each DataNode. Hadoop tries to store equal amounts of data in each directory. It also supports limiting the amount of disk space used by HDFS.

How to do it...

The following steps will show you how to add multiple disk volumes:

  1. Create HDFS data storage directories in each volume.
  2. Locate the hdfs-site.xml configuration file. Provide a comma-separated list of directories corresponding to the data storage locations in each volume under the dfs.datanode.data.dir property as follows:
    <property> <name>dfs.datanode.data.dir</name> <value>/u1/hadoop/data, ...
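The two steps above can be sketched in Python as a rough illustration; this is not part of the original recipe. The helper names `make_data_dirs` and `data_dir_property`, the `hadoop/data` subdirectory layout, and the `u1`/`u2` volume names are assumptions made for this example. It creates one data directory per volume and renders the matching `dfs.datanode.data.dir` property, which you would then paste into hdfs-site.xml by hand.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def make_data_dirs(volume_roots, subdir="hadoop/data"):
    """Create one HDFS data directory under each volume root and return the paths.

    The subdir layout is an assumption for this sketch, not a Hadoop requirement.
    """
    data_dirs = []
    for root in volume_roots:
        path = os.path.join(root, subdir)
        os.makedirs(path, exist_ok=True)  # no error if the directory already exists
        data_dirs.append(path)
    return data_dirs


def data_dir_property(data_dirs):
    """Render a <property> element for dfs.datanode.data.dir as a string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "dfs.datanode.data.dir"
    # The property value is a comma-separated list of directories.
    ET.SubElement(prop, "value").text = ",".join(data_dirs)
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical volume roots under a temporary directory; on a real
    # DataNode these would be the mount points of the physical disks.
    base = tempfile.mkdtemp()
    volumes = [os.path.join(base, name) for name in ("u1", "u2")]
    print(data_dir_property(make_data_dirs(volumes)))
```

Generating the value from the same list that created the directories keeps the two steps in sync, so the configuration cannot name a directory that was never created.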
