Hadoop supports specifying multiple directories for the DataNode data directory. This feature allows us to use multiple disks or volumes to store data blocks on each DataNode. Hadoop tries to store equal amounts of data in each directory; by default, it assigns new blocks to the configured directories in round-robin order. It also supports limiting the amount of disk space used by HDFS on each volume.
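As a minimal sketch of the disk-space limit, the dfs.datanode.du.reserved property in hdfs-site.xml reserves a number of bytes on each volume for non-HDFS use, which caps what HDFS can consume there. The 10 GiB figure below is an illustrative assumption, not a recommended value:

```xml
<!-- Sketch only: the reserved size is an assumed example value. -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- Bytes kept free per volume for non-HDFS use: 10 * 1024^3 = 10 GiB -->
  <value>10737418240</value>
</property>
```

The reservation applies to each volume separately, so a DataNode with several data directories on different disks keeps that much space free on every one of them.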
The following steps will show you how to add multiple disk volumes:
In the hdfs-site.xml configuration file, provide a comma-separated list of directories corresponding to the data storage locations in each volume under the dfs.datanode.data.dir property as follows:
<property> <name>dfs.datanode.data.dir</name> <value>/u1/hadoop/data, ...
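For reference, a complete, well-formed property of this kind might look like the following sketch. The /u1/hadoop/data path comes from the snippet above; /u2/hadoop/data is a hypothetical second volume added purely for illustration, so substitute the mount points that actually exist on your DataNodes:

```xml
<!-- Sketch only: /u2/hadoop/data is an assumed path, not taken from the original. -->
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- One directory per physical disk or volume, separated by commas -->
  <value>/u1/hadoop/data,/u2/hadoop/data</value>
</property>
```

Each listed directory should sit on a different physical disk; pointing several entries at the same disk adds no capacity or throughput. The DataNode typically needs a restart to pick up a changed list.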