Setting the HDFS block size for a specific file in a cluster
In this recipe, we are going to take a look at how to set the block size for a specific file only.
Getting ready
To perform this recipe, you should already have a running Hadoop cluster.
How to do it...
In the previous recipe, we learned how to change the block size at the cluster level. But changing the cluster-wide default is not always what we want. HDFS also lets us set the block size for an individual file at write time. The following command copies a file called myfile to HDFS, setting its block size to 1MB:
hadoop fs -Ddfs.block.size=1048576 -put /home/ubuntu/myfile /
Note that in Hadoop 2.x and later the property is named dfs.blocksize (dfs.block.size is the deprecated name), and the value must be a multiple of the checksum chunk size (512 bytes by default).
Once the file is copied, you can verify that the block size is set to 1MB and that the file has been split into blocks of that size:
hdfs fsck /myfile -files -blocks
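As a quick sanity check on what fsck reports, the expected number of blocks can be computed from the file size and the block size: HDFS stores the file as full-size blocks plus one final partial block for the remainder. A minimal sketch, using an assumed file size purely for illustration:

```python
import math

# Assumed sizes, in bytes, for illustration only.
file_size = 5 * 1048576 + 300   # a hypothetical file just over 5 MB
block_size = 1048576            # 1 MB, as passed via -Ddfs.block.size

# HDFS splits the file into full blocks plus one final partial block,
# so the block count is the file size divided by the block size, rounded up.
num_blocks = math.ceil(file_size / block_size)
print(num_blocks)  # 6: five full 1 MB blocks and one 300-byte block
```

The last block occupies only as much disk space as it actually holds, so a small block size mainly costs NameNode metadata (more block entries per file), not wasted storage.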