Hadoop ships with several benchmarks that you can use to verify that your HDFS cluster is set up properly and performs as expected. One of them, DFSIO, measures the I/O performance of an HDFS cluster. This recipe shows how to use DFSIO to benchmark the read and write performance of an HDFS cluster.
You must set up and deploy HDFS and Hadoop v2 YARN MapReduce before running these benchmarks. Locate the hadoop-mapreduce-client-jobclient-*-tests.jar file in your Hadoop installation.
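One way to locate the JAR is to search the installation directory by file name. The following is a minimal sketch: it assumes that the HADOOP_HOME environment variable points at the root of your Hadoop installation (it falls back to the current directory otherwise), and find_tests_jar is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: search a directory tree for the jobclient tests JAR.
find_tests_jar() {
  # $1 is the directory to search; errors from unreadable directories are discarded.
  find "$1" -type f -name 'hadoop-mapreduce-client-jobclient-*-tests.jar' 2>/dev/null
}

# Assumption: HADOOP_HOME is set to the Hadoop installation root.
# If it is unset, the current directory is searched instead.
find_tests_jar "${HADOOP_HOME:-.}"
```

The command prints the full path of each matching file. The exact location and version number in the file name depend on your Hadoop distribution.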
The following steps will show you how to run the write and read DFSIO performance benchmarks: