Benchmarking and profiling a Hadoop cluster

Benchmarking of a Hadoop cluster is the first step to tune the performance of a Hadoop cluster. We can also use Hadoop benchmarks to identify configuration problems and use it as reference for performance tuning. For example, by comparing the local benchmark with clusters with similar configurations, we can have a general understanding of the cluster performance.

Typically, we benchmark a Hadoop cluster after the cluster is newly configured and before putting it to service to accept jobs. This is because, when clients can submit jobs, the benchmarks can be perplexed by the client's jobs to show the real performance of a Hadoop cluster, and also the benchmark jobs can cause inconveniences for the clients. ...

Get Hadoop Operations and Cluster Management Cookbook now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.