Identifying resource bottlenecks

Typically, a bottleneck occurs when one resource of the system takes longer than it should to finish its tasks and forces other resources to wait, which degrades overall system performance.
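To make this definition concrete, here is a minimal Python sketch that flags the most saturated resource from a set of utilization samples. The function name `find_bottleneck`, the resource names, the sample values, and the 0.9 threshold are all illustrative assumptions and do not come from any Hadoop tool; in practice the samples would come from whatever monitoring you already run on the cluster nodes.

```python
from statistics import mean


def find_bottleneck(samples, threshold=0.9):
    """Return the name of the most saturated resource, or None.

    `samples` maps a resource name to a list of utilization readings
    expressed as fractions between 0.0 and 1.0 (hypothetical input format).
    """
    # Average each resource's readings, skipping resources with no data.
    averages = {name: mean(values) for name, values in samples.items() if values}
    if not averages:
        return None
    # The candidate bottleneck is the resource with the highest average load.
    name = max(averages, key=averages.get)
    # Report it only if it is close to saturation (assumed threshold).
    return name if averages[name] >= threshold else None


# Illustrative readings: the disk is nearly saturated while CPU and network idle.
samples = {
    "cpu": [0.35, 0.40, 0.38],
    "disk_io": [0.97, 0.99, 0.95],
    "network": [0.20, 0.25, 0.22],
}
print(find_bottleneck(samples))
```

A resource that sits near full utilization while the others stay mostly idle is the likely candidate, because the idle resources are probably waiting on it.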

Before you dive into tuning your Hadoop cluster, it is good practice to make sure that the cluster is stable and that your MapReduce jobs are operational. We suggest that you verify that the hardware components of your cluster are configured correctly and, if necessary, upgrade the software components of the Hadoop stack to the latest stable version. You can also run a MapReduce job such as TeraSort or PI Estimator to stress your cluster. This is a very important step toward a healthy, optimized Hadoop cluster. ...
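As a rough sketch of such a stress run, the Python helper below only builds the command lines for the PI Estimator and the TeraGen, TeraSort, and TeraValidate sequence; it does not execute them. The program names `pi`, `teragen`, `terasort`, and `teravalidate` are the ones shipped in the Hadoop examples JAR, but the JAR file name, the working directory, and the job sizes shown here are assumptions that you should adapt to your own installation.

```python
import shlex


def stress_commands(examples_jar, rows=1_000_000, work_dir="/tmp/stress"):
    """Build (but do not run) command lines for a basic cluster stress test.

    `examples_jar` is the path to the Hadoop examples JAR; its name and
    location differ between Hadoop versions, so it is passed in explicitly.
    """
    base = ["hadoop", "jar", examples_jar]
    return [
        # PI Estimator: 10 map tasks, 1000 samples per map (assumed sizes).
        base + ["pi", "10", "1000"],
        # TeraGen writes `rows` rows of 100 bytes each as TeraSort input.
        base + ["teragen", str(rows), f"{work_dir}/input"],
        # TeraSort sorts the generated data.
        base + ["terasort", f"{work_dir}/input", f"{work_dir}/output"],
        # TeraValidate checks that the output is globally sorted.
        base + ["teravalidate", f"{work_dir}/output", f"{work_dir}/report"],
    ]


# "hadoop-examples.jar" is a placeholder, not a real path on your system.
for cmd in stress_commands("hadoop-examples.jar"):
    print(shlex.join(cmd))
```

Printing the commands first lets you review them before running anything against the cluster; each list can then be passed to `subprocess.run` if you want to automate the run.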
