Optimizing Hadoop for MapReduce by Khaled Tannir

Factors affecting the performance of MapReduce

Many factors can affect how long MapReduce takes to process input data. One is the algorithm you use to implement your map and reduce functions; others are external to your code. Based on our experience and observation, the following are the major factors that may affect MapReduce performance:

  • Hardware (or resources) such as CPU clock speed, disk I/O, network bandwidth, and memory size.
  • The underlying storage system.
  • The size of the input, shuffle, and output data, which is closely correlated with the runtime of a job.
  • Job algorithms (or programs) such as map, reduce, partition, combine, and compress. Some algorithms may be hard to conceptualize ...
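
To make the link between the combine step and shuffle data size concrete, here is a minimal sketch in plain Python. It does not use Hadoop or any Hadoop API; it only simulates a word count and counts the (key, value) records that would cross the shuffle. The function names and the SPLITS sample data are hypothetical, chosen for illustration.

```python
from collections import Counter


def map_words(line):
    """Map step: emit one (word, 1) pair per word in the line."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combine step: sum counts per word locally, before the shuffle."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())


def shuffle_record_count(splits, use_combiner):
    """Count the (key, value) records that would cross the shuffle."""
    total = 0
    for split in splits:
        # Each split stands in for the input of one map task.
        pairs = [pair for line in split for pair in map_words(line)]
        if use_combiner:
            pairs = combine(pairs)
        total += len(pairs)
    return total


# Hypothetical input: two splits, each handled by one simulated map task.
SPLITS = [
    ["to be or not to be", "to be is to do"],
    ["do be do be do"],
]

if __name__ == "__main__":
    print(shuffle_record_count(SPLITS, use_combiner=False))  # 16
    print(shuffle_record_count(SPLITS, use_combiner=True))   # 8
```

In this toy input the combiner halves the number of shuffled records, because repeated words within a split collapse into a single pair. How much a real job gains depends on how often keys repeat within each map task's output.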
