Optimizing Hadoop for MapReduce by Khaled Tannir

Chapter 6. Optimizing MapReduce Tasks

Most MapReduce programs are written for data analysis, and they usually take a long time to finish. Many companies are embracing Hadoop for advanced data analytics over large datasets that require completion-time guarantees. Efficiency, especially the I/O cost of MapReduce, still needs to be addressed for these deployments to succeed.

In this chapter, we will discuss optimization techniques, such as using compression and Combiners, that improve job execution. You will also learn basic guidelines and rules for optimizing your mapper and reducer code, along with techniques for using and reusing object instances.

The following topics will be covered in this chapter:

  • The benefits of using Combiners ...
