Chapter 3. Processing – MapReduce and Beyond

In Hadoop 1, the platform had two clear components: HDFS for data storage and MapReduce for data processing. The previous chapter described the evolution of HDFS in Hadoop 2; in this chapter we'll discuss data processing.

Processing has changed more significantly in Hadoop 2 than storage has, and Hadoop now supports multiple processing models as first-class citizens. In this chapter we'll explore both MapReduce and other computational models in Hadoop 2. In particular, we'll cover:

What MapReduce is and the Java API required to write applications for it
How MapReduce is implemented in practice
How Hadoop reads data into and out of its processing jobs
YARN, the Hadoop 2 component that allows ...

Learning Hadoop 2 by Garry Turkington, Gabriele Modena
