Chapter 3. Processing – MapReduce and Beyond

In Hadoop 1, the platform had two clear components: HDFS for data storage and MapReduce for data processing. The previous chapter described the evolution of HDFS in Hadoop 2; in this chapter we'll discuss data processing.

Processing has changed more significantly in Hadoop 2 than storage has, and Hadoop now supports multiple processing models as first-class citizens. In this chapter we'll explore both MapReduce and other computational models in Hadoop 2. In particular, we'll cover:

  • What MapReduce is and the Java API required to write applications for it
  • How MapReduce is implemented in practice
  • How Hadoop reads data into and out of its processing jobs
  • YARN, the Hadoop 2 component that allows ...
