Understanding the Hadoop MapReduce fundamentals

To understand Hadoop MapReduce fundamentals properly, we will:

  • Understand MapReduce objects
  • Learn how to decide the number of Maps in MapReduce
  • Learn how to decide the number of Reduces in MapReduce
  • Understand MapReduce dataflow
  • Take a closer look at Hadoop MapReduce terminology

Understanding MapReduce objects

As we know, MapReduce operations in Hadoop are carried out mainly by three objects: Mapper, Reducer, and Driver.

  • Mapper: This is designed for the Map phase of MapReduce, which starts MapReduce operations by taking the input files and splitting them into several pieces. For each piece, it emits key-value pairs as its output.
  • Reducer: This is designed for the Reduce phase of a MapReduce ...
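
The division of labor among the three objects can be illustrated with a minimal word-count sketch. This is a plain Python simulation and not the Hadoop API: the function names mapper, reducer, and driver are hypothetical, the grouping step stands in for Hadoop's shuffle, and the reducer is assumed to simply sum the values collected for each key.

```python
from collections import defaultdict


def mapper(piece):
    """Map phase (illustrative): emit a (word, 1) pair for each word in one piece of input."""
    for word in piece.split():
        yield word.lower(), 1


def reducer(key, values):
    """Reduce phase (assumed behavior): combine all values that share a key into one result."""
    return key, sum(values)


def driver(pieces):
    """Driver (illustrative): wire the phases together and run the job."""
    grouped = defaultdict(list)
    # Run the mapper on every piece and group the emitted values by key,
    # standing in for the shuffle step between Map and Reduce.
    for piece in pieces:
        for key, value in mapper(piece):
            grouped[key].append(value)
    # Run the reducer once per key, in sorted key order.
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    print(driver(["Hadoop runs MapReduce", "MapReduce runs jobs"]))
```

In a real Hadoop job, the framework rather than the driver code would distribute the pieces across nodes and group the intermediate pairs; the sketch only shows which object is responsible for which step.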
