Understanding the Hadoop MapReduce fundamentals
To understand Hadoop MapReduce fundamentals properly, we will:
- Understand MapReduce objects
- Learn how to decide the number of Maps in MapReduce
- Learn how to decide the number of Reduces in MapReduce
- Understand MapReduce dataflow
- Take a closer look at Hadoop MapReduce terminology
Understanding MapReduce objects
As we know, MapReduce operations in Hadoop are carried out mainly by three objects: Mapper, Reducer, and Driver.
- Mapper: This is designed for the Map phase of MapReduce, which starts MapReduce operations by taking the input files and splitting them into several pieces. For each piece, it emits key-value pairs as its output.
- Reducer: This is designed for the Reduce phase of a MapReduce ...
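To make the roles of the three objects more concrete, here is a minimal sketch in plain Python. It is not the Hadoop API: the function names mapper, reducer, and driver, the word-count task, and the in-memory grouping step are all assumptions chosen for illustration. In a real Hadoop job, the framework itself performs the splitting, shuffling, and sorting.

```python
from collections import defaultdict


def mapper(piece):
    """Map phase (illustrative): emit a (word, 1) pair for each word in one input piece."""
    for word in piece.split():
        yield word.lower(), 1


def reducer(key, values):
    """Reduce phase (illustrative): combine all values that share a key."""
    return key, sum(values)


def driver(pieces):
    """Driver (illustrative): wire the two phases together and run the job."""
    grouped = defaultdict(list)
    # Run the mapper on every piece and group the emitted values by key.
    # Hadoop does this grouping in its shuffle step; here it is a plain dict.
    for piece in pieces:
        for key, value in mapper(piece):
            grouped[key].append(value)
    # Run the reducer once per key, in sorted key order.
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    # Two hypothetical input pieces standing in for the split input files.
    print(driver(["big data big", "data tools"]))
```

Calling driver on the two sample pieces counts each word across both, which shows how pairs emitted by separate Mapper calls are brought together by key before the Reducer sees them.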