Jobs with multiple mappers and a single reducer are used in join use cases. In this design pattern, input is read from multiple input files to produce joined or aggregated output:
Scenario |
We have to find the average temperature for each city, but the data is split across two files with different schemas, one for cities and the other for temperatures. Input file 1: city ID to name. Input file 2: temperature for each city per day. |
Map (Key, Value) |
Map 1 (for input 1): We need to write a program that splits each record into cityID and Name, then prepares the key/value pair (cityID, Name). Map 2 (for input 2): We need to write ... |
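
The two-mapper pattern described above can be sketched in plain Python, without Hadoop, to show how records from both files meet at the reducer under the same cityID key. This is only an illustrative simulation under stated assumptions: the comma-separated record layout, the sample data, the `NAME`/`TEMP` source tags, the output of the second mapper, and the reducer logic are all hypothetical, since the source does not spell them out. The function names (`map_cities`, `map_temperatures`, `shuffle`, `reduce_city`, `run_job`) are likewise invented for this sketch.

```python
from collections import defaultdict

# Hypothetical sample inputs; the comma-separated layout is an assumption.
CITY_LINES = ["1,Paris", "2,Oslo"]                       # input file 1: cityID,Name
TEMP_LINES = ["1,20.0", "1,24.0", "2,5.0", "2,7.0"]      # input file 2: cityID,temperature


def map_cities(line):
    """Map 1: split a city record and emit (cityID, Name), tagged by source."""
    city_id, name = line.split(",", 1)
    yield city_id, ("NAME", name)


def map_temperatures(line):
    """Map 2 (assumed): split a temperature record and emit (cityID, temperature)."""
    city_id, temp = line.split(",", 1)
    yield city_id, ("TEMP", float(temp))


def shuffle(pairs):
    """Group all mapper output by key, as the framework would before reducing."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_city(city_id, values):
    """Reducer (assumed): join the name with the temperatures and average them."""
    name = None
    temps = []
    for tag, payload in values:
        if tag == "NAME":
            name = payload
        else:
            temps.append(payload)
    # Emit only when both sides of the join are present (inner join).
    if name is not None and temps:
        yield name, sum(temps) / len(temps)


def run_job(city_lines, temp_lines):
    """Run both mappers, shuffle by cityID, and reduce into {Name: average}."""
    pairs = []
    for line in city_lines:
        pairs.extend(map_cities(line))
    for line in temp_lines:
        pairs.extend(map_temperatures(line))
    result = {}
    for city_id, values in shuffle(pairs).items():
        for name, average in reduce_city(city_id, values):
            result[name] = average
    return result


if __name__ == "__main__":
    print(run_job(CITY_LINES, TEMP_LINES))
```

Tagging each value with its source is one common way to let a single reducer tell the two inputs apart. In a real Hadoop job, each mapper would be bound to its own input path and the framework would perform the shuffle step.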