Chapter 31
Double Map Reduce
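This style is very similar to the previous one, but with an additional twist: the results of the first map are reshuffled and handed to a second map before any final reduce.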
31.1 Constraints
- Input data is divided into blocks.
- A map function applies a given worker function to each block of data, potentially in parallel.
- The results of the many worker functions are reshuffled.
- The reshuffled blocks of data are given as input to a second map function that takes a reducible function as input.
- Optional step: a reduce function takes the results of the many worker functions and recombines them into a coherent output.
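The constraints above can be sketched in a few lines for a word-counting task. This is only an illustrative sketch, not the chapter's program: the function names (partition, split_words, regroup, count_words, word_frequencies), the block size, and the word-matching regular expression are all assumptions chosen for the example, and Python 3 is assumed.

```python
import re
from collections import defaultdict
from functools import reduce

def partition(text, nlines):
    """Divide the input into blocks of at most nlines lines each."""
    lines = text.split('\n')
    for i in range(0, len(lines), nlines):
        yield '\n'.join(lines[i:i + nlines])

def split_words(block):
    """Worker for the first map: emit a (word, 1) pair per word in a block."""
    # Assumption: a "word" is a run of two or more ASCII letters.
    return [(w, 1) for w in re.findall(r'[a-z]{2,}', block.lower())]

def regroup(pairs_list):
    """Reshuffle: gather the pairs from all blocks into one group per word."""
    groups = defaultdict(list)
    for pairs in pairs_list:
        for word, count in pairs:
            groups[word].append((word, count))
    return groups

def count_words(group):
    """Worker for the second map: reduce one word's pairs to (word, total)."""
    word, pairs = group
    return (word, reduce(lambda total, pair: total + pair[1], pairs, 0))

def word_frequencies(text, nlines=2):
    splits = map(split_words, partition(text, nlines))   # first map
    groups = regroup(splits)                             # reshuffle
    return dict(map(count_words, groups.items()))        # second map

if __name__ == '__main__':
    sample = "the cat\nthe dog\na cat sat\nthe end"
    for word, total in sorted(word_frequencies(sample).items()):
        print(word, '-', total)
```

Because each group holds the pairs for a single word, the second map's workers are independent of one another and could run in parallel, just like the workers of the first map. The optional final reduce is omitted here, since collecting the per-word totals into a dictionary already yields a coherent output.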
31.2 A Program in this Style
#!/usr/bin/env python
import sys, re, operator, string

#
# Functions for ...