Parallelizing Computations with mapreduce

Now we’re going to turn theory into practice. First we’ll look at the higher-order function mapreduce; then we’ll show how it can be used to parallelize a simple computation.

mapreduce

In the following figure, we can see the basic idea of mapreduce. We have a number of mapping processes, which produce streams of {Key, Value} pairs. The mapping processes send these pairs to a reduce process that merges the pairs, combining pairs with the same key.

[Figure: the basic idea of mapreduce (images/fig_9.png)]

Warning: The word map, used in the context of mapreduce, is completely different from the map function that occurs elsewhere in this book.
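To make the picture concrete, here is a minimal sketch of the idea described above. It is not the book's implementation. The module name mr_sketch, the functions run/2 and word_counts/1, and the message shapes are all hypothetical names chosen for illustration. It assumes an Erlang/OTP release with maps and maps:update_with/4 (OTP 19 or later). Each mapping process sends {Key, Value} pairs to the calling process, which plays the role of the reduce process and merges pairs that share a key.

```erlang
-module(mr_sketch).
-export([run/2, word_counts/1]).

%% run(MapFun, Inputs) spawns one mapping process per input.
%% MapFun(Input, Emit) calls Emit(Key, Value) for every pair it produces.
%% The calling process acts as the reduce process and returns a map
%% from each Key to the list of Values received for that key.
run(MapFun, Inputs) ->
    Reducer = self(),
    Ref = make_ref(),                       % tags messages of this run
    Emit = fun(Key, Value) -> Reducer ! {Ref, pair, Key, Value}, ok end,
    lists:foreach(
      fun(Input) ->
              spawn(fun() ->
                            MapFun(Input, Emit),
                            %% Sent after all pairs; messages from one
                            %% process to another arrive in order.
                            Reducer ! {Ref, done}
                    end)
      end, Inputs),
    collect(Ref, length(Inputs), #{}).

%% Merge incoming pairs until every mapping process has reported done.
collect(_Ref, 0, Acc) ->
    Acc;
collect(Ref, Pending, Acc) ->
    receive
        {Ref, pair, Key, Value} ->
            Merge = fun(Values) -> [Value | Values] end,
            collect(Ref, Pending, maps:update_with(Key, Merge, [Value], Acc));
        {Ref, done} ->
            collect(Ref, Pending - 1, Acc)
    end.

%% Hypothetical usage: count how often each word occurs across documents.
word_counts(Docs) ->
    MapFun = fun(Doc, Emit) ->
                     lists:foreach(fun(Word) -> Emit(Word, 1) end,
                                   string:tokens(Doc, " "))
             end,
    Grouped = run(MapFun, Docs),
    maps:from_list([{Word, lists:sum(Ones)}
                    || {Word, Ones} <- maps:to_list(Grouped)]).
```

With this sketch, mr_sketch:word_counts(["a b a", "b c"]) returns #{"a" => 2, "b" => 2, "c" => 1}. The sketch does no error handling: if a mapping process crashes before sending its done message, the reduce loop waits forever, so a real version would need to monitor or link to the processes it spawns.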

mapreduce is a ...
