September 2017
In this section, we present the MapReduce solution to the WordCount problem, sometimes called the Hello World example for MapReduce.
The diagram in Figure 11-2 shows the data flow for the WordCount program. On the left are two of the 80 files that are read into the program:

Figure 11-2. Data flow for the WordCount program
During the mapping stage, each word, followed by the number 1, is copied into a temporary file, one pair per line. Notice that many words occur more than once. For example, image appears five times among the 80 files (including both files shown), so the string image 1 will appear five times in the temporary ...
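The mapping stage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's implementation: the function name map_words, the sample sentences, and the tokenization rule (lowercase, letters only) are all hypothetical stand-ins, and two short strings take the place of the 80 input files.

```python
import re
import tempfile
from pathlib import Path


def map_words(texts, out_path):
    """Mapping stage: write one 'word 1' pair per line for every word occurrence.

    Assumption: a word is any run of letters, compared case-insensitively.
    """
    with open(out_path, "w", encoding="utf-8") as out:
        for text in texts:
            for word in re.findall(r"[a-z]+", text.lower()):
                out.write(f"{word} 1\n")


# Hypothetical stand-ins for the contents of two input files.
sample_texts = [
    "The image shows an image of a map",
    "Each image is stored as a file",
]

with tempfile.TemporaryDirectory() as tmp:
    temp_file = Path(tmp) / "mapped.txt"
    map_words(sample_texts, temp_file)
    pairs = temp_file.read_text(encoding="utf-8").splitlines()

# A word that occurs n times in the input yields n identical lines.
image_pairs = [p for p in pairs if p == "image 1"]
print(len(pairs), "pairs written;", len(image_pairs), "of them are 'image 1'")
```

Because the mapper does no counting of its own, a word that occurs three times in the sample input produces three identical image 1 lines in the temporary file.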