November 2017
Intermediate to advanced
670 pages
17h 35m
English
MapReduce is a technique that splits big datasets into many smaller ones. Each small dataset is separately, but simultaneously processed on different servers. The results are then gathered and aggregated to produce a final result.
How does it work?
Suppose we have a lot of web servers and we want to determine the top requested pages across all of them. We can analyze web server access logs to find all the requested URLs, count them, and sort the results.
The following are the good use cases for MapReduce:
The following are the use cases not good for MapReduce: