May 2017
Intermediate to advanced
270 pages
6h 18m
English
MapReduce is a programming model that allows you to express algorithms for efficient execution on a distributed system. The MapReduce model was first introduced by Google in 2004 (https://research.google.com/archive/mapreduce.html), as a way to automatically partition datasets over different machines and for automatic local processing and the communication between cluster nodes.
The MapReduce framework was used in cooperation with a distributed filesystem, the Google File System (GFS or GoogleFS), which was designed to partition and replicate data across the computing cluster. Partitioning was useful for storing and processing datasets that wouldn't fit on a single node while replication ensured that the system ...
Read now
Unlock full access