MapReduce is a programming model that allows you to express algorithms for efficient execution on a distributed system. The MapReduce model was first introduced by Google in 2004 (https://research.google.com/archive/mapreduce.html), as a way to automatically partition datasets over different machines and for automatic local processing and the communication between cluster nodes.
The MapReduce framework was used in cooperation with a distributed filesystem, the Google File System (GFS or GoogleFS), which was designed to partition and replicate data across the computing cluster. Partitioning was useful for storing and processing datasets that wouldn't fit on a single node while replication ensured that the system ...