Performing MapReduce operations

MapReduce is the programming model to perform operations (mainly aggregation) on distributed sets of data across various clusters in different servers. This concept was coined by Google and was used in the Google file system initially and later was adopted by the open source Hadoop project.

MapReduce works by processing the data on each server and then combine it together to form a result set. It actually divides into two operations namely Map and Reduce.

  • Map: This performs the transformation of the elements in the group or individual sequence
  • Reduce: This performs the aggregation and combines the results from Map into a meaningful result set

In RethinkDB, MapReduce queries operate in three steps as follows:

  • Group operation ...

Get Mastering RethinkDB now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.