MapReduce Model
Abstract
The MapReduce model was developed by Google and Yahoo for their internal use. Google created the Hadoop distributed file system and Yahoo developed Pig Latin to handle their volume of data. These products became open source. Hadoop dominates the NoSQL market as part of the SMAQ stack, the NoSQL counterpart of the LAMP stack for websites. The process has two phases: mapping and reducing. The mapping phase gets the data in a parallelized fashion. The reduce phase filters and aggregates this data to produce a final result.
Keywords
ETL (extract transform load); Google; Hadoop; HDFS (Hadoop distributed file system); LAMP stack; MapReduce; Pig Latin; RAID storage systems; SMAQ stack; Yahoo
Introduction
This chapter discusses ...
Get Joe Celko’s Complete Guide to NoSQL now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.