6.3 Implementations and Systems

We briefly survey a MapReduce framework as well as popular key-value stores and document databases.

6.3.1 Apache Hadoop MapReduce

Apache Hadoop is the open source MapReduce implementation hosted by the Apache Software Foundation.
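To make the programming model behind Hadoop concrete, the following is a minimal in-memory sketch of the map, shuffle, and reduce phases for a word count. It is plain Python and does not use any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. A real Hadoop job would distribute these phases across many nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases sequentially on a list of text records."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

In this sketch the shuffle step is a single dictionary; in a distributed setting it would correspond to moving intermediate pairs between machines so that all values for one key reach the same reducer.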

Web resources:

– Apache Hadoop: http://hadoop.apache.org/

– Documentation page: http://hadoop.apache.org/docs/stable/

– GitHub repository: https://github.com/apache/hadoop

The Hadoop ecosystem consists of several modules, which we briefly describe below.

HDFS: Hadoop MapReduce runs on the Hadoop Distributed File System (HDFS). In an HDFS installation, a NameNode is responsible for managing metadata and handling ...
