Understanding the Apache Hadoop daemons

Most of the Apache Hadoop clusters in production run Apache Hadoop 1.x (MRv1—MapReduce Version 1). However, the new version of Apache Hadoop, 2.x (MRv2—MapReduce Version 2), also referred to as Yet Another Resource Negotiator (YARN) is being adopted by many organizations actively. In this section, we shall go through the daemons for both these versions.

Apache Hadoop 1.x (MRv1) consists of the following daemons:

  • Namenode
  • Secondary namenode
  • Jobtracker
  • Datanode
  • Tasktracker

All the preceding daemons are Java services and run within their own JVM.

Apache Hadoop stores and processes data in a distributed fashion. To achieve this goal, Hadoop implements a master and slave model. The namenode and jobtracker daemons are ...

Get Cloudera Administration Handbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.