Cloudera Administration Handbook by Rohit Menon

Chapter 1. Getting Started with Apache Hadoop

Apache Hadoop is a widely used open source distributed computing framework for efficiently processing large volumes of data on large clusters of inexpensive commodity computers. In this chapter, we will learn more about Apache Hadoop by covering the following topics:

  • History of Apache Hadoop and its trends
  • Components of Apache Hadoop
  • Understanding the Apache Hadoop daemons
  • Introducing Cloudera
  • What is CDH?
  • Responsibilities of a Hadoop administrator

History of Apache Hadoop and its trends

We live in an era in which almost everything around us generates some kind of data. A click on a web page is logged on the server. The flipping of channels when watching TV is captured by cable ...
