Chapter 3. Components of Hadoop

This chapter covers

  • Managing files in HDFS
  • Analyzing components of the MapReduce framework
  • Reading and writing input and output data

In the last chapter we looked at installing and setting up Hadoop. We covered what the different nodes do and how to configure them to work with each other. Now that you have Hadoop running, let’s look at the Hadoop framework from a programmer’s perspective. If the previous chapter was like teaching you how to connect your turntable, mixer, amplifier, and speakers, then this chapter is about the techniques of mixing music.

We first cover HDFS, where you’ll store data that your Hadoop applications will process. Next we explain the MapReduce framework in ...
