Chapter 1. Introducing Hadoop

This chapter covers

  • The basics of writing a scalable, distributed data-intensive program
  • Understanding Hadoop and MapReduce
  • Writing and running a basic MapReduce program

Today, we’re surrounded by data. People upload videos, take pictures on their cell phones, text friends, update their Facebook status, leave comments around the web, click on ads, and so forth. Machines, too, are generating and keeping more and more data. You may even be reading this book as digital data on your computer screen, and certainly your purchase of this book is recorded as data with some retailer.[1]

[1] Of course, you’re reading a legitimate copy of this, right?

The exponential growth of data first presented challenges to cutting-edge ...
