Chapter 1. Hadoop in a heartbeat


This chapter covers
  • Understanding the Hadoop ecosystem
  • Downloading and installing Hadoop
  • Running a MapReduce job


We live in the age of big data, where the data volumes we work with day to day have outgrown the storage and processing capabilities of a single host. Big data brings two fundamental challenges: how to store and work with large volumes of data and, more importantly, how to understand that data and turn it into a competitive advantage.

Hadoop fills a gap in the market by effectively storing substantial amounts of data and providing computational capabilities over that data. It’s a distributed system made up of a distributed filesystem, and it offers a way to parallelize and execute ...
