Chapter 9. Running Hadoop in the cloud

This chapter covers

  • Setting up a compute cloud with Amazon Web Services (AWS)
  • Running Hadoop in the AWS cloud
  • Transferring data into and out of an AWS Hadoop cloud

Depending on your data processing needs, your Hadoop workload can vary widely over time. You may have a few large data processing jobs that occasionally take advantage of hundreds of nodes, but those same nodes will sit idle the rest of the time. You may be new to Hadoop and want to get familiar with it before investing in a dedicated cluster. You may own a startup that needs to conserve cash and wants to avoid the capital expense of a Hadoop cluster. In these and other situations, it makes more sense to rent a cluster of machines rather ...
