Hadoop Operations and Cluster Management Cookbook by Shumin Guo

Using S3 to host data

Amazon Simple Storage Service (S3) is an online data store that users can write data to and retrieve data from. More information about S3 is available at http://aws.amazon.com/s3/.

This recipe outlines the steps to configure S3 as the distributed data storage system for MapReduce.

Getting ready

Before getting started, we assume that you have registered with AWS and that the client machine has been configured to access AWS.

How to do it...

Use the following steps to configure S3 for data storage:

  1. Stop the Hadoop cluster using the following command:
    stop-all.sh
    
  2. Open the file $HADOOP_HOME/conf/core-site.xml and add the following contents to the file:
    <property>
    <name>fs.default.name</name>
    <!-- value>master:54310</value--> ...
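
The listing above is cut off in this excerpt. As a rough sketch only, and not the book's own listing, a Hadoop 1.x core-site.xml that points the default filesystem at S3 might contain properties like the following. The bucket name and both credential values are hypothetical placeholders, and the use of the s3n:// native filesystem (rather than the s3:// block filesystem) is an assumption:

    <!-- Sketch under stated assumptions; not taken from the book. -->
    <property>
      <name>fs.default.name</name>
      <!-- Hypothetical bucket name; replace with your own bucket. -->
      <value>s3n://my-example-bucket</value>
    </property>
    <property>
      <!-- AWS access key ID used by the s3n filesystem (placeholder value). -->
      <name>fs.s3n.awsAccessKeyId</name>
      <value>YOUR_ACCESS_KEY_ID</value>
    </property>
    <property>
      <!-- AWS secret access key used by the s3n filesystem (placeholder value). -->
      <name>fs.s3n.awsSecretAccessKey</name>
      <value>YOUR_SECRET_ACCESS_KEY</value>
    </property>

With settings along these lines, running hadoop fs -ls / after the cluster is restarted should list the contents of the bucket rather than HDFS, which is a quick way to check that the credentials and bucket name are accepted.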
