Time for action – running UFO analysis on EMR

Let us explore the use of EMR with Hive by doing some UFO analysis on the platform.

  1. Log in to the AWS management console at http://aws.amazon.com/console.
  2. Every Hive job flow on EMR runs from an S3 bucket, so we need to select the bucket to use for this purpose. Select S3 to see the list of buckets associated with your account, and then choose the bucket from which to run the example. In this example, we select the bucket called garryt1use.
  3. Use the web interface to create three directories called ufodata, ufoout, and ufologs within that bucket. The resulting list of the bucket's contents should look like the following screenshot:
  4. Double-click on the ufodata directory to open it and within ...
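
The directory setup in step 3 can also be scripted rather than done through the web interface. The following is a minimal sketch, not the book's method. It assumes the boto3 library is installed and AWS credentials are already configured, and it relies on the S3 convention that a "folder" is a zero-byte object whose key ends in a slash. The helper names (folder_keys, create_folders, create_folders_with_boto3) are hypothetical; only put_object is an actual boto3 S3 client method.

```python
# Hypothetical helper names; only put_object is a real boto3 S3 client method.
FOLDERS = ["ufodata", "ufoout", "ufologs"]  # directory names from step 3


def folder_keys(names):
    """Return the S3 keys that act as folder placeholders (trailing slash)."""
    return [name.rstrip("/") + "/" for name in names]


def create_folders(client, bucket, names=FOLDERS):
    """Create one zero-byte placeholder object per folder name.

    `client` is anything exposing put_object(Bucket=..., Key=..., Body=...),
    such as a boto3 S3 client.
    """
    keys = folder_keys(names)
    for key in keys:
        client.put_object(Bucket=bucket, Key=key, Body=b"")
    return keys


def create_folders_with_boto3(bucket):
    """Convenience wrapper; assumes boto3 is installed and credentials are set."""
    import boto3  # imported lazily so the helpers above work without it

    return create_folders(boto3.client("s3"), bucket)
```

Calling create_folders_with_boto3("garryt1use") would then create the three placeholders in the example bucket; substitute your own bucket name. Passing the client as a parameter keeps the logic testable with a stand-in object, without contacting AWS.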
