Time for action – running WordCount on EMR
We will now show you how to run this same JAR file on EMR. Remember, as always, that this costs money!
- Go to the AWS console at http://aws.amazon.com/console, sign in, and select S3.
- You'll need two buckets: one to hold the JAR file and another for the job output. You can use existing buckets or create new ones.
- Open the bucket where you will store the job file, click on Upload, and add the wc1.jar file created earlier.
- Return to the main console home page, and then go to the EMR portion of the console by selecting Elastic MapReduce.
- Click on the Create a New Job Flow button and you'll see a familiar screen as shown in the following screenshot:
- Previously, we used a sample application; to run our code, we need ...
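The S3 part of the steps above (placing wc1.jar in one bucket and reserving a location in another for the job output) can also be scripted rather than done through the console. The following is a minimal sketch, not the book's method: the function name `upload_job_jar`, the bucket names, and the `wordcount-output` prefix are all hypothetical, and it assumes you pass in an S3 client object such as the one returned by `boto3.client("s3")`, whose `upload_file(Filename, Bucket, Key)` method performs the upload.

```python
import os


def upload_job_jar(s3_client, jar_path, jar_bucket, output_bucket,
                   output_prefix="wordcount-output"):
    """Upload the job JAR and return the S3 locations for the job setup.

    s3_client is assumed to behave like a boto3 S3 client; the bucket
    names and output_prefix are illustrative placeholders.
    """
    # Use the file name (for example wc1.jar) as the object key.
    key = os.path.basename(jar_path)

    # Same effect as clicking Upload in the S3 console for the JAR bucket.
    s3_client.upload_file(jar_path, jar_bucket, key)

    # Build the S3 locations of the JAR and of the job output.
    return {
        "jar": f"s3://{jar_bucket}/{key}",
        "output": f"s3://{output_bucket}/{output_prefix}",
    }


# Example usage (requires the boto3 package and AWS credentials):
#   import boto3
#   uris = upload_job_jar(boto3.client("s3"), "wc1.jar",
#                         "my-jar-bucket", "my-output-bucket")
#   print(uris["jar"], uris["output"])
```

Passing the client in as an argument keeps the helper independent of how credentials are configured. As with the console route, the upload and any storage it uses are billed to your AWS account.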