Running a MapReduce job on Amazon EMR

This recipe runs a MapReduce job in the cloud using Amazon Elastic MapReduce (Amazon EMR), a managed MapReduce service provided by AWS. You will need an AWS account in order to proceed; register at http://aws.amazon.com/. Refer to https://aws.amazon.com/elasticmapreduce/ for more details on Amazon EMR. Amazon EMR reads its inputs (data, binaries/JARs, and so on) from an AWS S3 bucket, processes them, and writes the results back to an S3 bucket. Amazon Simple Storage Service (Amazon S3) is another AWS service, used for data storage in the cloud. Refer to http://aws.amazon.com/s3/ for more details on Amazon S3. Though we will ...
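The S3-in, S3-out flow described above can be sketched as a small Python helper that assembles one EMR step definition. This is only an illustration, not the recipe's own code: the bucket, paths, and step name are hypothetical placeholders, and it assumes the job's JAR takes the input and output locations as its two arguments. The dictionary keys follow the step shape accepted by boto3's EMR client; nothing here contacts AWS.

```python
def build_emr_step(name, jar_uri, input_uri, output_uri):
    """Return one EMR step definition whose JAR, input, and output live in S3."""
    # EMR reads the JAR and input from S3 and writes results back to S3,
    # so every location is expected to be an s3:// URI.
    for uri in (jar_uri, input_uri, output_uri):
        if not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got {uri!r}")
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": jar_uri,
            # Assumption: the job's main class accepts <input> <output>.
            "Args": [input_uri, output_uri],
        },
    }


# Hypothetical bucket and paths, for illustration only.
step = build_emr_step(
    "example-mapreduce-step",
    "s3://example-bucket/jars/job.jar",
    "s3://example-bucket/input/",
    "s3://example-bucket/output/",
)
print(step["HadoopJarStep"]["Args"])
```

With boto3 installed and credentials configured, a definition like this could be passed in the `Steps` list of the EMR client's `add_job_flow_steps` call for an existing cluster; the same step can also be configured through the AWS console.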

Get MongoDB Cookbook - Second Edition now with the O’Reilly learning platform.
