Best practices for using Amazon EMR

Amazon has made working with Hadoop a lot easier. You can launch an EMR cluster in minutes for big data processing, machine learning, and real-time stream processing with the Apache Hadoop ecosystem. You can use the Management Console or the command line to start a few nodes or a thousand nodes with ease.

Like EC2, EMR pricing is now changed from pay per hour to pay per second. If you stop your cluster—you stop paying immediately. This results in lower costs and you don’t have to worry about the hourly boundary, anymore.

EMR makes a whole bunch of the latest versions of open source software available to you. There are 19 open source projects, presently. New releases are made every 4 to 6 weeks so latest ...

Get Learning AWS - Second Edition now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.