Spark on the Cloud – Amazon Elastic MapReduce

Now that you have learnt about Spark, let's look at potentially limitless scaling. We will learn how to use cloud services to deploy Spark clusters. There are many big data and data analytics service providers, such as Google or IBM Bluemix, but we will concentrate on Amazon in this chapter. We will provide screenshots of the process, because such platforms can be a little overwhelming at first. The steps are as follows:

  1. First, create an Amazon cloud (AWS) account if you don't already have one. Go to https://aws.amazon.com and click on Create a free account.
  2. Provide your credentials and click on Create account.
  3. Next, we have to create a Key Pair. Key ...
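
For readers who prefer scripting to clicking through the console, the key pair and cluster steps can also be sketched in Python with the boto3 library. This is a minimal sketch under stated assumptions, not the procedure the chapter follows: the helper names (`build_emr_request`, `create_key_pair`, `launch_cluster`), the region, the instance type, the release label, and the cluster and key names are all illustrative choices, and the default EMR roles are assumed to already exist in your account.

```python
def build_emr_request(cluster_name, key_pair_name, instance_count=3,
                      instance_type="m5.xlarge", release_label="emr-6.15.0"):
    """Assemble keyword arguments for boto3's EMR run_job_flow call.

    All default values here are assumptions made for illustration.
    """
    return {
        "Name": cluster_name,
        "ReleaseLabel": release_label,
        # Ask EMR to install Spark on the cluster.
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            # The EC2 key pair used to SSH into the master node.
            "Ec2KeyName": key_pair_name,
            # Keep the cluster running after it has no steps left.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Default role names; assumed to exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def create_key_pair(name, region_name="us-east-1"):
    """Create an EC2 key pair and return its private key material."""
    import boto3  # imported lazily; requires AWS credentials to be configured
    ec2 = boto3.client("ec2", region_name=region_name)
    return ec2.create_key_pair(KeyName=name)["KeyMaterial"]


def launch_cluster(request, region_name="us-east-1"):
    """Start the cluster described by `request` and return its ID."""
    import boto3  # imported lazily; requires AWS credentials to be configured
    emr = boto3.client("emr", region_name=region_name)
    return emr.run_job_flow(**request)["JobFlowId"]


# Building the request needs no AWS access, so it can be inspected first.
request = build_emr_request("spark-demo", "my-key-pair")
```

Calling `create_key_pair("my-key-pair")` and then `launch_cluster(request)` would create billable AWS resources, so inspect the request dictionary first and terminate the cluster when you are done.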
