How to do it...

  1. Spark comes bundled with scripts to launch a Spark cluster on Amazon EC2. Let's launch the cluster using the following command (the spark-ec2 script reads your AWS credentials from environment variables; see the sketch after these steps):
        $ cd /home/hduser
        $ spark-ec2 -k <key-pair> -i <key-file> -s <num-slaves> launch <cluster-name>

     Here:
     <key-pair>: the name of the EC2 key pair created in AWS
     <key-file>: the private key file you downloaded
     <num-slaves>: the number of slave nodes to launch
     <cluster-name>: the name of the cluster
  2. Launch the cluster with example values:
        $ spark-ec2 -k kp-spark -i /home/hduser/keypairs/kp-spark.pem --hadoop-major-version 2 -s 3 launch spark-cluster
  3. Sometimes, the default availability zones are not available; in that case, retry the request, explicitly specifying the availability zone you want with the -z option:
        $ spark-ec2 -k kp-spark -i /home/hduser/keypairs/kp-spark.pem -z us-east-1b --hadoop-major-version 2 -s 3 launch spark-cluster
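
Before running step 1, note that the spark-ec2 script expects your AWS credentials in environment variables and will refuse to launch instances without them. A minimal sketch, with placeholder values standing in for your own keys:

        $ export AWS_ACCESS_KEY_ID=<your-access-key>        # access key ID from the AWS console
        $ export AWS_SECRET_ACCESS_KEY=<your-secret-key>    # matching secret access key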
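
Once the launch completes, you can verify the cluster by logging in to the master node using the script's login action; a sketch reusing the example values from step 2:

        $ spark-ec2 -k kp-spark -i /home/hduser/keypairs/kp-spark.pem login spark-cluster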
