Launching and deploying Spark programs

A Spark program can run by itself or over cluster managers. The first option, local mode, is similar to running a program locally with multiple threads, where each thread acts as a Spark worker. Of course, there is no distributed parallelism across machines, but it is a quick and easy way to launch a Spark application, and we will be deploying in this mode, by way of demonstration, throughout the chapter. For example, we can run the following script to launch a Spark application:

./bin/spark-submit examples/src/main/python/pi.py

This is precisely what we did in the previous section. Alternatively, we can specify the number of worker threads:

./bin/spark-submit --master local[4] examples/src/main/python/pi.py

In the previous code, we run Spark in local mode with four worker threads, so tasks can be executed on up to four cores in parallel.
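For reference, the --master option accepts several standard master URLs beyond local[N]. The following command lines sketch the most common forms; the host name and port are placeholders for your own cluster, not part of our setup:

./bin/spark-submit --master local[*] examples/src/main/python/pi.py        # one thread per logical core
./bin/spark-submit --master spark://HOST:7077 examples/src/main/python/pi.py   # standalone cluster master
./bin/spark-submit --master yarn examples/src/main/python/pi.py            # submit to a YARN resource manager

The local[*] form is handy on a development machine because it uses as many worker threads as there are logical cores, while the standalone and YARN forms hand scheduling over to a cluster manager.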
