Launching and deploying Spark programs

A Spark program can run by itself or on top of a cluster manager. The first option, local mode, is similar to running a program locally with multiple threads, where each thread acts as a Spark worker. There is no cluster-level parallelism, but it is a quick and easy way to launch a Spark application, and we will deploy in this mode for demonstration purposes throughout the chapter. For example, we can run the following script to launch a Spark application:

./bin/spark-submit examples/src/main/python/pi.py

This is precisely what we did in the previous section. Alternatively, we can specify the number of worker threads:

./bin/spark-submit --master local[4] examples/src/main/python/pi.py

In the preceding code, we run Spark locally with four worker threads, one per CPU core.
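For reference, the bundled pi.py example estimates π by Monte Carlo sampling: it scatters random points in the unit square and counts how many fall inside the quarter circle. The core idea can be sketched in plain Python (a minimal sketch without Spark; the function name and sample count here are illustrative, not part of the actual example script, which distributes the sampling across workers):

```python
import random

def estimate_pi(num_samples, seed=42):
    # Points (x, y) are drawn uniformly from the unit square.
    # The fraction landing inside the quarter circle x^2 + y^2 <= 1
    # approximates pi/4, so multiplying by 4 estimates pi.
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples

print(estimate_pi(100_000))
```

In the Spark version, the sampling loop is expressed as a parallelized map over a range of indices followed by a sum, which is why adding worker threads (local[4]) speeds it up.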
