O'Reilly logo

Fast Data Processing with Spark 2 - Third Edition by Krishna Sankar

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

A single machine

A single machine is the simplest use case for Spark. It is also a great way to sanity check your build. In spark/bin, there is a shell script called run-example, which can be used to launch a Spark job. The run-example script takes the name of a Spark class and some arguments. Earlier, we used the run-example script from the /bin directory to calculate the value of Pi. There is a collection of the sample Spark jobs in examples/src/main/scala/org/apache/spark/examples/.

All of the sample programs take the parameter, master (the cluster manager), which can be the URL of a distributed cluster or local[N], where N is the number of threads.

Going back to our run-example script, it invokes the more general bin/spark-submit script. For ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required