Building the Uber JAR

The first step in deploying our Spark application on a cluster is to bundle it into a single Uber JAR, also known as an assembly JAR. In this recipe, we'll look at how to use the SBT assembly plugin to generate the assembly JAR. We'll use this assembly JAR in subsequent recipes when we run Spark in distributed mode. Alternatively, we could add the dependent JARs to the classpath using the spark.driver.extraClassPath property (https://spark.apache.org/docs/1.3.1/configuration.html#runtime-environment); however, with a large number of dependent JARs, this quickly becomes inconvenient.
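
As a minimal sketch of how the plugin is typically wired in, the sbt-assembly plugin is registered in project/plugins.sbt so that the assembly task becomes available (the version number shown here is an assumption; pick whichever release is compatible with your SBT installation):

    // project/plugins.sbt
    // Registers the sbt-assembly plugin, which adds the `assembly` task to the build.
    // The version below is illustrative; use one that matches your SBT version.
    addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

By contrast, the spark.driver.extraClassPath approach would require us to list every dependent JAR explicitly, which is what makes it unwieldy for builds with many dependencies.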

How to do it...

The goal of building the assembly JAR is to produce a single fat JAR that contains our Spark application along with all of its dependencies. Refer to the following screenshot, ...
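
As a rough sketch of what the build definition might look like (the project name, versions, and merge strategy below are illustrative assumptions, not the book's exact settings), a build.sbt along these lines marks Spark itself as provided, since the cluster already supplies it at runtime, and resolves duplicate META-INF entries that several dependencies ship:

    // build.sbt -- illustrative assembly configuration; names and versions are assumptions
    name := "spark-recipes"

    version := "1.0"

    scalaVersion := "2.10.4"

    // Spark is "provided" by the cluster at runtime, so it is excluded from the fat JAR.
    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.1" % "provided"

    // Discard conflicting META-INF files and keep the first copy of any other collision.
    assemblyMergeStrategy in assembly := {
      case PathList("META-INF", xs @ _*) => MergeStrategy.discard
      case _                             => MergeStrategy.first
    }

Running sbt assembly then produces the fat JAR (by default under target/scala-2.10/ with an -assembly suffix), and that single file is what we hand to the cluster in the subsequent recipes.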
