
Mastering Apache Spark by Mike Frampton


Jobs and libraries

Within Databricks, you can import JAR libraries and run their classes on your clusters. I will write a very simple piece of Scala code, locally on my CentOS Linux server, that prints the first 100 elements of the Fibonacci series as BigInt values. I will compile the class into a JAR file using SBT, run it locally to check the result, and then run it on my Databricks cluster to compare the results. The code looks as follows:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object db_ex1 extends App {
  val appName = "Databricks example 1"
  val conf = new SparkConf()
  conf.setAppName(appName)
  val sparkCxt = new SparkContext(conf)
  var seed1:BigInt = 1
  var ...
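Because the listing above is cut off, the Fibonacci logic itself is not visible in this excerpt. As a rough illustration of the idea only, and not the book's code, the following minimal sketch generates the first 100 Fibonacci values as BigInt without any Spark dependency. The names FibSketch and fibonacci are hypothetical, and starting the series at 1, 1 is an assumption. BigInt matters here because a series starting 1, 1 no longer fits in a 64-bit Long from about the 93rd term onwards.

```scala
// Hypothetical sketch, not the book's listing: FibSketch and fibonacci
// are illustrative names, and the 1, 1 starting values are an assumption.
object FibSketch {

  // Returns the first n Fibonacci numbers (1, 1, 2, 3, 5, ...).
  // BigInt is used so that large terms do not overflow.
  def fibonacci(n: Int): Vector[BigInt] =
    Iterator
      .iterate((BigInt(1), BigInt(1))) { case (a, b) => (b, a + b) }
      .map(_._1) // keep the current term of each (current, next) pair
      .take(n)
      .toVector

  def main(args: Array[String]): Unit =
    // Print each term with its 1-based position in the series.
    fibonacci(100).zipWithIndex.foreach { case (value, i) =>
      println(s"${i + 1}: $value")
    }
}
```

Keeping the series generation in a pure function like this makes it easy to check the local run against the cluster run, since both should print the same values.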
