Writing your first Spark program
As mentioned earlier, you can use Spark with Python, Scala, Java, and R. Several executable shell scripts ship in the spark/bin
directory, and so far we have looked only at the Spark shell, which lets you explore data using Scala. The following executables are available in the spark/bin
directory; we'll use most of them during the course of this book:
beeline
pyspark
run-example
spark-class
sparkR
spark-shell
spark-sql
spark-submit
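As a quick illustration of how these launchers are invoked, here is a minimal sketch assuming a local Spark installation with spark/bin on the current path (the jar path and class name in the spark-submit line are hypothetical placeholders):

```shell
# Start the interactive Scala shell on a local master with two worker threads
./bin/spark-shell --master "local[2]"

# Start the interactive Python shell
./bin/pyspark --master "local[2]"

# Submit a packaged application (class name and jar path are hypothetical)
./bin/spark-submit --class com.example.Main --master "local[2]" target/app.jar

# Run one of the examples bundled with Spark
./bin/run-example SparkPi 10
```

Each launcher ultimately delegates to spark-class, which sets up the JVM classpath and hands control to the requested entry point.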
Whichever shell you choose, based on your past experience or preference, you will deal with one abstraction that serves as your handle to the data available on the Spark cluster, whether that cluster is local or spread over thousands of machines. The abstraction we are referring to here is called the Resilient Distributed Dataset (RDD).