So far, we have been working with a very convenient way of developing code in Spark: Jupyter notebooks. This approach is great when you want to develop a proof of concept and document what you do along the way.
However, Jupyter notebooks will not work if you need to schedule a job so that it runs, say, every hour. They also make it fairly hard to package your application: everything sits in a single notebook, so there is no easy way to split your code into logical chunks with well-defined APIs.
In this chapter, you will learn how to write your scripts as reusable modules and how to submit jobs to Spark programmatically.
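As a preview of where we are headed, here is a minimal sketch of what a standalone, submittable script looks like compared to notebook code; the file name my_script.py and the line-counting logic are illustrative assumptions, not an example from this chapter:

```python
# my_script.py -- a hypothetical, minimal standalone PySpark job.
from pyspark.sql import SparkSession

def count_lines(spark, path):
    """Count the lines in a text file; a stand-in for real job logic."""
    return spark.read.text(path).count()

if __name__ == '__main__':
    # Creating the SparkSession here, instead of relying on one provided
    # by a notebook kernel, is what makes the script runnable on its own.
    spark = SparkSession.builder.appName('MyFirstJob').getOrCreate()
    print(count_lines(spark, 'README.md'))
    spark.stop()
```

You would then launch it from the command line with `spark-submit my_script.py`, which is the mechanism we will explore in detail shortly.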
Before you begin, however, you might want to check out Bonus Chapter 2, Free Spark Cloud Offering ...