O'Reilly logo

Learning PySpark by Denny Lee, Tomasz Drabas

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Chapter 11. Packaging Spark Applications

So far we have been working with a very convenient way of developing code in Spark - the Jupyter notebooks. Such an approach is great when you want to develop a proof of concept and document what you do along the way.

However, Jupyter notebooks will not work if you need to schedule a job, so it runs every hour. Also, it is fairly hard to package your application as it is not easy to split your script into logical chunks with well-defined APIs - everything sits in a single notebook.

In this chapter, we will learn how to write your scripts in a reusable form of modules and submit jobs to Spark programmatically.

Before you begin, however, you might want to check out the Bonus Chapter 2, Free Spark Cloud Offering ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required