Fast Data Processing with Spark 2 - Third Edition by Krishna Sankar

Chapter 5. Loading and Saving Data in Spark

So far, you have experimented with the Spark shell, figured out how to create a connection to the Spark cluster, and built jobs for deployment. To make these jobs useful, you need to learn how to load and save data in Spark, which is what this chapter covers.

Before we dive into data, we have a couple of background tasks to do. First, we need an overview of the Spark abstractions; second, we need a quick discussion of the different modalities of data.

Spark abstractions

The goal of this book is to give you a good understanding of Spark through hands-on programming. The best way to understand Spark is to work through operations iteratively. As we are still in the initial chapters, some of the things ...
