July 2017
Intermediate to advanced
796 pages
18h 55m
English
- Elbert Hubbard
In this chapter, you will learn how to use Spark to analyze structured data. Unstructured data, such as a document containing arbitrary text or data in some other format, must first be transformed into a structured form. We will see how DataFrames and datasets are the cornerstone here, and how Spark SQL's APIs make querying structured data simple yet robust. We also introduce datasets and examine the differences between datasets, DataFrames, and RDDs. In a nutshell, the following topics will be covered in this chapter: