© Subhashini Chellappan, Dharanitharan Ganesan 2018
Subhashini Chellappan and Dharanitharan Ganesan, Practical Apache Spark, https://doi.org/10.1007/978-1-4842-3652-9_4

4. Spark SQL, DataFrames, and Datasets

Subhashini Chellappan (1) and Dharanitharan Ganesan (2)
(1) Bangalore, India
(2) Krishnagiri, Tamil Nadu, India
 

In the previous chapter on Spark Core, you learned about RDD transformations and actions, the fundamental building blocks of Apache Spark. In this chapter, you will learn about the concepts of Spark SQL, DataFrames, and Datasets. The Spark SQL DataFrame and Dataset APIs are useful for processing structured data from files without using core RDD transformations and actions directly. This allows programmers and developers to analyze ...
