11. Working with SQL

This chapter covers

  • Using SQL within Spark
  • Determining the local or global scope of your views
  • Mixing both the dataframe API and SQL
  • Deleting records in a dataframe

Structured Query Language (SQL) is the gold standard for manipulating data. Introduced in 1974, it has since evolved to become an ISO standard (ISO/IEC 9075). The latest revision is SQL:2016.

It seems that SQL has been around forever as a way to extract and manipulate data in relational databases. And SQL will be around forever. When I was in college, I clearly remember asking my database professor, “Who do you expect will use SQL? A secretary making a report?” His answer was simply, “Yes.” (Based on that answer, I might just figure that you are a secretary ...
