Chapter 12: Spark SQL Primer
In the previous chapter, you learned about data visualization as a powerful tool for data analytics, and about the Python visualization libraries that can be used to visualize data in pandas DataFrames. An equally important and ubiquitous skill in any data analytics professional's repertoire is Structured Query Language (SQL). SQL has been around for as long as the field of data analytics itself, and even with the advent of big data, data science, and machine learning (ML), it remains indispensable.
This chapter introduces you to the basics of SQL and looks at how SQL can be applied in a distributed computing setting via Spark SQL. You will learn about the various ...