© Raju Kumar Mishra and Sundar Rajan Raman 2019
Raju Kumar Mishra and Sundar Rajan Raman, PySpark SQL Recipes, https://doi.org/10.1007/978-1-4842-4335-0_8

8. Structured Streaming

Raju Kumar Mishra (1) and Sundar Rajan Raman (2)
(1) Bangalore, Karnataka, India
(2) Chennai, Tamil Nadu, India

In this chapter, we look at Apache Spark’s Structured Streaming feature. So far, we have seen the fluent APIs that Apache Spark provides for batch data processing. Typical ETL-based data flows are batch oriented and operate on static data, that is, data that has already been collected and is available for processing. That is one side of the coin; the other side is streaming data. For example, websites such as Twitter and Facebook are continuously fed with data from their ...
