In this chapter, we will walk through an example of how to create a streaming application. Apache Spark's Structured Streaming API lets you express a streaming job with the same DataFrame API you use for batch work: instead of operating on static datasets, you process micro-batches of data with the scalable, fault-tolerant stream processing engine built on the Spark SQL engine.
The application we will create will do two things: first, it will examine every message for a specific condition and raise an alert when that condition is met; second, it will gather all the data received within a 5-minute window, aggregate ...