Using Spark SQL in Streaming Applications

In this chapter, we present typical use cases for Spark SQL in streaming applications. Our focus is on Structured Streaming using the Dataset/DataFrame APIs introduced in Spark 2.0. We also introduce and work with Apache Kafka, as it is an integral part of many web-scale streaming application architectures. Streaming applications typically involve real-time, context-aware responses to incoming data or messages. We use several examples to illustrate the key concepts and techniques for building such applications.

This chapter covers the following topics:

  • What is a streaming data application?
  • Typical streaming use cases
  • Using Spark SQL DataFrame/Dataset APIs to build ...
