Chapter 8. Streaming SQL

Let’s talk SQL. In this chapter, we’re going to start somewhere in the middle with the punchline, jump back in time a bit to establish additional context, and finally jump back to the future to wrap everything up with a nice bow. Imagine Quentin Tarantino held a degree in computer science and was super pumped to tell the world about the finer points of streaming SQL, and so he offered to ghostwrite this chapter with me; it’s sorta like that. Minus the violence.

What Is Streaming SQL?

I would argue that the answer to this question has eluded our industry for decades. In all fairness, the database community has understood maybe 99% of the answer for quite a while now. But I have yet to see a truly cogent and comprehensive definition of streaming SQL that encompasses the full breadth of robust streaming semantics. That’s what we’ll try to come up with here, although it would be hubris to assume we’re 100% of the way there now. Maybe 99.1%? Baby steps.

Regardless, I want to point out up front that most of what we’ll discuss in this chapter is still purely hypothetical as of the time of writing. This chapter and the one that follows (covering streaming joins) both describe an idealistic vision for what streaming SQL could be. Some pieces are already implemented in systems like Apache Calcite, Apache Flink, and Apache Beam. Many others aren’t implemented anywhere. Along the way, I’ll try to call out a few of the things that do exist in concrete form, but given ...
