Chapter 2. Using Stream Processing Engines on Real-Time Data

In Chapter 1 we discussed the gaps in today’s real-time data architectures and the limited ability of real-time analytics to act on data immediately. In this chapter, we will dive into how stream processing engines work with real-time data. We will first discuss the different processing patterns and how they relate to real-time data: we start with distributed data processing architectures, then cover stream processing and streaming analytics, and then explain event-based architectures. We will end the chapter with a discussion on transitioning from traditional batch processing environments to stream processing.

Before we go into the details, it is important to highlight that modern stream processing engines can do all the things described in this chapter (including streaming analytics); this whole set of capabilities is usually referred to simply as “stream processing.”

How Do Processing Patterns Apply to Real-Time Systems?

Processing patterns are the different methods or approaches employed to process, analyze, and manage data in a computing system. These patterns dictate how data is ingested, transformed, stored, and retrieved. Understanding and selecting the appropriate processing patterns is crucial for designing efficient, robust, and scalable real-time systems that address the challenges discussed in Chapter 1.

Distributed Data Processing

Traditional relational databases encounter scalability challenges when faced with a high volume ...
