4

The Modern Data Stack

In this chapter, we will explore the modern data architecture that has emerged for building scalable and flexible data platforms. Specifically, we will cover the Lambda architecture pattern and how it enables real-time data processing along with batch data analytics. You will learn about the key components of the Lambda architecture, including the batch processing layer for historical data, the speed processing layer for real-time data, and the serving layer for unified queries. We will discuss how technologies such as Apache Spark, Apache Kafka, and Apache Airflow can be used to implement these layers at scale.
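The three layers described above can be sketched in plain Python to show how they fit together. This is a minimal illustration under stated assumptions, not an implementation from the book: the event data and the names `batch_view`, `SpeedView`, and `serve` are hypothetical. In a real platform, Spark, Kafka, and similar systems would take the place of the in-memory structures used here.

```python
from collections import Counter

# Hypothetical event data: (user_id, page_views) pairs.
# historical_events stands in for data already landed in storage;
# recent_events stands in for events that arrived after the last batch run.
historical_events = [("alice", 3), ("bob", 1), ("alice", 2)]
recent_events = [("bob", 4), ("carol", 1)]


def batch_view(events):
    """Batch layer: recompute totals from the full historical dataset."""
    totals = Counter()
    for user, views in events:
        totals[user] += views
    return totals


class SpeedView:
    """Speed layer: update totals incrementally as each new event arrives."""

    def __init__(self):
        self.totals = Counter()

    def ingest(self, user, views):
        self.totals[user] += views


def serve(batch, speed, user):
    """Serving layer: answer a query by merging the batch and real-time views."""
    return batch[user] + speed.totals[user]


batch = batch_view(historical_events)

speed = SpeedView()
for user, views in recent_events:
    speed.ingest(user, views)

result = {user: serve(batch, speed, user) for user in ("alice", "bob", "carol")}
print(result)  # {'alice': 5, 'bob': 5, 'carol': 1}
```

The design point in this sketch is that the serving layer never touches raw events. It only combines a precomputed batch view with a small, frequently updated real-time view, which is what lets a single query cover both historical and fresh data.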

By the end of the chapter, you will understand the core design principles and technology choices for building ...
