Apache Hudi: The Definitive Guide

by Shiyan Xu, Prashant Wason, Bhavani Sudha Saktheeswaran, Rebecca Bilbro
October 2025
Intermediate to advanced
290 pages
7h 43m
English
O'Reilly Media, Inc.

Chapter 10. Building an End-to-End Lakehouse Solution

Having established the operational foundations to run a production lakehouse, we’re ready to build a comprehensive, integrated solution atop Hudi. This chapter will demonstrate how to construct an end-to-end production data lakehouse architecture with Apache Hudi as its foundation. Rather than examining isolated components, we’ll follow a single dataset through its entire lifecycle, from initial ingestion to analytical insights and AI-driven applications.

Modern data architectures require seamless data integration from upstream sources, unified support for both streaming and batch processing, reliable handling of diverse data types, and the ability to serve multiple downstream consumers with varying requirements. The magic isn’t about having perfect data, but about nimbly stitching together key features to deliver novel insights despite real-world problems like data silos and operational challenges. You have to “make data easy” for your organization and empower your teams to build on top of it.

This chapter will explain how to tackle these challenges by combining multiple processing frameworks on top of a unified data lakehouse. Hudi’s versatility supports this level of integration while making it easy to do things “the right way” with respect to data consistency, performance, and governance.

In this chapter, we’ll construct a complete data platform that progressively transforms raw data into business ...


Publisher Resources

ISBN: 9781098173821