Apache Iceberg: The Definitive Guide

by Tomer Shiran, Jason Hughes, Alex Merced
May 2024
Intermediate to advanced
344 pages
8h 40m
English
O'Reilly Media, Inc.

Chapter 14. Real-World Use Cases of Apache Iceberg

In this chapter, we will explore some real-world applications of Apache Iceberg and give you hands-on experience running different analytical use cases supported by a lakehouse architecture. These use cases include ensuring data quality in data lakes, building business intelligence (BI) reports, and implementing critical processes such as change data capture (CDC). Additional use cases, covering a real-time analytical architecture, machine learning (ML) workloads, and slowly changing dimensions (SCDs), are available in the book's supplemental repository. This chapter is a practical introductory guide that shows how to tackle essential real-world applications using Iceberg and highlights its adaptability and importance as a core element in any data architecture.

Ensuring High-Quality Data with Write-Audit-Publish in Apache Iceberg

Maintaining a high level of data quality is crucial for deriving meaningful insights. If data quality is compromised at any point in a data engineering workflow, it can adversely affect downstream analyses such as BI and predictive analytics. For example, consider an extract, transform, and load (ETL) process that takes data from an operational system and transfers it to an analytical system for use in BI reports or ad hoc analyses. If the original data has duplicates or inconsistencies, or if such issues are introduced during the ETL process and are not addressed before reaching the production ...
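To make the idea concrete, here is a minimal sketch of the write-audit-publish flow named in the section title, applied to the duplicate and missing-value problems described above. It uses plain Python lists as stand-ins for a staging area and a production table and does not use any Iceberg API. The function names (`audit`, `write_audit_publish`) and the `order_id` key column are hypothetical, chosen only for illustration.

```python
from collections import Counter


def audit(rows, key="order_id"):
    """Return a list of problems found in the staged rows (empty if clean).

    Hypothetical checks: every row must have a non-null key, and no key
    may appear more than once.
    """
    problems = []
    key_counts = Counter(row.get(key) for row in rows)
    for value, count in key_counts.items():
        if value is None:
            problems.append(f"{count} row(s) missing {key}")
        elif count > 1:
            problems.append(f"duplicate {key}={value} ({count} rows)")
    return problems


def write_audit_publish(staged_rows, production, key="order_id"):
    """Publish staged rows to production only if the audit passes.

    `staged_rows` plays the role of the "write" step (data landed somewhere
    consumers cannot see); `production` is what consumers read.
    """
    problems = audit(staged_rows, key)
    if problems:
        return problems  # audit failed: production is left untouched
    production.extend(staged_rows)  # audit passed: publish
    return []


if __name__ == "__main__":
    production = []
    staged = [{"order_id": 1}, {"order_id": 1}, {"order_id": None}]
    print(write_audit_publish(staged, production))  # lists the problems
    print(len(production))  # production still empty
```

The point of the design is that bad data never becomes visible to downstream consumers: the audit runs against the staged copy, and the publish step happens only after every check passes.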

Publisher Resources

ISBN: 9781098148614