Preface
Welcome to Apache Polaris: The Definitive Guide. This book guides you through building and managing scalable, secure, and flexible data lakehouses with Apache Polaris™, a community-driven catalog project. As data lakehouses continue to evolve, Polaris represents the next generation of catalog solutions, offering unified data management, role-based access control, and multi-catalog support, all while promoting open standards and interoperability across cloud and on-premises environments.
The story of Apache Polaris begins with the data lakehouse architecture and the critical role that Apache Iceberg™ plays in making data lakehouses performant, reliable, and accessible. In the first part of this book, we’ll dive deep into the origins and architecture of data lakehouses, explore the challenges they were designed to solve, and walk through the capabilities that Apache Iceberg brings to modern data lakes. As data becomes increasingly central to business operations, Iceberg’s robust table format has emerged as an essential tool for managing data at scale, providing features such as ACID transactions, schema evolution, and efficient querying. We’ll also look at how Iceberg catalogs were originally developed to bring this table format to life, making data lakehouses more accessible and consistent.
Note
Apache Polaris is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the ...