Chapter 9. Using Apache Polaris with Dremio
In this chapter, we explore how to integrate Apache Polaris with Dremio, a high-performance intelligent data lakehouse platform. Dremio co-created the Polaris project with Snowflake and has robust support for it. By connecting Dremio to Polaris, you can query and create Apache Iceberg tables with Polaris acting as the metadata authority and Dremio serving as the execution engine that reads and writes data on cloud storage.
This division of labor enables a seamless lakehouse architecture: Polaris handles table metadata (schemas, snapshots, partition information, and so on), while Dremio handles query processing and query federation. As a result, you can join your Iceberg tables with data in other databases, data lakes, and data warehouses within a governed semantic layer (as illustrated in Figure 9-1).
We will cover setting up the connection in Dremio, configuring authentication and storage properties, and using Dremio SQL to work with Polaris-managed Iceberg tables. All examples assume that you have a running Polaris service with a catalog already created and accessible through its REST API, and that you are using Dremio Enterprise Edition version 26.0 or later. You can try Dremio by visiting https://www.dremio.com/get-started. Dremio Enterprise Edition also includes its own integrated Iceberg catalog powered by Apache Polaris, which makes it another option for a managed Polaris catalog alongside Snowflake Open Catalog.
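Before connecting Dremio, it can help to confirm the prerequisite above: that the Polaris catalog is reachable through its REST API. The following is a minimal Python sketch, not a definitive check. It builds a request for the configuration endpoint defined by the Iceberg REST catalog specification (GET /v1/config with a warehouse query parameter). The helper name build_config_request, the base URL http://localhost:8181/api/catalog, the catalog name my_catalog, and the token placeholder are all assumptions for illustration; substitute the values for your own deployment.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_config_request(base_url: str, catalog_name: str, token: str) -> Request:
    """Build (but do not send) a GET request for the Iceberg REST config endpoint.

    base_url is the catalog API root of the Polaris service (assumed here to
    end in /api/catalog), catalog_name is the Polaris catalog to look up, and
    token is a bearer token already obtained from Polaris.
    """
    query = urlencode({"warehouse": catalog_name})
    url = f"{base_url.rstrip('/')}/v1/config?{query}"
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


# Hypothetical values: replace them with your own deployment's details.
request = build_config_request(
    "http://localhost:8181/api/catalog", "my_catalog", "<access-token>"
)
print(request.full_url)
```

Passing the request to urllib.request.urlopen sends it. A successful JSON response suggests the catalog is visible to the credentials you plan to give Dremio, while an authorization or not-found error usually points to a problem with the token or the catalog name.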