Chapter 2. Getting Started with DuckLake
As discussed in Chapter 1, getting DuckLake up and running is quick and straightforward. Beyond the initial setup, however, there are several architectural decisions to consider: which backend database will host the catalog, where the data will be stored, and which compute engine will run your workloads. The key advantage is that these decisions do not have to be finalized on day one. You can begin with a local DuckLake instance that uses a DuckDB-managed catalog and local storage, and then evolve it over time, for example by moving the catalog to PostgreSQL and storing the data in a cloud object store such as S3.
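To make the progression above concrete, here is a minimal sketch of how the two ends of that path might differ. It only builds the `ATTACH` statements as strings and does not connect to anything. The `LakeConfig` class, the `attach_sql` helper, the alias `my_lake`, the file name `metadata.ducklake`, the PostgreSQL database name, and the bucket path are all hypothetical names chosen for illustration. The statement shape (`ducklake:` prefix plus a `DATA_PATH` option) is assumed from the DuckLake extension's documented syntax, so check it against the version you are running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LakeConfig:
    """Hypothetical description of one DuckLake deployment."""
    catalog: str            # catalog connection string, placed after 'ducklake:'
    data_path: str          # location of the data files
    alias: str = "my_lake"  # name the lake is attached under (placeholder)


def attach_sql(cfg: LakeConfig) -> str:
    """Build the ATTACH statement for the given deployment (string only)."""
    return (
        f"ATTACH 'ducklake:{cfg.catalog}' AS {cfg.alias} "
        f"(DATA_PATH '{cfg.data_path}');"
    )


# Day one: DuckDB-managed catalog file and local storage.
local = LakeConfig(catalog="metadata.ducklake", data_path="data_files/")

# Later: catalog in PostgreSQL, data in S3 (all names are placeholders).
cloud = LakeConfig(
    catalog="postgres:dbname=ducklake_catalog host=localhost",
    data_path="s3://example-bucket/lake/",
)

for cfg in (local, cloud):
    print(attach_sql(cfg))
```

Under these assumptions, only the catalog string and the data path change between the two configurations; the alias, and therefore the queries written against it, stay the same. Executing the statements would require a DuckDB connection with the `ducklake` extension loaded, plus PostgreSQL and S3 support for the second configuration.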
This chapter will explore the major DuckLake deployment patterns and outline the trade-offs of each. It will also provide practical examples that demonstrate how to implement these configurations as your requirements evolve. Additionally, we will cover cloud and SaaS options for DuckLake and how to load data and perform basic CRUD operations. By the end of the chapter, ...