Book description
DuckDB, an open source in-process database created for OLAP workloads, provides key advantages over more mainstream OLAP solutions: It's embeddable and optimized for analytics. It also integrates well with Python and is compatible with SQL, giving you the performance and flexibility of SQL right within your Python environment. This handy guide shows you how to get started with this versatile and powerful tool.
Author Wei-Meng Lee takes developers and data professionals through DuckDB's primary features and functions, best practices, and practical examples of how you can use DuckDB for a variety of data analytics tasks. You'll also dive into specific topics, including how to import data into DuckDB, work with tables, perform exploratory data analysis, visualize data, perform spatial analysis, and use DuckDB with JSON files, Polars, and JupySQL.
- Understand the purpose of DuckDB and its main functions
- Conduct data analytics tasks using DuckDB
- Integrate DuckDB with pandas, Polars, and JupySQL
- Use DuckDB to query your data
- Perform spatial analytics using DuckDB's spatial extension
- Work with a diverse range of data including Parquet, CSV, and JSON
Table of contents
- Preface
- 1. Getting Started with DuckDB
- 2. Importing Data into DuckDB
- 3. A Primer on SQL
- 4. Using DuckDB with Polars
- 5. Performing EDA with DuckDB
  - Our Dataset: The 2015 Flight Delays Dataset
  - Geospatial Analysis
  - Performing Descriptive Analytics
    - Finding the Airports for Each State and City
    - Aggregating the Total Number of Airports in Each State
    - Obtaining the Flight Counts for Each Pair of Origin and Destination Airports
    - Getting the Canceled Flights from Airlines
    - Getting the Flight Count for Each Day of the Week
    - Finding the Most Common Timeslot for Flight Delays
    - Finding the Airlines with the Most and Fewest Delays
  - Summary
- 6. Using DuckDB with JSON Files
- 7. Using DuckDB with JupySQL
- 8. Accessing Remote Data Using DuckDB
- 9. Using DuckDB in the Cloud with MotherDuck
- Index
- About the Author
Product information
- Title: DuckDB: Up and Running
- Author(s): Wei-Meng Lee
- Release date: December 2024
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098159696