Book description
With the surge in big data and AI, organizations can rapidly create data products. However, the effectiveness of their analytics and machine learning models depends on the data's quality. Delta Lake's open source format offers a robust lakehouse framework over cloud storage platforms like Amazon S3, ADLS, and GCS.
This practical book shows data engineers, data scientists, and data analysts how to get Delta Lake and its features up and running. The ultimate goal of building data pipelines and applications is to gain insights from data, and you'll learn how your choice of storage solution determines the robustness and performance of the pipeline, from raw data to insights.
You'll learn how to:
- Use modern data management and data engineering techniques
- Understand how ACID transactions bring reliability to data lakes at scale
- Run streaming and batch jobs against your data lake concurrently
- Execute update, delete, and merge commands against your data lake
- Use time travel to roll back and examine previous data versions
- Build a streaming data quality pipeline following the medallion architecture
Table of contents
- Preface
- 1. The Evolution of Data Architectures
- 2. Getting Started with Delta Lake
- 3. Basic Operations on Delta Tables
- 4. Table Deletes, Updates, and Merges
- 5. Performance Tuning
- 6. Using Time Travel
- 7. Schema Handling
- 8. Operations on Streaming Data
- 9. Delta Sharing
- 10. Building a Lakehouse on Delta Lake
- Index
- About the Author
Product information
- Title: Delta Lake: Up and Running
- Author(s):
- Release date: October 2023
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098139728