Delta Lake: Up and Running

Book description

With the surge in big data and AI, organizations can rapidly create data products. However, the effectiveness of their analytics and machine learning models depends on the quality of their data. Delta Lake's open source format offers a robust lakehouse framework on top of cloud object stores such as Amazon S3, ADLS, and GCS.

This practical book shows data engineers, data scientists, and data analysts how to get Delta Lake and its features up and running. Because the ultimate goal of building data pipelines and applications is to gain insights from data, you'll understand how your choice of storage solution determines the robustness and performance of the pipeline, from raw data to insights.

You'll learn how to:

  • Use modern data management and data engineering techniques
  • Understand how ACID transactions bring reliability to data lakes at scale
  • Run streaming and batch jobs against your data lake concurrently
  • Execute update, delete, and merge commands against your data lake
  • Use time travel to roll back and examine previous data versions
  • Build a streaming data quality pipeline following the medallion architecture
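
To give a sense of what these exercises look like in practice, here is a minimal PySpark sketch, assuming a local Spark session with the delta-spark package installed (pip install delta-spark); the table path and app name are illustrative, not taken from the book:

    import pyspark
    from delta import configure_spark_with_delta_pip

    # Configure a local SparkSession with the Delta Lake extensions.
    builder = (
        pyspark.sql.SparkSession.builder.appName("delta-quickstart")
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    )
    spark = configure_spark_with_delta_pip(builder).getOrCreate()

    # Write a small DataFrame as a Delta table (path is hypothetical).
    spark.range(0, 5).write.format("delta").mode("overwrite").save("/tmp/delta/demo")

    # Read the table back, then query its first version with time travel.
    spark.read.format("delta").load("/tmp/delta/demo").show()
    spark.read.format("delta").option("versionAsOf", 0).load("/tmp/delta/demo").show()

Chapter 2 walks through this kind of setup in detail, and Chapter 6 covers time travel and data retention.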

Table of contents

  1. Preface
    1. How to Contact Us
    2. Conventions Used in This Book
    3. Using Code Examples
    4. O’Reilly Online Learning
    5. Acknowledgments
  2. 1. The Evolution of Data Architectures
    1. A Brief History of Relational Databases
    2. Data Warehouses
      1. Data Warehouse Architecture
      2. Dimensional Modeling
    3. Data Warehouse Benefits and Challenges
    4. Introducing Data Lakes
    5. Data Lakehouse
      1. Data Lakehouse Benefits
      2. Implementing a Lakehouse
    6. Delta Lake
    7. The Medallion Architecture
    8. The Delta Ecosystem
      1. Delta Lake Storage
      2. Delta Sharing
      3. Delta Connectors
    9. Conclusion
  3. 2. Getting Started with Delta Lake
    1. Getting a Standard Spark Image
    2. Using Delta Lake with PySpark
    3. Running Delta Lake in the Spark Scala Shell
    4. Running Delta Lake on Databricks
    5. Creating and Running a Spark Program: helloDeltaLake
    6. The Delta Lake Format
      1. Parquet Files
      2. Writing a Delta Table
    7. The Delta Lake Transaction Log
      1. How the Transaction Log Implements Atomicity
      2. Breaking Down Transactions into Atomic Commits
      3. The Transaction Log at the File Level
      4. Scaling Massive Metadata
    8. Conclusion
  4. 3. Basic Operations on Delta Tables
    1. Creating a Delta Table
      1. Creating a Delta Table with SQL DDL
      2. The DESCRIBE Statement
      3. Creating Delta Tables with the DataFrameWriter API
      4. Creating a Delta Table with the DeltaTableBuilder API
      5. Generated Columns
    2. Reading a Delta Table
      1. Reading a Delta Table with SQL
      2. Reading a Table with PySpark
    3. Writing to a Delta Table
      1. Cleaning Out the YellowTaxis Table
      2. Inserting Data with SQL INSERT
      3. Appending a DataFrame to a Table
    4. Using the OverWrite Mode When Writing to a Delta Table
      1. Inserting Data with the SQL COPY INTO Command
      2. Partitions
    5. User-Defined Metadata
      1. Using SparkSession to Set Custom Metadata
      2. Using the DataFrameWriter to Set Custom Metadata
    6. Conclusion
  5. 4. Table Deletes, Updates, and Merges
    1. Deleting Data from a Delta Table
      1. Table Creation and DESCRIBE HISTORY
      2. Performing the DELETE Operation
      3. DELETE Performance Tuning Tips
    2. Updating Data in a Table
      1. Use Case Description
      2. Updating Data in a Table
      3. UPDATE Performance Tuning Tips
    3. Upsert Data Using the MERGE Operation
      1. Use Case Description
      2. The MERGE Dataset
      3. The MERGE Statement
      4. Analyzing the MERGE Operation with DESCRIBE HISTORY
      5. Inner Workings of the MERGE Operation
    4. Conclusion
  6. 5. Performance Tuning
    1. Data Skipping
    2. Partitioning
      1. Partitioning Warnings and Considerations
    3. Compact Files
      1. Compaction
      2. OPTIMIZE
    4. ZORDER BY
      1. ZORDER BY Considerations
    5. Liquid Clustering
      1. Enabling Liquid Clustering
      2. Operations on Clustered Columns
      3. Liquid Clustering Warnings and Considerations
    6. Conclusion
  7. 6. Using Time Travel
    1. Delta Lake Time Travel
      1. Restoring a Table
      2. Restoring via Timestamp
      3. Time Travel Under the Hood
      4. RESTORE Considerations and Warnings
      5. Querying an Older Version of a Table
    2. Data Retention
      1. Data File Retention
      2. Log File Retention
      3. Setting File Retention Duration Example
      4. Data Archiving
    3. VACUUM
      1. VACUUM Syntax and Examples
      2. How Often Should You Run VACUUM and Other Maintenance Tasks?
      3. VACUUM Warnings and Considerations
    4. Change Data Feed
      1. Enabling the CDF
      2. Viewing the CDF
      3. CDF Warnings and Considerations
    5. Conclusion
  8. 7. Schema Handling
    1. Schema Validation
      1. Viewing the Schema in the Transaction Log Entries
      2. Schema on Write
      3. Schema Enforcement Example
    2. Schema Evolution
      1. Adding a Column
      2. Missing Data Column in Source DataFrame
      3. Changing a Column Data Type
      4. Adding a NullType Column
    3. Explicit Schema Updates
      1. Adding a Column to a Table
      2. Adding Comments to a Column
      3. Changing Column Ordering
      4. Delta Lake Column Mapping
      5. Renaming a Column
      6. Replacing the Table Columns
      7. Dropping a Column
      8. The REORG TABLE Command
      9. Changing Column Data Type or Name
    4. Conclusion
  9. 8. Operations on Streaming Data
    1. Streaming Overview
      1. Spark Structured Streaming
      2. Delta Lake and Structured Streaming
    2. Streaming Examples
      1. Hello Streaming World
      2. AvailableNow Streaming
      3. Updating the Source Records
      4. Reading a Stream from the Change Data Feed
    3. Conclusion
  10. 9. Delta Sharing
    1. Conventional Methods of Data Sharing
      1. Legacy and Homegrown Solutions
      2. Proprietary Vendor Solutions
      3. Cloud Object Storage
    2. Open Source Delta Sharing
      1. Delta Sharing Goals
    3. Delta Sharing Under the Hood
      1. Data Providers and Recipients
      2. Benefits of the Design
    4. The delta-sharing Repository
      1. Step 1: Installing the Python Connector
      2. Step 2: Installing the Profile File
      3. Step 3: Reading a Shared Table
    5. Conclusion
  11. 10. Building a Lakehouse on Delta Lake
    1. Storage Layer
      1. What Is a Data Lake?
      2. Types of Data
      3. Key Benefits of a Cloud Data Lake
    2. Data Management
    3. SQL Analytics
      1. SQL Analytics via Spark SQL
      2. SQL Analytics via Other Delta Lake Integrations
    4. Data for Data Science and Machine Learning
      1. Challenges with Traditional Machine Learning
      2. Delta Lake Features That Support Machine Learning
      3. Putting It All Together
    5. Medallion Architecture
      1. The Bronze Layer (Raw Data)
      2. The Silver Layer
      3. The Gold Layer
      4. The Complete Lakehouse
    6. Conclusion
  12. Index
  13. About the Author

Product information

  • Title: Delta Lake: Up and Running
  • Author(s): Bennie Haelen, Dan Davis
  • Release date: October 2023
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098139728