Chapter 2. Understanding Delta Sharing

Delta Sharing is an open protocol, developed by Databricks, that changes how organizations share and exchange data. It provides a simple, secure, and open way for data providers and consumers to share data across organizations in real time, regardless of which computing platforms they use. The protocol is built on top of Delta Lake, an open source storage layer that brings reliability to data lakes. Delta Lake provides ACID (atomicity, consistency, isolation, durability) transactions and scalable metadata handling, and it unifies streaming and batch data processing on top of an existing data lake. Unlike traditional methods, Delta Sharing uses a streamlined REST (representational state transfer) protocol to grant access to specific segments of cloud datasets. REST is a widely used architectural style for web services that exchanges data over HTTP in a lightweight manner. Delta Sharing natively integrates with cloud storage systems such as Amazon S3, Azure Data Lake Storage (ADLS), Cloudflare R2, and Google Cloud Storage (GCS), ensuring seamless and dependable data transfer between data providers and recipients.
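To make the REST aspect concrete, here is a minimal sketch of how a recipient might construct a table query request. It assumes the path layout and profile fields (`endpoint`, `bearerToken`) described in the open Delta Sharing protocol specification. The endpoint URL, the share, schema, and table names, and the helper `build_query_request` are all hypothetical and used for illustration only. The request is built but never sent.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote


def build_query_request(
    profile: dict,
    share: str,
    schema: str,
    table: str,
    limit_hint: Optional[int] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request asking a sharing server for a table's data files.

    Hypothetical helper: the path layout is assumed from the Delta Sharing protocol.
    """
    endpoint = profile["endpoint"].rstrip("/")
    # Percent-encode each path segment so unusual names cannot break the URL.
    path = "/".join(quote(part, safe="") for part in (share, schema, table))
    share_, schema_, table_ = path.split("/")
    url = f"{endpoint}/shares/{share_}/schemas/{schema_}/tables/{table_}/query"

    body = {}
    if limit_hint is not None:
        # Optional hint asking the server to return roughly this many rows.
        body["limitHint"] = limit_hint

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The recipient authenticates with the bearer token from its profile.
            "Authorization": f"Bearer {profile['bearerToken']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical profile; a real one is issued by the data provider.
profile = {
    "shareCredentialsVersion": 1,
    "endpoint": "https://sharing.example.com/delta-sharing/",
    "bearerToken": "<token>",
}

req = build_query_request(profile, "sales_share", "retail", "orders", limit_hint=100)
print(req.get_method(), req.full_url)
```

In practice, most recipients would not build these requests by hand. A connector such as the open source `delta-sharing` Python package wraps this exchange, and the server's response points the client at files in cloud storage rather than streaming the table itself.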

Delta Sharing fundamentally simplifies how data is accessed. Whether you are working with live data, notebooks, dashboards, or machine learning and AI models, Delta Sharing enables secure sharing from your lakehouse to any computing environment. Notably, it frees data from the confines of proprietary systems, enabling data sharing ...
