© Ed Elliott 2021
E. Elliott, Introducing .NET for Apache Spark, https://doi.org/10.1007/978-1-4842-6992-3_11

11. Delta Lake

Ed Elliott, Sussex, UK

Delta Lake is an extension to Apache Spark created by Databricks, the company behind Apache Spark, and released as a separate open source project. Delta Lake aims to make writing to data lakes efficient in an enterprise environment, whichever type of data lake you have: Azure Data Lake Storage, AWS S3, or Hadoop. It brings the ACID properties that you would expect from a relational database such as Microsoft SQL Server or Oracle to a remote file system such as a data lake.

ACID

When we work with an RDBMS such as Microsoft SQL Server or Oracle, we talk about the ACID properties ...
