7

Performance Tuning in Delta Lake

Delta Lake is an open source storage layer that brings ACID transactions, reliable data versioning, and schema evolution to data lakes. This chapter covers several techniques for improving query performance in Delta Lake: optimizing table partitioning, caching tables for fast query response, organizing data with Z-ordering, skipping data that a query does not need, and reducing table size and I/O cost with compression.

We will cover the following recipes in this chapter:

  • Optimizing Delta Lake table partitioning for query performance
  • Organizing data with Z-ordering for efficient query execution
  • Skipping data for faster query execution
  • Reducing Delta Lake table size ...
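To give a feel for the idea behind the data-skipping recipe, here is a minimal sketch in plain Python. Delta Lake keeps per-file minimum and maximum column values in its transaction log and can use them to avoid reading files that cannot match a filter. The sketch only mimics that pruning decision; `FileStats` and `files_to_scan` are hypothetical names made up for illustration and are not part of the Delta Lake API, and the file names and value ranges are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Hypothetical min/max statistics for one column in one data file."""
    path: str
    min_value: int
    max_value: int


def files_to_scan(files, lower, upper):
    """Keep only files whose [min, max] range overlaps lower <= col <= upper."""
    return [f for f in files if f.max_value >= lower and f.min_value <= upper]


# Invented layout: values are clustered, so file ranges do not overlap.
files = [
    FileStats("part-000.parquet", 1, 100),
    FileStats("part-001.parquet", 101, 200),
    FileStats("part-002.parquet", 201, 300),
]

# A filter such as "col BETWEEN 150 AND 180" can only match the second file.
selected = files_to_scan(files, 150, 180)
print([f.path for f in selected])  # prints ['part-001.parquet']
```

Pruning of this kind works best when each file covers a narrow, mostly non-overlapping range of values, which is the layout that partitioning and Z-ordering try to produce.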
