R. L'Esteve, The Azure Data Lakehouse Toolkit, https://doi.org/10.1007/978-1-4842-8233-5_12

12. Dynamic Partition Pruning

Ron L’Esteve¹

(1) Chicago, IL, USA

Database pruning is an optimization that avoids reading files that do not contain the data you are searching for. If your query filters on a partition column, entire sets of partition files can be skipped. In Apache Spark, dynamic partition pruning is a capability that combines logical and physical optimizations: it finds the filter on the dimension side, ensures that the filter executes only once there, and then applies the filter directly to the scan of the table, which speeds ...
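The idea can be illustrated with a minimal, engine-free sketch in plain Python. This is not Spark's implementation; it only mimics the three steps described above on toy data. The table contents, the `season` column, and the function name `dynamic_prune_join` are all hypothetical, and the sketch assumes each dimension key appears exactly once.

```python
# Toy "partitioned table": one list of rows per partition value (hypothetical data).
fact_partitions = {
    "2021-01": [{"month": "2021-01", "amount": 10}, {"month": "2021-01", "amount": 5}],
    "2021-02": [{"month": "2021-02", "amount": 7}],
    "2021-03": [{"month": "2021-03", "amount": 3}, {"month": "2021-03", "amount": 4}],
}

# Toy dimension table keyed by the same column (hypothetical data).
dim_months = [
    {"month": "2021-01", "season": "winter"},
    {"month": "2021-02", "season": "winter"},
    {"month": "2021-03", "season": "spring"},
]


def dynamic_prune_join(partitions, dim_rows, dim_predicate, key):
    """Join a partitioned table to a dimension, scanning only needed partitions."""
    # Step 1: evaluate the dimension filter exactly once.
    matching_dim = {row[key]: row for row in dim_rows if dim_predicate(row)}

    # Step 2: the surviving keys become a runtime filter on the partition column.
    scanned = []
    joined = []
    for part_key, rows in partitions.items():
        if part_key not in matching_dim:
            continue  # partition pruned: its rows are never read
        scanned.append(part_key)
        # Step 3: join only the rows from partitions that were actually scanned.
        for row in rows:
            joined.append({**row, **matching_dim[part_key]})
    return joined, scanned


joined, scanned = dynamic_prune_join(
    fact_partitions,
    dim_months,
    dim_predicate=lambda row: row["season"] == "spring",
    key="month",
)
print("partitions scanned:", scanned)
print("joined rows:", joined)
```

The filter is written against the dimension table only, yet it determines which partitions of the other table are read. That is what distinguishes dynamic pruning from static pruning, where the filter must appear directly on the partition column of the table being scanned.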
