R. L'Esteve, The Azure Data Lakehouse Toolkit, https://doi.org/10.1007/978-1-4842-8233-5_13

13. Z-Ordering and Data Skipping

Ron L’Esteve¹

(1) Chicago, IL, USA

When querying terabytes or petabytes of data for analytics with Apache Spark, query speed is critical. Databricks provides several optimization commands that make queries faster and more efficient. Z-Ordering and Data Skipping are two such optimization features, and this chapter gets started with testing and using them in Databricks notebooks.

Z-Ordering is a method used by Apache Spark to combine ...

