October 2024
Intermediate to advanced
246 pages
6h 22m
English
This chapter introduces several techniques for managing the data quality of datasets in a data pipeline. We'll introduce expectations in Delta Live Tables (DLT), a way to enforce data quality constraints on arriving data before merging it into downstream tables. Later in the chapter, we'll look at more advanced techniques, such as quarantining bad data for human intervention, and we'll see how constraints can be decoupled so that non-technical personas within your organization can manage them separately. By the end of the chapter, you should have a firm understanding of how to take measures that ensure the data integrity of datasets in your lakehouse and how ...
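To make the quarantine idea concrete, here is a minimal, framework-free sketch of the pattern the chapter describes: each arriving record is checked against a set of named constraints, and records that fail any constraint are routed to a quarantine collection for human review instead of flowing downstream. The function name, constraint names, and record shape are illustrative assumptions, not the DLT API itself (DLT expresses expectations declaratively, e.g. via decorators on table definitions).

```python
# Illustrative sketch of the quarantine pattern: split arriving records into
# valid records (safe to merge downstream) and quarantined records (held for
# human intervention). All names here are hypothetical, not DLT APIs.

def apply_expectations(records, expectations):
    """Split records into (valid, quarantined) using named predicates.

    expectations: dict mapping a constraint name to a predicate function
    that returns True when the record satisfies the constraint.
    """
    valid, quarantined = [], []
    for record in records:
        failed = [name for name, check in expectations.items()
                  if not check(record)]
        if failed:
            # Keep the record together with the constraints it violated,
            # so a human reviewer can see why it was held back.
            quarantined.append({"record": record,
                                "failed_expectations": failed})
        else:
            valid.append(record)
    return valid, quarantined


# Example: arriving order events checked against two constraints.
expectations = {
    "positive_quantity": lambda r: r.get("quantity", 0) > 0,
    "has_customer_id": lambda r: r.get("customer_id") is not None,
}

orders = [
    {"order_id": 1, "customer_id": "C1", "quantity": 3},
    {"order_id": 2, "customer_id": None, "quantity": 1},
]

valid, quarantined = apply_expectations(orders, expectations)
```

Because the constraints are plain named entries in a dictionary, they could be loaded from an external source (a config file or table) rather than hard-coded, which is the same decoupling idea the chapter develops for non-technical constraint owners.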