© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
M. Kromer, Mapping Data Flows in Azure Data Factory, https://doi.org/10.1007/978-1-4842-8612-8_7

7. Data Deduplication

Mark Kromer
Snohomish, WA, USA

In the previous chapter, we examined the first data flow pattern in this book in depth: the common slowly changing dimensions pattern. In this chapter, we continue the deep-dive exploration with another data flow pattern: data deduplication.

The Need for Data Deduplication

Part of the role of data engineering and ETL jobs is to ensure that the data processed for business use is clean and serves as a single source of truth. Deduping data is extremely important and ...
