CHAPTER 6 Profiling for Data Quality

INTRODUCTION

In the previous chapters, we discussed how data is a valuable business asset that can transform business performance when its quality is high. Many companies are plagued with poor data quality, yet few have a good understanding of how good or bad their data actually is. Before you fix data quality, you need to understand where you stand so that you know what to fix. So, how can one objectively assess the quality of a data set? The answer is to profile the data with the right KPIs (key performance indicators).
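As a rough illustration of what profiling a data set against KPIs can look like, the following Python sketch computes two commonly used measures for a single column: completeness (the share of values that are present) and uniqueness (the share of present values that are distinct). The function name `profile_column`, the choice of these two measures, the rule that `None` and empty strings count as missing, and the sample data are all assumptions made for this example; they are not taken from the chapter.

```python
from typing import Any, Dict, List


def profile_column(values: List[Any]) -> Dict[str, float]:
    """Compute two illustrative profiling KPIs for one column.

    Assumption: None and empty strings count as missing values.
    """
    total = len(values)
    # Keep only the values that are actually present.
    present = [v for v in values if v is not None and v != ""]
    # Completeness: fraction of all values that are present.
    completeness = len(present) / total if total else 0.0
    # Uniqueness: fraction of present values that are distinct.
    uniqueness = len(set(present)) / len(present) if present else 0.0
    return {"completeness": completeness, "uniqueness": uniqueness}


# Hypothetical customer e-mail column, invented for illustration only.
emails = ["a@example.com", "b@example.com", None, "a@example.com", ""]
kpis = profile_column(emails)
print(kpis)
```

In this invented column, three of the five values are present and two of those three are distinct, so the sketch reports where the column stands before any cleanup is attempted. Which measures and thresholds matter in practice depends on the business objectives the KPI is meant to serve.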

A key performance indicator (KPI) is a measurable value that demonstrates how effectively the entity being measured is achieving its key objectives. When designing KPIs for data quality, three key rules are recommended (Southekal 2020):

  1. Why do you want to know? How much do you want to know? What is the value of knowing and not knowing?
  2. Who owns this KPI? Knowing the KPI's ownership is the key to realizing change.
  3. Is the KPI owner close to data? If the KPI owner is close to data, that means they are close to the business processes and implementing the change will be much easier.

Broadly, data quality issues, especially logical data decay issues, can be either visible or hidden. Visible data quality problems are easy to see, and solving them creates tangible, quick value for the company. But, as you go deeper, visibility ...
