• Clearly understand why data quality is critical in a data warehouse
• Observe the challenges posed by corrupt data and learn the methods to deal with them
• Appreciate the benefits of quality data
• Review the various categories of data quality tools and examine their usage
• Study the implications of a data quality initiative and learn practical tips on data quality
• Review Master Data Management (MDM) and check its applicability to data quality in the data warehouse

Imagine a small, seemingly inconsequential error creeping into one of your operational systems. Suppose that, while collecting customer data in that operational system, users consistently enter erroneous sales region codes. The region codes of the customers are all wrong, but in the operational system their accuracy may not matter much, because no invoices are going to be mailed to customers using region codes. The region codes are entered for marketing purposes.

Now take the customer data to the next step and move it into the data warehouse. What is the consequence of this error? Every analysis that your data warehouse users perform by region code will seriously misrepresent the facts. An error that seems irrelevant in the operational systems can grossly distort the results from the data warehouse. This example may not appear to be the true state of affairs in many data warehouses, but ...
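The region-code scenario can be made concrete with a small sketch. The customer records, region names, and sales figures below are invented purely for illustration, and `sales_by_region` is a hypothetical helper, not part of any particular warehouse tool. The sketch totals the same sales twice: once by the region each customer actually belongs to, and once by the region code as it was keyed into the operational system.

```python
from collections import defaultdict

# Hypothetical records: (customer_id, true_region, entered_region, sales).
# All values are made up for illustration.
RECORDS = [
    ("C001", "EAST", "EAST", 1200.0),
    ("C002", "EAST", "WEST", 800.0),    # region code keyed incorrectly
    ("C003", "WEST", "WEST", 500.0),
    ("C004", "WEST", "EAST", 1500.0),   # region code keyed incorrectly
    ("C005", "NORTH", "EAST", 1000.0),  # region code keyed incorrectly
]


def sales_by_region(records, region_index):
    """Sum the sales column, grouped by the region found at region_index."""
    totals = defaultdict(float)
    for record in records:
        totals[record[region_index]] += record[3]
    return dict(totals)


# What the analysis should show, using the correct regions.
true_totals = sales_by_region(RECORDS, 1)

# What the warehouse actually reports, using the codes as entered.
reported_totals = sales_by_region(RECORDS, 2)

print("True totals:    ", true_totals)
print("Reported totals:", reported_totals)
print("Grand total unchanged:",
      sum(true_totals.values()) == sum(reported_totals.values()))
```

In this made-up data the grand total is identical either way, which is why the operational side never notices the problem. The regional breakdown, however, shifts from an even EAST/WEST split with a separate NORTH region to one where EAST appears to dominate and NORTH disappears entirely.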
