Chapter 18 Data quality in DW 2.0

The DW 2.0 environment departs from the “code, load, and explode” world that was the norm for first-generation data warehouses. In that world, no attention was paid to data quality until the very last moment, the 11th hour of the project. Only then, while loading the data warehouse with data extracted from source systems, did the project team discover the “gremlins” lurking in the source data. This led to enormous frustration and inevitably to major delays in implementation schedules. Data quality problems discovered during the testing or loading stage can be a major cause of projects going over time and budget, both of which are leading indicators of project failure. ...
