4.3. Dimension ETL: The Challenge
As already mentioned, dimension ETL often consumes the largest portion of a data warehouse ETL effort. But even beyond the challenges presented of handling history, managing surrogate keys, and accounting for different dimension designs, dimension ETL must manage the source system complexities, and must consolidate them down from the source to the dimension structure in the data warehouse. Figure 4-3 compares the table structures of the product dimension destination in the AdventureWorksDW database with the multiple related product tables from the AdventureWorks database source.
Figure 4-3. Comparison of table structures of the product dimension destination
The left side of the table layout represents the AdventureWorks transactional system, with nine related tables that focus on the product dimension. These tables involve many-to-many relationships and one-to-many primary to foreign key relationships, organized into a transactional third normal form. The challenge for the ETL process is taking the source data and transforming it to a structure that can be compared with the dimension in order to handle dimension history and changes. The challenge is also ensuring that the ETL is scalable and manageable, simple for people to follow when administering it, and flexible to handle changes.
The three tables on the right side of the table layout represent ...
Get Expert SQL Server™ 2005 Integration Services now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.