6.2. Dimension Processing
Dimension tables provide the structure and context for the fact tables and measurements in the DW/BI system. Dimension tables are much smaller than fact tables, which is good because their processing is complex.
Dimensions are updated over time, either by updating in place (Type 1), or by adding new rows to track history (Type 2). Sometimes you're lucky enough to be handed separate streams of new rows and changed rows. Often you need to figure out what changes have occurred, and handle them appropriately.
This section presents examples related to the AdventureWorks databases. You'll be able to follow along simply by reading the text and looking at the figures. But if you want to experiment with the packages from this chapter, you need to go to the book's web site. We're populating a modified version of the AdventureWorksDW database that ships with SQL Server 2005: We call our versions MDWT_AdventureWorksDW. The web site (www.MsftDWToolkit.com) has copies of the packages that populate MDWT_AdventureWorksDW.
6.2.1. Dimension Processing Basics
NOTE
KIMBALL METHOD ETL SUBSYSTEM
This section describes techniques relevant to Kimball Method ETL Subsystems:
#1 Extract System: Source data adapters, push/pull/dribble job schedulers, filtering and sorting at the source, proprietary data format conversions, and data staging after transfer to ETL environment.
#9 Surrogate Key Creation System: Robust mechanism for producing stream of surrogate keys independently for ...
Get The Microsoft® Data Warehouse Toolkit: With SQL Server™ 2005 and the Microsoft® Business Intelligence Toolset now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.