6.3. Fact Processing
We've been describing how to build packages to perform dimension processing. Now it's time to turn our attention to fact processing. Extracting fact data, consolidating it, and loading it into the data warehouse is generally less complex than managing dimension changes. Most fact tables are at the grain of a transaction. During normal processing of a transaction grain fact table, rows are inserted but not updated. Later in this chapter, we discuss exceptions to this general rule.
There are three main things to worry about when processing facts:
Cleaning data and synthesizing data from multiple sources: This task is dependent on your environment. The greatest challenge is the design step: What data needs to be cleaned and synthesized, and how will you go about doing it? Integration Services has many tasks and transforms to use in building your custom logic. For example, the fact table in the MDWT_AdventureWorksDW dimensional model keeps orders data in both U.S. dollars and the original local currency. The source system holds only the U.S. dollars, so the fact table ETL process must pull conversion rates and calculate the (local currency) amount for each row.
Performing the surrogate key lookups: The greatest burden of cleaning and conforming is in the dimension processing. By the time you process the facts, you have well-formed dimension tables or staging tables against which to perform the lookups that swap out the natural keys from the extracts for the surrogate ...
Get The Microsoft® Data Warehouse Toolkit: With SQL Server™ 2005 and the Microsoft® Business Intelligence Toolset now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.