Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance
by Christopher Adamson
8.4. Summary
Advanced dimensional design techniques affect the design of dimensional aggregates in a variety of ways:
The semi-additive facts of periodic snapshots must not be summed across time periods. Invisible aggregates can summarize these facts across other dimensions. Derived schemas may average the semi-additive fact across periods, but cannot be further summarized.
An accumulating snapshot is generally a poor candidate for aggregation; omission of numerous milestones may be required to produce any savings in size, and the resulting aggregate can include lag times only as averages.
A factless fact table that models a transaction or event can be aggregated, producing a transaction count. Symmetry with the base schema can be restored by including this fact in the base schema with a constant value of 1.
Coverage tables are usually poor candidates for aggregation because summarization destroys the business definition of the relationship the table represents.
Aggregates of transaction dimensions usually omit timestamps. Their presence in an aggregate is likely to limit the number of rows summarized, and updates to the expiration timestamp make incremental loads complex.
Bridge tables are often the source of derived schemas that eliminate repeating values; these schemas in turn are summarized by invisible aggregates. If the bridge does not include an allocation factor, it may be possible to summarize a bridge table itself, or limit aggregation to other dimensions in the schema.
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access