Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance
by Christopher Adamson
2.4. Summary
A potential aggregate is expressed in terms of grain and dimensional conformance. This works equally well for aggregate fact tables as for pre-joined aggregates.
In this chapter, you have seen that the number of potential aggregates can be large enough to preclude their evaluation on a case-by-case basis. Business requirements can be gathered and linked to potential aggregates, narrowing the pool that must be evaluated.
These requirements can be identified during the schema design process by looking at the dimensional conformance matrix for drill-across aggregates and by looking at actual user reports.
The number of aggregates and space they consume will be dictated by your operating environment and resource considerations. Given these limits, you will look at aggregates that seem to have wide applicability and evaluate the average number of rows each aggregate row summarizes.
This savings is compared to the base table and to other aggregates. Factors such as skew and changes in sparsity require careful testing of the proposed pool of aggregates. The relative importance of various business requirements is used as a tiebreaker in choosing between aggregates to include.
Now that the aggregates have been chosen, you need to formally document their design. This chapter has already touched on some design principles for aggregate schemas, and others have been implied. The next chapter provides a formal set of design principles that should be followed in designing and documenting ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access