3.3. Documenting the Aggregate Schema

Documentation of the data warehouse schema is an important product of the schema design process. Disciplined documentation of schema design prevents the team from omitting important steps in the design process, such as the setting of fact table grain. Schema documentation will drive the development of the schema itself and the ETL processes that load it. Once in production, the documentation will also become a valuable reference for maintenance of the data warehouse.

The aggregate schema is no different in these respects. Good documentation will enforce critical design steps, drive aggregate development, and serve as a reference for warehouse maintenance. This last point is important, because the aggregate portfolio is more likely to change over time than the base schema.

This section defines a set of documentation requirements for the aggregate schema that reflect the principles that have been discussed. Examples are provided to illustrate the requirements. You may choose to document your aggregates differently, but be sure each of these requirements is addressed.

3.3.1. Identify Schema Families

Organize your aggregate documentation around the base fact tables. A base fact table and its aggregates form a schema family. Each schema family is a set of stars that convey the same information at various levels of summarization. A pre-joined aggregate is also part of the base fact table's schema family.

This organizational approach will come naturally ...

Get Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.