1.2. Invisible Aggregates

Aggregate tables improve data warehouse performance by reducing the number of rows the RDBMS must access when responding to a query. At the simplest level, this is accomplished by partially summarizing the data in a base fact table and storing the result in a new fact table. Some new terminology will be necessary to differentiate aggregate tables from those in the original schema.

If the design of an aggregate schema is carefully managed, a query can be rewritten to leverage it through simple substitution of aggregate table names for base table names. Rather than expecting users or application developers to perform this substitution, an aggregate navigator is deployed. This component of the data warehouse architecture intercepts all queries and rewrites them to leverage aggregates, allowing users and applications to issue SQL written for the original schema.

For all of this to come off smoothly, it is important that the aggregates be designed and built according to some basic principles. These principles will shape the best practices detailed throughout this book.

1.2.1. Improving Performance

As you have seen, the best practices of dimensional design dictate that fact table grain is set at the lowest possible level of detail. This ensures that it will be possible to present the facts in any dimensional context desired. But most queries do not call for presentation of these individual atomic measurements. Instead, some group of these measurements will ...

Get Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.