4.1. Which Tables to Use?

Without aggregates, all queries are directed to the base schema. With aggregates, there are potentially several database tables capable of answering a query. Each fact table may have several aggregates derived from it. Some will be capable of providing an answer to a given query; some will not. Of those that can provide an answer, some will be faster than others.

A fundamental question must now be answered for every query issued: Which tables will be used? This new requirement is driven by the presence of aggregates.

Consider the Orders schema at the top of Figure 4.1, for which a set of dimensional aggregates has been designed. You are interested in looking at order dollars by month for the year 2006. Without the aggregates, this and any question about orders would be directed to the order_facts star. Now, there are several options. Some will be faster than others. There are also some tables that cannot respond to the query.

Deciding which table will best answer the query requires considering a number of factors: the schema design itself, the relative size of each option, the availability of each option, and its location. As you will see later, these choices are best left to an aggregate navigator. But even if this task is to be automated, it is important to understand what it needs to do.

4.1.1. The Schema Design

The identification of tables capable of providing query results should be simple, even if the process is performed manually. The dimensionality ...

Get Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.