4.1. Which Tables to Use?

Without aggregates, all queries are directed to the base schema. With aggregates, there are potentially several database tables capable of answering a query. Each fact table may have several aggregates derived from it. Some will be capable of providing an answer to a given query; some will not. Of those that can provide an answer, some will be faster than others.

A fundamental question must now be answered for every query issued: Which tables will be used? This new requirement is driven by the presence of aggregates.

Consider the Orders schema at the top of Figure 4.1, for which a set of dimensional aggregates has been designed. You are interested in looking at order dollars by month for the year 2006. Without the aggregates, this and any question about orders would be directed to the order_facts star. Now, there are several options. Some will be faster than others. There are also some tables that cannot respond to the query.

Deciding which table will best answer the query requires considering a number of factors: the schema design itself, the relative size of each option, the availability of each option, and its location. As you will see later, these choices are best left to an aggregate navigator. But even if this task is to be automated, it is important to understand what it needs to do.

4.1.1. The Schema Design

The identification of tables capable of providing query results should be simple, even if the process is performed manually. The dimensionality ...

Get Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.