11.4. THE SNOWFLAKE SCHEMA

"Snowflaking" is a method of normalizing the dimension tables in a STAR schema. When you completely normalize all the dimension tables, the resultant structure resembles a snowflake with the fact table in the middle. First, let us begin with Figure 11-7, which shows a simple STAR schema for sales in a manufacturing company.

The sales fact table contains quantity, price, and other relevant metrics. Sales rep, customer, product, and time are the dimension tables. This is a classic STAR schema, denormalized for optimal query access involving all or most of the dimensions. The model is not in the third normal form.

11.4.1. Options to Normalize

Assume that there are 500,000 product dimension rows. These products fall under 500 product brands and these product brands fall under 10 product categories. Now suppose one of your users runs a query constraining just on product category. If the product dimension table is not indexed on product category, the query will have to search through 500,000 rows. On the other hand, even if the product dimension is partially normalized by separating out product brand and product category into separate tables, the initial search for the query will have to go through only 10 rows in the product category table. Figure 11-8 illustrates this reduction in the search process.

In Figure 11-8, we have not completely normalized the product dimension. We can also move other attributes out of the product dimension table and form normalized ...

Get DATA WAREHOUSING FUNDAMENTALS: A Comprehensive Guide for IT Professionals now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.