book

Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance

Name: Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance
Author: Christopher Adamson
ISBN: 9780471777090

by Christopher Adamson

July 2006

Intermediate to advanced

378 pages

9h 38m

English

Wiley

Read now

Unlock full access

Content preview from Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance

2.4. Summary

A potential aggregate is expressed in terms of grain and dimensional conformance. This works equally well for aggregate fact tables as for pre-joined aggregates.

In this chapter, you have seen that the number of potential aggregates can be large enough to preclude their evaluation on a case-by-case basis. Business requirements can be gathered and linked to potential aggregates, narrowing the pool that must be evaluated.

These requirements can be identified during the schema design process by looking at the dimensional conformance matrix for drill-across aggregates and by looking at actual user reports.

The number of aggregates and space they consume will be dictated by your operating environment and resource considerations. Given these limits, you will look at aggregates that seem to have wide applicability and evaluate the average number of rows each aggregate row summarizes.

This savings is compared to the base table and to other aggregates. Factors such as skew and changes in sparsity require careful testing of the proposed pool of aggregates. The relative importance of various business requirements is used as a tiebreaker in choosing between aggregates to include.

Now that the aggregates have been chosen, you need to formally document their design. This chapter has already touched on some design principles for aggregate schemas, and others have been implied. The next chapter provides a formal set of design principles that should be followed in designing and documenting ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.

Read now

Unlock full access

More than 5,000 organizations count on O’Reilly

O’Reilly covers everything we've got, with content to help us build a world-class technology community, upgrade the capabilities and competencies of our teams, and improve overall team performance as well as their engagement.

Julian F.

Head of Cybersecurity

I wanted to learn C and C++, but it didn't click for me until I picked up an O'Reilly book. When I went on the O’Reilly platform, I was astonished to find all the books there, plus live events and sandboxes so you could play around with the technology.

Addison B.

Field Engineer

I’ve been on the O’Reilly platform for more than eight years. I use a couple of learning platforms, but I'm on O'Reilly more than anybody else. When you're there, you start learning. I'm never disappointed.

Amir M.

Data Platform Tech Lead

I'm always learning. So when I got on to O'Reilly, I was like a kid in a candy store. There are playlists. There are answers. There's on-demand training. It's worth its weight in gold, in terms of what it allows me to do.

Mark W.

Embedded Software Engineer

Usage-Driven Database Design: From Logical Data Modeling through Physical Schema Definition

Publisher Resources

ISBN: 9780471777090Purchase book

Cloud Computing

Data Engineering

Data Science

AI & ML

Programming Languages

Software Architecture

IT/Ops

Security

Design

Business

Soft Skills