Chapter 10. Working with OLAP Data

OLAP, or Online Analytical Processing, holds a special position as subsystem 20 of the 34 ETL subsystems. Although handling OLAP data stores is only one of those subsystems, the topic is important enough that we have devoted this entire chapter to it.

The term OLAP was introduced in 1993 by database legend E. F. (Ted) Codd, who came up with 12 rules for defining OLAP. The rules are nicely summarized at http://www.olap.com/w/index.php/Codd's_Paper.

The most important notion in Codd's definition is the multi-dimensional nature of OLAP data; the two terms, OLAP and multi-dimensional, are now used almost as synonyms. Codd did not invent OLAP, but he did give it a name, and since then a multi-billion-dollar industry has emerged around the concept. The first OLAP products already existed when Codd came up with his rules, with Cognos PowerPlay and Arbor Essbase probably being the most familiar. These products still exist today, PowerPlay as part of the IBM-Cognos BI offering, and Essbase as part of the Oracle BI offering. The biggest player in the OLAP market, however, started its life in Israel under the wing of a small software company called Panorama. In 1996, the company sold its OLAP server technology to Microsoft and the rest is history, as Microsoft Analysis Services is now the OLAP solution with the biggest overall market share and the largest number of production deployments. Microsoft did some other good things for the OLAP market as well: ...

