Apart from the indexing techniques we have discussed in the previous section, a few other methods also improve performance in a data warehouse. For example, physically compacting the data when writing to storage enables more data to be loaded into a single block. That also means that more data may be retrieved in one read. Another method for improving performance is the merging of tables. Again, this method enables more data to be retrieved in one read. If you purge unwanted and unnecessary data from the warehouse in a regular manner, you can improve the overall performance.

In the remainder of this section, let us review a few other effective performance enhancement techniques. Many techniques are available through the DBMS, and most of these techniques are especially suitable for the data warehouse environment.

18.6.1. Data Partitioning

Typically, the data warehouse holds some very large database tables. The fact tables run into millions of rows. Dimension tables like the product and customer tables may also contain a huge number of rows. When you have tables of such vast sizes, you face certain specific problems. First, loading of large tables takes excessive time. Then, building indexes for large tables also runs into several hours. What about processing of queries against large tables? Queries also run longer when attempting to sort through large volumes of data to obtain the result sets. Backing up and recovery of huge tables takes ...

Get DATA WAREHOUSING FUNDAMENTALS: A Comprehensive Guide for IT Professionals now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.