Using the techniques discussed in the previous section, you design the data extraction function. The extracted data, however, is raw data and cannot be loaded into the data warehouse as is. First, all the extracted data must be made usable in the data warehouse. Having information that is usable for strategic decision making is the underlying principle of the data warehouse, and you know that the data in the operational systems is not usable for this purpose. Next, because operational data is extracted from many old legacy systems, the quality of the data in those systems is unlikely to be good enough for the data warehouse. You have to enrich the data and improve its quality before it can be used in the data warehouse.

Before moving the extracted data from the source systems into the data warehouse, you inevitably have to perform various kinds of data transformations. Because the data comes from many dissimilar source systems, you have to transform it according to common standards. You also have to ensure that, after all the data is put together, the combined data does not violate any business rules.
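To make the idea concrete, here is a minimal sketch in Python of standardizing records from two dissimilar sources and then checking the combined data against business rules. Everything in it is an assumption made for illustration: the two source layouts, the field names, the gender code table, and the two business rules are hypothetical and do not come from any particular system.

```python
from datetime import date, datetime
from decimal import Decimal

# Hypothetical raw rows from two dissimilar source systems.
LEGACY_ROWS = [
    {"CUST_NO": "0042", "SEX": "M", "ORD_DT": "03/15/2001", "AMT": "125.50"},
    {"CUST_NO": "0043", "SEX": "f", "ORD_DT": "03/17/2001", "AMT": "-5.00"},
]
WEB_ROWS = [
    {"customer_id": 42, "gender": "male", "order_date": "2001-03-16", "amount": 80.0},
]

# Assumed warehouse standard: one-letter gender codes, "U" for unknown.
GENDER_CODES = {"M": "M", "MALE": "M", "F": "F", "FEMALE": "F"}


def standard_gender(value):
    """Map any source spelling to the assumed warehouse code."""
    return GENDER_CODES.get(str(value).strip().upper(), "U")


def standardize_legacy(row):
    """Convert a legacy-format row to the common warehouse layout."""
    return {
        "customer_id": int(row["CUST_NO"]),
        "gender": standard_gender(row["SEX"]),
        "order_date": datetime.strptime(row["ORD_DT"], "%m/%d/%Y").date(),
        "amount": Decimal(row["AMT"]),
    }


def standardize_web(row):
    """Convert a web-format row to the common warehouse layout."""
    return {
        "customer_id": int(row["customer_id"]),
        "gender": standard_gender(row["gender"]),
        "order_date": date.fromisoformat(row["order_date"]),
        "amount": Decimal(str(row["amount"])),
    }


def business_rule_violations(record):
    """Return a list of violated rules (both rules are illustrative)."""
    problems = []
    if record["amount"] <= 0:
        problems.append("amount must be positive")
    if record["order_date"] > date.today():
        problems.append("order date cannot be in the future")
    return problems


def transform(legacy_rows, web_rows):
    """Standardize both sources, then split into clean and rejected records."""
    combined = [standardize_legacy(r) for r in legacy_rows]
    combined += [standardize_web(r) for r in web_rows]
    clean, rejected = [], []
    for record in combined:
        problems = business_rule_violations(record)
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(record)
    return clean, rejected


clean, rejected = transform(LEGACY_ROWS, WEB_ROWS)
```

In this sketch, each source gets its own standardizing function so that format differences are resolved before the records are combined, and the business rules run only on the combined, standardized data. Rejected records are kept together with the reasons for rejection rather than silently dropped, so that they can be reviewed or corrected.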

Consider the data structures and data elements that you need in your data warehouse. Now think about all the relevant data to be extracted from the source systems. Given the variety of source data formats and data values, and the condition of the data quality, you know that you have to perform several types of transformations to make the source data suitable ...
