Chapter 8. External Data and the Data Warehouse

Most organizations build their first data warehouse on data drawn from existing systems (that is, on data internal to the corporation). In almost every case, this data can be termed internal, structured data: it originates within the corporation and has already been shaped into a regularly occurring format.

A whole host of other data that is not generated by the corporation's own systems is also of legitimate use to the corporation. This class of data is called external data, and it usually enters the corporation in an unpredictable format. Figure 8.1 shows external data entering the data warehouse.

The data warehouse is the ideal place to store external data. If external data is not stored in a centrally located place, several problems are sure to arise. Figure 8.2 shows that when this type of data enters the corporation in an undisciplined fashion, the identity of the source of the data is lost, and the use of the data is not coordinated in any orderly way.

Figure 8.1. External data belongs in the data warehouse.

Figure 8.2. Problems with external data.

Typically, when external data is not entered into the data warehouse, it comes into the corporation by means of the personal computer (the PC). There is nothing wrong ...
