Textual integration for unstructured data is applied iteratively, just as the structured ETL process is (see Figure 3.8). First, one integration pass is made over the text. The results are analyzed, the processing parameters are refined, and the integration is repeated on the same text. The refinements are based entirely on the analysis that can (or cannot) be done on the data that has been created. If an analysis cannot be done, or is done incorrectly, the parameters that shape the textual data are adjusted so that the analysis can be done, and done correctly. This iterative process continues until the analytical ...
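The integrate, analyze, refine loop described above can be sketched in a few lines of Python. This is only a toy illustration, not the book's method: the `Params` class, the functions `integrate`, `analyze`, `refine`, and `integrate_iteratively`, the stop-word and synonym parameters, and the `max_passes` limit are all hypothetical names and choices made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Params:
    # Hypothetical processing parameters: words to drop and synonyms to normalize.
    stop_words: set = field(default_factory=lambda: {"the", "a", "of"})
    synonyms: dict = field(default_factory=dict)


def integrate(text, params):
    """One toy integration pass: tokenize, drop stop words, normalize synonyms."""
    words = [w.strip(".,;").lower() for w in text.split()]
    return [params.synonyms.get(w, w) for w in words
            if w and w not in params.stop_words]


def analyze(terms, required):
    """Return the required terms that the analysis could not find."""
    return sorted(set(required) - set(terms))


def refine(params, missing, known_variants):
    """Adjust the parameters so missing terms can be found on the next pass.

    Returns True if any parameter actually changed.
    """
    changed = False
    for term in missing:
        for variant in known_variants.get(term, []):
            if params.synonyms.get(variant) != term:
                params.synonyms[variant] = term
                changed = True
    return changed


def integrate_iteratively(text, params, required, known_variants, max_passes=5):
    """Repeat integration on the same text until the analysis succeeds."""
    for pass_number in range(1, max_passes + 1):
        terms = integrate(text, params)
        missing = analyze(terms, required)
        if not missing:
            break  # the analysis can now be done
        if not refine(params, missing, known_variants):
            break  # nothing left to adjust; stop rather than loop forever
    return terms, pass_number
```

For instance, calling `integrate_iteratively("The shinbone of the patient was set.", Params(), ["tibia"], {"tibia": ["shinbone"]})` fails its first analysis because the term "tibia" is absent, adds the synonym mapping, and succeeds on the second pass over the same text.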
