Building the Unstructured Data Warehouse: Architecture, Analysis, and Design by Krish Krishnan, W. H. Inmon

Closely related to filtering data is the practice of removing extraneous data before it reaches the unstructured data warehouse. A typical example is the removal of stop words from the data that is passed through Textual ETL. Stop words simply get in the way and add nothing to the analytical capabilities of the analyst/designer. Removing extraneous data is therefore yet another way to manage the volume of data in the unstructured data warehouse environment. Figure 9.5 shows that extraneous data should be prevented from entering the unstructured data warehouse.

Figure 9.5 Removing extraneous data
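
The stop word removal described above can be illustrated with a minimal Python sketch. It is not the Textual ETL implementation, only an assumption of how such a step might look: the name `remove_stop_words`, the tokenizing pattern, and the very short `STOP_WORDS` list are all hypothetical and chosen for illustration.

```python
import re

# Hypothetical, deliberately tiny stop word list (an assumption for this sketch);
# a production list would be much longer and tuned to the domain.
STOP_WORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "is", "in", "that", "from"}
)


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [token for token in tokens if token not in stop_words]


sample = "The removal of stop words from the data is an example"
print(remove_stop_words(sample))
# prints ['removal', 'stop', 'words', 'data', 'example']
```

In this sample, 6 of the 11 tokens are discarded before they would ever be stored, which is the volume-management effect the text describes.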