
Building the Unstructured Data Warehouse: Architecture, Analysis, and Design by Krish Krishnan, W. H. Inmon


One of the most useful aspects of textual ETL is its ability to recognize extensions of a concept. For example, suppose an organization wants to find every place in its text to which Sarbanes Oxley is relevant. Note that this is quite different from finding every explicit reference to Sarbanes Oxley. The raw text is examined, and wherever a passage is relevant to Sarbanes Oxley, the term “Sarbanes Oxley” is concatenated to it, whether or not the passage mentions the act by name. A query for “Sarbanes Oxley” then returns all of the relevant text. An example of a section of text where “Sarbanes Oxley” would be concatenated is: “We have decided to delay shipment for now. We hope this will not affect revenue recognition. We can account ...
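The idea of concatenating a concept term to relevant text can be illustrated with a minimal Python sketch. It is not the book's implementation: the trigger phrases in `RELATED_PHRASES`, the bracketed tag format, and the function names `tag_extensions` and `query` are all assumptions made for illustration, and the excerpt does not say how relevance is actually determined.

```python
import re

# Hypothetical mapping from a concept to phrases that signal its relevance.
# These phrases are illustrative assumptions, not taken from the source.
RELATED_PHRASES = {
    "Sarbanes Oxley": ["revenue recognition", "contingency sale"],
}


def tag_extensions(text, related=RELATED_PHRASES):
    """Append each concept whose trigger phrases occur in the text."""
    tags = []
    for concept, phrases in related.items():
        found = any(
            re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE)
            for phrase in phrases
        )
        if found:
            tags.append(concept)
    if not tags:
        return text  # nothing relevant: leave the passage unchanged
    return text + " " + " ".join(f"[{tag}]" for tag in tags)


def query(passages, concept):
    """Return the passages that carry the concatenated concept tag."""
    marker = f"[{concept}]"
    return [p for p in passages if marker in p]


passages = [
    "We have decided to delay shipment for now. "
    "We hope this will not affect revenue recognition.",
    "The cafeteria will be closed on Friday.",
]

tagged = [tag_extensions(p) for p in passages]
hits = query(tagged, "Sarbanes Oxley")
```

In this sketch the first passage never names Sarbanes Oxley, yet it is returned by `query` because the tag was appended during processing; the second passage is left untouched.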
