
Building the Unstructured Data Warehouse: Architecture, Analysis, and Design by Krish Krishnan, W. H. Inmon


Once the basic libraries to be built are determined, the next step is to identify the likely sources of information for each library. These sources are divided into iterations. The plan is to begin with only a few documents of each type: these representative documents are processed and reprocessed until the analyst is satisfied that textual ETL is producing its output properly. It may take as many as ten iterations over the first few documents before they are processed properly. Once they are, larger iterations of documents are processed. After each iteration of processing, the output is checked to make sure that the output ...
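The iteration plan described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' tooling: `tune_on_sample`, `batch_sizes`, `demo_process`, and `demo_check` are hypothetical names, the textual ETL step is replaced by a stand-in function, the analyst's review is reduced to a callback, and doubling the batch size is an assumed growth rule (the text says only that later iterations are larger).

```python
from typing import Callable, List, Sequence


def tune_on_sample(
    sample: Sequence[str],
    process: Callable[[Sequence[str], int], List[str]],
    is_acceptable: Callable[[List[str]], bool],
    max_passes: int = 10,
) -> int:
    """Reprocess a small sample until the reviewer accepts the output.

    Returns the number of passes used. The default limit of 10 mirrors
    the "as many as ten iterations" estimate in the text.
    """
    for attempt in range(1, max_passes + 1):
        output = process(sample, attempt)
        if is_acceptable(output):
            return attempt
    raise RuntimeError(f"sample not acceptable after {max_passes} passes")


def batch_sizes(total: int, first: int, growth: int = 2) -> List[int]:
    """Split `total` documents into progressively larger batches.

    Multiplying by `growth` each time is an assumption made for this sketch.
    """
    if first < 1 or growth < 1:
        raise ValueError("first and growth must be at least 1")
    sizes: List[int] = []
    remaining, size = total, first
    while remaining > 0:
        take = min(size, remaining)  # the last batch may be smaller
        sizes.append(take)
        remaining -= take
        size *= growth
    return sizes


def demo_process(docs: Sequence[str], attempt: int) -> List[str]:
    # Hypothetical stand-in for a textual ETL run: tags each document
    # with the pass number so the effect of reprocessing is visible.
    return [f"{doc}:pass{attempt}" for doc in docs]


def demo_check(output: List[str]) -> bool:
    # Stand-in for the analyst's review: satisfied on the third pass.
    return all(item.endswith("pass3") for item in output)


passes_used = tune_on_sample(["contract_1", "email_1"], demo_process, demo_check)
sizes = batch_sizes(total=100, first=5)
print(passes_used, sizes)
```

In practice the acceptance check is a human judgment rather than a function, so the callback here only marks the point in the loop where the analyst inspects the output before moving on to a larger iteration.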
