Building the Unstructured Data Warehouse: Architecture, Analysis, and Design by Krish Krishnan, W. H. Inmon

After the sources of input are surveyed, the next step in building the unstructured data warehouse is the selection and preparation of taxonomies. As discussed earlier, taxonomies are needed only when terminology resolution is an issue or when categories of text must be assigned, so there will be occasions when none are required. When they are needed, which is the usual case, they must be identified at this stage.

When taxonomies are needed, the portion of each taxonomy that is actually required must be identified as well. It is normal for a taxonomy to contain far more terms than are necessary. Selecting more terms than necessary incurs a performance penalty. That penalty can be mitigated by weeding ...
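
To make the idea of limiting a taxonomy to the terms that are needed concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' method: the taxonomy is assumed to be a plain mapping from category name to a set of single-word terms, "needed" is assumed to mean "appears at least once in the surveyed source text", and the function name `prune_taxonomy` and the sample data are hypothetical.

```python
import re


def prune_taxonomy(taxonomy, documents):
    """Keep only the taxonomy terms that occur in at least one document.

    Assumption: terms are single words and matching is case-insensitive.
    """
    # Collect every lowercase word seen anywhere in the source text.
    seen = set()
    for text in documents:
        seen.update(re.findall(r"[a-z0-9']+", text.lower()))

    pruned = {}
    for category, terms in taxonomy.items():
        kept = {term for term in terms if term.lower() in seen}
        # Drop categories that end up with no terms at all.
        if kept:
            pruned[category] = kept
    return pruned


# Hypothetical sample data, for illustration only.
taxonomy = {
    "vehicle": {"car", "truck", "scooter", "tractor"},
    "fruit": {"apple", "pear"},
}
documents = ["The truck hit a parked car.", "Car repair invoice"]

pruned = prune_taxonomy(taxonomy, documents)
print(pruned)
```

With this sample data, only the "vehicle" terms that occur in the documents survive, and the "fruit" category is dropped entirely, so later processing has fewer terms to match against. A real taxonomy would likely also need multi-word terms and synonym handling, which this sketch leaves out.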
