
Building the Unstructured Data Warehouse: Architecture, Analysis, and Design by Krish Krishnan, W. H. Inmon


The next simple parameter is the stemming parameter, which allows words to be read and then reduced to their root word. The word being processed remains in the index; stemming adds the root word as a new entry alongside it. For example, suppose that the words “moved”, “mover”, “moving”, and “moves” are found in the base of unstructured data. For a proper analysis, it makes sense to reduce these words to their common base, “move”.
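The behavior described above, keeping the original word in the index while adding its root, can be sketched roughly as follows. This is a minimal illustration only: the suffix rules, `toy_stem`, and `index_with_stems` are hypothetical names invented for this example, and the rules are a toy that handles the “move” family, not a real stemming algorithm.

```python
from collections import defaultdict

# Toy suffix rules (an assumption for illustration, not a real stemmer).
# Each rule is (suffix, replacement); rules are tried in order.
SUFFIX_RULES = [("ing", "e"), ("ed", "e"), ("er", "e"), ("es", "e"), ("s", "")]


def toy_stem(word):
    """Reduce a word to a rough root using the toy suffix rules."""
    lower = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        # Only strip when at least three characters would remain.
        if lower.endswith(suffix) and len(lower) - len(suffix) >= 3:
            return lower[: -len(suffix)] + replacement
    return lower


def index_with_stems(docs):
    """Build a term -> document-id index that holds both words and roots."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)            # the original word stays
            index[toy_stem(word)].add(doc_id)  # the root is added as well
    return index


docs = {1: "moved", 2: "mover moving", 3: "moves"}
index = index_with_stems(docs)
```

With this sketch, a lookup on `index["move"]` finds all three documents, while `index["moved"]` still finds only the document that contains that exact word.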

Care must be taken when specifying stemming for a language other than English: the stemming algorithm for that language must already be installed.
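One way to guard against this is to check for an installed stemmer before processing begins. The sketch below assumes a hypothetical registry, `INSTALLED_STEMMERS`, keyed by language code; the registry, the `get_stemmer` helper, and the placeholder English entry are all illustrative and do not describe any particular product's configuration.

```python
# Hypothetical registry of installed stemmers, keyed by language code.
# The English entry is a placeholder that only lowercases its input.
INSTALLED_STEMMERS = {"en": str.lower}


def get_stemmer(language):
    """Return the stemmer for a language, or fail loudly if none is installed."""
    try:
        return INSTALLED_STEMMERS[language]
    except KeyError:
        raise ValueError(
            f"No stemming algorithm installed for language {language!r}"
        ) from None
```

Failing at configuration time, rather than silently skipping stemming, makes a missing language module visible before any documents are indexed.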

The reduction to a common base is typically done in one of two ways: through a stemming algorithm such as the Porter ...
