January 2017
Beginner to intermediate
446 pages
8h 46m
English
Lemmatization is another way of reducing words to their base forms. In the previous section, we saw that the base forms produced by the stemmers didn't always make sense. For example, all three stemmers returned calv as the base form of calves, which is not a real word. Lemmatization takes a more structured approach to solve this problem.
The lemmatization process uses a vocabulary and morphological analysis of words. It obtains the base form by removing inflectional endings such as ing or ed and checking the result against the vocabulary. The base form of a word is known as its lemma. If you lemmatize the word calves, you should get calf as the output. One thing to note is that the output depends on whether ...
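To make the idea of "vocabulary plus morphological analysis" concrete, here is a minimal toy sketch in plain Python. It is not the lemmatizer used in this book or in any library: the names VOCABULARY, SUFFIX_RULES, and lemmatize are hypothetical, and the word list and rules are made up for illustration only. It strips an inflectional ending, applies a replacement, and accepts the result only if it is a known word.

```python
# Toy lemmatizer: an illustrative sketch, not a real library implementation.
# VOCABULARY and SUFFIX_RULES are made-up data chosen for this example.

# Known base forms (a real lemmatizer would use a full dictionary).
VOCABULARY = {"calf", "walk", "play", "bake", "cat"}

# (suffix to strip, replacement) pairs, tried in order.
SUFFIX_RULES = [
    ("ing", ""),   # walking -> walk
    ("ing", "e"),  # baking  -> bake
    ("ed", ""),    # played  -> play
    ("ed", "e"),   # baked   -> bake
    ("ves", "f"),  # calves  -> calf
    ("s", ""),     # cats    -> cat
]


def lemmatize(word):
    """Return the lemma of word, or the lowercased word if no rule applies."""
    word = word.lower()

    # A word that is already a base form is its own lemma.
    if word in VOCABULARY:
        return word

    # Try each rule and accept a candidate only if it is a real word.
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: -len(suffix)] + replacement
            if candidate in VOCABULARY:
                return candidate

    # No rule produced a known word, so leave the input unchanged.
    return word


if __name__ == "__main__":
    for w in ["calves", "walking", "baked", "cats"]:
        print(w, "->", lemmatize(w))
```

Because every candidate is checked against the vocabulary, this sketch returns calf for calves rather than a non-word such as calv, which is the key difference from plain suffix stripping.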