Artificial Intelligence for Big Data
by Anand Deshpande, Manish Kumar, Albenzo Coletta, Giancarlo Zaccone
Porter stemming
Porter stemming is a stemming algorithm that strips suffixes from English words to reduce them to a common stem. Its purpose is to make the NLP model training exercise more efficient: because inflected variants of a word collapse to a single stem, the number of distinct terms drops, and with it the memory footprint and complexity of your term space. Porter is not dictionary-based. It does not consult a stem dictionary to decide which suffixes to remove; it applies a set of generic rules instead. Some people see this as a drawback, because the rules are simple and do not take care of the lower-level contextual nitty-gritty of English ...
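To make the rule-based, dictionary-free idea concrete, here is a minimal sketch that implements only step 1a of Porter's rules (plural handling). It is not the full algorithm, which has several further steps and conditions on the remaining stem. The names `STEP_1A_RULES` and `porter_step_1a` are illustrative and do not come from the book or from any library.

```python
# Illustrative sketch: step 1a of the Porter rules only (plural suffixes).
# The rules are tried in order and the first matching suffix wins.
# No dictionary lookup is involved, only string suffix tests.
STEP_1A_RULES = [
    ("sses", "ss"),  # caresses -> caress
    ("ies", "i"),    # ponies   -> poni
    ("ss", "ss"),    # caress   -> caress (left unchanged)
    ("s", ""),       # cats     -> cat
]


def porter_step_1a(word):
    """Apply the first matching step 1a rule; return the word unchanged if none match."""
    word = word.lower()
    for suffix, replacement in STEP_1A_RULES:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word


words = ["caresses", "caress", "ponies", "cats", "cat"]
stems = [porter_step_1a(w) for w in words]

print(stems)
# The term space shrinks because variants share a stem.
print(len(set(words)), "distinct terms ->", len(set(stems)), "distinct stems")
```

Note that a stem such as `poni` is not an English word. That is expected for a rule-based stemmer: the goal is that variants map to the same token, not that the token appears in a dictionary.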