Artificial Intelligence for Big Data
by Anand Deshpande, Manish Kumar, Albenzo Coletta, Giancarlo Zaccone
Stemming
Different forms of a word often communicate essentially the same meaning. Consider a search engine: whether a user searches for shoe or for shoes, the intent is the same, and the results will still be shoes from different brands. To a model, however, the two forms are distinct tokens, and treating them separately can hurt accuracy. To avoid this, we need to convert the different forms of a word into its raw format. Stemming is this conversion of a word in a text into its raw (root) form. For example, introduction, introduced, and introducing all reduce to the root introduce after stemming; in practice, a stemmer may emit a truncated stem such as introduc rather than a dictionary word. The method works by removing various suffixes, which reduces the number of distinct words the model has to handle and helps it avoid confusion during training. Many ...
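The suffix-removal idea described above can be sketched in a few lines of Python. This is a deliberately naive illustration, not the Porter algorithm or any library's implementation: the names naive_stem, SUFFIXES, and min_stem_len are hypothetical, and the suffix list is an assumption chosen only to cover the words used in this section.

```python
# Toy suffix-stripping stemmer: an illustrative sketch only.
# SUFFIXES is a hypothetical list, checked in order; real stemmers
# (for example Porter) apply many more rules and conditions.
SUFFIXES = ("tion", "ing", "ed", "s")


def naive_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word  # no suffix matched; return the word unchanged


tokens = ["shoe", "shoes", "introduction", "introduced", "introducing"]
stems = [naive_stem(t) for t in tokens]

# Five surface forms collapse into two distinct stems.
print(sorted(set(stems)))  # ['introduc', 'shoe']
```

Note that the stem introduc is not a dictionary word, and that a list this short mishandles many inputs (introduces, for instance, would become introduce rather than introduc). It is meant only to show how stripping suffixes shrinks the vocabulary a model sees.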