Hands-On Natural Language Processing with Python by Rajalingappaa Shanmugamani, Rajesh Arumugam

Summary

In this chapter, we covered common NLP tasks, such as preprocessing and exploratory analysis of text, using the NLTK library. Real-world text is unstructured and needs extensive preprocessing, such as tokenization, stemming, and stop word removal, before it is suitable for machine learning (ML). As you saw in the examples, NLTK provides an extensive API for these preprocessing steps. It ships with built-in packages and modules, and it is flexible enough to support custom components, such as user-defined stemmers and tokenizers.
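As a reminder of what the three preprocessing steps do, here is a minimal, library-free sketch of a user-defined tokenizer, stop word filter, and stemmer. It is not NLTK's implementation: the `STOP_WORDS` set, the `SUFFIXES` rules, and all function names are illustrative assumptions, and they are far cruder than NLTK's stop word corpus and its Porter stemmer.

```python
import re

# Illustrative stop word list (an assumption; NLTK's stopwords corpus is much larger).
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "were", "of", "and", "to", "in"}

# Suffixes stripped by the toy stemmer, checked in this order
# (hypothetical rules, not the Porter algorithm).
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [stem(t) for t in remove_stop_words(tokenize(text))]


if __name__ == "__main__":
    print(preprocess("The taggers are labeling the words in a sentence."))
    # ['tagger', 'label', 'word', 'sentence']
```

Keeping each step as a separate function mirrors the way the steps were described above, so any one of them can be swapped for a more capable implementation without touching the others.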

We also discussed using NLTK for part-of-speech (POS) tagging, another common NLP task, which supports problems such as word sense disambiguation and question answering. Applications such as sentiment classification are ...
