
Practical Data Analysis by Hector Cuesta


Getting started with Natural Language Toolkit (NLTK)

NLTK is a Python library for computational linguistics and text classification. It includes about 50 corpora and lexical resources, such as WordNet, and is one of the most widely used tools for natural language processing in Python. It provides algorithms for text tokenization, parsing, semantic reasoning, and text classification. A complete guide to NLTK is available at http://nltk.org/.
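As a quick illustration of the tokenization support mentioned above, the following minimal sketch splits a sentence into word and punctuation tokens. It assumes NLTK is already installed and uses wordpunct_tokenize, a regular-expression tokenizer chosen here because it needs no extra corpus downloads; the sample sentence is made up for the example.

```python
from nltk.tokenize import wordpunct_tokenize

# Sample sentence (made up for this example).
text = "NLTK includes tools for tokenization, parsing, and classification."

# wordpunct_tokenize splits the text into runs of word characters
# and runs of punctuation, so commas and periods become their own tokens.
tokens = wordpunct_tokenize(text)
print(tokens)
```

Other tokenizers in NLTK, such as word_tokenize, may require downloading additional data before first use.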

To install NLTK on Windows, we just need to download the executable installer from the website; on Linux distributions, we can use easy_install.

Tip

We may need to install PyYAML in order to use NLTK. We can download PyYAML from http://pyyaml.org/wiki/PyYAML.

NLTK defines four basic classifiers:

  • Naive Bayes
  • Maximum entropy ...
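To give a feel for the first classifier in the list, here is a minimal sketch that trains NLTK's Naive Bayes classifier on a tiny hand-made dataset. The training sentences, the labels "pos" and "neg", and the helper word_features are hypothetical and exist only for this illustration; a real application would use a proper corpus and a more careful feature extractor.

```python
import nltk


def word_features(text):
    """Hypothetical helper: map a text to a bag-of-words feature dict."""
    return {word: True for word in text.lower().split()}


# Tiny hand-made training set (illustrative only, not real data).
train_data = [
    ("great movie loved it", "pos"),
    ("wonderful acting great story", "pos"),
    ("loved the wonderful cast", "pos"),
    ("terrible movie hated it", "neg"),
    ("boring story awful acting", "neg"),
    ("hated the boring plot", "neg"),
]

# NLTK classifiers are trained on (feature_dict, label) pairs.
train_set = [(word_features(text), label) for text, label in train_data]
classifier = nltk.NaiveBayesClassifier.train(train_set)

# Classify an unseen sentence using the same feature extractor.
prediction = classifier.classify(word_features("great wonderful story"))
print(prediction)
```

The classifier only ever sees feature dictionaries, so the same extractor has to be applied to both the training texts and any text we want to classify.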
