Chapter 4. Part-of-speech Tagging

In this chapter, we will cover the following recipes:

  • Default tagging
  • Training a unigram part-of-speech tagger
  • Combining taggers with backoff tagging
  • Training and combining ngram taggers
  • Creating a model of likely word tags
  • Tagging with regular expressions
  • Affix tagging
  • Training a Brill tagger
  • Training the TnT tagger
  • Using WordNet for tagging
  • Tagging proper names
  • Classifier-based tagging
  • Training a tagger with NLTK-Trainer


Part-of-speech tagging converts a sentence, represented as a list of words, into a list of tuples of the form (word, tag). The tag is a part-of-speech tag that indicates whether the word is a noun, an adjective, a verb, and so on.
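To make the input and output shapes concrete, here is a minimal sketch in plain Python. It is not NLTK's API: the TOY_LEXICON dictionary and the toy_tag function are hypothetical names invented for illustration, and the tags are assumed to follow Penn Treebank conventions (DT for determiner, JJ for adjective, NN for noun, VBZ for third-person singular verb). The recipes in this chapter use real taggers in place of this lookup.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon for illustration only; tags assume
# Penn Treebank conventions.
TOY_LEXICON: Dict[str, str] = {
    "the": "DT",
    "quick": "JJ",
    "fox": "NN",
    "jumps": "VBZ",
}


def toy_tag(words: List[str], default: str = "NN") -> List[Tuple[str, str]]:
    """Pair each word with a tag from the toy lexicon.

    Words missing from the lexicon receive the `default` tag.
    """
    return [(word, TOY_LEXICON.get(word.lower(), default)) for word in words]


# Input: a sentence as a list of words.
sentence = ["The", "quick", "fox", "jumps"]

# Output: a list of (word, tag) tuples, one per input word.
tagged = toy_tag(sentence)
print(tagged)
# [('The', 'DT'), ('quick', 'JJ'), ('fox', 'NN'), ('jumps', 'VBZ')]
```

The result has the same length and order as the input, so each tuple lines up with the word it describes.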

Part-of-speech tagging is a ...
