Chapter 4. Part-of-speech Tagging

In this chapter, we will cover the following recipes:

  • Default tagging
  • Training a unigram part-of-speech tagger
  • Combining taggers with backoff tagging
  • Training and combining ngram taggers
  • Creating a model of likely word tags
  • Tagging with regular expressions
  • Affix tagging
  • Training a Brill tagger
  • Training the TnT tagger
  • Using WordNet for tagging
  • Tagging proper names
  • Classifier-based tagging
  • Training a tagger with NLTK-Trainer

Introduction

Part-of-speech tagging is the process of converting a sentence, in the form of a list of words, into a list of tuples, where each tuple is of the form (word, tag). The tag identifies the word's part of speech: noun, adjective, verb, and so on.
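
To make the input and output shapes concrete, here is a minimal sketch in plain Python. It is not NLTK code: the tag_sentence function and the TOY_LEXICON dictionary are hypothetical names invented for illustration, and the tags are Penn Treebank style labels chosen by hand. A real tagger would learn its word-to-tag mapping from a tagged corpus rather than from a hard-coded dictionary.

```python
# Hypothetical toy lexicon mapping lowercase words to Penn Treebank style tags.
# This is an illustrative assumption, not data shipped with any library.
TOY_LEXICON = {"the": "DT", "dog": "NN", "barks": "VBZ", "loud": "JJ"}


def tag_sentence(words, lexicon=TOY_LEXICON, default_tag="NN"):
    """Return a list of (word, tag) tuples, one per input word.

    Words missing from the lexicon receive default_tag.
    """
    return [(word, lexicon.get(word.lower(), default_tag)) for word in words]


# A sentence is represented as a list of words.
tagged = tag_sentence(["The", "loud", "dog", "barks"])
print(tagged)
# [('The', 'DT'), ('loud', 'JJ'), ('dog', 'NN'), ('barks', 'VBZ')]
```

The original casing of each word is preserved in the output; only the lookup is case-insensitive. Falling back to a noun tag for unknown words is one simple design choice, picked here because nouns are a common guess for unseen words.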

Part-of-speech tagging is a ...
