Default tagging

Default tagging provides a baseline for part-of-speech tagging: it assigns the same part-of-speech tag to every token. We do this using the DefaultTagger class. The tagger is useful as a last resort when no other tagger can choose a tag, and its accuracy gives a baseline against which to measure improvements.

Getting ready

We're going to use the treebank corpus for most of this chapter because it's a common standard and is quick to load and test. Everything we do should apply equally well to brown, conll2000, and any other part-of-speech-tagged corpus.

How to do it...

The DefaultTagger class takes a single argument, the tag you want to apply. We'll give it NN, which is the tag for a singular noun. DefaultTagger is most useful when you choose the most common part-of-speech ...
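
As a minimal sketch of the idea, the snippet below builds a DefaultTagger with the NN tag, applies it to a short token list, and then measures how often that single tag is right. The two hand-tagged sentences in gold_sents and the baseline_accuracy helper are illustrative assumptions, not part of NLTK or of this recipe; in practice you would measure against tagged sentences from a corpus such as treebank.

```python
from nltk.tag import DefaultTagger

# Every token receives the same tag, whatever the word is.
tagger = DefaultTagger('NN')
tagged = tagger.tag(['Hello', 'World'])
print(tagged)  # [('Hello', 'NN'), ('World', 'NN')]

# Hypothetical gold-standard data standing in for a real tagged corpus.
gold_sents = [
    [('The', 'DT'), ('cat', 'NN'), ('sleeps', 'VBZ')],
    [('Dogs', 'NNS'), ('chase', 'VBP'), ('the', 'DT'), ('ball', 'NN')],
]


def baseline_accuracy(tagger, gold_sents):
    """Fraction of tokens whose predicted tag matches the gold tag.

    Hypothetical helper written for this example only.
    """
    correct = 0
    total = 0
    for sent in gold_sents:
        words = [word for word, _ in sent]
        predicted = tagger.tag(words)
        for (_, guess), (_, gold) in zip(predicted, sent):
            correct += guess == gold
            total += 1
    return correct / total


# Only the two NN tokens out of seven are tagged correctly.
accuracy = baseline_accuracy(tagger, gold_sents)
print(accuracy)
```

On this toy data the score is low because only two of the seven tokens are nouns tagged NN. Any tagger that cannot beat the score of the most common tag on your corpus is not adding value.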
