August 2014
Beginner to intermediate
304 pages
7h 10m
English
Using the included names corpus, we can create a simple tagger for tagging names as proper nouns.
The NamesTagger class is a subclass of SequentialBackoffTagger, as it's probably only useful near the end of a backoff chain. At initialization, we create a set of all names in the names corpus, lowercasing each name to make lookup easier. Then, we implement the choose_tag() method, which simply checks whether the lowercased current word is in names_set. If it is, we return the NNP tag (the tag for proper nouns). If it isn't, we return None, so the next tagger in the chain can tag the word. The following code can be found in taggers.py:
from nltk.tag import SequentialBackoffTagger
from nltk.corpus import names
...
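Since the listing above is cut off, the lookup-and-backoff logic it describes can be illustrated with a dependency-free sketch. This is not the code from taggers.py: the BackoffTagger and FallbackTagger classes are hypothetical, minimal stand-ins for NLTK's SequentialBackoffTagger and a default tagger, and SAMPLE_NAMES is a made-up list standing in for the names corpus.

```python
class BackoffTagger:
    """Hypothetical, minimal stand-in for NLTK's SequentialBackoffTagger."""

    def __init__(self, backoff=None):
        self.backoff = backoff

    def choose_tag(self, tokens, index, history):
        raise NotImplementedError

    def tag_one(self, tokens, index, history):
        # Try this tagger first, then each backoff tagger in turn,
        # stopping at the first one that returns a tag other than None.
        tagger = self
        while tagger is not None:
            tag = tagger.choose_tag(tokens, index, history)
            if tag is not None:
                return tag
            tagger = tagger.backoff
        return None

    def tag(self, tokens):
        history = []
        for index in range(len(tokens)):
            history.append(self.tag_one(tokens, index, history))
        return list(zip(tokens, history))


class FallbackTagger(BackoffTagger):
    """Hypothetical last link in the chain: assigns one fixed tag to every word."""

    def __init__(self, tag):
        super().__init__()
        self._tag = tag

    def choose_tag(self, tokens, index, history):
        return self._tag


class NamesTagger(BackoffTagger):
    """Tags a word as NNP when its lowercased form is a known name."""

    def __init__(self, name_list, backoff=None):
        super().__init__(backoff)
        # Lowercase once at initialization so each lookup is a cheap set test.
        self.names_set = {name.lower() for name in name_list}

    def choose_tag(self, tokens, index, history):
        if tokens[index].lower() in self.names_set:
            return "NNP"
        # None hands the word to the next tagger in the chain.
        return None


# Assumption: a tiny sample list instead of the full names corpus.
SAMPLE_NAMES = ["Jacob", "Alice"]

tagger = NamesTagger(SAMPLE_NAMES, backoff=FallbackTagger("NN"))
tagged = tagger.tag(["Jacob", "reads", "books"])
print(tagged)
```

With no backoff tagger configured, words that are not names come back tagged None, which is why a tagger like this fits best inside a chain rather than on its own.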