August 2014
Beginner to intermediate
304 pages
7h 10m
English
To identify LOCATION chunks, we can make a different kind of ChunkParserI subclass that uses the gazetteers corpus to identify location words. The gazetteers corpus is a WordListCorpusReader class that contains the following location words:
The
LocationChunker class, found in chunkers.py, iterates over a tagged sentence looking for words that are found in the gazetteers corpus. When it finds one or more location words, it creates a LOCATION chunk using IOB tags. The helper method iob_locations() is where the IOB LOCATION tags are produced, and the parse() method converts these IOB tags into a Tree:
from nltk.chunk ...
Read now
Unlock full access