August 2014
NLTK provides the CategorizedPlaintextCorpusReader and CategorizedTaggedCorpusReader classes, but it has no categorized corpus reader for chunked corpora. So in this recipe, we're going to make one.
Refer to the earlier recipe, Creating a chunked phrase corpus, for an explanation of ChunkedCorpusReader, and refer to the previous recipe for details on CategorizedPlaintextCorpusReader and CategorizedTaggedCorpusReader, both of which inherit from CategorizedCorpusReader.
We'll create a class called CategorizedChunkedCorpusReader that inherits from both CategorizedCorpusReader and ChunkedCorpusReader. It is heavily based on the CategorizedTaggedCorpusReader class, and also provides ...
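The inheritance pattern described here can be sketched without NLTK installed. In the sketch below, CategorizedMixin and ChunkedReader are simplified, hypothetical stand-ins for the roles that CategorizedCorpusReader and ChunkedCorpusReader play; they are not NLTK's implementations, and the in-memory corpus dictionary and its file names are made up for illustration. The point is only the shape of the combined class: the category base consumes its own keyword arguments first, the chunked reader is initialized with whatever remains, and a small _resolve helper turns a categories argument into a list of file IDs.

```python
import re


class CategorizedMixin:
    """Hypothetical stand-in for the category-handling base class."""

    def __init__(self, kwargs):
        # Take a plain dict and remove the category argument from it,
        # so the reader's __init__ never sees an unexpected keyword.
        self._cat_pattern = re.compile(kwargs.pop("cat_pattern"))

    def _category_of(self, fileid):
        return self._cat_pattern.match(fileid).group(1)

    def categories(self):
        return sorted({self._category_of(f) for f in self._fileids})

    def fileids(self, categories=None):
        if categories is None:
            return list(self._fileids)
        if isinstance(categories, str):
            categories = [categories]
        return [f for f in self._fileids if self._category_of(f) in categories]


class ChunkedReader:
    """Hypothetical stand-in for a chunked reader: file ID -> chunked sentences."""

    def __init__(self, data):
        self._data = data
        self._fileids = sorted(data)

    def chunked_sents(self, fileids=None):
        if fileids is None:
            fileids = self._fileids
        return [sent for f in fileids for sent in self._data[f]]


class CategorizedChunkedReader(CategorizedMixin, ChunkedReader):
    def __init__(self, *args, **kwargs):
        # Category arguments are stripped from kwargs before the reader sees them.
        CategorizedMixin.__init__(self, kwargs)
        ChunkedReader.__init__(self, *args, **kwargs)

    def _resolve(self, fileids, categories):
        if fileids is not None and categories is not None:
            raise ValueError("Specify fileids or categories, not both")
        if categories is not None:
            return self.fileids(categories)
        return fileids

    def chunked_sents(self, fileids=None, categories=None):
        # Delegate to the chunked reader once categories are mapped to file IDs.
        return ChunkedReader.chunked_sents(self, self._resolve(fileids, categories))


# Made-up corpus: each sentence is a string in bracketed chunk notation.
corpus = {
    "news_01.chunk": ["[the/DT market/NN] fell/VBD"],
    "news_02.chunk": ["[shares/NNS] rose/VBD"],
    "sports_01.chunk": ["[the/DT team/NN] won/VBD"],
}
reader = CategorizedChunkedReader(corpus, cat_pattern=r"([a-z]+)_.*")
print(reader.categories())
print(reader.chunked_sents(categories="sports"))
```

Every other reader method that should accept categories would be wrapped the same way as chunked_sents here: resolve the arguments to file IDs, then call the chunked reader's version.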