Creating a categorized chunk corpus reader
NLTK provides the CategorizedPlaintextCorpusReader and CategorizedTaggedCorpusReader classes, but there's no categorized corpus reader for chunked corpora. So in this recipe, we're going to make one.
Getting ready
Refer to the earlier recipe, Creating a chunked phrase corpus, for an explanation of ChunkedCorpusReader, and refer to the previous recipe for details on CategorizedPlaintextCorpusReader and CategorizedTaggedCorpusReader, both of which inherit from CategorizedCorpusReader.
How to do it...
We'll create a class called CategorizedChunkedCorpusReader that inherits from both CategorizedCorpusReader and ChunkedCorpusReader. It is heavily based on the CategorizedTaggedCorpusReader class, and also provides ...
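The inheritance described above might be sketched as follows. This is a minimal illustration rather than the recipe's full listing: only two reader methods are overridden, and the sample file names, their contents, and the cat_pattern value are made up for the demonstration. Recent NLTK releases already give CategorizedCorpusReader a similar _resolve helper; it is spelled out here so the sketch does not depend on that.

```python
import os
import tempfile

from nltk.corpus.reader import CategorizedCorpusReader, ChunkedCorpusReader


class CategorizedChunkedCorpusReader(CategorizedCorpusReader, ChunkedCorpusReader):
    """A chunked corpus reader whose files can also be selected by category."""

    def __init__(self, *args, **kwargs):
        # CategorizedCorpusReader takes the kwargs dict itself and removes its
        # own argument (cat_pattern, cat_map or cat_file) from it, so the
        # remaining kwargs can be passed on to ChunkedCorpusReader.
        CategorizedCorpusReader.__init__(self, kwargs)
        ChunkedCorpusReader.__init__(self, *args, **kwargs)

    def _resolve(self, fileids, categories):
        # Turn a category selection into file IDs; reject ambiguous calls.
        if fileids is not None and categories is not None:
            raise ValueError("Specify fileids or categories, not both")
        if categories is not None:
            return self.fileids(categories)
        return fileids

    # Each reader method delegates to ChunkedCorpusReader after resolving
    # categories. Other methods would follow the same pattern.
    def tagged_words(self, fileids=None, categories=None):
        return ChunkedCorpusReader.tagged_words(
            self, self._resolve(fileids, categories))

    def chunked_sents(self, fileids=None, categories=None):
        return ChunkedCorpusReader.chunked_sents(
            self, self._resolve(fileids, categories))


# Hypothetical two-file corpus; the category is the prefix of each file name.
root = tempfile.mkdtemp()
samples = {
    "news_a.chunk": "[The/DT cat/NN] sat/VBD on/IN [the/DT mat/NN] ./.\n",
    "fiction_b.chunk": "[A/DT dragon/NN] slept/VBD ./.\n",
}
for name, text in samples.items():
    with open(os.path.join(root, name), "w", encoding="utf8") as f:
        f.write(text)

reader = CategorizedChunkedCorpusReader(
    root, r".*\.chunk", cat_pattern=r"([a-z]+)_.*")
news_sents = reader.chunked_sents(categories="news")
```

Listing CategorizedCorpusReader first in the bases means its fileids(categories) method is the one found, while the chunk-specific methods still come from ChunkedCorpusReader.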