Python 3 Text Processing with NLTK 3 Cookbook - Second Edition

by Jacob Perkins
August 2014
Content level: Beginner to intermediate
304 pages
7h 10m
English
Packt Publishing

Training a tagger-based chunker

Training a chunker can be a great alternative to manually specifying regular expression chunk patterns. Instead of a painstaking process of trial and error to get the patterns exactly right, we can use existing corpus data to train chunkers, much like we did for part-of-speech tagging in the previous chapter.

How to do it...

As with part-of-speech tagging, we'll use the treebank corpus data for training. This time, however, we'll use the treebank_chunk corpus, which is specifically formatted to produce chunked sentences in the form of trees. The trees returned by its chunked_sents() method will be used by a TagChunker class to train a tagger-based chunker. The TagChunker class uses a helper function, conll_tag_chunks(), to extract a list ...
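
Since the preview cuts off before the code, the following is only a minimal sketch of how such a chunker might be put together, not necessarily the book's implementation. It assumes that conll_tag_chunks() reduces each chunk tree to a list of (part-of-speech tag, IOB chunk tag) pairs, and that TagChunker trains a bigram tagger with a unigram backoff on those pairs. The method bodies and the tiny hand-built training trees (used so the example runs without downloading any corpus) are illustrative assumptions.

```python
from nltk.chunk import ChunkParserI, conlltags2tree, tree2conlltags
from nltk.tag import BigramTagger, UnigramTagger
from nltk.tree import Tree


def conll_tag_chunks(chunk_sents):
    """Assumed behavior: turn chunk trees into (pos_tag, iob_tag) sentences."""
    tagged_sents = [tree2conlltags(tree) for tree in chunk_sents]
    # Drop the word so the tagger learns to map POS tags to IOB chunk tags.
    return [[(pos, iob) for (_word, pos, iob) in sent] for sent in tagged_sents]


class TagChunker(ChunkParserI):
    """Sketch of a chunker that treats chunking as tagging POS tags."""

    def __init__(self, train_chunks):
        train_sents = conll_tag_chunks(train_chunks)
        # Assumed tagger setup: bigram tagger backed off to a unigram tagger.
        unigram = UnigramTagger(train_sents)
        self.tagger = BigramTagger(train_sents, backoff=unigram)

    def parse(self, tagged_sent):
        if not tagged_sent:
            return None
        words, tags = zip(*tagged_sent)
        # The "tokens" given to the tagger are POS tags; its output is IOB tags.
        chunks = self.tagger.tag(list(tags))
        return conlltags2tree([(w, t, c) for w, (t, c) in zip(words, chunks)])


# Hand-built stand-in for corpus data (illustrative only).
train_trees = [
    Tree("S", [
        Tree("NP", [("the", "DT"), ("cat", "NN")]),
        ("sat", "VBD"),
        ("on", "IN"),
        Tree("NP", [("the", "DT"), ("mat", "NN")]),
    ]),
    Tree("S", [
        Tree("NP", [("a", "DT"), ("dog", "NN")]),
        ("barked", "VBD"),
    ]),
]

chunker = TagChunker(train_trees)
print(chunker.parse([("the", "DT"), ("bird", "NN"), ("sang", "VBD")]))
```

With the treebank_chunk corpus installed, the trees from treebank_chunk.chunked_sents() could be passed to the constructor in place of the hand-built list. Because the tagger only ever sees part-of-speech tags, the chunker can group words it never encountered in training, as long as their tags were seen.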


ISBN: 9781782167853