August 2014
Beginner to intermediate
304 pages
7h 10m
English
The following is a table of all the part-of-speech tags that occur in the treebank corpus distributed with NLTK. The tags and counts shown here were acquired using the following code:
>>> from nltk.probability import FreqDist
>>> from nltk.corpus import treebank
>>> fd = FreqDist()
>>> for word, tag in treebank.tagged_words():
...     fd[tag] += 1
>>> fd.items()
The FreqDist fd contains the counts shown here for every tag in the treebank corpus. You can look up the count for an individual tag with fd[tag], for example, fd['DT']. Punctuation tags are also shown, along with special tags such as -NONE-, which signifies that the part-of-speech tag is unknown. Descriptions of most of the tags can be found ...
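The counting and lookup described above can be tried without downloading the corpus. The following is a minimal sketch, not the code used to build the table: it uses collections.Counter (the class that FreqDist subclasses in NLTK 3) as a stand-in, and the tagged_words list and the count_tags helper are hypothetical, made up purely for illustration.

```python
from collections import Counter

# Hypothetical sample standing in for treebank.tagged_words();
# these (word, tag) pairs are invented for illustration only.
tagged_words = [
    ("The", "DT"), ("cat", "NN"), ("sat", "VBD"), ("on", "IN"),
    ("the", "DT"), ("mat", "NN"), (".", "."),
]

def count_tags(pairs):
    """Count how often each tag occurs in an iterable of (word, tag) pairs."""
    return Counter(tag for _word, tag in pairs)

tag_counts = count_tags(tagged_words)

# Look up a single tag, as with fd['DT'] in the text.
dt_count = tag_counts["DT"]

# A tag that never occurred reports a count of zero instead of raising KeyError.
missing_count = tag_counts["XX"]

# Punctuation is counted under its own tag, like any other tag.
period_count = tag_counts["."]
```

With NLTK installed and the treebank corpus downloaded, passing treebank.tagged_words() to count_tags should produce the same kind of tag-to-count mapping, and FreqDist(tag for word, tag in treebank.tagged_words()) is an equivalent one-line form of the loop shown earlier.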