Extracting Context Free Grammar (CFG) rules from Treebank
Context-free grammar (CFG) was defined for natural languages by Noam Chomsky in 1957. A CFG consists of the following components:
- A set of nonterminal nodes (N)
- A set of terminal nodes (T)
- Start symbol (S)
- A set of production rules (P) of the form:
A → α
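As a rough illustration of the four components, a CFG can be held in a small Python structure. This is only a sketch under stated assumptions: the class name `CFG`, its field names, the `is_well_formed` check, and the toy grammar are hypothetical and are not taken from NLTK or from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """Hypothetical container for the four components of a CFG."""
    nonterminals: frozenset  # N
    terminals: frozenset     # T
    start: str               # S
    productions: tuple       # P, each rule stored as (lhs, rhs_tuple)

    def is_well_formed(self) -> bool:
        """Check that S is in N, N and T are disjoint, and every rule
        has a nonterminal on the left and known symbols on the right."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            and not (self.nonterminals & self.terminals)
            and all(
                lhs in self.nonterminals and all(sym in symbols for sym in rhs)
                for lhs, rhs in self.productions
            )
        )


# Toy grammar (illustrative only) that generates "the dog barks".
toy = CFG(
    nonterminals=frozenset({"S", "NP", "VP", "Det", "Noun", "Verb"}),
    terminals=frozenset({"the", "dog", "barks"}),
    start="S",
    productions=(
        ("S", ("NP", "VP")),
        ("NP", ("Det", "Noun")),
        ("VP", ("Verb",)),
        ("Det", ("the",)),
        ("Noun", ("dog",)),
        ("Verb", ("barks",)),
    ),
)

print(toy.is_well_formed())
```

Storing each right-hand side as a tuple keeps the rules hashable and preserves symbol order, which matters because a production rewrites a nonterminal into an ordered sequence.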
CFG rules are of two types: phrase structure rules and sentence structure rules.
A phrase structure rule can be defined as follows: A → α, where A ∈ N and α consists of terminals and nonterminals.
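To make the link between such rules and a treebank concrete, the sketch below reads one bracketed parse tree and lists the rule used at each node, with the node label as the left-hand side and its children as the right-hand side. It is a hand-rolled, standard-library illustration: the helper names `parse_bracketed` and `productions` and the example sentence are assumptions made for this sketch. NLTK offers comparable functionality through `Tree.fromstring` and `Tree.productions()`.

```python
def parse_bracketed(text):
    """Parse a bracketed tree such as '(S (NP ...) (VP ...))' into
    nested (label, children) tuples; leaf words stay plain strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())  # nested constituent
            else:
                children.append(tokens[pos])  # terminal (word)
                pos += 1
        pos += 1  # consume the closing ')'
        return (label, children)

    return read_node()


def productions(node):
    """Return the rules used in the tree, in pre-order, as (lhs, rhs) pairs."""
    label, children = node
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    rules = [(label, rhs)]
    for child in children:
        if isinstance(child, tuple):
            rules.extend(productions(child))
    return rules


# Illustrative tree in Penn Treebank bracket style.
tree = parse_bracketed("(S (NP (DT the) (NN dog)) (VP (VBZ barks)))")
for lhs, rhs in productions(tree):
    print(lhs, "->", " ".join(rhs))
```

Running the same extraction over every tree in a treebank and collecting the distinct rules would yield the production set P of the induced grammar.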
In the sentence-level construction of a CFG, there are four structures:
- Declarative structure: Deals with declarative sentences (the subject is followed by a predicate).
- Imperative structure: Deals with imperative sentences, commands, or suggestions (sentences begin with a verb phrase ...