Context-Free Grammar
A Simple Grammar
Let’s start off by looking at a simple context-free grammar (CFG). By convention, the lefthand side of the first production is the start-symbol of the grammar, typically S, and all well-formed trees must have this symbol as their root label. In NLTK, context-free grammars are defined in the nltk.grammar module. In Example 8-9 we define a grammar and show how to parse a simple sentence admitted by the grammar.
Example 8-9. A simple context-free grammar.
grammar1 = nltk.parse_cfg("""
  S -> NP VP
  VP -> V NP | V NP PP
  PP -> P NP
  V -> "saw" | "ate" | "walked"
  NP -> "John" | "Mary" | "Bob" | Det N | Det N PP
  Det -> "a" | "an" | "the" | "my"
  N -> "man" | "dog" | "cat" | "telescope" | "park"
  P -> "in" | "on" | "by" | "with"
  """)
>>> sent = "Mary saw Bob".split()
>>> rd_parser = nltk.RecursiveDescentParser(grammar1)
>>> for tree in rd_parser.nbest_parse(sent):
...     print tree
(S (NP Mary) (VP (V saw) (NP Bob)))
The grammar in Example 8-9 contains productions involving various syntactic categories, as laid out in Table 8-1. The recursive descent parser used here can also be inspected via a graphical interface, as illustrated in Figure 8-3; we discuss this parser in more detail in Parsing with Context-Free Grammar.
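To make concrete how a recursive descent parser works through these productions, here is a minimal sketch in plain Python 3. It is not NLTK's implementation: the grammar is re-encoded by hand as a dictionary, and the names GRAMMAR, expand, expand_sequence, parse, and bracket are hypothetical helpers introduced only for illustration. The sketch assumes the grammar has no left-recursive productions, which holds for the grammar in Example 8-9.

```python
# Hand-written copy of the productions in Example 8-9 (an assumption of
# this sketch; NLTK stores grammars in its own classes, not in a dict).
# Each nonterminal maps to a list of alternative right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
    "V": [["saw"], ["ate"], ["walked"]],
    "NP": [["John"], ["Mary"], ["Bob"], ["Det", "N"], ["Det", "N", "PP"]],
    "Det": [["a"], ["an"], ["the"], ["my"]],
    "N": [["man"], ["dog"], ["cat"], ["telescope"], ["park"]],
    "P": [["in"], ["on"], ["by"], ["with"]],
}


def expand(symbol, tokens, pos):
    """Yield (tree, next_pos) for each way symbol can cover tokens[pos:next_pos]."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the next input word exactly.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1
        return
    # Nonterminal: try every alternative right-hand side in turn.
    for rhs in GRAMMAR[symbol]:
        for children, end in expand_sequence(rhs, tokens, pos):
            yield (symbol,) + tuple(children), end


def expand_sequence(symbols, tokens, pos):
    """Yield (subtrees, next_pos) for each way to match symbols left to right."""
    if not symbols:
        yield [], pos
        return
    for tree, mid in expand(symbols[0], tokens, pos):
        for rest, end in expand_sequence(symbols[1:], tokens, mid):
            yield [tree] + rest, end


def parse(tokens, start="S"):
    """Return every tree rooted in start that covers the whole token list."""
    return [tree for tree, end in expand(start, tokens, 0) if end == len(tokens)]


def bracket(tree):
    """Render a (label, child, ...) tuple in bracketed notation."""
    if isinstance(tree, str):
        return tree
    return "(" + tree[0] + " " + " ".join(bracket(c) for c in tree[1:]) + ")"


for tree in parse("Mary saw Bob".split()):
    print(bracket(tree))  # (S (NP Mary) (VP (V saw) (NP Bob)))
```

Because parse collects every complete analysis, a sentence such as "the dog saw a man in the park" yields two trees under this grammar: one where the PP attaches inside the NP and one where it attaches to the VP.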
Table 8-1. Syntactic categories
Symbol | Meaning | Example |
---|---|---|
S | sentence | the man walked |
NP | noun phrase | a dog |
VP | verb phrase | saw a park |
PP | prepositional phrase | with a telescope |
Det | determiner | the |
N | noun | dog |
V | verb | walked |
P | preposition | in |
A production like VP -> V NP | V NP PP has a disjunction ...