Extending a Feature-Based Grammar
In this section, we return to feature-based grammar, explore a variety of linguistic issues, and demonstrate the benefits of incorporating features into the grammar.
Subcategorization
In Chapter 8, we augmented our category labels to represent
different kinds of verbs, and used the labels IV and TV
for intransitive and transitive verbs respectively. This allowed us to
write productions like the following:
Example 9-31.
VP -> IV
VP -> TV NP
Although we know that IV and TV are two kinds of V, they are just atomic
non-terminal symbols in a CFG and are as distinct from each other as any
other pair of symbols. This notation doesn’t let us say anything about verbs
in general; e.g., we cannot say “All lexical items of category V can be
marked for tense,” since walk, say, is an item of category IV, not V. So,
can we replace category labels such as TV and IV by V along with a feature
that tells us whether the verb combines with a following NP object or
whether it can occur without any complement?
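To make the contrast concrete, here is a minimal sketch in plain Python (not NLTK's grammar machinery). The two lexicons, the entry for see, and the helper names are hypothetical and exist only for illustration. The sketch shows that a query for "all items of category V" finds nothing when verbs carry atomic labels, but succeeds once the category V is shared and the verb class is a separate feature.

```python
# Atomic labels: plain strings with no internal structure, so "IV" and "TV"
# have nothing in common as far as the grammar is concerned.
ATOMIC_LEXICON = {"walk": "IV", "see": "TV"}

# Feature-based categories (hypothetical encoding as dicts): every verb
# shares CAT="V", and a separate feature records the verb class.
FEATURE_LEXICON = {
    "walk": {"CAT": "V", "SUBCAT": "intrans"},
    "see": {"CAT": "V", "SUBCAT": "trans"},
}


def atomic_verbs(lexicon):
    """Return the words whose atomic label is exactly 'V'."""
    return sorted(word for word, label in lexicon.items() if label == "V")


def feature_verbs(lexicon):
    """Return the words whose CAT feature is 'V', whatever their SUBCAT."""
    return sorted(word for word, cat in lexicon.items() if cat["CAT"] == "V")


print(atomic_verbs(ATOMIC_LEXICON))    # []
print(feature_verbs(FEATURE_LEXICON))  # ['see', 'walk']
```

With atomic labels, a generalization such as "all verbs can be marked for tense" would have to be restated once per label; with a shared CAT value, it can be stated once.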
A simple approach, originally developed for a grammar framework
called Generalized Phrase Structure Grammar (GPSG), tries to solve
this problem by allowing lexical categories to bear a SUBCAT feature,
which tells us what subcategorization class the item belongs to. In
contrast to the integer values for SUBCAT used by GPSG, the example here
adopts more mnemonic values, namely intrans, trans, and clause:
Example 9-32.
VP[TENSE=?t, NUM=?n] -> V[SUBCAT=intrans, TENSE=?t, ...
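As a rough illustration of how a SUBCAT value can determine which complements a verb takes, the following sketch uses plain Python rather than NLTK's grammar notation, and it is not a reconstruction of the productions in the example. The mapping from intrans and trans follows the text (no complement, and a following NP); the SBar label for clause, the lexical entries for see and say, and the function name are assumptions made for illustration.

```python
# Complements expected for each SUBCAT value.
COMPLEMENTS = {
    "intrans": [],        # occurs without any complement
    "trans": ["NP"],      # combines with a following NP object
    "clause": ["SBar"],   # assumption: "SBar" is a hypothetical label for a clausal complement
}

# Hypothetical toy lexicon: each verb is of category V and records its SUBCAT value.
LEXICON = {"walk": "intrans", "see": "trans", "say": "clause"}


def vp_is_licensed(verb, complements):
    """Check whether the verb's SUBCAT class allows exactly these complements."""
    subcat = LEXICON[verb]
    return COMPLEMENTS[subcat] == list(complements)


print(vp_is_licensed("walk", []))        # True
print(vp_is_licensed("walk", ["NP"]))    # False
print(vp_is_licensed("see", ["NP"]))     # True
print(vp_is_licensed("say", ["SBar"]))   # True
```

Because every entry is a V, adding a new verb class only means adding a new SUBCAT value and its complement list; no new category label is needed.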