"Hello, World!" on Steroids!

Pyparsing comes with a number of examples, including a basic "Hello, World!" parser[1]. This simple example is also covered in the O'Reilly ONLamp.com article "Building Recursive Descent Parsers with Python" (http://www.onlamp.com/pub/a/python/2006/01/26/pyparsing.html). In this section, I use this same example to introduce many of the basic parsing tools in pyparsing.

The current "Hello, World!" parsers are limited to greetings of the form:

word, word !
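
To make that limitation concrete, here is a minimal sketch of a parser for the "word, word !" form. It assumes pyparsing's Word class and alphas constant, and it may differ in detail from the example shipped with pyparsing; the name greet is chosen here for illustration only.

```python
from pyparsing import Word, alphas, ParseException

# word, word !
# (string literals combined with + are converted to Literal expressions)
greet = Word(alphas) + "," + Word(alphas) + "!"

# A two-word greeting fits the pattern.
print(greet.parseString("Hello, World!").asList())

# A salutation of more than one word does not: after "Good",
# the parser expects a comma and finds "morning" instead.
try:
    greet.parseString("Good morning, Miss Crabtree!")
except ParseException as err:
    print("no match:", err)
```

The second call fails because each Word(alphas) matches exactly one run of letters, so there is no room for "Good morning" or "Miss Crabtree".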

This limits our options a bit, so let's expand the grammar to handle more complicated greetings. Let's say we want to parse any of the following:

Hello, World!
Hi, Mom!
Good morning, Miss Crabtree!
Yo, Adrian!
Whattup, G?
How's it goin', Dude?
Hey, Jude!
Goodbye, Mr. Chips!

The first step in writing a parser for these strings is to identify the pattern that they all follow. Following our best practice, we write up this pattern as a BNF. Using ordinary words to describe a greeting, we would say, "a greeting is made up of one or more words (which is the salutation), followed by a comma, followed by one or more additional words (which is the subject of the greeting, or greetee), and ending with either an exclamation point or a question mark." As BNF, this description looks like:

greeting ::= salutation comma greetee endpunc
salutation ::= word+
comma ::= ,
greetee ::= word+
word ::= a collection of one or more characters, which are any alpha or ' or .
endpunc ::= ! | ?

This BNF translates almost directly into ...
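
As a rough sketch of how such a translation might look, the block below writes one pyparsing expression per BNF rule. It assumes pyparsing's Word, OneOrMore, Literal, and oneOf, and it is one possible rendering of the BNF above, not necessarily the exact listing this section goes on to give.

```python
from pyparsing import Word, OneOrMore, Literal, oneOf, alphas

# word ::= one or more characters, each an alpha or ' or .
word = Word(alphas + "'.")
# salutation ::= word+
salutation = OneOrMore(word)
# comma ::= ,
comma = Literal(",")
# greetee ::= word+
greetee = OneOrMore(word)
# endpunc ::= ! | ?
endpunc = oneOf("! ?")
# greeting ::= salutation comma greetee endpunc
greeting = salutation + comma + greetee + endpunc

tests = [
    "Hello, World!",
    "Hi, Mom!",
    "Good morning, Miss Crabtree!",
    "Yo, Adrian!",
    "Whattup, G?",
    "How's it goin', Dude?",
    "Hey, Jude!",
    "Goodbye, Mr. Chips!",
]

for t in tests:
    # Each result is a flat list of the matched tokens.
    print(t, "->", greeting.parseString(t).asList())
```

Note that the result is a flat token list, so nothing yet marks where the salutation ends and the greetee begins, other than the position of the comma.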

