"Hello, World!" on Steroids!

Pyparsing comes with a number of examples, including a basic "Hello, World!" parser[1]. This simple example is also covered in the O'Reilly ONLamp.com article "Building Recursive Descent Parsers with Python" (http://www.onlamp.com/pub/a/python/2006/01/26/pyparsing.html). In this section, I use this same example to introduce many of the basic parsing tools in pyparsing.

The current "Hello, World!" parsers are limited to greetings of the form:

word, word !
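
To see the limitation concretely, here is a minimal sketch of a parser for the word, word ! form. It is an illustration rather than the example shipped with pyparsing: the name greet and the sample strings are assumptions, and parseString is used because it is available in both older and newer pyparsing releases.

```python
from pyparsing import Word, alphas, ParseException

# Hypothetical parser for the restricted form: word , word !
greet = Word(alphas) + "," + Word(alphas) + "!"

# A greeting that fits the restricted form parses cleanly.
print(greet.parseString("Hello, World!").asList())
# ['Hello', ',', 'World', '!']

# A multi-word salutation does not fit: after "Good" the parser
# expects a comma but finds "morning" instead.
try:
    greet.parseString("Good morning, Miss Crabtree!")
    multiword_ok = True
except ParseException:
    multiword_ok = False
print(multiword_ok)
```

The second call fails because each Word(alphas) matches exactly one run of letters, so there is no room for a second word before the comma.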

This limits our options a bit, so let's expand the grammar to handle more complicated greetings. Let's say we want to parse any of the following:

Hello, World!
Hi, Mom!
Good morning, Miss Crabtree!
Yo, Adrian!
Whattup, G?
How's it goin', Dude?
Hey, Jude!
Goodbye, Mr. Chips!

The first step in writing a parser for these strings is to identify the pattern that they all follow. Following our best practice, we write up this pattern as a BNF. Using ordinary words to describe a greeting, we would say, "a greeting is made up of one or more words (which is the salutation), followed by a comma, followed by one or more additional words (which is the subject of the greeting, or greetee), and ending with either an exclamation point or a question mark." As BNF, this description looks like:

greeting ::= salutation comma greetee endpunc
salutation ::= word+
comma ::= ,
greetee ::= word+
word ::= a collection of one or more characters, which are any alpha or ' or .
endpunc ::= ! | ?
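
As a rough sketch of how these rules might be written with pyparsing, each BNF line can be mapped to one Python assignment. The variable names simply mirror the BNF, and the choice of Word, OneOrMore, and Literal is one reasonable mapping assumed here for illustration, not necessarily the one the text goes on to develop.

```python
from pyparsing import Word, OneOrMore, Literal, alphas

# word ::= one or more characters, each a letter, an apostrophe, or a period
word = Word(alphas + "'.")

# salutation ::= word+   and   greetee ::= word+
salutation = OneOrMore(word)
greetee = OneOrMore(word)

# comma ::= ,   and   endpunc ::= ! | ?
comma = Literal(",")
endpunc = Literal("!") | Literal("?")

# greeting ::= salutation comma greetee endpunc
greeting = salutation + comma + greetee + endpunc

samples = [
    "Hello, World!",
    "Good morning, Miss Crabtree!",
    "How's it goin', Dude?",
    "Goodbye, Mr. Chips!",
]
for s in samples:
    # Print the flat list of matched tokens for each greeting.
    print(greeting.parseString(s).asList())
```

Because a comma is not among the characters allowed in word, the salutation stops at the comma on its own. Note that this sketch returns a flat token list, so nothing yet marks where the salutation ends and the greetee begins.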

This BNF translates almost directly into ...
