Basic Form of a Pyparsing Program

The prototypical pyparsing program has the following structure:

  • Import names from the pyparsing module

  • Define grammar using pyparsing classes and helper methods

  • Use the grammar to parse the input text

  • Process the results from parsing the input text
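The four steps above can be sketched as a very small program. This is only an illustrative outline: the greeting grammar, the input string, and the variable names are assumptions chosen for brevity, not a required layout.

```python
# Step 1: import names from the pyparsing module
from pyparsing import Word, alphas

# Step 2: define the grammar
# (an illustrative pattern: a word, a comma, another word, and "!")
greeting = Word(alphas) + "," + Word(alphas) + "!"

# Step 3: use the grammar to parse the input text
results = greeting.parseString("Hello, World!")

# Step 4: process the results (here, just convert them to a plain list)
tokens = results.asList()
print(tokens)  # ['Hello', ',', 'World', '!']
```

Each step maps onto one or two statements here; in a larger program the grammar definition usually grows the most, while the import, parse, and processing steps keep the same basic shape.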

Import Names from Pyparsing

In general, using the form from pyparsing import * is discouraged among Python style experts. It pollutes the local variable namespace with an unknown number of new names from the imported module. However, during pyparsing grammar development, it is hard to anticipate all of the parser element types and other pyparsing-defined names that will be needed, and this form simplifies early grammar development. After the grammar is mostly finished, you can go back to this statement and replace the * with the list of pyparsing names that you actually used.
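As a rough sketch of that cleanup, the wildcard import can be replaced by an explicit list once the grammar has settled. The key_value grammar below is a made-up example that happens to need only three pyparsing names.

```python
# Early development: convenient, but brings in every public pyparsing name
# from pyparsing import *

# After the grammar is mostly finished: import only the names actually used
from pyparsing import Word, alphas, nums

# Hypothetical grammar for settings such as "width=80"
key_value = Word(alphas) + "=" + Word(nums)

parsed = key_value.parseString("width=80").asList()
print(parsed)  # ['width', '=', '80']
```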

Define the Grammar

The grammar is your definition of the text pattern that you want to extract from the input text. With pyparsing, the grammar takes the form of one or more Python statements that define text patterns, and combinations of patterns, using pyparsing classes and helpers to specify these individual pieces. Pyparsing allows you to use operators such as +, |, and ^ to simplify this code. For instance, if I use the pyparsing Word class to define a typical programming variable name consisting of a leading alphabetic character with a body of alphanumeric characters or underscores, I would start with the Python statement:

identifier = ...
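To illustrate how Word and the +, |, and ^ operators combine, here is a small sketch. It is not the statement from the text above: the names var_name, integer, real, and assignment are assumptions made for this example, and the two-argument call to Word (leading characters, then body characters) is just one way to express a variable-name pattern.

```python
from pyparsing import Combine, Word, alphas, alphanums, nums

# A leading alphabetic character followed by alphanumerics or underscores
var_name = Word(alphas, alphanums + "_")

# Two numeric patterns that can both match the start of "3.14"
integer = Word(nums)
real = Combine(Word(nums) + "." + Word(nums))

# "|" takes the first alternative that matches; "^" takes the longest match
first_match = (integer | real).parseString("3.14").asList()
longest_match = (integer ^ real).parseString("3.14").asList()
print(first_match)    # ['3']
print(longest_match)  # ['3.14']

# "+" joins expressions into a sequence; whitespace between tokens is skipped
assignment = var_name + "=" + (integer ^ real)
assigned = assignment.parseString("x_1 = 3.14").asList()
print(assigned)       # ['x_1', '=', '3.14']
```

The difference between | and ^ matters whenever one alternative is a prefix of another: with |, the order of the alternatives decides the outcome, while ^ tries every alternative and keeps the longest match.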
