Hand-Coded Parsers

Since Python is a general purpose programming language, it’s also reasonable to consider writing a hand-coded parser. For instance, recursive descent parsing is a fairly well-known technique for analyzing language-based information. Since Python is a very high-level language, writing the parser itself is usually easier than it would be in a traditional language like C or C++.

To illustrate, this section develops a custom parser for a simple grammar: it parses and evaluates arithmetic expression strings. This example also demonstrates the utility of Python as a general-purpose programming language. Although Python is often used as a frontend or rapid development language, it’s also useful for the kinds of things we’d normally write in a systems development language like C or C++.

The Expression Grammar

The grammar our parser will recognize can be described as follows:

goal -> <expr> END                       [number, variable, ( ]
goal -> <assign> END                     [set]
 
assign -> 'set' <variable> <expr>        [set]

expr -> <factor> <expr-tail>             [number, variable, ( ]
 
expr-tail -> ^                           [END, ) ]
expr-tail -> '+' <factor> <expr-tail>    [+]
expr-tail -> '-' <factor> <expr-tail>    [-]

factor -> <term> <factor-tail>           [number, variable, ( ]

factor-tail -> ^                         [+, -, END, ) ]
factor-tail -> '*' <term> <factor-tail>  [*]
factor-tail -> '/' <term> <factor-tail>  [/]
 
term -> <number>                         [number]
term -> <variable>                       [variable]
term -> '(' <expr> ')'                   [(]

tokens: (, ), num, var, -, +, /, *, set, end

This is a fairly typical grammar ...

Get Programming Python, Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.