Writing a Simple Recursive Descent Parser
Problem
You need to parse text according to a set of grammar rules and perform actions or build an abstract syntax tree representing the input. The grammar is small, so you’d prefer to just write the parser yourself as opposed to using some kind of framework.
Solution
Here we’re focused on the problem of parsing text according to a particular grammar. In order to do this, you should probably start by having a formal specification of the grammar in the form of a BNF or EBNF. For example, a grammar for simple arithmetic expressions might look like this:
expr ::= expr + term | expr - term | term term ::= term * factor | term / factor | factor factor ::= ( expr ) | NUM
Or, alternatively, in EBNF form:
expr ::= term { (+|-) term }* term ::= factor { (*|/) factor }* factor ::= ( expr ) | NUM
In an EBNF, parts of a rule enclosed in { ... }*
are
optional. The *
means zero or more repetitions (the same
meaning as in a regular expression).
Now, if you’re not familiar with the mechanics of working with a BNF, think of it as a specification of substitution or replacement rules where the symbols on the left side can be replaced by the symbols on the right (or vice versa). Generally, what happens during parsing is that you try to match the input text to the grammar by ...
Get Writing a Simple Recursive Descent Parser now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.