Chapter 3. Using Yacc

The previous chapter concentrated on lex alone. In this chapter we turn our attention to yacc, although we use lex to generate our lexical analyzers. Where lex recognizes regular expressions, yacc recognizes entire grammars. Lex divides the input stream into pieces (tokens) and then yacc takes these pieces and groups them together logically.

In this chapter we create a desk calculator, starting with simple arithmetic, then adding built-in functions, user variables, and finally user-defined functions.

Grammars

Yacc takes a grammar that you specify and writes a parser that recognizes valid “sentences” in that grammar. We use the term “sentence” here in a fairly general way—for a C language grammar the sentences are syntactically valid C programs.[8]

As we saw in Chapter 1, a grammar is a series of rules that the parser uses to recognize syntactically valid input. For example, here is a version of the grammar we’ll use later in this chapter to build a calculator.

            statement → NAME = expression

        expression → NUMBER + NUMBER | NUMBER − NUMBER

The vertical bar, “|”, means there are two possibilities for the same symbol, i.e., an expression can be either an addition or a subtraction. The symbol to the left of the → is known as the left-hand side of the rule, often abbreviated LHS, and the symbols to the right are the right-hand side, usually abbreviated RHS. Several rules may have the same left-hand side; the vertical bar is just a short hand for this. Symbols that actually ...

Get lex & yacc, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.