Chapter 7. A Reference for Yacc Grammars

In this chapter, we discuss the format of the yacc grammar and describe the various features and options available. This chapter summarizes the capabilities demonstrated in the examples in previous chapters and covers features not yet mentioned.

After the section on the structure of a yacc grammar, the sections in this chapter are in alphabetical order by feature.

Structure of a Yacc Grammar

A yacc grammar consists of three sections: the definition section, the rules section, and the user subroutines section.

    ... definition section ...
    %%
    ... rules section ...
    %%
    ... user subroutines section ...

The sections are separated by lines consisting of two percent signs. The first two sections are required, although a section may be empty. The third section and the preceding “%%” line may be omitted. (Lex uses the same structure.)
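As a minimal sketch of this three-section layout, the following grammar accepts a single number followed by a newline. The token name NUMBER, the rule name line, and the hand-written lexer are illustrative assumptions, not taken from the text; features such as %token and the %{ ... %} block are used here only to make the file complete.

```yacc
%{
/* Definition section: C code between %{ and %} is copied into the parser. */
#include <stdio.h>
#include <ctype.h>

int yylex(void);
void yyerror(const char *msg);
%}

%token NUMBER   /* hypothetical token name, chosen for this sketch */

%%

/* Rules section: one rule that accepts a number followed by a newline. */
line: NUMBER '\n'   { printf("saw a number\n"); }
    ;

%%

/* User subroutines section: supporting C code, here a hand-written lexer. */
int yylex(void)
{
    int c = getchar();

    while (c == ' ' || c == '\t')   /* skip blanks */
        c = getchar();
    if (c == EOF)
        return 0;                   /* 0 tells the parser the input has ended */
    if (isdigit(c)) {
        while (isdigit(c = getchar()))
            ;                       /* consume the rest of the digits */
        ungetc(c, stdin);           /* push back the first non-digit */
        return NUMBER;
    }
    return c;                       /* any other character is its own token */
}

void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    return yyparse();
}
```

Deleting everything from the second “%%” line onward would still leave a valid grammar, although the lexer, yyerror, and main would then have to come from another source file or a library.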

Symbols

A yacc grammar is constructed from symbols, the “words” of the grammar. Symbols are strings of letters, digits, periods, and underscores that do not start with a digit. The symbol error is reserved for error recovery; otherwise, yacc attaches no a priori meaning to any symbol.

Symbols produced by the lexer are called terminal symbols or tokens. Those that are defined on the left-hand side of rules are called nonterminal symbols or non-terminals. Tokens may also be literal quoted characters. (See “Literal Tokens.”) A widely followed convention makes token names all uppercase and non-terminals lowercase. We follow that ...
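To illustrate these terms, here is a small sketch of a grammar for assignment lines such as “x = 42”. The names NAME, NUMBER, statement_list, and statement are hypothetical, chosen only to show uppercase tokens, lowercase non-terminals, literal tokens, and the reserved symbol error; the lexer is a hand-written assumption rather than anything prescribed by yacc.

```yacc
%{
#include <stdio.h>
#include <ctype.h>

int yylex(void);
void yyerror(const char *msg);
%}

/* Named tokens, uppercase by convention; the lexer returns these. */
%token NAME NUMBER

%%

/* Non-terminals, lowercase by convention; each appears on a left-hand side. */
statement_list: statement
    | statement_list statement
    ;

/* '=' and '\n' are literal tokens: quoted characters used directly. */
statement: NAME '=' NUMBER '\n'   { printf("assignment\n"); }
    | error '\n'                  { yyerrok; }  /* reserved symbol: resume after a bad line */
    ;

%%

int yylex(void)
{
    int c = getchar();

    while (c == ' ' || c == '\t')   /* skip blanks */
        c = getchar();
    if (c == EOF)
        return 0;                   /* end of input */
    if (isalpha(c) || c == '_') {
        while (isalnum(c = getchar()) || c == '_')
            ;                       /* consume the rest of the identifier */
        ungetc(c, stdin);
        return NAME;
    }
    if (isdigit(c)) {
        while (isdigit(c = getchar()))
            ;                       /* consume the rest of the number */
        ungetc(c, stdin);
        return NUMBER;
    }
    return c;                       /* literal tokens such as '=' and '\n' */
}

void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    return yyparse();
}
```

The lexer returns a literal token simply by returning the character itself, which is why '=' and '\n' need no %token declaration in this sketch.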
