Chapter 1. Introducing Flex and Bison
Flex and Bison are tools for building programs that handle structured input. They were originally tools for building compilers, but they have proven to be useful in many other areas. In this first chapter, we’ll start by looking at a little (but not too much) of the theory behind them, and then we’ll dive into some examples of their use.
Lexical Analysis and Parsing
The earliest compilers back in the 1950s used utterly ad hoc techniques to analyze the syntax of the source code of programs they were compiling. During the 1960s, the field got a lot of academic attention, and by the early 1970s, syntax analysis was a well-understood field.
One of the key insights was to break the job into two parts: lexical analysis (also called lexing or scanning) and syntax analysis (or parsing).
Roughly speaking, scanning divides the input into meaningful chunks, called tokens, and parsing figures out how the tokens relate to each other. For example, consider this snippet of C code:
alpha = beta + gamma ;
A scanner divides this into the tokens alpha, equal
sign, beta, plus sign, gamma, and semicolon. Then the parser determines that
beta + gamma is an expression, and
that the expression is assigned to alpha.