Chapter 2. Using Flex

In this chapter we’ll take a closer look at flex as a standalone tool, with some examples that exercise most of its C language capabilities. All of flex’s facilities are described in Chapter 5, and the usage of flex scanners in C++ programs is described in Chapter 9.

Regular Expressions

The patterns at the heart of every flex scanner use a rich regular expression language. A regular expression is a pattern description written in a metalanguage, a language that you use to describe what you want the pattern to match. Flex’s regular expression language is essentially that of POSIX extended regular expressions (which is not surprising considering their shared Unix heritage). The metalanguage uses standard text characters, some of which represent themselves and others of which represent patterns. All characters other than the ones listed below, including all letters and digits, match themselves.

The characters with special meaning in regular expressions are:

.

Matches any single character except the newline character (\n).

[]

A character class that matches any character within the brackets. If the first character is a circumflex (^), it changes the meaning to match any character except the ones within the brackets. A dash inside the square brackets indicates a character range; for example, [0-9] means the same thing as [0123456789] and [a-z] means any lowercase letter. A - or ] as the first character after the [ is interpreted literally to let you include dashes and square brackets ...
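
The two entries above can be tried out without generating a scanner. Because flex’s patterns are essentially POSIX extended regular expressions, the sketch below uses the C library’s regcomp() and regexec() as a stand-in for flex itself. The helper name has_match and its extra_flags parameter are invented for this illustration and are not part of flex. One difference to keep in mind: the C library’s . matches a newline unless REG_NEWLINE is passed, so that flag is needed to reproduce the behavior of . described above.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of flex): returns 1 if `pattern`, compiled
 * as a POSIX extended regular expression, matches anywhere in `text`,
 * 0 if it does not, and -1 if the pattern fails to compile.
 * `extra_flags` is OR-ed into the regcomp() flags, e.g. REG_NEWLINE. */
int has_match(const char *pattern, const char *text, int extra_flags)
{
    regex_t re;
    int status;

    /* REG_NOSUB: we only need a yes/no answer, not match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB | extra_flags) != 0)
        return -1;

    status = regexec(&re, text, 0, NULL, 0);
    regfree(&re);

    return status == 0 ? 1 : 0;
}
```

With this helper, has_match("[0-9]", "7", 0) and has_match("[0123456789]", "7", 0) both return 1, has_match("[^0-9]", "7", 0) returns 0, and has_match("[]-]", "]", 0) returns 1 because the leading ] is taken literally. With REG_NEWLINE, has_match("a.c", "abc", REG_NEWLINE) returns 1, while has_match("a.c", "a\nc", REG_NEWLINE) returns 0.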
