Pattern 2LL(1) Recursive-Descent Lexer


Lexers derive a stream of tokens from a character stream by recognizing lexical patterns.

Lexers are also called scanners, lexical analyzers, and tokenizers. As a bonus, this pattern can recognize nested lexical structures such as nested comments (even though languages typically don’t have super-complicated lexical structure).


The goal of the lexer is to emit a sequence of tokens. Each token has two primary attributes: a token type (symbol category) and the text associated with it. In the English language, we’ve got categories such as verbs and nouns as well as punctuation symbols such as commas and periods. All words within a particular category are said to have the same token type, though ...

Get Language Implementation Patterns now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.