Pattern 2: LL(1) Recursive-Descent Lexer

Purpose

Lexers derive a stream of tokens from a character stream by recognizing lexical patterns.

Lexers are also called scanners, lexical analyzers, and tokenizers. As a bonus, this pattern can recognize nested lexical structures such as nested comments (even though languages typically don’t have super-complicated lexical structure).
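To make the nested-comment claim concrete, here is a minimal sketch of a hand-written lexer that looks at one character of lookahead and uses one method per lexical rule. Everything in it is an assumption made for illustration, not taken from the book: the class name `NestedCommentLexer`, the `{ ... }` comment delimiters, and the two token types `NAME` and `COMMENT`. The point is only that the `comment` rule can call itself, which is what lets a recursive-descent lexer match nested structure.

```python
EOF = ""  # sentinel lookahead value once the input is exhausted


class NestedCommentLexer:
    """Hypothetical LL(1) lexer: one lookahead character, one method per rule."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.c = text[0] if text else EOF  # current lookahead character

    def consume(self):
        # Advance one character and refresh the lookahead.
        self.pos += 1
        self.c = self.text[self.pos] if self.pos < len(self.text) else EOF

    def comment(self):
        # comment : '{' ( comment | any char other than '{' or '}' )* '}'
        start = self.pos
        self.consume()  # skip the opening '{'
        while self.c != "}":
            if self.c == EOF:
                raise SyntaxError("unterminated comment")
            if self.c == "{":
                self.comment()  # recursive call matches the nested comment
            else:
                self.consume()
        self.consume()  # skip the closing '}'
        return self.text[start:self.pos]

    def name(self):
        # name : LETTER+
        start = self.pos
        while self.c.isalpha():
            self.consume()
        return self.text[start:self.pos]

    def tokens(self):
        # Decide which rule to apply by inspecting only the lookahead character.
        result = []
        while self.c != EOF:
            if self.c.isspace():
                self.consume()  # whitespace is skipped, not emitted
            elif self.c == "{":
                result.append(("COMMENT", self.comment()))
            elif self.c.isalpha():
                result.append(("NAME", self.name()))
            else:
                raise SyntaxError(f"invalid character: {self.c!r}")
        return result


print(NestedCommentLexer("a {x {y} z} b").tokens())
```

Single-character delimiters keep the sketch strictly LL(1); two-character delimiters such as `/*` and `*/` would need the rule to consume the first character before testing the second.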

Discussion

The goal of the lexer is to emit a sequence of tokens. Each token has two primary attributes: a token type (symbol category) and the text associated with it. In the English language, we’ve got categories such as verbs and nouns as well as punctuation symbols such as commas and periods. All words within a particular category are said to have the same token type, though ...
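As a rough illustration of the two token attributes described above, the sketch below pairs a token type with its text. The names `TokenType` and `Token`, the three categories, and the printed format are assumptions chosen for this example rather than the book's own definitions.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TokenType(Enum):
    # Hypothetical symbol categories, loosely mirroring the English analogy.
    NAME = auto()
    COMMA = auto()
    PERIOD = auto()


@dataclass(frozen=True)
class Token:
    type: TokenType  # the symbol category
    text: str        # the characters matched for this token

    def __str__(self):
        return f"<'{self.text}',{self.type.name}>"


# Two different words fall into the same category but carry different text.
tokens = [
    Token(TokenType.NAME, "hello"),
    Token(TokenType.COMMA, ","),
    Token(TokenType.NAME, "world"),
    Token(TokenType.PERIOD, "."),
]
print(" ".join(str(t) for t in tokens))
```

A parser consuming this stream would typically branch on `type` and consult `text` only when it needs the actual characters.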
