Wildcard Operator and Nongreedy Subrules

EBNF subrules like (...)?, (...)*, and (...)+ are greedy—they consume as much input as possible, but sometimes that’s not what’s needed. Constructs like .* consume until the end of the input in the lexer and sometimes in the parser. We want that loop to be nongreedy, so we need to use different syntax: .*? borrowed from regular expression notation. We can make any subrule that has a ?, *, or + suffix nongreedy by adding another ? suffix. Such nongreedy subrules are allowed in both the parser and the lexer, but they are used much more frequently in the lexer.

Nongreedy Lexer Subrules

Here’s the very common C-style comment lexer rule that consumes any characters until it sees the trailing ’*/’ ...

Get The Definitive ANTLR 4 Reference, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.