Recognizing Languages Whose Keywords Aren’t Fixed

To explore actions embedded in lexer rules, let’s build a grammar for a contrived programming language whose keywords can change dynamically (from run to run). This is not as unusual as it sounds. For example, Java added the keyword enum in version 5, so a single compiler must be able to enable or disable that keyword depending on the -source option.

Perhaps a more common use would be dealing with languages that have huge keyword sets. Rather than making the lexer match all of the keywords individually (as separate rules), we can make a catchall ID rule and then look up the identifier in a keywords table. If the lexer finds a keyword, we can specifically set the token type from a generic
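The keyword-table idea can be sketched in plain Java, independent of any generated lexer. Everything named here (the KeywordTable class, the tokenTypeFor method, the token type constants, and the enumIsKeyword flag) is a hypothetical illustration rather than code from the book; a real ANTLR lexer would supply its own generated token type constants.

```java
import java.util.HashMap;
import java.util.Map;

public class KeywordTable {
    // Hypothetical token types; a generated lexer would define its own.
    public static final int ID = 1;
    public static final int IF = 2;
    public static final int WHILE = 3;
    public static final int ENUM = 4;

    // Maps keyword text to its token type.
    private final Map<String, Integer> keywords = new HashMap<>();

    // The flag stands in for a run-time setting such as a language level.
    public KeywordTable(boolean enumIsKeyword) {
        keywords.put("if", IF);
        keywords.put("while", WHILE);
        if (enumIsKeyword) {
            keywords.put("enum", ENUM);
        }
    }

    // Returns the keyword's token type, or the generic ID type otherwise.
    public int tokenTypeFor(String text) {
        return keywords.getOrDefault(text, ID);
    }

    public static void main(String[] args) {
        KeywordTable modern = new KeywordTable(true);
        KeywordTable legacy = new KeywordTable(false);
        System.out.println(modern.tokenTypeFor("enum"));  // prints 4 (ENUM)
        System.out.println(legacy.tokenTypeFor("enum"));  // prints 1 (ID)
        System.out.println(legacy.tokenTypeFor("while")); // prints 3 (WHILE)
    }
}
```

Inside an ANTLR 4 lexer, an action attached to the catchall identifier rule could presumably run a lookup like this on getText() and pass the result to setType(), so the set of keywords can change from run to run without touching the grammar rules.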
