9.6. Tokenizer Lookup Tables

Both the Tokenizer class in sjm.parse.tokens and StreamTokenizer in java.io use lookup tables to decide how to build a token. The classes are similar in that the first character of a token determines the tokenizer's state. The classes differ in that Tokenizer transfers control to a TokenizerState object, whereas the state of StreamTokenizer is internal to the StreamTokenizer class. Figure 9.5 shows the table that a default Tokenizer object uses to determine which state to use to build a token.

Figure 9.5. This table depicts the default lookup table used by the class Tokenizer in sjm.parse.tokens to determine which TokenizerState can produce a Token. The Unicode value of each character is the sum of its row number ...
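The idea of a first-character lookup table can be illustrated with a minimal, self-contained sketch. This is not the code of Tokenizer or StreamTokenizer. The class LookupTableSketch, the State enum, and the methods setRange and stateFor are hypothetical names invented for this illustration, and the character ranges are assumptions rather than a copy of Figure 9.5. The sketch shows only the mechanism: an array indexed by character value, filled with a catch-all default and then overwritten for specific ranges.

```java
import java.util.Arrays;

public class LookupTableSketch {

    /** Hypothetical stand-ins for the states a tokenizer might dispatch to. */
    enum State { WHITESPACE, WORD, NUMBER, QUOTE, SYMBOL }

    // One entry per character value 0..255; the index is the character itself.
    private final State[] table = new State[256];

    LookupTableSketch() {
        // Assumed defaults for illustration: start with a catch-all,
        // then overwrite narrower ranges. Later calls win.
        setRange(0, 255, State.SYMBOL);
        setRange(0, ' ', State.WHITESPACE);
        setRange('a', 'z', State.WORD);
        setRange('A', 'Z', State.WORD);
        setRange('0', '9', State.NUMBER);
        setRange('"', '"', State.QUOTE);
    }

    /** Assigns a state to every character in the inclusive range from..to. */
    void setRange(int from, int to, State state) {
        Arrays.fill(table, from, to + 1, state);
    }

    /** Looks up the state for the first character of a token. */
    State stateFor(int c) {
        // Characters outside the table fall back to SYMBOL in this sketch.
        return (c >= 0 && c < table.length) ? table[c] : State.SYMBOL;
    }

    public static void main(String[] args) {
        LookupTableSketch t = new LookupTableSketch();
        for (char c : new char[] {'x', '7', ' ', '"', '+'}) {
            System.out.println("'" + c + "' -> " + t.stateFor(c));
        }
    }
}
```

Because the table is an ordinary array, changing how a character is treated amounts to one more setRange call after construction, for example t.setRange('_', '_', State.WORD) to let an underscore begin a word.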
