Building Parsers with Java™ by Steven John Metsker

9.5. A Tokenizer Class

The Tokenizer class in sjm.parse.tokens uses a set of states to recognize different types of tokens. Each state is a subclass of TokenizerState, a class in the same package. A Tokenizer object reads a character from the input string and uses that character to decide which state should find the next token. The design of Tokenizer in sjm.parse.tokens is as follows:

1. Read a character and use it to look up which TokenizerState object to use.
2. Send the TokenizerState object the initial character, and ask the TokenizerState to return a Token. The TokenizerState reads as many characters as it needs to produce a Token.
3. Repeat until there are no more characters.
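
The three steps above can be sketched in a few lines of Java. This is a minimal illustration of the dispatch-on-first-character idea, not the code in sjm.parse.tokens: the names TokenizerSketch, State, readRun, and stateFor are hypothetical, tokens are plain strings rather than Token objects, and the lookup is a chain of character tests, which is an assumption about how a lookup might be done.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class TokenizerSketch {

    /** Hypothetical stand-in for a tokenizer state: builds one token from the input. */
    interface State {
        /** Returns the token text, or null when the characters should be skipped. */
        String nextToken(int first, PushbackReader in) throws IOException;
    }

    /** Consumes a run of characters accepted by the predicate, starting with 'first'. */
    static String readRun(int first, PushbackReader in, IntPredicate accept)
            throws IOException {
        StringBuilder sb = new StringBuilder().append((char) first);
        int c;
        while ((c = in.read()) != -1 && accept.test(c)) {
            sb.append((char) c);
        }
        if (c != -1) {
            in.unread(c); // leave the first non-matching character for the next state
        }
        return sb.toString();
    }

    // Each state reads as many characters as it needs to produce one token.
    static final State WORD = (first, in) -> readRun(first, in, Character::isLetterOrDigit);
    static final State NUMBER = (first, in) -> readRun(first, in, Character::isDigit);
    static final State SYMBOL = (first, in) -> String.valueOf((char) first);
    static final State WHITESPACE = (first, in) -> {
        readRun(first, in, Character::isWhitespace);
        return null; // whitespace is consumed but yields no token
    };

    /** Step 1: use the character just read to choose a state. */
    static State stateFor(int c) {
        if (Character.isWhitespace(c)) return WHITESPACE;
        if (Character.isDigit(c)) return NUMBER;
        if (Character.isLetter(c)) return WORD;
        return SYMBOL;
    }

    static List<String> tokenize(String input) {
        PushbackReader in = new PushbackReader(new StringReader(input));
        List<String> tokens = new ArrayList<>();
        try {
            int c;
            while ((c = in.read()) != -1) {            // step 3: repeat until no characters remain
                State state = stateFor(c);             // step 1: look up the state
                String token = state.nextToken(c, in); // step 2: hand over the initial character
                if (token != null) {
                    tokens.add(token);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("total = 42 + x1")); // prints [total, =, 42, +, x1]
    }
}
```

The tokenizer loop itself knows nothing about words or numbers. It only chooses a state and delegates, so supporting a new kind of token means adding a state and a lookup rule rather than changing the loop.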

Figure 9.3 shows a state diagram of the classes in sjm.parse.tokens ...
