Building Parsers with Java™ by Steven John Metsker

9.9. Customizing a Tokenizer

You can customize a tokenizer in three ways: by customizing one of the tokenizer's states, by changing which state the tokenizer enters given an initial character, or by adding an entirely new state.

9.9.1. Customizing a State

The preceding section shows how the CoffeeParser class creates a special tokenizer that allows spaces to appear in words. The tokenizer() method of this class retrieves a WordState object from a tokenizer t and updates it:

t.wordState().setWordChars(' ', ' ', true); 
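To see why a single call like this is enough, it can help to picture a word state as a lookup table of characters allowed inside a word. The following is only a minimal sketch under that assumption: SketchWordState and its words() method are hypothetical names invented for illustration, not the book's WordState class, and the sketch ignores the other tokenizer states entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a word state; NOT the book's WordState class.
final class SketchWordState {
    // wordChar[c] is true when character c may appear inside a word.
    private final boolean[] wordChar = new boolean[256];

    SketchWordState() {
        // Assumed defaults for this sketch: ASCII letters only.
        setWordChars('a', 'z', true);
        setWordChars('A', 'Z', true);
    }

    // Allow or disallow every character in the inclusive range [from, to].
    void setWordChars(int from, int to, boolean allowed) {
        for (int c = from; c <= to; c++) {
            if (c >= 0 && c < wordChar.length) {
                wordChar[c] = allowed;
            }
        }
    }

    // Split the input into maximal runs of word characters.
    List<String> words(String input) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (c < wordChar.length && wordChar[c]) {
                current.append(c);
            } else if (current.length() > 0) {
                result.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }
}
```

In this sketch, the input "French roast,Sumatra" splits into three words by default. After calling setWordChars(' ', ' ', true), the same input splits into two words, with "French roast" kept together.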

9.9.2. Changing Which State the Tokenizer Enters

The example in Section 9.7.1 makes the tokenizer enter a quote state, rather than its usual state, when it sees a “#”. It uses this line:

t.setCharacterState('#', '#', t.quoteState()); 
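One way to picture this kind of call is as an update to a dispatch table that maps each initial character to the state that should consume it. The sketch below assumes such a table. SketchTokenizer, its State enum, its default assignments, and stateFor() are all hypothetical and chosen only for illustration; the book's Tokenizer class works with state objects rather than enum constants.

```java
// Hypothetical dispatch table; NOT the book's Tokenizer class.
final class SketchTokenizer {
    enum State { WORD, QUOTE, SYMBOL, WHITESPACE }

    // characterState[c] is the state entered when a token starts with c.
    private final State[] characterState = new State[256];

    SketchTokenizer() {
        // Assumed defaults for this sketch only.
        setCharacterState(0, 255, State.SYMBOL);
        setCharacterState(0, ' ', State.WHITESPACE);
        setCharacterState('a', 'z', State.WORD);
        setCharacterState('A', 'Z', State.WORD);
        setCharacterState('"', '"', State.QUOTE);
    }

    // Route every initial character in the inclusive range [from, to] to the given state.
    void setCharacterState(int from, int to, State state) {
        for (int c = from; c <= to; c++) {
            if (c >= 0 && c < characterState.length) {
                characterState[c] = state;
            }
        }
    }

    // Look up which state handles a token that begins with c.
    State stateFor(char c) {
        return c < characterState.length ? characterState[c] : State.SYMBOL;
    }
}
```

With these assumed defaults, stateFor('#') starts out as SYMBOL. After setCharacterState('#', '#', SketchTokenizer.State.QUOTE), the same lookup yields QUOTE, while the entries for other characters are unchanged.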

9.9.3. Adding ...
