Simple Java tokenizers
Several core Java classes support simple tokenization, including the following:
Scanner
String
BreakIterator
StreamTokenizer
StringTokenizer
Although these classes provide only limited support, it is useful to understand how they can be used. For some tasks, they will suffice. Why use an approach that is harder to understand and less efficient when a core Java class can do the job? We will cover how each of these classes supports the tokenization process.
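As a rough illustration of how little code a core class needs for basic tokenization, the following sketch uses Scanner with its default whitespace delimiter. The class name ScannerTokens, the method name tokenize, and the sample sentence are made up for this example and are not taken from the book.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerTokens {
    // Hypothetical helper: collects the tokens Scanner produces when it
    // splits the text on its default delimiter (whitespace).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Sample sentence chosen only for illustration.
        System.out.println(tokenize("Let's pause, and then reflect."));
    }
}
```

Note that splitting on whitespace leaves punctuation attached to the neighboring word (for example, "pause," keeps its comma), which is one of the limits of this simple approach.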
The StreamTokenizer and StringTokenizer classes should not be used for new development. Instead, the String class' split method is usually a better choice. They have been included here in case you run across them and wonder whether they should be used or not. ...
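To make the comparison concrete, here is a minimal sketch that tokenizes the same text with the split method and with the legacy StringTokenizer class. The class and method names (SplitVersusStringTokenizer, withSplit, withStringTokenizer) and the sample sentence are assumptions made for this example. The whitespace regular expression is one reasonable choice of delimiter, not the only one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class SplitVersusStringTokenizer {
    // Usually preferred: split on runs of whitespace using a regular expression.
    // trim() avoids an empty leading token when the text starts with whitespace.
    static List<String> withSplit(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Legacy style: StringTokenizer with its default delimiters
    // (space, tab, newline, carriage return, form feed).
    static List<String> withStringTokenizer(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Sample sentence chosen only for illustration.
        String text = "Mr. Smith went to 123 Washington avenue.";
        System.out.println(withSplit(text));
        System.out.println(withStringTokenizer(text));
    }
}
```

For plain whitespace-separated input like this, both methods return the same tokens. The split version is shorter and accepts any regular expression as the delimiter.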