Taking Strings Apart with StringTokenizer

Problem

You need to take a string apart into words or tokens.

Solution

Construct a StringTokenizer around your string and call its methods hasMoreTokens( ) and nextToken( ). These follow the Iterator design pattern (see Section 7.5). In addition, StringTokenizer implements the Enumeration interface (also in Section 7.5), but its nextElement( ) method returns Object, so if you use the Enumeration methods you will need to cast the results to String:

// StrTokDemo.java
StringTokenizer st = new StringTokenizer("Hello World of Java");

while (st.hasMoreTokens())
    System.out.println("Token: " + st.nextToken());
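The cast mentioned above only matters when you go through the Enumeration methods. The following is a minimal sketch of that route; the class name EnumCastDemo and the helper tokensViaEnumeration are made up for illustration and are not part of the recipe's source files.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not one of the book's example files.
public class EnumCastDemo {

    // Walk the tokenizer through its Enumeration methods and collect the tokens.
    static List<String> tokensViaEnumeration(String input) {
        // StringTokenizer implements Enumeration<Object>, so nextElement() returns Object.
        Enumeration<Object> e = new StringTokenizer(input);
        List<String> out = new ArrayList<>();
        while (e.hasMoreElements()) {
            String token = (String) e.nextElement(); // the cast is required here
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String token : tokensViaEnumeration("Hello World of Java")) {
            System.out.println("Token: " + token);
        }
    }
}
```

With nextToken( ) the return type is already String, so no cast is needed, which is one reason to prefer it when you want the token as a String.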

The StringTokenizer normally breaks the String into tokens at what we would think of as “word boundaries” in European languages, that is, at whitespace (space, tab, newline, carriage return, and form feed). Sometimes you want to break at some other character. No problem. When you construct your StringTokenizer, in addition to passing in the string to be tokenized, pass in a second string that lists the “break characters.” For example:

// StrTokDemo2.java
StringTokenizer st = new StringTokenizer("Hello, World|of|Java", ", |");

while (st.hasMoreElements())
    System.out.println("Token: " + st.nextElement());
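To see what the second argument changes, it can help to tokenize the same string both ways. This is only a sketch: the class DelimCompare and its tokens helper are hypothetical names, and passing null to mean "use the default delimiters" is a convention of this sketch, not of StringTokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not one of the book's example files.
public class DelimCompare {

    // Collect all tokens. In this sketch, a null delims argument selects
    // the one-argument constructor, which uses the default whitespace delimiters.
    static List<String> tokens(String input, String delims) {
        StringTokenizer st = (delims == null)
            ? new StringTokenizer(input)
            : new StringTokenizer(input, delims);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "Hello, World|of|Java";
        // Default delimiters: only the space splits, so the comma and bars stay in the tokens.
        System.out.println(tokens(s, null));   // prints [Hello,, World|of|Java]
        // Custom delimiters: comma, space, and vertical bar all split.
        System.out.println(tokens(s, ", |"));  // prints [Hello, World, of, Java]
    }
}
```

Note that each character in the delimiter string is a separate break character; the string ", |" is not matched as a three-character sequence.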

But wait, there’s more! What if you are reading lines like:

FirstName|Lastname|Company|PhoneNumber

and your dear old Aunt Begonia hasn’t been employed for the last 38 years? Her “Company” field will in all probability be blank.[12] If you look very closely at the previous code example, you’ll see that it has two delimiters together (the comma and the space), but ...
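The effect of a blank field can be checked with a small experiment. The sketch below uses a made-up record for Aunt Begonia (the surname and phone number are invented) and a hypothetical class EmptyFieldDemo. It relies on two documented behaviors: StringTokenizer never returns an empty token between adjacent delimiters, and the three-argument constructor with returnDelims set to true hands back each delimiter as a token of its own. Whether the recipe goes on to use that constructor is not shown here; it is simply one way to make the missing field visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not one of the book's example files.
public class EmptyFieldDemo {

    // Tokenize on '|' and collect the results; returnDelims controls
    // whether each '|' is itself returned as a one-character token.
    static List<String> tokens(String line, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(line, "|", returnDelims);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Invented sample record: FirstName|Lastname|Company|PhoneNumber with a blank Company.
        String line = "Begonia|Smith||555-1234";

        // The two adjacent bars are skipped together, so only three tokens come back.
        System.out.println(tokens(line, false)); // prints [Begonia, Smith, 555-1234]

        // With returnDelims, two "|" tokens in a row reveal where the blank field was.
        System.out.println(tokens(line, true));  // prints [Begonia, |, Smith, |, |, 555-1234]
    }
}
```

In the first call the phone number silently lands in the third position, which is exactly the kind of field shift that makes blank fields a problem for position-based records.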
