This class determines character, word, sentence, and line breaks in a block of text in a way that works across locales and text encodings. As an abstract class, BreakIterator cannot be instantiated directly. Instead, you must use one of the static factory methods getCharacterInstance( ), getWordInstance( ), getSentenceInstance( ), or getLineInstance( ) to return an instance of a nonabstract subclass of BreakIterator. Each factory method returns a BreakIterator object that is configured to locate the requested boundary type and is localized for the optionally specified locale, or for the default locale if none is specified.
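As a minimal sketch of the factory methods described above, the following class obtains one iterator of each kind. The class name BreakIteratorFactories and the helper method forKind( ) are hypothetical names introduced only for this illustration; the four factory methods and the Locale argument are part of java.text.BreakIterator.

```java
import java.text.BreakIterator;
import java.util.Locale;

public class BreakIteratorFactories {
    // Hypothetical helper: maps a short name to the matching factory method.
    static BreakIterator forKind(String kind, Locale locale) {
        switch (kind) {
            case "character": return BreakIterator.getCharacterInstance(locale);
            case "word":      return BreakIterator.getWordInstance(locale);
            case "sentence":  return BreakIterator.getSentenceInstance(locale);
            case "line":      return BreakIterator.getLineInstance(locale);
            default: throw new IllegalArgumentException("Unknown kind: " + kind);
        }
    }

    public static void main(String[] args) {
        // The no-argument factory methods use the default locale.
        BreakIterator defaultWords = BreakIterator.getWordInstance();

        // Passing a Locale requests an iterator localized for that locale.
        BreakIterator frenchSentences = forKind("sentence", Locale.FRANCE);

        System.out.println(defaultWords != null && frenchSentences != null);
    }
}
```

Because BreakIterator is abstract, the objects returned here are instances of a concrete subclass chosen by the platform; code should depend only on the BreakIterator API.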

Once you have obtained an appropriate BreakIterator object, use setText( ) to specify the text in which to locate boundaries. To locate boundaries in a Java String object, simply specify the string. To locate boundaries in text that uses some other encoding, you must specify a CharacterIterator object for that text so that the BreakIterator object can locate the individual characters of the text. Having set the text to be searched, you can determine the positions of character, word, sentence, or line boundaries with the first( ), last( ), next( ), previous( ), current( ), and following( ) methods. first( ) and last( ) return the first and last boundaries in the text, next( ) and previous( ) move to the boundary after or before the current one, current( ) returns the current boundary, and following( ) returns the first boundary after a specified offset. When there are no more boundaries in the requested direction, next( ), previous( ), and following( ) return the constant BreakIterator.DONE. Note that these methods do not return the text itself, but merely the position of the appropriate character, word, sentence, or line boundary.
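The following sketch shows one way to turn the boundary positions described above into actual words. The class name WordBoundaries and its two helper methods are hypothetical names used only for illustration, and the rule of keeping only pieces that begin with a letter or digit is an assumption of this example, not something BreakIterator does for you.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordBoundaries {
    // Returns the words in the text, skipping the punctuation and whitespace
    // that the word iterator also reports as separate pieces.
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);

        List<String> result = new ArrayList<>();
        int start = it.first();
        // next() returns BreakIterator.DONE once the end of the text is passed.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (Character.isLetterOrDigit(piece.charAt(0))) {
                result.add(piece);
            }
        }
        return result;
    }

    // Returns the position of the first word boundary after the given offset.
    static int boundaryAfter(String text, int offset, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        return it.following(offset);
    }

    public static void main(String[] args) {
        String text = "Hello, world! How are you?";
        System.out.println(words(text, Locale.US));
        System.out.println(boundaryAfter(text, 0, Locale.US));
    }
}
```

The iterator only reports positions, so the substring( ) call on the original String is what recovers the text between two adjacent boundaries.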


Figure 15-2. java.text.BreakIterator ...
