Printing All Occurrences of a Pattern
Problem
You need to find all the strings that match a given RE in one or more files or other sources.
Solution
This example reads through a file using a ReaderCharacterIterator, one of four CharacterIterator classes in the Jakarta RegExp package. Whenever a match is found, I extract it from the CharacterIterator and print it.
The other character iterators are StreamCharacterIterator (as we’ll see in Chapter 9, streams are 8-bit bytes, while readers handle conversion among various representations of Unicode characters), CharacterArrayCharacterIterator, and StringCharacterIterator. All of these character iterators are interchangeable; apart from the construction process, this program would work on any of them. Use a StringCharacterIterator, for example, to find all occurrences of a pattern in the (possibly long) string you get from a JTextArea’s getText( ) method, described in Chapter 13.
This code takes the getParen( ) methods from Section 4.6, the substring method from the CharacterIterator interface, and the match( ) method from the RE class, and simply puts them all together. I coded it to extract all the “names” from a given file; when the program is run on its own source file, it prints the words “import”, “org”, “apache”, “regexp”, and so on.
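For readers who do not have Jakarta RegExp at hand, here is a minimal sketch of the same find-every-match loop using the standard java.util.regex package instead. It is a swapped-in technique, not the ReaderIter program itself: the class name FindAllWords, the method findAll, and the pattern [A-Za-z]+ (my stand-in for “names”) are all assumptions made for illustration. It also reads one line at a time, so unlike a CharacterIterator over the whole input, it cannot find a match that spans a line break.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the recipe's program, built on java.util.regex
// rather than Jakarta RegExp.
class FindAllWords {

    /** Collect every substring of the reader's text that matches regex. */
    static List<String> findAll(Reader in, String regex) throws IOException {
        Pattern pattern = Pattern.compile(regex);
        List<String> matches = new ArrayList<>();
        BufferedReader br = new BufferedReader(in);
        String line;
        while ((line = br.readLine()) != null) {
            Matcher m = pattern.matcher(line);
            // find() resumes just past the previous match, so this inner
            // loop visits each occurrence on the line exactly once.
            while (m.find()) {
                matches.add(m.group());
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample input; "[A-Za-z]+" is an illustrative "name" pattern.
        String sample = "import org.apache.regexp.*;\nimport java.io.*;";
        for (String word : findAll(new StringReader(sample), "[A-Za-z]+")) {
            System.out.println(word);
        }
    }
}
```

Because findAll accepts any Reader, the same loop works on a FileReader, a StringReader, or an InputStreamReader wrapped around a byte stream, which roughly mirrors the way the character iterators can be swapped for one another.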
> jikes +E -d . ReaderIter.java
> java ReaderIter ReaderIter.java
import
org
apache
regexp
import
java
io
import
com
darwinsys
util
Debug
Demonstrate
the
Character
Iterator
interface
print
I interrupted it here ...