Scanning a File

Problem

You need to scan a file in finer-grained units than the whole lines returned by the readLine( ) method of the BufferedReader class and its subclasses (discussed in Section 9.12).

Solution

Use a StreamTokenizer, readLine( ) together with a StringTokenizer, regular expressions (Chapter 4), or one of several scanning tools such as JavaCC.
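
As a rough illustration of the first option, here is a minimal sketch that runs a StreamTokenizer with its default settings over a Reader and records what it finds. The class name StreamTokDemo, the scan( ) helper, and the "WORD:"/"NUMBER:"/"CHAR:" labels are made up for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the name and the output labels are illustrative only.
class StreamTokDemo {

    /** Returns a short description of each token found in the input. */
    static List<String> scan(Reader in) throws IOException {
        StreamTokenizer tok = new StreamTokenizer(in);
        List<String> tokens = new ArrayList<>();
        // nextToken() returns TT_EOF once the input is exhausted.
        while (tok.nextToken() != StreamTokenizer.TT_EOF) {
            switch (tok.ttype) {
                case StreamTokenizer.TT_WORD:
                    tokens.add("WORD:" + tok.sval);    // word text is in sval
                    break;
                case StreamTokenizer.TT_NUMBER:
                    tokens.add("NUMBER:" + tok.nval);  // numbers are parsed as double
                    break;
                default:
                    // Anything else (for example an ordinary character such as '=')
                    // is reported through its character code in ttype. Quoted
                    // strings also land here; this sketch does not treat them specially.
                    tokens.add("CHAR:" + (char) tok.ttype);
                    break;
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // Prints WORD:width, CHAR:= and NUMBER:42.0 on separate lines.
        for (String t : scan(new StringReader("width = 42"))) {
            System.out.println(t);
        }
    }
}
```

To scan a real file, pass scan( ) a FileReader (ideally wrapped in a BufferedReader) instead of the StringReader used here.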

Discussion

While you could, in theory, read the file a character at a time and analyze each character, that is a pretty low-level approach. The read( ) method in the Reader class is defined to return int, so that it can use the time-honored value -1 (defined as EOF in Unix <stdio.h> for years) to indicate that you have read to the end of the file.

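// Requires java.io.Reader and java.io.IOException.
void doFile(Reader is) throws IOException {
    int c;
    // read( ) returns -1 once the end of the file is reached
    while ((c = is.read( )) != -1) {
        System.out.print((char)c);
    }
}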

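The cast to char is interesting. The program will compile fine without it, but it will not print correctly: with an int argument the print(int) overload is chosen, so each character is printed as its numeric code rather than as the character itself.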
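
To see what the cast changes, the following sketch collects the characters into a StringBuilder instead of printing them, so the two results can be compared side by side. StringBuilder.append( ) is overloaded for char and int much as print( ) is. The class and method names (CastDemo, withCast, withoutCast) are invented for this illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class; names are illustrative only.
class CastDemo {

    /** Copies the reader's contents, casting each value to char. */
    static String withCast(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            sb.append((char) c);   // append(char): appends the character itself
        }
        return sb.toString();
    }

    /** Same loop without the cast. */
    static String withoutCast(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            sb.append(c);          // append(int): appends the decimal value
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(withCast(new StringReader("Hi")));     // prints Hi
        System.out.println(withoutCast(new StringReader("Hi")));  // prints 72105
    }
}
```

The second result is the character codes 72 ('H') and 105 ('i') run together, which is the kind of output the uncast version of doFile( ) produces.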

We discussed the StringTokenizer class extensively in Section 3.3. The combination of readLine( ) and StringTokenizer provides a simple means of scanning a file. Suppose you need to read a file in which each line consists of a name like “user@host.domain”, and you want to split the lines into the user part and the host address part. You could use this:

// ScanStringTok.java
protected void process(LineNumberReader is) {
    String s = null;
    try {
        while ((s = is.readLine( )) != null) {
            StringTokenizer st = new StringTokenizer(s, "@", true);
            String user = (String)st.nextElement( );
            st.nextElement( ...
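
The listing above is cut off in this excerpt, so the following is a separate, self-contained sketch of the same readLine( ) plus StringTokenizer idea rather than a reconstruction of ScanStringTok.java. It differs from the original in at least one respect: it leaves returnDelims at its default of false, so the "@" is never returned as a token and does not have to be skipped. The class name SplitAddressDemo, the split( ) helper, and the sample addresses are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.StringTokenizer;

// Hypothetical demo class; not the book's ScanStringTok.
class SplitAddressDemo {

    /**
     * Splits "user@host.domain" into {user, host}.
     * Returns null if the line does not contain exactly two "@"-separated parts.
     */
    static String[] split(String line) {
        // Two-argument constructor: delimiters are not returned as tokens.
        StringTokenizer st = new StringTokenizer(line, "@");
        if (st.countTokens() != 2) {
            return null;
        }
        return new String[] { st.nextToken(), st.nextToken() };
    }

    /** Reads the input a line at a time and reports the parts of each line. */
    static void process(BufferedReader in) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            String[] parts = split(line);
            if (parts == null) {
                System.err.println("Malformed line: " + line);
            } else {
                System.out.println("User: " + parts[0] + "; host: " + parts[1]);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        String sample = "ian@example.com\nnot-an-address\n";
        process(new BufferedReader(new StringReader(sample)));
    }
}
```

Checking countTokens( ) before calling nextToken( ) keeps a malformed line from raising NoSuchElementException partway through the file.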
