Java I/O
Java I/O, Second Edition By Elliotte Rusty Harold
May 2006
Pages: 726


Chapter 1: Introducing I/O
Input and output, I/O for short, are fundamental to any computer operating system or programming language. Only theorists find it interesting to write programs that don't require input or produce output. At the same time, I/O hardly qualifies as one of the more "thrilling" topics in computer science. It's something in the background, something you use every day—but for most developers, it's not a topic with much sex appeal.
But in fact, there are plenty of reasons Java programmers should find I/O interesting. Java includes a particularly rich set of I/O classes in the core API, mostly in the java.io and java.nio packages. These packages support several different styles of I/O. One distinction is between byte-oriented I/O, which is handled by input and output streams, and character-oriented I/O, which is handled by readers and writers. Another distinction is between the old-style stream-based I/O and the new-style channel- and buffer-based I/O. These all have their place and are appropriate for different needs and use cases. None of them should be ignored.
Java's I/O libraries are designed in an abstract way that enables you to read from external data sources and write to external targets, regardless of the kind of thing you're writing to or reading from. You use the same methods to read from a file that you do to read from the console or from a network connection. You use the same methods to write to a file that you do to write to a byte array or a serial port device.
Reading and writing without caring where your data is coming from or where it's going is a very powerful abstraction. Among other things, this enables you to define I/O streams that automatically compress, encrypt, and filter from one data format to another. Once you have these tools, programs can send encrypted data or write zip files with almost no knowledge of what they're doing. Cryptography or compression can be isolated in a few lines of code that say, "Oh yes, make this a compressed, encrypted output stream."
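As a rough illustration of the "make this a compressed output stream" idea, the following sketch wraps an in-memory stream in java.util.zip.GZIPOutputStream. The class name CompressedRoundTrip and its helper methods are hypothetical names chosen for this example, and wrapping IOException in UncheckedIOException is a simplification that is only reasonable because everything stays in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical example class; not part of the Java API.
public class CompressedRoundTrip {

  // Writes plain bytes through a compressing filter stream.
  public static byte[] compress(byte[] plain) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    // The code that writes only sees an OutputStream; compression is one line.
    try (OutputStream out = new GZIPOutputStream(sink)) {
      out.write(plain);
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return sink.toByteArray();
  }

  // Reads the compressed bytes back through a decompressing filter stream.
  public static byte[] decompress(byte[] compressed) {
    ByteArrayOutputStream result = new ByteArrayOutputStream();
    try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buffer = new byte[1024];
      int n;
      while ((n = in.read(buffer)) != -1) {
        result.write(buffer, 0, n);
      }
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return result.toByteArray();
  }

  public static void main(String[] args) {
    byte[] plain = "How are streams treating you?".getBytes(StandardCharsets.US_ASCII);
    byte[] restored = decompress(compress(plain));
    System.out.println(new String(restored, StandardCharsets.US_ASCII));
  }
}
```

The writing code never mentions compression after the stream is constructed, which is the point of the abstraction.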
In this book, I'll take a thorough look at all parts of Java's I/O facilities. This includes all the different kinds of streams you can use and the channels and buffers that offer high-performance, high-throughput, nonblocking operations on servers. We're also going to investigate Java's support for Unicode. We'll look at Java's powerful facilities for formatting I/O. Finally, we'll look at the various APIs Java provides for low-level I/O through various devices including serial ports, parallel ports, USB, Bluetooth, and other hardware you'll find in devices that don't necessarily look like a traditional desktop computer or server.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
What Is a Stream?
A stream is an ordered sequence of bytes of indeterminate length. Input streams move bytes of data into a Java program from some generally external source. Output streams move bytes of data from Java to some generally external target. (In special cases, streams can also move bytes from one part of a Java program to another.)
The word stream is derived from an analogy between a sequence and a stream of water. An input stream is like a siphon that sucks up water; an output stream is like a hose that sprays out water. Siphons can be connected to hoses to move water from one place to another. Sometimes a siphon may run out of water if it's drawing from a finite source like a bucket. On the other hand, if the siphon is drawing water from a river, it may well operate indefinitely. So, too, an input stream may read from a finite source of bytes such as a file or an unlimited source of bytes such as System.in . Similarly, an output stream may have a definite number of bytes to output or an indefinite number of bytes.
Input to a Java program can come from many sources. Output can go to many different kinds of destinations. The power of the stream metaphor is that the differences between these sources and destinations are abstracted away. All input and output operations are simply treated as streams using the same classes and the same methods. You don't need to learn a new API for every different kind of device. The same API that reads files can read network sockets, serial ports, Bluetooth transmissions, and more.
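A small sketch of that source independence follows. The countBytes( ) helper is a hypothetical method written for this illustration; it accepts any InputStream and never asks where the bytes come from. The example feeds it an in-memory stream, but the same call should work unchanged with a file or socket stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical example class; not part of the Java API.
public class SourceAgnosticCount {

  // Counts bytes until end of stream. Works for any kind of InputStream.
  public static long countBytes(InputStream in) {
    try {
      long count = 0;
      while (in.read() != -1) {
        count++;
      }
      return count;
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  public static void main(String[] args) {
    InputStream memory = new ByteArrayInputStream(new byte[] {10, 20, 30});
    System.out.println(countBytes(memory));
    // The identical call would accept System.in or a file stream instead.
  }
}
```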
The first source of input most programmers encounter is System.in. This is the same thing as stdin in C—generally some sort of console window, probably the one in which the Java program was launched. If input is redirected so the program reads from a file, then System.in is changed as well. For instance, on Unix, the following command redirects stdin so that when the MessageServer program reads from System.in
Numeric Data
Input streams read bytes and output streams write bytes. Readers read characters and writers write characters. Therefore, to understand input and output, you first need a solid understanding of how Java deals with bytes, integers, characters, and other primitive data types, and when and why one is converted into another. In many cases Java's behavior is not obvious.
The fundamental integer data type in Java is the int, a 4-byte, big-endian, two's complement integer. An int can take on all values between -2,147,483,648 and 2,147,483,647. When you type a literal integer such as 7, -8345, or 3000000000 in Java source code, the compiler treats that literal as an int. In the case of 3000000000 or similar numbers too large to fit in an int, the compiler emits an error message citing "Numeric overflow."
longs are 8-byte, big-endian, two's complement integers that range all the way from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. long literals are indicated by suffixing the number with a lower- or uppercase L. An uppercase L is preferred because the lowercase l is too easily confused with the numeral 1 in most fonts. For example, 7L, -8345L, and 3000000000L are all 64-bit long literals.
Two more integer data types are available in Java, the short and the byte. shorts are 2-byte, big-endian, two's complement integers with ranges from -32,768 to 32,767. They're rarely used in Java and are included mainly for compatibility with C.
bytes, however, are very much used in Java. In particular, they're used in I/O. A byte is an 8-bit, two's complement integer that ranges from −128 to 127. Note that like all numeric data types in Java, a byte is signed. The maximum byte value is 127. 128, 129, and so on through 255 are not legal values for bytes.
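Because a byte is signed, values from 128 to 255 have to be handled with a little care. The sketch below shows what a cast does to such a value and how masking with 0xFF recovers the unsigned reading. The class and method names are hypothetical, chosen only for this illustration.

```java
// Hypothetical example class; not part of the Java API.
public class SignedBytes {

  // Interprets the 8 bits of b as an unsigned value between 0 and 255.
  public static int toUnsigned(byte b) {
    return b & 0xFF;
  }

  public static void main(String[] args) {
    byte b = (byte) 200;               // 200 does not fit; the cast keeps the low 8 bits
    System.out.println(b);             // prints -56, that is, 200 - 256
    System.out.println(toUnsigned(b)); // prints 200
  }
}
```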
Character Data
Numbers are only part of the data a typical Java program needs in order to read and write. Many programs also handle text, which is composed of characters. Since computers only really understand numbers, characters are encoded by assigning each character in a given script a number. For example, in the common ASCII encoding, the character A is mapped to the number 65; the character B is mapped to the number 66; the character C is mapped to the number 67; and so on. Different encodings may encode different scripts or may encode the same or similar scripts in different ways.
Java understands several dozen different character sets for a variety of languages, ranging from ASCII to Shift JIS (SJIS) to Unicode. Internally, Java uses the Unicode character set. Unicode is a superset of the 1-byte Latin-1 character set, which in turn is an 8-bit superset of the 7-bit ASCII character set.
ASCII, the American Standard Code for Information Interchange, is a 7-bit character set. Thus it defines 2^7, or 128, different characters whose numeric values range from 0 to 127. These characters are sufficient for handling most of American English. It's an often-used lowest common denominator format for different computers. If you were to read a byte value between 0 and 127 from a stream, then cast it to a char, the result would be the corresponding ASCII character.
ASCII characters 0–31 and character 127 are nonprinting control characters. Characters 32–47 are various punctuation and space characters. Characters 48–57 are the digits 0–9. Characters 58–64 are another group of punctuation characters. Characters 65–90 are the capital letters A–Z. Characters 91–96 are a few more punctuation marks. Characters 97–122 are the lowercase letters a–z. Finally, characters 123–126 are a few remaining punctuation symbols. The complete ASCII character set is shown in Table A-1 in the Appendix.
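The cast from a numeric value to the corresponding ASCII character can be sketched as follows. The helper names toAsciiChar( ) and isPrintable( ) are hypothetical; the ranges they check are the ones listed above.

```java
// Hypothetical example class; not part of the Java API.
public class AsciiCast {

  // Converts a value in the ASCII range 0..127 to the matching char.
  public static char toAsciiChar(int value) {
    if (value < 0 || value > 127) {
      throw new IllegalArgumentException("Not an ASCII value: " + value);
    }
    return (char) value;
  }

  // Characters 32 through 126 are printable; 0..31 and 127 are control characters.
  public static boolean isPrintable(int value) {
    return value >= 32 && value <= 126;
  }

  public static void main(String[] args) {
    System.out.println(toAsciiChar(65));  // prints A
    System.out.println(toAsciiChar(97));  // prints a
    System.out.println(toAsciiChar(48));  // prints 0
    System.out.println(isPrintable(127)); // prints false
  }
}
```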
ISO 8859-1, Latin-1, is an 8-bit character set that's a strict superset of ASCII. It defines 2
Readers and Writers
Streams are primarily intended for data that can be read as pure bytes—basically, byte data and numeric data encoded as binary numbers of one sort or another. Streams are specifically not intended for reading and writing text, including both ASCII text, such as "Hello World," and numbers formatted as text, such as "3.1415929". For these purposes, you should use readers and writers.
Input and output streams are fundamentally byte-based. Readers and writers are based on characters, which can have varying widths depending on the character set. For example, ASCII and Latin-1 use 1-byte characters. UTF-32 uses 4-byte characters. UTF-8 uses characters of varying width (between one and four bytes). Since characters are ultimately composed of bytes, readers take their input from streams. However, they convert those bytes into chars according to a specified encoding format before passing them along. Similarly, writers convert chars to bytes according to a specified encoding before writing them onto some underlying stream.
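To make the varying widths concrete, here is a minimal sketch that pushes the same text through an OutputStreamWriter in two different encodings and looks at how many bytes arrive on the underlying stream. The encode( ) helper is a hypothetical name for this example, and wrapping IOException is a shortcut that is acceptable only because the target is in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not part of the Java API.
public class EncodingWidths {

  // Writes text through a writer and returns the bytes the writer produced.
  public static byte[] encode(String text, Charset charset) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (Writer writer = new OutputStreamWriter(sink, charset)) {
      writer.write(text);
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return sink.toByteArray();
  }

  public static void main(String[] args) {
    String eAcute = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
    System.out.println(encode(eAcute, StandardCharsets.ISO_8859_1).length); // prints 1
    System.out.println(encode(eAcute, StandardCharsets.UTF_8).length);      // prints 2
  }
}
```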
The java.io.Reader and java.io.Writer classes are abstract superclasses for classes that read and write character-based data. The subclasses are notable for handling the conversion between different character sets. The core Java API includes nine reader and eight writer classes, all in the java.io package:
BufferedReader BufferedWriter CharArrayReader CharArrayWriter FileReader FileWriter FilterReader FilterWriter InputStreamReader LineNumberReader OutputStreamWriter
Buffers and Channels
Streams are reasonably fast as long as an application has to read from or write to only one at a time. In fact, the bottleneck is more likely to be the disk or network you're reading from or writing to than the Java program itself. The situation is a little dicier when a program needs to read from or write to many different streams simultaneously. This is a common situation in web servers, for example, where a single process may be communicating with hundreds or even thousands of different clients simultaneously.
At any given time, a stream may block. That is, it may simply stop accepting further requests temporarily while it waits for the actual hardware it's writing to or reading from to catch up. This can happen on disks, and it's a major issue on network connections. Clearly, you don't want to stop sending data to 999 clients just because one of them is experiencing network congestion. The traditional solution to this problem prior to Java 1.4 was to put each connection in a separate thread. Five hundred clients requires 500 threads. Each thread can run independently of the others so that one slow connection doesn't slow down everyone.
However, threads are not without overhead of their own. Creating and managing threads takes a lot of work, and few virtual machines can handle more than a thousand or so threads without serious performance degradation. Spawning several thousand threads can crash even the toughest virtual machine. Nonetheless, big servers need to be able to communicate with thousands of clients simultaneously.
The solution invented in Java 1.4 was nonblocking I/O. In nonblocking I/O, streams are relegated mostly to a supporting role while the real work is done by channels and buffers. Input buffers are filled with data from the channel and then drained of data by the application. Output buffers work in reverse: the application fills them with data that is subsequently drained out by the target. The design is such that the writer and reader don't always have to operate in lockstep with each other. Most importantly, the client application can queue reads and writes to each channel. It does not have to stop processing simply because the other end of the channel isn't quite ready. This enables one thread to service many different channels simultaneously, dramatically reducing the load on the virtual machine.
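The fill-and-drain cycle of a buffer can be sketched in a few lines. Note the assumptions: this example uses a blocking channel created from an in-memory stream, so it illustrates only how a ByteBuffer is filled by a channel and drained by the application, not nonblocking operation or servicing many channels from one thread. The drain( ) helper is a hypothetical name, and the cast to char assumes ASCII input.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not part of the Java API.
public class BufferFillDrain {

  // Repeatedly fills a small buffer from the channel, then drains it.
  public static String drain(ReadableByteChannel channel) {
    ByteBuffer buffer = ByteBuffer.allocate(8);
    StringBuilder result = new StringBuilder();
    try {
      while (channel.read(buffer) != -1) { // the channel fills the buffer
        buffer.flip();                     // switch from filling to draining
        while (buffer.hasRemaining()) {
          result.append((char) buffer.get()); // assumes ASCII bytes
        }
        buffer.clear();                    // ready to be filled again
      }
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return result.toString();
  }

  public static void main(String[] args) {
    byte[] data = "hello, channels".getBytes(StandardCharsets.US_ASCII);
    ReadableByteChannel channel = Channels.newChannel(new ByteArrayInputStream(data));
    System.out.println(drain(channel));
  }
}
```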
The Ubiquitous IOException
As far as computer operations go, input and output are unreliable. They are subject to problems completely outside the programmer's control. Disks can develop bad sectors while a file is being read. Construction workers drop their backhoes through the cables that connect your WAN. Users unexpectedly cancel their input. Telephone repair crews shut off your modem line while trying to repair someone else's. (This last one actually happened to me while writing this chapter. My modem kept dropping the connection and then not getting a dial tone; I had to hunt down the Verizon "repairman" in my building's basement and explain to him that he was working on the wrong line.)
Because of these potential problems and many more, almost every method that performs input or output is declared to throw an IOException. IOException is a checked exception, so you must either declare that your methods throw it or enclose the call that can throw it in a try/catch block. The only real exceptions to this rule are the PrintStream and PrintWriter classes. Because it would be inconvenient to wrap a try/catch block around each call to System.out.println( ), Sun decided to have PrintStream (and later PrintWriter) catch and eat any exceptions thrown inside a print( ) or println( ) method. If you do want to check for exceptions inside a print( ) or println( ) method, you can call checkError( ):
public boolean checkError( )
The checkError( ) method returns true if an exception has occurred on this print stream, false if one hasn't. It tells you only that an error occurred. It does not tell you what sort of error occurred. If you need to know more about the error, you'll have to use a different output stream or writer class.
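The following sketch shows checkError( ) in use. FailingOutputStream is a hypothetical stream invented for this example; every write to it throws an IOException, which the PrintStream wrapped around it swallows.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

// Hypothetical example class; not part of the Java API.
public class CheckErrorDemo {

  // A stream invented for this example: every write fails.
  private static class FailingOutputStream extends OutputStream {
    @Override
    public void write(int b) throws IOException {
      throw new IOException("simulated failure");
    }
  }

  // Prints to a broken stream and reports what checkError( ) says afterward.
  public static boolean printFailsSilently() {
    PrintStream out = new PrintStream(new FailingOutputStream());
    out.println("This text goes nowhere"); // no exception reaches the caller
    return out.checkError();               // flushes, then reports the error state
  }

  public static void main(String[] args) {
    System.out.println(printFailsSilently()); // prints true
  }
}
```

The return value says only that something went wrong; the original exception and its message are gone.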
IOException has many subclasses (15 in java.io alone), and methods often throw a more specific exception that subclasses IOException; for instance, EOFException on an unexpected end of file or UnsupportedEncodingException when you try to read text in an unknown character set. However, methods usually declare only that they throw an
The Console: System.out, System.in, and System.err
The console is the default destination for output written to System.out or System.err and the default source of input for System.in. On most platforms the console is the command-line environment from which the Java program was initially launched, perhaps an xterm or a DOS prompt as shown in Figure 1-1. The word console is something of a misnomer, since on Unix systems the console refers to a very specific command-line shell rather than to command-line shells overall.
Figure 1-1: A DOS console on Windows
Many common misconceptions about I/O occur because most programmers' first exposure to I/O is through the console. The console is convenient for quick hacks and toy examples commonly found in textbooks, and I will use it for that in this book, but it's really a very unusual source of input and destination for output, and good Java programs avoid it. It behaves almost, but not completely, unlike anything else you'd want to read from or write to. While consoles make convenient examples in programming texts like this one, they're a horrible user interface and really have little place in modern programs. Users are more comfortable with a well-designed GUI. Furthermore, the console is unreliable across platforms. Many smaller devices such as Palm Pilots and cell phones have no console. Web browsers running applets sometimes provide a console that can be used for output. However, this is hidden by default, normally cannot be used for input, and is not available in all browsers on all platforms.
System.out is the first instance of the OutputStream class most programmers encounter. In fact, it's often encountered before students know what a class or an output stream is. Specifically, System.out is the static out field of the java.lang.System class. It's an instance of
Security Checks on I/O
One of the original fears about downloading executable content like applets from the Internet was that a hostile applet could erase your hard disk or read your Quicken files. Nothing has happened to change that since Java was introduced. This is why Java applets run under the control of a security manager that checks each operation an applet performs to prevent potentially hostile acts.
The security manager is particularly careful about I/O operations. For the most part, the checks are related to these questions:
  • Can the program read a particular file?
  • Can the program write a particular file?
  • Can the program delete a particular file?
  • Can the program determine whether a particular file exists?
  • Can the program make a network connection to a particular host?
  • Can the program accept an incoming connection from a particular host?
The short answer to all these questions when the program is an applet is "No, it cannot." A slightly more elaborate answer would specify a few exceptions. Applets can make network connections to the host they came from; applets can read a few very specific files that contain information about the Java environment; and trusted applets may sometimes run without these restrictions. But for almost all practical purposes, the answer is almost always no.
Because of these security issues, you need to be careful when using code fragments and examples from this book in an applet. Everything shown here works when run in an application, but when run in an applet, it may fail with a SecurityException. It's not always obvious whether a particular method or class will cause problems. The write( )
Chapter 2: Output Streams
The java.io.OutputStream class declares the three basic methods you need to write bytes of data onto a stream. It also has methods for closing and flushing streams:
public abstract void write(int b) throws IOException
public void write(byte[] data) throws IOException
public void write(byte[] data, int offset, int length) throws IOException
public void flush( ) throws IOException
public void close( ) throws IOException
OutputStream is an abstract class. Subclasses provide implementations of the abstract write(int b) method. They may also override the four nonabstract methods. For example, the FileOutputStream class overrides all five methods with methods that call native code to write files. Although OutputStream is abstract, often you only need to know that the object you have is an OutputStream; the more specific subclass of OutputStream is hidden from you. For example, the getOutputStream( ) method of java.net.URLConnection has this signature:
public OutputStream getOutputStream( ) throws IOException
Depending on the type of URL associated with this URLConnection object, the actual class of the output stream that's returned may be a sun.net.TelnetOutputStream, a sun.net.smtp.SmtpPrintStream, a sun.net.www.http.KeepAliveStream, or something else completely. All you know as a programmer, and all you need to know, is that the object returned is some kind of OutputStream.
Furthermore, even when working with subclasses whose types you know, you still need to be able to use the methods inherited from OutputStream. And since methods that are inherited are not included in the API documentation, it's important to remember that they're there. For example, the java.io.DataOutputStream class does not declare a close( ) method, but you can still call the method it inherits from its superclass.
Writing Bytes to Output Streams
The fundamental method of the OutputStream class is write( ) :
public abstract void write(int b) throws IOException
This method writes a single unsigned byte of data whose value should be between 0 and 255. If you pass a number larger than 255 or smaller than 0, it's reduced modulo 256 before being written.
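The reduction modulo 256 is easy to observe with an in-memory stream. In this sketch, writtenValue( ) is a hypothetical helper that writes one int to a ByteArrayOutputStream and reads back the byte that was actually stored, as an unsigned value.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical example class; not part of the Java API.
public class WriteModulo {

  // Writes b with write(int) and returns the stored byte as a value 0..255.
  public static int writtenValue(int b) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    sink.write(b); // ByteArrayOutputStream.write(int) declares no IOException
    return sink.toByteArray()[0] & 0xFF;
  }

  public static void main(String[] args) {
    System.out.println(writtenValue(65));  // prints 65
    System.out.println(writtenValue(321)); // prints 65, since 321 = 256 + 65
    System.out.println(writtenValue(-1));  // prints 255
  }
}
```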
Example 2-1, AsciiChart, is a simple program that writes the printable ASCII characters (32 to 126) on the console. The console interprets the numeric values as ASCII characters, not as numbers. This is a feature of the console, not of the OutputStream class or the specific subclass of which System.out is an instance. The write( ) method merely sends a particular bit pattern to a particular output stream. How that bit pattern is interpreted depends on what's connected to the other end of the stream.
Example 2-1. The AsciiChart program
import java.io.*;
public class AsciiChart {
  public static void main(String[] args) {
    for (int i = 32; i < 127; i++) {
      System.out.write(i);
      // break line after every eight characters.
      if (i % 8 == 7) System.out.write('\n');
      else System.out.write('\t');
    }
    System.out.write('\n');
  }
}
Notice the use of the char literals '\t' and '\n'. The compiler converts these to the numbers 9 and 10, respectively. When these numbers are written on the console, the console interprets them as a tab and a linefeed, respectively. The same effect could have been achieved by writing the if clause like this:
if (i % 8 == 7) System.out.write(10);
else System.out.write(9);
Here's the output:
% java AsciiChart
!       "       #       $       %       &       '
(       )       *       +       ,       -       .       /
0       1       2       3       4       5       6       7
8       9       :       ;       <       =       >       ?
@       A       B       C       D       E       F       G
H       I       J       K       L       M       N       O
P       Q       R       S       T       U       V       W
X       Y       Z       [       \       ]       ^       _
`       a       b       c       d       e       f       g
h       i       j       k       l       m       n       o
p       q       r       s       t       u       v       w
x       y       z       {       |       }       ~
Writing Arrays of Bytes
It's often faster to write data in large chunks than it is to write it byte by byte. Two overloaded variants of the write( ) method do this:
public void write(byte[] data) throws IOException
public void write(byte[] data, int offset, int length) throws IOException
The first variant writes the entire byte array data. The second writes only the subarray of data starting at offset and continuing for length bytes. For example, the following code fragment blasts the bytes in a string onto System.out:
String s = "How are streams treating you?";
byte[] data = s.getBytes( );
System.out.write(data);
Conversely, you may run into performance problems if you attempt to write too much data at a time. The exact turnaround point depends on the eventual destination of the data. Files are often best written in small multiples of the block size of the disk, typically 1024, 2048, or 4096 bytes. Network connections often require smaller buffer sizes—128 or 256 bytes. The optimal buffer size depends on too many system-specific details for anything to be guaranteed, but I often use 128 bytes for network connections and 1024 bytes for files.
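One way to combine the two ideas, writing arrays rather than single bytes while keeping each write to a moderate size, is to walk a large array in fixed-size chunks with the three-argument write( ). This is only a sketch: writeInChunks( ) is a hypothetical helper, the chunk size of 1024 in main( ) is simply one of the file-oriented sizes mentioned above, and the right value for a real destination would have to be measured.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical example class; not part of the Java API.
public class ChunkedWriter {

  // Writes data in pieces of at most chunkSize bytes; returns the number of write calls.
  public static int writeInChunks(byte[] data, OutputStream out, int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    int calls = 0;
    try {
      for (int offset = 0; offset < data.length; offset += chunkSize) {
        int length = Math.min(chunkSize, data.length - offset);
        out.write(data, offset, length); // writes only the subarray
        calls++;
      }
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return calls;
  }

  public static void main(String[] args) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    int calls = writeInChunks(new byte[2500], sink, 1024);
    System.out.println(calls + " calls, " + sink.size() + " bytes"); // prints 3 calls, 2500 bytes
  }
}
```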
Example 2-2 is a simple program that constructs a byte array filled with an ASCII chart, then blasts it onto the console in one call to write( ).
Example 2-2. The AsciiArray program
import java.io.*;
public class AsciiArray {
  public static void main(String[] args) {
    // 95 characters, each followed by a separator, plus one final newline
    byte[] b = new byte[(127-32)*2 + 1];
    int index = 0;
    for (int i = 32; i < 127; i++) {
      b[index++] = (byte) i;
      // Break line after every eight characters.
      if (i % 8 == 7) b[index++] = (byte) '\n';
      else b[index++] = (byte) '\t';
    }
    b[index++] = (byte) '\n';
    try {
      System.out.write(b);
    }
    catch (IOException ex) {
      System.err.println(ex);
    }
  }
}
The output is the same as in Example 2-1. Because of the nature of the console, this particular program probably isn't a lot faster than Example 2-1, but it certainly could be if you were writing data into a file rather than onto the console. The difference in performance between writing a
Closing Output Streams
When you're through with a stream, you should close it. This allows the operating system to free any resources associated with the stream. Exactly what these resources are depends on your platform and varies with the type of stream. However, many systems have finite resources. For example, on some personal computer operating systems, no more than several hundred files can be open at once. Multiuser operating systems have larger limits, but limits nonetheless.
To close a stream, invoke its close( ) method:
public void close( ) throws IOException
For example, again assuming out is an OutputStream, calling out.close( ) closes the stream and frees any underlying resources such as file handles or network ports associated with the stream.
Once you have closed an output stream, you probably can't write anything else onto that stream. Attempting to do so normally throws an IOException, though there are a few classes where this doesn't happen.
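Both behaviors can be observed with a short experiment. In the sketch below, which uses hypothetical class and method names and a temporary file created only for the test, a closed FileOutputStream rejects a write with an IOException, while a closed ByteArrayOutputStream quietly accepts it.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical example class; not part of the Java API.
public class WriteAfterClose {

  // Returns true if writing to a closed file stream throws an IOException.
  public static boolean fileStreamRejectsWrite() {
    try {
      File temp = File.createTempFile("closed", ".dat");
      temp.deleteOnExit();
      FileOutputStream out = new FileOutputStream(temp);
      out.close();
      try {
        out.write(1);
        return false;
      } catch (IOException expected) {
        return true;
      }
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  // Returns true if a closed byte array stream still accepts a write.
  public static boolean byteArrayStreamAcceptsWrite() {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      out.close(); // has no effect on this class
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    out.write(1);
    return out.size() == 1;
  }

  public static void main(String[] args) {
    System.out.println(fileStreamRejectsWrite());
    System.out.println(byteArrayStreamAcceptsWrite());
  }
}
```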
Again, System.out is a partial exception because it is a PrintStream, and a PrintStream eats all the exceptions it would otherwise throw. Once you close System.out, you can't write to it. Trying to do so won't throw any exceptions; however, your output will not appear on the console.
Not all streams need to be closed—byte array output streams do not need to be closed, for example. However, streams associated with files and network connections should always be closed when you're done with them. For example, if you open a file for writing and neglect to close it when you're through, then other processes may be blocked from reading or writing to that file. Often, files are closed like this:
try {
  OutputStream out = new FileOutputStream("numbers.dat");
  // Write to the stream...
  out.close( );
}
catch (IOException ex) {
  System.err.println(ex);
}
However, this code fragment has a potential leak. If an IOException is thrown while writing, the stream won't be closed. It's more reliable to close the stream in a
Flushing Output Streams
Many output streams buffer writes to improve performance. Rather than sending each byte to its destination as it's written, the bytes are accumulated in a memory buffer ranging in size from several bytes to several thousand bytes. When the buffer fills up, all the data is sent at once. The flush( ) method forces the data to be written whether or not the buffer is full:
public void flush( ) throws IOException
This is not the same as any buffering performed by the operating system or the hardware. These buffers will not be emptied by a call to flush( ). (The sync( ) method in the FileDescriptor class, discussed in Chapter 17, can sometimes empty these buffers.)
If you use a stream for only a short time, you don't need to flush it explicitly. It should flush automatically when the stream is closed. This should happen when the program exits or when the close( ) method is invoked. You flush an output stream explicitly only if you want to make sure data is sent before you're through with the stream. For example, a program that sends bursts of data across the network periodically should flush after each burst of data is written to the stream.
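The effect of an explicit flush can be seen by placing a BufferedOutputStream in front of an in-memory stream and checking how many bytes have arrived before and after the call. This is a minimal sketch; the class name, the method name, and the 1024-byte buffer size are choices made for this example.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical example class; not part of the Java API.
public class FlushDemo {

  // Returns the number of bytes that reached the target before and after flush( ).
  public static int[] sizesBeforeAndAfterFlush() {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    BufferedOutputStream out = new BufferedOutputStream(sink, 1024);
    try {
      out.write(new byte[] {1, 2, 3}); // held in the 1024-byte buffer
      int before = sink.size();
      out.flush();                     // forces the buffered bytes out
      int after = sink.size();
      return new int[] {before, after};
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
  }

  public static void main(String[] args) {
    int[] sizes = sizesBeforeAndAfterFlush();
    System.out.println(sizes[0] + " then " + sizes[1]); // prints 0 then 3
  }
}
```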
Flushing is often important when you're trying to debug a crashing program. All streams flush automatically when their buffers fill up, and all streams should be flushed when a program terminates normally. If a program terminates abnormally, however, buffers may not get flushed. In this case, unless there is an explicit call to flush( ) after each write, you can't be sure the data that appears in the output indicates the point at which the program crashed. In fact, the program may have continued to run for some time past that point before it crashed.
System.out, System.err, and some (but not all) other print streams automatically flush after each call to println( ) and after each time a newline character ('\n') appears in the string being written. You can enable or disable auto-flushing in the PrintStream constructor.
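The auto-flush setting can be compared directly by constructing two print streams over buffered in-memory targets. In this sketch, bytesDelivered( ) is a hypothetical helper; the exact byte count with auto-flush enabled depends on the platform's line separator, so the example only distinguishes zero from nonzero.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical example class; not part of the Java API.
public class AutoFlushDemo {

  // Prints one line and reports how many bytes reached the target without an explicit flush.
  public static int bytesDelivered(boolean autoFlush) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    PrintStream out = new PrintStream(new BufferedOutputStream(sink, 1024), autoFlush);
    out.println("hello");
    return sink.size();
  }

  public static void main(String[] args) {
    System.out.println(bytesDelivered(false)); // prints 0; the line is still buffered
    System.out.println(bytesDelivered(true) > 0); // prints true
  }
}
```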
Subclassing OutputStream
OutputStream is an abstract class that mainly describes the operations available with any OutputStream object. Specific subclasses know how to write bytes to particular destinations. For instance, a FileOutputStream uses native code to write data in files. A ByteArrayOutputStream uses pure Java to write its output in an expanding byte array.
Recall that there are three overloaded variants of the write( ) method in OutputStream, one abstract, two concrete:
public abstract void write(int b) throws IOException
public void write(byte[] data) throws IOException
public void write(byte[] data, int offset, int length) throws IOException
Subclasses must implement the abstract write(int b) method. They often also override the third variant, write(byte[] data, int offset, int length), to improve performance. The implementation of the three-argument version of the write( ) method in OutputStream simply invokes write(int b) repeatedly; that is:
public void write(byte[] data, int offset, int length) throws IOException {
  for (int i = offset; i < offset+length; i++) write(data[i]);
}
Most subclasses can provide a more efficient implementation of this method. The one-argument variant of write( ) merely invokes write(data, 0, data.length); if the three-argument variant has been overridden, this method will perform reasonably well. However, a few subclasses may override it anyway.
Example 2-3 is a simple class called NullOutputStream that mimics the behavior of /dev/null on Unix operating systems. Data written into a null output stream is lost.
Example 2-3. The NullOutputStream class
package com.elharo.io;
import java.io.*;
public class NullOutputStream extends OutputStream {
  private boolean closed = false;
  public void write(int b) throws IOException {
    if (closed) throw new IOException("Write to closed stream");
  }
  public void write(byte[] data, int offset, int length)
   throws IOException {
    if (data == null) throw new NullPointerException("data is null");
    if (closed) throw new IOException("Write to closed stream");
  }
  public void close( ) {
    closed = true;
  }
}
A Graphical User Interface for Output Streams
As an example, I'm going to show a subclass of javax.swing.JTextArea that can be connected to an output stream. As data is written onto the stream, it is appended to the text area in the default character set. (This isn't ideal. Since text areas contain text, a writer would be a better source for this data. In later chapters, I'll expand on this class to use a writer instead. For now, this makes a neat example.) This subclass is shown in Example 2-4.
The actual output stream is contained in an inner class inside the JStreamedTextArea class. Each JStreamedTextArea component contains a TextAreaOutputStream object in its theOutput field. Client programmers access this object via the getOutputStream( ) method. The JStreamedTextArea class has four overloaded constructors that imitate the four constructors in the javax.swing.JTextArea class, each taking a different combination of text, rows, and columns. The first three constructors merely pass their arguments and suitable defaults to the most general fourth constructor using this( ). The fourth constructor calls the most general superclass constructor, then calls setEditable(false) to ensure that the user doesn't change the text while output is streaming into it.
Example 2-4. The JStreamedTextArea component
package com.elharo.io.ui;
import javax.swing.*;
import java.io.*;
public class JStreamedTextArea extends JTextArea {
  private OutputStream theOutput = new TextAreaOutputStream( );
  public JStreamedTextArea( ) {
    this("", 0, 0);
  }
  public JStreamedTextArea(String text) {
    this(text, 0, 0);
  }
  public JStreamedTextArea(int rows, int columns) {
    this("", rows, columns);
  }
  public JStreamedTextArea(String text, int rows, int columns) {
    super(text, rows, columns);
    setEditable(false);
  }
  public OutputStream getOutputStream( ) {
    return theOutput;
  }
  private class TextAreaOutputStream extends OutputStream {
    private boolean closed = false;
    public void write(int b) throws IOException {
      checkOpen( );
      // recall that the int should really just be a byte
      b &= 0x000000FF;
      // must convert byte to a char in order to append it
      char c = (char) b;
      append(String.valueOf(c));
    }
    private void checkOpen( ) throws IOException {
        if (closed) throw new IOException("Write to closed stream");
    }
    public void write(byte[] data, int offset, int length)
     throws IOException {
      checkOpen( );
      append(new String(data, offset, length));
    }
    public void close( ) {
        this.closed = true;
    }
  }
}
Chapter 3: Input Streams
java.io.InputStream is the abstract superclass for all input streams. It declares the three basic methods needed to read bytes of data from a stream. It also has methods for closing streams, checking how many bytes of data are available to be read, skipping over input, marking a position in a stream and resetting back to that position, and determining whether marking and resetting are supported.
The read( ) Method
The fundamental method of the InputStream class is read( ). This method reads a single unsigned byte of data and returns the integer value of the unsigned byte. This is a number between 0 and 255:
public abstract int read( ) throws IOException
read( ) is declared abstract; therefore, InputStream is abstract. Hence, you can never instantiate an InputStream directly; you always work with one of its concrete subclasses.
The following code reads 10 bytes from the System.in input stream and stores them in the int array data:
int[] data = new int[10];
for (int i = 0; i < data.length; i++) {
  data[i] = System.in.read( );
}
Notice that although read( ) is reading a byte, it returns an int. If you want to store the raw bytes instead, you can cast the int to a byte. For example:
byte[] b = new byte[10];
for (int i = 0; i < b.length; i++) {
  b[i] = (byte) System.in.read( );
}
Of course, this produces a signed byte instead of the unsigned byte returned by the read( ) method (that is, a byte in the range −128 to 127 instead of 0 to 255). As long as you're clear in your mind and in your code about whether you're working with signed or unsigned data, you won't have any trouble. Signed bytes can be converted back to ints in the range of 0 to 255 like this:
int i = (b >= 0) ? b : 256 + b;
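In the expression above, b stands for a single signed byte value rather than the array from the previous fragment. A common alternative is to mask the byte with 0xFF. The sketch below, with the hypothetical class name UnsignedByteDemo, places both forms side by side so you can check that they agree.

```java
public class UnsignedByteDemo {

  // The conditional form: add 256 to negative values.
  public static int toUnsignedConditional(byte b) {
    return (b >= 0) ? b : 256 + b;
  }

  // The mask form: keep only the low-order eight bits of the widened int.
  public static int toUnsignedMask(byte b) {
    return b & 0xFF;
  }

  public static void main(String[] args) {
    // Compare the two conversions for every possible byte value.
    for (int v = Byte.MIN_VALUE; v <= Byte.MAX_VALUE; v++) {
      byte b = (byte) v;
      if (toUnsignedConditional(b) != toUnsignedMask(b)) {
        throw new IllegalStateException("Mismatch for " + v);
      }
    }
    System.out.println("Both conversions agree for all 256 byte values");
  }
}
```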
When you call read( ), you also have to catch the IOException that it might throw, or declare that your methods throw it. However, there's no IOException if read( ) encounters the end of the input stream; in this case, it returns −1. You use this as a flag to watch for the end of stream. The following code fragment shows how to catch the IOException and test for the end of the stream:
try {
  InputStream in = new FileInputStream("file.txt");
  int[] data = new int[10];
  for (int i = 0; i < data.length; i++) {
    int datum = in.read( );
    if (datum == -1) break;
    data[i] = datum;
  }
}
catch (IOException ex) {
  System.err.println(ex.getMessage( ));
}
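The end-of-stream flag is also the reason read( ) returns an int rather than a byte: −1 has to be distinguishable from a data byte of 255. The following sketch, using the hypothetical class EndOfStreamDemo and an in-memory stream, shows what goes wrong if you cast to byte before testing for the end of the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class EndOfStreamDemo {

  // Counts bytes by comparing the int returned by read() with -1.
  public static int countCorrectly(InputStream in) throws IOException {
    int count = 0;
    while (in.read() != -1) count++;
    return count;
  }

  // Buggy on purpose: after the cast, a data byte of 0xFF is also -1,
  // so the loop stops at the first such byte.
  public static int countWithEarlyCast(InputStream in) throws IOException {
    int count = 0;
    while (true) {
      byte b = (byte) in.read();
      if (b == -1) break;
      count++;
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = {0x41, (byte) 0xFF, 0x42};
    System.out.println(countCorrectly(new ByteArrayInputStream(data)));     // prints 3
    System.out.println(countWithEarlyCast(new ByteArrayInputStream(data))); // prints 1
  }
}
```

Test the int first, and cast to byte only once you know the value is real data.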
Reading Chunks of Data from a Stream
Input and output are often the performance bottlenecks in a program. Reading from or writing to disk can be hundreds of times slower than reading from or writing to memory; network connections and user input are even slower. While disk capacities and speeds have increased over time, they have never kept pace with CPU speeds. Therefore, it's important to minimize the number of reads and writes a program actually performs.
All input streams have overloaded read( ) methods that read chunks of contiguous data into a byte array. The first variant tries to read enough data to fill the array. The second variant tries to read length bytes of data starting at position offset into the array. Neither of these methods is guaranteed to read as many bytes as you want. Both methods return the number of bytes actually read, or −1 on end of stream.
public int read(byte[] data) throws IOException
public int read(byte[] data, int offset, int length) throws IOException
The default implementation of these methods in the java.io.InputStream class merely calls the basic read( ) method enough times to fill the requested array or subarray. Thus, reading 10 bytes of data takes 10 times as long as reading 1 byte of data. However, most subclasses of InputStream override these methods with more efficient methods, perhaps native, that read the data from the underlying source as a block.
For example, to attempt to read 10 bytes from System.in, you could write the following code:
try {
  byte[] b = new byte[10];
  System.in.read(b);
}
catch (IOException ex) {
  System.err.println("Couldn't read from System.in!");
}
Reads don't always succeed in getting as many bytes as you want. Conversely, there's nothing to stop you from trying to read more data into the array than will fit. If you do this, read( ) throws an IndexOutOfBoundsException. For example, the following code loops repeatedly until it either fills the array or sees the end of stream:
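The original listing is not part of this excerpt. What follows is a minimal sketch of such a loop rather than the book's own code. The class name ReadFullyDemo and the trickle( ) helper, which simulates a source that delivers at most three bytes per call, are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadFullyDemo {

  // Reads until the buffer is full or the stream ends, and returns the
  // number of bytes actually stored in the buffer.
  public static int fill(InputStream in, byte[] buffer) throws IOException {
    int bytesRead = 0;
    while (bytesRead < buffer.length) {
      int n = in.read(buffer, bytesRead, buffer.length - bytesRead);
      if (n == -1) break; // end of stream
      bytesRead += n;
    }
    return bytesRead;
  }

  // Hypothetical test source that never returns more than 3 bytes per call,
  // imitating a slow network connection.
  public static InputStream trickle(byte[] data) {
    return new ByteArrayInputStream(data) {
      @Override
      public synchronized int read(byte[] b, int off, int len) {
        return super.read(b, off, Math.min(len, 3));
      }
    };
  }

  public static void main(String[] args) throws IOException {
    System.out.println(fill(trickle(new byte[8]), new byte[10]));  // stream ends first
    System.out.println(fill(trickle(new byte[20]), new byte[10])); // buffer fills first
  }
}
```

Passing buffer.length - bytesRead as the length on every call is what keeps the read from running past the end of the array.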
Counting the Available Bytes
It's sometimes convenient to know how many bytes can be read before you attempt to read them. The InputStream class's available( ) method tells you how many bytes you can read without blocking. It returns 0 if there's no data available to be read.
public int available( ) throws IOException
For example:
try {
  byte[] b = new byte[100];
  int offset = 0;
  while (offset < b.length) {
    int a = System.in.available( );
    int bytesRead = System.in.read(b, offset, a);
    if (bytesRead == -1) break; // end of stream
    offset += bytesRead;
  }
}
catch (IOException ex) {
  System.err.println("Couldn't read from System.in!");
}
There's a potential bug in this code. There may be more bytes available than there's space in the array to hold them. One common idiom is to size the array according to the number available( ) returns, like this:
try {
  byte[] b = new byte[System.in.available( )];
  System.in.read(b);
}
catch (IOException ex) {
  System.err.println("Couldn't read from System.in!");
}
This works well if you're going to perform a single read. For multiple reads, however, the overhead of creating multiple arrays is excessive. You should probably reuse the array and create a new array only if more bytes are available than will fit in the array.
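One way to reuse the array might look like the following sketch. The class ReusableBufferReader, its initial capacity of 16 bytes, and its method names are all hypothetical; the point is only that the buffer is replaced when available( ) reports more bytes than it can hold, and otherwise kept.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReusableBufferReader {

  private byte[] buffer = new byte[16];

  // Reads the bytes that available() reports into the shared buffer,
  // allocating a larger buffer only when the current one is too small.
  // Returns 0 if nothing is available; otherwise whatever read() returns.
  public int readAvailable(InputStream in) throws IOException {
    int a = in.available();
    if (a == 0) return 0;
    if (a > buffer.length) buffer = new byte[a];
    return in.read(buffer, 0, a);
  }

  public int capacity() {
    return buffer.length;
  }

  public static void main(String[] args) throws IOException {
    ReusableBufferReader reader = new ReusableBufferReader();
    reader.readAvailable(new ByteArrayInputStream(new byte[10]));
    System.out.println(reader.capacity()); // still the initial buffer
    reader.readAvailable(new ByteArrayInputStream(new byte[40]));
    System.out.println(reader.capacity()); // grown to hold 40 bytes
  }
}
```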
The available( ) method in java.io.InputStream always returns 0. Subclasses are supposed to override it, but I've seen a few that don't. You may be able to read more bytes from the underlying stream without blocking than available( ) suggests; you just can't guarantee that you can. If this is a concern, place input in a separate thread so that blocked input doesn't block the rest of the program.
Skipping Bytes
The skip( ) method jumps over a certain number of bytes in the input:
public long skip(long bytesToSkip) throws IOException
The argument to skip( ) is the number of bytes to skip. The return value is the number of bytes actually skipped, which may be less than bytesToSkip and may even be 0, for instance when the end of stream has been reached. Unlike read( ), skip( ) is not specified to return −1 at end of stream. Both the argument and return value are longs, allowing skip( ) to handle extremely long input streams. Skipping is often faster than reading and discarding the data you don't want. When an input stream is attached to a file, for instance, skipping bytes just requires that the position in the file be changed, whereas reading involves copying bytes from the disk into memory. The following code tries to skip the next 80 bytes of the input stream in and gives up as soon as a call to skip( ) makes no progress:
try {
  long bytesSkipped = 0;
  long bytesToSkip = 80;
  while (bytesSkipped < bytesToSkip) {
    long n = in.skip(bytesToSkip - bytesSkipped);
    if (n <= 0) break; // no progress; possibly end of stream
    bytesSkipped += n;
  }
}
catch (IOException ex) {
  System.err.println(ex);
}
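The documented contract of InputStream.skip( ) allows it to return 0 for reasons other than end of stream, so a zero by itself is ambiguous. A minimal sketch of one way around this, assuming a hypothetical helper named skipFully( ), is to fall back to reading a single byte whenever skip( ) makes no progress.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SkipDemo {

  // Tries to skip exactly n bytes. Returns the number of bytes skipped,
  // which is less than n only if the stream ended first.
  public static long skipFully(InputStream in, long n) throws IOException {
    long skipped = 0;
    while (skipped < n) {
      long s = in.skip(n - skipped);
      if (s > 0) {
        skipped += s;
        continue;
      }
      // skip() made no progress: read one byte to distinguish
      // "nothing skipped this time" from end of stream.
      if (in.read() == -1) break;
      skipped++;
    }
    return skipped;
  }

  public static void main(String[] args) throws IOException {
    InputStream in = new ByteArrayInputStream(new byte[100]);
    System.out.println(skipFully(in, 80)); // prints 80
    System.out.println(in.available());    // prints 20
  }
}
```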
Closing Input Streams
As with output streams, input streams should be closed when you're through with them to release any native resources such as file handles or network ports that the stream is holding onto. To close a stream, invoke its close( ) method:
public void close( ) throws IOException
Once you have closed an input stream, you should no longer read from it. Most attempts to do so will throw an IOException (though there are a few exceptions).
Not all streams need to be closed—System.in generally does not need to be closed, for example. However, streams associated with files and network connections should always be closed when you're done with them. As with output streams, it's best to do this in a finally block to guarantee that the stream is closed, even if an exception is thrown while the stream is open. For example:
// Initialize this to null to keep the compiler from complaining
// about uninitialized variables
InputStream in = null;
try {
  URL u = new URL("http://www.msf.org/");
  in = u.openStream( );
  // Read from the stream...
}
catch (IOException ex) {
  System.err.println(ex);
}
finally {
  if (in != null) {
    try {
      in.close( );
    }
    catch (IOException ex) {
      System.err.println(ex);
    }
  }
}
If you can propagate any exceptions that are thrown, this strategy can be a little shorter and simpler. For example:
// Initialize this to null to keep the compiler from complaining
// about uninitialized variables
InputStream in = null;
try {
  URL u = new URL("http://www.msf.org/");
  in = u.openStream( );
  // Read from the stream...
}
finally {
  if (in != null) in.close( );
}
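If you are running Java 7 or later, which is newer than the fragments above assume, the try-with-resources statement closes the stream for you. The sketch below uses a hypothetical TrackedStream wrapper around an in-memory stream, in place of the URL stream, purely to make the call to close( ) observable.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class TryWithResourcesDemo {

  // Hypothetical wrapper that records whether close() was called.
  static class TrackedStream extends FilterInputStream {
    boolean closed = false;

    TrackedStream(InputStream in) {
      super(in);
    }

    @Override
    public void close() throws IOException {
      closed = true;
      super.close();
    }
  }

  // The stream is closed automatically when the try block ends normally.
  public static boolean closedAfterNormalUse() throws IOException {
    TrackedStream tracked =
        new TrackedStream(new ByteArrayInputStream(new byte[] {1, 2, 3}));
    try (InputStream in = tracked) {
      in.read();
    }
    return tracked.closed;
  }

  // The stream is also closed when the body throws an exception.
  public static boolean closedAfterException() {
    TrackedStream tracked =
        new TrackedStream(new ByteArrayInputStream(new byte[] {1, 2, 3}));
    try (InputStream in = tracked) {
      in.read();
      throw new IOException("simulated failure");
    } catch (IOException ex) {
      // close() has already run by the time this block executes
    }
    return tracked.closed;
  }

  public static void main(String[] args) throws IOException {
    System.out.println(closedAfterNormalUse()); // prints true
    System.out.println(closedAfterException()); // prints true
  }
}
```

This removes both the null initialization and the nested try block inside finally.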
Marking and Resetting
It's often useful to be able to read a few bytes and then back up and reread them. For example, in a Java compiler, you don't know for sure whether you're reading the token <, <<, or <<= until you've read one too many characters. It would be useful to be able to back up and reread the token once you know which token you've read.
Some (but not all) input streams allow you to mark a particular position in the stream and then return to it. Three methods in the java.io.InputStream class handle marking and resetting:
public void mark(int readLimit)
public void reset( ) throws IOException
public boolean markSupported( )
The markSupported( ) method returns true if this stream supports marking and false if it doesn't. If marking is not supported, reset( ) throws an IOException and mark( ) does nothing. Assuming the stream does support marking, the mark( ) method places a bookmark at the current position in the stream. You can rewind the stream to this position later with reset( ) as long as you haven't read more than readLimit bytes. There can be only one mark in the stream at any given time. Marking a second location erases the first mark.
The only two input stream classes in java.io that always support marking are BufferedInputStream (of which System.in is an instance) and ByteArrayInputStream.
However, other input streams such as DataInputStream may support marking if they're chained to a buffered input stream first.
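To make the compiler example concrete, here is a minimal sketch of one-byte lookahead with mark( ) and reset( ). The class MarkResetDemo and the method readShiftToken( ) are hypothetical, and the stream is a ByteArrayInputStream because it always supports marking.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class MarkResetDemo {

  // Reads one of the tokens "<", "<<", or "<<=" from the stream. Any byte
  // read as lookahead that is not part of the token is pushed back by reset().
  public static String readShiftToken(InputStream in) throws IOException {
    if (!in.markSupported()) {
      throw new IOException("Stream does not support mark and reset");
    }
    if (in.read() != '<') throw new IOException("Token must start with <");

    in.mark(1); // remember the position after the first <
    if (in.read() != '<') {
      in.reset();
      return "<";
    }

    in.mark(1); // a second mark replaces the first
    if (in.read() != '=') {
      in.reset();
      return "<<";
    }
    return "<<=";
  }

  public static void main(String[] args) throws IOException {
    InputStream in =
        new ByteArrayInputStream("<<5".getBytes(StandardCharsets.US_ASCII));
    System.out.println(readShiftToken(in)); // prints <<
    System.out.println((char) in.read());   // prints 5
  }
}
```

A read limit of 1 is enough here because no more than one byte is ever read between a mark( ) and the matching reset( ).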
This is a truly bizarre design. It's almost always a bad idea to put methods in a superclass that aren't applicable to all subclasses. The proper solution to this problem would be to define a Resettable interface that declares these three methods and then have subclasses implement that interface or not as they choose. You could then tell whether marking and resetting were supported with a simple
Subclassing InputStream
Immediate subclasses of InputStream must provide an implementation of the abstract read( ) method. They may also override some of the nonabstract methods. For example, the default markSupported( ) method returns false, mark( ) does nothing, and reset( ) throws an IOException. Any class that allows marking and resetting must override these three methods. Subclasses should also override available( ) to return something other than 0. Furthermore, they may override skip( ) and the other two read( ) methods to provide more efficient implementations.
Example 3-2 is a simple class called RandomInputStream that "reads" random bytes of data. This provides a useful source of unlimited data you can use in testing. A java.util.Random object provides the data.
Example 3-2. The RandomInputStream class
package com.elharo.io;
import java.util.*;
import java.io.*;
public class RandomInputStream extends InputStream {
  private Random generator = new Random( );
  private boolean closed = false;
  public int read( ) throws IOException {
    checkOpen( );
    // nextInt(256) returns a uniformly distributed value from 0 to 255
    return generator.nextInt(256);
  }
  public int read(byte[] data, int offset, int length) throws IOException {
    checkOpen( );
    byte[] temp = new byte[length];
    generator.nextBytes(temp);
    System.arraycopy(temp, 0, data, offset, length);
    return length;
  }
  public int read(byte[] data) throws IOException {
    checkOpen( );
    generator.nextBytes(data);
    return data.length;
  }
  public long skip(long bytesToSkip) throws IOException {
    checkOpen( );
    // It's all random so skipping has no effect.
    return bytesToSkip;
  }
  public void close( ) {
      this.closed = true;
  }
  private void checkOpen( ) throws IOException {
      if (closed) throw new IOException("Input stream closed");
  }
  public int available( ) {
    // Limited only by available memory and the size of an array.
    return Integer.MAX_VALUE;
  }
}
An Efficient Stream Copier