This chapter discusses Java’s stream classes, which are defined in the java.io package. While streams are not really part of RMI, a working knowledge of the stream classes is an important part of an RMI programmer’s skillset. In particular, this chapter provides essential background information for understanding two related areas: sockets and object serialization.
A stream is an ordered sequence of bytes. However, it’s helpful to also think of a stream as a data structure that allows client code to either store or retrieve information. Storage and retrieval are done sequentially—typically, you write data to a stream one byte at a time or read information from the stream one byte at a time. However, in most stream classes, you cannot “go back”—once you’ve read a piece of data, you must move on. Likewise, once you’ve written a piece of data, it’s written.
You may think that a stream sounds like an impoverished data structure. Certainly, for most programming tasks, a HashMap or an ArrayList storing objects is preferable to a read-once sequence of bytes. However, streams have one nice feature: they are a simple and correct model for almost any external device connected to a computer. Why correct? Well, when you think about it, the code-level mechanics of writing data to a printer are not all that different from sending data over a modem; the information is sent sequentially, and, once it’s sent, it cannot be retrieved or “un-sent.”[3] Hence, streams are an abstraction that allows client code to access an external resource without worrying too much about the specific resource.
Using the streams library is a two-step process. First, device-specific code that creates the stream objects is executed; this is often called “opening” the stream. Then, information is either read from or written to the stream. This second step is device-independent; it relies only on the stream interfaces. Let’s start by looking at the stream classes offered with Java: InputStream and OutputStream.
InputStream is an abstract class that represents a data source. Once opened, it provides information to the client that created it. The InputStream class consists of the following methods:
public int available( ) throws IOException
public void close( ) throws IOException
public void mark(int numberOfBytes)
public boolean markSupported( )
public abstract int read( ) throws IOException
public int read(byte[] buffer) throws IOException
public int read(byte[] buffer, int startingOffset, int numberOfBytes) throws IOException
public void reset( ) throws IOException
public long skip(long numberOfBytes) throws IOException
These methods serve three different roles: reading data, stream navigation, and resource management.
The most important methods are those that actually retrieve data from the stream. InputStream defines three basic methods for reading data:
public int read( ) throws IOException
public int read(byte[] buffer) throws IOException
public int read(byte[] buffer, int startingOffset, int numberOfBytes) throws IOException
The first of these methods, read( ), simply returns the next available byte in the stream. This byte is returned as an integer in order to allow the InputStream to return nondata values. For example, read( ) returns -1 if there is no data available, and no more data will be available to this stream. This can happen, for example, if you reach the end of a file. On the other hand, if there is currently no data, but some may become available in the future, the read( ) method blocks. Your code then waits until a byte becomes available before continuing.
Tip
A piece of code is said to block if it must wait for a resource to finish its job. For example, using the read( ) method to retrieve data from a file can force the method to halt execution until the target hard drive becomes available. Blocking can sometimes lead to undesirable results. If your code is waiting for a byte that will never come, the program has effectively crashed.
The other two methods for retrieving data are more advanced versions of read( ), added to the InputStream class for efficiency. For example, consider what would happen if you created a tight loop to fetch 65,000 bytes one at a time from an external device. This would be extraordinarily inefficient. If you know you’ll be fetching large amounts of data, it’s better to make a single request:
byte[] buffer = new byte[1000];
read(buffer);
The read(byte[] buffer) method is a request to read enough bytes to fill the buffer (in this case, buffer.length bytes). The integer return value is the number of bytes that were actually read, which may be fewer than requested, or -1 if no bytes were read because the end of the stream was reached.
Finally, read(byte[] buffer, int startingOffset, int numberOfBytes) is a request to read up to numberOfBytes bytes from the stream and place them in the buffer starting at position startingOffset. For example:
read(buffer, 2, 7);
This is a request to read 7 bytes and place them in the locations buffer[2], buffer[3], and so on up to buffer[8]. Like the previous read( ), this method returns an integer indicating the number of bytes that it was able to read, or -1 if no bytes were read at all because the stream has ended.
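Because the array-based read( ) methods may return fewer bytes than requested, code that needs a full buffer usually loops. The following is a minimal sketch of that idea; the class name ReadFullyDemo and the helper readFully( ) are hypothetical names chosen for illustration, and an in-memory ByteArrayInputStream stands in for a real device.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadFullyDemo {
    // Keeps calling read( ) until the buffer is full or the stream ends.
    // Returns the number of bytes actually stored in the buffer.
    static int readFully(InputStream in, byte[] buffer) throws IOException {
        int total = 0;
        while (total < buffer.length) {
            int count = in.read(buffer, total, buffer.length - total);
            if (count == -1) {
                break; // end of stream: no more data will ever arrive
            }
            total += count;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {10, 20, 30, 40, 50});
        byte[] buffer = new byte[8];
        int stored = readFully(in, buffer);
        System.out.println(stored + " bytes read"); // prints "5 bytes read"
    }
}
```

The return value is checked on every pass, so the loop behaves the same whether the underlying stream delivers its data in one piece or in many small ones.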
Stream navigation methods are methods that enable you to move around in the stream without necessarily reading in data. There are five stream navigation methods:
public int available( ) throws IOException
public long skip(long numberOfBytes) throws IOException
public void mark(int numberOfBytes)
public boolean markSupported( )
public void reset( ) throws IOException
available( ) is used to discover how many bytes are guaranteed to be immediately available. To avoid blocking, you can call available( ) before each read( ), as in the following code fragment:
while (stream.available( ) > 0) {
    processNextByte(stream.read( ));
}
Warning
There are two caveats when using available( ) in this way. First, you should make sure that the stream from which you are reading actually implements available( ) in a meaningful way. For example, the default implementation, defined in InputStream, simply returns 0. This behavior, while technically correct, is really misleading. (The preceding code fragment will not work if the stream always returns 0.) The second caveat is that you should make sure to use buffering. See Section 1.3 later in this chapter for more details on how to buffer streams.
The skip( ) method simply moves you forward numberOfBytes in the stream. For many streams, skipping is equivalent to reading in the data and then discarding it.
Warning
In fact, most implementations of skip( ) do exactly that: repeatedly read and discard the data. Hence, if numberOfBytes worth of data aren’t available yet, these implementations of skip( ) will block.
Many input streams are unidirectional: they only allow you to move forward. Input streams that support repeated access to their data do so by implementing marking. The intuition behind marking is that code that reads data from the stream can mark a point to which it might want to return later. Input streams that support marking return true when markSupported( ) is called. You can use the mark( ) method to mark the current location in the stream. The method’s sole parameter, numberOfBytes, is used for expiration: the stream will retire the mark if the reader reads more than numberOfBytes past it. Calling reset( ) returns the stream to the point where the mark was made.
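Marking is easiest to see with a one-byte "peek." The sketch below is illustrative only: MarkResetDemo and peek( ) are hypothetical names, and the data comes from an in-memory array wrapped in a BufferedInputStream, which supports marking.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class MarkResetDemo {
    // Returns the next byte without consuming it, using mark( ) and reset( ).
    static int peek(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IOException("stream does not support marking");
        }
        in.mark(1);           // the mark stays valid for at least one byte
        int next = in.read(); // read ahead
        in.reset();           // return to the marked position
        return next;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new BufferedInputStream(
                new ByteArrayInputStream(new byte[] {65, 66}));
        // The peeked byte is still there for the following read( ).
        System.out.println(peek(in) + " " + in.read() + " " + in.read()); // prints "65 65 66"
    }
}
```

Checking markSupported( ) first matters because the default InputStream implementation of reset( ) simply throws an IOException.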
Because streams are often associated with external devices such as files or network connections, using a stream often requires the operating system to allocate resources beyond memory. For example, most operating systems limit the number of files or network connections that a program can have open at the same time. The resource management methods of the InputStream class involve communication with native code to manage operating system-level resources.
The only resource management method defined for InputStream is close( ). When you’re done with a stream, you should always explicitly call close( ). This will free the associated system resources (e.g., the associated file descriptor for files).
At first glance, this seems a little strange. After all, one of the big advantages of Java is that it has garbage collection built into the language specification. Why not just have the object free the operating-system resources when the object is garbage collected?
The reason is that garbage collection is unreliable. The Java language specification does not explicitly guarantee that an object that is no longer referenced will be garbage collected (or even that the garbage collector will ever run). In practice, you can safely assume that, if your program runs short on memory, some objects will be garbage collected, and some memory will be reclaimed. But this assumption isn’t enough for effective management of scarce operating-system resources such as file descriptors. In particular, there are three main problems:
You have no control over how much time will elapse between when an object is eligible to be garbage collected and when it is actually garbage collected.
You have very little control over which objects get garbage collected.[4]
There isn’t necessarily a relationship between the number of file handles still available and the amount of memory available. You may run out of file handles long before you run out of memory, in which case the garbage collector may never become active.
Put succinctly, the garbage collector is an unreliable way to manage anything other than memory allocation. Whenever your program is using scarce operating-system resources, you should explicitly release them. This is especially true for streams; a program should always close streams when it’s finished using them.
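One common way to guarantee that close( ) is called is to put it in a finally block. The following is a minimal sketch of that pattern rather than code from the example applications; CloseDemo and countAndClose( ) are hypothetical names, and the stream is an in-memory stand-in for a file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class CloseDemo {
    // Counts the bytes in a stream and closes it even if reading fails.
    static int countAndClose(InputStream in) throws IOException {
        try {
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        } finally {
            in.close(); // runs on normal return and when read( ) throws
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println(countAndClose(in)); // prints "3"
    }
}
```

In Java 7 and later, the try-with-resources statement expresses the same guarantee more compactly.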
Almost all of the methods defined for InputStream can throw an IOException (mark( ) and markSupported( ) are the exceptions). IOException is a checked exception. This means that stream manipulation code usually occurs inside a try/catch block, or inside a method that itself declares throws IOException, as in the following code fragment:
try {
    while (-1 != (nextByte = bufferedStream.read( ))) {
        char nextChar = (char) nextByte;
        ...
    }
} catch (IOException e) {
    ...
}
The idea behind IOException is this: streams are mostly used to exchange data with devices that are outside the JVM. If something goes wrong with the device, the device needs a universal way to indicate an error to the client code.
Consider, for example, a printer that refuses to print a document because it is out of paper. The printer needs to signal an exception, and the exception should be relayed to the user; the program making the print request has no way of refilling the paper tray without human intervention. Moreover, this exception should be relayed to the user immediately.
Most stream exceptions are similar to this example. That is, they often require some sort of user action (or at least user notification), and are often best handled immediately. Therefore, the designers of the streams library decided to make IOException a checked exception, thereby forcing programs to explicitly handle the possibility of failure.
OutputStream is an abstract class that represents a data sink. Once it is created, client code can write information to it. OutputStream consists of the following methods:
public void close( ) throws IOException
public void flush( ) throws IOException
public void write(byte[] buffer) throws IOException
public void write(byte[] buffer, int startingOffset, int numberOfBytes) throws IOException
public void write(int value) throws IOException
The OutputStream class is a little simpler than InputStream; it doesn’t support navigation. After all, you probably don’t want to go back and write information a second time. OutputStream methods serve two purposes: writing data and resource management.
OutputStream defines three basic methods for writing data:
public void write(byte[] buffer) throws IOException
public void write(byte[] buffer, int startingOffset, int numberOfBytes) throws IOException
public void write(int value) throws IOException
These methods are analogous to the read( ) methods defined for InputStream. Just as there was one basic method for reading a single byte of data, there is one basic method, write(int value), for writing a single byte of data. The argument to this write( ) method should be an integer between 0 and 255. If not, it is reduced modulo 256 before being written; that is, only its low-order 8 bits are written.
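The modulo-256 behavior is easy to check with an in-memory stream. This is a small sketch, not part of the example applications; WriteIntDemo and writeAndReadBack( ) are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

class WriteIntDemo {
    // Writes each value with write(int) and returns what was actually
    // stored, converted back to unsigned values in the range 0-255.
    static int[] writeAndReadBack(int... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int value : values) {
            out.write(value); // only the low-order 8 bits are kept
        }
        byte[] stored = out.toByteArray();
        int[] unsigned = new int[stored.length];
        for (int i = 0; i < stored.length; i++) {
            unsigned[i] = stored[i] & 0xFF; // undo Java's signed byte
        }
        return unsigned;
    }

    public static void main(String[] args) {
        // 300 = 256 + 44, and the low-order 8 bits of -1 are 0xFF.
        System.out.println(Arrays.toString(writeAndReadBack(65, 300, -1))); // prints "[65, 44, 255]"
    }
}
```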
Just as there were two array-based variants of read( ), there are two methods for writing arrays of bytes. write(byte[] buffer) causes all the bytes in the array to be written out to the stream. write(byte[] buffer, int startingOffset, int numberOfBytes) causes numberOfBytes bytes to be written, starting with the value at buffer[startingOffset].
Tip
The fact that the argument to the basic write( ) method is an integer is somewhat peculiar. Recall that read( ) returned an integer, rather than a byte, in order to allow instances of InputStream to signal exceptional conditions. write( ) takes an integer, rather than a byte, so that the read and write method declarations are parallel. In other words, if you’ve read a value in from a stream, and it’s not -1, you should be able to write it out to another stream without casting it.
OutputStream defines two resource management methods:
public void close( ) throws IOException
public void flush( ) throws IOException
close( ) serves exactly the same role for OutputStream as it did for InputStream: it should be called when the client code is done using the stream and wishes to free up all the associated operating-system resources.
The flush( ) method is necessary because output streams frequently use a buffer to store data that is being written. This is especially true when data is being written to either a file or a socket. Passing data to the operating system a single byte at a time can be expensive. A much more practical strategy is to buffer the data at the JVM level and occasionally call flush( ) to send the data en masse.
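The effect of flush( ) can be observed by buffering writes to an in-memory sink and checking how much has arrived before and after the call. This is a sketch under the assumption that the data is smaller than the 1,024-byte buffer chosen here; FlushDemo and sizesAroundFlush( ) are hypothetical names.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class FlushDemo {
    // Returns {bytes in the sink before flush( ), bytes in the sink after}.
    // Assumes data.length is smaller than the 1024-byte buffer.
    static int[] sizesAroundFlush(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream buffered = new BufferedOutputStream(sink, 1024);
        buffered.write(data, 0, data.length); // held in the JVM-level buffer
        int before = sink.size();
        buffered.flush();                     // pushed to the underlying stream
        int after = sink.size();
        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesAroundFlush(new byte[] {1, 2, 3});
        System.out.println(sizes[0] + " then " + sizes[1]); // prints "0 then 3"
    }
}
```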
To make this discussion more concrete, we will now discuss a simple application that allows the user to display the contents of a file in a JTextArea. The application is called ViewFile and is shown in Example 1-1. Note that the application’s main( ) method is defined in the com.ora.rmibook.chapter1.ViewFile class.[5] The resulting screenshot is shown in Figure 1-1.
Example 1-1. ViewFile.java
public class ViewFileFrame extends ExitingFrame {
    // lots of code to set up the user interface.
    // The View button's action listener is an inner class

    private void copyStreamToViewingArea(InputStream fileInputStream) throws IOException {
        BufferedInputStream bufferedStream = new BufferedInputStream(fileInputStream);
        int nextByte;
        _fileViewingArea.setText("");
        StringBuffer localBuffer = new StringBuffer( );
        // read( ) returns -1 once the end of the file is reached
        while (-1 != (nextByte = bufferedStream.read( ))) {
            char nextChar = (char) nextByte;
            localBuffer.append(nextChar);
        }
        _fileViewingArea.append(localBuffer.toString( ));
    }

    private class ViewFileAction extends AbstractAction {
        public ViewFileAction( ) {
            putValue(Action.NAME, "View");
            putValue(Action.SHORT_DESCRIPTION, "View file contents in main text area.");
        }

        public void actionPerformed(ActionEvent event) {
            FileInputStream fileInputStream = _fileTextField.getFileInputStream( );
            if (null == fileInputStream) {
                _fileViewingArea.setText("Invalid file name");
            } else {
                try {
                    copyStreamToViewingArea(fileInputStream);
                    fileInputStream.close( );
                } catch (java.io.IOException ioException) {
                    _fileViewingArea.setText("\n Error occurred while reading file");
                }
            }
        }
    }
}
The important part of the code is the View button’s action listener and the copyStreamToViewingArea( ) method. copyStreamToViewingArea( ) takes an instance of InputStream and copies the contents of the stream to the central JTextArea. What happens when a user clicks on the View button? Assuming all goes well, and that no exceptions are thrown, the following three lines of code from the button’s action listener are executed:
FileInputStream fileInputStream = _fileTextField.getFileInputStream( );
copyStreamToViewingArea(fileInputStream);
fileInputStream.close( );
The first line is a call to the getFileInputStream( ) method on _fileTextField. That is, the program reads the name of the file from a text field and tries to open a FileInputStream. FileInputStream is defined in the java.io package. It is a subclass of InputStream used to read the contents of a file.
Once this stream is opened, copyStreamToViewingArea( ) is called. copyStreamToViewingArea( ) takes the input stream, wraps it in a buffer, and then reads it one byte at a time. There are two things to note here:
- We explicitly check that nextByte is not equal to -1 (i.e., that we’re not at the end of the file). If we don’t do this, the loop will never terminate, and we will continue to append (char) -1 to the end of our text until the program crashes or throws an exception.
- We use BufferedInputStream instead of using FileInputStream directly. Internally, a BufferedInputStream maintains a buffer so it can read and store many values at one time. Maintaining this buffer allows instances of BufferedInputStream to optimize expensive read operations. In particular, rather than reading each byte individually, bufferedStream converts individual calls to its read( ) method into a single call to FileInputStream’s read(byte[] buffer) method. Note that buffering also provides another benefit. BufferedInputStream supports stream navigation through the use of marking.
The use of BufferedInputStream illustrates a central idea in the design of the streams library: streams can be wrapped in other streams to provide incremental functionality. That is, there are really two types of streams:
- Primitive streams
These are the streams that have native methods and talk to external devices. All they do is transmit data exactly as it is presented. FileInputStream and FileOutputStream are examples of primitive streams.
- Intermediate streams
These streams are not direct representatives of a device. Instead, they function as a wrapper around an already existing stream, which we will call the underlying stream. The underlying stream is usually passed as an argument to the intermediate stream’s constructor. The intermediate stream has logic in its read( ) or write( ) methods that either buffers the data or transforms it before forwarding it to the underlying stream. Intermediate streams are also responsible for propagating flush( ) and close( ) calls to the underlying stream. BufferedInputStream and BufferedOutputStream are examples of intermediate streams.
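To make the idea of an intermediate stream concrete, here is a minimal sketch of one built on java.io.FilterInputStream, the library's base class for input wrappers. CountingInputStream is a hypothetical class written for this illustration; it adds byte counting on top of whatever underlying stream it is given and inherits close( ) propagation from FilterInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class CountingInputStream extends FilterInputStream {
    private long bytesRead = 0;

    // The underlying stream is passed to the constructor, as usual.
    CountingInputStream(InputStream underlying) {
        super(underlying);
    }

    @Override
    public int read() throws IOException {
        int value = super.read(); // forward to the underlying stream
        if (value != -1) {
            bytesRead++;
        }
        return value;
    }

    @Override
    public int read(byte[] buffer, int offset, int length) throws IOException {
        int count = super.read(buffer, offset, length);
        if (count > 0) {
            bytesRead += count;
        }
        return count;
    }

    // Bytes delivered through read( ) so far; skipped bytes are not counted.
    long getBytesRead() {
        return bytesRead;
    }

    public static void main(String[] args) throws IOException {
        CountingInputStream in = new CountingInputStream(
                new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5}));
        while (in.read() != -1) {
            // discard the data; only the count matters here
        }
        in.close(); // propagates to the underlying stream
        System.out.println(in.getBytesRead()); // prints "5"
    }
}
```

Only the single-byte and three-argument read( ) methods are overridden because FilterInputStream's read(byte[] buffer) is implemented in terms of the three-argument version.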
Warning
close( ) and flush( ) propagate to sockets as well. That is, if you close a stream that is associated with a socket, you will close the socket. This behavior, while logical and consistent, can come as a surprise.
To further illustrate the idea of layering, I will demonstrate the use of GZIPOutputStream, defined in the package java.util.zip, with the CompressFile application. This application is shown in Example 1-2. CompressFile is an application that lets the user choose a file and then makes a compressed copy of it. The application works by layering three output streams together. Specifically, it opens an instance of FileOutputStream, which it then uses as an argument to the constructor of a BufferedOutputStream, which in turn is used as an argument to GZIPOutputStream’s constructor. All data is then written using GZIPOutputStream. Again, the main( ) method for this application is defined in the com.ora.rmibook.chapter1.CompressFile class.
The important part of the source code is the copy( ) method, which copies an InputStream to an OutputStream, and ActionListener, which is added to the Compress button. A screenshot of the application is shown in Figure 1-2.
Example 1-2. CompressFile.java
private int copy(InputStream source, OutputStream destination) throws IOException {
    int nextByte;
    int numberOfBytesCopied = 0;
    // copy one byte at a time until read( ) signals the end of the stream
    while (-1 != (nextByte = source.read( ))) {
        destination.write(nextByte);
        numberOfBytesCopied++;
    }
    destination.flush( );
    return numberOfBytesCopied;
}

private class CompressFileAction extends AbstractAction {
    // setup code omitted

    public void actionPerformed(ActionEvent event) {
        InputStream source = _startingFileTextField.getFileInputStream( );
        OutputStream destination = _destinationFileTextField.getFileOutputStream( );
        if ((null != source) && (null != destination)) {
            try {
                BufferedInputStream bufferedSource = new BufferedInputStream(source);
                BufferedOutputStream bufferedDestination = new BufferedOutputStream(destination);
                GZIPOutputStream zippedDestination = new GZIPOutputStream(bufferedDestination);
                copy(bufferedSource, zippedDestination);
                bufferedSource.close( );
                zippedDestination.close( );
            } catch (IOException e) {}
        }
    }
}
When the user clicks on the Compress button, two input streams and three output streams are created. The input streams are similar to those used in the ViewFile application; they allow us to use buffering as we read in the file. The output streams, however, are new. First, we create an instance of FileOutputStream. We then wrap an instance of BufferedOutputStream around the instance of FileOutputStream. And finally, we wrap GZIPOutputStream around BufferedOutputStream. To see what this accomplishes, consider what happens when we start feeding data to GZIPOutputStream (the outermost OutputStream).
- write(nextByte) is repeatedly called on zippedDestination.
- zippedDestination does not immediately forward the data to bufferedDestination. Instead, it compresses the data and sends the compressed version of the data to bufferedDestination by calling one of its array-based write( ) methods.
- bufferedDestination does not immediately forward the data it received to destination. Instead, it puts the data in a buffer and waits until it gets a large amount of data before calling destination’s write(byte[] buffer) method.
Eventually, when all the data has been read in, zippedDestination’s close( ) method is called. This flushes bufferedDestination, which flushes destination, causing all the data to be written out to the physical file. After that, zippedDestination is closed, which causes bufferedDestination to be closed, which then causes destination to be closed, thus freeing up scarce system resources.
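The same layering can be tried without a user interface or a file. The sketch below is not the CompressFile source; GzipRoundTrip and its two methods are hypothetical names, and a ByteArrayOutputStream takes the place of the FileOutputStream at the bottom of the stack.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipRoundTrip {
    // Layers GZIP over a buffer over an in-memory sink, as in the text.
    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream zipped = new GZIPOutputStream(new BufferedOutputStream(sink));
        try {
            zipped.write(data);
        } finally {
            zipped.close(); // finishes compression, then flushes and closes each layer
        }
        return sink.toByteArray();
    }

    // Reverses compress( ) by reading through a GZIPInputStream.
    static byte[] decompress(byte[] compressed) throws IOException {
        InputStream unzipped = new GZIPInputStream(new ByteArrayInputStream(compressed));
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try {
            byte[] buffer = new byte[1024];
            int count;
            while ((count = unzipped.read(buffer)) != -1) {
                result.write(buffer, 0, count);
            }
        } finally {
            unzipped.close();
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = new byte[10000]; // all zeros, so it compresses well
        byte[] compressed = compress(original);
        System.out.println(compressed.length < original.length);
        System.out.println(Arrays.equals(original, decompress(compressed)));
    }
}
```

Note that the sink is only read after close( ) has been called on the outermost stream; before that, some of the compressed data may still be sitting in the intermediate layers.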
I will close our discussion of streams by briefly mentioning a few of the most useful intermediate streams in the Javasoft libraries. In addition to buffering and compressing, the two most commonly used intermediate stream types are DataInputStream/DataOutputStream and ObjectInputStream/ObjectOutputStream. We will discuss ObjectInputStream and ObjectOutputStream extensively in Chapter 10.
DataInputStream and DataOutputStream don’t actually transform data that is given to them in the form of bytes. However, DataInputStream implements the DataInput interface, and DataOutputStream implements the DataOutput interface. This allows other datatypes to be read from, and written to, streams. For example, DataOutput defines the writeFloat(float value) method, which can be used to write an IEEE 754 floating-point value out to a stream. This method takes the floating point argument, converts it to a sequence of four bytes, and then writes the bytes to the underlying stream.
If DataOutputStream is used to convert data for storage into an underlying stream, the data should always be read in with a DataInputStream object. This brings up an important principle: intermediate input and output streams which transform data must be used in pairs. That is, if you zip, you must unzip. If you encrypt, you must decrypt. And, if you use DataOutputStream, you must use DataInputStream.
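The pairing principle can be sketched with the data streams themselves. In the example below, DataRoundTrip, encode( ), and decode( ) are hypothetical names; the values are written with a DataOutputStream and must be read back with a DataInputStream, in the same order and with the matching methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class DataRoundTrip {
    // Writes a float (4 bytes) followed by an int (4 bytes).
    static byte[] encode(float value, int count) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(sink);
        out.writeFloat(value);
        out.writeInt(count);
        out.flush();
        return sink.toByteArray();
    }

    // Reads the values back in the order in which they were written.
    static String decode(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        float value = in.readFloat();
        int count = in.readInt();
        return value + " " + count;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = encode(1.5f, 7);
        System.out.println(bytes.length + " bytes: " + decode(bytes)); // prints "8 bytes: 1.5 7"
    }
}
```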
The last topics I will touch on in this chapter are the Reader and Writer abstract classes. Readers and writers are like input streams and output streams. The primary difference lies in the fundamental datatype that is read or written; streams are byte-oriented, whereas readers and writers use characters and strings.
The reason for this is internationalization. Readers and writers were designed to allow programs to use a localized character set and still have a stream-like model for communicating with external devices. As you might expect, the method definitions are quite similar to those for InputStream and OutputStream.
Here are the basic methods defined in Reader:
public void close( )
public void mark(int readAheadLimit)
public boolean markSupported( )
public int read( )
public int read(char[] cbuf)
public int read(char[] cbuf, int off, int len)
public boolean ready( )
public void reset( )
public long skip(long n)
These are analogous to the methods defined for InputStream. For example, read( ) still returns an integer. The difference is that, instead of data values being in the range of 0-255 (i.e., single bytes), the return value is in the range of 0-65535 (appropriate for characters, which are 2 bytes wide). However, a return value of -1 is still used to signal that there is no more data.
The only other major change is that InputStream’s available( ) method has been replaced with a boolean method, ready( ), which returns true if the next call to read( ) doesn’t block. Calling ready( ) on a class that extends Reader is analogous to checking (available( ) > 0) on InputStream.
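The difference between bytes and characters shows up as soon as the data contains a non-ASCII character. The following is a small sketch, with ReaderDemo and countChars( ) as hypothetical names, that decodes UTF-8 bytes through an InputStreamReader and counts the characters that read( ) returns.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

class ReaderDemo {
    // Counts the characters produced by decoding the given UTF-8 bytes.
    static int countChars(byte[] utf8Bytes) throws IOException {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), "UTF-8");
        try {
            int count = 0;
            while (reader.read() != -1) { // each call returns one char, or -1
                count++;
            }
            return count;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // "caf" is three bytes; the accented e takes two bytes in UTF-8.
        byte[] bytes = {'c', 'a', 'f', (byte) 0xC3, (byte) 0xA9};
        System.out.println(bytes.length + " bytes, " + countChars(bytes) + " chars"); // prints "5 bytes, 4 chars"
    }
}
```

Naming the encoding explicitly, rather than relying on the platform default, keeps the result the same on every machine.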
There aren’t nearly so many subclasses of Reader or Writer as there are types of streams. Instead, readers and writers can be used as a layer on top of streams: most readers have a constructor that takes an InputStream as an argument, and most writers have a constructor that takes an OutputStream as an argument. Thus, in order to use both localization and compression when writing to a file, open the file and implement compression by layering streams, and then wrap your final stream in a writer to add localization support, as in the following snippet of code:
FileOutputStream destination = new FileOutputStream(fileName);
BufferedOutputStream bufferedDestination = new BufferedOutputStream(destination);
GZIPOutputStream zippedDestination = new GZIPOutputStream(bufferedDestination);
OutputStreamWriter destinationWriter = new OutputStreamWriter(zippedDestination);
There is one very common Reader/Writer pair: BufferedReader and BufferedWriter. Unlike the stream buffering classes, which don’t add any new functionality, BufferedReader and BufferedWriter add additional methods for handling strings. In particular, BufferedReader adds the readLine( ) method (which reads a line of text), and BufferedWriter adds the newLine( ) method, which appends a line separator to the output.
These classes are very handy when reading or writing complex data. For example, a newline character is often a useful way to signal “end of current record.” To illustrate their use, here is the action listener from ViewFileFrame, rewritten to use BufferedReader:
private class ViewFileAction extends AbstractAction {
    public void actionPerformed(ActionEvent event) {
        FileReader fileReader = _fileTextField.getFileReader( );
        if (null == fileReader) {
            _fileViewingArea.setText("Invalid file name");
        } else {
            try {
                copyReaderToViewingArea(fileReader);
                fileReader.close( );
            } catch (java.io.IOException ioException) {
                _fileViewingArea.setText("\n Error occurred while reading file");
            }
        }
    }
}

private void copyReaderToViewingArea(Reader reader) throws IOException {
    BufferedReader bufferedReader = new BufferedReader(reader);
    String nextLine;
    _fileViewingArea.setText("");
    // readLine( ) returns null at the end of the stream
    while (null != (nextLine = bufferedReader.readLine( ))) {
        _fileViewingArea.append(nextLine + "\n");
    }
}
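The "one record per line" idea can also be sketched without Swing. In the example below, LineRecords, writeRecords( ), and readRecords( ) are hypothetical names, and StringWriter and StringReader stand in for a file; newLine( ) terminates each record and readLine( ) recovers it.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LineRecords {
    // Writes each record followed by the platform's line separator.
    static String writeRecords(List<String> records) throws IOException {
        StringWriter sink = new StringWriter();
        BufferedWriter writer = new BufferedWriter(sink);
        for (String record : records) {
            writer.write(record);
            writer.newLine();
        }
        writer.flush(); // push buffered characters into the StringWriter
        return sink.toString();
    }

    // Reads records back, one per line; readLine( ) strips the separator.
    static List<String> readRecords(String text) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader(text));
        List<String> records = new ArrayList<String>();
        String line;
        while ((line = reader.readLine()) != null) {
            records.add(line);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        String text = writeRecords(Arrays.asList("alpha", "beta", "gamma"));
        System.out.println(readRecords(text)); // prints "[alpha, beta, gamma]"
    }
}
```

This works on any platform because readLine( ) accepts a line feed, a carriage return, or both as the end of a line, whichever separator newLine( ) happened to write.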
[3] Print orders can be cancelled by sending another message: a cancellation message. But the original message was still sent.
[4] You can use SoftReference (defined in java.lang.ref) to get a minimal level of control over the order in which objects are garbage collected.
[5] This example uses classes from the Java Swing libraries. If you would like more information on Swing, see Java Swing (O’Reilly) or Java Foundation Classes in a Nutshell (O’Reilly).