Chapter 10. Compressing Streams

The java.util.zip package, shown in Figure 10-1, contains six stream classes and another half dozen assorted classes that read and write data in zip, gzip, and inflate/deflate formats. Java uses these classes to read and write JAR archives and to display PNG images. The java.util.zip classes are well-suited for general-purpose compression and decompression.

The java.util.zip package hierarchy
Figure 10-1. The java.util.zip package hierarchy

Inflaters and Deflaters

The java.util.zip.Deflater and java.util.zip.Inflater classes provide compression and decompression services for all other classes. These two classes support several related compression formats, including zlib, deflate, and gzip. Each of these formats is based on the LZ77 compression algorithm (named after the inventors, Jakob Ziv and Abraham Lempel), though each has a different way of storing metadata that describes an archive’s contents. Since compression and decompression are extremely CPU-intensive operations, these classes are usually implemented as Java wrappers around native methods written in C.

zip, gzip, and zlib all compress data in more or less the same way. Repeated bit sequences in the input data are replaced with pointers back to the first occurrence of that bit sequence. Other tricks are used, but this is basically how these compression schemes work, and it has certain implications for compression and decompression ...

Get Java I/O, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.