With LZ77, we try to compress data by encoding phrases from a look-ahead buffer as tokens referencing phrases in a sliding window. To uncompress the data, we decode each token into the phrase or symbol it represents. To do this, we must continually update the sliding window so that at any one time it looks the same as it did during the compression process. In the implementation presented here, a symbol in the original data is one byte.
The lz77_compress operation (see Example 14.5) compresses data using LZ77. It begins by writing the number of symbols in the data to the buffer of compressed data and initializing the sliding window and look-ahead buffer. The look-ahead buffer is then loaded with symbols.
Compression takes place inside of a loop that iterates until
there are no more symbols to process. We use
ipos to keep track of the current byte
being processed in the original data, and
opos to keep track of the current bit we
are writing to the buffer of compressed data. During each iteration
of the loop, we call compare_win to determine
the longest phrase in the look-ahead buffer that matches one in the
sliding window. The compare_win function
returns the length of the longest match.
When a match is found,
offset to the position of the match in
the sliding window and
next to the symbol
in the look-ahead buffer immediately after the match. In this case,
we write a phrase token to the compressed data (see Figure 14.4 ...