Efficiently compressing a byte array

Compressing a byte array is a matter of recognizing repeating patterns within a byte sequence and devising a method that can represent the same underlying information to take advantage of these discovered repetitions.

To get a rough idea of how this works, imagine having a sequence as:

["a" "a" "a" "b" "b" "b" "b" "b" "b" "b" "c" "c"]

It is intuitively more efficient to represent this as:

[3 times "a", 7 times "b", 2 times "c"]

Now, we are going to use a methodology based on a well-known algorithm, that is, the LZ77 compression method. LZ77 is, despite being quite old, the basis of most of all the well-known and currently used compression methods, especially the Deflate algorithm.

Note

Deflate is at the heart of the ...

Get Clojure Data Structures and Algorithms Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.