Understanding Compression by Aleks Haecky, Colton McAnlis

Chapter 7. Dictionary Transforms

Even though information theory was created in the 1940s, Huffman encoding in the 1950s, and the Internet in the 1970s, it wasn’t until the 1980s that data compression truly became of practical interest.

As the Internet took off, people began to share images and other data formats that are considerably larger than text. This was during a time when bandwidth and storage were either limited, expensive, or both, and data compression became the key to alleviating these bottlenecks.

Note

With mobile devices on the march to world dominance, we are actually experiencing these same bottlenecks all over again today.

Although variable-length coding (VLC) was churning away at content, the fact that it was bound by entropy placed a hard limit on how far compression could go. So, while the majority of researchers were trying to find more efficient variable-length encodings, a few researchers found new ways of preprocessing a stream to make statistical compression more effective.

The result was what’s called “dictionary transforms,” which completely changed how people thought about data compression and how valuable it was to everyday users. Suddenly, compression became useful for all sorts of data types. So useful, in fact, that all of today’s dominant compression algorithms (think gzip or 7-Zip) use a dictionary transform as their core transformation step. So, let’s see what it’s all about.

A Basic Dictionary Transform

Statistical compression mostly ...
