Even if you are familiar with *binary numbers*, DO NOT SKIP THIS CHAPTER. We are going to begin digging into *information theory* as well, which is required for understanding the rest of this book.

It might seem a bit odd to start a book about data compression with a primer on binary numbers. Bear with us here. Everything in data compression is about reducing the number of bits used to represent a given data set. To expand on this concept, and the ramifications of its mathematics, let’s just take a second and make sure everyone is on the same page.

Modern human mathematics is built around the decimal—base 10—number system.^{1}

This system makes it possible for us to use the digits [0,1,2,3,4,5,6,7,8,9] strung together to represent numeric values. Back in elementary school, you might have been exposed to the concept of numeric columns, where, for example, the value 193 is split into three columns of hundreds, tens, and ones.

Hundreds | Tens | Ones |
---|---|---|
1 | 9 | 3 |

Effectively, 193 is equivalent to 1 * 100 + 9 * 10 + 3. And as soon as you grasped that pattern, maybe you realized that you could count to any number.

Later, when you learned about exponents, you were able to replace the “hundreds” and “tens” with their “base ten to the power” equivalents, and a new pattern emerged.

10^{2} | 10^{1} | 10^{0} |
---|---|---|
1 | 9 | 3 |

So:

193 = 1 * 100 + 9 * 10 + 3 = (1 * 10^{2}) + (9 * 10^{1}) + (3 * 10^{0})
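The same column-by-column expansion can be sketched in a few lines of Python. This is only an illustration of the pattern above, not code from the book; the function name `positional_terms` and its `base` parameter are our own assumptions.

```python
def positional_terms(value, base=10):
    """Split a non-negative integer into (digit, power) pairs,
    most significant column first. Hypothetical helper for illustration."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return [(0, 0)]
    digits = []
    while value > 0:
        # Peel off the lowest column: the remainder is that column's digit.
        value, digit = divmod(value, base)
        digits.append(digit)
    # digits[i] belongs to the column base**i; list the highest column first.
    return [(d, p) for p, d in reversed(list(enumerate(digits)))]


terms = positional_terms(193)
print(terms)  # [(1, 2), (9, 1), (3, 0)]

# Summing digit * base**power rebuilds the original value.
print(sum(d * 10**p for d, p in terms))  # 193
```

Each pair reads as "digit times ten to the power," so `(1, 2)` is the 1 in the 10^{2} column, `(9, 1)` is the 9 in the 10^{1} column, and `(3, 0)` is the 3 in the 10^{0} column.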

Because each column can contain ...
