Errata

Errata for Understanding Compression


The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Color Key: Serious technical mistake, Minor technical mistake, Language or formatting error, Typo, Question, Note, Update

Version	Location	Description	Submitted by	Date submitted
ePub	Page 1 1	Hi, I was wondering why you're bashing 'math' so hard in your book. I am not a fan of demonizing math, and you shouldn't convey the impression that math is something bad and always hard to learn. I always mentally discount books that try to bash math or claim that math is not really needed. You should consider replacing your 'math bashing' with something more constructive, like: "You will learn some basic math in the e-book, and for further study there are some superb books that give you deeper insight into compression theory." Greetings, Tobias	Tobias Köck	May 12, 2016
PDF	Page 10 Para commencing "Since each....	The word "in" is repeated: "in in our current.....".	Biterg	Jun 24, 2016
PDF	Page 11 towards bottom	The binary-to-decimal description is wrong: "= 9 + 1 = 10" should be "= 8 + 2 = 10".	Biterg	Jun 24, 2016
Printed	Page 15 paragraph text	"which interestingly enough, is exactly the binary representation" It's not clear whether the result (10) is a coincidence caused by a poor selection of example values, or a deep consequence of the mathematics being demonstrated.	Eric Lawrence	Aug 12, 2016
Printed	Page 16 First equation	I haven't done "real" math in a long time, but isn't there an extra minus sign in: log2(x) = -(log(x) / log(2)) It should instead be: log2(x) = (log(x) / log(2)) right? e.g. https://en.wikipedia.org/wiki/Binary_logarithm#Conversion_from_other_bases	Eric Lawrence	Aug 12, 2016
ePub	Page 17 about BWT	Hi, on page 17 you say: "Researchers were able to find that the BWT algorithm was the most efficient way to store DNA information in a compressed form". But BWT is not a compression algorithm. In fact, you say so yourself on page 125: "This does not provide compression in-and-of-itself, but allows you to hand off the transformed stream to other compression systems." So I think page 17 could say a little more about genome compression. You could cover horizontal and vertical compression and show the major results. You could also mention the compression of reads (FASTQ), also called NGS sequencing data compression. Congrats on your work.	Kelvin Kredens	Jun 13, 2016
PDF	Page 21 3rd paragraph	iPod was introduced in 2001, not 1998. I remember because the original iMac came out in ’98, and the iPod / CD-RW were after that, when Apple was pitching the Mac as the “digital hub” of our homes.	significant.bit	May 29, 2016
Printed	Page 21 Just before end of section	The phrase: "(give or take a few qubits)." seems like it was intended to be some sort of joke. Is that the case, or have I missed something somewhere along the way? Given that, as the authors say, "math is hard" it seems inappropriate to include a misleading aside here.	Eric Lawrence	Aug 12, 2016
PDF	Page 23 caption for Figure 2-2	a couple of issues: “Lenna” and the phrase “full portrait:” repeated.	significant.bit	May 29, 2016
PDF	Page 30 1st paragraph	Hi, your book is really interesting. The error I found on page 30 is minor: 2^3 = 8 and 2^1 = 2, but you put 2^3 = 9 and 2^1 = 1.	Anonymous	May 30, 2016
PDF	Page 31 definition	"information theory" is not bolded like other definitions	significant.bit	May 29, 2016
PDF	Page 34 3rd paragraph	The following sentence is incorrect: Perhaps the simplest way to encode some text information would be to number all of the English characters—A to Z—with numeric values 0–26. It should be "0-25" at the end of the sentence.	Paul Dacus	Jul 04, 2017
PDF	Page 34 2nd paragraph	The following sentences are incorrect: Perhaps the simplest way to encode some text information would be to number all of the English characters—A to Z—with numeric values 0–26. You could then use the number of pulses, along with pairing, to determine what digit you were transmitting. For example, you could translate “THE HAT” into 20-8-5 8-1-20. Either the numeric values need to be "1-26" not "0-26", or each numeric coded value at the end of the second sentence needs 1 subtracted from it; eg "19-7-4-7-0-19".	Paul Dacus	Jul 04, 2017
PDF	Page 37 footnote 3	implimentations	significant.bit	May 29, 2016
PDF	Page 38 1st paragraph	"This is set in the mathematical sense: a group of numbers..." symbols or items might be a better choice than numbers	significant.bit	May 29, 2016
PDF	Page 38 1st line	The original sequence has 5 Ds, not 4 as described later in the example.	significant.bit	May 29, 2016
PDF	Page 38 near bottom of page	“… your Entropy for the set.” Unclear whether you’re talking about the whole dataset or the subset of symbols used [ABCD].	significant.bit	May 29, 2016
PDF	Page 40 after the definition	"To be practical and concrete, let’s start with a groupof letters, say:" needs a space "To be practical and concrete, let’s start with a group of letters, say:"	Anonymous	Jun 19, 2016
PDF	Page 42 mid page	“This type of transform is known as Delta Coding, or the process of encoding a series of numbers as the difference from the previous number.” value might be a better word than number here "... a series of values ... from the previous value."	significant.bit	May 29, 2016
Printed	Page 48 Elias Gamma table	The values for n=1 and n=2 don't seem to be right. The code for n=2 should probably be "100", because you encode the value of 2 by doing (2^1 + 0). The code "101" would match n=3. In the table on the prior page, it's explained that the number 0 is not representable in unary, which suggests that n=1 cannot use (2^0), which means that the encoding of "0" must be some kind of special case that isn't listed in the algorithm.	Eric Lawrence	Aug 15, 2016
Printed	Page 49 Explanation following Elias Delta table	The algorithm specified after "To decode Elias delta" seems like it would fail to correctly decode when the code is "0" as there wouldn't be any way to tell if the code word is done. Following the encoding algorithm before the table, I encode n=1 as "10".	Eric Lawrence	Aug 15, 2016
Printed	Page 50 VarInt block	The algorithm here says that the "lower 7 bits are used to store the two's complement representation of the number", language mirrored in Google's docs (https://developers.google.com/protocol-buffers/docs/encoding#varints). Mentioning "two's complement" rather than just "binary representation" implies that the scheme supports negative numbers, but it's not clear how that would work. After stripping the MSB from each byte, is the most-significant-remaining-bit of the last byte a flag indicating whether the number is negative? Or does this scheme not handle negative numbers and the use of the term "two's complement" is just misleading?	Eric Lawrence	Aug 15, 2016
Printed	Page 52 top of page	The table header block is unnecessarily repeated (widow) on the next page, presumably because the "a) More information on Elias omega..." text was accidentally widowed over to the next page. This is a bit confusing.	Eric Lawrence	Aug 15, 2016
Printed	Page 61 1st paragraph	When compression formats for video are mentioned, webM is also mentioned. But webM is a container format, not a compression codec.	Matteo Contrini	Dec 05, 2017
PDF	Page 64 4th paragraph in the section "Picking the Right Output Value"	It says that the encoding of the string "GGB" is 83, which needs log2(83) = 7 bits, which needs 1.42 bits per symbol. However, that calculation looks wrong. We need log2(83) / 3 = 2.12 bits per symbol.	Abhinav Upadhyay	Sep 23, 2023
PDF	Page 69 first line	On this first line .. write n=2^(N + L) (where L =n-2^N)... Should this be n = (2^N) + L like the example below on the same page?	Anonymous	Jun 21, 2016
PDF	Page 69 The example table	The example for n = 2 has 2^1 + 0 = 101. Decoding this, the first part is B(10) = 2, so 2^2; the remainder is B(01) = 1; decoding then gives 2^2 + 1 = 5. Should n = 2 be B(01) = 2^0 + 1?	Anonymous	Jun 25, 2016
Printed	Page 72 8th paragraph	On section "creating the reference table", the text explains that to calculate the values in a row, it is necessary to multiply the row number with the symbol probability. The correct operation is division, as also reported in the table on the next page.	Antonio	Jul 02, 2020
Printed	Page 86 Point 7	The adaptive VLC steps miss the one that should output the "B" character for the first time.	Matteo Contrini	Feb 04, 2018
Printed	Page 91 Second to last paragraph	WebM is again mentioned as a video compression codec/encoder/algorithm, but WebM is a container format	Matteo Contrini	Feb 04, 2018
Printed	Page 97 1st figure	The entropy of the two-symbol set should be 0.97, not 2.2, making it vastly superior to the other tokenization options. Less critically, the entropy of the 1st figure on the facing page (p96) is also incorrect: it should be 2.42, not 2.38.	Kevin Nygaard	Oct 02, 2023
Printed	Page 151 1st line	"context missing" should be "context mixing"	Matteo Contrini	Feb 13, 2018
Printed	Page 157 second sentence	"GZIP, BZIP, and now" should be "GZIP, and now" For most of the history of HTTP, DEFLATE and its wrapper GZIP were the only supported means of compression. BZIP2 is not a part of the "Standard HTTP stack" -- the only browser I'm aware of that ever supported BZIP2 was an early version of Chrome, and that code was ripped out long ago. Today, everybody does GZIP/DEFLATE, Chrome+Firefox+Opera do Brotli, Chrome+Opera do SDCH, and Opera does lzma.	Eric Lawrence	Aug 04, 2016
Printed	Page 168 image block	Image quality in the black and white printed text is so low that the graphic loses all ability to convey "degrading quality"; each version looks essentially identical.	Eric Lawrence	Aug 12, 2016
Printed	Page 170 image block	Image quality in the black and white printed text is so low that the graphic loses all ability to convey a distinction between 128 and 32 colors; each version looks essentially identical.	Eric Lawrence	Aug 12, 2016
Printed	Page 197 last word on page	the word "battery" should instead be "radio". The radio being on is what drains the battery. The battery is always "on".	Eric Lawrence	Aug 04, 2016
Printed	Page 200 first sentence	"have a valid computing experience" It's not clear what is meant by "valid" here. Maybe use "good" or "compelling" or any similar word.	Eric Lawrence	Aug 04, 2016
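
Note: several of the arithmetic points raised in the reports above can be checked mechanically. The following is a minimal Python sketch written for this note, not code from the book. The function names (log2_via_change_of_base, letter_codes, gamma_parts, encode_varint) are hypothetical, and the varint encoder assumes the unsigned base-128 scheme described in the Protocol Buffers documentation linked in the Page 50 report, so it does not settle the question about negative numbers.

```python
import math

# Pages 11 and 30: binary 1010 is 2^3 + 2^1 = 8 + 2 = 10.
binary_1010 = int("1010", 2)
powers_sum = 2**3 + 2**1


def log2_via_change_of_base(x: float) -> float:
    """Page 16: change of base, log2(x) = log(x) / log(2), with no minus sign."""
    return math.log(x) / math.log(2)


# Page 64: "GGB" is three symbols encoded as the value 83.
bits_per_symbol = math.log2(83) / 3


def letter_codes(text: str, base: int) -> list[int]:
    """Page 34: number A..Z starting at `base` (0 gives 0-25, 1 gives 1-26)."""
    return [ord(c) - ord("A") + base for c in text.upper() if c.isalpha()]


def gamma_parts(n: int) -> tuple[int, int]:
    """Pages 48 and 69: split n >= 1 into (N, L) such that n = 2^N + L."""
    if n < 1:
        raise ValueError("n must be at least 1")
    exponent = n.bit_length() - 1      # N = floor(log2(n))
    remainder = n - (1 << exponent)    # L = n - 2^N
    return exponent, remainder


def encode_varint(n: int) -> bytes:
    """Page 50: unsigned base-128 varint (assumed Protocol Buffers style).

    Each byte carries 7 payload bits, least significant group first;
    the top bit is set when more bytes follow.
    """
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


if __name__ == "__main__":
    print(binary_1010, powers_sum)
    print(log2_via_change_of_base(8))
    print(bits_per_symbol)
    print(letter_codes("THE HAT", 1), letter_codes("THE HAT", 0))
    print(gamma_parts(2), gamma_parts(3))
    print(encode_varint(300).hex())
```

Under these assumptions, letter_codes("THE HAT", 1) reproduces the 20-8-5 8-1-20 sequence quoted from page 34, while a base of 0 gives the 19-7-4 7-0-19 alternative, and gamma_parts(2) returns (1, 0), matching the 2^1 + 0 decomposition cited for n = 2.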