Second, many of the patents have expired (e.g., [11, 16]) or become obsolete. Finally, we do
not need to worry so much about complexity-reduction details that obscure the inherent simplic-
ity of the method. Current computational resources allow us to implement simple, efficient, and
royalty-free arithmetic coding.
5.2.1 Notation
Let $\Omega$ be a data source that puts out symbols $s_k$ coded as integer numbers in the set $\{0, 1, \ldots, M-1\}$, and let $S = \{s_1, s_2, \ldots, s_N\}$ be a sequence of $N$ random symbols put out by $\Omega$ [1, 4, 5, 21, 55, 56]. For now, we assume that the source symbols are independent and identically distributed [22], with probability

$$p(m) = \mathrm{Prob}\{s_k = m\}, \qquad m = 0, 1, \ldots, M-1, \quad k = 1, 2, \ldots, N. \tag{5.1}$$
We also assume that for all symbols we have $p(m) \neq 0$, and define $c(m)$ to be the cumulative distribution

$$c(m) = \sum_{s=0}^{m-1} p(s), \qquad m = 0, 1, \ldots, M. \tag{5.2}$$

Note that $c(0) = 0$, $c(M) = 1$, and

$$p(m) = c(m+1) - c(m). \tag{5.3}$$
We use boldface letters to represent the vectors with all $p(m)$ and $c(m)$ values, i.e.,

$$\mathbf{p} = [\,p(0)\;\; p(1)\;\; \cdots\;\; p(M-1)\,], \qquad \mathbf{c} = [\,c(0)\;\; c(1)\;\; \cdots\;\; c(M)\,].$$
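These definitions translate directly into code: the cumulative distribution is a running sum of the probabilities, with $c(0) = 0$ and $c(M) = 1$. A minimal sketch in Python (the probability vector `p` below is an arbitrary illustrative distribution, not one from the text):

```python
# Build the cumulative distribution c from a probability vector p,
# following Eq. (5.2): c(m) is the sum of p(s) for s < m,
# so c has M + 1 entries, with c(0) = 0 and c(M) = 1.
def cumulative(p):
    c = [0.0]
    for prob in p:
        c.append(c[-1] + prob)
    return c

p = [0.5, 0.25, 0.125, 0.125]  # illustrative distribution, M = 4
c = cumulative(p)
print(c)  # [0.0, 0.5, 0.75, 0.875, 1.0]

# Sanity check of Eq. (5.3): p(m) = c(m + 1) - c(m)
assert all(abs(c[m + 1] - c[m] - p[m]) < 1e-12 for m in range(len(p)))
```

The encoder and decoder both need fast access to $c(m)$, which is why it is kept as a precomputed vector rather than being recomputed per symbol.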
We assume that the compressed data (output of the encoder) is saved in a vector (buffer) $\mathbf{d}$. The output alphabet has $D$ symbols, i.e., each element in $\mathbf{d}$ is a number in the set $\{0, 1, \ldots, D-1\}$.
Under the assumptions above, an optimal coding method [1] codes each symbol $s$ from $\Omega$ with an average number of bits equal to

$$B(s) = -\log_2 p(s) \text{ bits.} \tag{5.4}$$
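Equation (5.4) is easy to evaluate for any distribution; the sketch below (using the same illustrative distribution as above) computes the per-symbol code lengths and the resulting average bit rate, which for i.i.d. symbols is the source entropy:

```python
import math

# Optimal per-symbol code length B(s) = -log2 p(s), Eq. (5.4),
# and the average rate sum_m p(m) * B(m) over an illustrative distribution.
def code_length(prob):
    return -math.log2(prob)

p = [0.5, 0.25, 0.125, 0.125]
lengths = [code_length(q) for q in p]
print(lengths)  # [1.0, 2.0, 3.0, 3.0]

avg_bits = sum(q * l for q, l in zip(p, lengths))
print(avg_bits)  # 1.75 bits/symbol
```

Note that the optimal lengths are fractional in general; arithmetic coding approaches this bound without rounding each $B(s)$ up to a whole number of bits, which is where it gains over symbol-by-symbol prefix codes.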
Example 5.1. Data source $\Omega$ can be a file with English text: each symbol from this source is a single byte with the ASCII (American Standard Code for Information Interchange) representation of a character. This data alphabet contains $M = 256$ symbols, and symbol numbers
are defined by the ASCII standard. The probabilities of the symbols can be estimated by gathering
statistics using a large number of English texts. Table 5.1 shows some characters, their ASCII
symbol values, and their estimated probabilities. It also shows the number of bits required to
code symbol $s$ in an optimal manner, $-\log_2 p(s)$. From these numbers we conclude that, if data
symbols in English text were i.i.d., then the best possible text compression ratio would be about
2:1 (4 bits/symbol). Specialized text compression methods [8, 10, 29, 41] can yield significantly
better compression ratios because they exploit the statistical dependence between letters.
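The probability-estimation step of Example 5.1 can be sketched as follows: count byte frequencies in a text, treat relative frequencies as $p(s)$, and average the optimal code lengths $-\log_2 p(s)$. The sample string here is a toy stand-in, not a real corpus, so its bit rate will not match the roughly 4 bits/symbol figure quoted for large English texts:

```python
import math
from collections import Counter

# Estimate symbol probabilities from observed byte counts, then compute
# the average optimal code length  sum_s p(s) * (-log2 p(s))  in bits/symbol.
def avg_optimal_bits(data):
    counts = Counter(data)
    n = len(data)
    return sum((k / n) * -math.log2(k / n) for k in counts.values())

sample = b"arithmetic coding is simple and efficient"  # toy sample only
print(round(avg_optimal_bits(sample), 2))
```

This is a zero-order estimate: it sees each byte in isolation, so it cannot capture the inter-letter dependence that the specialized methods cited above exploit.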
This first example shows that our initial assumptions about data sources are rarely found in
practical cases. More commonly, we have the following issues.
