4 Ergodic Properties

According to a preformal intuition, the probability of any event can be defined as the limiting relative frequency of this event in an infinite sequence of repeated experiments. In probability theory, this sort of statement can be formulated as a theorem, the law of large numbers. Namely, for a sequence of independent identically distributed (IID) real random variables, the averages tend to the common expectation almost surely as the number of averaged random variables tends to infinity – see Problem 3.8, where we have discussed the Hoeffding inequality. This result is the main motivation for the frequentist interpretation of probability. As we have noted in the introductory Section 1.6, another well-known interpretation of probability is called Bayesian; in this interpretation, probabilities are the betting odds of an intelligent agent making predictions.
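The law of large numbers is easy to observe numerically. The following sketch (standard library only; the sample size and seed are arbitrary choices for illustration) simulates IID Bernoulli trials with expectation 0.5 and tracks the running average, which should settle near 0.5 as the number of trials grows:

```python
import random

def running_average(samples):
    """Return the sequence of partial averages of the given samples."""
    total = 0.0
    averages = []
    for i, x in enumerate(samples, start=1):
        total += x
        averages.append(total / i)
    return averages

random.seed(0)  # fixed seed, chosen arbitrarily for reproducibility

# IID Bernoulli(0.5) trials: by the strong law of large numbers,
# the running average converges almost surely to the expectation 0.5.
flips = [1.0 if random.random() < 0.5 else 0.0 for _ in range(100_000)]
avgs = running_average(flips)

for n in (10, 1_000, 100_000):
    print(f"after {n:>6} trials: average = {avgs[n - 1]:.4f}")
```

For dependent processes such as natural language, no such convergence is guaranteed in general, which is precisely the gap that the ergodic theorems discussed in this chapter are meant to fill.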

Linguists may rightly object that there are no repeatable experiments and no probabilistically independent variables in language, so the frequentist interpretation of probability need not be valid for natural language. Partly accepting this point of view, information theorists investigating the phenomenon of human communication sought generalizations of the law of large numbers for dependent stochastic processes. They found a plausible generalization in ergodic theory, a branch of mathematics that sprang from pondering the origins of randomness and probability in classical mechanics, a branch ...
