8 Power Laws for Information

This chapter opens the exposition of more original research, done mostly by the author of this book. In the chapters to follow, we would like to discuss stochastic processes that satisfy certain power laws common to natural language. As we have identified in Chapter 1, there are several such power laws possibly exhibited by texts in natural language: power‐law growth of block mutual information (1.132), power‐law logarithmic growth of maximal repetition (1.133), and power‐law decay of mutual information between individual characters (1.134). Plausibly, there may be more such laws. It is important to understand to what extent these laws are mathematical consequences of one another. For possible applications in engineering and artificial intelligence, it is also vital to understand general constructions of stochastic processes that satisfy these power laws as well as to comprehend reasons why these laws are satisfied.
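Equations (1.132)–(1.134) are not reproduced in this excerpt, so the display below is only an assumed sketch of the general shape that such laws take, with I denoting mutual information, L(X_1^n) the length of the maximal repetition within the first n characters of the text, and β, α, γ unspecified positive exponents; the binding statements remain those of Chapter 1.

\[ I\bigl(X_1^n ; X_{n+1}^{2n}\bigr) \propto n^{\beta}, \qquad 0 < \beta < 1, \]
\[ L\bigl(X_1^n\bigr) \propto (\log n)^{\alpha}, \qquad \alpha > 1, \]
\[ I\bigl(X_i ; X_{i+n}\bigr) \propto n^{-\gamma}, \qquad \gamma > 0. \]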

We will begin, in this chapter, with a discussion of power laws that are connected to the growth of mutual information between two adjacent blocks of a random text. In the context of natural language modeling, the power‐law growth of block mutual information is called the relaxed Hilberg hypothesis, as discussed in Section 1.9. A large body of our research concerned relating this hypothesis to other plausible statistical patterns of natural language. Some natural candidates have been the famous Zipf law concerning the distribution of word frequencies.
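To illustrate what the relaxed Hilberg hypothesis asks one to measure, the following minimal sketch, which is not taken from this book, estimates the mutual information between two adjacent blocks of length n by using compressed length as a crude stand-in for block entropy. If the hypothesis holds, the proxy for I(X_1^n; X_{n+1}^{2n}) = 2H(n) - H(2n) should grow roughly like n^β.

import zlib

def code_length(s: bytes) -> int:
    # Compressed length in bits: a crude upper bound on the entropy of s.
    return 8 * len(zlib.compress(s, 9))

def block_mutual_information(text: bytes, n: int) -> int:
    # Proxy for I(X_1^n; X_{n+1}^{2n}) = H(X_1^n) + H(X_{n+1}^{2n}) - H(X_1^{2n}),
    # with compressed lengths standing in for block entropies.
    left, right, both = text[:n], text[n:2 * n], text[:2 * n]
    return code_length(left) + code_length(right) - code_length(both)

# Toy usage on a highly repetitive "text"; real experiments would use a corpus.
sample = b"the quick brown fox jumps over the lazy dog. " * 1000
for n in (256, 1024, 4096, 16384):
    print(n, block_mutual_information(sample, n))

Compression-based proxies of this kind are common in empirical studies of Hilberg-type laws, but any universal code only overestimates entropy, and zlib's 32 KB window further distorts the estimate for large n, so the measured exponent is at best indicative.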
