5 Entropy and Information

Language is a system of communication, and communication can be regarded as a process of information transmission. From this point of view, the study of language calls for an understanding of the notion of information. The standard mathematical theories of information are theories of the amount of information, which turns out to be a more fundamental concept than the structure of information: the amount of information is preserved by reversible operations on the data, whereas the structure of information can be distorted by them. This difference is probably the reason for the limited interaction between linguists, who prefer to study the structure of information, and information theorists, who tend to investigate the amount of information.
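The invariance claim above can be illustrated with a small sketch. It assumes one particular, very simple measure of the amount of information (the plug-in entropy of symbol frequencies) and one particular reversible operation (a one-to-one relabeling of symbols); the function name `empirical_entropy`, the sample string, and the recoding table are hypothetical choices made for illustration and are not taken from the text.

```python
from collections import Counter
from math import isclose, log2


def empirical_entropy(symbols):
    """Plug-in entropy estimate, in bits per symbol, from symbol frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * log2(c / n) for c in counts.values())


text = "abracadabra"

# Reversible operation: a one-to-one relabeling of the alphabet (illustrative).
code = {"a": "z", "b": "y", "r": "x", "c": "w", "d": "v"}
recoded = "".join(code[ch] for ch in text)

# Irreversible operation: merging two symbols into one loses information.
merged = "".join("a" if ch == "b" else ch for ch in text)

h_text = empirical_entropy(text)
h_recoded = empirical_entropy(recoded)
h_merged = empirical_entropy(merged)

print(f"original: {h_text:.3f} bits/symbol")
print(f"recoded:  {h_recoded:.3f} bits/symbol")
print(f"merged:   {h_merged:.3f} bits/symbol")
```

The relabeled string looks nothing like the original, so its surface structure has changed, yet the frequency-based entropy is identical. Merging two symbols cannot be undone, and the estimate drops.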

There is one more problem with studying the amount of information in empirical data. In the standard mathematical theories of information, the amount of information is defined relative to a probability distribution or a universal computer. Asymptotically, the choice of a particular probability distribution or universal computer does not matter much, but if we perform experiments on finite samples, the choice can influence our estimates. Of course, we can confine ourselves to studying some concrete estimators of the amount of information, but these procedures introduce another level of approximation of ...
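The dependence on the reference model for finite samples can be sketched as follows. The example assumes that off-the-shelf compressors from the Python standard library (`zlib`, `bz2`, `lzma`) can serve as crude stand-ins for different reference machines; this is an illustrative assumption, not a method taken from the text, and the sample string is arbitrary.

```python
import bz2
import lzma
import zlib

# A short, highly repetitive sample (an arbitrary illustrative choice).
sample = ("the quick brown fox jumps over the lazy dog " * 20).encode("utf-8")

# Each compressor plays the role of a different reference machine.
compressors = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}

# Compressed length in bits: a rough upper bound on the amount of information.
lengths = {name: 8 * len(compress(sample)) for name, compress in compressors.items()}

for name, bits in lengths.items():
    print(f"{name}: {bits} bits ({bits / len(sample):.2f} bits per input byte)")
```

On a sample this short, fixed format overheads and the details of each algorithm make up a large share of the output, so the three figures can be expected to disagree noticeably. Such differences become relatively less important as the sample grows, which is the asymptotic insensitivity described above.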

