3 Probabilistic Toolbox

In this chapter, we review some important technical results in probability theory, which mostly concern real random variables and related mathematical concepts. Formal linguists may rightly ask what use real numbers are in language research. Indeed, language structure is largely about discrete symbols and hierarchical categorical distinctions (Adger, 2019). Yet rational and computable real numbers arise naturally when we count the frequencies of those symbols and categories or compute other statistics for linguistic data, such as fractions, means, or variances. One can debate whether patterns are harder to see in large tables of numbers than in collections of categorical data, but we cannot deny that fractional numbers in language matter to quantitative and computational linguists as well.

Moreover, we should be aware that it is mathematically possible to encode an infinite sequence of binary digits as a single real number. Except for countably many cases, such as binary numbers with two distinct expansions, this mapping between binary sequences and real numbers is one‐to‐one. In particular, the text of any novel or even a collection of large ...
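
The encoding of binary sequences as real numbers can be illustrated on finite prefixes. The sketch below assumes the standard binary expansion, in which digits b_1, b_2, ... are mapped to the sum of b_i / 2^i; the chapter's own notation is not visible in this excerpt and may differ. The function names `encode_prefix` and `decode_prefix` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def encode_prefix(bits):
    """Map a finite prefix b_1..b_n to the exact rational sum of b_i / 2**i.

    Assumption: the standard binary expansion is the intended encoding.
    """
    return sum(Fraction(b, 2**i) for i, b in enumerate(bits, start=1))


def decode_prefix(y, n):
    """Recover the first n binary digits of y in [0, 1) by repeated doubling."""
    bits = []
    for _ in range(n):
        y *= 2
        bit = int(y >= 1)   # the next digit is the integer part after doubling
        bits.append(bit)
        y -= bit            # keep only the fractional part
    return bits


# Round trip on a short prefix: 0.101 in binary is 1/2 + 1/8.
y = encode_prefix([1, 0, 1])
digits = decode_prefix(y, 3)

# The ambiguous case: 0.0111...1 (ten ones) falls short of 0.1000... = 1/2
# by exactly 2**-11, and the gap vanishes as the run of ones grows.
gap = Fraction(1, 2) - encode_prefix([0] + [1] * 10)
```

Exact rational arithmetic is used here so that the shrinking gap between the two expansions is visible without floating-point rounding. Only finite prefixes can be handled this way; the infinite sequence corresponds to the limit of these rationals.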
