10 AMS Processes

An important property of natural language, and of programming languages as well, is double articulation: written texts consist of words, and words consist of letters. Whereas the number of distinct letters is bounded and rather small, the number of distinct words is difficult to bound, and we can imagine it as countably infinite. The theorems about facts and words discussed in Section 8.4 suggest that double articulation arises from a mere semantic constraint, namely that texts describe infinitely many independent elementary facts in a repetitive way. From this perspective, it is natural to switch between two alternative views of a text: one in which the text is a string of letters and another in which the text is a string of words.

By analogy, if we consider discrete stochastic processes, we can also switch between viewing them as sequences of random letters and as sequences of random words. It pays off to study this operation on stochastic processes and their probability measures in its own right. The operation does not preserve stationarity of the investigated process, but it does preserve a more general property, called asymptotic mean stationarity (AMS). Curiously, this natural operation has not been much investigated in the literature. We note that the theory of AMS measures and the respective operations on them is quite technical, and probably underdeveloped, but we will need it for the construction of some linguistically motivated examples of stochastic ...
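To make the loss of stationarity concrete, here is a minimal Python sketch of the words-to-letters operation in the simplest case we could think of. Everything in it is an illustrative assumption rather than a construction from the text: the words are drawn IID from a toy lexicon, and the helper names `letter_marginals` and `cesaro_mean` are hypothetical. The sketch computes the exact probability that a given letter occurs at each position of the concatenated text. It inspects only one-letter marginals, which is a necessary symptom of asymptotic mean stationarity (convergence of the averages of the shifted probabilities), not a verification of the full property.

```python
from fractions import Fraction


def letter_marginals(word_probs, letter, n):
    """Return exact P(X_i == letter) for i = 0..n-1.

    X_0, X_1, ... is the letter sequence obtained by concatenating IID words
    drawn from word_probs, a dict mapping non-empty words to probabilities.
    The IID word model is an assumption made for this illustration only.
    """
    # starts[i] = probability that some word begins exactly at position i.
    starts = [Fraction(0)] * n
    marginals = []
    for i in range(n):
        if i == 0:
            starts[i] = Fraction(1)  # the first word always starts at 0
        else:
            # A word starts at i if a word w started at i - len(w).
            starts[i] = sum(
                (p * starts[i - len(w)]
                 for w, p in word_probs.items() if len(w) <= i),
                Fraction(0),
            )
        # Position i shows `letter` if a word w started k places earlier
        # and the k-th letter of w equals `letter`.
        prob = Fraction(0)
        for w, p in word_probs.items():
            for k, ch in enumerate(w):
                if ch == letter and k <= i:
                    prob += starts[i - k] * p
        marginals.append(prob)
    return marginals


def cesaro_mean(values):
    """Arithmetic mean of a non-empty list of Fractions."""
    return sum(values, Fraction(0)) / len(values)


if __name__ == "__main__":
    half = Fraction(1, 2)
    # Toy lexicon (hypothetical): the words "a" and "bc", equally likely.
    m = letter_marginals({"a": half, "bc": half}, "a", 200)
    print("P(X_0 = a) =", m[0], " P(X_1 = a) =", m[1])
    print("Cesaro mean over 200 positions =", float(cesaro_mean(m)))

    # Degenerate lexicon: the single word "ab" repeated forever.
    d = letter_marginals({"ab": Fraction(1)}, "a", 100)
    print("First marginals:", d[:4])
    print("Cesaro mean over 100 positions =", cesaro_mean(d))
```

In the second lexicon the word process is constant, hence stationary, yet the letter marginals alternate between 1 and 0 and have no limit; only their running averages settle. This is the kind of behavior that the AMS property is designed to capture.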
