Think Complexity by Allen B. Downey

Chapter 5. Scale-Free Networks

Zipf’s Law

Zipf’s law describes a relationship between the frequencies and ranks of words in natural languages; see http://en.wikipedia.org/wiki/Zipf%27s_law. The “frequency” of a word is the number of times it appears in a body of work. The “rank” of a word is its position in a list of words sorted by frequency. The most common word has rank 1, the second most common has rank 2, etc.

Specifically, Zipf’s law predicts that the frequency, f, of the word with rank r is:

$$f = c \, r^{-s}$$

where s and c are parameters that depend on the language and the text.

If you take the logarithm of both sides of this equation, you get:

$$\log f = \log c - s \log r$$

So if you plot $\log f$ versus $\log r$, you should get a straight line with slope $-s$ and intercept $\log c$.
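To make the straight-line relationship concrete, here is a minimal sketch that estimates s and c by fitting a least-squares line to log f versus log r. The function name `fit_zipf` and the synthetic frequencies are assumptions made for illustration; they do not come from the book, and real word counts will only follow the line approximately.

```python
import math


def fit_zipf(freqs):
    """Estimate (s, c) from frequencies sorted in descending order.

    Fits log f = log c - s log r by ordinary least squares, where the
    rank r of the i-th frequency (0-based) is i + 1.
    """
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals -s
    intercept = mean_y - slope * mean_x  # equals log c
    return -slope, math.exp(intercept)


# Synthetic data that follows f = c * r**(-s) exactly (hypothetical values).
freqs = [1000 * r ** -1.0 for r in range(1, 51)]
s, c = fit_zipf(freqs)
print(round(s, 3), round(c, 1))
```

The base of the logarithm does not affect the slope, so natural logarithms work as well as base 10; only the intercept has to be converted back to c with the matching exponential.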

Example 5-1. 

Write a program that reads a text from a file, counts word frequencies, and prints one line for each word in descending order of frequency. You can test it by downloading ...
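One possible starting point for this exercise is sketched below. It is not the book's solution: the function names, the output format (frequency followed by word), and the choices to lowercase words and strip ASCII punctuation are all assumptions.

```python
import string
import sys
from collections import Counter


def word_frequencies(lines):
    """Count words in an iterable of lines.

    Assumption: words are whitespace-separated, compared case-insensitively,
    with leading and trailing ASCII punctuation removed.
    """
    counts = Counter()
    for line in lines:
        for word in line.split():
            word = word.strip(string.punctuation).lower()
            if word:
                counts[word] += 1
    return counts


def print_frequencies(filename):
    """Print one line per word, in descending order of frequency."""
    with open(filename, encoding="utf-8") as fp:
        counts = word_frequencies(fp)
    for word, freq in counts.most_common():
        print(freq, word)


# Run as: python <this script> <text file>
if __name__ == "__main__" and len(sys.argv) == 2:
    print_frequencies(sys.argv[1])
```

Because `most_common()` returns the words sorted by count, the position of each word in that list (starting from 1) is its rank, which is what a log-log plot of frequency versus rank needs.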