Think Complexity

by Allen B. Downey
March 2012
Content level: Beginner
160 pages
4h 6m
English
O'Reilly Media, Inc.

Chapter 5. Scale-Free Networks

Zipf’s Law

Zipf’s law describes a relationship between the frequencies and ranks of words in natural languages; see http://en.wikipedia.org/wiki/Zipf%27s_law. The “frequency” of a word is the number of times it appears in a body of work. The “rank” of a word is its position in a list of words sorted by frequency. The most common word has rank 1, the second most common has rank 2, etc.

Specifically, Zipf’s law predicts that the frequency, $f$, of the word with rank $r$ is:

$$ f = c r^{-s} $$

where $s$ and $c$ are parameters that depend on the language and the text.

If you take the logarithm of both sides of this equation, you get:

$$ \log f = \log c - s \log r $$

So if you plot $\log f$ versus $\log r$, you should get a straight line with slope $-s$ and intercept $\log c$.
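As a rough illustration of the log-log relationship, the sketch below estimates s and c from a list of frequencies by fitting a straight line to log f versus log r with ordinary least squares. The function name `fit_zipf` and the synthetic data are assumptions made for this example, not code from the book, and a least-squares fit is only one of several ways to estimate the parameters.

```python
import math


def fit_zipf(freqs):
    """Estimate (s, c) from frequencies sorted in descending order.

    Hypothetical helper: fits log f = log c - s log r by ordinary
    least squares, where the rank r of freqs[i] is i + 1.
    Needs at least two frequencies, all greater than zero.
    """
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The slope of the fitted line is -s and the intercept is log c.
    return -slope, math.exp(intercept)


# Synthetic frequencies that follow f = c * r**(-s) exactly,
# with c = 1000 and s = 1 chosen arbitrarily for the demonstration.
freqs = [1000.0 * r ** -1.0 for r in range(1, 51)]
s, c = fit_zipf(freqs)
print(s, c)
```

With real text the points will not fall exactly on a line, so the fitted values are only approximate summaries of the data.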

Example 5-1. 

Write a program that reads a text from a file, counts word frequencies, and prints one line for each word in descending order of frequency. You can test it by downloading ...
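One possible starting point for this kind of program is sketched below. It is not the book's solution: the function names, the regular-expression tokenizer, and the output format (frequency followed by the word) are all assumptions made for illustration.

```python
import re
import sys
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercased word to its frequency.

    Rough tokenizer (an assumption of this sketch): runs of ASCII
    letters, optionally joined by straight apostrophes, as in "don't".
    """
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(words)


def print_frequencies(counts):
    """Print one line per word, in descending order of frequency."""
    for word, freq in counts.most_common():
        print(freq, word)


def main(filename):
    # Read the whole file at once; fine for a single book-length text.
    with open(filename, encoding="utf-8") as fp:
        counts = count_words(fp.read())
    print_frequencies(counts)


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
```

Run it with the name of a text file as the only command-line argument. Because `most_common` returns words in descending order of frequency, the position of each line in the output is the word's rank.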


ISBN: 9781449331672