The uniformity of the font sizes I noted earlier is still a problem. The reason for this is that the tag counts are arranged in a power curve (Figure 13). Power curves are a very common phenomenon found in popularity or frequency data collected from human activity.
Figure 13. A power curve
There tend to be very few large values in the data, and lots and lots of small values. The problem with mapping a power curve to a limited set of font sizes is that the "long tail" of the power curve ends up being represented by just one or two font sizes. Many of the intermediate font sizes won't get used at all because of the larger gaps between the counts of the most popular words.
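To make the problem concrete, here is a minimal Python sketch of a straight linear mapping from counts to font sizes. The tag counts, the 10 to 36 point font range, and the name `linear_font_size` are all invented for illustration; they are not taken from the tag cloud code in this text.

```python
from collections import Counter

# Hypothetical tag counts that roughly follow a power curve (illustrative data only).
counts = [1000, 400, 150, 60, 25, 12, 8, 5, 3, 2, 2, 1, 1, 1, 1]

# Assumed font-size range, in points.
MIN_SIZE, MAX_SIZE = 10, 36


def linear_font_size(count, min_count, max_count):
    """Map a tag count linearly onto the font-size range."""
    if max_count == min_count:
        return MIN_SIZE  # all counts equal: avoid dividing by zero
    ratio = (count - min_count) / (max_count - min_count)
    return round(MIN_SIZE + ratio * (MAX_SIZE - MIN_SIZE))


sizes = [linear_font_size(c, min(counts), max(counts)) for c in counts]

# Tally how many tags land on each font size.
size_usage = Counter(sizes)

for size in sorted(size_usage, reverse=True):
    print(f"{size}pt: {size_usage[size]} tag(s)")
```

With data shaped like this, most of the tags collapse onto the smallest size, and most of the sizes between the smallest and largest are never used.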
The way to make this tag cloud look better is to use a logarithmic function to reverse the power curve's effects. Essentially, we will map the linear range of font values to the logarithmic range of tag counts, magnifying the differences between smaller counts and making the "long tail" of the power curve more visible (Figures 14 and 15).
Figure 14. Linear mapping of x to y
Figure 15. Logarithmic mapping of x to y
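As a rough illustration of the logarithmic mapping, the Python sketch below takes the log of each count before scaling it onto the font range. The sample counts, the 10 to 36 point range, and the name `log_font_size` are assumptions made for this example, and it assumes every count is at least 1 (the logarithm of zero is undefined).

```python
import math

# Hypothetical tag counts that roughly follow a power curve (illustrative data only).
counts = [1000, 400, 150, 60, 25, 12, 8, 5, 3, 2, 2, 1, 1, 1, 1]

# Assumed font-size range, in points.
MIN_SIZE, MAX_SIZE = 10, 36


def log_font_size(count, min_count, max_count):
    """Map the logarithm of a tag count linearly onto the font-size range.

    Assumes count, min_count, and max_count are all >= 1.
    """
    if max_count == min_count:
        return MIN_SIZE  # all counts equal: avoid dividing by zero
    log_min = math.log(min_count)
    log_max = math.log(max_count)
    ratio = (math.log(count) - log_min) / (log_max - log_min)
    return round(MIN_SIZE + ratio * (MAX_SIZE - MIN_SIZE))


sizes = [log_font_size(c, min(counts), max(counts)) for c in counts]

for count, size in zip(counts, sizes):
    print(f"count {count:>4} -> {size}pt")
```

Because the logarithm compresses the gaps between the largest counts and stretches the gaps between the smallest ones, the tags in the long tail spread across noticeably more font sizes than they do under a linear mapping.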
To do this, we'll add a logarithmic measure of the tag counts: ...