Problem 27
Benford and the Peculiar Behavior of the First Significant Digit (1938)
Problem. In many naturally occurring tables of numerical data, it is observed that the leading (i.e., leftmost) digit is not uniformly distributed among {1, 2, ..., 9}, but the smaller digits appear more frequently than the larger ones. In particular, the digit one leads with a frequency close to 30% whereas the digit nine leads with a frequency close to only 4%. Explain.
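Solution. We make use of the fact that, for a given data set, changing the units of measurement should not change the distribution p(x) of the first significant digit. That is, this distribution should be scale invariant:
$$p(kx) = C\,p(x),$$
where k and C are constants. Integrating on both sides,
$$\int p(kx)\,dx = C\int p(x)\,dx \;\Longrightarrow\; \frac{1}{k}\int p(kx)\,d(kx) = C \cdot 1 \;\Longrightarrow\; C = \frac{1}{k}.$$
Therefore $k\,p(kx) = p(x)$. Let us now differentiate the latter w.r.t. k:
$$\frac{d}{dk}\bigl\{k\,p(kx)\bigr\} = 0 \;\Longrightarrow\; p(kx) + k\,\frac{d\,p(kx)}{d(kx)}\cdot\frac{d(kx)}{dk} = 0 \;\Longrightarrow\; p(kx) + kx\,p'(kx) = 0.$$
Writing u = kx, we have $u\,p'(u) = -p(u)$, so that
$$\int \frac{p'(u)}{p(u)}\,du = -\int \frac{1}{u}\,du \;\Longrightarrow\; p(u) = \frac{1}{u}$$
is a solution. Let D be the first significant digit in base 10. Then ...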
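As a numerical sanity check on the scale-invariance argument, the following minimal Python sketch may help. It assumes the density proportional to 1/u is normalized over one decade [1, 10), which gives the standard Benford probabilities log10(1 + 1/d), and it compares these with first-digit frequencies of simulated data before and after a change of units. The function names, the sample size, the six-decade range, and the conversion factor 2.54 are illustrative assumptions, not part of the original text.

```python
import math
import random


def benford_pmf(d):
    """P(D = d) when the density is proportional to 1/u on [1, 10)."""
    # Integral of 1/u over [d, d+1) divided by the integral over [1, 10).
    return (math.log(d + 1) - math.log(d)) / math.log(10)


def first_digit(x):
    """Leading decimal digit of a positive float."""
    # Scientific notation puts the first significant digit first.
    return int(f"{x:.15e}"[0])


def digit_frequencies(values):
    """Relative frequency of each leading digit; index 0 is unused."""
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    return [c / len(values) for c in counts]


def sample_log_uniform(n, decades=6, seed=0):
    """Draw n values with density proportional to 1/x on [1, 10**decades)."""
    rng = random.Random(seed)
    return [10 ** rng.uniform(0, decades) for _ in range(n)]


if __name__ == "__main__":
    data = sample_log_uniform(100_000)
    rescaled = [2.54 * x for x in data]  # hypothetical change of units
    f_raw = digit_frequencies(data)
    f_scaled = digit_frequencies(rescaled)
    for d in range(1, 10):
        print(d, round(benford_pmf(d), 4), round(f_raw[d], 4), round(f_scaled[d], 4))
```

Running the script prints, for each digit, the theoretical probability next to the two empirical frequencies. Under the stated assumptions the three columns should agree up to sampling noise, which illustrates that multiplying the data by a constant leaves the first-digit distribution essentially unchanged.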