12 Estimation Problems

The Number of Locomotives Problem

An important use of Bayes' theorem is estimating parameters when there is limited information. The locomotive problem is an example of this category of estimation problem. We are told that a railroad company numbers its locomotives consecutively from 1 to N, where N is the total number of locomotives owned by the company (and presumably they're all out running on tracks somewhere). We see a locomotive with the number 60 painted on it. We are asked to estimate how many locomotives this company has.

We have, at this time, very little information to help us create a prior, so let us start with the simplest prior possible. If N is the total number of locomotives owned by the company, then our prior is a uniform distribution over the possible values of N, from 1 up to some upper limit that we must choose. We'll try to gain some insight into the implications of that choice as we go along.

The likelihoods are straightforward. First, if the company has fewer than 60 locomotives, there is zero probability that we saw locomotive number 60. If the company has exactly 60 locomotives, then the probability that the one we saw was number 60 is 1/60. If they have 61 locomotives, then the probability that we saw number 60 is 1/61, and so on. In general, if the company has N locomotives (with N at least 60), then the probability that we saw number 60, or any other particular locomotive, is 1/N.
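The likelihood rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names likelihood and posterior, and the upper limit n_max = 1000 on the uniform prior, are hypothetical choices made for this sketch and do not come from the text. Exact fractions are used so that no rounding error creeps in.

```python
from fractions import Fraction


def likelihood(observed, n):
    """P(we see locomotive number `observed` | the company owns n locomotives)."""
    if n < observed:
        # A company with fewer than `observed` locomotives cannot own that number.
        return Fraction(0)
    # Every one of the n locomotives is equally likely to be the one we see.
    return Fraction(1, n)


def posterior(observed, n_max):
    """Posterior over fleet sizes 1..n_max, starting from a uniform prior.

    n_max is an assumed upper limit on the fleet size (hypothetical choice).
    """
    prior = Fraction(1, n_max)  # uniform prior: same weight for every fleet size
    unnormalized = {
        n: prior * likelihood(observed, n) for n in range(1, n_max + 1)
    }
    total = sum(unnormalized.values())
    # Divide by the total so that the posterior probabilities sum to 1.
    return {n: p / total for n, p in unnormalized.items()}


post = posterior(observed=60, n_max=1000)  # 1000 is an arbitrary assumed limit
mean = sum(n * p for n, p in post.items())
print("P(N = 60 | data) =", float(post[60]))
print("posterior mean   =", float(mean))
```

Rerunning the sketch with a different n_max is a quick way to see how strongly the estimate depends on the assumed upper limit of the prior.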

Since we are starting with a uniform prior distribution, the posterior curve and the likelihood curve look exactly ...
