Chapter 8. Estimation
The code for this chapter is in estimation.py. For information about downloading
and working with this code, see Using the Code.
The Estimation Game
Let’s play a game. I think of a distribution, and you have to guess what it is. I’ll give you two hints: it’s a normal distribution, and here’s a random sample drawn from it:
[-0.441, 1.774, -0.101, -1.138, 2.975, -2.138]
What do you think is the mean parameter, μ, of this distribution?
One choice is to use the sample mean, x̄, as an estimate of μ. In this example, x̄ is 0.155, so it would be reasonable to guess μ = 0.155. This process is called estimation, and the statistic we used (the sample mean) is called an estimator.
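As a quick check of the arithmetic, the sample mean can be computed directly. This is a minimal sketch using Python's standard statistics module, not the code from estimation.py; the variable names are chosen here for illustration.

```python
from statistics import mean

# The six values given as the sample in the text.
sample = [-0.441, 1.774, -0.101, -1.138, 2.975, -2.138]

# The sample mean is the sum of the values divided by the sample size.
xbar = mean(sample)
print(round(xbar, 3))  # 0.155
```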
Using the sample mean to estimate μ is so obvious that it is hard to imagine a reasonable alternative. But suppose we change the game by introducing outliers.
I’m thinking of a distribution. It’s a normal distribution, and here’s a sample that was collected by an unreliable surveyor who occasionally puts the decimal point in the wrong place.
[-0.441, 1.774, -0.101, -1.138, 2.975, -213.8]
Now what’s your estimate of μ? If you use the sample mean, your guess is -35.12. Is that the best choice? What are the alternatives?
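To see how far one misplaced decimal point moves the sample mean, the two samples can be compared side by side. This is only an illustrative sketch; the names clean and contaminated are assumptions introduced here.

```python
from statistics import mean

# The original sample from the first round of the game.
clean = [-0.441, 1.774, -0.101, -1.138, 2.975, -2.138]

# The same sample, except the last value has its decimal point
# shifted two places (-2.138 recorded as -213.8).
contaminated = [-0.441, 1.774, -0.101, -1.138, 2.975, -213.8]

print(round(mean(clean), 3))         # 0.155
print(round(mean(contaminated), 2))  # -35.12
```

A single bad value out of six is enough to drag the estimate from about 0.155 to about -35.12.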
One option is to identify and discard outliers, and then compute the sample mean of the rest. ...
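The discard-then-average idea can be sketched as follows. The text does not say how outliers are identified, so the rule used here (drop any value whose magnitude exceeds a chosen cutoff), the cutoff of 10, and the helper name mean_without_outliers are all assumptions made only for illustration.

```python
from statistics import mean

def mean_without_outliers(values, cutoff):
    """Return the mean of the values whose magnitude is at most cutoff.

    Hypothetical helper: the cutoff rule is an illustrative assumption,
    not a method prescribed by the text.
    """
    kept = [x for x in values if abs(x) <= cutoff]
    return mean(kept)

# The sample containing the misplaced decimal point.
sample = [-0.441, 1.774, -0.101, -1.138, 2.975, -213.8]

print(round(mean(sample), 2))                       # -35.12
print(round(mean_without_outliers(sample, 10), 3))  # 0.614
```

With the suspect value removed, the estimate is the mean of the remaining five values. The result depends on the cutoff, which has to be chosen with some care.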