TOPIC 24

Goodness-of-Fit Tests

How can statistics be useful in catching people who try to cheat on their taxes by making up the numbers they report? Well, it turns out the leading (leftmost) digits of many numbers that occur in the world follow a predictable pattern, so auditors can test whether the numbers on a tax return follow that pattern. To do this yourself, you need a new test procedure, one that allows you to assess whether sample data conform to a hypothesized model. That's what you will learn in this topic.

Overview

In this topic, you return to studying categorical variables. For the first time, you will learn an inference technique that applies to a categorical variable with more than two categories. One of the most famous and widely used procedures in all of statistics is the chi-square goodness-of-fit test. This procedure assesses how closely sample results conform to a hypothesized model about the proportional breakdown of the various categories.
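To make the idea of "how closely sample results conform to a hypothesized model" concrete, here is a minimal sketch in Python. It assumes the usual chi-square statistic, which adds up (observed - expected)^2 / expected over the categories. The function name `chi_square_statistic` and the birthday counts are made up for illustration; they are not data from this topic.

```python
def chi_square_statistic(observed, hypothesized_props):
    """Return the chi-square goodness-of-fit statistic and the expected counts.

    observed: list of observed counts, one per category
    hypothesized_props: list of hypothesized proportions (should sum to 1)
    """
    n = sum(observed)
    # Expected count in each category if the hypothesized model were true
    expected = [n * p for p in hypothesized_props]
    # Sum of (observed - expected)^2 / expected over all categories
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


# Hypothetical (made-up) birthday counts for Monday through Sunday
observed = [17, 26, 22, 23, 19, 15, 18]

# Hypothesized model: all seven days are equally likely
equal_props = [1 / 7] * 7

stat, expected = chi_square_statistic(observed, equal_props)
df = len(observed) - 1  # degrees of freedom: number of categories minus one

print("chi-square statistic:", round(stat, 3))
print("degrees of freedom:", df)
```

A larger statistic means the observed counts sit farther from what the hypothesized model predicts. Judging whether a value is large enough to cast doubt on the model requires comparing it to a chi-square distribution with the degrees of freedom shown.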

Preliminaries

  1. What day of the week were you born on? [Hint: If you do not know, you might consult an online yearly calendar such as the one at www.timeanddate.com.] (Activity 24-1)
  2. Do you suspect that any one day of the week is more or less likely to be a person's birthday than any other day of the week? (Activity 24-1)
  3. Guess which digit (1–9) is most likely to be the leading (leftmost) digit for numbers appearing in an almanac. (Activity 24-5)
  4. Guess which digit (1–9) is least likely to be the leading (leftmost) ...

