Statistical inference is the task of assessing the impact of random variability on the conclusions drawn from a study or on the results of a measurement. In this chapter, we look at a particular kind of statistical inference called a hypothesis test. Generally, a hypothesis test seeks to determine whether the effects we see in data from a study are real or might just be the result of chance variation.

Who uses hypothesis testing? The research community uses it to determine whether a study is worthy of publication or regulatory approval. Data scientists have less need for the formal apparatus of hypothesis testing, but they do use the resampling methods presented here, and their variants, to help separate random from “real” patterns in data.

After completing this chapter, you should be able to

  • explain the concept of a null hypothesis,
  • describe how to conduct a permutation test with a hat and slips of paper,
  • interpret the results of a permutation test,
  • describe the shape of the Normal distribution and explain why the “Error” distribution is said to be a more accurate name for it,
  • define, in the context of hypothesis testing, alpha, Type I error, and Type II error,
  • explain the circumstances in which hypothesis testing is used.

The Null Hypothesis

The standard hypothesis-testing procedure involves a what-if calculation. We ask, “Could my results be due to chance?” The supposition that they could be is called the null hypothesis. Null is an old-fashioned word ...
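
To make the what-if calculation concrete, here is a minimal sketch of a permutation test in Python, the computer equivalent of shuffling slips of paper in a hat. It is an illustration under stated assumptions, not the book's own code: the function name permutation_test, the group names, and the data values are all hypothetical, and the test is one-sided on the difference in group means.

```python
import random


def permutation_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """One-sided permutation test on the difference in means (a minus b).

    Returns the observed difference and the proportion of shuffles whose
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)

    # Put every value into one "hat": under the null hypothesis the
    # group labels are arbitrary, so any split is as likely as another.
    hat = list(group_a) + list(group_b)

    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(hat)                  # mix the slips
        new_a = hat[:n_a]                 # draw a new "group A"
        new_b = hat[n_a:]                 # the rest form "group B"
        diff = sum(new_a) / len(new_a) - sum(new_b) / len(new_b)
        if diff >= observed:
            at_least_as_large += 1

    return observed, at_least_as_large / n_shuffles


# Made-up numbers, for illustration only (not data from any study).
treatment = [10, 11, 12, 13, 14]
control = [1, 2, 3, 4, 5]
observed_diff, proportion = permutation_test(treatment, control)
print(observed_diff, proportion)
```

If only a small proportion of the shuffles produce a difference as large as the one observed, chance alone is an unlikely explanation for the result.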
