Multiple comparisons

When we run repeated trials, the probability of discovering a significant effect purely by chance increases; this is called the multiple comparisons problem. In general, the remedy is to demand a stricter significance threshold when comparing many samples. There is no complete solution to this issue though: even with an α of 0.01, each test of a true null hypothesis will still produce a Type I error 1 percent of the time on average.
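
To make the arithmetic concrete, here is a minimal sketch of how quickly the chance of at least one false positive grows with the number of tests. It assumes the tests are independent, which real comparisons often are not. The function names `family-wise-error-rate` and `bonferroni-alpha` are illustrative and not taken from the book's code. The Bonferroni correction shown is one common way of demanding a stricter threshold, not necessarily the approach used later in the text.

```clojure
;; Probability of at least one Type I error across k tests, each run at
;; significance level alpha. Assumes the k tests are independent.
(defn family-wise-error-rate
  [alpha k]
  (- 1.0 (Math/pow (- 1.0 alpha) k)))

;; Bonferroni correction (one common remedy): test each comparison at
;; alpha / k so that the family-wise rate stays at or below alpha.
(defn bonferroni-alpha
  [alpha k]
  (/ alpha (double k)))

;; Print k, the uncorrected family-wise rate, and the corrected rate.
(doseq [k [1 5 20 100]]
  (println k
           (family-wise-error-rate 0.05 k)
           (family-wise-error-rate (bonferroni-alpha 0.05 k) k)))
```

With an α of 0.05 and 20 independent tests, the uncorrected probability of at least one false positive is roughly 64 percent. After the correction it stays just under 5 percent, at the cost of making each individual test harder to pass.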

To develop our intuition about how multiple comparisons and statistical significance relate to each other, let's build an interactive web page to simulate the effect of taking multiple samples. It's one of the advantages of using a powerful and general-purpose programming language like Clojure for data analysis that we can run our ...
