12

ANALYSIS OF VARIANCE—ANOVA

So far in hypothesis testing, we have been concerned mostly with single tests:

  • Is humidifier A better than humidifier B?
  • Which hospital error-reporting regime is better—no-fault or standard?
  • Does providing additional explanation on the web reduce product returns?

We also looked at comparisons among multiple groups with categorical data. Now, we move on to comparisons among multiple groups for continuous numeric data. When we want to compare more than two groups at a time, we use a technique known as analysis of variance, or ANOVA. The technical details of ANOVA, and especially the hypothesis testing, will be of interest mainly to the research community. However, data scientists who analyze multigroup experiments will find the graphical exposition and the discussion of variance decomposition of benefit.

After completing this chapter, you should be able to

  • compare multiple groups using boxplots,
  • explain the problems involved with multiple testing,
  • explain how the observations in a multigroup experiment can be decomposed into an overall average component, a treatment component, and a residual component,
  • explain interaction, and include it in an ANOVA analysis,
  • explain what factorial design is, what its advantages are, and what types of studies it is useful for, and
  • explain the role of blocking in experiments.
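
As a preview of the decomposition named in the objectives above, here is a minimal Python sketch. The three groups and their values are invented purely for illustration (they are not the data used in this chapter), and the function name `decompose` is hypothetical. Each observation is split into the overall average, a treatment component (group mean minus overall average), and a residual (observation minus group mean).

```python
# Hypothetical data: three treatment groups with made-up measurements.
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [1.0, 2.0, 3.0],
}


def decompose(groups):
    """Split every observation into grand mean + treatment effect + residual."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    rows = []
    for name, vals in groups.items():
        group_mean = sum(vals) / len(vals)
        treatment_effect = group_mean - grand_mean
        for v in vals:
            residual = v - group_mean
            rows.append((name, v, grand_mean, treatment_effect, residual))
    return grand_mean, rows


grand_mean, rows = decompose(groups)

# Sums of squares for each component of the decomposition.
ss_total = sum((v - gm) ** 2 for _, v, gm, _, _ in rows)
ss_treatment = sum(eff ** 2 for _, _, _, eff, _ in rows)
ss_residual = sum(res ** 2 for _, _, _, _, res in rows)

for name, v, gm, eff, res in rows:
    print(f"{name}: {v:.1f} = {gm:.1f} + ({eff:+.1f}) + ({res:+.1f})")
print(f"SS total = {ss_total:.1f}, "
      f"SS treatment = {ss_treatment:.1f}, SS residual = {ss_residual:.1f}")
```

In this toy example the treatment and residual sums of squares add up to the total sum of squares, which is the sense in which the variance is "decomposed."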

12.1 COMPARING MORE THAN TWO GROUPS: ANOVA

Although the following data dates back to 1935, the purpose of the study—the amount of fat in our diet—is ...
