Let's look at a formulaic method that corresponds to what we just did; it is named after its originators, Mann, Whitney, and Wilcoxon. We'll be testing for a rank sum difference between 30 randomly selected males and 30 randomly selected females. The female sample contains an extreme outlier. We'll look at the results produced by statistical analysis software and walk through what it did.
But first, look at Table 51.1 to see how the standard t-test is bamboozled by the unruly data. Notice the impact of the outlier on the female sample mean, variance, number of standard errors, and p-value. This is what happens when you have unruly scaled data and use methods from Parts II, III, and IV.
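To get a feel for the effect before reading the table, here is a minimal sketch in Python. The data are synthetic and invented purely for illustration; they are not the samples behind Table 51.1. The function names `welch_t` and `rank_sum_z` are hypothetical helpers, the t statistic uses the unequal-variance form (an assumption; with equal sample sizes it matches the pooled form), and the rank sum z uses the usual normal approximation and assumes no tied values.

```python
import math
import statistics


def welch_t(a, b):
    """Number of standard errors separating the two sample means."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def rank_sum_z(a, b):
    """Rank sum of sample a and its normal-approximation z (assumes no ties)."""
    combined = sorted(a + b)
    rank = {value: i + 1 for i, value in enumerate(combined)}
    w = sum(rank[value] for value in a)
    n1, n2 = len(a), len(b)
    expected = n1 * (n1 + n2 + 1) / 2            # rank sum expected if no difference
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # its standard deviation
    return w, (w - expected) / sd


# Synthetic, made-up data: 30 males, and 29 ordinary females plus one outlier.
males = [60.0 + i for i in range(30)]
females_clean = [50.5 + i for i in range(29)]
females = females_clean + [1000.0]   # the extreme outlier

print("female mean without outlier:", statistics.mean(females_clean))
print("female mean with outlier:   ", statistics.mean(females))
print("female variance with outlier:", statistics.variance(females))
print("t without outlier:", welch_t(males, females_clean))
print("t with outlier:   ", welch_t(males, females))
print("rank sum and z with outlier:", rank_sum_z(males, females))
```

In this made-up example the single outlier drags the female mean above the male mean and inflates the variance so much that the means end up less than one standard error apart. The rank sum barely notices, because the outlier simply takes the top rank no matter how extreme its value is.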
Table 51.1 An inappropriate t-test.
|t-Test for mean difference: Whoops!|
|Hypothesized mean difference||0|