CHAPTER 4

Statistical Analysis

Statistics is the science of data.1 In the late nineteenth century, renowned British scientist and author H.G. Wells (1866–1946) said that an intelligent citizen in a twentieth-century free society would need to understand statistical methods. An intelligent twenty-first-century practitioner or consumer of TA has the same need. This chapter and the next two address aspects of statistics that are particularly relevant to TA.

Statistical methods are not needed when a body of data conveys a message loudly and clearly. If all people drinking from a certain well die of cholera but all those drinking from a different well remain healthy, there is no uncertainty about which well is infected and no need for statistical analysis. However, when the implications of data are uncertain, statistical analysis is the best, perhaps the only, way to draw reasonable conclusions.

Whether a given TA method has genuine predictive power is highly uncertain: even the most potent rules display highly variable performance from one data set to the next. Statistical analysis is therefore the only practical way to distinguish methods that are useful from those that are not.
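The variability described above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the function name, the number of data sets, the 250-day sample length, and the 1 percent daily volatility are all hypothetical choices, and the returns are random noise rather than market data. The rule is given a true edge of zero, so any apparent profit in a sample is due to chance.

```python
import random
import statistics

def simulate_rule_mean_returns(n_datasets=20, n_days=250,
                               daily_vol=0.01, edge=0.0, seed=42):
    """Mean daily return of a hypothetical rule on independent data sets.

    Every parameter is an illustrative assumption, not an estimate
    taken from real markets.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_datasets):
        # Each daily return is random noise centered on the rule's true edge.
        returns = [rng.gauss(edge, daily_vol) for _ in range(n_days)]
        means.append(statistics.mean(returns))
    return means

means = simulate_rule_mean_returns()
profitable = sum(m > 0 for m in means)
print(f"best sample mean:  {max(means):+.5f}")
print(f"worst sample mean: {min(means):+.5f}")
print(f"samples that look profitable: {profitable} of {len(means)}")
```

Although the rule has no predictive power by construction, its sample mean return differs from one simulated data set to the next. A single back-test cannot tell such a rule apart from a genuinely useful one, which is why a statistical test is needed.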

Whether or not its practitioners acknowledge it, the essence of TA is statistical inference. TA attempts to discover generalizations from historical data, in the form of patterns, rules, and so forth, and then to extrapolate them to the future. Extrapolation is inherently uncertain. Uncertainty ...