Part X

Tests on Outliers

Outliers inevitably occur in the analysis of statistical data. Although there is no generally accepted definition of the term, outliers are commonly understood as observations that somehow do not fit into the data set. Barnett and Lewis (1994, p. 7) write: ‘We shall define an outlier in a set of data to be an observation (or subset of observations) which appears to be inconsistent with the remainder of that set of data’. The same authors go on to emphasize that outliers are observations which are not only extreme but also surprising to the observer. Whether an extreme value should be declared an outlier depends on what we assume about the main population from which we sample. Outliers must be distinguished from contaminants, that is, observations originating from some other population; contaminants may or may not be extreme with respect to the remaining observations and hence may or may not be outliers. How outliers should be handled in general is beyond the scope of this book: whether it is better to accommodate them by using robust methods, or to detect them, either because they are of interest in themselves or because they merely exert an undue influence on the applied analysis. Here we just present some well-known discordancy tests from the toolkit of statistical methods for handling outliers. In tests of discordancy we aim at a decision on whether or not an extreme observation ...

