9

Outliers and robustness for ordinal data

Marco Riani, Francesca Torti and Sergio Zani

This chapter addresses robustness and multivariate outlier detection for ordinal data. We first review outlier detection methods in regression for continuous data and give an example showing that neither graphical tools of data analysis nor traditional diagnostic measures based on all the observations are sufficient to detect multivariate atypical observations. We then focus on ordinal data and illustrate how to detect atypical measurements in customer satisfaction surveys. Next, we review the generalized linear model of ordinal regression and apply it to the ABC survey. The chapter concludes with an analysis of a set of diagnostics for checking the goodness of fit of the suggested model and the presence of anomalous observations.

9.1 An overview of outlier detection methods

There are several definitions of outliers in the statistical literature (see Barnett and Lewis, 1994; Atkinson et al., 2004; Hadi et al., 2009). A commonly used definition is that outliers are a minority of observations that deviate from the common pattern followed by the bulk of the data set, a pattern which can be captured by some statistical model. The assumption here is that there is a homogeneous core of at least 50% of the observations, and a set of remaining observations (hopefully few) whose patterns are inconsistent with this common pattern. Awareness of outliers in some form or another has existed for at least 2000 years. Thucydides, ...

