Preface to the First Edition

Why robust statistics are needed

All statistical methods rely explicitly or implicitly on a number of assumptions. These assumptions generally aim at formalizing what the statistician knows or conjectures about the data analysis or statistical modeling problem he or she is faced with, and at the same time at making the resulting model manageable from the theoretical and computational points of view. However, it is generally understood that the resulting formal models are simplifications of reality and that their validity is at best approximate.

The most widely used model formalization is the assumption that the observed data have a normal (Gaussian) distribution. This assumption has been present in statistics for two centuries, and has been the framework for all the classical methods in regression, analysis of variance, and multivariate analysis. There have been attempts to justify the assumption of normality with theoretical arguments, such as the central limit theorem. These attempts, however, are easily proven wrong. The main justification for assuming a normal distribution is that it gives an approximate representation of many real data sets, and at the same time is theoretically quite convenient because it allows one to derive explicit formulae for optimal statistical methods such as maximum likelihood and likelihood ratio tests, as well as the sampling distribution of inference quantities such as t-statistics. We refer to such methods as ...
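To make the stakes concrete, the following minimal Python sketch (our illustration, not from the book; the numbers are arbitrary) shows why exact normality matters for classical methods: a single gross outlier can pull the sample mean, which is optimal under the normal model, far from the bulk of the data, while the sample median barely moves.

    import numpy as np

    rng = np.random.default_rng(0)
    clean = rng.normal(loc=10.0, scale=1.0, size=100)  # data that follow the normal model
    contaminated = np.append(clean, 1000.0)            # one wild observation added

    # The mean chases the outlier; the median is essentially unchanged.
    print(f"mean:   {clean.mean():.2f} -> {contaminated.mean():.2f}")
    print(f"median: {np.median(clean):.2f} -> {np.median(contaminated):.2f}")

Here the mean roughly doubles, from about 10 to about (100 × 10 + 1000)/101 ≈ 19.8, even though only one observation out of 101 departs from the assumed model, while the median stays near 10.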
