Data Analysis: What Can Be Learned From the Past 50 Years

CHAPTER 5

APPROXIMATE MODELS

This chapter is based on a talk I gave at the International Conference on Goodness-of-Fit Tests and Model Validity, held in Paris, May 29–31, 2000, commemorating the centennial of Karl Pearson’s landmark paper (1900) on the chi-square goodness-of-fit test (see Huber 2002).

5.1 MODELS

The anniversary of Karl Pearson’s paper offers a timely opportunity to digress and discuss the role of models in contemporary and future statistics, and the assessment of the adequacy of their fit, in a somewhat broader context, stressing the necessary changes in philosophy rather than the technical nitty-gritty they involve. The present chapter elaborates on what I had tentatively named “postmodern robustness” in Huber (1996b, final section):

The current trend toward ever-larger computer-collected and computer-managed data bases poses interesting challenges to statistics and data analysis in general. Most of these challenges are particularly pertinent to diagnostics and robustness. The data sets are not only getting larger, but also are more complexly structured. [...] Exact theories become both impossible and unmanageable. In any case, models never are exactly true, and for sufficiently large samples they will be rejected by statistical goodness-of-fit tests. This poses some rather novel problems of robust model selection, and we must learn to live with crude models (robustness with regard to systematic, but uninteresting, errors in the model). [...] It appears ...
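The remark that models "will be rejected by statistical goodness-of-fit tests" once the sample is large enough can be illustrated with a small numerical sketch. The example is not from the text and everything in it is an assumption chosen for illustration: a six-category "fair die" model, a hypothetical true distribution that deviates from it by less than one percentage point per category, and idealized data whose counts match the true proportions exactly, so that sampling noise is left out. For such data, Pearson's statistic grows in proportion to the sample size n, so a fixed small discrepancy eventually crosses any fixed critical value.

```python
# Hypothetical setup: the model says all six categories are equally likely,
# while the (assumed) true probabilities deviate slightly from that.
MODEL_PROBS = [1 / 6] * 6
TRUE_PROBS = [0.17, 0.17, 0.17, 0.17, 0.16, 0.16]

# Upper 5% point of the chi-square distribution with 5 degrees of freedom.
CRITICAL_VALUE_5PCT_DF5 = 11.0705


def pearson_statistic(observed, expected):
    """Pearson's chi-square statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def idealized_statistic(n):
    """Statistic for noise-free counts that match TRUE_PROBS exactly."""
    observed = [n * p for p in TRUE_PROBS]
    expected = [n * p for p in MODEL_PROBS]
    return pearson_statistic(observed, expected)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        stat = idealized_statistic(n)
        verdict = "reject" if stat > CRITICAL_VALUE_5PCT_DF5 else "do not reject"
        print(f"n = {n:>7}: X^2 = {stat:8.2f} -> {verdict}")
```

With these assumed numbers the statistic equals 0.0008 n, so the slightly wrong model passes at n = 1,000 and n = 10,000 and is rejected at n = 100,000. Real data would add sampling noise on top of this systematic part, which shifts the crossover point somewhat but does not change the conclusion.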

Data Analysis: What Can Be Learned From the Past 50 Years by Peter J. Huber
