Chapter 3. Regression
A statistical model involves the identification of those elements of our problem which are subject to uncontrolled variation and a specification of that variation in terms of probability distributions.
— David J. Bartholomew, Unobserved Variables: Models and Misunderstandings, 2013
Essentially, all models are wrong, but some are useful.
— George E. P. Box and Norman R. Draper, Empirical Model-Building and Response Surfaces, 1987
This is the only chapter in this book dedicated to regression, and it presents regression mostly from the classical, statistical, least-squares point of view. We open the discussion of practical learning methods with least-squares regression for (at least) four reasons. First, it is useful: in many applications the underlying model assumptions are close enough to correct that the method has been widely used in the literature of many scientific domains.¹ Second, it is relatively easy to treat analytically, and studying the analysis and the closed-form solutions it yields can provide insight into “what is going on” with this approach to regression. The author hopes this will lead to more informed application. The exercises of this chapter are an integral part of the analytic development (solutions can be found in Appendix F). Third, it enables transformation of classification problems into regression problems, which is an approach taken by logistic regression and neural networks in Chapter 4 and gradient boosting in ...