# CHAPTER 8

# RIDGE REGRESSION

## Outline

**8.3** Bias, MSE, and Risk Expressions

**8.4** Performance of the Estimators

The area of shrinkage estimators came to the forefront of the statistical literature soon after Stein (1956) discovered that the sample mean in a multivariate model is not admissible, under a quadratic loss function, for dimension more than two. The idea took a decade to settle into statistical methodology. Another class of shrinkage estimators appeared in the statistical literature in the 1970s, due to Hoerl and Kennard (1970). The methodology is an advancement for linear models. The idea is simple, but its impact is great: the ordinary least squares estimators (OLSEs) are unbiased, but their covariance matrix depends on the design matrix $\boldsymbol{X}$, which may be ill-conditioned. That is, some of the eigenvalues of $\boldsymbol{X}'\boldsymbol{X}$ may be zero or near zero, which makes the variances of the OLSEs so large that the OLSEs become useless. To overcome this problem, Hoerl and Kennard (1970) put forward the idea of using $(\boldsymbol{X}'\boldsymbol{X} + k\boldsymbol{I}_p)^{-1}\boldsymbol{X}'\boldsymbol{y}$ instead of $(\boldsymbol{X}'\boldsymbol{X})^{-1}\boldsymbol{X}'\boldsymbol{y}$ for the estimation of the coefficients of a regression model, as in Chapter 7. These estimators are called “ridge regression estimators” (RREs), where $k$ is the tuning (or biasing) parameter, traditionally known as the “ridge parameter.” In this chapter, we consider the regression model and apply the ridge regression methodology when the error ...
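As a rough numerical illustration of the two estimators described above, the following sketch compares the OLSE, $(\boldsymbol{X}'\boldsymbol{X})^{-1}\boldsymbol{X}'\boldsymbol{y}$, with the RRE, $(\boldsymbol{X}'\boldsymbol{X} + k\boldsymbol{I}_p)^{-1}\boldsymbol{X}'\boldsymbol{y}$, on a nearly collinear design. Everything specific here is an assumption made only for illustration and is not taken from the chapter: the simulated data, normal errors, the true coefficients, the value $k = 1$, and the helper names `ols_estimate` and `ridge_estimate`.

```python
import numpy as np


def ols_estimate(X, y):
    """OLSE: solve the normal equations (X'X) b = X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def ridge_estimate(X, y, k):
    """RRE: solve (X'X + k I_p) b = X'y for a ridge parameter k >= 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)


# Hypothetical, simulated data: two almost identical regressors,
# so X'X has one eigenvalue close to zero (an ill-conditioned design).
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 1e-3 * rng.normal(size=n)
X = np.column_stack([x1, x2])
beta = np.array([1.0, 1.0])            # assumed true coefficients
y = X @ beta + rng.normal(size=n)      # assumed normal errors

eigvals = np.linalg.eigvalsh(X.T @ X)  # eigenvalues in ascending order
b_ols = ols_estimate(X, y)
b_ridge = ridge_estimate(X, y, k=1.0)  # k = 1 is an arbitrary choice

print("eigenvalues of X'X:", eigvals)
print("OLSE:", b_ols, "norm:", np.linalg.norm(b_ols))
print("RRE :", b_ridge, "norm:", np.linalg.norm(b_ridge))
```

Setting `k=0` in `ridge_estimate` reproduces the OLSE, and for any `k > 0` the ridge coefficients are shrunk toward zero relative to the OLSE. How `k` should be chosen is not addressed by this sketch.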

