12.1 Predictor–Response Data

Multivariate predictor–response smoothing methods fit smooth surfaces to observations (x_i, y_i), where x_i is a vector of p predictors and y_i is the corresponding response value. The y_1, ..., y_n values are viewed as observations of the random variables Y_1, ..., Y_n, where the distribution of Y_i depends on the ith vector of predictor variables.

Many of the bivariate smoothing methods discussed in Chapter 11 can be generalized to the case of several predictors. Running lines can be replaced by running planes. Univariate kernels can be replaced by multivariate kernels. One generalization of spline smoothing is thin plate splines [280, 451]. In addition to the significant complexities of actually implementing some of these approaches, there is a fundamental change in the nature of the smoothing problem when using more than one predictor.
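To make the step from univariate to multivariate kernels concrete, the following is a minimal sketch of a kernel-weighted local average with p predictors. It is an illustration under stated assumptions, not the book's implementation: the spherical Gaussian kernel, the single bandwidth h shared by all coordinates, and the names gaussian_kernel and kernel_smooth are all choices made here for the example.

```python
import math


def gaussian_kernel(u):
    """Unnormalized spherical Gaussian kernel evaluated at the vector u.

    Assumption: a spherical Gaussian is used purely for illustration;
    the normalizing constant cancels in the weighted average below.
    """
    return math.exp(-0.5 * sum(c * c for c in u))


def kernel_smooth(x0, xs, ys, h):
    """Kernel-weighted average of the responses ys at the point x0.

    xs is a list of p-dimensional predictor vectors, ys the matching
    responses, and h a single bandwidth applied to every coordinate
    (a simplifying assumption).
    """
    weights = [
        gaussian_kernel([(a - b) / h for a, b in zip(x, x0)])
        for x in xs
    ]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed: no observation is near x0 at this bandwidth.
        raise ValueError("no observations within reach of x0; increase h")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Toy data with p = 2 predictors and response y = x1 + x2 (made up for the example).
xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ys = [0.0, 1.0, 1.0, 2.0]
print(kernel_smooth((0.5, 0.5), xs, ys, h=0.5))
```

The only change from the one-predictor case is that the kernel is evaluated on a scaled difference vector rather than a scalar. The zero-weight guard hints at the difficulty with several predictors: the neighborhood around x0 can easily be empty.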

The curse of dimensionality is that high-dimensional space is vast, and points have few near neighbors. This same problem was discussed in Section 10.4.1 as it applied to multivariate density estimation. Consider a unit sphere in p dimensions with volume π^{p/2}/Γ(p/2 + 1). Suppose that several p-dimensional predictor points are distributed uniformly within the ball of radius 4. In one dimension, 25% of predictors are expected within the unit ball; hence unit balls might be reasonable neighborhoods for smoothing. In p dimensions, the expected proportion is the ratio of the two volumes, (1/4)^p, and Table 12.1 shows that this proportion vanishes rapidly as p increases. In order ...
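The volume formula and the shrinking proportion can be checked numerically. The sketch below evaluates π^{p/2}/Γ(p/2 + 1) and the expected fraction of uniformly distributed points in the radius-4 ball that fall inside the unit ball. The function names ball_volume and proportion_in_unit_ball, and the particular values of p printed, are choices made for this illustration and are not taken from Table 12.1.

```python
import math


def ball_volume(p, radius=1.0):
    """Volume of a p-dimensional ball: pi^(p/2) / Gamma(p/2 + 1) * radius^p."""
    return math.pi ** (p / 2) / math.gamma(p / 2 + 1) * radius ** p


def proportion_in_unit_ball(p, outer_radius=4.0):
    """Expected fraction of points, uniform in the ball of radius
    outer_radius, that land inside the unit ball.

    The gamma and pi factors cancel, so this equals (1 / outer_radius) ** p.
    """
    return ball_volume(p, 1.0) / ball_volume(p, outer_radius)


# Illustrative dimensions only; not a reproduction of Table 12.1.
for p in (1, 2, 3, 5, 10):
    print(p, ball_volume(p), proportion_in_unit_ball(p))
```

For p = 1 the proportion is 0.25, matching the 25% quoted in the text, and each additional predictor divides it by a further factor of 4.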


Computational Statistics, 2nd Edition by Geof H. Givens, Jennifer A. Hoeting
