Chapter 12

Multivariate Smoothing

# 12.1 Predictor–Response Data

Multivariate predictor–response smoothing methods fit smooth surfaces to observations (x_i, y_i), where x_i is a vector of p predictors and y_i is the corresponding response value. The values y_1, …, y_n are viewed as observations of the random variables Y_1, …, Y_n, where the distribution of Y_i depends on the ith vector of predictor variables.

Many of the bivariate smoothing methods discussed in Chapter 11 can be generalized to the case of several predictors. Running lines can be replaced by running planes. Univariate kernels can be replaced by multivariate kernels. One generalization of spline smoothing is thin plate splines [280, 451]. In addition to the significant complexities of actually implementing some of these approaches, there is a fundamental change in the nature of the smoothing problem when using more than one predictor.
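As an illustration of the multivariate kernel idea, the sketch below implements a Nadaraya–Watson estimator with a product Gaussian kernel and a single bandwidth h shared across all coordinates (the common-bandwidth choice and the function name `nadaraya_watson` are assumptions for this example, not notation from the text):

```python
import numpy as np

def nadaraya_watson(x0, X, y, h):
    """Multivariate Nadaraya-Watson estimate at the point x0.

    X : (n, p) matrix of predictor vectors; y : (n,) responses;
    h : bandwidth, here shared by all p coordinates (an assumption).
    Uses a product Gaussian kernel K(u) proportional to exp(-||u||^2 / 2).
    """
    u = (X - x0) / h                           # (n, p) scaled differences
    w = np.exp(-0.5 * np.sum(u ** 2, axis=1))  # kernel weight for each x_i
    return np.sum(w * y) / np.sum(w)           # weighted average of responses

# Toy example: smooth a noisy surface y = sin(x1) + cos(x2) with p = 2.
rng = np.random.default_rng(0)
X = rng.uniform(0, 3, size=(500, 2))
y = np.sin(X[:, 0]) + np.cos(X[:, 1]) + rng.normal(0, 0.1, 500)
fit = nadaraya_watson(np.array([1.5, 1.5]), X, y, h=0.3)
```

The normalizing constant of the kernel cancels in the ratio, so it is omitted; the estimator is simply a locally weighted average, which is what makes its behavior so sensitive to how many points lie near x0.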

The curse of dimensionality is that high-dimensional space is vast, and points in it have few near neighbors. The same problem was discussed in Section 10.4.1 in the context of multivariate density estimation. Consider the unit ball in p dimensions, whose volume is π^{p/2}/Γ(p/2 + 1). Suppose that p-dimensional predictor points are distributed uniformly within the ball of radius 4. Since the ratio of the volume of the unit ball to that of the radius-4 ball is (1/4)^p, in one dimension 25% of the predictors are expected to fall within the unit ball; hence unit balls might be reasonable neighborhoods for smoothing. Table 12.1 shows that this proportion vanishes rapidly as p increases. In order ...
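The vanishing proportion is easy to verify directly from the ball-volume formula; the short sketch below computes the expected fraction of uniformly distributed points falling in the unit ball for a few dimensions (the function name `ball_volume` and the particular dimensions printed are illustrative choices, not taken from Table 12.1):

```python
import math

def ball_volume(p, r=1.0):
    """Volume of a p-dimensional ball of radius r: pi^{p/2} r^p / Gamma(p/2 + 1)."""
    return math.pi ** (p / 2) * r ** p / math.gamma(p / 2 + 1)

# Expected proportion of points uniform on the radius-4 ball that land
# in the unit ball: the ratio of volumes, which simplifies to (1/4)^p.
for p in (1, 2, 3, 5, 10):
    prop = ball_volume(p, 1.0) / ball_volume(p, 4.0)
    print(f"p = {p:2d}: proportion = {prop:.2e}")
```

For p = 1 the proportion is 0.25, matching the 25% figure above, and by p = 10 it has fallen to about 10^{-6}, so a neighborhood that seemed reasonable in one dimension contains essentially no data in ten.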