5.3. The Peaking Phenomenon
As stated in the introduction of this chapter, in order to design a classifier with good generalization performance, the number of training points, N, must be large enough with respect to the number of features, l, that is, the dimensionality of the feature space. Take as an example the case of designing a linear classifier, w^T x + w_0. The number of unknown parameters is l + 1. In order to get a good estimate of these parameters, the number of data points must be larger than l + 1. The larger N is, the better the estimate, since a larger sample filters out the effects of noise and reduces the influence of outliers.
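The dependence of the estimate on N can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the example from the text: the data are synthetic (two Gaussian classes with identity covariance and means +mu and -mu), the classifier w^T x + w_0 is fitted by least squares on labels +1 and -1, and the values l = 10, the training sizes, and the helper names (sample, fit_linear, error_rate, mean_error) are all hypothetical choices made for illustration.

```python
import numpy as np

def sample(n, mu, rng):
    """Draw n points, alternating between classes N(+mu, I) and N(-mu, I)."""
    y = np.where(np.arange(n) % 2 == 0, 1.0, -1.0)
    X = rng.standard_normal((n, mu.size)) + y[:, None] * mu
    return X, y

def fit_linear(X, y):
    """Least-squares estimate of the l + 1 parameters (w, w0).

    Assumption: least squares is used here only for illustration;
    the text does not prescribe a particular training method.
    """
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # append a column for w0
    params, *_ = np.linalg.lstsq(A, y, rcond=None)
    return params[:-1], params[-1]

def error_rate(w, w0, X, y):
    """Fraction of points whose predicted sign differs from the label."""
    return float(np.mean(np.sign(X @ w + w0) != y))

def mean_error(n_train, l=10, trials=20, seed=0):
    """Average test error over several training sets of size n_train."""
    rng = np.random.default_rng(seed)
    mu = np.ones(l) / np.sqrt(l)          # class means at +mu and -mu, |mu| = 1
    X_test, y_test = sample(20000, mu, rng)
    errs = []
    for _ in range(trials):
        X, y = sample(n_train, mu, rng)
        w, w0 = fit_linear(X, y)
        errs.append(error_rate(w, w0, X_test, y_test))
    return float(np.mean(errs))

if __name__ == "__main__":
    # l + 1 = 11 unknown parameters; N = 12 is barely above that count.
    for n in (12, 50, 500, 5000):
        print(n, round(mean_error(n), 3))
```

In this setup the average test error should be clearly higher when N is only slightly larger than l + 1 and should fall toward the error of the best linear classifier for these classes as N grows.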
In [Trun 79], an elegant simple example has been given that reveals the interplay between the ...