2.4 Kernel-Based Classification
Kernel-based approaches have recently received considerable interest in solving linear nonseparable problems in classification. Their main idea is to introduce a nonlinear kernel that maps the original data into a high-dimensional feature space from which the extracted features can be used to resolve the issue of linear nonseparability encountered in the original data space. The property of the Mercer's theorem in Scholkopf and Smola (2002) then plays a trick called kernel trick to implicitly calculate the dot products in the feature space F without actually mapping all data sample vectors into the feature space F in which case the kernel function was not even identified.
2.4.1 Kernel Trick Used in Kernel-Based Methods
Prior to introducing kernel-based classifiers a kernelization procedure and kernel trick need to be discussed. The idea of kernel-based techniques is to obtain a nonlinear version of a linear algorithm by implicitly mapping the original data to a higher-dimensional feature space. Suppose that forms a data space of and ϕ is a nonlinear function that maps the data space X into a feature space F with dimensionality yet to be determined by a kernel, that is,
It is shown in Scholkopf and Smola (2002) that an effective kernel-based ...