Sometimes, datasets are not linearly separable, and standard PCA, which is a linear transformation, cannot extract principal components that expose their structure. The situation is similar to the one discussed in Chapter 3, Advanced Clustering, when we faced the problem of non-convex clusters. In that case, some algorithms were unable to separate the clusters because of their geometry. In this case, the goal is to distinguish between different classes (in a purely unsupervised scenario, we think about specific groupings) according to the structure of the principal components. Therefore, we want to work with the transformed dataset, Z, and detect the presence of thresholds that separate the groups. For example, let's consider the following screenshot:
Kernel PCA
Original dataset ...
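The idea of looking for a threshold in the transformed dataset, Z, can be sketched with scikit-learn. This is a minimal illustration, not necessarily the dataset or the settings used in this chapter: the concentric-circles data, the RBF kernel, the value of `gamma`, and the helper `threshold_accuracy` are all assumptions chosen for the example. The labels `y` are used only to check how well a single threshold splits the groups; they are never passed to PCA or Kernel PCA.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

# Assumed toy dataset: two concentric circles that no straight line can separate
X, y = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=1000)

# Standard PCA is linear: it can only rotate and rescale the data
Z_pca = PCA(n_components=2).fit_transform(X)

# Kernel PCA with an RBF kernel (gamma is an illustrative value, not a tuned one)
Z_kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10.0).fit_transform(X)


def threshold_accuracy(Z, labels):
    """Hypothetical helper: best accuracy reachable by thresholding one component of Z."""
    best = 0.0
    for j in range(Z.shape[1]):
        z = Z[:, j]
        for t in np.unique(z):
            acc = np.mean((z > t) == labels)
            # A threshold can assign either side to either group
            best = max(best, acc, 1.0 - acc)
    return best


acc_pca = threshold_accuracy(Z_pca, y)
acc_kpca = threshold_accuracy(Z_kpca, y)
print(f"Best single-threshold accuracy, PCA:        {acc_pca:.3f}")
print(f"Best single-threshold accuracy, Kernel PCA: {acc_kpca:.3f}")
```

If the kernel and its parameter suit the geometry, a single threshold on one of the Kernel PCA components should split the two groups far better than any threshold on the linear PCA components.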