Chapter 9. Dimensionality Reduction Using Feature Extraction
9.0 Introduction
It is common to have access to thousands and even hundreds of thousands of features. For example, in Chapter 8 we transformed a 256 × 256–pixel color image into 196,608 features (256 × 256 pixels × 3 color channels). Furthermore, because each of these features can take one of 256 possible values, our observation can take 256^196608 different configurations. Many machine learning algorithms have trouble learning from such data, because it will never be practical to collect enough observations for the algorithms to operate correctly. Even in more tabular, structured datasets we can easily end up with thousands of features after the feature engineering process.
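To make the scale of that number concrete, here is a small sketch of the arithmetic. It assumes an RGB image (three channels) with 8-bit values, which is what the 196,608 and 256 figures imply; the variable names are illustrative only. Rather than building the huge integer itself, it estimates how many decimal digits 256^196608 would have.

```python
import math

# Assumptions for illustration: an RGB image with 8-bit channel values.
height, width, channels = 256, 256, 3
values_per_feature = 256

# Each pixel contributes one feature per color channel.
n_features = height * width * channels

# The number of configurations is values_per_feature ** n_features.
# Count its decimal digits with logarithms instead of printing the integer.
n_digits = math.floor(n_features * math.log10(values_per_feature)) + 1

print("features:", n_features)
print("decimal digits in the number of configurations:", n_digits)
```

Even the digit count runs to hundreds of thousands, which is why no realistic number of observations can cover the space.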
Fortunately, not all features are created equal, and the goal of feature extraction for dimensionality reduction is to transform our set of features, p_original, such that we end up with a new set, p_new, where p_original > p_new, while still keeping much of the underlying information. Put another way, we reduce the number of features with only a small loss in our data's ability to generate high-quality predictions. In this chapter, we will cover a number of feature extraction techniques to do just this.
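As a rough illustration of going from p_original to p_new, the sketch below uses principal component analysis computed with a NumPy singular value decomposition. PCA is used here only as one example of feature extraction; the synthetic data, the 99% variance threshold, and all variable names are assumptions made for this sketch, not something prescribed by the text above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 200 observations whose
# 50 features are noisy linear mixtures of only 5 underlying signals.
n_observations, p_original, n_signals = 200, 50, 5
signals = rng.normal(size=(n_observations, n_signals))
mixing = rng.normal(size=(n_signals, p_original))
noise = 0.01 * rng.normal(size=(n_observations, p_original))
features = signals @ mixing + noise

# Center the features, then decompose; the rows of vt are the principal
# directions, ordered from most to least variance explained.
centered = features - features.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)

# Keep the fewest components that retain at least 99% of the variance.
explained = singular_values**2 / np.sum(singular_values**2)
cumulative = np.cumsum(explained)
p_new = int(np.searchsorted(cumulative, 0.99)) + 1

# Project the observations onto the retained directions.
features_new = centered @ vt[:p_new].T

print("p_original:", p_original)
print("p_new:", p_new)
print("variance retained:", float(cumulative[p_new - 1]))
```

Because the synthetic features are built from only a few underlying signals, a handful of new features retains almost all of the variance. Real datasets are rarely this clean, so the threshold is a trade-off to tune rather than a fixed rule.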
One downside of the feature extraction techniques we discuss is that the new features we generate will not be interpretable by humans. They will retain as much, or nearly as much, of the information needed to train our models but will appear to the human eye as a collection of random numbers. If we wanted to ...