Chapter 10. Dimensionality Reduction Using Feature Selection
10.0 Introduction
In Chapter 9, we discussed how to reduce the dimensionality of our feature matrix by creating new features with (ideally) similar abilities to train quality models but with significantly fewer dimensions. This is called feature extraction. In this chapter we will cover an alternative approach: selecting high-quality, informative features and dropping less useful features. This is called feature selection.
There are three types of feature selection methods: filter, wrapper, and embedded. Filter methods select the best features by examining their statistical properties. Methods where we explicitly set a threshold for a statistic or manually choose the number of features to keep are examples of feature selection by filtering. Wrapper methods use trial and error to find the subset of features that produces models with the highest-quality predictions. Wrapper methods are often the most effective, because they find the best result through actual experimentation rather than naive assumptions. Finally, embedded methods select the best feature subset as part of, or as an extension of, a learning algorithm’s training process.
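To make the contrast between filter and wrapper methods concrete, here is a minimal sketch that uses only the Python standard library. The toy data, the function names (`filter_by_variance`, `centroid_accuracy`, `wrapper_search`), the variance threshold of 0.1, and the choice of a nearest-centroid classifier are all assumptions made for illustration; they do not come from the chapter.

```python
from itertools import combinations
from statistics import pvariance

# Toy data invented for illustration: six observations, three features.
# Feature 0 is constant, feature 1 is unrelated to the target,
# and feature 2 separates the two classes.
X = [
    [1.0, 0.0, 5.1],
    [1.0, 1.0, 4.9],
    [1.0, 0.0, 5.0],
    [1.0, 1.0, 1.2],
    [1.0, 0.0, 0.9],
    [1.0, 1.0, 1.1],
]
y = [1, 1, 1, 0, 0, 0]


def filter_by_variance(X, threshold):
    """Filter method: keep features whose variance exceeds a fixed threshold."""
    n_features = len(X[0])
    return [j for j in range(n_features)
            if pvariance([row[j] for row in X]) > threshold]


def centroid_accuracy(X, y, features):
    """Training accuracy of a nearest-centroid classifier on a feature subset."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [row for row, target in zip(X, y) if target == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in features]
    correct = 0
    for row, target in zip(X, y):
        # Predict the class whose centroid is closest (squared Euclidean distance).
        predicted = min(
            centroids,
            key=lambda label: sum((row[j] - c) ** 2
                                  for j, c in zip(features, centroids[label])),
        )
        correct += predicted == target
    return correct / len(y)


def wrapper_search(X, y):
    """Wrapper method: train a model on every feature subset, keep the best."""
    best_subset, best_score = None, -1.0
    n_features = len(X[0])
    # Smaller subsets are tried first, so ties favor fewer features.
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            score = centroid_accuracy(X, y, subset)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


kept_by_filter = filter_by_variance(X, threshold=0.1)
best_subset, best_score = wrapper_search(X, y)
print("Filter keeps features:", kept_by_filter)
print("Wrapper selects subset:", best_subset, "with accuracy", best_score)
```

The filter step never trains a model: it looks only at each feature's variance, so it keeps the noisy feature 1 alongside the informative feature 2. The wrapper step evaluates a model on each candidate subset, which is more expensive but lets it discard feature 1 as well. For brevity this sketch scores subsets on the training data; in practice a held-out set or cross-validation would be a safer choice. In scikit-learn, `VarianceThreshold` and `RFE` offer ready-made versions of a variance filter and a recursive, model-driven selector, respectively.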
Ideally, we’d describe all three methods in this chapter. However, since embedded methods are closely intertwined with specific learning algorithms, they are difficult to explain prior to a deeper dive into the algorithms themselves. Therefore, in this chapter we cover only filter and ...