Chapter 5

Feature Selection

Preamble

After your analytical data set is prepared for modeling, you must select the variables (or features) to use as predictors. Feature selection is an important step in preparing data for data mining. A major problem when mining large data sets with many potential predictor variables is the curse of dimensionality. Richard Bellman (1961) coined this expression to describe a problem that grows as more variables are added to a model. As additional variables are added to a model, it may be able to predict a number better in regression ...

Get Handbook of Statistical Analysis and Data Mining Applications now with the O’Reilly learning platform.
