A case study – a credit card defaulting dataset

By intelligently extracting the most important signals from our data and ignoring noise, feature selection algorithms achieve two major outcomes:

  • Improved model performance: Removing redundant and irrelevant features makes us less likely to base decisions on noise and lets our models home in on the features that matter, which improves the predictive performance of the pipeline
  • Reduced training and predicting time: Fitting pipelines to less data generally shortens both model fitting and prediction, making our pipelines faster overall
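
To make these two outcomes concrete, here is a minimal sketch of one simple kind of feature selection: rank each column by its absolute correlation with the response and keep only the top few. It uses only the Python standard library. The toy data, the helper names pearson and select_top_k, and the choice of k are all assumptions made for illustration, and this is not the dataset or the method used in the case study.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / (var_x * var_y) ** 0.5


def select_top_k(rows, target, k):
    """Return the sorted indices of the k columns whose absolute
    correlation with the target is highest."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, target)), j))
    top = sorted(scores, reverse=True)[:k]
    return sorted(j for _, j in top)


# Hypothetical toy data: columns 0 and 1 depend on the response,
# columns 2 through 9 are pure Gaussian noise.
rng = random.Random(0)
target = [rng.randint(0, 1) for _ in range(500)]
rows = []
for y in target:
    signal_a = y + rng.gauss(0, 0.5)
    signal_b = -2 * y + rng.gauss(0, 1.0)
    noise = [rng.gauss(0, 1) for _ in range(8)]
    rows.append([signal_a, signal_b] + noise)

# Keep the two strongest columns and drop the rest.
selected = select_top_k(rows, target, k=2)
reduced_rows = [[row[j] for j in selected] for row in rows]

print("kept columns:", selected)
print("features before:", len(rows[0]), "after:", len(reduced_rows[0]))
```

A model fit on reduced_rows sees two columns instead of ten, which is where the shorter fitting and predicting times come from. Whether predictive performance also improves depends on the model and the data, so it is worth measuring rather than assuming.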

In order to gain a realistic understanding of how and why noisy data gets in the way, let's introduce our newest dataset, ...
