IBM SPSS Modeler Cookbook by Scott Mutchler, Tom Khabaza, Meta S. Brown, Dean Abbott, Keith McCormick

Quantifying variable importance with Monte Carlo simulation

Finding the smallest subset of the available input variables that still yields an accurate model (that is, a parsimonious solution) is often the biggest challenge in a data mining project. Data sets commonly contain tens to hundreds of input variables. With such so-called "wide" data sets, models may be over-trained or may simply fail to build. Removing unimportant variables to find the sweet spot between model accuracy and stability is where experienced data miners can deliver significant value.

The primary method of variable selection in Modeler is Feature Selection. The Feature Selection process identifies the significance of each variable individually. Statistically ...
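To make the idea of scoring each variable individually concrete, here is a minimal Python sketch. It is not Modeler's Feature Selection algorithm: the scoring statistic (absolute Pearson correlation), the function names, and the synthetic data are all assumptions chosen for illustration. The point it shows is only that each input is scored against the target on its own, without reference to any other input.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_inputs_individually(columns, target):
    """Score every input on its own (hypothetical helper, not a Modeler API).

    Each score depends only on that input and the target, so relationships
    that involve several inputs together are invisible to this ranking.
    """
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic data (assumption): one informative input and one pure-noise input.
rng = random.Random(0)
n_rows = 500
signal = [rng.gauss(0, 1) for _ in range(n_rows)]
noise = [rng.gauss(0, 1) for _ in range(n_rows)]
target = [2 * s + rng.gauss(0, 0.5) for s in signal]

ranking = rank_inputs_individually({"signal": signal, "noise": noise}, target)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

On this synthetic data the informative input should rank well above the noise input. Because every score is computed in isolation, a ranking of this kind cannot detect inputs that matter only in combination with others.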
