Chapter 31Case Study, Part 3: Modeling And Evaluation For Performance And Interpretability

31.1 Do You Prefer The Best Model Performance, Or A Combination Of Performance And Interpretability?

This chapter and Chapter 32 address our primary objective with the Case Study of Predicting Response to Direct-Mail Marketing: that of developing a classification model that will maximize profits. However, recall that multicollinearity among the predictors can lead to instability in certain models, such as multiple regression or logistic regression. Unstable models lack interpretability, because we cannot know with confidence, for example, that a particular logistic regression coefficient is positive or negative. The use of correlated predictors for decision trees is problematic as well. For example, imagine a decision tree applied to a data set with correlated predictors c31-math-0001 and c31-math-0002. Suppose the root node split is made on the uncorrelated variable c31-math-0003. Then the left side of the tree may make splits based on c31-math-0004, while the right side of the tree makes splits based on . Decision rules based on this tree will ...

Get Data Mining and Predictive Analytics, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.