Chapter 13. Explaining Models

Predictive models have different properties. Some are designed to handle linear data, while others can mold to more complex input. Some models are easy to interpret; others are like black boxes and offer little insight into how a prediction is made.

In this chapter we will look at ways to interpret different models, with examples that use the Titanic data.

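>>> from sklearn.tree import DecisionTreeClassifier
>>> # X_train and y_train are assumed to hold the Titanic
>>> # training features and labels
>>> dt = DecisionTreeClassifier(
...     random_state=42, max_depth=3
... )
>>> dt.fit(X_train, y_train)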

Regression Coefficients

The intercepts and regression coefficients explain the expected value and how each feature affects the prediction. A positive coefficient indicates that as a feature’s value increases, the prediction increases as well, holding the other features constant. A negative coefficient indicates the opposite.
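As a minimal sketch of reading coefficients, the block below fits a plain linear regression to synthetic data. The data, the feature names, and the choice of LinearRegression are assumptions made for illustration; they are not the Titanic features or the model used elsewhere in this chapter.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (an assumption, not the Titanic data).
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 2))
# True relationship: y = 3 + 2 * feat_0 - 1 * feat_1 + noise
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

lin = LinearRegression()
lin.fit(X, y)

# The intercept is the prediction when every feature is zero.
print("intercept:", lin.intercept_)
# One coefficient per feature; the sign gives the direction of the effect.
for name, coef in zip(["feat_0", "feat_1"], lin.coef_):
    print(name, coef)
```

The fitted values should land close to the constants used to generate the data: a positive coefficient for feat_0 and a negative one for feat_1.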

Feature Importance

Tree-based models in the scikit-learn library include a .feature_importances_ attribute for inspecting how the features of a dataset affect the model. We can inspect or plot them.
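The following sketch shows one way to inspect the attribute. It uses synthetic data and made-up feature names (both assumptions) so that it runs on its own; with the Titanic data, the fitted tree and the real column names would take their place.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption, not the Titanic data).
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
# Only the first two columns drive the label; the third is pure noise.
y = ((X[:, 0] + 0.5 * X[:, 1]) > 0).astype(int)

tree = DecisionTreeClassifier(random_state=42, max_depth=3)
tree.fit(X, y)

# Hypothetical feature names, for display only.
names = ["signal_strong", "signal_weak", "noise"]
importances = tree.feature_importances_

# Print the features from most to least important.
ranked = sorted(zip(names, importances), key=lambda p: p[1], reverse=True)
for name, imp in ranked:
    print(f"{name:15s} {imp:.3f}")
```

The importances are normalized, so they sum to 1 for a tree that has made at least one split. Here the column that drives the label most strongly should rank first.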

LIME

LIME (Local Interpretable Model-Agnostic Explanations) helps explain black-box models. It performs a local interpretation rather than an overall interpretation: it explains the prediction for a single sample.

For a given data point or sample, LIME indicates which features were important in determining the result. It does this by perturbing the sample in question and fitting a linear model to the perturbed points and the model’s predictions for them. The linear model approximates the model close to the sample (see Figure 13-1).
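To make the perturb-and-fit idea concrete, here is a rough sketch of a local linear surrogate built with NumPy and scikit-learn only. It is not the lime package and does not reproduce its sampling or weighting details. The helper explain_locally, its parameters, the synthetic data, and the chosen sample are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption): only column 0 affects the label.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = (X[:, 0] > 0).astype(int)

black_box = DecisionTreeClassifier(random_state=42, max_depth=3).fit(X, y)

def explain_locally(model, sample, rng, n_samples=2000, kernel_width=1.0):
    """Hypothetical helper: fit a weighted linear surrogate around `sample`."""
    # 1. Perturb the sample with Gaussian noise.
    perturbed = sample + rng.normal(size=(n_samples, sample.size))
    # 2. Ask the black-box model for its probability of class 1.
    preds = model.predict_proba(perturbed)[:, 1]
    # 3. Weight each perturbed point by its closeness to the sample.
    dist = np.linalg.norm(perturbed - sample, axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Fit a linear model; its coefficients are the local explanation.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbed, preds, sample_weight=weights)
    return surrogate.coef_

# A hypothetical sample near the decision boundary on column 0.
sample = np.array([0.2, 0.0, 0.0])
local_coefs = explain_locally(black_box, sample, rng)
print(local_coefs)
```

The coefficient with the largest magnitude marks the feature that matters most near this sample. Far from any decision boundary, the model's predictions barely change under perturbation, so all local coefficients shrink toward zero.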

Here is an example explaining the last sample (which our decision tree predicts will survive) from the training data: ...
