Chapter 16. Explaining Regression Models
Most of the techniques used to explain classification models apply to regression models. In this chapter, I will show how to use the SHAP library to interpret regression models.
We will interpret an XGBoost model for the Boston housing dataset:
>>> import xgboost as xgb
>>> xgr = xgb.XGBRegressor(
...     random_state=42, base_score=0.5
... )
>>> xgr.fit(bos_X_train, bos_y_train)
Shapley
I’m a big fan of SHAP (SHapley Additive exPlanations) because it is model agnostic. The library gives us global insight into our model and also helps explain individual predictions. I find it especially useful for black-box models.
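To make "model agnostic" concrete, here is a minimal sketch of an exact Shapley value computation in plain Python. It only ever calls a prediction function, so it works for any model. The names `shapley_values` and `toy_model`, the choice to "remove" a feature by replacing it with a baseline value, and all the numbers are assumptions made for illustration; this is not how the SHAP library computes values for tree models, which uses a much faster tree-specific algorithm.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating every coalition.

    Features outside a coalition are replaced by the baseline value, which
    is one simple (assumed) way to "remove" a feature.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Model output when only the features in `coalition` are known.
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(row):
    # Hypothetical black-box regressor with an interaction term.
    x0, x1, x2 = row
    return 10.0 + 3.0 * x0 - 0.01 * x1 - 0.5 * x2 + 0.02 * x0 * x2


sample = [6.0, 300.0, 12.0]    # made-up sample to explain
baseline = [5.0, 400.0, 10.0]  # made-up reference point

phi = shapley_values(toy_model, sample, baseline)
base_value = toy_model(baseline)
prediction = toy_model(sample)

# The contributions add up to the gap between prediction and base value.
print(phi, base_value + sum(phi), prediction)
```

The enumeration grows exponentially with the number of features, so it is only practical for a handful of columns. That cost is why specialized explainers exist for specific model families.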
We will first look at the prediction for index 5. Our model predicts the value to be about 27.27:
>>> sample_idx = 5
>>> xgr.predict(bos_X.iloc[[sample_idx]])
array([27.269186], dtype=float32)
To use SHAP, we have to create a TreeExplainer from our model and estimate the SHAP values for our samples. If we want to use Jupyter and have an interactive interface, we also need to call the initjs function:
>>> import shap
>>> shap.initjs()
>>> exp = shap.TreeExplainer(xgr)
>>> vals = exp.shap_values(bos_X)
With the explainer and the SHAP values, we can create a force plot to explain the prediction (see Figure 16-1). The plot shows that the base prediction is 23, and that the percentage of lower-status population (LSTAT) and the property tax rate (TAX) push the price up, while the number of rooms (RM) pushes the price down:
>>> shap.force_plot(
...     exp.expected_value,
...     vals[
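A force plot is a picture of an additive decomposition: start at the base value, then let each feature's SHAP value move the prediction up or down until it reaches the model's output. The sketch below prints that decomposition as text. The helper `describe_forces`, the feature names, and the numbers passed to it are hypothetical and chosen only to show the layout; they are not values from the Boston model.

```python
def describe_forces(base_value, contributions):
    """Summarize an additive explanation as text.

    `contributions` maps feature name -> SHAP value for one sample.
    Returns the reconstructed prediction and the lines of the summary.
    """
    # Largest absolute effect first, since big pushes matter most.
    ordered = sorted(
        contributions.items(), key=lambda kv: abs(kv[1]), reverse=True
    )
    lines = [f"base value: {base_value:.2f}"]
    for name, value in ordered:
        direction = "up" if value > 0 else "down"
        lines.append(f"{name:>6} pushes {direction:<4} by {abs(value):.2f}")
    # Base value plus all contributions gives the model output.
    prediction = base_value + sum(contributions.values())
    lines.append(f"prediction: {prediction:.2f}")
    return prediction, lines


# Hypothetical numbers for illustration only.
example = {"feat_a": 2.5, "feat_b": -1.0, "feat_c": 0.5}
prediction, lines = describe_forces(20.0, example)
print("\n".join(lines))
```

Reading the real plot works the same way: features on one side push the prediction above the base value, features on the other side pull it below.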