Predictive modeling: Striking a balance between accuracy and interpretability

Tips for using machine learning models in regulated industries.

By Patrick Hall
February 11, 2016
Balance. (source: Evonne on Flickr)

The inherent trade-off between accuracy and interpretability in predictive modeling can be a catch-22 for analysts and data scientists working in regulated industries. Professionals in regulated verticals such as banking and insurance often feel locked into traditional, linear modeling techniques for their predictive models, mainly because of stringent regulatory and documentation requirements. As machine learning becomes more mainstream, innovation and competition often push these same analysts and data scientists to break the mold and try algorithms with more predictive capacity, such as gradient boosted ensembles, neural networks, and random forests, among many others. These algorithms are typically more accurate at predicting nonlinear, faint, or rare phenomena. Unfortunately, more accuracy almost always means less interpretability, and interpretability is crucial for documentation and regulation processes.

Due to their inscrutable inner workings, many machine learning algorithms have been labeled “black box” models. What makes these models accurate is also what makes their predictions difficult to understand: they are very complex. This is a fundamental trade-off. So how can you improve the accuracy of more traditional linear models while still retaining some degree of interpretability?

Try different regression techniques

Penalized regression techniques are particularly well-suited for wide data. They avoid the multiple comparisons problem that can arise with stepwise variable selection. They can be trained on datasets with more columns than rows, and variants with an L1 penalty, such as the lasso and elastic net, preserve interpretability by selecting a small number of original variables for the final model. Generalized additive models fit linear terms to certain variables and nonlinear splines to other variables, allowing you to hand-tune a trade-off between interpretability and accuracy. With quantile regression, you can fit a traditional, interpretable linear model to different percentiles of the target in your training data, allowing you to find different sets of variables for modeling different behaviors across a customer market or portfolio of accounts.
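As a rough illustration of the wide-data case, here is a minimal sketch that fits an L1-penalized (lasso) regression with scikit-learn. The data is synthetic, the choice of three "true" columns is hypothetical, and alpha=0.1 is an arbitrary penalty strength picked for the example; in practice you would likely tune it by cross-validation.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_rows, n_cols = 50, 200  # wide data: more columns than rows
X = rng.normal(size=(n_rows, n_cols))

# Hypothetical ground truth: only the first three columns drive the target.
y = (3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.5 * X[:, 2]
     + rng.normal(scale=0.1, size=n_rows))

# The L1 penalty shrinks most coefficients to exactly zero.
lasso = Lasso(alpha=0.1, max_iter=10_000).fit(X, y)

# Indices of the columns that survive in the final model.
selected = np.flatnonzero(lasso.coef_)
print(f"{selected.size} of {n_cols} columns kept:", selected)
```

The surviving coefficients can be read like those of any linear model, which is what keeps the result documentable.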

Train a black box model and use it as a benchmark

A major difference between black box models and traditional linear models is that black box models automatically take into consideration a large number of implicit variable interactions. If your regression model is much less accurate than your black box model, you’ve probably missed some important interaction(s) of predictors.
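To make the benchmarking idea concrete, here is a small sketch on synthetic data, assuming scikit-learn. The target contains a hypothetical x0 * x1 interaction that a plain linear model cannot represent. A large gap between the two scores hints at a missing interaction, and adding that term by hand closes most of it.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
# Hypothetical target with an x0 * x1 interaction.
y = (X[:, 0] + X[:, 1] + 2.0 * X[:, 0] * X[:, 1]
     + rng.normal(scale=0.1, size=2000))
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Held-out R^2 for the interpretable model and for the black box benchmark.
linear_r2 = LinearRegression().fit(X_train, y_train).score(X_test, y_test)
gbm = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
gbm_r2 = gbm.score(X_test, y_test)


def add_interaction(X):
    """Append the x0 * x1 product as an explicit extra column."""
    return np.column_stack([X, X[:, 0] * X[:, 1]])


# Refit the linear model with the interaction term added by hand.
fixed = LinearRegression().fit(add_interaction(X_train), y_train)
fixed_r2 = fixed.score(add_interaction(X_test), y_test)

print(f"linear: {linear_r2:.2f}  black box: {gbm_r2:.2f}  "
      f"linear + interaction: {fixed_r2:.2f}")
```

On real data you will not know which interaction is missing, so the benchmark only tells you that it is worth looking.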

Try surrogate models for explaining black box models

Surrogate models are interpretable models used as a proxy to explain black box models. For example, fit a black box model to your training data. Then train a single decision tree on the original training data, but instead of using the actual target in the training data, use the predictions of the more complex algorithm as the target for this single decision tree. This single decision tree will likely be a more interpretable proxy you can use to explain the more complex logic of the black box model.
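The steps above can be sketched as follows, assuming scikit-learn. The random forest stands in for whatever black box you use, the data is synthetic, and the depth limit of 3 is an arbitrary choice that keeps the tree small enough to read.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 3))
# Hypothetical target: a step in x0 plus a curve in x1; x2 is noise.
y = (np.where(X[:, 0] > 0, 2.0, -1.0) + X[:, 1] ** 2
     + rng.normal(scale=0.1, size=1000))

# Step 1: fit the black box model to the training data.
black_box = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
black_box_pred = black_box.predict(X)

# Step 2: fit a single shallow tree to the black box's predictions,
# not to the actual target.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, black_box_pred)

# R^2 of the surrogate against the black box predictions: a rough
# measure of how faithfully the tree mimics the complex model.
fidelity = surrogate.score(X, black_box_pred)
print(f"surrogate fidelity (R^2): {fidelity:.2f}")
print(export_text(surrogate, feature_names=["x0", "x1", "x2"]))
```

If the fidelity score is low, the tree is explaining something other than the black box, so it is worth checking before you rely on the printed rules.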

Use a black box model in your deployment process

As models are often trained on static snapshots of data, their predictions typically become less accurate over time as the market environment shifts away from the conditions captured in the training data. After a certain amount of time, models usually have to be retrained or replaced. Like changing an airplane part before it actually requires maintenance, you could use a black box model to predict when traditional deployed models need to be retrained or replaced, and do so before their predictive power lessens.
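One possible way to act on this idea, which is my own illustration and not a method described in this post, is sometimes called adversarial validation: train a black box classifier to tell the original training snapshot apart from recent data. An AUC near 0.5 suggests the two look alike, while a clearly higher AUC suggests the environment has shifted and a retrain may be due. The sketch below assumes scikit-learn; drift_auc is a hypothetical helper name and the data is synthetic.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score


def drift_auc(X_snapshot, X_recent, seed=0):
    """Cross-validated AUC of a classifier separating snapshot rows
    from recent rows (hypothetical helper, not a library function)."""
    X = np.vstack([X_snapshot, X_recent])
    is_recent = np.r_[np.zeros(len(X_snapshot), dtype=int),
                      np.ones(len(X_recent), dtype=int)]
    clf = GradientBoostingClassifier(random_state=seed)
    scores = cross_val_score(clf, X, is_recent, cv=5, scoring="roc_auc")
    return scores.mean()


rng = np.random.default_rng(0)
X_snapshot = rng.normal(size=(500, 4))  # data the model was trained on
X_same = rng.normal(size=(500, 4))      # recent data, same conditions
# Recent data where the first input has drifted upward.
X_shifted = rng.normal(loc=[0.8, 0.0, 0.0, 0.0], size=(500, 4))

auc_same = drift_auc(X_snapshot, X_same)
auc_shifted = drift_auc(X_snapshot, X_shifted)
print(f"no shift: {auc_same:.2f}  shifted: {auc_shifted:.2f}")
```

Any threshold for triggering a retrain would need to be chosen and validated for your own portfolio.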

Train a small, interpretable ensemble model

Combining the predictions of a handful of good, but different, models often results in better predictions. You could train an interpretable regression model and an interpretable decision tree and average their predictions. You can also try a stacked ensemble or super learner, in which a linear model is often used to optimally weight the predictions of several different models that are then assembled together.
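Here is a minimal sketch of both approaches, assuming scikit-learn and synthetic data. It averages a linear regression with a shallow decision tree, then uses StackingRegressor so that a linear model learns the weights instead. The target, the tree depth, and the sample size are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(1500, 2))
# Hypothetical target: a linear effect of x0 plus a step in x1.
y = (1.5 * X[:, 0] + np.where(X[:, 1] > 0, 2.0, 0.0)
     + rng.normal(scale=0.3, size=1500))
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two small, individually interpretable models.
linear = LinearRegression().fit(X_train, y_train)
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_train, y_train)

# Simple ensemble: the plain average of the two predictions.
averaged_pred = (linear.predict(X_test) + tree.predict(X_test)) / 2.0

# Stacked ensemble: a linear model learns how to weight the two
# base models from their cross-validated predictions.
stack = StackingRegressor(
    estimators=[
        ("linear", LinearRegression()),
        ("tree", DecisionTreeRegressor(max_depth=3, random_state=0)),
    ],
    final_estimator=LinearRegression(),
).fit(X_train, y_train)


def mse(pred):
    """Mean squared error against the held-out target."""
    return float(np.mean((pred - y_test) ** 2))


print("linear:", mse(linear.predict(X_test)))
print("tree:", mse(tree.predict(X_test)))
print("averaged:", mse(averaged_pred))
print("stacked:", mse(stack.predict(X_test)))
print("stacking weights:", stack.final_estimator_.coef_)
```

The stacking weights are themselves just two coefficients, so the combined model stays fairly easy to document.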

Use black box techniques to create nonlinear predictors

One way to increase the accuracy of traditional, linear models is to introduce nonlinear predictors into the model. While this can be as straightforward as using polynomial and interaction terms, black box models can be used to create new predictors that capture more complex relationships. Use a black box model to learn complex, non-polynomial, nonlinear relationships between input and target variables, or to learn high-degree interactions between input variables. Then use its predictions as a predictor in a linear model.
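A sketch of that last step, assuming scikit-learn and synthetic data: a gradient boosted model learns a hypothetical sine-shaped relationship, and its predictions become one extra column in a linear model. Building the training column from out-of-fold predictions is my own precaution against leaking the target into the new predictor; the post does not prescribe it.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict, train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 2))
# Hypothetical target: linear in x0, non-polynomial (sine) in x1.
y = (X[:, 0] + 2.0 * np.sin(2.0 * X[:, 1])
     + rng.normal(scale=0.2, size=2000))
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gbm = GradientBoostingRegressor(random_state=0)
# Out-of-fold predictions: each training row is scored by a model
# that never saw that row's target.
oof_pred = cross_val_predict(gbm, X_train, y_train, cv=5)
# Refit on all training rows so new data can be scored later.
gbm.fit(X_train, y_train)

# Baseline linear model on the original inputs only.
plain = LinearRegression().fit(X_train, y_train)

# Linear model with the black box prediction as one extra predictor.
augmented = LinearRegression().fit(
    np.column_stack([X_train, oof_pred]), y_train
)

plain_r2 = plain.score(X_test, y_test)
augmented_r2 = augmented.score(
    np.column_stack([X_test, gbm.predict(X_test)]), y_test
)
print(f"plain: {plain_r2:.2f}  with black box predictor: {augmented_r2:.2f}")
```

Keep in mind that the new predictor is only as explainable as the model behind it, so you may still need to document how it was built.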

Try variable importance measures and partial dependence plots for explaining black box models

Variable importance measures are available for models like gradient boosted ensembles, neural networks, and random forests. Partial dependence plots are an interpretation tool that shows how a model's predictions change, on average, as one or two important variables vary, which can reveal nonlinear effects and interactions. Depending on your organization’s validation requirements or regulator, variable importance measures and partial dependence plots may be an acceptable documentation tool for black box models used as part of your overall analytics infrastructure.


The detailed logic that defines black box models is usually too complicated to explain to internal validation teams and external regulators. However, competitive regulated industries like banking and insurance also demand a high degree of accuracy from their operational predictive models. These competing interests can put analysts, data scientists, and their management in a tough spot. I’ve drawn on my experience with customers on four continents to propose the tips in this post. While I realize these tips probably won’t solve your exact business problem, I hope they may trigger ideas and help you take small steps toward using machine learning to gain a competitive advantage.

This post is a collaboration between O’Reilly and SAS. See our statement of editorial independence.
