Chapter 7. Explaining a PyTorch Image Classifier

Chapter 6 focused on using explainable models and post hoc explanations for models trained on tabular data. In this chapter, we’ll discuss these same concepts in the context of deep learning (DL) models trained on unstructured data, with a particular focus on image data. Code examples for the chapter are available online. As a reminder, Chapter 2 introduces the concepts of explainable models and post hoc explanation.

We’ll begin with an introduction to the hypothetical use case that the chapter’s technical examples demonstrate. Then we’ll proceed much as we did in Chapter 6. First, we’ll present a concept refresher on explainable models and feature attribution methods for deep neural networks, focusing on perturbation- and gradient-based explanation methods. We’ll also continue a thread from Chapter 6 by outlining how explainability techniques can inform model debugging, a topic we’ll expand on even further in Chapters 8 and 9.

Next, we’ll discuss inherently explainable models in more detail. We put forward a short section on explainable DL models in hopes that some readers will be able to build their own explainable models, because as of today, that’s the best hope for truly explainable results. We’ll introduce prototype-based image classification models, like ProtoPNet Digital Mammography—a promising direction for explainable computer vision. After that, we’ll discuss post hoc explanation techniques. We will ...

