Chapter 18. CNN Interpretation with CAM

Now that we know how to build up pretty much anything from scratch, let’s use that knowledge to create entirely new (and very useful!) functionality: the class activation map. It gives us some insight into why a CNN made the predictions it did.

In the process, we’ll learn about one handy feature of PyTorch we haven’t seen before, the hook, and we’ll apply many of the concepts introduced in the rest of the book. If you want to really test out your understanding of the material in this book, after you’ve finished this chapter, try putting it aside and re-creating the ideas here yourself from scratch (no peeking!).

CAM and Hooks

The class activation map (CAM) was introduced by Bolei Zhou et al. in “Learning Deep Features for Discriminative Localization”. It uses the output of the last convolutional layer (just before the average pooling layer) together with the predictions to give us a heatmap visualization of why the model made its decision. This is a useful tool for interpretation.

More precisely, at each position of our final convolutional layer, we have as many filters as the last linear layer has inputs. We can therefore compute the dot product of those activations with the final weights to get, for each location on our feature map and for each class, a score showing how much that location contributed to the decision.
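The dot product described above can be sketched in a few lines of PyTorch. This is only an illustration under stated assumptions: the function name `class_activation_map`, the tensor sizes (512 channels on a 7×7 grid, 2 classes), and the random stand-in data are all hypothetical and not taken from the chapter.

```python
import torch

def class_activation_map(acts: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """Dot product of final-layer weights with conv activations at every location.

    acts:    (channels, rows, cols) activations of the last conv layer for one image
    weights: (n_classes, channels) weight matrix of the final linear layer
    returns: (n_classes, rows, cols), one score map per class
    """
    # Sum over the shared channel axis k, keeping class c and position (i, j).
    return torch.einsum('ck,kij->cij', weights, acts)

# Hypothetical stand-in data: random values instead of a real model's tensors.
torch.manual_seed(0)
acts = torch.randn(512, 7, 7)
weights = torch.randn(2, 512)

cam_map = class_activation_map(acts, weights)
print(cam_map.shape)  # one 7x7 map per class

# Because average pooling and the linear layer are both linear, averaging each
# map over its locations gives the model's pre-bias score for that class.
pooled_scores = weights @ acts.mean(dim=(1, 2))
```

The last two lines show why the map is meaningful: the average of each class map equals the score the linear layer would produce (before any bias) from the average-pooled activations, so the map splits that score across spatial locations.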

We’re going to need a way to get access to the activations inside the model while it’s training. In PyTorch, this can be done with a hook. Hooks are PyTorch’s ...
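As a minimal sketch of the idea, the snippet below uses PyTorch's `register_forward_hook` to capture a module's output. The `ActivationHook` class and the tiny `body`/`head` model are hypothetical stand-ins, not the chapter's own code. A forward hook fires on every forward pass, so for brevity the sketch runs the model in evaluation mode rather than in a training loop.

```python
import torch
from torch import nn

class ActivationHook:
    """Stores the output of a module each time the module runs a forward pass."""
    def __init__(self, module: nn.Module):
        self.stored = None
        # register_forward_hook returns a handle that can later remove the hook.
        self.handle = module.register_forward_hook(self._hook_fn)

    def _hook_fn(self, module, inputs, output):
        # Keep a detached copy so the stored tensor is independent of autograd.
        self.stored = output.detach().clone()

    def remove(self):
        self.handle.remove()

# Hypothetical tiny model: a conv "body" followed by pooling and a linear "head".
body = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU())
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2))
model = nn.Sequential(body, head).eval()

hook = ActivationHook(body)
x = torch.randn(1, 3, 16, 16)  # one random stand-in "image"
with torch.no_grad():
    preds = model(x)
hook.remove()  # always remove hooks when done, or they keep firing

print(hook.stored.shape)  # activations of the conv body for this input
print(preds.shape)        # one score per class
```

Removing the hook through its handle matters in practice: a hook left registered keeps running, and keeps holding a reference to the stored activations, on every later forward pass.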
