Chapter 13. Convolutional Neural Networks

In Chapter 4, we learned how to create a neural network that recognizes images. We achieved a bit over 98% accuracy at distinguishing 3s from 7s, but we also saw that fastai’s built-in classes could get close to 100%. Let’s start trying to close the gap.

In this chapter, we will begin by digging into what convolutions are and building a CNN from scratch. We will then study a range of techniques to improve training stability and learn all the tweaks the library usually applies for us to get great results.

The Magic of Convolutions

One of the most powerful tools that machine learning practitioners have at their disposal is feature engineering. A feature is a transformation of the data that is designed to make it easier to model. For instance, the add_datepart function that we used for our tabular dataset preprocessing in Chapter 9 added date features to the Bulldozers dataset. What kinds of features might we be able to create from images?

Jargon: Feature Engineering

Creating new transformations of the input data in order to make it easier to model.
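To make the definition concrete, here is a minimal sketch of date-based feature engineering using only Python's standard library. It is not fastai's add_datepart; the function name date_features and the particular set of features are assumptions chosen for illustration. The idea is the same, though: one raw date column becomes several columns that a model can use more easily.

```python
from datetime import date

def date_features(d: date) -> dict:
    """Expand a single date into several simple features.

    Hypothetical helper for illustration; not the fastai implementation.
    """
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "day_of_week": d.weekday(),            # Monday = 0, Sunday = 6
        "day_of_year": d.timetuple().tm_yday,  # 1 through 365 (or 366)
        "is_month_start": d.day == 1,
    }

# An arbitrary example date
features = date_features(date(2011, 11, 16))
print(features)
```

A model that sees only a raw timestamp has to discover weekly or seasonal patterns on its own; with explicit columns such as day_of_week, those patterns are directly available.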

In the context of an image, a feature is a visually distinctive attribute. For example, the number 7 is characterized by a horizontal edge near the top of the digit, and a top-right to bottom-left diagonal edge underneath that. On the other hand, the number 3 is characterized by a diagonal edge in one direction at the top left and bottom right of the digit, the opposite diagonal ...
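One way to picture a feature such as a horizontal edge is to slide a small grid of weights over the image and sum the weighted pixels at each position. The sketch below does this in plain Python. The 5x5 image, the 3x3 kernel values, and the helper name apply_kernel are all assumptions made for illustration, not code from the chapter.

```python
# A tiny 5x5 grayscale "image": a bright bar (1s) on a dark background (0s)
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]

# Illustrative 3x3 kernel: negative weights on the row above, positive
# weights on the row below, so it responds strongly where dark pixels
# sit directly above bright ones (the top edge of a shape).
top_edge = [
    [-1, -1, -1],
    [ 0,  0,  0],
    [ 1,  1,  1],
]

def apply_kernel(img, kernel):
    """Slide `kernel` over `img` (no padding) and return the weighted sums."""
    k = len(kernel)
    out_h = len(img) - k + 1
    out_w = len(img[0]) - k + 1
    return [
        [
            sum(img[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

response = apply_kernel(image, top_edge)
for row in response:
    print(row)
```

In this toy case the response is positive where the window covers the top of the bar and negative where it covers the bottom, which is the sense in which a kernel like this "detects" a top edge.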
