Chapter 7. Training a State-of-the-Art Model

This chapter introduces more advanced techniques for training an image classification model and getting state-of-the-art results. You can skip it if you want to learn more about other applications of deep learning and come back to it later—knowledge of this material will not be assumed in later chapters.

We will look at what normalization is, a powerful data augmentation technique called Mixup, the progressive resizing approach, and test time augmentation. To show all of this, we are going to train a model from scratch (not using transfer learning) on a subset of ImageNet called Imagenette. It contains 10 very different categories from the original ImageNet dataset, making for quicker training when we want to experiment.
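
As a concrete starting point, here is a minimal sketch of how you might download Imagenette and build dataloaders for it with fastai. The image sizes, augmentation settings, and batch size here are illustrative choices, not requirements:

    from fastai.vision.all import *

    # Download Imagenette and get the path to the extracted files
    path = untar_data(URLs.IMAGENETTE)

    # Labels are encoded as parent folder names; resize each item to a
    # generous size, then augment and crop to a smaller size per batch
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                       get_items=get_image_files,
                       get_y=parent_label,
                       item_tfms=Resize(460),
                       batch_tfms=aug_transforms(size=224, min_scale=0.75))
    dls = dblock.dataloaders(path, bs=64)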

This is going to be much harder to do well than with our previous datasets because we’re using full-size, full-color images: photos of objects at different sizes, in different orientations, under different lighting, and so forth. So in this chapter we’re going to introduce important techniques for getting the most out of your dataset, especially when you’re training from scratch, or using transfer learning on a kind of dataset very different from the one the pretrained model was trained on.
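
To make "training from scratch" concrete, a baseline on the dataloaders built above might look like the sketch below. The choice of fastai's xresnet50 architecture, the epoch count, and the learning rate are our assumptions for illustration, not prescribed by the text:

    # Build a model with randomly initialized weights (no pretrained
    # parameters); dls.c is the number of target categories
    model = xresnet50(n_out=dls.c)

    # Use a plain Learner, rather than a fine-tuning helper, since there
    # are no pretrained layers to freeze
    learn = Learner(dls, model, loss_func=CrossEntropyLossFlat(),
                    metrics=accuracy)
    learn.fit_one_cycle(5, 3e-3)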

Imagenette

When fast.ai first started, people used three main datasets for building and testing computer vision models:

ImageNet

1.3 million images of various sizes, around 500 pixels across, in 1,000 categories
