Chapter 7. Training Pipeline
The stage after preprocessing is model training, during which the machine learning model reads in the training data and uses it to adjust its weights (see Figure 7-1). After training, the model is saved or exported so that it can be deployed.
In this chapter, we will look at ways to make the ingestion of training (and validation) data into the model more efficient. We will take advantage of time slicing between the different computational devices (CPUs and GPUs) available to us, and examine how to make the whole process more resilient and reproducible.
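To make the idea of sharing work between devices concrete, here is a minimal sketch that uses only the Python standard library and is not taken from the book's repository. It assumes a hypothetical `load_batches` function that stands in for CPU-side reading and preprocessing, and a hypothetical `train_step` that stands in for accelerator-side computation; both simply sleep. The `prefetch` helper is likewise an illustrative name: it prepares upcoming batches on a background thread while the current batch is being consumed.

```python
import queue
import threading
import time


def load_batches(num_batches, load_seconds):
    """Hypothetical loader: simulates reading and preprocessing on the CPU."""
    for i in range(num_batches):
        time.sleep(load_seconds)
        yield i


def train_step(batch, step_seconds):
    """Hypothetical training step: simulates work on the accelerator."""
    time.sleep(step_seconds)
    return batch


def prefetch(iterable, buffer_size=2):
    """Yield items from `iterable`, producing them on a background thread."""
    buffer = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the stream

    def producer():
        try:
            for item in iterable:
                buffer.put(item)  # blocks while the buffer is full
        finally:
            buffer.put(done)  # always unblock the consumer

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            return
        yield item


def run(batches, step_seconds):
    """Consume all batches and return them with the elapsed wall time."""
    start = time.perf_counter()
    seen = [train_step(b, step_seconds) for b in batches]
    return seen, time.perf_counter() - start


# Without prefetching, loading and training alternate.
sequential, t_seq = run(load_batches(5, 0.05), 0.05)
# With prefetching, the next batch is loaded while the current one trains.
overlapped, t_pre = run(prefetch(load_batches(5, 0.05)), 0.05)
print(f"sequential: {t_seq:.2f}s, prefetched: {t_pre:.2f}s")
```

Both runs process the same batches in the same order, but the prefetched run should finish sooner because loading overlaps with the simulated training step. Frameworks provide this pattern built in (for example, `tf.data.Dataset.prefetch` in TensorFlow), so in practice a hand-rolled thread like this one is rarely needed.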
The code for this chapter is in the 07_training folder of the book’s GitHub repository. We will provide file names for code samples and notebooks where applicable.
A significant part of the time it takes to train machine learning models is spent on ingesting data: reading it and transforming it into a form that is usable by the model. The more we can streamline and speed up this stage of the training pipeline, the less time the rest of the pipeline spends waiting for data. We can do this by:
- Storing data efficiently
We should preprocess the input images as much as possible, and store the preprocessed values in a way that is ...