Chapter 6. Model Training

Now that the data preprocessing step is complete and the data has been transformed into the format that our model requires, the next step in our pipeline is to train the model with the freshly transformed data.

As we discussed in Chapter 1, we won’t cover the process of choosing your model architecture. We assume that you ran a separate experimentation process before you picked up this book and that you already know the type of model you wish to train. We discuss how to track this experimentation process in Chapter 15 because it helps with creating a full audit trail for the model. We also don’t cover the theoretical background you’ll need to understand the model training process. If you would like to learn more about this, we strongly recommend the O’Reilly publication Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd edition, by Aurélien Géron.

In this chapter, we cover the model training process as part of a machine learning pipeline, including how it is automated in a TFX pipeline. We also include some details of distribution strategies available in TensorFlow and how to tune hyperparameters in a pipeline. This chapter is more specific to TFX pipelines than most of the others because we don’t cover training as a standalone process.

As shown in Figure 6-1, by this point data has been ingested, validated, and preprocessed. This ensures that all the data needed by the model is present and that it has been reproducibly ...
