Chapter 7. Monitoring the Training Process

In the last chapter, you learned how to launch the model training process. In this chapter, we’ll cover the process itself.

I’ve used fairly straightforward examples in this book to help you grasp each concept. When you’re running a real training process in TensorFlow, however, things can be more complicated. For example, when problems arise you need to think about how to determine whether your model is overfitting the training data. (Overfitting occurs when the model learns and memorizes the training data, including its noise, so well that this hurts its ability to generalize to new data.) To detect overfitting, you’ll need to set up cross validation; if it is occurring, you can then take steps to prevent it.
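The most common way to spot overfitting is to hold out part of the training data as a validation set and watch the two loss curves diverge. Here is a minimal sketch of that idea, using a small synthetic dataset and model invented purely for illustration; `validation_split` and the `History` object are standard Keras features:

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 256 samples, 8 features, binary labels.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8)).astype("float32")
y = (x.sum(axis=1) > 0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Hold out 20% of the data as a validation set during training.
history = model.fit(x, y, validation_split=0.2, epochs=5, verbose=0)

# history.history records per-epoch metrics for both splits; a training
# loss that keeps falling while validation loss rises suggests overfitting.
print("training loss:  ", [round(v, 3) for v in history.history["loss"]])
print("validation loss:", [round(v, 3) for v in history.history["val_loss"]])
```

Nothing here prevents overfitting by itself; it only gives you the signal you need before reaching for remedies such as early stopping, covered later in this chapter.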

Other questions that often arise during the training process include:

  • How often should I save the model during the training process?

  • How should I determine which epoch gives the best model before overfitting occurs?

  • How can I track model performance?

  • Can I stop training if the model is not improving or is overfitting?

  • Is there a way to visualize the model training process?

TensorFlow provides an easy way to address these questions: callback functions. In this chapter, you will learn how to make quick use of callbacks to monitor the training process. The first half of the chapter discusses ModelCheckpoint and EarlyStopping, while the second half focuses on TensorBoard and shows you several techniques for invoking it.
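Before diving into each callback in detail, here is a sketch of how the three callbacks named above plug into `model.fit()`. The dataset, model, filenames, and parameter values are placeholder assumptions chosen for illustration; the callback classes themselves (`tf.keras.callbacks.ModelCheckpoint`, `EarlyStopping`, and `TensorBoard`) are part of the standard Keras API:

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data and model, just to give fit() something to train.
rng = np.random.default_rng(42)
x = rng.normal(size=(256, 8)).astype("float32")
y = (x.sum(axis=1) > 0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

callbacks = [
    # Save only the model from the epoch with the best validation loss.
    tf.keras.callbacks.ModelCheckpoint(
        "best_model.keras", monitor="val_loss", save_best_only=True),
    # Stop training if validation loss hasn't improved for 3 epochs,
    # and roll back to the best weights seen so far.
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=3, restore_best_weights=True),
    # Write event logs for visualization (view with: tensorboard --logdir logs).
    tf.keras.callbacks.TensorBoard(log_dir="logs"),
]

history = model.fit(x, y, validation_split=0.2, epochs=20,
                    callbacks=callbacks, verbose=0)
print("epochs actually run:", len(history.history["loss"]))
```

Because EarlyStopping can halt training before the requested 20 epochs, the number of epochs actually run may be smaller; that behavior, along with each callback's parameters, is unpacked in the sections that follow.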
