Chapter 9. Serving TensorFlow Models

If you’ve been reading the chapters in this book sequentially, you now know a lot about how to handle the data engineering pipeline, build models, launch training routines, checkpoint models at each epoch, and even score test data. In all the examples thus far, these tasks have mostly been wrapped together for didactic purposes. In this chapter, however, you’re going to learn more about how to serve TensorFlow models based on the format in which they are saved.

Another important distinction between this chapter and previous chapters is that here you will learn a coding pattern for handling data engineering for test data. Previously, you saw test data and training data transformed together, in the same runtime. As a machine learning engineer, though, you also have to think about the scenarios in which your model is deployed.

Imagine that your model is loaded in a Python runtime and ready to go. You have a sample, or a batch of samples. What do you need to do to the input data so that the model can accept it and return predictions? In other words: you have a model and raw test data; how do you implement the logic that transforms the raw data? In this chapter, you will learn about serving the model through a few examples.
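To make the question concrete, here is a minimal sketch of such a transform. It assumes a hypothetical model trained on features standardized with the training set's mean and standard deviation (the statistics below are made-up values); at serving time those statistics must be stored alongside the model and reapplied to every raw input before prediction:

```python
import numpy as np

# Hypothetical statistics computed on the training set. In a real
# deployment these would be saved with the model, not hardcoded.
TRAIN_MEAN = np.array([0.5, 2.0, 10.0])
TRAIN_STD = np.array([0.1, 0.5, 3.0])

def preprocess(raw_batch):
    """Transform raw samples into the layout the model expects."""
    batch = np.asarray(raw_batch, dtype=np.float32)
    if batch.ndim == 1:            # a single sample: add a batch dimension
        batch = batch[np.newaxis, :]
    return (batch - TRAIN_MEAN) / TRAIN_STD

# Both a single raw sample and a batch are accepted:
single = preprocess([0.5, 2.0, 10.0])    # shape (1, 3)
batch = preprocess([[0.6, 2.5, 13.0],
                    [0.4, 1.5, 7.0]])    # shape (2, 3)
```

The output of `preprocess` is what would actually be fed to `model.predict`; the raw values never reach the model directly.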

Model Serialization

A TensorFlow model can be saved in two different native formats (without any optimization): HDF5 (h5) or protobuf (pb). Both formats are standard data serialization (saving) formats in Python and other programming ...
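As a quick sketch of the two save paths, assuming TensorFlow 2.x with its bundled Keras API (the toy model below is purely for illustration, not one built in earlier chapters):

```python
import tensorflow as tf

# A toy model purely for illustration.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, input_shape=(3,)),
])

# HDF5 format: a single .h5 file (the extension selects this format).
model.save("my_model.h5")

# SavedModel (protobuf) format: a directory containing saved_model.pb
# plus variables/ and assets/ subdirectories.
model.save("my_saved_model")
```

Which format you choose matters later: the SavedModel directory is what tools such as TensorFlow Serving consume, while the single `.h5` file is convenient for reloading in Python with `tf.keras.models.load_model`.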
