Chapter 9. Advanced Model Deployments with TensorFlow Serving

In the previous chapter, we discussed how to deploy TensorFlow or Keras models efficiently with TensorFlow Serving. Building on that knowledge of basic model deployment and TensorFlow Serving configuration, this chapter introduces advanced use cases for machine learning model deployments. These use cases cover a variety of topics, for example, A/B testing of deployed models, optimizing models for deployment and scaling, and monitoring model deployments. If you haven’t had the chance to review the previous chapter, we recommend doing so because it provides the fundamentals for this chapter.

Decoupling Deployment Cycles

The basic deployments shown in Chapter 8 work well, but they have one restriction: as we discussed in the previous chapter, the trained and validated model must either be included in the deployment container image during the build step or be mounted into the container at runtime. Both options require knowledge of DevOps processes (e.g., updating Docker container images) or coordination between the data science and DevOps teams whenever a new model version is deployed.

As we briefly mentioned in Chapter 8, TensorFlow Serving can load models from remote storage locations (e.g., AWS S3 or GCP Storage buckets). The standard loader policy of TensorFlow Serving periodically polls the model storage location and, when it detects a newer model version, unloads the previously loaded model and loads the newer one. ...
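The polling behavior described above can be illustrated with a small Python sketch. It is not TensorFlow Serving's actual implementation; it only mimics the idea under the assumption that model versions are stored as numbered subdirectories (for example, `1/`, `2/`) beneath a model base path. The function names `latest_version` and `poll_once` are hypothetical, and a local temporary directory stands in for a remote bucket.

```python
import os
import tempfile
from typing import Optional


def latest_version(model_base_path: str) -> Optional[int]:
    """Return the highest numeric version directory, or None if there is none."""
    versions = [
        int(name)
        for name in os.listdir(model_base_path)
        if name.isdecimal() and os.path.isdir(os.path.join(model_base_path, name))
    ]
    return max(versions) if versions else None


def poll_once(model_base_path: str, loaded: Optional[int]) -> Optional[int]:
    """Run one polling step and return the version that should be served next."""
    newest = latest_version(model_base_path)
    if newest is not None and (loaded is None or newest > loaded):
        # A real model server would unload `loaded` and load `newest` here.
        return newest
    return loaded


# Usage: a temporary directory plays the role of the model storage location.
with tempfile.TemporaryDirectory() as base_path:
    os.mkdir(os.path.join(base_path, "1"))
    served = poll_once(base_path, None)    # picks up version 1
    os.mkdir(os.path.join(base_path, "2"))  # a new model version is exported
    served = poll_once(base_path, served)  # detects and switches to version 2
    print(served)
```

In a real deployment, the model server calls the equivalent of `poll_once` on a timer. To our knowledge, TensorFlow Serving exposes the polling interval through the `--file_system_poll_wait_seconds` flag, but check the documentation of your version before relying on it.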
