Deploying models with TensorFlow Serving

In general, when deploying models, we want the inner workings of the model isolated from the public behind an HTTP interface. With a traditional machine learning model, we would wrap the serialized model in a web framework such as Flask to create an API and serve the model from there. This approach can lead to a myriad of issues with dependencies, versioning, and performance, so instead we are going to use a tool from the TensorFlow authors called TensorFlow Serving. It spins up a small server that runs a TensorFlow model and provides access to it.
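
Before the server can do anything, the model has to be exported in the SavedModel format that TensorFlow Serving reads from disk. As a minimal sketch, assuming TensorFlow 2.x and a small Keras classifier (the path /tmp/my_model is just an example), the export looks like this:

    import tensorflow as tf

    # A toy classifier standing in for whatever model we want to deploy.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # TensorFlow Serving expects <base_path>/<version>/; it watches the base
    # path and serves the highest version number it finds, so exporting to a
    # new numbered directory is how a model gets upgraded in place.
    tf.saved_model.save(model, "/tmp/my_model/1")

The server is then pointed at the base path (/tmp/my_model here), for example via the tensorflow_model_server binary or the tensorflow/serving Docker image, and it handles loading and version management on its own.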

TensorFlow Serving communicates through a specific remote procedure call framework known as gRPC. In computer science, a remote procedure call lets a client invoke a procedure on a remote server as if it were a local function call; gRPC builds on this idea, using HTTP/2 for transport and protocol buffers to serialize requests and responses.
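
Once the server is up, a client talks to it over gRPC. The following sketch assumes the model was served under the name my_model on the default gRPC port 8500, with the grpcio and tensorflow-serving-api packages installed; the input key and shape here are illustrative and depend on the model's serving signature:

    import grpc
    import numpy as np
    import tensorflow as tf
    from tensorflow_serving.apis import predict_pb2
    from tensorflow_serving.apis import prediction_service_pb2_grpc

    # Open a channel to the server and create a stub for the Predict service.
    channel = grpc.insecure_channel("localhost:8500")
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    # Build the request: model name, signature, and a serialized input tensor.
    request = predict_pb2.PredictRequest()
    request.model_spec.name = "my_model"               # assumed model name
    request.model_spec.signature_name = "serving_default"
    batch = np.random.rand(1, 784).astype(np.float32)  # assumed input shape
    request.inputs["dense_input"].CopyFrom(tf.make_tensor_proto(batch))

    # The stub call reads like a local function call but executes remotely;
    # the second argument is a deadline in seconds.
    response = stub.Predict(request, 10.0)
    print(response.outputs)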
