Live Online Training

High Performance TensorFlow in Production: Hands on with GPUs and Kubernetes

Develop hands-on experience optimizing and deploying TensorFlow models

Chris Fregly

This course gives you hands-on experience deploying an optimized TensorFlow model into production for real-time prediction serving. You’ll build a lightweight machine learning/AI prediction service, similar to those offered by AWS and Google Cloud ML. You’ll train and export your TensorFlow model using a Jupyter notebook and the Python-based TensorFlow development libraries. You’ll learn the structure of a TensorFlow model; the key components of TensorFlow Serving, including versioning and rollback; and how to optimize and simplify your trained TensorFlow model to reduce prediction latency.

You’ll also tune TensorFlow Serving to increase prediction throughput, and deploy your model with the C++-based TensorFlow Serving runtime to serve high-performance, real-time predictions.
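As a rough illustration of what a real-time prediction request can look like, here is a minimal sketch that builds (but does not send) a request for TensorFlow Serving's REST predict endpoint. It assumes a TensorFlow Serving release that includes the REST API, listening on port 8501; the host, the model name `my_model`, the input values, and the helper `build_predict_request` are hypothetical, and the course itself may use a different client or gRPC.

```python
import json


def build_predict_request(host, model_name, instances, version=None, port=8501):
    """Return (url, body) for a TensorFlow Serving REST predict call.

    Hypothetical helper: the host, port, and model name are assumptions
    made for this sketch, not values taken from the course.
    """
    path = f"/v1/models/{model_name}"
    if version is not None:
        # Pin the request to one model version instead of the default one.
        path += f"/versions/{version}"
    url = f"http://{host}:{port}{path}:predict"
    # The "row" request format: one entry per input example.
    body = json.dumps({"instances": instances})
    return url, body


url, body = build_predict_request(
    "localhost", "my_model", [[1.0, 2.0, 5.0]], version=2
)
print(url)
print(body)
```

Posting `body` to `url` with any HTTP client would then request a prediction from version 2 of the model, provided a server is actually running at that address.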

While model training is part of this course, we focus mainly on model optimization and serving. We may also cover parts of distributed TensorFlow, Docker, and Kubernetes.

What you'll learn, and how you can apply it

You’ll understand:

  • The structure of a TensorFlow model
  • Key components of TensorFlow Serving
  • How to optimize a TensorFlow model for serving
  • How to tune TensorFlow Serving for performance
  • How to deploy TensorFlow models with TensorFlow Serving
  • How to version and roll back models with TensorFlow Serving

And you'll be able to:

  • Optimize trained TensorFlow models to reduce prediction latency
  • Deploy trained TensorFlow models to TensorFlow Serving in production
  • Tune the TensorFlow Serving runtime to increase prediction throughput
  • Version and roll back models with TensorFlow Serving
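To make the versioning and rollback item more concrete: TensorFlow Serving loads models from a base directory containing one numbered subdirectory per version and, under its default policy, serves the highest-numbered one. The sketch below only mimics that selection rule on a temporary directory using the Python standard library. It does not call TensorFlow Serving, and the model name `my_model` and both helper functions are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def available_versions(model_base_path):
    """Return the numeric version subdirectories, sorted ascending."""
    base = Path(model_base_path)
    return sorted(
        int(p.name) for p in base.iterdir() if p.is_dir() and p.name.isdigit()
    )


def version_to_serve(model_base_path):
    """Mimic the default 'latest version' policy: pick the highest number."""
    versions = available_versions(model_base_path)
    return versions[-1] if versions else None


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "my_model"  # hypothetical model base path
    for v in (1, 2, 3):
        (base / str(v)).mkdir(parents=True)  # stand-ins for exported models

    before = version_to_serve(base)
    # Roll back by removing the newest version directory.
    shutil.rmtree(base / str(before))
    after = version_to_serve(base)

print(before, after)
```

In this sketch a rollback is simply the removal of the newest version directory, after which the next-highest version is selected. A real deployment might instead pin specific versions in the server's model configuration.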

This training course is for you because...

  • You are a software engineer or data engineer with intermediate production-deployment experience, and you need to learn to deploy TensorFlow models to production.

  • You are a data scientist or business analyst with intermediate ML or AI experience, and you need to learn to optimize TensorFlow models for production deployment.


Prerequisites

  • Intermediate software engineering or data science skills.

Setup required prior to the first course meeting:

  1. The only requirement is a modern browser (e.g., Chrome or Firefox).
  2. Every attendee will get their own cloud instance, accessible via their browser. The instructor will give each attendee the IP address of their instance at the beginning of the course. All work will be done in Jupyter notebooks running on each attendee’s assigned cloud instance.
  3. All work can be saved locally. The instructor will provide download instructions at the end of the course.

About your instructor

  • Chris Fregly is a Research Scientist at PipelineAI, a machine learning and artificial intelligence startup in San Francisco. He is also an Apache Spark Contributor, Netflix Open Source Committer, and Founder of the SF-based Advanced Spark and TensorFlow Meetup. Previously, Chris was a Streaming Engineer at Netflix, Data Solutions Engineer at Databricks, and Founding Member of the IBM Spark Technology Center in San Francisco.


Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Day 1: TensorFlow Model Training

  • TensorFlow and GPUs
  • Inspect and Debug Models
  • Distributed Training Across a Cluster
  • Optimize Training with Queues, Dataset API, and JIT XLA Compiler

Day 2: TensorFlow Model Deploying and Serving Predictions

  • Optimize Predicting with AOT XLA and Graph Transform Tool (GTT)
  • Key Components of TensorFlow Serving
  • Optimize TensorFlow Serving Runtime