Chapter 7. Deploying PyTorch to Production

Most of this book so far has focused on model design and training. Earlier chapters showed you how to use the built-in capabilities of PyTorch to design your models and create custom NN modules, loss functions, optimizers, and other algorithms. In the previous chapter, we looked at how to use distributed training and model optimizations to accelerate your model training times and minimize the resources needed for running your models.

At this point, you have everything you need to create some well-trained, cutting-edge NN models—but don’t let your innovations sit in isolation. Now it’s time to deploy your models into the world through applications.

In the past, going from research to production was a challenging task that required a team of software engineers to port PyTorch models to another framework and integrate them into an (often non-Python) production environment. Today, PyTorch includes built-in tools and external libraries to support rapid deployment to a variety of production environments.

In this chapter, we focus on deploying your model for inference, not training, and we’ll explore how to deploy your trained PyTorch models into a variety of applications. First, I’ll describe the various built-in capabilities and tools within PyTorch that you can use for deployment. Tools like TorchServe and TorchScript allow you to easily deploy your PyTorch models to the cloud and to mobile or edge devices.
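As a small taste of what TorchScript makes possible, the sketch below converts a trivial model to TorchScript and reloads it without needing the original Python class definition; the `TinyNet` module and the `tiny_net.pt` filename are illustrative placeholders, not part of any real application.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """A placeholder model standing in for your trained network."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.fc(x))

model = TinyNet()
model.eval()  # inference mode: we are deploying, not training

# Compile the model to TorchScript and serialize it to disk.
scripted = torch.jit.script(model)
scripted.save("tiny_net.pt")

# The saved artifact can be loaded in a fresh process (or from C++
# via libtorch) with no dependency on the TinyNet class above.
loaded = torch.jit.load("tiny_net.pt")
out = loaded(torch.randn(1, 4))
```

Because the serialized program carries its own graph and weights, it can run in environments where the original Python source is unavailable, which is the key to the non-Python deployments discussed later in this chapter.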

Depending on the application and environment, ...
