9

Scaling a Deep Learning Pipeline

Amazon Web Services (AWS) offers many options for deploying deep learning (DL) models. In this chapter, we introduce the two most popular services designed for deploying a DL model as an inference endpoint: Elastic Kubernetes Service (EKS) and SageMaker.

In the first half, we describe the EKS-based approach. We first discuss how to create inference endpoints for TensorFlow (TF) and PyTorch models and how to deploy them using EKS. We also introduce the Elastic Inference (EI) accelerator, which can increase throughput while reducing cost. In an EKS cluster, pods host the inference endpoints as web servers. As the last topic for EKS-based deployment, we will introduce how the pods ...
