Fabric for deep learning at IBM
case study

by Animesh Singh, Atin Sood, Tommy Li
August 2019
44m
English
O'Reilly Media, Inc.
Closed Captioning available in German, English, Spanish, French, Japanese, Korean, Portuguese (Portugal, Brazil), Chinese (Simplified), Chinese (Traditional)

Overview

Training deep neural network models requires a highly tuned system with the right combination of software, drivers, compute, memory, network, and storage resources. Deep learning frameworks like TensorFlow, PyTorch, Caffe, Torch, Theano, and MXNet have contributed to the popularity of deep learning by reducing the effort and skill needed to design, train, and use deep learning models. Fabric for Deep Learning (FfDL, pronounced “fiddle”) provides a consistent way to run these deep learning frameworks as a service on Kubernetes. FfDL uses a microservices architecture to reduce coupling between components, keep each component simple and as stateless as possible, isolate component failures, and allow each component to be developed, tested, deployed, scaled, and upgraded independently. Animesh Singh, Atin Sood, and Tommy Li share lessons learned while building and using FfDL and demonstrate how to use it to run distributed deep learning training, on GPUs and with object storage, for models written in multiple frameworks. They then explain how to take models from IBM’s Model Asset Exchange, train them using FfDL, and deploy them on Kubernetes for serving and inferencing. This session is sponsored by IBM.

Publisher Resources

ISBN: 0636920405535