Training deep neural network models requires a highly tuned system with the right combination of software, drivers, compute, memory, network, and storage resources. Deep learning frameworks like TensorFlow, PyTorch, Caffe, Torch, Theano, and MXNet have contributed to the popularity of deep learning by reducing the effort and skill needed to design, train, and use deep learning models. Fabric for Deep Learning (FfDL, pronounced “fiddle”) provides a consistent way to run these deep learning frameworks as a service on Kubernetes.

FfDL uses a microservices architecture to reduce coupling between components, keep each component simple and as stateless as possible, isolate component failures, and allow each component to be developed, tested, deployed, scaled, and upgraded independently.

Animesh Singh, Atin Sood, and Tommy Li share lessons learned while building and using FfDL and demonstrate how to leverage it to execute distributed deep learning training for models written using multiple frameworks, using GPUs and object storage constructs. They then explain how to take models from IBM’s Model Asset Exchange, train them using FfDL, and deploy them on Kubernetes for serving and inferencing.

This session is sponsored by IBM.
Table of contents
- Fabric for deep learning at IBM 00:44:37
- Title: Fabric for deep learning at IBM
- Release date: August 2019
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 0636920452133