Designing Deep Learning Systems, Video Edition

Video description

In Video Editions, the narrator reads the book while the content, figures, code listings, diagrams, and text appear on the screen. It is like an audiobook that you can also watch as a video.

A vital guide to building the platforms and systems that bring deep learning models to production.

In Designing Deep Learning Systems you will learn how to:

  • Transfer your software development skills to deep learning systems
  • Recognize and solve common engineering challenges for deep learning systems
  • Understand the deep learning development cycle
  • Automate training for models in TensorFlow and PyTorch
  • Optimize dataset management, training, model serving, and hyperparameter tuning
  • Pick the right open-source project for your platform

Deep learning systems are the components and infrastructure essential to supporting a deep learning model in a production environment. Written especially for software engineers with minimal knowledge of deep learning’s design requirements, Designing Deep Learning Systems is full of hands-on examples that will help you transfer your software development skills to creating these deep learning platforms. You’ll learn how to build automated and scalable services for core tasks like dataset management, model training/serving, and hyperparameter tuning. This book is the perfect way to step into an exciting—and lucrative—career as a deep learning engineer.

Designing Deep Learning Systems: A software engineer’s guide teaches you everything you need to design and implement a production-ready deep learning platform. First, it presents the big picture of a deep learning system from the developer’s perspective, including its major components and how they are connected. Then, it carefully guides you through the engineering methods you’ll need to build your own maintainable, efficient, and scalable deep learning platforms.

Chi Wang is a principal software developer in the Salesforce Einstein group. Donald Szeto was the co-founder and CTO of PredictionIO.

Table of contents

  1. Chapter 1. An introduction to deep learning systems
  2. Chapter 1. Deep learning system design overview
  3. Chapter 1. Building a deep learning system vs. developing a model
  4. Chapter 1. Summary
  5. Chapter 2. Dataset management service
  6. Chapter 2. Touring a sample dataset management service
  7. Chapter 2. Open source approaches
  8. Chapter 2. Summary
  9. Chapter 3. Model training service
  10. Chapter 3. Deep learning training code pattern
  11. Chapter 3. A sample model training service
  12. Chapter 3. Kubeflow training operators: An open source approach
  13. Chapter 3. When to use the public cloud
  14. Chapter 3. Summary
  15. Chapter 4. Distributed training
  16. Chapter 4. Data parallelism
  17. Chapter 4. A sample service supporting data parallel–distributed training
  18. Chapter 4. Training large models that can’t load on one GPU
  19. Chapter 4. Summary
  20. Chapter 5. Hyperparameter optimization service
  21. Chapter 5. Understanding hyperparameter optimization
  22. Chapter 5. Designing an HPO service
  23. Chapter 5. Open source HPO libraries
  24. Chapter 5. Summary
  25. Chapter 6. Model serving design
  26. Chapter 6. Common model serving strategies
  27. Chapter 6. Designing a prediction service
  28. Chapter 6. Summary
  29. Chapter 7. Model serving in practice
  30. Chapter 7. TorchServe model server sample
  31. Chapter 7. Model server vs. model service
  32. Chapter 7. Touring open source model serving tools
  33. Chapter 7. Releasing models
  34. Chapter 7. Postproduction model monitoring
  35. Chapter 7. Summary
  36. Chapter 8. Metadata and artifact store
  37. Chapter 8. Metadata in a deep learning context
  38. Chapter 8. Designing a metadata and artifacts store
  39. Chapter 8. Open source solutions
  40. Chapter 8. Summary
  41. Chapter 9. Workflow orchestration
  42. Chapter 9. Designing a workflow orchestration system
  43. Chapter 9. Touring open source workflow orchestration systems
  44. Chapter 9. Summary
  45. Chapter 10. Path to production
  46. Chapter 10. Model productionization
  47. Chapter 10. Model deployment strategies
  48. Chapter 10. Summary
  49. Appendix A. A “hello world” deep learning system
  50. Appendix A. Lab demo
  51. Appendix B. Survey of existing solutions
  52. Appendix B. Google Vertex AI
  53. Appendix B. Microsoft Azure Machine Learning
  54. Appendix B. Kubeflow
  55. Appendix B. Side-by-side comparison
  56. Appendix C. Creating an HPO service with Kubeflow Katib
  57. Appendix C. Getting started with Katib
  58. Appendix C. Expedite HPO
  59. Appendix C. Katib system design
  60. Appendix C. Adding a new algorithm
  61. Appendix C. Further reading
  62. Appendix C. When to use it

Product information

  • Title: Designing Deep Learning Systems, Video Edition
  • Author(s): Chi Wang, Donald Szeto
  • Release date: June 2023
  • Publisher(s): Manning Publications