4 Model serving patterns

This chapter covers

  • Using model serving to generate predictions or make inferences on new data with previously trained machine learning models
  • Handling model serving requests and achieving horizontal scaling with replicated model serving services
  • Processing large model serving requests using the sharded services pattern
  • Assessing model serving systems and event-driven design

In the previous chapter, we explored some of the challenges involved in the distributed training component, and I introduced a couple of practical patterns that can be incorporated into this component. Distributed training is the most critical part of a distributed machine learning system. For example, we’ve seen challenges when training very large ...

