8
Two-Phase Model Serving
In this chapter, we will discuss the two-phase prediction pattern, in which we deploy two different models. The larger and more complex model is deployed on a server. In most cases, the clients of this model are edge devices whose network connection may fluctuate. When network access is poor, an edge device can use a lightweight local model to get predictions for basic use cases. For broader and more accurate predictions, the device calls the API of the model deployed on the server. We will discuss how to serve models in this scenario, where edge devices operate under unstable network conditions.
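The fallback behavior described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the chapter's implementation: `local_predict`, `two_phase_predict`, and the `remote_predict` callable are hypothetical names, the lightweight model is replaced by a trivial threshold rule, and the sketch assumes the device tries the server first and falls back to the local model when the call fails with a network error.

```python
from typing import Callable, Sequence, Tuple


def local_predict(features: Sequence[float]) -> str:
    """Hypothetical stand-in for the lightweight on-device model.

    A real edge model would be a small trained model; a simple
    threshold rule keeps this sketch self-contained.
    """
    return "positive" if sum(features) > 0 else "negative"


def two_phase_predict(
    features: Sequence[float],
    remote_predict: Callable[[Sequence[float]], str],
    fallback: Callable[[Sequence[float]], str] = local_predict,
) -> Tuple[str, str]:
    """Return (prediction, source), where source is "server" or "device".

    Assumption: remote_predict wraps the API call to the server-side model
    and raises ConnectionError or TimeoutError when the network is bad.
    """
    try:
        # Preferred path: the larger, more accurate server-side model.
        return remote_predict(features), "server"
    except (ConnectionError, TimeoutError):
        # Network is unavailable or too slow: use the on-device model.
        return fallback(features), "device"


# Hypothetical remote clients used only to exercise both paths.
def flaky_remote(features: Sequence[float]) -> str:
    raise ConnectionError("network unavailable")


def healthy_remote(features: Sequence[float]) -> str:
    return "positive (server)"
```

Calling `two_phase_predict([1.0, 2.0], healthy_remote)` takes the server path, while passing `flaky_remote` exercises the on-device fallback. Returning the source alongside the prediction lets the caller know which model produced the result, which matters when the two models differ in accuracy.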
We will cover the following topics in this chapter:
- Introducing two-phase ...