Chapter 2. What Matters in Model Management

The logistics required for successful machine learning go beyond what is needed for other types of software applications and services. Model management is a dynamic process: it must support multiple production models running across various locations, through many cycles of model development, retraining, and replacement. Management must also be flexible and quick to respond: you don’t want to wait until changes in the outside world degrade the performance of a production system before you begin to build better or alternative models, and you don’t want delays when it’s time to deploy new models into production.

All this needs to be done in a style that fits the goals of modern digital transformation. Logistics should not be a barrier to fast-moving, data-oriented systems, or a burden to the people who build machine learning models and use the insights drawn from them. To make these tasks much easier, we introduce the rendezvous architecture for managing machine learning logistics.

Ingredients of the Rendezvous Approach

The rendezvous architecture takes advantage of data streams and geo-distributed stream replication to maintain a responsive and flexible way to collect and save data, including raw data, and to make data and multiple models available when and where needed. A key feature of the rendezvous design is that it keeps new models warmed up so that they can replace production models without significant lag time. The design ...
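The idea of keeping new models warmed up can be illustrated with a minimal sketch. It is not the rendezvous architecture itself: the class WarmModelPool, its methods, and the toy models are hypothetical names introduced only for illustration, and the sketch assumes that "warm" means every candidate model scores every request alongside the production model. No data streams or stream replication are modeled here.

```python
from typing import Callable, Dict

# Assumption: a "model" is any callable that maps one input to one score.
Model = Callable[[float], float]


class WarmModelPool:
    """Hypothetical holder for one active model and its warm standbys."""

    def __init__(self, models: Dict[str, Model], active: str) -> None:
        if active not in models:
            raise KeyError(active)
        self.models = dict(models)
        self.active = active
        # Count how many requests each model has scored.
        self.calls = {name: 0 for name in self.models}

    def handle(self, x: float) -> float:
        # Every model scores every request, so standbys are already
        # loaded and exercised when they are promoted.
        results = {}
        for name, model in self.models.items():
            results[name] = model(x)
            self.calls[name] += 1
        # Only the active model's result is returned to the caller.
        return results[self.active]

    def promote(self, name: str) -> None:
        # Replacing the production model is just a change of pointer.
        if name not in self.models:
            raise KeyError(name)
        self.active = name


pool = WarmModelPool(
    {"current": lambda x: 2.0 * x, "candidate": lambda x: 3.0 * x},
    active="current",
)
before = pool.handle(2.0)   # answered by "current"
pool.promote("candidate")   # swap with no load or warm-up step
after = pool.handle(2.0)    # answered by "candidate"
```

In this toy version the cost of warmth is that every model does work on every request; the benefit is that promotion involves no loading or start-up delay.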
