Chapter 4. Managing Model Development
Model development is the aspect least changed by the introduction of rendezvous systems, containers, and a DataOps style of development, but the rendezvous style does bring some important changes.
One of the biggest differences is that in a DataOps team, model development goes on cheek by jowl with software development and operations, with much less separation between data scientists and others. That means data scientists must take on some tasks related to packaging and testing models that are a bit different from what they may be used to. The good news is that doing this makes deployment and management of the model smoother, so the data scientists are distracted less often by deployment problems.
Investing in Improvements
Over time, systems that use machine learning heavily can build up large quantities of hidden technical debt. This debt takes many forms, including data coupling between models, dead features, redundant inputs, hidden dependencies, and more. Most important, this debt is different from the sort of technical debt you find in normal software, so the software and ops specialists in a DataOps team won’t necessarily see it. Data scientists, who are typically used to working in a cloistered and sterilized environment, won’t recognize it either, because it is an emergent feature of real-world deployments.
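To make two of these forms of debt more concrete, here is a minimal sketch of an input audit. It rests on assumptions that are not from the text: a "dead feature" is taken to mean a column whose value never varies in a sample of production inputs, and a "redundant input" is taken to mean a pair of columns that are almost perfectly correlated. The function names, the sample data, and the 0.98 threshold are all hypothetical, and a real audit would need checks suited to the models in question.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant column
    return cov / sqrt(var_x * var_y)


def audit_features(columns, redundancy_threshold=0.98):
    """Return (dead, redundant) for a dict of feature name -> values.

    Assumption: "dead" means the column never varies in this sample;
    "redundant" means |correlation| >= redundancy_threshold.
    """
    dead = sorted(name for name, values in columns.items()
                  if len(set(values)) <= 1)
    live = sorted(name for name in columns if name not in dead)
    redundant = []
    for i, first in enumerate(live):
        for second in live[i + 1:]:
            r = pearson(columns[first], columns[second])
            if abs(r) >= redundancy_threshold:
                redundant.append((first, second))
    return dead, redundant


# Hypothetical sample of model inputs, for illustration only.
sample = {
    "age": [23, 35, 41, 52, 29],
    "age_in_months": [276, 420, 492, 624, 348],  # 12 * age
    "legacy_flag": [0, 0, 0, 0, 0],              # never varies
    "clicks": [3, 1, 7, 2, 5],
}

dead, redundant = audit_features(sample)
print("dead features:", dead)
print("redundant pairs:", redundant)
```

A check like this only catches the simplest cases. Data coupling between models and hidden dependencies generally do not show up in a single table of inputs.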
A variety of straightforward things can help with this debt. For instance, you should schedule regular ...