Chapter 5. Evaluating Model Validity and Quality
OK, so our model developers have created a model that they say is ready to go into production. Or we have an updated version of a model that needs to replace the version currently running in production. Before we flip the switch and start using this new model in a critical setting, we need to answer two broad questions. The first establishes model validity: will the new model break our system? The second addresses model quality: is the new model any good?
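To make the validity question concrete, here is a minimal sketch of the kind of pre-deployment smoke test that can gate a model swap. The `predict` interface, the function name, and the checks chosen are illustrative assumptions, not an interface from this book:

```python
def passes_validity_checks(model, sample_inputs, num_classes):
    """Smoke-test a candidate model before it replaces the production one.

    This is a hypothetical gate: it checks only that the model runs on
    well-formed inputs and produces structurally sane output, not that
    the model is any good (that is the quality question).
    """
    try:
        predictions = model.predict(sample_inputs)
    except Exception:
        # The candidate crashes on inputs the serving system will send it.
        return False

    # Shape check: one prediction per input row.
    if len(predictions) != len(sample_inputs):
        return False

    # Range check: per-class scores should be valid probabilities.
    for row in predictions:
        if len(row) != num_classes or any(p < 0 or p > 1 for p in row):
            return False

    return True
```

A check like this answers only "will the new model break our system?"; a model can pass it while still being far worse than the version it replaces, which is why the quality evaluation discussed in the rest of this chapter is a separate step.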
These are simple questions to ask but may require deep investigation to answer, often necessitating collaboration among folks with various areas of expertise. From an organizational perspective, it is important for us to develop and follow robust processes to ensure that these investigations are carried out carefully and thoroughly. Channeling our inner Thomas Edison, it is reasonable to say that model development is 1% inspiration and 99% verification.
This chapter dives into questions of both validity and quality, and provides enough background to allow MLOps folks to engage with both of these issues. We will also spend time talking about how to build processes, automation, and a strong culture around ensuring that these issues are treated with the attention, care, and rigor that practical deployment demands.
Figure 5-1 outlines the basic steps of model development and the role that quality plays in it. While this chapter focuses on evaluation ...