Chapter 6. Debug Your ML Problems

In the previous chapter, we trained and evaluated our first model.

Getting a pipeline to a satisfactory level of performance is hard and requires multiple iterations. The goal of this chapter is to guide you through one such iteration cycle. In this chapter, I will cover tools to debug modeling pipelines and ways to write tests to make sure they stay working once we start changing them.

Software best practices encourage practitioners to regularly test, validate, and inspect their code, especially for sensitive steps such as security or input parsing. This should be no different for ML, where errors in a model can be much harder to detect than in traditional software.
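As a concrete illustration of testing a sensitive step like input parsing, here is a minimal sketch of a unit test for a hypothetical parsing function. The function `parse_row` and its data format are assumptions made for this example, not code from the book:

```python
# Hypothetical example: testing an input-parsing step of a pipeline.
# `parse_row` and the "question,answer_count" row format are illustrative
# assumptions, not part of the book's codebase.

def parse_row(raw: str) -> dict:
    """Parse a 'question,answer_count' row into typed fields."""
    question, answer_count = raw.split(",")
    return {"question": question.strip(), "answer_count": int(answer_count)}

def test_parse_row_returns_typed_fields():
    parsed = parse_row("How do I train a model?, 3")
    assert parsed["question"] == "How do I train a model?"
    assert parsed["answer_count"] == 3
    assert isinstance(parsed["answer_count"], int)

test_parse_row_returns_typed_fields()
```

A test like this can run under a test runner such as pytest on every change, so a silent break in data parsing surfaces immediately instead of degrading the model downstream.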

We will cover tips to help you make sure your pipeline is robust and can be tried out without causing your entire system to fail. But first, let’s dig into software best practices!

Software Best Practices

For most ML projects, you will repeat the process of building a model, analyzing its shortcomings, and addressing them multiple times. You are also likely to change each part of your infrastructure more than once, so it is crucial to find methods to increase iteration speed.

In ML, just like in any other software project, you should follow time-tested software best practices. Most of them can be applied to ML projects with no modifications, such as building only what you need, often referred to as the Keep It Simple, Stupid (KISS) principle.

ML projects are ...
