Chapter 1. Introduction
In this first chapter, we will introduce machine learning pipelines and outline all the steps that go into building them. We’ll explain what needs to happen to move your machine learning model from an experiment to a robust production system. We’ll also introduce our example project that we will use throughout the rest of the book to demonstrate the principles we describe.
Why Machine Learning Pipelines?
The key benefit of machine learning pipelines lies in the automation of the model life cycle steps. When new training data becomes available, a workflow which includes data validation, preprocessing, model training, analysis, and deployment should be triggered. We have observed too many data science teams manually going through these steps, which is costly and also a source of errors. Let’s cover some details of the benefits of machine learning pipelines:
- Ability to focus on new models, not maintaining existing models
-
Automated machine learning pipelines will free up data scientists from maintaining existing models. We have observed too many data scientists spending their days on keeping previously developed models up to date. They run scripts manually to preprocess their training data, they write one-off deployment scripts, or they manually tune their models. Automated pipelines allow data scientists to develop new models, the fun part of their job. Ultimately, this will lead to higher job satisfaction and retention in a competitive job market.
- Prevention ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access