Kubeflow for Machine Learning

by Trevor Grant, Holden Karau, Boris Lublinsky, Richard Liu, Ilan Filonenko
October 2020
Intermediate to advanced
261 pages
6h 19m
English
O'Reilly Media, Inc.

Chapter 6. Artifact and Metadata Store

Machine learning typically involves dealing with a large amount of raw and intermediate (transformed) data, where the ultimate goal is creating and deploying a model. To understand our model, we need to be able to explore the datasets used to create it and the transformations applied to them (data lineage). The collection of these datasets and the transformations applied to them is called the metadata of our model.1

Model metadata is critical for reproducibility in machine learning;2 reproducibility, in turn, is critical for reliable production deployments. Capturing the metadata lets us understand variations when rerunning jobs or experiments, which is necessary to iteratively develop and improve our models. It also provides a solid foundation for model comparisons. As Pete Warden put it in a blog post:

To reproduce results, code, training data, and the overall platform need to be recorded accurately.

The same information is also required for other common ML operations, such as model comparison and reproducible model creation.
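As a rough illustration of what recording code, training data, and platform can look like, here is a minimal sketch that uses only the Python standard library. The `RunMetadata` class and the `capture` and `diff` helpers are hypothetical names invented for this example; they are not part of Kubeflow ML Metadata or MLflow Tracking, and the code version string is assumed to be supplied by the caller (for example, a git commit hash).

```python
import hashlib
import json
import platform
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunMetadata:
    """Hypothetical record of what is needed to reproduce one training run."""
    code_version: str    # assumed to be supplied by the caller, e.g. a commit hash
    data_sha256: str     # fingerprint of the training data
    python_version: str  # part of "the overall platform"
    os_name: str         # part of "the overall platform"


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest identifying the exact bytes of a dataset."""
    return hashlib.sha256(data).hexdigest()


def capture(code_version: str, training_data: bytes) -> RunMetadata:
    """Collect code, data, and platform information for the current run."""
    return RunMetadata(
        code_version=code_version,
        data_sha256=fingerprint(training_data),
        python_version=platform.python_version(),
        os_name=platform.system(),
    )


def diff(a: RunMetadata, b: RunMetadata) -> dict:
    """Return the fields whose values differ between two runs."""
    da, db = asdict(a), asdict(b)
    return {key: (da[key], db[key]) for key in da if da[key] != db[key]}


# Two runs with the same code and platform but slightly different data.
run_a = capture("abc123", b"x,y\n1,2\n")
run_b = capture("abc123", b"x,y\n1,3\n")
changed = diff(run_a, run_b)
print(json.dumps(sorted(changed)))  # prints ["data_sha256"]
```

Comparing the two records shows that only the data fingerprint changed, which is the kind of variation that captured metadata is meant to surface. A dedicated metadata store adds persistence, querying, and lineage on top of this basic idea.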

There are many different options for tracking the metadata of models. Kubeflow has a built-in tool for this called Kubeflow ML Metadata.3 The goal of this tool is to help Kubeflow users understand and manage their ML workflows by tracking and managing the metadata that the workflows produce. Another tool for tracking metadata that we can integrate into our Kubeflow pipelines is MLflow Tracking. It provides API ...


ISBN: 9781492050117