Kubeflow for Machine Learning
by Trevor Grant, Holden Karau, Boris Lublinsky, Richard Liu, Ilan Filonenko
Chapter 1. Kubeflow: What It Is and Who It Is For
If you are a data scientist trying to get your models into production, or a data engineer trying to make your models scalable and reliable, Kubeflow provides tools to help. Kubeflow addresses the problem of taking machine learning from research to production. Despite a common misconception, Kubeflow is more than just Kubernetes and TensorFlow; you can use it for all sorts of machine learning tasks. If your organization uses Kubernetes, we hope Kubeflow turns out to be the right tool for you. If it does not, “Alternatives to Kubeflow” introduces some options you may wish to explore.
This chapter aims to help you decide if Kubeflow is the right tool for your use case. We’ll cover the benefits you can expect from Kubeflow, some of the costs associated with it, and some of the alternatives. After this chapter, we’ll dive into setting up Kubeflow and building an end-to-end solution to familiarize you with the basics.
Model Development Life Cycle
Machine learning, or model development, essentially follows the path data → information → knowledge → insight. Figure 1-1 depicts this path from data to insight.
Model development life cycle (MDLC) is a term commonly used to describe the flow between training and inference. Figure 1-1 is a visual representation of this continuous interaction: whenever a model update is triggered, the whole cycle kicks off again.