Kubeflow for Machine Learning
by Trevor Grant, Holden Karau, Boris Lublinsky, Richard Liu, Ilan Filonenko
Chapter 4. Kubeflow Pipelines
In the previous chapter we described Kubeflow Pipelines, the component of Kubeflow that orchestrates machine learning applications. Orchestration is necessary because a typical machine learning implementation uses a combination of tools to prepare data, train the model, evaluate performance, and deploy. By formalizing the steps and their sequencing in code, pipelines allow users to capture all of the data processing, training, and deployment steps, ensuring their reproducibility and auditability.
We will start this chapter by taking a look at the Pipelines UI and showing how to start writing simple pipelines in Python. We'll explore how to transfer data between stages, and then look at ways to leverage existing applications as part of a pipeline. We will also look at the underlying workflow engine, Argo Workflows, a standard Kubernetes pipeline tool that Kubeflow uses to run pipelines. Understanding the basics of Argo Workflows gives you a deeper understanding of Kubeflow Pipelines and will aid in debugging. We will then show what Kubeflow Pipelines adds on top of Argo.
We'll wrap up the chapter by showing how to implement conditional execution in pipelines and how to run pipelines on a schedule. Task-specific pipeline components will be covered in their respective chapters.
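To make the material that follows more concrete, here is a minimal sketch of what a pipeline definition looks like with the kfp Python SDK, using the v1-style dsl.ContainerOp API. The pipeline name, the busybox image, and the hello_pipeline function are illustrative choices for this sketch, not examples taken from the chapter.

```python
import kfp
from kfp import dsl


@dsl.pipeline(
    name="hello-pipeline",
    description="A trivial single-step pipeline used only for illustration.",
)
def hello_pipeline():
    # Each pipeline step runs in a container; this one just echoes a message.
    dsl.ContainerOp(
        name="echo",
        image="busybox",
        command=["sh", "-c"],
        arguments=["echo 'hello from Kubeflow Pipelines'"],
    )


if __name__ == "__main__":
    # Compile the Python definition into a workflow package that can be
    # uploaded through the Pipelines UI or submitted with the kfp client.
    kfp.compiler.Compiler().compile(hello_pipeline, "hello_pipeline.yaml")
```

The compiled output is an Argo workflow specification, which is why a basic understanding of Argo Workflows, covered later in this chapter, helps when debugging pipeline runs.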
Getting Started with Pipelines
The Kubeflow Pipelines platform consists of:
- A UI for managing and tracking pipelines and ...