This course lays out the common architecture, infrastructure, and theoretical considerations for managing an enterprise machine learning (ML) model pipeline. Because automation is the key to effective operations, you'll learn about open source tools like Spark, Hive, ModelDB, and Docker and how they're used to bridge the gap between individual models and a reproducible pipeline. You'll also learn how effective data teams operate; why they use a common process for building, training, deploying, and maintaining ML models; and how they're able to seamlessly push models into production. The course is designed for the data engineer transitioning to the cloud and for the data scientist ready to use model deployment pipelines that are reproducible and automated. Learners should have basic familiarity with cloud platforms like Amazon Web Services; Scala or Python; Hadoop, Spark, or Pandas; SBT or Maven; and Bash, Docker, and REST.
- Understand how to set up and manage an enterprise ML model pipeline
- Learn the common components that make up enterprise ML model pipelines
- Explore the use and purpose of pipeline tools like Spark, Hive, ModelDB, and Docker
- Discover the gaps in the Spark ecosystem for maintaining and deploying ML pipelines
- Learn how to move from creating one-off models to building a reproducible automated pipeline
Jason Slepicka is a senior data engineer with Los Angeles-based DataScience, where he builds pipelines and data science platform infrastructure. He has a decade of experience integrating data to support efforts like fighting human trafficking for DARPA, exploring the evolution of evolvability in yeast, and tracking intruders in computer networks. Jason has both a bachelor's and a master's degree in computer science from the University of Arizona and is working on his PhD in computer science at the University of Southern California Information Sciences Institute.
- Title: An Introduction to Machine Learning Models in Production
- Release date: December 2017
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491988787