Automated Machine Learning on AWS

Automate the process of building, training, and deploying machine learning applications to production with AWS solutions such as SageMaker Autopilot, AutoGluon, Step Functions, Amazon Managed Workflows for Apache Airflow, and more

Key Features

  • Explore the various AWS services that make automated machine learning easier
  • Recognize the role of DevOps and MLOps methodologies in pipeline automation
  • Get acquainted with additional AWS services such as Step Functions and MWAA to overcome automation challenges

Book Description

AWS provides a wide range of solutions to help automate a machine learning workflow with just a few lines of code. With this practical book, you'll learn how to automate a machine learning pipeline using the various AWS services.

Automated Machine Learning on AWS begins with a quick overview of what the machine learning pipeline looks like and highlights the typical challenges that you may face when building one. Throughout the book, hands-on examples will help you become well versed in AWS solutions such as Amazon SageMaker Autopilot, AutoGluon, and AWS Step Functions for automating an end-to-end ML process. The book will show you how to build, monitor, and execute a CI/CD pipeline for the ML process and how the various CI/CD services within AWS can be applied to a use case with the AWS Cloud Development Kit (CDK). You'll understand what a data-centric ML process is by working with Amazon Managed Workflows for Apache Airflow (MWAA) and building a managed Airflow environment. You'll also cover the key success criteria for a machine learning software development life cycle (MLSDLC) implementation and the process of creating a self-mutating CI/CD pipeline using AWS CDK from the perspective of the platform engineering team.

By the end of this AWS book, you'll be able to effectively automate a complete machine learning pipeline and deploy it to production.

What you will learn

  • Employ SageMaker Autopilot and Amazon SageMaker SDK to automate the machine learning process
  • Understand how to use AutoGluon to automate complicated model building tasks
  • Use the AWS CDK to codify the machine learning process
  • Create, deploy, and rebuild a CI/CD pipeline on AWS
  • Build an ML workflow using AWS Step Functions and the Data Science SDK
  • Leverage the Amazon SageMaker Feature Store to automate the machine learning software development life cycle (MLSDLC)
  • Discover how to use Amazon MWAA for a data-centric ML process

Who this book is for

This book is for novice as well as experienced machine learning practitioners looking to automate the process of building, training, and deploying machine learning-based solutions into production, using both purpose-built and other AWS services. A basic understanding of the end-to-end machine learning process and concepts, Python programming, and AWS is necessary to get the most out of this book.

Table of contents

  1. Automated Machine Learning on AWS
  2. Foreword
  3. Contributors
  4. About the author
  5. About the reviewer
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Share Your Thoughts
  7. Section 1: Fundamentals of the Automated Machine Learning Process and AutoML on AWS
  8. Chapter 1: Getting Started with Automated Machine Learning on AWS
    1. Technical requirements
    2. Overview of the ML process
    3. Complexities in the ML process
    4. An example of the end-to-end ML process
      1. Introducing ACME Fishing Logistics
      2. The case for ML
      3. Getting insights from the data
      4. Building the right model
      5. Training the model
      6. Evaluating the trained model
      7. Exploring possible next steps
      8. Tuning our model
      9. Deploying the optimized model into production
      10. Streamlining the ML process with AutoML
    5. How AWS makes automating the ML development and deployment process easier
    6. Summary
  9. Chapter 2: Automating Machine Learning Model Development Using SageMaker Autopilot
    1. Technical requirements
    2. Introducing the AWS AI and ML landscape
    3. Overview of SageMaker Autopilot
    4. Overcoming automation challenges with SageMaker Autopilot
      1. Getting started with SageMaker Studio
      2. Preparing the experiment data
      3. Starting the Autopilot experiment
      4. Running the Autopilot experiment
      5. Post-experimentation tasks
    5. Using the SageMaker SDK to automate the ML experiment
      1. Codifying the Autopilot experiment
      2. Analyzing the Autopilot experiment with code
      3. Deploying the best candidate
      4. Cleaning up
    6. Summary
  10. Chapter 3: Automating Complicated Model Development with AutoGluon
    1. Technical requirements
    2. Introducing the AutoGluon library
    3. Using AutoGluon for tabular data
      1. Prerequisites
      2. Creating the AutoML experiment with AutoGluon
      3. Evaluating the experiment results
    4. Using AutoGluon for image data
      1. Prerequisites
      2. Creating an image prediction experiment
      3. Evaluating the experiment results
    5. Summary
  11. Section 2: Automating the Machine Learning Process with Continuous Integration and Continuous Delivery (CI/CD)
  12. Chapter 4: Continuous Integration and Continuous Delivery (CI/CD) for Machine Learning
    1. Technical requirements
    2. Introducing the CI/CD methodology
      1. Introducing the CI part of CI/CD
      2. Introducing the CD part of CI/CD
      3. Closing the loop
    3. Automating ML with CI/CD
      1. Taking a deployment-centric approach
      2. Creating an MLOps methodology
    4. Creating a CI/CD pipeline on AWS
      1. Using the AWS CI/CD toolchain
      2. Working with additional AWS developer tools
      3. Creating a cloud-native CI/CD pipeline for a production ML model
      4. Preparing the development environment
      5. Creating the pipeline artifact repository
      6. Developing the application artifacts
    5. Summary
  13. Chapter 5: Continuous Deployment of a Production ML Model
    1. Technical requirements
    2. Deploying the CI/CD pipeline
      1. Codifying the pipeline construct
      2. Creating the CDK application
      3. Deploying the pipeline application
    3. Building the ML model artifacts
      1. Reviewing the modeling file
      2. Reviewing the application file
      3. Reviewing the model serving files
      4. Reviewing the container build file
      5. Committing the ML artifacts
    4. Executing the automated ML model deployment
      1. Cleanup
    5. Summary
  14. Section 3: Optimizing a Source Code-Centric Approach to Automated Machine Learning
  15. Chapter 6: Automating the Machine Learning Process Using AWS Step Functions
    1. Technical requirements
    2. Introducing AWS Step Functions
      1. Creating a state machine
      2. Addressing state machine complexity
    3. Using the Step Functions Data Science SDK for CI/CD
    4. Building the CI/CD pipeline resources
      1. Updating the development environment
      2. Creating the pipeline artifact repository
      3. Building the pipeline application artifacts
      4. Deploying the CI/CD pipeline
    5. Summary
  16. Chapter 7: Building the ML Workflow Using AWS Step Functions
    1. Technical requirements
    2. Building the state machine workflow
      1. Setting up the service permissions
      2. Creating an ML workflow
    3. Performing the integration test
    4. Monitoring the pipeline's progress
    5. Summary
  17. Section 4: Optimizing a Data-Centric Approach to Automated Machine Learning
  18. Chapter 8: Automating the Machine Learning Process Using Apache Airflow
    1. Technical requirements
    2. Introducing Apache Airflow
    3. Introducing Amazon MWAA
    4. Using Airflow to process the abalone dataset
    5. Configuring the MWAA prerequisites
    6. Configuring the MWAA environment
    7. Summary
  19. Chapter 9: Building the ML Workflow Using Amazon Managed Workflows for Apache Airflow
    1. Technical requirements
    2. Developing the data-centric workflow
      1. Building and unit testing the data ETL artifacts
      2. Building the Airflow DAG
    3. Creating synthetic Abalone survey data
    4. Executing the data-centric workflow
      1. Cleanup
    5. Summary
  20. Section 5: Automating the End-to-End Production Application on AWS
  21. Chapter 10: An Introduction to the Machine Learning Software Development Life Cycle (MLSDLC)
    1. Technical requirements
    2. Introducing the MLSDLC
    3. Building the application platform
      1. Examining the role of the application owner
      2. Examining the role of the platform engineers
      3. Examining the role of the frontend developers
    4. Examining ML and data engineering roles
      1. Creating a SageMaker Feature Store
      2. Creating ML artifacts
      3. Creating continuous training artifacts
    5. Understanding the security lens
      1. Securing the data
      2. Securing the code
      3. Securing the website
    6. Summary
  22. Chapter 11: Continuous Integration, Deployment, and Training for the MLSDLC
    1. Technical requirements
    2. Codifying the continuous integration stage
      1. Building the integration artifacts
      2. Building the test artifacts
      3. Building the production artifacts
      4. Automating the continuous integration process
    3. Managing the continuous deployment stage
      1. Reviewing the build phase
      2. Reviewing the test phase
      3. Reviewing the deploy and maintain phases
      4. Reviewing the application user experience
    4. Managing continuous training
      1. Creating new Abalone survey data
      2. Reviewing the continuous training process
      3. Cleanup
    5. Summary
    6. Further reading
    7. Why subscribe?
  23. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Share Your Thoughts

Product information

  • Title: Automated Machine Learning on AWS
  • Author(s): Trenton Potgieter, Jonathan Dahlberg
  • Release date: April 2022
  • Publisher(s): Packt Publishing
  • ISBN: 9781801811828