Chapter 3. Automated Machine Learning

In this chapter, we show how to use fully managed Amazon AI and machine learning services so that we do not have to manage our own infrastructure for AI and machine learning pipelines. We dive deep into two Amazon services for automated machine learning, Amazon SageMaker Autopilot and Amazon Comprehend, both designed for users who want to build powerful predictive models from their datasets with just a few clicks. We can use both SageMaker Autopilot and Comprehend to establish baseline model performance with very low effort and cost.

Machine learning practitioners typically spend weeks or months building, training, and tuning their models. They prepare the data and decide on the framework and algorithm to use. In an iterative process, ML practitioners try to find the best-performing algorithm for their dataset and problem type. Unfortunately, there is no cheat sheet for this process. We still need experience, intuition, and patience to run many experiments and find the best hyperparameters for our algorithm and dataset. Seasoned data scientists draw on years of experience and intuition to choose the best algorithm for a given dataset and problem type, but they still need to validate that intuition with actual training runs and repeated model validations.

What if we could use a service that, with a single click, finds the best algorithm for our dataset, trains and tunes the model, and deploys the model to production? ...
