Chapter 3. Automated Machine Learning

In this chapter, we show how to use fully managed Amazon AI and machine learning services so that we do not have to manage our own infrastructure for AI and machine learning pipelines. We dive deep into two Amazon services for automated machine learning, Amazon SageMaker Autopilot and Amazon Comprehend, both designed for users who want to build powerful predictive models from their datasets with just a few clicks. We can use both SageMaker Autopilot and Comprehend to establish baseline model performance with very little effort and cost.

Machine learning practitioners typically spend weeks or months building, training, and tuning their models. They prepare the data and decide which framework and algorithm to use. In an iterative process, ML practitioners try to find the best-performing algorithm for their dataset and problem type. Unfortunately, there is no cheat sheet for this process. We still need experience, intuition, and patience to run many experiments and find the best hyperparameters for our algorithm and dataset. Seasoned data scientists draw on years of experience and intuition to choose the best algorithm for a given dataset and problem type, but they still need to validate their intuition with actual training runs and repeated model validations.
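To make the iterative process concrete, here is a minimal sketch of a manual hyperparameter search. It is only an illustration: the toy dataset, the simple nearest-neighbor classifier, and the names `make_dataset`, `knn_predict`, `accuracy`, and `manual_search` are all hypothetical and are not part of SageMaker Autopilot or Comprehend. It covers only the hyperparameter part of the search (the number of neighbors `k`) and leaves out the choice of algorithm.

```python
import random


def make_dataset(n, seed):
    """Toy 1-D binary data: label is 1 when x > 0.5, with about 10% label noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.1:  # flip some labels to simulate noise
            label = 1 - label
        data.append((x, label))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k should be odd)."""
    neighbors = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in neighbors)
    return int(votes * 2 > k)


def accuracy(train, validation, k):
    """Fraction of validation points classified correctly for a given k."""
    correct = sum(knn_predict(train, x, k) == y for x, y in validation)
    return correct / len(validation)


def manual_search(train, validation, candidate_ks):
    """Run one experiment per candidate k and keep the best validation score."""
    results = {k: accuracy(train, validation, k) for k in candidate_ks}
    best_k = max(results, key=results.get)
    return best_k, results


# Hypothetical experiment: separate training and validation sets.
train = make_dataset(200, seed=0)
validation = make_dataset(100, seed=1)

best_k, results = manual_search(train, validation, candidate_ks=[1, 3, 5, 11, 21])
for k, acc in results.items():
    print(f"k={k:>2}  validation accuracy={acc:.2f}")
print("best k:", best_k)
```

Even in this tiny setting, every candidate value costs one full evaluation run, and the candidate list itself is a guess. Real projects repeat this loop across several algorithms and many more hyperparameters, which is the effort that automated machine learning services aim to reduce.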

What if we could use a service that, with a single click, finds the best algorithm for our dataset, trains and tunes the model, and deploys the model to production? ...
