6

SageMaker Training and Debugging Solutions

In Chapter 2, Deep Learning AMIs, and Chapter 3, Deep Learning Containers, we performed our initial ML training experiments inside EC2 instances. We took note of the hourly cost of running these instances because some ML training jobs and workloads require the more expensive instance types (such as the p2.8xlarge instance at approximately $7.20 per hour). To manage and reduce the overall cost of running ML workloads on these EC2 instances, we discussed a few cost optimization strategies, including manually turning off the instances after a training job has finished.

At this point, you might be wondering if it is possible to automate the following ...
