4

Managing Deep Learning Datasets

Deep learning models usually require a considerable amount of training data to learn useful patterns. In many real-life applications, new data is continuously collected, processed, and added to the training dataset, so that models can be periodically retrained to adjust to changing real-world conditions. In this chapter, we will look into SageMaker capabilities and other AWS services that help you manage your training data.

SageMaker integrates with general-purpose AWS data storage services such as Amazon S3, Amazon EFS, and Amazon FSx for Lustre. Additionally, SageMaker has purpose-built storage for machine learning (ML) called SageMaker Feature Store. We ...
