Learn Amazon SageMaker - Second Edition

Author: Julien Simon
ISBN: 9781801817950

November 2021

Intermediate to advanced

554 pages

10h 9m

English

Packt Publishing

Learn Amazon SageMaker Second Edition
Contributors
  About the author
  About the reviewers
Preface
  Who this book is for
  What this book covers
  To get the most out of this book
  Download the example code files
  Download the color images
  Conventions used
  Get in touch
  Share your thoughts
Section 1: Introduction to Amazon SageMaker
Chapter 1: Introducing Amazon SageMaker
  Technical requirements
  Exploring the capabilities of Amazon SageMaker
  The main capabilities of Amazon SageMaker
  The Amazon SageMaker API
  Setting up Amazon SageMaker on your local machine
  Installing the SageMaker SDK with virtualenv
  Installing the SageMaker SDK with Anaconda
  A word about AWS permissions
  Setting up Amazon SageMaker Studio
  Onboarding to Amazon SageMaker Studio
  Onboarding with the quick start procedure
  Deploying one-click solutions and models with Amazon SageMaker JumpStart
  Deploying a solution
  Deploying a model
  Fine-tuning a model
  Summary
Chapter 2: Handling Data Preparation Techniques
  Technical requirements
  Labeling data with Amazon SageMaker Ground Truth
  Using workforces
  Creating a private workforce
  Uploading data for labeling
  Creating a labeling job
  Labeling images
  Labeling text
  Transforming data with Amazon SageMaker Data Wrangler
  Loading a dataset in SageMaker Data Wrangler
  Transforming a dataset in SageMaker Data Wrangler
  Exporting a SageMaker Data Wrangler pipeline
  Running batch jobs with Amazon SageMaker Processing
  Discovering the Amazon SageMaker Processing API
  Processing a dataset with scikit-learn
  Processing a dataset with your own code
  Summary
Section 2: Building and Training Models
Chapter 3: AutoML with Amazon SageMaker Autopilot
  Technical requirements
  Discovering Amazon SageMaker Autopilot
  Analyzing data
  Feature engineering
  Model tuning
  Using Amazon SageMaker Autopilot in SageMaker Studio
  Launching a job
  Monitoring a job
  Comparing jobs
  Deploying and invoking a model
  Using the SageMaker Autopilot SDK
  Launching a job
  Monitoring a job
  Cleaning up
  Diving deep on SageMaker Autopilot
  The job artifacts
  The data exploration notebook
  The candidate generation notebook
  Summary
Chapter 4: Training Machine Learning Models
  Technical requirements
  Discovering the built-in algorithms in Amazon SageMaker
  Supervised learning
  Unsupervised learning
  A word about scalability
  Training and deploying models with built-in algorithms
  Understanding the end-to-end workflow
  Using alternative workflows
  Using fully managed infrastructure
  Using the SageMaker SDK with built-in algorithms
  Preparing data
  Configuring a training job
  Launching a training job
  Deploying a model
  Cleaning up
  Working with more built-in algorithms
  Regression with XGBoost
  Recommendation with Factorization Machines
  Using Principal Component Analysis
  Detecting anomalies with Random Cut Forest
  Summary
Chapter 5: Training CV Models
  Technical requirements
  Discovering the CV built-in algorithms in Amazon SageMaker
  Discovering the image classification algorithm
  Discovering the object detection algorithm
  Discovering the semantic segmentation algorithm
  Training with CV algorithms
  Preparing image datasets
  Working with image files
  Working with RecordIO files
  Working with SageMaker Ground Truth files
  Using the built-in CV algorithms
  Training an image classification model
  Fine-tuning an image classification model
  Training an object detection model
  Training a semantic segmentation model
  Summary
Chapter 6: Training Natural Language Processing Models
  Technical requirements
  Discovering the NLP built-in algorithms in Amazon SageMaker
  Discovering the BlazingText algorithm
  Discovering the LDA algorithm
  Discovering the NTM algorithm
  Discovering the seq2seq algorithm
  Training with NLP algorithms
  Preparing natural language datasets
  Preparing data for classification with BlazingText
  Preparing data for classification with BlazingText, version 2
  Preparing data for word vectors with BlazingText
  Preparing data for topic modeling with LDA and NTM
  Using datasets labeled with SageMaker Ground Truth
  Using the built-in algorithms for NLP
  Classifying text with BlazingText
  Computing word vectors with BlazingText
  Using BlazingText models with FastText
  Modeling topics with LDA
  Modeling topics with NTM
  Summary
Chapter 7: Extending Machine Learning Services Using Built-In Frameworks
  Technical requirements
  Discovering the built-in frameworks in Amazon SageMaker
  Running a first example with XGBoost
  Working with framework containers
  Training and deploying locally
  Training with script mode
  Understanding model deployment
  Managing dependencies
  Putting it all together
  Running your framework code on Amazon SageMaker
  Using the built-in frameworks
  Working with TensorFlow and Keras
  Working with PyTorch
  Working with Hugging Face
  Working with Apache Spark
  Summary
Chapter 8: Using Your Algorithms and Code
  Technical requirements
  Understanding how SageMaker invokes your code
  Customizing an existing framework container
  Setting up your build environment on EC2
  Building training and inference containers
  Using the SageMaker Training Toolkit with scikit-learn
  Building a fully custom container for scikit-learn
  Training with a fully custom container
  Deploying a fully custom container
  Building a fully custom container for R
  Coding with R and plumber
  Building a custom container
  Training and deploying a custom container on SageMaker
  Training and deploying with your own code on MLflow
  Installing MLflow
  Training a model with MLflow
  Building a SageMaker container with MLflow
  Building a fully custom container for SageMaker Processing
  Summary
Section 3: Diving Deeper into Training
Chapter 9: Scaling Your Training Jobs
  Technical requirements
  Understanding when and how to scale
  Understanding what scaling means
  Adapting training time to business requirements
  Right-sizing training infrastructure
  Deciding when to scale
  Deciding how to scale
  Scaling a BlazingText training job
  Monitoring and profiling training jobs with Amazon SageMaker Debugger
  Viewing monitoring and profiling information in SageMaker Studio
  Enabling profiling in SageMaker Debugger
  Solving training challenges
  Streaming datasets with pipe mode
  Using pipe mode with built-in algorithms
  Using pipe mode with other algorithms and frameworks
  Simplifying data loading with MLIO
  Training factorization machines with pipe mode
  Distributing training jobs
  Understanding data parallelism and model parallelism
  Distributing training for built-in algorithms
  Distributing training for built-in frameworks
  Distributing training for custom containers
  Scaling an image classification model on ImageNet
  Preparing the ImageNet dataset
  Defining our training job
  Training on ImageNet
  Updating batch size
  Adding more instances
  Summing things up
  Training with the SageMaker data and model parallel libraries
  Training on TensorFlow with SageMaker DDP
  Training on Hugging Face with SageMaker DDP
  Training on Hugging Face with SageMaker DMP
  Using other storage services
  Working with SageMaker and Amazon EFS
  Working with SageMaker and Amazon FSx for Lustre
  Summary
Chapter 10: Advanced Training Techniques
  Technical requirements
  Optimizing training costs with managed spot training
  Comparing costs
  Understanding Amazon EC2 Spot Instances
  Understanding managed spot training
  Using managed spot training with object detection
  Using managed spot training and checkpointing with Keras
  Optimizing hyperparameters with automatic model tuning
  Understanding automatic model tuning
  Using automatic model tuning with object detection
  Using automatic model tuning with Keras
  Using automatic model tuning for architecture search
  Exploring models with SageMaker Debugger
  Debugging an XGBoost job
  Inspecting an XGBoost job
  Debugging and inspecting a Keras job
  Managing features and building datasets with SageMaker Feature Store
  Engineering features with SageMaker Processing
  Creating a feature group
  Ingesting features
  Querying features to build a dataset
  Exploring other capabilities of SageMaker Feature Store
  Detecting bias in datasets and explaining predictions with SageMaker Clarify
  Configuring a bias analysis with SageMaker Clarify
  Running a bias analysis
  Analyzing bias metrics
  Running an explainability analysis
  Mitigating bias
  Summary
Section 4: Managing Models in Production
Chapter 11: Deploying Machine Learning Models
  Technical requirements
  Examining model artifacts and exporting models
  Examining and exporting built-in models
  Examining and exporting built-in CV models
  Examining and exporting XGBoost models
  Examining and exporting scikit-learn models
  Examining and exporting TensorFlow models
  Examining and exporting Hugging Face models
  Deploying models on real-time endpoints
  Managing endpoints with the SageMaker SDK
  Managing endpoints with the boto3 SDK
  Deploying models on batch transformers
  Deploying models on inference pipelines
  Monitoring prediction quality with Amazon SageMaker Model Monitor
  Capturing data
  Creating a baseline
  Setting up a monitoring schedule
  Sending bad data
  Examining violation reports
  Deploying models to container services
  Training on SageMaker and deploying on Amazon Fargate
  Summary
Chapter 12: Automating Machine Learning Workflows
  Technical requirements
  Automating with AWS CloudFormation
  Writing a template
  Deploying a model to a real-time endpoint
  Modifying a stack with a change set
  Adding a second production variant to the endpoint
  Implementing canary deployment
  Implementing blue-green deployment
  Automating with AWS CDK
  Installing the CDK
  Creating a CDK application
  Writing a CDK application
  Deploying a CDK application
  Building end-to-end workflows with AWS Step Functions
  Setting up permissions
  Implementing our first workflow
  Adding parallel execution to a workflow
  Adding a Lambda function to a workflow
  Building end-to-end workflows with Amazon SageMaker Pipelines
  Defining workflow parameters
  Processing the dataset with SageMaker Processing
  Ingesting the dataset in SageMaker Feature Store with SageMaker Processing
  Building a dataset with Amazon Athena and SageMaker Processing
  Training a model
  Creating and registering a model in SageMaker Pipelines
  Creating a pipeline
  Running a pipeline
  Deploying a model from the model registry
  Summary
Chapter 13: Optimizing Prediction Cost and Performance
  Technical requirements
  Autoscaling an endpoint
  Deploying a multi-model endpoint
  Understanding multi-model endpoints
  Building a multi-model endpoint with scikit-learn
  Deploying a model with Amazon Elastic Inference
  Deploying a model with Amazon Elastic Inference
  Compiling models with Amazon SageMaker Neo
  Understanding Amazon SageMaker Neo
  Compiling and deploying an image classification model on SageMaker
  Exploring models compiled with Neo
  Deploying an image classification model on a Raspberry Pi
  Deploying models on AWS Inferentia
  Building a cost optimization checklist
  Optimizing costs for data preparation
  Optimizing costs for experimentation
  Optimizing costs for model training
  Optimizing costs for model deployment
  Summary
Why subscribe?
Other Books You May Enjoy
  Packt is searching for authors like you
  Share your thoughts

Overview

Learn how to use Amazon SageMaker to streamline your machine learning workflows. With clear steps and practical examples, this guide covers building, training, and deploying machine learning models, and shows how to make the most of the SageMaker ecosystem.

What this book will help me do

Master the use of Amazon SageMaker for efficient ML model development and deployment.
Understand and implement automated ML tasks using SageMaker Autopilot.
Gain expertise in managing data workflows, from preparation to feature engineering.
Learn optimization techniques for cost-effective and accurate ML solutions.
Deploy scalable ML models while ensuring continuous monitoring and debugging.

Author(s)

Julien Simon is an experienced AI and ML practitioner with significant contributions to advancing the understanding and usage of cloud-based ML solutions. Dedicated to demystifying complex concepts, he provides clear tutorials and practical insights, empowering readers to harness powerful tools like Amazon SageMaker effectively. His approachable writing style resonates with learners seeking actionable knowledge.

Who is it for?

This book is for machine learning developers, data scientists, and software engineers who want to improve their workflows with Amazon SageMaker. It assumes familiarity with AWS and Python, and guides readers from understanding SageMaker's features to applying them in advanced ML workflows.
