Human Alignment: Helpful, Honest, and Harmless
  Reinforcement Learning Overview
  Train a Custom Reward Model
    Collect Training Dataset with Human-in-the-Loop
    Sample Instructions for Human Labelers
    Using Amazon SageMaker Ground Truth for Human Annotations
    Prepare Ranking Data to Train a Reward Model
    Train the Reward Model
  Existing Reward Model: Toxicity Detector by Meta
  Fine-Tune with Reinforcement Learning from Human Feedback
    Using the Reward Model with RLHF
    Proximal Policy Optimization RL Algorithm
    Perform RLHF Fine-Tuning with PPO
    Mitigate Reward Hacking
    Using Parameter-Efficient Fine-Tuning with RLHF
  Evaluate RLHF Fine-Tuned Model
    Qualitative Evaluation
    Quantitative Evaluation
      Load Evaluation Model
      Define Evaluation-Metric Aggregation Function
      Compare Evaluation Metrics Before and After
  Summary