
AI Engineering

Name: AI Engineering
Author: Chip Huyen
ISBN: 9781098166304

by Chip Huyen

December 2024

Intermediate to advanced

534 pages

15h 52m

English

O'Reilly Media, Inc.


Preface
What This Book Is About; What This Book Is Not; Who This Book Is For; Navigating This Book; Conventions Used in This Book; Using Code Examples; O’Reilly Online Learning; How to Contact Us; Acknowledgments
1. Introduction to Building AI Applications with Foundation Models
The Rise of AI Engineering; From Language Models to Large Language Models; From Large Language Models to Foundation Models; From Foundation Models to AI Engineering; Foundation Model Use Cases; Coding; Image and Video Production; Writing; Education; Conversational Bots; Information Aggregation; Data Organization; Workflow Automation; Planning AI Applications; Use Case Evaluation; Setting Expectations; Milestone Planning; Maintenance; The AI Engineering Stack; Three Layers of the AI Stack; AI Engineering Versus ML Engineering; AI Engineering Versus Full-Stack Engineering; Summary
2. Understanding Foundation Models
Training Data; Multilingual Models; Domain-Specific Models; Modeling; Model Architecture; Model Size; Post-Training; Supervised Finetuning; Preference Finetuning; Sampling; Sampling Fundamentals; Sampling Strategies; Test Time Compute; Structured Outputs; The Probabilistic Nature of AI; Summary
3. Evaluation Methodology
Challenges of Evaluating Foundation Models; Understanding Language Modeling Metrics; Entropy; Cross Entropy; Bits-per-Character and Bits-per-Byte; Perplexity; Perplexity Interpretation and Use Cases; Exact Evaluation; Functional Correctness; Similarity Measurements Against Reference Data; Introduction to Embedding; AI as a Judge; Why AI as a Judge?; How to Use AI as a Judge; Limitations of AI as a Judge; What Models Can Act as Judges?; Ranking Models with Comparative Evaluation; Challenges of Comparative Evaluation; The Future of Comparative Evaluation; Summary
4. Evaluate AI Systems
Evaluation Criteria; Domain-Specific Capability; Generation Capability; Instruction-Following Capability; Cost and Latency; Model Selection; Model Selection Workflow; Model Build Versus Buy; Navigate Public Benchmarks; Design Your Evaluation Pipeline; Step 1. Evaluate All Components in a System; Step 2. Create an Evaluation Guideline; Step 3. Define Evaluation Methods and Data; Summary
5. Prompt Engineering
Introduction to Prompting; In-Context Learning: Zero-Shot and Few-Shot; System Prompt and User Prompt; Context Length and Context Efficiency; Prompt Engineering Best Practices; Write Clear and Explicit Instructions; Provide Sufficient Context; Break Complex Tasks into Simpler Subtasks; Give the Model Time to Think; Iterate on Your Prompts; Evaluate Prompt Engineering Tools; Organize and Version Prompts; Defensive Prompt Engineering; Proprietary Prompts and Reverse Prompt Engineering; Jailbreaking and Prompt Injection; Information Extraction; Defenses Against Prompt Attacks; Summary
6. RAG and Agents
RAG; RAG Architecture; Retrieval Algorithms; Retrieval Optimization; RAG Beyond Texts; Agents; Agent Overview; Tools; Planning; Agent Failure Modes and Evaluation; Memory; Summary
7. Finetuning
Finetuning Overview; When to Finetune; Reasons to Finetune; Reasons Not to Finetune; Finetuning and RAG; Memory Bottlenecks; Backpropagation and Trainable Parameters; Memory Math; Numerical Representations; Quantization; Finetuning Techniques; Parameter-Efficient Finetuning; Model Merging and Multi-Task Finetuning; Finetuning Tactics; Summary
8. Dataset Engineering
Data Curation; Data Quality; Data Coverage; Data Quantity; Data Acquisition and Annotation; Data Augmentation and Synthesis; Why Data Synthesis; Traditional Data Synthesis Techniques; AI-Powered Data Synthesis; Model Distillation; Data Processing; Inspect Data; Deduplicate Data; Clean and Filter Data; Format Data; Summary
9. Inference Optimization
Understanding Inference Optimization; Inference Overview; Inference Performance Metrics; AI Accelerators; Inference Optimization; Model Optimization; Inference Service Optimization; Summary

10. AI Engineering Architecture and User Feedback
AI Engineering Architecture; Step 1. Enhance Context; Step 2. Put in Guardrails; Step 3. Add Model Router and Gateway; Step 4. Reduce Latency with Caches; Step 5. Add Agent Patterns; Monitoring and Observability; AI Pipeline Orchestration; User Feedback; Extracting Conversational Feedback; Feedback Design; Feedback Limitations; Summary
Epilogue
Index
About the Author

Content preview from AI Engineering

Chapter 1. Introduction to Building AI Applications with Foundation Models

If I could use only one word to describe AI post-2020, it’d be scale. The AI models behind applications like ChatGPT, Google’s Gemini, and Midjourney are at such a scale that they’re consuming a nontrivial portion of the world’s electricity, and we’re at risk of running out of publicly available internet data to train them.

The scaling up of AI models has two major consequences. First, AI models are becoming more powerful and capable of more tasks, enabling more applications. More people and teams leverage AI to increase productivity, create economic value, and improve quality of life.

Second, training large language models (LLMs) requires data, compute resources, and specialized talent that only a few organizations can afford. This has led to the emergence of model as a service: models developed by these few organizations are made available for others to use as a service. Anyone who wishes to leverage AI to build applications can now use these models to do so without having to invest up front in building a model.

In short, the demand for AI applications has increased while the barrier to entry for building AI applications has decreased. This has turned AI engineering—the process of building applications on top of readily available models—into one of the fastest-growing engineering disciplines.

Building applications on top of machine learning (ML) models isn’t new. Long before LLMs became prominent, AI ...



Publisher Resources

ISBN: 9781098166298

