GenAI on Google Cloud

by Ayo Adedeji, Lavi Nigam, Sarita Joshi, Stephanie Gervasi

January 2026

Intermediate to advanced

320 pages

9h 58m

English

O'Reilly Media, Inc.

Includes

Quizzes

Preface
  Why This Book Matters
  What You’ll Find in This Book
  Our Approach
  Who This Book Is For
  Prerequisites
  Conventions Used in This Book
  Using Code Examples
  O’Reilly Online Learning
  How to Contact Us
  Acknowledgments
1. The Challenge of Generative AI Application Development
  Overview of LLMs, Generative AI Agents, and Potential Applications to Business Tasks
  Small Language Models (SLMs)
  Foundation Models and Multimodality
  Domain-Specific and Reasoning Models
  Generative AI Agents
  Agent Architectures
  Challenges in Development, Deployment, and Maintenance
  Development Challenges
  Deployment Challenges
  Maintenance Challenges
  Addressing Challenges with Modern Platforms
  Industry Use Cases and ROI
  Looking Ahead
  Learning Labs
2. Data Readiness and Accessibility
  The Amplified Importance of Data for GenAI
  What Data Readiness Really Means for GenAI Applications
  Key Dimensions of Data Readiness
  The Interconnected Nature of Data Readiness
  Managing Prompts as Data Assets
  The Human Element: Roles in the Data Readiness Journey
  Data Scientists: The Explorers
  ML Engineers: Building the Bridge to Production
  Data Engineers: Architecting the Foundation
  DevOps and SREs: Operationalizing the Foundation
  Business SMEs and Domain Leaders: The “Why” Behind the “What”
  Strategic Data Patterns: The Foundation for Reliable GenAI Systems
  The Unified Data and AI Platform
  From RAG to Agentic RAG: The Evolution of a Data Pattern
  Tying it All Together: the Enterprise RAG Knowledge Engine
  Data Readiness for Agent Systems
  Security and Governance: Protecting Data Throughout the LLM Lifecycle
  Data Privacy Framework
  Comprehensive Governance
  Practical Data Readiness Assessment
  Looking Ahead
  Learning Labs
3. Building a Multimodal Agent with the Agent Development Kit (ADK)
  From Zero to Agent in Seven Lines
  The Simplest Thing That Works
  The Runtime Behind the Simplicity
  Running Your First Conversation
  Understanding the Limitations
  Adding Intelligence Through Tools
  Your Agent’s First Tool
  Tools Versus Subagents–A Practical Decision Framework
  State Management That Actually Scales
  Building a Stateful Shopping Cart
  Understanding the Three Scopes
  State Scope Interactions
  Making State Persist in Production
  Beyond Structured State: Semantic Memory
  Vertex AI Agent Engine Memory Bank: Learning from Conversations
  Implementation
  Expanding to Multimodal
  Making Our Agent See
  From Static Analysis to Live Support
  Building Complete Interaction Memory
  Building Production-Grade Tools
  Handling Asynchronous Operations
  Ensuring Safety with Human-in-the-Loop
  Production Monitoring and Policy Enforcement with Callbacks and Plug-ins
  Looking Ahead
  Learning Labs
4. Orchestrating Intelligent Agent Teams
  The Bottleneck of the Monolithic Agent
  Conflicting Instructions
  Tool Selection Paralysis
  Token Limitations
  Maintenance Nightmare
  The Solution: An Agent Team
  The Roadmap: From Local Teams to Distributed Systems
  Local Teams
  The Foundation: Agent Hierarchy
  Pattern 1: The Assembly Line (SequentialAgent)
  Pattern 2: The Independent Taskforce (ParallelAgent)
  Pattern 3: The Iterative Refiner (LoopAgent)
  Distributed Collaboration
  The Organizational “Why”
  MCP: The Language of Tools
  A2A: The Language of Delegation
  Putting It All Together: A Hybrid Agent Team
  Production Realities
  The Trust Problem: Security Schemes in A2A
  The Extension Problem: Evolving Agent Capabilities
  The Visibility Problem: Distributed Tracing
  The Versioning Problem: Managing Agent Evolution
  Looking Ahead
  Edge and Embodied Intelligence
  From Architecture to Excellence
  Learning Labs
5. Evaluation and Optimization Strategies
  Tailoring Evaluation to Your LLM/Agent’s Purpose
  Beyond Basic Functionality
  Key Dimensions of Evaluation
  Setting the Bar for Production Excellence
  Practical Evaluation Strategies
  Human-Centered Evaluation
  A/B Testing and Preference Scoring
  Red Teaming: Stress Testing for Safety and Reliability
  Automated Evaluation: Scaling Feedback for Rapid Improvement
  Reference-Based Metrics for Text Generation
  Limitations of Reference-Based Evaluation
  Domain-Specific and Task-Oriented Metrics
  Metrics for Agentic Systems and Tool Use
  Optimization Strategies
  Refining Prompts
  Elevating Agent Performance
  Beyond Prompt and Agent Optimizations
  Looking Ahead
  Learning Labs
6. Tuning and Infrastructure
  The Tuning Decision
  The Fine-Tuning Decision Framework
  Fine-Tuning Strategies: From Full Training to Efficient Adaptations
  The Real Cost of Fine-Tuning
  Implementation Approaches
  Infrastructure Questions Emerge
  The Constraint You’ll Hit First
  Pattern 1: The Waiting Accelerator
  Pattern 2: The Memory Wall
  Pattern 3: Maxed Out But Still Slow
  Pattern 4: More GPUs = Worse Performance
  Accelerators: Matching Hardware to Bottlenecks
  The Decision Framework
  The Practical Decision
  Migration Reality
  Storage Options
  When Storage Becomes Your Bottleneck
  The Storage Pattern
  Serving and Deployment
  Configuration That Matters
  Connecting Models to Agents
  Agent Deployment Platforms
  Agent Engine
  Cloud Run
  GKE
  Looking Ahead
  Learning Labs
7. MLOps for Production-Ready AI and Agentic Systems
  From Ad Hoc to Systematic: The Current State of Teams
  The Evolution of MLOps
  Building Reproducible Training Pipelines
  Data Versioning and Lineage
  Experiment Tracking
  Model Registry and Governance
  Automated Retraining
  Comprehensive Monitoring
  Agent Monitoring
  Technical Monitoring
  Hallucination Detection
  CI/CD for AI Systems
  Cloud Build
  Cloud Deploy
  Security and Governance as Foundation
  Security Framework for AI Agents
  Model Armor: A Key Security Component
  Cost Management
  The True Cost Model
  Cost Attribution Strategies
  Intelligent Cost Operations
  Spending Controls
  Looking Ahead
  Learning Labs
8. The AI and Agentic Maturity Framework
  What Is the AI and Agentic Maturity Framework?
  The Maturity Dimensions and Phases
  Vision and Leadership (The “What” and the “Why” Dimension)
  Talent and Culture (The “Who” Dimension)
  Operational and Technical Practice (The “How” Dimension)
  How the Three Dimensions of AI and Agentic Maturity Can Work Together
  From Framework to Reality: What Are Teams Actually Building, and How?
  Technical Conversations
  Leadership, Talent, and Culture Conversations
  Why and How a Platform Approach Can Accelerate an Organization’s AI and Agentic Maturity
  Vertex AI Platform
  Learning Labs
Conclusion

Appendix. Further Reading for Leaders
Index
About the Authors

Content preview from GenAI on Google Cloud

Chapter 7. MLOps for Production-Ready AI and Agentic Systems

Over the past six chapters, you’ve built a comprehensive foundation: preparing data for GenAI applications (Chapter 2), constructing multimodal agents (Chapter 3), orchestrating agent teams (Chapter 4), establishing evaluation frameworks (Chapter 5), and optimizing models and infrastructure (Chapter 6). Each of these capabilities represents a critical pillar of what we call agent operations (AgentOps)—the systematic practices that transform working prototypes into production-ready systems.

Figure 7-1 maps these pillars across nine key dimensions. This chapter extends the pillars you’ve learned with production-specific practices while introducing three pillars essential for sustainable operations: observability, security and safety, and cost and capacity.

[Figure 7-1: diagram of the nine pillars of AgentOps, including cost and capacity, model strategy, serving and scale, observability, security and safety, deploy and release, evaluation and quality, and data layer.]

The gap between “the model works” and “the model works in production” is wider for GenAI models than for traditional ML. Identical prompts can produce different outputs. Language evolves constantly. Agents maintain state across sessions. No single metric captures quality. Costs can explode through hidden operational overhead.
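To make the point about identical prompts concrete, here is a minimal sketch of how a team might quantify output variability. It is an illustration, not the book's method. It assumes you have already collected several responses to the same prompt; the sample strings, the `jaccard` helper, and the `consistency_report` function are all hypothetical names introduced for this example. Token-level Jaccard overlap is a crude proxy for similarity, which is why the report shows it alongside an exact-match rate rather than as a single score.

```python
import itertools
from collections import Counter


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two strings (1.0 means identical token sets)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def consistency_report(outputs: list[str]) -> dict:
    """Summarize how much a set of responses to one prompt differ."""
    if len(outputs) < 2:
        raise ValueError("need at least two outputs to compare")
    pairs = list(itertools.combinations(outputs, 2))
    mean_similarity = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    counts = Counter(outputs)
    most_common_count = counts.most_common(1)[0][1]
    return {
        "n_samples": len(outputs),
        "n_distinct": len(counts),
        # Share of samples equal to the most frequent response.
        "exact_match_rate": most_common_count / len(outputs),
        "mean_pairwise_jaccard": mean_similarity,
    }


# Hypothetical responses to one identical prompt (illustrative data only).
samples = [
    "Your refund was issued on Monday.",
    "Your refund was issued on Monday.",
    "The refund was processed on Monday.",
]
report = consistency_report(samples)
print(report)
```

Two of the three samples match exactly, yet the third conveys the same meaning in different words. Neither number captures that on its own, which is the practical reason evaluation tends to combine several signals.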

These challenges compound over time. Models that perform well at deployment gradually degrade as language patterns shift. Without proper versioning, teams can’t identify which model version is running or what data ...

Publisher Resources

ISBN: 9798341623842

Figure 7-1. The nine pillars of AgentOps
