GenAI on Google Cloud

by Ayo Adedeji, Lavi Nigam, Sarita Joshi, Stephanie Gervasi

January 2026

Intermediate to advanced

320 pages

9h 58m

English

O'Reilly Media, Inc.

Includes

Quizzes

Preface
  Why This Book Matters
  What You’ll Find in This Book
  Our Approach
  Who This Book Is For
  Prerequisites
  Conventions Used in This Book
  Using Code Examples
  O’Reilly Online Learning
  How to Contact Us
  Acknowledgments
1. The Challenge of Generative AI Application Development
  Overview of LLMs, Generative AI Agents, and Potential Applications to Business Tasks
  Small Language Models (SLMs)
  Foundation Models and Multimodality
  Domain-Specific and Reasoning Models
  Generative AI Agents
  Agent Architectures
  Challenges in Development, Deployment, and Maintenance
  Development Challenges
  Deployment Challenges
  Maintenance Challenges
  Addressing Challenges with Modern Platforms
  Industry Use Cases and ROI
  Looking Ahead
  Learning Labs
2. Data Readiness and Accessibility
  The Amplified Importance of Data for GenAI
  What Data Readiness Really Means for GenAI Applications
  Key Dimensions of Data Readiness
  The Interconnected Nature of Data Readiness
  Managing Prompts as Data Assets
  The Human Element: Roles in the Data Readiness Journey
  Data Scientists: The Explorers
  ML Engineers: Building the Bridge to Production
  Data Engineers: Architecting the Foundation
  DevOps and SREs: Operationalizing the Foundation
  Business SMEs and Domain Leaders: The “Why” Behind the “What”
  Strategic Data Patterns: The Foundation for Reliable GenAI Systems
  The Unified Data and AI Platform
  From RAG to Agentic RAG: The Evolution of a Data Pattern
  Tying it All Together: the Enterprise RAG Knowledge Engine
  Data Readiness for Agent Systems
  Security and Governance: Protecting Data Throughout the LLM Lifecycle
  Data Privacy Framework
  Comprehensive Governance
  Practical Data Readiness Assessment
  Looking Ahead
  Learning Labs
3. Building a Multimodal Agent with the Agent Development Kit (ADK)
  From Zero to Agent in Seven Lines
  The Simplest Thing That Works
  The Runtime Behind the Simplicity
  Running Your First Conversation
  Understanding the Limitations
  Adding Intelligence Through Tools
  Your Agent’s First Tool
  Tools Versus Subagents–A Practical Decision Framework
  State Management That Actually Scales
  Building a Stateful Shopping Cart
  Understanding the Three Scopes
  State Scope Interactions
  Making State Persist in Production
  Beyond Structured State: Semantic Memory
  Vertex AI Agent Engine Memory Bank: Learning from Conversations
  Implementation
  Expanding to Multimodal
  Making Our Agent See
  From Static Analysis to Live Support
  Building Complete Interaction Memory
  Building Production-Grade Tools
  Handling Asynchronous Operations
  Ensuring Safety with Human-in-the-Loop
  Production Monitoring and Policy Enforcement with Callbacks and Plug-ins
  Looking Ahead
  Learning Labs
4. Orchestrating Intelligent Agent Teams
  The Bottleneck of the Monolithic Agent
  Conflicting Instructions
  Tool Selection Paralysis
  Token Limitations
  Maintenance Nightmare
  The Solution: An Agent Team
  The Roadmap: From Local Teams to Distributed Systems
  Local Teams
  The Foundation: Agent Hierarchy
  Pattern 1: The Assembly Line (SequentialAgent)
  Pattern 2: The Independent Taskforce (ParallelAgent)
  Pattern 3: The Iterative Refiner (LoopAgent)
  Distributed Collaboration
  The Organizational “Why”
  MCP: The Language of Tools
  A2A: The Language of Delegation
  Putting It All Together: A Hybrid Agent Team
  Production Realities
  The Trust Problem: Security Schemes in A2A
  The Extension Problem: Evolving Agent Capabilities
  The Visibility Problem: Distributed Tracing
  The Versioning Problem: Managing Agent Evolution
  Looking Ahead
  Edge and Embodied Intelligence
  From Architecture to Excellence
  Learning Labs
5. Evaluation and Optimization Strategies
  Tailoring Evaluation to Your LLM/Agent’s Purpose
  Beyond Basic Functionality
  Key Dimensions of Evaluation
  Setting the Bar for Production Excellence
  Practical Evaluation Strategies
  Human-Centered Evaluation
  A/B Testing and Preference Scoring
  Red Teaming: Stress Testing for Safety and Reliability
  Automated Evaluation: Scaling Feedback for Rapid Improvement
  Reference-Based Metrics for Text Generation
  Limitations of Reference-Based Evaluation
  Domain-Specific and Task-Oriented Metrics
  Metrics for Agentic Systems and Tool Use
  Optimization Strategies
  Refining Prompts
  Elevating Agent Performance
  Beyond Prompt and Agent Optimizations
  Looking Ahead
  Learning Labs
6. Tuning and Infrastructure
  The Tuning Decision
  The Fine-Tuning Decision Framework
  Fine-Tuning Strategies: From Full Training to Efficient Adaptations
  The Real Cost of Fine-Tuning
  Implementation Approaches
  Infrastructure Questions Emerge
  The Constraint You’ll Hit First
  Pattern 1: The Waiting Accelerator
  Pattern 2: The Memory Wall
  Pattern 3: Maxed Out But Still Slow
  Pattern 4: More GPUs = Worse Performance
  Accelerators: Matching Hardware to Bottlenecks
  The Decision Framework
  The Practical Decision
  Migration Reality
  Storage Options
  When Storage Becomes Your Bottleneck
  The Storage Pattern
  Serving and Deployment
  Configuration That Matters
  Connecting Models to Agents
  Agent Deployment Platforms
  Agent Engine
  Cloud Run
  GKE
  Looking Ahead
  Learning Labs
7. MLOps for Production-Ready AI and Agentic Systems
  From Ad Hoc to Systematic: The Current State of Teams
  The Evolution of MLOps
  Building Reproducible Training Pipelines
  Data Versioning and Lineage
  Experiment Tracking
  Model Registry and Governance
  Automated Retraining
  Comprehensive Monitoring
  Agent Monitoring
  Technical Monitoring
  Hallucination Detection
  CI/CD for AI Systems
  Cloud Build
  Cloud Deploy
  Security and Governance as Foundation
  Security Framework for AI Agents
  Model Armor: A Key Security Component
  Cost Management
  The True Cost Model
  Cost Attribution Strategies
  Intelligent Cost Operations
  Spending Controls
  Looking Ahead
  Learning Labs
8. The AI and Agentic Maturity Framework
  What Is the AI and Agentic Maturity Framework?
  The Maturity Dimensions and Phases
  Vision and Leadership (The “What” and the “Why” Dimension)
  Talent and Culture (The “Who” Dimension)
  Operational and Technical Practice (The “How” Dimension)
  How the Three Dimensions of AI and Agentic Maturity Can Work Together
  From Framework to Reality: What Are Teams Actually Building, and How?
  Technical Conversations
  Leadership, Talent, and Culture Conversations
  Why and How a Platform Approach Can Accelerate an Organization’s AI and Agentic Maturity
  Vertex AI Platform
  Learning Labs
Conclusion

Appendix. Further Reading for Leaders
Index
About the Authors

Content preview from GenAI on Google Cloud

Chapter 5. Evaluation and Optimization Strategies

We’ve now constructed our multimodal question-answering agent, a system capable of ingesting diverse data types and providing relevant answers. It works; it fulfills its designed function. In the world of LLMs and agents, however, “functional” is just the starting line. The real challenge—and where true value is unlocked—lies in the journey from functional to optimal.

How quickly does it respond? How consistently accurate is it across a vast range of unseen queries, especially ambiguous ones? If it uses tools, how reliably and efficiently does it invoke them? Are its responses not just correct, but also concise, helpful, and perfectly aligned with the user’s nuanced intent? And, critically, as your systems evolve and interact with more complex data and tasks, how do you ensure they maintain performance, learn from experience, and continuously improve?

This chapter explores practical approaches for evaluation and optimization—two sides of the same coin in the journey to production excellence. You can’t meaningfully improve what you can’t measure, and you can’t know if your optimizations are effective without robust evaluation methods. We’ve structured this chapter to reflect this natural cycle: first establishing frameworks for systematically measuring performance across multiple dimensions, then applying targeted optimization techniques based on those insights.

In the evaluation section, we’ll explore both human-centered assessment ...

Publisher Resources

ISBN: 9798341623842
