Chapter 10. Evaluating LLM Applications
GitHub Copilot is arguably the first industrial-scale LLM application. The curse of going first is that some of the choices you make will seem silly in hindsight, laughably flying in the face of what (by now) everyone knows.
But one of the things we got absolutely right was how we got started. The oldest part of Copilot’s codebase is not the proxy, or the prompts, or the UI, or even the boilerplate setting up the application as an IDE extension. The very first bit of code we wrote was the evaluation, and it’s only thanks to that head start that we were able to move so quickly and confidently with everything else. That’s because, for every change we made, we could check directly whether it was a step in the right direction, a mistake, or a good attempt that simply didn’t have much impact. And that’s the main advantage of an evaluation framework for your LLM application: it guides all future development.
Depending on your application and where your project is in its lifecycle, different types of evaluation may be available and appropriate. The two big categories are offline and online evaluation. Offline evaluation scores your application against a fixed set of example cases, independent of any live runs with real users. Since it doesn’t require real users, or even, in many cases, an end-to-end working app, it will typically be the first evaluation you implement.
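To make that concrete, here is a minimal sketch of what an offline evaluation harness can look like in Python. The example cases, the substring-based pass/fail check, and the stub generate function are all illustrative assumptions, not how Copilot (or any particular product) scores completions; the point is simply that a fixed set of cases plus a scoring rule gives you a single number you can re-run after every change.

```python
# A minimal offline-evaluation sketch (not any product's actual harness).
# The cases, the substring-based check, and the stub model are assumptions;
# swap in your application's entry point and a scoring rule for your task.

from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str               # input handed to the application
    expected_substring: str   # something a good answer should contain


CASES = [
    EvalCase(prompt="def add(a, b):\n    ", expected_substring="return a + b"),
    EvalCase(prompt="# Reverse the string s\n", expected_substring="s[::-1]"),
]


def run_offline_eval(generate: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Run every case through the application and return the pass rate."""
    passed = sum(case.expected_substring in generate(case.prompt) for case in cases)
    return passed / len(cases)


if __name__ == "__main__":
    # Stand-in for the real application; replace with your LLM call.
    stub = lambda prompt: "return a + b"
    print(f"pass rate: {run_offline_eval(stub, CASES):.0%}")
```

The scoring rule here is deliberately crude; in practice you would replace the substring check with whatever signal fits your task (exact match, unit tests on generated code, a model-based judge), but the shape of the loop stays the same.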
Offline evaluation, however, is somewhat theoretical and possibly ...