Chapter 3. Advanced Planning, Reasoning, and Scalable Execution in Agents
In the last two chapters, you saw that AI agents aren't magic; they're engineered systems. But the techniques you'll learn in this chapter may start to feel close to magic. As Arthur C. Clarke once said:
Any sufficiently advanced technology is indistinguishable from magic. 1
That sense of magic comes from what happens when agents stop merely reacting and instead start learning from their own experience or making smarter choices at test time. Once you apply the principles of reinforcement learning (RL) to LLMs, agents are no longer non-player characters (NPCs): they learn and adapt on their own. This is how Clarke's quote finds new meaning in the world of AI agents.
But what may look uncanny at first actually comes from a set of clear mechanisms. This chapter explains those mechanisms in depth: you'll learn how RL builds a feedback loop between reasoning and outcomes, how tree-based search and adaptive planning let agents ...