Chapter 6. Agent Architecture
Building on the architectures described in Chapter 5, this chapter covers what is perhaps the most important of all current LLM architectures: the agent architecture. First, we introduce what makes LLM agents unique; then we show how to build them and how to extend them for common use cases.
The field of artificial intelligence has a long history of creating (intelligent) agents, which can be most simply defined as “something that acts,” in the words of Stuart Russell and Peter Norvig in their textbook Artificial Intelligence: A Modern Approach (Pearson, 2020). The word “acts” carries a little more meaning than meets the eye:
- Acting requires some capacity for deciding what to do.
- Deciding what to do implies having access to more than one possible course of action. After all, a decision without options is no decision at all.
- In order to decide, the agent also needs access to information about the external environment (anything outside of the agent itself).
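The three requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Agent` class, the `rule_based_decide` function, and the two action names are hypothetical and not part of any library, and the decision step is a plain function standing in for what would be an LLM call in a real agent.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Agent:
    # More than one possible course of action: name -> function of the observation.
    actions: Dict[str, Callable[[str], str]]
    # The capacity to decide: given an observation and the available action
    # names, return the name of the chosen action. In an LLM agent this
    # would be a model call; here it is an ordinary function (an assumption).
    decide: Callable[[str, list], str]

    def step(self, observation: str) -> str:
        # The observation is the agent's information about its environment.
        choice = self.decide(observation, list(self.actions))
        if choice not in self.actions:
            raise ValueError(f"unknown action: {choice}")
        return self.actions[choice](observation)


def rule_based_decide(observation: str, action_names: list) -> str:
    # Hypothetical stand-in for an LLM: pick "calculate" when the input
    # contains a digit, otherwise answer directly.
    if any(ch.isdigit() for ch in observation) and "calculate" in action_names:
        return "calculate"
    return "answer_directly"


agent = Agent(
    actions={
        "calculate": lambda obs: f"calculator called with: {obs}",
        "answer_directly": lambda obs: f"direct answer to: {obs}",
    },
    decide=rule_based_decide,
)

print(agent.step("What is 2 + 2?"))
print(agent.step("Who wrote Hamlet?"))
```

Swapping `rule_based_decide` for a function that prompts a model and parses its reply would leave the rest of the structure unchanged, which is roughly the shape the remainder of this chapter builds toward.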
So an agentic LLM application must be one that uses an LLM to choose among multiple possible courses of action, given some context about the current state of the world or some desired next state. These attributes are usually implemented by mixing two prompting techniques we first met in the Preface:
- Tool calling
  Include a list of external functions that the LLM can make use of in your prompt (that is, the actions it can decide to take) and provide instructions on how to format its choice ...