Chapter 5. Extending Model Capabilities
LLMs excel at recognizing patterns in vast amounts of data. They can generate coherent and contextually relevant responses by statistically predicting the next token. However, this doesn’t necessarily mean they can genuinely understand the underlying concepts. If you have a task that is not well covered by the training regimen of your foundational model, then the model may not be able to perform that task. This chapter discusses four patterns you can use to teach foundational models tasks that they were not trained to perform.
The Limits of LLM Reasoning
Foundational models can manipulate symbols and words effectively, but this may be because they are generalizing from patterns of manipulation they encountered in their training data, not because they grasp semantic meaning and the logical relationships between concepts the way humans do. You can use foundational models for many tasks, but only because those tasks resemble the ones the models were trained on.
It’s difficult to describe tasks that foundational models can’t do well, precisely because they tend to be esoteric or industry-specific tasks that a general audience, such as the readership of this book, won’t be familiar with. Tasks that aren’t well captured by the training data of LLMs include writing a memo to the investment committee of a mutual fund or adjudicating an internal investigation, because such memos are internal records and such investigations are confidential, so neither appears in the public data on which foundational models are trained.