Chapter 6. Do Language Models Dream of Electric Sheep?

Amid all the excitement about advances in LLMs, few phenomena captivate and perplex as much as their so-called hallucinations. It’s almost as if these computational entities, deep within their myriad layers, occasionally drift into a dreamlike state, creating wondrous and bewildering narratives. Like a human’s dreams, these hallucinations can be reflective, absurd, or even prophetic, providing insights into the complex interplay between training data and the model’s learned interpretations.

In the world of LLMs, the term “hallucination” might evoke images of vivid and whimsical creations, but in reality, it signifies a more mundane statistical anomaly. At its core, a hallucination is the model’s attempt to bridge gaps in its knowledge using the patterns it has gleaned from its training data. While it might be termed “imaginative,” it’s essentially the LLM making an educated guess when faced with unfamiliar input or scenarios. However, these guesses can manifest as confident yet unfounded assertions, revealing the model’s struggle to differentiate between well-learned facts and the statistical noise within its training data.

LLMs do not provide easily usable probability scores the way some other “predictive” AI algorithms do. For example, a vision classifier may return a probability expressed as a percentage, such as a 79% chance that a particular image depicts a monkey. A user of that model therefore gets a sense of how strongly the ...
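
To make the contrast concrete, here is a minimal sketch of how a conventional classifier produces that kind of single, easy-to-read confidence score. The logits and class labels below are made up for illustration; a real model would compute them from the image itself. The point is simply that a softmax layer yields one calibrated-looking probability per class, whereas an LLM’s token-by-token generation does not surface anything so directly usable by default.

```python
import numpy as np

# Hypothetical raw scores (logits) from a vision classifier for one image.
# The values are illustrative only; a real model derives them from pixels.
logits = np.array([2.6, 0.9, 0.2])
labels = ["monkey", "lemur", "capybara"]

# Softmax turns raw scores into probabilities that sum to 1 -- this is
# what lets a classifier report something like "79% monkey".
probabilities = np.exp(logits) / np.sum(np.exp(logits))

for label, p in zip(labels, probabilities):
    print(f"{label}: {p:.0%}")
# monkey: 79%
# lemur: 14%
# capybara: 7%
```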
