Why RAG?
A basic visual representation of retrieval-augmented generation (RAG). The user's question is used to retrieve the most relevant information from the knowledge base. This information is then integrated into the prompt, which is sent to the language model (e.g., GPT-4). The model's final response is sent back to the user.
Retrieval-augmented generation (RAG) is a method created by the FAIR team at Meta to enhance the accuracy of large language models (LLMs) and reduce false information or “hallucinations”. RAG improves LLMs by adding an information retrieval step before generating an answer, which systematically ...
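The retrieve, augment, generate flow described above can be illustrated with a minimal Python sketch. It is not the implementation from the book or from the original RAG work: the toy `KNOWLEDGE_BASE`, the word-count cosine similarity used for retrieval, the prompt wording, and the stubbed `generate` function are all assumptions made for illustration. A real system would typically use learned embeddings with a vector index and would call an actual LLM.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base (an assumption for this sketch).
KNOWLEDGE_BASE = [
    "RAG adds a retrieval step before the language model generates an answer.",
    "The Eiffel Tower is located in Paris, France.",
    "Python is a programming language created by Guido van Rossum.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, top_k=1):
    """Step 1: return the top_k documents most similar to the question."""
    query_vec = Counter(tokenize(question))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(doc))), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(question, passages):
    """Step 2: integrate the retrieved passages into the prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


def generate(prompt):
    """Step 3: placeholder for a real LLM call (hypothetical stub)."""
    return f"[model response to a {len(prompt)}-character prompt]"


def answer(question):
    """Run the full pipeline: retrieve, build the prompt, generate."""
    passages = retrieve(question, KNOWLEDGE_BASE)
    return generate(build_prompt(question, passages))


if __name__ == "__main__":
    question = "Where is the Eiffel Tower?"
    print(build_prompt(question, retrieve(question, KNOWLEDGE_BASE)))
    print(answer(question))
```

Keeping the three steps as separate functions mirrors the diagram: the retriever or the model can be swapped out independently, and the assembled prompt can be inspected before it is sent to the model.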