Retrieval-augmented generation draws upon external data sources to address two shortcomings of large language models: out-of-date training sets and limited context windows.
Retrieval-augmented generation (RAG) is a technique used to “ground” large language models (LLMs) with specific data sources, often sources that weren’t included in the models’ original training. RAG’s three steps are retrieval from a specified source, augmentation of the prompt with the context retrieved from the source, and then generation using the model and the augmented prompt.
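The three steps can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: it assumes a tiny in-memory document list, ranks passages by naive word overlap instead of embeddings, and stubs out the generation step where a real system would call an LLM API.

```python
def retrieve(query, documents, k=2):
    # Retrieval: rank documents by naive word overlap with the query.
    # Real systems typically use vector embeddings and a similarity search.
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def augment(query, passages):
    # Augmentation: prepend the retrieved context to the prompt.
    context = "\n".join(passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def generate(prompt):
    # Generation: a real system would send the augmented prompt to an LLM;
    # this placeholder just reports what it received so the sketch runs.
    return f"[LLM response to a prompt of {len(prompt)} characters]"

docs = [
    "RAG grounds language models with external data sources.",
    "Context windows limit how much text a model can read at once.",
    "Bananas are yellow.",
]
query = "What does RAG do?"
prompt = augment(query, retrieve(query, docs))
print(generate(prompt))
```

Swapping the word-overlap ranking for an embedding model and the placeholder for an API call turns this skeleton into the pattern most RAG systems follow.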
In an exercise in dogfooding, I asked the GPT-4 large language model “What is retrieval-augmented generation?” using its Browse plug-in, which is one implementation of retrieval-augmented generation.
This is what GPT-4 eventually told me, after I asked it to regenerate its answer:
Retrieval-Augmented Generation (RAG) is an AI framework that significantly impacts the field of Natural Language Processing (NLP). It is designed to improve the accuracy and richness of content produced by language models. Here’s a synthesis of the key points regarding RAG …