Definition
Retrieval-Augmented Generation (RAG) enhances LLM responses by first retrieving relevant information from external sources, then using that information to generate more accurate, grounded answers.
How RAG Works:
1. Query: User asks a question
2. Retrieve: Search the knowledge base for relevant documents
3. Augment: Add the retrieved context to the prompt
4. Generate: The LLM produces a response using the context
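The loop below is a minimal sketch of these four steps in Python. The `embed`, `vector_search`, and `llm_generate` functions are hypothetical placeholders standing in for a real embedding model, vector database, and LLM API; only the control flow is meant to be illustrative.

```python
# Minimal RAG loop: query -> retrieve -> augment -> generate.
# All three helpers below are placeholders, not real services.

def embed(text: str) -> list[float]:
    # Placeholder: a real system calls an embedding model here.
    return [float(ord(c)) for c in text[:8]]

def vector_search(query_vec: list[float], top_k: int = 3) -> list[str]:
    # Placeholder: a real system queries a vector database here.
    return ["Doc A: Refunds are processed within 5 business days."]

def llm_generate(prompt: str) -> str:
    # Placeholder: a real system calls an LLM here.
    return f"(answer grounded in: {prompt[:60]}...)"

def rag_answer(question: str) -> str:
    query_vec = embed(question)                     # 1. Query
    docs = vector_search(query_vec)                 # 2. Retrieve
    context = "\n".join(docs)
    prompt = (                                      # 3. Augment
        f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    )
    return llm_generate(prompt)                     # 4. Generate

print(rag_answer("How long do refunds take?"))
```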
Benefits:
- Reduces hallucinations
- Enables access to current information
- Incorporates domain-specific knowledge
- More transparent (can cite sources)
**Components:**
- Vector Database: Stores document embeddings
- Embedding Model: Converts text to vectors
- Retriever: Finds relevant documents
- Generator: LLM that produces final response
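A toy illustration of how the embedding model, vector database, and retriever components fit together, assuming a hash-based stand-in for the embedding model and an in-memory NumPy array in place of a real vector database:

```python
import numpy as np

# Toy retriever: hash-based "embeddings" plus cosine-similarity search.
# A production system would use a trained embedding model and a real vector store.

def toy_embed(text: str, dim: int = 64) -> np.ndarray:
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0          # bag-of-words style hashing
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

documents = [
    "Refunds are processed within 5 business days.",
    "Our support line is open 9am-5pm on weekdays.",
    "Premium plans include priority email support.",
]
index = np.stack([toy_embed(d) for d in documents])   # the "vector database"

def retrieve(query: str, top_k: int = 2) -> list[str]:
    scores = index @ toy_embed(query)          # cosine similarity (unit vectors)
    best = np.argsort(scores)[::-1][:top_k]
    return [documents[i] for i in best]

print(retrieve("How long does a refund take?"))
```

In practice the retriever would return document IDs and scores from a dedicated vector store rather than an in-memory array, and the generator (the LLM) would consume the retrieved text as shown in the earlier loop.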
Examples
A customer service bot that retrieves relevant passages from company documentation before answering a user's question.
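As a rough sketch of how such a bot might augment its prompt, assuming the passages have already been retrieved (the document labels and wording here are invented for illustration):

```python
# Hypothetical prompt template: retrieved passages are injected into the
# prompt and the model is asked to cite which passage it relied on.
retrieved = [
    "[doc 1] Refunds are processed within 5 business days.",
    "[doc 2] Refund requests must be filed within 30 days of purchase.",
]
question = "Can I still get a refund after three weeks?"

prompt = (
    "You are a customer support assistant. Answer using only the passages "
    "below and cite the passage you used.\n\n"
    + "\n".join(retrieved)
    + f"\n\nQuestion: {question}"
)
print(prompt)
```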