New Jobs Simplified, AI University

RAG — Retrieval-Augmented Generation

What is RAG?

This lesson covers Retrieval-Augmented Generation (RAG), an approach that pairs a search step with a language model's generation step. Grounding the model's answer in retrieved documents reduces hallucinations and improves factuality. We will explore the components of a RAG system and how it is applied in various use cases.

Why It Matters

RAG matters because it addresses "hallucinations" in language models: responses that are incorrect or outdated. By prompting the model with the question together with retrieved information, RAG enables more accurate and relevant responses. This is particularly important for applications such as chatbots and search systems, which depend on accurate information.

Key Points

RAG systems combine search and generation capabilities to improve factuality and reduce hallucinations.
A RAG pipeline consists of three steps: retrieval, generation, and output.
The retrieval step finds the documents most similar to the question, typically by comparing embeddings produced by a sentence transformer.
The generation step involves passing the question and retrieved documents to a language model (LLM) to generate a response.
RAG enables use cases like "chat with my data" where users can interact with internal company data or a specific data source of interest.
RAG can be used as a final step in a search pipeline to improve the relevance of search results.
RAG systems can be fine-tuned to adjust the relevance of search results based on user feedback.
RAG can be used to improve the performance of language models in various applications, including chatbots and search systems.
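
The retrieval and generation steps listed above can be sketched as a few plain functions. This is a minimal illustration under stated assumptions, not a reference implementation: `retrieve` scores documents by word overlap as a stand-in for sentence-transformer similarity, `fake_llm` is a hypothetical placeholder for a real LLM call, and every function name and sample document here is invented for the sketch.

```python
from typing import Callable, List


def retrieve(question: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Return the top_k documents sharing the most words with the question.

    Word overlap is only a stand-in for embedding similarity.
    """
    question_words = set(question.lower().split())

    def overlap(doc: str) -> int:
        return len(question_words & set(doc.lower().split()))

    # sorted() is stable, so tied documents keep their original order.
    return sorted(documents, key=overlap, reverse=True)[:top_k]


def build_prompt(question: str, context: List[str]) -> str:
    """Combine the retrieved documents and the question into one grounded prompt."""
    context_block = "\n".join(f"- {doc}" for doc in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}\n"
        "Answer:"
    )


def answer(question: str, documents: List[str],
           llm: Callable[[str], str], top_k: int = 2) -> str:
    """Run retrieval, then pass the grounded prompt to the language model."""
    context = retrieve(question, documents, top_k)
    return llm(build_prompt(question, context))


def fake_llm(prompt: str) -> str:
    """Hypothetical placeholder: a real system would call an LLM here."""
    return f"[LLM response to a prompt of {len(prompt)} characters]"


# Made-up documents standing in for internal company data.
documents = [
    "The office is closed on public holidays.",
    "Expense reports are due on the first Monday of each month.",
    "The cafeteria serves lunch from noon until two.",
]
print(answer("When are expense reports due?", documents, fake_llm, top_k=1))
```

Passing the language model in as a callable keeps the sketch independent of any particular LLM provider; swapping `fake_llm` for a real client would not change the retrieval or prompt-building code.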

Key Concepts

RAG

Retrieval-Augmented Generation: a search step followed by a grounded generation step, in which the LLM is prompted with the question and the information retrieved by the search.

LLM

Large Language Model, a type of artificial intelligence model that can process and generate human-like text.

Sentence Transformer

A type of model that transforms text into a numerical representation (an embedding) that can be compared with other embeddings for tasks such as semantic search.

Hallucinations

The tendency of language models to produce incorrect or outdated information, often stated confidently.
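
To make the Sentence Transformer entry more concrete: once texts are represented as vectors, search reduces to comparing vectors, commonly with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real sentence-transformer embeddings have hundreds of dimensions, and the variable names are assumptions of this example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; values near 1.0 mean same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings, for illustration only.
query_vec = [1.0, 0.0, 1.0]
similar_doc_vec = [2.0, 0.0, 2.0]    # points in the same direction as the query
unrelated_doc_vec = [0.0, 3.0, 0.0]  # orthogonal to the query

print(cosine_similarity(query_vec, similar_doc_vec))    # close to 1
print(cosine_similarity(query_vec, unrelated_doc_vec))  # 0.0
```

Cosine similarity ignores vector length, which is why the document vector that is twice as long as the query still scores as a near-perfect match.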

Code Examples

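Loading a sentence transformer for the retrieval step of a minimal RAG pipeline

from sentence_transformers import SentenceTransformer
# Load a small pretrained embedding model (weights are downloaded on first use).
model = SentenceTransformer('all-MiniLM-L6-v2')
question_embedding = model.encode("What is RAG?")  # vector to compare against document vectors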
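
Loading the model is only the first part of retrieval; the documents still have to be embedded and ranked against the question. The following is a hedged sketch of that ranking step. The function `top_k_documents`, the `toy_encode` stand-in, and the sample documents are all hypothetical names introduced here. The encoder is passed in as a parameter so the example runs without downloading a model.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def top_k_documents(
    question: str,
    documents: List[str],
    encode: Callable[[List[str]], Sequence[Vector]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Rank documents by cosine similarity to the question and return the best k."""
    # Embed the question and the documents in one batch.
    vectors = encode([question] + documents)
    question_vec, doc_vecs = vectors[0], vectors[1:]

    def cosine(a: Vector, b: Vector) -> float:
        dot = sum(float(x) * float(y) for x, y in zip(a, b))
        norm = math.sqrt(sum(float(x) ** 2 for x in a)) * math.sqrt(
            sum(float(y) ** 2 for y in b)
        )
        # Guard against all-zero vectors, which have no direction.
        return dot / norm if norm else 0.0

    scored = [(doc, cosine(question_vec, vec)) for doc, vec in zip(documents, doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def toy_encode(texts: List[str]) -> List[List[float]]:
    """Hypothetical stand-in encoder: counts a few fixed keywords per text."""
    keywords = ["refund", "shipping", "password"]
    return [[float(text.lower().count(word)) for word in keywords] for text in texts]


# Made-up documents for illustration.
documents = [
    "Refund requests take five days.",
    "Shipping is free over fifty dollars.",
    "Reset your password from the login page.",
]
print(top_k_documents("How do I get a refund?", documents, toy_encode, k=1))
```

With the sentence-transformers library installed, the `model.encode` method of the model loaded above could be passed as `encode` in place of `toy_encode`, since it accepts a list of strings and returns one vector per string.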

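Loading a BERT classifier as a starting point for relevance scoring (fine-tuning itself is not shown)

from transformers import pipeline
# bert-base-uncased has no trained classification head, so its scores are not
# meaningful until the model is fine-tuned on labeled relevance data.
model = pipeline('text-classification', model='bert-base-uncased', tokenizer='bert-base-uncased')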

From the books

“… In this chapter, we looked at different ways of using language models to improve existing search systems and even be the core of new, more p…”
“the most mature, well-maintained systems that billions of people around the planet rely on. The ability they add is called semantic search, which enables searching by meaning, and not simply keyword m…”
“of a search step followed by a grounded generation step where the LLM is prompted with the question and the information retrieved from the search step. RAG systems incorporate search capabilities in a…”