Coursify

Retrieval-Augmented Generation (RAG) — From Fundamentals to Production-Ready Agentic RAG Systems

Corrective RAG (CRAG)

Implement CRAG with document relevance grading, web search fallback, and query rewriting. Build a LangGraph-based system with three states: Correct, Incorrect, and Ambiguous.

In a traditional RAG pipeline, the system assumes the retrieved documents are relevant. If they aren't, the LLM is forced to "hallucinate" an answer or simply fail. Corrective RAG (CRAG) introduces a self-correction mechanism: it uses a "Retrieval Evaluator" (a small LLM) to grade the relevance of retrieved documents. If the documents are deemed irrelevant or ambiguous, CRAG triggers an automatic fallback to Web Search to find fresh, accurate context.

This pattern is a widely used way to keep RAG systems robust when internal data is missing or outdated.

Learning Goals

  • Explain the 3-state logic of Corrective RAG (Correct, Incorrect, Ambiguous).
  • Implement a retrieval grading step using LangChain.
  • Integrate Web Search (Tavily) as a fallback mechanism for failed retrievals.

Core Concepts

1. The Retrieval Evaluator

Before generation, a "Grader" model looks at each retrieved chunk and the query.

  • Goal: Determine if the document contains the information needed to answer the question.
  • Output: A binary "yes" or "no" (or a confidence score).
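The grader's contract (query and chunk in, a "yes"/"no" grade plus optional confidence out) can be sketched without any LLM. This is a minimal illustration, not the lesson's implementation: `Grade` and `keyword_grader` are hypothetical names, and the keyword-overlap check is only a stand-in for the model call so the example runs offline.

```python
from dataclasses import dataclass


@dataclass
class Grade:
    """Result of grading one retrieved chunk against the query."""
    binary_score: str  # "yes" or "no"
    confidence: float  # share of query terms found in the chunk (0.0 to 1.0)


def keyword_grader(query: str, chunk: str, threshold: float = 0.5) -> Grade:
    """Stand-in for the LLM grader: scores a chunk by query-term overlap."""
    terms = {t.strip("?.,!").lower() for t in query.split()}
    terms.discard("")
    if not terms:
        return Grade(binary_score="no", confidence=0.0)
    chunk_words = {w.strip("?.,!").lower() for w in chunk.split()}
    confidence = len(terms & chunk_words) / len(terms)
    return Grade("yes" if confidence >= threshold else "no", confidence)


grade = keyword_grader("reset password", "To reset your password, open Settings.")
print(grade.binary_score)
```

In a real system the body of `keyword_grader` would be replaced by a call to the grading model; the return shape is what the routing logic depends on.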

2. The 3-State Action Loop

Based on the grade, CRAG chooses a path:

  • Correct: All documents are relevant. Proceed to standard RAG generation.
  • Incorrect: No documents are relevant. Discard them and trigger a Web Search.
  • Ambiguous: Some documents are relevant but incomplete. Combine internal docs with Web Search results.
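The three states above can be sketched as a small routing function. This is one possible reading, under the assumption that "Ambiguous" means a mix of relevant and irrelevant grades; `choose_action` and `build_context` are hypothetical helper names, and documents are represented as plain strings.

```python
from typing import List


def choose_action(grades: List[str]) -> str:
    """Map per-document grades ("yes"/"no") to a CRAG state."""
    relevant = sum(1 for g in grades if g == "yes")
    if grades and relevant == len(grades):
        return "correct"
    if relevant == 0:
        # Also covers the case where retrieval returned nothing.
        return "incorrect"
    return "ambiguous"


def build_context(
    action: str, docs: List[str], grades: List[str], web_results: List[str]
) -> List[str]:
    """Assemble the generation context for the chosen state."""
    kept = [d for d, g in zip(docs, grades) if g == "yes"]
    if action == "correct":
        return kept
    if action == "incorrect":
        return list(web_results)
    # Ambiguous: keep the relevant internal docs and add web results.
    return kept + list(web_results)


print(choose_action(["yes", "no"]))
```

Keeping the routing decision separate from context assembly makes each piece easy to test with hand-written grades before a real grader is plugged in.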

3. Query Rewriting

If a web search is needed, the user's conversational query is often too vague. CRAG uses an LLM to "rewrite" the query into optimized keywords for the search engine.
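A rough sketch of the rewriting step, assuming the model is reachable through a plain callable that takes a prompt string and returns text. `REWRITE_PROMPT`, `rewrite_query`, and `fake_llm` are illustrative names, and the prompt wording and the product in the canned reply are made up for the demonstration.

```python
from typing import Callable

REWRITE_PROMPT = (
    "Rewrite the following question as a short web search query. "
    "Keep product names and version numbers, drop filler words, "
    "and return only the query.\n\nQuestion: {question}"
)


def rewrite_query(question: str, llm_call: Callable[[str], str]) -> str:
    """Ask the model for a search-friendly query; fall back to the original."""
    rewritten = llm_call(REWRITE_PROMPT.format(question=question)).strip()
    return rewritten or question


def fake_llm(prompt: str) -> str:
    """Fake model for demonstration: always returns a fixed keyword query."""
    return "  acme sync 2.4 login crash fix  "


print(rewrite_query("hey, why does the app crash when I log in after the update?", fake_llm))
```

Falling back to the original question when the model returns an empty string keeps the search step from being called with no query at all.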

CRAG Architecture
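
The overall flow (retrieve, grade, optionally rewrite the query and search the web, then generate) can be sketched as a plain function with each stage injected as a callable. This is an assumption-laden outline rather than the lesson's LangGraph graph: `crag_pipeline` and its parameter names are hypothetical, and the lambdas below are stubs standing in for a retriever, grader, rewriter, search tool, and generator.

```python
from typing import Callable, List


def crag_pipeline(
    question: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, str], str],
    rewrite: Callable[[str], str],
    web_search: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Retrieve, grade, optionally search the web, then generate."""
    docs = retrieve(question)
    relevant = [d for d in docs if grade(question, d) == "yes"]

    # Search the web unless every retrieved document was graded relevant.
    if not docs or len(relevant) < len(docs):
        relevant = relevant + web_search(rewrite(question))

    return generate(question, relevant)


# Stubbed usage: the only internal doc is outdated, so the web path is taken.
answer = crag_pipeline(
    "What changed in v2?",
    retrieve=lambda q: ["v1 release notes"],
    grade=lambda q, d: "yes" if "v2" in d else "no",
    rewrite=lambda q: "v2 changelog",
    web_search=lambda q: [f"web result for '{q}'"],
    generate=lambda q, ctx: f"Answer based on {len(ctx)} source(s): {ctx}",
)
print(answer)
```

Injecting the stages as callables means the same control flow can be exercised with stubs in tests and later wired to real components.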

Implementing Corrective RAG

  1. Step 1

    Use a small model with structured output to grade documents:

        from pydantic import BaseModel, Field


        class GradeDocuments(BaseModel):
            """Binary relevance grade for one retrieved document."""
            binary_score: str = Field(description="Documents are relevant, 'yes' or 'no'")


        # ... (LLM setup with structured output)
        # `llm` is assumed to be a LangChain chat model that supports structured output.
        structured_llm_grader = llm.with_structured_output(GradeDocuments)
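  2. Step 2

    Initialize the Tavily search tool:

        from langchain_community.tools.tavily_search import TavilySearchResults

        # Requires the TAVILY_API_KEY environment variable.
        # The result count is controlled by `max_results`.
        web_search_tool = TavilySearchResults(max_results=3)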
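  3. Step 3

    Iterate through the docs and trigger a web search only if every one is graded 'no':

        def crag_logic(query, docs):
            # The grader is a chat model, so it takes a prompt string, not a dict.
            scores = [
                structured_llm_grader.invoke(
                    f"Question: {query}\n\nDocument: {d.page_content}\n\n"
                    "Is the document relevant to the question? Answer 'yes' or 'no'."
                )
                for d in docs
            ]

            # "Incorrect" state: no chunk is relevant, so fall back to web search.
            if all(s.binary_score == "no" for s in scores):
                print("Internal docs failed. Triggering web search...")
                return web_search_tool.invoke(query)
            return docs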

Example: Niche Product Support

Imagine a user asks about a bug in a software update released two hours ago. Your vector store was last updated last month. CRAG would retrieve outdated docs, the Grader would see they don't mention the bug, and the system would automatically search the live web to find the solution on the developer's blog or Twitter.

Common Mistakes

  • Grading Latency: Grading 10 documents one-by-one is slow. Use batching or grade them in parallel to keep response times under control.
  • Trusting the Grader blindly: If your Grader is too strict, you'll trigger unnecessary (and costly) web searches. If it's too loose, irrelevant documents reach the generator and the LLM can still hallucinate. Tune the Grader's prompt carefully.
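
One way to address grading latency is to run the per-document grader calls concurrently. The sketch below uses the standard library's thread pool, which suits I/O-bound calls such as LLM requests; `grade_in_parallel` and `slow_grader` are hypothetical names, and the sleep only simulates network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def grade_in_parallel(
    question: str,
    docs: List[str],
    grade: Callable[[str, str], str],
    max_workers: int = 8,
) -> List[str]:
    """Grade all documents concurrently; results keep the order of `docs`."""
    if not docs:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(docs))) as pool:
        return list(pool.map(lambda d: grade(question, d), docs))


def slow_grader(question: str, doc: str) -> str:
    """Stand-in grader that simulates the latency of one LLM call."""
    time.sleep(0.05)
    return "yes" if "refund" in doc else "no"


print(grade_in_parallel("refund policy", ["refund rules", "shipping times"], slow_grader))
```

With a thread pool, total grading time approaches the latency of the slowest single call rather than the sum of all calls. LangChain runnables also expose a `batch` method that can serve the same purpose.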

Recap

  • CRAG adds a "Quality Control" layer to retrieval.
  • Web Search acts as the "Safety Net" for the knowledge base.
  • Structured output (Pydantic) is essential for reliable grading and routing.

Knowledge Check


What is the role of the 'Retrieval Evaluator' in CRAG?