
07. Building a Documentation Assistant (RAG)

Overview

This section builds a complete, production-oriented RAG application end-to-end — from crawling live documentation, through ingestion and indexing, to retrieval with an agent, and finally a Streamlit-based chat UI. Unlike Section 06 (which used a single blog post), this project ingests an entire documentation website (LangChain's docs), demonstrating real-world scale challenges like rate limiting, batch processing, and concurrent API calls.

Architecture

```mermaid
flowchart TD
    subgraph Ingestion["📥 Ingestion Pipeline"]
        CRAWL["🌐 Tavily Crawl\n(Crawl docs website)"]
        DOCS["📄 LangChain Documents"]
        SPLIT["✂️ RecursiveCharacterTextSplitter\n(chunk_size=4000)"]
        BATCH["📦 Batch Processing\n(concurrent embedding)"]
        STORE["🗄️ Pinecone / ChromaDB"]

        CRAWL --> DOCS --> SPLIT --> BATCH --> STORE
    end

    subgraph Retrieval["📤 Retrieval Pipeline"]
        Q["❓ User Query"]
        AGENT["🤖 LangGraph Agent\n(with retrieval tool)"]
        SEARCH["🔍 Similarity Search\n(top K chunks)"]
        LLM["🤖 LLM Generation\n(GPT-4/5)"]
        ANS["✅ Grounded Answer\n+ Source Citations"]

        Q --> AGENT --> SEARCH --> LLM --> ANS
        STORE -.-> SEARCH
    end

    subgraph Frontend["🖥️ Frontend"]
        ST["Streamlit Chat UI"]
        HIST["Session State\n(Chat History)"]
        SRC["📎 Source Citations"]

        ST --> HIST
        ANS --> ST
        ST --> SRC
    end

    style Ingestion fill:#4a9eff,color:#fff
    style Retrieval fill:#10b981,color:#fff
    style Frontend fill:#8b5cf6,color:#fff
```
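The "Batch Processing" stage is where the real-world scale concerns (rate limits, concurrency) show up. As a framework-free sketch of the pattern, the snippet below fans batches of chunks out to a thread pool; `index_batch` is a placeholder for the real work (embedding the batch and upserting it into Pinecone or ChromaDB), and a production version would also retry on rate-limit errors:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def index_batch(batch):
    # Placeholder: the real version embeds `batch` and upserts the
    # vectors into the store. Here we just report how many were handled.
    return len(batch)

def index_all(chunks, batch_size=50, max_workers=5):
    """Split `chunks` into batches and index the batches concurrently."""
    batches = [chunks[i:i + batch_size] for i in range(0, len(chunks), batch_size)]
    indexed = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(index_batch, b) for b in batches]
        for future in as_completed(futures):
            indexed += future.result()  # re-raises any worker exception
    return indexed

chunks = [f"chunk-{i}" for i in range(230)]
print(index_all(chunks))  # prints 230 (5 batches: 4 x 50 + 1 x 30)
```

The batch size trades throughput against the embedding API's rate limits; the course's ingestion lesson tunes both this and `max_workers`.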

Lesson Map

| # | Lesson | Focus |
|---|--------|-------|
| 1 | What Are We Building? | Project overview: documentation helper with RAG + Streamlit |
| 2 | Pipenv vs uv | Quick note on package manager differences |
| 3 | Environment Setup | Clone, install, Pinecone index, API keys |
| 4 | Ingestion Pipeline Intro | Architecture overview: Tavily for crawling, LangChain for indexing |
| 5 | Imports & Initialization | All imports, SSL config, embeddings, vector store, rate limiting |
| 6 | Tavily Crawl | One-call crawling with TavilyCrawl: depth, instructions, filtering |
| 7 | TavilyMap & TavilyExtract | Manual two-step crawling: map URLs, then extract content |
| 8 | Crawling Deep Dive | Advanced: batch extraction, concurrent processing, error handling |
| 9 | Recap | Transition from crawling to chunking and indexing |
| 10 | Chunking & Text Splitting | RecursiveCharacterTextSplitter, chunk-size philosophy, RAG vs. long context |
| 11 | Batch Indexing | Concurrent vector store indexing, rate-limit handling, ChromaDB alternative |
| 12 | Retrieval Agent | Agent with retrieval tool, content_and_artifact, init_chat_model, source tracking |
| 13 | Run, Debug, Trace | Debug walkthrough, LangSmith trace analysis, artifact inspection |
| 14 | Frontend with Streamlit | Chat UI, session state, message rendering, source-citation display |
| 15 | Production RAG | Chat LangChain: open-source production RAG with Agentic RAG + LangGraph |
| 16 | RAG Architectures | Two-step vs. agent vs. hybrid RAG: comparison and production recommendations |
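Lesson 10's splitter is worth previewing. The sketch below is a deliberately simplified, framework-free illustration of the strategy behind `RecursiveCharacterTextSplitter`: try the coarsest separator first (paragraphs), and only fall back to finer ones (lines, words, characters) for pieces that are still too large. The real class additionally supports `chunk_overlap`, regex separators, and length functions, and the course uses `chunk_size=4000`; a small size is used here so the behavior is visible:

```python
def split_text(text, chunk_size=200, separators=("\n\n", "\n", " ", "")):
    """Simplified recursive splitting: merge pieces up to chunk_size,
    re-splitting oversized pieces with progressively finer separators."""
    if len(text) <= chunk_size:
        return [text]
    sep, rest = separators[0], separators[1:]
    pieces = text.split(sep) if sep else list(text)
    chunks, current = [], ""
    for piece in pieces:
        candidate = current + sep + piece if current else piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            if len(piece) > chunk_size and rest:
                # Piece alone is too big: recurse with finer separators.
                chunks.extend(split_text(piece, chunk_size, rest))
                current = ""
            else:
                current = piece
    if current:
        chunks.append(current)
    return chunks

doc = "\n\n".join(f"Paragraph {i}: " + "lorem " * 60 for i in range(3))
chunks = split_text(doc, chunk_size=200)
# Every chunk fits the budget; paragraph boundaries are preferred cut points.
assert all(len(c) <= 200 for c in chunks)
```

The design intuition is that semantically coherent units (paragraphs) should survive intact whenever they fit, which is why chunk boundaries matter as much as chunk size for retrieval quality.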

Key Technologies

| Technology | Role |
|------------|------|
| Tavily | Web crawling and content extraction (map, extract, crawl) |
| LangChain | Document loaders, text splitters, embeddings, vector stores, LCEL |
| Pinecone | Cloud-managed vector database |
| ChromaDB | Local, open-source vector database alternative |
| OpenAI | Embeddings (text-embedding-3-small) + LLM (GPT-4/5) |
| LangGraph | Agent framework (via create_agent) |
| Streamlit | Python-based chat UI for prototyping |
| LangSmith | Tracing and observability |
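One pattern from the retrieval lesson worth calling out is the tool's dual return value. In LangChain, a tool declared with `@tool(response_format="content_and_artifact")` returns a pair: a serialized string for the LLM and a raw artifact (the retrieved `Document` objects) that downstream code, such as the Streamlit frontend, uses for source citations. Below is a framework-free sketch of that pattern; `fake_search` and the dict-shaped documents are stand-ins for `vector_store.similarity_search` and real `Document` objects:

```python
def retrieve(query, vector_search, k=4):
    """Retrieval tool in the content_and_artifact style: element one is
    what the LLM reads, element two is kept for citation display."""
    docs = vector_search(query, k)
    content = "\n\n".join(f"Source: {d['source']}\n{d['text']}" for d in docs)
    return content, docs

# Hypothetical stand-in for a vector store's similarity search.
def fake_search(query, k):
    corpus = [
        {"source": "https://python.langchain.com/docs/a", "text": "How to build agents."},
        {"source": "https://python.langchain.com/docs/b", "text": "How to split text."},
    ]
    return corpus[:k]

content, artifact = retrieve("agents", fake_search)
sources = {d["source"] for d in artifact}  # rendered as citations in the UI
```

Keeping the raw documents out of the prompt but attached to the tool message is what lets the app show clickable sources without bloating the model's context.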