07. Building a Documentation Assistant (RAG)
Overview
This section builds a complete, production-oriented RAG application end to end: crawling live documentation, ingesting and indexing it, retrieving with an agent, and serving answers through a Streamlit chat UI. Unlike Section 06, which used a single blog post, this project ingests an entire documentation website (LangChain's docs), so it has to handle real-world scale challenges such as rate limiting, batch processing, and concurrent API calls.
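The scale concerns named above (rate limiting, batch processing, concurrent API calls) can be illustrated with a minimal, standard-library-only sketch. It is not the course's ingestion code: `embed_batch` is a hypothetical stand-in for a real embedding API call, and `batch_size` and `max_concurrency` are illustrative values chosen for the demo.

```python
import asyncio


def make_batches(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


async def embed_batch(batch):
    """Hypothetical stand-in for a real embedding API call."""
    await asyncio.sleep(0.01)  # simulate network latency
    # Fake one-number "embedding": the length of each text
    return [float(len(text)) for text in batch]


async def embed_all(texts, batch_size=3, max_concurrency=2):
    """Embed all texts in batches, with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(batch):
        # The semaphore caps concurrent calls, a simple way to stay under a rate limit
        async with semaphore:
            return await embed_batch(batch)

    batches = make_batches(texts, batch_size)
    results = await asyncio.gather(*(guarded(b) for b in batches))
    # gather preserves input order, so flattening keeps texts and vectors aligned
    return [vec for batch_result in results for vec in batch_result]


chunks = [f"chunk {i}" for i in range(8)]
vectors = asyncio.run(embed_all(chunks))
print(len(vectors))  # prints 8
```

Bounding concurrency with a semaphore keeps throughput high while limiting how many requests hit the provider at once. A real pipeline would typically also add retries with backoff for failed batches.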
Architecture
flowchart TD
subgraph Ingestion["📥 Ingestion Pipeline"]
CRAWL["🌐 Tavily Crawl\n(Crawl docs website)"]
DOCS["📄 LangChain Documents"]
SPLIT["✂️ RecursiveCharacterTextSplitter\n(chunk_size=4000)"]
BATCH["📦 Batch Processing\n(concurrent embedding)"]
STORE["🗄️ Pinecone / ChromaDB"]
CRAWL --> DOCS --> SPLIT --> BATCH --> STORE
end
subgraph Retrieval["📤 Retrieval Pipeline"]
Q["❓ User Query"]
AGENT["🤖 LangGraph Agent\n(with retrieval tool)"]
SEARCH["🔍 Similarity Search\n(top K chunks)"]
LLM["🤖 LLM Generation\n(GPT-4/5)"]
ANS["✅ Grounded Answer\n+ Source Citations"]
Q --> AGENT --> SEARCH --> LLM --> ANS
STORE -.-> SEARCH
end
subgraph Frontend["🖥️ Frontend"]
ST["Streamlit Chat UI"]
HIST["Session State\n(Chat History)"]
SRC["📎 Source Citations"]
ST --> HIST
ANS --> ST
ST --> SRC
end
style Ingestion fill:#4a9eff,color:#fff
style Retrieval fill:#10b981,color:#fff
style Frontend fill:#8b5cf6,color:#fff
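The splitting step in the ingestion pipeline can be pictured with a simplified sketch of recursive character splitting. This is an illustration of the general idea only, not the implementation of LangChain's RecursiveCharacterTextSplitter (which also supports overlap and custom length functions). The function name `split_recursive` and the separator list are assumptions, and the demo uses a tiny chunk size instead of the 4000 shown in the diagram.

```python
def split_recursive(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters.

    Coarse separators (paragraphs) are tried first; finer ones are used
    only for pieces that are still too large.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut every chunk_size characters
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces into one chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # Piece is still too large: retry with the next, finer separator
            chunks.extend(split_recursive(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa bbbb\n\ncccc dddd eeee"
for chunk in split_recursive(sample, chunk_size=10):
    print(repr(chunk))
```

Preferring paragraph and line boundaries before falling back to spaces or hard cuts tends to keep related sentences in the same chunk, which usually helps retrieval quality.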
Lesson Map
| # | Lesson | Focus |
|---|---|---|
| 1 | What Are We Building? | Project overview: documentation helper with RAG + Streamlit |
| 2 | Pipenv vs uv | Quick note on package manager differences |
| 3 | Environment Setup | Clone, install, Pinecone index, API keys |
| 4 | Ingestion Pipeline Intro | Architecture overview: Tavily for crawling, LangChain for indexing |
| 5 | Imports & Initialization | All imports, SSL config, embeddings, vector store, rate limiting |
| 6 | Tavily Crawl | One-call crawling with TavilyCrawl: depth, instructions, filtering |
| 7 | TavilyMap & TavilyExtract | Manual two-step crawling: map URLs, then extract content |
| 8 | Crawling Deep Dive | Advanced: batch extraction, concurrent processing, error handling |
| 9 | Recap | Transition from crawling to chunking and indexing |
| 10 | Chunking & Text Splitting | RecursiveCharacterTextSplitter, chunk size philosophy, RAG vs long context |
| 11 | Batch Indexing | Concurrent vector store indexing, rate limit handling, ChromaDB alternative |
| 12 | Retrieval Agent | Agent with retrieval tool, content_and_artifact, init_chat_model, source tracking |
| 13 | Run, Debug, Trace | Debug walkthrough, LangSmith trace analysis, artifact inspection |
| 14 | Frontend with Streamlit | Chat UI, session state, message rendering, source citation display |
| 15 | Production RAG | Chat LangChain: open-source production RAG with Agentic RAG + LangGraph |
| 16 | RAG Architectures | Two-step vs Agent vs Hybrid RAG: comparison and production recommendations |
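The retrieval tool in lesson 12 returns both text for the model and raw documents for source tracking. The sketch below imitates that content-and-artifact pattern in plain Python, without LangChain, so it can run anywhere. Everything in it is an assumption for illustration: the `retrieve` function, the toy two-dimensional vectors, and the `example.com` URLs are hypothetical, and a real setup would query Pinecone or ChromaDB with model-generated embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vector, index, k=2):
    """Return (content, artifact) for the top-k most similar chunks.

    content  -> serialized text handed to the LLM
    artifact -> the raw chunk records, kept for source citations in the UI
    """
    ranked = sorted(
        index,
        key=lambda doc: cosine_similarity(query_vector, doc["vector"]),
        reverse=True,
    )
    top = ranked[:k]
    content = "\n\n".join(f"Source: {d['source']}\n{d['text']}" for d in top)
    return content, top


# Hypothetical toy index; real vectors would come from an embedding model
toy_index = [
    {"source": "https://example.com/docs/a", "text": "About agents.", "vector": [1.0, 0.0]},
    {"source": "https://example.com/docs/b", "text": "About loaders.", "vector": [0.0, 1.0]},
    {"source": "https://example.com/docs/c", "text": "About both.", "vector": [1.0, 1.0]},
]

content, artifact = retrieve([1.0, 0.1], toy_index, k=2)
print([d["source"] for d in artifact])
```

Keeping the artifact separate from the content means the model only sees the text it needs, while the frontend can still render the source URLs alongside the answer.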
Key Technologies
| Technology | Role |
|---|---|
| Tavily | Web crawling and content extraction (map, extract, crawl) |
| LangChain | Document loaders, text splitters, embeddings, vector stores, LCEL |
| Pinecone | Cloud-managed vector database |
| ChromaDB | Local open-source vector database alternative |
| OpenAI | Embeddings (text-embedding-3-small) + LLM (GPT-4/5) |
| LangGraph | Agent framework (via create_agent) |
| Streamlit | Python-based chat UI for prototyping |
| LangSmith | Tracing and observability |