06.08 — Medium Analyzer: LCEL-Based RAG Chain¶
Overview¶
This lesson rebuilds the naive retrieval pipeline from Lesson 07 using LangChain Expression Language (LCEL). The result is a composable, traceable, streamable chain that produces the same answers — but with full LangSmith observability, streaming support, async capabilities, and clean composability. This is the production-ready approach.
Why LCEL?¶
| Capability | Naive (Lesson 07) | LCEL (This Lesson) |
|---|---|---|
| Streaming | ❌ No | ✅ Yes — chain.stream() |
| Async | ❌ No | ✅ Yes — chain.ainvoke() |
| Batch processing | ❌ Manual | ✅ Yes — chain.batch() |
| LangSmith tracing | ⚠️ Disconnected traces | ✅ Single unified trace |
| Composability | ❌ Standalone function | ✅ Pipe into other chains |
| Type safety | ❌ Manual | ✅ Runnable interface |
The Same Result, Better Architecture¶
Both implementations take the same input and produce the same output. The difference is how they're structured:
flowchart TD
subgraph Naive["Lesson 07: Naive Approach"]
N1["manual retriever.invoke()"]
N2["manual format_docs()"]
N3["manual prompt.format_messages()"]
N4["manual llm.invoke()"]
N1 --> N2 --> N3 --> N4
end
subgraph LCEL["Lesson 08: LCEL Chain"]
L1["RunnablePassthrough.assign(\ncontext=retriever | format_docs\n)"]
L2["|"]
L3["prompt_template"]
L4["|"]
L5["llm"]
L6["|"]
L7["StrOutputParser"]
L1 --> L2 --> L3 --> L4 --> L5 --> L6 --> L7
end
style Naive fill:#ef4444,color:#fff
style LCEL fill:#10b981,color:#fff
New Imports¶
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from operator import itemgetter
| Import | Purpose |
|---|---|
| `StrOutputParser` | Extracts `.content` from the LLM's `AIMessage` → returns a plain string |
| `RunnablePassthrough` | Passes input through unchanged; `.assign()` adds computed fields to the output dict |
| `itemgetter` | Python utility that extracts a key from a dict — cleaner than `lambda x: x["key"]` |
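Because `itemgetter` is plain Python, you can try it in isolation before it ever touches a chain (the dict below is just an illustrative input):

```python
from operator import itemgetter

# itemgetter("question") returns a callable that pulls the "question" key
get_question = itemgetter("question")

answer = get_question({"question": "What is Pinecone?"})
# → "What is Pinecone?"
```

This is exactly the callable that sits at the head of the sub-chain later in this lesson.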
Building the LCEL Chain¶
def create_retrieval_chain():
"""Create a composable RAG chain using LCEL."""
retrieval_chain = (
RunnablePassthrough.assign(
context=itemgetter("question") | retriever | format_docs
)
| prompt_template
| llm
| StrOutputParser()
)
return retrieval_chain
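The chain assumes `retriever`, `format_docs`, `prompt_template`, and `llm` from Lesson 07 are already in scope. For reference, `format_docs` is typically a one-liner like the sketch below — `FakeDoc` is a stand-in for LangChain's `Document` so the snippet is self-contained, and this may not match Lesson 07's exact code:

```python
from dataclasses import dataclass

@dataclass
class FakeDoc:
    # stand-in for langchain_core.documents.Document (illustration only)
    page_content: str

def format_docs(docs):
    """Join each retrieved document's text into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)

context = format_docs([FakeDoc("Chunk 1 text"), FakeDoc("Chunk 2 text")])
```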
This Is the Tricky Part¶
The chain above is compact but dense. Let's break it down step by step.
Step-by-Step Breakdown¶
The Input¶
When we invoke the chain, the input is a dictionary:

{"question": "What is Pinecone in machine learning?"}
Stage 1: RunnablePassthrough.assign(context=...)¶
This is the most complex part. RunnablePassthrough.assign() does two things simultaneously:
- Passes the input through unchanged (the `{"question": "..."}` dict)
- Adds a new key (`context`) to the output by running a sub-chain
flowchart TD
INPUT["📥 Input:\n{'question': 'What is Pinecone?'}"]
subgraph RPA["RunnablePassthrough.assign(context=...)"]
PASS["Pass through:\nquestion stays"]
SUB["Compute context:\nitemgetter → retriever → format_docs"]
end
OUTPUT["📤 Output:\n{\n 'question': 'What is Pinecone?',\n 'context': 'Pinecone is a managed...'\n}"]
INPUT --> RPA
RPA --> OUTPUT
style RPA fill:#f59e0b,color:#fff
Input: {"question": "What is Pinecone?"}
Output: {"question": "What is Pinecone?", "context": "Pinecone is a managed vector..."}
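A pure-Python analogy for what `.assign()` does — this is not LangChain's implementation, just the dict semantics, with a hypothetical `context` function standing in for the retrieval sub-chain:

```python
def assign(input_dict, **computed):
    """Return a copy of input_dict, extended with keys computed from it."""
    out = dict(input_dict)
    for key, fn in computed.items():
        out[key] = fn(input_dict)  # each new key is derived from the full input
    return out

result = assign(
    {"question": "What is Pinecone?"},
    context=lambda d: f"<docs retrieved for: {d['question']}>",
)
# result keeps "question" unchanged and gains a computed "context" key
```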
The Sub-Chain: itemgetter("question") | retriever | format_docs¶
This sub-chain runs inside the assign():
flowchart LR
IG["itemgetter('question')\n→ 'What is Pinecone?'"]
RET["retriever\n→ [Doc1, Doc2, Doc3]"]
FD["format_docs\n→ 'Chunk text 1\\n\\nChunk text 2...'"]
IG --> RET --> FD
style IG fill:#4a9eff,color:#fff
style RET fill:#8b5cf6,color:#fff
style FD fill:#10b981,color:#fff
1. `itemgetter("question")` — extracts the `"question"` value from the input dict → `"What is Pinecone?"`
2. `retriever` — embeds the string, searches Pinecone → `[Doc1, Doc2, Doc3]`
3. `format_docs` — concatenates document texts → `"Chunk 1 text\n\nChunk 2 text\n\nChunk 3 text"`
[!NOTE]
`format_docs` is a regular Python function, not a LangChain Runnable. When used in an LCEL pipe, LangChain automatically wraps it in a `RunnableLambda` — so it gains `.invoke()`, `.stream()`, and `.ainvoke()` for free.
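The same sub-chain can be emulated step by step in plain Python. The fake retriever below returns hardcoded strings purely to show the data flow — the real retriever embeds the query and searches Pinecone:

```python
from operator import itemgetter

def fake_retriever(query):
    # stand-in for the Pinecone retriever (illustration only)
    return ["Pinecone is a managed vector database.", "It stores embeddings."]

def join_chunks(chunks):
    # stand-in for format_docs, operating on plain strings here
    return "\n\n".join(chunks)

# itemgetter("question") | retriever | format_docs, as explicit calls:
question = itemgetter("question")({"question": "What is Pinecone?"})
context = join_chunks(fake_retriever(question))
```

LCEL's pipe operator is doing exactly this nesting of calls, while also giving each stage tracing, streaming, and async support.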
Stage 2: | prompt_template¶
Receives the dict {"question": "...", "context": "..."} and populates the prompt template:
Answer the question based only on the following context:
Pinecone is a managed vector database...
Chunk 2 text...
Chunk 3 text...
Question: What is Pinecone in machine learning?
Provide a detailed answer.
Stage 3: | llm¶
Sends the populated prompt to GPT-3.5 Turbo → receives an AIMessage.
Stage 4: | StrOutputParser()¶
Extracts AIMessage.content → returns a plain string (the answer text).
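Conceptually, `StrOutputParser` does no more than the sketch below — `FakeAIMessage` is a minimal stand-in for `langchain_core.messages.AIMessage`, used here only for illustration:

```python
class FakeAIMessage:
    # minimal stand-in for langchain_core.messages.AIMessage
    def __init__(self, content):
        self.content = content

def parse_to_str(message):
    """What StrOutputParser does at its core: return message.content."""
    return message.content

text = parse_to_str(FakeAIMessage("Pinecone is a fully managed..."))
# → "Pinecone is a fully managed..."
```

The real parser also plugs into streaming, emitting each token chunk as a string.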
Invoking the Chain¶
if __name__ == "__main__":
chain = create_retrieval_chain()
result = chain.invoke({"question": "What is Pinecone in machine learning?"})
print(result)
# → "Pinecone is a fully managed cloud-based vector database..."
The result is identical to the naive implementation — but the chain is now a Runnable with full capabilities:
# Streaming (token by token)
for chunk in chain.stream({"question": "What is Pinecone?"}):
print(chunk, end="", flush=True)
# Async (await must run inside an async function / event loop)
result = await chain.ainvoke({"question": "What is Pinecone?"})
# Batch
results = chain.batch([
{"question": "What is Pinecone?"},
{"question": "How do embeddings work?"}
])
LangSmith Trace: The Key Advantage¶
With the naive approach, traces were disconnected. With LCEL, everything appears in one unified trace:
📊 RunnableSequence (8.2s)
├── 📥 Input: {"question": "What is Pinecone in ML?"}
├── 🔧 RunnablePassthrough.assign
│ ├── 🔎 itemgetter → "What is Pinecone in ML?"
│ ├── 🔍 VectorStoreRetriever (1.2s)
│ │ ├── Input: "What is Pinecone in ML?"
│ │ └── Output: [Doc1, Doc2, Doc3]
│ └── 🔧 format_docs → "Pinecone is a managed..."
├── 📝 ChatPromptTemplate
│ ├── Input: {"question": "...", "context": "..."}
│ └── Output: [HumanMessage with augmented prompt]
├── 🤖 ChatOpenAI (6.5s)
│ ├── Input: Augmented prompt
│ └── Output: AIMessage("Pinecone is a fully managed...")
├── 📤 StrOutputParser → "Pinecone is a fully managed..."
└── 📤 Final Output: "Pinecone is a fully managed..."
Every step is visible, timed, and linked. You can see:
- What the retriever returned (and how long it took)
- The exact prompt that was sent to the LLM
- The LLM's response and timing
- Where bottlenecks are (retrieval? LLM? formatting?)
Comparing Naive vs. LCEL Side-by-Side¶
| Naive Step | LCEL Equivalent |
|---|---|
| `docs = retriever.invoke(query)` | `itemgetter("question") \| retriever` (inside `assign`) |
| `context = format_docs(docs)` | `\| format_docs` (inside `assign`) |
| `messages = prompt.format_messages(...)` | `\| prompt_template` (accepts dict with both keys) |
| `response = llm.invoke(messages)` | `\| llm` |
| `return response.content` | `\| StrOutputParser()` |
Summary¶
| Concept | What We Learned |
|---|---|
| `RunnablePassthrough.assign()` | Passes input through while adding new computed keys to the dict |
| `itemgetter("question")` | Extracts a specific key from the input dict — cleaner than a lambda |
| Auto-wrapping | Regular Python functions are automatically wrapped as RunnableLambda in LCEL pipes |
| Unified trace | All steps appear in one LangSmith trace — crucial for debugging |
| Streaming / Async / Batch | Free capabilities from the Runnable interface |
| Same result, better architecture | LCEL produces identical answers but with production-ready infrastructure |