The Technology Behind vgr_zirp
vgr_zirp is a retrieval-augmented Q&A system that answers questions in Venkatesh Rao's intellectual voice, drawing on his published writing from 2007–2023. It is not a fine-tuned model — it retrieves relevant passages from the actual corpus and constructs answers grounded in that material. This page documents how it works.
The corpus
The system indexes three bodies of work:
| Corpus | Source | Scale | Coverage |
|---|---|---|---|
| Ribbonfarm blog | ribbonfarm.com | 1,133 posts | 2007–2023 |
| Twitter archive | vgr full export | ~153,000 tweets | 2007–2022 |
| Books + bibliography | EPUBs + bibliography_raw.json | 7 books, 908 bibliography items | 2011–2023 |
The books corpus includes Tempo, Be Slightly Evil, Breaking Smart (season 1 + newsletter archives), Art of Gig (vols 1–3), plus a 908-item bibliography of books, papers, and essays cited across the blog — each bibliography item enriched with a 3-sentence AI-generated semantic summary to improve retrieval.
Embedding and retrieval
Chunking
Blog posts and book text are split into 512-token chunks with 64-token overlap. Each chunk is stored with metadata: source, date, author, title, series membership, and whether the post was collected into a book. Bibliography items are stored as single vectors (one per item, not chunked).
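The chunking scheme above can be sketched as a small function. This is an illustrative sketch only: it operates on an already-tokenized list (the real pipeline presumably uses the embedding model's own tokenizer), and the 512/64 parameters come from the description above.

```python
def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token list into fixed-size chunks with overlap.

    The stride is size - overlap, so each chunk shares its last
    `overlap` tokens with the start of the next chunk.
    """
    stride = size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the final chunk absorbed the tail; stop here
    return chunks

# Toy run: 1,000 "tokens" -> chunks of 512, 512, 104
chunks = chunk_tokens(list(range(1000)))
```

The overlap exists so that a sentence falling on a chunk boundary is fully contained in at least one chunk, which keeps boundary content retrievable.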
Embedding model
Voyage AI voyage-3 — 1,024-dimensional dense vectors, cosine similarity. The same model is used for both document encoding (at index time) and query encoding (at query time), which is important for retrieval quality.
Vector indexes
Three Pinecone serverless indexes (AWS us-east-1):
| Index | Vectors | Contents |
|---|---|---|
| ribbonfarm | 6,489 | All blog posts, chunked |
| vgr-twitter | 57,715 | Full Twitter archive; tweets grouped into threads |
| vgr-books | 2,028+ | Books (chunked) + bibliography (one vector/item) |
Tier weighting
Retrieved chunks are scored by semantic similarity, then adjusted by a content-tier multiplier before merging across indexes. The tier order reflects editorial curation signal:
| Tier | Content type | Weight |
|---|---|---|
| 0 | vgr-books content (non-bibliography) | 1.15× |
| 1 | Blog post collected into a book | 1.10× |
| 2 | Blog post in a named series | 1.05× |
| 3 | Plain blog post | 1.00× |
| 4 | Bibliography item | 0.95× |
| 5 | Tweet collected into Twitter book | 0.90× |
| 6 | Thread (not in book) | 0.85× |
| 7 | Individual tweet | 0.80× |
Up to 8 sources are passed to the language model as context.
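The merge step can be sketched as a pure function over per-index result lists. The tier multipliers are taken from the table above; the hit dictionary shape (`doc_id`, `tier`, `score`) is an assumption for illustration, not the Worker's actual data model.

```python
# Multipliers from the tier table above
TIER_WEIGHTS = {0: 1.15, 1: 1.10, 2: 1.05, 3: 1.00,
                4: 0.95, 5: 0.90, 6: 0.85, 7: 0.80}

def merge_results(result_lists, top_n=8):
    """Apply the tier multiplier to each hit's similarity score,
    deduplicate by document ID (keeping the best-scoring hit),
    and return the top_n sources across all indexes."""
    best = {}
    for hits in result_lists:
        for hit in hits:
            adjusted = hit["score"] * TIER_WEIGHTS[hit["tier"]]
            doc_id = hit["doc_id"]
            if doc_id not in best or adjusted > best[doc_id]["adjusted"]:
                best[doc_id] = {**hit, "adjusted": adjusted}
    ranked = sorted(best.values(), key=lambda h: h["adjusted"], reverse=True)
    return ranked[:top_n]
```

Note the effect of the weighting: an individual tweet with raw similarity 0.90 (adjusted to 0.72) loses to a plain blog post at 0.74, so curated long-form material wins near-ties.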
The persona: deriving a soul document
The most distinctive part of vgr_zirp's architecture is the persona layer — a detailed structured document that captures the author's worldview, characteristic intellectual moves, voice patterns, and rhetorical style, used as the language model's system prompt.
Method
The persona was derived using a process we call the soul document approach, inspired by techniques in the AI character-building community for extracting stable personality representations from a text corpus.
The process distills a corpus into two companion documents: SOUL.md (worldview, themes, opinions) and STYLE.md (voice, rhetorical patterns, what to avoid), which any LLM can load to write in the subject's voice. The resulting documents are more coherent and internally consistent than hand-authored persona prompts, because they are derived from actual writing rather than self-description.
Application to this project
The derivation script (derive_soul.py v2) uses AI-generated summaries of all 708 Venkat-authored posts as its corpus (vs. 90 post excerpts in v1), and uses the 15 empirically-derived topic clusters from the blog's tag co-occurrence graph as structural anchors for SOUL.md theme organization. The script calls Claude Sonnet with two prompts (~99K tokens each):
- plans/SOUL.md: 15 core intellectual themes (up from 13 in v1), full-corpus coverage, with signature vocabulary, known contradictions, characteristic intellectual moves, and strong positions (~26,000 words)
- plans/STYLE.md: sentence-level patterns, rhetorical structures, neologism introduction pattern, era-by-era voice evolution, a "Signature Formats" section on 2×2 matrices and aphorisms, and an explicit "what to avoid" section
A third constant, LEXICON_MD, is generated from the top 50 high-confidence terms in data/glossary_candidates.json — an AI-derived glossary of Venkat's coinages and redefinitions — providing precise, scannable definitions the model can draw on without retrieving a post.
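The glossary-to-lexicon step can be sketched as a filter-sort-format pass. The field names (`term`, `definition`, `confidence`) and the 0.8 confidence floor are assumptions about the shape of data/glossary_candidates.json, used here only to illustrate the selection logic.

```python
def build_lexicon(candidates, top_n=50, min_confidence=0.8):
    """Select the top_n highest-confidence glossary candidates and
    render them as a scannable markdown definition list.

    Field names and the confidence threshold are hypothetical.
    """
    keep = [c for c in candidates if c["confidence"] >= min_confidence]
    keep.sort(key=lambda c: c["confidence"], reverse=True)
    lines = ["LEXICON: coinages and redefinitions", ""]
    for c in keep[:top_n]:
        lines.append(f"- **{c['term']}**: {c['definition']}")
    return "\n".join(lines)
```

Embedding the lexicon in the system prompt means the model can use a coinage correctly even when no retrieved chunk happens to define it.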
All three documents are compiled into workers/oracle/persona.js (~62KB). The system prompt is assembled in workers/oracle/build-prompt.js, which imports from persona.js and is the single source of truth used by both the oracle Worker and the MCP Worker. It includes:
- ORACLE IDENTITY — factual answers to meta-questions the corpus can't answer: the etymology of "vgr_zirp" (Drew Austin tweet on ZIRP-era personality), full biography (born 1974 Jamshedpur; IIT Mumbai B.S. 1997; Michigan M.S./Ph.D. 1999/2004; Cornell postdoc 2004–06; Xerox Research Center Webster NY 2006–11; Sulekha.com 2000–01; Ribbonfarm founded 2007 while at Xerox), the Gervais Principle series, a technical self-description of the RAG pipeline, and a redirect rule for empty-archive queries ("ask the live vgr at venkateshrao.com").
- VOICE RULES — first person for Venkat's content, third person for guest contributors, bibliography items treated as recommended reading.
- SOUL_MD + LEXICON_MD — full worldview and vocabulary.
- STYLE GUIDE — Signature Formats and What to Avoid sections from STYLE_MD.
- GUARDRAILS — four unconditional rules: temporal scope, professional distance, personal scope, persona integrity.
Inference
Each query follows this pipeline:
1. Embed the query via Voyage voyage-3 (input_type="query")
2. Query all three Pinecone indexes in parallel (top-15 / top-12 / top-15 per index)
3. Normalize, tier-weight, and merge results; deduplicate by document ID
4. Select the top 8 sources; build a context block with labeled excerpts
5. Call Claude Haiku (claude-haiku-4-5-20251001) with the persona system prompt + context block
6. Return the answer + sources as JSON
A mode=sources query param stops after step 4, returning retrieval results without an LLM call. This is used for the semantic search interface.
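The pipeline shape, including the sources-mode short-circuit, can be sketched with injected client functions. Here `embed_fn`, `query_fns`, and `llm_fn` are stand-ins for the Voyage, Pinecone, and Anthropic calls; the index queries run sequentially in this sketch where the Worker runs them in parallel, and ranking is simplified to raw score.

```python
def answer_query(question, embed_fn, query_fns, llm_fn, mode="answer"):
    """End-to-end pipeline sketch with pluggable clients."""
    vector = embed_fn(question)                       # step 1: embed query
    result_lists = [q(vector) for q in query_fns]     # step 2: query each index
    best = {}                                         # steps 3-4: dedupe + rank
    for hits in result_lists:
        for h in hits:
            if h["doc_id"] not in best or h["score"] > best[h["doc_id"]]["score"]:
                best[h["doc_id"]] = h
    sources = sorted(best.values(), key=lambda h: h["score"], reverse=True)[:8]
    if mode == "sources":                             # short-circuit before the LLM
        return {"sources": sources}
    context = "\n\n".join(f"[{h['doc_id']}] {h['text']}" for h in sources)
    return {"answer": llm_fn(context, question), "sources": sources}
```

Keeping the clients injectable like this is also what makes the retrieval path testable without burning LLM tokens.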
Cost
Each query costs roughly $0.013 (1.3¢): ~14,400 input tokens × $0.80/M + ~450 output tokens × $4.00/M. The dominant cost is the ~11,200-token persona system prompt (SOUL.md + LEXICON + Signature Formats + What to Avoid). The Voyage embedding call is negligible (~$0.000001/query).
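The arithmetic checks out, using the token counts and Haiku prices stated above:

```python
INPUT_PRICE = 0.80 / 1_000_000   # $/input token (Claude Haiku)
OUTPUT_PRICE = 4.00 / 1_000_000  # $/output token (Claude Haiku)

cost = 14_400 * INPUT_PRICE + 450 * OUTPUT_PRICE
# 0.01152 + 0.0018 = $0.01332 per query, i.e. ~1.3 cents
```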
Infrastructure
| Component | Technology |
|---|---|
| Worker runtime | Cloudflare Workers (V8 isolates) |
| Rate limiting | Cloudflare KV — 20 queries/IP/hour |
| Circuit breaker | KV flag + hourly cron; sleeps when hourly spend exceeds $5 |
| Usage stats | KV accumulators (hourly + daily); visible at bottom of oracle page |
| Alerts | Telegram bot — circuit trips, daily spend summary |
| Deployment | wrangler deploy from workers/oracle/ in the ribbonfarm-site repo |
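The KV rate limiter can be modeled as a fixed-window counter, one key per IP per clock hour. This is a sketch of the pattern under the 20 queries/IP/hour rule above; the key format and windowing choice are assumptions, and a plain dict stands in for Cloudflare KV.

```python
import time

LIMIT = 20      # queries per IP per hour
WINDOW = 3600   # seconds per window

kv = {}  # stand-in for Cloudflare KV: key -> count

def allow(ip, now=None):
    """Fixed-window limiter: increment a per-IP counter keyed by the
    current clock hour; reject once the counter reaches LIMIT."""
    now = time.time() if now is None else now
    key = f"rl:{ip}:{int(now // WINDOW)}"
    count = kv.get(key, 0)
    if count >= LIMIT:
        return False
    kv[key] = count + 1  # in real KV: a put() with a ~1-hour TTL
    return True
```

Fixed windows are a natural fit for KV because expiry falls out of the TTL; the trade-off is a brief burst allowance at window boundaries.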
Limitations
- Temporal boundary: The corpus ends in 2023. vgr_zirp explicitly qualifies responses about post-2023 events and does not attempt to simulate post-corpus positions.
- Retrieval errors: Questions about niche topics with few matching chunks will produce answers that generalize from tangentially related material. The sources panel shows exactly what was retrieved.
- Author vs. archive: vgr_zirp speaks from the written record, not from Venkat's current views. Ideas that were explored and discarded, or positions since revised, remain in the corpus as-written.
- Hallucination risk: Claude Haiku can generate plausible-sounding but incorrect attributions. When specific claims matter, follow the source links.
MCP access (public)
vgr_zirp is available as a public Model Context Protocol server — no account or API key required. Connect it to Claude Code or Claude Desktop and use the corpus directly from your AI client.
Endpoint: https://ribbonfarm.com/mcp
| Tool | What it does | Limit |
|---|---|---|
| ask_vgr_zirp | Full RAG + Claude Haiku response in vgr's voice | 30 calls/IP/day |
| search_corpus | Semantic search across all corpora, returns ranked excerpts | Unlimited |
Claude Code
Run once in your terminal:
```shell
claude mcp add vgr-zirp --transport http https://ribbonfarm.com/mcp
```
Claude Desktop
Add to your claude_desktop_config.json:
```json
{
  "mcpServers": {
    "vgr-zirp": {
      "type": "http",
      "url": "https://ribbonfarm.com/mcp"
    }
  }
}
```
The ask_vgr_zirp daily limit resets at midnight UTC. search_corpus has no limit; use it freely for research or agentic workflows.

Acknowledgments
Voyage AI — voyage-3 embedding model. voyageai.com
Pinecone — serverless vector indexes. pinecone.io
Anthropic — Claude Sonnet (persona derivation) and Claude Haiku (inference). anthropic.com
Soul document methodology — adapted from soul.md by Aaron J. Mars.