
vgr_zirp is a retrieval-augmented Q&A system that answers questions in Venkatesh Rao's intellectual voice, drawing on his published writing from 2007–2023. It is not a fine-tuned model — it retrieves relevant passages from the actual corpus and constructs answers grounded in that material. This page documents how it works.

The corpus

The system indexes three bodies of work:

| Corpus | Source | Scale | Coverage |
|---|---|---|---|
| Ribbonfarm blog | ribbonfarm.com | 1,133 posts | 2007–2023 |
| Twitter archive | vgr full export | ~153,000 tweets | 2007–2022 |
| Books + bibliography | EPUBs + bibliography_raw.json | 7 books, 908 bibliography items | 2011–2023 |

The books corpus includes Tempo, Be Slightly Evil, Breaking Smart (season 1 + newsletter archives), Art of Gig (vols 1–3), plus a 908-item bibliography of books, papers, and essays cited across the blog — each bibliography item enriched with a 3-sentence AI-generated semantic summary to improve retrieval.

Embedding and retrieval

Chunking

Blog posts and book text are split into 512-token chunks with 64-token overlap. Each chunk is stored with metadata: source, date, author, title, series membership, and whether the post was collected into a book. Bibliography items are stored as single vectors (one per item, not chunked).
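The sliding-window arithmetic above can be sketched as follows. This is an illustration only: tokenization and metadata attachment are omitted, and `chunk_tokens` is a hypothetical name, not the actual indexing code.

```python
def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token sequence into overlapping windows.

    Consecutive chunks share `overlap` tokens, so the window
    advances by (size - overlap) tokens each step.
    """
    stride = size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # final window reached the end of the document
    return chunks
```

With the defaults, a 1,000-token post yields three chunks starting at token offsets 0, 448, and 896, with each adjacent pair sharing 64 tokens.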

Embedding model

Voyage AI voyage-3 — 1,024-dimensional dense vectors, cosine similarity. The same model is used for both document encoding (at index time) and query encoding (at query time), which is important for retrieval quality.
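Ranking is by cosine similarity over these dense vectors. A minimal reference implementation of the metric (in production the similarity is computed inside Pinecone, not in application code):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length dense vectors
    (1,024 dimensions in this system's case)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```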

Vector indexes

Three Pinecone serverless indexes (AWS us-east-1):

| Index | Vectors | Contents |
|---|---|---|
| ribbonfarm | 6,489 | All blog posts, chunked |
| vgr-twitter | 57,715 | Full Twitter archive; tweets grouped into threads |
| vgr-books | 2,028+ | Books (chunked) + bibliography (one vector/item) |

Tier weighting

Retrieved chunks are scored by semantic similarity, then adjusted by a content-tier multiplier before merging across indexes. The tier order reflects editorial curation signal:

| Tier | Content type | Weight |
|---|---|---|
| 0 | vgr-books content (non-bibliography) | 1.15× |
| 1 | Blog post collected into a book | 1.10× |
| 2 | Blog post in a named series | 1.05× |
| 3 | Plain blog post | 1.00× |
| 4 | Bibliography item | 0.95× |
| 5 | Tweet collected into Twitter book | 0.90× |
| 6 | Thread (not in book) | 0.85× |
| 7 | Individual tweet | 0.80× |

Up to 8 sources are passed to the language model as context.
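The weight-then-merge step can be sketched as a pure function. This illustrates the scoring logic described above; field names like `doc_id`, `tier`, and `similarity` are placeholder assumptions, not the actual schema.

```python
# Tier multipliers from the table above
TIER_WEIGHTS = {0: 1.15, 1: 1.10, 2: 1.05, 3: 1.00,
                4: 0.95, 5: 0.90, 6: 0.85, 7: 0.80}

def merge_results(result_lists, top_k=8):
    """Apply tier multipliers to raw similarities, deduplicate by
    document ID (keeping the best score), and return the top_k."""
    best = {}
    for results in result_lists:          # one list per index
        for r in results:
            score = r["similarity"] * TIER_WEIGHTS[r["tier"]]
            if r["doc_id"] not in best or score > best[r["doc_id"]]["score"]:
                best[r["doc_id"]] = {**r, "score": score}
    return sorted(best.values(), key=lambda x: x["score"], reverse=True)[:top_k]
```

Note the effect of the multipliers: an individual tweet with raw similarity 0.90 scores 0.72 after weighting and ranks below a plain blog post at 0.75.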

The persona: deriving a soul document

The most distinctive part of vgr_zirp's architecture is the persona layer — a detailed structured document that captures the author's worldview, characteristic intellectual moves, voice patterns, and rhetorical style, used as the language model's system prompt.

Method

The persona was derived using a process we call the soul document approach, inspired by techniques in the AI character-building community for extracting stable personality representations from a text corpus.

Attribution: The soul document methodology used here was adapted from soul.md by Aaron J. Mars. The core idea: rather than hand-authoring persona instructions, dump your writing into a folder, let a capable language model analyze it, and synthesize a structured set of documents — SOUL.md (worldview, themes, opinions) and STYLE.md (voice, rhetorical patterns, what to avoid) — that any LLM can load to write as you. The resulting documents are more coherent and internally consistent than hand-authored persona prompts, because they are derived from actual writing rather than self-description.

Application to this project

The derivation script (derive_soul.py v2) uses AI-generated summaries of all 708 Venkat-authored posts as its corpus (v1 used 90 post excerpts), and anchors SOUL.md's theme organization on the 15 topic clusters empirically derived from the blog's tag co-occurrence graph. The script calls Claude Sonnet with two prompts (~99K tokens each):

  • plans/SOUL.md — 15 core intellectual themes (up from 13 in v1), full-corpus coverage, with signature vocabulary, known contradictions, characteristic intellectual moves, and strong positions (~26,000 words)
  • plans/STYLE.md — sentence-level patterns, rhetorical structures, neologism introduction pattern, era-by-era voice evolution, a "Signature Formats" section on 2×2 matrices and aphorisms, and an explicit "what to avoid" section

A third constant, LEXICON_MD, is generated from the top 50 high-confidence terms in data/glossary_candidates.json — an AI-derived glossary of Venkat's coinages and redefinitions — providing precise, scannable definitions the model can draw on without retrieving a post.

All three documents are compiled into workers/oracle/persona.js (~62KB). The system prompt is assembled in workers/oracle/build-prompt.js, which imports from persona.js and is the single source of truth used by both the oracle Worker and the MCP Worker. It includes:

  • ORACLE IDENTITY — factual answers to meta-questions the corpus can't answer: the etymology of "vgr_zirp" (Drew Austin tweet on ZIRP-era personality), full biography (born 1974 Jamshedpur; IIT Mumbai B.S. 1997; Michigan M.S./Ph.D. 1999/2004; Cornell postdoc 2004–06; Xerox Research Center Webster NY 2006–11; Sulekha.com 2000–01; Ribbonfarm founded 2007 while at Xerox), the Gervais Principle series, a technical self-description of the RAG pipeline, and a redirect rule for empty-archive queries ("ask the live vgr at venkateshrao.com").
  • VOICE RULES — first person for Venkat's content, third person for guest contributors, bibliography items treated as recommended reading.
  • SOUL_MD + LEXICON_MD — full worldview and vocabulary.
  • STYLE GUIDE — Signature Formats and What to Avoid sections from STYLE_MD.
  • GUARDRAILS — four unconditional rules: temporal scope, professional distance, personal scope, persona integrity.

Inference

Each query follows this pipeline:

  1. Embed query via Voyage voyage-3 (input_type="query")
  2. Query all three Pinecone indexes in parallel (top-15 / top-12 / top-15 per index)
  3. Normalize, tier-weight, and merge results; deduplicate by document ID
  4. Select top 8 sources; build context block with labeled excerpts
  5. Call Claude Haiku (claude-haiku-4-5-20251001) with persona system prompt + context block
  6. Return answer + sources as JSON

A mode=sources query param stops after step 4, returning retrieval results without an LLM call. This is used for the semantic search interface.
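The branch can be sketched as follows. `handle_query` and its injected `retrieve`/`generate` steps are hypothetical names for illustration; the real handler lives in the oracle Worker.

```python
def handle_query(query, mode, retrieve, generate):
    """Dispatch sketch: retrieve covers pipeline steps 1-4;
    mode="sources" returns before the LLM call (step 5)."""
    sources = retrieve(query)
    if mode == "sources":
        return {"sources": sources}
    return {"answer": generate(query, sources), "sources": sources}
```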

Cost

Each query costs roughly $0.013 (1.3¢): ~14,400 input tokens × $0.80/M + ~450 output tokens × $4.00/M. The dominant cost is the ~11,200-token persona system prompt (SOUL.md + LEXICON + Signature Formats + What to Avoid). The Voyage embedding call is negligible (~$0.000001/query).
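The arithmetic behind that figure, using the token counts and per-million rates cited above:

```python
# Per-query cost at the Claude Haiku rates cited above (USD per million tokens)
input_tokens, output_tokens = 14_400, 450
input_rate, output_rate = 0.80, 4.00
cost = input_tokens / 1e6 * input_rate + output_tokens / 1e6 * output_rate
print(f"${cost:.4f}")  # prints $0.0133
```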

Infrastructure

| Component | Technology |
|---|---|
| Worker runtime | Cloudflare Workers (V8 isolates) |
| Rate limiting | Cloudflare KV — 20 queries/IP/hour |
| Circuit breaker | KV flag + hourly cron; sleeps when hourly spend exceeds $5 |
| Usage stats | KV accumulators (hourly + daily); visible at bottom of oracle page |
| Alerts | Telegram bot — circuit trips, daily spend summary |
| Deployment | wrangler deploy from workers/oracle/ in the ribbonfarm-site repo |
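The 20-queries/IP/hour limit can be sketched as a fixed-window counter. This is an illustrative Python sketch, not the production Worker code: a plain dict stands in for Cloudflare KV (which would handle window expiry via TTL), and `allow_request` is a hypothetical name.

```python
def allow_request(kv, ip, hour_key, limit=20):
    """Fixed-window rate limit: at most `limit` requests per IP per hour.

    `kv` is a dict standing in for Cloudflare KV; in KV the key
    would carry a TTL so old windows expire automatically.
    """
    key = f"rl:{ip}:{hour_key}"
    count = kv.get(key, 0)
    if count >= limit:
        return False        # over budget for this window
    kv[key] = count + 1
    return True
```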

Limitations

  • Temporal boundary: The corpus ends in 2023. vgr_zirp explicitly qualifies responses about post-2023 events and does not attempt to simulate post-corpus positions.
  • Retrieval errors: Questions about niche topics with few matching chunks will produce answers that generalize from tangentially related material. The sources panel shows exactly what was retrieved.
  • Author vs. archive: vgr_zirp speaks from the written record, not from Venkat's current views. Ideas that were explored and discarded, or positions since revised, remain in the corpus as-written.
  • Hallucination risk: Claude Haiku can generate plausible-sounding but incorrect attributions. When specific claims matter, follow the source links.

MCP access (public)

vgr_zirp is available as a public Model Context Protocol server — no account or API key required. Connect it to Claude Code or Claude Desktop and use the corpus directly from your AI client.

Endpoint: https://ribbonfarm.com/mcp

| Tool | What it does | Limit |
|---|---|---|
| ask_vgr_zirp | Full RAG + Claude Haiku response in vgr's voice | 30 calls/IP/day |
| search_corpus | Semantic search across all corpora, returns ranked excerpts | Unlimited |

Claude Code

Run once in your terminal:

claude mcp add vgr-zirp --transport http https://ribbonfarm.com/mcp

Claude Desktop

Add to your claude_desktop_config.json:

{"mcpServers": {"vgr-zirp": {"type": "http", "url": "https://ribbonfarm.com/mcp"}}}

The ask_vgr_zirp daily limit resets at midnight UTC. search_corpus has no limit — use it freely for research or agentic workflows.

Acknowledgments

Voyage AI — voyage-3 embedding model. voyageai.com

Pinecone — serverless vector indexes. pinecone.io

Anthropic — Claude Sonnet (persona derivation) and Claude Haiku (inference). anthropic.com

Soul document methodology — adapted from soul.md by Aaron J. Mars.