
vgr_zirp is a retrieval-augmented Q&A system that answers questions in Venkatesh Rao's intellectual voice, drawing on his published writing from 2007–2023. It is not a fine-tuned model — it retrieves relevant passages from the actual corpus and constructs answers grounded in that material. This page documents how it works.

The corpus

The system indexes three bodies of work:

| Corpus | Source | Scale | Coverage |
|---|---|---|---|
| Ribbonfarm blog | ribbonfarm.com | 1,133 posts | 2007–2023 |
| Twitter archive | vgr full export | ~153,000 tweets | 2007–2022 |
| Books + bibliography | EPUBs + bibliography_raw.json | 7 books, 908 bibliography items | 2011–2023 |

The books corpus includes Tempo, Be Slightly Evil, Breaking Smart (season 1 + newsletter archives), Art of Gig (vols 1–3), plus a 908-item bibliography of books, papers, and essays cited across the blog — each bibliography item enriched with a 3-sentence AI-generated semantic summary to improve retrieval.

Embedding and retrieval

Chunking

Blog posts and book text are split into 512-token chunks with 64-token overlap. Each chunk is stored with metadata: source, date, author, title, series membership, and whether the post was collected into a book. Bibliography items are stored as single vectors (one per item, not chunked).
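The sliding-window arithmetic above can be sketched as follows. This is an illustration only: tokenization and metadata attachment are omitted, and `chunk_tokens` is a hypothetical name, not the actual indexing code.

```python
def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token sequence into overlapping windows.

    Consecutive chunks share `overlap` tokens, so the window
    advances by (size - overlap) tokens each step.
    """
    stride = size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # final window reached the end of the document
    return chunks
```

With the defaults, a 1,000-token post yields three chunks starting at token offsets 0, 448, and 896, with each adjacent pair sharing 64 tokens.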

Embedding model

Voyage AI voyage-3 — 1,024-dimensional dense vectors, cosine similarity. The same model is used for both document encoding (at index time) and query encoding (at query time), which is important for retrieval quality.
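Ranking is by cosine similarity over these dense vectors. A minimal reference implementation of the metric (in production the similarity is computed inside Pinecone, not in application code):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length dense vectors
    (1,024 dimensions in this system's case)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```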

Vector indexes

Three Pinecone serverless indexes (AWS us-east-1):

| Index | Vectors | Contents |
|---|---|---|
| ribbonfarm | 6,489 | All blog posts, chunked |
| vgr-twitter | 57,715 | Full Twitter archive; tweets grouped into threads |
| vgr-books | 2,028+ | Books (chunked) + bibliography (one vector/item) |

Tier weighting

Retrieved chunks are scored by semantic similarity, then adjusted by a content-tier multiplier before merging across indexes. The tier order reflects editorial curation signal:

| Tier | Content type | Weight |
|---|---|---|
| 0 | vgr-books content (non-bibliography) | 1.15× |
| 1 | Blog post collected into a book | 1.10× |
| 2 | Blog post in a named series | 1.05× |
| 3 | Plain blog post | 1.00× |
| 4 | Bibliography item | 0.95× |
| 5 | Tweet collected into Twitter book | 0.90× |
| 6 | Thread (not in book) | 0.85× |
| 7 | Individual tweet | 0.80× |

Up to 8 sources are passed to the language model as context.
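The weight-then-merge step can be sketched as a pure function. This illustrates the scoring logic described above; field names like `doc_id`, `tier`, and `similarity` are placeholder assumptions, not the actual schema.

```python
# Tier multipliers from the table above
TIER_WEIGHTS = {0: 1.15, 1: 1.10, 2: 1.05, 3: 1.00,
                4: 0.95, 5: 0.90, 6: 0.85, 7: 0.80}

def merge_results(result_lists, top_k=8):
    """Apply tier multipliers to raw similarities, deduplicate by
    document ID (keeping the best score), and return the top_k."""
    best = {}
    for results in result_lists:          # one list per index
        for r in results:
            score = r["similarity"] * TIER_WEIGHTS[r["tier"]]
            if r["doc_id"] not in best or score > best[r["doc_id"]]["score"]:
                best[r["doc_id"]] = {**r, "score": score}
    return sorted(best.values(), key=lambda x: x["score"], reverse=True)[:top_k]
```

Note the effect of the multipliers: an individual tweet with raw similarity 0.90 scores 0.72 after weighting and ranks below a plain blog post at 0.75.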

The persona: deriving a soul document

The most distinctive part of vgr_zirp's architecture is the persona layer — a detailed structured document that captures the author's worldview, characteristic intellectual moves, voice patterns, and rhetorical style, used as the language model's system prompt.

Method

The persona was derived using a process we call the soul document approach, inspired by techniques in the AI character-building community for extracting stable personality representations from a text corpus.

Attribution: The soul document methodology used here was adapted from soul.md by Aaron J. Mars. The core idea: rather than hand-authoring persona instructions, dump your writing into a folder, let a capable language model analyze it, and synthesize a structured set of documents — SOUL.md (worldview, themes, opinions) and STYLE.md (voice, rhetorical patterns, what to avoid) — that any LLM can load to write as you. The resulting documents are more coherent and internally consistent than hand-authored persona prompts, because they are derived from actual writing rather than self-description.

Application to this project

The derivation script (derive_soul.py v2) uses AI-generated summaries of all 708 Venkat-authored posts as its corpus (v1 used 90 post excerpts), and anchors SOUL.md's theme organization on the 15 topic clusters empirically derived from the blog's tag co-occurrence graph. The script calls Claude Sonnet with two prompts (~99K tokens each):

  • plans/SOUL.md — 15 core intellectual themes (up from 13 in v1), full-corpus coverage, with signature vocabulary, known contradictions, characteristic intellectual moves, and strong positions (~26,000 words)
  • plans/STYLE.md — sentence-level patterns, rhetorical structures, neologism introduction pattern, era-by-era voice evolution, a "Signature Formats" section on 2×2 matrices and aphorisms, and an explicit "what to avoid" section

A third constant, LEXICON_MD, is generated from the top 50 high-confidence terms in data/glossary_candidates.json — an AI-derived glossary of Venkat's coinages and redefinitions — providing precise, scannable definitions the model can draw on without retrieving a post.

All three documents are compiled into workers/oracle/persona.js (~62KB). The system prompt is assembled in workers/oracle/build-prompt.js, which imports from persona.js and is the single source of truth used by both the oracle Worker and the MCP Worker. It includes:

  • ORACLE IDENTITY — factual answers to meta-questions the corpus can't answer: the etymology of "vgr_zirp" (Drew Austin tweet on ZIRP-era personality), full biography (born 1974 Jamshedpur; IIT Mumbai B.S. 1997; Michigan M.S./Ph.D. 1999/2004; Cornell postdoc 2004–06; Xerox Research Center Webster NY 2006–11; Sulekha.com 2000–01; Ribbonfarm founded 2007 while at Xerox), the Gervais Principle series, a technical self-description of the RAG pipeline, and a redirect rule for empty-archive queries ("ask the live vgr at venkateshrao.com").
  • VOICE RULES — first person for Venkat's content, third person for guest contributors, bibliography items treated as recommended reading.
  • SOUL_MD + LEXICON_MD — full worldview and vocabulary.
  • STYLE GUIDE — Signature Formats and What to Avoid sections from STYLE_MD.
  • GUARDRAILS — four unconditional rules: temporal scope, professional distance, personal scope, persona integrity.

Inference

Each query follows this pipeline:

  1. Embed query via Voyage voyage-3 (input_type="query")
  2. Query all three Pinecone indexes in parallel (top-15 / top-12 / top-15 per index)
  3. Normalize, tier-weight, and merge results; deduplicate by document ID
  4. Select top 8 sources; build context block with labeled excerpts
  5. Call Claude Haiku (claude-haiku-4-5-20251001) with persona system prompt + context block
  6. Return answer + sources as JSON

A mode=sources query param stops after step 4, returning retrieval results without an LLM call. This is used for the semantic search interface.
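The branch can be sketched as follows. `handle_query` and its injected `retrieve`/`generate` steps are hypothetical names for illustration; the real handler lives in the oracle Worker.

```python
def handle_query(query, mode, retrieve, generate):
    """Dispatch sketch: retrieve covers pipeline steps 1-4;
    mode="sources" returns before the LLM call (step 5)."""
    sources = retrieve(query)
    if mode == "sources":
        return {"sources": sources}
    return {"answer": generate(query, sources), "sources": sources}
```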

Cost

Each query costs roughly $0.013 (1.3¢): ~14,400 input tokens × $0.80/M + ~450 output tokens × $4.00/M. The dominant cost is the ~11,200-token persona system prompt (SOUL.md + LEXICON + Signature Formats + What to Avoid). The Voyage embedding call is negligible (~$0.000001/query).
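The arithmetic behind that figure, using the token counts and per-million rates cited above:

```python
# Per-query cost at the Claude Haiku rates cited above (USD per million tokens)
input_tokens, output_tokens = 14_400, 450
input_rate, output_rate = 0.80, 4.00
cost = input_tokens / 1e6 * input_rate + output_tokens / 1e6 * output_rate
print(f"${cost:.4f}")  # prints $0.0133
```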

Infrastructure

| Component | Technology |
|---|---|
| Worker runtime | Cloudflare Workers (V8 isolates) |
| Rate limiting | Cloudflare KV — 20 queries/IP/hour |
| Circuit breaker | KV flag + hourly cron; sleeps when hourly spend exceeds $5 |
| Usage stats | KV accumulators (hourly + daily); visible at bottom of oracle page |
| Alerts | Telegram bot — circuit trips, daily spend summary |
| Deployment | wrangler deploy from workers/oracle/ in the ribbonfarm-site repo |
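The 20-queries/IP/hour limit can be sketched as a fixed-window counter. This is an illustrative Python sketch, not the production Worker code: a plain dict stands in for Cloudflare KV (which would handle window expiry via TTL), and `allow_request` is a hypothetical name.

```python
def allow_request(kv, ip, hour_key, limit=20):
    """Fixed-window rate limit: at most `limit` requests per IP per hour.

    `kv` is a dict standing in for Cloudflare KV; in KV the key
    would carry a TTL so old windows expire automatically.
    """
    key = f"rl:{ip}:{hour_key}"
    count = kv.get(key, 0)
    if count >= limit:
        return False        # over budget for this window
    kv[key] = count + 1
    return True
```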

Limitations

  • Temporal boundary: The corpus ends in 2023. vgr_zirp explicitly qualifies responses about post-2023 events and does not attempt to simulate post-corpus positions.
  • Retrieval errors: Questions about niche topics with few matching chunks will produce answers that generalize from tangentially related material. The sources panel shows exactly what was retrieved.
  • Author vs. archive: vgr_zirp speaks from the written record, not from Venkat's current views. Ideas that were explored and discarded, or positions since revised, remain in the corpus as-written.
  • Hallucination risk: Claude Haiku can generate plausible-sounding but incorrect attributions. When specific claims matter, follow the source links.

MCP access (public)

vgr_zirp is available as a public Model Context Protocol server — no account or API key required. Connect it to Claude Code or Claude Desktop and use the corpus directly from your AI client.

Endpoint: https://ribbonfarm.com/mcp

| Tool | What it does | Limit |
|---|---|---|
| ask_vgr_zirp | Full RAG + Claude Haiku response in vgr's voice | 30 calls/IP/day |
| search_corpus | Semantic search across all corpora, returns ranked excerpts | Unlimited |

Claude Code

Run once in your terminal:

claude mcp add vgr-zirp --transport http https://ribbonfarm.com/mcp

Claude Desktop

Add to your claude_desktop_config.json:

{"mcpServers": {"vgr-zirp": {"type": "http", "url": "https://ribbonfarm.com/mcp"}}}

The ask_vgr_zirp daily limit resets at midnight UTC. search_corpus has no limit — use it freely for research or agentic workflows.

Acknowledgments

Voyage AI — voyage-3 embedding model. voyageai.com

Pinecone — serverless vector indexes. pinecone.io

Anthropic — Claude Sonnet (persona derivation) and Claude Haiku (inference). anthropic.com

Soul document methodology — adapted from soul.md by Aaron J. Mars.