{"title":"Manual RAG for a personal site","slug":"manual-rag-for-a-personal-site","url":"https://reboot.md/manual-rag-for-a-personal-site","markdownUrl":"https://reboot.md/manual-rag-for-a-personal-site.md","jsonUrl":"https://reboot.md/manual-rag-for-a-personal-site.json","date":"2026-01-15","updated":"2026-05-28","status":"imported","type":"log","readTime":"7 min","tags":["ai-workflows","rag","machine-readable","publishing","product"],"projects":["reboot.md","Kudos Ideas"],"summary":"An imported note about building a low-cost local retrieval layer for a personal website chat experience.","content":"![Personal site RAG experiment](/images/core-personal/00002-parlor-automaton.png)\n\n## The experiment\n\nThe old core-personal site included a chat interface for asking questions about the website.\n\nThe original framing called it a parlor automaton. The practical version is simpler: a visitor asks a question, the site retrieves relevant local content, and an LLM answers from that context.\n\nThe product question was not “can I add a chat widget?” The better question was:\n\nCan a personal site become queryable without creating an expensive runtime dependency?\n\nThere is a profile question inside that product question. If someone is curious enough to ask the site a question, the site should help them reach the right evidence quickly. That might be a collaborator evaluating judgment, a potential customer looking for proof, or a reader trying to understand the arc of the work.\n\n## The architecture\n\nThe first version used a manual RAG approach:\n\n- build a local search index with MiniSearch\n- retrieve relevant chunks from published site content\n- assemble those chunks into a prompt\n- ask Gemini on the free tier to produce the final answer\n- keep the knowledge base close to the site instead of depending on a separate CMS\n\nThis was intentionally pragmatic. It worked well enough to publish, and it made the cost model visible.\n\n## What broke first\n\nThe early problems were predictable but useful.\n\nSome answers were cut off when the context payload was too large or poorly shaped. Some chunks were technically relevant but awkward because they were split at bad boundaries. Index freshness was also too manual: publishing new content should automatically update the knowledge base.\n\nThe important lesson is that “chat with your site” is not one feature. It is a small retrieval product with a pipeline, budgets, quality controls, and maintenance requirements.\n\nIt is also a small trust surface. Bad retrieval creates confusion at the exact moment someone is showing intent. Good retrieval can make the next step obvious without turning the site into a funnel.\n\n## What should change\n\nThe build pipeline should own indexing.\n\nWhen the site builds, it should discover posts and pages, extract clean text, keep useful metadata, chunk by headings and paragraphs, rebuild the search index, and publish the updated knowledge base alongside the site.\n\nRetrieval also needs source references, duplicate reduction, better chunk boundaries, and a context budget that prefers fewer high-quality passages over many weak ones.\n\nThe cost dial should stay explicit:\n\n- local-only mode for retrieval without a model call\n- caching for common questions\n- optional model providers later\n- clear fallback behavior when the model is unavailable\n- surfaced source links that let interested readers keep moving through the archive\n\n## Why this connects to reboot.md\n\n`reboot.md` already ships raw Markdown, JSON post routes, feeds, `llms.txt`, and `llms-full.txt`.\n\nThat makes the site more useful to AI systems before adding a chat interface. The lesson from core-personal is that machine-readable publishing should come first. Chat can come later, if the underlying context is clean enough to deserve it.\n\nThat structure also supports the work around the work without changing the writing into a pitch. The archive can answer: what I build, who I help, what I have learned, what I am testing, and where the proof lives. Humans can browse it. Machines can retrieve it. Both should land on a coherent profile.\n\nThe durable part of the experiment is not the Victorian interface. It is the constraint:\n\nKeep public context structured, cheap to retrieve, easy to cite, and economical enough to keep alive.","html":"<figure><img src=\"/images/core-personal/00002-parlor-automaton.png\" alt=\"Personal site RAG experiment\" loading=\"lazy\" /><figcaption>Personal site RAG experiment</figcaption></figure>\n<h2 id=\"the-experiment\">The experiment</h2>\n<p>The old core-personal site included a chat interface for asking questions about the website.</p>\n<p>The original framing called it a parlor automaton. The practical version is simpler: a visitor asks a question, the site retrieves relevant local content, and an LLM answers from that context.</p>\n<p>The product question was not “can I add a chat widget?” The better question was:</p>\n<p>Can a personal site become queryable without creating an expensive runtime dependency?</p>\n<p>There is a profile question inside that product question. If someone is curious enough to ask the site a question, the site should help them reach the right evidence quickly. That might be a collaborator evaluating judgment, a potential customer looking for proof, or a reader trying to understand the arc of the work.</p>\n<h2 id=\"the-architecture\">The architecture</h2>\n<p>The first version used a manual RAG approach:</p>\n<ul><li>build a local search index with MiniSearch</li><li>retrieve relevant chunks from published site content</li><li>assemble those chunks into a prompt</li><li>ask Gemini on the free tier to produce the final answer</li><li>keep the knowledge base close to the site instead of depending on a separate CMS</li></ul>\n<p>This was intentionally pragmatic. It worked well enough to publish, and it made the cost model visible.</p>\n<h2 id=\"what-broke-first\">What broke first</h2>\n<p>The early problems were predictable but useful.</p>\n<p>Some answers were cut off when the context payload was too large or poorly shaped. Some chunks were technically relevant but awkward because they were split at bad boundaries. Index freshness was also too manual: publishing new content should automatically update the knowledge base.</p>\n<p>The important lesson is that “chat with your site” is not one feature. It is a small retrieval product with a pipeline, budgets, quality controls, and maintenance requirements.</p>\n<p>It is also a small trust surface. Bad retrieval creates confusion at the exact moment someone is showing intent. Good retrieval can make the next step obvious without turning the site into a funnel.</p>\n<h2 id=\"what-should-change\">What should change</h2>\n<p>The build pipeline should own indexing.</p>\n<p>When the site builds, it should discover posts and pages, extract clean text, keep useful metadata, chunk by headings and paragraphs, rebuild the search index, and publish the updated knowledge base alongside the site.</p>\n<p>Retrieval also needs source references, duplicate reduction, better chunk boundaries, and a context budget that prefers fewer high-quality passages over many weak ones.</p>\n<p>The cost dial should stay explicit:</p>\n<ul><li>local-only mode for retrieval without a model call</li><li>caching for common questions</li><li>optional model providers later</li><li>clear fallback behavior when the model is unavailable</li><li>surfaced source links that let interested readers keep moving through the archive</li></ul>\n<h2 id=\"why-this-connects-to-reboot-md\">Why this connects to reboot.md</h2>\n<p><code>reboot.md</code> already ships raw Markdown, JSON post routes, feeds, <code>llms.txt</code>, and <code>llms-full.txt</code>.</p>\n<p>That makes the site more useful to AI systems before adding a chat interface. The lesson from core-personal is that machine-readable publishing should come first. Chat can come later, if the underlying context is clean enough to deserve it.</p>\n<p>That structure also supports the work around the work without changing the writing into a pitch. The archive can answer: what I build, who I help, what I have learned, what I am testing, and where the proof lives. Humans can browse it. Machines can retrieve it. Both should land on a coherent profile.</p>\n<p>The durable part of the experiment is not the Victorian interface. It is the constraint:</p>\n<p>Keep public context structured, cheap to retrieve, easy to cite, and economical enough to keep alive.</p>","headings":[{"id":"the-experiment","text":"The experiment","level":2},{"id":"the-architecture","text":"The architecture","level":2},{"id":"what-broke-first","text":"What broke first","level":2},{"id":"what-should-change","text":"What should change","level":2},{"id":"why-this-connects-to-reboot-md","text":"Why this connects to reboot.md","level":2}],"source":{"origin":"core-personal Victorian-style Parlor Automaton post","tools":["Codex"],"humanEdit":"Imported from core-personal and rewritten into reboot.md operating-log voice"}}