The LLM Wiki Pattern: A Second Brain That Compounds


Every AI coding session starts the same way: the model has no memory of last time. You re-explain the project. You re-paste the relevant docs. You re-discover the same answers you found last week.

RAG helps — retrieve relevant chunks, feed them as context — but it's still rediscovery. The model reads raw sources fresh every time. Nothing compounds. Nothing connects.

I wanted a different model. Andrej Karpathy described exactly that — the LLM Wiki Pattern. So I implemented it.

> **Andrej Karpathy** (@karpathy), April 2, 2026:
>
> LLM Knowledge Bases. Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating…

The LLM Wiki Pattern

Karpathy's idea is simple: instead of retrieving from raw sources at query time, an LLM agent incrementally builds an interlinked wiki. Sources go in once. The wiki grows over time. Each session starts where the last one ended.

```text
raw/sources/          ← immutable source documents
wiki/
├── entities/         ← people, orgs, projects, concrete things
├── concepts/         ← ideas, patterns, theories
├── sources/          ← one summary page per raw source
├── synthesis/        ← cross-source analysis
├── index.md          ← full catalog
├── log.md            ← append-only activity timeline
└── hotcache.md       ← recent context, read first
```

Raw sources are never modified. The wiki is fully owned and maintained by the agent. New source comes in: the agent reads it, writes a summary page, updates relevant entity and concept pages, cross-links everything, and appends to the log.
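The downstream steps can be sketched in a few lines. This is a minimal illustration, not the actual agent code: `summarize` stands in for the LLM call, and only the summary-page and log steps are shown.

```python
from pathlib import Path

def ingest(source: Path, wiki: Path, summarize) -> Path:
    """Minimal sketch of the ingest flow: the raw source is read but
    never modified; everything written lands under the wiki tree."""
    text = source.read_text()

    # One summary page per raw source, named after the source file.
    page = wiki / "sources" / f"{source.stem}.md"
    page.parent.mkdir(parents=True, exist_ok=True)
    page.write_text(summarize(text))

    # Append to the activity log (append-only timeline, never rewritten).
    with (wiki / "log.md").open("a") as log:
        log.write(f"- ingested {source.name}\n")
    return page
```

The entity/concept updates and cross-linking are the genuinely agentic parts; what matters structurally is the invariant above: reads from `raw/sources/`, writes only under `wiki/`.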

The Hotcache

The most useful piece is hotcache.md — a ~500-word dense summary of the current wiki state. It's the first thing the agent reads every session.

```text
Who pyyupsk is. Active projects. Key concepts. Recent activity. Naming conventions.
```

Most questions are answerable from hotcache alone, without reading anything else. When they're not, the agent reads index.md, then drills into specific pages. Full context loading only happens for deep queries.

This is what makes it fast. A large wiki would be expensive to load every conversation. The hotcache brings that cost down to one file, most of the time.
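The tiered loading can be sketched as below. The file names match the layout shown earlier; the tier logic itself is my own illustration, and the "deep" tier reads all concept pages where the real agent would drill into specific ones.

```python
from pathlib import Path

def load_context(wiki: Path, depth: str = "hotcache") -> str:
    """Tiered context loading: hotcache alone most of the time,
    index next, full pages only for deep queries."""
    tiers = {
        "hotcache": ["hotcache.md"],
        "index":    ["hotcache.md", "index.md"],
        "deep":     ["hotcache.md", "index.md"],  # plus drilled-in pages
    }
    parts = [(wiki / name).read_text()
             for name in tiers[depth] if (wiki / name).exists()]
    if depth == "deep":
        # Stand-in for drilling into specific pages.
        parts += [p.read_text() for p in sorted((wiki / "concepts").glob("*.md"))]
    return "\n\n".join(parts)
```

The default tier loads exactly one file, which is the whole point: cost stays flat no matter how large the wiki grows.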

What the Agent Does

Six commands drive the system:

| Command | What it does |
| --- | --- |
| `/wiki-llms:ingest <path>` | Read source → summarize → update entity/concept/index/log/hotcache |
| `/wiki-llms:query <question>` | Answer via hotcache → index → drill in; file valuable synthesis back |
| `/wiki-llms:lint` | Health check: contradictions, stale claims, orphans, broken links, index drift |
| `/wiki-llms:status` | Counts + last 5 log entries + active threads |
| `/wiki-llms:hotcache` | Refresh hotcache from current wiki state |
| `/wiki-llms:page <slug>` | Open a wiki page by slug (fuzzy match) |

Ingest is the core loop. Drop a source file into raw/sources/ with a YYYY-MM-DD-kebab-case.md name, run /wiki-llms:ingest, and the agent handles everything downstream.
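The naming convention is strict enough to check mechanically. A sketch of a validator, assuming kebab-case means lowercase alphanumeric segments joined by single hyphens:

```python
import re

# YYYY-MM-DD-kebab-case.md, per the naming convention above.
SOURCE_NAME = re.compile(r"^\d{4}-\d{2}-\d{2}-[a-z0-9]+(-[a-z0-9]+)*\.md$")

def valid_source_name(name: str) -> bool:
    """True if the filename matches the expected source naming scheme."""
    return SOURCE_NAME.fullmatch(name) is not None
```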

All six commands are available as Claude Code slash commands in this gist.

Why It's Better Than RAG

RAG is stateless. Every query starts at the raw sources. The model builds understanding on the fly, from scratch, each time. It works, but it doesn't accumulate. You get answers, not knowledge.

A wiki accumulates:

  • Cross-references are already there. The agent built them during ingest, not during your query.
  • Contradictions are already flagged. Lint runs catch them before they confuse you.
  • Synthesis exists. Synthesis pages represent cross-source analysis that no single document contains.

When I ask about a specific architectural pattern, the agent doesn't re-read six source documents. It reads the concept page, which already synthesizes them.

The Lint Pass

The healthiest thing I did was run /wiki-llms:lint after every batch of ingests.

The last pass on 2026-04-22 caught:

  • 52 broken wikilinks (30 stub pages created, typos fixed)
  • 34 Biome source files mapped to 3 umbrella summary pages — structural drift documented
  • Index drift: 31 missing rows added

Without lint, the wiki would have rotted the same way most documentation rots: quietly, invisibly, until it's wrong enough to mislead.
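The broken-wikilink check is the most mechanical of the lint passes. A sketch, assuming `[[slug]]`-style links where the slug matches a page's filename stem:

```python
import re
from pathlib import Path

WIKILINK = re.compile(r"\[\[([^\]|#]+)")

def broken_wikilinks(wiki: Path) -> list[tuple[Path, str]]:
    """Collect [[wikilinks]] whose target slug has no corresponding
    .md file anywhere under the wiki tree."""
    slugs = {p.stem for p in wiki.rglob("*.md")}
    broken = []
    for page in wiki.rglob("*.md"):
        for target in WIKILINK.findall(page.read_text()):
            if target.strip() not in slugs:
                broken.append((page, target.strip()))
    return broken
```

Contradiction and staleness checks are judgment calls the agent makes; this structural pass is the part that never needs judgment, only thoroughness.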

What's Stored

The scope is specific: knowledge management architecture, my developer tooling ecosystem, and AI agent workflows. Not everything. Just the domains I work in constantly and need durable context for.

Entities cover tools, frameworks, people, and projects. Concepts cover patterns, ideas, and design decisions. Synthesis covers cross-source analyses I'd otherwise lose between sessions.

The Real Payoff

The difference shows up when you return to a project after two weeks away.

Without the wiki: re-read the README, re-grep the codebase, re-ask questions you've already answered.

With the wiki: read hotcache, read the entity page, start working.

Context loading used to be the hidden tax on every conversation. The wiki makes it a solved problem.

What I'd Change

The naming convention is strict — YYYY-MM-DD-kebab-case.md for every source file. When a file doesn't match, the agent asks before ingesting. That's the right call, but it adds friction when processing batches.

Synthesis pages need more attention. The most valuable insights live in synthesis, but it's the step most likely to be skipped when you're moving fast.

Bottom Line

RAG is retrieval. A wiki is memory. Retrieval finds what you've stored. Memory is what you've understood.

It's the first time my AI context has felt genuinely persistent — not just cached, but accumulated. Each session starts a little further ahead than the last.

That's the pattern. Drop sources in. Let the agent build the understanding. Ask questions. Come back tomorrow and it's still there.