
Fable 5 and Karpathy's LLM Wiki: Build Agent Memory That Compounds

Nate Herk's video lands because it makes agent memory feel concrete. Instead of another vague "second brain" promise, he shows a practical pattern: drop raw sources into a vault, let Claude Code turn them into cross-linked Markdown pages, then let Fable 5 or another agent reason over the connected structure.

The important part is not that the wiki looks cool in Obsidian. The important part is that the agent no longer starts cold. It can follow links, read an index, check a log, find related concepts, and build from a maintained knowledge graph instead of re-reading a messy folder every session.

JQ AI SYSTEMS take: an LLM wiki is one of the cleanest small-team memory patterns right now. Use cheap models for ingestion and maintenance, save frontier models like Fable 5 for high-value synthesis, and keep the whole thing inspectable as Markdown.

Video credit: Nate Herk. This article uses Nate's walkthrough as the practical example and Andrej Karpathy's LLM Wiki gist as the concept source.

Source Note

Credit for the walkthrough goes to Nate Herk, who demonstrates the workflow with YouTube transcripts, Claude Code, Obsidian, and Fable 5. The underlying idea comes from Andrej Karpathy's LLM Wiki gist, which describes a pattern for turning raw sources into a maintained, interlinked Markdown knowledge base.

I am treating the video as a worked example, not a guarantee that every ingest will organize itself perfectly. Wiki structure needs review. Agent memory can drift. Private sources need stricter rules than public YouTube transcripts or public docs.

Here are the useful links from the post, grouped by how a builder would use them.

| Resource | Link | Use it for | Builder note |
| --- | --- | --- | --- |
| Nate Herk video | Fable 5 + Karpathy's LLM Wiki is Basically Cheating | Practical walkthrough | Shows the actual vault, graph, flat vs structured wiki, and two-source ingest demo. |
| Nate Herk credit | YouTube / X / site | Creator source | Credit the workflow source and follow Nate's broader AI OS work if useful. |
| Karpathy LLM Wiki | GitHub Gist | Concept source | The pattern: raw sources, generated wiki, schema, index, log, lint, and compounding knowledge. |
| Karpathy X post | LLM knowledge bases post | Original social context | Useful background, but the gist is the clearer implementation brief. |
| Obsidian | Product / Graph view docs | Human browsing layer | Plain Markdown plus internal links makes the wiki readable to humans and agents. |
| Claude Code | Product page / quickstart | Agent that reads and edits the vault | Claude Code can read project files, edit Markdown, run commands, and work from a schema file. |
| Fable 5 | Product page / model docs | High-value synthesis layer | Use Fable when the wiki needs deep reasoning, strategy, or interface generation, not for every cheap ingest. |
| Hermes LLM Wiki skill | Hermes docs | Cross-agent portability | The pattern is not locked to Claude Code. Markdown wikis can be read by Hermes, Codex, and other file-aware agents. |

The Core Idea

Karpathy's LLM Wiki pattern is simple but powerful: your raw sources are not the final knowledge base. They are source material. The agent reads them, extracts important information, creates or updates Markdown pages, cross-links related ideas, updates an index, and appends a log entry.

That means your knowledge base is no longer a pile of transcripts, PDFs, URLs, and meeting notes. It becomes a maintained artifact that improves as new sources arrive.

In Nate's demo, YouTube transcripts become pages for videos, tools, concepts, techniques, and relationships. A mention of GitHub can link to other videos that discuss GitHub, then to Vercel, then back to Claude Code. The graph is not decoration. It shows the agent where meaning has already been compiled.

Why This Works Better Than Chat History

Chat history is linear. It remembers a conversation, but it does not naturally become a durable knowledge structure. A normal file upload or basic RAG workflow can answer questions from chunks, but it often rediscovers the same connections again and again.

An LLM wiki works because it makes the agent maintain structure:

  • Raw sources stay immutable. PDFs, transcripts, URLs, notes, and exports remain the source of truth.
  • Wiki pages become working memory. Summaries, concepts, entities, comparisons, and topics are editable Markdown.
  • The index is the map. The agent reads it first to decide where to look.
  • The log is the memory trail. Every ingest, query, lint pass, and rewrite can be tracked.
  • Backlinks make context navigable. Humans can click through the same structure the agent uses.

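The outcome is not perfect memory. It is inspectable memory, which is more useful in practice: when the agent gets something wrong, you can open the page and the log and see where the error came from.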
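
To make the "memory trail" idea concrete, here is a minimal sketch of an append-only log helper. The entry format (a dated bullet with the action in brackets) and the function name are assumptions for illustration; neither the video nor the gist is quoted here as prescribing them.

```python
from datetime import date
from pathlib import Path

# The four actions the article says the log should track.
ALLOWED_ACTIONS = {"ingest", "query", "lint", "rewrite"}


def append_log_entry(log_path: Path, action: str, summary: str, when: date) -> str:
    """Append one entry to the log and return the line that was written."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # Assumed entry format: "- YYYY-MM-DD [action] summary"
    line = f"- {when.isoformat()} [{action}] {summary.strip()}\n"
    # Mode "a" only ever writes at the end, so existing entries are never touched.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(line)
    return line
```

Because the helper only appends, the log doubles as an audit trail: a rewrite shows up as a new line rather than as an edit to an old one.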

Starter Setup

The starter version can be very small. You do not need a vector database, a custom app, or an enterprise knowledge graph to test the idea.

your-llm-wiki/
  CLAUDE.md
  index.md
  log.md
  raw/
    sources-go-here.pdf
    transcript-example.md
  wiki/
    concepts/
    entities/
    sources/
    topics/
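
If you would rather create that layout from a script than ask the agent to do it, a small sketch like the following works. The folder and file names mirror the tree above; the placeholder text written into each file and the function name are assumptions.

```python
from pathlib import Path

# Folder names mirror the starter layout; nothing here is required by any tool.
WIKI_DIRS = ["raw", "wiki/concepts", "wiki/entities", "wiki/sources", "wiki/topics"]

# Placeholder contents are an assumption; the agent is expected to fill these in.
SEED_FILES = {
    "CLAUDE.md": "# Schema\n\nRules for ingesting, linking, updating, and verifying go here.\n",
    "index.md": "# Index\n",
    "log.md": "# Log\n",
}


def scaffold_vault(root: Path) -> list[Path]:
    """Create the starter layout under root and return the paths that were created."""
    created = []
    for rel in WIKI_DIRS:
        folder = root / rel
        if not folder.exists():
            folder.mkdir(parents=True)
            created.append(folder)
    for name, text in SEED_FILES.items():
        target = root / name
        if not target.exists():  # never overwrite an existing schema, index, or log
            target.write_text(text, encoding="utf-8")
            created.append(target)
    return created
```

Calling `scaffold_vault(Path("your-llm-wiki"))` a second time creates nothing and returns an empty list, so it is safe to rerun on an existing vault.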

If you use Codex instead of Claude Code, the schema file could be AGENTS.md. If you use Hermes, you may point the agent at the same folder and give it a skill that explains the same conventions. The important part is not the filename; it is that the agent has rules for how to ingest, link, update, and verify.

A good first schema tells the agent:

  • what belongs in raw/,
  • what belongs in wiki/,
  • when to create a new page versus update an existing page,
  • how to write titles and backlinks,
  • how to update index.md,
  • how to append log.md,
  • what claims require citations,
  • what private data must be refused or redacted.
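
The last rule in that list is worth backing with a check that runs before the agent ever sees a file, rather than relying on the prompt alone. The sketch below is a rough pre-ingest screen. The patterns and names are illustrative assumptions, not a complete secret or PII detector, and they will produce false positives.

```python
import re

# Illustrative patterns only (assumptions); extend or replace them for real use.
SENSITIVE_PATTERNS = {
    "credential assignment": re.compile(
        r"(?i)\b(api[_-]?key|password|secret|token)\b\s*[:=]\s*\S+"
    ),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def screen_source(text: str) -> list[str]:
    """Return the names of the sensitive patterns found in a source, sorted."""
    return sorted(
        name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)
    )
```

An empty result means only that none of these patterns matched. A non-empty result is a reason to stop and review or redact the source by hand before it goes into raw/.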

Flat vs Structured Wikis

Nate makes a useful distinction in the video: not every wiki should be deeply structured. Some wikis work better flat.

A flat wiki is useful when the source type is consistent, the volume is moderate, and you mostly want the agent to search across many similar notes. Meeting transcripts, daily notes, customer calls, and short internal memos often start here.

A structured wiki is useful when the sources naturally create reusable categories: tools, people, companies, models, workflows, techniques, objections, case studies, or product areas. Nate's YouTube transcript wiki became structured because his videos repeatedly mention the same tools and concepts.

The mistake is forcing a taxonomy too early. Start simple. Let the first 10 to 20 sources reveal what categories are actually useful.

The Ingest Workflow

Nate demos two ingest styles: dropping a PDF into raw/ and asking Claude Code to ingest a URL. That is enough to test the system.

  1. Create the vault and open it in Obsidian.
  2. Open the same folder in Claude Code or another file-aware agent.
  3. Paste Karpathy's LLM Wiki gist as the concept brief.
  4. Ask the agent to create the schema, index, log, raw folder, and wiki folder.
  5. Add one source to raw/.
  6. Ask the agent to ingest it, update relevant pages, update the index, and append the log.
  7. Review the generated pages before adding more sources.
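
Part of the review in step 7 can be automated. Since raw sources are supposed to stay immutable, one option is to hash everything in raw/ before an ingest and again afterwards. This is a sketch under that assumption; the function names are hypothetical and the agent itself does not need to know the check exists.

```python
import hashlib
from pathlib import Path


def snapshot_raw(raw_dir: Path) -> dict[str, str]:
    """Map each file under raw_dir (relative POSIX path) to its SHA-256 digest."""
    digests = {}
    for path in sorted(raw_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(raw_dir).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report which sources were added, removed, or modified between two snapshots."""
    shared = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "modified": sorted(name for name in shared if before[name] != after[name]),
    }
```

Take one snapshot before asking the agent to ingest and one after. Anything under "removed" or "modified" means the agent touched a raw source and the ingest should be reviewed before you add more.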

Here is a starter prompt you can adapt:

You are my LLM wiki agent.

Build a Markdown knowledge base in this folder using the LLM Wiki pattern.

Create:
- CLAUDE.md with the schema and rules
- raw/ for immutable source material
- wiki/ for generated pages
- index.md as the map of the wiki
- log.md as the append-only activity record

Rules:
- Do not modify raw sources.
- Create or update wiki pages with clear backlinks.
- Update index.md after every ingest.
- Append log.md after every ingest, query, lint, or major rewrite.
- Flag uncertain claims and contradictions.
- Ask before ingesting private, sensitive, or client material.

Start by setting up the structure, then show me the first safe ingest workflow.

The Routing Layer

The most important part of the workflow is routing. A big folder of Markdown is still messy unless the agent knows where to look first.

That is why index.md, log.md, and the schema matter. They let the agent decide:

  • which wiki should answer the question,
  • which pages are likely relevant,
  • which sources support the claim,
  • which concepts connect across sources,
  • which pages need to be updated after a new ingest.

This also keeps token use in check. Instead of stuffing every transcript into context, the agent reads the routing layer first, then drills into the pages that matter.
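
In practice the agent does this routing by reading index.md itself, but a toy version shows the shape of the idea. The sketch below assumes each index entry is a bullet with an Obsidian-style link followed by a one-line summary; that format, and the simple word-overlap scoring, are assumptions for illustration only.

```python
import re

# Assumed index.md entry format (not specified in the article):
#   - [[page-name]] one-line summary
ENTRY = re.compile(r"^\s*[-*]\s*\[\[([^\]|#]+)[^\]]*\]\]\s*(.*)$")
WORD = re.compile(r"[a-z0-9]+")


def parse_index(index_text: str) -> dict[str, str]:
    """Return {page name: summary} for every entry line in index.md."""
    pages = {}
    for line in index_text.splitlines():
        match = ENTRY.match(line)
        if match:
            pages[match.group(1).strip()] = match.group(2).strip()
    return pages


def route(question: str, pages: dict[str, str], limit: int = 3) -> list[str]:
    """Rank pages by how many question words appear in their name or summary."""
    words = set(WORD.findall(question.lower()))
    scored = []
    for name, summary in pages.items():
        haystack = set(WORD.findall(f"{name} {summary}".lower()))
        overlap = len(words & haystack)
        if overlap:
            scored.append((-overlap, name))  # negative so higher overlap sorts first
    return [name for _, name in sorted(scored)[:limit]]
```

There is no stop-word handling, so common words like "the" count toward the score. The point is only that a short index is enough to narrow the search before any full page is loaded into context.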

Where Fable 5 Fits

Nate uses Fable 5 in the video, but he also says you probably do not need Fable for basic ingest. I agree.

Use cheaper models for:

  • turning one clean transcript into a source page,
  • updating index.md,
  • renaming pages,
  • adding backlinks,
  • running a lint pass for orphan pages or missing citations.

Use Fable 5 for:

  • high-level synthesis across many sources,
  • finding subtle contradictions,
  • turning the wiki into a beginner-friendly interface,
  • building a business narrative from months of notes,
  • planning a product, content system, or internal operating model from the wiki.

That routing matters because Fable 5 is powerful and expensive. The goal is not "Fable touches every file." The goal is "the right model touches the right layer."
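
One way to keep that split honest is to write it down as a lookup table instead of deciding per request. This is a minimal sketch: the task labels are hypothetical names for the bullets in the two lists above, and the tiers are deliberately abstract rather than tied to specific model identifiers.

```python
from enum import Enum


class Tier(Enum):
    CHEAP = "cheap"
    FRONTIER = "frontier"


# Hypothetical task labels, one per bullet in the two lists above.
TASK_TIERS = {
    "source_page": Tier.CHEAP,
    "update_index": Tier.CHEAP,
    "rename_page": Tier.CHEAP,
    "add_backlinks": Tier.CHEAP,
    "lint": Tier.CHEAP,
    "cross_source_synthesis": Tier.FRONTIER,
    "find_contradictions": Tier.FRONTIER,
    "beginner_interface": Tier.FRONTIER,
    "business_narrative": Tier.FRONTIER,
    "planning": Tier.FRONTIER,
}


def pick_tier(task: str) -> Tier:
    """Return the model tier for a task, refusing to guess for unknown tasks."""
    try:
        return TASK_TIERS[task]
    except KeyError:
        raise ValueError(f"no tier configured for task: {task!r}") from None
```

Raising on unknown tasks is a design choice: a new kind of work forces an explicit decision about cost instead of silently defaulting to the expensive model.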

Builder Checklist

If you want to test this pattern this week, keep the first version boring.

  • Pick one domain: YouTube transcripts, client calls, product research, personal notes, or company SOPs.
  • Start with 5 to 10 safe public or low-risk sources.
  • Create raw/, wiki/, index.md, log.md, and a schema file.
  • Run one source ingest at a time until the structure is trustworthy.
  • Review the generated pages manually.
  • Add a weekly lint pass: orphan pages, stale claims, contradictions, missing citations, duplicate concepts.
  • Keep private data out until you have redaction, access control, and model-retention rules sorted.
  • Version the wiki with Git if it becomes operationally important.
  • Use the wiki from more than one agent only after the schema is stable.

The win is compounding context. The risk is compounding mess. Start small enough that you can still inspect the whole thing.
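
Two items from the weekly lint pass, orphan pages and links to pages that do not exist, are mechanical enough to check without a model. The sketch below assumes Obsidian-style `[[wikilinks]]`, matches links to pages by file name without the extension, and treats names as case-sensitive. Those are simplifying assumptions, and the function name is hypothetical.

```python
import re
from pathlib import Path

# Captures the target of [[target]], [[target|alias]], or [[target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def lint_wiki(wiki_dir: Path) -> dict[str, list[str]]:
    """Find orphan pages (no inbound links) and links that point at missing pages."""
    pages = {path.stem: path for path in wiki_dir.rglob("*.md")}
    inbound = {name: 0 for name in pages}
    broken = set()
    for name, path in pages.items():
        for target in WIKILINK.findall(path.read_text(encoding="utf-8")):
            target = target.strip().split("/")[-1]  # tolerate path-style links
            if target == name:
                continue  # a page linking to itself does not count as inbound
            if target in inbound:
                inbound[target] += 1
            else:
                broken.add(f"{name} -> {target}")
    return {
        "orphans": sorted(name for name, count in inbound.items() if count == 0),
        "broken_links": sorted(broken),
    }
```

Only links between pages inside wiki/ are counted, so a page that is referenced solely from index.md will still be reported as an orphan. Two pages with the same file name in different folders will also collide. Stale claims, contradictions, and duplicate concepts still need a model or a human.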


Common questions

What is an LLM wiki?
An LLM wiki is a persistent collection of Markdown pages maintained by an AI agent. Raw sources go in, and the agent turns them into summaries, concepts, entity pages, source notes, links, indexes, and logs that compound over time.

How is an LLM wiki different from normal RAG?
RAG usually retrieves chunks from raw documents at query time. Karpathy's LLM wiki pattern compiles knowledge into a maintained wiki first, so cross-links, contradictions, summaries, and indexes already exist when the agent needs them.

Do I need Fable 5 to build an LLM wiki?
No. Nate Herk uses Fable 5 in the video, but the ingest and maintenance workflow can run on cheaper models. Fable is better reserved for harder synthesis, interface design, strategy, or high-value reasoning over the wiki.

Why use Obsidian?
Obsidian works well because the vault is plain Markdown, supports internal links, and has a graph view for visualizing relationships between notes. The data is still portable even if you later use another editor or agent.

What should I avoid putting in an LLM wiki?
Do not casually ingest private client data, passwords, API keys, sensitive meeting transcripts, health data, financial records, or confidential documents into a cloud model. Create privacy rules, redaction steps, and review gates first.