Build a Self-Improving Claude Code System Without Losing Control

Most "self-improving AI" advice skips the part that makes the system useful: structure. A model cannot improve a chaotic workspace. It can only compound from data it can find, procedures it can repeat, and feedback it is allowed to write back.

Austin Marchese's B.U.I.L.D. framework is useful because it treats Claude Code less like a chatbot and more like a small operating system for your work: a knowledge base, ingestion skills, recurring data inflow, improvement loops, and human review gates.

JQ AI SYSTEMS take: a self-improving Claude Code system should not auto-change everything. It should auto-organize low-risk context, propose higher-risk improvements, and make the human review path easy enough that you actually use it.

Video credit: Austin Marchese. Austin's broader AI training material is available at The AI Playbook.

Source Note

This post uses Austin Marchese's video and transcript as the walkthrough source, then checks the Claude Code mechanics against official Claude Code documentation on memory, skills, routines, and settings. The post also references Andrej Karpathy's LLM Wiki pattern, Google Takeout, Outlook export, Granola MCP, and Wispr Flow as relevant data-capture examples.

Treat any fully automated "self-improving" workflow with care. Claude Code memory and skills are context and procedure, not guaranteed enforcement. Anthropic's docs are explicit that memory is loaded as context. For hard boundaries, use permissions, hooks, tests, and human review.

| Resource | Link | Status | Builder takeaway |
|---|---|---|---|
| Austin Marchese video | Watch on YouTube | Source commentary | The B.U.I.L.D. framework: Base, Upload, Inflow, Loop, Drive. |
| Austin Marchese | YouTube, The AI Playbook, The Incubator | Creator credit | Useful operator framing for turning AI usage into systems. |
| Claude Code memory | Official docs | Official source | Use CLAUDE.md for persistent instructions and auto memory for learned notes. |
| Claude Code skills | Official docs, Anthropic engineering | Official source | Turn repeated procedures into skills instead of bloating CLAUDE.md. |
| Claude Code routines | Official docs | Official source | Schedule repeatable work, but scope repositories, connectors, and network access tightly. |
| Claude Code settings | Official docs | Official source | Use project, local, and user scopes deliberately. |
| Karpathy LLM Wiki | Gist | Source pattern | Raw sources become a structured markdown wiki that compounds. |
| Data exports | Google Takeout, Outlook export | Official tools | Export only what you are comfortable processing. |
| Meeting and voice inputs | Granola MCP, Wispr Flow | Tooling examples | Useful for recurring transcripts and weekly spoken data dumps. |

The Main Takeaway

A self-improving system is not one that edits itself forever. That is how drift starts. A useful self-improving system has three layers:

  • Knowledge layer: raw sources, processed notes, and wiki pages.
  • Procedure layer: skills that tell Claude exactly how to ingest, sync, summarize, and improve.
  • Review layer: a queue where high-stakes suggestions wait for human approval.

The system improves because every week produces better context and better procedures. The human still decides what matters.

The B.U.I.L.D. Framework

Austin's framework has five steps:

| Step | Meaning | What it creates |
|---|---|---|
| Base | Create the framework | Folders, CLAUDE.md, and first skills. |
| Upload | Bulk ingest your data | Initial source lake from sessions, files, email exports, and life/project context. |
| Inflow | Set up data pipelines | Recurring sync skills for new sessions, calls, newsletters, and data dumps. |
| Loop | Improve the system | Auto-approved cleanup, review-required suggestions, and more-context-needed files. |
| Drive | Run it | A habit of using, trimming, and improving the system instead of over-engineering it. |

Base: Create The Framework

Start with a workspace Claude can understand. Austin recommends two parts: a knowledge base and skills.

The knowledge base should separate raw material from synthesized knowledge. That mirrors Karpathy's LLM Wiki pattern: raw files go into one layer, then an agent compiles them into structured, linked markdown that is easier to query later.

/raw
  /sessions
  /calls
  /email
  /curated
  /voice-dumps
/processed
/wiki
/skills
/output/review
/changelog.md
CLAUDE.md
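
If you would rather script the layout than create it by hand, here is a minimal sketch in Python (3.9+). The folder and file names come from the tree above; the function name scaffold_workspace is my own and not part of Austin's walkthrough or of Claude Code.

```python
from pathlib import Path

# Folder and file names copied from the layout above; rename to taste.
FOLDERS = [
    "raw/sessions",
    "raw/calls",
    "raw/email",
    "raw/curated",
    "raw/voice-dumps",
    "processed",
    "wiki",
    "skills",
    "output/review",
]
FILES = ["changelog.md", "CLAUDE.md"]


def scaffold_workspace(root: Path) -> list[Path]:
    """Create any missing folders and empty files; return what was created."""
    created = []
    for folder in FOLDERS:
        path = root / folder
        if not path.exists():
            path.mkdir(parents=True)
            created.append(path)
    for name in FILES:
        path = root / name
        if not path.exists():
            path.touch()  # empty placeholder; you still write the content
            created.append(path)
    return created
```

Calling scaffold_workspace(Path("my-workspace")) twice is safe: the second call finds everything in place and returns an empty list, so existing notes are never overwritten.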

Keep CLAUDE.md short. Anthropic's docs say CLAUDE.md is loaded into context, so shorter and more specific instructions tend to work better. If an instruction becomes a procedure, move it into a skill.

The first skill Austin recommends is add-new-resource. It should take a raw file, store it in the right folder, summarize it, create or update wiki entries, and log what changed.
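
The mechanical half of that skill (file the source, log the change) is easy to hand to a helper script so Claude only has to do the judgment work. The sketch below is one possible shape, not Austin's implementation: file_resource and its parameters are hypothetical names, and summarizing and wiki updates are deliberately left out.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional

# Subfolders of /raw from the workspace layout.
RAW_KINDS = {"sessions", "calls", "email", "curated", "voice-dumps"}


def file_resource(root: Path, source: Path, kind: str,
                  today: Optional[date] = None) -> Path:
    """Copy a source file into raw/<kind>/ and append a changelog line.

    Summaries and wiki edits are left to the skill itself.
    """
    if kind not in RAW_KINDS:
        raise ValueError(f"unknown raw folder: {kind!r}")
    dest_dir = root / "raw" / kind
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / source.name
    if dest.exists():
        # Refuse to overwrite so a raw source is never silently replaced.
        raise FileExistsError(f"{dest} already exists")
    shutil.copy2(source, dest)

    stamp = (today or date.today()).isoformat()
    with (root / "changelog.md").open("a", encoding="utf-8") as log:
        log.write(f"- {stamp}: added raw/{kind}/{source.name}\n")
    return dest
```

Refusing to overwrite is a design choice: raw files are the audit trail, so a name collision should surface as an error rather than quietly change history.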

Upload: Bulk Ingest Your Existing Data

Once the base exists, fill it with already-existing context. Austin points to three useful buckets:

  • AI inputs: previous Claude Code sessions, prompt history, corrections, and repeated questions.
  • Personal ecosystem data: selected files, emails, call transcripts, Slack exports, documents, and project notes.
  • Life story and project goals: a spoken or written context file explaining what you are building, how you work, and what you care about.

This is also where you need restraint. Do not ask an agent to scan your entire computer by default. Pick approved folders. Redact private data. Use Google Takeout or Outlook export only when you understand what is inside the archive and why it belongs in the system.

Privacy rule: if you would not hand the source file to a contractor, do not hand it to an automated ingestion loop without review, redaction, and a clear reason.
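
One way to make "approved folders" and "redact" concrete is a small pre-ingestion gate. This is a sketch under loud assumptions: the helper names are hypothetical, the email pattern is deliberately simple and will miss cases, and none of this replaces reading the source yourself.

```python
import re
from pathlib import Path
from typing import Iterable

# Simple on purpose; treat it as a first pass, not a privacy guarantee.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def is_approved(path: Path, approved_dirs: list[Path]) -> bool:
    """True only if the resolved path sits inside an approved folder."""
    resolved = path.resolve()  # collapses ".." so traversal tricks fail
    return any(resolved.is_relative_to(d.resolve()) for d in approved_dirs)


def redact(text: str, private_terms: Iterable[str] = ()) -> str:
    """Mask email addresses and any terms you list (client names, etc.)."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    for term in private_terms:
        if term:  # skip empty strings, which would match everywhere
            text = re.sub(re.escape(term), "[REDACTED]", text,
                          flags=re.IGNORECASE)
    return text
```

For example, redact("Mail jane.doe@example.com about Acme", ["acme"]) returns "Mail [EMAIL] about [REDACTED]". The allowlist check is the more important half: anything outside the approved folders never reaches the ingestion step at all.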

Inflow: Create The Data Pipelines

Bulk ingest is a one-time lake fill. Inflow is the recurring water supply.

Austin suggests building dedicated sync skills, then using them as building blocks:

  • sync-claude-sessions: bring new Claude session learnings into /processed and the wiki.
  • sync-ecosystem-data: pull approved recurring sources such as meeting notes, Slack, YouTube transcripts, or client-call summaries.
  • sync-curated-content: ingest selected newsletters, articles, videos, or research sources.
  • add-new-resource: the utility skill everything else can call.
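
Every sync skill needs some way to know what is new since the last run. A minimal sketch, assuming a JSON state file kept outside /raw (the file name and all three function names are my own, not something Claude Code provides):

```python
import json
from pathlib import Path


def load_seen(state_file: Path) -> set[str]:
    """Read already-processed relative paths (empty set on first run)."""
    if not state_file.exists():
        return set()
    return set(json.loads(state_file.read_text(encoding="utf-8")))


def pending_files(raw_dir: Path, state_file: Path) -> list[Path]:
    """Return files under raw_dir that no earlier sync has recorded."""
    seen = load_seen(state_file)
    return sorted(
        p for p in raw_dir.rglob("*")
        if p.is_file() and p.relative_to(raw_dir).as_posix() not in seen
    )


def mark_processed(raw_dir: Path, state_file: Path,
                   files: list[Path]) -> None:
    """Record files as processed so the next sync skips them."""
    seen = load_seen(state_file)
    seen.update(p.relative_to(raw_dir).as_posix() for p in files)
    state_file.write_text(json.dumps(sorted(seen), indent=2),
                          encoding="utf-8")
```

A sync skill would call pending_files, hand each result to add-new-resource, and call mark_processed only for the files that succeeded, so a failed run retries the rest next time.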

Granola MCP is a good example of a meeting-note inflow source because it lets AI tools query meeting notes and transcripts. Wispr Flow is useful for spoken daily or weekly dumps because it turns messy speech into clean text.

The practical mistake is trying to ingest everything. High-signal inflow beats high-volume inflow. A narrow newsletter inbox, a weekly client-call summary, and your own Claude corrections may be more useful than a giant pile of undifferentiated content.

Loop: Improve Without System Drift

This is the part Austin is most careful about. Self-improving does not mean "approve everything." That creates drift: the system starts optimizing itself according to weak signals and accidental patterns.

The better pattern is a bucketed improvement queue:

| Bucket | Examples | Action |
|---|---|---|
| Auto approve | Missing links, duplicate notes, obvious wiki cleanup, changelog entries. | Apply automatically and log it. |
| Needs signoff | New skill, skill edit, changed workflow, external action, client-facing output rule. | Write to /output/review/YYYY-MM-DD.md with approve/reject checkboxes. |
| More context required | Contradictory notes, unclear preference, missing source, ambiguous business rule. | Ask the human a focused question. |
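
The buckets above can be sketched as a small router. The change-type labels (such as "skill-edit") and the review-file layout are assumptions I made for illustration; only the three buckets and the /output/review/YYYY-MM-DD.md path come from the source.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import Path

# Hypothetical labels mapped onto the three buckets.
AUTO_APPROVE = {"missing-link", "duplicate-note", "wiki-cleanup",
                "changelog-entry"}
NEEDS_SIGNOFF = {"new-skill", "skill-edit", "workflow-change",
                 "external-action", "client-output-rule"}


@dataclass
class Suggestion:
    kind: str
    summary: str


def bucket_for(s: Suggestion) -> str:
    if s.kind in AUTO_APPROVE:
        return "auto-approve"
    if s.kind in NEEDS_SIGNOFF:
        return "needs-signoff"
    # Unrecognized kinds go to the human instead of being applied.
    return "more-context"


def write_review_file(root: Path, suggestions: list[Suggestion],
                      today: date) -> Path:
    """Append needs-signoff items to output/review/YYYY-MM-DD.md."""
    review_dir = root / "output" / "review"
    review_dir.mkdir(parents=True, exist_ok=True)
    path = review_dir / f"{today.isoformat()}.md"
    with path.open("a", encoding="utf-8") as fh:
        for s in suggestions:
            if bucket_for(s) == "needs-signoff":
                fh.write(f"## {s.kind}: {s.summary}\n"
                         "- [ ] approve\n- [ ] reject\n\n")
    return path
```

The important design choice is the default: anything the router does not recognize is never auto-applied. If in doubt, keep the auto-approve set short and let it grow only after you have watched the logs for a few weeks.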

Claude Code routines can run recurring work on schedules, API triggers, or GitHub events. Anthropic describes routines as a research preview, so build them like scheduled workers: clear scope, narrow permissions, visible logs, and a human review path.

Austin's split is smart: run data ingestion first, then run improvement analysis later, then review the suggestions. Keeping those as separate routines makes failures easier to diagnose.

Drive: Run It, Do Not Over-Engineer It

The final step is not technical. It is operational.

Use the system every week. Delete skills that are not helping. Improve a skill immediately after Claude disappoints you. Do not debate whether a folder should be called /raw/input or /raw/sessions for two hours. The system compounds from use, not from perfect architecture.

The rule I would give a founder or consultant is simple: if the system did not help you make a better decision, produce a better output, or avoid repeating yourself this week, simplify it.

A Safe Starter System

If you want the smallest useful version, build this:

  1. Create /raw, /processed, /wiki, /skills, and /output/review.
  2. Add a short CLAUDE.md explaining the folder structure, review gates, and privacy rules.
  3. Create one add-new-resource skill.
  4. Ingest three safe sources: one transcript, one project note, and one weekly voice dump.
  5. Ask Claude to create or update wiki pages from those sources.
  6. Ask Claude to suggest one improvement to the skill, but write it to the review queue instead of applying it.
  7. Review the suggestion manually and approve only if it clearly improves future work.

Once that works manually, schedule it. Not before.
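
Step 7 only works if your decision is easy to read back. A minimal sketch, assuming a review file where each suggestion is a "## title" heading followed by "- [ ] approve" and "- [ ] reject" checkboxes (that layout and the name read_decisions are my assumptions, not a Claude Code format):

```python
import re
from pathlib import Path

HEADING_RE = re.compile(r"^##\s+(.*)$")
CHECKED_RE = re.compile(r"^- \[[xX]\]\s+(approve|reject)\s*$")


def read_decisions(review_file: Path) -> dict[str, str]:
    """Map each suggestion title to 'approve', 'reject', or 'undecided'."""
    ticks: dict[str, set[str]] = {}
    current = None
    for line in review_file.read_text(encoding="utf-8").splitlines():
        heading = HEADING_RE.match(line)
        if heading:
            current = heading.group(1).strip()
            ticks.setdefault(current, set())
            continue
        checked = CHECKED_RE.match(line)
        if checked and current is not None:
            ticks[current].add(checked.group(1))
    # Exactly one ticked box counts; none or both means undecided.
    return {
        title: next(iter(t)) if len(t) == 1 else "undecided"
        for title, t in ticks.items()
    }
```

Treating "both boxes ticked" as undecided is intentional: an ambiguous review should block the change, not pick a side.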

If Claude has to relearn your goals, style, clients, and project rules every session, do not build another agent yet. Build the memory system first, then add automation.

Common questions

What is a self-improving Claude Code system?
It is a structured workspace where Claude Code can ingest useful source material, update a persistent knowledge base, suggest skill improvements, and route risky changes through a human review queue. It is not magic autonomy.

Does Claude Code really have memory?
Claude Code has CLAUDE.md files for persistent instructions and auto memory for notes it saves from corrections and preferences. Anthropic says these are loaded as context, not enforced rules, so high-risk behavior still needs permissions, hooks, tests, and review.

Should I let Claude ingest my whole computer and email?
No. Start with scoped folders and approved exports. Redact sensitive data, skip private material you do not want processed, and keep raw source files separate from summarized wiki files.

What should be auto-approved?
Only low-risk housekeeping: missing links, duplicate notes, obvious wiki cleanup, and changelog updates. New skills, changed workflows, external actions, client-facing copy, or production changes should require human signoff.

What is the first thing to build?
Build one add-new-resource skill, one raw folder, one wiki folder, and one review file. Run it manually before scheduling anything.