
Compound Context: Why AI Projects Get Better the Longer You Stay With Them

Six months ago, a coding task in my resumegeni project took an entire session of explanation. The agent needed to understand the database schema, the routing conventions, the template inheritance, the cache layer, the deployment pipeline, and the testing patterns before it could touch a single line of code. Every session started from scratch.

Last week, I said “fix the market page performance” and the agent read a handoff document from a previous session, identified the bottleneck in market_hub(), implemented a paginated database query with an aggregate RPC, wrote tests, and deployed. The Austin market page went from 14 seconds to 108 milliseconds. The agent did not become smarter. The project became richer.

The difference is not the model. The difference is the accumulated context surrounding the project: the CLAUDE.md that describes conventions, the memory files that capture decisions, the handoff documents that preserve diagnosis across sessions, the hooks that enforce constraints, the skills that encode workflows, the test suites that verify correctness, the captain’s logs that record what shipped and why. Each artifact was created to solve a specific problem. Together, they make every subsequent problem cheaper to solve.

This is context compounding.

TL;DR

  • Context compounding is the phenomenon where AI-assisted projects improve faster the longer you work on them, because solved problems deposit reusable context that reduces the cost of solving the next problem.
  • The model does not improve between sessions. The project infrastructure does: CLAUDE.md files, memory systems, hooks, skills, handoff documents, test coverage, naming conventions, and operational logs.
  • Context compounding explains why starting a new project with an AI agent feels slow, but the 500th session on the same project feels fast. The first session builds context. The 500th session spends it.
  • The effect is not automatic. It requires intentional investment in context artifacts: documents that capture decisions, hooks that encode constraints, tests that verify assumptions, and logs that preserve operational history.
  • Organizations that understand context compounding will stop rotating engineers across projects every quarter and start treating accumulated project context as a capital asset.

What Compounds

Context compounding operates through six categories of accumulated project knowledge. Each category generates a different kind of return.

Convention documents (CLAUDE.md). A CLAUDE.md file tells every agent session how the project works: file structure, naming conventions, import patterns, testing approach, deployment process. The first session without a CLAUDE.md spends much of its effort discovering conventions. The hundredth session with a mature CLAUDE.md spends zero. The document compounds because every convention captured once is never re-explained.
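
As an illustration, a minimal CLAUDE.md might capture conventions like these. The fragment below is hypothetical, not the author's actual file; the file names and commands are placeholders:

```markdown
# Project conventions

## Structure
- Routes live in `app/routes/`, one module per feature.
- Templates inherit from `templates/base.html`.

## Testing
- Run `pytest -q` before every commit; new endpoints need a route test.

## Deployment
- Deploy via `make deploy`; never push directly to production.
```

Each line here replaces a question the agent would otherwise have to ask, or worse, answer wrong.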

Decision memory. Memory files capture why decisions were made, not just what was decided. When a future session encounters the same trade-off, it reads the memory instead of re-deriving the answer. My memory system stores project decisions, user preferences, feedback corrections, and reference pointers. Each memory is small. The collection is a decision cache that prevents the project from relitigating settled questions.

Handoff documents. A handoff document preserves a diagnosis across session boundaries. The market page performance handoff survived three code review corrections, two priority reorderings, and ultimately guided the implementation four days later. Without the handoff, the next session would have started the investigation from scratch, likely targeting the wrong code path (as the first draft did). The handoff compounded by converting diagnosis time into a reusable artifact.

Hooks and constraints. Every hook encodes a lesson from a past failure. My destructive API guard exists because an agent purged the entire Cloudflare cache. My sandbox hook exists because an agent attempted to write to ~/.ssh/. My drift detector exists because agents lost track of their task twelve times in sixty days. Each hook prevents the same failure class from recurring across all future sessions. Hooks compound because they convert incident response into permanent prevention.
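
To make the pattern concrete, here is a minimal sketch of a destructive-API guard in the style of a Claude Code PreToolUse hook. This is not the author's actual guard; the patterns are illustrative, and it assumes the documented hook interface (tool-call event as JSON on stdin, exit code 2 to deny the call):

```python
"""Minimal destructive-command guard (hypothetical sketch).

Registered as a PreToolUse hook, it receives the tool-call event as
JSON on stdin. Returning exit code 2 denies the call and feeds the
stderr message back to the agent.
"""
import json
import re
import sys

# Patterns that should never run unattended (illustrative, not exhaustive).
DESTRUCTIVE = [
    r"purge_everything",   # Cloudflare full-cache purge endpoint
    r"\brm\s+-rf\s+/",     # recursive delete from the filesystem root
    r"DROP\s+TABLE",       # destructive SQL
]

def is_destructive(command: str) -> bool:
    """Return True if the command matches any blocked pattern."""
    return any(re.search(p, command, re.IGNORECASE) for p in DESTRUCTIVE)

def main() -> int:
    # Invoked by the hook runner with the event JSON on stdin (not run here).
    event = json.load(sys.stdin)
    command = event.get("tool_input", {}).get("command", "")
    if is_destructive(command):
        print(f"Blocked destructive command: {command!r}", file=sys.stderr)
        return 2  # deny the tool call
    return 0
```

The guard costs one incident to write and then screens every future session for free, which is the compounding mechanism in miniature.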

Skills and workflows. A skill is a codified workflow that an agent can execute without re-inventing the process. My /nightcheck skill runs 50+ page checks with TTFB benchmarks, cache verification, and comprehensive sitemap crawls. My /scan-intel skill searches six academic sources across eight research topics with deduplication and scoring. My /blog-translator skill translates posts to nine locales with format preservation. Each skill was expensive to build once and is free to run forever. Skills compound because they convert process knowledge into executable automation.

Test suites. Tests verify that the project still works after changes. A mature test suite lets an agent make aggressive changes with confidence, because failures are caught immediately. A project with no tests forces conservative, incremental changes because the agent cannot verify its work. Test coverage compounds because each test makes future changes cheaper and safer.

The Compounding Curve

Context compounding follows a characteristic curve.

Sessions 1-10: Investment phase. Most effort goes into building context rather than delivering features. You write the CLAUDE.md, establish conventions, create the first hooks, set up the testing framework. Output feels slow because you are building infrastructure, not product.

Sessions 10-50: Acceleration phase. Context begins returning value. The agent stops asking about conventions and starts following them. Hooks catch mistakes before they deploy. Skills automate repetitive workflows. Each session produces more output than the last because the context base is growing.

Sessions 50-200: Compounding phase. The project has enough accumulated context that hard problems become easy. An agent reading a mature CLAUDE.md, a set of memory files, and a handoff document can execute complex multi-step implementations without additional guidance. The market page fix happened in this phase. One sentence (“fix the market page performance”) triggered a four-day process that ended with a 132x improvement because the context infrastructure carried the diagnosis, the constraints, and the verification criteria.

Sessions 200+: Maintenance phase. The rate of new context creation slows because most conventions, constraints, and workflows are already captured. The focus shifts to updating existing context (correcting outdated memories, extending skills, adding tests for new edge cases) rather than creating it from scratch. The compounding effect plateaus but remains high.

Why This Is Not Obvious

Three factors obscure the compounding effect.

Model improvements mask context improvements. When your AI sessions improve over time, you attribute the improvement to better models. Claude Opus 4.6 is better than Claude 3.5 Sonnet. But the improvement you experience on a long-running project exceeds the model improvement because context compounding stacks on top of model improvement. Switching to a new project on the same model reveals the difference: the new project feels slow because it has no compound context.

Context is invisible. A CLAUDE.md file is a text document. Memory files are markdown notes. Hooks are shell scripts. None of these artifacts look impressive individually. The compounding effect is not visible in any single artifact. It is visible only in the aggregate behavior of sessions that operate against the full context stack. You cannot point to a single file and say “this is why the project is fast.” You can only compare the 500th session to the 1st and notice the difference.

Starting new projects feels exciting. A new project has fresh energy and no accumulated debt. But it also has no accumulated context. The first session on a new project feels productive because it makes high-level decisions that feel impactful. The 20th session on an existing project feels routine because it executes within established conventions. The routine feeling is the compounding effect working. The exciting feeling is the absence of it.

What Prevents Compounding

Four failure modes break the compounding curve.

Context rot. Outdated memories, stale CLAUDE.md sections, and deprecated hooks create confusion rather than clarity. An agent following outdated conventions produces worse output than an agent with no conventions. Context requires maintenance. My memory system includes last-updated timestamps and explicit staleness checks. Dead context is worse than no context.

Context sprawl. Too many files, too many hooks, too many skills create a discovery problem. If the agent cannot find the relevant context, the context does not compound. Organization matters: my memory files use frontmatter with descriptions so future sessions can assess relevance without reading the full content. My hooks are registered in a dispatcher that loads them by event type. Discoverable context compounds. Buried context rots.

Session isolation. If sessions do not read or write persistent context, each session starts from zero. The compounding effect requires intentional bridges: handoff documents that carry diagnosis across sessions, memory writes that capture decisions, captain’s logs that record operational history. Without these bridges, a project with 500 sessions has the same effective context as a project with one.

Platform churn. Switching between AI tools resets the context stack. A CLAUDE.md written for one platform does not automatically help another. Hooks written for one platform’s event model do not fire in another. Context compounding is platform-specific, which creates lock-in that is also a moat. The deeper your context stack on a platform, the higher the switching cost, and the faster your project improves relative to competitors who keep switching.

Context Compounding as Capital

In finance, compound interest turns small deposits into large sums given enough time. The key insight is that the returns themselves generate further returns. Context compounding works the same way.

A convention captured in CLAUDE.md reduces re-explanation in every future session. That time saved is spent on solving new problems, which generates new conventions, which further reduces future re-explanation. A hook that prevents a failure class eliminates re-investigation of that failure in every future session. That time saved is spent on building new hooks for new failure classes. Each investment generates returns that enable further investment.

The implication for organizations: project context is a capital asset. Rotating engineers across projects every quarter destroys accumulated context the same way closing a savings account destroys accumulated interest. A team that stays on the same project for two years with AI assistance will outperform a team that rotates quarterly, not because the individuals are better, but because the context has compounded.

The implication for individual engineers: your AI infrastructure is an investment portfolio. Every CLAUDE.md section, every memory file, every hook, every skill, every handoff document is a deposit. The portfolio grows slowly at first. After hundreds of sessions, it generates returns that make hard problems look easy to observers who do not see the context stack underneath.

The market page went from 14 seconds to 108 milliseconds. An observer sees a performance fix. I see a handoff document that survived three revisions, a nightcheck system that measured the regression, a destructive guard that prevented a repeat of the cache purge, a code review skill that caught the wrong initial target, and five hundred sessions of accumulated context that made the whole thing possible.

That is compound context.


FAQ

What is context compounding?

Context compounding is the phenomenon where AI-assisted projects improve faster over time because solved problems deposit reusable context (documents, hooks, skills, tests, memories) that reduces the cost of solving subsequent problems. The term is analogous to compound interest: the returns themselves generate further returns.

Does this work with any AI tool?

The principle applies broadly, but the implementation depends on the tool’s support for persistent context. Claude Code supports CLAUDE.md files, hooks, skills, and memory systems natively. Other tools may require external scaffolding to achieve the same effect. The compounding curve is steeper on platforms that provide more context persistence mechanisms.

How do I start building compound context?

Start with a CLAUDE.md that describes your project conventions. Add memory files for key decisions. Write hooks for failure patterns you have experienced. Create skills for workflows you repeat across sessions. The investment feels slow initially. The returns appear after 10-20 sessions.

Is this just documentation?

No. Documentation is a component, but context compounding also includes executable artifacts: hooks that enforce constraints at runtime, skills that automate workflows, test suites that verify correctness, and memory systems that inform decision-making. Static documentation explains. Compound context acts.

What about context window limits?

Context compounding does not require loading all context into every session. It requires the right context being available when needed. A CLAUDE.md is loaded automatically. Memory files are queried by relevance. Handoff documents are read when continuing a specific task. The context stack is larger than any single context window. The agent accesses the relevant slice per session.

How do I know if my project has compound context?

Compare the effort required for similar tasks early versus late in the project’s history. If a task that took a full session in month one takes a single prompt in month six, compound context is working. If the effort is the same, context is not accumulating or not being persisted between sessions.


Sources

This article draws on production experience from 500+ autonomous coding sessions across six projects since May 2025. Specific examples referenced:

  • Market page performance: handoff document, nightcheck verification, and deployment described in captain’s logs from March 21-25, 2026
  • Destructive API guards: built after an agent purged the entire Cloudflare cache, described in the deploy-and-defend post
  • Hook and skill infrastructure: 84 hooks intercepting 15 event types, described in the NIST comment
  • Drift detection: cosine similarity tracking across 60+ sessions, described in The Invisible Agent
  • Autoresearch loops: fixed-budget experiments on Apple Silicon, validated by the Claudini paper
  • Anthropic documentation on Claude Code memory and project instructions: Manage Claude’s memory
  • Andrej Karpathy’s autoresearch repository: autoresearch
