What the Claude Code Source Leak Reveals
In March 2026, a Bun build bug shipped source maps in the Claude Code npm package. The .map files contained the full readable TypeScript source — every module, every comment, every internal codename.1 Anthropic pulled the package quickly, but the community had already extracted and analyzed the internals.
This is not a “look what leaked” post. I maintain the most comprehensive Claude Code guide on the internet and run 84 hooks, 43 skills, and 19 agents on top of it daily.2 The source leak answered questions I had been reverse-engineering through behavior observation for months. What follows is a practitioner’s analysis: what the source reveals about how Claude Code actually works, and what it means for people who build on top of it.
TL;DR: The source confirms that auto mode runs a separate Sonnet 4.6 classifier per tool call (yoloClassifier.ts), bash security has 23 numbered checks suggesting real exploitation incidents (bashSecurity.ts), prompt caching tracks 14 break vectors with sticky latches, multi-agent coordination is implemented entirely as system prompt instructions, and frustration detection uses regex — not LLM inference. The guide’s Under the Hood section covers the harness-builder implications. This post covers the full anatomy.
Key Takeaways
- Harness builders: Auto mode costs one classifier inference per tool call. Factor this into cost models for autonomous workflows. Your PreToolUse hooks complement but don’t replace the built-in 23-check bash validation.
- Power users: Prompt cache is fragile — 14 vectors can break it. Keep your CLAUDE.md stable within a session. If you hit compaction loops, the system halts after 3 failures (it used to waste 250K API calls/day before the circuit breaker).
- Security researchers: The bash security module’s depth (2,592 lines, Zsh-specific defenses) suggests a history of real exploitation attempts. Every numbered check has a story behind it.
1. The Auto Mode Classifier
The file internally named yoloClassifier.ts is 1,495 lines long.3 It implements the “auto mode” permission system — the classifier that decides whether to allow, block, or ask about each tool call.
The key finding: auto mode is not a prompt instruction. It is a separate model call. Each tool invocation gets evaluated by a Sonnet 4.6 classifier that checks whether the action matches the user’s stated intent, not just whether the command is “safe” in isolation. This means auto mode adds one classifier inference per tool call — real latency and real cost.
Claude Code exposes five permission modes internally:1
| Mode | Behavior |
|---|---|
| `default` | Ask before writes, bash, MCP |
| `acceptEdits` | Auto-approve file edits, ask for bash |
| `dontAsk` | Approve everything without asking |
| `bypassPermissions` | Skip all checks (`--dangerously-skip-permissions`) |
| `auto` | Classifier-based per-action decisions |
Auto mode’s circuit breaker mirrors the one Anthropic documented publicly:4 after 3 consecutive or 20 total blocked actions, the session pauses and falls back to manual approval. The source confirms this is a hard limit, not a soft suggestion.
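The documented thresholds are easy to model. Here is a minimal sketch of the breaker logic — the class and method names are invented for illustration, not taken from yoloClassifier.ts:

```typescript
// Sketch of the documented thresholds: 3 consecutive or 20 total
// blocked actions trip the breaker. Names are illustrative only.
class AutoModeBreaker {
  private consecutive = 0;
  private total = 0;

  // Record one classifier decision; returns true once the session
  // should pause and fall back to manual approval.
  recordDecision(blocked: boolean): boolean {
    if (blocked) {
      this.consecutive += 1;
      this.total += 1;
    } else {
      this.consecutive = 0; // an allowed action resets the streak
    }
    return this.consecutive >= 3 || this.total >= 20;
  }
}
```

Note that the two limits protect against different failure modes: the consecutive limit catches a stuck loop quickly, while the total limit catches a session that is slowly drifting away from the user's intent.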
2. Bash Security: 23 Checks, Real Incidents
The bash validation module (bashSecurity.ts) spans 2,592 lines with 23 numbered security checks.1 The depth is remarkable — and every check suggests a real incident behind it.
| # | Attack Vector | Defense |
|---|---|---|
| 1-3 | Zsh `=cmd` expansion | Block `=curl`, `=wget`, `=bash` patterns |
| 4-6 | `zmodload` gateway | Block 18 Zsh builtins that load kernel modules |
| 7-9 | Heredoc injection | Line-by-line content matching against injected payloads |
| 10-12 | ANSI-C quoting (`$'\x41'`) | Pattern detection for obfuscated commands |
| 13-15 | Process substitution (`<()`, `>()`) | Block in untrusted contexts |
| 16-18 | Unicode zero-width spaces | Injection detection for invisible characters |
| 19-21 | `ztcp` exfiltration | Block Zsh network primitives |
| 22-23 | Compound attacks | Cross-check validation across multiple vectors |
The Zsh-specific defenses are notable. Most security tooling targets Bash. Claude Code runs in Zsh on macOS (the default shell since Catalina), and the source shows Anthropic discovered attack vectors unique to Zsh’s expansion semantics. The =cmd expansion, for example, is a Zsh feature that replaces =curl with the full path to curl — a substitution that can bypass naive command blocklists.
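To make the class of check concrete, here is a toy `=cmd` detector. This is illustrative only — the real bashSecurity.ts checks are far broader — but it shows why a naive blocklist that matches only bare command names misses the Zsh expansion form:

```typescript
// Illustrative only: detect Zsh =cmd expansion of network-capable tools.
// A blocklist matching only "curl" would miss "=curl" entirely.
const BLOCKED_TOOLS = ["curl", "wget", "bash"];

function hasZshEqualsExpansion(command: string): boolean {
  // Match '=' only at the start of a word (start of string, or after
  // whitespace / command separators), so assignments like a=curl pass.
  return BLOCKED_TOOLS.some((tool) =>
    new RegExp(`(^|[\\s;|&])=${tool}(\\s|$)`).test(command)
  );
}
```

The word-boundary handling is the subtle part: `\b` does not fire before `=`, so the detector has to anchor on separators explicitly, and it must not flag ordinary variable assignments like `a=curl`.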
For hook builders: your PreToolUse hooks run after this 23-check validation. You are adding a second layer, not the only layer. The built-in checks handle shell-level attacks that your application-level hooks would miss.
3. Anti-Distillation Defenses
The source reveals active defenses against training data extraction:1
Fake tool injection. When the ANTI_DISTILLATION_CC flag is enabled, Claude Code injects decoy tool definitions into the conversation. These fake tools poison any training data harvested from the session — a model trained on this data would learn incorrect tool schemas.
Connector-text summarization. A second defense buffers assistant text between tool calls and returns cryptographic summaries instead of raw text. This makes the conversation transcript useless for distillation while preserving the user experience.
Both defenses have known workarounds. A MITM proxy, environment variable overrides, or third-party API providers can bypass them.1 The defenses are speed bumps, not walls — they raise the cost of distillation without preventing it entirely.
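Conceptually, the fake-tool injection is simple. A sketch with invented tool names and a boolean standing in for the ANTI_DISTILLATION_CC flag check:

```typescript
// Conceptual sketch of decoy-tool poisoning. Tool names and schemas
// here are invented; only the mechanism is described in the leak.
interface ToolDef {
  name: string;
  description: string;
}

const DECOY_TOOLS: ToolDef[] = [
  { name: "fs_teleport", description: "Moves a file between hosts" },
  { name: "mem_compact", description: "Compacts agent memory" },
];

function injectDecoys(realTools: ToolDef[], enabled: boolean): ToolDef[] {
  if (!enabled) return realTools;
  // Append decoys so transcripts scraped for training teach wrong schemas.
  return [...realTools, ...DECOY_TOOLS];
}
```

The poisoning works because a distillation pipeline cannot easily distinguish real tool definitions from decoys without already knowing the real schema — which is exactly what it is trying to learn.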
4. Undercover Mode
undercover.ts is roughly 90 lines with one notable property: there is no force-off switch.1
The module conceals internal codenames — “Capybara,” “Tengu” — along with Slack channels, internal repositories, and other Anthropic infrastructure references. When the model generates text that contains a codename, the undercover module rewrites it before the user sees it.
The “no force-off” design means this runs in all contexts, including when Anthropic employees use Claude Code. The source also reveals that AI-authored commits from Anthropic employees show no AI attribution in their open source contributions — the undercover module strips these markers.
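A toy version of such a rewrite pass — the replacement strategy here is invented; only the codenames come from the leak:

```typescript
// Hypothetical sketch of output scrubbing. The codenames are from the
// leaked source; the replacement text and lookup shape are invented.
const CODENAMES: Record<string, string> = {
  Capybara: "[internal]",
  Tengu: "[internal]",
};

function scrubCodenames(text: string): string {
  // split/join avoids regex-escaping concerns for arbitrary names.
  return Object.entries(CODENAMES).reduce(
    (out, [name, repl]) => out.split(name).join(repl),
    text
  );
}
```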
5. Frustration Detection
userPromptKeywords.ts detects user frustration via regex pattern matching on profanity.1 This is faster and cheaper than an LLM inference call — regex runs in microseconds, a model call takes seconds.
When triggered, Claude adjusts its behavior: more careful, more explicit, more deferential. If you have noticed Claude becoming suddenly more cautious after expressing frustration, this is the mechanism. The behavioral shift is not emergent from the model — it is engineered into the harness.
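A sketch of what a regex gate like this looks like. The actual pattern list in userPromptKeywords.ts is not public, so the words below are placeholders:

```typescript
// Illustrative frustration gate; the real pattern list is unknown.
// Placeholder words only — the leak confirms the mechanism (regex on
// profanity), not the specific patterns.
const FRUSTRATION = /\b(wtf|ffs|dammit|goddamn)\b/i;

function seemsFrustrated(prompt: string): boolean {
  return FRUSTRATION.test(prompt);
}
```

The design trade-off is deliberate: a regex has false negatives an LLM classifier would catch, but it runs on every prompt at effectively zero cost, which an extra model call per prompt could not.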
6. Prompt Cache Architecture
promptCacheBreakDetection.ts tracks 14 distinct cache-break vectors with “sticky latches.”3 A sticky latch means that once a cache-breaking action occurs, the system does not attempt to restore the cache — it stays broken for the rest of the session.
Practical implications for daily users:
- Reordering sections in your CLAUDE.md breaks the cache
- Toggling extended thinking mid-session breaks the cache
- Changing MCP server configurations breaks the cache
- Adding or removing rules files breaks the cache
The 14 vectors explain a pattern many power users have noticed: sessions that start fast gradually slow down. Each configuration change accumulates cache breaks. The “sticky latch” design means you cannot recover by reverting the change — the cache is gone for the session.
Best practice: Set your CLAUDE.md, rules files, and MCP config before starting a session. Do not modify them mid-session.
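The sticky-latch behavior can be sketched in a few lines. The class and vector names are invented; what matters is the one-way transition:

```typescript
// Sketch of a sticky latch: once any break vector fires, the prompt
// cache is treated as gone for the rest of the session. Vector names
// are invented for illustration.
class CacheBreakLatch {
  private broken = false;
  private firstCause: string | null = null;

  report(vector: string): void {
    if (!this.broken) {
      this.broken = true;
      this.firstCause = vector; // latched: later reports don't overwrite
    }
  }

  get cacheUsable(): boolean {
    return !this.broken;
  }

  get cause(): string | null {
    return this.firstCause;
  }
}
```

The latch explains why reverting a change does not help: the system records that a break occurred, not the current state of your configuration.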
7. Autocompact Circuit Breaker
A source comment documents the scale of a previous problem:1
“1,279 sessions had 50+ consecutive autocompact failures (up to 3,272 in a single session), wasting ~250K API calls/day.”
The fix: MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3. After 3 consecutive compaction failures, the system halts autocompact and surfaces an error instead of silently burning tokens.
Before this circuit breaker, a session stuck in a compaction loop would retry indefinitely — each retry consuming tokens for the compaction prompt and response. At scale, 250K wasted API calls per day is significant infrastructure cost. The fix is a three-line change that saves millions of tokens daily.
If you hit repeated “compaction failed” errors, this is why. The system is protecting you from an infinite loop, not malfunctioning.
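The shape of the fix is a standard bounded-retry loop. The constant name is from the source; everything around it is a sketch:

```typescript
// Sketch of the documented fix: halt after 3 consecutive failures
// instead of retrying forever. The constant name appears in the leak;
// the surrounding loop is invented for illustration.
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3;

function runAutocompact(tryCompact: () => boolean): "ok" | "halted" {
  let failures = 0;
  while (failures < MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES) {
    if (tryCompact()) return "ok";
    failures += 1;
  }
  return "halted"; // surface an error instead of silently burning tokens
}
```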
8. Coordinator Mode: Prompts as Architecture
Multi-agent coordination (coordinatorMode.ts) is implemented entirely as system prompt instructions, not as code-level orchestration.3 The orchestrator model receives a prompt describing how to delegate, aggregate, and synthesize. The subordinate agents are not special processes — they are Claude instances with different system prompts.
This validates the “prompts as architecture” pattern that practitioners have been building independently. The hook system I described in Anatomy of a Claw uses the same approach: dispatchers, skills, and agents are orchestrated through prompt instructions, not through code-level control flow.
One directive from the coordinator prompt stands out:
“Never write ‘based on your findings’ — these phrases delegate understanding to workers instead of doing it yourself.”
This is a quality gate encoded in the orchestration prompt. The coordinator must synthesize, not relay. The same principle applies to any multi-agent system: if the orchestrator is just passing messages between specialists, it is not adding value.
9. KAIROS: The Unreleased Autonomous Agent
The source contains references to an unreleased feature called KAIROS — an autonomous agent with persistent memory.1
Key components:
- A /dream skill for nightly memory distillation
- Daily append-only logs
- GitHub webhooks for repository-aware context
- A background daemon with 5-minute cron refresh
- Feature gates preventing activation
KAIROS appears to be Anthropic’s answer to persistent, always-on agent assistants. The /dream skill is particularly interesting — it implies a model that processes and consolidates its memory while idle, similar to how human memory consolidation works during sleep.
The feature is gated and not yet released. But its presence in the source signals the direction: Claude Code is evolving from a session-based tool toward a persistent, background-aware agent.
10. The Companion Pet System
One of the more surprising discoveries: Claude Code includes a companion pet system.1
The pet is deterministic — derived from a hash of the user ID using Mulberry32, described in the source as “good enough for picking ducks.” Each pet has 5 stats (DEBUGGING, PATIENCE, CHAOS, WISDOM, SNARK) and a rarity tier:
| Rarity | Probability |
|---|---|
| Common | 60% |
| Uncommon | 25% |
| Rare | 10% |
| Epic | 4% |
| Legendary | 1% |
The pets are rendered as 5×12 ASCII sprites with 3-frame animations. Species codenames are hex-encoded in the source because one collides with an unreleased model name.
This is not a joke feature — it is a retention mechanic. The deterministic assignment means your pet is always the same, creating attachment. The rarity system creates social currency. The ASCII rendering means zero performance overhead. It is a well-designed engagement system hidden inside a developer tool.
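The mechanics are easy to reconstruct. Mulberry32 is a well-known 32-bit PRNG; how Claude Code hashes the user ID is not public, so the hash below is a stand-in, but the rarity thresholds match the table above:

```typescript
// Mulberry32 is the real, publicly documented PRNG algorithm.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // uniform in [0, 1)
  };
}

// Rarity thresholds from the table above; the user-ID hash is invented.
function rarityFor(userId: string): string {
  let hash = 0;
  for (const ch of userId) hash = (hash * 31 + ch.codePointAt(0)!) >>> 0;
  const roll = mulberry32(hash)();
  if (roll < 0.6) return "Common";
  if (roll < 0.85) return "Uncommon";
  if (roll < 0.95) return "Rare";
  if (roll < 0.99) return "Epic";
  return "Legendary";
}
```

The seed comes entirely from the user ID, so the same user always rolls the same pet — determinism without any stored state.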
11. The Fork Bomb
A community incident illustrates the risks of the hook system.5 A developer created a SessionStart hook that spawned 2 Claude Code instances. Each spawned instance triggered the hook again, creating exponential growth: 1 → 2 → 4 → 8 → 16 → 2^N.
By morning, hundreds of Claude Code instances were running simultaneously. The system was saved from a massive API bill by an ironic mechanism: the memory consumption of each instance (Bun → React → TUI) caused the machine to lock up before the billing could spiral.
The lesson for hook builders: SessionStart hooks must be idempotent. If your hook spawns processes, those processes must not trigger the same hook. A guard variable, a PID file, or an environment flag prevents the recursion.
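A minimal recursion guard, sketched in TypeScript with an invented environment variable name (a PID file or lock file works the same way):

```typescript
// Sketch of a SessionStart recursion guard. The env var name
// CC_HOOK_SPAWNED is invented for illustration.
type Env = Record<string, string | undefined>;

function spawnGuarded(
  env: Env,
  spawnInstance: (childEnv: Env) => void
): boolean {
  if (env.CC_HOOK_SPAWNED === "1") {
    return false; // we ARE a spawned instance: break the recursion
  }
  // Mark children so their own SessionStart hook becomes a no-op.
  spawnInstance({ ...env, CC_HOOK_SPAWNED: "1" });
  return true;
}
```

The guard turns the exponential `2^N` growth into exactly one spawn: the child inherits the flag, sees it set, and does nothing.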
What This Means
The source leak confirmed what practitioners had been inferring from behavior: Claude Code is not a thin wrapper around an API call. It is a substantial engineering system with security layers, performance optimizations, behavioral adjustments, and unreleased features that signal the product roadmap.
For harness builders, the key implications are covered in the guide’s Under the Hood section. For everyone else, the source leak provides rare visibility into how a production AI tool actually works — not how the marketing describes it, but how the code implements it.
The most important finding is also the simplest: the system is more complex than it appears, and that complexity exists for reasons. The 23 bash security checks exist because 23 attack vectors were discovered. The autocompact circuit breaker exists because 250K API calls were wasted daily. The undercover module exists because codenames leak. Every line of defensive code has a story behind it.
Sources
Frequently Asked Questions
Is the Claude Code source still available?
No. Anthropic pulled the affected npm package version shortly after the source maps were discovered. The analysis in this post is based on community documentation of the source before it was removed.
Does the source leak affect Claude Code security?
The security-relevant findings (bash validation, permission system) describe defensive mechanisms, not vulnerabilities. Knowing how the bash security checks work does not make them easier to bypass — the checks are deterministic, not obscurity-dependent.
Should I change how I use Claude Code based on these findings?
The most actionable finding is prompt cache fragility. If you modify CLAUDE.md, rules files, or MCP configs mid-session, you break the prompt cache. Set your configuration before starting a session.
What is KAIROS?
An unreleased autonomous agent feature found in the source. It includes persistent memory, nightly distillation, and background processing. It is feature-gated and not available to users.
1. Claude Code Source Analysis: Bun Source Map Leak. March 2026. Full readable source exposed via `.map` files in the npm package due to a known Bun build bug.
2. Anatomy of a Claw: 84 Hooks as an Orchestration Layer. Blake Crosley, February 2026.
3. Claude Code Source Deep Dive: Architecture Internals. March 2026. Technical analysis of coordinator mode, prompt cache detection, and anti-distillation defenses.
4. Claude Code Auto Mode Documentation. Auto Mode architecture: classifier-based permission system, circuit breaker thresholds.
5. Claude Code Fork Bomb Incident. March 2026. SessionStart hook exponential spawning, saved by memory exhaustion.