agent:~/.claude$ cat agent-architecture.md

Agent Architecture: Building AI-Powered Development Harnesses

# The complete system for building production AI agent harnesses. Skills, hooks, memory, subagents, multi-agent orchestration, and the patterns that make AI coding agents reliable infrastructure.

words: 15138 read_time: 76m updated: 2026-05-08 00:00
$ less agent-architecture.md

TL;DR: Claude Code is not a chat box with file access. It is a programmable runtime with 29 documented lifecycle events, each hookable with shell scripts the model cannot skip. Stack hooks into dispatchers, dispatchers into skills, skills into agents, agents into workflows, and you get an autonomous development harness that enforces constraints, delegates work, persists memory across sessions, and orchestrates multi-agent deliberation. This guide covers every layer of that stack: from a single hook to a 10-agent consensus system. Zero frameworks required. All bash and JSON.

Andrej Karpathy coined a term for what grows around an LLM agent: claws. The hooks, scripts, and orchestration that let the agent grip the world outside its context window.1 Most developers treat AI coding agents as interactive assistants. They type a prompt, watch it edit a file, and move on. That framing caps productivity at whatever you can personally oversee.

The infrastructure mental model is different: an AI coding agent is a programmable runtime with an LLM kernel. Every action the model takes passes through hooks you control. You define policies, not prompts. The model operates within your infrastructure the same way a web server operates within nginx rules. You do not sit at nginx and type requests. You configure it, deploy it, and monitor it.

The distinction matters because infrastructure compounds. A hook that blocks credentials in bash commands protects every session, every agent, every autonomous run. A skill that encodes your evaluation rubric applies consistently whether you invoke it or an agent does. An agent that reviews code for security runs the same checks whether you are watching or not.2


Key Takeaways

  • Hooks guarantee execution; prompts do not. Use hooks for linting, formatting, security checks, and anything that must run every time regardless of model behavior. Exit code 2 blocks actions. Exit code 1 only warns.3
  • Skills encode domain expertise that auto-activates. The description field determines everything. Claude uses LLM reasoning (not keyword matching) to decide when to apply a skill.4
  • Subagents prevent context bloat. Isolated context windows for exploration and analysis keep the main session lean. Run independent subagents in parallel, and use agent teams when workers need sustained coordination.5
  • Memory lives in the filesystem. Files persist across context windows. CLAUDE.md, MEMORY.md, rules directories, and handoff documents form a structured external memory system.6
  • Multi-agent deliberation catches blind spots. Single agents cannot challenge their own assumptions. Two independent agents with different evaluation priorities catch structural failures that quality gates cannot address.7
  • The harness pattern is the system. CLAUDE.md, hooks, skills, agents, and memory are not independent features. They compose into a deterministic layer between you and the model that scales with automation.

How to Use This Guide

| Experience | Start Here | Then Explore |
| --- | --- | --- |
| Using Claude Code daily, want more | The Harness Pattern | Skills System, Hook Architecture |
| Building autonomous workflows | Subagent Patterns | Multi-Agent Orchestration, Production Patterns |
| Evaluating agent architecture | Why Agent Architecture Matters | Decision Framework, Security Considerations |
| Setting up a team harness | CLAUDE.md Design | Hook Architecture, Quick Reference Card |

Each section builds on the previous. The Decision Framework at the end provides a lookup table for choosing the right mechanism for each problem type.


Five-Minute Golden Path

Before the deep dive, here is the shortest path from zero to a working harness. One hook, one skill, one subagent, one outcome.

Step 1: Create a security hook (2 minutes)

Create .claude/hooks/block-secrets.sh:

#!/bin/bash
INPUT=$(cat)
CMD=$(echo "$INPUT" | jq -r '.tool_input.command // empty')
if echo "$CMD" | grep -qEi '(AKIA|sk-|ghp_|password=)'; then
    echo "BLOCKED: Potential secret in command" >&2
    exit 2
fi

Wire it in .claude/settings.json:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{ "type": "command", "command": ".claude/hooks/block-secrets.sh" }]
      }
    ]
  }
}

Result: Every bash command Claude runs is now screened for leaked credentials. The model cannot skip this check.
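Before moving on, you can sanity-check the hook from a regular shell by piping in the same JSON shape Claude Code sends on stdin (the AKIA string below is a documentation-example key, not a real credential):

```shell
# Secret-bearing command: the hook prints BLOCKED to stderr and exits 2.
echo '{"tool_input":{"command":"export KEY=AKIAIOSFODNN7EXAMPLE"}}' \
  | bash .claude/hooks/block-secrets.sh
echo "exit code: $?"   # prints: exit code: 2

# Benign command: the hook falls through and exits 0.
echo '{"tool_input":{"command":"ls -la"}}' \
  | bash .claude/hooks/block-secrets.sh
echo "exit code: $?"   # prints: exit code: 0
```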

Step 2: Create a code review skill (1 minute)

Create .claude/skills/reviewer/SKILL.md with frontmatter (name: reviewer, description: Review code for security issues, bugs, and quality problems. Use when examining changes, reviewing PRs, or auditing code., allowed-tools: Read, Grep, Glob) and a checklist: SQL injection, XSS, hardcoded secrets, missing error handling, functions over 50 lines.
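Spelled out as a file, that skill could look like this minimal sketch (same frontmatter as described above; extend the checklist to taste):

```markdown
---
name: reviewer
description: Review code for security issues, bugs, and quality problems.
  Use when examining changes, reviewing PRs, or auditing code.
allowed-tools: Read, Grep, Glob
---

# Review Checklist

- SQL injection (string interpolation in queries)
- XSS (unescaped user content in rendered HTML)
- Hardcoded secrets or API keys
- Missing error handling around I/O and network calls
- Functions over 50 lines
```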

Result: Claude auto-activates this expertise whenever you mention review, check, or audit.

Step 3: Spawn a subagent (30 seconds)

In any Claude Code session, ask Claude to review the last 3 commits for security issues using a separate agent. Claude spawns an Explore agent that reads the diff, applies your review skill, and returns a summary. Your main context stays clean.

What you now have

A three-layer harness: a deterministic security gate (hook), domain expertise that auto-activates (skill), and isolated analysis that protects your context (subagent). Every section below expands one of these three layers.


Why Agent Architecture Matters

Simon Willison frames the current moment around a single observation: writing code is cheap now.8 Correct. But the corollary is that verification is now the expensive part. Cheap code without verification infrastructure produces bugs at scale. The investment that pays off is not a better prompt. It is the system around the model that catches what the model misses.

Three forces make agent architecture necessary:

Context windows are finite and lossy. Every file read, tool output, and conversation turn consumes tokens. Microsoft Research and Salesforce tested 15 LLMs across 200,000+ simulated conversations and found a 39% average performance drop from single-turn to multi-turn interaction.9 The degradation starts in as few as two turns and follows a predictable curve: precise multi-file edits in the first 30 minutes degrade into single-file tunnel vision by minute 90. Longer context windows do not fix this. The same study’s “Concat” condition (full conversation as a single prompt) achieved 95.1% of single-turn performance with identical content. The degradation comes from turn boundaries, not token limits.

Model behavior is probabilistic, not deterministic. Telling Claude “always run Prettier after editing files” works roughly 80% of the time.3 The model might forget, prioritize speed, or decide the change is “too small.” For compliance, security, and team standards, 80% is not acceptable. Hooks guarantee execution: every Edit or Write triggers your formatter, every time, no exceptions. Deterministic beats probabilistic.

Single perspectives miss multi-dimensional problems. A single agent reviewing an API endpoint checked authentication, validated input sanitization, and verified CORS headers. Clean bill of health. A second agent, prompted separately as a penetration tester, found the endpoint accepted unbounded query parameters that could trigger denial-of-service through database query amplification.7 The first agent never checked because nothing in its evaluation framework treated query complexity as a security surface. That gap is structural. No amount of prompt engineering fixes it.

Agent architecture addresses all three: hooks enforce deterministic constraints, subagents manage context isolation, and multi-agent orchestration provides independent perspectives. Together they form the harness.


The Harness Pattern

The harness is not a framework. It is a pattern: a composable set of files, scripts, and conventions that wrap an AI coding agent in deterministic infrastructure. The components:

┌──────────────────────────────────────────────────────────────┐
│                      THE HARNESS PATTERN                      │
├──────────────────────────────────────────────────────────────┤
│  ORCHESTRATION                                                │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐             │
│  │   Agent     │  │   Agent    │  │  Consensus │             │
│  │   Teams     │  │  Spawning  │  │  Validation│             │
│  └────────────┘  └────────────┘  └────────────┘             │
│  Multi-agent deliberation, parallel research, voting          │
├──────────────────────────────────────────────────────────────┤
│  EXTENSION LAYER                                              │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐    │
│  │  Skills   │  │  Hooks   │  │  Memory  │  │  Agents  │    │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘    │
│  Domain expertise, deterministic gates, persistent state,     │
│  specialized subagents                                        │
├──────────────────────────────────────────────────────────────┤
│  INSTRUCTION LAYER                                            │
│  ┌──────────────────────────────────────────────────────┐    │
│  │     CLAUDE.md  +  .claude/rules/  +  MEMORY.md       │    │
│  └──────────────────────────────────────────────────────┘    │
│  Project context, operational policy, cross-session memory    │
├──────────────────────────────────────────────────────────────┤
│  CORE LAYER                                                   │
│  ┌──────────────────────────────────────────────────────┐    │
│  │           Main Conversation Context (LLM)             │    │
│  └──────────────────────────────────────────────────────┘    │
│  Your primary interaction; finite context; costs money        │
└──────────────────────────────────────────────────────────────┘

Instruction Layer: CLAUDE.md files and rules directories define what the agent knows about your project. They load automatically at session start and after every compaction. This is the agent’s long-term architectural memory.

Extension Layer: Skills provide domain expertise that auto-activates based on context. Hooks provide deterministic gates that fire on every matching tool call. Memory files persist state across sessions. Custom agents provide specialized subagent configurations.

Orchestration Layer: Multi-agent patterns coordinate independent agents for research, review, and deliberation. Spawn budgets prevent runaway recursion. Consensus validation ensures quality.

The key insight: most users work entirely in the Core Layer, watching context bloat and costs climb. Power users configure the Instruction and Extension layers, then use the Core Layer only for orchestration and final decisions.2

Managed vs. Self-Hosted Harnesses (April 2026)

Throughout early 2026, the “build your own harness” path was the only real option. In April 2026, that changed. Anthropic shipped Claude Managed Agents in public beta (April 8): harness loop + tool execution + sandbox container + state persistence as a REST API, billed at standard tokens plus $0.08/session-hour. OpenAI’s Agents SDK update (April 16) formalized the same split — harness and compute as separate layers, with native sandbox providers (Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, Vercel) and snapshot/rehydrate for surviving container loss.2324

The deeper SDK surface for the OpenAI side landed in openai-agents Python v0.14.0 (released April 15, 2026; announced April 16): a SandboxAgent subclass of Agent with default_manifest, sandbox instructions, and capabilities; a Manifest describing the fresh-workspace contract (files, dirs, local files, Git repos, env, users, mounts); a SandboxRunConfig for per-run wiring of sandbox client, live session injection, manifest overrides, snapshots, and materialization concurrency limits. Built-in capabilities cover shell access, filesystem editing, image inspection, skills, sandbox memory, and compaction. Sandbox memory persists extracted lessons across runs and progressively discloses them; workspaces support local files, Git repo entries, and remote mounts (S3, R2, GCS, Azure Blob, S3 Files); snapshots are portable across providers. Backends: UnixLocalSandboxClient, DockerSandboxClient, and hosted clients for Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel via optional extras.24

For Python projects that want to embed the Claude Code runtime as a library — between “shell out to claude” and “REST API to Managed Agents” — claude-agent-sdk-python is the third option. The April 28-29 series (v0.1.69 → v0.1.71) bumped the bundled CLI to v2.1.123, raised the floor on the mcp dependency to >=1.19.0 (older versions silently dropped CallToolResult returns from in-process MCP tools, leaving the model with a validation-error blob), and brought SandboxNetworkConfig to schema parity with the TypeScript SDK (allowedDomains, deniedDomains, allowManagedDomainsOnly, allowMachLookup).30

The architectural fork is now real:

| Dimension | Self-hosted harness (this guide’s default) | Managed harness (Claude Managed Agents / OpenAI Agents SDK) |
| --- | --- | --- |
| Operational burden | You run everything | Vendor runs loop, sandbox, state |
| Customization | Total — your hooks, your skills, your memory | Bounded — vendor-defined extension points |
| Cost model | Token + self-hosted compute | Token + runtime-hour premium |
| State durability | You design it | Vendor checkpoints across disconnects |
| Agent team orchestration | Build your own | Vendor-provided multi-agent coordination |

When to pick which: self-hosted remains right for teams that already have infrastructure muscle, want skills/hooks they control, or are optimizing a specific workflow deeply. Managed is right for teams without dedicated platform engineers, when time-to-value matters more than customization, or when agent runs need to survive laptop closures reliably without you building that persistence layer. The two are compatible — you can run a self-hosted harness that delegates specific long-running tasks to Managed Agents via its REST API.

What the Harness Looks Like on Disk

~/.claude/
├── CLAUDE.md                    # Personal global instructions
├── settings.json                # User-level hooks and permissions
├── skills/                      # Personal skills (44+)
│   ├── code-reviewer/SKILL.md
│   ├── security-auditor/SKILL.md
│   └── api-designer/SKILL.md
├── agents/                      # Custom subagent definitions
│   ├── security-reviewer.md
│   └── code-explorer.md
├── rules/                       # Categorized rule files
│   ├── security.md
│   ├── testing.md
│   └── git-workflow.md
├── hooks/                       # Hook scripts
│   ├── validate-bash.sh
│   ├── auto-format.sh
│   └── recursion-guard.sh
├── configs/                     # JSON configuration
│   ├── recursion-limits.json
│   └── deliberation-config.json
├── state/                       # Runtime state
│   ├── recursion-depth.json
│   └── agent-lineage.json
├── handoffs/                    # Session handoff documents
│   └── deliberation-prd-7.md
└── projects/                    # Per-project memory
    └── {project}/memory/MEMORY.md

.claude/                         # Project-level (in repo)
├── CLAUDE.md                    # Project instructions
├── settings.json                # Project hooks
├── skills/                      # Team-shared skills
├── agents/                      # Team-shared agents
└── rules/                       # Project rules

Every file in this structure serves a purpose. The ~/.claude/ tree is personal infrastructure that applies to all projects. The .claude/ tree in each repository is project-specific and shared via git. Together, they form the complete harness.


Skills System

Skills are model-invoked extensions. Claude discovers and applies them automatically based on context, without you explicitly calling them.4 The moment you catch yourself re-explaining the same context across sessions is the moment you should build a skill.

When to Build a Skill

| Situation | Build a… | Why |
| --- | --- | --- |
| You paste the same checklist every session | Skill | Domain expertise that auto-activates |
| You run the same command sequence explicitly | Slash command | User-invoked action with predictable trigger |
| You need isolated analysis that shouldn’t pollute context | Subagent | Separate context window for focused work |
| You need a one-time prompt with specific instructions | Nothing | Just type it. Not everything needs abstraction. |

Skills are for knowledge Claude always has available. Slash commands are for actions you explicitly trigger. If you are deciding between the two, ask: “Should Claude apply this automatically, or should I decide when to run it?”

Creating a Skill

Skills live in four possible locations, from broadest to narrowest scope:4

| Scope | Location | Applies to |
| --- | --- | --- |
| Enterprise | Managed settings | All users in organization |
| Personal | ~/.claude/skills/&lt;name&gt;/SKILL.md | All your projects |
| Project | .claude/skills/&lt;name&gt;/SKILL.md | This project only |
| Plugin | &lt;plugin&gt;/skills/&lt;name&gt;/SKILL.md | Where plugin is enabled |

Every skill requires a SKILL.md file with YAML frontmatter:

---
name: code-reviewer
description: Review code for security vulnerabilities, performance issues,
  and best practice violations. Use when examining code changes, reviewing
  PRs, analyzing code quality, or when asked to review, audit, or check code.
allowed-tools: Read, Grep, Glob
---

# Code Review Expertise

## Security Checks
When reviewing code, verify:

### Input Validation
- All user input sanitized before database operations
- Parameterized queries (no string interpolation in SQL)
- Output encoding for rendered HTML content

### Authentication
- Session tokens validated on every protected endpoint
- Permission checks before data mutations
- No hardcoded credentials or API keys in source

Frontmatter Reference

| Field | Required | Purpose |
| --- | --- | --- |
| name | Yes | Unique identifier (lowercase, hyphens, max 64 chars) |
| description | Yes | Discovery trigger (max 1024 chars). Claude uses this to decide when to apply the skill |
| allowed-tools | No | Restrict Claude’s capabilities (e.g., Read, Grep, Glob for read-only) |
| disable-model-invocation | No | Prevents auto-activation; skill only activates via /skill-name |
| user-invocable | No | Set false to hide from the / menu entirely |
| model | No | Override which model to use when the skill is active |
| context | No | Set to fork to run in isolated context window |
| agent | No | Run as a subagent with its own isolated context |
| hooks | No | Define lifecycle hooks scoped to this skill |
| $ARGUMENTS | No | String substitution: replaced with user’s input after /skill-name |

The Description Field Is Everything

At session start, Claude Code extracts every skill’s name and description and injects them into Claude’s context. When you send a message, Claude uses language model reasoning to decide if any skill is relevant. Independent analysis of the Claude Code source confirms the mechanism: skill descriptions are injected into an available_skills section of the system prompt, and the model uses standard language understanding to select relevant skills.10

Bad description:

description: Helps with code

Effective description:

description: Review code for security vulnerabilities, performance issues,
  and best practice violations. Use when examining code changes, reviewing
  PRs, analyzing code quality, or when asked to review, audit, or check code.

The effective description includes: what it does (review code for specific issue types), when to use it (examining changes, PRs, quality analysis), and trigger phrases (review, audit, check) that users naturally type.

Context Budget

All skill descriptions share a context budget that scales dynamically at 1% of the context window, with a fallback of 8,000 characters.4 If you have many skills, keep each description concise and put the key use case first. You can override the budget via the SLASH_COMMAND_TOOL_CHAR_BUDGET environment variable,11 but the better fix is shorter, more precise descriptions. Run /context during a session to check whether any skills are being excluded.
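As a rough audit of how much budget your descriptions consume, you can count the bytes of each skill’s description line. This is a hedged sketch: folded multi-line YAML descriptions undercount here, so treat the number as a floor, and SKILLS_DIR is just a convenience variable:

```shell
# Count bytes of first-line description text across all personal skills.
SKILLS_DIR="${SKILLS_DIR:-$HOME/.claude/skills}"
grep -rh '^description:' "$SKILLS_DIR"/*/SKILL.md 2>/dev/null | wc -c
```

Compare the total against the 8,000-character fallback budget; if you are near it, tighten descriptions before reaching for the environment variable.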

Supporting Files and Organization

Skills can reference additional files in the same directory:

~/.claude/skills/code-reviewer/
├── SKILL.md                    # Required: frontmatter + core expertise
├── SECURITY_PATTERNS.md        # Referenced: detailed vulnerability patterns
└── PERFORMANCE_CHECKLIST.md    # Referenced: optimization guidelines

Reference them from SKILL.md with relative links. Claude reads these files on-demand when the skill activates. Keep SKILL.md under 500 lines and move detailed reference material to supporting files.12
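In practice the references are ordinary relative Markdown links inside SKILL.md, for example (the section title is illustrative; the filenames are the ones from the tree above):

```markdown
## Going Deeper

- Detailed vulnerability patterns: [SECURITY_PATTERNS.md](SECURITY_PATTERNS.md)
- Optimization guidelines: [PERFORMANCE_CHECKLIST.md](PERFORMANCE_CHECKLIST.md)
```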

Sharing Skills via Git

Project skills (.claude/skills/ in the repo root) are shared via version control:4

mkdir -p .claude/skills/domain-expert
# ... write SKILL.md ...
git add .claude/skills/
git commit -m "feat: add domain-expert skill for payment processing rules"
git push

When teammates pull, they get the skill automatically. No installation, no configuration. This is the most effective way to standardize expertise across a team.

Skills as a Prompt Library

Beyond single-purpose skills, the directory structure works as an organized prompt library:

~/.claude/skills/
├── code-reviewer/          # Activates on: review, audit, check
├── api-designer/           # Activates on: design API, endpoint, schema
├── sql-analyst/            # Activates on: query, database, migration
├── deploy-checker/         # Activates on: deploy, release, production
└── incident-responder/     # Activates on: error, failure, outage, debug

Each skill encodes a different facet of your expertise. Together, they form a knowledge base that Claude draws from automatically based on context. A junior developer gets senior-level guidance without asking for it.

Skills Compose with Hooks

Skills can define their own hooks in frontmatter that activate only while the skill runs. This creates domain-specific behavior that does not pollute other sessions:2

---
name: deploy-checker
description: Verify deployment readiness. Use when preparing to deploy,
  release, or push to production.
hooks:
  PreToolUse:
    - matcher: Bash
      hooks:
        - type: command
          command: "bash -c 'INPUT=$(cat); CMD=$(echo \"$INPUT\" | jq -r \".tool_input.command\"); if echo \"$CMD\" | grep -qE \"deploy|release|publish\"; then echo \"DEPLOYMENT COMMAND DETECTED. Running pre-flight checks.\" >&2; fi'"
---

Philosophy skills auto-activate via SessionStart hooks, injecting quality constraints into every session without explicit invocation. The skill itself is knowledge. The hook is enforcement. Together, they form a policy layer.
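A hedged sketch of that auto-activation pattern: a SessionStart hook whose stdout is injected as session context. The rules file path is hypothetical, and the `|| true` keeps a missing file from surfacing as a hook error:

```json
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "cat ~/.claude/rules/philosophy.md 2>/dev/null || true"
          }
        ]
      }
    ]
  }
}
```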

Common Skill Mistakes

Too-broad descriptions. A git-rebase-helper skill that activates on any git-related prompt (rebases, merges, cherry-picks, even git status) pollutes context on 80% of sessions. The fix is either tightening the description or adding disable-model-invocation: true and requiring explicit /skill-name invocation.4

Too many skills competing for budget. More skills means more descriptions competing for the 1% context budget. If you notice skills not activating, check /context for excluded ones. Prioritize fewer, well-described skills over many vague ones.

Critical information buried in supporting files. Claude reads SKILL.md immediately but only accesses supporting files when needed. If critical information is in a supporting file, Claude might not find it. Put essential information in SKILL.md directly.4

SDK Skill Surface (May 8, 2026)

Self-hosted harnesses on claude-agent-sdk-python v0.1.77+ should use the skills option on ClaudeAgentOptions to declare available skills, not the legacy "Skill" value in allowed_tools.37 The "Skill" shorthand is deprecated and the dedicated option gives Claude Code more structured information about which skills are available. Bundled CLI in v0.1.77 is v2.1.133.


Hook Architecture

Hooks are shell commands triggered by Claude Code lifecycle events.3 They run outside the LLM as plain scripts, not prompts interpreted by the model. The model wants to run rm -rf /? A 10-line bash script checks the command against a blocklist and rejects it before the shell ever sees it. The hook fires whether the model wants it to or not.

Available Events

Claude Code exposes 29 documented lifecycle events across eight categories as of this guide update. The event list grows with releases, so treat the reference docs as the source of truth and check the cheat sheet for the current full table before wiring production hooks:13

| Category | Events | Can Block? |
| --- | --- | --- |
| Session | SessionStart, Setup, SessionEnd | No |
| User / completion | UserPromptSubmit, UserPromptExpansion, Stop, StopFailure, TeammateIdle | Prompt/expansion/stop/idle can block; StopFailure cannot |
| Tool | PreToolUse, PermissionRequest, PermissionDenied, PostToolUse, PostToolUseFailure, PostToolBatch | Pre/permission/batch can block; post events cannot |
| Subagent / task | SubagentStart, SubagentStop, TaskCreated, TaskCompleted | Stop/task events can block; start cannot |
| Context | PreCompact, PostCompact, InstructionsLoaded | PreCompact can block; post/load cannot |
| Filesystem / workspace | CwdChanged, FileChanged, WorktreeCreate, WorktreeRemove | Worktree creation can block; others cannot |
| Configuration / notification | ConfigChange, Notification | Config changes can block except policy settings; notifications cannot |
| MCP | Elicitation, ElicitationResult | Yes |

Exit Code Semantics

Exit codes determine whether hooks block actions:3

| Exit Code | Meaning | Action |
| --- | --- | --- |
| 0 | Success | Operation proceeds. Stdout shown in verbose mode. |
| 2 | Blocking error | Operation stops. Stderr becomes error message fed to Claude. |
| 1, 3, etc. | Non-blocking error | Operation continues. Stderr shown in verbose mode only (Ctrl+O). |

Critical: Every security hook must use exit 2, not exit 1. Exit 1 is a non-blocking warning. The dangerous command still executes. This is the most common hook mistake across teams.14

Hook Configuration

Hooks live in settings files. Project-level (.claude/settings.json) for shared hooks. User-level (~/.claude/settings.json) for personal hooks:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/hooks/validate-bash.sh"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "bash -c 'if [[ \"$FILE_PATH\" == *.py ]]; then black --quiet \"$FILE_PATH\" 2>/dev/null; fi'"
          }
        ]
      }
    ]
  }
}

The matcher field filters on an event-specific value. For tool events, it matches tool_name values such as Bash, Edit, Write, Read, Glob, Grep, MCP tool names like mcp__server__tool, or * for all tools. Simple names and |-separated lists are exact matches; values with other characters are JavaScript regular expressions. Some events do not support matchers and always fire when configured.13
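For instance, a matcher using the regex form to gate every write-capable tool on a hypothetical MCP server named db (the script path is illustrative):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "mcp__db__(write|delete).*",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/guard-db-writes.sh" }
        ]
      }
    ]
  }
}
```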

Hook Input/Output Protocol

Hooks receive JSON on stdin with full context:

{
  "tool_name": "Bash",
  "tool_input": {
    "command": "npm test",
    "description": "Run test suite"
  },
  "session_id": "abc-123",
  "agent_id": "main",
  "agent_type": "main"
}

For advanced control, PreToolUse hooks can output JSON to modify tool input, inject context, or make permission decisions. Use the hookSpecificOutput wrapper — the older top-level decision/reason format is deprecated for PreToolUse:

{
  "hookSpecificOutput": {
    "hookEventName": "PreToolUse",
    "permissionDecision": "allow",
    "permissionDecisionReason": "Command validated and modified",
    "updatedInput": {
      "command": "npm test -- --coverage --ci"
    },
    "additionalContext": "Note: This database has a 5-second query timeout."
  }
}
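A hedged sketch of a hook that emits that JSON: it rewrites a bare npm test into the CI-flavored invocation and stays silent (plain allow) for everything else. The script name and the exact-match policy are illustrative, and it assumes jq is installed:

```shell
#!/bin/bash
# rewrite-npm-test.sh: PreToolUse hook emitting hookSpecificOutput JSON.
INPUT=$(cat)
CMD=$(echo "$INPUT" | jq -r '.tool_input.command // empty')

# Only rewrite the exact command "npm test"; anything else passes untouched.
if [ "$CMD" = "npm test" ]; then
  jq -n '{
    hookSpecificOutput: {
      hookEventName: "PreToolUse",
      permissionDecision: "allow",
      permissionDecisionReason: "Added coverage flags for CI parity",
      updatedInput: { command: "npm test -- --coverage --ci" }
    }
  }'
fi
exit 0
```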

Three Types of Guarantees

Before writing any hook, ask: what kind of guarantee do I need?14

Formatting guarantees ensure consistency after the fact. PostToolUse hooks on Write/Edit run your formatter after every file change. The model’s output does not matter because the formatter normalizes everything.

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "bash -c 'if [[ \"$FILE_PATH\" == *.py ]]; then black --quiet \"$FILE_PATH\" 2>/dev/null; elif [[ \"$FILE_PATH\" == *.js ]] || [[ \"$FILE_PATH\" == *.ts ]]; then npx prettier --write \"$FILE_PATH\" 2>/dev/null; fi'"
          }
        ]
      }
    ]
  }
}

Safety guarantees prevent dangerous actions before they execute. PreToolUse hooks on Bash inspect commands and block destructive patterns with exit code 2:

#!/bin/bash
# validate-bash.sh — block dangerous commands
INPUT=$(cat)
CMD=$(echo "$INPUT" | jq -r '.tool_input.command')

if echo "$CMD" | grep -qE "rm\s+-rf\s+/|git\s+push\s+(-f|--force)\s+(origin\s+)?main|git\s+reset\s+--hard|DROP\s+TABLE"; then
    echo "BLOCKED: Dangerous command detected: $CMD" >&2
    exit 2
fi

Quality guarantees validate state at decision points. PreToolUse hooks on git commit commands run your linter or test suite and block the commit if quality checks fail:

#!/bin/bash
# quality-gate.sh — lint before commit
INPUT=$(cat)
CMD=$(echo "$INPUT" | jq -r '.tool_input.command')

if echo "$CMD" | grep -qE "^git\s+commit"; then
    if ! LINT_OUTPUT=$(ruff check . --select E,F,W 2>&1); then
        echo "LINT FAILED -- fix before committing:" >&2
        echo "$LINT_OUTPUT" >&2
        exit 2
    fi
fi

Hook Types Beyond Shell Commands

Claude Code supports five hook types:13

Command hooks (type: "command") run shell scripts. Fast, deterministic, no token cost.

MCP tool hooks (type: "mcp_tool") call a tool on an already-connected MCP server. Use them when validation logic already lives behind an MCP boundary and does not need a separate shell script.

Prompt hooks (type: "prompt") send a single-turn prompt to a fast Claude model. The model returns { "ok": true } to allow or { "ok": false, "reason": "..." } to block. Use for nuanced evaluation that regex cannot express.

Agent hooks (type: "agent") spawn a subagent with tool access (Read, Grep, Glob) for multi-turn verification. They are experimental; prefer command hooks for production gates and reserve agent hooks for checks that genuinely require inspecting actual files or test output:

{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "agent",
            "prompt": "Verify all unit tests pass. Run the test suite and check results. $ARGUMENTS",
            "timeout": 120
          }
        ]
      }
    ]
  }
}

HTTP hooks (type: "http") send the event’s JSON input as a POST request to a URL and receive JSON back. Use for webhooks, external notification services, or API-based validation (v2.1.63+). Not supported for SessionStart events:

{
  "hooks": {
    "PostToolUse": [
      {
        "hooks": [
          {
            "type": "http",
            "url": "https://your-webhook.example.com/hook",
            "headers": { "Authorization": "Bearer $WEBHOOK_TOKEN" },
            "allowedEnvVars": ["WEBHOOK_TOKEN"],
            "timeout": 10
          }
        ]
      }
    ]
  }
}

Async Hooks

Hooks can run in the background without blocking execution. Add async: true for non-critical operations like notifications and logging:13

{
  "type": "command",
  "command": ".claude/hooks/notify-slack.sh",
  "async": true
}

Use async for notifications, telemetry, and backups. Never use async for formatting, validation, or anything that must complete before the next action.
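A minimal async candidate, as a hedged sketch: an audit log that appends one line per tool call. The script name and log path are this guide’s convention, not a Claude Code requirement:

```shell
#!/bin/bash
# log-tool-use.sh: fire-and-forget audit trail; safe to mark async: true.
INPUT=$(cat)
TOOL=$(echo "$INPUT" | jq -r '.tool_name // "unknown"')
mkdir -p "$HOME/.claude/state"
echo "$(date -u +%FT%TZ) $TOOL" >> "$HOME/.claude/state/tool-use.log"
exit 0
```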

Dispatchers Over Independent Hooks

Seven hooks firing independently on the same event, each reading stdin on its own, invite race conditions. Two hooks writing the same JSON state file concurrently will truncate it, and every downstream hook that parses that file breaks.2

The fix: one dispatcher per event that runs hooks sequentially from cached stdin:

#!/bin/bash
# dispatcher.sh — run hooks sequentially with cached stdin
INPUT=$(cat)
HOOK_DIR="$HOME/.claude/hooks/pre-tool-use.d"

for hook in "$HOOK_DIR"/*.sh; do
    [ -x "$hook" ] || continue
    echo "$INPUT" | "$hook"
    EXIT_CODE=$?
    if [ "$EXIT_CODE" -eq 2 ]; then
        exit 2  # Propagate block
    fi
done
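With the dispatcher in place, settings.json registers a single entry per event instead of seven; individual hooks then live as executable *.sh files in the pre-tool-use.d/ directory (the dispatcher path here is illustrative):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/dispatcher.sh"
          }
        ]
      }
    ]
  }
}
```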

Debugging Hooks

Five techniques for debugging hooks that fail silently:14

  1. Test scripts independently. Pipe sample JSON: echo '{"tool_input":{"command":"git commit -m test"}}' | bash your-hook.sh
  2. Use stderr for debug output. With exit code 2, stderr is fed back to Claude as an error message. Non-blocking stderr (exit 1, 3, and so on) appears only in verbose mode (Ctrl+O).
  3. Watch for jq failures. Wrong JSON paths return null silently. Test jq expressions against real tool input.
  4. Verify exit codes. A PreToolUse hook that uses exit 1 provides zero enforcement while appearing to work.
  5. Keep hooks fast. Hooks run synchronously. Keep all hooks under 2 seconds, ideally under 500ms.
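Technique 1 scales into a repeatable smoke test: pipe a payload the hook should block and assert the exit code. The sketch below fabricates a stand-in hook that blocks force pushes; both the hook and the payloads are illustrative:

```shell
#!/bin/bash
# Stand-in blocking hook (illustrative; substitute your real hook script)
cat > /tmp/sample-hook.sh <<'EOF'
#!/bin/bash
INPUT=$(cat)
echo "$INPUT" | grep -q 'push --force' && { echo "force push blocked" >&2; exit 2; }
exit 0
EOF

# A blocking payload should yield exit code 2, a benign one exit code 0
echo '{"tool_input":{"command":"git push --force"}}' | bash /tmp/sample-hook.sh
echo "blocking payload exit code: $?"
echo '{"tool_input":{"command":"git status"}}' | bash /tmp/sample-hook.sh
echo "benign payload exit code: $?"
```

Running both payloads guards against the failure mode in technique 4: a hook that "blocks" with exit 1 passes the benign case and silently fails the blocking case.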

SDK-Side Hook Event Streaming

Self-hosted harnesses built on claude-agent-sdk-python (v0.1.74+, May 6, 2026) can subscribe to hook events directly from the message stream rather than going through shell-script callbacks.36 Set include_hook_events=True on ClaudeAgentOptions and HookEventMessage objects (PreToolUse, PostToolUse, Stop, and others) yield from the same iterator as assistant messages and tool results. This mirrors the TypeScript SDK’s includeHookEvents option; bundled CLI was bumped to v2.1.129 in the same release.

The event-stream pattern is the right fit when your harness already lives in Python and you want hook signals in the same control flow as model output. The shell-script hook contract (exit codes, stdin JSON, dispatchers) remains the right answer for harnesses that compose multiple tools, share hooks across Claude Code and Codex, or need exit-code semantics for blocking.

Effort and Session Provenance (May 7-8, 2026)

Two additions in Claude Code v2.1.132 and v2.1.133 give hooks and subprocesses better signal about their execution context:3839

  • effort.level in hook input. Hooks now receive an effort.level JSON field on the same input that carries tool_input and session_id. The same value is exported as the $CLAUDE_EFFORT env var, so Bash commands can read it without parsing JSON. Use this to scale hook cost with effort tier: skip expensive validation on low, run the full security gate on xhigh or max.
  • CLAUDE_CODE_SESSION_ID env var on Bash subprocesses. Bash tool subprocesses now see the same session_id value the hooks see, exposed as CLAUDE_CODE_SESSION_ID. This closes the provenance gap for tools that log per-session state and were previously unable to correlate subprocess events with hook events.

Both signals are available without code changes; existing hooks that ignore the new fields keep working.
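A sketch of effort-aware gating using the exported variable; run_full_gate is a placeholder for whatever expensive validation you would actually run:

```shell
#!/bin/bash
# Branch on $CLAUDE_EFFORT without parsing the hook's JSON input.
run_full_gate() { echo "running full security gate"; }   # placeholder

effort_gate() {
    case "${CLAUDE_EFFORT:-high}" in
        low|medium) return 0 ;;      # skip expensive validation on low tiers
        xhigh|max)  run_full_gate ;;
        *)          return 0 ;;      # high and unknown tiers: default checks only
    esac
}

effort_gate
```

Hooks that need the effort tier alongside other event fields can instead read effort.level from the same stdin JSON that carries tool_input and session_id.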


Memory and Context

Every AI conversation operates within a finite context window. As the conversation grows, the system compresses earlier turns to make room for new content. The compression is lossy. Architectural decisions documented in turn 3 may not survive to turn 15.9

The Three Mechanisms of Multi-Turn Collapse

The MSR/Salesforce study identified three independent mechanisms, each requiring a different intervention:9

Mechanism What Happens Intervention
Context compression Earlier information discarded to fit new content State checkpointing to filesystem
Reasoning coherence loss Model contradicts its own earlier decisions across turns Fresh-context iteration (Ralph loop)
Coordination failure Multiple agents hold different state snapshots Shared state protocols between agents

Strategy 1: Filesystem as Memory

The most reliable memory across context boundaries lives in the filesystem. Claude Code reads CLAUDE.md and memory files at the start of every session and after every compaction.6

~/.claude/
├── configs/           # 14 JSON configs (thresholds, rules, budgets)
│   ├── deliberation-config.json
│   ├── recursion-limits.json
│   └── consensus-profiles.json
├── hooks/             # 95 lifecycle event handlers
├── skills/            # 44 reusable knowledge modules
├── state/             # Runtime state (recursion depth, agent lineage)
├── handoffs/          # 49 multi-session context documents
├── docs/              # 40+ system documentation files
└── projects/          # Per-project memory directories
    └── {project}/memory/
        └── MEMORY.md  # Always loaded into context

The MEMORY.md file captures errors, decisions, and patterns across sessions. When you discover that ((VAR++)) fails with set -e in bash when VAR is 0, you record it. Three sessions later, when you encounter a similar integer edge case in Python, the MEMORY.md entry surfaces the pattern.15
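The bash pitfall is easy to reproduce: (( expr )) returns status 1 when the expression evaluates to 0, which set -e treats as a failure. A minimal demonstration of the safe form:

```shell
#!/bin/bash
set -e
VAR=0
# ((VAR++)) would abort here: the post-increment expression evaluates
# to 0 (falsy), so (( )) returns status 1 and set -e kills the script.
VAR=$((VAR + 1))   # safe: an assignment always returns status 0
echo "VAR=$VAR"    # prints VAR=1
```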

Auto Memory (v2.1.32+): Claude Code automatically records and recalls project context. As you work, Claude writes observations to ~/.claude/projects/{project-path}/memory/MEMORY.md. Auto memory loads the first 200 lines into your system prompt at session start. Keep it concise and link to separate topic files for detailed notes.6

Strategy 2: Proactive Compaction

Claude Code’s /compact command summarizes the conversation and frees context space while preserving key decisions, file contents, and task state.15

When to compact:

  • After completing a distinct subtask (feature implemented, bug fixed)
  • Before starting a new area of the codebase
  • When Claude starts repeating or forgetting earlier context
  • Roughly every 25-30 minutes during intensive sessions

Custom compaction instructions in CLAUDE.md:

# Summary Instructions
When using compact, focus on:
- Recent code changes
- Test results
- Architecture decisions made this session

Strategy 3: Session Handoffs

For tasks spanning multiple sessions, create handoff documents that capture the full state:

## Handoff: Deliberation Infrastructure PRD-7
**Status:** Hook wiring complete, 81 Python unit tests passing
**Files changed:** hooks/post-deliberation.sh, hooks/deliberation-pride-check.sh
**Decision:** Placed post-deliberation in PostToolUse:Task, pride-check in Stop
**Blocked:** Spawn budget model needs inheritance instead of depth increment
**Next:** PRD-8 integration tests in tests/test_deliberation_lib.py

The Status/Files/Decision/Blocked/Next structure provides the successor session with full context at minimal token cost. A session started with claude -c (continue), or one that begins by reading the handoff document, goes straight to implementation.15

Strategy 4: Fresh-Context Iteration (The Ralph Loop)

For sessions exceeding 60-90 minutes, spawn a fresh Claude instance per iteration. State persists through the filesystem, not through conversational memory. Each iteration gets the full context budget:16

Iteration 1: [200K tokens] -> writes code, creates files, updates state
Iteration 2: [200K tokens] -> reads state from disk, continues
Iteration 3: [200K tokens] -> reads updated state, continues
...
Iteration N: [200K tokens] -> reads final state, verifies criteria

Compare with a single long session:

Minute 0:   [200K tokens available] -> productive
Minute 30:  [150K tokens available] -> somewhat productive
Minute 60:  [100K tokens available] -> degraded
Minute 90:  [50K tokens available]  -> significantly degraded
Minute 120: [compressed, lossy]     -> errors accumulate

The fresh-context-per-iteration approach trades 15-20% overhead for the orient step (reading state files, scanning git history) against full cognitive resources per iteration.16 The cost-benefit calculation: for sessions under 60 minutes, a single conversation is more efficient. Beyond 90 minutes, fresh-context produces higher-quality output despite the overhead.
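A minimal sketch of the loop, with run_iteration standing in for a claude -p invocation; the STATE.md convention, the completion marker, and the three-pass stub are all illustrative:

```shell
#!/bin/bash
# Fresh-context iteration: every pass is a brand-new agent process;
# the only memory between passes is what gets written to disk.
STATE="${STATE:-/tmp/STATE.md}"
: > "$STATE"

run_iteration() {
    # Real harness: claude -p "Read $STATE, do the next task, update $STATE"
    # Stub for illustration: finishes after three passes.
    local done_count
    done_count=$(grep -c '^pass' "$STATE")
    echo "pass $((done_count + 1))" >> "$STATE"
    if [ "$((done_count + 1))" -ge 3 ]; then
        echo "ALL TASKS COMPLETE" >> "$STATE"
    fi
}

for i in 1 2 3 4 5; do
    run_iteration
    if grep -q "ALL TASKS COMPLETE" "$STATE"; then
        echo "converged after $i iterations"
        break
    fi
done
```

The loop itself carries no conversational state; each iteration orients from the state file and git history, which is exactly where the 15-20% overhead goes.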

Strategy 5: Managed Memory Curation (Dreaming)

Anthropic’s Claude Managed Agents added Dreaming as a Research Preview on May 6, 2026.35 Per Anthropic: “Dreaming is a scheduled process that reviews your agent sessions and memory stores, extracts patterns, and curates memories so your agents improve over time.”35

Dreaming runs in the background between sessions, not on the critical path. It complements rather than replaces the filesystem-as-memory pattern: your MEMORY.md file remains the load-bearing surface; Dreaming writes curated memory entries into the Managed Agents memory store, which the agent reads at session start. The two patterns coexist for harnesses that mix self-hosted filesystem state with managed-side curation.

Filesystem Memory Dreaming (Managed)
Where memory lives Your repo, version-controlled Anthropic-managed memory store
When it updates You write entries by hand or via hooks Background process between sessions
What it captures Decisions, errors, patterns you flag Patterns extracted from session history
Best for Project-specific institutional knowledge Cross-session pattern discovery you would not catch by hand

Dreaming is in Research Preview, so behavior may change. The session-handoffs and CLAUDE.md patterns documented above remain the authoritative memory mechanism for self-hosted harnesses.

The Anti-Patterns

Reading entire files when you need 10 lines. A single 2,000-line file read consumes 15,000-20,000 tokens. Use line offsets: Read file.py offset=100 limit=20 saves the vast majority of that cost.15

Keeping verbose error output in context. After debugging a bug, your context holds 40+ stack traces from failed iterations. A single /compact after fixing the bug frees that dead weight.

Starting every session by reading every file. Let Claude Code’s glob and grep tools find relevant files on demand, saving 100,000+ tokens of unnecessary pre-loading.15


Subagent Patterns

Subagents are specialized Claude instances that handle complex tasks independently. They start with a clean context (no pollution from the main conversation), operate with specified tools, and return results as summaries. The exploration results do not bloat your main conversation; only the conclusions return.5

Built-In Subagent Types

Type Model Mode Tools Use For
Explore Haiku (fast) Read-only Glob, Grep, Read, safe bash Codebase exploration, finding files
General-purpose Inherits Full read/write All available Complex research + modification
Plan Inherits (or Opus) Read-only Read, Glob, Grep, Bash Planning before execution

Creating Custom Subagents

Define subagents in .claude/agents/ (project) or ~/.claude/agents/ (personal):

---
name: security-reviewer
description: Expert security code reviewer. Use PROACTIVELY after any code
  changes to authentication, authorization, or data handling.
tools: Read, Grep, Glob, Bash
model: opus
permissionMode: plan
---

You are a senior security engineer reviewing code for vulnerabilities.

When invoked:
1. Identify the files that were recently changed
2. Analyze for OWASP Top 10 vulnerabilities
3. Check for secrets, hardcoded credentials, SQL injection
4. Report findings with severity levels and remediation steps

Focus on actionable security findings, not style issues.

Subagent Configuration Fields

Field Required Purpose
name Yes Unique identifier (lowercase + hyphens)
description Yes When to invoke (include “PROACTIVELY” to encourage auto-delegation)
tools No Comma-separated. Inherits all tools if omitted. Supports Agent(agent_type) to restrict spawnable agents
disallowedTools No Tools to deny, removed from inherited or specified list
model No sonnet, opus, haiku, inherit (default: inherit)
permissionMode No default, acceptEdits, delegate, dontAsk, bypassPermissions, plan
maxTurns No Maximum agentic turns before the subagent stops
memory No Persistent memory scope: user, project, local
skills No Auto-load skill content into subagent context at startup. As of v2.1.133, subagents also discover project, user, and plugin skills via the Skill tool the same way the parent session does. Earlier versions silently dropped these from subagent context.39
hooks No Lifecycle hooks scoped to this subagent’s execution
background No Always run as background task
isolation No Set to worktree for isolated git worktree copy

Worktree Isolation

Subagents can operate in temporary git worktrees, providing a complete isolated copy of the repository:5

---
name: experimental-refactor
description: Attempt risky refactoring in isolation
isolation: worktree
tools: Read, Write, Edit, Bash, Grep, Glob
---

You have an isolated copy of the repository. Make changes freely.
If the refactoring succeeds, the changes can be merged back.
If it fails, the worktree is discarded with no impact on the main branch.

Worktree isolation is essential for experimental work that might break the codebase.

Parallel Subagents

Use parallel subagents for independent research tasks that do not need to coordinate with each other:5

> Have three explore agents search in parallel:
> 1. Authentication code
> 2. Database models
> 3. API routes

Each agent runs in its own context window, finds relevant code, and returns a summary. The main context stays clean.

The Recursion Guard

Without spawn limits, agents delegate to agents that delegate to agents, each one losing context and burning tokens. The recursion guard pattern enforces budgets:16

#!/bin/bash
# recursion-guard.sh — enforce spawn budget
CONFIG_FILE="${HOME}/.claude/configs/recursion-limits.json"
STATE_FILE="${HOME}/.claude/state/recursion-depth.json"

MAX_DEPTH=2
MAX_CHILDREN=5
DELIB_SPAWN_BUDGET=2
DELIB_MAX_AGENTS=12

# Initialize the state file if missing, then read current depth
[ -s "$STATE_FILE" ] || { mkdir -p "$(dirname "$STATE_FILE")"; echo '{"depth": 0}' > "$STATE_FILE"; }
current_depth=$(jq -r '.depth // 0' "$STATE_FILE")

if [[ "$current_depth" -ge "$MAX_DEPTH" ]]; then
    echo "BLOCKED: Maximum recursion depth ($MAX_DEPTH) reached" >&2
    exit 2
fi

# Increment depth using safe arithmetic (not ((VAR++)) with set -e)
new_depth=$((current_depth + 1))
jq --argjson d "$new_depth" '.depth = $d' "$STATE_FILE" > "${STATE_FILE}.tmp"
mv "${STATE_FILE}.tmp" "$STATE_FILE"

Critical lesson: Use spawn budgets, not just depth limits. Depth-based limits track parent-child chains (blocked at depth 3) but miss width: 23 agents at depth 1 is still “depth 1.” A spawn budget tracks total active children per parent, capped at a configurable maximum. The budget model maps to the actual failure mode (too many total agents) rather than a proxy metric (too many nesting levels).7
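A spawn-budget variant of the guard, tracking a children count in the same state file (the children field name and the limit are illustrative):

```shell
#!/bin/bash
# spawn-budget.sh — cap total active children per parent, not just depth
STATE_FILE="${STATE_FILE:-$HOME/.claude/state/recursion-depth.json}"
MAX_CHILDREN="${MAX_CHILDREN:-5}"

mkdir -p "$(dirname "$STATE_FILE")"
[ -s "$STATE_FILE" ] || echo '{"depth": 0, "children": 0}' > "$STATE_FILE"

children=$(jq -r '.children // 0' "$STATE_FILE")
if [ "$children" -ge "$MAX_CHILDREN" ]; then
    echo "BLOCKED: spawn budget ($MAX_CHILDREN children) exhausted" >&2
    exit 2
fi

# Record the new child before allowing the spawn to proceed
jq --argjson c "$((children + 1))" '.children = $c' "$STATE_FILE" \
    > "${STATE_FILE}.tmp" && mv "${STATE_FILE}.tmp" "$STATE_FILE"
```

A companion hook on subagent completion would decrement the count; without that, the budget measures total spawns rather than active children.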

Agent Teams (Research Preview)

Agent Teams coordinate multiple Claude Code instances that work independently, communicate via a shared mailbox and task list, and can challenge each other’s findings:5

Component Role
Team lead Main session that creates the team, spawns teammates, coordinates work
Teammates Separate Claude Code instances working on assigned tasks
Task list Shared work items that teammates claim and complete (file-locked)
Mailbox Messaging system for inter-agent communication

Enable with: export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1

When to use agent teams vs subagents:

Subagents Agent Teams
Communication Report results back only Teammates message each other directly
Coordination Main agent manages all work Shared task list with self-coordination
Best for Focused tasks where only result matters Complex work requiring discussion and collaboration
Token cost Lower Higher (each teammate = separate context window)

Multi-Agent Orchestration

Single-agent AI systems have a structural blind spot: they cannot challenge their own assumptions.7 Multi-agent deliberation forces independent evaluation from multiple perspectives before any decision locks.

Cross-tool orchestration (April 2026): Google open-sourced Scion on April 7 — a multi-agent hypervisor that runs Claude Code, Gemini CLI, and other “deep agents” as concurrent processes, each with isolated container, git worktree, and credentials. Runs local, hub, or Kubernetes. Explicit philosophy: “isolation over constraints” — agents run with high autonomy inside boundaries enforced at the infrastructure layer, not in the prompt.25 This directly extends the subagent-isolation argument across different tool vendors. If your workflow spans Claude and OpenAI models, Scion is the first real reference implementation for cross-tool subagents with per-agent worktree + credential isolation.

Debate is not a silver bullet: The M3MAD-Bench research cluster (early 2026) found that multi-agent debate plateaus and can be subverted by misleading consensus — valid arguments lose when other agents confidently assert the wrong answer.26 Tool-MAD improves this by giving each agent heterogeneous tool access and using Faithfulness/Relevance scores in the judge stage. If you’re building debate-style orchestration, invest in (a) tool heterogeneity per agent and (b) quantitative judge scoring rather than assuming more agents = better answers.

Managed Multiagent Orchestration and Outcomes (Public Beta)

If you don’t want to build the deliberation infrastructure described below, Multiagent Orchestration entered Public Beta in Claude Managed Agents on May 6, 2026.35 Per Anthropic: “When there is too much work for a single agent to do well, multiagent orchestration lets a lead agent break the job into pieces and delegate each one to a specialist with its own model, prompt, and tools.”35 Specialists “work in parallel on a shared filesystem and contribute to the lead agent’s overall context.”35

Tracing comes in the box. Per Anthropic: “you can also trace every step in the Claude Console: which agent did what, in what order, and why, giving you full visibility into how your task was delegated and executed.”35

The companion Public Beta feature is Outcomes. Per Anthropic: “you write a rubric describing what success looks like and the agent works toward it. A separate grader evaluates the output against your criteria in its own context window, so it isn’t influenced by the agent’s reasoning.”35 This is the managed-service version of the two-gate validation pattern documented later in this section: the rubric replaces the hand-written gate, the separate grader replaces the consensus validator.

Self-Hosted Deliberation (this section) Managed Multiagent + Outcomes
Specialist routing You write the spawn logic Lead agent breaks the job into pieces
Validation Two-gate hooks + consensus scoring Rubric + grader in separate context
Tracing You instrument it Claude Console
Best for Patterns that need full control or specific tool composition Standard delegation patterns where the validation rubric is the contract
Pricing Token + harness cost only Standard tokens plus the Managed Agents session-hour rate (April 8 launch base; see 23)

Self-hosted deliberation remains the right answer when the validation needs to integrate with your own hook surface (PreToolUse blocking, exit-code semantics, custom dispatchers) or when the harness must run without external dependencies. Managed Multiagent is the right answer when standard delegation plus rubric grading is the contract you actually need.

Minimum Viable Deliberation

Start with 2 agents and 1 rule: agents must evaluate independently before seeing each other’s work.7

Decision arrives
  |
  v
Confidence check: is this risky, ambiguous, or irreversible?
  |
  +-- NO  -> Single agent decides (normal flow)
  |
  +-- YES -> Spawn 2 agents with different system prompts
             Agent A: "Argue FOR this approach"
             Agent B: "Argue AGAINST this approach"
             |
             v
             Compare findings
             |
             +-- Agreement with different reasoning -> Proceed
             +-- Genuine disagreement -> Investigate the conflict
             +-- Agreement with same reasoning -> Suspect herding

This pattern covers 80% of the value. Everything else adds incremental improvement.

The Confidence Trigger

Not every task needs deliberation. A confidence scoring module evaluates four dimensions:17

  1. Ambiguity - Does the query have multiple valid interpretations?
  2. Domain complexity - Does it require specialized knowledge?
  3. Stakes - Is the decision reversible?
  4. Context dependency - Does it require understanding the broader system?

The score maps to three levels:

Level Threshold Action
HIGH 0.85+ Proceed without deliberation
MEDIUM 0.70-0.84 Proceed with confidence note logged
LOW Below 0.70 Trigger full multi-agent deliberation

The threshold adapts by task type. Security decisions require 0.85 consensus. Documentation changes need only 0.50. This prevents over-engineering simple tasks while ensuring risky decisions get scrutiny.7
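The threshold mapping is a few lines of shell. Bash has no float comparison, so this sketch leans on awk; the cutoffs come from the table above, before any task-type adaptation:

```shell
#!/bin/bash
# Map a confidence score in [0, 1] to a deliberation level.
confidence_level() {
    local score="$1"
    if awk "BEGIN { exit !($score >= 0.85) }"; then
        echo "HIGH"       # proceed without deliberation
    elif awk "BEGIN { exit !($score >= 0.70) }"; then
        echo "MEDIUM"     # proceed with confidence note logged
    else
        echo "LOW"        # trigger full multi-agent deliberation
    fi
}

confidence_level 0.92   # prints HIGH
```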

The State Machine

Seven phases, each gated by the previous:7

IDLE -> RESEARCH -> DELIBERATION -> RANKING -> PRD_GENERATION -> COMPLETE
                                                                    |
                                                              (or FAILED)

RESEARCH: Independent agents investigate the topic. Each agent gets a different persona (Technical Architect, Security Analyst, Performance Engineer, and others). Context isolation ensures agents cannot see each other’s findings during research.

DELIBERATION: Agents see all research findings and generate alternatives. The Debate agent identifies conflicts. The Synthesis agent combines non-contradictory findings.

RANKING: Each agent scores every proposed approach across 5 weighted dimensions:

Dimension Weight
Impact 0.25
Quality 0.25
Feasibility 0.20
Reusability 0.15
Risk 0.15
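The composite score per proposal is the weighted sum of the five dimensions. A jq sketch with illustrative dimension scores; the weights are scaled to integers and divided at the end to avoid float drift:

```shell
# Composite ranking score for one proposed approach (scores illustrative)
jq -n '{impact: 8, quality: 7, feasibility: 9, reusability: 6, risk: 7}
       | (.impact*25 + .quality*25 + .feasibility*20
          + .reusability*15 + .risk*15) / 100'
# prints 7.5
```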

The Two-Gate Validation Architecture

Two validation gates catch problems at different stages:7

Gate 1: Consensus Validation (PostToolUse hook). Runs immediately after each deliberation agent completes:

  1. Phase must have reached at least RANKING
  2. Minimum 2 agents completed (configurable)
  3. Consensus score meets the task-adaptive threshold
  4. If any agent dissented, concerns must be documented

Gate 2: Pride Check (Stop hook). Runs before the session can close:

  1. Diverse methods: multiple unique personas represented
  2. Contradiction transparency: dissents have documented reasons
  3. Complexity handling: at least 2 alternatives generated
  4. Consensus confidence: classified as strong (above 0.85) or moderate (0.70-0.84)
  5. Improvement evidence: final confidence exceeds initial confidence

Two hooks at different lifecycle points match how failures actually occur: some are instant (bad score) and some are gradual (low diversity, missing dissent documentation).7

Why Agreement Is Dangerous

Charlan Nemeth studied minority dissent from 1986 through her 2018 book In Defense of Troublemakers. Groups with dissenters make better decisions than groups that reach quick agreement. The dissenter does not need to be right. The act of disagreement forces the majority to examine assumptions they would otherwise skip.18

Wu et al. tested whether LLM agents can genuinely debate and found that without structural incentives for disagreement, agents converge toward the most confident-sounding initial response regardless of correctness.19 Liang et al. identified the root cause as “Degeneration-of-Thought”: once an LLM establishes confidence in a position, self-reflection cannot generate novel counterarguments, making multi-agent evaluation structurally necessary.20

Independence is the critical design constraint. Two agents evaluating the same deployment strategy with visibility into each other’s findings produced scores of 0.45 and 0.48. Same agents without visibility: 0.45 and 0.72. The gap between 0.48 and 0.72 is the cost of herding.7

Detecting Fake Agreement

A conformity detection module tracks patterns suggesting agents are agreeing without genuine evaluation:7

Score clustering: Every agent scoring within 0.3 points on a 10-point scale signals shared context contamination rather than independent assessment. When five agents evaluating an authentication refactor all scored security risk between 7.1 and 7.4, re-running with fresh context isolation spread the scores to 5.8-8.9.

Boilerplate dissent: Agents copying each other’s concern language rather than generating independent objections.

Absent minority perspectives: Unanimous approval from personas with conflicting priorities (a Security Analyst and a Performance Engineer rarely agree on everything).

The conformity detector catches the obvious cases (roughly 10-15% of deliberations where agents converge too quickly). For the remaining 85-90%, the consensus and pride check gates provide sufficient validation.
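The score-clustering signal reduces to a spread check over one round's scores (the scores and the 0.3 threshold here are illustrative):

```shell
#!/bin/bash
# Flag a round whose agent scores cluster within 0.3 points (possible herding)
scores="7.1 7.2 7.3 7.4 7.2"   # one score per agent; unquoted split is intended
spread=$(printf '%s\n' $scores | sort -n \
         | awk 'NR==1 {min=$1} {max=$1} END {printf "%.1f", max - min}')

if awk "BEGIN { exit !($spread <= 0.3) }"; then
    echo "WARNING: possible herding (spread $spread)"
fi
```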

What Didn’t Work in Deliberation

Free-form debate rounds. Three rounds of back-and-forth text for a database indexing discussion produced 7,500 tokens of debate. Round 1: genuine disagreement. Round 2: restated positions. Round 3: identical arguments in different words. Structured dimension scoring replaced free-form debate, dropping cost by 60% while improving ranking quality.7

Single validation gate. The first implementation ran one validation hook at session end. An agent completed deliberation with a 0.52 consensus score (below threshold), then continued on unrelated tasks for 20 minutes before the session-end hook flagged the failure. Splitting into two gates (one at task completion, one at session end) caught the same problems at different lifecycle points.7

Cost of Deliberation

Each research agent processes roughly 5,000 tokens of context and generates 2,000-3,000 tokens of findings. With 3 agents, that is 15,000-24,000 additional tokens per decision. With 10 agents, roughly 50,000-80,000 tokens.7

At current Opus pricing, a 3-agent deliberation costs approximately $0.68-0.90. A 10-agent deliberation costs $2.25-3.00. The system triggers deliberation on roughly 10% of decisions, so the amortized cost across all decisions is $0.23-0.30 per session. Whether that is worth it depends on what a bad decision costs.

When to Deliberate

Deliberate Skip
Security architecture Documentation typos
Database schema design Variable renaming
API contract changes Log message updates
Deployment strategies Comment rewording
Dependency upgrades Test fixture updates

CLAUDE.md Design

CLAUDE.md is operational policy for an AI agent, not a README for humans.21 The agent does not need to understand why you use conventional commits. It needs to know the exact command to run and what “done” looks like.

The Precedence Hierarchy

Location Scope Shared Use Case
Enterprise managed settings Organization All users Company standards
./CLAUDE.md or ./.claude/CLAUDE.md Project Via git Team context
~/.claude/CLAUDE.md User All projects Personal preferences
./CLAUDE.local.md Project-local Never Personal project notes
.claude/rules/*.md Project rules Via git Categorized policies
~/.claude/rules/*.md User rules All projects Personal policies

Rules files load automatically and provide structured context without cluttering CLAUDE.md.6

What Gets Ignored

These patterns reliably produce no observable change in agent behavior:21

Prose paragraphs without commands. “We value clean, well-tested code” is documentation, not operations. The agent reads it and proceeds to write code without tests because there is no actionable instruction.

Ambiguous directives. “Be careful with database migrations” is not a constraint. “Run alembic check before applying migrations; abort if the downgrade path is missing” is.

Contradictory priorities. “Move fast and ship quickly” plus “Ensure comprehensive test coverage” plus “Keep runtime under 5 minutes” plus “Run full integration tests before every commit.” The agent cannot satisfy all four simultaneously and defaults to skipping verification.21

Style guides without enforcement. “Follow the Google Python Style Guide” without ruff check --select D gives the agent no mechanism to verify compliance.

What Works

Command-first instructions:

## Build and Test Commands
- Install: `pip install -r requirements.txt`
- Lint: `ruff check . --fix`
- Format: `ruff format .`
- Test: `pytest -v --tb=short`
- Type check: `mypy app/ --strict`
- Full verify: `ruff check . && ruff format --check . && pytest -v`

Closure definitions:

## Definition of Done
A task is complete when ALL of the following pass:
1. `ruff check .` exits 0
2. `pytest -v` exits 0 with no failures
3. `mypy app/ --strict` exits 0
4. Changed files have been staged and committed
5. Commit message follows conventional format: `type(scope): description`

Task-organized sections:

## When Writing Code
- Run `ruff check .` after every file change
- Add type hints to all new functions

## When Reviewing Code
- Check for security issues: `bandit -r app/`
- Verify test coverage: `pytest --cov=app --cov-fail-under=80`

## When Releasing
- Update version in `pyproject.toml`
- Run full suite: `pytest -v && ruff check . && mypy app/`

Escalation rules:

## When Blocked
- If tests fail after 3 attempts: stop and report the failing test with full output
- If a dependency is missing: check `requirements.txt` first, then ask
- Never: delete files to resolve errors, force push, or skip tests

Writing Order

If starting from scratch, add sections in this priority order:21

  1. Build and test commands (the agent needs these before it can do anything useful)
  2. Definition of done (prevents false completions)
  3. Escalation rules (prevents destructive workarounds)
  4. Task-organized sections (reduces irrelevant instruction parsing)
  5. Directory scoping (monorepos: keeps service instructions isolated)

Skip style preferences until the first four are working.

File Imports

Reference other files within CLAUDE.md:

See @README.md for project overview
Coding standards: @docs/STYLE_GUIDE.md
API documentation: @docs/API.md
Personal preferences: @~/.claude/preferences.md

Import syntax: relative (@docs/file.md), absolute (@/absolute/path.md), or home directory (@~/.claude/file.md). Maximum depth: 5 levels of imports.6

Cross-Tool Instruction Compatibility

AGENTS.md is an open standard recognized by every major AI coding tool.21 If your team uses multiple tools, write AGENTS.md as the canonical source and mirror relevant sections to tool-specific files:

Tool Native File Reads AGENTS.md?
Codex CLI AGENTS.md Yes (native)
Cursor .cursor/rules Yes (native)
GitHub Copilot .github/copilot-instructions.md Yes (native)
Amp AGENTS.md Yes (native)
Windsurf .windsurfrules Yes (native)
Claude Code CLAUDE.md No (separate format)

The patterns in AGENTS.md (command-first, closure-defined, task-organized) work in any instruction file regardless of tool. Do not maintain parallel instruction sets that drift apart. Write one authoritative source and mirror.

Codex Parity Notes

Codex now has first-class equivalents for the major harness layers, but the migration is a pattern translation, not a file copy. Codex reads AGENTS.md before work begins, layering global guidance from ~/.codex with project and nested repository instructions.31 Codex skills use the same SKILL.md mental model with progressive disclosure: Codex starts with the skill name, description, and file path, then loads the full skill only when it decides to use it.32 Codex also has native hooks, plugin-bundled hooks, managed hooks, MCP support, and explicit subagent workflows.3334

The practical mapping:

Claude Code harness layer Codex equivalent Migration rule
CLAUDE.md / .claude/rules/ AGENTS.md / nested AGENTS.override.md Keep commands and completion rules canonical; split only when directory scope genuinely differs
.claude/skills/<name>/SKILL.md .agents/skills/<name>/SKILL.md or plugin skill Port reusable workflows, but rewrite descriptions for Codex’s activation wording and budget
.claude/settings.json hooks Codex config.toml, plugin hooks, or managed requirements hooks Port deterministic gates first; test each hook with real tool events before enabling broadly
.claude/agents/*.md ~/.codex/agents/*.toml, .codex/agents/*.toml, or built-in worker / explorer Port only agents with repeated value; prefer explicit delegation because Codex subagents are explicit
Plugins Codex plugins Use plugins as the distribution unit after local hooks and skills are proven

The important difference: Claude subagents can be selected automatically from descriptions, while Codex currently documents subagent workflows as explicit. That makes skills and hooks the right default for always-on harness behavior in Codex; subagents are for deliberate parallel work, review, and exploration.

Testing Your Instructions

Verify the agent actually reads and follows your instructions:

# Check active instructions
claude --print "What instructions are you following for this project?"

# Verify specific rules are active
claude --print "What is your definition of done?"

The acid test: Ask the agent to explain your build commands. If it cannot reproduce them verbatim, the instructions are either too verbose (content pushed out of context), too vague (agent cannot extract actionable instructions), or not being discovered. GitHub’s analysis of 2,500 repositories found that vagueness causes most failures.21


Production Patterns

Opus 4.7 Long-Horizon Patterns (April 2026)

Claude Opus 4.7 (April 16, 2026) shipped with specific capabilities that change what a harness needs to defend against:29

  • Tool-failure resilience: Opus 4.7 continues through tool failures that halted Opus 4.6 sessions. You can reduce — but not eliminate — defensive retry wrappers in subagent code. Keep the hook-level guards; trim the in-prompt “if the tool fails, try again three times” scaffolding.
  • xhigh effort tier (Opus 4.7 only): Sits between high and max. Recommended default for coding and agentic workloads. On long-running subagents, xhigh meaningfully outperforms high with sub-proportional token cost. max remains the right choice for single-shot hard reasoning; xhigh is better for sustained tasks.
  • Token-budget ceiling: Configurable per agent run via output_config.task_budget (beta header task-budgets-2026-03-13). The model sees a running countdown and gracefully scopes work to the budget instead of running out unexpectedly. Use for agentic loops where you want predictable token spend without sacrificing quality on short prompts.
  • Implicit-need awareness: First Claude model to pass “implicit-need” tests — recognizing when the user’s literal request underspecifies what they actually need. This makes CLAUDE.md’s “clarifying rules” section less necessary. If your CLAUDE.md is 200 lines of “also consider X when the user asks for Y” guardrails, prune the ones that are now covered natively.
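The token-budget ceiling can be sketched as a request body. A minimal sketch, assuming only the output_config.task_budget field and the task-budgets-2026-03-13 beta header named above; the model string and the numeric values are illustrative placeholders:

```shell
# Hypothetical budgeted-run request body. Field name and beta header come
# from the bullet above; model name and numbers are placeholders.
cat > /tmp/budget-request.json <<'EOF'
{
  "model": "claude-opus-4-7",
  "max_tokens": 32000,
  "output_config": { "task_budget": 120000 },
  "messages": [
    { "role": "user", "content": "Process the next story in the queue." }
  ]
}
EOF

# The accompanying HTTP call would send the beta header, e.g.
#   anthropic-beta: task-budgets-2026-03-13
grep -q '"task_budget": 120000' /tmp/budget-request.json && echo "budget set"
```

The point of the countdown is predictable spend: the model scopes work to the budget instead of being cut off mid-task.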

Worktree Base, Sandbox Paths, and Admin Settings (May 7, 2026)

Claude Code v2.1.133 adds four admin-tier settings worth knowing about for production harnesses:39

Setting Values What it does
worktree.baseRef fresh (default) | head New worktrees branch from origin/<default> again. Breaking-default revert from v2.1.128, which had used local HEAD. Set worktree.baseRef: "head" if your team relies on unpushed commits being available in new worktrees.
sandbox.bwrapPath absolute path Pin the Bubblewrap binary location on Linux/WSL hosts where it is not on $PATH or where you ship a vendored version.
sandbox.socatPath absolute path Same idea for the socat binary used by sandbox networking.
parentSettingsBehavior 'first-wins' (default) | 'merge' Admin-tier control over how SDK managedSettings compose with parent enterprise/team settings. 'merge' lets a child session inherit and extend; 'first-wins' keeps the parent authoritative.

The worktree.baseRef revert is the one to flag for users: agents that relied on the v2.1.128-v2.1.132 behavior (worktrees branching from local HEAD) lose access to unpushed work in fresh worktrees unless they opt back in.
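As a sketch, the table above might translate into a settings fragment like this. The key names come from the table; the JSON nesting, paths, and chosen values are assumptions for illustration:

```shell
# Hypothetical settings fragment for the four admin-tier keys above.
# Whether the dotted names nest as objects is an assumption; the binary
# paths and the chosen values are illustrative.
cat > /tmp/admin-settings.json <<'EOF'
{
  "worktree": { "baseRef": "head" },
  "sandbox": {
    "bwrapPath": "/opt/vendored/bin/bwrap",
    "socatPath": "/opt/vendored/bin/socat"
  },
  "parentSettingsBehavior": "merge"
}
EOF
```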

The Quality Loop

A mandatory review process for all non-trivial changes:

  1. Implement - Write the code
  2. Review - Re-read every line. Catch typos, logic errors, unclear sections
  3. Evaluate - Run the evidence gate. Check patterns, edge cases, test coverage
  4. Refine - Fix every issue. Never defer to “later”
  5. Zoom Out - Check integration points, imports, adjacent code for regressions
  6. Repeat - If any evidence gate criterion fails, return to step 4
  7. Report - List what changed, how verified, cite specific evidence

The Evidence Gate

“I believe” and “it should” are not evidence. Cite file paths, test output, or specific code.

Criterion Required Evidence
Follows codebase patterns Name the pattern and file where it exists
Simplest working solution Explain what simpler alternatives were rejected and why
Edge cases handled List specific edge cases and how each is handled
Tests pass Paste test output showing 0 failures
No regressions Name the files/features checked
Solves the actual problem State user’s need and how this addresses it

If you cannot produce evidence for any row, return to Refine.22

Error Handling Patterns

Atomic file writes. Multiple agents writing to the same state file simultaneously corrupts JSON. Write to .tmp files, then mv atomically. The OS guarantees mv is atomic on the same filesystem.17

# Atomic state update
jq --argjson d "$new_depth" '.depth = $d' "$STATE_FILE" > "${STATE_FILE}.tmp"
mv "${STATE_FILE}.tmp" "$STATE_FILE"
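Atomic mv prevents torn reads, but two agents doing a read-modify-write can still lose an update between the jq read and the rename. A hedged sketch that serializes the whole update under an advisory lock (assumes util-linux flock; paths are illustrative):

```shell
# Serialize read-modify-write across concurrent agents with an advisory lock.
# Assumes util-linux `flock`; file paths are illustrative.
STATE_FILE=/tmp/state.json
echo '{"depth": 0}' > "$STATE_FILE"

update_state() {
    local new_depth=$1
    (
        flock -x 9   # block until we hold the exclusive lock
        jq --argjson d "$new_depth" '.depth = $d' "$STATE_FILE" > "${STATE_FILE}.tmp"
        mv "${STATE_FILE}.tmp" "$STATE_FILE"
    ) 9>"${STATE_FILE}.lock"
}

update_state 3
jq .depth "$STATE_FILE"   # → 3
```

The lock covers the read, the transform, and the rename as one critical section; the atomic mv alone only guarantees readers never see a half-written file.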

State corruption recovery. If state gets corrupted, the recovery pattern recreates from safe defaults rather than crashing:16

if ! jq -e '.depth' "$RECURSION_STATE_FILE" &>/dev/null; then
    # Corrupted state file, recreate with safe defaults
    echo '{"depth": 0, "agent_id": "root", "parent_id": null}' > "$RECURSION_STATE_FILE"
    echo "- Recursion state recovered (was corrupted)"
fi

The ((VAR++)) bash trap. ((VAR++)) returns exit code 1 when VAR is 0: post-increment yields the old value, 0, and bash treats a zero-valued arithmetic expression as false. With set -e enabled, this kills the script. Use VAR=$((VAR + 1)) instead.16
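A minimal demonstration of the trap and its fix:

```shell
# Under `set -e`, ((COUNT++)) aborts the script when COUNT is 0:
# post-increment yields the old value 0, which bash maps to exit status 1.
COUNT=0
# ((COUNT++))          # would exit a `set -e` script right here
COUNT=$((COUNT + 1))   # safe: an assignment always returns status 0
echo "count=$COUNT"    # → count=1
```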

Blast Radius Classification

Classify every agent action by blast radius and gate accordingly:2

Classification Examples Gate
Local File writes, test runs, linting Auto-approve
Shared Git commits, branch creation Warn + proceed
External Git push, API calls, deployments Require human approval
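A hedged sketch of the table as a classifier. The exit codes follow the 0/1/2 hook convention used throughout this guide; the command regexes are illustrative, not exhaustive:

```shell
# classify_blast_radius: map a bash command to a gate decision.
# Regexes are illustrative placeholders, not a complete policy.
classify_blast_radius() {
    local cmd="$1"
    if echo "$cmd" | grep -qE '\bgit push\b|curl |wget |deploy'; then
        echo "external"; return 2          # require human approval
    elif echo "$cmd" | grep -qE '\bgit (commit|checkout -b|branch)\b'; then
        echo "shared"; return 1            # warn + proceed
    else
        echo "local"; return 0             # auto-approve
    fi
}

classify_blast_radius "pytest -q"                     # → local
classify_blast_radius "git commit -m wip" || true     # → shared
classify_blast_radius "git push origin main" || true  # → external
```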

Remote Control (connecting to local Claude Code from any browser or mobile app) turns the “External” gate from a blocking wait into an async notification. The agent keeps working on the next task while you review the previous one from your phone.2

Task Specification for Autonomous Runs

Effective autonomous tasks include three elements: objective, completion criteria, and context pointers:16

OBJECTIVE: Implement multi-agent deliberation with consensus validation.

COMPLETION CRITERIA:
- All tests in tests/test_deliberation_lib.py pass (81 tests)
- post-deliberation.sh validates consensus above 70% threshold
- recursion-guard.sh enforces spawn budget (max 12 agents)
- No Python type errors (mypy clean)

CONTEXT:
- Follow patterns in lib/deliberation/state_machine.py
- Consensus thresholds in configs/deliberation-config.json
- Spawn budget model: agents inherit budget, not increment depth

Criteria must be machine-verifiable: test pass/fail, linter output, HTTP status codes, file existence checks. An early task that asked the agent to “write tests that pass” produced assert True and assert 1 == 1. Technically correct. Practically worthless.16

Criteria Quality Example Outcome
Vague “Tests pass” Agent writes trivial tests
Measurable but incomplete “Tests pass AND coverage >80%” Tests cover lines but test nothing meaningful
Comprehensive “All tests pass AND coverage >80% AND no type errors AND linter clean AND each test class tests a distinct module” Production-quality output

Failure Modes to Watch For

Failure Mode Description Prevention
Shortcut Spiral Skipping quality loop steps to finish faster Evidence gate requires proof for each criterion
Confidence Mirage “I’m confident” without running verification Ban hedging language in completion reports
Phantom Verification Claiming tests pass without running them this session Stop hook runs tests independently
Deferred Debt TODO/FIXME/HACK in committed code PreToolUse hook on git commit scans diff
Filesystem Pollution Dead-end artifacts from abandoned iterations Cleanup step in completion criteria
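The Phantom Verification row can be sketched as a Stop hook that reruns the suite itself instead of trusting the transcript. A minimal sketch; the test command and paths are illustrative:

```shell
# Hypothetical Stop hook: rerun the suite instead of trusting the agent's
# self-report. Swap TEST_CMD for your project's real runner.
cat > /tmp/stop-hook.sh <<'EOF'
#!/usr/bin/env bash
TEST_CMD="${TEST_CMD:-pytest -q}"
if ! $TEST_CMD > /tmp/stop-hook-tests.log 2>&1; then
    echo "BLOCKED: tests fail in this session. See /tmp/stop-hook-tests.log" >&2
    exit 2   # block the stop; the agent keeps working
fi
exit 0       # evidence confirmed, allow the stop
EOF

TEST_CMD=true  bash /tmp/stop-hook.sh && echo "stop allowed"
TEST_CMD=false bash /tmp/stop-hook.sh || echo "stop blocked"
```

Because the hook runs the tests itself, "tests pass" becomes a claim the harness verifies, not one the model asserts.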

A Concrete Session Trace

A session trace from an autonomous run processing a PRD with 5 stories:2

  1. SessionStart fires. Dispatcher injects: current date, project detection, philosophy constraints, cost tracking initialization. Five hooks, 180ms total.

  2. Agent reads the PRD, plans the first story. UserPromptSubmit fires. Dispatcher injects: active project context, session drift baseline.

  3. Agent calls Bash to run tests. PreToolUse:Bash fires. Credentials check, sandbox validation, project detection. 90ms. Tests run. PostToolUse:Bash fires: activity heartbeat logged, drift check.

  4. Agent calls Write to create a file. PreToolUse:Write fires: file scope check. PostToolUse:Write fires: lint check, commit tracking.

  5. Agent finishes the story. Stop fires. Quality gate checks: did the agent cite evidence? Hedging language? TODO comments in the diff? If any check fails, exit 2 and the agent continues.

  6. Independent verification: A fresh agent runs the test suite without trusting the previous agent’s self-report.

  7. Three code review agents spawn in parallel. Each reviews the diff independently. If any reviewer flags CRITICAL, the story goes back in the queue.

  8. Story passes. Next story loads. The cycle repeats for all 5 stories.

Total hooks fired across 5 stories: ~340. Total time in hooks: ~12 seconds. That overhead prevented three credential leaks, one destructive command, and two incomplete implementations in a single overnight run.

Case Study: Overnight PRD Processing

A production harness processed 12 PRDs (47 stories) across 8 overnight sessions. Metrics compare the first 4 PRDs (minimal harness: CLAUDE.md only) against the last 8 (full harness: hooks, skills, quality gates, multi-agent review).

Metric Minimal (4 PRDs) Full Harness (8 PRDs) Change
Credential leaks 2 leaked to git 7 blocked pre-commit Reactive to preventive
Destructive commands 1 force-push to main 4 blocked Exit 2 enforcement
False completion rate 35% failed tests 4% Evidence gate + Stop hook
Revision rounds/story 2.1 0.8 Skills + quality loop
Context degradation 6 incidents 1 incident Filesystem memory
Token overhead 0% ~3.2% Negligible
Hook time/story 0s ~2.4s Negligible

The two credential leaks required rotating API keys and auditing downstream services: roughly 4 hours of incident response. The harness overhead that prevented the equivalent was 2.4 seconds of bash per story. The false completion rate dropped from 35% to 4% because the Stop hook independently ran tests before allowing the agent to report done.


Security Considerations

The Five Principles of Trustworthy Agents (Anthropic, April 2026)

Anthropic published a formal framework for agent trustworthiness on April 9, 2026.27 The five principles parallel — and extend — the Evidence Gate thinking in this guide:

Principle What it means How this harness satisfies it
Human control Meaningful human override at every decision point Hooks gate tool calls; PreCompact blocking; Auto Mode classifier as check-layer
Value alignment Agent actions track user intent, not adjacent goals CLAUDE.md as explicit intent specification; skills as capability scoping
Security Resistance to adversarial inputs and prompt injection Sandbox + deny-rules + input validation at the hook layer
Transparency Auditable records of decisions and actions Hook logging; session transcripts; skill-invocation traces
Privacy Appropriate data handling and governance Credential env-var scrubbing; secret detection at hook layer

Anthropic also donated MCP to the Linux Foundation’s Agentic AI Foundation, joining AGENTS.md (now jointly stewarded with OpenAI, Google, Cursor, Factory, Sourcegraph). Agent interoperability standards are now vendor-neutral.27

Skill sandbox tooling: For teams that treat skills as an attack surface, Permiso’s SandyClaw (launched April 2, 2026) runs skills in a dedicated sandbox and delivers evidence-backed verdicts from Sigma/YARA/Nova/Snort detection. First product in the skill-sandbox category.28

The Sandbox

Claude Code supports an optional sandbox mode (enabled via settings.json or the /sandbox command) that restricts network access and filesystem operations using OS-level isolation (seatbelt on macOS, bubblewrap on Linux). When enabled, the sandbox prevents the model from making arbitrary network requests or accessing files outside the project directory. Without sandboxing, Claude Code uses a permission-based model where you approve or deny individual tool calls.13

Permission Boundaries

The permission system gates operations at multiple levels:

Level Controls Example
Tool permissions Which tools can be used Restrict subagent to Read, Grep, Glob
File permissions Which files can be modified Block writes to .env, credentials.json
Command permissions Which bash commands can run Block rm -rf, git push --force
Network permissions Which domains can be accessed Allowlist for MCP server connections

Prompt Injection Defense

Skills and hooks provide defense-in-depth against prompt injection:

Skills with tool restrictions prevent a compromised prompt from gaining write access:

allowed-tools: Read, Grep, Glob

PreToolUse hooks validate every tool call regardless of how the model was prompted:

# Block credential file access regardless of prompt.
# Hook input arrives as JSON on stdin; extract the target path first.
FILE_PATH=$(jq -r '.tool_input.file_path // empty')
if echo "$FILE_PATH" | grep -qE "\.(env|pem|key|credentials)$"; then
    echo "BLOCKED: Sensitive file access" >&2
    exit 2
fi

Subagent isolation limits blast radius. A subagent with permissionMode: plan cannot make changes even if its prompt is compromised.

Hook Security

HTTP hooks that interpolate environment variables into headers require an explicit allowedEnvVars list to prevent arbitrary environment variable exfiltration:13

{
  "type": "http",
  "url": "https://api.example.com/notify",
  "headers": {
    "Authorization": "Bearer $MY_TOKEN"
  },
  "allowedEnvVars": ["MY_TOKEN"]
}

The Human-Agent Division of Responsibility

Security in agent architectures requires a clear division between human and agent responsibilities:17

Human Responsibility Agent Responsibility
Problem definition Pipeline execution
Confidence thresholds Execution within thresholds
Consensus requirements Consensus computation
Quality gate criteria Quality gate enforcement
Error analysis Error detection
Architecture decisions Architecture options
Domain context injection Documentation generation

The pattern: humans own decisions that require organizational context, ethical judgment, or strategic direction. Agents own decisions that require computational search across large possibility spaces. Hooks enforce the boundary.

Recursive Hook Enforcement

Hooks fire for subagent actions too.13 If Claude spawns a subagent via the Agent tool, your PreToolUse and PostToolUse hooks execute for every tool the subagent uses. Without recursive hook enforcement, a subagent could bypass your safety gates. The SubagentStop event lets you run cleanup or validation when a subagent completes.

This is not optional. An agent that spawns a subagent outside your security hooks is an agent that can force-push to main, read credential files, or run destructive commands while your gates, which see only the main conversation, do nothing.

Cost as Architecture

Cost is an architectural decision, not an operational afterthought.2 Three levels:

Token level. System prompt compression. Remove tutorial code examples (the model knows the APIs), collapse duplicate rules across files, and replace explanations with constraints. “Reject tool calls matching sensitive paths” does the same work as a 15-line explanation of why credentials should not be read.

Agent level. Fresh spawns over long conversations. Each story in an autonomous run gets a new agent with a clean context. The context never balloons because each agent starts fresh. Briefing instead of memory: models execute a clear briefing better than they navigate 30 steps of accumulated context.

Architecture level. CLI-first over MCP when the operation is stateless. A claude --print call for a one-shot evaluation costs less and adds no connection overhead. MCP makes sense when the tool needs persistent state or streaming.


Decision Framework

When to use each mechanism:

Problem Use Why
Format code after every edit PostToolUse hook Must happen every time, deterministically
Block dangerous bash commands PreToolUse hook Must block before execution, exit code 2
Apply security review patterns Skill Domain expertise that auto-activates on context
Explore codebase without polluting context Explore subagent Isolated context, returns summary only
Run experimental refactoring safely Worktree-isolated subagent Changes can be discarded if they fail
Review code from multiple perspectives Parallel subagents or Agent Team Independent evaluation prevents blind spots
Decide on irreversible architecture Multi-agent deliberation Confidence trigger + consensus validation
Persist decisions across sessions MEMORY.md Filesystem survives context boundaries
Share team standards Project CLAUDE.md + .claude/rules/ Git-distributed, loads automatically
Define project build/test commands CLAUDE.md Command-first instructions the agent can verify
Run long autonomous development Ralph loop (fresh-context iteration) Full context budget per iteration, filesystem state
Notify Slack when session ends Async Stop hook Non-blocking, does not slow the session
Validate quality before commit PreToolUse hook on git commit Block the commit if lint/tests fail
Enforce completion criteria Stop hook Prevent agent from stopping before task is done

Skills vs Hooks vs Subagents

Dimension Skills Hooks Subagents
Invocation Automatic (LLM reasoning) Deterministic (event-driven) Explicit or auto-delegated
Guarantee Probabilistic (model decides) Deterministic (always fires) Deterministic (isolated context)
Context cost Injected into main context Zero (runs outside LLM) Separate context window
Token cost Description budget (2% of window) Zero Full context per subagent
Best for Domain expertise Policy enforcement Focused work, exploration

FAQ

How many hooks is too many?

Performance, not count, is the constraint. Each hook runs synchronously, so total hook execution time adds to every matched tool call. 95 hooks across user-level and project-level settings run without noticeable latency when each hook completes in under 200ms. The threshold to watch: if a PostToolUse hook adds more than 500ms to every file edit, the session feels sluggish. Profile your hooks with time before deploying them.14
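As a concrete profiling step, time the hook against a representative payload before enabling it broadly. A sketch, assuming hooks receive the tool call as JSON on stdin; the stand-in hook body here is a placeholder for your real script:

```shell
# Create a stand-in PreToolUse:Bash hook, then time it with a sample payload.
# The hook body is a placeholder; the payload shape assumes JSON-on-stdin input.
cat > /tmp/demo-hook.sh <<'EOF'
#!/usr/bin/env bash
input=$(cat)
echo "$input" | grep -q 'rm -rf /' && exit 2
exit 0
EOF

payload='{"tool_name":"Bash","tool_input":{"command":"ls"}}'
time (echo "$payload" | bash /tmp/demo-hook.sh)
```

If the real timing exceeds a few hundred milliseconds on every matched tool call, that latency multiplies across the session.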

Can hooks block Claude Code from running a command?

Yes. PreToolUse hooks block any tool action by exiting with code 2. Claude Code cancels the pending action and shows the hook’s stderr output to the model. Claude sees the rejection reason and suggests a safer alternative. Exit 1 is a non-blocking warning where the action still proceeds.3

Where should I put hook configuration files?

Hook configurations go in .claude/settings.json for project-level hooks (committed to your repository, shared with your team) or ~/.claude/settings.json for user-level hooks (personal, applied to every project). Project-level hooks take precedence when both exist. Use absolute paths for script files to avoid working-directory issues.14

Does every decision need deliberation?

No. The confidence module scores decisions across four dimensions (ambiguity, complexity, stakes, context dependency). Only decisions scoring below 0.70 overall confidence trigger deliberation, roughly 10% of total decisions. Documentation fixes, variable renames, and routine edits skip deliberation entirely. Security architecture, database schema changes, and irreversible deployments trigger it consistently.7
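A minimal sketch of such a trigger, assuming each argument is a per-dimension confidence score in [0,1] and that the overall score is their unweighted mean; the real module's weighting and scoring are not specified here:

```shell
# Deliberate only when mean confidence across the four dimensions < 0.70.
# Equal weighting is an assumption; the scores below are illustrative.
needs_deliberation() {
    local ambiguity=$1 complexity=$2 stakes=$3 context_dep=$4
    awk -v a="$ambiguity" -v c="$complexity" -v s="$stakes" -v d="$context_dep" \
        'BEGIN { exit !(((a + c + s + d) / 4) < 0.70) }'
}

if needs_deliberation 0.4 0.5 0.6 0.5; then
    echo "trigger deliberation"   # mean 0.50, below the 0.70 threshold
else
    echo "proceed directly"
fi
```

Routine edits score near 1.0 on every dimension and skip the gate; irreversible decisions drag the mean down and trigger it.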

How do I test a system designed to produce disagreement?

Test both success paths and failure paths. Success: agents disagree productively and reach consensus. Failure: agents converge too quickly, never converge, or exceed spawn budgets. End-to-end tests simulate each scenario with deterministic agent responses, verifying that both validation gates catch every documented failure mode. A production deliberation system runs 141 tests across three layers: 48 bash integration tests, 81 Python unit tests, and 12 end-to-end pipeline simulations.7

What is the latency impact of deliberation?

A 3-agent deliberation adds 30-60 seconds of wall-clock time (agents run sequentially through the Agent tool). A 10-agent deliberation adds 2-4 minutes. The consensus and pride check hooks each run in under 200ms. The primary bottleneck is LLM inference time per agent, not orchestration overhead.7

How long should a CLAUDE.md file be?

Keep each section under 50 lines and the total file under 150 lines. Long files get truncated by context windows, so front-load the most critical instructions: commands and closure definitions before style preferences.21

Can this work with tools other than Claude Code?

The architectural principles (hooks as deterministic gates, skills as domain expertise, subagents as isolated contexts, filesystem as memory) apply conceptually to any agentic system. The specific implementation uses Claude Code’s lifecycle events, matcher patterns, and Agent tool. AGENTS.md carries the same patterns to Codex, Cursor, Copilot, Amp, and Windsurf.21 The harness pattern is tool-agnostic even if the implementation details are tool-specific.


Quick Reference Card

Hook Configuration

{
  "hooks": {
    "PreToolUse": [{"matcher": "Bash", "hooks": [{"type": "command", "command": "script.sh"}]}],
    "PostToolUse": [{"matcher": "Write|Edit", "hooks": [{"type": "command", "command": "format.sh"}]}],
    "Stop": [{"matcher": "", "hooks": [{"type": "agent", "prompt": "Verify tests pass. $ARGUMENTS"}]}],
    "SessionStart": [{"matcher": "", "hooks": [{"type": "command", "command": "setup.sh"}]}]
  }
}

Skill Frontmatter

---
name: my-skill
description: What it does and when to use it. Include trigger phrases.
allowed-tools: Read, Grep, Glob
---

Subagent Definition

---
name: my-agent
description: When to invoke. Include PROACTIVELY for auto-delegation.
tools: Read, Grep, Glob, Bash
model: opus
permissionMode: plan
---

Instructions for the subagent.

Exit Codes

Code Meaning Use For
0 Success Allow the operation
2 Block Security gates, quality gates
1 Non-blocking warning Logging, advisory messages

Key Commands

Command Purpose
/compact Compress context, preserve decisions
/context View context allocation and active skills
/agents Manage subagents
claude -c Continue most recent session
claude --print One-shot CLI invocation (no conversation)
# <note> Add note to memory file
/memory View and manage auto-memory

File Locations

Path Purpose
~/.claude/CLAUDE.md Personal global instructions
.claude/CLAUDE.md Project instructions (git-shared)
.claude/settings.json Project hooks and permissions
~/.claude/settings.json User hooks and permissions
~/.claude/skills/<name>/SKILL.md Personal skills
.claude/skills/<name>/SKILL.md Project skills (git-shared)
~/.claude/agents/<name>.md Personal subagent definitions
.claude/agents/<name>.md Project subagent definitions
.claude/rules/*.md Project rule files
~/.claude/rules/*.md User rule files
~/.claude/projects/{path}/memory/MEMORY.md Auto-memory

Changelog

Date Change
2026-05-08 Guide v1.6: Day-2 follow-up on Claude Code v2.1.132/v2.1.133 + SDK v0.1.77. Added SDK Skill Surface subsection to Skills System covering the skills option on ClaudeAgentOptions and the deprecation of "Skill" in allowed_tools.37 Added Effort and Session Provenance subsection to Hook Architecture covering the new effort.level JSON field + $CLAUDE_EFFORT env var on hook input, and the CLAUDE_CODE_SESSION_ID env var on Bash subprocesses.3839 Added Subagent skill discovery fix to the Subagent Configuration Fields table (subagents now discover project, user, and plugin skills via the Skill tool, silently dropped before v2.1.133).39 Added Worktree Base, Sandbox Paths, and Admin Settings subsection to Production Patterns covering worktree.baseRef (breaking-default revert back to origin/<default> from local HEAD), sandbox.bwrapPath, sandbox.socatPath, and parentSettingsBehavior.39
2026-05-07 Guide v1.5: Claude Managed Agents, May 6 SF expansion. Added Strategy 5 (Managed Memory Curation: Dreaming, Research Preview) to Memory and Context with table contrasting filesystem-as-memory vs. Dreaming.35 Added Managed Multiagent Orchestration (Public Beta) and Outcomes (Public Beta) at the top of Multi-Agent Orchestration with verbatim Anthropic quotes on shared-filesystem specialists and Claude Console tracing, plus a comparison table vs. self-hosted deliberation. Added SDK-side hook event streaming subsection covering claude-agent-sdk-python v0.1.74’s include_hook_events and HookEventMessage.36 Changelog-only: Claude Code v2.1.124-v2.1.131 (claude project purge, --dangerously-skip-permissions for project dirs, skill_activated invocation_trigger, PostToolUse format-on-save fix, PreToolUse JSON+exit-2 blocking fix, skillOverrides settings); claude-agent-sdk-python v0.1.72 (CLI 2.1.126), v0.1.73 (session_store_flush), v0.1.75 (CLI 2.1.131), v0.1.76 (api_error_status); openai-agents-python v0.15.0-v0.16.1 with v0.16.0 (May 7) defaulting to gpt-5.4-mini, removing the implicit max_turns ceiling, and adding SDK-side tool execution concurrency.
2026-05-07 Guide v1.4: Refreshed Claude Code hook and skill mechanics against current official docs and local runtime evidence (claude --version 2.1.132, codex --version returned codex-cli 0.128.0). Updated the hook surface from 22/26+ to 29 documented events, fixed skill-description budget from 2%/16,000 to 1%/8,000, changed hook-type count from four to five with mcp_tool, removed the unsupported fixed “10 parallel subagents” claim, and added a public-safe Codex parity section covering AGENTS.md, skills, hooks, plugins, and explicit subagent workflows.
2026-04-29 Guide v1.3: Expanded the OpenAI Agents SDK coverage in the Managed vs. Self-Hosted Harnesses section with the named SDK surface from openai-agents Python v0.14.0 (April 15) — SandboxAgent, Manifest, SandboxRunConfig, sandbox memory with progressive disclosure, workspace mounts (S3/R2/GCS/Azure), portable snapshots, and the local/Docker/hosted client backends (Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, Vercel). Replaced the secondary Help Net Security citation with the primary v0.14.0 release-notes citation. Added a short note on claude-agent-sdk-python v0.1.69-v0.1.71 (April 28-29) as the third self-hosted option (embed Claude Code runtime as a Python library): bundled Claude CLI bumped to v2.1.123, raised mcp dependency floor to >=1.19.0 (older versions silently dropped CallToolResult from in-process MCP tools), Trio nursery cancellation fix, and SandboxNetworkConfig allowlist-field parity with TS SDK. v0.14.7-v0.14.8 SDK refinements documented in [^58].
2026-04-25 Guide v1.2: Google Cloud Next 2026 (April 22-24) — Vertex AI rebranded to Gemini Enterprise Agent Platform; Agentspace absorbed into unified Gemini Enterprise; Workspace Studio (no-code agent builder); 200+ models in Model Garden including Anthropic Claude; partner agents from Box, Workday, Salesforce, ServiceNow; ADK v1.0 stable across four languages; Project Mariner (web-browsing agent); managed MCP servers with Apigee as API-to-agent bridge; A2A protocol v1.0 in production at 150 organizations. Microsoft Agent Framework 1.0 (April 2026): stable APIs, LTS commitment, full MCP support, .NET + Python. The browser-based DevUI that visualizes agent execution and tool calls in real time ships as a preview alongside the 1.0 stable surface. Salesforce Headless 360 (April 15, TDX): every Salesforce capability (CRM, service, marketing, ecommerce) exposed as API/MCP tool/CLI command so agents like Claude Code, Cursor, and Codex can build on the platform without a browser. (TDX 2026 ran April 15-16; the Headless 360 announcement is dated April 15.) MetaComp StableX KYA (April 21): Know Your Agent governance framework for regulated financial services (payments, compliance, wealth) — first of its kind from a licensed financial institution; available across Claude, Claude Code, OpenClaw, and other compatible AI platforms. Claude Managed Agents pricing: $0.08 per session-hour while a session is running, with no runtime charge while idle — on top of normal Claude model token rates. (Per Anthropic’s Claude pricing page; the public-beta launch was April 8, 2026.) Memory for Managed Agents entered public beta on April 23, 2026 under the managed-agents-2026-04-01 beta header. All Managed Agents endpoints now require this beta header.
2026-04-16 Guide v1.1: Added Managed vs. Self-Hosted Harnesses section covering Claude Managed Agents (April 8 beta) and OpenAI Agents SDK harness/compute separation (April 16). Added Scion cross-tool multi-agent hypervisor (April 7, Google). Documented M3MAD-Bench debate plateau finding. Added The Five Principles of Trustworthy Agents (Anthropic, April 9) + MCP/AGENTS.md Linux Foundation governance. Permiso SandyClaw skill-sandbox reference. New Opus 4.7 Long-Horizon Patterns: tool-failure resilience, xhigh effort tier, token-budget ceiling (task_budget beta), implicit-need awareness reducing CLAUDE.md scaffolding.
2026-03-24 Initial publication

References


  1. Andrej Karpathy on “claws” as a new layer on top of LLM agents. HN discussion (406 points, 917 comments). 

  2. Author’s implementation. 84 hooks, 48 skills, 19 agents, ~15,000 lines of orchestration. Documented in Claude Code as Infrastructure

  3. Anthropic, “Claude Code Hooks: Exit Codes.” code.claude.com/docs/en/hooks. Exit 0 allows, exit 2 blocks, exit 1 warns for most events; WorktreeCreate is stricter. 

  4. Anthropic, “Extend Claude with Skills.” code.claude.com/docs/en/skills. Skill structure, frontmatter fields, LLM-based matching, and 1% / 8,000-character description budget. 

  5. Anthropic, “Claude Code Sub-agents.” code.claude.com/docs/en/sub-agents. Isolated context, worktree support, agent teams. 

  6. Anthropic, “Claude Code Documentation.” docs.anthropic.com/en/docs/claude-code. Memory files, CLAUDE.md, auto-memory. 

  7. Author’s multi-agent deliberation system. 10 research personas, 7-phase state machine, 141 tests. Documented in Multi-Agent Deliberation

  8. Simon Willison, “Writing code is cheap now.” Agentic Engineering Patterns

  9. Laban, Philippe, et al., “LLMs Get Lost In Multi-Turn Conversation,” arXiv:2505.06120, May 2025. Microsoft Research and Salesforce. 15 LLMs, 200,000+ conversations, 39% average performance drop. 

  10. Mikhail Shilkov, “Inside Claude Code Skills: Structure, Prompts, Invocation.” mikhail.io. Independent analysis of skill discovery, context injection, and available_skills prompt section. 

  11. Claude Code Source, SLASH_COMMAND_TOOL_CHAR_BUDGET. github.com/anthropics/claude-code

  12. Anthropic, “Skill Authoring Best Practices.” platform.claude.com. 500-line limit, supporting files, naming conventions. 

  13. Anthropic, “Claude Code Hooks: Lifecycle Events.” code.claude.com/docs/en/hooks. 29 documented lifecycle events, hook types, matcher behavior, async hooks, HTTP hooks, prompt hooks, agent hooks, and MCP tool hooks. 

  14. Author’s Claude Code hooks tutorial. 5 production hooks from scratch. Documented in Claude Code Hooks Tutorial

  15. Author’s context window management across 50 sessions. Documented in Context Window Management

  16. Author’s Ralph Loop implementation. Fresh-context iteration with filesystem state, spawn budgets. Documented in The Ralph Loop

  17. Author’s deliberation system architecture. 3,500 lines of Python, 12 modules, confidence trigger, consensus validation. Documented in Building AI Systems: From RAG to Agents

  18. Nemeth, Charlan, In Defense of Troublemakers: The Power of Dissent in Life and Business, Basic Books, 2018. 

  19. Wu, H., Li, Z., and Li, L., “Can LLM Agents Really Debate?” arXiv:2511.07784, 2025. 

  20. Liang, T. et al., “Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate,” EMNLP 2024

  21. Author’s AGENTS.md analysis across real-world repositories. Documented in AGENTS.md Patterns. See also: GitHub Blog, “How to Write a Great agents.md: Lessons from Over 2,500 Repositories.” 

  22. Author’s quality loop and evidence gate methodology. Part of the Jiro Craftsmanship system. 

  23. Anthropic, “Claude Managed Agents Overview”. Public beta launched April 8, 2026. Harness-as-a-service with session checkpointing, bundled sandbox, REST API. Pricing: standard tokens + $0.08/session-hour. Beta header: managed-agents-2026-04-01.

  24. OpenAI, “openai-agents Python v0.14.0 release notes”. Released April 15, 2026. Introduces the Sandbox Agents SDK surface as a beta layer over the existing Agent / Runner flow: SandboxAgent, Manifest (workspace contract), SandboxRunConfig, capabilities (shell, filesystem editing, image inspection, skills, sandbox memory, compaction), workspace mounts (local, Git, remote: S3, R2, GCS, Azure Blob, S3 Files), portable snapshots with path normalization and symlink preservation, and run-state serialization for resume. Backends: UnixLocalSandboxClient, DockerSandboxClient, and hosted clients for Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel via optional extras. The April 16 announcement is summarized at Help Net Security.

  25. Google Cloud, “Scion: Multi-Agent Hypervisor”. Open-sourced April 7, 2026. Orchestrates Claude Code, Gemini CLI, and other deep agents as isolated processes with per-agent container, git worktree, and credentials. Local/hub/Kubernetes deployment modes. InfoQ coverage.

  26. Multi-agent debate research cluster, Q1–Q2 2026. Wu et al., “Can LLM Agents Really Debate?” (arXiv 2511.07784); M3MAD-Bench — multi-model multi-agent debate benchmark showing performance plateaus and susceptibility to misleading consensus; Tool-MAD — heterogeneous tool assignment per agent + Faithfulness/Relevance judge scores. 

  27. Anthropic, “Our framework for developing safe and trustworthy agents”. April 9, 2026. Five principles: human control, value alignment, security, transparency, privacy. MCP donation to Linux Foundation’s Agentic AI Foundation. 

  28. Permiso Security, “SandyClaw: First Dynamic Sandbox for AI Agent Skills”. April 2, 2026. Skill execution sandbox with Sigma/YARA/Nova/Snort detection and evidence-backed verdicts. 

  29. Anthropic, “Introducing Claude Opus 4.7”. April 16, 2026. Long-horizon agent improvements: 3× SWE-Bench production task resolution vs Opus 4.6, tool-failure resilience, xhigh effort tier, task budgets (beta), implicit-need awareness. See also What’s new in Opus 4.7 for Messages API breaking changes. 

  30. Composite reference: OpenAI openai-agents-python v0.14.7 (April 28, 2026) and v0.14.8 (April 29, 2026); Anthropic claude-agent-sdk-python v0.1.69 (April 28), v0.1.70 (April 28), and v0.1.71 (April 29). v0.14.7 highlights: tool_name/call_id convenience properties on tool items, raised Phase 2 memory consolidation turn limit, GPT-5.5 aliases for sandbox compaction, tar/zip member validation tightening, symlink rejection on LocalFile sources, removal of unset fields from Responses API calls. v0.14.8 highlights: preserve MCP re-export import errors, delimit sandbox prompt-instruction sections. claude-agent-sdk-python v0.1.69 added docstrings to ClaudeAgentOptions fields and bumped the bundled CLI to v2.1.121; v0.1.70 raised the mcp dependency floor to >=1.19.0 (older versions silently dropped CallToolResult returns from in-process MCP tool handlers), fixed Trio nursery corruption on early cancellation when iterating query() with options.stderr set (spawn_detached() is now used for the stderr reader), and bumped the bundled CLI to v2.1.122; v0.1.71 added domain-allowlist fields (allowedDomains, deniedDomains, allowManagedDomainsOnly, allowMachLookup) to SandboxNetworkConfig for parity with the TypeScript schema, and bumped the bundled CLI to v2.1.123.

  31. OpenAI, “Custom instructions with AGENTS.md”. Codex reads global and project AGENTS.md / AGENTS.override.md files before work, merges root-to-current-directory guidance, and caps project docs by project_doc_max_bytes.

  32. OpenAI, “Agent Skills”. Codex skills use SKILL.md, progressive disclosure, explicit $skill invocation, and implicit activation from descriptions. 

  33. OpenAI, “Codex Hooks”. Codex hooks support command hooks in config, plugin hooks, managed hooks, matchers for supported events, stdin JSON input, and JSON output fields. 

  34. OpenAI, “Codex Subagents” and “Codex CLI 0.128.0 changelog”. Codex supports explicit parallel subagent workflows, built-in default, worker, and explorer agents, custom TOML agents, inherited sandbox policy, plugin-bundled hooks, hook enablement state, and persisted /goal workflows in 0.128.0. 

  35. Anthropic, “New in Claude Managed Agents”. May 6, 2026. Dreaming (Research Preview): scheduled background process that reviews agent sessions and memory stores, extracts patterns, and curates memories. Outcomes (Public Beta): rubric-based evaluation in which a separate grader scores output against the rubric in its own context window so it is not influenced by the agent’s reasoning. Multiagent Orchestration (Public Beta): lead agent delegates pieces of a job to specialists, each with its own model, prompt, and tools; specialists work in parallel on a shared filesystem and contribute to the lead agent’s overall context, with full per-step tracing in the Claude Console. 

  36. Anthropic, claude-agent-sdk-python v0.1.74. May 6, 2026. Adds include_hook_events to ClaudeAgentOptions; when set, hook events (PreToolUse, PostToolUse, Stop, others) are emitted by the CLI and yielded from the message stream as HookEventMessage, mirroring the TypeScript SDK’s includeHookEvents. Bundled Claude CLI bumped to v2.1.129. 

  37. Anthropic, claude-agent-sdk-python v0.1.77. May 8, 2026. Deprecates the "Skill" value in allowed_tools in favor of a dedicated skills option on ClaudeAgentOptions, gives Claude Code more structured signal about available skills, improves error messages on Command failed exceptions, and bundles Claude CLI v2.1.133. 

  38. Anthropic, Claude Code v2.1.132. May 6, 2026. Adds CLAUDE_CODE_SESSION_ID env var on Bash tool subprocesses (matches the session_id hooks already see), CLAUDE_CODE_DISABLE_ALTERNATE_SCREEN to keep conversation in native scrollback, refreshed /tui fullscreen startup banner (lower memory, mouse support, auto-copy on selection), and roughly twenty bug fixes spanning SIGINT graceful shutdown, surrogate emoji --resume corruption, plan-mode --permission-mode flag, Indic and ZWJ cursor handling, NFD vim ops, paste-starts-with-/ swallow, MCP unbounded memory, MCP tools/list retry, Bedrock + Vertex ENABLE_PROMPT_CACHING_1H 400, and statusline context_window showing cumulative tokens. 

  39. Anthropic, Claude Code v2.1.133. May 7, 2026. Hooks now receive effort.level JSON input + $CLAUDE_EFFORT env var (also readable from Bash commands). Subagents discover project, user, and plugin skills via the Skill tool (regression fix). New admin settings: worktree.baseRef (fresh | head) reverts the worktree base back to origin/<default> after v2.1.128’s switch to local HEAD; sandbox.bwrapPath and sandbox.socatPath pin sandbox binaries on Linux/WSL; parentSettingsBehavior ('first-wins' | 'merge') controls how SDK managedSettings compose with parent settings. Other fixes: parallel-session 401-after-refresh-token-race, drive-root allow-rule scoping, MCP OAuth proxy/mTLS support, Remote Control stop/interrupt completing cancel, cross-session /effort leakage, --remote-control listed in --help.

NORMAL agent-architecture.md EOF