The Invisible Agent: Why You Can't Govern What You Can't See
Anthropic shipped a feature called Cowork in Claude Desktop. The feature created a 10GB virtual machine bundle on every macOS installation. Users who never enabled Cowork still got the VM. Users who deleted it watched it regenerate. One user reported the bundle growing to 21GB. The GitHub issue, and the Hacker News thread that followed (345 points, 175 comments), forced Anthropic to acknowledge the problem.1
Nobody noticed until disk space ran out.
TL;DR
Agent tools now allocate compute resources (disk, memory, CPU, network) without operator visibility. Anthropic’s Cowork VM is the visible example; every MCP tool call, every spawned sub-agent, and every web fetch is an invisible one. Governing agents requires three layers of observability: resource metering (what did it consume?), policy enforcement (what was it allowed to do?), and runtime auditing (what did it actually do?). Two open-source projects address the policy and audit layers (mcp-firewall and Logira), but no production tool covers all three. Below: the visibility problem, the three-layer stack, what each layer catches, and minimum monitoring hooks you can implement today.
The Visibility Problem
Traditional software operates below an observability line that operators choose to draw. A web server writes access logs because engineers configured logging. A database tracks slow queries because someone set log_min_duration_statement. The operator decides the granularity.
Agent systems invert the relationship. The agent decides what to execute at runtime. A coding agent that receives “fix the login endpoint” might read 47 files, write to 12, spawn three sub-agents, fetch two web pages, and execute 15 bash commands. Each action consumes resources. None of the consumption shows up in traditional monitoring.
The Cowork incident exposed the inversion at the infrastructure level. Claude Desktop allocated 10GB of disk space, consumed 24-55% CPU at idle, and drove swap usage from 20K to 24K+ swapins on 8GB machines.1 Users discovered the resource consumption through macOS storage warnings, not through Anthropic’s telemetry. The application provided no dashboard, no meter, and no opt-in disclosure for the VM allocation.
Scale the pattern to agent sessions. My hook orchestration system intercepts every tool call across 15 event types.11 Over 60 sessions, 84 hooks fired on those events, producing telemetry that no default agent installation provides.2 Without that instrumentation, I would not have detected the 12 drift incidents, the phantom verification failures, or the recursive spawning loops documented in my NIST public comment.3
The DORA 2024 Accelerate State of DevOps Report found that teams with strong observability practices deploy more frequently and recover faster from failures. The 2025 edition extends the framework to AI-assisted development, connecting observability to “how AI-assisted coding or testing affects quality, lead time, and overall reliability.”4 Agent observability is not a nice-to-have. Measuring agent behavior is a prerequisite for governing it.
Three Layers of Agent Visibility
Agent observability requires three independent layers. Each layer answers a different question. A failure in one layer does not compromise the others.
| Layer | Question | Monitors | Example Tool |
|---|---|---|---|
| Resource metering | What did it consume? | Disk, memory, CPU, network per session | Cowork should have shown this |
| Policy enforcement | What was it allowed to do? | Allow/deny rules, tool permissions, scope limits | mcp-firewall |
| Runtime auditing | What did it actually do? | Syscall log, file access, network egress | Logira |
The layers map to a progression: you cannot enforce policy on resources you do not measure, and you cannot audit compliance with policies you never defined. Each layer builds on the one below it.
Layer 1: Resource Metering
Resource metering answers: how much did the agent consume, and where?
The Cowork incident is a resource metering failure. The VM bundle consumed 10GB of disk space. The renderer process consumed 24% CPU at idle. Swap activity climbed steadily during sessions. All of these metrics existed in macOS Activity Monitor. None appeared in Claude Desktop’s interface.1
For agent coding sessions, resource metering tracks four dimensions:
Disk. Every file write, every cache entry, every log file. My sessions generate 200-400KB of state files per session (jiro.state.json, jiro.progress.json, hook logs). Over 60 sessions, that accumulates to 12-24MB of state data that persists across sessions unless explicitly cleaned.2
Memory. Context window consumption per turn. A 200,000-token context window costs roughly $3 per full fill at current Opus pricing. My cost tracker logs cumulative token usage per session, with budget thresholds at 80%, 90%, and 95% of a configurable limit.5
CPU. Hook execution time. My nine-hook prompt dispatcher adds 200ms per prompt. That overhead is invisible to users (human typing is the bottleneck) but compounds across automated pipelines. The ralph autonomous loop fires the dispatcher 50-100 times per story, adding 10-20 seconds of hook overhead per story.2
Network. Web fetches, API calls, MCP tool invocations. Every outbound request is a potential data channel. My web extract library logs fetch URLs and response sizes. Without network metering, a web fetch that returns a 50MB response is indistinguishable from one returning 5KB.6
No commercial agent tool provides a per-session resource dashboard. Cloud providers meter compute for billing, not for operator visibility. The gap between what agents consume and what operators can see is the resource metering deficit.
The absence feels invisible until the numbers accumulate. One session that writes 400KB of state files is nothing. Sixty sessions that write 400KB each, with no cleanup, leave 24MB of orphaned state. One web fetch that returns 847KB is negligible. A scanning pipeline that fetches 80 URLs per run generates 67MB of cached content that the agent’s tool abstraction hides from the operator. Resource metering makes the cumulative visible before it becomes the crisis that drives someone to file GitHub issue #22543.1
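A minimal version of this kind of metering is easy to sketch. The budget gate below mirrors the 80%/90%/95% thresholds described above; the usage figure is assumed to come from the agent's own accounting (Claude Code exposes session cost via /cost), and here it is simply a parameter.

```shell
# Sketch of a session token-budget gate mirroring the article's
# 80%/90%/95% thresholds. "used" would come from the agent's own
# usage accounting; both values are parameters in this sketch.
budget_status() {
  used="$1"    # cumulative tokens consumed this session
  limit="$2"   # configured session budget
  pct=$((used * 100 / limit))
  if   [ "$pct" -ge 95 ]; then echo "critical"
  elif [ "$pct" -ge 90 ]; then echo "alert"
  elif [ "$pct" -ge 80 ]; then echo "warn"
  else echo "ok"
  fi
}
```

In a real hook this check would run on every prompt and inject a warning into the session whenever the status escalates.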
Layer 2: Policy Enforcement
Policy enforcement answers: what rules constrain the agent, and are those rules applied consistently?
mcp-firewall addresses the policy layer for CLI agents.7 The tool sits between the agent and all tool use requests, evaluating each request against a regex-based policy before execution. Policies use JSONNet configuration files scoped by folder, git repository, or user. The firewall supports Claude Code and GitHub Copilot CLI through PreToolUse hook integration.
The architecture reflects a key insight: every agent implements its own halfway solution to allow/deny logic. Claude Code uses glob patterns. Codex CLI uses prefix-only matching. Each approach covers a subset of the policy space. mcp-firewall centralizes the rules into one engine that works across agents.
Consider the policy gap without centralized enforcement. My hook system includes 12 PreToolUse:Bash handlers that check for credential patterns, dangerous git operations, sensitive path access, and deployment commands.2 Each handler is a separate shell script with its own regex patterns. When I need to add a new deny rule, I write a new script. When I need to audit which rules exist, I grep across 12 files. mcp-firewall consolidates that into a single config file with explicit allow arrays.
The OWASP Top 10 for Agentic Applications (2025) identifies Agent Goal Hijacking (ASI01) and Excessive Agency (LLM06:2025) as top risks.8 Both risks require policy enforcement at the tool-call level. An agent that hijacks a goal still makes tool calls. An agent with excessive agency still requests permissions. Policy enforcement intercepts both at the boundary where the agent’s intent meets the system’s tools.
Policy enforcement differs from access control. Traditional access control asks “does this user have permission?” Policy enforcement for agents asks “does this action, in this context, for this task, fall within the approved scope?” The context sensitivity is the challenge. A git push to a feature branch and a git push --force to main are the same tool (Bash) with different blast radii. mcp-firewall’s regex patterns can distinguish between them. Default agent permissions cannot.
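The distinction is mechanical once stated as patterns. A sketch of a context-sensitive check, with illustrative patterns and branch names (this is not mcp-firewall's actual rule syntax):

```shell
# Sketch: separate a routine push from a force-push to a protected
# branch. Patterns and branch names are illustrative assumptions.
classify_push() {
  cmd="$1"
  case "$cmd" in
    *"git push"*) ;;              # only git push commands are of interest
    *) echo "approve"; return ;;
  esac
  if printf '%s' "$cmd" | grep -Eq '(^| )(--force|-f)( |$)' &&
     printf '%s' "$cmd" | grep -Eq '(^| )(main|master)( |$)'; then
    echo "block"                  # force-push targeting a protected branch
  else
    echo "approve"
  fi
}
```

The same tool name (Bash) yields different verdicts depending on the command's content, which is exactly the context sensitivity that default tool-level permissions cannot express.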
Layer 3: Runtime Auditing
Runtime auditing answers: what did the agent actually do at the syscall level?
Logira addresses the audit layer using eBPF probes to intercept system calls at the kernel level.9 The tool records three event categories: process execution (exec events), file operations (including credential file access), and network connections (with destination tracking). Each audited run generates three files: events.jsonl for timeline review, index.sqlite for queryable filtering, and meta.json for run metadata.
The design philosophy is “observe-only”: Logira records and detects but does not enforce or block.9 The separation from the enforcement layer is deliberate. Policy enforcement prevents known-bad actions. Runtime auditing discovers unknown-bad actions after the fact. The two layers serve different temporal functions: prevention (before) and forensics (after).
Logira’s eBPF probes operate below the application layer. An agent that constructs a novel command to exfiltrate data still makes syscalls. The agent cannot hide file reads, network connections, or process spawns from kernel-level tracing. The approach catches what application-level hooks miss: side effects that bypass the tool-call abstraction.
Built-in detection rules target AI agent risks specifically: credential file access, persistence mechanism changes (/etc, systemd, cron), suspicious command chains (curl-pipe-sh patterns), destructive operations (rm -rf), and anomalous network egress.9 The rules are opinionated defaults for the agent threat model, not generic system auditing.
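The value of such rules is easy to demonstrate with a post-hoc scan over the audit trail. A sketch, assuming a JSONL event log; the line format here is an assumption for illustration, not Logira's actual schema:

```shell
# Sketch: count curl-pipe-sh chains in an audit log. The events.jsonl
# field layout is assumed, not Logira's real schema; the regex targets
# the curl-pipe-sh pattern named in the detection rules above.
count_curl_pipe_sh() {
  grep -cE 'curl[^|"]*\|[^"]*(sh|bash)' "$1" || true
}
```

An observe-only tool makes exactly this kind of retrospective query cheap: the detection rule can be tuned and re-run against recorded evidence instead of live traffic.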
The platform constraint matters. Logira requires Linux 5.8+ with cgroup v2. macOS agents (Claude Desktop, Claude Code on Darwin) cannot use eBPF-based auditing. My OS sandbox uses macOS Seatbelt profiles as the closest equivalent: kernel-enforced deny rules that block writes to sensitive paths.3 Seatbelt is enforcement, not auditing. macOS lacks a production-ready equivalent to Logira’s observe-only audit trail.
The distinction between enforcement and auditing maps to a temporal split in incident response. Enforcement prevents the incident. Auditing enables reconstruction after the incident. Both are necessary. An enforcement layer that blocks all credential access prevents exfiltration but also prevents legitimate SSH operations. An audit layer that logs all credential access without blocking enables the operator to review access patterns and tune enforcement rules based on evidence. The feedback loop between audit data and policy refinement is how the visibility stack improves over time: audit reveals patterns, patterns inform policy, policy reduces the surface the audit needs to cover.
Logira’s cgroup v2 isolation adds a feature that application-level auditing cannot replicate: run-scoped attribution. The system attributes every event to a specific audited run, not to the system globally. When two agent sessions run concurrently on the same machine, cgroup isolation ensures that file access in session A does not appear in session B’s audit trail. Application-level hooks cannot provide the same guarantee because the hooks fire within the agent process, which has no kernel-level boundary separating concurrent sessions.9
What I Actually Run
My orchestration system covers all three layers through hooks, not through dedicated monitoring tools.
Resource metering. The cost-gate hook tracks token usage per session against configurable budget thresholds.5 The system performance monitor checks CPU, memory, disk, and swap at configurable intervals, injecting warnings when resource pressure exceeds thresholds.10 The session drift detector fires every 25 tool calls, computing cosine similarity between the original prompt embedding and a sliding window of recent actions.2
Policy enforcement. Eight PreToolUse dispatcher hooks route to handler hooks by tool type. PreToolUse:Bash alone runs 12 handlers covering credential patterns, destructive git operations, sensitive path access, and deployment commands. The recursion guard enforces a maximum depth of two and maximum five children per parent agent.2
Runtime auditing. PostToolUse hooks log every tool call result. The security scanning hooks check bash output for credential leaks after execution. Session state files (jiro.state.json) record every story completion, reviewer verdict, and evidence gate result.2 The system does not use eBPF (macOS limitation) but captures tool-level telemetry through the hook pipeline.
| Layer | My Implementation | Limitation |
|---|---|---|
| Resource metering | cost-gate, sysmon, drift detector | No per-tool disk/network breakdown |
| Policy enforcement | 84 hooks across 15 event types | Per-hook regex, not centralized config |
| Runtime auditing | PostToolUse loggers, session state files | Application-level only, no syscall trace |
The system works because every action passes through the hook pipeline. The limitation is depth: hook-level monitoring captures what the agent asked to do, not what the operating system actually executed. An agent that constructs a bash command with embedded subshells executes code the hook sees as a single string. Kernel-level auditing would see each subprocess.
The Compounding Blind Spot
Agents spawning agents multiply opacity. Each delegation hop introduces information loss.
When my orchestration system runs the ralph autonomous loop, the parent process spawns fresh Claude Code instances for each PRD story. Each child agent gets a focused task and a fresh context window. The parent tracks completion status. The parent does not see the child’s individual tool calls, file reads, or resource consumption.2
At depth one (parent spawns child), the parent sees the child’s final output. At depth two (child spawns grandchild), the parent sees the child’s report about the grandchild’s output. Each hop compresses information. The delegation chain analysis in my NIST comment measured three compounding risks: semantic compression (context collapses to a prompt string), authority amplification (children inherit permissions without understanding sensitivity), and accountability diffusion (the root agent bears responsibility for results it never inspected).3
Observability degrades at the same rate. A three-layer visibility stack on the root agent provides zero visibility into the grandchild agent unless each child independently runs its own monitoring. My recursion guard enforces the depth limit, but the guard is a policy control, not an observability control. Knowing that delegation stopped at depth two does not tell you what happened at depth two.
A concrete example from my production system: the ralph loop spawned a child agent to implement a database migration story. The child agent decided the migration needed a “verification step” and spawned its own sub-agent to run integration tests. The grandchild agent failed silently (the test database was not configured). The child agent received an empty response, interpreted silence as success, and reported the story complete. The parent logged “story 4: complete.” I discovered the broken migration three hours later when the application crashed on the missing column. The root agent’s telemetry showed a clean run. The failure lived two hops deep, invisible to every monitoring layer I had deployed on the root.2
The OWASP Agentic Applications framework addresses cascading failures and rogue agents but does not prescribe observability requirements for multi-agent delegation chains.8 The gap is structural: each agent in the chain would need its own resource metering, policy enforcement, and runtime auditing, independently configured and independently reported. The overhead is multiplicative. Three layers of monitoring on three agents in a chain means nine monitoring instances, each generating its own telemetry, each requiring its own configuration. No existing tool manages that coordination.
What You Can Implement Today
Three minimum monitoring hooks that cover the visibility stack:
1. Resource: Token budget tracker. Log cumulative input and output tokens per session. Set a hard limit. Alert at 80%. The implementation requires reading the agent’s usage stats (Claude Code exposes session costs via /cost) and comparing against a threshold. My cost-gate hook does this in 47 lines of bash.5
2. Policy: PreToolUse deny list. Create a hook that fires before every Bash tool call. Check the command against a list of patterns: rm -rf /, git push --force, paths containing .ssh or .env, curl | sh. Block matches. The implementation requires one shell script that reads stdin (the tool call JSON), extracts the command field, and greps against a pattern file. My credential-checking hook does this in 31 lines.2
3. Audit: PostToolUse session log. Append every tool call and result to a session-specific JSONL file. Include timestamp, tool name, arguments, and exit code. The log enables post-session reconstruction: what did the agent do, in what order, and did anything fail silently? My session logger does this in 22 lines of bash.2
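The item-3 session logger can be sketched in a few lines of shell. The field names and default log path below are illustrative, not a specific tool's schema, and a production version would JSON-escape the arguments:

```shell
# Sketch of a PostToolUse session logger: append one JSON line per tool
# call. Field names and log path are assumptions; args are not
# JSON-escaped here, which a production hook would need to handle.
log_tool_call() {
  tool="$1"; exit_code="$2"; args="$3"
  printf '{"ts":"%s","tool":"%s","exit":%s,"args":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$tool" "$exit_code" "$args" \
    >> "${SESSION_LOG:-session.jsonl}"
}
```

Appending JSONL keeps the log trivially greppable and queryable after the session, which is the whole point: reconstruction, not real-time alerting.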
A worked example of the deny list hook in settings.json:
```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/check-sensitive-paths.sh"
          }
        ]
      }
    ]
  }
}
```
The hook script reads the tool call from stdin, extracts the command string, and checks it against patterns. A blocked command returns a JSON object with {"decision": "block", "reason": "Sensitive path access denied"}. An allowed command returns {"decision": "approve"}. Claude Code respects both responses without further prompting. The hook adds negligible latency to approved commands (the regex check runs in under 5ms) and provides immediate feedback for blocked ones.
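What a check-sensitive-paths.sh body could look like is easy to sketch. The sed field extraction below is deliberately naive (a real hook would parse the stdin JSON with a proper parser such as jq), and the deny patterns echo the list from step 2:

```shell
# Sketch of a sensitive-path check. The sed extraction of the "command"
# field is a simplification (use a real JSON parser in production);
# deny patterns are the examples from the article, not an exhaustive list.
check_sensitive_paths() {
  cmd=$(printf '%s' "$1" |
    sed -n 's/.*"command"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
  if printf '%s' "$cmd" |
     grep -Eq '\.ssh|\.env|rm -rf /|git push --force|curl[^|]*\|[[:space:]]*(sh|bash)'; then
    printf '{"decision": "block", "reason": "Sensitive path access denied"}\n'
  else
    printf '{"decision": "approve"}\n'
  fi
}
```

In the real hook the JSON would arrive on stdin rather than as an argument; the function form here just makes the logic easy to exercise in isolation.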
These three hooks take less than 100 lines total. They do not replace dedicated monitoring tools. They replace zero visibility with minimum visibility. Minimum visibility is the prerequisite for every governance decision that follows. You cannot set a resource budget without metering. You cannot enforce a scope policy without a deny list. You cannot investigate an incident without an audit log. Start with the log. The other two follow.
Key Takeaways
For platform engineers: Agents consume resources that existing monitoring does not track. Disk, memory, CPU, and network usage per agent session belong on the same dashboard as container metrics. The Cowork incident proves the need: 10GB allocated with zero operator visibility.
For security teams: Policy enforcement at the tool-call boundary is the minimum viable agent security posture. mcp-firewall’s centralized approach consolidates per-agent allow/deny logic into one auditable configuration. Evaluate whether your agent’s built-in permissions cover the policy space your threat model requires.
For engineering managers: Ask three questions about your agent tooling: Can you see per-session resource consumption? Can you define and audit tool-call policies? Can you reconstruct what an agent did after the fact? If any answer is “no,” you have a visibility gap that grows with every additional agent in your workflow.
FAQ
What is agent observability? Agent observability is the ability to monitor and understand what an AI agent does during execution: what resources it consumes, what actions it takes, and whether those actions comply with defined policies.
Why did Anthropic’s Cowork create a 10GB VM? The Cowork feature in Claude Desktop provisions a virtual machine for collaborative development sessions. Claude Desktop creates the VM bundle automatically on every macOS installation, even for users who never enable the feature, and keeps it until manually deleted.1
What is mcp-firewall? mcp-firewall is an open-source policy enforcement tool that intercepts tool use requests from CLI agents (Claude Code, GitHub Copilot CLI) and evaluates them against regex-based allow/deny rules before execution.7
What is eBPF runtime auditing? eBPF (extended Berkeley Packet Filter) enables kernel-level tracing of system calls without modifying the audited process. Tools like Logira use eBPF probes to record process execution, file operations, and network connections during AI agent runs.9
Sources
1. mystcb et al., “Cowork feature creates 10GB VM bundle that severely degrades performance,” GitHub Issue #22543, anthropics/claude-code, February 2026. 345 Hacker News points, 175 comments.
2. Author’s production telemetry. 84 hooks across 15 event types, ~15,000 lines of orchestration code, 60+ daily Claude Code sessions, February-March 2026.
3. Crosley, Blake, “What I Told NIST About AI Agent Security,” blakecrosley.com, February 2026. Public comment on NIST-2025-0035.
4. DORA Accelerate State of DevOps Report 2024, Google Cloud, 2024. 39,000+ professionals surveyed.
5. Author’s cost-gate hook implementation. SQLite-backed budget tracker with configurable thresholds (80%/90%/95%), 36 tests, February 2026.
6. Author’s web content extraction library. trafilatura 2.0.0, URL logging and response size tracking, 25 tests, February 2026.
7. dzervas, “mcp-firewall,” GitHub, 2026. Go binary with JSONNet policy configuration, PreToolUse hook integration.
8. OWASP Top 10 for Agentic Applications, OWASP GenAI Security Project, 2025. 100+ security researchers contributed.
9. melonattacker, “Logira: eBPF runtime auditing for AI agent runs,” GitHub, 2026. Linux 5.8+, cgroup v2, observe-only design.
10. Author’s system performance monitoring module. CPU, memory, disk, and swap monitoring with configurable thresholds, 46 tests, February 2026.
11. Crosley, Blake, “Anatomy of a Claw: 84 Hooks as an Orchestration Layer,” blakecrosley.com, February 2026.