Agent Security

The attack surfaces, trust boundaries, and runtime defenses for autonomous AI systems. MCP vulnerabilities, supply chain risks, sandbox escapes, and governance frameworks.

20 articles

Runtime Defense for Tool-Augmented Agents

ClawGuard demonstrates deterministic tool-call interception works. The Vercel telemetry incident shows why. Runtime defense is the enforceable layer.

2026-04-14

Cybersecurity Is Proof of Work

Claude Mythos completed a 32-step corporate network attack simulation in 3 of 10 tries. Each attempt cost $12,500 in tokens. Security is now a...

2026-04-14

Your Agent Has a Middleman You Didn't Vet

Researchers bought 28 LLM API routers and collected 400 more. 17 touched AWS canary credentials. One drained ETH from a private key. The router...

2026-04-10

MCP Servers Are the New Attack Surface

50 MCP vulnerabilities. 30 CVEs in 60 days. 13 critical. The attack surface nobody is auditing.

2026-04-08

Project Glasswing: What Happens When a Model Is Too Good at Finding Bugs

Anthropic built a model that finds thousands of zero-days, then restricted it to 12 partners. What Project Glasswing means for agent-assisted security.

2026-04-07

When Your Agent Finds a Vulnerability

An Anthropic researcher found a 23-year-old Linux kernel vulnerability using Claude Code and a 10-line bash script. 22 Firefox CVEs followed. What...

2026-04-05

What the Claude Code Source Leak Reveals

A practitioner's analysis of the Claude Code source leak. 11 findings that explain how auto mode, bash security, prompt caching, and multi-agent...

2026-04-02

Every Hook Is a Scar

84 hooks, 15 event types. Each one traces back to a specific failure. Institutional memory in shell scripts.

2026-03-29

The Fork Bomb Saved Us

The LiteLLM attacker made one implementation mistake. That mistake was the only reason 47,000 installs got caught in 46 minutes.

2026-03-28

AI Agent Research: Claude Beat 33 Attack Methods

Claude Code autonomously discovered adversarial attacks with 100% success rate against Meta's SecAlign-70B, beating all 33 published methods in 96...

2026-03-26

The Supply Chain Is the Attack Surface

Trivy got compromised. Then LiteLLM. Then 47,000 installs in 46 minutes. The AI supply chain worked exactly as designed.

2026-03-25

AI Agent Security: The Deploy-and-Defend Trust Paradox

1 in 8 enterprise AI breaches involve autonomous agents. Runtime hooks, OS-level sandboxes, and drift detection break the deploy-and-defend cycle.

2026-03-20

Every Iteration Makes Your Code Less Secure

43.7% of LLM iteration chains introduce more vulnerabilities than baseline. Adding SAST scanners makes it worse. SCAFFOLD-CEGIS cuts degradation to 2.1%.

2026-03-12

Your Agent Sandbox Is a Suggestion

An attacker opened a GitHub issue and shipped malware in Cline's next release. Agent sandboxes fail at three levels. Here is what actually works.

2026-03-05

Silent Egress: The Attack Surface You Didn't Build

A malicious web page injected instructions into URL metadata. The agent fetched it, read the poison, and exfiltrated the API key. No error. No log.

2026-03-02

AI Agent Observability: Monitoring What You Can't See

AI agents consume disk, CPU, and network with zero operator visibility. Three observability layers close the gap before damage is irreversible.

2026-03-02

What I Told NIST About AI Agent Security

Production evidence submitted to NIST: AI agent threats are behavioral. 7 failure modes, 3-layer defense, and framework gaps from 60 daily sessions.

2026-02-24

The Fabrication Firewall: When Your Agent Publishes Lies

An autonomous agent published fabricated claims to 8 platforms over 72 hours. Training-phase safety failed at the publication boundary. Here is the fix.

2026-02-23

AI Agent Memory Degradation: Why Multi-Turn LLMs Collapse

LLMs lose 39% accuracy across 200K+ multi-turn sessions. Three mechanisms drive collapse and longer context windows fix none of them.

2026-02-22

Runtime Constitutions for AI Agents: A Governance Framework

Runtime constitutions enforce AI agent governance where training-phase alignment fails. Competence checks, output gates, and four subsystems keep...

2026-02-22