Agent Security

The Agent Stack Has a 1998 Problem

A cluster of agent-tool CVEs in mid-2026 is not bad luck. It is the structural signature of an AI agent ecosystem shipping at 1998-era security maturity.

2026-07-06

Computer-Use Agents Overshare by Default

A 2026 benchmark tested 15 frontier computer-use agents for data leakage across contexts. Eleven leaked on over half the scenarios. No attacker needed.

2026-06-23

Apple's First-Party Answer to Prompt Injection

WWDC 2026: Apple cites the lethal trifecta, ships deterministic guardrail APIs in Foundation Models and App Intents, and moves PCC onto Google Cloud.

2026-06-12

Apple's Font Interpreter Is Now Swift, and 13% Faster

Apple's security team rewrote the TrueType hinting interpreter from C to memory-safe Swift, made it 13% faster, open-sourced it, and showed the techniques.

2026-06-12

Your Agent Has Two Untrusted Inputs

AI agents have two untrusted inputs: code the model writes and tool output it reads. One now has a real WASM sandbox; the other, MCP tool...

2026-06-06

Engineering Philosophy: Adi Shamir

Adi Shamir is the S in RSA and a master cryptanalyst -- he builds secure systems by also learning to break them. Attack to defend, turning clean...

2026-06-03

Engineering Philosophy: Joanna Rutkowska

Joanna Rutkowska built Qubes OS on reasonable paranoia: you cannot make software bug-free, so assume compromise, distrust the infrastructure, and...

2026-06-03

When the Maintainer Is the Attacker: jqwik 1.10.0

jqwik 1.10.0 emits a destructive prompt-injection string in Maven output. ANSI escapes hide it from humans. The maintainer added it on purpose.

2026-05-29

Loopback Is Not a Trust Boundary: CVE-2026-2611

MLflow 3.9.0's Assistant exposed a local AI agent on /ajax-api with no CORS check. Any webpage could take over Claude Code. The bug is older than MLflow.

2026-05-28

AI Malware Analysis Needs Evidence Packets

AI malware analysis needs evidence packets: hashes, commands, indicators, and claim-to-evidence trails matter more than confident agent summaries.

2026-05-18

Agents.txt Is Not Access Control

Agents.txt is not access control. Use robots.txt, llms.txt, bot verification, logs, and server-side policy to manage AI crawlers without false confidence.

2026-05-18

AI Agent Ownership Is the Trust Primitive

AI agent ownership links every autonomous action to the account, session, scope, and operator who can stop it, review it, and accept responsibility.

2026-05-18

AI Agent Monitoring Needs Runtime Intervention

AI agent monitoring should catch decisive errors during a run, not after failure. Runtime intervention turns traces, policies, and alerts into safe pauses.

2026-05-18

AI Agent Approval Prompts Are Not Authorization

AI agent approval prompts need scoped authority, risk lanes, audit logs, expiry, and revocation so humans approve concrete actions, not fluent requests.

2026-05-18

MCP Tools Need Action-Level Authorization

MCP tools need action-level authorization: bearer-token validation must lead to per-tool, per-role, and per-action capability checks before agents act.

2026-05-18

AI Agent Config Security Is Supply Chain Security

AI agent config security belongs in supply-chain review: hooks, editor tasks, install scripts, MCP files, and plugins can execute code before you notice.

2026-05-18

Agent Keys Need Risk Budgets

Shuriken's Agent Kit shows why AI agent tools that can act need scoped keys, server-side limits, activity logs, revocation, and conservative defaults.

2026-05-18

Open Source Is Not a Security Boundary

GDS guidance on AI vulnerability discovery gets open-source security right: hide less by default, fix faster, and make exceptions explicit with evidence.

2026-05-17

The Repo Shouldn't Get to Vote on Its Own Trust

Two Claude Code trust dialog bypass CVEs in 37 days reveal a load-order failure. One invariant fixes it: interpret no workspace byte until the...

2026-04-24

The Agent Operator's Handbook: Supervising What You Can't See

Operating autonomous AI agents is a new discipline. Five responsibilities, a supervision stack, and an intervention framework define what operators do.

2026-04-15

Cybersecurity Is Proof of Work: AI Attacks at $12,500 a Run

Claude Mythos completed a 32-step corporate network attack simulation in 3 of 10 tries. Each attempt cost $12,500 in tokens. Security is now a...

2026-04-14

Runtime Defense for Tool-Augmented Agents

ClawGuard demonstrates that deterministic tool-call interception works. The Vercel telemetry incident shows why. Runtime defense is the enforceable layer.

2026-04-14

Your Agent Has a Middleman You Didn't Vet

Researchers tested 28 LLM API routers. 17 touched AWS canary credentials. One drained ETH from a private key. The router layer is the new attack surface.

2026-04-10

MCP Servers Are the New Attack Surface

50 MCP vulnerabilities, 30 CVEs in 60 days, 13 critical. Tool-use protocols are the attack surface nobody is auditing — here's the taxonomy and the fixes.

2026-04-08

Project Glasswing: When a Model Finds Too Many Bugs

Project Glasswing shows Anthropic restricting Claude Mythos after it found thousands of zero-days. What the rollout means for AI-assisted security.

2026-04-07

When Your Agent Finds a Vulnerability

An Anthropic researcher found a 23-year-old Linux kernel vulnerability using Claude Code and a 10-line bash script. 22 Firefox CVEs followed.

2026-04-05

What the Claude Code Source Leak Reveals

11 findings from the Claude Code source leak: how auto mode, bash security, prompt caching, and multi-agent coordination actually work.

2026-04-02

Every Hook Is a Scar: 84 Agent Failures Encoded in Code

84 hooks intercept 15 of the 26 lifecycle event types Claude Code exposes. Each one traces back to a specific production failure: wiped caches,...

2026-03-29

The Fork Bomb Saved Us

The LiteLLM attacker made one implementation mistake. That mistake was the only reason 47,000 installs got caught in 46 minutes.

2026-03-28

AI Agent Research: Claude Beat 33 Attack Methods

Claude Code autonomously discovered adversarial attacks with 100% success rate against Meta's SecAlign-70B, beating all 33 published methods in 96...

2026-03-26

AI Supply Chain Attacks: The Supply Chain Is the Surface

Trivy got compromised via tag hijacking, then LiteLLM on PyPI, then 47,000 installs in 46 minutes. The AI supply chain worked exactly as designed.

2026-03-25

AI Agent Security: The Deploy-and-Defend Trust Paradox

1 in 8 enterprise AI breaches involve autonomous agents. Runtime hooks, OS-level sandboxes, and drift detection break the deploy-and-defend cycle.

2026-03-20

Every Iteration Makes Your Code Less Secure

43.7% of LLM iteration chains introduce more vulnerabilities than baseline. Adding SAST scanners makes it worse. SCAFFOLD-CEGIS cuts degradation to 2.1%.

2026-03-12

Agent Sandbox Security Is a Suggestion: Three Failure Levels

An attacker opened a GitHub issue and shipped malware in Cline's next release. Agent sandboxes fail at three levels. Here is what actually works.

2026-03-05

Silent Egress: The Attack Surface You Didn't Build

A malicious web page injected instructions into URL metadata. The agent fetched it, read the poison, and exfiltrated the API key. No error. No log.

2026-03-02

AI Agent Observability: Monitoring What You Can't See

AI agents consume disk, CPU, and network with zero operator visibility. Three observability layers close the gap before damage is irreversible.

2026-03-02

What I Told NIST About AI Agent Security

Production evidence submitted to NIST: AI agent threats are behavioral. 7 failure modes, 3-layer defense, and framework gaps from 60 daily sessions.

2026-02-24

The Fabrication Firewall: When Your Agent Publishes Lies

An autonomous agent published fabricated claims to 8 platforms over 72 hours. Training-phase safety failed at the publication boundary. Here is the fix.

2026-02-23

AI Agent Memory Degradation: Why Multi-Turn LLMs Collapse

LLMs lose 39% accuracy across 200K+ multi-turn sessions. Three mechanisms drive collapse, and longer context windows fix none of them.

2026-02-22

Runtime Constitutions for AI Agents: A Governance Framework

Runtime constitutions enforce AI agent governance where training-phase alignment fails. Competence checks, output gates, and four subsystems keep...

2026-02-22
