Writing

May 18, 2026 14 min read

AI Agent Skills Need Behavioral Audits, Not Pass Rates

AI agent skills can change behavior while pass rates stay flat. Behavioral audits compare traces, declared capabilities, and side effects before a skill is trusted.

AI & Technology

ai agents skills evaluation behavioral-audits ai-engineering

May 18, 2026 11 min read

Long-Running AI Agents Need Durable Channels

Long-running AI agents need durable channels: workflow IDs, event logs, resumable streams, typed signals, safe cancellation, and user-visible checkpoints.

AI & Technology

ai agents durable-execution workflows agent-runtime webhooks ai-engineering

May 18, 2026 17 min read

AI Agents Need Exploration Checkpoints

Exploration checkpoints let AI agents prove what they discovered before acting, reducing premature exploitation, brittle plans, and generic world models.

AI & Technology

ai agents exploration evaluation agent-safety ai-engineering

May 18, 2026 13 min read

AI Agent Ownership Is the Trust Primitive

AI agent ownership links every autonomous action to the account, session, scope, and operator who can stop it, review it, and accept responsibility.

AI & Technology

ai agents security attribution governance agent-trust telemetry

May 18, 2026 15 min read

AI Agent Monitoring Needs Runtime Intervention

AI agent monitoring should catch decisive errors during a run, not after failure. Runtime intervention turns traces, policies, and alerts into safe pauses.

AI & Technology

ai agents monitoring runtime-safety security intervention ai-engineering

May 18, 2026 15 min read

AI Agent Config Security Is Supply Chain Security

AI agent config security belongs in supply-chain review: hooks, editor tasks, install scripts, MCP files, and plugins can execute code before you notice.

AI & Technology

ai agents security supply-chain claude-code hooks developer-tools

May 18, 2026 13 min read

AI Code Review Needs Dissent, Not Consensus

AI code review needs independent agents that preserve dissent, validate findings, route uncertainty to humans, and re-review fixes before teams merge PRs.

AI & Technology

ai code-review pull-requests agents multi-agent ai-engineering

May 18, 2026 14 min read

AI Agent Safety Starts With Small Software

AI agent safety starts with small software: smaller tools, plain files, narrow permissions, and faster tests give coding agents fewer places to hide bugs.

AI & Technology

ai agents software-architecture safety engineering unix small-software

May 18, 2026 15 min read

MCP Tools Need Action-Level Authorization

MCP tools need action-level authorization: bearer-token validation must lead to per-tool, per-role, and per-action capability checks before agents act.

AI & Technology

ai agents security mcp oauth authorization tool-security

May 18, 2026 11 min read

AI Coding Agents Need Smaller Review Surfaces

AI coding agents overwhelm reviewers with giant diffs. Smaller review surfaces keep engineers engaged, verification-focused, and accountable before merge.

AI & Technology

ai agents code-review developer-tools human-ai software-engineering

May 18, 2026 13 min read

AI Agent Approval Prompts Are Not Authorization

AI agent approval prompts need scoped authority, risk lanes, audit logs, expiry, and revocation so humans approve concrete actions, not fluent requests.

AI & Technology

ai agents approvals authorization security human-in-the-loop agentic-design

May 18, 2026 12 min read