AGENTS.md Patterns: What Actually Changes Agent Behavior
My first AGENTS.md was a 200-line paste of our team’s style guide. It included naming conventions, code review checklists, deployment procedures, and architectural principles. The agent ignored most of it. Not because the instructions were wrong — because they were documentation, not operations.
The distinction matters more than any specific pattern in this post. AGENTS.md is operational policy for an AI agent, not a README for humans. The agent doesn’t need to understand why you use conventional commits. It needs to know the exact command to run and what “done” looks like.
TL;DR
Most AGENTS.md problems come from writing human documentation instead of agent operations. Effective files are command-first (exact invocations, not descriptions), task-organized (coding, review, release sections), and closure-defined (explicit “done” criteria). Anti-patterns that reliably get ignored: prose paragraphs, ambiguous directives (“be careful”), and contradictory priorities. AGENTS.md is an open standard adopted by 60,000+ projects [1] and works across Codex, Cursor, Copilot, Amp, Windsurf, and more [2].
Context: AGENTS.md is governed by the Agentic AI Foundation under the Linux Foundation [3], with platinum members including Anthropic, Google, Microsoft, and OpenAI. This post covers practical patterns. For Codex-specific configuration, see the Codex guide. For Claude Code’s equivalent (CLAUDE.md), see the Claude Code guide.
What Gets Ignored
These patterns reliably produce no observable change in agent behavior. I identified each by running identical tasks with and without the instruction present, then comparing task completion accuracy across 10+ runs per pattern. GitHub’s analysis of 2,500+ repositories with AGENTS.md files reached the same conclusion: “Most agent files fail because they’re too vague — not because of technical limitations” [11]. The patterns below failed to improve accuracy in any measurable way.
Prose paragraphs without commands
<!-- BAD: Agent skips this -->
We value clean, well-tested code. Our team follows TDD principles
and believes in comprehensive test coverage. Please ensure all
changes are properly tested before submitting.
The agent reads this, represents it as a vague preference, and proceeds to write code without tests. There’s no actionable instruction — no command to run, no threshold to meet, no definition of “properly tested.”
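The same intent becomes actionable once it names commands and thresholds. A possible rewrite (the coverage threshold here is illustrative, not prescribed by any standard):

```markdown
<!-- GOOD: Commands and thresholds instead of values -->
- Write the test before the implementation
- Run `pytest --cov=app --cov-fail-under=80` before submitting
- A change without a passing test run is not submittable
```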
Ambiguous directives
<!-- BAD: "Careful" means nothing to an agent -->
- Be careful with database migrations
- Optimize queries where possible
- Handle errors gracefully
“Careful” isn’t a constraint. “Where possible” isn’t a trigger condition. “Gracefully” isn’t a behavior specification. These read as human-to-human guidance, not agent instructions. Compare with what works: “Run alembic check before applying migrations. Abort if downgrade path is missing.”
Contradictory priorities
<!-- BAD: Which one wins? -->
- Move fast and ship quickly
- Ensure comprehensive test coverage
- Keep the runtime budget under 5 minutes
- Run the full integration test suite before every commit
The agent can’t satisfy all four simultaneously. When instructions conflict without explicit priority ordering, the model skips verification steps and rushes to code generation. Research from ICLR 2026 (AMBIG-SWE) found that agents “default to non-interactive behavior without explicit encouragement” — proceeding silently rather than asking clarifying questions, which dropped resolve rates from 48.8% to 28% [12]. Fix conflicting instructions by numbering priorities: “Priority 1: Tests pass. Priority 2: Under 5 minutes. Priority 3: Ship fast.”
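Written out as an instruction block, one way that ordering could look (the 5-minute budget is carried over from the bad example above):

```markdown
<!-- GOOD: Explicit priority ordering -->
1. Priority 1: `pytest -v` passes
2. Priority 2: the full verify run completes in under 5 minutes
3. Priority 3: minimize time to ship

When these conflict, satisfy the lower-numbered priority first.
```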
Style guides without enforcement
<!-- BAD: No way to verify compliance -->
Follow the Google Python Style Guide for all code.
Use numpy-style docstrings for public functions.
Unless you include the exact linting command that enforces the style (`ruff check --select D` or `pylint --rcfile=.pylintrc`), the agent has no mechanism to verify its own compliance. The pattern is universal: instructions without verification commands are suggestions, not rules.
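A fixed version pairs every style rule with its enforcement command (a sketch; adjust the selectors and rcfile path to your own config):

```markdown
<!-- GOOD: Each style rule names the command that checks it -->
- Docstrings: `ruff check . --select D` must exit 0
- Style: `pylint --rcfile=.pylintrc app/` must exit 0
```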
What Works
These patterns produce consistent, measurable changes in agent behavior.
Command-first instructions
## Build and Test Commands
- Install: `pip install -r requirements.txt`
- Lint: `ruff check . --fix`
- Format: `ruff format .`
- Test: `pytest -v --tb=short`
- Type check: `mypy app/ --strict`
- Full verify: `ruff check . && ruff format --check . && pytest -v`
Commands are unambiguous. The agent knows exactly what to run, what arguments to pass, and can verify success by checking the exit code. Every instruction in your AGENTS.md should answer the question: “What command proves this was done correctly?”
Closure definitions
## Definition of Done
A task is complete when ALL of the following pass:
1. `ruff check .` exits 0
2. `pytest -v` exits 0 with no failures
3. `mypy app/ --strict` exits 0
4. Changed files have been staged and committed
5. Commit message follows conventional format: `type(scope): description`
Explicit closure definitions eliminate the most common failure mode: the agent reports “done” without verifying. When “done” is defined as specific exit codes, the agent runs each check before reporting completion. Without this definition, “done” means “I think I’m done” — a common source of agent-introduced bugs.
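The exit-code discipline behind a closure definition can be sketched in a few lines of Python. This is a minimal illustration, not any tool's actual implementation; the check list uses portable stand-ins where a real project would substitute its `ruff`/`pytest`/`mypy` invocations:

```python
import subprocess
import sys

def run_checks(checks):
    """Run each (name, argv) check; return a list of (name, exit_code)."""
    results = []
    for name, argv in checks:
        proc = subprocess.run(argv, capture_output=True)
        results.append((name, proc.returncode))
    return results

def is_done(results):
    """'Done' means every check exited 0, with no exceptions."""
    return all(code == 0 for _, code in results)

# Portable stand-ins; in a real project these would be the lint,
# test, and type-check commands from the Definition of Done.
checks = [
    ("lint", [sys.executable, "-c", "pass"]),
    ("tests", [sys.executable, "-c", "pass"]),
]
results = run_checks(checks)
print(is_done(results))  # True only if every command exited 0
```

The point is structural: "done" is a computed fact about exit codes, not a judgment the agent makes.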
Task-organized sections
## When Writing Code
- Run `ruff check .` after every file change
- Add type hints to all new functions
- Test command: `pytest tests/ -v -k "test_<module>"`
## When Reviewing Code
- Check for security issues: `bandit -r app/`
- Verify test coverage: `pytest --cov=app --cov-fail-under=80`
- List changed files: `git diff --name-only HEAD~1`
## When Releasing
- Update version in `pyproject.toml`
- Run full suite: `pytest -v && ruff check . && mypy app/`
- Tag: `git tag -a v<version> -m "Release v<version>"`
Task-organized files let the agent select relevant instructions based on what it’s currently doing. Flat lists force the agent to parse every instruction regardless of context. The “When…” prefix maps directly to how the agent reasons about task context.
Escalation rules
## When Blocked
- If tests fail after 3 attempts: stop and report the failing test with full output
- If a dependency is missing: check `requirements.txt` first, then ask
- If you encounter merge conflicts: stop and show the conflicting files
- Never: delete files to resolve errors, force push, or skip tests
Without escalation rules, agents default to increasingly creative workarounds when blocked — deleting lock files, bypassing checks, or silently ignoring failures. The “Never” list is as important as the escalation paths. Explicitly banning destructive recovery patterns prevents the worst failure modes.
Directory Scoping for Monorepos
AGENTS.md supports hierarchical scoping as a core feature of the specification [2]. Files closer to the working directory take precedence:
/repo/AGENTS.md                        ← Project-wide rules
└─ /repo/services/AGENTS.md            ← Service defaults
   ├─ /repo/services/api/AGENTS.md     ← API-specific rules
   └─ /repo/services/web/AGENTS.md     ← Frontend-specific rules
Root-level instructions concatenate with deeper files. Tools walk from the project root to the current working directory, combining every AGENTS.md found along the path [4]. OpenAI’s own Codex repository uses 88 separate AGENTS.md files for its monorepo — one per service and package [4].
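The root-to-cwd walk is simple enough to sketch. This is an illustrative model of the discovery order described above, not the code any particular tool ships:

```python
import tempfile
from pathlib import Path

def collect_agents_files(root: Path, cwd: Path):
    """Walk from the repo root down to cwd, collecting every AGENTS.md
    along the path: root first, deepest last (deeper files take precedence)."""
    files = []
    current = root
    for part in cwd.relative_to(root).parts:
        candidate = current / "AGENTS.md"
        if candidate.is_file():
            files.append(candidate)
        current = current / part
    candidate = current / "AGENTS.md"  # finally, check cwd itself
    if candidate.is_file():
        files.append(candidate)
    return files

# Demo on a throwaway tree mirroring the diagram above
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "services" / "api").mkdir(parents=True)
    (root / "AGENTS.md").write_text("# project rules\n")
    (root / "services" / "api" / "AGENTS.md").write_text("# api rules\n")
    found = collect_agents_files(root, root / "services" / "api")
    print([str(p.relative_to(root)) for p in found])
```

Note the ordering: project-wide rules come first, so a deeper file read later can refine or contradict them, which is exactly what "closer files take precedence" means under concatenation.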
In Codex, you can also use AGENTS.override.md at any level to replace (not extend) parent instructions [4]. The override mechanism is Codex-specific — other tools don’t implement it.
<!-- /repo/services/payments/AGENTS.override.md (Codex only) -->
# Payment Service Rules (OVERRIDE)
This service has additional security requirements.
All changes require: `bandit -r . -ll` passing with zero findings.
No dependency updates without explicit approval.
Test with: `pytest -v --tb=long -x` (fail fast, full tracebacks)
When to use override: Release freezes, incident mode, or any service with security constraints that supersede project-wide defaults.
Cross-Tool Compatibility
AGENTS.md is adopted by 60,000+ projects [1] and recognized by every major AI coding tool. Here’s how the same file behaves across ecosystems:
| Tool | Native File | Reads AGENTS.md? | Notes |
|---|---|---|---|
| Codex CLI | `AGENTS.md` | Yes (native) [4] | Full hierarchy + override support |
| Cursor | `.cursor/rules` | Yes (native) [5] | Auto-discovered in project root and subdirectories |
| GitHub Copilot | `.github/copilot-instructions.md` | Yes (native) [6] | Coding agent supports natively; VS Code requires `chat.useAgentsMdFile` |
| Amp | `AGENTS.md` | Yes (native) [7] | Co-creator of the standard; backward-compatible with `AGENT.md` |
| Windsurf | `.windsurfrules` | Yes (native) [8] | Auto-discovered, case-insensitive matching |
| Gemini CLI | `GEMINI.md` | Configurable [9] | Add `"fileName": ["AGENTS.md"]` to the `settings.json` context block |
| Claude Code | `CLAUDE.md` | No | Separate format; similar patterns apply |
| Aider | `CONVENTIONS.md` | Manual [10] | Requires `--read AGENTS.md` or `--conventions-file AGENTS.md` flag |
If your team uses multiple tools: Write AGENTS.md as the canonical source. Add tool-specific files (CLAUDE.md, .cursorrules) that either import or mirror the relevant sections. Don’t maintain parallel instruction sets that drift apart.
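One low-maintenance way to mirror without drift is symlinking tool-specific filenames to the canonical file, so there is only one file to edit. This is a sketch under the assumption that the tools in question follow symlinks (verify against each tool's documentation before relying on it):

```shell
# Work in a throwaway directory for the demo
set -e
cd "$(mktemp -d)"
printf '# Project rules\n' > AGENTS.md

# Mirror the canonical file under tool-specific names
ln -sf AGENTS.md CLAUDE.md
ln -sf AGENTS.md GEMINI.md

cat CLAUDE.md   # same content as AGENTS.md
```

Where symlinks aren't viable (for example, some Windows setups), a pre-commit hook that copies AGENTS.md to the mirror filenames achieves the same single-source guarantee.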
Writing Order: What to Add First
If you’re writing an AGENTS.md from scratch, add sections in this priority order. Each layer builds on the previous one:
- Build and test commands — the agent needs these before it can do anything useful
- Definition of done — prevents “I think I’m done” false completions
- Escalation rules — prevents destructive workarounds when the agent gets stuck
- Task-organized sections — reduces irrelevant instruction parsing per task
- Directory scoping (monorepos only) — keeps service instructions isolated
Skip style preferences until the first four are working. Most AGENTS.md files fail because they start with style guidance and never get to commands.
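A minimal starter covering the first three layers might look like this (the commands are examples from a Python stack; substitute your own):

```markdown
## Build and Test Commands
- Install: `pip install -r requirements.txt`
- Test: `pytest -v`

## Definition of Done
1. `pytest -v` exits 0
2. Changes committed with a `type(scope): description` message

## When Blocked
- If tests fail after 3 attempts: stop and report the full output
- Never: delete files, force push, or skip tests
```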
Testing Your AGENTS.md
Verify the agent actually reads and follows your instructions:
# Codex: Show the full instruction chain
codex --ask-for-approval never "Summarize your current instructions"
# Codex: Generate a scaffold (slash command inside an active session)
# Type /init at the Codex prompt, not as a shell command
codex # then type: /init
# Claude Code: Check active instructions
claude --print "What instructions are you following for this project?"
# Verify specific rules are active
codex --ask-for-approval never "What is your definition of done?"
The acid test: Ask the agent to explain your build commands. If it can’t reproduce them verbatim, the instructions aren’t being read or are too verbose to retain in context. Long AGENTS.md files get truncated by context windows — keep each section under 50 lines and front-load the most critical instructions.
FAQ
How long should an AGENTS.md file be?
Keep each section under 50 lines and the total file under 150 lines [13]. Codex enforces a default 32 KiB limit (`project_doc_max_bytes`) [4]. Long files get truncated by context windows, so front-load the most critical instructions — commands and closure definitions before style preferences.
Does AGENTS.md replace tool-specific instruction files?
No. AGENTS.md works alongside CLAUDE.md, .cursor/rules, and other tool-specific files. Write AGENTS.md as the canonical source, then mirror relevant sections to tool-specific files. The patterns in AGENTS.md (command-first, closure-defined) work in any instruction file regardless of tool.
What if the agent ignores my AGENTS.md?
Test by asking the agent to explain your build commands. If it can’t reproduce them verbatim, the file is either too verbose (content pushed out of context), too vague (agent can’t extract actionable instructions), or not being discovered (check file location and tool documentation). GitHub’s analysis of 2,500 repositories found that vagueness — not technical limitations — causes most failures [11].
Key Takeaways
For individual developers:
- Replace prose with commands. Every instruction should be verifiable by running something.
- Define closure explicitly. “Done” means specific exit codes, not feelings.
- Test your AGENTS.md by asking the agent to recite it. What it can’t recite, it won’t follow.
For teams:
- Use AGENTS.md as the single source of truth. Mirror to tool-specific files, don’t maintain parallel copies.
- Organize by task (coding, review, release), not by category (style, testing, deployment).
- Include escalation rules. Without them, blocked agents improvise in ways you won’t like.
- Scope per directory in monorepos. Service-specific rules shouldn’t pollute global instructions.
References
1. Linux Foundation AAIF Announcement — “adopted by more than 60,000 open source projects and agent frameworks”
2. AGENTS.md Official Site — Specification, cross-tool compatibility list, and directory scoping
3. OpenAI Co-founds the Agentic AI Foundation — AGENTS.md donated to AAIF under the Linux Foundation
4. Codex Custom Instructions with AGENTS.md — Discovery hierarchy, override mechanism, concatenation behavior
5. Cursor Rules Documentation — AGENTS.md auto-discovery in project root and subdirectories
6. GitHub Blog: Copilot Coding Agent Supports AGENTS.md — Native support on github.com; experimental in VS Code
7. Amp: From AGENT.md to AGENTS.md — Amp co-created the standard and switched to the plural form in August 2025
8. Windsurf AGENTS.md Documentation — Auto-discovery with case-insensitive matching
9. Gemini CLI: Context with GEMINI.md — Configurable to read AGENTS.md via `settings.json`
10. Aider: Specifying Coding Conventions — Requires an explicit `--read` or `--conventions-file` flag
11. How to Write a Great agents.md: Lessons from Over 2,500 Repositories — GitHub Blog — Six core areas, three-tier boundary system, anti-patterns from real-world analysis
12. AMBIG-SWE: Resolving Ambiguous Bug Reports with LLM Agents (ICLR 2026) — “LLMs default to non-interactive behavior without explicit encouragement”; resolve rates drop 42% when agents skip clarification
13. Agent Experience: Best Practices for Coding Agent Productivity — Marmelab — “Short and to the point, as coding agents read this file at the beginning of every session”

Related: Codex CLI Comprehensive Guide (AGENTS.md section) · Claude Code Comprehensive Guide (CLAUDE.md) · Claude Code vs Codex CLI (architecture comparison and decision framework) · Context Engineering Is Architecture (why instruction file design is software architecture)