PRD-Driven Development: How I Use 30+ PRDs to Ship with AI Agents

SaasMaker’s RalphBlaster workflow generates a complete pull request from a one-line ticket in under 45 minutes, with the developer touching zero code during implementation.¹

I’ve tried this pattern. It works. It also fails in ways the demo videos don’t show.

PRD-driven development is a workflow where AI agents implement features from structured Product Requirements Documents instead of ad-hoc prompts. Each PRD contains user stories, acceptance criteria, file locations, test expectations, and explicit “don’t touch” lists that constrain agent behavior. The structure removes much of the ambiguity that causes agents to produce unpredictable output. Across 30+ PRDs, the pattern proved reliable for well-defined tasks but failed for novel architecture decisions requiring iterative human judgment.

TL;DR

I’ve written 30+ PRDs for AI agent tasks over the past six months. The pattern works well for well-defined tasks with clear acceptance criteria: CRUD endpoints, test additions, UI components following established patterns. It fails for ambiguous requirements, novel architecture decisions, and anything requiring iterative human judgment. My PRD template evolved from a simple user story format into a structured document with file locations, test expectations, constraints, and explicit “don’t touch” lists. The evolution happened because agents interpreted vague PRDs in surprising ways.

The Workflow I Actually Use

Step 1: Create a Ticket

I define what needs to happen in plain language. Specificity matters more than I initially realized:

Vague (produces unpredictable results):

Add user preference saving to the settings page.

Specific (produces predictable results):

Add dark mode toggle to /settings. Persist to user_preferences
table (column: dark_mode, type: boolean, default: false).
Use existing SettingsForm component. Add toggle below the
notification section. No new dependencies.

The second version constrains the agent enough that the output matches expectations. The first version gave me a settings page with a new React component, three new npm packages, and a localStorage implementation instead of the database persistence I wanted.

Step 2: Generate or Write the PRD

My /prd skill converts a ticket into a structured PRD with user stories, acceptance criteria, file locations, and test expectations.² A typical PRD looks like:

## Story: Add dark mode toggle
**As a** user
**I want to** toggle dark mode from settings
**So that** I can read comfortably in low light

### Acceptance Criteria
- [ ] Toggle appears in SettingsForm below notifications
- [ ] Persists to user_preferences.dark_mode (boolean)
- [ ] Default: false (light mode)
- [ ] Toggle state reflects current DB value on page load

### Files to Modify
- app/routes/settings.py (add dark_mode to form handler)
- app/templates/settings.html (add toggle component)
- app/models/user.py (add dark_mode column if missing)

### Files NOT to Modify
- app/static/css/styles.css (dark mode CSS already exists)
- app/templates/base.html (already reads dark_mode class)

### Test Expectations
- Test toggle persists to database
- Test default value on new user
- Test toggle reflects current state on reload

The “Files NOT to Modify” section was the biggest template evolution. Without it, agents would helpfully “improve” related files, introducing changes I hadn’t requested and didn’t want.
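Because the PRD is structured text, its file lists can be read mechanically. The following is a minimal sketch, assuming the PRD keeps the `### Heading` plus `- item` layout shown above; `parse_prd_sections` and `file_paths` are hypothetical helper names, not part of the /prd skill.

```python
import re

# A shortened copy of the PRD above, used only as sample input.
PRD_TEXT = """\
## Story: Add dark mode toggle

### Acceptance Criteria
- [ ] Toggle appears in SettingsForm below notifications

### Files to Modify
- app/routes/settings.py (add dark_mode to form handler)
- app/templates/settings.html (add toggle component)

### Files NOT to Modify
- app/static/css/styles.css (dark mode CSS already exists)
- app/templates/base.html (already reads dark_mode class)
"""


def parse_prd_sections(text: str) -> dict[str, list[str]]:
    """Map each '### Heading' to the bullet items listed under it."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("### "):
            current = stripped[4:].strip()
            sections[current] = []
        elif stripped.startswith("## "):
            current = None  # a new story resets the active section
        elif current is not None and stripped.startswith("- "):
            # Drop an optional "[ ]" / "[x]" checkbox prefix.
            item = re.sub(r"^\[[ xX]\]\s*", "", stripped[2:])
            sections[current].append(item)
    return sections


def file_paths(items: list[str]) -> list[str]:
    """Keep only the path, dropping a trailing '(note)'."""
    return [item.split(" (", 1)[0].strip() for item in items]


sections = parse_prd_sections(PRD_TEXT)
allowed = file_paths(sections["Files to Modify"])
forbidden = file_paths(sections["Files NOT to Modify"])
print(allowed)
print(forbidden)
```

Having the two lists as data makes it possible to compare them against the agent's diff instead of relying on the agent to honor them.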

Step 3: Agent Implementation

The agent works in an isolated git worktree, preventing interference with my current branch:³

# Create isolated worktree for agent task
git worktree add ../worktrees/dark-mode -b feature/dark-mode

# Agent works in ../worktrees/dark-mode/
# I continue working in main workspace

# Review and cleanup after merge
git worktree remove ../worktrees/dark-mode

My recursion guard monitors the agent’s spawn behavior. My git safety guardian prevents the agent from force-pushing or committing credentials. These hooks run automatically. I don’t supervise the agent during implementation.
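The hooks themselves are not shown here, so the following is only a rough sketch of what a force-push and credential check could look like. The function names, the flag list, and the two secret patterns are assumptions for illustration; a real guard would need to cover more cases (for example `git -C <path> push` or combined short flags).

```python
import re

# Flags that make `git push` overwrite remote history.
FORCE_FLAGS = {"-f", "--force"}

# Two well-known secret shapes; a real guard would use a fuller list.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
]


def is_force_push(argv: list[str]) -> bool:
    """Return True if argv looks like a force push."""
    if len(argv) < 2 or argv[0] != "git" or argv[1] != "push":
        return False
    for arg in argv[2:]:
        if arg in FORCE_FLAGS or arg.startswith("--force-with-lease"):
            return True
        if arg.startswith("+"):  # "+refspec" also forces the update
            return True
    return False


def contains_secret(text: str) -> bool:
    """Return True if the text matches any known secret pattern."""
    return any(pattern.search(text) for pattern in SECRET_PATTERNS)


print(is_force_push(["git", "push", "--force", "origin", "main"]))  # True
print(is_force_push(["git", "push", "origin", "feature/dark-mode"]))  # False
print(contains_secret("dark_mode = False"))  # False
```

A hook built on checks like these would reject the command or the commit before it runs, which is what makes unsupervised implementation tolerable.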

Step 4: Review

A notification arrives when the agent completes. I review the diff against the PRD acceptance criteria. If all criteria pass, I merge. If not, I either fix manually or restart the agent with updated context.⁴
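Part of that review can be mechanical. Here is a small sketch, assuming the changed paths come from something like `git diff --name-only main...feature/dark-mode` and the two file lists come from the PRD; `check_scope` and the `tests/test_settings.py` path are hypothetical.

```python
def check_scope(
    changed: list[str], allowed: set[str], forbidden: set[str]
) -> dict[str, list[str]]:
    """Classify changed paths against the PRD's file lists."""
    return {
        # Files the PRD explicitly said not to touch.
        "forbidden": sorted(p for p in changed if p in forbidden),
        # Files the PRD did not mention at all; these need a closer look.
        "unlisted": sorted(
            p for p in changed if p not in allowed and p not in forbidden
        ),
    }


# Hard-coded here; in practice this would be the output of git diff --name-only.
changed = [
    "app/routes/settings.py",
    "app/templates/settings.html",
    "app/static/css/styles.css",
    "tests/test_settings.py",
]
allowed = {
    "app/routes/settings.py",
    "app/templates/settings.html",
    "app/models/user.py",
}
forbidden = {"app/static/css/styles.css", "app/templates/base.html"}

result = check_scope(changed, allowed, forbidden)
print(result)
```

A non-empty `forbidden` list is grounds to restart the agent. An `unlisted` entry such as a new test file may be fine, but it should be a conscious decision rather than a surprise in the diff.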

Where PRD-Driven Development Works

CRUD endpoints with clear data models. The PRD specifies the model, the routes, and the response format. The agent generates boilerplate that matches existing patterns.

Test additions for existing code. “Write tests for app/content.py covering load_post_by_slug with valid slug, invalid slug, and missing file” produces useful tests because the function already exists and the acceptance criteria are objective.

UI components following established patterns. “Add a category filter to the blog listing page using the same tab pattern as the guide page” works because the agent can reference the existing pattern.

Database migrations with defined schemas. The PRD specifies columns, types, defaults, and constraints. The agent generates the migration and updates the model.
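To illustrate why a defined schema makes the outcome objective, here is a sketch of the dark mode column from the earlier PRD, using an in-memory SQLite database. The `user_id` column and the `add_dark_mode_column` helper are assumptions; the real project's model layer and migration tool are not shown in this post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed starting schema: one row per user, no dark_mode column yet.
conn.execute("CREATE TABLE user_preferences (user_id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO user_preferences (user_id) VALUES (1)")


def add_dark_mode_column(conn: sqlite3.Connection) -> None:
    """Add dark_mode (boolean, default false) only if it is missing."""
    columns = {row[1] for row in conn.execute("PRAGMA table_info(user_preferences)")}
    if "dark_mode" not in columns:
        conn.execute(
            "ALTER TABLE user_preferences "
            "ADD COLUMN dark_mode BOOLEAN NOT NULL DEFAULT 0"
        )


add_dark_mode_column(conn)
add_dark_mode_column(conn)  # second call is a no-op

# A user created after the migration also gets the default.
conn.execute("INSERT INTO user_preferences (user_id) VALUES (2)")
rows = conn.execute(
    "SELECT user_id, dark_mode FROM user_preferences ORDER BY user_id"
).fetchall()
print(rows)  # [(1, 0), (2, 0)]
```

Every line of the PRD (column name, type, default) maps to something that can be asserted, which is what separates this category from open-ended work.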

Where PRD-Driven Development Fails

Ambiguous requirements. “Make the blog better” is not a PRD. The agent will make changes, but they won’t match your intent because your intent wasn’t specified.

Novel architecture decisions. When I needed to design the deliberation system’s consensus model, no PRD could capture the decision. I needed to explore options, evaluate tradeoffs, and iterate on the design. That required my deliberation skill, not a PRD.

Performance optimization. “Make the page load faster” requires profiling, measurement, and iterative investigation. The agent can’t profile your production traffic patterns from a PRD.

Security-critical code. PRDs for auth systems produce code that handles the happy path. Edge cases in authentication (timing attacks, session fixation, CSRF in non-standard flows) require human expertise that PRDs can’t encode.
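As one concrete instance of such an edge case, consider token comparison. The sketch below contrasts a plain equality check with Python's standard-library `hmac.compare_digest`; both function names are hypothetical, and the example covers only the timing issue, not session fixation or CSRF.

```python
import hmac


def tokens_match_naive(supplied: str, expected: str) -> bool:
    # Passes every happy-path test, but ordinary string comparison can
    # return as soon as it finds a difference, which can leak timing
    # information about how much of the token was correct.
    return supplied == expected


def tokens_match_constant_time(supplied: str, expected: str) -> bool:
    # compare_digest is designed to take time independent of where the
    # inputs differ.
    return hmac.compare_digest(supplied.encode(), expected.encode())


print(tokens_match_naive("abc123", "abc123"))  # True
print(tokens_match_constant_time("abc123", "abc123"))  # True
print(tokens_match_constant_time("abc124", "abc123"))  # False
```

Both versions satisfy an acceptance criterion like “valid token is accepted, invalid token is rejected,” so a PRD written at that level cannot tell them apart.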

How My PRD Template Evolved

Version 1 (Month 1): Simple User Stories

As a user, I want to save preferences so I can customize my experience.

Problem: Too vague. The agent made reasonable but wrong assumptions about storage mechanism, UI placement, and scope.

Version 2 (Month 2): Added File Locations

## Story: Save preferences
### Files to Modify
- app/routes/settings.py
- app/templates/settings.html

Problem: Better, but agents still “improved” adjacent files without permission.

Version 3 (Month 4): Added Constraints

## Story: Save preferences
### Files to Modify (only these)
- app/routes/settings.py
- app/templates/settings.html
### Constraints
- No new dependencies
- Use existing database models
- Do not modify CSS

Problem: Agents sometimes ignored constraints when they conflicted with “best practices” from training data.

Version 4 (Current): Explicit Exclusions + Test Expectations

The current template adds “Files NOT to Modify,” explicit test expectations, and acceptance criteria checkboxes. This version produces predictable results approximately 85% of the time. The remaining 15% requires a second pass with clarified instructions.⁵

The 30-PRD Pattern Library

After 30+ PRDs, patterns emerged:

| PRD Type | Success Rate | Avg Agent Time | Avg Review Time |
|---|---|---|---|
| CRUD endpoint | ~95% | 10-15 min | 5 min |
| Test additions | ~90% | 5-10 min | 10 min |
| UI component (existing pattern) | ~85% | 15-20 min | 10 min |
| Database migration | ~90% | 5-10 min | 5 min |
| Bug fix (with repro steps) | ~80% | 15-25 min | 15 min |
| New feature (novel) | ~50% | 30-45 min | 30+ min |

The success rate for novel features (~50%) explains why PRD-driven development supplements my workflow rather than replacing it. Roughly half the time, novel work requires iteration that PRDs can’t capture in advance.

Key Takeaways

For solo developers:
- Start with one well-defined PRD type (CRUD, tests) and validate the workflow before expanding to complex tasks
- Add “Files NOT to Modify” to every PRD; agents will helpfully “improve” code you didn’t ask them to touch
- Use git worktrees to isolate agent work; the cleanup cost of a failed agent run should be one command, not a git archaeology session

For engineering managers:
- PRD quality determines agent output quality; invest in PRD templates and review processes before scaling autonomous agent usage
- Track the merge-without-changes ratio to measure workflow maturity; the ratio should improve as PRD templates evolve
- Novel architecture work and security-critical code should not be PRD-delegated; reserve agent delegation for well-defined, repeatable tasks
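The merge-without-changes ratio is simple to compute once each agent run is logged. This is a minimal sketch, assuming one record per completed PRD; the `PrdRun` type and the sample records are made up for illustration and are not the tracked data behind the table above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PrdRun:
    prd_type: str
    merged_without_changes: bool


def merge_ratio_by_type(runs: list[PrdRun]) -> dict[str, float]:
    """Fraction of runs per PRD type that merged with no manual edits."""
    totals: Counter = Counter()
    clean: Counter = Counter()
    for run in runs:
        totals[run.prd_type] += 1
        if run.merged_without_changes:
            clean[run.prd_type] += 1
    return {prd_type: clean[prd_type] / totals[prd_type] for prd_type in totals}


# Illustrative records only.
runs = [
    PrdRun("CRUD endpoint", True),
    PrdRun("CRUD endpoint", True),
    PrdRun("New feature (novel)", True),
    PrdRun("New feature (novel)", False),
]
ratios = merge_ratio_by_type(runs)
print(ratios)  # {'CRUD endpoint': 1.0, 'New feature (novel)': 0.5}
```

Tracking the ratio per PRD type rather than as one global number keeps a weak category from hiding behind a strong one.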

FAQ

What is PRD-driven development?

PRD-driven development is a workflow where a Product Requirements Document defines a task with enough specificity that an AI agent can implement it autonomously. The PRD includes user stories, acceptance criteria, file locations, constraints, and test expectations. The agent works in an isolated git worktree while you continue other work. The pattern works best for well-defined, repeatable tasks like CRUD endpoints, test additions, and UI components following established patterns.

How many PRDs should I write before trusting the workflow?

Start with 3-5 PRDs in a single category where you have high confidence in the expected output, such as test additions or database migrations. These categories have ~90% success rates in my experience. Once you validate that the template produces predictable results, expand to UI components (~85% success rate) and bug fixes (~80%). Novel features have a ~50% success rate and should not be your entry point into PRD-driven development.

What format works best for PRDs used with Claude Code?

The most effective PRD format includes five sections: a user story with acceptance criteria checkboxes, a “Files to Modify” list with specific paths, a “Files NOT to Modify” exclusion list, test expectations with specific scenarios, and constraints (no new dependencies, use existing patterns). The exclusion list was the single biggest improvement to my template, preventing agents from “helpfully” modifying code outside the task scope.

Can PRD-driven development replace human code review?

No. PRD-driven development replaces the implementation step, not the review step. Every completed PRD still requires human review against the acceptance criteria before merging. In my workflow, a notification arrives when the agent completes, and I review the diff against each criterion. The quality loop and independent test verification supplement but do not replace human judgment on architectural fit and correctness.

Why do AI agents sometimes ignore PRD constraints?

Agents ignore constraints when those constraints conflict with patterns from training data. If your PRD says “no new dependencies” but the model’s training data associates the task with a specific library, the model may install the library anyway. The fix is explicit exclusions: “Files NOT to Modify” and “Do NOT install” lists work better than broad constraints such as “no new dependencies” because they give the agent a concrete boundary rather than a general guideline.
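Since an agent may still ignore the instruction, a dependency constraint is worth verifying after the run. The sketch below assumes the project pins its dependencies in a `requirements.txt`-style file, which this post does not confirm; `package_names`, `new_dependencies`, and the `Flask-Toggle` package are hypothetical.

```python
import re


def package_names(requirements_text: str) -> set[str]:
    """Extract normalized package names from requirements-style text."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            # Normalize as PEP 503 describes: lowercase, runs of -_. become "-".
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def new_dependencies(before: str, after: str) -> list[str]:
    """Packages present after the agent run but not before it."""
    return sorted(package_names(after) - package_names(before))


before = "flask==3.0.0\nsqlalchemy>=2.0\n"
after = "flask==3.0.0\nsqlalchemy>=2.0\nFlask-Toggle==1.0  # added by agent\n"

added = new_dependencies(before, after)
print(added)  # ['flask-toggle']
```

A non-empty result fails the “no new dependencies” constraint regardless of how reasonable the addition looks in the diff.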

References

1. @saasmakermac. “RalphBlaster autonomous workflow demonstration.” X/Twitter, January 2026.
2. Author’s PRD template implementation using Claude Code skills. 30+ PRDs written between August 2025 and February 2026.
3. Git Documentation. “git worktree - Manage multiple working trees.” 2025.
4. GitHub - snarktank/ralph. “Ralph: autonomous AI agent loop for development tasks.” 2026.
5. Author’s analysis of PRD success rates across 30+ agent tasks, tracked in project MEMORY.md.