
Boids to Agents: Flocking Rules for AI Systems

In 1986, Craig Reynolds asked a question that sounds trivial: how do birds flock? Not a philosophical question. An engineering one. He wanted to render convincing flocks in computer graphics without scripting each bird’s path.

His answer used three rules. Separation: steer to avoid crowding neighbors. Alignment: steer toward the average heading of neighbors. Cohesion: steer toward the average position of neighbors. No leader, no flight plan, no central controller. Just three rules applied locally to each bird, producing global coordination indistinguishable from real flocking.[1]
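A minimal sketch of the three rules, assuming 2D numpy vectors stored in per-boid dicts and a non-empty neighbor list; the neighbor search, force weighting, and speed limits of a full implementation are elided:

import numpy as np

def separation(boid, neighbors, radius=25.0):
    # Steer away from each neighbor closer than `radius`.
    away = np.zeros(2)
    for n in neighbors:
        offset = boid["pos"] - n["pos"]
        if np.linalg.norm(offset) < radius:
            away += offset
    return away

def alignment(boid, neighbors):
    # Steer toward the average heading (velocity) of neighbors.
    return np.mean([n["vel"] for n in neighbors], axis=0) - boid["vel"]

def cohesion(boid, neighbors):
    # Steer toward the average position of neighbors.
    return np.mean([n["pos"] for n in neighbors], axis=0) - boid["pos"]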

Forty years later, I run a multi-agent AI system designed around ten specialized agent roles that research, debate, and vote on decisions. The night a single prompt spawned 23 agents dynamically — each spawning more in an uncontrolled cascade — and all achieved perfect consensus on the wrong question, I realized I was watching the same dynamics Reynolds described: emergent coordination from simple rules, and emergent failure from the same mechanism.

TL;DR

Reynolds’ boids algorithm demonstrates that three simple local rules produce coherent global behavior without centralized control. Agents following local instructions (evaluate this code, check that security surface, review this architecture) produce emergent coordination when the rules are well-chosen and emergent chaos when they are not. Adding more rules to fix edge cases makes agents worse, not better. My 23-agent runaway incident proved this: a spawn budget constraining width, not only depth, is the agent equivalent of Reynolds’ separation rule.


Three Rules, Infinite Flocks

Toggle the rules above to see what happens. With all three active, the boids flock naturally. Remove separation, and they collapse into a single mass. Remove alignment, and they scatter into random motion. Remove cohesion, and they drift apart while maintaining local heading. Each rule is necessary. None is sufficient alone.

Reynolds presented this at SIGGRAPH 1987 as “Flocks, Herds, and Schools: A Distributed Behavioral Model.”[1] The paper changed computer graphics. Before boids, animated flocks required hand-scripted paths. After boids, flocks emerged from rules. Over a decade later, in 1998, Reynolds received the Scientific and Engineering Award from the Academy of Motion Picture Arts and Sciences for this work.[2]

The deeper contribution wasn’t the specific rules. It was the proof that global coordination doesn’t require global knowledge. Each boid only knows about its immediate neighbors. No boid has a map of the flock. No boid knows the flock’s destination. The flock has no destination. What looks like coordinated movement is local decisions producing a global pattern.


From Pixels to Processes

The mapping from boids to AI agents isn’t metaphorical. A NeurIPS 2025 workshop paper, “Revisiting Boids for Emergent Intelligence via Multi-Agent Collaboration,” explicitly applies boids principles to multi-agent systems.[3] The paper uses Reynolds’ rules as natural-language guidance for agents in a collaborative tool-building environment: alignment and separation derive from neighbors’ tool metadata, while cohesion injects the previous round’s global summary.

My mapping is simpler and comes from building the system before reading the paper:

Boids Rule | Agent Equivalent | What It Prevents
Separation | Spawn budget: limit active agents per parent | Agent pile-up on the same sub-problem
Alignment | Shared evaluation criteria: all agents use the same evidence standard | Agents working toward incompatible quality definitions
Cohesion | Consensus protocol: agents converge toward group findings | Agents drifting into unrelated tangents

The parallel isn’t perfect. Boids rules are continuous (steer toward average heading). Agent rules are discrete (spawn at most 12 children). Boids operate in spatial coordinates. Agents operate in problem space. But the structural insight holds: local rules, applied independently by each agent, produce coherent group behavior. And the failure modes are the same.


The Night 23 Agents Agreed on the Wrong Question

February 2026. I asked my agent to “investigate improving the hook dispatch system.” The agent assessed its own confidence at 0.58, which triggered the deliberation system. Three research agents spawned. Each found sub-problems and spawned their own research agents. Those agents spawned more.

Seven minutes later: 23 active agent processes. $4.80 in API credits. Token consumption climbing at $0.70 per minute.

The recursion guard tracked depth (parent spawns child, child spawns grandchild) but not width (parent spawns 12 children who each spawn 12 more). The depth limit of 3 never triggered because the agents spread horizontally. I killed the processes manually.
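A toy calculation shows why the depth guard stayed silent. The fan-out numbers mirror the parenthetical above; the guard logic is a simplified stand-in, not the actual implementation:

MAX_DEPTH = 3

def can_spawn(depth):
    # Depth-only recursion guard: stops deep chains, never wide fan-out.
    return depth < MAX_DEPTH

# One root (depth 0) spawns 12 children (depth 1), each spawning 12 more
# (depth 2): 1 + 12 + 144 = 157 agents, and every spawn is permitted.
assert all(can_spawn(depth) for depth in (0, 1, 2))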

Every agent agreed the hook dispatch system needed improvement. Every agent proposed reasonable changes. Not one agent questioned whether the investigation itself had the right scope.

The result demonstrates what happens when you have cohesion and alignment but no separation. The agents converged toward the same conclusion (cohesion) and aligned on the same evaluation criteria (alignment). But nothing prevented them from crowding onto the same sub-problem. In Reynolds’ terms, 23 boids occupied the same point in space, a behavior his separation rule explicitly prevents.


The Spawn Budget as Separation

The fix took 20 minutes. A spawn budget that tracks total active children per parent, capped at 12.[4] A width constraint, not just a depth one.

The implementation is a counter, not a rule:

# Simplified spawn budget (actual implementation uses hooks)
MAX_CHILDREN = 12  # width cap: maximum concurrently active children per parent

active_children = count_active_agents(parent_id=self.id)
if active_children >= MAX_CHILDREN:
    return "Budget exhausted. Synthesize existing findings instead."
# else: allow spawn

The hook enforces this deterministically. The agent cannot rationalize past a counter the way it can rationalize past a prompt instruction like “try not to spawn too many agents.” The counter either permits or blocks. No argument, no interpretation.

In boids terms, the spawn budget is the separation rule: don’t crowd your neighbors. In agent terms: don’t put more than N agents on a sub-problem, because the N+1th agent adds cost without adding perspective. The 23-agent incident taught me that the separation rule is the most important of the three. Without it, alignment and cohesion become pathological: agents converging enthusiastically on the same wrong answer.

Common implementations of Reynolds’ algorithm weight the separation force 1.5-2x higher than alignment or cohesion for exactly this reason — a convention documented in Reynolds’ own follow-up paper, “Steering Behaviors for Autonomous Characters” (GDC 1999), where he discusses priority-based force combination and notes that collision avoidance (separation) typically receives highest priority.[8] In my system, hooks enforce the spawn budget (deterministic, immune to agent rationalization), while prompts and context shape alignment and cohesion (softer, bendable). The hardest constraint gets the hardest enforcement.
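Under that convention, force combination might look like the following, reusing the rule functions from the sketch above; the 1.5 weight is illustrative, not a figure from Reynolds’ paper:

def steer(boid, neighbors):
    # Weighted sum of the three forces; separation is weighted highest
    # so crowding avoidance dominates when the forces conflict.
    return (1.5 * separation(boid, neighbors)
            + 1.0 * alignment(boid, neighbors)
            + 1.0 * cohesion(boid, neighbors))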


Why More Rules Make Agents Worse

Toggle the boids simulation above. Now imagine adding a fourth rule: “avoid the center of the canvas.” A fifth: “prefer the top half.” A sixth: “reverse course every 100 frames.”

Each rule is individually reasonable. Together, they destroy the flock. The boids jitter between competing forces, unable to satisfy all constraints simultaneously. The elegant flocking collapses into noise because each rule creates a force vector, and when six force vectors pull in different directions, the resultant movement is effectively random.

I observed the same pattern in agent orchestration. Early versions of my deliberation system had elaborate rules: minimum research depth, mandatory citation counts, required counterargument generation, forced devil’s-advocate passes. Each rule improved some specific failure case. Together, they produced agents that spent more tokens satisfying rules than solving problems.

A concrete example: the “mandatory counterargument” rule required every agent to argue against its own finding. The “minimum citation count” rule required three sources per claim. When an agent generated a genuine counterargument, the citation rule forced it to find three sources supporting the counter-position — which sometimes produced stronger evidence for the wrong answer than for the right one. The two rules interacted to reward well-sourced contrarianism over correct analysis.

The compounding engineering philosophy I documented explains why this happens at the system level: each new rule interacts with every existing rule, creating combinatorial complexity. Ten rules don’t produce ten constraints. They produce potentially 45 pairwise interactions. The system’s behavior becomes harder to predict, harder to debug, and more likely to produce emergent failure.
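The count is the handshake formula, n(n-1)/2:

def pairwise_interactions(n_rules):
    # Number of distinct rule pairs that can interfere: n choose 2.
    return n_rules * (n_rules - 1) // 2

print(pairwise_interactions(3))   # 3
print(pairwise_interactions(10))  # 45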

Reynolds’ original paper implicitly understood this. It presented three rules; Reynolds explored additional behaviors (obstacle avoidance, goal seeking) but kept the core flocking model minimal. Adding a fourth core rule would have required rebalancing the weights of all four, a problem that grows combinatorially with each addition.

The agent lesson: start with the minimum rules that produce the behavior you want. Add rules only when you observe a specific failure that no existing rule addresses. And when you add a rule, check whether it interferes with the others.


Decentralization as Architecture

The absence of a central controller is not a limitation of boids. It is the architecture. The flock coordinates because no individual bird has authority over the others. If one bird became the leader, the system would become brittle — dependent on that bird’s judgment, vulnerable to its failure. The pattern mirrors the distributed resilience principle that structural emptiness creates in other domains.[5]

The same pattern appears across multi-agent AI systems. Microsoft’s AutoGen (v0.4+, 2025) coordinates agents through conversation protocols rather than centralized orchestration — each agent decides when to speak based on local context, and the framework explicitly avoids designating a “lead” agent.[6] CrewAI (v0.28+, 2025) defines agent roles and handoff rules through its “crew” abstraction but does not give any single agent override authority over others.[7] In my deliberation system, no single agent has veto power. Each evaluates independently. The consensus protocol aggregates their findings. If one agent produces garbage, the others outvote it. Centralized control would reintroduce the single-point-of-failure that decentralized coordination eliminates.
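A hedged sketch of the aggregation step; the actual consensus protocol is richer, but even plain majority voting shows why one garbage finding cannot dominate:

from collections import Counter

def consensus(findings):
    # findings: one independent verdict per agent. Majority wins; no
    # single agent can veto or override the group.
    verdict, votes = Counter(findings).most_common(1)[0]
    return verdict, votes / len(findings)

print(consensus(["approve", "approve", "reject"]))  # ('approve', 0.666...)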


When the Three Rules Worked: A Quiet Success

The 23-agent incident is memorable because it failed dramatically. Successes are quieter and more numerous.

A week after adding the spawn budget, I asked the system to “evaluate whether the blog’s A/B testing infrastructure should use server-side or client-side variant assignment.” The deliberation system spawned three research agents. Agent 1 investigated server-side approaches (cookie-based, session-based). Agent 2 investigated client-side approaches (localStorage, URL parameters). Agent 3 evaluated hybrid patterns. Each agent worked independently toward the same evaluation question (cohesion: converging on a shared goal). Each used the same evidence standards (alignment: shared quality criteria). The spawn budget prevented any agent from spawning its own sub-agents for this task (separation: no crowding).

The three agents returned findings. Agent 1 favored server-side assignment for SEO consistency. Agent 2 favored client-side for implementation simplicity. Agent 3 identified a hybrid approach that matched the existing codebase patterns. The consensus protocol synthesized the findings: server-side assignment for the blog (where SEO matters), with client-side override for the interactive components (where JavaScript already runs).

The entire deliberation took four minutes and cost $1.40. No agent spawned sub-agents. No agent agreed with the others prematurely. The recommendations were independent and genuinely complementary. The three-rule architecture produced exactly the behavior Reynolds described: local decisions producing coordinated global output.


Emergent Patterns in Other Domains

The “simple rules, emergent behavior” pattern isn’t unique to boids or agents. It appears wherever local decisions aggregate into global structure:

Hamming codes demonstrate a related form of emergence. Strategic placement of parity bits at positions that are powers of 2 creates a system where the error’s position emerges from XORing the positions of all “1” bits. No bit knows where the error is. The error location emerges from the structure. A single parity bit in isolation tells you almost nothing. Multiple parity bits in the right positions tell you everything.
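A compact way to see that emergence, assuming a received codeword with parity bits at power-of-2 positions and at most one flipped bit (bits is a list of 0s and 1s, with index 0 holding position 1):

def error_position(bits):
    # XOR together the (1-based) positions of every set bit. For a valid
    # Hamming codeword the XOR is 0; a single flipped bit makes the XOR
    # equal to that bit's position.
    pos = 0
    for i, bit in enumerate(bits, start=1):
        if bit:
            pos ^= i
    return pos  # 0 means no single-bit error detected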

Conway’s Game of Life produces gliders, oscillators, and Turing-complete computation from four rules applied to a grid: a live cell with fewer than two neighbors dies; with two or three it survives; with more than three it dies; and a dead cell with exactly three neighbors becomes alive. The rules are simpler than boids. The emergent complexity is unbounded.
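The full update fits in a few lines when the board is represented as a set of live cells:

from collections import Counter

def life_step(live):
    # live: set of (x, y) coordinates of live cells.
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly three neighbors; survival on two or three.
    return {cell for cell, n in neighbor_counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A blinker oscillates with period 2:
assert life_step({(0, 0), (1, 0), (2, 0)}) == {(1, -1), (1, 0), (1, 1)}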

In every case, the pattern is the same: local rules, no central coordinator, emergent global behavior. And in every case, the same failure mode applies: add too many rules and the system collapses from elegant emergence into chaotic interference.


Key Takeaways

For engineers designing multi-agent systems:

  • Three rules produce flocking. Four rules might not. Reynolds’ insight was that three well-chosen rules are sufficient for emergent coordination. Adding a fourth creates interference that can destroy the behavior the first three produced. Start with the minimum rules that produce the behavior you want.

  • Separation is the most important rule. Without it, alignment and cohesion become pathological, with agents converging on the same wrong answer. The spawn budget (width constraint) was the most impactful single change I made to my multi-agent system.

For architects building distributed systems:

  • The absence of a leader is the architecture. Centralized control creates a single point of failure. Decentralized coordination through local rules is more resilient. Design your agent systems without a “lead agent” unless you have a specific reason to add one.

  • More rules create more interactions. N rules produce up to N(N-1)/2 pairwise interactions. Behavior becomes harder to predict with each addition. Add rules only when you observe a specific failure that no existing rule addresses.

Exercise: Map your own system. Diagram your current multi-agent system (or a system you plan to build). For each agent, identify which of Reynolds’ three rules it follows. If separation is missing, add a spawn budget or concurrency limit. If alignment is missing, define shared evaluation criteria. If cohesion is missing, add a synthesis step. The diagram reveals which failure mode your system is most vulnerable to.


FAQ

What are boids and why do they matter for AI agent design?

Boids are simulated agents (originally “bird-oid objects”) that follow three local rules: separation (avoid crowding neighbors), alignment (steer toward average neighbor heading), and cohesion (steer toward average neighbor position). Craig Reynolds introduced them at SIGGRAPH 1987. They matter because they proved that complex, coordinated global behavior can emerge from simple local rules without any central controller — a principle that applies directly to multi-agent AI systems, swarm robotics, and distributed computing.

How do boids rules apply to AI agent orchestration?

Each boids rule maps to an agent design constraint. Separation becomes a spawn budget (limit how many agents work on the same sub-problem). Alignment becomes shared evaluation criteria (all agents use the same quality standard). Cohesion becomes a consensus protocol (agents converge toward group findings). The mapping is structural, not metaphorical: both systems produce emergent coordination from local rules applied independently by each agent.

Why do more rules make multi-agent AI systems worse instead of better?

Each new rule interacts with every existing rule, creating combinatorial complexity. Ten rules produce up to 45 pairwise interactions. Agents spend tokens satisfying rule constraints instead of solving problems. The same dynamic appears in boids: adding a fourth rule (e.g., “avoid the center”) forces rebalancing with the original three, and the elegant flocking behavior often degrades into jittering as agents try to satisfy competing forces simultaneously.

When should I use centralized control instead of emergent coordination?

Use centralized control when you need strict ordering guarantees (sequential pipeline stages), when failures must be handled deterministically (financial transactions), or when the system requires a single authoritative decision rather than a consensus (merge conflict resolution). Use emergent coordination when agents can evaluate independently, when redundancy improves reliability, and when no single agent needs the full picture to do its job well.


Part of the Interactive Explorations series, where algorithms meet visual intuition: from Hamming codes that catch their own mistakes to boids that flock without a leader. The agent orchestration patterns appear in detail in The Ralph System and Multi-Agent Deliberation. The metacognitive programming layer adds individual-agent self-monitoring to complement the inter-agent coordination above.



  1. Reynolds, C. W. (1987). “Flocks, Herds, and Schools: A Distributed Behavioral Model.” SIGGRAPH ’87: Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques, pp. 25-34. red3d.com/cwr/boids/

  2. Reynolds received the Scientific and Engineering Award from the Academy of Motion Picture Arts and Sciences in 1998 for his contributions to behavioral animation, building on the boids work. The algorithm has been used in film production since Batman Returns (1992). oscars.org/sci-tech/ceremonies/1998. See also Reynolds’ resume: red3d.com/cwr/resume.html

  3. “Revisiting Boids for Emergent Intelligence via Multi-Agent Collaboration.” NeurIPS 2025 Workshop: Scaling Environment. openreview.net/pdf?id=46LJ81Yqm2. The paper applies boids rules as natural-language guidance for agents following an observe-reflect-build loop. 

  4. The spawn budget implementation is documented in The Ralph System. The key architectural decision: enforce width limits through hooks (deterministic) rather than prompts (advisory). An agent instructed to “limit spawning” will rationalize past the limit. A hook that counts active children and blocks the spawn API call cannot. 

  5. Lao Tzu, Tao Te Ching, Chapter 11. Translation: D.C. Lau, Penguin Classics, 1963. The full passage discusses the hub of a wheel, the walls of a room, and the clay of a vessel: three examples where utility comes from the void, not the material. See Nothing is Structural for the full exploration. 

  6. Microsoft AutoGen, github.com/microsoft/autogen. AutoGen v0.4 (released 2025) introduced the GroupChat abstraction where agents participate in conversations without a designated leader. The framework’s documentation explicitly states: “AutoGen enables a group of agents to collectively perform tasks that a single agent alone cannot.” The conversation protocol, not a central controller, determines turn order. 

  7. CrewAI, github.com/crewAIInc/crewAI. CrewAI v0.28+ (2025) defines crews with role-based agents and task handoff rules. The process parameter supports “sequential” (fixed order) and “hierarchical” (manager-delegated) modes, but even in hierarchical mode, the manager agent coordinates rather than overrides — it cannot change another agent’s output, only request revisions. Documentation: docs.crewai.com

  8. Reynolds, C. W. (1999). “Steering Behaviors for Autonomous Characters.” Game Developers Conference 1999 Proceedings, pp. 763-782. red3d.com/cwr/steer/. Reynolds discusses prioritized force combination, where separation (collision avoidance) is processed first and consumes available steering force before alignment or cohesion are applied. The priority scheme ensures that avoiding crowding always takes precedence. 
