Minimum-Sufficient Control Ladder
Climb to the next control mechanism only when a named failure mode in the current rung demands it — for reversible, observable, bounded-blast-radius tasks.
The minimum-sufficient control ladder is an escalation algorithm: start every agent at the cheapest control mechanism (Tool Use), then add a heavier mechanism (Reflection, Evaluator-Optimizer, Human-in-the-Loop, Parallelization, in that order) only when a specific named failure mode is observed in the current rung. The order and the named failure mode at each rung are both load-bearing; without them the ladder collapses into the pattern shopping that the Cargo Cult Agent Setup anti-pattern describes. This algorithm is orthogonal to the Pattern Selection Map, which compares patterns on trade-off axes once you have decided to use them; the ladder decides whether a control is needed at all.
When This Ladder Applies: Read First
This algorithm is Qualified: it assumes the work it governs is reversible, its failure modes are externally observable, and the cost of letting a failure mode ship for one cycle is bounded. Three classes of work violate those assumptions and need concurrent defense-in-depth rather than incremental escalation:
- Irreversible / high-blast-radius actions — production writes, money movement, deletes, deploys. A reactive ladder reaches Human-in-the-Loop only after the first bad action has shipped, which is exactly the action you cannot allow. Microsoft's Taxonomy of Failure Mode in Agentic AI Systems argues "no single control is sufficient" for these surfaces and prescribes layered input sanitisation, output validation, runtime monitoring, and behavioural anomaly detection concurrently (Microsoft, 2025).
- Long multi-step workflows where compounding error dominates: at 85% per-step accuracy a 10-step workflow succeeds about 20% of the time, and at 95% per-step accuracy it still succeeds only about 60% of the time, since 0.85^10 ≈ 0.20 and 0.95^10 ≈ 0.60 when steps fail independently (Trantor, 2026). Reflection or Incremental Verification needs to be in place from step one; a rung that waits for a failure before adding Reflection lands far too late on long horizons.
- Adversarial surfaces — prompt injection, untrusted fetched content, tool-response poisoning. The Redis three-layer guardrail architecture (perimeter / plan validation / output sanitisation) and the broader defense-in-depth literature place all three layers at launch, not rung by rung (Redis: Agentic AI Guardrails).
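The compounding-error figures in the second bullet can be reproduced by multiplying per-step success probabilities. The sketch below assumes every step succeeds independently, which is a simplification and may not match how the cited source models it; `workflow_success_rate` is a hypothetical helper name.

```python
def workflow_success_rate(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


for accuracy in (0.85, 0.95):
    rate = workflow_success_rate(accuracy, steps=10)
    # Prints 19.7% for 85% per step and 59.9% for 95% per step.
    print(f"{accuracy:.0%} per step over 10 steps -> {rate:.1%} end to end")
```

Raising `steps` shows why a reactive rung arrives late on long horizons: the end-to-end rate keeps shrinking with every additional unverified step.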
For the remaining work — most coding-agent tasks with branch-scoped writes, observable test signals, and bounded retries — the ladder applies and the rest of this page describes it.
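The applicability check above can be written down as a small gate that runs before any rung is chosen. This is a minimal sketch: `TaskProfile`, its field names, and `ladder_applies` are hypothetical and exist only to illustrate the three assumptions and the excluded classes of work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical description of the work an agent is about to do."""
    reversible: bool                   # can a bad action be undone (e.g. a branch-scoped write)?
    observable: bool                   # do failures surface externally (tests, exit codes)?
    bounded_blast_radius: bool         # is the cost of one shipped failure capped?
    adversarial_surface: bool = False  # untrusted input, fetched content, tool responses
    long_horizon: bool = False         # many dependent steps where errors compound


def ladder_applies(task: TaskProfile) -> bool:
    """Return True when incremental escalation is appropriate for this task."""
    meets_assumptions = (
        task.reversible and task.observable and task.bounded_blast_radius
    )
    needs_defense_in_depth = task.adversarial_surface or task.long_horizon
    return meets_assumptions and not needs_defense_in_depth


branch_edit = TaskProfile(reversible=True, observable=True, bounded_blast_radius=True)
prod_deploy = TaskProfile(reversible=False, observable=True, bounded_blast_radius=False)
print(ladder_applies(branch_edit))  # True
print(ladder_applies(prod_deploy))  # False
```

When the gate returns `False`, the page's guidance is to layer controls concurrently from the start rather than climb rung by rung.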
The Five Rungs
The rungs are ordered by controllability cost: each climb trades controllability for capability, so unjustified climbs add latency and token spend without improving outcomes. Augment Code's Five Decision Rules name the original algorithm; the wording below adapts it to this site's pattern vocabulary, replacing Planning with Evaluator-Optimizer + HITL + Parallelization to match the failure modes most coding agents actually observe (Augment Code: Agentic Design Patterns).
| Rung | Add when… | Pattern |
|---|---|---|
| 1. Tool Use | Default for every agent — single LLM call plus tools resolves the task | Anthropic's Effective Agents Framework §Augmented LLM |
| 2. Reflection | Output quality needs verification against objective, externally verifiable criteria (tests, linters, type checkers) | Agent Self-Review Loop |
| 3. Evaluator-Optimizer | Self-critique is unreliable — same model marking its own homework misses systematic failure modes | Evaluator-Optimizer Pattern |
| 4. Human-in-the-Loop | Action is irreversible or blast radius escapes the agent's sandbox | Human-in-the-Loop section |
| 5. Parallelization | Subtasks are genuinely independent and benefit from sectioning or voting | Domain-Scoped Parallel Localization, Fan-Out Synthesis |
The named failure mode at each rung is load-bearing: "this output looks wrong" is not a failure mode; "this output passes the linter but fails the test suite that the agent did not run" is. Climb only when the failure mode is named, reproducible, and traceable to a missing control, not when an output merely looks insufficient.
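The climb rule can be sketched as a function that refuses to move without a qualifying failure mode. The names `Rung`, `FailureMode`, and `next_rung` are hypothetical. The sketch also assumes one rung per justified climb and does not check that the failure matches the next rung's "Add when…" condition; that match remains a human judgment.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Rung(IntEnum):
    TOOL_USE = 1
    REFLECTION = 2
    EVALUATOR_OPTIMIZER = 3
    HUMAN_IN_THE_LOOP = 4
    PARALLELIZATION = 5


@dataclass(frozen=True)
class FailureMode:
    name: str                       # e.g. "passes linter, fails tests the agent never ran"
    reproducible: bool              # has the failure been observed more than once on demand?
    missing_control: Optional[str]  # the control whose absence explains the failure, if known

    def justifies_climb(self) -> bool:
        # Named, reproducible, and traceable to a missing control.
        return (
            bool(self.name.strip())
            and self.reproducible
            and self.missing_control is not None
        )


def next_rung(current: Rung, failure: Optional[FailureMode]) -> Rung:
    """Stay put unless a qualifying failure mode demands the next rung."""
    if failure is None or not failure.justifies_climb():
        return current
    if current is Rung.PARALLELIZATION:
        return current  # top of the ladder: nothing further to add
    return Rung(current + 1)


vague = FailureMode("output looks wrong", reproducible=False, missing_control=None)
named = FailureMode(
    "passes linter, fails tests the agent never ran",
    reproducible=True,
    missing_control="run the test suite before finishing",
)
print(next_rung(Rung.TOOL_USE, vague).name)  # TOOL_USE
print(next_rung(Rung.TOOL_USE, named).name)  # REFLECTION
```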
Why It Works
Each rung trades controllability for capability, and capability that is not needed has a compounding cost: latency, token spend, coordination bugs, and new failure surfaces. Augment Code's trade-off matrix grounds the ordering: Tool Use is "low latency, predictable cost, high reliability (bounded), highest controllability"; Multi-Agent / Parallelization is "highest latency, highest token risk, lowest reliability out of the box, lowest controllability" (Augment Code: Agentic Design Patterns). Anthropic's Building Effective Agents makes the same observation independently: "agentic systems often trade latency and cost for better task performance," with the recommendation to "start with simple prompts… and add multi-step agentic systems only when simpler solutions fall short" and to add complexity "only when it demonstrably improves outcomes" (Anthropic Engineering).
The naming requirement at each rung is the mechanism that keeps the ladder from collapsing into pattern shopping. Without a named failure mode, "add Reflection" becomes a reflex; tying each climb to an observable, reproducible failure forces the team to instrument what they have before adding what looks sophisticated.
When This Backfires
Three failure modes apply on top of the applicability bounds above:
- Failures that look like nominal behaviour — silent objective drift, hallucinated tool calls with plausible outputs, slow context poisoning. The ladder assumes a failure mode is externally observable; when it is not, the rung does not climb until the damage is downstream and harder to reverse. Trust Without Verify names this surface specifically.
- Escalation as uncertainty signalling rather than recovery — recent work on autonomy-induced security risks observes that once an agent's reasoning collapses, adding another rung does not reliably improve outcomes; escalation becomes "a behavioral signal of uncertainty" rather than productive recovery (arxiv: A Survey on Autonomy-Induced Security Risks in Large Model-Based Agents). Climbing the ladder cannot rescue a fundamentally broken plan.
- Teams without the eval scaffolding to name failure modes — the algorithm assumes you can distinguish a Reflection-resolvable failure from an Evaluator-Optimizer-resolvable failure. Without that scaffolding, "named failure mode" collapses to "something felt wrong," and the ladder loses its discipline. Get the eval strategy in place before the ladder is meaningful.
Worked Example
A code-review agent starts at Rung 1 (Tool Use alone) with a single prompt that reads the diff and posts a review. After two weeks, the team observes a reproducible failure mode: the agent posts reviews that miss violations of the project's linter rules. The failure is named, externally verifiable (linter exit code), and traceable to a missing control (the agent never ran the linter before reviewing). Rung 2 (Reflection) adds a step where the agent runs the linter and revises the review. The failure mode disappears.
A month later a second failure mode appears: the agent marks its own reviews as "looks good" even when a test it ran has failed. Self-critique is unreliable for this surface. Rung 3 (Evaluator-Optimizer) adds a separate evaluator that gates the review on test results. The team stops at Rung 3; there is no climb to Rung 4 or 5 because the loop contains no irreversible action and no genuinely independent subtasks. The agent's control overhead and token cost stay at the minimum the observed failure modes require.
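The two climbs in this example can be sketched with injected callables standing in for the model and the tooling. Everything here is hypothetical: `CheckResult`, `review_with_reflection`, `evaluator_gate`, and the stub lambdas illustrate the shape of the controls, not any particular agent framework.

```python
from typing import Callable, NamedTuple


class CheckResult(NamedTuple):
    exit_code: int  # 0 means the check passed
    output: str     # text the tool reported


def review_with_reflection(
    draft_review: Callable[[str], str],
    revise_review: Callable[[str, str], str],
    run_linter: Callable[[], CheckResult],
    diff: str,
) -> str:
    """Rung 2: draft, run the linter, and revise only if it reports problems."""
    review = draft_review(diff)
    lint = run_linter()
    if lint.exit_code != 0:
        review = revise_review(review, lint.output)
    return review


def evaluator_gate(agent_verdict: str, run_tests: Callable[[], CheckResult]) -> str:
    """Rung 3: a separate evaluator overrides the agent's verdict when tests fail."""
    tests = run_tests()
    if tests.exit_code != 0:
        return "changes requested: test suite failed"
    return agent_verdict


# Stubs stand in for the model calls and the real linter and test runner.
review = review_with_reflection(
    draft_review=lambda diff: "looks good",
    revise_review=lambda draft, lint_output: f"{draft}; linter: {lint_output}",
    run_linter=lambda: CheckResult(1, "unused import"),
    diff="example diff",
)
print(review)  # looks good; linter: unused import
print(evaluator_gate("looks good", lambda: CheckResult(1, "1 failed")))
# changes requested: test suite failed
```

The gate takes the agent's verdict as input and never asks the agent to judge its own test results, which is the property Rung 3 adds over Rung 2.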
Key Takeaways
- Start every agent at Rung 1 (Tool Use) and climb only on a named, reproducible failure mode in the current rung.
- The ladder applies to reversible, observable, bounded-blast-radius tasks; irreversible, adversarial, or compounding-error surfaces need concurrent defense-in-depth instead.
- Each climb trades controllability for capability; unjustified climbs add latency, token spend, and failure surface without improving outcomes.
- Put eval scaffolding in place first; without it, "named failure mode" collapses to intuition and the ladder loses its discipline.
Related
- Pattern Selection Map — compares patterns on cost / latency / blast-radius axes once you have decided to add one; this page decides whether to add one at all
- Anthropic's Effective Agents Framework — the taxonomy this ladder orders; "start simple, add complexity only when it demonstrably improves outcomes" is the underlying principle
- The Delegation Decision — decides whether to use an agent at all; this ladder takes over once you have
- Cost-Aware Agent Design — model-tier escalation (Haiku → Sonnet → Opus) by task complexity; orthogonal axis to this ladder's control-mechanism escalation by failure mode
- Cargo Cult Agent Setup — the anti-pattern this ladder exists to defuse: stacking sophisticated controls upfront without a failure mode that justifies them