Poka-Yoke for Agent Tools

Redesign agent tool interfaces so that the wrong call cannot be expressed: prevention over documentation.

Also known as

Mistake-Proofing, Error-Proof Tool Design, Structural Constraints

From Manufacturing to Tool Design

Poka-yoke (mistake-proofing) originated in Toyota's production system: redesign the process so the defective outcome is structurally impossible. Anthropic applies it to agent tools — "Change the arguments so that it is harder to make mistakes" (Building Effective Agents) — and reports spending more time optimizing tools than the overall prompt (SWE-bench Sonnet).

The Core Shift

Documentation tells the agent how to use a tool correctly. Poka-yoke makes incorrect use fail fast or become impossible.

Approach	Mechanism	Failure mode
Documentation	Describes correct usage	Agent ignores or misreads instructions
Validation	Rejects bad input at runtime	Agent wastes a turn, retries blindly
Poka-yoke	Eliminates the bad input from the interface	Error cannot occur

The three classic poka-yoke methods from manufacturing map onto tool design as follows:

Manufacturing function	Tool design equivalent	Example
Contact method — shape prevents misuse	Parameter type prevents invalid calls	Enum `["python", "typescript", "all"]` vs free-text
Fixed-value method — counters enforce limits	Bounds and defaults prevent out-of-range values	`max_results` clamped 1–100, default 20
Motion-step method — enforced sequence	Prerequisite gates block out-of-order operations	Read-before-write: Edit rejects if file not yet read
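
The contact and fixed-value rows above can be sketched as follows, assuming a hypothetical `search_code` tool. The schema keys (`enum`, `minimum`, `maximum`, `default`) follow JSON Schema; the tool name, the `query` parameter, and the `normalize_args` helper are illustrative assumptions, not part of any real agent framework.

```python
from enum import Enum


class Language(str, Enum):
    """Closed set of accepted values; free text cannot be passed."""
    PYTHON = "python"
    TYPESCRIPT = "typescript"
    ALL = "all"


# Hypothetical JSON-Schema-style input definition for an illustrative tool.
SEARCH_TOOL_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "language": {"type": "string", "enum": [lang.value for lang in Language]},
        "max_results": {"type": "integer", "minimum": 1, "maximum": 100, "default": 20},
    },
    "required": ["query", "language"],
}


def normalize_args(query: str, language: str, max_results: int = 20) -> dict:
    """Coerce raw arguments into the only shapes the tool accepts."""
    lang = Language(language)  # contact method: raises ValueError outside the enum
    clamped = max(1, min(100, max_results))  # fixed-value method: clamp to 1-100
    return {"query": query, "language": lang.value, "max_results": clamped}
```

In this sketch a call with `language="rust"` raises before the tool runs, while `max_results=500` is clamped to 100 rather than rejected.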

Production Patterns

Absolute Paths Over Relative

Relative file paths broke once the agent changed directories. Requiring absolute paths eliminated the failure mode:

"Sometimes models could mess up relative file paths after the agent had moved out of the root directory. To prevent this, we simply made the tool always require an absolute path." — SWE-bench Sonnet
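
One way such a requirement might be enforced at the tool boundary is sketched below. The `require_absolute` helper and the `/repo` prefix in the error message are hypothetical; the sketch uses POSIX path rules and is not the implementation described in the quote.

```python
from pathlib import PurePosixPath


def require_absolute(path: str) -> PurePosixPath:
    """Reject relative paths instead of resolving them against a working directory."""
    candidate = PurePosixPath(path)
    if not candidate.is_absolute():
        # The error names the fix, so the agent can correct the call in one step.
        raise ValueError(
            f"'{path}' is relative; pass an absolute path such as '/repo/{path}'"
        )
    return candidate
```

Because the tool never consults the current working directory, a `cd` earlier in the session cannot change which file a call refers to.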

Uniqueness Constraints on Edits

String replacement (old_str/new_str) fails if old_str matches zero or multiple locations — the uniqueness constraint is itself a poka-yoke. Zero matches means stale context; multiple means insufficient context. Both force the agent to add specificity.
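
A minimal sketch of the uniqueness check, assuming the edit operates on an in-memory string. The function name `replace_unique` and the error wording are illustrative, not taken from any specific tool.

```python
def replace_unique(text: str, old_str: str, new_str: str) -> str:
    """Replace old_str with new_str only when old_str occurs exactly once."""
    if not old_str:
        raise ValueError("old_str must not be empty")
    matches = text.count(old_str)  # counts non-overlapping occurrences
    if matches == 0:
        raise ValueError("old_str not found; re-read the file, the context may be stale")
    if matches > 1:
        raise ValueError(
            f"old_str matches {matches} locations; include more surrounding text"
        )
    return text.replace(old_str, new_str, 1)
```

Each error message tells the agent which kind of specificity is missing, so a retry is guided rather than blind.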

Read-Before-Write Gates

Claude Code's Edit tool, and its Write tool when overwriting an existing file, reject calls if the file has not been read in the current session. The prerequisite is structural, not just an instruction.
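
The gate can be illustrated with an in-memory stand-in. This is a sketch of the general pattern under assumed names (`GatedFileTools`, `read`, `edit`), not Claude Code's implementation.

```python
class GatedFileTools:
    """In-memory stand-in for a file tool set with a read-before-write gate."""

    def __init__(self, files: dict[str, str]):
        self._files = dict(files)
        self._read_paths: set[str] = set()  # paths read during this session

    def read(self, path: str) -> str:
        content = self._files[path]
        self._read_paths.add(path)  # reading is what unlocks editing
        return content

    def edit(self, path: str, new_content: str) -> None:
        if path not in self._read_paths:
            raise PermissionError(f"read {path} before editing it")
        self._files[path] = new_content
```

The session state lives in the tool layer, so the gate holds even if the agent ignores every instruction in the prompt.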

Output Truncation Boundaries

Read-tool line caps bound context consumption and Bash-tool command timeouts bound execution time, even when the agent asks for everything.
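
A line cap might look like the sketch below. The default of 200 lines, the function name, and the hint wording are assumptions chosen for illustration.

```python
def truncate_lines(output: str, max_lines: int = 200) -> str:
    """Return at most max_lines lines, plus a hint when anything was cut."""
    lines = output.splitlines()
    if len(lines) <= max_lines:
        return output
    omitted = len(lines) - max_lines
    # The trailing hint tells the agent how to recover the missing part.
    hint = f"[truncated: {omitted} more lines; request a narrower range to see them]"
    return "\n".join(lines[:max_lines] + [hint])
```

The cap applies regardless of what the agent requested, and the hint keeps the truncation from looking like the end of the data.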

Training-Aligned Formats

Tool formats should stay close to text the model has seen occurring naturally on the internet, so that they leverage its training priors. Formats that require line counting, string escaping, or other unusual mechanical reasoning increase error rates (Building Effective Agents).

graph LR
    A[Tool input format] --> B{Close to training data?}
    B -->|Yes: JSON, markdown, prose| C[Strong model priors]
    B -->|No: line numbers, fixed-width, custom DSL| D[Weak priors, higher error rate]
    C --> E[Reliable tool use]
    D --> F[Frequent misuse]

Tool Use Examples in Definitions

Concrete sample calls in tool definitions improved accuracy from 72% to 90% on complex parameter handling in Anthropic's testing (Advanced Tool Use).

Beyond Tool Parameters

Technique	What it prevents	Source
Credential scoping — test/staging with budget caps (e.g., $5 limit)	Costly mistakes in production	Willison
Pre-completion checklists — middleware forces verification before agent exit	Incomplete or incorrect final outputs	LangChain
Loop detection middleware — intervenes after N iterations	Infinite retry loops	LangChain
Minimal non-overlapping toolsets — reduce ambiguity in tool selection	Wrong-tool selection	Context Engineering
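
As an illustration of the loop-detection row, the sketch below counts identical tool calls and signals when a threshold is exceeded. The `LoopDetector` class and its `limit` parameter are hypothetical and do not reproduce LangChain's middleware API; arguments are assumed to be JSON-serializable.

```python
import json
from collections import Counter


class LoopDetector:
    """Flags a tool call once it has been repeated more than `limit` times."""

    def __init__(self, limit: int = 3):
        self.limit = limit
        self._seen: Counter = Counter()

    def should_intervene(self, tool_name: str, arguments: dict) -> bool:
        # Serialize with sorted keys so argument order does not hide a repeat.
        key = (tool_name, json.dumps(arguments, sort_keys=True))
        self._seen[key] += 1
        return self._seen[key] > self.limit
```

A harness could call `should_intervene` before dispatching each tool call and, on `True`, inject a message asking the agent to change approach instead of executing the call again.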

When This Backfires

Over-constraining tool interfaces introduces its own failure modes:

Enum exhaustion — a fixed enum that was valid at design time excludes production edge cases; until the enum is updated, the agent cannot proceed.
Prerequisite deadlock — read-before-write gates block optimistic-write and content-from-scratch pipelines.
Designer blind spots — constraints encode the designer's model of valid usage; legitimate emergent strategies get rejected.
Over-normalized toolsets — a toolset that is too narrow pushes agents toward multi-step workarounds with higher cumulative error probability. This is the opposite extreme from the overlapping, oversized toolsets that Tool Minimalism warns against.

Apply poka-yoke where failure modes are well-understood and the constraint space is stable. Prefer validation when use cases are still evolving.

Designing Your Own Poka-Yoke

Can any parameter accept values that are never valid? Constrain to an enum or validated range.
Does the tool depend on prior state? Add a prerequisite gate (read-before-write, auth-before-access).
Can the output overwhelm the context window? Add truncation with recovery hints, as in Graceful Tool-Output Truncation.
Does the format require precise mechanical reasoning? Switch to a format with strong training priors.
Can the tool silently apply the wrong change? Add a uniqueness or idempotency constraint.
Test the tool as you would an API handed to a junior developer: run many inputs and observe where the model fails. Fix the interface, not the prompt.
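
The last checklist item can be approximated offline by replaying recorded calls through a validator and tallying why they fail. Everything in this sketch is hypothetical: the `validate_search` rules, the call format, and the failure labels.

```python
from collections import Counter
from typing import Callable, Iterable


def validate_search(call: dict) -> None:
    """Illustrative validator for a search tool's arguments."""
    if call.get("language") not in {"python", "typescript", "all"}:
        raise ValueError("language outside enum")
    if not 1 <= call.get("max_results", 20) <= 100:
        raise ValueError("max_results out of range")


def tally_failures(validate: Callable[[dict], None], calls: Iterable[dict]) -> Counter:
    """Count failure reasons across many recorded tool calls."""
    failures: Counter = Counter()
    for call in calls:
        try:
            validate(call)
        except ValueError as err:
            failures[str(err)] += 1
    return failures
```

The most frequent failure reason points at the parameter most worth redesigning, for example by widening an enum or replacing a rejection with a clamp.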

Key Takeaways

Poka-yoke makes the wrong tool call structurally impossible, not merely documented as wrong — one of the four Agent-Computer Interface (ACI) design principles.
Three manufacturing mechanisms map to tool design: parameter types (contact), bounds and defaults (fixed-value), and prerequisite gates (motion-step).
Apply where failure modes are stable and well-understood; prefer runtime validation when the constraint space is still evolving.
Fix the interface, not the prompt — the prompt-fix loop has no terminating condition.

Related

Agent-Computer Interface (ACI) — the discipline that frames poka-yoke as one of four core ACI design principles
Tool Engineering — broader tool design principles including poka-yoke
Write Tool Descriptions Like Onboarding Docs — complementary: documentation quality alongside structural constraints
Tool Description Quality — selection signals and description iteration
Deterministic Guardrails — defense-layer perspective on structural constraints
Hooks for Enforcement vs Prompts for Guidance — enforcement through hooks rather than instructions
Typed Schemas at Agent Boundaries — formal schemas as structural contracts preventing invalid agent-to-agent calls
Tool Minimalism — fewer, non-overlapping tools reduce selection ambiguity