Method Map: Failure-Mode to Smallest-Artifact Triage

A Method Map triages each observed failure mode to the smallest artifact that fixes it — add only that artifact, never a bigger instruction file.

A Method Map is a lookup table — one row per recurring failure mode an agent exhibits, mapped to the single primary-fix artifact that binds that failure. When a new failure is observed, add the row's named artifact and nothing else (walkinglabs/learn-harness-engineering, method-map.md).

When This Earns Its Place

The Method Map pays off only under three conditions:

Failures have been observed — applying the table prophylactically adds files for problems that do not exist
Instruction count is approaching the compliance ceiling — below the ~150-rule degradation point (IFScale, 2025), an AGENTS.md line is cheaper than a separate artifact; above it, every new rule pushes existing rules into omission territory (Instruction Compliance Ceiling)
Tasks span multiple sessions — cold-start, handoff, and scope-sprawl failures exist only for work that crosses context windows

Below these thresholds, an AGENTS.md rule is the smallest artifact, and the Method Map collapses into the instruction file itself.
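The three conditions can be read as a single gate. The sketch below is only an illustration of that reading: the function name, its parameters, and the choice to treat "approaching" as 80% of the ~150-rule point are assumptions made for this example, not part of the published Method Map.

```python
# Hypothetical gate for the three conditions above. All names and the
# "approaching" margin are illustrative assumptions, not from the source.
DEGRADATION_POINT = 150  # approximate rule count attributed to IFScale (2025)


def method_map_earns_its_place(
    failures_observed: int,
    instruction_count: int,
    spans_multiple_sessions: bool,
    approach_margin: float = 0.8,  # assumed: "approaching" means 80% of the ceiling
) -> bool:
    """Return True only when all three conditions hold at once."""
    has_failures = failures_observed > 0
    near_ceiling = instruction_count >= DEGRADATION_POINT * approach_margin
    return has_failures and near_ceiling and spans_multiple_sessions
```

When the function returns False, the text above applies: a single AGENTS.md rule is the smallest artifact.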

The Canonical Table

The published Method Map covers six failure modes from long-running coding-agent work (walkinglabs, method-map.md):

Triggering observation	Failure mode	Primary fix
New session spends most of its time rediscovering setup and status	Cold-start confusion	`claude-progress.md`
The agent starts several features and finishes none cleanly	Scope sprawl	`feature_list.json`
The agent claims done after code edits but before runnable proof	Premature completion	`clean-state-checklist.md`
Every session re-learns how to boot the project	Fragile startup	`init.sh`
The next session cannot tell what is verified, broken, or next	Weak handoff	`session-handoff.md`
Review quality depends on taste or memory	Subjective review	`evaluator-rubric.md`

Four of these artifacts come directly from Anthropic's effective-harnesses writeup: feature_list.json with its passes field, claude-progress.txt (listed in the table above as claude-progress.md), init.sh, and the clean-state criterion ("the kind of code that would be appropriate for merging to a main branch") (Anthropic, Effective Harnesses for Long-Running Agents).

The table is not a checklist; it is invoked when an observation matches a row. The first column — the triggering observation in concrete terms — is what distinguishes the Method Map from a generic list of artifacts.
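Because the table is a lookup rather than a checklist, it can be held as plain data and queried only when a failure has been named. The following is a minimal sketch that transcribes the six rows above; the `Row` type, the `METHOD_MAP` constant, and the `triage` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    observation: str   # triggering observation, in concrete terms
    failure_mode: str  # name of the failure mode
    primary_fix: str   # the single artifact that binds it


# The six rows of the table above, transcribed as data.
METHOD_MAP = (
    Row("New session spends most of its time rediscovering setup and status",
        "Cold-start confusion", "claude-progress.md"),
    Row("The agent starts several features and finishes none cleanly",
        "Scope sprawl", "feature_list.json"),
    Row("The agent claims done after code edits but before runnable proof",
        "Premature completion", "clean-state-checklist.md"),
    Row("Every session re-learns how to boot the project",
        "Fragile startup", "init.sh"),
    Row("The next session cannot tell what is verified, broken, or next",
        "Weak handoff", "session-handoff.md"),
    Row("Review quality depends on taste or memory",
        "Subjective review", "evaluator-rubric.md"),
)


def triage(failure_mode: str) -> Optional[str]:
    """Return the primary-fix artifact for a failure mode, or None if no row matches."""
    wanted = failure_mode.strip().lower()
    for row in METHOD_MAP:
        if row.failure_mode.lower() == wanted:
            return row.primary_fix
    return None
```

A `None` result means the observation matches no row, so the table prescribes nothing; it does not mean "add every artifact".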

The Doctrine

Three rules govern how the table is used:

Add the smallest artifact that directly addresses the observed failure. A feature_list.json binds scope sprawl; it does not also encode commit conventions or test runners.

Never solve a single failure by enlarging a global instruction file. Appending another rule to AGENTS.md pushes existing rules closer to the compliance ceiling, where attention drops them (IFScale, 2025). The reflex to add a rule is the failure mode the Method Map is designed to interrupt.

Per-failure additions must be removable when the failure no longer recurs. Every artifact has a removal condition, not just an addition condition.
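One way to make the third rule concrete is to record the removal condition alongside the addition condition when the artifact is created. The sketch below assumes the removal condition takes the form "N consecutive sessions without the failure"; that form, the `ArtifactRecord` class, and its field names are illustrative assumptions, since the source does not prescribe how the condition is expressed.

```python
from dataclasses import dataclass


@dataclass
class ArtifactRecord:
    path: str                      # e.g. "clean-state-checklist.md"
    failure_mode: str              # the row that justified adding it
    added_because: str             # addition condition: the observed failure
    quiet_sessions_to_retire: int  # removal condition (assumed form)
    quiet_sessions: int = 0        # consecutive sessions without the failure

    def record_session(self, failure_recurred: bool) -> None:
        """Reset the streak on recurrence, otherwise extend it by one."""
        self.quiet_sessions = 0 if failure_recurred else self.quiet_sessions + 1

    def should_retire(self) -> bool:
        """True once the failure has stayed away for the agreed number of sessions."""
        return self.quiet_sessions >= self.quiet_sessions_to_retire
```

The specific threshold is a project decision; the point of the sketch is only that the record cannot be created without one.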

Why This Works

Each row pairs a failure with an artifact that creates external state the agent cannot rewrite during reasoning. A feature_list.json constrains the agent to edit only the passes field; an instruction line that says "do not declare done early" has no external referent and competes with every other rule under attention degradation (Instruction Compliance Ceiling). Anthropic documents the mechanism directly: "after some features had already been built, a later agent instance would look around, see that progress had been made, and declare the job done" — only an external contract stopped it, not more prose (Anthropic harness post).
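The "edit only the passes field" constraint is checkable from outside the agent, which is what makes it external state rather than prose. The sketch below compares two versions of a feature list and accepts the change only if nothing but `passes` differs. It assumes the file is a JSON array of objects; the `description` key used in the usage example and the function name `only_passes_changed` are hypothetical, as the source names only the `passes` field.

```python
import json


def only_passes_changed(before_json: str, after_json: str) -> bool:
    """Accept an edit only if every difference is confined to the `passes` field."""
    before = json.loads(before_json)
    after = json.loads(after_json)
    if not isinstance(before, list) or not isinstance(after, list):
        return False  # assumed schema: a JSON array of feature objects
    if len(before) != len(after):
        return False  # features were added or removed
    for old, new in zip(before, after):
        if not isinstance(old, dict) or not isinstance(new, dict):
            return False
        old_rest = {k: v for k, v in old.items() if k != "passes"}
        new_rest = {k: v for k, v in new.items() if k != "passes"}
        if old_rest != new_rest:
            return False  # something other than `passes` was rewritten
        if not isinstance(new.get("passes"), bool):
            return False  # `passes` must remain a boolean
    return True
```

For example, flipping `passes` from false to true on `[{"description": "login", "passes": false}]` is accepted, while rewording the description in the same edit is rejected.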

When This Backfires

The discipline carries its own failure modes:

Stale artifacts mislead. An out-of-date feature_list.json actively misdirects the agent; an artifact without an update discipline is worse than no artifact. The Evaluating AGENTS.md study shows auto-generated context files reduce task success rates
Prophylactic application. Adding all six rows before any failure has been observed violates the operating principle and creates speculative complexity
Solo developer overhead. For a project well under the compliance ceiling, file-discovery and maintenance cost more than the compliance gain
Dogma about "no more rules". For some failures the smallest artifact genuinely is one line in the instruction file

Example

A project starts noticing that its agent declares features complete after editing code but before running tests. The reflex fix is a new line in AGENTS.md: "Always run make test before marking a feature complete."

Reflex (instruction-file growth):

# AGENTS.md (line 187 of 240)
- Always run `make test` before marking a feature complete

This adds a 241st rule to a file already past the ~150-rule degradation point. Under IFScale-style benchmarks the rule is likely to be dropped some fraction of the time, exactly when it matters (IFScale, 2025).

Method Map (smallest artifact):

The observation matches "premature completion." The project adds a clean-state-checklist.md whose items the agent must verify before declaring done, and removes any corresponding instruction line:

# clean-state-checklist.md
- [ ] `make test` exits 0
- [ ] `git status` is clean (no uncommitted changes)
- [ ] No `TODO(agent)` or `FIXME(agent)` markers in changed files
- [ ] Feature entry in `feature_list.json` shows `passes: true`

The file is small, scoped, and has a clear removal condition: when premature-completion failures stop recurring across enough sessions, the file can be retired. AGENTS.md stays at 240 rules instead of 241.
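A harness can enforce the checklist mechanically instead of trusting the agent's claim. The sketch below assumes a convention in which the agent ticks each box (`- [x]`) after verifying the item; that convention and the helper names `unchecked_items` and `may_declare_done` are assumptions for illustration, not something the source specifies.

```python
import re
from typing import List

# Matches Markdown task-list lines such as "- [ ] item" or "- [x] item".
CHECKBOX = re.compile(r"^\s*-\s*\[( |x|X)\]\s*(.+?)\s*$")


def unchecked_items(checklist_text: str) -> List[str]:
    """Return the text of every checklist item that is still unticked."""
    pending = []
    for line in checklist_text.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            pending.append(match.group(2))
    return pending


def may_declare_done(checklist_text: str) -> bool:
    """Done only if the checklist has at least one item and none are unticked."""
    has_items = any(CHECKBOX.match(line) for line in checklist_text.splitlines())
    return has_items and not unchecked_items(checklist_text)
```

An empty or missing checklist is treated as "not done", so deleting the file's contents cannot be used to pass the gate.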

Key Takeaways

The Method Map is a triage table — observation to smallest artifact — not a checklist applied prophylactically
Each row carries a triggering observation in concrete terms; the table is invoked when behavior matches
The doctrine: add the smallest artifact that addresses the observed failure; never enlarge a global instruction file
Every per-failure addition needs a removal condition, not just an addition condition
Earns its place only when failures have been observed, instruction count is near the compliance ceiling, and tasks span sessions

The Instruction Compliance Ceiling — the mechanism that makes "just add a rule" fail
Feature List Files — canonical page for the scope-sprawl row's feature_list.json artifact
Frozen Spec File — a Method Map artifact for scope drift across compaction
Agent Harness: Initializer and Coding Agent Pattern — canonical page for the init.sh and claude-progress.txt artifacts
Trajectory Logging via Progress Files and Git History — handoff and progress artifacts in depth
Evaluating AGENTS.md: When Context Files Hurt More Than Help — empirical counterweight: artifacts only help when curated
AGENTS.md as Table of Contents, Not Encyclopedia — the same anti-bloat discipline at the file level