Backlog Triage as a Named Agent Skill
A skill encodes a state machine into issue labels: each item carries one category and one state, and items ready for an agent are handed off with a durable agent brief.
A named triage skill sits between human intent and AFK agent execution. It takes whatever lands on the issue tracker — half-written bug reports, customer notes, idea fragments, design-doc snippets — and decides whether the item is ready for an agent, a human, more information, or rejection. Matt Pocock's open-source /triage skill is the worked example below. The pattern is portable.
State Machine in Labels
Two category roles describe the work:
- bug: something is broken
- enhancement: new feature or improvement
Five state roles describe lifecycle position:
| State | Meaning |
|---|---|
| needs-triage | maintainer needs to evaluate |
| needs-info | waiting on reporter for more information |
| ready-for-agent | fully specified, ready for an AFK agent |
| ready-for-human | needs human implementation |
| wontfix | will not be actioned |
Every triaged issue carries exactly one category role and one state role. Conflicting states are flagged before action (SKILL.md).
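The one-category, one-state invariant can be checked mechanically. The sketch below is illustrative only: the function name `check_labels` is hypothetical, and the skill's real conflict check (described in SKILL.md) may work differently. Labels outside the two role sets are assumed to be unrelated and are ignored.

```python
# Role sets taken from the category list and state table above.
CATEGORY_LABELS = {"bug", "enhancement"}
STATE_LABELS = {
    "needs-triage",
    "needs-info",
    "ready-for-agent",
    "ready-for-human",
    "wontfix",
}


def check_labels(labels):
    """Return a list of problems; an empty list means the invariant holds.

    Hypothetical helper: labels that are neither category nor state roles
    are ignored rather than reported.
    """
    categories = sorted(set(labels) & CATEGORY_LABELS)
    states = sorted(set(labels) & STATE_LABELS)
    problems = []
    if len(categories) != 1:
        problems.append(f"expected exactly one category, found {categories}")
    if len(states) != 1:
        problems.append(f"expected exactly one state, found {states}")
    return problems
```

An issue labeled `bug` and `needs-triage` passes, while one carrying both `needs-info` and `wontfix` is reported as a state conflict before any action is taken.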
graph TD
A[Unlabeled] --> B[needs-triage]
B --> C[needs-info]
B --> D[ready-for-agent]
B --> E[ready-for-human]
B --> F[wontfix]
C -->|reporter replies| B
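The edges in the graph above can be written as a lookup table. This is a minimal sketch of the default path only: `TRANSITIONS` and `is_default_transition` are names invented for illustration, the `unlabeled` key stands in for an issue with no state label, and the maintainer can still override any transition by hand.

```python
# Default edges copied from the diagram; terminal states have no outgoing edges there.
TRANSITIONS = {
    "unlabeled": {"needs-triage"},
    "needs-triage": {"needs-info", "ready-for-agent", "ready-for-human", "wontfix"},
    "needs-info": {"needs-triage"},  # taken when the reporter replies
    "ready-for-agent": set(),
    "ready-for-human": set(),
    "wontfix": set(),
}


def is_default_transition(current, target):
    """True if the diagram has an edge from current to target."""
    return target in TRANSITIONS.get(current, set())
```

A `False` result does not forbid the move; it only signals that the move is a maintainer override rather than a step the diagram describes.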
The labels are the prompt. Five states collapse the open-ended question "what should happen with this issue?" into one of ten cells (state × category). Each cell has a fixed output shape — agent brief, needs-info template, out-of-scope record, polite close — so the skill cannot produce unstructured comments. This is the same constraint-as-prompt mechanism that gives structured tool use its reliability: the model picks a slot, not a format.
Per-Issue Process
For one issue, the skill runs six steps (SKILL.md):
1. Gather context: read the body, comments, prior triage notes, ADRs in the affected area, and .out-of-scope/*.md. Surface any prior rejection that resembles this issue.
2. Recommend category and state with reasoning. Wait for direction.
3. Reproduce (bugs only): trace the relevant code, run tests or commands, report repro / no-repro / insufficient-detail (SKILL.md). A confirmed repro produces a stronger brief.
4. Grill if the issue needs fleshing out: run an interview-style refinement session.
5. Apply the outcome: assign labels, post the corresponding template comment, close if wontfix.
6. Disclaim provenance: every comment posted by the skill begins with
   > *This was generated by AI during triage.*
The maintainer can override transitions at any time. "Move #42 to ready-for-agent" trusts the maintainer and skips grilling, but the skill still asks whether to write an agent brief before promoting (SKILL.md).
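The last two steps of the process (apply the outcome, disclaim provenance) are mechanical enough to sketch. The code below only builds a plan and calls no issue-tracker API. The function names and the shape of the returned dictionary are assumptions made for illustration; only the disclaimer text and the close-on-wontfix rule come from the process above.

```python
# Disclaimer text as quoted in the per-issue process.
DISCLAIMER = "> *This was generated by AI during triage.*"


def format_triage_comment(body):
    """Prefix a comment body with the provenance disclaimer."""
    return f"{DISCLAIMER}\n\n{body.strip()}"


def plan_outcome(category, state, comment_body):
    """Describe what applying the outcome would do (hypothetical structure)."""
    return {
        "labels": [category, state],
        "comment": format_triage_comment(comment_body),
        "close": state == "wontfix",  # only wontfix closes the issue
    }
```

Keeping the plan separate from the tracker call leaves room for the maintainer to review it first, which matches the human-in-the-loop design of the skill.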
The Agent Brief Contract
ready-for-agent triggers an agent brief — a structured comment that becomes the authoritative spec for the downstream executor. The original issue body is context; the brief is the contract (AGENT-BRIEF.md).
Four rules govern brief writing:
- Durability over precision — describe interfaces, types, and behavioral contracts. Never reference file paths or line numbers; the issue may sit for days while the codebase moves.
- Behavioral, not procedural — describe what the system should do, not how to implement it. The downstream executor will explore the codebase fresh, reconstructing structure at run time (issue requirements preprocessing).
- Complete acceptance criteria — every brief lists concrete, testable criteria. Each criterion is independently verifiable.
- Explicit scope boundaries — state what is out of scope to prevent gold-plating.
This is the upstream complement to issue requirements preprocessing: the brief is the structured input the executor receives, written before the executor opens its first context window. The REAgent paper measures a 17.40% lift in resolution rate when the executor reconstructs structured requirements at run time (Kuang et al., 2026) — a triage-skill brief moves that work upstream and out of the executor's context budget.
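The four rules lend themselves to a rough lint pass over a draft brief. The sketch below is not the format defined in AGENT-BRIEF.md: the `AgentBrief` fields, the `lint_brief` function, and both regular expressions are assumptions for illustration, and the patterns are heuristics that will miss some references and over-match others. Whether a brief is behavioral rather than procedural still needs a human or model judgment.

```python
import re
from dataclasses import dataclass, field

# Heuristic patterns for what the durability rule warns against (assumed, not exhaustive).
FILE_PATH = re.compile(r"\b[\w./-]+\.(?:py|ts|tsx|js|go|rs|java|md)\b")
LINE_REF = re.compile(r"\b(?:line|lines|L)\s?\d+\b", re.IGNORECASE)


@dataclass
class AgentBrief:
    """Hypothetical in-memory shape of a brief."""
    behavior: str
    acceptance_criteria: list = field(default_factory=list)
    out_of_scope: list = field(default_factory=list)


def lint_brief(brief):
    """Return rule violations found in a draft brief; empty means none found."""
    problems = []
    text = " ".join([brief.behavior, *brief.acceptance_criteria])
    if FILE_PATH.search(text):
        problems.append("references a file path")
    if LINE_REF.search(text):
        problems.append("references a line number")
    if not brief.acceptance_criteria:
        problems.append("no acceptance criteria")
    if not brief.out_of_scope:
        problems.append("no explicit out-of-scope section")
    return problems
```

A brief that describes behavior, lists at least one criterion, and names something out of scope passes; a draft such as "Fix src/tasks/runner.py at line 42" (a made-up path) trips all four checks.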
Out-of-Scope as Institutional Memory
Rejected enhancements are written to .out-of-scope/<concept>.md — one file per concept, not per issue. The file captures the decision, the reasoning, and a "Prior requests" list of every issue that asked for the feature (OUT-OF-SCOPE.md). During context gathering on every new issue, the skill reads this directory and surfaces matches by concept similarity ("night theme" matches dark-mode.md).
The mechanism is durable institutional memory: the skill does not re-litigate decided questions on each invocation, much as agent memory patterns preserve state across sessions.
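The one-file-per-concept bookkeeping can be sketched as follows. The Markdown layout written here (Decision, Reasoning, and Prior requests headings) is an assumption, not the layout specified in OUT-OF-SCOPE.md, and `record_rejection` is a hypothetical name. Concept matching is deliberately left out: it is done by similarity rather than keyword, so it stays with the model and the maintainer.

```python
from pathlib import Path


def record_rejection(root, concept, decision, reasoning, issue_number):
    """Create or update <root>/<concept>.md and return its path.

    Assumed file layout; the real template may differ.
    """
    path = Path(root) / f"{concept}.md"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(
            f"# {concept}\n\n"
            f"## Decision\n\n{decision}\n\n"
            f"## Reasoning\n\n{reasoning}\n\n"
            f"## Prior requests\n\n",
            encoding="utf-8",
        )
    # Append the issue to the prior-requests list once, even if called again.
    entry = f"- #{issue_number}\n"
    text = path.read_text(encoding="utf-8")
    if entry not in text:
        path.write_text(text + entry, encoding="utf-8")
    return path
```

Calling it twice for the same concept with different issue numbers grows the prior-requests list in a single file, which is the property that lets a later "night theme" request be answered from dark-mode.md.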
When the Pattern Earns Its Cost
The triage skill is human-invoked and deliberately keeps the maintainer in the loop. That makes it the right model when:
- The repo has an established codebase glossary or ADR set the agent can ground in
- Issue volume justifies bookkeeping but doesn't warrant lights-out automation
- The downstream executor is an AFK agent that benefits from a durable, structured brief
- The team needs explicit institutional memory for rejected requests
It is the wrong model — a different shape fits — when:
- High-volume bot issues (Dependabot, security scanners) need silence or auto-close, not classification. Run a different filter such as continuous triage; the state machine assumes human-authored intent.
- Lights-out triage on every event is the goal. Use continuous triage on GitHub Actions with `safe-outputs: [add-label, add-comment]` (GitHub Agentic Workflows), with no maintainer in the loop.
- Batch intake from a single source (a QA session, a customer interview transcript) needs deduplication and codebase investigation across many candidates at once. Use the QA session to issues pipeline.
- Rigid issue-tracker workflow states (some Jira / ServiceNow configurations) cannot host the label-as-state-machine pattern without admin changes.
- The repo has no codebase glossary or ADRs. Step 1 of the per-issue process depends on them to ground the agent; without them, recommendations are shallow or hallucinated.
Failure Modes
- Hallucinated agent briefs. If the maintainer skips reproduction and lets the skill auto-promote to ready-for-agent based on the reporter's description alone, the brief inherits any factual errors and the downstream agent burns context on a wrong-premise task. The skill's own guidance (confirm a repro before producing a brief) exists because this is the dominant failure mode (SKILL.md).
- Brief drift. Briefs that reference file paths or line numbers go stale within days. The durability-over-precision rule is a hedge against this: interfaces and contracts survive renames; line numbers do not.
- Out-of-scope concept matching. Matching is by concept similarity, not keyword. Failed matches cause the skill to re-evaluate already-rejected requests; over-eager matches close legitimately distinct issues. The maintainer confirms or rejects the match each time.
- Tight .out-of-scope/ files for deferrals. Reasons like "we're too busy right now" are not real rejections; they are deferrals that produce out-of-scope files the maintainer later regrets. Substantive reasons reference scope, technical constraints, or strategic decisions (OUT-OF-SCOPE.md).
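The first failure mode suggests a simple guard before promotion. The sketch below returns warnings instead of blocking, because the maintainer can override transitions. The function name is hypothetical; the three status strings mirror the repro / no-repro / insufficient-detail outcomes of the reproduction step.

```python
# Outcomes of the reproduction step, as named in the per-issue process.
REPRO_STATUSES = ("repro", "no-repro", "insufficient-detail")


def promotion_warnings(category, repro_status=None):
    """Return warnings to show before moving an issue to ready-for-agent.

    Hypothetical helper: repro_status=None means reproduction was not attempted.
    """
    if repro_status is not None and repro_status not in REPRO_STATUSES:
        raise ValueError(f"unknown repro status: {repro_status!r}")
    warnings = []
    if category == "bug" and repro_status != "repro":
        found = repro_status or "not attempted"
        warnings.append(
            f"bug has no confirmed repro ({found}); "
            "the brief would rest on the reporter's description alone"
        )
    return warnings
```

Enhancements pass without a repro, since reproduction applies to bugs only.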
Example
Pocock's walkthrough on the Sandcastle repo shows the skill against a real backlog (AI Hero post).
/triage
Just give me all of the open issues that I haven't triaged yet.
The agent finds nine untriaged issues and labels each with a category and needs-triage. The maintainer then narrows: "Could you start with 477 for me?" The agent reads issue #477, finds existing triage notes with a stack trace, and recommends ready-for-agent. The maintainer asks the agent to reproduce the bug rather than trust the reporter:
Could you diagnose this yourself?
The agent traces task-ID variables, locates the literal-vs-placeholder mismatch, writes a regression test, applies the fix, pushes to main, and closes the issue. Context use sits at 46.5k of 100k tokens — the brief is short because the codebase grounding did the work upstream of the executor.
Key Takeaways
- A named triage skill encodes a state machine into labels — every triaged issue carries one category and one state, with no ambiguous middle ground
- The agent brief is the hand-off contract: durable, behavioral, with testable acceptance criteria and explicit scope boundaries
- .out-of-scope/<concept>.md files give the skill institutional memory; the skill checks them on every new issue and surfaces matches before triaging
- The pattern is human-in-the-loop by design: choose continuous triage for lights-out classification or the QA session pipeline for batch intake
- Reproducing bugs before producing a brief is the most reliable defence against hallucinated specs that waste downstream agent context
Related
- Continuous Triage — automated event-time triage on GitHub Actions; complementary lights-out variant
- QA Session to Issues Pipeline — multi-stage batch pipeline for intake from a single artifact
- Issue Requirements Preprocessing — downstream complement: structured requirement reconstruction at executor invocation time
- Issue-to-PR Delegation Pipeline: what runs on issues marked ready-for-agent
- Issue-Tracker as Agent Dispatch Surface: the downstream dispatch contract that the agent brief feeds into
- Semantic Issue Search from Chat vs Query Syntax: paraphrase-tolerant duplicate detection that complements .out-of-scope/ concept matching
- AI-Powered Vulnerability Triage: analogous classification pipeline in a security context
- Continuous AI: A Navigation Map of Always-On Agent Workflows — the parent map placing this in the triage family alongside the continuous-* workflows