Agent Harness: Initializer and Coding Agent
A two-phase agent harness pairs an initializer that prepares the environment with a coding agent that resumes from any prior session via git-based handoff artifacts.
The Stateless Session Problem
Agents have no memory between sessions. Without explicit design, they lose track of progress, repeat completed work, or declare premature completion when context pressure rises. A deliberate harness — two coordinated agents with structured artifacts — gives every session a reliable on-ramp.
Initializer Agent
The initializer runs once at the start of the overall task (not per session):
- Run environment setup scripts (e.g., init.sh) and verify readiness
- Create a claude-progress.txt recording what is started, completed, and remaining
- Make a baseline git commit so the coding agent has a clean starting point
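The three initializer steps above can be sketched as a single function. This is a minimal illustration, not a reference implementation: the `initialize` function, the injectable `run` parameter, the section headings written to the progress file, and the baseline commit message are all assumptions made for the example.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def initialize(repo: Path, remaining: Sequence[str], run: Runner = _run) -> Path:
    """One-time setup: init script, progress file, baseline commit."""
    # 1. Environment setup. With the default runner, a failing script
    #    raises and stops the initializer before anything is committed.
    run(["bash", str(repo / "init.sh")])

    # 2. Progress file: nothing completed yet, every task still remaining.
    #    (The section names are an assumed layout for this sketch.)
    progress = repo / "claude-progress.txt"
    lines = ["## Completed", "## In Progress", "## Next Priority"]
    lines += [f"- [ ] {task}" for task in remaining]
    progress.write_text("\n".join(lines) + "\n", encoding="utf-8")

    # 3. Baseline commit so later sessions start from a clean reference point.
    run(["git", "-C", str(repo), "add", "-A"])
    run(["git", "-C", str(repo), "commit", "-m", "chore: baseline commit"])
    return progress
```

Passing the runner as a parameter keeps the sketch easy to dry-run: a caller can substitute a function that only records the commands instead of executing them.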
Coding Agent
Each session starts with the coding agent reading orientation artifacts before touching any code:
- git log: commits since the baseline
- claude-progress.txt: current task status
- Feature list file: which features are complete, failing, or next
The agent selects the highest-priority incomplete work, completes it, and leaves clean artifacts for the next session.
Git Commits as Cross-Session Memory
Each commit message is a structured handoff note documenting:
- What was implemented
- What tests pass
- What the next incomplete task is
git log becomes a human- and agent-readable audit trail of session progress.
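One way to keep commit messages consistent is to generate them from a small record. The sketch below is illustrative only: the `Handoff` dataclass, the `Implemented:` / `Tests passing:` / `Next:` labels, and both function names are assumptions, not a format prescribed by the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Handoff:
    summary: str              # first line of the commit message
    implemented: List[str]    # what was implemented
    tests_passing: List[str]  # what tests pass
    next_task: str            # the next incomplete task


def format_commit_message(h: Handoff) -> str:
    """Render the three handoff fields as a plain-text commit message."""
    parts = [h.summary, "", "Implemented:"]
    parts += [f"- {item}" for item in h.implemented]
    parts += ["", "Tests passing:"]
    parts += [f"- {test}" for test in h.tests_passing]
    parts += ["", f"Next: {h.next_task}"]
    return "\n".join(parts)


def next_task_from_message(message: str) -> Optional[str]:
    """Recover the 'Next:' field so a later session can read it back."""
    for line in message.splitlines():
        if line.startswith("Next: "):
            return line[len("Next: "):]
    return None
```

Because the labels are fixed, a later session can read the next task straight out of the log rather than inferring it from free-form prose.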
Lazy Tool Discovery
Keep the active tool set small instead of loading every schema up front. The OPENDEV agent uses lazy tool discovery to hold down context bloat and reasoning degradation, surfacing tools to the model on demand rather than registering them all at construction time (Bui, 2026). Subagents still compile from spec to runtime and share a tool registry, but each isolates the schemas it actually exposes through schema filtering — so a session pays the schema cost only for the tools it reaches for.
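The idea can be illustrated with a registry that stores schema loaders rather than schemas. This is a hypothetical sketch of the general technique, not the OPENDEV implementation; the class `LazyToolRegistry` and its methods `register`, `discover`, and `active_schemas` are names invented for the example.

```python
from typing import Callable, Dict, List, Optional, Set

SchemaLoader = Callable[[], dict]


class LazyToolRegistry:
    """Holds loaders for every tool but builds a schema only on demand."""

    def __init__(self) -> None:
        self._loaders: Dict[str, SchemaLoader] = {}
        self._loaded: Dict[str, dict] = {}

    def register(self, name: str, loader: SchemaLoader) -> None:
        """Record how to build a schema without building it yet."""
        self._loaders[name] = loader

    def discover(self, name: str) -> dict:
        """Build the schema on first request, then serve it from cache."""
        if name not in self._loaded:
            self._loaded[name] = self._loaders[name]()
        return self._loaded[name]

    def active_schemas(self, allowed: Optional[Set[str]] = None) -> List[dict]:
        """Schemas to expose to the model: only discovered tools,
        optionally narrowed by an allow-list (a simple schema filter)."""
        return [
            schema
            for name, schema in self._loaded.items()
            if allowed is None or name in allowed
        ]
```

In this sketch a subagent could share one registry and pass its own `allowed` set, so it exposes only the schemas it needs while the loaders stay registered once.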
Inner Loop: Execution Cycle
Each iteration follows a six-phase cycle (Bui, 2026 §2.2.2):
- Pre-check/compaction — assess context pressure, compact if needed
- Thinking — optional extended reasoning
- Self-critique — evaluate the approach before committing
- Action — LLM call with tool schemas
- Tool execution — run the selected tool
- Post-processing — update state, check termination conditions
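The six phases can be laid out as one function per iteration. This is a simplified sketch under stated assumptions: `LoopState`, `run_iteration`, and the callback parameters are hypothetical names, compaction is reduced to keeping the most recent entries, and a model that returns `None` is treated as the termination signal. The cited paper may define each phase differently.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class LoopState:
    context: List[str] = field(default_factory=list)
    done: bool = False


def run_iteration(
    state: LoopState,
    call_model: Callable[[LoopState], Optional[str]],
    run_tool: Callable[[str], str],
    think: Optional[Callable[[LoopState], str]] = None,
    critique: Optional[Callable[[LoopState], str]] = None,
    max_entries: int = 50,
) -> LoopState:
    # 1. Pre-check/compaction: trim context when it grows too large.
    #    (Keeping the newest entries stands in for real compaction.)
    if len(state.context) > max_entries:
        state.context = state.context[-max_entries:]
    # 2. Thinking: optional extended reasoning.
    if think is not None:
        state.context.append(think(state))
    # 3. Self-critique: evaluate the approach before committing.
    if critique is not None:
        state.context.append(critique(state))
    # 4. Action: the model picks a tool call, or None to stop.
    action = call_model(state)
    # 5. Tool execution: run the selected tool, if any.
    result = run_tool(action) if action is not None else None
    # 6. Post-processing: update state and check termination.
    if result is not None:
        state.context.append(result)
    state.done = action is None
    return state
```

A caller would invoke `run_iteration` in a `while not state.done` loop, supplying its own model and tool callbacks.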
LangChain's build-your-own walkthrough traces the same primitives — the loop, the tool set, and the state passed between iterations — when assembling a custom agent harness from scratch (LangChain, how to build a custom agent harness).
Failure Modes and Fixes
Agent tries to do too much in one session — exhausts context mid-feature, leaving partial work.
Fix: enforce a single-feature-per-session rule in the coding agent's instructions. Anthropic's engineering practice confirms that this constraint prevents mid-feature context exhaustion.
Agent prematurely declares completion — marks a feature done before tests pass.
Fix: require passing tests as the completion gate. This rule must be explicit in the system prompt; agents without it will optimistically self-report.
When This Backfires
The two-phase harness adds structure and overhead — it is not always the right choice.
- Short-lived or predictable tasks: a task that fits in a single context window needs no initializer, progress file, or multi-session handoff machinery. The overhead of maintaining claude-progress.txt and baseline commits outweighs the benefit.
- Human-in-the-loop workflows: if a human reviews and redirects after every subtask, rigid single-feature sessions introduce unnecessary checkpointing friction. An interactive back-and-forth agent is simpler and faster.
- Environments without reliable git access: the pattern depends on git log commit history as cross-session memory. Without git, the handoff mechanism degrades to manual file management with no audit trail.
Session Handoff Checklist
Every coding session ends with:
- All changes committed with a descriptive message
- claude-progress.txt updated with accurate status
- Tests passing for the completed feature
- Next priority task identified
Example
The following shows what a claude-progress.txt handoff artifact looks like after a coding session, and what the next session's orientation reads before touching any code.
# claude-progress.txt — updated 2026-03-10T14:32Z
## Completed
- [x] feat: user authentication flow (commit a3f92c1)
- [x] feat: session management and token refresh (commit b87de04)
## In Progress
- [ ] feat: profile page — INCOMPLETE, do not mark done until tests pass
## Next Priority
- [ ] feat: notification preferences
## Notes
- Run `./init.sh` if environment is cold-started
- Use `pytest tests/` as the completion gate before updating this file
At the start of a new coding session, the agent reads orientation artifacts in this order before writing a single line of code:
git log --oneline -10 # What has been committed since baseline?
cat claude-progress.txt # What is the current task status?
cat features.md # Which features remain, and in what priority order?
Only after this orientation does the agent select the highest-priority incomplete item and begin work. The session ends with a commit whose message documents what was implemented, which tests pass, and what the next task is — making git log a readable cross-session audit trail.
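The selection step can be sketched as a small parser over a progress file in the checklist layout shown earlier. This is a hedged illustration: the function name `next_task` is hypothetical, and the ordering rule (unfinished "In Progress" items before "Next Priority" items) is an assumption about what "highest-priority" means here.

```python
import re
from typing import Dict, List, Optional

# Matches checklist lines such as "- [ ] feat: x" or "- [x] feat: y".
ITEM = re.compile(r"^- \[( |x)\] (.+)$")


def next_task(progress_text: str) -> Optional[str]:
    """Return the first unchecked item, preferring 'In Progress'
    over 'Next Priority'; None when nothing is left."""
    section: Optional[str] = None
    unchecked: Dict[str, List[str]] = {}
    for raw in progress_text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            section = line[3:].strip()
            continue
        match = ITEM.match(line)
        if match and match.group(1) == " " and section is not None:
            unchecked.setdefault(section, []).append(match.group(2))
    for name in ("In Progress", "Next Priority"):
        if unchecked.get(name):
            return unchecked[name][0]
    return None
```

Returning `None` when every box is checked gives the harness an explicit signal that no work remains, rather than leaving the agent to decide that on its own.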
Key Takeaways
- The initializer runs once; the coding agent runs once per session, always reading artifacts before acting
- Lazy tool discovery surfaces schemas on demand, holding down context bloat across a long-running session
- Git commits are structured session handoff notes, not just change records
- Require test passage as the completion gate; never allow agent self-report alone
Related
- Harness Engineering: broader engineering discipline that frames this pattern
- Session Initialization Ritual: the initializer's per-task setup procedure
- Trajectory Logging via Progress Files and Git History: how claude-progress.txt and commits become an audit trail
- Agent Handoff Protocols: formalises the cross-session handoff mechanism
- Feature List Files: the priority-ordered task list the coding agent reads
- Worktree Isolation: companion workflow for parallel session safety
- Cross-Cycle Consensus Relay: consensus structure that extends the initializer/worker pattern across autonomous cycles
- Session Harness Sandbox Separation: full three-primitive virtualization that generalizes the initializer/worker split