Frozen Spec File
Write goals, non-goals, constraints, and completion criteria into an immutable file the agent reads but cannot modify — preventing drift even across context compaction.
The Problem
Agents build impressive things that are wrong. Over long sessions, the original objective blurs: context compaction drops nuance, the agent reinterprets constraints it finds inconvenient, and scope creeps in directions nobody asked for. The result is coherent, high-quality code that solves a subtly different problem than assigned.
Spec-driven development solves the writing problem — how to capture intent. The frozen spec solves the preservation problem — how to keep that intent unmodifiable throughout execution.
Anatomy of a Frozen Spec
A frozen spec contains five sections. Everything else belongs in mutable plan or progress files.
| Section | Purpose | Example |
|---|---|---|
| Goals | What the agent must deliver | "Add OAuth2 login via Google" |
| Non-goals | What the agent must not build | "Do not add email/password auth" |
| Hard constraints | Non-negotiable boundaries | "No new runtime dependencies" |
| Deliverables | Concrete output artifacts | src/auth/oauth.ts, migration file, tests |
| Done-when criteria | Objective, testable completion conditions | "All existing tests pass; /auth/google returns 302 to Google consent screen" |
Non-goals are first-class. Without them, agents expand scope toward whatever seems useful — adding email auth "while we're at it" or refactoring adjacent code that looks messy.
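The five-section rule can be checked mechanically before a session starts. The following is a minimal sketch, assuming the spec is stored as JSON with the snake_case keys used in the example further down this page; `validate_spec` and `REQUIRED_SECTIONS` are hypothetical names, not part of any tool.

```python
# Assumed key names: they mirror the five sections in the table above.
REQUIRED_SECTIONS = (
    "goals",
    "non_goals",
    "hard_constraints",
    "deliverables",
    "done_when",
)


def validate_spec(spec: dict) -> list[str]:
    """Return a list of problems; an empty list means the spec is well-formed."""
    problems = []
    for key in REQUIRED_SECTIONS:
        value = spec.get(key)
        if not isinstance(value, list) or not value:
            problems.append(f"missing or empty section: {key}")
        elif not all(isinstance(item, str) and item.strip() for item in value):
            problems.append(f"section {key} must contain only non-empty strings")
    # Anything outside the five sections belongs in a mutable plan or progress file.
    for key in spec:
        if key not in REQUIRED_SECTIONS:
            problems.append(f"unexpected section: {key}")
    return problems
```

Treating an empty `non_goals` list as an error is a deliberate choice in this sketch: it forces the author to write down at least one thing the agent must not build.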
Why Immutability Matters
The spec must be structurally protected from the agent, not just instructionally protected.
```mermaid
graph LR
    A[Session starts] --> B[Agent reads SPEC.md]
    B --> C[Agent works]
    C --> D{Context compacted?}
    D -->|Yes| B
    D -->|No| C
    C --> E[Agent verifies against SPEC.md]
    E --> F{Done-when criteria met?}
    F -->|No| C
    F -->|Yes| G[Complete]
```
The re-read loop is the core mechanism. After every compaction event, the agent re-reads the frozen spec from disk — the discipline formalised in the post-compaction re-read protocol. The spec survives because it lives in a file, not in conversation history.
Without structural protection, agents can rewrite specs they find inconvenient. Anthropic's harness research reports that the model is less likely to inappropriately change or overwrite JSON files than Markdown files, so format choice is itself a defense against mutation (Effective Harnesses for Long-Running Agents). Strongly worded prohibitions ("it is unacceptable to remove or edit tests") are necessary but not sufficient.
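One way to make the re-read step detect tampering is to record a digest of the spec when the session starts and compare it on every re-read. This is a sketch under that assumption; the digest check and the names `spec_digest`, `reread_spec`, and `SpecTamperedError` are illustrative and do not come from the cited sources.

```python
import hashlib
import json
from pathlib import Path


class SpecTamperedError(RuntimeError):
    """Raised when the spec on disk no longer matches the session-start digest."""


def spec_digest(path: Path) -> str:
    """SHA-256 of the spec file's raw bytes, taken once at session start."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def reread_spec(path: Path, expected_digest: str) -> dict:
    """Re-read the spec from disk, refusing to return it if the bytes changed."""
    raw = path.read_bytes()
    if hashlib.sha256(raw).hexdigest() != expected_digest:
        raise SpecTamperedError(f"{path} changed since the session started")
    return json.loads(raw)
```

A harness would call `spec_digest` once before the agent starts and `reread_spec` after each compaction event. The digest only detects a change; it does not prevent one, so it complements rather than replaces the enforcement mechanisms.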
Immutability Mechanisms
No single mechanism guarantees immutability. Layer them.
| Mechanism | Strength | Limitation |
|---|---|---|
| System prompt directive ("never modify SPEC.md") | Easy to add | Degrades over long sessions |
| JSON format instead of Markdown | Agent less likely to edit | Not enforced, just resistant |
| File permissions (read-only) | OS-level enforcement | An agent running as the file owner, or with elevated access, can restore write permission |
| Pre-commit hook rejecting spec changes | Hard enforcement | Only catches git commits |
| PreToolUse hook blocking writes to spec path | Hard enforcement | Tool-specific (e.g. Claude Code) |
The strongest setups combine a system prompt directive with a structural guardrail (hook or file permission) — instruction tells the agent not to try, enforcement catches it if it does.
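Two of the structural layers can be sketched in a few lines: clearing the file's write bits, and the check a pre-commit hook would run over the staged file list. Identifying the spec by file name (`FROZEN_NAMES`) and the function names are assumptions made for illustration.

```python
import stat
from pathlib import Path, PurePosixPath

# Assumption: the frozen spec is identified by its file name.
FROZEN_NAMES = {"SPEC.json"}


def make_read_only(path: Path) -> None:
    """Clear every write bit. The file's owner can still chmod it back."""
    mode = path.stat().st_mode
    path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def frozen_paths_in(staged: list[str]) -> list[str]:
    """Return the staged paths whose file name is frozen.

    In a pre-commit hook, `staged` would be the output lines of
    `git diff --cached --name-only`; a non-empty result should abort the commit.
    """
    return [p for p in staged if PurePosixPath(p).name in FROZEN_NAMES]
```

Neither layer is complete on its own, which is the point of the table: the permission change stops accidental writes, and the staged-file check catches a modified spec before it reaches history.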
Example
Create the spec before the agent session starts:
```json
{
  "goals": [
    "Add Google OAuth2 login to the existing Express app"
  ],
  "non_goals": [
    "Do not add email/password authentication",
    "Do not refactor existing auth middleware",
    "Do not upgrade Express or any existing dependency"
  ],
  "hard_constraints": [
    "No new runtime dependencies beyond googleapis",
    "All routes must remain backward-compatible",
    "Environment variables for secrets, never hardcoded"
  ],
  "deliverables": [
    "src/auth/google-oauth.ts",
    "src/auth/google-oauth.test.ts",
    "migration/003_oauth_tokens.sql"
  ],
  "done_when": [
    "GET /auth/google returns 302 redirect to Google consent screen",
    "GET /auth/google/callback exchanges code for token and creates session",
    "All existing tests pass without modification",
    "New integration tests cover success and error paths"
  ]
}
```
The system prompt anchors the agent to the file:
Before starting work and after every context reset, read SPEC.json.
Verify each action against the goals, non-goals, and hard constraints.
Never modify SPEC.json. If a constraint prevents you from proceeding, stop and report — do not work around it.
A Claude Code hook provides hard enforcement for writes made through the Edit and Write tools:
```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "bash -c 'if jq -r \".tool_input.file_path\" | grep -qF \"SPEC.json\"; then echo \"BLOCKED: SPEC.json is frozen\" >&2; exit 2; fi'"
          }
        ]
      }
    ]
  }
}
```
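The same check can be written as a small script when the one-liner becomes hard to maintain. This is a sketch, not the tool's reference implementation: it assumes the hook receives a JSON payload on stdin with the `tool_input.file_path` field that the `jq` expression above reads, and that exit code 2 blocks the call as in the shell version. The function names are hypothetical.

```python
import json
import sys
from pathlib import PurePath

# Assumption: the frozen spec is identified by its file name.
FROZEN_NAMES = {"SPEC.json"}


def is_frozen(file_path: str) -> bool:
    """True when the path's final component is exactly a frozen file name."""
    return PurePath(file_path).name in FROZEN_NAMES


def decide(payload: dict) -> int:
    """Map a hook payload to an exit code: 2 blocks the tool call, 0 allows it."""
    file_path = (payload.get("tool_input") or {}).get("file_path") or ""
    return 2 if is_frozen(file_path) else 0


def main(stdin=sys.stdin, stderr=sys.stderr) -> int:
    """Read the payload, explain any block on stderr, and return the exit code."""
    code = decide(json.load(stdin))
    if code:
        print("BLOCKED: SPEC.json is frozen", file=stderr)
    return code
```

A script file would end with `sys.exit(main())` and be referenced from the hook's `command` field. Matching on the exact file name avoids the substring false positives of a plain `grep`, such as a file called `NOTSPEC.json`.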
Frozen vs. Mutable Artifacts
The frozen spec works alongside mutable files for plan, progress, and status. The split is the point: what to build is fixed, while how to build it and how far along the work is can change freely.
| Artifact | Mutability | Contains |
|---|---|---|
| SPEC.json | Frozen | Goals, non-goals, hard constraints, deliverables, done-when criteria |
| PLAN.md | Mutable | Implementation approach, task breakdown |
| PROGRESS.md | Mutable | Completed steps, current status, blockers |
| Feature state JSON | Mutable | Per-feature pass/fail tracking |
OpenAI's Codex long-horizon guide uses this exact split: Prompt.md is frozen (goals and constraints), while Plan.md and Implement.md are mutable working documents (Run Long-Horizon Tasks with Codex).
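The interaction between the two kinds of artifact can be sketched as a completion check: the criteria come from the frozen spec, and the pass/fail results come from the mutable feature state. The state schema used here (criterion text mapped to a boolean) is a hypothetical choice for illustration, not a format either cited guide prescribes.

```python
def unmet_criteria(spec: dict, feature_state: dict[str, bool]) -> list[str]:
    """Return the done_when criteria that are not recorded as passing."""
    return [c for c in spec["done_when"] if feature_state.get(c) is not True]


def is_complete(spec: dict, feature_state: dict[str, bool]) -> bool:
    """The task is complete only when every frozen criterion passes."""
    return not unmet_criteria(spec, feature_state)
```

Because the list of criteria is read from the spec and never from the state file, the agent can mark items as passing but cannot shorten the list of what has to pass.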
When This Backfires
A frozen spec adds overhead that only pays off under specific conditions. Skip it or treat it as mutable when:
- Requirements are genuinely exploratory. If the goal is discovery (figuring out what to build, not building a known thing), locking constraints early causes thrash. Spec-driven development only works when you know what done looks like.
- Sessions are short enough that context doesn't compact. The re-read loop is the core benefit. For tasks that complete in a single context window, the spec adds no protection beyond what a well-written system prompt already provides.
- Scope changes are legitimate and frequent. A frozen spec guards against objective drift only while requirements hold still; when stakeholders update requirements mid-session, it becomes adversarial. Agents blocked by an outdated constraint will either stall or find workarounds. Treat the spec as mutable, and accept the drift risk, when requirements aren't stable enough to freeze.
Key Takeaways
- A frozen spec contains goals, non-goals, hard constraints, deliverables, and done-when criteria — nothing else
- Non-goals explicitly prevent the scope creep agents find most tempting
- Immutability requires structural enforcement (hooks, permissions), not just prompt instructions
- The agent re-reads the spec after every context compaction, anchoring to the original intent
- JSON format resists agent mutation better than Markdown
Related
- Spec-Driven Development — the writing workflow that produces the spec
- Objective Drift — the failure mode a frozen spec prevents
- Trajectory Logging via Progress Files — mutable companion artifacts
- Feature List Files — structured feature tracking alongside specs
- Hooks Beat Prompts — why structural enforcement outperforms prompt instructions
- Post-Compaction Re-read Protocol — the mechanism that makes re-reading the frozen spec reliable after context compaction
- The Specification as Prompt — using formal artifacts (types, schemas, tests) as agent instructions
- Constraint Encoding Does Not Fix Constraint Compliance — why writing constraints into a spec is not the same as agents following them