Frozen Spec File¶

Write goals, non-goals, constraints, and completion criteria into an immutable file the agent reads but cannot modify — preventing drift even across context compaction.

The Problem¶

Agents build impressive things that are wrong. Over long sessions, the original objective blurs: context compaction drops nuance, the agent reinterprets constraints it finds inconvenient, and scope creeps in directions nobody asked for. The result is coherent, high-quality code that solves a subtly different problem than assigned.

Spec-driven development solves the writing problem — how to capture intent. The frozen spec solves the preservation problem — how to keep that intent unmodifiable throughout execution.

Anatomy of a Frozen Spec¶

A frozen spec contains five sections. Everything else belongs in mutable plan or progress files.

Section	Purpose	Example
Goals	What the agent must deliver	"Add OAuth2 login via Google"
Non-goals	What the agent must not build	"Do not add email/password auth"
Hard constraints	Non-negotiable boundaries	"No new runtime dependencies"
Deliverables	Concrete output artifacts	`src/auth/oauth.ts`, migration file, tests
Done-when criteria	Objective, testable completion conditions	"All existing tests pass; `/auth/google` returns 302 to Google consent screen"

Non-goals are first-class. Without them, agents expand scope toward whatever seems useful — adding email auth "while we're at it" or refactoring adjacent code that looks messy.

Why Immutability Matters¶

The spec must be structurally protected from the agent, not just instructionally protected.

graph LR
    A[Session starts] --> B[Agent reads SPEC.md]
    B --> C[Agent works]
    C --> D{Context compacted?}
    D -->|Yes| B
    D -->|No| C
    C --> E[Agent verifies against SPEC.md]
    E --> F[Done-when criteria met?]
    F -->|No| C
    F -->|Yes| G[Complete]

The re-read loop is the core mechanism. After every compaction event, the agent re-reads the frozen spec from disk — the discipline formalised in the post-compaction re-read protocol. The spec survives because it lives in a file, not in conversation history.

Without structural protection, agents rewrite specs they find inconvenient. Anthropic's harness research found that agents are less likely to modify JSON files than Markdown files — format choice is itself a defense against mutation (Effective Harnesses for Long-Running Agents). Strongly-worded prohibition language ("it is unacceptable to remove or edit tests") is necessary but not sufficient.

Immutability Mechanisms¶

No single mechanism guarantees immutability. Layer them.

Mechanism	Strength	Limitation
System prompt directive ("never modify SPEC.md")	Easy to add	Degrades over long sessions
JSON format instead of Markdown	Agent less likely to edit	Not enforced, just resistant
File permissions (read-only)	OS-level enforcement	Agent may have elevated access
Pre-commit hook rejecting spec changes	Hard enforcement	Only catches git commits
PreToolUse hook blocking writes to spec path	Hard enforcement	Tool-specific (e.g. Claude Code)

The strongest setups combine a system prompt directive with a structural guardrail (hook or file permission) — instruction tells the agent not to try, enforcement catches it if it does.

Example¶

Create the spec before the agent session starts:

{
  "goals": [
    "Add Google OAuth2 login to the existing Express app"
  ],
  "non_goals": [
    "Do not add email/password authentication",
    "Do not refactor existing auth middleware",
    "Do not upgrade Express or any existing dependency"
  ],
  "hard_constraints": [
    "No new runtime dependencies beyond googleapis",
    "All routes must remain backward-compatible",
    "Environment variables for secrets, never hardcoded"
  ],
  "deliverables": [
    "src/auth/google-oauth.ts",
    "src/auth/google-oauth.test.ts",
    "migration/003_oauth_tokens.sql"
  ],
  "done_when": [
    "GET /auth/google returns 302 redirect to Google consent screen",
    "GET /auth/google/callback exchanges code for token and creates session",
    "All existing tests pass without modification",
    "New integration tests cover success and error paths"
  ]
}

The system prompt anchors the agent to the file:

Before starting work and after every context reset, read SPEC.json.
Verify each action against the goals, non-goals, and hard constraints.
Never modify SPEC.json. If a constraint prevents you from proceeding, stop and report — do not work around it.

A Claude Code hook provides hard enforcement:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "bash -c 'if jq -r \".tool_input.file_path\" | grep -q \"SPEC.json\"; then echo \"BLOCKED: SPEC.json is frozen\" >&2; exit 2; fi'"
          }
        ]
      }
    ]
  }
}

Frozen vs. Mutable Artifacts¶

The frozen spec works alongside mutable files — plan, progress, and status. The distinction is the core insight.

Artifact	Mutability	Contains
SPEC.json	Frozen	Goals, non-goals, constraints, done-when
PLAN.md	Mutable	Implementation approach, task breakdown
PROGRESS.md	Mutable	Completed steps, current status, blockers
Feature state JSON	Mutable	Per-feature pass/fail tracking

OpenAI's Codex long-horizon guide uses this exact split: Prompt.md is frozen (goals and constraints), while Plan.md and Implement.md are mutable working documents (Run Long-Horizon Tasks with Codex).

When This Backfires¶

A frozen spec adds overhead that only pays off under specific conditions. Skip it or treat it as mutable when:

Requirements are genuinely exploratory. If the goal is discovery — figuring out what to build, not building a known thing — locking constraints early causes thrash. Spec-first — the spec-driven development workflow — only works when you know what done looks like.
Sessions are short enough that context doesn't compact. The re-read loop is the core benefit. For tasks that complete in a single context window, the spec provides no protection that a well-written system prompt doesn't already provide.
Scope changes are legitimate and frequent. A frozen spec becomes adversarial when stakeholders update requirements mid-session — without that churn, the spec is instead a guard against objective drift. Agents blocked by an outdated constraint will either stall or find workarounds. Treat the spec as mutable — and accept the drift risk — when requirements aren't stable enough to freeze.

Key Takeaways¶

A frozen spec contains goals, non-goals, hard constraints, deliverables, and done-when criteria — nothing else
Non-goals explicitly prevent the scope creep agents find most tempting
Immutability requires structural enforcement (hooks, permissions), not just prompt instructions
The agent re-reads the spec after every context compaction, anchoring to the original intent
JSON format resists agent mutation better than Markdown

Spec-Driven Development — the writing workflow that produces the spec
Objective Drift — the failure mode a frozen spec prevents
Trajectory Logging via Progress Files — mutable companion artifacts
Feature List Files — structured feature tracking alongside specs
Hooks Beat Prompts — why structural enforcement outperforms prompt instructions
Post-Compaction Re-read Protocol — the mechanism that makes re-reading the frozen spec reliable after context compaction
The Specification as Prompt — using formal artifacts (types, schemas, tests) as agent instructions
Constraint Encoding Does Not Fix Constraint Compliance — why writing constraints into a spec is not the same as agents following them