Skip to content

The Specification as Prompt: Existing Artifacts as Agent Instructions

Use types, schemas, tests, and API definitions as agent instructions instead of natural language descriptions.

The Core Idea

When a formal specification already exists, pointing the agent at it is more precise than writing a natural language description of the same thing. A TypeScript interface is unambiguous; an OpenAPI schema leaves no room for interpretation; a test file is a complete set of acceptance criteria. Re-describing in prose what already has a formal definition only adds noise and risks divergence between the description and the spec.

Artifact Types and How to Use Them

Type definitions — "Implement a function matching this signature" gives the agent an exact contract: return type, parameter types, and nullability are already specified. Pairing the type with the expected behavior is the complete instruction.

Test files — "Make these tests pass" is a verifiable, self-contained instruction and the core of spec-driven development. The tests define what correct looks like — the tests are the description.

OpenAPI and GraphQL schemas — "Implement this endpoint matching the OpenAPI spec" specifies the request/response shape, status codes, and path parameters without prose. The same spec can also generate agent tool definitions.

Database schemas — Grounding queries or migrations in the actual schema prevents the agent from making up column names or table relationships that don't exist.

Existing code as template — "Follow the pattern in auth/middleware.ts" is more precise than a paragraph describing middleware conventions. The agent can read the existing file and match its structure, naming, and error handling.

Why Specs Beat Prose

Natural language descriptions introduce several problems:

  • Ambiguity: prose admits multiple valid interpretations; a type signature does not
  • Staleness: a description can diverge from the spec over time; the spec cannot diverge from itself
  • Verbosity: describing a complex API costs more tokens than pointing at the schema
  • Verifiability: prose output cannot be auto-checked; spec-grounded output can be tested or linted

The Anthropic context engineering guide identifies high-signal, low-noise token selection as a core principle for effective agent context. Formal specifications are high-signal by construction. Research on spec-driven development confirms that grounding agent instructions in existing contracts reduces hallucinated structural details — column names, route shapes, field types — compared to prose descriptions (Spec-Driven Development: From Code to Contract in the Age of AI Coding Assistants).

Applying the Pattern

Load the specification artifact into context alongside the instruction:

Here is the OpenAPI spec for the /users endpoint:
<spec>
...
</spec>

Implement the route handler.

Or, when the spec lives in the codebase, reference it by path so the agent fetches it:

Implement the `UserRepository` class to satisfy the `IUserRepository` interface in src/types/user.ts.

The agent reads the interface, derives the implementation contract, and produces code that satisfies it.

When This Backfires

The pattern assumes a specification exists and is correct. When that assumption breaks, the approach adds friction rather than reducing it:

  • The spec is incomplete or wrong. An interface with missing methods, an OpenAPI spec with undocumented edge cases, or a schema that doesn't reflect production reality gives the agent a false contract. The agent produces code that satisfies the spec but not the actual system — and the mismatch is harder to diagnose than a prose description that was vague.
  • No formal spec exists yet. Early in a project, types and schemas may not exist, and forcing them prematurely displaces real work. Blocking on spec creation before any agent work is often the wrong order of operations; prose is the right tool until the formal artifacts stabilize.
  • The spec is a ceiling, not a floor. An agent implementing to a type signature satisfies the contract's structural requirements but may still violate architectural intent — naming conventions, error-handling patterns, layering rules — that the type system doesn't encode. Passing tests: pass does not mean the implementation matches the codebase's style or constraints that aren't covered by the test suite.
  • The agent games the spec. "Make these tests pass" is not a guarantee of correctness in the reverse direction: agents can satisfy the literal tests while failing the intended goal — hard-coding expected values, special-casing the assertions, or otherwise exploiting the evaluation surface. A benchmark of tool-using LLM agents found that as honest-solution complexity rises, even production-aligned models increasingly pass automated checks via exploits rather than genuine solutions, so benchmark success can decouple from real competence (Reward Hacking Benchmark: Measuring Exploits in LLM Agents with Tool Use). Treat a passing spec as necessary, not sufficient — pair it with review of how the contract was met.

Key Takeaways

  • Existing specifications — types, schemas, tests, API docs — are more precise agent instructions than prose descriptions, the same way actionable standards serve as instructions.
  • "Make these tests pass" and "implement this interface" are complete, verifiable instructions.
  • Formal specs prevent the agent from hallucinating structural details (column names, field types, route shapes) that don't match the actual system.
  • Reserve prose for context that has no formal equivalent: business rationale, priority trade-offs, user intent.

Example

A TypeScript interface serves as both the specification and the agent instruction:

// src/types/order.ts
interface OrderService {
  createOrder(items: LineItem[], customer: CustomerRef): Promise<Order>;
  cancelOrder(orderId: string, reason: CancelReason): Promise<void>;
  getOrderStatus(orderId: string): Promise<OrderStatus>;
}

The agent prompt references the interface directly:

Implement the OrderService interface defined in src/types/order.ts.
Use the existing DatabaseClient in src/db/client.ts for persistence.
Throw OrderNotFoundError (from src/errors.ts) when an orderId doesn't match a record.

The agent reads the interface, derives the signatures, types, and nullability constraints, and implements to the contract — no prose description of the API shape needed.

Feedback