Adversarial Multi-Model Development Pipeline (VSDD)
A six-phase AI-orchestrated pipeline that assigns a fresh-context adversary model to attack builder output until convergence, combining spec-driven development, TDD, and formal verification.
Roles
The pipeline separates two antagonistic roles across different model instances — ideally different providers:
- Builder — owns specs, tests, and implementation. Accumulates context across phases and can develop confirmation bias toward its own decisions.
- Adversary — gets a fresh context window each review pass and attacks specs, tests, and code with no prior investment. The context reset is the mechanism: the adversary cannot rationalize decisions it did not make.
Using a different model family for each role (e.g., Claude as Builder, Gemini as Adversary) reduces correlated failure modes — multi-model ensembles suppress shared error patterns that same-family models exhibit even with a fresh context (LLM-TOPLA, EMNLP 2024). See Loop Strategy Spectrum for when fresh-context resets are appropriate.
The Six Phases
```mermaid
graph TD
    P1[Phase 1: Spec Crystallization] --> P2[Phase 2: Test-First Implementation]
    P2 --> P3[Phase 3: Adversarial Refinement]
    P3 -->|Spec gaps| P1
    P3 -->|Test gaps| P2
    P3 -->|No real findings| P4[Phase 4: Feedback Integration]
    P4 --> P5[Phase 5: Formal Hardening]
    P5 --> P6[Phase 6: Convergence]
```
Phase 1 — Spec Crystallization. Establish behavioral contracts, interface definitions, and an edge-case catalog using spec-driven development. Define the Purity Boundary Map (see below) before any implementation, since it shapes module decomposition.
Phase 2 — Test-First Implementation. Translate specs into failing tests, then implement only what tests demand. Red → Green → Refactor.
Phase 3 — Adversarial Refinement. The Adversary reviews specs, tests, and code with a clean context window, tagging each finding by dimension: spec fidelity, test coverage, or implementation flaw.
Phase 4 — Feedback Integration. Route findings back to the phase they belong to. Phases 3 and 4 repeat until convergence.
Phase 5: Formal Hardening. Execute formal proofs, fuzzing, and mutation testing against the implementation that survived adversarial review. The Purity Boundary Map identifies the formally verifiable subset. Cross-examination at phase boundaries is a documented robustness mechanism in LLM multi-agent SE systems (ACM TOSEM, 2024).
Phase 6 — Convergence. Exit the loop. See convergence criterion below.
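The routing step in Phase 4 can be sketched as a small lookup. This is a minimal illustration, not part of the pipeline's definition: the `Finding` type, the `changes_contract` flag, and the choice to send implementation flaws back through Phase 2 (so a failing test precedes the fix) are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    """The three dimensions the Adversary tags findings with in Phase 3."""
    SPEC_FIDELITY = "spec fidelity"
    TEST_COVERAGE = "test coverage"
    IMPLEMENTATION_FLAW = "implementation flaw"


@dataclass(frozen=True)
class Finding:
    dimension: Dimension
    summary: str
    # Hypothetical triage flag: True when the finding alters the behavioral contract.
    changes_contract: bool = False


def route(finding: Finding) -> int:
    """Return the phase a finding re-enters at (assumed mapping)."""
    if finding.dimension is Dimension.SPEC_FIDELITY and finding.changes_contract:
        return 1  # contract-level gap: revise the spec first
    # Test gaps, minor edge cases, and implementation flaws re-enter at Phase 2,
    # where a failing test is written before any fix.
    return 2
```

A triage script would call `route` once per finding and group the results by phase before handing them back to the Builder.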
Purity Boundary Map
The Purity Boundary Map separates the codebase into two zones before implementation begins:
| Zone | Properties | Verification approach |
|---|---|---|
| Pure core | Deterministic, no side effects | Formal proofs, property-based testing |
| Effectful shell | I/O, network, database, time | Integration tests, contract tests, fuzzing |
Designing this boundary in Phase 1 is not optional: it determines module structure, and retrofitting it later is expensive. The pure core is the formal-verification target in Phase 5; the effectful shell is treated as out of scope for formal proofs and is covered by integration tests, contract tests, and fuzzing instead.
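As a rough illustration of the split, here is how a sliding-window rate limiter might be divided. The names `allow_request` and `RateLimiter`, and the injected `clock` parameter, are hypothetical choices for this sketch rather than anything the pipeline prescribes.

```python
import time
from typing import Callable, Sequence


def allow_request(timestamps: Sequence[float], now: float,
                  limit: int, window_seconds: float) -> tuple[bool, list[float]]:
    """Pure core: the decision depends only on the arguments (no clock, no I/O)."""
    cutoff = now - window_seconds
    recent = [t for t in timestamps if t > cutoff]  # drop entries outside the window
    if len(recent) < limit:
        return True, recent + [now]
    return False, recent


class RateLimiter:
    """Effectful shell: reads the clock and owns the mutable state."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._timestamps: list[float] = []

    def allow(self) -> bool:
        allowed, self._timestamps = allow_request(
            self._timestamps, self._clock(), self._limit, self._window)
        return allowed
```

Because `allow_request` takes the current time as an argument, it can be targeted by proofs or property-based tests, while `RateLimiter` is exercised with integration tests that substitute a fake clock.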
Convergence Criterion
The loop exits when the Adversary's findings shift from genuine to invented:
- Spec critiques become stylistic nitpicks, not substantive behavioral gaps
- The Adversary cannot identify untested scenarios; mutation testing kill rates are high
- Implementation findings require the Adversary to invent implausible inputs, not observe actual flaws
- All formal properties pass proof; fuzzing finds nothing new
This is a qualitative signal, not a counter. Tag each finding on intake as "substantive" or "hypothetical" and track the ratio across rounds — when the Adversary can only raise hypothetical issues, the loop has converged.
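Tracking the ratio across rounds might look like the sketch below. The tags themselves still come from human or model triage; the `quiet_rounds` threshold is an assumption added here so the check does not fire on a single quiet round, and it should be tuned rather than taken as given.

```python
from collections import Counter


def substantive_ratio(tags: list[str]) -> float:
    """Share of a round's findings tagged 'substantive' (0.0 for an empty round)."""
    counts = Counter(tags)
    total = counts["substantive"] + counts["hypothetical"]
    return counts["substantive"] / total if total else 0.0


def has_converged(rounds: list[list[str]], quiet_rounds: int = 2) -> bool:
    """True when the last `quiet_rounds` rounds contain no substantive findings."""
    if len(rounds) < quiet_rounds:
        return False
    return all(substantive_ratio(r) == 0.0 for r in rounds[-quiet_rounds:])
```

The function only summarises the tags; deciding whether a finding is substantive or hypothetical remains the qualitative judgment described above.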
When This Backfires
VSDD's cost scales with the number of convergence cycles. Skip it, or expect degraded results, when:
- Low-stakes or small tasks. Refactors, single-line patches, throwaway scripts, and prototypes produce low-signal critiques and stall on style. Orchestration cost — multiple model calls per phase, context management, finding triage — exceeds defect-prevention value when failure is cheap to fix post-deployment.
- Thin specs or weak Adversary prompts. Both push the Adversary toward inventing gaps or giving surface-level stylistic feedback rather than finding real flaws. Phases 3 and 4 then cycle without meaningful signal, producing the illusion of convergence rather than the real thing. Multi-agent systems are specifically susceptible to premature consensus when reviewer incentives are not explicitly orthogonal (Failure Modes in LLM Systems, 2025).
- Narrow specialist domains. General-purpose adversary models hallucinate plausible but incorrect findings in embedded systems, cryptography, or other deep-context domains. Domain-specific tests must validate Adversary output before acting on it.
- Purity boundary retrofitting. If Phase 1 skips the map, the effectful shell typically entangles with the pure core during Phase 2. Separating them later often requires near-full rewrites.
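For the specialist-domain case, one way to validate Adversary output before acting on it is to require each claimed flaw to reproduce against a trusted reference. This is a sketch under assumptions: `ClaimedFlaw`, `is_reproducible`, and the `oracle` callable (for example, a function backed by published test vectors) are hypothetical names, and a real domain may need a richer check than equality.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ClaimedFlaw:
    description: str
    args: tuple      # input the Adversary claims triggers the flaw
    expected: Any    # output the Adversary claims the spec requires


def is_reproducible(flaw: ClaimedFlaw,
                    implementation: Callable[..., Any],
                    oracle: Callable[..., Any]) -> bool:
    """Accept a finding only if the oracle confirms the claimed expectation
    and the implementation actually disagrees with it."""
    truth = oracle(*flaw.args)
    if truth != flaw.expected:
        return False  # the claim contradicts the trusted reference
    return implementation(*flaw.args) != truth
```

Findings that fail this check are dropped or sent for human review instead of being routed back to the Builder.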
The Waterfall Trap
Treating Phase 1 specs as a fixed gate repeats waterfall's failure mode. Implementation is discovery — edge cases emerge during building, not beforehand. When Phase 3 finds a genuine behavioral gap, update the spec. Route minor edge case additions directly to Phase 2; reserve Phase 1 revision for findings that change the behavioral contract.
Example
The following shows a minimal two-role pipeline using Claude as Builder and Gemini as Adversary. The Builder accumulates context across phases; the Adversary is initialised fresh for each review pass.
```python
import os

import anthropic
import google.generativeai as genai

# Phases 1 and 2: the Builder accumulates context across turns
builder = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
builder_history = []


def builder_turn(prompt: str) -> str:
    """Send one prompt to the Builder, keeping the full conversation history."""
    builder_history.append({"role": "user", "content": prompt})
    response = builder.messages.create(
        model="claude-opus-4-5",
        max_tokens=4096,
        system="You are the Builder. Author specs, write failing tests, then implement.",
        messages=builder_history,
    )
    reply = response.content[0].text
    builder_history.append({"role": "assistant", "content": reply})
    return reply


# Phase 3: the Adversary gets NO prior context; each review is a fresh model call
genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # read the key, not a literal string
adversary_model = genai.GenerativeModel("gemini-2.0-flash")


def adversary_review(spec: str, tests: str, code: str) -> str:
    """Review the artifacts in a single stateless call."""
    prompt = (
        "Review the following spec, tests, and implementation. "
        "Identify spec fidelity gaps, missing test scenarios, and implementation flaws. "
        f"\n\n## Spec\n{spec}\n\n## Tests\n{tests}\n\n## Code\n{code}"
    )
    # No history passed: the context reset is the mechanism
    return adversary_model.generate_content(prompt).text


spec = builder_turn("Write a spec for a rate-limiter with a sliding window algorithm.")
tests = builder_turn("Write failing pytest tests that cover every clause in that spec.")
code = builder_turn("Implement the rate-limiter so all tests pass.")
findings = adversary_review(spec, tests, code)
print(findings)
```
The Adversary call passes only the artifacts under review — no prior conversation history. If findings contains substantive behavioral gaps, route them back into builder_turn with the appropriate phase prompt; repeat until the Adversary can only raise stylistic issues.
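The repeat-until-quiet loop described above could be driven by something like the following sketch. It takes the review and integration steps as callables so it can run without network access; in practice `review` would wrap `adversary_review` plus a triage filter that keeps only substantive findings, and `integrate` would wrap `builder_turn`. The function name `refine` and the `max_rounds` cap are assumptions, added so a non-converging loop still terminates.

```python
from typing import Callable

Artifacts = dict[str, str]  # e.g. {"spec": ..., "tests": ..., "code": ...}


def refine(artifacts: Artifacts,
           review: Callable[[Artifacts], list[str]],
           integrate: Callable[[Artifacts, list[str]], Artifacts],
           max_rounds: int = 5) -> tuple[Artifacts, bool]:
    """Alternate review and integration; return the artifacts and whether
    a review came back with no substantive findings."""
    for _ in range(max_rounds):
        findings = review(artifacts)  # stateless call: only the artifacts go in
        if not findings:
            return artifacts, True
        artifacts = integrate(artifacts, findings)
    return artifacts, False  # round budget exhausted without a quiet review
```

Returning a flag instead of raising keeps the decision about what to do on non-convergence (escalate to a human, tighten the spec) with the caller.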
Key Takeaways
- The context reset on the Adversary is the mechanism — it cannot rationalize decisions it did not make
- Use a different model family for the Adversary so its blind spots do not overlap the Builder's
- Define the Purity Boundary Map in Phase 1; retrofitting it after implementation is expensive
- Convergence is when the Adversary can only invent problems, not find real ones
- Treat specs as living hypotheses; route minor edge case discoveries to Phase 2, not Phase 1 re-review
Related
- Convergence Detection in Iterative Refinement — the signal-based model behind the Phase 6 convergence criterion
- Evaluator-Optimizer Pattern — the two-role evaluator/generator scaffold VSDD specialises
- Committee Review Pattern — alternative when you want multiple adversaries instead of one
- Closed-Loop Role-Based Refinement — generalised Builder/Adversary loop without the spec-first phases
- Multi-Model Plan Synthesis — uses cross-model diversity at the planning stage rather than the review stage
- Independent Test Generation in Multi-Agent Code Systems — the Phase 2 mechanism applied across agents
- Red-Green-Refactor for Agent Development — the TDD substrate Phase 2 builds on
- Spec-Driven Development — the spec-authorship workflow Phase 1 invokes