Opponent Processor / Multi-Agent Debate

Deploy two agents with structurally opposed incentives to independently critique each other's reasoning, then synthesize the result.

Several neighboring patterns look similar, but each solves a different problem:

Pattern	Structure	Critique phase
Voting / Ensemble	N agents run same task independently → aggregate	No — just aggregate
Adversarial Multi-Model Pipeline (VSDD)	Builder produces artifacts → adversary attacks with fresh context	Sequential, one-directional
Critic Agent	Primary agent plans → critic gates before execution	One-directional, pre-execution
Opponent Processor (this page)	Two co-equal agents with opposing incentives → mutual critique	Bidirectional, structured debate

The defining feature is structural. Opposition is role-encoded in the system prompts from the start, and each co-equal agent critiques the other before synthesis.

Mechanism

graph TD
    T[Task / Decision] --> A[Agent A — Role: Advocate / Pro]
    T --> B[Agent B — Role: Critic / Con]
    A -->|Output A| C[Cross-Critique Round]
    B -->|Output B| C
    C -->|Revised A| S[Synthesis]
    C -->|Revised B| S
    S --> D{Resolution}
    D -->|Automated| E[Third-agent synthesis]
    D -->|Weighted| F[Confidence aggregation]
    D -->|High-stakes| G[Human review]

The mechanism is uncorrelated context windows. Each agent receives the same input but reasons independently before seeing the other's output. This stops the first agent's framing from anchoring the second, the main source of groupthink in correlated-context approaches.

Steps:

Assign opposing system prompts with explicit, conflicting incentives.
Spawn both agents with identical input context.
Collect independent outputs. Neither agent sees the other's result yet.
Cross-critique: each agent reviews and challenges the other's reasoning.
Route the revised outputs to synthesis.
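The steps above can be sketched as a small orchestration function. This is a minimal sketch, not a definitive implementation: `ModelFn`, `DebateResult`, and `run_debate` are hypothetical names introduced here, and `call_model` stands in for whatever client call you use (it is assumed to take a system prompt and a user message and return text).

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical signature: (system_prompt, user_message) -> model text.
ModelFn = Callable[[str, str], str]


@dataclass
class DebateResult:
    initial_a: str
    initial_b: str
    revised_a: str
    revised_b: str


def run_debate(task: str, prompt_a: str, prompt_b: str, call_model: ModelFn) -> DebateResult:
    # Independent phase: identical input, separate calls.
    # Neither agent sees the other's output yet.
    initial_a = call_model(prompt_a, task)
    initial_b = call_model(prompt_b, task)

    def critique_request(own: str, other: str) -> str:
        # Built only from the independent outputs, so the order of the
        # two critique calls below does not matter.
        return (
            f"{task}\n\nYour initial position:\n{own}\n\n"
            f"The opposing agent argues:\n{other}\n\n"
            "Challenge their reasoning, then revise your position."
        )

    # Cross-critique in both directions.
    revised_a = call_model(prompt_a, critique_request(initial_a, initial_b))
    revised_b = call_model(prompt_b, critique_request(initial_b, initial_a))

    # The caller routes the revised outputs to synthesis.
    return DebateResult(initial_a, initial_b, revised_a, revised_b)
```

Passing the model call in as a parameter keeps the orchestration testable with a stub and makes the independence of the first two calls easy to audit.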

Role pair design

Opposing roles must be structurally incompatible to generate genuine disagreement:

Domain	Agent A	Agent B
Code review	Author-defender	Security auditor
Architecture	Simplicity advocate	Future-proofing advocate
Cost decisions	Department representative	Company auditor
Risk assessment	Optimistic analyst	Conservative risk officer
Content moderation	Free expression advocate	Safety reviewer

Encode opposition in the system prompt as a role with explicit incentives, not a vague "be critical." An agent told to defend a decision surfaces different evidence than one told to challenge it.
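One way to keep the incentives explicit is to store each role as data and generate the system prompt from it. The sketch below assumes a hypothetical `Role` dataclass; the field names and the prompt wording are illustrative, not a prescribed template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    incentive: str   # the position this agent is charged with arguing
    blind_spot: str  # the analysis style whose omissions it should surface

    def system_prompt(self) -> str:
        return (
            f"You are the {self.name}. Your role is to {self.incentive}. "
            f"Surface concrete evidence that {self.blind_spot} would miss. "
            "Do not soften your position to reach agreement."
        )


# Hypothetical wording for the code-review pair from the table above.
CODE_REVIEW_PAIR = (
    Role(
        name="author-defender",
        incentive="make the strongest case that the change is safe to merge",
        blind_spot="a security-first review",
    ),
    Role(
        name="security auditor",
        incentive="make the strongest case that the change is unsafe to merge",
        blind_spot="an author-friendly review",
    ),
)
```

Because the two `incentive` strings contradict each other directly, neither prompt can be satisfied by agreeing with the other agent.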

Synthesis options

Three synthesis strategies carry different cost and latency profiles:

Strategy	Mechanism	When to use
Third-agent synthesis	A separate agent integrates the two positions	Decisions with complex trade-offs that need reasoning
Weighted aggregation	Combine outputs by confidence scores or domain authority	Classification tasks with measurable confidence
Human-in-the-loop	Present the competing analyses for human judgment	Highest-stakes decisions; irresolvable value conflicts
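For the weighted aggregation row, a minimal sketch might look like the following. It assumes each agent returns a label plus a self-reported confidence in [0, 1]; `Verdict`, `weighted_aggregate`, and the `authority` weights are hypothetical names, and self-reported confidence is not guaranteed to be calibrated.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class Verdict(NamedTuple):
    agent: str
    label: str
    confidence: float  # self-reported, assumed to lie in [0, 1]


def weighted_aggregate(
    verdicts: List[Verdict],
    authority: Optional[Dict[str, float]] = None,
) -> Tuple[str, float]:
    """Return the winning label and its share of the total weight."""
    authority = authority or {}
    scores: Dict[str, float] = defaultdict(float)
    for v in verdicts:
        # Agents without an explicit authority weight default to 1.0.
        scores[v.label] += v.confidence * authority.get(v.agent, 1.0)
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no positive weight to aggregate")
    label = max(scores, key=scores.get)
    return label, scores[label] / total
```

A winning share close to 0.5 signals that the two agents are nearly balanced, which may be a reason to escalate to human review instead of accepting the label.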

Include a max-round limit and deadlock detection in any automated synthesis. Two agents with opposing incentives can loop without convergence, so the loop must have an exit condition.
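A bounded loop with an explicit exit reason is one way to implement this. The sketch below is illustrative: `debate_loop`, `run_round`, and `agreed` are hypothetical names, and deadlock is detected by an exact repeat of an earlier pair of positions. Real model outputs rarely repeat verbatim, so in practice you would likely compare normalized stances or a similarity score instead.

```python
from typing import Callable, Tuple

Positions = Tuple[str, str]
RoundFn = Callable[[str, str], Positions]  # one critique round: positions in, revised positions out
AgreeFn = Callable[[str, str], bool]       # convergence check supplied by the caller


def debate_loop(
    pos_a: str,
    pos_b: str,
    run_round: RoundFn,
    agreed: AgreeFn,
    max_rounds: int = 3,
) -> Tuple[str, int, Positions]:
    """Run critique rounds until convergence, deadlock, or the round limit."""
    seen = {(pos_a, pos_b)}
    for round_no in range(1, max_rounds + 1):
        pos_a, pos_b = run_round(pos_a, pos_b)
        if agreed(pos_a, pos_b):
            return "converged", round_no, (pos_a, pos_b)
        if (pos_a, pos_b) in seen:
            # The pair of positions repeats an earlier state,
            # so further rounds would only cycle.
            return "deadlock", round_no, (pos_a, pos_b)
        seen.add((pos_a, pos_b))
    return "max_rounds", max_rounds, (pos_a, pos_b)
```

Returning the exit reason lets the caller route `"deadlock"` and `"max_rounds"` outcomes to a different resolution path, such as human review.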

When to apply

Apply when:

The decision is consequential and hard to reverse — architecture choices, security policies, resource allocation
Single-agent outputs show systematic bias toward one framing (for example, always recommending the simpler solution)
The decision requires surfacing a value conflict — trade-offs where reasonable people disagree
You have an existing critic or voting pattern but need adversarial pressure before synthesis, not just parallel aggregation

Skip when:

The task has an objectively correct answer (use Voting / Ensemble instead)
Latency is a constraint — debate rounds add 2–4 model round-trips
The decision is routine — the overhead is not justified for low-stakes tasks
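The apply/skip criteria above can be read as a simple routing rule. The function below is a hypothetical sketch: the flag names are invented for illustration, and falling back to a single agent for latency-constrained or routine work is an assumption, since the criteria only name an alternative for the objective-answer case.

```python
def choose_pattern(
    *,
    objective_answer: bool,
    latency_constrained: bool,
    routine: bool,
    hard_to_reverse: bool = False,
    framing_bias: bool = False,
    value_conflict: bool = False,
) -> str:
    """Map the apply/skip criteria to a pattern name."""
    # Skip conditions are checked first: any one of them rules debate out.
    if objective_answer:
        return "voting"
    if latency_constrained or routine:
        return "single-agent"  # assumed fallback, not stated in the criteria
    # Any apply condition is enough to justify the debate overhead.
    if hard_to_reverse or framing_bias or value_conflict:
        return "debate"
    return "single-agent"
```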

Cost profile

Debate costs at least 2× the tokens of a single-agent run. With cross-critique and third-agent synthesis, expect 3–4×. That overhead is justified only where a better decision is worth far more than the extra compute.

At N=3, the Voting / Ensemble pattern also costs 3×, but it skips the critique phase, which is where debate's quality gain originates.
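As a rough check on these multipliers, the arithmetic can be written out. This sketch counts output-token budgets only and assumes illustrative values of 1024 tokens per initial analysis, 512 per critique, and 1024 for synthesis, with both agents making one critique call per round. `output_budget_multiplier` is a hypothetical helper; input tokens, which grow as earlier outputs are passed back in, are not counted.

```python
def output_budget_multiplier(
    initial: int = 1024,
    critique: int = 512,
    synthesis: int = 1024,
    critique_rounds: int = 1,
    baseline: int = 1024,
) -> float:
    """Ratio of the debate's output-token budget to a single-agent run."""
    total = (
        2 * initial                       # two independent analyses
        + 2 * critique * critique_rounds  # both agents critique each round
        + synthesis                       # third-agent synthesis (0 if unused)
    )
    return total / baseline


# Two independent analyses only: the 2x floor.
floor = output_budget_multiplier(critique_rounds=0, synthesis=0)

# One bidirectional critique round plus third-agent synthesis.
full = output_budget_multiplier()
```

Under these assumed budgets, `floor` is 2.0 and `full` is 4.0, which sits at the top of the 3–4× range.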

Empirical caveats

Controlled studies show mixed results. Zhang et al. (2025) found that multi-agent debate (MAD) often fails to outperform Chain-of-Thought or Self-Consistency at equal compute. Becker et al. (2025) document problem drift: 35% of debate rounds show lack of progress. Majority pressure can also suppress independent correction, so agents converge on a confident wrong answer.

Debate earns its cost only for hard-to-reverse decisions with genuine value conflicts. For tasks with an objectively correct answer, single-agent scaling or voting is more efficient.

Example

Architecture review using an advocate/skeptic pair:

import anthropic

client = anthropic.Anthropic()

TASK = """
We are deciding whether to move our monolith to microservices.
The codebase is 200k lines, team of 8, no existing service mesh.
Assess the architecture decision.
"""

ADVOCATE_PROMPT = (
    "You are an architecture advocate. Your role is to identify the strongest "
    "case FOR the proposed change. Surface concrete benefits and opportunities "
    "that would be missed by a conservative analysis."
)

SKEPTIC_PROMPT = (
    "You are an architecture skeptic. Your role is to identify the strongest "
    "case AGAINST the proposed change. Surface concrete failure modes and "
    "risks that would be missed by an optimistic analysis."
)

# Phase 1: Independent analysis — neither sees the other's output
advocate_output = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    system=ADVOCATE_PROMPT,
    messages=[{"role": "user", "content": TASK}],
).content[0].text

skeptic_output = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    system=SKEPTIC_PROMPT,
    messages=[{"role": "user", "content": TASK}],
).content[0].text

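# Phase 2: Cross-critique. Each agent challenges the other's phase-1 analysis.
advocate_critique = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=512,
    system=ADVOCATE_PROMPT,
    messages=[{"role": "user", "content": f"{TASK}\n\nThe skeptic argues:\n{skeptic_output}\n\nChallenge these objections."}],
).content[0].text

skeptic_critique = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=512,
    system=SKEPTIC_PROMPT,
    messages=[{"role": "user", "content": f"{TASK}\n\nThe advocate argues:\n{advocate_output}\n\nChallenge these claims."}],
).content[0].text

# Phase 3: Synthesis (third agent) sees both analyses and both rebuttals
synthesis = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    system="You are a neutral decision synthesizer. Integrate the opposing analyses into a balanced recommendation.",
    messages=[{"role": "user", "content": (
        f"Advocate:\n{advocate_output}\n\nSkeptic:\n{skeptic_output}\n\n"
        f"Advocate response to skeptic:\n{advocate_critique}\n\n"
        f"Skeptic response to advocate:\n{skeptic_critique}"
    )}],
).content[0].text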

The advocate and skeptic receive the same task but structurally incompatible incentives, so neither can simply adopt the other's framing.

Key Takeaways

Opposition is structural, not emergent — encode it in system prompts with explicit, conflicting incentives
Uncorrelated context windows are the mechanism: each agent reasons independently before seeing the other's output
Three synthesis paths: third-agent integration, weighted aggregation, or human review — choose based on decision type
Deadlock is a real failure mode: add max-round limits and a forced exit to any automated synthesis loop
Cost floor is 2×; with critique rounds and synthesis, expect 3–4× — justified only for high-stakes, hard-to-reverse decisions
Distinct from voting (no critique phase) and VSDD (sequential, not co-equal) — use debate when you need bidirectional adversarial pressure before synthesis