Pattern Replication Risk

Pattern replication occurs when an agent absorbs a codebase's conventions and reproduces them at scale, including the deprecated APIs, legacy error handling, and hand-rolled utilities you meant to retire.

The Mechanism

Agents learn from what they find. When an agent scans your codebase, it treats golden-path implementations and legacy workarounds as equally valid examples. This is faithful reproduction, not a prompting failure, and it lets suboptimal patterns propagate faster than a team can review them.

graph LR
    A[Legacy pattern<br>in codebase] --> B[Agent reads<br>codebase]
    B --> C[Agent reproduces<br>pattern at scale]
    C --> D[More instances<br>for agent to learn from]
    D --> B
    style A fill:#c62828,color:#fff
    style D fill:#c62828,color:#fff

The Evidence

Finding	Source
Copy/paste code rose from 8.3% to 12.3%; refactoring dropped from 25% to under 10%	GitClear, 211M lines analyzed
Static analysis warnings rose ~30% post-AI-adoption; complexity rose 40%+	CMU controlled study, 807 repos
AI-authored PRs contain 1.7x more issues than human-only PRs	CodeRabbit, 470 PRs
67.3% of AI-generated PRs rejected vs 15.6% for manual code	LinearB via Mike Mason
AI magnifies strengths of high-performing orgs and dysfunctions of struggling ones	DORA Report 2025

Specific Manifestations

Three failure modes (via Mike Mason):

Brute force fixes. Raising Docker memory limits instead of finding the leak; adding retry loops instead of fixing the root error.

Backward compatibility shortcuts. Thin wrappers around deprecated APIs. The deprecated code persists under an extra layer.

Excessive mocking. Test suites that validate the mocks rather than the code.
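
The excessive-mocking failure mode can be illustrated with a minimal sketch. The function `parse_price` and both tests are hypothetical, invented here only to contrast a test that checks a mock with one that checks real code.

```python
from unittest import mock


def parse_price(raw: str) -> float:
    """Hypothetical function under test: convert '$1,234.50' to 1234.5."""
    return float(raw.replace("$", "").replace(",", ""))


def test_validates_only_the_mock():
    # The real function is never called, so this passes even if
    # parse_price is completely broken.
    fake_parse = mock.Mock(return_value=1234.5)
    assert fake_parse("$1,234.50") == 1234.5
    fake_parse.assert_called_once_with("$1,234.50")


def test_exercises_the_real_code():
    # No mock: the assertion fails if the parsing logic regresses.
    assert parse_price("$1,234.50") == 1234.5


test_validates_only_the_mock()
test_exercises_the_real_code()
```

Both tests pass, but only the second one would catch a regression in `parse_price`. A suite dominated by the first shape reports green while covering nothing.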

Why It Happens

Agents retrieve context by syntactic and semantic similarity, not by quality. The retriever surfaces the nearest matching implementation, and a # TODO: remove comment does nothing to lower its rank.
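
A toy sketch of similarity-only retrieval may make this concrete. It uses Jaccard token overlap as a stand-in for whatever ranking a real agent uses (typically embeddings or code search, which this does not model). The file names and snippets are hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Split text into a set of lowercase identifier-like tokens."""
    return set(re.findall(r"[A-Za-z_]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank(query: str, snippets: dict[str, str]) -> list[str]:
    """Return snippet names ordered from most to least similar to the query."""
    q = tokens(query)
    return sorted(
        snippets,
        key=lambda name: jaccard(q, tokens(snippets[name])),
        reverse=True,
    )


# Hypothetical repository contents, for illustration only.
SNIPPETS = {
    "legacy_client.py": "# TODO: remove\nresp = fetchWithRetry(url, retries=3, backoff=1)",
    "approved_client.py": "resp = session.get(url, timeout=5)",
}

ranking = rank("fetch url with retries and backoff", SNIPPETS)
print(ranking[0])  # legacy_client.py
```

The legacy snippet ranks first because it shares more tokens with the task description. The TODO comment is just more text to the scorer, not a signal of quality.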

Generation amplifies the match: few-shot conditioning on in-repo examples dominates prose instructions. The model treats surrounding code as higher-fidelity evidence of "what this codebase does" than any guidance, and every new usage becomes retrieval context for the next run.

Mechanical enforcement beats guidance, which is the case made in hooks for enforcement vs prompts for guidance. A linter that rejects the deprecated pattern stops new instances from entering the retrieval surface; a prompt to "prefer the new API" competes with N existing calls and loses.

The Fix: Clean the House Before Inviting the Agent

OpenAI's Harness team spent 20% of sprint time cleaning up "AI slop" before arriving at this approach:

Encode golden patterns as mechanical rules. Linters and CI checks that reject known anti-patterns — prose guidance is routinely overridden by contradicting examples.
Auto-generate refactoring PRs. Replace deprecated patterns with approved alternatives before scaling agent usage — part of getting to codebase readiness.
Track quality metrics. Monitor duplication rates, lint violations, and complexity scores. Degradation signals replication is outpacing remediation.
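
One way to turn the metric-tracking step into a mechanical check is a ratchet: record a baseline count per anti-pattern and flag any increase. The sketch below is a minimal illustration, not a replacement for a real duplication or complexity tool. The pattern names, regexes, baseline numbers, and sample sources are all assumptions.

```python
import re

# Hypothetical anti-patterns to track; the regexes are illustrative and
# would also match a function's own definition line or a commented-out call.
TRACKED_PATTERNS = {
    "fetchWithRetry": re.compile(r"\bfetchWithRetry\s*\("),
    "bare_except": re.compile(r"^\s*except\s*:", re.MULTILINE),
}


def count_patterns(sources: dict[str, str]) -> dict[str, int]:
    """Count occurrences of each tracked pattern across all source texts."""
    return {
        name: sum(len(rx.findall(text)) for text in sources.values())
        for name, rx in TRACKED_PATTERNS.items()
    }


def regressions(
    baseline: dict[str, int], current: dict[str, int]
) -> dict[str, tuple[int, int]]:
    """Return patterns whose count grew, mapped to (baseline, current)."""
    return {
        name: (baseline.get(name, 0), count)
        for name, count in current.items()
        if count > baseline.get(name, 0)
    }


# Made-up baseline and file contents for demonstration.
baseline = {"fetchWithRetry": 3, "bare_except": 1}
sources = {
    "a.py": "r = fetchWithRetry(u)\n",
    "b.py": "try:\n    x = fetchWithRetry(u)\nexcept:\n    pass\n",
    "c.py": "fetchWithRetry(a)\nfetchWithRetry(b)\n",
}
current = count_patterns(sources)
grown = regressions(baseline, current)
print(grown)  # {'fetchWithRetry': (3, 4)}
```

A CI job could fail whenever `grown` is non-empty and lower the baseline whenever a count drops, so the numbers can only move toward zero.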

When This Backfires

Conditions where clean-first is worse than proceeding directly:

Mid-migration codebases. Blanket lint rules fire on valid compatibility shims when two patterns intentionally coexist. Lint rules require pattern stability to act as deterministic guardrails.

Load-bearing deprecated APIs. When the replacement isn't available in all deploy targets, encoding a rejection rule creates CI failures with no resolution path.

Large legacy codebases. Pre-remediation spanning months may erase the productivity gain before agents are enabled; narrow rules scoped to new files reduce blast radius.
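
Scoping a rule to new files, with an allowlist for intentional compatibility shims, could look like the sketch below. The allowlist entries and file paths are hypothetical. The status letters follow the "A" (added) and "M" (modified) convention printed by `git diff --name-status`, and how that output is collected is left out.

```python
from fnmatch import fnmatch

# Hypothetical configuration: paths where the deprecated call is still allowed.
SHIM_ALLOWLIST = ["compat/*", "legacy/http_shim.py"]


def files_to_check(changed: dict[str, str]) -> list[str]:
    """Select files the rule applies to: newly added ones outside the allowlist.

    `changed` maps a path to a status letter ("A" added, "M" modified).
    """
    return [
        path
        for path, status in changed.items()
        if status == "A"
        and not any(fnmatch(path, pattern) for pattern in SHIM_ALLOWLIST)
    ]


# Made-up change set for demonstration.
changed = {
    "integrations/billing.py": "A",
    "compat/old_http.py": "A",
    "services/orders.py": "M",
}
selected = files_to_check(changed)
print(selected)  # ['integrations/billing.py']
```

Existing files and allowlisted shims stay unblocked, so the rule has a resolution path during a migration while still stopping the pattern from spreading into new code.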

Key Takeaways

Agents replicate whatever patterns they find; legacy code and golden paths propagate at the same rate.
The risk compounds: each agent-generated instance becomes retrieval context for the next run, accreting into shadow tech debt.
Prose guidance loses to codebase examples, so encode rejections of known anti-patterns as CI-enforced lint rules.
Remediate before scaling, but scope rules narrowly when the codebase is mid-migration or the replacement API isn't universally reachable.

Example

A codebase uses a hand-rolled fetchWithRetry utility dating from 2019. The team intended to migrate to a standard library wrapper once their HTTP client was upgraded, but the migration never happened.

When an agent is asked to add a new API integration, it scans the codebase for patterns:

import time

import requests

# Legacy utility -- flagged for removal in a 2021 TODO comment
def fetchWithRetry(url, retries=3, backoff=1):
    """GET url, retrying with exponential backoff on request errors."""
    for i in range(retries):
        try:
            return requests.get(url, timeout=5)
        except requests.RequestException:
            # Wait backoff * 2**i seconds before the next attempt
            time.sleep(backoff * (2 ** i))
    raise RuntimeError(f"Request failed after {retries} retries")

The agent finds three existing usages, treats them as the established pattern, and generates five new usages in the new integration -- each calling fetchWithRetry with slightly different backoff values.

After two sprints of agent-assisted work, the codebase has 23 usages of fetchWithRetry. The team's plan to delete it now requires touching 23 files instead of 3. A CI lint rule rejecting direct calls to fetchWithRetry (pointing to the approved alternative) would have blocked the first agent-generated usage, keeping the migration cost manageable.
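
A minimal sketch of such a rule, using only Python's standard `ast` module, is shown below. The hint text and the sample file name are placeholders, since the approved alternative is not named here. A real setup would more likely express this as a custom rule in whatever linter the team already runs.

```python
import ast

DEPRECATED = "fetchWithRetry"  # name taken from the example above
HINT = "use the approved HTTP wrapper instead"  # placeholder message


def find_deprecated_calls(source: str, filename: str = "<string>") -> list[str]:
    """Return one message per direct call to the deprecated function."""
    problems = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Matches both fetchWithRetry(...) and some_module.fetchWithRetry(...)
        if isinstance(func, ast.Name):
            name = func.id
        else:
            name = getattr(func, "attr", None)
        if name == DEPRECATED:
            problems.append(
                f"{filename}:{node.lineno}: call to {DEPRECATED}; {HINT}"
            )
    return problems


# Hypothetical agent-generated line, checked as if it came from a new file.
sample = "data = fetchWithRetry('https://example.com', retries=5)\n"
messages = find_deprecated_calls(sample, "integrations/billing.py")
for message in messages:
    print(message)
```

In CI, the script would exit non-zero whenever any message is produced. Because it inspects calls rather than definitions, the legacy utility itself can remain in place until the last caller is migrated.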

Copy-Paste Agent -- Agent config duplication across projects
Codebase Readiness -- Preparing a codebase for agent-assisted development
Agent-First Software Design -- Designing systems where agents are the primary consumers
Hooks for Enforcement vs Prompts for Guidance -- Mechanical enforcement over prose instructions
Deterministic Guardrails -- Linters and CI as agent boundaries
Abstraction Bloat -- Over-engineering and unnecessary hierarchies from agent output
Comprehension Debt -- The growing gap between agent-produced code and developer understanding
Shadow Tech Debt -- Cumulative codebase drift from autonomous agent commits