AIRA: Inspection Framework for AI-Generated Code

A deterministic 15-check inspection framework that targets the patterns where AI-generated code preserves the appearance of functionality while silently degrading guarantees.

The failure mode AIRA targets

The AIRA paper (Parris, 2026) defines failure truthfulness as "the property that a system's observable outputs accurately represent its internal success or failure state, without suppression, ambiguity, or degradation masking."

A matched-control replication (955 AI-attributed files vs 955 human controls across JavaScript, Python, TypeScript) found 0.435 high-severity findings per AI file vs 0.242 for human controls, a 1.80x differential concentrated in exception-handling patterns (AIRA paper §Study 3). Independent work converges. An empirical study of AI-generated build code across 387 PRs and 945 files identifies lack of error handling as a recurring maintainability smell (Mudbhari et al., arxiv:2601.16839), and a 304,362-commit longitudinal study finds AI-authored code accumulates technical debt faster than human-authored code (arxiv:2603.28592).

The reward-shaped failure hypothesis

AIRA proposes the pattern is an artifact of optimization through human feedback rather than random bug distribution (AIRA paper).

graph TD
    A[RLHF training] --> B[Rater sees output]
    B -->|Visible crash| C[Strong negative signal]
    B -->|Silent degradation| D[Weak / no signal]
    C --> E[Model learns:<br/>avoid visible failures]
    D --> E
    E --> F[Generation biased toward<br/>surface correctness]

Visible crashes draw stronger negative feedback than silent failures, because raters can see a crash but not a silent degradation. Optimizing against this lopsided signal biases the model toward code that looks correct under shallow inspection: broad `except:` blocks, fallback paths that always succeed, retry loops that mask contract violations.
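
One of the patterns named above, a retry loop that masks a contract violation, might look like the following. This is an illustrative sketch, not code from the AIRA paper; `FetchError`, `flaky_fetch`, `fetch_masked`, and `fetch_truthful` are hypothetical names introduced only for this example.

```python
class FetchError(Exception):
    """Hypothetical error type used only for this illustration."""


def flaky_fetch(key):
    # Stand-in for a dependency that always fails in this sketch.
    raise FetchError(f"backend unavailable for {key!r}")


def fetch_masked(key, attempts=3):
    # Looks resilient, but once the retries run out it returns an
    # empty result that the caller cannot tell apart from "no data".
    for _ in range(attempts):
        try:
            return flaky_fetch(key)
        except Exception:
            continue
    return {}


def fetch_truthful(key, attempts=3):
    # Retries only the expected error and raises once the retry
    # budget is spent, so the failure stays observable.
    last_error = None
    for _ in range(attempts):
        try:
            return flaky_fetch(key)
        except FetchError as exc:
            last_error = exc
    raise FetchError(f"gave up after {attempts} attempts") from last_error
```

Both functions retry the same number of times. The difference is only in what the caller can observe after the last attempt fails.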

An LLM-based reviewer inherits the same training bias and is prone to missing the same patterns. AIRA is deterministic by design, which makes it resistant to the failure mode it detects.

The 15 checks

AIRA defines 15 deterministic checks, each mapped to a specific failure-truthfulness pattern (AIRA paper §Framework). Thirteen are automatable. C07 and C12 require human review.

| Code | Check | What it catches |
|------|-------|-----------------|
| C01 | Success Integrity | Code paths returning success without verifying the operation completed |
| C02 | Audit / Evidence Integrity | Logging that omits or misrepresents failure state |
| C03 | Broad Exception Suppression | `except:`, `catch(Exception)`, empty catches that swallow errors |
| C04 | Distributed Fallback | Scattered fallback paths that accumulate into unconditional success |
| C05 | Bypass / Override Paths | Hidden flags or env vars that disable safety checks |
| C06 | Ambiguous Return Contracts | Returning `None`, empty, or sentinel values that conflate success and failure |
| C07 | Parallel Logic Drift | Duplicated branches that diverge silently (human review only) |
| C08 | Unsupervised Background Tasks | Fire-and-forget work with no error propagation |
| C09 | Environment-Dependent Safety | Checks that pass only because of a test-environment artifact |
| C10 | Startup Integrity | Initialization that proceeds past partial failure |
| C11 | Deterministic Reasoning Drift | Logic that depends on non-deterministic ordering |
| C12 | Source-to-Output Lineage | Unclear data provenance in derived outputs (human review only) |
| C13 | Confidence Misrepresentation | Hard-coded or miscalibrated confidence values |
| C14 | Test Coverage Asymmetry | Happy-path coverage with no adversarial cases |
| C15 | Retry / Idempotency Drift | Retries that duplicate side effects or mask root cause |

Each check resolves to PASS, FAIL, or UNKNOWN. PASS indicates pattern absence, not system safety.
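
To make the three-valued result concrete, here is a minimal sketch of what a deterministic check in the spirit of C03 could look like for Python source, built on the standard-library `ast` module. It is not AIRA's actual rule implementation, which this page does not reproduce; `Result`, `check_broad_suppression`, and the "a handler that re-raises is not suppression" heuristic are assumptions made for illustration.

```python
import ast
from enum import Enum


class Result(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    UNKNOWN = "UNKNOWN"


# Assumption: these two names are what counts as "broad" in this sketch.
BROAD_NAMES = {"Exception", "BaseException"}


def _is_broad(handler: ast.ExceptHandler) -> bool:
    # A bare `except:` has no type; otherwise look for a broad class name.
    # Tuples such as `except (Exception, ValueError)` are not handled here.
    if handler.type is None:
        return True
    return isinstance(handler.type, ast.Name) and handler.type.id in BROAD_NAMES


def _reraises(handler: ast.ExceptHandler) -> bool:
    # Heuristic: any `raise` inside the handler body counts as propagation.
    return any(
        isinstance(node, ast.Raise)
        for stmt in handler.body
        for node in ast.walk(stmt)
    )


def check_broad_suppression(source: str) -> Result:
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # The file cannot be analyzed, so neither PASS nor FAIL is justified.
        return Result.UNKNOWN
    for node in ast.walk(tree):
        if isinstance(node, ast.ExceptHandler) and _is_broad(node) and not _reraises(node):
            return Result.FAIL
    return Result.PASS
```

The same input always produces the same result, and `PASS` here only says that no matching handler was found in the file. It says nothing about whether the code is otherwise correct.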

Where AIRA fits in a review stack

AIRA is a deterministic inspection layer, not a replacement for LLM-based review:

- Run AIRA as an early-pipeline gate before LLM review. Findings enter tiered code review at the tier matching the check (C03, C05, C08 escalate to human; C06, C14 route to AI+human).
- Combine with deterministic guardrails around probabilistic agents: AIRA is the rule layer, and LLM review is the context layer.
- Feed accepted findings into learned review rules to avoid re-flagging legitimate patterns.
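
The tier routing described above can be sketched as a plain lookup table. Only the five check codes named in this section are routed. The `UNROUTED` bucket, the tier strings, and the `(check_code, location)` finding shape are assumptions for illustration, since this page does not specify tiers for the remaining checks.

```python
HUMAN = "human"
AI_PLUS_HUMAN = "ai+human"
UNROUTED = "unrouted"  # assumption: tier not specified on this page

# Routing taken from the text above; nothing else is inferred.
ROUTES = {
    "C03": HUMAN,
    "C05": HUMAN,
    "C08": HUMAN,
    "C06": AI_PLUS_HUMAN,
    "C14": AI_PLUS_HUMAN,
}


def route_findings(findings):
    """Group (check_code, location) pairs by review tier."""
    tiers = {}
    for code, location in findings:
        tier = ROUTES.get(code, UNROUTED)
        tiers.setdefault(tier, []).append((code, location))
    return tiers


routed = route_findings([
    ("C03", "billing.py:10"),
    ("C14", "test_billing.py:1"),
    ("C01", "export.py:42"),
])
```

Keeping the table as data rather than branching logic keeps the gate deterministic and makes the routing policy easy to review.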

Scope and limits

The framework targets "governance, compliance, and safety-critical systems where fail-closed behavior is required" (AIRA paper), not general-purpose review. The paper acknowledges these limits:

- Cross-file semantic reasoning is limited: checks work on single files or short spans.
- False positives are unavoidable: broad exception handling is legitimate in resilience engineering and low-level systems code. False-positive rates are a documented trade-off for rule-based static analysis.
- PASS is not safety: it measures pattern absence, not correctness.
- The checks measure patterns, not authorship: the 1.80x figure describes a population difference, not an individual-file classifier.
- UNKNOWN on governance-critical paths needs manual verification.

Outside governance and safety-critical contexts (prototypes, research code, small teams with strong CI), the process cost of running AIRA likely outweighs the value of its findings.

Example

Here is the core pattern AIRA catches under C03 (Broad Exception Suppression) combined with C01 (Success Integrity).

Before, the code fails untruthfully:

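def save_user(user):
    try:
        db.insert(user)
        notify_search_index(user)
        return {"ok": True}
    except Exception:
        # C03: swallows every error, including a failed db.insert.
        # C01: reports success even though nothing was verified.
        return {"ok": True}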
AIRA flags C03 (broad exception) and C01 (returns success on failure). The caller sees `{"ok": True}` no matter what happened, so search index divergence accumulates silently.

After, the code is failure-truthful:

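import logging

logger = logging.getLogger(__name__)

def save_user(user):
    # No try/except here: a database failure propagates to the caller.
    db.insert(user)
    try:
        notify_search_index(user)
    except SearchIndexError as e:
        # Only the expected, recoverable error is caught, and it is logged.
        logger.error("search index update failed", extra={"user": user.id, "err": str(e)})
        # The degraded state is reported in the return value.
        return {"ok": True, "degraded": "search_index"}
    return {"ok": True}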

The database failure now propagates. The degraded search index is explicit in the return contract. `"ok": True` means what it claims.
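
To see the three outcomes from the caller's side, here is a self-contained sketch that exercises the failure-truthful version. The original snippet relies on module-level `db` and `notify_search_index`. Here they are passed in as parameters so the example runs without real services. `FakeDB`, `index_ok`, `index_down`, and the `SearchIndexError` definition are hypothetical stand-ins, not part of AIRA.

```python
import logging

logger = logging.getLogger("save_user_demo")


class SearchIndexError(Exception):
    """Hypothetical error raised by the search index client."""


class FakeDB:
    """In-memory stand-in for a database client."""

    def __init__(self, fail=False):
        self.fail = fail
        self.rows = []

    def insert(self, user):
        if self.fail:
            raise ConnectionError("database unavailable")
        self.rows.append(user)


def save_user(user, db, notify_search_index):
    # Same control flow as the failure-truthful version, with the
    # dependencies injected instead of read from module globals.
    db.insert(user)
    try:
        notify_search_index(user)
    except SearchIndexError as exc:
        logger.error("search index update failed: %s", exc)
        return {"ok": True, "degraded": "search_index"}
    return {"ok": True}


def index_ok(user):
    return None


def index_down(user):
    raise SearchIndexError("index timeout")


healthy = save_user({"id": 1}, FakeDB(), index_ok)
degraded = save_user({"id": 2}, FakeDB(), index_down)
```

In this sketch, `healthy` carries no `degraded` key, `degraded` names the affected subsystem, and a call with `FakeDB(fail=True)` raises `ConnectionError` instead of returning a result at all. A caller can branch on all three cases without guessing.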

Key Takeaways

- AIRA targets the gap between observable output state and internal success state, the failure mode it hypothesizes is amplified by RLHF reward shaping.
- The reported empirical signal spans three languages (JavaScript, Python, TypeScript): AI-attributed files show ~1.80x the high-severity findings of human controls, concentrated in exception handling.
- Deterministic checks resist the training bias that would blind an LLM reviewer to the same patterns.
- The framework is scoped to governance, compliance, and safety-critical systems. PASS means pattern absent, not system safe.

Tiered Code Review — route AIRA findings to the appropriate review tier
Deterministic Guardrails Around Probabilistic Agents — the rule layer AIRA fits into
Anti-Reward-Hacking: Rubrics That Resist Gaming — the same reward-shaping concern from the eval-design side
Exception Handling and Recovery Patterns — legitimate patterns AIRA must distinguish from suppression
Agentic Code Review Architecture — where LLM-based review complements rule-based inspection