Improper Output Handling: Validate Agent Output Before Downstream Use

Treat agent output as untrusted input to the next system — every downstream sink needs its own per-sink validation gate.

Improper Output Handling is the OWASP LLM05:2025 vulnerability class: "insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream to other components and systems." For coding agents, the canonical case is generated code executed or committed without review, plus agent text rendered in surfaces that auto-interpret it.

The mechanism

Trust does not transfer through string boundaries. An LLM produces strings that bypass any input-validation layer already in place: that layer validated the user prompt, not the model response. When the response reaches a code-interpreting sink, the sink receives text that nothing in the pipeline has scrutinized. OWASP's rule: "Treat the model as any other user, adopting a zero-trust approach, and apply proper input validation on responses coming from the model to backend functions" (OWASP LLM05:2025).

OWASP lists five canonical sinks: exec/eval/shell (RCE), browser-rendered JS or markdown (XSS), unparameterized SQL (injection), unsanitized file paths (traversal), and email templates without escaping (phishing) (OWASP LLM05:2025).
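
For the first sink on that list, a minimal sketch of a command allowlist, assuming the agent proposes a shell command as a single string. The `ALLOWED` table and the function name `parse_allowed_command` are hypothetical, not part of any library; only `shlex.split` is a real standard-library call.

```python
import shlex

# Hypothetical allowlist: program name -> permitted first argument (subcommand).
ALLOWED = {
    "git": {"status", "diff", "log"},
    "npm": {"test"},
}

def parse_allowed_command(llm_text: str) -> list[str]:
    """Return an argv list if the command is allowlisted, else raise ValueError."""
    # shlex.split tokenizes like a POSIX shell without executing anything;
    # it raises ValueError itself on unbalanced quotes.
    argv = shlex.split(llm_text)
    if not argv:
        raise ValueError("empty command")
    program, args = argv[0], argv[1:]
    if program not in ALLOWED:
        raise ValueError(f"program not allowed: {program!r}")
    if not args or args[0] not in ALLOWED[program]:
        raise ValueError(f"subcommand not allowed for {program!r}")
    return argv
```

The returned list would be passed to `subprocess.run(argv, shell=False)` so that no shell ever interprets the string. This sketch checks only the program and its first argument; a real gate would also vet flags and paths.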

Downstream sinks in coding-agent workflows

Sink	Risk	Per-sink control
Source control (auto-commit)	Vulnerable code lands in main	SAST gate in CI; human approval on security-sensitive paths (Infosecurity Magazine)
Shell / `exec` / `eval` in tool execution	Remote code execution	Allowlist of permitted commands; refuse `eval` on any LLM-produced string (OWASP LLM05:2025)
SQL through ORM or string-concat	SQL injection (for example CVE-2025-1793)	Parameterized queries or prepared statements — never string interpolation (OWASP LLM05:2025)
Markdown / HTML in PR comments, chat, email	XSS, exfiltration via auto-fetched images	Context-aware HTML encoding; CSP; image-fetch gating in the renderer
Package manager (`pip install`, `npm install`)	Slopsquatting: installing a hallucinated package	Resolve each generated dependency against the project's committed lockfile before install (Help Net Security)
File paths	Path traversal	Canonicalize and constrain to a base directory before any FS call (OWASP LLM05:2025)
Downstream LLM (multi-agent pipeline)	Re-injection via tool output	Route via Action-Selector Pattern so tool output never re-enters a model
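
As an illustration of the package-manager row, a rough sketch of checking generated dependency names against a lockfile. It assumes a simple pinned, requirements-style lockfile (`name==version` per line); the function names are hypothetical, and other lockfile formats would need their own parser.

```python
import re

def normalize(name: str) -> str:
    # PEP 503 name normalization: runs of "-", "_", "." collapse to one hyphen.
    return re.sub(r"[-_.]+", "-", name).lower()

def locked_names(lockfile_text: str) -> set[str]:
    """Collect normalized package names from 'name==version' lines."""
    names = set()
    for line in lockfile_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and option lines
            continue
        # The name ends at the first version specifier, extra, or marker.
        name = re.split(r"[=<>!~\[; ]", line, maxsplit=1)[0]
        names.add(normalize(name))
    return names

def unknown_dependencies(generated: list[str], lockfile_text: str) -> list[str]:
    """Return generated package names that the lockfile does not contain."""
    known = locked_names(lockfile_text)
    return [pkg for pkg in generated if normalize(pkg) not in known]
```

Any name returned by `unknown_dependencies` would be held for human review rather than installed.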

The canonical SQL instantiation is CVE-2025-1793 in LlamaIndex (CVSS 9.8): vector-store integrations constructed SQL by string-concatenating LLM-built queries, fixed in 0.12.28 by switching to parameterized queries (Endor Labs advisory; GHSA-v3c8-3pr6-gr7p). The fix is not LLM-specific; it is the same parameterization control that defends against any user-supplied SQL string.
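
The difference between concatenation and parameterization can be shown in a few lines. This is a sketch using the standard-library `sqlite3` module as a stand-in database, not the LlamaIndex code itself; the table, rows, and hostile value are made up for illustration. Note that `sqlite3` uses `?` placeholders, while drivers such as psycopg use `%s`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, author TEXT)")
conn.executemany("INSERT INTO docs (author) VALUES (?)", [("alice",), ("bob",)])

# Stand-in for a hostile model response that tries to widen the filter.
llm_value = "alice' OR '1'='1"

# Unsafe: the string becomes part of the SQL grammar and matches every row.
unsafe_rows = conn.execute(
    f"SELECT author FROM docs WHERE author = '{llm_value}'"
).fetchall()

# Safe: the placeholder binds the string as a value, so no author matches it.
safe_rows = conn.execute(
    "SELECT author FROM docs WHERE author = ?", (llm_value,)
).fetchall()
```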

Why it works

LLM output and user input belong to the same trust class — both are arbitrary strings that cross trust boundaries at a code-interpreting sink. Per-sink controls (parameterization, HTML encoding, allowlists) defend the boundary, not the source, so they apply to LLM output for the same reason they apply to user input. The novelty of LLM05 is not new controls; it is new applicability surface — LLM responses now cross boundaries that previously consumed only validated or trusted strings. Simon Willison frames it: "anyone who can control more than a few sentences of input to that model can cause it to output anything they like" (Willison, 2025).
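
A boundary check of this kind never asks who produced the string. As a minimal sketch for the file-path sink, assuming Python 3.9+ for `Path.is_relative_to`, the hypothetical helper `resolve_inside` below applies the same test to a user-supplied and a model-supplied path alike:

```python
from pathlib import Path

def resolve_inside(base_dir: str, untrusted: str) -> Path:
    """Resolve `untrusted` under `base_dir`, or raise ValueError if it escapes."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks; an absolute
    # `untrusted` replaces the base entirely, which the check below catches.
    candidate = (base / untrusted).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {untrusted!r}")
    return candidate
```

The caller would use only the returned path for any file-system call, never the original string.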

The defect baseline confirms the need: about 24.7% of AI-generated code carries a security vulnerability, 1.5 to 2× the rate of human-written code (Infosecurity Magazine; Checkmarx). Without a downstream gate, that defect rate ships.

Relation to adjacent patterns

LLM05 is the risk class; these patterns are concrete mitigations against specific sinks:

Safe Outputs Pattern — bounds the write blast radius by routing every modification through a validating MCP; operation filter, content moderation, and secret removal each realize per-sink validation.
Action-Selector Pattern — eliminates the tool-output-to-LLM feedback loop, so the downstream sink is a deterministic executor instead of another model.
Agent-Authored Messages as a Deferred Exfiltration Channel — a concrete LLM05 instantiation where the unsafe sink is renderer auto-fetch.

Distinction from LLM06

OWASP draws the boundary explicitly: "Excessive Agency differs from Insecure Output Handling which is concerned with insufficient scrutiny of LLM outputs" (OWASP LLM06:2025). LLM06 is the agent taking action (excessive functionality, permissions, autonomy); LLM05 is a downstream consumer mishandling the text the agent produced. A permission-bounded agent can still trigger LLM05 if its bounded actions emit strings consumed unsafely downstream. The controls are orthogonal — bound actions (LLM06) and validate output at each sink (LLM05).

When this backfires

Mature teams already enforce per-sink encoding: a team that already parameterizes every DB call, escapes every HTML render, and refuses eval on arbitrary strings gains nothing from a parallel LLM-specific scanner. Confirm existing per-sink controls cover the new applicability surface; do not duplicate them.
The LLM never reaches a code-interpreting sink: an agent that drafts emails for human review or generates code a developer reads before commit has no LLM05 surface in those paths. Adding output validation there is theatre with a latency cost.
Output-stream filtering blocks legitimate technical content: pattern-based "strip URLs / strip code blocks" sanitizers reject legitimate documentation links and code samples. Prefer context-aware escaping at the sink over content-stripping of the LLM stream (Hoyt: Don't sanitize input. Escape output).
Schema-constrained tool calling already enforces validation: strict structured outputs (Pydantic / JSON Schema) at the model interface mean the LLM emits only schema-conformant parameters. Where every field is an enum or otherwise bounded type, layering per-sink validation on top adds latency without marginal risk reduction; the schema is the validator. A free-form string field is still an arbitrary string and keeps its per-sink control.
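
The preference for escaping at the sink over stripping the stream can be sketched with the standard-library `html.escape`. The function name `render_comment` and the sample text are hypothetical, and the sketch covers only the HTML element-body context; attribute, URL, and script contexts each need their own encoder.

```python
import html

def render_comment(agent_text: str) -> str:
    """Escape agent text for an HTML element body: content kept, markup neutralized."""
    # html.escape converts &, <, > and both quote characters to entities.
    return f"<p>{html.escape(agent_text)}</p>"

sample = 'See https://example.com/docs and <img src="https://evil.example/x.png">'
rendered = render_comment(sample)
```

The documentation link survives as readable text, while the injected image tag is displayed literally instead of being fetched by the browser.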

Example

A retrieval-augmented agent answers user questions over a vector store. The agent constructs a metadata filter from natural language; the application executes it against PostgreSQL.

Before — generated query interpolated into SQL:

# Vulnerable: the model's text is spliced directly into the SQL statement.
filter_clause = llm.generate(f"build a WHERE clause for: {user_question}")
cursor.execute(f"SELECT * FROM docs WHERE {filter_clause}")

A user prompt such as "delete all", rephrased by the LLM into `1=1; DROP TABLE docs --`, executes against the database. This is the CVE-2025-1793 class.

After — LLM emits a schema-constrained filter object; the executor builds parameterized SQL. Placeholders can bind only values — not column names or operators — so identifiers and operators go through an explicit allowlist mapping instead:

from pydantic import BaseModel
from typing import Literal

class Filter(BaseModel):
    field: Literal["author", "year", "tag"]
    op: Literal["=", ">", "<"]
    value: str

# Identifiers and operators cannot be bound by placeholders;
# map each validated enum member to a fixed SQL fragment.
COLUMNS = {"author": "author", "year": "year", "tag": "tag"}
OPERATORS = {"=": "=", ">": ">", "<": "<"}

f = Filter.model_validate(llm.structured_output(user_question, schema=Filter))
cursor.execute(
    f"SELECT * FROM docs WHERE {COLUMNS[f.field]} {OPERATORS[f.op]} %s",
    (f.value,),
)

The LLM never writes SQL; the deterministic executor assembles the query from allowlisted fragments. The column and operator come from fixed mappings keyed by the validated enum, and only the value is parameterized.

Key Takeaways

Agent output is untrusted input to the next system — enumerate downstream sinks and validate at each one.
The controls are not new: parameterized queries, context-aware encoding, command allowlists, schema validation. The new surface is that LLM-generated strings now reach those sinks.
LLM05 (downstream mishandling) is distinct from LLM06 (agent action) — defend both, independently.
Safe Outputs and Action-Selector are concrete architectural mitigations; per-sink encoding is the universal control.
For coding agents specifically, the sinks include source control, package managers, and shells — sinks that previously consumed only trusted strings.