Monotonic Capability Attenuation for Composition-Safe Tool Use

Tag every value with a sink-specific capability budget and intersect budgets through composition — authority can only shrink, closing permission laundering.

When this recommendation applies

The pattern adds real security only when all these conditions hold (Jiang et al., 2026):

Expert-crafted manifests are feasible — someone who understands the threat model authors the budgets.
Attacks are explicit-flow — the adversary exfiltrates data the proxy can see (arguments, return values, chained inputs), not through timing or side channels.
All tool traffic passes through one observation point — off-protocol egress (curl shellouts, raw HTTP, headless browsers) bypasses the proxy.
A token-budget margin exists — capability metadata on every value adds to proxy overhead.

Outside these conditions, the mechanism degrades or fails silently; see "When this backfires" below.

What permission laundering is

An agent reads a confidential document, summarizes it, then sends the summary externally. Each per-tool check passes — read is allowed, summarize is content-agnostic, send-email is permitted to that recipient — yet the chained effect is exfiltration. The vulnerability is the composition itself, which creates authority no single value ever held (Jiang et al., 2026).

This is distinct from prompt injection (which corrupts the instruction channel) and from overreaching tool calls (which exceed scope at one call): permission laundering chains legal calls into an illegal outcome.
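To make the gap concrete, here is a minimal sketch of per-call policy checking applied to the read, summarise, send chain described above. The tool names, argument shapes, and policies are hypothetical, chosen only to mirror that example; they are not taken from Jiang et al.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical per-call policies: each one inspects only its own call.
POLICIES: Dict[str, Callable[[dict], bool]] = {
    "read_file": lambda args: args["path"].startswith("/docs/"),
    "summarise": lambda args: True,  # content-agnostic, always allowed
    "send_email": lambda args: args["to"] in {"partner@example.com"},
}


def per_tool_check(tool: str, args: dict) -> bool:
    """Return True if this single call is allowed, ignoring data provenance."""
    return POLICIES[tool](args)


# The laundering chain: confidential read -> summary -> external send.
chain: List[Tuple[str, dict]] = [
    ("read_file", {"path": "/docs/confidential.txt"}),
    ("summarise", {"text": "<contents of confidential.txt>"}),
    ("send_email", {"to": "partner@example.com", "body": "<summary>"}),
]

verdicts = [per_tool_check(tool, args) for tool, args in chain]
print(verdicts)  # [True, True, True]
```

Every call passes because no policy can see where its arguments came from, which is the property the chain exploits.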

How monotonic attenuation works

Every value carries a sink-specific capability budget: the set of sinks it may reach, for example {file-read, summarise} for a confidential document and {external-email, log} for a user-typed recipient. The runtime tracks budgets per value, not per tool (Jiang et al., 2026).

Composition propagates budgets by intersection: when a tool consumes inputs A and B, its output carries budget(A) ∩ budget(B). The output can reach a sink only if every input could reach that sink. Authority attenuates monotonically: a value can lose sinks through composition but never gain them.
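The intersection rule can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: the `Tagged` type, the `compose` helper, and the sink names are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable, FrozenSet


@dataclass(frozen=True)
class Tagged:
    """A value paired with the set of sinks it may still reach (hypothetical type)."""
    value: Any
    budget: FrozenSet[str]


def compose(fn: Callable[..., Any], *inputs: Tagged) -> Tagged:
    """Apply fn to the raw values; the output budget is the intersection of all input budgets."""
    if not inputs:
        raise ValueError("compose needs at least one tagged input")
    out_budget = reduce(lambda a, b: a & b, (t.budget for t in inputs))
    return Tagged(fn(*(t.value for t in inputs)), out_budget)


doc = Tagged("confidential text", frozenset({"file-read", "summarise"}))
note = Tagged("public note", frozenset({"summarise", "external-email", "log"}))

merged = compose(lambda a, b: f"{a} / {b}", doc, note)
print(sorted(merged.budget))  # ['summarise']
```

The output budget is always a subset of each input budget, so no sequence of `compose` calls can add a sink that some input lacked.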

graph TD
    D[Confidential doc<br/>budget: read, summarise]
    R[User recipient<br/>budget: external-email]
    D -->|summarise| S[Summary<br/>budget: read, summarise]
    S -->|send-email| C{budget ∩<br/>send-email?}
    R -->|send-email| C
    C -->|empty intersection| B[Denied]

    style D fill:#fbca04
    style R fill:#0e8a16,color:#fff
    style B fill:#b60205,color:#fff

The check at send-email reduces to set membership: is external-email in budget(summary)? The summary inherited the document's budget by intersection, so it is not, and the call is denied. Deployed as a transparent MCP proxy, the mechanism requires no changes to the agent or the tool servers: it sees every call, intersects budgets, and gates each sink (Jiang et al., 2026).
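The gate-then-retag loop from the diagram can be approximated in-process as follows. This is a sketch only: `BudgetProxy` and `Tagged` are hypothetical names, the tools are stand-in lambdas, and a real deployment would intercept MCP messages rather than Python calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class Tagged:
    value: str
    budget: FrozenSet[str]


class BudgetProxy:
    """Hypothetical in-process stand-in for the proxy: gate, call, re-tag."""

    def __init__(self, tools: Dict[str, Callable[..., str]]) -> None:
        self.tools = tools  # sink name -> tool implementation

    def call(self, sink: str, *args: Tagged) -> Tagged:
        if not args:
            raise ValueError("at least one tagged argument is required")
        # Gate: every argument must carry this sink in its budget.
        if any(sink not in a.budget for a in args):
            raise PermissionError(f"{sink}: not in the budget of every input")
        result = self.tools[sink](*(a.value for a in args))
        # Re-tag: the output keeps only the sinks common to all inputs.
        budget = frozenset.intersection(*(a.budget for a in args))
        return Tagged(result, budget)


proxy = BudgetProxy({
    "summarise": lambda text: text[:12] + "...",
    "external-email": lambda body, to: f"sent to {to}",
})

doc = Tagged("Q3 revenue fell 12%", frozenset({"file-read", "summarise"}))
recipient = Tagged("partner@example.com", frozenset({"external-email", "log"}))

summary = proxy.call("summarise", doc)  # allowed: summarise is in budget(doc)
try:
    proxy.call("external-email", summary, recipient)
    denied = False
except PermissionError:
    denied = True
print(denied)  # True
```

The recipient alone could reach external-email, but the summary cannot, so the call fails at the gate before the tool runs.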

Why it works

Composition is what permission laundering exploits — per-tool checks each pass while the chained effect is unsafe. Monotonic intersection denies any composition that creates new authority. Because intersection is local, the proxy needs no global plan: each interception is a set-membership check against the value's accumulated budget. The enforcement boundary shifts from "per-tool" to "per-value lifetime," the layer at which laundering occurs. This mirrors CaMeL, which encodes the same logic through a Python interpreter. Across 82 tasks on five frontier models, attack success drops from 25–68% to 0–4.8% while 96–100% of benign tasks still complete (Jiang et al., 2026).

When this backfires

The mechanism degrades or fails outside its operating envelope.

Naive manifests collapse the defense. The blocking rate falls to 27.3% when budgets are authored without security expertise — close to no defense (Jiang et al., 2026). The paper names manifest quality "the dominant deployment bottleneck"; teams without a dedicated security function inherit the naive number by default.
Implicit flows escape. The scope is bounded to "explicit-flow composition safety under trusted manifests and proxy-visible data movement." Causality laundering — exfiltrating data through denial signals — is invisible because no budgeted value changes hands (Causality Laundering, 2026).
Conjunctive emergent capability is missed. Per-value intersection cannot detect two individually safe values combining into an unsafe end state. The cited analysis reports that 42.6% of 900 real multi-tool trajectories contain at least one conjunctive dependency (Safety is Non-Compositional, 2026).
Off-protocol egress bypasses the proxy. Any path that skips the MCP runtime control plane, such as a direct shell curl, embedded SDK calls, or cached state, sits outside the mechanism's reach.
Token budgets compound. Capability metadata on every value adds to the 35× token overhead of MCP over equivalent CLI tools; agents near their context ceiling cannot absorb it without truncating reasoning.

When these conditions fail, a practical alternative is to remove a leg of the lethal trifecta — disable egress, narrow private-data scope, or sandbox untrusted input — which closes the gap deterministically with no manifest authoring debt.
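The conjunctive limitation listed above can be illustrated with a small sketch. The data, budgets, and `join` helper are hypothetical, and this is one plausible reading of "individually safe values combining into an unsafe end state", not an example drawn from the cited work.

```python
# Hypothetical budgets: each list on its own is considered safe to email.
names = {"value": ["Ana", "Ben"], "budget": frozenset({"external-email", "log"})}
salaries = {"value": [91000, 78000], "budget": frozenset({"external-email", "log"})}


def join(a: dict, b: dict) -> dict:
    """Zip two tagged lists; the output budget is the intersection of the inputs."""
    return {
        "value": list(zip(a["value"], b["value"])),
        "budget": a["budget"] & b["budget"],
    }


joined = join(names, salaries)

# Intersection preserves external-email, so a membership gate would allow the send,
# even though the joined table now links each person to a salary.
gate_allows = "external-email" in joined["budget"]
print(gate_allows)  # True
```

Because neither input lacked the sink, intersection has nothing to remove; the sensitivity arises only from the combination, which per-value budgets do not model.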

Relation to CaMeL and the MCP control plane

Monotonic capability attenuation, CaMeL, and the MCP Runtime Control Plane sit at three layers of one architecture:

| Pattern | Enforcement layer | Mechanism | Manifest authoring |
| --- | --- | --- | --- |
| CaMeL | Dual-LLM with custom Python interpreter | Capability labels on values; policies checked at tool-call time | Per-tool policies authored alongside tool definitions |
| Monotonic capability attenuation | Transparent MCP proxy between agent and tools | Per-value capability budgets; intersection across composition | Per-sink budgets; expert-crafted required for 100% blocking |
| MCP Runtime Control Plane | Proxy between agent and tools | Identity, rate-limit, tool-name policy per call | Identity/policy rules per tool surface |

The patterns compose: the control plane gates on identity, capability budgets gate on composition, and CaMeL-style separation handles instruction-vs-data integrity. None alone covers the others' failure modes.
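As a rough sketch of how two of these layers might stack at a single interception point, the function below runs an identity check before a budget check. The `layered_gate` name, the caller identifiers, and the policy shape are assumptions made for illustration; CaMeL-style control/data separation is not modeled here.

```python
from typing import Dict, FrozenSet, Iterable, Set


def layered_gate(
    caller: str,
    sink: str,
    arg_budgets: Iterable[FrozenSet[str]],
    allowed_callers: Dict[str, Set[str]],
) -> str:
    """Return a verdict string; identity is checked first, then value budgets."""
    # Layer 1 (control-plane style): may this caller use this tool at all?
    if caller not in allowed_callers.get(sink, set()):
        return "denied: identity"
    # Layer 2 (capability budgets): may every argument reach this sink?
    if any(sink not in budget for budget in arg_budgets):
        return "denied: budget"
    return "allowed"


policy = {"external-email": {"agent-a"}}
public = frozenset({"external-email", "log"})
confidential = frozenset({"file-read", "summarise"})

print(layered_gate("agent-b", "external-email", [public], policy))                # denied: identity
print(layered_gate("agent-a", "external-email", [public, confidential], policy))  # denied: budget
print(layered_gate("agent-a", "external-email", [public], policy))                # allowed
```

Each layer rejects a case the other would admit: an authorized caller with a confidential argument, or an unauthorized caller with a harmless one.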

Key Takeaways

Monotonic capability attenuation closes the permission-laundering gap by tagging every value with a sink-specific budget and intersecting budgets through composition — authority can only shrink.
Attack success drops from 25–68% to 0–4.8% across 82 tasks on five frontier models when manifests are expert-crafted, with 96–100% benign completion preserved.
Manifest quality is the deciding variable: naive manifests reach 27.3% blocking, expert-crafted manifests reach 100%. Teams without security-engineering capacity should expect the naive number.
The mechanism is bounded to explicit-flow attacks under trusted manifests and proxy-visible data movement. Implicit flows, conjunctive emergent capabilities, and off-protocol egress are not addressed.
A transparent MCP proxy is the deployment vehicle. The 35× token overhead of MCP versus CLI tools compounds when every value carries capability metadata — confirm context-budget margin before adopting.
Pair with lethal-trifecta leg removal and CaMeL control/data separation as a layered posture; no single layer covers the others' failure modes.

CaMeL: Defeating Prompt Injections by Separating Control and Data Flow
MCP Runtime Control Plane: Policy Evaluation Between Agent and Tool
Lethal Trifecta Threat Model
Hybrid Deterministic + Semantic Authorization for Agent Tool Calls
Blast Radius Containment: Least Privilege for AI Agents
Revocable Resource-and-Effect Capabilities for Coding Agents (PORTICO) — Subgoal-scoped revocation through a reference monitor; same envelope of mediated tools + typed catalog