Cloud-Agent Three-Layer State Decoupling
Split a cloud agent's state across three layers — agent loop, machine state, and conversation state — so pods, sessions, and threads each migrate, hibernate, and recover independently.
Three-layer state decoupling keeps a cloud agent's loop, machine state, and conversation state as separate runtime components, each addressable, serialisable, and reassignable, so that no single infrastructure failure costs a user thread, a sandbox, or in-flight reasoning. Cursor's cloud-agent retrospective names the pattern directly: "keep the agent loop, the machine state, and the conversation state as decoupled components" (Cursor, 2026-05-21).
When This Pattern Applies
The split pays off when at least one of the following holds:
- Sessions stretch across infrastructure events. The layered architecture lets a cloud agent "survive blips in inference reliability, pod hibernation and resumption, and runs that stretch across days or even weeks" (Cursor, 2026-05-21).
- Subagents fan out across heterogeneous pods. "An agent might run on one machine, spawn async subagents across several, or start locally then delegate work to the cloud. A subagent might even outlive its parent, or run on a completely different kind of pod" (Cursor, 2026-05-21).
- Pod lifecycle is optimised independently of agent identity. Readonly VMs, prewarmed VMs, and hibernation are only reachable when the loop does not pin to a machine (Cursor, 2026-05-21).
When none hold — short single-machine sessions on stable infrastructure — a coupled alternative ships faster. This is the cloud-agent variant of the more general session-harness-sandbox separation; use that page for the abstract three-primitive theory.
The Three Layers
| Layer | What it holds | Lifetime | Substrate | Failure mode absorbed |
|---|---|---|---|---|
| Agent loop | Model decisions, tool dispatch, retry control flow | Seconds to minutes per task | Durable workflow engine (Cursor on Temporal — "more than 50 million actions per day across more than 7 million unique workflows" (Cursor, 2026-05-21)) | Inference outages, harness deploys |
| Machine state | Sandbox filesystem, processes, env, dev server | Minutes to hours per session | Hibernatable VM, addressable independently of the loop | Spot reclamation, pod restart, region migration |
| Conversation state | Transcript, tool-call records, streamed events | Days to weeks per thread | Append-only storage with retry-aware sequencing | Client disconnects, mid-stream retries, multi-device sessions |
Anthropic's Managed Agents reach the same shape under different names: Session (log), Harness (stateless loop), and Sandbox (execution). Anthropic reports that the decoupling cut p50 time-to-first-token by ~60% and p95 by >90% (Anthropic, 2026). That two independent primary sources converge on the same split suggests it is structural, not vendor-specific.
```mermaid
graph TD
L[Agent Loop<br>durable workflow] -->|tool calls| M[Machine State<br>hibernatable VM]
L -->|append events| C[Conversation State<br>append-only log]
M -->|tool results| L
C -->|replay| L
L2[Different loop instance<br>after deploy] -.->|reattach| C
M2[Different VM<br>after migration] -.->|reattach| L
```
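The claim that each layer is addressable and reassignable on its own can be pictured as a small binding record. This is a minimal Python sketch, not Cursor's or Anthropic's schema: `ThreadBinding`, its field names, and the ID strings are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class ThreadBinding:
    """Pointers from one user thread to its three layers (hypothetical IDs)."""
    conversation_id: str  # append-only log, lives for days to weeks
    workflow_id: str      # durable agent-loop run, one per task
    machine_id: str       # hibernatable VM, lives for one session


def reassign_machine(binding: ThreadBinding, new_machine_id: str) -> ThreadBinding:
    """Point the thread at a different VM; the other two layers are untouched."""
    return replace(binding, machine_id=new_machine_id)


def start_next_task(binding: ThreadBinding, new_workflow_id: str) -> ThreadBinding:
    """Point the thread at a fresh loop run; conversation and VM are untouched."""
    return replace(binding, workflow_id=new_workflow_id)


binding = ThreadBinding("conv-1", "wf-1", "vm-a")
migrated = reassign_machine(binding, "vm-b")

# The binding is plain data, so it serialises and restores without loss.
restored = ThreadBinding(**json.loads(json.dumps(asdict(migrated))))
```

Because the binding holds only identifiers, swapping the VM or starting a new workflow leaves the conversation pointer unchanged.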
Why It Works
The three layers have different failure modes, lifetimes, and churn rates; coupling two forces the union of their constraints onto both. A loop pinned to a VM cannot survive pod loss; a conversation pinned to a VM cannot survive region migration; machine state pinned to a conversation cannot be reused. Cursor's mechanism is operational: "Because the agent loop lives in Temporal rather than on the VM itself, we can manage pod lifecycles independently and run agents across different kinds of pods — including optimizations like readonly VMs or prewarmed VMs" (Cursor, 2026-05-21). Anthropic's architecture reaches the same property via stateless harness replay: "Any Harness instance can pick up any Session and continue from where it left off. This is what makes horizontal scaling trivial" (Anthropic, 2026). The durable layer is authoritative; any compute attached to it is replaceable.
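The replay property quoted above can be illustrated with a pure function over the log. This is a minimal sketch, assuming a hypothetical event shape (`type`, `text`, `call_id`, `name`, `output`) that neither source specifies. Any instance that folds the same events arrives at the same context.

```python
from dataclasses import dataclass, field


@dataclass
class LoopContext:
    messages: list = field(default_factory=list)
    pending_tool_calls: dict = field(default_factory=dict)


def replay(events):
    """Fold an append-only event list into loop context.

    The function keeps no instance-local state, so any harness instance
    that reads the same log reconstructs the same context.
    """
    ctx = LoopContext()
    for ev in events:
        kind = ev["type"]
        if kind in ("user_message", "assistant_message"):
            ctx.messages.append((kind, ev["text"]))
        elif kind == "tool_call":
            ctx.pending_tool_calls[ev["call_id"]] = ev["name"]
        elif kind == "tool_result":
            ctx.pending_tool_calls.pop(ev["call_id"], None)
            ctx.messages.append((kind, ev["output"]))
    return ctx


# Hypothetical log for one thread.
log = [
    {"type": "user_message", "text": "run the tests"},
    {"type": "tool_call", "call_id": "c1", "name": "shell"},
    {"type": "tool_result", "call_id": "c1", "output": "3 passed"},
    {"type": "tool_call", "call_id": "c2", "name": "shell"},
]

harness_a = replay(log)
harness_b = replay(log)  # a different instance, e.g. after a deploy
```

Here `harness_b` sees the same unanswered tool call (`c2`) as `harness_a`, so either instance could continue the task.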
Durability constrains the shape of the loop layer: Cursor moved from "'eternal' agent workflows to multiple shorter ones that exit after completing a single task, which makes version upgrades easier" (Cursor, 2026-05-21). The conversation layer, in turn, must reconcile streamed retries with what the client has already seen. Cursor's append-only storage "accounts for retries, so that if a step of the agent loop fails after streaming partial output and then gets retried, the client can detect this, rewind its stream, and show the new data" (Cursor, 2026-05-21).
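One way to picture the rewind behaviour is a client that tags every streamed chunk with the loop step and the attempt number that produced it. The source does not describe Cursor's wire format, so the `(step, attempt)` sequencing and the `StreamClient` class below are assumptions for illustration only.

```python
class StreamClient:
    """Hypothetical client that renders streamed chunks and handles retries."""

    def __init__(self):
        self.chunks = []    # (step, attempt, text) tuples currently displayed
        self.attempts = {}  # step -> highest attempt number seen so far

    def on_event(self, step, attempt, text):
        seen = self.attempts.get(step, 0)
        if attempt < seen:
            return  # late chunk from a superseded attempt: ignore it
        if attempt > seen:
            # New attempt detected: rewind by dropping what the old attempt
            # streamed for this step. Attempts are numbered from 1, so this
            # branch also records the first attempt of each step.
            self.chunks = [c for c in self.chunks if c[0] != step]
            self.attempts[step] = attempt
        self.chunks.append((step, attempt, text))

    def rendered(self):
        return "".join(text for _, _, text in self.chunks)


client = StreamClient()
client.on_event(1, 1, "Reading files. ")
client.on_event(2, 1, "Patch: foo")      # step 2 fails after partial output
before_retry = client.rendered()
client.on_event(2, 2, "Patch: ")         # retry of step 2 starts streaming
client.on_event(2, 2, "bar")
client.on_event(2, 1, " (stale)")        # late chunk from the failed attempt
after_retry = client.rendered()
```

The log itself stays append-only in this sketch: nothing is deleted on the server side, and the client decides what to display from the attempt numbers.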
When This Backfires
- Short sessions on stable infrastructure. When p99 session length is under a few minutes and pods are rarely reclaimed, the migration and hibernation payoffs never materialise. Durable execution adds 10-50ms per activity dispatch (AgentMarketCap, 2026), which is negligible against LLM latency, but operating the workflow infrastructure is a real ongoing engineering cost.
- Pre-product-market-fit teams. The split commits the product to a fixed operational topology. A coupled prototype ships in weeks, and reworking a split architecture after the UX shifts costs more than rebuilding such a prototype. Multi-agent systems share the risk: most enterprises "risk adding distributed complexity long before they have a problem worth distributing" (InfoWorld, 2026).
- Schema-evolution-heavy products. Replay correctness requires a stable event shape. When events change semantics across harness versions, old logs stop replaying under new code, which is the hazard that Session Harness Sandbox Separation flags.
- Local-first or offline agents. The pattern is justified by cloud failure modes — spot reclamation, region failover, multi-tenant scheduling. Local agents have implicit one-machine pinning that removes the payoff.
- Multi-agent coordination beyond a single thread. A flat conversation log is "fundamentally insufficient" for complex multi-agent coordination (Yan, Medium, 2026-04) — additional memory layers must sit on top when subagent graphs branch beyond parent-child.
Composition
The split is the substrate other cloud-agent patterns sit on:
- Cloud-Agent Session Bootstrap — install/start lifecycle for the machine-state layer
- Prebuilt Agent Environments — bakes machine-state initial conditions into a container image
- Delta Channels — keeps the conversation log linear in storage cost
- Long-Running Agents — durability and resumability primitives the loop layer uses
- Deep Agent Runtime — runtime concerns the three layers expose externally
Example
A team operating a Cursor-style cloud-agent product wants user threads to survive a region-wide pod restart without losing in-flight reasoning or the transcript.
Before — coupled state on a single VM:
```
VM #A
├── agent loop (in-process)
├── sandbox FS + processes
└── conversation buffer (in-memory)
Pod restart -> all three lost. User thread dies; transcript truncates at last flush.
```
After — three layers split across substrates:
```
Temporal workflow: agent loop (durable, restartable)
│
▼
VM #A (hibernatable): machine state (FS + processes)
│
▼
Append-only store: conversation state (streamed to clients, retry-aware)
VM #A reclaimed -> Temporal restarts the loop activity, replays conversation
events for context, and attaches to a prewarmed VM #B.
Client stream rewinds and replays from last acknowledged event.
```
The structural change is that no single layer holds the full agent state — each is independently restorable, and the durable substrates (workflow engine, append-only log) are the source of truth (Cursor, 2026-05-21).
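The recovery sequence in this example can be walked through in a few lines. This is an illustrative sketch only: `recover`, `events_after`, the dict-based thread record, and offset-based acknowledgement are assumptions, not details from Cursor's write-up.

```python
def recover(thread, log, prewarmed_pool):
    """Rebind a thread to a prewarmed VM after its VM is reclaimed.

    Returns the updated thread record and the context rebuilt from the log.
    """
    if not prewarmed_pool:
        raise RuntimeError("no prewarmed VM available")
    new_vm = prewarmed_pool.pop(0)       # take the next prewarmed machine
    context = list(log)                  # loop context comes from the durable log
    rebound = {**thread, "machine_id": new_vm}
    return rebound, context


def events_after(log, last_acked):
    """Events the client still needs: everything past its last acknowledged offset."""
    return log[last_acked + 1:]


# Hypothetical state at the moment VM "vm-a" is reclaimed.
thread = {"conversation_id": "conv-1", "machine_id": "vm-a"}
log = ["e0", "e1", "e2", "e3"]
pool = ["vm-b", "vm-c"]

recovered, context = recover(thread, log, pool)
to_resend = events_after(log, last_acked=1)  # client had acknowledged e0 and e1
```

Nothing in `recover` reads from the lost VM: the conversation comes from the log and the machine comes from the pool, which is what makes the reclaimed VM replaceable.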
Key Takeaways
- Cloud-hosted agents survive infrastructure churn only if the agent loop, machine state, and conversation state are addressable and recoverable independently
- The split is enabled by durable substrates — a workflow engine for the loop and append-only storage for the conversation — that make compute attachments replaceable
- Two independent primary sources (Cursor, Anthropic) converged on the same three-layer architecture with different naming; the structural shape is not vendor-specific
- The pattern is overhead for short single-machine sessions, pre-PMF teams, and schema-churning products; apply it when session length and infrastructure failure rate justify the workflow infrastructure
- Composes downward with session bootstrap, prebuilt environments, and delta-channel checkpointing; the layered split is the substrate other cloud-agent patterns sit on
Related
- Session Harness Sandbox Separation — the general three-primitive theory under a different naming convention
- Deep Agent Runtime — the runtime layer underneath the harness that exposes durable runs, lifecycle, streaming, and versioning
- Cloud-Agent Session Bootstrap — install/start lifecycle for the machine-state layer
- Long-Running Agents — the durability, checkpointing, and resumability primitives across Anthropic, Cursor, and Google designs
- Delta Channels: Bounded Checkpoint Storage — keeps the conversation layer's append-only log linear in storage cost