
MCP Client Design: Building Robust Host-Side Logic

MCP client design is the host-side logic that connects to MCP servers, negotiates capabilities, routes tool calls, caches descriptions, and degrades gracefully on failure.

Host, Client, Server

MCP separates three participants. Each client handles one server connection with its own session state, capabilities, and transport:

| Role | Responsibility |
| --- | --- |
| Host | The AI application; creates and manages client instances |
| Client | One instance per server connection; handles protocol lifecycle |
| Server | Exposes tools, resources, and prompts over MCP |

Connection Lifecycle

Initialization is three steps (MCP spec):

sequenceDiagram
    participant Client
    participant Server
    Client->>Server: initialize (protocol version, capabilities)
    Server-->>Client: capabilities response
    Client->>Server: notifications/initialized
    Note over Client,Server: Session active — tool calls permitted

Client rules: never send initialize as part of a JSON-RPC batch; send no requests other than ping until the server has answered initialize; disconnect if the server's protocol version is unsupported; and use only the capabilities both sides negotiated.
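The handshake above can be sketched as plain JSON-RPC message builders. Only the method names come from the lifecycle described here; the helper names, the client info, and the version strings in SUPPORTED_VERSIONS are illustrative assumptions, not a definitive client.

```typescript
// Hypothetical helpers: only the JSON-RPC method names come from the MCP lifecycle.
type JsonRpcRequest = { jsonrpc: "2.0"; id: number; method: string; params?: unknown };
type JsonRpcNotification = { jsonrpc: "2.0"; method: string; params?: unknown };

// Protocol versions this illustrative client is willing to speak (assumed values).
const SUPPORTED_VERSIONS: string[] = ["2025-03-26", "2024-11-05"];

// Step 1: the client proposes its preferred version and declares its capabilities.
function buildInitialize(id: number): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion: SUPPORTED_VERSIONS[0],
      capabilities: {},
      clientInfo: { name: "example-host", version: "0.1.0" },
    },
  };
}

// Step 2: accept the server's answer only if the client supports that version;
// otherwise the caller should disconnect.
function acceptServerVersion(serverVersion: string): boolean {
  return SUPPORTED_VERSIONS.includes(serverVersion);
}

// Step 3: a notification (no id) telling the server the session is ready.
function buildInitialized(): JsonRpcNotification {
  return { jsonrpc: "2.0", method: "notifications/initialized" };
}
```

Keeping the builders free of transport code means the same three steps can be driven over stdio or Streamable HTTP.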

Shutdown

Shutdown differs by transport:

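| Transport | Shutdown sequence |
| --- | --- |
| stdio | Close stdin and wait for the server to exit; if it does not exit within a reasonable time, send SIGTERM, then SIGKILL as a last resort |
| Streamable HTTP | Send HTTP DELETE with the session ID to end the session explicitly; closing the HTTP connection also signals termination |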

Multi-Server Tool Routing

MCP defines no collision resolution. When two servers expose the same tool name, the host must pick a strategy:

  • Namespace by server ID. Maintain a serverId -> tools[] map; route tools/call to the owning client.
  • Priority ordering. Assign precedence; higher-priority server wins on collision.
  • User disambiguation. Ask the user. Interactive sessions only.
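The first two strategies can be sketched as follows. The "serverId__toolName" naming scheme, the SEPARATOR constant, and both function names are assumptions made for illustration; MCP does not define them.

```typescript
// Hypothetical registry: qualified names look like "serverId__toolName".
interface ToolRef {
  serverId: string;
  toolName: string;
}

const SEPARATOR = "__"; // assumed convention, not defined by MCP

// Namespace by server ID: every tool gets a unique qualified name.
function buildRegistry(toolsByServer: Map<string, string[]>): Map<string, ToolRef> {
  const registry = new Map<string, ToolRef>();
  for (const [serverId, tools] of toolsByServer) {
    for (const toolName of tools) {
      registry.set(`${serverId}${SEPARATOR}${toolName}`, { serverId, toolName });
    }
  }
  return registry;
}

// Priority ordering for bare names: the first server in `priority` that has the tool wins.
function resolveBareName(
  toolName: string,
  toolsByServer: Map<string, string[]>,
  priority: string[],
): ToolRef | undefined {
  for (const serverId of priority) {
    const tools = toolsByServer.get(serverId);
    if (tools !== undefined && tools.includes(toolName)) {
      return { serverId, toolName };
    }
  }
  return undefined;
}
```

With servers "git" and "docs" both exposing "search", the registry holds "git__search" and "docs__search" as separate entries, while resolveBareName("search", ...) returns whichever server appears first in the priority list.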

Tool Description Caching

Two approaches cut tools/list latency and tokens:

  • Static caching. Cache tools/list locally; re-fetch on notifications/tools/list_changed, TTL expiry, or explicit refresh.
  • Dynamic discovery. Expose a search interface; the agent fetches schemas only for matched tools at call time — Anthropic's Tool Search Tool reports ~85% token reduction versus loading all definitions upfront (advanced tool use).
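A minimal sketch of the dynamic-discovery idea, assuming the host keeps a lightweight index of tool names and descriptions and loads full schemas only for matches. The searchTools function and its naive keyword scoring are hypothetical; this is not how Anthropic's Tool Search Tool is implemented.

```typescript
interface ToolSummary {
  name: string;
  description: string;
}

// Naive keyword scoring: counts how many query terms appear in the name or description.
// A production system might use BM25 or embeddings instead.
function searchTools(index: ToolSummary[], query: string, limit = 3): string[] {
  const terms = query.toLowerCase().split(/\s+/).filter((term) => term.length > 0);
  return index
    .map((tool) => {
      const haystack = `${tool.name} ${tool.description}`.toLowerCase();
      const score = terms.filter((term) => haystack.includes(term)).length;
      return { name: tool.name, score };
    })
    .filter((hit) => hit.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((hit) => hit.name);
}
```

Only the names returned here would have their full schemas placed in the model's context, which is where the token saving comes from.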

Timeout and Cancellation

On timeout: send notifications/cancelled with the request ID, stop waiting, log. Progress notifications MAY reset the clock; enforce a hard maximum to prevent stalling.
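The timeout policy above can be sketched with an injectable clock so it stays easy to test. The RequestDeadline class and buildCancelled helper are hypothetical names; the notification method and its requestId parameter come from the text.

```typescript
// Builds the cancellation notification for a timed-out request.
function buildCancelled(requestId: number | string, reason: string) {
  return {
    jsonrpc: "2.0" as const,
    method: "notifications/cancelled",
    params: { requestId, reason },
  };
}

// Tracks an idle timeout that progress can extend, capped by a hard maximum.
class RequestDeadline {
  private readonly startedAt: number;
  private readonly idleMs: number;
  private readonly hardMaxMs: number;
  private lastActivity: number;

  constructor(startedAt: number, idleMs: number, hardMaxMs: number) {
    this.startedAt = startedAt;
    this.idleMs = idleMs;
    this.hardMaxMs = hardMaxMs;
    this.lastActivity = startedAt;
  }

  // Call when a progress notification arrives for this request.
  onProgress(now: number): void {
    this.lastActivity = now;
  }

  // Expired when idle for too long, or when the hard maximum has elapsed
  // regardless of how much progress was reported.
  isExpired(now: number): boolean {
    const idleExpired = now - this.lastActivity >= this.idleMs;
    const hardExpired = now - this.startedAt >= this.hardMaxMs;
    return idleExpired || hardExpired;
  }
}
```

A caller would poll isExpired(Date.now()) and, once it returns true, send buildCancelled(id, "timeout"), stop waiting for the response, and log the event.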

Health checks

Either side can send ping to verify liveness. Multiple failures trigger reconnection or session reset. Keep ping frequency configurable — aggressive pinging wastes resources.
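One way to turn "multiple failures" into a concrete rule is a consecutive-failure counter. The LivenessTracker class and its threshold are assumptions for illustration; the spec does not prescribe a number.

```typescript
// Counts consecutive ping failures; any success resets the count.
class LivenessTracker {
  private readonly maxFailures: number;
  private consecutiveFailures = 0;

  constructor(maxFailures: number) {
    this.maxFailures = maxFailures;
  }

  // Record one ping outcome; returns true when the caller should reconnect.
  record(pingSucceeded: boolean): boolean {
    this.consecutiveFailures = pingSucceeded ? 0 : this.consecutiveFailures + 1;
    return this.consecutiveFailures >= this.maxFailures;
  }
}
```

Passing the threshold in the constructor keeps the sensitivity configurable alongside the ping interval.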

Streamable HTTP Session Management

For remote servers using Streamable HTTP:

  • Servers MAY assign Mcp-Session-Id at init; clients MUST include it on later requests
  • On 404 for a known session, the client MUST reinitialize — the session expired or was invalidated
  • SSE event IDs and Last-Event-ID enable resumability after disconnects, preventing message loss
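The three rules above reduce to a header builder and a response classifier. The header names come from the text; the function names and the SessionAction type are hypothetical.

```typescript
type SessionAction = "proceed" | "reinitialize" | "fail";

// Session-related headers for a follow-up request.
// Last-Event-ID is included only when resuming an interrupted SSE stream.
function sessionHeaders(sessionId?: string, lastEventId?: string): Record<string, string> {
  const headers: Record<string, string> = {};
  if (sessionId) headers["Mcp-Session-Id"] = sessionId;
  if (lastEventId) headers["Last-Event-ID"] = lastEventId;
  return headers;
}

// Decide what to do with a response status for a request sent under a session.
function classifyResponse(status: number, sessionId?: string): SessionAction {
  if (status >= 200 && status < 300) return "proceed";
  // A 404 for a known session means it expired or was invalidated:
  // start a fresh initialize without the old session ID.
  if (status === 404 && sessionId) return "reinitialize";
  return "fail";
}
```

On "reinitialize" the host would discard the stored session ID and any cached capabilities before running the handshake again.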

Security

Tool description integrity

A server can change descriptions post-approval without re-consent — a "rug pull" attack. Defenses:

  • Version-pin descriptions. Hash the manifest at approval; flag post-approval changes.
  • Treat descriptions as untrusted. Poisoned descriptions can manipulate reasoning to exfiltrate data or trigger unintended actions.
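Version pinning can be sketched as a hash over a canonical serialization of the tool manifest. The ToolManifestEntry shape and all function names are assumptions, and the key-sorting canonicalization is one simple choice among several.

```typescript
import { createHash } from "node:crypto";

interface ToolManifestEntry {
  name: string;
  description: string;
  inputSchema: unknown;
}

// Serialize with sorted object keys so equivalent manifests hash identically.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalize(obj[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value) ?? "null";
}

// SHA-256 over the manifest, sorted by tool name so reordering alone is not a change.
function hashManifest(tools: ToolManifestEntry[]): string {
  const sorted = [...tools].sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  return createHash("sha256").update(canonicalize(sorted)).digest("hex");
}

// Compare the current manifest against the hash stored at approval time.
function hasChangedSinceApproval(approvedHash: string, current: ToolManifestEntry[]): boolean {
  return hashManifest(current) !== approvedHash;
}
```

A host would store the hash when the user approves a server, recompute it after every tools/list response, and ask for re-consent when hasChangedSinceApproval returns true.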

Authorization

OAuth 2.1 for remote servers: PKCE with S256, Dynamic Client Registration (RFC 7591), and Resource Indicators (RFC 8707) against confused-deputy attacks.
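The PKCE S256 step is small enough to show in full. The transformation follows RFC 7636; the function names and the parameter set in authorizationParams are an illustrative sketch, and the token exchange and Dynamic Client Registration are left out.

```typescript
import { createHash, randomBytes } from "node:crypto";

// 32 random bytes encode to a 43-character base64url verifier (RFC 7636 minimum length).
function generateVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// S256: challenge = BASE64URL(SHA256(ASCII(verifier))), without padding.
function challengeFromVerifier(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}

// Query parameters for the authorization request, including the RFC 8707 resource indicator.
function authorizationParams(
  clientId: string,
  redirectUri: string,
  resource: string,
  challenge: string,
): URLSearchParams {
  return new URLSearchParams({
    response_type: "code",
    client_id: clientId,
    redirect_uri: redirectUri,
    code_challenge: challenge,
    code_challenge_method: "S256",
    resource,
  });
}
```

The verifier stays on the client and is sent only with the later token request; the resource parameter binds the token to one MCP server, which is what blocks confused-deputy reuse.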

Defense layers

| Layer | What it protects | Implementation |
| --- | --- | --- |
| Sandboxing | Host system | Container/VM isolation, network egress default-deny |
| Authorization | Server identity | OAuth 2.1, per-client consent, resource indicators |
| Tool integrity | Model reasoning | Description hashing, version pinning, change detection |
| Monitoring | Operational safety | Audit trails, behavioral baselines, anomaly detection |

Local server hardening

Local Streamable HTTP servers must validate Origin, bind to localhost, and require auth — preventing DNS rebinding.
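The Origin check can be sketched as an allowlist comparison. The function name is hypothetical, and accepting requests without an Origin header is a policy assumption made here (non-browser clients usually omit it); a stricter server could reject them.

```typescript
// Returns true when a request's Origin header is acceptable for a local server.
function isAllowedOrigin(originHeader: string | undefined, allowed: Set<string>): boolean {
  // Assumed policy: non-browser clients send no Origin header, so absence is accepted.
  if (originHeader === undefined) return true;
  try {
    // Normalize through URL so the comparison uses scheme, host, and port only.
    return allowed.has(new URL(originHeader).origin);
  } catch {
    // Unparseable values, including the literal "null" origin, are rejected.
    return false;
  }
}
```

A page served from an attacker's domain that rebinds its DNS name to 127.0.0.1 still sends its own Origin, so the allowlist rejects it even though the connection arrives on localhost.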

Observability

| Metric | Why it matters |
| --- | --- |
| Session init success/failure rate | Flaky connections surface as tool call failures |
| tools/list latency | Slow discovery delays agent startup |
| call_tool latency (avg, p95) | Identifies slow or degraded servers |
| Error rate per tool and server | Surfaces reliability issues per integration |
| Tool registry size (token count) | Tracks context window cost of tool descriptions |
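For the latency rows, a small sketch of how avg and p95 might be computed from recorded samples. The function names are hypothetical and the percentile uses the nearest-rank method, one of several common definitions.

```typescript
// Arithmetic mean of latency samples in milliseconds.
function average(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  return samples.reduce((sum, value) => sum + value, 0) / samples.length;
}

// Nearest-rank percentile: the smallest sample with at least p percent of values at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}
```

Tracking p95 next to the average matters because a server that is usually fast but occasionally stalls looks healthy on the mean alone.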

When This Backfires

Caching can serve stale tool schemas. Static TTL caching fails against servers that update their tools frequently. If a required parameter changes between refreshes, the agent issues malformed calls. Short TTLs or notifications/tools/list_changed reduce the risk but increase tools/list traffic.

Tool list churn invalidates provider prompt caches. Providers key prompt caching on the tool list; mid-session changes raise per-turn costs. Avoid designs that shift the visible tool set between turns.

Full routing stack overhead on single-server agents. Namespace maps (the serverId -> tools[] table), priority ordering, and per-server lifecycle yield no benefit with one server. Apply multi-server routing only when collision risk is real.

OAuth 2.1 PKCE assumes a capable HTTP client. CLI or embedded agents may lack the browser or system capabilities the flow expects.

Key Takeaways

  • One client per server connection — each owns its lifecycle, capabilities, and transport state.
  • Cache tools/list and refresh on notifications/tools/list_changed or TTL; treat every description as untrusted.
  • Namespace tool names by server ID; apply priority routing only when collisions are real.
  • Use OAuth 2.1 with PKCE S256, Dynamic Client Registration, and Resource Indicators for remote servers.
  • Track init success, tools/list latency, per-tool error rates, and registry token count; these expose degradation before users notice it.

Example

A TypeScript host with namespace routing and cached tool lists:

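// Minimal stand-in types so the sketch type-checks; a real host would use its MCP SDK's types.
interface ToolDefinition {
  name: string;
  description?: string;
}

interface McpClient {
  listTools(): Promise<{ tools: ToolDefinition[] }>;
}

interface ServerSession {
  id: string;
  client: McpClient;
  tools: Map<string, ToolDefinition>;
  lastToolsFetch: number;
}

class McpHost {
  private sessions: Map<string, ServerSession> = new Map();

  addSession(session: ServerSession): void {
    this.sessions.set(session.id, session);
  }

  // Scan sessions in insertion order: the first server that lists the tool wins on collision.
  routeToolCall(toolName: string): ServerSession {
    for (const session of this.sessions.values()) {
      if (session.tools.has(toolName)) return session;
    }
    throw new Error(`No server provides tool: ${toolName}`);
  }

  // Re-fetch the tool list once the cached copy is older than ttlMs.
  async refreshToolsIfStale(session: ServerSession, ttlMs = 300_000): Promise<void> {
    if (Date.now() - session.lastToolsFetch > ttlMs) {
      const response = await session.client.listTools();
      // The explicit tuple type lets the Map constructor infer its key and value types.
      session.tools = new Map(
        response.tools.map((t): [string, ToolDefinition] => [t.name, t]),
      );
      session.lastToolsFetch = Date.now();
    }
  }

  // Call on notifications/tools/list_changed to force a re-fetch at the next refresh.
  markToolsStale(session: ServerSession): void {
    session.lastToolsFetch = 0;
  }
}

One ServerSession per MCP server; tool calls route to the first session whose cached list contains the tool; lists refresh on TTL expiry or after markToolsStale is called for a list_changed notification.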
