
LLM Map-Reduce Pattern

Map-reduce splits a large input into context-window-sized chunks, processes each independently (map), then combines the chunk results into one coherent output (reduce).

Also known as

Chunk-Process-Merge, Parallel Summarization, Input-Partitioned Fan-Out. For the task-level delegation variant, see Orchestrator-Worker. For same-task parallel diversity, see Fan-Out Synthesis. For implementation, see Sub-Agents Fan-Out.

How It Differs from Adjacent Patterns

| Pattern | Splits by | Map phase | Reduce phase |
| --- | --- | --- | --- |
| Map-Reduce | Input partition (pages, files, modules) | Same operation on each chunk | Merge chunk results into unified output |
| Orchestrator-Worker | Subtask type (research, test, implement) | Different operation per worker | Synthesize heterogeneous outputs |
| Fan-Out Synthesis | Nothing (same input N times) | Same task, independent attempts | Select/merge best-of-N |

Map-reduce is data-parallel: the same operation applied to different slices of input.

Structure

graph TD
    A[Large Input] --> B[Decompose]
    B --> C1[Chunk 1]
    B --> C2[Chunk 2]
    B --> C3[Chunk N]
    C1 --> M1[Map Agent 1]
    C2 --> M2[Map Agent 2]
    C3 --> M3[Map Agent N]
    M1 -->|Result 1| R[Reduce Agent]
    M2 -->|Result 2| R
    M3 -->|Result N| R
    R --> O[Final Output]
  1. Decompose — split input into chunks sized for individual context windows
  2. Map — process each chunk independently with the same instructions
  3. Reduce — combine chunk-level results into a single output
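
The three steps above can be sketched as a small driver. This is a minimal illustration, not a reference implementation: `map_reduce`, `split_paragraphs`, `count_words`, and `sum_counts` are hypothetical names, and the word-counting functions stand in for what would be model calls in a real pipeline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def map_reduce(
    document: str,
    decompose: Callable[[str], List[str]],
    map_fn: Callable[[str], str],
    reduce_fn: Callable[[List[str]], str],
    max_workers: int = 4,
) -> str:
    """Decompose the input, map every chunk in parallel, then reduce."""
    chunks = decompose(document)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so result i belongs to chunk i.
        results = list(pool.map(map_fn, chunks))
    return reduce_fn(results)


# Stand-ins for model calls (assumptions for illustration only).
def split_paragraphs(document: str) -> List[str]:
    return [p for p in document.split("\n\n") if p.strip()]


def count_words(chunk: str) -> str:
    return str(len(chunk.split()))


def sum_counts(results: List[str]) -> str:
    return str(sum(int(r) for r in results))
```

Calling `map_reduce("a b\n\nc d e\n\nf", split_paragraphs, count_words, sum_counts)` runs the same operation on each of the three paragraphs and merges the per-chunk counts into one total.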

Context Window Arithmetic

Each map agent's context must hold:

instructions + input_chunk + output_budget <= context_window_limit

| Component | Typical allocation |
| --- | --- |
| System prompt + instructions | 1,000–3,000 tokens |
| Input chunk | 60–75% of remaining budget |
| Output headroom | 25–40% of remaining budget |

Err toward smaller chunks. Anthropic frames context degradation ("context rot") as "a performance gradient rather than a hard cliff" (Anthropic: effective context engineering). Its onset tracks an absolute token threshold (~32K–100K), not a fixed percentage of the window, so size chunks below that onset for the task type rather than to a fill ratio.
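
As a worked version of the arithmetic, the sketch below splits the budget left after instructions and then caps the input chunk at a degradation onset. The function name `chunk_budget` and its parameters are hypothetical, and the onset value is something you would choose per task type, not a constant supplied by any library.

```python
from typing import Optional, Tuple


def chunk_budget(
    context_limit: int,
    instruction_tokens: int,
    input_percent: int = 70,
    onset_cap: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (input_chunk_tokens, output_tokens) for one map agent.

    input_percent is the share of the post-instruction budget given to the
    input chunk; the rest is output headroom. onset_cap, if set, limits the
    chunk to an absolute token count regardless of window size.
    """
    remaining = context_limit - instruction_tokens
    if remaining <= 0:
        raise ValueError("instructions alone exceed the context window")
    # Integer arithmetic avoids floating-point rounding surprises.
    input_tokens = remaining * input_percent // 100
    output_tokens = remaining - input_tokens
    if onset_cap is not None:
        input_tokens = min(input_tokens, onset_cap)
    return input_tokens, output_tokens
```

For a 200,000-token window with 2,000 tokens of instructions, a 70% share allows 138,600 input tokens by fill ratio alone, but an assumed onset of 32,000 tokens caps the chunk at 32,000.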

Decomposition Strategies

| Strategy | Works for | Risk |
| --- | --- | --- |
| Fixed-size (every N tokens/lines) | Logs, homogeneous data | Splits semantic units mid-thought |
| Boundary-aware (by file, section, chapter) | Codebases, documents | Uneven chunk sizes |
| Overlap (sliding window with N-token overlap) | Narrative text | Duplicate findings in reduce |
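
The strategies above can be sketched with two small helpers. Both are illustrative assumptions: `sliding_chunks` covers fixed-size chunking (overlap of 0) and the overlap variant, and `pack_sections` is one possible boundary-aware approach that groups whole sections under a budget. Whitespace-separated words stand in for real tokenizer counts.

```python
from typing import Callable, List, Sequence


def sliding_chunks(tokens: Sequence, size: int, overlap: int = 0) -> List[Sequence]:
    """Fixed-size windows; each window repeats the last `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


def pack_sections(
    sections: Sequence[str],
    budget: int,
    count: Callable[[str], int] = lambda s: len(s.split()),
) -> List[List[str]]:
    """Group whole sections into chunks without splitting any section.

    A section larger than the budget still gets its own chunk, which is
    the "uneven chunk sizes" risk noted in the table.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for section in sections:
        n = count(section)
        if current and used + n > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(section)
        used += n
    if current:
        chunks.append(current)
    return chunks
```

With overlap, adjacent windows share tokens, so a finding located in the shared region can be reported by two map agents; the reduce step needs to deduplicate.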

Reduce Strategies

| Reduce strategy | When to use | Method |
| --- | --- | --- |
| Single-pass | Few chunks (3–8) | All map results fit in one reduce context |
| Hierarchical | Many chunks (10+) | Reduce in groups, then reduce the reductions |
| Merge | Structured outputs (lists, tables) | Deterministic concatenation + deduplication |
| Vote/filter | Classification tasks | Majority vote or threshold across chunks |
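
The merge and vote/filter rows need no model call at all. The sketch below shows one way to write them; `merge_dedup` and `majority_vote` are hypothetical helpers, and the case-insensitive deduplication key and the 0.5 threshold are assumptions you would tune to your output format.

```python
from collections import Counter
from typing import Iterable, List, Optional, Sequence


def merge_dedup(result_lists: Iterable[Sequence[str]]) -> List[str]:
    """Concatenate per-chunk lists, keeping the first spelling of each item."""
    seen = set()
    merged: List[str] = []
    for results in result_lists:
        for item in results:
            key = item.strip().lower()  # assumed normalization rule
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged


def majority_vote(labels: Sequence[str], threshold: float = 0.5) -> Optional[str]:
    """Return the top label if its share exceeds threshold, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > threshold else None
```

Returning `None` when no label clears the threshold makes an inconclusive vote explicit, so the caller can escalate to a model-based reduce.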

Hierarchical reduce groups map results into intermediate reduce nodes, compressing outputs at each level until a single final reduce agent synthesizes the tree.

graph TD
    M1[Map 1] --> R1[Reduce A]
    M2[Map 2] --> R1
    M3[Map 3] --> R1
    M4[Map 4] --> R2[Reduce B]
    M5[Map 5] --> R2
    M6[Map 6] --> R2
    R1 --> RF[Final Reduce]
    R2 --> RF
    RF --> O[Output]
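
The tree in the diagram can be sketched as a loop that reduces in groups until one result remains. `hierarchical_reduce` and `group_size` are hypothetical names, and `reduce_fn` stands in for a reduce-agent call.

```python
from typing import Callable, List, Sequence


def hierarchical_reduce(
    results: Sequence[str],
    reduce_fn: Callable[[List[str]], str],
    group_size: int = 3,
) -> str:
    """Reduce map results level by level until a single output remains."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2 to make progress")
    if not results:
        raise ValueError("nothing to reduce")
    level = list(results)
    while len(level) > 1:
        # Each pass is one level of the tree: groups become intermediate nodes.
        level = [
            reduce_fn(level[i:i + group_size])
            for i in range(0, len(level), group_size)
        ]
    return level[0]
```

With six map results and `group_size=3`, the first pass produces two intermediate reductions and the second pass produces the final output, matching the diagram. A single input is returned as-is without calling `reduce_fn`.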

Implementation with Sub-Agents

Claude Code sub-agents are a native map-reduce primitive. Each sub-agent runs in its own context window, explores independently, and returns condensed results — tens of thousands of tokens internally compressed to 1,000–2,000 tokens returned to the lead.

# .claude/agents/chunk-analyzer.md
---
name: chunk-analyzer
description: Analyzes a single code module and returns a structured summary.
tools:
  - Read
  - Glob
  - Grep
model: haiku
---

Analyze the module at the provided path. Return:
- Purpose (one sentence)
- Public API surface (function signatures only)
- Dependencies (external packages)
- Potential issues found

Do not return source code. Return only the structured summary.

The lead agent decomposes by module, fans out a chunk-analyzer per module, and reduces summaries into a unified review.

For file-system isolation during map phases that write files, use isolation: worktree to give each sub-agent its own git worktree.

When to Use Map-Reduce vs. Alternatives

Use map-reduce when: input exceeds a single context window; the same operation applies to every partition; chunk-level results are independently meaningful.

Use orchestrator-worker instead when: subtasks require different operations or tool sets; decomposition is by task type, not input partition.

Use sequential processing when: chunks depend on prior results; total input fits one context window; or the task is a long-running agent workflow of the kind Anthropic’s long-running agent guide addresses.

When This Backfires

Map-reduce underperforms or fails in several conditions:

  • Cross-chunk dependencies: when answering requires evidence spread across chunks (e.g., a refactor changing an interface used in 10 files), each map agent sees only its slice and cannot surface the cross-chunk pattern. Outputs look clean individually but miss the systemic issue.
  • Boundary mismatch: fixed-size chunking splits semantic units mid-sentence or mid-function, causing map agents to misinterpret partial context. The reduce agent then reconciles contradictory findings without knowing they are artifacts of the split.
  • Hierarchical reduce error propagation: each reduce level loses information. A two-level hierarchy that reduces 50 map outputs to 5 summaries and then to 1 final output compounds extraction errors at every stage. The result reads as coherent but can be wrong in ways that are invisible without the raw inputs.
  • Thin map outputs: when chunks are small or homogeneous, each map result adds only marginal information. The reduce agent processes N near-identical outputs, so cost scales linearly while output quality plateaus.

Failure Handling

Partial map failures are the norm at scale:

  • Retry failed chunks independently without re-running successful ones
  • Degrade gracefully — 8/10 successful map results beats no result
  • Cap parallelism to rate limits; Anthropic’s research system uses 3–5 subagents, not 50
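
The three rules above can be sketched in one helper: a capped thread pool, per-chunk retries, and a return value that keeps partial results. `map_with_retries` is a hypothetical name; production code would likely add backoff between rounds and distinguish retryable from permanent errors.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence, Tuple


def map_with_retries(
    chunks: Sequence[str],
    map_fn: Callable[[str], str],
    max_workers: int = 4,
    max_attempts: int = 3,
) -> Tuple[Dict[int, str], List[int]]:
    """Return (results keyed by chunk index, indices that never succeeded)."""
    results: Dict[int, str] = {}
    pending = list(range(len(chunks)))
    for _ in range(max_attempts):
        if not pending:
            break
        # max_workers caps parallelism; only still-pending chunks are re-run.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {i: pool.submit(map_fn, chunks[i]) for i in pending}
        # Leaving the with-block waits for every submitted call to finish.
        still_failed = []
        for i, future in futures.items():
            if future.exception() is None:
                results[i] = future.result()
            else:
                still_failed.append(i)
        pending = still_failed
    return results, pending
```

The caller can pass whatever succeeded to the reduce step and report the failed indices, so 8 of 10 results still produce an output.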

Example: Codebase Architecture Review

A 200-file codebase with 15 modules, each too large for casual review in a single context:

  1. Decompose: split by module boundary (15 chunks)
  2. Map: each sub-agent reads one module and returns purpose + API surface + issues (15 parallel calls)
  3. Reduce: the lead agent receives 15 summaries (~20,000 tokens total) and identifies cross-module patterns, dependency issues, and architectural concerns
  4. Output: a unified architecture review of the full codebase, built from summaries of source that no single context window could have held in raw form

Key Takeaways

  • Map-reduce splits by input partition — same operation on different data slices
  • Size chunks conservatively: keep each below the task's degradation onset (an absolute token threshold, not a fixed fill ratio) rather than near the window limit
  • Choose reduce strategy (single-pass, hierarchical, merge, vote) by chunk count and output type
  • Design for partial failure — retry individual chunks, degrade gracefully on incomplete results