Concept Map
All site content grouped by theme, cutting across sections. Find related concepts regardless of where they live in the navigation.
This map cuts across the directory tree to reflect the practitioner-resource spine: patterns, anti-patterns, paper summaries, and tool features sit together when they address the same problem. See the Tags index for topic-tag entry points, or STANDARDS.md for the inclusion criteria.
- Context Engineering
  Budgets, compression, attention, and retrieval
- Instructions & Prompts
  Layering, scoping, polarity, and instruction files
- Agent Architecture
  Composition, delegation, coordination, and memory
- Tool Design
  Descriptions, discovery, filtering, and engineering
- Security & Safety
  Injection defense, sandboxing, and credentials
- Testing & Evaluation
  Evals, metrics, code review, and verification
- Hooks & Guardrails
  Lifecycle events and deterministic enforcement
- Reliability & Recovery
  Error handling, cost control, and loop detection
- Standards & Protocols
  AGENTS.md, MCP, A2A, and interop conventions
- Development Workflows
  Planning, bootstrapping, CI/CD, and iteration
- Human Factors
  Cognitive load, attention management, and adoption
- Anti-Patterns
  Common mistakes with context, prompts, and trust
- Emerging Concepts
  Product-as-IDE, personalization, and early signals
- Discoverability & GEO
  Generative-engine optimisation, llms.txt, and topical authority: making content reachable by AI answer engines.
- Frameworks
  Multi-page framework syntheses: end-to-end agent operating models and migration playbooks.
- Learning & Training
  Structured curriculum modules and hands-on tracks for agent-assisted development.
- Tool-Specific Guides
  Tool-specific features and workflows for Claude Code, GitHub Copilot, Cursor, and other assistants.
Context Engineering
Managing what enters the context window: budgets, compression, attention, and retrieval.
- Attention Sinks: Why First Tokens Always Win
- Context Budget Allocation: Every Token Has a Cost
- Context Compression Strategies
- Context Engineering: The Discipline of Designing Agent Context
- Context Priming
- Context Window Management: The Dumb Zone
- Discoverable vs Non-Discoverable Context
- Layered Context Architecture
- Lost in the Middle: The U-Shaped Attention Curve
- Separation of Knowledge and Execution
- Three Knowledge Tiers: Sourced, Unverified, Hallucinated
- Context Hub: On-Demand Versioned API Docs
- Filter and Aggregate in the Execution Environment
- Observation Masking: Filter Tool Outputs from Context
- Prompt Compression: Maximizing Signal Per Token
- Retrieval-Augmented Agent Workflows
- Seeding Agent Context: Breadcrumbs in Code
- Semantic Context Loading
- Static Content First to Maximize Cache Hits
- AOCI: Symbolic-Semantic Repository Indexing
- Chunking Strategy for RAG-Based Code Completion
- Codebase-Derived Pattern Libraries as Agent Context
- Component-Wise RAG Prioritization for Software Engineering Tasks
- Context Window Anxiety: Countering Premature Task Closure
- Context-Window Diagnostic Tooling: Identifying Context-Heavy Tools
- Cross-Lingual Prompt Preprocessing (Local-LLM Token Arbitrage)
- Disable Attribution Headers to Preserve KV Cache in Local Inference
- Elastic Context Orchestration: A Per-Turn Vocabulary for Long-Horizon Search Agents
- Environment Specification as Context: Closing the Version Gap
- Error Preservation in Context for AI Agent Development
- Evolving Playbooks: Incremental Context That Preserves Knowledge
- Exclude Dynamic System Prompt Sections for Cross-Machine Cache Sharing
- Extended Prompt Cache TTL for Long Agent Sessions
- Frontmost-Window Snapshot as Agent Context
- Goal Recitation: Countering Drift in Long Sessions
- Instruction-Guided Code Completion: Controlling What Models Generate
- Live Browser as Agent Context Channel
- LLM-Driven Logical Retrieval: Boolean Queries over an Inverted Index
- Manual Compaction Strategy for Dumb Zone Mitigation
- Narrative Problem Reformulation for Code Generation
- PEEK: Orientation Cache for Recurring-Context Agents
- Phase-Specific Context Assembly for AI Agent Development
- Prompt Cache Economics: Comparing Costs by Provider
- Repository Map Pattern: AST + PageRank for Dynamic Code
- Repository-Level Retrieval for Code Generation
- Schema-Guided Graph Retrieval
- Selective Rewind Summarization: Compress Earlier Turns, Keep Recent Ones Intact
- Semantic Density Optimization for Agent Codebases
- Source Code Minification for State-in-Context Agents
- Stale Repository Retrieval Induces Incorrect Code
- Structured Domain Retrieval: Knowledge Graphs and Case-Based Reasoning
- Token-Efficient Code Generation: Structural Beats Prompting
- Tokenizer Swap Tax: Budgeting for Model Migrations That Change Token Counts
- Turn-Level Context Decisions for AI Coding Sessions
- Validating Token-Optimized Formats Inside Agentic Loops
Instructions & Prompts
Writing effective instructions: layering, scoping, polarity, and instruction file conventions.
- AGENTS.md as Table of Contents, Not Encyclopedia
- CLAUDE.md Convention
- Convention Over Configuration for Agent Workflows
- Dynamic System Prompt Composition
- The Instruction Compliance Ceiling
- Project Instruction File Ecosystem
- Layer Instructions by Specificity
- Negative Space Instructions: What NOT to Do
- Pre-Completion Checklists
- Prompt Chaining
- Prompt Layering: How Instructions Stack and Override
- Session Initialization Ritual
- The Specification as Prompt
- Standards as Agent Instructions
- Controlling Agent Output
- Critical Instruction Repetition
- Domain-Specific System Prompts
- Example-Driven vs Rule-Driven Instructions
- Hierarchical CLAUDE.md
- Instruction Polarity: Positive Rules Over Negative
- Narrow Task Instructions to Reduce Injection
- Agent Memory Patterns: Persistent, Scoped Corrections
- System Prompt Altitude
- System Prompt Replacement for Domain-Specific Personas
- @import Composition Pattern for Agent Instruction Files
- Acknowledged-Debt Ledger with Next-Trigger Conditions
- AGENTS.md Design Patterns for Effective Agent Files
- claudeMdExcludes: Selective Ancestor Instruction-File Exclusion
- Configuration File Structure Does Not Drive Compliance
- Constraint Degradation in AI Code Generation
- Constraint Encoding Does Not Fix Constraint Compliance
- Content Exclusion Gap: AI Security Boundaries by Mode
- Contractual Skill Files: Inspectable SKILL.md for Enterprise Agents
- Cost-Aware Skill Rewriting: Preserve Operational Anchors, Not Skill Tokens
- Empirical Baseline: Agentic AI Coding Tool Configuration
- Encoding Values in AGENTS.md: Why Prose Without Verification Fails
- Enforcing Agent Behavior with Hooks
- Evaluating AGENTS.md: When Context Files Hurt More Than Help
- Frozen Spec File: Preserving Intent in AI Agent Sessions
- Getting Started: Setting Up Your Instruction File
- GROUNDING.md: Field-Scoped Hard Constraints and Convention Parameters
- Guardrails Beat Guidance: Rule Design for Coding Agents
- Hints Over Code Samples in Agent Prompts
- HTML as Agent Output Format: When to Ask for HTML Instead of Markdown
- Iterative Binary Feedback for Pattern Adherence
- Living-Docs-Grounded Agent Design Conversations
- Mermaid as Agent Output Format: When to Ask for a Diagram Instead of Prose
- Method Map: Failure-Mode to Smallest-Artifact Triage
- Multi-Layer Specification Redundancy as a Robustness Budget
- Natural-Language Customization Bootstrap
- Post-Compaction Re-read Protocol for Agent Continuity
- Production System Prompt Architecture and Techniques
- Prompt File Libraries for Reusable Agent Instructions
- Prompt Governance via PRs: Reviewable AI Behaviour
- Prompt-Rewrite Discipline on Cross-Generation Model Migration
- Rule Lifecycle Metadata for Prunable Instruction Surfaces
- Shared Context Bundle Registry for Agent Teams
- Ubiquitous Language for AI Plans
- WRAP Framework for Writing Agent-Ready Issue Descriptions
Agent Architecture
Composing, structuring, and coordinating agents: delegation, composition patterns, multi-agent coordination, and memory.
- Adversarial Multi-Model Pipeline (VSDD)
- Agent Backpressure: Automated Feedback for Self-Correction
- Agent Composition Patterns: Chains, Fan-Out, Pipelines
- Agent-First Software Design
- Agent Harness: Initializer and Coding Agent
- Agent Pushback Protocol
- Skill Library Evolution
- Agent Turn Model
- Agents vs Commands
- Lay the Architectural Foundation by Hand
- Codebase Readiness for Agents
- Committee Review Pattern
- The Delegation Decision
- Domain-Specific Agent Challenges
- Evaluator-Optimizer Pattern
- Fan-Out Synthesis Pattern
- File-Based Agent Coordination
- Orchestrator-Worker Pattern
- Progressive Disclosure for Agent Definitions
- The Ralph Wiggum Loop
- Specialized Agent Roles
- Task-Specific Agents vs Role-Based Agents
- Worktree Isolation
- Agent Handoff Protocols
- Agent Memory Patterns
- Beads: Structured Task Graphs as External Agent Memory
- Oracle-Based Task Decomposition
- Skeleton Projects as Agent Scaffolding
- Sub-Agents for Fan-Out Research
- Parallel Agent Sessions
- Abstention-Aware Memory Retrieval for Coding Agents
- ACID for Agent Repository State
- Adaptive Generate-Rank-Verify Under Costly Verification
- Adaptive Sandbox Fan-Out Controller
- Agent Circuit Breaker
- Agent Development Lifecycle for Agent Products
- Agent Event Streaming: Consumer Contract Above the Tokens
- Agent JIT Compilation: Compile Tasks Into Executable Plans
- Agent Loop Middleware — Safety Nets and Message Injection
- Agent Project State Purge: Clean-Slate Session Reset
- Agent Runtime Middleware: Per-Call Interception Pipeline
- Agent Self-Review Loop for Iterative Self-Improvement
- Agent Terminology Disambiguation for AI Coding Systems
- Agent-Discoverable Slash Commands
- Agentic AI Architecture: From Prompt to Goal-Directed
- Agentic Flywheel: Building Self-Improving Agent Systems
- Agentless vs Autonomous: When Simple Beats Complex
- Anthropic's Effective Agents Framework: A Pattern Map
- AST-Grounded Critic Loop for Documentation Maintenance
- AST-Guided Agent Memory for Repository-Level Code Generation
- Async Non-Blocking Subagent Dispatch
- Asynchronous Agent I/O and Speculative Tool Calling
- Attention Latch: When Agents Stay Anchored to Stale Instructions
- Auto Model Selection: Harness-Driven Routing per Task
- AX/UX/DX Triad: Three Experience Layers in Agent Systems
- Background Todo Agent: Offload Plan Maintenance to a Lightweight Model
- Behavioral Drivers of Coding Agent Success and Failure
- Bounded Batch Dispatch for Parallel Agent Execution
- Classical SE Patterns as Agent Design Analogues
- Classifier-Gated Auto-Permission for Cloud-IDE Coding Agents
- Classifier-Subagent Run Mode for Per-Call Permission Routing
- Classifying and Auto-Correcting Coding Agent Misbehaviors (Wink)
- Clock-In / Clock-Out Protocol: Bracketed Session Continuity
- Closed-Loop Role-Based Refinement for Agent Systems
- Cloud-Agent Session Bootstrap: Cached Install plus Per-Session Start
- Cloud-Agent Three-Layer State Decoupling
- Cloud-Agent Tiered Model Routing: Cheap Tiers for Bounded Tasks
- CoALA Decision-Making Loop as an Orchestration Lens
- CoALA Memory Taxonomy as a Classifier for Harness Artifacts
- CoALA Structured Action Space: Internal vs External Actions
- Code-Health-Gated LLM Tier Routing
- Coding Agent Scope Expansion: When to Extend Beyond the Codebase
- Cognitive Reasoning vs Execution: A Two-Layer Agent
- Cohesion-Aware Task Partitioning for Multi-Agent Coding
- Component-Isolated Memory Stress Testing for LLM Agents
- Contextual Capability Calibration for Multi-Agent Delegation
- Continual Learning for AI Agents: Three Layers of Knowledge Accumulation
- Convergence Detection in Iterative Agent Refinement
- Critic Agent Pattern: Dual-Model Plan Review
- Cross-Cycle Consensus Relay
- Cross-Tool Subagent Comparison
- Cross-Vendor Competitive Routing for LLM Selection
- Decentralized Memory for Self-Evolving Multi-Agent Systems
- Declarative Multi-Agent Composition
- Declarative Multi-Agent Topology: Topology-as-Code
- Deep Agent Runtime: The Layer Beneath the Harness
- Deferred Permission Pattern: Headless Agent Session Pausing
- Delta Channels: Bounded Checkpoint Storage for Append-Only Agent State
- Deterministic Orchestration for Structured Modernization
- Discrete Phase Separation
- Distributed Computing Parallels in Agent Architecture
- Domain-Scoped Parallel Exploration for Multi-File Change Localization
- DSPy: Programmatic Prompt Optimization for Compound Agent Systems
- Dual-Budget Control for Search Agents: VOI Scoring Per Action
- Dual-Trace Memory Encoding: Pair Facts with the Scene They Were Learned In
- Dual-Write Append-Mirror for Agent Transcript Externalization
- Durable Interactive Artifacts: Agent Output Outside the Transcript
- Economic Value Signaling in Multi-Agent Networks
- Editor and Manager Surface Separation in Agent IDEs
- Effective Feedback Compute (EFC) for Harness Comparison
- Emergent Architecture in AI-Driven Codebases
- Emergent Behavior Sensitivity for AI Agent Development
- Episodic Memory Retrieval for AI Coding Agent Loops
- Eval Strategy by Agent Generation: A Structure-to-Eval Locator
- Event-Driven Agent Routing for Multi-Team AI Pipelines
- Exception Handling and Recovery Patterns for AI Coding Agents
- Execution Lineage: DAG of Artifacts vs Agent Loops
- Execution-First Delegation: The AI-as-Executor Pattern
- Experience Graphs as Structured Memory for Self-Evolving Agents
- Externalization in LLM Agents
- Feedback as Capability Equalizer: Iterative Feedback Outweighs Model Scale
- Five-Failure-Layers Diagnostic: Attribute Before Swapping the Model
- Forked vs Fresh Subagents: When to Inherit the Parent Conversation
- Formal Process Models as Prompting Scaffolds (Petri Net of Thoughts)
- Gateway Model Routing: Treat the LLM Gateway as a Discovery Source
- Generative Agents Memory Stream: Three-Layer Architecture for Long-Running Agent Sessions
- Goal Contract: Separating the Doer from the Done-Checker
- Goal Monitoring and Progress Tracking for Long-Running Agents
- Goal-Driven Autonomous Loop with Budget Cap
- Governed Sources of Truth for Analytics Agents (Structure Over Access)
- Graph of Thoughts: Directed Graph Reasoning for Multi-Path Problems
- Grill Me: Developer-Initiated Plan Interrogation
- Handoff Skill: Structured Context Transfer Between Agent Sessions
- Harness Design Dimensions and Archetypes
- Harness Engineering for Building Reliable AI Agents
- Harness Hill-Climbing: Eval-Driven Iterative Improvement of Agent Harnesses
- Harness Impermanence: Build Scaffolding To Be Deleted
- Harness-Memory Coupling as a Design Axis
- Hypothesis-Driven Debugging: Instrument Before You Patch
- Idle-Time Speculative Planning for ReAct Agents
- Independent Test Generation in Multi-Agent Code Systems
- Inference-Time Tool-Call Reviewer: Pre-Execution Feedback for Tool-Calling Agents
- Interactive Clarification for Underspecified Tasks
- Interactive Effort Sliders: Per-Turn Reasoning-Budget Controls
- Inversion Analysis: Surface Capabilities Competitors Cannot Replicate
- Isometric Harness Ablation: Rank Subsystem Investment by Removing One at a Time
- Issue Requirements Preprocessing: Structured Input Before Code Generation
- Lane-Based Execution Queueing
- Layered Domain Architecture: A Prescriptive Default for Agent-Built Code
- Layered Mutability: Governing Persistent Self-Modifying Agents
- Lead-to-Teammate Plan-Approval Handshake for Multi-Agent Work
- LLM Map-Reduce Pattern for Parallel Input Processing
- Long-Running Agents: Durability and Resumability Across Sessions
- Loop Strategy Spectrum: Accumulated vs Fresh Context
- Magentic Orchestration: Task-Ledger-Driven Adaptive Multi-Agent Planning
- Managed vs Self-Hosted Agent Harness: Deployment Trade-offs
- Memory Reinforcement Learning (MemRL)
- Memory Synthesis: Extracting Lessons from Execution Logs
- Memory Transfer Learning: Cross-Domain Memory Reuse in Coding Agents
- Memory-as-Governance: Pre-Action Gates for Coding Agents
- Meta-Engineering Harness for Production AI-Native Software Delivery
- Model-Neutral Agent Architecture: Model Portability Over Cloud Portability
- Most-Restrictive-Wins Fusion for Parallel Agent Control Returns
- Multi-Agent SE Design Patterns: A Taxonomy Across 94 Papers
- Multi-Agent Topology Taxonomy: Centralised, Decentralised
- Multi-Model Plan Synthesis for System Architecture
- Multi-Shape BYOK Provider: Declare API Family per Endpoint
- Observability-Driven Harness Evolution
- Observation Contract Preservation in Tool-Augmented Agents
- Observation-Driven Coordination: CRDT-Based Parallel Agent
- Open Agent School Pattern Mapping for Practitioners
- Opponent Processor / Multi-Agent Debate Pattern
- Parameter-Keyed Caching and Dependency-Aware Parallelism for Plan-Execute Pipelines
- Parsimonious Agent Routing for Multi-Agent Dispatch
- Pattern Selection Map: Trade-off Matrix for This Site's Patterns
- Per-Call Budget Hints on Tool Invocations
- Per-Model Harness Tuning: Treating the Backing Model as a Harness Variable
- Per-Tool Extended Reasoning Opt-In: Tool-Call-Scoped Budgets
- Per-User Supervisor Process for Background Agent Sessions
- Persistent Shared Search Sub-Agent for Output-Token Reuse
- Persistent-Connection Agent Transport
- Persona-as-Code: Defining Agent Roles as Structured Docs
- Plan Compliance in Agents: Measure What They Execute, Not What You Wrote
- Prebuilt Agent Environments: Amortising Cloud Agent Cold Start with Custom Container Images
- Proactive Idle-Time Anticipation (ProAct)
- Production Hosting Topology for Self-Hosted Agent SDK Runtimes
- Progressive Spend Threshold Alerting for Agent Cost Governance
- Quality Score Rubric and Simplification Log for Agent Harnesses
- RAG over Thinking Traces: Index Reasoning Trajectories Instead of Documents
- Rainbow Deployments for Agents: Gradual Version Migration
- Recursive Agent Harnesses (RAH)
- Recursive Best-of-N Delegation
- Reflective Prompt Evolution with Pareto Selection (GEPA)
- Remote Agent Host Sessions over SSH and Dev Tunnels
- Remote Session Control for Local CLI Agents
- Reverse-Engineered Executable Specifications for Agentic Program Repair
- Role Orchestration on a Single Model
- RubricRefine: Pre-Execution Rubric Refinement for Code-Mode Tool Use
- Runtime Harness Adaptation: Four-Layer Interface Fixes for Frozen LLM Agents
- Runtime Scaffold Evolution: Agents That Build Tools
- Scaffold Architecture Taxonomy for Coding Agents
- Selective Autonomy from Copilot Feedback
- Selective Checkpoint Restore Across Code and Conversation State
- Self-Discover Reasoning: LLM-Composed Reasoning Structures
- Self-Healing Production Agent: Automated Regression Detection and Autofix PR
- Self-Reporting Loops: Autonomous Routines That File Their Own Backlog
- Self-Rewriting Meta-Prompt Loop
- Semantic Caching for Multi-Agent Code Systems
- Session Harness Sandbox Separation for Long-Running Agents
- Session Recap: Goal-Shaped Handoff at Context Boundaries
- Shared Agent Context Store API: When to Expose Curated Context as an Endpoint
- Six-Shape Approval Response Taxonomy: Beyond Binary Allow/Deny
- Skill Program Functions: Executable Guardrails Compiled From Past Failures
- Solver-Externalized Constraint Reasoning (MaxSAT/SMT Encoding)
- Specialist Orchestrated Queuing for Multi-Agent SE (SPOQ)
- Specialized Small Language Models as Agent Sub-Tools
- Sprint Contracts: Pre-Coding Success Agreements for Multi-Agent Tasks
- Staggered Agent Launch: Preventing Thundering-Herd in Swarms
- Stochastic-Deterministic Boundary as First-Class Contract
- Structured Agentic Software Engineering (SASE)
- Subprocess-per-Session Hosting Model: Four Session-Lifecycle Topology Patterns
- Subtask-Level Memory for Software Engineering Agents
- Swarm Migration Pattern
- Syntax-Aware Completion Routing for Local-First Code Models
- System-Level Optimization Pipeline
- Task Feasibility Awareness: Stop Before You Start
- Temporary Compensatory Mechanisms in Agent Harnesses
- Tenant Model Policy: Organization-Scoped Rules for AI Model Selection
- The Advisor Strategy: Frontier Model as Strategic Advisor
- The Agent Stack Bet: Architectural Decisions for Production Agents
- The AX Stack: A Layered Model of an AI Coding Agent's Prompt-to-Compile Path
- The Three Loops of Agentic Coding: A Diagnostic Vocabulary
- Three Reasoning Spaces: Plan-Bead-Code Phase Gates
- Tiered Memory Architecture: Episodic-to-Semantic Consolidation for Long-Running Agents
- Tool Confirmation Carousel: Batched UI for Per-Call Approvals
- Tool Preamble: User-Visible Status Updates Before Tool Calls
- Typed Memory from VCS History: When It Pays Off
- Typed Schemas at Agent Boundaries for Multi-Agent Systems
- Utility-Model Split: Background Tasks on a Cheaper Model
- Verify-Gated Completion as Admission Control
- Voting / Ensemble Pattern for AI Agent Development
- VS Code Agents App: Agent-Native Parallel Task Execution
- WIP=1 and Little's Law: Kanban Throughput Theory for Agent Task Design
Tool Design
Building, describing, and filtering tools so that they are agent-friendly.
- Consolidate Agent Tools
- Feature List Files
- Filesystem-Based Tool Discovery
- Token-Efficient Tool Design
- Tool Minimalism and High-Level Prompting
- Advanced Tool Use: Scaling Agent Tool Libraries
- Batch File Operations via Bash Scripts
- CLI Scripts as Agent Tools
- Permutation Frameworks for Batch Code Generation
- Semantic Tool Output
- Subagent Schema-Level Tool Filtering
- Tool Description Quality
- Write Tool Descriptions Like Onboarding Docs
- Tool Engineering
- Agent-Aware CLI Behaviour via Environment Variable
- Agent-Computer Interface (ACI): Tool Design as UX Discipline
- Chance-Corrected Shortlist Depth Sizing for Tool Retrieval (Bits-over-Random)
- CLI-First Skill Design
- Code Interpreter as a Primary Agent Tool
- Conditional Hook Execution: Filter Hooks by Tool Pattern
- Cross-Repo Agent Search: GitHub-API-Backed Text Search Beyond the Workspace
- Designing for Agent Consumers (Agent Experience)
- Documentation-Grounding MCP Servers for Vendor SDKs
- Edit Format Selection: Diff vs. Search-Replace vs. Full Rewrite
- Effort-Aware Hooks: Reading the Reasoning Tier from PreToolUse and PostToolUse
- Function-Level Debugger Interfaces for Coding Agents
- Future-Based Asynchronous Function Calling
- Google ADK Skills: Portable SKILL.md Across ADK Agents
- Graceful Tool-Output Truncation: The PARTIAL Signal
- Headless-First Services: APIs for Agent Consumers
- Hint-Driven Concurrency for Read-Only MCP Tools
- Hook Exec Form vs Shell Form: Shell-Injection-Safe Hook Commands
- Hooks Invoking MCP Tools: Closing the Loop Between Policy and Tool Execution
- Indexed Regex Search for Agent Tools
- Lexical-First Retrieval for Agentic Search: When BM25 Is Enough
- Machine-Readable Error Responses for AI Agents (RFC 9457)
- MCP alwaysLoad: Classifying Servers as Eager or Just-in-Time
- MCP Client Design: Building Robust Host-Side Logic
- MCP Client/Server Architecture Best Practices Guide
- MCP Elicitation: Servers Requesting Structured Input Mid-Task
- MCP LLM Sampling: Servers Requesting AI Inference Mid-Tool
- MCP Server Design: Building Agent-Friendly Servers
- MCP Tool Result Persistence via _meta Annotation
- MessageDisplay Hook: Transforming Assistant Text at the Display Boundary
- On-Demand Skill Hooks: Session-Scoped Guardrails via Skill Invocation
- One-Shot Record and Deterministic Replay for Periodic Agent Tasks
- OpenAPI Documentation Smells for Agent-Ready APIs
- Out-of-Band Hook Notifications via terminalSequence
- Override Pattern: Reusing Interactive Commands in Automated Pipelines
- Poka-Yoke for Agent Tools: Mistake-Proof Tool Interfaces
- PostToolBatch Hook: Once-Per-Decision-Cycle Injection at the Batch Boundary
- PostToolUse continueOnBlock: Refusal With a Load-Bearing Reason
- PostToolUse Hook for BSD/GNU CLI Incompatibilities
- PostToolUse Output Replacement: Hooks That Rewrite Tool Results
- PreCompact Hook: Vetoing Compaction at Lifecycle Boundaries
- Production MCP Agent Stack: Sequencing Six Decisions into One Deployment
- Project Writing Skill: House Style as Model-Invocable Skill
- Proprietary-to-Open-Standard Tool Migration (Copilot Extensions to MCP)
- Push-Event MCP Channels: Inverting the Pull-Tool Polarity
- Reactive Environment Hooks: CwdChanged and FileChanged
- Scoped MCP Server Discovery: Most-Specific-Wins Resolution
- Self-Healing Tool Routing
- Skill as Instruction Surface and Callable API (Interpreter Skills)
- Skill as Knowledge Pattern for AI Agent Development
- Skill Authoring Patterns: Description to Deployment
- Skill Context Isolation: Forking the Skill into a Subagent Window
- Skill Library Technical Debt: Library-Time Maintenance for Agent Skills
- Skill Tool as Enforcement: Loading Command Prompts at Runtime
- SKILL.md Frontmatter Reference: All Fields Explained
- StopFailure Hook: Observability for API Error Termination
- Terminal Tool Output Compression: Filtering Predictable Noise at the Harness
- Terminal Tools for Agents: send_to_terminal and Background Interaction
- Tool Cloning and Provenance Assessment in Agent Ecosystems
- Tool Necessity Probing: Reading Tool-Call Decisions From Hidden States
- Toolset Agentization: Wrapping Co-Used Tools as Sub-Agents
- Unix CLI as the Native Tool Interface for AI Agents
- Web Search Agent Loop: Iterative Research Patterns
Security & Safety
Defending against prompt injection, data exfiltration, and credential exposure.
- Blast Radius Containment: Least Privilege for AI Agents
- Dual-Boundary Sandboxing
- Prompt Injection: A First-Class Threat
- Scope Sandbox Rules to Harness-Owned Tools
- URL-Based Data Exfiltration Guard
- Treat Task Scope as a Security Boundary
- Defense-in-Depth Agent Safety
- Close the Attack-to-Fix Loop
- PII Tokenization in Agent Context
- Protecting Sensitive Files from Agent Context
- RL-Trained Automated Red Teamers
- Scoped Credentials via Proxy
- Secrets Management for Agent Workflows
- Public-Web Index to Gate URL Fetching
- Action-Audit Divergence: A Four-Mode Taxonomy for Runtime Hardening
- Action-Selector Pattern: LLM as Intent Decoder with Deterministic Execution
- Agent Network Egress Policy: Admin-Controlled Domain Allow/Deny
- Agent-Authored Messages as a Deferred Exfiltration Channel
- Agent-Emitted Dependency Version Ranges Widen the Supply-Chain Attack Surface
- Agentic Detection and Response at the MCP Boundary
- Always-On Agentic PR Security Review
- Authority Confusion: Untrusted Context Must Not Authorize Side Effects
- Behavioral Firewall for Tool-Call Trajectories
- Clarification Mode Amplifies Prompt Injection
- Code Injection Defence in Multi-Agent Pipelines
- Cognitive Poisoning: Untrusted Tool Feedback as a Trajectory Attack
- Compositional Vulnerability Induction in Coding Agents
- Constraint Drift: Why Safety Must Be Maintained, Not Asserted
- Containment Playbook: npm-to-Signing-Channel Compromise
- Context-Fractured Decomposition Attacks on Tool-Using Agents
- Control/Data-Flow Separation for Prompt Injection Defense (CaMeL)
- Credential Hygiene for Agent Skill Authorship
- Cryptographic Governance Audit Trail for AI Agents
- Customer-Hosted MCP Tunnel: Outbound-Only Connectivity to Private MCP Servers
- Designing Agents to Resist Prompt Injection
- Discovering Indirect Injection Vulnerabilities in Your Agent
- Docker sbx Adoption for Coding Agents
- Dormant Memory Payloads Triggered by Sensitive Topics (Trojan Hippo)
- Dual-Graph Alignment for Indirect Prompt Injection Defense (AuthGraph)
- Embedding Inversion: Vector Stores as a Source-Text Disclosure Surface
- Enterprise Agent Hardening: Three Production Gates
- Enterprise-Managed Plugin Governance for Agent CLIs
- Evidence-Based Allowlist Auto-Discovery for Agents
- Fail-Closed Remote Settings Enforcement for Enterprise Agents
- Five-Stage Policy Layer Typology for Generalist Agents
- Foresight-Guided Defense Against Infectious Jailbreaks in Multi-Agent Systems
- Four-Layer Taxonomy of Agent Security Risks
- Gate Agent Writes to Executable Config Files as Privileged Actions
- Goal Reframing: The Primary Exploitation Trigger for LLM Agents
- Harness Composition for Scaled Security Audits
- Heartbeat-Bound Hierarchical Credentials for Agent Swarms
- History Anchors: Consistency-Cued Continuation of Unsafe Prior Actions
- Hostname-Allowlist Proxy: The TLS-Inspection Blind Spot
- Hybrid Deterministic + Semantic Authorization for Agent Tool Calls
- Improper Output Handling: Validate Agent Output Before Downstream Use
- In-Process WebAssembly Sandboxes for Agent-Generated Code
- Inline Safety Harness with Cascade Verification (FinHarness)
- Lethal Trifecta Threat Model for AI Agent Development
- Lifecycle-Integrated Security Architecture for Agent Harnesses
- LLM-Pinned Library Versions Carry Systemic CVE Exposure
- Lock-State Safeguards for Desktop-Controlling Agents
- MCP Runtime Control Plane: Policy Evaluation Between Agent and Tool
- Mid-Trajectory Guardrail Selection for Multi-Step Tool Calls
- Monotonic Capability Attenuation for Composition-Safe Tool Use
- Multi-Tenant Isolation Knobs for Shared-Container Agent SDK Hosting
- Multitenant RAG: Closing the Relevance-Authorization Gap
- Network-less Container + Unix-Socket Egress Proxy for Agent Sandboxes
- Oracle Poisoning: Knowledge Graph Corruption Against Tool-Using Agents
- Org-Membership-Gated Agent Entitlement
- OWASP LLM Top 10 (2025): Agent Security Crosswalk
- Per-Server MCP Environment Scoping for Credential Isolation
- Permission Framework Choice Outweighs Model Choice for Limiting Overeager Actions
- Plan-Then-Execute as the Default for Web Agents
- Pre-Execution Risk Classification for Terminal Commands
- Pre-Trust Execution Surface in Coding Agent Harnesses
- Privacy-Preserving LLM Requests: Eight Techniques and a Practical Combination
- Provenance-Aware Decision Auditing for LLM Agents
- RAG Architecture as a Poisoning Robustness Decision
- Restricted-Access Defensive AI: Project Glasswing as a Deployment Model
- Safe Outputs Pattern for Trustworthy Agent Responses
- Sandbox + Approvals + Auto-Review Governance Triad
- Sandboxed Coding Environments: Containers vs MicroVMs vs OS-Level Isolators
- Scanner-as-MCP-Server: Secret and Dependency Scans as Typed Agent Tools
- Security Budget as Token Economics
- Security Constitution for AI Code Generation
- Security Drift in Iterative LLM Code Refinement
- Selective Network Access in Agent Sandboxes: The allowNetwork Pattern
- Semantic Intent Validation for Agent Skills
- Sensitive Terminal Prompt Interception
- Skill Shell Execution Gate: Disabling Inline Shell from Skills
- Skill Supply-Chain Poisoning
- Slopsquatting: Hallucinated Package Names as a Supply-Chain Vector
- Subprocess PID Namespace Sandboxing in Claude Code
- Sufficiency-Tightness Decomposition for Agent-Authored Permissions
- System Prompt as Secret Store (OWASP LLM07)
- Task-Based Access Control with Hybrid Inspection
- Three-Depth In-Session Security Review
- Three-Vector Evasion Taxonomy for Agent Security Tests
- Tool Signing and Signature Verification for Agents
- Tool-Invocation Attack Surface in Coding Agents
- Transcript-Driven Permission Allowlist
- Unbounded Consumption: Bounding Agent Resource Use Against DoS and Denial-of-Wallet
- Usability Pressure as a Silent Security-Regression Vector
- Verifying LLM-Generated Cryptographic Code
- Windows Sandboxing for Coding Agents
- Workload Identity Federation for Agent Runtimes
Testing & Evaluation
Measuring agent quality: evals, metrics, code review, and verification strategies.
- Verification Ledger
- Diff-Based Review Over Output Review
- Incremental Verification
- Agent-Assisted Code Review
- Analyzing Agent Evaluation Transcripts
- Golden Query Pairs as Regression Tests
- Grade Agent Outcomes, Not Execution Paths
- Test Harness Design for LLM Context Windows
- pass@k and pass^k Metrics
- Red-Green-Refactor with Agents
- Risk-Based Task Sizing for Verification Depth
- Task List Divergence as Instruction Diagnostic
- Test-Driven Agent Development
- Eval-Driven Development
- Evaluation-Driven Development for Agent Tools
- LLM-as-Judge Evaluation
- Simulation and Replay Testing
- Agent PR Volume vs. Value: The Productivity Paradox
- Agent-Authored PR Integration: Collaboration Signals That Determine Merge Success
- Agent-Driven PR Slicing
- Agent-Generated Code Maintenance Asymmetry
- Agent-Proposed Merge Resolution
- Agentic Code Review Architecture With Tool-Calling
- AIRA: Inspection Framework for AI-Generated Code
- Anti-Reward-Hacking: Rubrics That Resist Gaming
- Batched Suggestion Application: Bulk-Apply Agent Fixes on PRs
- Behavioral Testing for Non-Deterministic AI Agents
- Benchmark Contamination as Eval Risk
- Benchmark-Driven Tool Selection for Code Generation
- CausalFlow: Counterfactual Repair for Failed Agent Trajectories
- Chain-of-Verification for Coding Agents
- Cloud Parallel Review Pattern
- Completion Failure Taxonomy: Why Code Suggestions Miss
- ComplexMCP: Three Bottlenecks in Large Interdependent Tool Sandboxes
- Constraint Decay in Backend Code Generation
- Control Lexical Leakage in Agent-Memory Retrieval Evals (Entity-Collision)
- Controlled Benchmark Rewriting for Agent Safety Judgment
- Corpus-Level Trace Diagnostics for LLM Agents
- CoT Robustness in Code Generation
- Coverage-Guided Agents for Fuzz Harness Generation
- Coverage-Guided Fuzzing for Multi-Agent LLM Systems (FLARE)
- CRA-Only Review and the Merge Rate Gap
- Cross-Framework Signal Semantics: Re-Measure Borrowed Trajectory Rules
- Data Fidelity Guardrails: Preventing Agent Data Mutation
- Decomposed Red-Teaming for Agent Monitors
- Decomposing Agent Output Variability by Layer (Sampling vs Orchestration State)
- Defense-in-Depth Against Coding Agent Fabrication (Honesty Harness)
- Deferred Standards Enforcement via Review Agents
- Demand-Driven Repository Auditing
- Dependency Gap Validation for AI-Generated Code
- Direct-Apply Review Comments via Cloud Agent
- Distillation-Induced Similarity Metrics for Tool-Use Agents
- Dominator-Graph Trajectory Invariants for Non-Deterministic Agents
- Eval Awareness: Designing Evals Agents Cannot Recognise
- Evaluator Templates: Portable Primitives for Agent Eval Suites
- Five-Pass Blunder Hunt: Repeated Critique Passes for Plans
- Generative Provenance Records for Tool-Using Agents
- Golden Journeys: Restartability as a First-Class Verification Primitive
- Held-Out Test Gap: A Long-Horizon Reward-Hacking Signal
- Human-AI Review Synergy in Agentic Code Review
- Human-Review-Driven Curation of Golden Eval Datasets
- Incident-to-Eval Synthesis: Production Failures as Evals
- Interaction-Pattern Evaluation for Agentic PRs
- Learned Prefix Monitors for Agent Traces
- LLM Agent Bug Fix Taxonomy: 23 Fix Patterns from 930 Real Bugs
- LLM Static Verification Against Natural-Language Requirements
- LLM-Driven Benchmark Auditing
- Macro Evals for Agentic Systems: Population-Level Behavior Patterns
- Markov-Chain Reliability for LLM Agents: Audit the Abstraction Before You Trust the Metric
- Measuring Synthetic Eval Data Quality (SynAE)
- Multi-Agent RAG for Spec-to-Test Automation
- Multi-Turn Conversation Evaluation: Per-Turn and Trace-Level Scoring Together
- Mutation Testing as a Quality Gate for AI-Generated Test Suites
- Nonstandard Errors in AI Agents: Model-Family Variance
- Overeager-Behavior Elicitation: Scope + Trap Fragments as a Diagnostic for Out-of-Scope Tool Calls
- PASS@(k,T): Evaluate RL for Agents Along Sampling and Interaction Depth
- Phantom Symbol Detection for LLM API Migration
- Planted-Bug Methodology: Deliberate Bugs as Observability Calibration
- PR Description Style as a Lever for Agent PR Merge Rates
- Pre-Change Impact Analysis: Dependency Maps That Prevent Agent Regressions
- Precise Debugging: Measure Edit Precision, Not Just Test Pass Rate
- Predicting Reviewable Code: Pre-Flagging Functions Reviewers Will Delete
- RAG/Agent Reliability Problem Map: 16-Domain Failure Taxonomy
- Repository Perturbation as Context-Reasoning Diagnosis (RepoMirage)
- Reproduce-Before-Report Verification Gate
- Review-Feedback-to-Rule Loop: Promoting Recurring PR Comments into Harness Rules
- Review-Then-Apply CLI Flag for In-Process Auto-Fix
- Review-Then-Implement Loop for AI Agent Development
- Reviewer's Playbook for Agent-Authored Pull Requests
- Risk-Score Threshold Calibration for Auto-Approval
- Runnable Documentation as Agent Verification
- Self-Improving Code Review Agents — Learned Rules
- Signal Over Volume in AI Review for AI Agent Development
- Skill Evals: Measuring Skill Quality as a Dataset-Graded Unit
- Skill Retrieval Realism Gap
- Skill Specification Violation Fuzzing
- Source-Grounded Test Plan with Pre-Action Assertion Annotation
- Spec-Derived Execution as a Correctness Oracle
- Staged Evidence Gates for Agentic Program Repair
- Structural Coverage Criteria for Agent Workflows
- Structure-Aware Diff Labeling with Two-Stage LLM Pipelines
- Symptom-Reduction-as-Root-Cause: Why Oracle Tests Alone Miss Architectural Drift
- Task Alignment: The Selective-Compliance Gap Benchmarks Miss
- Test Evolution Blind Spot in Coding Agents
- Test-Driven Intent Clarification: Tests as Intermediate Alignment Artifacts
- The Security Review Gap in AI-Authored PRs
- Tiered Code Review: AI-First with Human Escalation
- Tool-Use Sim-to-Real Perturbation Taxonomy
- Trajectory Decomposition: Diagnose Where Coding Agents Fail
- Trajectory-Opaque Evaluation Gap: Why Final-Output Grading Misses Safety Violations
- Tunable Effort Levels for Code Review Agents
- Variance-Based RL Sample Selection
Hooks & Guardrails¶
Deterministic enforcement — lifecycle hooks, permissions, and structured constraints.
- Deterministic Guardrails Around Probabilistic Agents
- Hooks for Enforcement vs Prompts for Guidance
- Structured Output Constraints
- Event-Driven System Reminders
- Hook Catalog: Enforcing Agent Behavior with Hooks
- Hooks and Lifecycle Events
- Permission-Gated Custom Commands
- PostToolUse Auto-Formatting and Linting
Reliability & Recovery¶
Error handling, cost control, and recovery — circuit breakers, rollbacks, loop detection, and performance budgets.
- Circuit Breakers for Agent Loops
- Cost-Aware Agent Design
- Idempotent Agent Operations
- Rollback-First Design
- Layered Accuracy Defense
- Risk-Based Shipping
- Context-Injected Error Recovery
- Heuristic-Based Effort Scaling
- Loop Detection
- Reasoning Budget Allocation
- Stateless Agent Loop with Prompt Caching
- Steering Running Agents
- The Think Tool
- Agent Debugging
- Escape Hatches: Unsticking Stuck Agents
- Agent Chat History as a First-Class Artifact
- Agent Debug Log Panel: Chronological Event Inspection for Session Debugging
- Agent Observability with OpenTelemetry and Trajectory Logging
- Agent-Trace Data Layer: Storage for Hours-Long Traces
- BYOK Model Token Visibility: Closing the Observability Gap on Self-Hosted Routes
- Context-Usage Attribution: Per-Source Breakdown of Agent Context
- Cost-Aware Tracing for Skill Distillation
- Event Sourcing for Agents: Separating Cognitive Intention
- Failure-Aware Observability for Multi-Agent LLM Systems
- Harness Bug Detection Patterns
- Harness Preflight Doctor Command for Agent Diagnostics
- In-Session Transcript Search: Navigating Long Agent Conversations
- Making Application Observability Legible to Agents
- Observability Feedback Loop: A 7-Step Debug Runbook for Agents
- Offline Trajectory Replay for Multi-Agent Workflow Debugging
- Per-Plugin Token-Cost Attribution via claude plugin details
- Prebuilt Agent Monitoring Dashboard
- Programmatic Agent Session Export via claude agents --json
- Strained Coherence as a Pre-Failure Signal in Agent Trajectories
- Subagent OTel Trace Correlation via agent_id Attribute
- Traces Need Feedback to Power Learning
- Trajectory Logging via Progress Files and Git History
- Trajectory Pre-Filter for Failure Diagnosis (TrajAudit)
Standards & Protocols¶
Open standards and conventions for agent interoperability — AGENTS.md, MCP, A2A, and more.
- Agent-to-Agent (A2A) Protocol
- Agent Cards: Capability Discovery
- Agent Definition Formats
- Agent Skills: Cross-Tool Task Knowledge
- AGENTS.md: A README for AI Coding Agents
- llms.txt: Spec, Adoption, and Honest Limitations
- MCP: The Plumbing Behind Agent Tool Access
- OpenAPI as Agent Tool Specification
- OpenTelemetry for Agent Observability
- Plugin and Extension Packaging
- Tool Calling Schema Standards
- Distributed AGENTS.md Conventions
- A2UI: Framework-Agnostic Generative UI Standard for Agents
- ACDL: A Language for Describing Agentic LLM Contexts
- Cross-IDE Plugin Discovery: One Install Surface, Many Consuming Agents
- Directory-Aware Plugin Suggestions via pluginSuggestionMarketplaces
- llms.txt: Making Your Project Discoverable to AI Agents
- OAuth Client ID Metadata Documents (CIMD) for MCP Servers
- Plugin Dependency Declaration and Disable-Chain Hints
- Portable Agent Definitions: Full-Stack Identity as Code
- Pre-Install Context-Cost Projection in Plugin Marketplaces
- Pre-Install Plugin Transparency: Capability Inventory and Cost Projection
- SUDP: Secret-Use Delegation Protocol for Agentic Systems
- Swarm Skills: Multi-Agent Extension of the Agent Skills Standard
- Symphony: Open Spec for Issue-Tracker-Driven Coding Agent Orchestration
- WebMCP: Browser-Hosted Tool Contracts for In-Page AI Agents
Development Workflows¶
End-to-end processes — planning, bootstrapping, CI/CD integration, and team adoption.
- Agent-Driven Greenfield Product Development
- The AI Development Maturity Model
- Continuous Agent Improvement
- The Plan-First Loop: Design Before Code
- Repository Bootstrap Checklist
- Team Onboarding for Agent Workflows
- Vibe Coding: Outcome-Oriented Development
- Browser Automation as a Research Tool
- Headless Claude in CI
- Agent Commit Attribution: Signed Commits and Agent Identity
- Agent Environment Bootstrapping for AI Agent Development
- Agent Governance Policies for AI Agent Development
- Agent-Generated Onboarding Guide as a Durable Artefact
- Agent-Led Dev-Environment Iteration with Validation and Rollback
- Agent-Powered Codebase Q&A and Onboarding Workflow
- Agentic-Agile: Adapting Agile Rituals for Agent Work
- AI Bot CI/CD Workflow Reliability by Agent
- AI Slop as a Process Problem: Encoding Quality Standards as Pipeline Gates
- AI-Powered Vulnerability Triage for AI Agent Development
- Architecting a Central Repo for Shared Agent Standards
- Auto-Triage Workflow: Bug-Monitoring Agent that Connects Related Reports and Opens Fix PRs
- Backlog Triage as a Named Agent Skill
- Bootstrapping an Agent-Driven Project from Scratch
- Building Custom Agents from Substrate to Production (Agents All the Way Down)
- Burn the Boats — Commitment-Forcing Deprecation
- Canary Rollout for Agent Policy Changes
- CARE: Three-Party Stage-Gated Engineering of LLM Agents
- Chat-Platform Agent Delegation: Invoking Cloud Coding Agents from Team Channels
- CLI-IDE-GitHub Context Ladder for AI Agent Development
- Closed-Loop Agent Training from Tool Schemas
- Closed-Loop CI Failure Remediation with Cloud Coding Agents
- Cloud Planning with Inline-Comment Review and Execute-Anywhere Choice
- Cloud-Local Agent Handoff for AI Agent Development
- Code Cleanliness as an Agent Cost Lever
- Compound Engineering: Learning Loops That Make Each Feature Easier
- Continuous AI (Agentic CI/CD) for AI Agent Development
- Continuous Autonomous Task Loop
- Continuous Documentation as an Agent-Driven Practice
- Continuous Triage: Automating Issue Classification with AI Workflows
- Daily-Use Skill Library: Encoding Your Process as Agent Skills
- Discovery-Only Refactor Pass: Surface Candidates Before Touching Code
- Distilled Bootstrap Contract: Agent-Authored Repo Setup
- Documentation-Guided Legacy Migration: Architecture Docs as a C-to-Rust Blueprint
- Encoding Tacit Knowledge into Agent Improvement Loops
- Enterprise Skill Marketplace: Distribution and Quality
- Entropy Reduction Agents: Automated Codebase Hygiene
- Experiential-Learning Setup Agents with Snapshot Rollback (SetupX)
- Factory Over Assistant: Orchestrating Parallel Agent Fleets
- Failure-Driven Iteration for Improving Agent Workflows
- Humans and Agents in Software Engineering Loops
- In-Thread Side-Channel: Bounded Side Questions Without Losing the Main Task
- Incident Log Investigation Skill: Parallel Queries
- Introspective Skill Generation: Mining Agent Patterns
- Issue-to-PR Delegation Pipeline for AI Agent Development
- Issue-Tracker as Agent Dispatch Surface
- Labels as Locks: Pipelined Backlog Processing with Stage Gates
- Lazy Worktree Isolation: Enter the Worktree on First Write, Not on Dispatch
- Mise en Place for Agentic Coding
- Model Deprecation Lifecycle for Agent Workloads
- Model-ID-as-Dependency: Migration Protocol for Deprecation Churn
- Monolith-to-Sub-Agents Refactor: Five Lessons from a Brittle Prototype
- Multi-Repo and No-Repo Coding Agent Automation Templates
- One-Click CI Auto-Fix: Human-Triggered Cloud-Agent Remediation for Failing GitHub Actions
- Parallel Polyglot Ports as a Spec-Ambiguity Oracle
- Pre-Execution Codebase Exploration for AI Coding Agents
- Programmatic Cloud-Agent Dispatch via REST API and Webhooks
- Prototype Before Optimizing: Establish Quality Baselines Before Token Constraints
- Public-Channel Agent Work as Lehrwerkstatt for Team Learning
- QA Session to Issues Pipeline for AI Agent Development
- Runbooks as Agent Instructions: Agent-Followable Ops
- Scheduled Instruction File Fact-Checker for Accuracy
- SDLC-Phase Skill Taxonomy: Full-Lifecycle Skill Libraries
- Seamless Background-to-Foreground Handoff
- Semantic Issue Search from Chat vs Query Syntax
- Single-Branch Git for Agent Swarms: A Trade-Off Pattern
- Single-CLI Agent Platform: Create to Production in One CLI
- Skill Library Refinement Loops: Organisational Feedback for Shared Skills
- Spec-Driven Development with Spec Kit
- Staged Literal Porting with a Per-Stage Numeric Oracle
- Stakeholder Trust Through Evals and Observability
- The 7 Phases of AI-Assisted Feature Development
- The Research-Plan-Implement Pattern
- Throwaway-Prototype Skill: Build to Discard, Keep Only the Answer
- Tiled Agent Layout: Supervising Parallel Agents Through Dedicated Panes
- Token-Cost Profiling and Reduction for Always-On Agentic Workflows
- Velocity-Quality Asymmetry: Why AI Speed Gains Fade
- Verification-Centric Development for AI-Generated Code
Human Factors¶
The human side — cognitive load, attention management, supervision, and team dynamics.
- Human-in-the-Loop: Where and How to Supervise
- Human-in-the-Loop Confirmation Gates
- Developer as CPU Scheduler: Attention Management
- Cognitive Load, AI Fatigue, and Sustainable Agent Use
- Cross-Tool Translation
- Initiatives and Community
- Safe Command Allowlisting
- Agentic Education: Persona Progression for Teaching AI Coding Tools
- AI Abundance Reshapes Software Engineering Identity
- AI Adoption Footprint: The Segmented Shape of Engineering Orgs
- Ambition Scaling: Moving the Target as Model Capability Increases
- Coding-Agent Reversibility: Platform Choice as a Two-Way Door
- Cohort Segmentation in the Copilot Usage Metrics API
- Convenience Loops and AI-Friendly Code in Your Stack
- Copilot vs Claude Billing Semantics for Enterprise Teams
- Deliberate AI-Assisted Learning: Accelerating Skill Acquisition
- Developer Control Strategies for AI Coding Agents
- Empowerment Over Automation for AI Agent Development
- Evaluating Agent Patterns Catalog as a Source
- From Preventive to Reactive: Front-Loading Security in AI Coding Prompts
- Human-Equivalent Hours for Autonomous Coding Agent Productivity
- Human-Facing Docs in the Agent Era: Mental Models Over Reference
- Intent-Centric Engineering: Oversight Over Authorship
- LLM Refactoring Adoption Patterns
- PM on the AI Exponential
- Polya Small-Steps: Using AI to Think Better, Not Think Less
- Process Amplification: Scaling Human Work with Agents
- Progressive Autonomy: Scaling Trust with Model Evolution
- Rigor Relocation: Engineering Discipline with AI Agents
- Skill Atrophy: When AI Reliance Erodes Developer Capability
- Strategy Over Code Generation: Why AI Speed Doesn't Fix Wrong Goals
- Suggestion Gating: Fewer Completions, Better DX
- The Addictive Flow State of Agent-Assisted Development
- The Bottleneck Migration When Humans Supervise Agents
- The Context Ceiling -- Where AI Fails Expert Architects
- The Productivity-Experience Paradox in AI-Assisted Development
- Visible Thinking in AI-Assisted Development
Anti-Patterns¶
What to avoid — common mistakes and why they fail.
- The Anthropomorphized Agent
- Cargo Cult Agent Setup
- The Copy-Paste Agent
- Distractor Interference
- Framework-First Agent Development
- The Infinite Context
- The Mega-Prompt
- Objective Drift
- The Prompt Tinkerer
- Indiscriminate Structured Reasoning
- The Kitchen Sink Session
- Single-Layer Prompt Injection Defence
- Trust Without Verify
- Vibe Coding
- The Yes-Man Agent
- Abstraction Bloat in AI Agent-Generated Code Output
- Agent Headcount as a Vanity Metric
- Agent-Laundered Bug Reports
- AI Agents in CI/CD with Elevated Permissions and Untrusted Content (GitInject)
- Assumption Propagation: Compounding Agent Misunderstandings
- Boring Technology Bias: When Agents Recommend by Popularity
- bypassPermissions Silently Overrides allowedTools (The Restricted-Bypass Trap)
- Chain-of-Thought Reasoning Fallacy: Traces Are Not Truth
- Coding-Agent Misalignment Forms (Seven-Symptom Taxonomy)
- Context Poisoning: When Hallucinations Become Premises
- Cross-Component Interference in Agent Scaffolds
- Demo-to-Production Gap: When Demos Hide Real Costs
- Direct Prompt Injection via Collaboration (User as Attack Vector)
- Dynamic Tool Fetching Destroys KV Cache Performance
- External Artifacts Treated as Data, Not Adversarial Input
- Happy Path Bias: How AI Agents Skip Error Handling
- Large-Codebase Coding-Agent Failure Patterns (Sourcegraph Five)
- Law of Triviality in AI PRs for AI Agent Development
- LLM Code Review Overcorrection for AI Agent Development
- LLM Comprehension Fallacy: When Models Seem to Understand
- LLM Self-Review Failure in Code Modernization Tasks
- MCP Allowlist by Label, Not by Identity (serverName Trap)
- Memory-Induced Tool-Drift in LLM Agents
- Mid-Session Config Changes as Invisible Cache Invalidators
- Pattern Replication Risk in Agentic Code Generation
- Perceived Model Degradation: Why Vibes Are Not Evals
- PR Scope Creep as a Human Review Bottleneck
- Premature Completion: Agents That Declare Success Too Early
- Prior Dominance Over Feedback in Agent Optimization Loops
- Prompt as Security Knob
- Prompt-Only Tool Access Control
- Refactoring Runaway: Tangled Refactorings in Agent Patches
- Run-Status vs Task-Status Confusion in Autonomous Agent Runs
- Shadow Tech Debt Created by Autonomous AI Agent Commits
- Spec Complexity Displacement: When Specs Become Code
- Stale AI Configuration Artifacts (Context Rot)
- The AI Knowledge Generation Fallacy: LLMs Recombine, Not Invent
- The Consistent Capability Fallacy in LLM Agent Design
- The Effortless AI Fallacy for AI Agent Development
- The Implicit Knowledge Problem for AI Coding Agents
- The LLM Laziness Deficit Fallacy: Restraint Comes From Harness, Not Instruction
- The Reasoning-Complexity Trade-off
- The Synthetic Ground Truth Fallacy in Agent Evaluation
- The Task Framing Irrelevance Fallacy in Agent Prompting
- The Test Homogenization Trap: When LLM-Generated Tests Mirror Model Blind Spots
- Token Preservation Backfire for AI Agent Development
- Treating Agent Safety as Uniform Across a Session (Cold-Start Safety Gap)
- Trusting Human Review to Catch Deliberate Agent Sabotage
- Trusting Tool Error Messages as Implicit Authority (Error-Path Injection)
- When Developers Understand Less of Their Own Codebase
Emerging Concepts¶
Fringe ideas not yet standardized — early signals worth watching.
- First-Party Agent Composition
- Hyper-Personalized Software
- Product-as-IDE
- Bootstrapping Coding Agents: The Specification Is the Program
- Interactive Canvases: Agent-Generated Visual Artifacts as Outputs
Discoverability & GEO¶
Generative-engine optimisation, llms.txt, and topical authority — making content reachable by AI answer engines.
- AI Crawler Policy: robots.txt for the Three-Tier Crawler Landscape
- Answer-First Writing: Structure Content for AI Retrieval
- Assertion Density — Stats and Quotes Over Vague Claims
- Atomic Pages and Chunking — One Concept Per Page for RAG
- GEO for Technical Docs: Developer Documentation Checklist
- Google Search Console Monitoring Workflow
- How AI Engines Cite — ChatGPT, Perplexity, Claude, Gemini
- Measuring GEO Performance for AI Search Visibility
- Schema and Structured Data for GEO — AI Citation Guide
- SEO vs GEO — How Signals and Metrics Differ
- Topical Authority — Entity Coverage for AI Citation
- What is GEO — Generative Engine Optimization Defined
Frameworks¶
Multi-page framework syntheses — end-to-end agent operating models and migration playbooks.
- Agentic Framework Landscape: When Each Framework Fits
- Cognitive Architectures for Language Agents (CoALA): A Classifier for Agent Harnesses
- Consistent-Format Customer Capture
- Cross-Functional Knowledge Artifacts
- Functional Folder Taxonomy
- L0 → L1: Making the Repo Readable
- L1 → L2: Adding Feedback Loops to a Brownfield Repo
- L2 → L3: Building Mechanical Enforcement for Agents
- L3 → L5: Reaching Agent-First from a Brownfield Repo
- Natural-Language Git as Adoption Unlock
- Plan Files as Resumable Artifacts
- Plan Mode for Knowledge Artifacts
- Self-Explanation Loop for Code-Review Comments
Learning & Training¶
Structured curriculum modules and hands-on tracks for agent-assisted development.
- Air-Gapped RAG: Architecture Fundamentals
- Air-Gapped RAG: Chunking Strategies
- Air-Gapped RAG: Deployment, Operations, and Compliance
- Air-Gapped RAG: Document Ingestion and Parsing
- Air-Gapped RAG: Grounding, Citations, and Evaluation
- Air-Gapped RAG: Local Embeddings and Vector Stores
- Air-Gapped RAG: Local LLM Inference
- Air-Gapped RAG: Overview and When to Use It
- Air-Gapped RAG: Retrieval and Re-Ranking
- Autonomous Research Loops: Loops That Know When to Stop
- Context Engineering (Training Module)
- Earned-Complexity Agent Maturity Ladder
- Eval Engineering (Training Module)
- GitHub Copilot Advanced Patterns: Multi-Agent and Automation
- GitHub Copilot Platform Surface Map: All Capabilities
- GitHub Copilot: Context Engineering & Agent Workflows
- GitHub Copilot: Customization Primitives and Stack
- GitHub Copilot: Harness Engineering for Agent-Ready Code
- GitHub Copilot: Model Selection, Routing, and Costs
- GitHub Copilot: Team Adoption and Governance Guide
- Grading Strategies for Eval-Driven Development
- Hardening Agent Evals for Production-Grade Reliability
- Harness Engineering (Training Module)
- How the Four Agent Engineering Disciplines Compound
- Prompt Engineering for Agent Instructions and Systems
- Step-by-Step: Building Your First Eval-Driven Feature
- The Eval-First Development Loop for AI Agent Features
- Tool Engineering (Training Module)
- What Evals Are and Why AI Agents Need Them for Quality
- Writing Your First Agent Evaluation Suite from Scratch
Tool-Specific Guides¶
Tool-specific features and workflows for Claude Code, GitHub Copilot, Cursor, and other assistants.
- Agent HQ (Multi-Agent Platform) for AI Agent Development
- Agent Mission Control for Orchestrating Agent Tasks
- Agent View: Dispatch-Attach-Monitor Surface for Parallel Sessions
- Channels Permission Relay
- Claude Agent SDK: Building Custom Agentic Workflows
- Claude Code --bare Flag
- Claude Code /batch and Worktrees for AI Agent Development
- Claude Code Agent Teams for Collaborative AI Workflows
- Claude Code Auto Mode: Classifier-Based Permission Gating
- Claude Code Dynamic Workflows
- Claude Code Extension Points: When to Use What
- Claude Code Feature Flags and Environment Variables
- Claude Code Hooks: Deterministic Lifecycle Automation
- Claude Code Review
- Claude Code Sub-Agents for Delegating Complex Tasks
- Cloud-Scheduled Routines vs Local Session Scheduling
- Copilot CLI Agentic Workflows for AI Agent Development
- Copilot CLI BYOK and Local Model Support
- Copilot Cloud Agent Organization Controls
- Copilot Cloud Agent Three-Phase Execution Model
- Copilot Inline Agent Mode in JetBrains IDEs
- Copilot Memory and Cross-Agent Persistence
- Copilot Spaces: Curated Context Collections for Grounding
- Copilot Unified Sessions View and CLI Agent in JetBrains IDEs
- copilot-instructions.md as a Repo-Level Instruction Convention
- Cursor /multitask: Async Subagent Dispatch in the Editor
- Cursor 3 Agents Window: Parallel Agents and Worktree Isolation
- Cursor Multi-Root Workspaces for Cross-Repo Agent Edits
- Cursor SDK: Programmable TypeScript Agent Runtime
- Cursor Self-Hosted Cloud Agents
- Dependabot Agent Assignment for AI-Driven Vulnerability Remediation
- GitHub Agentic Workflows for Automating Dev Processes
- GitHub Copilot Agent Mode for AI Agent Development
- GitHub Copilot Coding Agent for AI Agent Development
- GitHub Copilot Custom Agents and Skills Extensibility Guide
- GitHub Copilot Dedicated App as Agent-First Surface
- GitHub Copilot Extensions for AI Agent Development
- GitHub Copilot MCP Integration for AI Agent Development
- GitHub Copilot SDK for AI Agent Development
- GitHub Models in Actions for AI-Driven CI Workflows
- Hard-Deny Classifier Rule: Unconditional Block in Auto Mode
- Local Plugin Scaffolding via claude plugin init and Auto-Loaded .claude/skills
- Managed Settings Drop-In Directory: Enterprise Policy Fragmentation
- Managing Agent Skills from the GitHub CLI with gh skill
- Monitor Tool: Event Streaming from Background Scripts
- Monorepo Skill and Agent Discovery: Hierarchical Configuration
- Next Edit Suggestions Paradigm for AI Agent Development
- OpenAI Agents SDK Sandboxes Harness and Memory
- Plan Mode: Read-Only Exploration Before Implementation
- Plugin Background Monitors: Declarative Supervision Auto-Armed at Session Start
- Plugin-Activated Main-Agent Override and Bin/ PATH Injection
- PowerShell Tool: Native Windows Shell for Claude Code
- Reloading Skills Mid-Session in Claude Code
- Session Scheduling with Loop and Cron in Claude Code
- Skill disallowed-tools Frontmatter: Skill-Layer Tool Denial
- Skill Eval Loop
- Sparse-Checkout Worktrees for Monorepo Agent Isolation
- Video Transcript Skill: Meeting Recording to Markdown
- Visual-Prompt Agent Steering (Cursor Design Mode)