Tags
Browse the full Agent Patterns library by topic tag.
Tags are the primary topic-first index for this site. Each topic tag below gets a short intro and a list of starting pages; the full auto-generated tag listing follows the intros. Tool-specific tags (claude, copilot, cursor, tool-agnostic) carry no intro; find them in the auto-generated listing at the bottom.
Topic intros
context-engineering
Patterns for managing what enters the agent's context window: budgets, compression, attention curves, retrieval, and prompt caching. Read this if you're hitting context limits, debugging "lost in the middle" failures, or paying too much for token-heavy prompts. Start with:
- Context Engineering: The Discipline of Designing Agent Context
- Context Budget Allocation: Every Token Has a Cost
- Layered Context Architecture
- Lost in the Middle: The U-Shaped Attention Curve
- Static Content First to Maximize Cache Hits
agent-design
How a single agent is structured: harness, delegation, backpressure, composition, and memory. Read this when picking a harness, designing a long-running loop, or choosing between role-based and task-based delegation. Start with:
- Harness Engineering
- The Delegation Decision
- Agent Backpressure: Automated Feedback for Self-Correction
- Agent Composition Patterns: Chains, Fan-Out, Pipelines
- Agent Memory Patterns
multi-agent
Orchestration across multiple agents: fan-out, hand-offs, file-based coordination, and sub-agent topology. Read this when one agent isn't enough and the question shifts from "how do I prompt it" to "how do I wire them together". Start with:
- Orchestrator-Worker Pattern
- Fan-Out Synthesis Pattern
- Sub-Agents for Fan-Out Research
- Agent Handoff Protocols
- File-Based Agent Coordination
memory
What an agent remembers across turns, sessions, and projects — versioned vs streaming, episodic vs reinforced, and the failure modes when memory leaks into wrong contexts. Read this when designing persistence or chasing memory-induced drift bugs. Start with:
- Agent Memory Patterns
- Tiered Memory Architecture
- Episodic Memory Retrieval
- Structured Task Graphs as External Agent Memory (Beads)
- Memory-Induced Tool Drift
code-review
Agentic code review patterns: how agents read diffs, when committee or tiered reviews pay off, and how to wire review verdicts back into iteration. Read this when integrating agents into PRs or designing a review-then-implement loop. Start with:
- Agent-Assisted Code Review
- Diff-Based Review Over Output Review
- Committee Review Pattern
- Tiered Code Review
- Agentic Code Review Architecture
instructions
System prompts, CLAUDE.md / AGENTS.md files, polarity rules, and the compliance ceiling that limits how many rules an agent will honour. Read this when writing or pruning instruction files, deciding what's primacy-critical, or debugging non-compliance. Start with:
- System Prompt Altitude
- CLAUDE.md Convention
- AGENTS.md as Table of Contents, Not Encyclopedia
- The Instruction Compliance Ceiling
- Instruction Polarity: Positive Rules Over Negative
workflows
End-to-end agent-assisted development loops: planning, eval-driven iteration, CI integration, onboarding, and team collaboration. Read this when fitting agents into your existing process rather than starting from a blank page. Start with:
- The Plan-First Loop: Design Before Code
- Eval-Driven Development
- Headless Claude in CI
- Repository Bootstrap Checklist
- Continuous Agent Improvement
cost-performance
Routing, token efficiency, model selection, and tool-call cost control. Read this when token spend is hurting, when you're tuning a reasoning/execution model split, or when you're designing tool descriptions to minimise per-call overhead. Start with:
- Cost-Aware Agent Design
- Token-Efficient Tool Design
- Reasoning Budget Allocation
- Stateless Agent Loop with Prompt Caching
- Gateway Model Routing
testing-verification
General testing and verification discipline for agent-generated work: TDD, incremental checks, verification ledgers, and pre-completion gates. Read this for the broad "did the agent do the right thing" patterns — see code-review and evals for the narrower sub-domains. Start with:
- Test-Driven Agent Development
- Incremental Verification
- Verification Ledger
- Pre-Completion Checklists
- Risk-Based Task Sizing for Verification Depth
evals
Evaluation design specifically: pass@k, LLM-as-judge, golden query pairs, and turning incidents into regression cases. Read this when measuring agent quality systematically, building regression suites, or scoring rollouts before they ship. Start with:
- pass@k and pass^k Metrics
- LLM-as-Judge Evaluation
- Eval-Driven Development
- Golden Query Pairs as Regression Tests
- Incident-to-Eval Synthesis
security
Prompt-injection defence, blast-radius containment, secrets handling, sandboxing, and the lethal-trifecta posture. Read this before granting agents write capability, ingesting untrusted content, or wiring an MCP server. Start with:
- Prompt Injection: A First-Class Threat
- Blast Radius Containment: Least Privilege for AI Agents
- Dual-Boundary Sandboxing
- URL-Based Data Exfiltration Guard
- Defense-in-Depth Agent Safety
observability
Tracing, loop detection, transcript analysis, OTel for agents, and harness debugging. Read this when agents fail silently, loops run wild, or you need to diagnose what an agent actually did versus what it claims. Start with:
- Agent Debugging
- Loop Detection
- Circuit Breakers for Agent Loops
- OpenTelemetry for Agent Observability
- Analyzing Agent Evaluation Transcripts
human-factors
The human side of agent-driven work: cognitive load, attention management across parallel sessions, skill atrophy, adoption, and team dynamics. Read this when scaling agents to a team, fighting AI fatigue, or designing supervision rituals. Start with:
- Developer as CPU Scheduler: Attention Management
- Cognitive Load, AI Fatigue, and Sustainable Agent Use
- Cross-Tool Translation
- Skill Atrophy
- Agentic Education Persona Progression
github-actions
GitHub Actions-specific integration for agent workflows: CI-driven triage, closed-loop remediation, headless runs, and one-click auto-fix. Read this when wiring agents into Actions or designing the CI-side automation that triggers and consumes agent output. Start with:
- Headless Claude in CI
- Closed-Loop CI Failure Remediation
- One-Click CI Auto-Fix
- Continuous Triage
- AI Bot CI Workflow Reliability
All tags
agent-design
- A2UI: Framework-Agnostic Generative UI Standard for Agents
- ACID for Agent Repository State
- AI Bot CI/CD Workflow Reliability by Agent
- AI Slop as a Process Problem: Encoding Quality Standards as Pipeline Gates
- AI-Powered Vulnerability Triage for AI Agent Development
- AOCI: Symbolic-Semantic Repository Indexing
- AST-Grounded Critic Loop for Documentation Maintenance
- AST-Guided Agent Memory for Repository-Level Code Generation
- AX/UX/DX Triad: Three Experience Layers in Agent Systems
- Abstention-Aware Memory Retrieval for Coding Agents
- Abstraction Bloat in AI Agent-Generated Code Output
- Action-Audit Divergence: A Four-Mode Taxonomy for Runtime Hardening
- Action-Selector Pattern: LLM as Intent Decoder with Deterministic Execution
- Adaptive Generate-Rank-Verify Under Costly Verification
- Adaptive Sandbox Fan-Out Controller
- Advanced Tool Use: Scaling Agent Tool Libraries
- Adversarial Multi-Model Development Pipeline (VSDD)
- Agent Backpressure: Automated Feedback for Self-Correction
- Agent Cards: Capability Discovery Standard for AI Agents
- Agent Circuit Breaker
- Agent Commit Attribution: Signed Commits and Agent Identity
- Agent Composition: Chains, Fan-Out, Pipelines, Supervisors
- Agent Debug Log Panel: Chronological Event Inspection for Session Debugging
- Agent Debugging: Diagnosing Bad Agent Output
- Agent Definition Formats: How Tools Define Agent Behavior
- Agent Design Patterns and Architectures for AI Agents
- Agent Development Lifecycle for Agent Products
- Agent Environment Bootstrapping for AI Agent Development
- Agent Event Streaming: Consumer Contract Above the Tokens
- Agent Governance Policies for AI Agent Development
- Agent HQ (Multi-Agent Platform) for AI Agent Development
- Agent Handoff Protocols: Passing Work Between Agents
- Agent Harness: Initializer and Coding Agent Pattern
- Agent JIT Compilation: Compile Tasks Into Executable Plans
- Agent Loop Middleware — Safety Nets and Message Injection
- Agent Memory Patterns: Learning Across Conversations
- Agent Mission Control for Orchestrating Agent Tasks
- Agent Network Egress Policy: Admin-Controlled Domain Allow/Deny
- Agent Observability with OpenTelemetry and Trajectory Logging
- Agent Project State Purge: Clean-Slate Session Reset
- Agent Pushback Protocol for Managing Disagreements
- Agent Runtime Middleware: Per-Call Interception Pipeline
- Agent Self-Review Loop for Iterative Self-Improvement
- Agent Skills: A Cross-Tool Task Knowledge Standard
- Agent Terminology Disambiguation for AI Coding Systems
- Agent View: Dispatch-Attach-Monitor Surface for Parallel Sessions
- Agent-Authored Messages as a Deferred Exfiltration Channel
- Agent-Authored PR Integration and Merge Predictors
- Agent-Aware CLI Behaviour via Environment Variable
- Agent-Computer Interface (ACI): Tool Design as UX Discipline
- Agent-Discoverable Slash Commands
- Agent-Driven Greenfield Product Development from Scratch
- Agent-First Software Design for AI Agent Development
- Agent-Generated Onboarding Guide as a Durable Artefact
- Agent-Led Dev-Environment Iteration with Validation and Rollback
- Agent-Powered Codebase Q&A and Onboarding Workflow
- Agent-Trace Data Layer: Storage for Hours-Long Traces
- Agent-to-Agent (A2A) Protocol for AI Agent Development
- Agentic AI Architecture: From Prompt to Goal-Directed
- Agentic Detection and Response at the MCP Boundary
- Agentic Flywheel: Building Self-Improving Agent Systems
- Agentic-Agile: Adapting Agile Rituals for Agent Work
- Agentless vs Autonomous: When Simple Beats Complex
- Agents vs Commands: Separation of Role and Workflow
- Air-Gapped RAG: Architecture Fundamentals
- Always-On Agentic PR Security Review
- Ambition Scaling: Moving the Target as Model Capability Increases
- Anthropic's Effective Agents Framework: A Pattern Map
- Anti-Reward-Hacking: Rubrics That Resist Gaming
- Architecting a Central Repo for Shared Agent Standards
- Assumption Propagation: Compounding Agent Misunderstandings
- Async Non-Blocking Subagent Dispatch
- Asynchronous Agent I/O and Speculative Tool Calling
- Attention Latch: When Agents Stay Anchored to Stale Instructions
- Authority Confusion: Untrusted Context Must Not Authorize Side Effects
- Auto Model Selection: Harness-Driven Routing per Task
- Auto-Triage Workflow: Bug-Monitoring Agent that Connects Related Reports and Opens Fix PRs
- Autonomous Research Loops: Loops That Know When to Stop
- Background Todo Agent: Offload Plan Maintenance to a Lightweight Model
- Backlog Triage as a Named Agent Skill
- Behavioral Drivers of Coding Agent Success and Failure
- Behavioral Firewall for Tool-Call Trajectories
- Behavioral Testing for Non-Deterministic AI Agents
- Blast Radius Containment: Least Privilege for AI Agents
- Bootstrapping Coding Agents: The Specification Is the Program
- Bootstrapping an Agent-Driven Project from Scratch
- Boring Technology Bias: When Agents Recommend by Popularity
- Bounded Batch Dispatch for Parallel Agent Execution
- Brownfield to Agent-First: Repo Maturity Framework
- Browser Automation as a Research Tool: Bypassing Bot Detection
- Burn the Boats — Commitment-Forcing Deprecation
- CARE: Three-Party Stage-Gated Agent Engineering
- CLI-First Skill Design
- CLI-IDE-GitHub Context Ladder for AI Agent Development
- Canary Rollout for Agent Policy Changes
- CausalFlow: Counterfactual Repair for Failed Agent Trajectories
- Channels Permission Relay
- Chat-Platform Agent Delegation: Invoking Cloud Coding Agents from Team Channels
- Circuit Breakers for Agent Loops
- Clarification Mode Amplifies Prompt Injection
- Classical SE Patterns as Agent Design Analogues
- Classifier-Gated Auto-Permission for Cloud-IDE Coding Agents
- Classifier-Subagent Run Mode for Per-Call Permission Routing
- Classifying and Auto-Correcting Coding Agent Misbehaviors (Wink)
- Claude Agent SDK: Building Custom Agentic Workflows
- Claude Code /batch and Worktrees for AI Agent Development
- Claude Code Agent Teams for Collaborative AI Workflows
- Claude Code Auto Mode: Classifier-Based Permission Gating
- Claude Code Dynamic Workflows
- Claude Code Extension Points: When to Use What
- Claude Code Feature Flags and Environment Variables
- Claude Code Hooks: Deterministic Lifecycle Automation
- Claude Code Sub-Agents for Delegating Complex Tasks
- Claude Code ↔ Copilot CLI: Changelog-Driven Feature Parity
- Clock-In / Clock-Out Protocol: Bracketed Session Continuity
- Close the Attack-to-Fix Loop: Adversarially Train Agent
- Closed-Loop Agent Training from Tool Schemas
- Closed-Loop CI Failure Remediation with Cloud Coding Agents
- Closed-Loop Role-Based Refinement for Agent Systems
- Cloud Planning with Inline-Comment Review and Execute-Anywhere Choice
- Cloud-Agent Session Bootstrap: Cached Install plus Per-Session Start
- Cloud-Agent Three-Layer State Decoupling
- Cloud-Agent Tiered Model Routing: Cheap Tiers for Bounded Tasks
- Cloud-Local Agent Handoff for AI Agent Development
- CoALA Decision-Making Loop as an Orchestration Lens
- CoALA Memory Taxonomy as a Classifier for Harness Artifacts
- CoALA Structured Action Space: Internal vs External Actions
- CoT Robustness in Code Generation
- Code Cleanliness as an Agent Cost Lever
- Code Injection Defence in Multi-Agent Pipelines
- Code Interpreter as a Primary Agent Tool
- Code-Health-Gated LLM Tier Routing
- Codebase Readiness for Agents: Agent-Friendly Code
- Coding Agent Scope Expansion: When to Extend Beyond the Codebase
- Coding-Agent Misalignment Forms (Seven-Symptom Taxonomy)
- Cognitive Architectures for Language Agents (CoALA): A Classifier for Agent Harnesses
- Cognitive Poisoning: Untrusted Tool Feedback as a Trajectory Attack
- Cognitive Reasoning vs Execution: A Two-Layer Agent
- Committee Review Pattern for Multi-Agent Code Review
- Component-Isolated Memory Stress Testing for LLM Agents
- Compositional Vulnerability Induction in Coding Agents
- Compound Engineering: Systematic Agent Learning Loop
- Conditional Hook Execution: Filter Hooks by Tool Pattern
- Confirmation Gates for Consequential Agent Actions
- Consistent-format customer capture
- Consolidate Agent Tools to Reduce Cognitive Overhead
- Constraint Decay in Backend Code Generation
- Constraint Drift: Why Safety Must Be Maintained, Not Asserted
- Content Exclusion Gap: AI Security Boundaries by Mode
- Context Poisoning: When Hallucinations Become Premises
- Context-Injected Error Recovery for AI Agent Development
- Contextual Capability Calibration for Multi-Agent Delegation
- Continual Learning for AI Agents: Three Layers of Knowledge Accumulation
- Continuous AI (Agentic CI/CD) for AI Agent Development
- Continuous Agent Improvement: Iterating on Agent Quality
- Continuous Autonomous Task Loop
- Continuous Documentation as an Agent-Driven Practice
- Continuous Triage: Automating Issue Classification with AI Workflows
- Control/Data-Flow Separation for Prompt Injection Defense (CaMeL)
- Controlled Benchmark Rewriting for Agent Safety Judgment
- Convention Over Configuration in Agent Workflow Design
- Convergence Detection in Iterative Agent Refinement
- Copilot CLI Agentic Workflows for AI Agent Development
- Copilot Cloud Agent Organization Controls
- Copilot Cloud Agent Three-Phase Execution Model
- Copilot Inline Agent Mode in JetBrains IDEs
- Copilot Memory: Autonomous Cross-Agent Persistence
- Copilot Unified Sessions View and CLI Agent in JetBrains IDEs
- Cost-Aware Agent Design: Route by Complexity, Not Habit
- Cost-Aware Tracing for Skill Distillation
- Coverage-Guided Agents for Fuzz Harness Generation
- Credential Hygiene for Agent Skill Authorship
- Critic Agent Pattern: Dual-Model Plan Review
- Cross-Component Interference in Agent Scaffolds
- Cross-Cycle Consensus Relay
- Cross-Framework Signal Semantics: Re-Measure Borrowed Trajectory Rules
- Cross-Session Memory Poisoning for Data Exfiltration (Trojan Hippo)
- Cross-Tool Subagent Comparison
- Cross-Vendor Competitive Routing for LLM Selection
- Cryptographic Governance Audit Trail for AI Agents
- Cursor /multitask: Async Subagent Dispatch in the Editor
- Cursor 3 Agents Window: Parallel Agents and Worktree Isolation
- Cursor SDK: Programmable TypeScript Agent Runtime
- Cursor Self-Hosted Cloud Agents
- Cursor for AI Agent Development
- Customer-Hosted MCP Tunnel: Outbound-Only Connectivity to Private MCP Servers
- DSPy: Programmatic Prompt Optimization for Compound Agent Systems
- Daily-Use Skill Library: Encoding Your Process as Agent Skills
- Data Fidelity Guardrails: Preventing Agent Data Mutation
- Decentralized Memory for Self-Evolving Multi-Agent Systems
- Declarative Multi-Agent Composition
- Decomposed Red-Teaming for Agent Monitors
- Deep Agent Runtime: The Layer Beneath the Harness
- Defense-in-Depth Against Coding Agent Fabrication (Honesty Harness)
- Defense-in-Depth Agent Safety for AI Agent Development
- Deferred Permission Pattern: Headless Agent Session Pausing
- Delta Channels: Bounded Checkpoint Storage for Append-Only Agent State
- Demo-to-Production Gap: When Demos Hide Real Costs
- Designing Agents to Resist Prompt Injection
- Deterministic Orchestration for Structured Modernization
- Developer Attention Management with Parallel Agents
- Direct Prompt Injection via Collaboration (User as Attack Vector)
- Discovering Indirect Injection Vulnerabilities in Your Agent
- Discovery-Only Refactor Pass: Surface Candidates Before Touching Code
- Discrete Phase Separation
- Distillation-Induced Similarity Metrics for Tool-Use Agents
- Distilled Bootstrap Contract: Agent-Authored Repo Setup
- Distributed Computing Parallels in Agent Architecture
- Docker sbx Adoption for Coding Agents
- Documentation-Guided Legacy Migration: Architecture Docs as a C-to-Rust Blueprint
- Domain-Specific Agent Challenges
- Domain-Specific System Prompts with Concrete Examples
- Dominator-Graph Trajectory Invariants for Non-Deterministic Agents
- Dormant Memory Payloads Triggered by Sensitive Topics (Trojan Hippo)
- Dual-Boundary Sandboxing for Secure Agent Execution
- Dual-Budget Control for Search Agents: VOI Scoring Per Action
- Dual-Graph Alignment for Indirect Prompt Injection Defense (AuthGraph)
- Dual-Trace Memory Encoding: Pair Facts with the Scene They Were Learned In
- Dual-Write Append-Mirror for Agent Transcript Externalization
- Durable Interactive Artifacts: Agent Output Outside the Transcript
- Editor and Manager Surface Separation in Agent IDEs
- Effective Feedback Compute (EFC) for Harness Comparison
- Elastic Context Orchestration: A Per-Turn Vocabulary for Long-Horizon Search Agents
- Embedding Inversion: Vector Stores as a Source-Text Disclosure Surface
- Emergent Architecture in AI-Driven Codebases
- Emergent Behavior Sensitivity for AI Agent Development
- Emerging Concepts for AI Agent Development
- Empirical Baseline: Agentic AI Coding Tool Configuration
- Encoding Tacit Knowledge into Agent Improvement Loops
- Enforcing Agent Behavior with Hooks
- Enterprise Agent Hardening: Three Production Gates
- Enterprise Skill Marketplace: Distribution and Quality
- Entropy Reduction Agents: Automated Codebase Hygiene
- Environment Specification as Context: Closing the Version Gap
- Episodic Memory Retrieval for AI Coding Agent Loops
- Error Preservation in Context for AI Agent Development
- Escape Hatches: Unsticking Stuck Agents
- Eval Strategy by Agent Generation: A Structure-to-Eval Locator
- Eval-Driven Development: Write Evals Before Building Agent
- Evaluation-Driven Development for Agent Tools
- Evaluator-Optimizer Pattern for AI Agent Development
- Event Sourcing for Agents: Separating Cognitive Intention
- Event-Driven Agent Routing for Multi-Team AI Pipelines
- Evolving Playbooks: Incremental Context That Preserves Knowledge
- Exception Handling and Recovery Patterns for AI Coding Agents
- Execution Lineage: DAG of Artifacts vs Agent Loops
- Execution-First Delegation: The AI-as-Executor Pattern
- Experience Graphs as Structured Memory for Self-Evolving Agents
- Experiential-Learning Setup Agents with Snapshot Rollback (SetupX)
- External Artifacts Treated as Data, Not Adversarial Input
- Externalization in LLM Agents
- Factory Over Assistant: Orchestrating Parallel Agent Fleets
- Fail-Closed Remote Settings Enforcement for Enterprise Agents
- Failure-Driven Iteration for Improving Agent Workflows
- Fan-Out Synthesis Pattern for AI Agent Development
- Feature List Files for Reliable AI Agent Development
- Feedback as Capability Equalizer: Iterative Feedback Outweighs Model Scale
- File-Based Agent Coordination for AI Agent Development
- Filesystem-Based Tool Discovery for AI Agent Development
- Filter and Aggregate Data in the Execution Environment
- First-Party Agent Composition: Agent-Built Features
- Five-Failure-Layers Diagnostic: Attribute Before Swapping the Model
- Five-Stage Policy Layer Typology for Generalist Agents
- Foresight-Guided Defense Against Infectious Jailbreaks in Multi-Agent Systems
- Forked vs Fresh Subagents: When to Inherit the Parent Conversation
- Formal Process Models as Prompting Scaffolds (Petri Net of Thoughts)
- Four-Layer Taxonomy of Agent Security Risks
- Framework-First Agent Development: An AI Anti-Pattern
- Frameworks
- Functional folder taxonomy
- Gateway Model Routing: Treat the LLM Gateway as a Discovery Source
- Generative Agents Memory Stream: Three-Layer Architecture for Long-Running Agent Sessions
- Generative Provenance Records for Tool-Using Agents
- GitHub Agentic Workflows for Automating Dev Processes
- GitHub Copilot Agent Mode for AI Agent Development
- GitHub Copilot Coding Agent for AI Agent Development
- GitHub Copilot Custom Agents and Skills Extensibility Guide
- GitHub Copilot Dedicated App as Agent-First Surface
- GitHub Copilot Extensions for AI Agent Development
- GitHub Copilot MCP Integration for AI Agent Development
- GitHub Copilot SDK for AI Agent Development
- GitHub Copilot for AI Agent Development
- GitHub Copilot: Harness Engineering for Agent-Ready Code
- GitHub Models in Actions for AI-Driven CI Workflows
- Goal Contract: Separating the Doer from the Done-Checker
- Goal Monitoring and Progress Tracking for Long-Running Agents
- Goal Reframing: The Primary Exploitation Trigger for LLM Agents
- Goal-Driven Autonomous Loop with Budget Cap
- Golden Journeys: Restartability as a First-Class Verification Primitive
- Golden Query Pairs as Continuous Regression Tests for Agents
- Governed Sources of Truth for Analytics Agents (Structure Over Access)
- Graceful Tool-Output Truncation: The PARTIAL Signal
- Grade Agent Outcomes, Not Execution Paths
- Graph of Thoughts: Directed Graph Reasoning for Multi-Path Problems
- Grill Me: Developer-Initiated Plan Interrogation
- Guarding Against URL-Based Data Exfiltration in Agentic Workflows
- Handoff Skill: Structured Context Transfer Between Agent Sessions
- Happy Path Bias: How AI Agents Skip Error Handling
- Harness Bug Detection Patterns
- Harness Design Dimensions and Archetypes
- Harness Engineering (Training Module)
- Harness Engineering for Building Reliable AI Agents
- Harness Hill-Climbing: Eval-Driven Iterative Improvement of Agent Harnesses
- Harness Impermanence: Build Scaffolding To Be Deleted
- Harness Preflight Doctor Command for Agent Diagnostics
- Harness-Memory Coupling as a Design Axis
- Headless Claude in CI: Using -p and --max-turns for Safe Pipeline Integration
- Headless-First Services: APIs for Agent Consumers
- Heartbeat-Bound Hierarchical Credentials for Agent Swarms
- Held-Out Test Gap: A Long-Horizon Reward-Hacking Signal
- Heuristic-Based Effort Scaling in Agent System Prompts
- History Anchors: Consistency-Cued Continuation of Unsafe Prior Actions
- Hook Catalog for Claude Code Enforcement
- Hooks and Lifecycle Events: Intercepting Agent Behavior
- How the Four Agent Engineering Disciplines Compound
- Human-in-the-Loop Placement: Where and How to Supervise
- Humans and Agents in Software Engineering Loops
- Hybrid Deterministic + Semantic Authorization for Agent Tool Calls
- Hypothesis-Driven Debugging: Instrument Before You Patch
- Idempotent Agent Operations: Safe to Retry
- Idle-Time Speculative Planning for ReAct Agents
- Improper Output Handling: Validate Agent Output Before Downstream Use
- In-Thread Side-Channel: Bounded Side Questions Without Losing the Main Task
- Incident Log Investigation Skill: Parallel Queries
- Independent Test Generation in Multi-Agent Code Systems
- Indiscriminate Structured Reasoning on Every Agent Task
- Inference-Time Tool-Call Reviewer: Pre-Execution Feedback for Tool-Calling Agents
- Inline Safety Harness with Cascade Verification (FinHarness)
- Interactive Canvases: Agent-Generated Visual Artifacts as Outputs
- Interactive Clarification for Underspecified Tasks
- Interactive Effort Sliders: Per-Turn Reasoning-Budget Controls
- Introspective Skill Generation: Mining Agent Patterns
- Inversion Analysis: Surface Capabilities Competitors Cannot Replicate
- Isometric Harness Ablation: Rank Subsystem Investment by Removing One at a Time
- Issue Requirements Preprocessing: Structured Input Before Code Generation
- Issue-Tracker as Agent Dispatch Surface
- Issue-to-PR Delegation Pipeline for AI Agent Development
- L0 → L1: Making the Repo Readable
- L1 → L2: Adding Feedback Loops
- L2 → L3: Building Mechanical Enforcement
- L3 → L5: Reaching Agent-First
- LLM Agent Bug Fix Taxonomy: 23 Fix Patterns from 930 Real Bugs
- LLM Code Review Overcorrection for AI Agent Development
- LLM Map-Reduce Pattern for Parallel Input Processing
- LLM-as-Judge Evaluation with Human Spot-Checking
- Lane-Based Execution Queueing
- Large-Codebase Coding-Agent Failure Patterns (Sourcegraph Five)
- Lay the Architectural Foundation by Hand Before Delegating
- Layered Context Architecture for AI Agent Development
- Layered Domain Architecture: A Prescriptive Default for Agent-Built Code
- Layered Instruction Scopes: Global, Project, and Directory
- Layered Mutability: Governing Persistent Self-Modifying Agents
- Lazy Worktree Isolation: Enter the Worktree on First Write, Not on Dispatch
- Lethal Trifecta Threat Model for AI Agent Development
- Lexical-First Retrieval for Agentic Search: When BM25 Is Enough
- Lifecycle-Integrated Security Architecture for Agent Harnesses
- Live Browser as Agent Context Channel
- Lock-State Safeguards for Desktop-Controlling Agents
- Long-Running Agents: Durability and Resumability Across Sessions
- Loop Detection for AI Agents: Stopping Micro-Loops
- Loop Strategy Spectrum: Accumulated vs Fresh Context
- MCP Client Design: Building Robust Host-Side Logic
- MCP Client/Server Architecture Best Practices Guide
- MCP Runtime Control Plane: Policy Evaluation Between Agent and Tool
- MCP Server Design: Building Agent-Friendly Servers
- MCP: The Open Protocol Connecting Agents to External Tools
- Machine-Readable Error Responses for AI Agents (RFC 9457)
- Magentic Orchestration: Task-Ledger-Driven Adaptive Multi-Agent Planning
- Making Application Observability Legible to Agents
- Managed vs Self-Hosted Agent Harness: Deployment Trade-offs
- Memory Reinforcement Learning (MemRL)
- Memory Synthesis: Extracting Lessons from Execution Logs
- Memory Transfer Learning: Cross-Domain Memory Reuse in Coding Agents
- Memory-Induced Tool-Drift in LLM Agents
- Mermaid as Agent Output Format: When to Ask for a Diagram Instead of Prose
- Meta-Engineering Harness for Production AI-Native Software Delivery
- Method Map: Failure-Mode to Smallest-Artifact Triage
- Mid-Trajectory Guardrail Selection for Multi-Step Tool Calls
- Mise en Place for Agentic Coding
- Model Deprecation Lifecycle for Agent Workloads
- Model a Single Agent Turn as Many Inference and Tool-Call
- Model-ID-as-Dependency: Migration Protocol for Deprecation Churn
- Model-Neutral Agent Architecture: Model Portability Over Cloud Portability
- Monitor Tool: Event Streaming from Background Scripts
- Monolith-to-Sub-Agents Refactor: Five Lessons from a Brittle Prototype
- Monorepo Skill and Agent Discovery: Hierarchical Configuration
- Monotonic Capability Attenuation for Composition-Safe Tool Use
- Most-Restrictive-Wins Fusion for Parallel Agent Control Returns
- Multi-Agent RAG for Spec-to-Test Automation
- Multi-Agent SE Design Patterns: A Taxonomy Across 94 Papers
- Multi-Agent Systems: Coordination and Orchestration
- Multi-Agent Topology Taxonomy: Centralised, Decentralised
- Multi-Model Plan Synthesis for System Architecture
- Multi-Repo and No-Repo Coding Agent Automation Templates
- Multi-Shape BYOK Provider: Declare API Family per Endpoint
- Multi-Turn Conversation Evaluation: Per-Turn and Trace-Level Scoring Together
- Multitenant RAG: Closing the Relevance-Authorization Gap
- Nonstandard Errors in AI Agents: Model-Family Variance
- OAuth Client ID Metadata Documents (CIMD) for MCP Servers
- OWASP LLM Top 10 (2025): Agent Security Crosswalk
- Objective Drift: When Agents Lose Sight of the Goal
- Observability-Driven Harness Evolution
- Observation Contract Preservation in Tool-Augmented Agents
- Observation-Driven Coordination: CRDT-Based Parallel Agent
- On-Demand Skill Hooks: Session-Scoped Guardrails via Skill Invocation
- One-Click CI Auto-Fix: Human-Triggered Cloud-Agent Remediation for Failing GitHub Actions
- One-Shot Record and Deterministic Replay for Periodic Agent Tasks
- Open Agent School Pattern Mapping for Practitioners
- OpenAI Agents SDK Sandboxes Harness and Memory
- OpenAI Agents SDK for AI Agent Development
- OpenAPI as the Source of Truth for Agent Tool Definitions
- OpenTelemetry for AI Agent Observability and Tracing
- Opponent Processor / Multi-Agent Debate Pattern
- Oracle Poisoning: Knowledge Graph Corruption Against Tool-Using Agents
- Oracle-Based Task Decomposition for AI Agent Development
- Orchestrator-Worker Pattern for AI Agent Development
- Overeager-Behavior Elicitation: Scope + Trap Fragments as a Diagnostic for Out-of-Scope Tool Calls
- Override Pattern: Reusing Interactive Commands in Automated Pipelines
- PASS@(k,T): Evaluate RL for Agents Along Sampling and Interaction Depth
- PEEK: Orientation Cache for Recurring-Context Agents
- PII Tokenization in Agent Context
- Parallel Agent Sessions Shift the Bottleneck from Writing
- Parallel Polyglot Ports as a Spec-Ambiguity Oracle
- Parameter-Keyed Caching and Dependency-Aware Parallelism for Plan-Execute Pipelines
- Parsimonious Agent Routing for Multi-Agent Dispatch
- Pattern Replication Risk in Agentic Code Generation
- Pattern Selection Map: Trade-off Matrix for This Site's Patterns
- Patterns: Agent Design, Multi-Agent, and Anti-Patterns
- Per-Call Budget Hints on Tool Invocations
- Per-Model Harness Tuning: Treating the Backing Model as a Harness Variable
- Per-Tool Extended Reasoning Opt-In: Tool-Call-Scoped Budgets
- Per-User Supervisor Process for Background Agent Sessions
- Permission Framework Choice Outweighs Model Choice for Limiting Overeager Actions
- Permission-Gated Custom Commands for AI Agent Development
- Permutation Frameworks for Batch Code Generation
- Persistent Shared Search Sub-Agent for Output-Token Reuse
- Persistent-Connection Agent Transport
- Persona-as-Code: Defining Agent Roles as Structured Docs
- Phase-Specific Context Assembly for AI Agent Development
- Plan Compliance in Agents: Measure What They Execute, Not What You Wrote
- Plan-Then-Execute as the Default for Web Agents
- Plugin Dependency Declaration and Disable-Chain Hints
- Plugin and Extension Packaging: Distributing Agent Capabilities
- Poka-Yoke for Agent Tools: Mistake-Proof Tool Interfaces
- Portable Agent Definitions: Full-Stack Identity as Code
- Pre-Change Impact Analysis: Dependency Maps That Prevent Agent Regressions
- Pre-Completion Checklists for AI Agent Development
- Pre-Execution Codebase Exploration for AI Coding Agents
- Pre-Install Plugin Transparency: Capability Inventory and Cost Projection
- Pre-Trust Execution Surface in Coding Agent Harnesses
- Prebuilt Agent Environments: Amortising Cloud Agent Cold Start with Custom Container Images
- Premature Completion: Agents That Declare Success Too Early
- Prior Dominance Over Feedback in Agent Optimization Loops
- Privacy-Preserving LLM Requests: Eight Techniques and a Practical Combination
- Proactive Idle-Time Anticipation (ProAct)
- Process Amplification: Scaling Human Work with Agents
- Product-as-IDE: When the Application Becomes the Development
- Production Hosting Topology for Self-Hosted Agent SDK Runtimes
- Production MCP Agent Stack: Sequencing Six Decisions into One Deployment
- Programmatic Agent Session Export via `claude agents --json`
- Programmatic Cloud-Agent Dispatch via REST API and Webhooks
- Progressive Autonomy: Scaling Trust with Model Evolution
- Progressive Disclosure for Layered Agent Definitions
- Progressive Spend Threshold Alerting for Agent Cost Governance
- Project Writing Skill: House Style as Model-Invocable Skill
- Prompt Caching: Architectural Discipline for Agents
- Prompt Chaining: Sequential LLM Calls for Agent Workflows
- Prompt Injection: A First-Class Threat to Agentic Systems
- Proprietary-to-Open-Standard Tool Migration (Copilot Extensions to MCP)
- Protecting Sensitive Files from Agent Context Access
- Prototype Before Optimizing: Establish Quality Baselines Before Token Constraints
- Provenance-Aware Decision Auditing for LLM Agents
- Public-Channel Agent Work as Lehrwerkstatt for Team Learning
- QA Session to Issues Pipeline for AI Agent Development
- Quality Score Rubric and Simplification Log for Agent Harnesses
- RAG Architecture as a Poisoning Robustness Decision
- RAG over Thinking Traces: Index Reasoning Trajectories Instead of Documents
- RL-Trained Automated Red Teamers for Prompt Injection Discovery
- Rainbow Deployments for Agents: Gradual Version Migration
- Reactive Environment Hooks: CwdChanged and FileChanged
- Reasoning Budget Allocation: The Reasoning Sandwich
- Recursive Best-of-N Delegation
- Red-Green-Refactor with Agents: Tests as the Spec
- Reflective Prompt Evolution with Pareto Selection (GEPA)
- Remote Agent Host Sessions over SSH and Dev Tunnels
- Remote Session Control for Local CLI Agents
- Repository Bootstrap Checklist: Wiring Agent Support
- Repository Map Pattern: AST + PageRank for Dynamic Code
- Repository Perturbation as Context-Reasoning Diagnosis (RepoMirage)
- Repository-Level Retrieval for Code Generation
- Retrieval-Augmented Agent Workflows: On-Demand Context
- Review-Then-Implement Loop for AI Agent Development
- Rigor Relocation: Engineering Discipline with AI Agents
- Risk-Based Shipping: Review by Risk Matrix, Not by Default
- Risk-Based Task Sizing for Agent Verification Depth
- Role Orchestration on a Single Model
- Rollback-First Design: Every Agent Action Should Be Reversible
- RubricRefine: Pre-Execution Rubric Refinement for Code-Mode Tool Use
- Runbooks as Agent Instructions: Agent-Followable Ops
- Runtime Harness Adaptation: Four-Layer Interface Fixes for Frozen LLM Agents
- Runtime Scaffold Evolution: Agents That Build Tools
- SDLC-Phase Skill Taxonomy: Full-Lifecycle Skill Libraries
- SKILL.md Frontmatter Reference: All Fields Explained
- Safe Command Allowlisting: Reducing Approval Fatigue
- Safe Outputs Pattern for Trustworthy Agent Responses
- Sandbox + Approvals + Auto-Review Governance Triad
- Sandboxed Coding Environments: Containers vs MicroVMs vs OS-Level Isolators
- Scaffold Architecture Taxonomy for Coding Agents
- Scheduled Instruction File Fact-Checker for Accuracy
- Schema-Guided Graph Retrieval
- Scope Sandbox Rules to Harness-Owned Tools, Not Third-Party
- Scoped Credentials via Proxy Outside the Agent Sandbox
- Seamless Background-to-Foreground Handoff
- Secrets Management for AI Agents: Credential Injection
- Security Drift in Iterative LLM Code Refinement
- Security for AI Agent Development
- Selective Autonomy from Copilot Feedback
- Selective Checkpoint Restore Across Code and Conversation State
- Selective Network Access in Agent Sandboxes: The allowNetwork Pattern
- Self-Discover Reasoning: LLM-Composed Reasoning Structures
- Self-Healing Production Agent: Automated Regression Detection and Autofix PR
- Self-Healing Tool Routing
- Self-Rewriting Meta-Prompt Loop
- Semantic Context Loading: Language Server Plugins for Agents
- Semantic Intent Validation for Agent Skills
- Semantic Issue Search from Chat vs Query Syntax
- Semantic Tool Output: Designing for Agent Readability
- Separation of Knowledge and Execution in Agent Systems
- Session Harness Sandbox Separation for Long-Running Agents
- Session Initialization Ritual: How Agents Orient Themselves
- Session Recap: Goal-Shaped Handoff at Context Boundaries
- Shadow Tech Debt Created by Autonomous AI Agent Commits
- Shared Agent Context Store API: When to Expose Curated Context as an Endpoint
- Shared Context Bundle Registry for Agent Teams
- Simulation and Replay Testing for Agent Verification
- Single-Branch Git for Agent Swarms: A Trade-Off Pattern
- Single-CLI Agent Platform: Create to Production in One CLI
- Single-Layer Prompt Injection Defence Anti-Pattern
- Six-Shape Approval Response Taxonomy: Beyond Binary Allow/Deny
- Skeleton Projects as Agent Scaffolding
- Skill Authoring Patterns: Description to Deployment
- Skill Library Evolution: Lifecycle Governance for Agents
- Skill Library Refinement Loops: Organisational Feedback for Shared Skills
- Skill Library Technical Debt: Library-Time Maintenance for Agent Skills
- Skill Program Functions: Executable Guardrails Compiled From Past Failures
- Skill Supply-Chain Poisoning
- Skill Tool as Enforcement: Loading Command Prompts at Runtime
- Skill as Instruction Surface and Callable API (Interpreter Skills)
- Skill as Knowledge Pattern for AI Agent Development
- Solver-Externalized Constraint Reasoning (MaxSAT/SMT Encoding)
- Source-Grounded Test Plan with Pre-Action Assertion Annotation
- Sparse-Checkout Worktrees for Monorepo Agent Isolation
- Spec-Driven Development with Spec Kit
- Specialist Orchestrated Queuing for Multi-Agent SE (SPOQ)
- Specialized Agent Roles for Effective AI Pipelines
- Specialized Small Language Models as Agent Sub-Tools
- Sprint Contracts: Pre-Coding Success Agreements for Multi-Agent Tasks
- Staged Evidence Gates for Agentic Program Repair
- Staggered Agent Launch: Preventing Thundering-Herd in Swarms
- Stakeholder Trust Through Evals and Observability
- Steering Running Agents: Mid-Run Redirection and Follow-Ups
- Stochastic-Deterministic Boundary as First-Class Contract
- StopFailure Hook: Observability for API Error Termination
- Strained Coherence as a Pre-Failure Signal in Agent Trajectories
- Strategy Over Code Generation: Why AI Speed Doesn't Fix Wrong Goals
- Structure Prompts with Static Content First to Maximize Cache Hits
- Structured Agentic Software Engineering (SASE)
- Structured Domain Retrieval: Knowledge Graphs and Case-Based Reasoning
- Structured Task Graphs as External Agent Memory (Beads)
- Sub-Agents for Fan-Out Research and Context Isolation
- Subagent OTel Trace Correlation via agent_id Attribute
- Subagent Schema-Level Tool Filtering for AI Agents
- Subprocess-per-Session Hosting Model: Four Session-Lifecycle Topology Patterns
- Subtask-Level Memory for Software Engineering Agents
- Sufficiency-Tightness Decomposition for Agent-Authored Permissions
- Syntax-Aware Completion Routing for Local-First Code Models
- System-Level Optimization Pipeline
- Task Feasibility Awareness: Stop Before You Start
- Task-Based Access Control with Hybrid Inspection
- Task-Specific Agents vs Role-Based Agents
- Team Onboarding for AI Agent Workflows and Adoption
- Temporary Compensatory Mechanisms in Agent Harnesses
- Tenant Model Policy: Organization-Scoped Rules for AI Model Selection
- Terminal Tools for Agents: send_to_terminal and Background Interaction
- Test Evolution Blind Spot in Coding Agents
- The 7 Phases of AI-Assisted Feature Development
- The AI Development Maturity Model: From Skeptic to Agentic
- The AX Stack: A Layered Model of an AI Coding Agent's Prompt-to-Compile Path
- The Advisor Strategy: Frontier Model as Strategic Advisor
- The Agent Stack Bet: Architectural Decisions for Production Agents
- The Bottleneck Migration When Humans Supervise Agents
- The Copy-Paste Agent Anti-Pattern in AI Development
- The Delegation Decision: When to Use an Agent vs Do It Yourself
- The Plan-First Loop: Always Design Before Writing Code
- The Ralph Wiggum Loop: Fresh-Context Iteration Pattern
- The Reasoning-Complexity Trade-off
- The Research-Plan-Implement Pattern
- The Think Tool: Mid-Stream Reasoning for AI Agents
- The Yes-Man Agent: Compliance Without Verification
- Three Reasoning Spaces: Plan, Bead, and Code
- Three-Vector Evasion Taxonomy for Agent Security Tests
- Throwaway-Prototype Skill: Build to Discard, Keep Only the Answer
- Tiered Memory Architecture: Episodic-to-Semantic Consolidation for Long-Running Agents
- Tiled Agent Layout: Supervising Parallel Agents Through Dedicated Panes
- Token-Cost Profiling and Reduction for Always-On Agentic Workflows
- Tool Calling Schema Standards for AI Agent Development
- Tool Confirmation Carousel: Batched UI for Per-Call Approvals
- Tool Description Quality for Effective Agent Guidance
- Tool Engineering Principles for AI Agent Development
- Tool Engineering: Designing and Managing AI Agent Tooling
- Tool Minimalism and High-Level Prompting
- Tool Necessity Probing: Reading Tool-Call Decisions From Hidden States
- Tool Preamble: User-Visible Status Updates Before Tool Calls
- Tool Signing and Signature Verification for Agents
- Tool-Invocation Attack Surface in Coding Agents
- Tool-Use Sim-to-Real Perturbation Taxonomy
- Toolset Agentization: Wrapping Co-Used Tools as Sub-Agents
- Training Modules
- Trajectory Decomposition: Diagnose Where Coding Agents Fail
- Trajectory Logging via Progress Files and Git History
- Trajectory Pre-Filter for Failure Diagnosis (TrajAudit)
- Trajectory-Opaque Evaluation Gap: Why Final-Output Grading Misses Safety Violations
- Treat Task Scope as a Security Boundary
- Treating Agent Safety as Uniform Across a Session (Cold-Start Safety Gap)
- Trusting Tool Error Messages as Implicit Authority (Error-Path Injection)
- Typed Memory from VCS History: When It Pays Off
- Typed Schemas at Agent Boundaries for Multi-Agent Systems
- Unix CLI as the Native Tool Interface for AI Agents
- Use a Public-Web Index to Gate Automatic URL Fetching
- Using the Agent to Analyze Its Own Evaluation Transcripts
- Utility-Model Split: Background Tasks on a Cheaper Model
- VS Code Agents App: Agent-Native Parallel Task Execution
- Velocity-Quality Asymmetry: Why AI Speed Gains Fade
- Verification Ledger for Tracking Agent Output Quality
- Verification-Centric Development for AI-Generated Code
- Verify-Gated Completion as Admission Control
- Visual-Prompt Agent Steering (Cursor Design Mode)
- Voting / Ensemble Pattern for AI Agent Development
- WIP=1 and Little's Law: Kanban Throughput Theory for Agent Task Design
- Web Search Agent Loop: Iterative Research Patterns
- WebMCP: Browser-Hosted Tool Contracts for In-Page AI Agents
- Windows Sandboxing for Coding Agents
- Workflows for AI Agent Development
- Worktree Isolation: Parallel Agent Sessions in Safe Sandboxes
- Write Tool Descriptions as Agent Onboarding Documents
agentless
ai-coding-assistants
anti-pattern
- AI Agent Development Anti-Patterns and Failure Modes
- Abstraction Bloat in AI Agent-Generated Code Output
- Agent Headcount as a Vanity Metric
- Agent-Laundered Bug Reports
- Assumption Propagation: Compounding Agent Misunderstandings
- Boring Technology Bias: When Agents Recommend by Popularity
- Cargo Cult Agent Setup: Copying Without Understanding
- Coding-Agent Misalignment Forms (Seven-Symptom Taxonomy)
- Comprehension Debt from AI-Generated Code Velocity
- Context Poisoning: When Hallucinations Become Premises
- Cross-Component Interference in Agent Scaffolds
- Demo-to-Production Gap: When Demos Hide Real Costs
- Direct Prompt Injection via Collaboration (User as Attack Vector)
- Distractor Interference: Why Relevance Is Not Enough
- Dynamic Tool Fetching Destroys KV Cache Performance
- External Artifacts Treated as Data, Not Adversarial Input
- Framework-First Agent Development: An AI Anti-Pattern
- Happy Path Bias: How AI Agents Skip Error Handling
- Indiscriminate Structured Reasoning on Every Agent Task
- LLM Code Review Overcorrection for AI Agent Development
- LLM Self-Review Failure in Code Modernization Tasks
- Large-Codebase Coding-Agent Failure Patterns (Sourcegraph Five)
- Law of Triviality in AI PRs for AI Agent Development
- MCP Allowlist by Label, Not by Identity (serverName Trap)
- Memory-Induced Tool-Drift in LLM Agents
- Mid-Session Config Changes as Invisible Cache Invalidators
- Objective Drift: When Agents Lose Sight of the Goal
- PR Scope Creep as a Human Review Bottleneck
- Pattern Replication Risk in Agentic Code Generation
- Patterns: Agent Design, Multi-Agent, and Anti-Patterns
- Perceived Model Degradation: Why Vibes Are Not Evals
- Permission Framework Choice Outweighs Model Choice for Limiting Overeager Actions
- Premature Completion: Agents That Declare Success Too Early
- Prior Dominance Over Feedback in Agent Optimization Loops
- Prompt as Security Knob
- Prompt-Only Tool Access Control
- Refactoring Runaway: Tangled Refactorings in Agent Patches
- Run-Status vs Task-Status Confusion in Autonomous Agent Runs
- Shadow Tech Debt Created by Autonomous AI Agent Commits
- Single-Layer Prompt Injection Defence Anti-Pattern
- Skill Atrophy: When AI Reliance Erodes Developer Capability
- Slopsquatting: Hallucinated Package Names as a Supply-Chain Vector
- Spec Complexity Displacement: When Specs Become Code
- Stale AI Configuration Artifacts (Context Rot)
- Symptom-Reduction-as-Root-Cause: Why Oracle Tests Alone Miss Architectural Drift
- The Anthropomorphized Agent for AI Agent Development
- The Context Ceiling -- Where AI Fails Expert Architects
- The Copy-Paste Agent Anti-Pattern in AI Development
- The Effortless AI Fallacy for AI Agent Development
- The Implicit Knowledge Problem for AI Coding Agents
- The Infinite Context Anti-Pattern in Agent Systems
- The Kitchen Sink Session Anti-Pattern in AI Agents
- The Prompt Tinkerer Anti-Pattern in Agent Workflows
- The Reasoning-Complexity Trade-off
- The Test Homogenization Trap: When LLM-Generated Tests Mirror Model Blind Spots
- The Yes-Man Agent: Compliance Without Verification
- Token Preservation Backfire for AI Agent Development
- Treating Agent Safety as Uniform Across a Session (Cold-Start Safety Gap)
- Trust Without Verify: Skipping Agent Output Checks
- Trusting Human Review to Catch Deliberate Agent Sabotage
- Trusting Tool Error Messages as Implicit Authority (Error-Path Injection)
- Vibe Coding: Outcome-Oriented Agent-Assisted Development
- bypassPermissions Silently Overrides allowedTools (The Restricted-Bypass Trap)
articles
arxiv
- AI Bot CI/CD Workflow Reliability by Agent
- AOCI: Symbolic-Semantic Repository Indexing
- AST-Grounded Critic Loop for Documentation Maintenance
- AST-Guided Agent Memory for Repository-Level Code Generation
- AX/UX/DX Triad: Three Experience Layers in Agent Systems
- Abstention-Aware Memory Retrieval for Coding Agents
- Adaptive Generate-Rank-Verify Under Costly Verification
- Agent JIT Compilation: Compile Tasks Into Executable Plans
- Agent-Generated Code Maintenance Asymmetry
- Agentic AI Architecture: From Prompt to Goal-Directed
- Agentic Detection and Response at the MCP Boundary
- Attention Latch: When Agents Stay Anchored to Stale Instructions
- Authority Confusion: Untrusted Context Must Not Authorize Side Effects
- Behavioral Drivers of Coding Agent Success and Failure
- Behavioral Firewall for Tool-Call Trajectories
- Bootstrapping Coding Agents: The Specification Is the Program
- CARE: Three-Party Stage-Gated Agent Engineering
- CRA-Only Review and the Merge Rate Gap
- CausalFlow: Counterfactual Repair for Failed Agent Trajectories
- Chain-of-Verification for Coding Agents
- Chunking Strategy for RAG-Based Code Completion
- Clarification Mode Amplifies Prompt Injection
- Classifying and Auto-Correcting Coding Agent Misbehaviors (Wink)
- Closed-Loop Agent Training from Tool Schemas
- CoALA Decision-Making Loop as an Orchestration Lens
- CoALA Memory Taxonomy as a Classifier for Harness Artifacts
- Code Cleanliness as an Agent Cost Lever
- Coding-Agent Misalignment Forms (Seven-Symptom Taxonomy)
- Completion Failure Taxonomy: Why Code Suggestions Miss
- ComplexMCP: Three Bottlenecks in Large Interdependent Tool Sandboxes
- Component-Isolated Memory Stress Testing for LLM Agents
- Component-Wise RAG Prioritization for Software Engineering Tasks
- Compositional Vulnerability Induction in Coding Agents
- Configuration File Structure Does Not Drive Compliance
- Constraint Decay in Backend Code Generation
- Constraint Drift: Why Safety Must Be Maintained, Not Asserted
- Constraint Encoding Does Not Fix Constraint Compliance
- Contractual Skill Files: Inspectable SKILL.md for Enterprise Agents
- Control Lexical Leakage in Agent-Memory Retrieval Evals (Entity-Collision)
- Control/Data-Flow Separation for Prompt Injection Defense (CaMeL)
- Controlled Benchmark Rewriting for Agent Safety Judgment
- Cost-Aware Skill Rewriting: Preserve Operational Anchors, Not Skill Tokens
- Cost-Aware Tracing for Skill Distillation
- Cross-Component Interference in Agent Scaffolds
- Cross-Framework Signal Semantics: Re-Measure Borrowed Trajectory Rules
- Cross-Lingual Prompt Preprocessing (Local-LLM Token Arbitrage)
- Cross-Session Memory Poisoning for Data Exfiltration (Trojan Hippo)
- Decentralized Memory for Self-Evolving Multi-Agent Systems
- Decomposed Red-Teaming for Agent Monitors
- Designing Agents to Resist Prompt Injection
- Developer Control Strategies for AI Coding Agents
- Direct-Apply Review Comments via Cloud Agent
- Discovery-Only Refactor Pass: Surface Candidates Before Touching Code
- Distillation-Induced Similarity Metrics for Tool-Use Agents
- Distilled Bootstrap Contract: Agent-Authored Repo Setup
- Documentation-Guided Legacy Migration: Architecture Docs as a C-to-Rust Blueprint
- Dormant Memory Payloads Triggered by Sensitive Topics (Trojan Hippo)
- Dual-Budget Control for Search Agents: VOI Scoring Per Action
- Dual-Graph Alignment for Indirect Prompt Injection Defense (AuthGraph)
- Elastic Context Orchestration: A Per-Turn Vocabulary for Long-Horizon Search Agents
- Encoding Values in AGENTS.md: Why Prose Without Verification Fails
- Enterprise Agent Hardening: Three Production Gates
- Environment Specification as Context: Closing the Version Gap
- Evolving Playbooks: Incremental Context That Preserves Knowledge
- Failure-Aware Observability for Multi-Agent LLM Systems
- Feedback as Capability Equalizer: Iterative Feedback Outweighs Model Scale
- Foresight-Guided Defense Against Infectious Jailbreaks in Multi-Agent Systems
- Function-Level Debugger Interfaces for Coding Agents
- GEO for Technical Docs: Developer Documentation Checklist
- Goal Reframing: The Primary Exploitation Trigger for LLM Agents
- Harness Design Dimensions and Archetypes
- Heartbeat-Bound Hierarchical Credentials for Agent Swarms
- History Anchors: Consistency-Cued Continuation of Unsafe Prior Actions
- Human-AI Review Synergy in Agentic Code Review
- Inference-Time Tool-Call Reviewer: Pre-Execution Feedback for Tool-Calling Agents
- Inline Safety Harness with Cascade Verification (FinHarness)
- Instruction-Guided Code Completion: Controlling What Models Generate
- Interaction-Pattern Evaluation for Agentic PRs
- Interactive Clarification for Underspecified Tasks
- Issue Requirements Preprocessing: Structured Input Before Code Generation
- Iterative Binary Feedback for Pattern Adherence
- LLM Agent Bug Fix Taxonomy: 23 Fix Patterns from 930 Real Bugs
- LLM Refactoring Adoption Patterns
- LLM Self-Review Failure in Code Modernization Tasks
- LLM-Driven Benchmark Auditing
- LLM-Driven Logical Retrieval: Boolean Queries over an Inverted Index
- LLM-Pinned Library Versions Carry Systemic CVE Exposure
- Layered Mutability: Governing Persistent Self-Modifying Agents
- Learned Prefix Monitors for Agent Traces
- Lifecycle-Integrated Security Architecture for Agent Harnesses
- Measuring Synthetic Eval Data Quality (SynAE)
- Memory Transfer Learning: Cross-Domain Memory Reuse in Coding Agents
- Memory-Induced Tool-Drift in LLM Agents
- Meta-Engineering Harness for Production AI-Native Software Delivery
- Mid-Trajectory Guardrail Selection for Multi-Step Tool Calls
- Mise en Place for Agentic Coding
- Monotonic Capability Attenuation for Composition-Safe Tool Use
- Multi-Agent RAG for Spec-to-Test Automation
- Multi-Layer Specification Redundancy as a Robustness Budget
- Multitenant RAG: Closing the Relevance-Authorization Gap
- Narrative Problem Reformulation for Code Generation
- Observability-Driven Harness Evolution
- Observation Contract Preservation in Tool-Augmented Agents
- Offline Trajectory Replay for Multi-Agent Workflow Debugging
- One-Shot Record and Deterministic Replay for Periodic Agent Tasks
- OpenAPI Documentation Smells for Agent-Ready APIs
- PASS@(k,T): Evaluate RL for Agents Along Sampling and Interaction Depth
- PEEK: Orientation Cache for Recurring-Context Agents
- PR Description Style as a Lever for Agent PR Merge Rates
- Parallel Polyglot Ports as a Spec-Ambiguity Oracle
- Parameter-Keyed Caching and Dependency-Aware Parallelism for Plan-Execute Pipelines
- Permission Framework Choice Outweighs Model Choice for Limiting Overeager Actions
- Persistent Shared Search Sub-Agent for Output-Token Reuse
- Phantom Symbol Detection for LLM API Migration
- Plan Compliance in Agents: Measure What They Execute, Not What You Wrote
- Plan-Then-Execute as the Default for Web Agents
- Pre-Execution Codebase Exploration for AI Coding Agents
- Precise Debugging: Measure Edit Precision, Not Just Test Pass Rate
- Privacy-Preserving LLM Requests: Eight Techniques and a Practical Combination
- Proactive Idle-Time Anticipation (ProAct)
- Prompt as Security Knob
- Provenance-Aware Decision Auditing for LLM Agents
- RAG Architecture as a Poisoning Robustness Decision
- Refactoring Runaway: Tangled Refactorings in Agent Patches
- Repository Perturbation as Context-Reasoning Diagnosis (RepoMirage)
- Reverse-Engineered Executable Specifications for Agentic Program Repair
- Risk-Score Threshold Calibration for Auto-Approval
- Role Orchestration on a Single Model
- RubricRefine: Pre-Execution Rubric Refinement for Code-Mode Tool Use
- Runtime Harness Adaptation: Four-Layer Interface Fixes for Frozen LLM Agents
- Runtime Scaffold Evolution: Agents That Build Tools
- SUDP: Secret-Use Delegation Protocol for Agentic Systems
- Scaffold Architecture Taxonomy for Coding Agents
- Selective Autonomy from Copilot Feedback
- Self-Discover Reasoning: LLM-Composed Reasoning Structures
- Skill Library Technical Debt: Library-Time Maintenance for Agent Skills
- Skill Program Functions: Executable Guardrails Compiled From Past Failures
- Skill Retrieval Realism Gap
- Skill Specification Violation Fuzzing
- Solver-Externalized Constraint Reasoning (MaxSAT/SMT Encoding)
- Source Code Minification for State-in-Context Agents
- Spec-Derived Execution as a Correctness Oracle
- Specialist Orchestrated Queuing for Multi-Agent SE (SPOQ)
- Stale AI Configuration Artifacts (Context Rot)
- Stale Repository Retrieval Induces Incorrect Code
- Strained Coherence as a Pre-Failure Signal in Agent Trajectories
- Structural Coverage Criteria for Agent Workflows
- Structure-Aware Diff Labeling with Two-Stage LLM Pipelines
- Structured Agentic Software Engineering (SASE)
- Structured Domain Retrieval: Knowledge Graphs and Case-Based Reasoning
- Subtask-Level Memory for Software Engineering Agents
- Sufficiency-Tightness Decomposition for Agent-Authored Permissions
- Suggestion Gating: Fewer Completions, Better DX
- Symptom-Reduction-as-Root-Cause: Why Oracle Tests Alone Miss Architectural Drift
- Syntax-Aware Completion Routing for Local-First Code Models
- Task Alignment: The Selective-Compliance Gap Benchmarks Miss
- Task Feasibility Awareness: Stop Before You Start
- Task-Based Access Control with Hybrid Inspection
- Test Evolution Blind Spot in Coding Agents
- Test-Driven Intent Clarification: Tests as Intermediate Alignment Artifacts
- The Reasoning-Complexity Trade-off
- The Security Review Gap in AI-Authored PRs
- The Test Homogenization Trap: When LLM-Generated Tests Mirror Model Blind Spots
- Three-Vector Evasion Taxonomy for Agent Security Tests
- Tiered Memory Architecture: Episodic-to-Semantic Consolidation for Long-Running Agents
- Token-Efficient Code Generation: Structural Beats Prompting
- Tool Cloning and Provenance Assessment in Agent Ecosystems
- Tool-Invocation Attack Surface in Coding Agents
- Tool-Use Sim-to-Real Perturbation Taxonomy
- Trajectory Pre-Filter for Failure Diagnosis (TrajAudit)
- Trajectory-Opaque Evaluation Gap: Why Final-Output Grading Misses Safety Violations
- Trusting Tool Error Messages as Implicit Authority (Error-Path Injection)
- Typed Memory from VCS History: When It Pays Off
- Validating Token-Optimized Formats Inside Agentic Loops
- Verify-Gated Completion as Admission Control
- What is GEO — Generative Engine Optimization Defined
automation
autonomous-agents
claude
- Advanced Tool Use: Scaling Agent Tool Libraries
- Agent Observability with OpenTelemetry and Trajectory Logging
- Agent Project State Purge: Clean-Slate Session Reset
- Agent View: Dispatch-Attach-Monitor Surface for Parallel Sessions
- Agent-Generated Onboarding Guide as a Durable Artefact
- Channels Permission Relay
- Claude Agent SDK: Building Custom Agentic Workflows
- Claude Code --bare Flag
- Claude Code /batch and Worktrees for AI Agent Development
- Claude Code Agent Teams for Collaborative AI Workflows
- Claude Code Auto Mode: Classifier-Based Permission Gating
- Claude Code Dynamic Workflows
- Claude Code Extension Points: When to Use What
- Claude Code Feature Flags and Environment Variables
- Claude Code Hooks: Deterministic Lifecycle Automation
- Claude Code Review
- Claude Code Sub-Agents for Delegating Complex Tasks
- Claude Code for AI Agent Development
- Cloud Parallel Review Pattern
- Cloud Planning with Inline-Comment Review and Execute-Anywhere Choice
- Cloud-Scheduled Routines vs Local Session Scheduling
- Conditional Hook Execution: Filter Hooks by Tool Pattern
- Context-Window Diagnostic Tooling: Identifying Context-Heavy Tools
- Deferred Permission Pattern: Headless Agent Session Pausing
- Directory-Aware Plugin Suggestions via `pluginSuggestionMarketplaces`
- Effort-Aware Hooks: Reading the Reasoning Tier from PreToolUse and PostToolUse
- Enforcing Agent Behavior with Hooks
- Enterprise-Managed Plugin Governance for Agent CLIs
- Evidence-Based Allowlist Auto-Discovery for Agents
- Exclude Dynamic System Prompt Sections for Cross-Machine Cache Sharing
- Extended Prompt Cache TTL for Long Agent Sessions
- Fail-Closed Remote Settings Enforcement for Enterprise Agents
- Filesystem-Based Tool Discovery for AI Agent Development
- Gateway Model Routing: Treat the LLM Gateway as a Discovery Source
- Handoff Skill: Structured Context Transfer Between Agent Sessions
- Hard-Deny Classifier Rule: Unconditional Block in Auto Mode
- Headless Claude in CI: Using -p and --max-turns for Safe Pipeline Integration
- Hierarchical CLAUDE.md: Structuring Context Files at Multiple Levels
- Hook Catalog for Claude Code Enforcement
- Hook Exec Form vs Shell Form: Shell-Injection-Safe Hook Commands
- Hooks Invoking MCP Tools: Closing the Loop Between Policy and Tool Execution
- In-Session Transcript Search: Navigating Long Agent Conversations
- Lazy Worktree Isolation: Enter the Worktree on First Write, Not on Dispatch
- Local Plugin Scaffolding via `claude plugin init` and Auto-Loaded `.claude/skills`
- MCP Elicitation: Servers Requesting Structured Input Mid-Task
- MCP Tool Result Persistence via _meta Annotation
- Managed Settings Drop-In Directory: Enterprise Policy Fragmentation
- MessageDisplay Hook: Transforming Assistant Text at the Display Boundary
- Monitor Tool: Event Streaming from Background Scripts
- Multi-Tenant Isolation Knobs for Shared-Container Agent SDK Hosting
- On-Demand Skill Hooks: Session-Scoped Guardrails via Skill Invocation
- Out-of-Band Hook Notifications via terminalSequence
- Per-Plugin Token-Cost Attribution via claude plugin details
- Plan Mode: Read-Only Exploration Before Implementation
- Plan mode for knowledge artifacts
- Plugin Background Monitors: Declarative Supervision Auto-Armed at Session Start
- Plugin Dependency Declaration and Disable-Chain Hints
- Plugin-Activated Main-Agent Override and Bin/ PATH Injection
- Post-Compaction Re-read Protocol for Agent Continuity
- PostToolBatch Hook: Once-Per-Decision-Cycle Injection at the Batch Boundary
- PostToolUse Hook for BSD/GNU CLI Incompatibilities
- PostToolUse Hooks: Auto-Formatting on Every File Edit
- PostToolUse Output Replacement: Hooks That Rewrite Tool Results
- PostToolUse continueOnBlock: Refusal With a Load-Bearing Reason
- PowerShell Tool: Native Windows Shell for Claude Code
- Pre-Install Context-Cost Projection in Plugin Marketplaces
- Pre-Install Plugin Transparency: Capability Inventory and Cost Projection
- PreCompact Hook: Vetoing Compaction at Lifecycle Boundaries
- Production Hosting Topology for Self-Hosted Agent SDK Runtimes
- Production System Prompt Architecture and Techniques
- Programmatic Agent Session Export via `claude agents --json`
- Reactive Environment Hooks: CwdChanged and FileChanged
- Reloading Skills Mid-Session in Claude Code
- Safe Command Allowlisting: Reducing Approval Fatigue
- Session Scheduling with Loop and Cron in Claude Code
- Skill Eval Loop
- Skill Shell Execution Gate: Disabling Inline Shell from Skills
- Skill disallowed-tools Frontmatter: Skill-Layer Tool Denial
- Sparse-Checkout Worktrees for Monorepo Agent Isolation
- StopFailure Hook: Observability for API Error Termination
- Subagent OTel Trace Correlation via agent_id Attribute
- Subprocess PID Namespace Sandboxing in Claude Code
- Subprocess-per-Session Hosting Model: Four Session-Lifecycle Topology Patterns
- The Advisor Strategy: Frontier Model as Strategic Advisor
- Tools: Claude Code, Cursor, and GitHub Copilot
- Turn-Level Context Decisions: Continue, Rewind, Clear, Compact, or Delegate
- Video Transcript Skill: Converting Recordings to Docs
- Workload Identity Federation for Agent Runtimes
- claudeMdExcludes: Selective Ancestor Instruction-File Exclusion
code-generation
- Chunking Strategy for RAG-Based Code Completion
- Completion Failure Taxonomy: Why Code Suggestions Miss
- Constraint Degradation in AI Code Generation
- Constraint Encoding Does Not Fix Constraint Compliance
- Instruction-Guided Code Completion: Controlling What Models Generate
- Repository-Level Retrieval for Code Generation
code-review
- AIRA: Inspection Framework for AI-Generated Code
- Agent PR Volume vs. Value: The Productivity Paradox
- Agent Self-Review Loop for Iterative Self-Improvement
- Agent-Assisted Code Review: Agents as PR First Pass
- Agent-Authored PR Integration and Merge Predictors
- Agent-Driven PR Slicing
- Agent-Generated Code Maintenance Asymmetry
- Agent-Proposed Merge Resolution
- Agentic Code Review Architecture With Tool-Calling
- Agentic Code Review Patterns and Review Architectures
- Always-On Agentic PR Security Review
- Batched Suggestion Application: Bulk-Apply Agent Fixes on PRs
- CRA-Only Review and the Merge Rate Gap
- Claude Code Review
- Cloud Parallel Review Pattern
- Code Cleanliness as an Agent Cost Lever
- Committee Review Pattern for Multi-Agent Code Review
- Compositional Vulnerability Induction in Coding Agents
- Copilot CLI Agentic Workflows for AI Agent Development
- Deferred Standards Enforcement via Review Agents
- Diff-Based Review: Focus on Changes, Not Complete Outputs
- Direct-Apply Review Comments via Cloud Agent
- Documentation-Guided Legacy Migration: Architecture Docs as a C-to-Rust Blueprint
- Engineering: Tools, Review, Verification, Security, and Observability
- Human-AI Review Synergy in Agentic Code Review
- Interaction-Pattern Evaluation for Agentic PRs
- Interactive Canvases: Agent-Generated Visual Artifacts as Outputs
- Law of Triviality in AI PRs for AI Agent Development
- PR Description Style as a Lever for Agent PR Merge Rates
- PR Scope Creep as a Human Review Bottleneck
- Phantom Symbol Detection for LLM API Migration
- Precise Debugging: Measure Edit Precision, Not Just Test Pass Rate
- Predicting Which AI-Generated Functions Will Be Deleted
- Reproduce-Before-Report Verification Gate
- Review-Feedback-to-Rule Loop: Promoting Recurring PR Comments into Harness Rules
- Review-Then-Apply CLI Flag for In-Process Auto-Fix
- Review-Then-Implement Loop for AI Agent Development
- Reviewer's Playbook for Agent-Authored Pull Requests
- Risk-Score Threshold Calibration for Auto-Approval
- Self-Improving Code Review Agents — Learned Rules
- Signal Over Volume in AI Review for AI Agent Development
- Structure-Aware Diff Labeling with Two-Stage LLM Pipelines
- The Bottleneck Migration When Humans Supervise Agents
- The Security Review Gap in AI-Authored PRs
- Three-Depth In-Session Security Review
- Tiered Code Review: AI-First with Human Escalation
- Tunable Effort Levels for Code Review Agents
- Velocity-Quality Asymmetry: Why AI Speed Gains Fade
complexity¶
context-engineering¶
- @import Composition Pattern for Agent Instruction Files
- ACDL: A Language for Describing Agentic LLM Contexts
- AGENTS.md as a Table of Contents, Not an Encyclopedia
- AOCI: Symbolic-Semantic Repository Indexing
- AST-Guided Agent Memory for Repository-Level Code Generation
- Advanced Tool Use: Scaling Agent Tool Libraries
- Agent Memory Patterns: Learning Across Conversations
- Agent-Computer Interface (ACI): Tool Design as UX Discipline
- Agent-Powered Codebase Q&A and Onboarding Workflow
- Air-Gapped RAG: Chunking Strategies
- Air-Gapped RAG: Retrieval and Re-Ranking
- Attention Latch: When Agents Stay Anchored to Stale Instructions
- Attention Sinks: Why First Tokens Always Win
- Batch File Operations via Bash Scripts for AI Agents
- CLI Scripts as Agent Tools: Return Only What Matters
- CLI-IDE-GitHub Context Ladder for AI Agent Development
- Chunking Strategy for RAG-Based Code Completion
- Claude Code Dynamic Workflows
- CoALA Memory Taxonomy as a Classifier for Harness Artifacts
- Codebase-Derived Pattern Libraries as Agent Context
- Component-Wise RAG Prioritization for Software Engineering Tasks
- Compound Engineering: Systematic Agent Learning Loop
- Configuration File Structure Does Not Drive Compliance
- Consistent-format customer capture
- Context Budget Allocation: Spending Every Token Wisely
- Context Compression Strategies: Offloading and Summarisation
- Context Engineering (Training Module)
- Context Engineering: Shaping AI Agent Input and Attention
- Context Engineering: The Practice of Shaping Agent Context
- Context Hub: On-Demand Versioned API Docs for Coding Agents
- Context Poisoning: When Hallucinations Become Premises
- Context Priming: Pre-Loading Files for AI Agent Tasks
- Context Window Anxiety: Countering Premature Task Closure
- Context Window Management: Understanding the Dumb Zone
- Context-Injected Error Recovery for AI Agent Development
- Context-Usage Attribution: Per-Source Breakdown of Agent Context
- Context-Window Diagnostic Tooling: Identifying Context-Heavy Tools
- Convenience Loops and AI-Friendly Code in Your Stack
- Copilot Memory: Autonomous Cross-Agent Persistence
- Copilot Spaces: Curated Context Collections for Grounding
- Critical Instruction Repetition via Primacy and Recency
- Cross-Functional Knowledge Artifacts
- Cross-Lingual Prompt Preprocessing (Local-LLM Token Arbitrage)
- Cross-Repo Agent Search: GitHub-API-Backed Text Search Beyond the Workspace
- Deferred Standards Enforcement via Review Agents
- Disable Attribution Headers to Preserve KV Cache in Local Inference
- Discoverable vs Non-Discoverable Context for Agents
- Distractor Interference: Why Relevance Is Not Enough
- Distributed Computing Parallels in Agent Architecture
- Documentation-Grounding MCP Servers for Vendor SDKs
- Dynamic System Prompt Composition
- Dynamic Tool Fetching Destroys KV Cache Performance
- Elastic Context Orchestration: A Per-Turn Vocabulary for Long-Horizon Search Agents
- Encode Project Conventions in Distributed AGENTS.md Files
- Environment Specification as Context: Closing the Version Gap
- Episodic Memory Retrieval for AI Coding Agent Loops
- Error Preservation in Context for AI Agent Development
- Evaluating AGENTS.md: When Context Files Hurt More Than Help
- Event-Driven System Reminders for AI Agent Development
- Evolving Playbooks: Incremental Context That Preserves Knowledge
- Exclude Dynamic System Prompt Sections for Cross-Machine Cache Sharing
- Extended Prompt Cache TTL for Long Agent Sessions
- Filesystem-Based Tool Discovery for AI Agent Development
- Filter and Aggregate Data in the Execution Environment
- Formal Process Models as Prompting Scaffolds (Petri Net of Thoughts)
- Foundations: Context Engineering and Instructions
- Frontmost-Window Snapshot as Agent Context
- Functional folder taxonomy
- Goal Recitation: Countering Drift in Long Sessions
- Governed Sources of Truth for Analytics Agents (Structure Over Access)
- Graceful Tool-Output Truncation: The PARTIAL Signal
- Guardrails Beat Guidance: Rule Design for Coding Agents
- Handoff Skill: Structured Context Transfer Between Agent Sessions
- Hierarchical CLAUDE.md: Structuring Context Files at Multiple Levels
- Hints Over Code Samples in Agent Prompts
- How the Four Agent Engineering Disciplines Compound
- In-Thread Side-Channel: Bounded Side Questions Without Losing the Main Task
- Indexed Regex Search for Agent Tools
- Instruction-Guided Code Completion: Controlling What Models Generate
- LLM Map-Reduce Pattern for Parallel Input Processing
- LLM-Driven Logical Retrieval: Boolean Queries over an Inverted Index
- Lay the Architectural Foundation by Hand Before Delegating
- Layered Context Architecture for AI Agent Development
- Layered Instruction Scopes: Global, Project, and Directory
- Live Browser as Agent Context Channel
- Living-Docs-Grounded Agent Design Conversations
- Lost in the Middle: Understanding U-Shaped Attention
- MCP Tool Result Persistence via _meta Annotation
- MCP alwaysLoad: Classifying Servers as Eager or Just-in-Time
- Manual Compaction Strategy for Dumb Zone Mitigation
- Memory Synthesis: Extracting Lessons from Execution Logs
- Mid-Session Config Changes as Invisible Cache Invalidators
- Mise en Place for Agentic Coding
- Narrative Problem Reformulation for Code Generation
- Next Edit Suggestions Paradigm for AI Agent Development
- Objective Drift: When Agents Lose Sight of the Goal
- Observation Masking: Filter Tool Outputs from Context
- Open Agent School Pattern Mapping for Practitioners
- PEEK: Orientation Cache for Recurring-Context Agents
- PII Tokenization in Agent Context
- Phase-Specific Context Assembly for AI Agent Development
- Plan Mode: Read-Only Exploration Before Implementation
- Plan files as resumable artifacts
- Plan mode for knowledge artifacts
- Post-Compaction Re-read Protocol for Agent Continuity
- Pre-Execution Codebase Exploration for AI Coding Agents
- PreCompact Hook: Vetoing Compaction at Lifecycle Boundaries
- Production System Prompt Architecture and Techniques
- Prompt Cache Economics: Comparing Costs by Provider
- Prompt Caching: Architectural Discipline for Agents
- Prompt Chaining: Sequential LLM Calls for Agent Workflows
- Prompt Compression: Maximizing Signal Per Token
- Prompt Injection: A First-Class Threat to Agentic Systems
- Prompt Layering: How Instructions Stack and Override
- Prototype Before Optimizing: Establish Quality Baselines Before Token Constraints
- RAG Architecture as a Poisoning Robustness Decision
- RAG over Thinking Traces: Index Reasoning Trajectories Instead of Documents
- Repository Map Pattern: AST + PageRank for Dynamic Code
- Repository Perturbation as Context-Reasoning Diagnosis (RepoMirage)
- Repository-Level Retrieval for Code Generation
- Retrieval-Augmented Agent Workflows: On-Demand Context
- Role Orchestration on a Single Model
- Schema-Guided Graph Retrieval
- Seeding Agent Context: Breadcrumbs in Code
- Selective Rewind Summarization: Compress Earlier Turns, Keep Recent Ones Intact
- Semantic Caching for Multi-Agent Code Systems
- Semantic Context Loading: Language Server Plugins for Agents
- Semantic Density Optimization for Agent Codebases
- Semantic Tool Output: Designing for Agent Readability
- Session Initialization Ritual: How Agents Orient Themselves
- Session Recap: Goal-Shaped Handoff at Context Boundaries
- Shared Context Bundle Registry for Agent Teams
- Single-Layer Prompt Injection Defence Anti-Pattern
- Skill Context Isolation: Forking the Skill into a Subagent Window
- Source Code Minification for State-in-Context Agents
- Stale AI Configuration Artifacts (Context Rot)
- Stale Repository Retrieval Induces Incorrect Code
- Structure Prompts with Static Content First to Maximize Cache Hits
- Structured Domain Retrieval: Knowledge Graphs and Case-Based Reasoning
- Structured Task Graphs as External Agent Memory (Beads)
- Sub-Agents for Fan-Out Research and Context Isolation
- Subtask-Level Memory for Software Engineering Agents
- System Prompt Altitude: Specific Without Being Brittle
- Team OS: Coding-Agent Repo as Cross-Functional Team Brain
- Terminal Tool Output Compression: Filtering Predictable Noise at the Harness
- Test Harness Design for LLM Context Windows
- The Context Ceiling -- Where AI Fails Expert Architects
- The Infinite Context Anti-Pattern in Agent Systems
- The Instruction Compliance Ceiling: How Rule Count Limits AI
- The Kitchen Sink Session Anti-Pattern in AI Agents
- The Plan-First Loop: Always Design Before Writing Code
- The Research-Plan-Implement Pattern
- The Specification as Prompt: Existing Artifacts as Agent
- The Task Framing Irrelevance Fallacy in Agent Prompting
- Three Knowledge Tiers: Sourced, Unverified, Hallucinated
- Token-Efficient Code Generation: Structural Beats Prompting
- Token-Efficient Tool Design: Tools That Don't Eat Your Context
- Tokenizer Swap Tax: Budgeting for Model Migrations That Change Token Counts
- Turn-Level Context Decisions: Continue, Rewind, Clear, Compact, or Delegate
- Typed Memory from VCS History: When It Pays Off
- Ubiquitous Language for AI Plans
- Validating Token-Optimized Formats Inside Agentic Loops
- When to Use Examples vs Rules in Agent Instructions
- llms.txt: Making Your Project Discoverable to AI Agents
copilot¶
- AGENTS.md Design Patterns: Commands, Boundaries, Personas
- Agent Environment Bootstrapping for AI Agent Development
- Agent HQ (Multi-Agent Platform) for AI Agent Development
- Agent Mission Control for Orchestrating Agent Tasks
- Agentic Code Review Architecture With Tool-Calling
- Auto Model Selection: Harness-Driven Routing per Task
- Cloud-Agent Tiered Model Routing: Cheap Tiers for Bounded Tasks
- Cloud-Local Agent Handoff for AI Agent Development
- Cohort Segmentation in the Copilot Usage Metrics API
- Content Exclusion Gap: AI Security Boundaries by Mode
- Copilot CLI Agentic Workflows for AI Agent Development
- Copilot CLI BYOK and Local Model Support
- Copilot Cloud Agent Organization Controls
- Copilot Cloud Agent Three-Phase Execution Model
- Copilot Inline Agent Mode in JetBrains IDEs
- Copilot Memory: Autonomous Cross-Agent Persistence
- Copilot Spaces: Curated Context Collections for Grounding
- Copilot Unified Sessions View and CLI Agent in JetBrains IDEs
- Critic Agent Pattern: Dual-Model Plan Review
- Cross-IDE Plugin Discovery: One Install Surface, Many Consuming Agents
- Dependabot Agent Assignment for AI-Driven Vulnerability Remediation
- Dominator-Graph Trajectory Invariants for Non-Deterministic Agents
- Enterprise-Managed Plugin Governance for Agent CLIs
- GitHub Agentic Workflows for Automating Dev Processes
- GitHub Copilot Advanced Patterns: Multi-Agent and Automation
- GitHub Copilot Agent Mode for AI Agent Development
- GitHub Copilot Coding Agent for AI Agent Development
- GitHub Copilot Custom Agents and Skills Extensibility Guide
- GitHub Copilot Dedicated App as Agent-First Surface
- GitHub Copilot Extensions for AI Agent Development
- GitHub Copilot MCP Integration for AI Agent Development
- GitHub Copilot Platform Surface Map: All Capabilities
- GitHub Copilot SDK for AI Agent Development
- GitHub Copilot Training Modules for Engineering Teams
- GitHub Copilot for AI Agent Development
- GitHub Copilot: Context Engineering & Agent Workflows
- GitHub Copilot: Customization Primitives and Stack
- GitHub Copilot: Harness Engineering for Agent-Ready Code
- GitHub Copilot: Model Selection, Routing, and Costs
- GitHub Copilot: Team Adoption and Governance Guide
- GitHub Models in Actions for AI-Driven CI Workflows
- MCP LLM Sampling: Servers Requesting AI Inference Mid-Tool
- Managing Agent Skills from the GitHub CLI with gh skill
- Monorepo Skill and Agent Discovery: Hierarchical Configuration
- Next Edit Suggestions Paradigm for AI Agent Development
- One-Click CI Auto-Fix: Human-Triggered Cloud-Agent Remediation for Failing GitHub Actions
- Org-Membership-Gated Agent Entitlement
- Pre-Execution Risk Classification for Terminal Commands
- Proprietary-to-Open-Standard Tool Migration (Copilot Extensions to MCP)
- Review-Then-Implement Loop for AI Agent Development
- Semantic Issue Search from Chat vs Query Syntax
- Shared Agent Context Store API: When to Expose Curated Context as an Endpoint
- Tenant Model Policy: Organization-Scoped Rules for AI Model Selection
- Terminal Tool Output Compression: Filtering Predictable Noise at the Harness
- Tools: Claude Code, Cursor, and GitHub Copilot
- copilot-instructions.md as a Repo-Level Instruction Convention
cost-performance¶
- Adaptive Generate-Rank-Verify Under Costly Verification
- Adaptive Sandbox Fan-Out Controller
- Advanced Tool Use: Scaling Agent Tool Libraries
- Agent Composition: Chains, Fan-Out, Pipelines, Supervisors
- Agent JIT Compilation: Compile Tasks Into Executable Plans
- Agent Observability with OpenTelemetry and Trajectory Logging
- Air-Gapped RAG: Local Embeddings and Vector Stores
- Air-Gapped RAG: Local LLM Inference
- Asynchronous Agent I/O and Speculative Tool Calling
- Auto Model Selection: Harness-Driven Routing per Task
- BYOK Model Token Visibility: Closing the Observability Gap on Self-Hosted Routes
- Background Todo Agent: Offload Plan Maintenance to a Lightweight Model
- Batch File Operations via Bash Scripts for AI Agents
- Benchmark-Driven Tool Selection for Code Generation
- Bounded Batch Dispatch for Parallel Agent Execution
- CLI Scripts as Agent Tools: Return Only What Matters
- Claude Code Dynamic Workflows
- Claude Code Feature Flags and Environment Variables
- Cloud-Agent Tiered Model Routing: Cheap Tiers for Bounded Tasks
- Code Cleanliness as an Agent Cost Lever
- Code Interpreter as a Primary Agent Tool
- Code-Health-Gated LLM Tier Routing
- Cognitive Reasoning vs Execution: A Two-Layer Agent
- Cohesion-Aware Task Partitioning for Multi-Agent Coding
- Component-Wise RAG Prioritization for Software Engineering Tasks
- Consolidate Agent Tools to Reduce Cognitive Overhead
- Constraint Encoding Does Not Fix Constraint Compliance
- Context Budget Allocation: Spending Every Token Wisely
- Context Compression Strategies: Offloading and Summarisation
- Context Hub: On-Demand Versioned API Docs for Coding Agents
- Contextual Capability Calibration for Multi-Agent Delegation
- Continuous Triage: Automating Issue Classification with AI Workflows
- Copilot CLI BYOK and Local Model Support
- Copilot vs Claude Billing Semantics for Enterprise Teams
- Cost-Aware Agent Design: Route by Complexity, Not Habit
- Cost-Aware Skill Rewriting: Preserve Operational Anchors, Not Skill Tokens
- Cost-Aware Tracing for Skill Distillation
- Cross-Component Interference in Agent Scaffolds
- Cross-Lingual Prompt Preprocessing (Local-LLM Token Arbitrage)
- Cross-Vendor Competitive Routing for LLM Selection
- DSPy: Programmatic Prompt Optimization for Compound Agent Systems
- Deterministic Orchestration for Structured Modernization
- Disable Attribution Headers to Preserve KV Cache in Local Inference
- Dual-Budget Control for Search Agents: VOI Scoring Per Action
- Dynamic Tool Fetching Destroys KV Cache Performance
- Edit Format Selection: Diff vs. Search-Replace vs. Full Rewrite
- Effective Feedback Compute (EFC) for Harness Comparison
- Effort-Aware Hooks: Reading the Reasoning Tier from PreToolUse and PostToolUse
- Evaluating AGENTS.md: When Context Files Hurt More Than Help
- Exclude Dynamic System Prompt Sections for Cross-Machine Cache Sharing
- Extended Prompt Cache TTL for Long Agent Sessions
- Fan-Out Synthesis Pattern for AI Agent Development
- Feedback as Capability Equalizer: Iterative Feedback Outweighs Model Scale
- Filesystem-Based Tool Discovery for AI Agent Development
- Filter and Aggregate Data in the Execution Environment
- First-Party Agent Composition: Agent-Built Features
- Framework-First Agent Development: An AI Anti-Pattern
- Function-Level Debugger Interfaces for Coding Agents
- Future-Based Asynchronous Function Calling
- Gateway Model Routing: Treat the LLM Gateway as a Discovery Source
- Google Search Console Monitoring Workflow
- Headless Claude in CI: Using -p and --max-turns for Safe Pipeline Integration
- Heuristic-Based Effort Scaling in Agent System Prompts
- Hint-Driven Concurrency for Read-Only MCP Tools
- How the Four Agent Engineering Disciplines Compound
- Human-Equivalent Hours for Autonomous Coding Agent Productivity
- Idle-Time Speculative Planning for ReAct Agents
- Indiscriminate Structured Reasoning on Every Agent Task
- Interactive Effort Sliders: Per-Turn Reasoning-Budget Controls
- LLM-Driven Logical Retrieval: Boolean Queries over an Inverted Index
- Lexical-First Retrieval for Agentic Search: When BM25 Is Enough
- MCP Client Design: Building Robust Host-Side Logic
- MCP Client/Server Architecture Best Practices Guide
- MCP Server Design: Building Agent-Friendly Servers
- MCP alwaysLoad: Classifying Servers as Eager or Just-in-Time
- Machine-Readable Error Responses for AI Agents (RFC 9457)
- Mid-Session Config Changes as Invisible Cache Invalidators
- Model Deprecation Lifecycle for Agent Workloads
- Model-ID-as-Dependency: Migration Protocol for Deprecation Churn
- Model-Neutral Agent Architecture: Model Portability Over Cloud Portability
- Multi-Model Plan Synthesis for System Architecture
- Multi-Shape BYOK Provider: Declare API Family per Endpoint
- Observation Masking: Filter Tool Outputs from Context
- Observation-Driven Coordination: CRDT-Based Parallel Agent
- One-Shot Record and Deterministic Replay for Periodic Agent Tasks
- Open Agent School Pattern Mapping for Practitioners
- OpenAPI Documentation Smells for Agent-Ready APIs
- OpenAPI as the Source of Truth for Agent Tool Definitions
- Opponent Processor / Multi-Agent Debate Pattern
- Parameter-Keyed Caching and Dependency-Aware Parallelism for Plan-Execute Pipelines
- Parsimonious Agent Routing for Multi-Agent Dispatch
- Pattern Selection Map: Trade-off Matrix for This Site's Patterns
- Per-Call Budget Hints on Tool Invocations
- Per-Plugin Token-Cost Attribution via claude plugin details
- Per-Tool Extended Reasoning Opt-In: Tool-Call-Scoped Budgets
- Perceived Model Degradation: Why Vibes Are Not Evals
- Persistent Shared Search Sub-Agent for Output-Token Reuse
- Persistent-Connection Agent Transport
- Plan Mode: Read-Only Exploration Before Implementation
- Pre-Install Context-Cost Projection in Plugin Marketplaces
- Pre-Install Plugin Transparency: Capability Inventory and Cost Projection
- Proactive Idle-Time Anticipation (ProAct)
- Production MCP Agent Stack: Sequencing Six Decisions into One Deployment
- Progressive Spend Threshold Alerting for Agent Cost Governance
- Prompt Cache Economics: Comparing Costs by Provider
- Prompt Caching: Architectural Discipline for Agents
- Prompt Compression: Maximizing Signal Per Token
- Prototype Before Optimizing: Establish Quality Baselines Before Token Constraints
- Reasoning Budget Allocation: The Reasoning Sandwich
- Reflective Prompt Evolution with Pareto Selection (GEPA)
- Retrieval-Augmented Agent Workflows: On-Demand Context
- Role Orchestration on a Single Model
- Security Budget as Token Economics
- Self-Healing Tool Routing
- Semantic Caching for Multi-Agent Code Systems
- Semantic Context Loading: Language Server Plugins for Agents
- Semantic Density Optimization for Agent Codebases
- Semantic Tool Output: Designing for Agent Readability
- Source Code Minification for State-in-Context Agents
- Specialist Orchestrated Queuing for Multi-Agent SE (SPOQ)
- Specialized Small Language Models as Agent Sub-Tools
- Structure Prompts with Static Content First to Maximize Cache Hits
- Syntax-Aware Completion Routing for Local-First Code Models
- Task Feasibility Awareness: Stop Before You Start
- Tenant Model Policy: Organization-Scoped Rules for AI Model Selection
- The Advisor Strategy: Frontier Model as Strategic Advisor
- The Infinite Context Anti-Pattern in Agent Systems
- The Kitchen Sink Session Anti-Pattern in AI Agents
- The Plan-First Loop: Always Design Before Writing Code
- Token-Cost Profiling and Reduction for Always-On Agentic Workflows
- Token-Efficient Code Generation: Structural Beats Prompting
- Token-Efficient Tool Design: Tools That Don't Eat Your Context
- Tokenizer Swap Tax: Budgeting for Model Migrations That Change Token Counts
- Tool Calling Schema Standards for AI Agent Development
- Tool Description Quality for Effective Agent Guidance
- Tool Engineering (Training Module)
- Tool Engineering Principles for AI Agent Development
- Tool Necessity Probing: Reading Tool-Call Decisions From Hidden States
- Toolset Agentization: Wrapping Co-Used Tools as Sub-Agents
- Tunable Effort Levels for Code Review Agents
- Unbounded Consumption: Bounding Agent Resource Use Against DoS and Denial-of-Wallet
- Unix CLI as the Native Tool Interface for AI Agents
- Utility-Model Split: Background Tasks on a Cheaper Model
- Validating Token-Optimized Formats Inside Agentic Loops
- Variance-Based RL Sample Selection
- Voting / Ensemble Pattern for AI Agent Development
- pass@k and pass^k: Capability and Consistency Metrics
cursor¶
- Cursor /multitask: Async Subagent Dispatch in the Editor
- Cursor 3 Agents Window: Parallel Agents and Worktree Isolation
- Cursor Multi-Root Workspaces for Cross-Repo Agent Edits
- Cursor SDK: Programmable TypeScript Agent Runtime
- Cursor Self-Hosted Cloud Agents
- Cursor for AI Agent Development
- Enterprise-Managed Plugin Governance for Agent CLIs
- Multi-Repo and No-Repo Coding Agent Automation Templates
- Self-Improving Code Review Agents — Learned Rules
- Tiled Agent Layout: Supervising Parallel Agents Through Dedicated Panes
- Tunable Effort Levels for Code Review Agents
- Visual-Prompt Agent Steering (Cursor Design Mode)
emerging¶
engineering¶
evals¶
- Agent Development Lifecycle for Agent Products
- Agent Harness: Initializer and Coding Agent Pattern
- Agentic-Agile: Adapting Agile Rituals for Agent Work
- Air-Gapped RAG: Grounding, Citations, and Evaluation
- Anti-Reward-Hacking: Rubrics That Resist Gaming
- Behavioral Drivers of Coding Agent Success and Failure
- Behavioral Testing for Non-Deterministic AI Agents
- Benchmark Contamination as Eval Risk
- Benchmark-Driven Tool Selection for Code Generation
- CARE: Three-Party Stage-Gated Agent Engineering
- CoT Robustness in Code Generation
- Completion Failure Taxonomy: Why Code Suggestions Miss
- ComplexMCP: Three Bottlenecks in Large Interdependent Tool Sandboxes
- Constraint Decay in Backend Code Generation
- Control Lexical Leakage in Agent-Memory Retrieval Evals (Entity-Collision)
- Controlled Benchmark Rewriting for Agent Safety Judgment
- Corpus-Level Trace Diagnostics for LLM Agents
- Coverage-Guided Fuzzing for Multi-Agent LLM Systems (FLARE)
- Cross-Framework Signal Semantics: Re-Measure Borrowed Trajectory Rules
- Decomposed Red-Teaming for Agent Monitors
- Distillation-Induced Similarity Metrics for Tool-Use Agents
- Dominator-Graph Trajectory Invariants for Non-Deterministic Agents
- Eval Awareness: Designing Evals Agents Cannot Recognise
- Eval Engineering (Training Module)
- Eval Strategy by Agent Generation: A Structure-to-Eval Locator
- Eval-Driven Development Training for AI Agent Teams
- Evaluator Templates: Portable Primitives for Agent Eval Suites
- Evaluator-Optimizer Pattern for AI Agent Development
- Golden Query Pairs as Continuous Regression Tests for Agents
- Grade Agent Outcomes, Not Execution Paths
- Grading Strategies for Eval-Driven Development
- Hardening Agent Evals for Production-Grade Reliability
- Harness Hill-Climbing: Eval-Driven Iterative Improvement of Agent Harnesses
- Held-Out Test Gap: A Long-Horizon Reward-Hacking Signal
- Human-Review-Driven Curation of Golden Eval Datasets
- Incident-to-Eval Synthesis: Production Failures as Evals
- Inference-Time Tool-Call Reviewer: Pre-Execution Feedback for Tool-Calling Agents
- Isometric Harness Ablation: Rank Subsystem Investment by Removing One at a Time
- L3 → L5: Reaching Agent-First
- LLM-Driven Benchmark Auditing
- Learned Prefix Monitors for Agent Traces
- Macro Evals for Agentic Systems: Population-Level Behavior Patterns
- Markov-Chain Reliability for LLM Agents: Audit the Abstraction Before You Trust the Metric
- Measuring Synthetic Eval Data Quality (SynAE)
- Multi-Turn Conversation Evaluation: Per-Turn and Trace-Level Scoring Together
- Mutation Testing as a Quality Gate for AI-Generated Test Suites
- Nonstandard Errors in AI Agents: Model-Family Variance
- Observability-Driven Harness Evolution
- Overeager-Behavior Elicitation: Scope + Trap Fragments as a Diagnostic for Out-of-Scope Tool Calls
- PASS@(k,T): Evaluate RL for Agents Along Sampling and Interaction Depth
- Plan Compliance in Agents: Measure What They Execute, Not What You Wrote
- Planted-Bug Methodology: Deliberate Bugs as Observability Calibration
- Precise Debugging: Measure Edit Precision, Not Just Test Pass Rate
- RAG/Agent Reliability Problem Map: 16-Domain Failure Taxonomy
- Repository Perturbation as Context-Reasoning Diagnosis (RepoMirage)
- Skill Eval Loop
- Skill Evals: Measuring Skill Quality as a Dataset-Graded Unit
- Skill Retrieval Realism Gap
- Skill Specification Violation Fuzzing
- Stakeholder Trust Through Evals and Observability
- Step-by-Step: Building Your First Eval-Driven Feature
- Task Alignment: The Selective-Compliance Gap Benchmarks Miss
- Test Evolution Blind Spot in Coding Agents
- Test Harness Design for LLM Context Windows
- The Consistent Capability Fallacy in LLM Agent Design
- The Eval-First Development Loop for AI Agent Features
- The Synthetic Ground Truth Fallacy in Agent Evaluation
- The Test Homogenization Trap: When LLM-Generated Tests Mirror Model Blind Spots
- Tool-Use Sim-to-Real Perturbation Taxonomy
- Traces Need Feedback to Power Learning
- Trajectory Decomposition: Diagnose Where Coding Agents Fail
- Trajectory-Opaque Evaluation Gap: Why Final-Output Grading Misses Safety Violations
- Variance-Based RL Sample Selection
- What Evals Are and Why AI Agents Need Them for Quality
- Writing Your First Agent Evaluation Suite from Scratch
- pass@k and pass^k: Capability and Consistency Metrics
fallacies¶
- AI Knowledge Generation Fallacy: LLMs Recombine, Not Invent
- Chain-of-Thought Reasoning Fallacy: Traces Are Not Truth
- LLM Comprehension Fallacy: When Models Seem to Understand
- Reference: Standards, Human Factors, Emerging, Fallacies, and Articles
- The Consistent Capability Fallacy in LLM Agent Design
- The LLM Laziness Deficit Fallacy: Restraint Comes From Harness, Not Instruction
- The Synthetic Ground Truth Fallacy in Agent Evaluation
- The Task Framing Irrelevance Fallacy in Agent Prompting
foundations¶
frameworks¶
- Agentic Framework Landscape: When Each Framework Fits
- Cognitive Architectures for Language Agents (CoALA): A Classifier for Agent Harnesses
- Consistent-format customer capture
- Cross-Functional Knowledge Artifacts
- Functional folder taxonomy
- L0 → L1: Making the Repo Readable
- L1 → L2: Adding Feedback Loops
- L2 → L3: Building Mechanical Enforcement
- L3 → L5: Reaching Agent-First
- Natural-language git
- Plan files as resumable artifacts
- Plan mode for knowledge artifacts
- Self-Explanation Loop
geo¶
- AI Crawler Policy: robots.txt for the Three-Tier Crawler Landscape
- Answer-First Writing: Structure Content for AI Retrieval
- Assertion Density — Stats and Quotes Over Vague Claims
- Atomic Pages and Chunking — One Concept Per Page for RAG
- GEO for Technical Docs: Developer Documentation Checklist
- Generative Engine Optimization for Developer Sites
- Google Search Console Monitoring Workflow
- How AI Engines Cite — ChatGPT, Perplexity, Claude, Gemini
- Measuring GEO Performance: Metrics, Monitoring, and Tools
- SEO vs GEO — Signals, Metrics, Optimization Targets
- Schema and Structured Data for GEO — AI Citation Guide
- Topical Authority — Entity Coverage for AI Citation
- What is GEO — Generative Engine Optimization Defined
- llms.txt: Full Specification, Adoption, and Limitations
github-actions¶
- AI Bot CI/CD Workflow Reliability by Agent
- Agent Environment Bootstrapping for AI Agent Development
- Claude Code --bare Flag
- Closed-Loop CI Failure Remediation with Cloud Coding Agents
- Continuous Triage: Automating Issue Classification with AI Workflows
- GitHub Agentic Workflows for Automating Dev Processes
- GitHub Models in Actions for AI-Driven CI Workflows
- Headless Claude in CI: Using -p and --max-turns for Safe Pipeline Integration
- One-Click CI Auto-Fix: Human-Triggered Cloud-Agent Remediation for Failing GitHub Actions
- Prebuilt Agent Environments: Amortising Cloud Agent Cold Start with Custom Container Images
hands-on¶
harness-engineering¶
- AX/UX/DX Triad: Three Experience Layers in Agent Systems
- Agent Harness: Initializer and Coding Agent Pattern
- GitHub Copilot: Harness Engineering for Agent-Ready Code
- Goal Monitoring and Progress Tracking for Long-Running Agents
- Golden Journeys: Restartability as a First-Class Verification Primitive
- Harness Bug Detection Patterns
- Harness Composition for Scaled Security Audits
- Harness Design Dimensions and Archetypes
- Harness Engineering for Building Reliable AI Agents
- Harness Preflight Doctor Command for Agent Diagnostics
- Isometric Harness Ablation: Rank Subsystem Investment by Removing One at a Time
- Meta-Engineering Harness for Production AI-Native Software Delivery
- Method Map: Failure-Mode to Smallest-Artifact Triage
- Observability-Driven Harness Evolution
- Per-Model Harness Tuning: Treating the Backing Model as a Harness Variable
- Quality Score Rubric and Simplification Log for Agent Harnesses
- Review-Feedback-to-Rule Loop: Promoting Recurring PR Comments into Harness Rules
- Rigor Relocation: Engineering Discipline with AI Agents
- Runtime Harness Adaptation: Four-Layer Interface Fixes for Frozen LLM Agents
- Scaffold Architecture Taxonomy for Coding Agents
- Temporary Compensatory Mechanisms in Agent Harnesses
- The AX Stack: A Layered Model of an AI Coding Agent's Prompt-to-Compile Path
human-factors¶
- AI Abundance Reshapes Software Engineering Identity
- AI Adoption Footprint: The Segmented Shape of Engineering Orgs
- AI Knowledge Generation Fallacy: LLMs Recombine, Not Invent
- Agent Governance Policies for AI Agent Development
- Agent Headcount as a Vanity Metric
- Agent PR Volume vs. Value: The Productivity Paradox
- Agent-Authored PR Integration and Merge Predictors
- Agent-Driven Greenfield Product Development from Scratch
- Agent-First Software Design for AI Agent Development
- Agent-Generated Code Maintenance Asymmetry
- Agent-Generated Onboarding Guide as a Durable Artefact
- Agent-Laundered Bug Reports
- Agentic Education: Persona Progression for Teaching AI Coding Tools
- Agentic-Agile: Adapting Agile Rituals for Agent Work
- Ambition Scaling: Moving the Target as Model Capability Increases
- Articles: Long-Form Analysis of AI-Driven Development
- Brownfield to Agent-First: Repo Maturity Framework
- Cargo Cult Agent Setup: Copying Without Understanding
- Chain-of-Thought Reasoning Fallacy: Traces Are Not Truth
- Classical SE Patterns as Agent Design Analogues
- Claude Code Auto Mode: Classifier-Based Permission Gating
- Coding-Agent Reversibility: Platform Choice as a Two-Way Door
- Cognitive Load, AI Fatigue, and Sustainable Agent Use
- Cohort Segmentation in the Copilot Usage Metrics API
- Comprehension Debt from AI-Generated Code Velocity
- Confirmation Gates for Consequential Agent Actions
- Convenience Loops and AI-Friendly Code in Your Stack
- Copilot vs Claude Billing Semantics for Enterprise Teams
- Cross-Functional Knowledge Artifacts
- Cross-Tool Translation: Learning from Multiple AI Assistants
- Deliberate AI-Assisted Learning: Accelerating Skill Acquisition
- Developer Attention Management with Parallel Agents
- Developer Control Strategies for AI Coding Agents
- Direct Prompt Injection via Collaboration (User as Attack Vector)
- Earned-Complexity Agent Maturity Ladder
- Empowerment Over Automation for AI Agent Development
- Encoding Tacit Knowledge into Agent Improvement Loops
- Encoding Values in AGENTS.md: Why Prose Without Verification Fails
- Enterprise Skill Marketplace: Distribution and Quality
- Evaluating Agent Patterns Catalog as a Source
- Factory Over Assistant: Orchestrating Parallel Agent Fleets
- Fallacies for AI Agent Development
- Human Impact of AI Agents on Developer Teams and Workflows
- Human-Equivalent Hours for Autonomous Coding Agent Productivity
- Human-Facing Docs in the Agent Era: Mental Models Over Reference
- Human-in-the-Loop Placement: Where and How to Supervise
- Humans and Agents in Software Engineering Loops
- Hyper-Personalized Software: The Return of RAD
- Initiatives and Community: Tracking the Agentic Engineering Landscape
- LLM Comprehension Fallacy: When Models Seem to Understand
- LLM Refactoring Adoption Patterns
- Law of Triviality in AI PRs for AI Agent Development
- Lay the Architectural Foundation by Hand Before Delegating
- Natural-language git
- Nonstandard Errors in AI Agents: Model-Family Variance
- Org-Membership-Gated Agent Entitlement
- PM on the AI Exponential
- PR Description Style as a Lever for Agent PR Merge Rates
- PR Scope Creep as a Human Review Bottleneck
- Parallel Agent Sessions Shift the Bottleneck from Writing
- Plan files as resumable artifacts
- Plan mode for knowledge artifacts
- Polya Small-Steps: Using AI to Think Better, Not Think Less
- Pre-Execution Risk Classification for Terminal Commands
- Predicting Which AI-Generated Functions Will Be Deleted
- Process Amplification: Scaling Human Work with Agents
- Progressive Autonomy: Scaling Trust with Model Evolution
- Public-Channel Agent Work as Lehrwerkstatt for Team Learning
- Reference: Standards, Human Factors, Emerging, Fallacies, and Articles
- Rigor Relocation: Engineering Discipline with AI Agents
- Scope Sandbox Rules to Harness-Owned Tools, Not Third-Party
- Seamless Background-to-Foreground Handoff
- Selective Autonomy from Copilot Feedback
- Self-Explanation Loop
- Skill Atrophy: When AI Reliance Erodes Developer Capability
- Skill Library Refinement Loops: Organisational Feedback for Shared Skills
- Stakeholder Trust Through Evals and Observability
- Steering Running Agents: Mid-Run Redirection and Follow-Ups
- Strategy Over Code Generation: Why AI Speed Doesn't Fix Wrong Goals
- Suggestion Gating: Fewer Completions, Better DX
- Team OS: Coding-Agent Repo as Cross-Functional Team Brain
- Team Onboarding for AI Agent Workflows and Adoption
- The AI Development Maturity Model: From Skeptic to Agentic
- The AX Stack: A Layered Model of an AI Coding Agent's Prompt-to-Compile Path
- The Addictive Flow State of Agent-Assisted Development
- The Anthropomorphized Agent for AI Agent Development
- The Bottleneck Migration When Humans Supervise Agents
- The Consistent Capability Fallacy in LLM Agent Design
- The Context Ceiling -- Where AI Fails Expert Architects
- The Effortless AI Fallacy for AI Agent Development
- The Productivity-Experience Paradox in AI-Assisted Development
- The Prompt Tinkerer Anti-Pattern in Agent Workflows
- The Synthetic Ground Truth Fallacy in Agent Evaluation
- The Task Framing Irrelevance Fallacy in Agent Prompting
- Tiled Agent Layout: Supervising Parallel Agents Through Dedicated Panes
- Tool Confirmation Carousel: Batched UI for Per-Call Approvals
- Tool Preamble: User-Visible Status Updates Before Tool Calls
- Velocity-Quality Asymmetry: Why AI Speed Gains Fade
- Verification-Centric Development for AI-Generated Code
- Vibe Coding: Outcome-Oriented Agent-Assisted Development
- Visible Thinking in AI-Assisted Development
index
instructions
- @import Composition Pattern for Agent Instruction Files
- AGENTS.md Design Patterns: Commands, Boundaries, Personas
- AGENTS.md as a Table of Contents, Not an Encyclopedia
- AGENTS.md: Project-Level README for AI Coding Agents
- Acknowledged-Debt Ledger with Next-Trigger Conditions
- Agent Debugging: Diagnosing Bad Agent Output
- Agent Pushback Protocol for Managing Disagreements
- Architecting a Central Repo for Shared Agent Standards
- Authority Confusion: Untrusted Context Must Not Authorize Side Effects
- Bootstrapping Coding Agents: The Specification Is the Program
- Bootstrapping an Agent-Driven Project from Scratch
- Boring Technology Bias: When Agents Recommend by Popularity
- CLAUDE.md Convention for Structuring Agent Instructions
- Cargo Cult Agent Setup: Copying Without Understanding
- Classifier-Gated Auto-Permission for Cloud-IDE Coding Agents
- Claude Code Extension Points: When to Use What
- Claude Code Hooks: Deterministic Lifecycle Automation
- Close the Attack-to-Fix Loop: Adversarially Train Agent
- Configuration File Structure Does Not Drive Compliance
- Constraint Degradation in AI Code Generation
- Constraint Encoding Does Not Fix Constraint Compliance
- Content Exclusion Gap: AI Security Boundaries by Mode
- Context Priming: Pre-Loading Files for AI Agent Tasks
- Continuous Agent Improvement: Iterating on Agent Quality
- Contractual Skill Files: Inspectable SKILL.md for Enterprise Agents
- Controlling Agent Output: Concise Answers, Not Essays
- Convention Over Configuration in Agent Workflow Design
- Cost-Aware Skill Rewriting: Preserve Operational Anchors, Not Skill Tokens
- Critical Instruction Repetition via Primacy and Recency
- Daily-Use Skill Library: Encoding Your Process as Agent Skills
- Deferred Standards Enforcement via Review Agents
- Designing Agents to Resist Prompt Injection
- Domain-Specific System Prompts with Concrete Examples
- Dynamic System Prompt Composition
- Empirical Baseline: Agentic AI Coding Tool Configuration
- Encode Project Conventions in Distributed AGENTS.md Files
- Encoding Values in AGENTS.md: Why Prose Without Verification Fails
- Enforcing Agent Behavior with Hooks
- Evaluating AGENTS.md: When Context Files Hurt More Than Help
- Event-Driven System Reminders for AI Agent Development
- Feature List Files for Reliable AI Agent Development
- Five-Stage Policy Layer Typology for Generalist Agents
- Foundations: Context Engineering and Instructions
- Frozen Spec File: Preserving Intent in AI Agent Sessions
- Functional folder taxonomy
- GROUNDING.md: Field-Scoped Hard Constraints and Convention Parameters
- Getting Started: Setting Up Your Instruction File
- GitHub Copilot Custom Agents and Skills Extensibility Guide
- Goal Recitation: Countering Drift in Long Sessions
- Goal Reframing: The Primary Exploitation Trigger for LLM Agents
- Google ADK Skills: Portable SKILL.md Across ADK Agents
- Grill Me: Developer-Initiated Plan Interrogation
- Guardrails Beat Guidance: Rule Design for Coding Agents
- HTML as Agent Output Format: When to Ask for HTML Instead of Markdown
- Hard-Deny Classifier Rule: Unconditional Block in Auto Mode
- Heuristic-Based Effort Scaling in Agent System Prompts
- Hierarchical CLAUDE.md: Structuring Context Files at Multiple Levels
- Hints Over Code Samples in Agent Prompts
- Hook Catalog for Claude Code Enforcement
- Hooks for Enforcement vs Prompts for Guidance: When to Use Each
- How the Four Agent Engineering Disciplines Compound
- Instruction Polarity: Positive Rules Over Negative
- Instructions: System Prompts, Rules, and Agent Configuration
- Interactive Clarification for Underspecified Tasks
- Iterative Binary Feedback for Pattern Adherence
- Layered Instruction Scopes: Global, Project, and Directory
- Living-Docs-Grounded Agent Design Conversations
- Managed Settings Drop-In Directory: Enterprise Policy Fragmentation
- Mermaid as Agent Output Format: When to Ask for a Diagram Instead of Prose
- MessageDisplay Hook: Transforming Assistant Text at the Display Boundary
- Method Map: Failure-Mode to Smallest-Artifact Triage
- Multi-Layer Specification Redundancy as a Robustness Budget
- Natural-Language Customization Bootstrap
- Negative Space Instructions: What NOT to Do in Agent Prompts
- Override Pattern: Reusing Interactive Commands in Automated Pipelines
- Permission-Gated Custom Commands for AI Agent Development
- Permutation Frameworks for Batch Code Generation
- Persona-as-Code: Defining Agent Roles as Structured Docs
- Plan Compliance in Agents: Measure What They Execute, Not What You Wrote
- Portable Agent Definitions: Full-Stack Identity as Code
- Post-Compaction Re-read Protocol for Agent Continuity
- PostToolBatch Hook: Once-Per-Decision-Cycle Injection at the Batch Boundary
- PostToolUse continueOnBlock: Refusal With a Load-Bearing Reason
- Pre-Trust Execution Surface in Coding Agent Harnesses
- Production System Prompt Architecture and Techniques
- Project Instruction File Ecosystem
- Project Writing Skill: House Style as Model-Invocable Skill
- Prompt Engineering for Agent Instructions and Systems
- Prompt File Libraries for Reusable Agent Instructions
- Prompt Governance via PRs: Reviewable AI Behaviour
- Prompt Layering: How Instructions Stack and Override
- Prompt-Only Tool Access Control
- Prompt-Rewrite Discipline on Cross-Generation Model Migration
- Protecting Sensitive Files from Agent Context Access
- Reflective Prompt Evolution with Pareto Selection (GEPA)
- Repository Bootstrap Checklist: Wiring Agent Support
- Review-Feedback-to-Rule Loop: Promoting Recurring PR Comments into Harness Rules
- Rule Lifecycle Metadata for Prunable Instruction Surfaces
- SKILL.md Frontmatter Reference: All Fields Explained
- Scheduled Instruction File Fact-Checker for Accuracy
- Scoped Credentials via Proxy Outside the Agent Sandbox
- Security Constitution for AI Code Generation
- Seeding Agent Context: Breadcrumbs in Code
- Self-Explanation Loop
- Shared Context Bundle Registry for Agent Teams
- Skill Authoring Patterns: Description to Deployment
- Skill Program Functions: Executable Guardrails Compiled From Past Failures
- Skill Tool as Enforcement: Loading Command Prompts at Runtime
- Skill as Instruction Surface and Callable API (Interpreter Skills)
- Skill as Knowledge Pattern for AI Agent Development
- Spec Complexity Displacement: When Specs Become Code
- Spec-Driven Development with Spec Kit
- Specialized Agent Roles for Effective AI Pipelines
- Stale AI Configuration Artifacts (Context Rot)
- Standards as Agent Instructions for AI Agent Development
- System Prompt Altitude: Specific Without Being Brittle
- System Prompt Replacement for Domain-Specific Agent Personas
- System Prompt as Secret Store (OWASP LLM07)
- Task List Divergence as Instruction Quality Diagnostic
- The Implicit Knowledge Problem for AI Coding Agents
- The Instruction Compliance Ceiling: How Rule Count Limits AI
- The LLM Laziness Deficit Fallacy: Restraint Comes From Harness, Not Instruction
- The Prompt Tinkerer Anti-Pattern in Agent Workflows
- The Specification as Prompt: Existing Artifacts as Agent
- Three Knowledge Tiers: Sourced, Unverified, Hallucinated
- Throwaway-Prototype Skill: Build to Discard, Keep Only the Answer
- Token Preservation Backfire for AI Agent Development
- Tool Engineering Principles for AI Agent Development
- Tool Minimalism and High-Level Prompting
- Treat Task Scope as a Security Boundary
- Ubiquitous Language for AI Plans
- Usability Pressure as a Silent Security-Regression Vector
- Use a Public-Web Index to Gate Automatic URL Fetching
- WRAP Framework for Writing Agent-Ready Issue Descriptions
- When to Use Examples vs Rules in Agent Instructions
- Write Tool Descriptions as Agent Onboarding Documents
- claudeMdExcludes: Selective Ancestor Instruction-File Exclusion
- copilot-instructions.md as a Repo-Level Instruction Convention
long-form
- Advanced Tool Use: Scaling Agent Tool Libraries
- Agent-Authored Messages as a Deferred Exfiltration Channel
- Auto Model Selection: Harness-Driven Routing per Task
- Classifier-Gated Auto-Permission for Cloud-IDE Coding Agents
- Cost-Aware Agent Design: Route by Complexity, Not Habit
- Episodic Memory Retrieval for AI Coding Agent Loops
- GEO for Technical Docs: Developer Documentation Checklist
- Harness Engineering for Building Reliable AI Agents
- Loop Strategy Spectrum: Accumulated vs Fresh Context
- Open Agent School Pattern Mapping for Practitioners
- Pattern Selection Map: Trade-off Matrix for This Site's Patterns
- Proactive Idle-Time Anticipation (ProAct)
- Production Hosting Topology for Self-Hosted Agent SDK Runtimes
- Six-Shape Approval Response Taxonomy: Beyond Binary Allow/Deny
- Skill Authoring Patterns: Description to Deployment
- Specialized Agent Roles for Effective AI Pipelines
- Symptom-Reduction-as-Root-Cause: Why Oracle Tests Alone Miss Architectural Drift
- The Context Ceiling -- Where AI Fails Expert Architects
mcp
- Agentic Detection and Response at the MCP Boundary
- ComplexMCP: Three Bottlenecks in Large Interdependent Tool Sandboxes
- Customer-Hosted MCP Tunnel: Outbound-Only Connectivity to Private MCP Servers
- Hint-Driven Concurrency for Read-Only MCP Tools
- MCP Client Design: Building Robust Host-Side Logic
- MCP Client/Server Architecture Best Practices Guide
- MCP Elicitation: Servers Requesting Structured Input Mid-Task
- MCP LLM Sampling: Servers Requesting AI Inference Mid-Tool
- MCP Runtime Control Plane: Policy Evaluation Between Agent and Tool
- MCP Server Design: Building Agent-Friendly Servers
- MCP Tool Result Persistence via _meta Annotation
- MCP alwaysLoad: Classifying Servers as Eager or Just-in-Time
- MCP: The Open Protocol Connecting Agents to External Tools
- OAuth Client ID Metadata Documents (CIMD) for MCP Servers
- Per-Server MCP Environment Scoping for Credential Isolation
- Production MCP Agent Stack: Sequencing Six Decisions into One Deployment
- Scanner-as-MCP-Server: Secret and Dependency Scans as Typed Agent Tools
- Scoped MCP Server Discovery: Most-Specific-Wins Resolution
- WebMCP: Browser-Hosted Tool Contracts for In-Page AI Agents
memory
- ACID for Agent Repository State
- AST-Guided Agent Memory for Repository-Level Code Generation
- Abstention-Aware Memory Retrieval for Coding Agents
- Agent Memory Patterns: Learning Across Conversations
- Agent Project State Purge: Clean-Slate Session Reset
- Agentic Framework Landscape: When Each Framework Fits
- Clock-In / Clock-Out Protocol: Bracketed Session Continuity
- CoALA Memory Taxonomy as a Classifier for Harness Artifacts
- Component-Isolated Memory Stress Testing for LLM Agents
- Continual Learning for AI Agents: Three Layers of Knowledge Accumulation
- Control Lexical Leakage in Agent-Memory Retrieval Evals (Entity-Collision)
- Copilot Memory: Autonomous Cross-Agent Persistence
- Cost-Aware Tracing for Skill Distillation
- Cross-Cycle Consensus Relay
- Cross-Session Memory Poisoning for Data Exfiltration (Trojan Hippo)
- Decentralized Memory for Self-Evolving Multi-Agent Systems
- Dormant Memory Payloads Triggered by Sensitive Topics (Trojan Hippo)
- Dual-Trace Memory Encoding: Pair Facts with the Scene They Were Learned In
- Durable Interactive Artifacts: Agent Output Outside the Transcript
- Episodic Memory Retrieval for AI Coding Agent Loops
- Evolving Playbooks: Incremental Context That Preserves Knowledge
- Experience Graphs as Structured Memory for Self-Evolving Agents
- Experiential-Learning Setup Agents with Snapshot Rollback (SetupX)
- Generative Agents Memory Stream: Three-Layer Architecture for Long-Running Agent Sessions
- Harness-Memory Coupling as a Design Axis
- Layered Mutability: Governing Persistent Self-Modifying Agents
- Memory Reinforcement Learning (MemRL)
- Memory Synthesis: Extracting Lessons from Execution Logs
- Memory Transfer Learning: Cross-Domain Memory Reuse in Coding Agents
- Memory-Induced Tool-Drift in LLM Agents
- OpenAI Agents SDK Sandboxes Harness and Memory
- OpenAI Agents SDK for AI Agent Development
- PEEK: Orientation Cache for Recurring-Context Agents
- Proactive Idle-Time Anticipation (ProAct)
- RAG over Thinking Traces: Index Reasoning Trajectories Instead of Documents
- Shared Agent Context Store API: When to Expose Curated Context as an Endpoint
- Structured Task Graphs as External Agent Memory (Beads)
- Subtask-Level Memory for Software Engineering Agents
- Tiered Memory Architecture: Episodic-to-Semantic Consolidation for Long-Running Agents
- Typed Memory from VCS History: When It Pays Off
meta
multi-agent
- Adaptive Sandbox Fan-Out Controller
- Adversarial Multi-Model Development Pipeline (VSDD)
- Agent Composition: Chains, Fan-Out, Pipelines, Supervisors
- Agent HQ (Multi-Agent Platform) for AI Agent Development
- Agent Handoff Protocols: Passing Work Between Agents
- Agent Headcount as a Vanity Metric
- Agentic AI Architecture: From Prompt to Goal-Directed
- Agentic Framework Landscape: When Each Framework Fits
- Async Non-Blocking Subagent Dispatch
- Bounded Batch Dispatch for Parallel Agent Execution
- Claude Code Agent Teams for Collaborative AI Workflows
- Closed-Loop Role-Based Refinement for Agent Systems
- Cloud Parallel Review Pattern
- Cognitive Reasoning vs Execution: A Two-Layer Agent
- Cohesion-Aware Task Partitioning for Multi-Agent Coding
- Committee Review Pattern for Multi-Agent Code Review
- Constraint Drift: Why Safety Must Be Maintained, Not Asserted
- Contextual Capability Calibration for Multi-Agent Delegation
- Coverage-Guided Fuzzing for Multi-Agent LLM Systems (FLARE)
- Cross-Tool Subagent Comparison
- Cursor /multitask: Async Subagent Dispatch in the Editor
- Cursor Multi-Root Workspaces for Cross-Repo Agent Edits
- Decentralized Memory for Self-Evolving Multi-Agent Systems
- Declarative Multi-Agent Composition
- Declarative Multi-Agent Topology: Topology-as-Code
- Developer Attention Management with Parallel Agents
- Distributed Computing Parallels in Agent Architecture
- Economic Value Signaling in Multi-Agent Networks
- Emergent Behavior Sensitivity for AI Agent Development
- Event Sourcing for Agents: Separating Cognitive Intention
- Factory Over Assistant: Orchestrating Parallel Agent Fleets
- Failure-Aware Observability for Multi-Agent LLM Systems
- Fan-Out Synthesis Pattern for AI Agent Development
- File-Based Agent Coordination for AI Agent Development
- Foresight-Guided Defense Against Infectious Jailbreaks in Multi-Agent Systems
- Forked vs Fresh Subagents: When to Inherit the Parent Conversation
- GitHub Copilot Advanced Patterns: Multi-Agent and Automation
- Heartbeat-Bound Hierarchical Credentials for Agent Swarms
- Independent Test Generation in Multi-Agent Code Systems
- LLM Map-Reduce Pattern for Parallel Input Processing
- Lead-to-Teammate Plan-Approval Handshake for Multi-Agent Work
- Magentic Orchestration: Task-Ledger-Driven Adaptive Multi-Agent Planning
- Meta-Engineering Harness for Production AI-Native Software Delivery
- Monolith-to-Sub-Agents Refactor: Five Lessons from a Brittle Prototype
- Multi-Agent SE Design Patterns: A Taxonomy Across 94 Papers
- Multi-Agent Systems: Coordination and Orchestration
- Multi-Agent Topology Taxonomy: Centralised, Decentralised
- Multi-Model Plan Synthesis for System Architecture
- Observation-Driven Coordination: CRDT-Based Parallel Agent
- Offline Trajectory Replay for Multi-Agent Workflow Debugging
- Opponent Processor / Multi-Agent Debate Pattern
- Oracle-Based Task Decomposition for AI Agent Development
- Orchestrator-Worker Pattern for AI Agent Development
- Parallel Agent Sessions Shift the Bottleneck from Writing
- Parsimonious Agent Routing for Multi-Agent Dispatch
- Patterns: Agent Design, Multi-Agent, and Anti-Patterns
- Persistent Shared Search Sub-Agent for Output-Token Reuse
- Rainbow Deployments for Agents: Gradual Version Migration
- Recursive Best-of-N Delegation
- Reverse-Engineered Executable Specifications for Agentic Program Repair
- Semantic Caching for Multi-Agent Code Systems
- Specialist Orchestrated Queuing for Multi-Agent SE (SPOQ)
- Specialized Agent Roles for Effective AI Pipelines
- Staggered Agent Launch: Preventing Thundering-Herd in Swarms
- Structural Coverage Criteria for Agent Workflows
- Sub-Agents for Fan-Out Research and Context Isolation
- Subagent OTel Trace Correlation via agent_id Attribute
- Subagent Schema-Level Tool Filtering for AI Agents
- Swarm Migration Pattern
- Swarm Skills: Multi-Agent Extension of the Agent Skills Standard
- Symphony: Open Spec for Issue-Tracker-Driven Coding Agent Orchestration
- System-Level Optimization Pipeline
- Tiled Agent Layout: Supervising Parallel Agents Through Dedicated Panes
- Toolset Agentization: Wrapping Co-Used Tools as Sub-Agents
- Typed Schemas at Agent Boundaries for Multi-Agent Systems
- Verify-Gated Completion as Admission Control
- Voting / Ensemble Pattern for AI Agent Development
navigation
observability
- Action-Audit Divergence: A Four-Mode Taxonomy for Runtime Hardening
- Agent Chat History as a First-Class Artifact
- Agent Debug Log Panel: Chronological Event Inspection for Session Debugging
- Agent Debugging: Diagnosing Bad Agent Output
- Agent Development Lifecycle for Agent Products
- Agent Event Streaming: Consumer Contract Above the Tokens
- Agent Harness: Initializer and Coding Agent Pattern
- Agent Headcount as a Vanity Metric
- Agent Observability with OpenTelemetry and Trajectory Logging
- Agent-Trace Data Layer: Storage for Hours-Long Traces
- Agentic AI Architecture: From Prompt to Goal-Directed
- Agentic Detection and Response at the MCP Boundary
- Agentic-Agile: Adapting Agile Rituals for Agent Work
- BYOK Model Token Visibility: Closing the Observability Gap on Self-Hosted Routes
- Behavioral Drivers of Coding Agent Success and Failure
- CausalFlow: Counterfactual Repair for Failed Agent Trajectories
- Circuit Breakers for Agent Loops
- Coding-Agent Misalignment Forms (Seven-Symptom Taxonomy)
- Context-Usage Attribution: Per-Source Breakdown of Agent Context
- Context-Window Diagnostic Tooling: Identifying Context-Heavy Tools
- Corpus-Level Trace Diagnostics for LLM Agents
- Cost-Aware Tracing for Skill Distillation
- Declarative Multi-Agent Composition
- Delta Channels: Bounded Checkpoint Storage for Append-Only Agent State
- Dominator-Graph Trajectory Invariants for Non-Deterministic Agents
- Dual-Write Append-Mirror for Agent Transcript Externalization
- Engineering: Tools, Review, Verification, Security, and Observability
- Enterprise Agent Hardening: Three Production Gates
- Event Sourcing for Agents: Separating Cognitive Intention
- Failure-Aware Observability for Multi-Agent LLM Systems
- Five-Failure-Layers Diagnostic: Attribute Before Swapping the Model
- Harness Bug Detection Patterns
- Harness Preflight Doctor Command for Agent Diagnostics
- In-Session Transcript Search: Navigating Long Agent Conversations
- LLM Agent Bug Fix Taxonomy: 23 Fix Patterns from 930 Real Bugs
- Learned Prefix Monitors for Agent Traces
- Loop Detection for AI Agents: Stopping Micro-Loops
- Macro Evals for Agentic Systems: Population-Level Behavior Patterns
- Making Application Observability Legible to Agents
- Markov-Chain Reliability for LLM Agents: Audit the Abstraction Before You Trust the Metric
- Monitor Tool: Event Streaming from Background Scripts
- Monolith-to-Sub-Agents Refactor: Five Lessons from a Brittle Prototype
- Multi-Turn Conversation Evaluation: Per-Turn and Trace-Level Scoring Together
- Observability Feedback Loop: A 7-Step Debug Runbook for Agents
- Observability for AI Agents: Tracing and Debugging
- Observability-Driven Harness Evolution
- Offline Trajectory Replay for Multi-Agent Workflow Debugging
- OpenTelemetry for AI Agent Observability and Tracing
- Out-of-Band Hook Notifications via terminalSequence
- Per-Plugin Token-Cost Attribution via claude plugin details
- Persistent-Connection Agent Transport
- Planted-Bug Methodology: Deliberate Bugs as Observability Calibration
- Plugin Background Monitors: Declarative Supervision Auto-Armed at Session Start
- Prebuilt Agent Monitoring Dashboard
- Programmatic Agent Session Export via `claude agents --json`
- Run-Status vs Task-Status Confusion in Autonomous Agent Runs
- Session Harness Sandbox Separation for Long-Running Agents
- Stakeholder Trust Through Evals and Observability
- Strained Coherence as a Pre-Failure Signal in Agent Trajectories
- Subagent OTel Trace Correlation via agent_id Attribute
- Traces Need Feedback to Power Learning
- Trajectory Decomposition: Diagnose Where Coding Agents Fail
- Trajectory Logging via Progress Files and Git History
- Trajectory Pre-Filter for Failure Diagnosis (TrajAudit)
- Transcript-Driven Permission Allowlist
- Using the Agent to Analyze Its Own Evaluation Transcripts
- Verification Ledger for Tracking Agent Output Quality
onboarding
pattern
- Agent Terminology Disambiguation for AI Coding Systems
- Anthropic's Effective Agents Framework: A Pattern Map
- Classical SE Patterns as Agent Design Analogues
- Cognitive Load, AI Fatigue, and Sustainable Agent Use
- Durable Interactive Artifacts: Agent Output Outside the Transcript
- Patterns: Agent Design, Multi-Agent, and Anti-Patterns
- Selective Rewind Summarization: Compress Earlier Turns, Keep Recent Ones Intact
- Session Recap: Goal-Shaped Handoff at Context Boundaries
- Suggestion Gating: Fewer Completions, Better DX
rag
- AOCI: Symbolic-Semantic Repository Indexing
- Chunking Strategy for RAG-Based Code Completion
- Codebase-Derived Pattern Libraries as Agent Context
- Component-Wise RAG Prioritization for Software Engineering Tasks
- Context Hub: On-Demand Versioned API Docs for Coding Agents
- Embedding Inversion: Vector Stores as a Source-Text Disclosure Surface
- LLM-Driven Logical Retrieval: Boolean Queries over an Inverted Index
- Lexical-First Retrieval for Agentic Search: When BM25 Is Enough
- Multitenant RAG: Closing the Relevance-Authorization Gap
- RAG Architecture as a Poisoning Robustness Decision
- RAG/Agent Reliability Problem Map: 16-Domain Failure Taxonomy
- Repository-Level Retrieval for Code Generation
- Retrieval-Augmented Agent Workflows: On-Demand Context
- Schema-Guided Graph Retrieval
- Semantic Context Loading: Language Server Plugins for Agents
- Stale Repository Retrieval Induces Incorrect Code
- Structured Domain Retrieval: Knowledge Graphs and Case-Based Reasoning
reference
reliability
- Error Preservation in Context for AI Agent Development
- Exception Handling and Recovery Patterns for AI Coding Agents
security
- AI-Powered Vulnerability Triage for AI Agent Development
- Action-Audit Divergence: A Four-Mode Taxonomy for Runtime Hardening
- Action-Selector Pattern: LLM as Intent Decoder with Deterministic Execution
- Agent Commit Attribution: Signed Commits and Agent Identity
- Agent Network Egress Policy: Admin-Controlled Domain Allow/Deny
- Agent Runtime Middleware: Per-Call Interception Pipeline
- Agent-Authored Messages as a Deferred Exfiltration Channel
- Agent-Emitted Dependency Version Ranges Widen the Supply-Chain Attack Surface
- Agentic Detection and Response at the MCP Boundary
- Air-Gapped RAG Training Series
- Air-Gapped RAG: Deployment, Operations, and Compliance
- Air-Gapped RAG: Overview and When to Use It
- Always-On Agentic PR Security Review
- Authority Confusion: Untrusted Context Must Not Authorize Side Effects
- Behavioral Firewall for Tool-Call Trajectories
- Blast Radius Containment: Least Privilege for AI Agents
- Chat-Platform Agent Delegation: Invoking Cloud Coding Agents from Team Channels
- Clarification Mode Amplifies Prompt Injection
- Classifier-Gated Auto-Permission for Cloud-IDE Coding Agents
- Classifier-Subagent Run Mode for Per-Call Permission Routing
- Claude Code Auto Mode: Classifier-Based Permission Gating
- Close the Attack-to-Fix Loop: Adversarially Train Agent
- Code Injection Defence in Multi-Agent Pipelines
- Code Interpreter as a Primary Agent Tool
- Coding Agent Scope Expansion: When to Extend Beyond the Codebase
- Cognitive Poisoning: Untrusted Tool Feedback as a Trajectory Attack
- Compositional Vulnerability Induction in Coding Agents
- Confirmation Gates for Consequential Agent Actions
- Constraint Drift: Why Safety Must Be Maintained, Not Asserted
- Containment Playbook: npm-to-Signing-Channel Compromise
- Content Exclusion Gap: AI Security Boundaries by Mode
- Control/Data-Flow Separation for Prompt Injection Defense (CaMeL)
- Controlled Benchmark Rewriting for Agent Safety Judgment
- Copilot Cloud Agent Organization Controls
- Credential Hygiene for Agent Skill Authorship
- Cross-Repo Agent Search: GitHub-API-Backed Text Search Beyond the Workspace
- Cross-Session Memory Poisoning for Data Exfiltration (Trojan Hippo)
- Cryptographic Governance Audit Trail for AI Agents
- Cursor Self-Hosted Cloud Agents
- Customer-Hosted MCP Tunnel: Outbound-Only Connectivity to Private MCP Servers
- Data Fidelity Guardrails: Preventing Agent Data Mutation
- Decomposed Red-Teaming for Agent Monitors
- Defense-in-Depth Agent Safety for AI Agent Development
- Dependabot Agent Assignment for AI-Driven Vulnerability Remediation
- Designing Agents to Resist Prompt Injection
- Direct Prompt Injection via Collaboration (User as Attack Vector)
- Directory-Aware Plugin Suggestions via `pluginSuggestionMarketplaces`
- Discovering Indirect Injection Vulnerabilities in Your Agent
- Docker sbx Adoption for Coding Agents
- Dormant Memory Payloads Triggered by Sensitive Topics (Trojan Hippo)
- Dual-Boundary Sandboxing for Secure Agent Execution
- Dual-Graph Alignment for Indirect Prompt Injection Defense (AuthGraph)
- Embedding Inversion: Vector Stores as a Source-Text Disclosure Surface
- Engineering: Tools, Review, Verification, Security, and Observability
- Enterprise Agent Hardening: Three Production Gates
- Enterprise-Managed Plugin Governance for Agent CLIs
- Evidence-Based Allowlist Auto-Discovery for Agents
- External Artifacts Treated as Data, Not Adversarial Input
- Fail-Closed Remote Settings Enforcement for Enterprise Agents
- Five-Stage Policy Layer Typology for Generalist Agents
- Foresight-Guided Defense Against Infectious Jailbreaks in Multi-Agent Systems
- Four-Layer Taxonomy of Agent Security Risks
- Gate Agent Writes to Executable Config Files as Privileged Actions
- Goal Reframing: The Primary Exploitation Trigger for LLM Agents
- Guarding Against URL-Based Data Exfiltration in Agentic Workflows
- Hard-Deny Classifier Rule: Unconditional Block in Auto Mode
- Harness Composition for Scaled Security Audits
- Heartbeat-Bound Hierarchical Credentials for Agent Swarms
- History Anchors: Consistency-Cued Continuation of Unsafe Prior Actions
- Hook Exec Form vs Shell Form: Shell-Injection-Safe Hook Commands
- Hooks Invoking MCP Tools: Closing the Loop Between Policy and Tool Execution
- Hostname-Allowlist Proxy: The TLS-Inspection Blind Spot
- Hybrid Deterministic + Semantic Authorization for Agent Tool Calls
- Improper Output Handling: Validate Agent Output Before Downstream Use
- In-Process WebAssembly Sandboxes for Agent-Generated Code
- Inline Safety Harness with Cascade Verification (FinHarness)
- LLM-Pinned Library Versions Carry Systemic CVE Exposure
- Lethal Trifecta Threat Model for AI Agent Development
- Lifecycle-Integrated Security Architecture for Agent Harnesses
- Live Browser as Agent Context Channel
- Lock-State Safeguards for Desktop-Controlling Agents
- MCP Allowlist by Label, Not by Identity (serverName Trap)
- MCP Runtime Control Plane: Policy Evaluation Between Agent and Tool
- Managed Settings Drop-In Directory: Enterprise Policy Fragmentation
- Mid-Trajectory Guardrail Selection for Multi-Step Tool Calls
- Monotonic Capability Attenuation for Composition-Safe Tool Use
- Multi-Repo and No-Repo Coding Agent Automation Templates
- Multi-Tenant Isolation Knobs for Shared-Container Agent SDK Hosting
- Multitenant RAG: Closing the Relevance-Authorization Gap
- Network-less Container + Unix-Socket Egress Proxy for Agent Sandboxes
- OAuth Client ID Metadata Documents (CIMD) for MCP Servers
- OWASP LLM Top 10 (2025): Agent Security Crosswalk
- On-Demand Skill Hooks: Session-Scoped Guardrails via Skill Invocation
- OpenAI Agents SDK Sandboxes Harness and Memory
- OpenAI Agents SDK for AI Agent Development
- Oracle Poisoning: Knowledge Graph Corruption Against Tool-Using Agents
- Org-Membership-Gated Agent Entitlement
- Overeager-Behavior Elicitation: Scope + Trap Fragments as a Diagnostic for Out-of-Scope Tool Calls
- PII Tokenization in Agent Context
- Per-Server MCP Environment Scoping for Credential Isolation
- Permission Framework Choice Outweighs Model Choice for Limiting Overeager Actions
- Permission-Gated Custom Commands for AI Agent Development
- Plan-Then-Execute as the Default for Web Agents
- Pre-Execution Risk Classification for Terminal Commands
- Pre-Trust Execution Surface in Coding Agent Harnesses
- Privacy-Preserving LLM Requests: Eight Techniques and a Practical Combination
- Programmatic Cloud-Agent Dispatch via REST API and Webhooks
- Prompt Injection: A First-Class Threat to Agentic Systems
- Prompt as Security Knob
- Prompt-Only Tool Access Control
- Protecting Sensitive Files from Agent Context Access
- Provenance-Aware Decision Auditing for LLM Agents
- RAG Architecture as a Poisoning Robustness Decision
- RL-Trained Automated Red Teamers for Prompt Injection Discovery
- Restricted-Access Defensive AI: Project Glasswing as a Deployment Model
- SUDP: Secret-Use Delegation Protocol for Agentic Systems
- Safe Command Allowlisting: Reducing Approval Fatigue
- Safe Outputs Pattern for Trustworthy Agent Responses
- Sandbox + Approvals + Auto-Review Governance Triad
- Sandboxed Coding Environments: Containers vs MicroVMs vs OS-Level Isolators
- Scanner-as-MCP-Server: Secret and Dependency Scans as Typed Agent Tools
- Scope Sandbox Rules to Harness-Owned Tools, Not Third-Party
- Scoped Credentials via Proxy Outside the Agent Sandbox
- Secrets Management for AI Agents: Credential Injection
- Security Budget as Token Economics
- Security Constitution for AI Code Generation
- Security Drift in Iterative LLM Code Refinement
- Security for AI Agent Development
- Selective Network Access in Agent Sandboxes: The allowNetwork Pattern
- Semantic Intent Validation for Agent Skills
- Sensitive Terminal Prompt Interception
- Single-Layer Prompt Injection Defence Anti-Pattern
- Skill Shell Execution Gate: Disabling Inline Shell from Skills
- Skill Specification Violation Fuzzing
- Skill Supply-Chain Poisoning
- Skill disallowed-tools Frontmatter: Skill-Layer Tool Denial
- Slopsquatting: Hallucinated Package Names as a Supply-Chain Vector
- Subprocess PID Namespace Sandboxing in Claude Code
- Sufficiency-Tightness Decomposition for Agent-Authored Permissions
- System Prompt as Secret Store (OWASP LLM07)
- Task Alignment: The Selective-Compliance Gap Benchmarks Miss
- Task-Based Access Control with Hybrid Inspection
- The Agent Stack Bet: Architectural Decisions for Production Agents
- The Security Review Gap in AI-Authored PRs
- Three-Depth In-Session Security Review
- Three-Vector Evasion Taxonomy for Agent Security Tests
- Tool Cloning and Provenance Assessment in Agent Ecosystems
- Tool Signing and Signature Verification for Agents
- Tool-Invocation Attack Surface in Coding Agents
- Trajectory-Opaque Evaluation Gap: Why Final-Output Grading Misses Safety Violations
- Transcript-Driven Permission Allowlist
- Treat Task Scope as a Security Boundary
- Treating Agent Safety as Uniform Across a Session (Cold-Start Safety Gap)
- Trusting Human Review to Catch Deliberate Agent Sabotage
- Trusting Tool Error Messages as Implicit Authority (Error-Path Injection)
- Unbounded Consumption: Bounding Agent Resource Use Against DoS and Denial-of-Wallet
- Usability Pressure as a Silent Security-Regression Vector
- Use a Public-Web Index to Gate Automatic URL Fetching
- Verifying LLM-Generated Cryptographic Code
- Windows Sandboxing for Coding Agents
- Workload Identity Federation for Agent Runtimes
- bypassPermissions Silently Overrides allowedTools (The Restricted-Bypass Trap)
seo¶
skills¶
- Contractual Skill Files: Inspectable SKILL.md for Enterprise Agents
- SDLC-Phase Skill Taxonomy: Full-Lifecycle Skill Libraries
- Skill Library Refinement Loops: Organisational Feedback for Shared Skills
- Throwaway-Prototype Skill: Build to Discard, Keep Only the Answer
source:docs¶
source:marri-constitutional-spec¶
source:opendev-paper¶
- Agent Harness: Initializer and Coding Agent Pattern
- Agent Memory Patterns: Learning Across Conversations
- Context Compression Strategies: Offloading and Summarisation
- Context-Injected Error Recovery for AI Agent Development
- Cost-Aware Agent Design: Route by Complexity, Not Habit
- Defense-in-Depth Agent Safety for AI Agent Development
- Dynamic System Prompt Composition
- Event-Driven System Reminders for AI Agent Development
- Filesystem-Based Tool Discovery for AI Agent Development
- Loop Detection for AI Agents: Stopping Micro-Loops
- Model a Single Agent Turn as Many Inference and Tool-Call
- Objective Drift: When Agents Lose Sight of the Goal
- Reasoning Budget Allocation: The Reasoning Sandwich
- Subagent Schema-Level Tool Filtering for AI Agents
- Tool Engineering Principles for AI Agent Development
source:osmani-80-percent¶
standards¶
- A2UI: Framework-Agnostic Generative UI Standard for Agents
- ACDL: A Language for Describing Agentic LLM Contexts
- AGENTS.md: Project-Level README for AI Coding Agents
- Agent Cards: Capability Discovery Standard for AI Agents
- Agent Definition Formats: How Tools Define Agent Behavior
- Agent Skills: A Cross-Tool Task Knowledge Standard
- Agent-to-Agent (A2A) Protocol for AI Agent Development
- Cross-IDE Plugin Discovery: One Install Surface, Many Consuming Agents
- Directory-Aware Plugin Suggestions via `pluginSuggestionMarketplaces`
- MCP: The Open Protocol Connecting Agents to External Tools
- OAuth Client ID Metadata Documents (CIMD) for MCP Servers
- Open Standards and Protocols for AI Agent Development
- OpenAPI as the Source of Truth for Agent Tool Definitions
- OpenTelemetry for AI Agent Observability and Tracing
- Plugin Dependency Declaration and Disable-Chain Hints
- Plugin and Extension Packaging: Distributing Agent Capabilities
- Portable Agent Definitions: Full-Stack Identity as Code
- Pre-Install Context-Cost Projection in Plugin Marketplaces
- Pre-Install Plugin Transparency: Capability Inventory and Cost Projection
- Reference: Standards, Human Factors, Emerging, Fallacies, and Articles
- SUDP: Secret-Use Delegation Protocol for Agentic Systems
- Swarm Skills: Multi-Agent Extension of the Agent Skills Standard
- Symphony: Open Spec for Issue-Tracker-Driven Coding Agent Orchestration
- Tool Calling Schema Standards for AI Agent Development
- WebMCP: Browser-Hosted Tool Contracts for In-Page AI Agents
- llms.txt: Making Your Project Discoverable to AI Agents
technique¶
- @import Composition Pattern for Agent Instruction Files
- AGENTS.md Design Patterns: Commands, Boundaries, Personas
- AI Crawler Policy: robots.txt for the Three-Tier Crawler Landscape
- Answer-First Writing: Structure Content for AI Retrieval
- Assertion Density — Stats and Quotes Over Vague Claims
- Atomic Pages and Chunking — One Concept Per Page for RAG
- Chunking Strategy for RAG-Based Code Completion
- Convergence Detection in Iterative Agent Refinement
- Cross-Tool Translation: Learning from Multiple AI Assistants
- Evidence-Based Allowlist Auto-Discovery for Agents
- GEO for Technical Docs: Developer Documentation Checklist
- Handoff Skill: Structured Context Transfer Between Agent Sessions
- How AI Engines Cite — ChatGPT, Perplexity, Claude, Gemini
- Incident Log Investigation Skill: Parallel Queries
- Instruction-Guided Code Completion: Controlling What Models Generate
- Inversion Analysis: Surface Capabilities Competitors Cannot Replicate
- Issue Requirements Preprocessing: Structured Input Before Code Generation
- Manual Compaction Strategy for Dumb Zone Mitigation
- Measuring GEO Performance: Metrics, Monitoring, and Tools
- Post-Compaction Re-read Protocol for Agent Continuity
- Prompt Governance via PRs: Reviewable AI Behaviour
- The AX Stack: A Layered Model of an AI Coding Agent's Prompt-to-Compile Path
- Three Reasoning Spaces: Plan, Bead, and Code
- Topical Authority — Entity Coverage for AI Citation
- llms.txt: Full Specification, Adoption, and Limitations
testing-verification¶
- AI-Powered Vulnerability Triage for AI Agent Development
- AIRA: Inspection Framework for AI-Generated Code
- Adaptive Generate-Rank-Verify Under Costly Verification
- Adversarial Multi-Model Development Pipeline (VSDD)
- Agent Self-Review Loop for Iterative Self-Improvement
- Agent-Assisted Code Review: Agents as PR First Pass
- Agentic Code Review Architecture With Tool-Calling
- Air-Gapped RAG: Grounding, Citations, and Evaluation
- Anti-Reward-Hacking: Rubrics That Resist Gaming
- Assumption Propagation: Compounding Agent Misunderstandings
- Behavioral Testing for Non-Deterministic AI Agents
- Benchmark Contamination as Eval Risk
- Benchmark-Driven Tool Selection for Code Generation
- CausalFlow: Counterfactual Repair for Failed Agent Trajectories
- Chain-of-Verification for Coding Agents
- Claude Code Review
- Close the Attack-to-Fix Loop: Adversarially Train Agent
- CoT Robustness in Code Generation
- Committee Review Pattern for Multi-Agent Code Review
- Completion Failure Taxonomy: Why Code Suggestions Miss
- ComplexMCP: Three Bottlenecks in Large Interdependent Tool Sandboxes
- Component-Isolated Memory Stress Testing for LLM Agents
- Constraint Decay in Backend Code Generation
- Control Lexical Leakage in Agent-Memory Retrieval Evals (Entity-Collision)
- Controlled Benchmark Rewriting for Agent Safety Judgment
- Corpus-Level Trace Diagnostics for LLM Agents
- Coverage-Guided Agents for Fuzz Harness Generation
- Coverage-Guided Fuzzing for Multi-Agent LLM Systems (FLARE)
- Cross-Framework Signal Semantics: Re-Measure Borrowed Trajectory Rules
- Data Fidelity Guardrails: Preventing Agent Data Mutation
- Decomposed Red-Teaming for Agent Monitors
- Defense-in-Depth Against Coding Agent Fabrication (Honesty Harness)
- Demand-Driven Repository Auditing
- Demo-to-Production Gap: When Demos Hide Real Costs
- Dependency Gap Validation for AI-Generated Code
- Deterministic Guardrails Around Probabilistic Agents
- Diff-Based Review: Focus on Changes, Not Complete Outputs
- Discovery-Only Refactor Pass: Surface Candidates Before Touching Code
- Distillation-Induced Similarity Metrics for Tool-Use Agents
- Dominator-Graph Trajectory Invariants for Non-Deterministic Agents
- Engineering: Tools, Review, Verification, Security, and Observability
- Eval Awareness: Designing Evals Agents Cannot Recognise
- Eval Engineering (Training Module)
- Eval Strategy by Agent Generation: A Structure-to-Eval Locator
- Eval-Driven Development Training for AI Agent Teams
- Eval-Driven Development: Write Evals Before Building Agent
- Evaluation-Driven Development for Agent Tools
- Evaluator Templates: Portable Primitives for Agent Eval Suites
- Evaluator-Optimizer Pattern for AI Agent Development
- Event Sourcing for Agents: Separating Cognitive Intention
- Execution Lineage: DAG of Artifacts vs Agent Loops
- Failure-Driven Iteration for Improving Agent Workflows
- Feature List Files for Reliable AI Agent Development
- Five-Failure-Layers Diagnostic: Attribute Before Swapping the Model
- Five-Pass Blunder Hunt: Repeated Critique Passes for Plans
- Function-Level Debugger Interfaces for Coding Agents
- Generative Provenance Records for Tool-Using Agents
- Golden Journeys: Restartability as a First-Class Verification Primitive
- Golden Query Pairs as Continuous Regression Tests for Agents
- Governed Sources of Truth for Analytics Agents (Structure Over Access)
- Grade Agent Outcomes, Not Execution Paths
- Grading Strategies for Eval-Driven Development
- Guarding Against URL-Based Data Exfiltration in Agentic Workflows
- Happy Path Bias: How AI Agents Skip Error Handling
- Hardening Agent Evals for Production-Grade Reliability
- Harness Bug Detection Patterns
- Harness Composition for Scaled Security Audits
- Harness Hill-Climbing: Eval-Driven Iterative Improvement of Agent Harnesses
- Held-Out Test Gap: A Long-Horizon Reward-Hacking Signal
- Human-Review-Driven Curation of Golden Eval Datasets
- Incident-to-Eval Synthesis: Production Failures as Evals
- Incremental Verification: Check at Each Step, Not at the End
- Independent Test Generation in Multi-Agent Code Systems
- Inference-Time Tool-Call Reviewer: Pre-Execution Feedback for Tool-Calling Agents
- Interaction-Pattern Evaluation for Agentic PRs
- Isometric Harness Ablation: Rank Subsystem Investment by Removing One at a Time
- LLM Agent Bug Fix Taxonomy: 23 Fix Patterns from 930 Real Bugs
- LLM Code Review Overcorrection for AI Agent Development
- LLM Self-Review Failure in Code Modernization Tasks
- LLM Static Verification Against Natural-Language Requirements
- LLM-Driven Benchmark Auditing
- LLM-as-Judge Evaluation with Human Spot-Checking
- Layered Accuracy Defense for Reliable Agent Outputs
- Learned Prefix Monitors for Agent Traces
- Macro Evals for Agentic Systems: Population-Level Behavior Patterns
- Markov-Chain Reliability for LLM Agents: Audit the Abstraction Before You Trust the Metric
- Measuring Synthetic Eval Data Quality (SynAE)
- Monolith-to-Sub-Agents Refactor: Five Lessons from a Brittle Prototype
- Multi-Agent RAG for Spec-to-Test Automation
- Multi-Layer Specification Redundancy as a Robustness Budget
- Multi-Turn Conversation Evaluation: Per-Turn and Trace-Level Scoring Together
- Mutation Testing as a Quality Gate for AI-Generated Test Suites
- Narrative Problem Reformulation for Code Generation
- Nonstandard Errors in AI Agents: Model-Family Variance
- Observability Feedback Loop: A 7-Step Debug Runbook for Agents
- Observation Contract Preservation in Tool-Augmented Agents
- Offline Trajectory Replay for Multi-Agent Workflow Debugging
- OpenTelemetry for AI Agent Observability and Tracing
- Oracle-Based Task Decomposition for AI Agent Development
- Overeager-Behavior Elicitation: Scope + Trap Fragments as a Diagnostic for Out-of-Scope Tool Calls
- PASS@(k,T): Evaluate RL for Agents Along Sampling and Interaction Depth
- Parallel Polyglot Ports as a Spec-Ambiguity Oracle
- Perceived Model Degradation: Why Vibes Are Not Evals
- Phantom Symbol Detection for LLM API Migration
- Planted-Bug Methodology: Deliberate Bugs as Observability Calibration
- Pre-Change Impact Analysis: Dependency Maps That Prevent Agent Regressions
- Pre-Completion Checklists for AI Agent Development
- Prebuilt Agent Monitoring Dashboard
- Precise Debugging: Measure Edit Precision, Not Just Test Pass Rate
- Predicting Which AI-Generated Functions Will Be Deleted
- Premature Completion: Agents That Declare Success Too Early
- QA Session to Issues Pipeline for AI Agent Development
- Quality Score Rubric and Simplification Log for Agent Harnesses
- RAG/Agent Reliability Problem Map: 16-Domain Failure Taxonomy
- RL-Trained Automated Red Teamers for Prompt Injection Discovery
- Red-Green-Refactor with Agents: Tests as the Spec
- Refactoring Runaway: Tangled Refactorings in Agent Patches
- Repository Perturbation as Context-Reasoning Diagnosis (RepoMirage)
- Reproduce-Before-Report Verification Gate
- Reverse-Engineered Executable Specifications for Agentic Program Repair
- Review-Feedback-to-Rule Loop: Promoting Recurring PR Comments into Harness Rules
- Review-Then-Implement Loop for AI Agent Development
- Risk-Based Shipping: Review by Risk Matrix, Not by Default
- Risk-Based Task Sizing for Agent Verification Depth
- Risk-Score Threshold Calibration for Auto-Approval
- RubricRefine: Pre-Execution Rubric Refinement for Code-Mode Tool Use
- Runnable Documentation as Agent Verification
- Security Drift in Iterative LLM Code Refinement
- Signal Over Volume in AI Review for AI Agent Development
- Simulation and Replay Testing for Agent Verification
- Skill Eval Loop
- Skill Evals: Measuring Skill Quality as a Dataset-Graded Unit
- Skill Retrieval Realism Gap
- Skill Specification Violation Fuzzing
- Solver-Externalized Constraint Reasoning (MaxSAT/SMT Encoding)
- Source-Grounded Test Plan with Pre-Action Assertion Annotation
- Spec-Derived Execution as a Correctness Oracle
- Staged Evidence Gates for Agentic Program Repair
- Stale Repository Retrieval Induces Incorrect Code
- Step-by-Step: Building Your First Eval-Driven Feature
- Stochastic-Deterministic Boundary as First-Class Contract
- Strained Coherence as a Pre-Failure Signal in Agent Trajectories
- Structural Coverage Criteria for Agent Workflows
- Structure-Aware Diff Labeling with Two-Stage LLM Pipelines
- Structured Output Constraints: Reducing Hallucination
- Symptom-Reduction-as-Root-Cause: Why Oracle Tests Alone Miss Architectural Drift
- Task Alignment: The Selective-Compliance Gap Benchmarks Miss
- Test Evolution Blind Spot in Coding Agents
- Test Harness Design for LLM Context Windows
- Test-Driven Agent Development: Tests as Spec and Guardrail
- Test-Driven Intent Clarification: Tests as Intermediate Alignment Artifacts
- The Eval-First Development Loop for AI Agent Features
- The Productivity-Experience Paradox in AI-Assisted Development
- The Synthetic Ground Truth Fallacy in Agent Evaluation
- The Test Homogenization Trap: When LLM-Generated Tests Mirror Model Blind Spots
- Tiered Code Review: AI-First with Human Escalation
- Tool-Use Sim-to-Real Perturbation Taxonomy
- Traces Need Feedback to Power Learning
- Trajectory Decomposition: Diagnose Where Coding Agents Fail
- Trajectory Pre-Filter for Failure Diagnosis (TrajAudit)
- Trajectory-Opaque Evaluation Gap: Why Final-Output Grading Misses Safety Violations
- Trust Without Verify: Skipping Agent Output Checks
- Trusting Human Review to Catch Deliberate Agent Sabotage
- Usability Pressure as a Silent Security-Regression Vector
- Using the Agent to Analyze Its Own Evaluation Transcripts
- Variance-Based RL Sample Selection
- Velocity-Quality Asymmetry: Why AI Speed Gains Fade
- Verification Ledger for Tracking Agent Output Quality
- Verification-Centric Development for AI-Generated Code
- Verification: Testing, Evals, and Guardrails for Agents
- Verify-Gated Completion as Admission Control
- Verifying LLM-Generated Cryptographic Code
- Vibe Coding: Outcome-Oriented Agent-Assisted Development
- What Evals Are and Why AI Agents Need Them for Quality
- Writing Your First Agent Evaluation Suite from Scratch
- pass@k and pass^k: Capability and Consistency Metrics
tool-agnostic¶
- @import Composition Pattern for Agent Instruction Files
- A2UI: Framework-Agnostic Generative UI Standard for Agents
- ACDL: A Language for Describing Agentic LLM Contexts
- ACID for Agent Repository State
- AGENTS.md as a Table of Contents, Not an Encyclopedia
- AGENTS.md: Project-Level README for AI Coding Agents
- AI Abundance Reshapes Software Engineering Identity
- AI Adoption Footprint: The Segmented Shape of Engineering Orgs
- AI Bot CI/CD Workflow Reliability by Agent
- AI Crawler Policy: robots.txt for the Three-Tier Crawler Landscape
- AI Knowledge Generation Fallacy: LLMs Recombine, Not Invent
- AI Slop as a Process Problem: Encoding Quality Standards as Pipeline Gates
- AI-Powered Vulnerability Triage for AI Agent Development
- AIRA: Inspection Framework for AI-Generated Code
- AOCI: Symbolic-Semantic Repository Indexing
- AST-Grounded Critic Loop for Documentation Maintenance
- AST-Guided Agent Memory for Repository-Level Code Generation
- AX/UX/DX Triad: Three Experience Layers in Agent Systems
- About
- Abstention-Aware Memory Retrieval for Coding Agents
- Abstraction Bloat in AI Agent-Generated Code Output
- Acknowledged-Debt Ledger with Next-Trigger Conditions
- Action-Audit Divergence: A Four-Mode Taxonomy for Runtime Hardening
- Action-Selector Pattern: LLM as Intent Decoder with Deterministic Execution
- Adaptive Generate-Rank-Verify Under Costly Verification
- Adaptive Sandbox Fan-Out Controller
- Adversarial Multi-Model Development Pipeline (VSDD)
- Agent Backpressure: Automated Feedback for Self-Correction
- Agent Cards: Capability Discovery Standard for AI Agents
- Agent Chat History as a First-Class Artifact
- Agent Circuit Breaker
- Agent Commit Attribution: Signed Commits and Agent Identity
- Agent Composition: Chains, Fan-Out, Pipelines, Supervisors
- Agent Debug Log Panel: Chronological Event Inspection for Session Debugging
- Agent Debugging: Diagnosing Bad Agent Output
- Agent Definition Formats: How Tools Define Agent Behavior
- Agent Development Lifecycle for Agent Products
- Agent Event Streaming: Consumer Contract Above the Tokens
- Agent Governance Policies for AI Agent Development
- Agent Handoff Protocols: Passing Work Between Agents
- Agent Harness: Initializer and Coding Agent Pattern
- Agent Headcount as a Vanity Metric
- Agent JIT Compilation: Compile Tasks Into Executable Plans
- Agent Loop Middleware — Safety Nets and Message Injection
- Agent Memory Patterns: Learning Across Conversations
- Agent Network Egress Policy: Admin-Controlled Domain Allow/Deny
- Agent PR Volume vs. Value: The Productivity Paradox
- Agent Pushback Protocol for Managing Disagreements
- Agent Runtime Middleware: Per-Call Interception Pipeline
- Agent Self-Review Loop for Iterative Self-Improvement
- Agent Skills: A Cross-Tool Task Knowledge Standard
- Agent Terminology Disambiguation for AI Coding Systems
- Agent-Assisted Code Review: Agents as PR First Pass
- Agent-Authored Messages as a Deferred Exfiltration Channel
- Agent-Authored PR Integration and Merge Predictors
- Agent-Aware CLI Behaviour via Environment Variable
- Agent-Computer Interface (ACI): Tool Design as UX Discipline
- Agent-Discoverable Slash Commands
- Agent-Driven Greenfield Product Development from Scratch
- Agent-Driven PR Slicing
- Agent-Emitted Dependency Version Ranges Widen the Supply-Chain Attack Surface
- Agent-First Software Design for AI Agent Development
- Agent-Generated Code Maintenance Asymmetry
- Agent-Laundered Bug Reports
- Agent-Led Dev-Environment Iteration with Validation and Rollback
- Agent-Powered Codebase Q&A and Onboarding Workflow
- Agent-Proposed Merge Resolution
- Agent-Trace Data Layer: Storage for Hours-Long Traces
- Agent-to-Agent (A2A) Protocol for AI Agent Development
- Agentic AI Architecture: From Prompt to Goal-Directed
- Agentic Detection and Response at the MCP Boundary
- Agentic Education: Persona Progression for Teaching AI Coding Tools
- Agentic Flywheel: Building Self-Improving Agent Systems
- Agentic Framework Landscape: When Each Framework Fits
- Agentic-Agile: Adapting Agile Rituals for Agent Work
- Agentless vs Autonomous: When Simple Beats Complex
- Agents vs Commands: Separation of Role and Workflow
- Air-Gapped RAG Training Series
- Air-Gapped RAG: Architecture Fundamentals
- Air-Gapped RAG: Chunking Strategies
- Air-Gapped RAG: Deployment, Operations, and Compliance
- Air-Gapped RAG: Document Ingestion and Parsing
- Air-Gapped RAG: Grounding, Citations, and Evaluation
- Air-Gapped RAG: Local Embeddings and Vector Stores
- Air-Gapped RAG: Local LLM Inference
- Air-Gapped RAG: Overview and When to Use It
- Air-Gapped RAG: Retrieval and Re-Ranking
- Always-On Agentic PR Security Review
- Ambition Scaling: Moving the Target as Model Capability Increases
- Answer-First Writing: Structure Content for AI Retrieval
- Anthropic's Effective Agents Framework: A Pattern Map
- Anti-Reward-Hacking: Rubrics That Resist Gaming
- Architecting a Central Repo for Shared Agent Standards
- Assertion Density — Stats and Quotes Over Vague Claims
- Assumption Propagation: Compounding Agent Misunderstandings
- Async Non-Blocking Subagent Dispatch
- Asynchronous Agent I/O and Speculative Tool Calling
- Atomic Pages and Chunking — One Concept Per Page for RAG
- Attention Latch: When Agents Stay Anchored to Stale Instructions
- Attention Sinks: Why First Tokens Always Win
- Authority Confusion: Untrusted Context Must Not Authorize Side Effects
- Auto-Triage Workflow: Bug-Monitoring Agent that Connects Related Reports and Opens Fix PRs
- Autonomous Research Loops: Loops That Know When to Stop
- BYOK Model Token Visibility: Closing the Observability Gap on Self-Hosted Routes
- Background Todo Agent: Offload Plan Maintenance to a Lightweight Model
- Backlog Triage as a Named Agent Skill
- Batch File Operations via Bash Scripts for AI Agents
- Batched Suggestion Application: Bulk-Apply Agent Fixes on PRs
- Behavioral Drivers of Coding Agent Success and Failure
- Behavioral Firewall for Tool-Call Trajectories
- Behavioral Testing for Non-Deterministic AI Agents
- Benchmark Contamination as Eval Risk
- Benchmark-Driven Tool Selection for Code Generation
- Blast Radius Containment: Least Privilege for AI Agents
- Bootstrapping Coding Agents: The Specification Is the Program
- Bootstrapping an Agent-Driven Project from Scratch
- Boring Technology Bias: When Agents Recommend by Popularity
- Bounded Batch Dispatch for Parallel Agent Execution
- Brownfield to Agent-First: Repo Maturity Framework
- Browser Automation as a Research Tool: Bypassing Bot Detection
- Burn the Boats — Commitment-Forcing Deprecation
- CARE: Three-Party Stage-Gated Agent Engineering
- CLAUDE.md Convention for Structuring Agent Instructions
- CLI Scripts as Agent Tools: Return Only What Matters
- CLI-First Skill Design
- CLI-IDE-GitHub Context Ladder for AI Agent Development
- CRA-Only Review and the Merge Rate Gap
- Canary Rollout for Agent Policy Changes
- Cargo Cult Agent Setup: Copying Without Understanding
- CausalFlow: Counterfactual Repair for Failed Agent Trajectories
- Chain-of-Thought Reasoning Fallacy: Traces Are Not Truth
- Chain-of-Verification for Coding Agents
- Chat-Platform Agent Delegation: Invoking Cloud Coding Agents from Team Channels
- Chunking Strategy for RAG-Based Code Completion
- Circuit Breakers for Agent Loops
- Clarification Mode Amplifies Prompt Injection
- Classical SE Patterns as Agent Design Analogues
- Classifier-Gated Auto-Permission for Cloud-IDE Coding Agents
- Classifier-Subagent Run Mode for Per-Call Permission Routing
- Classifying and Auto-Correcting Coding Agent Misbehaviors (Wink)
- Claude Code ↔ Copilot CLI: Changelog-Driven Feature Parity
- Clock-In / Clock-Out Protocol: Bracketed Session Continuity
- Close the Attack-to-Fix Loop: Adversarially Train Agent
- Closed-Loop Agent Training from Tool Schemas
- Closed-Loop CI Failure Remediation with Cloud Coding Agents
- Closed-Loop Role-Based Refinement for Agent Systems
- Cloud-Agent Session Bootstrap: Cached Install plus Per-Session Start
- Cloud-Agent Three-Layer State Decoupling
- CoALA Decision-Making Loop as an Orchestration Lens
- CoALA Memory Taxonomy as a Classifier for Harness Artifacts
- CoALA Structured Action Space: Internal vs External Actions
- CoT Robustness in Code Generation
- Code Cleanliness as an Agent Cost Lever
- Code Injection Defence in Multi-Agent Pipelines
- Code Interpreter as a Primary Agent Tool
- Code-Health-Gated LLM Tier Routing
- Codebase Readiness for Agents: Agent-Friendly Code
- Codebase-Derived Pattern Libraries as Agent Context
- Coding Agent Scope Expansion: When to Extend Beyond the Codebase
- Coding-Agent Misalignment Forms (Seven-Symptom Taxonomy)
- Coding-Agent Reversibility: Platform Choice as a Two-Way Door
- Cognitive Architectures for Language Agents (CoALA): A Classifier for Agent Harnesses
- Cognitive Load, AI Fatigue, and Sustainable Agent Use
- Cognitive Poisoning: Untrusted Tool Feedback as a Trajectory Attack
- Cognitive Reasoning vs Execution: A Two-Layer Agent
- Cohesion-Aware Task Partitioning for Multi-Agent Coding
- Committee Review Pattern for Multi-Agent Code Review
- Completion Failure Taxonomy: Why Code Suggestions Miss
- ComplexMCP: Three Bottlenecks in Large Interdependent Tool Sandboxes
- Component-Isolated Memory Stress Testing for LLM Agents
- Component-Wise RAG Prioritization for Software Engineering Tasks
- Compositional Vulnerability Induction in Coding Agents
- Compound Engineering: Systematic Agent Learning Loop
- Comprehension Debt from AI-Generated Code Velocity
- Concept Map
- Configuration File Structure Does Not Drive Compliance
- Confirmation Gates for Consequential Agent Actions
- Consistent-format customer capture
- Consolidate Agent Tools to Reduce Cognitive Overhead
- Constraint Decay in Backend Code Generation
- Constraint Degradation in AI Code Generation
- Constraint Drift: Why Safety Must Be Maintained, Not Asserted
- Constraint Encoding Does Not Fix Constraint Compliance
- Containment Playbook: npm-to-Signing-Channel Compromise
- Context Budget Allocation: Spending Every Token Wisely
- Context Compression Strategies: Offloading and Summarisation
- Context Engineering (Training Module)
- Context Engineering: The Practice of Shaping Agent Context
- Context Hub: On-Demand Versioned API Docs for Coding Agents
- Context Poisoning: When Hallucinations Become Premises
- Context Priming: Pre-Loading Files for AI Agent Tasks
- Context Window Anxiety: Countering Premature Task Closure
- Context Window Management: Understanding the Dumb Zone
- Context-Injected Error Recovery for AI Agent Development
- Context-Usage Attribution: Per-Source Breakdown of Agent Context
- Contextual Capability Calibration for Multi-Agent Delegation
- Continual Learning for AI Agents: Three Layers of Knowledge Accumulation
- Continuous AI (Agentic CI/CD) for AI Agent Development
- Continuous Agent Improvement: Iterating on Agent Quality
- Continuous Autonomous Task Loop
- Continuous Documentation as an Agent-Driven Practice
- Continuous Triage: Automating Issue Classification with AI Workflows
- Contractual Skill Files: Inspectable SKILL.md for Enterprise Agents
- Control Lexical Leakage in Agent-Memory Retrieval Evals (Entity-Collision)
- Control/Data-Flow Separation for Prompt Injection Defense (CaMeL)
- Controlled Benchmark Rewriting for Agent Safety Judgment
- Controlling Agent Output: Concise Answers, Not Essays
- Convenience Loops and AI-Friendly Code in Your Stack
- Convention Over Configuration in Agent Workflow Design
- Convergence Detection in Iterative Agent Refinement
- Copilot vs Claude Billing Semantics for Enterprise Teams
- Corpus-Level Trace Diagnostics for LLM Agents
- Cost-Aware Agent Design: Route by Complexity, Not Habit
- Cost-Aware Skill Rewriting: Preserve Operational Anchors, Not Skill Tokens
- Cost-Aware Tracing for Skill Distillation
- Coverage-Guided Agents for Fuzz Harness Generation
- Coverage-Guided Fuzzing for Multi-Agent LLM Systems (FLARE)
- Credential Hygiene for Agent Skill Authorship
- Critical Instruction Repetition via Primacy and Recency
- Cross-Component Interference in Agent Scaffolds
- Cross-Cycle Consensus Relay
- Cross-Framework Signal Semantics: Re-Measure Borrowed Trajectory Rules
- Cross-Functional Knowledge Artifacts
- Cross-Lingual Prompt Preprocessing (Local-LLM Token Arbitrage)
- Cross-Repo Agent Search: GitHub-API-Backed Text Search Beyond the Workspace
- Cross-Session Memory Poisoning for Data Exfiltration (Trojan Hippo)
- Cross-Tool Subagent Comparison
- Cross-Tool Translation: Learning from Multiple AI Assistants
- Cross-Vendor Competitive Routing for LLM Selection
- Cryptographic Governance Audit Trail for AI Agents
- Customer-Hosted MCP Tunnel: Outbound-Only Connectivity to Private MCP Servers
- DSPy: Programmatic Prompt Optimization for Compound Agent Systems
- Daily-Use Skill Library: Encoding Your Process as Agent Skills
- Data Fidelity Guardrails: Preventing Agent Data Mutation
- Decentralized Memory for Self-Evolving Multi-Agent Systems
- Declarative Multi-Agent Composition
- Declarative Multi-Agent Topology: Topology-as-Code
- Decomposed Red-Teaming for Agent Monitors
- Deep Agent Runtime: The Layer Beneath the Harness
- Defense-in-Depth Against Coding Agent Fabrication (Honesty Harness)
- Defense-in-Depth Agent Safety for AI Agent Development
- Deferred Standards Enforcement via Review Agents
- Deliberate AI-Assisted Learning: Accelerating Skill Acquisition
- Delta Channels: Bounded Checkpoint Storage for Append-Only Agent State
- Demand-Driven Repository Auditing
- Demo-to-Production Gap: When Demos Hide Real Costs
- Dependency Gap Validation for AI-Generated Code
- Designing Agents to Resist Prompt Injection
- Designing for Agent Consumers (Agent Experience)
- Deterministic Guardrails Around Probabilistic Agents
- Deterministic Orchestration for Structured Modernization
- Developer Attention Management with Parallel Agents
- Developer Control Strategies for AI Coding Agents
- Diff-Based Review: Focus on Changes, Not Complete Outputs
- Direct Prompt Injection via Collaboration (User as Attack Vector)
- Direct-Apply Review Comments via Cloud Agent
- Disable Attribution Headers to Preserve KV Cache in Local Inference
- Discoverable vs Non-Discoverable Context for Agents
- Discovering Indirect Injection Vulnerabilities in Your Agent
- Discovery-Only Refactor Pass: Surface Candidates Before Touching Code
- Discrete Phase Separation
- Distillation-Induced Similarity Metrics for Tool-Use Agents
- Distilled Bootstrap Contract: Agent-Authored Repo Setup
- Distractor Interference: Why Relevance Is Not Enough
- Distributed Computing Parallels in Agent Architecture
- Docker sbx Adoption for Coding Agents
- Documentation-Grounding MCP Servers for Vendor SDKs
- Documentation-Guided Legacy Migration: Architecture Docs as a C-to-Rust Blueprint
- Domain-Specific Agent Challenges
- Domain-Specific System Prompts with Concrete Examples
- Dormant Memory Payloads Triggered by Sensitive Topics (Trojan Hippo)
- Dual-Boundary Sandboxing for Secure Agent Execution
- Dual-Budget Control for Search Agents: VOI Scoring Per Action
- Dual-Graph Alignment for Indirect Prompt Injection Defense (AuthGraph)
- Dual-Trace Memory Encoding: Pair Facts with the Scene They Were Learned In
- Dual-Write Append-Mirror for Agent Transcript Externalization
- Durable Interactive Artifacts: Agent Output Outside the Transcript
- Dynamic System Prompt Composition
- Dynamic Tool Fetching Destroys KV Cache Performance
- Earned-Complexity Agent Maturity Ladder
- Economic Value Signaling in Multi-Agent Networks
- Edit Format Selection: Diff vs. Search-Replace vs. Full Rewrite
- Editor and Manager Surface Separation in Agent IDEs
- Effective Feedback Compute (EFC) for Harness Comparison
- Elastic Context Orchestration: A Per-Turn Vocabulary for Long-Horizon Search Agents
- Embedding Inversion: Vector Stores as a Source-Text Disclosure Surface
- Emergent Architecture in AI-Driven Codebases
- Emergent Behavior Sensitivity for AI Agent Development
- Empirical Baseline: Agentic AI Coding Tool Configuration
- Empowerment Over Automation for AI Agent Development
- Encode Project Conventions in Distributed AGENTS.md Files
- Encoding Tacit Knowledge into Agent Improvement Loops
- Encoding Values in AGENTS.md: Why Prose Without Verification Fails
- Engineering: Tools, Review, Verification, Security, and Observability
- Enterprise Agent Hardening: Three Production Gates
- Enterprise Skill Marketplace: Distribution and Quality
- Entropy Reduction Agents: Automated Codebase Hygiene
- Environment Specification as Context: Closing the Version Gap
- Episodic Memory Retrieval for AI Coding Agent Loops
- Error Preservation in Context for AI Agent Development
- Escape Hatches: Unsticking Stuck Agents
- Eval Awareness: Designing Evals Agents Cannot Recognise
- Eval Engineering (Training Module)
- Eval Strategy by Agent Generation: A Structure-to-Eval Locator
- Eval-Driven Development Training for AI Agent Teams
- Eval-Driven Development: Write Evals Before Building Agent
- Evaluating AGENTS.md: When Context Files Hurt More Than Help
- Evaluating Agent Patterns Catalog as a Source
- Evaluation-Driven Development for Agent Tools
- Evaluator Templates: Portable Primitives for Agent Eval Suites
- Evaluator-Optimizer Pattern for AI Agent Development
- Event Sourcing for Agents: Separating Cognitive Intention
- Event-Driven Agent Routing for Multi-Team AI Pipelines
- Event-Driven System Reminders for AI Agent Development
- Evolving Playbooks: Incremental Context That Preserves Knowledge
- Exception Handling and Recovery Patterns for AI Coding Agents
- Execution Lineage: DAG of Artifacts vs Agent Loops
- Execution-First Delegation: The AI-as-Executor Pattern
- Experience Graphs as Structured Memory for Self-Evolving Agents
- Experiential-Learning Setup Agents with Snapshot Rollback (SetupX)
- External Artifacts Treated as Data, Not Adversarial Input
- Externalization in LLM Agents
- Factory Over Assistant: Orchestrating Parallel Agent Fleets
- Failure-Aware Observability for Multi-Agent LLM Systems
- Failure-Driven Iteration for Improving Agent Workflows
- Fan-Out Synthesis Pattern for AI Agent Development
- Feature List Files for Reliable AI Agent Development
- Feedback as Capability Equalizer: Iterative Feedback Outweighs Model Scale
- File-Based Agent Coordination for AI Agent Development
- Filter and Aggregate Data in the Execution Environment
- First-Party Agent Composition: Agent-Built Features
- Five-Failure-Layers Diagnostic: Attribute Before Swapping the Model
- Five-Pass Blunder Hunt: Repeated Critique Passes for Plans
- Five-Stage Policy Layer Typology for Generalist Agents
- Foresight-Guided Defense Against Infectious Jailbreaks in Multi-Agent Systems
- Forked vs Fresh Subagents: When to Inherit the Parent Conversation
- Formal Process Models as Prompting Scaffolds (Petri Net of Thoughts)
- Foundational Disciplines for AI-Assisted Development
- Foundations: Context Engineering and Instructions
- Four-Layer Taxonomy of Agent Security Risks
- Framework-First Agent Development: An AI Anti-Pattern
- Frameworks
- Frontmost-Window Snapshot as Agent Context
- Frozen Spec File: Preserving Intent in AI Agent Sessions
- Function-Level Debugger Interfaces for Coding Agents
- Functional folder taxonomy
- Future-Based Asynchronous Function Calling
- GEO for Technical Docs: Developer Documentation Checklist
- GROUNDING.md: Field-Scoped Hard Constraints and Convention Parameters
- Gate Agent Writes to Executable Config Files as Privileged Actions
- Generative Agents Memory Stream: Three-Layer Architecture for Long-Running Agent Sessions
- Generative Engine Optimization for Developer Sites
- Generative Provenance Records for Tool-Using Agents
- Getting Started: Setting Up Your Instruction File
- Goal Contract: Separating the Doer from the Done-Checker
- Goal Monitoring and Progress Tracking for Long-Running Agents
- Goal Recitation: Countering Drift in Long Sessions
- Goal Reframing: The Primary Exploitation Trigger for LLM Agents
- Goal-Driven Autonomous Loop with Budget Cap
- Golden Journeys: Restartability as a First-Class Verification Primitive
- Golden Query Pairs as Continuous Regression Tests for Agents
- Google ADK Skills: Portable SKILL.md Across ADK Agents
- Google Search Console Monitoring Workflow
- Governed Sources of Truth for Analytics Agents (Structure Over Access)
- Graceful Tool-Output Truncation: The PARTIAL Signal
- Grade Agent Outcomes, Not Execution Paths
- Grading Strategies for Eval-Driven Development
- Graph of Thoughts: Directed Graph Reasoning for Multi-Path Problems
- Grill Me: Developer-Initiated Plan Interrogation
- Guarding Against URL-Based Data Exfiltration in Agentic Workflows
- Guardrails Beat Guidance: Rule Design for Coding Agents
- HTML as Agent Output Format: When to Ask for HTML Instead of Markdown
- Happy Path Bias: How AI Agents Skip Error Handling
- Hardening Agent Evals for Production-Grade Reliability
- Harness Bug Detection Patterns
- Harness Composition for Scaled Security Audits
- Harness Design Dimensions and Archetypes
- Harness Engineering (Training Module)
- Harness Engineering for Building Reliable AI Agents
- Harness Hill-Climbing: Eval-Driven Iterative Improvement of Agent Harnesses
- Harness Impermanence: Build Scaffolding To Be Deleted
- Harness Preflight Doctor Command for Agent Diagnostics
- Harness-Memory Coupling as a Design Axis
- Headless-First Services: APIs for Agent Consumers
- Heartbeat-Bound Hierarchical Credentials for Agent Swarms
- Held-Out Test Gap: A Long-Horizon Reward-Hacking Signal
- Heuristic-Based Effort Scaling in Agent System Prompts
- Hint-Driven Concurrency for Read-Only MCP Tools
- Hints Over Code Samples in Agent Prompts
- History Anchors: Consistency-Cued Continuation of Unsafe Prior Actions
- Hooks and Lifecycle Events: Intercepting Agent Behavior
- Hooks for Enforcement vs Prompts for Guidance: When to Use Each
- Hostname-Allowlist Proxy: The TLS-Inspection Blind Spot
- How AI Engines Cite — ChatGPT, Perplexity, Claude, Gemini
- How the Four Agent Engineering Disciplines Compound
- Human-AI Review Synergy in Agentic Code Review
- Human-Equivalent Hours for Autonomous Coding Agent Productivity
- Human-Facing Docs in the Agent Era: Mental Models Over Reference
- Human-Review-Driven Curation of Golden Eval Datasets
- Human-in-the-Loop Placement: Where and How to Supervise
- Humans and Agents in Software Engineering Loops
- Hybrid Deterministic + Semantic Authorization for Agent Tool Calls
- Hyper-Personalized Software: The Return of RAD
- Hypothesis-Driven Debugging: Instrument Before You Patch
- Idempotent Agent Operations: Safe to Retry
- Idle-Time Speculative Planning for ReAct Agents
- Improper Output Handling: Validate Agent Output Before Downstream Use
- In-Process WebAssembly Sandboxes for Agent-Generated Code
- In-Thread Side-Channel: Bounded Side Questions Without Losing the Main Task
- Incident Log Investigation Skill: Parallel Queries
- Incident-to-Eval Synthesis: Production Failures as Evals
- Incremental Verification: Check at Each Step, Not at the End
- Independent Test Generation in Multi-Agent Code Systems
- Indexed Regex Search for Agent Tools
- Indiscriminate Structured Reasoning on Every Agent Task
- Inference-Time Tool-Call Reviewer: Pre-Execution Feedback for Tool-Calling Agents
- Initiatives and Community: Tracking the Agentic Engineering Landscape
- Inline Safety Harness with Cascade Verification (FinHarness)
- Instruction Polarity: Positive Rules Over Negative
- Instruction-Guided Code Completion: Controlling What Models Generate
- Interaction-Pattern Evaluation for Agentic PRs
- Interactive Canvases: Agent-Generated Visual Artifacts as Outputs
- Interactive Clarification for Underspecified Tasks
- Interactive Effort Sliders: Per-Turn Reasoning-Budget Controls
- Introspective Skill Generation: Mining Agent Patterns
- Inversion Analysis: Surface Capabilities Competitors Cannot Replicate
- Isometric Harness Ablation: Rank Subsystem Investment by Removing One at a Time
- Issue Requirements Preprocessing: Structured Input Before Code Generation
- Issue-Tracker as Agent Dispatch Surface
- Issue-to-PR Delegation Pipeline for AI Agent Development
- Iterative Binary Feedback for Pattern Adherence
- L0 → L1: Making the Repo Readable
- L1 → L2: Adding Feedback Loops
- L2 → L3: Building Mechanical Enforcement
- L3 → L5: Reaching Agent-First
- LLM Agent Bug Fix Taxonomy: 23 Fix Patterns from 930 Real Bugs
- LLM Code Review Overcorrection for AI Agent Development
- LLM Comprehension Fallacy: When Models Seem to Understand
- LLM Map-Reduce Pattern for Parallel Input Processing
- LLM Refactoring Adoption Patterns
- LLM Self-Review Failure in Code Modernization Tasks
- LLM Static Verification Against Natural-Language Requirements
- LLM-Driven Benchmark Auditing
- LLM-Driven Logical Retrieval: Boolean Queries over an Inverted Index
- LLM-Pinned Library Versions Carry Systemic CVE Exposure
- LLM-as-Judge Evaluation with Human Spot-Checking
- Lane-Based Execution Queueing
- Large-Codebase Coding-Agent Failure Patterns (Sourcegraph Five)
- Law of Triviality in AI PRs for AI Agent Development
- Lay the Architectural Foundation by Hand Before Delegating
- Layered Accuracy Defense for Reliable Agent Outputs
- Layered Context Architecture for AI Agent Development
- Layered Domain Architecture: A Prescriptive Default for Agent-Built Code
- Layered Instruction Scopes: Global, Project, and Directory
- Layered Mutability: Governing Persistent Self-Modifying Agents
- Lead-to-Teammate Plan-Approval Handshake for Multi-Agent Work
- Learned Prefix Monitors for Agent Traces
- Lethal Trifecta Threat Model for AI Agent Development
- Lexical-First Retrieval for Agentic Search: When BM25 Is Enough
- Lifecycle-Integrated Security Architecture for Agent Harnesses
- Live Browser as Agent Context Channel
- Living-Docs-Grounded Agent Design Conversations
- Lock-State Safeguards for Desktop-Controlling Agents
- Long-Running Agents: Durability and Resumability Across Sessions
- Loop Detection for AI Agents: Stopping Micro-Loops
- Loop Strategy Spectrum: Accumulated vs Fresh Context
- Lost in the Middle: Understanding U-Shaped Attention
- MCP Allowlist by Label, Not by Identity (serverName Trap)
- MCP Client Design: Building Robust Host-Side Logic
- MCP Client/Server Architecture Best Practices Guide
- MCP Runtime Control Plane: Policy Evaluation Between Agent and Tool
- MCP Server Design: Building Agent-Friendly Servers
- MCP alwaysLoad: Classifying Servers as Eager or Just-in-Time
- MCP: The Open Protocol Connecting Agents to External Tools
- Machine-Readable Error Responses for AI Agents (RFC 9457)
- Macro Evals for Agentic Systems: Population-Level Behavior Patterns
- Magentic Orchestration: Task-Ledger-Driven Adaptive Multi-Agent Planning
- Making Application Observability Legible to Agents
- Managed vs Self-Hosted Agent Harness: Deployment Trade-offs
- Manual Compaction Strategy for Dumb Zone Mitigation
- Markov-Chain Reliability for LLM Agents: Audit the Abstraction Before You Trust the Metric
- Measuring GEO Performance: Metrics, Monitoring, and Tools
- Measuring Synthetic Eval Data Quality (SynAE)
- Memory Reinforcement Learning (MemRL)
- Memory Synthesis: Extracting Lessons from Execution Logs
- Memory Transfer Learning: Cross-Domain Memory Reuse in Coding Agents
- Memory-Induced Tool-Drift in LLM Agents
- Mermaid as Agent Output Format: When to Ask for a Diagram Instead of Prose
- Meta-Engineering Harness for Production AI-Native Software Delivery
- Method Map: Failure-Mode to Smallest-Artifact Triage
- Mid-Session Config Changes as Invisible Cache Invalidators
- Mid-Trajectory Guardrail Selection for Multi-Step Tool Calls
- Mise en Place for Agentic Coding
- Model Deprecation Lifecycle for Agent Workloads
- Model a Single Agent Turn as Many Inference and Tool-Call
- Model-ID-as-Dependency: Migration Protocol for Deprecation Churn
- Model-Neutral Agent Architecture: Model Portability Over Cloud Portability
- Monolith-to-Sub-Agents Refactor: Five Lessons from a Brittle Prototype
- Monotonic Capability Attenuation for Composition-Safe Tool Use
- Most-Restrictive-Wins Fusion for Parallel Agent Control Returns
- Multi-Agent RAG for Spec-to-Test Automation
- Multi-Agent SE Design Patterns: A Taxonomy Across 94 Papers
- Multi-Agent Topology Taxonomy: Centralised, Decentralised
- Multi-Layer Specification Redundancy as a Robustness Budget
- Multi-Model Plan Synthesis for System Architecture
- Multi-Shape BYOK Provider: Declare API Family per Endpoint
- Multi-Turn Conversation Evaluation: Per-Turn and Trace-Level Scoring Together
- Multitenant RAG: Closing the Relevance-Authorization Gap
- Mutation Testing as a Quality Gate for AI-Generated Test Suites
- Narrative Problem Reformulation for Code Generation
- Natural-Language Customization Bootstrap
- Natural-language git
- Negative Space Instructions: What NOT to Do in Agent Prompts
- Network-less Container + Unix-Socket Egress Proxy for Agent Sandboxes
- Nonstandard Errors in AI Agents: Model-Family Variance
- OAuth Client ID Metadata Documents (CIMD) for MCP Servers
- OWASP LLM Top 10 (2025): Agent Security Crosswalk
- Objective Drift: When Agents Lose Sight of the Goal
- Observability Feedback Loop: A 7-Step Debug Runbook for Agents
- Observability-Driven Harness Evolution
- Observation Contract Preservation in Tool-Augmented Agents
- Observation Masking: Filter Tool Outputs from Context
- Observation-Driven Coordination: CRDT-Based Parallel Agent
- Offline Trajectory Replay for Multi-Agent Workflow Debugging
- One-Shot Record and Deterministic Replay for Periodic Agent Tasks
- Open Agent School Pattern Mapping for Practitioners
- Open Standards and Protocols for AI Agent Development
- OpenAI Agents SDK Sandboxes Harness and Memory
- OpenAPI Documentation Smells for Agent-Ready APIs
- OpenAPI as the Source of Truth for Agent Tool Definitions
- OpenTelemetry for AI Agent Observability and Tracing
- Opponent Processor / Multi-Agent Debate Pattern
- Oracle Poisoning: Knowledge Graph Corruption Against Tool-Using Agents
- Oracle-Based Task Decomposition for AI Agent Development
- Orchestrator-Worker Pattern for AI Agent Development
- Overeager-Behavior Elicitation: Scope + Trap Fragments as a Diagnostic for Out-of-Scope Tool Calls
- Override Pattern: Reusing Interactive Commands in Automated Pipelines
- PASS@(k,T): Evaluate RL for Agents Along Sampling and Interaction Depth
- PEEK: Orientation Cache for Recurring-Context Agents
- PII Tokenization in Agent Context
- PM on the AI Exponential
- PR Description Style as a Lever for Agent PR Merge Rates
- PR Scope Creep as a Human Review Bottleneck
- Parallel Agent Sessions Shift the Bottleneck from Writing
- Parallel Polyglot Ports as a Spec-Ambiguity Oracle
- Parameter-Keyed Caching and Dependency-Aware Parallelism for Plan-Execute Pipelines
- Parsimonious Agent Routing for Multi-Agent Dispatch
- Pattern Replication Risk in Agentic Code Generation
- Pattern Selection Map: Trade-off Matrix for This Site's Patterns
- Per-Call Budget Hints on Tool Invocations
- Per-Model Harness Tuning: Treating the Backing Model as a Harness Variable
- Per-Server MCP Environment Scoping for Credential Isolation
- Per-Tool Extended Reasoning Opt-In: Tool-Call-Scoped Budgets
- Per-User Supervisor Process for Background Agent Sessions
- Perceived Model Degradation: Why Vibes Are Not Evals
- Permission Framework Choice Outweighs Model Choice for Limiting Overeager Actions
- Permission-Gated Custom Commands for AI Agent Development
- Permutation Frameworks for Batch Code Generation
- Persistent Shared Search Sub-Agent for Output-Token Reuse
- Persistent-Connection Agent Transport
- Persona-as-Code: Defining Agent Roles as Structured Docs
- Phantom Symbol Detection for LLM API Migration
- Phase-Specific Context Assembly for AI Agent Development
- Plan Compliance in Agents: Measure What They Execute, Not What You Wrote
- Plan files as resumable artifacts
- Plan-Then-Execute as the Default for Web Agents
- Planted-Bug Methodology: Deliberate Bugs as Observability Calibration
- Plugin and Extension Packaging: Distributing Agent Capabilities
- Poka-Yoke for Agent Tools: Mistake-Proof Tool Interfaces
- Polya Small-Steps: Using AI to Think Better, Not Think Less
- Portable Agent Definitions: Full-Stack Identity as Code
- Pre-Change Impact Analysis: Dependency Maps That Prevent Agent Regressions
- Pre-Completion Checklists for AI Agent Development
- Pre-Execution Codebase Exploration for AI Coding Agents
- Pre-Trust Execution Surface in Coding Agent Harnesses
- Prebuilt Agent Environments: Amortising Cloud Agent Cold Start with Custom Container Images
- Prebuilt Agent Monitoring Dashboard
- Precise Debugging: Measure Edit Precision, Not Just Test Pass Rate
- Predicting Which AI-Generated Functions Will Be Deleted
- Premature Completion: Agents That Declare Success Too Early
- Prior Dominance Over Feedback in Agent Optimization Loops
- Privacy-Preserving LLM Requests: Eight Techniques and a Practical Combination
- Proactive Idle-Time Anticipation (ProAct)
- Process Amplification: Scaling Human Work with Agents
- Product-as-IDE: When the Application Becomes the Development
- Production MCP Agent Stack: Sequencing Six Decisions into One Deployment
- Programmatic Cloud-Agent Dispatch via REST API and Webhooks
- Progressive Autonomy: Scaling Trust with Model Evolution
- Progressive Disclosure for Layered Agent Definitions
- Progressive Spend Threshold Alerting for Agent Cost Governance
- Project Instruction File Ecosystem
- Project Writing Skill: House Style as Model-Invocable Skill
- Prompt Cache Economics: Comparing Costs by Provider
- Prompt Caching: Architectural Discipline for Agents
- Prompt Chaining: Sequential LLM Calls for Agent Workflows
- Prompt Compression: Maximizing Signal Per Token
- Prompt Engineering for Agent Instructions and Systems
- Prompt File Libraries for Reusable Agent Instructions
- Prompt Governance via PRs: Reviewable AI Behaviour
- Prompt Injection: A First-Class Threat to Agentic Systems
- Prompt Layering: How Instructions Stack and Override
- Prompt as Security Knob
- Prompt-Only Tool Access Control
- Prompt-Rewrite Discipline on Cross-Generation Model Migration
- Protecting Sensitive Files from Agent Context Access
- Prototype Before Optimizing: Establish Quality Baselines Before Token Constraints
- Provenance-Aware Decision Auditing for LLM Agents
- Public-Channel Agent Work as Lehrwerkstatt for Team Learning
- Push-Event MCP Channels: Inverting the Pull-Tool Polarity
- QA Session to Issues Pipeline for AI Agent Development
- Quality Score Rubric and Simplification Log for Agent Harnesses
- RAG Architecture as a Poisoning Robustness Decision
- RAG over Thinking Traces: Index Reasoning Trajectories Instead of Documents
- RAG/Agent Reliability Problem Map: 16-Domain Failure Taxonomy
- RL-Trained Automated Red Teamers for Prompt Injection Discovery
- Rainbow Deployments for Agents: Gradual Version Migration
- Reasoning Budget Allocation: The Reasoning Sandwich
- Recursive Best-of-N Delegation
- Red-Green-Refactor with Agents: Tests as the Spec
- Refactoring Runaway: Tangled Refactorings in Agent Patches
- Reference: Standards, Human Factors, Emerging, Fallacies, and Articles
- Reflective Prompt Evolution with Pareto Selection (GEPA)
- Remote Agent Host Sessions over SSH and Dev Tunnels
- Remote Session Control for Local CLI Agents
- Repository Bootstrap Checklist: Wiring Agent Support
- Repository Map Pattern: AST + PageRank for Dynamic Code
- Repository Perturbation as Context-Reasoning Diagnosis (RepoMirage)
- Repository-Level Retrieval for Code Generation
- Reproduce-Before-Report Verification Gate
- Restricted-Access Defensive AI: Project Glasswing as a Deployment Model
- Retrieval-Augmented Agent Workflows: On-Demand Context
- Reverse-Engineered Executable Specifications for Agentic Program Repair
- Review-Feedback-to-Rule Loop: Promoting Recurring PR Comments into Harness Rules
- Review-Then-Apply CLI Flag for In-Process Auto-Fix
- Reviewer's Playbook for Agent-Authored Pull Requests
- Rigor Relocation: Engineering Discipline with AI Agents
- Risk-Based Shipping: Review by Risk Matrix, Not by Default
- Risk-Based Task Sizing for Agent Verification Depth
- Risk-Score Threshold Calibration for Auto-Approval
- Role Orchestration on a Single Model
- Rollback-First Design: Every Agent Action Should Be Reversible
- RubricRefine: Pre-Execution Rubric Refinement for Code-Mode Tool Use
- Rule Lifecycle Metadata for Prunable Instruction Surfaces
- Run-Status vs Task-Status Confusion in Autonomous Agent Runs
- Runbooks as Agent Instructions: Agent-Followable Ops
- Runnable Documentation as Agent Verification
- Runtime Harness Adaptation: Four-Layer Interface Fixes for Frozen LLM Agents
- Runtime Scaffold Evolution: Agents That Build Tools
- SDLC-Phase Skill Taxonomy: Full-Lifecycle Skill Libraries
- SEO vs GEO — Signals, Metrics, Optimization Targets
- SKILL.md Frontmatter Reference: All Fields Explained
- SUDP: Secret-Use Delegation Protocol for Agentic Systems
- Safe Outputs Pattern for Trustworthy Agent Responses
- Sandbox + Approvals + Auto-Review Governance Triad
- Sandboxed Coding Environments: Containers vs MicroVMs vs OS-Level Isolators
- Scaffold Architecture Taxonomy for Coding Agents
- Scanner-as-MCP-Server: Secret and Dependency Scans as Typed Agent Tools
- Scheduled Instruction File Fact-Checker for Accuracy
- Schema and Structured Data for GEO — AI Citation Guide
- Schema-Guided Graph Retrieval
- Scope Sandbox Rules to Harness-Owned Tools, Not Third-Party
- Scoped Credentials via Proxy Outside the Agent Sandbox
- Scoped MCP Server Discovery: Most-Specific-Wins Resolution
- Seamless Background-to-Foreground Handoff
- Secrets Management for AI Agents: Credential Injection
- Security Budget as Token Economics
- Security Constitution for AI Code Generation
- Security Drift in Iterative LLM Code Refinement
- Seeding Agent Context: Breadcrumbs in Code
- Selective Autonomy from Copilot Feedback
- Selective Checkpoint Restore Across Code and Conversation State
- Selective Network Access in Agent Sandboxes: The allowNetwork Pattern
- Selective Rewind Summarization: Compress Earlier Turns, Keep Recent Ones Intact
- Self-Discover Reasoning: LLM-Composed Reasoning Structures
- Self-Explanation Loop
- Self-Healing Production Agent: Automated Regression Detection and Autofix PR
- Self-Healing Tool Routing
- Self-Rewriting Meta-Prompt Loop
- Semantic Caching for Multi-Agent Code Systems
- Semantic Context Loading: Language Server Plugins for Agents
- Semantic Density Optimization for Agent Codebases
- Semantic Intent Validation for Agent Skills
- Semantic Tool Output: Designing for Agent Readability
- Sensitive Terminal Prompt Interception
- Separation of Knowledge and Execution in Agent Systems
- Session Harness Sandbox Separation for Long-Running Agents
- Session Initialization Ritual: How Agents Orient Themselves
- Session Recap: Goal-Shaped Handoff at Context Boundaries
- Shadow Tech Debt Created by Autonomous AI Agent Commits
- Shared Context Bundle Registry for Agent Teams
- Signal Over Volume in AI Review for AI Agent Development
- Simulation and Replay Testing for Agent Verification
- Single-Branch Git for Agent Swarms: A Trade-Off Pattern
- Single-CLI Agent Platform: Create to Production in One CLI
- Single-Layer Prompt Injection Defence Anti-Pattern
- Six-Shape Approval Response Taxonomy: Beyond Binary Allow/Deny
- Skeleton Projects as Agent Scaffolding
- Skill Atrophy: When AI Reliance Erodes Developer Capability
- Skill Authoring Patterns: Description to Deployment
- Skill Context Isolation: Forking the Skill into a Subagent Window
- Skill Evals: Measuring Skill Quality as a Dataset-Graded Unit
- Skill Library Evolution: Lifecycle Governance for Agents
- Skill Library Refinement Loops: Organisational Feedback for Shared Skills
- Skill Library Technical Debt: Library-Time Maintenance for Agent Skills
- Skill Program Functions: Executable Guardrails Compiled From Past Failures
- Skill Retrieval Realism Gap
- Skill Specification Violation Fuzzing
- Skill Supply-Chain Poisoning
- Skill Tool as Enforcement: Loading Command Prompts at Runtime
- Skill as Instruction Surface and Callable API (Interpreter Skills)
- Skill as Knowledge Pattern for AI Agent Development
- Slopsquatting: Hallucinated Package Names as a Supply-Chain Vector
- Solver-Externalized Constraint Reasoning (MaxSAT/SMT Encoding)
- Source Code Minification for State-in-Context Agents
- Source-Grounded Test Plan with Pre-Action Assertion Annotation
- Spec Complexity Displacement: When Specs Become Code
- Spec-Derived Execution as a Correctness Oracle
- Spec-Driven Development with Spec Kit
- Specialist Orchestrated Queuing for Multi-Agent SE (SPOQ)
- Specialized Agent Roles for Effective AI Pipelines
- Specialized Small Language Models as Agent Sub-Tools
- Sprint Contracts: Pre-Coding Success Agreements for Multi-Agent Tasks
- Staged Evidence Gates for Agentic Program Repair
- Staggered Agent Launch: Preventing Thundering-Herd in Swarms
- Stakeholder Trust Through Evals and Observability
- Stale AI Configuration Artifacts (Context Rot)
- Stale Repository Retrieval Induces Incorrect Code
- Standards as Agent Instructions for AI Agent Development
- Steering Running Agents: Mid-Run Redirection and Follow-Ups
- Step-by-Step: Building Your First Eval-Driven Feature
- Stochastic-Deterministic Boundary as First-Class Contract
- Strained Coherence as a Pre-Failure Signal in Agent Trajectories
- Strategy Over Code Generation: Why AI Speed Doesn't Fix Wrong Goals
- Structural Coverage Criteria for Agent Workflows
- Structure Prompts with Static Content First to Maximize Cache Hits
- Structure-Aware Diff Labeling with Two-Stage LLM Pipelines
- Structured Agentic Software Engineering (SASE)
- Structured Domain Retrieval: Knowledge Graphs and Case-Based Reasoning
- Structured Output Constraints: Reducing Hallucination
- Structured Task Graphs as External Agent Memory (Beads)
- Sub-Agents for Fan-Out Research and Context Isolation
- Subagent Schema-Level Tool Filtering for AI Agents
- Subtask-Level Memory for Software Engineering Agents
- Sufficiency-Tightness Decomposition for Agent-Authored Permissions
- Suggestion Gating: Fewer Completions, Better DX
- Swarm Migration Pattern
- Swarm Skills: Multi-Agent Extension of the Agent Skills Standard
- Symphony: Open Spec for Issue-Tracker-Driven Coding Agent Orchestration
- Symptom-Reduction-as-Root-Cause: Why Oracle Tests Alone Miss Architectural Drift
- Syntax-Aware Completion Routing for Local-First Code Models
- System Prompt Altitude: Specific Without Being Brittle
- System Prompt Replacement for Domain-Specific Agent Personas
- System Prompt as Secret Store (OWASP LLM07)
- System-Level Optimization Pipeline
- Task Alignment: The Selective-Compliance Gap Benchmarks Miss
- Task Feasibility Awareness: Stop Before You Start
- Task List Divergence as Instruction Quality Diagnostic
- Task-Based Access Control with Hybrid Inspection
- Task-Specific Agents vs Role-Based Agents
- Team OS: Coding-Agent Repo as Cross-Functional Team Brain
- Team Onboarding for AI Agent Workflows and Adoption
- Temporary Compensatory Mechanisms in Agent Harnesses
- Terminal Tools for Agents: send_to_terminal and Background Interaction
- Test Evolution Blind Spot in Coding Agents
- Test Harness Design for LLM Context Windows
- Test-Driven Agent Development: Tests as Spec and Guardrail
- Test-Driven Intent Clarification: Tests as Intermediate Alignment Artifacts
- The 7 Phases of AI-Assisted Feature Development
- The AI Development Maturity Model: From Skeptic to Agentic
- The AX Stack: A Layered Model of an AI Coding Agent's Prompt-to-Compile Path
- The Addictive Flow State of Agent-Assisted Development
- The Agent Stack Bet: Architectural Decisions for Production Agents
- The Anthropomorphized Agent for AI Agent Development
- The Bottleneck Migration When Humans Supervise Agents
- The Consistent Capability Fallacy in LLM Agent Design
- The Context Ceiling -- Where AI Fails Expert Architects
- The Copy-Paste Agent Anti-Pattern in AI Development
- The Delegation Decision: When to Use an Agent vs Do It Yourself
- The Effortless AI Fallacy for AI Agent Development
- The Eval-First Development Loop for AI Agent Features
- The Implicit Knowledge Problem for AI Coding Agents
- The Infinite Context Anti-Pattern in Agent Systems
- The Instruction Compliance Ceiling: How Rule Count Limits AI
- The Kitchen Sink Session Anti-Pattern in AI Agents
- The LLM Laziness Deficit Fallacy: Restraint Comes From Harness, Not Instruction
- The Plan-First Loop: Always Design Before Writing Code
- The Productivity-Experience Paradox in AI-Assisted Development
- The Prompt Tinkerer Anti-Pattern in Agent Workflows
- The Ralph Wiggum Loop: Fresh-Context Iteration Pattern
- The Reasoning-Complexity Trade-off
- The Research-Plan-Implement Pattern
- The Security Review Gap in AI-Authored PRs
- The Specification as Prompt: Existing Artifacts as Agent
- The Synthetic Ground Truth Fallacy in Agent Evaluation
- The Task Framing Irrelevance Fallacy in Agent Prompting
- The Test Homogenization Trap: When LLM-Generated Tests Mirror Model Blind Spots
- The Think Tool: Mid-Stream Reasoning for AI Agents
- The Yes-Man Agent: Compliance Without Verification
- Three Knowledge Tiers: Sourced, Unverified, Hallucinated
- Three Reasoning Spaces: Plan, Bead, and Code
- Three-Depth In-Session Security Review
- Three-Vector Evasion Taxonomy for Agent Security Tests
- Throwaway-Prototype Skill: Build to Discard, Keep Only the Answer
- Tiered Code Review: AI-First with Human Escalation
- Tiered Memory Architecture: Episodic-to-Semantic Consolidation for Long-Running Agents
- Token Preservation Backfire for AI Agent Development
- Token-Cost Profiling and Reduction for Always-On Agentic Workflows
- Token-Efficient Code Generation: Structural Beats Prompting
- Token-Efficient Tool Design: Tools That Don't Eat Your Context
- Tokenizer Swap Tax: Budgeting for Model Migrations That Change Token Counts
- Tool Calling Schema Standards for AI Agent Development
- Tool Cloning and Provenance Assessment in Agent Ecosystems
- Tool Confirmation Carousel: Batched UI for Per-Call Approvals
- Tool Description Quality for Effective Agent Guidance
- Tool Engineering (Training Module)
- Tool Engineering Principles for AI Agent Development
- Tool Minimalism and High-Level Prompting
- Tool Necessity Probing: Reading Tool-Call Decisions From Hidden States
- Tool Preamble: User-Visible Status Updates Before Tool Calls
- Tool Signing and Signature Verification for Agents
- Tool-Invocation Attack Surface in Coding Agents
- Tool-Use Sim-to-Real Perturbation Taxonomy
- Toolset Agentization: Wrapping Co-Used Tools as Sub-Agents
- Topical Authority — Entity Coverage for AI Citation
- Traces Need Feedback to Power Learning
- Trajectory Decomposition: Diagnose Where Coding Agents Fail
- Trajectory Logging via Progress Files and Git History
- Trajectory Pre-Filter for Failure Diagnosis (TrajAudit)
- Trajectory-Opaque Evaluation Gap: Why Final-Output Grading Misses Safety Violations
- Transcript-Driven Permission Allowlist
- Treat Task Scope as a Security Boundary
- Treating Agent Safety as Uniform Across a Session (Cold-Start Safety Gap)
- Trust Without Verify: Skipping Agent Output Checks
- Trusting Human Review to Catch Deliberate Agent Sabotage
- Trusting Tool Error Messages as Implicit Authority (Error-Path Injection)
- Typed Memory from VCS History: When It Pays Off
- Typed Schemas at Agent Boundaries for Multi-Agent Systems
- Ubiquitous Language for AI Plans
- Unbounded Consumption: Bounding Agent Resource Use Against DoS and Denial-of-Wallet
- Unix CLI as the Native Tool Interface for AI Agents
- Usability Pressure as a Silent Security-Regression Vector
- Use a Public-Web Index to Gate Automatic URL Fetching
- Using the Agent to Analyze Its Own Evaluation Transcripts
- Utility-Model Split: Background Tasks on a Cheaper Model
- VS Code Agents App: Agent-Native Parallel Task Execution
- Validating Token-Optimized Formats Inside Agentic Loops
- Variance-Based RL Sample Selection
- Velocity-Quality Asymmetry: Why AI Speed Gains Fade
- Verification Ledger for Tracking Agent Output Quality
- Verification-Centric Development for AI-Generated Code
- Verify-Gated Completion as Admission Control
- Verifying LLM-Generated Cryptographic Code
- Vibe Coding: Outcome-Oriented Agent-Assisted Development
- Visible Thinking in AI-Assisted Development
- Voting / Ensemble Pattern for AI Agent Development
- WIP=1 and Little's Law: Kanban Throughput Theory for Agent Task Design
- WRAP Framework for Writing Agent-Ready Issue Descriptions
- Web Search Agent Loop: Iterative Research Patterns
- WebMCP: Browser-Hosted Tool Contracts for In-Page AI Agents
- What Evals Are and Why AI Agents Need Them for Quality
- What is GEO — Generative Engine Optimization Defined
- When to Use Examples vs Rules in Agent Instructions
- Windows Sandboxing for Coding Agents
- Workflows for AI Agent Development
- Worktree Isolation: Parallel Agent Sessions in Safe Sandboxes
- Write Tool Descriptions as Agent Onboarding Documents
- Writing Your First Agent Evaluation Suite from Scratch
- bypassPermissions Silently Overrides allowedTools (The Restricted-Bypass Trap)
- llms.txt: Full Specification, Adoption, and Limitations
- llms.txt: Making Your Project Discoverable to AI Agents
- pass@k and pass^k: Capability and Consistency Metrics
tool-engineering¶
- Advanced Tool Use: Scaling Agent Tool Libraries
- Agent-Aware CLI Behaviour via Environment Variable
- Agent-Computer Interface (ACI): Tool Design as UX Discipline
- Agent-Discoverable Slash Commands
- Batch File Operations via Bash Scripts for AI Agents
- Browser Automation as a Research Tool: Bypassing Bot Detection
- CLI Scripts as Agent Tools: Return Only What Matters
- CLI-First Skill Design
- Closed-Loop Agent Training from Tool Schemas
- Code Interpreter as a Primary Agent Tool
- Codebase-Derived Pattern Libraries as Agent Context
- Conditional Hook Execution: Filter Hooks by Tool Pattern
- Consolidate Agent Tools to Reduce Cognitive Overhead
- Cross-IDE Plugin Discovery: One Install Surface, Many Consuming Agents
- Cross-Repo Agent Search: GitHub-API-Backed Text Search Beyond the Workspace
- Designing for Agent Consumers (Agent Experience)
- Documentation-Grounding MCP Servers for Vendor SDKs
- Edit Format Selection: Diff vs. Search-Replace vs. Full Rewrite
- Effort-Aware Hooks: Reading the Reasoning Tier from PreToolUse and PostToolUse
- Engineering: Tools, Review, Verification, Security, and Observability
- Filesystem-Based Tool Discovery for AI Agent Development
- Function-Level Debugger Interfaces for Coding Agents
- Future-Based Asynchronous Function Calling
- Google ADK Skills: Portable SKILL.md Across ADK Agents
- Graceful Tool-Output Truncation: The PARTIAL Signal
- Headless-First Services: APIs for Agent Consumers
- Hint-Driven Concurrency for Read-Only MCP Tools
- Hook Catalog for Claude Code Enforcement
- Hook Exec Form vs Shell Form: Shell-Injection-Safe Hook Commands
- Hooks Invoking MCP Tools: Closing the Loop Between Policy and Tool Execution
- Hooks and Lifecycle Events: Intercepting Agent Behavior
- Indexed Regex Search for Agent Tools
- Lexical-First Retrieval for Agentic Search: When BM25 Is Enough
- Local Plugin Scaffolding via `claude plugin init` and Auto-Loaded `.claude/skills`
- MCP Client Design: Building Robust Host-Side Logic
- MCP Client/Server Architecture Best Practices Guide
- MCP Elicitation: Servers Requesting Structured Input Mid-Task
- MCP LLM Sampling: Servers Requesting AI Inference Mid-Tool
- MCP Server Design: Building Agent-Friendly Servers
- MCP Tool Result Persistence via _meta Annotation
- MCP alwaysLoad: Classifying Servers as Eager or Just-in-Time
- Machine-Readable Error Responses for AI Agents (RFC 9457)
- Managing Agent Skills from the GitHub CLI with gh skill
- MessageDisplay Hook: Transforming Assistant Text at the Display Boundary
- Natural-Language Customization Bootstrap
- On-Demand Skill Hooks: Session-Scoped Guardrails via Skill Invocation
- One-Shot Record and Deterministic Replay for Periodic Agent Tasks
- OpenAPI Documentation Smells for Agent-Ready APIs
- Oracle Poisoning: Knowledge Graph Corruption Against Tool-Using Agents
- Out-of-Band Hook Notifications via terminalSequence
- Override Pattern: Reusing Interactive Commands in Automated Pipelines
- Plugin Background Monitors: Declarative Supervision Auto-Armed at Session Start
- Plugin-Activated Main-Agent Override and Bin/ PATH Injection
- Poka-Yoke for Agent Tools: Mistake-Proof Tool Interfaces
- PostToolBatch Hook: Once-Per-Decision-Cycle Injection at the Batch Boundary
- PostToolUse Hook for BSD/GNU CLI Incompatibilities
- PostToolUse Output Replacement: Hooks That Rewrite Tool Results
- PostToolUse continueOnBlock: Refusal With a Load-Bearing Reason
- PowerShell Tool: Native Windows Shell for Claude Code
- Pre-Execution Risk Classification for Terminal Commands
- PreCompact Hook: Vetoing Compaction at Lifecycle Boundaries
- Production MCP Agent Stack: Sequencing Six Decisions into One Deployment
- Project Writing Skill: House Style as Model-Invocable Skill
- Proprietary-to-Open-Standard Tool Migration (Copilot Extensions to MCP)
- Push-Event MCP Channels: Inverting the Pull-Tool Polarity
- Reactive Environment Hooks: CwdChanged and FileChanged
- Reloading Skills Mid-Session in Claude Code
- Runtime Scaffold Evolution: Agents That Build Tools
- SKILL.md Frontmatter Reference: All Fields Explained
- Scanner-as-MCP-Server: Secret and Dependency Scans as Typed Agent Tools
- Scoped MCP Server Discovery: Most-Specific-Wins Resolution
- Self-Healing Tool Routing
- Semantic Tool Output: Designing for Agent Readability
- Skill Authoring Patterns: Description to Deployment
- Skill Context Isolation: Forking the Skill into a Subagent Window
- Skill Library Evolution: Lifecycle Governance for Agents
- Skill Library Technical Debt: Library-Time Maintenance for Agent Skills
- Skill Shell Execution Gate: Disabling Inline Shell from Skills
- Skill Tool as Enforcement: Loading Command Prompts at Runtime
- Skill as Instruction Surface and Callable API (Interpreter Skills)
- Skill as Knowledge Pattern for AI Agent Development
- StopFailure Hook: Observability for API Error Termination
- Terminal Tool Output Compression: Filtering Predictable Noise at the Harness
- Terminal Tools for Agents: send_to_terminal and Background Interaction
- Token-Efficient Tool Design: Tools That Don't Eat Your Context
- Tool Cloning and Provenance Assessment in Agent Ecosystems
- Tool Description Quality for Effective Agent Guidance
- Tool Engineering Principles for AI Agent Development
- Tool Engineering: Designing and Managing AI Agent Tooling
- Tool Minimalism and High-Level Prompting
- Tool Necessity Probing: Reading Tool-Call Decisions From Hidden States
- Tools: Claude Code, Cursor, and GitHub Copilot
- Toolset Agentization: Wrapping Co-Used Tools as Sub-Agents
- Unix CLI as the Native Tool Interface for AI Agents
- Video Transcript Skill: Converting Recordings to Docs
- Web Search Agent Loop: Iterative Research Patterns
- Write Tool Descriptions as Agent Onboarding Documents
training¶
- Air-Gapped RAG Training Series
- Air-Gapped RAG: Architecture Fundamentals
- Air-Gapped RAG: Chunking Strategies
- Air-Gapped RAG: Deployment, Operations, and Compliance
- Air-Gapped RAG: Document Ingestion and Parsing
- Air-Gapped RAG: Grounding, Citations, and Evaluation
- Air-Gapped RAG: Local Embeddings and Vector Stores
- Air-Gapped RAG: Local LLM Inference
- Air-Gapped RAG: Overview and When to Use It
- Air-Gapped RAG: Retrieval and Re-Ranking
- Autonomous Research Loops: Loops That Know When to Stop
- Context Engineering (Training Module)
- Earned-Complexity Agent Maturity Ladder
- Eval Engineering (Training Module)
- Eval-Driven Development Training for AI Agent Teams
- Foundational Disciplines for AI-Assisted Development
- GitHub Copilot Advanced Patterns: Multi-Agent and Automation
- GitHub Copilot Platform Surface Map: All Capabilities
- GitHub Copilot Training Modules for Engineering Teams
- GitHub Copilot: Context Engineering & Agent Workflows
- GitHub Copilot: Customization Primitives and Stack
- GitHub Copilot: Harness Engineering for Agent-Ready Code
- GitHub Copilot: Model Selection, Routing, and Costs
- GitHub Copilot: Team Adoption and Governance Guide
- Grading Strategies for Eval-Driven Development
- Hardening Agent Evals for Production-Grade Reliability
- Harness Engineering (Training Module)
- How the Four Agent Engineering Disciplines Compound
- L0 → L1: Making the Repo Readable
- L1 → L2: Adding Feedback Loops
- L2 → L3: Building Mechanical Enforcement
- L3 → L5: Reaching Agent-First
- Prompt Engineering for Agent Instructions and Systems
- Step-by-Step: Building Your First Eval-Driven Feature
- The Eval-First Development Loop for AI Agent Features
- Tool Engineering (Training Module)
- Training Modules
- What Evals Are and Why AI Agents Need Them for Quality
- Writing Your First Agent Evaluation Suite from Scratch
workflows¶
- AI Adoption Footprint: The Segmented Shape of Engineering Orgs
- AI Bot CI/CD Workflow Reliability by Agent
- AI Crawler Policy: robots.txt for the Three-Tier Crawler Landscape
- AI Slop as a Process Problem: Encoding Quality Standards as Pipeline Gates
- AI-Powered Vulnerability Triage for AI Agent Development
- AST-Grounded Critic Loop for Documentation Maintenance
- Acknowledged-Debt Ledger with Next-Trigger Conditions
- Adversarial Multi-Model Development Pipeline (VSDD)
- Agent Chat History as a First-Class Artifact
- Agent Commit Attribution: Signed Commits and Agent Identity
- Agent Debug Log Panel: Chronological Event Inspection for Session Debugging
- Agent Debugging: Diagnosing Bad Agent Output
- Agent Development Lifecycle for Agent Products
- Agent Environment Bootstrapping for AI Agent Development
- Agent Governance Policies for AI Agent Development
- Agent Harness: Initializer and Coding Agent Pattern
- Agent Loop Middleware — Safety Nets and Message Injection
- Agent Mission Control for Orchestrating Agent Tasks
- Agent Observability with OpenTelemetry and Trajectory Logging
- Agent PR Volume vs. Value: The Productivity Paradox
- Agent Project State Purge: Clean-Slate Session Reset
- Agent-Authored PR Integration and Merge Predictors
- Agent-Driven Greenfield Product Development from Scratch
- Agent-Driven PR Slicing
- Agent-Generated Onboarding Guide as a Durable Artefact
- Agent-Laundered Bug Reports
- Agent-Led Dev-Environment Iteration with Validation and Rollback
- Agent-Powered Codebase Q&A and Onboarding Workflow
- Agent-Proposed Merge Resolution
- Agentic Education: Persona Progression for Teaching AI Coding Tools
- Agentic Flywheel: Building Self-Improving Agent Systems
- Agentic-Agile: Adapting Agile Rituals for Agent Work
- Air-Gapped RAG Training Series
- Air-Gapped RAG: Deployment, Operations, and Compliance
- Air-Gapped RAG: Document Ingestion and Parsing
- Air-Gapped RAG: Overview and When to Use It
- Answer-First Writing: Structure Content for AI Retrieval
- Architecting a Central Repo for Shared Agent Standards
- Assertion Density — Stats and Quotes Over Vague Claims
- Atomic Pages and Chunking — One Concept Per Page for RAG
- Auto-Triage Workflow: Bug-Monitoring Agent that Connects Related Reports and Opens Fix PRs
- Autonomous Research Loops: Loops That Know When to Stop
- Backlog Triage as a Named Agent Skill
- Batched Suggestion Application: Bulk-Apply Agent Fixes on PRs
- Bootstrapping an Agent-Driven Project from Scratch
- Brownfield to Agent-First: Repo Maturity Framework
- Browser Automation as a Research Tool: Bypassing Bot Detection
- Burn the Boats — Commitment-Forcing Deprecation
- CARE: Three-Party Stage-Gated Agent Engineering
- CLI Scripts as Agent Tools: Return Only What Matters
- CLI-IDE-GitHub Context Ladder for AI Agent Development
- Canary Rollout for Agent Policy Changes
- Chain-of-Verification for Coding Agents
- Channels Permission Relay
- Chat-Platform Agent Delegation: Invoking Cloud Coding Agents from Team Channels
- Classical SE Patterns as Agent Design Analogues
- Classifier-Subagent Run Mode for Per-Call Permission Routing
- Classifying and Auto-Correcting Coding Agent Misbehaviors (Wink)
- Claude Code --bare Flag
- Claude Code /batch and Worktrees for AI Agent Development
- Claude Code Review
- Claude Code ↔ Copilot CLI: Changelog-Driven Feature Parity
- Clock-In / Clock-Out Protocol: Bracketed Session Continuity
- Closed-Loop Agent Training from Tool Schemas
- Closed-Loop CI Failure Remediation with Cloud Coding Agents
- Closed-Loop Role-Based Refinement for Agent Systems
- Cloud Parallel Review Pattern
- Cloud Planning with Inline-Comment Review and Execute-Anywhere Choice
- Cloud-Agent Session Bootstrap: Cached Install plus Per-Session Start
- Cloud-Agent Three-Layer State Decoupling
- Cloud-Local Agent Handoff for AI Agent Development
- Code Cleanliness as an Agent Cost Lever
- Coding Agent Scope Expansion: When to Extend Beyond the Codebase
- Coding-Agent Misalignment Forms (Seven-Symptom Taxonomy)
- Coding-Agent Reversibility: Platform Choice as a Two-Way Door
- Committee Review Pattern for Multi-Agent Code Review
- Compound Engineering: Systematic Agent Learning Loop
- Conditional Hook Execution: Filter Hooks by Tool Pattern
- Consistent-format customer capture
- Containment Playbook: npm-to-Signing-Channel Compromise
- Context-Injected Error Recovery for AI Agent Development
- Continual Learning for AI Agents: Three Layers of Knowledge Accumulation
- Continuous AI (Agentic CI/CD) for AI Agent Development
- Continuous Agent Improvement: Iterating on Agent Quality
- Continuous Autonomous Task Loop
- Continuous Documentation as an Agent-Driven Practice
- Continuous Triage: Automating Issue Classification with AI Workflows
- Convention Over Configuration in Agent Workflow Design
- Convergence Detection in Iterative Agent Refinement
- Copilot CLI Agentic Workflows for AI Agent Development
- Copilot CLI BYOK and Local Model Support
- Copilot Cloud Agent Three-Phase Execution Model
- Copilot Unified Sessions View and CLI Agent in JetBrains IDEs
- Cross-Functional Knowledge Artifacts
- Cursor Multi-Root Workspaces for Cross-Repo Agent Edits
- Daily-Use Skill Library: Encoding Your Process as Agent Skills
- Deliberate AI-Assisted Learning: Accelerating Skill Acquisition
- Dependabot Agent Assignment for AI-Driven Vulnerability Remediation
- Deterministic Orchestration for Structured Modernization
- Developer Control Strategies for AI Coding Agents
- Direct-Apply Review Comments via Cloud Agent
- Discovery-Only Refactor Pass: Surface Candidates Before Touching Code
- Distilled Bootstrap Contract: Agent-Authored Repo Setup
- Documentation-Guided Legacy Migration: Architecture Docs as a C-to-Rust Blueprint
- Earned-Complexity Agent Maturity Ladder
- Editor and Manager Surface Separation in Agent IDEs
- Emergent Architecture in AI-Driven Codebases
- Encoding Tacit Knowledge into Agent Improvement Loops
- Enterprise Skill Marketplace: Distribution and Quality
- Enterprise-Managed Plugin Governance for Agent CLIs
- Entropy Reduction Agents: Automated Codebase Hygiene
- Escape Hatches: Unsticking Stuck Agents
- Eval-Driven Development: Write Evals Before Building Agent
- Evaluation-Driven Development for Agent Tools
- Event-Driven Agent Routing for Multi-Team AI Pipelines
- Evidence-Based Allowlist Auto-Discovery for Agents
- Execution Lineage: DAG of Artifacts vs Agent Loops
- Execution-First Delegation: The AI-as-Executor Pattern
- Experiential-Learning Setup Agents with Snapshot Rollback (SetupX)
- Factory Over Assistant: Orchestrating Parallel Agent Fleets
- Failure-Driven Iteration for Improving Agent Workflows
- Fallacies for AI Agent Development
- Fan-Out Synthesis Pattern for AI Agent Development
- File-Based Agent Coordination for AI Agent Development
- Five-Pass Blunder Hunt: Repeated Critique Passes for Plans
- GEO for Technical Docs: Developer Documentation Checklist
- Generative Engine Optimization for Developer Sites
- GitHub Agentic Workflows for Automating Dev Processes
- GitHub Copilot Dedicated App as Agent-First Surface
- GitHub Models in Actions for AI-Driven CI Workflows
- Goal Contract: Separating the Doer from the Done-Checker
- Goal Monitoring and Progress Tracking for Long-Running Agents
- Golden Journeys: Restartability as a First-Class Verification Primitive
- Headless Claude in CI: Using -p and --max-turns for Safe Pipeline Integration
- Hook Catalog for Claude Code Enforcement
- How AI Engines Cite — ChatGPT, Perplexity, Claude, Gemini
- Human-in-the-Loop Placement: Where and How to Supervise
- Humans and Agents in Software Engineering Loops
- Hyper-Personalized Software: The Return of RAD
- Hypothesis-Driven Debugging: Instrument Before You Patch
- In-Session Transcript Search: Navigating Long Agent Conversations
- In-Thread Side-Channel: Bounded Side Questions Without Losing the Main Task
- Incident Log Investigation Skill: Parallel Queries
- Initiatives and Community: Tracking the Agentic Engineering Landscape
- Interaction-Pattern Evaluation for Agentic PRs
- Introspective Skill Generation: Mining Agent Patterns
- Inversion Analysis: Surface Capabilities Competitors Cannot Replicate
- Issue-Tracker as Agent Dispatch Surface
- Issue-to-PR Delegation Pipeline for AI Agent Development
- L0 → L1: Making the Repo Readable
- L1 → L2: Adding Feedback Loops
- L2 → L3: Building Mechanical Enforcement
- L3 → L5: Reaching Agent-First
- LLM Refactoring Adoption Patterns
- LLM Static Verification Against Natural-Language Requirements
- LLM-as-Judge Evaluation with Human Spot-Checking
- Large-Codebase Coding-Agent Failure Patterns (Sourcegraph Five)
- Law of Triviality in AI PRs for AI Agent Development
- Lay the Architectural Foundation by Hand Before Delegating
- Lazy Worktree Isolation: Enter the Worktree on First Write, Not on Dispatch
- Long-Running Agents: Durability and Resumability Across Sessions
- Loop Detection for AI Agents: Stopping Micro-Loops
- Loop Strategy Spectrum: Accumulated vs Fresh Context
- Managing Agent Skills from the GitHub CLI with gh skill
- Measuring GEO Performance: Metrics, Monitoring, and Tools
- Meta-Engineering Harness for Production AI-Native Software Delivery
- Mise en Place for Agentic Coding
- Model Deprecation Lifecycle for Agent Workloads
- Model a Single Agent Turn as Many Inference and Tool-Call
- Model-ID-as-Dependency: Migration Protocol for Deprecation Churn
- Monolith-to-Sub-Agents Refactor: Five Lessons from a Brittle Prototype
- Monorepo Skill and Agent Discovery: Hierarchical Configuration
- Multi-Agent RAG for Spec-to-Test Automation
- Multi-Model Plan Synthesis for System Architecture
- Multi-Repo and No-Repo Coding Agent Automation Templates
- Natural-Language Customization Bootstrap
- Natural-language git
- Observability Feedback Loop: A 7-Step Debug Runbook for Agents
- One-Click CI Auto-Fix: Human-Triggered Cloud-Agent Remediation for Failing GitHub Actions
- Oracle-Based Task Decomposition for AI Agent Development
- Orchestrator-Worker Pattern for AI Agent Development
- PR Description Style as a Lever for Agent PR Merge Rates
- PR Scope Creep as a Human Review Bottleneck
- Parallel Agent Sessions Shift the Bottleneck from Writing
- Parallel Polyglot Ports as a Spec-Ambiguity Oracle
- Pattern Selection Map: Trade-off Matrix for This Site's Patterns
- Permutation Frameworks for Batch Code Generation
- Persona-as-Code: Defining Agent Roles as Structured Docs
- Phase-Specific Context Assembly for AI Agent Development
- Plan files as resumable artifacts
- Plan mode for knowledge artifacts
- Polya Small-Steps: Using AI to Think Better, Not Think Less
- PostToolUse Hook for BSD/GNU CLI Incompatibilities
- Pre-Execution Codebase Exploration for AI Coding Agents
- Prebuilt Agent Environments: Amortising Cloud Agent Cold Start with Custom Container Images
- Prebuilt Agent Monitoring Dashboard
- Premature Completion: Agents That Declare Success Too Early
- Prior Dominance Over Feedback in Agent Optimization Loops
- Product-as-IDE: When the Application Becomes the Development
- Programmatic Cloud-Agent Dispatch via REST API and Webhooks
- Prompt Chaining: Sequential LLM Calls for Agent Workflows
- Prompt Governance via PRs: Reviewable AI Behaviour
- Prompt-Rewrite Discipline on Cross-Generation Model Migration
- Proprietary-to-Open-Standard Tool Migration (Copilot Extensions to MCP)
- Prototype Before Optimizing: Establish Quality Baselines Before Token Constraints
- Public-Channel Agent Work as Lehrwerkstatt for Team Learning
- QA Session to Issues Pipeline for AI Agent Development
- Refactoring Runaway: Tangled Refactorings in Agent Patches
- Remote Agent Host Sessions over SSH and Dev Tunnels
- Remote Session Control for Local CLI Agents
- Repository Bootstrap Checklist: Wiring Agent Support
- Review-Feedback-to-Rule Loop: Promoting Recurring PR Comments into Harness Rules
- Review-Then-Apply CLI Flag for In-Process Auto-Fix
- Reviewer's Playbook for Agent-Authored Pull Requests
- Rigor Relocation: Engineering Discipline with AI Agents
- Run-Status vs Task-Status Confusion in Autonomous Agent Runs
- Runbooks as Agent Instructions: Agent-Followable Ops
- Runnable Documentation as Agent Verification
- SDLC-Phase Skill Taxonomy: Full-Lifecycle Skill Libraries
- SEO vs GEO — Signals, Metrics, Optimization Targets
- Scheduled Instruction File Fact-Checker for Accuracy
- Schema and Structured Data for GEO — AI Citation Guide
- Seamless Background-to-Foreground Handoff
- Secrets Management for AI Agents: Credential Injection
- Self-Healing Production Agent: Automated Regression Detection and Autofix PR
- Semantic Issue Search from Chat vs Query Syntax
- Session Initialization Ritual: How Agents Orient Themselves
- Session Scheduling with Loop and Cron in Claude Code
- Shadow Tech Debt Created by Autonomous AI Agent Commits
- Simulation and Replay Testing for Agent Verification
- Single-Branch Git for Agent Swarms: A Trade-Off Pattern
- Single-CLI Agent Platform: Create to Production in One CLI
- Skeleton Projects as Agent Scaffolding
- Skill Library Refinement Loops: Organisational Feedback for Shared Skills
- Spec Complexity Displacement: When Specs Become Code
- Spec-Driven Development with Spec Kit
- Staggered Agent Launch: Preventing Thundering-Herd in Swarms
- Stakeholder Trust Through Evals and Observability
- Steering Running Agents: Mid-Run Redirection and Follow-Ups
- Swarm Migration Pattern
- Symptom-Reduction-as-Root-Cause: Why Oracle Tests Alone Miss Architectural Drift
- System-Level Optimization Pipeline
- Team OS: Coding-Agent Repo as Cross-Functional Team Brain
- Team Onboarding for AI Agent Workflows and Adoption
- Terminal Tools for Agents: send_to_terminal and Background Interaction
- The 7 Phases of AI-Assisted Feature Development
- The AI Development Maturity Model: From Skeptic to Agentic
- The Effortless AI Fallacy for AI Agent Development
- The Eval-First Development Loop for AI Agent Features
- The LLM Laziness Deficit Fallacy: Restraint Comes From Harness, Not Instruction
- The Plan-First Loop: Always Design Before Writing Code
- The Research-Plan-Implement Pattern
- The Think Tool: Mid-Stream Reasoning for AI Agents
- Three Reasoning Spaces: Plan, Bead, and Code
- Throwaway-Prototype Skill: Build to Discard, Keep Only the Answer
- Tiled Agent Layout: Supervising Parallel Agents Through Dedicated Panes
- Token-Cost Profiling and Reduction for Always-On Agentic Workflows
- Topical Authority — Entity Coverage for AI Citation
- Trajectory Logging via Progress Files and Git History
- Turn-Level Context Decisions: Continue, Rewind, Clear, Compact, or Delegate
- Using the Agent to Analyze Its Own Evaluation Transcripts
- Velocity-Quality Asymmetry: Why AI Speed Gains Fade
- Verification-Centric Development for AI-Generated Code
- Visible Thinking in AI-Assisted Development
- Voting / Ensemble Pattern for AI Agent Development
- WIP=1 and Little's Law: Kanban Throughput Theory for Agent Task Design
- Web Search Agent Loop: Iterative Research Patterns
- What is GEO — Generative Engine Optimization Defined
- Workflows for AI Agent Development
- Worktree Isolation: Parallel Agent Sessions in Safe Sandboxes
- llms.txt: Full Specification, Adoption, and Limitations
- llms.txt: Making Your Project Discoverable to AI Agents