Pre-Change Impact Analysis: Dependency Maps That Prevent Agent Regressions

Build a graph of code-to-test dependencies and deliver it as a lightweight agent skill. Agents query which tests are at risk before committing, which cut test-level regressions by 70% in the TDAD evaluation.

The problem

AI coding agents fix issues but frequently break things that were working. On SWE-bench Verified, vanilla agent runs show a 6.08% test-level regression rate (TDAD paper, Table 3). METR's March 2026 review of 296 SWE-bench Verified patches found roughly half would not be merged by maintainers — with regressions and code quality cited among top rejection reasons.

Agents lack visibility into which tests exercise the code they modify.

The technique

Pre-change impact analysis gives agents a dependency map between source and test files. The agent queries the map before committing, runs at-risk tests, and self-corrects if any fail.

The TDAD tool (Alonso, Yovine, Braberman 2026) follows three steps:

  1. Index — parse source files with an AST to build a dependency graph of functions, classes, imports, call targets, and inheritance.
  2. Impact — traverse the graph from changed files to find affected tests.
  3. Verify — run only impacted tests, then fix regressions before submission.
graph LR
    A[Changed Files] --> B[Dependency Graph]
    B --> C[Impacted Tests]
    C --> D{Tests Pass?}
    D -->|Yes| E[Commit Patch]
    D -->|No| F[Self-Correct]
    F --> C
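The Index step can be sketched with Python's `ast` module. This is a minimal illustration, not TDAD's actual indexer: the function name `index_source`, the triple format, and the sample file are assumptions made for this example, and it only resolves calls and base classes written as plain names.

```python
import ast

def index_source(path: str, source: str) -> list[tuple[str, str, str]]:
    """Return (edge_type, src, dst) triples found in one Python file."""
    edges = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            edges.append(("CONTAINS", path, node.name))
        if isinstance(node, ast.ClassDef):
            # Only simple base names (class A(B)) are handled in this sketch.
            for base in node.bases:
                if isinstance(base, ast.Name):
                    edges.append(("INHERITS", node.name, base.id))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Record direct calls such as validate(x) made inside the function.
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    edges.append(("CALLS", node.name, sub.func.id))
        if isinstance(node, ast.Import):
            for alias in node.names:
                edges.append(("IMPORTS", path, alias.name))
        if isinstance(node, ast.ImportFrom) and node.module:
            edges.append(("IMPORTS", path, node.module))
    return edges

# Hypothetical file contents used only to exercise the sketch.
sample = '''
from models import BaseUser

class AdminUser(BaseUser):
    pass

def process(data):
    return validate(data)
'''

edges = index_source("api.py", sample)
for edge in edges:
    print(edge)
```

A real indexer would also need to resolve module names to file paths and handle attribute calls such as `obj.method()`, which this sketch ignores.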

Graph structure

The dependency graph maps five edge types:

| Edge Type | Example |
|---|---|
| CONTAINS | utils.py → parse_config() |
| CALLS | process() → validate() |
| IMPORTS | api.py → models.py |
| TESTS | test_api.py → handle_request() |
| INHERITS | AdminUser → BaseUser |

TDAD finds tests by naming convention (test_*.py), prefix matching, and proximity.
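The Impact step amounts to walking the graph backwards from the changed files until test files are reached. The sketch below assumes a simplified graph whose nodes are file paths and whose edges are `(edge_type, src, dst)` triples. The names `is_test_file` and `impacted_tests` are hypothetical, and only the `test_*.py` naming convention is applied, not prefix matching or proximity.

```python
from collections import defaultdict, deque
from fnmatch import fnmatch
from pathlib import PurePosixPath

def is_test_file(path: str) -> bool:
    """Apply the test_*.py naming convention to the file name."""
    return fnmatch(PurePosixPath(path).name, "test_*.py")

def impacted_tests(edges, changed):
    """Breadth-first walk from changed files to every test that depends on them."""
    dependents = defaultdict(set)  # dst -> set of srcs that depend on dst
    for _edge_type, src, dst in edges:
        dependents[dst].add(src)

    seen = set(changed)
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for dep in dependents[node]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return sorted(n for n in seen if is_test_file(n))

# Hypothetical file-level edges for illustration.
edges = [
    ("IMPORTS", "api.py", "models.py"),
    ("IMPORTS", "cli.py", "api.py"),
    ("TESTS", "tests/test_api.py", "api.py"),
    ("TESTS", "tests/test_cli.py", "cli.py"),
    ("TESTS", "tests/test_utils.py", "utils.py"),
]

result = impacted_tests(edges, ["models.py"])
print(result)  # ['tests/test_api.py', 'tests/test_cli.py']
```

Changing `models.py` flags the tests for `api.py` and `cli.py` because both depend on it transitively, while `tests/test_utils.py` is left out.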

Delivery as a lightweight skill

Deliver the dependency map as static text files, not a runtime API or graph database:

  • test_map.txt — one line per source-to-test mapping, ready to grep
  • SKILL.md — 20 lines of short guidance: fix, grep test_map, verify

The agent queries the map with grep, so it needs no special tools. The skill must work within the agent's existing tool set.
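To make the file format concrete, here is a small sketch that renders a mapping into grep-friendly lines and looks one up. The `source -> test` line format is taken from the example later in this article and may not match TDAD's real output exactly; `render_test_map` and `lookup` are hypothetical helper names.

```python
def render_test_map(mapping: dict[str, list[str]]) -> str:
    """Render one 'source -> test' line per mapping, sorted for stable diffs."""
    lines = []
    for src in sorted(mapping):
        for test in sorted(mapping[src]):
            lines.append(f"{src} -> {test}")
    return "\n".join(lines) + "\n"

def lookup(test_map_text: str, changed_file: str) -> list[str]:
    """Rough equivalent of grep on the source column (exact match, not substring)."""
    hits = []
    for line in test_map_text.splitlines():
        src, sep, test = line.partition(" -> ")
        if sep and src == changed_file:
            hits.append(test)
    return hits

text = render_test_map({
    "src/auth/session.py": [
        "tests/test_session.py",
        "tests/integration/test_auth_flow.py",
    ],
    "src/db.py": ["tests/test_db.py"],
})
print(text, end="")
print(lookup(text, "src/auth/session.py"))
```

Keeping one mapping per line is what lets a plain `grep` replace a dedicated query tool.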

The TDD prompting paradox

Procedural TDD instructions without dependency context make regressions worse, not better.

| Approach | Regression Rate | vs. Baseline |
|---|---|---|
| Vanilla (no intervention) | 6.08% | baseline |
| Procedural TDD instructions | 9.94% | +64% worse |
| Dependency map + concise guidance | 1.82% | -70% better |

Source: TDAD paper, evaluated on SWE-bench Verified with Qwen3-Coder 30B (100 instances).

Why procedural TDD backfires:

  • Context consumption — verbose instructions consume tokens and push out the repository knowledge needed for accurate changes
  • Unfocused ambition — without knowing which tests matter, agents touch more files and cause collateral damage
  • Procedure without information — "run the tests" is useless without "run these tests"

Cutting the skill's guidance from 107 lines to 20 roughly quadrupled the resolution rate, from 12% to 50%.

The principle is context over procedure. When you design an agent skill, put decision-relevant facts ahead of step-by-step processes.

Practical implementation

Building the map

# Install TDAD from source (Python, MIT license)
git clone https://github.com/pepealonso95/TDAD.git
cd TDAD/tdad && pip install -e .

# Index a repository
tdad index /path/to/repo

# Query impact for changed files
tdad impact /path/to/repo --files src/module.py

TDAD uses Python's ast module. For other languages, Tree-sitter provides a unified parsing interface.

Integrating with agent workflows

Place test_map.txt and SKILL.md in the repository root. For CI, run impact analysis on the diff and execute only the affected tests.
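One way the CI step might look is sketched below. It assumes the `source -> test` line format shown in this article's example, a base ref of `origin/main`, and `pytest` available on the PATH. The function names are hypothetical and not part of TDAD.

```python
import subprocess

def changed_files(base: str = "origin/main") -> list[str]:
    """List files changed relative to the merge base with `base` (assumed ref)."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]

def select_tests(changed, test_map_lines):
    """Pick tests whose source column matches a changed file, keeping map order."""
    wanted = set(changed)
    tests = []
    for line in test_map_lines:
        src, sep, test = line.strip().partition(" -> ")
        if sep and src in wanted and test not in tests:
            tests.append(test)
    return tests

def pytest_command(tests):
    """Build the pytest invocation for the selected test files."""
    return ["pytest", *tests]

def main(map_path: str = "test_map.txt") -> int:
    """Run only the impacted tests and return the exit code for CI."""
    with open(map_path, encoding="utf-8") as fh:
        tests = select_tests(changed_files(), fh)
    if not tests:
        # Avoid calling pytest with no arguments, which would run everything.
        print("No mapped tests for this diff.")
        return 0
    return subprocess.run(pytest_command(tests)).returncode
```

A CI job would call `main()` and exit with its return value. Whether an empty selection should pass silently or fall back to the full suite is a policy choice this sketch leaves open.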

Limitations

  • Static analysis only: the graph cannot capture dynamic dispatch, monkey-patching, or runtime-generated code
  • Python-focused: AST parsing is language-specific, so multi-language repos need a parser per language
  • Sparse test suites: when few tests are coupled to the code, the map has little to point at and impact analysis catches less
  • Smaller-model bias: results were observed with a 30B model at 32K context, so frontier models may behave differently
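The static-analysis limitation is easy to reproduce. The sketch below uses a simplified call extractor (an assumption for illustration, not TDAD's implementation) and shows that a direct call is visible in the AST while a call dispatched through a runtime lookup is not.

```python
import ast

def static_call_names(source: str) -> set[str]:
    """Names called directly, e.g. foo(...), as a simple static indexer sees them."""
    return {
        node.func.id
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }

# Direct call: the dependency on validate() is visible in the syntax tree.
direct = (
    "def handle(x):\n"
    "    return validate(x)\n"
)

# Dynamic dispatch: the target is chosen at runtime from a string.
dynamic = (
    "def handle(x, name):\n"
    "    handler = globals()[name]\n"
    "    return handler(x)\n"
)

print(static_call_names(direct))   # {'validate'}
print(static_call_names(dynamic))  # contains 'globals' and 'handler', not 'validate'
```

Even if `name` is always `"validate"` at runtime, no CALLS edge to `validate` is recorded, so tests covering it would be missing from the map.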

Example

A developer tasks an agent with fixing a bug in src/auth/session.py. The agent uses TDAD to identify at-risk tests before committing:

# 1. Index the repository (run once, or on CI)
tdad index /path/to/repo

# 2. Query the test map for the changed file
grep "src/auth/session.py" test_map.txt
# => src/auth/session.py -> tests/test_session.py
# => src/auth/session.py -> tests/integration/test_auth_flow.py

# 3. Run only the impacted tests
pytest tests/test_session.py tests/integration/test_auth_flow.py

# 4. One test fails — agent self-corrects and re-runs
pytest tests/integration/test_auth_flow.py
# => PASSED

The SKILL.md the agent reads contains:

Before committing any change:
1. Identify changed files
2. Run: grep "<changed_file>" test_map.txt
3. Run the listed tests
4. If any fail, fix and re-run before committing

Without the map, the agent would either skip tests entirely and miss regressions, or run the full suite and waste time.
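The self-correction loop in step 4 can be sketched as a bounded retry. The runner and fixer are passed in as callables because how an agent runs tests and edits code is outside this article's scope. All names here are hypothetical, and the fake runner exists only to simulate one failure followed by a pass. Unlike the shell example above, this sketch re-runs every impacted test after each fix, since a fix can break a test that previously passed.

```python
from typing import Callable, Sequence

def verify_with_retries(
    tests: Sequence[str],
    run_tests: Callable[[Sequence[str]], list[str]],  # returns the failing tests
    attempt_fix: Callable[[list[str]], None],
    max_rounds: int = 3,
) -> bool:
    """Run impacted tests, asking for a fix after each failing round."""
    for _ in range(max_rounds):
        failed = run_tests(tests)
        if not failed:
            return True
        attempt_fix(failed)
    # Final check after the last fix attempt.
    return not run_tests(tests)

# Fake runner: first run reports one failure, later runs pass.
calls = {"n": 0}
fixes: list[list[str]] = []

def fake_run(tests: Sequence[str]) -> list[str]:
    calls["n"] += 1
    if calls["n"] == 1:
        return ["tests/integration/test_auth_flow.py"]
    return []

ok = verify_with_retries(
    ["tests/test_session.py", "tests/integration/test_auth_flow.py"],
    fake_run,
    fixes.append,
)
print(ok, calls["n"], fixes)
```

Capping the number of rounds keeps an agent from looping indefinitely on a test it cannot fix.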

Key Takeaways

  • Map dependencies before agents commit: a static text file mapping source to tests cut regressions by 70% in the TDAD evaluation
  • Context beats procedure: targeted facts outperform prescriptive TDD workflows, and verbose instructions can harm performance
  • Keep skills minimal: 20 lines of guidance reached about 4x the resolution rate of 107 lines
  • Use standard tools: grep-able text files work within any agent's existing tool set