
Security Constitution for AI Code Generation

Formalize security constraints as a versioned, machine-readable document that agents consume at specification time — enforcing security by construction, not post-hoc review.

The problem

AI coding agents optimize for functional correctness. Without explicit security constraints, they produce code that works but contains vulnerabilities — SQL injection, hardcoded secrets, missing input validation. Veracode's 2025 GenAI Code Security Report found 45% of AI-generated samples introduced a known security flaw. Later review catches these at high cost: rework, delays, and growing security debt.

The alternative is to embed security rules in the specification layer, so agents never generate the vulnerable pattern.

Constitution structure

A security constitution is a versioned document where each principle maps to a specific weakness class — typically drawn from the MITRE CWE Top 25 — and carries an enforcement level (Marri, 2026):

# security-constitution.yml
principles:
  - id: SEC-001
    cwe: CWE-89
    level: MUST        # MUST | SHOULD | MAY
    constraint: "All database queries use parameterized statements"
    pattern: "Use ORM query builders or prepared statements; never concatenate user input into SQL strings"
    rationale: "SQL injection remains a top-25 CWE; string concatenation is the root cause"

  - id: SEC-002
    cwe: CWE-798
    level: MUST
    constraint: "No credentials in source code"
    pattern: "Read secrets from environment variables or a secrets manager; fail fast if missing"
    rationale: "Hardcoded credentials in AI-generated code are common and persist across sessions"

Each principle contains:

| Field | Purpose |
| --- | --- |
| id | Stable reference for traceability |
| cwe | Links to Common Weakness Enumeration for auditability |
| level | MUST/SHOULD/MAY; distinguishes hard requirements from guidance |
| constraint | What the agent must ensure |
| pattern | How to implement it: the positive pattern, not just the prohibition |
| rationale | Why: agents follow constraints better when they understand the reasoning (Marri, 2026) |
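A loader can reject malformed principles before they ever reach a prompt. The following is a minimal sketch, assuming each entry of the `principles` list has already been parsed into a plain mapping; `Principle` and `parse_principle` are hypothetical names used only for illustration.

```python
import re
from dataclasses import dataclass

LEVELS = {"MUST", "SHOULD", "MAY"}
REQUIRED = ("id", "cwe", "level", "constraint", "pattern", "rationale")


@dataclass(frozen=True)
class Principle:
    id: str
    cwe: str
    level: str
    constraint: str
    pattern: str
    rationale: str


def parse_principle(raw: dict) -> Principle:
    """Validate one mapping from the `principles` list and return a Principle."""
    missing = [f for f in REQUIRED if not str(raw.get(f, "")).strip()]
    if missing:
        raise ValueError(f"{raw.get('id', '<no id>')}: missing fields {missing}")
    if raw["level"] not in LEVELS:
        raise ValueError(f"{raw['id']}: level must be one of {sorted(LEVELS)}")
    if not re.fullmatch(r"CWE-\d+", raw["cwe"]):
        raise ValueError(f"{raw['id']}: cwe must look like 'CWE-89'")
    return Principle(**{f: raw[f] for f in REQUIRED})


sec_001 = parse_principle({
    "id": "SEC-001",
    "cwe": "CWE-89",
    "level": "MUST",
    "constraint": "All database queries use parameterized statements",
    "pattern": "Use ORM query builders or prepared statements; "
               "never concatenate user input into SQL strings",
    "rationale": "SQL injection remains a top-25 CWE; "
                 "string concatenation is the root cause",
})
```

Failing loudly on an unknown level or a missing rationale keeps the versioned document trustworthy as an input to both prompts and automated checks.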

The two principles above target real weakness classes. CWE-89 (SQL injection) sits at rank 3 on the 2024 CWE Top 25. Hardcoded credentials are disproportionately common in AI-assisted code: GitGuardian's State of Secrets Sprawl 2026 reports Copilot-active repositories leak secrets at roughly 40% above the public-repo baseline.
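To make the `pattern` fields concrete, here is a small sketch of code that would satisfy SEC-001 and SEC-002, using Python's standard `sqlite3` module. The `users` table and the helper names `require_secret` and `find_user` are assumptions for illustration, not part of the constitution.

```python
import os
import sqlite3


def require_secret(name: str) -> str:
    """SEC-002: read a secret from the environment and fail fast if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"required secret {name} is not set")
    return value


def find_user(conn: sqlite3.Connection, username: str):
    """SEC-001: bind user input as a parameter instead of concatenating it into SQL."""
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchone()


# In-memory database so the sketch runs without any setup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))

print(find_user(conn, "alice"))             # prints (1, 'alice')
print(find_user(conn, "alice' OR '1'='1"))  # prints None
```

The second call shows why the positive pattern matters: the injection string is treated as a literal username, so it matches nothing.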

Progressive disclosure: select relevant principles

Including the full constitution in every prompt degrades compliance. A case study found that injecting 3 to 5 task-relevant principles achieved 96% compliance, compared with 78% for the complete document. Context truncation made the model lose track of distant principles (Marri, 2026). This is the same load-on-demand principle behind progressive disclosure for agent definitions, where skills are loaded only when a task needs them.

graph TD
    A[Full Constitution] --> B[Task Analysis]
    B --> C[Select 3-5<br/>relevant principles]
    C --> D[Inject into<br/>task spec]
    D --> E[Agent generates<br/>constrained code]

Front-load the constraints that matter for the current task. A database migration task gets SEC-001 (SQL injection) and SEC-002 (no hardcoded credentials). An API endpoint task gets input validation and authentication principles instead.
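Selection can be automated in many ways; the sketch below shows one possibility, assuming each principle carries a `tags` field. That field, the `select_principles` helper, and the IDs SEC-003 and SEC-004 are hypothetical and do not appear in the schema above.

```python
from dataclasses import dataclass

LEVEL_RANK = {"MUST": 0, "SHOULD": 1, "MAY": 2}


@dataclass(frozen=True)
class Principle:
    id: str
    level: str
    tags: frozenset  # hypothetical field; the schema above has no tags


def select_principles(principles, task_tags, limit=5):
    """Pick at most `limit` principles whose tags overlap the task's tags."""
    wanted = set(task_tags)
    matches = [p for p in principles if p.tags & wanted]
    # MUST before SHOULD before MAY, then larger tag overlap, then stable id order.
    matches.sort(key=lambda p: (LEVEL_RANK[p.level], -len(p.tags & wanted), p.id))
    return matches[:limit]


CONSTITUTION = [
    Principle("SEC-001", "MUST", frozenset({"database", "sql"})),
    Principle("SEC-002", "MUST", frozenset({"database", "config"})),
    Principle("SEC-003", "MUST", frozenset({"api", "input"})),  # hypothetical: input validation
    Principle("SEC-004", "MUST", frozenset({"api", "auth"})),   # hypothetical: authentication
]

migration = select_principles(CONSTITUTION, {"database", "migration"})
endpoint = select_principles(CONSTITUTION, {"api"})
print([p.id for p in migration])  # ['SEC-001', 'SEC-002']
print([p.id for p in endpoint])   # ['SEC-003', 'SEC-004']
```

Ranking MUST principles first means that when more than five principles match, guidance-level items are the ones dropped.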

Three-phase integration

Inject security principles at each phase of a spec-driven workflow (Marri, 2026):

Specification phase. The feature spec names which constitution principles apply. Select 3 to 5 relevant principles here.

Planning phase. The implementation plan includes security constraints as acceptance criteria. "All endpoints validate input against schema" becomes a plan step, not an afterthought.

Task phase. Each task definition carries its applicable principles (SEC-001, SEC-002) inline. The agent sees the constraint where it generates the code.

graph TD
    C[Security Constitution] --> S[Feature Spec]
    C --> P[Implementation Plan]
    C --> T[Task Definitions]
    S --> P
    P --> T
    T --> I[Generated Code]
    I --> V[Mechanical Verification]
    V -->|Fail| T
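For the task phase, carrying principles inline can be as simple as rendering them into the task text. This is a rough sketch; the output layout and the `render_task` helper are assumptions, not a format prescribed by the source.

```python
def render_task(title: str, principles: list) -> str:
    """Build a task definition with its applicable principles spelled out inline."""
    lines = [f"Task: {title}", "", "Security constraints:"]
    for p in principles:
        lines.append(f"- [{p['id']}] ({p['level']}, {p['cwe']}) {p['constraint']}")
        lines.append(f"  Pattern: {p['pattern']}")
        lines.append(f"  Why: {p['rationale']}")
    return "\n".join(lines)


task_text = render_task(
    "Add migration for the accounts table",  # hypothetical task title
    [
        {
            "id": "SEC-001",
            "cwe": "CWE-89",
            "level": "MUST",
            "constraint": "All database queries use parameterized statements",
            "pattern": "Use ORM query builders or prepared statements; "
                       "never concatenate user input into SQL strings",
            "rationale": "SQL injection remains a top-25 CWE; "
                         "string concatenation is the root cause",
        }
    ],
)
print(task_text)
```

Including the pattern and rationale alongside the constraint keeps the positive guidance next to the place where the agent generates the code.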

Mechanical enforcement

A constitution in a prompt is guidance. A constitution backed by linters, CI gates, and hooks is enforcement: the layer that deterministic guardrails provide. The document serves both roles, as human-readable constraints that also feed automated checks.

Linters and static analysis. Map each MUST principle to a linter rule. SEC-001 maps to a SQL injection scanner (see CWE-89). SEC-002 maps to secret detection tools like gitleaks or trufflehog. See secrets management for agents for the wider credential handling pattern.

Pre-commit hooks. Block commits that violate MUST principles before they reach the repository. Claude Code's PreToolUse hooks act even earlier: they can intercept file writes and run validation before the tool call executes (see deterministic guardrails).
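A PreToolUse check for SEC-002 might look like the sketch below. It assumes the hook receives a JSON event on stdin with `tool_name` and `tool_input` fields, and that exit code 2 blocks the tool call and returns stderr to the agent; verify both against the current Claude Code hooks reference. The regular expression is a deliberately crude stand-in for a real scanner such as gitleaks, and `decide` and `main` are hypothetical names.

```python
import json
import re
import sys

# Crude heuristic for SEC-002: a secret-like name assigned a quoted literal.
SECRET_RE = re.compile(
    r"""(?i)\b(password|passwd|secret|api_key|token)\b\s*=\s*["'][^"']{8,}["']"""
)


def decide(event: dict):
    """Return (exit_code, message) for one PreToolUse event."""
    if event.get("tool_name") not in {"Write", "Edit"}:
        return 0, ""
    tool_input = event.get("tool_input", {})
    # Assumption: Write carries `content`, Edit carries `new_string`.
    text = tool_input.get("content") or tool_input.get("new_string") or ""
    if SECRET_RE.search(text):
        return 2, ("SEC-002 (CWE-798): possible hardcoded credential; "
                   "read it from the environment or a secrets manager")
    return 0, ""


def main() -> int:
    """Hook entry point: read the event from stdin, report on stderr."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code
```

A hook script would end with `sys.exit(main())`. Keeping the decision in a pure function makes the rule testable without a running agent.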

CI gates. Run the full constitution's MUST principles as a CI check. SHOULD principles generate warnings. MAY principles are informational.
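The level-to-severity mapping can be sketched as a small gate function. The `Finding` type and the exit-code convention (non-zero fails the build) are assumptions; the findings themselves would come from whichever scanners back each principle.

```python
from dataclasses import dataclass

LABELS = {"MUST": "ERROR", "SHOULD": "WARNING", "MAY": "INFO"}


@dataclass(frozen=True)
class Finding:
    principle_id: str
    level: str      # MUST | SHOULD | MAY, copied from the constitution
    location: str
    message: str


def gate(findings):
    """Return (exit_code, report_lines); only MUST violations fail the build."""
    report = [
        f"{LABELS[f.level]} {f.principle_id} {f.location}: {f.message}"
        for f in findings
    ]
    exit_code = 1 if any(f.level == "MUST" for f in findings) else 0
    return exit_code, report


code, report = gate([
    Finding("SEC-001", "MUST", "db/users.py:42", "SQL built by string concatenation"),
    Finding("SEC-010", "SHOULD", "api/routes.py:7", "example warning"),  # hypothetical id
])
print("\n".join(report))
print("exit code:", code)  # exit code: 1
```

Deriving severity from the `level` field keeps the CI behaviour in step with the constitution whenever a principle is promoted or demoted.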

The constitution becomes the single source of truth for both agent prompts and automated enforcement. See defense-in-depth agent safety for the wider pattern of layered safety mechanisms.

Integration with existing workflows

The constitution pattern works with instruction files already in use:

  • CLAUDE.md and .github/copilot-instructions.md: reference the constitution file and add a directive to load relevant principles per task. See hierarchical instruction files.
  • Spec-driven development: the constitution becomes an input to the specification phase, alongside requirements and architecture constraints. See spec-driven development.
  • Deterministic guardrails: constitution MUST principles map directly to guardrail rules. See deterministic guardrails.

Limitations

The published evidence comes from a single case study: one developer, one AI assistant, one domain (banking microservices). The reported 73% reduction in CWE violations is indicative but not yet independently replicated (Marri, 2026).

The reported 16-hour upfront investment also comes from a single practitioner. Teams with existing security standards can likely adapt faster; teams without them will need more time. Whether the same structure works for other domains (embedded systems, frontend applications, data pipelines) is untested.

Key Takeaways

  • Formalize security rules as structured, versioned principles with CWE mappings — not prose paragraphs in a wiki
  • Inject 3–5 relevant principles per task rather than the full constitution — context window limits make selective loading more effective than exhaustive inclusion
  • Back every MUST principle with a mechanical check (linter, hook, CI gate) — prompts guide, guardrails enforce
  • Integrate at the specification phase, not as a post-implementation review step