State of Agent Governance 2026
Executive Summary
Agent-capable development tools — Claude Code, Cursor, Copilot, Codex, Windsurf, Gemini CLI, and dozens of frameworks — have fragmented the configuration surface. Each tool has its own config format, its own permission model, and its own enforcement mechanisms. No standard governance framework exists.
We scanned 395 public repositories with agent infrastructure and documented what we found — not through surveys, but through automated analysis of real configurations and empirical reproduction of failure modes across 12 tools.
Five governance patterns we observed
-
Silent failure is the default. Budget enforcement, content guardrails, and lifecycle hooks all have confirmed failure modes where the control is configured but not enforced — with no error signal. Your staging environment passes. Production does not. The control says “configured.” It means “not enforced.”
-
Single enforcement points are insufficient. We documented 25+ failure modes in Claude Code’s hook system alone — the primary mechanism teams use to enforce governance policies. No single hook event fires reliably enough to serve as the sole control.
-
Documentation claims do not match production behaviour. Every finding in this report documents a gap between what the docs say and what the tool does under real conditions. This is the theory delta — the space between documented behaviour and observed reality.
-
Dependencies inherit the agent’s trust level. MCP servers, database connections, and memory stores operate with the agent’s full permissions unless explicitly restricted. The restrictions are often bypassable — every MCP database server we reviewed enforces “read-only” with a string check, not SQL parsing.
-
Configuration files are execution contexts. Agent config files control what code runs, with what permissions, at what privilege level. Two confirmed CVEs in Claude Code’s project settings demonstrate that config files should be treated like executable code — reviewed, tested, and audited.
What this report contains
- Seven case studies of governance failures we reproduced, organised by failure mode (Section 2)
- Observatory data from scanning 395 repositories, showing how common these patterns are in practice (Section 3)
- A governance taxonomy of 151 checks across 26+ tools, mapped to OWASP categories and classified by severity (Section 4)
- Transparent methodology — sample sizes, confidence intervals, and known biases stated explicitly (Section 5)
- Actionable recommendations for engineering leaders, practitioners, and security teams (Section 6)
How this report was produced
This report is scan-based, not survey-based. We used automated scanning of public repositories combined with manual reproduction of failure modes. Every finding traces to a public GitHub issue, CVE, or documented reproduction. Our methodology, sample sizes, and confidence intervals are disclosed throughout — including what we cannot yet measure.
Theory Delta is an empirical investigator of the agentic tool landscape. Our knowledge base contains 250+ evidence blocks and 151 governance checks, maintained through continuous scanning, changelog monitoring, and empirical testing. This report repackages that engine output for a broader audience.
Run the scan yourself
The governance patterns documented in this report are detectable. The AI Agent Setup Check runs the same checks described in Section 4 against your repository — free, automated, and evidence-backed.
Why Agent Governance Matters Now
From Theory Delta | State of Agent Governance 2026
The landscape exploded. Governance did not follow.
There are now over 50 tools that give AI agents the ability to write code, execute commands, read files, and call APIs on a developer’s behalf. Claude Code, Cursor, Copilot, Codex, Windsurf, Gemini CLI, Goose, Cline, Aider, Continue, CrewAI, LangChain, AutoGen, Semantic Kernel, OpenAI Agents SDK, Vercel AI SDK, LlamaIndex, n8n, Dify, Open WebUI, and more. Each ships its own configuration format. Each defines its own model for permissions, autonomy, and tool access.
No standard governance framework exists for any of them.
This matters because agent configurations are not preferences. They are policy. An agent config controls what AI can do in your codebase: which files it can read, which commands it can run, whether it asks permission or acts autonomously, which external services it can call. These are decisions with the same consequence as IAM policies or firewall rules — but they are not treated that way.
They are rarely reviewed in pull requests. They are almost never tested. They are frequently copy-pasted from blog posts and left unchanged.
The gap is measurable
Multiple independent surveys have reached the same conclusion from different directions.
Akto’s 2025 survey of enterprise AI adoption found that 79% of organizations lack formal governance policies for AI agents. Gravitee’s 2025 API and AI governance survey found that only 24.4% of organizations have full visibility into how agents interact with their systems. The rest are operating with partial visibility or none at all.
These are not hypothetical risks. The institutional response has already begun:
- June 2025: Snyk acquired Invariant Labs, the team behind
mcp-scan, signaling that MCP security tooling is now part of the enterprise supply-chain security stack. - September 2025: Docker acquired MCP-Defender, a runtime interception proxy for MCP servers, integrating agent security into the container toolchain.
- February 2026: Mend.io launched agent configuration scanning as part of its application security platform.
- 2025-2026: OWASP published the Agentic Security Index (ASI), establishing a shared taxonomy for agent-related risk categories.
When Snyk, Docker, and OWASP move on the same problem within twelve months, the problem is real. But the tooling so far targets attack surfaces — malicious MCP servers, prompt injection, credential exfiltration. The more common problem is not attacks. It is misconfiguration.
Governance is the missing layer
The difference between a security finding and a governance finding is intent. A security finding says: “an attacker could exploit this.” A governance finding says: “your configuration does not match what you probably intended.”
An agent running in fully autonomous mode with no execution limits is not a vulnerability. It is a governance gap. The developer who committed that config probably did not mean “run unlimited tool calls without asking.” They meant “stop interrupting me with approval prompts” — and the config they found on Stack Overflow happened to remove all guardrails, not just the noisy ones.
This is the pattern we see repeatedly: developers configuring agents with reasonable intent and ending up with configurations that do more than they realize. Not because the tools are malicious, but because the configuration surfaces are fragmented, underdocumented, and unlike anything developers have had to manage before.
What we found
We built 151 governance checks covering 26+ agent tools and frameworks. We scanned 395 public repositories that contain agent infrastructure — configuration files, MCP server definitions, orchestration configs, hook scripts, permission policies.
This report presents what we observed. Not survey results. Not theoretical risk models. Automated analysis of real configurations committed to real repositories.
The findings fall into three categories:
- Configuration gaps — settings that are missing, defaulting to insecure values, or contradicting each other within the same repository.
- Reproducible failure modes — documented cases where agent infrastructure behaves differently than its documentation claims, confirmed through testing.
- Governance patterns — what the repositories with the strongest agent governance practices have in common, and what everyone else is missing.
The next section breaks down the methodology: how the checks work, what they detect, and what they do not claim to detect.
What We Found
We scanned public repositories with agent infrastructure and reproduced failure modes documented in our evidence corpus. This section presents seven findings — each independently verified — organised by the governance pattern they reveal, not the tool they affect.
Every finding follows the same structure: what the documentation claims, what we observed, and what to do about it. Full reproduction evidence, version-pinned environments, and source links are available in the individual finding pages.
1. Safety controls that don’t fire when they should
The most common governance failure we documented is not missing controls — it is controls that exist, appear configured, and silently do nothing under production conditions.
LiteLLM gateway: budget limits, guardrails, and fallback routing fail silently
LiteLLM is the standard proxy for multi-provider LLM deployments. Its documentation promises budget enforcement, content guardrails, response caching, and fallback routing. We traced 16 distinct failure modes to public GitHub issues with reproductions.
Budget enforcement drifts under concurrent load. The budget counter uses read-modify-write without synchronisation. Under concurrent requests, increments are lost. One documented case: a $50 budget, $764.78 actual spend — a 15x overshoot with no alert, no exception, no log entry (Issue #12977). Team-level budgets are bypassed entirely when virtual keys belong to teams (Issue #12905). Pass-through routes ignore budgets completely (Issue #10750).
Guardrails intermittently fail or never execute. Content filters show inconsistent behaviour — German passports sometimes masked, sometimes not; SSN detection unreliable (Issue #19637). Guardrails configured via the GUI appear in the interface but are never invoked (Issue #15584). Model-level guardrails run before they are attached to the request (Issue #18363). Bedrock guardrails run but their output is discarded (Issue #22949).
The governance pattern: Your staging environment (low concurrency) passes. Your production environment (concurrent requests) does not. The controls appear configured. They are not enforced. There is no error message.
Full finding: LLM gateway silent failures
Agentic RAG: three silent data failures the docs don’t mention
We documented three independent failure modes across three popular RAG frameworks — each produces incorrect output with no error signal.
GraphRAG merges entities with the same name but different types. “Python” the programming language and “Python” the snake become one graph node. Multi-hop reasoning over the merged graph produces hallucinated answers. Marked fatal in Issue #1718, no shipped fix as of February 2026.
LangGraph conditional edge routing corrupts silently. Inline docstrings inside Python dict literals used as routing maps become part of the dictionary key at definition time. The routing fails at runtime with a KeyError — sometimes swallowed entirely under async streaming. No static analysis tool warns on this. Issues #4968, #4891, #4226.
Hard step caps return raw tool output to users. When max_agent_steps triggers mid-retrieval, the agent returns JSON blobs, API responses, and schema dumps directly to the user instead of a synthesised answer. Haystack Issue #10001 marks this “not planned” to fix. This affects any agent framework with a hard step count.
The governance pattern: Your data pipeline transforms input into output. The transformation is wrong. There is no error. The output looks plausible. The only defence is testing the output against known-good results — which requires governance practices that most teams have not established for their agent pipelines.
Full finding: Three agentic RAG failures
2. Single enforcement points are insufficient
When teams do invest in governance controls, they typically configure a single enforcement mechanism and trust it. We found that no single mechanism is reliable enough to serve as the sole control.
Claude Code hooks: 25+ confirmed failure modes across five categories
Claude Code hooks are the primary mechanism for customising and securing Claude Code’s behaviour — PreToolUse runs before a tool is called, PostToolUse runs after. We documented 25+ confirmed failure modes across five categories:
- Silent non-firing. Hooks intermittently fail to trigger. The tool runs, the hook does not execute, no error is logged.
- Ignored decisions. A hook returns “deny” and the tool call proceeds anyway.
- Platform breakage. Five or more failure modes appear on Windows that do not appear on macOS or Linux.
- Data corruption. Hooks that modify tool arguments can produce corrupted, partially applied, or double-escaped output.
- Architectural constraints. After context compaction (automatic in long sessions), hook enforcement can be silently lost.
The governance pattern: A policy that relies on a single hook firing reliably will eventually fail. Defence-in-depth — using multiple hook events, external monitoring, and independent verification — is the minimum viable approach. This applies beyond Claude Code: any agent framework that offers lifecycle hooks as its governance mechanism is subject to the same class of reliability failures.
Full finding: Claude Code hooks unreliable enforcement
3. Dependencies have ungoverned access by default
Agent tools routinely connect to databases, APIs, and external services. The default access level is typically unrestricted — governance controls exist in the documentation but fail in practice.
MCP database servers: read-only enforcement is bypassable everywhere
MCP database servers advertise a “read-only mode” that restricts agents to SELECT queries. We reviewed three MCP database server implementations. Every one implements read-only enforcement as a string prefix check (startsWith('select')). None use SQL AST parsing.
This means queries like SELECT ... INTO OUTFILE, multi-statement injections via semicolons, and PRAGMA-based state modifications all pass the read-only guard. One implementation has a confirmed CVE (GHSA-65hm-pwj5-73pw).
The governance pattern: Your agent has database access. The documentation says it is read-only. The enforcement is a string check that a crafted query can bypass. If your agent processes user input that reaches the database tool, the read-only boundary is not a security control — it is a suggestion.
Full finding: MCP database server security bypass
MCP supply chain: institutionally confirmed as a governance category
MCP supply chain governance is no longer emerging. Two enterprise acquisitions in 90 days confirmed it as institutional infrastructure:
- Snyk acquired Invariant Labs (June 2025) — the team that built mcp-scan and coined the attack vocabulary: “tool poisoning,” “MCP rug pulls,” “cross-origin escalation.”
- Docker acquired MCP-Defender (September 2025) — adding proxy-based runtime interception to the container infrastructure stack.
Three confirmed real-world incidents validated the threat model: WhatsApp message exfiltration via tool poisoning, npm package impersonation of Postmark’s email service, and a Smithery registry compromise affecting 3,000+ applications.
The governance pattern: Your agent’s MCP server list is a dependency manifest — equivalent to package.json or requirements.txt. It requires the same governance practices: pinning, auditing, and monitoring for changes between sessions. Most teams apply none of these practices to their MCP configurations.
Full finding: MCP supply chain security institutionally confirmed
4. Persistent state introduces silent corruption
Agent memory systems — graph databases, vector stores, session persistence — introduce persistent state that affects every subsequent interaction. When this state corrupts, the corruption is invisible.
Graph memory: self-hosted is not production-ready
We tested three agent memory tools in production-like environments:
- Graphiti (temporal knowledge graph):
RuntimeError: Future attached to a different loopwhen embedded directly in FastAPI or LangGraph — the most common production Python agent stack. The failure surfaces under real async load, not in development. The fix (subprocess isolation) is a prerequisite architectural decision the docs do not surface. - Mem0 (47.6K stars, “any LLM provider”): graph features hardcode
openai_structured, returning 401 errors with Anthropic, Groq, and other providers (Issue #3711). Vector-only mode works with any provider. Teams who chose Mem0 for its provider flexibility discover the limitation after integration. - MCP server-memory (official reference): JSONL file corruption under concurrent reads/writes (Issues #1819, #2577). Safe for single-agent, single-session use only — despite being the reference implementation builders start from.
The governance pattern: Your agent’s memory is persistent state. Persistent state requires governance: concurrency controls, provider compatibility testing, and recovery procedures. The tools’ documentation treats memory as a feature. In production, it is infrastructure — and infrastructure requires governance.
Full finding: Graph memory self-hosted not production-ready
5. Configuration files are execution contexts
Agent configuration files are not passive declarations. They are execution contexts — they control what code runs, with what permissions, at what privilege level.
Claude Code settings: a first-class execution surface
Claude Code’s project settings file (.claude/settings.json) has two confirmed CVEs and five additional supply-chain vectors, all sharing one root cause: project-scoped settings execute with user privileges before trust verification.
- CVE-1: A malicious settings file redirects all API calls — including the authentication token — to an attacker-controlled endpoint via
ANTHROPIC_BASE_URLoverride. - CVE-2:
enableAllProjectMcpServersauto-approves every MCP server defined in the project config, gaining arbitrary tool execution on clone. - Five additional vectors: symlink-based path escape, hook injection, MCP server injection, permission escalation, and dependency chain loading.
Enterprise hardening exists but is opt-in. The default for every non-enterprise user is full exposure.
The governance pattern: Treat agent configuration files like executable code. Review .claude/settings.json, .cursorrules, and mcp.json with the same scrutiny applied to Makefile or .github/workflows/. A cloned repository’s agent config runs with your privileges — before you have a chance to read it.
Full finding: Claude Code settings attack surface
What these findings have in common
Five governance patterns emerge across all seven findings:
- Silent failure is the default. Controls that should block, filter, or limit operations fail without error messages, log entries, or alerts. The failure is invisible.
- Single enforcement points are insufficient. Hooks, guards, and filters each have reliability failures. Defence-in-depth is a requirement, not an option.
- Documentation claims do not match production behaviour. Every finding documents a gap between what the docs say and what the tool does under real conditions.
- Dependencies inherit the trust level of the agent. MCP servers, database connections, and memory stores operate with the agent’s full permissions unless explicitly restricted — and the restrictions are often bypassable.
- Configuration is code. Agent config files control execution, permissions, and network access. They require the same governance practices as application code.
These are not tool-specific bugs. They are architectural patterns that recur across the agent ecosystem. Governance practices that address these patterns — testing controls under production conditions, implementing defence-in-depth, auditing dependencies, and reviewing configuration files — apply regardless of which tools a team uses.
How Common Are These Patterns?
The case studies in Section 2 document specific failure modes we reproduced. This section asks: how prevalent are the underlying governance gaps across the broader ecosystem?
To answer this, we ran our check registry against a corpus of 395 public repositories with agent infrastructure. This is early observatory data — the first systematic scan of agent governance in the wild. We present it with full transparency about what we can and cannot claim.
How to read these numbers: Every statistic includes the sample size and 95% Wilson score confidence interval. We report what we observed in our corpus, not what we claim about the ecosystem at large. “In our corpus of 395 repos, we observed X in Y/Z applicable repos (95% CI: A%-B%)” means exactly that — not “X% of all agent-configured repos have this problem.”
Corpus composition
We sourced 395 repositories from GitHub using code search queries targeting agent configuration files across 20+ tool ecosystems.
| Dimension | Value |
|---|---|
| Total repos scanned | 395 |
| Tool ecosystems represented | 20+ |
| Most common config type | .cursor/rules (5.6%) |
| Languages | Python (39%), TypeScript (19%), Go (6%), JavaScript (5%), 11 others |
| Diversity check | No single config type exceeds 6% — well below the 30% bias threshold |
Config type distribution (top 10):
| Config type | Repos | % |
|---|---|---|
.cursor/rules | 22 | 5.6% |
AGENTS.md | 22 | 5.6% |
crew.yaml (CrewAI) | 14 | 3.5% |
mcp.json | 13 | 3.3% |
.roo/rules | 13 | 3.3% |
MCP server manifest | 12 | 3.0% |
.github/agents | 11 | 2.8% |
CLAUDE.md | 10 | 2.5% |
GEMINI.md | 10 | 2.5% |
.windsurfrules | 10 | 2.5% |
Known bias: The sourcing pipeline finds repos that contain specific config files via GitHub code search. Repos with non-standard configurations, private infrastructure, or gitignored agent configs are not represented. The corpus skews toward Python (39%) and TypeScript (19%).
Highest-fire governance checks
These are the governance checks that fired most frequently across the corpus. We exclude version-specific CVE checks (which detect a specific vulnerable version, not a governance pattern) and include only checks with 20+ applicable repos.
| Rank | Check | What it detects | Fire rate | Applicable repos | 95% CI |
|---|---|---|---|---|---|
| 1 | B9 | Suppressed errors in agent scripts and code | 58% | 126/216 | 52%-65% |
| 2 | B6 | No safety boundaries in agent configuration | 54% | 117/216 | 48%-61% |
| 3 | B168 | CI workflows without test execution steps | 47% | 72/154 | 39%-55% |
| 4 | B83 | Multiple IDE configs without cross-platform governance | 43% | 77/178 | 36%-51% |
| 5 | B17 | Safety boundaries in some platform configs but not others | 38% | 28/74 | 28%-49% |
| 6 | B8 | No error handling, retry, or fallback strategies | 22% | 48/216 | 17%-28% |
| 7 | B5 | Secret patterns in agent configs of public repos | 20% | 42/215 | 15%-25% |
What these numbers mean:
- More than half of the repos we scanned have agent configurations that suppress errors without logging (B9) and lack any safety boundary keywords (B6). These aren’t edge cases — they are the majority pattern.
- 43% of repos with multiple IDE agent configs have no cross-platform governance — different tools in the same repo with different permission levels, different safety settings, or different autonomy boundaries.
- 1 in 5 public repos with agent infrastructure have detectable secret patterns in their agent config files (B5). These are API keys, tokens, and credentials committed to version control in configuration files that control agent behaviour.
Graduation filter note: Check graduation requires a fire rate between 5% and 80%. Checks detecting very common problems (>80%) or rare but critical ones (<5%) cannot graduate, so this table is structurally bounded. Extremely prevalent governance gaps may be under-represented in graduated check data.
Per-tool breakdown
Multi-IDE governance conflicts (B83) provide a natural per-tool lens. In our corpus of 178 repos where 2+ IDE agent configs coexist:
- 43% have at least one governance conflict between platforms
- The most common conflict: safety boundaries configured for one platform but missing in another (B17, 38% of multi-platform repos)
Per-tool fire rates for individual platform checks require larger per-tool sample sizes than our current corpus provides. With 10-22 repos per tool ecosystem, confidence intervals are too wide (typically ±15-25%) for meaningful per-tool comparison. Future editions with larger per-tool samples will include tool-specific breakdowns.
What never fired
31 checks never fired in our corpus. This breaks down as:
- Tool-niche checks (majority): Checks targeting specific tool configurations (Goose permission.yaml, Agno store_history_messages, Composio CLI flags) that have few or no representatives in the corpus. These checks need targeted sourcing to validate.
- Config-private checks: Checks targeting gitignored files (
.claude/settings.jsoninternals) that are invisible in public repos by definition. - Narrow pattern checks: Checks detecting specific vulnerable code patterns that happen not to appear in the repos we scanned.
The non-firing checks are diagnostic: they reveal which tool ecosystems and configuration surfaces our corpus under-represents, informing future sourcing expansion.
What the skip rates reveal
When a check is “skipped,” it means the check’s prerequisite was not met — the repo doesn’t contain the relevant configuration file. High skip rates reveal which tools are under-represented.
| Check prerequisite | Skip rate | Repos with config | Interpretation |
|---|---|---|---|
| Agent config files (generic) | 45% | 216 / 395 | 55% of repos have no scannable agent config at all |
| Multi-platform configs | 55% | 178 / 395 | 45% have 2+ platform configs |
| CrewAI configs | 93% | 25 / 395 | Small CrewAI sample |
| MCP configs | 93% | 25 / 395 | Small MCP config sample |
The 45% generic skip rate means over half our sourced repos have agent infrastructure markers (they were found by config-file search) but don’t have the specific configuration files our basic checks target. This suggests many repos are in early adoption — they use agent tools but haven’t created configuration files beyond the defaults.
The Agent Governance Taxonomy
This section presents the full governance check registry — 151 checks across 40+ tools — as a framework that teams can use to assess their own agent governance posture.
The taxonomy is organised by what it governs, not which tool it targets. A team using Claude Code and a team using Cursor face the same governance categories; the specific checks differ, but the patterns are shared.
Of the 151 checks, 109 (72%) are governance-focused — detecting configuration drift, missing boundaries, stale documentation, and ungoverned access. 31 (21%) detect security-relevant patterns like credential exposure and injection vectors. 11 (7%) are mixed. This is a governance taxonomy first, with security coverage where the governance and security surfaces overlap.
Governance categories
By severity
| Severity | Count | What it means |
|---|---|---|
| Critical | 34 | Immediate risk — unbounded autonomy, credential exposure, ungoverned tool access |
| Warning | 85 | Governance gap — missing boundaries, stale configurations, unversioned artifacts |
| Info | 27 | Best practice deviation — patterns that improve governance posture but don’t create direct risk |
5 additional checks are sentinel checks (null severity) used for scan infrastructure.
By OWASP Agentic Security Index (ASI) category
The OWASP ASI provides a shared vocabulary for agent-related risks. 93 of our 151 checks (62%) map to an ASI category. The remaining 58 (38%) cover governance patterns — documentation staleness, configuration drift, dependency versioning — that don’t fit the ASI’s security-focused taxonomy.
| ASI Category | Description | Checks | Coverage assessment |
|---|---|---|---|
| ASI04 | Data Exposure | 28 | Strongest coverage — API keys, credentials, history leakage |
| ASI03 | Insufficient Oversight | 23 | Strong — tool access control, auto-approve, unbounded autonomy |
| ASI06 | Supply Chain Vulnerabilities | 13 | Moderate — MCP supply chain, dependency pinning |
| ASI05 | Insecure Output Handling | 8 | Moderate — unsafe deserialization, output injection |
| ASI01 | Excessive Agency | 7 | Light — model output injection, config inheritance |
| ASI07 | Lack of Logging | 7 | Light — MITM hooks, SSL bypass, audit gaps |
| ASI02 | Inadequate Guardrails | 6 | Light — command injection, SDK examples |
| ASI08 | Insufficient Access Control | 1 | Gap — MCP SDK ReDoS only; access control patterns under-represented |
| (unmapped) | Governance patterns | 58 | Not security-focused — docs, config drift, versioning |
Coverage is concentrated: ASI04 and ASI03 account for 55% of all OWASP-mapped checks. ASI08 (access control) and ASI02 (guardrails) are under-served — these are areas where governance checks should expand in future editions.
By what can be detected
Not all governance patterns are observable from a repository’s public contents. Our checks are classified by observability:
| Class | Count | What it covers | Can the scan detect it? |
|---|---|---|---|
| repo-observable | 87 (58%) | Config files, dependency versions, CI/CD pipelines, documentation | Yes — automated scanning |
| config-private | 26 (17%) | IDE settings files that are gitignored, local environment configs | No — requires local access |
| tool-niche | 38 (25%) | Tool-specific configs that require external APIs or registries | No — requires tool-specific integration |
Automated scanning covers 58% of known governance patterns. The remaining 42% requires team-level practices: config review, tool permission audits, and runtime monitoring. This is a structural limitation of any repo-based scanning approach — it is not specific to Theory Delta.
Tool coverage
The registry covers 40+ tools. The top 15 by check count:
| Tool | Checks | % of registry | Focus |
|---|---|---|---|
| Claude Code | 24 | 15.9% | Hooks, permissions, settings, agent isolation |
| MCP (protocol) | 14 | 9.3% | Schemas, sessions, transport, stickiness |
| Goose | 9 | 6.0% | Config, permissions, MCP integration |
| CrewAI | 6 | 4.0% | Race conditions, path traversal, config |
| FastMCP | 5 | 3.3% | Sandbox, code mode, config |
| Open WebUI | 4 | 2.6% | Tool installation, PyPI supply chain |
| Cline | 4 | 2.6% | MCP auto-approve, hooks, rules |
| Aider | 4 | 2.6% | Credentials, config, conventions |
| Gemini CLI | 3 | 2.0% | Prototype pollution, config |
| LiteLLM | 3 | 2.0% | API key logging, migrations |
| LangChain | 3 | 2.0% | ReDoS, SSRF, dependency pins |
| OpenAI (SDK) | 3 | 2.0% | Model execution, guardrails |
| GitHub Copilot | 3 | 2.0% | CODEOWNERS, instruction injection |
| Codex (CLI) | 3 | 2.0% | Sandbox, policies, config |
| LlamaIndex | 3 | 2.0% | Pickle deserialization, deps |
18 additional tools have 1-2 checks each: Windsurf, Cursor, Continue, Semantic Kernel, Vercel AI SDK, Dify, Firecrawl, DeepEval, browser-use, Mem0, Crawl4AI, Temporal, Agno, DSPy, n8n, Composio, Haystack, Google ADK.
Coverage concentration
Claude Code + MCP account for 25% of the registry — the deepest governance coverage. This reflects the tools’ maturity and the depth of evidence available, not a claim that these tools have more governance problems than others. Tools with 1-2 checks are covered at a detection level; deeper evaluation requires more evidence.
Coverage gaps
Tools with significant agent infrastructure but minimal governance checks in this edition:
- Cursor (1 check) — largest community (38K+ awesome-cursorrules) but limited governance coverage beyond cross-platform conflict detection
- Windsurf (2 checks) — config bleed detection only
- Amazon Bedrock Agents — no dedicated checks
- Roo Code — no dedicated checks
- Mastra, AG2/AutoGen — minimal coverage
These gaps are priorities for future editions.
The readiness framework
Based on the governance patterns we observe across repositories, we classify governance maturity into three levels:
Early
The team uses agent tools. Minimal governance practices are in place. Typical signals:
- Agent config files exist but are not version-controlled or reviewed
- Tool permissions are implicit (whatever the default is)
- No monitoring of agent behaviour or control enforcement
- MCP servers added without review
Documented
Agent governance is explicit and version-controlled. Typical signals:
- Agent config files are reviewed in PRs
- Tool permissions are explicitly declared (allowlists, not defaults)
- Safety boundaries are configured (maxTurns, allowed tools, restricted paths)
- MCP server list is curated and reviewed
Tested
Governance controls are verified under production conditions. Typical signals:
- Controls are tested to confirm they actually fire (not just configured)
- Agent pipeline outputs are tested against golden results
- Hook execution is independently monitored
- Governance posture is tracked over time (scan history, trajectory)
Most teams we observe are at the Early stage. Moving from Early to Documented requires process changes, not new tooling. Moving from Documented to Tested requires monitoring and testing infrastructure.
Using this taxonomy
For self-assessment: Review the check categories above against your team’s agent infrastructure. Which categories have you addressed? Which haven’t you considered?
For policy writing: Use the OWASP ASI mapping to connect agent governance checks to your existing security control framework. Note the ASI08 and ASI02 gaps — your policy should cover access control and guardrails even where automated checks don’t yet exist.
For automated scanning: Run the AI Agent Setup Check to detect the repo-observable patterns in your codebase automatically.
For the full check catalog: The complete 151-check registry with descriptions, consequences, evidence links, and OWASP mappings is available as a companion artifact at theorydelta.com/checks/.
Section 5: Methodology and Limitations
This section describes exactly what we did, exactly what we measured, and exactly what we cannot claim. Most comparable reports are opaque about methodology. We think that is a mistake. If you cannot evaluate the method, you cannot evaluate the findings.
How the observatory works
The Theory Delta observatory is a pipeline with four stages:
-
Sourcing. GitHub code search identifies public repositories containing agent infrastructure config files —
CLAUDE.md,.cursorrules,mcp.json,crewai.yaml,.github/copilot-instructions.md, and similar indicators across 26+ tools. Repos are selected for the presence of these files, not by popularity, language, or topic. -
Scanning. Each repo is shallow-cloned locally. An automated scan runs 151 governance checks against the clone. Checks are shell-script based and deterministic — no LLM inference is used for basic or technical checks. Strategic checks (governance document quality) use minimal LLM evaluation for content assessment only.
-
Aggregation. Results are collected with per-check fire rates, false positive rates, and sample sizes. Wilson score confidence intervals are computed for all rates.
-
Graduation. Checks must demonstrate acceptable fire rates (5—80%) and false positive rates (<20%) across 10+ scanned repos before their results are included in aggregate statistics. Checks that exceed 40% false positives are pulled from the registry entirely.
The check registry
The scan runs 151 checks across three scan types:
| Scan type | Count | What it measures |
|---|---|---|
| Basic | 126 | Detection of config files, dependency patterns, permission settings, safety signals |
| Technical | 16 | Infrastructure concerns: vulnerable dependency versions, CI/CD pipeline gaps, supply chain risks |
| Strategic | 9 | Governance documentation quality: whether intent docs exist and are meaningful |
Every check must be backed by an evidence block from a knowledge corpus of 250+ blocks documenting empirical findings about agent tools. A check without evidence is not admitted to the registry. Each check specifies a severity level, a tier (Free or Pro), an observability class, and a consequence statement describing the governance risk it detects.
What we can measure
87 of 151 checks (58%) are classified repo-observable. These target artifacts visible in public repositories:
- Config files — agent instruction files, permission settings, MCP server declarations
- Dependency versions — pinned versions in
package.json,requirements.txt,pyproject.toml - CI/CD pipelines — GitHub Actions workflows, pre-commit hooks, test infrastructure
- Documentation — README files, AGENTS.md, tool-specific instruction files
- Safety boundaries — permission allowlists, autonomy limits, hook configurations
These checks produce the quantitative findings in this report. When we state a fire rate, it is derived from repo-observable checks run against repos in our corpus.
What we cannot measure
The remaining 64 checks (42%) target artifacts that are not visible in public repos:
| Class | Count | What it covers | Why it is invisible |
|---|---|---|---|
config-private | 26 | IDE settings files (e.g., .claude/settings.json), local environment configs | These files are typically gitignored. Their absence from a repo does not mean they do not exist locally. |
tool-niche | 38 | Tool-specific configs for tools with fewer than 10 public repos using them | Insufficient public data to compute meaningful fire rates. |
Beyond observability class, several categories of governance data are structurally inaccessible to our method:
- Private repositories. The observatory only scans public repos. Enterprise governance practices — which may be significantly more mature — are not represented in any finding.
- Runtime outcomes. Scan data shows what is configured, not whether misconfigurations cause failures in practice. A repo with no permission boundaries may never experience a runaway agent. A repo with careful governance may still have incidents. We measure configuration, not consequences.
- Longitudinal trends. This is edition 1. We have no scan history. Every finding is a point-in-time snapshot, not a trend. Claims about whether governance is “improving” or “declining” are not supported by our data.
Known biases
Five structural biases shape the data in this report. We name them so readers can calibrate accordingly.
Tool concentration. Claude Code and MCP together account for approximately 37% of the check registry. The report has deeper governance coverage for these tools than for others. Findings about Claude Code governance gaps are more granular than findings about, say, CrewAI governance gaps — not because CrewAI has fewer gaps, but because we have more checks looking for them in Claude Code.
Sourcing bias. GitHub code search finds repos that contain specific config filenames. Repos with non-standard naming conventions, repos hosted outside GitHub, and repos that configure agent tools through mechanisms we do not search for are invisible to the observatory.
Public repo bias. Public repos skew toward open-source projects, solo developers, and educational repositories. Teams building production agent infrastructure on private repos are not represented. Enterprise governance maturity may differ substantially from what we observe.
Graduation filter bias. Check graduation requires a fire rate between 5% and 80%. This means aggregate statistics computed across graduated checks are structurally bounded: checks that almost never fire and checks that almost always fire are excluded before aggregation. When we report an average fire rate across graduated checks, the graduation filter mechanically prevents that average from being near 0% or near 100%. This filter exists for quality reasons — checks outside the 5—80% range are either too noisy or too rare to be informative — but it means aggregate fire rates should not be interpreted as unfiltered ecosystem measurements.
Temporal bias. All scans reflect the state of repos at the time of scanning. Agent infrastructure is changing rapidly. Findings from this edition may not reflect the state of the same repos even weeks later.
Statistical approach
All quantitative findings in this report follow these conventions:
- Fire rates are reported with sample size and 95% Wilson score confidence intervals.
- Standard format: “In our corpus of N repos, we observed X in Y of Z applicable repos (95% CI: A%—B%).”
- Where N is small (fewer than 30 applicable repos for a given check), we flag the wide confidence interval explicitly.
- We do not claim ecosystem-wide representativeness. Observatory data characterises the repos we scanned. Extrapolation to “all repos using agent tools” is not supported by our method.
What we are doing about the limitations
- Multi-platform sourcing. Expanding search patterns to improve representation of tools beyond Claude Code and MCP.
- Longitudinal tracking. Building scan history infrastructure so edition 2 can report trends, not just snapshots.
- Private repo data. Seeking design partners willing to share anonymised scan results from private repositories, to test whether public-repo governance patterns generalise.
- Outcome data. Exploring partnerships to correlate configuration gaps with incident reports, to move from “what is configured” toward “what matters.”
The full check registry, including check IDs, descriptions, observability classes, and backing block references, is published alongside this report. Readers who want to evaluate specific checks can inspect the detection logic directly — every check is open source.
Recommendations
The findings in this report point to a common root cause: agent infrastructure is treated as application configuration when it should be treated as operational infrastructure. The governance practices that teams apply to CI/CD pipelines, database access, and deployment permissions have not yet been extended to agent configurations, tool permissions, and memory systems.
These recommendations are organised by audience. Each is grounded in specific findings from Section 2.
For Engineering Leaders
1. Treat agent configuration as infrastructure, not settings
Agent config files (CLAUDE.md, .cursorrules, mcp.json, .claude/settings.json) control what AI can do in your codebase — file access, network access, tool permissions, autonomy level. They should be version-controlled, reviewed in PRs, and subject to the same change management as CI/CD pipelines.
Why this matters now: Our findings show that configuration files are execution contexts (Section 2.5). A .claude/settings.json can redirect API credentials, auto-approve MCP servers, and inject hooks that run shell commands — all before a developer has a chance to review.
First step: Add agent config files to your team’s PR review checklist. If .claude/settings.json or .cursorrules changes in a PR, it gets the same review scrutiny as a Dockerfile or GitHub Actions workflow change.
2. Assume your safety controls fail silently
Every control we tested — budget enforcement, content guardrails, hook-based policies, read-only database access — has confirmed failure modes where the control is configured but not enforced, with no error signal (Section 2.1, 2.2, 2.3).
First step: For any safety-critical control in your agent stack, add independent monitoring that verifies the control is actually firing. A hook that logs to an external service. A budget check that compares gateway-reported spend against provider invoices. A read-only test that attempts a write and verifies rejection. If the monitoring shows a gap, the control is not working — regardless of what the configuration says.
3. Assess your team’s governance readiness level
Use the readiness framework from Section 4 to place your team on the governance maturity scale:
- Early — agent tools are in use, minimal or no governance practices
- Documented — agent configurations are version-controlled and reviewed; tool permissions are explicit
- Tested — governance controls are verified under production conditions; monitoring detects control failures
Most teams are at the Early stage. Moving to Documented requires only process changes, not new tooling.
For Practitioners
4. Audit your MCP server list like a dependency manifest
Your MCP configuration is a dependency manifest equivalent to package.json or requirements.txt. Each server you add gets the agent’s full permissions unless explicitly restricted.
What to do:
- Review every MCP server in your config. Do you know what each one does?
- Check for
enableAllProjectMcpServersin.claude/settings.json— if present, remove it. This auto-approves all project-defined MCP servers without consent. - Pin MCP server versions where possible. Monitor for description changes between sessions (the “rug-pull” vector documented in Section 2.3).
5. Implement defence-in-depth for any hook-based policy
If you use hooks (Claude Code PreToolUse/PostToolUse, Cline hooks, or equivalent) as a governance mechanism, do not rely on a single hook as the enforcement point (Section 2.2).
What to do:
- Use multiple hook events for the same policy (both
PreToolUseandPostToolUse) - Add the
PostCompacthook to re-inject configuration after context compaction - Log hook execution to an external file — if the log entry is missing, the hook did not fire
- Test hooks on your deployment platform, not just your development machine
6. Test your agent pipeline outputs, not just inputs
Silent data corruption in RAG pipelines, memory systems, and routing logic produces plausible but incorrect output with no error signal (Section 2.1, 2.4).
What to do:
- Add golden-output tests to your agent pipeline — feed known inputs and verify the output matches expected results
- For graph-based RAG: test with entities that share names but differ in type
- For agent memory: test concurrent writes from multiple sessions
- For step-limited agents: verify what happens when the step cap triggers mid-retrieval
For Security Teams
7. Map agent config governance to your existing control framework
The 151 governance checks in Section 4 are mapped to OWASP Agentic Security Index (ASI) categories. Use this mapping to integrate agent config governance into your existing security control framework.
Key categories:
- ASI04 (Data Exposure): 28 checks — API keys in config, credential exposure, history leakage
- ASI03 (Unauthorised Execution): 23 checks — tool access control, auto-approve settings, unbounded autonomy
- ASI06 (Misconfiguration): 13 checks — config conflicts, stale rules, precedence overrides
8. Require governance review for agent infrastructure changes
Agent configuration changes should follow the same approval process as infrastructure changes. Specifically:
- Changes to MCP server lists require review (equivalent to adding a new dependency)
- Changes to tool permissions require review (equivalent to modifying IAM policies)
- Changes to hook or rule configurations require review (equivalent to modifying firewall rules)
- New agent tool adoption requires a governance assessment (equivalent to a vendor review)
9. Recognise what automated scanning cannot detect
42% of our check registry targets configurations that are not visible in public repositories — IDE-specific settings files that are gitignored, tool-specific configs that require external APIs, and runtime behaviours that are only observable during execution (Section 5).
Automated scanning is a starting point, not a complete governance solution. It detects the 58% of governance patterns that are observable from repository contents. The remaining 42% requires team-level practices: config review, tool permission audits, and runtime monitoring.
How to start
For teams at any governance maturity level, the single highest-leverage action is:
Run the scan. The AI Agent Setup Check is a free, automated scan that detects the governance patterns documented in this report. It runs against your repository and produces specific, evidence-backed findings — not generic advice.
The scan covers N checks across 26+ tools. Each finding includes what was detected, why it matters, and what to do about it. It takes under 5 minutes.
For teams that want deeper evaluation — trajectory tracking, contextual recommendations, and full evidence chains — the Pro Audit provides ongoing governance assessment that tracks whether your governance is improving over time.