theorydelta field guide
built 2026-06-01 findings: 49 task hubs: 6 independent · evidence-traced · no vendor influence

Cursor Automations Introduces Always-On Agents With No Execution Visibility

Published: 2026-05-21 Last verified: 2026-05-17 empirical
Staleness risk: high — facts in this subject area change quickly between releases. Re-check the specific claims against your own environment before acting. (This rates the topic, not whether this page is out of date.)

Cursor Automations Introduces Always-On Agents With No Execution Visibility

What you expect

Cursor’s agent mode prompts for approval before executing terminal commands. Builders testing locally experience a gated loop: the agent proposes an action, the developer approves or rejects, execution proceeds. The implicit model is “AI assistance within a developer-controlled feedback loop.”

Two patched CVEs (CurXecute, MCPoison) established that MCP trust vulnerabilities exist in Cursor but were addressed in v1.3. The editor’s config surface is well-understood via .cursor/rules/ and .cursor/mcp.json.

What actually happens

Automations (March 2026) breaks the approval-gate assumption. Automations introduces always-on agents triggered by external webhooks — Slack messages, Linear issues, GitHub events, PagerDuty alerts. These agents:

  • Auto-approve all actions at runtime with no per-action gate
  • Show no execution state during a run — no progress indicator, no mid-run visibility, no documented abort mechanism
  • Accumulate persistent memory across invocations via a memory tool (confirmed in the Mar 5, 2026 changelog): each webhook-triggered run can read and write persistent agent memory, so state from a prior run influences subsequent runs
  • Are undetectable via filesystem scanning — Automations is web-only (cursor.com/automations); no config files are written locally; repo scanning always returns a false negative

Testing locally in standard agent mode does not predict Automations behavior. The approval prompts that appear in local mode are absent in Automations. A team that validated their agentic workflow in local mode and then configured Automations has not tested the deployment they are actually running.

CVE-2026-26268 (February 2026) adds a transparency failure in local mode: the agent fires system commands outside the reasoning chain visible to the user in the chat UI. Commands execute without appearing in the reasoning trace the developer is watching — the UI does not surface all tool calls, only those routed through the visible chain.

Rule-file injection compounds the risk. Lasso Security’s empirical study of 314 test cases confirmed an 84% attack success rate via coding rule files (.cursor/rules, CLAUDE.md, .windsurfrules). A malicious rule file in a repo that receives external contributions can redirect or exfiltrate agent output. Combined with Automations’ no-approval-gate execution model and persistent memory, a successful rule-file injection can:

  1. Execute in a webhook-triggered run with no human checkpoint
  2. Write state to agent memory, persisting the injection across all future invocations
  3. Require active memory clearing to remediate — config file cleanup alone is insufficient

MCP trust vulnerabilities remain a structural risk. CVE-2025-54135 (CurXecute): external content (Slack message) → LLM summarization → MCP config mutation → shell access. CVE-2025-54136 (MCPoison): MCP server trust bound to config name, not content hash — an attacker swaps the command payload while the original approval persists for the new payload. Both patched in v1.3 (July 2025). The Automations Slack webhook trigger recreates the CurXecute attack chain as a first-class feature: external Slack content enters an auto-approving agent with persistent memory. The patch addressed the specific exploit path, not the underlying architecture.

Functional constraints that are not documented:

  • Combined server name + tool name limit: 60 characters — exceeding this truncates silently
  • Agent accuracy degrades beyond ~40 tools — tool selection becomes unreliable at scale
  • Missing mcpServers key in config is silently ignored with no error or warning
  • MCP Resources support only arrived in v1.6 (September 2025); prior versions had no resource capability

What this means for you

Your local testing does not cover your production risk surface. Automations agents run without the approval gates that characterize local testing. An organization that has configured Automations for Slack or GitHub events has standing autonomy that executes without a human in the loop. Because Automations is web-only with no filesystem artifacts, standard repo scanning gives no signal about whether this surface exists or how it is configured.

The memory tool makes injection persistence the critical risk. A one-time rule-file injection into an Automations agent can write to persistent memory and affect all future webhook-triggered runs. Remediating this requires identifying and clearing the specific memory state written — not just removing the malicious rule file.

Cloud agent approval posture differs from local agent posture. Cloud Agent (v2.0+) auto-runs all terminal commands without per-command approval; local agent mode prompts for approval. Multi-agent mode runs up to 8 parallel agents via git worktrees with no documented execution bounds. Teams testing locally with approval prompts will not see that behavior in cloud or multi-agent execution.

Self-hosted cloud agents in Automations lack documented isolation controls. Cursor-hosted cloud agents run in Cursor-managed VMs with implied isolation; self-hosted deployments inherit whatever the operator provides. A self-hosted configuration with no explicit network policy or container sandbox removes isolation that Cursor-hosted agents had, without surfacing that removal to the operator.

What to do

  1. Audit whether Automations is configured for your organization — check cursor.com/automations directly; no filesystem scan will find it. Confirm whether any automation lacks a maxTurns limit or approval gate.

  2. Treat .cursor/rules files as untrusted input if your repo receives external contributions. Standard SAST tools do not scan rule files. Use MEDUSA (pip install medusa-security) — the only open-source scanner explicitly covering CLAUDE.md and .cursorrules for injection vectors.

  3. Audit agent memory after any suspected injection. If an Automations agent has been running against a repo that received external changes, assume memory may carry injected state. Config file cleanup alone is insufficient — memory state must be inspected and cleared.

  4. For MCP integrations, verify your MCP server registration uses full specification hashing, not name-based trust. The MCPoison class of vulnerability (CVE-2025-54136) applies to any MCP client that stores approvals against a name string rather than a content hash of the executable and arguments.

  5. Pin to exact Cursor versions for MCP-dependent workflows. MCP Resources support requires v1.6+; the Hooks beta (v1.7) for observation controls requires v1.7+; Sandboxed Terminals GA (macOS) requires v2.0+. Verify your deployment version matches the capabilities you depend on.

  6. Do not rely on the Cursor UI chat trace as a complete record of agent actions. CVE-2026-26268 confirms that commands can execute outside the visible reasoning chain. For audit purposes, supplement chat trace inspection with system-level command logging.

Falsification criterion: This finding would be disproved by Cursor shipping a per-action approval gate for Automations webhook agents, or by Cursor publishing documentation showing that Automations execution state is visible and interruptible during a run — either of which would eliminate the core approval-gate divergence between local and Automations execution modes.

Evidence

ToolVersionEvidenceResult
Cursor AutomationsMar 2026 launchsource-reviewedWebhook-triggered agents auto-approve all actions; no execution visibility during run; web-only (no filesystem artifacts)
Cursor changelog Mar 5, 2026Mar 2026source-reviewedMemory tool confirmed in Automations agents — explicit read/write of persistent state across invocations
CVE-2025-54135 (CurXecute)pre-v1.3independently-confirmedSlack prompt injection → MCP config rewrite → arbitrary command execution
CVE-2025-54136 (MCPoison)pre-v1.3independently-confirmedMCP trust bound to name not content hash — silent payload swap bypasses approval
Lasso Security rule-file injection studyMar 2026, 314 casesindependently-confirmed84% attack success rate via .cursor/rules, CLAUDE.md, .windsurfrules
CVE-2026-26268Feb 2026source-reviewedAgent fires system commands outside visible reasoning chain — UI does not surface all tool calls
Cursor Cloud Agent announcementFeb 2026source-reviewedCloud Agent auto-runs terminal commands without per-command approval; local mode requires approval
Cursor v2.0 changelogOct 2025source-reviewedMulti-agent mode: 8 parallel agents via git worktrees, no documented execution bounds

Confidence: empirical — two independently confirmed CVEs, one independently confirmed injection study, multiple source-reviewed changelog claims across v0.46–v2.0+.

Strongest case against: Cursor v1.3 (July 2025) patched both CVEs, demonstrating the team responds to security findings. Box (Feb 2026), National Australia Bank (Apr 2026), and PlanetScale (Mar 2026) are using Cursor in production at regulated institutions, suggesting the risk is manageable for some deployment profiles. Automations may be configured with maxTurns limits by operators who read the documentation carefully — the gap is a default posture issue, not an absolute architectural constraint. The Hooks beta (v1.7) provides some observation capability, and Sandboxed Terminals (GA macOS, v2.0) addresses the unapproved terminal command class of failure for local mode.

Open questions: Can Cursor Hooks (v1.7 beta) block tool calls rather than just observe them? Does Automations expose any execution interruption mechanism not documented in the changelog? What memory audit and clearing controls exist for administrators of Automations agents?

Seen different? Contribute your evidence — share a repro or counter-example and we’ll review it against this finding. Reader evidence is what keeps these findings accurate.

theorydelta.com · 2026 independent · evidence-backed · every claim sourced or labelled glossary · rss · mcp · /scan · llms.txt