Multi-agent confused-deputy → tool-call escalation
User-facing agent has limited tools; back-end planning agent has powerful tools (shell, file system). Prompt injection in user input → user agent → back-end agent. The back-end runs the attacker's intent under the planner's higher trust.
§ Context
Assumed environment: target deploys an agent stack with role specialisation — front-end agent talks to user, back-end planner / executor with broader tool surface. The two agents trust each other's outputs without re-authorising.
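The trust assumption above can be made concrete with a small model. The following is a minimal sketch, not a description of any particular agent framework: the `Message` type, the agent names, and the tool names (`search_faq`, `shell`, `read_file`) are all hypothetical. It contrasts a planner that derives authority from whoever delivered a message with one that caps authority at the original author.

```python
from dataclasses import dataclass

# Hypothetical tool sets, chosen only for illustration.
FRONT_END_TOOLS = {"search_faq"}
PLANNER_TOOLS = {"search_faq", "shell", "read_file"}


@dataclass(frozen=True)
class Message:
    origin: str   # principal that authored the content, e.g. "user"
    sender: str   # agent that delivered the message to the recipient
    content: str


def relay(user_msg: Message) -> Message:
    """Front-end agent forwards user text to the planner, preserving origin."""
    return Message(origin=user_msg.origin, sender="front_end",
                   content=user_msg.content)


def allowed_tools_naive(msg: Message) -> set:
    """Confused-deputy behaviour: authority is derived from the sender only."""
    return set(PLANNER_TOOLS) if msg.sender == "front_end" else set()


def allowed_tools_scoped(msg: Message) -> set:
    """Re-authorising behaviour: authority is capped by the original author."""
    if msg.origin == "user":
        # User-authored content never unlocks more than the front-end had.
        return PLANNER_TOOLS & FRONT_END_TOOLS
    return set(PLANNER_TOOLS) if msg.sender == "front_end" else set()
```

With a relayed user message, `allowed_tools_naive` returns the full planner tool set (including `shell`), while `allowed_tools_scoped` returns only the tools the front-end agent itself could use.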
§ Steps
- 01 Tool output exfils host secrets [Exfiltration] T1041: Exfiltration Over C2 Channel
- 02 Code-exec on agent host [Execution] T1059: Command and Scripting Interpreter
- 03 Send injection payload to user agent [Initial Access] AI-PROMPT-INJECT: Direct Prompt Injection
- 04 Planner trusts relayed content as user intent [Initial Access] AI-INDIRECT-INJECT: Indirect Prompt Injection (RAG / Web)
- 05 Planner invokes shell / fs tool [Execution] AI-TOOL-ABUSE: Tool / Function-Call Abuse
- 06 User agent relays to planner agent [Privilege Escalation] AI-AGENT-MULTI: Multi-Agent Collusion / Confused Deputy
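The relay, trust, and tool-invocation steps listed above hinge on attacker-derived text reaching a high-risk tool unchanged. As a rough illustration, the sketch below marks relayed user input as tainted and lets a dispatcher refuse high-risk tools when the argument is still tainted. Everything here is an assumption made for the example: the `Tainted` wrapper, the tool names, and `plan_tool_call`, which merely stands in for a planner that has already been steered by an injection. No command is executed.

```python
from dataclasses import dataclass

HIGH_RISK_TOOLS = {"shell", "fs_read", "fs_write"}  # hypothetical tool names


@dataclass(frozen=True)
class Tainted:
    """Text that originated outside the trust boundary (e.g. end-user input)."""
    text: str


def front_end_relay(user_input: str) -> Tainted:
    # The front-end marks everything it forwards as untrusted.
    return Tainted(user_input)


def plan_tool_call(relayed: Tainted) -> tuple:
    # Stand-in for the planner's model: a successful injection makes it
    # propose a shell call whose argument is attacker-derived.
    return ("shell", relayed)


def dispatch(tool: str, arg: object) -> str:
    # Gate: refuse high-risk tools when the argument is still tainted.
    if tool in HIGH_RISK_TOOLS and isinstance(arg, Tainted):
        return f"DENIED {tool}: argument derives from untrusted input"
    return f"ALLOWED {tool}"  # nothing is actually executed in this sketch
```

The design choice is to carry the taint on the value rather than on the conversation, so the check sits at the tool boundary and does not rely on the planner noticing that the relayed text was malicious.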
§ References
§ Frequently asked
- What is the "Multi-agent confused-deputy → tool-call escalation" attack path?
- User-facing agent has limited tools; back-end planning agent has powerful tools (shell, file system). Prompt injection in user input → user agent → back-end agent. The back-end runs the attacker's intent under the planner's higher trust. It chains 6 steps drawn from real-world offensive-security techniques.
- What starting position does this attack require?
- The first step is Tool output exfils host secrets (T1041), an exfiltration primitive. Assumed environment: the target deploys an agent stack with role specialisation, where a front-end agent talks to the user and a back-end planner / executor has a broader tool surface.
- What is the final impact of this kill-chain?
- The final step lands on User agent relays to planner agent (AI-AGENT-MULTI), which falls under Privilege Escalation. From here, an operator typically pivots into post-exploitation or maintains persistence.
- How can defenders detect or prevent this attack?
- Detection and prevention vary per step. Refer to each linked MITRE ATT&CK entry under "References" — every technique on that page lists defensive controls, detection telemetry, and known threat-actor usage.
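One detection idea that follows from the path described here is to audit tool calls together with the principal that started the request chain. The sketch below is an assumption-laden illustration and is not taken from any MITRE ATT&CK entry: the `ToolCallEvent` fields, the tool names, and the principal labels are hypothetical, and a real deployment would need its agents to log this provenance in the first place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCallEvent:
    request_id: str
    agent: str           # agent that invoked the tool
    tool: str            # tool that was invoked
    root_principal: str  # who started the request chain


# Hypothetical labels; adjust to the deployment's own tool and principal names.
HIGH_PRIVILEGE_TOOLS = {"shell", "fs"}
UNTRUSTED_PRINCIPALS = {"end_user", "anonymous_user"}


def flag_suspicious(events):
    """Return high-privilege tool calls whose chain began with an untrusted principal."""
    return [
        e for e in events
        if e.tool in HIGH_PRIVILEGE_TOOLS
        and e.root_principal in UNTRUSTED_PRINCIPALS
    ]


sample_log = [
    ToolCallEvent("r1", "front_end", "search_faq", "end_user"),
    ToolCallEvent("r2", "planner", "shell", "end_user"),
    ToolCallEvent("r3", "planner", "shell", "ops_engineer"),
]
flagged = flag_suspicious(sample_log)
```

In this sample, only request `r2` is flagged: the planner ran a shell tool on behalf of a chain that an end user started.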
§ Related dossiers
- Shared techniques: 3
Indirect prompt injection via RAG document
Attacker uploads a poisoned document to a customer wiki / SharePoint that the LLM ingests at query time. Injection fires when a privileged user later asks a question that retrieves the doc.
- Shared techniques: 3
Prompt injection → tool-call shell RCE
Coding-assistant agent has a 'run command' tool. Hidden prompt in a README inside a project triggers `rm -rf` or fetches a reverse shell when the developer asks for help.
- Shared techniques: 2
Apache Struts S2-045 (CVE-2017-5638) → Equifax-style breach
Crafted Content-Type header is parsed as OGNL — execute commands as the app user. The 2017 Equifax breach origin: unpatched Struts endpoint exposed to the internet.
- Shared techniques: 2
Agent goal hijack via web search
An autonomous agent searches the web and reads tool output. Attacker SEO-poisons / posts a comment that, when fetched, contains 'NEW INSTRUCTION:' the agent obediently follows.