Agent goal hijack via web search
An autonomous agent searches the web and reads tool output. Attacker SEO-poisons / posts a comment that, when fetched, contains 'NEW INSTRUCTION:' the agent obediently follows.
§ Context
Assumed environment: target deploys an LLM agent with browse + tool-call capabilities (LangChain / AutoGPT / custom). The agent re-reads workspace memory each turn and trusts retrieved content.
§ Steps
- 01Wait for agent to encounter the pageInitial AccessT1078— Valid Accounts
- 02Tool reaches shell / file / external APIExecutionT1059— Command and Scripting Interpreter
- 03Post hidden prompt on a page the agent will fetchPersistenceAI-RAG-POISON— RAG Index Poisoning
- 04Agent invokes its tools toward attacker goalExecutionAI-TOOL-ABUSE— Tool / Function-Call Abuse
- 05Agent appends attacker instruction to planPrivilege EscalationAI-AGENT-GOAL-HIJACK— Agent Goal Hijack
§ References
§ Frequently asked
- What is the "Agent goal hijack via web search" attack path?
- An autonomous agent searches the web and reads tool output. Attacker SEO-poisons / posts a comment that, when fetched, contains 'NEW INSTRUCTION:' the agent obediently follows. It chains 5 steps drawn from real-world offensive-security techniques.
- What starting position does this attack require?
- The first step is Wait for agent to encounter the page (T1078) — a initial access primitive. Assumed environment: target deploys an LLM agent with browse + tool-call capabilities (LangChain / AutoGPT / custom).
- What is the final impact of this kill-chain?
- The final step lands on Agent appends attacker instruction to plan (AI-AGENT-GOAL-HIJACK), which falls under Privilege Escalation. From here, an operator typically pivots into post-exploitation or maintains persistence.
- How can defenders detect or prevent this attack?
- Detection and prevention vary per step. Refer to each linked MITRE ATT&CK entry under "References" — every technique on that page lists defensive controls, detection telemetry, and known threat-actor usage.
§ Related dossiers
- Shared techniques3
Indirect prompt injection via RAG document
Attacker uploads a poisoned document to a customer wiki / SharePoint that the LLM ingests at query time. Injection fires when a privileged user later asks a question that retrieves the doc.
- Shared techniques3
Prompt injection → tool-call shell RCE
Coding-assistant agent has a 'run command' tool. Hidden prompt in a README inside a project triggers `rm -rf` or fetches a reverse shell when the developer asks for help.
- Shared techniques2
Multi-agent confused-deputy → tool-call escalation
User-facing agent has limited tools; back-end planning agent has powerful tools (shell, file system). Prompt injection in user input → user agent → back-end agent. The back-end runs the attacker's intent under the planner's higher trust.
- Shared techniques2
nf_tables UAF → kernel R/W → root
CVE-2024-1086-class nf_tables UAF reachable from a user namespace. Win the race with userfaultfd to land an attacker object in the freed slot, build a kernel R/W primitive, overwrite the current task's cred struct.
- Shared techniques2
io_uring UAF → modprobe_path overwrite → root
Use an io_uring UAF to land arbitrary kernel write, repoint /proc/sys/kernel/modprobe to an attacker binary, then trigger a kernel auto-modprobe — runs the binary as root.
- Shared techniques2
Process doppelgänging → spawn signed image with attacker bytes
Use NTFS transactional file APIs to overlay an attacker image during process creation. The final mapped process differs from the on-disk file — AV sees only the legit signed image at scan time.