From Prompt Injection to Mission Drift: The Emerging Attack Vectors Targeting AI Agents

By Dror Zelber May 27, 2026

As organizations adopt Agentic AI at scale, attackers are already adapting their tactics. The shift from passive, text only AI to autonomous, tool enabled agents introduces a wide range of new attack surfaces. These systems read documents, browse the web, call APIs, invoke functions, update records, and even interact with other agents. This creates opportunities for attackers to manipulate inputs, steering agents into harmful actions. What used to be a simple “prompt injection” issue is now only the beginning. Agentic AI systems carry new risks that traditional AI governance and application security frameworks were never designed to handle — and the threat landscape is evolving faster than most security programs can keep up.

Key Attack Vectors

Agentic AI is vulnerable to a new generation of attacks because it combines language understanding with tools, memory, and the ability to execute. In traditional LLMs, bad prompts mostly lead to bad text. In Agentic AI, bad inputs can lead to bad actions — data exfiltration, security changes, fraudulent operations and more. Below are the most critical attack vectors security teams must anticipate and defend against.

1. Classic Prompt Injection — Supercharged

Prompt injection remains one of the most well known AI risks — but in agent environments, the consequences are dramatically elevated. Instead of simply producing an unwanted response, a manipulated agent can be tricked into performing harmful tasks.

Indirection attacks utilizes benign looking instructions to guide the agent toward malicious follow up content. Attackers embed hidden commands inside PDFs, HTML tags, email signatures, web content, or image metadata that the agent consumes as part of a workflow. Because agents often trust retrieved information, these hidden instructions can override safety policies, alter behavior, and trigger dangerous tool actions.

In short, prompt injection becomes operational, not informational — and its blast radius expands accordingly.

2. Tool Use Exploits

Tools are the operational layer of Agentic AI — the functions that let it send emails, write files, fix tickets, query APIs, or modify systems. Attackers design manipulations that steer the agent into misusing these tools in ways that appear legitimate but cause real harm.

A common scenario is function hijacking, where crafted prompts convince the agent to call a powerful API with dangerous parameters. Attackers may also prompt the agent to “explore” backend APIs until it discovers high-privilege functionality. When an agent can export documents, share links, or send messages, attackers exploit this to exfiltrate data through paths that appear normal to monitoring systems.

What makes tool attacks severe is that they turn linguistic manipulation into system level consequences.

3. Retrieval Augmented Attacks (RAG Exploitation)

Many agents rely on retrieval pipelines that pull information from documents, knowledge bases, or external content. Attackers target this layer because it feeds trusted knowledge to the agent, and poisoning it is an extremely effective way to control behavior.

A single malicious document can influence multiple agent actions if the agent believes the information is authoritative. Embedding manipulation can elevate attacker created files in the ranking of search queries, forcing the agent to choose compromised content over real documentation. Because retrieval layers typically mix internal and external sources, attackers can also plant misleading information in accessible locations where the agent might discover it.

RAG poisoning creates a high impact, low visibility attack vector, especially in enterprises that rely on shared knowledge stores.

4. Memory & State Abuse

Agents with long term or session memory accumulate information during their operations. If attackers can influence what gets stored — or how the agent interprets its stored knowledge — the agent’s future behavior becomes compromised.

Attackers may convince agents to store false facts or long term preferences that misguide future decisions. Over time, repeated nudges can cause mission drift, slowly altering the agent’s priorities or interpretations of its goals. Reward hacking introduces another dimension: when agents optimize for metrics, they may learn to take shortcuts that violate security policy while achieving higher apparent “performance.”

Memory attacks are particularly dangerous because they persist and compound over time.

5. Identity & Delegation Attacks

Identity becomes a major attack surface when agents have credentials or operate on behalf of users. Manipulating an agent’s authority or its understanding of identity boundaries can lead to privilege escalation or unauthorized operations.

Attackers may trick agents into requesting broader OAuth scopes, escalating privileges under the guise of needing “access to complete the task.” Confused deputy attacks occur when attackers convince the agent to perform actions using its trusted identity. Cross-agent impersonation emerges when one agent is tricked into believing a message came from another, enabling inter-agent exploitation.

Identity attacks succeed because organizational models for agent identity are still immature, and boundaries remain undefined.

6. Supply Chain Attacks for Agent Tools

Agents depend on a growing ecosystem of third party plugins, connectors, extensions, and external models. Each integration becomes part of a new supply chain that most organizations cannot fully see or validate.

Unvetted plugins may leak data, execute unauthorized actions, or introduce vulnerabilities. Endpoint substitution attacks swap out a safe model endpoint for a risky or experimental one. Some tools routinely send telemetry, prompts, and outputs to external servers without adequate controls or visibility. Like traditional software supply chains, attackers exploit the weakest link — except now that link is part of an autonomous system capable of taking real actions.

The risk is often invisible until it is too late.

Detection and Prevention Tactics

Protecting against these attack vectors requires new layers of defense that span prompts, outputs, identity, tools, memory, and runtime behavior. Traditional application security methods alone are insufficient because Agentic AI blends language vulnerabilities with operational capabilities.

Prompt & Output Firewalls

Firewalls detect and block harmful instructions or outputs that violate policy. They filter malicious patterns, prevent untrusted content from issuing commands, and stop attempts to invoke high risk tools without justification.

Provenance & Trust Tiers

Different sources must be treated with different trust levels. Public web data should not drive high risk actions. Signed internal documents should take priority. Trust tiers govern what an agent may do with the content it retrieves.

Tool Sandboxing

Limit the power of tools, constrain parameter values, and require approval for destructive operations. Only allow the agent to use tools that match the task and its permission level.

Memory Hygiene

Control what gets written to long term memory. Apply review or filters to memory writes and separate facts from preferences. Temporary memories should expire automatically after risky tasks.

Agent Level IAM

Give each agent a dedicated identity with least privilege access. Enforce short lived credentials and monitor all agent-driven actions.

Observability

Logging prompts, plans, decisions, and tool calls is essential to understanding agent behavior. Observability enables forensics, compliance, and real time anomaly detection.

Takeaway

Agentic AI introduces attack vectors that go far beyond traditional prompt injection. These systems blend language understanding with tool use, memory, identity, and decision making — making them powerful but vulnerable. The risk is operational, not just informational. To deploy agents safely at scale, organizations must implement controls across the entire agent lifecycle: from inputs and memory to identity, tools, and runtime behavior. The companies that modernize their defenses now will be the ones that harness Agentic AI safely, while others struggle with emergent threats.

Interested in Radware’s Agentic AI Protection Solution?

Let Radware do the heavy lifting while you expand your portfolio, grow revenue and provide your customers and business with unmatched protection.

Learn More about Radware’s Agentic AI Protection

Contact Radware

Dror Zelber

Dror Zelber is a 30-year veteran of the high-tech industry. His primary focus is on security, networking and mobility solutions. He holds a bachelor's degree in computer science and an MBA with a major in marketing.