Weaponizing Images: Prompt Injection Meets Dark AI


Introduction

Imagine this – you send your favorite AI assistant a seemingly innocent image, and moments later, emails containing sensitive information scraped from your computer are sent to an unknown address.

This scenario isn’t science fiction – it’s based on a newly published research paper showing how attackers can hide malicious instructions inside images that only ‘come to life’ once the image is processed by an AI system.

How this works

When a user sends an image to an AI assistant, the image is usually in high resolution. AI systems often downscale images to a fixed resolution, which can reduce visible detail but is primarily done for performance and consistency reasons. Downscaling can happen through several algorithms, among them the following (compared in the short code sketch after the list):

  • Nearest Neighbor – each new pixel takes the color of the single nearest pixel in the original image.
  • Bilinear Interpolation – each new pixel takes its color from a weighted average of the 4 surrounding pixels.
  • Bicubic Interpolation – each new pixel takes its color from a weighted average of the 16 surrounding pixels, using cubic polynomial calculations.
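
The following is a minimal sketch of the difference, assuming Python with Pillow 9.1+ and NumPy installed (none of this code comes from the paper): the same tiny gradient image is downscaled with each filter, and the resulting pixel values diverge.

```python
# Minimal comparison of downscaling filters (illustrative sketch, not from the paper).
# Assumes Pillow 9.1+ (for the Image.Resampling enum) and NumPy.
from PIL import Image
import numpy as np

# An 8x8 grayscale gradient stands in for a real photo.
src = Image.fromarray(np.arange(64, dtype=np.uint8).reshape(8, 8) * 4)

# Downscale to 2x2 with each resampling filter and compare the pixel values.
for name, method in [
    ("nearest", Image.Resampling.NEAREST),    # copies the closest source pixel
    ("bilinear", Image.Resampling.BILINEAR),  # blends nearby source pixels linearly
    ("bicubic", Image.Resampling.BICUBIC),    # blends a wider neighborhood with cubic weights
]:
    small = src.resize((2, 2), resample=method)
    print(name, np.asarray(small).tolist())
```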

The attack described in the paper exploits this downscaling step – the user sends what looks like a normal image, but once the image is downscaled, a prompt with malicious instructions is exposed. The AI assistant then sees the prompt in the image, interprets it as part of the user’s instructions, and acts on it.
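
To make the flow concrete, here is a hypothetical sketch of the preprocessing step on the assistant’s side (the 512×512 target size, bicubic filter, and file names are illustrative assumptions, not details from the paper): the model never receives the original upload, only the downscaled copy, which is exactly where a crafted prompt would surface.

```python
# Hypothetical preprocessing step inside an AI assistant (sketch only).
# Assumptions: Pillow 9.1+, a fixed 512x512 model input size, bicubic resampling.
from PIL import Image

MODEL_INPUT_SIZE = (512, 512)  # illustrative value; real systems vary

def preprocess_for_vision_model(path: str) -> Image.Image:
    """Return the image exactly as the vision model would receive it."""
    original = Image.open(path).convert("RGB")
    # Only this downscaled copy reaches the model, never the full-resolution original.
    return original.resize(MODEL_INPUT_SIZE, resample=Image.Resampling.BICUBIC)

# Saving the downscaled copy lets a human inspect what the model actually sees;
# in the attack, the hidden text becomes legible only in this version.
preprocess_for_vision_model("uploaded_image.png").save("what_the_model_sees.png")
```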

Figure 1: Harmless original image vs. the downscaled version with the hidden prompt exposed (source: Trail of Bits)


The researchers created a custom test to identify the downscaling algorithm used by each AI system, and then crafted an image tailored to that algorithm to reveal the hidden text as accurately as possible. The researchers also stated that they applied the default configuration for the Zapier MCP server (a Model Context Protocol integration that lets AI systems securely trigger actions across thousands of apps through Zapier). The default configuration automatically approves all MCP tool calls without user confirmation.
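
The paper’s actual identification test isn’t reproduced here, but the sketch below (again assuming Python with Pillow 9.1+ and NumPy; the probe design and function names are my own illustration) shows the underlying idea: each filter spreads, blends, or drops a single bright ‘impulse’ pixel differently, so comparing an observed downscaled probe against locally computed candidates points to the algorithm in use.

```python
# Rough illustration of identifying a downscaling algorithm (not the paper's exact test).
# Assumptions: Pillow 9.1+ and NumPy; `observed` stands in for a downscaled probe
# recovered from the target system.
from PIL import Image
import numpy as np

CANDIDATES = {
    "nearest": Image.Resampling.NEAREST,
    "bilinear": Image.Resampling.BILINEAR,
    "bicubic": Image.Resampling.BICUBIC,
}

def identify_downscaler(probe: Image.Image, observed: np.ndarray) -> str:
    """Return the candidate filter whose local downscale best matches the observation."""
    def error(method) -> float:
        local = np.asarray(probe.resize(observed.shape[::-1], resample=method), dtype=float)
        return float(np.abs(local - observed).sum())
    return min(CANDIDATES, key=lambda name: error(CANDIDATES[name]))

# Probe: one bright pixel on black; each filter turns it into a different 4x4 pattern.
probe_array = np.zeros((16, 16), dtype=np.uint8)
probe_array[7, 7] = 255
probe = Image.fromarray(probe_array)

# Pretend the target pipeline uses bicubic downscaling and we recovered its output.
observed = np.asarray(probe.resize((4, 4), resample=Image.Resampling.BICUBIC), dtype=float)
print(identify_downscaler(probe, observed))  # matches the bicubic candidate exactly
```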

Common use

The attack is not a simple one – the attacker needs to identify which downscaling algorithm is being used, and then create an image with a hidden prompt that works specifically with that algorithm. A year ago, that might have been a challenge for attackers without strong technical skills. Today, you only need to feed a link to the research into a tool like Xanthorox or an open-source AI agent, and it will handle everything: understanding the paper, generating code, and tailoring the exploit image.

We can take this one step further: instead of just using AI to create the exploit, attackers can feed all the stolen data back into AI systems. With its ability to process massive amounts of information quickly and systematically, AI can extract sensitive details that a human attacker might miss. What once required hours of manual analysis can now be automated, making large-scale data exfiltration and exploitation far more effective.

Xanthorox – AI for Offensive Operations

While mainstream AI assistants like ChatGPT, Gemini, or Claude have strict safety policies intended to block malicious use, Xanthorox is a dark-web AI platform built specifically for offensive security and cybercrime. It offers modules for generating custom exploits, analyzing stolen datasets, automating phishing campaigns, and even performing reconnaissance at scale.

Figure 2: Xanthorox's models (source: Xanthorox)


Attackers can feed it research papers or proof-of-concept code, and Xanthorox will adapt them into fully functional tools. Combined with attacks like hidden prompts in images, this makes AI a powerful force multiplier for cybercriminals, lowering the technical barrier to launching sophisticated attacks.

Figure 3: Xanthorox billing plan, not that different from mainstream AI (source: Xanthorox)


Conclusion

This research demonstrates that even seemingly harmless images can become attack vectors: by embedding malicious prompts that only appear after downscaling, attackers can bypass filters and execute commands invisibly. Today, a growing number of studies highlight prompt injection and AI vulnerabilities, and with the rise of dark AI tools like Xanthorox, even attackers with limited technical expertise can weaponize these findings, turning academic research into real-world offensive tools. They can also leverage AI to sift through massive amounts of stolen data, extracting sensitive insights at a scale and speed no human could match.

Ori Meidan


Ori is a third-year Computer Science student at Bar-Ilan University with a strong background in software development, cyber threat intelligence, and team leadership. She led a military data analysis training program, managing the instruction and development of data analysts, designing technical curricula, and delivering lectures on advanced systems and secure communication. Proficient in full-stack development with React, Node.js, and MongoDB, she built web and mobile applications with real-time features, RESTful APIs, and AI integrations. Ori is driven by a passion for problem-solving and continuous growth and learning.
