CVE is the new PoC


In a previous blog, I wrote about proof-of-concept (PoC) exploits and the risks of publishing them before a patch is available. But what if I told you that a PoC isn’t even necessary today, and that a published Common Vulnerabilities and Exposures (CVE) description alone might be just a few prompts away from a working exploit?

Then & Now

In the old days (like two years ago), turning a CVE into a working exploit was a long and demanding process. You probably needed to be a security expert with a deep understanding of systems and code. You also often needed specialized environments, extensive trial and error, and a significant investment of time and technical skill.

Today, with the rise of large language models (LLMs), especially GPT-4, much of that expertise is no longer necessary. An LLM acting as an agent (not just a chatbot, but one that can interact with websites, run code, analyse results, adapt its strategy, and more) can take a CVE description and a decent prompt and generate a working PoC. What took a hand-built PoC yesterday takes a session with ChatGPT today.
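
To make "acting as an agent" concrete, here is a minimal sketch of the loop such a system runs: the model proposes an action, a tool executes it, and the result is fed back as an observation before the next step. Everything in it is an assumption made for illustration. `scripted_model` is a hard-coded stand-in for a real LLM call, and `lookup_note` is a harmless, made-up tool that reads from a dictionary and never touches a real system.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical, harmless tool: returns a canned note instead of touching any system.
def lookup_note(topic: str) -> str:
    notes = {"status": "service reports version 1.2.3"}
    return notes.get(topic, "no note found")

# Registry of the tools the agent is allowed to call (illustrative only).
TOOLS: Dict[str, Callable[[str], str]] = {"lookup_note": lookup_note}

def scripted_model(history: List[str]) -> Tuple[str, str]:
    """Stand-in for an LLM call: picks the next (action, argument) from the history."""
    if not any(line.startswith("observation:") for line in history):
        return ("lookup_note", "status")            # nothing observed yet: use a tool
    return ("finish", "reported version is 1.2.3")  # enough information: stop

def run_agent(goal: str, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Run the act/observe loop until the model finishes or the step budget runs out."""
    history = [f"goal: {goal}"]
    for _ in range(max_steps):
        action, argument = scripted_model(history)
        history.append(f"action: {action}({argument})")
        if action == "finish":
            return argument, history
        tool = TOOLS.get(action)
        observation = tool(argument) if tool else f"unknown tool {action}"
        history.append(f"observation: {observation}")  # fed back into the next step
    return "step budget exhausted", history

answer, trace = run_agent("find the reported service version")
for line in trace:
    print(line)
```

The point of the sketch is the feedback: each observation changes what the model does next, which is what separates an agent from a single chatbot reply.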

CVE = PoC

In April 2024, a research paper demonstrated that LLM agents can autonomously exploit one-day vulnerabilities.

One-day vulnerabilities are security flaws that have already been publicly disclosed (usually through a CVE) but remain unpatched in many systems. In the paper, the researchers took 15 one-day vulnerabilities and tested two key scenarios:

  1. With a CVE description:
    • Ten different LLMs were tested, and all but one failed completely, including GPT-3.5 and popular open-source models.
    • Only GPT-4 succeeded, and the results were remarkable: it exploited 87% of the vulnerabilities, generating working exploits for 13 of the 15 cases.
      It achieved this using only:
      • The CVE description
      • Access to tools such as a terminal, code execution, and web browsing
      • A prompt that encourages it not to give up and to try different approaches
      • The ReAct agent framework (through the LangChain platform), which lets GPT-4 act as an agent

    This showed that you don’t need sophisticated tools or deep expertise to create an exploit. You just need a CVE and a prompt that encourages GPT-4 not to give up.

  2. Without a CVE description:

    Because GPT-4 was the only model that succeeded with a CVE description, the researchers tested only GPT-4 on discovering and exploiting the vulnerabilities without any prior information, such as the details provided in a CVE. Here the model could exploit only one of the 15 vulnerabilities (7%).

In one sentence: the CVE itself became the exploit blueprint, and GPT-4 was just there to fill in the missing code.
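
The two percentages follow directly from the counts reported above (13 of 15 with a CVE description, 1 of 15 without). A small sketch of that arithmetic is shown here; the helper name `success_rate` is made up for this example and does not come from the paper.

```python
def success_rate(exploited: int, total: int) -> int:
    """Return the share of exploited vulnerabilities as a whole-number percentage."""
    return round(100 * exploited / total)

TOTAL_VULNERABILITIES = 15  # size of the benchmark described above

with_cve = success_rate(13, TOTAL_VULNERABILITIES)    # GPT-4 given the CVE description
without_cve = success_rate(1, TOTAL_VULNERABILITIES)  # GPT-4 with no prior information

print(f"with CVE description:    {with_cve}%")
print(f"without CVE description: {without_cve}%")
```

Put differently, removing the CVE description cut the number of successful exploits from 13 to 1 on the same set of vulnerabilities.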

When CVEs Become Exploits

These new capabilities of LLMs have significant implications for the cybersecurity world.

Now that exploiting vulnerabilities no longer requires advanced programming skills, a new wave of attackers with limited technical knowledge may emerge simply by using the right prompt.

Another critical challenge is the shrinking window between the publication of a CVE and the appearance of a working exploit, which leaves less time to deploy a patch. With LLMs capable of generating working exploits within minutes, vendors and defenders need to move much faster than before. Beyond speed and accessibility, LLM-generated exploits introduce deeper challenges, from blurring ethical boundaries to overwhelming traditional defense tools. In a world where AI can scan, craft, and adapt attacks in real time, defenders must rethink what preparedness really means.
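
On the defender side, one simple way to reason about this window is to track how many days each published CVE has gone unpatched in your environment. The sketch below is only an illustration under stated assumptions: the `Finding` record, the placeholder CVE identifiers, the dates, and the seven-day threshold are all invented for the example and are not taken from any real inventory or policy.

```python
from datetime import date
from typing import List, NamedTuple, Optional

class Finding(NamedTuple):
    cve_id: str              # placeholder identifier, not a real CVE
    published: date          # date the CVE description became public
    patched: Optional[date]  # None means the patch is not deployed yet

def exposure_days(finding: Finding, today: date) -> int:
    """Days between CVE publication and patch deployment (or today if unpatched)."""
    end = finding.patched or today
    return (end - finding.published).days

def overdue(findings: List[Finding], today: date, max_days: int) -> List[Finding]:
    """Unpatched findings exposed longer than max_days, longest exposure first."""
    late = [f for f in findings
            if f.patched is None and exposure_days(f, today) > max_days]
    return sorted(late, key=lambda f: exposure_days(f, today), reverse=True)

# Made-up inventory for illustration.
findings = [
    Finding("CVE-EXAMPLE-0001", date(2025, 1, 10), date(2025, 1, 12)),
    Finding("CVE-EXAMPLE-0002", date(2025, 1, 5), None),
    Finding("CVE-EXAMPLE-0003", date(2025, 1, 18), None),
]
today = date(2025, 1, 20)

late = overdue(findings, today, max_days=7)
for f in late:
    print(f.cve_id, exposure_days(f, today), "days exposed")
```

If exploits really can be produced shortly after publication, the threshold that matters shrinks accordingly, so `max_days` is the number worth arguing about.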

XBOW

XBOW is an autonomous system developed by former Microsoft, GitHub, and OpenAI team members, designed to simulate the work of elite white-hat hackers. Instead of relying on human penetration testers, XBOW starts with a general attack goal (e.g., "find RCE or XSS vulnerabilities") and independently orchestrates the full exploitation process. It combines LLMs with symbolic reasoning and decision trees to scan for attack surfaces, generate payloads, test them, refine the approach, and validate successful exploitation, all without human involvement. What makes it even more impressive is its ability to adapt in real time: changing tactics, evading defenses like EDR, and choosing alternative paths when needed.

HackerOne is a leading bug bounty platform where ethical hackers report security vulnerabilities to companies in exchange for rewards. On August 1, 2025, for the first time ever, a non-human topped HackerOne’s leaderboard. It was XBOW. Reports stated that XBOW found over 1,000 vulnerabilities. While they didn’t specify how many were zero-day or one-day vulnerabilities, other sources confirm that XBOW is capable of identifying true zero-days. Its false positive rate is also reported to be extremely low, meaning the system is highly accurate.

Conclusion

The boundaries between CVE and PoC are starting to blur. When a detailed vulnerability description can lead directly to a working exploit that’s generated in minutes by a language model, we may need to rethink what responsible disclosure really means. Transparency is still a core value in cybersecurity, but in the age of autonomous AI agents, the cost of that transparency may be higher than we thought. If GPT-4 can already do this, then what should we expect now that GPT‑5 is here? (Hint: It may not even need a description to build an exploit!) The line between "vulnerability discovered" and "vulnerability exploited" could disappear entirely and be replaced by a few seconds of processing and a prompt.

Ori Meidan


Ori is a third-year Computer Science student at Bar-Ilan University with a strong background in software development, cyber threat intelligence, and team leadership. She led a military data analysis training program, managing the instruction and development of data analysts, designing technical curricula, and delivering lectures on advanced systems and secure communication. Proficient in full-stack development with React, Node.js, and MongoDB, she built web and mobile applications with real-time features, RESTful APIs, and AI integrations. Ori is driven by a passion for problem-solving and continuous growth and learning.
