The Vulnerability Report Is Dead. Long Live the Prompt!

mikael.barbero@eclipse-foundation.org (Mikaël Barbero) — Fri, 19 Jun 2026 11:30:00 +0200

For years, maintainers have asked security reporters for a fairly reasonable thing: reproduction steps. Not a vibe. Not a screenshot from a scanner. Not a majestic wall of speculative prose explaining how “an attacker could maybe possibly exploit this under unspecified conditions.” Just the steps. What did you do? What happened? What should have happened instead? Can I reproduce it before I spend my Saturday afternoon spelunking through a dependency graph held together by hope and YAML?

This was, and remains, a sensible standard. But the environment around vulnerability reporting is changing quickly. With frontier models, automated agents, and increasingly capable security tooling, the volume and shape of reports sent to project maintainers are starting to look different. We are moving from “a human found a bug and wrote it up” to “a system explored a codebase, generated hypotheses, tested some of them, and produced a polished-looking report.”

That shift matters because, in this new world, the final report is no longer the primary artifact. It is often just the agent’s summary. Sometimes it is useful. Sometimes it is verbose. Sometimes it is impressively confident in exactly the way one prefers vulnerability reports not to be.

The core artifact should now be something else: the prompt.

Reproduction Steps Were for a Different Reporting Economy

Requesting reproduction steps made sense when the reporter’s work was largely manual. The report was the compressed form of their investigation. It told the maintainer how to follow the same path and verify the finding.

But with AI-assisted discovery, the path is no longer just a sequence of shell commands, HTTP requests, or UI clicks. The path includes the instructions given to the model, the context it was shown, the amount of code it could inspect, the tools it was allowed to use, the number of attempts it made, and the assumptions baked into the exploration.

A report that says “the agent found a SQL injection vulnerability” is not enough. Neither is a beautifully formatted five-page explanation that looks as if it was ghostwritten by an overcaffeinated compliance department. The question maintainers need answered is not merely “what did the agent conclude?” It is “how did the agent get there?”

The Prompt Is the New Reproduction Step

If a vulnerability was discovered by an AI agent, the prompt is not incidental. It is part of the evidence. The prompt tells maintainers what the agent was trying to do. It reveals whether the model was asked to look for real exploitable vulnerabilities, theoretical weaknesses, dependency risks, configuration issues, or “anything that sounds scary enough to submit to a bug bounty program.”

That distinction is not academic. Maintainers need to know whether a report came from a focused investigation or from a broad instruction to produce security-looking output until something plausible emerged. In practice, an AI-assisted vulnerability report should include:

the prompt or instructions used to initiate the analysis;
the model name and version, where available;
the amount of context provided to the model;
the relevant files, repository state, commit hash, or package version examined;
the tools the agent was allowed to use;
whether the finding was verified independently of the model’s reasoning;
how much effort was spent, including the number of runs, retries, or exploration steps;
the generated report, preferably treated as a summary rather than sacred scripture.

This is not bureaucracy for its own sake. It is provenance. And provenance is what separates a useful report from a machine-generated guessing contest with CVE-shaped ambitions.
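As a rough illustration, the items listed above could travel alongside a report as a small structured record. The sketch below is only one possible shape: the class name, field names, and example values (including the model name and commit hash) are hypothetical and not part of any existing disclosure standard.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class ReportProvenance:
    """Hypothetical provenance record for an AI-assisted vulnerability report."""

    prompt: str                    # instructions used to initiate the analysis
    model: str                     # model name and version, where available
    context_supplied: str          # what the model was shown, and how much
    commit: str                    # repository state or package version examined
    tools_allowed: list[str] = field(default_factory=list)
    independently_verified: bool = False   # verified outside the model's reasoning?
    runs: int = 1                  # number of runs, retries, or exploration steps
    generated_report: str = ""     # the agent's summary, kept as supporting material

    def to_json(self) -> str:
        """Serialize the record so it can be attached to a disclosure."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Placeholder values for illustration only.
example = ReportProvenance(
    prompt="Look for exploitable SQL injection in the request handlers.",
    model="example-model-1.0",
    context_supplied="src/handlers/ only, no tests or docs",
    commit="0000000",
    tools_allowed=["read_file", "grep"],
    runs=3,
)

print(example.to_json())
```

Defaulting `independently_verified` to `False` is a deliberate choice in this sketch: a reporter has to assert verification explicitly rather than have it assumed.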

The Full Generated Report Is Metadata

This may sound counterintuitive, especially because AI-generated reports often look complete. They have headings. They have impact statements. They have remediation advice. They may even include a proof of concept that appears convincing at first glance. That is precisely the problem.

The report is the most polished artifact, but not necessarily the most informative one. It is the model’s final narrative. Like all narratives, it can omit uncertainty, smooth over failed attempts, and present inference as fact. Frontier models are very good at producing reports that look finished. Unfortunately, maintainers are usually not looking for literary closure. They are looking for technical truth.

The generated report should therefore be considered supporting material. Useful, but secondary.

The prompt and execution context are closer to the real source material. They explain the conditions under which the finding emerged. They help maintainers assess whether the issue is reproducible, whether the agent was operating with enough context, and whether the conclusion was reached through actual validation or through what we might politely call “statistical enthusiasm.”

This Matters Especially for Open Source

Open source maintainers already operate under severe attention constraints. Many projects receive bug reports, feature requests, support questions, dependency update noise, vulnerability disclosures, and drive-by “security research” from people who have never read the contribution guidelines but have strong feelings about severity labels.

Adding AI-generated reports to that queue can be helpful, but only if the reports are structured to reduce maintainer burden.

A good AI-assisted vulnerability report should make triage easier. It should not require maintainers to reverse-engineer the agent’s reasoning, guess which repository snapshot was analyzed, or determine whether the reporter has personally verified anything beyond the existence of a “Submit” button.

Without provenance, AI-generated vulnerability reports risk becoming a denial-of-service attack against maintainer attention. Not maliciously, necessarily. Just at scale. Which is, of course, everyone’s favorite way for a process problem to become a security problem.

A Better Norm for AI-Assisted Reports

We should update the norm. For human-discovered vulnerabilities, reproduction steps remain essential. For AI-assisted findings, reproduction steps are still useful, but they are no longer sufficient. The report should also include the prompt, model details, context window size or supplied context, repository version, tool access, validation method, and effort level.

In other words: show your work. Not just the model’s final answer.
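On the maintainer side, a norm like this could be backed by a simple intake check that lists what a submission left out. This is a minimal sketch under stated assumptions: the field names and the dictionary-shaped submission are hypothetical, chosen only to mirror the items named above.

```python
# Hypothetical field names mirroring the proposed norm; not an existing schema.
REQUIRED_FIELDS = (
    "prompt",
    "model",
    "context",
    "repository_version",
    "tool_access",
    "validation_method",
    "effort",
)


def missing_provenance(report: dict) -> list[str]:
    """Return the required fields that are absent, None, or blank strings."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = report.get(name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(name)
    return missing


# A submission that only states the prompt and the model.
submission = {
    "prompt": "Audit the login handler for SQL injection.",
    "model": "example-model-1.0",
    "repository_version": " ",
}

missing = missing_provenance(submission)
print("Please also provide:", ", ".join(missing))
```

A check like this says nothing about whether the finding is real. It only tells the reporter, before a human spends time on triage, which parts of the work have not been shown.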

Security reporting has always depended on trust, but trust does not mean accepting a confident paragraph at face value. It means providing enough information for the recipient to understand, verify, and act. As AI agents become more common in vulnerability discovery, maintainers should feel comfortable asking reporters for the prompt and execution context. Reporters should treat that information as a normal part of disclosure, not as an implementation detail or proprietary magic trick.

The future of vulnerability reporting should not be a flood of immaculate-looking machine-generated PDFs sent to already overworked maintainers.