AI found the bugs faster than we can patch them

The story Anthropic told on May 22 has a number in it that should be the headline, and instead the number was the part that got reported as good news. Project Glasswing’s first month, run by about fifty partners using Claude Mythos Preview, surfaced more than ten thousand high- or critical-severity vulnerabilities in the software that everything else runs on. The framing was that AI just closed the security gap. The data, read carefully, is that AI just opened a different one.

”AI closes the security gap” was always two steps

The pitch since the security industry started running AI against codebases was always one step described as if it were the whole thing. Find a vulnerability. The find half is the part that compresses under machine speed — a model that can read every line of every important open-source project in a month is a model that can find every bug whose shape it recognizes, and a 90.6% validation rate (Anthropic’s number) means most of what it finds is real. But the find half was never the only half. Find. Verify. Disclose. Patch. Four steps, only one of which is now running at AI cadence.

The same Glasswing update, in Anthropic’s own words, calls this out: “Progress on software security used to be limited by how quickly we could find new vulnerabilities. Now it’s limited by how quickly we can verify, disclose, and patch them.” That sentence, read the way the people writing it meant, is a concession. The shape of the security gap did not change; the place it sits did. The bottleneck moved upstream a quarter step and stopped — and the entire downstream pipeline is sized for human cadence.

I keep thinking about the maintainers. Anthropic notes that some of them, faced with the volume of findings being routed at them, asked the program to slow disclosure down. The average reported time per high- or critical-severity bug is two weeks of human work. Multiply that out for one of the larger projects in the partner list and you get a queue measured in years, against an inflow that the AI side could refill indefinitely. The honest read is that AI is producing security findings at a rate that overwhelms the maintainers expected to fix them, and asking maintainers to fix faster is asking a person to keep pace with a machine.

Same shape as the bottleneck before

There is a precedent for this exact shape in our own work. Code review as the bottleneck made the point that once writing code got cheap — because AI was doing more of it — the limit on shipping became the review step that had not changed. Reviewers became the rate-limiter, then the burnout point, then the place where the system quietly broke. The Glasswing data is the same shape applied to security. Once finding got cheap, the limit became patching, and the patch step has not changed. Maintainers are the new reviewers — the un-automated link in a chain whose other links are now moving at a different speed entirely.

There is a second precedent, closer to home. The AI-coded supply-chain CVE wave covered in agent supply chain CVE wave put AI on the production side of the security gap, seeding flawed code into the open-source ecosystem at scale. Glasswing puts AI on the discovery side. So both sides of the exposure-disclosure-patch loop have something running at machine speed, and the middle — human review, maintainer triage, the patch — is being asked to absorb both. That is not the configuration the loop was designed for.

The steelman, which is real

I do not think Glasswing is a bad program. The 90.6% validation rate is genuinely high — the AI is not hallucinating thousands of bugs and dumping them on maintainers; it is finding mostly real ones, and the partner organizations are doing serious triage before anything reaches an open-source project. The named cases are real, too: Cloudflare’s 2,000 bugs include 400 high- or critical-severity issues against their critical-path systems, and Mozilla fixed 271 vulnerabilities in Firefox 150 during Mythos Preview testing. These are vulnerabilities that would have existed without Glasswing and gone unfixed; finding them is a clear public good. The hundred million dollars in Mythos Preview credits and four million in direct OSS donations are also real, and they are an honest acknowledgment from Anthropic that the downstream cost of all this finding has to be paid by someone. The argument is not that AI security scanning is bad. The argument is that the way it was sold — as the thing that closes the gap — described half a loop and assumed the other half scaled too.

The asymmetry is the news

So the part of the Glasswing story that should change how I read every subsequent AI-security announcement is the asymmetry. The find step compressed by something like ten or twenty times. The patch step compressed by zero. A ten-times asymmetry inside a two-step pipeline is, in practice, the same as not solving the problem at all — because the bottleneck just moves to the un-compressed step and waits for you there. The bug-finding rate is now a function of compute spend; the patching rate is still a function of how many maintainers a project can pay or convince to volunteer their evenings. Those two curves do not meet. They diverge, and every additional dollar of Glasswing credit widens the divergence.

The honest version of “AI closes the security gap” goes like this: AI compresses the find half of the security pipeline, and the patch half stays where it was. The gap does not close. It slides — from “we did not know about the bug” to “we know about the bug and have not patched it yet” — and “yet” is the word doing the load-bearing work, because some of those yets are going to be measured in years.

What I changed

I stopped reading “AI finds bugs” as good news on its own. The question I ask of any AI-security announcement now is which step in the loop it compressed and whether the next step downstream has the capacity to absorb the new throughput. Glasswing did the find step and did it well. The next program that earns the framing it got would be one that compressed verify, or disclose, or patch — and so far nobody is selling that, because verifying a CVE is hard work and patching it is harder, and neither compresses the way reading code does. Until one of those steps catches up, the headlines about AI finding ten thousand vulnerabilities in a month should read the way they read inside the maintainer mailing lists: as a wave to brace for, not a wave that fixed anything. The release that triggered this whole asymmetry is Anthropic’s Glasswing finds 10,000 critical bugs in a month; the same week’s news quietly admits something else, in a different domain, in the bill finally itemized.