Anthropic has an AI that finds vulnerabilities in Windows, Linux, and macOS: here's why it was given to Big Tech first
Tuesday, April 7
Anthropic has done something the tech world has never seen before, or at least not in this form: it announced that it has a powerful AI model that can find security vulnerabilities in software and, instead of releasing it to the public, has chosen to give it to a consortium of companies (including some direct competitors) to use for their own defense. The model is called Claude Mythos Preview, the initiative is named Project Glasswing, and the partners involved are the ones you would expect if someone were setting up a global cybersecurity crisis table: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, along with around forty other organizations that build or maintain critical software infrastructure.
Anthropic has committed up to $100 million in usage credits for Mythos Preview and $4 million in direct donations to open-source security organizations. For a company that has yet to go public (an IPO expected, according to VentureBeat, in October 2026), this is a significant financial commitment.
What Mythos Preview Really Does
Let’s start with the facts, which in this case are more interesting than any press release. Mythos Preview is a general-purpose model: it has not been specifically trained for cybersecurity. Its capabilities in that field are a side effect of improvements in coding, reasoning, and operational autonomy, what the jargon calls an "emergent capability," and here the term takes on a rather unsettling concreteness.
The results, documented in the technical post of the Frontier Red Team, are the kind that make you want to check if you've updated your operating system: thousands of zero-day vulnerabilities (flaws no one knew about before) in every major operating system and in every major web browser. Many of these vulnerabilities had been there for ten, fifteen, and in one case even twenty-seven years. The most emblematic case concerns OpenBSD, an operating system whose main feature is precisely security: Mythos Preview found a bug in the TCP SACK protocol implementation that has been present since 1998, capable of crashing any reachable OpenBSD server.
Anthropic researcher Nicholas Carlini stated that he found more bugs in the last two weeks than in the rest of his career, and considering that Carlini is one of the most respected names in AI security research, that gives a sense of the scale of the phenomenon. In one case, Mythos Preview wrote a browser exploit that chained four different vulnerabilities together, constructing an attack chain that bypassed every protection layer of both the browser and the underlying operating system. In another, it autonomously built a remote-code-execution exploit on FreeBSD (an open-source operating system widely used for servers and network infrastructure) that granted root access to unauthenticated users, exploiting a 17-year-old bug.
The Generational Leap, in Numbers
For reference: when Claude Opus 4.6, Anthropic's most advanced public model, tried to turn vulnerabilities found in Firefox into working exploits, the success rate was near zero (two out of hundreds of attempts). With Mythos Preview, on the same test, there were 181 successful exploits. In tests on open-source repositories, Opus 4.6 achieved one severe crash case; Mythos Preview achieved ten with complete control of the execution flow, the maximum level. For N-day vulnerabilities (known vulnerabilities that have not yet been patched across multiple systems), the model took a list of 100 known vulnerabilities in the Linux kernel (the so-called CVEs, public records that the security community uses to catalog every discovered flaw) and successfully exploited more than half, at a cost of a few thousand dollars per exploit and in under a day each.
And here comes the detail that is more thought-provoking than any table: Anthropic engineers without training in cybersecurity asked Mythos Preview to find remote execution vulnerabilities before going to sleep, and the next morning they found a complete and functioning exploit. The AI worked all night, alone, reading source code, formulating hypotheses, testing them, using debuggers, and ultimately producing a report with a practical demonstration of the attack and reproduction instructions.
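Structurally, the overnight workflow described above is a plain control loop: propose a hypothesis, run an experiment, keep what is confirmed, adjust and retry otherwise. A minimal sketch in Python, with every model call and debugging tool replaced by hypothetical stubs (the function names and toy data are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class Attempt:
    hypothesis: str
    confirmed: bool

def investigate(hypotheses, run_experiment, max_steps=10):
    """Hypothesize-test-adjust loop: try hypotheses in turn, log every
    experiment, and stop at the first one the evidence confirms."""
    log = []
    for hypothesis in hypotheses[:max_steps]:
        confirmed = run_experiment(hypothesis)
        log.append(Attempt(hypothesis, confirmed))
        if confirmed:
            return hypothesis, log  # confirmed: write up the report
    return None, log  # budget exhausted with nothing confirmed

# Toy stand-ins: the "experiment" confirms only the third hypothesis.
candidates = ["off-by-one in the parser",
              "unchecked length field",
              "integer overflow in the allocator"]
finding, log = investigate(candidates,
                           run_experiment=lambda h: "overflow" in h)
```

In the real system, both the hypothesis generator and the experiment are the model itself driving debuggers and test harnesses; only the control flow, not the intelligence, is captured here.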
What This Tells Us About the Future of AI
If you are involved in AI even a little, what Mythos Preview does in cybersecurity should make you think about something broader. We are talking about an AI agent that operates for hours without supervision, reads complex code, formulates theories, verifies them experimentally, adapts its strategy when it fails, chains multiple steps into sophisticated logical sequences, and produces usable output without human intervention. The fact that it does this in the domain of cybersecurity is secondary to the fact that it does it at all.
For those following the debate on agentic AI, Mythos Preview is probably the most concrete demonstration we have seen so far of what a truly autonomous agent capable of solving complex real-world problems looks like. Academic benchmarks and curated demos have always left a reasonable margin of skepticism; here we are talking about confirmed vulnerabilities, verified exploits, patches released by maintainers of major operating systems. The cycle in which the model formulates a hypothesis, tests, adjusts, and retries is real and measurable, and the results have already been incorporated into the software running on your computers.
And it is natural, looking at what Mythos Preview does, to ask a bigger question. AGI, or artificial general intelligence, is that hypothetical milestone in which an AI would be able to tackle any intellectual task that a human can perform, not just in a specific domain but across the board: reasoning, planning, adapting to new problems, learning from experience. It is the point where AI would stop being a specialized tool and become something qualitatively different, with massive implications for the economy, science, geopolitics, and, more broadly, for the place of humans in the world.
No one knows for sure when we will get there, and some argue that we are still far off. But what Mythos Preview does is exactly the type of behavior that one would expect from a system approaching that threshold: working for hours on problems that require abstract reasoning, creativity in combining different techniques, deep understanding of complex systems, and the ability to adapt strategy when things don’t work on the first try. It does this in one domain, cybersecurity, and this keeps it officially away from the definition of AGI. But the fact that these capabilities have emerged as a side effect of general improvements in reasoning and programming, without specific training, is the detail that should make us reflect the most.
If capabilities emerge like this, in which other domains are they emerging without anyone measuring them yet? This brings us directly to vibe coding and the entire ecosystem of AI-generated code that is growing at an impressive speed. If an AI model is capable of finding and exploiting vulnerabilities in code written by human professionals with decades of experience and continuously checked by dedicated security teams, it is natural to ask what will happen with code produced en masse by the models themselves.
Vibe coding generates functioning software at a speed that was unthinkable just a year ago, but the security quality of that code is a huge blind spot. The same type of model that writes the code is also the one that can break it, and if Mythos Preview finds bugs in OpenBSD, imagine what it would find in an app written in an afternoon with a prompt and a couple of iterations.
However, there is another side to the story. A model with Mythos Preview's capabilities could also write better code than we do, in the sense of more secure code. If the model understands vulnerability classes well enough to find them in code written by expert humans, it could also avoid them when generating new code, producing software that is more robust from the start. Anthropic has not announced anything in this direction yet, but the logic is quite transparent: the same model that finds a 16-year-old bug in FFmpeg is also the one that, writing code from scratch, would not make that kind of error. The point is that today we still lack the tools to systematically verify that this happens, and until then, vibe coding remains a practice where production speed far outpaces our ability to control it.
Mythos Preview could be both the problem and the solution, depending on how it will be used.
The Context and Some Uncomfortable Questions
As VentureBeat noted, the timing deserves some reflection: the announcement came on the same day that Anthropic communicated its revenue milestone and the deal with Broadcom and Google for about 3.5 gigawatts of computing capacity. A high-profile national security initiative with blue-chip partners is exactly the type of program that reinforces an IPO narrative.
Gizmodo pointed out, in its characteristically blunt tone, that only a few weeks ago Anthropic was keeping Mythos hidden because it was too dangerous, and that the leap from locking it away to distributing it across critical infrastructure deserves an explanation. Anthropic's response is that the alternative is worse: if models with similar capabilities become available to less responsible actors within a few months, it is better to give defenders a head start. Over 99% of the vulnerabilities found have not yet been fixed, and to prove it is not bluffing without revealing details that would put millions of systems at risk, the company published a sort of "cryptographic receipt" for each exploit: a mathematical code that will let anyone verify in the future that Anthropic really did have those exploits in hand at the announcement date, without being able to read their content. An ingenious mechanism, and also a signal of how serious the matter is.
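A "cryptographic receipt" of this kind is typically a hash commitment: publish a digest now, reveal the input later. The article does not specify Anthropic's exact construction, so this is only a generic sketch in Python, using SHA-256 with a random nonce so that short or guessable payloads cannot be brute-forced from the digest:

```python
import hashlib
import secrets

def commit(exploit_bytes: bytes) -> tuple[str, bytes]:
    """Publish the digest now; keep the nonce and the exploit secret."""
    nonce = secrets.token_bytes(32)  # blocks brute-forcing guessable payloads
    digest = hashlib.sha256(nonce + exploit_bytes).hexdigest()
    return digest, nonce

def verify(published_digest: str, nonce: bytes, exploit_bytes: bytes) -> bool:
    """Later, anyone can check the revealed exploit matches the old digest."""
    return hashlib.sha256(nonce + exploit_bytes).hexdigest() == published_digest

digest, nonce = commit(b"proof-of-concept exploit details")
```

The digest proves possession at publication time without leaking anything; revealing the nonce and the exploit later lets any third party recompute the hash and confirm the claim.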
Then there is the question of the relationship with the Department of Defense. As reported by TechCrunch, Anthropic and the Trump administration are in a legal battle after the Pentagon classified the company as a supply chain risk due to its refusal to allow the use of Claude for autonomous targeting. Announcing a national security initiative at this moment has a flavor that goes beyond cybersecurity.
What Changes for Software Developers
Anthropic's practical message to defenders is quite clear: patch cycles must drastically shorten. If an AI model can take the public record of a known vulnerability and turn it into a working exploit in just a few hours without human intervention, the window between the publication of a flaw and its active exploitation shrinks to almost nothing. And this applies to everyone, not just those managing critical infrastructure.
Anthropic also suggests starting to use models already available for vulnerability research, without waiting for models of the Mythos class. Opus 4.6 had already found hundreds of critical vulnerabilities wherever it looked. The point is that the improvement curve is steep, and those who have not yet incorporated these tools into their security processes are already behind. For maintainers of legacy software, the advice is even more urgent: prepare contingency plans for critical vulnerabilities in code that no longer receives support, because AI models do not distinguish between modern and old software.
Researcher Simon Willison commented that Project Glasswing seems like a necessary move, and it would also be useful to involve OpenAI. He also added, and it's hard to fault him on this, that this story sounds like an industrial reckoning, one of those that require huge investments to stay ahead of an inevitable wave of vulnerabilities.
For those following this column, the point I want to emphasize is that we are facing a case in which AI is shifting the balance between attack and defense in a field where that balance had stayed largely stable for twenty years. Anthropic draws a historical comparison with the first software fuzzers of the 2000s, automated tools that bombard a program with random inputs to see where it breaks, and which, when they appeared, scared everyone because it seemed they would hand attackers a huge advantage. They did, for a while. But today fuzzers are a fundamental component of the security ecosystem, and Anthropic is betting the same will happen with language models. The problem is that the transition period, when attackers could have the advantage, is now. And in the meantime, the amount of AI-generated code circulating in the world grows every day, with security standards that no one is really verifying. Mythos Preview is proof that these models can be extraordinary defense tools, but also a rather stark reminder of how quickly the ground is shifting beneath everyone's feet.
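The fuzzing idea mentioned above fits in a few lines: hammer a target with random inputs and record any crash that is not a documented error. A toy sketch in Python, with a deliberately planted bug standing in for a real vulnerability (the parser and its flaw are invented for illustration):

```python
import random

def parse_header(data: bytes) -> int:
    """Toy parser with a planted bug: it never checks for a zero length field."""
    if len(data) < 2:
        raise ValueError("too short")  # documented failure mode
    length = data[0]
    return data[1] // length  # ZeroDivisionError when length == 0: the bug

def fuzz(target, trials=10_000, seed=42):
    """Throw random byte strings at the target; collect undocumented crashes."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 8)))
        try:
            target(data)
        except ValueError:
            pass  # expected, documented rejection
        except Exception as exc:
            crashes.append((data, exc))  # unexpected crash: a bug to triage
    return crashes

found = fuzz(parse_header)
```

Real fuzzers like AFL or libFuzzer add coverage feedback and input mutation on top of this loop, but the core idea, random inputs plus crash detection, is exactly this simple, which is why the 2000s comparison is apt.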