Project Glasswing: What Happens When a Model Is Too Good at Finding Bugs
Two weeks ago, Nicholas Carlini showed that Claude Code could find a 23-year-old Linux kernel vulnerability using a 10-line bash script. Today, Anthropic announced what happened when it scaled that approach: a new model called Claude Mythos that found thousands of high- and critical-severity zero-day vulnerabilities — and a decision not to release it publicly.1
Project Glasswing is Anthropic’s answer to the question practitioners have been asking since Carlini’s [un]prompted talk: what happens when this capability is deployed at scale? The answer: you restrict it.
TL;DR
Claude Mythos Preview is a new frontier model, more capable than Opus 4.6, whose cybersecurity capabilities, according to Anthropic, “emerged as a downstream consequence of general improvements in code, reasoning, and autonomy.”1 Anthropic is restricting access to 12 partner organizations (Apple, Amazon, Microsoft, Google, Linux Foundation, and others) for defensive security work only. The model found thousands of zero-days, including a 27-year-old OpenBSD TCP SACK bug, a 16-year-old FFmpeg vulnerability, and a FreeBSD NFS RCE (CVE-2026-4747).1 Anthropic committed $100M in usage credits and $4M to open-source security organizations. A future Cyber Verification Program will eventually provide access for legitimate security professionals.1
Key Takeaways
- Security engineers: The capability threshold that Carlini demonstrated at [un]prompted is real, and it scales. Mythos found vulnerabilities in “every major operating system and web browser.”2 Defensive security teams at the 12 partner organizations now have access. Everyone else should prepare for the day these capabilities reach generally available models.
- Harness builders: Mythos runs via Claude Code in isolated containers.1 The harness pattern — agent CLI plus sandboxed execution plus automated triage — is now the production architecture for frontier security research at Anthropic itself, validating at the highest level the patterns practitioners have been building independently.
- Everyone else: Anthropic chose restriction over release. That is a real governance decision with real tradeoffs. The model exists. The capabilities are demonstrated. The question is no longer whether AI can find zero-days — it is who gets access and under what constraints.
From Talk to Product
Carlini’s [un]prompted talk in early April was the public preview.3 He showed five Linux kernel vulnerabilities and 22 Firefox CVEs found with a simple file-iteration script. The bottleneck, he said, was human validation — “several hundred crashes I haven’t validated yet.”
Mythos is what happens when you remove that bottleneck with a more capable model and dedicated infrastructure. The scale difference is significant:1
| Metric | Carlini’s talk | Project Glasswing |
|---|---|---|
| Vulnerabilities found | 5 kernel + 22 Firefox CVEs | Thousands across all major platforms |
| Targets | Linux kernel, Firefox | Every major OS, browser, open-source project |
| Validation | Manual, researcher-driven | Professional security contractors, 89% severity confirmation |
| Access | Opus 4.6 (generally available) | Mythos Preview (restricted to 12 partners) |
The professional validation number matters: 89% of 198 reviewed reports had severity assessments confirmed by independent security contractors, with 98% within one severity level.1 These are not hallucinated findings.
The Restriction Decision
Anthropic’s stated position: “We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities.”4
This is unusual. Model companies typically race to ship capabilities. Anthropic built a model that is demonstrably better at finding vulnerabilities than any publicly available system — then chose to restrict it to defensive use by vetted partners. The $100M commitment in usage credits signals this is not a marketing exercise.1
The restriction model has three tiers:1

1. Project Glasswing partners (12 organizations): direct access for defensive security
2. Broader access (40 organizations total): supervised deployment
3. Future Cyber Verification Program: planned access for verified security professionals
For practitioners, this means the strongest vulnerability-finding capabilities are not available through the standard API or Claude Code. Opus 4.6 remains the strongest generally available model. But the capabilities demonstrated by Mythos will likely influence future Opus releases — Anthropic’s announcement explicitly says they aim to “enable safer deployment through new safeguards in future Claude Opus models.”1
What This Validates
Project Glasswing validates several patterns that the practitioner community has been building independently:
Claude Code as the execution harness. Mythos runs via Claude Code in isolated containers.1 The same agent CLI that practitioners use for daily coding is the execution layer for frontier security research. The hooks, skills, and sandboxing that Claude Code provides are not convenience features — they are the infrastructure that makes autonomous security scanning safe enough to deploy.
The verification bottleneck is a harness problem. Carlini’s talk identified human validation as the bottleneck. Project Glasswing’s solution: professional security contractors for validation, SHA-3 hash commitments for responsible disclosure, and structured triage infrastructure.1 This is the same triage problem we identified in When Your Agent Finds a Vulnerability — and the solution is infrastructure, not model capability.
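The SHA-3 hash commitments mentioned above follow a standard publish-then-reveal pattern: publish a digest of the finding at discovery time, reveal the full report after coordinated disclosure, and anyone can verify the researcher knew the details all along. A minimal sketch of the idea in Python — the function names and workflow are illustrative, not Anthropic’s actual tooling:

```python
import hashlib

def commit(report: str, nonce: str) -> str:
    """Publish only this digest at discovery time; keep report + nonce private."""
    return hashlib.sha3_256((nonce + report).encode()).hexdigest()

def verify(report: str, nonce: str, commitment: str) -> bool:
    """After disclosure, anyone can check the revealed report matches the digest."""
    return commit(report, nonce) == commitment

# Discovery: publish the commitment, nothing else.
c = commit("heap overflow in parser, reachable from network input", nonce="8f2c91")
# Post-disclosure: reveal report and nonce; the published digest checks out.
assert verify("heap overflow in parser, reachable from network input", "8f2c91", c)
```

The random nonce matters: without it, anyone could brute-force short or guessable reports against the published digest.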
Governance hooks matter more than scanning capability. The model can find the vulnerabilities. The hard problem is controlling disclosure, managing access, and ensuring findings reach defenders before attackers. Anthropic’s answer is organizational (restrict the model, vet the partners, commit resources). For practitioners building their own security scanning, the governance hooks that gate output are the equivalent.
What This Means for Practitioners
You are not getting Mythos access. Here is what you can do with what you have:
Opus 4.6 is already capable. Carlini’s [un]prompted results — 5 kernel bugs, 22 Firefox CVEs — used Opus 4.6, not Mythos.3 The capture-the-flag methodology, ASAN-instrumented builds, and file-iteration script are all reproducible with the generally available model.
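The file-iteration approach is simple to approximate with generally available tooling: walk a source tree, hand each file to the agent with a short audit prompt, and save whatever comes back for triage. A rough sketch, assuming Claude Code’s non-interactive `claude -p` print mode (the prompt wording and output layout are my own, and any real run belongs inside a sandboxed container):

```python
import subprocess
from pathlib import Path

PROMPT = "Audit this C file for memory-safety bugs. Report only findings you can justify."

def source_files(root: str, suffixes=(".c", ".h")) -> list:
    """Collect candidate files, smallest first, so cheap targets get scanned early."""
    files = [p for p in Path(root).rglob("*") if p.is_file() and p.suffix in suffixes]
    return sorted(files, key=lambda p: p.stat().st_size)

def scan(root: str) -> None:
    """One non-interactive agent turn per file; raw output saved for later triage."""
    out_dir = Path("findings")
    out_dir.mkdir(exist_ok=True)
    for path in source_files(root):
        result = subprocess.run(
            ["claude", "-p", f"{PROMPT}\n\nFile contents:\n{path.read_text()}"],
            capture_output=True, text=True,
        )
        (out_dir / f"{path.name}.txt").write_text(result.stdout)
```

Sorting by size is a cost heuristic, not part of Carlini’s published method; the point is that the loop itself is trivial — everything hard lives downstream, in validation.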
Build the triage layer now. When future Opus models inherit some of Mythos’s capabilities (as Anthropic has implied), the bottleneck will be the same one Carlini identified: human validation. The teams that have automated deduplication, severity classification, and disclosure workflows ready will benefit first.
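The triage layer is plain engineering you can build today. A sketch of the two pieces named above — deduplication and severity bucketing — keyed on a crash’s top stack frames, a common fuzzing heuristic; the classification rules here are illustrative placeholders, not Anthropic’s:

```python
import hashlib

def crash_signature(stack_frames: list, depth: int = 3) -> str:
    """Crashes sharing their top frames are very likely the same underlying bug."""
    key = "|".join(stack_frames[:depth])
    return hashlib.sha256(key.encode()).hexdigest()[:16]

def classify(finding: dict) -> str:
    """Toy severity rules: write primitives outrank info leaks, which outrank DoS."""
    if finding.get("write_primitive"):
        return "critical"
    if finding.get("info_leak"):
        return "high"
    return "medium"

def triage(findings: list) -> dict:
    """Keep one representative finding per signature, tagged with a severity bucket."""
    unique = {}
    for f in findings:
        sig = crash_signature(f["stack"])
        unique.setdefault(sig, {**f, "severity": classify(f)})
    return unique
```

Even this toy version solves the problem Carlini described: several hundred raw crashes collapse into a short, ranked list a human can actually review.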
Watch the Cyber Verification Program. Anthropic plans to extend Mythos access to verified security professionals. If you do legitimate security research, this is worth tracking.
The trajectory is clear: AI-assisted vulnerability discovery is real, it scales, and the governance question is now the central problem. The model capability is solved. The harness that orchestrates discovery, triage, and responsible disclosure is not.
Frequently Asked Questions
Can I use Claude Mythos through Claude Code?
No. Mythos Preview is restricted to Project Glasswing partners. Opus 4.6 remains the strongest model available through Claude Code for general users.
Will Mythos capabilities come to Opus?
Anthropic’s announcement says they aim to “enable safer deployment through new safeguards in future Claude Opus models.” This suggests some capabilities will eventually reach generally available models, but with additional safety constraints.
How does this relate to the earlier vulnerability blog post?
Carlini’s [un]prompted talk (covered in When Your Agent Finds a Vulnerability) used Opus 4.6 and found 5 kernel bugs + 22 Firefox CVEs. Mythos scaled that approach to thousands of vulnerabilities across all major platforms. The methodology is the same; the model is more capable.
Sources

1. Claude Mythos Preview — Project Glasswing. Anthropic, April 7, 2026. Official announcement: thousands of high- and critical-severity zero-days found; 89% severity confirmation rate by professional validators; $100M in usage credits. Led by Nicholas Carlini with 21+ co-authors.
2. Anthropic’s Project Glasswing. Simon Willison, April 7, 2026. Analysis and context on the restricted release model and Carlini’s earlier work.
3. Nicholas Carlini, “Black-hat LLMs,” [un]prompted AI security conference, April 2026. Conference agenda. See also: AI Finds Vulns You Can’t, Security Cryptography Whatever podcast.
4. Anthropic says its most powerful AI cyber model is too dangerous to release publicly. VentureBeat, April 7, 2026.