First findings from Project Glasswing and what they mean for AI security
AI is rapidly changing how security teams hunt for vulnerabilities, but it isn’t rewriting the rules from scratch. Early findings from Anthropic’s Project Glasswing — and its security-focused model Mythos — show that classic security principles still matter just as much as cutting-edge AI.
What are Project Glasswing and Mythos?
Project Glasswing is Anthropic’s initiative to explore how advanced AI models can help find and understand software vulnerabilities. Mythos is the specialized large language model (LLM) at the center of that effort, designed specifically for security research and exploit discovery.
Participants like Cloudflare and IBM have been testing Mythos on real-world codebases, feeding it dozens of repositories and asking it to uncover weaknesses. The goal isn’t just to find bugs, but to learn how AI should be integrated into security workflows in a safe and effective way.
Mythos’s real strengths: chaining attacks and proving exploits
Cloudflare’s early write-up highlights two standout capabilities where Mythos clearly outperforms general-purpose LLMs:
1. Building exploit chains. Mythos is particularly good at taking several small, seemingly minor issues and chaining them into a realistic attack path. Instead of treating each flaw in isolation, it can reason about how an attacker might combine them to reach something critical.
2. Generating proof-of-concept exploits. Mythos doesn’t just say “this might be vulnerable.” It’s strong at writing proof-of-concept (PoC) code to demonstrate that an issue is actually exploitable. That’s a big deal for security teams that need to prioritize real risks over theoretical ones.
These strengths make Mythos feel less like a generic chatbot and more like a very fast, very focused security researcher — but only when it’s used correctly.
Why “just point it at the repo” doesn’t work
One of the most important lessons from Cloudflare’s testing is what doesn’t work: simply pointing Mythos at a large codebase and saying “find vulnerabilities.”
In practice, that approach runs into several problems:
Context limits. Even large-context models can’t hold an entire complex repository in their working memory at once. The model loses track of important details or gets pulled into side paths.
Wandering behavior. Without structure, the model can drift, follow unhelpful leads, or generate noisy findings that waste analyst time.
Safety guardrails. When prompts are vague, Mythos may refuse certain actions, and operators have to rephrase requests carefully so the work stays within safety policies while remaining useful.
The takeaway: Mythos isn’t a magic scanner you can fire at 50 repositories and walk away. It needs structure, guidance, and a clear workflow.
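One common way to work within context limits is to scope each request to a bounded slice of the repository. The following is a minimal sketch of that idea, not something taken from Cloudflare's write-up: the `batch_files` helper, the four-characters-per-token estimate, and the budget value are all assumptions made for illustration.

```python
from typing import Dict, List

CHARS_PER_TOKEN = 4  # rough heuristic (assumption); real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def batch_files(files: Dict[str, str], budget: int) -> List[List[str]]:
    """Group file paths into batches whose estimated tokens fit the budget.

    Paths are processed in sorted order so files from the same directory
    tend to share a batch. A file larger than the budget ends up alone in
    its own batch and would need further splitting in a real system.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for path in sorted(files):
        cost = estimate_tokens(files[path])
        # Close the current batch if adding this file would exceed the budget.
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(path)
        used += cost
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Hypothetical repository contents, sized by character count only.
    repo = {
        "src/api/routes.py": "x" * 800,
        "src/auth/login.py": "x" * 400,
        "src/auth/session.py": "x" * 400,
    }
    for batch in batch_files(repo, budget=250):
        print(batch)
```

Each batch can then be handed to a narrowly prompted agent, which keeps any single request small enough for the model to track.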
The “harness”: small, specialized agents over one big model
The most important concept to emerge from Glasswing so far is the idea of a harness. Instead of treating Mythos as a single, all-knowing agent, Cloudflare and others are building orchestrated systems of smaller, specialized agents that each do one task well.
Think of it like the Unix philosophy: lots of small tools that each do one thing, chained together into a powerful pipeline.
A typical harness might break the process into steps like:
1. Code discovery. An agent maps the repository, identifies components, and selects files or modules likely to be interesting from a security perspective.
2. Static analysis and reasoning. Another agent reviews specific files or functions, looking for known vulnerability patterns or suspicious logic.
3. Exploit path building. A specialized agent tries to connect individual issues into an end-to-end attack chain.
4. Proof-of-concept generation. A final agent writes and refines PoC code to validate that the vulnerabilities are exploitable.
Each agent is tightly scoped and heavily prompted for its task, then hands off to the next. This “microservices for AI” approach mirrors what many teams are now doing with AI agents and skills-based architectures.
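The four steps above can be sketched as a chain of small stages that pass a shared state along. This is only an illustration of the shape of a harness, not Cloudflare's implementation: every name here (`State`, `Finding`, `run_harness`, and the stage functions) is hypothetical, and each stage is a trivial stand-in for what would really be a model call with a tightly scoped prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Finding:
    file: str
    issue: str
    chain: List[str] = field(default_factory=list)  # related issues, if any
    poc: str = ""                                    # PoC text, if produced


@dataclass
class State:
    files: Dict[str, str]                            # path -> source text
    targets: List[str] = field(default_factory=list)
    findings: List[Finding] = field(default_factory=list)


Stage = Callable[[State], State]


def discover(state: State) -> State:
    # Stand-in for a discovery agent: pick files whose path hints at risk.
    state.targets = [p for p in sorted(state.files) if "auth" in p or "api" in p]
    return state


def analyze(state: State) -> State:
    # Stand-in for an analysis agent; a real one would query a model per file.
    for path in state.targets:
        if "eval(" in state.files[path]:
            state.findings.append(Finding(path, "eval on untrusted input"))
    return state


def build_chains(state: State) -> State:
    # Stand-in for a chaining agent: relate each finding to the others.
    for f in state.findings:
        f.chain = [o.issue for o in state.findings if o is not f]
    return state


def write_pocs(state: State) -> State:
    # Stand-in for a PoC agent: emits a placeholder, not a real exploit.
    for f in state.findings:
        f.poc = f"TODO: PoC for {f.issue} in {f.file}"
    return state


def run_harness(state: State, stages: List[Stage]) -> State:
    """Run each stage in order, handing the state from one to the next."""
    for stage in stages:
        state = stage(state)
    return state


if __name__ == "__main__":
    repo = State(files={"src/auth.py": "eval(user_input)", "docs/readme.md": "notes"})
    result = run_harness(repo, [discover, analyze, build_chains, write_pocs])
    for finding in result.findings:
        print(finding)
```

Because every stage has the same signature, a stage can be swapped, tested, or re-prompted on its own without touching the rest of the chain.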
Purpose-built models and the end of “one model to rule them all”
These findings reinforce something many practitioners have been saying: massive general-purpose models aren’t the answer to every problem. For security work, purpose-built models and agents are often more useful than a single giant model trying to do everything.
That means:
Smaller, focused agents. Instead of one Mythos instance doing everything, use multiple agents tuned for tasks like triage, exploit generation, or documentation.
Domain-specific prompting and tools. Give each agent access to the right tools (like code search, test harnesses, or vulnerability databases) and prompts that assume security context, not generic conversation.
Human operators in the loop. Skilled security engineers are still essential. They design the harness, review outputs, and decide what’s real, what’s noise, and what needs escalation.
In other words, AI is becoming another specialized tool in the security stack — not a replacement for the stack or the people running it.
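Giving each agent only the tools it needs can be expressed as a per-agent allow-list that is checked before any tool runs. The sketch below assumes a hypothetical `AgentSpec` record and a plain dictionary as the tool registry; the tool names are made up for the example and do not refer to any real product API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class AgentSpec:
    name: str
    system_prompt: str              # security-specific prompt for this agent
    allowed_tools: FrozenSet[str]   # the only tools this agent may invoke


class ToolNotAllowed(Exception):
    """Raised when an agent requests a tool outside its allow-list."""


def call_tool(
    agent: AgentSpec,
    tool: str,
    registry: Dict[str, Callable[[str], str]],
    arg: str,
) -> str:
    # Enforce the allow-list before dispatching to the tool.
    if tool not in agent.allowed_tools:
        raise ToolNotAllowed(f"{agent.name} may not call {tool}")
    return registry[tool](arg)


if __name__ == "__main__":
    # Hypothetical tools; real ones would wrap code search, test runners, etc.
    registry = {
        "code_search": lambda query: f"results for {query}",
        "run_tests": lambda target: f"ran tests for {target}",
    }
    triage = AgentSpec(
        name="triage",
        system_prompt="You triage reported findings in C code.",
        allowed_tools=frozenset({"code_search"}),
    )
    print(call_tool(triage, "code_search", registry, "strcpy"))
    try:
        call_tool(triage, "run_tests", registry, "parser")
    except ToolNotAllowed as err:
        print("blocked:", err)
```

Keeping the check in one function gives human operators a single place to review and tighten what each agent is allowed to do.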
Balancing AI safety with real security work
Cloudflare’s report also touches on a tricky issue: how do you let an AI do serious security research without letting it go off the rails?
On one hand, you want Mythos to explore exploit paths, generate PoCs, and reason about attacks. On the other, you need strong safety controls to prevent misuse or unsafe behavior.
This leads to practical challenges:
Prompt design. Sometimes Mythos will refuse a request as unsafe, even when it’s part of a legitimate test. Teams have to learn how to phrase tasks so they’re clearly within policy while still being useful.
Operator skill. The idea of a “skilled operator” hasn’t gone away. Now, instead of just running tools, operators are managing and steering AI systems — deciding what to ask, how to interpret answers, and when to push back.
In a sense, we’re reliving old debates (monolith vs microservices, strict controls vs usability) in an AI context. The answer looks familiar: small, focused systems with clear guardrails, overseen by humans who know what they’re doing.
Speed vs architecture: what matters more?
One of the loudest reactions to Mythos from security leaders has been about speed: scan faster, patch faster, compress the response cycle. But Cloudflare’s takeaway is more nuanced.
They argue that while speed matters, the bigger win is improving your architecture so that:
Exploitation is harder even when bugs exist. If your systems are built with strong isolation, least privilege, and defense in depth, then a single unpatched vulnerability is less likely to lead to a catastrophic breach.
The patch window matters less. If your architecture assumes that vulnerabilities will always exist and focuses on limiting blast radius, the gap between disclosure and patch becomes less terrifying.
This aligns with modern ideas like zero trust: assume compromise, limit what an attacker (or an AI inside your environment) can do, and focus on resilience over perfection.
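One way to make "blast radius" concrete is to model which systems can reach which, then count what a single compromised component could touch. The sketch below is a simplified illustration under that assumption: the `blast_radius` function and both example topologies are hypothetical, and a real environment would also need to account for credentials, identities, and data flows.

```python
from collections import deque
from typing import Dict, List, Set


def blast_radius(access: Dict[str, List[str]], start: str) -> Set[str]:
    """Return every node reachable from `start`, excluding `start` itself.

    `access` maps each component to the components it is allowed to reach.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in access.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


if __name__ == "__main__":
    # Flat network: the web tier can talk to everything directly.
    flat = {"web": ["api", "db", "billing", "admin"]}
    # Segmented network: the web tier reaches only the API, which reaches the DB.
    segmented = {"web": ["api"], "api": ["db"]}

    print(sorted(blast_radius(flat, "web")))
    print(sorted(blast_radius(segmented, "web")))
```

The same bug in the web tier exposes fewer systems in the segmented layout, which is the sense in which architecture can matter more than patch speed.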
Mythos as “just another scan tool” — with caveats
From a CISO perspective, the healthiest way to think about Mythos and similar frontier models is simple: they’re new scan tools, not oracles.
That means:
Don’t blindly trust the output. Just as with any scanner, Mythos will produce false positives and false negatives. In one public example, it flagged multiple supposed vulnerabilities in curl; only one turned out to be real, and it was minor.
Use multiple models for QA. One promising pattern is to use a different model to validate Mythos’s findings. If two independent systems agree — and a human reviews the result — confidence goes up.
You own the outcome. No matter how advanced the AI, responsibility for acting on its findings stays with the organization. Due diligence, verification, and prioritization are still human jobs.
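The multi-model QA pattern can be sketched as a simple voting step in front of human review. This is an assumption-laden illustration rather than a documented workflow: `Report`, `Validator`, and `triage` are hypothetical names, and each validator here is a plain function standing in for a call to a separate, independent model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Report:
    id: str
    description: str


# A validator returns True if it independently confirms the report.
Validator = Callable[[Report], bool]


def triage(
    reports: List[Report],
    validators: List[Validator],
    min_agree: int = 2,
) -> Tuple[List[Report], List[Report]]:
    """Split reports into (needs human review, likely noise).

    A report is queued for human review only when at least `min_agree`
    validators confirm it. Nothing is accepted automatically.
    """
    review: List[Report] = []
    noise: List[Report] = []
    for report in reports:
        votes = sum(1 for validate in validators if validate(report))
        (review if votes >= min_agree else noise).append(report)
    return review, noise


if __name__ == "__main__":
    reports = [
        Report("R1", "heap overflow in header parser"),
        Report("R2", "possible issue, unclear impact"),
    ]
    # Stub validators (assumption): real ones would each query a different model.
    validators = [
        lambda r: "overflow" in r.description,
        lambda r: "parser" in r.description,
    ]
    review, noise = triage(reports, validators)
    print("review:", [r.id for r in review])
    print("noise:", [r.id for r in noise])
```

Agreement between validators only raises a report's priority; a person still decides whether it is real.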
Old problems, new tools: the eternal security loop
The Glasswing findings landed around the same time as the 28th anniversary of L0pht Heavy Industries’ famous 1998 testimony before the U.S. Senate, a moment many see as the birth of modern cybersecurity policy.
Looking back, one of the original L0pht members noted that the vulnerabilities they described in 1998 — weak authentication, unencrypted protocols, fragile infrastructure with no accountability — became the blueprint for the next three decades of breaches. In many ways, we’re still fighting the same battles, just with cloud, AI, and bigger stakes.
Even with advances like route security (RPKI) and better detection, the pattern repeats:
We invent new technology. The internet, cloud, AI agents, and more.
We create new attack surfaces. Misconfigured repos, leaked keys, vulnerable APIs, over-permissive agents.
We then build tools to defend them. Scanners, SIEMs, agent harnesses, SBOMs, and (soon) AIBOMs.
Project Glasswing fits right into that cycle: a powerful new tool that both solves and introduces problems, depending on how it’s used.
Governance, friction, and the human factor
Alongside AI-specific lessons, the conversation around Glasswing kept circling back to classic governance issues — especially as new leaks and supply chain incidents surface.
Some recurring themes:
Friction kills controls. If governance and security processes are too painful, people will work around them. That’s as true for developers pushing to GitHub as it is for teams trying to integrate AI into their workflows.
Defense in depth still matters. Any single control can fail. Whether it’s a leaked key, a misconfigured repo, or an over-privileged AI agent, you need layered defenses so one mistake doesn’t become a disaster.
Third-party and supply chain risk is everywhere. From contractors mishandling credentials to platform providers being breached, your security posture is only as strong as your weakest partner. Tools like SBOMs help, but only if organizations actually understand and manage their code provenance.
These are not new ideas — but Glasswing is forcing teams to revisit them in an AI-first world.
Training people, then training AI
One of the most practical calls to action from practitioners is surprisingly human: we need to train more people, not just more models.
Several pain points stand out:
Shortage of skilled operators. For years, many organizations cut junior security roles, and that has left a gap in mid-level and senior talent. AI tools still need experts to operate and interpret them.
Every user is an endpoint. Basic cyber hygiene for the general population is still weak. Poor password habits, blind trust in links, and misunderstanding of online risk continue to open doors attackers can walk through.
AI needs good training data. If we want reliable AI agents, we need robust, high-quality knowledge bases — not just AI models trained on AI-generated content. That requires real documentation, real expertise, and real curation.
Only after we invest in human skills and governance can we responsibly scale up AI-driven security. Otherwise, we risk building powerful tools on top of shaky foundations.
How security teams can act on Glasswing’s early lessons
If you’re building or securing software today, here are practical ways to apply what Glasswing is already teaching the industry:
1. Treat frontier models as advanced tools, not replacements. Add Mythos-like systems to your scanning and review pipeline, but keep humans in the loop for triage and validation.
2. Design a harness, don’t rely on ad hoc prompts. Break your workflow into clear steps and build small, specialized agents for each one. The same mindset underpins modern multi-agent AI tools that automate complex tasks.
3. Focus on architecture, not just speed. Use AI to find more bugs faster, but also invest in designs that make exploitation harder and limit blast radius when something is missed.
4. Strengthen governance and reduce pointless friction. Controls should be strong but usable. If people constantly work around them, the design is wrong — and AI will only amplify that risk.
5. Invest in people. Train junior security staff, upskill developers on secure coding, and educate everyday users on basic hygiene. Then use those people to design, monitor, and improve your AI systems.
Project Glasswing is still in its early days, but its first findings are clear: AI can dramatically accelerate vulnerability discovery, yet the fundamentals of security — architecture, governance, and human expertise — matter more than ever.