Meta Goes Closed-Source, Mythos Gets Official, Gemma 4 Slaps | Weekly Digest
Project Glasswing, Cursor 3 agent IDE & Gemma 4 beats Llama 4
Hey! Welcome to the latest Creators’ AI Edition.
Until this week, Mythos was only a leak. Now Anthropic has made it official and announced that the public won’t get to use it. Meta shipped its first model under Alexandr Wang and, for the first time in its history, shipped it closed-source. Google dropped an open model that beats Llama 4 across every major benchmark at a fraction of the compute. Today we have:
Featured Materials 🎟️
News of the week 🌍
Useful tools ⚒️
Weekly Guides 📕
AI Meme of the Week 🤡
AI Tweet of the Week 🐦
(Bonus) Materials 🎁
Stop losing contacts. Let AI remember everyone for you.
ConnectMachine is your private AI agent for professional networking. Describe who you're looking for in plain English — ConnectMachine finds them, surfaces the context of when and where you met, and helps you reconnect at exactly the right moment. No scattered business cards. No noisy social feeds. No lost opportunities slipping through the cracks.
Built for founders, executives, and operators who network with intention — not volume. From digital business cards with selective sharing to calendar-aware meeting scheduling — ConnectMachine handles your entire relationship layer, quietly and privately.
Featured Materials 🎟️
Anthropic Officially Releases Mythos — And Immediately Locks It Away 🔐
Three weeks ago the model leaked from a CMS error. On April 7, Anthropic made it official — and confirmed that Claude Mythos Preview will not be publicly available. Not because it underperformed. Because it performed too well.
Internally, Anthropic used Mythos to scan the world’s most critical software. It found thousands of zero-day vulnerabilities autonomously — including a 17-year-old remote code execution flaw in FreeBSD (CVE-2026-4747) that let anyone gain root over NFS. No human assistance. Fully autonomous.
- 93.9% on SWE-bench Verified, the highest score posted on that coding benchmark so far
- 97.6% on USAMO 2026 — math olympiad level
- Parameter count undisclosed — Anthropic has not revealed the architecture details of Mythos Preview
- $100M in usage credits committed to Project Glasswing partners
What Project Glasswing actually is:
Instead of a public release, Anthropic launched Project Glasswing — a controlled-access security initiative. 50+ organizations receive Mythos Preview access specifically to find and patch vulnerabilities before attackers get there: AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, Nvidia, Palo Alto Networks. Plus ~40 more critical infrastructure maintainers.
Mythos drew a line the industry hasn't seen before: a model so capable that the responsible move is to not ship it. Project Glasswing is the right call. The uncomfortable question is what happens when the next lab reaches the same capability and makes a different choice.
Source: Anthropic | Red Team Blog | TechCrunch | Fortune | NBC News
Meta Builds the Best Health AI on the Planet — Then Closes the Source 🔒
Meta launched Muse Spark on April 8. First model from Meta Superintelligence Labs. First model under Alexandr Wang. And the first model Meta has ever shipped as closed-source proprietary software.
That last part is the story. Meta built Llama. Meta defined the open-source AI movement. Muse Spark ships with no weights, no license, no public release plan. Zuckerberg’s team rebuilt their entire AI stack from scratch over nine months — and the result is too competitive to give away.
The model:
- Inputs: voice, text, image. Output: text only (for now)
- Three modes: Fast for quick queries, Contemplating for complex tasks (runs sub-agents in parallel), and Shopping with behavioral signals layered on top
- HealthBench Hard: 42.8, ahead of every frontier model tested. GPT-5.4 scores 40.1; Gemini 3.1 Pro scores 20.6
- Ranks 4th on the Artificial Analysis Intelligence Index, behind only Gemini 3.1 Pro, GPT-5.4, and Claude Opus 4.6
- Live now in the Meta AI app and on meta.ai. Rolling out to WhatsApp, Instagram, Facebook, Messenger, and Ray-Bans in the coming weeks, US first
What Wang’s team actually built:
Nine months, no legacy constraints, full stack from the ground up. Small and fast by design, capable enough to outperform every competitor on health reasoning. Meta plans an open-source version eventually — but “eventually” with no date is not a commitment.
Open-sourcing Llama was a strategic move for Meta: a way to commoditize competitors’ models and compute. Keeping Muse Spark closed, with an open-source version promised but undated, means in practice that the competitive advantage stays proprietary for now.
Source: Meta AI Blog | CNBC | The Verge | gHacks
Gemma 4: A 31B Model That Outperforms Llama 4 at a Fraction of the Size 🪶
Google released Gemma 4 under Apache 2.0. No restrictions, no fees, no gating. A 31B Dense model that beats Meta’s Llama 4 across every major benchmark: AIME 2026 Math (89.2% vs 88.3%), LiveCodeBench v6 (80.0% vs 77.1%), GPQA Diamond (84.3% vs 82.3%), τ2-bench Agentic (86.4% vs 85.5%).
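To see how wide those leads actually are, here is a minimal sketch that turns the four score pairs quoted above into percentage-point margins. The numbers are copied from this edition. The dictionary layout and the `margins` helper are illustrative names, not part of any benchmark tooling.

```python
# Scores quoted above as (Gemma 4 31B Dense, Llama 4), in percent.
SCORES = {
    "AIME 2026 Math": (89.2, 88.3),
    "LiveCodeBench v6": (80.0, 77.1),
    "GPQA Diamond": (84.3, 82.3),
    "τ2-bench Agentic": (86.4, 85.5),
}


def margins(scores):
    """Return Gemma's lead over Llama 4 in percentage points, per benchmark."""
    return {
        name: round(gemma - llama, 1)
        for name, (gemma, llama) in scores.items()
    }


if __name__ == "__main__":
    for name, lead in margins(SCORES).items():
        print(f"{name}: +{lead} points")
```

On these figures every lead is positive, and each one is under three points.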
Four models for four use cases:
- Effective 2B (E2B): smartphones, Raspberry Pi, 5GB RAM
- Effective 4B (E4B): lightweight consumer hardware
- 26B Mixture of Experts: mid-range GPUs, 16GB+ RAM (4-bit)
- 31B Dense: high-end consumer GPU, 20GB RAM (4-bit)
All models: text + image inputs, 256K context window, 140+ languages. Edge models also handle audio. Day-one support on Hugging Face, Ollama, vLLM, llama.cpp, MLX, LM Studio, Unsloth, NVIDIA NIM, and Docker.
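The 4-bit RAM figures above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes weight memory is simply parameters × bits ÷ 8 and ignores quantization metadata, the KV cache, and runtime overhead, so treat it as a rough lower bound rather than Google's or Unsloth's sizing method. The function name is illustrative.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# (parameter count in billions, RAM figure quoted in this edition, in GB)
QUOTED = {
    "26B Mixture of Experts": (26, 16),
    "31B Dense": (31, 20),
}

for name, (params_b, quoted_gb) in QUOTED.items():
    weights = weight_memory_gb(params_b, 4)
    headroom = quoted_gb - weights
    print(f"{name}: ~{weights:.1f} GB of 4-bit weights, "
          f"~{headroom:.1f} GB left in the quoted figure")
```

Under that assumption the weights come to about 13 GB and 15.5 GB, so the quoted 16GB+ and 20GB figures leave a few gigabytes for context and runtime overhead.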
Google distilled Gemini 3’s research into a model family anyone can run on their laptop. The 31B Dense model fits on a single consumer GPU and outperforms everything Meta, Alibaba, and Mistral have shipped at any size. The open-source frontier moved again — in Google’s direction.
Source: Google DeepMind Blog | Google Open Source Blog | AI Dev Page
News of the Week 🌍
Cursor 3 ships an Agents Window 🪟 — Cursor rebuilt its IDE from scratch around one bet: that developers will manage agents, not write code. The new Agents Window is a separate interface built alongside the existing IDE. It shows every local and cloud agent in one unified sidebar, including sessions started from Slack, GitHub, or Linear, and you can switch back to the classic IDE at any time. The release also brings parallel execution, cloud handoff so agents keep running after you close your laptop, Design Mode for annotating UI in a built-in browser, and a 30-plugin marketplace featuring Atlassian, GitLab, Datadog, and Hugging Face. Price unchanged at $20/month.
Source: Cursor Blog | Cursor Changelog
Google open-sources Scion 🧪 — Google published Scion, a multi-agent orchestration testbed that runs Claude Code, Gemini CLI, and Codex as isolated containers with separate credentials and git worktrees. Each agent gets its own environment. Guardrails are enforced at the infrastructure level, not in software. Google calls it a “hypervisor for agents.” Local mode is stable; the Kubernetes runtime has known rough edges. Google also shipped a demo game built entirely by collaborating agent teams, which is worth a look to understand what the tool actually does.
OpenAI publishes its economic blueprint for the intelligence age 📋 — Sam Altman released a 13-page policy paper titled “Industrial Policy for the Intelligence Age.” The proposals: a national Public Wealth Fund giving every American a stake in AI companies, a subsidized 32-hour workweek at full pay, taxes on automated labor, and automatic safety net triggers that activate when AI-driven displacement hits preset thresholds. Altman told Axios it’s “a starting point, not a prescription.” The timing is not accidental: the paper lands weeks before the expected launch of OpenAI’s next model, codenamed Spud, and a potential 2026 IPO. The company racing to build AGI is now also writing the rulebook for what happens after.
Source: OpenAI | Axios | TechCrunch
GPT-5.5 “Spud” confirmed weeks away 🥔 — Sam Altman confirmed that pretraining wrapped on March 24 and said release is “a few weeks” out. Greg Brockman described it on the Big Technology podcast as “two years of research, big model feel.” Polymarket assigns ~78% probability to an April release and 95%+ by end of June. GPT-5.4 set the current baseline at 75% on OSWorld-Verified, just above the 72.4% human score. Whether Spud ships as GPT-5.5 or GPT-6 remains unconfirmed; it depends on how large the performance gap turns out to be. Spud is positioned as the first OpenAI model genuinely built for autonomous agentic workflows, not chat.
Source: The Information | TechRadar
Anthropic launches Claude Managed Agents 🤖 — Anthropic launched the public beta of Claude Managed Agents on April 8: a suite of composable APIs for building and deploying production-grade cloud agents without touching the infrastructure. The design philosophy is “brain vs. hands” — Claude reasons, each session runs in a disposable isolated Linux container. Sessions persist through disconnections. Sandboxed execution, checkpointing, credential management, scoped permissions — all handled by Anthropic. Pricing: standard API token costs plus $0.08 per session-hour (~$58/month for a 24/7 agent before token costs). One constraint: runs exclusively on Anthropic’s infrastructure — not available through Bedrock or Vertex AI yet. Early adopters already in production: Notion, Asana, Rakuten, and Sentry. Anthropic reports 10x faster time-to-ship vs. building agent infrastructure from scratch. The announcement hit 5.09M views on X.
Source: Anthropic Docs | SiliconANGLE | TechRadar
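The ~$58/month runtime figure for Claude Managed Agents follows directly from the $0.08 per session-hour rate. Here is a minimal sketch of that arithmetic, assuming a 30-day month and excluding token costs, which are billed separately. The function and constant names are illustrative, not part of Anthropic's API.

```python
SESSION_HOUR_USD = 0.08  # runtime rate quoted in this edition


def monthly_runtime_cost(hours_per_day: float = 24,
                         days: int = 30,
                         rate: float = SESSION_HOUR_USD) -> float:
    """Runtime cost in USD for one agent, before token costs.

    The 30-day default is an assumption made for this estimate.
    """
    return hours_per_day * days * rate


if __name__ == "__main__":
    always_on = monthly_runtime_cost()
    office_hours = monthly_runtime_cost(hours_per_day=8, days=22)
    print(f"24/7 agent: ${always_on:.2f} per month")
    print(f"8h x 22 workdays: ${office_hours:.2f} per month")
```

An always-on agent works out to 720 session-hours, or $57.60, which matches the rounded $58. An agent that only runs during working hours costs roughly a quarter of that.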
Useful tools ⚒️
⭐ NovaVoice — Voice OS for your desktop: context-aware dictation, AI assistant on any screen via hotkey, and cross-app voice commands without switching tabs.
NovaVoice isn’t just dictation. Dictate into Notion and get formatted Markdown. Hit the hotkey and ask anything by voice, with no tab switching. Tell it to message someone in Telegram: it opens the app, finds the contact, and drafts the message. You press send. It handles switching between 6 languages mid-sentence, and includes a custom dictionary for names and shortcuts plus a smart popover for instant text reformatting. Available on desktop, free tier included.
Google Gemma 4 — Google’s open model family that outperforms Llama 4 across every major benchmark. Apache 2.0, no fees, 256K context window. Runs on a single consumer GPU.
Handle Extension — Point and click to refine UI directly in your browser, feed the changes straight to Claude Code, Cursor, or Codex. No re-prompting, no copy-paste. Chrome extension, open source, free.
Influcio — Self-learning AI agent for influencer marketing: finds creators, runs campaigns end-to-end, and optimizes every next launch from performance data. Replaces the full manual workflow with one system.
Weekly Guides 📕
Gemma 4 Official Blog — Google — The canonical starting point. Architecture breakdown, full benchmark comparisons against Llama 4, and deployment options across every platform.
Run Gemma 4 Locally — Unsloth Docs — Exact RAM requirements by quantization level, hardware targets for each model size, and commands to get everything running. Shorter and more useful than the official docs.
Project Glasswing — Anthropic — Full explanation of what Mythos found autonomously, how the controlled-access program works, which organizations participate, and why the model won’t ship publicly.
Cursor 3: What Actually Changed — Medium — Hands-on walkthrough of the Agents Window, Design Mode, and cloud handoff. More candid about trade-offs than the official announcement.
Introducing Muse Spark — Meta AI Blog — Meta’s account of the nine-month development arc and architectural decisions. Also where you can try the model.
AI Meme of the Week 🤡
AI Tweet of the Week 🐦
(Bonus) Materials 🎁
Claude Mythos Preview — Red Team Technical Report — Benchmarks, the specific CVEs Mythos found autonomously, and the capability thresholds that triggered the restricted release. Required reading.
Muse Spark: Benchmarks & Full Comparison — Side-by-side Muse Spark vs GPT-5.4, Claude Opus 4.6, and Gemini 3.1 Pro across reasoning, coding, vision, and health evals.
Gemma 4: Performance-to-Size Breakdown — Deep dive into how Google achieved frontier-level results at a fraction of the compute. Architecture decisions, training choices, benchmark context.
Google Scion — GitHub — The actual repo. README explains the agent hypervisor architecture and the demo game built entirely by multi-agent teams.