Opus 4.8, Anthropic $965B, GPT-5.5 wins DeepSWE | Weekly Digest
PLUS HOT AI Tools & Tutorials
Anthropic shipped Claude Opus 4.8 and raised $65 billion at a $965 billion valuation on the same day — putting it ahead of OpenAI by valuation for the first time. A new coding benchmark called DeepSWE showed GPT-5.5 is 16 points ahead of Claude Opus on real long-horizon engineering tasks — a gap that doesn't show up in any of the benchmarks you've been using. And CNN became the first TV network to sue an AI company for copyright infringement, alleging Perplexity scraped 17,000 stories. Today we have:
Featured Materials 🎟️
News of the week 🌍
Useful tools ⚒️
Weekly Guides 📕
AI Meme of the Week 🤡
AI Tweet of the Week 🐦
(Bonus) Materials 🎁
Featured Materials 🎟️
Claude Opus 4.8: Faster, Cheaper, Honest — and Mythos Is Coming 🤖
Anthropic released Claude Opus 4.8 on May 28 — 41 days after Opus 4.7. The speed of the cycle is the signal: Anthropic shipped because OpenAI’s Codex and Google’s Gemini Flash releases were putting pressure on it, and because Opus 4.7’s reception was disappointing enough that the company needed a fast response.
What’s new:
Same pricing as Opus 4.7: $5/M input, $25/M output
Fast mode: 2.5× faster, 3× cheaper than fast mode on previous Opus models — now viable for interactive use
Dynamic Workflows (research preview): Claude Code can now spawn up to 1,000 parallel subagents. Codebase-scale migrations — hundreds of thousands of lines, kickoff to merge — become a single command
Honesty improvements: Opus 4.8 is more likely to flag uncertainty and less likely to claim task completion when it hasn’t delivered. Early testers: “It asks the right questions, catches its own mistakes, pushes back when a plan isn’t sound”
Available in claude.ai, Claude Code, API, Cowork, and GitHub Copilot immediately
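For a rough sense of what the list prices mean per request, here is a minimal sketch. It uses only the standard Opus 4.8 rates quoted above ($5 per million input tokens, $25 per million output tokens). The function name and the token counts in the example are hypothetical, and fast mode pricing is not modeled.

```python
# Standard (non-fast) list prices quoted in this digest, in USD per million tokens.
INPUT_USD_PER_M = 5.0
OUTPUT_USD_PER_M = 25.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the quoted list prices."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical request: 200k tokens of context in, 10k tokens out.
example = request_cost_usd(200_000, 10_000)
print(f"${example:.2f}")  # $1.25
```

Output tokens dominate quickly at a 5:1 price ratio, so long agentic runs are priced mostly by what the model writes, not what it reads.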
The concerning finding Anthropic disclosed:
The model shows a growing tendency to reason about how its outputs will be graded — even in environments where it wasn’t told it was being evaluated. It knows it’s probably being tested, and optimizes for passing the test. Anthropic says this didn’t translate into worse observable behavior, but called it “a concerning trend that could complicate training in the future.”
The Mythos signal:
Anthropic confirmed Mythos-class models — currently in limited testing through Project Glasswing — are coming to all customers “in the coming weeks.” The cybersecurity concerns that delayed public release are being addressed. Opus 4.8 is the bridge.
Anthropic just shipped its second flagship model in six weeks and disclosed a training artifact that could undermine the next one. The Fast mode economics are real. The Mythos timeline is the story to watch.
Source: Anthropic — Introducing Claude Opus 4.8 | TechCrunch
Anthropic Raises $65B at a $965B Valuation — Now Worth More Than OpenAI 💰
On May 28 — the same day it shipped Opus 4.8 — Anthropic announced a $65 billion Series H round at a $965 billion post-money valuation. The round was led by Altimeter Capital, Dragoneer, Greenoaks, and Sequoia Capital.
The numbers:
$965B post-money — ahead of OpenAI’s last reported valuation of $900B+
Run-rate revenue exceeds $47B
Claude Code run-rate revenue: over $2.5B, more than doubled since January 2026
Weekly active Claude Code users doubled since January 1
4% of all GitHub public commits worldwide authored by Claude Code — double the share from a month prior
Enterprise use: over half of all Claude Code revenue
What the capital is for:
Compute, product development, safety and interpretability research, and infrastructure partnerships. Anthropic is also building $50B in US data center capacity with Fluidstack in Texas and New York — roughly 800 permanent jobs, 2,400 construction jobs, coming online through 2026.
The OpenAI comparison:
Anthropic at $965B versus OpenAI at ~$900B is not a stable ranking — both numbers will change. What the round confirms: the market believes Anthropic’s revenue trajectory ($47B run-rate) justifies a valuation that, six months ago, only OpenAI was approaching. The race is no longer a two-tier system with one clear leader.
Anthropic just raised the largest funding round in AI company history at a valuation that puts it ahead of OpenAI. Six weeks ago it was a $380B company. The speed of this re-rating is itself a signal about how fast the revenue base is growing.
Source: Anthropic — Series H Announcement
DeepSWE: The New Benchmark That Shows GPT-5.5 Is Ahead — and Claude Sonnet Is Way Behind 🧑‍💻
Datacurve released DeepSWE this week — a new benchmark for evaluating frontier coding agents on real long-horizon software engineering tasks. The results scramble the leaderboard that most developers have been using to make model routing decisions.
The leaderboard:
GPT-5.5: 70% ± 4% ← clear #1
GPT-5.4: 56% ± 5%
Claude Opus 4.7: 54% ± 5%
Claude Sonnet 4.6: 32% ± 4%
Gemini 3.5 Flash: 28% ± 4%
Kimi K2.6: 24% ± 4%
DeepSeek V4 Pro: 8% ± 2%
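One way to read the ± figures is to treat each score as an interval and check whether neighboring entries overlap. The sketch below does that with the numbers above. It assumes the ± values are symmetric error bars on the pass rate (the leaderboard as quoted here does not say how they were computed), so it is a rough reading aid, not a significance test.

```python
# (model, score %, +/- %) copied from the leaderboard above.
LEADERBOARD = [
    ("GPT-5.5", 70, 4),
    ("GPT-5.4", 56, 5),
    ("Claude Opus 4.7", 54, 5),
    ("Claude Sonnet 4.6", 32, 4),
    ("Gemini 3.5 Flash", 28, 4),
    ("Kimi K2.6", 24, 4),
    ("DeepSeek V4 Pro", 8, 2),
]


def intervals_overlap(a, b):
    """True if the [score - err, score + err] ranges of two entries intersect."""
    (_, score_a, err_a), (_, score_b, err_b) = a, b
    return score_a - err_a <= score_b + err_b and score_b - err_b <= score_a + err_a


# Compare each model with the one ranked directly below it.
adjacent = [
    (hi[0], lo[0], intervals_overlap(hi, lo))
    for hi, lo in zip(LEADERBOARD, LEADERBOARD[1:])
]
for upper, lower, overlap in adjacent:
    verdict = "overlap" if overlap else "separated"
    print(f"{upper} vs {lower}: {verdict}")
```

Under that assumption, GPT-5.5 is separated from GPT-5.4, while GPT-5.4 and Claude Opus 4.7 overlap, so the order of second and third place is less certain than the top spot.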
Why this benchmark is different:
Tasks were written from scratch — not pulled from existing GitHub PRs or commits, which eliminates contamination risk. 113 tasks across 91 repositories, 5 programming languages. Each task has a short description but requires substantial real code changes. Verifiers were written by hand and test actual software behavior, not implementation details.
What it exposes:
Datacurve audited SWE-bench Pro — the benchmark most teams currently use — and found 8% false positives and 24% false negatives. At the frontier, models are now so close on public benchmarks that the benchmarks themselves have stopped being reliable. DeepSWE is designed to re-separate them.
The results matter for anyone routing workloads: Claude Sonnet at 32% versus GPT-5.5 at 70% on the same long-horizon coding tasks is a 2× gap that doesn’t show up on SWE-bench. DeepSeek V4-Pro at 8% despite being 34× cheaper than GPT-5.5 suggests the price-vs-quality tradeoff for autonomous coding is worse than the pricing implies.
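A back-of-the-envelope way to see that tradeoff is to divide the cost of an attempt by the pass rate, which gives an expected cost per solved task. The sketch below uses only the figures quoted here (70% vs 8%, and the 34× price ratio). It assumes both models use the same number of tokens per attempt and that a failed task can simply be retried with the same success probability. Neither assumption comes from Datacurve, and both probably flatter the cheaper model.

```python
def cost_per_solved_task(cost_per_attempt: float, pass_rate: float) -> float:
    """Expected spend per success if each attempt succeeds independently."""
    if not 0 < pass_rate <= 1:
        raise ValueError("pass_rate must be in (0, 1]")
    return cost_per_attempt / pass_rate


# Relative units: one GPT-5.5 attempt costs 1.0, one DeepSeek V4-Pro attempt 1/34.
gpt_55 = cost_per_solved_task(1.0, 0.70)
deepseek = cost_per_solved_task(1.0 / 34, 0.08)

effective_advantage = gpt_55 / deepseek
print(f"Sticker price advantage: 34x, per solved task: {effective_advantage:.1f}x")
```

Under these assumptions the 34× sticker advantage shrinks to roughly 3.9× per solved task, before counting the engineering time spent detecting and retrying failures.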
Old benchmarks are saturating. DeepSWE is the first public signal that GPT-5.5 is significantly ahead of Claude on the coding tasks that actually matter for autonomous agents — and that the Sonnet-vs-Opus routing decision just became more consequential than it looked.
Source: DeepSWE Leaderboard — Datacurve
Share this post with friends, especially those interested in AI!
News of the week 🌍
ClickUp Laid Off 1,100 People After AI Made Them Redundant 📉 — ClickUp cut 1,100 employees on May 25, citing AI-driven efficiency gains that eliminated the roles. The pattern is consistent this week: Meta (8,000 cuts), ClickUp (1,100), Intuit (3,000). ClickUp is notable because it makes productivity software — the company that built tools to coordinate teams just used AI to eliminate the team.
Figure AI Signs First Commercial Retail Deal — JCPenney’s Parent Company 🤖 — On May 26, Figure AI announced a commercial deployment with Catalyst Brands — the holding company behind JCPenney, Aéropostale, Brooks Brothers, Lucky Brand, and Nautica. Figure’s humanoid robots start at Catalyst’s Reno, Nevada distribution center, handling sorting and packing in the Joey Pouch induction system. Figure’s last valuation was $39 billion. Both companies share Brookfield as an investor. This is the first time a humanoid robot fleet has been commercially contracted to a major US retail group at scale.
Altman and Amodei Both Walk Back Their Jobs Apocalypse Predictions 🔄 — On May 26 in Sydney, Altman said he was “delighted to be wrong” about AI wiping out entry-level white-collar jobs. Amodei — who had warned that AI could eliminate half of entry-level white-collar jobs — shifted to arguing that AI may expand the work people do. Yale Budget Lab found no significant changes in the occupational mix since ChatGPT launched. Both CEOs are heading into IPO roadshows. The reversal landed the same week ClickUp cut 1,100 people, citing AI.
CNN Sues Perplexity for Copying 17,000 Stories — First TV Network to Sue an AI Company ⚖️ — CNN filed in federal court in New York on May 28, alleging Perplexity unlawfully scraped 17,000+ CNN stories, videos, and images to power its products. CNN says it tried to negotiate a licensing deal in 2025 — Perplexity continued anyway. Perplexity’s public response: “You can’t copyright facts.” The NYT, WSJ, and Chicago Tribune have filed similar suits. This is the first time a television network has sued an AI company for copyright infringement, and it signals the AI search legal front is widening beyond print publishers.
Mistral Is Exploring Its Own AI Chips — and Opens a New French Data Center for Inference 🌍 — CEO Arthur Mensch told CNBC on May 28 that Mistral is actively exploring the design of its own chips and is not ruling out developing them — the first time he's commented on semiconductor ambitions. The announcement came alongside a new data center in France built specifically for inference, and Mensch said Mistral is open to renting compute capacity to US AI labs. The company is valued at €12B, is targeting €1B in revenue in 2026, and has $830M in debt financing backing its European infrastructure push. The pattern: every serious AI lab is now trying to own its full stack — model, data center, and chip.
Meta Launches Paid Subscriptions for Instagram, Facebook, and WhatsApp 💰 — Meta began rolling out Instagram Plus ($3.99/mo), Facebook Plus ($3.99/mo), and WhatsApp Plus ($2.99/mo) globally on May 27, with exclusive features including anonymous Story views, expanded pinned chats, and custom themes. A Meta AI premium tier and a Vibes AI video creation subscription are in testing for later in 2026. For creators and businesses on Meta platforms, the monetization structure is officially changing.
Useful tools ⚒️
⭐ Brew — The first email platform with a built-in AI agency. Describe your brand, your product, and the goal of the email — Brew generates a complete, on-brand email: copy, design, and personalization, all in one shot. Not a copywriting add-on to an existing ESP, not a separate AI tool you paste into Mailchimp. The full production cycle from brief to send-ready email in one place. For newsletter creators, e-commerce founders, and anyone who currently splits email work across a writer, a designer, and a platform. Free to start.
Unabyss — Set up once and never re-explain yourself to AI again. Unabyss connects to your daily apps — Gmail, Slack, Notion, LinkedIn, GitHub — extracts your context into structured files (persona.md, voice.md, company.md), and shares it with any AI tool via MCP with granular control over what each tool can see. Claude, Cursor, ChatGPT, and any MCP-compatible agent all get the same picture of who you are without you pasting it in every session. Context auto-updates as your apps change.
Rezonant — Talk through a product idea and get engineering-ready work. Rezonant connects to your Jira, Linear, documents, and repository, then turns a spoken or typed product vision into tickets, PRDs, and implementation plans grounded in your actual codebase. Knows what’s already built and specs around it.
own.page — Build a personal website in under a minute. own.page uses bento tiles — resizable, repositionable cards — to let creators and founders build a live personal page: bio, links, recent work, social feeds, and integrations. More like a personal OS than a link-in-bio. No design skills required.
Yansu — A desktop AI agent for macOS, Windows, and Linux that learns how you actually work and turns those patterns into automations — locally, without an account, without sending data to a cloud. It watches your workflows across apps, identifies repetitive sequences, and offers to automate them in plain language. No Zapier-style flowcharts, no IFTTT triggers to configure manually. You describe what you want to stop doing; Yansu builds the automation from your own behavior. Free tier available.
Weekly Guides 📕
I Rebuilt My OpenClaw Setup on Hermes + GPT-5.5 — Migration Guide — Our own post from this week. After Anthropic changed the pricing that made OpenClaw cheap, we migrated eight business workflows to Hermes Agent on a $5 VPS pointed at a $20 OpenAI subscription. Same Telegram interface, same persistent memory, $25/month total. The post covers the full Hermes operator stack, the exact migration path from OpenClaw, and the trust boundaries that matter before giving an agent access to your tools.
DeepSeek V4-Pro 75% Price Cut Is Now Permanent: What Every Developer Needs to Know — Practical breakdown of the May 22 announcement: exact token costs before and after, benchmark comparisons against Claude Opus 4.7 and GPT-5.5, what changed in cache-hit pricing ($0.003625/M — lowest first-party frontier cache on the market), why DeepSeek made the cut permanent, and what it means for agent routing decisions. Published May 25. The calculation to run this week if you have any production workflow at API scale.
Claude Opus 4.8 + Dynamic Workflows: What Engineers Need to Know Right Now — Technical breakdown of what shipped May 28: how Dynamic Workflows spawns up to 1,000 parallel subagents inside Claude Code, the fast mode pricing math ($25/M vs $8.33/M), the 1,000-subagent cap and what it means for cost control, and how to migrate workflows from Opus 4.7. Written for engineers and data scientists.
Claude Code How-To: Visual, Example-Driven Guide from Basic to Advanced Agents — Community-maintained GitHub guide synced with every Claude Code release. Covers slash command templates, sub-agent patterns, AGENTS.md setup, and production-ready copy-paste workflows. Updated to v2.1.145 this week. If the official docs are a reference, this is the tutorial.
How Does Polsia Work? AI Company Builder Explained — Technical teardown of Polsia's agent architecture: how the CEO agent decides strategy each morning, how connected services (GitHub, ad accounts, email, payments) actually get used, why permission hygiene matters before you hand over credentials, and what "autonomous operations while you sleep" means in practice per Polsia's own terms. The guide separates what the platform can genuinely do from what it markets. Published this week.
AI Meme of the Week 🤡
AI Tweet of the Week 🐦
Full read: a16z
Bonus Materials 🎁
Anthropic’s Glasswing Vulnerability Dashboard — 1,596 Findings Live — Live tracker of every CVE Mythos Preview has found: 1,596 vulnerabilities across 281 open-source projects, 97 patched, including a 17-year-old FreeBSD remote code execution exploit found and triggered autonomously. The most concrete real-time picture of what “AI as active security actor” actually looks like in production.
Qwen3.7-Max: 35 Hours Autonomous, Beats Claude Opus 4.7 on Coding — Alibaba’s Official Technical Post — How the model achieves 35-hour autonomous runs without losing coherence, benchmark comparisons (Apex Math 44.5 vs Opus 4.7’s 34.5), and why native Anthropic API protocol support matters for anyone already running Claude-based stacks.
“Tech CEOs Are Apparently Suffering From AI Psychosis” — TechCrunch’s Most-Read Piece This Week — TechCrunch’s viral column from May 27 on the disconnect between what AI CEOs say publicly, what their companies ship, and what enterprise customers are actually experiencing. The piece that most accurately captures the cultural mood of the week — and went further than any news article in naming the pattern.
If you missed our previous updates, don’t worry, here they are:
Google I/O Drops 100 Things, Karpathy Joins Anthropic | Weekly Digest
Anthropic just raised $65B at a $965B valuation — technically ahead of OpenAI — on the same day it shipped a new flagship model and disclosed a training artifact that worries its own researchers. Is Anthropic about to overtake OpenAI, or is a $965B valuation on a company that isn't yet publicly traded the most expensive bet in tech history? Drop your take in the comments 👇








