← Blog

Artikel ini belum tersedia dalam bahasa Anda. Menampilkan versi bahasa Inggris.

Claude Fable 5 Just Shipped: 80.3% on SWE-Bench Pro, 2× Opus 4.8 Pricing, Free Through June 22

Anthropic released the first publicly available Mythos-class model today — Claude Fable 5, with safeguards that fall back to Opus 4.8 on high-risk prompts. 80.3% SWE-Bench Pro (11 points over Opus 4.8). $10/$50 per 1M tokens. Free on Pro/Max plans until June 22, then credit-metered.

By WaveSpeedAI 10 min read

Anthropic shipped its first publicly available Mythos-class model today. Claude Fable 5 went generally available on the Claude API, AWS Bedrock, Vertex AI, and Microsoft Foundry on June 9, 2026 — paired with Claude Mythos 5, the same underlying model but with safeguards lifted, kept inside Project Glasswing’s partner program. Pricing for both lands at $10 input / $50 output per 1M tokens — exactly 2× the rate of Claude Opus 4.8.

The interesting bits aren’t in the headline, though. Below the launch numbers there’s a meaningful pricing twist (free on Pro/Max through June 22, then credit-metered), a state-of-the-art benchmark profile that genuinely separates Fable 5 from the previous frontier, and a new architectural choice — automated classifier safeguards that fall back to Opus 4.8 rather than refusing — that materially affects how this gets deployed.

What shipped

DetailValue
Model nameClaude Fable 5 (public) + Claude Mythos 5 (restricted)
API model IDclaude-fable-5
Input pricing$10/1M tokens
Output pricing$50/1M tokens
vs Opus 4.8 pricing2× ($5/$25)
AvailabilityClaude API, AWS Bedrock, Vertex AI, Microsoft Foundry
Subscription accessPro / Max / Team / Enterprise — free through June 22, 2026
Post-June 22Requires extra usage credits on top of subscription
Mythos 5 accessProject Glasswing partners only; biology researchers via trusted-access program

Two pricing notes worth pulling out:

  1. The 13-day free window is the launch lever. Anthropic is using the subscription bundle to drive adoption during the first two weeks. After June 22 the same model on Pro/Max requires explicit credit purchase, which puts production users back into per-token billing math at the headline $10/$50 rate.
  2. 2× Opus 4.8 pricing is steep, but the benchmark gap explains most of it (see below). At $50/1M output, Fable is the most expensive frontier model on the market — about 3.3× Sonnet 4.6, 5× GPT-5.5, and 5.5× Gemini 3.5 Flash on output tokens.

The benchmarks — and what they actually show

Fable 5 ships with state-of-the-art claims on “nearly all tested benchmarks.” The numbers Anthropic published that are most concrete and directly comparable:

BenchmarkFable 5Opus 4.8GPT-5.5Gemini 3.1 Pro
SWE-Bench Pro80.3%69.2%58.6%54.2%
FrontierCode29.3%13.4%5.7%

Three reads:

80.3% on SWE-Bench Pro is the load-bearing number. This is the toughest of the SWE-Bench variants and the one that’s most predictive of how a model performs on real production code. Fable 5 leads Opus 4.8 by 11.1 points, GPT-5.5 by 21.7 points, and Gemini 3.1 Pro by 26.1 points. That’s a tier gap, not a marginal lead. For coding workflows where you’re comparing model quality head-to-head, this is the largest single-axis frontier-lead reported this year.

FrontierCode is the more telling number. This benchmark targets the hardest tier of programming problems — abstract algorithmic puzzles, novel data structure work, performance-critical optimization. Fable 5 at 29.3% is over 2× Opus 4.8 and over 5× GPT-5.5. On a benchmark where the frontier ceiling is around 30%, this kind of jump is the closest thing to evidence of step-change capability that benchmark data can provide.

Anthropic’s case studies back the numbers. Stripe reportedly used Fable 5 to complete a 50-million-line codebase migration in one day. Hebbia reports Fable 5 as the highest scorer on its Finance Benchmark. IMC says it “aced their trading-analysis evaluations nearly across the board.” These are first-party-friendly testimonials, but the consistency of the “code/finance/scientific knowledge work” framing across multiple independent customers is the signal worth weighting.

What’s not in the benchmark release: Terminal-Bench numbers, GPQA, MMLU, HumanEval, AIME. Anthropic seems to have prioritized SWE-Bench Pro and FrontierCode as the headline coding metrics, which is consistent with the model’s framing as a software-engineering frontier.

The safeguards — a different architectural choice

The reason Fable 5 can ship publicly at all is the three automated classifier safeguards Anthropic built on top of Mythos 5:

  1. Cybersecurity — blocks offensive cyber tasks and exploit development
  2. Biology/Chemistry — falls back to Opus 4.8 on most bio/chem requests with dual-use risk (specifically including AAV design)
  3. Distillation prevention — blocks attempts to extract capabilities for competing models

The architectural choice that matters: when a safeguard triggers, the response is handled by Opus 4.8 rather than refused. That’s a meaningful UX choice. Users get a usable answer to most queries even when Fable 5’s frontier reasoning is gated behind safety review. Anthropic reports more than 95% of Fable sessions involve no fallback at all — meaning the safeguards are tuned tightly enough that production workflows largely don’t notice them.

External red-teaming results:

  • 1,000+ hours of external bug bounty testing — no universal jailbreaks discovered
  • Zero harmful single-turn cybersecurity requests complied with across 30 public jailbreak techniques
  • An external red-teamer called the safeguards “most robust of any model tested”

For builders deploying customer-facing applications, this is a different operational posture than refusing-on-policy-trigger. You can build product around the model without designing fallback UX for every safety refusal — Opus 4.8 handles the long tail invisibly.

Mythos 5 — and what its existence implies

The same model with safeguards lifted ships as Mythos 5, restricted to Project Glasswing cybersecurity partners and expanding through a trusted-access program for biology researchers. The capability gap between Fable 5 and Mythos 5 is concentrated in three categories:

  • ExploitBench (cyber): Mythos 5 at 78% vs Opus 4.8’s 40% (Fable 5 doesn’t run this benchmark because cyber prompts hit the safeguard)
  • Drug design: Mythos 5 reportedly accelerated protein design processes by ~10×, with 9 of 14 targets yielding drug candidates
  • Scientific hypothesis generation: novel molecular biology hypotheses preferred ~80% of the time in blind expert comparisons

Mythos 5 existing tells you that the underlying model has materially stronger capabilities than what Fable 5 exposes. For most production workflows that’s invisible — but for security research, drug discovery, and similar domains, the Mythos branch is where the actual frontier lives. Public access is gated specifically because Anthropic concluded the raw model has dual-use risks that the classifier safeguards exist to mitigate.

What the Sonnet pattern break tells you

Two weeks ago I argued that Anthropic’s historical pattern of pairing Opus and Sonnet minor versions made a Sonnet 4.7 release more likely than Sonnet 4.8. The pattern broke harder than I expected: no Sonnet 4.7, no Sonnet 4.8, and now the Mythos branch is the headline release. Opus 4.8 shipped to fill the Pro-tier slot; Fable 5 occupies a new tier entirely above it.

Three interpretations of what this means for the Claude lineup going forward:

  1. The Mythos branch is the new frontier, with Sonnet and Opus as production-tier choices below it. The capability ceiling has visibly raised.
  2. Naming has decoupled from versioning. Going forward, expect more named branches (Mythos, Fable) and less of the Opus/Sonnet/Haiku triad. The string-in-source-map approach to predicting model releases is dead.
  3. Pricing tier separation is widening. Fable 5 at $10/$50 is 2× Opus 4.8, which was already 5× Haiku 4.5. The cost of frontier capability is rising faster than the cost of production-grade capability — which means routing decisions matter more than they used to.

Where Fable 5 fits in production today

Concrete deployment reads:

Use Fable 5 for:

  • Code-heavy workflows where SWE-Bench Pro performance directly matters — large-scale migrations, novel algorithm work, performance-critical refactors. The 11-point lead over Opus 4.8 translates to measurably better output on real codebases.
  • Vision-rich knowledge work — extracting structured data from scientific figures, rebuilding web apps from screenshots, processing technical documentation with embedded diagrams.
  • Long-context reasoning — Anthropic’s framing emphasizes “maintaining focus across millions of tokens,” which suggests the model’s degradation curve at the 1M+ context end is meaningfully better than the prior frontier.
  • Finance/trading analysis — independent benchmarks (Hebbia) put Fable 5 at the top of finance-specific evaluations.

Stay on cheaper alternatives for:

  • High-volume / low-stakes generation — at $50/1M output, Fable 5 is uneconomical for content generation, classification, or structured extraction. Sonnet 4.6 or Gemini 3.5 Flash do this work at ~10% of the price.
  • MCP / tool-orchestrated agent workflows — Gemini 3.5 Flash currently leads MCP Atlas and Toolathlon at a fraction of the cost. Fable’s coding strength doesn’t automatically translate to agent-orchestration strength.
  • Cyber or bio queries where Fable will fall back anyway — you’re paying Fable pricing for Opus 4.8 output. Just use Opus 4.8 directly.

How to access Claude Fable 5 today

Three deployment paths:

  1. Direct via Anthropicclaude-fable-5 is live on the Claude API and across AWS Bedrock, Vertex AI, and Microsoft Foundry. Free on Pro/Max plans through June 22; per-token billing kicks in after that at $10/$50.
  2. Through the WaveSpeedAI LLM endpoint — OpenAI-compatible access to the current frontier text models behind a single API key. When Fable 5 propagates through the platform, you’ll be able to A/B-test it against Opus 4.8, Sonnet 4.6, GPT-5.5, and Gemini 3.5 Flash under the same surface without rotating provider credentials.
  3. Through provider routers — if you’re on an aggregator (Vercel AI SDK, LangChain, etc.), the claude-fable-5 model ID is already in the public model registry. Routing policies for “use Fable for SWE-heavy tasks, Sonnet for everything else” become a one-line config change.

What to watch for in the next two weeks

Three signals:

  1. June 22 pricing transition. When the free Pro/Max window ends, the public adoption signal becomes how many subscribers actually pay overage for Fable 5. That’s the cleanest read on whether the 2× Opus 4.8 pricing is sustainable for the long tail of use cases.
  2. Independent benchmark replication. Anthropic’s SWE-Bench Pro and FrontierCode numbers are first-party. Watch for replication by independent benchmark suites — and for evidence of whether the 11-point lead over Opus 4.8 holds outside Anthropic-curated test sets.
  3. The next Sonnet release. With Mythos as the new frontier branch, what happens to Sonnet? An updated Sonnet positioned against Gemini 3.5 Flash on cost/agent benchmarks would re-anchor the value tier of the Claude lineup. Silence on Sonnet would signal Anthropic is letting the production tier ride on 4.6 for longer than the historical cadence suggests.

Until then: Fable 5 is the new ceiling, Opus 4.8 is the production-grade default, and Sonnet 4.6 remains the value choice. The next two weeks will tell you whether the Mythos branch is a one-off frontier or the new normal for Anthropic’s release cadence.

Sources: Anthropic’s Fable 5 / Mythos 5 announcement, The Decoder benchmarks breakdown, TechCrunch on the launch and safety context, CNBC coverage, Amazon Bedrock availability.