by Anthropic · Claude 4 family · best for the Opus price reset, stable target
Claude Opus 4.5, released November 24, 2025, is the model that re-priced the Opus tier, dropping input from $15 to $5 and output from $75 to $25 per 1M tokens, the pricing that has held through Opus 4.6 and 4.7. It paired that cut with a capability jump (SWE-bench Verified 80.9%, the first Opus to break 80%) and Anthropic's strongest alignment profile to date. For a buyer, the single sentence is this: the historically pivotal Opus model whose pricing reset made Opus-tier capability viable for daily use, now superseded twice but still a stable, fully supported target.

- Provider: Anthropic
- Released: 2025-11-24
- Status: GA (legacy; superseded by Opus 4.6/4.7, still actively supported)
- Context window: 200,000 tokens
- Max output: 64,000 tokens
- Modalities: text, image
- Knowledge cutoff: May 2025 reliable (training cutoff August 2025)
- Headline price: $5 input / $25 output per 1M tokens
| Benchmark | Score | Source (date) |
|---|---|---|
| Humanity's Last Exam | 43.4% | vellum.ai, 2025-11-24 |
| MMLU | 90.8% | vellum.ai, 2025-11-24 |
| MMMU | 80.7% | vellum.ai, 2025-11-24 |
| MMLU-Pro | 90% | artificialanalysis.ai, 2025-11-24 |
| HumanEval | 92% | automatio.ai, 2025-11-24 |
| TAU-bench | 88.9% | vellum.ai, 2025-11-24 |
| LMArena Elo | 1468 | openlm.ai, 2026-05-28 |
| GPQA Diamond | 87% | vellum.ai, 2025-11-24 |
| Terminal-Bench | 59.8% | vellum.ai, 2025-11-24 |
| LMArena Coding Elo | 1510 | openlm.ai, 2026-05-28 |
| SWE-bench Verified | 80.9% | vellum.ai, 2025-11-24 |
| Artificial Analysis Index | 43 | artificialanalysis.ai, 2025-11-24 |
Six personas, six verdicts — the same panel that reviews every product on TopReviewed.
“Opus 4.5 was the inflection point — the 3x price cut changed the Opus deployment math for good.”
Opus 4.5 changed the strategic calculus for the Opus tier: the cut from $15/$75 to $5/$25 made frontier Opus capability viable for daily use, and that price still anchors the family. For new workloads in mid-2026 the question is whether to go to 4.6 (1M context, better ARC-AGI-2) or 4.7 (better SWE-bench, vision) — either is the strategically correct choice for new builds. Continued use of 4.5 is reasonable in production windows with tuned prompt suites and a controlled migration path. Multi-cloud availability is unchanged.
“Opus 4.5's 3x price cut was the strategic masterstroke that reset the entire frontier's pricing expectations.”
Opus 4.5's market significance is hard to overstate: by tripling Opus affordability overnight while raising capability, Anthropic reset buyer expectations for what frontier intelligence should cost and pressured competitors on price-to-capability. It also led on alignment, a differentiator that matters to enterprise and regulated buyers. As a standalone product its capability has been surpassed, but its pricing legacy defines the tier and its safety positioning still resonates with risk-averse adopters.
“This is the model that made Opus affordable — and on rate card it's identical to 4.6 and 4.7 today.”
Opus 4.5 is the model that made Opus-tier work financially feasible; the November 2025 reset to $5/$25 is the pricing that still anchors the tier. On TCO it is identical to 4.6/4.7 on headline rates, with matching cache and batch discounts, and its token-efficiency focus can make it cheaper per task than peers that emit more tokens. The financial case for migrating to 4.6/4.7 is that there is no cost downside and a clear capability upside, so cost is not a reason to delay migration.
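The per-task arithmetic behind that verdict can be sketched in a few lines of Python. This is a minimal illustration, not a billing tool: the rates are the headline per-1M-token prices quoted in this review, while the token counts and the `task_cost` helper are hypothetical, and cache and batch discounts are left out because the review gives no figures for them.

```python
# Headline rates in USD per 1M tokens, as quoted in this review.
OLD_OPUS = {"input": 15.00, "output": 75.00}  # Opus pricing before the reset
OPUS_4_5 = {"input": 5.00, "output": 25.00}   # Opus 4.5, held through 4.6 and 4.7


def task_cost(rates, input_tokens, output_tokens):
    """Return the USD cost of one task at the given per-1M-token rates."""
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Hypothetical task: 20,000 input tokens and 4,000 output tokens.
old = task_cost(OLD_OPUS, 20_000, 4_000)      # 0.30 + 0.30 = 0.60 USD
new = task_cost(OPUS_4_5, 20_000, 4_000)      # 0.10 + 0.10 = 0.20 USD

# Hypothetical peer at the same rates that emits 50% more output tokens.
verbose = task_cost(OPUS_4_5, 20_000, 6_000)  # 0.10 + 0.15 = 0.25 USD

print(f"old: ${old:.2f}  new: ${new:.2f}  verbose peer: ${verbose:.2f}")
```

With identical rates, the only lever left is token count, which is why a model that emits fewer output tokens per task comes out cheaper even on the same rate card.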
“The first Opus that felt economically viable for daily agent use — SWE-bench 80.9% still holds up.”
For builders, Opus 4.5 was the first Opus model that felt economically viable for daily agent use, and SWE-bench Verified 80.9% remains competitive even now. The `effort` parameter is useful but less polished than the adaptive thinking shipped with 4.6. The 200k context cap is the real constraint — long-repo work pushes you to 4.6 or later. Tool use is unchanged across the family. For maintaining existing 4.5 integrations it is fine; for new builds, go to 4.7.
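A quick pre-flight check makes the 200k constraint concrete. The sketch below uses the context window and max output figures from this review's spec sheet; the `fits_opus_4_5` helper is hypothetical, and it assumes that input tokens and requested output tokens share the same context window, which should be confirmed against the provider's documentation before relying on it.

```python
CONTEXT_WINDOW = 200_000  # tokens, from this review's spec sheet
MAX_OUTPUT = 64_000       # tokens, from this review's spec sheet


def fits_opus_4_5(input_tokens, requested_output_tokens):
    """Check a request against the published limits.

    Assumption: input and output tokens count against one shared window.
    """
    if requested_output_tokens > MAX_OUTPUT:
        return False
    return input_tokens + requested_output_tokens <= CONTEXT_WINDOW


# Hypothetical workloads, with token counts estimated by the caller.
print(fits_opus_4_5(150_000, 32_000))  # mid-sized repo snapshot
print(fits_opus_4_5(180_000, 32_000))  # long-repo work
```

Under these assumptions the first request fits and the second does not, which is the point at which long-repo work would push a builder toward a larger-context model.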
“Polished and safe — on casual chat most users can't tell 4.5 from the newer Opus models.”
For a consumer chat product, Opus 4.5 delivers a polished experience: moderate latency, high conversation quality, calibrated and notably safe refusals. End users will not perceive a difference between 4.5, 4.6, and 4.7 on casual chat. Where the gap shows is hard reasoning and computer use, both of which improved in 4.6 and again in 4.7. The May 2025 reliable cutoff is dated for current events; web search mitigates. The standout user-facing trait is trustworthiness — this was Anthropic's most aligned model.
“A landmark for pricing, not capability: its ARC-AGI-2 score of 37.6% was nearly doubled by the very next release.”
Opus 4.5's claims were honest and its alignment leadership is genuinely verifiable (the ~4.7% prompt-injection number is a real, useful datapoint). The skeptical caveat is that its fame is about price, not raw capability: ARC-AGI-2 at 37.6% was nearly doubled by Opus 4.6's 68.8% just ten weeks later, and the 200k context plus May 2025 cutoff now look dated. The "AIME 100% with tools" framing flatters it. There is no capability or cost case for new work here versus 4.6/4.7; it earns its score on safety and historical significance.
- Production systems integrated against Opus 4.5 where prompt stability beats the 4.6/4.7 lift.
- 200k-context-or-less workloads where the $5/$25 price is justified.
- Security-sensitive deployments where the strongest prompt-injection resistance is a hard requirement.
- Workloads with prompts tuned tightly to Opus 4.5's instruction-following behavior.
It set the $5/$25 Opus pricing that 4.6 and 4.7 still use, and it has the strongest alignment profile of the set.
No — go to Opus 4.7 (or 4.6 for tokenizer stability) at the same price with more capability.
The 200k context and the May 2025 cutoff, both improved in later Opus models.
Exceptionally — "most robustly aligned" at release, ~4.7% prompt-injection success, SOC 2 Type II, ISO 27001/42001, HIPAA BAA, GDPR.
First-party Claude API plus Bedrock, Vertex AI, and Microsoft Foundry with regional endpoints.
The 3x Opus price cut, the `effort` parameter, and Anthropic's strongest alignment work to date.
Does not train on API inputs by default
Last verified 2026-05-27