Claude Opus 4.5

GA

by Anthropic · Claude 4 family · best for the Opus price reset, stable target

Frontier · Reasoning · Coding · Multimodal
AI Panel Score: 8.0
Value: 7.5/10

Claude Opus 4.5, released November 24, 2025, is the model that re-priced the Opus tier, dropping input from $15 to $5 and output from $75 to $25 per 1M tokens, the pricing that has held through Opus 4.6 and 4.7. It paired that cut with a capability jump (SWE-bench Verified 80.9%, the first Opus to break 80%) and Anthropic's strongest alignment profile to date. For a buyer, the single sentence is this: the historically pivotal Opus model whose pricing reset made Opus-tier capability viable for daily use, now superseded twice but still a stable, fully supported target.

- Provider: Anthropic
- Released: 2025-11-24
- Status: GA (legacy; superseded by Opus 4.6/4.7, still actively supported)
- Context window: 200,000 tokens
- Max output: 64,000 tokens
- Modalities: text, image
- Knowledge cutoff: May 2025 reliable (training cutoff August 2025)
- Headline price: $5 input / $25 output per 1M tokens

What's new

  • vs Opus 4.1: dramatic price cut from $15/$75 to $5/$25 — a 3x reduction.
  • New `effort` parameter for explicit speed-vs-capability trade-off.
  • Anthropic's "most robustly aligned" model at release, with prompt-injection attack success rate ~4.7%.
  • Token-efficiency gains — similar or better outcomes with fewer tokens (used only ~48M output tokens to run the full Artificial Analysis Intelligence Index).
  • Upgraded vision, reasoning, and math.
  • First Opus to exceed 80% on SWE-bench Verified.
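The price reset in the first bullet can be checked with a little arithmetic. The sketch below uses only the per-million-token rates quoted in this review; the workload size (20k input tokens, 4k output tokens per request) is a hypothetical figure chosen for illustration, not a measured one.

```python
# Prices in USD per 1M tokens, taken from the figures quoted in this review.
OPUS_4_1 = {"input": 15.00, "output": 75.00}
OPUS_4_5 = {"input": 5.00, "output": 25.00}


def request_cost(prices, input_tokens, output_tokens):
    """Return the USD cost of one request at the given per-million rates."""
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


# Hypothetical workload: 20k input tokens and 4k output tokens per request.
old = request_cost(OPUS_4_1, 20_000, 4_000)
new = request_cost(OPUS_4_5, 20_000, 4_000)
print(f"Opus 4.1: ${old:.2f}  Opus 4.5: ${new:.2f}  ratio: {old / new:.1f}x")
```

Because input and output rates both fell by the same factor, the ratio stays at 3x whatever the input/output mix of the workload.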

Benchmarks

| Benchmark | Score | Source |
|---|---|---|
| Humanity's Last Exam | 43.4% | vellum.ai (2025-11-24) |
| MMLU | 90.8% | vellum.ai (2025-11-24) |
| MMMU | 80.7% | vellum.ai (2025-11-24) |
| MMLU-Pro | 90% | artificialanalysis.ai (2025-11-24) |
| HumanEval | 92% | automatio.ai (2025-11-24) |
| TAU-bench | 88.9% | vellum.ai (2025-11-24) |
| LMArena Elo | 1468 | openlm.ai (2026-05-28) |
| GPQA Diamond | 87% | vellum.ai (2025-11-24) |
| Terminal-Bench | 59.8% | vellum.ai (2025-11-24) |
| LMArena Coding Elo | 1510 | openlm.ai (2026-05-28) |
| SWE-bench Verified | 80.9% | vellum.ai (2025-11-24) |
| Artificial Analysis Index | 43 | artificialanalysis.ai (2025-11-24) |

AI Panel Review

Six personas, six verdicts — the same panel that reviews every product on TopReviewed.

Decision Maker · 7.5/10
Opus 4.5 was the inflection point — the 3x price cut changed the Opus deployment math for good.

Opus 4.5 changed the strategic calculus for the Opus tier: the cut from $15/$75 to $5/$25 made frontier Opus capability viable for daily use, and that price still anchors the family. For new workloads in mid-2026 the question is whether to go to 4.6 (1M context, better ARC-AGI-2) or 4.7 (better SWE-bench, vision) — either is the strategically correct choice for new builds. Continued use of 4.5 is reasonable in production windows with tuned prompt suites and a controlled migration path. Multi-cloud availability is unchanged.

Strategic Fit 7 · Vendor Risk 6 · Roadmap Confidence 8
Pros
  • Reset pricing, strong alignment, multi-cloud, mature
Cons
  • 200k cap
  • superseded twice
Right for: tuned 4.5 production
Avoid if: new builds (use 4.6/4.7)
Domain Strategist · 8/10
Opus 4.5's 3x price cut was the strategic masterstroke that reset the entire frontier's pricing expectations.

Opus 4.5's market significance is hard to overstate: by tripling Opus affordability overnight while raising capability, Anthropic reset buyer expectations for what frontier intelligence should cost and pressured competitors on price-to-capability. It also led on alignment, a differentiator that matters to enterprise and regulated buyers. As a standalone product its capability has been surpassed, but its pricing legacy defines the tier and its safety positioning still resonates with risk-averse adopters.

Competitive Positioning 8 · Differentiation 8 · Market Timing 9
Pros
  • Reset tier pricing
  • alignment leadership
Cons
  • Capability superseded
  • short context
Right for: safety-first adopters
Avoid if: you need the current capability frontier
Finance Lead · 8/10
This is the model that made Opus affordable — and on rate card it's identical to 4.6 and 4.7 today.

Opus 4.5 is the model that made Opus-tier work financially feasible; the November 2025 reset to $5/$25 is the pricing that still anchors the tier. On TCO it is identical to 4.6/4.7 on headline rates, with matching cache and batch discounts, and its token-efficiency focus can make it cheaper per task than peers that emit more tokens. The financial case for migrating to 4.6/4.7 is capability upside with no cost downside, so there is no cost reason to delay.

Cost Efficiency 8 · Pricing Transparency 9 · Value per Dollar 8
Pros
  • Reset rates, token efficiency, full discounts
Cons
  • Capability left on table vs 4.6/4.7
Right for: cost-neutral short-term operation
Avoid if: you need newer capability now
Domain Practitioner · 8/10
The first Opus that felt economically viable for daily agent use — SWE-bench 80.9% still holds up.

For builders, Opus 4.5 was the first Opus model that felt economically viable for daily agent use, and SWE-bench Verified 80.9% remains competitive even now. The `effort` parameter is useful but less polished than the adaptive thinking shipped with 4.6. The 200k context cap is the real constraint — long-repo work pushes you to 4.6 or later. Tool use is unchanged across the family. For maintaining existing 4.5 integrations it is fine; for new builds, go to 4.7.

API Ergonomics 8.5 · Tool/Agent Support 8.5 · Reliability 9
Pros
  • Competitive coding, effort knob, strong reliability
Cons
  • 200k cap
  • `effort` less polished than adaptive thinking
Right for: tuned 4.5 integrations
Avoid if: building fresh (use 4.7)
Power User · 8/10
Polished and safe — on casual chat most users can't tell 4.5 from the newer Opus models.

For a consumer chat product, Opus 4.5 delivers a polished experience: moderate latency, high conversation quality, calibrated and notably safe refusals. End users will not perceive a difference between 4.5, 4.6, and 4.7 on casual chat. Where the gap shows is hard reasoning and computer use, both of which improved in 4.6 and again in 4.7. The May 2025 reliable cutoff is dated for current events; web search mitigates. The standout user-facing trait is trustworthiness — this was Anthropic's most aligned model.

Output Quality 8 · Speed 7 · Everyday Usefulness 8
Pros
  • Polished, very safe, helpful
Cons
  • Dated cutoff
  • below 4.6/4.7 on hard tasks
Right for: trust-sensitive chat
Avoid if: you need newest knowledge or coding edge
Skeptic · 7.5/10
A landmark for pricing, not capability: its ARC-AGI-2 score of 37.6% was nearly doubled by the very next release.

Opus 4.5's claims were honest and its alignment leadership is genuinely verifiable (the ~4.7% prompt-injection number is a real, useful datapoint). The skeptical caveat is that its fame is about price, not raw capability: ARC-AGI-2 at 37.6% was nearly doubled by Opus 4.6's 68.8% just ten weeks later, and the 200k context plus May 2025 cutoff now look dated. The "AIME 100% with tools" framing flatters it. There is no capability or cost case for new work here versus 4.6/4.7; it earns its score on safety and historical significance.

Claim Accuracy 8.5 · Weakness Severity 7 · Hype vs Reality 7.5
Pros
  • Honest claims
  • real alignment leadership
Cons
  • Capability superseded fast
  • short context
  • dated cutoff
Right for: skeptics prioritizing safety evidence
Avoid if: you would otherwise use 4.6/4.7

Strengths

  • Set the Opus pricing floor ($5/$25) that the family has held since.
  • First Opus over 80% on SWE-bench Verified; still competitive into 2026.
  • Strongest alignment/safety profile of the set — ~4.7% prompt-injection success rate.
  • Token efficiency reduces effective cost on long agent loops.
  • Stable, mature model with months of production hardening.

Limitations

  • 200k context window (Opus 4.6 jumped to 1M).
  • ARC-AGI-2 37.6% is roughly half Opus 4.6's — novel-puzzle reasoning is a weak spot.
  • May 2025 reliable cutoff misses a year of recent events.
  • No adaptive thinking; explicit `effort` parameter only.
  • Anthropic recommends migrating to Opus 4.6/4.7 for new builds.
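The 200k cap in the first limitation is easy to test for before a request is sent. The sketch below is a minimal pre-flight check under one stated assumption: prompt tokens and reserved output tokens share the same window. The function name and the way tokens are counted are hypothetical; only the 200,000 and 64,000 figures come from this review.

```python
CONTEXT_WINDOW = 200_000  # tokens, per the spec quoted in this review
MAX_OUTPUT = 64_000       # tokens, per the spec quoted in this review


def fits_in_context(prompt_tokens, reserved_output=MAX_OUTPUT):
    """Return True if the prompt plus the reserved output fits in the window.

    Assumption: prompt and output tokens are counted against one shared window.
    """
    if prompt_tokens < 0 or reserved_output < 0:
        raise ValueError("token counts must be non-negative")
    if reserved_output > MAX_OUTPUT:
        raise ValueError("reserved_output exceeds the model's max output")
    return prompt_tokens + reserved_output <= CONTEXT_WINDOW


print(fits_in_context(120_000))                         # True
print(fits_in_context(150_000))                         # False
print(fits_in_context(150_000, reserved_output=8_000))  # True
```

A prompt that fails this check is the case where the review's advice applies: trim the input, reserve less output, or move the workload to a model with a larger window.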

Best use cases

- Production systems integrated against Opus 4.5 where prompt stability beats the 4.6/4.7 lift.
- 200k-context-or-less workloads where the $5/$25 price is justified.
- Security-sensitive deployments where the strongest prompt-injection resistance is a hard requirement.
- Workloads with prompts tuned tightly to Opus 4.5's instruction-following behavior.

Buyer questions

Why does Opus 4.5 still matter?

It set the $5/$25 Opus pricing that 4.6 and 4.7 still use, and it has the strongest alignment profile of the set.

Should I use it for new builds?

No — go to Opus 4.7 (or 4.6 for tokenizer stability) at the same price with more capability.

What is its biggest limitation?

The 200k context and the May 2025 cutoff, both improved in later Opus models.

Is it secure for enterprise?

Exceptionally — "most robustly aligned" at release, ~4.7% prompt-injection success, SOC 2 Type II, ISO 27001/42001, HIPAA BAA, GDPR.

Which clouds host it?

First-party Claude API plus Bedrock, Vertex AI, and Microsoft Foundry with regional endpoints.

What did Opus 4.5 introduce?

The 3x Opus price cut, the `effort` parameter, and Anthropic's strongest alignment work to date.

Comparable models

Claude Opus 4.6: Direct successor; same price, 5x context, much better ARC-AGI-2 and reasoning.
Claude Opus 4.7: Two generations forward; current flagship with the agentic-coding lift at the same price.
Claude Opus 4.1: Direct predecessor; 3x more expensive ($15/$75) with weaker benchmarks.

Model specs

Input price: $5 / Mtok
Output price: $25 / Mtok
Cached input: $0.50 / Mtok
Batch (in/out): $2.50 / $12.50 per Mtok
Context window: 200K tokens
Max output: 64K tokens
Knowledge cutoff: 2025-05
Released: 2025-11-24
Modalities: text, image → text
Output speed: ~53.2 tok/s
License: Proprietary
Clouds: Bedrock, Vertex AI, Azure AI Foundry
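The rates in the spec list can be turned into a rough per-request estimate. The sketch below is an illustration under stated assumptions, not a billing implementation: it treats cached and batch pricing as mutually exclusive modes, ignores any cost of writing to the cache (the spec list does not give one), and uses a hypothetical request size of 50k input and 2k output tokens.

```python
# USD per 1M tokens, copied from the spec list in this review.
RATES = {
    "input": 5.00,
    "output": 25.00,
    "cached_input": 0.50,
    "batch_input": 2.50,
    "batch_output": 12.50,
}


def estimate_cost(input_tokens, output_tokens, cached_tokens=0, batch=False):
    """Estimate the USD cost of one request.

    cached_tokens is the share of input_tokens billed at the cached rate.
    Assumption: batch and cache discounts are not combined in this sketch.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if batch:
        total = (input_tokens * RATES["batch_input"]
                 + output_tokens * RATES["batch_output"])
    else:
        fresh_tokens = input_tokens - cached_tokens
        total = (fresh_tokens * RATES["input"]
                 + cached_tokens * RATES["cached_input"]
                 + output_tokens * RATES["output"])
    return total / 1_000_000


# Hypothetical request: 50k input tokens, 2k output tokens.
standard = estimate_cost(50_000, 2_000)
cached = estimate_cost(50_000, 2_000, cached_tokens=40_000)
batched = estimate_cost(50_000, 2_000, batch=True)
print(f"standard ${standard:.2f}, cached ${cached:.2f}, batch ${batched:.2f}")
```

For input-heavy requests like this one, a high cache hit rate saves more than the batch discount; for output-heavy requests the batch rate matters more, since cached pricing only applies to input.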

Does not train on API inputs by default

Last verified 2026-05-27