Claude Sonnet 4.5

GA

by Anthropic · Claude 4 family · best as a stable legacy workhorse

Coding · Multimodal · Cost-Optimized
AI Panel Score: 7.5 · Value: 8.0/10

Claude Sonnet 4.5, released September 29, 2025, was Anthropic's workhorse model until Sonnet 4.6 superseded it in February 2026. It introduced state-of-the-art coding at its tier and long-horizon focus (30+ hours on task in Anthropic testing), and it remains fully supported at the same $3/$15 price as 4.6. For a buyer, the one-sentence summary is this: a still-capable, well-hardened model whose only real gaps versus 4.6 are the 200k context window (vs 1M) and weaker computer use. It is fine to keep during a controlled migration window, but new builds should target 4.6.

- Provider: Anthropic
- Released: 2025-09-29
- Status: GA (legacy; superseded by Sonnet 4.6, still actively supported)
- Context window: 200,000 tokens
- Max output: 64,000 tokens
- Modalities: text, image
- Knowledge cutoff: January 2025 reliable (training cutoff July 2025)
- Headline price: $3 input / $15 output per 1M tokens

What's new

  • vs Sonnet 4: state-of-the-art coding at release, with focus held across multi-step tasks for 30+ hours in Anthropic testing.
  • Claude Code checkpoint system for progress saving and rollback shipped here first.
  • Native VS Code extension at launch.
  • Context editing and memory tools for long-running agents.
  • File creation (spreadsheets, slides, documents) directly in claude.ai.
  • Claude Agent SDK released alongside the model.
  • Enhanced alignment work: reduced deception and power-seeking behaviors in evaluations.

Benchmarks

Benchmark | Score | Source
MMLU | 89.1% | leanware.co, 2025-09-29
AIME 2025 | 87% | leanware.co, 2025-09-29
TAU-bench | 86.2% | leanware.co, 2025-09-29
LMArena Elo | 1420 | openlm.ai, 2026-05-28
GPQA Diamond | 83.4% | leanware.co, 2025-09-29
LMArena Coding Elo | 1464 | openlm.ai, 2026-05-28
SWE-bench Verified | 77.2% | morphllm.com, 2025-09-29
Artificial Analysis Index | 43 | artificialanalysis.ai, 2025-09-29

AI Panel Review

Six personas, six verdicts — the same panel that reviews every product on TopReviewed.

Decision Maker · 7/10
Sonnet 4.5 was the right pick for two quarters; now the only strategic question is migration timing to 4.6.

From October 2025 through February 2026 this was the default workhorse. With Sonnet 4.6 now available at the same price with 1M context and large OSWorld/ARC gains, the strategic question is migration timing, not continued use. Anthropic keeps 4.5 fully supported, but new deployments should standardize on 4.6. The risk in staying is incrementally dated training data and the eventual deprecation cycle. For production traffic, plan a controlled rollout to 4.6 over a quarter and treat 4.5 as exit-only.

Strategic Fit 7 · Vendor Risk 6 · Roadmap Confidence 8
Pros
  • Mature, same price as 4.6, multi-cloud
Cons
  • 200k cap, dated cutoff, superseded
Right for: controlled migration windows
Avoid if: starting new builds (use 4.6)
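
The controlled rollout described in this verdict could be implemented in many ways. The sketch below shows one common pattern, deterministic percentage routing, under stated assumptions: the two model labels are hypothetical placeholders (substitute the exact model IDs from your provider's documentation), and `pick_model` is an illustrative helper, not part of any SDK.

```python
import hashlib

# Hypothetical placeholder labels, not real API model IDs.
LEGACY_MODEL = "sonnet-4.5"
TARGET_MODEL = "sonnet-4.6"


def pick_model(user_id: str, rollout_percent: int) -> str:
    """Route a stable share of users to the target model."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    # Hash the user ID so the same user always lands in the same bucket (0-99).
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    # Users whose bucket falls below the threshold get the target model.
    return TARGET_MODEL if bucket < rollout_percent else LEGACY_MODEL


if __name__ == "__main__":
    for percent in (0, 25, 50, 100):
        print(percent, pick_model("customer-42", percent))
```

Because the bucket depends only on the user ID, raising `rollout_percent` over the quarter only ever moves users from 4.5 to 4.6, never back, which keeps prompt re-validation results comparable between stages.
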
Domain Strategist · 7.5/10
Sonnet 4.5's legacy is the agent tooling — checkpoints, the Agent SDK, VS Code — that still anchors the ecosystem.

Sonnet 4.5's market significance is less about benchmarks than about the agent infrastructure it launched: the Claude Code checkpoint system, the native VS Code extension, context editing, memory tools, and the Claude Agent SDK all shipped with it. That tooling built the ecosystem moat that 4.6 and 4.7 now inherit. As a standalone product its differentiation has faded, but its strategic footprint as the model that operationalized Claude agents remains large.

Competitive Positioning 7 · Differentiation 7 · Market Timing 8
Pros
  • Launched durable agent tooling
  • trusted
Cons
  • Superseded by 4.6 on capability and context
Right for: legacy agent stacks
Avoid if: you need current capability
Finance Lead · 7.5/10
Same $3/$15 as 4.6 with no cost downside to migrating — and a soft cost reason to move on document-heavy work.

Pricing is identical to Sonnet 4.6 at $3/$15, so the cost case for staying or moving is a wash on rate card, and cache/batch discounts match. The one budget consideration is the 200k context, which forces chunking workflows that can use more total tokens than equivalent work on 4.6's 1M window. Net: no financial reason to delay migration, and a soft financial reason to accelerate it on document-heavy pipelines.

Cost Efficiency 8 · Pricing Transparency 9 · Value per Dollar 8
Pros
  • Same rates as 4.6, full discounts
Cons
  • 200k cap can inflate token use via chunking
Right for: short-term cost-neutral operation
Avoid if: document-heavy work (4.6 is cheaper in practice)
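
The Finance Lead's chunking point can be made concrete with a rough token count. This is a minimal sketch, not a billing model: the function name, the per-call prompt size, the output reserve, and the chunk overlap are all illustrative assumptions, and it assumes every input token is billed at the same flat rate.

```python
import math


def chunked_input_tokens(doc_tokens, window, prompt_tokens, reserve_output, overlap):
    """Return (calls, total input tokens) for a document split to fit one window."""
    # Document tokens that fit in one call after the prompt and output reserve.
    capacity = window - prompt_tokens - reserve_output
    if capacity <= overlap:
        raise ValueError("window too small for this prompt, reserve, and overlap")
    if doc_tokens <= capacity:
        calls = 1
    else:
        # After the first chunk, each call advances by (capacity - overlap) new tokens.
        calls = 1 + math.ceil((doc_tokens - capacity) / (capacity - overlap))
    # Every call resends the prompt; every call after the first resends the overlap.
    total = doc_tokens + calls * prompt_tokens + (calls - 1) * overlap
    return calls, total


if __name__ == "__main__":
    # Illustrative 600k-token document with assumed prompt, reserve, and overlap sizes.
    for window in (200_000, 1_000_000):
        calls, total = chunked_input_tokens(600_000, window, 2_000, 8_000, 5_000)
        print(f"window={window:,}: {calls} call(s), {total:,} input tokens")
```

Under these assumptions the overhead comes from resending the prompt and the overlap on each extra call, so it grows with the number of chunks rather than with document size alone. Pipelines that also carry running summaries between chunks would see more overhead than this sketch counts.
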
Domain Practitioner · 7.5/10
The checkpoint system and VS Code extension shipped here first — but for browser agents, 4.6's computer use wins.

For builders, Sonnet 4.5 was the daily driver for Q4 2025 and into early 2026. Tool use is identical to later models, the Claude Code checkpoint system shipped here first, and the VS Code extension is mature. The honest gap is computer use and OSWorld — Sonnet 4.6 is meaningfully better, so for any browser-control or agentic workflow, 4.6 is the right move. Pure coding is close enough between 4.5 and 4.6 that you can stay on 4.5 if your prompt suite depends on its behavior; otherwise migrate.

API Ergonomics 9 · Tool/Agent Support 8 · Reliability 8.5
Pros
  • Mature tooling, identical API, strong coding
Cons
  • Weaker computer use
  • 200k cap
Right for: tuned legacy agent loops
Avoid if: building browser/computer-use agents
Power User · 8/10
Fast and high-quality — on casual chat most users won't notice the gap to 4.6 at all.

For a consumer chat product, Sonnet 4.5 delivered a strong experience: fast latency, high conversation quality, calibrated refusals. End users will not perceive a meaningful gap between 4.5 and 4.6 on casual chat. Where 4.6 wins on user-visible quality is hard reasoning, computer use, and very long sessions where context limits matter. Safety calibration is mature. The January 2025 reliable cutoff is starting to feel dated for current events — web search mitigates but does not eliminate it.

Output Quality 8 · Speed 8.5 · Everyday Usefulness 8
Pros
  • Fast, helpful, calibrated, mature
Cons
  • Dated cutoff
  • below 4.6 on hard tasks
Right for: everyday chat surfaces
Avoid if: you need newest knowledge or long sessions
Skeptic · 7.5/10
A fine 2025 model whose ARC-AGI-2 of 13.6% shows just how far the goalposts moved in one release.

Sonnet 4.5 was genuinely strong at release and the coding numbers held up, so there is no deception in the original claims. The skeptical point is generational: ARC-AGI-2 at 13.6% versus 58.3% on Sonnet 4.6 shows how quickly the frontier moved, and the 200k context plus January 2025 cutoff now look dated next to 4.6 at identical price. The AIME "100% with tools" framing also flatters the model — the no-tools 87% is the honest number. There is no cost or capability case for new work here; it is a maintenance model.

Claim Accuracy 8 · Weakness Severity 7 · Hype vs Reality 7.5
Pros
  • Original claims were honest
  • mature
Cons
  • Superseded at same price
  • dated cutoff
  • weak novel reasoning
Right for: skeptics maintaining legacy stacks
Avoid if: you would otherwise use 4.6

Strengths

  • Long-horizon focus: held coherent multi-step work for 30+ hours in testing.
  • Strong coding for its tier — widely deployed in Claude Code through Q4 2025.
  • Reliable agent behavior with context editing and memory tools.
  • Same $3/$15 pricing as Sonnet 4.6, so cost is not a migration driver.
  • Mature and well-hardened by months of production use; notable alignment gains.

Limitations

  • 200k context only (Sonnet 4.6 jumped to 1M) — the main reason to migrate.
  • ARC-AGI-2 13.6% reflects the pre-jump generation; novel-puzzle reasoning is weak.
  • January 2025 reliable knowledge cutoff is now meaningfully dated.
  • Computer use less hardened than Sonnet 4.6 (OSWorld 61.4% vs 72.5%).
  • Anthropic recommends migrating to Sonnet 4.6 for new builds.

Best use cases

- Production systems running stable Claude Code workflows not yet re-validated on Sonnet 4.6.
- Agent loops tuned to Sonnet 4.5's specific instruction-following behavior.
- Cost-conscious workloads where 200k context is sufficient and prompt stability is valued.

Buyer questions

Should I still use Sonnet 4.5?

For existing tuned stacks, short-term yes; for new builds, use Sonnet 4.6 at the same price.

What is the biggest gap to 4.6?

Context (200k vs 1M) and computer use (OSWorld 61.4% vs 72.5%).

Is migration costly?

No — same rate card; mostly prompt re-validation, and 4.6's larger context can reduce chunking overhead.

Is it secure for enterprise?

Yes — no training on inputs, SOC 2 Type II, ISO 27001/42001, HIPAA BAA, GDPR.

Which clouds host it?

First-party Claude API plus Bedrock, Vertex AI, and Microsoft Foundry; it was the first model with Bedrock global/regional endpoints.

What did Sonnet 4.5 introduce?

The Claude Agent SDK, Claude Code checkpoints, the VS Code extension, and context/memory tools.

Comparable models

Claude Sonnet 4.6: Direct successor; same price, 5x context, better OSWorld and ARC-AGI-2 — the migration target.
Claude Opus 4.6 / 4.7: Higher-tier flagships; ~1.7x input cost, stronger on hard reasoning.
GPT-5 (OpenAI): Comparable workhorse from a competing provider; trade-offs vary by workload.

Model specs

Input price: $3 / Mtok
Output price: $15 / Mtok
Cached input: $0.30 / Mtok
Batch (in/out): $1.50 / $7.50 per Mtok
Context window: 200K tokens
Max output: 64K tokens
Knowledge cutoff: 2025-01
Released: 2025-09-29
Modalities: text, image → text
Output speed: ~60 tok/s
License: Proprietary
Clouds: Bedrock, Vertex AI, Azure AI Foundry

Does not train on API inputs by default

Last verified 2026-05-27