by Anthropic · Claude 4 family · best for high-volume low-latency worker
Claude Haiku 4.5 is Anthropic's fastest, lowest-cost model, released October 15, 2025, delivering near-frontier coding and agent performance at $1/$5 per 1M tokens. The headline: it matches or exceeds Sonnet 4 on coding and computer use at roughly one-third the cost and more than twice the speed, and it is the first Haiku with extended thinking. For a buyer, the single sentence is this: route the long tail of work here, and only pay Sonnet or Opus when a job actually needs them. - Provider: Anthropic - Released: 2025-10-15 - Status: GA (current Haiku tier) - Context window: 200,000 tokens - Max output: 64,000 tokens - Modalities: text, image - Knowledge cutoff: February 2025 reliable (training cutoff July 2025) - Headline price: $1 input / $5 output per 1M tokens
| Benchmark | Score | Source |
|---|---|---|
| MMLU-Pro | 76% | artificialanalysis.ai 2025-10-15T00:00:00.000Z |
| LMArena Elo | 1378 | openlm.ai 2026-05-28T00:00:00.000Z |
| GPQA Diamond | 67.2% | caylent.com 2025-10-15T00:00:00.000Z |
| Terminal-Bench | 41.75% | anthropic.com 2025-10-15T00:00:00.000Z |
| LMArena Coding Elo | 1436 | openlm.ai 2026-05-28T00:00:00.000Z |
| SWE-bench Verified | 73.3% | anthropic.com 2025-10-15T00:00:00.000Z |
| Artificial Analysis Index | 31 | artificialanalysis.ai 2025-10-15T00:00:00.000Z |
Six personas, six verdicts — the same panel that reviews every product on TopReviewed.
“Haiku 4.5 is the cost backbone — it makes Anthropic-on-everything viable by absorbing the long tail of requests.”
Strategically, Haiku 4.5 is what lets you standardize on Anthropic without blowing the budget: route the long tail here and pay Sonnet or Opus only when a job justifies it. Coverage on Bedrock, Vertex, and first-party is fine for failover, and ASL-2 is appropriate for the tier. The 200k context is the planning cap — long-document jobs still need Sonnet 4.6 or Opus 4.7. Its biggest strategic value is enabling multi-tier router patterns that cut blended cost without obvious quality drops on routine work.
“Near-frontier coding at $1/$5 is the wedge that defends Anthropic's high-volume base against cheap rivals.”
Haiku 4.5 occupies the price-sensitive, high-concurrency segment where defection risk to cheaper providers is highest. Matching Sonnet 4 on coding and computer use at one-third the cost is a differentiation story that keeps cost-conscious teams inside the Anthropic ecosystem and feeds the multi-agent pattern Anthropic is promoting. Market timing is good: it shipped as agent architectures matured and worker-tier demand exploded. The competitive pressure is real, though — GPT-5 nano/mini and Gemini Flash contest this tier hard on input price and context size.
“At $1/$5, dropping to $0.50/$2.50 on batch and $0.10 cache reads, this is the cheapest path to Anthropic quality.”
This is the most cost-efficient model in the Claude lineup and the primary lever for blended-cost optimization. At $1/$5 it is 3x cheaper than Sonnet and 5x cheaper than Opus on input; batch takes it to $0.50/$2.50 and cache hits to $0.10. For a stable, well-cached agent workload you can reach fractions of a cent per call, and the support-ticket math (~$37 per 10k tickets) is excellent. The risk to flag in budget reviews is creep: teams start on Haiku and "temporarily" promote to Sonnet, then never come back. Tier discipline is the entire game.
“The API is identical to Sonnet and Opus — Haiku is the model you put behind everything that doesn't need to be smart.”
For builders, Haiku 4.5 is the workhorse for everything that does not need to be smart: the API surface is identical to Sonnet and Opus, tool use works the same way, and extended thinking is now available for medium-complexity tasks that previously had to be punted upstream. SWE-bench Verified 73.3% means it actually closes real GitHub issues, not just toy questions, and the function-calling reliability makes it a clean multi-agent worker. The 200k context is the practical limit and the only thing that forces escalation regularly. Latency is the fastest in the family, which makes tight loops pleasant.
“Snappy and helpful — most users won't notice it isn't Sonnet until they push into hard reasoning.”
For a consumer-facing chat product, Haiku 4.5 delivers a snappy, helpful experience: replies arrive fast, refusals are calibrated, and conversation quality is far better than the Haiku tier was a year ago. Users will not notice they are not on Sonnet unless they push into hard reasoning or long context. The personality is more transactional than Sonnet's slightly warmer voice, which suits utility products (support, search, transactional assistants) more than companion products. Safety is mature and ASL-2 is fine for these surfaces.
“Cheap and fast, yes — but 'near-frontier' means near Sonnet 4, not near Opus, and the context cap bites fast.”
The value is genuine and the latency is best-in-family, so the core pitch holds. But "near-frontier intelligence" is generous: Haiku 4.5 matches Sonnet 4 (a 2025 model), not the current frontier, and GPQA Diamond 67.2% plus an AA Index of 31 confirm it is a budget model, not a stealth flagship. The 200k context is half a generation behind the 1M Sonnet/Opus offer, and the February 2025 knowledge cutoff is the oldest shipping, so real-time tasks lean heavily on web search. The promotion-creep dynamic also means the advertised savings often do not survive contact with production. Buy it for what it is: a fast, cheap worker.
- High-volume customer support — Anthropic's worked example is ~$37 per 10,000 conversations on Haiku 4.5. - Multi-agent architectures where Haiku is the worker and Sonnet/Opus is the planner. - Real-time chat, classification, and routing surfaces where sub-second latency matters. - Computer-use agents at scale where Sonnet's per-step cost would be prohibitive. - Batch document classification and structured-extraction pipelines.
With good caching and batch, fractions of a cent per call; Anthropic's example is ~$37 per 10,000 support tickets.
When you exceed the 200k context, need hard scientific/math reasoning, or need the warmest conversational tone.
Yes — full first-party tool suite plus explicit extended thinking (the first Haiku to offer it); no adaptive thinking.
For current events, yes — enable web search/web fetch; for stable domains it is a non-issue.
Yes — no training on inputs, SOC 2 Type II, ISO 27001/42001, HIPAA BAA, GDPR; deployed at ASL-2.
Multi-agent worker tier: Haiku executes many cheap fast steps, Sonnet/Opus plans and handles hard decisions.
Comparable budget tier; similar price band with different strength profiles; OpenAI often offers larger context at the low end.
Larger context and cheaper input, but weaker on coding benchmarks.
Does not train on API inputs by default
Last verified 2026-05-27