by Mistral AI · Codestral family · best for low-latency code completion and FIM
Codestral 25.08 (model ID codestral-2508, shipped end of July 2025) is Mistral's IDE code-completion specialist: a low-latency, high-frequency model built for tab-complete, fill-in-the-middle (FIM), code correction, and test generation across 80+ programming languages, exposed via a dedicated `/v1/fim/completions` endpoint. IMPORTANT: it is a Premier/proprietary, closed-weight model with a 128K context, not an Apache-2.0 open-weight model and not 256K (an earlier draft conflated it with the separate open "Codestral 2" relicensing). Priced at $0.30 input / $0.90 output per 1M tokens. The buyer's sentence: the right model to power an editor's autocomplete via API, not a self-hostable open coder.

- Provider: Mistral AI (Paris, France)
- Release: 2025-07-31, status GA
- Context: 128,000 tokens; max output 16,384
- Modalities: text only (no vision)
- Knowledge cutoff: ~May 2025
- Headline price: $0.30 input / $0.90 output per 1M tokens
- Specialty: code completion, fill-in-the-middle (FIM), code correction, test generation
- Weights: Premier / proprietary (closed)
| Benchmark | Score | Source |
|---|---|---|
| HumanEval | 86.6% | mistral.ai, 2025-07-31 |
Six personas, six verdicts — the same panel that reviews every product on TopReviewed.
“The right brain for an in-product autocomplete feature via API — just don't confuse it with an agentic coder or expect to self-host it.”
Codestral 25.08 fills a specific slot: the low-latency completion engine behind an editor or product. For a SaaS that wants a "Copilot inside our product" experience, it is a clean API choice at $0.30/$0.90 with class-leading FIM. The strategic caveat is that it is Premier/closed — there is no self-host for this model, so the sovereignty story is limited to EU-hosted API. It is not a Devstral 2 / Medium 3.5 substitute; those are agents. Value-per-dollar is high for the right use case and low if you mistake it for a generalist coder. Pick it for autocomplete, route agentic work elsewhere.
“It owns the IDE-completion slot in Mistral's coding stack, but its closed license caps the on-prem story that Devstral Small carries.”
Codestral anchors the completion layer of Mistral's enterprise coding stack (Codestral 25.08 + Devstral + Codestral Embed + Mistral Code). Strategically it competes with the model behind GitHub Copilot on FIM quality and with open coders on price. Its weakness as a strategic asset is the closed license: the self-host / on-prem completion story belongs to Devstral Small 2 (Apache 2.0) or the open Codestral 2, not codestral-2508. So as a differentiator it is "best-in-class FIM via API," a real but bounded position. Mature integrations (Continue.dev, Tabnine, Mistral Code) give it distribution.
“$0.30/$0.90 is cheap, and for a high-frequency completion workload the per-call economics are excellent — but there's no self-host lever here.”
At $0.30/$0.90 with a $0.03 cached-input rate and ~50% batch discount, Codestral is inexpensive per call, which matters enormously for autocomplete, a workload of thousands of small calls per developer per day. The financial caveat versus an earlier assumption: there is no self-host capex lever for this specific model (it is closed), so it is a pure API-economics play. For the high-frequency completion use case the API cost is low enough to be a non-issue; for teams that wanted to amortise GPUs, the open path is Devstral Small 2. Excellent unit economics within its scope.
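The per-call arithmetic behind that verdict can be sketched in a few lines. The three rates are the ones quoted in this review; the function name and the workload figures (calls per day, tokens per call, cache hit fraction) are hypothetical, chosen only for illustration, and the batch discount is left out.

```python
# Rates quoted in the review, in USD per 1M tokens.
INPUT_RATE = 0.30
OUTPUT_RATE = 0.90
CACHED_INPUT_RATE = 0.03


def daily_cost_per_developer(calls_per_day: int,
                             input_tokens: int,
                             output_tokens: int,
                             cached_fraction: float = 0.0) -> float:
    """Estimate one developer's daily API spend in USD.

    cached_fraction is the share of input tokens billed at the cached
    rate (a hypothetical knob; real cache hit rates vary by workload).
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    fresh = input_tokens * (1.0 - cached_fraction) * INPUT_RATE
    cached = input_tokens * cached_fraction * CACHED_INPUT_RATE
    output = output_tokens * OUTPUT_RATE
    # Rates are per 1M tokens, so scale the per-call total down.
    return calls_per_day * (fresh + cached + output) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2,000 completions/day, 1,500 tokens of
    # context in, 30 tokens out per call.
    print(f"no cache:  ${daily_cost_per_developer(2000, 1500, 30):.3f}")
    print(f"80% cache: ${daily_cost_per_developer(2000, 1500, 30, 0.8):.3f}")
```

Under those assumed numbers the spend comes to roughly $0.95 per developer per day without caching and about $0.31 with 80% of input tokens cached, which is the sense in which API cost is a non-issue for this use case.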
“This is the autocomplete I actually want — FIM quality is class-leading, latency is low, and the dedicated FIM endpoint is clean.”
As the autocomplete engine in an editor or product, Codestral does the job: FIM quality is genuinely class-leading, latency is low, and the `/v1/fim/completions` endpoint is purpose-built. 128K context means it sees enough repo to make sensible completions. I don't use it as a chat companion — that's Medium 3.5's job — but for tab-complete it's excellent. The constraint to internalise is that it's closed, so I can't fine-tune it on a private codebase the way I could with Devstral Small 2 (Apache 2.0). For pure completion ergonomics, a strong specialist.
“I never see Codestral directly — I see fast, usually-correct completions in my editor, which is exactly the right standard for autocomplete.”
End users experience Codestral as autocomplete suggestions, not as a chatbot. Subjectively, completions feel snappy and are frequently correct for mainstream languages, and less reliable for newer languages or obscure frameworks. The UX is "felt only when wrong," which is the correct bar for autocomplete: a good completion model disappears into the workflow. Good enough that I don't switch to alternatives. There is no conversational or vision dimension to evaluate; this is infrastructure that either helps invisibly or annoys when it misfires.
“Solid FIM model — but the prior write-up claimed Apache 2.0 and 256K; it's actually closed and 128K. Verify the model, not the family.”
Codestral 25.08 is a genuinely good completion model, so the skepticism here is about provenance, not quality. The Codestral brand spans three things — the 2024 open 22B (MNPL), the 25.08 Premier API model, and a separate "Codestral 2" relicensed Apache 2.0 — and they are easy to conflate (an earlier draft did, claiming 25.08 was Apache 2.0 and 256K). The verified reality: codestral-2508 is Premier/closed with a 128K context. The lesson is to check the exact model ID and its docs page, because the marketing umbrella blurs the licensing. On capability the FIM claim holds; on openness, it does not.
- IDE / editor extensions for tab-complete and FIM (the core purpose).
- Code-correction passes and lint-style suggestions at scale.
- Automated test scaffolding for newly written functions.
- High-volume code search / explanation where latency dominates.
- In-product completion features served via API at low per-call cost.
No — this specific model is Premier/closed. For self-host code completion, use Devstral Small 2 (Apache 2.0) or the separate open Codestral 2.
No — that applies to a different "Codestral 2" model. codestral-2508 remains proprietary. Always check the model ID.
128K (not 256K). Enough for multi-file completion context.
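Even with 128K of context, an editor integration usually has to decide how much of the surrounding file to send. One possible way to do that is sketched here: keep the text nearest the cursor on both sides. The 4-characters-per-token estimate, the 75/25 prefix/suffix split, and the function name are all assumptions for illustration, not anything Mistral documents; a real integration would count tokens with the model's tokenizer.

```python
CONTEXT_TOKENS = 128_000  # context window stated in this review
CHARS_PER_TOKEN = 4       # assumption: crude heuristic, not the real tokenizer


def trim_for_fim(prefix: str, suffix: str,
                 reserve_output: int = 256,
                 prefix_share: float = 0.75,
                 context_tokens: int = CONTEXT_TOKENS) -> tuple[str, str]:
    """Trim prefix/suffix so both fit an approximate token budget.

    Keeps the END of the prefix and the START of the suffix, since the
    code nearest the cursor is the most relevant to the completion.
    """
    budget_chars = (context_tokens - reserve_output) * CHARS_PER_TOKEN
    if budget_chars <= 0:
        return "", ""
    if len(prefix) + len(suffix) <= budget_chars:
        return prefix, suffix

    prefix_chars = int(budget_chars * prefix_share)
    suffix_chars = budget_chars - prefix_chars
    # If one side is shorter than its share, hand the slack to the other.
    if len(prefix) < prefix_chars:
        suffix_chars = budget_chars - len(prefix)
    elif len(suffix) < suffix_chars:
        prefix_chars = budget_chars - len(suffix)

    # Guard against prefix[-0:], which would return the whole string.
    trimmed_prefix = prefix[-prefix_chars:] if prefix_chars > 0 else ""
    trimmed_suffix = suffix[:suffix_chars]
    return trimmed_prefix, trimmed_suffix
```

Weighting the budget toward the prefix is a design choice rather than a requirement: code before the cursor tends to carry the definitions a completion depends on, but the split is worth tuning per editor.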
`/v1/fim/completions` — a dedicated fill-in-the-middle API for inserting code between a prefix and suffix, ideal for editor autocomplete.
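A minimal sketch of what a request to that endpoint might look like, using only the Python standard library. The path and model ID come from this review; the host `https://api.mistral.ai`, the `prompt`/`suffix` field names, and bearer-token auth are assumptions to be checked against Mistral's API reference. The helper (a hypothetical name) only builds the request and does not send it.

```python
import json
import urllib.request

API_BASE = "https://api.mistral.ai"  # assumption: public API host
FIM_PATH = "/v1/fim/completions"     # endpoint path named in this review


def build_fim_request(prefix: str, suffix: str, api_key: str,
                      model: str = "codestral-2508",
                      max_tokens: int = 64) -> urllib.request.Request:
    """Build (but do not send) a fill-in-the-middle request."""
    payload = {
        "model": model,
        "prompt": prefix,   # code before the cursor (assumed field name)
        "suffix": suffix,   # code after the cursor (assumed field name)
        "max_tokens": max_tokens,
        "temperature": 0.0,  # deterministic output suits autocomplete
    }
    return urllib.request.Request(
        API_BASE + FIM_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_fim_request("def add(a, b):\n    return ", "\n", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending it would be a matter of passing the request to `urllib.request.urlopen` with a real key; the model is expected to return the code that belongs between the prefix and the suffix.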
No — it's a completion model. For planning multi-file edits and opening PRs, use Medium 3.5 or Devstral 2.
At $0.30/$0.90 with caching, thousands of small completion calls per developer per day stay inexpensive.
Bedrock, Azure AI Foundry, and Vertex AI, plus La Plateforme.
Does not train on API inputs by default
Last verified 2026-05-27