DeepSeek R1 (0528) vs Magistral Medium 1.2 vs Grok 4.20

Best cell per row highlighted. Null means undisclosed — never counted as zero.

DeepSeek R1 (0528) | DeepSeek | GA | AI Panel: 8.0
Magistral Medium 1.2 | Mistral AI | GA | AI Panel: 7.2
Grok 4.20 | xAI | GA | AI Panel: 6.9

Identity & lifecycle

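Field | DeepSeek R1 (0528) | Magistral Medium 1.2 | Grok 4.20
Provider | DeepSeek | Mistral AI | xAI
Family / tier | DeepSeek R1 · Reasoning | Magistral · Reasoning | Grok 4 · Reasoning
Status | GA | GA | GA
Released | 2025-05-27 | 2025-09-17 | 2026-03-09
Knowledge cutoff | 2025-04 | 2025-06 | 2024-11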

Architecture & context

Context window | 128K | 131K | 1M
Max output tokens: 64K; 41K
Input modalities | text | text, image | text, image
Reasoning mode | always | always | optional
Open weights | Yes | No | No

Pricing (per Mtok)

Input | $0.55 | $2 | $1.25
Output | $2.19 | $5 | $2.5
Cached input: $0.14; $0.2
Batch input: $1
Free tier | Yes | Yes | Yes
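
The per-Mtok prices above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, assuming only the Input and Output rows apply (no cached or batch discounts) and that every billed token is charged at the listed rate; the names `PRICES_PER_MTOK` and `request_cost` are hypothetical and not part of any vendor API.

```python
# Listed prices in USD per million tokens: (input, output).
# Values are copied from the Input and Output rows of the pricing table.
PRICES_PER_MTOK = {
    "DeepSeek R1 (0528)": (0.55, 2.19),
    "Magistral Medium 1.2": (2.00, 5.00),
    "Grok 4.20": (1.25, 2.50),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at list price.

    Assumption: no cached-input or batch pricing is applied.
    """
    input_price, output_price = PRICES_PER_MTOK[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example workload: 10,000 input tokens and 2,000 output tokens.
    for name in PRICES_PER_MTOK:
        print(f"{name}: ${request_cost(name, 10_000, 2_000):.5f}")
```

Because reasoning models can emit many output tokens per answer, the output rate usually dominates the total, so it is worth estimating cost with a realistic output length rather than comparing input prices alone.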

Speed

Output speed (tok/s)38.9171.4
Time to first token (s)1.713.24
Latency tier | slow | slow | medium

Trust & deployment

Trains on inputs | Yes | No | Yes
License | MIT | Proprietary | Proprietary
Clouds: Azure AI Foundry
Compliance: SOC2, ISO27001, GDPR

AI Panel scoring

Unified score | 8 | 7.2 | 6.9
Decision Maker | 8 | 7 | 7
Domain Strategist | 8.5 | 6.5 | 6.5
Finance Lead | 9 | 7 | 6.5
Domain Practitioner | 8.5 | 7.5 | 7.5
Power User | 8 | 7.5 | 6.5
Skeptic | 7.5 | 6.5 | 6
Value score | 9.2 | 7 | 7

Benchmarks

MMLU: 93.4
MMLU-Pro: 85
GPQA Diamond | 81 | 76.26 | 78.5
AIME 2025: 87.5; 83.48
MATH-500: 87.3
SWE-bench Verified: 57.6
LiveCodeBench: 73.3
Aider Polyglot: 71.6
MMMU: 70
IFEval: 81
TAU-bench: 63.9; 93
SimpleQA: 27.8
Humanity's Last Exam: 17.7
LMArena Elo: 1382; 1491
Artificial Analysis Index682749

Scores link to their sources. Missing cells mean the vendor hasn't disclosed a result — honesty over padding.
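
The two conventions stated on this page (best cell per row, and undisclosed values never counted as zero) can be sketched in a few lines. This is an illustrative sketch only, not the site's actual implementation: the helper names `best_cell` and `mean_disclosed` and the `higher_is_better` flag are assumptions, and the row containing `None` is a made-up example rather than data from the tables.

```python
from typing import Optional, Sequence

Cell = Optional[float]  # None stands for an undisclosed value


def best_cell(values: Sequence[Cell], higher_is_better: bool = True) -> Optional[int]:
    """Return the column index of the best disclosed value.

    Undisclosed cells (None) are skipped rather than treated as 0.
    Returns None when no cell in the row is disclosed.
    """
    disclosed = [(v, i) for i, v in enumerate(values) if v is not None]
    if not disclosed:
        return None
    pick = max if higher_is_better else min
    return pick(disclosed, key=lambda pair: pair[0])[1]


def mean_disclosed(values: Sequence[Cell]) -> Optional[float]:
    """Average only the disclosed cells; None if nothing is disclosed."""
    disclosed = [v for v in values if v is not None]
    return sum(disclosed) / len(disclosed) if disclosed else None


if __name__ == "__main__":
    gpqa_diamond = [81, 76.26, 78.5]      # row taken from the Benchmarks table
    input_price = [0.55, 2, 1.25]         # row taken from the Pricing table
    hypothetical_row = [None, 5.0, 3.0]   # made-up row with one undisclosed cell

    print(best_cell(gpqa_diamond))                         # 0
    print(best_cell(input_price, higher_is_better=False))  # 0 (lowest price wins)
    print(best_cell(hypothetical_row))                     # 1
    print(mean_disclosed(hypothetical_row))                # 4.0
```

Treating `None` as zero would rank an undisclosed model last on score rows and first on price rows, which is why the sketch filters undisclosed cells out before comparing.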