DeepSeek R1 (0528) vs Grok 4.20

Best cell per row highlighted. Null means undisclosed — never counted as zero.

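DeepSeek R1 (0528) (DeepSeek, GA): AI Panel score 8.0
Grok 4.20 (xAI, GA): AI Panel score 6.9

Values in the rows below appear in that order: DeepSeek R1 (0528) first, Grok 4.20 second.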

Identity & lifecycle

Provider: DeepSeek | xAI
Family / tier: DeepSeek R1 · Reasoning | Grok 4 · Reasoning
Status: GA | GA
Released: 2025-05-27 | 2026-03-09
Knowledge cutoff: 2025-04 | 2024-11

Architecture & context

Context window: 128K | 1M
Max output tokens: 64K
Input modalities: text | text, image
Reasoning mode: always | optional
Open weights: Yes | No

Pricing (per Mtok)

Input: $0.55 | $1.25
Output: $2.19 | $2.50
Cached input: $0.14 | $0.20
Batch input:
Free tier: Yes | Yes
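
As a rough illustration of how the per-Mtok prices above translate into a bill, the sketch below prices a hypothetical workload. The token counts, the `PRICES` dictionary, and the `job_cost` helper are assumptions made for illustration; only the dollar figures come from the pricing rows, and cached-input discounts are ignored.

```python
# Prices in USD per million tokens, copied from the pricing rows above.
PRICES = {
    "DeepSeek R1 (0528)": {"input": 0.55, "output": 2.19},
    "Grok 4.20": {"input": 1.25, "output": 2.50},
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one job, ignoring cached-input discounts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 0.5M output tokens.
    for name in PRICES:
        print(f"{name}: ${job_cost(name, 2_000_000, 500_000):.3f}")
```

For reasoning models the output token count can be much larger than the visible answer, so the output price tends to dominate real bills; the workload figures here are placeholders to replace with measured usage.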

Speed

Output speed (tok/s): 171.4
Time to first token (s): 13.24
Latency tier: slow | medium

Trust & deployment

Trains on inputs: Yes | Yes
License: MIT | Proprietary
Clouds:
Compliance:

AI Panel scoring

Unified score: 8 | 6.9
Decision Maker: 8 | 7
Domain Strategist: 8.5 | 6.5
Finance Lead: 9 | 6.5
Domain Practitioner: 8.5 | 7.5
Power User: 8 | 6.5
Skeptic: 7.5 | 6
Value score: 9.2 | 7

Benchmarks

MMLU: 93.4
MMLU-Pro: 85
GPQA Diamond: 81 | 78.5
AIME 2025: 87.5
MATH-500: 87.3
SWE-bench Verified: 57.6
LiveCodeBench: 73.3
Aider Polyglot: 71.6
IFEval: 81
TAU-bench: 63.9 | 93
SimpleQA: 27.8
Humanity's Last Exam: 17.7
LMArena Elo: 1382 | 1491
Artificial Analysis Index: 68 | 49
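
To give the LMArena Elo gap a more tangible reading, the sketch below applies the standard Elo expected-score formula, E_a = 1 / (1 + 10^((R_b - R_a) / 400)). It assumes the leaderboard ratings behave like classic Elo ratings on a 400-point scale, which is only an approximation of how the leaderboard is fitted; the function name `expected_win_rate` is hypothetical.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Classic Elo expected score of A against B (ties count as half a win)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings taken from the LMArena Elo row: DeepSeek R1 (0528) 1382, Grok 4.20 1491.
grok_vs_deepseek = expected_win_rate(1491, 1382)
deepseek_vs_grok = expected_win_rate(1382, 1491)

if __name__ == "__main__":
    print(f"Grok 4.20 expected win rate: {grok_vs_deepseek:.1%}")
    print(f"DeepSeek R1 (0528) expected win rate: {deepseek_vs_grok:.1%}")
```

Under that assumption, a 109-point gap corresponds to roughly a two-in-three preference for the higher-rated model in head-to-head votes, not a guarantee on any single prompt.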

Scores link to their sources. Missing cells mean the vendor hasn't disclosed a result — honesty over padding.
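
The rule that a missing cell means undisclosed, not zero, matters whenever scores are aggregated. A minimal sketch, assuming results are held as `Optional[float]` values with `None` standing in for an undisclosed cell; the helper name `mean_disclosed` and the sample values are illustrative only and are not taken from the tables above.

```python
from typing import Optional, Sequence


def mean_disclosed(scores: Sequence[Optional[float]]) -> Optional[float]:
    """Average only the disclosed scores; return None if nothing is disclosed."""
    disclosed = [s for s in scores if s is not None]
    if not disclosed:
        return None
    return sum(disclosed) / len(disclosed)


if __name__ == "__main__":
    sample = [80.0, None, 90.0]  # made-up scores, one undisclosed
    print(mean_disclosed(sample))
    # Treating the missing cell as zero would drag the average down:
    print(sum(s or 0.0 for s in sample) / len(sample))
```

Returning `None` for an all-missing row keeps "no data" distinct from a genuine score of zero, which is the distinction the note above asks readers to preserve.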