DeepSeek R1 (0528) vs QwQ-32B vs Magistral Medium 1.2

The best cell in each row is highlighted. A dash in a cell means the value is undisclosed; it is never counted as zero.

DeepSeek R1 (0528) · DeepSeek · GA · AI Panel score 8.0

QwQ-32B · Alibaba Cloud · GA · AI Panel score 6.8

Magistral Medium 1.2 · Mistral AI · GA · AI Panel score 7.2

Identity & lifecycle

Model	DeepSeek R1 (0528)	QwQ-32B	Magistral Medium 1.2
Provider	DeepSeek	Alibaba Cloud	Mistral AI
Family / tier	DeepSeek R1 · Reasoning	QwQ · Reasoning	Magistral · Reasoning
Status	GA	GA	GA
Released	2025-05-27	2025-03-04	2025-09-17
Knowledge cutoff	2025-04	2024-09	2025-06

Architecture & context

Context window	128K	131K	131K
Max output tokens	64K	33K	41K
Input modalities	text	text	text, image
Reasoning mode	always	always	always
Open weights	Yes	Yes	No

Pricing (per Mtok)

Input	$0.55	$0.12	$2
Output	$2.19	$0.18	$5
Cached input	$0.14	—	—
Batch input	—	—	$1
Free tier	Yes	Yes	Yes
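
To make the per-Mtok prices concrete, here is a minimal sketch that turns the Input and Output rows above into a cost for one request. The function name `request_cost` and the `PRICES` dictionary are illustrative, not part of any vendor API, and the sketch assumes no cached-input or batch discount applies.

```python
# Input and output prices in dollars per million tokens, copied from the table above.
PRICES = {
    "DeepSeek R1 (0528)": (0.55, 2.19),
    "QwQ-32B": (0.12, 0.18),
    "Magistral Medium 1.2": (2.00, 5.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in dollars of one request at list prices (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    # Prices are quoted per million tokens, so scale the token counts down.
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    for model in PRICES:
        cost = request_cost(model, input_tokens=10_000, output_tokens=2_000)
        print(f"{model}: ${cost:.5f}")
```

All three models always run in reasoning mode, so the output token count that matters for a real bill may be much larger than the visible answer; how reasoning tokens are billed is vendor-specific and not stated here.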

Speed

Output speed (tok/s)	—	—	38.9
Time to first token (s)	—	—	1.7
Latency tier	slow	slow	slow
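
As a rough illustration of what the two speed figures imply together, the sketch below estimates wall-clock time for a response as time to first token plus output length divided by output speed. This is a simplifying assumption (constant throughput, no queueing), and the function name is hypothetical. Only Magistral Medium 1.2 discloses both numbers, so it is the only model the estimate can be run for.

```python
def estimated_response_seconds(output_tokens: int, ttft_s: float, tokens_per_s: float) -> float:
    """Estimate total response time, assuming a constant output speed."""
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    # Wait for the first token, then stream the rest at a steady rate.
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    # Magistral Medium 1.2 figures from the table above: 1.7 s TTFT, 38.9 tok/s.
    seconds = estimated_response_seconds(1_000, ttft_s=1.7, tokens_per_s=38.9)
    print(f"about {seconds:.1f} s for 1,000 output tokens")
```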

Trust & deployment

Trains on inputs	Yes	No	No
License	MIT	Apache-2.0	Proprietary
Clouds	—	GCP	Azure AI Foundry
Compliance	—	—	SOC2, ISO27001, GDPR

AI Panel scoring

Unified score	8	6.8	7.2
Decision Maker	8	7	7
Domain Strategist	8.5	7	6.5
Finance Lead	9	7	7
Domain Practitioner	8.5	7.5	7.5
Power User	8	6	7.5
Skeptic	7.5	6.5	6.5
Value score	9.2	8	7

Benchmarks

MMLU	93.4	—	—
MMLU-Pro	85	—	—
GPQA Diamond	81	65.2	76.26
AIME 2025	87.5	—	83.48
MATH-500	—	90.6	—
SWE-bench Verified	57.6	—	—
LiveCodeBench	73.3	63.4	—
Aider Polyglot	71.6	—	—
MMMU	—	—	70
IFEval	—	83.9	—
TAU-bench	63.9	—	—
SimpleQA	27.8	—	—
Humanity's Last Exam	17.7	—	—
LMArena Elo	1382	—	—
Artificial Analysis Index	68	—	27

Scores link to their sources. A missing cell means the vendor has not disclosed a result; it is left empty rather than padded with an estimate.
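
The rule that an undisclosed value is never counted as zero can be sketched in code as follows. This is an illustrative sketch, not the site's actual implementation: `best_index` is a hypothetical helper, undisclosed cells are represented as `None`, and treating a row with a single disclosed value as having a "best" cell is an assumption.

```python
from typing import Optional, Sequence

MODELS = ["DeepSeek R1 (0528)", "QwQ-32B", "Magistral Medium 1.2"]


def best_index(cells: Sequence[Optional[float]], higher_is_better: bool = True) -> Optional[int]:
    """Return the column index of the best disclosed cell, or None if none is disclosed."""
    # Drop undisclosed cells entirely instead of coercing them to 0.
    disclosed = [(value, i) for i, value in enumerate(cells) if value is not None]
    if not disclosed:
        return None
    pick = max if higher_is_better else min
    return pick(disclosed, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    # A few rows copied from the Benchmarks table; None stands for an undisclosed cell.
    rows = {
        "GPQA Diamond": [81, 65.2, 76.26],
        "LiveCodeBench": [73.3, 63.4, None],
        "MATH-500": [None, 90.6, None],
    }
    for name, cells in rows.items():
        idx = best_index(cells)
        print(name, "->", MODELS[idx] if idx is not None else "undisclosed")
```

For rows where lower is better, such as prices, the same helper can be called with `higher_is_better=False`.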