Llama 3.3 70B vs Llama 4 Maverick vs Llama 4 Scout

The best cell in each row is highlighted. A dash means the value is undisclosed; it is never counted as zero.

Model	Llama 3.3 70B	Llama 4 Maverick	Llama 4 Scout
Provider	Meta	Meta	Meta
Status	GA	GA	GA
AI Panel score	7.4	7.7	7.5

Identity & lifecycle

Provider	Meta	Meta	Meta
Family / tier	Llama 3 · Large	Llama 4 · Maverick	Llama 4 · Scout
Status	GA	GA	GA
Released	2024-12-05	2025-04-04	2025-04-04
Knowledge cutoff	2023-12	2024-08	2024-08

Architecture & context

Context window	128K	1M	10M
Max output tokens	4K	8K	8K
Input modalities	text	text, image	text, image
Reasoning mode	none	none	none
Open weights	Yes	Yes	Yes

Pricing (per Mtok)

Input	$0.12	$0.20	$0.10
Output	$0.40	$0.85	$0.34
Cached input	—	—	—
Batch input	—	—	—
Free tier	No	No	No
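
To make the per-Mtok prices concrete, the sketch below estimates the bill for one workload. The prices are copied from the table above; the workload size (10M input tokens, 2M output tokens) and the helper name `workload_cost` are illustrative assumptions, and real bills may differ by cloud or reseller.

```python
# Per-million-token prices (USD) copied from the Pricing table above.
PRICES = {
    "Llama 3.3 70B": {"input": 0.12, "output": 0.40},
    "Llama 4 Maverick": {"input": 0.20, "output": 0.85},
    "Llama 4 Scout": {"input": 0.10, "output": 0.34},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the listed per-Mtok prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 10M input tokens and 2M output tokens.
for name in PRICES:
    print(f"{name}: ${workload_cost(name, 10_000_000, 2_000_000):.2f}")
```

At this input-heavy mix, Llama 4 Scout is the cheapest of the three and Llama 4 Maverick the most expensive; a more output-heavy mix widens the gap, because output prices differ more than input prices.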

Speed

Output speed (tok/s)	81.8	104.3	106.1
Time to first token (s)	0.4	0.66	0.56
Latency tier	fast	fast	fast
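
The two speed figures can be combined into a rough end-to-end estimate: time to first token plus output length divided by output speed. The sketch below assumes decoding speed stays constant for the whole response, which is a simplification. The 500-token response length and the function name `estimated_response_time` are hypothetical.

```python
# Speed figures copied from the Speed table above.
SPEED = {
    "Llama 3.3 70B": {"tok_per_s": 81.8, "ttft_s": 0.40},
    "Llama 4 Maverick": {"tok_per_s": 104.3, "ttft_s": 0.66},
    "Llama 4 Scout": {"tok_per_s": 106.1, "ttft_s": 0.56},
}


def estimated_response_time(model: str, output_tokens: int) -> float:
    """Estimate seconds to stream a full response.

    Assumes a constant decode rate: TTFT + output_tokens / (tokens per second).
    """
    stats = SPEED[model]
    return stats["ttft_s"] + output_tokens / stats["tok_per_s"]


# Hypothetical 500-token response.
for name in SPEED:
    print(f"{name}: {estimated_response_time(name, 500):.1f} s")
```

Under this model, Llama 3.3 70B has the lowest time to first token but the longest total time for a 500-token response, because its output speed is lower.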

Trust & deployment

Trains on inputs	No	No	No
License	Llama-3-Community	Llama-4-Community	Llama-4-Community
Clouds	Bedrock, Vertex AI, Azure AI Foundry, GCP, OCI, IBM watsonx	Bedrock, Vertex AI, Azure AI Foundry, GCP, OCI, IBM watsonx	Bedrock, Vertex AI, Azure AI Foundry, GCP, OCI, IBM watsonx
Compliance	—	—	—

AI Panel scoring

Unified score	7.4	7.7	7.5
Decision Maker	8	8.5	8.5
Domain Strategist	7	7.5	7.5
Finance Lead	8	9	9
Domain Practitioner	8	7.5	7.5
Power User	7	6.5	6.5
Skeptic	6.5	5.5	5.5
Value score	8.5	9	9.5

Benchmarks

MMLU	86	85.5	79.6
MMLU-Pro	68.9	80.5	74.3
GPQA Diamond	50.5	69.8	57.2
MATH-500	77	61.2	50.3
HumanEval	88.4	85.8	82
LiveCodeBench	—	43.4	32.8
Aider Polyglot	—	15.6	—
MMMU	—	73.4	69.4
IFEval	92.1	—	—
LMArena Elo	1257	1271	—
Artificial Analysis Index	14	18	14

Scores link to their sources. Missing cells mean the vendor hasn't disclosed a result; they are left empty rather than padded.
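
The rule that undisclosed results are never counted as zero can be sketched as follows. This is a minimal illustration, not the page's actual highlighting logic. It assumes higher is better on every benchmark row, represents undisclosed cells as `None`, and uses a hypothetical helper `best_model`. Only a few rows from the Benchmarks table are included.

```python
from typing import Optional

MODELS = ["Llama 3.3 70B", "Llama 4 Maverick", "Llama 4 Scout"]

# A subset of the Benchmarks table; None marks an undisclosed result.
BENCHMARKS: dict[str, list[Optional[float]]] = {
    "MMLU": [86, 85.5, 79.6],
    "MATH-500": [77, 61.2, 50.3],
    "LiveCodeBench": [None, 43.4, 32.8],
    "Aider Polyglot": [None, 15.6, None],
    "IFEval": [92.1, None, None],
}


def best_model(row: list[Optional[float]]) -> Optional[str]:
    """Return the model with the highest disclosed score, or None if no score is disclosed.

    Undisclosed cells are skipped entirely rather than treated as 0.
    """
    disclosed = [(score, name) for score, name in zip(row, MODELS) if score is not None]
    if not disclosed:
        return None
    return max(disclosed, key=lambda pair: pair[0])[1]


for benchmark, row in BENCHMARKS.items():
    print(f"{benchmark}: {best_model(row)}")
```

Skipping `None` instead of substituting 0 matters most for averages: a model with one undisclosed benchmark would otherwise be penalized as if it had scored zero. Note that when only one model discloses a result, as with Aider Polyglot and IFEval, it wins the row by default, which says nothing about how the others would have scored.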