Llama 3.1 8B vs Llama 4 Maverick vs Llama 4 Scout

The best cell in each row is highlighted. A dash marks an undisclosed value and is never counted as zero.

Model	Llama 3.1 8B	Llama 4 Maverick	Llama 4 Scout

Identity & lifecycle

Provider	Meta	Meta	Meta
Family / tier	Llama 3 · Small	Llama 4 · Maverick	Llama 4 · Scout
Status	GA	GA	GA
Released	2024-07-22	2025-04-04	2025-04-04
Knowledge cutoff	2023-12	2024-08	2024-08

Architecture & context

Context window	128K	1M	10M
Max output tokens	4K	8K	8K
Input modalities	text	text, image	text, image
Reasoning mode	none	none	none
Open weights	Yes	Yes	Yes

Pricing (USD per million tokens)

Input	$0.05	$0.20	$0.10
Output	$0.08	$0.85	$0.34
Cached input	—	—	—
Batch input	—	—	—
Free tier	Yes	No	No
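
The per-million-token prices above translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: the workload of 2,000 input tokens and 500 output tokens is a hypothetical example, and the `PRICES` dictionary and `request_cost` function are names made up here, not part of any provider API.

```python
# Prices in USD per million tokens, copied from the pricing table.
PRICES = {
    "Llama 3.1 8B": {"input": 0.05, "output": 0.08},
    "Llama 4 Maverick": {"input": 0.20, "output": 0.85},
    "Llama 4 Scout": {"input": 0.10, "output": 0.34},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request at list price."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000  # prices are quoted per million tokens


# Hypothetical workload: 2,000 input tokens and 500 output tokens per request.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 2_000, 500):.6f}")
```

Under these assumptions the same request costs roughly six times more on Llama 4 Maverick than on Llama 3.1 8B, which is the gap the Finance Lead and Value scores have to weigh against the benchmark differences.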

Speed

Output speed (tok/s)	159.4	104.3	106.1
Time to first token (s)	0.3	0.66	0.56
Latency tier	fast	fast	fast
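
The two speed figures can be combined into a rough end-to-end estimate: time to first token plus the output length divided by the output speed. This is a simplified sketch that assumes a constant generation rate and ignores network and queueing delays; the page does not say how the figures were measured, and `SPEED` and `estimated_response_time` are hypothetical names.

```python
# Speed figures copied from the table: output tokens per second and
# time to first token in seconds.
SPEED = {
    "Llama 3.1 8B": {"tok_per_s": 159.4, "ttft_s": 0.30},
    "Llama 4 Maverick": {"tok_per_s": 104.3, "ttft_s": 0.66},
    "Llama 4 Scout": {"tok_per_s": 106.1, "ttft_s": 0.56},
}


def estimated_response_time(model: str, output_tokens: int) -> float:
    """Rough seconds until the full response arrives (constant-rate assumption)."""
    figures = SPEED[model]
    return figures["ttft_s"] + output_tokens / figures["tok_per_s"]


# Hypothetical response length of 500 output tokens.
for name in SPEED:
    print(f"{name}: {estimated_response_time(name, 500):.2f} s")
```

For short responses the time to first token dominates; for long ones the tokens-per-second figure matters more, so the ranking between two models can depend on the expected output length.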

Trust & deployment

Trains on inputs	No	No	No
License	Llama-3-Community	Llama-4-Community	Llama-4-Community
Clouds	Bedrock, Vertex AI, Azure AI Foundry, GCP, OCI, IBM watsonx	Bedrock, Vertex AI, Azure AI Foundry, GCP, OCI, IBM watsonx	Bedrock, Vertex AI, Azure AI Foundry, GCP, OCI, IBM watsonx
Compliance	—	—	—

AI Panel scoring

Unified score	7.4	7.7	7.5
Decision Maker	8	8.5	8.5
Domain Strategist	7.5	7.5	7.5
Finance Lead	9.5	9	9
Domain Practitioner	8.5	7.5	7.5
Power User	6	6.5	6.5
Skeptic	6.5	5.5	5.5
Value score	9.5	9	9.5

Benchmarks

MMLU	69.4	85.5	79.6
MMLU-Pro	48.3	80.5	74.3
GPQA Diamond	30.4	69.8	57.2
MATH-500	51.9	61.2	50.3
HumanEval	72.6	85.8	82
LiveCodeBench	—	43.4	32.8
Aider Polyglot	—	15.6	—
MMMU	—	73.4	69.4
IFEval	80.4	—	—
BBH	64.2	—	—
LMArena Elo	1176	1271	—
Artificial Analysis Index	12	18	14

Scores link to their sources. A missing cell means the vendor has not disclosed a result; it is not padded with an estimate.
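
The rule that a missing cell is never counted as zero matters when picking the best cell in a row. The following is a minimal sketch of one way to apply it, with undisclosed values stored as `None` and skipped during comparison. The `best_model` helper and the handling of rows with a single disclosed value are assumptions, not the page's actual logic.

```python
from typing import Optional, Sequence

MODELS = ["Llama 3.1 8B", "Llama 4 Maverick", "Llama 4 Scout"]

# A few rows from the tables; None stands for an undisclosed cell.
ROWS = {
    "MMLU": [69.4, 85.5, 79.6],
    "LiveCodeBench": [None, 43.4, 32.8],
    "Aider Polyglot": [None, 15.6, None],
    "IFEval": [80.4, None, None],
}


def best_model(values: Sequence[Optional[float]],
               higher_is_better: bool = True) -> Optional[str]:
    """Return the model with the best disclosed value, or None if none is disclosed."""
    # Drop undisclosed cells instead of treating them as zero.
    disclosed = [(v, m) for v, m in zip(values, MODELS) if v is not None]
    if not disclosed:
        return None
    pick = max if higher_is_better else min
    return pick(disclosed, key=lambda pair: pair[0])[1]


for benchmark, values in ROWS.items():
    print(f"{benchmark}: {best_model(values)}")

# Lower is better for time to first token (values from the Speed table).
print("Time to first token:", best_model([0.3, 0.66, 0.56], higher_is_better=False))
```

Skipping `None` rather than substituting zero keeps a model from losing a "lower is better" row, or winning nothing on a "higher is better" row, purely because a result was not published.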