The best cell in each row is highlighted. A dash marks an undisclosed value, which is never counted as zero.
| Provider | Meta | Meta |
| --- | --- | --- |
| Family / tier | Llama 3 · Large | Llama 4 · Maverick |
| Status | GA | GA |
| Released | 2024-07-22 | 2025-04-04 |
| Knowledge cutoff | 2023-12 | 2024-08 |
| Context window | 128K | 1M |
| Max output tokens | 4K | 8K |
| Input modalities | text | text, image |
| Reasoning mode | none | none |
| Open weights | Yes | Yes |
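| Input price (USD per 1M tokens) | $3 | $0.2 |
| Output price (USD per 1M tokens) | $3 | $0.85 |
| Cached input price (USD per 1M tokens) | — | — |
| Batch input price (USD per 1M tokens) | — | — |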
| Free tier | No | No |
| Output speed (tok/s) | 29 | 104.3 |
| Time to first token (s) | 0.7 | 0.66 |
| Latency tier | slow | fast |
| Trains on inputs | No | No |
| License | Llama-3-Community | Llama-4-Community |
| Clouds | Bedrock, Vertex AI, Azure AI Foundry, GCP, OCI, IBM watsonx | Bedrock, Vertex AI, Azure AI Foundry, GCP, OCI, IBM watsonx |
| Compliance | — | — |
| Unified score | 6.3 | 7.7 |
| Decision Maker | 6.5 | 8.5 |
| Domain Strategist | 6 | 7.5 |
| Finance Lead | 5 | 9 |
| Domain Practitioner | 6.5 | 7.5 |
| Power User | 6 | 6.5 |
| Skeptic | 6 | 5.5 |
| Value score | 4.5 | 9 |
| MMLU | 88.6 | 85.5 |
| MMLU-Pro | 73.3 | 80.5 |
| GPQA Diamond | 51.1 | 69.8 |
| MATH-500 | 73.8 | 61.2 |
| HumanEval | 89 | 85.8 |
| LiveCodeBench | — | 43.4 |
| Aider Polyglot | — | 15.6 |
| MMMU | — | 73.4 |
| IFEval | 88.6 | — |
| BBH | 81.3 | — |
| LMArena Elo | 1267 | 1271 |
| Artificial Analysis Index | 17 | 18 |
Scores link to their sources. A missing cell means the vendor hasn't disclosed a result; it is left empty rather than filled with an estimate.
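
The rule that an undisclosed value is never counted as zero matters when picking the best cell in a row. The sketch below is one possible way to apply it, not the page's actual logic: the function name `best_cells`, the `lower_is_better` flag, and the choice to mark every tied cell are all assumptions made for illustration. Undisclosed cells are represented as `None` and are skipped instead of being compared as 0.

```python
from typing import Optional


def best_cells(values: list[Optional[float]], lower_is_better: bool = False) -> list[int]:
    """Return the column indexes holding the best disclosed value in a row.

    None stands for an undisclosed cell. It is skipped, never treated as 0,
    so an undisclosed cell can never win or lose a comparison.
    Tied cells are all returned (an assumption of this sketch).
    """
    disclosed = [v for v in values if v is not None]
    if not disclosed:
        # Nothing disclosed in this row, so nothing is marked as best.
        return []
    best = min(disclosed) if lower_is_better else max(disclosed)
    return [i for i, v in enumerate(values) if v == best]


# Rows taken from the table above: [Llama 3 Large, Llama 4 Maverick]
print(best_cells([88.6, 85.5]))                       # MMLU, higher is better
print(best_cells([None, 43.4]))                       # LiveCodeBench, one value undisclosed
print(best_cells([3.0, 0.2], lower_is_better=True))   # Input price, lower is better
print(best_cells([None, None], lower_is_better=True)) # Cached input, nothing disclosed
```

If undisclosed cells were instead coerced to zero, the cached-input row would show a two-way tie at a price of 0, which is the misleading result the rule is meant to prevent.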