The best cell in each row is highlighted. A null cell (shown as a dash) means the value is undisclosed; it is never counted as zero.
| Provider | Meta | Meta | Meta |
| --- | --- | --- | --- |
| Family / tier | Llama 3 · Large | Llama 4 · Maverick | Llama 4 · Scout |
| Status | GA | GA | GA |
| Released | 2024-12-05 | 2025-04-04 | 2025-04-04 |
| Knowledge cutoff | 2023-12 | 2024-08 | 2024-08 |
| Context window | 128K | 1M | 10M |
| Max output tokens | 4K | 8K | 8K |
| Input modalities | text | text, image | text, image |
| Reasoning mode | none | none | none |
| Open weights | Yes | Yes | Yes |
| Input price | $0.12 | $0.20 | $0.10 |
| Output price | $0.40 | $0.85 | $0.34 |
| Cached input price | — | — | — |
| Batch input price | — | — | — |
| Free tier | No | No | No |
| Output speed (tok/s) | 81.8 | 104.3 | 106.1 |
| Time to first token (s) | 0.4 | 0.66 | 0.56 |
| Latency tier | fast | fast | fast |
| Trains on inputs | No | No | No |
| License | Llama-3-Community | Llama-4-Community | Llama-4-Community |
| Clouds | Bedrock, Vertex AI, Azure AI Foundry, GCP, OCI, IBM watsonx | Bedrock, Vertex AI, Azure AI Foundry, GCP, OCI, IBM watsonx | Bedrock, Vertex AI, Azure AI Foundry, GCP, OCI, IBM watsonx |
| Compliance | — | — | — |
| Unified score | 7.4 | 7.7 | 7.5 |
| Decision Maker | 8 | 8.5 | 8.5 |
| Domain Strategist | 7 | 7.5 | 7.5 |
| Finance Lead | 8 | 9 | 9 |
| Domain Practitioner | 8 | 7.5 | 7.5 |
| Power User | 7 | 6.5 | 6.5 |
| Skeptic | 6.5 | 5.5 | 5.5 |
| Value score | 8.5 | 9 | 9.5 |
| MMLU | 86 | 85.5 | 79.6 |
| MMLU-Pro | 68.9 | 80.5 | 74.3 |
| GPQA Diamond | 50.5 | 69.8 | 57.2 |
| MATH-500 | 77 | 61.2 | 50.3 |
| HumanEval | 88.4 | 85.8 | 82 |
| LiveCodeBench | — | 43.4 | 32.8 |
| Aider Polyglot | — | 15.6 | — |
| MMMU | — | 73.4 | 69.4 |
| IFEval | 92.1 | — | — |
| LMArena Elo | 1257 | 1271 | — |
| Artificial Analysis Index | 14 | 18 | 14 |
Scores link to their sources. A missing cell means the vendor has not disclosed a result; it is left empty rather than filled in.
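The rule that an undisclosed cell is never counted as zero can be illustrated with a minimal Python sketch. It is not the page's actual implementation: the names `best_index` and `mean_disclosed`, the choice of rows, and the per-row "higher is better" flags are all assumptions made for illustration. Undisclosed cells are represented as `None` and are skipped, both when picking the best cell in a row and when averaging.

```python
from typing import Optional, Sequence

Cell = Optional[float]  # None stands for an undisclosed cell

MODELS = ["Llama 3 · Large", "Llama 4 · Maverick", "Llama 4 · Scout"]

# A few rows copied from the table. The boolean says whether a higher
# value counts as better; that direction is an assumption of this sketch.
ROWS = {
    "Output speed (tok/s)": ([81.8, 104.3, 106.1], True),
    "Time to first token (s)": ([0.4, 0.66, 0.56], False),
    "LiveCodeBench": ([None, 43.4, 32.8], True),
    "IFEval": ([92.1, None, None], True),
}


def best_index(values: Sequence[Cell], higher_is_better: bool = True) -> Optional[int]:
    """Return the column index of the best disclosed cell, or None if every cell is undisclosed."""
    disclosed = [(v, i) for i, v in enumerate(values) if v is not None]
    if not disclosed:
        return None
    pick = max if higher_is_better else min
    # Ties resolve to the leftmost column because max/min keep the first match.
    return pick(disclosed, key=lambda pair: pair[0])[1]


def mean_disclosed(values: Sequence[Cell]) -> Optional[float]:
    """Average only the disclosed cells; an undisclosed cell is skipped, not treated as 0."""
    disclosed = [v for v in values if v is not None]
    return sum(disclosed) / len(disclosed) if disclosed else None


for name, (values, higher_is_better) in ROWS.items():
    idx = best_index(values, higher_is_better)
    print(name, "->", MODELS[idx] if idx is not None else "no disclosed value")
```

Treating `None` as zero would change both results: an undisclosed benchmark would drag an average down, and in a lower-is-better row such as time to first token an undisclosed cell would wrongly win the row.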