The best cell in each row is highlighted. A null cell means the value is undisclosed; it is never counted as zero.
| Provider | DeepSeek | Alibaba Cloud | Mistral AI | xAI |
| --- | --- | --- | --- | --- |
| Family / tier | DeepSeek R1 · Reasoning | QwQ · Reasoning | Magistral · Reasoning | Grok 4 · Reasoning |
| Status | GA | GA | GA | GA |
| Released | 2025-05-27 | 2025-03-04 | 2025-09-17 | 2026-03-09 |
| Knowledge cutoff | 2025-04 | 2024-09 | 2025-06 | 2024-11 |
| Context window | 128K | 131K | 131K | 1M |
| Max output tokens | 64K | 33K | 41K | — |
| Input modalities | text | text | text, image | text, image |
| Reasoning mode | always | always | always | optional |
| Open weights | Yes | Yes | No | No |
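| Input price (USD per 1M tokens) | $0.55 | $0.12 | $2 | $1.25 |
| Output price (USD per 1M tokens) | $2.19 | $0.18 | $5 | $2.5 |
| Cached input price (USD per 1M tokens) | $0.14 | — | — | $0.2 |
| Batch input price (USD per 1M tokens) | — | — | $1 | — |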
| Free tier | Yes | Yes | Yes | Yes |
| Output speed (tok/s) | — | — | 38.9 | 171.4 |
| Time to first token (s) | — | — | 1.7 | 13.24 |
| Latency tier | slow | slow | slow | medium |
| Trains on inputs | Yes | No | No | Yes |
| License | MIT | Apache-2.0 | Proprietary | Proprietary |
| Clouds | — | GCP | Azure AI Foundry | — |
| Compliance | — | — | SOC2, ISO27001, GDPR | — |
| Unified score | 8 | 6.8 | 7.2 | 6.9 |
| Decision Maker | 8 | 7 | 7 | 7 |
| Domain Strategist | 8.5 | 7 | 6.5 | 6.5 |
| Finance Lead | 9 | 7 | 7 | 6.5 |
| Domain Practitioner | 8.5 | 7.5 | 7.5 | 7.5 |
| Power User | 8 | 6 | 7.5 | 6.5 |
| Skeptic | 7.5 | 6.5 | 6.5 | 6 |
| Value score | 9.2 | 8 | 7 | 7 |
| MMLU | 93.4 | — | — | — |
| MMLU-Pro | 85 | — | — | — |
| GPQA Diamond | 81 | 65.2 | 76.26 | 78.5 |
| AIME 2025 | 87.5 | — | 83.48 | — |
| MATH-500 | — | 90.6 | — | 87.3 |
| SWE-bench Verified | 57.6 | — | — | — |
| LiveCodeBench | 73.3 | 63.4 | — | — |
| Aider Polyglot | 71.6 | — | — | — |
| MMMU | — | — | 70 | — |
| IFEval | — | 83.9 | — | 81 |
| TAU-bench | 63.9 | — | — | 93 |
| SimpleQA | 27.8 | — | — | — |
| Humanity's Last Exam | 17.7 | — | — | — |
| LMArena Elo | 1382 | — | — | 1491 |
| Artificial Analysis Index | 68 | — | 27 | 49 |
Scores link to their sources. A missing cell means the vendor has not disclosed a result; it is left empty rather than filled with an estimate.
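The rule that an undisclosed cell is never counted as zero matters most when picking the best cell in a row. The following is a minimal sketch of that rule, not the page's actual highlighting logic: the helper names `parse_cell` and `best_index` are hypothetical, and whether higher or lower is better for a given row is assumed to be supplied by the caller.

```python
from typing import Optional, Sequence

MISSING = "\u2014"  # the dash character the table uses for an undisclosed cell


def parse_cell(cell: str) -> Optional[float]:
    """Return the numeric value of a cell, or None when it is undisclosed."""
    text = cell.strip()
    if text in ("", MISSING):
        return None  # undisclosed: deliberately not converted to 0.0
    return float(text.lstrip("$"))


def best_index(cells: Sequence[str], higher_is_better: bool = True) -> Optional[int]:
    """Return the column index of the best disclosed cell, or None if all are missing."""
    disclosed = [
        (value, i)
        for i, value in enumerate(map(parse_cell, cells))
        if value is not None
    ]
    if not disclosed:
        return None
    pick = max if higher_is_better else min
    return pick(disclosed, key=lambda pair: pair[0])[1]


# Rows copied from the table above (columns: DeepSeek, Alibaba Cloud, Mistral AI, xAI).
gpqa_diamond = ["81", "65.2", "76.26", "78.5"]
math_500 = [MISSING, "90.6", MISSING, "87.3"]
cached_input = ["$0.14", MISSING, MISSING, "$0.2"]

print(best_index(gpqa_diamond))                          # 0
print(best_index(math_500))                              # 1
print(best_index(cached_input, higher_is_better=False))  # 0
```

The cached input row shows why the rule is needed: if the two missing prices were read as zero, a lower-is-better comparison would wrongly highlight a provider that has published no price at all.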