Best cell per row highlighted. Null (shown as a dash) means undisclosed and is never counted as zero.
| Provider | DeepSeek | Alibaba Cloud |
| --- | --- | --- |
| Family / tier | DeepSeek R1 · Reasoning | QwQ · Reasoning |
| Status | GA | GA |
| Released | 2025-05-27 | 2025-03-04 |
| Knowledge cutoff | 2025-04 | 2024-09 |
| Context window | 128K | 131K |
| Max output tokens | 64K | 33K |
| Input modalities | text | text |
| Reasoning mode | always | always |
| Open weights | Yes | Yes |
| Input | $0.55 | $0.12 |
| Output | $2.19 | $0.18 |
| Cached input | $0.14 | — |
| Batch input | — | — |
| Free tier | Yes | Yes |
| Output speed (tok/s) | — | — |
| Time to first token (s) | — | — |
| Latency tier | slow | slow |
| Trains on inputs | Yes | No |
| License | MIT | Apache-2.0 |
| Clouds | — | GCP |
| Compliance | — | — |
| Unified score | 8 | 6.8 |
| Decision Maker | 8 | 7 |
| Domain Strategist | 8.5 | 7 |
| Finance Lead | 9 | 7 |
| Domain Practitioner | 8.5 | 7.5 |
| Power User | 8 | 6 |
| Skeptic | 7.5 | 6.5 |
| Value score | 9.2 | 8 |
| MMLU | 93.4 | — |
| MMLU-Pro | 85 | — |
| GPQA Diamond | 81 | 65.2 |
| AIME 2025 | 87.5 | — |
| MATH-500 | — | 90.6 |
| SWE-bench Verified | 57.6 | — |
| LiveCodeBench | 73.3 | 63.4 |
| Aider Polyglot | 71.6 | — |
| IFEval | — | 83.9 |
| TAU-bench | 63.9 | — |
| SimpleQA | 27.8 | — |
| Humanity's Last Exam | 17.7 | — |
| LMArena Elo | 1382 | — |
| Artificial Analysis Index | 68 | — |
Scores link to their sources. Missing cells mean the vendor hasn't disclosed a result: honesty over padding.
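
The rule that an undisclosed cell is never counted as zero can be sketched in a few lines. This is only an illustration, not the page's actual scoring method, which is not described here. The names `SCORES`, `mean_disclosed`, and `shared_rows` are hypothetical, and the values are a handful of benchmark rows copied from the table, listed as (DeepSeek R1, QwQ).

```python
from typing import Optional

# A few benchmark rows copied from the table as (DeepSeek R1, QwQ).
# None stands for an undisclosed (dash) cell; it is NOT a score of zero.
SCORES: dict[str, tuple[Optional[float], Optional[float]]] = {
    "GPQA Diamond": (81.0, 65.2),
    "AIME 2025": (87.5, None),
    "MATH-500": (None, 90.6),
    "LiveCodeBench": (73.3, 63.4),
    "IFEval": (None, 83.9),
}


def mean_disclosed(values: list[Optional[float]]) -> Optional[float]:
    """Average only the disclosed values; return None if none are disclosed."""
    disclosed = [v for v in values if v is not None]
    return sum(disclosed) / len(disclosed) if disclosed else None


def shared_rows(
    scores: dict[str, tuple[Optional[float], Optional[float]]],
) -> dict[str, tuple[float, float]]:
    """Keep only the rows where both models disclosed a result."""
    return {name: pair for name, pair in scores.items() if None not in pair}


if __name__ == "__main__":
    # Head-to-head comparison is only meaningful on rows both vendors reported.
    for name, (r1, qwq) in shared_rows(SCORES).items():
        print(f"{name}: {r1} vs {qwq}")
    # Per-model average over disclosed cells only.
    print(mean_disclosed([pair[0] for pair in SCORES.values()]))
```

Treating a missing cell as zero would drag a model's average down for a benchmark it simply never reported, so the sketch skips those cells and restricts head-to-head comparison to rows both vendors disclosed (here, GPQA Diamond and LiveCodeBench).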