| Model | Summary | Score | Status | Released | Price in/out (USD per 1M tokens) | Context (tokens) | |
|---|---|---|---|---|---|---|---|
| Gemini 3.5 Flash | production agent + coding backbone | 8.5 | GA | 2026-05-18 | $1.50 / $9 | 1.0M | Review → |
| Gemini 3.1 Flash-Lite | best $/intelligence in Google's lineup | 7.8 | preview | 2026-03-02 | $0.25 / $1.50 | 1.0M | Review → |
| Gemini 3.1 Pro | frontier reasoning + long-context on Google Cloud | 8.7 | preview | 2026-02-18 | $2 / $12 | 1.0M | Review → |
| Gemini 2.5 Flash-Lite | cheapest GA model with 1M context | 7.7 | GA | 2025-07-21 | $0.10 / $0.40 | 1.0M | Review → |
| Gemini 2.5 Pro | cost-effective long-context Pro | 7.6 | GA | 2025-06-16 | $1.25 / $10 | 1.0M | Review → |
| Gemini 2.5 Flash | mature mid-tier with thinking toggle | 6.7 | GA | 2025-06-16 | $0.30 / $2.50 | 1.0M | Review → |
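
To make the price column easier to compare, here is a minimal Python sketch that turns the listed per-1M-token prices into a per-request cost. The prices are copied from the table; the `request_cost` helper and the 50,000-in / 2,000-out workload are hypothetical, chosen only for illustration. The sketch assumes flat pricing and ignores any prompt-size tiers, caching discounts, or batch rates.

```python
# Prices in USD per 1M tokens (input, output), copied from the table above.
PRICES = {
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Gemini 3.1 Flash-Lite": (0.25, 1.50),
    "Gemini 3.1 Pro": (2.00, 12.00),
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 2.5 Flash": (0.30, 2.50),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming flat per-token pricing."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 50,000-token prompt with a 2,000-token answer.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 50_000, 2_000):.4f}")
```

For this hypothetical workload the input side dominates the bill on every model, so the input price is the number to watch for long-context use.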

Gemini 3.1 Pro is Google DeepMind's flagship reasoning model, launched 2026-02-19 in preview to validate the release before general availability. As of 2026-05-28 it remains the headline model in the Gemini app (Google AI Pro and Ultra) and the top reasoning option on the Gemini API and Vertex AI, even though its API/Vertex surface is still governed by Pre-GA Offerings Terms. It posts the highest public GPQA Diamond score of any proprietary model (94.3%, no tools), pairs that with a real 1M-token context (2M on Vertex for enterprise), and grounds answers in live Google Search. For a buyer: if you want frontier reasoning plus the deepest enterprise-cloud and live-data integration, this is Google's answer, provided you accept that the API is technically still pre-GA.

- Provider: Google (DeepMind)
- Released: 2026-02-19 (preview; no GA date announced)
- Status: preview (Pre-GA terms on API/Vertex; production-default in the consumer Gemini app)
- Context window: 1,048,576 tokens (2,097,152 / 2M on Vertex AI)
- Max output: 65,536 tokens
- Modalities: text, image, audio, video in; text out
- Knowledge cutoff: January 2025
- Headline price: $2.00 in / $12.00 out per 1M tokens (<=200K prompt)

Full review →

Gemini 3.1 Flash-Lite is Google DeepMind's most cost-effective frontier-adjacent model, released 2026-03-03 in preview on the Gemini API, AI Studio, and Vertex AI. It pairs a GPQA Diamond score of 86.9%, which beats the older Gemini 2.5 Flash on hard science, with the fastest output of any reasoning-capable Gemini model (~332 tok/s) and the full 1M-token context, all at $0.25 in / $1.50 out per 1M tokens. For a buyer: this is the high-volume workhorse for classification, extraction, summarization, and bulk content at the best price-to-intelligence ratio in Google's lineup, with the caveat that it is still pre-GA.

- Provider: Google (DeepMind)
- Released: 2026-03-03 (preview, Pre-GA terms)
- Status: preview
- Context window: 1,048,576 tokens (1M)
- Max output: 65,536 tokens
- Modalities: text, image, audio, video in; text out
- Knowledge cutoff: January 2025
- Headline price: $0.25 in / $1.50 out per 1M tokens

Full review →
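
The ~332 tok/s output speed quoted for Gemini 3.1 Flash-Lite can be translated into a rough generation time. The sketch below is a back-of-the-envelope estimate only: the `generation_seconds` helper is hypothetical, it treats the quoted speed as constant, and it ignores time to first token, network latency, and any thinking tokens.

```python
# Approximate output speed and output cap quoted for Gemini 3.1 Flash-Lite.
OUTPUT_TOKENS_PER_SECOND = 332
MAX_OUTPUT_TOKENS = 65_536


def generation_seconds(output_tokens: int,
                       tokens_per_second: float = OUTPUT_TOKENS_PER_SECOND) -> float:
    """Estimate streaming time for a response, assuming a constant output rate."""
    if not 0 <= output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("output_tokens must be between 0 and the max output size")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    # Hypothetical response sizes: a short answer and a maximum-length output.
    for tokens in (1_000, MAX_OUTPUT_TOKENS):
        print(f"{tokens} tokens: {generation_seconds(tokens):.1f} s")
```

Under these assumptions a 1,000-token answer streams in about three seconds, while a maximum-length 65,536-token output takes a little over three minutes.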