Llama 4 models

6 versions from Meta, reviewed by the TopReviewed AI panel.

ModelScoreStatusReleasedPrice in/outContext
Llama 4 Maverick
self-hosted multimodal workhorse with no vendor lock-in
7.7GA2025-04-04$0.20 / $0.851MReview →
Llama 4 Scout
single-GPU long-context open-weights deploy
7.5GA2025-04-04$0.10 / $0.3410MReview →
Llama 3.3 70B
operationally-mature text-only open default
7.4GA2024-12-05$0.12 / $0.40128KReview →
Llama 3.1 8B
on-device and high-volume open-weights workhorse
7.4GA2024-07-22$0.05 / $0.08128KReview →
Llama 3.1 70B
70B base checkpoint for continued pretraining
6.7GA2024-07-22$0.40 / $0.59128KReview →
Llama 3.1 405B
largest open base for from-scratch fine-tuning
6.3GA2024-07-22$3 / $3128KReview →

Strongest: Llama 4 Maverick

Llama 4 Maverick is Meta's flagship open-weights model, released April 5, 2025 as the first Llama to ship as a Mixture-of-Experts. It pairs a 400-billion-parameter knowledge pool with only 17 billion active parameters per token, is natively multimodal (text + image from pre-training), and serves a 1M-token context window. The one-sentence buyer takeaway: it is not the smartest model on any leaderboard, but it is the strongest open-weights workhorse you can self-host and run on every major inference provider for roughly a tenth the price of a closed frontier API — making it the default hedge against vendor lock-in. - Provider: Meta - Release: 2025-04-05 (GA, open weights) - Status: GA, latest in its tier (no successor shipped as of May 2026) - Context: 1,000,000 tokens (256K native pre-training, extended via iRoPE) - Max output: 8,192 tokens (provider-dependent) - Modalities: text + image in, text out - Knowledge cutoff: August 2024 - Headline price: ~$0.20 in / ~$0.85 out per 1M tokens (representative across providers)

Full review →

Best value: Llama 4 Scout

Llama 4 Scout is the small, deployable member of Meta's Llama 4 herd, released April 5, 2025. It is a 109B-total / 17B-active Mixture-of-Experts model (16 experts), natively multimodal, with a headline 10,000,000-token context window — the largest of any openly available model at release. The one-sentence buyer takeaway: it is the only model in 2026 that combines a 10M context, native vision, and single-GPU deployability, making it the obvious open-weights pick when context size and on-prem economics matter more than peak intelligence. - Provider: Meta - Release: 2025-04-05 (GA, open weights) - Status: GA, latest in its tier (no successor shipped as of May 2026) - Context: 10,000,000 tokens (256K native pre-training, extended via iRoPE) - Max output: 8,192 tokens (provider-dependent) - Modalities: text + image in, text out - Knowledge cutoff: August 2024 - Headline price: ~$0.08–$0.11 in / ~$0.30–$0.34 out per 1M tokens

Full review →