How much does Grok 4.3 cost?

$1.25 per 1M input tokens and $2.50 per 1M output tokens, with cached input at $0.20 (84% off). Above 200K input tokens, rates double. Consumer access starts at X Premium $8/mo or SuperGrok $30/mo.

Is it good for coding?

It is the weakest area. xAI did not publish a SWE-bench Verified score for 4.3, and it trails Claude Opus 4.7 by double digits on hard agentic coding. For a coding agent, prefer Grok Build 0.1 within xAI or Claude/Codex elsewhere.

What makes it different?

Native, first-party access to the live X (Twitter) firehose plus web search, and native video input — neither of which competing flagships offer natively.

Can I get it with enterprise controls?

Yes, via Microsoft Foundry / Azure AI Foundry (RBAC, private networking, customer-managed keys), though Azure caps context at ~200K tokens versus 1M on the direct API.

Does xAI train on my data?

On the API, only if you opt into data sharing (irreversible once enabled). On the X consumer surface, the 2026 ToS trains on prompts/outputs by default with no opt-out.

Why is first response so slow?

Reasoning is always on (effort dial only, no off switch), so time-to-first-token is ~20s. It is built for quality, not interactive latency.

How do I migrate from a retired Grok slug?

Retired slugs (grok-3, grok-4-0709, the fast pairs) auto-redirect to Grok 4.3 and bill at 4.3 pricing; reasoning maps to low effort, non-reasoning to none.

Grok 4.3 Review — Benchmarks, Pricing & AI Panel Verdict

Benchmark	Score	Source
IFEval	81%	Artificial Analysis (IFBench, held flat vs 4.20)2026-04-30T00:00:00.000Z
TAU-bench	98%	Artificial Analysis (tau-2-Bench Telecom)2026-04-30T00:00:00.000Z
GPQA Diamond	90%	Artificial Analysis / xAI launch coverage (approximate)2026-04-30T00:00:00.000Z
Artificial Analysis Index	53	artificialanalysis.ai 2026-05-28T00:00:00.000Z

Architecture

xAI discloses almost nothing about Grok 4.3's internals. Architecture type, parameter counts, attention mechanism, layer counts, training tokens, and compute are all undisclosed — honestly null here. The model is widely assumed to be a mixture-of-experts design trained on xAI's Colossus / Memphis compute, but this is not confirmed by any xAI model card, so it is marked unknown. What is documented: a 1M-token input context, always-on chain-of-thought reasoning with selectable effort (low / medium / high), native multimodal ingestion including video, and a December 2025 knowledge cutoff. Compared with Anthropic, OpenAI, and Google — all of whom publish at least partial architecture and benchmark detail — xAI's transparency is materially thinner, which is the single biggest caveat on this entry.

Capabilities

Reasoning (8.0): Always-on chain-of-thought drives strong analytical performance; GPQA Diamond around 90% and an Artificial Analysis Intelligence Index of 53 place it in the frontier band, below GPT-5.5 and Gemini 3.1 Pro but ahead of Claude Sonnet 4.6 on the AA Index.
Math (8.0): Inherits and extends Grok 4.20's strong math profile; reasoning effort scales accuracy on multi-step problems.
Agentic / tool use (8.5): The standout. GDPval-AA Elo of 1500 is among the highest measured, and tau-2-Bench Telecom at 98% is excellent. Native web + X search tools are wired into the agent loop.
Coding (6.5): The clear weak spot. xAI did not publish a SWE-bench Verified figure for 4.3 — itself a tell — and independent coverage places it behind Claude Opus 4.7 by double digits on hard agentic coding. Fine for scripting and review; not the pick for a coding-agent ceiling.
Long context (7.5): 1M tokens is generous, but the double-rate billing above 200K input tokens complicates large-document economics, and the 2M of Grok 4.20 multi-agent is larger.
Vision / document OCR (7.0 / 6.5): Solid image understanding plus unique video; OCR is competent but not a published strength.
Creative writing (7.5): The looser, more opinionated Grok voice produces less corporate copy than peers — an asset for some brands.
Instruction following (8.0): Inherits Grok 4.20's strict prompt-adherence improvements; IFBench held at 81%.
Function calling (8.0): Clean structured output and parallel tool calls via an OpenAI/Anthropic-compatible API.
Safety calibration (5.5): Deliberately lower refusal rate on text; image/NSFW moderation tightened after January 2026 regulatory pressure. Calibration is a positioning choice, not a leaderboard strength.
Real-time data (10.0): The wedge. No other frontier model has native, first-party access to the live X firehose plus web search. For "what is happening right now," sentiment, and breaking-news synthesis, Grok 4.3 is structurally unmatched.

Benchmark analysis

Benchmark	Score	vs Predecessor	vs Top Competitor	Source
Artificial Analysis Intelligence Index	53	+4 vs Grok 4.20 (49)	Below GPT-5.5 (~60), Gemini 3.1 Pro (~57); ahead of Claude Sonnet 4.6	Artificial Analysis
GPQA Diamond	~90%	Up from ~78.5% (Grok 4.20)	Near GPT-5.5 / Gemini 3.1 Pro band	AA / launch coverage
GDPval-AA (tool-use Elo)	1500	+321 vs Grok 4.20 (1179)	Among highest agentic Elos measured	AA launch article
tau-2-Bench Telecom	98%	+5 vs Grok 4.20	Top-tier	AA launch article
IFBench	81%	Flat vs Grok 4.20	Competitive	AA launch article

(MMLU-Pro, AIME 2025, MATH-500, SWE-bench Verified, HumanEval, LiveCodeBench, Aider Polyglot, MMMU, and a clean Grok 4.3 LMArena text Elo were not published by xAI's model card or available from third-party leaderboards as standalone, verifiable figures at verification time. Rows left null rather than invented. xAI publishes materially less benchmark detail than Anthropic, OpenAI, or Google.)

Speed & latency

Output throughput is strong — Artificial Analysis measures ~181.5 tokens/sec at high reasoning (top-10 among tracked models). The catch is time-to-first-token: ~19.7 seconds, because reasoning is always on and cannot be disabled, only dialed down. This makes Grok 4.3 a poor fit for sub-second interactive UX but fine for research, batch-style, and agentic workloads where total quality matters more than first-token latency. The model is also verbose — it used ~88M output tokens to complete the AA Intelligence Index suite (about 44% more than Grok 4.20), which inflates per-call output cost.

Pricing analysis

Surface	Cost	Notes
API input	$1.25 / 1M tok	Doubles above 200K input tokens (long-context tier)
API output	$2.50 / 1M tok	Always-on reasoning makes outputs verbose
Cached input	$0.20 / 1M tok	84% discount (cache read)
Batch	n/a	No documented batch endpoint
Image input	token-converted	No published per-image price
Direct UI (SuperGrok)	$30 / mo	Standalone Grok subscription, no X perks
Direct UI (SuperGrok Lite)	$10 / mo	Light tier, Grok Imagine access
Direct UI (X Premium / Premium+)	$8 / $40 mo	Bundled Grok inside the X app; Premium+ adds higher limits + ad-free X
SuperGrok Heavy	$300 / mo	Power-user / heavy-reasoning tier
Free tier	$0	grok.com + free X tier with daily caps
Rate limits	tiered by spend/plan	Per docs.x.ai

Pricing-conflict note (carried from v1, now resolved): Artificial Analysis still lists the Grok 4.x line under older figures in places, and third-party trackers vary. xAI's docs.x.ai card is canonical: $1.25 in / $2.50 out / $0.20 cached. Where OpenRouter or aggregators disagree, defer to docs.x.ai.

Deployment & access

Proprietary, API-only, no open weights and not self-hostable. The x.ai endpoint is OpenAI-SDK and Anthropic-SDK compatible, which keeps switching cost low. Managed cloud availability is now real: Microsoft Foundry / Azure AI Foundry hosts Grok 4.3 with enterprise controls (RBAC, private networking, customer-managed keys) — but note the Azure deployment caps context at ~200K tokens versus 1M on the direct x.ai API. OpenRouter resells the model. No Bedrock or Vertex availability. Rate limits are tiered by spend/plan rather than a fixed default RPM/TPM.

Safety & privacy

xAI has no published Responsible-Scaling-Policy equivalent; governance rests on an Acceptable Use Policy. Two honest points for enterprise buyers:

Training on inputs: On the API, xAI historically trained only via an opt-in data-sharing program (the $150/mo-credit program, which an early-May 2026 report says was wound down) — but once a team opts in, the choice is irreversible. On the X consumer surface, the 2026 Terms expanded "Content" to include AI prompts and outputs with no opt-out, so consumer Grok usage trains the model by default. Marked trains_on_inputs: true with data_optout_available: false to reflect the stricter, consumer-facing reality.
Moderation / refusals: Grok is positioned as less-filtered on text — a lower refusal rate that is a selling point for users wanting a non-corporate voice and a risk for regulated buyers. After January 2026 backlash over sexualized real-person imagery, xAI tightened pre-moderation filters and restricted image generation to paid tiers; "Content Moderated" errors are now common on the image side.
Compliance: No SOC2/HIPAA/ISO/GDPR certifications are publicly verified for the direct xAI API. Enterprises needing certified compliance should route through Azure AI Foundry, which layers its own controls.

Ecosystem & tooling

SDKs in Python and TypeScript, plus OpenAI-SDK and Anthropic-SDK drop-in compatibility. Framework integrations include LangChain and the Vercel AI SDK; OpenRouter resells it; Azure AI Foundry hosts it. Notable surfaces: Grok inside X, grok.com, the SuperGrok subscriptions, and the Tesla in-car assistant. Popularity is growing rather than dominant — strong among X-native and developer audiences, less penetrated in conservative enterprise.

Grok 4.3

What's new

Benchmarks

AI Panel Review

Strengths

Limitations

Best use cases

Deep dive

Architecture

Capabilities

Benchmark analysis

Speed & latency

Pricing analysis

Deployment & access

Safety & privacy

Ecosystem & tooling

Buyer questions

Comparable models

Sources

Model specs

Other Grok 4 versions