$1.00 in / $2.00 out per 1M tokens (cached $0.20) on the API; the CLI is bundled into SuperGrok ($30/mo) or X Premium+ ($40/mo), with heavy tiers around $300/mo. Some trackers show $0.20/$1.50 — trust docs.x.ai.

Does my code leave my machine?

No. Grok Build is local-first: source code is not transmitted to xAI's servers. That is its main differentiator and the basis for treating it as privacy-forward.

How does it compare to Claude Code on quality?

Lower. SWE-bench Verified 70.8% trails Claude Sonnet 4.6 (72.7%) and is far below Opus 4.7 (87.6%). Strong for scoped tasks, not for top-tier ceilings.

The agent proposes a step list and waits for your approval before mutating files — a safety gate against runaway edits.

Can it handle a large monorepo?

With effort. 256K context is the smallest in the Grok lineup, so monorepos need disciplined context selection; a 1M-context tool is easier there.

What integrations exist?

Native MCP support, the x.ai API (OpenAI/Anthropic-SDK compatible), OpenRouter, and third-party surfaces like Kilo Code. Editor-plugin coverage is thinner than Claude Code at v0.1.

What happened to grok-code-fast-1?

It was retired 2026-05-15 and now redirects to Grok Build 0.1; the 70.8% SWE-bench figure traces to that lineage.

Grok Build 0.1 Review — Benchmarks, Pricing & AI Panel Verdict

Benchmark	Score	Source
SWE-bench Verified	70.8%	xAI internal harness (inherited from grok-code-fast-1, which redirects to Grok Build 0.1)2026-05-20T00:00:00.000Z

Architecture

Base model internals are undisclosed (unknown), as with the rest of the Grok line. What's documented is the agent system: a 256K-token context (the smallest in the current Grok lineup), 256K max output, always-on reasoning, plan-mode execution gating, up to 8 parallel sub-agents isolated in Git worktrees, and native MCP tool support. The architecturally distinctive choice is local-first execution: the CLI runs on the developer's machine and does not ship source code to xAI's servers, which is the model's main differentiator against cloud-only coding agents. The SWE-bench Verified number (70.8%) is inherited from grok-code-fast-1, which redirects to Grok Build 0.1 after the May 15 retirement.

Capabilities

Coding (7.0): SWE-bench Verified 70.8% (xAI internal harness) is competitive but below Claude Sonnet 4.6 (72.7%) and far below Claude Opus 4.7 (87.6%). Good for well-scoped tasks and refactors; not a top-tier ceiling.
Agentic (7.5): Plan mode plus up to 8 parallel sub-agents in Git worktrees plus MCP makes it a real coding agent, not just a code-completion model.
Function calling (8.0): First-class structured outputs and tool use — essential for an agent CLI.
Reasoning (7.0): Always-on reasoning improves reliability on multi-step work at the cost of latency.
Long context (6.0): 256K is the smallest in the Grok lineup; adequate for most repos, tight for monorepos where context selection becomes manual work.
Safety calibration (6.5): Local-first (code off-server) and plan-mode approval are concrete safety controls developers can rely on.
Vision (6.0): Image input supports screenshots/diagrams in a coding workflow.
Real-time data (6.5): Web/X search is available but secondary; the point of the product is local code work.
Creative writing (5.0): Not its job; scored low by design.

Benchmark analysis

Benchmark	Score	vs Predecessor	vs Top Competitor	Source
SWE-bench Verified	70.8%	First xAI dedicated coding model (via grok-code-fast-1 lineage)	Below Claude Sonnet 4.6 (72.7%); far below Claude Opus 4.7 (87.6%)	xAI internal harness, via coverage

(HumanEval, LiveCodeBench, Aider Polyglot, and Terminal-bench were not available at verification time; BenchLM and benchmark aggregators show "0 sourced benchmarks" pending public evals for this v0.1 model. SWE-bench Verified is the only headline figure xAI has put forward, and it is from xAI's own harness rather than an independent run. Nothing invented to fill rows.)

Speed & latency

Output throughput is not published, and always-on reasoning slows time-to-first-token (consistent with the rest of the Grok line). For an interactive coding CLI, the practical effect is that each agent action involves a reasoning pause before output — more reliable for multi-step tasks, but slower-feeling than a non-reasoning completion model. Latency tier: slow. Parallel sub-agents partly offset this on multi-file work by running concurrently.

Pricing analysis

Surface	Cost	Notes
API input	$1.00 / 1M tok	docs.x.ai canonical (some trackers show $0.20)
API output	$2.00 / 1M tok	docs.x.ai canonical (some trackers show $1.50)
Cached input	$0.20 / 1M tok	84% discount on cache reads
CLI (Grok Build)	bundled	Included in SuperGrok ($30/mo) and X Premium+ ($40/mo)
Heavy agentic tier	~$300 / mo	SuperGrok Heavy / a separate SuperHeavy ($299/mo, ~$99/mo six-month intro promo reported)
Free tier	none	API and subscription only
Rate limits	tiered by spend/plan	Per docs.x.ai

Pricing note (v1 discrepancy retained): some third-party listings show $0.20 in / $1.50 out for Grok Build 0.1; xAI's docs.x.ai card is canonical at $1.00 / $2.00 / $0.20 cached. Use the docs figures.

Deployment & access

Proprietary model, cloud-hosted for inference, with a locally-running CLI that does not transmit source code to xAI — the key deployment distinction versus cloud-only coding agents. No open weights, no self-hosting of the model itself. Available via the x.ai API (OpenAI/Anthropic-SDK compatible), resold on OpenRouter, and usable inside third-party coding surfaces such as Kilo Code. Not a confirmed Azure AI Foundry SKU. Rate limits are spend/plan-tiered.

Safety & privacy

The local-first design is the headline governance feature: because source code stays on the developer's machine and is not sent to xAI, Grok Build is marked trains_on_inputs: false with data_optout_available: true by design — directly addressing the biggest objection to coding agents in regulated and IP-sensitive environments. Plan mode adds an explicit approval gate before any file mutation, a concrete control against runaway autonomous edits. Beyond these, the family posture applies: no published safety framework, governance via Acceptable Use Policy, no verified SOC2/HIPAA/ISO certs. Prompt/metadata handling still follows the standard xAI API data-sharing terms even though code does not leave the machine.

Ecosystem & tooling

Python/TypeScript SDKs plus OpenAI/Anthropic-SDK compatibility; native MCP; resold on OpenRouter and usable in Kilo Code. Primary surface is the Grok Build CLI. Popularity is niche — a v0.1 entrant in a category led by Claude Code and Codex, with a differentiated local-first angle.

Grok Build 0.1

What's new

Benchmarks

AI Panel Review

Strengths

Limitations

Best use cases

Deep dive

Architecture

Capabilities

Benchmark analysis

Speed & latency

Pricing analysis

Deployment & access

Safety & privacy

Ecosystem & tooling

Buyer questions

Comparable models

Sources

Model specs

Other Grok Build versions