Grok 4.3

v4.3

xAI

Modelflagshipreasoninglong-contextfunction-calling
83
Strong
About This Model

xAI's current flagship model released in early May 2026, with a 1M token context window, reasoning, function calling, and structured outputs at aggressive pricing ($1.25/$2.50 per 1M tokens). Strong frontier performance, but a thinner enterprise compliance posture than Anthropic, OpenAI, or Google.

Last Evaluated: June 10, 2026
Official Website

Trust Vector Analysis

Dimension Breakdown

🚀Performance & Reliability
+

Frontier-class performance with a 1M context window and reasoning, function calling, and structured outputs. Release date sources conflict (2026-04-30 per OpenRouter vs 2026-05-06 per llm-stats); xAI documentation is treated as primary.

task accuracy code

Review of provider documentation and third-party benchmark aggregators

Evidence
xAI Model DocumentationFrontier coding performance positioned as flagship successor to Grok 4.1
llm-statsCompetitive with frontier peers on agentic coding evaluations
mediumVerified: 2026-06-10
task accuracy reasoning

Review of reasoning benchmark results from provider and aggregators

Evidence
xAI Model DocumentationNative reasoning mode with strong math and science performance
mediumVerified: 2026-06-10
task accuracy general

Crowdsourced arena comparisons and aggregator quality metrics

Evidence
LMArena LeaderboardTop-tier placement among frontier models in crowdsourced comparisons
OpenRouter Model ListingHigh usage and quality ratings since launch
mediumVerified: 2026-06-10
output consistency

Review of structured output features and community reports of repeated-prompt behavior

Evidence
xAI Model DocumentationStructured outputs and function calling support deterministic integration patterns
mediumVerified: 2026-06-10
latency p50

Median latency from third-party API benchmarking

Evidence
Community benchmarkingTypical time-to-full-response around 2s for standard prompts (non-reasoning mode)
mediumVerified: 2026-06-10
latency p95

95th percentile response time from third-party benchmarking; reasoning mode adds variance

Evidence
Community benchmarkingTail latency higher when extended reasoning is engaged
lowVerified: 2026-06-10
context window

Official specification from provider documentation

Evidence
xAI Model Documentation1M token context window; higher per-token rate applies above 200K tokens
highVerified: 2026-06-10
uptime

Historical uptime data from official status page

Evidence
xAI Status PageGenerally stable availability since launch with occasional incidents
mediumVerified: 2026-06-10
🛡️Security
+

Reasonable baseline security, but xAI publishes substantially less safety and red-team documentation than Anthropic, OpenAI, or Google.

prompt injection resistance

Testing against OWASP LLM01 prompt injection patterns and review of published safety material

Evidence
xAI DocumentationHardened system prompt handling; limited published red-team data
mediumVerified: 2026-06-10
jailbreak resistance

Review of adversarial prompt testing results and community jailbreak reports

Evidence
xAI NewsSafety improvements cited at launch; less third-party adversarial testing than peers
mediumVerified: 2026-06-10
data leakage prevention

Analysis of privacy policies and data handling commitments

Evidence
xAI Privacy PolicyAPI data handling documented; fewer contractual controls than major enterprise providers
mediumVerified: 2026-06-10
output safety

Safety testing across harmful content categories and review of published evaluations

Evidence
xAI DocumentationContent moderation in place; xAI publishes less safety evaluation detail than Anthropic/OpenAI/Google
mediumVerified: 2026-06-10
api security

Review of API security features and authentication mechanisms

Evidence
xAI API DocumentationAPI key authentication, HTTPS only, rate limiting, team management in console
mediumVerified: 2026-06-10
🔒Privacy & Compliance
+

xAI's enterprise compliance posture remains thinner than Anthropic, OpenAI, or Google: SOC 2 in place but no HIPAA eligibility program and fewer regulated-industry attestations.

data residency

Review of provider documentation and enterprise materials

Evidence
xAI DocumentationUS-based infrastructure; no published regional residency options
mediumVerified: 2026-06-10
training data optout

Analysis of privacy policy and data usage terms

Evidence
xAI Privacy PolicyAPI customer data not used for training by default per policy
mediumVerified: 2026-06-10
data retention

Review of terms of service and data retention policies

Evidence
xAI Privacy PolicyLimited retention for abuse monitoring; zero-retention requires enterprise agreement
mediumVerified: 2026-06-10
pii handling

Review of data protection capabilities and customer responsibilities

Evidence
xAI DocumentationCustomer responsible for PII redaction; no built-in PII tooling
mediumVerified: 2026-06-10
compliance certifications

Verification of compliance certifications against major enterprise provider baselines

Evidence
xAI Trust CenterSOC 2 Type II; thinner certification portfolio (no HIPAA BAA program, limited GDPR tooling) vs Anthropic/OpenAI/Google
mediumVerified: 2026-06-10
zero data retention

Review of data handling practices and enterprise contract options

Evidence
xAI Trust CenterZero-data-retention available only via negotiated enterprise terms
mediumVerified: 2026-06-10
👁️Trust & Transparency
+

Good developer-facing documentation and inspectable reasoning, but less published safety/bias evaluation than major competitors.

explainability

Evaluation of reasoning transparency and explanation capabilities

Evidence
xAI Model DocumentationReasoning traces available via API for inspection
mediumVerified: 2026-06-10
hallucination rate

Review of provider claims and factual QA testing

Evidence
xAI NewsContinued hallucination reductions claimed at launch, building on Grok 4.1 improvements
mediumVerified: 2026-06-10
bias fairness

Review of bias benchmark disclosures and independent reporting

Evidence
xAI Public StatementsLimited published bias evaluation; past Grok versions drew scrutiny over politically tuned behavior
lowVerified: 2026-06-10
uncertainty quantification

Qualitative assessment of confidence expression in outputs

Evidence
Model BehaviorReasoning mode expresses uncertainty reasonably well
mediumVerified: 2026-06-10
model card quality

Review of documentation completeness and clarity

Evidence
xAI Model DocumentationDetailed model page with capabilities, pricing, limits, and feature support
highVerified: 2026-06-10
training data transparency

Review of public disclosures about training data

Evidence
xAI Public StatementsGeneral description including X platform data; detailed sources not disclosed
mediumVerified: 2026-06-10
guardrails

Analysis of built-in safety mechanisms

Evidence
xAI DocumentationBuilt-in moderation with developer controls; lighter-touch defaults than peers
mediumVerified: 2026-06-10
⚙️Operational Excellence
+

Strong API and pricing, but the May 2026 retirement wave (with silent slug redirects to grok-4.3) highlights an aggressive deprecation culture enterprises should plan around.

api design quality

Review of API design, consistency, and feature completeness

Evidence
xAI API DocumentationOpenAI-compatible API with reasoning, function calling, structured outputs, and prompt caching
highVerified: 2026-06-10
sdk quality

Review of SDK quality, documentation, and maintenance

Evidence
xAI SDKsOfficial SDKs plus broad compatibility with OpenAI client libraries
mediumVerified: 2026-06-10
versioning policy

Review of deprecation/migration practices; silent redirects of retired slugs reduce predictability for pinned workloads

Evidence
xAI Migration Guide (May 15 Retirement)Retired Grok model slugs silently redirect to grok-4.3 rather than returning errors
highVerified: 2026-06-10
monitoring observability

Review of available monitoring tools and metrics

Evidence
xAI ConsoleUsage dashboard with spend and rate limit visibility
mediumVerified: 2026-06-10
support quality

Assessment of documentation, community, and support responsiveness

Evidence
xAI DocumentationImproving documentation; support channels lighter than major cloud providers
mediumVerified: 2026-06-10
ecosystem maturity

Analysis of third-party integrations and tools

Evidence
OpenRouter Model ListingAvailable via OpenRouter and major LLM frameworks; growing third-party adoption
mediumVerified: 2026-06-10
license terms

Review of licensing terms and restrictions

Evidence
xAI Terms of ServiceClear commercial API terms; enterprise agreements available
highVerified: 2026-06-10
Strengths
  • +Aggressive pricing: $1.25/$2.50 per 1M tokens with $0.20 cached input
  • +1M token context window
  • +Full agentic feature set: reasoning, function calling, structured outputs
  • +Text and image input support
  • +Frontier-class performance across coding, reasoning, and general tasks
  • +OpenAI-compatible API simplifies migration
Limitations
  • !Thinner enterprise compliance posture than Anthropic, OpenAI, or Google (no HIPAA program)
  • !Retired Grok model slugs silently redirect to grok-4.3, risking unannounced behavior changes
  • !Higher per-token rate applies above 200K context
  • !Limited published safety, bias, and red-team evaluation detail
  • !Zero-data-retention only via negotiated enterprise terms
  • !Conflicting release-date records across aggregators reflect lighter release documentation
Metadata
pricing
input: $1.25 per 1M tokens
output: $2.50 per 1M tokens
notes: Cached input $0.20 per 1M tokens. Higher per-token rate applies for requests above 200K context.
last verified: 2026-06-10
context window: 1000000
languages
0: English
1: Spanish
2: French
3: German
4: Italian
5: Portuguese
6: Japanese
7: Korean
8: Chinese
9: Arabic
10: Hindi
modalities
0: text
1: image (input)
api endpoint: https://api.x.ai/v1/chat/completions
open source: false
architecture: Transformer-based with native reasoning, function calling, and structured outputs
parameters: Not disclosed
release date: Early May 2026 (2026-04-30 per OpenRouter; 2026-05-06 per llm-stats)

Use Case Ratings

code generation

Frontier-class coding with function calling and structured outputs at very competitive pricing.

customer support

Fast, capable, and cheap for support workloads; compliance posture may limit regulated deployments.

content creation

Strong long-form generation with current-events awareness from the X ecosystem.

data analysis

Strong reasoning over large inputs; 1M context handles big datasets, with higher per-token rates above 200K.

research assistant

1M context plus reasoning makes it well suited to literature-scale synthesis at low cost.

legal compliance

Capable analytically, but thinner compliance certifications than Anthropic/OpenAI/Google providers.

healthcare

No HIPAA eligibility program; not recommended for PHI workloads.

financial analysis

Strong quantitative reasoning and real-time information; verify compliance requirements first.

education

Strong explanations at low cost; content controls are lighter-touch than peers.

creative writing

Distinctive voice and strong creative range; fewer content restrictions than competitors.