Gemma 4
v4.0Google's open-weight family released April 2026 under Apache 2.0 (a shift from the custom Gemma license). Spans E2B/E4B edge models with 128K context and native audio up to a 31B dense model with 256K context. The 31B scores ~1452 on LMArena, No. 3 among open models.
Trust Vector Analysis
Dimension Breakdown
🚀Performance & Reliability+
Strongest open-weight showing from Google to date: 31B at ~1452 LMArena (No. 3 open). MoE 26B-A4B offers near-dense quality at 4B active params. Performance below proprietary frontier but excellent per-parameter efficiency.
Vendor-reported coding benchmarks compared against open-weight peer class
Reasoning benchmark review from launch materials and open-model leaderboards
Crowdsourced human preference rankings on LMArena
Community reports across deployment stacks; high variance by quantization level
Self-hosted model; latency is a function of deployer infrastructure
Official specification from launch announcement
No single provider SLA; assessed as deployment-dependent
🛡️Security+
Security profile is deployment-dependent: excellent data isolation when self-hosted, but guardrails are removable and there is no managed abuse filtering unless the deployer adds it (e.g., ShieldGemma, Vertex AI).
OWASP LLM01 assessment relative to model class; deployer must add input filtering
Adversarial testing of instruction-tuned checkpoints; open weights inherently allow guardrail removal
Architectural assessment: no third-party data flow when self-hosted
Safety testing of released checkpoints and available companion classifiers
Assessment of typical self-hosted serving stacks vs managed alternatives
🔒Privacy & Compliance+
Best-in-class data sovereignty: nothing leaves deployer infrastructure. The trade-off is that compliance certifications are not inherited from the model and must be built or bought by the deployer.
Architectural assessment of self-hosted deployment
Architectural assessment: inference data never leaves deployer
Architectural assessment
Data flow analysis for self-hosted inference
Review of certification inheritance paths for open-weight deployments
Architectural assessment
👁️Trust & Transparency+
High transparency by open-model standards: published technical report, architecture disclosure (including MoE active-parameter counts), and fully auditable weights. Apache 2.0 relicensing further reduces legal opacity.
Assessment of inspection capabilities afforded by open weights
Factual QA testing relative to model size class
Model card review and independent audit availability
Calibration assessment; logprob access partially offsets weaker verbal uncertainty
Documentation completeness review
Public disclosure review against open-model norms
Analysis of built-in and companion safety mechanisms
⚙️Operational Excellence+
Apache 2.0 relicensing is the headline trust improvement: prior Gemma generations carried custom-license use restrictions. Operational burden (monitoring, scaling, support) falls on the deployer, as with any open-weight model.
Review of available serving interfaces and their consistency
Ecosystem tooling support assessment
Release cadence and immutability review
Assessment of out-of-box observability versus managed APIs
Support channel assessment for open-weight distribution
Third-party integration and adoption analysis
License analysis; Apache 2.0 is OSI-approved with no usage restrictions
- +Apache 2.0 license — removes custom-license restrictions of prior Gemma generations
- +Top-3 open model: 31B at ~1452 LMArena Elo
- +Efficient MoE: 26B-A4B reaches ~1441 Elo with only 4B active parameters
- +Full data sovereignty: self-hosted inference, zero data leaves deployer
- +Edge-capable E2B/E4B variants with 128K context and native audio
- +256K context on 12B/26B/31B variants — large for open weights
- +Multimodal input: text, image, and video
- !No inherited compliance certifications; deployer builds or buys SOC 2/HIPAA posture
- !Safety guardrails removable via fine-tuning (inherent to open weights)
- !No first-party SLA or managed support outside Vertex AI hosting
- !Hallucination and reasoning depth below frontier hosted models, especially E2B/E4B
- !Operational burden (serving, scaling, monitoring) falls on deployer
- !Performance varies significantly with quantization choices
Use Case Ratings
code generation
Capable for an open model, especially 31B with 256K context, but well below frontier proprietary coding models.
customer support
26B-A4B MoE (4B active) gives strong quality at low serving cost for high-volume support; E4B enables on-device assistants.
content creation
Solid drafting quality at 31B (~1452 LMArena); fully private content pipelines possible.
education
E2B/E4B with native audio enable offline, on-device tutoring in low-connectivity settings.
healthcare
Self-hosting suits strict data sovereignty (PHI never leaves infrastructure), but deployer carries the full compliance and accuracy-validation burden.
research assistant
256K context on 31B handles long documents; auditable weights suit reproducible research. Reasoning depth below frontier.