Evaluation record · deepseek-v3-2

DeepSeek-V3.2

v20251201

DeepSeek

Modelopen-sourcemit-licensereasoningsparse-attention

Strong

About This Model

DeepSeek's ~685B-parameter MoE flagship with DeepSeek Sparse Attention (DSA) for dramatically cheaper long-context inference. The V3.2-Speciale variant reached IMO 2025 gold-medal level (35/42) and 96.0% AIME. MIT-licensed open weights; the dominant open model through early 2026 until superseded by DeepSeek-V4.

Last Evaluated: July 9, 2026

Official Website

Trust Vector Analysis

Dimension Breakdown

🚀Performance & Reliability

Frontier-level reasoning for an open model: V3.2-Speciale hit IMO 2025 gold-medal level and 2nd at ICPC World Finals. DSA makes 128K-context workloads unusually cheap. Speciale's dedicated API endpoint was temporary, but its weights remain open.

task accuracy code

Industry-standard coding benchmarks and competitive programming results from the official release and technical report

Evidence

DeepSeek-V3.2 Release Notes — Strong coding performance across SWE-bench and competitive programming; V3.2-Speciale placed 2nd at ICPC World Finals level

DeepSeek-V3.2 Technical Report — DSA architecture maintains coding accuracy while cutting long-context cost

highVerified: 2026-07-09

task accuracy reasoning

Olympiad-level mathematics and competition benchmarks (IMO, AIME, ICPC) reported at release and corroborated by the technical report

Evidence

DeepSeek-V3.2 Release Notes — V3.2-Speciale reached IMO 2025 gold-medal level (35/42) and 96.0% on AIME

DeepSeek-V3.2 Technical Report — Frontier-level mathematical and olympiad reasoning documented in the technical report

highVerified: 2026-07-09

task accuracy general

Comprehensive knowledge and instruction-following benchmark review from the technical report and community leaderboards

Evidence

DeepSeek-V3.2 Technical Report — Strong general knowledge and instruction-following results; best open model on most aggregate leaderboards through early 2026

highVerified: 2026-07-09

output consistency

Repeated-prompt testing across temperature settings and context lengths, supplemented by community reports

Evidence

DeepSeek API Documentation — Stable outputs across thinking and non-thinking modes; sparse attention introduces minor variance on very long contexts

mediumVerified: 2026-07-09

latency p50

Median latency for API requests with standard prompt sizes from independent benchmarking

Evidence

Artificial Analysis — Typical first-response latency ~1.8s on the first-party API; DSA keeps long-context latency near-flat

mediumVerified: 2026-07-09

latency p95

95th percentile response time across diverse workloads from independent benchmarking

Evidence

Artificial Analysis — p95 latency ~4.2s; reasoning mode substantially longer due to thinking tokens

mediumVerified: 2026-07-09

context window

Official specification from provider

Evidence

DeepSeek API Documentation — 128K context with DeepSeek Sparse Attention making long-context inference substantially cheaper

highVerified: 2026-07-09

uptime

Historical uptime data from the official status page plus availability of multiple third-party hosts

Evidence

DeepSeek Status — ~99% uptime on first-party API; self-hosted and third-party deployments (Together, Fireworks, etc.) offer independent availability

mediumVerified: 2026-07-09

🛡️Security

Adequate default guardrails. As with all open-weight models, safety properties only hold for unmodified weights; self-hosting shifts security responsibility to the deployer.

prompt injection resistance

Testing against OWASP LLM01 prompt injection attack patterns and community red-team reports

Evidence

Community red-team evaluations — Reasonable resistance to common injection patterns; weaker than frontier proprietary models on indirect injection

mediumVerified: 2026-07-09

jailbreak resistance

Testing against adversarial prompt datasets; assessment accounts for open-weight modifiability

Evidence

Independent safety evaluations — Standard RLHF-based guardrails; open weights mean alignment can be removed by downstream fine-tuners

mediumVerified: 2026-07-09

data leakage prevention

Analysis of privacy policy for the hosted API plus the self-hosting option for full data isolation

Evidence

DeepSeek Privacy Policy — Standard data handling on first-party API; self-hosting gives organizations complete data control

mediumVerified: 2026-07-09

output safety

Safety testing across harmful content categories on default weights

Evidence

DeepSeek-V3.2 Technical Report — Safety post-training applied; refusal behavior comparable to prior DeepSeek releases

mediumVerified: 2026-07-09

api security

Review of API security features and transport guarantees

Evidence

DeepSeek API Documentation — API key authentication, HTTPS-only transport, rate limiting on the first-party platform

highVerified: 2026-07-09

🔒Privacy & Compliance

Standard open-model split: the first-party API is China-hosted with China-jurisdiction data residency, while self-hosting or Western third-party hosts (which most regulated enterprises use) avoid that concern entirely.

data residency

Review of privacy policy and hosting options; China-jurisdiction caveat applies only to the first-party API

Evidence

DeepSeek Privacy Policy — First-party API data is processed and stored on servers in China; MIT-licensed weights allow self-hosting in any jurisdiction

highVerified: 2026-07-09

training data optout

Analysis of privacy policy and data usage terms for the hosted API

Evidence

DeepSeek Privacy Policy — API data usage terms documented; self-hosting eliminates the concern entirely

mediumVerified: 2026-07-09

data retention

Review of terms of service; retention is deployment-dependent for open-weight models

Evidence

DeepSeek Terms of Service — First-party API retains data per Chinese regulatory requirements; self-hosted deployments retain nothing externally

mediumVerified: 2026-07-09

pii handling

Review of data protection capabilities and customer responsibilities

Evidence

DeepSeek Platform Documentation — No built-in PII redaction tooling; customer responsible for PII handling on any deployment

mediumVerified: 2026-07-09

compliance certifications

Verification of certifications for the first-party platform; third-party hosted options inherit their providers' certifications

Evidence

DeepSeek Platform — No SOC 2, HIPAA, or FedRAMP for the first-party API; compliant deployments achievable via certified Western hosts (AWS, Azure, Together) or self-hosting

mediumVerified: 2026-07-09

zero data retention

Review of data handling across first-party API, third-party hosts, and self-hosting

Evidence

Open-weight deployment options — No zero-retention option on the first-party API, but self-hosting provides true zero external retention

mediumVerified: 2026-07-09

👁️Trust & Transparency

Strong architectural transparency (open weights, detailed DSA technical report, visible reasoning traces) offset by limited training-data disclosure and known topic-avoidance on politically sensitive subjects.

explainability

Evaluation of reasoning transparency and trace accessibility

Evidence

DeepSeek-V3.2 Release Notes — Visible chain-of-thought in thinking mode; reasoning traces fully inspectable on self-hosted deployments

highVerified: 2026-07-09

hallucination rate

Testing on factual QA datasets and community evaluations

Evidence

Community factuality testing — Moderate hallucination rate, improved over V3.1; reasoning mode reduces factual errors on multi-step tasks

mediumVerified: 2026-07-09

bias fairness

Evaluation on bias benchmarks and politically sensitive topic probes

Evidence

Independent bias evaluations — Basic bias mitigation; known topic-avoidance behavior on China-politically-sensitive subjects in default weights

mediumVerified: 2026-07-09

uncertainty quantification

Qualitative assessment of confidence expression in outputs

Evidence

Model behavior assessment — Expresses uncertainty in reasoning traces, though final answers can be overconfident

mediumVerified: 2026-07-09

model card quality

Review of technical report and model card completeness

Evidence

DeepSeek-V3.2 Technical Report — Detailed technical report covering DSA architecture, training methodology, and benchmark results, plus full open weights

highVerified: 2026-07-09

training data transparency

Review of public disclosures about training data

Evidence

DeepSeek-V3.2 Technical Report — Training methodology well documented; dataset composition described only at a high level

mediumVerified: 2026-07-09

guardrails

Analysis of built-in safety mechanisms in default weights

Evidence

DeepSeek Safety Documentation — Standard alignment guardrails in released weights; removable by downstream fine-tuning

mediumVerified: 2026-07-09

⚙️Operational Excellence

Mature ecosystem with broad third-party hosting. Main operational caveat is DeepSeek's rapid release cadence: V3.2 supersedes V3.1/V3-0324 and the standalone R1 line, and was itself superseded by V4 in April 2026.

api design quality

Review of API design, consistency, and feature completeness

Evidence

DeepSeek API Documentation — OpenAI-compatible API with thinking/non-thinking modes, function calling, and context caching

highVerified: 2026-07-09

sdk quality

Review of SDK compatibility and inference-framework support

Evidence

DeepSeek GitHub — OpenAI-SDK compatibility plus first-class support in vLLM and SGLang for self-hosting

highVerified: 2026-07-09

versioning policy

Review of versioning practices and historical endpoint lifecycle

Evidence

DeepSeek API News — Rapid model turnover; the temporary V3.2-Speciale endpoint and short deprecation windows require deployment agility

mediumVerified: 2026-07-09

monitoring observability

Review of monitoring tools across deployment options

Evidence

DeepSeek Platform — Usage dashboard with token metrics; full observability available when self-hosting

mediumVerified: 2026-07-09

support quality

Assessment of documentation, community, and support responsiveness

Evidence

DeepSeek Support Channels — Community and email support only; no enterprise SLA on the first-party platform

mediumVerified: 2026-07-09

ecosystem maturity

Analysis of third-party hosting, integrations, and community adoption

Evidence

Hugging Face / hosting ecosystem — Dominant open model through early 2026: hosted by Together, Fireworks, AWS Bedrock and others; broad fine-tune and tooling ecosystem

highVerified: 2026-07-09

license terms

Review of licensing terms and restrictions

Evidence

DeepSeek-V3.2 License — MIT license: unrestricted commercial use, modification, and redistribution

highVerified: 2026-07-09

Strengths

+Frontier open-model reasoning: V3.2-Speciale at IMO 2025 gold-medal level (35/42), 96.0% AIME, 2nd at ICPC World Finals
+DeepSeek Sparse Attention (DSA) makes 128K long-context inference dramatically cheaper
+MIT license with full ~685B MoE weights: unrestricted commercial use and self-hosting
+Dominant open model through early 2026 with broad third-party hosting (Together, Fireworks, AWS Bedrock)
+Detailed technical report and visible chain-of-thought reasoning
+Very low first-party API pricing

Limitations

!First-party API is China-hosted: China-jurisdiction data residency and no SOC 2/HIPAA/FedRAMP (self-hosting or Western hosts avoid this)
!Superseded by DeepSeek-V4 (April 2026); no longer the frontier open model
!V3.2-Speciale API endpoint was temporary; accessing Speciale now requires self-hosting its weights
!Text-only: no vision or audio modalities
!Topic-avoidance behavior on politically sensitive subjects in default weights
!~685B parameters demand substantial multi-GPU infrastructure to self-host
!No enterprise SLA or dedicated support on the first-party platform

Metadata

pricing

input: $0.28 per 1M tokens (first-party API, cache miss)

output: $0.42 per 1M tokens (first-party API)

notes: DSA-driven price cut at launch made long-context usage exceptionally cheap; context caching discounts cache hits further. Confirmed July 2026: legacy first-party endpoints are removed 2026-07-24 (15:59 UTC) in favor of V4, after which V3.2 is served via third-party hosts or self-hosting only. Self-hosting cost is infrastructure-only under MIT license.

last verified: 2026-07-09

context window: 128000

max output: 64000

languages

0: English

1: Chinese

2: Japanese

3: Korean

4: Spanish

5: French

6: German

7: Portuguese

8: Russian

modalities

0: text

api endpoint: https://api.deepseek.com/v1/chat/completions

open source: true

architecture: ~685B-parameter Mixture-of-Experts with DeepSeek Sparse Attention (DSA); thinking and non-thinking modes; V3.2-Speciale reasoning-specialized variant

parameters: ~685B total (MoE)

knowledge cutoff: Mid 2025

Use Case Ratings

code generation

Excellent coding at exceptional cost; ICPC World Finals 2nd-place pedigree via Speciale. Best open coding value of its generation.

customer support

Capable and cheap for support workloads; reasoning mode unnecessary overhead for simple tickets.

content creation

Solid long-form generation; prose style less polished than frontier proprietary models.

data analysis

Strong analytical reasoning with cheap 128K context thanks to DSA; excellent for large-document analysis on a budget.

research assistant

Frontier-level mathematical and scientific reasoning (IMO gold-level via Speciale) with inspectable reasoning traces.

legal compliance

First-party API is China-hosted with no Western certifications; viable only via self-hosting or certified third-party hosts.

healthcare

No HIPAA path on the first-party API. Self-hosted deployments in compliant infrastructure are the only viable route.

financial analysis

Excellent quantitative reasoning at low cost; data-residency planning required for regulated workloads.

education

Outstanding math tutoring capability with visible step-by-step reasoning at prices viable for education budgets.

creative writing

Competent but not a creative standout; reasoning strength does not translate to distinctive prose.

Similar Models

DeepSeek-V4

DeepSeek

DeepSeek-R1

DeepSeek

DeepSeek V3 0324

DeepSeek

Qwen3.5

Alibaba

Claude Opus 4.5

Anthropic