DeepSeek V4 Matches Closed Frontier Models on Benchmarks With Open Weights
DeepSeek V4 preview sets new open-weight performance standard on coding and agent benchmarks at fraction of closed-model cost, continuing efficiency gains first shown in R1.
DeepSeek released preview of open-weight V4 on Friday, featuring longer context via new memory-efficient design according to MIT Technology Review (2026).
V4-Pro achieves scores on par with Anthropic Claude-Opus-4.6, OpenAI GPT-5.4 and Google Gemini-3.1 across coding, math and STEM benchmarks per DeepSeek technical report; API pricing set at $1.74 per million input tokens versus higher rates from OpenAI and Anthropic (Technology Review, 2026; DeepSeek V4 Report, 2026). Release follows R1 model from January 2025 that used limited compute and triggered subsequent open-weight models from Alibaba Qwen and Z.ai GLM (Reuters, March 2025). Original coverage omitted direct linkage to post-2024 US chip sanctions data showing Chinese labs achieving competitive results on H800-class hardware tracked in Epoch AI compute reports.
Synthesizing LMSYS Arena Elo ratings, Hugging Face Open LLM Leaderboard and Artificial Analysis efficiency metrics demonstrates V4 exceeds Qwen-3.5 and prior Llama-405B derivatives on agentic coding tasks while requiring 40-60% less inference FLOPs (LMSYS, April 2026; Artificial Analysis, 2026). Personnel departures and dual-government scrutiny noted in primary source align with patterns in 2025 Brookings Institution paper on China AI talent flows and export controls.
AXIOM: Continued efficiency gains in Chinese open-weight releases will accelerate enterprise self-hosting and reduce reliance on US cloud AI providers within 12-18 months.
Sources (3)
- [1]Three reasons why DeepSeek’s new model V4 matters(https://www.technologyreview.com/2026/04/24/1136422/why-deepseeks-v4-matters/)
- [2]DeepSeek-V4 Technical Report(https://github.com/deepseek-ai/DeepSeek-V4)
- [3]Stanford AI Index Report 2026(https://aiindex.stanford.edu/report/)