DeepSeek
DeepSeek is a Chinese AI lab that became internationally prominent in January 2025 with the release of DeepSeek R1, an open-weight reasoning model that demonstrated near-state-of-the-art performance at a claimed training cost dramatically lower than US frontier labs.
Key Models
- DeepSeek R1 (January 2025) — reasoning model trained with reinforcement learning from verifiable rewards (RLVR); demonstrated RLVR's scaling properties publicly; triggered a wave of Chinese open-weight releases; caused a significant market reaction (Nvidia stock dropped ~17% on the implied inference efficiency)
- DeepSeek V3 — base model; mixture-of-experts architecture (~671B total parameters, ~37B active per token); Multi-Head Latent Attention (MLA) for KV cache compression
- Open-weight releases under permissive licences (fewer restrictions than Meta's Llama licence, which imposes a user cap on very large deployments)
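The two architectural points above can be made concrete with arithmetic. A minimal sketch, using the ~671B/~37B figures from the note; the layer count, head dimensions, context length, and MLA latent dimension below are round illustrative numbers, not exact DeepSeek-V3 configs:

```python
# Illustrative sketch: why sparse MoE and MLA cut compute and memory.
# Parameter counts come from the note; other dimensions are assumed round numbers.

def moe_active_fraction(total_params: float, active_params: float) -> float:
    """Fraction of weights touched per token in a sparse mixture-of-experts model."""
    return active_params / total_params

# ~671B total, ~37B active per token -> only ~5.5% of weights run per token.
print(f"active fraction: {moe_active_fraction(671e9, 37e9):.1%}")

def kv_cache_bytes(layers: int, tokens: int, per_token_dim: int,
                   bytes_per_val: int = 2) -> int:
    """KV cache size: each layer stores per_token_dim values per cached token."""
    return layers * tokens * per_token_dim * bytes_per_val

# Standard multi-head attention caches full K and V vectors:
#   per_token_dim = 2 * n_heads * head_dim  (illustrative: 128 heads, dim 128)
mha = kv_cache_bytes(layers=60, tokens=32_768, per_token_dim=2 * 128 * 128)
# MLA instead caches one small compressed latent per token (illustrative dim 576):
mla = kv_cache_bytes(layers=60, tokens=32_768, per_token_dim=576)
print(f"MHA cache: {mha / 2**30:.1f} GiB vs MLA cache: {mla / 2**30:.1f} GiB")
```

Under these assumed dimensions the latent cache is roughly 50x smaller, which is the point of MLA: long-context inference stops being KV-cache-bound.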
Geopolitical Significance
R1’s release was the defining geopolitical AI moment of 2025, per nathan-lambert and sebastian-raschka (fridman-lambert-raschka-2026-state-of-ai). The key signals:
- Chinese labs could train competitive models on constrained hardware (Nvidia H800s instead of H100s, a consequence of US export controls)
- The cost efficiency claims implied that RLVR training might be accessible at scale without gigawatt-scale infrastructure
- The open release of weights enabled global researchers to study and build on the architecture
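What makes RLVR cheap to scale is its core ingredient: a reward that a program can check, rather than a learned reward model. A minimal sketch; the `\boxed{...}` answer convention and exact-match check are assumptions for illustration, since real pipelines use task-specific parsers plus format rewards:

```python
# Minimal sketch of a verifiable reward in the RLVR style R1 popularized.
# The boxed-answer extraction convention is an illustrative assumption.
import re

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward: 1.0 iff the final \\boxed{...} answer matches the target."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0  # no parseable answer -> no reward
    return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0

print(verifiable_reward(r"... so the answer is \boxed{42}", "42"))  # 1.0
print(verifiable_reward("I think it's 42", "42"))                   # 0.0
```

Because the reward is a cheap deterministic function rather than another large model, the dominant cost is rollout generation, which is part of why the approach looked accessible without gigawatt-scale infrastructure.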
Status by Early 2026
By early 2026, DeepSeek was losing its open-weight crown to Z.ai (GLM models), MiniMax, and Moonshot (Kimi K2 Thinking). Lambert and Raschka assess that there is no winner-takes-all scenario: differentiation comes from budget and hardware rather than proprietary ideas, since researchers rotate between labs frequently.