DeepSeek

DeepSeek is a Chinese AI lab that became internationally prominent in January 2025 with the release of DeepSeek R1, an open-weight reasoning model that demonstrated near-state-of-the-art performance at a claimed training cost dramatically lower than US frontier labs.

Key Models

  • DeepSeek R1 (January 2025) — RLVR-trained (reinforcement learning with verifiable rewards) reasoning model; publicly demonstrated RLVR’s scaling properties (see the first sketch below); triggered a wave of Chinese open-weight releases; caused a significant market reaction (Nvidia stock dropped ~17% on the implied inference efficiency)
  • DeepSeek V3 — the base model R1 was trained from; mixture-of-experts architecture (~671B total parameters, ~37B active per token; see the second sketch below); Multi-Head Latent Attention (MLA) for KV-cache compression
  • Both released as open weights under permissive licences (fewer restrictions than Meta’s Llama licence, which imposes separate terms above ~700M monthly active users)
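
For readers unfamiliar with RLVR: the reward comes from an automatic check against a known-correct answer, not from a learned reward model. A minimal sketch in Python, using an illustrative boxed-answer convention rather than DeepSeek’s actual training code:

```python
# Minimal sketch of a verifiable reward, the "VR" in RLVR.
# The boxed-answer format and function names are illustrative,
# not DeepSeek's actual pipeline.

import re

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Reward 1.0 iff the model's final boxed answer exactly matches
    the known-correct answer; no learned reward model is involved."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0  # malformed output earns no reward
    return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0
```

This binary signal then drives a policy-gradient update over sampled reasoning traces (the R1 report describes using GRPO for this step).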
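
And a toy illustration of why ~671B total parameters can coexist with ~37B active per token: in a mixture-of-experts layer, a router sends each token to only k of n experts, so most expert weights sit idle on any given forward pass. Sizes and expert counts below are illustrative, not V3’s actual configuration (MLA’s KV-cache compression is a separate mechanism and is not sketched here):

```python
# Toy top-k mixture-of-experts routing. With k=2 of 16 experts,
# each token activates ~1/8 of the expert parameters, mirroring
# how V3's ~37B active sit inside ~671B total.

import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=16, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))
        self.k = k

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                     # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # pick k experts/token
        weights = weights.softmax(dim=-1)           # normalise over the k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens routed here
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE()
y = moe(torch.randn(8, 64))  # 8 tokens; each touches only 2 of 16 experts
```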

Geopolitical Significance

R1’s release was the defining geopolitical AI moment of 2025, per nathan-lambert and sebastian-raschka (fridman-lambert-raschka-2026-state-of-ai). The key signals:

  1. Chinese labs could train competitive models on constrained hardware (Nvidia H800s rather than H100s, due to US export controls)
  2. The cost-efficiency claims implied that RLVR training might be accessible at scale without gigawatt-scale infrastructure
  3. The open release of weights enabled global researchers to study and build on the architecture

Status by Early 2026

By early 2026, DeepSeek was losing its open-weight crown to Z.ai (GLM models), MiniMax, and Moonshot (Kimi K2 Thinking). Lambert and Raschka assess that there is no winner-takes-all scenario: differentiation comes from budget and hardware rather than from proprietary ideas, since researchers rotate between labs frequently.


Source: fridman-lambert-raschka-2026-state-of-ai