DeepSeek

DeepSeek is a Chinese AI lab that became internationally prominent in January 2025 with the release of DeepSeek R1, an open-weight reasoning model that demonstrated near-state-of-the-art performance at a claimed training cost dramatically lower than US frontier labs.

Key Models

  • DeepSeek R1 (January 2025) — RLVR-trained (reinforcement learning with verifiable rewards) reasoning model; publicly demonstrated RLVR’s scaling properties (see the first sketch below); triggered a wave of Chinese open-weight releases; caused a significant market reaction (Nvidia stock dropped ~17% on the implied inference efficiency)
  • DeepSeek V3 — the base model R1 was trained from; mixture-of-experts architecture (~671B total parameters, ~37B active per token; see the second sketch below); Multi-Head Latent Attention (MLA) for KV-cache compression
  • Both released as open weights under permissive licences (fewer restrictions than Meta’s Llama licence, which imposes separate terms above ~700M monthly active users)
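
For readers unfamiliar with RLVR: the reward comes from an automatic check against a known-correct answer, not from a learned reward model. A minimal sketch in Python, using an illustrative boxed-answer convention rather than DeepSeek’s actual training code:

```python
# Minimal sketch of a verifiable reward, the "VR" in RLVR.
# The boxed-answer format and function names are illustrative,
# not DeepSeek's actual pipeline.

import re

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Reward 1.0 iff the model's final boxed answer exactly matches
    the known-correct answer; no learned reward model is involved."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0  # malformed output earns no reward
    return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0
```

This binary signal then drives a policy-gradient update over sampled reasoning traces (the R1 report describes using GRPO for this step).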
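
And a toy illustration of why ~671B total parameters can coexist with ~37B active per token: in a mixture-of-experts layer, a router sends each token to only k of n experts, so most expert weights sit idle on any given forward pass. Sizes and expert counts below are illustrative, not V3’s actual configuration (MLA’s KV-cache compression is a separate mechanism and is not sketched here):

```python
# Toy top-k mixture-of-experts routing. With k=2 of 16 experts,
# each token activates ~1/8 of the expert parameters, mirroring
# how V3's ~37B active sit inside ~671B total.

import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=16, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))
        self.k = k

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                     # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # pick k experts/token
        weights = weights.softmax(dim=-1)           # normalise over the k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens routed here
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE()
y = moe(torch.randn(8, 64))  # 8 tokens; each touches only 2 of 16 experts
```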

Geopolitical Significance

R1’s release was the defining geopolitical AI moment of 2025, per nathan-lambert and sebastian-raschka (fridman-lambert-raschka-2026-state-of-ai). The key signals:

  1. Chinese labs could train competitive models on constrained hardware (Nvidia H800s rather than H100s, due to US export controls)
  2. The cost-efficiency claims implied that RLVR training might be accessible at scale without gigawatt-scale infrastructure
  3. The open release of weights enabled global researchers to study and build on the architecture

Status by Early 2026

By early 2026, DeepSeek was losing its open-weight crown to Z.ai (GLM models), MiniMax, and Moonshot (Kimi K2 Thinking). Lambert and Raschka assess that there is no winner-takes-all scenario: differentiation comes from budget and hardware rather than from proprietary ideas, since researchers rotate between labs frequently.


Source: fridman-lambert-raschka-2026-state-of-ai