AI Factory

AI factory is jensen-huang’s term for the modern AI data centre, framed as a production facility that manufactures tokens (AI-generated outputs) rather than a warehouse that stores and retrieves pre-written files.

The Warehouse → Factory Transition

Traditional computing: store pre-written content, then retrieve it with search and recommendation. Economics: storage is the primary cost; retrieval is cheap.

AI computing: contextually aware, generative. Every query requires processing and generation of tokens in real time. Economics: compute is the primary cost; token production is the product.

Huang: “Warehouses don’t make much money. Factories directly correlate with the company’s revenues.” The implication is that the total addressable market for AI compute grows far faster than for traditional storage-oriented compute, because factories generate revenues, not just costs.

Token Segmentation — The iPhone Analogy

Tokens are not a single commodity. Huang predicts a tiered market:

  • Free tokens — consumer-grade AI responses
  • Mid-tier tokens — business applications
  • Premium tokens — specialised, high-stakes inference (e.g., drug discovery, legal, financial analysis) — Huang suggests $1,000/million tokens is “just around the corner”

This mirrors smartphone economics: free apps → paid apps → enterprise software, all running on the same hardware platform. openclaw-style agentic-ai systems are the “iPhone moment” that makes the token economy legible to everyday users.
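The premium-tier figure is easier to feel with back-of-envelope arithmetic. A minimal sketch, assuming a hypothetical 50,000-token premium query (only the $1,000-per-million-token rate comes from Huang's prediction; the query size is an illustrative assumption):

```python
# Illustrative arithmetic only: the $1,000-per-million-token premium rate is
# Huang's prediction; the per-query token count below is an assumption.
PREMIUM_RATE_PER_MILLION = 1_000.0  # USD per million tokens

def query_cost(tokens: int, rate_per_million: float) -> float:
    """Cost in USD of generating `tokens` at the given per-million rate."""
    return tokens / 1_000_000 * rate_per_million

# A hypothetical 50k-token drug-discovery analysis at the premium tier:
print(f"${query_cost(50_000, PREMIUM_RATE_PER_MILLION):.2f}")  # $50.00
```

At that rate, a single long inference run prices like a professional-services deliverable rather than a search query, which is the point of the tiering.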

Scale and Economics

nvidia estimates:

  • Token cost falls an order of magnitude per year (driven by tokens-per-watt efficiency gains)
  • GPU/rack prices rise, but token generation efficiency rises faster
  • Manufacturing capacity of 200 Vera Rubin pods per week is needed to meet projected demand
  • Each rack: 1.3–1.5 million components, 200 suppliers, 2–3 tonnes shipped pre-assembled
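The first bullet compounds quickly. A minimal sketch of a 10×-per-year decline (the $10-per-million starting price is an assumed illustration, not a figure from the source; only the order-of-magnitude annual drop is nvidia's estimate):

```python
# Compound effect of a 10x-per-year token-cost decline.
# The $10-per-million starting price is an arbitrary illustrative assumption;
# only the order-of-magnitude annual drop comes from nvidia's estimate.
start_cost = 10.0  # USD per million tokens, year 0 (assumed)
for year in range(4):
    print(f"year {year}: ${start_cost / 10**year:.4f} per million tokens")
```

Three years of this turns a $10 workload into a one-cent workload, which is how falling unit cost and rising rack prices can coexist in the same forecast.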

Huang projects GDP growth acceleration and a 100× increase in computing’s share of GDP, because compute shifts from cost centre to profit centre.

Physical Requirements

AI factories at scale imply:

  • Power: gigawatt-class facilities; Huang advocates using idle grid capacity (~60% headroom available 99% of the time) under graceful-degradation contracts rather than demanding six-nines uptime
  • Cooling: conduction and convection at rack scale; radiant cooling in space for satellite-based deployments
  • Supply chain: 200-supplier orchestration; supercomputer assembly now occurs in the supply chain, not the data centre, requiring gigawatts of build-and-test power in manufacturing
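The idle-headroom argument can be sketched numerically. A minimal sketch, where the 10 GW grid size is a hypothetical input (only the 60% headroom and 99% availability figures come from Huang's framing):

```python
# Back-of-envelope for the idle-headroom power model.
# GRID_CAPACITY_GW is a hypothetical regional grid size; the headroom and
# availability fractions are from Huang's framing in the source.
GRID_CAPACITY_GW = 10.0   # assumed grid size for illustration
HEADROOM_FRACTION = 0.60  # idle capacity available most of the time
AVAILABILITY = 0.99       # fraction of hours that headroom exists
HOURS_PER_YEAR = 8760

usable_gw = GRID_CAPACITY_GW * HEADROOM_FRACTION
energy_gwh_per_year = usable_gw * AVAILABILITY * HOURS_PER_YEAR
print(f"{usable_gw:.1f} GW usable, ~{energy_gwh_per_year:,.0f} GWh/yr "
      "if the factory curtails load during the other 1% of hours")
```

The design choice is to trade a small, contracted amount of downtime for access to capacity that would otherwise require new generation to be built.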

See rack-scale-computing for hardware architecture and cuda for the software layer.


Source: fridman-huang-2026-nvidia-ai-revolution