Automatic Differentiation

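Automatic differentiation (autodiff, often called autograd after the libraries that implement it) is a software technique that computes derivatives of functions defined as computer programs, exact up to floating-point rounding. The three main approaches to computing derivatives compare as follows: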

Numerical differentiation: approximates f'(x) ≈ (f(x+h) − f(x)) / h; subject to truncation and round-off error, and needs at least one extra function evaluation per input, so it is slow for many parameters
Symbolic differentiation: algebraically manipulates expressions; produces exact but often enormous formulas
Automatic differentiation: records the sequence of primitive operations at runtime and applies the chain rule to them exactly and efficiently
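The accuracy limits of the numerical approach can be seen in a small experiment. The sketch below is illustrative only: the test function exp(x), the evaluation point, and the step sizes are assumptions chosen for convenience, not taken from any source. It compares the forward-difference estimate against the known derivative for several values of h.

```python
import math

def forward_difference(func, x, h):
    """Forward-difference estimate of func'(x) with step size h."""
    return (func(x + h) - func(x)) / h

x0 = 1.0
exact = math.exp(x0)  # d/dx exp(x) = exp(x), so the true derivative is known

# Absolute error of the estimate for a range of step sizes (hypothetical choices).
errors = {}
for h in (1e-1, 1e-4, 1e-8, 1e-13):
    errors[h] = abs(forward_difference(math.exp, x0, h) - exact)
    print(f"h = {h:g}   abs error = {errors[h]:.3e}")
```

Shrinking h first reduces the truncation error, but for very small h the subtraction f(x+h) − f(x) loses most of its significant digits and the error grows again. Automatic differentiation avoids this trade-off because it never takes a difference quotient.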

How It Works (Reverse-Mode)

Reverse-mode autodiff (used in neural network training) works in two passes:

Forward pass: Evaluate the function, constructing a computational graph that records every primitive operation and its inputs.
Backward pass (backpropagation): Starting from the output node with gradient 1, traverse the graph in reverse topological order. At each node, multiply the local gradient by the incoming gradient and add the result to the gradients of the node's inputs, so that every node ends up holding ∂output/∂node.
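The two passes can be traced by hand for a small function. The following is a minimal sketch, assuming the example function f(x, y) = x*y + sin(x). The function and all variable names are chosen here for illustration, and the graph is written out as straight-line code rather than built dynamically.

```python
import math

def f_with_grad(x, y):
    # Forward pass: evaluate each primitive and keep the intermediates.
    a = x * y
    b = math.sin(x)
    out = a + b

    # Backward pass: seed the output with gradient 1, then visit the
    # primitives in reverse order, multiplying local by incoming gradients.
    d_out = 1.0
    d_a = d_out * 1.0          # out = a + b  ->  d(out)/d(a) = 1
    d_b = d_out * 1.0          # out = a + b  ->  d(out)/d(b) = 1
    d_x = d_b * math.cos(x)    # b = sin(x)   ->  d(b)/d(x) = cos(x)
    d_x += d_a * y             # a = x * y    ->  x is used twice, so accumulate
    d_y = d_a * x              # a = x * y    ->  d(a)/d(y) = x
    return out, d_x, d_y

out, dx, dy = f_with_grad(2.0, 3.0)
print(out, dx, dy)
```

Because x feeds into two operations, its gradient is the sum of the contributions from both paths, which is why the backward pass accumulates with += rather than assigning.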

For a function with n inputs and 1 output, one backward pass computes all n partial derivatives at once, which is essential for training neural networks with millions of weights.
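To make the cost argument concrete, the sketch below contrasts the two approaches on a hypothetical function, the sum of squares of n inputs. The function, the helper names, and the step size are all assumptions for illustration. Finite differences need one evaluation per input plus a baseline, whereas the hand-written reverse sweep obtains every partial derivative from one forward and one backward pass.

```python
def sum_of_squares(w):
    return sum(wi * wi for wi in w)

def finite_difference_gradient(func, w, h=1e-6):
    """Approximate the gradient; returns (gradient, number of func evaluations)."""
    base = func(w)
    evaluations = 1
    grad = []
    for i in range(len(w)):
        bumped = list(w)
        bumped[i] += h                      # perturb one input at a time
        grad.append((func(bumped) - base) / h)
        evaluations += 1
    return grad, evaluations

def reverse_mode_gradient(w):
    """One forward pass and one backward pass, written out by hand."""
    # Forward pass: record the intermediate squares.
    squares = [wi * wi for wi in w]
    out = sum(squares)
    # Backward pass: seed with 1, then push gradients back to every input.
    d_out = 1.0
    d_squares = [d_out for _ in squares]    # out = sum(squares)
    d_w = [2.0 * wi * ds for wi, ds in zip(w, d_squares)]  # square_i = w_i * w_i
    return out, d_w

w = [0.5, -1.0, 2.0, 3.0]
fd_grad, fd_evals = finite_difference_gradient(sum_of_squares, w)
out, rm_grad = reverse_mode_gradient(w)
print(fd_evals, fd_grad)
print(out, rm_grad)
```

With four inputs the finite-difference version already evaluates the function five times. With millions of weights, that linear growth is what makes it impractical.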

The Value Object Pattern

micrograd implements autograd using a Value class. Each Value:

Holds a scalar data and a scalar grad (initialised to 0)
Stores a reference to its _backward function (the local gradient rule)
Maintains pointers to its child Value objects in the graph, i.e. the operands it was computed from

When an operation is performed, a new Value is returned whose _backward closure knows how to push gradients to the operand Values.
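The pattern can be sketched in a few lines. This is a simplified illustration in the spirit of micrograd, not a copy of its source: only addition and multiplication are supported, and names other than data, grad, and _backward (for example _prev and _children) are assumptions made for this sketch.

```python
class Value:
    """Scalar node in a computation graph (simplified sketch)."""

    def __init__(self, data, _children=()):
        self.data = float(data)
        self.grad = 0.0
        self._backward = lambda: None    # leaf nodes have nothing to propagate
        self._prev = set(_children)      # operand Values this node was built from

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(out)/d(self) = d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(out)/d(self) = other.data, d(out)/d(other) = self.data
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Order nodes so each appears after all of its operands, then walk
        # that order in reverse, starting from this node with gradient 1.
        topo, visited = [], set()

        def build(node):
            if node not in visited:
                visited.add(node)
                for child in node._prev:
                    build(child)
                topo.append(node)

        build(self)
        self.grad = 1.0
        for node in reversed(topo):
            node._backward()


a, b, c = Value(2.0), Value(-3.0), Value(10.0)
d = a * b + c * a      # a is used twice, so its gradient accumulates
d.backward()
print(d.data, a.grad, b.grad, c.grad)
```

Each closure accumulates with += so that a Value used in several operations receives the sum of all its gradient contributions. The reverse topological order ensures that a node's grad is complete before it is pushed on to its operands.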

Implementations

Library	Scale	Notes
micrograd	Scalar	Pedagogical; engine is roughly 100 lines; by Andrej Karpathy
PyTorch	Tensor	Production; dynamic graph; dominant in research
JAX	Tensor	Functional; XLA-compiled; used at Google DeepMind
TensorFlow	Tensor	Static graph option; production deployment

Sources

karpathy-2022-micrograd-backpropagation — builds an autograd engine from scratch
