This demo trains a shallow neural network on noisy samples from a 2D surface. The full computational graph is too large to draw usefully,
so the focus here is on what reverse-mode autodiff is doing in practice: computing parameter gradients for a scalar loss and updating the weights.
See the graph-level auto-diff explanation and demo for the explicit node-by-node picture.
Use the buttons to build the neural network.
Loss: \(L(\theta) = \frac{1}{N}\sum_{i=1}^{N}(\hat{y}_i(\theta) - y_i)^2\)
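To make the gradient computation concrete, here is a minimal sketch of one way to compute the gradient of this loss and apply an update, written in plain Python with the backward pass spelled out by hand. It is not the demo's actual implementation. The architecture (one hidden tanh layer), the target surface \(\sin(x)\cos(y)\), the noise level, the learning rate, and all function names are assumptions chosen for illustration. A reverse-mode autodiff system would generate the backward pass automatically, whereas here the chain rule is applied manually, starting from \(\partial L / \partial \hat{y}_i = \frac{2}{N}(\hat{y}_i - y_i)\).

```python
import math
import random


def init_params(hidden, rng):
    # Hypothetical parameter layout: W1 (hidden x 2), b1 (hidden), w2 (hidden), b2 (scalar).
    return {
        "W1": [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "w2": [rng.uniform(-1, 1) for _ in range(hidden)],
        "b2": 0.0,
    }


def loss_and_grads(p, xs, ys):
    """Return the MSE loss and its gradient with respect to every parameter."""
    n, hidden = len(xs), len(p["w2"])
    g = {"W1": [[0.0, 0.0] for _ in range(hidden)], "b1": [0.0] * hidden,
         "w2": [0.0] * hidden, "b2": 0.0}
    loss = 0.0
    for x, y in zip(xs, ys):
        # Forward pass: keep the hidden activations for reuse in the backward pass.
        h = [math.tanh(p["W1"][j][0] * x[0] + p["W1"][j][1] * x[1] + p["b1"][j])
             for j in range(hidden)]
        y_hat = sum(p["w2"][j] * h[j] for j in range(hidden)) + p["b2"]
        err = y_hat - y
        loss += err * err / n
        # Backward pass: start from dL/dy_hat and apply the chain rule layer by layer.
        d_y = 2.0 * err / n
        g["b2"] += d_y
        for j in range(hidden):
            g["w2"][j] += d_y * h[j]
            d_pre = d_y * p["w2"][j] * (1.0 - h[j] ** 2)  # tanh'(z) = 1 - tanh(z)^2
            g["W1"][j][0] += d_pre * x[0]
            g["W1"][j][1] += d_pre * x[1]
            g["b1"][j] += d_pre
    return loss, g


def sgd_step(p, g, lr):
    """Plain gradient descent update: theta <- theta - lr * grad."""
    for j in range(len(p["w2"])):
        p["W1"][j][0] -= lr * g["W1"][j][0]
        p["W1"][j][1] -= lr * g["W1"][j][1]
        p["b1"][j] -= lr * g["b1"][j]
        p["w2"][j] -= lr * g["w2"][j]
    p["b2"] -= lr * g["b2"]


def make_data(n, rng):
    # Assumed target surface z = sin(x) * cos(y) with Gaussian noise (illustrative only).
    xs = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(n)]
    ys = [math.sin(a) * math.cos(b) + rng.gauss(0, 0.05) for a, b in xs]
    return xs, ys


def train(steps=200, hidden=8, lr=0.05, seed=0):
    """Full-batch gradient descent; returns final parameters and the loss history."""
    rng = random.Random(seed)
    xs, ys = make_data(64, rng)
    p = init_params(hidden, rng)
    history = []
    for _ in range(steps):
        loss, g = loss_and_grads(p, xs, ys)
        history.append(loss)
        sgd_step(p, g, lr)
    return p, history
```

Calling `params, history = train()` returns the trained parameters and the loss recorded before each update. With these assumed settings the loss should trend downward over the run. The key design point is that a single backward sweep yields the gradient for every parameter at roughly the cost of one extra forward pass, which is why reverse mode suits a scalar loss with many parameters.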