Pick a function, drag the sliders to set the starting point, then run Forward to compute the values (v) on the computational graph.
Click Backprop step to propagate gradients (g) one node at a time, in reverse order. Click Backprop all to run backpropagation to completion.
The gradients dy/dx₁ and dy/dx₂ are the same as what you’d get from calculus, but computed by chaining local derivatives along the graph.
Click here for more details on the mathematics of automatic differentiation. For a parameter-learning example with a shallow neural network, open this demo.
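The chaining of local derivatives can be sketched in a few lines of Python. The function y = x₁·x₂ + sin(x₁) here is an assumed example for illustration, not necessarily one of the demo's presets:

```python
import math

# Forward pass for y = x1*x2 + sin(x1): compute and store each node's value v.
x1, x2 = 2.0, 3.0
a = x1 * x2          # multiply node
b = math.sin(x1)     # sin node
y = a + b            # add node

# Backward pass: seed dy/dy = 1, then chain local derivatives along the graph.
g_y = 1.0
g_a = g_y * 1.0                        # d(a+b)/da = 1
g_b = g_y * 1.0                        # d(a+b)/db = 1
g_x1 = g_a * x2 + g_b * math.cos(x1)   # x1 feeds two nodes, so gradients accumulate
g_x2 = g_a * x1

# Matches the calculus result: dy/dx1 = x2 + cos(x1), dy/dx2 = x1.
print(g_x1, g_x2)
```

Note that x₁ receives contributions from both paths through the graph, which is exactly the accumulation you see during the Backprop steps.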
Tips:
• After Forward, the output node y gets a seed gradient dy/dy = 1.
• Each Backprop step picks one node and pushes its accumulated gradient to its parents.
Try to follow the calculations at each step.
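The node-at-a-time propagation can be sketched as follows. The `Node` class and its fields are illustrative assumptions, not the demo's actual code, and the example function y = x₁·x₂ + x₁ is chosen for simplicity:

```python
# Each "Backprop step" takes one node (in reverse topological order)
# and pushes its accumulated gradient to its parents.
class Node:
    def __init__(self, value, parents=(), local_grads=()):
        self.value = value
        self.parents = parents          # parent nodes in the graph
        self.local_grads = local_grads  # d(self)/d(parent), one per parent
        self.grad = 0.0                 # accumulated gradient g

# Build the graph for y = x1*x2 + x1.
x1 = Node(2.0)
x2 = Node(3.0)
a = Node(x1.value * x2.value, (x1, x2), (x2.value, x1.value))
y = Node(a.value + x1.value, (a, x1), (1.0, 1.0))

y.grad = 1.0                            # seed gradient dy/dy = 1
for node in [y, a]:                     # one node per "Backprop step"
    for parent, lg in zip(node.parents, node.local_grads):
        parent.grad += node.grad * lg   # accumulate into each parent

print(x1.grad, x2.grad)                 # dy/dx1 = x2 + 1, dy/dx2 = x1
```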
Tips:
• After Backprop is complete, the gradients dy/dx₁ and dy/dx₂ tell us how y responds to changes in x₁ and x₂: moving in the direction of the gradient increases y.
• To decrease y instead, take a steepest-descent step: (x₁,x₂) ← (x₁,x₂) − α(dy/dx₁, dy/dx₂).
• Choose a step size α (the learning rate) and update x₁ and x₂ accordingly.
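One such update can be written out directly. The quadratic function y = x₁² + x₂² and the value of α below are assumed for illustration:

```python
# One steepest-descent step on y = x1**2 + x2**2.
alpha = 0.1                      # step size (learning rate)
x1, x2 = 2.0, 3.0
g1, g2 = 2 * x1, 2 * x2          # dy/dx1, dy/dx2 from calculus

# Subtracting alpha times the gradient decreases y;
# adding it instead would increase y.
x1, x2 = x1 - alpha * g1, x2 - alpha * g2
print(x1, x2)  # 1.6, 2.4
```

Repeating this step drives (x₁, x₂) toward a minimum of y; in a learning setting, the same rule updates a network's parameters.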