Roll downhill to find the minimum — the engine of machine learning
Gradient descent finds a minimum of a function by repeatedly stepping in the direction of steepest descent. The gradient (here, just the derivative) points uphill, so each step moves in the opposite direction, and the learning rate controls how big each step is.
The graph shows f(x) = x⁴ − 3x² + 2, which has two valleys (local minima). From a chosen starting point, the algorithm follows the slope downhill, step by step, until it settles into a valley. This is essentially how neural networks learn — they "roll downhill" on a loss landscape.
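A minimal Python sketch of this descent on the same function (the learning rate and iteration count here are illustrative choices, not part of the original):

```python
def f(x):
    return x**4 - 3*x**2 + 2

def df(x):
    # Derivative of f: the local slope at x
    return 4*x**3 - 6*x

x = 2.0    # starting point
lr = 0.01  # learning rate (step size)
for _ in range(500):
    x -= lr * df(x)  # step opposite the slope: downhill

# f has valleys at x = ±sqrt(3/2) ≈ ±1.2247; starting at x = 2,
# the descent settles into the right-hand valley.
print(round(x, 4))  # ≈ 1.2247
```

With a much larger learning rate (say lr = 0.3), the very first step from x = 2 overshoots past both valleys and the iterates diverge — the behavior the second prompt below invites you to explore.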
Ask the AI "Start at x = 2 and descend" or "What happens with a large learning rate?"