15

I understand that the gradient is the direction of steepest ascent, and its negative the direction of steepest descent (ref: Why is gradient the direction of steepest ascent? and Gradient of a function as the direction of steepest ascent/descent).

However, I am not able to visualize it.

[Figure: a blue arrow pointing towards the extremum and a black zig-zag path of gradient steps]

The blue arrow is the one pointing towards the minimum/maximum. The gradient (black arrow) does not point there, and that is why we get this zig-zag motion.

Then how can the gradient be the direction of steepest descent/ascent?

I have a related question: Why does gradient ascent/descent have zig-zag motion?

  • 40
    The water flows by following, at each point, the direction of steepest descent. Thus, every river is a straight line from the spring to the sea, as any good atlas shows. Is it not? –  May 05 '17 at 22:47
  • 4
    The gradient doesn't point to the function's maximum, but in the direction in which, locally, the function changes the most – Ofek Gillon May 05 '17 at 22:55
  • I don't personally think of the gradient that way. I think of the gradient as the slope in the x direction and the slope in the y direction. You can prove it is the same as the greatest slope direction by dotting with $[\cos(\theta), \sin(\theta)]$ and using calculus to solve for greatest incline. – Kaynex May 05 '17 at 22:56
  • 3
    @G.Sassatelli That bit of poetry is lost on me. It's clearly not literally true, for at least one reason that water has momentum and therefore rivers change course over the years and not related to steepest descent. Is this some reference to some quote or something? – Todd Wilcox May 06 '17 at 04:58
  • 7
    @ToddWilcox: Argumentum ad absurdum. An idealised river (ignoring messy details like momentum) will flow down the locally steepest path at all points. If that were equivalent to always pointing to the global minimum, rivers would be straight lines. As we know they aren't, steepest descent and direct path to the global minimum can't be equivalent. – Tim Pederick May 06 '17 at 05:18
  • @ToddWilcox Good point: when two rivers meet, the direction of the flow, in principle, should change, because the new momentum is the sum. In principle, I see no reason why one could not model the "excavating" process by adding a second factor into the "steepest descent" (call it TRS = "total resistance of the soil in 150 years") and set TRS of water equal to zero. –  May 06 '17 at 11:10
  • @ToddWilcox Although, of course, local difference in height is not the only contributing factor, I think what happens near Volgograd might be an example of how local factors disrupt global behaviour: rivers Don and Volga make two opposite 90° turns, ending up in two different seas. That being said, my approach was indeed idealistic: is momentum a greater factor, when it comes to actually modelling the behaviour of a river? –  May 06 '17 at 11:17
  • 7
    If the gradient always pointed to the minimum, then it would also always point away from the maximum. But satisfying both of these constraints (for a function having both one maximum and one minimum) is of course impossible as soon as min, max, and current position are not collinear. – Hagen von Eitzen May 06 '17 at 12:41
  • I think your black line segments are simply incorrect in the graphic. – Greg Martin May 06 '17 at 17:21
  • @G.Sassatelli: Yes momentum is extremely crucial. Momentum is what causes all sorts of river phenomena such as meandering, receding of waterfalls, rapids... – user21820 May 06 '17 at 18:07
  • I realize I'm quite late to the party, but the concept you're looking for is of 'integral curve' of a vector field. In particular, the vector field should be given by the gradient of the function. – A. Thomas Yerger May 06 '17 at 19:31

3 Answers

25

Although the gradient vector is defined at every point, it is really a local concept.

At any given point, it tells you the direction in which the function changes with the greatest rate. If you think of the function as height, then it gives the direction in which the ground is steepest.

As soon as you move an inch, the ground changes and the steepest direction changes.

Instead of the black zig-zag, you need an integral curve of the gradient vector field.
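
To make the contrast concrete, here is a minimal sketch, not a definitive implementation. The function $f(x,y) = x^2 + 10y^2$, the starting point and both step sizes are assumptions chosen only for illustration. It follows the negative gradient once with tiny steps (approximating an integral curve) and once with large steps (producing a zig-zag across the valley).

```python
def f(x, y):
    # Illustrative elongated bowl (an assumption, not taken from the question).
    return x**2 + 10.0 * y**2

def grad(x, y):
    # Gradient of f(x, y) = x**2 + 10*y**2.
    return 2.0 * x, 20.0 * y

def descend(x, y, step, n_steps):
    """Repeatedly move against the gradient with a fixed step size."""
    path = [(x, y)]
    for _ in range(n_steps):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y - step * gy
        path.append((x, y))
    return path

def sign_changes(path):
    """Count how often the path jumps across the valley floor y = 0."""
    return sum(1 for (_, y0), (_, y1) in zip(path, path[1:]) if y0 * y1 < 0)

# Tiny steps: a discrete approximation of the integral curve.
smooth = descend(4.0, 1.0, step=0.001, n_steps=5000)
# Large steps: each step overshoots the valley floor and lands on the other side.
coarse = descend(4.0, 1.0, step=0.095, n_steps=50)

print("valley crossings, tiny steps: ", sign_changes(smooth))
print("valley crossings, large steps:", sign_changes(coarse))
```

With the tiny step the path approaches the minimum without ever crossing the valley floor, while the large step crosses it on every iteration. Both paths use exactly the same gradient; only the step length differs.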

Fly by Night
  • 32,272
  • Thank you @Fly by Night. Can you please elaborate further on "Instead of the black zig-zag, you need an integral curve of the gradient vector field." – The Wanderer May 06 '17 at 00:49
  • 3
    @TheWanderer: Each arrow of the black zig-zag is (or should be) infinitesimally long. Or to put it another way, the black zig-zag is an overly coarse sampling of what would, at high enough resolution, be a smooth curve. – Tim Pederick May 06 '17 at 05:15
  • 1
    @TheWanderer At every point there is a gradient vector. An integral curve is a curve that is tangent to these gradient vectors at all of its points. Instead of making big steps like the black zig-zag, move a tiny amount in the direction of the gradient. You've moved so the gradient vector has (probably) changed. Now move a tiny bit in the new direction. Look at where the new vector is pointing then move a tiny bit in that direction, etc. If your steps are small enough then you'll get something that looks like a curve. https://upload.wikimedia.org/wikipedia/commons/5/5a/Slope_Field.png – Fly by Night May 08 '17 at 18:49
7

The gradient $\nabla f(x)$ points in the direction $u$ such that the directional derivative $D_u f(x)$ is as large as possible. You probably walk downhill in the direction of steepest descent, despite the fact that the lowest point on earth is the Dead Sea and you are probably walking in completely the wrong direction to reach it.
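
As a short sketch of why this holds, assume $f$ is differentiable at $x$ and $u$ is a unit vector, and write $\theta$ for the angle between $u$ and $\nabla f(x)$ (notation introduced only for this sketch). Then

$$D_u f(x) = \nabla f(x) \cdot u = \|\nabla f(x)\| \, \|u\| \cos\theta = \|\nabla f(x)\| \cos\theta .$$

This is largest when $\theta = 0$, that is, when $u$ points along $\nabla f(x)$, and smallest when $\theta = \pi$, that is, when $u$ points along $-\nabla f(x)$. Only the gradient at the point $x$ enters this expression; the location of the minimum plays no role.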

Edit:

Maybe walking down a hill is not a perfect analogy because it makes it seem like the "direction of steepest descent" should be a vector in $\mathbb R^3$, with a $z$ component as well as $x$ and $y$ components.

Perhaps a better analogy is a bug walking on a hot (painfully hot!) sidewalk. The bug moves in the direction of steepest descent (the direction in which temperature decreases most quickly), but the bug does not realize that the coolest spot on the sidewalk is ten meters in the opposite direction, where there is shade. Hopefully in this analogy it's clear that the temperature is a function $f(x,y)$, and the direction of steepest descent is a vector with an $x$ component and a $y$ component, but no $z$ component.

littleO
  • 51,938
7

Your black lines are not gradient lines at all. The gradient should be perpendicular to the contour lines at every point. Even in an ellipsoidal valley, the gradient will not point to the lowest point, but it will point much closer to it than your picture indicates.

A function minimizer that follows the local gradient has to take a finite-sized step along the gradient direction (downhill, so against the gradient), then find the gradient at the new location to take the next step. Often evaluating the gradient is very expensive and you want to do it as few times as possible. One approach is then to follow the gradient direction from your current point until the function stops decreasing, then stop, evaluate the local gradient, and set off in that new direction. If that is your strategy, each new direction will be at a right angle to the prior direction. If the new gradient were not perpendicular to the old direction of travel, you could decrease the function further by moving farther or not so far in the old direction of travel. You only change direction when you are at a local minimum along the direction you are going.
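
The right-angle behaviour can be checked with a small sketch. The quadratic $f(x,y) = \tfrac12(x^2 + 10y^2)$ and the starting point are assumptions chosen for illustration; for a quadratic, the point along the search line where the function stops decreasing has a closed form, so no numerical line search is needed here.

```python
def f(p):
    # Illustrative quadratic bowl with Hessian diag(1, 10) (an assumption).
    x, y = p
    return 0.5 * (x**2 + 10.0 * y**2)

def grad(p):
    x, y = p
    return (x, 10.0 * y)

def exact_step(g):
    # For this quadratic, f(p - a*g) is minimized over a at a = (g.g) / (g.Ag),
    # where A = diag(1, 10) is the Hessian.
    gx, gy = g
    return (gx * gx + gy * gy) / (gx * gx + 10.0 * gy * gy)

def steepest_descent(p, n_steps):
    """Go downhill along -grad until f stops decreasing, then re-evaluate."""
    directions = []
    for _ in range(n_steps):
        g = grad(p)
        a = exact_step(g)
        directions.append((-g[0], -g[1]))
        p = (p[0] - a * g[0], p[1] - a * g[1])
    return p, directions

start = (10.0, 1.0)
final_point, directions = steepest_descent(start, 10)

# Dot products of consecutive search directions; each should be zero
# up to floating-point rounding.
dots = [a[0] * b[0] + a[1] * b[1] for a, b in zip(directions, directions[1:])]
print(dots)
```

Every consecutive pair of directions is orthogonal up to rounding, which is exactly the zig-zag pattern: it comes from the stopping rule, not from the gradient pointing the wrong way.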

Ross Millikan
  • 374,822
  • "Even in an ellipsoidal valley, the gradient will not point to the lowest point, but it will point much closer to it than your picture indicates." I felt the same way, but apparently, if the ellipse is elongated, the gradient points almost perpendicular to the direction of the lowest point. Please see my earlier question, which provides more background on how this confusion began: https://math.stackexchange.com/questions/2256925/steepest-descent-in-elliptical-error-surface – The Wanderer May 05 '17 at 23:48
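
The effect of elongation can be checked numerically with a rough sketch. The bowl $f(x,y) = x^2 + k y^2$ with its minimum at the origin, the sample point $(10, 1)$ and the values of $k$ are all assumptions chosen for illustration. The code measures the angle between the steepest-descent direction $-\nabla f$ and the straight line to the minimum.

```python
import math

def angle_to_minimum_deg(x, y, k):
    """Angle in degrees between -grad f and the direction to the minimum
    at the origin, for the illustrative bowl f(x, y) = x**2 + k*y**2."""
    descent = (-2.0 * x, -2.0 * k * y)   # negative gradient at (x, y)
    to_min = (-x, -y)                    # straight line to the origin
    dot = descent[0] * to_min[0] + descent[1] * to_min[1]
    norms = math.hypot(*descent) * math.hypot(*to_min)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norms))
    return math.degrees(math.acos(cos_angle))

for k in (1.0, 10.0, 100.0):
    print(k, round(angle_to_minimum_deg(10.0, 1.0, k), 1))
```

For a circular bowl ($k = 1$) the angle is essentially zero, and at this sample point it grows with $k$, reaching roughly 79 degrees for $k = 100$. So both remarks are compatible: how far the gradient points away from the minimum depends on how elongated the contours are and on where you stand.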