
Imagine a function that maps $(x, y) \mapsto z$.

Imagine we draw a bunch of curves along which the value of z is constant, so that z acts as a "potential field". Following the gradient downhill (that is, the negative gradient) is the steepest way toward the bottom.

I found a picture like this:

[image]

Now, I am a bit confused by the picture.

The path perpendicular to the equipotential curves is indeed the black line. So that means that if we want to follow the steepest descent we must zigzag. I must be missing something here.

The gradient is NOT the direction that points to the minimum or maximum

I can see that it's an ellipse. Say someone wants to reach the place with the lowest z (say, in the middle) along the shortest possible path by following gradients. Which path will they follow?

It looks like they'll follow the zigzag black path. After all, at the first point the gradient does point to the left. Hmm... But the blue path seems shorter.

How do we explain this?

What would the path of always following the gradient look like for the corresponding graph?

user4951
    I don't think this entirely answers your question, but keep in mind that the gradient only tells you the direction in which to take an infinitesimal step. When using gradient descent methods in practice there will be some "overreach": no matter how small a step size we take, we can't take an infinitesimal step, so we have to correct back in the other direction, hence the zigzagging. I'm not sure this explanation is entirely correct, though, and I imagine someone else will comment or answer soon with an explanation that is both more correct and makes more sense than this. – Chill2Macht May 06 '17 at 17:25

1 Answer


The step size $\alpha_k$ is chosen so that it minimizes the objective function along the search direction at the current iteration (an exact line search). Let $\varphi$ denote the value of the objective function $f$ at the next iterate $x_{k+1} = x_k - \alpha \nabla f(x_k)$, viewed as a function of $\alpha$:

$$\varphi : \alpha \mapsto f(x_k - \alpha \nabla f(x_k))$$

The first-order necessary condition for a minimum implies $\varphi'(\alpha_k) = 0$: the derivative of $\varphi$ with respect to $\alpha$ vanishes at $\alpha_k$, since we want the step $\alpha_k$ to minimize the function along the search direction at each iteration.

If you expand this derivative with the chain rule, $$\varphi'(\alpha_k)=\langle-\nabla f(x_k),\nabla f(x_k - \alpha_k\nabla f(x_k))\rangle = \langle -\nabla f(x_k),\nabla f(x_{k+1})\rangle = 0$$

you'll find that two successive search directions are perpendicular!
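If you want to check this numerically, here is a minimal sketch. It is not taken from the picture: I'm assuming a simple quadratic $f(x, y) = \tfrac12 (a x^2 + b y^2)$ with made-up curvatures $a = 1$, $b = 10$ and a made-up starting point, so that the level curves are ellipses. For a quadratic $f(x) = \tfrac12 x^\top A x$ the exact line search has the closed form $\alpha_k = \frac{g_k^\top g_k}{g_k^\top A g_k}$ with $g_k = \nabla f(x_k)$, which is what the code uses. The names `A_DIAG`, `steepest_descent_exact`, etc. are just for illustration.

```python
# Assumed example (not from the question's picture):
# f(x, y) = 0.5 * (a*x**2 + b*y**2), whose level curves are ellipses.
A_DIAG = (1.0, 10.0)  # hypothetical curvatures a and b


def f(p):
    """Objective value at point p = (x, y)."""
    return 0.5 * sum(a * c * c for a, c in zip(A_DIAG, p))


def grad(p):
    """Gradient of f at p; for this quadratic it is simply A p."""
    return tuple(a * c for a, c in zip(A_DIAG, p))


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def steepest_descent_exact(p0, iters=10):
    """Steepest descent where each step size minimizes f along -grad f."""
    points, grads = [tuple(p0)], []
    for _ in range(iters):
        p = points[-1]
        g = grad(p)
        g_a_g = dot(g, grad(g))  # g^T A g, because grad(v) = A v
        if g_a_g == 0.0:         # already at the minimum
            break
        alpha = dot(g, g) / g_a_g  # closed-form exact line search
        grads.append(g)
        points.append(tuple(c - alpha * gi for c, gi in zip(p, g)))
    return points, grads


points, grads = steepest_descent_exact((10.0, 1.0))

# Cosine of the angle between each pair of successive gradients.
cosines = [
    dot(g0, g1) / (dot(g0, g0) ** 0.5 * dot(g1, g1) ** 0.5)
    for g0, g1 in zip(grads, grads[1:])
]
print("largest |cos| between successive directions:",
      max(abs(c) for c in cosines))
for p in points:
    print(p, f(p))
```

The printed cosines should be zero up to rounding, and the printed points bounce from one side of the valley to the other, which is the zigzag. Note that this is a property of the exact line search, not of the gradient itself: if you instead take infinitesimally small steps, you follow $\dot{x} = -\nabla f(x)$, which for this quadratic gives $x(t) = x_0 e^{-at}$, $y(t) = y_0 e^{-bt}$, a smooth curve that crosses every level curve at a right angle and does not zigzag.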

Artashes