
I want to make sure that I understand the gradient descent method correctly. Say there is an optimization problem $f = x^2 + y^2 \rightarrow \min$. I randomly choose an estimate of the minimum, $(0, 0)$. Then I differentiate the function, $\nabla f = \langle 2x, 2y \rangle$, and at the point $(0, 0)$ we get $\nabla f = (0, 0)$. How do I then determine whether it is a minimum?

I also have a question about the case when $f'(x_i) \neq 0$. The next point is then given by $x_{i+1} = x_i - a_i f'(x_i)$. I don't understand in which instances we put $-$ and in which $+$ before $a_i$. Also, how do we determine whether the chosen $a_i$ is sufficiently small? And is it possible to determine with such a method that the function has no minimum at all?

user

1 Answer


In this case the function is (strictly) convex, so if you found a critical point it is the (only) minimum.
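To see this concretely without relying on convexity, you can apply the second-derivative (Hessian) test at the critical point. Here is a minimal Python sketch (my own illustration, not part of the original answer) for $f = x^2 + y^2$, using Sylvester's criterion to check that the Hessian is positive definite, which implies a local minimum:

```python
def hessian(x, y):
    # For f = x^2 + y^2 the Hessian is constant: [[2, 0], [0, 2]].
    return [[2.0, 0.0], [0.0, 2.0]]

def is_positive_definite_2x2(h):
    # Sylvester's criterion for a symmetric 2x2 matrix:
    # both leading principal minors must be positive.
    a, b = h[0][0], h[0][1]
    c, d = h[1][0], h[1][1]
    return a > 0 and (a * d - b * c) > 0

h = hessian(0.0, 0.0)
print(is_positive_definite_2x2(h))  # True -> (0, 0) is a local minimum
```

If the Hessian were negative definite you would have a maximum, and if it were indefinite, a saddle point.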

Orenio
  • Ok, but if we ignore that, how would I detect that it is a minimum after I find that $\nabla f(0, 0) = 0$? – user May 07 '20 at 11:25
  • Gradient descent has a stopping condition. It depends on the accuracy you wish to achieve; for example, stop when the step size is less than $10^{-5}$. However, in your case you move nowhere, which means you are exactly at a critical point. – Orenio May 07 '20 at 11:36
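The loop described above can be sketched as follows. This is a minimal illustration (the starting point and step size $a = 0.1$ are assumed values, not from the discussion), using the $10^{-5}$ step-size threshold from the comment as the stopping condition:

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = x^2 + y^2.
    return (2.0 * x, 2.0 * y)

def gradient_descent(x, y, a=0.1, tol=1e-5, max_iter=10_000):
    for _ in range(max_iter):
        gx, gy = grad_f(x, y)
        step_x, step_y = a * gx, a * gy
        # Minus sign: step against the gradient to *descend*;
        # a plus sign would climb toward a maximum (gradient ascent).
        x, y = x - step_x, y - step_y
        if math.hypot(step_x, step_y) < tol:  # stop when steps are tiny
            break
    return x, y

x_min, y_min = gradient_descent(5.0, -3.0)
print(x_min, y_min)  # both very close to 0
```

If $f$ has no minimum (e.g. $f = x$), the iterates never settle and the loop runs until `max_iter`, which is one practical signal that the method is not converging.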