
Consider the problem of minimizing $f(x)$ subject to the constraint $g(x)=0$. The standard approach is to form the Lagrangian $$\mathcal L = f(x) -\lambda g(x)$$ and differentiate it with respect to $x$ to get $$f'(x) - \lambda g'(x) = 0.$$ Now, $g(x) =0$ is equivalent to $g^2(x) = 0$, so it seems we can use the latter constraint instead, obtaining $$\mathcal L = f(x) -\lambda g^2(x)$$ and then $$f'(x) - 2\lambda g(x) g'(x)= 0.$$ But we know that $g(x)=0$, so this simplifies to $$f'(x)=0,$$ which does not make any sense: the constrained solution obviously does not need to satisfy $f'(x)=0$.
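To make this concrete, here is a simple one-dimensional example. Take $f(x) = x^2$ and $g(x) = x - 1$, so the constrained minimum is at $x = 1$, where $f'(1) = 2 \neq 0$. With the constraint $g(x) = 0$, the condition $f'(x) = \lambda g'(x)$ reads $2x = \lambda$, and together with $x = 1$ this gives $\lambda = 2$ and the correct solution. With the constraint $g^2(x) = 0$ instead, the condition $f'(x) = 2\lambda g(x) g'(x)$ reads $2x = 2\lambda (x-1)$, which at $x = 1$ forces $2 = 0$, so the system has no solution at all.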

Where is the mistake in this reasoning?

amoeba

1 Answer


The Lagrange multiplier condition really says that the gradient of $f$ should be parallel to the gradient of $g$. If you replace $g$ by $g^2$, then the gradient of the constraint function becomes $\nabla(g^2) = 2g\,\nabla g$, which vanishes on the constraint set, so you are requiring that the gradient of $f$ be parallel to the zero vector, which every vector is; geometrically speaking, everything is fine. This case is simply not handled correctly by putting $\lambda$ on $\nabla g$, as multivariable calculus books usually do. It is handled correctly if you put $\lambda$ on $\nabla f$, but then cases where $\nabla f$ vanishes are lost instead.
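To see this concretely, here is a minimal sympy sketch; the choice $f(x,y) = x + y$ and $g(x,y) = x^2 + y^2 - 1$ (the unit circle) is just an illustrative example, not anything from the question. The stationarity system for the constraint $g = 0$ has the two expected critical points $\pm(1/\sqrt 2, 1/\sqrt 2)$, while the system for $g^2 = 0$ forces $\nabla f = 0$ on the circle and therefore has no solutions.

    import sympy as sp

    x, y, lam = sp.symbols('x y lam', real=True)
    f = x + y                    # illustrative objective
    g = x**2 + y**2 - 1          # illustrative constraint (the unit circle)

    # Constraint g = 0: solve grad f = lam * grad g together with g = 0.
    eqs_g = [sp.diff(f, v) - lam * sp.diff(g, v) for v in (x, y)] + [g]
    print(sp.solve(eqs_g, [x, y, lam], dict=True))
    # two critical points: (x, y) = ±(1/sqrt(2), 1/sqrt(2))

    # Constraint h = g**2 = 0: grad h = 2*g*grad g vanishes wherever h = 0,
    # so the same system now demands grad f = 0 on the circle and is unsolvable.
    h = g**2
    eqs_h = [sp.diff(f, v) - lam * sp.diff(h, v) for v in (x, y)] + [h]
    print(sp.solve(eqs_h, [x, y, lam], dict=True))
    # no solutions: []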

Ian
  • Thanks for the answer, but I think I am still confused. Let's denote $g^2(x)$ by $h(x)$. My problem is that $g(x)=0$ and $h(x)=0$ describe the same constraint, so naively I would expect to be able to solve the problem using either of them (I might not even realize that the $h(x)$ I am using is actually the square of something else). However, using $g(x)$ yields the equation $f'(x)=\lambda g'(x)$, which actually gives me the solution, whereas using $h(x)$ yields $f'(x)=0$, which is nonsense and does not let me find the correct solution. Hence my confusion. – amoeba Feb 05 '18 at 14:50
  • @amoeba The problem is that the condition $\nabla f = \lambda \nabla g$ is not really what's going on. It is instead "$\nabla f$ is parallel to $\nabla g$", which means that either $\nabla f = \lambda \nabla g$ or $\lambda \nabla f = \nabla g$. When $\nabla g = 0$, the latter equation holds with $\lambda=0$ regardless of what $\nabla f$ is. – Ian Feb 05 '18 at 14:51
  • OK, this makes sense. Thanks a lot @Ian! Accepted. – amoeba Feb 05 '18 at 15:35
  • A short clarification/follow-up if I may: so imagine we have a problem of minimizing $f(x)$ subject to $h(x)=0$, and we notice that $\nabla h=0$ whenever the constraint is satisfied. It seems that the Lagrange multiplier method cannot help in this situation, because it yields only a trivially satisfied equation. Is there a standard way to deal with this? – amoeba Feb 07 '18 at 23:01
  • @amoeba Not really, you should just avoid describing your constraint as the level set of such a function. – Ian Feb 07 '18 at 23:07
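To spell out why no multiplier-based fix is available: the Lagrange multiplier theorem assumes that the gradient of the constraint function does not vanish at the constrained optimum (equivalently, that $0$ is a regular value of the constraint function). For $h = g^2$ one has $$\nabla h(x) = 2\,g(x)\,\nabla g(x) = 0 \quad \text{whenever } h(x) = 0,$$ so no point of the constraint set satisfies this hypothesis, and the multiplier rule gives no information there. The only remedy is the one suggested in the last comment: describe the same set by a defining function whose gradient does not vanish on it, e.g. $g$ itself when $\nabla g \neq 0$.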