My lecture notes mention the following:

For $f : U \to \mathbb{R}$ differentiable, consider the level set

$N_y=\{x \in U : f(x)=y\}$.

Suppose that $c : I \to N_y \subset U$ is a differentiable curve. Then $f \circ c = y$ and so

$0=\frac{d}{dt}(f \circ c)(t)=\langle grad f(c(t)), c'(t)\rangle \iff grad f(c(t)) \perp c'(t)$, where the first equality holds because $f \circ c$ is constant and the second follows from the chain rule.

Since this holds for any differentiable curve running through $N_y$, we can say that the gradient vector of $f$ is perpendicular to the level sets $N_y$.

The definition of the gradient used is:

Definition:

The gradient of $f : U \to \mathbb{R}$ is the uniquely determined map

$grad f : U \to \mathbb{R}^n$ with $\langle grad f(x), v \rangle = df_x(v)$ for all $v \in \mathbb{R}^n$, where $df$ denotes the differential of $f$.

If we let $\langle \cdot, \cdot \rangle$ be the standard scalar product, then $grad f(x) = J_f(x)^T$, where $J_f(x)$ denotes the Jacobian.

Now my question is the following: I can follow all the steps in the proof, but I cannot understand the conclusion. Why does it follow that $grad f(x)$ is perpendicular to the level set $N_y$ for any $x$ and any $y$? What does it even mean to be perpendicular to a set? Does it mean perpendicular to any vector in that set?

The notes also give an example:

For $f(x)=\|x\|^2$ each vector $grad f(x)=(2x_1,...,2x_n)^T$ is perpendicular to the sphere

$N_{\|x\|^2}=\{p \in \mathbb{R}^n : \|p\|^2 = \|x\|^2 \}$.

I have tried the following to see if my interpretation of perpendicular to a set is correct:

Let $x=(1,...,1)^T$, then $\|x\|^2=n$. Now consider a vector $y$ in $N_{\|x\|^2}$, say $y=(\sqrt{n},0,...,0)^T$, then $\|y\|^2=n$ ($x$ and $y$ have the same length). But $\langle x,y \rangle = \sqrt{n} \neq 0$. So $x$ and $y$ are not perpendicular. So I think my interpretation of perpendicularity to a set must be wrong.

Please tell me what I am missing here. How does the conclusion follow and how to interpret it? I know there are a lot of questions on this topic, but none of them could clarify this point in a satisfactory way for me.

Thank you very much!

Edit:

Maybe I should add another thought. The proof shows that $grad f(c(t))$ is perpendicular to $c'(t)$, which is the tangent vector of $c$ at $t$. So we need to show that the tangent vector points in the same direction as the vectors in the level set.

  • I should clarify that here I am using c as an arbitrary constant. My $\mathbf{r}$ is what corresponds to what you use as c in your question. – David Reed Apr 12 '20 at 17:41

1 Answer

The reasoning is as follows. You start out assuming that the level curve, $\left\{(x,y) : f(x,y) = c\right\}$, can be parameterized by some differentiable function $\mathbf{r}(t) = (x(t),y(t)), t \in [a,b]$. Near points where $\nabla f \neq 0$ (and with $f$ continuously differentiable), this follows from something called the Implicit Function Theorem. Then for each $t \in [a,b]$, $(f \circ \mathbf{r})(t) = f(\mathbf{r}(t)) = f(x(t),y(t)) = c$. In particular, $$(f \circ \mathbf{r})(t) = c $$

Differentiating both sides with respect to $t$ and applying the chain-rule gives:$$0 = (f\ \circ \mathbf{r})'(t) = \nabla f(\mathbf{r}(t)) \ \cdot \mathbf{r}'(t) $$

Remember that for each $t$, $\mathbf{r}(t)$ is a point on the level curve, and that therefore $\mathbf{r}'(t)$ is tangent to the level curve at that point. Recall that the dot product being zero means that the two vectors are at right angles to each other. Therefore at each point, the gradient is at a right angle to the tangent vector $\mathbf{r}'(t)$ of the curve at that point. That is what it means for the gradient to be normal to the curve: it is perpendicular to the tangent vectors of the level set, not to the points of the level set themselves.
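
To make this concrete, here is a small numerical sketch in Python for the sphere example from the question. It is only an illustration under my own assumptions: the helper names (`f`, `grad_f`, `curve`, `curve_velocity`, `max_abs_dot`) and the particular great circle through $(1,1,1)$ are chosen for this example and do not come from the lecture notes. It checks that the gradient is perpendicular to the velocity of a curve lying in the level set, while the dot product with another point of the level set need not vanish.

```python
import math

def f(p):
    """f(p) = ||p||^2, the function from the question's example."""
    return sum(x * x for x in p)

def grad_f(p):
    """Gradient of f, i.e. grad f(p) = 2p."""
    return [2.0 * x for x in p]

def dot(u, v):
    """Standard scalar product."""
    return sum(a * b for a, b in zip(u, v))

# P lies on the level set f = 3. Q is orthogonal to P and has the same length
# (an arbitrary choice made for this illustration).
P = [1.0, 1.0, 1.0]
s = math.sqrt(3.0 / 2.0)
Q = [s, -s, 0.0]

def curve(t):
    """Great circle c(t) = cos(t) P + sin(t) Q, which stays on ||x||^2 = 3."""
    return [math.cos(t) * a + math.sin(t) * b for a, b in zip(P, Q)]

def curve_velocity(t):
    """Tangent vector c'(t) = -sin(t) P + cos(t) Q."""
    return [-math.sin(t) * a + math.cos(t) * b for a, b in zip(P, Q)]

def max_abs_dot(samples=50):
    """Largest |<grad f(c(t)), c'(t)>| over evenly spaced sample times."""
    worst = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        worst = max(worst, abs(dot(grad_f(curve(t)), curve_velocity(t))))
    return worst

# Y is another point of the same level set, not a tangent vector at P.
Y = [math.sqrt(3.0), 0.0, 0.0]

if __name__ == "__main__":
    print("max |<grad f(c(t)), c'(t)>| =", max_abs_dot())
    print("<P, Y> =", dot(P, Y))
```

Here `max_abs_dot()` should come out at rounding-error size, while `dot(P, Y)` equals $\sqrt{3}$. The second number is not a contradiction: `Y` is a point on the sphere, not a tangent vector at `P`.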

EDIT-- IN RESPONSE TO YOUR COMMENT

Let's do the circle instead of the sphere, say of radius 3. $$f(x,y) = \Vert (x,y) \Vert ^2 = x^2 +y^2 = 9$$

The solution set of that equation is your level curve. We can parameterize it by $$\mathbf{r}(t) = \left(3\cos(t),3\sin(t)\right), t \in [0,2\pi]$$

Note that this works out since: $$f(\mathbf{r}(t)) = f\left(3\cos(t),3\sin(t)\right) = 9\cos^2(t) + 9\sin^2(t) = 9\left(\cos^2(t) + \sin^2(t)\right) = 9$$

Next we compute the derivatives: $$\nabla f(x,y) = \left(2x,2y\right)$$ $$\mathbf{r}'(t) = \left(-3\sin(t),3\cos(t)\right)$$
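
As a quick check (this step is spelled out here for convenience, using only the two formulas above), evaluating the gradient along the curve and taking the dot product gives zero for every $t$:

$$\nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = \left(6\cos(t),6\sin(t)\right) \cdot \left(-3\sin(t),3\cos(t)\right) = -18\cos(t)\sin(t) + 18\sin(t)\cos(t) = 0$$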

To do a "test", at the top of the circle, at the point $(0,3)$, corresponding to $t = \pi/2$, the gradient should point straight up (i.e. have no $x$ component) and the tangent vector should point horizontally (i.e. have no $y$ component). Indeed, $\nabla f(0,3) = (0,6)$ and $\mathbf{r}'(\pi/2) = (-3,0)$.
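
The check at $(0,3)$ can also be done numerically. The following Python sketch is just one way to do it. The function names `r`, `r_prime`, `grad_f` and `check_top_of_circle` are assumptions made for this illustration.

```python
import math

def r(t):
    """Parameterization of the circle x^2 + y^2 = 9."""
    return (3.0 * math.cos(t), 3.0 * math.sin(t))

def r_prime(t):
    """Derivative of the parameterization, i.e. the tangent vector."""
    return (-3.0 * math.sin(t), 3.0 * math.cos(t))

def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + y^2."""
    return (2.0 * x, 2.0 * y)

def check_top_of_circle():
    """Return the gradient, the tangent vector and their dot product at t = pi/2."""
    t = math.pi / 2.0
    x, y = r(t)                 # approximately (0, 3)
    g = grad_f(x, y)            # approximately (0, 6): points straight up
    v = r_prime(t)              # approximately (-3, 0): points horizontally
    return g, v, g[0] * v[0] + g[1] * v[1]

if __name__ == "__main__":
    gradient, tangent, product = check_top_of_circle()
    print("gradient:", gradient)
    print("tangent: ", tangent)
    print("dot:     ", product)
```

Because $\cos(\pi/2)$ is not exactly zero in floating point, the components that should vanish come out as tiny numbers rather than exact zeros.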

David Reed
  • Thanks for your answer. The lecture notes have not covered the implicit function theorem yet and I think that is what confuses me. Let me try to summarize the important point to make sure that I understand it. The level set is basically a function itself (defined implicitly), so in some sense the level set is the curve $c(t)$ or $r(t)$ in your notation. If we now look at any point $x$ in the domain $U$ of $f$, then this will correspond to some $c(t)$ and $c'(t)$ will point in the direction the curve (or the level set) is going. Could you please tell me if that is correct? – DerivativesGuy Apr 13 '20 at 07:22
  • Oh and could you please tell me where I've gone wrong in my example? Given what you said I think I first need to find an explicit functional equation for the sphere and then calculate the tangent vector to see that it is in fact perpendicular to the gradient $grad f(x)=2x$. Thanks, again. – DerivativesGuy Apr 13 '20 at 07:53
  • See my edit. Hopefully that answers your questions. – David Reed Apr 13 '20 at 10:44
  • Thanks again, your remarks are very helpful. I've got one last question though. I have looked up the implicit function theorem and it seems that it is not always possible to find an explicit function that describes an implicit relation (e.g. the circle in $\mathbb{R}^2$). It also seems that the condition of differentiability is not strong enough to apply the implicit function theorem. To sum up, I don't understand why the fact that we can parameterize the level set by a curve follows from the implicit function theorem. Could you please point me to some source, so I can look this up a bit more? – DerivativesGuy Apr 13 '20 at 11:48
  • @DerivativesGuy In this instance it suffices that the partial derivatives are non-zero at all points on the curve. In your example your teacher simply assumes you are dealing with one that is parameterizable. Here's one source: https://math.stackexchange.com/questions/79003/proof-that-gradient-is-orthogonal-to-level-set – David Reed Apr 13 '20 at 13:05
  • Great, thanks for your effort. It's appreciated a lot. I will check it out. – DerivativesGuy Apr 13 '20 at 13:16
  • @DerivativesGuy NP. Here is another source that may be easier: https://schoolbag.info/mathematics/two-dimensional/11.html – David Reed Apr 13 '20 at 18:16
  • Thanks again. I think I've also found out why I was confused about all of this in the first place. It is important to distinguish between the tangent vector and the tangent line. The tangent line lies in $\mathbb{R}^n \times \mathbb{R}$ and can be viewed as a line that touches the graph of the function at one point just as for functions of a single variable. The tangent vector on the other hand is a vector in $\mathbb{R}^n$ and points in the direction the function is going as described here (https://mathinsight.org/parametrized_curve_derivative). – DerivativesGuy Apr 15 '20 at 12:37
  • So in this case it points in the direction of the level set and that's why the fact that $r'(t) \perp \nabla f$ means that the gradient is orthogonal to the level set. – DerivativesGuy Apr 15 '20 at 12:38
  • @DerivativesGuy Take a look at the picture here: https://services.math.duke.edu/education/ccp/materials/mvcalc/vectors/vec2.html

    The gradient must be perpendicular to that tangent vector since the dot product is zero.

    – David Reed Apr 15 '20 at 13:05
  • Yes, I don't have a problem with that. The root cause of my confusion really was the difference between the tangent vector of the parameterized curve and the tangent line to the graph of the implicit function. – DerivativesGuy Apr 15 '20 at 14:50
  • @DerivativesGuy The implicit function theorem tells you, in an indirect way, that there is some function $\mathbf{r}(t) = (x(t),y(t)) , t \in [a,b] $ such that $ \left\{ \mathbf{r}(t) : t \in [a,b] \right\} = \left\{ (x,y) : f(x,y) = c \right\}$. Beyond that it requires no further consideration. – David Reed Apr 15 '20 at 16:33