4

I am looking for an intuitive way to explain the "$1/r$" factor in the gradient in polar coordinates.

For instance, if $g(x,y)=f(r,\theta)$, $$\nabla g=f_r\hat{e_r}+\frac 1rf_\theta\hat{e_\theta}$$

Is there a way to explain the $\frac 1r$ factor? By dimension matching? Or any other way to see that the answer is not $$\nabla g=f_r\hat{e_r}+f_\theta\hat{e_\theta}$$

Thanks for any help.

yoyostein
  • 19,608
  • 1
    What do you know about gradients in curvilinear co-ordinates? – SchrodingersCat Jul 22 '16 at 09:59
  • I do know the proof: http://www.math.jhu.edu/~js/Math202/polar.grad.chain.pdf However, I recall hearing that $\nabla g=f_r\hat{e_r}+f_\theta\hat{e_\theta}$ is dimensionally wrong or something like that. I can't recall the exact details. I would like to know how to see it is dimensionally wrong. – yoyostein Jul 22 '16 at 10:07

2 Answers

8

Consider a point with polar coordinates $(r,\theta)$. It lies, of course, at the distance $r$ from the origin. A change of $d\theta$ in the value of $\theta$ will move this point a distance $r \, d\theta$ along the circle of radius $r$. (Notice the factor $r$; it says that the farther out you are, the bigger is the effect of a change in the angle.)

The derivative $\partial f/\partial \theta$ only measures the function's sensitivity to changes in the value of the coordinate $\theta$, and to get the physically interesting number which measures sensitivity to moving the point in the $\mathbf{e}_{\theta}$ direction, you have to compensate by dividing by this factor of $r$.
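As a concrete check (the function here is only an illustrative choice, not part of the argument above), take $g(x,y)=y$, so that $f(r,\theta)=r\sin\theta$ and $\nabla g=(0,1)$ everywhere. With $\mathbf{e}_r=(\cos\theta,\sin\theta)$ and $\mathbf{e}_\theta=(-\sin\theta,\cos\theta)$ we have $f_r=\sin\theta$ and $f_\theta=r\cos\theta$, hence

$$ \begin{aligned} f_r\,\mathbf{e}_r+\frac 1r f_\theta\,\mathbf{e}_\theta &= \sin\theta\,(\cos\theta,\sin\theta)+\cos\theta\,(-\sin\theta,\cos\theta) = (0,1), \\ f_r\,\mathbf{e}_r+f_\theta\,\mathbf{e}_\theta &= \sin\theta\,(\cos\theta,\sin\theta)+r\cos\theta\,(-\sin\theta,\cos\theta) = \bigl((1-r)\sin\theta\cos\theta,\ \sin^2\theta+r\cos^2\theta\bigr). \end{aligned} $$

Only the first expression reproduces $\nabla g$; the second agrees with it only where $r=1$ or $\cos\theta=0$. The same conclusion follows from units: if $x$ and $y$ are lengths, then $f_r$ has units of $f$ per length, while $f_\theta$ has the units of $f$ alone (an angle is dimensionless), so the two terms can only be added after $f_\theta$ is divided by a length, namely $r$.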

Hans Lundmark
  • 53,395
  • Incredibly clear and lucid explanation. Thanks. – yoyostein Jul 22 '16 at 15:34
  • You're welcome! – Hans Lundmark Jul 22 '16 at 21:07
  • Hi, is the gradient invariant under rotation of the coordinate axes? – Kashmiri Jan 13 '21 at 11:46
  • @YasirSadiq: Please post that as a new question, not as a comment on a question from several years ago. And try to be more precise about what exactly you mean. – Hans Lundmark Jan 13 '21 at 12:23
  • Thank you I'll do so. – Kashmiri Jan 13 '21 at 13:53
  • @HansLundmark can you explain why $\partial f/\partial \theta$ is not a physically interesting number in itself? Should the gradient specifically measure the function's sensitivity to "moving a point by a unit distance", and not just its sensitivity to changes in its arguments? I've only worked with the gradient in Cartesian coordinates before and probably became too accustomed to thinking of it as just "a vector of all partial derivatives". Before learning about polar coordinates this simplistic approach made perfect intuitive and visual sense to me. – simd Mar 30 '23 at 11:13
  • 1
    @user3537411: Please take note of the notation used in the question, where the functions $f$ and $g$ are wisely distinguished. If you would just consider $f(r,\theta)$ as a function $f \colon \mathbf{R}^2 \to \mathbf{R}$, then its gradient would be $\nabla f = (f_r,f_\theta)$. But the question asks for the gradient of the function $g \colon \mathbf{R}^2 \to \mathbf{R}$ defined by $g(x,y)=f(r,\theta)$, where of course $(x,y)=(r \cos\theta,r \sin\theta)$. [Cont.] – Hans Lundmark Mar 30 '23 at 11:47
  • 1
    More precisely, what is asked is how to express this gradient $\nabla g = (g_x,g_y)$ in terms of the derivatives of $f$. And what $\nabla g$ measures is how quickly the value of $g$ changes, per unit of distance moved in the $xy$-plane. – Hans Lundmark Mar 30 '23 at 11:47
  • @HansLundmark very insightful, thank you! What if we want to know how quickly the value of $g$ changes, per unit of distance moved in the $r \theta$-plane? Would it be the same? – simd Mar 30 '23 at 12:06
  • @user3537411: No, that would be just asking for $\nabla f = (f_r, f_\theta)$, since “$g$, considered as a function of $(r,\theta)$” is (by definition) nothing but the function $f$. – Hans Lundmark Mar 30 '23 at 12:08
1

I was thinking about the same question recently. I think I now have a fairly satisfactory answer, so I will post it here.


After some searching, I think I can now see things a bit more clearly. I will try to clarify my thoughts and summarize them.

Initially I was surprised by the fact that the gradient has a different formula in non-Cartesian coordinates. For example, the gradient in 2D polar coordinates is:

$$ \nabla f = \frac{\partial f}{\partial r} \mathbf{\hat{e}}_r + \frac{1}{r}\frac{\partial f}{\partial \theta} \mathbf{\hat{e}}_\theta $$

Searching on that problem led me to curvilinear coordinates, where $d\mathbf{r}$ is treated and manipulated like a differential form. However, that did not match my understanding of a differential form, which is defined to be real-valued. I suspected there was a corresponding 'vector-valued' definition of a differential form, which was why I asked this question.

However, after thinking about this a bit more, I concluded that this shouldn't be the case. $d\mathbf{r}$ is used in line integrals, which are written as

$$ \oint_\gamma f \cdot d\mathbf{r} $$

If we interpret the integral in terms of differential forms, it is equivalent to

$$ \oint_\gamma f_x dx + f_y dy $$

Here $f_x$ and $f_y$ denote the components of the vector field $f$, and $f_x dx + f_y dy$ is the differential form. So if there is a differential form here, it should be $f \cdot d\mathbf{r}$, not $d\mathbf{r}$.


So back to the original question on the gradient in non-Cartesian coordinates. I found an alternative way to prove the result in the framework of differential geometry without manipulating $d\mathbf{r}$ directly.

First, it's important to note why the gradient behaves a bit non-intuitively in non-Cartesian coordinates. Although in Cartesian coordinates the gradient looks like the differential (total derivative), in the form $\begin{pmatrix}\frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y}\end{pmatrix}$, it is not the differential. At a point $p$, it is defined by requiring, for every tangent vector $\mathbf{v}$,

$$ g(\nabla f, \mathbf{v}) = df(\mathbf{v}) $$

where $g$ is the inner product on the tangent space $T_p$, and $df$ is the one-form (or total derivative, or differential, there are so many names...), which eats a vector and returns a real number as usual. The important thing to note is the involvement of the inner product, so it is reasonable that the gradient depends on it. If the inner product takes its simplest form in Cartesian coordinates and a more complicated form in other coordinate systems, then it makes sense for the gradient to have a more complicated form in non-Cartesian coordinate systems.

The above equation means $g$ can also be viewed as a map $T_p \to T_p^*$ that eats a tangent vector $\nabla f$ and returns the one-form $df$. Now assume the coordinate system $q_i$ has an orthogonal coordinate basis at point $p$, so that $g(\frac{\partial}{\partial q_i}, \frac{\partial}{\partial q_j}) = 0$ for $i \neq j$ and $g(\frac{\partial}{\partial q_i}, \frac{\partial}{\partial q_i}) = \lvert \frac{\partial}{\partial q_i} \rvert^2 = h_i^2$. Given a tangent vector $\sum a_i\frac{\partial}{\partial q_i}$,

$$ \begin{aligned} g(\sum a_i\frac{\partial}{\partial q_i}) &= \sum h_i^2a_idq_i \end{aligned} $$

which is a one form.
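For instance, in polar coordinates, taking for granted the scale factors $h_r = 1$ and $h_\theta = r$ that are computed further down in this answer, the equation reads

$$ g\left(a_r\frac{\partial}{\partial r} + a_\theta\frac{\partial}{\partial \theta}\right) = a_r\,dr + r^2 a_\theta\,d\theta $$

Applied to a second vector $v_r\frac{\partial}{\partial r} + v_\theta\frac{\partial}{\partial \theta}$, this one-form returns $a_r v_r + r^2 a_\theta v_\theta$, which is just the inner product of the two vectors written in the coordinate basis.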

We also want to express $\nabla f$ in an orthonormal basis, rather than a merely orthogonal one. The orthonormal basis is simply $\mathbf{\hat{e}}_i = \frac{1}{h_i} \frac{\partial}{\partial q_i}$; putting it into the previous equation we get

$$ \begin{aligned} g(\sum \frac{1}{h_i}\frac{\partial}{\partial q_i}) &= \sum h_idq_i \end{aligned} $$

Now since $df_p = \sum \frac{\partial f}{\partial q_i}dq_i$, to find the gradient we need to find $a_i$ such that

$$ \begin{aligned} g(\sum a_i\frac{1}{h_i}\frac{\partial}{\partial q_i}) &= \sum \frac{\partial f}{\partial q_i}dq_i \end{aligned} $$

Therefore $a_i = \frac{1}{h_i}\frac{\partial f}{\partial q_i}$, hence we get the result

$$ \begin{aligned} \nabla f = \sum \frac{1}{h_i} \frac{\partial f}{\partial q_i} \mathbf{\hat{e}}_i \end{aligned} $$


Now, to calculate the gradient in an orthogonal coordinate system, we need to calculate $h_i$. Here we relate the coordinate basis to the Cartesian basis, where $\lvert \frac{\partial}{\partial x_j} \rvert = 1$.

$$ \begin{aligned} h_i &= \lvert \frac{\partial}{\partial q_i} \rvert \\ &= \lvert \sum_j \frac{\partial x_j}{\partial q_i} \frac{\partial}{\partial x_j} \rvert \end{aligned} $$

Consider the polar coordinate system, where $x = r\cos \theta$ and $y = r\sin \theta$. We get

$$ \begin{aligned} \frac{\partial x}{\partial r} &= \cos \theta \\ \frac{\partial y}{\partial r} &= \sin \theta \end{aligned} $$

Therefore $h_r = \lvert \cos \theta \frac{\partial}{\partial x} + \sin \theta \frac{\partial}{\partial y} \rvert = 1$.

$$ \begin{aligned} \frac{\partial x}{\partial \theta} &= -r\sin \theta \\ \frac{\partial y}{\partial \theta} &= r\cos \theta \end{aligned} $$

Therefore $h_\theta = \lvert -r\sin \theta \frac{\partial}{\partial x} + r\cos \theta \frac{\partial}{\partial y} \rvert = r$.
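The two scale factors can also be checked numerically. The following is only a minimal sketch, not part of the derivation: it assumes nothing beyond the map $x = r\cos\theta$, $y = r\sin\theta$, and the helper names `polar_to_cartesian` and `scale_factors`, the step size `eps`, and the sample point are illustrative choices of mine.

```python
import math

def polar_to_cartesian(r, theta):
    """Map polar coordinates (r, theta) to Cartesian (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

def scale_factors(r, theta, eps=1e-6):
    """Estimate (h_r, h_theta) as the lengths of the coordinate basis
    vectors d/dr and d/dtheta, using central differences of the map."""
    x_rp, y_rp = polar_to_cartesian(r + eps, theta)
    x_rm, y_rm = polar_to_cartesian(r - eps, theta)
    x_tp, y_tp = polar_to_cartesian(r, theta + eps)
    x_tm, y_tm = polar_to_cartesian(r, theta - eps)
    # Length of the displacement in the xy-plane per unit change of coordinate.
    h_r = math.hypot(x_rp - x_rm, y_rp - y_rm) / (2 * eps)
    h_theta = math.hypot(x_tp - x_tm, y_tp - y_tm) / (2 * eps)
    return h_r, h_theta

# Arbitrary sample point (hypothetical values, chosen for illustration).
h_r, h_theta = scale_factors(2.5, 0.7)
print(h_r, h_theta)  # h_r should be close to 1, h_theta close to r = 2.5
```

The estimate of $h_\theta$ grows in proportion to $r$, which is the numerical counterpart of the statement that a fixed change in angle moves the point farther the larger $r$ is.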

Now we get the gradient in polar coordinates:

$$ \nabla f = \frac{\partial f}{\partial r} \mathbf{\hat{e}}_r + \frac{1}{r}\frac{\partial f}{\partial \theta} \mathbf{\hat{e}}_\theta $$
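As a sanity check on the final formula, one can compare it against the ordinary Cartesian gradient by finite differences. This is a rough sketch under stated assumptions: the test function `g`, the helper names `central_diff` and `polar_gradient_in_cartesian`, the step `eps`, and the sample point are all hypothetical choices made for illustration, and any smooth function should work in place of `g`.

```python
import math

def g(x, y):
    # Hypothetical smooth test function, chosen only for illustration.
    return x * x * y + math.sin(y)

def f(r, theta):
    # The same function expressed in polar coordinates.
    return g(r * math.cos(theta), r * math.sin(theta))

def central_diff(func, a, b, wrt, eps=1e-6):
    """Central-difference partial derivative of func(a, b) with respect to
    the first argument (wrt=0) or the second argument (wrt=1)."""
    if wrt == 0:
        return (func(a + eps, b) - func(a - eps, b)) / (2 * eps)
    return (func(a, b + eps) - func(a, b - eps)) / (2 * eps)

def polar_gradient_in_cartesian(r, theta, include_scale=True):
    """Build f_r e_r + c f_theta e_theta in Cartesian components,
    with c = 1/r if include_scale is True and c = 1 otherwise."""
    f_r = central_diff(f, r, theta, 0)
    f_t = central_diff(f, r, theta, 1)
    if include_scale:
        f_t /= r
    e_r = (math.cos(theta), math.sin(theta))
    e_t = (-math.sin(theta), math.cos(theta))
    return (f_r * e_r[0] + f_t * e_t[0], f_r * e_r[1] + f_t * e_t[1])

r, theta = 2.0, 0.6
x, y = r * math.cos(theta), r * math.sin(theta)

cartesian = (central_diff(g, x, y, 0), central_diff(g, x, y, 1))
with_scale = polar_gradient_in_cartesian(r, theta, include_scale=True)
without_scale = polar_gradient_in_cartesian(r, theta, include_scale=False)

print("Cartesian gradient:  ", cartesian)
print("polar, with 1/r:     ", with_scale)
print("polar, without 1/r:  ", without_scale)
```

At this sample point the version with the $\frac{1}{r}$ factor matches the Cartesian gradient up to finite-difference error, while the version without it does not.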


Some good references that helped me:

Definition of the gradient for non-Cartesian coordinates

https://www.math.arizona.edu/~faris/methodsweb/manifold.pdf

Rui Liu
  • 567