
I'm trying to derive the gradient in polar coordinates using the chain rule.

So the idea is that when we have a function $f(x,y)$ and we switch to polar coordinates, we're really composing $f$ with $P(r,\theta) = (r\cos(\theta),r\sin(\theta))$. So then the gradient of $f$ in polar coordinates should just be $\nabla (f\circ P)(r,\theta)$.

From my last question I know that $$\begin{aligned}\nabla(f\circ P)(r,\theta) &= \begin{bmatrix}(\partial_1f\circ P)(r,\theta) & (\partial_2f\circ P)(r,\theta)\end{bmatrix}\begin{bmatrix}\partial_1P_1(r,\theta) & \partial_2P_1(r,\theta) \\ \partial_1P_2(r,\theta) & \partial_2P_2(r,\theta)\end{bmatrix} \\ &= \begin{bmatrix} (\partial_1f\circ P)(r,\theta)\cdot\partial_1P_1(r,\theta) + (\partial_2f\circ P)(r,\theta)\cdot\partial_1P_2(r,\theta) \\ (\partial_1f\circ P)(r,\theta)\cdot\partial_2P_1(r,\theta) +(\partial_2f\circ P)(r,\theta)\cdot\partial_2P_2(r,\theta)\end{bmatrix}^T\end{aligned}$$

But I also know that $$\begin{aligned}\partial_1(f\circ P)(r,\theta) &= (\partial_1f\circ P)(r,\theta)\cdot\partial_1P_1(r,\theta) + (\partial_2f\circ P)(r,\theta)\cdot\partial_1P_2(r,\theta) \\ \partial_2(f\circ P)(r,\theta) &= (\partial_1f\circ P)(r,\theta)\cdot\partial_2P_1(r,\theta) +(\partial_2f\circ P)(r,\theta)\cdot\partial_2P_2(r,\theta)\end{aligned}$$

by the regular chain rule.

So, putting that together I get $$\nabla(f\circ P)(r,\theta) = \begin{bmatrix}\partial_1(f\circ P)(r,\theta) & \partial_2(f\circ P)(r,\theta)\end{bmatrix}$$

i.e. $$\nabla(f\circ P) = \frac{\partial (f\circ P)}{\partial r}\mathbf {\hat r} + \frac{\partial (f\circ P)}{\partial \theta}\mathbf {\hat \theta}$$

If I were to then write this out in more traditional notation, where $f$ and $f\circ P$ are not distinguished, it should look like $$\nabla f = \frac{\partial f}{\partial r}\mathbf {\hat r} + \frac{\partial f}{\partial \theta}\mathbf {\hat \theta}$$

But comparing this to the correct formula for the gradient in polar coordinates, which for reference is usually written as $$\nabla f = \frac{\partial f}{\partial r}\mathbf {\hat r} + \frac 1r\frac{\partial f}{\partial \theta}\mathbf {\hat \theta},$$ I see that I'm missing a factor of $\frac 1r$ on the second term. Where does that come from?


Edit: BTW, I notice something interesting, though it may have nothing to do with the problem I'm having. But $$\begin{bmatrix}\partial_1P_1(r,\theta) & \partial_2P_1(r,\theta) \\ \partial_1P_2(r,\theta) & \partial_2P_2(r,\theta)\end{bmatrix} = \begin{bmatrix}\cos(\theta) & -r\sin(\theta) \\ \sin(\theta) & r\cos(\theta)\end{bmatrix}$$ would be an orthogonal matrix (in fact, it'd be a rotation) if we multiplied the second column by $\frac 1r$. But then that's the column that goes into $\partial_2(f\circ P)(r,\theta)$. So if I normalized that column, then somehow my formula would have come out correctly with the $\frac 1r$. But I see no reason why I should do that. Is this just a weird coincidence?
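
A quick numerical sanity check of this observation, as a sketch only. The helper names (`jacobian`, `column_norms`, `normalize_columns`, `is_rotation`) and the sample point are assumptions made for illustration. It checks that the columns of the Jacobian of $P$ have lengths $1$ and $r$, and that rescaling each column to unit length leaves a rotation matrix.

```python
import math

def jacobian(r, theta):
    # Jacobian of P(r, theta) = (r cos(theta), r sin(theta));
    # column 0 is dP/dr, column 1 is dP/dtheta.
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def column_norms(m):
    # Euclidean length of each column of a 2x2 matrix.
    return [math.hypot(m[0][j], m[1][j]) for j in range(2)]

def normalize_columns(m):
    # Divide each column by its own length.
    norms = column_norms(m)
    return [[m[i][j] / norms[j] for j in range(2)] for i in range(2)]

def is_rotation(m, tol=1e-12):
    # A 2x2 rotation has orthonormal columns and determinant +1.
    dot = m[0][0] * m[0][1] + m[1][0] * m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    unit = all(abs(n - 1.0) < tol for n in column_norms(m))
    return unit and abs(dot) < tol and abs(det - 1.0) < tol

if __name__ == "__main__":
    J = jacobian(3.0, 0.4)  # arbitrary sample point with r != 1
    print("column lengths:", column_norms(J))
    print("J is a rotation:", is_rotation(J))
    print("normalized J is a rotation:", is_rotation(normalize_columns(J)))
```

The second column is $\partial P/\partial\theta$, whose length is $r$, so dividing it by $r$ gives a unit vector in the direction of increasing $\theta$.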

Bobbie D
  • If the basis vectors in cartesian coordinates are $(e_x, e_y)$, the corresponding vectors in cylindrical coordinates are $(e_r, r e_\theta)$. – Biswajit Banerjee Oct 16 '16 at 21:47
  • The problem with your reasoning is that $\partial_if \circ P \neq \partial_i(f \circ P)$. – João Caminada Oct 16 '16 at 21:47
  • @JoãoCaminada I didn't say $\partial_i(f\circ P) = \partial_i f\circ P$. Under "But I also know that" I have $$\partial_i(f\circ P) = \sum_j(\partial_jf\circ P)(r,\theta)\cdot \partial_iP_j(r,\theta)$$ – Bobbie D Oct 16 '16 at 21:50
  • @BiswajitBanerjee Can you expand on that? Aren't $e_r$ and $e_\theta$ just the basis vectors in the $r,\theta$ plane? And thus $\begin{bmatrix}\partial_1(f\circ P)(r,\theta) & \partial_2(f\circ P)(r,\theta)\end{bmatrix}$ which is in the $r,\theta$ plane should just be $\partial_1(f\circ P)(r,\theta)e_r + \partial_2(f\circ P)(r,\theta)e_\theta$? If not, why? – Bobbie D Oct 16 '16 at 21:53
  • @JoãoCaminada So to derive the correct expression I need to calculate $(\nabla f\circ P)(r,\theta)$ and not $\nabla (f\circ P)(r,\theta)$? – Bobbie D Oct 16 '16 at 22:06
  • Actually there is an implicit identification: $$ f(x,y) = f(r,\theta). $$ Rigorously, this identity is wrong! – João Caminada Oct 16 '16 at 22:12
  • Here's the reference http://math.stackexchange.com/questions/586848/how-to-obtain-the-gradient-in-polar-coordinates. – Jacky Chong Oct 16 '16 at 22:16
  • @JoãoCaminada Yeah. The whole reason I'm doing it this way instead of the way presented by the answerer in the link by Jacky Chong is that I'm trying to be rigorous. If I'm not doing it correctly, what is the rigorous way to derive (and interpret) the identity? – Bobbie D Oct 16 '16 at 22:23
  • @BobbieD You are almost there! Write $\hat{f} = f \circ P$ for simplicity. Rigorously the formula should be $$ \nabla f = \frac{\partial\hat{f}}{\partial r}\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial\hat{f}}{\partial\theta}\hat{\mathbf{\theta}}. $$ See if you can proceed from here. – João Caminada Oct 16 '16 at 22:28

2 Answers


Polar coordinates: $$ \begin{cases} x = r\cos(\theta) \\ y = r \sin(\theta) \end{cases} \quad \Longrightarrow \quad \begin{cases} \mathbf {\hat r} = \begin{bmatrix} \cos(\theta) \\ \sin(\theta) \end{bmatrix} \\ \mathbf {\hat \theta} = \begin{bmatrix} -\sin(\theta) \\ \cos(\theta) \end{bmatrix} \end{cases} $$

Chain rule (on the left, $f$ stands for $f\circ P$; on the right, the Cartesian partials are evaluated at $P(r,\theta)$): $$ \begin{aligned} \frac{\partial f}{\partial r} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} = \cos(\theta) \frac{\partial f}{\partial x} + \sin(\theta) \frac{\partial f}{\partial y} \\ \frac{\partial f}{\partial \theta} &= \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \theta} = - r \sin(\theta) \frac{\partial f}{\partial x} + r \cos(\theta) \frac{\partial f}{\partial y} \end{aligned} $$

Matrix format, after dividing the second equation by $r$ (assuming $r \neq 0$): $$ \begin{bmatrix} \dfrac{\partial f}{\partial r} \\ \dfrac{1}{r} \dfrac{\partial f}{\partial \theta} \end{bmatrix} = \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ - \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} \dfrac{\partial f}{\partial x} \\ \dfrac{\partial f}{\partial y} \end{bmatrix} $$

Inverse transform (the matrix is a rotation, so its inverse is its transpose): $$ \begin{bmatrix} \dfrac{\partial f}{\partial x} \\ \dfrac{\partial f}{\partial y} \end{bmatrix} = \begin{bmatrix} \cos(\theta) & - \sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} \dfrac{\partial f}{\partial r} \\ \dfrac{1}{r} \dfrac{\partial f}{\partial \theta} \end{bmatrix} = \frac{\partial f}{\partial r} \begin{bmatrix} \cos(\theta) \\ \sin(\theta) \end{bmatrix} + \frac{1}{r} \frac{\partial f}{\partial \theta} \begin{bmatrix} - \sin(\theta) \\ \cos(\theta) \end{bmatrix} $$

Conclusion: $$ \nabla f = \frac{\partial f}{\partial r}\mathbf {\hat r} + \frac 1r\frac{\partial f}{\partial \theta}\mathbf {\hat \theta} $$
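
The conclusion can be checked numerically. The following is a minimal sketch, not part of the original derivation: the test function `f`, the sample point, the step size `h`, and all helper names are assumptions chosen for illustration. It approximates the partial derivatives of $f\circ P$ by central differences, rebuilds the Cartesian gradient from $\mathbf{\hat r}$ and $\mathbf{\hat\theta}$ with and without the factor $\frac 1r$, and compares both against the analytic Cartesian gradient.

```python
import math

def f(x, y):
    # Hypothetical test function, chosen only for illustration.
    return x**2 * y + math.sin(y)

def grad_f_cartesian(x, y):
    # Analytic Cartesian gradient of the test function above.
    return (2 * x * y, x**2 + math.cos(y))

def f_hat(r, theta):
    # f composed with P(r, theta) = (r cos(theta), r sin(theta)).
    return f(r * math.cos(theta), r * math.sin(theta))

def polar_partials(r, theta, h=1e-6):
    # Central-difference approximations of d(f o P)/dr and d(f o P)/dtheta.
    df_dr = (f_hat(r + h, theta) - f_hat(r - h, theta)) / (2 * h)
    df_dtheta = (f_hat(r, theta + h) - f_hat(r, theta - h)) / (2 * h)
    return df_dr, df_dtheta

def combine(r, theta, coeff_r, coeff_theta):
    # coeff_r * r_hat + coeff_theta * theta_hat, in Cartesian components.
    r_hat = (math.cos(theta), math.sin(theta))
    theta_hat = (-math.sin(theta), math.cos(theta))
    return (coeff_r * r_hat[0] + coeff_theta * theta_hat[0],
            coeff_r * r_hat[1] + coeff_theta * theta_hat[1])

def polar_gradient(r, theta):
    # Formula with the 1/r factor (requires r != 0).
    df_dr, df_dtheta = polar_partials(r, theta)
    return combine(r, theta, df_dr, df_dtheta / r)

def naive_gradient(r, theta):
    # Same formula with the 1/r factor left out.
    df_dr, df_dtheta = polar_partials(r, theta)
    return combine(r, theta, df_dr, df_dtheta)

if __name__ == "__main__":
    r, theta = 2.0, 0.7  # arbitrary sample point with r != 1
    x, y = r * math.cos(theta), r * math.sin(theta)
    print("Cartesian gradient:", grad_f_cartesian(x, y))
    print("with 1/r factor:   ", polar_gradient(r, theta))
    print("without 1/r factor:", naive_gradient(r, theta))
```

At a point with $r \neq 1$, only the version with the $\frac 1r$ factor should agree with the Cartesian gradient, up to finite-difference error.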

Han de Bruijn

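The extra factor of $\frac 1r$ in the expression for $\nabla f$ appears because unit vectors are being used. If the basis vectors are not normalised to unit vectors, then $\nabla f$ can take the simpler form $$ \nabla f = \frac{\partial f}{\partial r}\mathbf{e}^r + \frac{\partial f}{\partial \theta}\mathbf{e}^\theta, $$ but only with respect to the reciprocal (dual) basis $\mathbf{e}^r = \mathbf{\hat r}$, $\mathbf{e}^\theta = \frac 1r\mathbf{\hat\theta}$. With respect to the tangent basis $\frac{\partial P}{\partial r} = \mathbf{\hat r}$, $\frac{\partial P}{\partial \theta} = r\,\mathbf{\hat\theta}$, the second coefficient is $\frac{1}{r^2}\frac{\partial f}{\partial\theta}$ instead.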
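
A numerical sketch of which unnormalised basis makes the simpler form work. Everything named here is an assumption for illustration, not taken from the answer: the test function $f(x,y) = xy$, the sample point, and the helpers `f_hat`, `polar_partials`, `combine`, and `gradient_candidates`. It uses the plain coefficients $\frac{\partial f}{\partial r}$, $\frac{\partial f}{\partial\theta}$ with two candidate bases, the reciprocal basis $(\mathbf{\hat r}, \frac 1r\mathbf{\hat\theta})$ and the tangent basis $(\mathbf{\hat r}, r\,\mathbf{\hat\theta})$, and compares each result with the exact gradient $(y, x)$.

```python
import math

def f_hat(r, theta):
    # Hypothetical test function f(x, y) = x * y, written in polar coordinates.
    x, y = r * math.cos(theta), r * math.sin(theta)
    return x * y

def polar_partials(r, theta, h=1e-6):
    # Central-difference approximations of the partials of f o P.
    df_dr = (f_hat(r + h, theta) - f_hat(r - h, theta)) / (2 * h)
    df_dtheta = (f_hat(r, theta + h) - f_hat(r, theta - h)) / (2 * h)
    return df_dr, df_dtheta

def combine(a, u, b, v):
    # The linear combination a*u + b*v of two plane vectors.
    return (a * u[0] + b * v[0], a * u[1] + b * v[1])

def gradient_candidates(r, theta):
    df_dr, df_dtheta = polar_partials(r, theta)
    r_hat = (math.cos(theta), math.sin(theta))
    theta_hat = (-math.sin(theta), math.cos(theta))
    dual_theta = (theta_hat[0] / r, theta_hat[1] / r)     # reciprocal basis vector
    tangent_theta = (r * theta_hat[0], r * theta_hat[1])  # dP/dtheta
    x, y = r * math.cos(theta), r * math.sin(theta)
    return {
        "exact": (y, x),  # Cartesian gradient of x * y
        "reciprocal": combine(df_dr, r_hat, df_dtheta, dual_theta),
        "tangent": combine(df_dr, r_hat, df_dtheta, tangent_theta),
        # Tangent basis with the second coefficient rescaled by 1/r^2.
        "tangent_fixed": combine(df_dr, r_hat, df_dtheta / r**2, tangent_theta),
    }

if __name__ == "__main__":
    for name, vec in gradient_candidates(2.0, 0.7).items():
        print(name, vec)
```

Under these assumptions, the `reciprocal` and `tangent_fixed` entries should match `exact` up to finite-difference error, while `tangent` should not.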