Convective term in Navier-Stokes

Question

In the Navier-Stokes equations, there's a well-known convective term of the form: \begin{equation}(\mathbf{v}\cdot\nabla)\mathbf{v}\end{equation} I'm not able to understand it. As far as I know, the nabla operator is, as its name says, an operator, not simply an element of $\mathbb{R}^3$; it maps between function spaces. Therefore, the notations for divergence and curl in terms of nabla are merely convenient, not rigorous.

So how in the world does this term make sense? Maybe there's a better definition for nabla, because, using the most popular one, this seems really strange. But, it's from Navier-Stokes, so, obviously, there's something I'm missing here.

Let it be noted that I'm not studying Navier-Stokes, so physical interpretations and motivations for the term are not my main concern here. Actually, I just need to understand it in order to lucidly prove this identity: \begin{equation}\nabla\times[(\mathbf{v}\cdot\nabla)\mathbf{v}]=-\nabla\times[\mathbf v\times(\nabla\times\mathbf v)] \end{equation} So, as much as I wish to understand why this term is written like that, it's not particularly important for me, right now, to dive too deeply into the hydrodynamics.

Any help will be appreciated.

I’m voting to close this question because it is cross posted and answered. — Kurt G., Mar 20 '23 at 13:51
This is called the material derivative. And just like the divergence and curl, it is a useful notation that is only rigorous in Cartesian coordinates. The wiki article on del in cylindrical and spherical coordinates shows the form it takes in non-cartesian systems. — eyeballfrog, Mar 20 '23 at 14:08
@eyeballfrog That is false, this makes sense in any coordinate system. Really, using $\nabla$ like this is inherently coordinate-independent. — Nicholas Todoroff, Mar 20 '23 at 19:43
@KurtG. This seems squarely like a math question to me and would be more appropriate here, and I don't think the answer you link to is very satisfactory. — Nicholas Todoroff, Mar 20 '23 at 19:55
@NicholasTodoroffnich Certainly it makes sense in any coordinate system. However, Cartesian is the only coordinate system where $\mathbf{v}\cdot \nabla$ looks like a formal dot product. Other coordinates have extra terms. — eyeballfrog, Mar 21 '23 at 13:19
@eyeballfrog That's still absolutely false. Let's work in 2D for simplicity with $e_x, e_y$ the standard basis. Let $e_r(r,\theta) = \cos(\theta)e_x + \sin(\theta)e_y$ and $e_\theta(r,\theta) =-r\sin(\theta)e_x + r\cos(\theta)e_y$; this is the natural basis for the coordinates $(r,\theta)$. Its reciprocal is $e^r = e_r$ and $e^\theta = e_\theta/r^2$. We could represent $v$ or $\nabla$ in different coordinates if we wanted to, but let's do both in polar. — Nicholas Todoroff, Mar 23 '23 at 17:35
Then for some fixed $(r,\theta)$ we have $v = v^re_r + v^\theta e_\theta$ and $\nabla = e^r\partial_r + e^\theta\partial_\theta$ and $$v\cdot\nabla = e_r\cdot e^rv^r\partial_r + e_r\cdot e^\theta v^r\partial_\theta + e_\theta\cdot e^rv^\theta\partial_r + e_\theta\cdot e^\theta v^\theta\partial_\theta = v^r\partial_r + v^\theta\partial_\theta$$ which is absolutely the correct expression. — Nicholas Todoroff, Mar 23 '23 at 17:36
Or here it is with $v = v^xe_x + v^ye_y$: $$v\cdot\nabla = e_x\cdot e^rv^x\partial_r + e_x\cdot e^\theta v^x\partial_\theta + e_y\cdot e^rv^y\partial_r + e_y\cdot e^\theta v^y\partial_\theta = (v^x\cos\theta+v^y\sin\theta)\partial_r + \frac1r(v^y\cos\theta-v^x\sin\theta)\partial_\theta.$$ Choose whatever coordinates for $v$ or $\nabla$ you want; it doesn't matter. — Nicholas Todoroff, Mar 23 '23 at 17:36

score 1 · Answer 1 · answered Mar 20 '23 at 22:51

$ \renewcommand\vec\mathbf \newcommand\R{\mathbb R} \newcommand\PD[2]{\frac{\partial#1}{\partial#2}} \newcommand\tPD[2]{\partial#1/\partial#2} \newcommand\diff{\mathrm D} \newcommand\dd{\mathrm d} $See this answer of mine where I define $\nabla$ in arbitrary expressions rigorously. I'll give a short version of that exposition here and then explain what $(\vec v\cdot\nabla)\vec v$ means.

Let's just think about the gradient of some $f : \R^n \to \R$ for the moment. In this context you know that given Cartesian coordinates $(x^i)_{i=1}^n$ and the standard basis $\{\vec e_i\}_{i=1}^n$ we formally have $$ \nabla = \sum_{i=1}^n\vec e_i\PD{}{x^i} $$ in the sense that $\nabla f = \sum_i\vec e_i\tPD f{x^i}$. However, the gradient is an inherently coordinate-free concept: we define $\nabla f(\vec x)$ as the unique vector such that $$ (\nabla f(\vec x))\cdot\vec w = \diff f_{\vec x}(\vec w) $$ for all $\vec w \in \R^n$. The linear function $\diff f_{\vec x} : \R^n \to \R$ is the total differential of $f$ at $\vec x$, making $\diff f_{\vec x}(\vec w)$ the directional derivative of $f$ at $\vec x$ in the $\vec w$ direction. We can discover from this definition that if $(x^i)_{i=1}^n$ are any coordinates with corresponding basis $\{\vec e_i(\vec x)\}_{i=1}^n$ (which depends on a chosen point $\vec x$) then $\nabla$ can be given the form $$ \nabla = \sum_{i=1}^n\vec e^i(\vec x)\PD{}{x^i} $$ where $\vec e^i(\vec x)$ is the reciprocal basis uniquely defined by $$ \vec e^i(\vec x)\cdot\vec e_j(\vec x) = \delta^i_j. $$ As is common practice we will drop the $\vec x$-dependence and simply write $\vec e_i$ and $\vec e^i$. Note that this has nothing to do with the way the point $\vec x$ is expressed; we could express $\nabla$ using one set of coordinates and $\vec x$ using a completely different set of coordinates and the above expression for $\nabla$ would still be valid; we would just have to take into account the relationship between these coordinates via the chain rule (and the point-dependence of any basis vector once we extend $\nabla$ to act on vectors).
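As a quick numerical sanity check of the reciprocal-basis formula, here is a minimal Python sketch in 2D polar coordinates. The sample field $f(x,y) = x^2y$, the evaluation point, and all function names are my own illustrative choices, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
import math

def f_cart(x, y):
    # Sample scalar field (an arbitrary choice for illustration).
    return x * x * y

def f_polar(r, theta):
    # The same field expressed through the polar coordinates (r, theta).
    return f_cart(r * math.cos(theta), r * math.sin(theta))

def grad_polar(r, theta, h=1e-6):
    # Partial derivatives in r and theta, approximated by central differences.
    df_dr = (f_polar(r + h, theta) - f_polar(r - h, theta)) / (2 * h)
    df_dtheta = (f_polar(r, theta + h) - f_polar(r, theta - h)) / (2 * h)
    # Coordinate basis: e_r = (cos, sin), e_theta = (-r sin, r cos).
    # Reciprocal basis: e^r = e_r and e^theta = e_theta / r^2.
    e_r_up = (math.cos(theta), math.sin(theta))
    e_theta_up = (-math.sin(theta) / r, math.cos(theta) / r)
    # nabla f = e^r df/dr + e^theta df/dtheta, as Cartesian components.
    return tuple(e_r_up[k] * df_dr + e_theta_up[k] * df_dtheta for k in range(2))

def grad_cart(x, y):
    # Analytic Cartesian gradient of x^2 y.
    return (2 * x * y, x * x)

r, theta = 2.0, 0.7
g_polar = grad_polar(r, theta)
g_cart = grad_cart(r * math.cos(theta), r * math.sin(theta))
print(g_polar, g_cart)  # the two tuples should agree up to discretisation error
```

Both routes produce the same vector, which is the coordinate independence claimed above.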

Now let $L_{\vec x} : \R^n \to V$ be a linear function for each $\vec x \in \R^n$ with $V$ some arbitrary real vector space. The following formal manipulation motivates a definition: $$ L_{\vec x}(\nabla) = L_{\dot{\vec x}}\left(\sum_{i=1}^n\vec e^i\PD{}{\dot x^i}\right) = \sum_{i=1}^n\PD{}{\dot x^i}L_{\dot{\vec x}}(\vec e^i). $$ The overdots are to make it clear that we are not differentiating any point-dependence of $\vec e^i$, only that of $\vec x \mapsto L_{\vec x}$. You can convince yourself that this quantity is independent of the coordinates chosen; we will call this the derivative of $L$. Note that $\vec x \mapsto L_{\vec x}(\nabla) : \R^n \to V$.

When $f : \R^n \to \R$ then the derivative of $L_{\vec x}(\vec w) = \vec w f(\vec x)$ is the gradient of $f$.
When $\vec f : \R^n \to \R^n$ then the derivative of $L_{\vec x}(\vec w) = \vec w\cdot\vec f(\vec x)$ is the divergence of $\vec f$.
When $\vec f : \R^3 \to \R^3$ then the derivative of $L_{\vec x}(\vec w) = \vec w\times\vec f(\vec x)$ is the curl of $\vec f$.
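The three cases above can be checked numerically. The following is only a sketch under stated assumptions: it works in Cartesian coordinates (so $\vec e^i = \vec e_i$), approximates each partial derivative by a central difference, and uses sample fields `g` and `f` and a helper `nabla_of` that I made up for illustration.

```python
def nabla_of(L, x, h=1e-5):
    """Approximate L_x(nabla) = sum_i d/dx^i L_x(e_i) in Cartesian coordinates.

    L(x, w) must be linear in w and return a list of components.
    """
    n = len(x)
    total = None
    for i in range(n):
        e_i = [1.0 if j == i else 0.0 for j in range(n)]
        x_plus = [xj + h * ej for xj, ej in zip(x, e_i)]
        x_minus = [xj - h * ej for xj, ej in zip(x, e_i)]
        # Differentiate only the point-dependence of L; e_i is held fixed.
        term = [(p - m) / (2 * h) for p, m in zip(L(x_plus, e_i), L(x_minus, e_i))]
        total = term if total is None else [t + s for t, s in zip(total, term)]
    return total

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def g(x):
    # Sample scalar field: gradient is (2xy, x^2, 1).
    return x[0] ** 2 * x[1] + x[2]

def f(x):
    # Sample vector field: divergence is x + y + z, curl is (-y, -z, -x).
    return [x[0] * x[1], x[1] * x[2], x[2] * x[0]]

p = [1.0, 2.0, 3.0]
grad_g = nabla_of(lambda x, w: [wk * g(x) for wk in w], p)   # w g(x)
div_f = nabla_of(lambda x, w: [dot(w, f(x))], p)[0]          # w . f(x)
curl_f = nabla_of(lambda x, w: cross(w, f(x)), p)            # w x f(x)
print(grad_g, div_f, curl_f)
```

The same `nabla_of` routine handles all three operators; only the linear slot `L` changes.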

This gives an interpretation of $\nabla$ in any expression where it appears in a "linear slot". We can define multiple uses of $\nabla$ in the same expression as higher derivatives $L_{\vec x}(\nabla, \nabla,\dotsc)$ of a multilinear $L$; for instance the second derivative of $L_{\vec x}(\vec w_1, \vec w_2) = (\vec w_1\cdot\vec w_2)\vec f(\vec x)$ for any $\vec f : \R^n \to V$ is the Laplacian of $\vec f$: $$ L_{\vec x}(\nabla, \nabla) = \nabla^2\vec f(\vec x). $$ These higher derivatives make the most sense when partial derivatives commute since then the order of application of each $\nabla$ does not matter.

We can arrive at the following geometric interpretation of the derivative: $$ L_{\vec x}(\nabla) = \lim_{R_{\vec x}\to 0}\frac1{|R_{\vec x}|}\oint_{\partial R_{\vec x}}L_{\vec y}(\vec n)\,\dd S = \frac n{|\partial B_n|}\oint_{\partial B_n}\diff L_{\vec x}(\vec y)(\vec y)\,\dd S. $$ In the first integral the limit is taken over regions $R_{\vec x}$ of non-zero volume containing $\vec x$ "shrinking down" to zero volume, $\vec y$ is the variable of integration, $\vec n = \vec n(\vec y)$ is the outward-pointing unit normal of boundary $\partial R_{\vec x}$, and $\dd S$ is the scalar surface area measure. In the second integral, $B_n$ is the unit $n$-ball centered at the origin with surface area $|\partial B_n|$ and $\diff L_{\vec x}$ is the differential of $\vec x \mapsto L_{\vec x}$ evaluated at $\vec x$; this is a linear map taking vectors to linear functions so $\diff L_{\vec x}(\vec y) : \R^n \to V$ and finally $\diff L_{\vec x}(\vec y)(\vec y) \in V$. I will abstain from discussing these integrals for the sake of brevity, except to say the last one is exactly $n$ times the average of $\diff L_{\vec x}(\vec y)(\vec y)$ over all directions $\vec y$.
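To make the first integral concrete, here is a small Python sketch for the divergence case $L_{\vec y}(\vec n) = \vec n\cdot\vec f(\vec y)$ in 2D. It assumes the region is a small disc centred at $\vec x$ and approximates the boundary integral by an equally spaced sum; the field `f`, the radius, and the sample count are illustrative assumptions, not part of the argument above.

```python
import math

def f(p):
    # Sample 2D vector field; its divergence is y + 2y = 3y.
    x, y = p
    return (x * y, y * y)

def flux_density(center, radius=1e-2, samples=64):
    """Boundary integral of f . n over a small circle, divided by the disc area."""
    total = 0.0
    d_theta = 2 * math.pi / samples
    for k in range(samples):
        t = k * d_theta
        n = (math.cos(t), math.sin(t))                    # outward unit normal
        y = (center[0] + radius * n[0], center[1] + radius * n[1])
        fy = f(y)
        total += (fy[0] * n[0] + fy[1] * n[1]) * radius * d_theta  # L_y(n) dS
    return total / (math.pi * radius ** 2)

estimate = flux_density((1.0, 2.0))
print(estimate)  # should be close to the divergence 3y = 6 at the point (1, 2)
```

For this polynomial field the quotient already matches the divergence at a finite radius; for a general field the agreement only holds in the limit of a shrinking region.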

Now we discuss the particular case $(\vec v\cdot\nabla)\vec v$ where $\vec v = \vec v(\vec x)$ and $\vec v : \R^n \to \R^n$. It is crucial to note that the intent of this notation is to only differentiate the second $\vec v$, or written explicitly $(\vec v\cdot\dot\nabla)\dot{\vec v}$.

First consider some constant $\vec u \in \R^n$ and $(\vec u\cdot\nabla)\vec v$. We formalize this as the derivative of $L_{\vec x}(\vec w) = (\vec u\cdot\vec w)\vec v$; but now notice that if we write $\vec v$ in Cartesian coordinates $\vec v = v^i\vec e_i$ then $$ L_{\vec x}(\vec w) = \sum_{i=1}^n[\vec u\cdot(\vec w v^i)]\vec e_i. $$ The derivative of $\vec w \mapsto \vec u\cdot(\vec w v^i)$ is precisely $\vec u\cdot(\nabla v^i)$, which is the derivative of $v^i$ in the $\vec u$ direction. This shows that $L_{\vec x}(\nabla) = (\vec u\cdot\nabla)\vec v$ is precisely the $\vec u$-directional derivative of $\vec v$. In fact $$ \vec u\cdot\nabla $$ is always the $\vec u$-directional derivative operator regardless of what it is applied to.

We get the expression $(\vec v\cdot\dot\nabla)\dot{\vec v}$ by simply setting $\vec u = \vec v(\vec x)$; so this expression is the derivative of $\vec v$ in the $\vec v$ direction. In other words, this expression is the change in $\vec v$ that occurs only in the direction that it is pointing at a particular $\vec x$. Just like any directional derivative we can write $$ (\vec v\cdot\nabla)\vec v = \lim_{\epsilon\to0}\frac{\vec v(\vec x + \epsilon\vec v(\vec x))- \vec v(\vec x)}\epsilon = \left.\frac\dd{\dd t}\vec v(\vec x + t\vec v(\vec x))\right|_{t=0}. $$
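The difference quotient can be compared against the component formula $\sum_i v^i\,\partial_i\vec v$ on a concrete field. This is a hedged sketch: the field `v`, the point, and the step size are arbitrary choices of mine, and the one-sided quotient carries an error of order $\epsilon$.

```python
def v(p):
    # Sample 2D velocity field (illustrative choice).
    x, y = p
    return (x * y, x - y * y)

def convective_by_limit(p, eps=1e-6):
    # [v(x + eps v(x)) - v(x)] / eps: the first v is frozen at x, the second is differentiated.
    u = v(p)
    shifted = v((p[0] + eps * u[0], p[1] + eps * u[1]))
    return tuple((s - b) / eps for s, b in zip(shifted, u))

def convective_by_components(p):
    # (v . nabla) v = v^x d_x v + v^y d_y v, with the partials computed by hand.
    x, y = p
    vx, vy = v(p)
    dv_dx = (y, 1.0)         # d/dx of (x y, x - y^2)
    dv_dy = (x, -2.0 * y)    # d/dy of (x y, x - y^2)
    return tuple(vx * a + vy * b for a, b in zip(dv_dx, dv_dy))

point = (1.0, 2.0)
by_limit = convective_by_limit(point)
by_components = convective_by_components(point)
print(by_limit, by_components)  # analytic value is (x^2, -x y + 2 y^3) = (1, 14)
```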

Applying the geometric interpretation we get $$ (\vec v(\vec x)\cdot\nabla)\vec v(\vec x) = \lim_{R_{\vec x}\to 0}\frac1{|R_{\vec x}|}\oint_{\partial R_{\vec x}}(\vec v(\vec x)\cdot\vec n(\vec y))\vec v(\vec y)\,\dd S $$ with $\vec y$ the variable of integration. Notice how $\vec n(\vec y)$ is constrained by $\vec v(\vec x)\cdot\vec n(\vec y)$.

Finally we consider the identity $$ \nabla\times[(\vec v\cdot\nabla)\vec v] = -\nabla\times[\vec v\times(\nabla\times\vec v)]. $$

The simple fact that $$ L_{\vec x}(\vec w) = M_{\vec x}(\vec w)\text{ for all $\vec w$} \implies L_{\vec x}(\nabla) = M_{\vec x}(\nabla) $$ means that we can manipulate $\nabla$ in any way as if it were a vector so long as we keep track of what it differentiates. For instance, since we have the identity $$ \vec a\times(\vec b\times\vec c) = (\vec a\cdot\vec c)\vec b - (\vec a\cdot\vec b)\vec c $$ the following is true: $$ \vec v\times(\dot\nabla\times\dot{\vec v}) = (\vec v\cdot\dot{\vec v})\dot\nabla - (\vec v\cdot\dot\nabla)\dot{\vec v}. \tag{$*$} $$ To be completely explicit, this follows from the equality of the functions $$ L_{\vec x}(\vec w) = \vec u\times(\vec w\times\vec v(\vec x)),\quad M_{\vec x}(\vec w) = (\vec u\cdot\vec v(\vec x))\vec w - (\vec u\cdot\vec w)\vec v(\vec x) $$ for any constant $\vec u$, setting $\vec u = \vec v$ after differentiating. The second term of ($*$) we are already familiar with; the first term is impossible to write with the standard convention of "$\nabla$ differentiates to the right" (though it turns out that this term is $(\diff \vec v_\vec x)^T(\vec v)$ where $(\diff \vec v_\vec x)^T$ is the adjoint of the differential). However, in this particular case we can use the very powerful but simple subexpression rule which I state without proof:

The derivative of an expression is the sum of the derivatives of its subexpressions.

Practically, what this means is the following: $$ \dot\nabla(\dot{\vec v}\cdot\dot{\vec v}) = \dot\nabla(\dot{\vec v}\cdot\vec v) + \dot\nabla(\vec v\cdot\dot{\vec v}) = 2\dot\nabla(\vec v\cdot\dot{\vec v}). $$ Then since $(\vec v\cdot\dot{\vec v})\dot\nabla = \dot\nabla(\vec v\cdot\dot{\vec v})$ because $\vec v\cdot\dot{\vec v}$ is a scalar we see $$ \vec v\times(\nabla\times\vec v) = \frac12\nabla|\vec v|^2 - (\vec v\cdot\nabla)\vec v $$ where we've returned to the standard convention that $\nabla$ differentiates to the right. Applying $\nabla\times$ to each side we see that the first term gives the curl of a gradient, which is zero, and the second term gives the desired identity.
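The pointwise identity $\vec v\times(\nabla\times\vec v) = \frac12\nabla|\vec v|^2 - (\vec v\cdot\nabla)\vec v$ is easy to spot-check numerically. The sketch below assumes Cartesian coordinates and a central-difference Jacobian; the field `v`, the test point, and the helper names are hypothetical choices for illustration.

```python
def v(p):
    # Sample 3D vector field (illustrative choice).
    x, y, z = p
    return [x * y, y * z, z * x]

def jacobian(field, p, h=1e-5):
    # J[j][i] approximates d(field_j)/d(x_i) by central differences.
    J = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        plus = list(p)
        minus = list(p)
        plus[i] += h
        minus[i] -= h
        fp, fm = field(plus), field(minus)
        for j in range(3):
            J[j][i] = (fp[j] - fm[j]) / (2 * h)
    return J

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

p = [1.0, 2.0, 3.0]
vp, J = v(p), jacobian(v, p)

curl = [J[2][1] - J[1][2], J[0][2] - J[2][0], J[1][0] - J[0][1]]
lhs = cross(vp, curl)                                                         # v x (nabla x v)
half_grad_sq = [sum(vp[j] * J[j][k] for j in range(3)) for k in range(3)]    # (1/2) nabla |v|^2
convective = [sum(vp[i] * J[j][i] for i in range(3)) for j in range(3)]      # (v . nabla) v
rhs = [a - b for a, b in zip(half_grad_sq, convective)]
print(lhs, rhs)  # the two lists should agree up to discretisation error
```

Note that `half_grad_sq` contracts $\vec v$ with the row index of the Jacobian while `convective` contracts it with the column index; that transposition is exactly the difference between the two terms.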

score 0 · Answer 2 · answered Mar 20 '23 at 14:03

In vector calculus we often treat $\nabla$ as if it were a vector, purely for notational convenience. As you mentioned, it is not an element of $\mathbb{R}^3$. For example (using Einstein summation notation): $$(\vec{a}.\nabla)\vec{b} = a_i \partial_i b_j \hat{e}_j $$ where $\hat{e}_j$ is the canonical basis of $\mathbb{R}^3$ (another notation is $\{ \hat{i},\hat{j},\hat{k}\}$) and $\partial_i = \frac{\partial}{\partial x_i}$.
The inner product on $\mathbb{R}^n$ can be expressed as $$ \vec{a}.\vec{b} = a_i b_i $$ and in $\mathbb{R}^3$ the cross product $\times$ can be expressed as $$ \vec{a} \times \vec{b} = a_i b_j \epsilon_{ijk} \hat{e}_k $$ where $\epsilon_{ijk}$ is the Levi-Civita symbol.
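For readers who want to see the index expression in action, here is a minimal Python sketch of $(\vec a\times\vec b)_k = \epsilon_{ijk}a_ib_j$ with zero-based indices. The closed-form expression used for `epsilon` and the sample vectors are my own choices for illustration.

```python
def epsilon(i, j, k):
    # Levi-Civita symbol for indices in {0, 1, 2}:
    # +1 for even permutations, -1 for odd permutations, 0 if any index repeats.
    return (i - j) * (j - k) * (k - i) // 2

def cross_levi_civita(a, b):
    # (a x b)_k = sum over i, j of epsilon_{ijk} a_i b_j
    return [sum(epsilon(i, j, k) * a[i] * b[j] for i in range(3) for j in range(3))
            for k in range(3)]

def cross_components(a, b):
    # The familiar component formula, for comparison.
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

a, b = [1, 2, 3], [4, 5, 6]
print(cross_levi_civita(a, b), cross_components(a, b))
```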

Your latter identity can be proved using these expressions.
We have $$ \vec{a} \times (\nabla \times \vec{b}) = \vec{a}.(\nabla \vec{b}) - (\vec{a}.\nabla)\vec{b}. \tag{1} $$ Here $\vec{a}.(\nabla \vec{b})$ is again notation for $a_i (\partial_k b_i) \hat{e}_k$. Setting $\vec{a} = \vec{b} = \vec{v}$ in (1) gives $$ \vec{v}\times(\nabla \times \vec{v})+(\vec{v}.\nabla)\vec{v} = \vec{v}.(\nabla\vec{v}). \tag{2} $$ Given (2), showing $\nabla \times \{ \vec{v}.(\nabla \vec{v}) \}=0$ is enough to prove your identity: $$ \begin{aligned} \nabla \times \{ \vec{v}.(\nabla \vec{v}) \} &= \partial_i \big( \vec{v}.(\nabla \vec{v})\big)_j \epsilon_{ijk} \hat{e}_k = \partial_i \big( v_l \partial_j v_l \big)\epsilon_{ijk} \hat{e}_k \\ &= (\partial_j v_l )(\partial_i v_l )\epsilon_{ijk} \hat{e}_k + v_l (\partial_i \partial_j v_l )\epsilon_{ijk} \hat{e}_k \end{aligned} $$ In the last line there are summations over $i,j,k,l$. Apart from $\epsilon_{ijk}$, both terms are symmetric in $i,j$ (the second because partial derivatives commute), while $\epsilon_{ijk}=-\epsilon_{jik}$, so the result is $0$. $\square$

Proof of (1):
$$ \vec{a} \times (\nabla \times \vec{b}) = a_i \big( \nabla \times \vec{b}\big)_j \epsilon_{ijk} \hat{e}_k = a_i (\partial_l b_m) \, \epsilon_{lmj}\epsilon_{ijk}\,\hat{e}_k = a_i (\partial_l b_m) \, \epsilon_{jlm}\epsilon_{jki}\,\hat{e}_k $$ Using the identity $$ \epsilon_{jlm}\epsilon_{jki} = \delta_{lk}\delta_{mi} - \delta_{li}\delta_{mk} $$ we have $$ \begin{aligned} \vec{a} \times (\nabla \times \vec{b}) &= a_i (\partial_l b_m) \delta_{lk}\delta_{mi} \hat{e}_k - a_i (\partial_l b_m) \delta_{li}\delta_{mk} \hat{e}_k \\ &= a_i (\partial_k b_i) \hat{e}_k - a_i (\partial_i b_k) \hat{e}_k \\ &=\vec{a}.(\nabla \vec{b}) - (\vec{a}.\nabla)\vec{b} \end{aligned} $$
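The contraction identity $\epsilon_{jlm}\epsilon_{jki} = \delta_{lk}\delta_{mi} - \delta_{li}\delta_{mk}$ used in this proof can be verified by brute force over all index values. A minimal sketch with zero-based indices follows; the helper names `epsilon` and `delta` are illustrative.

```python
from itertools import product

def epsilon(i, j, k):
    # Levi-Civita symbol for indices in {0, 1, 2}.
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    # Kronecker delta.
    return 1 if i == j else 0

# Compare both sides for every choice of the free indices l, m, k, i.
mismatches = []
for l, m, k, i in product(range(3), repeat=4):
    lhs = sum(epsilon(j, l, m) * epsilon(j, k, i) for j in range(3))
    rhs = delta(l, k) * delta(m, i) - delta(l, i) * delta(m, k)
    if lhs != rhs:
        mismatches.append((l, m, k, i))

print(len(mismatches))  # an empty list means the identity holds for all 81 index choices
```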
