I have found numerous definitions of the divergence of a tensor, which leaves me confused about which one to trust.
In Tensor Algebra and Tensor Analysis for Engineers, Itskov begins with Gauss's theorem to define
\begin{equation} \text{div} ~\boldsymbol{S} = \lim_{V \to 0} \frac{1}{V} \int_{\partial V} \boldsymbol{S} ~\boldsymbol{n} ~da \end{equation}
which, resorting to a coordinate system, gives \begin{equation} \text{div} ~\boldsymbol{S} = \boldsymbol{S}_{,i} ~\boldsymbol{g}^i = S_{j}^{~~i} |_i ~\boldsymbol{g}^j \end{equation}
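For concreteness, my reading is that in a Cartesian basis this reduces to contracting the derivative with the second index of $\boldsymbol{S}$,
\begin{equation} (\text{div}~\boldsymbol{S})_j = \frac{\partial S_{ji}}{\partial x_i}, \qquad \text{i.e.} \quad \text{div}~\boldsymbol{S} = \frac{\partial S_{ji}}{\partial x_i}~\boldsymbol{e}_j , \end{equation}
which seems consistent with $(\boldsymbol{S}\,\boldsymbol{n})_j = S_{ji} n_i$ in the surface integral. (Please correct me if I am misreading Itskov here.)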
I actually like this definition because it arises naturally from Gauss's theorem. However, it requires choosing a basis. For a coordinate-free definition of the divergence, I have come across multiple candidates:
One from this wiki article defines the divergence as
\begin{equation} (\boldsymbol{\nabla \cdot S}) \boldsymbol{\cdot a} = \boldsymbol{\nabla \cdot} ~(\boldsymbol{S ~a}) \end{equation}
where $\boldsymbol{a}$ is an arbitrary constant vector. This gives
\begin{equation} \boldsymbol{\nabla \cdot S} = S^{i}_{~~j} |_i ~\boldsymbol{g}^j \end{equation}
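In a Cartesian basis, I take this to mean
\begin{equation} (\boldsymbol{\nabla\cdot S})_j = \frac{\partial S_{ij}}{\partial x_i}, \end{equation}
which agrees with Itskov's result only when $\boldsymbol{S}$ is symmetric (again, assuming I am reading both correctly).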
where the first index is contracted. Yet another wiki article defines
\begin{equation} (\boldsymbol{\nabla}\cdot\boldsymbol{T})\cdot\mathbf{c} = \boldsymbol{\nabla}\cdot\left(\mathbf{c}\cdot\boldsymbol{T}^\textsf{T}\right) \end{equation}
to give the exact same result as the other wiki article. (Here I presume that Reddy's notation is used, where a dot product denotes just about any product he can find! One problem with Reddy's notation is that I cannot figure out how he dots a vector into a dyad, as in $\boldsymbol{e}_k \cdot \boldsymbol{e}_i\otimes\boldsymbol{e}_j$, so please do not advise me to use his notation. That said, I don't know what $\mathbf{c}\cdot\boldsymbol{T}^\textsf{T}$ means; is it $\mathbf{c}~\boldsymbol{T}^\textsf{T}$, with $\boldsymbol{T}^\textsf{T}$ acting on the vector $\mathbf{c}$ from the left? If so, I don't think this holds for a general curvilinear basis. I guess $\mathbf{c}~\boldsymbol{\cdot}~\boldsymbol{T}^\textsf{T} = \boldsymbol{c}^\textsf{T}\boldsymbol{T}^\textsf{T}$ is more appropriate, but I don't reckon Reddy means it that way.) This article also says that Itskov's result (contracting the second index) is actually true only for symmetric tensors, an assumption Itskov never makes.
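My best guess at a Cartesian reading of the right-hand side, taking $\mathbf{c}\cdot\boldsymbol{T}^\textsf{T}$ to have components $c_j (T^\textsf{T})_{ji} = c_j T_{ij} = (\boldsymbol{T}\mathbf{c})_i$, is
\begin{equation} \boldsymbol{\nabla}\cdot\left(\mathbf{c}\cdot\boldsymbol{T}^\textsf{T}\right) = \frac{\partial (T_{ij} c_j)}{\partial x_i} = c_j \frac{\partial T_{ij}}{\partial x_i}, \end{equation}
which would indeed contract the first index, matching the other wiki article; but I am not sure this is what the notation is supposed to mean in a general curvilinear basis.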
Abeyratne's lecture notes (p. 64) use this definition
\begin{equation} (\text{div} ~\boldsymbol{T})\cdot\mathbf{c} = \text{div} \left(\boldsymbol{T}^\textsf{T}\mathbf{c}\right) \end{equation}
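Working this out in Cartesian components (if I am doing it right):
\begin{equation} \text{div}\left(\boldsymbol{T}^\textsf{T}\mathbf{c}\right) = \frac{\partial (T_{ji} c_j)}{\partial x_i} = c_j \frac{\partial T_{ji}}{\partial x_i} \quad\Longrightarrow\quad (\text{div}~\boldsymbol{T})_j = \frac{\partial T_{ji}}{\partial x_i}, \end{equation}
so the second index is contracted, which seems to agree with Itskov but not with the wiki articles.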
where he claims that the second index gets contracted. I don't know whether $\text{div}$ and $\boldsymbol{\nabla \cdot}$ denote the same operation or different ones.
Ogden's "Nonlinear Elastic deformations" puts it in a very nice way: that there are three possible contractions for the gradient of a 2nd rank tensor $\boldsymbol{\nabla}\otimes \boldsymbol{T}$, so defining the divergence is a matter of convention. He contracts the first index. But still, which one should one choose for a throughout consistency in his calculations. What is the definition of divergence?
Kelly's lecture notes were a little helpful, yet because their notation differs from the others', I keep wondering whether I am doing things the right way. For example, he finds for the gradient of a vector field that $\text{grad}~\boldsymbol{v} = (\boldsymbol{\nabla}\otimes\boldsymbol{v})^\textsf{T}$, whereas Ogden finds it without the transpose, and I believe they start from the same definition, namely the directional derivative. This creates a mess for me when defining the divergence of a vector field: should it be $\boldsymbol{\nabla\cdot v} = \text{tr}\left[(\boldsymbol{\nabla}\otimes\boldsymbol{v})^\textsf{T}\right]$ or $\boldsymbol{\nabla\cdot v} = \text{tr}\,(\boldsymbol{\nabla}\otimes\boldsymbol{v})$?
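To spell out in Cartesian components what I mean by the two candidates (assuming I have the index placement right):
\begin{equation} (\boldsymbol{\nabla}\otimes\boldsymbol{v})_{ij} = \frac{\partial v_j}{\partial x_i}, \qquad \left[(\boldsymbol{\nabla}\otimes\boldsymbol{v})^\textsf{T}\right]_{ij} = \frac{\partial v_i}{\partial x_j}. \end{equation}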
Please help me organize my thoughts on the subject, and share your experience with these same notational conflicts and how you have overcome them.