I want to understand how the "physicists' definition" of a tensor and the "mathematicians' definition" are equivalent. I'm going to stick to the finite dimensional case, and note that I am not concerned with the difference between tensor fields and tensors.
Let $V$ be a finite dimensional vector space, $V^{*}$ its dual space, and $F$ the underlying field. A type $(n,m)$ tensor $T$ is a multilinear map $$T: V \times ... \times V \times V^{*} \times ... \times V^{*} \to F $$ where there are $n$ copies of $V$ and $m$ copies of $V^{*}$.
At this point, after pondering this definition for a while, you can see that any such tensor is completely determined by its values on all combinations of basis vectors of the $n$ copies of $V$ and $m$ copies of $V^{*}.$ In other words, writing $d$ for the dimension of $V$, we should be able to represent a tensor as a (multidimensional) array with $n+m$ indices, each running from $1$ to $d$ (so $d^{n+m}$ entries in total). However, the values of this array will depend on the choice of basis for $V$ (and for $V^{*}$ - but let's assume for now that we will always choose the ``natural'' dual basis for $V^{*}$ given our choice of basis for $V$).
The fact that tensors can be represented as multidimensional arrays whose values depend on the chosen basis might lead you to consider a ``physicists' definition of a type $(n, m)$ tensor'':
Let $V$ be a finite dimensional vector space with dimension $d$ and $F$ the underlying field. A type $(n,m)$ tensor $T$ associated to the vector space $V$ is a multidimensional array with $n+m$ indices, each running from $1$ to $d$, which obeys a certain transformation law.
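To make the array picture concrete, here is a minimal NumPy sketch (with $d = 2$ and an arbitrarily chosen bilinear map, both my own choices for illustration) that builds the array of a type $(2,0)$ tensor by evaluating it on all pairs of basis vectors:

```python
import numpy as np

d = 2

def T(v, w):
    """An arbitrary bilinear map T: V x V -> R (a type (2,0) tensor above)."""
    return v[0]*w[0] + 3*v[0]*w[1] - 2*v[1]*w[1]

# Evaluate T on all pairs of standard basis vectors to get its d x d array.
e = np.eye(d)
L = np.array([[T(e[i], e[j]) for j in range(d)] for i in range(d)])

# T(v, w) is then recovered as v^T L w for any v, w.
v, w = np.array([1.0, 2.0]), np.array([3.0, -1.0])
assert np.isclose(T(v, w), v @ L @ w)
```

The same recipe with $n+m$ nested loops produces the $d^{n+m}$-entry array of a general type $(n,m)$ tensor.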
At this point, we don't know what the transformation law is. Instead, we want to pretend we are discovering it. So we start by considering the simplest cases first and then work our way up in complexity. From here on we'll take $d=2$, $F=\mathbb{R}$ for simplicity, and we'll represent $v,w \in V$ as column vectors and $\phi, \psi \in V^{*}$ as row vectors.
Example 1:
A type $(1,0)$ tensor $$T: V \to \mathbb{R}$$
We know that any linear map from $V$ to a scalar can be represented as a row vector. So
$$T(v) = \begin{bmatrix} L_1 & L_2 \end{bmatrix} \begin{bmatrix}
v_1\\
v_2\\
\end{bmatrix}$$
for some $L_1, L_2 \in \mathbb{R}.$ Now what would the transformation law be? Well we can easily derive it:
$$T(v) = Lv = L(R^{-1}R)v = (LR^{T})(Rv) = \hat{L} \hat{v}$$
where there was some change of basis $$v \mapsto \hat{v}, \quad \hat{v} = Rv, \quad RR^{T} = I,$$ by an orthogonal matrix $R$, so that $R^{-1} = R^{T}$.
Conclusion: The transformation law is given by: $$v \mapsto Rv \implies L \mapsto LR^T.$$ (This case also tells us how the covectors transform in general.)
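As a sanity check, here is a small NumPy sketch (my own choices: $d = 2$, random $L$ and $v$, and a rotation matrix as the orthogonal $R$) verifying that the law $L \mapsto LR^{T}$ leaves the scalar $T(v)$ unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
L = rng.normal(size=(1, 2))   # row vector representing T
v = rng.normal(size=(2, 1))   # column vector in V

# An orthogonal change of basis: a rotation by an arbitrary angle, R R^T = I.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v_hat = R @ v        # v -> R v
L_hat = L @ R.T      # L -> L R^T

# The scalar T(v) must be unchanged by the change of basis.
assert np.isclose(L @ v, L_hat @ v_hat)
```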
Example 2:
A type $(0,1)$ tensor $$T: V^{*} \to \mathbb{R}$$
This case is similar. Any linear map from $V^{*}$ to a scalar can be represented by a column vector. So
$$T(\phi) = \begin{bmatrix} \phi_1 & \phi_2 \end{bmatrix} \begin{bmatrix}
L_1\\
L_2\\
\end{bmatrix}$$
for some $L_1, L_2 \in \mathbb{R}.$
$$T(\phi) = \phi L = \phi R^T R L = (\phi R^{T})(RL) = \hat{\phi} \hat{L} $$
Transformation law: $$v \mapsto Rv, \phi \mapsto \phi R^{T} \implies L \mapsto RL.$$
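The same kind of numerical check works here (again with my own choices of $d = 2$, random data, and a rotation for $R$): the law $L \mapsto RL$ leaves $T(\phi)$ unchanged when $\phi \mapsto \phi R^{T}$.

```python
import numpy as np

rng = np.random.default_rng(1)
phi = rng.normal(size=(1, 2))   # row vector in V*
L = rng.normal(size=(2, 1))     # column vector representing T

# An orthogonal change of basis, R R^T = I.
theta = 1.2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

phi_hat = phi @ R.T   # phi -> phi R^T
L_hat = R @ L         # L -> R L

# The scalar T(phi) must be unchanged by the change of basis.
assert np.isclose(phi @ L, phi_hat @ L_hat)
```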
Now this is where I start to get confused. I want to consider all tensors that are represented by two-dimensional arrays, i.e. matrices. This includes the type $(2,0), (1,1), (0,2)$ tensors. I want to represent each of these using the normal rules for column/row vector and matrix multiplication.
Ostensibly it looks like we can only represent a type $(1,1)$ tensor: $$T(\phi, v) = \begin{bmatrix} \phi_1 & \phi_2 \end{bmatrix} \begin{bmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$$ But we can actually easily represent a multilinear map from two copies of $V$ with the following: $$T(w, v) = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}^{T} \begin{bmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}.$$ This is a bilinear form (a ``quadratic form'' when $w = v$). And if we want to represent a multilinear map from two copies of $V^{*}$ then we can just write: $$T(\phi, \psi) = \begin{bmatrix} \phi_1 & \phi_2 \end{bmatrix} \begin{bmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} \psi_1 & \psi_2 \end{bmatrix}^{T} $$ We could also represent a type $(1,1)$ tensor like: $$T(w, \psi) =\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}^{T} \begin{bmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} \psi_1 & \psi_2 \end{bmatrix}^{T}.$$
Now we derive our transformation laws. We insist that $v \in V$ transforms like $\hat{v} = Rv$ and $\phi \in V^{*}$ transforms like $\hat{\phi} = \phi R^T.$
$$\phi L v = \phi R^{T} R L R^{T} R v = (\phi R^{T}) (R L R^{T}) (R v) = \hat{\phi} \hat{L} \hat{v}$$ $$w^T L v = w^T R^{T} R L R^{T} R v = (Rw)^{T} (R L R^{T}) (R v) = \hat{w}^T \hat{L} \hat{v}$$ $$\phi L \psi^{T} = \phi R^{T} R L R^{T} R \psi^{T} = (\phi R^{T}) (R L R^{T}) (\psi R^{T})^{T} = \hat{\phi} \hat{L} \hat{\psi}^{T}$$ $$w^T L \psi^{T} = w^T R^{T} R L R^{T} R \psi^{T} = (Rw)^{T} (R L R^{T}) (\psi R^{T})^{T} = \hat{w}^T \hat{L} \hat{\psi}^{T}$$
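All four cases can be checked numerically at once. The sketch below (again with my own choices of $d = 2$, random data, and a rotation for $R$) confirms the observation driving my confusion: the single law $L \mapsto RLR^{T}$ preserves all four scalars.

```python
import numpy as np

rng = np.random.default_rng(2)
L = rng.normal(size=(2, 2))
v, w = rng.normal(size=(2, 1)), rng.normal(size=(2, 1))
phi, psi = rng.normal(size=(1, 2)), rng.normal(size=(1, 2))

# An orthogonal change of basis, R R^T = I.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

L_hat = R @ L @ R.T                   # the (apparently shared) law
v_hat, w_hat = R @ v, R @ w           # vectors: x -> R x
phi_hat, psi_hat = phi @ R.T, psi @ R.T  # covectors: x -> x R^T

# All four scalars are unchanged under the identical law L -> R L R^T.
assert np.isclose(phi @ L @ v, phi_hat @ L_hat @ v_hat)
assert np.isclose(w.T @ L @ v, w_hat.T @ L_hat @ v_hat)
assert np.isclose(phi @ L @ psi.T, phi_hat @ L_hat @ psi_hat.T)
assert np.isclose(w.T @ L @ psi.T, w_hat.T @ L_hat @ psi_hat.T)
```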
So I am getting the same transformation law for the type $(2,0), (1,1), $ and $(0,2)$ tensors! This is clearly not right, so what is going on here?