I know the identity $(AB)^T = B^TA^T$. I tried to research exactly why it holds and could not find a good explanation. My professor said the reason is that "you ought to put on your socks before your shoes", which did not really resonate with me...
-
See this. – azif00 Dec 18 '19 at 00:06
-
Simple: Because the dimensions need to match. More Advanced: Because dualization is contravariant. Both of these get elaborated on in the linked question. – Thorgott Dec 18 '19 at 00:09
-
@Thorgott that is what I was looking for. Thanks. – Ty Jensen Dec 18 '19 at 00:12
-
@Azif00 Beautiful. – Ty Jensen Dec 18 '19 at 00:14
-
Socks and shoes work for the inverse $(AB)^{-1} = B^{-1}A^{-1}$, not for the transpose, that is why it did not resonate. If you put on socks first, and then shoes, the order reverses when taking them off. – Conifold Dec 18 '19 at 01:02
2 Answers
One way of defining the transpose of $A$ is as the unique matrix $M$ that satisfies $\langle Mx, y \rangle = \langle x, A y \rangle$ for all $x,y$, where $\langle \cdot, \cdot \rangle$ is the standard inner product.
It is not hard to show that $M=A^T$ where $A^T$ is defined in the usual way.
Consequently, $ \langle (AB)^Tx, y \rangle = \langle x, ABy \rangle = \langle A^Tx, By \rangle = \langle B^T A^Tx, y \rangle $ for all $x,y$. Since two matrices with the same inner products against every pair $x,y$ must be equal, $(AB)^T = B^T A^T$.
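For anyone who wants to see this numerically, here is a small sanity check, written as a sketch in plain Python. The matrices `A`, `B` and the vectors `x`, `y` are arbitrary examples of mine, and the helper functions are hypothetical names rather than any library's API. It checks both the identity and the inner-product characterization on non-square matrices.

```python
# Sanity check of (AB)^T = B^T A^T using only built-in Python.
# A, B, x, y are arbitrary illustrative values, not taken from the question.

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(P, Q):
    """Return the matrix product PQ; the inner dimensions must agree."""
    assert len(P[0]) == len(Q), "inner dimensions do not match"
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

def matvec(M, v):
    """Return the matrix-vector product Mv."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

A = [[1, 2, 0], [3, -1, 4]]      # 2 x 3
B = [[2, 1], [0, 5], [-3, 2]]    # 3 x 2
x = [7, -2]
y = [1, 4]

AB = matmul(A, B)                                # 2 x 2
lhs = transpose(AB)                              # (AB)^T
rhs = matmul(transpose(B), transpose(A))         # B^T A^T

assert lhs == rhs
# The defining property: <(AB)^T x, y> = <x, AB y>
assert dot(matvec(lhs, x), y) == dot(x, matvec(AB, y))
```

Note that with these shapes $A^T B^T$ is $3 \times 3$ while $(AB)^T$ is $2 \times 2$, so the unreversed order cannot be right in general; that is the "dimensions need to match" point from the comments.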

I don't have the reputation to comment, but to add to the answer above, the adjoint of a linear operator $T$ on a finite-dimensional inner product space $V$, denoted $T^*$, is the unique function such that $$\langle Tv, w \rangle = \langle v, T^*w \rangle$$ for all $v, w \in V$.
It is straightforward to show that $T^*$ is a linear map, that $(TS)^* = S^*T^*$ for any other linear operator $S$ on $V$, and that the matrix representation of $T^*$ with respect to any orthonormal basis of $V$ is the conjugate transpose of the matrix representation of $T$ with respect to that basis. In the case of real $n \times n$ matrices, using the dot product on $\mathbb{R}^n$, the conjugate transpose is just the transpose, so these results imply $(AB)^T = B^T A^T$.
The book "Linear Algebra Done Right" by Sheldon Axler covers these results, for example.
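For completeness, here is a sketch of why the order reverses for adjoints, using only the defining property and uniqueness stated above. Apply the definition first to $T$ (with the vector $Sv$) and then to $S$: $$\langle TSv, w \rangle = \langle Sv, T^*w \rangle = \langle v, S^*T^*w \rangle$$ for all $v, w \in V$. So $S^*T^*$ satisfies the property that defines $(TS)^*$, and by uniqueness $(TS)^* = S^*T^*$. Each step moves the outermost operator across the inner product, which is why $T$ crosses first and its adjoint ends up on the right.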
