To add to the earlier response: I interpret your question as asking for heuristics for manipulating matrices, not specifically for what matrix multiplication means.
I assume here that vectors are column vectors, so $x^T$ refers to a row vector, and that capital letters denote matrices. When $A=(a_{ij})$ then $A^T=(a_{ji})$, so transposition, that is, the interchange of rows and columns, corresponds to switching the indices! Remembering that, you can easily convert a symbolic matrix product into a sum over indexed expressions, manipulate it, and convert back to a symbolic matrix product.
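For instance, the standard rule $(AB)^T = B^T A^T$ (included here only to illustrate the index trick) follows in one line:
$$
\big((AB)^T\big)_{ij} = (AB)_{ji} = \sum_k a_{jk} b_{ki}
= \sum_k (B^T)_{ik}\,(A^T)_{kj} = (B^T A^T)_{ij}.
$$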
One useful trick is pre- and post-multiplication by diagonal matrices: pre-multiplication corresponds to operations on the rows, while post-multiplication corresponds to operations on the columns. That is, letting $D$ be a diagonal matrix, in $DA$ each row of $A$ is multiplied by the corresponding diagonal element of $D$, while in $AD$ each column of $A$ is multiplied by the corresponding diagonal element.
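A quick numerical sketch (using NumPy, purely as an illustration) makes the row/column scaling easy to check:

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3).astype(float)  # a 3x3 test matrix
d = np.array([1.0, 10.0, 100.0])                  # the diagonal elements
D = np.diag(d)

# D @ A scales the rows of A: row i is multiplied by d[i]
assert np.allclose(D @ A, d[:, None] * A)

# A @ D scales the columns of A: column j is multiplied by d[j]
assert np.allclose(A @ D, A * d[None, :])
```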
Now an example to show how to use these manipulation tricks. Suppose $X$ is an $n\times n$ matrix such that there exists a basis for $\mathbb R^n$ consisting of eigenvectors of $X$ (we assume all elements are real here). That is, the eigenvalue/eigenvector equation $Xx=\lambda x$ has $n$ linearly independent solutions, call them (or some choice of them if they are not unique) $x_1, \dots, x_n$, with corresponding eigenvalues $\lambda_i$, the elements of the diagonal matrix $\Lambda$. Write
$$
X x_i = \lambda_i x_i
$$
Now let $P$ be the matrix with the $x_i$ as columns. How can we write the equations above as one matrix equation? Since the constants $\lambda_i$ multiply columns, we know that in the matrix representation the diagonal matrix $\Lambda$ must post-multiply $P$. That is, we get
$$
X P = P \Lambda
$$
Pre-multiplying both sides by the inverse of $P$ (which exists since the columns $x_i$ are linearly independent), we get
$$
P^{-1} X P = \Lambda
$$
That is, we can see that $X$ is similar to the diagonal matrix consisting of its eigenvalues.
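Here is a small numerical check of this (a NumPy sketch, assuming a symmetric matrix so that a real eigenvector basis certainly exists):

```python
import numpy as np

X = np.array([[2.0, 1.0],
              [1.0, 3.0]])     # symmetric, so real eigenvalues and eigenvectors

lam, P = np.linalg.eig(X)      # columns of P are the eigenvectors x_i
Lambda = np.diag(lam)

# X P = P Lambda: the lambda_i post-multiply the columns of P
assert np.allclose(X @ P, P @ Lambda)

# P^{-1} X P = Lambda: X is similar to the diagonal matrix of its eigenvalues
assert np.allclose(np.linalg.inv(P) @ X @ P, Lambda)
```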
One more example: if $S$ is a sample covariance matrix, how can we convert it to a sample correlation matrix? The correlation between variables $i$ and $j$ is the covariance divided by the product of the standard deviations of variable $i$ and variable $j$:
$$
\text{cor}(X_i,X_j) = \frac{\text{cov}(X_i, X_j)}
{\sqrt{\text{var}(X_i) \text{var}(X_j) }}
$$
Looking at this with matrix eyes, we are dividing the $(i,j)$-element of the matrix $S$ by the square roots of the $i$th and $j$th diagonal elements! We are dividing each row of $S$ and each column of $S$ by the same diagonal elements, so this can be expressed as pre- and post-multiplication by the (same) diagonal matrix, namely the inverse of the one holding the square roots of the diagonal elements of $S$. We have found:
$$
R = D^{-1/2} S D^{-1/2}
$$
where $R$ is the sample correlation matrix, and $D$ is a diagonal matrix holding the diagonal elements of $S$.
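As a NumPy sketch of this (the data and variable names are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3))     # 100 observations of 3 variables
S = np.cov(data, rowvar=False)       # sample covariance matrix

d = np.sqrt(np.diag(S))              # standard deviations: square roots of diag(S)
D_inv_sqrt = np.diag(1.0 / d)        # the matrix D^{-1/2}

R = D_inv_sqrt @ S @ D_inv_sqrt      # R = D^{-1/2} S D^{-1/2}

# agrees with the correlation matrix computed directly
assert np.allclose(R, np.corrcoef(data, rowvar=False))
```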
There are lots of applications of this kind of trick, and I find them so useful that textbooks should include them. One other example: now let $P$ be a permutation matrix, that is, an $n\times n$ matrix representing a permutation on $n$ symbols. Such a matrix has one 1 and $n-1$ zeros in each row and each column, and can be obtained by permuting (in the same way!) the rows and columns of an identity matrix. Then $AP$ (since it is a post-multiplication) permutes the columns of $A$, while $PA$ permutes the rows of $A$.
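A final illustrative sketch in NumPy: build a permutation matrix by permuting the rows of the identity, and check which side of the product permutes what.

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)   # 3x3 test matrix
perm = [2, 0, 1]                     # a permutation of the indices 0, 1, 2
P = np.eye(3, dtype=int)[perm]       # permute the rows of the identity matrix

# P A permutes the rows of A (in the same way the identity's rows were permuted)
assert np.array_equal(P @ A, A[perm])

# A P permutes the columns of A (by the inverse permutation, since the
# permutation was written into the rows of P)
assert np.array_equal(A @ P, A[:, np.argsort(perm)])
```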