Algebraic Rules for Matrix Equations

Question

I am learning Linear Algebra through Professor Gilbert Strang's lectures on MIT OCW.

A concept I recently covered is finding the best solution to $AX=b$ when $b$ does not lie in the column space of $A$. ($A$ has full column rank, but not full row rank.)

$$AX=b ~~~~~~~~~~~~~~~(1)\\ A^TA \hat{X} = A^T b~~~~~(2)$$

We can't solve $(1)$, so we solve $(2)$ instead.

Eq. $(2)$ looks like $AX=b$ simply multiplied on both sides by $A^T$. One would naively assume that doing the same thing to both sides of an equation does not change the status quo, but clearly something is different because $X$ does not exist and $\hat{X}$ does. Further, $(2)$ can't be reversed to get back $(1)$ because $A^{-1}$ does not exist.

This leads me to conclude that:

An equation is guaranteed to be preserved when multiplied on both sides by a matrix only if that matrix is invertible.

Is this right? Are there other caveats to keep in mind while manipulating a matrix equation?

Yes, when dealing with square matrices. Let $A$ be an $m \times n$ matrix, and let $X$ and $Y$ be column matrices of length $n$. The equation $AX = AY$ is equivalent to $X = Y$ (for all $X$, $Y$) if and only if $A$ has rank $n$. When $A$ is square, this amounts to saying that $A$ is invertible. This is analogous to an equation $ax = ay$ involving real numbers. It is equivalent to $x = y$ only when $a$ is "invertible," which for a real number means $a \ne 0$. Non-invertible matrices behave somewhat like the real number zero in this respect. — user49640, Jun 13 '17 at 06:30

score 2 · Accepted Answer · answered Jun 14 '17 at 00:54

Yes, your idea is correct.

One thing to remember is the importance of sizes and invertibility. As you said, finding $x$ such that $Ax=b$ is not always possible if $A$ is not square or not invertible: an exact solution exists only when $b$ happens to lie in the column space of $A$.
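To make the role of invertibility concrete, here is a minimal NumPy sketch of the cancellation rule from the comment above. The matrix `A` and the vectors `X` and `Y` are made-up values chosen for illustration: `A` is square but has rank 1, so $AX = AY$ can hold even though $X \ne Y$.

```python
import numpy as np

# Hypothetical rank-deficient matrix: the second row is twice the first.
A = np.array([[1, 2],
              [2, 4]])

# Two different vectors (illustrative values only).
X = np.array([2, 0])
Y = np.array([0, 1])

rank = np.linalg.matrix_rank(A)           # 1, which is less than n = 2
same_image = np.array_equal(A @ X, A @ Y) # both products equal [2, 4]
same_vector = np.array_equal(X, Y)

print(rank, same_image, same_vector)
```

Multiplying by this `A` throws away information, so `A @ X == A @ Y` cannot be "divided" back to `X == Y`.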

Notice that if $A\in\mathbb{R}^{m\times n}$, then $A^TA\in\mathbb{R}^{n\times n}$, so this is like a way of making the coefficient matrix square. Notice also that $A^TA$ is positive (semi)definite and is invertible if $A$ has full column rank. So it's also like a way to make the coefficient matrix invertible. (As the commenter above notes, making the coefficient matrix square and invertible is analogous to making sure the scalar equation $ax=b$ has $a\ne 0$.)
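For completeness, a short sketch of the standard argument behind those two claims (it is not spelled out above, so treat it as a supplementary derivation). For any $x\in\mathbb{R}^n$,

$$x^T A^T A x = (Ax)^T (Ax) = \|Ax\|_2^2 \ge 0,$$

so $A^TA$ is positive semidefinite. If $A^TAx = 0$, then $\|Ax\|_2^2 = x^TA^TAx = 0$, hence $Ax = 0$. When $A$ has full column rank, $Ax = 0$ forces $x = 0$, so the square matrix $A^TA$ has a trivial null space and is therefore invertible (and in fact positive definite).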

What is interesting is why multiplying $Ax=b$ on the left by $A^T$, giving the normal equations $A^TA\hat{x}=A^Tb$, magically produces a system that can be solved to give $$ \hat{x} = \operatorname*{arg\,min}_{x\in\mathbb{R}^n} \| b-Ax \|_2, $$ i.e. $\hat{x}$ is as close to the "right answer" as you can be. (No such notion exists in scalar equations: that would be like having $ax=b$ with $a=0$, and then finding that solving $cax=cb$ suddenly gave you a great answer.)
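A small numerical sketch may take some of the magic out of it. The data below are made up for illustration (three points that do not lie on one line), and the variable names are my own. The code solves the normal equations directly, compares the result with NumPy's least-squares routine `np.linalg.lstsq`, and checks that the residual $b - A\hat{x}$ is orthogonal to the columns of $A$, which is exactly what $A^T(b - A\hat{x}) = 0$ says.

```python
import numpy as np

# Hypothetical overdetermined system: 3 equations, 2 unknowns, full column rank.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])   # b is not in the column space of A

# Solve the normal equations A^T A x_hat = A^T b.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# Reference answer from NumPy's least-squares solver.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)

# The residual is orthogonal to every column of A.
residual = b - A @ x_hat
orthogonality = A.T @ residual

print(x_hat)          # approximately [ 5. -3.]
print(orthogonality)  # approximately [0. 0.]
```

In practice one usually calls `np.linalg.lstsq` (or a QR-based solver) rather than forming $A^TA$ explicitly, since forming $A^TA$ squares the condition number, but for a small well-conditioned example both routes agree.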

The other thing to keep in mind is that matrix multiplication is not commutative. In fact, for rectangular matrices, it may even be that $M_1M_2$ exists but $M_2M_1$ does not.
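Both points are easy to check numerically. In this sketch the matrices are arbitrary placeholders, with shapes chosen so that one product is defined and the other is not. NumPy's `@` operator raises `ValueError` when the inner dimensions do not match.

```python
import numpy as np

M1 = np.ones((2, 3))   # 2 x 3
M2 = np.ones((3, 4))   # 3 x 4

# Inner dimensions match (3 and 3), so M1 M2 exists and is 2 x 4.
product = M1 @ M2
print(product.shape)

# Inner dimensions are 4 and 2, so M2 M1 is not defined.
try:
    M2 @ M1
    reverse_defined = True
except ValueError:
    reverse_defined = False
print(reverse_defined)

# Even when both products exist (square matrices), they can differ.
P = np.array([[0, 1],
              [0, 0]])
Q = np.array([[0, 0],
              [1, 0]])
commute = np.array_equal(P @ Q, Q @ P)
print(commute)
```

So when manipulating a matrix equation, "multiply both sides by $M$" always has to specify a side, and the same side must be used on both sides of the equation.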
