
The following proof is supposed to show that, for two square matrices whose product is the identity matrix, the order of multiplication does not matter: that is, if $AB = I$, then $BA = I$. However, although I understand the calculations in the proof, I cannot see how they confirm that this statement is true.


[Image of the proof: starting from $AB = I_n$, the first half shows that $(BA - I_n)B = O_n$; the second half uses the injectivity of $B$ and the rank-nullity theorem to conclude that $BA - I_n = O_n$, i.e. $BA = I_n$.]


I would greatly appreciate it if someone could clarify for me how it exactly proves this statement. Also, please use lower-level mathematical lingo, as I am (obviously) not advanced when it comes to mathematics. :) Thank you.

The Pointer
  • We start with only the assumption that we've been handed two square matrices $A$ and $B$ such that $AB = I$. Then we begin the proof. At the end of our manipulations, we find that the matrices we've been working with must also satisfy $BA = I$. So we conclude that if $AB = I$, then $BA = I$. In other words, we've seen that invertible matrices commute with their inverses (because $AB = BA = I$). Therefore, order of multiplication doesn't matter for these particular matrices. (Order does, of course, matter in general.) – Josh Keneda Aug 26 '16 at 06:47
  • If my comment solves your problem, I'll post it as an answer. But I'm a little unclear on what exactly you're looking for, so I'll leave it as a comment for now. – Josh Keneda Aug 26 '16 at 06:54
  • Hi Josh, and thanks for the comment. I understand the first part, but I do not understand the second part - everything from 'This manipulation shows that the matrix $(BA - I)B = O_n$.' onwards. I understand that it's supposed to show $AB = I_n$ and $BA = I_n$, but I do not see where exactly this is happening. Hopefully that clarifies my question. – The Pointer Aug 26 '16 at 06:59
  • You don't say what is troubling you. I started writing an answer with the intent to explain in more detail, but realized the author really does explain every single step rather simply. Is it that you do not know what rank is? – David P Aug 26 '16 at 07:00
  • Hi David. I understand rank and all of the calculations. I am having trouble with the conceptual component - the part where it actually shows that $AB = I_n$ and $BA = I_n$. Sorry for the vagueness of my question - that wasn't my intent. – The Pointer Aug 26 '16 at 07:02
  • Perhaps I'm overthinking this. I'm not sure what the second half (from the point specified in my previous comment) is telling me. Specifically, I'm not sure how it's relevant to the proof at all. The first half is showing that $O_n = AB - I_n$ and then that $(BA - I)B = O_n$ - this makes sense to me. So what does the second half add to the proof? Why not just end it here? – The Pointer Aug 26 '16 at 07:21
  • The second half is required because we want to conclude that $BA-I = 0$, but the first half only shows that $(BA-I)B = 0$. In general, if matrices $C$ and $D$ satisfy $CD = 0$, we can't conclude that one of them is the zero matrix (see the worked example just after these comments). So we have to take some care in showing that the $(BA - I)$ part is actually the zero matrix. Then, rearranging, we get our conclusion. – Josh Keneda Aug 26 '16 at 07:39
  • I understand now. Thank you, Josh. – The Pointer Aug 26 '16 at 18:25
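To make Josh Keneda's point concrete, here is a standard example (not from the original thread) of two nonzero $2\times 2$ matrices whose product is the zero matrix:

$$C=D=\begin{pmatrix}0&1\\0&0\end{pmatrix},\qquad CD=\begin{pmatrix}0&1\\0&0\end{pmatrix}\begin{pmatrix}0&1\\0&0\end{pmatrix}=\begin{pmatrix}0&0\\0&0\end{pmatrix}=O.$$

So from $(BA-I)B=O_n$ alone one cannot conclude $BA-I=O_n$; the second half of the proof exists precisely to rule out this kind of situation.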

2 Answers


I hope this will clarify it.

Let $AB=I.$

(1). $Bv=Bw\implies v=w$; that is, $B$ is injective. Because $Bv=Bw\implies B(v-w)=0\implies 0=A(B(v-w))=(AB)(v-w)=I(v-w)=v-w.$
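To see why having a left inverse forces injectivity, here is a small non-example (my own illustration, not part of the original answer): if $B$ fails to be injective, then no $A$ can satisfy $AB=I$. For instance, with

$$B=\begin{pmatrix}1&0\\0&0\end{pmatrix},\qquad B\begin{pmatrix}0\\1\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix}=B\begin{pmatrix}0\\0\end{pmatrix},$$

any $A$ would give $(AB)\begin{pmatrix}0\\1\end{pmatrix}=A\cdot 0=0\ne\begin{pmatrix}0\\1\end{pmatrix}$, contradicting $AB=I$. This is the contrapositive of (1).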

(2). Let $\{v_1,...,v_n\}$ be a linearly independent set of vectors. Then $S=\{Bv_1,...,Bv_n\}$ is a linearly independent set of vectors. Because if $a_1,...,a_n$ are scalars, not all of them $0,$ then $0\ne\sum_{j=1}^na_jv_j$ implies (by (1)) that $B(0)\ne B(\sum_{j=1}^na_jv_j).$ That is, $0=B(0)\ne \sum_{j=1}^n a_jBv_j.$

(3). Therefore $S$ is a vector-space basis (it is a linearly independent set of $n$ vectors in an $n$-dimensional space), so every vector $v$ is of the form $v=\sum_{j=1}^na_jBv_j=B(\sum_{j=1}^na_jv_j)=B(x_v),$ where $x_v=\sum_{j=1}^na_jv_j.$

(4). Finally, since for any vector $v$ there exists $x_v$ such that $v=Bx_v,$ we have, for every $v$, $$(BA-I)v=(BA-I)(Bx_v)=(BAB-B)x_v=(B(AB-I))x_v=$$ $$=(B\cdot O)x_v=(O)\cdot x_v=0.$$ So $(BA-I)v=0$ for all $v,$ so $BA-I=O.$
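As a quick sanity check of (4) (my own numerical example, not part of the original answer), take

$$A=\begin{pmatrix}1&1\\0&1\end{pmatrix},\qquad B=\begin{pmatrix}1&-1\\0&1\end{pmatrix},\qquad AB=\begin{pmatrix}1&0\\0&1\end{pmatrix}=I.$$

A direct computation gives $BA=\begin{pmatrix}1&0\\0&1\end{pmatrix}=I$ as well, so $BA-I=O$, exactly as the argument predicts.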

(5). We can also do step (4) as follows: Suppose, by contradiction, that $v\ne BAv$ for some $v.$ Writing $v=Bx_v$ as in (3), we get $Bx_v\ne BA(Bx_v)=B((AB)x_v)=B(Ix_v)=Bx_v,$ which is absurd.

Remark. It is necessary to use the fact that we have a finite-dimensional vector space. In an infinite-dimensional vector space $V$ there are linear functions $A:V\to V$ and $B:V\to V$ with $ABv=v$ for all $v\in V$ but $BAv\ne v$ for some $v\in V.$
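A standard example of this failure (stated here for concreteness; the answer only asserts existence) is given by the shift maps on the space of infinite sequences:

$$A(x_1,x_2,x_3,\dots)=(x_2,x_3,x_4,\dots),\qquad B(x_1,x_2,x_3,\dots)=(0,x_1,x_2,\dots).$$

Then $AB(x_1,x_2,\dots)=A(0,x_1,x_2,\dots)=(x_1,x_2,\dots)$, so $AB=I$; but $BA(1,0,0,\dots)=B(0,0,0,\dots)=(0,0,0,\dots)\ne(1,0,0,\dots)$, so $BA\ne I$.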


What the second part says is that if $f\circ g=0$ and $g$ is surjective, then $f$ must be zero. The reason: since $g$ is surjective, every member $v$ in the domain of $f$ is an image of some $x$ under $g$; hence $f(v)=f(g(x))=(f\circ g)(x)=0$. In particular, we have $g(x)=Bx$ and $f(v)=(BA-I)v$ in your proof.

If you read the proof carefully, you will see that $g$ is surjective because it is injective (in the first paragraph of the proof, it is first shown that $Bv=0\Rightarrow ABv=0\Rightarrow v=Iv=0$, i.e. $g$ is injective; then the rank-nullity theorem is applied to show that the column space of $B$ is the whole $\mathbb R^n$, i.e. $g$ is surjective). This "injectivity implies surjectivity" for linear maps actually relies on the finite dimensionality of the vector spaces in question. For infinite-dimensional vector spaces, the implication $AB=I\Rightarrow BA=I$ no longer holds. See the popular thread "If $AB = I$ then $BA = I$" for an in-depth discussion.
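Spelling out the dimension count behind this (a standard restatement, added for completeness): by the rank-nullity theorem,

$$\dim(\ker B)+\operatorname{rank}(B)=n,$$

and injectivity means $\ker B=\{0\}$, so $\operatorname{rank}(B)=n$. Hence the columns of $B$ span all of $\mathbb R^n$, i.e. $g(x)=Bx$ is surjective, which is exactly what the argument above needs.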

user1551