2

Dr. Strang, in his book *Linear Algebra and Its Applications* (p. 108), says the following when talking about the left inverse of an $m$ by $n$ matrix:

UNIQUENESS: For full column rank $r = n$: $Ax = b$ has at most one solution $x$ for every $b$ if and only if the columns are linearly independent. Then $A$ has an $n$ by $m$ left-inverse $B$ such that $BA = I_n$. This is possible only if $m \geq n$.

I understand why there can be at most one solution when $A$ has full column rank, but how does that lead to $A$ having a left inverse?

I'd be grateful if someone could help or hint at the answer.

Kashmiri
    They don't say that the equation has a solution. They say it has at most one solution. And also, I think you mean $I_n$, not $I_a$. – Arthur May 10 '21 at 09:49
  • @Arthur, yes I edited it. Thank you. – Kashmiri May 10 '21 at 10:44
  • The columns of $A$ are linearly independent if and only if the null space of $A$ is trivial, if and only if the linear map induced by $A$ is injective. And it is a general fact that a map or function (not necessarily linear) is injective if and only if it has a left inverse. Similarly, a function is surjective if and only if it has a right inverse. This is a comment instead of an answer because I'm not providing a proof. –  May 17 '21 at 08:27
  • This doesn't seem to me like a question that deserves a bounty ... – Arman Malekzadeh May 17 '21 at 08:28
  • I wasn't understanding it, and it had been a while since I posted it without getting much help. – Kashmiri May 17 '21 at 08:30
  • @Bungo, I don't understand why the author skipped what you've linked as a "general fact". Looking at the answers, it takes real thought to prove and shouldn't have just been stated as it is in the book. I suspect there might be a simpler proof. – Kashmiri May 17 '21 at 08:38
  • I've never been a fan of Strang's book. Looking at it now, I see that he defines the left inverse explicitly as $(A^T A)^{-1}A^T$, punting until a later chapter the proof that $A^T A$ is invertible when $A$ has full column rank. This completely obscures what is actually a straightforward fact. I'll go ahead and post a full answer. –  May 17 '21 at 08:44
  • If you are ok with an abstract argument (?): The column vectors of $A$ form $n$ independent vectors $a_1,...,a_n$ in ${\Bbb R}^m$ (so $m\geq n$). Supplement this with $m-n$ vectors to get a basis of ${\Bbb R}^m$. Then define a linear map $B$ taking $a_1$ to $e_1$, ..., $a_n$ to $e_n$, and the supplementary vectors to anything you like. Then $BA=I_n$. – H. H. Rugh May 17 '21 at 16:36

3 Answers

10

If $A$ is $m \times n$, then the following are equivalent:

  1. $A$ has full column rank $n$
  2. The columns of $A$ are linearly independent
  3. The null space of $A$ is trivial
  4. The map induced by $A$ is injective
  5. $A$ has a left inverse

Proof that 1 $\iff$ 2:

Immediate from the definition of column rank.

Proof that 2 $\iff$ 3:

Observe that the vector $Ax$ is equal to the linear combination $\sum_{i=1}^{n}a_i x_i$, where $a_i$ is the $i$'th column of $A$, and $x_i$ is the $i$'th component of $x$.

In particular, $Ax = 0$ if and only if $\sum_{i=1}^{n}a_i x_i = 0$.

The null space of $A$ is trivial if and only if $x=0$ is the only solution to $Ax = 0$ which, by what we said above, is true if and only if $\sum_{i=1}^{n}a_i x_i = 0$ implies $x_i = 0$ for all $i$, which is true if and only if $a_1, a_2, \ldots, a_n$ are linearly independent.
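
As a small illustration of this column picture (my own example, not from the book): if

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}, \qquad Ax = x_1 \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} + x_2 \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix},$$

and since the two columns are not multiples of each other, $Ax = 0$ forces $x_1 = x_2 = 0$.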

Proof that 3 $\iff$ 4:

Suppose that $Ax = Ay$. Since $A$ is linear, this is equivalent to $Ax - Ay = A(x-y) = 0$. Therefore $x-y$ is in the null space of $A$. But the null space of $A$ is trivial, hence $x-y = 0$, so $x=y$. This shows that (the map induced by) $A$ is injective (one-to-one).

Conversely, suppose that $A$ is injective. Then $x=0$ is the unique vector such that $Ax = 0$. Therefore the null space of $A$ is trivial.

Proof that 2 (and equivalently 4) $\implies$ 5:

Let $e_1, e_2, \ldots, e_n$ be the canonical basis for $\mathbb R^n$, meaning that $e_i$ has a $1$ in the $i$'th component, and zeros everywhere else. Note that for each $i$ we have $a_i = Ae_i$, where again $a_i$ is the $i$th column of $A$. Moreover, since $A$ is injective, $e_i$ is the unique vector that is mapped by $A$ to $a_i$.

Now, since $a_1, a_2, \ldots, a_n$ are linearly independent, they are a basis for the column space of $A$, which can be extended to a basis $a_1,a_2,\ldots, a_n, b_1,b_2,\ldots,b_{m-n}$ for $\mathbb R^m$. Hence an arbitrary $y \in \mathbb R^m$ has a unique representation of the form $y = \sum_{i=1}^{n} c_i a_i + \sum_{j=1}^{m-n} d_j b_j$ where $c_i$ and $d_j$ are scalars.

Therefore we can define a linear map $g : \mathbb R^m \to \mathbb R^n$ by first setting $g(a_i) = e_i$ for each $i=1,2,\ldots,n$ and $g(b_j) = 0$ for each $j=1,2,\ldots,m-n$, and then extending $g$ linearly to all of $\mathbb R^m$:

$$g(y) = g\left(\sum_{i=1}^{n} c_i a_i + \sum_{j=1}^{m-n} d_j b_j \right) = \sum_{i=1}^{n} c_i g(a_i) + \sum_{j=1}^{m-n} d_j g(b_j) = \sum_{i=1}^{n} c_i g(a_i) = \sum_{i=1}^{n} c_i e_i$$

Then $g$ is a left inverse of $A$:

$$g(Ax) = g\left(\sum_{i=1}^{n}a_i x_i\right) = \sum_{i=1}^{n} x_i g(a_i) = \sum_{i=1}^{n} x_i e_i = x$$

Proof that 5 $\implies$ 3:

Suppose that $Ax = 0$. Let $g$ be a left inverse of $A$. Then $x = g(Ax) = 0$. This shows that the null space of $A$ is trivial.
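
To see the construction in the proof of 2 $\implies$ 5 in action, here is a small worked example (mine, not Strang's). Take

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad a_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad a_2 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix},$$

which are linearly independent, and extend them with $b_1 = (0,0,1)^T$ to a basis of $\mathbb R^3$. Defining $g$ by $g(a_1) = e_1$, $g(a_2) = e_2$, $g(b_1) = 0$ amounts to solving $B \begin{bmatrix} a_1 & a_2 & b_1 \end{bmatrix} = \begin{bmatrix} e_1 & e_2 & 0 \end{bmatrix}$ for the matrix $B$ of $g$, which gives

$$B = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \end{bmatrix},$$

and one can check directly that $BA = I_2$.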


As a side note, it turns out that 4 and 5 are equivalent for general functions, not just linear maps. If $f$ is any injective function, then it has a left inverse, and conversely if $f$ is any function that has a left inverse, then it is injective. There is a proof here, for example. Since you indicated in the comments that this is an unfamiliar fact, I did not use it in the proof above but instead constructed a left inverse explicitly.


Note that my proof shows why a left inverse of $A$ must exist if $A$ has full column rank, but it doesn't explicitly show how to compute the left inverse.

As Strang notes, one formula for a left inverse is $B = (A^T A)^{-1} A^T$. That this is a left inverse is clear by computing:

$$BA = ((A^T A)^{-1} A^T) A = (A^T A)^{-1} (A^T A) = I_n$$

But as you will have noted, Strang punts to a later chapter the proof that $A^T A$ is invertible when $A$ has full column rank. So that's not very satisfactory!

Also, computing Strang's left inverse is very inefficient because it involves inverting $A^T A$. This requires a lot of calculation, proportional to $n^3$ operations for an $n \times n$ matrix.
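
If you want to check the formula numerically, here is a minimal sketch using NumPy (the matrix is just a small full-column-rank example I made up):

```python
import numpy as np

# A small 3x2 example with linearly independent columns (full column rank).
A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])

# Strang's left inverse B = (A^T A)^{-1} A^T.
B = np.linalg.inv(A.T @ A) @ A.T

# B A should be the 2x2 identity, up to floating-point rounding.
print(np.allclose(B @ A, np.eye(2)))  # True
```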

In practice, probably the best way to compute a left inverse is to perform row reduction on $A$ to bring it to the form

$$\begin{bmatrix} I_n \\ 0_{(m-n) \times n} \end{bmatrix}$$

where $I_n$ is the $n \times n$ identity matrix, and $0_{(m-n) \times n}$ is the $(m-n) \times n$ matrix consisting of all zeros. Row reduction to this form is possible if and only if the columns of $A$ are linearly independent.

Assuming you're familiar with row reduction, you probably know that each row operation can be expressed as an $m \times m$ elementary matrix of one of three forms, corresponding to the three row reduction operations (multiplying a row by a scalar, interchanging two rows, and adding a scalar multiple of one row to another). The row reduction procedure can then be expressed by left-multiplying $A$ by the corresponding elementary matrices. Assuming there are $k$ of these, we have:

$$E_k E_{k-1} \cdots E_2 E_1 A = \begin{bmatrix} I_n \\ 0_{(m-n) \times n} \end{bmatrix}$$

The product $E_k E_{k-1} \cdots E_2 E_1$ is easy to understand conceptually: it corresponds to the $k$ row operations used to bring $A$ into the reduced form. Fortunately, it's not necessary to compute $E_k E_{k-1} \cdots E_2 E_1$ as a product of $k$ matrices! Instead you compute it by starting with $I_{m}$ and performing the same row operations on it as you perform on $A$.

In any case, denoting $E_k E_{k-1} \cdots E_2 E_1$ by $B$, the above becomes

$$BA = \begin{bmatrix} I_n \\ 0_{(m-n) \times n} \end{bmatrix}$$

Note that $B$ is an $m \times m$ matrix. It is almost the left inverse we seek, except we want just $I_n$ on the right hand side, and a left inverse should be $n \times m$, not $m \times m$. If $m > n$ then the right hand side has $m-n$ spare rows of zeros at the bottom. To get rid of these, we can simply remove the bottom $m-n$ rows of $B$ to get an $n \times m$ matrix $B'$ which satisfies $B'A = I_n$ and is therefore a left inverse of $A$, as desired!
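
Here is a rough sketch of that procedure in Python, using SymPy's exact row reduction and the same hypothetical example matrix as above. Rather than tracking elementary matrices one by one, it row reduces the augmented matrix $[A \mid I_m]$, so the right block ends up holding a matrix $E$ with $EA$ in the reduced form, and the first $n$ rows of $E$ give a left inverse:

```python
import sympy as sp

A = sp.Matrix([[1, 1],
               [0, 1],
               [1, 0]])
m, n = A.shape

# Row reduce [A | I_m]; the right block records the net effect E of all
# the row operations, and the left block becomes E*A = [[I_n], [0]].
R, _ = sp.Matrix.hstack(A, sp.eye(m)).rref()

E = R[:, n:]   # m x m matrix of accumulated row operations
B = E[:n, :]   # keep the first n rows, so that B*A = I_n

print(B * A == sp.eye(n))  # True
```

Note that `rref` also cleans up the identity columns themselves, so $E$ is not literally the product $E_k \cdots E_1$ described above, but it still satisfies $EA = \begin{bmatrix} I_n \\ 0 \end{bmatrix}$, which is all that matters for extracting $B$.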

  • Thank you, I'm going through it. :) – Kashmiri May 17 '21 at 09:45
  • @Kashmiri I recognized that my proof is theoretical in that it shows that a left inverse of $A$ must exist, but it's not immediately obvious how to compute it. So I added some paragraphs at the bottom regarding computation. I hope they are helpful. –  May 17 '21 at 10:07
0

As an add-on to user169852's proof, I would note that Strang's argument that $A^T A$ is invertible if $A$ has full column rank is fairly simple.

He shows that for any matrix $A$, $A^T A$ has the same nullspace as $A$:

(1) The nullspace of $A$ is contained in the nullspace of $A^T A$: if $Ax = 0$, then $A^T A x = A^T 0 = 0$.

(2) To show the reverse inclusion, suppose that $A^T A x = 0$. Then $x^T A^T A x = 0$, so $(Ax)^T (Ax) = 0$. That is, the norm of $Ax$ is zero and hence $Ax = 0$. So the nullspace of $A^T A$ is contained in the nullspace of $A$. Hence the nullspace of $A$ equals the nullspace of $A^T A$.

user169852 has already shown that $A$ having full column rank implies its nullspace is trivial. So $A^T A$ is an $n \times n$ matrix with trivial nullspace and hence is invertible.

-1

Question: "I understand why there can be at most one solution for a full column rank but how does that lead to A having a left inverse? I'd be grateful if someone could help or hint at the answer."

Answer: Let $k$ be the field of real numbers and $V := k^n$, $W := k^m$. Since the equation $Ax = b$ has at most one solution for every $b \in W$, the equation $Ax = 0$ has exactly one solution, namely $x = 0$. Hence the map $A: V \rightarrow W$ is an injection, and it follows that $n \leq m$. Identifying $V$ with its image $A(V) \subseteq W$, we get an exact sequence of $k$-vector spaces

$$0 \rightarrow V \xrightarrow{\;A\;} W \xrightarrow{\;p\;} W/V \rightarrow 0$$

and we may choose a section $s$ of $p$. This is a $k$-linear map $s: W/V \rightarrow W$ with $p \circ s = \mathrm{Id}$. This gives an idempotent endomorphism of $W$: $u := s \circ p$ with $u^2 = u$. From this it follows that we may write

$$W \cong V \oplus \mathrm{Im}(u),$$

and the projection map $p_V: W \cong V \oplus \mathrm{Im}(u) \rightarrow V$ is a left inverse to the inclusion map defined by the matrix $A$. Choosing the standard bases for $W$ and $V$, the map $p_V$ is given by an $n \times m$ matrix $B$ with $BA = I_n$.

hm2020