5

There is a theorem in my book that states: If $A$ is $m\times n$, then the equation $Ax = b$ has a unique least-squares solution for each $b$ in $\mathbb{R}^m$.

But can we find a counter-example to this by providing a matrix $A$ and vector $b$ such that $A^TAx = A^Tb$ produces a general solution with a free variable?

  • 2
    You're correct: if $A \in \mathbb{R}^{m \times n}$ with $m>n$ and the rank of $A$ is less than $n$, then the least squares problem has a solution which is not unique. The projection is unique, however. That is, any solution to the problem is mapped to the same vector by $A$. – Ian Aug 17 '14 at 23:25
  • 2
    It depends in part on what a "least squares solution" means. There is indeed a (unique) solution $x$ of least 2-norm that minimizes the 2-norm of the error $||Ax-b||$, whatever the rank or dimensions of $A$. – hardmath Aug 17 '14 at 23:27
  • Perhaps if you cited your book (author, title, edition), someone could clarify the context for you. – hardmath Aug 17 '14 at 23:30
  • 1
    You must, of course, stress the "least 2-norm" part. Otherwise simple things like $A=\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix}$ provide counterexamples. – Ian Aug 18 '14 at 00:07
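A quick numerical check of Ian's matrix and hardmath's minimum-norm remark (a sketch only; the choice of $b$ is arbitrary): the normal-equations matrix $A^TA$ is singular, yet the minimum-2-norm least-squares solution is still unique.

```python
import numpy as np

# Ian's counterexample: the columns of A are linearly dependent.
A = np.array([[1., 1.],
              [1., 1.],
              [1., 1.]])
b = np.array([1., 2., 3.])   # arbitrary right-hand side, for illustration only

# A^T A is singular, so A^T A x = A^T b has a free variable
# (infinitely many least-squares solutions).
print(np.linalg.matrix_rank(A.T @ A))   # 1, not 2

# The minimum-2-norm solution hardmath describes is nevertheless unique.
x_min = np.linalg.pinv(A) @ b
print(x_min)                              # [1. 1.]
```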

3 Answers

7

Of course you can have a non-unique solution when $A$ has a nontrivial null space. The point of a least-squares solution is to find the orthogonal projection of $b$ onto the image (column space) of $A$. When the columns of $A$ are linearly dependent, there is more than one solution; in fact, there are infinitely many.
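As a small numerical illustration of this point (the rank-deficient matrix and $b$ below are invented for the sketch): two different least-squares solutions map to the same projection of $b$ onto the column space of $A$.

```python
import numpy as np

# A rank-deficient matrix: the second column is twice the first.
A = np.array([[1., 2.],
              [0., 0.],
              [1., 2.]])
b = np.array([1., 1., 3.])

# One least-squares solution (minimum norm), plus a shift along the null space of A.
x1 = np.linalg.pinv(A) @ b
x2 = x1 + np.array([2., -1.])          # (2, -1) spans the null space of A

# Both are least-squares solutions: same projection of b, same residual norm.
print(np.allclose(A @ x1, A @ x2))     # True
print(np.linalg.norm(A @ x1 - b), np.linalg.norm(A @ x2 - b))
```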

Troy Woo
  • 3,579
1

Your theorem statement is incomplete. Requirements have been omitted.

To amplify the insights of @Troy Woo: given a matrix $\mathbf{A}\in\mathbb{C}^{m \times n}$, a solution vector $x\in\mathbb{C}^{n}$, and a data vector $b\in\mathbb{C}^{m}$ such that $b\notin\mathcal{N}(\mathbf{A}^{*})$, where $m, n\in\mathbb{N}$, the least-squares solutions of the linear system $$ \mathbf{A} x = b $$ can be expressed in terms of the Moore-Penrose pseudoinverse $\mathbf{A}^{\dagger}$: $$ x_{LS} = \mathbf{A}^{\dagger}b + \left(\mathbf{I}_{n} - \mathbf{A}^{\dagger}\mathbf{A} \right) y, $$ where $y\in\mathbb{C}^{n}$ is an arbitrary vector.

If the matrix rank $\rho < n$, the null space $\mathcal{N}\left(\mathbf{A}\right)$ is non-trivial and the projection operator $\left(\mathbf{I}_{n} - \mathbf{A}^{\dagger}\mathbf{A} \right)$ is non-zero.
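A quick sanity check of this general form (a sketch using NumPy; the rank-deficient matrix below is invented for illustration): every $x = \mathbf{A}^{\dagger}b + (\mathbf{I}_{n} - \mathbf{A}^{\dagger}\mathbf{A})y$ satisfies the normal equations $\mathbf{A}^{*}\mathbf{A}x = \mathbf{A}^{*}b$.

```python
import numpy as np

rng = np.random.default_rng(0)

# An invented rank-deficient matrix: 5 x 3 but only rank 2.
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 3))
b = rng.standard_normal(5)

A_pinv = np.linalg.pinv(A)
P_null = np.eye(3) - A_pinv @ A        # projector onto the null space of A

# Every x of the form A^+ b + (I - A^+ A) y solves the normal equations.
for _ in range(3):
    y = rng.standard_normal(3)
    x = A_pinv @ b + P_null @ y
    print(np.allclose(A.T @ (A @ x), A.T @ b))   # True each time
```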

Example

The linear system $$ \begin{align} \mathbf{A} x & = b \\ \left[ \begin{array}{cc} 1 & 0 \end{array} \right] \left[ \begin{array}{c} x_{1} \\ x_{2} \end{array} \right] &= \left[ \begin{array}{c} b_{1} \end{array} \right] \end{align} $$ has the least-squares solution $$ \begin{align} x_{LS} & = \mathbf{A}^{\dagger} b + \left( \mathbf{I}_{2} - \mathbf{A}^{\dagger} \mathbf{A}\right) y \\ &= \left[ \begin{array}{c} b_{1} \\ 0 \end{array} \right] + \alpha \left[ \begin{array}{c} 0 \\ 1 \end{array} \right] \end{align} $$ with $\alpha \in \mathbb{C}$.

The affine space of solutions satisfies $$ \mathbf{A} \left( \left[ \begin{array}{c} b_{1} \\ 0 \end{array} \right] + \alpha \left[ \begin{array}{c} 0 \\ 1 \end{array} \right] \right) = \mathbf{A} \left[ \begin{array}{c} b_{1} \\ 0 \end{array} \right] + \alpha \mathbf{A} \left[ \begin{array}{c} 0 \\ 1 \end{array} \right] = \mathbf{A} \left[ \begin{array}{c} b_{1} \\ 0 \end{array} \right]. $$

The solution vector of least norm minimizes $$\Bigg\lVert \left[ \begin{array}{c} b_{1} \\ 0 \end{array} \right] + \alpha \left[ \begin{array}{c} 0 \\ 1 \end{array} \right] \Bigg\rVert_{2}^{2},$$ which corresponds to $\alpha=0$.
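A minimal numerical check of this example, assuming the arbitrary value $b_{1}=3$: every choice of $\alpha$ gives the same $\mathbf{A}x$, and $\alpha=0$ gives the smallest norm.

```python
import numpy as np

# The example above: A = [1 0], with b1 = 3 chosen arbitrarily.
A = np.array([[1., 0.]])
b = np.array([3.])

A_pinv = np.linalg.pinv(A)             # [[1.], [0.]]
P_null = np.eye(2) - A_pinv @ A        # projector onto span{(0, 1)}

for alpha in (0.0, 1.0, -2.5):
    x = A_pinv @ b + P_null @ np.array([0., alpha])
    # Same A x = b1 for every alpha; the norm of x is minimal at alpha = 0.
    print(A @ x, np.linalg.norm(x))
```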

dantopa
  • 10,342
0

The least-squares problem usually makes sense when $m \ge n$, i.e., when the system is overdetermined.

Then, in order to have a unique least-squares solution, we need the matrix $A$ to have linearly independent columns. To cook up a counterexample, just make the columns of $A$ linearly dependent.
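For instance, here is a minimal sketch of such a counterexample (the matrix and $b$ are chosen arbitrarily); the normal equations collapse to a single equation with a free variable, so distinct vectors solve them.

```python
import numpy as np

# Cooked-up counterexample: the second column is twice the first,
# so the columns of A are linearly dependent.
A = np.array([[1., 2.],
              [2., 4.],
              [3., 6.]])
b = np.array([1., 0., 1.])

# The normal equations A^T A x = A^T b reduce to the single equation
# x1 + 2*x2 = 2/7, i.e. one equation in two unknowns with a free variable.
for x in (np.array([2/7, 0.]), np.array([0., 1/7])):
    print(np.allclose(A.T @ A @ x, A.T @ b))   # True for both solutions
```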