2

As stated in the title, the least norm-2 (least-squares) problem can be formulated as $$\min_{x}{\|Ax-b\|_2^2}$$ where $A\in\mathbb R^{m\times n}$ and $b\in\mathbb R^m$ are given parameters with $\operatorname{rank}(A)=n$, and $x\in\mathbb R^n$ is the variable.

Since the above problem is equivalent to the normal equations $$A^TAx=A^Tb,$$

the closed-form solution is $$x^*=(A^TA)^{-1}A^Tb.$$

Since $A^TA$ is of size $n\times n$, computing $(A^TA)^{-1}$ becomes quite challenging when $n$ is very large. So my question is: are there any methods that can improve the computational efficiency for very large $n$?
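
For concreteness, a minimal Matlab sketch of the closed-form computation (dimensions made up for illustration; the explicit inverse is exactly the step that becomes expensive for large $n$):

    % Hypothetical sizes, for illustration only.
    m = 5000; n = 1000;
    A = randn(m, n);                 % dense test matrix, full column rank w.h.p.
    b = randn(m, 1);
    x_naive = inv(A'*A) * (A'*b);    % explicit inverse: O(n^3) and numerically risky
    x_solve = (A'*A) \ (A'*b);       % better: solve the normal equations directly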

  • 2
    In general, equations of the form $Mx = y$ may be solved with less computation than is required to find $M^{-1}$. For example: with matlab, one can write x = M\y. – Ben Grossmann Feb 06 '17 at 15:47
  • Not needing to calculate or store $M^{-1}$ can easily be the difference between being able to solve a problem and not. – mathreadler Feb 16 '17 at 20:10

4 Answers

4

For problems in which the $A$ matrix is large and sparse, iterative methods such as LSQR are commonly used. An effective preconditioner can be very important to the performance of these iterative methods.
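
For instance, a minimal Matlab sketch using the built-in lsqr (the sparse test matrix, tolerance, and iteration limit are made-up illustrations):

    % Random sparse overdetermined system, for illustration only.
    m = 100000; n = 10000;
    A = sprandn(m, n, 1e-3);
    b = randn(m, 1);
    % lsqr accesses A only through products A*x and A'*y,
    % so A'*A is never formed or stored.
    [x, flag, relres, iter] = lsqr(A, b, 1e-8, 500);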

1

Factorization methods are popular and there exist a great many of them; QR factorization, for example.
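
A minimal sketch of the QR route in Matlab (assuming, as in the question, that $A$ has full column rank):

    [Q, R] = qr(A, 0);    % economy-size QR: A = Q*R with R n-by-n upper triangular
    x = R \ (Q' * b);     % one back-substitution; A'*A is never formed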

Another family of methods which is often used is that of Krylov subspace methods; probably the best known is the Conjugate Gradient method. They utilize the Krylov subspaces

$$\{v,Av,A^2v,\cdots,A^kv\}$$

which can be used to iteratively approximate the solution using matrix-vector multiplications, which are often very cheap and/or fast for sparse matrices (compared to matrix inversion). Since CG requires a symmetric positive definite matrix, for least squares it is applied to the normal equations $A^TAx=A^Tb$ (the CGNR/CGLS approach), as sketched below.
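
A sketch of this in Matlab, applying the built-in pcg to the normal equations through a function handle, so that only matrix-vector products with $A$ and $A^T$ are ever performed (the handle name, tolerance, and iteration limit are made up):

    applyAtA = @(x) A' * (A * x);                  % A'*A as an operator, never formed
    [x, flag] = pcg(applyAtA, A' * b, 1e-8, 500);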

mathreadler
  • 25,824
1

One of the most important keywords here is "pseudo-inverse".

The (Moore–Penrose) pseudo-inverse can, when $A$ has full column rank (as assumed here), be written as $A^+:=(A^TA)^{-1}A^T$.

$x^*=(A^TA)^{-1}A^Tb \iff x^*=A^+b$,

which, if you work with Matlab, corresponds to the syntax x = pinv(A)*b.

It is not at all a subterfuge where a new word hides the same old calculations.

Numerical software like Matlab does not obtain the pseudo-inverse by computing the inverse of $A^TA$, which is often very ill-conditioned and therefore prone to huge numerical errors. Instead, it uses the SVD (and, as I understand it, other "under the hood" tricks), bypassing the pitfalls of the direct method.
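
A sketch of the SVD route in Matlab (the truncation tolerance below mimics, as far as I know, the default used by pinv):

    [U, S, V] = svd(A, 'econ');            % thin SVD: A = U*S*V'
    s = diag(S);
    tol = max(size(A)) * eps(max(s));      % drop singular values below this
    r = sum(s > tol);                      % numerical rank
    x = V(:, 1:r) * ((U(:, 1:r)' * b) ./ s(1:r));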

Have a look at https://see.stanford.edu/materials/lsoeldsee263/Additional4-ls_ln_matlab.pdf, which also considers the two cases $m>n$ and $m<n$.

Jean Marie
  • 81,803
0

A brief comparison of solution methods, discussing flop counts and stability, is provided here: Comparing LU or QR decompositions for solving least squares.

dantopa
  • 10,342