2

As stated in the title, the least norm-2 (least-squares) problem can be formulated as $$\min_{x}{\|Ax-b\|_2^2}$$ where $A\in\mathbb R^{m\times n}$ and $b\in\mathbb R^m$ are given parameters with $\operatorname{rank}(A)=n$, and $x\in\mathbb R^n$ is the variable.

Since the above problem is equivalent to the normal equations $$A^TAx=A^Tb,$$

the closed-form solution is $$x^*=(A^TA)^{-1}A^Tb.$$

Since $A^TA$ is of size $n\times n$, computing $(A^TA)^{-1}$ becomes quite challenging when $n$ is very large. So my question is: are there any methods that can improve the computational efficiency for very large $n$?
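
For concreteness, a minimal Matlab sketch of the closed-form computation (dimensions made up for illustration; the explicit inverse is exactly the step that becomes expensive for large $n$):

    % Hypothetical sizes, for illustration only.
    m = 5000; n = 1000;
    A = randn(m, n);                 % dense test matrix, full column rank w.h.p.
    b = randn(m, 1);
    x_naive = inv(A'*A) * (A'*b);    % explicit inverse: O(n^3) and numerically risky
    x_solve = (A'*A) \ (A'*b);       % better: solve the normal equations directly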

  • 2
    In general, equations of the form $Mx = y$ may be solved with less computation than is required to find $M^{-1}$. For example: with matlab, one can write x = M\y. – Ben Grossmann Feb 06 '17 at 15:47
  • Not needing to calculate or store $M^{-1}$ can easily be the difference between being able to solve a problem and not. – mathreadler Feb 16 '17 at 20:10

4 Answers

4

For problems in which the $A$ matrix is large and sparse, iterative methods such as LSQR are commonly used. An effective preconditioner can be very important to the performance of these iterative methods.
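
For instance, a minimal Matlab sketch using the built-in lsqr (the sparse test matrix, tolerance, and iteration limit are made-up illustrations):

    % Random sparse overdetermined system, for illustration only.
    m = 100000; n = 10000;
    A = sprandn(m, n, 1e-3);
    b = randn(m, 1);
    % lsqr accesses A only through products A*x and A'*y,
    % so A'*A is never formed or stored.
    [x, flag, relres, iter] = lsqr(A, b, 1e-8, 500);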

1

Factorization methods are popular and there exist a great many of them; QR factorization, for example.
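
A minimal sketch of the QR route in Matlab (assuming, as in the question, that $A$ has full column rank):

    [Q, R] = qr(A, 0);    % economy-size QR: A = Q*R with R n-by-n upper triangular
    x = R \ (Q' * b);     % one back-substitution; A'*A is never formed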

Another family of methods which is often used is that of Krylov subspace methods; probably the best known is the Conjugate Gradient method. They utilize the Krylov subspaces

$$\{v,Av,A^2v,\cdots,A^kv\}$$

which can be used to iteratively approximate the solution using matrix-vector multiplications, which are often very cheap and/or fast for sparse matrices (compared to matrix inversion). Since CG requires a symmetric positive definite matrix, for least squares it is applied to the normal equations $A^TAx=A^Tb$ (the CGNR/CGLS approach), as sketched below.
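
A sketch of this in Matlab, applying the built-in pcg to the normal equations through a function handle, so that only matrix-vector products with $A$ and $A^T$ are ever performed (the handle name, tolerance, and iteration limit are made up):

    applyAtA = @(x) A' * (A * x);                  % A'*A as an operator, never formed
    [x, flag] = pcg(applyAtA, A' * b, 1e-8, 500);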

mathreadler
  • 25,824
1

One of the most important keywords here is "pseudo-inverse".

The (Moore–Penrose) pseudo-inverse can, when $A$ has full column rank (as assumed here), be written as $A^+:=(A^TA)^{-1}A^T$.

$x^*=(A^TA)^{-1}A^Tb \iff x^*=A^+b$,

which, if you work with Matlab, corresponds to the syntax x = pinv(A)*b.

It is not at all a subterfuge where a new word hides the same old calculations.

Numerical software like Matlab does not obtain the pseudo-inverse by computing the inverse of $A^TA$, which is often very ill-conditioned and therefore prone to huge numerical errors. Instead, it uses the SVD (and, as I understand it, other "under the hood" tricks), bypassing the pitfalls of the direct method.
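
A sketch of the SVD route in Matlab (the truncation tolerance below mimics, as far as I know, the default used by pinv):

    [U, S, V] = svd(A, 'econ');            % thin SVD: A = U*S*V'
    s = diag(S);
    tol = max(size(A)) * eps(max(s));      % drop singular values below this
    r = sum(s > tol);                      % numerical rank
    x = V(:, 1:r) * ((U(:, 1:r)' * b) ./ s(1:r));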

Have a look at https://see.stanford.edu/materials/lsoeldsee263/Additional4-ls_ln_matlab.pdf, which also considers the two cases $m>n$ and $m<n$.

Jean Marie
  • 81,803
0

A brief comparison of solution methods, discussing flop counts and stability, is provided here: Comparing LU or QR decompositions for solving least squares.

dantopa
  • 10,342