
I want to implement robust line fitting over a set of $n$ points $(x_i,y_i)$ by means of the Least Absolute Deviation method, which minimizes the sum

$$\sum_{i=1}^n |y_i-a-bx_i|.$$

As described, for instance, in Numerical Recipes, one can find the optimal slope by locating a zero of the function

$$s(b):=\sum_{i=1}^n x_i\text{ sign}(y_i-bx_i-\text{med}_k(y_k-bx_k))$$

where $b$ is the unknown. To find the change of sign, a dichotomic search is recommended, which is appropriate. However, the starting search interval as well as the termination accuracy are based on purely statistical arguments (using a $\chi^2$ test).

I was wondering whether there is an analytical way to derive safe bounds for the slope (i.e. values of $b$ such that $s(b)$ is guaranteed positive or negative), as well as a termination criterion (minimum attainable $|s(b)|$). (In fact, I don't even know whether the function is monotonic.)


Update:

I have gained some insight into the problem.

If you sort the points on $x$ and set $b$ larger than the slope of every segment between two points, the $y_k-bx_k$ appear in decreasing order, and the median sits at index $n/2$. The positive signs then go with the points of smallest $x$, so $s(b)$ is the sum of the $n/2$ smallest $x_i$ minus the sum of the $n/2$ largest $x_i$, hence it is certainly negative. Larger values of $b$ change nothing.

Similarly, setting $b$ smaller than the smallest slope ensures $s(b)$ positive, and this answers my first question. This works fine when the $x_i$ are regularly spaced, or nearly so. (Note also that $-s(b)$ is, where defined, the derivative of the convex function $b\mapsto\min_a\sum_i|y_i-a-bx_i|$, so $s(b)$ is nonincreasing and the sign change is unique.)

However, when some of the $x_i$ are equal or very close, the slopes can take huge values and a dichotomic process based on the full $[b_{min},b_{max}]$ interval might be inefficient. The ordering of the $y_k-bx_k$ only changes when $b$ crosses one of these $n(n-1)/2$ slopes, hence the dichotomic process can be based on the (sorted) slopes rather than on continuous values of $b$.
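To make this concrete, here is a minimal sketch (Python/NumPy is my choice of language; the function names are hypothetical) of evaluating $s(b)$ and bisecting over the sorted pairwise slopes:

```python
import numpy as np

def s(b, x, y):
    """The sign-sum function from the question:
    s(b) = sum_i x_i * sign(y_i - b*x_i - med_k(y_k - b*x_k))."""
    r = y - b * x
    return np.sum(x * np.sign(r - np.median(r)))

def lad_slope(x, y):
    """Dichotomic search over the sorted pairwise slopes rather than over a
    continuous interval, as suggested above; s(b) is nonincreasing in b."""
    i, j = np.triu_indices(len(x), k=1)
    dx = x[j] - x[i]
    keep = dx != 0                        # pairs with equal abscissae have no slope
    slopes = np.sort((y[j] - y[i])[keep] / dx[keep])
    lo, hi = 0, len(slopes) - 1           # assumes s(slopes[lo]) >= 0 >= s(slopes[hi])
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if s(slopes[mid], x, y) > 0:      # still on the positive side: move up
            lo = mid
        else:
            hi = mid
    # the sign change is bracketed between two consecutive candidate slopes
    return 0.5 * (slopes[lo] + slopes[hi])
```

In practice the optimal LAD line passes through two of the data points, so the optimum is attained at one of the pairwise slopes; the bracketing pair returned here can be refined by comparing the objective at its two endpoints.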


1 Answer


Let's formulate the Sum of Absolute Residuals / Least Absolute Deviation (LAD) / $ {L}_{1} $ Regression / Robust Regression problem (note that here $x$ denotes the vector of unknowns and $b$ the data vector, not the slope of the question):

$$ \arg \min_{x} {\left\| A x - b \right\|}_{1} $$

There are a few options to solve this.

Iteratively Reweighted Least Squares (IRLS)

The main trick in the IRLS approach is that:

$$ {\left\| \boldsymbol{x} \right\|}_{p}^{p} = \sum_{i} {\left| {x}_{i} \right|}^{p - 2} {\left| {x}_{i} \right|}^{2} = \sum_{i} {w}_{i} {\left| {x}_{i} \right|}^{2}, \quad {w}_{i} = {\left| {x}_{i} \right|}^{p - 2} $$

On the other hand, it is known (Weighted Least Squares) that:

$$ \sum_{i = 1}^{m} {w}_{i} {\left| {A}_{i, :} x - {b}_{i} \right|}^{2} = {\left\| {W}^{\frac{1}{2}} \left( A x - b \right) \right\|}_{2}^{2} $$

The solution is given by:

$$ \arg \min_{x} {\left\| {W}^{\frac{1}{2}} \left( A x - b \right) \right\|}_{2}^{2} = {\left( {A}^{T} W A \right)}^{-1} {A}^{T} W b $$

In the case above, where $ p = 1 $, one could set $ {w}_{i} = {\left| {A}_{i, :} x - {b}_{i} \right|}^{-1} $.
Since the weights depend on $ x $, one can set up an iterative procedure:

$$ {x}^{k + 1} = {\left( {A}^{T} {W}^{k} A \right)}^{-1} {A}^{T} {W}^{k} b $$

where $ {W}_{ii}^{k} = {w}_{i}^{k} = {\left| {A}_{i, :} {x}^{k} - {b}_{i} \right|}^{p - 2} = {\left| {A}_{i, :} {x}^{k} - {b}_{i} \right|}^{-1} $.
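For illustration, a minimal IRLS sketch in Python/NumPy (the function name, the least-squares initialization, and the `eps` safeguard are my assumptions; the safeguard addresses the near-zero residual issue raised in the comments below):

```python
import numpy as np

def irls_lad(A, b, num_iter=50, eps=1e-8):
    """Sketch of IRLS for argmin_x ||A x - b||_1 (assumes A has full column rank)."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]       # ordinary LS as initial guess
    for _ in range(num_iter):
        r = A @ x - b
        w = 1.0 / np.maximum(np.abs(r), eps)       # w_i = |r_i|^(p-2) with p = 1, guarded
        AtW = A.T * w                              # A^T W, with W diagonal
        x = np.linalg.solve(AtW @ A, AtW @ b)      # x = (A^T W A)^{-1} A^T W b
    return x
```

For the line fit of the question one would take $ A = [\boldsymbol{1} \;\; \boldsymbol{x}] $, so the solution vector holds the intercept $a$ and the slope $b$.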

Sub Gradient Method

The Sub Gradient is given by:

$$ \frac{\partial {\left\| A x - b \right\|}_{1}}{\partial x} = {A}^{T} \operatorname{sgn} \left( A x - b \right) $$

So the solution is given by iterating (with a suitable, typically diminishing, step size $ {\alpha}_{k} $):

$$ {x}^{k + 1} = {x}^{k} - {\alpha}_{k} \left[ {A}^{T} \operatorname{sgn} \left( A {x}^{k} - b \right) \right] $$
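A matching sketch in Python/NumPy; the step schedule $ {\alpha}_{k} = {\alpha}_{0} / \sqrt{k} $ and the tracking of the best iterate are my (standard) choices, since subgradient steps do not decrease the objective monotonically:

```python
import numpy as np

def subgrad_lad(A, b, num_iter=5000, alpha0=0.1):
    """Sketch of the sub-gradient method for argmin_x ||A x - b||_1."""
    x = np.zeros(A.shape[1])
    best_x, best_f = x.copy(), np.sum(np.abs(A @ x - b))
    for k in range(1, num_iter + 1):
        g = A.T @ np.sign(A @ x - b)           # a subgradient of ||A x - b||_1
        x = x - (alpha0 / np.sqrt(k)) * g      # diminishing step size
        f = np.sum(np.abs(A @ x - b))
        if f < best_f:                         # keep the best iterate seen so far
            best_x, best_f = x.copy(), f
    return best_x
```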

Linear Programming Conversion

This approach converts the problem into a Linear Programming problem.

By introducing variables $ {t}_{i} \geq \left| {A}_{i, :} x - {b}_{i} \right| $ the optimization problem becomes:

$$\begin{align*} \text{minimize} \quad & \boldsymbol{1}^{T} \boldsymbol{t} \\ \text{subject to} \quad & -\boldsymbol{t} \preceq A x -b \preceq \boldsymbol{t} \\ & \boldsymbol{0} \preceq \boldsymbol{t} \end{align*}$$

This is a Linear Programming form of the problem.
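A sketch with `scipy.optimize.linprog`, stacking the variables as $ z = [x; t] $ (the `highs` method name assumes SciPy ≥ 1.6):

```python
import numpy as np
from scipy.optimize import linprog

def lp_lad(A, b):
    """Sketch: minimize 1^T t subject to -t <= A x - b <= t, t >= 0."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(m)])   # cost only on t
    A_ub = np.block([[ A, -np.eye(m)],              #  A x - t <= b
                     [-A, -np.eye(m)]])             # -A x - t <= -b
    b_ub = np.concatenate([b, -b])
    bounds = [(None, None)] * n + [(0, None)] * m   # x free, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n]
```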

MATLAB Code

I implemented and validated the approaches by comparing them to the CVX solution. The MATLAB code is available in my Mathematics StackExchange Question 2603548 repository.

Remark: Pay attention that the MATLAB code is a vanilla implementation. A real-world implementation should take numerical issues into account (like the inverse of a small number).

  • It doesn't look like any of the to-be-written sections in your post would answer the OP's specific questions, "I was wondering if there is an analytical way to derive safe bounds for the slopes (i.e. values such that $s(b)$ is guaranteed positive or negative), as well as the termination criterion (minimum of $|s(b)|$)." –  Aug 11 '19 at 11:49
  • The way I see it, @YvesDaoust tries to solve the problem in general. He took a specific numerical method and wants to optimize its iterative steps, yet I offer different approaches to the same end: solving the problem. – Royi Aug 11 '19 at 12:28
  • The weight should be `wi = max(eps, 1/abs(ri))`; otherwise a near-zero residual (by chance) will create a near-infinite weight. – Felipe G. Nievinski Jul 17 '22 at 03:03