
I am looking at some C code that adds Tikhonov regularization to a positive definite symmetric matrix.

Like this:

    power = (matrix[1][1] + matrix[2][2]);
    factor = 0.000136; // Found via trial and error

    for (i = 1; i < matrixSize; i++) {
        matrix[i][i] = matrix[i][i] + (factor * power);
    }

I am stuck on understanding what that power means: why have the second and third entries of the diagonal been used? And how does this even help make the numbers more stable?

Thanks for any help.

1 Answer


I know my answer might be somewhat late, but I would like to post some explanations concerning Tikhonov's regularization approach anyway. Firstly, I have practical experience applying regularization, and Tikhonov's approach in particular, to real inverse scientific problems (expressed mostly in the form of integral equations) that were strongly ill-posed. Secondly, when I started, I encountered a noticeable lack of sources that clearly describe a practical implementation of the regularization and, most importantly, of the solution estimation.

Roughly speaking, Tikhonov's approach is nothing more than a regularized coupling of the Gauss least-squares method and the Moore-Penrose pseudoinverse method, used to find a stable solution that exists and is uniquely defined in the solution space. But in contrast to least squares and the pseudoinverse method, we approach the real unknown solution with the possibility of estimating the error of the regularized solution, provided some requirements on the initial data and the solution spaces are fulfilled. In the case of pointwise error estimation, it is even possible to define a region (upper and lower boundaries) where the exact solution lies!

To treat the posted problem more generally, let's use matrix notation and some functional analysis as well (just for fun):

Initial inverse problem:

$A\tilde{y}=\tilde{b}, \tilde{y} \in L_2, \tilde{b} \in L_2\\ \|\bar{b}-\tilde{b}\|_{L_2} \leq \delta,$

where $A$ is the coefficient matrix of the system of linear equations obtained after numerical discretization of the initial problem, $\tilde{b}$ represents the initial data ($\bar{b}$ being the exact, error-free data), $\delta$ is an upper bound on the error of the initial data (if the initial data had no errors, the initial problem would become well-posed and could be solved using common methods), and $\tilde{y}$ is the approximate regularized solution of the inverse problem. The operator error $\|\tilde{A} - \bar{A}\| \leq \xi$ is not taken into account, as it is assumed to be insignificant if the discretization grid is fine enough.

Note that I deliberately require both the initial data and the regularized solution to belong to the Hilbert space $L_2$. There are significant reasons for that. In short, this restriction ensures (along with some extra statements that I don't provide here, as they are related to a deeper understanding of regularization in general) the existence of the regularized solution, which can be found by solving the following minimization problem:

$M_{\delta}^{\alpha}[\tilde{y}]=\|A\tilde{y}-\tilde{b}\|_{L_2}^2 + \alpha\|\tilde{y}\|_{L_2}^2, \\ y_{\alpha}=\text{arg}\,\min M_{\delta}^{\alpha}[\tilde{y}],$

where $M_{\delta}^{\alpha}[\tilde{y}]$ is the smoothing functional, $\alpha$ is the regularization parameter and $y_{\alpha}$ is the regularized solution that provides the global minimum of the functional $M$.
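
As a concrete reading of the functional, a hypothetical C helper (the function name and the fixed size `N` are my own, purely for illustration) would just add the residual norm and the penalty term:

    #define N 3  /* problem size, chosen purely for illustration */

    /* M(y) = ||A*y - b||^2 + alpha * ||y||^2, the smoothing functional */
    double smoothing_functional(double A[N][N], double y[N],
                                double b[N], double alpha) {
        double m = 0.0;
        for (int i = 0; i < N; i++) {
            double ri = -b[i];
            for (int j = 0; j < N; j++) ri += A[i][j] * y[j];
            m += ri * ri;               /* data-fit (residual) term */
            m += alpha * y[i] * y[i];   /* stabilizing penalty term */
        }
        return m;
    }

The $\alpha\|\tilde{y}\|_{L_2}^2$ term is what penalizes wildly oscillating candidates that fit the noisy data too well.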

In the given case the minimization results in the Euler-Tikhonov equation of the second kind:

$\alpha y_{\alpha}+A^TAy_{\alpha}=A^T\tilde{b}, \\ y_{\alpha}=(\alpha I + A^TA)^{-1}A^T\tilde{b},$

where $I$ is the identity matrix.
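
To make the formula concrete, here is a minimal C sketch of one Euler-Tikhonov solve for a small dense system, reusing the `N` from the previous sketch (the `solve_linear` helper and all names are assumptions of mine, not code from the question):

    /* Solve M*x = rhs by Gaussian elimination with partial pivoting.
       M and rhs are modified in place; the answer ends up in x. */
    static void solve_linear(double M[N][N], double rhs[N], double x[N]) {
        for (int k = 0; k < N; k++) {
            /* pick the largest pivot in column k for numerical stability */
            int p = k;
            for (int i = k + 1; i < N; i++)
                if ((M[i][k] < 0 ? -M[i][k] : M[i][k]) >
                    (M[p][k] < 0 ? -M[p][k] : M[p][k]))
                    p = i;
            if (p != k) {
                for (int j = 0; j < N; j++) {
                    double t = M[k][j]; M[k][j] = M[p][j]; M[p][j] = t;
                }
                double t = rhs[k]; rhs[k] = rhs[p]; rhs[p] = t;
            }
            /* eliminate entries below the pivot */
            for (int i = k + 1; i < N; i++) {
                double f = M[i][k] / M[k][k];
                for (int j = k; j < N; j++) M[i][j] -= f * M[k][j];
                rhs[i] -= f * rhs[k];
            }
        }
        /* back substitution */
        for (int i = N - 1; i >= 0; i--) {
            double s = rhs[i];
            for (int j = i + 1; j < N; j++) s -= M[i][j] * x[j];
            x[i] = s / M[i][i];
        }
    }

    /* y_alpha = (alpha*I + A^T A)^{-1} A^T b -- the Euler-Tikhonov solution */
    void tikhonov_solve(double A[N][N], double b[N], double alpha, double y[N]) {
        double M[N][N], rhs[N];
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                M[i][j] = (i == j) ? alpha : 0.0;  /* the alpha*I term */
                for (int k = 0; k < N; k++)
                    M[i][j] += A[k][i] * A[k][j];  /* (A^T A)[i][j] */
            }
            rhs[i] = 0.0;
            for (int k = 0; k < N; k++)
                rhs[i] += A[k][i] * b[k];          /* (A^T b)[i] */
        }
        solve_linear(M, rhs, y);
    }

Note that for a symmetric positive definite $A$, as in the question, a common simplified variant skips the normal equations entirely and solves $(A + \alpha I)y = b$, i.e. shifts the diagonal of $A$ itself; the `factor * power` product in the question appears to play the role of $\alpha$ there, with `power` scaling the shift to the magnitude of the matrix entries.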

Thus, as one can see, the bare numerical approach isn't as bad as it is described in boring mathematical papers :) But so far I haven't introduced the one thing that is responsible for the success of the whole story. Yeah, that's right: the regularization parameter $\alpha$. The whole thing depends on it. Even if the convergence of the regularization approach is proved, there is still a strong demand for a method of choosing an appropriate regularization parameter, one that yields the optimal regularized solution: the one closest to the unknown exact solution while still stable enough. The regularization for the given case is made in the following steps:

  • Define a vector of regularization parameters that will be used to calculate the space of regularized solutions. For example, in the following order:

    $\alpha_n = \alpha_0\cdot q^n, n = \overline{1, N},$

where $N$ denotes the total number of regularized solutions that compose the solution space; $\alpha_0$ is an initial value, $\alpha_0 > 0$; and $q$ is a factor that governs the rate of convergence to the exact solution (the lower the factor $q$, the faster the rate), $0 < q < 1$;

  • Calculate the Euler-Tikhonov equation for each parameter $\alpha_n$ (see the sketch after this list);
  • Choose the most appropriate parameter, the one corresponding to the solution that is closest to the unknown exact one.
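
Continuing the sketch above (the `tikhonov_solve` helper and all names are still my own illustration), the first two steps could look like this:

    /* Fill alphas[] with the geometric grid alpha_n = alpha_0 * q^n and
       solve the Euler-Tikhonov equation for each one, storing the
       regularized solutions row by row in y[][]. */
    void build_solution_space(double A[N][N], double b[N],
                              double alpha0, double q, int n_alpha,
                              double alphas[], double y[][N]) {
        double alpha = alpha0;
        for (int n = 0; n < n_alpha; n++) {
            alpha *= q;                      /* alpha_n = alpha_0 * q^n, n = 1..N */
            alphas[n] = alpha;
            tikhonov_solve(A, b, alpha, y[n]);
        }
    }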

One of the most widely used methods for the last step is the discrepancy principle. Using this method, the solution is found as a root of the following equation:

$\|Ay_{\alpha}-\tilde{b}\|_{L_2}^2 = \delta^2$
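
In code this could be a scan over the grid from the previous sketch for the parameter whose squared residual comes closest to $\delta^2$ (a rough sketch under the same assumptions; a careful implementation would then refine the root, e.g. by bisection between the two bracketing grid points):

    /* Squared L2 residual ||A*y - b||^2 of a candidate solution. */
    double residual2(double A[N][N], double y[N], double b[N]) {
        double r2 = 0.0;
        for (int i = 0; i < N; i++) {
            double ri = -b[i];
            for (int j = 0; j < N; j++) ri += A[i][j] * y[j];
            r2 += ri * ri;
        }
        return r2;
    }

    /* Discrepancy principle: return the index n whose residual^2 is
       closest to delta^2 on the grid; the exact root lies between
       neighbouring grid points. */
    int pick_by_discrepancy(double A[N][N], double b[N],
                            double y[][N], int n_alpha, double delta) {
        int best = 0;
        double best_gap = -1.0;
        for (int n = 0; n < n_alpha; n++) {
            double gap = residual2(A, y[n], b) - delta * delta;
            if (gap < 0.0) gap = -gap;
            if (best_gap < 0.0 || gap < best_gap) {
                best_gap = gap;
                best = n;
            }
        }
        return best;
    }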

It's pretty straightforward, but the devil hides in the details. Here it should be noted that the regularization approach operates on the Euler-Tikhonov equation, not on the initial one. A close analysis of the given regularization approach, with attention to this feature, shows that the use of the discrepancy principle will always lead to over-regularized solutions (the regularization parameter $\alpha$ will be unnecessarily large). Such a peculiarity is not obvious at all, and it happens very frequently that researchers don't pay the necessary attention to these facts and treat the calculated regularized solutions as close enough to the exact ones while they are not. Therefore, extra information about the unknown solution will likely be needed (for example, the solution's smoothness).

One more word about the choice procedure. The minimal requirement is knowledge of $\delta$: there is no way to choose the "right" parameter $\alpha$ without the initial data error. I know, attempts arise from time to time to find a method that could produce a stable, reliable regularized solution without the initial data error; the $L$-curve method, for example. But it has already been proved, both theoretically and in practice, that such methods do not belong to the family of regularization algorithms and consequently cannot lead to the exact solution.

The error estimation of the regularized solution is another big story, for which there is no space here...

P.S.: Tikhonov's regularization approach is not the only way to solve ill-posed problems numerically, but in my experience it is very stable, flexible and relatively quick to get started with, so it should be among the first candidates to try out.
