
Let $(x_1,y_1),\ldots,(x_n,y_n)$ be my data set. I have a function $f(x,{\bf c})$ where ${\bf c}=(c_1,...,c_m)$ is a vector of $m$ parameters. I want to fit $f$ to the data using non-linear least squares:

$$\min_{\bf c}{\sum_{i=1}^n (y_i - f(x_i,{\bf c}))^2}$$

I want to find the sensitivity of my optimal parameters ${\bf c}^*$ with respect to my data. In other words, how can I find

$$\frac{\partial c_j^*}{\partial y_i}$$

for $i=1,...,n$ and $j=1,...,m$.

In practice, I want to find the effect of perturbing my $y_i$'s by a small amount on the parameters ${\bf c}^*$ obtained.

Please give me some references and the name of this derivative. I am having a hard time finding information on this subject using Google. Also, please assume that my function $f$ is well behaved in terms of differentiability (it is continuous and several times differentiable).

Thank you very much for your help!

quantguy

2 Answers


In terms of the mathematical problem, you can use the implicit function theorem, provided your minimization problem is well behaved enough that the optimum is characterized by the first-order conditions $$g_j({\bf c},{\bf y}):=\frac{\partial}{\partial c_j}\left(\sum_{i=1}^n (y_i - f(x_i,{\bf c}))^2\right)=-2\sum_{i=1}^n \bigl(y_i - f(x_i,{\bf c})\bigr)\frac{\partial f(x_i,{\bf c})}{\partial c_j}=0,\qquad j=1,\dots,m,$$ where ${\bf y}=(y_1,\dots,y_n)$.

With the implicit function theorem, you can check how ${\bf c}$ has to change when some $y_i$ changes so that $g_j({\bf c},{\bf y})$ remains zero for every $j$, i.e., so that you remain at the optimal solution. For a single parameter ($m=1$) it states $$\frac{d c^*}{d y_i}=-\frac{\partial g/\partial y_i}{\partial g/\partial c^*},$$ and for $m$ parameters the scalar derivative in the denominator is replaced by the Jacobian $\partial {\bf g}/\partial {\bf c}$ (the Hessian of your objective), which has to be inverted: $$\frac{d {\bf c}^*}{d y_i}=-\left(\frac{\partial {\bf g}}{\partial {\bf c}^*}\right)^{-1}\frac{\partial {\bf g}}{\partial y_i}.$$ This is exactly the derivative you want.
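As a quick numerical sanity check of this formula, here is a minimal sketch, assuming an exponential model $f(x,{\bf c})=c_0 e^{c_1 x}$ and a fit via `scipy.optimize.least_squares`; the Hessian is replaced by its Gauss-Newton approximation $2J^{\mathsf T}J$, which turns the sensitivity into $(J^{\mathsf T}J)^{-1}J^{\mathsf T}$, and the result is compared with a brute-force refit after perturbing one $y_i$:

```python
# Sketch only: the model, data, and starting values below are illustrative.
import numpy as np
from scipy.optimize import least_squares

def f(x, c):
    return c[0] * np.exp(c[1] * x)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = f(x, [2.0, -1.5]) + 0.01 * rng.standard_normal(x.size)

def residuals(c, y):
    return f(x, c) - y                      # r_i = f(x_i, c) - y_i

res = least_squares(residuals, x0=[1.0, -1.0], args=(y,))
J = res.jac                                  # J[i, j] = d f(x_i, c*) / d c_j
sens = np.linalg.solve(J.T @ J, J.T)         # Gauss-Newton dc*/dy, shape (m, n)

# Finite-difference check: refit with y_i perturbed by eps.
i, eps = 5, 1e-6
y_pert = y.copy(); y_pert[i] += eps
res_pert = least_squares(residuals, x0=res.x, args=(y_pert,))
print(sens[:, i])                            # implicit-function-theorem value
print((res_pert.x - res.x) / eps)            # finite-difference value
```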

It might not work if your $f(\cdot)$ is not nicely behaved, so check the assumptions of the implicit function theorem. Also, the problem sounds important, so I am sure there is a literature on this.

Nameless
  • Thanks a lot! That is exactly what I was looking for. In my case, the function $f$ is well behaved, so I can use this "as is". – quantguy Jun 28 '16 at 13:54

The method of least squares is so useful not only because it provides a best estimate of the fit parameters, but also because it provides a quantitative measure of the stability of the solution.

For a linear regression of $y(x) = c_{0} + c_{1} x$, the number of fit parameters is $n=2$ (note that here $m$ counts measurements and $n$ parameters, the reverse of the question's notation). Given the sequence of measurements $\left\{ x_{k}, y_{k} \right\}_{k=1}^{m}$, the linear system is $$ \mathbf{A} c = b, $$ where $\mathbf{A}$ has rows $(1, x_{k})$ and $b = (y_{1},\dots,y_{m})^{\mathsf{T}}$, and the solution via the normal equations is $$ \begin{align} c_{0} &= D^{-1} \left( \sum x_{k}^{2} \sum y_{k} - \sum x_{k} \sum x_{k} y_{k} \right), \\ c_{1} &= D^{-1} \left( m \sum x_{k} y_{k} - \sum x_{k} \sum y_{k} \right). \\ \end{align} $$ The matrix determinant is $$ D = \det \mathbf{A}^{*}\mathbf{A} = m \sum x_{k}^{2} - \left(\sum x_{k} \right)^{2}. $$
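These closed forms are easy to sanity-check numerically; a short sketch with made-up data, using `numpy.polyfit` only as an independent reference:

```python
# Sketch only: the data are made up for illustration.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
m = x.size

D  = m * np.sum(x**2) - np.sum(x)**2
c0 = (np.sum(x**2) * np.sum(y) - np.sum(x) * np.sum(x * y)) / D   # intercept
c1 = (m * np.sum(x * y) - np.sum(x) * np.sum(y)) / D              # slope

print(c0, c1)
print(np.polyfit(x, y, 1))   # reference: returns [slope, intercept]
```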

The required derivatives are $$ \begin{align} \frac{\partial c_{0}} {\partial y_{j}} &= D^{-1} \left( \sum x_{k}^{2} - x_{j} \sum x_{k} \right), \\ % \frac{\partial c_{1}} {\partial y_{j}} &= D^{-1} \left( m x_{j} - \sum x_{k} \right). \end{align} $$ The derivatives quantify how stable the fit parameters are against perturbations in the data.

The uncertainty in the fit parameters is $$ \sigma_{k}^{2} = s^{2} \sum_{j=1}^{m} \left( \frac{\partial c_{k}} { \partial y_{j}} \right)^{2}, \quad k=0,1, $$ where the square of the estimated parent standard deviation is $$ s^{2} = \left( m - n \right)^{-1} \lVert \mathbf{A} c - b \rVert_{2}^{2}. $$ The final form for the uncertainties is $$ \begin{align} \sigma_{0}^{2} &= s^{2} \frac{\sum x_{k}^{2}} {D}, \\ \sigma_{1}^{2} &= s^{2} \frac{m} {D}. \\ \end{align} $$ Notice these are $s^{2}$ times the diagonal terms $\left\{ \sum x_{k}^{2}/D,\; m/D \right\}$ of the matrix $\left( \mathbf{A}^{*}\mathbf{A}\right)^{-1}$.
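Continuing the sketch above with the same made-up data, the derivative and uncertainty formulas can be checked against $s^{2}\left(\mathbf{A}^{*}\mathbf{A}\right)^{-1}$:

```python
# Sketch only: same illustrative data as above.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
m, n = x.size, 2

A = np.column_stack([np.ones(m), x])          # rows (1, x_k)
c = np.linalg.solve(A.T @ A, A.T @ y)         # (c0, c1)
D = m * np.sum(x**2) - np.sum(x)**2

dc0_dy = (np.sum(x**2) - x * np.sum(x)) / D   # dc0/dy_j for j = 1..m
dc1_dy = (m * x - np.sum(x)) / D              # dc1/dy_j

s2 = np.sum((A @ c - y)**2) / (m - n)         # estimated parent variance
sigma0_sq = s2 * np.sum(dc0_dy**2)            # = s2 * sum(x**2) / D
sigma1_sq = s2 * np.sum(dc1_dy**2)            # = s2 * m / D

print(sigma0_sq, sigma1_sq)
print(np.diag(s2 * np.linalg.inv(A.T @ A)))   # same two numbers
```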

The answer would be quoted as intercept $= c_{0}\pm \sigma_{0}$, slope $ =c_{1} \pm \sigma_{1}.$

dantopa