
Background

In a practical problem I need to find the solution to:

$$f(\bar{x}) - \bar{p} = \bar{0}$$

where $f : \mathbb{R}^2 \rightarrow \mathbb{R}^2$. I don't know the exact expression for $f$, but I can evaluate $f(\bar{x})$ numerically. I know that the inverse of $f$ exists but is unavailable. Furthermore, I know that $f$ is "nice" in the sense that it is at least once continuously differentiable and very smooth.

What I've tried

Normally I would use one of the many univariate root-finding algorithms such as Newton-Raphson or Ridders' method, but the only extension of these root-finding methods to multivariate problems that I know of is Newton-Raphson (using the Jacobian), as sketched below.
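
For reference, a minimal sketch of that multivariate Newton-Raphson iteration with a forward-difference Jacobian; the toy $f$, the helper names, and the tolerances are illustrative, not from my actual problem:

```python
import numpy as np

def numerical_jacobian(f, x, h=1e-7):
    """Forward-difference approximation of the Jacobian of f at x."""
    fx = f(x)
    J = np.empty((fx.size, x.size))
    for j in range(x.size):
        xh = x.copy()
        xh[j] += h
        J[:, j] = (f(xh) - fx) / h
    return J

def newton_raphson(f, p, x0, tol=1e-12, max_iter=50):
    """Solve f(x) - p = 0 with an (undamped) Newton iteration."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = f(x) - p
        if np.linalg.norm(r) < tol:
            break
        J = numerical_jacobian(f, x)
        x = x - np.linalg.solve(J, r)  # Newton step: solve J dx = r
    return x

# Toy system: f(x) = (x0^2 - x1, x0 + x1), target p = (0, 2); root is (1, 1).
f = lambda x: np.array([x[0]**2 - x[1], x[0] + x[1]])
print(newton_raphson(f, np.array([0.0, 2.0]), np.array([0.5, 0.5])))
```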

As I already have a working BFGS implementation, I tried the following: define $$g(\bar{x}) = {\left\Vert f(\bar{x}) - \bar{p} \right\Vert}^2$$ so that $g:\mathbb{R}^2\rightarrow \mathbb{R}$, and then minimize $g$ using BFGS.
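
For concreteness, here is roughly what that looks like with SciPy's BFGS; the toy $f$ below is just a stand-in for my real function, which I can only evaluate numerically:

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # Stand-in for the real f, which is only available numerically.
    return np.array([np.exp(x[0]) - x[1], x[0] + x[1]**3])

p = np.array([0.5, 2.0])

def g(x):
    # Scalar objective: squared norm of the residual.
    r = f(x) - p
    return r @ r

# BFGS approximates the gradient of g by finite differences.
res = minimize(g, x0=np.zeros(2), method='BFGS')
print(res.x, np.linalg.norm(f(res.x) - p))
```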

This works well to some degree, but in some cases it fails to provide an accurate result beyond about 7 digits, which is not good enough for my application.

I'm fairly sure I'm not hitting machine-precision problems; I believe the inaccuracy comes from the finite-difference approximation of the gradient used inside BFGS. If I reduce the step length for the finite differences I get slightly better results, but the method stops working before I can make the step small enough. This leads me to suspect that other derivative-based methods (such as Newton-Raphson) will have similar problems.
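
To illustrate the trade-off: in double precision the truncation error of a forward difference shrinks with the step $h$, while the rounding error grows like $\epsilon/h$, so the total error bottoms out near $h \approx \sqrt{\epsilon} \approx 10^{-8}$, i.e. at roughly 8 accurate digits. A small experiment on a function with a known derivative (this toy example is mine, not part of the original problem):

```python
import numpy as np

def fd_error(h):
    # Forward-difference derivative of sin at x = 1, vs. the exact cos(1).
    x = 1.0
    approx = (np.sin(x + h) - np.sin(x)) / h
    return abs(approx - np.cos(x))

for h in [1e-2, 1e-4, 1e-6, 1e-8, 1e-10, 1e-12]:
    print(f"h={h:.0e}  error={fd_error(h):.2e}")
# The error falls until h ~ 1e-8, then grows again as rounding dominates.
```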

Question

Are there any root-finding algorithms for multivariate, vector-valued functions that don't require derivatives and would be useful for solving this? Or is there some other approach to solving this?

Emily L.
  • I'm not sure why you're so certain it's not a precision problem, because Newton-Raphson itself does not have any such problem. The fact that you mentioned using a smaller step for approximating the gradient indicates that the precision problem appears there. Below a certain step size it stops working, as you already found, and that is indeed because you are using fixed-precision arithmetic. I think you should calculate the required precision explicitly via asymptotic expansions to see if that is the case. – user21820 Jul 10 '15 at 15:42
  • @user21820 I might have been less than clear. I know for a fact that I can evaluate $f(\bar x)$ with much higher accuracy than 7 digits. Also, for a special case where I have $f^{-1}$, I can get $x^* = f^{-1}(\bar p)$ such that $f(x^*) - \bar p = \bar 0$ holds to much higher precision when evaluated than the BFGS solution for the same arguments (1e-13 error vs 1e-9 error for BFGS). – Emily L. Jul 11 '15 at 15:02
  • But why do you assume that if you can compute $f$ to a certain precision that precision carries through all the way? You need to explicitly analyze the asymptotic behaviour of the approximation formula for the gradient as I suspect the problem lies there. Since you did not provide the formula you used, no one else can help you with that anyway. – user21820 Jul 11 '15 at 15:05
  • @user21820 I'm not sure we're on the same page... I said I believe the error comes from the approximation of the gradient (which is what you're saying too). My question was whether there are methods for finding the roots without the derivative, not to help me with the gradient, although I appreciate that. – Emily L. Jul 11 '15 at 16:40
  • Well okay, I see. Then why not use your current method until the value stabilizes, and then use binary search to narrow down further than the gradient-based methods can go? You could use binary search from the start, but its convergence rate is linear compared to the quadratic convergence rate of Newton-Raphson (under certain conditions; see http://math.stackexchange.com/a/800070/21820 if you're interested). – user21820 Jul 12 '15 at 03:31
  • @user21820 Would you care to write up an answer? – Emily L. Jul 12 '15 at 21:35
  • Someone else might have a better answer. Anyway, it is possible that the approximation you are using is not the best for the purpose. For example, in the 1D case $\frac{f(x+d)-f(x-d)}{2d} \in f'(x) + O(d^2)$ for thrice-differentiable $f$, whereas $\frac{f(x+d)-f(x)}{d} \in f'(x) + O(d)$. Similarly in 2D, taking the average over the 4 directions given by adding $(0,d),(0,-d),(d,0),(-d,0)$ would leave an error that is $o(d)$ if the function is twice differentiable and $O(d^2)$ if the function is thrice differentiable. (A sketch of this central-difference gradient appears after these comments.) – user21820 Jul 13 '15 at 01:57
  • And I'd be glad to know if any of my suggestions work if you tried them. If they really solve your problem, then I don't mind writing up an answer. – user21820 Jul 13 '15 at 02:05
  • @user21820 if someone else has a better answer they'll write one too. To me it sounds like you don't want to write an answer unless you'll be sure to get the tick :/ – Emily L. Jul 13 '15 at 12:29
  • No it's nothing to do with getting accepted. I really am not familiar with what is known about such algorithms and everything I've told you so far comes from my own experience in coding my own arbitrary precision library. So unless I'm certain something really works, I can't be sure that it is really a good answer at all. – user21820 Jul 13 '15 at 12:34
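
A minimal sketch of the central-difference gradient user21820 describes in the comments above; the helper name `central_gradient` and the toy objective are illustrative, not from the original discussion:

```python
import numpy as np

def central_gradient(g, x, d=1e-6):
    """Central-difference gradient of a scalar function g.
    Truncation error per component is O(d^2), versus O(d)
    for a one-sided forward difference."""
    grad = np.empty_like(x, dtype=float)
    for i in range(x.size):
        e = np.zeros_like(x, dtype=float)
        e[i] = d
        grad[i] = (g(x + e) - g(x - e)) / (2 * d)
    return grad

# Toy check against a known gradient: g(x) = x0^2 + 3*x1.
g = lambda x: x[0]**2 + 3 * x[1]
print(central_gradient(g, np.array([1.0, 2.0])))  # ~ [2.0, 3.0]
```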

1 Answer


I had an error in the parameters of my BFGS search: one of the termination criteria was too loose. Tightening it solved the problem.
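
My BFGS implementation was my own, but for anyone hitting the same thing with SciPy's BFGS, the analogous knob is `gtol` (the gradient-norm tolerance), which defaults to 1e-5 and silently caps the attainable accuracy. The objective below is a toy and the exact tolerance value is illustrative:

```python
import numpy as np
from scipy.optimize import minimize

g = lambda x: (x[0] - 1.0)**2 + (x[1] + 2.0)**2   # toy objective
res = minimize(g, np.zeros(2), method='BFGS',
               options={'gtol': 1e-12})           # default gtol is 1e-5
print(res.x)
```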

Emily L.