
I've googled this and seen various proofs, but they all use notation that I don't understand.

But first of all, I need to understand what quadratic convergence means. I read that it has to do with the speed of an algorithm. Is this correct?

Ok, so I know that this is the Newton-Raphson method:

$$x_{n+1}=x_n-\dfrac{f(x_n)}{f'(x_n)}$$

How do I prove that it converges?

Thanks.

Jaigus
MosesA

3 Answers


The method converges under suitable hypotheses. Assume that you have determined, by whatever means, an interval $[a,b]$ with $$f(a)<0<f(b);\qquad f'(x)>0, \quad f''(x)>0\quad(a<x<b)$$ (or similar, but with different signs of $f$, $f'$, and $f''$). Then $f$ has exactly one zero $\xi\in\ ]a,b[\ $. Furthermore, it is clear from a figure, or from the convexity of $f$, that $$x_0:=b,\qquad x_{n+1}:=x_n-{f(x_n)\over f'(x_n)}\quad(n\geq0)\tag{1}$$ produces a monotonically decreasing sequence of points $x_n>\xi$. It follows that the $x_n$ converge to some $\xi'\in[\xi,x_0]$. Letting $n\to\infty$ in $(1)$ gives $f(\xi')=0$, whence $\xi'=\xi$.

In order to analyze the speed of convergence we invoke Taylor's theorem: For each $n\geq0$ there is an $x^*\in[\xi,x_n]$ with $$0=f(\xi)=f(x_n)+f'(x_n)(\xi-x_n)+{f''(x^*)\over 2!}(\xi-x_n)^2\ .$$ This implies, by definition of $x_{n+1}$, that $$x_{n+1}-\xi={f''(x^*)\over 2 f'(x_n)}(x_n-\xi)^2\ .$$ Here for large $n$ the first factor on the right hand side is approximately equal to $$C:={f''(\xi)\over 2 f'(\xi)}\ .$$ This means that for large $n$ we have approximately $$x_{n+1}-\xi\doteq C(x_n-\xi)^2\qquad(n\gg1)\ .$$ Qualitatively this means that with each Newton step the number of correct decimals is about doubled. That is what is meant by "quadratic convergence".
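You can watch this error-squaring happen numerically. Here is a minimal sketch (with the assumed example $f(x)=x^2-2$, so $\xi=\sqrt2$ and $C=f''(\xi)/(2f'(\xi))=1/(2\sqrt2)\approx 0.354$; not part of the argument above, just an illustration):

```python
import math

def newton_errors(f, df, x0, root, steps):
    """Run Newton's method, recording the error |x_n - root| at each step."""
    x, errs = x0, [abs(x0 - root)]
    for _ in range(steps):
        x -= f(x) / df(x)
        errs.append(abs(x - root))
    return errs

# f(x) = x^2 - 2: the root is sqrt(2), and C = f''/(2 f') there equals 1/(2 sqrt(2)).
errs = newton_errors(lambda x: x * x - 2, lambda x: 2 * x, 2.0, math.sqrt(2), 4)
ratios = [e1 / e0**2 for e0, e1 in zip(errs, errs[1:])]
print(ratios)  # the ratios (x_{n+1} - xi)/(x_n - xi)^2 approach C ~ 0.3536
```

Each printed ratio is the factor in $x_{n+1}-\xi\doteq C(x_n-\xi)^2$, and you can see it settle down to $1/(2\sqrt2)$.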


Let $f$ be twice continuously differentiable on some interval $(a, b)$. Assume that $f(c) = 0$ where $a < c < b$ and that $f'(c) \neq 0$. Then there exists $0 < \epsilon < \min(c - a, b-c)$ with the following property: pick any $x_0 \in (c- \epsilon, c + \epsilon)$ and define iteratively$$x_{n+1} = x_n - {{f(x_n)}\over{f'(x_n)}},\text{ }n \ge 0.\tag*{$(1)$}$$We have that $\{x_n\}_{n=0}^\infty$ converges to $c$ in the following fashion:$$\left|x_{n+1} - c\right| \le M\left|x_n - c\right|^2 \text{ for all }n \ge 0,\tag*{$(2)$}$$where $M$ is some constant.

We may assume $c = 0$, and choose $\epsilon$ small enough that$$\left|f'(x)\right| > {{\left|f'(0)\right|}\over2}\quad\text{and}\quad\left|f''(x)\right| \le B$$ when $\left|x\right| < \epsilon$, for some $B \in \mathbb{R}^+$. Then by Taylor's Theorem,$$f(x_n) - x_nf'(x_n) = -{{x_n^2}\over2} f''(y_n)$$for some $y_n$ between $0$ and $x_n$. Thus, for $\left|x_n\right| < \epsilon$, we have$$\left|x_{n+1}\right| = \left|-x_{n+1}\right|$$$$=\left| {{f(x_n)}\over{f'(x_n)}} - x_n\right|$$$$= {1\over{\left|f'(x_n)\right|}} \cdot \left| f(x_n) - x_n f'(x_n)\right|$$$$= {1\over{\left|f'(x_n)\right|}} \cdot \left| {{x_n^2}\over2} f''(y_n)\right|$$$$\le {2\over{\left|f'(0)\right|}} \cdot {B\over2} x_n^2 = Mx_n^2,$$where $M := B/\left|f'(0)\right|$. Geometrically, $x_{n+1}$ is the point where the tangent line to the graph of $f$ at $x_n$ hits the $x$-axis. The inequality we proved shows that for $\left|x_n\right| < \epsilon$, we have$$\left| {{x_{n+1}}\over{x_n}}\right| \le M\left|x_n\right|,$$so if$$\left| x_n\right| < \min\left( \epsilon, {1\over{2M}}\right),$$we have$$\left|x_{n+1}\right| < {{\left|x_n\right|}\over2}.$$Thus,$$\left|x_n\right| \to 0\text{ as } n \to \infty.$$
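As a quick numerical check of $(2)$, here is a sketch with the assumed example $f(x)=e^x-1$, whose root is $c=0$; there $f'(0)=f''(0)=1$, so the ratio $|x_{n+1}|/x_n^2$ should settle near $1/2$:

```python
import math

# Newton's method for f(x) = exp(x) - 1 with root c = 0 (assumed example):
# the step x - f(x)/f'(x) simplifies to x - 1 + exp(-x).
x = 0.5
ratios = []
for _ in range(4):
    x_next = x - 1 + math.exp(-x)
    ratios.append(abs(x_next) / x**2)  # stays bounded, as inequality (2) requires
    x = x_next
print(ratios)  # settles near f''(0)/(2 f'(0)) = 0.5
```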


The rate of convergence in $(2)$ is quadratic, and thus faster than in the contraction principle: there the convergence is exponential, here it is super-exponential. This plays an important role in applications, and also in problems of pure mathematics (e.g. the Nash embedding theorem). If you have experience in programming, you could write a short program which computes this Newton sequence for simple functions of your choice. You will see how rapidly the digits after the decimal point stabilize.
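For instance, a minimal sketch of such a program (the target $f(x)=x^3-2$, i.e. the cube root of $2$, is just an assumed example):

```python
def newton(f, df, x0, steps=6):
    """Plain Newton iteration, printing each iterate so you can watch the digits stabilize."""
    x = x0
    for n in range(steps):
        x -= f(x) / df(x)
        print(n + 1, x)
    return x

# Real root of f(x) = x^3 - 2, i.e. the cube root of 2 = 1.2599210498948732...
root = newton(lambda x: x**3 - 2, lambda x: 3 * x**2, 1.5)
```

Already after four or five steps the iterate agrees with the cube root of $2$ to machine precision.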

Frank

You look at the size of the next function value. For simple roots, and close to the root, the function value is a measure of the distance to the root. $$ f(x+h)=f(x)+f'(x)h+\frac12 f''(x+\theta h)h^2 $$ Denote $L=\max_{x\in I} |f''(x)|$ and set $f(x)+f'(x)h=0$, then $$ |f(x+h)|\le \frac L2 h^2=\frac L2\frac{f(x)^2}{f'(x)^2} $$ Now put the first derivatives into the constant and return to the iteration sequence $(x_n)$ to get $$ |f(x_{n+1})|\le C\,|f(x_n)|^2 \iff |C\,f(x_{n+1})|\le|C\,f(x_n)|^2 $$ where $C=\frac{L}{2m^2}$ with $$ 0< m\le |f'(x)|\le M<\infty $$


Repeated squaring leads to a dyadic power in the exponent, so that $$ |C\,f(x_n)|\le|C\,f(x_0)|^{2^n} $$ This is what is meant by quadratic convergence: the exponent is $2^n$ instead of $n$ as in linear convergence.
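A quick check of this dyadic bound (a sketch with the assumed example $f(x)=x^2-2$ on $[1,2]$, where $L=2$ and $m=2$, hence $C=L/(2m^2)=0.25$, and $|C\,f(x_0)|=0.5<1$ for $x_0=2$):

```python
C, x = 0.25, 2.0   # C = L/(2 m^2) for f(x) = x^2 - 2 on [1, 2]; start at x0 = 2
for n in range(5):
    lhs = abs(C * (x * x - 2))      # |C f(x_n)|
    rhs = abs(C * 2.0) ** (2 ** n)  # |C f(x_0)|^(2^n)
    assert lhs <= rhs               # the dyadic bound holds at every step
    x -= (x * x - 2) / (2 * x)      # Newton step
```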

The condition to guarantee convergence is then $|C\,f(x_0)|<1$.


For the distance to the root $x_*$ use the mean value theorem, $$ f(x)=f(x)-f(x_*)= f'(x_*+\theta(x-x_*))\,(x-x_*) $$ for some $\theta\in(0,1)$, so that $$ m\,|x-x_*|\le |f(x)|\le M\,|x-x_*|\iff \frac{|f(x)|}M\le |x-x_*|\le\frac{|f(x)|}m. $$
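Numerically, with the assumed example $f(x)=x^2-2$ on $[1,2]$ (so $m=2$, $M=4$, $x_*=\sqrt2$; a sketch, not part of the proof):

```python
import math

m, M, root = 2.0, 4.0, math.sqrt(2)  # bounds on |f'| for f(x) = x^2 - 2 on [1, 2]
for x in (1.1, 1.5, 1.9):
    fx = abs(x * x - 2)
    dist = abs(x - root)
    assert m * dist <= fx <= M * dist  # |f(x)| pins down the distance to the root
```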

Lutz Lehmann