28

(edit, 9 years later... hello smart contract developers, I know that's why you're here lol)

What is the fastest algorithm for finding the square root of a number?

I created one that can find the square root of $987654321$ to $16$ decimal places in just $20$ iterations.

I've now tried Newton's method as well as my own method (Newton's code is shown below).

What is the fastest known algorithm for taking the second root of a number?

My code for Newton's Method (Edit: there was an error in my code; it is fixed in the comments below):

    a=2        //2nd root
    b=97654321 //base (note: the question text uses 987654321)
    n=1        //initial guess
    c=0        //current iteration (this is a changing variable)
    r=500000   //total number of iterations to run
    while (c<r) 
    {
        m = n-(((n^a)-b)/(a*n))  //Newton's step (fixed per the comments below: the original had a*b in the denominator)
        n=m
        c++;
        trace(m + "  <--guess   ...   iteration-->  " + c)
    }
  • 7
    Your attempt with Newton's method sounds unrealistic: Newton / Heron has quadratic convergence. Even starting with $x_0=1$ gives an error $<10^{-24}$ after 20 iterations. – Hagen von Eitzen Feb 06 '13 at 06:55
  • Have you seen these methods ? http://en.wikipedia.org/wiki/Methods_of_computing_square_roots – Thomas Feb 06 '13 at 07:09
  • I'm not sure what I did wrong... the method I was using that I thought was Newton's method took about 200,000 iterations and did come up with the right answer haha.... hmm – Albert Renshaw Feb 06 '13 at 07:20
  • Added my algorithm's code if any programmers wanna take a crack at it and show me my flaw – Albert Renshaw Feb 06 '13 at 07:22
  • 1
    Jeez, you couldn't be bothered to compare your code with Hagen's answer yourself? You need a*n instead of a*b in the denominator. –  Feb 06 '13 at 07:26
  • I'm sorry I hadn't seen it yet! – Albert Renshaw Feb 06 '13 at 07:27
  • a*n worked great! I must have read the algorithm wrong when I was writing this code O:) – Albert Renshaw Feb 06 '13 at 07:28
  • I know this is a mathematics forum, not a programming one, but speed is going to depend on how you write the code, as much as on the formulas you use. Optimized code will run faster than the obvious implementation, even if it ostensibly does more arithmetic or uses more iterations. And some algorithms are easier to optimize than others. In short, counting arithmetic operations and iterations is only part of the story. The Babylonians and Newton didn't have computers, so this issue didn't arise, for them :-) – bubba Feb 06 '13 at 11:18
  • 2
    check out this post on SO http://stackoverflow.com/questions/295579/fastest-way-to-determine-if-an-integers-square-root-is-an-integer It shows the fastest way to find the square root and to check if it is an integer or not. – Jeel Shah Feb 06 '13 at 11:27
  • Enter a positive number: 1987654321 Iterative Step 1 = 3.31276e+08 Iterative Step 2 = 1.10425e+08 Iterative Step 3 = 3.68084e+07 Iterative Step 4 = 1.22695e+07 Iterative Step 5 = 4.08999e+06 Iterative Step 6 = 1.36376e+06 Iterative Step 7 = 455882 Iterative Step 8 = 155824 Iterative Step 9 = 62978.6 Iterative Step 10 = 45031.4 Iterative Step 11 = 44583.1 Iterative Step 12 = 44583.1 Converged Value = 44583.1 Program ended with exit code: 0 –  Dec 24 '14 at 00:01
  • 1
    LOL — lately I've been getting so many views on this question I asked back in 2013 and I was wondering why... today while programming solidity I needed a cheap square root and was using babylonian method then wondered if there was a more efficient way, googling brought me back to my own question. I can now conclude almost everyone in this recent view surge is fellow smart contract developers trying to get cheap square roots on the blockchain hahah. Hello world! – Albert Renshaw Jan 14 '22 at 06:15

8 Answers

24

If you use Halley's method, you get cubic convergence! It is the second method in the class of Householder's methods.

Halley's method is: $$ x_{n+1} = x_n - \frac{2f(x_n)f'(x_n)}{2[f'(x_n)]^2-f(x_n)f''(x_n)} $$ If we let $$f(x) = x^2 - a$$ which meets the criterion (a continuous second derivative),

then Halley's method becomes:

$$ x_{n+1} = x_n - \frac{2x_n^3 - 2ax_n}{3x_n^2 + a} $$ which simplifies to: $$ x_{n+1} = \frac{x_n^3 + 3ax_n}{3x_n^2 + a} $$ I will also add a document that discusses extensions of Newton's method.

There exists an extension due to Potra and Pták called the “two-step method” that may be rewritten as the iterative scheme $$x_{n+1} =x_n − \frac{f(x_n)+f\left(x_n − \frac{f(x_n)}{f′(x_n)}\right)}{f'(x_n)}$$ which converges cubically in some neighborhood of the root $x^{*}$ and does not require the computation of the second derivative.

See: On Newton-type methods with cubic convergence for more information on this topic.

As Hurkyl and others have noted, your best bet is to just use Newton's method. These alternative methods generally come with more operations per iteration, so they usually aren't worth the computational cost, but they make a good comparison.

Rustyn
  • 8,407
  • The householder method is the fastest known algorithm? – Albert Renshaw Feb 06 '13 at 07:31
  • @AlbertRenshaw I'm not sure if it is the fastest algorithm, but you should check out the article I included. Cubic convergence is really good, especially if you don't need the second derivative. – Rustyn Feb 06 '13 at 07:40
  • 11
    You can do even better: there's an easy way to get quartic convergence: each iteration, you do two steps of Newton's algorithm. :) Cubic convergence means you only need 2/3 as many iterations as quadratic convergence, but that isn't helpful if each iteration takes 3/2 times as long or more. –  Feb 06 '13 at 07:44
  • @Hurkyl I would agree, Newton's method is pretty much the most efficient route. I wanted to include other methods so that he could compare. – Rustyn Feb 06 '13 at 07:46
  • Guys, I have a feeling this answer has the potential to be actually faster than Newton's Method. The reason is because per iteration you still use only one division operation (which is about 14 times as slow as addition/multiplication on x86) while having cubic convergence. – orlp Feb 06 '13 at 13:00
  • Also, I think the formula you posted here is wrong: http://ideone.com/Reob0B – orlp Feb 06 '13 at 13:05
  • @nightcracker I don't think I got the formula wrong. – Rustyn Feb 06 '13 at 14:20
  • @RustynYazdanpour: I'm 100% sure it's wrong, because it gives the wrong results (see my linked snippet). The correct formula for the square root is given here: http://www.mathpath.org/Algor/squareroot/algor.square.root.halley.htm – orlp Feb 06 '13 at 17:58
  • @nightcracker Oh I realized I forgot to include the "$x_n - $" part in front, thanks. Sorry for being belligerent. – Rustyn Feb 06 '13 at 23:24
  • @nightcracker: Note that the problem of optimizing an algorithm for computing 53 bits of precision is very different than the problem of optimizing an algorithm for many bits of precision. I think everyone's been focusing on the latter. –  Feb 07 '13 at 00:00
  • 1
    @Hurkyl: I've been focusing on the former actually. But after benchmarking, I found this method to be slower - the overhead of the additions, squarings and multiplications is just not worth saving a division (on x64 with SSE, that is). – orlp Feb 07 '13 at 17:49
7

Newton's method for solving $f(x)=x^2-N=0$ leads to the recurrence $x_{n+1}=x_n-\frac{x_n^2-N}{2x_n}=\frac{x_n+N/x_n}2$, also known as Heron's method. Since $f'(x)\ne0$ at the root, the convergence is quadratic (i.e. the number of correct decimals doubles with each step once a threshold precision is reached). The results depend on the starting value, of course. Simply guessing $x_0=1$ leads to $$x_{1} = 493827161.00000000000\\ x_{2} = 246913581.49999999899\\ x_{3} = 123456792.74999998937\\ x_{4} = 61728400.374999909634\\ x_{5} = 30864208.187499266317\\ x_{6} = 15432120.093744108961\\ x_{7} = 7716092.0468278285538\\ x_{8} = 3858110.0230600438248\\ x_{9} = 1929183.0086989850523\\ x_{10} = 964847.48170274167713\\ x_{11} = 482935.55973452582660\\ x_{12} = 242490.33277426247529 \\ x_{13} = 123281.64823302696814 \\ x_{14} = 65646.506775513694016 \\ x_{15} = 40345.773393104621684 \\ x_{16} = 32412.760144718719221 \\ x_{17} = 31441.958847358050036 \\ x_{18} = 31426.971626562861740 \\ x_{19} = 31426.968052932067262 \\ x_{20} = 31426.968052931864079 $$
with small enough error.

J. W. Tanner
  • 60,406
  • I found my error in Newton's method now! I coded it wrong, thank you! +1

    Do you know of algorithms faster than this one?

    – Albert Renshaw Feb 06 '13 at 07:30
  • If the representation of your number x allows this, begin with a number with half of the number of digits of x (where the digits left of the decimal point are taken). Then in Hagen's list you step in immediately at $x_{17}$ ... (take care: you've given $97654321$ while Hagen uses $987654321$ ) – Gottfried Helms Feb 06 '13 at 08:00
6

Not a bona fide "algorithm", but a cute hack nevertheless, which I once used in code that required taking an inverse square root millions of times (back when I was doing computational astrophysics); it can be found here:

http://en.wikipedia.org/wiki/Fast_inverse_square_root

It does use a few iterations of Newton's method, but only after some very, very clever trickery.

I remember naively using trial-and-error optimization to find a "magic number" that would come closest to a direct square root, though of course it was much slower (still faster than just calling sqrt from math.h) and had a higher error than the above hack.

gmoss
  • 922
4

In case you meant not the theoretical speed but the algorithm that runs the fastest on a computer, then it's the "Quake 3" algorithm or one of its derivatives, which, I believe, is what GCC's sqrt implementation uses at optimization levels 2 and 3. It's ironic that the determining factor here is a clever choice and implementation of the initial condition rather than the actual iterative scheme. On my ordinary laptop it takes about 2.0e-09 s to call sqrt with gcc -O2 or gcc -O3, which is about 2.8 times faster than the runner-up, the standard implementation of sqrt.

rytis
  • 151
3

I just noticed nobody's pointed out the following trick: to compute $1/\sqrt{n}$, Newton's method used to find a root of $f(y) = 1/y^2 - n$ gives the iteration

$$ y \leftarrow \frac{3y - ny^3}{2}$$

I believe that in some ranges, it is faster to compute an estimate of $\sqrt{n}$ by using Newton's method to first compute $1/\sqrt{n}$ then invert the answer than it is to use Newton's method directly.

It is likely faster to compute this as

$$ \frac{3y - ny^3}{2} = y - \frac{n y^2 - 1}{2}y$$

The point being that if $y$ is a good approximation of $1/\sqrt{n}$, then $n y^2 - 1$ is a good approximation of $0$, which reduces the amount of precision you need to keep around, and you can play tricks to speed up the calculation of a product if you already know many of its digits.


But you might be able to do even better by computing an approximation of both $x \sim \sqrt{n}$ and $y \sim 1/\sqrt{n}$ simultaneously. I haven't worked through the details.

The reason to hope that this might work out is that a better update for $x$ may be available because $ny \sim n/x$, and there may then be a faster way to update $y$ based on $y \sim 1/x$ rather than $y \sim 1/\sqrt{n}$.

2

A related problem. You can use the Taylor series of $\sqrt{x}$ at a point $a$

$$ \sqrt{x} = \sum _{n=0}^{\infty }\frac{\sqrt {\pi }}{2}\,{\frac {{a}^{\frac{1}{2}-n} \left( x-a\right)^{n}}{\Gamma\left( \frac{3}{2}-n \right)n! }}. $$

If you pick $a$ to be close to $987654321$, say $ a=987654320 $, then you get fast convergence.

  • Hey, how do you calculate $\pi$ and $\Gamma(x)$ quickly on a computer? – Yai0Phah Feb 06 '13 at 09:44
  • @FrankScience: well you'd hardcode Pi of course. The gamma function is the largest hurdle here, I think. – orlp Feb 06 '13 at 10:59
  • I think I remember reading something a while back about the natural log of gamma being easier to compute than gamma itself, then you just raise e to that power, but these are all iteratively long steps. – Albert Renshaw Jun 29 '16 at 19:28
1

You can get $20$ exact digits with only $3$ iterations, but it requires a better initial guess (do not use $X_0=1$). This guess is easy to find with a simple table of square roots between $1$ and $100$. Then you start with $3.16 \times 10^4$ for $X_0$...
Apply Newton/Heron $3$ times, and you get
$31426.968052931864079...$, which is $20$ correct digits!
Hello again.
With my "$101$ Dalmatians" algorithm, I can get $5$ exact digits with $1$ iteration, and $15$ exact digits with only $2$ Newton/Heron iterations. I use linear interpolation and a second-order polynomial to find a very good initial guess.
I also use a lookup in a simple table from $1$ to $101$, with $4$-digit accuracy.
It is very fast... but assembly language code will do better, of course. A sketch of the general idea follows below.

Kumar
  • 1,167
Serge
  • 11
0

Babylonian algorithm.

To determine the square root of $a$:

$ x_{n+1} = \frac{1}{2}(x_n + \frac{a}{x_n}) $

The convergence is quadratic once $x_n \approx \sqrt{a}$ (the number of correct digits roughly doubles with each step), and the iteration converges for any starting value $x_0 > 0$. http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method