
I was asked a question the other day about how to calculate logarithms $\log(t)$ without log tables.

Some approaches I thought of (own work) were

  1. Different methods solving $\exp(x) = t$

  2. Taylor expand $\log(1+t)$ around $t=0$; it converges for $t\in(-1,1]$.

  3. Combine any of the above with log laws $$\cases{\log(nx) = \log(n) + \log(x)\\\\\log(x^n) = n\log(x)}$$

However, I don't know how this is typically done in practice, for example in software or hardware in electronics and computers. It would be interesting to get an introduction to some methods and/or some sources to read up on.
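For example, a rough C sketch combining approaches 2 and 3 (purely illustrative; the series converges slowly when the reduced argument gets close to 2):

```c
#include <stdio.h>

/* Reduce x to [1, 2) with log(2^k * m) = k*log(2) + log(m), then sum the
 * Taylor (Mercator) series of log(1+t) with t = m - 1 in [0, 1).
 * Purely illustrative: convergence is slow for t close to 1. */
double log_taylor(double x)
{
    const double LN2 = 0.6931471805599453;
    int k = 0;
    while (x >= 2.0) { x *= 0.5; ++k; }
    while (x <  1.0) { x *= 2.0; --k; }

    double t = x - 1.0, term = t, sum = 0.0;
    for (int n = 1; n <= 60; ++n) {      /* t - t^2/2 + t^3/3 - ... */
        sum += term / n;
        term *= -t;
    }
    return sum + k * LN2;
}

int main(void)
{
    printf("%.15f\n", log_taylor(10.0)); /* ~ 2.302585092994046 */
    return 0;
}
```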

mathreadler

1 Answer


It depends... $\def\artanh{\operatorname{artanh} } \def\larr{\leftarrow}$

Prerequisites

Whether some approach is a good choice depends on many factors:

  • What arithmetic / operations are available / can be used? What are their costs w.r.t. execution time, memory usage (static and dynamic, volatile and non-volatile), silicon consumption, current consumption, etc.

  • Is it about absolute error or relative error?

  • Is it fixed-point arithmetic or floating-point?

  • Is the required precision known in advance and what is it? Or is it for some arbitrary precision library like MPFR?

  • Is the range known in advance? Or is it for some generic library that must handle any input range?

  • If fixed-point arithmetic is used, extra care must be taken that intermediate results do not overflow, and approaches like Taylor series might be unsuitable. In that case, consider Bernstein polynomials evaluated using De Casteljau's algorithm, which avoids intermediate overflow by design provided the control points are representable (see the sketch after this list).

  • What are the metrics for "optimal"? Precision? Speed of execution? Code consumption? Memory consumption? Ease of implementation? ...
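As an illustration of the De Casteljau remark above: evaluating a polynomial given by its Bernstein control points only ever forms convex combinations, so no intermediate result can leave the range of the control points. A minimal sketch in C (using `double` for brevity; a fixed-point version would do the same lerps in the integer format):

```c
/* Evaluate a polynomial given in Bernstein form by its control points
 * b[0..n] at t in [0,1] with De Casteljau's algorithm.  Every
 * intermediate value is a convex combination of the control points,
 * hence stays within their range -- the property that matters for
 * fixed-point arithmetic.  Sketch only; degree limited to 7 here. */
double de_casteljau(const double *b, int n, double t)
{
    double tmp[8];
    for (int i = 0; i <= n; ++i)
        tmp[i] = b[i];
    for (int k = 1; k <= n; ++k)            /* n rounds of lerps */
        for (int i = 0; i <= n - k; ++i)
            tmp[i] = (1.0 - t) * tmp[i] + t * tmp[i + 1];
    return tmp[0];
}
```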

In hardware, CORDIC might be the way to go. In software, it depends on which instructions are available and how they perform. CORDIC is usually advantageous if shifting is much faster than multiplication.

Computing $\ln$ using range reduction and $\artanh$

Some months ago I implemented an IEEE-754 double emulation for an 8-bit microcontroller, and even though the emulated multiplication is very expensive (around 1000 cycles), approximation via polynomials beat a CORDIC implementation written by a colleague. The algorithm goes basically like this:

  1. Range reduction to $[1,2]$ and then to $[1/\sqrt2, \sqrt2]$. This is basically a no-op with binary floating-point because the mantissa stays the same.

  2. Compute $$\ln x = 2\artanh\frac{x-1}{x+1}$$ That series is much better suited than the Mercator series, the Taylor expansion of $\ln$ around 1. Notice that the speed of convergence of the Maclaurin series for $\artanh$ with the argument above is basically the same when $x$ is replaced by $1/x$. This is the reason for the choice $x \in [1/\sqrt2, \sqrt2]$.

  3. The implementation uses a pre-computed polynomial$^{*)}$ of degree 7 which minimizes the maximal relative error against $$f(x) = 2\frac{\artanh(\sqrt x)}{\sqrt x}$$ so that we have $$\ln x = y\cdot f(y^2)\quad\text{ with }\quad y=\frac{x-1}{x+1}$$ For the MiniMax polynomial approximation of $f$ see the end of this answer. This function is better suited than using $\artanh$ directly because it encodes the symmetry of $\artanh$ (which would be lost if $\artanh$ were approximated directly by a polynomial). And it "removes" the zero at $y=0$, which is much more convenient because the approximation targets relative error, which is not compromised by that approach. Degree 7 is enough to get double precision, i.e. 53 bits of precision or more. Due to the range reduction, the argument of $\artanh$ satisfies $$|y| \leqslant \frac{\sqrt2-1}{\sqrt2+1} \approx 0.1716$$ and thus $f$ has to be approximated for arguments in $[0,\, 0.1716^2] \approx [0,\, 0.0295]$.

    In case you prefer the Maclaurin series of $f$ over a pre-computed, optimal polynomial: It is$$f(x)=2\sum_{k=0}^\infty\frac{x^k}{2k+1}=2\left(1+\frac{x}{3}+\frac{x^2}{5}+\frac{x^3}{7}+\cdots\right)$$

  4. Undo the range reduction. As $\ln x = \ln (x\cdot 2^k) - k\ln2$, all we have to do is add an integral multiple of $\ln2$. $\ln2$ is a known constant; and if it is not, it can be approximated using the same approach.

Apart from the range reduction, this consumes 9 additions, 9 multiplications and 1 division.
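To make these steps concrete, here is a minimal C sketch of the whole computation, assuming IEEE-754 `double` arithmetic and using the degree-7 MiniMax coefficients from Example 1 below (handling of $x \leqslant 0$, infinities and NaN is omitted):

```c
#include <math.h>   /* only for frexp(); no logarithm functions are used */

/* ln(x) via range reduction to [1/sqrt2, sqrt2) and the pre-computed
 * degree-7 MiniMax polynomial for f(x) = 2*artanh(sqrt x)/sqrt x.
 * Sketch only: assumes x is positive, finite and normal. */
double my_ln(double x)
{
    static const double c[8] = {
        2.00000000000000000288,   0.666666666666667653655,
        0.399999999996639180070,  0.285714286985476106491,
        0.222222024077476110198,  0.181833876328594532366,
        0.153181571233880298729,  0.147580071554569676223
    };
    const double LN2 = 0.69314718055994530942;

    int e;
    double m = frexp(x, &e);                /* x = m * 2^e, m in [0.5, 1)   */
    if (m < 0.70710678118654752440) {       /* move m into [1/sqrt2, sqrt2) */
        m *= 2.0;
        e -= 1;
    }

    double y = (m - 1.0) / (m + 1.0);       /* |y| < 0.1716                 */
    double z = y * y;                       /* z in [0, 0.0295)             */

    double p = c[7];                        /* Horner evaluation of f(z)    */
    for (int i = 6; i >= 0; --i)
        p = p * z + c[i];

    return y * p + (double)e * LN2;         /* ln m + k*ln 2                */
}
```

For typical arguments the result should agree with a correctly rounded `log()` to within a few ULP.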

Bit-wise computation of $\log_2$

A much more elementary approach is the following spigot-like algorithm. Notice that

  • Squaring a number shifts its base-2 logarithm one bit to the left, i.e. doubles it.

  • Dividing a number by 2 decrements its base-2 logarithm by 1.

  • The logarithm $\log_2x$ of a number $x$ has the representation 0.* in base 2 iff $1\leqslant x<2$, and the representation 1.* iff $2\leqslant x<2^2$.

This leads to the following algorithm to compute the fractional bits $b_n$ of $\log_2 x=0.b_1b_2b_3\dots$:

  1. Shift $x$ until $1\leqslant x < 2$. The number of right-shifts is the integral part of $\log_2x$. It's negative if you have to left-shift $x$ to normalize it.

  2. $n \larr 1$

  3. $x \larr x^2$

  4. If $x \geqslant2$ then $b_n \larr 1$, else $b_n\larr0$

  5. If $x \geqslant2$ then $x \larr x/2$

  6. $n\larr n+1$

  7. goto 3
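A direct C sketch of this loop, using `double` for readability (in hardware or fixed-point, $x$ would be the mantissa and the squaring would be the only real multiplication):

```c
/* Spigot-like computation of log2(x) for x > 0, one fractional bit per
 * iteration of the loop, following the steps above. */
double log2_bitwise(double x, int fraction_bits)
{
    double result = 0.0;

    /* Step 1: normalize x into [1, 2); the shift count is the integer part. */
    while (x >= 2.0) { x *= 0.5; result += 1.0; }
    while (x <  1.0) { x *= 2.0; result -= 1.0; }

    /* Steps 3-7: squaring shifts the remaining fractional bits of log2(x)
     * one position to the left; the bit crossing the binary point is read
     * off and removed by halving. */
    double bit = 0.5;
    for (int n = 1; n <= fraction_bits; ++n) {
        x *= x;
        if (x >= 2.0) {
            result += bit;          /* b_n = 1 */
            x *= 0.5;
        }                           /* else b_n = 0 */
        bit *= 0.5;
    }
    return result;
}
```

For example, `log2_bitwise(10.0, 20)` yields about $3.321928$, i.e. $\log_2 10$ accurate to 20 fractional bits.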

Some final remarks

  • MiniMax polynomials perform better than Taylor series of the same degree. Taylor has the advantage that it's easy to come by for most functions you might need. The disadvantage of MiniMax polynomials is that they have to be pre-computed, and that you must know the needed precision in advance. On the other hand, they do not suffer from convergence issues, whereas Taylor performs poorly or does not work at all if there are singularities around, and it has only a finite radius of convergence.

  • MiniMax rational functions perform better than Padé approximants of the same degree. Basically everything from the previous point carries over to this one.

  • As a rule of thumb, MiniMax rational functions of degree $[n/n]$ perform better than MiniMax polynomials of degree $2n$ (for the same target function and interval, of course), with a clear advantage if singularities hang around and as you move towards higher degrees. In the presence of singularities however, MiniMax rational approximations might be tricky to compute.

  • As a rule of thumb, if you fix $m+n$ then rational MiniMax of degree $[m/n]$ performs best if $n \approx m$.

Example 1: MiniMax polynomial of degree 7 for $f$ over $[0, 0.02944]$

$^{*)}$The MiniMax polynomial approximation of $f(x)$ determined using the Remez algorithm is: $$\begin{align} f(x) &\approx 2.00000000000000000287938058543222037939 \\&+ 0.666666666666667653654896829047862109605\cdot x \\&+ 0.399999999996639180070480580779767357622\cdot x^2 \\&+ 0.285714286985476106491309914245597720749\cdot x^3 \\&+ 0.222222024077476110197597399876978069272\cdot x^4 \\&+ 0.181833876328594532366358057253631240618\cdot x^5 \\&+ 0.153181571233880298729095145342556944283\cdot x^6 \\&+ 0.147580071554569676223389696418304199218\cdot x^7 \end{align}$$ This is good enough to get IEEE double precision, i.e. precision is 53 bits or better.

Example 2: MiniMax rational function of degree 1/1 for $f$ over $[0, 0.02944]$

A rational MiniMax approximation over $[0,(\sqrt2-1)^2/(\sqrt2+1)^2]$ of degree 1/1 is sufficient to get IEEE single precision, i.e. precision is 24 bits or better: $$ f(x) \approx \frac{0.8945832414588x-3.316364574224}{x-1.65818225556}$$ The maximal relative error over the target interval is less than $1.91\cdot10^{-8}$, which is 25.6 bits resp. 7.7 decimals.

Apart from the range reduction, it costs 2 divisions, 3 multiplications and 4 additions.
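For completeness, a corresponding single-precision sketch in C, reusing the range reduction from above with this rational $[1/1]$ approximation (again without any special-case handling):

```c
#include <math.h>   /* only for frexpf() */

/* Single-precision ln(x) using the degree-[1/1] rational MiniMax
 * approximation of f from Example 2.  Sketch only; assumes x > 0. */
float my_logf(float x)
{
    const float LN2 = 0.69314718f;

    int e;
    float m = frexpf(x, &e);            /* x = m * 2^e, m in [0.5, 1)   */
    if (m < 0.70710678f) {              /* move m into [1/sqrt2, sqrt2) */
        m *= 2.0f;
        e -= 1;
    }

    float y = (m - 1.0f) / (m + 1.0f);
    float z = y * y;                    /* z in [0, 0.0295)             */

    /* f(z) ~= (0.8945832414588*z - 3.316364574224) / (z - 1.65818225556) */
    float f = (0.8945832414588f * z - 3.316364574224f)
              / (z - 1.65818225556f);

    return y * f + (float)e * LN2;      /* ln m + k*ln 2                */
}
```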

Here is a Desmos plot showing the approximation and the relative error in the target interval.

emacs drives me nuts