Error on Simpson Sum and Trapezoidal sum

Question

I was coding an algorithm to approximate the integral $\int_{0}^{1} f(x) dx\int_{0}^{1} cos(x)+1/10cos(10x)+1/50cos(50x) dx$ as i encountered some interessting graphs. When approximating for $N=2^k$ for $k\in \mathbb{N}$ intervals with an size of $h=\frac{b-a}{N}=\frac{1}{2^k}$ i realized that the error between the excact integration and the approximations (here: $|T(h)-I(f)| $ and $|S(h)-I(f)|$ with T being the Trapezodial sum and S being the Simpson sum) went down as assumed until a certain point number of intervals with the error in Simpson decreasing fast than the Trapezoid one (also excpected).

But at a certain interval size $h=\frac{1}{2^{15}}$ (therefor $2^{15}$ intervals) i realized that the error rate is certainly increasing. This goes so far, that even the boundary set for the maximum error $E_{Simpson}=-h^4(b-a)\frac{f^4(\xi)}{2880}$ and $E_{Trapezoid}=-h^2(b-a)\frac{f^2(\xi)}{12}$ for $a=0$ and $b=1$ got broken. As i am assuming my algorithm is completly correct i was interessted in how this boundary could be broken. An aspect could be computation errors due to very low numbers but i was thinking that there might be a mathematical reason behind it.

I am also sharing a plot with you (axis are written in german but Y-Axis meaning the absolute error and X-Axis meaning k with interval amount $2^k$, both axis scaled with Logarithm to base 2. Graph

I am interessted wether some of you have an explanation to this phenomen or if it is really just an data error due to the limits of computers.

Greetings, Florian

Python has inbuilt double precision, therefor all numbers should be double — Florian Bauer, Jul 05 '21 at 16:15
The error estimate is dominated by the derivative of $\cos(50x)/50$. For k = 15 and 30 the simpson errors are $s15,s30$ and trapezoidal are $t15,t30.$ $s15 = -(2^{-15})^4(50)^4/50/2880 = -3.7646e-17$, $s30 = -(2^{-30})^4(50)^4/50/2880 = -3.2653e-35$, $t15 = -(2^{-15})^250^2/50/12 = -3.88051e-09$, $t30 = -(2^{-30})^250^2/50/12 = -3.6140e-18$, The graph gives better results than the error estimates. How were the error estimates calculated? — , Jul 05 '21 at 19:29
The graph gives worse results due to the fact that is is scaled with the logarithm of base 2. For $k=15$ it is $|t15|\approx |\frac{1}{2^{50}}|\approx |8.88e-16|$ which appears to be worse than what you calculated. Therefor it breaks the error boundary. Error were calculated by summing the intervals for simpson / trapezoidal up and taking the absolute difference to the correct integral, then scaling this difference with $log_2$. — Florian Bauer, Jul 05 '21 at 21:15
This might be relevant https://math.stackexchange.com/questions/1786896/rounding-error-of-trapezoidal-method — , Jul 06 '21 at 06:39

score 1 · Accepted Answer · 2021-07-06T15:27:28.153

It looks like rounding error.

I ran trapz on octave and got:

exactI       =    8.359258237575212e-01
qtrapz15     =    8.359258237547984e-01
errortrapz15 = qtrapz15 - exactI
errortrapz15 =   -2.722821967893196e-12
qtrapz30     =    8.359258237608431e-01
errortrapz30 = qtrapz30 - exactI
errortrapz30 =    3.321898311980931e-12

The absolute error for $t30$ is greater than $t15$. Similar to your results.

I then ran a C++ program using $128$ binary bit floating point numbers and got:

exact    = 0.835925823757521237004089087553020177787458754748900059337972
trap15   = 0.835925823754799058014553389923348889088758837882225475274439
trap30   = 0.835925823757521237001553860011910824446002551425735037263132
errort15 = -2.72217898953569762967128869869991686667458406353306678167762e-12
errort30 = -2.53522754110935334145620332316502207483983770730852571768988e-21

The trapezoidal errors are now less than the estimated errors.

The C++ code uses gmp GNU Multiple Precision Arithmetic Library and mpreal a C++ interface to it.

traptest128bits.cpp:
#include <iostream>
#include <vector>
#include <stdlib.h>
#include "mpreal.h"
// ------------------------------------------------------------------
mpfr::mpreal f(mpfr::mpreal x)
{
    return cos(x) + cos(10.0x)/10.0 + cos(50.0x)/50.0;
}
// ------------------------------------------------------------------
// Integral
mpfr::mpreal If(mpfr::mpreal x)
{
    return sin(x) + sin(10.0x)/100.0 + sin(50.0x)/2500.0;
}
// ------------------------------------------------------------------
int main(int argc, char* argv[])
{
    using mpfr::mpreal;

    using std::cout;
    using std::endl;
    using std::cerr;
// Required precision of computations in bits
const int numbits = 128;

// Setup default precision for all subsequent computations
// MPFR accepts precision in bits 
mpreal::set_default_prec(numbits);

cout.precision(100);

mpreal exact = If(1.0) - If(0.0);
cout &lt;&lt; &quot;exact = &quot; &lt;&lt; exact &lt;&lt; endl;

mpreal dx15 = pow(2,-15);
cout &lt;&lt; &quot;dx15 = &quot; &lt;&lt; dx15 &lt;&lt; endl;

mpreal sum15(0.0);
unsigned long kmax15 = pow(2,15);
mpreal x(0.0);
for(unsigned long k = 0; k &lt;= kmax15; k++)
{
    if ((k % 1000) == 0) 
    {
        printf(&quot;\r%lu&quot;,k); 
        fflush(stdout);
    }

    sum15 += f(x);
    x += dx15;
}
cout &lt;&lt; endl;

sum15 = sum15 - (f(0.0) + f(1.0))/2.0;

mpreal trap15 = sum15*dx15;
mpreal errort15 = trap15 - exact;

cout &lt;&lt; &quot;trap15 = &quot; &lt;&lt; trap15 &lt;&lt; endl;
cout &lt;&lt; &quot;errort15 = &quot; &lt;&lt; errort15 &lt;&lt; endl;


// -------------------------------------------------------------------------

mpreal dx30 = pow(2,-30);
cout &lt;&lt; &quot;dx30 = &quot; &lt;&lt; dx30 &lt;&lt; endl;

mpreal sum30(0.0);
unsigned long kmax30 = pow(2,30);
x = 0.0;
for(unsigned long k = 0; k &lt;= kmax30; k++)
{
    if ((k % 1000) == 0) 
    {
        printf(&quot;\r%lu&quot;,k); 
        fflush(stdout);
    }

    sum30 += f(x);
    x += dx30;
}
cout &lt;&lt; endl;

sum30 = sum30 - (f(0.0) + f(1.0))/2.0;

mpreal trap30 = sum30*dx30;
mpreal errort30 = trap30 - exact;

cout &lt;&lt; &quot;trap30 = &quot; &lt;&lt; trap30 &lt;&lt; endl;
cout &lt;&lt; &quot;errort30 = &quot; &lt;&lt; errort30 &lt;&lt; endl;

// -------------------------------------------------------------------------

cout &lt;&lt; &quot;---------------------------------------------&quot; &lt;&lt; endl;
cout &lt;&lt; &quot;exact    = &quot; &lt;&lt; exact &lt;&lt; endl;
cout &lt;&lt; &quot;trap15   = &quot; &lt;&lt; trap15 &lt;&lt; endl;
cout &lt;&lt; &quot;trap30   = &quot; &lt;&lt; trap30 &lt;&lt; endl;
cout &lt;&lt; &quot;errort15 = &quot; &lt;&lt; errort15 &lt;&lt; endl;
cout &lt;&lt; &quot;errort30 = &quot; &lt;&lt; errort30 &lt;&lt; endl;


return 0;

}

The mpreal.h file should be in the same directory.

The build instruction on cygwin Linux is:

g++ -Wall -D_GNU_SOURCE -D__LINUX__ -std=c++11 -O3 traptest128bits.cpp -lmpfr -lgmp -o traptest128bits

Error on Simpson Sum and Trapezoidal sum

1 Answers1