Convergence Speed for Approximating a Piecewise Constant Function in $L^1$ with Mollifiers

Question

I am having trouble understanding the convergence speed of an approximation.

Suppose we are given a piecewise constant function $f:\mathbb{R}\to\mathbb{R}$, say e.g. $f:=\mathbb{1}_{[0,1]}$. Since this function belongs to $L^1$, defining a sequence of mollifiers, say $\phi_\epsilon(t):= \phi(t/\epsilon)/\epsilon$ (usual scaled mollifiers), one gets that $f_\epsilon:=\phi_\epsilon * f$ is smooth and \begin{equation} f_\epsilon\to f \quad\text{in } L^1\quad \text{as}\quad \epsilon\to 0. \end{equation} My question is now the following: From sources like Convolution Error Estimate Reference Request, we now that if $f$ belongs to some Sobolev space, we get a better speed of convergence with higher regularity of $f$. However the $f$ given here is not weakly differentiable and is not an element of any Sobolev space (see Why the Heaviside distribution $H$ doesn't belong to any Sobolev space $H^{s}(\mathbb{R})$). Yet these mollifications converge.

How does one understand the speed of convergence here? Is it essentially the case here that the convergence speed becomes slower with decreasing $\epsilon$? Can we give an estimate for this speed?

(So far I tried looking at the proofs for this convergence and a scaling argument.)

Lorenzo Pompili · Accepted Answer · 2022-11-11T11:20:17.877

This answer relies on the Fourier transform, which I denote with a hat. If you don't know the Fourier transform, you can simply jump to the second part of the answer.

Look at the problem using the Fourier transform. Using some basic properties of the Fourier transform, is easy to see that

$$ \widehat {f_\epsilon}(\xi) = \widehat f(\xi)\widehat \phi_\epsilon(\xi)= \widehat f(\xi) \widehat \phi (\epsilon \xi). $$

It follows that

$$ \widehat{(f-f_\epsilon)}(\xi)=(1-\widehat \phi(\epsilon\xi))\widehat f(\xi). \quad (1) $$

Recall that if $g\in L^1$, then $\widehat g$ is continuous and goes to zero at infinity. As you can see, the function $\widehat{(f-f_\epsilon)}$ goes to zero uniformly as $\epsilon\to 0$ (thanks to the fact that $\widehat f$ goes to zero at infinity); moreover, since $\widehat \phi$ is continuous and goes to zero at infinity, it is clear that the speed of uniform convergence of $\widehat{(f-f_\epsilon)}$ to zero is governed by the decay of the function $\widehat f(\xi)$ as $\xi\to\pm\infty$. This is important because it holds the inequality $$ ||\widehat{(f-f_\epsilon)}||_{L^\infty}\leq ||f-f_\epsilon||_{L^1}, $$ so the speed of convergence of $f_\epsilon$ to $f$ in $L^1$ is at least as bad as the rate of uniform convergence of $\widehat{(f-f_\epsilon)}$ to zero, which we have showed is linked to the decay at infinity of the Fourier transform of $f$ itself.

The problem is that we know that $\widehat f$ goes to zero at infinity, but we do not know much more in general. For instance, I think one can show that exist functions in $L^1$ such that the Fourier transform decays like $\xi^{-\delta}$ with $\delta>0$ arbitrarily small (though I don't know a reference for this, sorry). So in general the rate of convergence of $f_\epsilon$ to $f$ in $L^1$ can be very bad.

I think from the above computations it is clear how the convergence of the mollification to the original function is related to the Fourier transform of the function itself, even for functions that are not in $L^1$.

In your case, though, the Fourier transform of $\chi_{[0,1]}$ is a sinc function, which decays as $\xi^{-1}$, so in principle one could hope for a convergence rate of $f_\epsilon$ of order $\epsilon$ (see formula (1)).

This is in fact true, but to show this one has to use some other properties of the function itself (note for instance that it belongs to any $L^p$). Note that if you ask for the rate of convergence in different norms (e.g. in $L^p$), the rate of convergence could depend on the kind of convergence itself in principle.

To show the convergence rate, let’s assume for the sake of simplicity that we are working in the easier case of the Heaviside function, $f(x)=\chi_{[0,+\infty)}$ (then one can argue by linearity, as $\chi_{[0,1)}(x)=f(x)-f(x-1)$). Let’s compute the mollification $f_\epsilon$:

$$ f_\epsilon(x)=\int f(y) \epsilon^{-1}\phi(\epsilon^{-1}(x-y)) dy=\int_{0}^\infty \epsilon^{-1}\phi(\epsilon^{-1}(x-y))dy = $$ $$ =\int_{0}^\infty \phi(\epsilon^{-1}x-y)dy = f_1(\epsilon^{-1}x). $$ This means that (let me do the calculation for any $L^p$, $p<\infty$) (note that $f(x)=f(\lambda x)$ for any $\lambda>0$) $$ \int |f(x)-f_\epsilon(x)|^pdx= \int |f(\epsilon^{-1}x)-f_1(\epsilon^{-1}x)|^pdx= \int \epsilon|f(x)-f_1(x)|^pdx , $$ that is $$ || f-f_\epsilon ||_{L^p}\leq C\epsilon^{1/p}. $$

As you can see, for $p=1$ we obtain what we hoped for, while for different values of $p$ we have a different rate (note that for $p=\infty$ we do not expect to have convergence at all).

Upon double checking your proof, a question arose. How can you be sure, that the integral $\int \lvert f(x) - f_1(x)\rvert dx$ is finite? The function you chose, namely the indicator on $[0,\infty[$ is not in any $L^p$. I don't think that your linearity argument for $\chi_{[0,1]}$ can fix that, since only the indicator on $[0,\infty[$ will have this scaling invariance, that you use. — Joseph Expo, Nov 20 '22 at 10:20
Good question. In this case it is finite simply because $f-f_1$ is bounded and compactly supported. It is true that $f$ is not in any $L^p,;p<\infty$, but the linearity argument still works precisely because $f-f_1$ is in $L^1$. Try to think about it. Let’s consider $g(x)=f(x)-f(x-1)$. Then it holds $g_\epsilon(x)=f_\epsilon (x)-f_\epsilon(x-1)$ (I have used translation invariance of the convolution), so that $g(x)-g_\epsilon(x)=[f(x)-f_\epsilon(x)]-[f(x-1)-f_\epsilon(x-1)]$. This holds pointwise (or a.e.), but now the two functions in the square brackets are in $L^1$ and so on… — Lorenzo Pompili, Nov 20 '22 at 11:53

Convergence Speed for Approximating a Piecewise Constant Function in $L^1$ with Mollifiers

1 Answers1

Linked