The term you're looking for is distribution theory. In the language of distributions, it is extremely simple to make the Dirac delta "function" rigorous, and to prove the aforementioned properties.
Here's the basic notion of a distribution:
A distribution is a continuous linear map from a set of nice functions (called "test functions") to $\mathbb{R}$.
Notice, by the way, that this means distributions are actually honest-to-goodness functions. However, they're functions that eat other functions, which makes them somewhat different from, say, functions on the real line. For one thing, it's not immediately clear how to define a "derivative" or much of anything else. Once we look at the details, we'll find a way around this pretty quickly.
When we pick different sets of test functions, we get different notions of "distribution." To begin with, let's choose our space of test functions $D$ to be the set of infinitely differentiable functions $\mathbb{R}^d \to \mathbb{R}$ that have compact support (that is, we require the functions to be zero except on some compact set). We need some topology on $D$ in order to make sense of the term "continuous." (If you're not familiar with topologies and convergence, skip the next sentence for now.) The topology on $D$ is usually given by specifying what convergence means on $D$: we say that a sequence of elements $\varphi_k$ in $D$ converges to $\varphi$ as $k \to \infty$ if and only if every derivative of $\varphi_k$ converges uniformly to the corresponding derivative of $\varphi$, and all the $\varphi_k$ have supports contained in a common compact set.
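For concreteness, the standard example of such a test function (taking $d = 1$) is the bump function
$$
\varphi(x) =
\begin{cases}
e^{-1/(1-x^2)} &(|x| < 1)\\
0 &(|x| \geq 1)
\end{cases}
$$
which is infinitely differentiable, even at $x = \pm 1$, and supported in $[-1, 1]$.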
An example of a distribution is the map $D \to \mathbb{R}$ given by
$$\varphi \mapsto \varphi(0)$$
You can check that this is a continuous linear map. This map is the Dirac delta "function" $\delta$.
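If code makes this clearer, here is a minimal sketch of this first example (Python is just my choice of illustration language; of course, the code doesn't enforce linearity or continuity, which are properties you check on paper):

```python
import math

# A distribution, concretely: it eats a test function and returns a number.
# The Dirac delta is simply "evaluate at 0".
def delta(phi):
    return phi(0.0)

print(delta(math.cos))  # 1.0, since cos(0) = 1
```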
Another example: Say we have a locally integrable function $f: \mathbb{R}^d \to \mathbb{R}$. Then we can define another distribution
$$\varphi \mapsto \int f(x) \varphi(x) dx$$
Now this is linear in $\varphi$, and is continuous. I'll write $(f, \varphi)$ for this distribution.
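In the same spirit, here's a rough numerical sketch of the pairing $(f, \varphi)$: a plain Riemann sum on a finite grid stands in for the integral over $\mathbb{R}$, and the Gaussian stands in for a test function (it isn't compactly supported, but it decays fast enough for a numerical illustration):

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 100001)  # finite grid standing in for the real line
dx = x[1] - x[0]

def pair_with(f):
    """The distribution (f, .): integrate f against a test function."""
    fx = f(x)
    return lambda phi: np.sum(fx * phi(x)) * dx  # Riemann-sum quadrature

F = pair_with(np.abs)              # the distribution (|x|, .)
print(F(lambda t: np.exp(-t**2)))  # ≈ 1, since ∫ |x| e^{-x^2} dx = 1
```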
Perhaps somewhat confusingly, when $F$ is a distribution (not a locally integrable function like $f$), we often reuse the same pairing notation and write $(F, \varphi)$ for the distribution $F$ applied to $\varphi$.
Keep in mind that the first example above cannot be written in the form of the second example, i.e. as integration against a locally integrable function. Nonetheless, the integral notation is often used for it anyway, particularly in physics: $\int \delta(x) f(x) dx = f(0)$. This is a pretty common sleight of hand: we pretend that distributions are given by integrating against a nice function even though not all distributions can be written this way.
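One reason this sleight of hand is so harmless is that $\delta$ is a limit (in the sense of distributions) of genuine functions: for the Gaussians $g_\epsilon(x) = \frac{1}{\sqrt{2\pi}\,\epsilon} e^{-x^2/2\epsilon^2}$, one has $\int g_\epsilon(x) \varphi(x) dx \to \varphi(0)$ as $\epsilon \to 0$. Here is a quick numerical illustration of that standard fact (same grid-based caveats as the sketch above):

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
phi = np.cos(x)  # stand-in test function with phi(0) = 1

for eps in [1.0, 0.1, 0.01]:
    g = np.exp(-x**2 / (2 * eps**2)) / (np.sqrt(2 * np.pi) * eps)
    print(eps, np.sum(g * phi) * dx)  # tends to phi(0) = 1 as eps -> 0
```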
Now we want to define a notion of "derivative" for distributions. Since a distribution is a function from a space of functions to the real numbers, it's not immediately clear how to do this. Let's try that aforementioned sleight of hand: consider the distributions of the form $\varphi \mapsto \int f(x) \varphi(x) dx$, where $f$ is smooth enough to integrate by parts.
From the usual integration-by-parts formula of ordinary calculus (applied once for each of the $|\alpha|$ derivatives in the multi-index $\alpha$),
$$\int \partial_x^{\alpha} f(x) \varphi(x) dx = (-1)^{|\alpha|} \int f(x) \partial_x^{\alpha} \varphi(x) dx$$
(Note that the usual boundary terms in the integration-by-parts formula go away because $\varphi$ has compact support.)
To put this back into the notation from above: $(\partial_x^{\alpha} f, \varphi) = (-1)^{|\alpha|} (f, \partial_x^{\alpha} \varphi)$. So this suggests a way to define "differentiation": let's take this identity as the definition.
That is, for any distribution $F\colon D \to \mathbb{R}$, we define the distributional derivative $\partial_x^{\alpha} F$ by $$(\partial_x^{\alpha} F, \varphi) := (-1)^{|\alpha|} (F, \partial_x^{\alpha} \varphi)$$
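Continuing the numerical sketch from before, here is what this definition looks like in code, checked against a case where we already know the answer: for smooth $f$, the distributional derivative should agree with the classical one (differentiating the test function numerically is purely an implementation shortcut):

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 100001)
dx = x[1] - x[0]

def pair_with(f):  # as in the earlier sketch
    fx = f(x)
    return lambda phi: np.sum(fx * phi(x)) * dx

def derivative(F):
    """(F', phi) := -(F, phi'), the definition above with |alpha| = 1."""
    return lambda phi: -F(lambda t: np.gradient(phi(t), t))

phi = lambda t: np.exp(-t**2)              # fast-decaying stand-in test function
print(derivative(pair_with(np.sin))(phi))  # -(sin, phi')
print(pair_with(np.cos)(phi))              # (cos, phi): should match
```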
For example, let's consider the distribution given by (integrating against) the Heaviside function (we're taking $\mathbb{R}^d$ in the definition of $D$ to be $\mathbb{R}^1$):
$$
H(x) =
\begin{cases}
1 &(x >0)\\
0 &(x\leq 0)
\end{cases}
$$
As in the second example, the distribution defined by $H$ is $(H, \varphi) = \int H(x) \varphi(x) dx$. As an exercise, compute the distributional derivative of $H$ from the definition (the answer is at the bottom).
So to recap: A distribution is a continuous linear map from a set of nice functions (called "test functions") to $\mathbb{R}$. A dirty trick we will use again and again in distribution theory is to systematically confuse a function $f$ and the distribution given by integrating against it. Using this trick, we can use relatively basic mathematics to understand what certain notions like integration ought to mean for distributions, and then take this to be the definition. I've shown how to do this with (partial) derivatives; you can do the same with convolutions, adjoints, and more.
The above is just meant to give you a small flavor of the subject, so I won't go any further, and most good analysis texts should have more details if you seek them. A readable source (though not one I personally favor) is Stein and Shakarchi's Functional Analysis, Chapter 3.
Answer:
The distributional derivative of this is:
$$(H', \varphi) = - (H, \varphi') = -\int H(x) \varphi'(x) dx = - \int_{0}^\infty \varphi'(x) dx = \varphi(0) - \lim_{R \to \infty} \varphi(R) = \varphi(0)$$
(The limit vanishes because $\varphi$ has compact support.)
Notice that the Dirac delta "function" (distribution) applied to $\varphi$ gives precisely the same thing! (Hence the common confusing claim in intro physics classes: "the Dirac delta is just the derivative of the Heaviside function.")
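If you like, you can also confirm this numerically, in the same illustrative style as the sketches above:

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 100001)
dx = x[1] - x[0]

H = (x > 0).astype(float)  # samples of the Heaviside function
phi = np.exp(-x**2)        # stand-in test function with phi(0) = 1
dphi = np.gradient(phi, x)

print(-np.sum(H * dphi) * dx)  # -(H, phi') ≈ 1.0 = phi(0) = (delta, phi)
```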