The misuse of Dirac's $\delta$

Question

In physics and engineering it is common practice to use the Dirac delta distribution to represent "densities" of discrete random variables. It is a very useful construct and you can do many things with it easily.

$$f_{\pmb x}(x)=\sum_{n=1}^{\infty}P(\pmb x = x_n) \,\delta(x-x_n) \;=\; \sum_{n=1}^{\infty}p_n \,\delta(x-x_n)$$

$$E_{f_{\pmb x}}\{\delta(g(\pmb x))\} \;=\; \sum_i \frac{f_{\pmb x}(x_i)}{{|g'(x_i)|}}\enspace, \quad \text{ with } g(x_i)=0 \text{ and } g'(x_i)\neq 0$$

$$\pmb y = g(\pmb x) \quad\Rightarrow \quad f_{\pmb y}(y) \;=\; E_{f_{\pmb y}}\{\delta(\pmb y - y)\} \;=\; E_{f _{\pmb x}}\{\delta(g(\pmb x)-y)\}$$
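
A minimal worked example of the last formula, assuming a hypothetical fair coin $\pmb x$ with values $\pm 1$ and the map $g(x) = x^2$ (both chosen only for illustration):

$$f_{\pmb x}(x) = \tfrac12\,\delta(x+1) + \tfrac12\,\delta(x-1) \quad\Rightarrow\quad f_{\pmb y}(y) = E_{f_{\pmb x}}\{\delta(\pmb x^2 - y)\} = \tfrac12\,\delta(1-y) + \tfrac12\,\delta(1-y) = \delta(y-1)$$

This agrees with the direct observation that $\pmb y = \pmb x^2$ equals $1$ with probability one.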

But mathematicians always say that this is mathematically objectionable or even incorrect, because a density function made up of delta distributions is neither continuous nor integrable. However, the definition of the delta distribution precisely defines its integral. So what is the problem here?

Is there an example where the use of the Dirac delta function can lead to wrong results?

A probability density function should be integrable in order to be defined, but it's unclear to me that it needs to be "continuous". For instance, the uniform distribution is discontinuous. Can you give an example of someone saying that there is a problem with delta pdfs? — John Barber, Oct 25 '17 at 21:51
The problem is that the Dirac delta is not a distribution, because its Lebesgue integral does not exist. Think about it: it's zero almost everywhere, yet its integral is one. This cannot happen with a normal function. Indeed, the Dirac delta is not a normal function but a generalized function, and the integral symbol, when used with the Dirac delta, does not have the same meaning as when used with normal functions (i.e. it is not the Lebesgue integral). Most of the time things “work out” anyway, but you should justify those manipulations more rigorously, and sometimes that's not done in physics — Ant, Oct 25 '17 at 21:51
@Ant I would say that it's a distribution but not a function; there's an important distinction between the two. — Steven Stadnicki, Oct 25 '17 at 22:23
@Ant: The contention is not about Lebesgue integrability but about whether the Dirac delta function is an $\mathbf R\to\mathbf R$ function. There are perfectly good (smooth and everything) functions which are non-integrable. Consider $1/x$ for $x\in(0,\infty)$. — Hans, Oct 25 '17 at 22:27
@StevenStadnicki Yes, that's why I said generalized function. They are also called distributions, but in this context the word could get confused with "probability distribution", which I wanted to avoid :) — Ant, Oct 25 '17 at 22:30
@Hans I somewhat disagree. He's not asking about the Dirac function directly, but rather for examples of the use of Dirac functions. And in fact the Dirac function generally comes up only in terms of its integral with something else, since if you're not working on generalized functions directly, most of the time you only work with its integral :) — Ant, Oct 25 '17 at 22:31
@ant Agreed, but in the first sentence of your comment you say 'The problem is that the Dirac delta is not a distribution'; that's the only bit that I was referencing. Agreed that this is an area where there's a lot of opportunity for confusion around specific terminology, for sure! — Steven Stadnicki, Oct 25 '17 at 22:32
@Ant: The OP IS asking about the Dirac delta function directly, in fact in his very title, his very first sentence and all his examples. His main point is enquiring about the mathematical legitimacy of its definition. You are saying "Dirac delta is not a (probabilistic) distribution", the reason being "its Lebesgue integral does not exist". This is NOT the reason. The Dirac delta function is not a real-to-real function at all. Lebesgue integration of $f$ is premised on $f$ first being a real-to-real (or real-to-complex) function. It is therefore meaningless to talk about Lebesgue integration at all. — Hans, Oct 25 '17 at 23:55
@Ant: The integral involving the Dirac delta is NOT the Lebesgue integral per se but an extension of it, defined rigorously in the theory of distributions or generalized functions, which you also mentioned in your first comment. — Hans, Oct 26 '17 at 00:06

Hans · Answer 1 · 2017-10-26T00:07:36.283

What the mathematicians are saying is not that the Dirac delta function fails to be continuous or integrable, which would first require the object under discussion to be an $\mathbf R\to\mathbf R$ function, but that it is not an $\mathbf R\to\mathbf R$ function at all. The Dirac delta function is nevertheless rigorously defined, not as an $\mathbf R\to\mathbf R$ function but as a linear functional, that is, a linear map from a function space into the real (or complex) numbers. Such functionals are called distributions or generalized functions.

leftaroundabout · Accepted Answer · 2017-10-31T22:41:28.007

The Dirac distribution really is a function – specifically, a functional $$ \tilde\delta : (\mathbb{R} \to \mathbb{R}) \to \mathbb{R}, \qquad \tilde\delta(f) := f(0). $$ That definition is perfectly simple and uncontroversial.

The funny thing is, nobody's actually using it this way! For a reason I find strange, physicists and also many mathematicians actually seem more suspicious about such a simple, but “higher-order” function than about a function on the real axis itself, even if it requires “infinite function values” to work.

What's actually going on with the standard definition is this: the functions $\mathbb{R} \to \mathbb{R}$ form a vector space. If you narrow it down to only functions whose square is integrable over the entire domain, you get the $L^2(\mathbb{R})$ Hilbert space.

One of the nice things in Hilbert spaces is the Riesz representation theorem. It says roughly that a Hilbert space is isomorphic to its dual space; in this case that means the space of linear functionals^† $L^2(\mathbb{R}) \to \mathbb{R}$ is isomorphic to $L^2(\mathbb{R})$ itself. In other words, any square-integrable function has a canonical corresponding functional, and vice versa. These corresponding pairs are always basically given by imitating the integral over the product. For instance, $g(x) = e^{-x^2/2}$ has the corresponding functional $$ \tilde g(f) = \int_\mathbb{R}\!\!\mathrm{d}x \: g(x)\cdot f(x). $$ That choice is canonical because you can reconstruct $g$ from that functional, as the unique unit-norm function which maximises the $L^2$ scalar product. (That this is possible in a Hilbert space – thanks to the completeness property – is the interesting bit about the Riesz representation theorem.)

Naïvely, we could conclude from this that $\tilde\delta$ has a corresponding function $\mathbb{R}\to\mathbb{R}$. It is after all a functional on functions, and then we can as well consider it only on square-integrable ones... what's the problem?

Well, the problem^† is that $L^2(\mathbb{R})$ is not really just an integrability-restriction of the space of functions. It's actually a space of equivalence classes of such functions: when two functions only differ on a Lebesgue null set, they're considered the same element of $L^2(\mathbb{R})$. And that means $\tilde\delta$ isn't actually defined on $L^2(\mathbb{R})$, because if you change the function only on the point 0 you'd get a different result, but from the “same” argument. And that would be your wrong results from naïve use of $\delta$ as a “real-valued function”: if you evaluate it with functions that are tweaked at a single spot, you can get wrong results.
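
The point about tweaking a single spot can be sketched numerically. This is only an illustration under stated assumptions: the names `delta_tilde`, `f`, `f_tweaked` and `l2_distance_sq` are made up here, and a midpoint rule whose nodes never hit 0 stands in for the Lebesgue integral, which ignores single points.

```python
import math


def delta_tilde(f):
    """The evaluation functional f -> f(0)."""
    return f(0.0)


def f(x):
    return math.exp(-x * x)


def f_tweaked(x):
    # Same as f except at the single point x = 0 (a Lebesgue null set).
    return 5.0 if x == 0.0 else math.exp(-x * x)


def l2_distance_sq(u, v, a=-10.0, b=10.0, cells=2000):
    """Midpoint-rule approximation of the squared L^2 distance on [a, b].

    With these defaults the midpoints are symmetric about 0 and never equal 0,
    mimicking the fact that the Lebesgue integral does not see single points.
    """
    h = (b - a) / cells
    return sum((u(a + (k + 0.5) * h) - v(a + (k + 0.5) * h)) ** 2 * h
               for k in range(cells))


print(l2_distance_sq(f, f_tweaked))            # the two functions are not told apart
print(delta_tilde(f), delta_tilde(f_tweaked))  # but f -> f(0) does tell them apart
```

Both functions represent the same element of $L^2(\mathbb{R})$, yet the functional returns two different numbers for them, so it cannot be defined on the equivalence class.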

The reason this isn't usually an issue in physics is the “all functions are continuous” paradigm. Because while every element of $L^2(\mathbb{R})$ contains many functions, each differing only in a null set (e.g. only in discrete points), there is always at most a single continuous such function. So, $\tilde\delta$ is actually well defined as a functional $L^2(\mathbb{R}) \cap \mathcal{C}^0(\mathbb{R}) \to \mathbb{R}$. Then again, that is not a Hilbert space, but it's certainly an actual subset of one, so the physicists are doing ok.

^† To be precise (as Hans reminds me to be), the dual space in question is only the space of bounded linear functionals (or equivalently, continuous linear functionals, though I'd remark that continuity on functionals should not be confused with continuity on corresponding functions). So even if $\tilde\delta$ was a well-defined functional – which in fact you can make it by restricting yourself further to the $H^1$ Sobolev space, in which each equivalence-class has exactly one continuous member – you wouldn't be able to apply Riesz, because the functional would not be bounded, i.e. you would be able to construct a sequence of $L^2$ functions that have all the same $L^2$ norm but give infinitely-growing results of $\tilde\delta$.
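
One such sequence, as a sketch (the Gaussian family $f_n$ below is an assumed example chosen for illustration, not taken from the discussion):

$$ f_n(x) := \sqrt{n}\, e^{-n^2 x^2/2}, \qquad \|f_n\|_{L^2}^2 = n\int_\mathbb{R}\!\!\mathrm{d}x \: e^{-n^2 x^2} = n\cdot\frac{\sqrt{\pi}}{n} = \sqrt{\pi}, \qquad \tilde\delta(f_n) = f_n(0) = \sqrt{n} \to \infty. $$

Every $f_n$ is smooth and has the same $L^2$ norm, while the values $\tilde\delta(f_n)$ grow without bound.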

I doubt "also many mathematicians actually seem more suspicious about such a simple, but 'higher-order' function". It is the first concept in algebra and particularly in functional analysis. There are also quite a few false remarks here. You say "the space of linear functionals $L^2(\mathbb{R}) \to \mathbb{R}$ is isomorphic to $L^2(\mathbb{R})$ itself." This is not true. It is only true for continuous functionals. "We could follow from that that $\tilde\delta$ has a corresponding function $\mathbb{R}\to\mathbb{R}$. It is after all a functional on functions," — Hans, Oct 26 '17 at 01:03
This is the consequence of the previous false claim. It is a functional but not a continuous one on $L^2$. This implies your claim that the culprit of $\tilde\delta$ not having a real function representation is the null set is irrelevant or false. I do not see the point of mentioning Riesz representation theorem here in discussing the Dirac delta. — Hans, Oct 26 '17 at 01:10
@Hans the point of mentioning the null set stuff is that it means $\tilde\delta$ is not a functional on $L^2$ at all, which is sufficient to invalidate applying Riesz. The point of nevertheless mentioning Riesz is that it motivates in general treating functionals as if they were $L^2$ functions, and vice versa, and therefore suggests that it's not completely unreasonable to think of $\delta$ as a real-valued function. It's just that when making this rigorous, you run into the problems mentioned. (I chose not to mention continuity because of the potential for confusion with continuous ℝ-functions.) — leftaroundabout, Oct 26 '17 at 08:58
Wow thanks for your elaborate answer. I figure the delta distribution must only be used together with continuous functions, because it makes the value of a function at a "null set" matter. — Daniel Frisch, Oct 26 '17 at 15:23
Yes, it's a good rule of thumb: “$\int\limits_a^b\!\mathrm{d}x\: \delta(x)\cdot f(x)$ is defined for any continuous $f$”. — leftaroundabout, Oct 27 '17 at 08:58
You're claiming that two elements of the same equivalence class in $L^2(\mathbb{R})$ only differ in a discrete set of points. This is wrong, because discrete sets in $\mathbb{R}$ are at most countable and there are sets of measure 0 that are uncountable. You also seem to claim that every equivalence class in $L^2(\mathbb{R})$ has a continuous element, which is false too. — user159517, Oct 31 '17 at 19:08
@user159517 yes, “discrete set” is only a special case, and I meant every equivalence class in $H^1(\mathbb{R})$. — leftaroundabout, Oct 31 '17 at 22:42

score 0 · Answer 3 · answered Oct 27 '17 at 11:40

I am very happy to consider $\delta$ as a function, without having to mess with functional analysis that drives intuition away. Until I stumble upon the square $(\delta(x))^2$, which is a suspicious object: indeed, a single Internet search reveals trouble. $^{[1]}$

There is also another issue: if $\phi\colon \mathbb R^n\to \mathbb R$, I would be very happy to consider $\delta(\phi(x_1\ldots x_n))$. That's another case in which the $\delta$ "function" does not behave like a function at all.

Note [1]. At a more down-to-earth level, Note 2 in this answer shows that $\delta$ cannot be considered as an element of $L^2(\mathbb R)$, not even in a "weak" sense, so $\delta(x)^2$ should have an infinite integral, to begin with.

The downvoter did not get the irony in the first sentence. — Giuseppe Negro, May 08 '18 at 10:08

score 0 · Answer 4 · answered Oct 31 '17 at 18:39

It might help to analogize somewhat to CS (assuming you're familiar enough with CS to understand this, of course). The "normal" use for functions is to provide an evaluation value: f(x) = something. You call f, you pass the parameter x, and you get some value returned. However, more generally/abstractly, you can think of a function as an object with an evaluation method: f.evaluate(x) = something. Depending on the context in which functions are being used, there can then be further methods: f.taylor(n), for instance, might return the nth coefficient in the Taylor expansion. If you're doing combinatorics, this might be all you're interested in: if f is a generating function, then you might not care at all what it actually evaluates to. And once you stop caring about that, then there are going to be objects that do have what you care about, but don't have what a "true" function has. For instance, if f.taylor(n) = nⁿ, then its radius of convergence is zero, and it is therefore undefined everywhere but zero. More precisely, its evaluation is undefined everywhere but zero. Its Taylor coefficients are still perfectly well-defined.
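
A rough sketch of that idea in Python, with the caveat that the class name `FormalSeries` and the helper `partial_sum` are invented here for illustration; only `taylor(n)` and the coefficients nⁿ come from the text above:

```python
class FormalSeries:
    """A function-like object known only through its Taylor coefficients."""

    def __init__(self, coeff):
        self.coeff = coeff  # coeff(n) -> n-th Taylor coefficient

    def taylor(self, n):
        return self.coeff(n)

    def partial_sum(self, x, terms):
        # Attempted evaluation: sum of the first `terms` terms of the series at x.
        return sum(self.coeff(n) * x ** n for n in range(terms))


f = FormalSeries(lambda n: n ** n)  # Python evaluates 0 ** 0 as 1

print([f.taylor(n) for n in range(5)])   # coefficients are perfectly well defined
print(f.partial_sum(0.0, 10))            # evaluation works at x = 0
print([f.partial_sum(0.1, t) for t in (10, 20, 40)])  # partial sums blow up elsewhere
```

The `taylor` method always returns a sensible answer, while the partial sums at any nonzero `x` keep growing as more terms are added, so there is nothing for an `evaluate` method to return.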

So we can have a class that has been generalized from the traditional concept of a “function”, but now is no longer required to have something integral to that concept. Now, consider how functions are used in probability. Generally, a probability density function isn’t used for its evaluation method; the PDF gives the probability “density” at a point, but the actual probability “mass” at any point is zero. It’s only over a non-zero-measure set that a PDF has non-zero probability mass. So what we really need is a CDF method: f.cdf(a,b) gives the probability mass over the interval (a,b). So again we have a class that looks a lot like “functions”, and is often given in function form, e.g. f(x) = e^(-x). As long as we’re in this abstract function-like class, there’s no problem including the delta “function”. But when we treat it as actually being a function, that can cause problems.
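
Here is a minimal sketch of such a function-like class, under some assumptions of mine: the interval method is called `mass(a, b)` rather than `cdf(a, b)`, it uses the half-open interval (a, b] with a ≤ b, and the class names `Exponential` and `PointMass` are hypothetical.

```python
import math


class Exponential:
    """Density f(x) = e^(-x) on x >= 0: pointwise values and interval masses both exist."""

    def evaluate(self, x):
        return math.exp(-x) if x >= 0 else 0.0

    def mass(self, a, b):
        # Probability of the interval (a, b], assuming a <= b.
        lo, hi = max(a, 0.0), max(b, 0.0)
        return math.exp(-lo) - math.exp(-hi)


class PointMass:
    """Dirac delta at x0: interval masses are well defined, pointwise values are not."""

    def __init__(self, x0):
        self.x0 = x0

    def mass(self, a, b):
        # All of the probability sits at x0.
        return 1.0 if a < self.x0 <= b else 0.0

    def evaluate(self, x):
        raise NotImplementedError("a point mass has no pointwise density value")


for dist in (Exponential(), PointMass(0.5)):
    print(type(dist).__name__, dist.mass(0.0, 1.0))
```

Code that only ever calls `mass` can treat both objects alike; code that calls `evaluate` works for the exponential density and fails for the point mass, which is the distinction the paragraph above draws.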

So what are some cases where it can cause problems? Well, obviously trying to evaluate it at 0 causes a problem. Since it’s not continuous, interchanging limits that involve the delta function can cause problems. Applying the Fundamental Theorem of Calculus causes problems (the FTC basically says that the derivative of the integral is equal to the original function). Taking the Fourier Transform does not itself cause problems, but assuming that the result will be in L² does. More generally, one has to be more careful when doing normalization when dealing with delta functions.

I see where you're going with this, but IMO it doesn't make anything clearer to bring OO terminology in here. All those “methods” are also just functions, just with different domains from the “conceptual” $\mathbb{R}\to\mathbb{R}$ function. Rather than talking about different “evaluation methods”, you would do better to properly specify, once, what the domain and codomain are on which the thing is well-defined. — leftaroundabout, Oct 31 '17 at 22:50
