Rigorous proof of the change of coordinates formula for Dirac's delta.

Question

While studying some properties of Dirac's delta distribution $\delta$ in $n$ dimensions, I found that under a coordinate transformation $\mathbf{x}\mapsto \mathbf{y}(\mathbf{x})$, an integral involving $\delta(\mathbf{x})$ is equivalent to the integral in the transformed coordinates involving $|\det\partial_{\mathbf{x}}\mathbf{y}|\delta(\mathbf{y})$, where $\partial_{\mathbf{x}}\mathbf{y}$ is the Jacobian of $\mathbf{y}$ at $\mathbf{x}$. Specifically, the following formula appears in the book I'm reading.

$$\delta(\mathbf{x})=\frac{\delta(\mathbf{y})}{|\det\partial_{\mathbf{x}}\mathbf{y}|}$$

However, this doesn't make much sense to me, for I can't imagine how a change of coordinates can be performed by looking only at the integrand. Formally (tell me if I'm wrong), for a test function $f$ and a diffeomorphism $\mathbf{y}$ such that $\mathbf{y}(K) = \mathbb{R}^n$, one would have that

$$\int_{\mathbb{R}^n} \delta(\mathbf{x})f(\mathbf{x}) d\mathbf{x} =\int_{\mathbf{y}(K)} \delta(\mathbf{x})f(\mathbf{x}) d\mathbf{x} = \int_{K} \delta(\mathbf{y}(\mathbf{x}))f(\mathbf{y}(\mathbf{x})) |\det\partial_{\mathbf{x}}\mathbf{y}|d\mathbf{x}$$

I don't know, however, how this leads to the result, or whether it is valid at all, since the change of variables through the Jacobian is perhaps not applicable in this case.

My question is therefore: how do we prove the initial statement without falling into the sloppy reasoning I've seen around, unfortunately common among physicists, which employs, for example, indefinite integrals? To start with, I don't understand how one can perform a change of coordinates without taking the domain of integration into account.

None of the related answers I've seen on Stack Exchange really helped me, either because they were written without really going into the mathematics behind it, or because they treated a specific situation I could not generalise to my question.

See here on how to properly define the composition of a distribution with a smooth function — SolubleFish, Jun 07 '21 at 16:52

score 2 · Accepted Answer · answered Jun 07 '21 at 18:44

A rigorous proof first requires a rigorous definition. In your case, once the rigorous definition has been written down, it's a matter of abusing notation to get the result you've written down.

Definition.

Let $U,V$ be open subsets of $\Bbb{R}^n$ and $\Phi:U\to V$ be a $C^{\infty}$ diffeomorphism. Given a distribution $F\in \mathcal{D}'(V)$, we define the distribution $\Phi^*F\in \mathcal{D}'(U)$ by setting, for all $\phi\in \mathcal{D}(U)$, \begin{align} \langle \Phi^*F, \phi\rangle&:= \bigg\langle F, (\phi\circ \Phi^{-1})\cdot \left|\det D(\Phi^{-1})\right| \bigg\rangle\tag{$*$}\\ &=\langle |\det D(\Phi^{-1})|\cdot F ,\,\,\phi\circ \Phi^{-1}\rangle. \end{align} We refer to $\Phi^*F$ as the pull-back distribution (the second equality is by definition of multiplication of a distribution by a smooth function).

Several remarks are in order.

By definition, distributions are continuous linear functionals, where continuity is defined relative to the topology on the space of test functions. So, technically, for the above definition to be meaningful, we should first prove that continuity of $F:\mathcal{D}(V)\to\Bbb{C}$ and smoothness of $\Phi:U\to V$ imply continuity of $\Phi^*F:\mathcal{D}(U)\to\Bbb{C}$. This can be done, and it's a standard exercise in unwinding the definition of continuity on the relevant spaces, but I'm not going to write it out here.
Sometimes, we may write $\Phi^*F$ as $F\circ \Phi$ and think of it as a composed distribution, but be warned that this symbol shouldn't be taken too literally. After all, $\Phi:U\to V$ and $F:\mathcal{D}(V)\to\Bbb{C}$, so their composition in the usual sense isn't even defined.
What I wrote above is just a definition, and of course I can define anything I want as long as I don't introduce any contradictions. But of course, I didn't pull this definition out of thin air. This definition comes from the standard procedure of first looking at what happens to composition of functions, and then dualizing the result to obtain a definition for distributions. More explicitly, if $f\in \mathcal{D}(V)$ then $f\circ \Phi\in \mathcal{D}(U)$, and we can think of this as a distribution by integration: for any $\phi\in\mathcal{D}(U)$, \begin{align} \langle f\circ \Phi, \phi \rangle&:=\int_Uf(\Phi(x))\phi(x)\,dx\\ &=\int_Vf(y)\phi(\Phi^{-1}(y))\cdot \left|\det D(\Phi^{-1})_y\right|\,dy\\ &:= \left\langle f, (\phi\circ \Phi^{-1})\cdot |\det D(\Phi^{-1})| \right\rangle \end{align} In other words, we have figured out how composition by $\Phi$ affects functions, so by passing to the duals, we can define it for distributions. This is why the definition above is the way it is.
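The change-of-variables identity behind the definition can be sanity-checked numerically in one dimension. The following is only a sketch under assumptions that are not part of the answer: $U=V=\Bbb{R}$, $\Phi=\sinh$ (so $\Phi^{-1}=\operatorname{asinh}$ and $D(\Phi^{-1})_y = 1/\sqrt{1+y^2}$), bump functions standing in for $f$ and $\phi$, and a plain midpoint rule for the integrals.

```python
import math

def bump(t):
    """Smooth function supported on (-1, 1): exp(-1/(1 - t^2)) inside, 0 outside."""
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))

def midpoint(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Illustrative choices (assumptions, not from the answer): U = V = R, Phi = sinh.
Phi = math.sinh
Phi_inv = math.asinh

def dPhi_inv(y):
    """Derivative of asinh, playing the role of D(Phi^{-1})_y."""
    return 1.0 / math.sqrt(1.0 + y * y)

def f(y):
    """Stand-in for a test function f in D(V)."""
    return bump(y)

def phi(x):
    """Stand-in for a test function phi in D(U), deliberately off-centre."""
    return bump((x - 0.3) / 2.0)

# <f o Phi, phi> computed in x-coordinates ...
lhs = midpoint(lambda x: f(Phi(x)) * phi(x), -3.0, 3.0)
# ... and <f, (phi o Phi^{-1}) |det D(Phi^{-1})|> computed in y-coordinates.
rhs = midpoint(lambda y: f(y) * phi(Phi_inv(y)) * abs(dPhi_inv(y)), -3.0, 3.0)

print(lhs, rhs)
```

The two printed numbers should agree up to quadrature error, which is just the classical substitution rule. The point of the check is that the Jacobian factor lands on the test-function side of the pairing, which is where the definition of $\Phi^*F$ puts it.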

Finally, we can get to how $(*)$ is related to your formula. We think of $\Phi:U\to V$ as a diffeomorphism mapping $x$-coordinates in $U$ to $y$-coordinates in $V$. Originally, $F\in \mathcal{D}'(V)$, so we shall write this as $F(y)$, to indicate that $F$ is a distribution "defined on $y$-coordinates"; of course, writing $F(y)$ is meaningless since distributions are to be evaluated on test functions on $V$, not on individual points of $V$. Next, we shall, by abuse of notation, write $\Phi^*F$ as $F(x)$, because we think of this as "$F$ expressed in $x$-coordinates". Lastly, $D(\Phi^{-1})$ shall be written as $\frac{\partial x}{\partial y}(y)$, because it's the derivative of the $x$-coordinates with respect to the $y$-coordinates (the final $(y)$ is to indicate that this is a function of $y$, i.e. a function on $V$ rather than $U$). So $(*)$, with abuse of notation, can be written as \begin{align} F(x)&=\left|\det \frac{\partial x}{\partial y}(y)\right|\cdot F(y). \end{align} This notation is also abusive because on the LHS the distribution acts on the test function $\phi$ (i.e. a function of $x$), while on the RHS it acts on $\phi\circ \Phi^{-1}$ (i.e. the same function "expressed in $y$-coordinates").

Applying this notation to $F=\delta$, we get exactly \begin{align} \delta(x)&=\left|\det \frac{\partial x}{\partial y}\right|\cdot \delta(y)= \frac{\delta(y)}{\left|\det \frac{\partial y}{\partial x}\right|}, \end{align} where in the second equality we used the inverse function theorem, and more abuse of notation (because I suppress where exactly the derivatives are evaluated, etc.).
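Unwinding $(*)$ for $F=\delta$ gives $\langle \Phi^*\delta,\phi\rangle = \phi(x_0)\,\left|\det D(\Phi^{-1})(0)\right| = \phi(x_0)/\left|\det D\Phi(x_0)\right|$ with $x_0=\Phi^{-1}(0)$. A rough one-dimensional numerical illustration follows. Everything concrete in it is an assumption made for the sketch: $\delta$ is replaced by a narrow Gaussian $\delta_\varepsilon$, the map is $\Phi(x)=\sinh(x)-1$, the function $\phi(x)=e^{-x^2}$ is smooth but not compactly supported, and the integral is truncated to a small window around $x_0$ where the integrand is non-negligible.

```python
import math

def delta_eps(y, eps):
    """Gaussian of width eps with unit integral; a stand-in for delta as eps -> 0."""
    return math.exp(-0.5 * (y / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def midpoint(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Illustrative diffeomorphism of R (an assumption): Phi(x) = sinh(x) - 1.
def Phi(x):
    return math.sinh(x) - 1.0

def dPhi(x):
    return math.cosh(x)

x0 = math.asinh(1.0)  # the unique point with Phi(x0) = 0

def phi(x):
    """Smooth stand-in for a test function (not compactly supported)."""
    return math.exp(-x * x)

eps = 1e-3
# Integral of delta_eps(Phi(x)) * phi(x) dx, restricted to a window around x0;
# outside the window the Gaussian factor underflows to zero.
approx = midpoint(lambda x: delta_eps(Phi(x), eps) * phi(x),
                  x0 - 0.1, x0 + 0.1, 20000)

# Value predicted by the pull-back definition applied to delta.
exact = phi(x0) / abs(dPhi(x0))

print(approx, exact)
```

As `eps` shrinks (with the grid refined accordingly), `approx` should approach `exact`. This is the sense in which "$\delta(\Phi(x))$" picks up the factor $1/|\det D\Phi|$ evaluated at the zero of $\Phi$.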
