It is not possible to write a function representing the uniform distribution, say $\mathcal{U}$, on the real line, $\mathbb{R}$ (in a Bayesian context, this is an improper prior). My question is: is it possible to write a generalized function (aka, a distribution, but I avoid this term as it may cause confusion with the "distribution" used in probability terminology) corresponding to $\mathcal{U}(\mathbb{R}) $?


Additional text to give more context to the question. Just as Dirac's Delta can be represented as a limiting process of gaussians, i.e., $$ \delta(t) = \lim_{\sigma^2 \rightarrow 0} \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{1}{2 \sigma^2} t^2 } $$ in principle, $\mathcal{U}(\mathbb{R})$ could be represented as $$ \mathcal{K}(t) = \lim_{ \sigma^2 \rightarrow +\infty} \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{1}{2 \sigma^2} t^2 } $$ in some sense of the limit. Clearly, it must hold that $$ \int^{+\infty}_{-\infty} \mathcal{K}(t) dt = 1 $$ just as with Dirac's Delta. I do not know if there are technical difficulties that arise in measure theory for this case.
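
A short worked computation may help make "in some sense of the limit" concrete. Writing $g_\sigma$ for the Gaussian density above (the symbols $g_\sigma$ and $\varphi$ are notation introduced here for illustration, not part of the original question), for every fixed $\sigma$ we have $$ \int_{-\infty}^{+\infty} g_\sigma(t)\, dt = 1, \qquad \sup_{t \in \mathbb{R}} g_\sigma(t) = g_\sigma(0) = \frac{1}{\sqrt{2 \pi \sigma^2}} \xrightarrow[\sigma^2 \rightarrow +\infty]{} 0, $$ so the pointwise limit is the zero function and the limit cannot simply be exchanged with the integral. Moreover, for any absolutely integrable $\varphi$, $$ \left| \int_{-\infty}^{+\infty} g_\sigma(t)\, \varphi(t)\, dt \right| \le \frac{1}{\sqrt{2 \pi \sigma^2}} \int_{-\infty}^{+\infty} |\varphi(t)|\, dt \xrightarrow[\sigma^2 \rightarrow +\infty]{} 0. $$ In other words, tested against integrable functions the family tends to zero; the normalization can only survive when testing against functions that are not integrable, such as constants.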

Related question: link. But I would like a bit more detail in an answer.

PseudoRandom
  • If you do a calculation with $\sigma$ and then let $\sigma$ tend to $\infty$, then you get a result which you can interpret as if it belonged to the all-universal distribution. However, if you try with another distribution with a parameter playing a similar role, then the result may be different... – zoli Jan 31 '18 at 14:07
  • The problem, in that case, may lie within the properties of the other distribution (e.g., it may not be everywhere smooth or absolutely integrable). Once you formally define $\delta(t)$, you use $\delta(t)$ right from the start of the calculation. Same thing for $\mathcal{K}(t)$. There is no limiting process involved anymore. After all, $\delta(t)$ is not defined (strictly speaking) that way, so we should expect the same for $\mathcal{K}(t)$. If you have a specific example in mind, you can post it as an answer. – PseudoRandom Jan 31 '18 at 14:43
  • Years ago, I did such a calculation. I will try to recall the details. – zoli Jan 31 '18 at 17:00
  • Would it make sense to use a Cesàro summation type of definition $$\mathbf E[f(\mathcal U)]:=:(\mathcal U,f):=\lim_{N\to\infty}\frac{1}{2N}\int_{-N}^Nf(y)dy.$$ It looks like a positivity-preserving bounded linear functional on $C_b(\mathbb R)$ with norm $1$. This would give you existence of a finitely additive probability measure. Is it wrong or a totally trivial object? ($C_b(\mathbb R):=$ bounded real-valued continuous functions on $\mathbb R$). (Would it be just the 0 measure on $C_0(\mathbb R)$ (functions vanishing at infinity)?) – Rgkpdx Feb 16 '18 at 12:44
  • Let me try to better understand that definition with some questions. In the language of distributions, is $f$ a so-called "test function" ? That is, a function which is infinitely differentiable on the reference space (i.e., $\mathbb{R}$) and vanishes very fast at infinity? Which "norm" are you referring to? Is the measure (you speak of) also $\sigma$-additive when $f$ is a test function? – PseudoRandom Feb 16 '18 at 16:34
  • By the way, I noticed some similarities with another definition. The (average) power of a signal (read: function) $f$ is defined as $$ \mathcal{P}f = \lim_{T \rightarrow +\infty} \frac{1}{2 T} \int_{-T}^{T} |f(t)|^2 dt $$ which is non-trivial (and actually a very interesting quantity). As your definition appears somewhat broader, I don't think triviality is a problem. – PseudoRandom Feb 16 '18 at 17:13
  • @PseudoRandom: yes, $f$ would be a test function; it could be any function in $C_b(\mathbb R)$, the (nonseparable) Banach space of real-valued continuous bounded functions on $\mathbb R$ with the supremum norm. I was trying to use some Riesz representation theorem. Ideally the theorem for $C_0$ in here https://en.wikipedia.org/wiki/Riesz–Markov–Kakutani_representation_theorem, as the dual would be a probability measure. But you need to show that the operator norm is 1. The dual of $C_b(\mathbb R)$ is only the finitely additive measures... – Rgkpdx Feb 16 '18 at 18:53
  • One issue with your proposal is that Dirac-type distributions typically have compact support (or more precisely, are linear functionals on the space of smooth compactly-supported test functions), so roughly speaking they're "localized". This is important, because it allows us to freely integrate by parts and neglect surface terms. What you want is spread out over the entire number line, so it's inherently nonlocal and more difficult to mathematically formalize. – tparker Feb 17 '18 at 04:07
  • "It is not possible to write a function representing the uniform distribution on the real line"- I disagree, take f(x)=1. If this isn't what you had in mind, then you need to be more clear about what you mean by a uniform distribution on the real line. – Simon Segert Feb 17 '18 at 14:44
  • @MikeHawk That isn't a probability density function because it doesn't integrate to 1 over the real line. Your proposed PDF does not correspond to any cumulative distribution function. – tparker Feb 18 '18 at 00:59
  • @tparker You are right in that Dirac's Delta $\delta(t)$ is "localized", which may as well be the defining property of $\delta(t)$, but I see "nonlocality" more as a difficulty in formalization of $\mathcal{K}(t)$, rather than a true conceptual difficulty, as it "trivially" behaves as the "dual" of a "localized" distribution - which, in some sense, should exist anyway. – PseudoRandom Feb 18 '18 at 11:45
  • @Ton Unfortunately, $\sigma$-additivity is fundamental and can't be given up, otherwise it's not a measure (strict definition) anymore. – PseudoRandom Feb 18 '18 at 11:46
  • @PseudoRandom: I did make the lack of sigma-additivity central in my answer. – Rgkpdx Feb 18 '18 at 11:56
  • @Ton: Yes, you are right. My point is that I don't think you need a measure in the first place to formalize the improper uniform, but if you actually go that route, then you should start from some object which is $\sigma$-additive.

    Look at this example. Call $\mathcal{K}(t)$ the improper uniform, with $t \in \mathbb{R}$ a dummy variable (as in $\delta(t)$). Now, what is $$ \int_{\mathbb{R}} \mathcal{K}(t) f(t) dt = ? $$ where $f$ is your favorite well-behaved function, say $e^{- t^2}$. What does your intuition tell you about that?

    – PseudoRandom Feb 18 '18 at 12:11
  • P.S. Note that, for $f(t)=1$, the integral is equal to 1, via the defining property of $\mathcal{K}(t)$. Which means that $\mathcal{K}(t)$, while purely nonlocal in itself, must have something which makes it "collapse" to a point. A bit like wave-functions. – PseudoRandom Feb 18 '18 at 12:24
  • @PseudoRandom: Can you clarify your definition of $\mathcal K$ (as the pointwise limit of the Gaussian is $0$, and the way I would define it would lead to similar shortcomings as my proposal)? An alternative (equivalent?) way could be defining $$\mathbf E[ f(\mathcal U)]:\text{ finite subsets of } \mathbb R \to \mathbb R$$ via $$ A\mapsto \mathbf E f(\mathcal U):=\frac{\sum_{i\in A} f(i)}{|A|},$$ and then taking a good extension, or keeping it like this as a multivalued map $f\mapsto \mathbf E[ f(\mathcal U)]$ (trying to think about an algebraic way to treat infinite events equally). – Rgkpdx Feb 18 '18 at 13:19
  • @Ton: The definition of $\mathcal{K}(t)$ is anybody's guess.

    In an intuitive sense, you have that $\mathcal{K}(t) = dt$ for any $t$, i.e., a function of infinitesimal height whose integral is 1. It may be possible to give sense to $$ \int_{\mathbb{R}} \mathcal{K}(t)dt = \int_{\mathbb{R}} dt dt = \int_{\mathbb{R}} (dt)^2 = 1 $$ which, for now, is only empty formalism.

    – PseudoRandom Feb 18 '18 at 14:07
  • @tparker, of course it doesn't integrate to 1; that is the defining property of an "improper" distribution – Simon Segert Feb 19 '18 at 16:41
  • @MikeHawk I think you are generating confusion here. An unnormalized prior is a function which does not integrate to 1, but differs from a proper PDF merely by a constant, which provably exists. E.g., $e^{-t^2/2}$ is an unnormalized PDF and I can prove to you that there exists a constant which normalizes it. An improper prior is a function which cannot be normalized, no matter how hard you try. That is the key difference. Aside from this, the question asks if there is, not a function, but a generalized function (e.g. Dirac's Delta) which can do the job of $\mathcal{K}(t)$. – PseudoRandom Feb 22 '18 at 15:46
  • The Riesz-Markov-Kakutani theorem loosely says that any non-negative distribution is a measure. So we need not leave the realm of measure theory. Now by noting that $C_b(\mathbb{R})$ is identified with $C(\beta\mathbb{R})$ (here $\beta\mathbb{R}$ is the Stone–Čech compactification of $\mathbb{R}$), your $\mathcal{K}$ may be understood as a weak-* limit point of your choice of approximating probability measures $\{\mathcal{K}_i\}_{i\in I}$. And any such $\mathcal{K}$ must be supported on the set of 'points at infinity' $\beta\mathbb{R}\setminus\mathbb{R}$. – Sangchul Lee Feb 22 '18 at 18:28
  • @PseudoRandom, Thank you for clarifying the distinction between an improper and unnormalized prior. The point I was attempting to make is that the phrase "The uniform distribution on the real line" does not correspond to a well-defined mathematical object, so the question cannot be answered as written without a precise definition of the mysterious quantity $\mathcal{U}(\mathbb{R})$. Of course there are several ways in which the phrase may be figuratively interpreted, one given in your answer and another in my comment, but in the absence of a precise definition, the original question is just one of semantics. – Simon Segert Feb 22 '18 at 20:46

1 Answer

Since no one answered, I would like to post here some useful insights on the matter.

The first, and most important, insight is that the improper uniform is inherently non-local. Distributions like Dirac's Delta, $\delta(t)$, have a local behavior, which is important. This issue makes the mathematical formalization of $\mathcal{K}(t)$ harder than it may appear.

The second insight is an attempted definition. Consider the so-called integral mean (or average) over an interval $[a,b]$, given by $$ \frac{1}{|a-b|} \int^b_{a} f(t) dt $$ which is well-defined as long as the function $f$ is well-behaved (e.g., integrable on $[a,b]$). We can take the limit over a sequence of increasing symmetric intervals as the mean of the function itself, given by $$ \mathbb{E}[f] \triangleq \lim_{N \rightarrow +\infty} \frac{1}{2N} \int^N_{-N} f(t)dt $$ and use this as our definition. In other words, the improper uniform can be thought of as an operator associating a real number to a function, i.e., $\mathbb{E} : \mathcal{F} \rightarrow \mathbb{R}$, where $\mathcal{F}$ is some function space with suitable regularity properties (e.g. Schwartz space). This approach can likely be refined significantly, but I invite its author to post an answer since, as of now, no better definition has been given (unfortunately, the bounty expired).
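
To get a feel for this definition, here is a minimal numerical sketch, not a definitive implementation. The helper name `interval_mean` and the three sample functions are hypothetical choices made for illustration; the limit $N \rightarrow +\infty$ is only approximated by evaluating the mean at a few increasing values of $N$, using a plain midpoint rule from the standard library.

```python
import math


def interval_mean(f, n, steps=100_000):
    """Midpoint-rule estimate of (1 / (2n)) * integral of f over [-n, n]."""
    h = 2.0 * n / steps  # width of each sub-interval
    total = sum(f(-n + (k + 0.5) * h) for k in range(steps))
    return total * h / (2.0 * n)


def constant_one(t):
    # f(t) = 1: the mean is 1 for every N
    return 1.0


def gaussian_bump(t):
    # f(t) = exp(-t^2): integrable, so the mean decays like 1 / N
    return math.exp(-t * t)


def cos_squared(t):
    # f(t) = cos(t)^2: bounded and oscillating, with mean tending to 1/2
    return math.cos(t) ** 2


if __name__ == "__main__":
    for n in (10.0, 100.0, 1000.0):
        print(
            n,
            interval_mean(constant_one, n),
            interval_mean(gaussian_bump, n),
            interval_mean(cos_squared, n),
        )
```

Under these assumptions, the printed values suggest that constants are preserved, integrable functions are sent towards zero, and bounded oscillating functions are sent to their long-run average.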

There is, however, one significant aspect which I think can be improved upon. Intuition tells us that $\mathcal{K}(t)$ should be an infinitesimally thin "carpet", spread over $\mathbb{R}$. So, it makes sense that it should share some properties of the zero-function (the pointwise limit of gaussians for $\sigma^2 \rightarrow +\infty$). This can be accomplished via the following definition $$ \int_{\mathbb{R}} \mathcal{K}(t) f(t) dt = \lim_{\sigma^2 \rightarrow +\infty} \frac{1}{\sqrt{2 \pi \sigma^2}} \int_{\mathbb{R}} e^{-\frac{t^2}{2 \sigma^2}} f(t) dt $$ Let us have a deeper look at this. For $f(t)=1$, we have that the result is $1$, so that $$ \int_{\mathbb{R}} \mathcal{K}(t)dt = 1 $$ is satisfied. Consider now $f(t)=t$, for which the integral on the RHS vanishes for every $\sigma^2$ by odd symmetry, so the LHS is zero. This is, intuitively, correct, as $\mathcal{K}(t)$ should behave like the zero-function and simply make the integral vanish, but not always! We should expect some functions, like $f(t)=c$, for any $c \neq 0$, not to be annihilated. Functions with divergences may also fall into this category. In other words, the integral is zero for some (but not all!) functions $f$, making $\mathcal{K}(t)$ "look like" the zero-function. To be more precise, there exists a function space $\mathcal{F}_0 \subsetneq \mathcal{F}$, where $\mathcal{F} = \mathcal{F}(\mathbb{R})$ is the space of all functions on $\mathbb{R}$ (regular enough to have an integral), such that $$ \forall f \in \mathcal{F}_0: \quad \int_{\mathbb{R}} \mathcal{K}(t) f(t)dt = 0 $$ which means that $\mathcal{K}(t)$ acts like the zero-measure on $\mathcal{F}_0$. But, and this is the key point, it acts nontrivially when other functions ($\not\in \mathcal{F}_0$) come into play.
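
The Gaussian-based definition can also be probed numerically. The following is only a sketch under stated assumptions: the function name `gaussian_mean`, the truncation at $|z| \le 8$, and the fixed value of $\sigma$ are illustrative choices, and the limit $\sigma^2 \rightarrow +\infty$ is approximated by a single large $\sigma$. It uses the substitution $t = \sigma z$, and compares the result for $f(t) = e^{-t^2}$ with the standard closed form $1/\sqrt{1 + 2\sigma^2}$ of that Gaussian integral.

```python
import math


def gaussian_mean(f, sigma, steps=200_000, z_max=8.0):
    """Estimate (1/sqrt(2 pi sigma^2)) * integral of exp(-t^2/(2 sigma^2)) f(t) dt.

    With t = sigma * z this is the expectation of f(sigma * Z) for a standard
    normal Z; the integral is truncated to |z| <= z_max (an assumption that is
    harmless for bounded or slowly growing f) and computed by the midpoint rule.
    """
    h = 2.0 * z_max / steps
    norm = 1.0 / math.sqrt(2.0 * math.pi)
    total = 0.0
    for k in range(steps):
        z = -z_max + (k + 0.5) * h
        total += norm * math.exp(-0.5 * z * z) * f(sigma * z)
    return total * h


def constant_one(t):
    return 1.0


def identity(t):
    return t


def gaussian_bump(t):
    return math.exp(-t * t)


if __name__ == "__main__":
    sigma = 100.0  # stands in for "sigma^2 large"
    print("f = 1        :", gaussian_mean(constant_one, sigma))
    print("f = t        :", gaussian_mean(identity, sigma))
    print("f = exp(-t^2):", gaussian_mean(gaussian_bump, sigma))
    print("closed form  :", 1.0 / math.sqrt(1.0 + 2.0 * sigma**2))
```

In this sketch the constant function keeps its value, $f(t) = t$ is annihilated by symmetry, and $e^{-t^2}$ is sent to a value that shrinks as $\sigma$ grows, consistent with membership in $\mathcal{F}_0$.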

PseudoRandom
  • It is worth noting that $\mu(A)=\int K(t) 1_A(t)dt$ does not correspond to a measure (even assuming the existence of the "function" $K$); indeed for any bounded set $A$, we have $\mu(A)=0$, but since $\mathbb{R}$ can be expressed as a countable union of such sets, the countable additivity axiom does not hold. – Simon Segert Feb 23 '18 at 14:57
  • Can you find a function such that $\mathbb E[f]\neq \int_{\mathbb R}\mathcal K(t)f(t)dt$? – Rgkpdx Feb 23 '18 at 15:31
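
The failure of countable additivity noted in the comment above can be written out explicitly. As a sketch, assume the set function $\mu$ is defined through the symmetric-interval mean from the answer, applied to indicator functions (this choice, and the decomposition of $\mathbb{R}$ into unit intervals $[k, k+1)$, are illustrative assumptions). Then, for every $k \in \mathbb{Z}$, $$ \mu\big([k,k+1)\big) = \lim_{N \rightarrow +\infty} \frac{1}{2N} \int_{-N}^{N} 1_{[k,k+1)}(t)\, dt \le \lim_{N \rightarrow +\infty} \frac{1}{2N} = 0, \qquad \mu(\mathbb{R}) = \lim_{N \rightarrow +\infty} \frac{1}{2N} \int_{-N}^{N} dt = 1, $$ so that $$ \sum_{k \in \mathbb{Z}} \mu\big([k,k+1)\big) = 0 \neq 1 = \mu\Big( \bigcup_{k \in \mathbb{Z}} [k,k+1) \Big). $$ Under these assumptions $\mu$ can be at best finitely additive, which is consistent with the remarks in the comments on the question.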