11

If we pick a number $x$ at random from $[0,100]$, we would naturally say that the probability that $x>50$ is $1/2$, right?

This is because we assumed that randomly meant picking a point uniformly from $[0,100]$. But, since $f(r)=r^2$ is a bijection $[0,10] \rightarrow [0,100]$, we could also pick a number $r$ uniformly from $[0,10]$, set $x=r^2 \in [0,100]$, and let that be our random experiment. This time $x>50$ only for $r > \sqrt{50} \approx 7.07$, so the probability drops to roughly $0.29$.
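To make the gap concrete, here is a minimal Monte Carlo sketch (plain Python; the setup is mine, just to contrast the two experiments):

```python
import random

N = 1_000_000

# Experiment 1: pick x uniformly from [0, 100]
p_direct = sum(random.uniform(0, 100) > 50 for _ in range(N)) / N

# Experiment 2: pick r uniformly from [0, 10], then set x = r^2
p_squared = sum(random.uniform(0, 10) ** 2 > 50 for _ in range(N)) / N

print(p_direct)   # ~0.5
print(p_squared)  # ~0.293, i.e. 1 - sqrt(50)/10
```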

In this case we would agree that the first way of choosing $x$ looks a lot more natural. So we would equally agree that it is a successful way of modeling the experiment ''pick a random number from $[0,100]$''.

Sometimes we can't even agree on that! For example, in Bertrand's paradox we are asked to pick a random chord of a circle and compute the probability that it is longer than the side of the inscribed equilateral triangle. The point is that there are several (a priori) natural ways of choosing the chords (three of them are nicely described here) which, of course, produce different probabilities.
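For instance, a quick simulation of the three classical constructions (a sketch of my own; the function names are not standard terminology) shows the different answers emerging:

```python
import math, random

N = 1_000_000
R = 1.0
side = math.sqrt(3) * R   # side of the inscribed equilateral triangle

def endpoints():          # two endpoints uniform on the circle
    a = random.uniform(0, 2 * math.pi)
    b = random.uniform(0, 2 * math.pi)
    return 2 * R * math.sin(abs(a - b) / 2)

def radial():             # midpoint's distance from the center uniform
    d = random.uniform(0, R)
    return 2 * math.sqrt(R * R - d * d)

def midpoint():           # midpoint uniform over the disk's area
    d = R * math.sqrt(random.random())
    return 2 * math.sqrt(R * R - d * d)

for chord in (endpoints, radial, midpoint):
    p = sum(chord() > side for _ in range(N)) / N
    print(chord.__name__, round(p, 3))   # ~0.333, ~0.5, ~0.25
```

The three estimates hover around $1/3$, $1/2$ and $1/4$: that disagreement is exactly the paradox.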

How and when can we consider something truly random? Does it even make sense to say something is truly random, or is it more a matter of agreement?

Is there any convention in the mathematical community about these issues?

Could we say the common notion of randomness relates to the notion of a uniform distribution?

Are there any successful approaches to modeling randomness? (Ones that would let us decide whether a certain distribution represents randomness in the sense of being a uniform distribution.)

For example, in the comments it is said: ''One can show [using Kolmogorov complexity] that a number in $[0,1]$ is random with probability $1$ under the uniform distribution, so it coheres well with other notions.''

Z. L.
    Google "Kolmogorov complexity". – Michael Greinecker Jan 13 '13 at 16:31
  • Wikipedia:''Kolmogorov randomness – also called algorithmic randomness – defines a string (usually of bits) as being random if and only if it is shorter than any computer program that can produce that string.'' That seems a nice definition, is it useful? – Z. L. Jan 13 '13 at 16:53
  • Your observation is in a way turned around to say a pseudo-random number generator is a procedure that is equally likely to produce a ''random'' number in equal-length intervals. – Maesumi Jan 13 '13 at 17:01
  • @ZangoLotino There are certainly a lot of applications. One can also show that a number in $[0,1]$ is random with probability $1$ under the uniform distribution, so it coheres well with other notions. – Michael Greinecker Jan 13 '13 at 17:01
  • Ok, thank you. Is Kolmogorov randomness the best mathematical object we have for modeling the human notion of a random string? – Z. L. Jan 13 '13 at 17:05
  • @ZangoLotino It is the only successful one I'm aware of (I think the probabilist Reichenbach tried to do something related, without success). – Michael Greinecker Jan 13 '13 at 17:09
  • I would look up Kolmogorov complexity and Shannon entropy. You might also want to look into DIEHARD and TESTU01 for testing, especially as it relates to the great need for randomness in cryptographic applications. Regards – Amzoti Jan 13 '13 at 17:32
  • If @MichaelGreinecker or Amzoti would write an answer with a little bit of explanation of that... that is the kind of answer I was expecting, instead of ''you need to give a distribution along with it'', which I thought I made clear I already knew (by my first example). – Z. L. Jan 13 '13 at 17:44
  • highly related: http://math.stackexchange.com/questions/15055/in-a-family-with-two-children-what-are-the-chances-if-one-of-the-children-is-a/15085#15085 – BlueRaja - Danny Pflughoeft Jan 13 '13 at 19:25
  • I don't understand how Kolmogorov complexity is supposed to help you pick numbers in $[0,100]$ or chords in a circle in a uniform way. In other words, I don't understand the connection between your motivating examples and your comment that Kolmogorov complexity is the kind of answer you were looking for. –  Jan 13 '13 at 19:45
  • @RahulNarain Ok, maybe I have to re-write everything a bit. My examples are just for showing that random is a concept that, by itself, doesn't have a mathematical meaning. I am looking for approaches to the modelling of the notion of random (or some variants). As Michael Greinecker pointed out: ''There are certainly a lot of applications [of Kolmogorov complexity]. One can also show that a number in $[0,1]$ is random with probability $1$ under the uniform distribution, so it coheres well with other notions.'' That is a good reason why it is a nice approach/model. – Z. L. Jan 13 '13 at 19:56
  • "Have we achieved a mathematical model for randomness?" Yes. It's called probability theory. – Christian Blatter Jan 13 '13 at 20:10
  • I don't know if the edit is any better. Is the entire field of probability theory unsuccessful? Of course any nontrivial distribution represents randomness if it is the distribution of a random variable. I really think you mean "uniform distribution" when you say "random". The sum of two dice rolls has a higher probability of coming up 7 than of coming up 2. Is it not random? –  Jan 13 '13 at 20:39
  • Ok, I agree with your observation. Restated the question... – Z. L. Jan 13 '13 at 20:49
  • @ChristianBlatter, I think Bayes, Laplace, Cox, Polya and many others would take exception to your statement. A recent issue of the journal Stat. Sci., for example, analyzed various ways in which frequentist and Bayesian statistical methods (which deal with uncertainty, if not "randomness") diverge in results, despite being given the same data. – alancalvitti Jan 13 '13 at 21:38
  • @alancalvitti I doubt the Great Ones you cite would have taken exception to that. Would you be confusing the true fact that some given real-life situations are liable to several modelizations with the equally true fact that all of these modelizations pertain to probability theory as delineated by Uncle K in 1933, for example? – Did Jan 14 '13 at 17:48
  • @ChristianBlatter, no I'm not confusing it: although Bayes theorem itself is a direct consequence of the product rule, the Bayesian logical framework is not based on measure theory in contrast to K's development. Rather it's an extension of laws of logic such as weak syllogism in place of classical Modus Ponens, see eg Jaynes' "Probability Theory" 2003. Jaynes was a physicist and in his book says the bulk of measure theory (eg, absolute continuity) is basically irrelevant to statistics. – alancalvitti Jan 14 '13 at 18:23
  • @alancalvitti (You misattributed my last comment.) Bayesians never use continuous parameters? – Did Jan 14 '13 at 19:17
  • @did, my bad: (can a moderator please change ChristianBlatter --> did in my previous comment?)... Bayesians use continuous and discrete parameters: the distribution of the fairness of a coin would be a continuum, eg [0,1]. They also use discrete distributions. Jaynes in his book writes that there's no difference between model selection and parameter estimation, since in any indexable set of models the index can be considered a parameter – alancalvitti Jan 14 '13 at 19:21
  • @ChristianBlatter, pace - I accidentally misattributed you in a previous comment. – alancalvitti Jan 14 '13 at 19:22
  • @alancalvitti How to use continuous parameters without measure theory? – Did Jan 14 '13 at 19:24
  • @did, I'm not sure what you mean by continuous: topological spaces deal with continuity, measure spaces deal with measures. There are measure-theoretical concepts like absolute continuity that are central in functional analysis - but as I was saying, Jaynes says in applied statistics they make no difference (I would agree as all data is finite). – alancalvitti Jan 14 '13 at 19:27
  • @alancalvitti Well, continuous like this, of course. Ergo, measure theory. For some perspective on Jaynes' always interesting and sometimes extreme views, you might try this. – Did Jan 14 '13 at 19:57
  • @did, great quote by Diaconis in that review: "I had to fight to keep the measure theory requirement in Stanford's statistics graduate program. The fight was lost at Berkeley." And I agree that the book is not "modern" - statistical methods are accelerating. However, re your Q: Bayesians use continuous distributions, but stay away from measure spaces (K's). See, e.g., Sivia's Data Analysis: a Bayesian Tutorial, 2006. – alancalvitti Jan 14 '13 at 20:05
  • @alancalvitti From the Preface by Sivia: What we were not told in our undergraduate lectures is that there is an alternative approach to the whole subject of data analysis which uses only probability theory (my emphasis). In one sense, it makes the topic of statistics entirely superfluous. In another, it provides the logical justification for many of the prevalent statistical tests and procedures, making explicit the conditions and approximations implicitly assumed in their use. This book is intended to be a tutorial guide to this alternative Bayesian approach. – Did Jan 14 '13 at 20:22
  • @did, I have that book. When you look through it, you will find distributions of course, including continuous ones. But the methods diverge from K's methods. The central concepts are priors, conditional and posterior distributions, not algebra of events and measure spaces. – alancalvitti Jan 14 '13 at 20:38
  • @alancalvitti And continuous conditional distributions use very much measure theory. Yawn... Well, since each had a chance to express their view, I think I will take my leave. – Did Jan 14 '13 at 20:43
  • @did, ok I give up too. In any case, why did you mention continuous distributions? Measure theory applies as well to discrete distributions. – alancalvitti Jan 14 '13 at 21:19

5 Answers

8

One way to interpret your motivating examples is not that the word random is ill-defined (all of probability theory would disagree with that), but that you want a mathematically natural characterization and generalization of the notion of a uniform distribution. In that case, the answer could be the Haar measure on Lie groups (among other things). This is a measure that is invariant under the action of the group, and if you restrict it to a compact set you can normalize it to form a probability distribution.

For example, the real numbers form a Lie group under addition, and the corresponding Haar measure is nothing but the usual uniform measure on $\mathbb R$, which restricted to $[0,100]$ leads to the uniform distribution on the same. We can tell that the distribution produced by uniformly picking a number in $[0,10]$ and squaring it is not uniform, because it is not invariant under addition (the probability of $[20,30]$ is not equal to the probability of $[20,30]+40 = [60,70]$).
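That failure of invariance is easy to check numerically (a quick sketch; the intervals are the ones from the paragraph above, the rest is my own scaffolding):

```python
import random

N = 1_000_000
xs = (random.uniform(0, 10) ** 2 for _ in range(N))

p1 = p2 = 0
for x in xs:
    p1 += 20 <= x <= 30
    p2 += 60 <= x <= 70
print(p1 / N, p2 / N)   # ~0.100 vs ~0.062: not translation invariant
```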

Similarly, when dealing with lines in the plane, the relevant Lie group is the Euclidean group of rigid motions of the plane, which comes equipped with a Haar measure. This induces a measure on the space of lines which is invariant to translation and rotation. When restricted to the lines that intersect a given circle, it gives you something you could objectively call "the" uniform distribution over chords of the circle. This corresponds to picking the angle and the distance from the center uniformly, and matches Jaynes' solution using the principle of maximum ignorance.
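Here is a minimal sketch of that distribution, assuming the standard $(\theta, p)$ parameterization of lines (angle and signed distance from the center) with $d\theta \, dp$ as the invariant measure; by rotational symmetry only $p$ matters for chord length:

```python
import math, random

N = 1_000_000
R = 1.0
side = math.sqrt(3) * R   # side of the inscribed equilateral triangle

# Under the motion-invariant measure, the signed distance p of a line
# from the circle's center is uniform on [-R, R]; the angle can be
# dropped because rotations do not change chord length.
longer = sum(2 * math.sqrt(R * R - p * p) > side
             for p in (random.uniform(-R, R) for _ in range(N)))
print(longer / N)   # ~0.5, Jaynes' answer to Bertrand's question
```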

The field of integral geometry deals with exactly this sort of thing: the properties of geometrical objects under measures that are invariant to the symmetry group of the geometrical space. It has many interesting results such as the Crofton formula, stating that the length of any curve is proportional to the expected number of times a "random" line intersects it. Of course, this could not be a theorem without precisely formalizing what it means for a line to be random.
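As a toy illustration of the Crofton formula (my own setup; the window half-width $W$ is an arbitrary choice that must contain the curve), one can recover the length of a unit segment from random line crossings:

```python
import math, random

# Crofton formula sketch: with lines parameterized by angle t in [0, pi)
# and signed distance p, under the measure dt dp, the measure of lines
# meeting a curve (counted with multiplicity) equals twice its length.
N = 1_000_000
W = 2.0                                # sample p in [-W, W]
ax, ay, bx, by = -0.5, 0.0, 0.5, 0.0   # test curve: a segment of length 1

hits = 0
for _ in range(N):
    t = random.uniform(0, math.pi)
    p = random.uniform(-W, W)
    c, s = math.cos(t), math.sin(t)
    # endpoints on opposite sides of the line x*cos(t) + y*sin(t) = p
    # means the line crosses the segment
    hits += (ax * c + ay * s - p) * (bx * c + by * s - p) <= 0

total_measure = math.pi * (2 * W)
print(hits / N * total_measure / 2)    # estimated length, ~1.0
```

Doubling the window $W$ halves the hit rate but doubles the sampled measure, so the length estimate is unchanged; that stability is the invariance at work.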

  • Thank you for your answer, I was looking for such things. – Z. L. Jan 13 '13 at 20:21
  • +1 for a nice summary answer. Be aware however that in practice, the symmetry group (or more generally, inverse semigroup or partial symmetries) of the system under study is not at all easy to characterize. By the way, even the notion of averaging has been studied from an invariance point of view - this is the Chisini mean approach. – alancalvitti Jan 13 '13 at 21:42
  • @ZangoLotino, related to Rahul's answer, check: http://en.wikipedia.org/wiki/Jeffreys_prior – alancalvitti Jan 13 '13 at 21:43
  • Thank you all, I wasn't expecting such feedback. If this goes on, I will do some kind of summary in a few days ;) – Z. L. Jan 13 '13 at 21:49
7

When someone says 'random' there should be a distribution that goes along with it. In your example, to pick a random $x$ from $[0,100]$, it is implied that you pick $x$ from a uniform distribution. Of course, as you pointed out, using a different distribution will give you a different result.

The point is, 'random' needs a distribution to define it.

timidpueo
  • So the answer is that there is no such thing as randomness? Since we need a distribution every time we use that word, we wouldn't need to use it. So we have failed in modeling the notion of random, or we are confused about it. – Z. L. Jan 13 '13 at 17:19
  • Rather, we often forget to mention the words uniformly distributed when introducing a random variable. Similarly, the word independent is often forgotten when talking about two random variables. But that is a matter of imprecision in linguistic expression. The concept of random variable has a very precise mathematical definition, but cannot easily be transferred fully to everyday experiments (coin flips, dice rolls, weather forecasts, ...). – Hagen von Eitzen Jan 13 '13 at 17:28
  • No, the point is that when someone says something is random, from context it is usually implicitly understood what the distribution is. In your random chord example, the paradox isn't really a paradox; it's just that the question doesn't properly define a distribution of chords, so depending on how you choose it, you will get different answers. Using the word 'random' is fine, as long as everybody knows its underlying distribution. – timidpueo Jan 13 '13 at 17:29
  • Maybe the title of my question is not well posed. Is random a mathematical term? In the sense that, have we managed to give a mathematical object which models it (the human feeling about it)? – Z. L. Jan 13 '13 at 17:37
  • Would you say “number” is a mathematical term? It’s part of several specifically defined mathematical terms — real numbers, natural numbers, complex, p-adic, etc. — but on its own, it doesn’t have a uniquely defined meaning. “Random” is similar: it’s part of several formally defined notions (e.g. random variables, Kolmogorov randomness, …), and randomness is certainly something that’s studied mathematically, but the informal idea of randomness corresponds to several different mathematical concepts, and so there’s no single formal definition of “random”. – Peter LeFanu Lumsdaine Jan 13 '13 at 17:43
  • I agree with @PeterLeFanuLumsdaine. In mathematics, randomness is well-defined. In everyday language, randomness may not have such a formal definition. – timidpueo Jan 13 '13 at 17:46
  • I would say ''number'' is a mathematical term and a good one, since any person who understands the underlying definitions would agree that the detailed mathematical definition I'm giving them coincides with their own feeling about (real, natural, complex) numbers. But that doesn't seem to happen with random. Of course there are adjectives attached to random which are studied and are mathematical objects, but if I model a potato with a sphere it becomes an unsuccessful model, even though the sphere is still a mathematical object. – Z. L. Jan 13 '13 at 17:49
  • @ZangoLotino, just to augment timidpueo's answer, there is obviously no uniform distribution on a countably infinite set, since a probability measure is finite and countably additive. – alancalvitti Jan 13 '13 at 21:44
5

A common abuse of language is to say "let $x$ be a random foo" when one really means "let $x$ be a random variable uniformly distributed over all foo".

It is also common to abuse language to use "random" to mean "something that appears too hard to predict".

The fundamental rationale behind applying probability distributions to real world observations is really a matter of metaphysics, not mathematics.

  • Ok, completely agree on the first 2 paragraphs. My concern is that the last one is also true... so the fact that we trust in such applications is a matter of faith in some way? – Z. L. Jan 13 '13 at 17:14
  • Anyway, if I said the chords are equally distributed, this time ''equally distributed'' is not even defined; otherwise there would be agreement on the solution. – Z. L. Jan 13 '13 at 17:15
3

If we look at dice or a pseudo-random number generating algorithm, we need to know the laws (the physics or the algorithm) and the initial conditions to predict the result.

Based on that, here goes my attempt to define randomness:

If the value of a function $f\left( x_1, x_2, \ldots \right)$ can't be predicted even while knowing the values of all the variables $x_1, x_2, \ldots$, then the value of the function is a random number.

Any comments on this are welcome.

Truly random numbers in real life

I disagree that a truly random number is impossible.

It would be hard to believe that a deterministic algorithm could produce random results. But an algorithm is not the only way to produce a number. You just need to assign numbers to possible outcomes of some random process to get random numbers. In quantum physics the result of an experiment is random.

Example 1

The state of a particle is described by a wave function $\psi \left( q \right)$. The probability of finding the particle somewhere in $\delta$ is $\int\limits_\delta \left| \psi \left( q \right) \right| ^2 dq$. If you perform many experiments, you'll get results distributed according to $\left| \psi \left( q \right) \right| ^2$. But the outcome of a single experiment is a random number from that distribution.
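As a toy illustration (my own construction, not part of the answer), one can simulate repeated position measurements by rejection sampling from $\left| \psi \right|^2$, here for a first excited harmonic-oscillator state $\psi(q) \propto q \, e^{-q^2/2}$:

```python
import math, random

def born_density(q):
    # unnormalized |psi(q)|^2 for psi(q) proportional to q * exp(-q**2 / 2)
    return q * q * math.exp(-q * q)

M = math.exp(-1)   # maximum of the unnormalized density (attained at q = +/-1)

def measure_once():
    # rejection sampling on [-5, 5], where essentially all the mass lies
    while True:
        q = random.uniform(-5, 5)
        if random.uniform(0, M) < born_density(q):
            return q

outcomes = [measure_once() for _ in range(100_000)]
print(sum(outcomes) / len(outcomes))   # ~0 by symmetry
```

A histogram of `outcomes` approaches $|\psi(q)|^2$, while any single entry is one unpredictable measurement.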

Example 2

Let's look at another experimental situation: light travels along the $z$ axis and is polarized along the $x$ axis. It falls on a polarizer whose axis of polarization is not the $x$ axis. For simplicity, let's place our polarizer so that the angle between its axis and the $x$ axis is $\pi/4$. In that case half of the light goes through and half is absorbed. For individual photons this means that a photon will randomly either be absorbed or let through, with equal probability.

The second experiment is almost the same as a coin toss, but, when tossing a coin, one could think

if I could very precisely know the initial speed and position and everything else, I could predict its final position without any randomness

but in the case of photons there are no underlying variables; the result of a single experiment is fundamentally unpredictable (random). However, the results of many (infinitely many) experiments approach the 50/50 distribution.
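The statistics (though not the physics) of this experiment are easy to mimic; here is a minimal sketch, with Malus's law $\cos^2\theta$ read as a per-photon transmission probability. The simulation's randomness is merely pseudo-random, unlike, on this view, the photon's:

```python
import math, random

theta = math.pi / 4               # angle between the photon's polarization and the polarizer axis
p_pass = math.cos(theta) ** 2     # Malus's law as a per-photon probability, here 1/2

N = 1_000_000
passed = sum(random.random() < p_pass for _ in range(N))
print(passed / N)                 # ~0.5: each photon behaves like a coin flip
```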

Furthermore, the processes in the atmosphere are very unstable, so some tiny quantum randomness might actually be enough to make a macroscopic result random.

Džuris
  • This answer does not actually reflect the status of quantum mechanics. There are entirely deterministic models of quantum theories (by Bohm, 't Hooft, and many others) that have isomorphic predictions to the probabilistic models (and thus are models of the same theory). So you can't accurately say "the result of an experiment is random" according to your definition. Instead, you need to actually point out that in deterministic models, measurement of variables is contextual and modifying, so measurement is the problem, not the lack of variables (among other things). – ex0du5 Jan 14 '13 at 21:49
2

I would recommend looking into:

Kolmogorov complexity $K(x)$ measures the amount of information contained in an individual object $x$, by the size of the smallest program that generates it.

Shannon entropy $H(X)$ of a random variable $X$ is a measure of its average uncertainty. It is the smallest number of bits required, on average, to describe $x$, the output of the random variable $X$.
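As a small illustration of both quantities (a sketch of mine; note that $K(x)$ is uncomputable, so the compressed length below is only a crude upper bound):

```python
import math, os, zlib

def shannon_entropy(probs):
    # H(X) = -sum p * log2(p), in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))   # 1.0 bit (fair coin)
print(shannon_entropy([0.9, 0.1]))   # ~0.469 bits (biased coin)

# K(x) itself is uncomputable; a general-purpose compressor at least
# bounds it from above: structured strings shrink, typical random ones don't.
structured = b"01" * 500
random_ish = os.urandom(1000)
print(len(zlib.compress(structured)))   # much less than 1000
print(len(zlib.compress(random_ish)))   # about 1000 or slightly more
```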

Here are two references for your review: Kolmogorov complexity and Shannon entropy, and Shannon Entropy.

Today, most people tie these together under Information Theory.

Random numbers are very important in many fields, particularly for cryptographic applications (since getting this wrong could make a secure system insecure). I would recommend looking into the papers and code for DIEHARDER and TESTU01; there are interesting papers and results for pseudo-RNGs and crypto-strength RNGs.
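For a flavor of what those test suites do, here is a toy version of a single frequency (monobit) test; the real batteries run dozens of far more demanding tests:

```python
import math, random

def monobit(bits):
    # Frequency (monobit) test: under the null hypothesis of fair,
    # independent bits, s / sqrt(n) is approximately standard normal.
    s = sum(1 if b else -1 for b in bits)
    z = s / math.sqrt(len(bits))
    return z, math.erfc(abs(z) / math.sqrt(2))   # z-score, two-sided p-value

bits = [random.getrandbits(1) for _ in range(1_000_000)]
print(monobit(bits))   # a tiny p-value would flag the generator as suspect
```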

Random numbers, as you are finding, are a very complex area and it is a great idea to question them.

Here is a List of random number generators for your perusal. You might also have a look at the Handbook of Applied Cryptography - HAC for some crypto related ones.

Regards

Amzoti