
I am quite confused. I have been following a course on cryptography taught by Dan Boneh, and this is the lecture I am confused by: https://www.youtube.com/watch?v=EjWap3szsPk (timestamp 8:58 - 11:00).

I don't understand the meaning of epsilon in this situation. I know epsilon is a symbol in math that is used for a small positive number approaching 0.

What does he mean by saying the PRG is predictable if the probability is greater than 1/2 + epsilon for some non-negligible epsilon?

If someone could explain this in a little more depth, and in more layman's terms if possible, it would be greatly appreciated.

transcript - we say that $G: K \to \{0,1\}^n$ is predictable if

$\exists$ "eff" (efficient) algorithm $A$ and $\exists\, 1 \le i \le n-1$ s.t.

$\Pr_{k \leftarrow K}\big[\, A\big(G(k)\big|_{1,\dots,i}\big) = G(k)\big|_{i+1} \,\big] > \frac{1}{2} + \varepsilon$ for some non-negligible $\varepsilon$

we say that G is predictable if there exists an efficient algorithm A and there is some position i between 1 and n-1 such that, when we generate a random key and give this algorithm the prefix of the output (the first i bits of the output), the probability that it is able to predict the next bit of the output is greater than half plus epsilon, for some non-negligible epsilon.

Mark
  • Can you please transcribe whatever text or formulas are in the video? – Squeamish Ossifrage Apr 24 '19 at 14:34
  • we say that $G: K \to \{0,1\}^n$ is predictable if $\exists$ "eff" algorithm $A$ and $\exists\, 1 \le i \le n-1$ s.t. $\Pr_{k \leftarrow K}\big[\, A\big(G(k)\big|_{1,\dots,i}\big) = G(k)\big|_{i+1} \,\big] > \frac{1}{2} + \varepsilon$ for some non-negligible $\varepsilon$ – Mark Apr 24 '19 at 15:03

2 Answers


Loosely, ‘negligible’ means small enough to ignore. A difference in probability of $2^{-128}$ is likely small enough to ignore; a difference in probability of $2^{-10}$ may not be.

Formally, in the language of complexity theory, a function $f\colon \mathbb N \to \mathbb R$ which assigns to each natural number $n$ a real number $f(n)$ is negligible if it goes to zero faster than the inverse of any polynomial—that is, if for any polynomial $p$, there exists a number $n_0$ such that for any $n > n_0$, $|f(n)| < |1/p(n)|$.

For example, $n \mapsto e^{-n}$ is a negligible function.

Sketch of proof: Let $p(x) = x^d$; at what point $n_0$ do we have $e^{-n} < 1/n^d$, or equivalently $n^d < e^n$, i.e. $d \log n < n$, for all $n > n_0$? We don't need to solve this exactly; it suffices to set $n_0 = d^2$. Since $\log n < \sqrt{n}$ for all $n \geq 1$, whenever $n > n_0 = d^2$ we have $d \log n < d\sqrt{n} < \sqrt{n}\cdot\sqrt{n} = n$, so $n^d = e^{d \log n} < e^n$ and thus $e^{-n} < 1/n^d$ as desired. Finding the corresponding bound $n_0$ for other polynomials is left as an exercise for the reader.
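For what it's worth, the bound is easy to sanity-check numerically. Here is a small Python sketch (mine, not part of the proof) that simply verifies $e^{-n} < 1/n^d$ for a range of $n > d^2$ and a few values of $d$:

```python
import math

# Numerical sanity check of the proof sketch above: once n exceeds n0 = d^2,
# e^{-n} should fall below 1/n^d (log here means the natural logarithm).
for d in (2, 3, 5, 10):
    n0 = d * d
    for n in range(n0 + 1, n0 + 100):
        assert math.exp(-n) < 1.0 / n**d, (d, n)

print("e^{-n} < 1/n^d held for every tested n > d^2")
```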

In the language of complexity theory, we usually consider ensembles of cryptosystems that scale with a security parameter, and we consider the asymptotics of the growth curves of the user's cost and the adversary's cost as the security parameter scales.

The same is true of algorithms more generally, which is why software engineers casually talk of quicksort's $O(n^2)$ worst-case time but $O(n \log n)$ average time: the asymptotic cost is a useful proxy for estimating concrete costs, and, unlike a measurement in seconds or joules, it doesn't change when you move from one machine to another. That makes it helpful for understanding how an algorithm will scale as you feed it more and more data, even if you use a slightly faster machine.

Of course, we must be careful not to lose sight of the concrete numbers behind the asymptotics: while it has been proven[1] that multiplication can be computed in $O(n \log n)$ bit operations, you will still get an answer faster in wall-clock time using a naive $O(n^2)$ algorithm for any input that is possible to work with in practice.

Similarly, while the Blum–Blum–Shub pseudorandom generator has a polynomial-time reduction theorem relating the difficulty of distinguishing BBS output to the difficulty of factoring the BBS modulus, the concrete security of BBS[2] (paywall-free) is such that it costs $2^{43}$ bit operations to generate a mere 128 KB of data at a ~100-bit security level, using somewhat old estimates for factoring costs. At 2.5 GHz, that takes a minute of computation. In contrast, e.g., AES-256 can generate the same amount of data in microseconds at a much higher security level.
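As a rough back-of-the-envelope check of that wall-clock figure (my own arithmetic, with the added assumption, not stated above, that a core retires roughly one 64-bit word operation, i.e. ~64 bit operations, per cycle):

```python
# Back-of-the-envelope check of the BBS cost quoted above.
# Assumption (mine, not from the cited estimate): ~64 bit operations per cycle,
# i.e. roughly one 64-bit word operation retired per clock cycle.
bit_ops = 2**43                # bit operations to generate 128 KB of BBS output
clock_hz = 2.5e9               # 2.5 GHz
bit_ops_per_cycle = 64         # assumed word-level parallelism

seconds = bit_ops / (clock_hz * bit_ops_per_cycle)
print(f"~{seconds:.0f} seconds")   # ~55 seconds, i.e. about a minute
```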

Squeamish Ossifrage

$\epsilon$ is just the bias of any PRNG such as $G:K \rightarrow \{0,1\}^n$ away from uniform pseudorandomness. Ideally the long-run next-bit predictability of $G$ should be exactly $\frac{1}{2}$. In reality, it's $\frac{1}{2} + \epsilon$, as no one's perfect and we have to balance bias against efficiency.

Negligible = "too slight or small in amount to be of importance"

The concept of importance here simply means that if $\epsilon$ is sufficiently large, we can stochastically measure it by sampling enough output from $G$. If we can measure it, we can use it to create an advantage for ourselves in guessing the $(i+1)$-th bit of an output sequence. That then undermines the security of $G$.
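For intuition, here is a toy sketch (mine, not from the lecture or this answer) of that idea: a hypothetical generator whose bits are 1 with probability $\frac{1}{2} + \epsilon$ rather than exactly $\frac{1}{2}$, and a sampling-based estimate of $\epsilon$. The bias and sample count below are purely illustrative.

```python
import random

EPS = 1 / 2**8   # a deliberately large, clearly non-negligible bias (illustrative)

def biased_bit(rng):
    """Toy 'generator': returns 1 with probability 1/2 + EPS instead of exactly 1/2."""
    return 1 if rng.random() < 0.5 + EPS else 0

def estimate_bias(rng, samples=1_000_000):
    """Estimate eps = Pr[bit = 1] - 1/2 by counting ones over many samples."""
    ones = sum(biased_bit(rng) for _ in range(samples))
    return ones / samples - 0.5

rng = random.Random(0)
print(f"estimated bias ~ {estimate_bias(rng):.5f}  (true bias {EPS:.5f})")

# Once the bias is measurable, always guessing "1" for the next bit is correct
# with probability about 1/2 + EPS: the kind of advantage the definition forbids.
```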

If $\epsilon$ is negligible, we can't measure it using current resources and we can treat $G$ as secure, distribution-wise. Dan suggests $b = 30$ and $b = 80$ in $\epsilon = \frac{1}{2^b}$ as threshold values for non-negligible and negligible biases respectively. NIST use $b = 64$.

There's a bit more on negligible crypto concepts in "What exactly is a negligible (and non-negligible) function?".

Paul Uszak
    This may not address the formal definition in complexity theory, but it's not wrong. Would the downvoter care to explain what you disagreed with? – Squeamish Ossifrage Apr 24 '19 at 17:51
  • Thanks Paul, what is meant by bias in this sense (uniformly pseudorandom)? And how could one even predict the next, $(n+1)$-th bit? – Mark Apr 24 '19 at 18:02
  • Also, if I'm thinking correctly, why would epsilon > 1/2^30 be non-negligible? Since epsilon is so small, I don't understand why epsilon at even that size would be a problem. Maybe he could have used a variable like x to explain it better than epsilon? – Mark Apr 24 '19 at 18:14
  • Well, $\frac{1}{2^{30}}$ isn't all that small for contemporary computers. $2^{30}$ bits is only ~134 MB of data. If we used such a biased key stream over 10 GB of data, we might get ~75 bits that shouldn't be there. A specialist distinguisher might be able to pick this up. If this happens and the stream is identified as non-random, cryptographers deem the function insecure. The function becomes tainted at that point irrespective of whether a real-world exploit currently exists. – Paul Uszak Apr 24 '19 at 21:51
  • $\epsilon$ is just convention. – Paul Uszak Apr 24 '19 at 22:03
  • Bias is as I wrote: it's simply the difference in probability between an even split of 1's and 0's, as expected from a uniform distribution, and a dodgy (biased) one. An exact 0.5 chance of getting either, or say $0.5 + \frac{1}{2^{30}}$ for one of the choices. Like a bad penny. – Paul Uszak Apr 24 '19 at 22:07
  • @PaulUszak That's the wrong notion for security. The alternating sequence $G(k)=010101\dots$ has zero bias in that definition, but it's trivially distinguishable from uniform random with advantage near the maximum possible, 1—hence hopelessly insecure. The right notion is the advantage $|\Pr[A(G(k))]-\Pr[A(U)]|$ of any cost-limited decision algorithm $A$. For the alternating sequence 0101…, a distinguisher that draws $q$ bits of output can attain advantage $1-1/2^q$ as follows: simply check whether the output is 0101…; under uniform random bits, this check returns 1 with probability $1/2^q$. – Squeamish Ossifrage Apr 25 '19 at 00:03
  • Just a quick follow-up: is the reason why it should be 1/2, or as close to 1/2 as possible, because we have two bits, 0 and 1? So if we have a bit stream of 01110, the next bit should have almost a 1/2 chance of being either a 1 or a 0? :) – Mark Apr 28 '19 at 20:46
  • @Mark Yes, absolutely. And the probability of any specific next number from a fair die should simply be $\frac{1}{6}$. In theory. In fact it will be $\frac{1}{6} + \epsilon$, and the casino just hopes that $\epsilon$ is not too large. – Paul Uszak Apr 28 '19 at 21:07
  • @Mark Paul has the wrong notion for security. If you have a $t$-bit key, and an $n$-bit PRG ($n > t$), then after the $t^{\mathit{th}}$ bit (more or less), for any $i > t$, either $\Pr[G(k)[i] = 1] = 1$ or $\Pr[G(k)[i] = 1] = 0$ because there are only at most $2^t$ distinct outputs determined by the $2^t$ keys, but that's not important if there is no algorithm that can determine with nonnegligible probability which of those two options it is. Imagine if you had a prediction algorithm which made the wrong next bit prediction 100% of the time. Could you convert that to one that works? – Squeamish Ossifrage Apr 28 '19 at 21:58