(OP hasn't responded to my question about floating point, but I'm posting this mostly as an example for user21820.)
We may choose to think of the supply of random bytes as providing a (virtually) infinitely long mixed radix integer (that is conveniently left-aligned, but inconveniently, we only find out the radix of the next digit as we come to it). At each call to the function, we draw more random bytes until we have acquired enough bits to determine what the next output digit is. This is equivalent to arithmetic decoding.
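Concretely, the drawn bytes can be read as the leading base-$256$ digits of a random real in $[0,1)$. A toy snippet (truncated to four bytes, names mine):

    bytes = RandomInteger[255, 4]; (* the first four random bytes *)
    Total[bytes 256^-Range[4]]     (* = b1/256 + b2/256^2 + b3/256^3 + b4/256^4 *)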
Throughout, we follow the interval $[p, p+w)$, a range of real numbers guaranteed to contain the rest of the infinitely long random binary number. As each bit is drawn, the width, $w$, of this interval decreases by a factor of $2$: depending on the bit, we keep either the low half ($0$) or the high half ($1$) of the interval. When the entire interval fits in a single bin (definition coming soon), we output that bin and rescale $p$ and $w$ as if that bin were the interval $[0,1)$, preparing for the next call.
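A minimal sketch of just this halving step (drawing single bits into machine reals; the full implementation below draws whole bytes at a time):

    p = 0.; w = 1.;
    Do[
      bit = RandomInteger[1]; (* draw one random bit *)
      p = p + bit w/2;        (* keep the low (0) or high (1) half *)
      w = w/2,
      {8}];
    {p, p + w}                (* an interval of width 2^-8 *)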
When we call our random number function to generate an integer in $[0,c)$, we divide the interval $[0,1)$ into $c$ bins (by multiplying by $c$ and using the integer parts of the unit intervals in $[0,c)$ as the labels for the bins). The interval $[p, p+w)$ is likewise scaled to $[cp, cp+cw)$. If a prior call to the function required reading ahead several bits to resolve in which bin the interval fell, the interval $[cp, cp+cw)$ may be so small that it already fits in a single bin. If not, we draw bits, halving the width of the interval until it does fit.
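For instance (numbers chosen only for illustration), with $c = 3$, $p = 0.3$, and $w = 0.1$, the scaled interval is $[0.9, 1.2)$, which straddles the boundary between bins $0$ and $1$, so more bits must be drawn:

    c = 3; p = 0.3; w = 0.1;
    {c p, c p + c w}                     (* scaled interval: {0.9, 1.2} *)
    Floor[c p] == Ceiling[c p + c w] - 1 (* False: straddles a bin boundary *)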
A Mathematica implementation, with a bit of monitoring code demonstrating usage:
foo = Module[{
     p, w, k, kmant, mant, rnd,
     rndCount, resetCount, vec
   },
   p = 0.;
   w = 1.;
   k = 256;
   kmant = 8; (* = log_2(k) *)
   mant = 16; (* bits of mantissa in p (and w) *)
   rndCount = 0;
   resetCount = 0;
   rnd[c_] := Module[{retVal, r},
     (* Return random integer in [0,c). *)
     If[c < 2, (* then *)
       retVal = 0
       , (* else *)
       p = p*c;
       w = w*c;
       (* There are much sneakier ways to write the next two conditions. *)
       While[Floor[p] != Ceiling[p + w] - 1,
         If[p > 2^(mant - kmant) w,
           (* If width is so small that p + w/k loses precision,
              restart p and w. Only happens if random bits conspire
              to make [p,p+w) persistently straddle an integer. *)
           p = 0.;
           w = c + 0.;
           resetCount++;
         ];
         r = RandomInteger[k - 1]; (* random integer in [0,k-1] *)
         rndCount++;
         p = p + r w/k;
         w = w/k;
       ];
       retVal = IntegerPart[p]; (* For this and next, in C, see modf(). *)
       p = FractionalPart[p];
     ];
     retVal
   ];
   (* Generate one million random integers in the range [0,3) = [0,2]. *)
   vec = Table[rnd[3], {10^6}];
   (* Report the stats for this run. *)
   Print[{rndCount, N[100 rndCount/(10^6 Log[k, 3])], resetCount}];
   (* Log[k, 3] = log(3)/log(k) *)
   (* Return the list of integers to foo. *)
   vec
   ];
(* Output example:
{198998,100.443,681}
* Drew 198998 random bytes, 100.443% of entropy required to uniquely
select one outcome from 3^(10^6) possible outcomes (disregarding
conspiracies in the random number generator for resolving which
bin the last member of the sequence lies in).
* Had to reset 681 times due to the risk of precision loss. This
resulted in 198998-198121=877 drawn random bytes being discarded.
*)
(*
One can also
Histogram[foo]
or
Length/@Split[Sort[foo]]
to decide whether uniformity was attained. The run with the above
stats had output counts of
{332942, 333926, 333132}
We could even run a chi-squared test to see if the counts above are
sufficiently extreme to reject that the data is drawn from the
uniform distribution.
PearsonChiSquareTest[foo, DiscreteUniformDistribution[{0, 2}],
"TestDataTable"]
We find that our test statistic is 1.63479... compared to a chi-
squared distribution with two degrees of freedom. The resulting p-
value is 0.44158... (That is, 44.158...% of data drawn from the
uniform distribution would have output counts at least as far from
the expected values as these.) This data is not sufficiently
extreme to reject that it is drawn from the uniform distribution.
*)
It is possible to implement this entirely in (arbitrary precision) integers. But this cannot be done (exactly) in finite precision, since the random bit source may conspire to require arbitrarily large read-ahead to resolve on which side of a bin boundary the interval eventually falls. (Although such long read-ahead is exponentially unlikely -- at each new bit, either the interval falls entirely to one side of the boundary or it does not, with equal probability.)
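A minimal sketch of such an exact version, assuming Mathematica's arbitrary-precision integers and rationals (drawing single bits for brevity; per the above, the While can in principle run arbitrarily long, though it terminates with probability 1):

    Module[{p = 0, w = 1, rnd},
      rnd[c_] := Module[{v},
        p = p c; w = w c;
        While[Floor[p] != Ceiling[p + w] - 1,
          p = p + RandomInteger[1] w/2; (* exact rationals: no precision loss, no resets *)
          w = w/2];
        v = Floor[p];
        p = p - v; (* exact FractionalPart *)
        v];
      Table[rnd[3], {20}]]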
user21820's answer is an approximate implementation of this idea, representing the interval as $\left[ \frac{p}{q}, \frac{p+1}{q} \right)$ in integers $p$ and $q$, "barber-poling" the integers in $[0,q-1)$ so that incrementing $p$ increments the represented output value: $p=0$ represents the left $c/q$-wide subinterval corresponding to $v=0$, $p=1$ the left $c/q$-wide subinterval corresponding to $v=1$, ..., $p=c-1$ the left $c/q$-wide subinterval corresponding to $v = c-1$, $p=c$ the second $c/q$-wide subinterval corresponding to $v=0$, and so on. Note that this can't represent an interval inside a single bin until $q \geq c$. Also, at the end, when ready to select an output bin, the interval represented by $p$ and $q$ is (very, very likely) less than a unit wide, but $p$ and $q$ are altered to exactly match the bin with integer divisions. (This side-steps the need for arbitrary precision integers by (very, very likely) discarding a fractional bit on each output.)