You can do it with a single call to the rand() function, with a perfectly uniform distribution, if you're willing to use large integers.
Here's the basic idea:
- List out all of the possible ways to choose K distinct integers from the range [0, N-1].
- Throw a dart at the list to choose one of those ways uniformly at random.
Once you've done this, you'll realize that you don't actually need that list of possibilities: you can use a mathematical formula to compute the selection directly from the id, in O(1) time, without ever building the list.
For example, if you're trying to select k = 2 distinct items from n = 3 options, then you could use the list:
L(3, 2) = ["011", "101", "110"]
...where "110" means that you would select the first two items, and ignore the third. So, if rand() generates "id = 2", then you'd look up the binary sequence at position 2, counting from 0 ("110"), and select the first two items.
In general, there are M = C(N, K) = N!/(K!(N-K)!) possible ways to select K distinct integers from the set [0, N-1], where C(N, K) is the binomial coefficient "N choose K". So a single call to the random number generator, rand(M), returning a uniform id in [0, M-1], will randomly choose one of those possibilities.
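As a rough sketch of this step in Python: it assumes rand(M) means "a uniform integer in [0, M-1]", which the standard library's random.randrange provides. The helper name draw_id is made up for illustration.

```python
import math
import random

def draw_id(n, k, rng=random):
    """Return (M, id): the number of ways to pick k of n items, and a uniform id in [0, M-1]."""
    m = math.comb(n, k)           # M = n! / (k! (n-k)!), computed as an exact big integer
    return m, rng.randrange(m)    # the single call to the random number generator

m, chosen = draw_id(3, 2)         # m is 3; chosen is one of 0, 1, 2
```

Python integers are arbitrary precision, so M can be very large without overflow. Note that randrange raises ValueError when k > n, because then M = 0 and there is nothing to choose from.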
All you need to get started is a canonical list of the possibilities--that is, a canonical ordering of the list.
Really, any ordering will work. But here's one that has a nice recursive formula:
L(n, 0) = ["0...0"] (where "0...0" has length n), and
L(n, n) = ["1...1"] (where "1...1" has length n), and
L(n, k) = [], for n < k, and
L(n, k) = ["0" + L(n-1, k), "1" + L(n-1, k-1)], for 0 < k < n
...where "1" + "001" = "1001" is string concatenation, applied to every string in the list that follows the "+".
This lets you build up a canonical list of any size.
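A direct transcription of the recursion into Python might look like the following. The function name canonical_list is an assumption for illustration, and this version builds the whole list, so it is only practical for small n.

```python
def canonical_list(n, k):
    """Build L(n, k): all length-n bit strings with exactly k ones, in canonical order."""
    if k < 0 or k > n:
        return []                 # L(n, k) = [] for n < k
    if k == 0:
        return ["0" * n]          # L(n, 0) = ["0...0"]
    if k == n:
        return ["1" * n]          # L(n, n) = ["1...1"]
    # Prefix every string of the two smaller lists, "0" entries first.
    zeros = ["0" + s for s in canonical_list(n - 1, k)]
    ones = ["1" + s for s in canonical_list(n - 1, k - 1)]
    return zeros + ones

print(canonical_list(3, 2))       # ['011', '101', '110']
```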
For example, if you'd like to generate the list from earlier, L(n, k) where n = 3 and k = 2, then you could do it like this:
L(1, 1) = ["1"]
L(2, 2) = ["11"]
L(1, 0) = ["0"]
L(2, 0) = ["00"]
L(3, 0) = ["000"]
L(2, 1)
= ["0" + L(1, 1), "1" + L(1, 0)]
= ["0" + ["1"], "1" + ["0"]]
= ["01", "10"]
L(3, 1)
= ["0" + L(2, 1), "1" + L(2, 0)]
= ["0" + ["01", "10"], "1" + ["00"]]
= ["001", "010", "100"]
L(3, 2)
= ["0" + L(2, 2), "1" + L(2, 1)]
= ["0" + ["11"], "1" + ["01", "10"]]
= ["011", "101", "110"]
This allows you to compute the canonical list for any problem size. This is already enough to get a solution using dynamic programming.
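But you can do better. The recursion also tells you where each sequence sits in the list: the first (n-1 choose k) entries of L(n, k) start with "0", and the remaining (n-1 choose k-1) entries start with "1". That yields a closed-form formula for a sequence's position in the list, and its inverse, g(id), computes the binary sequence directly from the id.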
This is a high-effort approach, but it does yield a fantastic O(1) solution that involves only one call to the random number generator.
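The closed form for g(id) is not written out here, so the following is only a sketch of a simpler stand-in, not the closed-form inverse itself. It walks the same recursion using counts instead of lists: at each position it checks whether the id falls among the entries that start with "0". That takes n steps of big-integer arithmetic rather than constant time, but it never builds the list and still uses one random number. The signature g(idx, n, k) and the helper name sample are assumptions for illustration.

```python
import math
import random

def g(idx, n, k):
    """Return the bit string at position idx of L(n, k) without building the list."""
    if not 0 <= idx < math.comb(n, k):
        raise ValueError("idx must be in [0, C(n, k) - 1]")
    bits = []
    while n > 0:
        zeros = math.comb(n - 1, k)   # how many entries of L(n, k) start with "0"
        if idx < zeros:
            bits.append("0")          # stay in the "0" + L(n-1, k) part
        else:
            bits.append("1")          # move into the "1" + L(n-1, k-1) part
            idx -= zeros
            k -= 1
        n -= 1
    return "".join(bits)

def sample(n, k, rng=random):
    """Pick k distinct integers from [0, n-1] using a single random draw."""
    idx = rng.randrange(math.comb(n, k))
    return [i for i, bit in enumerate(g(idx, n, k)) if bit == "1"]

print([g(i, 3, 2) for i in range(3)])   # ['011', '101', '110']
```

Every id maps to a different sequence, so a uniform id gives a uniform selection.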