
I need an algorithm that gives me integer numbers with the following features:

  • Numbers must be in a given range $[n..m]$;

  • Numbers must be returned in pseudo-random order (random at visual inspection is enough; it is more important that numbers are well distributed over the given range);

  • Numbers may not repeat before each number in the range $[n..m]$ has been returned once;

  • Range may be huge (up to $2^{64}$; this excludes all list/shuffle based algorithms);

  • It should be possible to seed the function so it returns numbers in different order on repetition;

  • Algorithm should be as fast as possible and should return in constant time.

I wrote code that uses table-based bit swapping and, optionally, XOR and/or addition. It works very well for bit-aligned $n$ and $m$. However, it is a bit slow, and if the range $[n..m]$ is not aligned to bits (i.e., $n$ other than $2^x$ and $m$ other than $2^y$), I get either gaps within the returned numbers or extremely non-constant runtime behavior.

How can this be solved?

Silicomancer
  • If the range is huge, randomly drawn numbers are unlikely to repeat. If generating, say, $2^{20}$ numbers out of a range of size $2^{64}$, a repeat is rather unlikely (happens with probability less than $2^{-25}$). – Yuval Filmus Sep 26 '17 at 22:03
  • "seed the function so it returns numbers in different order": do you mean the same numbers? –  Sep 27 '17 at 14:06
  • @Yuval: This does not work since I need up to $2^{64}$ numbers. Also, the numbers in the range may not have gaps, i.e., I need to get all the numbers within $[n..m]$. – Silicomancer Sep 27 '17 at 20:14
  • @Yves: Yes, the same numbers in the range $[n..m]$ (but in a different order and beginning with a different number out of $[n..m]$). – Silicomancer Sep 27 '17 at 20:15
  • @Silicomancer: sorry, this answer is still ambiguous. –  Sep 28 '17 at 07:41

2 Answers


Without loss of generality, we can assume you need numbers in the range $\{0,\dots,n-1\}$, where $n$ now denotes the number of values in the range. (Just add the appropriate offset.)

This can be solved by constructing a random permutation $f:\{0,\dots,n-1\} \to \{0,\dots,n-1\}$, then outputting the values $f(0),f(1),f(2),f(3),\dots$ as your sequence of pseudorandom numbers. Because $f$ is a bijection, those numbers won't repeat until each number in the range has been output. If $f$ can be computed efficiently, this will be fast.

So how do we construct such a random permutation $f$? One approach is to use format-preserving encryption (FPE), a technique from cryptography that allows you to construct a function $f$ that is a bijection on the set $\{0,\dots,n-1\}$ (for any $n$ of your choice), and that appears pseudorandom. The encryption key plays the role of the seed. There are many FPE algorithms.

See https://blog.cryptographyengineering.com/2011/11/10/format-preserving-encryption-or-how-to/ and https://crypto.stackexchange.com/questions/tagged/format-preserving for more (e.g., https://crypto.stackexchange.com/q/41450/351, https://crypto.stackexchange.com/q/504/351, https://crypto.stackexchange.com/q/29073/351, https://crypto.stackexchange.com/q/20035/351, https://crypto.stackexchange.com/q/16561/351, https://crypto.stackexchange.com/q/18988/351).
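As a rough illustration of the idea (not a vetted FPE scheme), here is a minimal Python sketch that builds a seeded bijection on $\{0,\dots,n-1\}$ from a small balanced Feistel network combined with cycle walking: the Feistel network permutes a power-of-two domain that covers the range, and any output that falls outside the range is fed back in until it lands inside. The names `make_permutation` and `sequence`, the choice of 4 rounds, and the use of BLAKE2b as the round function are all assumptions made for this example. The power-of-two domain is less than four times the size of the range, so on average only a few Feistel evaluations are needed per value, but the worst case is not strictly constant.

```python
import hashlib


def make_permutation(n, seed, rounds=4):
    """Return a function f that is a bijection on {0, ..., n-1}.

    Illustrative sketch only: `rounds` and the hash-based round
    function are assumptions, not a standardized FPE construction.
    """
    if n < 1:
        raise ValueError("n must be positive")
    # Each Feistel half gets half_bits bits, so the Feistel domain
    # 2**(2 * half_bits) is a power of two that covers [0, n).
    half_bits = max(1, ((n - 1).bit_length() + 1) // 2)
    mask = (1 << half_bits) - 1
    key = str(seed).encode()

    def round_fn(r, value):
        # Keyed pseudorandom function: hash of (seed, round index, half block).
        data = key + bytes([r]) + value.to_bytes(8, "big")
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") & mask

    def feistel(x):
        # A Feistel network is invertible for any round function,
        # hence a permutation of the whole power-of-two domain.
        left, right = x >> half_bits, x & mask
        for r in range(rounds):
            left, right = right, left ^ round_fn(r, right)
        return (left << half_bits) | right

    def f(i):
        if not 0 <= i < n:
            raise ValueError("index out of range")
        x = feistel(i)
        # Cycle walking: re-apply the permutation until the value is in range.
        while x >= n:
            x = feistel(x)
        return x

    return f


def sequence(lo, hi, seed):
    """Yield every integer in [lo, hi] exactly once, in seeded order."""
    f = make_permutation(hi - lo + 1, seed)
    for i in range(hi - lo + 1):
        yield lo + f(i)
```

For example, `list(sequence(10, 109, seed=1))` contains each of the 100 values from 10 to 109 exactly once, and a different `seed` gives a different order. Only the counter `i` has to be stored between calls, so nothing proportional to the size of the range is kept in memory.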

D.W.
  • I will need some time. I agree your way of solving the problem is the right one. But currently I don't understand how to implement such a function without restricting it to bit aligned ranges. All of those algorithms seem to operate on bits and bytes. – Silicomancer Sep 27 '17 at 20:34
  • @Silicomancer, you can build format-preserving encryption for any range (the size of the range doesn't need to be a power of two, i.e., it doesn't need to be bit-aligned). Study the techniques and if you still don't see how to do it, ask on Crypto.SE or here. It may help to give a specific value of $n$, as I think the best technique varies depending on whether $n$ is small or large. – D.W. Sep 27 '17 at 20:55

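You can use a linear-feedback shift register (LFSR) to generate a maximum-length sequence: with suitable feedback taps, a $k$-bit LFSR cycles through every nonzero $k$-bit value exactly once before repeating (the all-zero state is never reached). If the size of your range is not of the form $2^k-1$, pick the smallest $k$ that covers it and skip the values that fall outside the range. This is used in computer games for the so-called Fizzlefade effect. LFSRs are really efficient, so this might fit your performance requirements.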

(This is of course a special case of the general technique mentioned by D.W.)
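A minimal Python sketch of this approach, assuming a Galois-style LFSR: the function name `lfsr_sequence` and its parameters are made up for this example, and the caller has to supply a tap mask that is known to be maximal-length for the chosen width (for instance `0xC` for 4 bits or `0xB400` for 16 bits). States that map outside the range are simply skipped.

```python
def lfsr_sequence(lo, hi, bits, taps, start=1):
    """Yield every integer in [lo, hi] once, in LFSR order.

    Assumes `taps` is a maximal-length Galois tap mask for `bits` bits,
    so the register visits all 2**bits - 1 nonzero states.
    """
    size = hi - lo + 1
    period = (1 << bits) - 1
    if size > period:
        raise ValueError("LFSR too small for the requested range")
    if not 1 <= start <= period:
        raise ValueError("start state must be a nonzero value of `bits` bits")

    state = start
    while True:
        # Nonzero states 1..period map to offsets 0..period-1;
        # offsets beyond the range are skipped.
        if state - 1 < size:
            yield lo + state - 1
        # One Galois LFSR step: shift right, XOR the taps if a 1 fell out.
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= taps
        if state == start:  # one full period completed
            break
```

For example, `sorted(lfsr_sequence(5, 14, bits=4, taps=0xC))` gives the ten values 5 through 14. Note that with a fixed tap mask, changing `start` only changes where the same cyclic order begins, so seeding is much weaker here than with a keyed permutation, and the number of skipped states per returned value depends on how tightly the register width fits the range.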

adrianN