I ran into this problem while trying to create a procedural texture algorithm. I ended up using a greedy approximation and shuffling it to hide the bias, but I was wondering if there was a way to find a uniform solution.

The Problem

Divide a wall of height $H$ into layers whose individual sizes range from $L_{min}$ to $L_{max}$. The algorithm should choose a random solution from the set of all possible layer counts and sizes that satisfy the constraints.

To put it another way, I'm looking for a way to randomly select a set $S$ from the sets that satisfy the constraints for the given $H$, $L_{min}$, and $L_{max}$: $$\sum_i S_i = H$$ $$\forall i:\ L_{min} \le S_i \le L_{max}$$

It seems similar to a combinatorial optimization problem, but over an infinite set rather than a finite one. Looking at the Wikipedia article, it could be expressed as a knapsack problem whose items are all real numbers in the size interval and whose values are random with the additional constraint that the weight must exactly equal the limit.

Examples

Setting the input parameters $H=10, L_{min}=2, L_{max}=3$ yields an infinite number of possible solutions:

  • $\{2, 2, 2, 2, 2\}$
  • $\{2, 2, 3, 3\}$
  • $\{2.1, 2.5, 2.6, 2.8\}$
  • ... and many more

The goal is to choose one of these solutions at random. In this case, the chosen solution would almost certainly have four elements (layers), as there is only one five-element option and an uncountable number of four-element options.

If we increase $H$ to $10.1$, we suddenly have an uncountable number of five-element solutions:

  • $\{2, 2, 2, 2, 2.1\}$
  • $\{2.01, 2.03, 2.05, 2.07, 2.09\}$
  • etc.

In this case, four-element solutions would still be more common, but the random algorithm should occasionally produce a five-element solution.
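As a quick check on the two examples, here is a short worked bound on the layer count $k$ (the symbol $k$ is introduced here for illustration). A count $k$ is feasible only if $k L_{min} \le H \le k L_{max}$, which gives:

$$\left\lceil \frac{H}{L_{max}} \right\rceil \le k \le \left\lfloor \frac{H}{L_{min}} \right\rfloor$$

For $H=10, L_{min}=2, L_{max}=3$ this is $4 \le k \le 5$, and $k=5$ forces $5 \cdot 2 = 10 = H$, so every layer must be exactly $2$. For $H=10.1$ the range is still $4 \le k \le 5$, but now $5 \cdot 2 = 10 < 10.1 < 15 = 5 \cdot 3$, which leaves room to vary the five layer sizes.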

My Analysis

I wrote a quick program to count up solutions using fixed steps in layer sizes in the hopes that there were exploitable patterns in the distribution. When grouped by layer count, the first layer's size followed a combination counting function with $n$ equal to the number of layers and $k$ linearly related to layer size, but I couldn't find a consistent pattern to how the scale or offset were determined.
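A counting program of the kind described could look like the following sketch. It is not the original program: the name `count_layerings` is hypothetical, layer sizes are assumed to be integer multiples of the step, and ordered sequences of layers are counted (so `2, 3` and `3, 2` are distinct).

```python
from collections import defaultdict

def count_layerings(h, lo, hi):
    """Count ordered layer sequences with integer sizes in [lo, hi] summing to h.

    Returns a dict mapping layer count -> number of sequences.
    Sizes are in units of the step, e.g. h=100, lo=20, hi=30 for a step of 0.1.
    """
    if lo < 1 or hi < lo:
        raise ValueError("need 1 <= lo <= hi")
    ways = {0: 1}   # partial sum -> number of sequences reaching it
    result = {}
    k = 0
    while ways:
        k += 1
        nxt = defaultdict(int)
        for total, count in ways.items():
            for part in range(lo, hi + 1):
                if total + part <= h:      # prune sums that overshoot the wall
                    nxt[total + part] += count
        ways = dict(nxt)
        if h in ways:
            result[k] = ways[h]
    return result

print(count_layerings(10, 2, 3))  # {4: 6, 5: 1}
```

With a step of 1 and $H=10$, sizes 2 to 3, this reports six four-layer sequences and one five-layer sequence, matching the first example.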

Edit: Added examples section. Minor changes to wording.

erefewinter
  • Where do you choose $S$ from. What are "all possible sets"? – nir shahar Aug 11 '21 at 06:09
  • @nirshahar I updated the question in an effort to make the wording less ambiguous. I intended for it to be read as "all possible sets that satisfy the constraints" and indicate that $S$ is chosen from those at random. Hopefully the new wording is clearer. – erefewinter Aug 11 '21 at 07:20
  • How would you define a "uniform" distribution here? – nir shahar Aug 11 '21 at 07:43
  • @nirshahar I meant that any distinct combination of layer heights is equally likely to occur. – erefewinter Aug 11 '21 at 08:06
  • I don't understand what is meant by "randomly select a set S from the sets" - what do you mean by the sets? What distribution do you want? What is meant by "all possible layer counts and sizes that satisfy the constraints"? I don't know what is meant by "layer counts" or "layer sizes" or what the constraints are. – D.W. Aug 11 '21 at 08:12
  • What is meant by $\sum S_i$? What is $S_i$? If $S_i$ is a set, what is meant by the sum of sets? Do you really mean a set of values $S_1,\dots$, or a sequence? Are repeated values allowed? – D.W. Aug 11 '21 at 08:14
  • If you want "uniformly at random over all solutions", then that contradicts "In this case, four-element solutions would still be more common, but the random algorithm should occasionally produce a five-element solution" - there are uncountably more five-element solutions than four-element solutions, so the probability of choosing a four-element solution would be zero. – D.W. Aug 11 '21 at 08:16
  • @D.W. $S$ is a set, and $S_i$ are its elements – nir shahar Aug 11 '21 at 09:03
  • Your probability space is not well-defined. If the size of each set is fixed, this problem can be modeled by the geometric probability. Be careful for probability space with infinite elements. Simply defining each element to have the same probability is not sufficient. – xskxzr Aug 11 '21 at 09:06

1 Answer


The number of solution sets of size $k$ is either uncountably infinite (and in particular has the same cardinality as $\mathbb{R}^{k-1}$), 1, or 0.

So, the following algorithm works:

  1. If there is any $k$ so that there are uncountably infinitely many solution sets of size $k$:

    1. Find the largest $k$ such that there are uncountably many solution sets of size $k$.
    2. Randomly pick a solution set of size $k$.
  2. Otherwise, find any solution and output it. (Unless there is no solution; in that case, output that no solution exists.)

Steps 1.1 and 2 can be implemented efficiently. If $k L_\min = H$ or $k L_\max = H$, there is only one solution of size $k$. If $H < k L_\min$ or $H > k L_\max$, there are 0 solutions of size $k$. Otherwise, there are uncountably many solutions of size $k$, and in that case, you can reduce the sampling problem in Step 1.2 to the following (by rescaling appropriately):

Given $u$ and $k$, sample $r_1,\dots,r_k$ uniformly at random such that $\sum r_i = 1$ and $0 \le r_i \le u$.

(Proof: Set $u = (L_\max - L_\min) / (H - k L_\min)$ and recover the layer sizes as $x_i = L_\min + r_i (L_\max - L_\min)/u$.)
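The choice of $k$ and the rescaling can be sketched as follows. This is only an illustration: the function names are hypothetical, it uses plain floating-point arithmetic (so boundary cases such as $k L_\min = H$ may need exact or tolerance-based comparisons), and it requires $k \ge 2$ because a single layer admits only the one solution $\{H\}$.

```python
import math

def largest_uncountable_k(h, l_min, l_max):
    """Largest k >= 2 with k*l_min < h < k*l_max, or None if no such k exists."""
    k = math.ceil(h / l_min) - 1        # largest k with k * l_min < h
    # If this k fails k * l_max > h, every smaller k fails it as well.
    if k >= 2 and k * l_max > h:
        return k
    return None

def cap_u(h, l_min, l_max, k):
    """Upper bound u on each r_i in the rescaled problem."""
    return (l_max - l_min) / (h - k * l_min)

def to_layer_sizes(r, h, l_min, k):
    """Map r (summing to 1, each 0 <= r_i <= u) back to layer sizes.

    (l_max - l_min) / u equals h - k*l_min, the slack left after giving
    every layer its minimum size.
    """
    slack = h - k * l_min
    return [l_min + r_i * slack for r_i in r]

k = largest_uncountable_k(10, 2, 3)             # 4
u = cap_u(10, 2, 3, k)                          # 0.5
print(to_layer_sizes([0.25] * k, 10, 2, k))     # [2.5, 2.5, 2.5, 2.5]
```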

This problem is a generalization of sampling from the unit simplex. Unfortunately, I don't know an efficient solution to it.
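One simple but not generally efficient option is rejection sampling: draw a uniform point on the unit simplex and retry until every coordinate is at most $u$. The sketch below is not part of the answer's argument. It assumes "uniform" means uniform with respect to volume on the simplex, and the function names are hypothetical. The acceptance rate drops sharply as $u$ approaches $1/k$, so this is only practical when the cap is loose.

```python
import random

def sample_simplex(k, rng):
    """Uniform point on the unit simplex: gaps between k-1 sorted uniforms."""
    cuts = sorted(rng.random() for _ in range(k - 1))
    edges = [0.0] + cuts + [1.0]
    return [b - a for a, b in zip(edges, edges[1:])]

def sample_capped_simplex(k, u, rng=None, max_tries=100_000):
    """Rejection-sample r_1..r_k with sum 1 and 0 <= r_i <= u."""
    if rng is None:
        rng = random.Random()
    if k * u < 1:
        raise ValueError("infeasible: k * u must be at least 1")
    for _ in range(max_tries):
        r = sample_simplex(k, rng)
        if max(r) <= u:          # accept only points inside the cap
            return r
    raise RuntimeError("acceptance rate too low for rejection sampling")

r = sample_capped_simplex(4, 0.5, random.Random(0))
print(r, sum(r))
```

For the $H=10$, $L_\min=2$, $L_\max=3$, $k=4$ case this gives $u = 0.5$, and the layer sizes are then $2 + 2 r_i$.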

D.W.
  • One quick note: you can solve the last problem you mention "efficiently" (i.e., in polynomial time) because it's an instance of the problem of uniformly sampling a point from a convex polytope. But it is true that while technically poly-time, these algorithms are quite involved and slow in practice. I wonder if there's a nice way to solve it for this specific polytope... – jschnei Aug 11 '21 at 18:46
  • Answers the problem as stated, but I'll probably use a different approach for selecting layer count. While there are technically more solutions at higher layer counts, the solutions all look very similar in practice. – erefewinter Aug 11 '21 at 21:06