
A composition of a positive integer $n$ is a way of writing $n$ as the sum of an ordered sequence of strictly positive integers. Any positive integer $n$ has $2^{n - 1}$ distinct compositions.

Compositions of $n$ range in length from one ($n$ on its own) to $n$ ($n$ instances of the number $1$ added together). The number of compositions of $n$ into exactly $k$ parts is the binomial coefficient $\binom{n - 1}{k - 1}$, or $\frac{(n - 1)!}{(k - 1)! \times (n - k)!}$.

What I want to find is a formula that gives the number of compositions of $n$ of length $k$ whose parts all lie between $1$ and $p$, inclusive.

For example, the number $6$ has $2^{6 - 1} = 32$ compositions: one of length $1$, five of length $2$, ten of length $3$, ten of length $4$, five of length $5$, and one of length $6$. If the cutoff value is $p = 3$, that is reduced to $24$ compositions: zero of length $1$, one of length $2$, seven of length $3$, ten of length $4$, five of length $5$, and one of length $6$.
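The counts in this example are small enough to check by exhaustive enumeration. The following is a minimal sketch, not an efficient method; the function name `count_by_length` is made up for illustration, and it simply tries every tuple of parts in $1..p$.

```python
from collections import Counter
from itertools import product


def count_by_length(n, p):
    """Brute-force count of compositions of n with every part in 1..p, keyed by length."""
    counts = Counter()
    for k in range(1, n + 1):
        # Every k-tuple with entries in 1..p is a candidate composition of length k.
        for parts in product(range(1, p + 1), repeat=k):
            if sum(parts) == n:
                counts[k] += 1
    return counts


unrestricted = count_by_length(6, 6)   # a cap of p = n excludes nothing
capped = count_by_length(6, 3)
print([unrestricted[k] for k in range(1, 7)])  # [1, 5, 10, 10, 5, 1]
print([capped[k] for k in range(1, 7)])        # [0, 1, 7, 10, 5, 1]
```

The running time grows like $p^k$, so this is only useful for checking a candidate formula on small inputs.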

When $\frac{n}{2} \le p \le n$, I found that the formula $\binom{n - 1}{k - 1} - k \times \binom{n - p - 1}{k - 1} = \frac{(n - 1)!}{(k - 1)! \times (n - k)!} - k \times \frac{(n - p - 1)!}{(k - 1)! \times (n - p - k)!}$ works. However, when $1 \le p < \frac{n}{2}$, this formula can over-count the compositions to be excluded. This seems to be because the second term counts how many numbers greater than $p$ appear across all compositions of length $k$, and not what I want, which is how many compositions of length $k$ contain at least one number greater than $p$. This means that it double-counts compositions that contain two invalid numbers, triple-counts compositions that contain three, and so on.
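A small case makes the over-count concrete. The values $n = 7$, $k = 3$, $p = 2$ are chosen here only for illustration. Three parts of size at most $2$ sum to at most $6$, so the true count is $0$, and all $\binom{6}{2} = 15$ compositions of $7$ into $3$ parts should be excluded. The formula instead gives

$$\binom{6}{2} - 3 \times \binom{4}{2} = 15 - 18 = -3.$$

The subtracted $18$ counts pairs (composition, position of a part greater than $2$). Twelve of the compositions have exactly one such part, while $(3, 3, 1)$, $(3, 1, 3)$ and $(1, 3, 3)$ have two each and are therefore subtracted twice:

$$18 = 12 \times 1 + 3 \times 2 = 15 + 3.$$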

Is it possible to find a closed-form (non-recursive, non-infinite) formula that gives the results I want?

For reference, here are the compositions of integers $1$ through $6$, color-coded by length (background color) and highest number (text color):

Compositions of integers 1 through 6, color-coded

Lawton
  • I would recommend taking the examples you've already calculated and plugging them into https://oeis.org/ – Jenny Kenkel Oct 09 '23 at 15:31
  • @JennyKenkel This isn't a one-dimensional sequence, though; it's two-dimensional, in $n$ and $p$. Can the OEIS help with that? – Lawton Oct 09 '23 at 15:37
  • Well, you could try it for one sequence at a time, and I would guess you will see patterns – Jenny Kenkel Oct 09 '23 at 15:38
  • See this answer for a blueprint of how to combine Inclusion-Exclusion with Stars-And-Bars to attack this generic type of problem. – user2661923 Oct 09 '23 at 15:42
  • @user2661923 I looked at the answer you linked, but I'm having trouble figuring out how to apply it to my question. Could you elaborate? – Lawton Oct 09 '23 at 18:04
  • $~x_1 + x_2 + \cdots + x_k = n.~$ How many ($\color{red}{\text{either non-negative or positive)}}$ integer solutions are there? Addendum 2 in the linked answer allows you to convert from positive integer solutions to non-negative solutions. Addendum 1 discusses shortcuts available if (for example) all variables have the same upper limit. Beyond that, you will need to leave me a comment asking a direct question. Then, I can reply further. – user2661923 Oct 09 '23 at 21:20

1 Answer


The best tool for this sort of thing is generating functions. The answer is the coefficient of $x^n$ in the product

$$(x + x^2 + \dots + x^p)^k.$$

(You can see this very directly by writing down what a term in this product is when fully expanded: it is exactly a composition of the desired sort.) You can interpret this as telling you how many ways there are to get a certain sum of dice rolls when rolling $k$ different $p$-sided dice, each of which is labeled with the numbers $\{ 1, \dots, p \}$, so when $p = 6$ this tells you about rolling ordinary dice. That makes this question a duplicate of other questions - this has been asked many times before - although not obviously. You can simplify the generating function by rewriting it as

$$x^k \left( \frac{1 - x^p}{1 - x} \right)^k.$$
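Before expanding anything symbolically, the coefficient can also be read off numerically by multiplying out $(x + x^2 + \dots + x^p)^k$ directly. Here is a rough sketch of that, with polynomials stored as plain coefficient lists; the name `coefficient_by_expansion` is an illustrative choice, not something from a library.

```python
def coefficient_by_expansion(n, k, p):
    """Coefficient of x^n in (x + x^2 + ... + x^p)^k, by repeated polynomial multiplication."""
    base = [0] + [1] * p          # base[e] is the coefficient of x^e in x + ... + x^p
    poly = [1]                    # start from the constant polynomial 1
    for _ in range(k):
        # Multiply poly by base; the degree grows by p each time.
        product = [0] * (len(poly) + p)
        for a, ca in enumerate(poly):
            for b, cb in enumerate(base):
                product[a + b] += ca * cb
        poly = product
    return poly[n] if 0 <= n < len(poly) else 0


print(coefficient_by_expansion(6, 3, 3))  # 7
```

For $n = 6$, $k = 3$, $p = 3$ this returns $7$, matching the count in the question. The simplified form $x^k \left( \frac{1 - x^p}{1 - x} \right)^k$ has the same coefficients and is easier to expand by hand.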

The numerator can be expanded using the binomial theorem while the denominator can be expanded using the binomial theorem with negative exponent; this gives

$$(1 - x^p)^k = \sum_{i=0}^k (-1)^i {k \choose i} x^{pi}$$ $$\frac{1}{(1 - x)^k} = \sum_{j \ge 0} {-k \choose j} (-1)^j x^j = \sum_{j \ge 0} {k+j-1 \choose j} x^j$$

and multiplying these gives

$$x^k \left( \sum_{i, j} (-1)^i {k \choose i} {k + j - 1 \choose k - 1} x^{pi+j} \right)$$

so that the final answer is

$$\sum_{pi+j=n-k} (-1)^i {k \choose i} {k+j-1 \choose k-1} .$$

Whether this counts as a "closed form" is up to you but I don't think it gets any better than this. This may be a little easier to understand as a sum over $i$, namely

$$\boxed{ \sum_{i=0}^{\lfloor \frac{n-k}{p} \rfloor} (-1)^i {k \choose i} {n-pi-1 \choose k-1} }.$$

The $i = 0$ term is ${n-1 \choose k-1}$ which is the answer with no upper bound on the parts. The $i = 1$ term is the first correction term $- k {n-p-1 \choose k-1}$ coming from inclusion-exclusion which you've found already. The subsequent terms are further corrections needed to complete the inclusion-exclusion argument (but personally I prefer to let generating functions do these arguments for me).
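If it helps to experiment, the boxed sum translates almost line for line into code. This is a minimal sketch using Python's `math.comb`; the function name `bounded_compositions` and the early return for degenerate inputs are my own additions for illustration.

```python
from math import comb


def bounded_compositions(n, k, p):
    """Compositions of n into exactly k parts, each between 1 and p (the boxed sum)."""
    if k < 1 or p < 1 or n < k:
        return 0  # assumption: treat degenerate inputs as having no compositions
    total = 0
    for i in range((n - k) // p + 1):
        # i <= (n - k) / p guarantees n - p*i - 1 >= k - 1 >= 0, so comb is well defined.
        total += (-1) ** i * comb(k, i) * comb(n - p * i - 1, k - 1)
    return total


print([bounded_compositions(6, k, 3) for k in range(1, 7)])  # [0, 1, 7, 10, 5, 1]
print(bounded_compositions(7, 3, 2))                         # 0
```

The first line reproduces the $p = 3$ breakdown for $n = 6$ from the question. The second is a case where the $i = 2$ term is needed: $15 - 18 + 3 = 0$.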

Qiaochu Yuan
  • "The best tool for this sort of thing is generating functions" : I question this. Admittedly, my knowledge of the use of generating functions here is weak, as opposed to using [Stars and Bars + Inclusion Exclusion]. Ignoring the not insignificant learning curve for generating functions, which is moderately greater than the corresponding learning curve for [SB+IE], I have two areas of disagreement. [1] Assuming that each of the upper bounds is a different positive integer, then I surmise that the best choice will depend on the value of $~k.~$ ...see next comment – user2661923 Oct 09 '23 at 21:29
  • That is, if $~k \leq 3,~$ then in my opinion [SB + IE] is the best choice, while if $~k \geq 6,~$ and perhaps also $~k = 5,~$ then generating functions is the best choice. For $~k = 4,~$ I would lean towards [SB + IE]. Further, consider the shortcut available in Addendum 1 of the article that I linked to in a comment following the posting. If (for example) all upper bounds are identical, then I would opt for [SB + IE] for $~k \leq 8.~$ ...see next comment – user2661923 Oct 09 '23 at 21:34
  • [2] My second reason for disagreeing is more controversial. Typically, with such a problem, the problem composer is expecting either a direct numerical answer, or some expression like $~\displaystyle \binom{20}{6} - \binom{14}{6}.~$ The difficulty with generating functions is that the problem solver has to do significant follow-up to actually compute the numerical value of the corresponding coefficient. Note that many Math-SE responses skirt this issue by referring the original poster to Wolfram Alpha to compute the coefficient. ...see next comment – user2661923 Oct 09 '23 at 21:38
  • I consider the appropriateness of such a strategy somewhat iffy. After all, if computer assistance is available, then (assuming a reasonable size for $~n~$ and $~k),~$ any/all algorithms can be avoided by simply writing a computer program to manually count the satisfying ordered $~k$-tuples. I feel that it is reasonable to require no computer assistance in attacking such a problem. This issue calls into question the entire approach of using generating functions for this type of problem. – user2661923 Oct 09 '23 at 21:41
  • In my general ignorance of using generating functions, I may be overestimating the work involved in converting the coefficient into an expression that resembles $~\displaystyle \binom{20}{6} - \binom{14}{6}.~$ However, when I broached this issue in the past on MathSE, none of the opposition voices indicated that I was overestimating the potential difficulties. – user2661923 Oct 09 '23 at 21:52
  • I don't understand the generating functions part, but the "sum over $i$" at the end of your answer does exactly what I want! Thank you. – Lawton Oct 10 '23 at 13:59
  • @Lawton: you're welcome to work through how to prove this directly using inclusion-exclusion, it's probably a good exercise. Personally I find it annoying to get the details of inclusion-exclusion arguments right whereas the generating function approach, once you become familiar with it, handles all such things automatically. You can learn more e.g. here: https://www2.math.upenn.edu/~wilf/DownldGF.html – Qiaochu Yuan Oct 10 '23 at 19:38