
I am trying to calculate the probability that a random subset of $[n]$ of size $k$ contains a sequence of $L$ consecutive integers.

For example, if $n=5$, $k=4$ and $L=2$ I'll have the following subsets:

$$\{2,3,4,5\}, \{1,3,4,5\}, \{1,2,4,5\}, \{1,2,3,5\}, \{1,2,3,4\}$$

Thus the answer will be $1/5$, because only one subset has an $L=2$ sequence; for $L=3$ the answer will be $2/5$, etc.
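The numbers in this example can be checked by brute force. The sketch below assumes the reading settled on in the comments (the longest run of consecutive integers in the subset has length exactly $L$); the helper names `longest_run` and `run_probability` are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations


def longest_run(subset):
    """Length of the longest run of consecutive integers in a sorted tuple."""
    best = current = 0
    previous = None
    for x in subset:
        # Extend the current run if x follows the previous element, else restart.
        current = current + 1 if previous is not None and x == previous + 1 else 1
        best = max(best, current)
        previous = x
    return best


def run_probability(n, k, L):
    """Probability that a uniform random k-subset of {1..n} has longest run exactly L."""
    subsets = list(combinations(range(1, n + 1), k))  # tuples come out sorted
    hits = sum(1 for s in subsets if longest_run(s) == L)
    return Fraction(hits, len(subsets))


print(run_probability(5, 4, 2))  # 1/5
print(run_probability(5, 4, 3))  # 2/5
```

This enumerates all $\binom{n}{k}$ subsets, so it is only practical for small $n$.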

Thomas Russell
  • So, when you write, "have a sequence of length $L$," you mean "the longest consecutive sequence should be of length $L$," is that right? – Gerry Myerson Feb 16 '13 at 11:09
  • @Gerry Myerson No, it means that I'm defining the L; it shouldn't be the longest one. – user61807 Feb 16 '13 at 11:20
  • FWIW, I understood the definition of L exactly as @GerryMyerson describes it in his comment and I fail to understand the answer in your comment. – Did Feb 16 '13 at 11:33
  • Given $L$, what is the probability that a random $k$-subset of $\{1,\dots,n\}$ contains at least $L$ successive integers? (somewhere, maybe in several places). In Gerry's terms, the longest consecutive sequence should be of length $L$ or more. Of course, if you know one probability, you know the other, which makes me suspect that the exact answer is fairly ugly. – fedja Feb 16 '13 at 11:37
  • @Did I'm looking for Pr(L=2|k=4) or Pr(L=3|k=4) or whatever; I'm not looking for the longest L. – user61807 Feb 16 '13 at 11:38
  • Actually, your definition and your example contradict each other, hence the confusion. Nevertheless, as I said, it doesn't matter which way we understand the question :). – fedja Feb 16 '13 at 11:40
  • I never said you did, I said that at present nobody knows what it is you call L (that is, if L is not the quantity @GerryMyerson suggested). – Did Feb 16 '13 at 11:41
  • @user61807 Maybe you can tell which $L$ value each of subsets you mentioned above corresponds to and why. It will help clarify the definition. – polkjh Feb 16 '13 at 11:45
  • @polkjh The question is: of all $k$-size subsets of $[n]$, how many have a sequence of length exactly $L$ (if a subset has one longer than $L$, it shouldn't be counted), or what is the probability that a chosen $k$-size subset has a sequence of size exactly $L$? – user61807 Feb 16 '13 at 11:51
  • Ah, this is a third possible interpretation I overlooked. Now it is clear what is meant :). – fedja Feb 16 '13 at 12:10

1 Answer


There is a bijection between binary strings of length $n$ and compositions of $n+1$. Subsets of $[n]$ are easily represented as binary strings of length $n$, so we can represent subsets of $[n]$ by compositions of $n+1$.
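The answer does not spell the bijection out, so the encoding below is an assumption: it is one standard choice, in which each missing element of $[n]$ closes a part and every part equals the length of a run of present elements plus one. The function names are hypothetical.

```python
def subset_to_composition(subset, n):
    """Map a subset of {1..n} to a composition of n+1.

    Assumed encoding: every absent element closes a part, and each part is
    (length of the run of present elements) + 1.
    """
    members = set(subset)
    parts = []
    run = 0
    for x in range(1, n + 1):
        if x in members:
            run += 1
        else:
            parts.append(run + 1)
            run = 0
    parts.append(run + 1)  # the final run, possibly empty
    return parts


def composition_to_subset(parts):
    """Inverse map: rebuild the subset from a composition of n+1."""
    subset = []
    position = 0
    for i, part in enumerate(parts):
        for _ in range(part - 1):  # a part of size p encodes a run of p-1 elements
            position += 1
            subset.append(position)
        if i < len(parts) - 1:
            position += 1  # skip the absent element separating two parts
    return subset


print(subset_to_composition([1, 2, 4, 5], 5))  # [3, 3]
print(composition_to_subset([2, 4]))           # [1, 3, 4, 5]
```

Under this encoding a $k$-element subset yields $n-k+1$ parts, and its longest run of consecutive integers is the largest part minus one.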

Let $C(n,k,L)$ denote the number of $k$-element subsets of $[n]$ in which the longest sequence of consecutive numbers has length at most $L$. What we need is for that sequence to have length exactly $L$, which is counted by $C(n,k,L)-C(n,k,L-1)$. Dividing by $\binom{n}{k}$ gives the probability.

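Representing subsets as compositions, $C(n,k,L)$ is the number of compositions of $n+1$ into $n-k+1$ parts with each part at most $L+1$: the $n-k$ missing elements cut $[n]$ into $n-k+1$ (possibly empty) runs, and each part is a run length plus one. This was answered in Restricted Compositions.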
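For a concrete count, here is a minimal sketch that evaluates $C(n,k,L)$ with the standard inclusion-exclusion formula for compositions with bounded parts. The linked question may present the result in a different form, so treat this formula as a swapped-in technique rather than as that answer's method; `count_at_most` and `count_exactly` are hypothetical names.

```python
from math import comb


def count_at_most(n, k, L):
    """Number of k-subsets of {1..n} whose longest run of consecutive integers is <= L.

    Counts solutions of r_0 + ... + r_{m-1} = k with 0 <= r_i <= L, where
    m = n - k + 1 is the number of runs (equivalently, compositions of n+1
    into m parts of size at most L+1), by inclusion-exclusion.
    """
    if L < 0 or not 0 <= k <= n:
        return 0
    m = n - k + 1
    total = 0
    for j in range(m + 1):
        remaining = k - j * (L + 1)  # force j chosen runs to exceed L
        if remaining < 0:
            break
        total += (-1) ** j * comb(m, j) * comb(remaining + m - 1, m - 1)
    return total


def count_exactly(n, k, L):
    """Number of k-subsets of {1..n} whose longest run has length exactly L."""
    return count_at_most(n, k, L) - count_at_most(n, k, L - 1)


print(count_exactly(5, 4, 2), comb(5, 4))  # 1 5
print(count_exactly(5, 4, 3), comb(5, 4))  # 2 5
```

For $n=5$, $k=4$ this reproduces the $1/5$ and $2/5$ from the question once the counts are divided by $\binom{5}{4}=5$.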

polkjh
  • I don't see where the L-size sequence is considered. – user61807 Feb 16 '13 at 12:52
  • That is what $C(n,k,L)$ gives. It is the number of subsets with sequence size $L$ or less. So $C(n,k,L)-C(n,k,L-1)$ gives the number of subsets with exactly $L$-size sequences. And in compositions, this is forced by saying that each part is at most $L+1$. Is the mapping between compositions and subsets clear? – polkjh Feb 16 '13 at 13:04
  • So I don't understand what this C(n,k,L) means. C(n,k,L) = ? (Is there an equation?) – user61807 Feb 16 '13 at 13:13
  • That is given in the link I gave above, to another question. (http://math.stackexchange.com/questions/21645/restricted-compositions) – polkjh Feb 16 '13 at 13:15