Let $Z\left(n,m\right)$ be the number of unique binary strings of length $m$ containing at least one instance of $n$ consecutive 1's. I am trying to come up with an expression for $Z$, preferably directly calculable, though I will accept a recursive solution as well. I have attempted a formulation based on [1], $$ \hat{Z}\left(n,m\right) = \sum_{q=m}^{n}\sum_{i=1}^{\lfloor \frac{q}{m}\rfloor}(-1)^{i+1}\binom{n-q+1}{i}\binom{n-mi}{n-q}\text{,}$$ however I am getting some discrepancies against test cases I worked out by hand. For example, it works for $Z\left(7,6\right)=3$ and $Z\left(7,5\right)=8$, but it does not work for $Z\left(7,4\right)=16$ (the formulation above gives $20$). N.B.: my definitions of $n$ and $m$ are opposite those of [1]; $q$ is the same.

I believe it has something to do with double-counting some string permutations, but I haven't been able to work out what else I have to take out.

Update: I found a recursive formulation [2] that gives me the same result as my $\hat{Z}$ above: $$ \tilde{Z}\left(n,m\right) = 2\tilde{Z}\left(n-1,m\right) + 2^{n-m-1}-\tilde{Z}\left(n-m-1,m\right) $$ Having found this independent formulation, I will have to revisit my counting and see if I've made a mistake somewhere.
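The recursion can be checked against enumeration with a short sketch. The base cases used here ($\tilde{Z}=0$ when the string is shorter than the run, and $\tilde{Z}=1$ when the two lengths are equal) are assumptions, since they are not quoted from [2], and the names `z_tilde` and `brute_force` are illustrative. The first argument is the string length and the second the run length, matching the test cases above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def z_tilde(n: int, m: int) -> int:
    """Binary strings of length n with at least one run of m ones (m >= 1)."""
    if n < m:   # assumed base case: the string is too short to hold the run
        return 0
    if n == m:  # assumed base case: only the all-ones string qualifies
        return 1
    return 2 * z_tilde(n - 1, m) + 2 ** (n - m - 1) - z_tilde(n - m - 1, m)


def brute_force(n: int, m: int) -> int:
    """Count the same strings by enumeration (small n >= 1 only)."""
    return sum("1" * m in format(i, f"0{n}b") for i in range(2 ** n))


print(z_tilde(7, 6), z_tilde(7, 5), z_tilde(7, 4))
```

Under these assumptions the recursion and the enumeration agree on every small case tried, including the three test cases.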

Bonus points for an answer that works for arbitrary dictionaries, i.e. $W\left(a,n,m\right)$ where $a$ is the number of possible symbols in each position of the string. The original question would be equivalent to $Z\left(n,m\right) = W\left(2,n,m\right)$.

  • If a string of length $m$ contains a run of length $n$ then $n\le m$. Could you give the 7 strings for $Z(5,7)$? I get eight: 0011111, 0111110, 0111111, 1011111, 1111100, 1111101, 1111110, 1111111. – Andrew Woods Mar 17 '15 at 09:50
  • If that's what you had in mind, then $Z(n,m) = 2^m - f_n(m)$ where $f_n(m)=\sum_{j=1}^nf_n(m-j)$ for appropriate starting values. – Andrew Woods Mar 17 '15 at 10:01
  • Sorry, I was omitting the all-1s cases when counting manually (and in my software implementation), since that's what I will ultimately care about. I have updated the question to reflect the proper values according to the stated definition of $Z$. – Aesthatron Mar 17 '15 at 11:26
  • I fail to see where $f_n$ actually takes on a value; as written it appears to be infinitely recursing, while if you flip the $m$ and $n$ around you eventually end up with an empty sum returning 0, which propagates all the way back up the chain. – Aesthatron Mar 17 '15 at 11:34
  • Since I found a completely independent formulation for what should be the same problem ([2]) that gave the same results as my method from ([1]), I went back and re-visited my counting. I found the "missing" strings for Z(7,4), so 20 is in fact the correct value. I have to go to work now but I will play with it tonight and see if it gives me the expected result for other values of $n$ and $m$. – Aesthatron Mar 17 '15 at 12:08
  • The starting values are $f_n(m)=2^m$ for $0\le m<n$. You can test any of these results in Python with an expression like `len([i for i in range(2**8) if '1111' in bin(i)])`. I'll take the time to write a fuller answer with an explanation. – Andrew Woods Mar 17 '15 at 12:32
  • This can be written in terms of n-Step Fibonacci numbers, see e.g. http://mathworld.wolfram.com/Fibonaccin-StepNumber.html – leonbloy Mar 17 '15 at 12:38
  • refer to the answer to this other post – G Cab Oct 25 '17 at 21:34

2 Answers

You can use symbolic combinatorics to get a generating function. It is easier to get the number of strings that don't have $m$ consecutive ones, and subtract from the total.

Call $\mathcal{B}_{1^m}$ the set of binary strings with no $m$ consecutive ones (the complement of the set you are after), and $\mathcal{P}_{< m}$ the set of strings of fewer than $m$ ones, i.e., $\{ \epsilon, 1, 11, \dotsc, 1^{m - 1} \}$. We have the following symbolic equations:

$\begin{align} \mathcal{P}_{< m} &= \mathcal{E} + \{ 1 \} + \dotsb + \{ 1 \}^{m - 1} \\ \mathcal{B}_{1^m} &= \mathcal{P}_{< m} + \mathcal{P}_{< m} \times \{ 0 \} \times \mathcal{B}_{1^m} \end{align}$

Essentially, a string with no run of $m$ ones is either a string of fewer than $m$ ones, or a string of fewer than $m$ ones followed by a zero and then another string with no run of $m$ ones.

Use $z$ to mark length, so that the symbol itself is irrelevant, and write the equations for the generating functions:

$\begin{align} P_{< m}(z) &= 1 + z + \dotsb + z^{m - 1} \\ &= \frac{1 - z^m}{1 - z} \\ B_{1^m}(z) &= P_{< m}(z) + P_{< m}(z) \cdot z \cdot B_{1^m}(z) \\ &= \frac{1 - z^m}{1 - z} (1 + z B_{1^m}(z)) \end{align}$

Solving:

$$ B_{1^m}(z) = \frac{1 - z^m}{1 - 2 z + z^{m + 1}} $$
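The coefficients can be read off numerically without partial fractions. Multiplying out $B_{1^m}(z)\,(1 - 2z + z^{m+1}) = 1 - z^m$ gives the recurrence $b_n = 2 b_{n-1} - b_{n-m-1} + [n = 0] - [n = m]$, with terms of negative index taken as zero. A minimal sketch of that, with illustrative function names:

```python
def no_run_counts(m: int, n_max: int) -> list:
    """Coefficients b_0..b_n_max of (1 - z^m) / (1 - 2z + z^(m+1))."""
    b = []
    for n in range(n_max + 1):
        # Contribution of the numerator 1 - z^m.
        term = (1 if n == 0 else 0) - (1 if n == m else 0)
        # Contribution of the denominator 1 - 2z + z^(m+1).
        if n >= 1:
            term += 2 * b[n - 1]
        if n >= m + 1:
            term -= b[n - m - 1]
        b.append(term)
    return b


def z(n: int, m: int) -> int:
    """Strings of length n with at least one run of m ones."""
    return 2 ** n - no_run_counts(m, n)[n]


print(no_run_counts(4, 8))
print(z(8, 4))
```

Here `no_run_counts(m, n)[n]` counts the strings of length $n$ with no run of $m$ ones, so `z` subtracts it from $2^n$.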

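The coefficient of $z^n$ in this is the number of strings of length $n$ with no run of $m$ ones; what you are looking for is $2^n$ minus that coefficient. There is no simple expression for the coefficient in the general case. For $m = 1$ it is: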

$$ B_1(z) = \frac{1 - z}{1 - 2 z + z^2} = \frac{1}{1 - z} $$

so that every coefficient is $1$ and $Z(n, 1) = 2^n - 1$: only the all-zeros string contains no one. No surprise there.

If $m = 2$ you have:

$$ B_{11}(z) = \frac{1 - z^2}{1 - 2 z + z^3} = \frac{1 + z}{1 - z - z^2} $$

so $Z(n, 2) = 2^n - F_{n + 2}$, where $F_n$ is a Fibonacci number, defined by:

$$ F_0 = 0, F_1 = 1, F_{n + 2} = F_{n + 1} + F_n $$
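As a quick check of the $m = 2$ formula, worked by hand for two small lengths: for $n = 3$ we have $F_5 = 5$, and for $n = 4$ we have $F_6 = 8$, so

$$ Z(3, 2) = 2^3 - F_5 = 8 - 5 = 3, \qquad Z(4, 2) = 2^4 - F_6 = 16 - 8 = 8 $$

The first value matches the three strings 011, 110, 111, and the second matches the eight strings 0011, 0110, 0111, 1011, 1100, 1101, 1110, 1111.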

The cases $m = 3$ and $4$ are still doable, but are a horrible mess of roots when expanding the generating function in partial fractions.

vonbrand

It is easiest to count strings which don't contain the runs of ones. Suppose you start with a string of $m$ ones: $$1111111111\ldots1111111111\ \_$$ To ensure that there are no four consecutive ones, we will replace ones with zeros, hopping from left to right, and landing on the underscore.

Counting binary sequences without runs of a given length is equivalent to the stairs problem, with each footfall marking the zero which prevents a run.

Start by considering sequences without 1111. This is like climbing a staircase of height $m$, with steps of length 1, 2, 3, or 4. Let $f(n)$ be the number of ways of reaching step $n$. If you are at step $n$, you could have got there from steps $n-1, n-2, n-3,$ or $n-4$. Thus $$f(n) = f(n-1)+f(n-2)+f(n-3)+f(n-4)$$ and you start with $1, 2, 4, 8$: $$1, 2, 4, 8, 15, 29, 56, 108$$

For strings of length $8$, the underscore is step 9, and the final hop onto it must start from step 5 at least, so add $15+29+56+108=208$. Therefore there are $256-208=48$ sequences of length 8 which contain 1111.
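The staircase count can be sketched in a few lines and compared with the enumeration one-liner from the comments. The function names are illustrative, and the starting values $1, 2, 4, \dots$ are generalised to an arbitrary run length on the assumption that any string shorter than the run is allowed.

```python
def stair_counts(run: int, top: int) -> list:
    """f(1)..f(top): ways to reach each step using hops of 1..run."""
    f = [0] * (top + 1)  # f[0] is unused padding so that f[n] is step n
    for n in range(1, top + 1):
        if n <= run:
            f[n] = 2 ** (n - 1)       # starting values 1, 2, 4, 8, ...
        else:
            f[n] = sum(f[n - run:n])  # arrive from any of the previous `run` steps
    return f[1:]


def count_with_run(length: int, run: int) -> int:
    """Binary strings of the given length that contain `run` consecutive ones."""
    # The underscore sits one step past the end of the string.
    return 2 ** length - stair_counts(run, length + 1)[-1]


print(stair_counts(4, 8))
print(count_with_run(8, 4))
print(len([i for i in range(2 ** 8) if "1111" in bin(i)]))
```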

As for the situation with $a>2$, the only modification is that each footfall can represent any of the $a-1$ digits which aren't $1$. Thus for base-$a$ strings, we'd use $$f(n)=(a-1)(f(n-1)+f(n-2)+f(n-3)+f(n-4))$$ instead.
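A minimal sketch of the base-$a$ version, checked against enumeration. It is indexed by string length rather than by step, and it assumes starting values $a^k$ for lengths $k$ shorter than the run, which the answer does not state. The names `count_without_run`, `w` and `brute_force` are illustrative, and the enumeration only supports $a \le 10$ because it writes each symbol as a single digit.

```python
from itertools import product


def count_without_run(a: int, length: int, run: int) -> int:
    """Base-a strings of the given length with no `run` consecutive 1's."""
    # g[k] counts valid strings of length k. Every string shorter than
    # `run` is valid, so g[k] = a**k there (assumed starting values).
    g = [a ** k for k in range(min(run, length + 1))]
    for k in range(run, length + 1):
        g.append((a - 1) * sum(g[k - run:k]))
    return g[length]


def w(a: int, length: int, run: int) -> int:
    """Base-a strings of the given length with at least one such run."""
    return a ** length - count_without_run(a, length, run)


def brute_force(a: int, length: int, run: int) -> int:
    """Same count by enumeration, for small cases with a <= 10."""
    target = "1" * run
    return sum(
        target in "".join(map(str, digits))
        for digits in product(range(a), repeat=length)
    )


print(w(2, 8, 4), w(3, 5, 2))
```

With $a = 2$ this reduces to the binary counts above.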

Andrew Woods