What is the probability that a given substring of length N appears at least once in a longer string drawn uniformly at random over an alphabet of size K, if the substring is allowed to contain repeated characters (e.g. AAAA)?

I've seen some approximations to this that assume non-repeating substrings. I'd like to get a better grasp of this problem, but I can't seem to find the right search terms. I expected to find something from Knuth.

I've found an SO post about the Goulden-Jackson method, but it's difficult to understand. What is this problem called, and where can I learn about it?

RobPratt
  • 45,619
noel
  • 145

1 Answer


Hint: A nice presentation of the Goulden-Jackson Cluster Method is given in this paper. You might find the following posts helpful:
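
Separately from the cluster method, for a single fixed word the probability can be computed exactly for small cases, which is handy for checking any approximation. The sketch below is not the Goulden-Jackson method: it is a plain dynamic program over the states of a KMP-style matching automaton, and it assumes the alphabet symbols are distinct, each character is uniform and independent, and the word only uses symbols from the alphabet. The function name `prob_contains` and its parameters are made up for illustration.

```python
from fractions import Fraction


def prob_contains(pattern, alphabet, length):
    """Exact probability that `pattern` occurs at least once in a string of
    `length` characters drawn independently and uniformly from `alphabet`."""
    n = len(pattern)
    if n == 0:
        return Fraction(1)

    # KMP failure function: fail[i] is the length of the longest proper
    # prefix of pattern[:i + 1] that is also a suffix of it.
    fail = [0] * n
    k = 0
    for i in range(1, n):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k

    def step(state, char):
        # state = number of pattern characters currently matched (0..n-1).
        while state and pattern[state] != char:
            state = fail[state - 1]
        return state + 1 if pattern[state] == char else 0

    p = Fraction(1, len(alphabet))
    # dist[s] = probability of being in state s without a full match so far.
    dist = [Fraction(0)] * n
    dist[0] = Fraction(1)
    for _ in range(length):
        new = [Fraction(0)] * n
        for state, mass in enumerate(dist):
            if mass:
                for char in alphabet:
                    nxt = step(state, char)
                    if nxt < n:  # nxt == n means a match; that mass is absorbed
                        new[nxt] += mass * p
        dist = new
    return 1 - sum(dist)


print(prob_contains("AA", "AB", 3))  # 3/8
print(prob_contains("AB", "AB", 3))  # 1/2
```

The two printed values differ even though both words have length 2: AA overlaps with itself, so its occurrences cluster and it is less likely to appear at least once than AB. That self-overlap is exactly what approximations for non-repeating substrings ignore.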

Markus Scheuer
  • 108,315