Probability of substring in string with overlaps

Question

What is the probability that a given substring of length N appears in a longer, uniformly random string over an alphabet of size K, if the substring is allowed to overlap with itself (e.g. AAAA)?

I've seen some approximations to this that assume non-repeating substrings. I'd like to get a better grasp of this problem, but I can't seem to find the right search terms. I expected to find something from Knuth.

I've found an SO post about the Goulden-Jackson method, but it's difficult to understand. Can anyone tell me what this problem is called and where I can learn about it?

score 0 · Answer 1 · answered Feb 29 '20 at 20:13

Hint: A nice presentation of the Goulden-Jackson Cluster Method is given in this paper. You might find the following posts helpful:

Probability of substring in string with overlaps
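
For small cases it can help to check the numbers directly before working through the cluster method. The following is a minimal sketch, not the Goulden-Jackson method itself: it counts the strings that avoid the pattern with a dynamic program over the KMP matching states, then subtracts from 1. The function name `prob_contains` and its parameters are made up for this illustration, and it assumes the alphabet characters are distinct and every position is drawn independently and uniformly.

```python
from fractions import Fraction


def prob_contains(pattern: str, alphabet: str, length: int) -> Fraction:
    """Exact probability that a uniformly random string of the given
    length over `alphabet` contains `pattern` at least once.

    Hypothetical helper for illustration; assumes distinct alphabet
    characters and a non-empty pattern.
    """
    n = len(pattern)
    k = len(alphabet)
    if n == 0:
        raise ValueError("pattern must be non-empty")

    # KMP failure function: fail[i] is the length of the longest proper
    # prefix of pattern[:i+1] that is also a suffix of it. This is what
    # captures self-overlap (e.g. AAAA overlaps itself heavily).
    fail = [0] * n
    j = 0
    for i in range(1, n):
        while j > 0 and pattern[i] != pattern[j]:
            j = fail[j - 1]
        if pattern[i] == pattern[j]:
            j += 1
        fail[i] = j

    def step(state: int, ch: str) -> int:
        # Next matching state after reading ch from a state in 0..n-1.
        while state > 0 and pattern[state] != ch:
            state = fail[state - 1]
        if pattern[state] == ch:
            state += 1
        return state

    trans = [[step(s, ch) for ch in alphabet] for s in range(n)]

    # counts[s] = number of strings read so far that have not yet
    # contained the pattern and currently sit in matching state s.
    counts = [0] * n
    counts[0] = 1
    for _ in range(length):
        new = [0] * n
        for s, c in enumerate(counts):
            if c:
                for t in trans[s]:
                    if t < n:  # state n means the pattern occurred; drop it
                        new[t] += c
        counts = new

    avoiding = sum(counts)
    return 1 - Fraction(avoiding, k ** length)
```

For example, over the alphabet AB with strings of length 3, `prob_contains("AA", "AB", 3)` gives 3/8 while `prob_contains("AB", "AB", 3)` gives 1/2, so two patterns of the same length can have different probabilities purely because one overlaps with itself.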
