I want to know if the following problem is decidable:

Instance: An NFA A with n states

Question: Does there exist some prime number p such that A accepts some string of length p?

My belief is that this problem is undecidable, but I can't prove it. The decider can easily check whether a particular number is prime, but I don't see how it could analyze the NFA in enough detail to know exactly which string lengths it accepts. It could start testing strings against the NFA, but for an infinite language it might never halt (and thus would not be a decider).

The NFA can easily be changed to a DFA or regular expression if the solution needs it, of course.

This question is something I've been pondering as a self-made prep question for a final I have coming up in 2 weeks.

Raphael
Chill
  • I am not sure if this is undergrad-level, so don't worry about deleting it. It might turn out to be a hard problem, see e.g. http://terrytao.wordpress.com/2007/05/25/open-question-effective-skolem-mahler-lech-theorem/ –  Mar 27 '13 at 17:44
  • Well, I made it up, so it may well be difficult. I haven't found any proofs of undecidable problems involving NFAs/DFAs, which is why I thought it might be interesting to try one. –  Mar 27 '13 at 17:53
  • I believe what you linked to is a different (easier) problem. It can answer "how many strings of length x does an NFA accept?". Using the formula provided, we would have to check infinitely many instances of $s_L(n)$ to see if there exists a string the NFA accepts that is prime in length. I'm not asking about a particular prime, I'm asking about all of them. –  Mar 27 '13 at 19:01

1 Answer

The lengths of the strings accepted by a DFA form a semilinear set (as in Parikh's theorem for context-free languages). A description of that set isn't too hard to come by (essentially, splice together all possible cycles of the automaton), and by Dirichlet's theorem any arithmetic progression of the form $a + b k$ with $\gcd(a, b) = 1$ contains an infinitude of primes.

Pulling the above together gives an algorithm to check whether your regular (or even context-free) language contains strings of prime length. Definitely not a simple question, IMVHO...
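
For the regular case, here is a minimal sketch of how the pieces could fit together. It is an illustration under stated assumptions, not a polished implementation: the NFA is assumed to have no ε-transitions and to be given as a plain successor graph with the symbols erased, and the names `accepts_prime_length`, `edges`, `start` and `finals` are made up for this example. The sketch follows the set of states reachable after $k$ steps; that sequence must eventually repeat, which gives a preperiod $m$ and a period $p$. Each accepted residue $a$ in the periodic part yields a progression $a + p k$. If $\gcd(a, p) = 1$, Dirichlet's theorem gives a prime. If $\gcd(a, p) > 1$, every element is divisible by that gcd, so a prime can only occur among the first few elements, and those are checked directly.

```python
from math import gcd


def is_prime(n):
    """Trial division; good enough for a sketch."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def accepts_prime_length(edges, start, finals):
    """Decide whether some accepted string has prime length.

    edges:  dict mapping a state to an iterable of successor states
            (symbols erased; no epsilon-transitions assumed).
    start:  the start state.
    finals: iterable of accepting states.
    """
    finals = set(finals)
    seen = {}      # set of reachable states -> first step at which it occurred
    accepted = []  # accepted[k]: is some string of length k accepted?
    current = frozenset([start])
    while current not in seen:
        seen[current] = len(accepted)
        accepted.append(bool(current & finals))
        current = frozenset(t for s in current for t in edges.get(s, ()))
    m = seen[current]        # preperiod
    p = len(accepted) - m    # period (p >= 1)

    # Direct check of all lengths below m + 2p; this covers every
    # progression whose residue shares a factor with p.
    for k in range(m + 2 * p):
        idx = k if k < m + p else k - p
        if accepted[idx] and is_prime(k):
            return True

    # Dirichlet: a + p*k with gcd(a, p) = 1 contains infinitely many primes.
    return any(accepted[a] and gcd(a, p) == 1 for a in range(m, m + p))
```

For example, the automaton for $a a a a (a a)^*$ can be written as `{0: [1], 1: [2], 2: [3], 3: [4], 4: [5], 5: [4]}` with start state `0` and final state `4`; the sketch finds $m = 4$, $p = 2$ and answers no, since every accepted length is even and at least 4. The subset-style simulation can take exponentially many steps in the number of states, so this is about decidability rather than efficiency.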

vonbrand
  • I'd appreciate some help understanding Parikh's theorem in this instance.

    We can obviously turn an NFA into a PDA by just not using the stack in the PDA. Do the linear subsets specify the cycles? If so, how does that work?

    – Chill Mar 27 '13 at 20:28
  • @Chill, Consider any path through the DFA. It might go straight from starting state to final state, or it might loop. The possible lengths of strings are determined by the "straight portion" + a sum of $k$ times "length of a possible loop" for arbitrary $k$s. Just draw some tangle of a DFA, and trace the paths through it. You will see the possible lengths fall into families of arithmetic sequences defined by the cycles, i.e., they form a semilinear set. No need to go context free (just a nice free bonus). – vonbrand Mar 27 '13 at 20:39
  • I think that answers my question. I'm going to try to read up more on Parikh's theorem. I understand the idea of it and how it can specify cycles in this case. What I want to figure out is a more "hands on" solution where I make an actual algorithm to solve this problem. – Chill Mar 27 '13 at 20:55
  • @Chill, just look at my previous comment. It isn't so hard to come up with a description of the possible lengths by just erasing the symbols on the DFA, treating it as a graph, and checking for walks between the start state and final states. Hard to formalize, easy to figure out by hand for any given example. – vonbrand Mar 27 '13 at 21:00
  • @Chill This also relates to the proof of the Pumping lemma: if some (long) word of length $n$ is accepted by the automaton, so are infinitely many more whose lengths have the form $ki + (n-k)$, among others. – Raphael Mar 28 '13 at 10:14
  • Indeed now the algorithm that answers the question is very simple: if the language is infinite (i.e. if there exists a cycle which is accessible and co-accessible), then answer YES. Otherwise, just check the length of all the possible paths in the automaton. Parikh's theorem is not even needed, because we just need ONE cycle to obtain the existence of a prime number in the language, no need to consider the interactions between cycles. – Denis Mar 28 '13 at 12:11
  • @dkuper, it is not that simple. The regular language $a a a a (a a)^*$ is infinite, but contains no string of prime length. – vonbrand Mar 28 '13 at 12:17
  • @Chill For an elementary proof that the lengths of strings accepted by a DFA form a semilinear set, see "What are the possible sets of word lengths in a regular language?" – Gilles 'SO- stop being evil' Mar 28 '13 at 22:40
  • ah you're right my bad – Denis Mar 29 '13 at 16:17