Given an NFA with alphabet $\Sigma = \{a, b, c\}$ defined in the diagram, is there a way to efficiently convert it into a regular expression?

NFA

The way I solved this problem was to first convert the NFA into a DFA using equivalence classes, and then proceed with the method described here. In this case I find that method very inconvenient, as there are many loops and multiple accepting states. I wrote down the partial regular expressions for each accepting state separately, took their union, and then eliminated the redundant parts. My answer is $a(a^+ \cup b^*a \cup c^*)^+$.

I also tried eliminating the $\varepsilon$ edges first and then starting from the "sanitized" NFA, but that produced far too many edges in this case, and the process became very confusing.

Edit: as @DavidRicherby has commented below, converting the NFA to a DFA first is not necessary and makes the problem more complex.

John

1 Answer

The state elimination method is a good technique if used properly.

Solving $f_2$ can be done directly from the given automaton.
$$ f_2 = ( s \rightarrow q_1 \rightarrow f_2 + s \rightarrow q_2 \rightarrow f_2 + s \rightarrow f_1 \rightarrow f_2)^+ $$
(The overall Kleene plus is due to the transition $(f_2,\epsilon) \rightarrow s$.)
$$ f_2 = (a\cdot a^*\cdot a + a\cdot b^*\cdot a + (a\cdot c^*)^+\cdot c)^+ $$
This can be rearranged as
$$ f_2 = (a\cdot a^+ + a\cdot b^*\cdot a + (a\cdot c^*)\cdot (a\cdot c^*)^*\cdot c )^+$$
$$ = ( a \cdot ( a^+ + b^*\cdot a + c^*\cdot (a\cdot c^*)^*\cdot c) )^+ $$
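
The rearrangement of $f_2$ can be sanity-checked by brute force. The following is a minimal sketch, not part of the original answer: it transcribes the first and last forms of $f_2$ into Python `re` syntax (union `+` becomes `|`) and compares them on every word over $\{a, b, c\}$ up to a small length. The names `ORIGINAL`, `REARRANGED`, `words` and `disagreements` are made up for this illustration, and a bounded check of this kind only supports the equivalence; it does not prove it.

```python
import re
from itertools import product

# Hypothetical transcriptions of the two forms of f_2 shown above.
ORIGINAL = re.compile(r"(?:aa*a|ab*a|(?:ac*)+c)+")
REARRANGED = re.compile(r"(?:a(?:a+|b*a|c*(?:ac*)*c))+")


def words(alphabet, max_len):
    """Yield every word over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def disagreements(max_len=7):
    """Return the words on which the two expressions differ."""
    return [
        w
        for w in words("abc", max_len)
        if bool(ORIGINAL.fullmatch(w)) != bool(REARRANGED.fullmatch(w))
    ]


if __name__ == "__main__":
    # An empty list means no counterexample was found up to the bound.
    print(disagreements())
```

The same harness could be pointed at any other pair of candidate expressions by swapping the two patterns.
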
Similarly, $f_1$ can be solved directly from the given automaton as follows:
$$ f_1 = ( s \rightarrow q_1 \rightarrow f_2 \rightarrow s + s \rightarrow q_2 \rightarrow f_2 \rightarrow s)^* \cdot s \rightarrow f_1 \cdot (\epsilon + c \cdot \epsilon \cdot f_1) $$
$$ = ( a \cdot a^+ + a\cdot b^* \cdot a )^* \cdot (a\cdot c^*)^+ \cdot ( \epsilon + c \cdot \epsilon \cdot f_1) $$

$( \epsilon + c \cdot \epsilon \cdot f_1)$ represents the two possible transitions from $f_1$. If the input has been completely scanned, stop at $f_1$. If not, move to $f_2$ using $c$, then move to $s$ using $\epsilon$, and start scanning for $f_1$ again.

This recursive relationship can be solved by taking
$$ r = ( a \cdot a^+ + a\cdot b^* \cdot a )^* \cdot (a\cdot c^*)^+$$
Then $f_1$ becomes

$$ f_1 = r \cdot ( \epsilon + c \cdot f_1)$$
$$ = r \cdot ( \epsilon + c \cdot ( r \cdot ( \epsilon + c \cdot f_1) ) )$$
Unravelling this recursion gives us
$$ f_1 = r + r \cdot c \cdot r + r \cdot c \cdot r \cdot c \cdot r + \cdots $$
$$ = r \cdot ( c \cdot r )^*$$
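
The same closed form can be cross-checked with Arden's rule, which the derivation above does not invoke. It is a standard identity stating that $X = A \cdot X + B$ has the solution $X = A^* \cdot B$, which is unique when $\epsilon \notin A$. A sketch, taking $A = r \cdot c$ and $B = r$:

$$ f_1 = r \cdot ( \epsilon + c \cdot f_1) = r + (r \cdot c) \cdot f_1 $$
$$ f_1 = (r \cdot c)^* \cdot r = r \cdot (c \cdot r)^* $$

Here $\epsilon \notin r \cdot c$ because every word in it ends in $c$, so the solution is unique and agrees with the unravelled series.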

  • Thank you for the answer! I am still a bit confused by the $f_1$ part: what does $\epsilon + c \cdot f_2$ represent at the end of that equation? In particular, what does $c \cdot f_2$ mean? (I understand the notation of e.g. $s \rightarrow q_1$, but here $c$ is an edge/input/shorthand for a transition function, whereas $f_2$ is a state.) – John Sep 17 '18 at 13:31
  • @John I made a mistake with $\epsilon + c \cdot f_2$. Let me correct it. – RandomPerfectHashFunction Sep 17 '18 at 13:40
  • @John I edited it a bit. I hope it seems clearer now. – RandomPerfectHashFunction Sep 17 '18 at 17:18
  • Ah, I get it now! I didn't realize that this is meant to represent recursion. Thank you! – John Sep 17 '18 at 18:59