Let $L_1$ and $L_2$ be two regular languages given as regular expressions (in this type of task it often happens that $L_1 \subseteq L_2$ holds, but the converse does not).

Is there a nice way to prove that $L_1 \subseteq L_2$? If yes, then could you explain the algorithm?

I had an idea to construct those languages from the regexes using Kleene's theorem and then prove that every word from $L_1$ can be a prefix or a suffix of some word $w \in L_2$, where the rest of $w$ can be omitted (i.e. in the regex representation it is under the $^*$ sign).

OK, another idea is to use brute force: show, word by word, that every word from $L(r_1)$ is accepted by the DFA corresponding to $r_2$. However, what if there are too many words in $L(r_1)$?

So I don't think these are good ideas.

Example: $$r_1 = (a+ab+bb)(a+b)^* \\ r_2 = aab^*$$ Obviously $L(r_1) \not\subseteq L(r_2)$, but $L(r_2) \subseteq L(r_1)$.

  • You could convert both to FSMs and run them in parallel. – John Dvorak Jan 21 '13 at 22:48
  • @Jan Dvorak, what do you mean by run in parallel? Draw on a piece of paper next to each other? – petajamaja Jan 21 '13 at 22:49
  • FSM = Finite state machines? – petajamaja Jan 21 '13 at 22:51
  • Basically, I want you to make a Cartesian product of two deterministic finite state machines and see if any of the [accepts, rejects] or [rejects, accepts] states are accessible from the [start, start] state. – John Dvorak Jan 21 '13 at 22:53
  • Umari, those stars are supposed to be superscripts; writing them on the main line is simply wrong. – Brian M. Scott Jan 21 '13 at 22:54
  • Are you looking for a manual proof or an automated algorithm? – John Dvorak Jan 21 '13 at 22:58
  • OK, @JanDvorak, thanks, I almost understand the method you suggested. However, could you please provide a small example to show why we have to look for [accept,reject] or [reject, accept], but never [accept,accept] or [reject,reject]? – petajamaja Jan 21 '13 at 23:00
  • Actually, I am looking for any of those. – petajamaja Jan 21 '13 at 23:01
  • @Umari [accept, accept] and [reject, reject] are not as interesting if you want to prove subset relationship. If you want to test if two languages are disjoint, look for [accept, accept] states. If you want to prove Left is a subset of Right, see if there's any reachable [accept, reject] state. – John Dvorak Jan 21 '13 at 23:01
  • @JanDvorak, is the choice between looking for [accept,reject] or for [reject, accept] determined by the order of languages (which is subset of which)? I mean, for example, if $L_1 \subseteq L_2$, then we look for [accept,reject], and if $L_2 \subseteq L_1$, we look for [reject, accept]? If not, then how is the order determined in your algorithm? – petajamaja Jan 21 '13 at 23:11
  • I have corrected the asterisk. =) – petajamaja Jan 21 '13 at 23:14

1 Answer

The basic idea for an automated proof is:

  • Convert both languages to complete deterministic finite state machines (every state needs a transition on every symbol; add a dead state if necessary).
  • Calculate the cross product of the finite state machines with the transitions $F\colon ([x,y],\sigma) \mapsto [F_x(x,\sigma), F_y(y,\sigma)]$
  • Collect the list of states reachable from [start, start], and their acceptance status in both languages. If there are any
    • [reject, reject] states, then the union of both languages is not the universal language (there is a word that they both reject).
    • [reject, accept] states, then the latter is not a subset of the former.
    • [accept, reject] states, then the former is not a subset of the latter.
    • [accept, accept] states, then the languages are not disjoint.

Other relations: two languages are equivalent if each is a subset of the other (however, there are other tests for that); one language is a proper subset of another language if it is a subset of it but not vice versa.
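
To make the procedure concrete, here is a minimal Python sketch of the product construction, applied to the example from the question. The function names `reachable_classes` and `is_subset` are made up for illustration, and the two DFAs `D1` and `D2` were written by hand for $r_1$ and $r_2$ (an assumption; the regex-to-DFA conversion itself is not shown). Both automata are complete, with state 3 acting as the dead state.

```python
from collections import deque

def reachable_classes(dfa1, dfa2, alphabet):
    """Return the set of (accepted_by_1, accepted_by_2) pairs that occur
    among the product states reachable from [start, start]."""
    delta1, start1, final1 = dfa1
    delta2, start2, final2 = dfa2
    seen = {(start1, start2)}
    queue = deque([(start1, start2)])
    while queue:
        p, q = queue.popleft()
        for symbol in alphabet:
            # Product transition: step both machines on the same symbol.
            nxt = (delta1[p, symbol], delta2[q, symbol])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return {(p in final1, q in final2) for p, q in seen}

def is_subset(dfa1, dfa2, alphabet):
    """L(dfa1) is a subset of L(dfa2) iff no [accept, reject] state is reachable."""
    return (True, False) not in reachable_classes(dfa1, dfa2, alphabet)

ALPHABET = "ab"

# Hand-built complete DFA for r1 = (a+ab+bb)(a+b)^*  (assumption, not derived here).
# 0 = start, 1 = accepting sink, 2 = "read one b", 3 = dead state.
D1 = ({(0, "a"): 1, (0, "b"): 2,
       (1, "a"): 1, (1, "b"): 1,
       (2, "a"): 3, (2, "b"): 1,
       (3, "a"): 3, (3, "b"): 3}, 0, {1})

# Hand-built complete DFA for r2 = aab^*  (assumption, not derived here).
# 0 = start, 1 = "read a", 2 = accepting (aa followed by b's), 3 = dead state.
D2 = ({(0, "a"): 1, (0, "b"): 3,
       (1, "a"): 2, (1, "b"): 3,
       (2, "a"): 3, (2, "b"): 2,
       (3, "a"): 3, (3, "b"): 3}, 0, {2})

print(is_subset(D2, D1, ALPHABET))  # True:  L(r2) is a subset of L(r1)
print(is_subset(D1, D2, ALPHABET))  # False: e.g. the word "a" is in L(r1) only
```

With these two helpers, equivalence can be checked as `is_subset(A, B, ...) and is_subset(B, A, ...)`, and a proper subset as `is_subset(A, B, ...) and not is_subset(B, A, ...)`.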