Is one regular language a subset of another?

Question

Let $L_1$ and $L_2$ be two regular languages given as regular expressions $r_1$ and $r_2$ (in this type of task it often happens that $L_1 \subseteq L_2$, but the converse is false).

Is there a nice way to prove that $L_1 \subseteq L_2$? If so, could you explain the algorithm?

I had an idea to construct those languages from the regexes using Kleene's theorem and then prove that every word from $L_1$ is a prefix or a suffix of some word $w \in L_2$, where the rest of $w$ can be omitted (i.e. in the regex representation it is under a $^*$ sign).

Another idea is to use brute force: show step by step that every word from $L_1$ is accepted by the DFA corresponding to $r_2$. However, what if there are too many words in $L(r_1)$?

So I don't think these are good ideas.

Example: $$r_1 = (a+ab+bb)(a+b)^* \\ r_2 = aab^*$$ Obviously $L(r_1) \not\subseteq L(r_2)$, but $L(r_2) \subseteq L(r_1)$.

@Jan Dvorak, what do you mean by run in parallel? Draw on a piece of paper next to each other? — petajamaja, Jan 21 '13 at 22:49
Basically, I want you to make a Cartesian product of two deterministic finite state machines and see if any of the [accepts, rejects] or [rejects, accepts] states are reachable from the [start, start] state. — John Dvorak, Jan 21 '13 at 22:53
Umari, those stars are supposed to be superscripts; writing them on the main line is simply wrong. — Brian M. Scott, Jan 21 '13 at 22:54
Are you looking for a manual proof or an automated algorithm? — John Dvorak, Jan 21 '13 at 22:58
OK, @JanDvorak, thanks, I almost understand the method you suggested. However, could you please provide a small example to show why we have to look for [accept,reject] or [reject, accept], but never [accept,accept] or [reject,reject]? — petajamaja, Jan 21 '13 at 23:00
@Umari [accept, accept] and [reject, reject] are not as interesting if you want to prove subset relationship. If you want to test if two languages are disjoint, look for [accept, accept] states. If you want to prove Left is a subset of Right, see if there's any reachable [accept, reject] state. — John Dvorak, Jan 21 '13 at 23:01
@JanDvorak, is the choice between looking for [accept,reject] or for [reject, accept] determined by the order of languages (which is subset of which)? I mean, for example, if $L_1 \subseteq L_2$, then we look for [accept,reject], and if $L_2 \subseteq L_1$, we look for [reject, accept]? If not, then how is the order determined in your algorithm? — petajamaja, Jan 21 '13 at 23:11

score 6 · Accepted Answer · answered Jan 21 '13 at 23:16

The basic idea for an automated proof is:

1. Convert both languages to deterministic finite state machines (complete DFAs over the same alphabet).
2. Calculate the cross product of the two machines, with the transition function $F\colon ([x,y],\sigma) \mapsto [F_x(x,\sigma), F_y(y,\sigma)]$, where $F_x$ and $F_y$ are the transition functions of the first and second machine.
3. Collect the list of states reachable from [start, start], together with their acceptance status in each language. If there are any
- [reject, reject] states, then the union of both languages is not the universal language (there is a word that they both reject).
- [reject, accept] states, then the second language is not a subset of the first.
- [accept, reject] states, then the first language is not a subset of the second.
- [accept, accept] states, then the languages are not disjoint.

Other relations: two languages are equal if each is a subset of the other (however, there are other tests for that); one language is a proper subset of another if it is a subset of it but not vice versa.
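The steps above can be sketched in Python as follows. This is only an illustration under some assumptions: the two DFAs for the example from the question are written out by hand as complete DFAs over {a, b} (converting a regex to a DFA is not shown), and the names `R1_DFA`, `R2_DFA`, `reachable_statuses` and `is_subset` are made up for this sketch.

```python
from collections import deque

# Hand-built complete DFAs over {a, b} for the example in the question
# (an assumption of this sketch; they were not derived mechanically).
# Each DFA is a triple (start_state, accepting_states, transitions).

# r1 = (a+ab+bb)(a+b)^*: exactly the words starting with "a" or with "bb".
R1_DFA = (0, {1}, {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 1, (1, "b"): 1,   # state 1: accepting sink
    (2, "a"): 3, (2, "b"): 1,
    (3, "a"): 3, (3, "b"): 3,   # state 3: dead state
})

# r2 = aab^*
R2_DFA = (0, {2}, {
    (0, "a"): 1, (0, "b"): 3,
    (1, "a"): 2, (1, "b"): 3,
    (2, "a"): 3, (2, "b"): 2,
    (3, "a"): 3, (3, "b"): 3,   # state 3: dead state
})


def reachable_statuses(dfa1, dfa2, alphabet):
    """Explore the product automaton breadth-first from [start, start] and
    return the set of (accepted_by_1, accepted_by_2) pairs that occur."""
    start1, accepting1, delta1 = dfa1
    start2, accepting2, delta2 = dfa2
    start = (start1, start2)
    seen = {start}
    queue = deque([start])
    statuses = set()
    while queue:
        x, y = queue.popleft()
        statuses.add((x in accepting1, y in accepting2))
        for symbol in alphabet:
            # Product transition: step both machines on the same symbol.
            nxt = (delta1[(x, symbol)], delta2[(y, symbol)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return statuses


def is_subset(dfa1, dfa2, alphabet):
    """L(dfa1) is a subset of L(dfa2) iff no [accept, reject] state is reachable."""
    return (True, False) not in reachable_statuses(dfa1, dfa2, alphabet)


print(is_subset(R2_DFA, R1_DFA, "ab"))  # True:  L(r2) is a subset of L(r1)
print(is_subset(R1_DFA, R2_DFA, "ab"))  # False: e.g. "a" is in L(r1) but not in L(r2)
```

The search visits at most $|Q_1| \cdot |Q_2|$ product states, and it classifies each state as it is discovered, so the full product never has to be built up front.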

@Umari Note that it might be beneficial performance-wise to combine the latter two steps. — John Dvorak, Jan 21 '13 at 23:18
