4

A language $L ⊆ \{0, 1\}^*$ is called substring closed if, for all $x,y \in \{0,1\}^*$, $y \in L$ and $x \preceq y$ ($x$ is a substring of $y$) imply $x \in L$.

I want to prove that every substring closed language $L ⊆ \{0, 1\}^*$ is regular.

Is it enough to take an $L$ with just $x$ in it, and then prove that $L$ is regular?

MR.-c
  • What do you mean by "just $x$ in it"? Can you clarify? – John L. Feb 20 '22 at 15:30
  • I mean all strings that have no substrings, or the strings that are substrings of another string. If I put them in a language and somehow prove that this language is regular, then I can also prove that the substring closed language is regular. – MR.-c Feb 20 '22 at 15:54

2 Answers

16

There are some substring-closed languages that are not regular.

Here is an example. Let $C=\{01^n0^n1\mid n\ge1\}$ and $F=\{f\mid \exists c\in C, f\preceq c\}$, i.e., $F$ is the language of all substrings of strings in $C$.

  • $F$ is substring-closed, since a substring of a substring of a string $c$ is also a substring of $c$.
  • The intersection of $F$ and the regular language $\{01w01\mid w\in \{0,1\}^*\}$ is $C$, a non-regular language. Since regular languages are closed under intersection, $F$ is not regular.
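The intersection claim can be sanity-checked on a bounded version of $C$. The following is only a sketch: the bound `MAX_N` and the helper names `substrings` and `c_word` are choices made for this illustration, and a finite check is of course not a proof.

```python
def substrings(s):
    """Return the set of all contiguous substrings of s (including the empty string)."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}


def c_word(n):
    """Return the word 0 1^n 0^n 1 from C."""
    return "0" + "1" * n + "0" * n + "1"


MAX_N = 6  # arbitrary bound, assumed only for this illustration

# Bounded versions of C and F.
C = {c_word(n) for n in range(1, MAX_N + 1)}
F = set().union(*(substrings(c) for c in C))

# Members of F of the form 01 w 01; length >= 4 keeps the two "01" blocks disjoint.
filtered = {f for f in F if len(f) >= 4 and f.startswith("01") and f.endswith("01")}

print(filtered == C)
```

The reason the filter recovers exactly $C$ is that "01" occurs in $01^n0^n1$ only at the very beginning and at the very end, so a substring that starts and ends with disjoint copies of "01" must be the whole word.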
John L.
9

The result I know holds for SUBSEQUENCE closed languages, and not SUBSTRING closed. The confusion is frequent between English and French, since "subsequence" is translated into "sous-mot", which literally means "subword" or "substring"…

To be clear, $u=u_1u_2…u_n$ is a subsequence of $v=v_1v_2…v_m$ if and only if there exist indices $1\leqslant i_1 < i_2 < … < i_n \leqslant m$ such that $u = v_{i_1}…v_{i_n}$.
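To make the difference with substrings concrete, here is a small sketch of both tests. The function names `is_subsequence` and `is_substring` are just names picked for this example.

```python
def is_subsequence(u, v):
    """Return True if u can be obtained from v by deleting zero or more letters."""
    it = iter(v)
    # `letter in it` consumes the iterator up to the first match,
    # so the matched positions in v are strictly increasing.
    return all(letter in it for letter in u)


def is_substring(u, v):
    """Return True if u occurs in v as a contiguous block."""
    return u in v


print(is_subsequence("011", "0101"))  # the letters of u appear in order in v
print(is_substring("011", "0101"))    # but not contiguously
```

Every substring is a subsequence, but not conversely, so being subsequence closed is a stronger requirement on $L$ than being substring closed.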

The proof I know is a bit long. I will add some details if necessary.

For any language $L$, denote by $\widehat{L}$ the set of words that have a subsequence in $L$: $$\widehat{L} = \{v\in\Sigma^*\mid \exists u\in L, u\preccurlyeq v\}$$

The proof goes as such:

  • prove that for any sequence $(u_n)_{n\in\mathbb{N}} \in \left(\Sigma^*\right)^{\mathbb{N}}$, there exist two indices $i<j$ such that $u_i\preccurlyeq u_j$;
  • show that for any language $L$, there exists a finite language $F$ such that $\widehat{F} = \widehat{L}$;
  • prove that if $L = \widehat{L}$, then $L$ is regular;
  • conclude that if $L$ is a subsequence closed language, then it is regular.

Each step uses the result of the previous one.
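One way to see why a finite $F$ helps in the third step: for a single word $u = u_1…u_n$, the words having $u$ as a subsequence form the regular language $\Sigma^* u_1 \Sigma^* … u_n \Sigma^*$, and $\widehat{F}$ is a finite union of such languages. The sketch below builds the corresponding regular expression; the function name `upward_closure_regex` and the default alphabet `"01"` are assumptions made for this illustration, not part of the proof above.

```python
import re


def upward_closure_regex(basis, alphabet="01"):
    """Compile a regex for the words that have some word of `basis` as a subsequence.

    `basis` is assumed to be a finite set of words over `alphabet`.
    """
    any_star = "[" + re.escape(alphabet) + "]*"
    branches = []
    for u in sorted(basis):
        # Sigma* u_1 Sigma* u_2 ... u_n Sigma*
        branches.append(any_star + "".join(re.escape(a) + any_star for a in u))
    # An empty basis has an empty closure; "(?!)" is a pattern that never matches.
    return re.compile("|".join(branches) if branches else r"(?!)")


pattern = upward_closure_regex({"01", "11"})
print(pattern.fullmatch("1001") is not None)  # "01" is a subsequence of "1001"
print(pattern.fullmatch("100") is not None)   # neither "01" nor "11" is a subsequence
```

Python's `re` is used here only as a convenient notation for the regular expression; the pattern avoids backreferences and lookarounds (except in the degenerate empty case), so it describes a regular language in the formal sense.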

Nathaniel
  • The word "subsequence" is usually translated into "sous-séquence" as shown by Google Translate. – John L. Feb 21 '22 at 21:29
  • @JohnL.: I don't know one way or the other, but I would not trust Google Translate on this. (If I go to the English Wikipedia article for "subsequence" and hit "Français", I get taken to a page titled "Sous-suite", which isn't either of the possibilities brought up so far.) – user2357112 Feb 21 '22 at 22:00
  • @JohnL. "Subsequence" is translated into « sous-suite » (more often than « sous-séquence ») when talking about general sequence theory (not particularly for words). It is translated into « sous-mot » when talking about a subsequence of a word (see the fourth bullet point). – Nathaniel Feb 21 '22 at 22:38
  • @JohnL. My answer was not a typo: the proof I know first shows that a supersequence-closed language is regular, then uses this result to prove that a subsequence-closed language is regular. The second point of the proof was not so obvious to me using your definition. – Nathaniel Feb 26 '22 at 22:03
  • @JohnL. If $L$ is subsequence-closed, then the complement $\overline{L}$ is supersequence-closed, hence regular, and so is $L$. – Nathaniel Feb 27 '22 at 22:31