Suppose $L \subseteq \{0\}^*$. How can we prove that $L^*$ is regular?

If $L$ is regular, then of course $L^*$ is also regular. If $L$ is finite, then it is regular and again $L^*$ is regular. Also I have noticed that, for $L = \{0^p \mid p \text{ is a prime}\}$, $L$ is not regular, $L \subseteq \{0\}^*$ and $L^*$ is regular.

But how can this be shown for an arbitrary subset $L$ of $\{0\}^*$?

Paresh
ChesterX
Related: https://cs.stackexchange.com/q/21765/755, https://cs.stackexchange.com/q/145952/755 – D.W. Nov 21 '21 at 03:00

2 Answers

Assume that $L$ contains two words $w_1$ and $w_2$ whose lengths $|w_1|$ and $|w_2|$ are coprime. Then the longest word that cannot be formed by concatenating copies of these two words has length $(|w_1|-1)(|w_2|-1) - 1$ (the Frobenius number). That is to say, if $L$ contains words with coprime lengths, then every word of at least a certain length is in $L^*$, so $L^*$ is missing only finitely many words of $\{0\}^*$. It's easy to see this is regular since, of necessity, there are a finite number of equivalence classes under the Myhill-Nerode indistinguishability relation. The same conclusion holds whenever the lengths of the words in $L$ have GCD 1, even if no two of them are coprime (e.g. lengths 6, 10 and 15).
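As a quick sanity check of the Frobenius claim, here is a minimal Python sketch. The helper names `star_lengths` and `largest_gap` are made up for illustration, and the search is cut off at an arbitrary finite `bound`, so this only checks the claim numerically rather than proving it.

```python
from math import gcd

def star_lengths(lengths, bound):
    """Set of n <= bound such that 0^n is a concatenation of words
    whose lengths are taken from `lengths` (repetition allowed)."""
    lengths = sorted({s for s in lengths if s > 0})
    reachable = [True] + [False] * bound  # length 0 is the empty word
    for n in range(1, bound + 1):
        reachable[n] = any(reachable[n - s] for s in lengths if s <= n)
    return {n for n in range(bound + 1) if reachable[n]}

def largest_gap(a, b, bound):
    """Largest n <= bound that is NOT a non-negative combination of a and b."""
    reach = star_lengths({a, b}, bound)
    return max((n for n in range(bound + 1) if n not in reach), default=None)

a, b = 5, 7
assert gcd(a, b) == 1
print(largest_gap(a, b, 200), (a - 1) * (b - 1) - 1)  # prints "23 23"
```

Every length above 23 is reachable here, so with two words of lengths 5 and 7 in $L$, the language $L^*$ already contains all words of length at least 24.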

What if the lengths of all words in $L$ share a common factor? Well, it's not hard to see that in such cases, $L^*$ is also regular. Let $k$ be the GCD of the word lengths. Instead of all sufficiently long words being in $L^*$, it will instead be true that all sufficiently long words whose lengths are multiples of $k$ are in $L^*$, and no words whose lengths aren't multiples of $k$ are. So $L^*$ differs from $(0^k)^*$ by only finitely many words, and since $(0^k)^*$ is regular for any integer $k$, $L^*$ is also regular.
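To see the common-factor case concretely, the following rough Python sketch lists which multiples of the GCD are missing from the reachable lengths. The names `star_lengths` and `gcd_exceptions` are hypothetical helpers, and the check only runs up to a finite `bound` chosen for illustration.

```python
from functools import reduce
from math import gcd

def star_lengths(lengths, bound):
    """Set of n <= bound expressible as a sum of elements of `lengths`."""
    lengths = sorted({s for s in lengths if s > 0})
    reachable = [True] + [False] * bound  # length 0 is the empty word
    for n in range(1, bound + 1):
        reachable[n] = any(reachable[n - s] for s in lengths if s <= n)
    return {n for n in range(bound + 1) if reachable[n]}

def gcd_exceptions(lengths, bound):
    """Return (d, missing, stray): d is the GCD of the lengths, `missing`
    lists multiples of d up to bound that are not reachable, and `stray`
    lists reachable lengths that are not multiples of d (expected empty)."""
    d = reduce(gcd, lengths)
    reach = star_lengths(lengths, bound)
    missing = [n for n in range(0, bound + 1, d) if n not in reach]
    stray = sorted(n for n in reach if n % d != 0)
    return d, missing, stray

print(gcd_exceptions({4, 6}, 100))  # prints "(2, [2], [])"
```

For lengths 4 and 6, the only even length that cannot be produced is 2, so the reachable words are exactly $(00)^*$ minus the single word $00$.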

This is pretty informal, but everything you need to formalize this should be here.

Patrick87

The basic idea is that in a language built on a one-letter alphabet, every sufficiently long word is a concatenation of shorter words. So when you take a word $w$ in $L^*$, i.e. a concatenation of words in $L$, there is a core $\mathring{L}$ such that $w$ is a concatenation of words in $\mathring{L}$. Thus $L^* = \mathring{L}^*$. It turns out that $\mathring{L}$ is finite, hence it and $L^*$ are regular.

Let $M$ be a subset of $L$ and $w$ a word in $L$. Then $w$ can be expressed as a concatenation of words in $M$ iff $|w|$ can be expressed as a sum of elements of $S \subset \mathbb{N}$, where $S$ is the set of lengths of words in $M$. Thus the problem reduces to expressing an integer as a sum of integers in a particular set (with repetitions allowed): can $|w|$ be expressed as $k_1 s_1 + \ldots + k_m s_m$ with $s_i \in S$ and $k_i \in \mathbb{N}$ for all $i$?

This is a well-known problem in arithmetic, and the answer is that if the coefficients $(k_i)$ can be negative ($k_i \in \mathbb{Z}$), $|w|$ is expressible iff it is a multiple of the greatest common divisor of the elements of $S$: $\gcd S$. With a requirement for non-negative coefficients, this still holds for sufficiently large $|w|$.
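The integer-coefficient statement can be illustrated with the extended Euclidean algorithm. This is only a sketch: `ext_gcd` and `bezout` are illustrative names, and the inputs are assumed to be positive integers.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b), for a, b > 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def bezout(values):
    """Return (g, coeffs) with sum(c * v) == g == gcd(values).
    Coefficients may be negative."""
    g, coeffs = values[0], [1]
    for v in values[1:]:
        g, x, y = ext_gcd(g, v)
        # Rescale the earlier coefficients and append the new one.
        coeffs = [c * x for c in coeffs] + [y]
    return g, coeffs

values = [6, 10, 15]
g, coeffs = bezout(values)
assert sum(c * v for c, v in zip(coeffs, values)) == g == 1
print(g, coeffs)
```

Some of the coefficients come out negative, which is exactly why the non-negative version needs $|w|$ to be large enough.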

Assume $0 \notin S$, since the empty word contributes nothing to $L^*$. Consider the infinite sequence $(g_i)_{i\ge\min S}$ defined by $g_i = \gcd (S \cap [0,i])$. This is a non-increasing sequence of positive integers (starting with $g_{\min S} = \min S$), so it is constant after a certain index $j$, and $g_j = \gcd S$. By Bézout's identity, every element of $S$ can be expressed as $k_1 s_1 + \ldots + k_m s_m$ with $k_i \in \mathbb{Z}$ for all $i$ and $\{s_1, \ldots, s_m\} = S \cap [0,j]$. If $x \in S$ and $x \ge s_1 \cdot \ldots \cdot s_m$, then all the coefficients can be chosen non-negative.
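Here is a small Python sketch of the sequence of running GCDs and the index where it stabilizes. It assumes a finite sample of $S$ (for an infinite $S$ the stabilization index exists but is not computed this way), and the function names are made up for illustration.

```python
from math import gcd

def gcd_prefixes(S):
    """List of (s, g) for s in increasing order, where g is the GCD of
    all elements of S that are <= s. Assumes S is finite and positive."""
    g, out = 0, []
    for s in sorted(S):
        g = gcd(g, s)  # gcd(0, s) == s, so the first entry is (min S, min S)
        out.append((s, g))
    return out

def stabilization_index(S):
    """Smallest element j of S at which the running GCD reaches gcd(S)."""
    prefixes = gcd_prefixes(S)
    final = prefixes[-1][1]
    return next(s for s, g in prefixes if g == final)

print(gcd_prefixes({6, 10, 15}))         # prints "[(6, 6), (10, 2), (15, 1)]"
print(stabilization_index({6, 10, 15}))  # prints "15"
```

The running GCD only changes at elements of $S$, which is why tracking it along the sorted elements is enough.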

Enough arithmetic. Let $N = \max(j, s_1 \cdot \ldots \cdot s_m)$ and $\mathring{L} = \{w \in L \mid |w| \le N\}$. Every word in $L$ can be expressed as a concatenation of words in $L$ whose length is at most $N$, i.e. $L \subseteq \mathring{L}^*$. Since we also have $\mathring{L} \subseteq L$, we have $L^* = \mathring{L}^*$, which is regular since $\mathring{L}$ is finite, hence regular.
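The finite-core idea can be checked numerically on the primes example from the question. This is a sketch under stated assumptions: it works on a finite sample of lengths (primes up to 200), all helper names are hypothetical, and the cutoff used for the core is my own choice of a safe bound, namely the larger of the stabilization index $j$ and the product of the lengths up to $j$.

```python
from functools import reduce
from math import gcd, prod

def star_lengths(lengths, bound):
    """Set of n <= bound expressible as a sum of elements of `lengths`."""
    lengths = sorted({s for s in lengths if s > 0})
    reachable = [True] + [False] * bound  # length 0 is the empty word
    for n in range(1, bound + 1):
        reachable[n] = any(reachable[n - s] for s in lengths if s <= n)
    return {n for n in range(bound + 1) if reachable[n]}

def core_lengths(S):
    """Lengths kept in the finite core, for a non-empty finite sample S."""
    S = sorted(s for s in set(S) if s > 0)
    d = reduce(gcd, S)
    g = 0
    for j in S:  # find the first index where the running GCD hits d
        g = gcd(g, j)
        if g == d:
            break
    small = [s for s in S if s <= j]
    cutoff = max(j, prod(small))  # assumed safe bound, see lead-in
    return {s for s in S if s <= cutoff}

def primes_up_to(n):
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

S = primes_up_to(200)
core = core_lengths(S)
print(sorted(core))                                     # prints "[2, 3, 5]"
print(star_lengths(S, 200) == star_lengths(core, 200))  # prints "True"
```

For the primes, three words already generate everything: $L^*$ is $\{0\}^*$ minus the single word $0$.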


Alternatively, use the characterization of regular languages in single-letter alphabets.

Gilles 'SO- stop being evil'