Proving Regularity of Languages that are 1/k of an already known regular language

Question

There is an exercise in Kozen stating that if a language is regular, then its "first half" is also regular. I also found material on the internet that extends this idea, saying that the language formed by taking two-thirds of each word of a known regular language is regular. I'm tempted to think that this should hold for any general $k>0$: the $m/k$th ($1 \leqslant m \leqslant k-1$) portion of a regular language would also be regular. I need a mathematical proof (or a constructive proof with mathematical verification) of the above statement.
EDIT: For the "first half" language, which David formalised much better in the comments, I tried an argument similar to the answer in the link given by Hendrik. The product automaton idea was intuitively clear, but I got stuck on the transitions for the second state in each pair of states of the product automaton. I could not see how to obtain the exact state corresponding to a word $w$ that is the "first half" of a word in the original regular language.

What do you mean by "the first half" of a language $L$? The set of strings $\{w\mid ww'\in L \text{ for some } w' \text{ with } |w'|=|w|\}$? Also, what is a proof that is not mathematical? — David Richerby, Sep 08 '14 at 23:12
The "first half" automaton-construction for regular languages was given in a question at this site. The method used there is easily expandable to 1/k. Let us know when you have specific questions. — Hendrik Jan, Sep 09 '14 at 01:08
yes, sir I just looked through the site and found those questions... I'll come back to this question in case I need more mathematical rigor to answer that question... @David.. Yes, sir I meant that, I'm sorry for not explaining things clearly. Please pardon my naiveté. — Ramit, Sep 09 '14 at 03:57
@Hendrik ...I read up the answer provided in the link given. I am a bit confused with the last sentence where they do away with the assumption. Is it saying that say if Q={q_1,q_2..q_n} then my new automaton would have states like (q_1,q_2) n times? Each time q_2 transiting to one of q_i (1<=i<=n) — Ramit, Sep 09 '14 at 04:31
What have you tried and where did you get stuck? Also, please edit the question to include a mathematical definition of your operator -- how can you expect a proof without a definition? — Raphael, Sep 09 '14 at 10:16
@Raphael... For the "first half Language" which is much better formalised by David in the above comments, I tried a similar argument as the answer in the link given by Hendrik. The product automaton notion was intuitively clear. But I was stuck with the transition listings for the 2nd state in the pair of states so formed by constructing the product automaton. I was flummoxed as to how I could be able to get the exact state for the corresponding word 'w' which would be the "first half" of a word accepted by the original regular language. — Ramit, Sep 09 '14 at 13:19
Please incorporate these things into the question (note the link "edit" beneath it). — Raphael, Sep 09 '14 at 13:21

score 4 · Accepted Answer · answered Sep 09 '14 at 09:35

Let $L$ be a regular language and let $(p, q) \in \mathbb{N}^2$. Then the following language is regular: $$ L_{p,q} = \{ u \in A^* \mid \text{there exist $x$ and $y$ in $A^*$ such that $|x| = p|u|$, $|y| = q|u|$ and $xuy \in L$} \} $$ Furthermore, for any subset $S$ of $\mathbb{N}^2$, the language $$ L_S = \bigcup_{(p,q) \in S} L_{p,q} $$ is also regular. I would like to stress that this works for any subset $S$, including non recursively enumerable subsets of $\mathbb{N}^2$, which might look a little bit suspicious at first glance...
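
For orientation, here is how the question's examples fit this notation, assuming "first half" is read as in David Richerby's comment above. The first half of $L$ is the case $(p,q) = (0,1)$: $$ L_{0,1} = \{ u \in A^* \mid uy \in L \text{ for some } y \in A^* \text{ with } |y| = |u| \}. $$ In the same way, $L_{0,2}$ would be the "first third" (here $|uy| = 3|u|$) and $L_{1,1}$ the "middle third" (here $|x| = |u| = |y|$).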

You can try to prove these results by using automata, but it is much easier to use the fact that a language is regular iff it is recognized by a finite monoid.

Let $L$ be a regular language of $A^*$. It is recognized by a finite monoid $M$, that is, there is a surjective monoid morphism $f:A^* \to M$ and a subset $P$ of $M$ such that $f^{-1}(P) = L$. Now $\mathcal{P}(M)$, the powerset of $M$, is also a finite monoid under the multiplication defined, for $X, Y \in \mathcal{P}(M)$, by $XY = \{ xy \mid x \in X, y \in Y\}$.

Now let $h: A^* \to \mathcal{P}(M) \times M$ be the monoid morphism defined, for each letter $a \in A$, by $h(a) = (f(A), f(a))$. Then for each word $u$, $h(u) = (f(A^{|u|}), f(u))$. I claim that $L_{p,q} = h^{-1}(Q)$ where $$ Q = \bigl\{(R,m) \in \mathcal{P}(M) \times M \mid R^pmR^q \cap P \not= \emptyset \bigr\}, $$ and similarly $L_S = h^{-1}(T)$ where $$ T = \Bigl\{(R,m) \in \mathcal{P}(M) \times M \mid \Bigl(\bigcup_{(p,q) \in S}R^pmR^q\Bigr) \cap P \not= \emptyset \Bigr\}. $$ Thus $L_{p,q}$ and $L_S$ are recognized by the finite monoid $\mathcal{P}(M) \times M$ and hence are regular.

Proof of the claim. \begin{align*} h^{-1}(Q) &= \{u \in A^* \mid (f(A^{|u|}), f(u)) \in Q \} \\ &= \{u \in A^* \mid f(A^{p|u|})f(u)f(A^{q|u|}) \cap P \not= \emptyset \} \\ &= \{u \in A^* \mid f(A^{p|u|}uA^{q|u|}) \cap P \not= \emptyset \} \\ &= \{u \in A^* \mid A^{p|u|}uA^{q|u|} \cap f^{-1}(P) \not= \emptyset \} \\ &= \{u \in A^* \mid A^{p|u|}uA^{q|u|} \cap L \not= \emptyset \} \\ &= L_{p,q} \end{align*}
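
The claim can also be checked numerically on a small example. The following is only a sketch under stated assumptions: it takes $M$ to be the transition monoid of a DFA (one possible choice of recognizing monoid, not one fixed by the answer), represents each element as a tuple mapping states to states, and uses a hypothetical three-state DFA for $(ab)^*$. All function and variable names are illustrative.

```python
from itertools import product

# Hypothetical example DFA for (ab)* over {a, b}; state 2 is a dead state.
ALPHABET = "ab"
DELTA = [{"a": 1, "b": 2}, {"a": 2, "b": 0}, {"a": 2, "b": 2}]
START, ACCEPTING = 0, {0}

def compose(g, h):
    """Apply g first, then h, matching left-to-right reading of a word."""
    return tuple(h[s] for s in g)

def set_product(X, Y):
    """Product in the powerset monoid: XY = {xy | x in X, y in Y}."""
    return {compose(x, y) for x in X for y in Y}

def set_power(X, n, identity):
    """X^n in the powerset monoid, with X^0 = {identity}."""
    result = {identity}
    for _ in range(n):
        result = set_product(result, X)
    return result

def in_L_pq(u, p, q, delta, alphabet, start, accepting):
    """Decide u in L_{p,q} via h(u) = (f(A^{|u|}), f(u)) and the set Q."""
    n = len(delta)
    identity = tuple(range(n))
    letter_map = {a: tuple(delta[s][a] for s in range(n)) for a in alphabet}
    R = set_power(set(letter_map.values()), len(u), identity)  # f(A^{|u|})
    m = identity
    for a in u:
        m = compose(m, letter_map[a])  # f(u)
    candidates = set_product(
        set_product(set_power(R, p, identity), {m}),
        set_power(R, q, identity),
    )  # R^p m R^q
    # P is the set of transformations sending the start state to an accepting one.
    return any(t[start] in accepting for t in candidates)

def brute_force(u, p, q, delta, alphabet, start, accepting):
    """Decide u in L_{p,q} directly from the definition, by enumeration."""
    for x in product(alphabet, repeat=p * len(u)):
        for y in product(alphabet, repeat=q * len(u)):
            state = start
            for a in "".join(x) + u + "".join(y):
                state = delta[state][a]
            if state in accepting:
                return True
    return False
```

For instance, `in_L_pq("a", 0, 1, DELTA, ALPHABET, START, ACCEPTING)` asks whether `a` is the first half of some word in $(ab)^*$, and `brute_force` with the same arguments gives an independent answer to compare against.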

Could you suggest to me a book which I can go through to develop the Mathematical structures for proving such things related to Automata. I do not have a background on Abstract Algebra. I wish to be able to give proofs like the one you have given here. Thank you, for answering my question. Though truly speaking I didn't get much of what you wrote here :P I wish to work on this and get better :) — Ramit, Sep 09 '14 at 16:42

Yuval Filmus · Answer 2 · 2014-09-10T19:10:40.683

Suppose we are given a regular language $L$ and integers $p, q \geq 0$, and are interested in the language $M$ of all $x$ such that $axb\in L$ for some words $a, b$ with $|a|=p|x|$ and $|b|=q|x|$.

Our starting point is an automaton for $L$. The automaton for $M$ starts by guessing two states $s, t$ of the $L$ automaton. These are supposed to be the states just before and just after reading $x$. The new automaton keeps track of three states, corresponding to the $a, x, b$ parts of the purported word. They start at $s_0, s, t$, where $s_0$ is the initial state of the $L$ automaton. When reading a character, we update the $x$ state in the expected way. We also update the $a$ and $b$ states by guessing $p$ and $q$ characters, respectively. We are in an accepting state if the tracked states are $s, t, f$ for some accepting state $f$ of the $L$ automaton.

Each choice of $p, q$ controls in which possible ways we can advance the $a, b$ states. There are only a finite number of possibilities here (basically since the $L$ automaton is finite), and so we can handle arbitrarily many allowed choices for the pair $p,q$, as shown in the other answer.
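
The guessing described above can be simulated by tracking every guess at once. The sketch below is an illustration under assumptions, not a definitive implementation: the $L$ automaton is taken to be a complete DFA given as a list of transition dictionaries, the example DFA for $(ab)^*$ is hypothetical, and all names are made up for this example. A configuration is a tuple `(s, t, a_state, x_state, b_state)`, where `s, t` are the guessed states and the last three entries are the tracked states.

```python
# Hypothetical example DFA for (ab)* over {a, b}; state 2 is a dead state.
ALPHABET = "ab"
DELTA = [{"a": 1, "b": 2}, {"a": 2, "b": 0}, {"a": 2, "b": 2}]
START, ACCEPTING = 0, {0}

def reach(state, n, delta, alphabet):
    """States reachable from `state` by reading exactly n guessed characters."""
    current = {state}
    for _ in range(n):
        current = {delta[s][a] for s in current for a in alphabet}
    return current

def nfa_accepts(x, p, q, delta, alphabet, start, accepting):
    """Simulate the nondeterministic automaton for M on input x."""
    n = len(delta)
    # Guess s and t; the tracked states start at (s_0, s, t).
    configs = {(s, t, start, s, t) for s in range(n) for t in range(n)}
    for c in x:
        next_configs = set()
        for (s, t, a_state, x_state, b_state) in configs:
            new_x = delta[x_state][c]  # the x part reads the actual character
            # The a part guesses p characters, the b part guesses q characters.
            for new_a in reach(a_state, p, delta, alphabet):
                for new_b in reach(b_state, q, delta, alphabet):
                    next_configs.add((s, t, new_a, new_x, new_b))
        configs = next_configs
    # Accept if the tracked states are (s, t, f) with f accepting.
    return any(
        a_state == s and x_state == t and b_state in accepting
        for (s, t, a_state, x_state, b_state) in configs
    )
```

Carrying the whole set of configurations is just the subset construction applied on the fly, so "guessing" $s, t$ amounts to trying every pair and accepting if at least one pair works out. With $p = 0$ and $q = 1$ this decides membership in the "first half" language from the question.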

Thanks a lot Sir, for this answer. Perhaps my biggest problem here is the inability to completely grasp the notion of non-determinism used to guess states s,t of L. Understanding the part after the choices are made is complete. I perhaps need to get more acquainted to the utilities of guessing. — Ramit, Sep 09 '14 at 15:45
