
Motivation

Consider the following languages. Are they context-free?

  • $\{x \# y: x \neq y\}$
  • $\{x y: |x|=|y|, x \neq y\}$
  • $\{x \# y: |x|=|y|, x \neq y\}$
  • $\{x y: |x|=|y|,d(x,y)>1\}$
  • $\{x x\}$

The first three are explained here, the fourth is this question, and the last is well known.

I'm wondering whether there is an algorithm to solve this kind of problem in general.

Question

Given two strings $x,y$, let $\operatorname{zip}(x,y)$ denote the string $(x_1,y_1)(x_2,y_2)\dots(x_n,y_n)$, where $n=\max(|x|,|y|)$. Note that the letters of $\operatorname{zip}(x,y)$ are pairs. If one string is shorter, we pad it with an extra blank symbol. For example, $\operatorname{zip}(aab,cd)=(a,c)(a,d)(b,\text{blank})$.
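To make the definition concrete, here is a minimal Python sketch of $\operatorname{zip}$ with padding. The name `zip_pad` and the use of `None` as the blank symbol are assumptions made for illustration only; any symbol outside the alphabet would do.

```python
from itertools import zip_longest

BLANK = None  # assumed stand-in for the blank padding symbol


def zip_pad(x, y, blank=BLANK):
    """Return zip(x, y) as a list of pairs, padding the shorter string with blank."""
    return list(zip_longest(x, y, fillvalue=blank))


# The example from the definition: zip(aab, cd) = (a,c)(a,d)(b,blank)
print(zip_pad("aab", "cd"))  # [('a', 'c'), ('a', 'd'), ('b', None)]
```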

Is the following problem decidable?

Given a regular language $L$, is $ \{x y: \operatorname{zip}(x,y) \in L\}$ context-free?

If the answer is positive, we can solve the five problems from the motivation section by picking a suitable $L$:

  • $\{x x\}$ can be written using $L=((a_1,a_1)+\dots+(a_n,a_n))^{\ast}$ where $a_i$ are all letters of the alphabet except blanks.
  • $\{x y: |x|=|y|, x \neq y\}$ can be written using an $L$ that checks that there are no blanks and that at least one pair $(a,b)$ is a mismatch, $a \neq b$.
  • $\{x y: |x|=|y|,d(x,y)>1\}$ is similar: $L$ checks that there are at least two mismatches and no blanks.
  • $\{x \# y: |x|=|y|, x \neq y\}$ checks for at least one mismatch and then expects a single symbol $(\#, \text{blank})$.
  • $\{x \# y: x \neq y\}$ checks that the left component of the last character is $\#$ and that there is at least one mismatch; unlike in the previous examples, blanks are allowed when checking for a mismatch.
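The encodings above can be tried out with a brute-force membership test: a word $w$ belongs to $\{x y: \operatorname{zip}(x,y) \in L\}$ if some split $w = xy$ has its zip accepted by $L$. The sketch below is only an illustration under assumptions: `in_language` and the three predicates are hypothetical names, each predicate is a plain Python function standing in for a regular language over pairs, and `None` plays the role of the blank. It tests membership of individual words; it does not decide context-freeness.

```python
from itertools import zip_longest

BLANK = None  # assumed stand-in for the blank padding symbol


def zip_pad(x, y):
    """zip(x, y) as a list of pairs, with the shorter string padded by BLANK."""
    return list(zip_longest(x, y, fillvalue=BLANK))


def in_language(w, accepts):
    """True if w = xy for some split such that accepts(zip(x, y)) holds."""
    return any(accepts(zip_pad(w[:i], w[i:])) for i in range(len(w) + 1))


def no_blanks(pairs):
    return all(a is not BLANK and b is not BLANK for a, b in pairs)


def copy_language(pairs):
    # L = ((a_1,a_1) + ... + (a_n,a_n))^*: every pair matches, no blanks
    return no_blanks(pairs) and all(a == b for a, b in pairs)


def one_mismatch(pairs):
    # no blanks and at least one pair (a, b) with a != b
    return no_blanks(pairs) and any(a != b for a, b in pairs)


def two_mismatches(pairs):
    # no blanks and at least two mismatching pairs
    return no_blanks(pairs) and sum(a != b for a, b in pairs) >= 2


print(in_language("abab", copy_language))   # True:  x = y = "ab"
print(in_language("abba", one_mismatch))    # True:  "ab" vs "ba"
print(in_language("abaa", two_mismatches))  # False: "ab" vs "aa" differ once
```

Each predicate only needs a bounded amount of memory while scanning the pairs left to right, which is why it corresponds to a regular $L$.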
sdcvvc
  • "$\operatorname{zip}(aab,cd)=(a,c)(a,d)(b,\text{blank})$". The parentheses "(" and ")" are symbols of some languages, such as the language of well-balanced parentheses or some languages of arithmetic expressions. Would it be better to simply define "$\operatorname{zip}(aab,cd)=acadb\sqcup$", where $\sqcup$ is the blank symbol? That seems more consistent with the examples as well. – John L. Aug 01 '19 at 12:01
  • It is notable that whether case 4, $\{x y: |x|=|y|,d(x,y)>1\}$, is context-free has remained undetermined for the past 6 years, although some people have bet that it is not context-free. – John L. Aug 01 '19 at 17:11
  • No, as I wrote in the question, the letters of $\operatorname{zip}(x,y)$ are pairs. The parentheses are only used to denote those pairs; they are not symbols. The alternative definition, which interleaves the letters, could be used. I preferred the version with pairs since the automaton will pair them anyway, and it feels nicer when $x$ and $y$ are strings that do not come from the same alphabet. – sdcvvc Aug 01 '19 at 21:36
  • The definition $\text{zip}(x,y)=(x_1,y_1)(x_2,y_2)\cdots(x_n,y_n)$ does look nicer, as it is understood that $x=x_1x_2\cdots x_n$, $y=y_1y_2\cdots y_n$ and that all "(", ",", ")" are helping delimiters that are not part of the string. – John L. Aug 01 '19 at 23:01
