
I remember coming across the following question about a language that supposedly is context-free, but I was unable to find a proof of the fact. Have I perhaps misremembered the question?

Anyway, here's the question:

Show that the language $L = \{xy \mid |x| = |y|, x\neq y\}$ is context-free.

Raphael
Dave Clarke

1 Answer


Claim: $L$ is context-free.

Proof Idea: There has to be at least one difference between the first and second half; we give a grammar that makes sure to generate one and leaves the rest arbitrary.

Proof: For the sake of simplicity, assume a binary alphabet $\Sigma = \{a,b\}$. The proof readily extends to larger alphabets. Consider the grammar $G$:

$\qquad\begin{align} S &\to AB \mid BA \\ A &\to a \mid aAa \mid aAb \mid bAa \mid bAb \\ B &\to b \mid aBa \mid aBb \mid bBa \mid bBb \end{align}$

It is quite clear that it generates

$\qquad \mathcal{L}(G) = \{ \underbrace{w_1}_k x \underbrace{w_2v_1}_{k+l}y\underbrace{v_2}_l \mid |w_1|=|w_2|=k, |v_1|=|v_2|=l, x\neq y \} \subseteq \Sigma^*;$

here $x$ and $y$ are single letters from $\Sigma$. The suspicious may perform a nested induction over $k$ and $l$ with a case distinction over the pairs $(x,y)$.
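For readers who prefer an empirical sanity check to the induction, here is a minimal Python sketch that compares the grammar against the definition of $L$ on all short binary words. The function names (`derives_center`, `in_grammar`, `in_L`, `first_mismatch`) are made up for this illustration and are not part of the original answer. A brute-force comparison up to a fixed length is evidence, not a proof.

```python
from functools import lru_cache
from itertools import product

SIGMA = "ab"  # binary alphabet, as assumed in the proof


@lru_cache(maxsize=None)
def derives_center(w: str, center: str) -> bool:
    """True iff A (center='a') or B (center='b') derives w.

    Encodes the rules X -> center | cXd for all c, d in SIGMA.
    """
    if w == center:
        return True
    # Any rule X -> cXd strips one arbitrary letter from each end.
    return len(w) >= 3 and derives_center(w[1:-1], center)


def in_grammar(w: str) -> bool:
    """True iff S derives w via S -> AB | BA (try every split point)."""
    return any(
        (derives_center(w[:i], "a") and derives_center(w[i:], "b"))
        or (derives_center(w[:i], "b") and derives_center(w[i:], "a"))
        for i in range(1, len(w))
    )


def in_L(w: str) -> bool:
    """Membership in L straight from the definition: even length, halves differ."""
    half, rem = divmod(len(w), 2)
    return rem == 0 and w[:half] != w[half:]


def first_mismatch(max_len: int):
    """Return the first word up to max_len on which the two tests disagree, else None."""
    for n in range(max_len + 1):
        for letters in product(SIGMA, repeat=n):
            w = "".join(letters)
            if in_grammar(w) != in_L(w):
                return w
    return None
```

Calling `first_mismatch(10)` scans every binary word of length at most 10. A result of `None` means the grammar and the definition agree on all of them.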

The length of a word in $\mathcal{L}(G)$ is $2(k+l+1)$. The letters $x$ and $y$ occur at positions $k+1$ and $2k+l+2$, respectively. When we split the word in half, i.e., after $k+l+1$ letters, the first half contains the letter $x$ at position $k+1$, and the second half contains the letter $y$ at position $(2k+l+2)-(k+l+1) = k+1$.

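Therefore, $x$ and $y$ occupy the same position in their respective halves, so the two halves differ and $\mathcal{L}(G) \subseteq L$. Conversely, every word in $L$ has two halves that differ at some common position $k+1$, and $G$ imposes no other restrictions on its language, so $L \subseteq \mathcal{L}(G)$ and hence $\mathcal{L}(G) = L$.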
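The converse direction, that every word of $L$ has the shape $w_1 x w_2 v_1 y v_2$ described above, can be made concrete. The following Python sketch (the function name `decompose` is hypothetical) finds the first position where the two halves differ and reads off $k$ and $l$ from it. It assumes nothing beyond the position arithmetic in the proof.

```python
def decompose(w: str):
    """Split w into (w1, x, w2, v1, y, v2) with |w1|=|w2|=k, |v1|=|v2|=l, x != y.

    Returns None if w is not in L (odd length, or both halves equal).
    """
    n, rem = divmod(len(w), 2)
    if rem:
        return None
    for i in range(n):  # 0-based position inside each half
        if w[i] != w[n + i]:
            k, l = i, n - i - 1
            w1 = w[:k]
            x = w[k]
            w2 = w[k + 1 : 2 * k + 1]
            v1 = w[2 * k + 1 : 2 * k + 1 + l]
            y = w[2 * k + 1 + l]  # index n + i, i.e. position k+1 of the second half
            v2 = w[2 * k + 2 + l :]
            return w1, x, w2, v1, y, v2
    return None
```

For example, `decompose("abba")` yields `("", "a", "", "b", "b", "a")`, which corresponds to the derivation $S \to AB$ with $A \to a$ and $B \to bBa \to bba$.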


The interested reader may enjoy two follow-up problems:

Exercise 1: Come up with a PDA for $L$!

Exercise 2: What about $\{xyz \mid |x|=|y|=|z|, x\neq y \lor y \neq z \lor x \neq z\}$?

NerdOnTour
Raphael
  • If we use this grammar, we can generate a string like this: $S \rightarrow AB$, $A \rightarrow a$, $B \rightarrow bBa$, then $B \rightarrow b$. After that, we get $abba$ from $S$! This is not in the original language $L$; is there a mistake here? – George.Zhao Mar 15 '19 at 07:28
  • @George.Zhao I don't follow. Clearly, $abba \in L$ with $x = ab$ and $y=ba$? – Raphael Mar 15 '19 at 09:14
  • $S \to BA$, $B \to aBb \to abb$, $A \to a$, so $S \Rightarrow^* abba$. – gnasher729 Jun 09 '22 at 18:41