Grammar for $0^a1^b2^c3^d$ with $a+b = c+d$

Question

I'm currently attempting to construct a grammar for the language $L = \{0^a1^b2^c3^d \mid a,b,c,d \in \mathbb{N} \land a+b = c+d\}$.

However, I'm getting stuck on constructing the rules in such a way that the $2$s and $3$s always appear in the correct order.

My current approach is the following rule set, with $E$ as the start symbol:

\begin{align*} E &\rightarrow \epsilon &E &\rightarrow AC\\ F &\rightarrow BC &A &\rightarrow 0\\ A &\rightarrow 0E &A &\rightarrow B\\ B &\rightarrow 1 &B &\rightarrow 1F\\ C &\rightarrow 2 &C &\rightarrow 3 \end{align*}

However, this also incorrectly generates the word $w = 0032$. How can I make sure no $2$ ever follows a $3$?
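For concreteness, one derivation of $0032$ under these rules is shown below. Every $C$ chooses between $2$ and $3$ independently, so nothing ties the choice to the position of the $C$:

$$E \Rightarrow AC \Rightarrow 0EC \Rightarrow 0ACC \Rightarrow 00CC \Rightarrow 003C \Rightarrow 0032$$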

Try to resolve the ambiguity between the two options for C by introducing D. — Pieter21, Nov 17 '16 at 16:07
Yeah, introducing D is obviously the way to go here; however, I don't see a way to ensure the correct order between C and D right now. — ntldr, Nov 17 '16 at 16:09
And can you create a Left rule for $0^a1^b$ and a Right rule? — Pieter21, Nov 17 '16 at 16:18
Ah, I see. Instead of using many symbols to insert between, I just have to find rules for a broader case. Thanks to you and Brian! — ntldr, Nov 17 '16 at 16:24

score 1 · Accepted Answer · answered Nov 17 '16 at 16:14

I’d take a different approach altogether:

$$\begin{align*} E&\to X_{03}\mid X_{02}\mid X_{13}\mid X_{12}\mid\epsilon\\ X_{03}&\to 0X_{03}3\mid 0X_{02}3\mid 0X_{13}3\mid 0X_{12}3\mid\epsilon\\ X_{02}&\to 0X_{02}2\mid 0X_{12}2\mid\epsilon\\ X_{13}&\to 1X_{13}3\mid 1X_{12}3\mid\epsilon\\ X_{12}&\to 1X_{12}2\mid\epsilon \end{align*}$$
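A grammar like this can be checked by brute force for short words. The following is a minimal sketch, not part of the original answer: the names `GRAMMAR`, `generate`, `in_language` and `reference` are made up for illustration, and $\mathbb{N}$ is assumed to include $0$. It expands sentential forms up to a length bound and compares the result with the definition of $L$.

```python
import re
from itertools import product

# Productions of the grammar above; each right-hand side is a list of symbols.
# Dictionary keys are the nonterminals, everything else is a terminal.
GRAMMAR = {
    "E":   [["X03"], ["X02"], ["X13"], ["X12"], []],
    "X03": [["0", "X03", "3"], ["0", "X02", "3"],
            ["0", "X13", "3"], ["0", "X12", "3"], []],
    "X02": [["0", "X02", "2"], ["0", "X12", "2"], []],
    "X13": [["1", "X13", "3"], ["1", "X12", "3"], []],
    "X12": [["1", "X12", "2"], []],
}

def generate(grammar, start, max_len):
    """Return all terminal words of length <= max_len derivable from start.

    Forms with more than max_len terminals are pruned. This terminates here
    because every form holds at most one nonterminal; it is not a general
    CFG enumerator.
    """
    words, seen, todo = set(), set(), [(start,)]
    while todo:
        form = todo.pop()
        if form in seen:
            continue
        seen.add(form)
        # Position of the leftmost nonterminal, if any.
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            words.add("".join(form))
            continue
        for rhs in grammar[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            if sum(s not in grammar for s in new) <= max_len:
                todo.append(new)
    return words

def in_language(word):
    """Direct test of the definition: shape 0*1*2*3* and a + b == c + d."""
    m = re.fullmatch(r"(0*)(1*)(2*)(3*)", word)
    return (m is not None and
            len(m.group(1)) + len(m.group(2)) == len(m.group(3)) + len(m.group(4)))

def reference(max_len):
    """All words over {0,1,2,3} of length <= max_len that lie in L."""
    return {"".join(t)
            for n in range(max_len + 1)
            for t in product("0123", repeat=n)
            if in_language("".join(t))}

if __name__ == "__main__":
    print(generate(GRAMMAR, "E", 6) == reference(6))
    print("0032" in generate(GRAMMAR, "E", 6))
```

Agreement up to a length bound is only evidence, not a proof, but it catches ordering mistakes such as $0032$ quickly.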

dtldarek · Answer 2 · 2016-11-17T17:15:43.287

Context-free languages are exactly the languages accepted by push-down automata, and in this particular case constructing an automaton is easier:

$$ \begin{array}{c} \mathtt{0}:\mathrm{push}&&\mathtt{1}:\mathrm{push}&&\mathtt{2}:\mathrm{pop}&&\mathtt{3}:\mathrm{pop} \\ \curvearrowleft && \curvearrowleft && \curvearrowleft && \curvearrowleft \\ s_0& \xrightarrow{\epsilon} &s_1& \xrightarrow{\epsilon} &s_2& \xrightarrow{\epsilon} &s_3 \end{array} $$

where $s_0$ is the initial state and $s_3$ is the accepting state (with empty stack). We only use $\mathrm{push}$ and $\mathrm{pop}$ because we don't need any additional information.
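The behaviour of this automaton can be sketched in a few lines of Python. This is an illustrative simulation under some assumptions of mine: the function name `accepts` is made up, the stack is reduced to a counter because only one kind of symbol is ever pushed, and the $\epsilon$-moves are resolved deterministically (symbol $k$ can only be read in state $s_k$, and states can only advance).

```python
def accepts(word):
    """Simulate the four-state automaton on word; accept on an empty stack."""
    state = 0   # index i of the current state s_i
    stack = 0   # stack height; a counter suffices for a single stack symbol
    for ch in word:
        if ch not in "0123":
            return False          # symbol outside the alphabet
        target = int(ch)          # symbol k is read by the loop on state s_k
        if target < state:
            return False          # epsilon-moves only lead forward
        state = target            # take the epsilon-moves needed to reach s_k
        if target <= 1:
            stack += 1            # reading 0 or 1 pushes
        else:
            if stack == 0:
                return False      # pop on an empty stack is impossible
            stack -= 1            # reading 2 or 3 pops
    # s_3 is reachable from every state via epsilon-moves,
    # so only the empty-stack condition remains to be checked.
    return stack == 0

if __name__ == "__main__":
    for w in ["", "0123", "0023", "0032", "012"]:
        print(repr(w), accepts(w))
```

The counter mirrors the remark that no additional information is needed on the stack: the automaton only compares the number of pushes with the number of pops.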

To simulate this automaton with a context-free grammar, we have to preserve the symmetry between pushes and pops. In other words, each time we produce a $0$ or $1$, we also need to produce a $2$ or $3$. To keep track of which ones we may produce, we need enough nonterminals to represent all the combinations of states:

nonterminal $A$ represents the pair of states $(s_0,s_3)$,
nonterminal $B$ represents the pair of states $(s_1,s_3)$,
nonterminal $C$ represents the pair of states $(s_0,s_2)$,
nonterminal $D$ represents the pair of states $(s_1,s_2)$.

Observe that the path of the automaton determines possible dependencies in the grammar (with a slight oversimplification we could say that in "push" states we go forward and in "pop" states we go backward):

from $A$ we could go to $B$, because we can go forward from $s_0$ to $s_1$, but not back;
from $A$ we could go to $C$, because we can go backward from $s_3$ to $s_2$, but not the other way around;
from $B$ we cannot go to $C$, because we cannot go from $s_1$ to $s_0$ (but we can go back from $s_3$ to $s_2$, in particular $B \to D$ is ok);
etc.

The completed grammar looks as follows:

\begin{align} S &\to A \\ A &\to \mathtt{0}\ A\ \mathtt{3} \mid B \mid C \mid D \\ B &\to \mathtt{1}\ B\ \mathtt{3} \mid D \\ C &\to \mathtt{0}\ C\ \mathtt{2} \mid D \\ D &\to \mathtt{1}\ D\ \mathtt{2} \mid \epsilon \end{align}
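As a quick sanity check, here is a derivation of $001223$ (a word chosen for illustration, with $a=2$, $b=1$, $c=2$, $d=1$). The outer $0$ is paired with the $3$, the inner $0$ with a $2$, and the $1$ with the remaining $2$:

$$S \Rightarrow A \Rightarrow \mathtt{0}A\mathtt{3} \Rightarrow \mathtt{0}C\mathtt{3} \Rightarrow \mathtt{00}C\mathtt{23} \Rightarrow \mathtt{00}D\mathtt{23} \Rightarrow \mathtt{001}D\mathtt{223} \Rightarrow \mathtt{001223}$$

Each step either wraps the single remaining nonterminal in one push symbol and one pop symbol, or moves on to a nonterminal whose state pair lies further along the automaton's path, so a $2$ can never end up to the right of a $3$.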

I hope this helps $\ddot\smile$
