Does Reverse Polish Notation have an LL grammar?

Question

Let L be the language of all arithmetic expressions written in Reverse Polish Notation, containing only binary operators. $\Sigma(L) = \{n, o\}$, n := number, o := operator.

Is there an LL grammar G so that L(G) = L?

Reverse Polish Notation is a notation, e.g. infix, prefix or postfix. A Grammar on the other hand is a set of rules of composition. You can create a grammar that makes use of postfix notation, but you are mixing apples and oranges with this question. Maybe you should ask "How is RPN related to grammars? and you should get a more detailed answer with examples. — Guy Coder, Oct 30 '13 at 12:43
@guy-coder I agree with you that the question should be more precise, but its meaning is pretty clear: Is there an LL grammar which generates the arithmetic expressions in reverse Polish notation. See this related question. — J.-E. Pin, Oct 30 '13 at 13:09

score 1 · Accepted Answer · answered Oct 30 '13 at 18:06

Yes,

$ G = (\{E,S,R\}, \{n,o\}, P, E)\text{, with productions}\\ E \rightarrow n\ |\ SR \\ S \rightarrow nEo \\ R \rightarrow EoR\ |\ \epsilon \\ $

Proof:

Observation $(*)$: Each word of L represents an arithmetic expression, the result of calculating such an expression is essentially again a number. So whenever we have $w \in L$, we can treat $w$ as a number.
Theorems:
1. $L(G) \subset L$
2. $L(S) \subset L$
3. $w \in L(R) \implies w=\epsilon \lor (w=w_1 o, w_1 \in L)$
Inductive proof of above theorems on derivation step count $n$:
1. Basis $n=1$
  1. $E \Rightarrow n\\ \land \\ n \in L $
  2. $S\text{ can't generate any words in one step. }\\ \land \\ \emptyset \subset L $
  3. $R \Rightarrow \epsilon, w=\epsilon$
2. Basis $n=2$
  
  In all 3 cases nothing can be generated in 2 steps, and the theorems hold trivially.
3. Inductive step: $n=k>1,$ induction hypothesis (IH): assume theorems hold for $n<k$
  1. $E \Rightarrow SR \Rightarrow^{k-1} w_1 w_2\, \\ \text{where }w_1 \in L\text{ because of IH of theorem 2,} \\ \text{and } w_2 = \epsilon\text{ or }w_2 = w_3 o, w_3 \in L\text{ because of IH of theorem 3.}$
    1. First case: $w_2 = \epsilon$, then $w_1 w_2 = w_1 \in L$.
    2. Second case: $w_1 w_2 = w_1 w_3 o =^{(*)} nno$, which is a valid RPN expression, thus $w_1 w_3 o \in L$
  2. $S \Rightarrow nEo \Rightarrow^{k-1} nwo$, where $w \in L$ because of IH of theorem 1. $nwo =^{(*)} nno$, which is a valid RPN expression, and thus $n w o \in L$
  3. $R \Rightarrow EoR \Rightarrow^{k-2} w_1 o R$, where $w_1 \in L$ because of IH of theorem 1. The last step can only be $w_1 o R \Rightarrow w_1 o \epsilon$, which satisfies theorem 3.
$\square$
Theorem: $L \subset L(G)$

Proof: induction on length of $w \in L$, show that $E \Rightarrow^* w$

(I got lazy from this point on, I might fill this in later)

$\square$
$(1) \land (2) \implies L(G) = L$
Theorem: G is LL(k) grammar, $k \in \mathbb{N}$

Proof: Pick a k large enough, then show that for each non-terminal you can decide which production rule to use by looking at the k next terminals.

$\square$

$\square$

Does Reverse Polish Notation have an LL grammar?

1 Answers1