Is "duplicate" in RPN enough for replacing variable binding in term expressions?

Question

I try to work out some consequences of storing (or "communicating"/"transmitting") a rational number by a term expression using the following operators: $0$, $\mathsf{inc}$, $\mathsf{add}$, $\mathsf{mul}$, $\mathsf{neg}$, and $\mathsf{inv}$. Here $\mathsf{add}$ and $\mathsf{mul}$ are binary operators, $\mathsf{inc}$, $\mathsf{neg}$, and $\mathsf{inv}$ are unary operators, and $0$ is a $0$-ary operator (i.e. a constant). Because I want to be able to also store numbers like $(3^3+3)^3$ efficiently, I need some form of variable binding. I will use the notation $(y:=t(x).f(y))$ to be interpreted as $f(t(x))$ in this question. Now I can store $(3^3+3)^3$ as $$(c3:=(1+1+1).(x:=((c3*c3*c3)+c3).(x*x*x))).$$ If I stick to the operators $0$, $\mathsf{inc}$, $\mathsf{add}$, and $\mathsf{mul}$, this becomes $$(c3:=\mathsf{inc}(\mathsf{inc}(\mathsf{inc}(0))).(x:=\mathsf{add}(\mathsf{mul}(\mathsf{mul}(c3,c3),c3),c3).\mathsf{mul}(\mathsf{mul}(x,x),x))).$$ Using RPN with a "duplicate" operation written $\mathsf{dup}$ instead of variable binding, this becomes $$0\ \mathsf{inc}\ \mathsf{inc}\ \mathsf{inc}\ \mathsf{dup}\ \mathsf{dup}\ \mathsf{dup}\ \mathsf{mul}\ \mathsf{mul}\ \mathsf{add}\ \mathsf{dup}\ \mathsf{dup}\ \mathsf{mul}\ \mathsf{mul}.$$

My question is whether it is always possible to replace variable binding by the "duplicate" operation. The binary operations ($\mathsf{add}$ and $\mathsf{mul}$) are associative and commutative, but it seems to me that even this is not enough for ensuring that variable binding can be completely eliminated. Take for example $$(c2:=(1+1).(x:=(((c2+1)*c2)+1).(y:=(x*x).((y+c2)*y)))).$$ If I stick to the operators $0$, $\mathsf{inc}$, $\mathsf{add}$, and $\mathsf{mul}$, this becomes $$(c2:=\mathsf{inc}(\mathsf{inc}(0)).(x:=\mathsf{inc}(\mathsf{mul}(\mathsf{inc}(c2),c2)).(y:=\mathsf{mul}(x,x).\mathsf{mul}(\mathsf{add}(y,c2),y)))).$$ Using RPN with a "store" operation written $\mathsf{sto}(x)$ instead of variable binding, this becomes $$0\ \mathsf{inc}\ \mathsf{inc}\ \mathsf{sto}(c2)\ c2\ \mathsf{inc}\ c2\ \mathsf{mul}\ \mathsf{inc}\ \mathsf{sto}(x)\ x\ x\ \mathsf{mul}\ \mathsf{sto}(y)\ y\ c2 \ \mathsf{add}\ y\ \mathsf{mul}.$$ After eliminating $\mathsf{sto}(x)$ and $\mathsf{sto}(y)$ by $\mathsf{dup}$, this becomes $$0\ \mathsf{inc}\ \mathsf{inc}\ \mathsf{sto}(c2)\ c2\ \mathsf{inc}\ c2\ \mathsf{mul}\ \mathsf{inc}\ \mathsf{dup}\ \mathsf{mul}\ \mathsf{dup}\ c2 \ \mathsf{add}\ \mathsf{mul}.$$ Using explicit substitution to eliminate $\mathsf{sto}(c2)$, this becomes $$0\ \mathsf{inc}\ \mathsf{inc}\ \mathsf{dup}\ \mathsf{inc}\ \mathsf{mul}\ \mathsf{inc}\ \mathsf{dup}\ \mathsf{mul}\ \mathsf{dup}\ 0\ \mathsf{inc}\ \mathsf{inc}\ \mathsf{add}\ \mathsf{mul}.$$ My issue with explicit substitution is that it might lead to an exponential increase in the size of the expression. It's easy to see that expressions like $(3^3+3)^3$ or $((3^3+3)^3+3)^3$ can't be stored efficiently without something like $\mathsf{sto}(x)$ or $\mathsf{dup}$. Is there another way to eliminate $\mathsf{sto}(x)$, like an additional first-in, first-out queue? Or can one prove that an exponential blowup of the expression won't happen, if only explicit substitution and $\mathsf{dup}$ are "suitably" used together?

Is your question essentially: can I compile a register-based instruction set down to a stack-based instruction set? — D.W., Oct 21 '14 at 23:33
@D.W. The question is more: I guess there are some known results which show why "duplicate" can't be enough for replacing variable binding in my case. If there is a fact that you can't compile a register-based instruction set (with an unbounded number of labeled registers) down to a stack-based instruction set, then a reference this fact could be an answer to this question. — Thomas Klimpel, Oct 22 '14 at 07:06

score 2 · Accepted Answer · answered Nov 02 '14 at 23:13

Is “duplicate” in RPN enough for replacing variable binding in term expressions?

No, it is not strong enough. One can eliminate $\mathsf{dup}$ by replacing each sequence of $\mathsf{dup}$ by a sequence of $x$ preceded by one $\mathsf{sto}(x)$ operation. Since the single variable "$x$" is enough to eliminate all $\mathsf{dup}$ operations, any term expression which requires more than one variable to be stored efficiently can't be expressed (efficiently) using only $\mathsf{dup}$.

If the $\mathsf{inv}$ operation is omitted, then this whole intended representation by a term expression can be seen as a special case of the directed acyclic graph representation of a polynomial used in arithmetic circuit complexity. So the RPN stack from the question turns out to be just a representation of a tree. Now a directed acyclic graph can't be "emulated easily" by a tree, and especially a non-planar DAG will be "too difficult" for the $\mathsf{dup}$ operation.

In order to add the $\mathsf{inv}$ operation again, note that the ring of polynomials in $x_1,\dots x_n$ over $\mathbb Z$ is just the free commutative ring with generators $x_1,\dots x_n$. The corresponding free algebra with an $\mathsf{inv}$ operation would be the free commutative regular ring with generators $x_1,\dots x_n$. To be faithful to the canonical representation of the free algebra of a variety (equationally defined collection of algebras), the $\mathsf{inc}$ operation should be replaced by $1$.

Is "duplicate" in RPN enough for replacing variable binding in term expressions?

1 Answers1

Linked