Essential undecidability of binary string arithmetic

Question

The weak theory of concatenation of binary strings is essentially undecidable.¹ Is Presburger arithmetic with two successors (one for each letter) essentially undecidable? Formally, consider the axioms

1. $\mathsf{S_\sigma x = S_\sigma y \to x = y}$
2. $\mathsf{S_\sigma x \neq 0}$
3. $\mathsf{x + 0 = x}$
4. $\mathsf{x + S_\sigma y = S_\sigma (x + y)}$
5. $\mathsf{P 0 \land \forall x (P x \to P S_a x \land P S_b x) \to \forall x P x}$ for every formula $\mathsf{P}$
6. $\mathsf{S_a x \neq S_b y}$

where $\mathsf{\sigma \in \{a,b\}}$. Is the resulting theory essentially undecidable? What if we replace axiom 5 by

$\mathsf{x = 0 \lor \exists y (x = S_a y \lor x = S_b y)}$

as done in Robinson arithmetic? If neither is true, is there some minor modification of these axioms that yields an essentially undecidable theory?

¹ Andrzej Grzegorczyk, Konrad Zdanowski. Undecidability and Concatenation. April 2007.

user21820 · Accepted Answer · 2021-08-25T15:57:27.670

Let $a = S_a(0)$ and $b = S_b(0)$. We have $S_a(x) = x+a$ and $S_b(x) = x+b$ from (3) and (4), so $+$ is really concatenation and $a,b$ are the distinct symbols by (6).
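As a sanity check of this reading, here is a minimal Python sketch of the intended standard model. It assumes (for illustration only, this is not part of the formal argument) that $0$ is the empty string, $S_a$ and $S_b$ append a letter, and $+$ is concatenation. The helper names `S`, `add`, `strings_up_to` and `check_axioms` are hypothetical. It brute-force checks axioms (1) to (4), (6), the weak form of (5), and the identity $S_\sigma(x) = x + S_\sigma(0)$ on all short strings.

```python
from itertools import product

ZERO = ""  # the constant 0 is read as the empty string

def S(sigma, x):
    """Successor S_sigma: append the letter sigma ('a' or 'b')."""
    return x + sigma

def add(x, y):
    """The symbol + is read as concatenation."""
    return x + y

def strings_up_to(n):
    """All strings over {a, b} of length at most n."""
    return ["".join(p) for k in range(n + 1) for p in product("ab", repeat=k)]

def check_axioms(n=4):
    """Check the axioms on every string of length <= n in the standard model."""
    dom = strings_up_to(n)
    for x in dom:
        assert add(x, ZERO) == x                               # axiom (3)
        # weak form of (5): x is 0 or a successor of something
        assert x == ZERO or any(S(s, x[:-1]) == x for s in "ab")
        for s in "ab":
            assert S(s, x) != ZERO                             # axiom (2)
            assert S(s, x) == add(x, S(s, ZERO))               # S_s(x) = x + s
        for y in dom:
            assert S("a", x) != S("b", y)                      # axiom (6)
            for s in "ab":
                assert S(s, x) != S(s, y) or x == y            # axiom (1)
                assert add(x, S(s, y)) == S(s, add(x, y))      # axiom (4)
    return True
```

A finite check like this says nothing about provability in $T$; it only confirms that finite strings with concatenation satisfy the axioms on small inputs.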

Let (5') be the weaker version of (5). As you probably know, (5') is a trivial consequence of (5).

−−−−−−−

With (5), your theory $T$ directly interprets TC (using the equivalent axiomatization here) because we can easily prove:

$\color{blue}{ ∀x\ ( \ 0+x = x \ ) }$.
Proof: Clearly we have $0+0 = 0$. Now take any $x$ such that $0+x = x$. Then we have $0+(x+a) = (0+x)+a = x+a$ and similarly $0+(x+b) = x+b$. Therefore by (5) we are done.
$\color{blue}{ ∀x,y,z\ ( \ x+(y+z) = (x+y)+z \ ) }$.
Proof: Clearly we have $∀x,y\ ( \ x+(y+0) = (x+y)+0 \ )$. Now take any $x,y,z$ such that $x+(y+z) = (x+y)+z$. Then $x+(y+(z+a)) = x+((y+z)+a)$ $= (x+(y+z))+a = ((x+y)+z)+a$ $= (x+y)+(z+a)$ and similarly $x+(y+(z+b)) = (x+y)+(z+b)$. Therefore by (5) we are done.
$\color{blue}{ ∀x,y\ ( \ a = x+y ∨ b = x+y ⇒ x = 0 ∨ y = 0 \ ) }$.
Proof: Take any $x,y$ such that $a = x+y$. If $y ≠ 0$ then $y = t+a$ or $y = t+b$ for some $t$ by (5'), and so $a = (x+t)+a$ or $a = (x+t)+b$ by (4), but the latter is impossible by (6), so $0 = x+t$ by (1). Thus $t = 0$, since otherwise (5'), (4) and (2) give a contradiction. Thus $x = 0$ by (3). Similarly for any $x,y$ such that $b = x+y$.
$\color{blue}{ ∀x,y,z,w\ ( \ x+y = z+w ⇒ ∃c\ ( \ Q(x,y,z,w,c) \ ) \ ) }$ where $Q(x,y,z,w,c)$ $≡ x+c = z ∧ y = c+w ∨ x = z+c ∧ c+y = w$.
Proof: Clearly $∀x,y,z\ ( \ x+y = z+0 ⇒ Q(x,y,z,0,y) \ )$. Now take any $w$ such that $∀x,y,z\ ( \ x+y = z+w ⇒ ∃c\ ( \ Q(x,y,z,w,c) \ ) \ )$. And take any $x,y,z$ such that $x+y = z+(w+r)$ where $r = a$ or $r = b$. If $y = 0$, then $Q(x,y,z,w+r,w+r)$. If $y ≠ 0$, then $y = t+s$ for some $t,s$ such that $s = a$ or $s = b$ by (5'), and so we have $(x+t)+s = (z+w)+r$ and hence $s = r$ by (6) and $x+t = z+w$ by (1). Let $d$ be such that $Q(x,t,z,w,d)$. Then $Q(x,y,z,w+r,d)$. In either case, we have shown that $∃c\ ( \ Q(x,y,z,w+r,c) \ )$. Therefore by (5) we are done.
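
To make the last property concrete, here is a small Python sketch that searches for the witness $c$ in the standard model of finite strings. It assumes $+$ is string concatenation; the function names `Q`, `find_witness` and `check_splitting` are hypothetical. Any witness must be a prefix of $y$ (first disjunct of $Q$) or a prefix of $w$ (second disjunct), so only those candidates are tried.

```python
from itertools import product

def strings_up_to(n):
    """All strings over {a, b} of length at most n."""
    return ["".join(p) for k in range(n + 1) for p in product("ab", repeat=k)]

def Q(x, y, z, w, c):
    """Q(x,y,z,w,c): (x+c = z and y = c+w) or (x = z+c and c+y = w)."""
    return (x + c == z and y == c + w) or (x == z + c and c + y == w)

def find_witness(x, y, z, w):
    """Return some c with Q(x,y,z,w,c), or None if there is none."""
    candidates = [y[:i] for i in range(len(y) + 1)]
    candidates += [w[:i] for i in range(len(w) + 1)]
    for c in candidates:
        if Q(x, y, z, w, c):
            return c
    return None

def check_splitting(n=3):
    """Whenever x+y = z+w (all of length <= n), a witness c must exist."""
    dom = strings_up_to(n)
    for x, y, z, w in product(dom, repeat=4):
        if x + y == z + w and find_witness(x, y, z, w) is None:
            return False
    return True
```

For example, `find_witness("ab", "ba", "a", "bba")` finds the overlap `"b"`, since `"ab" == "a" + "b"` and `"b" + "ba" == "bba"`.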

Since TC can reason about programs (as defined and sketched here), so can $T$, and hence $T$ is essentially undecidable (if consistent).

−−−−−−−

If (5) is weakened to (5'), it is no longer clear that the resulting theory $T'$ interprets TC. But $T'$ can still reason about programs, for the same essential reason as TC. The substring relation $⊆$ is definable via $x ⊆ y ≡ ∃t,u\ ( \ (t+x)+u = y \ )$, and we shall write "$∀x{⊆}y\ ( \ Q(x) \ )$" to mean "$∀x\ ( \ x⊆y ⇒ Q(x) \ )$" and "$∃x{⊆}y\ ( \ Q(x) \ )$" to mean "$∃x\ ( \ x⊆y ∧ Q(x) \ )$", and we shall call "$∀x{⊆}y$" and "$∃x{⊆}y$" bounded quantifiers. We can use the program execution encoding sketched in the linked thread, by which we can for any given strings $p,x,y$ translate "The program $p$ halts on input $x$ and outputs $y$." to a $Σ_1$-sentence over $T'$, namely of the form "$∃h\ ( \ P(h) \ )$" where $P$ has only bounded quantifiers. (The unbounded quantifier "$∃h$" here is for the finite sequence of program states witnessing the entire history of the execution of $p$ on $x$.) Similarly we can translate "The program $p$ halts on input $x$." and "It is not true that the program $p$ halts on input $x$ and outputs $y$." to $Σ_1$-sentences over $T'$.
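
As an illustration of the definitions just given, the following Python sketch evaluates the substring relation and the two bounded quantifiers in the standard model. It assumes strings over $\{a,b\}$ with $+$ as concatenation; `is_substring`, `substrings`, `forall_sub` and `exists_sub` are hypothetical names. The check in `is_substring` follows the defining formula literally: it looks for $t,u$ with $(t+x)+u = y$, where $t$ must be a prefix and $u$ a suffix of $y$.

```python
def is_substring(x, y):
    """x ⊆ y iff there are t, u with (t + x) + u = y."""
    n = len(y)
    return any(y[:i] + x + y[j:] == y
               for i in range(n + 1) for j in range(i, n + 1))

def substrings(y):
    """All x with x ⊆ y, without duplicates, in sorted order."""
    n = len(y)
    return sorted({y[i:j] for i in range(n + 1) for j in range(i, n + 1)})

def forall_sub(y, pred):
    """Bounded quantifier: for all x ⊆ y, pred(x) holds."""
    return all(pred(x) for x in substrings(y))

def exists_sub(y, pred):
    """Bounded quantifier: there is x ⊆ y with pred(x)."""
    return any(pred(x) for x in substrings(y))
```

Because a string has only finitely many substrings, a sentence whose quantifiers are all bounded by constant terms can be evaluated by finite search, which is what the case analysis below mirrors inside $T'$.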

So all we need to show is that $T'$ can prove every true $Σ_1$-sentence over $T'$, where truth is with respect to the standard model (i.e. the structure of finite strings from the alphabet $\{a,b\}$). Since such a sentence has a constant term witness, it clearly suffices to show that $T'$ can prove every true sentence over $T'$ with only bounded quantifiers. This can be done by converting it to prenex normal form $E$ (while keeping the bounded quantifiers) with matrix in disjunctive normal form, and then inducting on the number of quantifiers.

If $E$ has no quantifiers, all the terms are constants and it is easy to check that $T'$ proves $E$ using (1),(2),(3),(4),(6) for inequalities.

Note that if $E$ has quantifiers, the bound for the first quantifier must be a constant term.

If $E$ is $∃x{⊆}m\ ( \ P(x) \ )$ for some constant term $m$ and predicate $P$ over $T'$, then there is a term witness $k$, so $T'$ proves $k ⊆ m$ (as it is witnessed by just an equality) and also proves $P(k)$ (since it is true and has fewer quantifiers than $E$), and hence proves $E$.

If $E$ is $∀x{⊆}m\ ( \ P(x) \ )$ for some constant term $m$ and predicate $P$ over $T'$, then we need the key fact that $T'$ proves $∀x{⊆}m\ ( \ \bigvee_{k{⊆}ι(m)} x = τ(k) \ )$, where $ι(m)$ is the interpretation of $m$ and $τ(k)$ is the standard term representing string $k$ (i.e. built from $0,S_a,S_b$). Given that fact, $T'$ proves $E$ because $T'$ proves $P(τ(k))$ for every $k⊆ι(m)$ (since $P(τ(k))$ is true and has fewer quantifiers than $E$).

The key fact can be proven via induction over the length of $m$, and this is where we finally need (5'). Note that we can use (3),(4) to prove equality between $m$ and its standard form, so we can assume that $m$ is a standard term. Take any $x ⊆ m$. Let $t,u$ be such that $(t+x)+u = m$. If $m$ is $0$, then we are done since $(t+x)+u = 0$ implies $u = 0$ by (5'),(4),(2), which implies $t+x = 0$ by (3) and hence similarly $x = 0$. So we can assume $m$ is not $0$. Then $m$ must be $S_s(n)$ for some term $n$ and $s∈\{a,b\}$. If $x = 0$, then we are done since $0$ is $τ$ of the empty string, which is a substring of $ι(m)$, so we can also assume $x ≠ 0$. If $u = 0$, then $t+x = n+s$ by (3), so $x = v+s$ for some $v$ by (5'),(4),(6), and hence $t+v = n$ by (4),(1), yielding $v ⊆ n$ by (3). If $u ≠ 0$, then $u = w+s$ for some $w$ by (5'),(4),(6), and hence $(t+x)+w = n$ by (4),(1), yielding $x ⊆ n$. In either case, we can invoke the induction hypothesis since $n$ is shorter than $m$.
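
To illustrate the objects in the key fact, here is a minimal Python sketch of closed terms, the interpretation $ι$, and the standard term $τ(k)$. The tuple encoding of terms and the names `S`, `plus`, `interp`, `tau` and `key_fact_disjuncts` are assumptions made for this example only. Given a closed term $m$, `key_fact_disjuncts` lists the standard terms $τ(k)$ for all $k ⊆ ι(m)$, i.e. the right-hand sides of the disjunction that $T'$ is claimed to prove.

```python
ZERO = ("0",)  # the constant term 0

def S(sigma, t):
    """The term S_sigma(t), with sigma in {'a', 'b'}."""
    return ("S", sigma, t)

def plus(t1, t2):
    """The term t1 + t2."""
    return ("+", t1, t2)

def interp(term):
    """iota: the string denoted by a closed term in the standard model."""
    tag = term[0]
    if tag == "0":
        return ""
    if tag == "S":
        return interp(term[2]) + term[1]
    if tag == "+":
        return interp(term[1]) + interp(term[2])
    raise ValueError("not a term: %r" % (term,))

def tau(k):
    """Standard term for the string k, built from 0, S_a, S_b only."""
    term = ZERO
    for letter in k:
        term = S(letter, term)
    return term

def key_fact_disjuncts(m):
    """Standard terms tau(k) for every substring k of iota(m)."""
    s = interp(m)
    n = len(s)
    subs = sorted({s[i:j] for i in range(n + 1) for j in range(i, n + 1)})
    return [tau(k) for k in subs]
```

For the term $m = S_a(0) + S_b(0)$, which denotes the string `ab`, the disjuncts are the standard terms for the empty string, `a`, `ab` and `b`.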

For reference, any formal system $S$ that can reason about programs and is consistent (as per the definitions in the linked post) cannot be decidable (i.e. there is no program that, given any input $Q$, will output the truth-value of $(S⊢Q)$). Otherwise we could solve the zero-guessing problem as follows: given a program $P$ and input $X$, use that decision program to determine whether $S$ proves $Q := {}$ "The program $P$ halts on input $X$ and outputs $0$.", and if so output $0$, otherwise output $1$. This works because if $P$ halts on $X$ but does not output $0$ then $S ⊢ ¬Q$ and hence $S ⊬ Q$. — user21820, Aug 28 '21 at 05:00
And any formal system $S'$ that interprets such an $S$ can reason about programs, so if $S'$ is consistent (about programs) via that interpretation then it too cannot be decidable. — user21820, Aug 28 '21 at 05:03
