Here's a direct computability-theoretic argument. Suppose $T$ is an "appropriate" theory. Let $A,B$ be two disjoint c.e. sets, and let $A_T=\{x: T\vdash x\in A\}$ and $B_T=\{x:T\vdash x\not\in A\}$. Each of $A_T$ and $B_T$ is c.e. by definition; since $T$ is complete (and consistent) we have $A_T\sqcup B_T=\mathbb{N}$, and hence they're each computable; and since $T$ is "appropriate" we have $A\subseteq A_T$ and $B\subseteq B_T$. So $A_T$ is a computable set which contains $A$ and is disjoint from $B$. But this gives us a contradiction: just take $A,B$ to be a pair of computably inseparable c.e. sets.
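To see concretely why completeness makes $A_T$ computable, here is a toy Python sketch of the search: enumerate the theorems of $T$ and wait until either "$x\in A$" or "$x\notin A$" shows up. The names `toy_theorems` and `decide_A_T` are hypothetical, and the enumeration is a made-up stand-in (it "proves" $x\in A$ exactly when $x$ is divisible by 3), not a real proof search.

```python
from itertools import count
from typing import Callable, Iterator, Tuple

# ("in", x) stands for the sentence "x ∈ A"; ("notin", x) for "x ∉ A".
Statement = Tuple[str, int]


def toy_theorems() -> Iterator[Statement]:
    """Hypothetical stand-in for a c.e. enumeration of the theorems of T.

    Assumption for illustration only: T decides "x ∈ A" for every x,
    proving it exactly when x is divisible by 3.
    """
    for n in count():
        yield ("in", n) if n % 3 == 0 else ("notin", n)


def decide_A_T(x: int, theorems: Callable[[], Iterator[Statement]]) -> bool:
    """Decide whether x ∈ A_T by searching the theorem enumeration.

    The loop halts for every x only because T is assumed complete:
    one of "x ∈ A" or "x ∉ A" must eventually be enumerated.
    """
    for kind, y in theorems():
        if y == x:
            return kind == "in"
    raise AssertionError("unreachable when the enumeration is infinite and T is complete")
```

For example, `decide_A_T(6, toy_theorems)` returns `True` and `decide_A_T(7, toy_theorems)` returns `False`. Without completeness the same loop could run forever on some $x$, which is exactly the gap between c.e. and computable.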
Of course, how do we know that computably inseparable c.e. sets exist? Well, this in turn is a straightforward trick with a universal Turing machine: we set $$A=\{e:\varphi_e(e)\downarrow =0\}\quad\mbox{and}\quad B=\{e:\varphi_e(e)\downarrow=1\}.$$ By definition, $A$ and $B$ are disjoint c.e. sets (and note that if $\varphi_e(e)\uparrow$ then $e$ is not in $A$ or $B$).
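As a rough illustration of why $A$ and $B$ are c.e., here is a toy Python model in which "programs" are generator functions: each `yield` counts as one computation step and `return v` means halting with output `v`. The finite list `PROGRAMS` and the helpers `run_bounded` and `stage` are assumptions made for this sketch; a real enumeration $\varphi_0,\varphi_1,\dots$ is infinite.

```python
from typing import Callable, Generator, List, Optional, Set, Tuple

Program = Callable[[int], Generator[None, None, int]]


def slow_zero(x: int) -> Generator[None, None, int]:
    for _ in range(x):  # takes x steps, then halts with output 0
        yield
    return 0


def always_one(x: int) -> Generator[None, None, int]:
    yield  # takes one step, then halts with output 1
    return 1


def loop_forever(x: int) -> Generator[None, None, int]:
    while True:  # never halts: models a divergent computation
        yield


# Hypothetical finite stand-in for the enumeration: index e names PROGRAMS[e].
PROGRAMS: List[Program] = [slow_zero, always_one, loop_forever]


def run_bounded(prog: Program, x: int, steps: int) -> Optional[int]:
    """Advance prog(x) at most `steps` times; return its output, or None if it has not halted yet."""
    gen = prog(x)
    for _ in range(steps):
        try:
            next(gen)
        except StopIteration as halt:
            return halt.value
    return None


def stage(programs: List[Program], s: int) -> Tuple[Set[int], Set[int]]:
    """Stage-s approximations: indices e whose diagonal run phi_e(e) halts within s steps with output 0 (A) or 1 (B)."""
    a_s, b_s = set(), set()
    for e, prog in enumerate(programs):
        out = run_bounded(prog, e, s)
        if out == 0:
            a_s.add(e)
        elif out == 1:
            b_s.add(e)
    return a_s, b_s
```

Here `stage(PROGRAMS, 1)` gives `({0}, set())` and `stage(PROGRAMS, 5)` gives `({0}, {1})`; index 2 never enters either set at any stage, matching the remark that divergent $e$ lie in neither $A$ nor $B$.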
Now suppose $C$ were a computable separator for $A$ and $B$; that is, $C$ is computable, $A\subseteq C$ and $B\cap C=\emptyset$. Let $\varphi_c$ be the total computable characteristic function of $C$, so that $\varphi_c$ takes only the values $0$ and $1$ and $$C=\{x:\varphi_c(x)\downarrow=1\}.$$ We now have two cases:
If $c\in C$, then $\varphi_c(c)=1$. But then we have $c\in B$, contradicting the assumption that $C\cap B=\emptyset$.
If $c\not\in C$, then $\varphi_c(c)=0$. But then we have $c\in A$, contradicting the assumption that $A\subseteq C$.
Note that crucially "$\varphi_c(c)\uparrow$" is not an option, since $\varphi_c$ is total.
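The case analysis can be replayed mechanically. The following Python sketch assumes a toy setting in which the enumeration is a finite list of total functions and `c` is the index of the candidate characteristic function; `refute_separator` is a hypothetical name, and the code only mirrors the two cases above rather than proving anything about real Turing machines.

```python
from typing import Callable, List

TotalFunction = Callable[[int], int]


def refute_separator(programs: List[TotalFunction], c: int) -> str:
    """Report which separator condition fails at the index c itself.

    programs[c] plays the role of phi_c, assumed total, so the call
    below always returns (the "phi_c(c) diverges" case cannot occur).
    """
    value = programs[c](c)  # this is phi_c(c)
    if value == 1:
        # c ∈ C by definition of C, yet phi_c(c) = 1 also puts c in B.
        return "c in C and c in B: violates B ∩ C = ∅"
    if value == 0:
        # c ∉ C by definition of C, yet phi_c(c) = 0 also puts c in A.
        return "c not in C and c in A: violates A ⊆ C"
    return "phi_c is not 0/1-valued, so it is not a characteristic function"


# Hypothetical candidates: whichever index we pick, the diagonal value betrays it.
CANDIDATES: List[TotalFunction] = [
    lambda x: 0,      # characteristic function of the empty set
    lambda x: 1,      # characteristic function of all of N
    lambda x: x % 2,  # characteristic function of the odd numbers
]
```

For instance, `refute_separator(CANDIDATES, 1)` lands in the first case and `refute_separator(CANDIDATES, 2)` in the second, since $2 \bmod 2 = 0$. Note that the function is simply handed its own index as input; nothing in it needs to know that index in advance.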
There's no self-reference at all there: the closest we come is in the verification step of our construction of computably inseparable c.e. sets, but that's not actually self-reference (feeding an existing object its own "name" is different from building an object which somehow "knows its own name ahead of time"). This is exactly the diagonalization versus self-reference point.