In first-order arithmetic, you can't quantify over sets of numbers. However, you can include set (predicate) variables as free variables. I don't think this is just a meta-linguistic convention: I've read papers about Peano Arithmetic in which first-order sentences contain predicate variables that definitely seem to be used for more than just fill-in-the-blank-with-a-formula. For example, you can prove within Peano Arithmetic that strong induction works, and this paper uses free predicate variables to create a first-order formula stating that a relation is well-ordered.
However, if you really start using free set variables in interesting ways, it seems like you can contradict well-known theorems. For example, I could write the formula $(\forall X,n,i)\big(\Phi(X,n,i)\Leftrightarrow[(i=0\land\phi(X,n))\lor(i>0\land(\exists x)(\forall y)\Phi(\pi(X,x,y),n,i-1))]\big)\Rightarrow\Phi(0,n_0,i_0)$
where
- $\pi$ is an injection from triples to numbers
- $\phi$ is a binary predicate such as "the $n^\text{th}$ quantifier-free formula is true under the variable assignment coded by $X$"
- $\Phi$ is a free ternary predicate variable
- $n_0$ is a free number variable indicating the index of the formula to use
- $i_0$ is a free number variable indicating the number of existential/universal quantifiers
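For concreteness, one possible choice for the injection $\pi$ in the list above is iterated Cantor pairing. This is only a sketch under my own assumptions: the names `cantor_pair` and `encode_triple` are illustrative, and nothing in the formula depends on this particular encoding.

```python
def cantor_pair(a: int, b: int) -> int:
    """Cantor pairing: a bijection from pairs of naturals to naturals."""
    return (a + b) * (a + b + 1) // 2 + b


def encode_triple(a: int, b: int, c: int) -> int:
    """Hypothetical stand-in for pi: pair the first two arguments, then pair with the third."""
    return cantor_pair(cantor_pair(a, b), c)


# Sanity check on a small range: distinct triples receive distinct codes.
codes = {encode_triple(a, b, c) for a in range(6) for b in range(6) for c in range(6)}
all_distinct = len(codes) == 6 ** 3
```

Since this is built from addition, multiplication, and bounded division, an encoding like this is definable by an ordinary arithmetic formula, so $\pi$ itself should not be the problematic part.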
Assuming I set that up right and assuming you can use free predicate variables like that, this is a formula defining the set of code numbers of true arithmetic sentences, written in first-order arithmetic. Sure, there's an implicit universal quantifier over $\Phi$ (ranging over all ternary relations on $\mathbb{N}$) in the formula, but according to questions like this, as long as that quantification isn't explicit, it still counts as a first-order formula. But that all contradicts Tarski's Undefinability Theorem, so it can't be right.
It also relates back to questions like this, about why the definitions of addition and multiplication need to be included in the signature and axioms of Peano arithmetic when you could just start any sentence like $(\forall x,z)([\phi_+(x,0,z)\Leftrightarrow x=z]\land(\forall y,z)[\phi_+(x,Sy,Sz)\Leftrightarrow\phi_+(x,y,z)])\Rightarrow...$
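To make the hypothesis of that implication concrete, here is a small sketch that checks the two clauses on a bounded range of numbers for a candidate ternary relation. The names `satisfies_clauses`, `addition_graph`, and `multiplication_graph`, as well as the bound, are my own illustrative choices, and a finite check is only a sanity test, not a proof.

```python
from typing import Callable

Relation = Callable[[int, int, int], bool]


def satisfies_clauses(phi: Relation, bound: int) -> bool:
    """Check both recursion clauses for phi on all arguments below `bound`."""
    for x in range(bound):
        # Base clause: phi(x, 0, z) <-> x = z
        for z in range(bound):
            if phi(x, 0, z) != (x == z):
                return False
        # Step clause: phi(x, Sy, Sz) <-> phi(x, y, z)
        for y in range(bound - 1):
            for z in range(bound - 1):
                if phi(x, y + 1, z + 1) != phi(x, y, z):
                    return False
    return True


def addition_graph(x: int, y: int, z: int) -> bool:
    """The intended interpretation of phi_+: the graph of addition."""
    return x + y == z


def multiplication_graph(x: int, y: int, z: int) -> bool:
    """A relation that should fail the base clause."""
    return x * y == z
```

Under these assumptions, `satisfies_clauses(addition_graph, 8)` holds, while `satisfies_clauses(multiplication_graph, 8)` fails already at the base clause.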
And if you can do recursion like that, it kind of makes the whole setup of Gödel's $\beta$ function for encoding sequences unnecessary, which I assume it isn't.
I've been learning about proof theory for a while, and for some reason this one basic concept still confuses me. I keep going back and forth between thinking that free set variables add immense power and are the key to everything, and thinking that they're just a fancy notation. Is there something I'm missing here? Is the first paper I referenced using "fresh" predicates incorrectly? Does it have to do with the difference between a formula and a sentence? Any explanation would be great. Thanks.