I'm having some trouble understanding the constraints on the rule of inference Universal Introduction.
From Wikipedia:
The full generalization rule allows for hypotheses to the left of the turnstile, but with restrictions. Assume $\Gamma$ is a set of formulas, $\varphi$ a formula, and $\Gamma \vdash \varphi (y)$ has been derived. The generalization rule states that $\Gamma \vdash \forall x\,\varphi (x)$ can be derived if $y$ is not mentioned in $\Gamma$ and $x$ does not occur in $\varphi$.
(Emphasis mine)
I don't understand why these constraints are correct. I have seen other constraints elsewhere, and those I understand (I think). For example, the universal introduction rule in Dirk van Dalen's Logic and Structure (4th ed.) is:
$${\forall I}\, \frac{\varphi}{\forall x\, \varphi} $$ where the intended restriction is: the variable $x$ may not occur free in any hypothesis on which $\varphi$ depends, i.e. an uncancelled hypothesis in the derivation of $\varphi$.
I understand why this is correct (we learned a similar pair of constraints in class), but going by the constraints described in the Wikipedia article, I don't see what stops me from making the following (obviously incorrect) derivation from the set of premises $\Gamma = \{\exists x\, \varphi(x)\}$:
$$\begin{array}{lll}
1. & \exists x\, \varphi(x) & \text{premise} \\
2. & \varphi(y) & \text{1, existential elimination} \\
3. & \forall x\, \varphi(x) & \text{2, universal introduction}
\end{array}$$
This seems to show that $\exists x\, \varphi(x) \vdash \forall x\, \varphi(x)$, and I don't see how it violates the conditions on Wikipedia: $y$ is not mentioned in $\Gamma$, and $x$ does not occur in $\varphi$.
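To be concrete about why the conclusion has to be wrong, here is a minimal Lean 4 sketch of a counterexample. The domain `Bool` and the reading of $\varphi(x)$ as `x = true` are my own assumptions for illustration and come from neither source. It only checks that the implication fails for this one instance; it does not say which step of the derivation is at fault.

```lean
-- Illustrative instance (my own choice, not from Wikipedia or van Dalen):
-- the domain is Bool, and φ x is read as "x = true".
example : ¬ ((∃ x : Bool, x = true) → (∀ x : Bool, x = true)) := by
  intro h
  -- The premise ∃ x, φ x holds, witnessed by `true`.
  have hall : ∀ x : Bool, x = true := h ⟨true, rfl⟩
  -- Instantiating the universal claim at `false` gives false = true.
  have hbad : false = true := hall false
  -- The two constructors of Bool are distinct, so this is impossible.
  cases hbad
```

So in any sound system the sequent $\exists x\, \varphi(x) \vdash \forall x\, \varphi(x)$ cannot be derivable in general, which means at least one step in the derivation must break a side condition.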
Am I misunderstanding something? Or is the Wikipedia article wrong?