So there are a few posts about this already, but they skip over the problem I have.
The proof of Cantor's theorem elegantly shows that for any $f:A\rightarrow \mathcal{P}(A)$, the set $B=\{x\in A : x\not\in f(x) \}$ cannot, by its definition, lie in the image of $f$. So $f$ is never surjective, let alone a bijection, and hence a power set has strictly greater cardinality than the set it is built from.
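To spell out the step I'm relying on, here is the standard argument as I understand it, where $a$ is a hypothetical element of $A$ with $f(a)=B$ (the name $a$ is mine):

$$a\in B \iff a\not\in f(a) \iff a\not\in B,$$

where the first equivalence is the definition of $B$ and the second uses $f(a)=B$. This is a contradiction, so no such $a$ exists and $B$ is not in the image of $f$.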
But now, I thought one (obvious) way to characterise the "set of all sets" would be $S=\mathcal{P}(S)$, since such an $S$ would contain all possible sets (perhaps this is where the problem lies, but I don't immediately see why this is a bad definition for the set of all sets). Now take $f$ to be the identity function $i:S\rightarrow S$, viewed as a map $S\rightarrow\mathcal{P}(S)$. Then $B=\{x\in S : x\not\in x \}$, the set of all sets which do not contain themselves. But this set doesn't exist: it is exactly the set from Russell's paradox. If $B$ isn't a set, and so is not in $S$, then the proof no longer offers an obstruction to $i$ being surjective.
Every other post I've seen on the matter uses Cantor's theorem to show that such a set doesn't exist. My question is what else must be going wrong, since I think the proof has issues for this particular set. I'm expecting other issues to arise with the definition $S=\mathcal{P}(S)$...