Since the other users have answered the question in the context of well-founded set theory, let me say a few words about other set theories.
Before we can really answer this question, we must first think about what a ‘set’ is in the first place. Intuitively, a set is something that has members and which is wholly determined by what its members are. This is codified in the axiom of extensionality:
Extensionality. If $X$ and $Y$ are sets, and for all $z$, $z \in X \iff z \in Y$, then $X = Y$.
Notice, however, that the quantifier “for all $z$” is unbounded – that is, there is no restriction on the type of $z$. Let us noncommittally fix a universe of discourse $\mathbf{U}$ and say that $z$ is required to be in $\mathbf{U}$. So, can a set be a member of another set? Well, that depends – are there any sets in $\mathbf{U}$? If not, then obviously a set cannot be a member of any set. This is rather unacceptable for doing modern mathematics, so we must rectify this somehow.
Russell's own solution to his paradox was to introduce the notion of a type. (What I describe here is the unramified type theory TST, not the type theory of Principia Mathematica.) We start with some basic type $\mathbf{U}_0$ – say, the natural numbers. We define sets whose members are of type $\mathbf{U}_0$ – and this is a new type $\mathbf{U}_1$. We repeat this procedure infinitely, forming at each stage the type $\mathbf{U}_{n+1}$ corresponding to sets of things of type $\mathbf{U}_n$. Thus, we get sets whose members are other sets; on the other hand, it is clear that the universal ‘set’ does not exist in this ontology: if $X$ is a set, then it is of type $\mathbf{U}_{n+1}$ for some natural number $n$, and its members must be of type $\mathbf{U}_n$ – so in particular, $X \notin X$. We could even entirely banish the formula “$X \in X$” because there is no possible assignment of types that makes it a well-formed formula!
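For readers who like to see this concretely, here is a minimal Python sketch of the typing discipline. It is only a toy model: the names `TypedSet`, `level_of` and `is_member` are my own inventions for illustration, $\mathbf{U}_0$ is assumed to be modelled by Python integers, and only finite sets can be represented.

```python
from dataclasses import dataclass


def level_of(obj) -> int:
    """Return n such that obj has type U_n in this toy model."""
    if isinstance(obj, TypedSet):
        return obj.level
    if isinstance(obj, int):
        return 0  # assumption: U_0 is modelled by Python integers
    raise TypeError(f"{obj!r} has no type in this model")


@dataclass(frozen=True)
class TypedSet:
    level: int           # n + 1 for a set of things of type U_n
    members: frozenset   # finite sets only, a limitation of the sketch

    def __post_init__(self):
        if self.level < 1:
            raise ValueError("sets live at level 1 or above")
        for m in self.members:
            if level_of(m) != self.level - 1:
                raise TypeError("members must sit exactly one level below")


def is_member(x, s: TypedSet) -> bool:
    """Evaluate 'x in s', rejecting formulas that cannot be typed."""
    if level_of(x) + 1 != s.level:
        raise TypeError("ill-typed membership formula")
    return x in s.members


evens = TypedSet(1, frozenset({0, 2, 4}))   # a set of numbers, type U_1
family = TypedSet(2, frozenset({evens}))    # a set of such sets, type U_2
print(is_member(2, evens))        # True
print(is_member(evens, family))   # True
# is_member(evens, evens) raises TypeError: "X in X" cannot be typed.
```

The point of raising an error rather than returning `False` is that, in TST, $X \in X$ is not a false statement but a malformed one.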
Unfortunately, we have had to introduce infinitely many types of sets, and it seems rather complicated to keep track of all these types in practice. Modern set theory resolves this by taking $\mathbf{U}_0$ to be the empty type and collapsing all the higher types into a single type $\mathbf{U}$. Thus, everything in the universe of discourse is a set (but that does not mean all sets are in $\mathbf{U}$!), and it makes sense to ask whether $x \in y$ for any $x$ and $y$ in $\mathbf{U}$. In particular, the once-banished formula $x \in x$ is well-formed again – so again we have to find some other solution to Russell's paradox.
Digging a little bit deeper, we discover that one of the assumptions of the paradox is the naïve axiom of comprehension: that is, whenever $\varphi (x)$ is a well-formed formula, then there exists a set $\{ x : \varphi (x) \}$ in $\mathbf{U}$ whose members are precisely those $x$ in $\mathbf{U}$ for which $\varphi (x)$ is satisfied. As such, we must be more careful about the sets we assume are in $\mathbf{U}$. This is where the set–class distinction comes from: in the usual parlance, ‘set’ refers to sets that are in $\mathbf{U}$, and ‘class’ refers to sets whose members are in $\mathbf{U}$ but which are not necessarily in $\mathbf{U}$ themselves. To avoid confusion, I will say $\mathbf{U}$-set for the former.
So what should we assume instead of the naïve axiom of comprehension? Quine's New Foundations (NF) offers one option:
Stratified comprehension. Let us say a well-formed formula $\varphi (x)$ is stratified if there is a way to assign types to all the variables appearing in $\varphi (x)$ so that whenever $y \in z$ appears in $\varphi (x)$, $y$ is of type $\mathbf{U}_n$ and $z$ is of type $\mathbf{U}_{n+1}$, and whenever $y = z$ appears in $\varphi (x)$, both $y$ and $z$ are of type $\mathbf{U}_n$. Then, whenever $\varphi (x)$ is a stratified formula, the class $\{ x : \varphi(x) \}$ is a $\mathbf{U}$-set.
Roughly speaking, any set that exists under TST also exists under NF. In particular, the class $\{ x : x = x \}$ is a $\mathbf{U}$-set under NF – so NF admits a universal set. On the other hand, stratified comprehension does not produce the paradoxical class $\{ x : x \notin x \}$, because $x \notin x$ is not a stratified formula. Now, the relative consistency of NF is not well understood, but the related theory NFU (obtained by allowing $\mathbf{U}_0$ to be non-empty) is known to be consistent relative to ZF set theory. Thus, if we believe ZF is consistent, then we should also believe that there is a consistent set theory in which the universal set exists – in particular, the universal set does not produce a contradiction on its own.
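Whether a formula is stratified can be checked mechanically, since only its atomic subformulas matter. Here is a rough Python sketch under some assumptions of my own: a formula is represented just by its list of atoms, written as hypothetical triples `("in", y, z)` for $y \in z$ and `("eq", y, z)` for $y = z$, and bound variables are assumed to have been renamed apart so that each name denotes a single variable. The function name `stratify` is likewise made up for this example.

```python
from collections import defaultdict, deque


def stratify(atoms):
    """Return a dict variable -> type index if the atoms can be
    stratified, or None if no consistent assignment exists."""
    # y in z forces type(z) = type(y) + 1; y = z forces equal types.
    graph = defaultdict(list)
    for rel, y, z in atoms:
        step = {"in": 1, "eq": 0}[rel]
        graph[y].append((z, step))
        graph[z].append((y, -step))

    level = {}
    for start in list(graph):
        if start in level:
            continue
        # Breadth-first search over one connected component,
        # propagating the forced type differences.
        level[start] = 0
        component = [start]
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w, step in graph[v]:
                want = level[v] + step
                if w not in level:
                    level[w] = want
                    component.append(w)
                    queue.append(w)
                elif level[w] != want:
                    return None  # two constraints disagree
        # Shift the component so its lowest type index is 0.
        shift = min(level[v] for v in component)
        for v in component:
            level[v] -= shift
    return level


print(stratify([("eq", "x", "x")]))   # {'x': 0}
print(stratify([("in", "x", "x")]))   # None
```

Negation and the other connectives play no role here, so $x \notin x$ fails for the same reason as $x \in x$: the single atom would require the type of $x$ to be one more than itself.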
Having mentioned it, I suppose I should also say how comprehension is handled in ZF. We have the following axiom:
Separation. For any $\mathbf{U}$-set $X$, if $\varphi (x)$ is any well-formed formula, the class $\{ x \in X : \varphi (x) \}$ is a $\mathbf{U}$-set.
Obviously, in the presence of a universal set, the axiom of separation is equivalent to the naïve axiom of comprehension, so we had better do something about that.
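To spell out the problem, suppose for the sake of argument that some $\mathbf{U}$-set $U$ had every $\mathbf{U}$-set as a member (the symbol $U$ is introduced only for this sketch). Separation applied to $U$, with $\varphi (x)$ taken to be $x \notin x$, would make

$$R = \{ x \in U : x \notin x \}$$

a $\mathbf{U}$-set, and hence a member of $U$. Unwinding the definition then gives

$$R \in R \iff (R \in U \text{ and } R \notin R) \iff R \notin R$$

which is Russell's paradox all over again.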
Regularity. Any non-empty $\mathbf{U}$-set $X$ has a member $Y$ such that no member of $Y$ is a member of $X$. (Equivalently, $X \cap Y = \emptyset$.)
In particular, there is no universal set. It is tempting to say that the membership relation $\in$ is well-founded on $\mathbf{U}$, but there is a subtlety here: only non-empty $\mathbf{U}$-sets, not arbitrary classes, are guaranteed to have an $\in$-minimal member. There are still other problems to fix, however – so far, there are no axioms that guarantee our universe $\mathbf{U}$ is non-empty! But that is a story for another day.
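Returning briefly to the claim that regularity rules out a universal set, here is the argument in slightly more detail. It assumes the pairing axiom, which I have not stated above, so that the singleton $\{ X \}$ is a $\mathbf{U}$-set whenever $X$ is. Suppose $X \in X$. The only member of $\{ X \}$ is $X$ itself, so regularity applied to $\{ X \}$ demands

$$X \cap \{ X \} = \emptyset$$

but $X \in X$ and $X \in \{ X \}$ together give $X \in X \cap \{ X \}$, a contradiction. So no $\mathbf{U}$-set is a member of itself, and a universal set, which would have to be a member of itself, cannot be a $\mathbf{U}$-set.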
Finally, we should discuss formal class–set theories such as von Neumann–Bernays–Gödel (NBG) or Morse–Kelley (MK). In these theories, the universe of discourse $\mathbf{U}$ consists of ‘classes’, and a ‘set’ is defined to be a class that is a member of some class. To avoid confusion, let us say $\mathbf{V}$-class for the former and $\mathbf{V}$-set for the latter. A proper $\mathbf{V}$-class is a $\mathbf{V}$-class that is not also a $\mathbf{V}$-set.
There are two versions of the class comprehension axiom governing the formation of $\mathbf{V}$-classes:
Bounded class comprehension. If $\varphi (x)$ is a well-formed formula that does not have any bound variables ranging over $\mathbf{V}$-classes, then the class $\{ x : x \text{ is a } \mathbf{V} \text{-set and } \varphi (x) \}$ is a $\mathbf{V}$-class.
Full class comprehension. If $\varphi (x)$ is any well-formed formula, then the class $\{ x : x \text{ is a } \mathbf{V} \text{-set and } \varphi (x) \}$ is a $\mathbf{V}$-class.
NBG uses the bounded class comprehension axiom, while MK uses the full class comprehension axiom. Either way, we are guaranteed the existence of the $\mathbf{V}$-class
$$\mathbf{V} = \{ x : x \text{ is a } \mathbf{V} \text{-set and } x = x \}$$
which contains all $\mathbf{V}$-sets. But is $\mathbf{V}$ itself a $\mathbf{V}$-set? To answer that we need an axiom telling us which $\mathbf{V}$-classes are $\mathbf{V}$-sets.
Limitation of size. Let us say that a bijection is a $\mathbf{U}$-bijection if its graph exists in $\mathbf{U}$, i.e. if it can be defined by a $\mathbf{V}$-class function. A $\mathbf{V}$-class $X$ is a $\mathbf{V}$-set if and only if there does not exist a $\mathbf{U}$-bijection between $X$ and $\mathbf{V}$.
In particular, $\mathbf{V}$ must be a proper $\mathbf{V}$-class, because the identity map is a $\mathbf{U}$-bijection between $\mathbf{V}$ and itself. Note that by definition a proper $\mathbf{V}$-class is not a member of any $\mathbf{V}$-class, so it cannot contain itself. Unfortunately, this doesn't answer the question of whether a $\mathbf{V}$-set can contain itself. In NBG and MK, this question is settled by the regularity axiom applied to classes:
Class regularity. Any non-empty $\mathbf{V}$-class $X$ has a member $Y$ such that no member of $Y$ is a member of $X$.
Thus, no $\mathbf{V}$-set can contain itself – at least in NBG or MK.