Well, be prepared to have your mind blown. Not all "collections" can be called "sets" in a rigorous form of set theory, or else we run into paradoxes such as Russell's Paradox: the "set" $R$ of all sets that are not elements of themselves would satisfy $R \in R$ if and only if $R \notin R$. This unfortunately makes so-called "naive set theory" (where every "collection" is a set) an inconsistent theory.
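Spelled out, the argument behind Russell's Paradox is short (a sketch only; the name $R$ is just my label for the problematic collection). In naive set theory we would be allowed to form

$$R = \{ x \colon x \notin x \}.$$

Asking whether $R$ is an element of itself then gives

$$R \in R \iff R \notin R,$$

because $R \in R$ means $R$ satisfies the defining condition $x \notin x$, while $R \notin R$ means $R$ satisfies the condition and so must belong to $R$. Either case contradicts itself, so $R$ cannot be a set.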
In set theory, this is typically resolved using the Zermelo-Fraenkel axioms with Choice (ZFC) or the von Neumann-Bernays-Gödel (NBG) axioms. Basically, NBG is a "conservative extension" of ZFC: the two prove exactly the same statements about sets, so for our purposes both treatments are equivalent. You can start reading about the ZFC axioms here and NBG axioms here (and the other answers provide even better references), but truly understanding such a treatment involves having a pretty good handle on mathematical logic, which isn't necessarily required to understand small, large, and locally small categories.
Saunders Mac Lane's Categories for the Working Mathematician (generally seen as the "standard" text of category theory) resolves the issue in a fairly simple, informal way in Ch. I Sec. 6 (pg. 21-24) that is built on top of the ZFC axioms, and I reference it in my definition below. Essentially, we define the so-called "set of all sets" $U$ to be a collection with closure properties chosen so that neither $U$ nor any other potentially problematic "collection" is included as a member.
Definition. We define the collection $U$ as follows:
(i.) $x \in u \in U$ implies $x \in U$;
(ii.) $u, v \in U$ implies $\{u, v\}, \langle u, v \rangle, u \times v, u-v \in U$;
(iii.) $x \in U$ implies $\mathcal{P}(x), \bigcup x \in U$;
(iv.) $\mathbb{N} \in U$;
(v.) if $f \colon a \to b$ is a surjective function with $a \in U$ and $b \subset U$, then $b \in U$.
A collection $a$ is a set if and only if $a \in U$ (note that these closure properties of $U$ are redundant, but sufficient as a criterion for determining which collections are sets).
A category $\mathcal{C}$ is small if $Obj(\mathcal{C}) \in U$ (and the collection of all its morphisms is in $U$ as well), and large otherwise. Moreover, $\mathcal{C}$ is locally small if for each $A, B \in Obj(\mathcal{C})$, we have $Hom(A, B) \in U$.
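As a worked example (a sketch, assuming $\mathbf{Set}$ is taken to have $Obj(\mathbf{Set}) = U$ and functions as morphisms, and writing $R$ for an auxiliary collection of my own), here is why $\mathbf{Set}$ is large but locally small. Suppose $U \in U$ and put

$$R = \{ x \in U \colon x \notin x \} \subset U.$$

By (iii) we would have $\mathcal{P}(U) \in U$, and since $R \in \mathcal{P}(U)$, property (i) gives $R \in U$. But then

$$R \in R \iff (R \in U \text{ and } R \notin R) \iff R \notin R,$$

a contradiction. Hence $U \notin U$, so $\mathbf{Set}$ is large. On the other hand, for $A, B \in U$ every function $A \to B$ is a subset of $A \times B$, so

$$Hom(A, B) \subset \mathcal{P}(A \times B), \qquad \text{hence} \qquad Hom(A, B) \in \mathcal{P}(\mathcal{P}(A \times B)) \in U$$

by (ii) and (iii), and then $Hom(A, B) \in U$ by (i). So $\mathbf{Set}$ is locally small.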
With the way this is set up, note that we can just talk about "small" and "large" categories in terms of naive set theory and get on with our lives. But I understand that this state of affairs leads to a bit of a predicament.
In particular, where you might be unsatisfied is the fact that I used "collections" and haven't exactly defined what those are. That leads to your second question, since a "class" is really what I've been calling a "collection" this whole time (category theory texts often call the members of $U$ "small sets" to keep the two apart). I don't think you're ever going to be 100% happy (or even 60% happy) with any definition you read of "classes" until you brush up on enough formal logic that you're comfortable looking at logic symbols. What separates "axiomatic set theory" from "naive set theory" is precisely the careful use of symbolic logic vs. treating any old collection of the form $\{ x \colon \phi(x) \}$ as a set. There just isn't really a sufficient shortcut. But I'll do my best in terms of "informal logic" in the definition below.
Definition. A "class" is a logical statement $\Phi(x)$ with a variable $x$ (treated as any "set") in the "language of set theory". A given set $z$ is contained in a class if and only if the statement $\Phi(z)$ is true.
Such a class $\Phi$ defines a set if there is some $z \in U$ such that $\forall x(x \in z \leftrightarrow \Phi(x)).$
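Two quick examples may help here (both are my own illustrations, with $\Phi_1$ and $\Phi_2$ as made-up names). The class

$$\Phi_1(x) \colon \quad x \neq x$$

defines a set, namely $\emptyset$, because $\forall x (x \in \emptyset \leftrightarrow x \neq x)$ holds (both sides are always false), and $\emptyset \in U$ since $\emptyset \in \mathbb{N} \in U$ by (iv) and (i). The class

$$\Phi_2(x) \colon \quad x \notin x$$

defines no set: if some $z$ satisfied $\forall x (x \in z \leftrightarrow x \notin x)$, then taking $x = z$ would give $z \in z \leftrightarrow z \notin z$. So $\Phi_2$ is a perfectly good statement about sets, but it is a "proper class".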
Personally, I believe thinking of classes as simply logical statements that are separate and isolated from "sets" (as you'll find standard set theory texts often do more rigorously than I)--except when classes serve as legitimate "definitions" for particular sets--resolves a lot of annoying tension in your brain that distracts you from settling "more important" matters in mathematics.
The big thing to get is that it doesn't matter whether the collections described by such statements are regarded (in the "language" you're using) as sets or not. These "proper classes" can still be talked about as simply statements about sets. For example, the universal set $U$ can be denoted by the statement "$x=x$", which is true for any set $x$. Another important example is the class $ON$ of ordinal numbers, where (assuming the axiom of foundation) a set $x$ is an ordinal if and only if it is transitive and all of its elements are transitive: $\forall y (y \in x \rightarrow y \subset x) \land \forall y(y \in x \rightarrow \forall z(z \in y \rightarrow z \subset y))$. The class $Card$ of all cardinal numbers can be established similarly, but defining the statement for it takes quite a bit more work.
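If it helps to see a class as "just a test you can run on a set", here is a minimal Python sketch for the ordinals, restricted to hereditarily finite sets modelled as nested `frozenset`s. It uses one standard characterization (valid under the axiom of foundation, which nested `frozenset`s satisfy automatically): an ordinal is a transitive set whose elements are all transitive. The function names `is_transitive`, `is_ordinal`, and `successor` are my own and purely illustrative.

```python
def is_transitive(s):
    # A set is transitive when every element is also a subset of it.
    return all(y <= s for y in s)


def is_ordinal(s):
    # Transitive set of transitive sets. This assumes the input is
    # well-founded, which nested frozensets always are.
    return is_transitive(s) and all(is_transitive(y) for y in s)


def successor(s):
    # Von Neumann successor: s together with s itself as a new element.
    return s | frozenset([s])


zero = frozenset()
one = successor(zero)     # {0}
two = successor(one)      # {0, 1}
three = successor(two)    # {0, 1, 2}

# {0, 1, {1}} is transitive, but its element {1} is not, so this
# set is not an ordinal.
not_ordinal = frozenset([zero, one, frozenset([one])])
```

The point is that `is_ordinal` plays the role of the statement $\Phi(x)$: it can be asked of any set you hand it, without there needing to be a single set that collects every ordinal.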
You can of course extend your "language of sets" to a "language of classes" (where the logical statements refer to classes instead of sets), and that way talking about classes is formalized "less awkwardly". This is the language that the NBG axioms are written in, and NBG conservatively extends ZFC, which is written in the language of sets. But you run into the same problem: there are statements about classes (such as, again, $x=x$, which is true for all classes) that can't be considered to define a "class", and we have the same issue all over again. That's why it's better to just appreciate the fact that everything you talk about in math can be written (in some weird way) in the language of sets and just move on with life.
And to answer your final (and, in terms of category theory, arguably most important) question: we don't lose anything "important" by assuming all categories are "locally small", since the goal of category theory is to generalize mappings. To do this, we need to be responsible in our distinction between "small" vs. "large" categories in category theory terminology (analogous to "sets" vs. proper "classes" in set theory terminology), so that we have $\mathbf{Set}$, $\mathbf{Grp}$, $\mathbf{Top}$, etc., with all the nice pretty functors between them, but without the paradox-related problems. However, you'll find that not only $\mathbf{Set}$, $\mathbf{Grp}$, and $\mathbf{Top}$, but also the other large categories we care about (such as $\mathbf{Rng}$, $\mathbf{K}$-$\mathbf{Vec}$, $\mathbf{Grph}$, $\mathbf{pTop}$, and even $\mathbf{Cat}$!) are all "locally small". So we lose nothing in practical mathematics by conveniently assuming that all categories are "locally small".
Sorry that it's a bit of a long post, and I hope this clears at least some of the confusion.