I apologize for the long post, but I'm currently a student finishing up his first semester in group theory. My introduction was pretty definition-heavy so I've found I can internalize concepts (such as quotient groups, normal subgroups, etc.) myself by forming my own way of motivating and teaching them intuitively. I'd like to know if my current presentation/understanding is correct.

I think after learning about subgroups and Lagrange's theorem, a natural question is then if we can break down a group $G$ to better understand its parts and hopefully the whole $G$ (as is tradition in any analytical endeavor). But if we want to pull back any useful understanding of $G$ from this smaller group, it ought to preserve some structure of $G$. That structure is exactly how the operation acts on elements of $G$ (since groups are just elements with an operation relating them).

So for the sake of exploration, we pretend to have the magic function $\phi : G \rightarrow H$ that does exactly this for us–maps $(G, *_G)$ to some smaller part $(H, *_H)$, then ask what we can say about $\phi$. Our original goal was for $\phi$ to preserve the operation, i.e. for all $a, b \in G$ that $\phi(a *_G b) = \phi(a) *_H \phi(b)$.

The next thing I would observe is that since $H$ has smaller order, $\phi$ necessarily maps multiple elements, say $a, b$, in $G$ to the same element in $H$. In this sense, $a$ and $b$ are "equivalent" under $\phi$. Given that we have "willed" $\phi$ to be operation-preserving, we can see that a natural way this arises is by letting $ak = b$ for some $k \in G$:

$$ak = b \implies \phi(b) = \phi(a *_G k) = \phi(a) *_H \phi(k).$$

If we want $\phi(b) = \phi(a) *_H \phi(k) = \phi(a)$ then $\phi(k) = e_H$. I think this leads naturally to the definition of the kernel: it's the set of elements that map to the identity, and it makes $a$ equivalent to $b$ mod $\ker \phi$. And in fact, as I've learned from this answer, we naturally get equivalence classes of elements that partition the group into cosets, analogous to modular arithmetic. So $\phi$ takes elements and puts them neatly into these equivalence classes (abstracting away some of the details in $G$ that look "the same" in $H$, leading, at least for me, directly to the First Isomorphism Theorem). Then it makes sense to propose the map $\phi : g \mapsto g \ker \phi$.

The next question becomes what the operation of this $H$ looks like. We've established that elements of $H$ are cosets (and equivalence classes), so for two elements in cosets $g_1 \ker\phi$ and $g_2 \ker\phi$, once combined by $*_H$ we'd want for the result to be in $g_1g_2 \ker\phi$ (modular arithmetic analogy works here as well). Set-wise, we might write

$$g_1\ker\phi \cdot g_2\ker\phi = g_1g_2\ker\phi.$$

But does this come for free? For an element $g_1k_1g_2k_2 \in g_1\ker\phi \cdot g_2\ker\phi$ to look like $g_1g_2k$ for some $k \in \ker\phi$, it must be that $k_1g_2 = g_2k_3$ for some $k_3 \in \ker\phi$. Set-wise this can be written as $g\ker\phi = \ker\phi\, g$, i.e. left cosets = right cosets. It turns out that $\ker\phi$ indeed satisfies this condition, so we are safe to proceed.
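To make the "left cosets = right cosets" check concrete, here is a small sketch for one specific case: the sign homomorphism on $S_3$. The choice of example and the helper names `compose` and `sign` are my own assumptions for illustration, not part of the argument above.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q, then p' (permutations stored as tuples)."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    """Sign of a permutation: +1 if even, -1 if odd, counted via inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

G = list(permutations(range(3)))          # the six elements of S_3
kernel = [p for p in G if sign(p) == 1]   # everything sent to the identity +1

# Left coset g*ker and right coset ker*g for every g in G.
left_cosets = {g: frozenset(compose(g, k) for k in kernel) for g in G}
right_cosets = {g: frozenset(compose(k, g) for k in kernel) for g in G}

cosets_agree = all(left_cosets[g] == right_cosets[g] for g in G)
num_cosets = len(set(left_cosets.values()))
print(cosets_agree, num_cosets)
```

In this example the kernel has three elements and the group splits into two cosets, matching the two elements of the target $\{+1, -1\}$.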

So, in the end, we have designed $H$, a broken-down version of $G$. And how did we do it? By "dividing out" or "quotienting out" the information that looks the same under $\phi$ in $H$, namely $\ker \phi$. Thus we write $H$ as $G/\ker\phi$, aptly called a quotient group.

Although, you could flip this presentation, and instead of viewing from the kernel perspective, suppose $K$ is some arbitrary subgroup of $G$. Then it must satisfy the condition of left cosets = right cosets (which we name normality because it is a nontrivial property that gives us a usable quotient) for $G/K$ to be a group, as $\ker\phi$ already does, and by satisfying normality it automatically becomes the kernel of some homomorphism (namely the natural map $g \mapsto gK$, which I've presented).

My questions are:

  • Is this presentation correct (on an intuitive level, I know there are lots of places for concrete proofs)? It feels right to me, but I also feel like I may have gotten definitions vs. implications mixed up.
  • If so, does any textbook follow this approach that I can dig into?
  • I think homomorphisms also can fit into this framework, given I suggest $\phi$ pretty early on, but how would non-surjective homomorphisms be explained?
Andrew Li

2 Answers

It seems like you have understood the story pretty well.

To address your specific questions:

  1. Yes, seems good. I agree that you could do some work polishing this to make everything fully rigorous, including making some decisions about what you want to be a definition and what you want to be a theorem.
  2. I think this is pretty close to how most textbooks present the quotient groups and the first isomorphism theorem.
  3. Non-surjective homomorphisms are surjective onto their image. The first isomorphism theorem says that if $\phi: A \to B$ is a group homomorphism, then $\textrm{Im}(\phi) \cong A/\ker(\phi)$. So $\phi$ is picking out a subgroup of $B$ which is isomorphic to the quotient group $A/\ker(\phi)$.
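As a quick sanity check of point 3, here is a sketch with a homomorphism that is neither injective nor surjective. The map $x \mapsto 3x$ on $\mathbb{Z}_{12}$ is an example I picked for illustration; it does not come from the question.

```python
n = 12
G = range(n)  # Z_12 under addition mod 12

def phi(x):
    """Hypothetical example map Z_12 -> Z_12, x |-> 3x mod 12."""
    return (3 * x) % n

# phi respects the operation.
is_hom = all(phi((a + b) % n) == (phi(a) + phi(b)) % n for a in G for b in G)

image = sorted({phi(x) for x in G})       # a proper subgroup of Z_12
kernel = [x for x in G if phi(x) == 0]

# Cosets of the kernel are the elements of the quotient G/ker(phi).
cosets = {frozenset((x + k) % n for k in kernel) for x in G}

# The induced map sends a coset to the common value of phi on it.
induced = {c: {phi(x) for x in c} for c in cosets}
well_defined = all(len(values) == 1 for values in induced.values())
hit = {next(iter(values)) for values in induced.values()}
bijective = well_defined and hit == set(image) and len(cosets) == len(image)

print(is_hom, image, kernel, bijective)
```

The image is a four-element subgroup of the target, and the four cosets of the kernel match up with it one to one, which is what $\textrm{Im}(\phi) \cong A/\ker(\phi)$ predicts in this case.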
  • I guess I was unlucky. The textbook I had spent quite a bit of time on cosets and normal subgroups before introducing homomorphisms and quotient groups. I do have one more question on #3: is there any case where nonsurjective homomorphisms are interesting if we can always restrict our attention to the image? For example, do we ever consider homomorphisms from S3 to Z over S3 to Z2? – Andrew Li Dec 03 '20 at 01:21
  • @AndrewLi: The only homomorphism from $S_3$ to $\mathbb{Z}$ is the trivial one. On the other hand, we do consider lots of nonsurjective homomorphisms. For instance, we often consider the homomorphism from a group to its automorphism group given by conjugation, and this is noninjective if the group has nontrivial center, and nonsurjective if the group has outer automorphisms, which is often the case. – Arturo Magidin Dec 03 '20 at 01:24
  • @AndrewLi A textbook often must proceed at a fairly slow pace to make sure all the definitions are in the correct order, make the connections to earlier definitions and theorems, provide lots of examples of each concept, etc. I think what you are finding here is that, once you have done all of this hard work, it is really useful to give a high level summary of what you have learned. This high level summary may not have been very useful to you before you did all of that work. – Steven Gubkin Dec 03 '20 at 01:24
  • @ArturoMagidin Whoops I meant S4, but I'll definitely look into those examples. Thanks! – Andrew Li Dec 03 '20 at 01:27
  • @AndrewLi Many natural group homomorphisms are not surjective. For instance, we want to think about the inclusion of a subgroup as an injective group homomorphism. Also, have you had a linear algebra course? Every linear map is also a homomorphism of the additive group of vectors. So if you care about nonsurjective linear maps (most of them!), then you should also care about nonsurjective group homomorphisms. – Steven Gubkin Dec 03 '20 at 01:28
  • @AndrewLi A finite group cannot map to $\mathbb{Z}$ nontrivially. The image of a group homomorphism is a subgroup, and the nontrivial subgroups of $\mathbb{Z}$ are all infinite. – Steven Gubkin Dec 03 '20 at 01:30
  • @StevenGubkin I guess so. Though, at least for me, the presentation of the First Isomorphism Theorem was out of left field without much motivation; it may have also been a case of being bogged down in notation. It was only once I thought the stuff in this post through that I felt like I really got it and many other concepts. Thanks for your answer and guidance (on MESE as well)! – Andrew Li Dec 03 '20 at 01:31

I know you've gotten one satisfactory answer, but let me weigh in here a bit.

I'll first mention that what you present is generally correct, and a valid way to approach this. The idea of "breaking down a (finite) group into smaller pieces" is in fact behind the idea of classifying finite simple groups (groups that cannot be broken down), together with the theory of group extensions (trying to understand what a group $G$ "is" if you have a normal subgroup $N\triangleleft G$, and you understand both $N$ and $G/N$).

But let me offer you a different perspective and a different way into the isomorphism theorems...

After you learn about groups, and subgroups and Lagrange's Theorem, maybe Cauchy's Theorem, we come to a crossroads in how to try to better understand a given group.

One way to try to learn things about a given group $G$ is to just stare at it until you notice some interesting things about $G$. However, generally speaking, a much more fruitful approach in algebra is to take a less static approach and to consider two things: what the group $G$ "can do", and how it interacts with other groups.

What a group "can do" is in fact historically how groups were originally understood. The original notion of a group was a "group of permutations": a collection of operations acting on a set in specific ways. Even as late as the turn of the 20th Century, Burnside's book on groups still defines a group as a collection of "operators" acting on "some objects". It was Cayley who introduced the abstract definition of a group as a "set with a binary associative operation satisfying certain conditions", and then immediately went on to prove that this did not change the objects of study, as any "group of permutations" was a group under his new proposed definition, and any object that satisfied this new proposed definition could be understood as a "group of permutations". This is the notion behind Cayley's Theorem, and why it is, in my opinion, more important historically than practically today. But this already introduces the notion of functions: what does "can be understood as a group of permutations" mean? It means you can biject it with such a group in a way that respects the operation.

This also leads us to functions. To justify why we want to think about functions, let's consider two areas where functions play a major role: the real numbers/calculus, and linear algebra.

The key property of the real numbers was that they were "continuous": they have no 'holes'. Rather than just stare at real numbers and see if we can say interesting things about them, it turns out to be much more fruitful and interesting to consider functions from $\mathbb{R}$ to itself that respect this "continuity". And so we get the notion of continuous functions, and the study of continuous functions, as a way to shed light on the nature of the real numbers themselves.

Similarly, with Linear Algebra, staring at vector spaces only takes you so far; the real power of vector spaces only emerges when you start considering linear transformations.

In both cases, you don't just want any old function; you want functions that "preserve" whatever it is that makes your objects interesting. For real numbers, continuity; for vector spaces, the addition and scalar product.

So with groups. A group is characterized by three things (bear with me): a binary operation $G\times G\to G$, that assigns to any pair of elements $g_1,g_2\in G$ their "product" $g_1g_2$. A distinguished element $e_G\in G$ with the property that $ge_G=e_Gg=g$ for all $g\in G$. And a function $G\to G$ that assigns to every element $g\in G$ its "inverse", $g^{-1}$, which has the property that $gg^{-1}=g^{-1}g=e_G$.

So if we have two groups $G$ and $H$, then a "function that preserves this structure" would be a function $f\colon G\to H$, such that

  1. Respects products: if $g_1,g_2\in G$, then $f(g_1g_2) = f(g_1)f(g_2)$.
  2. Respects the identity: $f(e_G) = e_H$.
  3. Respects inverses: if $g\in G$, then $f(g^{-1}) = (f(g))^{-1}$.

It turns out that two of these conditions are superfluous, but that is how we want to start. As you know, if 1 holds for a function between groups, then 2 and 3 will automatically hold as well. One can then define a group homomorphism as simply a function that satisfies 1 and prove 2 and 3; I prefer to define it as a function that satisfies 1, 2, and 3, and then prove that if it satisfies 1, then it must satisfy 2 and 3. The reason I prefer this is that I think it makes the definition more natural.
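For completeness, here is the standard short derivation of why 1 forces 2 and 3, written out as a sketch. Applying 1 to $e_G = e_Ge_G$ and cancelling $f(e_G)$ in $H$ gives

$$f(e_G) = f(e_Ge_G) = f(e_G)f(e_G) \implies e_H = f(e_G).$$

Then, for any $g\in G$, applying 1 to $gg^{-1}=e_G$ gives

$$f(g)f(g^{-1}) = f(gg^{-1}) = f(e_G) = e_H,$$

and likewise $f(g^{-1})f(g) = e_H$, so $f(g^{-1})$ is the inverse of $f(g)$ in $H$, that is, $f(g^{-1}) = (f(g))^{-1}$.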

Okay, so these are the functions that will play the role of "linear transformations" and "continuous functions". We call them, as I mentioned above, "group homomorphisms." They also are the type of functions needed in Cayley's argument that any group "can be seen" as a group of permutations, because that corresponds to a one-to-one function $f\colon G\to S_X$ (for some set $X$), that satisfies 1, 2, and 3. So that $f(G)$ is "essentially the same" (as far as the group structure is concerned) as $G$, but now it consists of permutations on a set $X$.

Now, given a function $f\colon G\to H$ (in fact, any function between two sets $X$ and $Y$), there is a natural equivalence relation that we can define on $G$. Let us say that two elements $x,y\in G$ are "$f$-equivalent", $x\sim_f y$, if and only if $f(x)=f(y)$. This is easily verified to be an equivalence relation, and so it partitions $G$ into equivalence classes.

But because $f$ is a group homomorphism, we have the following consequences: if $x\sim_f y$ and $z\sim_f w$, then $xz\sim_f yw$, and $x^{-1}\sim_f y^{-1}$. So we can make the set of equivalence classes, $G/\sim_f$ into a group! Let $[x]_f$ be the equivalence class of $x$. Then we can define $[x]_f*[y]_f = [xy]_f$, $e_{G/\sim_f} = [e_G]_f$, and $([x]_f)^{-1} = [x^{-1}]_f$. It is then an easy exercise to show that this is indeed a group.

What relation does this group have with $G$ and with $f$? Well, that's the first isomorphism theorem: there is a bijective group homomorphism between the group $G/\sim_f$ and the image group $f(G)$, given by sending $[x]_f$ to $f(x)$.
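Here is a small sketch of this construction in one concrete case, working only with the $f$-equivalence classes and never mentioning a kernel. The choice $f\colon \mathbb{Z}_{12}\to\mathbb{Z}_4$, $x\mapsto x \bmod 4$, and the helper name `cls` are assumptions made purely for illustration.

```python
n, m = 12, 4
G = range(n)  # Z_12 under addition mod 12

def f(x):
    """Hypothetical example homomorphism Z_12 -> Z_4 (valid because 4 divides 12)."""
    return x % m

def cls(x):
    """The f-equivalence class [x]_f: all y in G with f(y) == f(x)."""
    return frozenset(y for y in G if f(y) == f(x))

classes = {cls(x) for x in G}

# [x] * [z] = [xz] is well defined: changing representatives
# within their classes does not change the class of the product.
well_defined = all(
    cls((x + z) % n) == cls((y + w) % n)
    for x in G for y in cls(x)
    for z in G for w in cls(z)
)

# The map [x]_f -> f(x) takes a single value on each class ...
values = {c: {f(x) for x in c} for c in classes}
single_valued = all(len(v) == 1 for v in values.values())

# ... and distinct classes get distinct values covering the whole image.
image = {f(x) for x in G}
hit = {next(iter(v)) for v in values.values()}
bijective = single_valued and hit == image and len(classes) == len(image)

print(len(classes), well_defined, bijective)
```

This only checks a single finite example, of course; the general statement still needs the (short) proof.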

How is this related to normal subgroups? Ah, well, these equivalence classes have an interesting property: because of the property that $x\sim_f y$ implies $x^{-1}\sim_f y^{-1}$, and if $x\sim_f y$ and $z\sim_f w$ then $xz\sim_f yw$, we have $$x\sim_f y \iff xy^{-1}\sim_f e_G.$$ That is: we can completely determine the equivalence relation by just knowing $[e_G]_f$. Moreover, this collection is a subgroup of $G$!

Will any subgroup work? No, it turns out it doesn't. If $N$ is a subgroup and we try to define an equivalence relation $x\sim y$ if and only if $xy^{-1}\in N$, we do get an equivalence relation, but we do not get an equivalence relation that lets you define a group structure on $G/\sim$. The condition that lets you do that is precisely that $N$ must be a normal subgroup. I go into much more detail about this in this answer.
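To see the failure happen, here is a sketch that tests the condition "$x\sim y$ and $z\sim w$ imply $xz\sim yw$" for two subgroups of $S_3$: a two-element subgroup generated by a transposition (not normal) and the subgroup of rotations (normal). The example and the helper names are my own choices for illustration.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q, then p' (permutations stored as tuples)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(3)))  # S_3

def respects_products(N):
    """Check that x ~ y and z ~ w imply xz ~ yw, where x ~ y iff x y^{-1} is in N."""
    def equivalent(x, y):
        return compose(x, inverse(y)) in N
    return all(
        equivalent(compose(x, z), compose(y, w))
        for x in G for y in G if equivalent(x, y)
        for z in G for w in G if equivalent(z, w)
    )

H = {(0, 1, 2), (1, 0, 2)}               # identity and one transposition: not normal
A3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}   # the three rotations: normal

print(respects_products(H), respects_products(A3))
```

For the transposition subgroup the check fails, so "multiply representatives" does not give a well-defined operation on the classes; for the rotations it succeeds.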

So, "good" equivalence relations, those coming from functions (they are called "congruences"), correspond to normal subgroups. In fact, as with Cayley's Theorem before (which gave a separate definition of "group" and then showed it was really the same as the old one), so it is with "good" equivalence relations:

Theorem. A subgroup $N$ of a group $G$ is normal in $G$ if and only if there exists a group $H$ and a homomorphism $f\colon G\to H$ such that $N=[e_G]_f$.

This then leads to the usual First Isomorphism Theorem, which says: this construction of taking quotients of a group is "essentially the same" as looking at the image of $G$ under a group homomorphism, in that given any homomorphism $f\colon G\to H$, if $N=[e_G]_f$, then $G/N = G/\sim_f$ is "essentially the same" as $f(G)$: there is a bijective group homomorphism between them.

The Third Isomorphism Theorem corresponds to compositions of morphisms: if $f\colon G\to H$ and $g\colon H\to K$, then $f(G)/\sim_g$ is essentially the same as $G/\sim_{g\circ f}$. That is, if $N\triangleleft G$, $K\triangleleft G$, $N\subseteq K$, then $K/N \triangleleft G/N$ and $(G/N)/(K/N)\cong G/K$.

The Fourth, or lattice, Isomorphism Theorem establishes a correspondence between the subgroups of $f(G)$ and the subgroups of $G$ that contain $[e]_f$. One then asks... okay, and what about other subgroups of $G$? That's what the Second Isomorphism Theorem gives you: if $f\colon G\to H$ is a homomorphism, and $K$ is an arbitrary subgroup of $G$, then $f(K)$ corresponds to $K/(K\cap N)$ (where $N=[e]_f$). And this image is "the same" as the image of $KN$. That is, $$\frac{K}{K\cap N} \cong \frac{KN}{N}.$$
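A tiny sketch of the Second Isomorphism Theorem in additive notation, using $G=\mathbb{Z}_{12}$, $K=\langle 2\rangle$ and $N=\langle 3\rangle$. These subgroups are an example I chose for illustration; since the group is abelian, normality is automatic here, so this shows the bookkeeping rather than the full strength of the theorem.

```python
n = 12
K = frozenset(range(0, n, 2))   # subgroup generated by 2
N = frozenset(range(0, n, 3))   # subgroup generated by 3 (normal, as Z_12 is abelian)

KN = frozenset((k + x) % n for k in K for x in N)   # written K + N additively
K_cap_N = K & N

def coset(g, H):
    """The coset g + H inside Z_12."""
    return frozenset((g + h) % n for h in H)

source = {coset(k, K_cap_N) for k in K}   # elements of K/(K ∩ N)
target = {coset(x, N) for x in KN}        # elements of KN/N

# The candidate isomorphism sends k + (K ∩ N) to k + N.
images = {c: {coset(k, N) for k in c} for c in source}
well_defined = all(len(v) == 1 for v in images.values())
hit = {next(iter(v)) for v in images.values()}
bijective = well_defined and hit == target and len(source) == len(target)

print(sorted(K_cap_N), len(source), len(target), bijective)
```

Here $K\cap N=\{0,6\}$, and both quotients have three elements that match up under $k+(K\cap N)\mapsto k+N$.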

So in summary:

  1. First Isomorphism Theorem tells you that images of groups correspond to quotients and vice-versa.

  2. Third Isomorphism Theorem tells you that this correspondence plays well with composition.

  3. Fourth Isomorphism Theorem tells you that there is a very nice correspondence between the subgroups of $f(G)$ and the subgroups of $G$ that contain $[e]_f$.

  4. And the Second Isomorphism Theorem tells you how the rest of the subgroups of $G$ behave under the homomorphism $f$.

Thus, the importance of normal subgroups corresponds simply to the importance of homomorphisms. Images of a group are like "shadows" of the group, and so will hopefully sometimes be easier to understand. Simple groups are the ones that we cannot simplify this way: we'll just have to stare at them intently until we understand them. And if we can understand simple groups, and we can understand how to put groups together (group extensions) from $N$ and $G/N$, then perhaps we can leverage our (hypothetical) understanding of simple groups into a (even more hypothetical) understanding of all groups. Turns out this is too naïve a hope, unfortunately, but perhaps it can help justify why we care about morphisms, normal subgroups, quotients, etc.

Arturo Magidin
  • This is a splendid answer! I'll have to come back when I have the time to really internalize the insights on the 2nd-4th isomorphism theorems (I learned 2 and 3, but in a "these theorems exist" way). And this connects well with the simple groups I'm studying now, thanks again for your comments and answer! – Andrew Li Dec 03 '20 at 22:17