
This could be too broad if we're not careful. I'm sorry if it ends up that way.

Let's put together a list of different constructions of the free group $F_X$ over a given set $X$.

It seems to be one of those things a lot of people know about (and use implicitly) but whose constructions can be so tedious, it's hard to get hold of at first. The main problem, as Lee Mosher reminds us, seems to be associativity.

The Wikipedia page (linked to above) goes some way into listing some useful perspectives. I'm impressed by this intuitive summary from Wolfram:

A group is called a free group if no relation exists between its group generators other than the relationship between an element and its inverse required as one of the defining properties of a group.

But this is not exactly a construction nor is it strictly the free group over a set.

Here's a brief list of what I have so far:

This starts with the "standard" construction using finite strings over $\mathcal{X}=X\cup X'$ then asks for an opinion on quotienting by some equivalence relation.

The free group is constructed as the left adjoint of the composition of certain forgetful functors. This one is interesting in that it goes via the category InvMon of monoids with involutions as objects and involution-preserving homomorphisms as morphisms.

Martin Brandenburg also gives quite a concise one in the comments there, so I invite him to elaborate on that here :)


The above list is not at all exhaustive (of what I know) and there's bound to be some overlap. Personally I would be interested to see explicit use of Universal Algebra, Semigroup Theory, and Category Theory. The reason for the latter should be clear from the above; as for the first two, see


Feel free to give more details on those already listed here.

Shaun
  • 4
    I'm a big fan of Massey's presentation in "Algebraic Topology: An Introduction" (or something like that); he starts from the "words with cancelling rules" approach, points out that you end up saying a lot of not-quite-sensible things, and then shows how to construct things based on universal properties, with the "words" model as motivation throughout. – John Hughes Jul 03 '14 at 12:53
  • 7
    One should never define the free group as a set of reduced words. This is conceptually wrong and requires tedious calculations to verify the group structure ... the free group (or free object of any type) is defined via its universal property. Existence is a simple application of Freyd's Adjoint Functor Theorem. Explicitly, we have $F(S) = \langle \mathrm{im}(\phi) \rangle$ with $\phi : S \to \prod_{i : S \to U(G) \text{ generates} G} G$. The structure of words can be derived from the universal property. See also http://math.stackexchange.com/questions/487628 – Martin Brandenburg Jul 03 '14 at 20:42
  • 6
    @MartinBrandenburg: "should never" might be a bit strong. What I like about this question is that different definitions are useful for different purposes. The modern geometric theory of free groups and their automorphism and outer automorphism groups is founded on a good intuitive understanding of reduced words, starting with the bounded cancellation lemma and Cooper's paper "Automorphisms of free groups have finitely generated fixed point sets". Part of that is what goes into my answer below, in which there are no tedious calculations, just geometry. – Lee Mosher Jul 03 '14 at 23:32
  • 1
    I might put together a detailed answer from the perspective of Universal Algebra, time permitting; I need to brush-up on it. – Shaun Jul 06 '14 at 19:11
  • 1
    Using Lawvere theories might be illustrative. Have a look at the "Free $T$-algebras and underlying sets" section of the link provided. But would that give a construction, @MartinBrandenburg? :) – Shaun Jul 07 '14 at 09:52
  • This is pretty cool. I just thought I'd mention it here. – Shaun Jan 29 '15 at 22:54
  • 1
    Old question, I know… but it does not seem to have the construction of “free group as a subobject of a large product”. Should I add it? – Arturo Magidin May 07 '21 at 20:36
  • @ArturoMagidin: Yes, please do! – Shaun May 07 '21 at 20:42

6 Answers

9

It's just not that hard to construct the free group via reduced words. The trick is to prove the universal property before proving associativity. The argument is standard (e.g., it is in Dummit & Foote, Ch.6). Who came up with it first?

Start with a set of symbols $S$. Extend to a set $S\amalg S^*$ of letters, equipped with an involution $a\mapsto a^*$ which switches the summands. A reduced word is a finite sequence $x=(x_1,\dots,x_n)$ of letters such that $x_j\neq x_{j+1}^*$ for all $j$. Define a binary operation $x\cdot y$ on the set $F$ of reduced words: $$ (x_1,\dots,x_m)\cdot (y_1,\dots,y_n) := (x_1,\dots, x_{m-k},y_{k+1},\dots,y_n), $$ where $k$ is the unique integer such that $x_{m-k}\neq y_{k+1}^*$ but $x_{m-j}=y_{j+1}^*$ for $j<k$ (with the obvious modifications when $k=\min(m,n)$, giving $(y_{m+1},\dots,y_n)$ or $(x_1,\dots,x_{m-n})$ or $()$ as the case may be).
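The operation above can be sketched in a few lines of Python. This is only an illustration under my own conventions, not part of the cited argument: a letter is modelled as a pair `(symbol, sign)`, and the names `star`, `is_reduced` and `mul` are hypothetical.

```python
# Letters are modelled as (symbol, sign): sign +1 for s in S, -1 for s* in S*.
# This encoding and all function names are assumptions made for illustration.

def star(letter):
    """The involution a -> a*, which swaps the two summands."""
    symbol, sign = letter
    return (symbol, -sign)

def is_reduced(word):
    """True when no letter is immediately followed by its star."""
    return all(word[j] != star(word[j + 1]) for j in range(len(word) - 1))

def mul(x, y):
    """x . y for reduced words: cancel the tail of x against the head of y."""
    k = 0
    while k < min(len(x), len(y)) and x[len(x) - 1 - k] == star(y[k]):
        k += 1
    return x[:len(x) - k] + y[k:]

a, b = ("a", 1), ("b", 1)
x = (a, b, star(a))            # the reduced word a b a*
y = (a, star(b), star(b))      # the reduced word a b* b*
print(mul(x, y))               # two cancellations happen, leaving a b*
```

The loop counter `k` plays the role of the integer $k$ in the formula, and the slicing handles the boundary cases $k=\min(m,n)$ without special treatment.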

The following is entirely straightforward to prove:

  • Every function $\phi\colon S\to G$ to a group $G$ extends uniquely to a function $\Phi\colon F\to G$ such that:

    1. $\Phi((s))=\phi(s)$ for each symbol $s\in S$, and

    2. $\Phi(x\cdot y)=\Phi(x)\Phi(y)$ for any reduced words $x,y\in F$.

Note: 2. implies that $\Phi$ must send the empty word to the identity element and that $\Phi((s^*))=\phi(s)^{-1}$, since $()=()\cdot ()=(s)\cdot (s^*)=(s^*)\cdot (s)$ in $F$.

The construction of $\Phi$ is simply: extend $\phi$ to $S\amalg S^*\to G$ by $\phi(s^*):=\phi(s)^{-1}$, and define $\Phi(x):=\phi(x_1)\cdots \phi(x_n)$.
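A small sketch of this extension, assuming a concrete target group: here $G$ is taken to be the permutations of $\{0,1,2\}$ written as tuples, and the map `phi` on two symbols is an arbitrary choice of mine. The encoding of letters as `(symbol, sign)` and all names are hypothetical.

```python
# Sketch of the multiplicative extension Phi of a map phi: S -> G.
# Assumptions: letters are (symbol, sign) pairs, and G is the group of
# permutations of {0, 1, 2}, each written as a tuple p with p[i] the image of i.

def star(letter):
    symbol, sign = letter
    return (symbol, -sign)

def mul(x, y):
    """Product of reduced words, cancelling x's tail against y's head."""
    k = 0
    while k < min(len(x), len(y)) and x[len(x) - 1 - k] == star(y[k]):
        k += 1
    return x[:len(x) - k] + y[k:]

IDENTITY = (0, 1, 2)

def compose(p, q):
    """Product p*q in G, acting as i -> p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

phi = {"a": (1, 0, 2), "b": (0, 2, 1)}   # an arbitrary choice of phi: S -> G

def Phi(word):
    """Phi(x) = phi(x_1) ... phi(x_n), with phi(s*) := phi(s)^{-1}."""
    result = IDENTITY
    for symbol, sign in word:
        g = phi[symbol] if sign == 1 else inverse(phi[symbol])
        result = compose(result, g)
    return result

x = (("a", 1), ("b", 1), ("a", -1))
y = (("a", 1), ("b", -1), ("b", -1))
print(Phi(mul(x, y)) == compose(Phi(x), Phi(y)))   # True
```

Multiplicativity holds because each cancelled pair $a, a^*$ contributes $\phi(a)\phi(a)^{-1}$, which is the identity of $G$.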

Now let $G=$ the permutation group of $F$. For each letter $a\in S\amalg S^*$, let $\lambda_a\colon F\to F$ be the left-multiplication function $$ \lambda_a(x) := (a)\cdot x. $$ A direct calculation using the definition of the binary operation shows that the functions $\lambda_a$ and $\lambda_{a^*}$ are inverse to each other, so $\lambda_a\in G$. Let $\phi\colon S\to G$ be $\phi(s):=\lambda_s$, and let $\Phi\colon F\to G$ be the unique multiplicative extension as above. From the construction of $\Phi$ calculate that the evaluation of the permutation $\Phi(x)\in G$ at the empty word is exactly $x$, so $\Phi$ is injective. From this we can read off that the operation on $F$ is associative, since multiplication in $G$ is associative; the other group axioms for $F$ are obvious.
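The two calculations in this paragraph can be checked mechanically on short words. The following is a sketch only, assuming two generators and my own encoding of letters as `(symbol, sign)`; it tests finitely many cases and is no substitute for the proof.

```python
# Check, on short reduced words over two generators, that
#   (1) lambda_a and lambda_{a*} are mutually inverse,
#   (2) Phi(x) evaluated at the empty word is x,
#   (3) the product of reduced words is associative.
# The encoding and the names are assumptions made for illustration.
from itertools import product

LETTERS = [("a", 1), ("a", -1), ("b", 1), ("b", -1)]

def star(letter):
    symbol, sign = letter
    return (symbol, -sign)

def mul(x, y):
    k = 0
    while k < min(len(x), len(y)) and x[len(x) - 1 - k] == star(y[k]):
        k += 1
    return x[:len(x) - k] + y[k:]

def lam(a):
    """Left multiplication by the one-letter word (a)."""
    return lambda w: mul((a,), w)

def Phi(x):
    """Phi(x) = lam(x_1) o ... o lam(x_n), returned as a function F -> F."""
    def permutation(w):
        for a in reversed(x):       # the innermost factor lam(x_n) acts first
            w = lam(a)(w)
        return w
    return permutation

def reduced_words(max_len):
    words, frontier = [()], [()]
    for _ in range(max_len):
        frontier = [w + (a,) for w in frontier for a in LETTERS
                    if not w or w[-1] != star(a)]
        words += frontier
    return words

WORDS = reduced_words(2)            # 1 + 4 + 12 = 17 words
inverse_ok = all(lam(a)(lam(star(a))(w)) == w for a in LETTERS for w in WORDS)
evaluation_ok = all(Phi(x)(()) == x for x in WORDS)
associative_ok = all(mul(mul(x, y), z) == mul(x, mul(y, z))
                     for x, y, z in product(WORDS, repeat=3))
print(inverse_ok, evaluation_ok, associative_ok)
```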

6

Although this is an old question, this construction recently came up by reference in another question by Shaun, and it does not appear to be included above, so I figure I can add it (having checked with OP first).

Here's a categorical construction using products; it is actually related to the adjoint functor construction, in that it follows the ideas of Freyd's General Adjoint Functor Theorem. It can be carried out with any class of algebras that is closed under products and has a notion of "subobject generated". So it works in most "nice" categories of algebras (in the sense of Universal Algebra): abelian groups, $R$-modules, commutative rings, associative rings, etc.

Let $X$ be a set. A free group on $X$ is a pair $(F,u)$, where $F$ is a group, $u\colon X\to F$ is a set map, and the pair has a universal property: for every group $G$ and every set map $v\colon X\to G$, there exists a unique group homomorphism $\phi\colon F\to G$ such that $v = \phi\circ u$.

The idea is this: consider all pairs $(G_v,v)$ where $G_v$ is a group, and $v\colon X\to G_v$ is a set map. Take the product $$P=\prod_{(G_v,v)} G_v,$$ of all such $G_v$, indexed by the pairs. The maps $v\colon X\to G_v$ induce a set map $u\colon X\to P$. Let $F$ be the subgroup of $P$ generated by $u(X)$. Then $F$ and $u$ will have the desired property, because given any group $G$ and set map $v\colon X\to G$, we obtain the relevant morphism $F\to G$ by simply taking the restriction of the projection $\pi_{(G,v)}\colon P\to G$ to the subgroup $F$. And because $F$ is generated by $u(X)$, this will provide the uniqueness. Voilà! As if by magic, we obtain a group with the correct universal property.

However, there is a technical issue with the idea above: we can't construct that product, because the collection of all pairs $(G_v,v)$ is a proper class, not a set, and we cannot take a cartesian product indexed over a proper class. So we need to pare down this collection to something manageable (i.e., to a set).

The first observation we can make is that if $f\colon G\to H$ is an isomorphism, and $v\colon X\to G$ is a set map, then we do not need both $(G,v)$ and $(H,f\circ v)$ in our collection: we can just take $(G,v)$, and if we are presented with $H$ and $f\circ v$, we project to $G$ and then use $f$ to get all the way to $H$. So we really only need one group from each isomorphism class, plus all set maps from $X$ to that group (and that collection, for a fixed group $G$, is a set).

However, this is still not small enough, since there are groups of arbitrarily large cardinality, and thus even taking just one group from each isomorphism class we would still have a proper class. So we will not go down this road in the end (turns out to not matter).

The second observation to make is that in fact, given $(G,v)$, we don't really need all of $G$: we just need $\langle v(X)\rangle$, the subgroup generated by $v(X)$. Because the image of $F$ in $G$ will be generated by $v(X)$, so the image will lie inside that subgroup.

This is a good observation, because that bounds the size of generating sets we need to consider. And now we have an easy upper bound for how big the groups we need to consider are:

Lemma. Let $G$ be a group, and $S\subseteq G$ a nonempty subset of $G$. If $G=\langle S\rangle$, then $|G|\leq |S|\aleph_0$. (If $S=\varnothing$, then $G$ is trivial.)

Proof. We know that $\langle S\rangle$ consists of finite products of elements of $S$ and their inverses. The set of such products has at most $|S|\aleph_0$ elements, so $G$ has at most that many elements. $\Box$
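The count in the proof can be spelled out as follows. This is my own expansion of the argument, not part of the original answer. Words of length $n$ in the elements of $S$ and their inverses number at most $(2|S|)^n$, so $$|\langle S\rangle| \;\leq\; \sum_{n\geq 0} (2|S|)^n \;\leq\; \aleph_0\cdot\max(|S|,\aleph_0) \;=\; \max(|S|,\aleph_0),$$ since each term $(2|S|)^n$ is finite when $S$ is finite and equals $|S|$ when $S$ is infinite and $n\geq 1$. For nonempty $S$ the right-hand side is exactly $|S|\aleph_0$.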

So, here's how we get around the technical issue: fix a set $M$ of cardinality $|X|\aleph_0$. We will consider all pairs $(G_v,v)$ such that

  1. $G_v$ is a group;
  2. $v\colon X\to G_v$ is a set map;
  3. $G_v$ is generated by $v(X)$;
  4. The underlying set of $G_v$ is contained in $M$;

Call the collection of all such pairs $\mathcal{S}$. It is now easy to verify that this is a set.

This is what is sometimes called a "solution set". It has the following property:

Given any group $K$ and set map $w\colon X\to K$ such that $K=\langle w(X)\rangle$, there exists a pair $(G_v,v)\in\mathcal{S}$ and an isomorphism $\phi\colon G_v\to K$ such that $w = \phi\circ v$.

So now we proceed as outlined above: Let $P$ be the product over all elements of $\mathcal{S}$, $$P = \prod_{(G_v,v)\in \mathcal{S}} G_v.$$ Let $\pi_v\colon P\to G_v$ be the projection onto the $(G_v,v)$th coordinate. Because $\mathcal{S}$ is a set, this product exists and makes sense.

The underlying set of $P$ is the cartesian product of the $G_v$. Thus, the functions $v\colon X\to G_v$ induce a unique set map $u\colon X\to P$ such that $\pi_v\circ u = v$ for all $(G_v,v)\in\mathcal{S}$.

Now, we "remember" that $P$ is in fact a group, so let $F=\langle u(X)\rangle$ be the subgroup of $P$ generated by the image of $u$. I claim that $(F,u)$ is the free group on $X$.

Indeed, let $H$ be any group, and let $w\colon X\to H$ be a set map. Then $K=\langle w(X)\rangle$ is a group generated by $w(X)$, and hence by the solution set property of $\mathcal{S}$, there exists a $(G_v,v)\in\mathcal{S}$ and an isomorphism $\phi\colon G_v\to K$ such that $w=\phi\circ v$. Let $\iota\colon K\to H$ be the inclusion map.

Then, restricting $\pi_v$ to $F$, the composite $\iota\circ\phi\circ\pi_v\colon F\to G_v\to K\hookrightarrow H$ is a morphism from $F$ to $H$, and $$\begin{align*} (\iota\circ\phi\circ\pi_v)\circ u &= \iota\circ\phi\circ (\pi_v\circ u)\\ &= \iota\circ\phi\circ v\\ &= w \end{align*}$$ (because $\phi\circ v$ is equal to the co-restriction of $w\colon X\to H$ to $K$).

Thus, there exists a morphism $f\colon F\to H$ such that $f\circ u = w$. To prove uniqueness, we note that if $g\colon F\to H$ also satisfies $g\circ u=w$, then for every $x\in X$ we have $g(u(x)) = w(x) = f(u(x))$, so $g$ and $f$ agree on $u(X)$. Therefore, they agree on $\langle u(X)\rangle$; by construction, this equals $F$, so $f=g$.

Hence $(F,u)$ has the universal property of the free group on $X$, and hence is the free group on $X$ (up to unique isomorphism). $\Box$
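To see the mechanism in miniature, here is a toy sketch in which the solution set is replaced by a small finite family of my own choosing: $X=\{x\}$ and the pairs $(\mathbb{Z}/n, x\mapsto 1)$ for $n\in\{2,3,4\}$. The resulting $F$ is of course not the free group (that needs the full solution set); the sketch only shows that the subgroup of the product generated by $u(X)$ maps onto every member of the family via the projections.

```python
# Toy version of "subgroup of a product generated by u(X)".
# Assumption: the family below is a hypothetical finite stand-in for the
# solution set, with X = {x} and v(x) = 1 in each cyclic group Z/n.

MODULI = (2, 3, 4)

def add(p, q):
    """Group operation in the product P, computed coordinatewise."""
    return tuple((a + b) % n for a, b, n in zip(p, q, MODULI))

def neg(p):
    return tuple((-a) % n for a, n in zip(p, MODULI))

# u(x) is the tuple whose v-th coordinate is v(x), so that pi_v(u(x)) = v(x).
u_x = tuple(1 % n for n in MODULI)

def generated_subgroup(generators):
    """Close the generators and their inverses under the group operation."""
    identity = tuple(0 for _ in MODULI)
    elements = {identity}
    frontier = [identity]
    steps = list(generators) + [neg(g) for g in generators]
    while frontier:
        p = frontier.pop()
        for g in steps:
            q = add(p, g)
            if q not in elements:
                elements.add(q)
                frontier.append(q)
    return elements

F = generated_subgroup([u_x])
# Each projection, restricted to F, hits every element of the corresponding Z/n.
onto = [{p[i] for p in F} == set(range(n)) for i, n in enumerate(MODULI)]
print(len(F), onto)
```

In this toy case $F$ is cyclic of order $\operatorname{lcm}(2,3,4)=12$, a proper subgroup of the product of order $24$.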

Arturo Magidin
  • Excellent answer! Thank you! Does "let $\mathscr S$ be a set consisting of exactly one representative from each equivalence class in $\mathcal{S}/\sim$" require the axiom of choice? – Shaun May 07 '21 at 21:53
  • 1
    @Shaun: Yes, because there's definitely infinitely many such classes; you have at least all cyclic groups lurking in there. – Arturo Magidin May 07 '21 at 21:54
  • 1
    @Shaun: Actually, you may not need to go to $\mathscr{S}$; all you need is some group that is suitably isomorphic to $K$, it doesn’t matter if there are more than one, since the uniqueness of the final map only depends on the fact that it is completely determined by values at $u(X)$. – Arturo Magidin May 07 '21 at 23:05
  • 1
    What a nice answer! – Bryan Castro Feb 18 '22 at 16:45
5

Recently I learned a new (to me) way of proving the associativity of multiplication of reduced words from this American Mathematical Monthly article

An Elementary Treatment of the Construction of the Free Product of Groups http://www.jstor.org/stable/10.4169/amer.math.monthly.122.7.690

James E. McClure and Alec McGail

The American Mathematical Monthly, Vol. 122, No. 7 (August–September 2015), pp. 690-692

The article addresses free products of groups, but of course the technique applies to free groups as well.

They show that every word has a unique reduced form in a direct and elementary way. Then the associativity follows immediately from the (easy) associativity in the free monoid.

Here is the proof. I will use Charles Rezk's notation, so the words are finite sequences of letters, which are elements of $S \cup S^*$. Suppose a word $w$ has two sequences of reductions $$w \to w_1 \to \cdots \to w_m,$$ $$w \to w'_1 \to \cdots \to w'_n,$$ where each step $w_i \to w_{i+1}$ is "cancellation of a pair of some adjacent letters $a$, $a^*$" and the final word $w_m$ is reduced (and similarly for the $w'_j$). We want to show that $w_m = w'_n$. We call $w_{i+1}$ an immediate descendent of $w_i$, and $w_j$ a descendent of $w_i$ if $i<j$. If $w_i = w'_j$ for some $i$ and $j$, then we are done by induction on the word length. So assume that $w_1 \neq w'_1$. There are two cases.

  1. We have $$w = (u_1, a, a^*, u_2, b, b^*, u_3),$$ where the $u_i$ are subwords and $a, b$ are letters, and $$w_1 = (u_1, u_2, b, b^*, u_3),$$ $$w'_1 = (u_1, a, a^*, u_2, u_3).$$ Then $$\tilde{w} = (u_1, u_2, u_3)$$ is a common (immediate) descendent of $w_1$ and $w'_1$.
  2. We have $$w = (u_1, a, a^*, a, u_2),$$ where the $u_i$ are subwords and $a$ is a letter, and $$w_1 = (u_1, a, u_2) = w'_1.$$

In both cases, we are done by induction on the word length.
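The uniqueness statement is easy to test empirically. The sketch below is my own illustration, not taken from the article: it cancels adjacent pairs in a random order and compares the outcome with a left-to-right stack reduction. Letters are again modelled as hypothetical `(symbol, sign)` pairs.

```python
# Empirical check that every order of cancellations leads to the same
# reduced word. Encoding and function names are assumptions for illustration.
import random

LETTERS = [("a", 1), ("a", -1), ("b", 1), ("b", -1)]

def star(letter):
    symbol, sign = letter
    return (symbol, -sign)

def cancellable_positions(word):
    """Indices i such that word[i], word[i+1] is an adjacent pair a, a*."""
    return [i for i in range(len(word) - 1) if word[i] == star(word[i + 1])]

def reduce_random(word, rng):
    """Cancel adjacent pairs in a randomly chosen order until none is left."""
    word = tuple(word)
    while True:
        positions = cancellable_positions(word)
        if not positions:
            return word
        i = rng.choice(positions)
        word = word[:i] + word[i + 2:]

def reduce_stack(word):
    """One particular reduction order: scan left to right with a stack."""
    out = []
    for letter in word:
        if out and out[-1] == star(letter):
            out.pop()
        else:
            out.append(letter)
    return tuple(out)

def check_confluence(trials, seed):
    rng = random.Random(seed)
    for _ in range(trials):
        word = [rng.choice(LETTERS) for _ in range(rng.randint(0, 12))]
        expected = reduce_stack(word)
        if any(reduce_random(word, rng) != expected for _ in range(5)):
            return False
    return True

print(check_confluence(trials=200, seed=0))
```

The word $(a, a^*, a)$ is the smallest instance of case 2: both available cancellations give $(a)$.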

5

The simplest definition of the elements of a free group is the one using reduced words; you found it in Ledermann. This also leads to a reasonably simple definition of the multiplication. But this just pushes the problem somewhere else, namely in verification of the associative law (once the associative law is proved, it then follows that each word is equivalent to a unique reduced word).

Personally I like a topological proof of the associative law; this will be in my book on $Out(F_n)$. One first constructs the tree $T$ whose edges are oriented and labelled by the elements of $X$, such that for each vertex $v$ and each $x \in X$ there is a unique incoming and a unique outgoing edge at $v$ labelled with $x$. After the fact one notices that this tree is the universal covering space of the wedge of circles with one circle for each generator; but you don't need the theory of universal covering spaces to construct this tree, you just construct it inductively by constructing the radius $n$ neighborhood of a base vertex, verifying as you go along that the construction satisfies the tree axiom, namely that it is connected and contains no circles. The associative law in the free group then comes down to the fact that the operation of concatenating paths and straightening the result to eliminate backtracking is an associative operation, which follows from the simple observation that two points in a tree are connected by a unique path without backtracking.
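The inductive construction of the radius $n$ neighborhood can be imitated on a computer. The sketch below is only an illustration under assumptions of mine: it takes $X=\{a,b\}$, names each vertex by the non-backtracking edge path leading to it from the base vertex (which quietly uses the reduced-word model as bookkeeping), and checks the tree axiom by counting, since a connected finite graph with one more vertex than edges contains no circles.

```python
# Build the radius-n ball around the base vertex and check the tree axiom.
# Assumptions: X = {a, b}; vertices are named by non-backtracking edge paths
# from the base vertex, each step being a (label, direction) pair.
from collections import deque

GENERATORS = ("a", "b")
LETTERS = [(g, s) for g in GENERATORS for s in (1, -1)]

def star(letter):
    label, direction = letter
    return (label, -direction)

def step(vertex, letter):
    """Endpoint of the edge leaving `vertex` along `letter`."""
    if vertex and vertex[-1] == star(letter):
        return vertex[:-1]            # walking back along the last edge
    return vertex + (letter,)

def ball(radius):
    """Vertices within distance `radius` of the base vertex, found by BFS."""
    dist = {(): 0}
    queue = deque([()])
    while queue:
        v = queue.popleft()
        if dist[v] == radius:
            continue
        for letter in LETTERS:
            w = step(v, letter)
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)

def edges_within(vertices):
    """Undirected edges with both endpoints among the given vertices."""
    return {frozenset((v, step(v, letter)))
            for v in vertices for letter in LETTERS
            if step(v, letter) in vertices}

V = ball(3)
E = edges_within(V)
print(len(V), len(E))   # connected by construction, so a tree iff |E| = |V| - 1
```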

Lee Mosher
4

The following is what I have after running through "A Course in Universal Algebra" by Burris et al. using the description of a group given here. It's just a sketch. Consult the text wherever necessary.

We work in the type $\mathcal{G}=\{\cdot, \sim, e\}$, where $\cdot\in\mathcal{G}_2$ is a binary operation, $\sim\in\mathcal{G}_1$ is a unary operation, and $e\in\mathcal{G}_0$ is a nullary operation, all subject to $$\begin{align} x\cdot (y\cdot z)&=(x\cdot y)\cdot z,\tag{associativity} \\ e\cdot x&=x=x\cdot e,\text{ and}\tag{identity}\\ x\cdot(\sim x)&=e=(\sim x)\cdot x.\tag{inverses}\end{align}$$

Definition 1: Let $X$ be a set of (distinct) variables. The set $T(X)$ of terms of type $\mathcal{G}$ over $X$ is the smallest set s.t.

  • $X\cup\{e\}\subseteq T(X)$ and
  • If $p_1, \dots , p_n\in T(X)$ and $f\in\mathcal{G}_n$, then the string $f(p_1, \dots , p_n)\in T(X)$

Definition 2: Given a group $\mathbb{G}=\langle G, \{\cdot^{\mathbb{G}}, \sim^{\mathbb{G}}, e^{\mathbb{G}}\}\rangle$ (i.e., an algebra of type $\mathcal{G}$) and an $n$-ary term $p(x_1, \dots , x_n)$ of type $\mathcal{G}$ over $X=\{x_i\mid i\in I\}$, some $I$, define the term function of $\mathbb{G}$ corresponding to $p$, denoted $p^{\mathbb{G}}: G^n\to G$, like so.

  • If $p$ is a variable $x_i$, then $p^{\mathbb{G}}(a_1, \dots , a_n)=a_i.$
  • If $p$ is of the form $p_1(x_1, \dots , x_n)\cdot p_2(x_1, \dots , x_n)$, $\sim p_1(x_1, \dots , x_n)$, or $e$, then $p^{\mathbb{G}}(a_1, \dots , a_n)$ is $p_1^{\mathbb{G}}(a_1, \dots , a_n)\cdot^{\mathbb{G}} p_2^{\mathbb{G}}(a_1, \dots , a_n)$, $\sim^{\mathbb{G}} p_1^{\mathbb{G}}(a_1, \dots , a_n)$, or $e^{\mathbb{G}}$, respectively.
  • If $p=f\in\mathcal{G}$, then $p^{\mathbb{G}}=f^{\mathbb{G}}$.
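Definitions 1 and 2 translate directly into a recursive evaluator. The following is a minimal sketch under my own conventions, none of which come from Burris et al.: a variable is a string, compound terms are tuples tagged `"mul"`, `"inv"` or `"e"`, and the group $\mathbb{G}$ is the permutations of $\{0,1,2\}$.

```python
# Terms of type {., ~, e} as nested tuples, and their term functions in a group.
# Assumptions: the tuple encoding, the names, and the choice of S_3 as the group.
from itertools import permutations, product

IDENTITY = (0, 1, 2)

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def evaluate(term, assignment):
    """Value of the term function of `term` at the given assignment."""
    if isinstance(term, str):                 # a variable x_i
        return assignment[term]
    tag = term[0]
    if tag == "e":
        return IDENTITY
    if tag == "inv":
        return inverse(evaluate(term[1], assignment))
    if tag == "mul":
        return compose(evaluate(term[1], assignment),
                       evaluate(term[2], assignment))
    raise ValueError(f"unknown operation symbol: {tag!r}")

left = ("mul", ("mul", "x", "y"), "z")        # the term (x . y) . z
right = ("mul", "x", ("mul", "y", "z"))       # the term x . (y . z)

S3 = list(permutations(range(3)))
same_function = all(
    evaluate(left, dict(zip("xyz", values))) == evaluate(right, dict(zip("xyz", values)))
    for values in product(S3, repeat=3)
)
print(left != right, same_function)
```

The two terms are different as strings, yet they have the same term function on this group, as associativity requires.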

Definition 3: The term algebra of type $\mathcal{G}$ over $X$, denoted $\mathbb{T}(X)$, has as its underlying set $T(X)$ and has the fundamental operations $$f^{\mathbb{T}(X)}:(p_1, \dots , p_n)\mapsto f(p_1, \dots , p_n)$$ for $f\in\mathcal{G}_n$ and $p_i\in T(X)$ for all $i\in \overline{1, n}$. (NB: Since $\mathcal{G}_0\neq\emptyset$, $\mathbb{T}(\emptyset)$ exists.)

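The term algebra $\mathbb{T}(X)$ itself is only the absolutely free algebra of type $\mathcal{G}$: it does not satisfy the three identities above, since for example $x\cdot(y\cdot z)$ and $(x\cdot y)\cdot z$ are distinct terms. The free group $F_X$ over $X$ is (up to isomorphism) the quotient $\mathbb{T}(X)/\theta$, where $\theta$ is the smallest congruence on $\mathbb{T}(X)$ such that $\mathbb{T}(X)/\theta$ is a group.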

See Example 1, p. 74, Ibid.


Please do correct me if I'm wrong.

Shaun
1

While reading about category theory in the book "Category Theory for Scientists," I started wondering about the free group construction after the part about adjunctions, in particular showing that the free monoid construction is left adjoint to the forgetful functor. I'm pretty sure that the free monoid construction can be pushed through to a free group construction; here is what I came up with.

One incarnation of the free monoid over a set $A$ is the List functor. This maps $A$ to the monoid $([], ++, A+)$, where $[]$ is the empty list and identity, $++$ is concatenation, given on atoms by $[a] {++} [b] = [a,b]$, and $A+$ is the set of all lists of elements of $A$. This is a standard construction, and will be denoted $L$.
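The universal property of the List monoid is just a fold. Here is a minimal sketch, with a target monoid and a map `f` that are arbitrary choices of mine; the name `extend` is hypothetical.

```python
# The free monoid on A as Python lists: any f: A -> M into a monoid
# (M, op, unit) extends to lists by folding. The example monoid (integers
# under addition) and the map f = len are assumptions for illustration.
from functools import reduce
from operator import add

def extend(f, op, unit):
    """The monoid homomorphism L(A) -> M determined by f on atoms."""
    return lambda items: reduce(op, (f(item) for item in items), unit)

total_length = extend(len, add, 0)    # A = strings, M = (int, +, 0)

left, right = ["ab", "c"], ["def"]
print(total_length(left + right) == total_length(left) + total_length(right))
```

The empty list is sent to the unit, and concatenation is sent to the monoid operation, which is all that being a monoid homomorphism asks for.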

To lift this to a group, the first hint for me is that, in a group, $(a*b)^{-1} = b^{-1} * a^{-1}$. So in some sense our inverse elements for lists should be the reversed list. But I'm hesitant to say that (for instance) $[a,b,b,a] = [a,b] {++} [b,a] = []$; this seems like an extreme condition for a free construction. Let's first consider $L(A) \sqcup L(A)$, the disjoint union of $L(A)$ with itself.

For ease and abuse of notation we'll let $a_1$ represent an element of the left copy of $L(A)$ in $L(A) \sqcup L(A)$, and it will also represent $a_1$ as the injected value (in $A$). If your mental "type-checker" is sufficiently tuned, this won't be ambiguous or confusing. We'll assume similarly for $a_2, l_1, l_2$, favoring $a_i$ to represent atoms and $l$ to represent lists. $\mathcal{G} = (G,*,\mathrm{id})$ will denote a group, and we'll probably not make a fuss about whether an element $e \in G$ or $e \in \mathcal{G}$. We'll let $i$ be the injection $L(A) \rightarrow L(A) \sqcup L(A)$ when we need it. Since we're lifting our monoid to a group, we will call the group operation $\overline{++}$. Now it would be nice if

$i(l_i \overline{++} l_i') = l_i {++} l_i'$

(Remember, the elements in the LHS are members of $L(A) \sqcup L(A)$, while the ones in the RHS are elements of $L(A)$). This would immediately imply that

$i([]_1 \overline{++} []_1) = []_1 {++} []_1 = [] = []_2 {++} []_2 = i([]_2 \overline{++} []_2)$

We don't yet know that $i([]_i \overline{++} []_j) = []$ when $i \neq j$. But, it seems reasonable that our desired $\overline{++}$ won't (shouldn't?) be able to tell apart an empty list from the left or right copy of $L(A)$, which is good because we only want one identity element anyways. So we have the relation:

$[]_1 = []_2 = \mathrm{id}$

Here comes the "good idea." Denote the "reverse" of a list $l$ as $l^r$. We now choose our inverse elements to finish our construction. We'll pick:

$l_1^{-1} = l_2^r$

That is, the inverse of our lists on the left are the reversed lists on the right. This, along with our desire for the injection to behave naturally on the group operation, subsumes our relation for the identity. It also makes our "inverse formula" hold:

$([a]_1 \overline{++} [b]_1)^{-1} = [a,b]_1^{-1} = [b,a]_2 = [b]_2 \overline{++} [a]_2 = [b]_1^{-1} \overline{++} [a]_1^{-1}$

Now for the part showing that this $\bar{L}$ construction from $\mathbf{Set} \rightarrow \mathbf{Grp}$ is left adjoint to the forgetful functor. This boils down to finding a unique homomorphism $\psi(f): \bar{L}(A) \rightarrow \mathcal{G}$ for every function $f: A \rightarrow G$. The apparent way is:

$\psi(f)([]) = \mathrm{id}$

$\psi(f)([a_1]) = f(a_1)$

$\psi(f)([a_2]) = f(a_1)^{-1}$

$\psi(f)([a_i,...,z_i]) = f(a_i) * ... * f(z_i)$
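One possible reading of these four equations, sketched in code. Everything here is an assumption of mine: an element of $L(A) \sqcup L(A)$ is modelled as a pair `(copy, letters)` with `copy` equal to 1 or 2, letters from the second copy are sent to inverses, and the group is the permutations of $\{0,1,2\}$. Lists that mix the two copies are not covered by this sketch.

```python
# Sketch of psi(f) on elements of L(A) ⊔ L(A), modelled as (copy, letters).
# Assumptions: the pair encoding, the target group S_3, and the sample map f.

IDENTITY = (0, 1, 2)

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def psi(f):
    """Extend f: A -> G to single-copy lists; copy 2 is sent to inverses."""
    def hom(element):
        copy, letters = element
        result = IDENTITY
        for a in letters:
            g = f[a] if copy == 1 else inverse(f[a])
            result = compose(result, g)
        return result
    return hom

def list_inverse(element):
    """The chosen inverse: the reversed list in the other copy."""
    copy, letters = element
    return (3 - copy, list(reversed(letters)))

f = {"a": (1, 0, 2), "b": (0, 2, 1)}      # an arbitrary map A -> G
h = psi(f)
ab = (1, ["a", "b"])
print(h(list_inverse(ab)) == inverse(h(ab)))
```

Under this reading the reversed list in the second copy is sent to the group inverse, matching the "inverse formula" above.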

We can end the discussion immediately by demonstrating a total $\phi: \mathrm{Homo}(\bar{L}(A), \mathcal{G}) \rightarrow \mathrm{Func}(A,G)$ such that

$\phi(\psi(f)) = f$

$\psi(\phi(h)) = h$

So say we have a homomorphism $h: \bar{L}(A) \rightarrow \mathcal{G}$, and we want a function $\phi(h): A \rightarrow G$. One obvious choice is:

$\phi(h)(a) = h([a_1])$

(Remembering that we're not really distinguishing elements of $\mathcal{G}$ and $G$). Now let's see what happens when we compose them...

$\phi(\psi(f))(a) = \psi(f)([a_1]) = f(a_1)$.

And the other way around:

$\psi(\phi(h))([a_1]) = \phi(h)(a_1) = h([a_1])$

$\psi(\phi(h))([a_2]) = \phi(h)(a_1)^{-1} = h([a_1])^{-1} = h([a_1]^{-1}) = h([a_2]) \quad \blacksquare$

I'm not sure if this gives a free-group as required by other definitions, but it seems like a straightforward construction using adjunctions.

Comments

  1. This is my first time using category theory to try and do something interesting or useful, so if I f***ed something up please DO NOT be gentle.
  2. If someone already did it better, I'd appreciate a reference.
  3. There seems to be an ironic asymmetry to the argument (ironic because we're discussing groups). I guess that the conclusion is that while an isomorphism demonstrating $\mathrm{Homo}(\bar{L}(A), \mathcal{G}) \cong \mathrm{Func}(A,G)$ is required, that isomorphism itself does not need to be unique...
  4. A cute "intuitive" interpretation of this construction is to first generate the free monoid, then put it up to a mirror, squashing the identity element with the mirror.