ABSTRACT ALGEBRA I NOTES
MICHAEL PENKAVA
1. Peano Postulates of the Natural Numbers
1.1. The Principle of Mathematical Induction. The principle of mathematical induction is usually stated as follows:
Theorem 1.1. Let Pn be a sequence of statements indexed by the positive integers
n ∈ P. Suppose that
• P1 is true.
• If Pn is true, then Pn+1 is true.
Then Pn is true for all n ∈ P.
This formulation makes the idea of mathematical induction into a property of
statements. However, in reality, there is a deeper level to this principle, as a
property of the positive integers themselves. Let us state this as a property of the
set of positive integers.
Theorem 1.2. Let S be a subset of P satisfying the following:
• 1 ∈ S.
• If n ∈ S then n + 1 ∈ S.
Then S = P.
We don’t give a proof of either of these versions of the principle of mathematical
induction. However, it is not difficult to show that both of these versions are
equivalent. That is to say, if the version described in terms of statements is true,
then the version given in terms of subsets of the positive integers is true and vice
versa. Instead, we will give an axiomatic construction of the positive integers,
including the notions of addition and multiplication of such numbers, in terms of
what are called the Peano Postulates.
Before giving this axiomatic construction, we will give some simple examples of
how to use the principle of mathematical induction to prove some explicit formulae. We begin with an apocryphal story about the mathematician Carl Friedrich
Gauss, 1777–1855, who was one of the most significant contributors to modern
mathematics.
According to the story, Gauss was an elementary school student who was constantly disrupting his class, and his teacher decided to give him a task to occupy his
time, adding up the numbers from 1 to 100. Unfortunately for his teacher, Gauss
was able to give the answer immediately: “They sum to 5050.” In various versions
of the story, his teacher doubted the answer, but Gauss was able to give a simple
explanation for his result. There are 100 numbers from 1 to 100, and they can be
divided into 50 pairs of numbers, each of which sums to 101: 1 and 100, 2 and 99,
etc. Thus, the sum is 50 · 101 = 5050.
The above reasoning is certainly clever, but we can give a very general answer to
the question of how to sum the numbers from 1 to n using mathematical induction.
In general, mathematical induction can be used to prove a conjecture, but usually
the conjecture cannot be seen by using inductive methods. This may seem strange,
that in order to determine a general result you have to first know the answer, but
this is a deep mystery of mathematics, that seeing what is true and being able to
show it are very different activities. The statement of the sum formula is as follows.
Theorem 1.3.
∑_{k=1}^{n} k = n(n + 1)/2.
Proof. We use the principle of mathematical induction. Let Pn be the statement
∑_{k=1}^{n} k = n(n + 1)/2. We first show that P1 is true. To see this, note that if n = 1,
the left hand side of P1 is simply the sum from k = 1 to 1 of k, which is just 1. On
the other hand, the right hand side of P1 is 1(1 + 1)/2, which is also equal to 1. Thus
we have shown that P1 is true.
Next, assume that Pn is true. Now Pn+1 is the statement ∑_{k=1}^{n+1} k = (n+1)(n+2)/2.
Let us compute
∑_{k=1}^{n+1} k = (n + 1) + ∑_{k=1}^{n} k = (n + 1) + n(n + 1)/2
= 2(n + 1)/2 + n(n + 1)/2 = (n + 2)(n + 1)/2 = (n + 1)(n + 2)/2.
Notice that in the second equality above, we used the statement Pn. By the principle
of mathematical induction, the statement Pn is true for all n.
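The closed form can also be checked numerically for many values of n; the short script below (the helper name triangular is ours) is a sanity check, not a substitute for the inductive proof.

```python
# Sanity check of Theorem 1.3 for n = 1, ..., 100 (a check, not a proof).
# "triangular" is our name for the closed form n(n + 1)/2.
def triangular(n: int) -> int:
    return n * (n + 1) // 2

for n in range(1, 101):
    assert sum(range(1, n + 1)) == triangular(n)

print(triangular(100))  # → 5050, Gauss's classroom sum
```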
Exercise 1.4. Prove the following statements using mathematical induction.
(1) ∑_{k=1}^{n} k² = n(n + 1)(2n + 1)/6.
(2) ∑_{k=1}^{n} k³ = n²(n + 1)²/4.
(3) 3^n > 2^n for all positive integers n.
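Each identity can be spot-checked numerically before attempting an inductive proof; the sketch below checks our reading of the three statements for small n (the upper bound 50 is an arbitrary choice).

```python
# Numerical spot-check of the three claimed statements for small n.
for n in range(1, 51):
    ks = range(1, n + 1)
    assert sum(k ** 2 for k in ks) == n * (n + 1) * (2 * n + 1) // 6  # (1)
    assert sum(k ** 3 for k in ks) == n ** 2 * (n + 1) ** 2 // 4      # (2)
    assert 3 ** n > 2 ** n                                            # (3)
print("all three statements hold for n = 1, ..., 50")
```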
1.2. The Peano Postulates. The Peano postulates for the natural numbers were
first given by the mathematician Giuseppe Peano 1858–1932, in the year 1889.
These axioms were the culmination of about a century of work in developing the
notions of arithmetic as a system of formal reasoning. Here, we will give the axioms
and constructions for the set P of positive integers. Peano’s axioms were originally
stated for the natural numbers N = {0, 1, 2, . . . }.
Definition 1.5. The positive integers P is a set with a map s : P → P, called the
successor map, satisfying:
(1) There is an element 1 ∈ P such that 1 ̸= s(n) for any n ∈ P.
(2) If s(m) = s(n) then m = n.
(3) If S is a subset of P such that
• 1 ∈ S.
• If n ∈ S, then s(n) ∈ S.
Then S = P.
It is not hard to show that if P and P′ are two such sets, then there is a unique
bijection φ : P → P′ such that φ(1) = 1 and φ(s(n)) = s(φ(n)) for all n ∈ P.
This means that in some sense, the Peano postulates uniquely determine the set
of positive integers.
The construction of all of the elementary arithmetic operations from the Peano
postulates was given in the Principia Mathematica, a three volume tome written
by the mathematicians Alfred North Whitehead, 1861–1947, and Bertrand Russell,
1872–1970, consisting of thousands of pages. Clearly, in a course on Abstract Algebra,
there is not enough time to give this kind of in-depth treatment of elementary
arithmetic, so we will only establish a few of the highlights of the material.
Theorem 1.6. If n ∈ P, then n ̸= s(n).
Proof. Let S be the set of all elements of P which are not equal to their successors,
that is all n ∈ P such that n ̸= s(n). If we can show that S = P, then the
theorem is true. First we show that 1 ∈ S. This is true because by hypothesis, 1
is not a successor of any element. Now suppose that n ∈ S. Then n ̸= s(n). If
s(n) = s(s(n)), then it would follow that n = s(n), since both n and s(n) have the
same successor. However, by assumption n ̸= s(n). Thus s(n) ̸= s(s(n)). It follows
that s(n) ∈ S. From this we conclude that S = P.
The proof above illustrates a common technique in the theory of arithmetic on
P. We use the inductive property of the natural numbers to show the property we
wish to establish. We give one more example of such a proof.
Theorem 1.7. Let n ∈ P and suppose that n ̸= s(m) for any m ∈ P. Then n = 1.
In other words, the only element of P which is not a successor is 1.
Proof. Let S be the set of all elements n ∈ P such that either n = 1 or n = s(m)
for some m ∈ P. Notice that 1 ∈ S by the definition of S. Let us suppose that n ∈ S.
Then s(n) ∈ S, since we have s(n) = s(m) where m = n. It follows that S = P.
It follows that if n ∈ P , then n ∈ S, so that if n ̸= 1, n = s(m) for some m ∈ P .
Therefore, if n ̸= s(m) for all m ∈ P, we must have n = 1.
Definition 1.8 (Recursion). A function f defined on P is said to be defined recursively if f is defined as follows. First, f (1) is explicitly given. Secondly, the value
f (s(n)) is given by some rule that depends only on the value of f (n).
The principle of mathematical induction shows that a recursive definition gives
a well-defined function, provided that the rule for f (s(n)) can always be evaluated.
The rules for addition and multiplication of positive numbers are given by recursive
definitions.
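A recursive definition in the sense of Definition 1.8 can be sketched in code; the example below (our illustration, not from the notes) defines the factorial by f(1) = 1 and f(s(n)) = s(n) · f(n):

```python
# A recursive definition in the sense of Definition 1.8 (our example):
# f(1) is given explicitly, and f(s(n)) depends only on f(n).
def f(n: int) -> int:
    if n == 1:
        return 1          # f(1) given explicitly
    return n * f(n - 1)   # f(s(m)) = s(m) * f(m), where n = s(m)

print(f(5))  # → 120, the factorial of 5
```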
Definition 1.9 (Definition of addition). Addition is a binary operation on P given
by the following rules:
• m + 1 = s(m).
• m + s(n) = s(m + n).
Notice that there can be no conflict between the two rules because 1 is not a
successor. The fact that addition is well-defined is an elementary exercise. One
shows that the set S of all n such that m + n is defined satisfies the induction
hypotheses, so is all of P. From the definition of addition, we are able to show
the properties of associativity and commutativity of addition. The order in which
these two properties are established is quite important. One of the difficulties that
Whitehead and Russell encountered in writing the Principia Mathematica was that
there is a certain natural order in which the properties need to be established, and
the difficulty is determining that natural order.
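The two rules of Definition 1.9 can be modeled in code, with the successor map represented as s(n) = n + 1 on Python integers (a modeling choice of ours; the notes define P abstractly):

```python
# Addition built only from the successor map, following Definition 1.9.
def s(n: int) -> int:
    return n + 1

def add(m: int, n: int) -> int:
    if n == 1:
        return s(m)              # m + 1 = s(m)
    return s(add(m, n - 1))      # m + s(k) = s(m + k), where n = s(k)

assert add(3, 4) == 7
```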
Theorem 1.10 (Associativity of addition).
(a + b) + c = a + (b + c),
for all positive integers a, b and c.
Proof. The first difficulty one has to overcome in this proof is that there are three
variables, but mathematical induction gives conditions for a subset S of P to be
all of P. This means we should somehow reduce our proof to a one variable proof.
One way to do this is to imagine that a and b are fixed numbers, and to show that
the set S consisting of all c ∈ P such that (a + b) + c = a + (b + c) is all of P.
First we show that 1 ∈ S. To see this, note that (a + b) + 1 = s(a + b) by the
first rule of addition. Secondly, a + (b + 1) = a + s(b) = s(a + b), by the second rule
of addition. It follows that (a + b) + 1 = s(a + b) = a + (b + 1). This shows that
1 ∈ S.
Next, suppose that c ∈ S, so that (a + b) + c = a + (b + c). Then
(a + b) + s(c) = s((a + b) + c) = s(a + (b + c)) = a + s(b + c) = a + (b + s(c)).
But this means that s(c) ∈ S. By the inductive principle of natural numbers, we
have S = P .
Finally, we note that although we fixed a and b to give this property for c, we did
not use any properties of a and b, so we finally see that the formula for associativity
holds for all positive integers a, b and c.
Theorem 1.11 (Commutativity of addition). For all positive integers m, n ∈ P,
m + n = n + m.
Sometimes, it helps to prove a technical or special case of a theorem, which will
help in the general proof, as a separate result. Such a result is usually called a
lemma. Of course, a lemma is a theorem, but we usually reserve that word for
results which are primarily useful in proving a more important result. However,
there are cases where an important result is also called a lemma, so one has to be
careful.
Lemma 1.12. For all positive integers m, m + 1 = 1 + m.
Proof of the lemma. Let S be the subset of all m ∈ P such that m + 1 = 1 + m.
Evidently 1 ∈ S, since 1 + 1 = 1 + 1. Now suppose that m ∈ S. Then
1 + s(m) = s(1 + m) = s(m + 1) = m + s(1) = m + (1 + 1) = (m + 1) + 1 = s(m) + 1.
Thus, by induction S = P, and the lemma holds.
Notice that we used associativity in the proof of this lemma, so it was important
that the associative law of addition was established first.
Proof of the theorem. Fix m ∈ P. Let S be the subset of all n ∈ P such that
m + n = n + m. Then by the lemma, 1 ∈ S. Suppose that n ∈ S. Then
m + n = n + m. As a consequence,
m + s(n) = s(m + n) = s(n + m) = n + s(m) = n + (m + 1)
= n + (1 + m) = (n + 1) + m = s(n) + m.
Thus S = P and the commutative law of addition holds.
Definition 1.13 (Definition of multiplication). Multiplication is a binary operation
on P given by the following rules:
• m · 1 = m.
• m · s(n) = m · n + m.
There are two properties of multiplication, associativity and commutativity, and
a property involving addition and multiplication called the distributive law.
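As with addition, the recursive rules for multiplication can be modeled in code; the sketch below (ours, with s(n) = n + 1 standing in for the successor map) builds multiplication on top of successor-based addition:

```python
# Multiplication built recursively on top of successor-based addition,
# following Definition 1.13.
def s(n: int) -> int:
    return n + 1

def add(m: int, n: int) -> int:
    return s(m) if n == 1 else s(add(m, n - 1))

def mul(m: int, n: int) -> int:
    if n == 1:
        return m                   # m · 1 = m
    return add(mul(m, n - 1), m)   # m · s(k) = m · k + m, where n = s(k)

assert mul(4, 6) == 24
```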
Theorem 1.14 (The distributive law). For all a, b and c in P we have
a · (b + c) = a · b + a · c
Proof. Once again, we prove this result by fixing a and b and showing that the set
S of all c ∈ P for which the equation above holds satisfies the induction hypotheses.
First, if c = 1, we note that
a · (b + 1) = a · s(b) = a · b + a = a · b + a · 1,
so 1 ∈ S. Suppose now that c ∈ S. Then
a · (b + s(c)) = a · (b + (c + 1)) = a · ((b + c) + 1)
= a · (b + c) + a · 1 = (a · b + a · c) + a · 1
= a · b + (a · c + a · 1) = a · b + a · s(c).
Thus s(c) ∈ S, so S = P, and the distributive law holds.
Actually, there are two distributive laws. The one stated above is often called
the left distributive law. The right distributive law is stated as follows:
(a + b) · c = a · c + b · c
When the commutative law of multiplication holds, the right distributive law follows
directly from the left distributive law. However, many structures with addition and
multiplication do not have a commutative multiplication, so in those cases, the left
and right distributive laws do not follow directly from each other, and other methods
of proof are necessary. It also may seem strange that we first proved the distributive
law, instead of the laws involving multiplication alone, but it will emerge that we
use the distributive law in proving the other properties. An interesting feature
of the proof of the distributive law is that all the work seems to be in moving
parentheses around. This is a key feature in proofs in algebra.
Exercise 1.15. Prove the right distributive law: For all positive integers a, b and
c, we have
(a + b) · c = a · c + b · c.
Theorem 1.16 (Associative law of multiplication). For all a, b and c in P, we
have
a · (b · c) = (a · b) · c.
Proof. As usual, we fix a and b and show that the set S of all c ∈ P such that
a · (b · c) = (a · b) · c is all of P. Now
a · (b · 1) = a · b = (a · b) · 1,
so 1 ∈ S. Now suppose c ∈ S. Then
(a · b) · s(c) = (a · b) · c + (a · b) = a · (b · c) + a · b = a · (b · c + b) = a · (b · s(c)).
Thus s(c) ∈ S, so S = P.
Notice that in the proof of the associative law of multiplication we used the left
distributive law. Finally, we are ready to prove the commutative law of multiplication.
Theorem 1.17. For all positive integers a and b we have
a · b = b · a.
To simplify the proof, we first state and prove the following lemma.
Lemma 1.18. For all positive integers a, we have a · 1 = 1 · a.
Proof of the lemma. Let S be the set of all positive integers a such that a · 1 = 1 · a.
Then 1 ∈ S because 1 · 1 = 1 · 1. Now suppose that a ∈ S. Then
1 · s(a) = 1 · a + 1 = a · 1 + 1 · 1 = (a + 1) · 1 = s(a) · 1. Thus s(a) ∈ S, and S = P.
Proof of the theorem. Fix a and let S be the set of all b ∈ P such that a · b = b · a.
Then by the lemma, 1 ∈ S. Suppose now that b ∈ S. Then
a · s(b) = a · b + a · 1 = b · a + 1 · a = (b + 1) · a = s(b) · a.
This shows that S = P so the commutative law of multiplication holds for the
positive integers.
Next, we introduce the notion of inequality for the positive integers.
Definition 1.19 (Definition of inequality). We say that a < b, a is less than b,
precisely when there is some c such that b = a + c.
Although we won’t develop the properties of inequalities, we point out that the
usual properties of inequalities involving positive integers can all be established
using the properties of addition and multiplication which we have developed thus
far. To illustrate this principle, we state and prove the following theorem.
Theorem 1.20. If a < b then a + c < b + c for any c ∈ P.
Proof. Suppose that a < b. Then there is some x ∈ P such that b = a + x. Thus
b + c = (a + x) + c = a + (x + c) = a + (c + x) = (a + c) + x.
It follows that a + c < b + c.
1.3. Well Ordering and Strong Induction.
Definition 1.21. A set X is ordered provided it is equipped with a binary relation
< satisfying:
(1) If a, b ∈ X, then exactly one of the following holds:
• a < b.
• a = b.
• b < a.
(2) If a < b and b < c then a < c.
For an ordered set X, we write a ≤ b if a < b or a = b.
Definition 1.22. An ordered set X satisfies the Principle of Strong Induction if,
given any subset S satisfying
• if x ∈ S for all x < n, then n ∈ S,
we have S = X. A subset of an ordered set X is said to be strongly inductive if it
satisfies the condition above.
One can restate the principle of strong induction in the form: X satisfies the
principle of strong induction if every strongly inductive subset is all of X.
Theorem 1.23 (Strong Induction). P satisfies the principle of strong induction.
Proof. Let S be a strongly inductive subset of P. We need to show that S = P.
To see this, we will show that a certain subset of S is already all of P. Let Y be
the subset of S consisting of all elements n ∈ S such that x ∈ S for all x < n. We
show that Y satisfies the inductive hypotheses.
First, note that x ∈ S for all x < 1, since there are no such values of x. Therefore
1 ∈ S. Furthermore, it is clear that 1 ∈ Y as well. Next, suppose that n ∈ Y . Then
for all x < n, x ∈ S, and since n ∈ S, it follows that for all x < s(n), x ∈ S. Thus
s(n) ∈ S. It follows that s(n) ∈ Y . Since Y satisfies the hypotheses of induction,
Y = P. It follows that S = P as well.
Definition 1.24. If X is an ordered set, and Q is a subset of X, then c is called a
least element of Q if c ∈ Q and c ≤ x for all x ∈ Q.
An ordered set X is well ordered or satisfies the least element property provided
that any nonempty subset Q of X has a least element.
Theorem 1.25. The set P satisfies the least element property.
Proof. Let Q be a subset of P which does not have a least element, and let S be
the subset of P consisting of all x ∈ P such that y ̸∈ Q for all y ≤ x. We show that
S satisfies the hypothesis of strong induction, which implies it is all of P. Suppose
that x ∈ S for all x < n. Then x ̸∈ Q for all x < n. If n ∈ Q, it would be the least
element of Q. Thus n ̸∈ Q, so n ∈ S. Thus S must be all of P, which forces Q to
be empty. Hence every nonempty subset of P has a least element.
2. Equivalence of forms of induction and well ordering
Both the Principle of Strong Induction and the Well Ordering Principle refer
only to an ordering on a set X. The Principle of Mathematical Induction, which we
gave as part of the Peano Postulates, and which is also known as weak induction,
requires a successor operation, and there must be a connection between the ordering
and the successor operation. We have already shown that the set of positive integers,
with the ordering given by the construction from the Peano postulates, satisfies the
Well Ordering Principle and the Principle of Strong Induction.
Theorem 2.1. Let X be an ordered set. Then X is well ordered if and only if it
satisfies the principle of strong induction.
Proof. We show that well ordering implies the principle of strong induction. We
leave the reverse direction as an exercise. Suppose that X is well ordered and S
is a strongly inductive subset of X. We must show that S = X. Let Q be the
complement of S. It is enough to show that Q must be the empty set. Suppose
that it is not empty. Then Q has a least element c. It follows that for all x < c,
x is not an element of Q, which means that x is in S. Thus for all x < c, x ∈ S.
Since S is strongly inductive, it follows that c ∈ S. But this contradicts the fact
that c ∈ Q. This shows that Q is empty.
Exercise 2.2. Show that an ordered set satisfying the principle of strong induction
is well ordered.
It can be shown that every set X can be well ordered, using the axiom of choice,
which is an axiom of a certain set theory, called Zermelo–Fraenkel set theory with
Choice, often denoted ZFC. To understand this construction would take us too far
into the
realm of set theory for this course. However, we note that if X is well ordered, then
the principle of strong induction holds, by the theorem above.
Transfinite Induction refers to proofs using the principle of strong induction on
a well ordered set. Since every set can be well ordered, transfinite induction can
be used to prove many interesting results in set theory, in particular, it is used to
study ordinal numbers.
3. The Division Algorithm
From the positive integers, the integers are constructed in a straightforward
manner, and all of the usual properties of addition, multiplication and inequalities
can be established in a routine manner. Nevertheless, the construction takes a lot
of detail and would take too long to carry out in this course. We will assume that
all of these basic properties have been shown, and will begin our analysis of the
integers with the division algorithm.
Theorem 3.1. Suppose that m, n ∈ Z and m ̸= 0. Then there are unique q, r ∈ Z
such that 0 ≤ r < |m| and
n = qm + r.
Proof. We first show uniqueness of q and r. Suppose that n = mq + r and n =
mq′ + r′, where 0 ≤ r < |m| and 0 ≤ r′ < |m|. If r = r′, it follows that mq = mq′,
so m(q − q ′ ) = 0. By the zero product property of the integers, either q − q ′ = 0
or m = 0. Since we have explicitly assumed that m ̸= 0, it follows that q − q ′ = 0,
so q = q ′ . Now, let us assume that r ̸= r′ . Then we can assume without loss of
generality that r′ > r, so that r′ − r > 0. But m(q − q′) = r′ − r, so |m||q − q′| = r′ − r.
Since r′ − r > 0, we have q ̸= q′, so |m||q − q′| ≥ |m|. But r′ − r ≤ r′ < |m|, a
contradiction. It follows that r = r′, and therefore q = q′. This proves uniqueness.
We will use the least element property of P to prove the existence of a q and r
satisfying the properties. Let X = {n − mq|q ∈ Z} ∩ P. Because m ̸= 0, X ̸= ∅.
Therefore X has a least element r. We have n = mq + r for some q. Suppose that
r ≥ |m|, and set r′ = r − |m| ≥ 0. If m > 0, then n = mq + r = m(q + 1) + r′, and if
m < 0, then n = m(q − 1) + r′. If r′ = 0, then n is a multiple of m and we may take
the remainder to be 0; otherwise r′ ∈ P, so r′ ∈ X. But this contradicts the fact that
r is the least element of X, since r′ < r. Thus r < |m|, and existence is proved.
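The division algorithm can be sketched as a small function; the version below (our implementation, with our choice of names) adjusts Python's built-in divmod, whose remainder takes the sign of m when m < 0, so that 0 ≤ r < |m| always holds:

```python
# Division algorithm (Theorem 3.1): given n and nonzero m, produce q, r
# with n = qm + r and 0 <= r < |m|.
def division_algorithm(n: int, m: int) -> tuple[int, int]:
    if m == 0:
        raise ValueError("m must be nonzero")
    q, r = divmod(n, m)
    if r < 0:        # only possible when m < 0
        q += 1       # n = m(q + 1) + (r - m), and r - m = r + |m| >= 0
        r -= m
    return q, r

q, r = division_algorithm(-7, -3)
assert -7 == q * (-3) + r and 0 <= r < 3   # q = 3, r = 2
```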
Definition 3.2. Let a, b ∈ Z. We say that a divides b, and denote this by a|b,
provided that there is some integer x such that ax = b.
Note that a|b is a statement, not a number.
Definition 3.3. Let m, n ∈ Z. Then c is called a greatest common divisor of m
and n provided that
(1) c|m and c|n.
(2) If d|m and d|n then d|c.
Notice that we did not define the greatest common divisor. In fact, in general, the
greatest common divisor is only determined up to multiplication by ±1, as we shall
show. However, this fact does allow us to define the greatest common divisor as the
unique greatest common divisor which is nonnegative, which is exactly what most
textbooks do. Note also that the definition of a greatest common divisor does not
imply that such a thing exists. It simply gives a criterion for determining whether
a number c is a greatest common divisor. It is common to write c = gcd(m, n) to
express that c is a greatest common divisor of m and n, even though there is some
ambiguity about c.
Proposition 3.4. Suppose that a|b and b|a. Then b = ±a.
Proof. Let a|b and b|a. Then there are x, y ∈ Z such that b = ax and a = by. It
follows that b = byx, so b(1 − yx) = 0. If b = 0, then a = 0, so b = a. Otherwise
we must have 1 − yx = 0, so xy = 1. In particular, x has a multiplicative inverse.
But the only integers which have a multiplicative inverse are ±1, so x = ±1, and
b = ±a.
Theorem 3.5. Let c and d be two greatest common divisors of m and n. Then
d = ±c.
Proof. Since c is a gcd of m and n, we have c|m and c|n. Since d is a gcd of m and n,
it follows that c|d. Similarly, d|c. Thus, according to Proposition 3.4, d = ±c.
Proposition 3.6. Let m ∈ Z. Then
(1) gcd(m, 0) = m.
(2) gcd(m, 1) = 1.
Exercise 3.7. Prove Proposition 3.6
Definition 3.8. Let m, n ∈ Z. Then m and n are said to be relatively prime if
gcd(m, n) = 1. In other words, 1 is a greatest common divisor of m and n.
Notice that m and 1 are relatively prime for any m ∈ Z, by Proposition 3.6.
Now we will show that given m, n ∈ Z, there is always a greatest common divisor
of m and n. In other words, greatest common divisors exist!
Theorem 3.9. Let m, n ∈ Z, and suppose that n ̸= 0. Let
X = {rm + sn|r, s ∈ Z} ∩ P.
Then X has a least element c, and this least element is a greatest common divisor
of m and n.
Moreover, for any m, n ∈ Z, if c is a gcd of m and n, then c = rm + sn for some
r, s ∈ Z.
Proof. Since n ̸= 0, |n| ∈ P. Moreover, |n| = sn where s = 1 or s = −1. Thus
|n| = 0 · m + sn ∈ X, so X is nonempty. As a consequence, it has a least element c,
and since c ∈ X, c = rm + sn for some r, s ∈ Z. Since c ̸= 0, there are unique q, d
such that 0 ≤ d < c and m = cq + d. But then d = m − cq = m − (rm + sn)q =
(1 − rq)m + (−sq)n. If d > 0, it follows that d ∈ X and d < c, which contradicts
our assumption that c is the least element of X. Thus d = 0, so m = cq. Thus c|m.
Similarly, c|n.
Now suppose that d ∈ Z satisfies d|m and d|n. Then m = xd and n = yd for
some x, y ∈ Z. Thus c = rxd + syd = (rx + sy)d. It follows that d|c. Thus c is a
gcd of m and n.
Finally, from what we have shown, when n ̸= 0, we have constructed a gcd c
of m and n which satisfies c = rm + sn for some r, s ∈ Z. If d is another gcd of
m and n, then either d = c or d = −c. But −c = (−r)m + (−s)n, so d can be
expressed in the required form. We still have to address the case when n = 0, but
then gcd(m, n) = m, so any gcd of m and n is of the form rm + sn where r = ±1
and s = 0.
Corollary 3.10. Let m, n ∈ Z. Then m and n are relatively prime if and only if
there are r, s ∈ Z such that 1 = rm + sn. In other words, we can express 1 as a
linear combination of m and n.
Proof. If m and n are relatively prime, then 1 is a gcd of m and n. Thus, by the
theorem, 1 = rm+sn for some r, s ∈ Z. On the other hand, suppose 1 = rm+sn for
some r, s ∈ Z. Now, by the theorem, the least element in X = {rm+sn|r, s ∈ Z}∩P
is a gcd of m and n, and by assumption, 1 ∈ X. It follows that 1 must be the least
element in X, so 1 is a gcd of m and n.
Theorem 3.11 (Euclidean Algorithm). Suppose that n, m ∈ Z, and n = mq + r.
Then gcd(m, n) = gcd(m, r).
Proof. Let c = gcd(m, n) and d = gcd(m, r). Then m = xd and r = yd for some
x, y ∈ Z. Thus n = mq + r = (qx + y)d, so d|n. Since d|m and c is a gcd of m
and n, it follows that d|c. Next, note that m = uc and n = vc for some u, v ∈ Z,
so r = n − mq = (v − uq)c. It follows that c|r and c|m, so c|d. Therefore d = ±c.
Thus gcd(m, n) = gcd(m, r).
It may seem that the Euclidean algorithm is not an algorithm at all, since it
does not tell one how to compute the gcd of m and n. The trick is to notice that
if we first express n = mq + r with 0 ≤ r < |m|, and then we express m = q1 r + r1
with 0 ≤ r1 < r, and continue this process, we obtain a sequence of elements
r > r1 > · · · > rn. Eventually, this process must terminate with some rn+1 = 0.
But we have gcd(m, n) = gcd(m, r) = gcd(r, r1) = · · · = gcd(rn, rn+1) = rn, since
rn+1 = 0. Thus, the Euclidean algorithm computes the gcd of m and n. In fact,
the Euclidean algorithm is efficient in this computation. Moreover, we can adapt
the Euclidean algorithm to find numbers x and y so that the gcd c of m and n
satisfies c = xm + yn. Dr. Nick Passell, a professor emeritus of the department
of mathematics at the University of Wisconsin-Eau Claire, developed an efficient
algorithm, which we illustrate below.
Let us find the gcd c of 78 and 30, as well as x and y such that c = 30x + 78y.
First make a table with 4 columns, with headings r, −q, m and n. We will use it
to keep track, in each row, of how the element r can be expressed as a linear
combination of m and n. For simplicity, we start with the largest element n = 78,
and the first row expresses that it is zero times m = 30 plus 1 times n. In the next
row, before filling in the −q column, first note that m = 1 · m + 0 · n, so put a 1 in
the m column and a 0 in the n column. Now, note that when we use the division
algorithm to express n = mq + r, with 0 ≤ r < |m|, we have q = 2, so write −2 in
the −q column, and put the r = 18 in the r column in the next row.
To figure out the m column in the current row, add the m column from 2 rows
above, and −q times the m column in the row above, and do similarly for the
n column. Then we begin again by figuring out how to express 30 in the form
30 = 18q + r. We write the −q, which in this case is −1, in the −q column, and
proceed as before. In each case, we determine the m column by adding the value
in the m column two rows above plus the −q times the value in the m column in
the row above, and similarly for the n column.
Finally, when the number c in the r column divides the number in the r column
in the row above, that r is the gcd, and the numbers we calculate in the m and
n columns become the x and y so that c = xm + yn. The complete calculation is
given in the table below.
r    −q    m    n
78          0    1
30   −2     1    0
18   −1    −2    1
12   −1     3   −1
6          −5    2
From this calculation we see that 6 is the gcd of 78 and 30, and that 6 =
−5 · 30 + 2 · 78.
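The tabular computation can be sketched in code; in the version below (our implementation, with our row layout), each row stores a remainder r together with coefficients x, y such that r = x·m + y·n, and a new row equals the row two above plus −q times the row above:

```python
# Extended Euclidean computation in the tabular style above (our code):
# each row is (r, x, y) with r = x*m + y*n.
def extended_gcd(m: int, n: int) -> tuple[int, int, int]:
    prev, cur = (n, 0, 1), (m, 1, 0)   # the first two rows of the table
    while cur[0] != 0:
        q = prev[0] // cur[0]
        # new row = row two above + (-q) * row above, entrywise
        prev, cur = cur, tuple(p - q * c for p, c in zip(prev, cur))
    return prev                        # (gcd, x, y)

g, x, y = extended_gcd(30, 78)
assert (g, x, y) == (6, -5, 2)         # matches the table: 6 = -5*30 + 2*78
```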
Definition 3.12. An element a ∈ Z is called a unit if a has a multiplicative inverse.
Of course we already know that the units in Z are precisely the numbers ±1.
Definition 3.13. Let p ∈ Z and suppose that p is not zero and not a unit. Then
• p is said to be irreducible if whenever p = ab then either a or b is a unit.
• p is said to be prime if whenever p|ab then p|a or p|b.
We will show that the notions of primeness and irreducibility coincide for the integers.
Theorem 3.14. Let p ∈ Z be prime. Then p is irreducible.
Proof. Suppose that p is prime and p = ab. Then p|ab so either p|a or p|b. Suppose
that p|a. Then a = px for some x and thus p = pxb. It follows that p(1 − xb) = 0.
Since p ̸= 0, we must have xb = 1, so b is a unit. Similarly, if p|b, then we can show
that a is a unit. It follows that p is irreducible.
Theorem 3.15. Let p be irreducible and a ∈ Z. Then either gcd(p, a) = 1 or p|a.
Proof. Let c = gcd(p, a). Then c|p, so p = cx, and a = cy for some x, y ∈ Z. If c is a
unit, then gcd(p, a) = 1. Otherwise, x is a unit, so a = cy = px⁻¹y. Thus p|a.
Theorem 3.16. Suppose that p is irreducible. Then p is prime. As a consequence,
we have p is prime if and only if p is irreducible.
Proof. Suppose p is irreducible and p|ab. Then ab = xp for some x ∈ Z. If p ∤ a,
then gcd(p, a) = 1, so 1 = rp + sa for some r, s ∈ Z. Thus
b = brp + sab = brp + sxp = (br + sx)p.
It follows that p|b. Thus p is prime.
Proposition 3.17. Suppose that a and b are relatively prime and that a|bx. Then
a|x.
Exercise 3.18. Prove the above proposition.
4. Modular Arithmetic
Modular arithmetic is also called clock arithmetic, because the rules of addition
resemble the rules for addition on a clock. In order to give a rigorous definition, we
will first introduce the notion of an equivalence relation. A relation on a set X is
a subset of X × X, that is, a set of ordered pairs (a, b) with a, b ∈ X. If we have a
relation, we often denote it by introducing some symbol R, and write xRy to mean
that (x, y) lies in the relation.
For example, the relation equality is given by the symbol “=” and we write a = b
to mean that (a, b) lies in the relation equality. Other examples of relations given
by symbols are “<”, ≤, ⊆. If ∼ is the symbol of a relation, we will usually just call
the relation ∼, rather than say that it is the symbol of the relation.
Definition 4.1. Suppose ∼ is a relation on a set X. Then ∼ is called an equivalence
relation provided that
(1) a ∼ a for all a ∈ X. (Reflexivity)
(2) If a ∼ b then b ∼ a. (Symmetry)
(3) If a ∼ b and b ∼ c then a ∼ c. (Transitivity)
Definition 4.2. If ∼ is an equivalence relation on X and b ∈ X, then the equivalence
class of b, denoted by [b], is
[b] = {a ∈ X | a ∼ b}.
The set of all equivalence classes of elements in X is denoted by X/∼ or sometimes X̄.
Theorem 4.3. Let ∼ be an equivalence relation on X. Then the following properties
hold:
(1) If a ∈ X, then a ∈ [a]. Thus [a] ̸= ∅.
(2) If [a] ∩ [b] ̸= ∅, then [a] = [b].
(3) ∪{[a] | a ∈ X} = X.
Proof. Since a ∼ a by reflexivity, it follows that a ∈ [a]. Thus [a] ̸= ∅. Suppose that
x ∈ [a] ∩ [b]. Then x ∼ b and x ∼ a. Then by symmetry, b ∼ x. Let y ∈ [b]. Then
y ∼ b, and by transitivity y ∼ x, and applying the transitive rule a second time, we
have y ∼ a. It follows that y ∈ [a]. This shows [b] ⊆ [a]. By a similar argument
[a] ⊆ [b]. Thus we must have [a] = [b]. Finally, let x ∈ X. Then x ∈ [x], so
x ∈ ∪{[a] | a ∈ X}. It follows that ∪{[a] | a ∈ X} = X.
Definition 4.4. Let X be a set and C be a collection of subsets of X. Then C is
said to be a partition of X provided that
(1) If A ∈ C, then A ̸= ∅.
(2) If A and B are in C, and A ∩ B ̸= ∅, then A = B.
(3) If x ∈ X then x ∈ A for some A ∈ C.
Theorem 4.5. If ∼ is an equivalence relation on a nonempty set X, then the
collection X̄ of equivalence classes is a partition of X.
Exercise 4.6. Prove the above theorem.
Definition 4.7. Let n ∈ Z and define a relation on Z by
x = y (mod n) if y − x = kn for some k ∈ Z.
Theorem 4.8. The relation = (mod n) is an equivalence relation.
Proof. First, note that a = a (mod n), because a − a = 0 = 0 · n. Suppose that
a = b (mod n), so b − a = kn for some k ∈ Z. But then a − b = (−k)n, which shows
that b = a (mod n). Finally, suppose that a = b (mod n) and b = c (mod n).
Then b − a = kn and c − b = ln for some k, l ∈ Z. Thus
c − a = (c − b) + (b − a) = ln + kn = (l + k)n.
It follows that a = c (mod n).
Definition 4.9. For the equivalence relation = (mod n), the set of equivalence
classes is denoted by Zn . (Some authors denote it by Z/nZ.)
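As a computational aside (not part of the formal development), each class in Zn has a unique representative in {0, . . . , n − 1}, and two integers lie in the same class exactly when n divides their difference. A small Python sketch of this, with function names of our own choosing:

```python
def residue(a, n):
    """Canonical representative in {0, ..., n-1} of the class of a mod n."""
    return a % n

def same_class(a, b, n):
    """True when a and b lie in the same equivalence class mod n,
    i.e. when b - a = kn for some integer k."""
    return (b - a) % n == 0
```

For example, residue(-5, 7) is 2, and same_class(3, 17, 7) holds since 17 − 3 = 2 · 7.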
Theorem 4.10. There is a well-defined binary operation + on Zn given by
\overline{a} + \overline{b} = \overline{a + b}.
Moreover, this operation satisfies the following properties.
(1) \overline{a} + (\overline{b} + \overline{c}) = (\overline{a} + \overline{b}) + \overline{c}. (Associativity)
(2) \overline{a} + \overline{b} = \overline{b} + \overline{a}. (Commutativity)
(3) \overline{a} + \overline{0} = \overline{a}. (Existence of additive identity)
(4) \overline{a} + \overline{−a} = \overline{0}. (Existence of additive inverse)
Proof. It turns out that the hard part is showing that the addition is well defined.
What causes the problem is that the class \overline{a} does not determine the element a. What
the operation actually says is that to add two classes, we take arbitrary elements
a and b out of them and form the class \overline{a + b}. The problem is that we need to
show that the class \overline{a + b} does not depend on the choice of a and b.
To do this, let a1 ∈ \overline{a} and b1 ∈ \overline{b}. We need to show that \overline{a1 + b1} = \overline{a + b}. Now
a1 = a (mod n) and b1 = b (mod n), so a − a1 = kn and b − b1 = ln for some
k, l ∈ Z. It follows that
(a + b) − (a1 + b1) = (a − a1) + (b − b1) = kn + ln = (k + l)n.
Thus a + b = a1 + b1 (mod n). It follows that a + b ∈ \overline{a1 + b1}, and since
a + b ∈ \overline{a + b}, we see that \overline{a1 + b1} ∩ \overline{a + b} ≠ ∅. Therefore \overline{a1 + b1} = \overline{a + b}.
This shows that addition is well defined.
Now, to show the associative law, we proceed as follows:
\overline{a} + (\overline{b} + \overline{c}) = \overline{a} + \overline{b + c} = \overline{a + (b + c)} = \overline{(a + b) + c} = \overline{a + b} + \overline{c} = (\overline{a} + \overline{b}) + \overline{c}.
To show commutativity:
\overline{a} + \overline{b} = \overline{a + b} = \overline{b + a} = \overline{b} + \overline{a}.
Next, we compute
\overline{a} + \overline{0} = \overline{a + 0} = \overline{a}.
Finally,
\overline{a} + \overline{−a} = \overline{a + (−a)} = \overline{0}.
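The well-definedness just proved is easy to spot-check numerically: replacing a and b by other representatives a + kn and b + ln of the same classes does not change the class of the sum. An illustrative Python check (classes are modeled by their canonical residues in {0, . . . , n − 1}):

```python
def add_mod(a, b, n):
    """The class of a + b in Z_n, represented by its canonical residue."""
    return (a + b) % n

# Changing representatives leaves the class of the sum unchanged.
n = 12
for k in range(-3, 4):
    for l in range(-3, 4):
        assert add_mod(7 + k * n, 9 + l * n, n) == add_mod(7, 9, n)
```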
Theorem 4.11. There is a well-defined binary operation · on Zn , called multiplication, given by
\overline{a} · \overline{b} = \overline{ab}.
This operation satisfies the following properties:
(1) \overline{a} · (\overline{b} · \overline{c}) = (\overline{a} · \overline{b}) · \overline{c}. (Associativity)
(2) \overline{a} · \overline{b} = \overline{b} · \overline{a}. (Commutativity)
(3) \overline{a} · (\overline{b} + \overline{c}) = \overline{a} · \overline{b} + \overline{a} · \overline{c}. (Distributive Law)
(4) \overline{a} · \overline{1} = \overline{a}. (Existence of a multiplicative identity)
Proof. As usual, well definedness is the hard part. Suppose that a1 = a (mod n)
and b1 = b (mod n). We need to show that a1 b1 = ab (mod n). Now a1 = a + kn
and b1 = b + ln for some k, l ∈ Z. Thus
a1 b1 − ab = a1 b1 − a1 b + a1 b − ab = a1 (b1 − b) + (a1 − a)b = a1 ln + knb = (a1 l + kb)n.
Thus a1 b1 = ab (mod n) and it follows that multiplication is well defined.
The properties are straightforward to show and are left as an exercise.
Theorem 4.12. Let a ∈ Z. Then \overline{a} is a unit in Zn precisely when gcd(a, n) = 1.
In that case, if we express 1 = xa + yn, then (\overline{a})^{−1} = \overline{x}. In particular, Zp is a field
if and only if p is prime.
Exercise 4.13. Prove the theorem above.
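Theorem 4.12 is constructive: the coefficients x, y with 1 = xa + yn come from the extended Euclidean algorithm, and \overline{x} is then the inverse of \overline{a}. A Python sketch of this computation (the function names are our own):

```python
def ext_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and x*a + y*b = g."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = ext_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def inverse_mod(a, n):
    """Inverse of the class of a in Z_n, or None if gcd(a, n) != 1."""
    g, x, _ = ext_gcd(a % n, n)
    return x % n if g == 1 else None
```

For instance, inverse_mod(3, 7) returns 5, since 3 · 5 = 15 = 1 (mod 7), while inverse_mod(6, 9) returns None because gcd(6, 9) = 3.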
Theorem 4.14 (Freshman Exponentiation). Let p ∈ P be prime and a, b ∈ Z.
Then (a + b)^p = a^p + b^p (mod p).

Proof. Recall the binomial theorem for n ∈ P:
(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n−k} b^k .
Note that \binom{n}{k} = \frac{n!}{k!(n−k)!} and that \binom{n}{k} ∈ P. Now suppose that n = p is prime.
For any 1 ≤ x < p, we have gcd(x, p) = 1. But this means that gcd(k!, p) = 1 and
gcd((p − k)!, p) = 1 if 1 ≤ k < p. Therefore gcd(k!(p − k)!, p) = 1 if 1 ≤ k < p,
and since k!(p − k)! divides p!, it follows that k!(p − k)! divides (p − 1)!. But this
means that p divides \binom{p}{k}, and thus \binom{p}{k} = 0 (mod p) for 1 ≤ k < p. It follows that
every term in the binomial formula is equal to zero mod p except for the terms with
k = 0 and k = p. The term corresponding to k = 0 is a^p and the term corresponding
to k = p is b^p . This gives the exponentiation formula in the theorem.
Theorem 4.15 (Fermat's Little Theorem). Suppose that p ∈ P is prime. Then if
a ∈ Z, a^p = a (mod p). In particular, if gcd(a, p) = 1, then a^{p−1} = 1 (mod p).

Proof. We first show the statement is true whenever a ∈ P. For a = 1, the statement
is trivial. Suppose that a^p = a (mod p). Then
(a + 1)^p = a^p + 1^p = a + 1 (mod p).
Thus by induction, we see that the statement is true for all a ∈ P.
Next, note that 0^p = 0, so the statement holds for a = 0. If p is odd and
a ∈ P, we have (−a)^p = (−1)^p a^p = −a (mod p), so the statement holds when
a < 0. Thus we only have to handle the case when a < 0 and p = 2. But −a = a
(mod 2), since −a − a = −2a is divisible by 2. Thus (−a)^2 = a^2 = a = −a (mod 2).
Thus the statement holds when p = 2 and a < 0.
Finally, suppose that gcd(a, p) = 1. Now a^p = a (mod p), so a(a^{p−1} − 1) = 0
(mod p). Since Zp is a field and a ≠ 0 (mod p), it follows that a^{p−1} = 1 (mod p).
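Fermat's little theorem is easy to test numerically; the illustrative check below uses Python's built-in fast modular exponentiation:

```python
def fermat_holds(a, p):
    """Check a^p = a (mod p) using three-argument pow."""
    return pow(a, p, p) == a % p

# The congruence holds for every integer a when p is prime...
assert all(fermat_holds(a, p) for p in (2, 3, 5, 7, 13) for a in range(-20, 21))
# ...but it can fail for composite moduli: 2^4 = 16 = 0 (mod 4), not 2.
assert not fermat_holds(2, 4)
```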
Theorem 4.16 (Chinese Remainder Theorem). Suppose that m and n are relatively
prime and a, b ∈ Z. Then there is an x ∈ Z such that
x = a (mod m),
x = b (mod n).
Proof. If there is an x satisfying the statement above, then x = a + km and x = b + ln
for some k, l ∈ Z, so that a + km = b + ln; that is, b − a = km − ln. Thus it
suffices to express b − a in this form. Since gcd(m, n) = 1, we know that
1 = rm + sn for some r, s ∈ Z. It follows that
b − a = (b − a)rm + (b − a)sn.
Thus if we set k = (b − a)r and l = (a − b)s, then b − a = km − ln, and
x = a + km = b + ln satisfies both congruences.
Theorem 4.17 (General Chinese Remainder Theorem). Let m1 , . . . , mn ∈ Z be
pairwise coprime; that is, gcd(mi , mj ) = 1 if i ≠ j. Let a1 , . . . , an ∈ Z. Then there
is an integer x such that x = ai (mod mi ) for i = 1, . . . , n.

Proof. Let M = m1 · · · mn , and let Mi = M/mi be the product of the mj for j ≠ i,
so that mi Mi = M . Moreover, mi and Mi are relatively prime, so there are integers
ri , si such that ri mi + si Mi = 1. Let ei = si Mi . Then ri mi + ei = 1, so ei = 1
(mod mi ). Moreover, if j ≠ i, then mj | Mi , so mj | ei , and ei = 0 (mod mj ). Let
x = a1 e1 + · · · + an en . It follows that
x = ai (mod mi ) for all i.
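The proof is constructive and translates directly into code. The sketch below (our own function names; it relies on Python 3.8+ for math.prod and the modular-inverse form of pow) builds the elements ei = si Mi exactly as in the proof:

```python
from math import prod

def crt(residues, moduli):
    """Solve x = a_i (mod m_i) for pairwise coprime moduli,
    following the e_i construction in the proof above."""
    M = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        # s_i = inverse of M_i mod m_i, so e_i = 1 (mod m_i), 0 (mod m_j).
        e_i = pow(M_i, -1, m_i) * M_i
        x += a_i * e_i
    return x % M
```

For example, crt([2, 3, 2], [3, 5, 7]) returns 23, the unique solution mod 105.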
Exercise 4.18. Suppose that gcd(a, n) = 1. Show that the equation ax = b
(mod n) has a solution for any b ∈ Z. Moreover, if \overline{x} is the equivalence class
(mod n) of a particular solution x to the equation, then the solutions to the equation are precisely the elements of \overline{x}.

Exercise 4.19. Let gcd(a, n) = c, and express a = ca′ and n = cn′ . Show that the
equation ax = b (mod n) has a solution if and only if c | b. In that case, if b = cb′ ,
and x is a solution to a′ x = b′ (mod n′ ), then x is a solution to ax = b (mod n).
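Exercise 4.19 also has a computational reading: ax = b (mod n) is solvable exactly when c = gcd(a, n) divides b, and then there are exactly c solutions mod n, spaced n′ = n/c apart. A hedged sketch (the function name and return convention are ours; Python 3.8+ for the modular-inverse form of pow):

```python
from math import gcd

def solve_linear_congruence(a, b, n):
    """All solutions of a*x = b (mod n) in {0, ..., n-1}.
    Solvable iff gcd(a, n) divides b, as in Exercise 4.19."""
    c = gcd(a, n)
    if b % c != 0:
        return []
    a1, b1, n1 = a // c, b // c, n // c
    x0 = (pow(a1, -1, n1) * b1) % n1   # gcd(a1, n1) = 1, so a1 is invertible
    return [x0 + k * n1 for k in range(c)]
```

For instance, 6x = 4 (mod 10) has the two solutions x = 4 and x = 9.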
5. Permutations
Definition 5.1. If f : X → Y is a map, then
• f is injective if f (x) = f (x′ ) implies that x = x′ .
• f is surjective if given any y ∈ Y , there is some x ∈ X such that f (x) = y.
f is said to be a bijection if f is both injective and surjective.
Theorem 5.2. Suppose that f : Y → Z and g : X → Y are maps. Then
• If f and g are both injective then f ◦ g is injective.
• If f and g are both surjective then f ◦ g is surjective.
• If f and g are both bijective then f ◦ g is bijective.
If h : W → X is another map, then
(f ◦ g) ◦ h = f ◦ (g ◦ h).
Proof. Suppose that both f and g are injective and (f ◦ g)(x) = (f ◦ g)(x′ ). Then
f (g(x)) = f (g(x′ )), and since f is injective it follows that g(x) = g(x′ ). But then,
since g is injective, we see that x = x′ . Thus f ◦ g is injective.
Next, suppose that f and g are surjective, and let z ∈ Z. Since f is surjective,
there is some y ∈ Y such that f (y) = z. Since g is surjective, there is some
x ∈ X such that g(x) = y. Then (f ◦ g)(x) = f (g(x)) = f (y) = z. Thus f ◦ g is
surjective.
Putting the two results together, we see that if f and g are bijective, then f ◦ g
is bijective.
Finally, the associativity of function composition is easy to see and is left as an
exercise to the reader.
Definition 5.3. Let X be a set. Then the set SX = {f : X → X | f is bijective}
is called the permutation group of X. The permutation group of n = {1, . . . , n} is
denoted simply as Sn .
Often, the permutation group of n is denoted by Σn instead of Sn .
Theorem 5.4. Function composition is a well defined binary operation SX ×SX →
SX . This operation, called the product of permutations, is usually denoted by juxtaposition instead of the composition symbol ◦. It satisfies the following properties.
(1) (στ )ϕ = σ(τ ϕ). (associativity)
(2) The identity map 1X , defined by 1X (x) = x, is a permutation and
σ · 1X = 1X · σ = σ for all σ ∈ SX . (Existence of identity)
(3) The inverse map σ −1 to σ, defined by σ −1 (y) = x if and only if σ(x) = y
is a permutation of X and
σ · σ −1 = σ −1 · σ = 1X . (Existence of inverse)
Proof. Since the composition of bijections is a bijection, we see that the product of
permutations is well defined. Since function composition is associative, the product
is associative. Clearly, 1X is a bijection. We have (σ · 1X )(x) = σ(1X (x)) = σ(x),
for any x ∈ X. Thus σ · 1X = σ. Similarly, 1X · σ = σ.
The identity element in Sn is often denoted as e, since the notation 1SX is
cumbersome. Note that with this notation, there is some ambiguity about which
Sn the element e belongs to, which needs to be determined by context.
Definition 5.5 (Matrix Notation for Permutations). If σ ∈ Sn , then the matrix
notation for σ is
\begin{pmatrix} 1 & \cdots & n \\ \sigma(1) & \cdots & \sigma(n) \end{pmatrix}.
Definition 5.6. Let a1 , . . . , ak be a sequence of distinct elements of X. Then the
cycle σ associated to the sequence is the map σ : X → X given by
\sigma(x) = \begin{cases} a_{i+1} & \text{if } x = a_i \text{ and } 1 \le i < k, \\ a_1 & \text{if } x = a_k, \\ x & \text{if } x \notin \{a_1, \dots, a_k\}. \end{cases}
We say that the cycle σ has length k, and we denote it by σ = (a1 , . . . , ak ). If
τ = (b1 , . . . , bℓ ) is another cycle, then the cycles σ and τ are said to be disjoint if
the sets {a1 , . . . , ak } and {b1 , . . . , bℓ } are disjoint.
Exercise 5.7. Show that a cycle σ : X → X is actually a permutation of X.
Theorem 5.8. The product of disjoint cycles commutes.
Proof. Let σ = (a1 , . . . , ak ) and τ = (b1 , . . . , bℓ ) be two disjoint cycles. Let ϕ = στ
and ψ = τ σ. Let x ∈ X. Then exactly one of three possibilities holds:
x ∈ {a1 , . . . , ak }, x ∈ {b1 , . . . , bℓ }, or x ∉ {a1 , . . . , ak , b1 , . . . , bℓ }.
Let us examine what happens in each case.
Case 1: x ∈ {a1 , . . . , ak }. In this case σ(x) ∉ {b1 , . . . , bℓ }, so ϕ(x) = τ (σ(x)) =
σ(x). Moreover, τ (x) = x, so ψ(x) = σ(τ (x)) = σ(x). Thus ϕ(x) = ψ(x).
Case 2: x ∈ {b1 , . . . , bℓ }. In this case τ (x) ∉ {a1 , . . . , ak }, so ψ(x) = σ(τ (x)) =
τ (x). Moreover, σ(x) = x, so ϕ(x) = τ (σ(x)) = τ (x). Thus ϕ(x) = ψ(x).
Case 3: x ∉ {a1 , . . . , ak , b1 , . . . , bℓ }. In this case, both σ(x) = x and τ (x) = x, so
ψ(x) = x = ϕ(x).
Since ϕ(x) = ψ(x) for all x, we see that σ and τ commute.
We can generalize the result above and combine it with the associative law to see
that if σ1 , . . . , σm is a sequence of pairwise disjoint cycles, then their product does
not depend on the order of multiplication.
Theorem 5.9. If X is a nonempty finite set, then every permutation of X can be
written as a product of disjoint cycles so that every element of X appears in one of
the cycles. Moreover, this product is unique up to order.
Exercise 5.10. Prove the above theorem.
Note that there is some ambiguity about which Sn a permutation written in
disjoint cycle notation belongs to. For example, σ = (1, 3, 2) might belong to Sn
for any n ≥ 3. Sometimes this ambiguity is advantageous. Note that there is no
ambiguity about the n when a permutation is expressed in matrix notation.
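The passage from matrix notation to disjoint cycle notation is a simple orbit-tracing procedure, and it can be sketched in Python (an illustration only; we represent the matrix notation for σ ∈ Sn by the list of images [σ(1), . . . , σ(n)], and singleton cycles are dropped, as is customary):

```python
def cycle_decomposition(perm):
    """Disjoint cycle decomposition of a permutation of {1, ..., n}.
    perm maps i to perm[i - 1]; singleton cycles are dropped."""
    n = len(perm)
    seen = set()
    cycles = []
    for start in range(1, n + 1):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        x = perm[start - 1]
        while x != start:
            cycle.append(x)
            seen.add(x)
            x = perm[x - 1]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles
```

Applied to the permutation of Exercise 5.15 below, cycle_decomposition([3, 5, 4, 6, 1, 8, 7, 2]) returns [(1, 3, 4, 6, 8, 2, 5)].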
Theorem 5.11. Let σ = (a1 , . . . , ak ) be a cycle. Then σ −1 = (ak , . . . , a1 ). In
other words, to compute the inverse of a cycle, you just reverse the order of the
elements in the cycle.
Exercise 5.12. Prove the theorem above.
Theorem 5.13. If σ, τ ∈ SX , then (στ )−1 = τ −1 σ −1 .
Proof. If f : X → Y and g : Y → X, then we know that g = f −1 precisely when
g ◦ f = 1X and f ◦ g = 1Y . Thus we compute
(στ )(τ −1 σ −1 ) = σ(τ τ −1 )σ −1 = σ · 1X σ −1 = σσ −1 = 1X .
Similarly, (τ −1 σ −1 )(στ ) = 1X . Thus (στ )−1 = τ −1 σ −1 .
In the study of linear algebra, you learned that if A, B are n × n matrices,
then (AB)−1 = B −1 A−1 . The rule for computing the inverse of a product of
permutations is analogous to the rule for matrix inverse computation.
If you combine the rule for computing the inverse of a cycle, and the rule for
computing the inverse of a product of permutations, one obtains a simple method
for computing the inverse of a product of any number of cycles, whether they are
disjoint or not. In the case when one has a product of disjoint cycles, this gives a
very simple method of computing the inverse.
Example 5.14. Let σ = (1, 3, 5, 6)(2, 4, 8). Then σ −1 = (6, 5, 3, 1)(8, 4, 2). Notice
that we don’t have to reverse the order because the two cycles are disjoint, so their
inverses are also disjoint, and thus can be multiplied in any order.
It is also easy to multiply permutations which are expressed in cycle notation.
In fact, one can compute the product of a number of permutations in a very quick
fashion. It is also easy to convert the matrix notation for a permutation into disjoint
cycle notation.
Exercise 5.15. Let σ = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 3 & 5 & 4 & 6 & 1 & 8 & 7 & 2 \end{pmatrix}. Then σ = (1, 3, 4, 6, 8, 2, 5).
Notice that σ = (1, 3, 4, 6, 8, 2, 5)(7) as well, but it is customary to drop the singleton cycles from the expression for σ, as they are not necessary. Now let τ =
(1, 4, 5)(2, 3)(7, 8) be a permutation in S8 expressed in cycle notation. Then the
matrix notation for τ is
τ = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 4 & 3 & 2 & 5 & 1 & 6 & 8 & 7 \end{pmatrix}.
To find the product of σ and τ write
στ = (1, 3, 4, 6, 8, 2, 5)(1, 4, 5)(2, 3)(7, 8) = (1, 6, 8, 7, 2, 4)(3, 5).
To calculate this, we first note that when computing the product of permutations,
you must remember that the permutation on the right acts first. To get the right
hand side of the equation, you first start a cycle with any number. We started with
1, so we first wrote (1. Now reading from right to left, we track down where 1 goes
to. First, the cycle (1, 4, 5) acts on 1 taking it to 4. Then the cycle (1, 3, 4, 6, 8, 2, 5)
takes 4 to 6. Thus we put a comma, followed by a 6, so we have (1, 6 so far. Next
we do the same thing as we did with 1, but starting with 6, and find that 6 goes to
8. We continue in this manner until we have (1, 6, 8, 7, 2, 4. When we repeat the
process with 4, we find 4 goes to 5 which then goes to 1. Since 4 goes to 1, which
is the first element in the cycle, we have computed a cycle in the product. Next, we
look for a number which is not in the first cycle. 3 is such a number, so we can
start a new cycle with (3. In this manner, we compute the product.
Note that the method above can be applied when multiplying more than two
permutations together, so it is a very efficient way of computing products of
permutations. One might ask: if the disjoint cycle notation is so advantageous
for computing inverses and products of cycles, what is the value of the matrix
notation? It turns out that the matrix notation has some applications which do
not arise in a course in abstract algebra, and it is a common notation as
well, so it is valuable to learn.
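The right-to-left tracing procedure described above mechanizes easily. A Python sketch (our own representation: each cycle is a tuple, the rightmost cycle acts first, and the result is returned in disjoint cycle form with singletons dropped):

```python
def multiply_cycles(cycles, n):
    """Multiply a list of cycles (rightmost acts first) on {1, ..., n};
    return the product as a list of disjoint cycles."""
    def image(x):
        # Trace x through the cycles from right to left, as in the text.
        for cycle in reversed(cycles):
            if x in cycle:
                x = cycle[(cycle.index(x) + 1) % len(cycle)]
        return x
    seen, result = set(), []
    for start in range(1, n + 1):
        if start in seen:
            continue
        orbit = [start]
        seen.add(start)
        y = image(start)
        while y != start:
            orbit.append(y)
            seen.add(y)
            y = image(y)
        if len(orbit) > 1:
            result.append(tuple(orbit))
    return result
```

This reproduces the worked product above: multiplying (1, 3, 4, 6, 8, 2, 5), (1, 4, 5), (2, 3), (7, 8) in S8 yields (1, 6, 8, 7, 2, 4)(3, 5).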
Definition 5.16. If σ ∈ SX , then the order of σ, denoted o(σ), is the least positive
integer m such that σ^m = 1X . If there is no such integer, then we say that the
order of σ is ∞ and write o(σ) = ∞.
Theorem 5.17. Let σ = (a1 , . . . , ak ) be a cycle. Then o(σ) = k.
Proof. Suppose that 1 ≤ i < k. Then it is a straightforward induction to see that
σ^i (a1 ) = ai+1 . Since ai+1 ≠ a1 , it follows that σ^i ≠ e. On the other hand,
σ^k (a1 ) = σ(σ^{k−1} (a1 )) = σ(ak ) = a1 . Since σ = (aj , . . . , ak , a1 , . . . , aj−1 ) for any
1 ≤ j ≤ k, it follows that σ^k (aj ) = aj for any 1 ≤ j ≤ k. Moreover, if
x ∉ {a1 , . . . , ak }, then σ(x) = x, so σ^k (x) = x. It follows that σ^k = e, and
therefore o(σ) = k.
Recall that if n1 , . . . , nℓ ∈ P, then lcm(n1 , . . . , nℓ ) is the least common multiple
of n1 , . . . , nℓ . It is the smallest positive integer x such that ni |x for all i = 1, . . . , ℓ.
Corollary 5.18. Let σ = σ1 · · · σm be a product of disjoint cycles σ1 , . . . , σm . Then
o(σ) = lcm(o(σ1 ), . . . , o(σm )).
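Corollary 5.18 gives an immediate way to compute orders: for σ = (1, 3, 5, 6)(2, 4, 8) from Example 5.14, o(σ) = lcm(4, 3) = 12. A one-line Python sketch (illustrative; the cycles are assumed disjoint, and math.lcm requires Python 3.9+):

```python
from math import lcm

def order_from_cycles(cycles):
    """Order of a product of disjoint cycles: the lcm of their lengths."""
    return lcm(*(len(c) for c in cycles))
```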
Definition 5.19. A cycle of the form (a1 , a2 ) is called a transposition.
Theorem 5.20. Let k > 1 and σ = (a1 , . . . , ak ) be a cycle. Then
σ = (a1 , a2 )(a2 , a3 ) · · · (ak−1 , ak ).
As a consequence, any element of Sn can be written as a product of transpositions
when n > 1.

Proof. That σ = (a1 , a2 )(a2 , a3 ) · · · (ak−1 , ak ) is a matter of calculation. If σ is not
the identity, then it can be written as a product of disjoint cycles, each of which has
length at least 2. Thus, after factoring each of them as a product of transpositions,
we have found a factorization of σ in the desired form. It remains to consider the
case when σ = e. But e = (1, 2)(1, 2), so it is a product of transpositions.
Definition 5.21. Let n > 1. Then a permutation σ ∈ Sn is said to be even if it
can be expressed as a product of an even number of transpositions. A permutation
which is not even is said to be odd.
Note that if n > 1, then a permutation which is odd can be expressed as a
product of an odd number of transpositions, since every permutation is a product of
transpositions. What is not so obvious is that a permutation can not be expressed
both as a product of an even number of transpositions and an odd number of
transpositions. In order to prove this fact, we need to develop some properties of
permutations.
Definition 5.22. Suppose that σ ∈ Sn can be expressed as a product of k disjoint
cycles so that every number 1 ≤ i ≤ n appears in one of the cycles. Then the orbit
number of σ is n − k.
Notice that since the decomposition of σ into such a product is unique up to the
order of the cycles, the orbit number is well defined. Also, the orbit number of the
identity element e is zero, since it is a product of n disjoint cycles.
Theorem 5.23. Suppose that σ ∈ Sn and let τ = (a, b) be a transposition. Then the
orbit number of στ is either 1 larger or 1 smaller than the orbit number of σ. More
precisely, if a and b lie in the same cycle of σ, then στ has one more orbit than σ,
and if a and b lie in different cycles of σ, then στ has one less orbit.
Proof. Suppose that a and b belong to the same orbit of σ. We can suppose
that σ = (a, a1 , . . . , ak , b, b1 , . . . , bℓ ), as the other cycles in σ will not influence the
outcome of the product. Then
στ = (a, a1 , . . . , ak , b, b1 , . . . , bℓ )(a, b) = (b, a1 , . . . , ak )(a, b1 , . . . , bℓ ),
so στ has one more orbit.
Next, suppose that a and b belong to different orbits. Then we can suppose that
σ = (a, a1 , . . . , ak )(b, b1 , . . . , bℓ ). In this case, we have
στ = (a, a1 , . . . , ak )(b, b1 , . . . , bℓ )(a, b) = (a, b1 , . . . , bℓ , b, a1 , . . . , ak ),
so that στ has one less orbit.
Corollary 5.24. If n > 1, then an element σ ∈ Sn has a factorization as a product
of an even number of transpositions or an odd number of transpositions, but not
both. In fact, σ is even precisely when its orbit number is even. Moreover, we have
the following:
• The product of two even elements is even.
• The product of an even element and an odd element in either order is an
odd element.
• The product of two odd elements is even.
• The inverse of an even element is even.
• The inverse of an odd element is odd.
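Corollary 5.24 means the parity of σ can be read off from its orbit number, without ever factoring σ into transpositions. A Python sketch (illustrative; the list-of-images representation is used again, so perm[i - 1] = σ(i)):

```python
def orbit_number(perm):
    """Orbit number n - k of a permutation of {1, ..., n} given as a list
    of images, where k is the number of orbits (including fixed points)."""
    n = len(perm)
    seen = set()
    k = 0
    for start in range(1, n + 1):
        if start in seen:
            continue
        k += 1
        x = start
        while x not in seen:
            seen.add(x)
            x = perm[x - 1]
    return n - k

def is_even(perm):
    """A permutation is even precisely when its orbit number is even."""
    return orbit_number(perm) % 2 == 0
```

For example, a 3-cycle has orbit number 2 and is even, while a single transposition has orbit number 1 and is odd.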
6. Groups
Definition 6.1. A set G, equipped with a binary operation ⋆, called the product
or group operation, is called a group provided that
(1) a ⋆ (b ⋆ c) = (a ⋆ b) ⋆ c for all a, b, c ∈ G. (Associativity)
(2) There is an element e ∈ G such that a ⋆ e = e ⋆ a = a for all a ∈ G.
(Existence of identity)
(3) Given a ∈ G, there is some b ∈ G such that a ⋆ b = b ⋆ a = e. (Existence of
inverse)
Frequently, the group operation is indicated by juxtaposition; i.e., we write gh
instead of g ⋆ h. If we wish to emphasize the group operation, we sometimes say
(G, ⋆) is a group. This may be important when the set G is equipped with more
than one operation. It is also common for the operation to be written as +, but
in that case, almost always we require the operation to be commutative, which we
define below.
Definition 6.2. A group G with product ⋆ is said to be commutative provided
that a ⋆ b = b ⋆ a for all a, b ∈ G.
Examples of commutative groups are (Z, +), (Zn , +), (Q, +), (R, +), and any
vector space over any field k with the operation of addition. In all of these cases
the identity element is called 0. Commutative groups whose group operation is not
written as + are (Z∗ , ·), (Z∗n , ·), (R∗ , ·), where the ∗ means the subset of elements
invertible under the group operation.
The set GL(n, k) of invertible n × n matrices with coefficients in a field k is
a group under matrix multiplication, which is not commutative if n > 1. The
permutation group SX is a group under composition of maps, which is also not
commutative when X has more than two elements.
A careful reading of the definition of a group reveals that it does not state that
there is only one identity element or one element satisfying the inverse property.
Luckily, we can prove this uniqueness of identity and inverse.
Theorem 6.3 (Uniqueness of Identity). Suppose that G is a group and e, e′ both
satisfy the condition of identity in the second axiom of a group. Then e = e′ . In
fact, if e is the identity and e′ ⋆ a = a or a ⋆ e′ = a for some a ∈ G, then e′ = e.
Proof. Suppose that e ⋆ a = a ⋆ e = a for all a ∈ G, and that e′ ⋆ a = a, for some
a ∈ G. By the third axiom of groups, there is some b ∈ G such that a ⋆ b = e. Then
e′ = e′ ⋆ e = e′ ⋆ (a ⋆ b) = (e′ ⋆ a) ⋆ b = a ⋆ b = e.
The proof is similar if we assume a ⋆ e′ = a for some a ∈ G.
Theorem 6.4 (Uniqueness of Inverse). Suppose that G is a group, a, b ∈ G and
a ⋆ b = b ⋆ a = e. Let b′ ∈ G satisfy b′ ⋆ a = e or a ⋆ b′ = e. Then b′ = b.
Proof. Let a, b be as in the statement of the theorem, and suppose that b′ ⋆ a = e.
Then
b′ = b′ ⋆ e = b′ ⋆ (a ⋆ b) = (b′ ⋆ a) ⋆ b = e ⋆ b = b.
A similar argument holds when a ⋆ b′ = e.
As a consequence of the above theorem, we can give the definition below.
Definition 6.5. If G is a group with identity e and g ∈ G, then the inverse of g is
the unique element h such that g ⋆ h = h ⋆ g = e. When the group operation of G
is written in some multiplicative form (either by juxtaposition or ⋆), we denote the
inverse of g by g −1 . When the group operation of G is a commutative operation
written as +, we write the inverse of g as −g. Most of the time, we will assume
that the group in question is written multiplicatively, so will state our results in
that form. Later, we will give a table which compares the multiplicative forms of
our results to their additively written counterparts.
Theorem 6.6. Let G be a group (written multiplicatively). Then
• If g ∈ G, then (g −1 )−1 = g.
• If g, h ∈ G, then (gh)^{−1} = h^{−1} g^{−1} .
Exercise 6.7. Prove the above theorem.
Definition 6.8 (Exponentiation). Let G be a group. For n ∈ P we define the
power g^n for g ∈ G recursively as follows:
• g^1 = g.
• g^{s(n)} = g^n g.
This definition is extended to all n ∈ Z as follows:
• g^0 = e.
• g^{−n} = (g^n )^{−1} if n ∈ P.
Lemma 6.9. Let G be a group, g ∈ G, and m, n ∈ P. Then
(1) g^m g^n = g^{m+n} .
(2) (g^m )^n = g^{mn} .
Proof. To establish the first equation, we show that the set S = {n ∈ P | g^m g^n =
g^{m+n} for all m ∈ P} is an inductive subset of P. Note g^{m+1} = g^m g = g^m g^1 , so
1 ∈ S. Suppose that n ∈ S. Then
g^m g^{s(n)} = g^m g^n g = g^{m+n} g = g^{(m+n)+1} = g^{m+s(n)} .
Thus S is inductive, so it follows that S = P.
Next, we show that the set S = {n ∈ P | (g^m )^n = g^{mn} for all m ∈ P} is an
inductive subset of P. Note (g^m )^1 = g^m = g^{m·1} , so 1 ∈ S. Suppose that n ∈ S.
Then
(g^m )^{s(n)} = (g^m )^{n+1} = (g^m )^n (g^m )^1 = g^{mn} g^m = g^{mn+m} = g^{m(n+1)} = g^{m·s(n)} .
Thus S is inductive, so that S = P.
Theorem 6.10. Let G be a group, g ∈ G and n ∈ P. Then g^{−n} = (g^{−1})^n .

Proof. We proceed by induction. Let S = {n ∈ P | g^{−n} = (g^{−1})^n }. Since
g^{−1} = (g^{−1})^1 by the definition of exponentiation, it follows that 1 ∈ S. Suppose
that n ∈ S. Then
g^{−s(n)} = (g^{s(n)} )^{−1} = (g^n g)^{−1} = g^{−1} (g^n )^{−1} = g^{−1} (g^{−1})^n = (g^{−1})^{n+1} = (g^{−1})^{s(n)} .
Thus S is inductive, so S = P.
Now we are ready to show that the statements of Lemma 6.9 hold for all integers.

Theorem 6.11. Let G be a group, g ∈ G, and m, n ∈ Z. Then
(1) g^m g^n = g^{m+n} .
(2) (g^m )^n = g^{mn} .

Proof. Let us first note that both formulas are immediate whenever m or n is equal
to zero. Thus, we can restrict to the case when either both m and n are negative,
or when one is positive and the other is negative.
Let us examine the case when both exponents are negative, so m = −k and
n = −ℓ for some k, ℓ ∈ P. Then
g^{m+n} = g^{n+m} = g^{−(ℓ+k)} = (g^{ℓ+k} )^{−1} = (g^ℓ g^k )^{−1} = (g^k )^{−1} (g^ℓ )^{−1} = g^{−k} g^{−ℓ} = g^m g^n .
Next,
(g^m )^n = (g^{−k} )^{−ℓ} = ((g^{−k} )^ℓ )^{−1} = (((g^k )^{−1} )^ℓ )^{−1} = (((g^k )^ℓ )^{−1} )^{−1} = (g^k )^ℓ = g^{kℓ} = g^{mn} .
Thus, both formulas hold when m and n are negative.
Next, suppose that m ≥ n with m, n ∈ P. Then
g^m g^{−n} = g^{m−n} g^n (g^n )^{−1} = g^{m−n} .
A similar argument holds when 1 ≤ m < n, but this time we factor g^{−n} =
g^{−m} g^{−(n−m)} . It is an exercise for the reader to extend this to the case g^{−m} g^n
when m, n ∈ P.
Finally, let us consider the multiplicative formula. Let m, n ∈ P. Then
(g^{−m} )^n = ((g^m )^{−1} )^n = (g^m )^{−n} = ((g^m )^n )^{−1} = (g^{mn} )^{−1} = g^{−mn} .
The case (g^m )^{−n} is handled similarly.
Theorem 6.12. Suppose that G is a group and g, h ∈ G. If gh = hg, then
(gh)^m = g^m h^m for all m ∈ Z. Moreover, g and h commute precisely when
(gh)^2 = g^2 h^2 , so if g and h fail to commute, the formula does not hold for all
m ∈ Z.
Exercise 6.13. Prove the above theorem.
Let us consider a group G with a commutative operation +. In this case, it
is natural to write a + a as 2a rather than a^2 . We could define na for n ∈ Z
and a ∈ G using a recursive definition for n ∈ P, together with 0a = 0 and
(−n)a = −(na). Then we could prove properties corresponding to the exponentiation
rules we have shown above. However, it is not really necessary to do this, as these
rules are just a translation of the power rules into additive notation. In the table
below, we give a comparison between the exponential properties of a group, written
multiplicatively, and the properties of multiplication by integers in a group with the
group operation written as +, which we assume is commutative.
Property              Exponential Notation       Additive Notation
Power                 g^n                        ng
Inverse               g^{−1}                     −g
Sum Rule              g^{m+n} = g^m g^n          (m + n)g = mg + ng
Multiplication Rule   (g^m )^n = g^{mn}          (mn)g = m(ng)
Power of products     If gh = hg then            m(g + h) = mg + mh
                      (gh)^m = g^m h^m
Proposition 6.14 (Cancellation Laws). Let G be a group and a, b, c ∈ G.
(1) If ab = ac then b = c. (Left cancellation)
(2) If ba = ca then b = c. (Right cancellation)
Exercise 6.15. Prove the above proposition.
Proposition 6.16. If a, b ∈ G, then
• The equation ax = b has the unique solution x = a−1 b.
• The equation xa = b has the unique solution x = ba−1 .
Exercise 6.17. Prove the above proposition.
Definition 6.18. If G is a finite group, then a Cayley Table of the group is an
n × n matrix whose columns and rows are headed by elements of the group ordered
as g1 , . . . , gn , and whose entry in the ith row and jth column is gi gj .
It is conventional to list the elements in the same order in the rows as in the columns
and to list the identity element e first. Cayley tables first appeared in an 1854 paper
by Arthur Cayley (1821–1895).
Example 6.19. Recall that the symmetric group S3 is given by S3 = {e, ρ, ρ^2 , σ, ρσ, ρ^2 σ},
where ρ = (1, 2, 3) and σ = (1, 2) are given in cycle notation. Then a Cayley table
for S3 is

        | e      ρ      ρ^2    σ      ρσ     ρ^2 σ
  ------+--------------------------------------------
  e     | e      ρ      ρ^2    σ      ρσ     ρ^2 σ
  ρ     | ρ      ρ^2    e      ρσ     ρ^2 σ  σ
  ρ^2   | ρ^2    e      ρ      ρ^2 σ  σ      ρσ
  σ     | σ      ρ^2 σ  ρσ     e      ρ^2    ρ
  ρσ    | ρσ     σ      ρ^2 σ  ρ      e      ρ^2
  ρ^2 σ | ρ^2 σ  ρσ     σ      ρ^2    ρ      e
In the example above, note that each element appears exactly once in each
column and row of the table. This is a basic property of Cayley tables, which
follows from Proposition 6.16. We can exploit this to find Cayley tables for
groups of small size.
Example 6.20. Let G be a group of order 2, e be its identity element, and g be
the nonidentity element. Then its Cayley table is

      | e  g
  ----+-----
  e   | e  g
  g   | g  e
Example 6.21. Let G be a group of order 3, e be its identity element, and g be a
nonidentity element. Let h be the third element. If g^2 were equal to e, then consider
what the Cayley table would look like:

      | e  g  h
  ----+--------
  e   | e  g  h
  g   | g  e  x
  h   | h  y  z

There is no way to assign a value to either x or y which is consistent with the
observation that each element must appear exactly once in every row and column.
Thus we must have g^2 = h, and we can write the Cayley table as follows.

        | e    g    g^2
  ------+--------------
  e     | e    g    g^2
  g     | g    g^2  e
  g^2   | g^2  e    g
We have presented Cayley tables for groups of order 2 and 3, but there is a
problem with the exposition. The table alone does not mean that there is a group
with the structure given in the table. For example, if we have a group of order 3,
then its Cayley table must look like the one given in the example above, but that
is not enough to show that there is such a group. The problem is that we have not
verified the axioms.
In particular, the axiom of associativity for a binary operation on a set G is time
consuming to verify. For a group of order n, there are n^3 expressions of the form
(a ⋆ b) ⋆ c, and another n^3 expressions of the form a ⋆ (b ⋆ c). To check associativity,
we have to compare the two sequences of products, so one needs to make 2n^3
calculations to verify the associativity axiom directly.
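For small n, the 2n^3 comparisons can of course simply be carried out by machine. A brute-force checker of all three axioms (an illustrative sketch; the operation is passed either as a function or as a nested table, a convention of our own):

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the group axioms for a finite set
    with binary operation op (a function or a dict-of-dicts)."""
    mul = op if callable(op) else (lambda a, b: op[a][b])
    # Associativity: compare all 2n^3 products.
    if any(mul(mul(a, b), c) != mul(a, mul(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Existence of an identity element.
    ids = [e for e in elements
           if all(mul(e, a) == a and mul(a, e) == a for a in elements)]
    if not ids:
        return False
    e = ids[0]
    # Existence of inverses.
    return all(any(mul(a, b) == e and mul(b, a) == e for b in elements)
               for a in elements)
```

For example, is_group(range(4), lambda a, b: (a + b) % 4) returns True, while the operation max on {0, 1} is associative with identity 0 but fails the inverse axiom.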
On the other hand, it is possible to determine if the other two axioms are satisfied
by examining the Cayley table, so checking associativity is the main problem. As
a consequence, we will have to find other methods to check if a binary operation
is associative. We already encountered this problem for the groups Z and Zn .
We proved associativity of addition on P by verifying that it is a consequence of
the Peano postulates, and we reduced the problem of verifying the associativity of
addition on Zn to the associativity in Z. We did not give a proof of associativity
in Z, because we did not give the construction of Z from P, but this can be done.
These remarks are still not enough to verify that there is a group whose Cayley
table corresponds to the one which we gave for a group of order 3. However, it is
enough to exhibit some group of order 3, because our argument shows that the
table we gave applies to any group of order 3. We can exhibit such a group easily,
as we show in the theorem below.
Theorem 6.22. The group given by addition on Zn for n ∈ P has order n.
Exercise 6.23. Prove the above theorem.
There is a second problem with the idea that there is only one group of order 3,
and it is more substantial. In fact, there are many groups of order 3. If we take a
set G = {x, y, z} of three elements, then we can give it the structure of a group of
order 3 by identifying e = x, g = y and g^2 = z. Since there is clearly more than
one set of order 3, there is more than one group of order 3. Yet, in some sense,
we would like to say that all groups of order 3 are essentially the same group. To
make this idea precise, we introduce the notion of isomorphism.
Definition 6.24. Let G and G′ be two sets equipped with binary operations (which
we will denote by juxtaposition). Then an isomorphism ϕ from G to G′ is a bijection
ϕ : G → G′ which satisfies
ϕ(gh) = ϕ(g)ϕ(h), for all g, h ∈ G.
We can interpret ϕ as a relabeling function. The key idea is that it doesn’t
matter whether you multiply the elements in G and then consider the image of
their product or their images in G′ , because you obtain the same result. We can
also express the property of isomorphism in the form
gh = ϕ−1 (ϕ(g)ϕ(h).
From this point of view, to compute the product in G, first map the two elements
to G′ , compute their product and then map back. We can use this formula to define
a product on G given a product on G′ and a bijection between G and G′ .
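For instance, taking G′ = Z3 and a bijection to a three-element set of labels (the labels x, y, z and the particular bijection are assumptions of this illustration), the formula above defines a group structure on G = {x, y, z}:

```python
# Sketch: transporting the product of G' = Z_3 to G = {x, y, z} through a
# bijection phi, via gh = phi^{-1}(phi(g) phi(h)). The labels x, y, z and
# the particular bijection are assumptions of this illustration.
phi = {'x': 0, 'y': 1, 'z': 2}            # the relabeling bijection
phi_inv = {v: k for k, v in phi.items()}

def product(g, h):
    # multiply in Z_3, then map the result back to G
    return phi_inv[(phi[g] + phi[h]) % 3]

assert product('x', 'y') == 'y'   # x plays the role of the identity e
assert product('y', 'y') == 'z'   # g * g = g^2 in the order-3 group
```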
Notice that we did not require G or G′ to be a group to define an isomorphism.
However, there is an important fact we can establish relating group structures and
bijections.
Theorem 6.25. Let ϕ : G → G′ be an isomorphism between two sets which are
equipped with binary operations. Then
• The product on G is associative if and only if the product on G′ is associative.
• There is an identity element in G if and only if there is one in G′ .
• G is a group if and only if G′ is a group.
Proof. Suppose that x, y, z ∈ G′ . Let g, h, k ∈ G be such that ϕ(g) = x, ϕ(h) = y
and ϕ(k) = z. If the product on G is associative, then
(xy)z = (ϕ(g)ϕ(h))ϕ(k) = ϕ(gh)ϕ(k) = ϕ((gh)k) = ϕ(g(hk)) = ϕ(g)ϕ(hk)
= ϕ(g)(ϕ(h)ϕ(k)) = x(yz).
Suppose that g, h, k ∈ G and let x, y, z ∈ G′ be such that ϕ(g) = x, ϕ(h) = y and
ϕ(k) = z. If the product on G′ is associative, then
(gh)k = ϕ−1 (ϕ((gh)k)) = ϕ−1 (ϕ(gh)ϕ(k)) = ϕ−1 ((ϕ(g)ϕ(h))z)
= ϕ−1 ((xy)z) = ϕ−1 (x(yz)) = ϕ−1 (ϕ(g)(ϕ(h)ϕ(k)))
= ϕ−1 (ϕ(g)ϕ(hk)) = ϕ−1 (ϕ(g(hk))) = g(hk).
This shows that the product on G is associative precisely when the product on G′
is associative.
Next, suppose that eg = ge = g for all g ∈ G. Let e′ = ϕ(e). Suppose that
x ∈ G′ . Then x = ϕ(g) for some g ∈ G, so we have
e′ x = ϕ(e)ϕ(g) = ϕ(eg) = ϕ(g) = x,
and similarly xe′ = x. Thus e′ is an identity element in G′ . On the other hand, if
e′ ∈ G′ satisfies e′ x = xe′ = x for all x ∈ G′ , then let e ∈ G be such that ϕ(e) = e′ .
Let g ∈ G. Then
eg = ϕ−1 (ϕ(eg)) = ϕ−1 (ϕ(e)ϕ(g)) = ϕ−1 (e′ ϕ(g)) = ϕ−1 (ϕ(g)) = g,
and similarly ge = g.
Now finally, note that if either G or G′ is a group, then the product on both
of them is associative and there are identity elements e ∈ G and ϕ(e) = e′ ∈ G′ .
Assume G is a group, and let x ∈ G′ . Then there is some g ∈ G such that ϕ(g) = x
and some h ∈ G such that gh = hg = e. Let y = ϕ(h). Then xy = ϕ(g)ϕ(h) =
ϕ(gh) = ϕ(e) = e′ and similarly yx = e′ . Thus G′ is a group. We leave for the
reader the case of showing that if G′ is a group then G is a group.
7. Subgroups
Definition 7.1. If H ⊆ G is a subset of a group G, then H is said to be a subgroup
of G provided that H is a group under the same binary operation as in G. If H is
a subgroup of G, we denote this fact by H ≤ G.
It is very important that we use the same binary operation. For example, Q is a
group under addition, and Q∗ , the set of nonzero elements of Q, is a group under
multiplication. But Q∗ is not a subgroup of Q, because the binary operation is not
the same.
Since a subgroup H of a group G must be a group, it cannot be empty, since
it must have an identity element. In fact, if e is the identity in G and e′ is the
identity in H, then if h ∈ H, we have e′ h = h. But by Uniqueness of Identity in
G, it follows that e′ = e. Thus the identity in H must be the same identity as in
G. Similarly, the inverse of an element in H must coincide with the inverse of that
element in G, by the Uniqueness of Inverse.
The theorem below gives a powerful criterion for determining whether a subset
of a group is a subgroup.
Theorem 7.2. Let H be a subset of a group G. Then H is a subgroup of G
if and only if it satisfies the three properties below.
(1) H is not empty.
(2) If a, b ∈ H, then ab ∈ H. (H is closed under the group operation)
(3) If a ∈ H, then a−1 ∈ H. (H is closed under inverse).
Proof. By the second property, the binary operation on G restricts to a well-defined
binary operation on H. Although we do not state well-definedness of the binary
operation as one of the three axioms of a group, it is implicitly required, because
the definition of a group is based on the existence of a binary operation satisfying
the three requirements; so we must at least verify that the binary operation is
well-defined on H.
The axiom of associativity is automatic, because associativity holds in G, so the
product of elements in H must also satisfy associativity.
The existence of an identity in H is verified as follows. By the first property,
there is some h ∈ H. By the third property, h−1 ∈ H. Thus by the second property
e = hh−1 lies in H. Since e is the identity in G, it also is the identity in H.
Finally, the third property guarantees that h−1 ∈ H if h ∈ H. Moreover, h−1 is
the inverse of h in H, since it is the inverse of h in G.
Students often make the mistake of trying to show that H is a subgroup of G
by proving that an identity exists in H or that an inverse exists in H. This is not
the idea. Existence of the identity and of the inverse of an element of H is already
guaranteed. It is the location of the identity and inverses that we are concerned
with. You want to show that the identity lies in H and that the inverse of an
element of H lies in H, not that these elements exist.
Another common mistake is to prove that the product in H is associative. We
already know this, so such a “proof” is irrelevant to showing that H is a subgroup
of G. The key is to prove that the three properties in the theorem hold!
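For a finite subset, the three properties of the theorem translate into a direct check; a Python sketch (Z12 under addition mod 12 is an assumed example, not taken from the text):

```python
from itertools import product

# Sketch of the subgroup test of Theorem 7.2 for a finite subset H, given
# the group operation and inverse (Z_12 under addition is an assumed example).
def is_subgroup(H, op, inv):
    """Check: nonempty, closed under the operation, closed under inverse."""
    if not H:
        return False
    closed = all(op(a, b) in H for a, b in product(H, repeat=2))
    inverses = all(inv(a) in H for a in H)
    return closed and inverses

op = lambda a, b: (a + b) % 12
inv = lambda a: (-a) % 12
assert is_subgroup({0, 3, 6, 9}, op, inv)    # <3> in Z_12
assert not is_subgroup({0, 3, 6}, op, inv)   # not closed: 3 + 6 = 9 missing
```

Note that the check never verifies associativity, in line with the remark above: associativity is inherited from G.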
Corollary 7.3. Suppose that H is a finite subset of a group G and it satisfies the
first two properties in the theorem. Then H is a subgroup of G.
Proof. Since the first two properties hold, we need only show that the third holds.
Let h ∈ H. Now the set P is infinite, and hk ∈ H for any k ∈ P by a straightforward
induction argument, so there must be some m < n ∈ P such that hm = hn . Then
e = hm h−m = hn h−m = hn−m . Let k = n−m. If k = 1, then h = e so h = h−1 and
h−1 ∈ H. Otherwise k > 1, so k − 1 ∈ P. Now e = hhk−1 , so h−1 = hk−1 ∈ H.
Corollary 7.4. Let G be a finite group. Then if g ∈ G, g −1 = g k for some k ∈ P.
Proof. Let H = {g k |k ∈ P}. Then H is not empty, since g ∈ H. Suppose that
x, y ∈ H. Then x = g k and y = g ℓ for some k, ℓ ∈ P. Then xy = g k+ℓ ∈ H. Thus
H satisfies the first two properties in the theorem, and since G is finite, so is H.
Thus by the corollary above H is a subgroup of G. But this implies g −1 = g k for
some k ∈ P.
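The corollary can be watched in action numerically; a sketch (the example group Z7∗, the nonzero residues mod 7 under multiplication, is an assumption of this illustration):

```python
# Sketch of Corollary 7.4: in a finite group, g^{-1} is a positive power of
# g. The example group Z_7* (nonzero residues mod 7 under multiplication)
# is an assumption of this illustration.
def inverse_as_power(g, op, e):
    """Return the least k in P with g^(k+1) = e, so that g^{-1} = g^k."""
    power, k = g, 1
    while op(power, g) != e:
        power = op(power, g)
        k += 1
    return k

op = lambda a, b: (a * b) % 7
k = inverse_as_power(3, op, 1)
assert (3 * pow(3, k, 7)) % 7 == 1    # g * g^k = e, so g^k is g^{-1}
```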
Theorem 7.5. Let H, K ≤ G. Then H ∩ K ≤ G. More generally, let Λ be a
set and suppose that we have a collection of subgroups Hλ of G for λ ∈ Λ. Define
∩λ∈Λ Hλ = {g ∈ G|g ∈ Hλ for all λ ∈ Λ}. Then ∩λ∈Λ Hλ ≤ G.
Proof. We prove the more general statement, since it includes the case of the
intersection of two subgroups. Let H = ∩λ∈Λ Hλ . First, since Hλ is a subgroup for all
λ ∈ Λ, we must have e ∈ Hλ for all λ ∈ Λ. Thus e ∈ H and H is not empty. Now
suppose that a, b ∈ H. Then a, b ∈ Hλ for all λ, so ab ∈ Hλ for all λ. Thus ab ∈ H.
Finally, a−1 ∈ Hλ for all λ, so it also follows that a−1 ∈ H. Thus H ≤ G.
Theorem 7.6. Let H ≤ G and suppose that g ∈ G. Then the set gHg −1 =
{ghg −1 |h ∈ H} is a subgroup of G. Such a subgroup is called a conjugate of H (by
g).
Proof. e ∈ H, so e = geg −1 ∈ gHg −1 . Thus gHg −1 ̸= ∅. Suppose that x, y ∈
gHg −1 . Then x = gag −1 and y = gbg −1 for some a, b ∈ H. Thus xy =
gag −1 gbg −1 = gabg −1 ∈ gHg −1 , since ab ∈ H. Finally
x−1 = (gag −1 )−1 = (g −1 )−1 a−1 g −1 = ga−1 g −1 ∈ gHg −1 ,
since a−1 ∈ H.
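A quick check of the theorem in S3, with a permutation of {0, 1, 2} encoded as a tuple p in which p[i] is the image of i (an assumed encoding, not the notes' cycle notation), and composition (p ∘ q)(i) = p[q[i]]:

```python
# Sketch of Theorem 7.6 in S_3, with a permutation of {0, 1, 2} encoded as
# a tuple p where p[i] is the image of i (an assumed encoding), and
# (p * q)(i) = p[q[i]].
compose = lambda p, q: tuple(p[q[i]] for i in range(3))
inverse = lambda p: tuple(p.index(i) for i in range(3))

H = {(0, 1, 2), (1, 0, 2)}     # the subgroup generated by one transposition
g = (1, 2, 0)                  # a 3-cycle
gHg = {compose(compose(g, h), inverse(g)) for h in H}

assert len(gHg) == len(H)      # the conjugate has the same order as H
assert (0, 1, 2) in gHg        # and it contains the identity
```

Here conjugating the subgroup generated by one transposition by a 3-cycle produces the subgroup generated by a different transposition, so a conjugate of H need not equal H.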
Definition 7.7. Let G be a group and aλ ∈ G for λ ∈ Λ. Then ⟨aλ , λ ∈ Λ⟩,
the intersection of all subgroups H such that aλ ∈ H for all λ ∈ Λ, is called the
subgroup generated by the aλ .
Exercise 7.8. Show that ⟨aλ , λ ∈ Λ⟩ is actually a subgroup of G.
Exercise 7.9. Let G = S3 . Show that the only proper, nontrivial subgroups of G
are ⟨(1, 2)⟩, ⟨(1, 3)⟩, ⟨(2, 3)⟩ and ⟨(1, 2, 3)⟩. Thus there are precisely 6 subgroups of S3 .
Definition 7.10. Let G be a group. Then the center of G, denoted Z(G), is the
subset of all g ∈ G such that gx = xg for all x ∈ G.
Exercise 7.11. Prove that the center of G is a subgroup of G.
Exercise 7.12. Find Z(S3 ).
Definition 7.13. Let G be a group and S ⊆ G. Then the centralizer of S in G is
the set CG (S) = {g ∈ G|gs = sg for all s ∈ S}. When the context is clear, we also
denote the centralizer of S by C(S), and when S = {a} is a singleton, we usually
denote C({a}) more compactly as C(a).
Note that C(G) = Z(G) is just the center of G. Also, note that C(e) = G, since
every element of G commutes with e.
Proposition 7.14. Let S ⊆ G. Then C(S) is a subgroup of G; i.e., C(S) ≤ G.
Proof. First note that e ∈ C(S), since e commutes with any element of G, so
it certainly commutes with every element of S. Thus C(S) ̸= ∅. Suppose that
g, h ∈ C(S). If s ∈ S, then gs = sg and hs = sh, so
(gh)s = g(hs) = g(sh) = (gs)h = (sg)h = s(gh),
so gh ∈ C(S). Finally, if g ∈ C(S) and s ∈ S, then
g −1 s = g −1 se = g −1 sgg −1 = g −1 gsg −1 = esg −1 = sg −1 .
Thus g −1 commutes with s, so g −1 ∈ C(S). This shows that C(S) ≤ G.
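As an illustration of the definition, a Python sketch computing the centralizer of a single element in S3, with permutations of {0, 1, 2} as tuples and (p ∘ q)(i) = p[q[i]] (an assumed encoding):

```python
from itertools import permutations

# Sketch of Definition 7.13: the centralizer C(a) of a single element a in
# S_3, with permutations of {0, 1, 2} as tuples and (p * q)(i) = p[q[i]]
# (an assumed encoding).
compose = lambda p, q: tuple(p[q[i]] for i in range(3))
S3 = list(permutations(range(3)))

a = (1, 2, 0)                                    # a 3-cycle
C_a = [g for g in S3 if compose(g, a) == compose(a, g)]

assert len(C_a) == 3                             # here C(a) = <a>, of order 3
assert all(compose(g, a) == compose(a, g) for g in C_a)
```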
ABSTRACT ALGEBRA I NOTES
29
8. Cyclic Groups
Definition 8.1. Let G be a group and g ∈ G. The cyclic subgroup generated by g,
denoted by ⟨g⟩, is the set {g k |k ∈ Z}.
For a group G with commutative operation +, the cyclic subgroup ⟨g⟩ is given
by ⟨g⟩ = {kg|k ∈ Z}. This is because kg is the analogue of g k for groups with
operation +.
Example 8.2. We show that ⟨g⟩ is actually a subgroup of G. First, it is not empty,
since g = g 1 ∈ ⟨g⟩. Next, suppose that x, y ∈ ⟨g⟩. Then x = g k and y = g ℓ for
some k, ℓ ∈ Z, so xy = g k g ℓ = g k+ℓ ∈ ⟨g⟩. Finally, if x = g k ∈ ⟨g⟩, then x−1 = g −k ∈ ⟨g⟩.
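In a finite group, ⟨g⟩ can be produced by taking successive powers of g until they return to the identity; a sketch (Z12 under addition mod 12 is an assumed example):

```python
# Sketch: generating <g> by taking successive powers of g until they return
# to the identity, for a finite group (Z_12 under addition mod 12 is an
# assumed example).
def cyclic_subgroup(g, op, e):
    H, x = [e], op(e, g)
    while x != e:
        H.append(x)
        x = op(x, g)
    return H

op = lambda a, b: (a + b) % 12
assert sorted(cyclic_subgroup(2, op, 0)) == [0, 2, 4, 6, 8, 10]  # order 6
assert sorted(cyclic_subgroup(5, op, 0)) == list(range(12))      # 5 generates Z_12
```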
Theorem 8.3. Let G be a group and g ∈ G. Then the cyclic subgroup ⟨g⟩ generated
by g is the intersection of all subgroups of G containing g. As a consequence, if
H ≤ G and g ∈ H, then ⟨g⟩ ≤ H.
Proof. Let H be the collection of all subgroups containing g. Then ⟨g⟩ ∈ H, so
∩{H|H ∈ H} ⊆ ⟨g⟩. On the other hand, suppose that H ∈ H. We show by
induction that g m ∈ H for all m ∈ P. Clearly, g 1 = g ∈ H, since H ∈ H. Suppose
that g m ∈ H. Then g m+1 = g m g ∈ H, since H is closed under products. Thus, by
the principle of mathematical induction, g m ∈ H for all m ∈ P. Moreover, if m ∈ P,
then g −m = (g m )−1 ∈ H, since H is closed under inverses. Finally, g 0 = e ∈ H,
since every subgroup contains the identity. Thus ⟨g⟩ ⊆ H. It follows from this that
⟨g⟩ ⊆ ∩{H|H ∈ H}. Thus equality holds. The second statement of the theorem
follows immediately from the first.
Example 8.4. Let G be a group. Then ⟨e⟩ = {e} is the cyclic subgroup generated
by the identity element. It is called the trivial subgroup of G. The group G is also
a subgroup of G, called the improper subgroup of G. Note that every group has
these two subgroups, and they are distinct unless G is a one element group. All
other subgroups of G are called proper, nontrivial subgroups.
Definition 8.5. A group G is called a cyclic group if there is an element g ∈ G
such that G = ⟨g⟩. An element g such that G = ⟨g⟩ is called a generator of the
group G.
Example 8.6. The group Z is cyclic, because Z = ⟨1⟩. Moreover, the group Zn is
also cyclic, since Zn = ⟨1⟩.
Definition 8.7. Suppose that G and G′ are groups. If there is an isomorphism
ϕ : G → G′ , we say that G and G′ are isomorphic groups (or simply that G and G′
are isomorphic). We denote this by G ≅ G′ .
Exercise 8.8. Show that ≅ is an equivalence relation.
Actually, the collection of all groups is not a set, but is something called a class
in set theory. Since we require an equivalence relation to be a relation on a set, it
is not technically correct that ≅ is an equivalence relation, since there is no set of
all groups. But we can extend the notion of an equivalence relation to classes, and
if we do so, then it is true that ≅ is an equivalence relation.
Definition 8.9. If X is a finite set, the order of X, denoted as o(X) or |X|, is the
number of elements in X. Otherwise, we say that X has infinite order and denote
o(X) = ∞.
Theorem 8.10. Let G be a cyclic group. If G is infinite, then G ≅ Z. If G is
finite, then G ≅ Zn , where o(G) = n.
Proof. Let G = ⟨g⟩. Suppose that o(G) = n < ∞. The sequence g = g 1 , . . . , g n+1
has n + 1 elements, so must contain some duplicates, so there is some 1 ≤ k <
ℓ ≤ n + 1 such that g k = g ℓ . Then g ℓ−k = e. Moreover ℓ − k ≤ n + 1 − 1 = n.
Since the set X = {x ∈ P|g x = e} contains ℓ − k, X is not empty and so has a
least element m. Moreover, m ≤ n. We claim that G = {g 1 , . . . , g m }, from which
it follows that m = n. It cannot happen that there are any duplicates in the set
T = {g 1 , . . . , g m }, because if g k = g ℓ for some 1 ≤ k < ℓ ≤ m, then g ℓ−k = e,
and 1 ≤ ℓ − k < m, which would contradict the minimality of m. On the other
hand, we claim that g k ∈ T for all k ∈ Z. To see this, we first show this fact for
k ∈ P. Let S = {k ∈ P|g k ∈ T }. Suppose that x ∈ S for all x < k. If 1 ≤ k ≤ m,
then clearly k ∈ S, so assume k > m. Then g k = g k−m g m = g k−m e = g k−m . Now
x = k − m < k, so by assumption x ∈ S, so g k = g x ∈ T . Thus k ∈ S. By the
principle of strong induction, S = P.
Next, note that g 0 = g m ∈ T . Finally, note that if k ∈ P, then mk − k =
(m − 1)k ≥ 0. Moreover
g mk−k = (g m )k g −k = ek g −k = eg −k = g −k ,
so g −k = g mk−k ∈ T . This shows that G = T , and thus m = n, since T has m
elements and G has n elements.
Define a map ϕ : Zn → G by ϕ(k̄) = g k . We have to show that this map is well
defined. Suppose that k̄ = ℓ̄. Then ℓ = k + nr for some r ∈ Z, so g ℓ = g k+nr =
g k (g n )r = g k . Thus the right hand side in the definition of ϕ is independent of the
choice of representative for k̄. Clearly, ϕ is a bijection, since it is surjective and
o(Zn ) = o(G).
We show that ϕ is an isomorphism. Now
ϕ(k̄ + ℓ̄) = ϕ(k + ℓ) = g k+ℓ = g k g ℓ = ϕ(k̄)ϕ(ℓ̄).
Thus ϕ satisfies the required condition of an isomorphism.
Next, we consider the case when G is infinite. Define ϕ : Z → G by ϕ(k) = g k .
By the definition of a cyclic group, ϕ is surjective. If ϕ were not injective, then
there would have to be integers k and ℓ such that k < ℓ and g k = g ℓ . But then
g ℓ−k = e, and it would follow that G is finite, by the kind of argument we gave
above. Therefore ϕ is injective, so it is a bijection. To see ϕ is an isomorphism,
note that
ϕ(k + ℓ) = g k+ℓ = g k g ℓ = ϕ(k)ϕ(ℓ).
Thus ϕ meets all of the conditions to be an isomorphism.
The theorem above shows that we do not encounter any new kinds of groups
when considering cyclic groups. Note that if G is any group and g ∈ G, then the
cyclic subgroup ⟨g⟩ is cyclic, so it must be either isomorphic to Z or to Zn for some
n ∈ P. This means there is a short menu for what the structure of ⟨g⟩ could be,
and this is a very useful fact.
Theorem 8.11. Let m ∈ Z and d = gcd(m, n). Then ⟨m̄⟩ = ⟨d̄⟩ in Zn . Moreover,
if n ∈ P, this subgroup has order n/d. As a consequence, the number of subgroups
of Zn is equal to the number of distinct positive divisors of n.
Proof. Let H = ⟨m̄⟩ and K = ⟨d̄⟩. We show that H ⊆ K and K ⊆ H, from which
it follows that H = K. Express d = rm + sn. Then d̄ = rm̄, so that d̄ ∈ ⟨m̄⟩. It
follows that K ⊆ H, since every multiple of d̄ then lies in the subgroup ⟨m̄⟩. On
the other hand, m is a multiple of d, so m̄ ∈ K, and it follows that
H ⊆ K.
To see that o(⟨d̄⟩) = n/d, note that if 1 ≤ k < n/d, then kd < n, so kd̄ ̸= 0̄.
Moreover, if 1 ≤ k < ℓ ≤ n/d, we could not have kd̄ = ℓd̄, because in that case
we would have (ℓ − k)d̄ = 0̄, and since 1 ≤ ℓ − k < n/d, this is impossible by the
previous remark. It follows that o(⟨d̄⟩) = n/d.
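The theorem is easy to verify numerically; a sketch (the modulus 45 and the generator 12 are assumed examples):

```python
from math import gcd

# Sketch of Theorem 8.11 in Z_45 (the modulus 45 is an assumed example):
# <m> = <d> where d = gcd(m, 45), and the subgroup has order 45 / d.
def generated(m, n):
    """The subgroup <m> of Z_n as a set of residues."""
    return {(k * m) % n for k in range(n)}

n, m = 45, 12
d = gcd(m, n)                                # d = 3
assert generated(m, n) == generated(d, n)    # <12> = <3> in Z_45
assert len(generated(m, n)) == n // d        # its order is 45 / 3 = 15
```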
Corollary 8.12. Every subgroup of a cyclic group is cyclic.
Proof. First, note that the map ϕ : Z → Z0 , given by ϕ(a) = ā, is surjective, since
every element in Zn for any n is of the form ā for some a ∈ Z. Next note that
a ≡ b (mod 0) if and only if b − a = k · 0 = 0, that is, if and only if a = b. Thus ϕ is
injective, since if ϕ(a) = ϕ(b) then ā = b̄, which only happens if a = b. Thus Z ≅ Z0 .
Now let H ≤ Zn . If H = ⟨0̄⟩, it is cyclic. Otherwise, let X = {x ∈ P|x̄ ∈ H}. Since
H ̸= ⟨0̄⟩, there must be some nonzero x ∈ Z such that x̄ ∈ H. If x ∈ P then x ∈ X.
If x < 0, then the class of −x, namely −x̄, lies in H, so −x ∈ X. Thus X ̸= ∅, so it must have a least
element d. Suppose that m̄ ∈ H. Express m = qd + r where 0 ≤ r < d. Since
r̄ = m̄ − qd̄, r̄ ∈ H. Now if r > 0, it follows that r ∈ X, which would contradict
the minimality of d. Thus m̄ ∈ ⟨d̄⟩, and H ⊆ ⟨d̄⟩. Since ⟨d̄⟩ ⊆ H, it follows that
H = ⟨d̄⟩. Thus H is cyclic.
Because of our remark above that Z ≅ Z0 , we see that the theorem holds both
for Z and for Zn for all n. Of course, since Z is not finite, the counting part of the
theorem does not apply. However, we can easily see that ⟨k⟩ ≤ Z is an infinite cyclic
group, so is isomorphic to Z, except when k = 0, in which case ⟨0⟩ = {0} ≅ Z1 , the
cyclic group with exactly one element.
The moral of all this discussion of cyclic groups is that they are not very interesting, since they are all isomorphic to groups of the form Zn , which we already
have studied. Moreover, they are all abelian, and all subgroups of cyclic groups are
cyclic.
Definition 8.13. Given a group G, its subgroup lattice is the collection H of all
subgroups of G, partially ordered by inclusion. The lattice diagram corresponding
to the subgroup lattice is the graph with one vertex for each subgroup, with edges
corresponding to subgroup inclusion. These edges occur between subgroups K and
H provided that K ≤ H and there is no subgroup X such that K ≤ X ≤ H except
for the trivial cases when X = H or X = K. We also draw the graph so that if
K ≤ H, and K ̸= H, then the vertex corresponding to K lies lower in the graph
than the vertex corresponding to H.
A little thought reveals that there is one top vertex, corresponding to the improper subgroup G, and one bottom vertex, corresponding to the trivial subgroup
⟨e⟩. We give two examples below of lattice diagrams, for the groups S3 and Z45 .
9. Morphisms of Groups
Definition 9.1. Let G, G′ be groups. Then a morphism of groups or homomorphism is a map ϕ : G → G′ which satisfies
ϕ(gh) = ϕ(g)ϕ(h), for all g, h ∈ G.
[Figure 1. Subgroup Lattice for S3 , where σ = (1, 2, 3) and τ = (1, 2). The top
vertex is the improper subgroup ⟨σ, τ ⟩ = S3 , with edges down to the four proper,
nontrivial subgroups ⟨τ ⟩, ⟨στ ⟩, ⟨σ 2 τ ⟩ and ⟨σ⟩, each of which has an edge down to
the bottom vertex ⟨e⟩.]
[Figure 2. Subgroup Lattice for Z45 . The top vertex is ⟨1⟩ = Z45 , with edges down
to ⟨3⟩ and ⟨5⟩; below these lie ⟨9⟩ and ⟨15⟩, with ⟨9⟩ under ⟨3⟩, and ⟨15⟩ under
both ⟨3⟩ and ⟨5⟩; at the bottom is the trivial subgroup ⟨0⟩, under both ⟨9⟩ and
⟨15⟩.]
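By Theorem 8.11, the subgroups of Z45 correspond to the positive divisors of 45; a Python sketch that recovers the six vertices of the lattice and one of its inclusions:

```python
# Sketch: by Theorem 8.11, the subgroups of Z_45 are exactly <d> for the
# positive divisors d of 45, recovering the six vertices of the lattice.
n = 45
divisors = [d for d in range(1, n + 1) if n % d == 0]
subgroups = {d: frozenset((k * d) % n for k in range(n)) for d in divisors}

assert divisors == [1, 3, 5, 9, 15, 45]
assert len(subgroups[3]) == 15            # <3> has order 45 / 3
assert subgroups[15] < subgroups[3]       # <15> lies below <3> in the lattice
```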
Example 9.2. If ϕ : G → G′ is an isomorphism, then it is a homomorphism, since
the morphism condition above is one of the two conditions of an isomorphism.
Example 9.3. If G, G′ are groups, then the trivial morphism from G to G′ is the
map ϕ(g) = e for all g ∈ G. It is clearly a homomorphism, since ϕ(gh) = e = e·e =
ϕ(g)ϕ(h), for any g, h ∈ G. Thus there is always at least one morphism between
any two groups G and G′ .
Example 9.4. The identity morphism 1G : G → G is the map 1G (g) = g for all
g ∈ G. It is easy to show it is a morphism of groups.
Proposition 9.5. Let ϕ : G → G′ be a homomorphism. Then
(1) ϕ(e) = e.
(2) ϕ(g −1 ) = ϕ(g)−1 .
(3) ϕ(g m ) = ϕ(g)m for any m ∈ Z.
Proof. First, note that ϕ(e) = ϕ(e · e) = ϕ(e)ϕ(e). By cancelation in G′ ,
this forces ϕ(e) = e. Next, note that e = ϕ(e) = ϕ(gg −1 ) = ϕ(g)ϕ(g −1 ), for any g ∈
G. By uniqueness of inverse, it follows that ϕ(g −1 ) = ϕ(g)−1 . Finally, we prove the
last result by first showing it for m ∈ P by induction. Since ϕ(g 1 ) = ϕ(g) = ϕ(g)1 ,
it holds for m = 1. Suppose that ϕ(g m ) = ϕ(g)m . Then
ϕ(g m+1 ) = ϕ(g m g) = ϕ(g m )ϕ(g) = ϕ(g)m ϕ(g) = ϕ(g)m+1 .
Thus the formula holds for all m ∈ P. Next ϕ(g 0 ) = ϕ(e) = e = ϕ(g)0 , so it holds
for m = 0. Finally, if m ∈ P, then
ϕ(g −m ) = ϕ((g −1 )m ) = ϕ(g −1 )m = (ϕ(g)−1 )m = ϕ(g)−m .
Thus the formula holds for all m ∈ Z.
Theorem 9.6. Suppose that ϕ : G → G′ and ψ : G′ → G′′ are morphisms of
groups. Then ψ ◦ ϕ : G → G′′ is also a morphism of groups.
Proof. Let g, h ∈ G. Then
(ψ ◦ ϕ)(gh) = ψ(ϕ(gh)) = ψ(ϕ(g)ϕ(h)) = ψ(ϕ(g))ψ(ϕ(h)) = (ψ ◦ ϕ)(g)(ψ ◦ ϕ)(h).
Corollary 9.7. Let ϕ : G → G′ and ψ : G′ → G′′ be isomorphisms. Then ψ ◦ ϕ is
an isomorphism.
Proof. We already showed that the composition of two bijections is a bijection.
By the theorem, the composition of two isomorphisms is a morphism. Since an
isomorphism is just a bijective morphism, this shows that ψ ◦ ϕ is an isomorphism.
Definition 9.8. Let ϕ : G → G′ be a homomorphism of groups. Then the kernel
of ϕ, denoted ker(ϕ) is the subset of G given by
ker(ϕ) = {g ∈ G|ϕ(g) = e}.
The notion of a kernel of a morphism of groups is parallel to that of the kernel
of a linear transformation. In fact, the kernel of a linear transformation is a special
case of the kernel of a morphism, since a linear map λ between two vector spaces
is a morphism of their underlying group operation of addition, and the kernel of
λ, considered as a linear transformation coincides with the kernel of the map λ,
considered as a morphism of groups!
Proposition 9.9. The kernel of a morphism ϕ : G → G′ is a subgroup of G.
Proof. Let ϕ : G → G′ be a homomorphism, and let H = ker(ϕ). Now e ∈ H since
ϕ(e) = e, so H ̸= ∅. Suppose that g, h ∈ H. Then ϕ(gh) = ϕ(g)ϕ(h) = e · e = e, so
gh ∈ H. Also ϕ(g −1 ) = ϕ(g)−1 = e−1 = e, so g −1 ∈ H. Thus H ≤ G.
An interesting question is what kinds of subgroups of G turn up as kernels of
morphisms. Since the kernel of the identity map 1G is just {e}, we see that the
trivial subgroup is the kernel of a morphism. Moreover, if we consider the trivial
morphism from G to G, it is easy to see that the improper subgroup of G is the
kernel of a morphism. We will see later, in the section on normal subgroups, that
the kinds of subgroups which are kernels of morphisms are quite special!
One of the most powerful results about kernels of morphisms is the following!
Theorem 9.10. Let ϕ : G → G′ be a group homomorphism. Then ϕ is injective
if and only if ker(ϕ) = {e}.
Proof. Suppose ϕ is injective and g ∈ ker(ϕ). Then ϕ(g) = e = ϕ(e), so g = e.
Thus ker(ϕ) = {e}. On the other hand, suppose that ker(ϕ) = {e} and ϕ(g) = ϕ(h).
Then ϕ(gh−1 ) = ϕ(g)ϕ(h)−1 = e, so gh−1 = e and g = h. Thus ϕ is injective.
What makes the theorem above so powerful is that it reduces the proof of
injectivity to the consideration of when ϕ(g) = e, instead of having to study when
ϕ(g) = ϕ(h) in general. In many cases, computation of the kernel of a morphism is
easy, so determination of whether the morphism is injective is equally easy.
Example 9.11. Let ϕ : Z → Zn be defined by ϕ(a) = ā. Since ϕ(a + b) =
ā + b̄ = ϕ(a) + ϕ(b), it follows that ϕ is a homomorphism. Moreover, a ∈ ker(ϕ)
if and only if ϕ(a) = 0̄ if and only if ā = 0̄ if and only if a ∈ ⟨n⟩. Thus we have
identified ker(ϕ) = ⟨n⟩.
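A quick numerical check of the example, with n = 6 as an assumed choice:

```python
# Sketch of Example 9.11 with n = 6 (an assumed choice): the reduction map
# Z -> Z_n is a homomorphism, and its kernel consists of the multiples of n,
# checked on a finite sample of integers.
n = 6
phi = lambda a: a % n

sample = range(-20, 20)
assert all(phi(a + b) == (phi(a) + phi(b)) % n for a in sample for b in sample)
kernel = [a for a in sample if phi(a) == 0]
assert all(a % n == 0 for a in kernel)    # ker(phi) = <n>
```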
Theorem 9.12. Suppose that ϕ : G → G′ is a morphism of groups and H ≤ G.
Then ϕ(H) ≤ G′ .
Proof. Since e = ϕ(e) ∈ ϕ(H), ϕ(H) ̸= ∅. Let x, y ∈ ϕ(H). Then x = ϕ(g) and
y = ϕ(h) for some g, h ∈ H. But then xy = ϕ(g)ϕ(h) = ϕ(gh) ∈ ϕ(H), since
gh ∈ H. Finally x−1 = ϕ(g)−1 = ϕ(g −1 ) ∈ ϕ(H), since g −1 ∈ H.
Definition 9.13. If ϕ : X → Y is a map and S ⊆ Y , then ϕ−1 (S) = {x ∈ X|ϕ(x) ∈
S} is called the inverse image of S under ϕ.
Theorem 9.14. Let ϕ : G → G′ be a group homomorphism and H ′ ≤ G′ . Then
ϕ−1 (H ′ ) ≤ G.
Proof. Let H = ϕ−1 (H ′ ). Since ϕ(e) = e ∈ H ′ , e ∈ H. This shows that H ̸= ∅.
Suppose that g, h ∈ H. Then ϕ(gh) = ϕ(g)ϕ(h) ∈ H ′ , since H ′ is a subgroup and
ϕ(g), ϕ(h) ∈ H ′ . Thus gh ∈ H. Moreover ϕ(g −1 ) = ϕ(g)−1 ∈ H ′ , so g −1 ∈ H.
Thus H ≤ G.
Definition 9.15. Let G be a group. Then the set Aut(G) of all isomorphisms
from G onto G is called the group of automorphisms of G, or automorphism group
of G.
Theorem 9.16. Aut(G) is a group under composition. In fact, Aut(G) ≤ SG ;
i.e., the automorphism group of G is a subgroup of the permutation group of G.
Proof. Since the second statement implies the first, we prove that Aut(G) ≤ SG .
Since an automorphism is a bijection by definition, Aut(G) ⊆ SG . Moreover,
composition is the group operation in SG . We already showed that the identity 1G
is a morphism, so it is in Aut(G), which implies that Aut(G) ̸= ∅. Suppose that
ϕ, ψ ∈ Aut(G). Then we already showed that ϕ ◦ ψ is a morphism, so it lies in
Aut(G). Finally, we need to show that ϕ−1 ∈ Aut(G), which follows if we can show
it is a morphism. But
ϕ−1 (gh) = ϕ−1 (ϕ(ϕ−1 (g))ϕ(ϕ−1 (h))) = ϕ−1 (ϕ(ϕ−1 (g)ϕ−1 (h))) = ϕ−1 (g)ϕ−1 (h).
Thus ϕ−1 is a morphism, which shows it is in Aut(G).
Theorem 9.17 (Cayley’s Theorem). The map ϕ : G → SG given by ϕ(g)(h) = gh
is an injective morphism of groups. Thus G is isomorphic to a subgroup of SG . In
particular, every group is isomorphic to a subgroup of a permutation group.
Proof. First, we need to show that if g ∈ G, then ϕ(g) ∈ SG . To see that ϕ(g)
is injective, suppose ϕ(g)(h) = ϕ(g)(h′ ). Then gh = gh′ so by left cancelation
h = h′ . Thus ϕ(g) is injective. To see that ϕ(g) is surjective, suppose y ∈ G. Then
ϕ(g)(g −1 y) = gg −1 y = y. Thus ϕ(g) is surjective, so it is bijective. This shows that
ϕ(g) ∈ SG .
Next, suppose that g, g ′ ∈ G. Then
ϕ(gg ′ )(x) = gg ′ x = ϕ(g)(g ′ x) = ϕ(g)(ϕ(g ′ )(x)) = (ϕ(g) ◦ ϕ(g ′ ))(x).
Thus ϕ(gg ′ ) = ϕ(g) ◦ ϕ(g ′ ), which shows that ϕ is a morphism of groups. Suppose
that g ∈ ker(ϕ). Then ϕ(g) = 1G , so
g = ge = ϕ(g)(e) = 1G (e) = e.
Thus ker(ϕ) = {e}, which shows that ϕ is injective. Finally, consider the restriction
of the codomain of ϕ to ϕ(G). Then the induced map G → ϕ(G) is still injective,
and is clearly surjective. This means that ϕ : G → ϕ(G) is an isomorphism.
In older textbooks, it was pointed out that Cayley’s Theorem is interesting
from a theoretical viewpoint, but is utterly useless as a calculation tool. However,
times have changed. The computational algebra software Maple has a group theory
package that can help analyze properties of groups, but only if they are given as
groups of permutations, that is, as subgroups of some permutation group. Cayley’s
Theorem can be used to express a finite group as a permutation group in a very
concrete manner.
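To see this concreteness, here is a sketch of the Cayley embedding for Z4 (an assumed example), with the permutation ϕ(g) : h ↦ g + h recorded as a tuple:

```python
# Sketch of Cayley's Theorem for Z_4 (an assumed example): each g yields the
# permutation h |-> g + h of {0, 1, 2, 3}, recorded as a tuple, and the map
# g |-> phi(g) is an injective morphism.
n = 4
phi = {g: tuple((g + h) % n for h in range(n)) for g in range(n)}

assert all(sorted(p) == list(range(n)) for p in phi.values())  # each phi(g) is a bijection
assert len(set(phi.values())) == n                             # phi is injective
g, gp = 1, 2                       # phi(g + g') is the composite of phi(g), phi(g')
assert phi[(g + gp) % n] == tuple(phi[g][phi[gp][h]] for h in range(n))
```

The four tuples produced are precisely a subgroup of the permutation group of a 4-element set that is isomorphic to Z4 .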
Example 9.18. Let G be a group and g ∈ G. Then the map cg : G → G given by
cg (x) = gxg −1 is called conjugation by g. We claim that cg is an automorphism
of G. To see this, first note that cg (xy) = gxyg −1 = gxg −1 gyg −1 = cg (x)cg (y),
so cg is a morphism. Thus, to show that cg is injective, we need only show that
ker(cg ) = {e}. Suppose that x ∈ ker(cg ). Then e = cg (x), so gxg −1 = e, and
multiplying both sides on the left by g −1 and on the right by g yields x = e. To
see that cg is surjective, let y ∈ G. We compute cg (g −1 yg) = gg −1 ygg −1 = y. Thus
cg is an isomorphism from G onto G, which is precisely what an automorphism of
G is.
Theorem 9.19. Let ϕ : G → G′ be a surjective morphism of groups and K =
ker(ϕ). Then the map H ′ 7→ ϕ−1 (H ′ ) is an order preserving bijection between the
set of subgroups of G′ and the set of subgroups of G which contain K.
Proof. First, note that if H = ϕ−1 (H ′ ) for some H ′ ≤ G′ , then if g ∈ ker(ϕ),
we have ϕ(g) = e ∈ H ′ , which implies that g ∈ ϕ−1 (H ′ ). This shows that
ker(ϕ) ⊆ ϕ−1 (H ′ ), which means that H ′ 7→ ϕ−1 (H ′ ) is a well defined map
between subgroups of G′ and subgroups of G containing ker(ϕ). Now suppose
that H is a subgroup of G containing ker(ϕ). Then H ′ = ϕ(H) is a subgroup
of G′ . We claim that ϕ−1 (H ′ ) = H. To see this, suppose that a ∈ ϕ−1 (H ′ ).
Then ϕ(a) ∈ H ′ = ϕ(H), so there is some h ∈ H such that ϕ(a) = ϕ(h). But
then ϕ(ah−1 ) = e, so ah−1 ∈ H, since H contains the kernel of ϕ. From this
we conclude that a = ah−1 h ∈ H. Thus ϕ−1 (H ′ ) ⊆ H. However, if h ∈ H
then ϕ(h) ∈ ϕ(H) = H ′ , so h ∈ ϕ−1 (H ′ ). Thus H ⊆ ϕ−1 (H ′ ), and we see that
H = ϕ−1 (H ′ ). This shows that the map H ′ 7→ ϕ−1 (H ′ ) is surjective to the set of
all subgroups of G containing ker(ϕ). Next, we show it is injective. Let H ′ , K ′ be
subgroups of G′ and suppose that H ′ ̸= K ′ . Suppose that x ∈ H ′ but x ̸∈ K ′ . Then
there is some g ∈ G such that ϕ(g) = x, since ϕ is surjective. We have g ∈ ϕ−1 (H ′ )
but g ̸∈ ϕ−1 (K ′ ). Thus ϕ−1 (H ′ ) ̸= ϕ−1 (K ′ ). If we have some x ∈ K ′ which does
not lie in H ′ we deduce by a similar argument that the two inverse images are not
equal. This shows that the map is injective.
Notice that in the proof above, we only needed that ϕ was surjective to show
that H ′ 7→ ϕ−1 (H ′ ) is an injective map to the set of subgroups of G containing
ker(ϕ). When ϕ is not surjective, the proof above shows that the map is still well
defined and surjective.
10. Cosets
In this section we consider a fixed subgroup H of a group G.
Definition 10.1. Let H ≤ G. We define a relation ∼L , called left equivalence, by
a ∼L b if and only if b−1 a ∈ H. Similarly, we define right equivalence by a ∼R b
if and only if ab−1 ∈ H.
Theorem 10.2. The relations left equivalence and right equivalence are equivalence relations.
Proof. We prove the result for left equivalence, which we will denote by ∼ instead
of ∼L , and we leave the proof for right equivalence, which is similar, to the reader.
First, note that a ∼ a because a−1 a = e ∈ H. Next, suppose that a ∼ b. Then
b−1 a = h ∈ H. Thus a−1 b = (b−1 a)−1 = h−1 ∈ H. This shows that b ∼ a. Finally,
suppose that a ∼ b and b ∼ c. Then b−1 a = h ∈ H and c−1 b = h′ ∈ H. It follows
that c−1 a = c−1 bb−1 a = h′ h ∈ H. Thus a ∼ c.
Definition 10.3. For the equivalence relation ∼L , we call the equivalence class of
an element a a left coset of a. The left coset of a is often denoted as aH, although,
for simplicity, we will usually just denote it as ā. Similarly, the equivalence class of
a under ∼R is called the right coset of a, and is often denoted as Ha. The set of
left cosets of G is denoted by G/H. The number of elements of G/H is called the
index of H in G, and is denoted by [G : H]; in other words, [G : H] = o(G/H).
In fact, there is some ambiguity in the notation, as ā = {b ∈ G|b ∼L a} by
definition, but aH = {ah|h ∈ H}. However, it is easy to see that these two sets are
the same, so the ambiguity is only apparent, not real.
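To illustrate the definition, a Python sketch computing the left cosets of a subgroup of order 2 in S3, with permutations of {0, 1, 2} encoded as tuples and (p ∘ q)(i) = p[q[i]] (an assumed encoding; the tuple (1, 0, 2) plays the role of the transposition (1, 2)):

```python
from itertools import permutations

# Sketch of Definition 10.3: the left cosets of H = <(1, 2)> in S_3, with
# permutations of {0, 1, 2} encoded as tuples, (p * q)(i) = p[q[i]]
# (an assumed encoding; (1, 0, 2) plays the role of the transposition (1, 2)).
compose = lambda p, q: tuple(p[q[i]] for i in range(3))
G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]

cosets = {frozenset(compose(a, h) for h in H) for a in G}
assert len(cosets) == 3                          # the index [G : H] = 6 / 2
assert all(len(c) == len(H) for c in cosets)     # each coset has o(H) elements
```

The cosets partition G into pieces of equal size, which is exactly the mechanism behind Lagrange's Theorem in the next results.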
Lemma 10.4. The map ϕ : ā → b̄, given by ϕ(x) = ba−1 x, is a well defined
bijection of aH onto bH.
Proof. Let x ∈ ā. Then x = ah for some h ∈ H, so ϕ(x) = ba−1 ah = bh ∈ b̄.
This shows that ϕ is well defined. Suppose that ϕ(x) = ϕ(y) for some x, y ∈ ā.
Then x = ah and y = ah′ for some h, h′ ∈ H. We compute that ϕ(x) = bh and
ϕ(y) = bh′ , so bh = bh′ and by left cancelation, we conclude that h = h′ . This
forces x = y. Thus ϕ is injective. Now let y ∈ b̄. Then y = bh for some h ∈ H. Let
x = ah. Then x ∈ ā and clearly ϕ(x) = y. Thus ϕ is surjective. This shows that ϕ
is a bijection of ā onto b̄.
Theorem 10.5 (Lagrange’s Theorem). Suppose that o(G) < ∞, and H ≤ G. Then
o(H)|o(G).
Proof. By the lemma, it follows that the number of elements in ā is independent
of a. Note that ē = H, so we have o(H) = o(ā) for all a ∈ G. Recall that the sets
ā for a ∈ G are either disjoint or coincide, and that G is the union of all such sets.
Since G is finite, it follows that G is a finite union of disjoint sets, each of which has
o(H) elements. If there are m distinct such sets, we obtain that o(G) = m o(H). It
follows that o(H)|o(G).
Corollary 10.6. If G is a finite group and g ∈ G, then o(g)|o(G).
Proof. Recall that the order of an element g is the least positive integer n such
that g n = e. But by the classification of cyclic groups, we also know that ⟨g⟩ =
{g 1 , . . . , g n }, so o(g) = o(⟨g⟩). Thus, by the theorem, o(g)|o(G).
Example 10.7. We show that there are exactly two groups of order 4, up to isomorphism. Let G be a group of order 4. If there is an element g of order 4 in G,
then G is cyclic, so G ≅ Z4 . Otherwise, every non-identity element in G has order
2. Let e be the identity element in G. Then there are three non-identity elements
in G. Denote two such distinct elements by a and b. Then ab cannot equal a, e or
b, so the fourth element is ab. Thus, we have the following Cayley Table for G.
      e    a    b    ab
 e    e    a    b    ab
 a    a    e    ab   b
 b    b    ab   e    a
ab    ab   b    a    e
Notice that the Cayley Table is symmetric, so the group corresponding to the Cayley
Table, if there is such a group, is abelian. However, to check that the Cayley Table
above gives a group would require 128 calculations to check associativity. Thus, it is
more convenient if we can find a group with this Cayley Table. The group described
by this table is not cyclic, nor is it isomorphic to Sn , for any n. Nevertheless,
we can find a subgroup of S4 which has this Cayley Table. Let a = (1, 2) and
b = (3, 4). Then a2 = b2 = e. Moreover ab = ba. It follows from this fact that the
set H = {a, b, ab, e} is a subgroup of S4 whose Cayley Table is given above.
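For readers who like to check such claims by machine, the subgroup above can be verified with a short Python sketch (an illustration, not part of the notes' development; permutations are represented as tuples p where p[i] is the image of i, acting on {0, 1, 2, 3}):

```python
# Verify that H = {e, (1 2), (3 4), (1 2)(3 4)} is a subgroup of S4
# with the Cayley table shown above.

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

e  = (0, 1, 2, 3)
a  = (1, 0, 2, 3)      # the transposition (1, 2)
b  = (0, 1, 3, 2)      # the transposition (3, 4)
ab = compose(a, b)     # the product (1, 2)(3, 4)

H = [e, a, b, ab]

# Closure: every product of two elements of H lands back in H.
assert all(compose(x, y) in H for x in H for y in H)

# Every non-identity element has order 2, so H is not cyclic.
assert all(compose(x, x) == e for x in H)

# The Cayley table is symmetric, so H is abelian.
assert all(compose(x, y) == compose(y, x) for x in H for y in H)
```

Since the group operation here is inherited from S4, associativity comes for free, which is exactly why exhibiting this subgroup is easier than checking the table directly.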
The group with Cayley Table above has several names in the literature. It is
called the Klein 4-group, after Felix Klein, 1849-1925. It is also isomorphic to the
group Z2 × Z2 , a group which we will understand when we discuss product groups.
It is also isomorphic to the Dihedral Group D2 , which is also denoted by some as
D4 . We will discuss Dihedral groups later.
11. Normal Subgroups and Quotient Groups
Proposition 11.1. Let K = ker(ϕ) be the kernel of a morphism ϕ : G → G′ . Then
if g ∈ G and x ∈ K, gxg −1 ∈ K. In other words, cg (K) = K for every g ∈ G,
where cg is the conjugation by g morphism introduced in Example 9.18.
Proof. Let g ∈ G and x ∈ K. Then ϕ(x) = e. We compute ϕ(gxg −1 ) =
ϕ(g)ϕ(x)ϕ(g)−1 = ϕ(g)eϕ(g)−1 = e. Thus gxg −1 ∈ K. To see the second statement, we note that cg (K) ⊆ K by the first statement. To see that equality holds,
suppose y ∈ K and g ∈ G. Then x = g −1 yg = cg−1 (y) ∈ K. Moreover, cg (x) = y.
Thus K ⊆ cg (K) and equality holds.
From the theorem above, we see that kernels of morphisms have a special property among subgroups. This property turns out to be very important, leading to
the definition below.
Definition 11.2. A subgroup H ≤ G is said to be normal in G if gxg −1 ∈ H
whenever x ∈ H and g ∈ G. When H is normal in G, we denote this by H ▹ G.
Since we defined a subgroup to be normal by requiring it to have a property which
we already showed that kernels of morphisms satisfy, it follows that every kernel of
a morphism is automatically a normal subgroup.
Theorem 11.3. Let H ≤ G. Then the following are equivalent.
(1) H ▹ G.
(2) gxg −1 ∈ H for all x ∈ H and g ∈ G.
(3) cg (H) ⊆ H for all g ∈ G.
(4) cg (H) = H for all g ∈ G.
Proof. The equivalence of conditions (1) and (2) is a matter of definition. That (2)
is equivalent to (3) is straightforward. That (4) implies (3) is clear. To see that (3)
implies (4), the same argument as in Proposition 11.1 works.
Example 11.4. We give an example of a subgroup which is not normal. Let
H = ⟨(1, 2)⟩ = {(1, 2), e} be the subgroup of S3 generated by the transposition
(1, 2). Then c(1,3) ((1, 2)) = (1, 3)(1, 2)(1, 3) = (2, 3) ̸∈ H, so H is not normal in
S3 .
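The conjugation computed in this example can also be checked mechanically. The following Python sketch (illustrative only; permutations are tuples acting on {0, 1, 2}) confirms that conjugating (1, 2) by (1, 3) lands outside H:

```python
# Confirm Example 11.4: H = <(1, 2)> is not normal in S3, since
# c_{(1,3)}((1, 2)) = (1, 3)(1, 2)(1, 3) = (2, 3) does not lie in H.

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

e   = (0, 1, 2)
t12 = (1, 0, 2)   # (1, 2)
t13 = (2, 1, 0)   # (1, 3)
t23 = (0, 2, 1)   # (2, 3)

H = {e, t12}
conj = compose(t13, compose(t12, inverse(t13)))
assert conj == t23       # the conjugate is (2, 3) ...
assert conj not in H     # ... which lies outside H, so H is not normal
```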
The fact that we chose a nonabelian group in order to find a subgroup which
was not normal in the group is no accident.
Theorem 11.5. Every subgroup of an abelian group is normal in the group.
Proof. Let H ≤ G and suppose G is abelian. If x ∈ H and g ∈ G, then cg (x) =
gxg −1 = xgg −1 = x ∈ H, so H ▹ G.
It is very important to notice that the theorem does not imply that an abelian
subgroup of a group is normal in the group. In fact, in Example 11.4, the subgroup
H of S3 is abelian, because it is a cyclic subgroup, but it is not normal in S3 .
We would like to explore the possibility of giving a group structure to the set of
left cosets G/H of a subgroup H of G. Remember that we denote the left coset of
g by ḡ = gH. How would we define ā · b̄? There seems to be only one natural definition:
ā · b̄ = (ab)H.
The problem with this “definition” is that it is not obvious that it is well-defined,
since this rule can be stated as “Take some element out of the first set, some
element of the second set and then multiply them and take the resulting coset”.
Of course, the answer may depend on which elements we take from the two sets.
We encountered this problem when defining addition and multiplication on Zn and
were able to resolve it there. It turns out that in the present situation, the problem
cannot always be resolved. We will break the problem down into some steps and see that
some condition on H is necessary in order for this definition to work.
Theorem 11.6. Suppose H ≤ G and that the multiplication of cosets on G/H
given by a · b = ab is well-defined. Then
• G/H is a group under this multiplication with identity e and inverse given
by (a)−1 = a−1 .
• The map π : G → G/H given by π(a) = a is a surjective morphism with
kernel H.
• H is normal in G.
• If G is finite, then o(G/H) = o(G)/o(H).
Thus, in order for the product above to be well defined, a necessary condition is
that H ▹ G. Conversely, when H ▹ G, the product above is well defined, so this
condition is sufficient as well.
Proof. Assume a · b = ab is well defined. Then
(a · b) · c = ab · c = abc = a · bc = a · (b · c).
a · e = ae = a = ea = e · a.
a · a−1 = aa−1 = e = a−1 a = a−1 · a.
Thus G/H is a group. Now define π as above, and we compute
π(ab) = ab = a · b = π(a)π(b).
Thus π is a morphism of groups. We have π(a) = ē precisely when ā = ē, which
happens if and only if a ∈ ē = H. Thus ker(π) = H. Since kernels of morphisms
are normal in their group, H ▹ G. The fact that o(G/H) = o(G)/o(H) doesn’t
depend on the group structure of G/H, but is simply a coset counting formula,
which we already used in establishing Lagrange’s Theorem.
Now, let us show that when H ▹ G, the product above is well defined. Suppose
that a1 ∈ a and b1 ∈ b. Then a1 = ah and b1 = bh′ for some h, h′ ∈ H. We then
compute
a1 b1 = ahbh′ = ab(b−1 hb)h′ ∈ abH,
since b−1 hb ∈ H, so (b−1 hb)h′ ∈ H. This shows that a1 b1 and ab lie in the same
coset, and the product is well defined.
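The dichotomy in this theorem can be seen concretely in S3, where A3 = ⟨(1, 2, 3)⟩ is normal but ⟨(1, 2)⟩ is not. The following Python sketch (illustrative only; permutations are tuples acting on {0, 1, 2}) tests the coset product for representative-independence in both cases:

```python
# Test whether (aH)(bH) = (ab)H is well defined, for the normal subgroup
# A3 = <(1,2,3)> of S3 and for the non-normal subgroup H = <(1,2)>.
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

S3 = list(permutations(range(3)))
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # the cyclic subgroup <(1,2,3)>
H  = [(0, 1, 2), (1, 0, 2)]              # <(1,2)>

def coset(g, K):
    return frozenset(compose(g, k) for k in K)

def product_well_defined(G, K):
    # The product coset must not depend on the chosen representatives.
    for a in G:
        for b in G:
            for a1 in coset(a, K):
                for b1 in coset(b, K):
                    if coset(compose(a1, b1), K) != coset(compose(a, b), K):
                        return False
    return True

assert product_well_defined(S3, A3)      # A3 is normal: well defined
assert not product_well_defined(S3, H)   # H is not normal: ill defined
```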
Definition 11.7. If H ▹G, then the group G/H, equipped with the induced product
above, is called a quotient group or factor group of G.
Example 11.8. In any group G, the improper subgroup G and the trivial subgroup
{e} are both normal in G. The group G/G has one element while the group G/{e}
is isomorphic to G. Thus, we have identified the isomorphism classes of these two
quotient groups.
Example 11.9. Let G = S3 . Then the complete list of subgroups of G is G, ⟨(1, 2)⟩,
⟨(1, 3)⟩, ⟨(2, 3)⟩, ⟨(1, 2, 3)⟩, and {e}. Of these subgroups, only G, ⟨(1, 2, 3)⟩, and
{e} are normal in G; by the computation in Example 11.4 and its obvious variants,
none of the subgroups generated by a single transposition is normal.
Example 11.10. Let n ∈ Z, and H = ⟨n⟩ = nZ be the subgroup of Z generated by
n. Then we claim that Z/nZ = Zn . This shows that Zn is a quotient group of Z.
To see this claim, note that a ≡ b (mod n) precisely when b − a ∈ ⟨n⟩. Thus the
equivalence classes mod n are precisely the left cosets of H. Also, the formula for
addition of cosets is the same in both cases. This shows that every quotient group
of Z is cyclic.
Theorem 11.11. Let n ∈ P and H = ⟨d⟩ be the subgroup of Zn generated by a
positive divisor d of n. Then Zn /H ≅ Zd . In particular, every quotient group of
Zn is cyclic.
Proof. We already showed that o(H) = n/d. Moreover, it is easy to see that Zn /H
is generated by the image of 1, so Zn /H is cyclic. To determine what cyclic group
it is isomorphic to, we need only find o(Zn /H). But we have
o(Zn /H) = o(Zn )/o(H) = n/(n/d) = d.
This shows that Zn /H ≅ Zd . Since every subgroup of Zn is of the form H = ⟨d⟩,
where d is a positive divisor of n, it follows that every quotient group of Zn is
cyclic.
Theorem 11.12. Let ϕ : G → G′ be a morphism of groups and H ≤ ker(ϕ) be
a normal subgroup of G. Then there is an induced map ϕ̄ : G/H → G′ , given by
ϕ̄(ā) = ϕ(a).
Proof. Let a1 ∈ ā, so that a1 = ah. Then
ϕ(a1 ) = ϕ(ah) = ϕ(a)ϕ(h) = ϕ(a)e = ϕ(a).
This shows that ϕ̄ is well defined. Next, we see that
ϕ̄(ā · b̄) = ϕ̄((ab)H) = ϕ(ab) = ϕ(a)ϕ(b) = ϕ̄(ā) · ϕ̄(b̄),
so ϕ is a morphism.
Theorem 11.13 (First Isomorphism Theorem for Groups). Let ϕ : G → G′ be a
morphism of groups and H = ker(ϕ). Then ϕ(G) ≅ G/H.
Proof. The map ϕ̄ : G/H → G′ is well defined, and its image is ϕ(G). Thus we
obtain a morphism G/H → ϕ(G), which is surjective, since if y ∈ ϕ(G), then
y = ϕ(g) for some g ∈ G, so that y = ϕ̄(ḡ). Moreover, this map is injective, since
if ϕ̄(ā) = e, then ϕ(a) = e, so that a ∈ H and thus ā = ē. Thus ker(ϕ̄) = {ē}.
Using the first isomorphism theorem, we see that every morphism ϕ : G → G′
factors as the composition of an injective map, an isomorphism, and a surjective
map. The injective map is the inclusion ϕ(G) ,→ G′ . (We use the symbol ,→ to
indicate an injective morphism.) The surjective map is the map G → G/ ker(ϕ).
The isomorphism is the map G/H ≅ ϕ(G) given in the First Isomorphism Theorem
for groups. Thus ϕ is given by the composition
G → G/ ker(ϕ) ≅ ϕ(G) ,→ G′ .
One consequence of all this gyration is that we can understand all morphisms
if we understand surjective and injective morphisms. Moreover, to find the set
of morphisms from G to G′ , we should study the quotient groups of G and the
subgroups of G′ . Every morphism is given by an isomorphism from one of the
quotient groups of G to a subgroup of G′ .
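The factorization above can be illustrated concretely with the sign morphism S3 → Z2 (even permutations map to 0, odd to 1). The following Python sketch (illustrative only, not part of the notes; permutations are tuples acting on {0, 1, 2}) checks that its kernel is A3 and that the cosets of the kernel match the image, as the First Isomorphism Theorem predicts:

```python
# Illustrate the First Isomorphism Theorem for the sign morphism S3 -> Z2.
from itertools import permutations

def sign(p):
    # parity of a permutation, via counting inversions
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return inv % 2

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

S3 = list(permutations(range(3)))

# sign is a morphism: sign(pq) = sign(p) + sign(q) (mod 2)
assert all(sign(compose(p, q)) == (sign(p) + sign(q)) % 2 for p in S3 for q in S3)

kernel = [p for p in S3 if sign(p) == 0]
assert len(kernel) == 3                   # ker = A3, of order 3

# The cosets of the kernel are in bijection with the image:
# o(S3)/o(A3) = 2 = o(image).
cosets = {frozenset(compose(p, k) for k in kernel) for p in S3}
image = {sign(p) for p in S3}
assert len(cosets) == len(image) == 2
```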
Example 11.14. We classify all morphisms from Z4 to S3 and all morphisms from
S3 to Z4 . We know that Z4 is abelian, so every subgroup is normal in Z4 . There
are precisely 3 subgroups of Z4 , H0 = ⟨0⟩, H1 = ⟨1⟩, and H2 = ⟨2⟩, since there are
only 3 positive divisors of 4. There are 6 subgroups of S3 , K0 = ⟨e⟩, K1 = ⟨(1, 2)⟩,
K2 = ⟨(1, 3)⟩, K3 = ⟨(2, 3)⟩, K4 = ⟨(1, 2, 3)⟩, and S3 itself. Only the subgroups K0 ,
K4 and S3 are normal in S3 .
To find all the morphisms from Z4 to S3 , note that the quotient groups of Z4
are isomorphic to Z1 , Z2 and Z4 . The group Z1 is isomorphic to the subgroup K0
of S3 , and there is exactly one isomorphism between these groups. Thus we obtain
the map ϕ0 : Z4 → S3 , given by ϕ0 (a) = e for all a ∈ Z4 , which is, of course the
trivial morphism. Next, the group Z2 is isomorphic to each of the subgroups K1 , K2
and K3 , and there is precisely one isomorphism between these groups. This gives
three more morphisms ϕ1 , ϕ2 and ϕ3 . The first one is given by ϕ1 (a) = (1, 2)^a , the
second by ϕ2 (a) = (1, 3)^a , and the third by ϕ3 (a) = (2, 3)^a . Finally, we note that
none of the subgroups of S3 are isomorphic to Z4 . Thus we have found exactly 4
morphisms from Z4 to S3 .
To find all the morphisms from S3 to Z4 , note that the quotients of S3 are
S3 /S3 ≅ Z1 , S3 /K4 ≅ Z2 , and S3 /K0 ≅ S3 . The first quotient is isomorphic to
the subgroup H0 , and gives rise to the trivial morphism from S3 → Z4 . The second
quotient is isomorphic to H2 , and gives rise to a morphism ϕ : S3 → Z4 given by
ϕ(σ) = 2 if σ ∈ {(1, 2), (1, 3), (2, 3)}, and ϕ(σ) = 0 otherwise.
Finally, the third quotient group is not isomorphic to any subgroup of Z4 . Thus
there are exactly 2 morphisms S3 → Z4 .
This example shows that even when the groups are small, the description of the
morphisms between them is quite involved.
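Since both groups are small, the counts in this example can also be confirmed by brute force: a function between finite groups is a morphism precisely when it respects the two operations. The following Python sketch (illustrative only; permutations are tuples acting on {0, 1, 2}) enumerates all functions and counts the morphisms:

```python
# Brute-force count of morphisms Z4 -> S3 and S3 -> Z4 (Example 11.14).
from itertools import permutations, product

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

Z4 = list(range(4))
S3 = list(permutations(range(3)))

def count_morphisms(G, G_op, H, H_op):
    count = 0
    for images in product(H, repeat=len(G)):
        f = dict(zip(G, images))
        if all(f[G_op(a, b)] == H_op(f[a], f[b]) for a in G for b in G):
            count += 1
    return count

z4_op = lambda a, b: (a + b) % 4
assert count_morphisms(Z4, z4_op, S3, compose) == 4   # as found above
assert count_morphisms(S3, compose, Z4, z4_op) == 2   # as found above
```

The enumeration is feasible only because 4^6 and 6^4 candidate functions are tiny; the structural argument in the example scales far better.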
Theorem 11.15. Let H and K be two normal subgroups of G. Then H ∩ K ▹ G.
Exercise 11.16. Prove the above theorem.
Definition 11.17. Let H, K ≤ G. Then define the product of the two subgroups
by HK = {hk|h ∈ H, k ∈ K}.
The product of two subgroups need not be a subgroup of G. To see this, consider
H = ⟨(1, 2)⟩, K = ⟨(1, 3)⟩ in S3 . It is an easy exercise to show that HK is
not a subgroup of S3 . For example, by direct computation, we see that HK has
4 elements, while S3 has 6 elements, so by Lagrange’s Theorem, it cannot be a
subgroup. However, we do have the following important result.
Theorem 11.18. Let H, K ≤ G and suppose that either H ▹ G or K ▹ G. Then
HK is a subgroup of G.
Exercise 11.19. Prove the above theorem.
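The direct computation mentioned above, that HK has 4 elements for H = ⟨(1, 2)⟩ and K = ⟨(1, 3)⟩ in S3, can be sketched as follows (illustrative only; permutations are tuples acting on {0, 1, 2}):

```python
# HK has 4 elements, and 4 does not divide o(S3) = 6, so by Lagrange's
# Theorem HK cannot be a subgroup of S3.

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

e = (0, 1, 2)
H = [e, (1, 0, 2)]   # <(1, 2)>
K = [e, (2, 1, 0)]   # <(1, 3)>

HK = {compose(h, k) for h in H for k in K}
assert len(HK) == 4
```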
Theorem 11.20 (Second Isomorphism Theorem for groups). Let H, K ≤ G and
suppose that K ▹ G. Then H ∩ K ▹ H, K ▹ HK, and
H/(H ∩ K) ≅ HK/K.
Proof. By Theorem 11.18, we know that HK ≤ G. Since cg (K) = K for all g ∈ G,
this also holds for all g ∈ HK. Thus K ▹ HK. If h ∈ H and x ∈ H ∩ K, then
ch (x) = hxh−1 ∈ H and ch (x) ∈ K. Thus ch (x) ∈ H ∩ K. This shows that
H ∩ K ▹ H. Define ϕ : H → HK/K by ϕ(h) = h̄. This definition makes sense
since H ⊆ HK because e ∈ K, so h = he ∈ HK for all h ∈ H. Suppose that
y ∈ HK/K. Then y = (hk)K for some h ∈ H and k ∈ K. Thus
ϕ(h) = h̄ = (he)K = (hk)K = y,
since he and hk lie in the same coset of K. This shows that ϕ is surjective. Let
h ∈ ker(ϕ). Then h̄ = ē, so h ∈ K. Thus h ∈ H ∩ K. Since any element in H ∩ K
is in the kernel of ϕ, it follows that
ker(ϕ) = H ∩ K. Thus by the first isomorphism theorem,
H/(H ∩ K) ≅ ϕ(H) = HK/K.
To prepare for the next isomorphism theorem, we need a definition.
Definition 11.21. Let X ⊆ G and H ▹ G. Then by X/H we mean the subset
{x̄ | x ∈ X} of G/H, where x̄ = xH denotes the coset of x.
Proposition 11.22. If H ▹ G and H ≤ K, then K/H ≤ G/H. Moreover, if
K ▹ G, then K/H ▹ G/H.
Proof. Since ē ∈ K/H, K/H is nonempty. Let ā, b̄ ∈ K/H, with a = k1 h1
and b = k2 h2 for some k1 , k2 ∈ K and h1 , h2 ∈ H. Then ab = k1 h1 k2 h2 =
k1 k2 (k2−1 h1 k2 )h2 , and k2−1 h1 k2 ∈ H because H ▹ G, so ab lies in the coset
(k1 k2 )H. Since ā · b̄ is the coset of ab, this shows that K/H is closed under
multiplication. Moreover, a−1 = h1−1 k1−1 = k1−1 (k1 h1−1 k1−1 ), and k1 h1−1 k1−1 ∈ H,
so (ā)−1 , which is the coset of a−1 , lies in K/H.
Thus K/H ≤ G/H.
Theorem 11.23 (Third Isomorphism Theorem for groups). Let H ≤ K ≤ G and
suppose that both H and K are normal subgroups of G, so that in particular, H ▹K.
Then
(G/H)/(K/H) ∼
= G/K.
Proof. The tricky part of the proof is to give a good notation for the equivalence
classes of elements of G in the two different quotients G/K and G/H. Let us denote
the image of a ∈ G in G/H by ā, and its image in G/K by ã. Define a map
ϕ : G/H → G/K by ϕ(ā) = ã. To see that this is well defined, note that if
π : G → G/K is the projection π(a) = ã, then H ⊆ ker(π), since H ⊆ K, so ϕ
is the map induced by π, with ϕ(ā) = π(a) = ã. Clearly ϕ is surjective. Now
suppose that ā ∈ ker(ϕ). Then a ∈ K, so ā ∈ K/H. Conversely, if ā ∈ K/H, then
a = kh for some k ∈ K and h ∈ H. Since H ⊆ K, it follows that a ∈ K, so that
ϕ(ā) = ẽ. Thus ker(ϕ) = K/H, and the induced map (G/H)/(K/H) → G/K is
surjective and injective, so is an isomorphism.
11.1. The Commutator Subgroup.
Definition 11.24. If G is a group and g, h ∈ G, then the commutator of g and h,
denoted [g, h], is the element
[g, h] = ghg −1 h−1 .
The commutator subgroup of G, denoted as G′ or [G, G], is the smallest subgroup
of G containing all commutators.
It is not true in general that the set of commutators of a group G is a subgroup.
Instead, we have the following characterization of [G, G].
Theorem 11.25. If a, b ∈ G, then [a, b]−1 = [b, a]. As a consequence,
[G, G] = { [a1 , b1 ][a2 , b2 ] · · · [an , bn ] | ai , bi ∈ G, n ∈ P }.
Proof. To see the first statement, note that
[a, b][b, a] = aba−1 b−1 bab−1 a−1 = e.
In general, if S is a nonempty subset of G, we have ⟨S⟩ = {x1 x2 · · · xn | xi ∈ S
or xi−1 ∈ S, n ∈ P}. But in this case, since the set of commutators is closed under
inverses, we have the simpler description above.
Theorem 11.26. If a, b ∈ G, then cg ([a, b]) = [cg (a), cg (b)] for all g ∈ G. As a
consequence, the commutator subgroup is a normal subgroup of G.
Theorem 11.27. Let ϕ : G → G′ be a morphism, and suppose that G′ is abelian.
Then [G, G] ≤ ker(ϕ). Moreover G/[G, G] is abelian.
Proof. First, note that
ϕ([a, b]) = ϕ(aba−1 b−1 ) = ϕ(a)ϕ(b)ϕ(a)−1 ϕ(b)−1 = ϕ(a)ϕ(a)−1 ϕ(b)ϕ(b)−1 = e,
where we used the fact that G′ is abelian to reorder the factors. Thus every
commutator [a, b] lies in ker(ϕ). It follows that any product of commutators is in
ker(ϕ), and since every element in the commutator subgroup is a product of
commutators, we have [G, G] ≤ ker(ϕ).
Let π : G → G/[G, G] be the natural projection π(a) = ā. Then for ā, b̄ ∈
G/[G, G], the element ā b̄ (ā)−1 (b̄)−1 is the coset of aba−1 b−1 = [a, b], which is ē
since [a, b] ∈ [G, G]. But this means that ā b̄ = b̄ ā, so G/[G, G] is abelian.
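As a small concrete check (a Python sketch, not part of the notes' development; permutations are tuples acting on {0, 1, 2}), the commutator subgroup of S3 can be computed directly, and comes out to be A3, consistent with S3/[S3, S3] being the abelian group of order 2:

```python
# Compute the commutator subgroup of S3 and check that it equals A3.
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0] * 3
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def commutator(a, b):
    return compose(compose(a, b), compose(inverse(a), inverse(b)))

S3 = list(permutations(range(3)))
comms = {commutator(a, b) for a in S3 for b in S3}

# Close the set of commutators under products to get [S3, S3].
subgroup = set(comms)
while True:
    new = {compose(x, y) for x in subgroup for y in subgroup} - subgroup
    if not new:
        break
    subgroup |= new

A3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}
assert subgroup == A3
```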
12. Dihedral Groups
The dihedral group Dn is often defined as the group of symmetries of the regular
n-gon. By symmetries, we mean rotations and reflections which preserve the set
of vertices and edges of the n-gon. There are several different ways of representing
the dihedral group, and in this section, we will discuss several of them.
Figure 3. Hexagon centered at the origin with one vertex at (1, 0)
The picture above illustrates a regular hexagon, centered at the origin, with one
vertex at the point (1, 0). The vertices are at the points (cos(kπ/3), sin(kπ/3)), for
k = 0, . . . , 5. More generally, for a regular n-gon, we would have n vertices at the
points (cos(2kπ/n), sin(2kπ/n)), for k = 0, . . . , n − 1. There are n rotations which
preserve the n-gon, generated by the rotation ρ through the angle of 2π/n, which
is conventionally chosen to be a counterclockwise rotation. The n rotations are
ρ, ρ2 , . . . , ρn , where ρn is the identity element, since a rotation by the angle 2π is
considered as the identity.
For a hexagon, there are 6 reflections, three of them across the lines determined
by pairs of midpoints of opposite edges, and three of them across the lines
determined by pairs of opposite vertices. The same pattern holds for any n-gon
when n is even, but for n odd, there is a different pattern. There are still n
reflections, but each is given by a reflection across a line through a vertex and
the midpoint of the opposite edge.
If we denote the reflection of the plane across the x-axis by σ, we note that for
any n-gon, this reflection is one of the symmetries. Moreover, every reflection is of
the form ρk σ, for k = 0, . . . , n − 1. Thus the complete set of symmetries of the n-gon is
{e, ρ, . . . , ρn−1 , σ, ρσ, . . . , ρn−1 σ}. This means that there are 2n symmetries of the
n-gon.
We would like to show that the set Dn of symmetries of the n-gon form a group
under composition. We have already used composition to aid in the description
of the symmetries, but we still need to show that any composition of symmetries
is another symmetry. This is clear from the geometric point of view, since the
composition of maps which preserve the edges and vertices should also preserve
the vertices and edges. However, we would like to understand this idea from an
algebraic point of view as well.
Let us see what we can work out directly from the definitions of ρ and σ. Clearly,
the elements σk = ρk σ are not rotations, because we have flipped the plane over
with σ, and the rotation ρk does not flip it over again. In fact, it is not hard to see
that the points whose coordinates are ±(cos(kπ/n), sin(kπ/n)) are preserved under
σk , so that σk preserves the line through these two points, and therefore must be a
reflection across that line.
Suppose that ϕ is any element of the dihedral group. It is enough to know
whether it is a rotation or reflection and what it does to the vertex (1, 0). We use
this fact to compute σρ. Since ρ takes (1, 0) to (cos(2π/n), sin(2π/n)), and σ has
the effect of negating the y-coordinate,
(σρ)(1, 0) = (cos(2π/n), − sin(2π/n)) = (cos(2π/n), sin(−2π/n)).
However, we can directly compute that
(ρn−1 σ)(1, 0) = ρn−1 (1, 0) = (cos(2(n − 1)π/n), sin(2(n − 1)π/n))
= (cos(2π/n), sin(−2π/n)).
It follows that σρ = ρn−1 σ, since both of these maps are reflections and they take
(1, 0) to the same vertex.
All of the 2n symmetries of the n-gon are given by orthogonal transformations of
the plane. The set of all orthogonal transformations of the plane is a group, denoted
by O(2, R), or just O(2). It consists of rotations, which are given by matrices of the form
ρθ = [ cos(θ)  −sin(θ) ]
     [ sin(θ)   cos(θ) ],
and reflections, which are given by matrices of the form
σθ = [ cos(θ)   sin(θ) ]
     [ sin(θ)  −cos(θ) ].
Let
σ = [ 1   0 ]
    [ 0  −1 ].
Then σθ = ρθ σ, by direct computation.
Moreover, by direct computation, we obtain that σρθ = ρ−θ σ.
Now, let ρ = ρθ for θ = 2π/n. It is easily computed that n is the least positive
integer such that ρn = I, the identity matrix, and that the set of 2n elements
Dn = {I, ρ, . . . , ρn−1 , σ, ρσ, . . . , ρn−1 σ} is closed under multiplication and inverses
and is nonempty. Thus Dn is a subgroup of O(2).
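The two matrix identities above are easy to confirm numerically. The following Python sketch (illustrative only) multiplies the 2×2 matrices directly and checks both relations at a few sample angles:

```python
# Numerically check sigma_theta = rho_theta * sigma and
# sigma * rho_theta = rho_{-theta} * sigma for 2x2 orthogonal matrices.
import math

def rho(t):   # rotation by angle t
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def refl(t):  # reflection sigma_theta
    return [[math.cos(t), math.sin(t)], [math.sin(t), -math.cos(t)]]

sigma = [[1, 0], [0, -1]]  # reflection across the x-axis

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, eps=1e-12):
    return all(abs(A[i][j] - B[i][j]) < eps for i in range(2) for j in range(2))

for t in (0.3, 1.1, 2.0, 5.5):
    assert close(refl(t), mul(rho(t), sigma))              # sigma_theta = rho_theta sigma
    assert close(mul(sigma, rho(t)), mul(rho(-t), sigma))  # sigma rho_theta = rho_{-theta} sigma
```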
The last incarnation of the dihedral group that we will study is as operators on
the complex plane. By an operator on the complex plane we mean a map C → C.
Let ρ be the operator defined by ρ(z) = e2πi/n z, in other words, multiplication by
the complex number e2πi/n , where eiθ = cos(θ) + i sin(θ), a property of complex
numbers that you can verify by expanding the power series for eiθ , and using the
fact that i2 = −1.
Let σ be the conjugation operator σ(z) = z̄. Then the dihedral group Dn is the
subgroup of operators on the complex plane generated by ρ and σ. Since ρ and σ
are invertible maps, with the inverse to ρ given by multiplication by e−2πi/n , and
σ being its own inverse, Dn is a subgroup of the permutations of C. Moreover ρk
is multiplication by e2kπi/n , so ρn = 1, the operator of multiplication by 1, which
is the identity operator on C.
Next, we compute
(σ ◦ ρ)(z) = σ(ρ(z)) = σ(e2πi/n z) = e−2πi/n z̄ = ρ−1 (z̄) = ρ−1 (σ(z)) = (ρ−1 ◦ σ)(z),
using the fact that the conjugate of a product is the product of the conjugates.
From this we conclude that σρ = ρ−1 σ. It follows from this relation that Dn =
{e, ρ, . . . , ρn−1 , σ, ρσ, . . . , ρn−1 σ}, and that these 2n elements are distinct.
No matter which of the three models one makes of Dn , we obtain that it has 2n
elements, that ρ has order n, σ has order 2, and σρ = ρ−1 σ. In fact, these three
properties are sufficient to compute all products in Dn , and therefore to construct
the Cayley Table of Dn .
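These three properties are indeed enough to compute any product. The following Python sketch (an illustration, not part of the notes) models Dn abstractly: each element is a pair (k, m) standing for the normal form ρ^k σ^m with 0 ≤ k < n and m ∈ {0, 1}, and products are reduced using the rule σρ = ρ^{−1}σ:

```python
# Model D_n via the normal form rho^k sigma^m, multiplying with the
# relations rho^n = e, sigma^2 = e, sigma rho = rho^{-1} sigma.

def make_dihedral(n):
    def mul(x, y):
        (k1, m1), (k2, m2) = x, y
        # moving rho^{k2} past sigma^{m1} negates its exponent m1 times
        k = (k1 + (-k2 if m1 else k2)) % n
        return (k, (m1 + m2) % 2)
    elements = [(k, m) for m in (0, 1) for k in range(n)]
    return elements, mul

Dn, mul = make_dihedral(6)
assert len(Dn) == 12                           # o(D_n) = 2n

rho, sigma, e = (1, 0), (0, 1), (0, 0)
assert mul(sigma, rho) == mul((5, 0), sigma)   # sigma rho = rho^{n-1} sigma
assert mul(sigma, sigma) == e                  # sigma has order 2

# Associativity of the reduced multiplication (a sanity check):
assert all(mul(mul(a, b), c) == mul(a, mul(b, c))
           for a in Dn for b in Dn for c in Dn)
```

From this table of products, the full Cayley Table of Dn can be printed mechanically for any n.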
For some low values of n, we obtain some isomorphisms between Dn and some
other groups. D1 has exactly 2 elements, e and σ, so D1 ≃ Z2 . D2 is isomorphic
to the Klein 4-group. D3 ≃ S3 , which can be verified by comparing Cayley tables.
There is a morphism Dn → Sn , since every symmetry of the n-gon induces a
permutation of the n vertices of the n-gon. When n ≥ 3, this map is injective.
The isomorphism between D3 and S3 can be seen in this way. When n > 3, the
morphism Dn → Sn is not surjective, as o(Dn ) = 2n < n! = o(Sn ) when n > 3.
The model for Dn as the symmetries of the n-gon has some difficulty when n
is 1 or 2. To rescue the model, let us consider the n-gon to be defined as follows.
Take the unit circle and mark n equally spaced points on it, in other words, the
points (cos(2kπ/n), sin(2kπ/n)) for k = 0, . . . , n − 1. We will call these n points
the vertices of the n-gon. Define the group Dn as the symmetries of the circle,
which preserve this set of n vertices. In this way, the 1-gon is a circle with one
point marked on it, and the 2-gon is a circle with two diametrically opposite points
marked on it.
Note that some mathematicians denote the group Dn as D2n . This kind of
conflicting terminology is not unusual in mathematics and reflects traditions coming
from different branches of mathematics.
13. Direct Products and Semidirect Products
Definition 13.1. Let G and H be groups. Then the set G × H, equipped with the
binary operation (g, h)(g ′ , h′ ) = (gg ′ , hh′ ), is called the direct product of G and H.
More generally, if Gλ is a collection of groups for λ ∈ Λ, then the set ∏λ∈Λ Gλ ,
equipped with the binary operation (gλ )(gλ′ ) = (gλ gλ′ ), is called the direct product
of the groups Gλ .
Theorem 13.2. The direct product of groups is a group.
Proof. We give the proof for two groups G and H. The proof for a collection of
groups is similar. First, we check that the binary operation is associative.
(g, h)((g ′ , h′ )(g ′′ , h′′ )) = (g, h)(g ′ g ′′ , h′ h′′ ) = (gg ′ g ′′ , hh′ h′′ ) = (gg ′ , hh′ )(g ′′ , h′′ )
= ((g, h)(g ′ , h′ ))(g ′′ , h′′ ).
Next we check that (e, e) is the identity in G × H.
(e, e)(g, h) = (eg, eh) = (g, h) = (ge, he) = (g, h)(e, e).
Finally we show that (g, h)−1 = (g −1 , h−1 ).
(g, h)(g −1 , h−1 ) = (gg −1 , hh−1 ) = (e, e) = (g −1 g, h−1 h) = (g −1 , h−1 )(g, h).
Theorem 13.3. G × H is abelian if and only if both G and H are abelian groups.
Exercise 13.4. Prove the above theorem.
Theorem 13.5. If (g, h) ∈ G × H, then (g, h)k = (g k , hk ) for all k ∈ Z.
Exercise 13.6. Prove the theorem above.
Example 13.7. Let G = Z2 and H = Z3 . Then o(G × H) = o(G)o(H) = 6. We
have found two groups of order 6, up to isomorphism, Z6 and S3 . It turns out that
these are the only two groups of order 6 up to isomorphism. However, we have not
shown that, so let us see what we can determine. First, since both Z2 and Z3 are
abelian, we know that G × H is abelian. Thus we can rule out the possibility that
G × H is isomorphic to S3 . On the other hand, if G × H ≅ Z6 , it would have to
have an element of order 6. This motivates us to study the orders of the elements
in G × H. It is easy to calculate that k(1, 1) = (k, k), so this is (0, 0) precisely
when k = 0 (mod 2) and k = 0 (mod 3). But this means that both 2 and 3 must
divide k. Since 2 and 3 are relatively prime, this forces their product to divide k,
so k is a multiple of 6. The least positive multiple of 6 is 6, so we see that (1, 1)
has order 6. But that means that G × H is cyclic, and so it is isomorphic to Z6 .
Theorem 13.8. Let g ∈ G and h ∈ H be elements of finite order. Then the order
of (g, h) in G × H is the least common multiple of o(g) and o(h). On the other
hand, if either g or h have infinite order, then so does (g, h).
Proof. Let m = o(g) and n = o(h), and c = lcm(m, n). Now c = mx and c = ny
for some x, y ∈ Z, so
(g, h)c = (g c , hc ) = (g mx , hny ) = ((g m )x , (hn )y ) = (ex , ey ) = (e, e).
This shows that o(g, h)|c. On the other hand, if (g, h)k = (e, e), then g k = e and
hk = e, so m|k and n|k, which means that c|k, since c is the least common multiple
of m and n. Since the order of the element is the least positive power k such that
(g, h)k = e, it follows that c|o(g, h). Thus the two are equal. When g or h has
infinite order, it is impossible that (g, h)k = e for any positive integer k, because in
that case we would have g k = e and hk = e, which would imply that both of them
have finite order.
In the example above, we saw that Z2 × Z3 ≅ Z6 . In fact, this is a consequence
of the more general result below.
Theorem 13.9. Zm × Zn ≅ Zmn precisely when gcd(m, n) = 1.
Proof. Suppose that Zm × Zn ≅ Zmn . Then there is an element (a, b) of order mn.
But the maximum possible order of any element in Zm × Zn is lcm(m, n) ≤ mn. It
follows that lcm(m, n) = mn. Now lcm(m, n) = mn/ gcd(m, n), so this forces
gcd(m, n) = 1. On the other hand, when gcd(m, n) = 1, then o(1, 1) = lcm(m, n) = mn,
so Zm × Zn is a cyclic group of order mn, and thus it is isomorphic to Zmn .
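The criterion in Theorem 13.9 is easy to test by machine for small values. The following Python sketch (illustrative only) searches Zm × Zn for an element of order mn and compares the result with the gcd condition:

```python
# Check: Z_m x Z_n is cyclic (has an element of order mn) exactly when
# gcd(m, n) = 1, for small m and n.
from math import gcd

def order(a, b, m, n):
    """Order of (a, b) in Z_m x Z_n under componentwise addition."""
    k, x = 1, (a % m, b % n)
    while x != (0, 0):
        x = ((x[0] + a) % m, (x[1] + b) % n)
        k += 1
    return k

def is_cyclic(m, n):
    return any(order(a, b, m, n) == m * n for a in range(m) for b in range(n))

for m in range(1, 7):
    for n in range(1, 7):
        assert is_cyclic(m, n) == (gcd(m, n) == 1)
```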
One problem with the notion of direct product is that in order for a group to be
such a direct product, it needs to consist of ordered pairs. This is extremely unlikely,
so at first glance, the notion of direct product does not seem to be so powerful. But
the theorem above shows that the key idea is not being a direct product, but being
isomorphic to one. Knowing that Zmn is isomorphic to Zm × Zn when m
and n are relatively prime tells us a lot about the structure of that group. The
following theorem gives a method of characterizing when a group is isomorphic to
a direct product.
Theorem 13.10 (Fundamental Theorem on Direct Products). Let H and K be
subgroups of a group G. Assume the following three conditions hold.
(1) H ∩ K = {e}.
(2) H ▹ G and K ▹ G.
(3) HK = G.
Then the map ϕ : H × K → G, given by ϕ(h, k) = hk is an isomorphism between
H × K and G. We can replace (2) by
(2′ ) hk = kh for all h ∈ H, k ∈ K.
Moreover, when o(G) < ∞, we can replace (3) by
(3′ ) o(G) = o(H)o(K).
Proof. We first show that if (1) and (2) hold, then (2′ ) holds. Let h ∈ H and
k ∈ K. Note that hk = kh if and only if hkh−1 k −1 = e. Now hkh−1 k −1 lies in
K because hkh−1 is in K. Similarly, it lies in H because kh−1 k −1 is in H. Thus
hkh−1 k −1 lies in H ∩ K, so we must have hkh−1 k −1 = e.
Next, we show that if (2′ ) and (3) hold, then (2) holds. Let g ∈ G. By (3), we
must have g = hk for some h ∈ H and k ∈ K. Let x ∈ H. Then
gxg −1 = hkxk −1 h−1 = hkk −1 xh−1 = hxh−1 ∈ H.
Thus H ▹ G. By a similar argument K ▹ G.
Suppose that (1) holds. Then we claim that ϕ is injective. Suppose that hk =
h′ k ′ for some h′ ∈ H and k ′ ∈ K. Then (h′ )−1 h = k ′ k −1 ∈ H ∩ K, which implies
that (h′ )−1 h = k ′ k −1 = e, from which we deduce that h = h′ and k = k ′ . Thus ϕ
is injective.
Suppose (1) and (3) hold, and that o(G) < ∞. Since (1) implies that ϕ is
injective, and (3) implies that ϕ is surjective, we have that ϕ is a bijection, so
o(G) = o(H)o(K). Thus (3′ ) holds.
On the other hand, suppose (1) and (3′ ) hold and that o(G) < ∞. Since ϕ is
injective, and the sets G and H × K have the same number of elements, ϕ must be
surjective. Thus G = HK. Thus (3) holds.
Now, we have already seen that if (1) holds, ϕ is injective, and if (3) holds, ϕ
is surjective. Thus it remains to show that ϕ is a morphism. If (2) holds
as well, then we know that (2′ ) holds. Thus we will use (2′ ) to show that ϕ is a
morphism. We have
ϕ((h, k)(h′ , k ′ )) = ϕ((hh′ , kk ′ )) = hh′ kk ′ = hkh′ k ′ = ϕ((h, k))ϕ((h′ , k ′ )).
Thus ϕ is a morphism.
Notice that in the above proof, if conditions (1) and (2′ ) hold, then ϕ is still an
injective morphism. Thus, whether (3) holds or not, ϕ is an isomorphism of H × K
onto its image, which is a subgroup of G. The same holds if (1) and (2) are assumed,
since we showed that (1) and (2) together imply (2′ ). The point to be careful about
is that it was condition (2′ ) that we needed to show that ϕ was a morphism, and
condition (3) was what allowed us to deduce (2) from (2′ ).
Since we are mainly concerned with finite groups, it is usually much easier to
show that condition (3′ ) holds than condition (3). In fact, if we don’t know what
o(H) and o(K) are, we probably don’t have enough information about the two
subgroups to conclude anything.
Definition 13.11. If H and K are subgroups of G and the map ϕ : H × K → G
given by ϕ(h, k) = hk is an isomorphism of groups, then we say that G is the internal
direct product of H and K, and we also write G = H × K.
Notice that when we express an internal direct product in the form G = H × K,
we don’t mean that G consists of ordered pairs of elements of H and K. Thus there
is some ambiguity about whether H × K means the internal direct product or the
external direct product, which is given by these ordered pairs of elements. However,
this ambiguity is not so much of a problem as the two groups are isomorphic in a
natural manner.
Example 13.12. Consider the group R∗ of nonzero real numbers (under multiplication). Let H = {±1} and K = R+ , the subgroup of positive real numbers. Clearly
H ∩K = {1}. Moreover, every subgroup of an abelian group is normal, so condition
(2) is satisfied. (Of course, condition (2′ ) is even more obvious.) Finally, every
element of R∗ is in HK. Thus R∗ = H × K. We also express this as R∗ = Z2 × R+ ,
since H ≅ Z2 .
Example 13.13. We give another proof that Zmn ≅ Zm × Zn when gcd(m, n) = 1.
We note that m has order n and n has order m in Zmn . Let H = ⟨m⟩ and K = ⟨n⟩.
Suppose x ∈ H ∩ K. Then o(x) must divide both m and n, since it divides the
orders of each of those groups, being a member of both of them. Thus o(x) = 1, so
x = 0. It follows that H ∩ K = {0}, so condition (1) holds. Condition (2) holds since
Zmn is abelian. Condition (3′ ) holds since o(Zmn ) = mn = o(H)o(K).
Thus Zmn = H × K ≅ Zm × Zn .
Exercise 13.14. Show that D2n ≅ Dn × Z2 when n is odd. To do this, show that
if we let r = ρ2 , then the subgroup H = ⟨r, σ⟩ ≅ Dn . Let K = ⟨ρn ⟩. Show that
K ≅ Z2 . Then show that the hypotheses of the Fundamental Theorem on Direct
Products are satisfied.
Exercise 13.15. Recall that GL(n, R) is the group of n × n invertible matrices
with real coefficients, and SL(n, R) is the subgroup of matrices with determinant 1.
Suppose that n is odd, so that det(−I) = −1. Let H = {±I}, and K = SL(n, R).
Show that GL(n, R) = H × K.
Before the introduction of the notion of direct product, we did not know many
groups. We have studied the groups Zn , Sn , Dn and some matrix groups. With
the introduction of the direct product, we obtain a lot more groups. For example,
every finite abelian group (actually every finitely generated abelian group) can
be expressed as a direct product of cyclic groups. This statement is called the
Fundamental Theorem of Finitely Generated Abelian Groups.
The proof of the Fundamental Theorem of Finite Abelian Groups is long, but
the applications of the theorem are fairly straightforward.
The construction of groups by direct products is still not enough to classify
all finite groups. We need a more powerful tool, called the semidirect product.
Even this tool, which we introduce next, is not enough to classify finite groups,
but it is a much more powerful tool than the direct product. The basic strategy
in constructing finite groups is as follows. Suppose that you want to construct
all groups of order n, and you do know all groups of order less than n, up to
isomorphism. Then we would like to be able to construct groups of order n from
groups of smaller order. The first case we need to consider is the case when our
group has no proper nontrivial normal subgroups.
Definition 13.16. Let G be a finite group. Then G is said to be a simple group if
there are no proper nontrivial normal subgroups of G.
The first step in classifying groups is to determine all of the simple groups. The
classification of simple groups turned out to be a very hard problem. Its solution
was originally announced in the 1980s, but there were some problems with this
solution. Then, in the 1990s, it was thought to have been completely solved, but
again there were some problems with the proofs. According to Wikipedia, the
complete classification of finite simple groups was completed in 2008. Nevertheless,
the proof is enormous, spread over hundreds of journal articles, and a complete,
unified account has not yet been published.
The complete list of simple groups contains several families of simple groups,
and 26 special cases, called sporadic groups. We already know one of the families,
Zp for p prime, because the only subgroups of Zp are the trivial subgroup and the
improper subgroup. This is because if 1 ≤ m < p, then m is a generator of Zp . The
family of groups An , the subgroup of even permutations in Sn , is another family
of simple groups for n ≥ 5. It was a very important discovery of Évariste Galois
(1811-1832) that A5 is a simple group. This was the key fact in his proof that
there can be no formula for the solutions of a general quintic polynomial
in terms of radicals.
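The simplicity of Zp noted above can be checked directly: every nonzero element of Zp generates the whole group, while for composite n this fails. A short Python sketch (the function name is ours):

```python
def generates(m, n):
    """Does m generate the cyclic group Z_n under addition?"""
    return len({(m * i) % n for i in range(n)}) == n

# p = 7 is prime: every nonzero element generates, so Z_7 has no
# proper nontrivial subgroups and is simple.
assert all(generates(m, 7) for m in range(1, 7))
# n = 6 is composite: <2> = {0, 2, 4} is a proper nontrivial subgroup.
assert not generates(2, 6)
```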
The largest sporadic group, called the Monster Group, was only constructed in
1982 (although its existence was predicted in the 1970s). It has order
2⁴⁶ · 3²⁰ · 5⁹ · 7⁶ · 11² · 13³ · 17 · 19 · 23 · 29 · 31 · 41 · 47 · 59 · 71 ≈ 8 × 10⁵³.
This group is so large that even calculating things like the product of two elements
in the group is extremely complicated. A great deal of study of this monster group
is still ongoing.
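While multiplying two elements of the Monster is hard, its order is just an integer, and the estimate 8 × 10⁵³ can be sanity-checked from the prime factorization in a few lines of Python:

```python
# Order of the Monster group, computed from its prime factorization.
factors = {2: 46, 3: 20, 5: 9, 7: 6, 11: 2, 13: 3,
           17: 1, 19: 1, 23: 1, 29: 1, 31: 1, 41: 1, 47: 1, 59: 1, 71: 1}
order = 1
for p, e in factors.items():
    order *= p ** e
assert 8.0e53 < order < 8.1e53   # a 54-digit number, ≈ 8 × 10^53
```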
If α : K → Aut(H) is a morphism from K to the automorphism group of a group
H, then it is typical to denote the automorphism α(k) by αk.
Definition 13.17. Suppose that α : K → Aut(H) is a morphism from the
group K to the automorphism group of the group H. Then the semidirect product
of H and K determined by α, denoted by H ⋊α K, is the set H × K equipped with
the binary operation
(h, k)(h′, k′) = (hαk(h′), kk′).
When the map α is implicit, we usually write H ⋊ K instead of H ⋊α K.
Theorem 13.18. The semidirect product H ⋊α K is a group under the binary
operation introduced above.
Proof. To see that associativity holds, we compute
((h, k)(h′, k′))(h′′, k′′) = (hαk(h′), kk′)(h′′, k′′) = (hαk(h′)αkk′(h′′), kk′k′′)
= (hαk(h′)αk(αk′(h′′)), kk′k′′) = (hαk(h′αk′(h′′)), kk′k′′)
= (h, k)(h′αk′(h′′), k′k′′) = (h, k)((h′, k′)(h′′, k′′)).
It is natural to guess that the identity is (e, e), and we verify this by
(e, e)(h, k) = (eαe (h), ek) = (αe (h), k) = (1H (h), k) = (h, k)
(h, k)(e, e) = (hαk (e), ke) = (he, k) = (h, k).
It is not so obvious what (h, k)−1 should be, so let us solve for it. Suppose that
(h, k)(x, y) = (e, e). Then
(e, e) = (h, k)(x, y) = (hαk (x), ky).
Thus y = k −1 and αk (x) = h−1 , so applying αk−1 to both sides, we obtain that x =
αk−1 (h−1 ). Thus (x, y) = (αk−1 (h−1 ), k −1 ). We need to verify that (x, y)(h, k) =
(e, e). But
(x, y)(h, k) = (αk−1 (h−1 ), k −1 )(h, k) = (αk−1 (h−1 )αk−1 (h), k −1 k)
= (αk−1 (h−1 h), e) = (αk−1 (e), e) = (e, e).
Note that a direct product is a special case of a semidirect product, where the
map α is the trivial morphism from K to Aut(H), because in that case we
have
(h, k)(h′, k′) = (hαk(h′), kk′) = (h1H(h′), kk′) = (hh′, kk′).
As is the case for direct products, it is uncommon for a group to literally consist
of ordered pairs, so a given group rarely equals a semidirect product on the nose.
What is more important is when a group is isomorphic to a semidirect product.
The following theorem characterizes when G is isomorphic to a semidirect product.
Theorem 13.19. Suppose that H and K are subgroups of G satisfying
(1) H ∩ K = {e}.
(2) H ▹ G.
(3) HK = G.
Let α : K → Aut(H) be given by αk(h) = khk−1; that is, αk is the automorphism
of H obtained by restricting conjugation by k to H. Then the map
H ⋊α K → G given by (h, k) ↦ hk is an isomorphism. If o(G) < ∞, then we may
replace condition (3) by the condition
(3′) o(G) = o(H)o(K).
Proof. The fact that (3) is equivalent to (3′) when (1) holds is proved in the same
way it was for direct products. Let ϕ : H ⋊α K → G be given by ϕ(h, k) = hk. Then
ϕ((h, k)(h′, k′)) = ϕ(hkh′k−1, kk′) = hkh′k−1kk′ = hkh′k′ = ϕ(h, k)ϕ(h′, k′).
For injectivity, if hk = h′k′, then (h′)−1h = k′k−1 ∈ H ∩ K = {e}, so h = h′
and k = k′. Surjectivity is immediate from (3).
Example 13.20. Suppose that n ≥ 2. Let H = An be the alternating subgroup of
the permutation group Sn. We know that An ▹ Sn and that o(An) = n!/2. Let
K = ⟨(12)⟩ = {(12), e} ≅ Z2. Since o(H)o(K) = n! = o(Sn) and H ∩ K = {e},
we see that Sn = An ⋊ Z2. Thus Sn is a semidirect product. Since An is simple
for n ≥ 5, this gives a decomposition of Sn as a semidirect product of two simple
groups.
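The decomposition in Example 13.20 can be verified computationally for a small case. The following Python sketch (names are ours) checks, for n = 4, that every element of S4 arises exactly once as a product hk with h ∈ A4 and k ∈ {e, (12)}:

```python
from itertools import permutations

def compose(p, q):
    """Composition of permutations in one-line notation: (p ∘ q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def is_even(p):
    """Parity via the inversion count."""
    inv = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return inv % 2 == 0

n = 4
identity = tuple(range(n))
swap01 = (1, 0) + tuple(range(2, n))    # the transposition (12), zero-indexed
A_n = [p for p in permutations(range(n)) if is_even(p)]
K = [identity, swap01]

# Every product hk is distinct, and together they exhaust S_n:
products = {compose(h, k) for h in A_n for k in K}
assert products == set(permutations(range(n)))
assert len(A_n) * len(K) == len(products)   # |An| · |K| = n!/2 · 2 = n!
```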
Exercise 13.21. Show that Dn ≃ Zn ⋊ Z2.
Example 13.22. Let V be a vector space over a field k. Then GL(V) is the group
of invertible linear transformations from V to V. Let v ∈ V and A ∈ GL(V). Then
the map TA,v : V → V given by TA,v(x) = Ax + v is called an affine transformation
of V.
of V . We show that the set Aff(V ) of affine transformations of V is a group under
composition. Since
(TA,v ◦ TB,w)(x) = TA,v(TB,w(x)) = TA,v(Bx + w)
= A(Bx + w) + v = (AB)x + Aw + v = TAB,Aw+v(x),
we see that the composition of two affine transformations is another affine transformation.
Clearly, the identity map TI,0 is an affine transformation, and
TA,v ◦ TI,0 = TA,v = TI,0 ◦ TA,v ,
so e = TI,0 is the identity in Aff(V ). Finally,
TA,v ◦ TA−1 ,−A−1 v = TI,0 = TA−1 ,−A−1 v ◦ TA,v ,
so (TA,v)−1 = TA−1,−A−1v. Thus Aff(V) is a group.
Next, we let H = {TI,v | v ∈ V}. Denote an element TI,v by Tv and call it the
translation by v. It is easy to see that Tv ◦ Tw = Tv+w, (Tv)−1 = T−v, and that
e ∈ H, so H is a subgroup of Aff(V). It is also straightforward to see that the map
T : V → H given by T(v) = Tv is an isomorphism, so H ≃ V. Next, we compute
TA,v ◦ Tw ◦ TA−1 ,−A−1 v = TA,v ◦ TA−1 ,w−A−1 v = TI,Aw ,
which shows that H ▹ Aff(V). Let K = {TA,0 | A ∈ GL(V)}. We have
TA,0 ◦ TB,0 = TAB,0, (TA,0)−1 = TA−1,0, and e ∈ K, which shows that K ≤ Aff(V).
Next, note that if TA,0 = Tv , then A = I and v = 0, so H ∩K = {e}. Finally, note
that TA,v = Tv ◦ TA,0 , which shows that HK = Aff(V ). Thus all three conditions
of the theorem on semidirect products are satisfied and Aff(V) = H ⋊ K. Since
H ≃ V and K ≃ GL(V ), we express this fact in the form
Aff(V) = V ⋊ GL(V).
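The composition rule for affine maps, TA,v ◦ TB,w = TAB,Aw+v, can be checked numerically. The following Python sketch works over V = R² with plain nested lists rather than a matrix library; the helper names are ours:

```python
# An affine map T_{A,v}(x) = Ax + v is stored as the pair (A, v).
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(A, x):
    return [sum(A[i][k] * x[k] for k in range(2)) for i in range(2)]

def apply(T, x):
    A, v = T
    return [a + b for a, b in zip(matvec(A, x), v)]

def compose(T1, T2):
    """(A, v) ∘ (B, w) = (AB, Aw + v), the semidirect-product rule."""
    (A, v), (B, w) = T1, T2
    return matmul(A, B), [a + b for a, b in zip(matvec(A, w), v)]

T1 = ([[2, 0], [0, 3]], [1, 1])
T2 = ([[0, 1], [1, 0]], [4, 5])
x = [7, 9]
# Composing the pairs agrees with composing the maps pointwise:
assert apply(compose(T1, T2), x) == apply(T1, apply(T2, x))
```

Note that the second component of the composite is Aw + v, not v + w: the linear part of the outer map twists the inner translation, which is exactly why Aff(V) is a semidirect rather than a direct product.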
Department of Mathematics, University of Wisconsin-Eau Claire, Eau Claire, WI
54729 USA
E-mail address: [email protected]