Download lecture notes

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Gröbner basis wikipedia , lookup

Polynomial wikipedia , lookup

Basis (linear algebra) wikipedia , lookup

Complexification (Lie group) wikipedia , lookup

Congruence lattice problem wikipedia , lookup

Modular representation theory wikipedia , lookup

Tensor product of modules wikipedia , lookup

Algebraic K-theory wikipedia , lookup

Algebraic variety wikipedia , lookup

Deligne–Lusztig theory wikipedia , lookup

Polynomial greatest common divisor wikipedia , lookup

Birkhoff's representation theorem wikipedia , lookup

System of polynomial equations wikipedia , lookup

Homological algebra wikipedia , lookup

Field (mathematics) wikipedia , lookup

Cayley–Hamilton theorem wikipedia , lookup

Factorization wikipedia , lookup

Eisenstein's criterion wikipedia , lookup

Commutative ring wikipedia , lookup

Factorization of polynomials over finite fields wikipedia , lookup

Polynomial ring wikipedia , lookup

Fundamental theorem of algebra wikipedia , lookup

Algebraic number field wikipedia , lookup

Transcript
Introduction to Algebra
Rui L. Fernandes
University of Illinois at Urbana-Champaign
Manuel Ricou
Instituto Superior Técnico
April 14, 2017
2
Contents
1 Basic Algebraic Structures
1.1 Introduction . . . . . . . . . . . . . . . . . . .
1.2 Groups . . . . . . . . . . . . . . . . . . . . . .
1.3 Permutations . . . . . . . . . . . . . . . . . .
1.4 Homomorphisms and Isomorphisms . . . . . .
1.5 Rings, Integral Domains and Fields . . . . . .
1.6 Homomorphisms and Isomorphisms of Rings .
1.7 The Quaternions . . . . . . . . . . . . . . . .
1.8 Symmetries . . . . . . . . . . . . . . . . . . .
2 The
2.1
2.2
2.3
2.4
2.5
2.6
2.7
2.8
2.9
Integers
The Axioms of the Integers . . . . . . . .
Inequalities and Ordered Rings . . . . . .
Principle of Induction . . . . . . . . . . .
Summations and Products . . . . . . . . .
Factors, Multiples and Division . . . . . .
Ideals and Euclides Algorithm . . . . . . .
The Fundamental Theorem of Arithmetic
Congruences . . . . . . . . . . . . . . . . .
Prime Factorizations and Cryptography .
3 Other Examples of Rings
3.1 The Rings Zm . . . . . . . . . . . . .
3.2 Fractions and Rational Numbers . .
3.3 Polynomials and Power Series . . . .
3.4 Polynomial Functions . . . . . . . .
3.5 Division of Polynomials . . . . . . .
3.6 The Ideals of K[x] . . . . . . . . . .
3.7 Divisibility and Prime Factorization
3
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
7
7
13
19
24
33
42
50
54
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
65
65
70
77
83
88
93
103
109
119
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
125
125
137
142
150
156
164
168
4
CONTENTS
3.8
Factorization in D[x] . . . . . . . . . . . . . . . . . . . . . . . 182
4 Quotients and Isomorphisms
4.1 Groups and Equivalence Relations . .
4.2 Quotient Group and Quotient Ring . .
4.3 Real and Complex Numbers . . . . . .
4.4 Isomorphism Theorems for Groups . .
4.5 Isomorphism Theorems for Rings . . .
4.6 Free Groups, Generators and Relations
5 Finite Groups
5.1 Groups of Transformations . .
5.2 The Sylow Theorems . . . . . .
5.3 Nilpotent and Solvable Groups
5.4 Simple Groups . . . . . . . . .
5.5 Symmetry Groups . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
6 Modules
6.1 Modules over a ring . . . . . . . . . . .
6.2 Linear Independence . . . . . . . . . . .
6.3 Tensor Products . . . . . . . . . . . . .
6.4 Modules over Integral Domains . . . . .
6.5 Finitely Generated Modules over a PID
6.6 Classifications . . . . . . . . . . . . . . .
6.7 Categories and Functors . . . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
7 Galois Theory
7.1 Field Extensions . . . . . . . . . . . . . .
7.2 Compass-and-Straightedge Constructions
7.3 Splitting Fields . . . . . . . . . . . . . . .
7.4 Homomorphisms of Extensions . . . . . .
7.5 Separability . . . . . . . . . . . . . . . . .
7.6 The Galois Group . . . . . . . . . . . . .
7.7 The Galois Correspondence . . . . . . . .
7.8 Applications . . . . . . . . . . . . . . . . .
8 Commutative Algebra (not yet available)
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
189
. 189
. 197
. 205
. 213
. 221
. 228
.
.
.
.
.
245
. 245
. 251
. 259
. 265
. 270
.
.
.
.
.
.
.
281
. 281
. 290
. 295
. 304
. 310
. 320
. 326
.
.
.
.
.
.
.
.
333
. 335
. 338
. 343
. 350
. 354
. 358
. 364
. 370
381
CONTENTS
A Set
A.1
A.2
A.3
A.4
Theory
Relations and Functions . . . . . . .
Axiom of Choice, Zorn’s Lemma and
Finite Sets . . . . . . . . . . . . . . .
Infinite Sets . . . . . . . . . . . . . .
5
. . . . . .
Induction
. . . . . .
. . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
383
. 383
. 390
. 396
. 401
6
CONTENTS
Chapter 1
Basic Algebraic Structures
1.1
Introduction
Algebra is still today, as it always has been, the study of operations, rules
and methods to solve equations. The origin of the term “Algebra” itself
is specially enlightening. According to B. L. van der Waerden, one of the
outstanding algebraists of the XX Century, the term was used for the first
time by in the IX Century the Arab mathematician al-Khowarizmi, in the
title of a monograph 1 which aimed at presenting mathematical knowledge
of practical application. The Arab word al-jabr is used in this monograph
to denote two basic procedures in solving equations, namely:
1. adding the same positive quantity to both sides of an equation, in
order to cancel negative quantities, and
2. multiplying both sides of the equation by the same positive quantity
in order to cancel fractions.
The aforementioned monograph discusses problems of a very different nature, ranging from geometric problems to the solution of linear and quadratic
equations, and includes also a variety of applications to astronomy, to commerce, to calendars, etc. With time, the term al-jabr, or Agebra, became
associated with the general area of knowledge that deals with operations
and numerical equations.
1
The title of the book was “Al-jabr wa’l muqābala”. His author, Mohammed alKhowarizmi was an astronomer and mathematician, member of the Bait al-hikma (“House
of Wisdom”) established in Baghdad, the center of scientific knowledge in those times. The
name al-Khowarizmi is also the origin of another common term: algorism!
7
8
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
Nowadays, one sometimes distinguishes Classic and Modern Algebra.
Such expressions are not very fortunate, and according the the Italian mathematician F. Severi today’s Modern Algebra will soon become the Classic
Algebra of tomorrow. Actually, when one compares, e.g, the XXI Century
Algebra with the XVI Century Algebra, and one eliminates differences common to all branches of scientific knowledge (rigor, formalism, notations, the
amount of knowledge), the only remaining difference is the generality in
which problems are stated and treated.
Actually, a tendency to generalization has always been present in Algebra. Initially, this was the case with the successive generalizations of the
concept of number: first from natural to positive rational, then to negative
numbers, irrational numbers and complex numbers. In the XIX Century,
mathematicians realized that the so-called “algebraic ideas” could be applied
to objects which are not numbers, such as vectors, matrices and transformations.
The initial slow expansion of the scope of Algebra turned into a rapid
explosion when it became clear that one could study properties of any algebraic operation without specifying the nature of the objets on which the
operation applies or how to calculate the result of the operation. In fact, this
study can be performed simply by postulating (i.e., assuming as hypothesis) a certain numbers of basic algebraic properties of the operation, such
as commutativity or associativity. Algebra then finally became axiomatic
(although with a lag of more than 2000 years relative to Geometry). This
was perhaps the most significative achievement of recent times, and justifies
the use of the name “Abstract Algebra”, to which nowadays we often refer
to Algebra.
The axiomatization of Algebra required, first of all, precise definitions
of abstract algebraic structures. In its simplest form, an abstract algebraic
structure is formed by a non-empty set X, called the support of the structure,
and a binary operation on X, which is just a map µ : X × X → X. Distinct
sets of hypothesis, or axioms, imposed on the operation µ lead to distinct
abstract algebraic structures. Note, however, that the definitions do not
include any assumption on the nature of elements of X, nor on the procedure
to evaluate the operation µ.
There a few conventions that everyone follows. If µ : X × X → X
is a binary operation on X, one chooses a symbol such as “+” or “∗” to
represent it, and writes “x + y” or “x ∗ y” instead of “µ(x, y)”. Actually,
one can even omit the symbol and represent the binary operation by simply
juxtaposition, i.e., “xy” instead of “µ(x, y)”. The use of such notations as
“x + y” and “xy” does not mean that these symbols represent or bear any
1.1. INTRODUCTION
9
relation with the usual addition and multiplication operations on numbers.
In this respect, the only generally accepted convention is that one reserves
the symbol “+” to denote commutative operations, i.e., operations such that
µ(x, y) = µ(y, x). Whenever we represent a commutative operation by the
symbol “+” we will say that we use the additive notation, while in all other
cases we say that we use the multiplicative notation. Also, we will make
systematic use of the usual conventions with parenthesis, so that we write
x ∗ (y ∗ z) = µ(x, µ(y, z)),
which, in general, will be distinct from
(x ∗ y) ∗ z = µ(µ(x, y), z).
As a general rule, imposing few axioms on an algebraic structure lead
to results which apply in great generality, for they will be valid for many
concrete algebraic structure. However, very general results tend to be not
so interesting, precisely because they rely on a small number of hypothesis.
On the other hand, if one starts with a richer set of axioms, one obtains
in principle more interesting results, but less general, since fewer concrete
algebraic structures will satisfy all the axioms. One of the main issues in
Abstract Algebra is precisely to determine a set of axioms ( (i.e., definitions
of abstract algebraic structures) which are sufficiently general to include
many concrete examples and, at the same time, sufficiently rich to lead to
interesting results.
Let us consider a few important examples illustrating these remarks.
Without any additional assumption on a binary operation ∗, we can introduce the notion of identity element, inspired on the properties of the integers
0 and 1 relative to the usual addition and multiplication, respectively.
Definition 1.1.1. Let ∗ be a binary operation on a set X. An element
e ∈ X is called an identity element for ∗ if x ∗ e = e ∗ x = x for all x ∈ X.
It is possible to prove a general result about identity elements which is
valid for any binary operation.
Proposition 1.1.2. Every binary operation has at most one identity element.
Proof. Assume that e and ẽ are both identity elements for the operation ∗.
We then have:
e ∗ ẽ = ẽ
e ∗ ẽ = e
We conclude that e = ẽ.
(since e is an identity element),
(since ẽ is an identity element).
10
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
Hence, we can talk without ambiguity about the identity element, whenever this element exists. Also, it is common to use the terms “zero” and
“one” (the last one usually called “identity”) to denote the identity element
of ∗. Obviously, we should be consistent: we should reserve the use of the
term “zero” (and also the symbol “0”) for the additive notation, and use
the term “identity” (and also the symbols “1” or “I”) for the multiplicative
notation.
Whenever a binary operation ∗ has an identity e, we can introduce the
notion of inverse element. The definition is as follows:
Definition 1.1.3. Given a binary operation ∗ with identity e, an element
x ∈ X is called invertible if there exists y ∈ X such that
x ∗ y = y ∗ x = e.
In this case, we call y an inverse of x.
Again, to be consistent with common usage, if we use the additive notation we call an inverse element of x a symmetric of x.
Note that y is an inverse of x if and only if x is an inverse of y, i.e.,
in other words “being inverse of” is a symmetric relation. If one only has
x∗y = e, we say that y is a right inverse of x and that x is a left inverse of y.
Obviously, y is an inverse of x if and only if y is both a right and left inverse
of x. However, a right inverse may fail to be a left inverse, and vice-versa.
Still, when ∗ is an associative operation we can show the following.
Proposition 1.1.4. Let ∗ be an associative binary operation on X with
identity e. If x ∈ X has a right inverse y and a left inverse z, then y = z
and x is invertible.
Proof. Assume that y, z ∈ A satisfy x ∗ y = z ∗ x = e. Then:
x ∗ y = e ⇒ z ∗ (x ∗ y) = z
⇒ (z ∗ x) ∗ y = z
⇒e∗y =z
⇒y=z
(because z ∗ e = z),
(because ∗ is associative),
(because z ∗ x = e),
(because e ∗ y = y).
The previous results apply to any binary operation that satisfies the two
properties used in the proofs (existence of identity and associativity). These
are exactly the assumptions used as axioms in the the following definition:
1.1. INTRODUCTION
11
Definition 1.1.5. An algebraic structure (X, ∗) is called a monoid if it
satisfies the following properties:
(i) The operation ∗ has identity e on X.
(ii) The operation ∗ is associative, i.e., (x ∗ y) ∗ z = x ∗ (y ∗ z), for all
x, y, z ∈ X.
If ∗ is a commutative operation, i.e., if x ∗ y = y ∗ x for all x, y ∈ X, we
say that the monoid is abelian 2 . If the operation is commutative and we use
the additive notation, we say that the monoid is additive.
From Proposition 1.1.4 we conclude:
Proposition 1.1.6. If (X, ∗) is a monoid (with identity e), and x ∈ X is
invertible, there exists a unique element y ∈ X such that x ∗ y = y ∗ x = e.
Some well-known operations furnish examples of monoids.
Examples 1.1.7.
1. The set of all n × n matrices with real entries with the usual product of
matrices is a monoid. Hence, if A, B, C are matrices n × n, I is the identity
matrix and AB = CA = I, then B = C and the matrix A is invertible.
2. The set RR of all functions 3 f : R → R with “composition product”, defined
by (f ◦ g)(x) = f (g(x)) is a monoid. The identity is the function I : R → R
given by I(x) = x. Hence, if there exist functions g, h : R → R such that
f ◦ g = h ◦ f = I, then g = h and f is invertible ( i.e., is a bijection).
3. The set R of all real numbers with the usual addition is a (additive) monoid.
In this case any element is invertible.
4. The set R+ of all positive real numbers with the usual product is a monoid.
Again the operation is commutative and every such element is invertible.
If an element x in a monoid (X, ∗) is invertible, then the inverse of x is
unique. If we use the multiplicative notation, then we denote by “x−1 ” the
inverse of x, while in the additive notation we denote by “−x” the symmetric
of x. With these conventions, some of the well-known results for symmetrics
and inverses of real numbers, actually hold for any monoid. We leave the
proofs of the following results as exercises:
2
In honour of Niels Henrik Abel (1802–1829), a Norwegian mathematician who is consider to be one of the founders of Modern Algebra
3
If X and Y are sets, Y X denotes the set of all functions f : X → Y (See Definition
A.2.4 in the Appendix).
12
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
Proposition 1.1.8. If (X, ∗) is a monoid, and x, y ∈ X are invertible, then
x−1 and xy are invertible, and we have:
(x−1 )−1 = x, and (xy)−1 = y −1 x−1 .
For an additive monoid, we have:
−(−x) = x, and − (x + y) = (−x) + (−y).
Exercises.
1. Let X = {x, y} be a set with two elements. How many binary operations
are there on X? How many of these operations are (i) commutative, (ii)
associative, (iii) have an identity?
2. How many binary operations are there in a set with 10 elements?
3. In (Z, −) is there an identity? Inverses? Is this operation associative?
4. Let RR be the set of all functions mentioned in Example 1.1.7.2, and assume
that f ∈ RR .
(a) Show that there exists g ∈ RR such that f ◦ g = I if and only if f is
surjective.
(b) Show that there exists g ∈ RR such that g ◦ f = I if and only if f is
injective.
(c) If f ◦ g = f ◦ h = I, is true that g = h?
5. Prove Proposition 1.1.8.
6. Let ∗ be binary operation on X, and x, y ∈ X. If n ∈ N is a natural
number, define the power xn by induction as follows: x1 = x and, for n ≥ 1,
xn+1 = xn ∗ x. Assuming that ∗ is associative, shows that:
(a) xn ∗ xm = xn+m , and (xn )m = xnm , for all n, m ∈ N.
(b) xn ∗ y n = (x ∗ y)n , for all n ∈ N, if x ∗ y = y ∗ x.
How would you express these results in the additive notation?
7. Assume that (X, ∗) is a monoid with identity e, and x ∈ X is invertible. In
n
this case, we define for n ∈ N x−n = (x−1 ) , x0 = e. Show that the identities
(a) and (b) in the previous problem are valid for all n, m ∈ Z.
1.2. GROUPS
1.2
13
Groups
The examples we saw in the previous section show that in an arbitrary
monoid not every element is invertible. The monoids where every element
has an inverse correspond to the one of the most important abstract algebraic
structures.
Definition 1.2.1. A monoid (G, ∗) is called a group if all the elements in G
are invertible. A group is called abelian if the operation ∗ is commutative.
The following simple examples already show how general the concept of
group is.
Examples 1.2.2.
1. (R, +) is an abelian group.
2. (R+ , ·) is also an abelian group.
3. (Rn , +), where + denotes sum of vectors, is an abelian group.
4. The set of all n × n invertible matrices with real entries is a non-abelian
group with the usual product of matrices. This group is called the General
Linear Group and denoted by GL(n, R)).
5. The complex numbers z ∈ C with |z| = 1 (the unit circle) form an abelian
group for complex multiplication, usually denoted by S1 .
6. The complex numbers {1, −1, i, −i}, with the complex multiplication form, a
finite abelian group.
7. The set of all functions f : R → R form an abelian group for the usual
addition of of functions. We can also consider special classes of functions,
such as continuous functions, differentiable functions or integrable functions.
All such classes give examples of abelian groups.
8. If X is any set and (G, +) is an abelian group, then the set of all functions
f : X → G form an abelian group with operation “+” defined by
(f + g)(x) = f (x) + g(x),
∀x ∈ X.
(Note that in this expression the symbol “+” is used with two different meanings!)
9. More generally, if X is a set and (G, ∗) is a group, then the functions f :
X → G form a group with operation “∗” defined by
(f ∗ g)(x) = f (x) ∗ g(x),
∀x ∈ X.
(Again, in this expression the symbol “∗” is used with two different meanings.)
14
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
The fourth example above illustrates a general fact: the invertible elements in an arbitrary monoid always form a group.
Proposition 1.2.3. Let (X, ∗) be a monoid and G the set of invertible
elements in X. Then G is closed for the operation ∗, and (G, ∗) is a group.
Proof. The identity e of X is invertible with e−1 = e, since e ∗ e = e. Hence,
G is non-empty and contain the identity of X.
Proposition 1.1.8 shows that
x ∈ G =⇒ x−1 ∈ G
(because (x−1 )−1 = x),
and also that
x, y ∈ G =⇒ x ∗ y ∈ G
(because (x ∗ y)−1 = y −1 ∗ x−1 ).
Since the operation ∗ is associative (in the original monoid), (G, ∗) is a
group.
It is not our aim at this point to discuss in depth the theory of groups.
For now, we limit ourselves to some elementary results which will be useful
later in the study of many other algebraic structures.
Proposition 1.2.4. If (G, ∗) is group with identity element e, then4 :
(i) Cancellation law: if g1 , g2 , h ∈ G and g1 ∗h = g2 ∗h or h∗g1 = h∗g2 ,
then g1 = g2 ;
(ii) In particular, if g ∗ g = g then g = e;
(iii) The equation g ∗ x = h (respectively, x ∗ g = h) has the unique solution
x = g−1 ∗ h (respectively, x = h ∗ g −1 ).
Proof. We have:
g1 ∗ h = g2 ∗ h =⇒ (g1 ∗ h) ∗ h−1 = (g2 ∗ h) ∗ h−1 (because h is invertible),
=⇒ g1 ∗ (h ∗ h−1 ) = g2 ∗ (h ∗ h−1 ) (by associativity),
=⇒ g1 ∗ e = g2 ∗ e
=⇒ g1 = g2
(because h ∗ h−1 = e),
(because e is the identity).
4
Note that the results stated in this theorem are just “sophisticated” versions of the
al-jabr operations mentioned in the introductory section.
1.2. GROUPS
15
The proof for h ∗ g1 = h ∗ g2 is similar, hence (i) holds. On the other
hand,
g ∗ g = g =⇒ g ∗ g = g ∗ e
=⇒ g = e
(because g ∗ e = g),
(by the previous result).
so (ii) holds. The proof of (iii) is left as an exercise.
If (G, ∗) is a group and H ⊂ G is a non-empty set, it is possible that
H is closed relative to the operation ∗, i.e., that h1 ∗ h2 ∈ H, whenever
h1 , h2 ∈ H. In this case, the operation ∗ is a binary operation on H, and
we can ask under what conditions is (H, ∗) a group. When this is the case,
we say that (H, ∗) is a subgroup of (G, ∗).
The next result gives simple criteria to decide if a given subset H of a
group G is a subgroup.
Proposition 1.2.5. If (G, ∗) is a group with identity element e, and H ⊂ G
is non-empty, then the following are equivalent:
(i) (H, ∗) is a subgroup of (G, ∗);
(ii) for all h1 , h2 ∈ H, h1 ∗ h2 ∈ H and h−1
1 ∈ H;
(iii) for all h1 , h2 ∈ H, h1 ∗ h−1
2 ∈ H.
Proof. We will show the implications (i) ⇒ (ii) ⇒ (iii) ⇒ (i).
Assume that (i) holds. Then, by definition, H is closed for multiplication:
if h1 , h2 ∈ H then h1 ∗ h2 ∈ H. By assumption, H has an identity element
ẽ, so that ẽ ∗ ẽ = ẽ. By Proposition 1.2.4 (ii) we conclude that ẽ = e, so
H contains the identity of G. If h ∈ H, consider the equation h ∗ x = e.
According to Proposition 1.2.4 (iii), this equation has a unique solution
x ∈ H, which is also a solution of the same equation in G. Hence, we must
have x = h−1 (the inverse of h in the original group G). This shows that if
h ∈ H, then h−1 ∈ H.
If (ii) holds, then clearly (iii) holds.
Finally, assume that (iii) holds. We must show that (H, ∗) is a group.
Since H is non-empty, we can choose h ∈ H, and observe that h ∗ h−1 =
e ∈ H. Hence, H contains the identity of G. Similarly, if h ∈ H, then
e ∗ h−1 = h−1 ∈ H, so H contains the inverses (in G) of all its elements. It
follows also that H is closed relative to ∗, for if h1 , h2 ∈ H then we already
−1 = h ∗ h ∈ H. Finally, ∗ is
know that h−1
∈ H, so that h1 ∗ (h−1
1
2
2
2 )
associative in H because it was already associative in G.
16
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
In any abelian group, using the additive notation, the difference h1 − h2
is defined by h1 − h2 = h1 + (−h2 ) Hence, in additive notation, the condition
“h1 ∗ h−1
2 ∈ H” is written as “h1 − h2 ∈ H.
Examples 1.2.6.
1. Consider the group (R, +) and the subset of all integers Z ⊂ R. This subset
is non-empty and the difference of two integers is still an integer, so (Z, +) is
a subgroup of (R, +).
2. Still in the same group (R, +), consider the subset of natural numbers N ⊂ R.
The difference of two natural numbers may fail to be a natural number, so
(N, +) is not a subgroup of (R, +). Note that the sum of two natural numbers
is still a natural number, so the sum is a binary operation on the set of natural
numbers.
Let (G, ∗) and (H, ·) be two groups, and consider the cartesian product
G × H = {(g, h) : h ∈ G, h ∈ H}.
In G × H we define the binary operation
(g1 , h1 ) ◦ (g2 , h2 ) = (g1 ∗ g2 , h1 · h2 ).
(1.2.1)
We leave as an exercise to check that this algebraic structure is a group. It
is called the direct product of the groups G and H. Note that G and H
can be seen as subgroups of G × H if we identify them with
G × {e} = {(g, e) : g ∈ G}
and
{e} × H = {(e, h) : h ∈ H}.
If G and H are abelian groups and the additive notation is used, then we
will write G ⊕ H instead of G × H, and we call this group the direct sum
of G and H.
The notion of direct product or direct sum of groups extends in a more
or less straightforward way to any finite number of groups5 . For example, if
G, H, and K are groups, the direct product G × H × K can be defined as
G × H × K = (G × H) × K. More generally, given groups G1 , G2 , · · · , Gn ,
we have:
!
1
n
n−1
Y
Y
Y
Gk = G1 , and
Gk =
Gk × Gn .
k=1
5
k=1
k=1
We will see later how to treat an infinite number of groups where another distinction
between the direct sum and the the direct product of groups will become clear.
1.2. GROUPS
17
Example 1.2.7.
We leave it as an exercise to check that the direct sum of (R, +) with itself is
(R2 , +). More generally, we have:
n
R =
n
M
k=1
R = R ⊕ · · · ⊕ R.
We can also consider the group (Z, +) and take the direct sum of this group
with itself any finite number of times. The resulting group is denoted by
Zn =
n
M
k=1
Z = Z ⊕ · · · ⊕ Z.
This group, as we shall see in Chapter 4, is the so-called free abelian group in
n symbols. It is a subgroup of the group (Rn , +).
Exercises.
1. Show that the sets G = {0, 1} and H = {1, −1} with the operations given
by the following tables are groups.6
+
0
1
0
0
1
1
1
0
×
1
-1
1
1
-1
-1
-1
1
2. Same as the previous exercise, but now with the sets G = {0, 1, 2} and
H = {1, x, x2 }, with operations given by the following tables. 7
+
0
1
2
0
0
1
2
1
1
2
0
2
2
0
1
×
1
x
x2
1
1
x
x2
x
x
x2
1
x2
x2
1
x
3. Complete the proof of Proposition 1.2.4 (iii).
4. Give an example showing that the cancelation law, in general, does not hold
for monoids.
5. Let (G, ∗) be a group. Show that the map f : G → G, given by f (x) = x−1 ,
is a bijection of G with G.
6
The group G is usually denoted by (Z2 , +), for reasons to be explained later. The
group H is formed by the square roots of the unit.
7
The group G is usually denoted by (Z3 , +). The group H is formed by the cubic roots
2π
of the unit, since we can assume, e.g., that x is the complex number e 3 i .
18
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
6. How can Propositions 1.2.4 and 1.2.5 be stated in additive notation?
7. If (G, ∗) is a group, its center is defined by C(G) = {x ∈ G : g ∗ x =
x ∗ g, ∀g ∈ G}. Show that (C(G), ∗) is an abelian subgroup of G. Determine
the center of G = GL(n, R).
8. Let (G, ∗) be a group, H1 , H2 subgroups of G. Show that H1 ∩ H2 is a
subgroup of G.
9. Show that a group (G, ∗) is abelian if and only if (g ∗ h)2 = g 2 ∗ h2 , for all
g, h ∈ G.
10. Let ∗ be an associative binary operation in a set G, which satisfies:
(i) there exists e ∈ G such that, for all g ∈ G, g ∗ e = g (right identity).
(ii) For all g ∈ G there exists g ′ ∈ G such that g ∗ g ′ = e (right inverses).
Show that:
(a) (G, ∗) is a group.
(Hint: show first that g ∗ g = g ⇒ g = e).
(b) Fix some set X and let G be the class of all surjective maps f : X → X,
with ∗ the operation of composition of maps. Is this a group? Why
doesn’t this example contradicts (a)?
11. Let ∗ be an associative binary operation on a non-empty set G, which
satisfies:
(i) The equation g ∗ x = h has a solution in G for all g, h ∈ G.
(ii) The equation x ∗ g = h has a solution in G for all g, h ∈ G.
Show that (G, ∗) is a group.
(Hint: show first that if g ∗ e1 = g and e2 ∗ h = h, for some g and h, then one
must have e1 = e2 = e and e is the identity element for ∗).
12. Let (G, ∗) and (H, ·) be two groups. Show that the binary operation on
G × H defined by (1.2.1) gives this set a group structure. Verify that the direct
product (R, +) × (R, +) coincides precisely with (R2 , +).
13. Determine the direct product of the groups described in Exercises 1 and 2.
1.3. PERMUTATIONS
19
14. Consider the group (Z4 , +) given by the following table:
+
0
1
2
3
0
0
1
2
3
1
1
2
3
0
2
2
3
0
1
3
3
0
1
2
(a) Determine all its subgroups.
(b) Consider the group with support H = {1, −1, i, −i} ⊂ C furnished with
the product of complex numbers. Is there a bijection f : Z4 → H such
that f (x + y) = f (x)f (y), for all x, y ∈ Z4 ? If yes, how many are there?
1.3
Permutations
The bijective maps f : X → X under composition of maps form a group SX
called the symmetric group on X. A bijection from X onto X is called a
permutation of X, specially when X is a finite set. It should be clear that
in this case it does not matter what X is, only its number of elements. We
will now study the group of permutations of the set {1, 2, 3, . . . , n}, which is
usually denoted by Sn . An elementary counting argument shows that Sn is
a finite group with n! elements (here n! denotes the factorial of n, i.e., the
product of the first n natural numbers).
Examples 1.3.1.
1. The group S2 has only two elements, I and φ, where I is the identity map
on the set {1, 2}, and φ “exchanges” 1 with 2, ( i.e., φ(1) = 2 and φ(2) = 1).
2. The map δ : {1, 2, 3} → {1, 2, 3} defined by δ(1) = 2, δ(2) = 3, and δ(3) = 1
is one of the six permutations in S3 .
3. More generally, in Sn we have the permutation π : Sn → Sn which cyclic
permutes all elements: π(i) = i + 1 (i = 1, . . . , n − 1) and π(n) = 1.
One can represent a permutation π in Sn by a matrix with two rows,
where on the first row one writes the variable x and in the second row the
value π(x). For example, in the case of S3 its elements can be represented
20
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
by the matrices:
1 2
I=
1 2
1 2
γ=
2 1
3
3
3
3
,
α=
,
δ=
1 2 3
1 3 2
1 2 3
2 3 1
,
β=
,
ε=
1 2 3
3 2 1
1 2 3
3 1 2
,
.
It is a routine task to find all the products (i.e., composition) of these permutations. The result is given in the following table:
I
α
β
γ
δ
ε
I
I
α
β
γ
δ
ε
α
α
I
ε
δ
γ
β
β
β
δ
I
ε
α
γ
γ
γ
ε
δ
I
β
α
δ
δ
β
γ
α
ε
I
ε
ε
γ
α
β
I
δ
Given any permutation π of X and element x ∈ X, the subset of X consisting of all elements obtained from x by applying π repeatedly is denoted
by Ox and is called the orbit of the permutation π. Hence,
Ox = {x, π(x), π(π(x)), . . . }.
Examples 1.3.2.
1. In the case of S3 , we have
• orbits of I: O1 = {1}, O2 = {2} e O3 = {3};
• orbits of α: O1 = {1}, O2 = O3 = {2, 3};
• orbits of ε: there is a unique orbit O1 = O2 = O3 = {1, 2, 3}.
The structure of the orbits of β and γ is similar to the orbits of α (can you
say precisely what they are?), while ε, just like δ, has a unique orbit (which?).
2. The permutation in S4 given by
1 2
π=
2 1
3 4
4 3
has orbits O1 = O2 = {1, 2} and O3 = O4 = {3, 4}.
The length of an orbit is simply the number of elements in the orbit.
Note that the orbits of a given permutation π of X are disjoint subsets of
1.3. PERMUTATIONS
21
X, whose union is X, so we say that the orbits of π form a partition of
the set X. Note also that the identity is the unique permutation with all
orbits having length 1. The permutations with at most one orbit of length
greater than 1 are called cycles. For example, all permutations in S3 are
cycles, but the permutation π in S4 mentioned above is not a cycle: it has
two orbits of length 2. Note also that π(x) 6= x, precisely when x belongs
to an orbit of π of length greater than 1. A cycle with an orbit of length 2
is called a transposition.
If a permutation π is a cycle, it is more neat to represent it by indicating
which elements belong to its unique orbit Ox of length greater than 1: we
write (x, π(x), π 2 (x), . . . , π k−1 (x)), where k is the length of Ox , i.e., the least
natural number such that π k (x) = x.
Example 1.3.3.
For S3 we have:
α = (2, 3) = (3, 2),
β = (1, 3) = (3, 1),
δ = (1, 2, 3) = (2, 3, 1) = (3, 1, 2),
γ = (1, 2) = (2, 1),
ε = (1, 3, 2) = (3, 2, 1) = (2, 1, 3).
We can represent the identity I, e.g., by I = (1). Note that the inverse
permutation of a cycle is the cycle that is obtained by inverting the order
in which the elements apear in the corresponding orbit:
(i1 , i2 , . . . , ik )−1 = (ik , . . . , i2 , i1 ).
In particular, the inverse permutation of a transposition is the same transposition. In the example of S3 , the permutations α, β, and γ coincide with
there inverses, and ε and δ are inverses of each other.
Two cycles are said to be disjoint if the orbits of length greater than 1
are disjoint. Two disjoint cycles π and ρ commute, i.e., πρ = ρπ, and any
permutation is a product of disjoint cycles (one cycle for each of its orbits of
length greater than 1). More precisely, in Sn we have the following factorization result, which can be thought of as an analogue of the Fundamental
Theorem of Arithmetic 8 :
Proposition 1.3.4. Every permutation π in Sn is a product of disjoint
cycles. This factorization is unique up to the order of the factors.
8
“Every natural number n ≥ 2 is a product of prime numbers, which are unique up to
the order of the factors” (see Chapter 2).
22
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
Notice that, in general, we have
(x1 , x2 , . . . , xm ) = (x1 , xm ) . . . (x1 , x3 )(x1 , x2 ).
Hence, it is possible to factor permutations in Sn into a product of transpositions. In this case, however, the factors are not unique and the order
matters, since its unavoidable to use non-disjoint transpositions.
Examples 1.3.5.
1. For the permutation π in S4 above, we can factor it as:
1 2 3 4
1 2 3 4
1 2 3
=
2 1 4 3
2 1 3 4
1 2 4
4
3
,
i.e., we can write this permutation in the form π = (1, 2)(3, 4) = (3, 4)(1, 2).
2. Similarly, the cycle (1, 2, 4, 3) can be factor as a product of transpositions:
(1, 2, 4, 3) = (1, 3)(1, 4)(1, 2).
But this cycle also admits, e.g., the factorizations:
(1, 2, 4, 3) = (2, 1)(2, 3)(2, 4) = (1, 3)(1, 4)(1, 2)(2, 4)(1, 3)(2, 4)(1, 3).
The previous example shows that in a factorization of a permutation
as a product of transpositions, these are not uniquely determined. Note
also that the number of transpositions in two distinct factorizations is not
unique. In spite of this lack of uniqueness, it is possible to show that the
number of factors has a fixed parity, i.e., is either always even or always
odd. To see this, if π is a permutation
P with orbits O1 , O2 , . . . , OL having
lengths n1 , n2 , . . . , nl , we set P (π) = L
i=1 (ni − 1), and we show that:
Proposition 1.3.6. If π is a permutation and τ is a transposition, then
P (πτ ) = P (π) ± 1.
Proof. Let τ = (a, b), We must consider two cases:
(a) The elements a and b belong to distinct orbits Oi and Oj of π, and
(b) The elements a and b belong to the same orbit Oi .
For each of these cases it is to check that the following statements hold:
1.3. PERMUTATIONS
23
(a) πτ has the same orbits as π, with the exception of Oi and Oj , which
now form a unique orbit of length ni + nj . Hence, P (πτ ) = P (π) + 1;
(b) πτ has the same orbits as π, with the exception of Oi , which is split
into two orbits. In this case, P (πτ ) = P (π) − 1.
We can now show that:
Theorem 1.3.7. If τ1 , τ2 , . . . , τm are transpositions such that π = τ1 τ2 · · · τm ,
then P (π) − m is even, and hence P (π) and m have the same parity (are
both even or both odd).
Proof. We use induction in m.
If m = 1, then π is a transposition and P (π) = 1, so P (π) − m = 0 is
even.
If m > 1, we let α = τ1 τ2 · · · τm−1 . By the induction hypothesis, we have
that P (α)−(m−1) is even, and by the previous proposiition P (π) = P (α)±1.
We conclude that
P (π) − m = (P (α) ± 1) − m − 1 + 1 = P (α) − (m − 1) − (1 ± 1)
is even.
The parity of a permutation π is the parity of the number of transpositions in any factorization of π into a product of transpositions or, as we
have just seen, the parity of P (π). If P (π) is even (respectively, odd), we
say that π is an even (respectively, odd) permutation. The sign of π is +1
(respectively, −1), if π is even (respectively, odd), and is denoted by sgn(π).
In particular, any transposition is odd, and also the cycle (1, 2, 4, 3). The
identity is an even permutation since, e.g., I = (1, 2)(1, 2).
The even permutations in Sn form a subgroup, denoted by An , called the
alternating group (in n symbols). We leave for the exercises to check
that An contains n!
2 elements.
Exercises.
1. Find the factorization of the permutation
1 2 3 4 5
π=
2 3 4 1 6
into a product of disjoint cycles.
6
7
7
5
24
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
2. What is the parity of the permutation π in the previous exercise?
3. How many transpositions are there in Sn ?
4. How many distinct cycles of length k (1 ≤ k ≤ n) are there in Sn ?
5. Show that, if π, ρ ∈ Sn , then sgn(πρ) = sgn(π) sgn(ρ).
6. Prove that An is a subgroup of Sn .
7. Determine all elements of the group A3 .
8. Determine all the subgroups of the group S3 .
9. Show that in Sn (n > 1) the number of even and odd permutations coincide
and conclude that An contains n!
2 elements.
1.4
Homomorphisms and Isomorphisms
In order to compare algebraic structures which satisfy the same abstract
definition, one uses a fundamental notion of Algebra, namely the notion
of an isomorphism. This is a special case of the more general notion of a
homomorphism, whose formal definition for monoids is as follows:
Definition 1.4.1. If (X, ∗) and (Y, ·) are monoids, a map φ : X → Y is
called a homomorphism if
φ(x1 ∗ x2 ) = φ(x1 ) · φ(x2 ),
∀x1 , x2 ∈ X.
A homomorphism φ which is a bijection is called an isomorphism, in which
case the monoids are said to be isomorphic9
One suggestive way to describe the notion of an homomorphism is through
the following diagram:
X ×X
“∗”
/X
φ×φ
φ
Y ×Y
9
“·”
/Y
The use of the following terms is also common: a monomorphism is an injective homomorphism and a epimorphism is a surjective homomorphism. Also, an endomorphism
is a homomorphism from an algebraic structure into itself, while an automorphism is an
isomorphism of an algebraic structure with itself.
1.4. HOMOMORPHISMS AND ISOMORPHISMS
25
Such a diagram is said to be commutative, because one can follow two distinct paths on it without changing the final result.
Examples 1.4.2.
1. The logarithm function φ : R+ → R given by φ(x) = log(x), is a bijection.
Since log(xy) = log(x) + log(y), the groups (R+ , ·) and (R, +) are isomorphic.
Note that the inverse function, namely the ( exponential) ψ(x) = exp(x) is also
an isomorphism.
2. The set of all linear transformations T : Rn → Rn under composition is a
monoid. Once a basis of Rn has been fixed, it is possible to find for each linear
transformation T its matrix representation M(T ), a certain n× n matrix. It is
easy to see that the the map T 7→ M(T ) is an isomorphism of monoids (composition of linear transformations corresponds to the product of their respective
matrix representations).
If (G, ∗) and (H, ·) are isomorphic groups we will write “(G, ∗) ≃ (H, ·)”,
or even just “G ≃ H”, whenever the binary operations are obvious from the
discussion. For example, we have (R+ , ·) ≃ (R, +), or R+ ≃ R. 10
Proposition 1.4.3. Let (G, ∗) and (H, ·) be groups, with identities e and ẽ
respectively , and let φ : G → H be a homomorphism. Then
(i) Invariance of the identity: φ(e) = ẽ.
(ii) Invariance of inverses: φ(g−1 ) = (φ(g))−1 , ∀g ∈ G.
Proof. (i) Since
φ(e) · φ(e) = φ(e ∗ e) = φ(e),
the cancellation law shows that φ(e) = ẽ.
(ii) Since
φ(g) · φ(g−1 ) = φ(g ∗ g−1 ) = φ(e) = ẽ,
we conclude that φ(g−1 ) = (φ(g))−1 .
Examples 1.4.4.
1. Consider the groups (R, +) and (C∗ , ·), where C∗ denotes the set of all nonzero complex numbers. Define
φ : R → C∗ ,
φ(x) = e2πxi = cos(2πx) + i sin(2πx).
10
We will also use the same symbol ≃ to indicate that two objects in other contexts
(e.g., monoids, rings, fields) are isomorphic.
26
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
Then φ is a homomorphism (ez · ew = ez+w , even when z and w are complex
numbers11 ). Obviously, φ is not surjective (because φ(x) is a complex number
of absolute value 1), and is not injective (because, if x = n is an integer, then
φ(n) = 1. Note that φ “wraps” the real line over the unit circle. According
to the proposition above, the identity in the domain (the real number 0), is
transformed into the identity in the target (the complex number 1), and the
image of the symmetric of the real number x is the inverse of the complex
number φ(x).
2. Consider the groups (Z, +) and (C∗ , ·) and define
φ : Z → C∗ ,
φ(n) = in .
The map φ is a homomorphism which is neither surjective nor injective. The
identity in the domain, the integer 0, is transformed into the identity in the
target, the complex number 1, and the image of the symmetric of the integer n
is the inverse of the complex number φ(n).
3. The previous example can be generalized as follows: if (G, ∗) is any group
and g ∈ G is a fixed element, we can define (multiplicative notation):
φ : Z → G,
φ(n) = g n .
The map φ is a group homomorphism from (Z, +) into (G, ∗).
Given a group homomorphism φ : G → H, consider now the equation
φ(x) = y,
where we assume y ∈ H fixed, and x is the unkown to be determined. By
analogy with Linear Algebra, if y = ẽ is the identity in the target group
we say that we have an homogenous equation, and if y 6= ẽ we say that it
is inhomogenous. The set of solutions of the homogenous equation is called
the kernel of the homomorphism and will be denoted by N (φ):
N (φ) = {g ∈ G : φ(g) = ẽ}.
On the other hand, the set of all y ∈ H for which the equation φ(x) = y has
a solution x ∈ G, is called the image of the homomorphism and will be
denoted by φ(G) (or Im(φ)):
φ(G) = {φ(g) ∈ H : g ∈ G}.
11
Recall that if z = x + iy is a complex number, where x, y ∈ R, one defines ez =
ex (cos(y) + i sin(y)).
1.4. HOMOMORPHISMS AND ISOMORPHISMS
27
Examples 1.4.5.
1. Continuing the discussion in Examples 1.4.4, the kernel of φ : R → C∗ is
precisely the set N (φ) = Z of all integers and the image is the set φ(R) = S1
of all complex numbers of absolute value 1 (the unit circle).
2. Similarly, the kernel of φ : Z → C∗ is precisely the set N (φ) = 4Z of all
integers which are a multiple of 4 and the image is the set φ(Z) = {1, −1, i, −i}.
The next figure illustrates the concepts of kernel and image.
G
N ()
H
(G)
PSfrag replaements
e~
e
Figure 1.4.1: Kernel and image of a homomorphism.
In the examples above, both the kernel and the image are subgroups of
the domain and target groups, respectively. This is a general fact:
Proposition 1.4.6. If (G, ∗) and (H, ·) are groups and φ : G → H is a
homomorphism, then:
(i) The kernel N (φ) is a subgroup of G;
(ii) The image φ(G) is a subgroup of H.
Proof. (i) The kernel N (φ) is non-empty, since it contains at least the identity of G, by Proposition 1.4.3 (i). Moreover, if g1 , g2 ∈ N (φ) we have
φ(g1 ∗ g2−1 ) = φ(g1 ) · φ(g2−1 ) (because φ is a homomorphism),
= ẽ · (φ(g2 ))−1
−1
= ẽ · ẽ
= ẽ.
(by Proposition 1.4.3 and because g1 ∈ N (φ)),
(because g2 ∈ N (φ)),
28
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
This shows that for any g1 , g2 ∈ N (φ) we have g1 ∗ g2−1 ∈ N (φ), so N (φ) is
a subgroup of G.
(ii) φ(G) is non-empty, because G is non-empty. If h1 , h2 ∈ φ(G), there
exist g1 , g2 ∈ G such that h1 = φ(g1 ) and h2 = φ(g2 ). Hence:
−1
h1 · h−1
= φ(g1 ∗ g2−1 ) ∈ φ(G),
2 = φ(g1 ) · (φ(g2 ))
so φ(G) is a subgroup of H.
From now on we will stop being explicit about the group operations,
except if needed for clarity. Hence, if G and H are groups, g1 , g2 ∈ G and
h1 , h2 ∈ H, we will write both products as g1 g2 and h1 h2 . From the context,
it will be clear to which operation we are referring too. Similarly, we will
denote by e both the identities in G and in H.
A very important remark is that the kernel of a group homomorphism
is not an arbitrary subgroup. It satisfies the following property:
Definition 1.4.7. If H ⊂ G is a subgroup, one says that H is a normal
subgroup of G if for all h ∈ H and g ∈ G, one has ghg −1 ∈ H.
Examples 1.4.8.
1. When G is an abelian group, it is clear that ghg −1 = hgg −1 = h, so all
subgroups of an abelian group are normal subgroups.
2. If G = S3 and H={I, α}, then H is a subgroup of G which is not normal:
εαε−1 = γ 6∈ H.
3. If G = S3 and H = A3 = {I, δ, ε}, then H is a normal subgroup. In fact,
recall that A3 is formed by the even permutations of S3 , so if π ∈ A3 and
σ ∈ S3 , then σπσ −1 is an even permutation (why?), and hence σπσ −1 ∈ A3 .
4. If G × H is the direct product of the groups G and H, then G and H (identified, respectively, with G × {e} and {e} × H) are normal subgroups of G × H.
Theorem 1.4.9. If φ : G → H is a group homomorphism, then its kernel
N (φ) is a normal subgroup of G.
Proof. Let n ∈ N (φ) and g ∈ G. We must show that gng−1 ∈ N (φ), or that
φ(gng−1 ) = e, where e is the identity of H. For this, we observe that:
φ(gng−1 ) = φ(g)φ(n)φ(g−1 )
= φ(g)φ(g
=e
−1
)
(definition of homomorphism),
(because φ(n) = e, since n ∈ N (φ)),
(invariance of inverses).
1.4. HOMOMORPHISMS AND ISOMORPHISMS
29
In Linear Algebra, the number of solutions of the non-homegenous equation φ(x) = y, i.e., the lack of injectivity of φ, depends only on the kernel
N (φ). Similarly, we have:
Theorem 1.4.10. Let φ : G → H be a homomorphism. Then:
(i) φ(g1 ) = φ(g2 ) if and only if g1 g2−1 ∈ N (φ);
(ii) φ is injective if and only if N (φ) = {e};
(iii) if x0 is a particular solution of φ(x) = y0 , the general solution is
x = x0 n, where n ∈ N (φ).
Proof. (i) Notice that:
φ(g1 ) = φ(g2 ) ⇔ φ(g1 )(φ(g2 ))−1 = e (multiplication in H by (φ(g2 ))−1 ),
⇔ φ(g1 g2−1 ) = e
⇔
g1 g2−1
(because φ is a homomorphism),
∈ N (φ)
(by definition of φ).
(ii) φ is injective if and only if φ(g1 ) = φ(g2 ) ⇔ g1 = g2 ⇔ g1 g2−1 = e.
From (i), we conclude that this holds if and only if N (φ) = {e}.
(iii) If φ(x0 ) = y0 , n ∈ N (φ), and x = x0 n, it is clear that
φ(x) = φ(x0 n) = φ(x0 )φ(n) = φ(x0 )e = φ(x0 ) = y0 ,
so x is also a solution of the inhomogenous equation. On the other hand,
if x is a solution of the inhomogenous equation, then φ(x) = φ(x0 ), so
−1
xx−1
0 ∈ N (φ). But xx0 = n ⇔ x = x0 n.
Examples 1.4.11.
1. Continuing with Examples 1.4.4, we saw that the kernel of φ : R+ → C∗ is
the set of all integers, i.e.,
φ(x) = 1
⇔
cos(2πx) = 1, sin(2πx) = 0
⇔
x ∈ Z.
If we consider the equation:
φ(x) = i
an obvious solution is x0 =
x = 14 + n, with n ∈ Z.
1
4.
⇔

 cos(2πx) = 0,

sin(2πx) = 1,
The general solution of this equation is then
30
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
2. The kernel of φ : Z → C∗ is the set of multiples of 4, i.e.,
φ(n) = 1
⇔ in = 1
⇔
n = 4k, with k ∈ Z.
φ(n) = i
⇔
in = i,
The equation
has the obvious solution n0 = 1. The general solution of this equation is then
n = 1 + 4k, with k ∈ Z.
3. The algebraic structures (Rn , +) and (Rm , +), where addition is the usual
vector sum, are abelian groups. If T : Rn → Rm is a linear transformation,
one checks easily that T is also a group homomorphism, and the usual Linear
Algebra result concerning solutions of the equation T (x) = y is just a special
case of the previous theorem.
The notion of isomorphism between two algebraic structures satisfying
the same set of axioms is at the basis of another fundamental problem of
Algebra, namely the classification problem for algebraic structures. Loosely
speaking, this problem is the following:
Given some definition of an (abstract, axiomatic) algebraic structure
determine a class C of concrete algebraic structures satisfying the definition, and such that any algebraic structure satisfying the definition
is isomorphic to exactly one algebraic structure in the class C.
For example, the problem of classification of finite simple groups (a very
important class of groups which we will study later in Chapter 5) was solved
in the last 20 years, and is consider one of the major accomplishments of
modern mathematics. We will see later some less complex classification
problems. For now, in order to illustrate these ideas, we discuss a very
elementary example: the classification of monoids with exactly two elements.
If (X, ∗) is a monoid with two elements, we have X = {I, a}, where I
denotes the identity, and I 6= a. Note that the products I ∗ I, I ∗ a and
a ∗ I are completely determined by the fact that I is the identity (I ∗ I = I,
I ∗ a = a ∗ I = a). It remains to determine the product a ∗ a, and the only
possibilities are a ∗ a = a ou a ∗ a = I (in the later case, a is invertible,
and hence the monoid is a group). The multiplication tables which describe
these two possibilities are:
I
a
I
I
a
a
a
I
I
a
I
I
a
a
a
a
1.4. HOMOMORPHISMS AND ISOMORPHISMS
31
In order to check that both cases are possible, consider the sets M =
{0, 1}, and G = {1, −1}, with the usual product of integers. It is obvious
that in both cases this product is a binary operation which is associative,
with identity. Hence, each of these sets with the usual product is a monoid
with two elements. Note also, by inspection of the main diagonals of the
tables, that these monoids are not isomorphic.
1
-1
1
1
-1
-1
-1
1
1
0
1
1
0
0
0
0
Obviously, any of the first two multplication tables represent a monoid
isomorphic to one of these monoids. Indeed, in the case a ∗ a = I, the
isomorphism is the map φ : X → G given by φ(I) = 1 and φ(a) = −1. In
the case a∗a = a, the isomorphism is the map φ : X → M given by φ(I) = 1
and φ(a) = 0. To summarize this discussion:
• G and M are monoids with two elements,
• G and M are not isomorphic, and
• If X is any monoid with two elements, then either X ≃ G or X ≃ M .
Therefore we can say that, “up to isomorphism”, there exist exactly two
monoids with two elements, and the classification of monoids with two elements is the family {G, M }. Note that since any group is a monoid, and
since only G is a group, we can conclude also that “up to isomorphism”,
there exists a unique group with two elements, which is G.12
Exercises.
1. Show that the identity ez+w = ez ew , with z and w complex numbers, is a
consequence of the usual identities for ex+y , cos(x + y) and sin(x + y), with x
and y real numbers.
2. The complex solutions of the equation xn = 1 are the complex numbers
e2πki/n ,where 1 ≤ k ≤ n. These are called the n-th roots of the unit, and they
form the vertices of a regular polygon with n sides, inscribed in the unit circle,
and with one vertex on the number 1.
(a) Show that φ : Z → C∗ given by φ(k) = e2πki/n is a homomorphism.
(b) Conclude that the n-th roots of unit form a subgroup of C∗ .
12
This group is obviously isomorphic to the group (Z2 , +) mentioned before.
32
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
(c) Determine the kernel of the homomorphism φ.
3. Assume that G, H and K are groups. Show that:
(a) G × H ≃ H × G, and G × (H × K) ≃ (G × H) × K.
(b) φ : G → H × K is a group homomorphism if and only if φ(x) =
(φ1 (x), φ2 (x)), where φ1 : G → H and φ2 : G → K are homomorphisms.
4. Consider the algebraic structure (R2 , +), where + is the usual sum of vectors.
(a) Show that (R2 , +) is an abelian group.
(b) Show that any linear transformation T : R2 → R is a homomorphism
from group (R2 , +) into the group (R, +).
(c) Show that any homomorphism from the group (R2 , +) into the group
(R, +) which is continuous is a linear transformation13 .
(d) For a fixed a ∈ R2 , define a linear transformation T : R2 → R by T (x) =
a · x, where “·” denotes the usual inner product. Find the kernel of T ,
and check that it is a subgroup and a linear subspace.
(e) Given an example of a subgroup of (R2 , +) which is not a linear subspace.
5. Assume that (A, ∗) and (B, ·) are monoids with identities e and ẽ, respectively. If φ : A → B is an isomorphism, show that:
(a) φ(e) = ẽ.
(b) φ(a−1 ) = (φ(a))
−1
(c) φ
−1
if a ∈ A is invertible.
: B → A is also an isomorphism.
(d) (A, ∗) is a group if and only if (B, ·) is a group.
6. If in the previous exercise one only assumes that φ is an injective (respectively,
surjective) homomorphism, which of the statements remain valid?
7. Let G be a group and denote by Aut(G) the set all automorphisms φ : G → G.
(a) Show that Aut(G) furnished with composition of maps is a group.
(b) If G is the group formed by the complex solutions of x4 = 1, determine
Aut(G).
8. Let G be a group.
(a) If g ∈ G is fixed, show that φg : G → G given by φg (x) = gxg −1 is an
automorphism.
13
There are group homomorphisms which are not continuous and which are not linear
transformations. Its existence can only be shown using the Axiom of Choice, which is
discussed in the Appendix.
1.5. RINGS, INTEGRAL DOMAINS AND FIELDS
33
(b) Show that the map T : G → Aut(G) defined by T (g) = φg is a group
homomorphism.
(c) Show that the kernel of T is the center of the group G (see Exercise 7, in
the previous section).
9. Classify all the groups with three and with four elements.
10. Show that the map φ : Sn → Z2 , which to a permutation π associates
sgn(π), is a group homomorphism with kernel An .
11. Determine all the normal subgroups of S3 .
12. Let G be a group and φ : S3 → G a group homomorphism. Classify the
group φ(S3 ) (i.e., list all the possibilities for φ(S3 ) up to isomorphism).
13. Let G be a group and fix g ∈ G. Show that
(a) the map Tg : G → G given by Tg (x) = gx, is a permutation on the set G.
(b) the map φ(g) : G → SG given by φ(g) = Tg is an injective homomorphism. In particular, G is isomorphic to a subgroup of the group of
permutations of the set G.
(c) if G is a finite group wth n elements, then there exists a subgroup H ⊆ Sn
such that G ≃ H.
1.5
Rings, Integral Domains and Fields
The integers, rationals, reals and complex numbers can be added and multiplied and the result is still a number of the same type. Similarly, we can
add and multiply square matrices of the same size, linear transformations
of a fixed vector space, and many other objects, which are useful in Mathematics and its applications. All these are examples of algebraic structures
more complex than groups or monoids, precisely because they involve simultaneously two operations. However, they share a common set of basic
properties, which lies at the basis of the definition of an algebraic structure
called a ring, that we will now study. We will distinguish some important
classes of rings, namely the fields (rings where the product is commutative
and a division by non-zero elements is always possible, such as Q, R and C),
and integral domains (rings with properties similar to Z).
Let A be an non-empty set, and σ, π : A × A → A two binary operations
in A. In order to simplify the notation, we will write“a + b” instead of
34
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
“σ(a, b)”, and “a · b” (or still ab) instead of “π(a, b)”. We say that a + b and
a · b are respectively the addition and the product of the elements a, b ∈ A.
Definition 1.5.1. The triple (A, +, ·) is called a ring if:
(i) Addition properties: (A, +) is an abelian group.
(ii) Product properties: The product is associative, i.e.,
∀a, b, c ∈ A, (a · b) · c = a · (b · c).
(iii) Mixed properties: The addition and the product are distributive, i.e.,
∀a, b, c ∈ A, a · (b + c) = a · b + a · c, and (b + c) · a = b · a + c · a.
Let us list more explicitly all the axioms included in this definition:
• Associativity of addition: ∀a, b, c ∈ A, (a + b) + c = a + (b + c).
• Commutativity of addition: ∀a, b ∈ A, a + b = b + a.
• Identity for addition: ∃0 ∈ A ∀a ∈ A, a + 0 = a.
• Symmetrics for addition: ∀a ∈ A ∃b ∈ A : a + b = 0.
• Associativity of product: ∀a, b, c ∈ A, (a · b) · c = a · (b · c).
• Distributivity: ∀a, b, c ∈ A, a·(b+c) = a·b+a·c, and (b+c)·a = b·a+c·a.
As we have already mentioned, Definition 1.5.1 is at least partially inspired by the properties of the set of integers A = Z, where “addition” and
“product” are the usual operations on integers, and 0 is the integer zero. In
particular, all the axioms in the definition correspond to well-known properties of the integers. However, this does not mean that an arbitrary ring
(A, +, ·) behaves just like (Z, +, ·). For example, in general the “product” is
not commutative, nor is there any reference in Definition 1.5.1 to an identity
for the product (similar to the integer 1). The examples that we will study
later show that a ring may have properties dramatically distinct from the
ring of integers.
If a ring A has an identity (for the product) then (A, ·) is a monoid. In
this case we say that (A, +, ·) is a unitary ring. We shall always refer to
the (unique) identity for addition has the zero of the ring, reserving the
term identity without further qualifications to the (unique) identity for
the product, whenever the ring has one (i.e., whenever it is a unitary ring).
Also, a commutative or abelian ring is a ring where a · b = b · a, for all
a, b ∈ A.
1.5. RINGS, INTEGRAL DOMAINS AND FIELDS
35
Examples 1.5.2.
1. The set Z of all integers with the usual operations of addition and product
is an unitary abelian ring. On the other hand, the set of even integers 2Z =
{0, ±2, ±4, . . . }, still with the usual operations of addition and product, is an
abelian ring without identity.
2. The sets of rational, real and complex numbers, denoted respectively by Q,
R and C, also with the usual addition and product are unitary abelian rings.
3. The set of n×n square matrices with entries in either Z, Q, R or C, which we
will denote by Mn (A), where A = Z, Q, R or C, still with the usual operations
of addition and product of matrices, are rings (non-abelian if n > 1) unitary
(the identity is the identity matrix I). More generally, we can consider the
ring of n × n matrices Mn (A) with entries in an arbitrary ring A.
4. The set of all functions f : R → R, with pointwise addition and product:
(f + g)(x) = f (x) + g(x),
(f g)(x) = f (x)g(x),
is a unitary abelian ring (the identity is the constant function equal to 1). Similarly, we can consider the ring of continuous functions, differentiable function,
etc.
5. The set Z2 = {0, 1}, with addition and product defined by
0 + 0 = 1 + 1 = 0,
0 · 0 = 0 · 1 = 1 · 0 = 0,
0 + 1 = 1 + 0 = 1,
1 · 1 = 1,
is a unitary abelian ring. Note that the operations in this ring correspond to
the boolean operations of “logical disjunction (OR)” and “logical conjunction
(AND)” if we associate
0 → False,
1 → True.
The operations in this ring correspond also to the usual “binary” arithmetic
( base 2 arihtmetic), with no transport addition.
6. The set of complex numbers of the form z = n + mi, with n, m ∈ Z, is a unitary abelian ring for the usual complex addition and multiplication. This ring
is usually denoted by the symbol Z[i] and its elements are called the Gaussian
integers 14 .
14
Carl Friedrich Gauss (1777-1855) was one of the great mathematicians from Göttingen.
He was a prodigious child, and when he was only 19 years old he found out a method to
construct a regular polygon with 17 sides using only a ruler and a compass (see Chapter 7).
For more that 2000 years, since the greek geometers first attempts of such constructions,
36
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
Let us now turn to some basic properties which are shared by any ring.
We start with properties which are consequences of results we have seen
before in a more general context:
Proposition 1.5.3. Let A be a ring.
(i) Cancellation law for addition: a + c = b + c ⇒ a = b, and in particular
d + d = d ⇒ d = 0.
(ii) Uniqueness of symmetrics: The equation a + x = 0 has a unique solution in A, given by x = −a.
(iii) Sign rules: −(−a) = a, −(a+b) = (−a)+(−b), and −(a−b) = (−a)+b.
We have mentioned above that A is a unitary ring if and only if (A, ·)
is a monoid. In this case, we denote by A∗ the set of invertible elements of
the monoid (A, ·), also-called the invertible elements of the ring A. Again,
recalling a result we proved in a more general context, we have:
Proposition 1.5.4. Let A be a unitary ring. Then (A∗ , ·) is a group, and
hence:
(i) A∗ is closed relative to the product.
(ii) If a ∈ A∗ , ax = 1 has the unique solution x = a−1 , where a−1 ∈ A∗ .
(iii) If a, b ∈ A∗ , (ab)−1 = b−1 a−1 , and (a−1 )−1 = a.
Examples 1.5.5.
1. The unique invertible integers are 1 and −1, i.e., Z∗ = {−1, 1}.
2. All the non-zero rationals, reals and complex numbers are invertible. Hence,
we have, e.g., Q∗ = Q − {0} 15 .
3. In the ring Mn (R), the invertible elements are the non-singular matrices, in
other words the matrices with non-zero determinat.
In general, in a arbitrary ring A with identity 1 6= 0, we can only say that
1 and −1 are invertible, because 1 · 1 = (−1) · (−1) = 1. For the ring Z these
the only regular polygons with a prime number of sides that were known to be constructible
were the equilateral triangle and the regular pentagon. Some of the most relevant results
to be discussed in this book were actually discovered by Gauss, such as the theory of
congruences.
15
If A and B are sets, then A − B = {x ∈ A : x 6∈ B}.
1.5. RINGS, INTEGRAL DOMAINS AND FIELDS
37
are the only invertible elements, but at the other extreme, we have rings
such as Q, R and C, where all the non-zero elements are invertible. There
are intermediate case, such as the ring Mn (A) (A a ring), where determining
the invertible elements is more complicated.
Let us look next at properties which involve both operations in a ring,
which therefore are not consequence of results we have seen so far. The
first property, e.g, shows that the zero in a unitary ring is not invertible (so
division by zero is not possible), except for the trivial ring A = {0} (where
0 = 1).
Proposition 1.5.6. If A is a ring, for any a, b ∈ A, we have:
(i) Product by zero: a0 = 0a = 0.
(ii) Sign rules: −(ab) = (−a)b = a(−b), and (−a)(−b) = ab.
Proof. To show that a0 = 0 we observe that
a0 + a0 = a(0 + 0)
(distributive property),
= a0
(because 0 is the identity for +),
⇒ a0 = 0
(by the cancellation law).
The proof of (ii) is left as an exercise.
Some of the main differences between examples of rings that we have
refer so far, are related with the behavior of the product. The only issue is
not just the existence of inverses, but also wether the product satisfies the
”cancellation law”, which can formally be defined as follows:
Definition 1.5.7. A ring A satisfies the cancellation law for the
product if
∀a, b, c ∈ A, [c 6= 0 and (ac = bc or ca = cb)] ⇒ a = b.
The restriction to elements c 6= 0 (which does not appear in the cancellation law for addition) is a consequence of the “product by zero” property.
The fact that the cancellation law for the product does not hold in every
ring, and hence is not a logical consequence of Definition 1.5.1, can already
be seen in the example of M2 (R) where we have the product
0 0
0 0
1 0
0 0
1 0
,
=
=
0 0
0 0
0 0
0 1
0 0
38
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
so that ca = cb, with c 6= 0 and a 6= b.
This example also shows that there are rings were cd = 0 with c 6= 0 and
d 6= 0. In this case we call c and d zero divisors. This notion should not
be confused with the notion of non-invertible element. However, if ca = cb
(or ac = bc) and c is invertible, it is obvious that we can multiply both sides
of the equalities by the inverse of c, and conclude a = b. In other words,
a zero divisor is always non-invertible, but a ring may have non-invertible
elements which are not zero divisors: for example, in Z there are no zero
divisors.
On the other hand, we leave as an exercise to check that:
Proposition 1.5.8. The cancellation law for the product holds in a ring A
if and only if A has no zero divisors.
In particular, in a ring A where all non-zero elements are invertible (i.e.,
A∗ = A − {0}), the cancellation law holds. Such a ring is entitled a special
name:
Definition 1.5.9. A division ring is a unitary ring A where A∗ = A−{0}.
An abelian division ring is called a field.
The rings Q, R and C are obviously fields. Note that Z2 is a field with
two elements, but 2Z or Mn (R) are neither fields, nor division rings. We
will describe later a division ring which is not a field.
There are also rings which are not division rings, since some non-zero
elements are not invertible, but where the cancellation law for the product
still holds. Examples of these are the ring of integers Z, the polynomial
rings with coefficients in Z, Q, R ou C, and the ring of Gaussian integers.
This class of rings is also entitled a special name:
Definition 1.5.10. An integral domain is a unitary abelian ring where
the cancellation law for the product holds.
By Proposition 1.5.8, a unitary abelian ring is an integral domain if and
only if it has no zero divisors.
The following diagram shows the relationship between the various kinds
of rings that we have seen so far (see also the exercises at the end of this
section).
If (A, +, ·) is a ring and B ⊂ A, it is possible that B is closed for both
the addition and the product of A, i.e., it is possible that
a, b ∈ B ⇒ a + b ∈ B and a · b ∈ B.
If this is the case, then it is possible that (B, +, ·) is a ring.
1.5. RINGS, INTEGRAL DOMAINS AND FIELDS
39
❘✐ ❣s
■ t❡❣r❛❧
❉♦♠❛✐ s
❋✐❡❧✂s
❉✐✁✐s✐♦
❘✐ ❣s
Figure 1.5.1: Integral domains, division rings and fields.
Definition 1.5.11. Let (A, +, ·) be a ring and let B ⊂ A be a subset closed
for the addition and the product of the ring A. We say that B is a subring
of A if (B, +, ·) is a ring. In this case, we say also that the ring A is an
extension of the ring B.
Examples 1.5.12.
1. Z is a subring of Q, and the even integers 2Z is a subring of Z.
2. The set N of all natural numbers (positive integers) is closed for both the
addition and product of Z, but it is not a subring of Z.
3. C is an extension of R.
4. The ring Mn (C) is an extension of Mn (Z).
According to a result for groups proved in the previous section, if B ⊂ A
and B is non-empty, then (B, +) is a subgroup of (A, +) (i.e., it verifies (i)
in Definition 1.5.1) if and only if it is closed for the difference. If B is closed
for the addition and product of A, then it is obvious that is satisfies (ii) and
(iii) in Definition 1.5.1, simply because its operations are the same as the
ones of the ring A. We conclude that
Proposition 1.5.13. Let A be a ring. A subset B ⊂ A is a subring of A if
and only if it is non-empty and it is closed for the difference and the product.
If A and B are rings we can form the direct product A × B of the underlying additive groups. Actually, it should be obvious that one can also
introduce a product in A × B in a similar fashion, so that:
(1.5.1)
(a, b) + (a′ , b′ ) = (a + a′ , b + b′ ),
(a, b)(a′ , b′ ) = (aa′ , bb′ ).
40
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
We leave it as an exercise to check that A × B with these operations is a
ring, which we call the direct sum of the rings A and B, and which we
denote by A ⊕ B. It should be clear that we can form the direct sum of any
finite number rings (later we will consider the case of an infinite number of
rings.
Exercises.
1. Check wether the following algebraic structures are rings. If the answer is
yes, indicate wether the ring is commutative, has an identity, and/or verifies
the cancellation law for the product. If the answer is no, indicate which axioms
in Definition 1.5.1 fail:
(a) The set of all integers multiple of a fixed integer m, with the usual addition and product of integers;
(b) The set of all linear transformations T : Rn → Rn , with the usual addition
and the product being composition;
(c) The set of all functions f : R → R, with the usual addition and the
product being composition;
(d) The set of all non-negative integers, with the usual addition and product
of integers;
(e) The set of all irrational numbers, with the usual addition and product of
real numbers;
(f) The set Z[i] of Gaussian integers, with the usual addition and product of
complex numbers;
(g) The set R[x] of all polynomials in the variable x with real coefficients
with the usual addition and the product of polynomials.
16
,
2. Show that for any ring 0 = −0.
3. Complete the proof of Theorem 1.5.6.
4. Let A be a set with three distinct elements, which we denote by 0, 1, and 2.
How many distinct operations of “addition” and “product” are there, turning
A into a unitary ring, with zero the element 0 and identity the element 1?
5. Show that, in a general ring, the difference operation is neither commutative,
nor associative, Give examples of rings where this operation satisfies both
properties.
16
We will see later that if A is a ring, it is possible to define the ring of polynomials “in
the variable x” with coefficients in A, which is usually denoted by A[x]. Note that the
use of the symbol Z[i] to denote the ring of Gaussian integers uses the same idea, since a
polynomial in the imaginary unit i can always be reduced to a first degree polynomial.
1.5. RINGS, INTEGRAL DOMAINS AND FIELDS
41
6. The equation a + x = b has exactly one solution in a ring A. What can you
say about the solutions of the equation ax = b? What about the equation
x = −x?
7. Assume that A, B and C are rings. Prove that the following statements hold:
(a) The set A × B with the operations defined by (1.5.1) is a ring.
(b) If A and B both have more than one element, then A ⊕ B has zero
divisors.
(c) If A and B are unitary rings then A ⊕ B is also a unitary ring and
(A ⊕ B)∗ = A∗ × B ∗ .
(d) A⊕B is isomorphic to B⊕A, and A⊕(B⊕C) is isomorphic to (A⊕B)⊕C.
(e) φ : A → B ⊕ C is a homomorphism of rings if and only if φ(x) =
(φ1 (x), φ2 (x)), with φ1 : A → B and φ2 : A → C homomorphisms of
rings.
8. If X is any set and A is a ring, show that the set of all maps f : X → A is a
ring with the “usual” operations of addition and product.
9. Use the result of the previous exercise with X = {0, 1} and A = Z2 to obtain
an example of a ring with 4 elements. Show that this ring is isomorphic to
Z2 ⊕ Z2 .
10. Let A = {0, 1, 2, 3} be a set with four elements. Show that there exists a
field structure on A, with zero 0 and identity 1.
11. Let A be a ring. Show that the set of n × n matrices with entries in A,
denoted Mn (A), is a ring. Show that if A has an identity, then Mn (A) also
has an identity.
12. Let B be a subring of A. Give examples of rings that show that all the
following cases are possible:
(a) A has identity and B does not have identity.
(b) A does not have identity and B has identity.
(c) A and B have distinct identities.
(Hint: consider subrings of rings of 2 × 2 matrices).
13. Give an example of non-abelian, finite, ring.
14. Show that any subring of a division ring satisfies the cancellation law for
the product. Conclude that a subring of a field which contains the identity of
the field is an integral domain (but not necessarily a field).
42
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
15. Show that the cancellation law for the product holds in a ring A if and only
if A has no zero divisors.
16. Let A be an integral domain. Do the following statements hold in A?
(a) x2 = 1 implies that x = 1 or x = −1;
(b) −1 6= 1.
17. Assume that ring A is an extension of the field K, and that K contains the
identity of A. Show that A is vector space over K.
18. Determine the invertible elements in the ring Z[i] of Gaussian integers.
19. Show that a finite integral domain A 6= {0} must be a field.
20. Show that M2 (Z)∗ is infinite.
(Hint: Show that M2 (Z)∗ = {A ∈ M2 (Z) : det(A) = ±1}.)
21. Let B be a subring of A. Show that:
(a) the zero of B is the zero of A;
(b) the symmetric of an element of B is the same in B and in A.
Assume now that both A and B have an identity.
(c) Is it true that B ∗ ⊂ A∗ ?
(d) Is it true that the inverse of an element in B ∗ is necessarily the same as
the inverse in A∗ ? What if the identities of A and B coincide?
1.6
Homomorphisms and Isomorphisms of Rings
We consider now the notions of homomorphisms and isomorphisms associated with the concept of ring. As one could expect, these are maps whose
domain and target are rings, and which moreover preserve the algebraic operations of these rings. Observe that in the following definition we use the
same symbols to represent the algebraic operations of distinct rings, in spite
that these can be very different operations. Note also that a homomorphism
of rings is a special case of a group homomorphism.
Definition 1.6.1. Let A and B be rings and φ : A → B a map. We say
that φ is a homomorphism of rings if:
(i) φ(a1 + a2 ) = φ(a1 ) + φ(a2 ), ∀a1 , a2 ∈ A;
1.6. HOMOMORPHISMS AND ISOMORPHISMS OF RINGS
43
(ii) φ(a1 a2 ) = φ(a1 )φ(a2 ), ∀a1 , a2 ∈ A.
An isomorphism of rings is a homomorphism which is a bijection17 . We
say that the rings A and B are isomorphic if there exists some isomorphism
φ : A → B.
Examples 1.6.2.
1. Let us denote the complex conjugate of z = x + iy by z = x − iy. We have
z + w = z + w,
zw = zw.
According to the previous definition, the map φ : C → C given by φ(z) = z is
an automorphism of the ring C.
2. Consider the map φ : C → M2 (R) defined by
x −y
φ(x + iy) =
.
y x
Notice that
so that,
Similarly,
so that,
x −y
y x
+
x′
y′
−y ′
x′
=
x + x′
y + y′
−y − y ′
x + x′
,
φ(x + iy) + φ(x′ + iy ′ ) = φ((x + iy) + (x′ + iy ′ )).
x −y
y x
x′
y′
−y ′
x′
=
xx′ − yy ′
xy ′ + x′ y
−(xy ′ + x′ y)
xx′ − yy ′
,
φ(x + iy)φ(x′ + iy ′ ) = φ((x + iy)(x′ + iy ′ )).
We conclude that φ is a homomorphism of rings. It is easy to check that φ is an
injective homomorphism ( i.e., it is a monomorphism) which is not surjective.
3. Let φ : Z → Z2 be given by φ(n) = 0, if n is even, and φ(n) = 1, if n is odd.
It is easy to check that φ is a surjective homomorphism of rings ( i.e., it is a
epimorphism), which is not injective.
4. Let φ : R → M2 (R) be given by
φ(x) =
17
x 0
0 0
.
Just like the cases of monoids and groups, we will also use the terms epimorphism
and monomorphism of rings, as well as endomorphism and automorphism of rings, which
are defined in the obvious way.
44
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
It should be obvious that φ is a monomorphism (which is not surjective). Notice
that the matrices in the image of this monomorphism form a subring of M2 (R)
with identity distinct from the identity of ring M2 (R).
5. If S, T : Rn → Rn are linear transformations we define the addition S + T
and composition ST by
(S + T )(x) = S(x) + T (x),
(ST )(x) = S(T (x)).
We have already observed that these operations make the set of linear transformations of Rn into itself a ring, which we will denote by L(Rn , Rn ). Now fix
a basis for Rn and let M(S) denote the matrix of the linear transformation S
relative to this basis. It is clear that M(S) is a n × n matrix with real entries,
and we know from Linear Algebra that the map M : L(Rn , Rn ) → Mn (R)
satisfies the identities
M(S + T ) = M(S) + M(T ),
M(ST ) = M(S)M(T ).
Hence M is an isomorphism of rings.
In certain cases, when there exists an “obvious” injective homomorphism
φ : A → B, we use the same symbol to denote a and φ(a). Although this is
not recommended practice from a logical point of view, it is unavoidable in
order not to have a too heavy notation. Examples of this practice are the
representation of the real number a and the complex number (a, 0) or the
constant polynomial p(x) = a, by the same symbol.
Since every homomorphism of rings φ : A → B is also a homomorphism
of the additive groups (A, +) and (B, +), we can restate Proposition 1.4.3
as follows:
Proposition 1.6.3. Let φ : A → B be a homomorphism of rings. Then:
(i) φ(0) = 0;
(ii) φ(−a) = −φ(a).
According to the definition of ring homomorphism and the previous
proposition, a homomorphism φ : A → B “preserves” additions, products,
the zero, and symmetrics. However, certain notions related to the product
are nor preserved, at least in their most obvious form. One of the examples
above shows that if A has identity 1 for the product, then φ(1) may fail to
be the identity of the product in B. We will see in the exercises at the end of
the section how to formulate a correct statement concerning “preservation”
of identities.
1.6. HOMOMORPHISMS AND ISOMORPHISMS OF RINGS
45
The facts concerning the equation φ(x) = y for a ring homomorphism is
very similar to the case of group homomorphisms. The kernel of φ is the
set of solutions of the homogenous equation φ(x) = 0, i.e.,
N (φ) = {x ∈ A : φ(x) = 0},
while the image φ(A) is the set of y ∈ B for which the equation φ(x) = y
has solutions x ∈ A. Obviously, φ is surjective if and only if φ(A) = B, and
φ is injective if and only if N (φ) = {0}.
If we restate Theorem 1.4.10 in additive notation, we obtain:
Theorem 1.6.4. Let φ : A → B be a homomorphism of rings. Then:
(i) φ(x) = φ(x′ ) if and only if x − x′ ∈ N (φ);
(ii) φ is injective if and only if N (φ) = {0};
(iii) if x0 is a particular solution of φ(x) = y0 , then the general solution is
x = x0 + n, with n ∈ N (φ).
Here is a simple example illustrating this result.
Example 1.6.5.
Consider the homomorphism φ : Z → Z2 given by φ(n) = 0, if n is even,
and φ(n) = 1, if n is odd. The kernel of φ is the set of even integers. Since
φ(0) = 0 and φ(1) = 1, we conclude that
φ(x) = 0
φ(x) = 1
⇔ x = 2n, with n ∈ Z,
⇔ x = 1 + 2n, with n ∈ Z.
Given a homomorphism φ : A → B, we know that N (φ) and φ(A) are
subgroups of the additive groups (A, +) and (B, +). It is easy to check that
these subgroups are actually subrings:
Proposition 1.6.6. If φ : A → B is a ring homomorphism, then N (φ) is
a subring of A, and φ(A) is a subring of B. In particular, if φ is injective,
then A is isomorphic to φ(A).
Proof. In order to show that N (φ) and φ(A) are subrings, we need only to
show that they are closed for the respective products.
If b1 , b2 ∈ φ(A), there exist a1 , a2 ∈ A such that b1 = φ(a1 ) and b2 =
φ(a2 ). Hence, since φ is a homomorphism, we have
b1 b2 = φ(a1 )φ(a2 ) = φ(a1 a2 ) ∈ φ(A),
46
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
which shows that φ(A) is closed for the product of B.
If a1 , a2 ∈ N (φ), then φ(a1 ) = φ(a2 ) = 0. Again using that φ is a
homomorphism, we have
φ(a1 a2 ) = φ(a1 )φ(a2 ) = 0 · 0 = 0,
hence a1 a2 ∈ N (φ), and N (φ) is closed for the product of A.
Finally, it is obvious that if φ is injective, then φ is gives an isomorphism
between A and φ(A).
Examples 1.6.7.
1. Considere again the homomorphism φ : C → M2 defined by
x −y
φ(x + iy) =
.
y x
Form the theorem above, we conclude that the set of matrices of the form
x −y
y x
is a subring of M2 (R), isomorphic to the field of complex numbers.
2. If instead we consider the homomorphism φ : R → M2 given by
x 0
φ(x) =
,
0 0
we conclude that M2 contains also subring isomorphic to the field of real numbers. Note, however, that M2 contains various distinct subrings isomorphic to
the field of real numbers: in the previous example we can restrict to the real
numbers and obtain the injective homomorphism φ : R → M2 , x 7→ xI, where
I is the identity matrix.
We have seen that the kernel of a group homomorphism is a special kind
of subgroup, namely a normal subgroup. Similarly, the kernel N (φ) of a
ring homomorphism φ : A → B is a subring of A of a special kind: not only
N (φ) is closed for the product, just like any subring, but also in order for
the product ab to belong to N (φ) it is enough that just one of the factors a
or b belongs to N (φ):
Proposition 1.6.8. Let φ : A → B be a ring homomorphism If a ∈ N (φ)
and a′ is any element of A, then both aa′ and a′ a belong to N (φ).
1.6. HOMOMORPHISMS AND ISOMORPHISMS OF RINGS
47
Proof. Note that
a1 a2 ∈ N (φ) ⇔ φ(a1 a2 ) = 0 ⇔ φ(a1 )φ(a2 ) = 0.
For the product φ(a1 )φ(a2 ) to be zero it is enough for one of the factors
φ(a1 ) or φ(a2 ) to be zero, i.e., for one of the elements a1 or a2 to belong to
the kernel N (φ).
The subrings with this property deserve a special name.
Definition 1.6.9. Let A be ring and I ⊂ A a subring. We say that I is an
ideal of A if for all a ∈ A and b ∈ I one has ab, ba ∈ I.
Not all subrings of a ring are ideals. The following examples illustrate
both cases.
Examples 1.6.10.
1. It is clear that Z is a subring of R (the difference and the product of integers
is always an integer). However, Z is not an ideal of R (the product of an
integer by real number may fail to be an integer).
2. Let A = Z and let I be the set of even integers. Obvioulsy, I is a subring of
Z (the difference and the product of even integers is an even integer) but I is
also an ideal of Z (the product of any integer by an even integer is always an
even integer). We will see later that all the subrings of Z are automatically
ideals. Note also that I is a subring of R, but obviously not an ideal of R.
3. Every ring A always has the “trivial” ideals {0} and A.
4. In some cases, a ring only has the trivial ideals. For example, this always
happens with a field. In fact, if K is a field, and I ⊂ K is an ideal which
contains an element x 6= 0 ( i.e., if I 6= {0}), then xx−1 = 1 ∈ I (because x ∈ I
and x−1 ∈ K). But then every element y ∈ K belongs to I, because y = 1y,
where 1 ∈ I and y ∈ K.
In a non-abelian ring A we can consider subrings for which the condition
of ideal is only verified for one side: a left ideal of A is a subring B ⊂ A
such that if a ∈ A and b ∈ B then ab ∈ B. Similarly, a right ideal of A
is a subring B ⊂ A such that if a ∈ A and b ∈ B then ba ∈ B. Obviously
I ⊂ A is an ideal in the sense of Definition 1.6.9 if and only if it is both a
left and a right ideal. For an abelian ring, all these notions coincide. Due
to Proposition 1.6.8, the lateral ideals play a much less important role that
the bilateral ideals.
48
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
Exercises.
1. Let A be a ring and φ, ψ : A → A endomorphisms. Show that the composition φ ◦ ψ is an endomorphism, but the sum φ + ψ may fail to be one. In
particular, show that the set of endomorphisms of A, denoted by End(A), with
the composition operation, form a monoid.
2. Let A be a ring and φ, ψ : A → A automorphisms. Show that the composition
φ◦ψ and the inverse φ−1 are automorphisms. In particular, show that the set of
all automorphisms of A, denoted by Aut(A), with the composition operation,
form a group.
3. Every integer m is of the form m = 3n + r, where n is the result of the
division of m by 3 and r is the remainder. Note that n and r are unique, as
long as 0 ≤ r < 3. Show that the map φ : Z → Z3 given by φ(m) = r is a
homomorphism of rings. What is the kernel of this homomorphism?
4. Show that if A and B are rings, A has identity 1, and φ : A → B is a
homomorphism then φ(1) is the identity of φ(A), which may differ from the
identity of B. Show also that if a ∈ A∗ , then φ(a−1 ) is the inverse of φ(a) in
φ(A), which may not be the inverse of φ(a) in B. In particular, φ(a) may fail
to be invertible in B.
5. Show that, if A and B are rings, A has identity 1, and φ : A → B is an
isomorphism then φ(1) is the identity of B. Show also that if a ∈ A∗ , then
φ(a−1 ) is the inverse of φ(a) in B and φ(A∗ ) = B ∗ .
6. Show that, if A and B are rings, A has identity 1, and φ : A → B is an
isomorphism, then B is a field (respectively, division ring, integral domain) if
and only if A is a field (respectively, division ring, integral domain).
7. Show that, if K is a field, A is a ring, and φ : K → A is a homomorphism,
then either A contains a subring isomorphic to K, or φ is identically 0.
8. Are there any subrings of Z ⊕ Z which are not ideals of Z ⊕ Z?
9. Determine End(A) when A = Z and A = Q.
(Hint: Find φ(1) and proceed by induction.)
10. Determine End(R).
(Hint: Show that φ(x) ≥ 0 whenever x ≥ 0, so φ is non-decreasing.)
11. Determine all the homomorphisms φ : C → C which satisfy φ(x) ∈ R
whenever x ∈ R.
(Hint: Show that if φ(1) 6= 0 and x = φ(i) then x2 = −1.)
1.6. HOMOMORPHISMS AND ISOMORPHISMS OF RINGS
12. Assume that I is an ideal of M2 (R) and
M2 (R).
1
0
0
0
49
∈ I. Show that I =
13. Determine all the ideals of M2 (R).
14. Let A be a commutative ring with identity and consider the ring Mn (A).
Show that the map det : Mn (A) → A defined by
X
det(B) =
sgn(π)a1π(1) a2π(2) · · · anπ(n) ,
π∈Sn
where B = (aij ), preserve products (but is not a homomorphism of rings).
Conclude that a matrix B ∈ Mn (A)∗ if and only if det(B) ∈ A∗ .
15. Assume that C ⊂ B ⊂ A where B is a subring of A.
(a) If C is a subring of B, is it true that C is a subring of A?
(b) If C is an ideal of B, is it true that C is an ideal of A?
(c) If C is an ideal of A, is it true that C is an ideal of B?
16. Show that, if A is a unitary abelian ring and its only ideals are {0} and A,
then A is a field. If A is non-abelian, is it true that A is necessarily a division
ring?
17. Let A and B be unitary rings.
(a) Assume that J is an ideal of A ⊕ B, and show that (a, b′ ), (a′ , b) ∈ J ⇒
(a, b) ∈ J.
(b) Show that K ⊂ A × B is an ideal of A ⊕ B if and only if K = K1 × K2 ,
where K1 is an ideal of A, and K2 is an ideal of B.
18. This exercise concerns the decomposition of rings into direct sums.
(a) Assume first, that A is isomorphic to B ⊕ C, and show that there exist
ideals I and J of A such that I ∩ J = {0}, and I + J = {i + j : i ∈ I, j ∈
J} = A.
(b) Assume now that there exist ideals I and J of A such that I ∩ J = {0},
and I + J = A. Show that A is isomorphic to I ⊕ J as follows:
(i) Show that, if i ∈ I, j ∈ J and i + j = 0, then i = j = 0.
(ii) Show that, if i ∈ I and j ∈ J, then ij = 0.
(iii) Show that the map φ : I ⊕ J → A, given by φ(i, j) = i + j, is an
isomorphism of rings.
50
1.7
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
The Quaternions
The field of complex numbers is an extension of the field of real numbers.
This last field is an extension of the field of rationals, which in turn is an
extension of the ring of integers. It is natural to ask if one can continue this
sequence, if there is an extension of the field of complex numbers, or to what
extend this process of determining successive extensions has a “natural”
end. In the XIX Century, W. R. Hamilton 18 asked and tried to solve this
question.
Z
Q
R
C
??
PSfrag replaements
Figure 1.7.1: Hamilton’s Problem.
In his first attempt (which lasted 20 years!), Hamilton looked for a field
consisting of “numbers” of the form a + ib + cj, where a, b, c ∈ R and i is
the imaginary unit. In other words, in modern language, he tried to look
for a field structure in the set R3 containing a subfield isomorphic to the
field of complex numbers. After many attempts to find a “reasonable” value
for the product ij (in the form ij = a + bi + cj), he realized he needed to
introduce an additional “number” k, so that ij = k. Hamilton eventually
discovered the existence not of a field, but of a division ring structure on
the set R4 . The elements of this division ring are now called quaternions,
or Hamilton’s numbers.
We denote the elements of the canonical basis of the vector space R4 by
1 = (1, 0, 0, 0),
18
i = (0, 1, 0, 0),
j = (0, 0, 1, 0),
k = (0, 0, 0, 1).
William Rowan Hamilton (1805-1865), was a great Irish astronomer and mathematician. Hamilton like Gauss was very precocious: at the age of 5 he could read Greek Hebrew
and Latin, and at the age of 10 he was familiar with half a dozen oriental languages!
1.7. THE QUATERNIONS
51
Hence, we can write a quaternion q = (a, b, c, d) as
q = a1 + bi + cj + dk,
where a, b, c, d are real numbers. We look for a ring structure on R4 for
which the injective maps φ : R → R4 and ψ : C → R4 defined by φ(x) = x1
and ψ(x + iy) = x1 + yi, are ring homomorphisms, so that we can identify
the real numbers x ∈ R with the quaternions {(x, 0, 0, 0) : x ∈ R}, and the
complex numbers x + iy ∈ C with the quaternions {(x, y, 0, 0) : x, y ∈ R}.
Given a quaternion q = a1 + bi + cj + dk, we call a1 the real part of
q, and bi + cj + dk vectorial part of q. As in the case of complex numbers,
we will usually write q = a + bi + cj + dk, so that the quaternion 1 is
not included in the notation. The addition of quaternions is the usual
vector addition in R4 . Hence, it is obvious that, if x and y are reals and z
and w are complex numbers, then
φ(x + y) = φ(x) + φ(y),
ψ(z + w) = ψ(z) + ψ(w).
The product of quaternions is harder to discover. Let us observe
first that, if we think of the product of a real number a by the quaternion q
as the product (a1)q, then this product should amount to the usual product
of a scalar by a vector, and in particular the quaternions form a vector space
of dimension 4 over the reals. On the other hand, we should also have
i2 = −1,
(1.7.1)
since
i2 = ψ(i)ψ(i) = ψ(i2 ) = ψ(−1) = φ(−1) = −φ(1) = −1.
Hamilton found the fundamental identities:
(1.7.2)
ij = k,
jk = i,
ki = j.
Using these identities, the associativity of the product, and the relation
i2 = −1, it is possible to determine the products ji, kj, ik, j 2 and k2 . For
example,
ij = k ⇒ i(ij) = ik ⇒ (ii)j = ik ⇒ (−1)j = ik ⇒ −j = ik.
We leave the details of the other products as an exercise. The final result is:
(1.7.3)
j 2 = k2 = −1,
ji = −k,
kj = −i,
ik = −j.
52
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
Observe that the product of quaternions is not commutative, so the quaternions do not form a field. From the identities (1.7.1), (1.7.2), and (1.7.3),
and using only the associative and distributive properties, it is possible to
find the product of two arbitrary quaternions. Instead, we choose to invert
the whole process and to define formally the ring of quaternions as follows.
Theorem 1.7.1. The set R4 with the usual vector addition and the product
of q = a + v and r = b + w defined by19
(1.7.4)
(a + v)(b + w) = ab − (v · w) + aw + bv + v × w,
is a division ring which will be denoted by H.
Proof. We start by observing that the product (q, r) → qr defined by (1.7.4)
coincides with the product by scalars if q ∈ R, and also that it is R-bilinear:
given a1 , a2 ∈ R, q, q 1 , q 2 ∈ R4 and r 1 , r 2 , r ∈ R4 we have
(a1 q 1 + a2 q2 )r = a1 (q 1 r) + a2 (q 2 r),
q(a1 r 1 + a2 r 2 ) = a1 (qr1 ) + a2 (qr 2 ).
We check also that, with the notation i, j and k for the canonical basis of
R3 , the identities (1.7.1), (1.7.2), and (1.7.3) hold.
In order to check that associativity holds, we now use the bilinearity and
the identities (1.7.1), (1.7.2), and (1.7.3), to find the products
(a0 + a1 i + a2 j + a3 k) ((b0 + b1 i + b2 j + b3 k)(c0 + c1 i + c2 j + c3 k)) ,
and
((a0 + a1 i + a2 j + a3 k)(b0 + b1 i + b2 j + b3 k)) (c0 + c1 i + c2 j + c3 k),
and verify that they coincide.
Finally, it is clear that
q1 = 1q = q,
and we leave as an exercise to check that for any non-zero quaternion q =
(a, b, c, d) the following identities hold:
(1.7.5)
19
qq ′ = q ′ q = 1, where q′ =
a − bi − cj − dk
.
a2 + b2 + c2 + d2
In this expression, (v · w) and v × w denote the usual inner and cross products.
Actually, these operations and the notation i, j and k for the canonical basis of R3 are
traces that remain from Hamilton’s original work.
1.7. THE QUATERNIONS
53
An interesting fact is that the ring of quaternions H is isomorphic to a
subring of the ring M4 (R) of 4 × 4 matrices with real entries. This gives
concrete realization of this division ring and a different proof of Theorem
1.7.1. For that, consider the 2 × 2 matrices:
1 0
0 −1
0 0
M=
,
N=
,
0=
.
0 1
1 0
0 0
These allow to define a linear transformation ρ : R4 → M4 (R) by setting on
the canonical basis:
N
0
0
M
0 N
ρ(i) =
,
ρ(j) =
,
ρ(k) =
,
0 −N
−M 0
N 0
and
ρ(1) =
M
0
0
M
.
Then, we have:
Proposition 1.7.2. The map ρ : H → M4 (R) is an injective homomorphism, so the ring M4 (R) has a subring isomorphic to the division ring H.
The proof of this proposition is a simple computation. We will see later
that as a consequence of the Fundamental Theorem of Algebra (“Every polynomial with complex coefficients of positive degree has one complex root”)
there is no field which is an extension of C and at the same time a vector
space of finite dimension over R or C. In other words, we now know that
the original problem of Hamilton does not have a solution.
Exercises.
1. Using formula (1.7.4) for the product of quaternions show that relations
(1.7.1), (1.7.2) and (1.7.3) hold.
2. Using only (1.7.4), bilinearity and the identities (1.7.1), (1.7.2), and (1.7.3),
prove formula (1.7.5) for the inverse of a quaternion q 6= 0.
3. For any quaternion q = a1 + bi + cj + dk one defines its conjugate to be the
quaternion q = a1 − bi − cj − dk. Show that:
(a) The map φ : H → H defined by φ(q) = q is an automorphism of (H, +).
What can you say about φ(q 1 q 2 )?
√
(b) qq = ||q||2 where ||q|| = a2 + b2 + c2 + d2 denotes the norm of the
quaternion q = a1 + bi + cj + dk;
54
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
(c) The inverse of the quaternion q is the quaternion q −1 =
q
||q||2 .
4. Check that the set formed by all linear combinations of the 4 × 4 matrices:
M 0
N
0
0
M
0 N
,
,
,
,
0 M
0 −N
−M 0
N 0
is a subring of M4 (R).
5. Prove Proposition 1.7.2, i.e., show that the map ρ : H → M4 (R) is is an
injective homomorphism.
6. Find all solutions of the equation x2 = −1 in the ring H of quaternions.
7. Assume that {1} ⊂ K ⊂ L ⊂ M are fields, with M an extension of L, and
L an extension of K. Recall that M is a vector space over K and over L, and
that, in turn, L is a vector space over K. Assume that the dimension of M
over L is m, and that the dimension of L over K is n. Show that the dimension
of M over K is mn. Conclude that a non-trivial extension of C has at least
dimension 4 over R.
8. Consider the quaternions of the form a + bi + cj + dk, where a, b, c, d ∈ Z.
Verify that these quaternions form a non-abelian ring, which is not a division
ring, but where the cancellation law holds.
9. Verify that the set formed by all the invertible elements of the ring in the
previous exercise is a non-abelian group with 8 elements, which will be denoted
by H8 . Determine all the subgroups of H8 , and identify the normal subgroups.
1.8
Symmetries
In this section we will see how group theory can be used to formalize the
notion of symmetry. We will focus mainly on symmetries of planar figures.
For that, we start by observing that a “plane figure” is formally a set Ω ⊂ R2 ,
and we call a symmetry of Ω any map f : R2 → R2 which preserves
distances between points of R2 , i.e., such that:
||f (x) − f (y)|| = ||x − y||,
∀x, y ∈ R2 ,
and which transforms the set Ω in itself, i.e., such that:
f (Ω) = Ω.
1.8. SYMMETRIES
55
Example 1.8.1.
If Ω is the unit circle centered at the origin, any planar rotation around the
origin is a symmetry of Ω. Similarly, any reflection of the plane relative to a
line through the origin is also a symmetry of Ω.
The symmetries of the plane or, more generally, the symmetries of Rn ,
are the maps f : Rn → Rn which preserve distances, and which for that
reason are called isometries.
Examples 1.8.2.
1. Any translation is an isometry of the plane.
2. Any rotation is an isometry of the plane.
3. Any reflection (relative to a line or a point) is an isometry of the plane.
Our next aim is to classify all the isometries of Rn . For that, we look
first at the isometries f which fix the origin, i.e., such that f (0) = 0.
Proposition 1.8.3. For a map f : Rn → Rn such that f (0) = 0, the
following statements are equivalent:
(i) f is an isometry, i.e.,
||f (x) − f (y)|| = ||x − y||, ∀x, y ∈ Rn ;
(ii) f preserves inner products, i.e.,20
f (x) · f (y) = x · y, ∀x, y ∈ Rn .
Proof. We assume first that f is an isometry, and we observe that
||x|| = ||x − 0|| = ||f (x) − f (0)|| = ||f (x)||.
Moreover, we note that
||f (x) − f (y)||2 = ||f (x)||2 + ||f (y)||2 − 2f (x) · f (y),
||x − y||2 = ||x||2 + ||y||2 − 2x · y.
Since by assumption ||f (x) − f (y)|| = ||x − y||, and we have already shown
that ||x|| = ||f (x)||, it follows immediately that f (x) · f (y) = x · y, for all
x, y ∈ Rn . We conclude that (i) implies (ii).
We leave as an exercise the proof that (ii) implies (i).
20
In this section we denote by ‘·” the usual inner product of Rn .
56
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
Still considering only isometries fixing the origin, we show next that such
an isometry is necessarily a linear transformation.
Proposition 1.8.4. If f : Rn → Rn is an isometry and f (0) = 0, then f
is a linear transformation.
Proof. Denote by {e1 , . . . , en } the canonical basis of Rn and let vk = f (ek ).
The vectors v k are unitary (because ||v k || = ||f (ek )|| = ||ek || = 1) and
orthogonal (because v i · v j = f (ei ) · f (ej ) = ei · ej ). Hence, the vectors
{v 1 , . . . , v n } also form a basis of Rn (why?).
Let now x, y ∈ Rn , where y = f (x). Since {e1 , . . . , en } and {v 1 , . . . , v n }
are both bases of Rn , there exist scalars x1 , . . . , xn and y1 , . . . , yn such that
x=
n
X
xk ek ,
y=
k=1
n
X
yk v k .
k=1
Notice that xk = x · ek and yk = y · v k , and since
y · v k = f (x) · f (ek ) = x · ek ,
we must have xk = yk . Therefore:
n
n
n
X
X
X
f (x) = f (
xk ek ) =
yk v k =
xk f (ek ),
k=1
k=1
k=1
so f is a linear transformation.
Since an isometry satisfying f (0) = 0 is a linear transformation, it is
natural to characterize an isometry in terms of its matrix representation.
For that, we recall that a n × n matrix is called orthogonal if AT A = I, or
equivalently A−1 = AT . Since det(AT ) = det(A), for any orthogonal matrix
A we have
[det A]2 = det AT det A = det(AT A) = det I = 1,
so that det A = ±1.
Proposition 1.8.5. If f : Rn → Rn is any map, the following statements
are equivalent:
(i) f is an isometry and f (0) = 0;
(ii) f is a linear transformation and the matrix of f relative to the canonical basis is orthogonal: f (x) = Ax, with AT A = I.
1.8. SYMMETRIES
57
Proof. (i) ⇒ (ii). If A isP
the matrix of f relative to the canonical basis, then
A = (aij ), where v j = ni=1 aij ei . Therefore the column j of A is formed
by the components of the vector v j in the canonical basis. Since the vectors
v j are unitary and orthogonal, we conclude that
vj · vk =
n
X
i=1
aij aik =

 1

0
if
j = k,
if
j 6= k,
which means that AT A = I, This shows that the matrix A is orthogonal.
(ii) ⇒ (i). Exercise.
The linear transformations which are isometries are usually called orthogonal transformations. We can now characterize an arbitrary isometry
of Rn as a orthogonal transformation followed by a translation, as follows:
Theorem 1.8.6. If f : Rn → Rn , the following statements are equivalent:
(i) f is an isometry,
(ii) there exists an orthogonal matrix A and a vector a ∈ Rn such that
f (x) = Ax + a.
Proof. (i) ⇒ (ii). Let f be an isometry, and a = f (0). The map g(x) =
f (x) − a satisfies g(0) = 0 and is an isometry:
||g(x) − g(y)|| = ||f (x) − f (y)|| = ||x − y||.
According to Proposition 1.8.5, g is an orthogonal transformation.
(ii) ⇒ (i). An orthogonal matrix A defines an orthogonal transformation
g(x) = Ax, which we know is an isometry. It is immediate to check that if
a ∈ Rn , then f (x) = Ax + a is an isometry.
The set of all isometries f : Rn → Rn , with binary operation composition
of isometries, form a group E(n, R), called the symmetry group of Rn
or euclidean group (why is this a group?). Theorem 1.8.6 shows that an
isometry f can be identified with a pair (A, a), formed by an orthogonal
matrix A and a vector a ∈ Rn , so one often identifies the euclidean group
with the set of all such pairs:
E(n, R) = {(A, a) : A ∈ Mn (R), AT A = I, a ∈ Rn }.
58
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
We leave it as an exercise to check that the composition of isometries corresponds to the following binary operations on pairs:
(A1 , a1 ) · (A2 , a2 ) = (A1 A2 , A1 a2 + a1 ).
The euclidean group has various interesting subgroups. For example,
the translations are the isometries of the form f (x) = x + a and they
form a subgroup of E(n, R) isomorphic to (Rn , +). Under the identification
of isometries with pairs, we have that Rn ⊂ E(n, R) corresponds to the
injective homomorphism: a 7→ (I, a).
The orthogonal transformations f : Rn → Rn also form a subgroup
O(n, R) ⊂ E(n, R), called the orthogonal group. We can view this
group as the group of orthogonal matrices:
O(n, R) = {A ∈ Mn (R) : AT A = I}.
and the inclusion O(n, R) ⊂ E(n, R) corresponds to the injective homomorphism: A 7→ (A, 0).
As we have observed above, the determinant of an orthogonal transformation f must be 1 or −1, and the orthogonal transformations f of
determinant 1 form a subgroup SO(n, R) ⊂ O(n, R) called the special
orthogonal group. The elements of SO(n, R) are called proper rotations ou simply rotations. In terms of matrices:
SO(n, R) = {A ∈ Mn (R) : AT A = I, det A = 1}.
An isometry f : Rn → Rn which coincides with its own inverse, i.e., such
that f ◦ f = I is called a reflection. An isometry f (x) = Ax + a is a
reflection if and only if A2 = I and Aa = −a. Composition of reflections
may fail to be a reflection, so reflections do not form a group.
More generally, if Ω ⊂ Rn , then the isometries of Rn which preserve Ω
form a subgroup GΩ ⊂ E(n, R), called the symmetry group of Ω:
GΩ = {f : Rn → Rn : f is an isometry and f (Ω) = Ω}.
We can talk of the symmetries of Ω which are translations, orthogonal transformations, rotations, reflections, etc.
Examples 1.8.7.
1. Let Ω ⊂ R2 be a rectangle centered at origin, with sides (of different lengths)
parallel to the coordinate axes. The corresponding symmetry group GΩ has 4
elements: the identity, the reflections in the axes Ox and Oy, and the rotation
of 180o around the origin (which is also a reflection relative to the origin). The
symmetry group of the rectangle, often called the Klein group, is isomorphic
to the direct product Z2 × Z2 .
1.8. SYMMETRIES
59
PSfrag replaements
a
b
Figure 1.8.1: Symmetries of a rectangle.
2. If Ω ⊂ R2 is a regular polygon with n sides centered at origin, the corresponding symmetry group GΩ is called the dihedral group, and has 2n
elements: the rotations of 2kπ/n around the origin, the reflections relative to
lines through the origin and the vertices, and the reflections relative to lines
through the origin and which bisect the sides of the polygon. It is common to
denote this group by the symbol Dn .
PSfrag replaements
a
b
Figure 1.8.2: Symmetries of an equilateral triangle.
For example, for an equilateral triangle (n = 3), there are three rotational
symmetries {R, R2 , R3 = I} generated by a rotation R of 2π/3 around the
origin. Their matrix representations relative to the canonical basis of R2 are
!
√
√ !
1
3
− 23
−
−√12
2
2
2
√
R=
, R =
.
3
− 23 − 21
− 12
2
We have also three axes of symmetry giving rise to 3 reflections {σ1 , σ2 , σ3 }. If
4πi
2πi
we choose the triangle so that its vertices are 1, e 3 , and e 3 , then the matrix
representations of the reflections relative to the canonical basis of R2 are:
√ !
√ !
1
1
3
3
−
−
1 0
−
2
2
√2
√2
σ1 =
,
σ
=
.
, σ2 =
3
3
1
1
0 −1
− 3
2
2
2
2
60
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
We leave as an exercise to check that the resulting symmetry group D3 is
isomorphic to the symmetric group S3 .
In the previous examples, all the figures are bounded. It is also interesting to study groups of symmetry of unbounded figures. Consider, for
example, the following subset of the plane:
Ω = {na + mb : n, m ∈ Z},
where a, b ∈ R2 are fixed, linearly independent vectors of the plane. Then
Ω is a discrete set of points, which can be thought of as a model for a planar
lattice of atoms that extend indefinite in the plane (e.g., a crystal).
b
a
PSfrag replaements
Figure 1.8.3: A planar lattice Ω.
The unbounded planar sets which exhibit symmetry are used extensively
in the decoration of surfaces (tilings). Looking at various examples of decorations, one is lead to conjecture that such symmetric tilings are obtained by
repetition of one the following figures: equilateral triangle, square, rectangle
or hexagon21 .
We will not determine the full symmetry group GΩ , but will focus instead
on the simpler problem of determining which rotations can be symmetries of
Ω. The facts mentioned above suggest that if some rotation is a symmetry
of Ω, then it must be a rotation by 60o , 90o , 120o or 180o (besides, of course,
the identity, which is also a rotation). To see that this is indeed the case, let
f be some rotation which is a symmetry of Ω, and denote by A its matrix
21
For a detail discussion of the notion of symmetry and its relationship with art, see
the classical monograph of H. Weyl, Symmetry, Princeton University Press, Princeton
N. J. (1952).
1.8. SYMMETRIES
61
PSfrag replaements
a
b
Figure 1.8.4: Symmetries of unbounded planar figures.
representation relative to the canonical basis. We can always write A in the
form:
cos θ − sin θ
A=
.
sin θ cos θ
Also, let B be the matrix representation of f relative to the basis {a, b}. In
this basis all the elements of Ω have integer coordinates (indeed, the points
of Ω are precisely the vectors of R2 whose components in the basis {a, b}
are integers). Hence, the matrix B must have integer entries, since these
entries represent the vectors f (a) and f (b), which are necessarily points of
Ω. We can then write:
n11 n12
,
B=
n21 n22
where the nij are integers.
Since A and B represent the same linear transformation relative to distinct basis, we see that A and B are similar matrices i.e., there exists a
non-singular matrix S such that S −1 AS = B. Now observe that A and B
have the same trace (the sum of the diagonal elements):
2 cos θ = tr(A) = tr(ASS −1 ) = tr(S −1 AS) = tr(B) = n11 + n22 ,
where we used the fact that tr(CD) = tr(DC), for any square matrices C
and D. In particular, 2 cos θ must be an integer. Since −1 ≤ cos θ ≤ 1, we
62
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
conclude that the possible values of cos θ are −1, −1/2, 0, 1/2 or 1, i.e.,
that θ = 180o , 120o , 90o , 60o or 0o , as claimed.
There are many other examples, with concrete important applications,
where group theory plays fundamental role in understanding issues related
with the idea of symmetry.
Examples 1.8.8.
1. By exploring the symmetries of certain unbounded planar regions one can
determine the so-called crystallographic groups, which are used in the classification of the crystals that occur in Nature.
2. We will see in Chapter 7 that one can associate to each polynomial a group
of symmetries, consisting of certain permutations of its roots, the so-called
Galois group of the polynomial. The structure of the Galois group of the polynomial determines whether the roots of the polynomials can be obtained from
its coefficients by expressions which only involve fractions and radicals. The
application of group theory to Galois theory allows one to explain why there are
no formulas to express the solutions of polynomial equations for degree greater
or equal to 5.
3. One of the most basic and fruitful ideas in Physics in the principle of “ material
objectivity” or “ material frame-indifference”. This principle expresses the idea
that different observers, using different frames of reference, should describe the
same physical phenomenon using the same laws of Physics. From a mathematical point of view, this forces the laws of Nature to have as symmetry groups the
groups of transformations that relate observers in different frames of reference.
For example, according to this principle, the laws of Classical Mechanics and
the laws of Electromagnetism must have the same symmetry group. It was the
careful and systematic use of this idea that led Albert Einstein to the discover
the Theory of Relativity, for sure one of the greatest achievements of Science.
Exercises.
1. Assume that f : R2 → R2 is a non-trivial isometry of the plane. Show that:
(a) if f fixes two points a and b of the plane. then f is a reflection relative
to the line determined by a and b;
(b) if f fixes a unique point a, then f is a rotation around a;
(c) if f does not fix any point, then f is a translation, followed possibly by
a rotation and/or a reflection.
2. Conclude the proof of Proposition 1.8.3.
3. Conclude the proof of Proposition 1.8.5.
1.8. SYMMETRIES
63
4. Show that the Klein group is isomorphic to Z2 ⊕ Z2 .
5. Show that the following sets of transformations are groups:
(a) The orthogonal transformations and proper rotations f : Rn → Rn .
(b) The isometries f : Rn → Rn .
(c) The symmetries of a figure Ω ⊂ Rn .
6. Show that the special orthogonal group SO(n, R) is a normal subgroup of
O(n, R), but it is not a normal subgroupof E(n, R).
7. Show that D3 (the symmetry group of the equilateral triangle) is isomorphic
to the symmetric group S3 .
8. Describe the the group D4 (the symmetry group of a square).
9. The honeycomb in Figure 1.8.4 has rotational symmetries consisting of rotations of 60o around the centers of the faces, as well as rotations of 120o around
the vertices. How can one modify this figure so that the only rotational symmetries are rotations of 120o ?
10. Let f : R3 → R3 be a rotation. Show that there exists a basis {v1 , v 2 , v 3 }
of R3 relative to which the matrix representation of f is


1
0
0
 0 cos θ − sin θ  .
0 sin θ cos θ
One calls v 1 the axis of rotation and θ the angle of rotation of f . Can you
generalize this result to dimensions n > 3?
11. Let f : R3 → R3 be a reflection that fixes the origin. Show that there exists
a basis {v 1 , v 2 , v 3 } of R3 relative to which the matrix representation of f is
one of the following:






1 0 0
1 0
0
−1 0
0
 0 1 0  ,  0 −1 0  ,  0 −1 0 
0 0 −1
0 0 −1
0
0 −1
(reflection through a plane, through a line, through the origin, respectively).
What if f (0) 6= 0? Can you generalize this result to dimensions n > 3?
12. Let f : R3 → R3 be an isometry. Show that f can be factored as the
composition of a rotation, followed by a reflection, followed by a translation.
Is this factorization unique?
64
CHAPTER 1. BASIC ALGEBRAIC STRUCTURES
Chapter 2
The Integers
2.1
The Axioms of the Integers
We have mentioned (always informally) the ring of integers and some of the
properties that these numbers satisfy. So far we have assumed that these
properties are well-known and “obvious”. However, it is not possible to
develop any mathematical theory in a precise way without a careful examination of its foundations. In particular, one must make a careful distinction
between what are the axioms of the theory and what are the consequences
of those axioms, i.e., the propositions and theorems.
We wish now to discuss which properties of the integers should be consider axioms and which are simply consequences of those axioms. Note,
however, that our discussion will not be completely formal: we will continue
to use notions and results from set theory in an informal way, without a
rigorous formulation. The reason is that they fall outside the domain of
Algebra. However, the reader may wish to look at the Appendix in this
respect.
The choice of axioms that form the base of some theory is, up to some
point, arbitrary. It is always possible to chose distinct, but logically equivalent axioms. Hence, the final choice is necessarily a consequence of subjective
criteria related to such aesthetical aspects as elegance, brevity, or efficiency
of presentation.
We choose to start with an axiom which fits easily in the previous discussion:
Axiom I. There exists an integral domain Z, whose elements are called
integers.
65
66
CHAPTER 2. THE INTEGERS
The zero and the identity of Z will be denoted by 0 and 1, respectively.
It follows from the results in the previous chapter that a large number of
elementary properties of the integers are a direct consequence of Axiom I. In
particular, the cancellation laws for addition and product and the sign rules
are valid in Z, as well as the property 0 = −0 (proved in the exercises). On
the other hand, it should be clear that the this axiom does not characterize
completely the integers; for example, it is impossible to decide based solely
on this axiom if the property 1 6= −1 holds or not, or to decide if the integers
form a finite or a infinite set (why?). In order to complete this with other
axioms, we will now look carefully at the properties of the natural numbers.
In a informal way, we may think of the natural numbers as integers
which are obtained from 1 by “successive addition” of 1, or in other words,
the numbers of the form:
1, 2 = 1 + 1, 3 = 2 + 1, 4 = 3 + 1, . . .
In particular, denoting by N the set of natural numbers, we should have:
1 ∈ N, and n ∈ N ⇒ n + 1 ∈ N.
Taking a more general perspective, which will be useful later, we introduce
the following definition.
Definition 2.1.1. If A is a unitary ring with identity 1, a subset B ⊂ A is
called inductive if:
(i) 1 ∈ B;
(ii) n ∈ B ⇒ n + 1 ∈ B.
The ring A itself is obviously an inductive subset of A. Hence, Z is an
inductive subset of itself, so N is not the only inductive subset of Z. However,
the heuristic description of N above suggests another property of this set,
namely that any inductive subset of Z necessarily contains all the natural
numbers. In other words, N is the smallest inductive subset of Z.
Coming back to the general case of an arbitrary ring A with identity
1, we formalize these ideas as follows. Since the ring A itself is always an
inductive set, the family of all inductive subsets de A is necessarily nonempty. Denoting by N (A) the intersection of all inductive subsets of A, it
is clear that N (A) is contained in any inductive subset of A.
This remark deserves a name:
2.1. THE AXIOMS OF THE INTEGERS
67
Theorem 2.1.2 (Principle of Finite Induction). Let A be a unitary ring.
Then:
(i) if B ⊂ A is inductive, then N (A) ⊂ B, and
(ii) if B ⊂ N (A) is inductive, then B = N (A).
The previous result becomes even more interesting when one observes
that:
Proposition 2.1.3. If A is a unitary ring, then N (A) is an inductive subset
of A.
Proof. Since 1 belongs to all inductive subsets of A, it is obvious that 1 ∈
N (A).
Assume now that a ∈ N (A) and B is any inductive subset of A. Then
a ∈ B (because N (A) ⊂ B) and a + 1 ∈ B (because B is inductive). Since
B is arbitrary, it follows that a + 1 belongs to all inductive subsets of A,
i.e., a + 1 ∈ N (A). Hence, we conclude that N (A) is inductive.
The previous results show that N (A) is an inductive subset which is
contained in any inductive subset of A. For this reason we introduce:
Definition 2.1.4. For a unitary ring A, the set N (A) is called the smallest
inductive subset of A. If A = Z we denote this subset by N instead of
N (Z), and call N the set of natural numbers.
We will see later that the usual principle of mathematical induction is
precisely Theorem 2.1.2 applied to the ring of integers, and we will identify
all the possible sets N (A) (up to isomorphism). Note alse that our previous
(heuristic) description of N also applies to the set N (A), i.e., this set consists
of the elements of A obtained from the identity 1 by “successive addition”
of 1 (this will be made more precise later).
Examples 2.1.5.
1. If A = C, then N (A) = N.
2. If A = Mn (C), then N (A) = {mI : m ∈ N }.
3. If A = Z2 , then N (A) = Z2 .
We know that the addition and the product of natural numbers are still
natural numbers. We can now prove this statement as a special instance of
a general statement valid for any ring:
68
CHAPTER 2. THE INTEGERS
Proposition 2.1.6. The subset N (A) ⊂ A is closed relative to the addition
and the product, i.e.,
∀a, b ∈ N (A), a + b ∈ N (A) and ab ∈ N (A).
Proof. We prove the statement for addition, leaving the case of the product
as an exercise. For that, fix a ∈ N (A) and define Ba ⊂ N (A) as the set of all
elements b ∈ N (A) such that a+ b ∈ N (A). We must show that Ba = N (A),
which will follow from Theorem 2.1.2 (ii) by showing that Ba is inductive:
1. Since N (A) is inductive, it is clear that a + 1 ∈ N (A), and hence
1 ∈ Ba .
2. If b ∈ Ba , then a + b ∈ N (A) and so (a + b) + 1 ∈ N (A), because N (A)
is inductive. Since (a + b) + 1 = a + (b + 1), it follows that b + 1 ∈ Ba .
We conclude that Ba is inductive.
After this long digression, we return to the question of how to characterize axiomatically the ring of integers. Recall that the set of integers is usual
describe, informally, as consisting of the natural numbers, the symmetrics
of the natural numbers, and zero:
Z = {0, 1, −1, 2, −2, 3, −3, . . . },
being understood that in this list there are no repetitions. Our next axiom
makes this precise.
Axiom II. Every m ∈ Z satisfies exactly one of the following:
m = 0 or m ∈ N or − m ∈ N.
Note that for all the other examples of rings that we have seen before if
we replace in the previous statement Z by A and N by N (A), we obtained
a false statement. It is convenient (and standard) to denote the set N ∪ {0}
by N0 .
Axioms I and II will be the basis of our study of the integers. Eventually, we will show that the usual axioms for the rational and real numbers
are logical consequences of these axioms for the integers. The question of
knowing if this set of axioms is complete, i.e., if they allow to decide if any
“reasonable” statement about the integers is either true or false, and consistent, in the sense that some statement cannot be proved to be both true and
false, is a deep and delicate problem studied in Mathematical Logic, so it
2.1. THE AXIOMS OF THE INTEGERS
69
will not be discussed here. We mentioned only that an equivalent question
was the subject of Hilbert’s second problem 1 . The solution given by Kurt
Gödel 2 in 1930 was one of the most surprising and amazing achievements
of 20th Century Mathematics. Gödel showed that these two attributes of
the axiomatization of the integers (complete and consistent) are themselves
inconsistent: any system of axioms for Z which is consistent must contain
statements whose logical value (true or false) cannot be decided based solely
on those axioms.
It may sound strange at first sight that someone would even ask such
questions. Actually, the amazing fact is that their study had profound consequences in our lives. Gödel’s works were followed up by the English mathematician Alan Turing3 , who in 1936 come up with its own theory of a
Universal Computing Machine, now known as a Turing Machine. His theoretical findings inspired the Hungarian mathematician John von Neumann 4 ,
after he moved to the USA, in his efforts to improve the eniac (Electronic
Numerical Integrator and Computer ), one of the first digital computers.
This machine and other similar machines, which were constructed in the
years after the publication of Turing’s work, were the precursors of modern
day computers and digital machines.
The efforts of these two mathematicians also had profound consequences
in other ways. The Second World War started in 1939 and Turing played a
fundamental role in the English war efforts, helping to decipher the German
secret transmission codes with the help of another digital machine, called
enigma. Von Neumann on the other hand, with the help of the eniac,
1
David Hilbert (1862-1943), German mathematician, who was a professor in Göttingen.
Hilbert’s communication to the International Congress of Mathematicians held in Paris
(1900) included a list of 23 problems which he thought should be consider by mathematicians throughout the XX Century (see Bull. Am. Math. Soc., 2nd ser., vol. 8 (1901-02),
pp. 437-79).
2
Kurt Gödel (1906-1978) was born in Austria and emigrated at a young age to the USA,
where he became a member of the Institute for Advanced Study, in Princeton. When he
solved Hilbert’s second problem he was only 24 years old.
3
Alan Turing (1912-1954) was an English mathematician who towards the end of his
life also worked in theoretical biology. He was interested in the development of patterns
and shapes in biological organisms (morphogenesis) and the role of the Fibonacci numbers
in plant structures. His life is mixed with tragedy, since he was prosecuted and convicted
for his sexual orientation, resulting in chemical castration and the removal of his security
clearance. These events eventually led him to commit suicide.
4
John von Neumann (1903-1957), was born in Hungary, and lectured in Berlin and
Hamburg, before emigrating to the USA. Together with Einstein, he was one of the first
members of the Institute for Advanced Study, in Princeton. Von Neumann gave the first
axiomatization of the notion of a Hilbert space, a basic notion of Functional Analysis
which plays a fundamental role in Quantum Mechanics.
70
CHAPTER 2. THE INTEGERS
collaborated on the Manhattan project, which resulted in the construction
of the first atomic bomb, and the destruction of the Japanese cities of Hiroshima and Nagasaki, leading to the end of the war.
Exercises.
1. Complete the proof of Theorem 2.1.2 and of Proposition 2.1.6.
2. Determine the smallest inductive subset of each of the rings with identity
mentioned in Exercise 1.5.1.
3. For which of the rings in the previous exercise does an axiom similar to Axiom
II hold if the expression “exactly one” is replaced by “one”?
4. What are the consequences of replacing in Axiom II the expression “exactly
one” by “one”, and adding one of the following condition:
(a) 0 6∈ N;
(b) n ∈ N ⇒ −n 6∈ N;
(c) n 6∈ N ⇒ −n ∈ N.
5. A set X is said to be infinite 5 if there exists an injective map Ψ : X → X
which is not surjective. Show that N is infinite.
6. Show that, if there exists a bijection Ψ : X → Y , then X is infinite if and
only if Y is infinite.
7. Show that, if Y ⊂ X and Y is infinite, then X is also infinite. In particular,
show that Z is infinite.
2.2
Inequalities and Ordered Rings
Some of the elementary properties of the integer, rational and real numbers,
concern the manipulation of inequalities, i.e., the order relation which exists
in these rings. In these section we will study order relations in a ring, in
an abstract manner, looking for properties which are common to all ordered
rings. We will understand why some rings can be ordered, while other
cannot, and whether orders are unique or not. In the case of the integers, we
will see that there are properties of the order which are specific to this ring,
5
The notion of number of elements (or cardinal) of a set is discussed in the Appendix.
2.2. INEQUALITIES AND ORDERED RINGS
71
and moreover, that they are all consequence of the axioms in the previous
section.
The abstract notion of an order relation can be defined on any set without
any reference whatsoever to some algebraic structure (see the Appendix).
Here we are interested in a special kind of an order relation:
Definition 2.2.1. A binary relation “>” on a set X is called a strict
total order relation if:
(i) Transitivity: ∀x, y, z ∈ X, x > y and y > z ⇒ x > z.
(ii) Trichotomy: ∀x, y ∈ X, exactly one of the following three cases hold:
x > y or y > x or x = y.
We call the pair (X, >) an ordered set.
Note that condition (ii) states that any two elements of an ordered set
(X, >) can be compared, which is not the case for a general partial order
relation.
Given an ordered set X with an order relation “>”, which one reads
“greater than”, one can define, just like for the usual numbers, the relations
“<”, “≥” and “≤” (read as usual):
• a < b if and only if b > a;
• a ≥ b if and only if a > b or a = b;
• a ≤ b if and only if a < b or a = b.
We recall next some elementary concepts valid for any ordered set (X, >).
Definition 2.2.2. Let (X, >) be an ordered set. If Y ⊂ X and x ∈ X,
then:
(i) x is called an upper bound (respectively lower bound) of Y if
x ≥ y (respectively x ≤ y), for all y ∈ Y ;
(ii) Y is called bounded above (respectively bounded below) in X if
Y has at least one upper bound (respectively lower bound) in X;
(iii) Y is called bounded in X if it is bounded above and below in X.
Using these notions, we also have the usual notions of maximum, minimum, supremum and infimum, for any ordered set (X, >), which we also
state for future reference.
72
CHAPTER 2. THE INTEGERS
Definition 2.2.3. Let (X, >) be an ordered set. If Y ⊂ X and x ∈ X,
then:
(i) if x is an upper bound (respectively lower bound) of Y and x belongs
to Y , x is called the maximum (respectively minimum) of Y , and we
denote x by max(Y ) (respectively min(Y ));
(ii) if the minimum (respectively maximum) of the set of upper bounds of
Y exists, it is called the supremum or least upper bound (respectively infimum or largest lower bound) of Y , and it is denoted
by sup(Y ) (respectively inf(Y )).
It should be clear from the definition of an ordered set that the maximum/minimum of a subset (and hence also the supremum/infimum) if they
exist they are unique. Hence, these definitions and notations make sense.
The usual notions of intervals, familiar for real numbers, can be defined
for any ordered set (X, >). For example, if a, b ∈ X, then
]a, +∞[= {y ∈ X : y > a},
]a, b] = {y ∈ X : a < y ≤ b},
...
These are also commonly denoted by (a, +∞) and (a, b).
Examples 2.2.4.
1. Let X = R with the usual order relation “>” and Y =] − ∞, 0[. In this
case, Y has no lower bounds in R, and hence cannot have an infimum or
a minimum. The set of upper bounds is the interval [0, +∞[, which has the
minimum 0 which does not belong to Y . Obviously, sup(Y ) = 0 and Y has no
maximum.
2. Let X = R2 and define a strict, total order “>” as follows:

 y1 > y2 , or
(x1 , y1 ) > (x2 , y2 ) if and only if

y1 = y2 and x1 > x2 ,
where > on the right side denotes the usual order in R. One sometimes call
this ordered set (X, >) the long line. Now, the interval Y =](1, 0), (0, 2)] (draw
a picture of this interval!) has max(Y ) = (0, 2) = sup(Y ) and inf(Y ) = (1, 0),
while Y has no minimum.
When the set X has some algebraic structure it is natural to consider
order relations on X which, in some sense, respect the algebraic structure.
In the case of a ring A we will demand that
a > b ⇐⇒ a − b > 0,
2.2. INEQUALITIES AND ORDERED RINGS
73
i.e., that a > b if and only if a − b is positive, and that addition and product
of positive elements be positive elements. Hence, for a ring it is somewhat
more convenient to describe an order relation in terms of the set of positive
elements.
Definition 2.2.5. A ring A is called an ordered ring if there exists a
subset A+ ⊂ A sastifying:
(i) Closeness relative to addition and product: a + b, ab ∈ A+ , ∀a, b ∈ A+ .
(ii) Trichotomy: For all a ∈ A exactly one of the following three cases
hold: a ∈ A+ or a = 0 or −a ∈ A+ .
If A is an ordered ring, we define in A an order relation “>” by
a > b ⇐⇒ a − b ∈ A+
Notice that this is indeed an order relation in A: transitivity of “>” follows
from the fact that A+ is closed for addition, and the trichotomy of “>” is
an immediate consequence of (ii).
Notice also that one has a > 0 if and only if a ∈ A+ , and therefore we
say that A+ is the set positive elements of the ordered ring A. It also clear
that a < 0 if and only if −a ∈ A+ , and hence we call the elements of the set
A− = {a ∈ A : −a ∈ A+ } the negative elements.
The following proposition lists the basic rules to handle inequalities and
are valid in any ordered ring. It shows that a large number of properties of
the order relation of the integers, rationals and reals coincide, because they
are shared by all ordered rings.
Proposition 2.2.6. Let A be an ordered ring. For all a, b, c ∈ A, the
following properties hold:
(i) a > b ⇐⇒ a + c > b + c;
(ii) a > b ⇐⇒ −a < −b;
(iii) ab > 0 ⇐⇒ (a > 0 and b > 0) or (a < 0 and b < 0);
(iv) ab < 0 ⇐⇒ (a > 0 and b < 0) or (a < 0 and b > 0);
(v) ac > bc ⇐⇒ (a > b and c > 0) or (a < b and c < 0);
(vi) a 6= 0 ⇐⇒ a2 = aa > 0.
74
CHAPTER 2. THE INTEGERS
Proof. The proof of these properties is almost immediate from the definition.
We illustrate them with the proof of property (i), leaving the rest for the
exercises. By Definition 2.2.5,
a + c > b + c ⇔ (a + c) − (b + c) ∈ A+
⇔ a − b ∈ A+ ⇔ a > b.
It should now be clear that the ring of integers can be ordered by setting
= N, and this corresponds to the usual order of the integers. The
trichotomy property of N is exactly Axiom II for the integers, and we have
already checked that in any ring with identity the set N (A) is closed relative
to addition and the product. Perhaps, more interesting is the fact that the
usual order of the integers is the unique order in this ring. In order to show
it, we will need one more result valid for any ordered ring A where 1 6= 0.
Z+
Theorem 2.2.7. If A is an ordered ring with identity 1 6= 0, then one must
have N (A) ⊂ A+ .
Proof. We only need to check that A+ is an inductive subset and apply the
definition of N (A):
(a) Since 1 6= 0 and 1 = (1)(1) = 12 , it follows from property (vi) in
Proposition 2.2.6 that 1 > 0, i.e., that 1 ∈ A+ ;
(b) Since A+ is closed relative to the addition,
a ∈ A+ ⇒ a + 1 ∈ A+ .
Hence, A+ is inductive and we conclude that N (A) ⊂ A+ .
Obviously there exist rings A such that A+ 6= N (A), i.e. such that
N (A) ( A+ . This is the case, e.g., with the rings Q and R. However, as we
mentioned above, if A is the ring of integers, then we must have N (A) = A+ .
Theorem 2.2.8. The ring of integers Z has only one order, namely the one
obtained by setting Z+ = N.
Proof. Assume that Z is ordered, possibly differently from the usual order,
and let Z+ be the corresponding set of “positive” integers. By the previous
theorem, we have that N ⊂ Z+ . We claim that Z+ ⊂ N, from which the
result follows.
To prove the claim, if m ∈ Z+ , then m 6= 0 and −m 6∈ Z+ , according
to trichotomy property in Definition 2.2.5. Since N ⊂ Z+ , it follows that
−m 6∈ N. Finally, by Axiom II, we must have m ∈ N.
2.2. INEQUALITIES AND ORDERED RINGS
75
One should not conclude from the previous results that the order of
an arbitrary ring is always unique! For example, we shall indicate in the
exercises of these section a subring of R that can be ordered in several
distinct ways. On the other hand, there are many rings which cannot be
ordered.
Example 2.2.9.
If A = Z2 is the field with two elements, then −1 = 1, so we see that trichotomy cannot be satisfied and Z2 cannot be ordered.
As we have already pointed out, the results at the beginning of this
section show that a large number of properties of the order of the integers
is common to all other ordered rings. The previous result suggests that the
properties specific to the order of Z result from the fact that A+ = N (A)
for this ring. We illustrate this fact by showing that the positive integers
are the integers which are larger than or equal to 1, a statement that is
obviously false if we replace the word “integers” by “rationals” or “reals”.
Proposition 2.2.10. Z+ = {m ∈ Z : m ≥ 1}.
Proof. Let S = {m ∈ Z : m ≥ 1}. If m ∈ S, then m ≥ 1, and since 1 > 0,
we have that m > 0. Hence, S ⊂ N.
It is obvious that 1 ∈ S and that
1 > 0 ⇒ 1 + 1 > 1 + 0 = 1.
Hence, if m ∈ S, then m + 1 ≥ 1 + 1 > 1, and so m + 1 ∈ S, showing
that S is inductive. By the Principle of Induction and by Theorem 2.2.8,
we conclude that S = N = Z+ .
The statement in the previous proposition is equivalent to the fact that
in the ring of integers one has ]0, +∞[= [1, +∞[, or also ]0, 1[= ∅. Note that
in Q and in R the interval ]0, 1[ is an infinite set. Other properties specific
to the order of the integers are discussed in the exercises at the end of this
section and the next one.
The notion of absolute value or modulus can be introduced without much
difficulty in any ordered ring.
Definition 2.2.11. Let A be an ordered ring and a ∈ A. The absolute
value or modulus of a is the element |a| ∈ A defined by
|a| = max{−a, a}.
76
CHAPTER 2. THE INTEGERS
It should be clear from this definition that the absolute value of a nonzero number a ∈ A is a positive number |a| ∈ A+ . We list next some
properties of the absolute value, which may be shown directly from this
definition.
Proposition 2.2.12. For any a, b ∈ A, we have:
(i) −|a| ≤ a ≤ |a|;
(ii) |a + b| ≤ |a| + |b|;
(iii) |ab| = |a||b|.
The proofs are left to the exercises.
Exercises.
1. Complete the proof of Proposition 2.2.6.
2. Show that, if A is a ring with identity 1 6= 0 and there exists in A an element
i such that i2 = −1, then A cannot be ordered.
3. Show that, if m ∈ Z, then ]m, m + 1[= ∅ in Z.
4. Show that in Z:
(a) m > n ⇔ m ≥ n + 1;
(b) mn = 1 ⇔ (m = n = 1) or (m = n = −1);
(c) m = sup S if and only if m = max S;
(d) m = inf S if and only if m = min S.
5. Prove Proposition 2.2.12.
6. Show that in any ordered ring:
(a) |a| ≤ |b| ⇔ −|b| ≤ a ≤ |b| ⇔ a2 ≤ b2 ;
(b) ||a| − |b|| ≤ |a − b|;
7. Let B be an ordered ring and φ : A → B an isomorphism of rings. Show
that A can be ordered with A+ = {a ∈ A : φ(a) ∈ B + }.
√
8. Let A = {m + n 2 : m, n ∈ Z}. Show that A can be ordered with an order
distinct from the one induced
√ from R. √
(Hint: Note that φ(m + n 2) = m − n 2 is an automorphism of A and use
the previous exercise.)
2.3. PRINCIPLE OF INDUCTION
77
9. Show that, if an ordered ring A is bounded above or bounded below, then
A = {0}.
10. Show that any ordered ring A 6= {0} is infinite.
11. Show that the cancellation law for the product holds in any ordered ring.
12. An ordered ring A is called archimedean if for any a, b ∈ A with a =
6
0 there exists x ∈ A such that ax > b. Show that the ring of integers is
archimedean.
2.3
Principle of Induction
According to the definition of N, discussed in the previous section, the set
of natural numbers satisfies the Principle of (finite) Induction:
Theorem 2.3.1 (Principle of Induction). If S ⊂ Z is an inductive set, then
N ⊂ S. In particular, if S ⊂ N, then S = N.
Traditionally, a proof “by induction” obeys the following scheme: given
a proposition P(n), depending on some natural n ∈ N, we must (i) show that
P(1) is true, and (ii) show that if P(n) is true, then P(n + 1) is also true.
The relationship between this procedure and the Principle of Induction can
be easily explained by introducing the set
S = {n ∈ N : P(n) is true}.
Then we have that:
• (P(1) is true) ⇐⇒ (1 ∈ S);
• (P(n) ⇒ P(n + 1)) ⇐⇒ (n ∈ S ⇒ n + 1 ∈ S).
Hence, we see that the usual induction argument amounts to show that the
set S of natural numbers for which the proposition is true is an inductive
subset de N. Often, one does not mention explicitly the set S, but that
should not be a source of any misunderstanding.
Example 2.3.2.
We say that n ∈ N is even (respectively, odd) if there exists k ∈ Z such that
n = 2k (respectively, n = 2k + 1). In order to prove the statement “every
natural number is even or odd”, we consider P(n) =“n even or odd”, and
observe:
78
CHAPTER 2. THE INTEGERS
(i) If n = 1, then 1 = 2 · 0 + 1, hence 1 is odd, so that P(1) is true.
(ii) If n is a natural number such that P(n) is true, we have n = 2k or n =
2k+1. It follows that that n+1 = 2k+1 or n+1 = (2k+1)+1 = 2(k+1),
and hence P(n + 1) is true.
We conclude that P(n) is true for any natural number n.
We will often use induction in our proofs. We start illustrating its use
by proving a few more properties of the order of the natural numbers and of
the integers. Our first application is to prove the Well-ordering Principle.
Theorem 2.3.3 (Well-ordering Principle). Any non-empty set of natural
numbers has a minimum.
Proof. Let S be any non-empty set of natural numbers and consider the
statement
P(n) =“If S contains a natural k ≤ n, then S has minimum”.
Our theorem will follow (why?) if we can show that:
“P(n) is true for every n ∈ N”.
Therefore, we can try to apply the Principle of Induction:
(i) P(1) is true, because, if S contains a natural k ≤ 1, then according to
Proposition (2.2.10) we have k = 1, and 1 is obviously the minimum
of S, because it is the minimum of N.
(ii) Assume now that P(n) is true for some natural n, and assume that S
contains a natural k ≤ n + 1. We must show that S has a minimum.
If S contains a natural k ≤ n, then it follows from P(n) that S has a
minimum. Otherwise, S contains a natural in the interval [1, n + 1],
but none in the interval [1, n]. Since we know that the interval ]n, n+1[
is empty, we conclude that n + 1 is the minimum of S.
On the other hand, when S is a set of integers, we have
Theorem 2.3.4. Any non-empty set of integers which is bounded below
(respectively, bounded above) has a minimum (respectively, a maximum).
2.3. PRINCIPLE OF INDUCTION
79
We leave the proof of this theorem has an exercise, and instead we prove
the following theorem, which can be used in proofs by induction that “start”
at any integer k 6= 1. One should notice that the proof uses Theorems 2.3.3
and 2.3.4, which occasionally are more practical to apply than the Principle
of Induction.
Proposition 2.3.5. If a statement P(n) is true for some integer n = k and
if P(n) ⇒ P(n + 1) for every integer n ≥ k, then P(n) is true for every
integer n ≥ k.
Proof. Let
S = {n ∈ Z : n ≥ k and P(n) is false}.
We claim that S is empty, i.e., that P(n) is true for every n ≥ k.
Note that S is, by definition, bounded below by k. Hence, according to
Theorem 2.3.4, if S is non-empty then it must have a minimum m. Moreover,
since m ∈ S, it follows that m ≥ k.
By assumption P(k) is true, and hence k 6∈ S, so that m > k. Consider
now the integer m − 1. Since m > k, we have m − 1 ≥ k. Since m − 1 is
less than the minimum of S, we have m − 1 6∈ S, i.e., P(m − 1) is true. But
by assumption P(n) ⇒ P(n + 1) for all n ≥ k so that P(m) is true. This
contradicts the fact that m belongs to S. We conclude that S cannot have
minimum, and so it must be empty, proving our claim.
In certain circunstancies it is more convenient to use yet another form of
the Principle of Induction. For example, consider the sequence an consisting
of the Fibonacci numbers:
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, . . .
This sequence is usually defined as the sequence an satisfying6 :
(2.3.1)
a0 = a1 = 1, and an+1 = an + an−1 for n ≥ 1.
It is possible to determine an explicit expression for this sequence. We will
not discuss now how to deduce it, but the expression is the following:
√ !n
√ !n
√
√
5− 5 1− 5
5+ 5 1+ 5
+
.
(2.3.2)
an =
10
2
10
2
6
Fibonacci, also known as Leonardo of Pisa (1180-1250), found this sequence by considering the following problem about rabbit breeding: Beginning with a single pair of
rabbits, how many pairs of rabbits are there at the end of one year if each pair of rabbits
gives rise to a new pair of rabbits every month, and each new pair starts giving birth at
the end of 2 months?
80
CHAPTER 2. THE INTEGERS
It looks obvious that the sequence an defined by this last formula is the
only sequence that satisfies (2.3.1). However, the proof of this statement is
not as easy as it looks at first sight. First of all it is convenient to formalize
the notion of sequence in any set X.
Definition 2.3.6. A sequence of elements in a set X is a map f : N → X.
When dealing with sequences it is common to write fn instead of f (n),
and to speak of the “sequence {f1 , f2 , . . . }”, “{fn }”, or even “fn ”, instead of
saying “the sequence f ”. We will also commit this abuse of language (why
is this an abuse of language?).
Note that, if k is an integer, then to a map g : [k, ∞[∩Z → X corresponds
the sequence f : N → X given by f (n) = g(n + k − 1). For this reason, we
say that g is a sequence defined in [k, ∞[. Hence, the “Fibonacci sequence”
above is a sequence defined in [0, ∞[.
The result that we wish to prove is the following:
Proposition 2.3.7. If f is a sequence of natural numbers defined in N0 ,
such that f0 = f1 = 1 and fn+1 = fn + fn−1 for n ≥ 1, then f (n) = a(n),
where a(n) = an is the sequence defined by (2.3.2).
If we let P(n) be the statement “f (n) = a(n)”, we must show P(n) for
all integers n ≥ 0. The difficult in applying the usual induction method is
that, for obvious reasons, we cannot show that that P(n) ⇒ P(n + 1), but
only that (P(n) and P(n−1)) ⇒ P(n+1). In order to facilitate the solution
of the to this and similar problems, we introduce the following form of the
Principle of Induction:
Theorem 2.3.8. Let S be a set of integers such that
(i) k ∈ S, and
(ii) for all n ≥ k, [k, n] ⊂ S ⇒ [k, n + 1] ⊂ S.
Then [k, +∞[⊂ S.
In terms of the statement P(n) =“n ∈ S”, the previous result says that,
if P(k) is true and if P(n) is true whenever P(m) is true for all k ≤ m < n,
(and not just for m = n − 1), then P(n) is true for all n ≥ k. The proof of
this statement is not hard if we use the Well-ordering Principle and proceed
as in the proof of Proposition 2.3.5.
2.3. PRINCIPLE OF INDUCTION
81
Proof of Theorem 2.3.8. Let
F = {n ≥ k : n 6∈ S}.
We claim that F is empty, so that [k, +∞[⊂ S.
If F 6= ∅, since F is bounded below by definition, we conclude that
F has a minimum m ≥ k. Since k ∈ S, it is clear that m > k. Hence,
m − 1 ≥ k, and the interval [k, m − 1] does not contain any element of F ,
because all its elements are less than the minimum of F . In other words, we
have [k, m − 1] ⊂ S. It follows now from (ii) that we must have [k, m] ⊂ S,
and hence m ∈ S, i.e., m 6∈ F , a contradiction.
We conclude that F cannot have minimum, hence F is empty, as claimed.
The previous theorem allows for an immediate proof of Proposition 2.3.7,
which we leave for the exercises.
Note, however, that this result does not eliminate all the logical difficulties with definitions like (2.3.1). This definition is an example of a recursive
definition, where the object to be defined appears in the description of the
object. As we know since at least back to Epimenides of Crete (circa 600
BC) who become famous with the phrase “All Cretans are liars”, it is possible to create logical paradoxes, or propositions whose logical value cannot be
decided, my making statements which in some way refer to themselves. A
classical example is Bertrand Russell’s paradox 7 , suggested by the attempts
to define the “set of all sets”.
Observe that a definition like “U is the set of all sets” is recursive, because
U , being a set, is one of the elements which enters its description. In other
words, U has the strange property of being an element of itself, which is
not at all usual in the sets we are familiar with! If this property of U is
unpleasant, we can consider instead the set N of “normal’ sets, i.e., the sets
which are not elements of themselves. In symbols,
N = {C ∈ U : C 6∈ C}.
7
Bertrand Russell (1872-1970) co-authored together with Alfred N. Whitehead (18611947) the famous treaty Principia Mathematica (3 vols., 1910-13), where they attempted
to formalize in an axiomatic manner all the fundamental notions of Arithmetics. This
monumental work was the ultimate result of the “logistic” program to formalize Mathematics, which consisted in constructing all Matematical knowledge through logic deduction
from a small number of concepts and principles. Although this program was doomed to
fail, due to the later work of Gödel, it was fundamental for the advance of Mathematical
Logic.
82
CHAPTER 2. THE INTEGERS
The question is now: is N a “normal“ set? Unfortunately, if we assume that
N is “normal” (i.e., N 6∈ N ) then N belongs to the set of normal sets (i.e.,
N ∈ N !). If we assume that N is not “normal”, we have that N ∈ N . But
then N is an element of the set of normal sets, and hence N is itself normal
(N 6∈ N !). In other words, we cannot give a logical value to the statement
“N ∈ N ”.
At a superficial level, the lesson to learn from this example is simply
that one must be careful with recursive definitions, something that the Greek
philosophers were already aware of. A similar difficulty occurs when one uses
a spreadsheet and creates a closed loop between cells in the spreadsheet, or
when one states a “theorem” such as:
Theorem 2.3.9. This statement is false. . .
It is more or less clear that such difficulties do not occur with definitions
like the one for the Fibonacci sequence, and we will often use such recursive
definitions (see the examples in the next section and the Appendix for a
formalization).
However, at a deeper level, the logical difficulties with recursive definitions, or more generally with statements that refer to themselves, are
unavoidable and are related with some of the deepest questions tackled by
mathematicians and philosophers nowadays: Is it possible to give a (rigorous) definition of “rigorous definition”? Can we understand how our own
intelligence works? How can we reconcile the mechanical aspect of logical
deductions, exhibit by a computer program, with the apparent infinite adaptive nature of what we call “intelligent behavior”? At the end of the day,
this is the central problem of Artificial Intelligence.
Exercises.
1. Determine all the inductive subsets of Z.
2. Prove Theorem 2.3.4.
3. Prove Proposition 2.3.7.
4. Consider the statement (obviously false!) “All blonde women have blue eyes”.
What is the mistake in the following “proof” by induction? Denote by P(n)
the statement “If in a set of n blonde women there exists one with blue eyes,
then all have blue eyes”. Then:
(a) P(1) is obviously true.
2.4. SUMMATIONS AND PRODUCTS
83
(b) Assume that P(n) is true, and consider a set with n + 1 blonde women
L = {M1 , . . . , Mn+1 }. Assume that M1 has blue eyes. Define Ln+1 =
{Mk : k 6= n + 1} and Ln = {Mk : k 6= n}. Since P(n) is true, all women
in Ln and Ln+1 have blue eyes. Since L = Ln ∪ Ln+1 , all women in L
have blue eyes.
(c) Since there exists at least one blonde women with blue eyes, we conclude
that all blonde women must have blue eyes.
5. Here is another variant of the “blonde women with blue eyes” problem. Consider the statement (obviously false!) “All finite groups are abelian”. We have
the following “proof” by induction: Let G be a group and denote by P(n) the
statement “In a subset of G with n elements, all elements commute”.
(a) P(1) is obviously true.
(b) Assume that P(n) is true, and consider a subset of G with n + 1 elements
L = {g1 , . . . , gn+1 }. Denote by Li = {gk : k 6= i} the subset formed by
all elements of L, with the exception of the element i. Since P(n) is true,
each Li is commutative. Since the Li exhaust all the elements of L, we
conclude that L is commutative.
(c) Since G is finite, we conclude that G is commutative.
6. Formula (2.3.2) for the Fibonacci sequence can be obtained by determining
the sequences of the form β n which satisfy equation (2.3.1). What are the
sequences of integers which satisfy bn+1 = bn + 6bn−1 and b0 = b1 = 1?
2.4
Summations and Products
Often one is interested in adding and multiplying more than two elements.
For this reason, it is convenient to generalize these algebraic operations to
an arbitrary (finite) number of factors. We start by looking at the definition
of products of more than two factors, since the results for summations are
obtained by a simple change of notation.
In this section, S will denote a set with an associative binary operation.
Given a sequence a1 , a2 , . . . of elements of S, the corresponding sequence of
partial products is π1 = a1 , π2 = a1 a2 , π3 = (a1 a2 )a3 , . . . . Formally:
Definition 2.4.1. Given a sequence a : N → S its sequence of partial
products is the sequence π : N → S defined inductively by:
π1 = a1 and πn = πn−1 an if n > 1.
We write πn =
1 to n.
Qn
k=1 ak
and called it the product of the ak ’s, with k from
84
CHAPTER 2. THE INTEGERS
In term of these sequences the associativity property reads:
Proposition 2.4.2. If a, b : N → S are sequences in S, we have:
Qn
Q
Qn
(i) for any n > m: ( m
k=1 ak ;
k=1 ak )
k=m+1 ak =
Q
Q
Q
(ii) if the operation is commutative: ( nk=1 ak ) ( nk=1 bk ) = nk=1 (ak bk ).
The proof (by induction) is left for the exercises. Also, note that the
definition above can be easily adapted to products that start at some k > 1.
A particular interesting case of Definition 2.4.1 happens when the sequence a : N → S is constant (i.e., when an = c, for all n ∈ N). The
product of the first n terms of the sequence a corresponds then to the notion of power with base c and exponent n.
Definition
2.4.3. The power of base c and exponent n ∈ N is defined by
Q
cn = nk=1 c.
The power can be thought of as map Φ : N × S → S, (n, c) 7→ cn . If we
fix c, we have the exponential map φ : N → S, while if we fix n, we obtain
a map ψ : S → S. According to Definitions 2.4.1 and 2.4.3, we have:
(2.4.1)
c1 = c and cn = (cn−1 )c for n > 1.
When S is a monoid with identity I and c is invertible, we define:
(2.4.2)
c0 = I and c−n = (c−1 )n for n > 1.
Hence, for c is invertible, the power cm is well-defined for any integer m. In
particular, when S is a group, the map Φ : N × S → S can be extended to
a map Φ̃ : Z × S → S.
The usual rules for powers are valid in this more general context:
Proposition 2.4.4. If S has an associative operation, then for any a, b ∈ S,
one has:
(i) an am = an+m and (an )m = anm , for n, m ∈ N;
(ii) If ab = ba, then (ab)n = an bn , for n ∈ N;
If S is a monoid, then for any a, b ∈ S invertible, one has:
(iii) an am = an+m and (an )m = anm , for n, m ∈ Z;
(iv) If ab = ba, then (ab)n = an bn , for n ∈ Z;
2.4. SUMMATIONS AND PRODUCTS
85
Proof. As one could guess the proof of these results uses the Principle of
Induction. We illustrate it in the case of (i), as an example where the
argument involves two natural numbers. The proofs of the other cases are
left for the exercises.
We prove that an am = an+m by induction in the natural number m. Let
P(m) be the statement:
P(m) = “an am = an+m for any a ∈ S and natural number n”.
P(1) follows from Definition 2.4.1, and to prove P(m) ⇒ P(m + 1), we
observe that
an am+1 = (an )(am a)
n
m
= ((a )(a ))a
= (a
n+m
)a
= an+m+1
(by (2.4.1)),
(by associativity),
(by the induction hypothesis),
(by (2.4.1)).
Notice that according to (iii) in the previous proposition, and assuming
that S is a group, the map f : Z → S, n 7→ an , for some fixed a ∈ S, is
a homomorphism of groups. If S is just a monoid, the restriction of this
function to N0 is still a homomorphism of monoids.
Let us now formulate the results above in the additive notation. If “+”
denotes a commutative binary operation on the set S, one should restate
Definition 2.4.1 as follows:
Definition 2.4.5. Let + be a commutative, associative, binary operation
on the set S. Given a sequence a : N → S its sequence of partial sums
is the sequence σ : N → S defined inductively by:
σ1 = a1 and σn = σn−1 + an if n > 1.
OnePcalls σn the summation of the ak ’s, with k from 1 to n, and denotes it
by nk=1 ak .
PnWhen a : N → S is a constant sequence with an = c, then we write nc =
k=1 c. Moreover, if S has identity 0 and the element c has a symmetric
−c, we define
0c = 0,
(−n)c = n(−c).
The operation Φ : N × S → S, (n, c) 7→ nc is called the “product of a
natural n by an element c of S”. Also when c has a symmetric, we call
86
CHAPTER 2. THE INTEGERS
the operation Ψ : Z × S → S, (n, c) 7→ nc, the “product of an integer n by
an element c of S”. This terminology is slightly ambiguous when S = Z:
apparently, we obtain in Z two distinct product operations, namely the operation mentioned in Axiom I and the operation we have just introduced
in Definition 2.4.5. We leave as simple exercise to check that these two
operations actually coincide. In fact, this duplication suggests that all the
references to the product in the axiomatization of the integers are superfluous and redundant. Though we have not chosen this path, this is indeed
the case: it is possible to present a set of axioms for the integers without
mentioning the product operation, and to deduce all the properties of the
product as theorems.
If (G, +) is an abelian group, the basic algebraic properties of the product
of elements of G by integers can be summarized as follows:
Proposition 2.4.6. Given an abelian group (G, +), if g, h ∈ G and m, n ∈
Z, then we have:
(i) Identity: 1g = g.
(ii) Distributivity: (n + m)g = ng + mg and n(g + h) = ng + nh.
(iii) Associativity: n(mg) = (nm)g.
One may notice that that these properties are formally similar to the
properties in the definition of a linear vector space. More precisely, if we
replace the elements of the group G by vectors of a linear vector space V
over some field and the integers by scalars of the field, then the properties
in Proposition 2.4.6 are exactly the conditions on the operation “product of
scalar by a vector ” in the definition of a vector space. In fact, we will see in
Chapter 6 the more general concept of a module over a ring, which allows to
treat at the same level both the notion of an abelian group and the notion
of a vector space over a field.
If (A, +, ·) is a ring, we can also look at some “mixed” properties combining the addition and the product:
Proposition 2.4.7. Let (A, +, ·) be a ring. If a, a1 , . . . , an , b, c ∈ A and
n ∈ N, then we have:
P
P
(i) c ( nk=1 ak ) = nk=1 (cak );
P
P
(ii) ( nk=1 ak ) c = nk=1 (ak c);
(iii) n(ab) = (na)b = a(nb).
2.4. SUMMATIONS AND PRODUCTS
87
We mentioned before that, when G is a group and g ∈ G, then the map
ψ : Z → G, n 7→ g n is a homomorphism of groups. Obviously, if G is an
abelian group, the map ψ : Z → G, n 7→ ng is also a homomorphism of
groups. Finally, if (A, +, ·) is a ring and a ∈ A, then ψ : Z → A, n 7→ na
is always a homomorphism of groups between (Z, +) and (A, +), which in
general fails to be a ring homomorphism, except when a2 = a (why?).
Exercises.
1. What is the sequence defined in Z by a1 = 1, an+1 =
Pn
k=1
ak ?
2. Complete the proofs of the results stated in this section.
3. Show that, if S is a monoid where the cancellation law holds, then the identity
! n
!
n
n
Y
Y
Y
(ak bk )
ak
bk =
k=1
k=1
k=1
holds if and only if S is abelian.
4. Let G be an abelian group, and assume that n ∈ Z, and g1 , g2 ∈ G. Is it true
that one always has
n 6= 0 and ng1 = ng2 ⇒ g1 = g2 ?
(Hint: Consider the group Z2 .)
5. Show that, if B is a subset of the ring A which is closed relative to the
difference in A, then B is also closed relative to the product by integers, i.e.,
that:
If [a, b ∈ B ⇒ a − b ∈ B] then [(n ∈ Z and b ∈ B)⇒ nb ∈ B].
6. Use the previous result to prove that in the ring of integers, if B ⊂ Z is
non-empty, the following statements are equivalent:
(a) B is closed relative to the difference;
(b) B is a subring of Z;
(c) B is an ideal of Z.
7. Show that:
(a) if φ : G → H is a homomorphism of additive groups, then φ(ng) = nφ(g),
for all n ∈ Z, g ∈ G;
88
CHAPTER 2. THE INTEGERS
(b) if φ : Z → G is a homomorphism of additive groups, then φ(n) = ng, for
some g ∈ G.
How can one generalize these results for non-additive groups?
8. Show that, if G is a group and g ∈ G, then H = {g n : n ∈ Z} is the smallest
subgroup of G that contains g.
9. Let A be a ring with identity I, and let φ : Z → A be given by φ(n) = nI.
Show that:
(a) φ is a ring homomorphism, and φ(Z) = {nI : n ∈ Z} is the smallest
subring of A that contains I;
(b) For each a ∈ A, {n ∈ Z : na = 0} is an ideal of Z that contains the kernel
N (φ);
P
(c) φ(N) = {nI : n ∈ N} = { nk=1 I : n ∈ N} coincides with the set N (A)8 .
10. Let A 6= {0} be a ring with identity I, and let φ : Z → A be given by
φ(n) = nI. Show that, if A is well-ordered (i.e., if A is ordered and every
non-empty subset of A+ has a minimum), then A is isomorphic to Z.
(Hint: Show the following:
(a) The set {a ∈ A : 0 < a < I} is empty;
(b) A+ = φ(N):
(c) A = φ(Z);
(d) φ is injective.)
2.5
Factors, Multiples and Division
In an arbitrary ring A the equation ax = b may fail to have a solution, even
if a 6= 0 (if a = 0 the equation can only have solutions if b = 0). In order to
avoid having to distinguish the equation ax = b from the equation xa = b,
we will assume in this section that A is a commutative ring.
Definition 2.5.1. If a, b ∈ A, we say that a is factor (or divisor9 ) of b,
or that b is multiple of a, and we write “a|b”, if the equation ax = b has
some solution x ∈ A.
8
This is the most rigorous way to formulate the idea that the elements of N (A) are
obtained by adding the identity I to itself an arbitrary (but finite) number of times.
9
Note that the term divisor is used here in with a slightly different meaning from the
one in the term zero divisor. Recall that a 6= 0 is called a zero divisor if the equation
ax = 0 has a solution x 6= 0.
2.5. FACTORS, MULTIPLES AND DIVISION
89
Examples 2.5.2.
1. In a ring with identity 1, any element b has at least the factors 1, −1, b, −b,
since b = 1b = (−1)(−b) (but note that is possible that 1 = −1 = b = −b!).
2. If K is a field and 0 6= k ∈ K, then every r ∈ K is a multiple of k.
3. If a ∈ Z and a2 6= 1, the set of multiples of a is smaller than Z.
4. The multiples of (x − 1) in the ring R[x] of polynomials with real coefficients
in the variable x are precisely {p(x) ∈ R[x] : p(1) = 0}.
It is obvious that that the relation “ is a factor of ” is transitive (if a|b
and b|c, then a|c), and if c 6= 0 is not a zero divisor, we have that ac|bc if
and only if a|b. On the other hand, if A is an ordered ring, it is also clear
that a|b if and only if |a| | |b|.
In this chapter we are mainly interested in the case A = Z. The study
of factorization and divisibility in more general rings will be the done in the
next chapter. In the case of the integers, the implication n > 0 ⇒ n ≥ 1
gives us the following:
Lemma 2.5.3. If m, n ∈ Z, then:
(i) m|n =⇒ (|m| ≤ |n| or n = 0);
(ii) (m|n and n|m) ⇐⇒ |m| = |n|.
According to the previous lemma, if n and k are natural numbers and
k|n, then k < n. As we have already observed, 1 is a factor of any natural
number n. Hence, if n and m are natural numbers, the set of common factors
(or divisors) of n and m is non-empty and bounded above, so it must have
a maximum.
Similarly, the set of natural numbers common multiples of n and m, i.e.,
the set {k ∈ N : n|k and m|k}, is non-empty since nm > 0 is a common
multiple of n and m. Hence, according to the the Well-ordering Principle,
this set must have a minimum element.
Definition 2.5.4. If n, m ∈ N, then:
(i) gcd(n, m) = max{k ∈ N : k|n and k|m} is called the greatest common divisor of n and m;
(ii) lcm(n, m) = min{k ∈ N : n|k and m|k} is called the least common
multiple of n and m.
90
CHAPTER 2. THE INTEGERS
Example 2.5.5.
If n = 12 and m = 16, the natural numbers which are divisors of n and m
form the sets
{k ∈ N : k|12} = {1, 2, 3, 4, 6, 12},
{k ∈ N : k|16} = {1, 2, 4, 8, 16}.
Therefore, the natural numbers which are common divisors of 12 and 16 form
the set
{k ∈ N : k|12 and k|16} = {1, 2, 4}.
We conclude that the corresponding greatest common divisor is gcd(12, 16) = 4.
The natural numbers which are multiples of 12 and 16 are
{k ∈ N : 12|k} = {12, 24, 36, 48, . . . },
{k ∈ N : 16|k} = {16, 32, 48, . . . },
so the natural numbers which are common multiples of 12 and 16 form the set
{k ∈ N : 12|k and 16|k} = {48, 96, . . . }
We conclude that the corresponding least common multiple is lcm(12, 16) = 48.
Notice that in this example gcd(m, n) is a multiple of all the common
divisors of n and m, while lcm(n, m) is a factor of all the common multiples
of n and m, In fact, we have:
Proposition 2.5.6. Let n, m, d, l ∈ N. Then:
(i) d = gcd(n, m) if and only if :
(a) d|n and d|m;
(b) for any k ∈ N, (k|n and k|m) ⇒ k|d.
(ii) d = lcm(n, m) if and only if :
(a) n|d and m|d;
(b) for any k ∈ N, (n|k and m|k) ⇒ d|k.
We will prove these statements in the next section. For now, we introduce
two more definitions for future reference. Note that in our conventions the
natural number 1 is not prime.
2.5. FACTORS, MULTIPLES AND DIVISION
91
Definition 2.5.7. If p, m, n ∈ N.
(i) We say that p is prime if p > 1 and if, for every k ∈ N such that k|p,
we have that k = 1 or k = p.
(ii) We say that n and m are relatively prime if gcd(n, m) = 1.
Example 2.5.8.
It is easy to check by listing all the possibilidades that 4 and 9 are relatively
prime, i.e., gcd(4, 9) = 1, and that 13 is a prime number. In order to show that
p is prime it is not necessary to test all the numbers k with 1 < k < p: it is
enough to stop testing at the greatest integer k such that k 2 < p. For example,
in the case p = 13, it is enough to check that 13 is not a multiple of 2 or 3.
We saw in the previous section (by using induction!) that any natural
number n is either even or odd, i.e.., that given n, there exist integers q
and r such that n = 2q + r, with 0 ≤ r < 2. This result is not specific to
the natural number 2: for example, we can always write n = 3q ′ + r ′ , with
0 ≤ r ′ < 3, or n = 4q ′′ + r ′′ , with 0 ≤ r ′′ < 4, etc. A moment of thought
shows that these statements are a consequence of the Division Algorithm
learned in Elementary School!
Theorem 2.5.9 (Division Algorithm). If n, m ∈ Z and n 6= 0, there exist
unique integers q, r, such that m = nq + r, and 0 ≤ r < |n|.
Proof. We will prove only the case n, m ∈ N, leaving as an exercise the
generalization to any integers. Note that the argument to prove existence
amounts to the usual method to perform the division of two numbers.
(i) Existence: Consider the set Q = {x ∈ N0 : nx ≤ m}. Note that Q
is non-empty (because 0 ∈ Q) and bounded above (because x ≤ nx ≤ m).
Therefore Q has a maximum x = q. It is clear that nq ≤ m < n(q + 1),
because q ∈ Q and (q+1) 6∈ Q. Subtracting nq to both sides of the inequality,
we obtain that 0 ≤ r ≤ n, since r = m − nq.
(ii) Uniqueness: Assume that m = nq + r = nq ′ + r ′ , with 0 ≤ r, r ′ < n.
We have −n < r − r ′ < n, or equivalently |r − r ′ | < |n|, and also that
n(|q − q ′ |) = |r − r ′ |. If q 6= q ′ , then |q − q ′ | ≥ 1 and |r − r ′ | > n, so we must
have q = q ′ . But from n(|q − q ′ |) = |r − r ′ |, we conclude also that r = r ′ .
Definition 2.5.10. If n, m ∈ Z, n 6= 0, and m = nq + r with 0 ≤ r < |n|,
we say that q and r are, respectively, the quotient and the remainder of
the division of m by n.
92
CHAPTER 2. THE INTEGERS
The way the quotient and the remainder depend on the algebraic signs
of m and n is not entirely obvious. This is illustrated fin the following table
by various examples:
m
n
q r
5
3
1 2
5 −3 −1 2
−5
3 −2 1
−5 −3
2 1
Assuming that n 6= 0 is fixed, consider the map ρ : Z → {0, 1, . . . , n − 1},
where ρ(m) is the remainder of the division of m by n. We leave as an
exercise to show that the following result holds true:
Proposition 2.5.11. If x, y are arbitrary integers, we have
(i) ρ(x) = ρ(y) if and only if n|(x − y);
(ii) ρ(x ± y) = ρ(ρ(x) ± ρ(y));
(iii) ρ(xy) = ρ(ρ(x)ρ(y)).
The Division Algorithm will be used many times in the sequel. For
example, we will use in the next section to describe all the ideals of Z.
Moreover, in the next Chapter we will show how it can be generalized to
other rings, including some polynomial rings. Actually, all the notions that
we have introduced in this section, such as prime number, gcd, lcm, etc.,
will eventually be generalized to a large class of rings.
Exercises.
1. Let A be a commutative ring, and a, b, c ∈ A. Show that, if a|b and b|c, then
a|c, and that, if c 6= 0 is not a zero divisor, we have ac|bc if and only if a|b.
2. Prove Lemma 2.5.3.
3. Conclude the proof of Theorem 2.5.9.
4. Show that, if m, n, k ∈ N, mn = k and m2 > k, then n2 < k.
5. List all the natural numbers between 100 and 200. Observe that 172 = 289,
and remove from the list all the multiples of 2, 3, 5, 7, 11 and 13. Which
numbers are left in the list? 10
10
This procedure is called Eratosthenes filter. Eratosthenes (276 a.C.-194 a.C.) was
born where is now Libya, and was the third librarian of the famous Library of Alexandria. Among other achievements he established the sphericity of the Earth and found its
diameter with great accuracy.
2.6. IDEALS AND EUCLIDES ALGORITHM
93
6. Determine the prime numbers between
√ 1950 and 2050.
(Hint: Determine first the primes p ≤ 2050).
7. If m, n ∈ Z, n 6= 0 and ρ : Z → {0, 1, . . . , n − 1} is the remainder of the
division by n, when is it true that ρ(m) = ρ(−m)?
8. Prove Proposition 2.5.11. What is the relationship between this theorem and
the “casting out nines” test of elementary arithmetics?
9. Is Theorem 2.5.9 still valid if one replaces in its statement the ring of integers
by the ring formed by the multiples of 2. What if one replaces the ring of
integers by the ring real numbers?
10. State and prove the analogue of Theorem 2.5.9 for the ring of Gaussian
integers.
2.6
Ideals and Euclides Algorithm
Assuming that one has two sand watches (hourglasses), one measuring intervals of 21 minutes and the other measuring intervals of 30 minutes,
what intervals of time can one measure using the two watches? Certain
intervals are obviously possible, by using successively each watch, such as
30, 60, 90, . . . , 21, 42, 63, . . . , or additions of these numbers, such as 51, 81, 111,
. . . , 102, 123, . . . , 132, 153, . . . , etc. By using simultaneously both watches
we can also measure the differences of these numbers, such as 9 = 30 − 21,
3 = 63 − 60, etc.
Looking carefully at the resulting numbers, one reaches the following
conclusions:
• It is possible to obtain any natural number of the form x21 + y30 with
x, y ∈ Z.
• All the numbers of the form x21 + y30 are multiples of 3 (since 3 is
the greatest common divisor of 21 and 30).
On the other hand, there exist integers x′ and y ′ (e.g., x′ = 3, y ′ = −2) such
that 3 = x′ 21 + y ′ 30. In particular, if m = k3 is any multiple of 3, then
m = k(x′ 21 + y ′ 30) = k(x′ 21 + y ′ 30) = x′′ 21 + y ′′ 30. In other words:
• the numbers of the form x21 + y30 are precisely the multiples of 3;
• 3 = gcd(21, 30) is the smallest natural number of the form x21 + y30.
94
CHAPTER 2. THE INTEGERS
Our aim now is to explore this remarks, generalizing them to an arbitrary
ring. As a byproduct of this work we will be able to determine all the ideals
of the ring Z and we will find a method to compute gcd(n, m), the so-called
Euclides Algorithm. This algorithm does not require the knowledge of prime
factors of n and m, so it is specially useful for computational purposes, and
it is also easily generalized to polynomials,
If n and m are arbitrary fixed integers, we will consider the set
I = {xn + ym : x, y ∈ Z}.
It is clear that I is non-empty and closed relative to the difference and the
product by arbitrary integers, i.e., if x, y, x′ , y ′ , z ∈ Z, then
(xn + ym) ± (x′ n + y ′ m) = (x ± x′ )n + (y ± y ′ )m ∈ I,
z(xn + ym) = (xn + ym)z = (zx)n + (zy)m ∈ I.
In other words, I is an ideal of Z. On the other hand, if J is an ideal such
that n, m ∈ J, it is obvious that xn, ym ∈ J for all x, y ∈ Z, hence also
xn + ym ∈ J, so that I ⊂ J. For this reason we say that I is the smallest
ideal of Z that contains n and m, or still that I is the ideal generated by n
and m.
More generally, consider any ring A instead of the ring of integers, and
replace the set {n, m} by an arbitrary (non-empty) subset S ⊂ A. Note that
the ring A itself is an ideal of A that contains S. Hence, the family of ideals
of A which contains S is non-empty. We will denote by hSi the intersection
of all the ideals in this family. It is clear that S ⊂ hSi. Also, one verifies
directly from the definition of ideal that the intersection of a family of ideals
of A is still an ideal of A. Hence, hSi is an ideal of A that contains S.
Finally, observe that if S ⊂ I ⊂ A and I is an ideal, then hSi ⊂ I. We leave
as an exercise to verify that all these statements hold true.
Definition 2.6.1. If A is a ring and S ⊂ A, one calls hSi the ideal generated by S, or the smallest ideal of A that contains S. The elements of S
are said to be generators of the ideal hSi, and S is called a generating
set of hSi.
When S = {a1 , a2 , . . . , an } is a finite subset of a ring A, we will often
write ha1 , a2 , . . . , an i instead of hSi. In the case of the integers, we saw
above that hn, mi = {xn + ym : x, y ∈ Z}, and it is immediate to show that
hni = {xn : x ∈ Z}. However, there are rings where determining the ideal
generated by a given set of elements is not so easy.
2.6. IDEALS AND EUCLIDES ALGORITHM
95
Examples 2.6.2.
1. Let A be an abelian ring and a ∈ A. Then {xa : x ∈ A} is a subring of A,
for if x, y ∈ A, we have
xa − ya = (x − y)a, (xa)(ya) = (xay)a,
showing that {xa : x ∈ A} is closed for the difference and the product. Also, if
x, b ∈ A, since A is abelian, we have
b(xa) = (xa)b = (bx)a,
so {xa : x ∈ A} is an ideal. Finally hai contains necessarily the “multiples” of
a, and we conclude that hai = {xa : x ∈ A}.
2. Let A = M2 (Z) be the ring of 2 × 2 matrices with entries in Z, and
1 0
a=
.
0 0
The set of all “multiples” of a is
{ax : x ∈ M2 (Z)} =
n
0
m
0
: n, m ∈ Z ,
but hai = M2 (Z) as it will be clear in the exercises.
The ideal of Z generated by 21 and 30 is also generated by 3. This is
true for all ideals in the integers, which are always generated by only one of
its elements (but it is by no means a property valid for arbitrary rings):
Theorem 2.6.3. I is an ideal of Z if and only if there exists d ∈ Z such
that I = hdi.
Proof. We verify both implications:
(a) If I = hdi, then I is obviously an ideal of Z.
(b) Let I be an ideal of Z. If I = {0} it obvious that I = h0i, so we
can assume that I 6= h0i. In this case, we observe that the ideal I contains
positive integers, since if n ∈ I then −n ∈ I. Set
I + = {n ∈ I : n > 0},
and let d be the minimum of I + , which exists by the Well-ordering Principle.
Since d ∈ I it follows that hdi ⊆ I (why?). Hence, it remains to show
that I ⊆ hdi, i.e., that if m ∈ I, then m is a multiple of d. Let m ∈ I,
96
CHAPTER 2. THE INTEGERS
and let q and r be the quotient and the remainder of the division of m by d
(recall that d > 0, so d 6= 0). We have then that m = qd + r, or r = m − qd.
Since qd ∈ hdi ⊂ I, this means that qd ∈ I. But then r = m − qd is the
difference of two elements in the ideal I, so we must have r ∈ I. Finally,
since 0 ≤ r < d and d is by definition the smallest positive element of I, we
must have r = 0, so m is a multiple of d, i.e., m ∈ hdi. Hence, I ⊆ hdi.
In an arbitrary ring we have a special designation for those ideals which
are generated by one of its elements:
Definition 2.6.4. An ideal I in a ring A is called a principal ideal if
there exists a ∈ A such that I = hai.
According to Theorem 2.6.3, all the ideals in the ring of integers are
principal, but we will see later many examples of rings which have ideals
that are not principal. However, the rings where all ideals are principal
form an important class of rings.
The example of the ideal h21, 30i suggests that, if I = hdi = hn, mi, then
|d| is the greatest common divisor of n and m. The previous result allows
us to prove this result and also part (i) of Proposition 2.5.6.
Corollary 2.6.5. If n, m ∈ N, then hn, mi = hdi where d = gcd(n, m). In
particular, we have that:
(i) the equation xn + ym = d has solutions x, y ∈ Z;
(ii) if k is a common divisor of n and m, then k is also divisor of d.
Proof. (i) The theorem shows that there exists a natural number d such that
hn, mi = hdi. Obviously, d ∈ hdi = hn, mi. Since hn, mi is the set of integers
of the form xn + ym, there are integers x′ , y ′ such that d = x′ n + y ′ m.
(ii) It is also obvious that n, m ∈ hn, mi = hdi. Hence, n and m are
multiples of d, which is a common divisor of n and m. On the other hand,
if k ∈ N is any common divisor of n and m, then we have n = kn′ and
m = km′ , so that d = x′ n + y ′ m = k(x′ n′ + y ′ m′ ), or k|d. In particular,
k ≤ d and d is the greatest common divisor of n and m.
This corollary suggest that one can compute gcd(n, m) by searching for
the smallest natural number in the ideal hn, mi. The Division Algorithm
makes this search possible by using the following lemma.
Lemma 2.6.6. If n, m ∈ N and m = qn + r, then hn, mi = hn, ri.
2.6. IDEALS AND EUCLIDES ALGORITHM
97
Proof. On the one hand,
k ∈ hn, mi =⇒ k = xm + yn
=⇒ k = x(qn + r) + yn
=⇒ k = (xq + y)n + xr
=⇒ k ∈ hn, ri,
so hn, mi ⊂ hn, ri. On the other hand,
k ∈ hn, ri =⇒ k = xn + yr
=⇒ k = xn + y(m − qn)
=⇒ k = (x − yq)n + ym
=⇒ k ∈ hn, mi,
so hn, ri ⊂ hn, mi.
The Euclides Algorithm consists in applying repeatedly the previous
lemma until one obtains an exact division (i.e., r = 0). This is a very simple
procedure, easy to program in a computer, which we illustrate now for the
example that we started this section.
Example 2.6.7.
If n = 21 and m = 30, then
30 = 1 · 21 + 9 =⇒ h30, 21i = h21, 9i,
21 = 2 · 9 + 3 =⇒ h21, 9i = h9, 3i,
9 = 3 · 3 + 0 =⇒ h9, 3i = h3i.
Hence
h30, 21i = h3i,
and by the previous corollary we have that 3 = gcd(21, 30).
In general, if we start with two natural numbers n and m, and assuming that
n < m, the general procedure should be clear, and it corresponds to an iterative
process which is very easy to program. Observe that it is also possible to determine integers x and y such that gcd(n, m) = xn + ym: from the equations
above it follows immediately that
3 = 21 + (−2)9 and 9 = 30 + (−1)21,
so:
3 = 21 + (−2)[30 + (−1)21] = (3)21 + (−2)30.
98
CHAPTER 2. THE INTEGERS
The next result takes advantage of the fact that gcd(n, m) is a linear
combination of n and m.
Proposition 2.6.8. Let m, n, p, k ∈ N and assume that mn|kp. If m and p
are relatively prime, then m is factor of k.
Proof. By assumption, gcd(m, p) = 1, so there exist integers x′ , y ′ such that
1 = x′ m + y ′ p. Hence, k = k(x′ m + y ′ p). Moreover, since mn|kp, there exists
an integer z ′ such that kp = z ′ mn. Hence,
k = k(x′ m + y ′ p)
= kx′ m + y ′ kp
= kx′ m + y ′ z ′ mn = (kx′ + y ′ z ′ n)m,
which shows that m|k.
The least common multiple of two natural numbers maybe found also by
applying Theorem 2.6.3. Given natural numbers n and m, we observe that
hni∩hmi is the set of common multiples of n and m. Since the intersection of
two ideals is an ideal, we conclude from Theorem 2.6.3, that hni ∩ hmi = hli,
where l is a natural number. Of course, l is a common multiple of n and m,
and any common multiple k of n and m is a multiple of l, so that l ≤ |k|.
Using this, and similarly to the case of the greatest common divisor, one
can prove part (ii) of Proposition 2.5.6.
These remarks also suggest a way to define the greatest common divisor
and the least common multiple of two integers:
Definition 2.6.9. If m, n, d, l are integers, and d, l ≥ 0, then:
(i) d = gcd(n, m) if hdi = hn, mi;
(ii) l = lcm(n, m) if hli = hni ∩ hmi.
This definition is consistent with Theorem 2.6.3 and lead to the “natural”
result that gcd(n, m) = gcd(|n|, |m|) and lcm(n, m) = lcm(|n|, |m|). Notice,
by the way, that gcd(n, 0) = |n| and lcm(n, 0) = 0. Moreover, we leave to
the exercises to show that for any integers m, n ∈ Z one has:
lcm(m, n) gcd(m, n) = |m||n|.
Hence, the Euclides Algorithm also leads to the computation of lcm(m, n).
We will further explore these results later to extend the notions of greatest
common divisor and least common multiple to certain classes of rings.
2.6. IDEALS AND EUCLIDES ALGORITHM
99
If hmi and hni are ideals of Z, it is clear that hni ⊂ hmi if and only if m|n.
In other words, to determine all the ideals that contain hni is equivalent to
determine all the divisors of n.
This is illustrated in the next figure, when n = 12, where each rectangle
represents an ideal of Z that contains h12i. Note that if a rectangle is
contained in another, then the corresponding ideals also are, and that the
ideal generated by 1 is obviously the whole ring Z.
h1i
PSfrag replaements
Z
=Z
J
Zm
h2i
I
h4i
hmi
h12i
e
e~
h6i
h3i
Figure 2.6.1: The ideals of Z that contain h12i.
Alternatively, we can represent the ideals that contain h12i as in the
following diagram (notice the subdiagram in the right,
PSfragexpressing
replaements the general
Z
property used here).
h1i =
h2i
h
Z
hmd(m; n)i
e
e~
h3i
hm i
h4i
mi
hni
h6i
h12i
hmm(m; n)i
Figure 2.6.2: Os divisors of 12.
One should note that a diagram of this sort can be extended indefinitely
below, but not above. In particular, given an ideal hmi ⊂ Z it is possible that
the only ideal strictly containing it is Z itself, and that happens precisely
when n is a prime number. The same may happen with an ideal in an
arbitrary ring, so we introduce:
100
CHAPTER 2. THE INTEGERS
Definition 2.6.10. An ideal I ⊂ A is maximal if for every ideal J ⊂ A:
I ⊂ J =⇒ J = I or J = A.
Maximal ideals play an important role in many rings, not just in the
ring of integers. For example, we will see later than the maximal ideals of
an integral domain D can be used to associate to D certain fields. As we
mentioned above, it is possible to identify the maximal ideals of Z:
Theorem 2.6.11. An ideal hpi in Z is maximal if and only if p = 1 or |p|
is a prime number.
Proof. If p, q ∈ N and hpi ⊂ hqi, then q|p. If p = 1 or p is prime, we have
that q = 1 or q = p, so that hqi = Z or hqi = hpi, and hence hpi is maximal.
On the other hand, if p > 1 is not prime, then there exists q ∈ N such
that 1 < q < p and q|p. It follows that hpi ( hqi ( Z, so hpi is not a
maximal ideal.
In the next section we will look more in detail into the properties of
prime numbers.
Examples 2.6.12.
1. The ideal h0i is maximal in the ring R, but not in the ring Z.
√
2. The√ideal hx2 − 2i is not maximal in R[x], because hx2 − 2i ( hx + 2i, and
hx + 2i 6= R[x]. On the other hand, hx2 − 2i is maximal in Z[x] (why?).
3. In the Appendix, using Zorn’s Lemma, we show that in an arbitrary ring A
any proper ideal I ( A is contained in a maximal ideal.
If A is a ring with identity I, then the map φ : Z → A, n 7→ nI, is a ring
homomorphism. Hence, the set of solutions of the homogenous equation
nI = 0 (n ∈ Z), i.e., the kernel of φ, is an ideal of Z. Since every ideal of Z
is principal, we see that there exists an unique integer m ≥ 0 such that
{n ∈ Z : nI = 0} = hmi.
Actually, according to the proof of Theorem 2.6.3, if m > 0, then m is
simply the smallest positive solution of the equation nI = 0. In any case,
the integer m deserves a special name.
Definition 2.6.13. If A is a ring with identity I, its characteristic is
the unique number m ∈ N0 such that
{n ∈ Z : nI = 0} = hmi.
2.6. IDEALS AND EUCLIDES ALGORITHM
101
Examples 2.6.14.
1. The rings Z, Q, R, C and H all have characteristic 0.
2. The ring Z2 has characteristic 2.
Exercises.
1. Give examples of natural numbers a, b, n such that ab is not a factor of n,
but a|n and b|n.
2. Determine integers x and y such that gcd(135, 1987) = x135 + y1987.
3. How can one measure 1 pint of water, if one only has two jars of sizes,
respectively, 15 and 23 pints?
4. Let a, b, d, m, x, y, s, t ∈ Z. Show that:
(a) d = gcd(a, b), a = dx, b = dy ⇒ gcd(x, y) = 1;
(b) as + bt = 1 ⇒ gcd(a, b) = gcd(a, t) = gcd(s, b) = gcd(s, t) = 1;
(c) gcd(ma, mb) = |m| gcd(a, b);
(d) gcd(a, m) = gcd(b, m) = 1 ⇔ gcd(ab, m) = 1;
(e) a|m and b|m ⇒ ab|m gcd(a, b);
(f) |ab| = gcd(a, b) lcm(a, b).
5. Let I = {Iβ }β∈B be a family of ideals of a ring A indexed by some set B.
Consider the intersection of all sets in this family:
\
I=
Iβ = {a ∈ A : a ∈ Iβ , for all β ∈ B}.
β∈B
(a) Show that I is an ideal of A.
(b) Let S ⊂ A and let I be the family of all ideals of A that contains S.
Show that I is the smallest ideal of A that contains S.
6. Show that, if S1 ⊂ S2 ⊂ A, then hS1 i ⊂ hS2 i.
7. Assume that m, n ∈ Z, and show that:
(a) hni ⊂ hmi ⇐⇒ m|n;
(b) hni = hmi ⇐⇒ m = ±n.
102
CHAPTER 2. THE INTEGERS
8. Show that, if A is an abelian ring with identity and a1 , a2 , . . . , an ∈ A, then
( n
)
X
ha1 , a2 , . . . , an i =
xk ak : xk ∈ A, 1 ≤ k ≤ n .
k=1
How would you describe ha1 , a2 , . . . , an i if A was abelian without identity?
9. Let {an }n∈N be a sequence of integers.
(a) Define dn = gcd(a1 , a2 ,P
. . . , an ) and ln = lcm(a1 , a2 , . . . , an ), and show
n
that the equation dn = k=1 xk ak has solutions xk ∈ Z.
(b) Show that dn+1 = gcd(dn , an+1 ) and ln+1 = lcm(ln , an+1 ).
(c) For which values of n does the equation 30x+105y +42z = n have integer
solutions?
10. Construct a diagram similar to the one for the divisors of 12 for n = 18.
11. Assume that A and B are unitary rings of characteristic, respectively, n and
m. Show that the characteristic of A ⊕ B is lcm(n, m).
12. Let A be an abelian ring with identity 1, and let J be an ideal of A. Show
that J is maximal if and only if ,for al a 6∈ J, the equation xa + y = 1 has
solutions x ∈ A and y ∈ J.
(Hint: Check that {xa + y : x ∈ A, y ∈ J} is the ideal generated by J ∪ {a}.)
13. Let A be a ring with identity I and characteristic m > 0. Show that:
{nI : n ∈ Z} = {I, 2I, . . . , mI}
is a ring with m elements. Show also that, if a ∈ A, then the smallest positive
solution of na = 0 is a factor of m.
(Hint: If φ : Z → A, n → nI, and n = mq + r, then φ(n) = φ(r).)
14. Let A be a ring of characteristic 4. Determine the addition and multiplication tables of the subring {nI : n ∈ Z} ⊂ A. Decide whether or not this ring
is isomorphic to the field with 4 elements mentioned in Chapter 1.
15. Determine the ideals of each of the following rings:
(a)
(b)
(c)
(d)
The
The
The
The
ring
ring
ring
ring
2Z of even integers.
Z[i] of Gaussian integers.
Z ⊕ Z.
Mn (Z).
16. For each of the rings in the previous exercise, determine whether or not its
ideals are all principal.
2.7. THE FUNDAMENTAL THEOREM OF ARITHMETIC
2.7
103
The Fundamental Theorem of Arithmetic
One of the most important properties of the prime numbers is the fact that
they generate, via multiplication, all the natural numbers n ≥ 2. Our aim
now is to make this fact precise and to prove it. Its correct formulation is
known as the Fundamental Theorem of Arithmetic.
We start by showing that:
Proposition 2.7.1. Any natural n ≥ 2 has at least one prime divisor p.
Proof. The set
D = {m ∈ N : m > 1 and m|n}
is non-empty, since it contains n. Let p be a minimum of D. If p is not a
prime, then p = mk, where 1 < m < p. Since m is obviously a factor of n,
p cannot be a minimum of D. We conclude that p must be a prime.
A consequence of this proposition is the existence of factorizations into
prime numbers for any natural number n ≥ 2.
Corollary 2.7.2. If n ≥ 2 is a natural number there exist prime numbers
p1 ≤ p2 ≤ · · · ≤ pk such that
n=
k
Y
pi .
i=1
Proof. We give a proof by induction.
If n = 2, it is evident that n has a prime factorization with k = 1 and
p1 = 2. Assume now that any natural number m with 2 ≤ m < n has a
prime factorization. We wish to show that n also has a prime factorization.
Let P = {p ∈ N : p|n, p prime} be the set of prime factors of n. We now
that P is bounded above (n is an upper bound) and non-empty (according
to Proposition 2.7.1). Hence, P has a maximum q. If q = n, then n is a
prime, so we set k = 1 and p1 = q = n. Otherwise, q < n so n = mq,
where 2 ≤ m < n. By the induction Q
assumption, there exist prime numbers
′
p1 ≤ p2 ≤ · · · ≤ pk′ such that m = ki=1 pi , all of them bounded above by
q. In this case, k = k′ + 1, and pk = q.
Now that we know the existence of prime factorizations for all natural
numbers n > 2, we look into the question of uniqueness of these factorizations (up to the order of the factors). This is a problem of a rather different
nature, as the following example illustrates.
104
CHAPTER 2. THE INTEGERS
Example 2.7.3.
Let us call an element in the ring 2Z of even integers a “prime” if it cannot
be expressed as a product of other even integers. For example, according to this
definition the elements 2, 6 and 18 are all “primes”. It is left to the exercises
to prove the analogues of the previous results for this ring which, just like the
integers, is well-ordered. On the other hand, since 36 = 2 · 18 = 6 · 6, it is clear
that the “prime” factorizations in this ring are not unique.
The fundamental result about the integers that yields the uniqueness of
prime factorizations is the following:.
Lemma 2.7.4 (Euclides). Let m, n, p ∈ Z and assume that p is a prime
number. If p is a factor of the product mn, then p is a factor of m or a
factor of n.
Proof. Let d = gcd(m, p). Since d is a factor of p and p is prime, we must
have d = 1 or d = p.
Obviously, if d = p, then p is factor of a m. On the other hand, If d = 1,
there exist integers x and y such that 1 = xm + yp. Hence, n = nxm + nyp
and since p is a factor of mn, there exists also an integer z such that mn = zp.
We conclude that n = xzp+nyp = (zx+ny)p, which shows that p is a factor
of n.
The example above shows that Euclides Lemma is not valid in the ring
of even integers, for the definition of “prime” that we indicated.
Example 2.7.5.
When the Greek mathematicians found out about the existence of what we
now call irrational numbers they were very intrigued and surprised, and even
√
tried to hide its existence! We can now show without much difficulty that 2
n 2
is irrational, i.e., that there are no integers n and m such that m
= 2, or
equivalently n2 = 2m2 . For this we will argue by contradiction.
Assuming that n and m exist, we can choose them to be relatively prime (why?).
Now observe that by Euclides Lemma we must have:
n2 = 2m2 ⇒ 2|n2 ⇒ 2|n,
We conclude that n = 2k, for some integer k. Hence, n2 = 4k 2 , so 4k 2 = 2m2 ,
or still 2k 2 = m2 . Since 2|m2 , it follows again from Euclides Lemma that 2|m.
But 2 being a divisor of both m and n contradicts our assumption that they
are relatively prime. We conclude
that the equation n2 = 2m2 has no integer
√
solutions, and hence that 2 is not rational.
2.7. THE FUNDAMENTAL THEOREM OF ARITHMETIC
105
√
We leave it for the exercises to generalize this example to the case n,
where n ∈ N is not a perfect square. In other words, if n is a natural number,
its square root is either another natural number (case in which n is a perfect
square) or an irrational number.
It is not hard to generalize Euclides Lemma to any finite product of
integers. The proof is by induction in the number of factors and is left to
the exercises.
Q
Corollary 2.7.6. If p ∈ N is prime and p | ki=1 mi , then:
(i) p|mj for some j, with 1 ≤ j ≤ k.
(ii) If the integers mi are primes, then p = mj , for some j, with 1 ≤ j ≤ k.
Finally, we can state and prove the
Theorem 2.7.7 (Fundamental Theorem of Arithmetic). Any natural number n > 2 has a prime factorization which is unique up to the order of the
factors.
Proof. The existence of prime factorizations follows from Corollary 2.7.2. It
remains to prove the uniqueness.
Assume that k and m are natural numbers, p1 ≤ p2 ≤ · · · ≤ pk and
q1 ≤ q2 · · · ≤ qm are primes, and
k
Y
pi =
i=1
m
Y
qj .
j=1
We claim that k = m and pi = qi . For that, we argue by induction in k:
• If k =1, the result is obvious from the definition of a prime number.
• Assume that the result holds for the natural k − 1, and that
k
Y
pi =
m
Y
qj .
j=1
i=1
Let P = {pi : 1 ≤ i ≤ k} and Q = {qj : 1 ≤ j ≤ m}. Again by
the definition of prime number we must have m > 1, since k > 1.
By Corollary 2.7.6 it is clear that pk ∈ Q, so that pk ≤ qm , Similarly
qm ∈ P , so that qm ≤ pk . We conclude that pk = qm , and it follows
from the cancellation law that
k−1
Y
i=1
pi =
m−1
Y
j=1
qj .
106
CHAPTER 2. THE INTEGERS
By the induction hypothesis, k − 1 = m − 1 and pi = qi , for i < k, so
we conclude that k = m and pi = qi , for i ≤ k.
A prime factorization of n can, of course, contain repeated factors. For
that reason, one often writes it in the form:
m
Y
pi ei
(ei ≥ 1).
n=
i=1
This is called the prime powers factorization of n. Again, this expression is
unique, up to the order of the factors.
The Fundamental Theorem of Arithmetic does not imply directly the
existence of an infinite number of primes. This last fact was also discovered
by Euclides.
Theorem 2.7.8 (Euclides). The set of prime numbers is unbounded.
Proof. We show that for any natural number n there exists a prime number
p > n. Given a natural number n, consider the natural number m = n! + 1,
where n! is the factorial of n. It is obvious from the Division Algorithm that
the remainder of the division of m by any natural number between 2 and n
is 1. In particular, all the factors of m, including its prime factors (which
exist according to Proposition 2.7.1), are larger than n. We conclude that
there exist prime numbers greater than n.
Hence, the prime numbers form an infinite sequence
2, 3, 5, 7, 11, 13, . . . , pn , . . .
and this raises immediately some obvious questions. For example, is it
possible to determine an explicit formula involving the natural number n,
which gives the prime pn ? How many prime numbers are there in the interval
of 1 to n? How hard is it to find the prime factors of a given natural n?
How large can it be the gap between two successive primes?
The answer to the first question above seems to be negative. In particular, no explicit formula is known that produces only prime numbers. One
famous attempt is due to Fermat11 and involved the numbers of the form
n
Fn = 22 + 1,
11
Pierre of Fermat (1601-1665), French mathematician. Fermat, was a lawyer and one
of the most fascinating characters of the History of Mathematics. He was one of the
founders of Calculus and he discovered independently from Descartes (of whom he was a
friend) the principles of Analytic Geometry. However, his most important contributions
were, without a doubt, to Number Theory.
2.7. THE FUNDAMENTAL THEOREM OF ARITHMETIC
107
now known as Fermat numbers. It is easy to compute the Fermat numbers
corresponding to n = 0, 1, 2 and 3: namely, 3, 5, 17, and 257, which are
all primes. If n = 4 one finds the Fermat number 65537, which is still a
prime number. There are however Fermat numbers that are not primes,
has Euler12 find out in 1732 by taking n = 5. If this looks like a simple
observation, note that n = 5 corresponds to the number:
5
22 + 1 = 4 294 967 297,
and Euler found that its prime factorization is:
5
22 + 1 = 641 · (6 700 417).
The choice of the exponent e = 2n is easy to explain. Assume that
F = 2e + 1, and e = ks, where k, s > 1, with s odd. The polynomial
p(x) = xs + 1 has the root x = −1, hence can be factor as
p(x) = (x + 1)q(x),
where q(x) is a polynomial with integer coefficients. Substituting x by 2k ,
we conclude that
s
F = 2e + 1 = (2k ) + 1 = (2k + 1)q(2k ),
so the number F is not prime. In other words, if F = 2e + 1 is prime then e
has no odd factors greater than 1, so its only prime factor is 2, and we have
e = 2n , hence F is the Fermat number Fn .
In spite of the promising start of the sequence of Fermat numbers, we
don’t know any Fermat number with n > 4 that is a prime, and it is known
that some of these Fermat numbers are composed. For example, it is known
that the smallest prime factor of the Fermat number corresponding to n =
1945 (a number with more than 10582 digits in its decimal expansion!) is a
prime number p of 587 digits: p = 5 · 21947 + 1, and it is conjectured that no
Fermat number with n > 4 is a prime. In spite of that, we will see in the
exercises that these numbers can also be used to prove the existence of an
infinite number of primes.
Questions concerning the number of primes in the interval [1, n], or concerning the distribution of primes, are related to the probability of a randomly chosen natural number in the interval [1, n] be a prime. It is also
12
Leonhard Euler (1707-1783), Swiss mathematician. Euler was one of the most prodigious mathematicians of all times, having worked in various branches of Pure and Applied
Mathematics, such as Analysis, Geometry, Number Theory, Fluid Mechanics, etc.
108
CHAPTER 2. THE INTEGERS
related to the problem of finding an explicit formula for the n-esimal prime,
as mentioned above. Legendre13 and Gauss were the first mathematicians
to suggest an approximate expression π(x) for the number of primes < x.
At the end of the XIX Century, Hadamard 14 showed that
π(x)
x
log x
→ 1 when x → ∞.
We will not discuss here any results of this nature that typically require
analytic techniques.
Some of the problems are among the hardest problems in Mathematics,
and illustrate the capabilities and limitations of the human mind. In spite
of its abstract origin, these problems have interesting consequences in our
day-to-day life. We will mention later some techniques from Cryptography
that take advantage of the relative ease in manipulating “large” prime numbers when compared with the difficulties in finding prime factorizations. In
practical applications, these “large” numbers can have more that 100 digits,
while the determination of their prime factorizations by sequential verification of its possible prime factors may require a number of divisions of the
order of 1050 ! The security in the electronic transmission of the most secret
communications or of simple financial transactions depends on our ignorance
of a practical algorithm for prime factorization.
At this point it is hard to mention other “practical” applications where
the properties of prime numbers are important. We mention only that questions concerning speech or image recognition require algorithms for the decomposition of signals into its fundamental frequencies, a technique known
as Fourier Analysis. The theoretical maximum speed for these algorithms is
directly related to the function π(x).
Exercises.
Qk
Qk
1. If n = i=1 pi ei and m = i=1 pi fi with ei , fi ≥ 0 integers, find expressions
for the gcd(n, m) and the lcm(n, m).
2. Let p and q be distinct primes, and let n = p2 q 3 . Count all the natural
numbers that are factors of n, and show that their sum is (1 + p + p2 )(1 + q +
q 2 + q 3 ).
13
Adrien Marie Legendre (1752-1833), French mathematician. Legendre become known
for his fundamental contributions to Number Theory and to the study of elliptic function.
14
Jacques Hadamard (1865-1963), was one of the most influential French mathematicians of the second half of the XIX Century and the first half of the XX Century. His work
spanned such different domains of Mathematics as, e.g., Number Theory or the Calculus
of Variations.
2.8. CONGRUENCES
109
3. Generalize the previous result to the case where n =
Qk
i=1
pi ei .
4. Prove a version of Corollary 2.7.2 for the ring 2Z of all even integers.
5. Prove Corollary 2.7.6.
6. Show that, for any natural number n, the interval [n + 1, n! + 1] contains at
least one prime.
7. The primes of the form 2n − 1 are called Mersenne primes. Show that, if
an − 1 is prime and n > 1, then a = 2 and n is prime.
8. Show that the sequence an = n2 − n + 41 is not formed only by primes.
9. Show that, if p(x) is a non-constant polynomial with integer coefficients, then
the set of integers n for which an = p(n) is not prime must be infinite.
n
10. Let Fn = 22 + 1 be the n-esimal Fermat number (n ≥ 0).
Q
(a) Show that Fn+1 = 2 + ni=0 Fi ;
(b) Show that if n 6= m then gcd(Fn , Fm ) = 1;
(c) Explain why the previous result implies the existence of an infinite number of primes.
11. Show that, if n is not a perfect square, then
equation x2 = ny 2 has no solutions x, y ∈ Z).
2.8
√
n is irrational (i.e., the
Congruences
We now turn to the study of other binary relations in Z, the so-called “congruence modulo m”, which are associated to the divisibility relation We will
use them to solve equations of the form ax + by = n, where all the variables are integers. In the next chapter, we will also use them to exhibit an
important class of finite rings.
Definition 2.8.1. Let m ∈ N0 . If x, y ∈ Z, we say that x is congruent
modulo m with y if and only if x − y is a multiple of m. The integer m is
called the modulus of congruence.
If x is congruent with y modulo m, we will write x ≡ y (mod m). Therefore, we have:
x≡y
(mod m)
⇐⇒
m|(x − y)
⇐⇒
(x − y) ∈ hmi.
110
CHAPTER 2. THE INTEGERS
Recall that a binary relation is called an equivalence relation if it
is reflexive, symmetric and transitive (see the Appendix).
Proposition 2.8.2. Congruence modulo m is an equivalence relation.
Proof. We check that congruence modulo m satisfies all three properties:
(i) ≡ is reflexive: since 0 is a multiple of m, we have
x ≡ x (mod m).
(ii) ≡ is symmetric: obviously, x − y = km if and only if y − x = (−k)m,
hence
x ≡ y (mod m) ⇐⇒ y ≡ x (mod m).
(iii) ≡ is transitive: if x ≡ y (mod m) and y ≡ z (mod m), then there
exist integers k and n such that x − y = km and y − z = nm. But then,
x − z = (x − y) + (y − z) = km + nm = (k + n)m,
hence x ≡ z (mod m).
Examples 2.8.3.
1. Ignoring the variable y, the equation 3 = 21x + 30y (x, y integers) is written
3 ≡ 21x (mod 30), or
21x ≡ 3 (mod 30).
2. If m = 0, then since 0 is the only multiple of 0,
x≡y
(mod 0) ⇐⇒ x = y.
Hence, the congruence relation modulo 0 is the usual equality relation. At the
other extreme, if m = 1, then x ≡ y (mod 1), for all x, y ∈ Z (the integer x − y
is always a multiple of 1).
3. Any integer n is even or odd, i.e., n = 0 + 2k or n = 1 + 2k. Hence
n ≡ 0 (mod 2), or n ≡ 1
(mod 2).
According to the Division Algorithm, if m > 0 and x is any integer, then
x = mq + r, where q, r ∈ Z, and these integers are unique if 0 ≤ r < m.
In other words, x ≡ r (mod m), with 0 ≤ r < m, if and only if r is the
remainder of the division of x by m. Therefore, we have the following
generalization to any m > 0 of the example above for m = 2:
2.8. CONGRUENCES
111
Proposition 2.8.4. If m > 0, any x ∈ Z is congruent with exactly one
integer r in the set {0, 1, 2, . . . , m − 1}, where r is the remainder of the
division of x by m.
Example 2.8.5.
Any integer x satisfies exactly one of the following:
x ≡ 0 (mod 4), x ≡ 1 (mod 4), x ≡ 2 (mod 4) or x ≡ 3
(mod 4).
For a fixed m 6= 0, the set
r = {x ∈ Z : x ≡ r
(mod m)}
is called the congruence class or the residue class or simply the
residue of the integer r modulo m. We will see in the next chapter that
the set of congruence classes (mod m) is the support of the ring Zm , which
generalizes to any m the examples of the rings Z2 and Z3 which we have
already mentioned.
Equations that involve congruence relations can be dealt with just like
the ordinary algebraic equations for numbers, with the exception of the
“cancellation law” for the product. This is the content of the following
proposition:
Proposition 2.8.6. If x ≡ x′ (mod m) and y ≡ y ′ (mod m), then the
following properties hold:
(i) x ± y ≡ x′ ± y ′ (mod m);
(ii) xy ≡ x′ y ′ (mod m).
In particular, we have that:
(iii) x ≡ x′ (mod m) ⇔ x + a ≡ x′ + a (mod m);
(iv) x ≡ x′ (mod m) ⇒ ax ≡ ax′ (mod m).
Proof. By assumption, both x − x′ and y − y ′ are multiples of m. Therefore,
it is obvious that (x − x′ ) ± (y − y ′ ) = (x ± y) − (x′ ± y ′ ) are multiples of m,
i.e., x ± y ≡ x′ ± y ′ (mod m).
On the other hand, (x − x′ )y and x′ (y ′ − y) are still multiples of m.
Hence, (x − x′ )y − x′ (y ′ − y) = xy − x′ y ′ is a multiple of m, or equivalently,
xy ≡ x′ y ′ (mod m).
The proofs of the remaining statements are left for the exercises.
112
CHAPTER 2. THE INTEGERS
Example 2.8.7.
Since 10 ≡ 3 (mod 7) and 11 ≡ −3 (mod 7), according to Proposition 2.8.6:
10 + 11 ≡ 3 + (−3) (mod 7)
10 − 11 ≡ 3 − (−3) (mod 7)
10 · 11 ≡ 3 · (−3) (mod 7)
⇐⇒
⇐⇒
⇐⇒
21 ≡ 0
(mod 7),
−1 ≡ 6 (mod 7),
110 ≡ −9 (mod 7).
On the other hand, note that
4 · 5 ≡ 4 · 8 (mod 6),
but 5 6≡ 8 (mod 6). Hence, in general, the cancellation law for the product
fails: ax ≡ ax′ (mod m) does not imply that x ≡ x′ (mod m).
An important observation is that Proposition 2.8.6 uses precisely the
properties of hmi that make this set into an ideal. A corollary of this Proposition is the following, which is left as an exercise.
Corollary 2.8.8. If x ≡ y (mod m), then xn ≡ y n (mod m) for all natural
numbers n.
One of the most simple applications of Proposition 2.8.6 and of its Corollary is to obtain divisibility criteria in terms of the decimal expansion of x.
Example 2.8.9.
Letting m = 3, we observe that for any natural number k:
10 ≡ 1
(mod 3)
=⇒
10k ≡ 1
(mod 3).
For example, if x = 1998, we have that
1998 = 1 · 1000 + 9 · 100 + 9 · 10 + 8,
≡ 1 + 9 + 9 + 8 ≡ 27 ≡ 0 (mod 3).
We conclude that 1998 is divisible by 3, without using the Division Algorithm!
This example illustrates the following general divisibility criterium:
Proposition 2.8.10. A natural number n is divisible by 3 if and only if the
addition of all the digits of its decimal representation is divisible by 3.
We discuss in the exercises similar divisibility criteria for 2, 5, 9 and 11.
2.8. CONGRUENCES
113
Let us now turn to the study of linear equations of the form
ax ≡ b
(mod m).
Recall that the equation ax = b has integer solutions if and only if b is
multiple of a, in which case the solution if unique (except if a = 0). We
are interested in knowing for which values of b (in terms of a and m) does
the equation ax ≡ b (mod m) have solutions, and how can we find all its
solutions.
The problem of existence of solutions of ax ≡ b (mod m) is easily solved
using the notion of greatest common divisor:
Theorem 2.8.11. The equation ax ≡ b (mod m) has solutions if and only
if b is a multiple d = gcd(a, m).
Proof. Indeed, the equation ax ≡ b (mod m) has solutions if and only if
there exist integers x and y such that b − ax = my, i.e., b = ax + my.
In other words, ax ≡ b (mod m) has solutions precisely when b is a linear
combination of a and m with integers coefficients. As we saw in Section
2.6 (see Definition 2.6.9), the linear combinations of a and m with integers
coefficients are exactly the multiples of gcd(a, m).
The previous result shows that Euclides Algorithm can be used to decide
about the existence of solutions of the equation ax ≡ b (mod m). Actually,
since this algorithm allows to obtain d = gcd(a, m) as a linear combination
of a and m, it gives also one solution of ax ≡ d (mod m). This leads easily
to the solutions of the original equation, as we illustrate in the next example.
Example 2.8.12.
Consider the equation
15x ≡ b (mod 40).
Using Euclides Algorithm, we have
40 = 15 · 2 + 10, or 10 = 40 + 15 · (−2),
15 = 10 · 1 + 5, or 5 = 15 + 10 · (−1) = 15 · 3 + 40 · (−1),
10 = 5 · 2 + 0, hence 5 = gcd(15, 40) = 15 · 3 + 40 · (−1).
We conclude from Theorem 2.8.11 that the equation has a solution if and only
if b is a multiple of 5. These very same computations show that we have
5 = 15 · 3 + 40 · (−1).
Hence, x=3 is a solution of 15x ≡ 5 (mod 40). More generally, by Proposition
2.8.6, x = 3k is solution of 15x ≡ 5k (mod 40).
114
CHAPTER 2. THE INTEGERS
For example, if k = 2 we see that x = 6 is a solution of the equation 15x ≡ 10
(mod 40), because:
15 · 3 ≡ 5 (mod 40) =⇒ 15 · 3 · 2 ≡ 5 · 2 (mod 40),
Obviously, any integer x that verifies x ≡ 6 (mod 40) is also a solution of
15x ≡ 10 (mod 40), so this last equation has an infinite number of solutions.
However, the integers x ≡ 6 (mod 40) do not include all the solutions of 15x ≡
10 (mod 40). For example, x = −2 is also solution of 15x ≡ 10 (mod 40), but
−2 6≡ 6 (mod 40). This is again a consequence of the lack of a “cancellation
law” for the product, because
3 · 5 · (−2) ≡ 5 · 2
(mod 40) 6⇒ 6 ≡ −2 (mod 40).
We can look under what circunstancies this multiplicity of solutions is not
possible. Notice that Theorem 2.8.11 also shows that the equation ax ≡ b
(mod m) has solutions for all b precisely when gcd(a, m) = 1, i.e.., when
a and m are relatively prime. Obviously, this happens if and only if the
equation ax ≡ 1 (mod m) has solutions.
Definition 2.8.13. One says that a ∈ Z is invertible (mod m) if ax ≡ 1
(mod m) has a solution, i.e., if a and m are relatively prime. If c is a solution
of the equation ax ≡ 1 (mod m), we call c is an inverse (mod m) of a.
Examples 2.8.14.
1. The equation 4x ≡ b (mod 9) has solutions for avery b, since gcd(4, 9) = 1.
In particular, 4x ≡ 1 (mod 9) has the solution x = −2 because 4(−2) + 9 = 1.
Hence,
(a) −2 is an inverse of 4 (mod 9), and
(b) x = −2b is a solution of 4x ≡ b (mod 9), for every integer b.
2. Since gcd(21, 30) = 3, 21 is not invertible (mod 30).
Assume that x = c is a particular solution of ax ≡ b (mod m). As we
have observed, any integer x′ congruent with c (mod m) is also solution of
the equation:
x′ ≡ c (mod m) =⇒ ax′ ≡ ac ≡ b
(mod m).
It is possible that there are other solutions x′′ , that are not congruent with
c, i.e., we may have
ax′′ ≡ b
(mod m) with x′′ 6≡ c (mod m).
2.8. CONGRUENCES
115
However, when a and m are relatively prime this is impossible, due to the
following “cancellation law”:
Theorem 2.8.15. If a and m are relatively prime,
ax ≡ ay
(mod m)
⇐⇒
x≡y
(mod m).
Proof. We know already that
x≡y
(mod m) =⇒ ax ≡ ay.
On the other hand, let c be an inverse (mod m) of a, so that ac ≡ ca ≡ 1
(mod m). Then
ax ≡ ay
(mod m) =⇒ c(ax) ≡ c(ay)
=⇒ (ca)x ≡ (ca)y
=⇒ x ≡ y
(mod m) (Proposition 2.8.6),
(mod m) (associativity),
(mod m) (ca ≡ 1
(mod m).
We conclude from Theorems 2.8.11 e 2.8.15 that
Theorem 2.8.16. If a and m are relatively prime and b ∈ Z, then:
(i) The equation ax ≡ b (mod m) has at least one solution c ∈ Z;
(ii) ax ≡ b (mod m) ⇒ x ≡ c (mod m).
Therefore, when a and m are relatively prime the solution of the linear
equation ax ≡ b (mod m) is very simple. We illustrate it in the following
example.
Example 2.8.17.
Since gcd(4, 7) = 1 and 4 · 2 ≡ 1 (mod 7), we conclude that the solutions of
4x ≡ 1 (mod 7) (the inverses of 4 (mod 7)) are precisely the integers which
satisfy x ≡ 2 (mod 7), i.e., the integers of the form x = 2 + 7k. For example,
if b = 3, we have
4x ≡ 3 (mod 7)
⇐⇒
x ≡ 6 (mod 7).
When a and m are not relatively prime, it is still easy to determine all
the solutions of the equation
ax ≡ b
(mod m).
For that, recall that if a = a′ d and m = m′ d with d = gcd(a, m) 6= 0, then
a′ and m′ are relatively prime.
116
CHAPTER 2. THE INTEGERS
Example 2.8.18.
Let us find all solutions of the equation
15x ≡ 10 (mod 40).
Obviously, this equation is equivalent to 15x = 10 + 40y and this last equation
can be divided by gcd(15, 40) = 5, giving 3x = 2 + 8y, or 3x ≡ 2 (mod 8).
Hence, we have:
15x ≡ 10 (mod 40)
⇐⇒
3x ≡ 2
(mod 8),
where in the last equation obviously gcd(3, 8) = 1. We have already seen that
x = 6 is a solution of the original equation, and therefore also of 3x ≡ 2
(mod 8). It follows from Theorem 2.8.16, that the solutions of the last equation
are the integers which satisfy x ≡ 6 (mod 8).
We conclude that the solutions of 15x ≡ 10 (mod 40) are the integers of the
form x = 6 + 8k. In particular, this equation has 5 solutions that are not
congruents (mod 40), namely, x = 6, 14, 22, 30, 38 (the solution that we found
before, x = −2, is congruent to x = 38 (mod 40)).
We discuss in the exercises how to determine the number of solutions of
ax ≡ b (mod m) that are not congruent (mod m).
One can also solve systems of the form
x≡a
(mod m),
(2.8.1)
x≡b
(mod n).
The following result was found by the Chinese mathematician Sun-Tsu in
the 1st Century AD! For that reason, it is known as the Chinese Remainder
Theorem.
Theorem 2.8.19 (Chinese Remainder Theorem). The system (2.8.1) has
solutions for all a and b if and only if m and n are relatively prime. In this
case, if c is a solution, then (2.8.1) is equivalent to x ≡ c (mod mn).
Proof. It is obvious that the integers of the form x = a+ym are the solutions
of the equation x ≡ a (mod m). Hence x = a + ym is a solution of (2.8.1)
if and only if
x≡b
(mod n)
⇒
a+ym ≡ b
(mod n)
⇒
my ≡ b−a
(mod n).
We saw above that this last equation has a solution for all a and b if and
only if m and n are relatively prime. In this case, Theorem 2.8.16 shows that
2.8. CONGRUENCES
117
the solutions are the integers of the form y = y ′ + zn, where y ′ is any fixed
solution, and z is arbitrary. We conclude that the solutions of the system
(2.8.1) are the integers of the form
x = a + (y ′ + zn)m = c + z(nm),
where c = a + y ′ m, i.e., are the solutions of x ≡ c (mod mn).
The case where m and n are not relatively prime is dealt with in an
exercise at the end of the section.
Example 2.8.20.
Consider the system
x≡1
x≡2
(mod 4),
(mod 9).
The solutions of the first equation are the integers of the form x = 1 + 4y.
They should also satisfy the second equation:
1 + 4y ≡ 2 (mod 9) ⇐⇒ 4y ≡ 1
(mod 9) with solution y = −2
(we saw that −2 is inverse of 4 (mod 9)). We conclude that c = 1+4(−2) = −7
is a particular solution of the system. By the Chinese Remainder Theorem, the
system is equivalent to the equation x ≡ −7 (mod 36), so the solutions are the
integers of the form x = −7 + 36z, with z ∈ Z.
The equations studied in this section have as unknowns integers. Such
equations are called Diophantine 15 . Although, as we saw above, solving linear Diophantine equations is relatively simple, some of the hardest (and more
famous) problems in Mathematics concern non-linear Diophantine equations. Perhaps the most famous one is “Fermat’s Last Theorem” concerning
the integers solutions x, y and z of the equation
xn + y n = z n .
Although this equation has an infinite number of solutions when n=2 (e.g,
x = 3, y = 4 and z = 5), solutions for n > 2 were never found. Fermat
wrote in the margins of a copy of the Arithmetica of Diophantus, next to
15
After Diophantus of Alexandria, a Greek mathematician of the III Century, author
of a famous series of books called Arithmetica, most of which are now lost. This treaty
included, among many things, solutions (some of there extremely ingenious!) of algebraic
equations. Diophantus was only interested in rational solutions, and called the irrational
ones “impossible”.
118
CHAPTER 2. THE INTEGERS
the discussion of the Pythagoras’s theorem, that he knew how to prove the
non-existence of solutions for n > 2, but that the margin was too small to
describe the argument. Fermat’s proof remains unknown to these days and,
in fact, it took more that 300 years, and the efforts of many generations
of mathematicians, to reach a complete solution of Fermat’s Last Theorem.
The proof, due to the american mathematician Andrew Wiles16 , is no doubt
one of the greatest discovers of Mathematics. The degree of sophistication
of the proof, which culminates the works of many great mathematicians
for more than 200 years, makes it one of the most elaborate intellectual
constructions of all the humanity.
Exercises.
1. Find all integers solutions of 21x + 30y = 9.
2. For which integers b does the equation 533x ≡ b (mod 4141) have solutions?
3. State and prove criteria of divisibility by 2, 5, 9 and 11, in terms of the decimal
representation of a natural number n.
4. Given some natural number k, find the remainder of the division of 3k be 7.
5. Determine all solutions of xy ≡ 0 (mod 12).
6. For a, m ∈ Z, with m 6= 0, show that exactly one of following holds true:
(a) ax ≡ 1 (mod m) has solutions (so gcd(a, m) = 1), or
(b) ax ≡ 0 (mod m) has solutions x 6≡ 0 (mod m) (so gcd(a, m) 6= 1).
7. Assume that gcd(a, m) = d and m = dn. Show that:
(a) ax ≡ 0 (mod m) has d solutions x, with 0 < x ≤ m, namely x =
n, 2n, . . . , dn;
(b) ax ≡ b (mod m) has either 0 or d solutions x, with 0 < x ≤ m.
8. Show that the equation x2 + 1 ≡ 0 (mod 11) has no solutions.
16
Andrew Wiles announced in the summer of 1993 a proof of Fermat’s Last Theorem,
but it was soon recognized that a crucial step was missing. Finally, in September of 1994,
Wiles together with Richard Taylor found an argument which allowed to circumvent the
missing step. The correct proof was published in the paper “Modular elliptic curves
and Fermat’s last theorem”, Ann. of Math. 141 (1995), no. 3, 443–551, and refers to
the following paper in the Annals, which is precisely a joint paper of Wiles and Taylor,
“Ring-theoretic properties of certain Hecke algebras”, Ann. of Math. 141 (1995), no. 3,
553–572.
2.9. PRIME FACTORIZATIONS AND CRYPTOGRAPHY
119
9. Determine which numbers between 1 and 8 have an inverse (mod 9).
10. Find all solutions of the equation x2 + 1 ≡ 0 (mod 13).
11. Show that x5 − x ≡ 0 (mod 30) for for every integer x.
12. Show that, if a ≡ a′ (mod m), then gcd(a, m) = gcd(a′ , m).
13. Find the values of c ∈ Z for which the following system has solutions:
x≡c
(mod 14)
2x ≡ 10
(mod 42)
14. Solve the system of equations
2x + 3y ≡ 3
3x + y ≡ 4
(mod 5)
(mod 5).
15. Which day of the week was March 15, 1800?
(Hint: In the current Western calendar, called the Gregorian calendar, a leap
year is one which is divisible by 4, but not 100, or divisible by 400. For example,
the years 1700, 1800, and 1900 are not leap years, but 1600 and 2000 are).
16. Five shipwreck survivors arrive at an island where they find a chimpanzee.
After they spend one day picking up coconuts, they decide to leave the divison
of the coconuts for the next day. During the night each survivor wakes up and
goes pickup what he thought was his share of the coconuts. Each of them finds
that he cannot divide the coconuts by 5, since there is always one leftover,
which they leave for the chimpanzee. The next day, they decide to split the
coconuts that were left from their night incursions and now they find that they
can split them exactly among them. Knowing that that there were at most 10
000 coconuts in the island, how many coconuts did they pickup?
2.9
Prime Factorizations and Cryptography
We will now describe, as an example of the theory we have developed so far,
an application to Cryptography. We start with some preliminary results.
We will denote by Ckn the “number of arrangements of n elements in
groups of k”, so that:
k!(n − k)!Ckn = n!.
Since n|n!, this equation and Euclides Lemma, imply that whenever n = p
is a prime and 0 < k < p, we have p|Ckp . We state this as a lemma for future
reference:
120
CHAPTER 2. THE INTEGERS
Lemma 2.9.1. If p is a prime and 0 < k < p, then Ckp ≡ 0 (mod p).
Recall that the well-known Newton’s binomial formula states that:
(2.9.1)
(a + b)n =
n
X
Ckn ak bn−k .
k=0
(this formula is actually valid in any commutative ring, as can easily be see
by applying induction). When n = p is prime, it leads to:
Proposition 2.9.2 (“The Freshman’s Formula”). If p is prime, (a + b)p ≡
ap + bp (mod p).
Proof. By Lemma 2.9.1, the only terms in the binomial expansion that may
fail to be divisible by p correspond to k = 0 and k = n.
Theorem 2.9.3 (Fermat). If p is prime, ap ≡ a (mod p).
Proof. Let us prove this result by induction in a. The result is obvious if
a = 0. Assuming it is true for some integer a ≥ 0, we have that (a + 1)p ≡
ap + 1 (mod p), by the freshman’s formula, and that ap ≡ a (mod p), by
the induction hypoteshis. We conclude that (a + 1)p ≡ a + 1 (mod p), so
the result is true for any integer a ≥ 0. Since (−1)p ≡ (−1) (mod p) for any
prime p (why?), the result holds for all integers.
An interesting corollary of this result is a practical method to find
inverses (mod p).
Corollary 2.9.4. If p is prime and a 6≡ 0 (mod p), then ap−1 ≡ 1 (mod p).
In particular, ap−2 is the inverse (mod p) of a.
Proof. Obviously, since p is a prime, gcd(a, p) must be 1 or p, so that either
a ≡ 0 (mod p) or gcd(a, p) = 1. Since the first case is excluded, we must
have gcd(a, p) = 1 and a is invertible (mod p). Using the Fermat’s theorem,
we conclude that
ap = ap−1 a ≡ a
(mod p)
⇒
ap−1 ≡ 1
(mod p).
We can now describe a public key encryption mechanism, which is particularly ingenious. The reason for this name is that the encryption method
2.9. PRIME FACTORIZATIONS AND CRYPTOGRAPHY
121
can be known to everyone, but the decryption method is kept secret. Variations of this method of encryption are used, e.g., to make secure financial
transactions in the Internet. 17 .
The main ingredients are a natural number N = pq, which is the product
of two distinct primes p and q, and another natural number r, relatively
prime to both p − 1 and q − 1. The numbers N and r are public, but
the prime factors of N are kept secret. Hence, this system explores the
possibility of generating very large prime numbers, together with our lack of
knowledge of a practical, efficient, algorithm to determine the prime factors
of large natural numbers.
Now the encryption proceeds as follows: assume that we would like to
transmit numbers a such that 0 ≤ a < N . Instead of transmitting a,
one sends the remainder of the division of ar by N , which we denote by
b = ρ(ar , N ). The decryption amounts to finding a given b = ρ(ar , N ). This
computation can only be done efficiently if one knows the prime factors of
N , in which case it is a simple application of our previous results, as we now
explain.
On the one hand, setting c = ρ(b, p) and d = ρ(b, q), it is clear that
c ≡ ar
(mod p)
d ≡ ar
and
(mod q).
On the other hand, since r is relatively prime to both p − 1 and q − 1, there
exist integers x, y, x′ , y ′ such that
1 = rx + (p − 1)y = rx′ + (q − 1)y ′ .
It follows from the corollary above that
a = arx+(p−1)y ≡ (ar )x ≡ cx
rx′ +(q−1)y ′
a=a
x′
≡ (ar )
(mod p),
x′
≡d
(mod q).
Note that we can find x and x′ directly by applying Euclides Algorithm.
Also, interpreting negative powers of a as positive powers of an inverse of a,
it is clear that one can always choose x and x′ to be both positive.
We conclude that the original message a satisfies the system
a ≡ (ρ(b, p))x (mod p)
.
′
a ≡ (ρ(b, q))x (mod q)
17
One of the most commonly used methods is the so-called RSA encryption method
(discovered by Rivest, Shamir and Adlemman). Such a public system, besides the encryption mechanism that we describe, includes also a simple signature verification scheme,
which is crucial for private communications via public channels.
122
CHAPTER 2. THE INTEGERS
According to the Chinese Remainder Theorem, this system determines uniquely the number a (mod pq), i.e., (mod N ). Since we assumed 1 ≤ a < N ,
we have found a. Solving this system requires only the knowledge of an
inverse of p (mod q), and this can again be achieved by applying Euclides
Algorithm. We illustrate the method in the following example.
Example 2.9.5.
Let N = 21 = 3·7 = p·q. The number r = 5 is relatively prime to p−1 = 2 and
q−1 = 6. Let us assume that we wish to transmit the message a = 4. According
to the method just described, instead of transmitting a, one send instead the
remainder of the division of ar = 1024 by N = 21, that is b = ρ(1024, 21) = 16
(since 1024 = 48 · 21 + 16).
Now assume that the message b = 16 reaches the receiver, who is wishes to
decrypt it and find a. Since
1 = 5 · (−1) + 2 · 3 = 5 · (−1) + 6′ · 1,
we conclude that x = −1 and x′ = −1. On the other hand, the numbers c and
d are given by
c = ρ(b, p) = 1, (since 16 = 3 · 5 + 1)
d = ρ(b, q) = 2, (since 16 = 7 · 2 + 2).
Hence, the number a satisfies the system
a ≡ (1)−1 (mod 3)
.
a ≡ (2)−1 (mod 7)
Noting that 2 · 4 = 8 ≡ 1 (mod 7), the inverse of 2 (mod 7) is 4, and we
conclude that a satisfies the system
a ≡ 1 (mod 3)
.
a ≡ 4 (mod 7)
The solutions of the first equation are a = 1 + 3n, with n ∈ Z. Replacing in
the second equation, one finds
1 + 3n ≡ 4 (mod 7)
⇒
3n ≡ 3
(mod 7).
In order to solve this last equation we note that the inverse of 3 (mod 7) is 5
(since 3 · 5 = 15 ≡ 1 (mod 7)), so that:
n ≡ 3 · 5 = 15 (mod 7).
The Chinese Remainder Theorem now gives that a ≡ 1 + 3 · 15 = 46 (mod 21).
Since 0 ≤ a < 21, the receiver concludes that the message transmitted was
a = 4.
2.9. PRIME FACTORIZATIONS AND CRYPTOGRAPHY
123
Exercises.
1. A word is codified so that each letter of the the (english) alphabet corresponds to a natural number: a 7→ 1, b 7→ 2, c 7→ 3, . . . . Then the message
“10,21,23,10,16,1” is transmitted in a system of public key with N = 35 and
r = 5. What was the original word?
2. Prove the following generalization of the Chinese Remainder Theorem: If
d = gcd(m, n) the system
x≡a
(mod m)
x≡b
(mod n)
has solutions if and only if d|(a − b). In this case, if c is a solution, then the
system is equivalent to x ≡ c (mod lcm(n, m)).
124
CHAPTER 2. THE INTEGERS
Chapter 3
Other Examples of Rings
3.1
The Rings Zm
We have studied the ring Z in detail in Chapter 2. You should also be
familiar with the fields Q, R or C, and their propertied. In this chapter we
will study other important examples of rings.
We start by studying the rings associated with the congruence (mod m),
the rings Zm . In Chapter 1 we have seen briefly the cases Z2 and Z3 ,
without making any reference to the congruence (mod m). However, a more
systematic study of these rings requires using these congruence relations. Let
us start by observing that the congruence (mod m) can be replaced by an
equality if we “identify” (i.e., treat as unique element) all the integers which
are congruent between themselves. The method we use is actually valid for
any equivalence relation and consists in replacing one element by the class
made of all the elements which are equivalent to it.
Let us assume then that we have fixed a modulus of congruence m ∈ Z.
For any x ∈ Z we introduce:
Definition 3.1.1. The equivalence class (mod m) of x is the set x
formed by all integers congruent with x (mod m). In other words:
x = {y ∈ Z : x ≡ y
(mod m)}.
Example 3.1.2.
For the congruence with modulus m = 3, we have:
0 = {0, ±3, ±6, . . . },
1 = {1, 1 ± 3, 1 ± 6, . . . },
2 = {2, 2 ± 3, 2 ± 6, . . . }.
125
126
CHAPTER 3. OTHER EXAMPLES OF RINGS
Obviously the notation x is ambiguous since it does not contain any
information concerning the modulus of congruence m it refers to. However,
it is clear that
x = {x + ym : y ∈ Z},
or that:
x = {x + z : z ∈ hmi}.
For this reason, whenever necessary to clarify the modulus of congruence,
we will write x + hmi instead of x.
Example 3.1.3.
For m = 4 and m = 5 we have, respectively,
m=4:
m=5:
3 = 3 + h4i = {. . . , −5, −1, 3, 7, 11, . . . },
3 = 3 + h5i = {. . . , −7, −2, 3, 8, 13, . . . }.
In order to replace the congruence x ≡ y (mod m) by an equality we
will use the following lemma. Since it is a direct consequence of Proposition
2.8.2, the lemma is actually valid for any equivalence relation.
Lemma 3.1.4. For any integers x, y, the following statements are equivalent:
(i) x ≡ y (mod m);
(ii) x = y
(iii) x ∩ y 6= ∅.
Proof. We show that (i) ⇒ (ii) ⇒ (iii) ⇒ (i).
To prove that (i) ⇒ (ii), observe that if x ≡ y (mod m), the transitivity
property of the congruence relation, gives that x ⊇ y. But, by the symmetry
property, x ≡ y (mod m) if and only if y ≡ x (mod m), hence also x ⊆ y.
We conclude that x = y.
To prove that (ii) ⇒ (iii), we observe that the reflexivity property gives
that y ∈ y. Hence, if x = y, then x ∩ y 6= ∅.
Finally, to prove that (ii) ⇒ (iii), assume that z ∈ x ∩ y 6= ∅. From
the definition of equivalence class, we have that x ≡ z (mod m) and y ≡ z
(mod m). By the symmetry and transitivity properties, it follows that x ≡ y
(mod m).
3.1. THE RINGS ZM
127
According to the reflexivity property, an integer x belongs to the class
x, and hence the union of all the equivalence classes is the set Z. Moreover,
according to the previous lemma, distinct equivalence classes are necessarily
disjoint. For this reason we say the set of all equivalence classes for a given
modulus m, i.e., the set {x : x ∈ Z}, is a partition of Z. We will use
the notation introduced in the previous chapter and call x a congruence
class.
Examples 3.1.5.
1. When m = 2, the resulting partition of Z if the usual classification of the
integers into even and odd numbers.
2. When m = 3, the resulting partition amounts to the classification of integers
in terms of the remainder of the division by 3:
Z = {0, ±3, ±6, . . . } ∪ {1, 1 ± 3, 1 ± 6, . . . } ∪ {2 ± 3, 2 ± 6, . . . }.
Obviously, an equivalence class x is uniquely determined once we know
one of its members. For this reason, any integer y in x is called a representative of the class x, since we have y = x.
Example 3.1.6.
If m = 3, the integers 1, 4, 7, −2 and −5 are all representatives of 1, and we
have
1 = 4 = 7 = −2 = −5.
Given an equivalence relation “∼” in a set X, the corresponding set of
equivalence classes is called the quotient of X by ∼ and denoted by X/ ∼.
The map π : X → X/ ∼ given by π(x) = x, which associates to each element
of X its equivalence class, is called the quotient map. In the case where
X = Z and ∼ is the equivalence relation modulo m, we denote the quotient
set Z/ ∼ by Zm , and the quotient map by πm , or simply by π. Therefore,
we have πm (x) = x + hmi = x. More formally:
Definition 3.1.7. For a fixed modulus m > 0, we set:
Zm = {x : x ∈ Z},
πm : Z → Z m
x 7→ x.
In this notation, Proposition 2.8.4 amounts simply to count the number
of elements of Zm :
128
CHAPTER 3. OTHER EXAMPLES OF RINGS
Proposition 3.1.8. If m > 0, Zm = {0, 1, . . . , m − 1} has m elements.
Notice that if m = 0, then x = {x}, and hence Z0 is an infinite set.
Actually, for the algebraic operations that we will define shortly, Z0 and Z
are isomorphic rings.
Example 3.1.9.
The set Z6 has 6 elements, and we can write
Z6 = {0, 1, 2, 3, 4, 5} = {6, 7, 8, 9, 10, 11} = {36, −5, 2, 63, 610, −19}, etc.
Notice the ambiguity in notation: when we write Z4 = {0, 1, 2, 3}, the symbols
in this list denote objects that are not elements of Z6 . For example,
2 + h4i 6= 2 + h6i i.e., π4 (2) 6= π6 (2).
According to Proposition 2.8.6, if x ≡ x′ (mod m) and y ≡ y ′ (mod m),
then x + y ≡ x′ + y ′ (mod m) and xy ≡ x′ y ′ (mod m). In terms of equivalence classes, we have
x = x′ and y = y ′
=⇒
x + y = x′ + y ′ and xy = x′ y ′ .
In other words, the classes x + y and xy do not depend on the choice of
representatives x and y, but only on the congruence classes x and y. We
can use this fact to define operations of addition and product in Zm .
Definition 3.1.10. The addition and product in Zm are defined by
x + y = x + y,
and
x · y = xy.
As one could expect, some of the properties of the algebraic operations in
Z yield similar properties for the algebraic operations in Zm . For example:
x + (y + z) = x + y + z,
= x + (y + z),
= (x + y) + z,
= x + y + z,
= (x + y) + z,
so we conclude that addition in Zm is associative. We leave as an exercise
to check that:
3.1. THE RINGS ZM
129
Theorem 3.1.11. (Zm , +, ·) is an abelian ring with identity.
Note, however, that some properties of Z do not transfer to the ring Zm .
Example 3.1.12.
The addition and multiplication tables in Z4 are:
+
0
1
2
3
0
0
1
2
3
1
1
2
3
0
2
2
3
0
1
·
0
1
2
3
3
3
0
1
2
0
0
0
0
0
1
0
1
2
3
2
0
2
0
2
3
0
3
2
1
Now observe the following differences between Z4 and Z:
• The equation x = −x has two solutions in Z4 but only one in Z;
• Since 2 · 2 = 0 it follows that 2 is a zero divisor, so Z4 is not an integral
domain;
• The natural multiples of the identity 1 are
1 · 1 = 1, 2 · 1 = 2, 3 · 1 = 3, 4 · 1 = 4 = 0, etc.
and so this ring has characteristic 4;
• There are 3 subrings of Z4 , which are all principal ideals (as in the ring
of integers):
h1i = h3i = Z4 ,
h2i = {0, 2},
and
h0i = h4i = {0}.
Note that these ideals correspond precisely to the divisors of 4.
Returning to the general case of the ring Zm , with m > 0, let us identify
the invertible elements of Zm , which is denoted by Z∗m (according to the
notation introduced in Chapter 1). Given a ∈ Z, a is invertible in Zm if
and only if the equation a · x = 1 has a solution in Zm . From the results in
Section 2.8, we have that:
a ∈ Z∗m ⇐⇒ a · x = 1 has a solution in Zm ,
⇐⇒ ax ≡ 1 (mod m) has solutions in Z,
⇐⇒ gcd(a, m) = 1.
Hence, the invertible elements of Zm correspond to the natural numbers k,
1 ≤ k ≤ m, that are relatively prime to m. The number of elements of
Z∗m , is denoted by ϕ(m), and the resulting function ϕ : N → N is called the
Euler function.
130
CHAPTER 3. OTHER EXAMPLES OF RINGS
Example 3.1.13.
The invertible elements in the ring Z9 form the set
Z∗9 = {1, 2, 4, 5, 7, 8},
hence, ϕ(9) = 6.
We will see later how to compute the value of the Euler function ϕ(m)
from the prime factorization of m. For now we observe that for any prime p
we have ϕ(p) = p − 1, and that all the non-zero elements of Zp are invertible:
Theorem 3.1.14. If p is a prime, then Zp is a finite field with p elements.
The characteristic of the rings Zm is also very easy to find. We already
observed that Z4 has characteristic 4. In fact:
Theorem 3.1.15. The ring Zm has characteristic m.
Proof. Using induction, we have that
∀n ∈ N, a ∈ Z : na = n · a = na.
(3.1.1)
In particular for a = 1, we have
n1 = 0
⇐⇒
n=0
⇐⇒
n ∈ hmi,
so the result follows.
We have indicated in the example above all the subrings and ideals of Z4 .
In order to extend this to any value of m, let us start by looking at the ideals
generated by each of the elements of Zm . The following proposition follows
immeadately from the commutativity of the product in Zm and (3.1.1).
Proposition 3.1.16. If a ∈ Z, then hai = {a · n : n ∈ Zm } = {na : n ∈ Z}.
Hence, given a generator, it is very easy to list the elements of an ideal
of Zm .
Examples 3.1.17.
1. In Z40 we have:
h15i = {15, 30, 45 = 5, 20, 35, 50 = 10, 25, 40 = 0}.
3.1. THE RINGS ZM
131
2. In Z21 we have:
h15i = {15, 30 = 9, 24 = 3, 18, 33 = 12, 27 = 6, 21 = 0}.
Looking more carefully, one finds that the elements of the ideal hai correspond to the integers b for for which the equation ax ≡ b (mod m) has
solutions. This integers are, as we know, the multiples of d = gcd(a, m):
Proposition 3.1.18. If d = gcd(a, m), then we have que hai = hdi in Zm .
Proof. On the one hand, since d = ax + my, we have d = ax, so that d ∈ hai
and:
hai ⊃ hdi.
On the other hand, a = dz and we have a = dz, so that a ∈ hdi, and:
hdi ⊃ hai.
This result gives immediately the number of elements of hai: if d =
gcd(a, m), then m = dk, and hai = hdi has k elements1 .
Examples 3.1.19.
1. In Z40 we have:
h15i = h5i = {5, 10, 15, 20, 25, 30, 35, 0},
with
40
5
= 8 elements.
2. In Z21 we have
with
21
3
h15i = h3i = {3, 6, 9, 12, 15, 18, 0},
= 7 elements.
In Z4 all subrings are principal ideals, just like for the ring of integers.
In order to see that this is a general property of the rings Zm . we start
by relating the subrings of Zm and the subrings of Z, via the quotient map
π : Z → Zm .
1
This shows that the number of elements of the ideal hai is a factor of the number
of elements the ring Zm . We will see in the next chapter that this is a special case of
Lagrange’s Thoerem.
132
CHAPTER 3. OTHER EXAMPLES OF RINGS
Z
Zm
J
hmi
PSfrag Ireplaements
e~
e
Figure 3.1.1: Subrings of Z and of Zm .
Proposition 3.1.20. If I is um subring of Zm and J = π −1 (I), then I is
a subring of Zm if and only if J is a subring of Z that contains hmi.
Proof. Note that:
J = π −1 (I) = {a ∈ Z : a ∈ I}.
We will show that if I is a subring then J is also a subring, leaving the rest
as an exercise.
Obviously, if a, b ∈ J, then a, b ∈ I. Hence:
a − b = a − b ∈ I =⇒ a − b ∈ J,
a · b = ab ∈ I =⇒ ab ∈ J.
Corollary 3.1.21. If I is a subset of Zm , the following statements are
equivalent:
(i) I is a subring of Zm ;
(ii) I is an ideal of Zm ;
(iii) there exists d ∈ Z such that d|m and I = hdi.
Proof. The implications (iii)⇒(ii)⇒(i) are obvious. Therefore, it remains to
prove that (i)⇒(iii), which we leave for the exercises.
It follows from this corollary that Zm has exactly one subring (which is
necessarily a principal ideal) for each divisor of m. One can use Proposition
3.1.18 to find the generators of each of these ideals.
3.1. THE RINGS ZM
133
h1i
h1i
h5i
h2i
h5i
h2i
h10i
h4i
h10i
h4i
h8i
h8i
h20i replaements
PSfrag
h0i
h20i
h0i
Figure 3.1.2: The ideals of Z40 .
Example 3.1.22.
The generators of h1i = Z40 correspond to the solutions of gcd(x, 40) = 1:
h1i = h3i = h7i = h9i = h11i = h13i = h17i = h19i =
h21i = h23i = h27i = h29i = h31i = h33i = h37i = h39i.
The generators of h2i correspond to the solutions of gcd(x, 40) = 2:
h2i = h6i = h14i = h18i = h22i = h26i = h34i = h38i.
The generators of h4i correspond to the solutions of gcd(x, 40) = 4:
h4i = h12i = h28i = h36i.
The generators of h8i correspond to the solutions of gcd(x, 40) = 8:
h8i = h16i = h24i = h32i.
The generators of h5i correspond to the solutions of gcd(x, 40) = 5:
h5i = h15i = h25i = h35i.
The generators of h10i correspond to the solutions of gcd(x, 40) = 10:
h10i = h30i.
The ideals h20i and h0i obviously have a unique generator.
Assume now that n|m, and that B is a subring of Zm with n elements.
We have the following natural questions:
• Is the additive group B isomorphic to the group Zn ?
• Is the ring B always isomorphic to the ring Zn ?
134
CHAPTER 3. OTHER EXAMPLES OF RINGS
The first of these questions is very easy to answer:
Proposition 3.1.23. If n|m and B is a subring of Zm with n elements,
then the additive groups B and Zn are isomorphic.
Proof. Let m = dn, so that B = hdi. We define φ : Zn → Zm by setting
φ(x) = dx, where here x ∈ Zn , and dx ∈ Zm .2
Notice that one must show that φ is a well defined function: one must
show that the right side of φ(x) = dx does not depend on the choice of the
integer x representing the class x on the left side. For that it is enough to
observe that
bx = x′ in Zn
⇐⇒
n|(x − x′ )
′
⇐=
=⇒ m = dn|(dx − dx ) =⇒
dx = dx′ in Zm .
Now it is immediate to check that φ is a group homomorphism and also
that
φ(Zn ) = {dx : x ∈ Z} = hdi = B.
It remains to show that φ is an isomorphism, i.e.that φ is injective. For that
we need only to compute its kernel:
φ(x) = 0
⇐⇒
⇐⇒
dx = 0 (in Zm )
n|x
⇐⇒
⇐⇒
dn = m|dx
x = 0 (in Zn ).
⇐⇒
Since the kernel of φ is trivial, φ is an isomorphism between B and Zn .
We will see later that this result is a general property of the so called
cyclic groups. The second question above, concerning ring isomorphisms is
not so simple. We illustrate its complexity with a few more examples.
Examples 3.1.24.
1. Consider in Z4 the subring B = h2i = {2, 0}. We have just seen that
(B, +) ≃ (Z2 , +). However, the rings B and Z2 are not isomorphic, since
the product in B is always zero: x, y ∈ B ⇒ xy = 0.
2. Consider in Z6 the subrings B = h2i = {2, 4, 0}, and C = h3i = {3, 0}.
Again, we have (B, +) ≃ (Z3 , +) and (C, +) ≃ (Z2 , +), in this case, these are
also ring isomorphisms, although this is not obvious.
The following proposition gives a complete answer to this question:
2
We could also wrote, maybe more precisely, that φ(πn (x)) = πm (dx).
3.1. THE RINGS ZM
135
Proposition 3.1.25. If m = nd and B = hdi is the subring of Zm with n
elements, then the following statements are equivalent:
(i) The rings B and Zn are isomorphic,
(ii) The ring B is unitary,
(iii) gcd(n, d) = 1.
In this case, the identity of B is the unique x ∈ Zm such that
x ≡ 0 (mod d)
, and
x≡1
(mod n).
Example 3.1.26.
The ring Z36 has 9 subrings, because 36 has 9 divisors. With the exception of
the trivial subrings h0i = {0} and h1i = Z36 , only the subrings B = h4i, with 9
elements, and C = h9i, with 4 elements, have identity.
The solution of the system x ≡ 0 (mod 4) and x ≡ 1 (mod 9) is x ≡ 28
(mod 36), and hence the identity of B is x = 28. Similarly, the solution of
x ≡ 0 (mod 9) and x ≡ 1 (mod 4) is x ≡ 9 (mod 36), and hence the identity
of C is x = 9.
Exercises.
1. Prove Theorem 3.1.11.
2. Show that the only subrings of Z4 are h1i, h2i e h0i.
3. Show that, if m > 1, then Zm is either a field or has zero divisors.
4. Show that na = na (in particular, n = n1).
5. Show that, if n > 1, then Mn (Zm ) is a non-abelian ring, with characteristic
2
m, and mn elements.
6. Consider the ring consisting of all maps f : Z → Zm and show that for any
m > 1 there exist infinite rings with characteristic m.
7. Show that A ∈ Mn (Zm ) is invertible if and only if det(A) ∈ Z∗m 3 .
3
The determinant of a matrix A = (aij ) of size n × n with entries in a commutative
ring is defined as usual by:
X
det A =
sgn(π)a1π(1) · · · anπ(n) .
π∈Sn
136
CHAPTER 3. OTHER EXAMPLES OF RINGS
8. Give an example of a finite vector space over a finite field (compare Rn with
Znp ).
9. Determine all matrices in GL(2, Z2 ) (2 × 2 invertible matrices with entries in
Z2 ).
10. What is the cardinal of GL(n, Zp ), if p is prime? (Hint: Note that the
rows of the matrix M ∈ GL(n, Zp ), that are vectors of Znp , must be linearly
independent.)
11. Find the inverse of the matrix


1 0 0
 2 3 4  ∈ GL(3, Z5 ).
3 2 4
12. Solve the system
in Z5 .
x + 2y
−3x + 3y
= a
= b
13. Show that hai = Zm if and only if a ∈ Z∗m .
14. Complete the proofs of Proposition 3.1.20 and of Corollary 3.1.21.
15. Find all the elements of the ideal h85i ⊂ Z204 . Which ones are generators
of this ideal?
16. How many elements does the ideal h28, 52i ⊂ Z204 have?
17. Determine all the ideals of Z30 . Which of these ideals are unitary rings, and
what are the respective identities?
18. If p is a prime number, and n ∈ N, show that ϕ(pn ) = pn − pn−1 . Hint:
Show that x ∈ Z∗pn ⇐⇒ x 6∈ hpi.
19. Find ϕ(3000). Hint: Show that
x ∈ Z∗3000 ⇐⇒ x 6∈ (h2i ∪ h3i ∪ h5i).
20. Assume that d = gcd(a, m), m = dn, and φ : Zm → Zm is given by
φ(x) = ax.
(a) Show that φ is a homomorphism of groups.
(b) Show that the kernel of φ is hni, and φ(Zm ) has n elements.
3.2. FRACTIONS AND RATIONAL NUMBERS
137
(c) Assuming that m = 12, for which a is φ a group automorphism?
(d) Assuming that m = 12, for which a is φ a ring homomorphism?
21. Assuming that n|m, show that the function φ : Zm → Zn given by φ(x) = x,
i.e., φ(πm (x)) = πn (x), with x ∈ Z,is well defined, and that it is a homomorphism of rings. What is its kernel?
22. Prove Proposition 3.1.25 proceeding as follows:
(a) Show that “(i) =⇒ (ii)”.
(b) Show that “(ii) =⇒ (iii)”, by showing first that if a is the identity of B
then hai = hdi, and a2 = a. Conclude that a ≡ 0 (mod d), and a ≡ 1
(mod n).
(c) Solve the system a ≡ 0 (mod d), and a ≡ 1 (mod n), and consider the
function φ : Zn → Zm given by φ(x) = ax. Show that φ is well defined, is
an injective ring homomorphism, and φ(Zn ) = B, therefore completing
the proof.
23. Consider maps φ : Z4 → Z36 .
(a) Find the group homomorphisms φ. Which ones are injective?
(b) List the ring homomorphisms φ. Which ones are injective?
24. Assume that a ∈ Z∗m , and consider Ψ : Z∗m → Zm given by Ψ(x) = a · x.
(a) Show that Ψ is injective, and that in fact Ψ(x) ∈ Z∗m , for all x ∈ Z∗m .
(b) Writing Z∗m = {x1 , x2 , x3 , . . . , xk }, where k = φ(m) and φ is the Euler
Qk
Qk
function, show that i=1 Ψ(xi ) = i=1 xi , and use this fact to prove the
Theorem of Euler : ak = 1.
(c) Prove also the following Theorem of Fermat : If m = p is prime, then
ap = a.
3.2
Fractions and Rational Numbers
Our main aim in this section is to define the field of rational numbers and to
show that its properties (often introduced in an axiomatic form) are logical
consequences of the axioms of the integers studied in Chapter 2. At the same
time, we will see that the method to define the rational numbers from the
integer numbers can be applied to any abelian ring where the cancellation
law holds. This will allow us to define later other important fields.
138
CHAPTER 3. OTHER EXAMPLES OF RINGS
The rational numbers (fractions, roots, etc.) often are informally defined
as “expressions of the type m
n ”, where m and n are integers, and n 6= 0.
From a more formal point of view, we can observe that an ordered pair of
integers (m, n) determines a rational number, provided that n 6= 0. On the
other hand, we know that distinct ordered pairs can correspond to the same
m′
rational number: we can have (m, n) 6= (m′ , n′ ) and m
n = n′ , This happens
precisely when mn′ = m′ n.
These remarks suggest that instead of defining the rational numbers as
ordered pairs of integers we can define them as equivalence classes of ordered
pairs of integers. As we will see, the fact that this works does not depend
on specific properties of the integers, but only on the fact that Z is an
abelian ring with more that one element, where the cancellation law holds.
Therefore, we will formalize this construction in this more general abstract
setting.
In what follows, A 6= {0} denotes an abelian ring where the cancellation
law for the product is valid (i.e., without zero divisors).
Definition 3.2.1. Let B = {(a, b) : a, b ∈ A, and b 6= 0}. We denote by
“≃” the binary relation in B defined by
(a, b) ≃ (a′ , b′ )
⇐⇒
ab′ = a′ b.
The relation just defined is, of course, suggested by the equality of fractions that we mentioned before. Its usefulness depends on being an equivalence relation, which we now check.
Lemma 3.2.2. The relation ≃ is an equivalence relation.
Proof. Obviously, the relation “≃” is reflexive and symmetric. To check that
is also transitivity, assume that (a, b) ≃ (a′ , b′ ), and that (a′ , b′ ) ≃ (a′′ , b′′ ).
Using the commutativity and associativity of the product, we find:
(a′ , b′ ) ≃ (a′′ , b′′ )
(a, b) ≃ (a′ , b′ )
⇐⇒
⇐⇒
a′ b′′ = a′′ b′
ab′ = a′ b
=⇒
=⇒
a′ bb′′ = a′′ bb′ ,
a′ bb′′ = ab′′ b′ .
We conclude that a′′ bb′ = ab′′ b′ and, hence, that a′′ b = ab′′ , e (a, b) ≃ (a′′ , b′′ ),
since b′ 6= 0 and A has no zero divisors.
An equivalence relation “≃” in B determines a partition of B into equivalence classes. If (a, b) ∈ B, we call the corresponding equivalence class (a, b)
a fraction of A and denoted it by “a/b”, or by “ ab ”. More formally:
3.2. FRACTIONS AND RATIONAL NUMBERS
139
Definition 3.2.3. Let A 6= {0} be an abelian ring where teh cancellation
law holds.
(i) If a, b ∈ A and b 6= 0, the fraction
a
b
is defined by
a
= {(a′ , b′ ) ∈ A × A : b′ 6= 0 and ab′ = a′ b};
b
(ii) The set of all fractions
a
b
is denoted by Frac(A);
(iii) When A = Z, a fraction ab is called a rational number, and the set
Frac(Z) is denoted by Q.
When A = Z, the set of rational numbers Frac(Z) = Q is a ring. For the
general case, we define on the set of fractions of A algebraic operations of
addition and product by copying simply the usual operations with rational
numbers. These are given by:
(3.2.1)
(3.2.2)
ad + bc
a c
+ =
,
b d
bd
ac
ac
= .
bd
bd
In order to formalize these definitions, which represent binary operations
on equivalence classes, we must check that each of this operations is independent of the choice of representative for each class. We state this in the
following lemma, whose proof is left as an exercise:
Lemma 3.2.4. If (a, b) ≃ (a′ , b′ ), and (c, d) ≃ (c′ , d′ ), then
(ad + bc, bd) ≃ (a′ d′ + b′ c′ , b′ d′ ),
(ac, bd) ≃ (a′ c′ , b′ d′ ).
According to this lemma, (3.2.1) and (3.2.2) do indeed define addition
and product operations on the set of fractions. It is obvious that both
operations are commutative, and that the product is associative. With
a bit more work, one can verify that addition is associative and that the
product is distributive relative to addition. The existence of identities for
both operations also does not offer any difficulties. Indeed, just like for the
rationals, we have:
Theorem 3.2.5. The set Frac(A) with the algebraic operations (3.2.1) and
(3.2.2) is a field, called the field of fractions of A.
140
CHAPTER 3. OTHER EXAMPLES OF RINGS
Proof. Let b 6= 0 be any non-zero element in A. We set 0′ =
We leave as an exercise to check :
0′ = {(0, c) : c 6= 0} and
0
b
and 1′ = bb .
1′ = {(c, c) : c 6= 0}
are, respectively, identities for the addition and the product in Frac(A).
Note-se that the elements 0′ and 1′ are independent of the choice of b 6= 0
in A. In particular, and assuming that y 6= 0, we have:
x
= 0′ ⇐⇒ x = 0, and
y
x
= 1′ ⇐⇒ x = y.
y
Note also that the existence of an identity for the product of fractions does
not depend of the existence of an identity for the product in the original
ring A, and observe finally that
a (−a)
+
= 0′ ,
b
b
and that if
a
b
6= 0′ , then a 6= 0, so
b
a
is a fraction, and
ab
= 1′ .
ba
When A = Z is the ring of integers and Frac(A) = Q, we are used to
consider Z as subring of Q. For an arbitrary ring this “identification” is still
possible: the field of fractions of a ring A contains a subring isomorphic to
the ring A.
Proposition 3.2.6. Let a 6= 0 be a fixed element in the ring A. Then:
(i) ι : A → Frac(A), x 7→
ax
a
is a ring isomorphism between A and ι(A);
(ii) the isomorphism ι is independent of the choice of element a ∈ A − {0}.
Proof. The identities ι(x + y) = ι(x) + ι(y) and ι(xy) = ι(x)ι(y) are easily
verified, so ι is a homomorphism of rings and ι(A) is a subring of Frac(A).
Moreover, ι is an isomorphism between A and ι(A) since:
ι(x) = 0′
⇐⇒
⇐⇒
⇐⇒
ax
= 0′
a
ax = 0
x = 0.
3.2. FRACTIONS AND RATIONAL NUMBERS
141
′
ax
On the other hand, if a, a′ 6= 0 and x ∈ A, then ax
a = a′ , because
(ax)a′ = (a′ x)a. Hence, ι is independent of the choice of a ∈ A − {0}.
According to this result, the elements of Frac(A) of the form ax
a , with
a, x ∈ A and a 6= 0 fixed, are “copies” of the elements x ∈ A. For this
reason, whenever one works with elements of the field Frac(A), the fraction
ax
a is denoted by “x”, and one says that x is an element of the ring A. We
also write Frac(A) ⊃ A, so we consider Frac(A) as an extension of the ring
A. This abuse of language is fully justified by the proposition above, and
allows for a much lighter notation. For the same reason, we will use the
same symbol 0 to denote the zeros of A and Frac(A). Similarly, when A as
identity 1 we use the same symbol to denote the identity of Frac(A).
A similar issue arises when one considers “fractions of fractions”. When
dealing with rational numbers, we have the equality of fractions
ad
a/b
=
.
c/d
bc
However, according to our formal definition these two fractions belong to
two different rings: the fraction ad
bc is an element of Q, while the fraction
a/b
c/d is an element of Frac(Q). Hence, this equality cannot hold literally. The
sense in which it holds is explained in the following proposition:
Proposition 3.2.7. If K is a field, the map ι : K → Frac(K) defined in
Proposition 3.2.6 is surjective, and hence is a ring isomorphism.
Proof. The proof reduces to the observation that xy = ι(xy −1 ), which according with the conventions above is written in the form “ xy = xy −1 ”.
In the case of the composite fraction above (x = ab and y = dc ), strictly
speaking, we have that:
ad
a c −1
ad
a/b
=ι
=ι
=ι
.
c/d
b d
bc
bc
ad
Using the identification of x with ι(x), we find that a/b
c/d = bc .
In the previous section we saw that the existence of the finite fields Zp
follows from the axioms for the integers. We have just seen that the existence
of the field Q is another consequence of those axioms. We will see later that
these fields are, in some sense, the smallest fields: we will show that any
field of characteristic p contains necessarily a subfield isomorphic to Zp (if
p 6= 0) or isomorphic to Q (if p = 0).
142
CHAPTER 3. OTHER EXAMPLES OF RINGS
Exercises.
1. Prove Lemma 3.2.4.
2. Complete the proof of Theorem 3.2.5
3. What is the field of fractions of the ring of Gaussian integers?
4. If Frac(A) is isomorphic to Frac(B), is it true that A is isomorphic to B? If
A is isomorphic to B, is it true that Frac(A) is isomorphic to Frac(B)?
5. Show that, if a field K is an extension of a ring A, then K contains a subfield
isomorphic to Frac(A) (this result, generalizes Proposition 3.2.6, that Frac(A)
is the smallest field that contains A). (Hint: If K is a field, the intersection
of all subfields of K is the smallest subfield of K.)
6. Show that, if A is countable, then Frac(A) is countable.
7. Assume that A is an integral domain. Show that then the characteristics of
Frac(A) and A are equal.
8. Show that Q can only be ordered so that: m
n > 0 ⇔ mn > 0. (Hint: In an
ordered field one always has x > 0 ⇔ x−1 > 0.)
9. Show that, if A is an ordered ring, then Frac(A) is also an ordered field.
3.3
Polynomials and Power Series
The polynomials with real coefficients are often defined as the maps p : R →
R of the form
N
X
p n xn ,
p(x) =
n=1
where the real numbers pn are called the coefficients of the polynomial p.
However, one cannot define in this way polynomials with coefficients in a
arbitrary ring A if we want to keep the property that polynomials with distinct coefficients correspond to distinct polynomials. In fact, if A has more
than one element, there is an infinite number of choices for the coefficients
of polynomials. However, if A is finite, there are only a finite number of
maps f : A → A, so these cannot be used to define all the polynomials with
coefficients in A.
3.3. POLYNOMIALS AND POWER SERIES
143
Example 3.3.1.
The set underlying the ring Z2 is {0, 1}, so has two elements. Therefore the
set of maps f : Z2 → Z2 has 4 elements. On the other hand, if polynomials with
distinct coefficients correspond to distinct polynomials, there exist an infinite
number of polynomials with coefficients in Z2 .
This problem is solved identifying a polynomial with the sequence of its
coefficients. We will come back later to the issue of defining an associated
map. In fact, we will consider polynomials (which have a finite number of
non-zero coefficients) as a special case of a “power series”. The last ones
will be defined without any reference to convergence issues, so they will be
“formal” power series.
In what follows, we denote by A an abelian ring with identity 1.
Definition 3.3.2. A (formal) power series in A is a map s : N0 → A.
A power series is called a polynomial if and only if there exists a N ∈ N0
such that s(n) = 0 for all n > N .
Examples 3.3.3.
1. The following polynomials with coefficients in A have special roles:
• 0 = (0, 0, 0, . . . ), called the zero polynomial;
• 1 = (1, 0, 0, . . . ), called the identity polynomial;
• x = (0, 1, 0, . . . ), which we will call the indeterminate x.
2. The polynomial a = (a, 0, 0, . . . ), where a ∈ A, is called a constant polynomial. In particular, the zero and identity polynomials are constant polynomials.
3. The set of all power series in Z2 is infinite and non-contable (it should be
obvious that it is a set in bijection with P(N)), and the set of all polynomials
in Z2 is infinite but countable.
The terms s(0), s(1), s(2), . . . of a formal power series s : N0 → A are
called the coefficients of the series In order to avoid confusion with the
values of a function associated with the power series, we will always denotes
these coefficients by s0 , s1 , s2 , etc. We will denote by the symbols A[[x]]
and A[x], respectively, the sets of power series and polynomials, respectively,
144
CHAPTER 3. OTHER EXAMPLES OF RINGS
with coefficients in A. Obviously, A[x] ⊂ A[[x]]4 .
The addition and product of polynomials with real coefficients is certainly familiar to us. For example, if we consider polynomials of degree 2,
we have:
(a0 + a1 x + a2 x2 ) + (b0 + b1 x + b2 x2 ) = (a0 + b0 ) + (a1 + b1 )x + (a2 + b2 )x2
(a0 + a1 x + a2 x2 )(b0 + b1 x + b2 x2 ), = (a0 b0 ) + (a0 b1 + a1 b0 )x
+(a0 b2 + a1 b1 +a2 b0 )x2 + (a1 b2 + a2 b1 )x3 + (a2 b2 )x4 .
Our next definition amounts to recognize that the the operations over
the coefficients of the polynomials, on the right hand side of the previous
identities, make sense in any ring. Notice that the resulting addition operation is just the usual addition of sequences, while the product is distinct.
When there is some risk of ambiguity, we will call the product defined below
the convolution product , and denote it by s ⋆ t instead of st.
Definition 3.3.4. Let s, t : N0 → A be power series, the sum s + t and the
(convolution) product s ⋆ t are the power series:
(3.3.1)
(3.3.2)
(s + t)n = sn + tn , and,
n
X
sk tn−k .
(s ⋆ t)n =
k=0
Examples 3.3.5.
1. If a = (a, 0, 0, . . . ) and b = (b, 0, 0, . . . ) are constant polynomials, their
sum and product are given by a + b = (a + b, 0, 0, . . . ) and a ⋆ b = ab =
(ab, 0, 0, . . . ). Hence, the set of constant polynomials is a ring isomorphic to
the ring A.
2. If a = (a, 0, 0, . . . ) is constant and s = (s0 , s1 , s2 , . . . ) is any power series,
the
Pn product a ⋆ s is the power series (as0 , as1 , as2 , . . . ) = as, since the sum
k=0 ak sn−k reduces the term with k = 0.
. . ) with coefficients
3. Consider the power series s = (1, 1, 1, .P
Pn in Z2 . In order
n
to compute its square, note that (ss)n = k=0 sk sn−k = k=0 1 = n + 1. We
conclude that
(1, 1, 1, . . . )(1, 1, 1, . . . ) = (1, 0, 1, 0, . . . ).
4
Note that the use of the letter “x” in the symbols A[x] and A[[x]] is caused by the
choice of this letter to denote the indeterminate (0, 1, 0, . . . ). In some situations we may
denote this indeterminate by some other letter, e.g., y, in which case we will use the
notation A[y] or A[[y]].
3.3. POLYNOMIALS AND POWER SERIES
145
4. If s = (s0 , s1 , s2 , . . . ) is any power series, the product xs is the power series
obtained from s pby translation of all its coefficients to the right:
(xs)0 = x0 s0 = 0,
(xs)n+1 =
n+1
X
xk sn+1−k = sn .
k=0
When s = x, we conclude that x2 = (0, 0, 1, 0, . . . ), x3 = (0, 0, 0, 1, 0, . . . ), etc.
We extend this to the case n = 0 by stipulating by convention that
x0 = (1, 0, 0, . . . ) = 1.
The next result has a straightforward proof, and so it is left as an exercise.
Theorem 3.3.6. If A is an abelian ring with identity, then both A[[x]] and
A[x]5 are abelian rings with identity for the sum and product defined by
(3.3.1) and (3.3.2)), and A[x] is a subring of A[[x]].
We have already observed that the constant polynomials form a ring
isomorphic to A. Therefore, this justifies using the same symbol a to denote an element of the ring A, and the corresponding constant polynomial
(a, 0, 0, . . . ). We also say that A[x] and A[[x]] are extensions of A. Notice
that the expression axn is then representing the product of the constant
polynomial (a, 0, 0, . . . ) by the n-th power of the indeterminate x, i.e., xn .
This polynomial, according to what we saw above, has only one non-zero
coefficient namely (axn )n = a (note that here, a has two distinct meanings!)
We conclude that if p = (p0 , p1 , p2 , . . . , pN , 0, . . . ), then
p = p0 + p1 x + · · · + pN x
N
=
N
X
p n xn .
n=0
The expression on the right hand side is called the canonical form of the
polynomial p. As usual, the coefficient is omitted whenever it equals 1.
Example 3.3.7.
Assume that p, q ∈ Z4 [x] are given by p = 1 + x + 2x2 and q = 1 + 2x2 .
In order to add and multiply these polynomials, we proceed exactly as we do
5
The ring of power series A[[x]] and the ring of sequences in A have the underlying set,
and the same addition operation, differing only on the multiplication operation. Whenever
is necessary to distinguish the power series s in A[[x]] from the corresponding sequence a
in S, one often says that s is the z-transform of a.
146
CHAPTER 3. OTHER EXAMPLES OF RINGS
for polynomials with real coefficients, since the usual procedure only uses the
algebraic properties common to any ring. One can easily recognize these apart
from the specific properties of the ring Z4 in the following computations:
(1 + x + 2x2 ) + (1 + 2x2 ) = (1 + 1) + (1 + 0)x + (2 + 2)x2
= 2 + x,
2
2
(1 + x + 2x )(1 + 2x ) = (1 + x + 2x2 )1 + (1 + x + 2x2 )2x2
= (1 + x + 2x2 ) + (2x2 + 2x3 )
= 1 + x + 2x3 .
P
Sometimes it is possible to give a meaning to “infinite sums” ∞
n=0 sn ,
where each sn is a power series. If snk denotes the coefficient k P
of the power
series sn , the simplest meaning one can give to the “sum” t = ∞
n=0 sn is
t=
∞
X
n=0
sn ⇐⇒ tk =
∞
X
n=0
snk , for all k ∈ N0 .
We will always use this definition whenever the sequence snk is eventually
zero for all k ≥ 0, interpreting the “infinite sum” on the right and the (finite)
sum of all the non-zero terms. In particular, we will write for any power
series s:
∞
X
s=
s n xn .
n=0
Definition 3.3.8. If p 6= 0 is a polynomial, the degree of p is the integer
deg p defined by
deg p = max{n ∈ N0 : pn 6= 0}.
When p = 0 we convention that deg p = −∞.
Example 3.3.9.
Obviously, deg xn = n, for any n ≥ 0.
The example above with the product of polynomials in Z4 [x] shows that
it is not always true that the degree of the product of two polynomials is
the sum of of the degrees of the factors. The next result clarifies this issue,
which is related with the presence of zero divisors. In order to avoid dealing
separately with the zero polynomial, we convention that
deg p + deg q = −∞,
whenever p = 0 or q = 0.
3.3. POLYNOMIALS AND POWER SERIES
147
Proposition 3.3.10. Let p, q ∈ A[x]. Then:
(i) deg(p + q) ≤ max{deg p, deg q}, and deg(pq) ≤ deg p + deg q;
(ii) If A is an integral domain, then deg(pq) = deg p + deg q, and A[x]
and A[[x]] are also integral domains.
We leave the proof as an exercise.
Notice that, according to (ii), when A is an integral domain we can form
the field of fractions of A[x] (i.e., Frac(A[x]) in the notation of the previous
section). This field of fractions if the formal abstract analogue of the usual
field of rational functions.
Definition 3.3.11. If A is an integral domain, A(x) (respectively, A((x)))
denotes the field of fractions de A[x] (respectively, A[[x]]).
Example 3.3.12.
When A = Z, the field Z(x) consist of fractions whose numerators and denominators are polynomials with integer coefficients . In this field, we have
x2 − 1
= x − 1.
x+1
2
−1
and g(x) = x − 1 are not the
Notice, however, that the functions f (x) = xx+1
same, since their domains of definition are distinct.
If φ : A → B is a ring homomorphism, then the map Φ : A[x] → B[x]
defined by
!
N
N
X
X
φ(pn )xn
p n xn =
Φ
n=0
n=0
is also a ring homomorphism. Frequently, for p ∈ A[x], we will denote the
polynomial q = Φ(p) by pφ (x). Sometimes (example, if A ⊂ B is a subring
and φ : A → B is the inclusion) we may use the same letter to denote this
two polynomials, if it is clear from the context to which ring the coefficients
belong to. The next examples shows that this is a reasonable practice, which
we are already used to!
Example 3.3.13.
Let p = 1 + 2x + 3x2 ∈ Z[x] be a polynomial with integer coefficients. Obviously, Z ⊂ Q and we can consider the canonical inclusion ι : Z → Q = Frac(Z)
and we have pι = 1 + 2x + 3x2 ∈ Q[x]. Notice that the symbols “1”, “2”
148
CHAPTER 3. OTHER EXAMPLES OF RINGS
and “3” denote now rational numbers, rather than integers. This is a reasonable abuse of notation, as we saw in our treatment of the field of fractions:
the fraction ax
a is denoted by x. Therefore, it is natural to represent this new
polynomial by the same letter as the original polynomial.
Finally, observe that if φ : A → B is an injective homomorphism of rings
and p ∈ A[x], then p and pφ have the same degree.
Exercises.
1. Compute the following product in Zm [x], when m = 2, 3 and 6:
(1 + x + 2x2 )(2 + 3x + 2x2 ).
2. Proof Theorem 3.3.6.
3. Show that the ideal h2, xi in Z[x] is not a principal ideal.
4. Give a proof of Proposition 3.3.10.
5. What are the invertible elements of A[x], when A is an integral domain?
6. Assume that A is a commutative ring with identity. Show that the characteristic of A[x] and the characteristic of A are equal.
7. The polynomials in several variables can be defined in several ways. For
example, in the case of two variables, we can consider:
(i) The ring A[x] and the polynomials with coefficients in A[x], which we
denote by A[x][y] (in this case we denote the new indeterminate by y);
(ii) The maps s : N0 × N0 → A, with an appropriate definition of sum and
convolution product, and indeterminates x and y, giving a ring which is
denoted by A[x, y].
Both ways of defining polynomials in several variables are useful and the following exercises show that they are equivalent.
(a) Give a detailed description of the definition sketched in (ii).
(b) Show that the two definitions are equivalent, i.e., they lead to isomorphic
rings.
(c) How should one generalize these definitions for polynomials in several
variables x1 , . . . , xn ?
(d) How should one generalize these definitions for polynomials in the variables xα , with α ∈ I, where I can be an infinite set?
3.3. POLYNOMIALS AND POWER SERIES
149
8. Show that the following statements are equivalent:
(a)
(b)
(c)
(d)
A is an integral domain;
A[x] is an integral domain;
A[[x]] is an integral domain;
For any p, q ∈ A[x], deg(pq) = deg p + deg q.
9. Show that there exist rings with characteristic p, a prime, which are not
fields, and fields with characteristic p which are infinite (both countable and
uncountable).
PN
10. If p ∈ A[x] is a polynomial p = n=0 pn xn , its (formal) derivative p′
is the polynomial
X
p′ =
npn xn−1 .
n=1
′
Is it true that p = 0 implies that p is a constant polynomial?
11. Use the previous exercise and the homomorphism φ : Z → Zp defined by
φ(n) = n to deduce the following generalization of Euclides Lemma: if p ∈ N
is a prime, a, b ∈ Z[x] and p|ab, then either p|a or p|b.
12. Let D be an integral domain and K = Frac(D) its field of fractions. Show
that D(x) is isomorphic to K(x).
13. Define a ring, denoted
A[1/x, x]], with underlying set consists of all power
P
n
6
series of the form ∞
n=k sn x , where k ∈ Z is arbitrary . Show that if the
coefficients belong to a field K, then the ring K[1/x, x]] is a field isomorphic
to K((x)) = Frac(K[[x]]).
14. Show that in K[[x]] one has
∞
X
1
=
xn .
1 − x n=0
(Notice the abuse of notation, which is justified by the previous exercise!)
15. Determine the power series inverse to (a− x) and to (a− x)(b − x) in K[[x]].
16. Show that in Z2 [[x]], one has
6
∞
X
1
=
x2n .
(1 − x)2
n=0
These series are usually called Laurent series. The corresponding ring, denoted
A[1/x, x]], is called the ring of Laurent series with coefficients in A. The special case
where A = C plays a crucial role in Complex Analysis and in Algebraic Geometry.
150
CHAPTER 3. OTHER EXAMPLES OF RINGS
17. One can solve problems like the one of the Fibonacci sequence in Section 2.3
by formal power series computations. Recalling that this sequence is defined
recursively by:
an+2 = an+1 + an ,
(n ≥ 0),
one notes that is equivalent to
∞
X
an+2 xn =
n=0
and also:
∞
X
If we let s =
an+2 xn+2 = x
∞
X
n=0
n=0
an+1 xn +
n=0
n=0
P∞
∞
X
∞
X
∞
X
n=0
an+1 xn+1 + x2
n=0
n
an x n ,
∞
X
an x n .
n=0
an x , then
an+2 x
n+2
= s − a0 − a1 x,
∞
X
n=0
an+1 xn+1 = s − a0 ,
and we conclude that the recurrence relation above is equivalent to:
s − a0 − a1 x = x(s − a0 ) + x2 s.
Solving this equation for s, we obtain:
s=
−a0 + (a0 − a1 )x
.
x2 + x − 1
If α and β are the roots of x2 + x − 1, we can factorize this rational fraction
as:
−a0 + (a0 − a1 )x
A
B
=
+
,
x2 + x − 1
α−x β−x
The exercise above now allows to compute explicitly the coefficients of s, i.e.,
the terms of the Fibonacci sequence.
Verify all these statements and compute the coefficients of the Fibonacci sequence.
3.4
Polynomial Functions
We have argued in the previous sections that one should not define polynomials with coefficients in A as functions of some sort with domain and
values in A. However, there is nothing forbidding us to define functions
A → A from polynomials in A[x].
3.4. POLYNOMIAL FUNCTIONS
151
P
n
Definition 3.4.1. If p = N
n=0 pn xPis a polynomial in A[x], the function
N
n
p∗ : A → A defined by p∗ (a) =
n=0 pn a is called the polynomial
function associated with p.
Example 3.4.2.
Let A = Z2 and p = 1 + x + x2 (7 ). The polynomial function associated to the
polynomial p is p∗ : Z2 → Z2 given by p∗ (a) = 1 + a + a2 , for any a ∈ Z2 .
In this case, we have p∗ (0) = p∗ (1) = 1, and hence p∗ is a constant function,
although p is not a constant polynomial. In particular, when q = 1, we have
p 6= q and p∗ = q ∗ .
Given a ring A we will denote by AA the set of all functions f : A → A.
In Chapter ?? we already observed that AA is a ring, with the usual addition
and multiplication of functions. We have a map
Ψ : A[x] → AA ,
p 7→ p∗ ,
which associates to each polynomial p ∈ A[x] the corresponding function
p∗ : A → A.
Proposition 3.4.3. Ψ : A[x] → AA is a ring homomorphism.
PN
P
n
n
Proof. Let p, q ∈ A[x]. We can write p = N
n=0 qn x ,
n=0 pn x and q =
by eventually letting some of the coefficients be zero. If a ∈ A, we need to
show that the following identities hold:
(p + q)∗ (a) = p∗ (a) + q ∗ (a), e
(pq)∗ (a) = p∗ (a)q ∗ (a).
A more general version of the first identity was already proved in Chapter
??. The second identity can also be proved easily by induction.
Examples 3.4.4.
1. Given a polynomial p ∈ A[x], the associated function p∗ is defined not just in
the ring A, but also in an extension of A. For example, if p = 1 + 2x + 3x2 ∈
Z[x], the associated polynomial function p∗ : Z → Z is of course p∗ (n) =
1 + 2n + 3n2 , for any n ∈ Z. Since Z ⊂ Q we can also consider q ∗ : Q → Q,
given also by q ∗ (r) = 1 + 2r + 3r2 , for any r ∈ Q.
7
From now, we will not distinguish between the integer n and the class n. It should be
clear from the context if we are referring to an integer (element of Z) or an equivalence
classe in some Zm .
152
CHAPTER 3. OTHER EXAMPLES OF RINGS
2. More formally, if the ring B is an extension of the ring A, so that there
exists an injective homomorphism φ : A → B, then φ(A) is a subring of B
P
n
isomorphic to A. Hence, if p = N
n=0 pn x is a polynomial in A[x], then p
determines a polynomial function q ∗ : B → B, namely:
q ∗ (b) =
N
X
φ(pn )bn .
n=0
Of course, p also determines a polynomial q ∈ B[x] which was denoted pφ in
the previous section, so in fact we have:
q ∗ = (pφ )∗ .
As we have observed before, it is a matter of common sense to decide to distinguish between p and q, to write different symbols for the respective functions
p∗ and q ∗ , or to use different notations for their coefficients pn and φ(pn ).
3. Continuing with Example 1, recall that M2 (Z) denotes the ring of 2 × 2
matrices with integer entries. The matrices of the form nI, where I denotes
the identity matrix and n ∈ Z, is isomorphic to Z: indeed we have an injective
homomorphism φ : Z → M2 (Z) given by φ(n) = nI. If p = 1 + 2x+ 3x2 ∈ Z[x]
we have that (pφ )∗ : M2 (Z) → M2 (Z) is given by (pφ )∗ (C) = φ(1) + φ(2)C +
φ(3)C 2 , where C denotes now any matrix in M2 (Z). Obviously, we can simplify
the notation and write the expression φ(1) + φ(2)C + φ(3)C 2 as I + 2C + 3C 2 .
This is formally the same expression as in Example 1, with the small subtlety
that the symbol “1” is replaced by the symbol “I”, representing the identity
matrix.
In the first example, the identification of Z with its image in Q is quite
natural. In the last example, one does not need to substitute the coefficients
of degree > 0 (because the product of a matrix by the integer n is the same as
the product by the matrix nI), but we must replace the coefficient of degree
zero, because the same of matrix with an integer is not a priori defined.
We can use this circle of ideas to formalize the notion of root of a polynomial:
Definition 3.4.5. Let p ∈ A[x] and let B be an extension of A. We say
that b ∈ B is a root of p if p∗ (b) = 0.
When we write p∗ (a) we usually think of p∗ as fixed and a as a variable
(the “independent variable” of the function p∗ ). However, we can also consider p∗ as an “independent variable”. Still assuming that A is any abelian
ring with identity and that B is an extension of A, we make the following
definition (recall the remarks above about identifications!):
3.4. POLYNOMIAL FUNCTIONS
153
Definition 3.4.6. The map Val : A[x] × B → B is defined by
Val(p, b) := p∗ (b).
Val(p, b) is called the evaluation of the polynomial p at b.
The function Val has two independent variables: the polynomial p and
the element b. If we fix the polynomial p, then Val is just the function
p∗ : B → B associated to the polynomial p. We can also fix the element b,
resulting in an evaluation map ψ : A[x] → B, p 7→ p∗ (b).
Examples 3.4.7.
1. Let A = Z, B = C and b = i. Then ψ : Z[x] → C is given by ψ(p) = p∗ (i). In
order to find out the image ψ(Z[x]) notice that the division of any polynomial
p ∈ Z[x] by 1 + x2 , results in a reminder of degree less than 2. In other
words, p = (1 + x2 )q + (a0 + a1 x), where q ∈ Z[x] and a0 , a1 ∈ Z. Hence,
p∗ (i) = a0 + a1 i so ψ(Z[x]) consists of all Gaussian integers.
√
√
2. Let A = Q, B = R and b = 2. Then ψ(p) = p∗ ( 2) Since we can always
2
write
√ + (a0 + a1 x), where q ∈ Q[x] and a0 , a1 ∈ Q, we have that
√ p = (2 − x )q
2, so we conclude that ψ(Q[x]) consists of all real numbers
p∗ ( 2) = a0 + a1 √
of the form a0 + a1 2 with a0 , a1 ∈ Q.
3. The ring A[x] is always an extension of the ring A. Hence, if we let A be
any ring, B = A[x] and b = x, it is obvious that ψ(p) = p∗ (x) = p. In
other words, ψ : A[x] → A[x] is the identity function. For this reason, we can
denote the polynomial p by the symbol p∗ (x), which usually simplified to p(x).
Motivated by the last example, whenever B is an extension of A, b ∈ B,
and ψ : A[x] → B is the evaluation map ψ(p) = p∗ (b), we will denote the
image ψ(A[x]) by the symbol A[b]. This is the reason why we used the
symbol Z[i] for the Gaussian integers. In the case of Example 3.4.7.2, we
have
√
√
√
Q[ 2] = {p∗ ( 2) : p ∈ Q} = {a0 + a1 2 : a0 , a1 ∈ Q}.
The previous examples suggest that A[b] is always a subring of B. It is
easy to check that this always holds:
Proposition 3.4.8. For any fixed b ∈ B, the evaluation map ψ : A[x] → B,
p 7→ p∗ (b), is a ring homomorphism. Hence, A[b] is a subring of B.
Proof. Proceed as in the proof of Proposition 3.4.3.
154
CHAPTER 3. OTHER EXAMPLES OF RINGS
Also, in analogy with Definition 3.3.11, whenever A[b] is an integral
domain, we will denote its field of fractions by A(b).
Notice that A[b] always contains the values of the constant polynomials.
These form a subring of A[b] isomorphic to A. Hence, A[b] is always an
extension√ of A: the ring of Gaussian integers contains a subring isomorphic
to Z, Q[ 2] contains a subring isomorphic to Q, etc.). On the other hand,
in all Examples 3.4.7, only for the last one is the evaluation map ψ injective.
Therefore, only for the last of these examples, we have A[b] isomorphic to
A[x]. Of course, ψ is injective if and only if its kernel N (ψ) consist of the
zero polynomial zero. Since:
N (ψ) = {p ∈ A[x] : p∗ (b) = 0},
we conclude that ψ is injective if and only if b is not a root of any non-zero
polynomial with coefficients in A.
Definition 3.4.9. Let B be an extension of A and b ∈ B. We say that b
is algabraic over A if there exists a non-zero polynomial p ∈ A[x] such
that p∗ (b) = 0. Otherwise, we say that b is transcendental over A .
Example 3.4.10.
√
As we saw above, i ∈ C is algebraic over Z, and 2 ∈ R is algebraic over Q
(and also over Z). The polynomial x ∈ A[x] is always transcendental over A.
In general, an extension B of A may (i) contain both algebraic elements
and transcendental elements over A, or (ii) may contain only algebraic elements over A. We distinguish these two possibilities:
Definition 3.4.11. We call B an algebraic extension of A if all its
elements are algebraic over A. Otherwise, B is called a transcendental
extension of A.
Examples 3.4.12.
1. The ring of Gaussian integers is an algebraic extension of Z. In fact, any
Gaussian integer m + ni is a root of the non-zero polynomial with integer
coefficients (x − m)2 + n2 = x2 − 2mx + m2 + n2 .
√
√
2. Q[ 2] is an algebraic extension of Q, since a + b 2 is a root of the non-zero
polynomial with rational coefficients (x − a)2 − 2b2 .
3. Q is an algebraic extension of Z, because
m
n
is a root of nx − m ∈ Z[x].
4. C is an algebraic extension of R, because a+bi is a root of (x−a)2 +b2 ∈ R[x].
3.4. POLYNOMIAL FUNCTIONS
155
5. A[x] is a transcendental extension of A.
6. We will show in the next chapter that R is a transcendental extension of Q.
We close this section with the following result which expresses the fact
that A[x] is the smallest transcendental extension of A. We leave its proof
for the exercises.
Theorem 3.4.13. Any transcendental extension of A contains a subring
isomorphic to A[x].
Exercises.
1. Conclude the proof of Proposition 3.4.3.
√
√
2. Assume that n is irrational. Show that Q[ n] is an algebraic extension of
Q, a subfield of R, and a vector space of dimension 2 over Q.8
3. Prove Theorem 3.4.13.
4. Assume that A ⊂ B are integral domains and b ∈ B. Show that:
(a) A[b] is the smallest integral domain containing A and b.
(b) A(b) is the smallest field containing A and b.
5. Let K be a field and fix n points (ak , bk ) in K × K with ai 6= aj for i 6= j.
Show that :
(a) There exists a polynomial pi ∈ K[x] such that

se j = i,
 1,
pi (aj ) =

0,
se j =
6 i.
(Hint: Modify the polynomial q i =
Qn
k=1 (x−ak )
(x−ai )
.)
(b) There exists a polynomial p(x) ∈ K[x] of degree ≤ n − 1 such that
p(ak ) = bk , for all k = 1, . . . , n.
The formula defining p is called Lagrange interpolation formula.
6. Let K be a finite field. Show that any function K → K is polynomial.
√
The fields Q[ a] where a is not a perfect square are called quadratic fields, and
play an important role in Number Theory.
8
156
CHAPTER 3. OTHER EXAMPLES OF RINGS
7. Let K be a subfield of L, and assume that L is a finite dimensional vector
space over K. Show that L is an algebraic extension of K.
(Hint: if the dimension of L over K is n and a ∈ L then {ak : 0 ≤ k ≤ n}
cannot be a linearly independent set.)
8. Let K be a subfield of L and let b ∈ L be algebraic over K. Use the previous
exercise to show that K(b) is an algebraic extension of K.
3.5
Division of Polynomials
In this and in the next sections we will study in some detail the ring of
polynomials A[x]. Underlying this study, playing a fundamental role, there
will be the division algorithm for polynomials. Therefore, we need to find
sufficient conditions on the ring A for this algorithm to hold. Many of the
results that we will obtain are then analogous to the results we have obtained
in Chapter ?? for the ring of integers.
According to Example 3.4.7.3, we will adopt the following convention: a
polynomial p ∈ A[x] is represented by the symbol p(x) and the value of the
polynomial p at a ∈ A will now be represented by p(a) instead of p∗ (a).
Definition 3.5.1. A polynomial p(x) ∈ A[x] is called monic if pn = 1,
where deg p(x) = n and 1 is the identity in the ring A.
Example 3.5.2.
The polynomial 5 + 3x + 2x2 + x4 ∈ Z[x] is monic.
The division algorithm presents obvious issues in rings with zero divisors.
On the other hand, if D is a integral domain, then by Proposition 3.3.10(b)
D[x] is also an integral domain. In fact, we have:
Theorem 3.5.3 (Division Algorithm). Let D be an integral domain. If
p(x), d(x) ∈ D[x] and d(x) is monic, there exist unique polynomials q(x)
and r(x), with deg r(x) < deg d(x), such that p(x) = q(x)d(x) + r(x).
Similarly to the case of the integers, the polynomials q(x) and r(x) are
called, respectively, the quotient and the remainder of the division of
p(x) by d(x). The case where r(x) = 0 corresponds, of course, to the case
where d(x) is a divisor (or factor) of p(x). We recall that in this case we
write d(x)|p(x).
3.5. DIVISION OF POLYNOMIALS
157
Proof of Theorem 3.5.3. We will show separately the existence and uniqueness.
Existence: Let R = {p(x) − a(x)d(x) : a(x) ∈ D[x]}. There are two
cases, depending if 0 belongs to R or not:
(a) If 0 ∈ R there exists a0 (x) ∈ D[x] such that p(x) − a0 (x)d(x) = 0. In
this case we set q(x) = a0 (x) and r(x) = 0. It follows that:
p(x) = q(x)d(x) + r(x),
with deg r(x) < deg d(x).
(b) If 0 6∈ R the set G = {deg(p(x) − a(x)d(x)) : a(x) ∈ D[x]} ⊂ N0 ,
formed by the degrees of the polynomials in R must have has a minimum m. Let a0 (x) be a polynomial with deg(p(x) − a0 (x)d(x)) = m.
If we set q(x) = a0 (x) and r(x) = p(x) − q(x)d(x), it follows that:
p(x) = q(x)d(x) + r(x),
and we just need to check that m = deg r(x) < deg d(x). Assume
that n = deg d(x) ≤ m. Then we set k = m − n and r ′ (x) = r(x) −
rm xk d(x). On the one hand, r ′ (x) = p(x) − (q(x) + rm xk )d(x) ∈ R.
On the other hand, since d(x) is monic, we have deg r ′ (x) < m =
deg r(x), contradicting the definition of m. Therefore, we must have
m < n, i.e, deg r(x) < deg d(x).
Uniqueness: Let p(x) = q(x)d(x) + r(x) = q ′ (x)d(x) + r ′ (x), where
deg r(x) and deg r ′ (x) are both smaller than deg d(x). We have that
q(x) − q ′ (x) d(x) = r ′ (x) − r(x).
If q(x) 6= q ′ (x), then
which contradicts
deg r ′ (x) − r(x) ≥ deg d(x),
deg r ′ (x) − r(x) ≤ max{deg r ′ (x), deg r(x)} < deg d(x).
Hence we must have q(x) = q ′ (x), which also implies r(x) = r ′ (x).
The argument above for the existence part of the proof can be implemented as an algorithm as follows: given a polynomial p(x) = an xn +
an−1 xn−1 + · · · + a0 ∈ D[x] of degree n denote by ptop (x) = an xn the highest degree term of p(x). To divide the polynomial p(x) by a polynomial
d(x) one iterates as follows:
158
CHAPTER 3. OTHER EXAMPLES OF RINGS
• Starting with q(x) = 0 and r(x) = p(x), one replaces at each step
q(x) → q(x) +
r top (x)
,
dtop (x)
r(x) → r(x) −
r top (x)
d(x).
dtop (x)
The iteration ends when deg r(x) < deg d(x).
Example 3.5.4.
Let D = Z5 . The division of p(x) = 2x4 +4x3 +x2 +2x+3 by d(x) = x2 +x+1
results in a quotient q(x) = 2x2 + 2x + 2, with remainder r(x) = 3x + 1.
Assuming that a ∈ D, the polynomial d(x) = (x−a) is monic of degree 1.
Hence, any polynomial p(x) ∈ D[x] can be divided by (x−a) and, according
to the division algorithm, the remainder is a constant polynomial. The next
corollary states that the corresponding element of D is just the value of p(x)
at a. We leave the proof as an exercise:
Corollary 3.5.5 (Remainder Theorem). If p(x) ∈ D[x] and a ∈ D, the
remainder of the division of p(x) by (x − a) is the constant polynomial
r(x) = p(a). In particular, (x − a)|p(x) if and only if a is a root of p(x).
Examples 3.5.6.
1. Consider the polynomial p(x) = 2x4 + 4x3 + x2 + 2x + 3 in Z5 [x]. Since
p(1) = 12 ≡ 2 (mod 5), it follows that the remainder of the division of 2x4 +
4x3 + x2 + 2x + 3 by (x − 1) = (x + 4) is r(x) = 2.
2. If we assume instead that p(x) = 2x4 + 4x3 + x2 + 2x + 3 is a polynomial
with coefficients in Z3 , we have that p(1) = 12 ≡ 0 (mod 3), so in this case
(x − 1) = (x + 2) is a factor of 2x4 + 4x3 + x2 + 2x + 3.
Another consequence of the Division Algorithm (or, more precisely, of
Corollary 3.5.5) is the classical result about the maximal number of roots of
a non-zero polynomial:
Proposition 3.5.7. If p(x) ∈ D[x] and deg p(x) = n ≥ 0, then p(x) has at
most n roots in D.
Proof. We use induction in the degree of the polynomial p(x). If n = 0, the
polynomial p(x) is constant and non-zero. It is then obvious that it has no
roots.
Assume now that the result hods for an integer n ≥ 0. Let deg p(x) =
n+1 and assume that a is a root of p(x) (if p(x) has no roots there is nothing
3.5. DIVISION OF POLYNOMIALS
159
to prove!). According to the Remainder Theorem, p(x) = q(x)(x−a), where
of course deg q(x) = n. On the one hand, by the induction assumption, q(x)
has at most n roots. On the other hand, if b ∈ D is distinct from a, then
p(b) = q(b)(b − a). Since D is an integral domain, we can only have p(b) = 0
if q(b) = 0. In other words, the remaining roots of p(x) must also be roots
of q(x). Therefore, p(x) has at most n + 1 roots.
When D is an integral domain, the only invertible polynomials q(x) ∈
D[x] are the constant polynomials q(x) = a, with a ∈ D invertible. These
polynomials can always be used to obtain trivial factorizations of any polynomial p(x): if a(x)b(x) = 1 then p(x) = a(x)b(x)p(x). For this reason,
if q(x)|p(x), one often says that q(x) is a proper factor of p(x) if and
only if p(x) = a(x)q(x), where neither a(x) nor q(x) are invertible. Hence,
a factorization is non-trivial if and only if it contains at least on proper
factor.
Our next definition, which should be compared with the definition of
prime numbers presented in Chapter ??, says that an irreducible polynomial
is a non-invertible polynomial with only trivial factorizations. Notice that
the restricting to a non-invertible polynomial is entirely analogous to the
exclusion of 1 from the set of prime numbers.
Definition 3.5.8. A non-invertible polynomial p(x) ∈ D[x] is said to be
irreducible in D[x] when it has no proper factors in D[x]. In other words,
if for q1 (x), q2 (x) ∈ D[x],
p(x) = q1 (x)q2 (x)
=⇒
q1 (x) or q2 (x) is invertible in D[x].
We say that p(x) is reducible in D[x] if it is not irreducible.
Examples 3.5.9.
1. p(x) = x − a is always irreducible, no matter what the domain D is.
2. If D = Z, then p(x) = 2x − 3 is irreducible (check this!), but q(x) = 2x + 6
is reducible, because 2x + 6 = 2(x + 3), and both 2 and x + 3 are not invertible
in Z[x].
3. If deg p(x) ≥ 2 and p(x) has at least one root in D, it follows from the
Remainder Theorem that p(x) is necessarily reducible in D[x].
4. If p(x) is monic of degree 2 or 3, then p(x) is reducible in D[x] if and only
if it had at least one root in D (why?).
5. It is possible for p(x) to have no roots in D and be reducible in D[x]: for
example, x4 + 2x2 + 1 is reducible in Z[x], since x4 + 2x2 + 1 = (x2 + 1)2 , but
it has no roots in Z.
160
CHAPTER 3. OTHER EXAMPLES OF RINGS
6. You probably have learned in elementary Algebra that the only irreducible
polynomials in R[x] are the polynomials of degree 1 and the quadratic polynomials p(x) = ax2 + bx + c, with negative discriminant ∆ = b2 − 4ac. We will
see later that this is a consequence of the Fundamental Theorem of Algebra.
7. The property of a polynomial being irreducible depends heavily on the domain
D. For example, the polynomial x2 + 1 is irreducible in R[x], but reducible in
C[x] ⊃ R[x]. On the other hand, the polynomial p(x) = 2x + 6 is reducible in
Z[x], but irreducible in Q[x] ⊃ Z[x].
In same cases it is possible to describe all the irreducible polynomials in
D[x], as in the case D = R. However, there are cases where it is practically
impossible to describe them. The next results illustrate how complex this
issue is in the case of Z[x] and Q[x].
Given p(x) ∈ Z[x], p(x) = a0 + a1 x + · · · + an xn , we will say that c(p) =
gcd(a0 , a1 , . . . , an ) is content of p(x). We say that p(x) is primitive if its
coefficients are relatively prime, i.e., if c(p) = 1. Obviously p(x) is primitive
if and only if it has no factorizations of type p(x) = kq(x), with k ∈ Z,
k 6= ±1 and q(x) ∈ Z[x].
Lemma 3.5.10. If p(x) ∈ Z[x] and p(x) = a(x)b(x), with a(x), b(x) ∈
Q[x], then there exist polynomials a′ (x), b′ (x) ∈ Z[x], and k ∈ Q, such that
p(x) = a′ (x)b′ (x), a(x) = ka′ (x) and b(x) = k−1 b′ (x).
Proof. Obviously, there are integers n, m such that ã(x) = na(x) ∈ Z[x],
b̃(x) = mb(x) ∈ Z[x], and nmp(x) = ã(x)b̃(x). Let q be any prime factor of
nm. By the generalization of Euclides Lemma given in Exercise 11, Section
3.3:
q|ã(x)b̃(x) =⇒ q|ã(x) or q|b̃(x).
Hence, we can divide both sides of the identity nmp(x) = ã(x)b̃(x) by q,
still obtaining on the right hand side a product of two polynomials in Z[x].
If we repeat this procedure for all prime factors of nm, we arrive at an
identity
p(x) = a′ (x)b′ (x),
where a′ (x), b′ (x) ∈ Z[x], ã(x) = sa′ (x) and b̃(x) = tb′ (x), with s, t ∈ Z.
We conclude that a(x) = ns a′ (x), b(x) = mt b′ (x), and ns mt = 1.
Lemma 3.5.11 (Gauss). If p(x) ∈ Z[x] is not constant, then p(x) is irreducible in Z[x] if and only if it is primitive and irreducible in Q[x].
3.5. DIVISION OF POLYNOMIALS
161
Proof. Assume first that p(x) is reducible and primitive in Z[x]. We claim
that p(x) is reducible in Q[x]. Indeed, in this case, p(x) = a(x)b(x), with
a(x), b(x) ∈ Z[x]. Note that if any of the polynomials a(x) or b(x) is
constant then it must be invertible, i.e., ±1, since p(x) is primitive. We
conclude that a(x) and b(x) are not constant, hence not invertible in Q[x].
Therefore, the factorization p(x) = a(x)b(x) is non-trivial in Q[x], so p(x)
is reducible in Q[x], as claimed.
If p(x) is not primitive, it is obvious that it is reducible in Z[x]. Therefore, it remains to show that if p(x) is reducible in Q[x] it is also reducible in Z[x]. In this case, p(x) = a(x)b(x), where a(x), b(x) ∈ Q[x]
are non constant. According to Lemma 3.5.10, there exist polynomials
a′ (x), b′ (x) ∈ Z[x] such that p(x) = a′ (x)b′ (x), and a′ (x), b′ (x) are not
constant. Hence, p(x) is reducible in Z[x].
The next theorem and the previous lemma allows one to easily obtain
examples of irreducible polynomials in Z[x] and Q[x].
Theorem 3.5.12 (Eisenstein’s Criterion). Let a(x) = a0 +a1 x+· · ·+an xn ∈
Z[x] be a polynomial of degree n. If there exists a prime p ∈ Z such that
ak ≡ 0 (mod p) for 0 ≤ k < n, an 6≡ 0 (mod p) and a0 6≡ 0 (mod p2 ) then
a(x) is irreducible in Q[x].
Proof. Assume that in Q[x] we have a factorization
a(x) = b(x)c(x) = (b0 + b1 x + . . . )(c0 + c1 x + . . . ).
According to Lemma 3.5.10, we may assume without loss of generality that
b(x), c(x) ∈ Z[x]. If b0 ≡ c0 ≡ 0 (mod p), then a0 = b0 c0 ≡ 0 (mod p2 ),
which contradicts the assumption that a0 6≡ 0 (mod p2 ). Hence we can
assume, e.g., that c0 6≡ 0 (mod p).
It is obvious that if p|b(x) then p|a(x), which is impossible because
an 6≡ 0 (mod p). We conclude that the set {k ≥ 0 : bk 6≡ 0 (mod p)} is
non-empty. Let us denote by m its minimum.
Notice that
am =
m
X
i=0
bi cm−i ≡ bm c0 6≡ 0
(mod p),
so we must have m = n, since an is the only coefficient of a(x) not divisible
by p.
Therefore, deg b(x) ≥ deg a(x), and since a(x) = b(x)c(x), we have
deg b(x) = deg a(x), and c(x) must be constant. This shows that a(x) is
irreducible in Q[x].
162
CHAPTER 3. OTHER EXAMPLES OF RINGS
Examples 3.5.13.
1. If p ∈ Z is a prime, q(x) = xn − p is irreducible in Z[x] and in Q[x].
2. Eisenstein’s Criterion does not apply to the polynomial q(x) = x6 + x3 + 1.
However, noting that
q(x + 1) = (x + 1)6 + (x + 1)3 + 1
= x6 + 6x5 + 15x4 + 21x3 + 18x2 + 9x + 3,
Eisenstein’s Criterion show that q(x + 1) is irreducible in Q[x]. We conclude
that q(x) is irreducible in Q[x]. Since q(x) is monic, it is also irreducible in
Z[x].
The examples and remarks above lead to the following conclusions:
• In any domain, the polynomials x − a are irreducible.
• There are domains for which there exist irreducible polynomials of
degree as large as we wish.
• There are domains for which there are irreducible polynomials only up
to same degree larger or equal to 1.
It remains to show that there are fields where the only irreducible polynomials take the form x − a. We leave it as an exercise to check that if this
is true then the any polynomial of degree > 1 must have a root. For this
reason, we introduce the following:
Definition 3.5.14. A field K is called algebraically closed if every nonconstant polynomial p(x) ∈ K[x] has at least one root in K.
We will not prove the next result, which will only be used in examples
and exercises. One can find a proof in any text of Complex Analysis. We
leave as an exercise to use it to determine the irreducible polynomials em R.
Theorem 3.5.15 (Fundamental Theorem of Algebra). The field C of complex numbers is algebraically closed, i.e., any non-constant complex polynomial has at least one complex root.
Exercises.
1. If p(x) ∈ D[x], p(x) 6= 0, and a ∈ D is a root of p(x), the largest natural
number m such that p(x) is a multiple of (x − a)m is called the multiplicity
of the root a. Show that the sum of the multiplicities of the roots of p(x) is
≤ deg p(x).
3.5. DIVISION OF POLYNOMIALS
163
2. Show that p(x) ∈ A[x] can have more than deg p(x) roots, if A has zero
divisors.
3. Show that x2 + 1 is irreducible in Z3 [x].
4. Determine all the irreducible polynomials p(x) ∈ Z3 [x] with deg p(x) ≤ 2.
5. How many irreducible polynomials are there in Z5 [x] with degree 5?
(Hint: count the reducible and irreducible polynomials in Z5 [x] of degrees
≤ 5.)
6. Show that the following statements are equivalent:
(a) The field K is algebraically closed.
(b) Any polynomial in K[x] of degree ≥ 1 is a product of polynomials of
degree 1.
7. Assume that K is an algebraically closed field and show that if p(x) ∈ K[x]
and deg p(x) = n ≥ 1, then the sum of the multiplicities of the roots of p(x)
is exactly n.
8. Assume that p(x) ∈ R[x] and using the Fundamental Theorem of Algebra
show that:
(a) If c ∈ C is a root of p(x), the complex conjugate of c is also a root of
p(x).
(b) If p(x) is irreducible in R[x] and deg p(x) > 1, then p(x) = ax2 + bx + c
and b2 − 4ac < 0.
9. If K is an algebraically closed field and D is an integral domain and an
algebraic extension de K, show that K = D.9
10. If p(x) = a0 + a1 x + · · · + αn xn ∈ Z[x], and c(p) = gcd(ao , a1 , . . . , αn ), show
that:
(a) p(x) = c(p)q(x) where q(x) is primitive,
(b) If p(x) and q(x) are primitive then p(x)q(x) is primitive.
9
In particular, according to the FundamentalTheorem of Algebra, there is no field L
which is an algebraic extension of C. This gives a complete answer to Hamilton’s problem,
discussed in Chapter ??.
164
3.6
CHAPTER 3. OTHER EXAMPLES OF RINGS
The Ideals of K[x]
As we saw in Chapter ??, the structure of the ideals in Z is specially simple, a
fact which is behind Euclides Algorithm for the computation of the greatest
common divisor and least common multiple of two integers. For a general
integral domainD the structure of the ideals of D[x] can be very complex, as
the example of Z[x] shows. In general, there is no analogue of the Euclides
Algorithm. However, if D = K is a field, then the structure of the ideals in
K[x] is easy to describe, as we shall see now..
We start by remarking that the Division Algorithm, as stated in Theorem
3.5.3, in the case of K[x] can be strengthen as follows:
Theorem 3.6.1. If p(x), d(x) ∈ K[x] and d(x) 6= 0, there exist unique
polynomials q(x) and r(x), such that p(x) = q(x)d(x)+r(x) and deg r(x) <
deg d(x).
This result is similar to the result we have proved in Chapter ?? for the
integers, and so can be used to establish various analogies between the rings
K[x] and Z. In particular:
Theorem 3.6.2. Every ideal in K[x] is principal.
Proof. Assume that I ⊂ K[x] is an ideal. If I = {0}, then I = h0i is a
principal ideal.
Therefore, assume that I 6= {0}, so there exists some non-zero polynomial p(x) ∈ I, and consider the set
N = {n ∈ N0 : ∃p(x) ∈ I, n = deg p(x)} ⊂ N0 .
The set N is non-empty, so it must have a minimum. Let m(x) ∈ I be such
that deg m(x) = min N . Then
(3.6.1)
r(x) ∈ I and r(x) 6= 0
=⇒
deg m(x) ≤ deg r(x).
Since m(x) ∈ I, obviously hm(x)i ⊂ I. On the other hand, if p(x) ∈ I
the Division Algorithm gives p(x) = m(x)d(x) + r(x), where deg r(x) <
deg m(x). Since I is an ideal, it follows that
r(x) = p(x) − m(x)d(x) ∈ I,
and we conclude from (3.6.1) that r(x) = 0 (otherwise we would have
deg r(x) < deg m(x), a contradiction). Hence, p(x) ∈ hm(x)i and I =
hm(x)i.
3.6. THE IDEALS OF K[X]
165
In Chapter ?? we have used the fact that all ideals in Z are principal to
construct Euclides Algorithm, which allows one to find the greatest common
divisor and the least common multiple of any two integers. We will now
apply the same set of ideas to the ring K[x].
We start with the following proposition, whose proof is left for the exercises:
Proposition 3.6.3. Let I = hp(x)i and J = hq(x)i be ideals in K[x]. Then:
(a) I ⊂ J if and only if q(x)|p(x);
(b) I is maximal if and only if p(x) is irreducible;
(c) If I = J and p(x) and q(x) are monic or zero, then p(x) = q(x).
If p(x), q(x) ∈ K[x], then I = hp(x), q(x)i is the ideal in K[x], given by:
I = {a(x)p(x) + b(x)q(x) : a(x), b(x) ∈ K[x]}.
According to Theorem 3.6.2, this ideal is principal. Therefore, there exists
a polynomial d(x) ∈ K[x] such that hd(x)i = hp(x), q(x)i. Note that d(x)
satisfies:
• d(x)|p(x) and d(x)|q(x);
• there exist polynomials a(x) and b(x) such that d(x) = a(x)p(x) +
b(x)q(x);
• if c(x)|p(x) and c(x)|q(x), then c(x)|d(x).
In other words, d(x) is a common divisor of p(x) and q(x), and it is a
multiple of any other common divisor of these two polynomials.
Similarly, hp(x)i∩hq(x)i is a principal ideal, so there exists m(x) ∈ K[x]
such that hm(x)i = hp(x)i ∩ hq(x)i. In this case, we have:
• p(x)|m(x) and q(x)|m(x);
• if p(x)|n(x) and q(x)|n(x), then m(x)|n(x).
Hence, m(x) is a common multiple of p(x) and q(x), and is a divisor of any
other polynomial which is a common multiple of these two polynomials.
Finally, note that according to Proposition 3.6.3 (c), if p(x) and q(x) are
monic polynomials or zero and hp(x)i = hq(x)i, then p(x) = q(x). Hence
we can introduce:
166
CHAPTER 3. OTHER EXAMPLES OF RINGS
Definition 3.6.4. Let p(x), q(x) ∈ K[x].
(i) If hd(x)i = hp(x), q(x)i, with d(x) monic or zero, then d(x) is called
the greatest common divisor of p(x) and q(x), abbreviated to
d(x) = gcd(p(x), q(x)).
(ii) If hm(x)i = hp(x)i ∩ hq(x)i, with m(x) monic or zero, then m(x) is
called the least common multiple of p(x) and q(x), abbreviated
to m(x) = lcm(p(x), q(x)).
Similarly to the integers, notice that
p(x) = q(x)a(x) + r(x)
=⇒
hp(x), q(x)i = hq(x), r(x)i,
so Euclides Algorithm remains valid in K[x]. We illustrate it with an example in Z5 [x].
Example 3.6.5.
In order to find the greatest common divisor of p(x) = x4 + x3 + 2x2 + x + 1
and q(x) = x3 + 3x2 + x + 3 in Z5 [x], we proceed as follows:
1. Divide p(x) by q(x), so one obtains
x4 + x3 + 2x2 + x + 1 = (x3 + 3x2 + x + 3)(x + 3) + 2x2 + 2.
Hence:
hx4 + x3 + 2x2 + x + 1, x3 + 3x2 + x + 3i = hx3 + 3x2 + x + 3, 2x2 + 2i.
2. Then divide x3 + 3x2 + x + 3 by 2x2 + 2, leading to
x3 + 3x2 + x + 3 = (2x2 + 2)(3x + 4).
Hence:
hx3 + 3x2 + x + 3, 2x2 + 2i = h2x2 + 2i = hx2 + 1i.
3. One concludes that
gcd(x4 + x3 + 2x2 + x + 1, x3 + 3x2 + x + 3) = x2 + 1.
The exact some argument that we used for the integers, also leads to the
following result:
Lemma 3.6.6. If p(x), q(x) ∈ K[x] are monic polynomials, then
gcd(p(x), q(x)) lcm(p(x), q(x)) = p(x)q(x).
3.6. THE IDEALS OF K[X]
167
Example 3.6.7.
We saw above that the greatest common divisor of p(x) = x4 +x3 +2x2 +x+1
and q(x) = x3 + 3x2 + x + 3 em Z5 [x] is d(x) = x2 + 1. Hence, we conclude
that the least common multiple of these polynomials is the polynomial
(x4 + x3 + 2x2 + x + 1)(x3 + 3x2 + x + 3)
x2 + 1
5
4
2
= x + 4x + 2x + 4x + 3.
m(x) =
Exercises.
1. Prove Theorem 3.6.1.
2. Prove Proposition 3.6.3.
3. Let p(x), q(x) ∈ K[x]. Show that
I = hp(x), q(x)i = {a(x)p(x) + b(x)q(x) : a(x), b(x) ∈ K[x]}.
4. Let p(x), q(x) ∈ K[x]. Show that if hd(x)i = hp(x), q(x)i, then:
(a) There exist a(x), b(x) ∈ K[x] such that d(x) = a(x)p(x) + b(x)q(x).
(b) d(x)|p(x) and d(x)|q(x).
(c) If c(x)|p(x) and c(x)|q(x), then c(x)|d(x), hence deg c(x) ≤ deg d(x).
5. Show that the following generalization of Euclides Lemma holds: if p(x), q1 (x),
q2 (x) ∈ K[x], p(x) is irreducible and p(x)|q1 (x)q2 (x), then p(x)|q1 (x) or
p(x)|q2 (x).
6. If p(x), q(x) ∈ K[x], shows that
p(x) = q(x)a(x) + r(x)
=⇒
hp(x), q(x)i = hq(x), r(x)i.
7. Let d(x) be the greatest common divisor of x4 + x3 + 2x2 + x + 1 and
x3 + 3x2 + x + 3 in Z5 [x]. Determine a(x) and b(x) in Z5 [x] such that
d(x) = a(x)(x4 + x3 + 2x2 + x + 1) + b(x)(x3 + 3x2 + x + 3).
8. Let p(x), q(x) ∈ K[x], d(x) = gcd(p(x), q(x)) and m(x) = lcm(p(x), q(x)).
Show that:
(a) If p(x)|r(x) and q(x)|r(x), then p(x)q(x)|r(x)d(x).
(b) There exists k ∈ K such that kd(x)m(x) = p(x)q(x).
168
CHAPTER 3. OTHER EXAMPLES OF RINGS
9. Let q(x) ∈ K[x] be non-zero and non-invertible. Prove the following statements (recall the Fundamental Theorem of Arithmetic):
(a) There exist irreducible polynomials p(x) such that p(x)|q(x).
(b) There exist irreducible monic polynomials m1 (x), . . . , mk (x) ∈ K[x] and
Qk
a0 ∈ K such that q(x) = a0 i=1 mi (x).
(c) The decomposition in (b) is unique up to the order of the factors.
10. Show that the ideal hx, yi in K[x, y] is not principal.
11. Assume that the ring A is an extension of the field K, and let a ∈ A be
algebraic over K. If J = {p(x) ∈ K[x] : p(a) = 0}, show that:
(a) J = hm(x)i is an ideal of K[x]
10
.
(b) If A does not have zero divisors, then m(x) is irreducible, and K[a] =
K(a) is a field.
(c) If A does not have zero divisors, and B is the set of all elements in A
which are algebraic over K, then B is a field and it is the largest algebraic
extension of K in A.
√
√
12. Show that Q[ 3 2] and Q[ 4 2] are algebraic extensions of Q and subfields of
R. What are their dimensions, when viewed as vector spaces over Q?
13. Let A ⊂ R be the set of all real numbers which are algebraic over Q. Show
that:
(a) A is a countable field.
(b) A is an algebraic extension of Q.
(c) A has infinite dimension, when viewed as a vector space over Q.
3.7
Divisibility and Prime Factorization
We saw before in our studies of the ring of integers Z and of the ring of
polynomials K[x] that in these rings any non-zero element which is noninvertible has an essential unique factorization has a product irreducible or
prime elements. It is natural to wonder if this fact holds in other rings. We
will study in this and in the next section how the notions of divisibility and
factorization can be extended to any integral domain D.
10
One say that m(x) is the minimal polynomial of the element a.
3.7. DIVISIBILITY AND PRIME FACTORIZATION
169
Recall that if a, b ∈ D we say that a divides (or is a factor of) b if
there exists d ∈ D such that b = da. In this case we write “a|b”11 . The
following notions are partially inspired by the usual concepts we have seen
in Z and K[x], and they play a basic role.
Definition 3.7.1. Let D be an integral domain, a, b, p ∈ D, and p a noninvertible element. We say that:
(i) a is associate to b, if a|b and b|a;
(ii) p is prime, if p 6= 0 and p|ab ⇒ p|a or p|b;
(iii) p is irreducible, if p = ab ⇒ a is invertible or b is invertible.
Notice that in this more general context, Euclides Lemma becomes a
definition of a prime element, and the irreducible elements are those elements that only admit trivial factorizations. We will see below that in the
rings Z and K[x] the prime elements in the sense of the above definition
are exactly the irreducible elements. It is only for historical reasons that
we use the terms “prime” in Z and “irreducible” in K[x]. For a general
integral domain, the notion of prime and irreducible may not coincide, but
we will identify many classes of domains where these notions are equivalent,
and where one can establish appropriate generalizations of the Fundamental
Theorem of Arithmetics, concerning the existence and uniqueness of factorizations into irreducible elements.
The binary relation associate to is actually an equivalence relation: it is
easy to check that a is associate to b if and only if a = ub for some invertible
element u. Hence, if a, b ∈ D are associates, we write a ∼ b. In Factorization
Theory, it is common to call the invertible elements units, and we will follow
this practice. Notice that the units are precisely those elements which are
associate to the identity of D, and that given p, q ∈ D, if p ∼ q, then p is
a prime (respectively irreducible) if and only if q is a prime (respectively,
irreducible). In particular, if p is a prime then all the elements obtained
from p by multiplying by units are also prime elements (and this is not the
usual convention in Z).
We start by observing the following general fact:
Lemma 3.7.2. In any integral domain, a prime element is always irreducible.
11
We also use the symbol “a ∤ b” to say that a does not divide b.
170
CHAPTER 3. OTHER EXAMPLES OF RINGS
Proof. If p ∈ D is prime and p = ab, then either p|a or p|b. If, for example,
p|a, then there exists x ∈ D such that a = px, and we conclude that
p = ab
=⇒
p = pxb, and since p 6= 0,
=⇒
1 = xb,
=⇒
b is invertible.
Similarly, if p|b, we conclude that a is invertible.
Examples 3.7.3.
1. In Z the units are {1, −1}. An element p ∈ Z is irreducible in the sense of
Definition 3.7.1 if and only if the natural number |p| is prime in the sense of
Chapter ??. Also, if p|n then |p||n, so we see that Euclides Lemma shows that
if p is irreducible then
p|ab =⇒ |p||ab =⇒ |p||a or |p||b =⇒ p|a or p|b.
We conclude that in Z irreducible and prime elements, in the sense of Definition 3.7.1, coincide.
2. The units in K[x] are the zero degree polynomials, while an irreducible polynomial in the sense of Definition 3.7.1 is just an irreducible polynomial in the
usual sense. The generalization of Euclides Lemma given in Exercise 5 shows
that if p(x) ∈ K[x] is irreducible then it is prime. We conclude again that in
K[x] irreducible and prime elements, in the sense of Definition 3.7.1, coincide.
3. The units in the ring of Gaussian integers Z[i] are {1, −1, i, −i}. The element
2 ∈ Z[i] is not irreducible in Z[i] (although it is irreducible in Z) since we have:
2 = (1 + i)(1 − i), with 1 ± i non invertible.
We claim that 1 + i and 1 − i are irreducible. In order to show this, we observe
that the function N : Z[i] → N0 defined by
N (a + bi) = |a + bi|2 = a2 + b2 .
satisfy the following two properties:
(a) if z1 , z2 ∈ Z[i], then N (z1 z2 ) = N (z1 )N (z2 );
(b) N (z) = 1 if and only if z is invertible.
To check, for example, that 1 + i is irreducible, assume that 1 + i = z1 z2 in
Z[i]. By property (a), we find that
2 = N (1 + i) = N (z1 z2 ) = N (z1 )N (z2 ).
3.7. DIVISIBILITY AND PRIME FACTORIZATION
171
Since 2 is irreducible in Z, we must have N (z1 ) = 1 or N (z2 ) = 1. By property
(b), we conclude that z1 or z2 are invertible, and 1 + i is irreducible in Z[i].
Observe that −(1 + i) = −1 − i, i(1 + i) = −1 + i and −i(1 + i) are also
irreducible. We will show later that in Z[i] any irreducible element is also a
prime, and how one can determine all the irreducible in Z[i].
4. There are integral domains where there are irreducible elements which are
not prime. We have already observed in Chapter ?? that in the ring of even
integers 2Z we have two distinct factorizations
36 = 6 × 6 = 2 × 18,
so 2 divides 6×6 but it does not divide 6 (note however, that we assume integral
domains
√ to have a unit, which is not the case with 2Z). Consider instead the
ring Z[ −5], which is an integral domain. In this ring, we have:
√
√
√
√
9 = 3 · 3 = (2 + −5)(2 − −5) =⇒ 3|(2 + −5)(2 − −5).
We leave it to the
√ exercises to check
√ that 3 is irreducible
√ but is not a factor
neither of (2 + −5) nor of (2 − −5). Hence 3 ∈ Z[ −5] is irreducible but
not a prime.
Using these elementary notions we can introduce domains where factorizations exist:
Definition 3.7.4. A domain D is called a unique factorization domain
(abbreviated UFD) if: satisfeitas:
(i) If 0 6= d ∈ D is non-invertible there exists irreducible elements p1 , · · · , pn
such that
(3.7.1)
d=
n
Y
pi .
i=1
Q
Q
′
(ii) If p1 , · · · pn , and p′1 · · · p′m are irreducible, and ni=1 pi = m
i=1 pi , then
n = m, and there exists a permutation π ∈ Sn such that p′i ∼ pπ(i) .
In other words, in a UFD every non-zero and non-invertible element has
a factorization into a product of irreducible elements and this factorization is
unique up to the order of the factors and multiplication of eachQfactor by an
appropriate unit (note that if p′i = ui pπ(i) , then we must have ni=1 ui = 1).
Note that the factorization 3.7.1 can also be expressed in powers of de
irreducible elements, but in this case we may need to include a unit in the
factorization:
(3.7.2)
d = u · pe11 · · · penn .
172
CHAPTER 3. OTHER EXAMPLES OF RINGS
Examples 3.7.5.
1. The ring Z is a UFD: it follows from the Fundamental Theorem of Arithmetics that every integer can be factored in the form
m = p1 · · · pm ,
where each pi is irreducible (i.e., |pi | is a prime). This factorization is unique
up to the order of the factor and multiplication by ±1. For example, we have:
−15 = (3) · (−3) = (−1)32 ,
where 3 and −3 are both irreducible.
2. By Exercise 9 in Section 3.6, given a polynomial q(x) ∈ K[x], there exist
irreducible polynomials p1 (x), . . . , pn (x) ∈ K[x] such that
q(x) =
n
Y
pi (x).
i=1
This factorization is unique up to the order of the factor and multiplication by
units. Hence, K[x] is a UFD. For example, in Q[x] we have
2x2 + 4x + 2 = (2x + 2)(x + 1) = 2(x + 1)2 ,
where 2x + 2 and x + 1 are both irreducible.
3. We will see below that the ring Z[i] of Gaussian integers is a UFD.
It is useful to observe that all the elementary properties concerning factors and divisibility introduced above can be expressed in terms of ideals.
For that, we will say that an ideal 0 ( P ( D is a prime ideal if, for any
ideals I, J ⊂ D,
IJ ⊂ P =⇒ I ⊂ P or J ⊂ P.
Proposition 3.7.6. Let a, b, p, u ∈ D. Then:
(i) a|b if and only if hai ⊃ hbi;
(ii) a ∼ b if and only if hai = hbi;
(iii) u is a unit if and only if hui = D;
(iv) p is a prime if and only if hpi is a prime ideal;
(v) p is irreducible if and only if hpi is maximal among all principal ideals
in D.
3.7. DIVISIBILITY AND PRIME FACTORIZATION
173
Proof. We leave the proof of (i), (ii) and (iii) as an easy exercise.
In order to show that (iv) holds, let p ∈ D be a prime, and I, J ⊂ D ideals
such that IJ ⊂ hpi. If I 6⊂ hpi, then there exists a ∈ I such that a 6∈ hpi,
i.e., such that p ∤ a (by (i)). Hence, for all b ∈ J, we have ab ∈ hpi ⇔ p|ab
and p ∤ a. Since p is a prime, we must have p|b, i.e., b ∈ hpi (by (i)). We
conclude that J ⊂ hpi. This proves that hpi is a prime ideal.
For the converse, assume that hpi is a prime ideal p|ab. Then
habi = haihbi ⊂ hpi
=⇒
hai ⊂ hpi or hbi ⊂ hpi.
By (i), this means that either p|a or p|b, so p is a prime.
In order to show that (v) holds, consider first an irreducible element
p ∈ D and and let hpi ⊂ hai, for some a ∈ D. Then p = ax hence either a is
a unit or x is a unit. If a is a unit, then by (iii) hai = D. If x is a unit, then
p ∼ a and, by (ii), hpi = hai. Hence, hpi is maximal among all principal
ideals of D.
Conversely, assume that hpi is maximal among all principal ideals of D.
If p = ab, since hpi ⊂ hai, we see that either hai = D and a is invertible (by
(iii)), or hpi = hai and a ∼ p (by (ii)). In the later case, there exists a unit
u ∈ D such that a = pu, so that
p = ab
=⇒
p = pub,
=⇒
1 = ub
=⇒ b is invertible.
Therefore, either a or b is invertible, which shows that p is irreducible.
The previous proposition suggests one maybe able to construct a Factorization Theory based in the ideals of D rather than the elements of D. This
is indeed true and useful, and the name “ideal” derives historically from this
idea (see Chapter 7). For now, we consider only factorization of elements of
D, but we use ideals to give a characterization of a UFD.
Given a domain D we will say that it satisfies the ascending chain
condition on principal ideals if every chain of principal ideals:
hd1 i ⊂ hd2 i ⊂ · · · ⊂ hdn i ⊂ . . .
stabilizes, i.e., after some natural number n0 ∈ N we have
hdn0 i = hdn0 +1 i = . . . .
This property turns out to be crucial for factorizations to exist:
174
CHAPTER 3. OTHER EXAMPLES OF RINGS
Theorem 3.7.7. Let D be an integral domain. Then D is a UFD if and
only if the following two conditions hold:
(i) every irreducible element is prime;
(ii) D satisfies the ascending chain condition on principal ideals.
Proof. Let D be a UFD and p ∈ D an irreducible element. If p|ab, then
there exist x ∈ D such that px = ab, where x, a and b admit factorizations
of the type (3.7.1):
x = p1 · · · pr , a = p′1 · · · p′s , b = p′′1 · · · p′′t
with pi , p′j , p′′k irreducible in D. Hence:
p · p1 · · · pr = p′1 · · · p′s · p′′1 · · · p′′t ,
and, by uniqueness of the factorization, we must have p ∼ p′i or p ∼ p′′j . In
the first case, p|a and and in the second case p|b. Hence, p is a prime.
On the other hand, consider an ascending chain condition of principal
ideals
hd1 i ⊂ hd2 i ⊂ · · · ⊂ hdn i ⊂ . . .
We can assume, without loss of generality (why?), that d1 6= 0 and di is
non-invertible, ∀i. Since di |d1 , for all i, the factorizations of d1 and di take
the form
d1 = p1 · · · pr , di = p′1 · · · p′s .
The irreducible factors of di are factors of d1 , so s ≤ r. In particular, the
chain cannot contain more than r distinct ideals, so there exists a natural
n0 such that hdn0 i = hdk i, for all k ≥ n0 . This completes half of the proof.
Conversely, let D be an integral domain where conditions (i) and (ii) in
the statement hold. Let d ∈ D be a non-zero and non-invertible element.
Let us assume that d does not admit a factorization, and see that this leads
to a contradiction. By induction, we construct as follows a sequence {dn }n∈N
where d1 = d, dn+1 |dn , dn 6∼ dn+1 and none of the elements dn admits a
factorization into a product of irreducible elements.
The case n = 1 is trivial, so assume that n > 0, and we have already
constructed elements d1 , · · · , dn . Since dn cannot be irreducible, we have
dn = an bn , where an and bn are non-invertible. Obviously, if both an and bn
can be factored into a product of irreducible elements, then dn can also be
factored. So assume, without loss of generality that bn cannot be factored.
3.7. DIVISIBILITY AND PRIME FACTORIZATION
175
Then we set dn+1 = bn , and observe that obviously dn+1 |dn , and dn 6∼ dn+1 .
This shows that the sequence {dn }n∈N exists.
Now, the principal ideals generated by the dn ’s satisfy
hd1 i ( hd2 i ( · · · ( hdn i ( . . .
but this contradicts the ascending chain condition. Hence, we conclude that
every non-zero and non-invertible element can be factored into irreducible
elements.
In order to verify the uniqueness of factorization, assume that
p1 · · · pn = p′1 · · · p′m ,
where, say, n ≤ m. Since the pi , p′j are irreducible, by (i) they are also
prime. Since pn |p′1 · · · p′m , we have that pn is associate to some p′j , which
we denote by p′π(n) . Removing these two elements from the products, and
repeating the argument, we obtain by exhaustion that n = m and pi ∼ p′π(i)
for some permutation π ∈ Sn .
This theorem justifies the use of the term “prime factorization” to denote
factorizations of the type (3.7.1) or (3.7.2).
As we have already observed, the crucial property of the rings Z and
K[x], in what concerns factorization, is that all its ideals are principal.
Definition 3.7.8. An integral domain D is called a principal ideal domain (abbreviated PID) if all its ideals are principal, i.e., of the form hdi.
As a consequence of Theorem 3.7.7 we have:
Corollary 3.7.9. Every PID is a UFD.
Proof. First we show that the irreducible elements in D are prime. For that,
assume that p ∈ D is irreducible and that p|ab. The ideal ha, pi is principal,
so there exists d ∈ D such that hdi = ha, pi. Since hpi ⊂ hdi ⊂ D and hpi is
maximal (Proposition 3.7.6), we have
hpi = hdi or hdi = D.
In the first case, p ∼ d and since d|a, we conclude that p|a. In the second
case there exist r, s ∈ D such that 1 = ra + sp, hence
b = 1 · b = (ra + sp)b = rab + spb.
176
CHAPTER 3. OTHER EXAMPLES OF RINGS
Since p divides each of the terms in the right hand side, we conclude that
p|b. In any case, p|a or p|b, so p is a prime.
Last we verify the ascending chain condition on principal ideals. Consider the chain:
hd1 i ⊂ hd2 i ⊂ · · · ⊂ hdn i ⊂ . . .
Note that ∪∞
i=1 hdi i is an ideal that, by assumption, must be principal:
∪∞
hd
i
=
hd
0 i. Obviously, there exists a natural number n0 such that
i=1 i
d0 ∈ hdn0 i, and it follows that:
hdn0 i = hdn0 +1 i = . . . .
We have verified both conditions in Theorem 3.7.7, so D is a UFD.
Examples 3.7.10.
1. The ring of Gaussian integers is a UFD, since we know that Z[i] is a PID,
by Exercise ?? in Section 2.6.
2. We will see in the next section that if D is a UFD then D[x] is also a UFD.
In particular, Z[x] is a UFD, although it is not a PID.
In geral, the problem of determining if a given integral domain is a UFD
maybe hard to solve. For example, it is known that the quadratic domains
√
Z[ m], for m < 0, are UFD if and only if m = −1, −2, −3, −7 and −11, but
this is a very non-trivial result. For m > 0 is it not known which domains
√
Z[ m] are UFD. Also, even if one knows that D is a UFD, it maybe very
hard to determine its prime elements. We will illustrate this problem in the
case of the Gaussian integers.
In order to simplify the exposition, we will call Euclidean primes the
prime natural numbers in Z, and Gaussian primes the prime Gaussian integers in Z[i].
Proposition 3.7.11. Let p ∈ Z be an Euclidean prime.
(i) If p = n2 + m2 has solutions n, m ∈ Z, then z = n + mi is a Gaussian
prime;
(ii) If p = n2 + m2 has no solutions in Z, then p is a Gaussian prime;
In particular, z ∈ Z[i] is a Gaussian prime if and only if z ∼ p, where p ∈ Z
is a Gaussian prime, or z ∼ n + mi, where n2 + m2 = p is an Euclidean
prime.
3.7. DIVISIBILITY AND PRIME FACTORIZATION
177
Proof. Let N (n + mi) = n2 + m2 denote the norm of the Gaussian integer
n + mi. To show that (i) holds, assume that p = n2 + m2 is an Euclidean
prime, and consider a Gaussian factorization:
n + mi = zw.
Then we have:
p = N (n + im) = N (z)N (w).
Hence, either N (z) = 1 and z is invertible, or N (w) = 1 and w is invertible.
Hence, we conclude that n + mi is a Gaussian prime.
Next, to show that (ii) holds, observe that if p is not a Gaussian prime
then there exist non-invertible Gaussian integers z and w such that p = zw.
Then p2 = N (p) = N (z)N (w), and since N (z) and N (w) are integers 6= ±1,
one must have p = N (z) = N (w). Hence the equation p = n2 + m2 has
solutions.
Finally, let z = a + bi be a Gaussian prime, so N (z) > 1. If p is any
prime factor (in Z) of N (z), there exists w ∈ Z[i] such that
N (z) = (a + bi)(a − bi) = pw.
Since p ∈ Z, we have that p|a + bi = z if and only if p|a − bi. Hence, one of
the following alternative must hold:
(a) p is also a Gaussian prime: Then either p|a + bi or p|a − bi, i.e., p is
a factor of z. Hence, z ∼ p, since z and p are both Gaussian primes;
(b) p is not a Gaussian prime: We conclude from (ii) that p = n2 + m2
has solutions, and we then have
N (z) = (a + bi)(a − bi) = pw = (n + mi)(n − mi)w.
From (i), n + mi is a Gaussian prime, and we conclude that either
a + bi ∼ n + mi or a + bi ∼ n − mi.
Examples 3.7.12.
1. Obviously, the equation 3 = n2 + m2 has no solutions in Z, so 3 is both an
Euclidean prime and a Gaussian prime.
2. Since 5 = 12 + 22 , it follows that 5 is not a Gaussian prime, but the Gaussian
integers ±1 ± 2i and ±2 ± i are all Gaussian primes.
178
CHAPTER 3. OTHER EXAMPLES OF RINGS
As we have just seen, the identification of Gaussian primes relies on the
solutions of p = n2 + m2 , where p is an Euclidean prime. Fermat found a
very elegant result concerning the values of p for which this equation has
solutions:
Theorem 3.7.13 (Fermat). Let p be an Euclidean prime. Then the following statements are equivalent:
(i) The equation p = n2 + m2 has solutions in Z,
(ii) p 6≡ 3 (mod 4),
(iii) The equation x2 = −1 has solutions in Zp .
Proof. We leave it as exercise to check that “(i) =⇒ (ii)”.
To show that “(ii) =⇒ (iii)”, notice that we can assume p =
6 2, since
2
x = 1 is a solution of x = −1 = 1. Let 2 6= p ≡ 1 (mod 4). If x ∈ Zp set
C(x) = {x, −x, x−1 , −x−1 } and define:
x ≈ y ⇐⇒ C(x) = C(y).
One checks easily that “≈” is an equivalence relation in Z∗p , and that x 6= −x
for any x ∈ Z∗p (because p 6= 2). The equivalence class of x is just the set
C(x). Denoting by #(C(x)) the number of elements of the class C(x), notice
that the possible values of #(C(x)) are:
#(C(x)) = 2, if x = x−1 , i.e., if x = ±1,
= 2, if x = −x−1 , i.e., if x and x−1 are the solutions of x2 = −1,
= 4, if x is not a solution of x2 = ±1 (or, equivalently, of x4 = 1).
The equivalence classes of ≈ give a partition of Z∗p , which has p − 1 elements.
We have just saw that there exists at least one class with 2 elements, namely
C(1) = {1, −1}. Possibly, there exists another class with 2 elements, formed
by the roots of x2 + 1, if this polynomial has roots in Z∗p . If n is the number
of equivalence classes with 4 elements, we conclude that the p − 1 elements
of Z∗p are assembled as follows:
• if there are no solutions of x2 = −1, then p − 1 = 2 + 4n ⇔ p = 4n + 3,
since there exists only one equivalence class with 2 elements, and the
remaining n classes have 4 elements each, or
• there exist solutions of x2 = −1 and then p − 1 = 2 + 2 + 4n ⇔
p = 4(n + 1) + 1, because there are 2 equivalence classes each with 2
elements, and the remaining n classes have 4 elements.
3.7. DIVISIBILITY AND PRIME FACTORIZATION
179
Since p 6= 2 is a prime, we have that p 6≡ 3 (mod 4) =⇒ p ≡ 1 (mod 4) and
we conclude that the equation x2 = −1 has solutions in Z∗p .
Finally, to show that “(iii) =⇒ (i)” we note that x2 = −1 has solutions in
Zp if and only if there exists an integer k such that p|1+k2 = (1+ki)(1−ki).
If p is a Gaussian prime then p|1 + ki or p|1 − ki, a contradiction, since p 6 |1.
Hence p is not a Gaussian prime, and according to Proposition 3.7.11, the
equation p = n2 + m2 must have solutions.
Examples 3.7.14.
1. The Euclidean primes 7, 11 and 19 are also Gaussian primes.
2. 1973 is an Euclidean prime which is not a Gaussian prime because p ≡ 1
(mod 4). It follows that the equation 1973 = n2 + m2 has solutions n, m ∈ Z,
which is not an immediate fact (check that n = 23 and m = 38 is a solution).
An important property of a UFD is that any two elements have a greatest
common divisor and a least common multiple. In order to discuss this,
we need a more abstract definition of greatest common divisor and least
common multiple, valid for any integral domain.
Definition 3.7.15. Let D be an integral domain, and a1 , . . . , an ∈ D.
(i) d ∈ D is called a greatest common divisor of a1 , . . . , an if d|ai
(i = 1, . . . , n) and for all b ∈ D such that b|ai (i = 1, . . . , n) we have
that b|d;
(ii) m ∈ D is called a least common multiple of a1 , . . . , an if ai |m
(i = 1, . . . , n) and for all b ∈ D such that ai |b (i = 1, . . . , n) we have
that m|b.
Note the lack of uniqueness in these notions: if d is a greatest common
divisor, and c ∼ d, then c is also a greatest common divisor, and the same
holds for a least common multiple. In Z and in K[x], we could ensure
uniqueness by requiring that d and m be non negative in Z, and monic in
K[x]. Apart from this detail, the definitions above are compatible with the
definitions introduced in Chapters 2 and 3.
It is not at all obvious that given elements a1 , . . . , an ∈ D a greatest common divisor and/or a least common multiple exist. However, if a1 , . . . , an
have at least one greatest common divisor (respectively, least common multiple) we will denote by gcd(a1 , . . . , an ) (respectively, lcm(a1 , . . . , an )) any
such element. We leave the proof of the following elementary properties as
an exercise:
180
CHAPTER 3. OTHER EXAMPLES OF RINGS
Lemma 3.7.16. Let a, b, c ∈ D. Then:
(i) gcd(a, gcd(b, c)) ∼ gcd(gcd(a, b), c) ∼ gcd(a, b, c);
(ii) gcd(ca, cb) ∼ c gcd(a, b).
Similarly, with gcd replaced by lcm.
When D is a UFD the gcd(a1 , . . . , an ) and lcm(a1 , . . . , an ) always exist:
Proposition 3.7.17. Let D be a UFD, and a, b ∈ D. Then:
(i) gcd(a, b), lcm(a, b) exist and ab ∼ gcd(a, b) lcm(a, b);
(ii) If D is a PID, then gcd(a, b) = ra + sb for some r, s ∈ D.
Proof. (i) If a = 0 then gcd(a, b) = b and lcm(a, b) = 0. If a is invertible,
then gcd(a, b) = a and lcm(a, b) = b. Assume then that both a and b are
non-zero and non-invertible. The prime factorizations of a and b can be
written as
a = u · pn1 a1 · · · pns as , and b = u′ · p1nb1 · · · pns bs ,
where the pi are non-associate, nai ≥ 0 and nbi ≥ 0. Setting for i = 1, . . . , s
mi = min{nai , nbi } and Mi = max{nai , nbi }, we see immediately that we
can choose:
M1
ms
Ms
1
gcd(a, b) = pm
1 · · · ps , and lcm(a, b) = p1 · · · ps .
(ii) Let D be a PID. If a, b ∈ D, by (i) there exists gcd(a, b). We leave
as an exercise to check that since D is a PID one has: ha, bi = hgcd(a, b)i.
This means that there exist r, s ∈ D such that gcd(a, b) = ra + sb.
Exercises.
1. Verify itens (i)-(iii) in Proposition 3.7.6.
√
√
2. In the ring Z[ −5] show that the elements 3 and 2 ± −5 are irreducible.
√
3. Show that the ring Z[ 10] is not a UFD.
4. If p is a Euclidean prime and there are integers n and m such that p = n2 +m2 ,
show that p 6≡ 3 (mod 4).
5. Show that if p is a Euclidean prime and n, m, a, b ∈ Z satisfy p = n2 + m2 =
a2 + b2 then {n2 , m2 } = {a2 , b2 }.
3.7. DIVISIBILITY AND PRIME FACTORIZATION
181
6. An integral domain D is called an Euclidean domain if there exists a map
δ : D → N with the following property: ∀a, b ∈ D − {0} there exist q, r ∈ D
such that
a = qb + r, with r = 0 or δ(r) < δ(b).
Such a map is called an Euclidean function . Show that:
(a) Z and K[x] are Euclidean domains;
(b) The ring of Gaussian integers Z[i] is an Euclidean domain;
(c) A Euclidean domain D always admits an Euclidean function δ̃ : D → N
satisfying δ̃(a) ≤ δ̃(ab) for all a, b ∈ D − {0} and δ̃(u) = δ̃(1) if and only
if u is a unit;
Hint: Define a new Euclidean function δ̃(a) = min{δ(xa) : x ∈ D −{0}}.
(d) A Euclidean domain is a UFD (without using neither Theorem 3.7.7 or
its Corollary);
(e) An Euclidean domain is a PID.
√
One can show that the algebraic integers 12 in the field Q( −19) form a PID
which is not an Euclidean domain, so the following inclusions are strict:
Euclidean Domains ( PID ( UFD ( Integral Domains
7. Let D be an integral domain.
(a) Show that if D satisfies the ascending chain condition on principal ideals,
then every non-zero and non-invertible element of D has a factorization
into irreducible elements (possibly, non unique).
(b) Give an example of an integral domain D which does not satisfy the
ascending chain condition on principal ideals.
8. Prove Lemma 3.7.16.
9. Show that if D is a PID then for any a, b ∈ D one has ha, bi = hgcd(a, b)i,
hai ∩ hbi = hlcm(a, b)i and gcd(a, b) lcm(a, b) ∼ ab.
10. Assume that k ∈ N and show that k = n2 + m2 has solutions in Z if and
only if any prime factor p of k with p ≡ 3 (mod 4) satisfies p2N |k. What is
the smallest natural number k for which the equation k = n2 + m2 = s2 + t2
has solutions n, m, s, t such that {n2 , m2 } 6= {s2 , t2 }?
12
If K is a field extension of Q, d ∈ K is called an algebraic integer if d is a root of
some monic polynomial p(x) ∈ Z[x].
182
CHAPTER 3. OTHER EXAMPLES OF RINGS
11. Let P be the set of Euclidean primes and G the set of Gaussian primes.
Show that both P − G and G − P are infinite sets. In other words, show that
there are infinite Euclidean primes of the form p = 4n + 1 and of the form
p = 4n + 3.
Hint: Consider the natural numbers of the form k = (2N !) − 1 and of the
Q
2
N
form k =
+ 4.
i=1 (2i − 1)
3.8
Factorization in D[x]
When K is a field, the ring of polynomials K[x] is a UFD, since it is a
principal ideal domain. For an arbitrary domain D the structure of the
ideals of D[x] can be very complex. In fact, D[x] is a PID if and only if D
is a field, and this explains why some example of integral domains that we
have mentioned before, like Z[x] or K[x, y] = K[y][x], are not PIDs (see
Exercise 10 in Section 3.6). In any case, it is obvious that if D is not a UFD
then D[x] is also not a UFD. In fact, we will see in this section that the ring
D[x] is a UFD if and only if D is a UFD, and this shows in particular that
Z[x] and K[x, y] are UFDs.
In this section we will always assume that D is a UFD, so greatest
common divisors and least common multiples always exist in D. Also, we
will denote by K the field of fractions Frac(D). The definition of content
of a polynomial, introduced in Section 3.5 for polynomials p(x) ∈ Z[x], can
be extended without changes to D[x].
Definition 3.8.1. If p(x) = a0 + a1 x + · · · + an xn ∈ D[x], we say that
c(p) ∈ D is the content of p(x) if and only if
(3.8.1)
c(p) = gcd(a0 , . . . , an ).
It should be clear that, just like the greatest common divisor, the content
of a polynomial is only defined up to a multiple by a unit. Again we will
call a polynomial p(x) ∈ D[x] primitive if c(p) ∼ 1.
Lemma 3.8.2. Let p(x) ∈ D[x]. Then:
(i) There exists q(x) ∈ D[x] primitive such that p(x) = c(p)q(x).
(ii) If p(x) = dq(x), with q(x) ∈ D[x] primitive and d ∈ D, then d ∼ c(p).
Proof. Part (i) is obvious.
3.8. FACTORIZATION IN D[X]
183
In order to show that (ii) holds, let p(x) = a0 + a1 x + · · · + an xn and
q(x) = b0 + b1 x + · · · + bn xn , with q(x) a primitive polynomial, and assume
that p(x) = dq(x). Then ai = dbi and, by Lemma 3.7.16, we conclude that
c(p) = gcd(a0 , . . . , an ) ∼ d gcd(b0 , . . . , bn ) ∼ d.
The next two results are useful to express a polynomial p(x) ∈ K[x] in
terms of a primitive polynomial q(x) ∈ D[x].
Lemma 3.8.3. Let 0 6= p(x) ∈ K[x]. Then:
(i) There exist q(x) ∈ D[x] primitive and k ∈ K such that p(x) = kq(x);
(ii) If p(x) = kq(x) = k̃q̃(x), with k, k̃ ∈ K and q(x), q̃(x) ∈ D[x] primitives, then q̃(x) = uq(x) and k̃ = u−1 k, where u ∈ D is a unit.
Proof. (i) If
p(x) = α0 + α1 x + · · · + αn xn =
an
a0 a1
+ x + · · · + xn ∈ K[x],
b0
b1
bn
Q
we let b = ni=1 bi . Clearly, r(x) = bp(x) ∈ D[x]. If c = c(r), by Lemma
3.8.2, there exists q(x) ∈ D[x] primitive such that r(x) = cq(x) and we have
p(x) = kq(x), where k =
c
∈ K.
b
(ii) Exercise.
Corollary 3.8.4. If p(x), q(x) ∈ D[x] are primitives and p(x) ∼ q(x) in
K[x], then p(x) ∼ q(x) in D[x].
Proof. If p(x) ∼ q(x) in K[x], then p(x) = αq(x), with α ∈ K. The
corollary then follows from Lemma 3.8.3, item (ii).
The next two results are just generalizations to UFD of results that we
have obtained before in the cases D = Z and K = Q.
Lemma 3.8.5. Let p(x), q(x), r(x) ∈ D[x] be such that p(x) = q(x)r(x).
Then:
(i) If d ∈ D is prime, then d|c(p) ⇐⇒ d|c(q) or d|c(r), and
(ii) p(x) is primitive if and only if q(x) and r(x) are both primitive.
184
CHAPTER 3. OTHER EXAMPLES OF RINGS
Proof. Part (ii) follow from (i).
To prove (i), note first that if d is a prime and d|c(q) or d|c(r), obviously
d|c(p). So it remains to prove the converse. Write p(x) = a0 + a1 x + · · · ,
q(x) = b0 + b1 x + · · · , and r(x) = c0 + c1 x + · · · . Let d ∈ D be a prime such
that d|c(p) and assume, by contradiction, that d 6 |c(q) and d 6 |c(r). Let:
s := min{k ≥ 0 : d 6 |bk }, and t =: min{k ≥ 0 : d 6 |ck }.
If m = s + t, then we observe that:
am =
m
X
bk cm−k =
k=0
s−1
X
bk cm−k + bs ct +
k=0
=
s−1
X
m
X
bk cm−k ,
k=s+1
bk cm−k + bs ct +
k=0
t−1
X
bm−k ck ,
k=0
where the last sums are assume to represent zero, when s = 0 or t = 0. It
follows that:
s−1
t−1
X
X
bs ct = am −
bk cm−k −
bm−k ck .
k=0
k=0
The right hand side of the last identity is a multiple of d, while the left
hand side cannot be a multiple of d. Hence, we must have either d|c(q) or
d|c(r).
Example 3.8.6.
The polynomials p(x) = 3x2 + 2x + 5 and q(x) = 5x2 + 2x + 3 in Z[x] are
both primitive since gcd(3, 2, 5) = 1. Its product is the primitive polynomial
p(x)q(x) = 15x4 + 16x3 + 38x2 + 16x + 15.
We now generalize Lemma 3.5.10 from D = Z to any UFD: the factors
a(x) ∈ K[x] of a polynomial p(x) ∈ D[x] are associate in K[x] of the factors
of p(x) in the original ring D[x].
Lemma 3.8.7. If p(x) ∈ D[x], and p(x) = a(x)b(x) with a(x), b(x) ∈
K[x], then there exist ã(x), b̃(x) ∈ D[x], and k ∈ K, such that
p(x) = ã(x)b̃(x), a(x) = kã(x), and b(x) = k−1 b̃(x).
Proof. Lemma 3.8.3 (i) shows that a(x) = sa′ (x) and b(x) = tb′ (x), where
s, t ∈ K, and a′ (x) and b′ (x) are primitive polynomials in D[x]. On the
3.8. FACTORIZATION IN D[X]
185
other hand, we have from Lemma 3.8.2 that p(x) = c(p)p′ (x), where p′ (x)
is also primitive in D[x]. We conclude that p(x) = c(p)p′ (x) = sta′ (x)b′ (x).
By Lemma 3.8.5 (ii), a′ (x)b′ (x) is primitive and it follows from Lemma 3.8.3
(ii) that c(p)u = st and p′ (x) = u−1 a′ (x)b′ (x), for some unit u ∈ D.
Let us set, for example, ã(x) = c(p)ua′ (x) and b̃(x) = b′ (x). Then
ã(x)b̃(x) = c(p)ua′ (x)b′ (x) = sta′ (x)b′ (x) = p(x).
It follows also that the constant in the statement can be taken to be k =
and k−1 = 1t .
c(p)u
s
In this more general context Gauss’s Lemma takes the same form as in
the case D = Z.
Corollary 3.8.8 (Gauss’s Lemma). If p(x) ∈ D[x] is not constant, then
p(x) is irreducible in D[x] if and only if it is primitive in D[x] and irreducible
in K[x].
Proof. If p(x) is not primitive then it has a non-trivial factorization in D[x],
of the form p(x) = c(p)p̃(x). On the other hand, if p(x) = a(x)b(x) is a
non-trivial factorization in K[x], then p(x) has a non-trivial factorization
in D[x], by Lemma 3.8.7.
If p(x) is reducible in D[x], then it has a non-trivial factorization p(x) =
a(x)b(x) in D[x]. If any of these non-trivial factors is constant, then p(x)
is not primitive. If all factors are non-constant, we obtain a non-trivial
factorization in K[x].
Finally, we can prove
Theorem 3.8.9. D[x] is a UFD if and only if D is a UFD.
Proof. Assume that D[x] is a UFD. Since any 0 6= d ∈ D can be identified
with a zero degree polynomial, it is obvious that D must be a UFD.
Conversely, let D be a UFD. We will show that any q(x) ∈ D[x] of
degree > 0 has a unique factorization.
Existence: We have that q(x) = c(q)p(x), whereQp(x) primitive. Since
K[x] is a UFD, q(x) has a factorization q(x) = nk=1 qk (x), where the
polynomials qk (x) ∈ K[x] are irreducible in K[x] and deg(qk (x)) ≥ 1. As
we saw above, there exist primitive polynomials pk (x) ∈ D[x] and constant
sk ∈ K such that qk (x) = sk pk (x). By Gauss’s Lemma, the polynomials
pk (x) are irreducible in D[x], and we have
q(x) = c(q)p(x) = s
n
Y
k=1
pk (x), where s =
n
Y
k=1
sk .
186
CHAPTER 3. OTHER EXAMPLES OF RINGS
Q
On the other hand, according to Lemma 3.8.5 (ii), nk=1 pk (x) is primitive.
By Lemma 3.8.3 (ii), there exists aQunit u ∈ D such that s = c(p)u and, in
particular, s ∈ D. If we factor s = m
k=1 ck into irreducible elements ck ∈ D
we have that
! n
!
m
Y
Y
q(x) =
ck
pk (x)
k=1
k=1
is a factorization of q(x) into irreducible elements in D[x].
Q ′ ′ Qn′ ′
Uniqueness: Assume that q(x) = m
k=1 pk (x) is another fack=1 ck
torization of q(x) into irreducible polynomials in D[x], where c′k ∈ D and
deg(p′k (x)) ≥ 1. By Gauss’s Lemma the polynomials p′k (x) are primitive
and irreducible in K[x]. Hence:
•
Qm
Q ′ ′
is primitive, so m
k=1 ck em D. Since D
k=1 ck ∼ c(q) ∼
is a UFD, we have m = m′ and, after an appropriate permutation of
these factorizations, we have ck ∼ c′k in D.
•
Qn
Qn′
′
k=1 pk (x)
Q ′
∼ nk=1 p′k (x) in K[x]. Since K[x] is a UFD, we have that
n′ = n, and also after an appropriate permutation of these factorizations pk (x) ∼ p′k (x) in K[x]. By Corollary 3.8.4, we also have that
pk (x) ∼ p′k (x) in D[x].
k=1 pk (x)
Exercises.
1. If p(x) ∈ Z[x] is a monic polynomial with integers coefficients show that any
rational root p(x) is actually an integer.
2. Let D be an integral domain which has an non-invertible element d 6= 0.
Show that D[x] is not a PID.
3. Prove the following generalization of Eisenstein’s Criterion: let D be a UFD,
K = Frac(D) and p(x) = a0 + a1 x + · · · + an xn ∈ D[x] with n ≥ 1. If p ∈ D is
a prime such that p|ak for 0 ≤ k < n, p ∤ an and p2 ∤ a0 , then p(x) is irreducible
in K[x].
4. Show that the polynomial p(x, y) = x3 + x2 y + xy 2 + y is irreducible in
K[x, y].
5. Show that the polynomial p(x) = ix3 + 4x2 + 16ix + (3 − i) is irreducible in
Z[i][x].
3.8. FACTORIZATION IN D[X]
187
6. Show that when D is a UFD and p(x) ∈ D[x] is monic, then any monic
factor of p(x) in K[x] belongs to D[x].
7. Let D be a UFD and p(x), q(x) ∈ D[x].
(a) Can one use Euclides Algorithm to compute gcd(p(x), q(x)) in D[x]?
(b) Does the equation gcd(p(x), q(x)) = a(x)p(x) + b(x)q(x) always have
solutions a(x), b(x) ∈ D[x]?
8. Let D be an UFD. Which of the following rings are UFD?
(a) The ring D[[x]] of power series with coefficients in D.
(b) The ring D[1/x, x] of fractions p(x)/q(x), with p(x), q(x) ∈ D[x] and
q(x) 6= 0.
(c) The ring D[1/x, x]] of Laurent series.
(d) The ring D[x1 , x2 , . . . ] of polynomials in an infinite number of indeterminates.
188
CHAPTER 3. OTHER EXAMPLES OF RINGS
Chapter 4
Quotients and Isomorphisms
4.1
Groups and Equivalence Relations
In defining the ring operations in Zm one takes the following steps:
(i) First, one defines congruence modulo m (x ≡ y (mod m) ⇔ y − x ∈
hmi), and shows that it is an equivalence relation.
(ii) Then, one considers the set Zm consisting of the equivalence classes
x = {x + z : z ∈ hmi}, which one often writes as x = x + hmi.
(iii) Finally, one defines operations on these equivalence classes, from the
usual operations on the integers, through the identities x + y = x + y
and xy = xy.
We shall now see that this procedure can be extended to much more general
contexts, and hence can be used to provide many new examples of algebraic
structures.
We start by replacing the additive group (Z, +) by an arbitrary group
(G, ·). We use the multiplicative notation since the fact that addition was
commutative plays here no role. We also replace the subgroup hmi ⊂ (Z, +)
by an arbitrary subgroup H ⊂ G, and we generalize the congruent relations
modulo m as follows:
Definition 4.1.1. If (G, ·) is a group and H ⊂ G is a subgroup, the congruence relation modulo H is given by:
g1 ≡ g2
(mod H)
⇐⇒
189
g2−1 · g1 ∈ H.
190
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Note that, in fact, congruence modulo m is the special case of this definition where G = (Z, +) and H = hmi. We leave it as an exercise to check
that congruence modulo H is indeed an equivalence relation.
Proposition 4.1.2. If (G, ·) is a group and H ⊂ G is a subgroup, then ≡
(mod H) is an equivalence relation in G.
Proceeding as we did for the case of the integers, we observe that if
g ∈ G, the equivalence class of g, denoted by g, can be written as:
g = {g′ ∈ G : g ′ ≡ g},
= {g′ ∈ G : g −1 g′ = h ∈ H},
= {g′ ∈ G : g ′ = gh, h ∈ H}.
If A and B are any subsets of the group G, we will denote by AB the
set of all products of elements of A and elements of B:
AB = {ab : a ∈ A and b ∈ B}.
If A = {a} (respectively, B = {b}) is a singleton, we will write aB (respectively, Ab) instead of AB. We leave it as an exercise to check that
A(BC) = (AB)C and that, in general, AB 6= BA.
With these conventions, we will denote by gH the equivalence class of
g ∈ G for the congruence (mod H), and call it the left coset of H 1 . The
set of all equivalence classes {gH : g ∈ G} is called the quotient set of G
by H, and denoted G/H. Therefore, we have that:
G/H = {gH : g ∈ G}.
The number of elements of G/H, which is just the number of left cosets, is
called the index of the subgroup H in G, and is denoted by [G : H].
Examples 4.1.3.
1. Consider the symmetric group G = S3 = {I, α, β, γ, δ, ε} and take as a subgroup the alternating group H = A3 = {I, δ, ε}. Note that:
• The equivalence class of I is the coset I = IH = H = {I, δ, ε}. We
conclude that I ≡ δ ≡ ε, and I = δ = ε. Also, H = δH = εH.
• If we let g = α, it is immediate that α = αH = {αI, αδ, αε} and a simple
computation shows that α = {α, β, γ}. It follows that α ≡ β ≡ γ, or
α = β = γ = {α, β, γ} = αH = βH = γH.
1
If G is an additive group with operation +, it is convenient to write A + B instead of
AB and g + H instead of gH. In this case, we have A + B = B + A and g + H = H + g.
4.1. GROUPS AND EQUIVALENCE RELATIONS
191
We conclude that in this example there are 2 distinct cosets each with 3 elements. Hence, the quotient set is S3 /A3 = {I, α} = {A3 , αA3 }, and the index
of A3 in S3 is [S3 , A3 ] = 2.
2. Consider again the group S3 , but let us fix now the subgroup H = {I, α}.
Then:
• The equivalence class of I is the coset I = H, hence I ≡ α and I = α;
• The equivalence class of β is the coset β = βH = {βI, βα} = {β, ε},
hence β ≡ ε and β = ε;
• The equivalence class of γ is the coset γ = γH = {γI, γα} = {γ, δ},
hence γ ≡ δ and γ = δ.
We conclude that there are now 3 distinct cosets, each with 2 elements. The
quotient set is S3 /H = {I, β, γ} = {H, βH, γH} while the index is [S3 : H] = 3.
3. Obviously the index of the subgroup hmi in Z is the number of elements of
Zm , i.e., [Z : hmi] = m.
Instead of the binary relation in Definition 4.1.1, we can consider instead
the binary relation defined by
g1 ≡ g2
(mod H)
⇔
g1 g2−1 ∈ H.
We still obtain an equivalence relation (distinct from the previous one, if
the group multiplication is not commutative), and the equivalence class of
g ∈ G is now given by
g = {g ′ ∈ G : g ′ ≡ g}
= {g ′ ∈ G : g ′ g −1 = h ∈ H}
= {g ′ ∈ G : g ′ = hg, h ∈ H}.
We denote this equivalence class by Hg, and call it a right coset of H.
The set of right cosets is denoted by H\G(2 ). One may notice that the left
and right cosets of a subgroup maybe equal, i.e., Hg = gH, for every g ∈ G,
as in Example 4.1.3.1, or distinct, as in Example 4.1.3.2. We leave this easy
check for the exercises.
If ≡ is an equivalence relation in a set X, we know that the corresponding
equivalence classes give a partition of X. In other words, the equivalence
classes are distinct, disjoint, subsets of X, and their unions is the set X.
2
By default, unless otherwise stated, we will always use left cosets. When it is clear
G
.
which cosets (left or right) one is using, we may also use the notation H
192
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Of course, if X is a finite set, each equivalence class is also a finite set, and
there are only a finite number of distinct equivalence classes. Denoting by
X1 , X2 , . . . , Xn the equivalence classes of the equivalence relation ≡ in the
set X, we have:
|X| = |X1 | + · · · + |Xn | =
(4.1.1)
n
X
i=1
|Xi |.
This identity is sometimes referred to as the class equation.
When X = G and one considers the equivalence relation (mod H), for
some subgroup H ⊂ G, the number of elements in each equivalence class is
always the same:
Proposition 4.1.4. If H is a finite subgroup of G, then
|gH| = |Hg| = |H|,
for all g ∈ G.
Proof. Fix an element g ∈ G. The map φ : H → gH defined by φ(h) = g · h
is obviously surjective. On the other hand, by the cancelation law, it is clear
that φ is also 1:1:
φ(h) = φ(h′ ) ⇒ g · h = g · h′ ⇒ h = h′ .
Therefore, φ is a bijection between H and gH, so that |H| = |gH|. Similarly,
one shows that |Hg| = |H|.
The previous proposition, combined with the class equation (4.1.1), yields:
Theorem 4.1.5 (Lagrange). If G is a finite group and H ⊂ G is a subgroup,
then:
|G| = [G : H]|H|.
In particular, both |H| and [G : H] are factors of |G|.
Proof. Since both G and H are finite, there exists only a finite number of
left cosets so that [G : H] = n. Denote by g1 H, . . . , gn H these left cosets.
The class equation (4.1.1) gives
|G| =
n
X
i=1
|gi H| =
n
X
i=1
|H| = n|H| = [G : H]|H|.
4.1. GROUPS AND EQUIVALENCE RELATIONS
193
The number of elements of a group G is usually called the order of the
group G. Hence, by Lagrange’s Theorem, the order of a finite group G is
a multiple of the order of any of its subgroups. Analogously, if g ∈ G is any
element, then the order of the element g is the order of the subgroup
generated by the element g, i.e., the order of hgi = {g n : n ∈ Z}. It
follows from Lagrange’s Theorem that the order of any element of G is also
a factor of the order of G.
Examples 4.1.6.
1. In Example 4.1.3.1 we find that |G| = 6, |H| = 3, and [G : H] = 2. In the
case of Example 4.1.3.2, we find that |G| = 6, but |H| = 2 and [G : H] = 3.
2. If π ∈ S3 , it is easy to see that the order of π can be 3 (δ and ε), 2 (α, β,
and γ), or 1 (I). Of course, in any of these cases, the order of π is a factor
of the order of S3 . Note that 6 is also a factor of the order of S3 , but there is
no element in S3 of order 6.
3. In the study of the rings Zm we saw that the subgroups of Zm take the form
hdi, where d is a divisor of m. Obviously, in this case, the number of elements
3
of the subgroup hdi is m
d , which is a factor of m. Note also that d = [Zm : hdi] .
4. If (A, +, ·) is any ring with identity 1, the order of the additive subgroup
generated by 1 is precisely the characteristic of the ring A. Hence, for a finite
ring A its characteristic is a always a factor of the number of elements of A.
In many cases it is important to study the factorization of groups into
smaller pieces, i.e., to establish conditions for a group G to factor as a direct
product of groups K and H. These two groups then are more “elementary
blocks”, which can lead to a better knowledge of the structure of the group,
an idea which we will pursue in the next chapter. We give now some results in this direction. Their application is often made easier by Lagrange’s
Theorem.
Lemma 4.1.7. Let G be a group with identity e. If K and H are normal
subgroups in G such that K ∩ H={e}, then kh = hk for any k ∈ K and
h ∈ H.
Proof. Let k ∈ K and h ∈ H. Consider the element k−1 h−1 kh. We have
that h−1 kh ∈ K, since K is a normal subgroup of G. Hence, k−1 h−1 kh ∈ K.
Similarly, we have that k−1 h−1 k ∈ H, since H is a normal subgroup, and
3
When we refer to the “group Zm ”, with no further qualitative, we always mean the
additive group (Zm , +).
194
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
hence k−1 h−1 kh ∈ H. But k−1 h−1 kh ∈ K ∩ H = {e}, so we conclude that
k−1 h−1 kh = e, i.e., that kh = hk.
Theorem 4.1.8. Let H and K be subgroups of G, and assume that:
(i) H and K are normal in G;
(ii) G = HK, and
(iii) H ∩ K = {e}.
Then G ≃ H × K.
Proof. Recall that the group H × K as underlying set the Cartesian product
H × K = {(h, k) : h ∈ H, k ∈ K}, and binary operation (h1 , k1 )(h2 , k2 ) =
(h1 h2 , k1 k2 ).
Define φ : H × K → G by φ(h, k) = hk. Using Lemma 4.1.7, one checks
easily that φ is a group homomorphism. The assumption G = HK, implies
immediately that φ onto.
In order to determine the kernel of φ, observe that if φ(h, k) = e, then
hk = e. This means that h = k−1 , hence we conclude that h, k ∈ H ∩ K =
{e}. It follows that h = k = e, so the kernel of φ consists of (e, e).
Hence, φ is a group isomorphism and G ≃ H × K.
As we have mentioned above, in the case of finite groups Lagrange’s
Theorem is sometimes useful in verifying the assumptions of the previous
theorem. For example, if G is a finite group, then |H ∩ K| is a factor of |H|
and |K|. Hence, if |H| and |K| happen to be relative primes then we must
have |H ∩ K| = 1. If this is the case, the homomorphism φ used in the proof
of the theorem is 1:1 and we can conclude that |HK| = |H||K|, so we can
check if HK = G.
Here is an example in the case of abelian groups, where any subgroup is
obviously a normal subgroup:
Example 4.1.9.
In the group Z6 consider the subgroups H = {0, 3} and K = {0, 2, 4}. We
know that H is isomorphic to Z2 and K is isomorphic to Z3 . We have then
H ∩ K = {0}, and |G| = |H||K|, so we conclude that: Z6 ≃ Z2 ⊕ Z3 .
More generally, suppose that gcd(n, d) = 1 and recall Proposition ??: if m =
nd, the group Zm has subgroups B ≃ Zn and C ≃ Zd . Since |B ∩ C| is
a factor of both n and d, it follows immediately that |B ∩ C| = 1. Hence,
|B + C| = |B||C| = nd = m = |Zm | and so B + C = Zm . We conclude that
Zm ≃ B ⊕ C ≃ Zn ⊕ Zd .
4.1. GROUPS AND EQUIVALENCE RELATIONS
195
The theorem above can be generalized to direct products with more than
two factors. The proof is analogous and is left as an exercise:
Theorem 4.1.10. Let H1 , . . . , Hn be subgroups of a group G such that:
(i) Each Hi is normal in G;
(ii) G = H1 · · · Hn , and
(iii) Hi ∩
n
Q
k=1
k6=i
Hk = {e}, for i = 1, . . . , n.
Then:
G ≃ H1 × · · · × Hn .
Exercises.
1. Proof Proposition 4.1.2.
2. Show that {gH : g ∈ G} 6= {Hg : g ∈ G}, when G = S3 and H = {I, α}.
3. Show that if e is the identity of G, then g ≡ e (mod H) if and only if g ∈ H.
4. Find the quotient set G/H when G = Z6 and H = h2i.
5. Find the quotient set G/H when G = S4 and H = h(1234)i.
6. Determine the order of every element in the groups S3 , (Z6 , +) and (Z∗9 , ·).
7. Let G = Sn and H = An . Show that π ≡ σ (mod H) if and only if π and σ
are permutations with the same parity. Conclude that [Sn : An ] = 2.
8. Show that the function φ : G/H → H\G given by φ(gH) = Hg −1 is a well
defined bijection. Conclude that [G : H] coincides also with the number of
right cosets.
9. Show that if K ⊂ H, where K and H are subgroups of a finite group G, then
[G : K] = [G : H][H : K].
10. If A, B and C are any subsets of a group G, show that:
(a) (AB)C = A(BC).
(b) AA = A when A is a subgroup of G.
(c) If A and B are subgroups of G, then AB = BA if and only if AB is a
subgroup of G.
196
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
11. Show that, if A and B are finite subgroups of G, then |AB||A ∩ B| = |A||B|.
(Hint: Use the fact that A ∩ B is a subgroup of both A and B.)
12. Show that any permutation of S3 is of the form π = αn εm , without explicitly
computing these powers.
13. If |G| = p, where p is a prime, what are the subgroups of G, and what is
the order of the elements of G?
14. Give an example of an infinite group, where all elements have order 2, with
the exception of the identity.
15. Give an example of an infinite group, where all elements have finite order,
but such that for each n ∈ N it has elements of order n.
16. Prove Theorem 4.1.10.
17. Show the following converse of Theorem 4.1.8: if G is a group isomorphic
to the direct productH × K, then G has normal subgroups H ′ and K ′ such
that H ′ K ′ = G and H ′ ∩ K ′ = {e}, where e is the identity in G.
18. Let A be a finite ring with identity. Show that the characteristic of A is a
factor of |A|.
19. Classify the rings with unit with 2, 3, 4 and 5 elements.
(Hint: use the previous exercise.)
20. Let G be an abelian group with 9 elements, which has no element of order
9. Show that G ≃ Z3 ⊕ Z3 .
21. Show that if G is a finite group where all elements, except the identity, have
order 2, then G is abelian. What can you say if all elements distinct from the
identity have order 3?
22. If G is a group with order 2n, show that there exists at least one element
in G which has order 2.
(Hint: If x ∈ G, define C(x) := {x, x−1 }. Show that the binary relation x ∼
y ⇔ C(x) = C(y) is an equivalence relation and that C(x) is the equivalence
class of x).
4.2. QUOTIENT GROUP AND QUOTIENT RING
4.2
197
Quotient Group and Quotient Ring
The ring operations in Zm were defined from the ring operations in Z. When
can one generalize the procedure adopted for Zm , to define operations in the
quotient G/H, from given operations in G? We will see that this is possible
only under some restrictions in the subgroup H.
In fact, as we will show next, it is possible to define a group operation
in the quotient G/H, provided H is a normal subgroup of G 4 .
Recall that in the case of Zm , the addition operation was defined because
of the following property of addition in Z:
x ≡ x′ (mod m)
(4.2.1)
If
, thenx + y ≡ x′ + y ′ (mod m).
y ≡ y ′ (mod m)
Indeed, this implies that for any x and y in Zm , we can define
x + y := x + y
without any ambiguity caused by a choice of representatives x and y for
each of the equivalence classes. The following example shows that property
(4.2.1) does not generalize for congruence (mod H).
Example 4.2.1.
If G = S3 and H = {I, α}, we can have g1 ≡ g1′ (mod H) and g2 ≡ g2′
(mod H) without having g1 g2 ≡ g1′ g2′ (mod H): for example, take g1 = g1′ = α
and g2 = γ, g2′ = δ; then γ ≡ δ (mod H), but αγ = ε and αδ = γ are not
equivalent (mod H).
In fact, property (4.2.1) generalizes only in the case of normal subgroups:
Proposition 4.2.2. Let H be a subgroup of G. The following statements
are equivalent:
(i) H is a normal subgroup of G;
(ii) gHg−1 = H for all g ∈ G;
(iii) (g1 H)(g2 H) = (g1 g2 )H for all g1 , g2 ∈ G;
(iv) if g1 ≡ g1′ (mod H) and g2 ≡ g2′ (mod H), then g1 g2 ≡ g1′ g2′ (mod H)
for all g1 , g2 , g1′ , g2′ ∈ G.
4
Note that this restriction has no consequences for Zm : since (Z, +) is an abelian group,
any of its subgroups is a normal subgroup.
198
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Proof. Let us show that
(i) ⇒ (ii) ⇒ (iii) ⇒ (iv) ⇒ (i).
(i) ⇒ (ii): From the definition of a normal subgroup we have:
gHg−1 ⊂ H.
Since g is arbitrary, we can replace g by g−1 obtaining g−1 Hg ⊂ H. Note
also that
g −1 Hg ⊂ H
=⇒
g(g −1 Hg)g−1 ⊂ gHg−1
=⇒
H ⊂ gHg−1 .
Since we know that gHg−1 ⊂ H, we conclude that gHg−1 = H.
(ii) ⇒ (iii): Since gHg−1 = H, one sees immediately that
(gHg−1 )g = Hg, i.e, gH = Hg.
Hence, we find:
(g1 H)(g2 H) =((g1 H)g2 )H = (g1 (Hg2 ))H = (g1 (g2 H))H
=((g1 g2 )H)H = (g1 g2 )(HH) = (g1 g2 )H.
(iii) ⇒ (iv):
g1 ≡ g1′
(mod H) and g2 ≡ g2′
(mod H) ⇒ g1′ ∈ g1 H and g2′ ∈ g2 H.
Hence g1′ g2′ ∈ (g1 H)(g2 H). Since(g1 H)(g2 H) = (g1 g2 )H, we find
g1′ g2′ ∈ (g1 g2 )H ⇔ g1 g2 ≡ g1′ g2′
(mod H).
(iv) ⇒ (i): If g ∈ G and h ∈ H, we must show that ghg −1 ∈ H. For
this, observe that gh ≡ g (mod H) and g −1 ≡ g−1 (mod H). Hence we
must have ghg−1 ≡ gg −1 = e (mod H). It follows that ghg−1 ∈ H as
claimed.
By the previous proposition, if H is a normal subgroup of G, then
(g1 H)(g2 H) = (g1 g2 )H defines a binary operation in G/H. It is easy to
check that:
Theorem 4.2.3. If H is a normal subgroup of G, then G/H is a group for
the binary operation defined by
(g1 H)(g2 H) = (g1 g2 )H.
Moreover, the quotient map π : G → G/H defined by π(g) = g = gH is a
surjective group homomorphism with kernel N (π) = H.
4.2. QUOTIENT GROUP AND QUOTIENT RING
199
Proof. We saw above that we have a well defined binary operation G/H ×
G/H → G/H given by (g1 H, g2 H) → (g1 H)(g2 H) = (g1 g2 )H.
This binary operation is associative since:
((g1 H)(g2 H)) (g3 H) = ((g1 g2 )H) (g3 H)
= ((g1 g2 )g3 ) H
= (g1 (g2 g3 )) H
= (g1 H) ((g2 g3 )H)
= (g1 H) ((g2 H)(g3 H)) .
If e is the identity in G, obviously eH = H and (gH)H = H(gH) = gH.
Hence, H is the identity in G/H.
It is also clear that (gH)(g−1 H) = (g−1 H)(gH) = eH = H. This means
that every element in G/H has an inverse. This shows that G/H is a group.
Let π : G → G/H be defined by π(g) = g = gH. It is immediate to
check that
π(g1 )π(g2 ) = (g1 H)(g2 H) = (g1 g2 )H = π(g1 g2 ),
so π is a group homomorphism.
Finally, since H is the identity in the group G/H, we have that the
kernel of π is N (π) = {g ∈ G : gH = H} = H.
Examples 4.2.4.
1. Let G = S3 . The subgroup H = A3 is normal, so that G/H = {I, α} is a
group. We easily find the multiplication table:
I I = α α = I,
I α = α I = α.
2. When G = Z6 and H = h2i = {0, 2, 4}, we find G/H = {0, 1}, where:
0 = 0 + h2i = {0, 2, 4}
1 = 1 + h2i = {1, 3, 5}.
In this case the multiplication table is given by
0 + 0 = 1 + 1 = 0,
0 + 1 = 1 + 0 = 1.
Obviously, the groups S3 /A3 and Z6 /{0, 2, 4} are isomorphic (there exists only
one group with two elements!).
200
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
We leave as an exercise the proof of the following result:
Theorem 4.2.5. If H is a normal subgroup of G, then the subgroups (respectively, normal subgroups) of G/H take the form K/H, where H ⊆ K ⊆ G,
and K is a subgroup (respectively, normal subgroup) of G.
Let us turn now to the study of quotient rings, which is slightly more
involved. On the one hand, if B ⊂ A is a subring of A, then (B, +) is a
subgroup of (A, +), so we can form the quotient group A/B: this follows
immediately from the fact that (A, +) is an abelian group, so any of its
subgroups is normal. The “addition” in A/B is given by:
a1 + a2 = a1 + a2 , or (a1 + B) + (a2 + B) = (a1 + a2 )B.
This does not imply that A/B is a ring, since we need also a “multiplication”
in A/B, such that the axioms of a ring are satisfied.
Let us recall that in the case of Zm the definition of multiplication relied
on the following property:
x ≡ x′ (mod m)
(4.2.2)
If
, then xy ≡ x′ y ′ (mod m).
y ≡ y ′ (mod m)
This property shows that given elements x and y in Zm , the binary operation
x · y = xy
is well defined independent of the choice of representatives x and y for each
equivalence class. This suggests that for an arbitrary ring A with a subring
B ⊂ A we should set:
a1 a2 = a1 a2 or (a1 + B)(a2 + B) = a1 a2 + B.
However, this definition is consistent only if whenever a1 ≡ a′1 (mod B) and
a2 ≡ a′2 (mod B) we also have a1 a2 ≡ a′1 a′2 (mod B). This is clarified by
the next result:
Proposition 4.2.6. Let B be a subring of A and set a ≡ a′ (mod B) if and
only if a′ − a ∈ B. Then the following statements are equivalent:
(i) B is an ideal in A;
(ii) If a1 ≡ a′1 (mod B) and a2 ≡ a′2 (mod B) then a1 a2 ≡ a′1 a′2 (mod B),
for any a1 , a′1 , a2 , a′2 ∈ A.
4.2. QUOTIENT GROUP AND QUOTIENT RING
201
Proof. We will show that both implications (i) ⇒ (ii) and (ii) ⇒ (i), hold.
(i) ⇒ (ii): If a1 ≡ a′1 (mod B) and a2 ≡ a′2 (mod B), then a′1 = a1 + b1
and a′2 = a2 + b2 , where b1 , b2 ∈ B. Hence, a′1 a′2 = (a1 + b1 )(a2 + b2 ) =
a1 a2 + a1 b2 + a2 b1 + b1 b2 . It is clear that b1 b2 ∈ B, because B is a subring,
and a1 b2 , a2 b1 ∈ B, because B is an ideal. We conclude that a′1 a′2 = a1 a2 +b,
where b = a1 b2 + a2 b1 + b1 b2 ∈ B, and hence a1 a2 ≡ a′1 a′2 (mod B).
(ii) ⇒ (i): We must show that, if a ∈ A and b ∈ B, then ab, ba ∈ B. For
that, we observe that b ∈ B if and only if b ≡ 0 (mod B), where 0 denotes
the zero in the ring A. According to (ii), we then have ab ≡ a0 (mod B) and
ba ≡ 0a (mod B), or equivalently ab ≡ 0 (mod B) and ba ≡ 0 (mod B).
We conclude that ab, ba ∈ B, so B is an ideal.
We can now state an analogue of Theorem 4.2.3, for rings. We leave its
proof as an exercise.
Theorem 4.2.7. If I ⊂ A is an ideal in a ring A, then A/I is a ring with
the operations:
a1 + a2 := a1 + a2
a1 · a2 := a1 · a2 .
If A is abelian (respectively, with identity 1), then A/I is abelian (respectively, with identity 1). Moreover, the quotient map π : A → A/I defined by
π(a) = a = a + I is a ring homomorphism with kernel N (π) = I.
The next examples show that often properties of the ring A do not pass
to the quotient.
Examples 4.2.8.
1. The rings Zm are examples of the situation described in the theorem, where
A = Z and I = hmi, where m is a fixed natural number. In this case, A is
always an integral domain, while the quotient A/I is not, if m is not a prime
number: if d is a factor of m then d is a zero divisor in Zm .
2. Let A = Q[x] and I = hm(x)i where m(x) = x2 + 1. Given any p(x) ∈ Q[x],
the division algorithm shows that there exists q(x) ∈ Q[x] such that
p(x) = q(x)(x2 + 1) + (a + bx),
(a, b ∈ Q).
Since obviously q(x)(x2 + 1) ∈ I, we conclude that p(x) ≡ a + bx, i.e., p(x) =
a + bx, so that
Q[x]
= {a + bx : a, b ∈ Q}.
hx2 + 1i
202
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
One can find easily the operations in this ring. For the sum we have:
a + bx + a′ + b′ x = a + bx + a′ + b′ x
= (a + a′ ) + (b + b′ )x.
To find the product we observe first that x2 + 1 ≡ 0, i.e., that x2 = −1.
Therefore:
(a + bx)(a′ + b′ x) = (a + bx)(a′ + b′ x),
= aa′ + (ba′ + ab′ )x + bb′ x2 ,
= (aa′ − bb′ ) + (ba′ + ab′ )x.
In order to make the notation lighter, let us right a instead of a, and i instead
of x (note that a = b if and only if a = b). With these conventions, the two
operations can be written:
(a + bi) + (a′ + b′ i) = (a + a′ ) + (b + b′ )i,
(a + bi)(a′ + b′ i) = (aa′ − bb′ ) + (ab′ + a′ b)i.
It should now be clear that Q[x]/hx2 + 1i is actually isomorphic to Q[i], a
“coincidence” which will be explained later. Observe also that Q[x]/hx2 + 1i
is a field extension of Q, while the original ring Q[x] was not a field. In this
field the polynomial x2 + 1 has roots and is reducible.
3. The previous example, extends to rings of polynomials over finite fields. Let
A = Z2 [x] and I = hx2 + x + 1i. Just as before, if p(x) ∈ Z2 [x] there exists
q(x) ∈ Z2 [x] such that
p(x) = q(x)(x2 + x + 1) + (a + bx).
Therefore, once more p(x) ≡ a + bx, i.e., p(x) = a + bx. In this case, this
leads to a finite ring with 4 elements:
hx2
Z2 [x]
= {a + bx : a, b ∈ Z2 } = {0, 1, x, 1 + x}.
+ x + 1i
Again, lets us write a instead of a, and α instead of x, where 1 + α + α2 = 0,
or still α2 = −1 − α = 1 + α. (Note that since a = −a in Z2 , we also have
a = −a in the quotient ring). We can then exhibit the tables for addition and
multiplication in the quotient ring (we write β = α2 = 1 + α for convenience):
+
0
1
α
β
0
0
1
α
β
1
1
0
β
α
α
α
β
0
1
β
β
α
1
0
·
0
1
α
β
0
0
0
0
0
1
0
1
α
β
α
0
α
β
1
β
0
β
1
α
These are the same tables as the field with 4 elements mentioned in an exercise
in Chapter ??. It is a field extension of Z2 , and in this field the polynomial
x2 + x + 1 has roots and is reducible.
4.2. QUOTIENT GROUP AND QUOTIENT RING
203
In the exercises one is asked to check that, in general, the quotient ring
K[x]/hm(x)i is always an extension of the field K and also a vector space
over K of dimension n, where n is the degree of the polynomial m(x)).
In the last two examples above the quotient ring is actually a field. On
the other hand, we know that the rings Zm are fields when m is a prime,
and this occurs precisely when hmi is a maximal ideal in Z. These facts are
indeed related:
Theorem 4.2.9. If A is an abelian ring with identity and I ( A is an ideal,
the quotient A/I is a field if and only if I is a maximal ideal in A.
Proof. Let us assume first that I is a maximal ideal in A. We need to show
that A/I has an identity 1 6= 0, and that if a 6= 0, there exists x ∈ A such
that ax = 1.
Let us observe first that 1 6∈ I, i.e., 1 6= 0, for otherwise we would have
I = A. Next, given a 6= 0, i.e., a 6∈ I, consider the set J = {ax + b : x ∈
A and b ∈ I}. Obviously, a ∈ J and so J 6= I. It is also obvious that I ⊂ J.
Since A is abelian, one checks immediately that J is an ideal in A. Since I
is assumed to be a maximal ideal we must have J = A. Therefore, we have
1 ∈ J which means that there exists x ∈ A and b ∈ I such that 1 = ax + b,
or ax = 1.
Conversely, assume now that A/I is a field and let π : A → A/I be the
quotient map. If J ) I is an ideal in A containing I, then {0} =
6 π(J) ⊂ A/I
is an ideal. It follows that there exists a ∈ J such that a ≡ 1, i.e., 1 = a + b,
for some b ∈ I. Since I ⊂ J, we conclude that 1 ∈ J, so that J = A. This
shows that I is a maximal ideal.
Whenever D is an integral domain, the polynomial ring D[x] is also an
integral domain, and m(x) is irreducible if and only if hm(x)i is a maximal
ideal maximal among all principal ideals in D[x] (see Proposition ??). When
D = K is a field, all ideals in K[x] are principal, so the theorem gives:
Corollary 4.2.10. If K is a field, the quotient ring K[x]/hm(x)i is a field
if and only if m(x) is an irreducible polynomial in K[x].
This corollary explains why in the examples above the quotient rings
are fields. As in those examples, this corollary can be used to obtain field
extensions of known fields and, in particular, to construct new fields. On
the other hand, Theorem 4.2.9 can be used to define the real numbers from
the rationals numbers and to verify that the axiomatics of the real numbers
is also a consequence of the axioms of the integers, presented in Chapter
??. We shall do this in the next section, where we will also present a formal
204
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
definition of the complex numbers, identified as the quotient of R[x] by the
ideal hx2 + 1i.
Exercises.
1. Show that if H is a subgroup of G and [G : H] = 2, then H is a normal
subgroup of G.
2. Show that N is a normal subgroupo of G if and only if there exists a group
H and a homomorphism φ : G → H whose kernel is N .
3. Let N be a normal subgroup of G, and π : G → G/N the quotient map.
(a) Show that if N ⊂ H ⊂ G where H is a subgroup of G, then N is a normal
subgroup of H, and H/N is a subgroup of G/N .
(b) Show that the subgroups of G/N take the form H/N , where H is a
subgroup of G which contains N .
4. Let N be a normal subgroup of G and x ∈ G.
(a) Assume that the order of x in G is finite and equal to m. Show that the
order of x in G/N is finite and divides m.
(b) Find examples where x has infinite order in G and x in G/N has (i) finite
order and (ii) infinite order.
5. Let A be a ring and I an ideal in A, verify that the product in the quotient
ring A/I, which is defined as (a1 + I)(a2 + I) = a1 a2 + I, does not in general
correspond to the product of sets, which was defined as CD = {cd : c ∈
C and d ∈ D}.
6. Prove Theorem 4.2.7.
7. Show that the function φ : Q ⊕ Q → Q[x]/hx2 + 1i, given by φ(a, b) = ax + b,
is a bijection.
8. Find the tables for addition and multiplication of the quotient ring Z2 [x]/hx2 +
1i. Check that this ring is not a field. Why doesn’t this contradict Theorem
4.2.9?
9. Consider the ring Q[x]/hm(x)i, where m(x) = x6 + x4 + x2 + 1. Determine
the inverse of x + 2. Check if this ring has any zero divisors and if yes, give
an example of one such element.
10. Show that Q[x]/hx2 − 3x + 2i is isomorphic to Q ⊕ Q. Hint: Show that
the map φ : Q[x]/hx2 − 3x + 2i → Q ⊕ Q given by φ(p(x)) = (p(1), p(2)) is
well defined and yields an isomorphism of rings.
4.3. REAL AND COMPLEX NUMBERS
205
11. Let L = Z2 [x]/hx2 + x + 1i. Find the factorization of the polynomial
x2 + x + 1 in L[x].
12. Let m(x) be an irreducible polynomial of degree n in K[x], and let L =
K[x]/hm(x)i. Show that:
(a) L is a field and a vector of dimension n over K;
(b) The field L is an extension of K;
(c) m(x) has at least one root in L;
(d) There exists a field extension of K, where m(x) is a product of factors of
degree 1.
13. Show that the polynomial x3 + x2 + 1 is irreducible in Z2 [x]. Use this fact
to give an example of a field L with 8 elements. Find the factorization of the
polynomial x3 + x2 + 1 in L.
14. Let I ⊂ A be an ideal, and π : A → A/I the quotient map π(a) = a. Show
that J ⊂ A/I is an ideal in A/I if and only if J = π(J), where J is an ideal in
A and I ⊂ J.
15. Find all the ideals of Z2 [x]/hx2 + 1i.
16. Show that a non-abelian group G with 6 elements is isomorphic to S3 , by
showing that:
(a) G has an element x of order 3 and H =< x > is normal in G.
(b) G has an element y of order 2 and y 6∈ H.
(c) Show that yx = xy 2 (use yx ∈ xH). Conclude that G ≃ S3 .
4.3
Real and Complex Numbers
It is more or less obvious that the rational numbers can by represented by
points in a line: to determine which point corresponds to a given rational
number one just needs to fix two points in the line that correspond to the
numbers 0 and 1. This correspondence was of course known to the Ancient
Greeks, who also discovered an interesting phenomena: although to each
rational number corresponds a point in the line, there are points in the line
which do not correspond to any rational number. The Greeks interpreted
this as mistake by the Gods, since it suggested that the rational numbers,
a byproduct of the natural numbers, were in some sense insufficient. In
fact, for some time they even tried to hide this phenomena from general
206
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
knowledge, afraid that it could cause the wrath of the Gods. Actually, as
we shall see in this section, the Greeks were wrong since the real numbers,
which correspond to all points in the line, can be defined from the rational
numbers, and therefore also from the natural numbers.
In modern language, the “insufficiency” of the rational numbers can be
expressed in terms of Cauchy sequences. We start by recalling its definition,
adapted to the special case of the rational numbers.
Definition 4.3.1. Let x = (x1 , x2 , . . . ) be a sequence in Q. The sequence
is called:
(a) bounded, if there exists M ∈ Q such that
|xn | ≤ M, ∀n ∈ N.
(b) convergent in Q with limit l ∈ Q, if
∀ε ∈ Q+ , ∃N ∈ N : n ≥ N =⇒ |xn − l| < ε.
(c) Cauchy, if
∀ε ∈ Q+ , ∃N ∈ N : n, m ≥ N =⇒ |xn − xm | < ε.
As usual, if x = (x1 , x2 , . . . ) is a convergent sequence with limit l, then
one writes xn → l or limn→∞ xn = l. One has the usual results about
convergence of sums, products and differences of convergent sequences. Also,
it is easy to show that in Q:
(i) Every convergent sequence is Cauchy, and
(ii) Every Cauchy sequence is bounded.
On the other hand, there are Cauchy sequences in Q which are not convergent, as illustrated in the following example:
Example 4.3.2.
Consider the map f : Q → Q defined by f (x) =
that f (x) > 1, because
(x − 1)2 + 1 > 0
=⇒
=⇒
=⇒
x2 +2
2x .
If x > 0, we observe
x2 − 2x + 2 > 0,
x2 + 2 > 2x,
x2 + 2
> 1.
2x
4.3. REAL AND COMPLEX NUMBERS
207
If x, y > 0, we also observe that
f (x) − f (y) =
(xy − 2)(x − y)
xy − 2 x − y
=
.
2xy
xy
2
If, additionally, x, y ≥ 1, it is easy to check that −1 ≤
g(z) = 1 − 2z is increasing for z > 0. Therefore,
|f (x) − f (y)| ≤
xy−2
xy
< 1, since
1
|x − y|.
2
Now let {xn }n∈N be the sequence in Q defined by
x1 = 1, and xn+1 = f (xn ) se n ∈ N.
For n > 1 we have:
|xn+1 − xn | = |f (xn ) − f (xn−1 )| ≤
1
|xn − xn−1 |,
2
so that
1
|x2 − x1 |.
2n−1
We leave as an exercise to verify that if m > n then
|xn+1 − xn | ≤
|xm − xn | ≤
1
|x2 − x1 |,
2n−2
so the sequence {xn }n∈N is a Cauchy sequence.
Although this sequence is a Cauchy sequence, it is not convergent in Q. In
fact, we have
xn+1 = f (xn ) =⇒ 2xn+1 xn = x2n + 2,
so that if xn → x, then 2x2 = x2 + 2, or x2 = 2, an equation that we know has
no solutions in Q.
Although the sequence in the previous example is not√convergent in Q,
it is obviously convergent in R to the irrational number 2. You probably
know that every Cauchy sequence in Q converges to a real number, which
may or may not be a rational number. This leads to the following motto:
• Every Cauchy sequence in Q determines a real number. 5 .
Of course, distinct Cauchy sequences can determine the same real number,
i.e., can have the same limit. This happens precisely when the difference of
the two sequences converges to zero. In other words:
5
You should compare this with the idea used to define the rational numbers from the
integers: each pair (m, n) of integers with n 6= 0 determines a rational number
208
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
• The Cauchy sequences {xn }n∈N and {yn }n∈N determine the same real
number if and only if (xn − yn ) → 0.
We define two Cauchy sequences, {xn }n∈N e {yn }n∈N , to be equivalent if
(xn − yn ) → 0. This leads to the main idea we will use to define the real
numbers out of the rational numbers: a real number is an equivalence classe
of Cauchy sequences in Q.
In order to pursue this idea6 , we need the following proposition, which is
in fact a special case of the general theory developed in the previous section.
Its proof is left as an exercise.
Theorem 4.3.3. Let A be the set of all sequences of rational numbers.
Then:
(i) The set A with the operations of addition and product of sequences is
a ring.
(ii) The subset B ⊂ A consisting of all Cauchy sequences in Q is a subring
of A.
(iii) The set I formed by all sequences in Q which converge to 0 is a subring
of A and an ideal in B.
If x, y ∈ B are Cauchy sequences in Q, then x and y determine the same
real number if and only if x − y converges to 0, i.e., if and only if x − y ∈ I.
This suggests:
Definition 4.3.4 (Cantor). The ring B/I is denoted by R. Its elements
(which are just equivalence classes of Cauchy sequences in Q) are called
real numbers.
It should be clear that R is an extension of the ring Q: given any rational
q ∈ Q, we can take the constant sequence q given by qn = q, n ∈ N (obviously
a Cauchy sequence), and the map ι : Q → R given by ι(q) = q is an injective
homomorphism. Observe also that the zero in R is the equivalence class of
the zero sequence (the ideal I), and the identity is the equivalence class of
the constant sequence with all terms equal to 1. Of course, any sequence of
rationals converging to 0 also represents I = 0, and any sequence of rationals
converging to 1 is a representative of 1.
6
This way of defining the real numbers is due to Georg Cantor (1845-1918), a German
mathematician who also made fundamental discoveries in Set Theory, including the theory
of “transfinite numbers”.
4.3. REAL AND COMPLEX NUMBERS
209
In order to show that R is a field (which amounts to show that I is a
maximal ideal in B), one needs to show that if x ∈ R − {0} then there exists
y ∈ R such that xy = 1. In terms of Cauchy sequences in Q, this means we
need to show that:
Proposition 4.3.5. If x is a Cauchy sequence in Q which does not converge
to 0, then there exists a Cauchy sequence y in Q such that xn yn → 1.
Proof. Let x be a Cauchy sequence in Q which does not converge to 0. We
leave as an exercise to show that there exists a rational δ > 0 and a natural
number N ∈ N such that |xn | > δ for n ≥ N .
Next, we define a sequence y ∈ Q by
yn =
0,
1
xn ,
se n ≤ N
se n > N.
Notice that if n > N then |yn | = | x1n | ≤ 1δ , hence for n, m > N we have
|ym − yn | =
1
|xn − xm |
≤ 2 |xn − xm | → 0,
|xn xm |
δ
This shows that y is a Cauchy sequence in Q.
Since xn yn = 1 for n > N , it is obvious that xn yn → 1.
Next we show that R is an ordered field. For that we need to define a
set R+ such that:
1. x, y ∈ R+
⇒
x + y ∈ R+ and xy ∈ R+ ;
2. If x ∈ R+ , exactly one of the following 3 conditions hold:
x ∈ R+ or x = 0, or − x ∈ R+ .
After a little thought, one sees there is really just one way to proceed:
Definition 4.3.6. If x ∈ R (i.e., x is a Cauchy sequence in Q), we say that
x is positive if there exists a rational number ε > 0 and N ∈ N, such that
n > N ⇒ xn ≥ ε. The set of positive real numbers is denoted by R+ .
It is now easy to show that
Theorem 4.3.7. R is an ordered field.
210
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
It follows that, as it was explained in Chapter ?? in our discussion of
ordered fields, that one can define |x| = max{x, −x} for any x ∈ R.
If q ∈ Q is a rational number we will denote by q the constant sequence
where qn = q for all n ∈ N. This is both a Cauchy sequence and a convergent sequence in Q. We denote by q the corresponding real number (the
equivalence class determined by q). As we have already mentioned, the map
f : Q → R given by f (q) = q is a ring homomorphism, which allows us to
say that the field R is an extension of the field Q. In Analysis we learn that
any real number can be approximated by a rational number, to an arbitrary
small error, i.e., that “Q is dense in R”. In our approach to the real number,
this idea is formalized precisely as follows:
Proposition 4.3.8. If x and ε are real numbers and ε > 0, then there
exists a rational q ∈ Q such that |x − q| < ε.
Proof. We start by choosing representatives for x and ε, i.e., Cauchy sequences x = (x1 , x2 , . . . ) and ε = (ε1 , ε2 , . . . ) in Q. Since ε > 0, there exists
a rational number r > 0 such that εn ≥ r for all n ≥ N1 , where N1 ∈ N.
We can now obtain the sought “rational number” q by replacing x by
a constant sequence whose general term is one of the terms of x of high
enough order. In fact, since x is a Cauchy sequence, there exists N2 ∈ N
such that
r
n, m ≥ N2 =⇒ |xn − xm | < .
2
If we let q = xN2 , a rational number, we have that 2r < xn − q ≤ 2r for
n ≥ max{N1 , N2 }. Hence, we conclude that −r < x − q < r, and this
implies that |x − q| < ε.
The basic properties of the real numbers, which form the foundations of
Analysis, are usually introduced in an axiomatic form. The usual minimal
set of axioms that one considers includes only the axioms that R is an ordered
field and a completeness axiom, such us the Least Upper Bound Axiom or
Supremum Axiom. This is invoked, for example, to show that every Cauchy
sequence in R is convergent, contrary to what happens in Q.
Our approach to the real numbers in a constructive (as opposed to an
axiomatic) approach. We have already showed that R is an ordered field and
it remains to show that the “Supremum Axiom” is another consequence of
our definition. For this, one shows directly that every Cauchy sequence in
R is convergent:
Theorem 4.3.9. Every Cauchy sequence in R is convergent7 .
7
For this reason, we say that R is a complete field.
4.3. REAL AND COMPLEX NUMBERS
211
We leave the proof as an exercise. Using this result one can show the
“Supremum Axiom” holds in R.
Corollary 4.3.10 (Supremum Axiom). Every non-empty subset of R having
an upper bound must have a least upper bound (a supremum).
Proof. Assume A ⊂ R is a non-empty set having an upper bound: this
means there exists M ∈ R such that x ≤ M , for all x ∈ A. One defines a
sequence in R, by successive bisections, a common method in Analysis. We
let x1 = M .
Since A 6= ∅, there exist a ∈ A and we let a1 := a.Obviously, a1 ≤ x1 ,
1
so we can set a2 := a1 +x
2 . The number a2 splits the interval [a1 , x1 ] into 2
subintervals of equal length. There are two possibilities:
(i) There exists x ∈ A such that x > a2 . In this case we let x2 := x1 ;
(ii) One has x ≤ a2 for all x ∈ A. In this case, we let x2 := a2 .
We repeat this procedure, obtaining a sequence of real numbers which is
easily shown to be a Cauchy sequence. By Theorem 4.3.9, this sequence
converges and one shows easily that its limit is a least upper bound of
A.
This finishes the proof that the real numbers can be constructed from the
rational numbers (and hence, from the integers) and that its properties are
a consequence of the axioms for the integers that were discussed in Chapter
??.
One can construct the complex numbers from the real numbers with
not so much difficulty: since R is an ordered field, it is obvious that the
polynomial x2 + 1 is irreducible in R[x]. Therefore the ring
C=
R[x]
hx2 + 1i
is a field, called the field of complex numbers. The imaginary unit i is,
by definition, the equivalence class of the polynomial x. It clearly satisfies
the identity i2 = −1 in C. We will not discuss in detail the properties of C,
but one can show that C is also a complete field.
Exercises.
1. Let A be an ordered field. Show that any convergent sequence in A is a
Cauchy sequence and that any Cauchy sequence is bounded.
212
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
2. Show that if one sets x1 := 1 and xn+1 := f (xn ), where f is the function in
1
|x2 − x1 |.
Example 4.3.2, then |xn − xm | ≤ 2n−2
3. Show that the set of all Cauchy sequences in Q is a subring of the ring of all
sequences in Q.
4. Show that the set of all sequences in Q which converge to 0 is an ideal in the
ring of all Cauchy sequences in Q.
5. Let x be a Cauchy sequence in Q. Show that the following statements are
equivalent:
(a) x does not converge to 0;
(b) there exists a rational number ε > 0 and subsequence xnk such that
|xnk | ≥ ε for k large enough;
(c) there exists a rational number d > 0 such that |xn | ≥ d for n large
enough.
6. Let x, y ∈ R.
(a) Show that if x, y ∈ R+ , then x + y ∈ R+ and xy ∈ R+ .
(b) Show that either x ∈ R+ , x = 0, or −x ∈ R+ , but none of these hold
simultaneously.
7. Give a proof of Theorem 4.3.9 and finish the proof of Corollary 4.3.10.
8. Show that the order of the real numbers is unique, i.e., show that if R is an
ordered field then x ∈ R+ if and only if there exists y ∈ R − {0} such that
x = y2.
9. Show that R is not countable. Conclude that R is a transcendental extension
of Q (and a vector space of infinite dimension over Q).
10. Show that if x is a real number and 0 ≤ x < 1, then there exists a sequence
a1 , a2 , . . . such that 0 ≤ an ≤ 9 for all n ∈ N and
x=
11. Show that C is a complete field.
∞
X
an
.
10n
n=1
4.4. ISOMORPHISM THEOREMS FOR GROUPS
4.4
213
Isomorphism Theorems for Groups
Let G and H be groups and let K ⊆ G be a normal subgroup. It is natural to
ask what is the relationship between the group homomorphisms φ : G → H
and the group homomorphisms φ̃ : G/K → H.
On the one hand, since the quotient map π : G → G/K, given by
π(x) = x = xK is a group homomorphism, it is obvious that for any group
homomorphism φ̃ : G/K → H the composition φ := φ̃ ◦ π : G → H is a
group homomorphism and that φ̃(x) = φ(x).
On the other hand, given a group homomorphism φ : G → H one may
wonder if there exists some homomorphism φ̃ : G/K → H, such that φ̃(x) =
φ(x). This is illustrated by the following commutative diagram, where the
dot arrow is used to indicate that we seek the existence of the corresponding
homomorphism:
G
π
G/K
φ
7/ H
♦♦
♦
♦♦
♦ ♦ φ̃
Let us assume that there exists a homomorphism φ̃ : G/K → H such
that φ̃(x) = φ(x). If x ∈ K then x = K is the identity in G/K, so φ̃(x) is the
identity in H. Then φ(x) = φ̃(x) is also the identity in H so that x ∈ N (φ).
In other words, a necessary condition for the existence of φ̃ : G/K → H is
that K ⊆ N (φ). Actually, this condition is also sufficient:
Proposition 4.4.1. Let K ⊂ G be a normal subgroup. Given a group
homomorphism φ : G → H there exists a group homomorphism φ̃ : G/K →
H such that φ̃(x) = φ(x) if and only if K ⊆ N (φ). Moreover, every group
homomorphism φ̃ : G/K → H arises in this way.
Proof. We have already observed that if φ̃ : G/K → H is a group homomorphism then φ = φ̃ ◦ π is a group homomorphism φ : G → H with kernel
N (φ) ⊇ K.
For the converse, suppose that φ : G → H is a group homomorphism
with K ⊆ N (φ). Then we define φ̃ : G/K → H by setting:
φ̃(xK) := φ(x).
This map is well defined because if x′ = xk, with k ∈ K, then φ(x′ ) =
φ(xk) = φ(x)φ(k) = φ(x). Also, φ̃ is a group homomorphism because:
φ̃(x) · φ̃(x′ ) = φ(x)φ(x′ ) = φ(x · x′ ) = φ̃(x · x′ ).
214
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Examples 4.4.2.
1. Let G = Z, H = Zn and K = hki. Denoting the quotient map π : Z → Zm by
πm , we consider the homomorphism φ = πn : Z → Zn given by πn (x) = x ∈ Zn :
φ=πn
/
♣8 Zn
♣
♣
π=πk
♣ ♣φ̃
♣
♣
Zk
Z
Since the kernel of φ is N (φ) = hni, there exists a homomorphism φ̃ : Zk → Zn
such that φ̃(πk (x)) = πn (x) if and only if hki ⊆ hni, i.e., i.e., if and only if n|k.
We can write φ̃(x) = x but note, in general, φ̃ is not the identity. For example,
if k = 4 and n = 2, we have that φ̃(0) = φ̃(2) = 0, and φ̃(1) = φ̃(3) = 1.
2. Let H = {1, i, −1, −i} and let φ : Z → H be the group homomorphism
φ(n) = in . Its kernel is N (φ) = h4i, so taking K = N , we conclude that there
exists a group homomorphism φ̃ : Z4 → H such that φ̃(n) = in . Since φ̃(0) = 1,
φ̃(1) = i, φ̃(2) = −1, and φ̃(3) = −i, its is clear that φ̃ is an isomophism.
φ
/ {1, i, −1, −i}
6
❧❧
❧
π
❧ ❧φ̃
❧❧❧
Z4
Z
3. Let G = Z, H = Z210 , K = hki, and consider the homomoprhism φ : Z →
Z210 given by φ(x) = 36x. The kernel of φ is
N (φ) = {x ∈ Z : 210|36x} = {x ∈ Z : 35|6x} = h35i.
We conclude that there exists a group homomorphism φ̃ : Zk → Z210 such that
φ̃(x) = 36x if and only if hki ⊆ h35i, i.e., if and only if 35|k. In particular,
φ̃ : Z70 → Z210 , given by φ̃(x) = 36x, is a well defined group homomorphism.
Obviously, if φ and φ̃ are as in Proposition 4.4.1, then they have the
same image, so φ̃ is surjective if and only if φ is surjective. The injectivity
of φ̃ is discussed in the following:
Proposition 4.4.3. Let φ : G → H be a group homomorphism with kernel
N (φ) ⊇ K, for some normal subgroup K ⊆ G. Denote by π : G → G/K
the quotient map and φ̃ : G/K → H the induced homomorphism. Then the
kernel of φ̃ is N (φ)/K = π(N (φ)). In particular, φ̃ is injective if and only
K = N (φ).
4.4. ISOMORPHISM THEOREMS FOR GROUPS
215
Proof. Let e be the identity of H. Then:
N (φ̃) ={x ∈ G/K : φ̃(x) = e}
={x ∈ G/K : φ(x) = e}
={x ∈ G/K : x ∈ N (φ)} = π(N (φ)) = N (φ)/K.
Examples 4.4.4.
1. In Example 4.4.2.3, where G = Z, H = Z210 , K = h70i, and φ : Z → Z210
is given by φ(x) = 36x, we saw that the kernel of φ is N (φ) = h35i. Hence,
it follows that the kernel of the induced homomorphism φ̃ : Z70 → Z210 is
N (φ)/K = π70 (h35i) = h35i = {35, 0}.
2. If in the previous example we let instead K = h35i, we conclude that the
group homomorphism
φ̃ : Z35 → Z210 ,
x 7−→ 36x,
is injective.
If φ : G → H is a surjective group homomorphism and K = N (φ) the
previous proposition reduces to an important basic result in Group Theory
which we shall use repeatedly:
Theorem 4.4.5 (First Isomorphism Theorem). If φ : G → H is a surjective
group homomorphism and K = N (φ) , then G/N (φ) and H are isomorphic:
there exists a group isomorphism φ̃ : G/N → H given by φ̃(x) = φ(x), for
all x ∈ G:
G
π
G/N
φ
/
♦7 H
♦
♦
♦♦
♦
φ̃
♦
The First Isomorphism Theorem allows one to establish that groups
of a very distinct nature are actually isomorphic. Note that even when a
homomorphism φ : G → H is not surjective we can still apply the theorem,
replacing H by φ(G), and concluding that there is an isomorphism:
φ(G) ≃ G/N (φ).
216
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Examples 4.4.6.
1. We saw in Example 4.4.2.2 that the multiplicative group H = {1, i, −1, −i} is
isomorphic to the additive group Z4 . More generally, consider the multiplicative
group Rn consisting of the nth-roots of the unit Rn = {αk : k ∈ Z} = hαi,
2π
where α = e n i . The group homomorphism φ : Z → Rn defined by φ(k) = αk
is surjective and its kernel is N (φ) = {k ∈ Z : αk = 1} = hni. Hence, Rn is
isomorphic to the additive group Zn .
2. Let φ : Sn → Z2 be the surjective homomorphism given by φ(ρ) = sgn(ρ).
Its kernel (by definition) is the alternating group An . Hence we conclude that
Sn /An is isomorphic to Z2 .
3. Let φ : Z → Zm ⊕ Zn be the group homomorphism φ(x) = (πm (x), πn (x)).
Its kernel is given by:
N (φ) = {x ∈ Z : x ≡ 0 (mod m) and x ≡ 0 (mod n)}
= {x ∈ Z : m|x and x|n} = hlcm(m, n)i.
It follows that the homomorphism φ̃ : Zlcm(m,n) → Zm ⊕ Zn , x 7→ (x, x) is
injective, and that Zlcm(m,n) ≃ φ̃(Zlcm(m,n) ).
In particular, when n and m are relatively prime numbers, so that we have
lcm(m, n) = mn, it follows that Zmn and Zm ⊕ Zn are isomorphic, since both
groups have the same number of elements. We leave it to the exercises to
show that this map is actually the only isomorphism of rings between Zmn and
Zm ⊕ Zn , and to determine its inverse.
The example above concerning the roots of the unit is much more general
that it looks at first sight. In fact, if G is a multiplicative group with identity
e and α ∈ G, we know that the group generated by α is hαi = {αk : k ∈ Z}.
The group homomorphism φ : Z → hαi given by φ(k) = αk is always
surjective and its kernel is N (φ) = {k ∈ Z : αk = e}. Since N (φ) is a
subgroup of Z, we have that N (φ) = hni, where n ≥ 0. There are two
possibilities for n:
(i) n = 0 ⇐⇒ N (φ) = {0}: this means that φ is injective, so we conclude
that hαi ≃ Z, and hαi is an infinite group. The element α has infinite
order.
(ii) n > 0 ⇐⇒ N (φ) 6= {0}: then n is the smallest positive integer in
N (φ), i.e., the smallest positive solution of the equation αk = e. In
this case, hαi ≃ Z/hni = Zn , and hαi had n elements. Therefore, α
has order n, and the order of an element α is precisely the smallest
natural number such that αk = e.
4.4. ISOMORPHISM THEOREMS FOR GROUPS
217
Examples 4.4.7.
1. Consider the permutation ε in S3 . We have ε1 = ε, ε2 = δ and ε3 = I.
Hence, ε has order 3, and hεi = {ε, δ, I} = A3 ≃ Z3 .
2. Recall that D5 is the symmetry group of a regular pentagon. The order of a
non-trivial rotation r ∈ D5 is 5, so that hri ≃ Z5 . More generally, the group
Dn has a subgroup H ≃ Zn , and this is a normal subgroup in Dn (why?).
Definition 4.4.8. A group G is called a cyclic group if there exists some
g ∈ G such that hgi = G. The element g is called a generator of G.
Examples 4.4.9.
1. The group Z is cyclic, with generators 1 and −1.
2. A3 is a cyclic group: we can take g = ε or g = δ.
3. The group {1, i, −1, −i} is cyclic: we can take either g = i or g = −i.
4. The groups Zn are cyclic: any element of Z∗n is a generator.
5. The Z2 ⊕ Z4 is not cyclic (why?).
The next result identifies all cyclic groups and it is just summarizes what
we saw above:
Corollary 4.4.10 (Classification of cyclic groups). If G is a cyclic group
then either:
(i) G is infinite and G ≃ Z, or
(ii) G is finite with n elements and G ≃ Zn .
Also, applying Lagrange’s Theorem, we obtain the classification of all
groups of prime order:
Corollary 4.4.11 (Classification of groups of prime order). If G is a finite
group of order p, with p prime, then G ≃ Zp .
There are also other isomorphism theorems for groups, which are less
used than the First Isomorphism Theorem, but which are still useful. They
are actually consequences of the First Isomorphism Theorem.
218
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Theorem 4.4.12 (Second Isomorphism Theorem). Let N and H be subgroups of G, and assume that N is normal in G. Then HN is a subgroup
of G, N is normal in HN , H ∩ N is normal in H, and
HN
H
≃
.
N
H ∩N
Proof. Exercise 10 in Section 4.1 shows that if H and N are subgroups of G,
then HN is a subgroup of G if and only if HN = N H. Since N is normal
in G, we have that HN = N H, and we conclude that HN is a subgroup of
G.
Now take the quotient homomorphism π : G → G/N and restrict it to
H: one obtains a group homomorphism φ : H → G/N given by
φ(x) = π(x) = x,
(x ∈ H).
The kernel of φ is N (φ) = {x ∈ H : x ∈ N } = H ∩ N . On the other hand,
its image φ(H) is a subgroup of G/N , so that φ(H) = K/N , where K is a
subgroup of G which contains both H and N . Therefore HN ⊆ K. But also
any element in K is equivalent to an element in H, i.e., if k ∈ K then there
exists h ∈ H and n ∈ N such that k = hn. Hence we have that K = HN .
Now the First Isomorphism Theorem applied to φ gives:
HN
H
≃
.
N
H ∩N
Notice, as a special case of the Second Isomorphism Theorem, that if
H ∩ N = {e} then HN/N ≃ H.
Theorem 4.4.13 (Third Isomorphism Theorem). If K ⊂ H ⊂ G are normal subgroups of G, then K is a normal subgroup of H, H/K is a normal
subgroup of G/K, and
G/K
G
≃ .
H/K
H
Proof. Assume that K ⊆ H ⊆ G, where K and H are normal in G. Let
πK : G → G/K and πH : G → G/H be the usual quotient homomorphisms,
so that πK has kernel K and πH has kernel H. Consider the diagram:
G
π=πK
G/K
φ=πH
/
7 G/H
♦
♦
♦♦
♦
♦ ♦ φ̃
4.4. ISOMORPHISM THEOREMS FOR GROUPS
219
The existence of φ̃ is a consequence of the assumption K ⊆ H (see Proposition 4.4.1). The homomorphism φ̃ is clearly surjective, since πH is surjective. Applying Proposition 4.4.3, the kernel of φ̃ is the group H/K. Now the
First Isomorphism Theorem applied to the homomorphism φ̃ : G/K → G/H
yields the conclusion of the theorem.
This result states that the quotients of quotients of G are isomorphic to
quotients of G. Notice, by the way, that this is just a generalization of the
remarks in Example 4.4.2.1.
Examples 4.4.14.
1. Let G = Z, H = h3i, and K = h6i. Obviously K ⊂ H, and since both K and
H are normal subgroups of Z, we find
G/K = Z/h6i = Z6 ,
H/K = h3i/h6i = h3i ⊂ Z6 , and
G/H = Z/h3i = Z3 .
According to the Third Isomorphism Theorem, we conclude that Z6 /h3i and Z3
are isomorphic.
2. The previous example is a special instance of the following general fact: if
n|m then Zm /hni ≃ Zn . In fact, let G = Z, H = hni, and K = hmi, so
that K ⊂ H are normal subgroups in Z. Then G/K = Zm , G/H = Zn , and
H/K = hni ⊆ Zm , so that the Third Isomorphism Theorem yields our claim.
Exercises.
1. Let H = hgi = {g n : n ∈ Z} be a cyclic group. Show that
(a) if H is infinite, its only generators are g and g −1 ;
(b) if H has m elements, the order of g n is
m
d,
where d = gcd(n, m);
n
(c) if H has m elements, g is a generator of H if and only if gcd(n, m) = 1.
2. Assume that g1 and g2 belong to an abelian group G and have order, respectively, n and m. Show that the order of g1 g2 divides lcm(n, m). Conclude that
the subset formed by all elements of finite order is a subgroup of G.
3. Decide which of the groups Z4 , Z2 ⊕ Z2 , Z8 , Z4 ⊕ Z2 and Z2 ⊕ Z2 ⊕ Z2 are
isomorphic.
4. Let G = Z ⊕ Z and N = {(n, n) : n ∈ Z}. Show that G/N ≃ Z.
220
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
iπ
5. Consider the multiplicative group H = he 4 i ⊂ C formed by the complex
solutions of z 8 = 1.
(a) Determine all the generators and subgroups of H;
(b) Determine all the automorphisms φ : H → H.
6. Determine all the generators and subgroups of H = h(123456)i ⊂ S6 .
7. Show that Aut(Zn ) ≃ Z∗n .
8. Show that Zmn and Zm ⊕ Zn are isomorphic groups/rings if and only if m
and n are relatively prime.
9. Assume that G is a finite group, H is a normal subgroup of G, K is a subgroup
of G, G = HK, and G/H is isomorphic to K. Show that H ∩ K = {e}.
10. Show that if n > 1, then Zpn is not isomorphic to ⊕nk=1 Zp .
11. Show that the quotient Z40 /h15i is isomorphic to Zn and find n.
12. Assume that the only subgroups of G are the trivial groups {1} and G.
Show that G is a cyclic group of prime order.
13. Show that if G is an abelian group of order pq, when p and q are both prime,
then G is cyclic.
14. Classify the groups with 2p elements, where p > 2 is a prime. (Hint: Show
that there exists an element x of order p, and that every element of order p
belongs to < x >).
15. Classify the groups with 8 elements, by proceeding as follows:
(a) Show that if G is abelian, then it is isomorphic to one of the groups Z8 ,
Z4 ⊕ Z2 , or Z2 ⊕ Z2 ⊕ Z2 , and these are non isomorphic groups.
(b) Assuming that G is not abelian, show that:
(i) G has an element x of order 4, and H =< x > is normal in G.
(ii) Assuming y 6∈ H, show that y 2 ∈ H, where y 2 = 1 or y 2 = x2 .
(iii) Finally show that yx ∈ Hy, where yx = x3 y (note that the order of
yxy −1 is the order of x).
(iv) Compare the result with the tables of D4 and H8 .
16. Let G and H be groups, with normal subgroups K ⊂ G and N ⊂ H. Show
that (G × H)/(K × N ) is isomorphic to (G/K) × (H/N ).
4.5. ISOMORPHISM THEOREMS FOR RINGS
4.5
221
Isomorphism Theorems for Rings
When A and B are rings, I ⊆ A is an ideal of A, and φ : A → B is a
homomorphism of rings, we may apply the results from the previous section
to φ as a homomorphism of the additive groups (A, +) and (B, +). In
particular, we know that if the kernel of φ contains I, then there exists
a group homomorphism φ̃ : A/I → B such that we have a commutative
diagram.
A
π
A/I
φ
/
♣7 B
♣
♣
♣♣
♣
φ̃
♣
However, we note that if φ is a homomorphism of rings then we have
φ̃(x) · φ̃(x′ ) = φ(x)φ(x′ ) = φ(x · x′ ) = φ̃(x · x′ ).
Hence, the group homomorphism φ̃ is a ring homomorphism as long as the
original homomorphism φ is a ring homomorphism. This allows us to obtain
immediately the analogues for rings of Propositions 4.4.1 and 4.4.3.
Proposition 4.5.1. Let A and B be rings, I ⊆ A an ideal in A, and denote
by π : A → A/I the quotient map, π(x) = x + I.
(i) The ring homomorphisms φ̃ : A/I → B are the maps of the form
φ̃(π(x)) = φ(x), where φ : A → B is a ring homomorphism with
kernel N (φ) ⊇ I.
(ii) If φ : A → B is a ring homomorphism with kernel N (φ) ⊇ I, then
the induced ring homomorphism φ̃ : A/I → B has kernel N (φ)/I =
π(N (φ)). In particular, φ̃ is injective if and only if I = N (φ).
Given a ring A, a map φ : Z → A is a homomorphism of the underlying
additive groups if and only if h(n) = na, for some fixed a ∈ A. It is easy to
check that φ is a ring homomorphism if and only if additionally a = φ(1) is
a solution of x2 = x in A. We make use of this simple remark to look back
at the examples of the previous section, now from the perspective of their
ring structure.
222
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Examples 4.5.2.
1. Let A = Z, B = Zn , and I = hki. The quotient φ = πn : Z → Zn is a
homomorphism of rings. If n|k then
φ=πn
/
♣8 Zn
♣
♣
π=πk
♣ ♣φ̃
♣
♣
Zk
Z
where φ̃ : Zk → Zn , given by φ̃(πk (x)) = πn (x) is a homomorphism of rings.
2. Let A = Z, B = Z210 and I = hki. The equation x2 = x has (non-obvious)
solutions in Z210 , such as x = 21. Hence, the homomorphism φ : Z → Z210
given by φ(x) = π210 (21x) is a ring homomorphism. It is easy to check that
its kernel is N (φ) = h10i. We conclude that there exists a ring homomorphism
φ̃ : Zk → Z210 such that φ̃(x) = 21x if and only if 10|k. Obviously, φ̃ is
injective when k = 10. In fact, when k = 10 we have that φ(Z) = φ̃(Z10 ) = h21i
is unitary subring of Z210 , isomorphic to Z10 .
Z
π=π10
Z10
φ
/
7 Z210
♦
♦
♦♦
♦
♦ ♦ φ̃
3. Let A = Q[x], B = Q, and denote by φ : Q[x] → Q the evaluation map
φ(p(x)) = p(1). By the Polynomial Remainder Theorem, we know that φ is
a homomorphism of rings with kernel N (φ) = hx − 1i. If I = hm(x)i is the
ideal of Q[x] generated by the polynomial m(x), it follows that there exists a
homomorphism of rings φ̃ : Q[x]/I → Q, defined by φ̃(p(x)) = p(1), if and
only if (x − 1)|m(x), i.e, if and only if p(1) = 0.
Using Proposition 4.5.1 we obtain immediately the First Isomorphism
Theorem for for rings:
Theorem 4.5.3 (First Isomorphism Theorem for rings). If φ : A → B
is a surjective ring homomorphism and I is the kernel of φ, the the rings
A/I and B are isomorphic. In particular, there is an isomorphism of rings
φ̃ : A/I → B such that φ̃(a) = φ(a), for all a ∈ A.
The other isomorphism theorem for rings follow by a simple application
of the First Isomorphism Theorem for rings, exactly in the same manner as
we did in the case of groups. We state them and leave the easy proofs for
the exercises:
4.5. ISOMORPHISM THEOREMS FOR RINGS
223
Theorem 4.5.4 (Second Isomorphism Theorem for rings). Let A be a ring,
I an ideal of A, and B a subring of A. Then I + B is a subring of A, I is
an ideal in I + B, I ∩ B is an ideal in B, and there is an isomorphism of
rings
I +B
B
≃
.
I
I ∩B
Theorem 4.5.5 (Third Isomorphism Theorem for rings). Let A be a ring,
I, J ideals of A with I ⊂ J. Then I is an ideal of J, J/I is an ideal of A/I,
and there is an isomorphism of rings
A
A/I
≃ .
J/I
J
Examples 4.5.6.
1. Let us check that when n and m are relatively prime then the rings Znm
and Zn ⊕ Zm are isomorphic. Just like in the case of groups, it is enough to
observe that φ = πnm : Z → Zn ⊕ Zm given by φ(k) = (πn (k), πm (k) is a
homomorphism of rings. Hence, by the First Isomorphism Theorem for rings,
the isomorphism of groups φ̃ : Znm → Zn ⊕ Zm in Example 4.4.6.3 is also an
isomorphism of rings.
2. Let A = Z and assume that n|m. If we let I = hmi and J = hni, then
J ⊇ I and I and J are ideals in Z. In this case, A/I = Zm , A/J = Zn , and
J/I = hni ⊆ Zm . We conclude from the Third Isomorphism Theorem that the
rings Zm /hni and Zn are isomorphic. In particular, the quotient rings formed
from the rings Zm are always rings Zn .
A
I
= Zm
π
A/I
J/I
=
Zm
hni
φ
/ Zn =
♠6
♠
♠ ♠
♠ ♠ φ̃
A
J
3. Let α ∈ C be algebraic over Q, α 6∈ Q. Denote by m(x) its minimal polynomial. Recall that the evaluation map φ : Q[x] → C given by φ(p(x)) = p(α) is a
homomorphism of rings, with kernel N (φ) = hm(x)i and that Q[α] = φ(Q[x]).
We conclude by the First Isomorphism Theorem for rings that
Q[α] ≃
Q[x]
.
hm(x)i
Notice that since m(x) is an irreducible polynomial, we have that Q[x]/hm(x)i
is a field. Hence (why?):
Q[α] ≃
Q[x]
≃ Q(α).
hm(x)i
224
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
We now illustrate the Isomorphism Theorems for rings with two applications.
First, we can use the result of Example 4.5.6.1 to compute the Euler
function ϕ : N → N. Recall that this function is defined as ϕ(n) = |Z∗n |,
i.e., ϕ(n) is the number of invertible elements in the ring Zn , or also, the
number of integers 1 ≤ k ≤ n which are relatively prime to n.
Lemma 4.5.7. If n1 , . . . , nk are relatively prime, then
ϕ(n1 · · · nk ) = ϕ(n1 ) · · · ϕ(nk ).
Proof. We give a proof only for the case k = 2. The cases k > 2 can be
obtain by a simple induction procedure.
Recall that if A and B are rings with units, then (A ⊕ B)∗ = A∗ × B ∗ .
Hence, if C ≃ A ⊕ B, and the rings are finite, it is obvious that |C ∗ | =
|A∗ ||B ∗ |.
We apply this to A = Zn , B = Zm , and C = Znm , assuming that n and
m are relative primes. Since Znm ≃ Zn ⊕Zm , we conclude immediately that:
ϕ(nm) = ϕ(n)ϕ(m).
The next theorem shows that one can compute ϕ(n) in a straightforward
fashion, as long as one knows the prime factors of n:
Q
Theorem 4.5.8. If n = ki=1 pei i is the prime factorization of n then
ϕ(n) = n
k Y
i=1
1−
1
pi
.
Proof. The lemmaQabove shows that if n is a natural number with prime
factorization n = ki=1 pei i , then
ϕ(n) =
k
Y
ϕ(pei i ).
i=1
Now if p is a prime number and m is a natural number, one finds easily
ϕ(pm ): the elements of Zpm which are not invertible are just the elements of
the ideal hpi in Zpm . This ideal has exactly pm /p = pm−1 elements (why?).
Therefore Z∗pm = Zpm − hpi has pm − pm−1 = pm (1 − p1 ) elements. In other
words:
1
ϕ(pei i ) = pei i − piei −1 = pei i (1 − ).
pi
4.5. ISOMORPHISM THEOREMS FOR RINGS
225
From this it follows that:
ϕ(n) =
k
Y
i=1
ϕ(pei i )
=
k
Y
i=1
pei i
1
1−
pi
=n
k Y
i=1
1
1−
pi
.
Example 4.5.9.
The prime factors of 9000 are 2, 3 and 5. Therefore:
1
1
1
1−
1−
= 2400.
ϕ(9000) = 9000 1 −
2
3
5
As another application of the Isomorphism Theorems for rings, we will
now show that the fields Zp and Q are, in some sense, the smallest fields with
a given characteristic. In other words, we will show that any field necessarily
contains a subfield isomorphic to either Zp or Q. In what follows, we will
denote by K a field and denote the identity by 1.
Definition 4.5.10. We say that K is a primitive field if its does not
contain any strictly smaller subfield (i.e., 6= K).
Obviously there exist primitive fields (e.g., Z2 ) and non-primitive fields
(e.g., R). Moreover, any field K has exactly one primitive subfield: the
intersection of all the subfields of K is necessarily a primitive field primitive.
One calls it the primitive subfield of K.
Example 4.5.11.
Obviously, Q is the primitive
subfield
√
√ of R and of C. Similarly, Q is also the
primitive subfield of Q[ 2] = {a + b 2 : a, b ∈ Q}.
Our next result identifies all the possible primitive fields:
Theorem 4.5.12. Let m be the characteristic of K, where m = 0 or m = p,
for a prime number p. Then:
(i) If m = 0, the primitive subfield of K is isomorphic to Q;
(ii) Se m = p, the primitive subfield of K is isomorphic to Zp .
226
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Proof. We give the proof in the case m = p, leaving the case m = 0 for the
exercises.
Consider the homomorphism φ : Z → K defined by φ(n) = n1. It is
easy to check that any subfield of K must contain 1 and, hence, also φ(Z).
It was shown in Chapter ??, that φ(Z) is isomorphic to Zp , and so it is a
field. Since φ(Z) is primitive, it must be the primitive subfield of K.
A field K is always a vector space over its primitive subfield, which may
have finite or infinite dimension. If K is a finite field, its characteristic is
necessary a prime p > 0, so that its primitive subfield J has p elements and
is isomorphic to Zp . The dimension of K as a vector space over J is finite,
for otherwise K would be infinite. Therefore, there exists a natural number
n such that K is isomorphic to the vector space J n . Hence, we have proved:
Theorem 4.5.13. Any finite field K has pn elements, where p is a prime,
namely the characteristic of K.
We know that there are finite fields with exactly p elements, namely
Zp . One can show that, if p is a prime and n is a natural number, there
exist fields with pn elements, and all fields with pn elements are isomorphic.
Hence, up to isomorphism, there exists exactly one field with pn elements,
called the Galois field 8 of order pn , denoted CG(pn ). We will not prove
these statements now, but we remark that if p(x) ∈ Zp [x] is an irreducible
polynomial of degree n, then K = Zp [x]/hp(x)i is a field with pn elements.
Therefore, it must be the Galois field CG(pn ). In this way, one can reduce
the existence of Galois fields to the existence of irreducible polynomials of
arbitrary degree n in Zp .9
Exercises.
1. Prove the Isomorphism Theorems for rings.
2. Let A be a ring with unit which has n elements. Show that:
(a) A ≃ Zn as rings if and only if A has characteristic n.
(b) A ≃ Zn as rings if and only if (A, +) ≃ (Zn , +) as groups.
8
After Évariste Galois (1811-1832). Galois, was a main contributor to a major mathematical discovery of the 19th Century, the Theory of Groups. Galois is also a tragic figure
of the History of Mathematics, for he died at the young age of 21, in a duel because of a
women of dubious reputation. We will discuss in Chapter 7 the theory of Galois.
9
See also Exercise 8 in this section.
4.5. ISOMORPHISM THEOREMS FOR RINGS
227
3. The statement that Zn ⊕ Zm ≃ Znm , if gcd(n, m) = 1, expresses the Chinese
Remainder Theorem in terms of isomorphisms of rings. How does one expresses
the Fundamental Theorem of Arithmetic in terms of isomorphisms of rings?
4. Assume that n, m ∈ N, d = gcd(n, m) and k = lcm(n, m). Show that
Zn ⊕ Zm ≃ Zd ⊕ Zk .
5. Assuming that n and m are relatively prime, show that:
(a) There exists exactly one isomorphism of rings φ : Zmn → Zm ⊕ Zn , and
(b) The inverse isomorphism φ−1 : Zm ⊕ Zn → Zmn is given by φ−1 (x, y) =
φm (x) + φn (y), where φk : Zk → Zmn are injective homomorphisms
such that φk (πk (x)) = πmn (ak x). Find the integers ak and explain the
relationship between φ−1 and the Chinese Remainder Theorem.
6. Consider homomorphisms of rings φ : Zn → Z210 .
(a) For which values of n are there injective homomorphisms φ?
(b) For which values of n are there surjective homomorphisms φ?
7. Solve the equation ϕ(m) = 6, where ϕ is the Euler function, using the following steps:
(a) Show that the prime factors of m must be 2, 3, or 7.
(b) Show that if 7|m, then m = 7 or m = 14.
(c) Show that if 7 is not a factor of m, then 3|m and 9|m.
(d) Determine all solutions of of ϕ(m) = 6.
8. Assume that K ⊂ L are fields, u ∈ L is algebraic over K, and m(x) is the
minimal polynomial of u in K[x]. Show that K[u], K(u) and K[x]/hm(x)i are
isomorphic fields.
9. Let I ⊂ A be an ideal of A. Is it true that A is necessarily isomorphic to the
ring I ⊕ A/I? On the other hand, show that if A is isomorphic to I ⊕ J, then
J is isomorphic to A/I.
10. Let K be a field, and assume that p(x) = q(x)d(x) holds in K[x]. Show
that K[x]/hp(x)i is isomorphic to K[x]/hq(x)i ⊕ K[x]/hd(x)i if and only if
gcd(q(x), d(x)) = 1.
11. Consider p(x) = (x2 + x + 1)(x3 + x + 1) ∈ Z2 [x]. How many invertible
elements are there in Z2 [x]/hp(x)i? And in Z2 [x]/hp(x)2 i?
12. Finish the proof of Theorem 4.5.12.
228
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
13. Let K be a primitive field, let L and M be extensions of K, and let φ : L →
M be a non-zero homomorphism. Show that:
(a) For all a ∈ K: φ(a) = a.
(b) If p(x) ∈ K[x], b ∈ L and p(b) = 0, then p(φ(b)) = 0, i.e., φ takes roots
of p(x) into roots of p(x).
(c) Q[x]/hx3 − 2i is not isomorphic to Q[x]/hx3 − 3i.
14. Show that any ordered field is an extension of Q, i.e., Q is the smallest
ordered field.
15. Show that any complete ordered field is an extension of the reals, i.e., R is
the smallest complete ordered field.
4.6
Free Groups, Generators and Relations
If X is a subset of a group G, the subgroup generated by X is the
intersection of all subgroups of G which contain X and is denoted by hXi.
Similarly to the case of ideals in rings, if X = {x1 , x2 , · · · , xn } is a finite set,
we also write hXi = hx1 , x2 , · · · , xn i.
Example 4.6.1.
If G = S3 then we see immediately that:
h(12)i = {I, (12)}
h(123)i = h(321) =i{I, (123), (321)}
On the other hand, h(12), (123)i = S3 . In fact, for any subset X ⊂ S3 containing a transposition and a 3-cycle we have hXi = S3 .
These examples make it clear that a subgroup can have many different
sets of generators.
The set X is called a generating set of the group G if and only if
hXi = G. This condition is equivalent to say that any element of G can
be written, in multiplicative notation, as a product of positive and negative
powers of elements of X:
g = xn1 1 · · · xnk k ,
xi ∈ X, ni ∈ Z.
If the group G has a finite generating set, i.e., if G = hx1 , x2 , · · · , xk i, then
G is called a finitely generated group.
4.6. FREE GROUPS, GENERATORS AND RELATIONS
229
In the case of an abelian group we use the additive notation. So, for
example, if G is a finitely generated abelian group with generators x1 , . . . , xk ,
then we can write any element of G as:
g = n 1 x1 + n 2 x2 + · · · + n k xk ,
ni ∈ Z.
Examples 4.6.2.
1. A cyclic group G = hαi is, by definition, finitely generated with one generator.
Any element g ∈ G is of the form g = αn , possibly for multiple values of n.
2. Any finite group is finitely generated, since we can always take X = G as a
generating set.
3. The group G = Z ⊕ Z2 ⊕ Z4 is neither cyclic nor finite, but it is finitely
generated: a generating set is X = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}.
4. The abelian group Zk := ⊕ki=1 Z is finitely generated. A generating set is
given by the n vectors of the usual canonical basis of Rk , e1 , e2 , · · · , ek , where
ei has all components zero but the i-th component which is 1. Any element of
g ∈ Zk can be written uniquely as:
g=
k
X
ni e i ,
i=1
ni ∈ Z.
One can also define the normal subgroup of G generated by a subset
X ⊂ G as the intersection of all normal subgroups which contain X. In
the abelian case, of course, this is the same as the subgroup generated by
X, but it the non-abelian the two subgroups are, in general, distinct. For
example, in G = S3 if we let X = {σ}, where σ is any transposition, then the
subgroup generated by X is H = {1, σ}, but the normal subgroup generated
by X is S3 itself.
If X is a generating set for a group G, in general, there are multiple
products of elements of X which yield the identity 1 ∈ G. For example:
(a) For all x ∈ X, we have xx−1 = 1;
(b) If G is cyclic of order m, and X = {x} is a generator, then xm = 1.
One calls expressions formed by products of elements of G which equal the
identity a relation. There are trivial relations, as in example (a), which are
consequences of the groups axioms, and non-trivial relations, as in example
(b), which depend on the specific group G and generating set X. We will
make these concepts more precise later.
230
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Many groups can be completely described, in a very succinct form, by
specifying a set of generators X and relations between these generators which
we write (in multiplicative notation) as:
xc11 xc22 · · · xckk = 1.
Examples 4.6.3.
1. The cyclic group G of order n is completely described by specifying the generator X = {α}, and the relation αn = 1.
2. The group S3 is generated by X = {α, δ}. Its multiplicative table can be
obtained form the relations α2 = 1, δ 3 = 1, and δα = αδ −1 . The last relation
can also be written as αδαδ = 1.
3. The group H8 is generated by X = {i, j}, and is completely described by the
relations i2 j 2 = i4 = 1 and iji = j.
4. The group Dn of symmetries of a regular polygon of n sides is generated be
two elements σ and a ρ satisfying the relations
σ 2 = 1,
ρn = 1,
σρσρ = 1.
The element ρ represents a rotation by 2π/n, while the element σ represents a
reflection on a symmetry axis of the polygon.
In order to make easier the comparison of two distinct groups G and H,
using a single set of generators X, we will also say that G is generated by
a set X if there exists an injective map ι : X → G such that G is generated
by the set ι(X), in the sense we used before. When the map ι is clear from
the context, in order to simplify the notation, we will use the same symbol
to denote both an element x ∈ X and the corresponding element ι(x) ∈ G.
Example 4.6.4.
According to these conventions, the groups S3 , H8 , and Z⊕ Z are all generated
by X = {x1 , x2 }.
Assume now that G is generated by X = {x1 , x2 , . . . , xn }, and let H be
an arbitrary group. A moment of thought shows that:
• Any homomorphism φ : G → H is uniquely determined on all elements
of G, by the values that φ takes on the generators xk , but
4.6. FREE GROUPS, GENERATORS AND RELATIONS
231
• The values yk = φ(xk ) are not arbitrary, because whatever relations
hold for x1 , . . . , xn in G they must also hold for the y1 , . . . , yn in H.
Example 4.6.5.
Any group homomorphism φ : S3 → H is uniquely determined by its values
ρ = φ(α) and ξ = φ(δ). However, the elements ρ, ξ ∈ H cannot be arbitrary,
and must satisfy the same relations that α and δ satisfy, namely: ρ2 = ξ 3 = 1
and ξρξρ = 1.
Still from an informal point of view, a group G generated by X is free
from relations between its generators, if there always exist a homomorphism
φ : G → H, whatever the values φ(x), x ∈ X. We make these ideas more
precise as follows:
Definition 4.6.6. For a set X, we call F a free group (respectively, free
abelian group) on X, if F is a group (respectively, abelian group) and
there exists a map ι : X → F such that the following property holds: for
any group (respectively, abelian group) H and any map φ : X → H there
exists a unique group homomorphism φ̃ : F → H such that the following
diagram commutes:
ι
/F
X ◆◆
✤
◆◆◆
✤
◆◆◆
◆
φ ◆◆◆◆ ✤
&
φ̃
H
Example 4.6.7.
Ln
If X = {x1 , . . . , xn } is a finite set, we consider the group Zn = i=1 Z and
the map ι : X → Zn defined by ι(xi ) = ei , i = 1, . . . , n. If H is any abelian
group and φ : X → H is any map, then we have a group homomorphism
φ̃ : Zn → H given by:
φ̃(k1 , . . . , kn ) = k1 φ(x1 ) + · · · + kn φ(xn ).
This makes the following diagram commute:
ι
/ Zn
{x1 , . . . , xn }
✤
❘❘❘
❘❘❘
✤
❘❘❘
φ̃
❘❘❘
φ
❘❘❘ ✤
(
H
It is easy to see that φ̃ : Zn → H is the only group homomorphism with this
property. Hence, Zn is a free abelian group on X = {x1 , . . . , xn }.
232
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
We leave it as an exercise to check that the definition of free group F
on the set X implies that ι : X → F is injective and F is generated by X.
Notice that, if the group H is also generated by X, so there exists a map
φ : X → H such that hφ(X)i = H, then the homomorphism φ̃ is onto. In
particular, by the First Isomorphism Theorem, the groups F/N and H are
isomorphic, where N denotes the kernel of φ̃. This shows the relevance of
the free group F for the classification of groups: Any group generated by X
is a group quotient of the free group F .
Example 4.6.8.
Any abelian group generated by X = {x1 , . . . , xn } is a group quotient of the
group Zn . We will explore this remark later, in order to obtain a classification
of all finitely generated abelian groups.
We will see now that for a set X, there exists (up to isomorphism)
exactly one free non-abelian group and one free abelian group generated by
X, except when X = {x1 } consists of a single element (where both groups
are abelian and isomorphic to Z).
We start by showing that the free group on X is unique, up to isomorphism.
Proposition 4.6.9. Let F and F ′ be free groups on a set X, relative to
maps ι : X → L and ι′ : X → L′ , respectively. Both in the abelian and in
the non-abelian case, there exists a unique isomorphism ψ : F → F ′ making
the following diagram commute:
> F✤
⑥⑥ ✤
⑥
⑥
⑥⑥
✤
⑥⑥
✤
X❆
✤ψ
❆❆
❆❆
✤
❆❆ ✤
′
❆
ι
ι
F′
Proof. From Definition 4.6.6, if we let H = F ′ and φ = ι′ , we conclude that
there exists a unique homomorphism ψ : F → F ′ which makes the diagram
in the statement commutative. Similarly, interchanging the roles of ι and ι′ ,
we conclude that there exists a unique homomorphism ψ ′ : F ′ → F which
4.6. FREE GROUPS, GENERATORS AND RELATIONS
233
makes the following diagram commutative:
′
F
⑥> ✤
⑥
′
ι ⑥⑥ ✤
⑥⑥
✤
⑥⑥
✤ ′
X❇
✤ψ
❇❇
❇❇
✤
❇
ι ❇❇ ✤
F
It follows that the following diagrams are also commutative:
F
⑦> ✤
ι ⑦⑦⑦ ✤
⑦⑦
✤
⑦⑦
✤
′
X❅
✤ ψ◦ψ
❅❅
❅❅
✤
❅
ι ❅❅ ✤
F
′
F
⑥> ✤
ι′ ⑥⑥⑥ ✤
⑥⑥
✤
⑥⑥
✤ ′
X❆
✤ ψ ◦ψ
❆❆
❆❆
✤
❆❆
ι′ ❆ ✤
F′
Finally, observe that if in the last two diagrams we replace ψ ◦ ψ ′ and ψ ′ ◦ ψ
by the identity homomorphisms we also obtain commutative diagrams. The
uniqueness in the defining property of a free group, allows us to conclude
that ψ ◦ ψ ′ = idF and ψ ′ ◦ ψ = idF ′ . Hence, the homomorphism ψ has a
unique inverse and so it is an isomorphism.
Let us turn now to the existence of a free abelian group generated by X.
We already know that this group exists if X is finite with n elements and it
is isomorphic to Zn . For infinite sets we need the following:
Definition 4.6.10. Let {Gi }i∈I be a family of groups.
Q
(i) The direct product of the Gi ’s, denoted
group
Q by i∈I Gi , is the 10
with supporting set the cartesian product i∈I Gi of the groups and
group operation
Q defined as follows: if g = (g
Qi )i∈I and h = (hi )i∈I are
elements of i∈I Gi , then gh ≡ (gi hi )i∈I ∈ i∈I Gi .
L
(ii) The direct sum
of
the
G
’s,
denoted
of the
i
i∈I Gi , is the subgroup
Q
Q
direct product i∈I Gi consisting of elements (gi )i∈I ∈ i∈I Gi where
only a finite number of gi ’s is different from the identity (in Gi ).
Obviously, when the set of indices is finite the direct sum and direct
product coincide, and are the same as the product defined in Chapter 1. We
leave the proof of the following proposition for the exercises:
10
See the Definition ?? in the Appendix.
234
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
L
Proposition 4.6.11. If X is any set the direct sum F = x∈X Z is a free
abelian group generated by X relative to the map ι : X → F which to each
x0 ∈ X associates the element (gx )x∈X ∈ L, with gx0 = 1 and gx = 0, if
x 6= x0 .
Notice that the map ι in this Proposition is injective. For that reason it
is called the canonical
injection. If we identify each element
L
L xi ∈ X with its
image ι(xi ) ∈ x∈X Z then X can be seen
as
a
subset
of
x∈X Z. Moreover,
L
we can express every element g 6= 0 of x∈X Z in additive notation in the
form
g = n 1 x i1 + n 2 x i2 + · · · + n k x ik ,
where the indices i1 , . . . , ik are all distinct and n1 , . . . , nk are non-zero integers. This expression is unique, up to the order
Lof the factors, and every
expression of this form represents an element of x∈X Z.
Let us explain now how to construct the non-abelian free group on a set
X with more than one element. First we index the elements of X, so that
X = {xi : i ∈ I}, and then we form an “appropriate product” of all free
groups Fi in the sets Xi = {xi }. The product that we need is called the free
product of groups, and is discussed in the following proposition.
Proposition
4.6.12. Let {Gi }i∈I be a family of groups.
There exists a
Q
Q
group ∗i∈I Gi and group homomorphisms φi : Gi → ∗i∈I Gi with the following property: given any group H and group homomorphisms
ψi : Gi → H,
Q
there exists a unique homomorphism of groups ψ : ∗i∈I Gi → H such that
the following diagram commutes for all i ∈ I:
Q
φi
/ ∗ Gi
Gi PP
i∈I✤
PPP
PPP
✤ψ
PP
ψi PPPP ✤
P'
H
Proof. Let {Gi }i∈I be a family of groups. We call a word in the Gi ’s any
finite sequence (g1 , . . . , gn ), where each gk belongs to some Gi . The natural
number n is called the word length, and we conventioneer that the empty
word, denoted 1, has length zero. A reduced word is any word (g1 , . . . , gn )
which satisfies the following two properties:
(a) none of the gk is the identity element of a group Gi ;
(b) none of the successive terms in the word belong to the same group Gi ;
4.6. FREE GROUPS, GENERATORS AND RELATIONS
235
Q
We denote by ∗i∈I Gi the set of all reduced words, including the empty
word 1.
Q
In the set ∗i∈I Gi , we define a binary operation as follows. Let g =
(g1 , . . . , gn ) and h = (h1 , . . . , hm ) be two reduced words with n ≤ m. Let
0 ≤ N ≤ n be the smallest integer such that N < k ≤ n and both gk and
hn−k+1 belong to the same group Gi , while gk hn−k+1 is the identity in Gi .
Then the product gh is the reduced word defined by

if N > 0 and gN , hn−N +1 do not


(g1 , . . . , gN , hn−N +1 , . . . , hm )


belong to the same group,








 (g1 , . . . , gN −1 , gN hn−N +1 , hn−N +2 , . . . , hm ) if N > 0, and gN , hn−N +1
gh =
belong to the same group,







(hn−N +1 , . . . , hm ) if N = 0 and n < m,






1 if N = 0 and n = m.
By definition the product of a reduced word g with the empty word 1 is
g1 Q
= 1g = g. It is easy to check that this operation defines a group structure
in ∗i∈I Gi , with identity the empty word 1, and where the inverse of the
reduced word g = (g1 ,Q
. . . , gn ) is the reduced word g−1 = (gn−1 , . . . , g1−1 ).
Now let φi : Gi → ∗i∈I Gi be the map which to an element g ∈ Gi , with
g 6= e associates the reduced word (g), and to e associates 1. Its is obvious
that φi is a group homomorphism.
Finally, given any group H and group
Q∗ homomorphisms ψi : Gi → H,
we define a group homomorphism ψ : i∈I Gi → H as follows: ψ(1) = e
(the identity in H) and ψ(g1 , . . . , gn ) := ψi1 (g1 ) · · · ψin (gn ), if gk ∈ Gik .
One checks immediately that this is the unique group homomorphism which
makes the following diagram commute for all i ∈ I:
Q
φi
/ ∗ Gi
Gi PP
i∈I✤
PPP
PPP
✤ψ
PP
ψi PPPP ✤
P'
H
Q
The group ∗i∈I Gi is called the free product of the groups Gi . From
now on, we will use the multiplicative notation to write the word (g1 , . . . , gn )
in the form g1 · · · gn .
236
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Our next example illustrates the difference between the free product and
the direct product of groups.
Example 4.6.13.
Let G = {1, g} and H = {1, h} be two cyclic groups of order 2. An element
in the free product G ∗ H can be written as a finite alternating sequence of
products of g and h. For example, the following expressions are elements in
the free product:
g, h, gh, hg, ghg, hgh, ghgh, . . .
Notice that gh 6= hg and that both these elements have infinite order! By
contrast, the direct product G × H is an abelian of order 4!
Q
Given any set X = {xi : i ∈ I} we consider the free product F ≡ ∗i∈I Fi
of all free groups Fi in the sets {xi } (recall that Fi ≃ Z). We have an injective
map ι : X → F which associates to the element xi the reduced word (xi ).
Q
Proposition 4.6.14. If X = {xi : i ∈ I}, the free product F = ∗i∈I Fi is
a free group generated by X relative to the map ι : X → F .
Proof. We need to show that for any group H and map φ : X → H there
exists a unique homomorphism of groups φ̃ : F → H so that we have a
commutative diagram:
ι
/F
X ◆◆
✤
◆◆◆
✤
◆◆◆
◆
φ ◆◆◆◆ ✤
&
φ̃
H
The elements of F are reduced words of the form
xk11 · · · xknn ,
where the k1 , . . . , kn are non-zero integers. It is easy to see that φ̃ must be
defined by
φ̃(xk11 , . . . , xknn ) = φ(x1 )k1 · · · φ(xn )kn .
Any element in the free group generated by the set X = {xi : i ∈ I} can
be written in the reduced form
xk11 · · · xknn ,
4.6. FREE GROUPS, GENERATORS AND RELATIONS
237
and two elements written in this form can be multiplied in the obvious form,
by juxtaposition and elimination, using multiplication in the groups Gi and
cancelation of units.
There is a relationship between the non-abelian and the abelian free
groups in a set X. To see this, let us denote by (G, G) ⊂ G the smallest
subgroup of G which contains all commutator elements:
(g, h) ≡ g −1 h−1 gh,
g, h ∈ G.
It is easy to see that (G, G) is a normal subgroup of G and that the quotient
G/(G, G) is abelian. We will study this commutator group in Chapter
5.
We leave the proof of the following proposition as an exercise:
Proposition 4.6.15. If F is a free group on X relative to ι : X → F , then
F/(F, F ) is the free abelian group on X relative to the map ῑ : X → F/(F, F )
given by ῑ = π ◦ ι, where π : F → F/(F, F ) is the quotient map.
The existence of free groups allows us to formalize the notion of relation,
and to clarify the difference between trivial and non-trivial relations. Moreover, we can define precisely what we mean by complete set of relations. In
what follows, H will denote a group generated by X ⊂ H and F denotes
the free group on X relative to a map ι : X → F . If H is abelian, we will
assume naturally that F is the abelian free group. The map i : X ֒→ H
denote the canonical inclusion (i(x) = x). As we have already observed:
Proposition 4.6.16. There exists a surjective homomorphism φ : F → H.
This is summarized by the following diagram:
ι
/F
X ◆◆
✤
◆◆◆
✤
◆◆◆
φ
◆
i ◆◆◆◆ ✤
&
H
Definition 4.6.17. A non-trivial relation of H is any element r ∈ ker φ
distinct from the identity 1 ∈ F .
Notice that r takes the form ι(xi1 )x1 ι(xi2 )n2 · · · ι(xik )nk , where xj ∈ X.
Moreover, to say that r ∈ ker φ is equivalent to say that
xni11 xni22 · · · xnikk = 1,
and so we will continue to say that the last identity is a “relation”.
238
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Given a set of relations R = {ri }i∈I , we say that a relation r is a consequence of the relations ri ’s, if r belongs to the normal subgroup of F
generated by the ri ’s. Moreover, we say the set R is a complete set of
relations if the kernel of φ is the normal subgroup of F generated by R.
Given a complete set of relations R = {ri }i∈I the group H is completely
determined, up to isomorphism, by the set of generators X and by the set R.
In fact, H is isomorphic to quotient of F by the normal subgroup generated
by R. One calls the pair (X, R) a presentation of the group H.
Two groups with the same presentation are obviously isomorphic. On
the other hand, in general, a group admits many distinct presentations.
Example 4.6.18.
Consider the group H = Z2 ⊕Z3 . It is easy to see that this group is generated by
the element x1 = (1, 1). The free group F generated by X = {x1 } is isomorphic
to Z, where we identify x1 with the element 1. There exists a unique surjective
group homomorphism φ : Z → Z2 ⊕ Z3 . This homomorphism maps 1 to (1, 1)
and the element 6 belongs to its kernel. In fact, 6 generates the kernel, so
R = {6} is a complete set of relations for H, and the pair {{x1 }, R} is a
presentation of H. Using the multiplicative notation we can write:
{{x1 }, x61 = 1}
Now note that x1 = (1, 0) and x2 = (0, 1) is an equally valid choice of generators for Z2 ⊕ Z3 . In this case, we should take instead the free group F = Z ⋆ Z,
and the kernel of φ : Z ⋆ Z → Z2 ⊕ Z3 is the normal group generated by x21 , x32
−1
and x1 x2 x−1
1 x2 . Therefore, we obtain as a different presentation for H:
−1
{{x1 , x2 }, x21 = 1, x32 = 1, x1 x2 x−1
1 x2 = 1}.
Unfortunately, the characterization of a group by presentations does not
solve the classification problem for groups. In the 1950’s, Adyan and Rabin11
have shown, independently, that there is no algorithm to decide if two group
presentations represent or not isomorphic groups.
However, in the case of finitely generated abelian groups one can indeed
use presentations to classify them, as we will now sketch. We will see in
Chapter 6, in the context of the theory of modules, that this procedure can
be generalized to obtain a classification of finitely generated modules over a
principal ideal domain D (the case of abelian groups corresponds to taking
D = Z).
11
S. Adyan, “The unsolvability of certain algorithmic problems in the theory of groups”,
Trudy Moskov. Mat. Obsc. 6 (1957), 231-298, and M. Rabin, “Recursive unsolvability of
group theoretic properties”, Ann. of Math. 67 (1958), 172-174.
4.6. FREE GROUPS, GENERATORS AND RELATIONS
239
Let us assume then that A is a finitely generated abelian group, with a
set of generators X = {x1 , x2 , · · · , xn } ⊆ A. Let Zn = ⊕nk=1 Z be the free
abelian group on X, with ι : X → Zn given by ι(xk ) = ek . The generators
ni of the kernel of φ : Zn → A take the form ni = (ri1 , ri2 , · · · , rin ), where
rik ∈ Z, and correspond to relations of the form
ri1 x1 + ri2 x2 + · · · + rin xn = 0.
We can assemble these generators into a matrix R of dimension m × n with
entries rik ∈ Z. Hence, in practical terms, a presentation of the finitely
generated abelian group A is simply a matrix R ∈ Mm,n (Z).
As we have discussed before, we can replace X by any other set of generators of A, which has the effect of changing the matrix R. We leave as an
exercise to check that:
Proposition 4.6.19. Let A be an abelian group generated by the set X =
{x1 , x2 , · · · , xn }, and let R ∈ Mm,n (Z) be the matrix with the corresponding
list of relations. Then:
(i) If
= (skj ) is an invertible matrix in Mn (Z), then the elements yk =
PS
n
k=1 skj xj , are also generators of A, and
(ii) The matrix RS −1 gives the list of relations for the generators {y1 , . . . , yn }.
The rows l1 , l2 , · · · , lm of the matrix R are the generators of kernel of φ in
It should now be clear that if P = (pkj ) is an invertible matrix in Mm (Z),
then the elements
l1 , l2 , · · · , lm can be replaced by elements l1′ , l2′ , · · · , λ′m ,
P
m
where lk′ = j=1 pkj lj . This amounts to a having a new presentation of A
with matrix P R. In other words, we see that:
Zn .
Proposition 4.6.20. Let A be an abelian group generated by the set X =
{x1 , x2 , · · · , xn }, and let R ∈ Mm,n (Z) be the matrix with the corresponding
list of relations. If P and Q are any invertible matrices, respectively in
Mm (Z) and in Mn (Z), then P RQ is also the matrix of a presentation of A.
In particular, we can choose P and Q to perform the usual elementary
operations of Gauss elimination on the rows and columns of R. Note, however, we can only use invertible operations in Mn (Z), which allows us to:
• Switch rows or columns,
• Add a row (or column) a multiple of another row (or column),
• Multiply a row (or column) by −1.
240
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Examples 4.6.21.
1. Let A be an abelian group generated by X = {x1 , x2 }. Assume that these
generators satisfy the relations 6x1 − 6x2 = 0 and 12x1 + 20x2 = 0. We can
perform the following sequence of operations on the resulting matrix:
6 −6
6 −6
96 0
96 0
−→
−→
−→
.
12 20
30 2
30 2
0 2
We conclude that A is an abelian group which has a set of generators y1 and
y2 , satisfying 2y1 = 0 and 96y2 = 0. Hence, A ≃ Z2 ⊕ Z96 . More precisely,
there exists a surjective homomorphism φ : Z2 → A with kernel N = h2i⊕h96i,
so it follows that
A ≃ Z2 /N ≃ Z/h2i ⊕ Z/h96i = Z2 ⊕ Z96 .
2. The group Z6 ⊕ Z16 has generators x1 = (1, 0) and x2 = (0, 1), and these
satisfy the relations 6x1 = 0 and 16x2 = 0. Again, we can perform the following
operations:
6 0
6 0
6 −12
−→
−→
−→
0 16
6 16
6
4
18 −12
2
4
−→
0
2
−48
4
−→
2 0
0 48
to conclude that:
Z6 ⊕ Z16 ≃ Z2 ⊕ Z48 .
In both these examples, the first aim was to find the greatest common
divisor of all entries of the matrix. In fact, we have:
Proposition 4.6.22. Let R ∈ Mn (Z) and d = gcd(R) be the greatest common divisor of all entries of R. There exist invertible matrices P, Q ∈ Mn (Z)
′ = gcd(R′ ) = d.
such that the matrix R′ = P RQ has R11
The proof of this proposition is not hard, but we will see in Chapter 6 a
much more general result so we omit it.
Note that once we have obtained a matrix R with an entry equal to
gcd(R), then we can use it to eliminate all other entries in the same row
and column. In the examples above, where we had 2 × 2 matrices, the
elimination stops after this step. For matrices of dimension n×n, with n > 2,
we can move the gcd to upper left corner, as in the previous proposition,
and eliminate all entries in the first row and column. One starts all over
4.6. FREE GROUPS, GENERATORS AND RELATIONS
241
with now a matrix R′ of dimension (n − 1) × (n − 1), formed by the elements
rij , with i > 1 and j > 1, where there exist, possibly, non-zero elements.
The gcd of the entries of R′ is a multiple of the gcd of the entries of R, so
this process yields a sequence of integers d′1 |d′2 | · · · |d′n .
Example 4.6.23.
We illustrate the algorithm with a 3 × 3 matrix.







3 0 0
3 0 0
3 0 0
3
 9 6 12  −→  0 6 12  −→  0 6 12  −→  0
12 6 24
0 6 24
0 0 12
0

0 0
6 0 
0 12
After this elimination process, the homomorphism φ : Zn → A is given
by φ(k1 , k2 , · · · , kn ) = k1 y1 + k2 y2 + · · · + kn yn , and has kernel the subgroup
N = hd′1 i ⊕ · · · hd′n i. Hence, we conclude that
A ≃ Zn /N ≃ Zd′1 ⊕ · · · Zd′n .
It is possible that 1 = d′1 = d′2 = · · · = d′k , for some k ≤ n. The correspondent quotients Zd′i are trivial and we can ignore them. We can also have
0 = d′j = d′j+1 = · · · = d′n for some j ≤ n. The corresponding quotients are
Zd′i ≃ Z. We conclude that:
Theorem 4.6.24 (Classification of finitely generated abelian groups). If A
is a finitely generated abelian group, then
A ≃ Zd1 ⊕ · · · ⊕ Zdm ⊕ Zr ,
where 1 < d1 |d2 | · · · |dn .
The integer r is the number of integers d′j which are zero and is called the
characteristic of the abelian group A. The integers d1 , . . . , dn are called
invariant factors or torsion coefficients of the abelian group A.
These integers characterize the abelian group up to isomorphism. For this
reason, we say that they form a a complete set of invariants of an abelian
group.
In general, for any abelian group A, the subset of A formed by all elements of finite order is a subgroup called the torsion subgroup of A,
which we will denote by Tors(A). If Tors(A) is trivial, the group is said to
be torsion free, and if Tors(A) = A, then we say that A is a torsion
group. In any case, the quotient A/ Tors(A) is always a torsion free group.
242
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Example 4.6.25.
In Examples 4.6.21 both groups are torsion. The group A has invariant factors
2 and 96, while the invariant factors of Z6 ⊕ Z16 are 2 and 48.
Exercises.
1. Let G be a group and X ⊆ G. Show that the intersection of all subgroups
(respectively, normal subgroups) of G which contain X is the smallest subgroup
(respectively, normal subgroup) of G that contains X.
2. Show directly from the definition of a free abelian group F on a set X, that
the image ι(X) is a set generating F .
3. Prove Proposition 4.6.11.
4. Verify that if A1 and A2 are free abelian groups generated by finite sets, then
A1 and A2 are isomorphic if and only if they have the same characteristic.
5. Show that a group with two generators a and b and relations
aba−1 b−1 = 1,
an = 1,
bm = 1,
is isomorphic to the group Zn ⊕ Zm .
6. Show that a group with two generators σ and ρ and relations
σ 2 = 1,
ρn = 1,
ρσ = σρ−1 ,
is isomorphic to the group Dn of symmetries of a regular polygon with n sides.
7. Show that a group with two generators a and b and relations
a4 = 1,
a2 b2 = 1,
abab−1 = 1,
is isomorphic to the group H8 .
8. Let G be a group with two generators a and b satisfying the relation a3 b−2 =
1, and let H be a group with two generators x and y satisfying the relation
xyxy −1 x−1 y −1 = 1. Show that these two groups are isomorphic.
9. Give an example of two non-isomorphic abelian groups A1 and A2 such that
Tors(A1 ) ≃ Tors(A2 ), and A1 / Tors(A1 ) ≃ A2 / Tors(A2 ).
10. Find the invariant factors of Z2 ⊕ Z4 ⊕ Z6 and Z16 ⊕ Z3 ? Are these two
groups isomorphic?
4.6. FREE GROUPS, GENERATORS AND RELATIONS
243
11. Consider the groups in Examples 4.6.21. In each case, there exist generators
{y1 , y2 } such that n1 y1 = n2 y2 = 0, where n1 , n2 are the invariant factors of
the group. What is the relation between these generators and the “original”
generators x1 and x2 ? Find the group homomorphisms φ which allow one
to conclude that these two groups are isomorphic to Z2 ⊕ Z96 and Z2 ⊕ Z48 ,
respectively.
12. Proof Proposition 4.6.15.
13. Let {Gi : i ∈ I} be a family
non-trivial groups with #I > 1. Show that
Qof
∗
the non-abelian free product i∈I Gi has elements of infinite order and that
its center is trivial.
14. Let G and H be groups. Show that if some element g 6= 1 in the free product
G ∗ H has finite order, then g is conjugate to an element of G or of H.
15. Give proofs for Propositions 4.6.19 and 4.6.20.
16. In the proof of Theorem 4.6.24 we have assumed that the number of generators of the kernel of φ (denoted m) equals the number of generators of the
group A (denoted n). Does this assumption imply any loss of generality?
244
CHAPTER 4. QUOTIENTS AND ISOMORPHISMS
Chapter 5
Finite Groups
5.1
Groups of Transformations
In this Chapter we study one of the most classic themes of Algebra, the
structure of finite groups1 . We know already many examples of finite groups,
such as the finite cycles groups Zn , the symmetric groups Sn or the dihedral groups Dn . These groups are very distinct, although one can certainly
find some connections between them. For example, Dn contains a normal
subgroup isomorphic to Zn (the subgroup of rotations), and S3 ≃ D3 . Our
aim now is, precisely, to study the finite groups in a more systematic way,
trying to make such connections more evident.
We have already studied some of the properties of the symmetric group
Sn , the group of bijections of the set {1, . . . , n}. This group has a central role in the study of finite groups, since as we will see shortly, Cayley”s
Theorem states that any finite group is isomorphic to a subgroup of Sn .
More generally, we call a group of transformations of a set X any
subgroup of the group SX of bijections of X. Often, it is helpful to represent an abstract group as a group of transformations, since some properties
of the group become more intuitive and geometric. In fact, the notion of
a group of transformations is so natural that historically it preceded the
abstract notion of a group: the great mathematicians of the 19th Century
that many fundamental discoveries in Group Theory, such as Galois and Lie
only worked with transformation groups and did not know of the abstract
1
Although this is a classic theme, one of the greatest achievements of modern days
mathematics was the classification of finite simple groups, a class of groups we will study
later. This classification occupies several hundreds of pages of mathematical literature.
For a brief account see, e.g., the excellent article by R. Solomon, “On finite simple groups
and their classification”, Notices of the American Mathematical Society 42, 231–239 (1995)
245
246
CHAPTER 5. FINITE GROUPS
notion of a group, which would only be formalized later in the beginning of
the 20th Century.
The passage from a an abstract group to a group of transformations
is done through the notion of group action, whose formal definition is as
follows:
Definition 5.1.1. An action of a group G on a set X is a map G×X → X,
written (g, x) 7→ gx, which satisfy the following properties:
(i) ∀x ∈ X, ex = x;
(ii) ∀g1 , g2 ∈ G, ∀x ∈ X, g1 (g2 x) = (g1 · g2 )x.
A slightly different, but equivalent, point of view is the following. Assume that a group G acts on a set X. For each g ∈ G define a transformation
T (g) : X → X by setting T (g)(x) ≡ gx. Then conditions (i) and (ii) in the
definition of an action are equivalent to, respectively,
(i’) T (e) = I (identity transformation);
PSfrag replaements
Z
(ii’) ∀g1 , g2 ∈ G, T (g1 · g2 ) = T (g1 ) ◦ T (g2 ).
h
mi
Conversely, given a transformation T (g) : X → X, for each g ∈ee~ G, satisfying
h1i = Zgx ≡ T (g)(x).
(i’) and (ii’), one obtains an action of G on X by the formula
h2i
Note, also, that each transformation T (g) is bijective.
h4i
h12i
h6i
X
g2 x
T (g2 )
h3i
m; n)i
hmi
hni
T (gh1mm(
)
m; n)i
x
hmd(
g1 (g2 x)
T (g1 g2 )
Figure 5.1.1: Action.
Hence, the map g 7→ T (g) is a group homomorphism from G to the
group SX of bijections of the set X. We will call T the associated homomorphism of action. Therefore, an action of G on X realizes G as a
5.1. GROUPS OF TRANSFORMATIONS
PSfrag replaements
Z
h
247
mi
group of transformations of X. An action is called effective if the associe
ated homomorphism T is injective, i.e., if the kernel of the homomorphism
e~
=Z
T : G → SX is {e}. The kernel of the homomorphism T his1icalled
the kernel
h2i
of the action.
h4i
h12i
When G acts in two distinct sets X1 and X2 , with associated
homomorh6i
phisms T1 and T2 , we call the actions equivalent if there exists
a bijection
h3i
φ : X1 → X2 such that
hmd(m; n)i
mi
ni
hmm(m; n)i
h
φ ◦ T1 (g) = T2 (g) ◦ φ,
(5.1.1)
X1
∀g ∈ G.
h
X2
gx
X
(gx)
T 1 (g )
T2 (g )
g2 x
g1 (g2 x)
T (g 2 )
(xT) (g1 )
T (g1 g2 )
x
Figure 5.1.2: Equivalent actions.
This equation can also be expressed by the commutativity of the diagram:
X1
φ
/ X2
T1 (g)
T2 (g)
X1
φ
/ X2
Examples 5.1.2.
1. Consider the group O(n) consisting of n×n orthogonal matrices, i.e., AAT =
I. There is an obvious action of O(n) on Rn given by the action of a matrix
on a vector: (A, x) 7→ Ax. This is the usual way we view O(n) as a group
of transformations of Rn . We saw in Chapter ?? that for each g ∈ O(n) the
transformation T (g) is an isometry of Rn which fixes the origin. This action
is effective (why?).
2. Similarly, consider the Euclidean group E(n) formed by all pairs (A, b),
where A ∈ O(n) and b ∈ Rn , with binary operation (A1 , b1 ) · (A2 , b2 ) ≡
248
CHAPTER 5. FINITE GROUPS
(A1 A2 , A1 b2 + b1 ). The group E(n) acts on Rn by setting ((A, b), x) 7→
Ax + b. We saw in Chapter ?? that the transformations T (g), where g ∈ E(n),
are isometries of Rn and that any isometry of Rn is realized by a T (g). This
is also an effective action.
3. A group G acts on itself by left translations : G× G → G, (g, x) 7→ g ·x.
This action is effective (why?). We can also define an action of G on itself by
right translations by setting (g, x) 7→ xg −1 . These actions are equivalent:
an equivalence is provided by the inverse map φ : G → G, x 7→ x−1 .
4. Another action of G on itself is the action by conjugation, which is
defined by:
(g, x) 7→ g x ≡ gxg −1 ,
(5.1.2)
g, x ∈ G.
It is easy to check that conditions (i) and (ii) in Definition 5.1.1 are satisfied
and that they can be written as:
e
( x) = g1 ·g2 x.
g1 g2
x = x,
We leave it as an exercise to check that the kernel of this action is the center
C(G) of the group G.
5. Let G be a group and H ⊂ G a subgroup. Then G acts on the set of left
cosets G/H by setting (g, xH) 7→ (g · x)H. An element g ∈ G belongs to the
kernel of this action if and only if:
(g · x)H = xH, ∀x ∈ G
⇐⇒
g ∈ xHx−1 , ∀x ∈ G.
Hence the kernel of this action is
\
xHx−1 .
x∈G
We leave it as an exercise to check that this is the largest normal subgroup of
G which is contained in H. Hence, the action is effective if and only if there
is no non-trivial subgroup of H, normal in G.
We will explore the actions of a group on itself, as in the previous examples, to obtain information about the structure of the group. A simple
application of this idea yields the following:
Theorem 5.1.3 (Cayley). Let G be a finite group of order n. Then G is
isomorphic to a subgroup of Sn .
Proof. Consider the action of G on itself by left translations. Since this
action is effective, the associated homomorphism
T : G → SG ≃ Sn
is injective.
5.1. GROUPS OF TRANSFORMATIONS
249
PSfragofreplaements
When G acts on a set X we obtain a partition
X as follows. One deZ an element
fines an equivalence relation ∼ on X where x ∼ y if there exists
g ∈ G such that x = gy. An equivalence class of ∼ is called a G-orbit.
Hence, we obtain a partition of X into G-orbits, where the G-orbit
containe
ing the element x ∈ X is
e~
hmi
h1i
=
Z
h2i
Ox ≡ {gx : g ∈ G}.
h4i
h12i
The set of G-orbits is denoted by X/G. When X is a finiteh6set
there exists
i
h3i
a finite number of orbits O1 , . . . , On , and each orbit has a finite
number of
hmd(m; n)i
elements. Therefore, counting elements, we obtain a class hequation:
mi
|X| = |O1 | + · · · + |On |.
(5.1.3)
hmm(
hni
m; n)i
X
X2
1 X.
An action is called transitive if it has only one orbit, namely
gx
X
Ox
x
g1 x
gk x
g2 x
(x)
(gx)
T1 (g )
T2 (g )
g1 (g2 x)
T (g 2 )
T (g 1 )
T (g1 g2 )
Figure 5.1.3: A G-orbit of x.
Examples 5.1.4.
1. The orbits of the action of the group O(n) on Rn are the spheres Sr = {|x| =
r : x ∈ Rn } (r > 0) and the origin {0}.
2. The action of G on itself by (right or left) translations has only one orbit,
hence it is transitive.
3. Let H ⊂ G be a subgroup. The H-orbits of the action of H on G by left
(respectively, right) translations are the right (respectively, left) cosets of H.
In Chapter ?? we saw that the orbits of this action all have the same cardinal
and that from this fact, and the class equation (5.1.3), it follows immediately
Lagrange’s Theorem.
250
CHAPTER 5. FINITE GROUPS
4. Consider the action of G on itself by conjugation. The G-orbits are usually
called conjugacy classes. We will denote the conjugacy class containing
the element x ∈ G by G x. Two elements x and y are said to be conjugate if
they belong to the same conjugacy class, i.e., if y = gxg −1 , for some g ∈ G.
An element x belongs to the center C(G) if and only if G x = {x}, so the center
C(G) is the union of all conjugacy classes containing a single element.
Definition 5.1.5. If G acts on X, the isotropy subgroup or stabilizer
group of an element x ∈ X is the subgroup Gx ⊂ G given by
Gx = {g ∈ G : gx = x}.
(5.1.4)
When two elements x, y ∈ X belong to the same orbit, the corresponding
isotropy subgroups are conjugate. In fact, if y = gx for some g ∈ G, then
h ∈ Gx
⇐⇒
⇐⇒
⇐⇒
hx = x,
hg −1 y = g−1 y,
ghg −1 y = y
⇐⇒
ghg −1 ∈ Gy .
Hence, the isotropy subgroups of x and of y = gx are related by
Gy = gGx g−1 .
Moreover, the orbits are determined by the isotropy subgroups, as shown by
the following proposition:
Proposition 5.1.6. Assume a group G acts on a set X. For any x ∈ X, the
map φ : G/Gx → Ox , hGx 7→ hx, gives an equivalence between the G-action
on G/Gx and the G-action on the orbit through x.
We leave the proof for the exercises. In particular, for an action on a
finite set we conclude that:
Corollary 5.1.7. If G acts on a finite set X with orbits Ox1 , . . . , Oxn then
(5.1.5)
|X| =
n
X
i=1
n
X
[G : Gxi ].
|G/Gxi | =
i=1
The actions that we have described are sometimes called left actions,
because they satisfy (ii) in Definition (5.1.1). One can also consider right
actions where one writes the action as X × G → X, (x, g) 7→ xg, and where
(ii) is replaced by
∀g1 , g2 ∈ G, ∀x ∈ X, (xg1 )g2 = x(g1 · g2 ).
5.2. THE SYLOW THEOREMS
251
Unless otherwise mentioned, we will always consider left actions. You
should check how to modify the examples in this section in order to obtain
right actions.
Exercises.
1. Show that the kernel of the action by conjugation of G on itself is the center
C(G).
2. Let H be a subgroup of G. Show that
subgroup of G which is contained in H.
T
−1
x∈G xHx
is the largest normal
3. Give a proof of Proposition 5.1.6.
4. An action of a group G on a set X is called a free action if any e 6= g ∈ G
acts without fixed points, i.e,, if the isotropy subgroups Gx are trivial for all
x ∈ X. Show that a free action is effective and determined which of the actions
in Examples 5.1.2 are free.
5. One says that a group G acts by automorphisms on a group K if there
exists an action of G on K such that, for each g ∈ G, the map k 7→ gk is an
automorphism of K. Assuming that G acts by automorphisms on K, show
that:
(a) the binary operation (g1 , k1 ) ∗ (g2 , k2 ) = (g1 g2 , k1 (g1 k2 )) defines a group
(G× K, ∗), called the semidirect product of G by K, which is denoted
by G ⋉ K;
(b) the maps G → G⋉K : g 7→ (g, e) e K → G⋉K : k 7→ (e, k) are monomorphisms. The image of second monomorphism is a normal subgroup of
G ⋉ K.
6. Consider the action of O(n) on (Rn , +). Show that this action is by automorphisms, and describe the semidirect product O(n) ⋉ Rn .
7. Determine the partition of the symmetric group Sn into conjugacy classes.
(Hint: Consider first the case n = 3.)
5.2
The Sylow Theorems
If G is a finite group, then the order of a subgroup divides the order of the
group. When G is a cyclic group, then for each divisor d of |G| there exists
exactly one subgroup of G of order d. In general, for a finite group G, it is
natural to ask:
252
CHAPTER 5. FINITE GROUPS
• Given a factor d of |G|, is there a subgroup of G of order d?
In this section we will explore the action by conjugation of the group G on
itself, and the corresponding class equation (5.1.3), to obtain some answers
to this question.
Let x ∈ G. The isotropy subgroup of x for the action by conjugation is
precisely:
{g ∈ G : g · x · g −1 = x} = {g ∈ G : g · x = x · g},
i.e., the set of all elements of G which commute with x. This group is usually
called the centralizer of x, and will be denoted by CG (x). Recall that
G x denotes the G-orbit of x, i.e., the conjugacy class which contains x. By
Proposition 5.1.6, G x is isomorphic to G/CG (x) and we conclude that:
|G x| = [G : CG (x)].
Moreover, it follows that:
Proposition 5.2.1. For any finite group G one has:
(5.2.1)
|G| = |C(G)| +
n
X
[G : CG (xi )],
i=1
where x1 , . . . , xn are representatives of each of the conjugacy classes of G
with more than one element.
Proof. As we have noted before, the center C(G) is the union of all the
conjugacy classes of G with exactly one element. Equation (5.2.1) now
follows from the class equation (5.1.5).
As we shall see now, formula (5.2.1) is specially useful to exclude the
existence of subgroups of certain orders. For example, as a first elementary
application we show the existence of a family of groups whose centers are
non-trivial.
Proposition 5.2.2. If |G| = pm , where p is a prime, then the center of G
has order pk , where k ≥ 1.
Proof. By Lagrange’s Theorem the order of the center of C(G) divides the
order of G. Hence, from the class equation (5.2.1), one obtains
pm = pk +
n
X
i=1
[G : CG (xi )].
5.2. THE SYLOW THEOREMS
253
Since xi does not belong to the center, CG (xi ) 6= G and we conclude that
[G : CG (xi )] = pmi , where mi ≥ 1. Hence:
k
m
p =p −
n
X
i=1
pmi ,
m, mi ≥ 1
and we must have k ≥ 1.
Corollary 5.2.3. Every group G with |G| = p2 is abelian.
Proof. The previous proposition implies that |C(G)| = p or p2 . We leave it
as an exercise to check that the case |C(G)| = p is not possible..
The next result gives a partial answer to the question raised at the
beginning of this section in the case of abelian groups.
Theorem 5.2.4 (Cauchy). If G is a finite abelian group group and p is a
prime factor of |G|, then G has an element g of order p.
Proof. We use induction on the order |G| of G. If |G| = p then the result
is obvious. Assume that |G| > p and assume the result holds for all abelian
groups of order less than |G. Choose an element e 6= a ∈ G. Two cases are
possible:
(i) a is an element of order divisible by p. In this case, the cyclic group
hai has an element g of order p, so the theorem holds.
(ii) p does not divide the order of a. In this case, the group G/hai has
order divisible by p. By the induction hypothesis, this group has an
element bhai of order p. The order s of b is divisible by p, since
hai = bs hai = (bhai)s . Hence, the cyclic subgroup hbi contains an
element g of order p.
Cauchy’s Theorem holds for finite abelian groups. We will see later that
these groups can by classified. This classification will clarify completely what
are the possible subgroups of a finite abelian group. We now turn to nonabelian groups for which we have the following generalization of Cauchy’s
Theorem2 .
2
The results that follow are due to the Norwegian mathematician Ludvig Sylow (18321918) and appeared in the paper “Théorèmes sur les groups de substitutions”, Math.
Ann., 5 (1872).
254
CHAPTER 5. FINITE GROUPS
Theorem 5.2.5 (Sylow I). Let G be a finite group. If a prime power pk is
a factor of the order of |G|, then there exists s subgroup H of G of order pk .
Proof. We shall use again induce on the order |G|. Again, starting from the
class equation (5.2.1):
n
X
[G : CG (xi )].
|G| = |C(G)| +
i=1
we observe that:
(i) If p ∤ |C(G)|, then for some i ∈ {1, . . . , n} we have that p ∤ [G : CG (xi )].
It follows that CG (xi ) is a subgroup of G whose order is less than
|G| and divisible by pk . By the induction hypothesis, there exists a
subgroup H of CG (xi ) of order pk .
(ii) If p | |C(G)|, by Cauchy’s Theorem there exists an element g ∈ C(G)
of order p. If k = 1 we are done. If not, the subgroup hgi is normal
in G and the group G/hgi has order smaller than |G| and his divisible
by pk−1 . By induction, G/hgi contains a subgroup of order pk−1 . This
subgroup takes the form H/hgi, where H is a subgroup of G, and we
have:
|H| = [H : hgi] |hgi| = pk−1 p = pk .
The previous theorem motivates the following definitions:
Definition 5.2.6. A group of order pk is called a p-group (of exponent
k). A p-subgroup H ⊂ G where the exponent k is maximal is called a
Sylow p-subgroup.
The Sylow p-subgroups of a group G are, in same sense, the analogues
of the subgroups of a cyclic group, as shown by the following theorem:
Theorem 5.2.7 (Sylow II). Let G be a finite group and assume p | |G|.
Then:
(i) The Sylow p-subgroups of G are unique up to conjugation.
(ii) The number of Sylow p-subgroups of G is a divisor of the index of any
Sylow p-subgroup and equals ≡ 1 (mod p).
(iii) Any subgroup of G of order pk is a subgroup of some Sylow p-subgroup
of G.
5.2. THE SYLOW THEOREMS
255
The proof of the first Sylow’s Theorem was based on the action of G on
itself by conjugation. For the proof of the second Sylow’s Theorem we will
use the action of G by conjugation of the set of its subgroups: if H ⊂ G is a
subgroup, then gHg−1 , g ∈ G, is a subgroup of G and |gHg−1 | = |H|. For
this action, the isotropy subgroup of H ⊂ G is precisely:
NG (H) ≡ {g ∈ G : gHg−1 = H}.
This subgroup is usually called the normalizer of H in G. Note that H
is normal in NG (H) and, in fact, we leave it as an exercise to check that
NG (H) is the largest subgroup of G containing H as a normal subgroup.
We can also restrict this action of G to an action on the set Π of Sylow
p-subgroups of G.
Lemma 5.2.8. Let P be a Sylow p-subgroup of G, and H ⊂ NG (P ) a
subgroup of order pk . Then H ⊂ P .
Proof of Lema 5.2.8. Since P is a normal subgroup of NG (P ), we have a
homomorphism π : NG (P ) → NG (P )/P . Since H is a subgroup of NG (P ),
it follows that π(H) is a subgroup of NG (P )/P of order a power of p. Since
P is a Sylow p-subgroup of G, is it also a Sylow p-subgroup of NG (P ) and
we conclude that p 6 ||NG (P )/P |. But then we must have π(H) = {e}, in
other words H ⊂ P .
Proof of Sylow II. Consider the action by conjugation of G on the set Π of
Sylow p-subgroups of G. Denote by OP the orbit of a Sylow p-subgroup P .
Then:
(a) |OP | ≡ 1 (mod p) : Consider the action of the group P on OP , obtained by restriction of the action of G (note that the G-action on OP
is transitive by definition, but not the P -action on OP ). The P -orbits
which do not contain P have more than one element, since if {P̃ } is a
P -orbit distinct from P , then P̃ is a Sylow p-subgroup distinct from P
and P ⊂ NG (P̃ ), contradicting Lemma 5.2.8. On the other hand,
P all
the P -orbits have cardinality a power of p. Hence, |OP | = 1 + i pki .
(b) OP = Π : Assume that P̃ ∈ Π − OP . The same argument as above
applied to the action of P̃ on OP , shows that |OP | ≡ 0 (mod p), which
contradicts (a).
Part (i) of Sylow II is equivalent to (b).
On the other hand, to show that (ii) holds, we note that (b) implies that
|Π| = |G/NG (P )| = [G : NG (P )].
256
CHAPTER 5. FINITE GROUPS
Since P ⊂ NG (P ) ⊂ G, we conclude that:
[G : P ] = [G : NG (P )][NG (P ) : P ],
so the number of p-subgroups is a divisor of [G : P ]. By (a) and (b), we also
have |Π| ≡ 1 (mod p).
Finally, to show that (iii) holds, note that if H ⊂ G is a subgroup of order
k
p , then the orbits of the action by conjugation of H on Π have cardinality a
power of p. But |Π| ≡ 1 (mod p), so at least one of the orbits is of the form
{P̃ }. This means that H ⊂ NG (P̃ ). By Lemma 5.2.8, H ⊂ P̃ , as claimed
in (iii).
Examples 5.2.9.
1. Consider the symmetric group S3 = {I, α, β, γ, δ, ε}. Since |S3 | = 6, by Sylow
I there exist Sylow p-subgroups of orders 2 and 3.
By Sylow II, the number of Sylow 3-subgroups must be 1 (mod 3) and a divisor
of 2. Hence, there exists 1 subgroup of order 3. Obviously, this subgroup of S3
is
P = A3 = {I, δ, ε}.
Similarly, the number of Sylow 2-subgroups must be 1 (mod 2) and a divisor
of 3. Hence, we can have either 1 or 3 subgroups of order 2. Obviously, there
are 3 subgroups of order 2:
P1 = {I, α},
P2 = {I, β},
P3 = {I, γ}.
It is easy to check that these subgroups are all conjugate:
P1 = δP2 δ −1 = εP3 ε−1 .
2. The subgroup A4 of S4 formed by the even permutations has order 12. By
Sylow I, A4 has Sylow p-subgroups of orders 3 and 4.
By Sylow II, the number of Sylow 3-subgroups can be 1 or 4. It is easy to find
the following 4 Sylow 3-subgroups:
P1 = {I, (123), (321)}
P3 = {I, (134), (431)}
P2 = {I, (124), (421)}
P4 = {I, (234), (432)}.
We leave it as an exercise to check that all these subgroups are conjugate and
to determine the Sylow 2-subgroups (which have order 4).
3. Let G be a group of order pq, where p < q are prime numbers. Assume that
p 6 | (q − 1). Then we claim that G ≃ Zpq .
5.2. THE SYLOW THEOREMS
257
To see this, observe that the number of Sylow p-subgroups of G is 1 + kp and
divides pq, so it is must be 1. On the other hand, the number of q-Sylow
subgroups is 1 + lq and divides pq, so it must be 1. It follows that G has
two normal subgroups H1 , H2 ⊂ G of order p and q, respectively. Obviously,
H1 ∩ H2 = {e} and G = H1 H2 , so we conclude that G ≃ H1 ⊕ H2 ≃ Zp ⊕ Zq ≃
Zpq . For example, any group of order 15 is isomorphic to Z15 .
One can also show that when p < q are prime numbers and p | (q − 1) there
are exactly two groups of order pq: the abelian group Zpq and a non-abelian
group with presentation:
{a, b, {ap = 1, bq = 1, ab = bs a}},
where s ∈ N satisfies s 6≡ 1 (mod q) and sp ≡ 1 (mod q) (see exercises).
Using the techniques developed so far, it is possible to classify all groups
of order less or equal to 15, up to isomorphism. The list is as follows:
Order
1
2
3
4
5
6
7
8
9
10
Groups
{e}
Z2
Z3
Z4 , Z2 ⊕ Z2
Z5
Z6 , D3
Z7
Z8 , Z4 ⊕ Z2 , Z2 ⊕ Z2 ⊕ Z2 , H8 , D4
Z9 , Z3 ⊕ Z3
Z10 , D5
Order
11
12
13
14
15
Groups
Z11
Z12 , Z2 ⊕ Z6 , A4 , D6 , E
Z13
Z14 , D7
Z15
In this list, the group E is a group of order 12 with presentation:
{{a, b}, a6 = 1, a3 b2 = 1, abab−1 = 1}.
By contrast, there are 14 distinct groups of order 16, 51 distinct groups of
order 32, etc. Although a finite group of order n is a subgroup of Sn , there
is no known formula for the number of distinct groups of order n.
258
CHAPTER 5. FINITE GROUPS
Exercises.
1. Show that if G is a finite group and |G| = p2 , then G is isomorphic to Zp2
or to Zp ⊕ Zp .
2. Show that, if G is a finite abelian where all elements, excepting the identity
e, have order p, then |G| = pn and G ≃ Zp ⊕ · · · ⊕ Zp .
3. Classify the finite groups of order ≤ 7.
4. Given an example of a group G where the converse to Lagrange’s Theorem
fails, i.e., such that |G| has a factor d but G has no subgroup of order d.
Hint: Consider G = A4 .
5. Show that the normalizer NG (H) of a subgroup H ⊂ G is the largest subgroup of G containing H as a normal subgroup.
6. Determine the Sylow p-subgroups of the alternating group A4 and their conjugacy relations.
7. Determine all the p-subgroups of the group of quaternions H8 .
8. Determine the Sylow p-subgroups of the dihedral group Dp when p is a prime.
9. Let φ : G1 → G2 be a surjective group homomorphism between finite groups.
Show that is P ⊂ G1 is a Sylow p-subgroup, then φ(P ) is a Sylow p-subgroup
of G2 .
10. Show that if P ⊂ G is a Sylow subgroup then NG (NG (P )) = NG (P ).
11. Let p and q be distinct prime numbers such that p|(q − 1) and s a natural
number such that s 6≡ 1 (mod q) and sp ≡ 1 (mod q) (one can show that such
an s always exists). Show that:
(a) The map α : Zq → Zq , k 7→ sk, is an automorphism of Zq ;
(b) The map θ : Zp × Zq → Zq , (i, k) → αi (k), is an action of Zp on Zq by
automorphisms;
(c) The semi-direct product Zp ⋉ Zq is a group with two generators a and b
satisfying the relations: ap = 1, bq = 1, ab = bs a.
5.3. NILPOTENT AND SOLVABLE GROUPS
5.3
259
Nilpotent and Solvable Groups
The p-groups, as we saw in the last section, play a crucial role in the study of
the structure of finite groups. A p-group is an example of a nilpotent group.
In this section we will study this class of groups, as well as the largest class
of solvable groups. These two classes of groups appear naturally when one
looks at the failure of elements of a group in commuting.
Let G be any group. The commutator of two elements g1 , g2 ∈ G is the
element g1 −1 g2 −1 g1 g2 ∈ G. We denote this element by (g1 , g2 ) 3 , so that:
g1 g2 = g2 g1 (g1 , g2 ).
The element (g1 , g2 ) measures the failure of g1 and g2 in commuting. The
next result gives some elementary properties of commutators whose verification is a simple exercise.
Proposition 5.3.1 (Properties of commutators). Let g1 , g2 , g3 ∈ G be any
elements in a group. Then:
(i) (g1 , g2 )−1 = (g2 , g1 );
(ii) (g1 , g2 ) = e if and only if g1 and g2 commute;
(iii) g (g1 , g2 ) = (g g1 , g g2 );
(iv) (g1 g2 , g3 ) · (g2 g3 , g1 ) · (g3 g1 , g2 ) = e;
(v)
g1 ((g −1 , g ), g ) · g3 ((g −1 , g ), g )
1
2
2
3
3
1
·
g2 ((g −1 , g ), g )
3
1
2
= e;
(vi) If φ : G → H is a group homomorphism, then φ((g1 , g2 )) = (φ(g1 ), φ(g2 )).
Let A, B ⊂ G be subgroups. We will denote by (A, B) the subgroup
of G generated by all commutators (a, b), where a ∈ A and b ∈ B. So, by
definition, (A, B) is the smallest subgroup of G which contains all elements
(a, b), with a ∈ A, b ∈ B. Notice that since (A, B) is a group, if (a, b) ∈
(A, B) then (b, a) = (a, b)−1 ∈ (A, B), so that (A, B) = (B, A). Notice
also that there can exist elements in (A, B) which are not commutators. In
general, the elements of (A, B) are of the form
(a1 , b1 )±1 · (a2 , b2 )±1 · · · (as , bs )±1 ,
ai ∈ A, bi ∈ B,
where s ≥ 1.
3
Often the commutator is also denoted by [g1 , g2 ], but we will reserve this notation for
the commutator in the context of the so called Lie algebras.
260
CHAPTER 5. FINITE GROUPS
Definition 5.3.2. The derived group G is the subgroup (G, G) of G. We
denote this group by D(G)4 .
One also calls D(G) = (G, G) the commutator group of G but this
name is a little bit misleading since, as we have already observed, there can
exist elements in D(G) which are not commutators.
Proposition 5.3.3 (Properties of the derived group). Let G, G1 and G2 be
groups.
(i) If φ : G1 → G2 is a group homomorphism, then φ(D(G1 )) ⊂ D(G2 ),
and if φ is surjective, then φ(D(G1 )) = D(G2 ).
(ii) D(G) is a normal subgroup of G.
(iii) G/D(G) is an abelian group and for every homomorphism φ : G → A
into an abelian abelian group A there exists a unique homomorphism
φ̃ : G/D(G) → A making the following diagram commute:
G
π
φ
/6 A
♥ ♥
♥
♥ ♥
♥
φ̃
♥
G/D(G)
Proof. (i) Obvious from property (vi) of the commutators.
(ii) For each g ∈ G, the map h 7→ ghg −1 is a group automorphism of G.
Hence, by (i), it follows that gD(G)g−1 ⊂ D(G), so D(G) is normal in G.
(iii) Since g · h = h · g · (g, h) it is obvious that G/D(G) is an abelian
group. If φ : G → A is a group homomorphism from G into an abelian
group A, we have
ḡ = g · (a, b)
=⇒
φ(ḡ) = φ(g · (a, b))
= φ(g) · (φ(a), φ(b))
= φ(g).
Hence, we can define φ̃ : G/D(G) → A by setting φ̃(gD(G)) ≡ φ(g). It
is very easy to check that φ̃ is a group homomorphism. By construction,
φ = φ̃ ◦ π, where π : G → G/D(G) is the quotient map.
Note that properties (ii) and (iii) actually characterize the derived group.
4
Some authors also denote the derived group of G by G′ .
5.3. NILPOTENT AND SOLVABLE GROUPS
261
Examples 5.3.4.
1. Obviously, a group G is abelian if and only it its derived group is D(G) = {e}.
2. For the group H8 = {1, i, j, k, −1, −i, −j, −k} the derived group is D(H8 ) =
{1, −1} ≃ Z2 , since the commutators of elements in H8 are either 1 or −1.
This group is a normal subgroup of H8 and the quocient H8 /D(H8 ) is isomorphic to Z2 ⊕ Z2 (exercise).
3. For the symmetric group S3 = {I, α, β, γ, δ, ε} the derived group is the alternating group A3 = {I, δ, ε}. In fact, the commutator of any two permutations
is necessarily an even permutation and one checks easily that, for example,
δ = (α, γ) and ε = (γ, α), so all even permutations are commutators. The
group A3 is normal in S3 and S3 /A3 ≃ Z2 .
For a group G one defines its lower central series {C k (G)}k≥0 by:
C 0 (G) ≡ G,
C k+1 (G) ≡ (G, C k (G))
(k ≥ 0).
One obtains a chain of groups:
G = C 0 (G) ⊃ C 1 (G) ⊃ · · · ⊃ C i (G) ⊃ · · ·
When there exists an n such that C n (G) = C n+k (G) for all k ≥ 0, we say that
the series stabilizes. Notice that this will be the case if C n (G) = C n+1 (G).
For an infinite group, the series may never stabilize.
Definition 5.3.5. A group G is called nilpotent if theres exists some n
such that
C n (G) = {e}.
The smallest such integer n is called the nilpotency class of G.
Examples 5.3.6.
1. A group is nilpotent of class ≤ 1 if and only it is abelian.
2. The group H8 is nilpotent of class 2, since we have C 0 (H8 ) = H8 , C 1 (H8 ) =
(H8 , H8 ) = Z2 , C 2 (H8 ) = (H8 , Z2 ) = {e}.
3. The dihedral group S3 is not nilpotent: one finds that C 0 (S3 ) = S3 , C 1 (S3 ) =
A3 and C 2 (A3 ) = A3 , so the lower central series stabilizes in A3 at k = 2.
4. The subgroup of GL(n, R) formed by the upper triangular matrices with 1s
in the main diagonal is nilpotent of class n − 1 (exercise).
5. Any subgroup and any quotient of a nilpotent group is nilpotent (exercise).
262
CHAPTER 5. FINITE GROUPS
In general, to try to check directly from the definition that a group is
nilpotent, involves a considerably amount of computation. It is convenient
to have alternative characterizations of a nilpotent group. For that, we will
call a tower of subgroups of G a sequence
G = G0 ⊃ G1 ⊃ · · · ⊃ Gm .
Also, by a normal tower we mean a tower where Gk+1 is normal in Gk ,
for all k. In this case we will often write:
G = G0 ⊲ G1 ⊲ . . . ⊲ Gm .
Finally, by an abelian tower we mean a normal tower where Gk /Gk+1 is
abelian, for all k.
Proposition 5.3.7. The following statements are equivalent:
(i) G is nilpotent of class ≤ n.
(ii) There exists a tower of subgroups G = G0 ⊃ G1 ⊃ · · · ⊃ Gn = {e}
where Gk+1 ⊃ (G, Gk ).
(iii) There exists a subgroup A of the center C(G) such that G/A is nilpotent of class ≤ n − 1.
Proof. (i) ⇔ (ii): If G is nilpotent of class ≤ n, then the tower Gk ≡
C k (G) satisfies (ii). Conversely, given a tower as in (ii), one shows easily by
induction that C k (G) ⊂ Gk : in fact, C 0 (G) = G = G0 and
C k (G) ⊂ Gk ⇒ C k+1 (G) = (G, C k (G)) ⊂ (G, Gk ) ⊂ Gk+1 .
Therefore, C n (G) ⊂ Gn = {e}, so G is nilpotent of class ≤ n.
(i) ⇔ (iii): If G is nilpotent of class ≤ n, we have (G, C n−1 (G)) =
n
C (G) = {e}, hence A = C n−1 (G) is a central subgroup. We leave as an
exercise to check that G/A is nilpotent of class ≤ n − 1. Conversely, let A be
a subgroup of C(G) such that G/A is nilpotent of class ≤ n−1. The quotient
map π : G → G/A is surjective, so π(G, G) = (G/A, G/A). Iterating, we
conclude that π(C n−1 (G)) = C n−1 (G/A) = {e}, so that C n−1 (G) ⊂ A. Since
A is central, it follows that:
C n (G) = (G, C n−1 (G)) ⊂ (G, A) = {e}.
This shows that G is nilpotent of class ≤ n.
5.3. NILPOTENT AND SOLVABLE GROUPS
263
Another series associated with a group is its derived series {D k (G)}k∈N ,
which is defined inductively by
D 0 (G) ≡ G,
D k+1 (G) ≡ D(D k (G)),
(k ≥ 0).
Note that D 0 (G) = C 0 (G) = G and D 1 (G) = C 1 (G) = (G, G). We leave it
to the exercises to check that:
(5.3.1)
k−1
D k (G) ⊂ C 2
(G),
(k ≥ 0).
Similarly, to what we did for the central series, we now define:
Definition 5.3.8. A group G is called solvable if for some n,
D n (G) = {e}.
The smallest such integer n is called the derived length of G.
Examples 5.3.9.
1. A group is solvable of length ≤ 1 if and only if it is abelian.
2. Every nilpotent group of class ≤ 2n−1 is solvable of length ≤ n.
3. The group S3 is solvable of class 2: its derived series has terms: D0 (S3 ) = S3 ,
D1 (S3 ) = A3 and D2 (S3 ) = {e}.
4. The subgroup of GL(n, R) formed by all invertible n × n triangular matrices
is solvable (what is its derived length?).
5. Any subgroup and any quotient of a solvable group is solvable.
We also have alternative characterizations of solvable groups. The proof
of the following proposition is left for the exercises.
Proposition 5.3.10. The following statements are equivalent:
(i) G is solvable of length ≤ n.
(ii) There exists a tower G = G0 ⊃ G1 ⊃ · · · ⊃ Gn = {e} where Gk is
normal em G and Gk /Gk+1 is abelian, for all k ≥ 0.
(iii) There exists an abelian tower G = G1 ⊲ G2 ⊲ . . . ⊲ Gn = {e}.
(iv) There exists an abelian normal subgroup A ⊂ G such that G/A is
solvable of length ≤ n − 1.
264
CHAPTER 5. FINITE GROUPS
Example 5.3.11.
We saw above that the group S3 is solvable (but not nilpotent). The group S4
is also solvable, for it admits the following abelian tower of subgroups:
S4 ⊲ A4 ⊲ H ⊲ {e},
where H = {I, (12)(34), (13)(24), (14)(23)}.
Exercises.
1. If π : G1 → G2 is a group homomorphism, show that π(C k (G1 )) = C k (π(G1 ))
and π(Dk (G1 )) = Dk (π(G1 )).
2. Is the dihedral group Dn nilpotent? Solvable? (you answer may depend on
n).
3. Show that the subgroup of GL(n, R) formed by the upper triangular matrices
with 1s in the main diagonal is nilpotent of class n − 1.
4. Show that the subgroup of GL(n, R) formed by the invertible upper triangular matrices is solvable. What is its derived length?
5. Let G be a group. Show that:
(a) If H1 , H2 , H3 ⊂ G are normal subgroups, then
(H1 , (H2 , H3 )) ⊂ (H3 , (H2 , H1 )) · (H2 , (H1 , H3 ));
(Hint: Use property (v) of the commutators.)
(b) for all m, n ∈ N
(c) for all n ∈ N
(C m (G), C n (G)) ⊂ C m+n (G);
n−1
Dn (G) ⊂ C 2
(G).
6. Show that any subgroup and any quotient of a nilpotent (respectively, solvable) group is a nilpotent (respectively, solvable) group. What about direct
products?
7. Show that if G is nilpotent (respectively, solvable) of class (respectively,
length) ≤ n, then G/C n−1 (G) (respectively, G/Dn−1 (G)) is nilpotent (respectively, solvable) of class (respectively, length) ≤ n − 1.
8. Give a proof of Proposition 5.3.10.
5.4. SIMPLE GROUPS
265
9. The upper central series of a group G is the tower {Ck (G)}k∈N defined
inductively as follows: C1 (G) = C(G) and Ck (G) is the largest normal subgroup
of G such that Ck (G)/Ck−1 (G) is the center of G/Ck−1 (G). Show that:
(i) Ck (G) = {g ∈ G : (g, h) ∈ Ck−1 (G), ∀h ∈ G};
(ii) A group is nilpotent if and only if G = Cn (G), for some n.
10. Show that a p-group is nilpotent.
11. Show that a finite group is nilpotent if and only if it is a direct product of
p-subgroups.
(Hint: If G is a nilpotent group, show that:
(a) If H ( G is a subgroup, then H ( NG (H);
(b) Every Sylow subgroup P ⊂ G is normal;
(c) G is the direct product of its Sylow subgroups.)
5.4
Simple Groups
Contrary to what Example 5.3.11 may suggest, the symmetric groups Sn ,
for n ≥ 5, are not solvable. They are related to a different class of groups,
which in some sense is on the other extreme relative to the class of solvable
groups. Notice that if one looks for an abelian tower of subgroups in some
group G, then one may end up with a non-abelian subgroup which has no
normal subgroups, besides the trivial ones. This motivates partially the
following definition:
Definition 5.4.1. A group G is called simple if it has no other normal
subgroups besides the trivial subgroups {e} and G.
In other words, the simple groups are the the groups for which there
exists only one congruence equivalence relation.
Examples 5.4.2.
1. A subgroup of an abelian group is always a normal subgroup. Hence, an
abelian group G is simple if it is cyclic of prime order p, i.e., Zp .
2. If a solvable group G is simple, then its derived group must be D(G) must be
trivial, so G is abelian. Hence, the only simple solvable groups are the groups
Zp with p prime.
3. If G is a simple non-abelian group then G = D(G).
266
CHAPTER 5. FINITE GROUPS
The most elementary, non-abelian, finite simple groups are the alternating groups An , with n ≥ 5. Galois found that the fact that An is simple for
n ≥ 5 is the reason why there is no general algebraic formula for the roots
of algebraic equations of order greater or equal to 5. We will study these
issues in Chapter 7.
Theorem 5.4.3. The alternating groups An are simple if n ≥ 5.
Proof. We will show that if {I} 6= N ⊂ An is a normal subgroup then
N = An . We divide the proof into 3 steps:
(i) The group An is generated by 3-cycles: Any representation of π ∈ An
as product of transposition has an even number of terms. On the other
hand, any product of 2 distinct transpositions can be written as a product
of 3-cycles (for example, (12)(23) = (123) and (12)(34) = (123)(234)).
(ii) If N contains a 3-cycle, then N = An : Assume, for example, that
(123) ∈ N . Then conjugation by the element
1 2 3 4 5 ···
δ=
i j k l m ···
gives δ(123)δ−1 = (ijk). This shows that the claim holds as long as δ ∈ An .
This can always be achieved, eventually by replacing δ by δ → (lm)δ.
(iii) N contains a 3-cycle: We choose an element α 6= I in N which has
the property that moves the smallest number of integers:
(PM) The number of integers i such that α(i) = i is maximal
among all elements of N .
We show that α must be a 3-cycle by contradiction. Suppose that α is not
a 3-cycle. Since we cannot have α = (123l) (this permutation is odd), the
expression of α as a product of disjoint cycles takes either of the following
forms:
α = (123 . . . ) . . . (. . . ), ou α = (12)(34 . . . ) . . .
where in the first case α permutes at least two more elements (for example,
4 and 5). One checks easily that if we let β = (345), then γ = (α, β) belongs
to N and has the following properties:
(a) if i > 5 and α(i) = i, then γ(i) = i;
(b) γ(1) = 1;
(c) in the second case, γ(2) = 2;
5.4. SIMPLE GROUPS
267
This shows that γ is an element contradicting property (MP) of α. Hence,
α must be a 3-cycle, as claimed.
Corollary 5.4.4. Sn is not solvable if n ≥ 5.
Proof. Assume that Sn is solvable. The theorem would imply that An ⊂ Sn
is a solvable simple group, i.e., would be abelian, a contradiction.
The simple groups are, in some sense, indecomposable groups and that
is the reason behind their name. One can decompose any finite group into
“simple components”. In fact, if G is a finite group then it has a normal
tower
G = A0 ⊲ A1 ⊲ . . . ⊲ Am = {e},
where Ak /Ak+1 is a simple group, for all k: one chooses for A1 any normal
subgroup of A0 = G which is not contain in any normal subgroup of A0 ,
chooses for A2 any normal subgroup of A1 which is not contain in any normal
subgroup of A1 , and so on.
Definition 5.4.5. A composition series of a group G is a normal tower:
G = A0 ⊲ A1 ⊲ . . . ⊲ Am = {e},
where Ak /Ak+1 is a simple group.
Hence a finite group G always admits a composition series. We will
see below that the Jordan-Hölder Theorem states that composition series
are “essentially unique”. In order to state it precisely, consider two normal
towers of a group G:
G = A0 ⊲ A1 ⊲ . . . ⊲ As ,
G = B0 ⊲ B1 ⊲ . . . ⊲ Br.
One says that:
s
r
• the tower {Ai }i=0 is a refinement of the tower {B j }j=0 if r < s and
for each i there exists a ji such that B ji −1 ⊇ Ai ⊇ B ji .
s
r
• The towers {Ai }i=0 and {B j }j=0 are called equivalent if r = s and
there exists a permutation of the indices i 7→ i′ such that
′
′
Ai /Ai+1 ≃ B i /B i +1 .
268
CHAPTER 5. FINITE GROUPS
Theorem 5.4.6 (Schreier). Let G be a group. Any two normal towers of
subgroups of G which end with the trivial subgroup {e} admit equivalent
refinements.
s
r
Proof. Let {Ai }i=0 and {B j }j=0 be two normal towers ending in the trivial
group. Define:
Āij ≡ Ai+1 (B j ∩ Ai ).
Since Āir = Ai+1 , Āi0 = Ai and Āij+1 is a normal subgroup of Āij , it follows
that {Āij } is a refinement of {Ai }. Simlarly, defining
B̄ ji ≡ B j+1 (Ai ∩ B j ),
one obtains a refinement of {B j }. The proof is completed by invoking the
following lemma whose proof is left as an exercise:
Lemma 5.4.7 (Zassenhaus). If G ⊃ S ⊲ S ′ and G ⊃ T ⊲ T ′ , then S ′ (S ∩
T ′ ) is normal in S ′ (S ∩ T ), T ′ (S ′ ∩ T ) is normal in T ′ (S ∩ T ) and there is
an isomorphism:
T ′ (S ∩ T )
S ′ (S ∩ T )
≃
.
S ′ (S ∩ T ′ )
T ′ (S ′ ∩ T )
T′
If in Zassenhaus’s Lemma we let S = Ai , S ′ = Ai+1 , T = B j and
= B j+1 , it follows that:
Āij
Ai+1 (Ai ∩ B j )
B̄ ji
B j+1 (Ai ∩ B j )
=
.
≃
=
Ai+1 (Ai ∩ B j+1 )
B j+1 (Ai+1 ∩ B j )
Āij+1
B̄ ji+1
Hence, {Āij } and {B̄ ji } are equivalent refinements.
As a corollary of Schreier’s Theorem, we obtain:
Theorem 5.4.8 (Jordan-Hölder). Two composition series of a finite group
G are equivalent.
Proof. Notice that a composition series is precisely a normal tower of subgroups ending with the trivial group, and which do not admit any refinement.
Therefore, by Schreier’s Theorem, any two such towers must be equivalent.
The Jordan-Hölder Theorem shows that the composition series is an
invariant of a finite group, i.e., any two isomorphic finite groups have equivalent composition series. Hence, they can be used to decide if two groups
5.4. SIMPLE GROUPS
269
are isomorphic. For example, two groups which have composition series of
different lengths cannot be isomorphic. The length of a composition series
is a numeric invariant of a finite group: any two isomorphic finite groups
have the same length.
Examples 5.4.9.
1. The cyclic group G = hai of order pm has composition series of length m:
m−1
G = hai ⊲ hap i ⊲ . . . hap
m
i ⊲ hap i = {e}.
2. The group H8 = {1, i, j, k, −1, −i, −j, −k} has the following composition
series of length 3:
H8 ⊲ {1, i, −1, −i} ⊲ {1, −1} ⊲ {1},
H8 ⊲ {1, j, −1, −j} ⊲ {1, −1} ⊲ {1},
H8 ⊲ {1, k, −1, −k} ⊲ {1, −1} ⊲ {1}.
These composition series are all isomorphic: the quotient groups Gi /Gi+1 are
all isomorphic to Z2 .
3. By definition, a group is simple if has a unique trivial composition series:
G ⊲ {e}.
4. A finite group G is solvable if and only if for any of its composition series
G = G0 ⊲ G1 ⊲ . . . ⊲ Gm = {e}
the factors Gk /Gk+1 are all cyclic groups of prime order, i.e., isomorphic to
some Zpk , with pk prime. In fact, if G is solvable, all subgroups Gk are solvable
and also its quotients Gk /Gk+1 are solvable (cf. Exercise 6 in Section 5.3). But
any solvable simple group is abelian, hence isomorphic to some Zpk .
The results in this section suggest a program to classify all finite groups:
one should first classify all finite simple groups and then classify all the
possible composition towers made of this groups. The classification of all
finite simple groups is one of the greatest achievements of modern days
Mathematics. The classification theorem can be roughly stated as follows:
Theorem 5.4.10 (Classification of finite simple groups). Up to isomorphism there are 18 infinite families of simple groups and 26 sporadic simple
groups.
270
CHAPTER 5. FINITE GROUPS
Examples of infinite families are the groups Zp , with p prime, and the
groups An , with n ≥ 5. Another example is the projective linear group
PL(n, K) of some finite field K: it is obtained by taking the quotient of
the group SL(n, K) of n × n matrices of determinant 1 by its center. The
sporadic simple groups are more mysterious groups. For example, the largest
sporadic simple group has order:
246 · 320 · 59 · 76 · 112 · 133 · 17 · 19 · 23 · 29 · 31 · 41 · 47 · 59 · 71 ∼ 8 · 1053
and it is called the monster group.
Exercises.
1. Proof Zassenhaus’s Lemma.
(Hint: Use the Second Isomorphism Theorem to show that each of the quotients that appear in the statement of the lemma are both isomorphic to
S ∩ T /(S ∩ T ′ )(S ′ ∩ T ).)
2. Show that a cyclic p-group has a unique composition series.
3. Show that an abelian group has a composition series if and only if it is finite.
4. Determine the composition series of the following groups:
(a) Z6 × Z5 ;
(b) S4 ;
(c) G where |G| = pq (p and q prime).
5. If a simple group G has a subgroup of index n > 1, show that the order of G
divides n!. Use this to prove that A5 has no subgroup of order 15.
6. Show that a group of order pq 2 , where p and q are primes, is solvable.
7. Classify all groups of order 20.
8. Show that there exists no non-abelian simple group of order smaller than 30.
5.5
Symmetry Groups
We introduced in Chapter 1 the notion of symmetry group of of a subset
Ω ⊂ Rn . We will now apply the results we have obtained before on the
structure of finite groups to the classification of symmetry groups.
5.5. SYMMETRY GROUPS
271
Recall that the group of symmetries of Rn is, by definition, the euclidean
group E(n) consisting of all isometries of Rn . We saw in Exercise 5.1.5, that
this group is isomorphic to the semi-direct product O(n) ⋉ Rn . In fact, by
Theorem 1.8.6, an isometry f ∈ E(n) takes the form
f (x) = Ax + b,
where A ∈ O(n) represents an orthogonal transformation and b ∈ Rn a
translation. Recall also that an orthogonal transformation is called a rotation if det A = 1.
If Ω ⊂ Rn its group of symmetries GΩ ⊂ E(n) consists of all isometries
which map Ω into itself:
GΩ = {f ∈ E(n) : f (Ω) = Ω}.
When Ω is bounded, it is obvious that group GΩ contains only orthogonal
transformations. Moroever, we have:
Proposition 5.5.1. If GΩ is the group of symmetries of a bounded set
Ω ⊂ Rn , then one of the following statements holds:
(i) GΩ contains only rotations;
(ii) The rotations of GΩ form an index 2 subgroup (hence, normal) in GΩ .
Proof. Since Ω is bounded, GΩ ⊂ O(n) and GΩ is formed by rotations iff
GΩ ⊂ SO(n). If GΩ 6⊂ SO(n), then H = GΩ ∩ SO(n) is the subgroup
of rotations in GΩ , and coincides with the kernel of the epimorphism det :
GΩ → {1, −1}. By the First Isomorphism Theorem, it follows that
[GΩ : H] = |GΩ /H| = |{1, −1}| = 2.
In the reminder of this section we will consider only finite symmetry
groups. We will see that the the results we have obtained so far on the
structure of finite groups allows us to completely classify the symmetry
groups of planar (n = 2) and tridimensional (n = 3) bounded figures.
5.5.1
Symmetry groups of plane figures
Before we get to the classification of symmetry groups of plane bounded
figures, let us recall some examples of plane figures whose symmetry groups
we have encountered before.
272
CHAPTER 5. FINITE GROUPS
Examples 5.5.2.
1. In Example 1.8.7.2 we saw that D3 , the group of symmetries of an equilateral
4π
triangle, contains 6 elements: 3 rotations (I, 2π
3 , 3 ) and 3 reflections across
its lines of symmetry.
More generally, the dihedral group Dn , the symmetry group of a regular polygon
with n sides, has 2n elements: n rotations and n reflections across the lines
of symmetry of the polygon. If we denote by ρ a rotation of 2π
n and by σ a
reflection across one line of symmetry of the polygon, then:
Dn = hρ, σi = {I, ρ, . . . , ρn−1 , σ, σρ, . . . , σρn−1 }.
As we saw in Example 4.6.3.4, this group has the presentation
{{ρ, σ}, σ 2 = I, ρn = I, ρσ = σρ−1 }.
2. Consider the group of symmetries of the sails of a windmill. It is a cyclic
group of order 4, generated by a rotation ρ by π2 :
C4 = {I, ρ, ρ2 , ρ3 }.
More generally one can consider a windmill with n sails, leading to a cyclic
group of symmetries with n elements: Cn = {I, ρ, . . . , ρn−1 }5
Figure 5.5.1: Plane figure with cyclic group of symmetries.
The symmetry groups that one finds in these examples exhaust all the
possibilities since one has the following:
Theorem 5.5.3. A finite group of symmetries of a plane figure Ω ⊂ R2 is
isomorphic to either Cn or Dn .
Proof. Let Ω ⊂ R2 and let G be its group of symmetries, which we assume
to be finite. According to Proposition 5.5.1, we need to consider two cases:
5
In the study of symmetries it is common to denote by Cn the cyclic group of order n,
which we have denoted before by Zn .
5.5. SYMMETRY GROUPS
273
(a) G contains only rotations: Let ρ ∈ G be a rotation by same angle
θρ . We can assume that θρ is the smallest possible among all rotations of
G (which exists, since G is finite). Then {I, ρ, ρ2 , . . . } ⊂ G. On the other
hand, if ρ̃ ∈ G is a rotation by an angle θρ̃ which is not among the powers
ρn , then there exists some positive integer k such that:
kθρ < θρ̃ < (k + 1)θρ .
It follows that ρ̃ρ−k is a rotation by an angle smaller than ρ, a contradiction.
Hence, G = {I, ρ, . . . , ρn−1 } = Cn .
(b) G contains a reflection σ: By (a) the subgroup H ⊂ G of proper
rotations is of the form H = {I, ρ, . . . , ρn−1 }. Since [G : H] = 2, we find
that G = {I, ρ, . . . , ρn−1 , σ, σρ . . . , σρn−1 }. We leave as an exercise to check
that in this case G ≃ Dn .
If n = p is prime, |Dp | = 2p and the Sylow Theorems show that the
subgroups of Dp are:
(a) p subgroups of order 2: {I, σ}, {I, σρ}, . . . , {I, σρp−1 };
PSfrag replaements
(b) 1 subgroup of order p: Cp = {I, ρ, . . . , ρp−1 };
Z
hmi
e
The subgroup of order p is normal. Since p is a prime,
the subgroups of
e~
order 2 are conjugate via a rotation (exercise). Geometrically, this means
that we can obtain any reflection from a fixed reflection by conjugating it
by rotations. For example, the figure below illustrates in the case p = 5
how the reflection ρσ can be obtain from the reflection σ, conjugating by
3
the rotation ρ2 .
md(m; n)
h i
h
i
mi
ni
hmm(m; n)i
X
D X1
h
B
C
2
D
D
h
C
A
B
E
A
A
C
E
E
A
B
D
Ox
X2
x
gx
g1 x
gk x2
(x)
(gx)
B T1(g)
T2 (g )
g2 x
g1 (g2 x)
T
(g2 )
C T (g1)
T (g1 g2 )
E
Figure 5.5.2: Symmetries of a pentagon.
The structure of the subgroups of the dihedral group Dn when n is not
a prime is more complex and will not be discussed here.
274
5.5.2
CHAPTER 5. FINITE GROUPS
Symmetry groups of tridimensional figures
We consider now the symmetry groups G of a tridimensional figure. We start
by looking at the rotational symmetries, i.e., the case where G ⊂ SO(3).
Theorem 5.5.4. A finite group of rotational symmetries of a figure Ω ⊂ R3
is isomorphic to one of the following rotational symmetry groups:
(i) The symmetry group Cn of a windmill with n sails;
(ii) The symmetry group Dn of a regular polygon with n sides;
(iii) The rotational symmetry group T of a regular tetrahedron;
(iv) The rotational symmetry group O of a cube or a regular octahedron:
(v) The rotational symmetry group I of a regular dodecahedron or a regular
icosahedron:
Proof. Let G ⊂ SO(3) be a finite subgroup. The idea of the proof consists
in introducing an action of G in a fine set P and then exploring the class
equation (5.1.3).
5.5. SYMMETRY GROUPS
275
For the finite set P we will take the set of poles of G: an element p ∈
= {x ∈ R3 : |x| = 1} is called a pole if there exists a non-trivial rotation
g ∈ G, such that g · p = p. Therefore, p is a pole fixed by g ∈ G if and only
if p ∈ S 2 ∩ L, where L the axis of rotation of g. In particular, to each g ∈ G
one associates two poles. Obviously, P 6= ∅ if G is non-trivial.
The group G acts on the set of its poles P : if p ∈ P is such that g · p = p
for some g ∈ G, then for any h ∈ G, we have that
S2
(hgh−1 )h · p = h · p.
Since hgh−1 6= e of g 6= e, it follows that h · p ∈ P .
The study of the action of G on P leads to the following lemma:
Lemma 5.5.5. The action of G on the set of its poles P satisfies
X
1
2
(5.5.1)
(1 −
) =2− ,
rpi
N
i
where the sum is over the orbits Oi of the action, pi is a pole representing
Oi , rpi is the order of the isotropy subgroup Gpi and N is the order of G.
Proof of the Lemma. For each pole p ∈ P , the isotropy subgroup Gp consists
of those elements g ∈ G that fix p. Let N = |G| and rp = |Gp |. For each
g ∈ G − {e} there exist 2 poles associated to g, hence we find:
X
X
2=
(rp − 1).
(5.5.2)
2(N − 1) =
g∈G
g6=e
p∈P
Let us group together the elements of P in terms of the orbits of G. If Oi is
an orbit, we choose a representative pi ∈ Oi and we write ni = |Oi |. Then
equation (5.5.2) yields
X
ni rpi − |P | = 2(N − 1),
i
where the sum is now over the orbits Oi of the action. Since Oi ≃ G/Gpi ,
it follows that ni rpi = N , and we conclude:
X
(5.5.3)
N − |P | = 2(N − 1).
i
On the other hand, the class equation (5.1.3) gives
X
X N
.
(5.5.4)
|P | =
|Oi | =
rpi
i
i
Replacing (5.5.4) in (5.5.3), we obtain (5.5.1).
276
CHAPTER 5. FINITE GROUPS
Equation (5.5.1) puts restrictions over G which eventually lead to its
classification. A first remark is that there can be at most 3 orbits. Indeed,
the right hand side of (5.5.1) is < 2, while each term on the left side is ≥ 21 .
It follows that we have the following possibilities:
(i) 1 orbit: We would have
1−
2
1
=2− .
r1
N
This equation has no solutions
(ii) 2 orbits: We obtain
1
2
1
+
= .
r1 r2
N
The only solution is r1 = r2 = N . This means that we have 2 poles p1
and p2 which are fixed by all elements of G. Hence, we have G = CN ,
the group of rotations around the axis that goes through p1 and p2 .
(iii) 3 orbits: In this case we obtain
1
1
1
2
+
+
−1= .
r1 r2 r3
N
We can assume, without loss of generality, that r1 ≤ r2 ≤ r3 . It follows
that r1 = 2 and we obtain the following subgroups:
(a) r1 = r2 = 2, N = 2r3 . |O1 | = |O2 | =
N
2,
|O3 | = 2;
(b) r1 = 2, r2 = r3 = 3, N = 12. |O1 | = 6, |O2 | = |O3 | = 4;
(c) r1 = 2, r2 = 3, r3 = 4, N = 24. |O1 | = 12, |O2 | = 8, |O3 | = 6;
(d) r1 = 2, r2 = 3, r3 = 5, N = 60. |O1 | = 30, |O2 | = 20, |O3 | = 12.
We leave as an exercise to check that the cases (a), (b), (c) and (d)
correspond, respectively, to the groups G ≃ D N , G ≃ T , G ≃ O and G ≃ I.
2
In case (a), the poles are the intersections of the symmetry lines of the
regular polygon with the unit sphere and the intersection of the line through
the center of the polygon and perpendicular to its plane with the unit sphere.
In cases (b)-(d), the poles are the intersections of the symmetry axis of the
regular polyhedra with the unit sphere.
5.5. SYMMETRY GROUPS
277
Proposition 5.5.1 now shows that the full symmetry groups take one of
the following forms:
(a) If −I ∈ G, then G = H ∪ −H, where H = {I, ρ1 , . . . , ρn−1 } the
subgroup of rotations in G and −H ≡ {−I, −ρ1 , . . . , −ρn−1 }.
(b) If −I 6∈ G, then G = H ∪ H̃, where H is the subgroup of rotations in
G and, if −ρ ∈ H̃ then ρ has even order and ρ2 ∈ H.
Since we already know what H can be, by Theorem 5.5.4, this leads to the
classification of the finite groups of symmetries of a tridimensional figure
Ω ⊂ R3 .
Podemos descrever, of forma mais explı́cita, The symmetry groups of the
regular polyhedra can be described even more explicitly. We illustrate with
the case of a regular dodecahedron (for the remaining regular polyhedra, see
the exercises at the end of the section).
Example 5.5.6.
The group of symmetries I of a dodecahedron (or its dual, a regular icosahedron) has order 60 = 22 ×3×5. By the Sylow Theorems we obtain the following
subgroups:
(i) Subgroups of order 5: the number of subgroups of order 5 divides 12 and is
equal to 1 (mod 5). Hence, we can have either 1 or 6 subgroups of order
5. There are 6 subgroups each generated by a rotation by 2π
5 around a
line that connects the centers of two parallel faces of the dodecahedron.
These subgroups are precisely the isotropy subgroups of the orbit O3 .
(ii) Subgroups of order 3: the number of subgroups of order 3 divides 20 and is
equal to 1 (mod 3). We can have 1, 4 or 10 subgroups of order 3. There
10 subgroups each generated by a rotation by 2π
3 around a line through
opposite vertices of the dodecahedron. These subgroups are precisely the
isotropy subgroups of the orbit O2 .
(iii) Subgroups of order 4: the number of subgroups of order 4 divides 15 and
is equal to 1 (mod 2). We can have 1, 3, 5 ou 15 subgroups. There are
15 subgroups of order 2 corresponding to rotations by π around the lines
through the centers of parallel edges of the dodecahedron. These subgroups
are precisely the isotropy subgroups of the orbit O1 , and they give rise to
5 subgroups of order 4, formed by the rotations associated to 3 orthogonal
edges of the dodecahedron (in the figure below, these are the edges parallel
to the edges of the cube).
This list of rotations of order 2, 3 and 5, exhaust all the elements of the group
I, since:
(5.5.5)
60 = |I| = 1 + |{z}
15 + |{z}
20 + |{z}
24 .
order 2 order 3 order 5
278
CHAPTER 5. FINITE GROUPS
This description of the elements of I also allows to show that I is a simple
group. In fact, if H ⊂ G is a normal subgroup and H contains an element
of order r, then H must contains all the elements of order r, since the Sylow
subgroups are all conjugate and the subgroups of order 2 are not normal. Hence,
the order of H is a sum of terms in equation (5.5.5). However, there is no
integer that is a sum of terms of (5.5.5) and that divides 60. Hence, we must
have H = I, so this is a simple group.
Our study of the alternating group An suggests that I ≃ A5 . To see that this is
indeed the case, consider the 5 cubes inscribed in the dodecahedron (see figure).
The action of I in the vertices of the dodecahedron transforms the vertices of
the cubes into vertices of the cubes (preserving their orientation). This gives
an action of I in a set of 5 elements:
T (R)(cube) = R(cube)
Figure 5.5.3: One of the cubes inscribed in a dodecahedron.
Since I is simple, this action is effective: N (T ) = {e}. Since I contains only
rotations which preserve orientations, we have Im(T ) ⊂ A5 . Finally, since
|I| = 60 = |A5 |, we conclude that I ≃ Im(T ) = A5 .
The classification of finite groups of symmetries of figures Ω ⊂ Rn , for
n > 3, is only known for small values of n. A special important case is the
groups generated by reflections in hyperplanes of Rn , the so-called Coxeter
groups. Its classification, obtained by Coxeter in 19346 , is related to the
classification of Lie algebras, and has many applications in several domains
of Mathematics and Physics.
6
H. S. M. Coxeter, Discrete Groups Generated by Reflections, Ann. Math. 35, (1934)
588-621.
5.5. SYMMETRY GROUPS
279
Exercises.
1. Complete the proof of Theorem 5.5.3.
2. Let Dp be the group of symmetries of a regular polygon with p sides. Show
that if p is a prime, all subgroups of Dp of order 2 are conjugate via a rotation.
3. Complete the proof of Theorem 5.5.4.
4. Show that the action of the symmetry group I on the set of 5 cubes inscribed
in the dodecahedron is effective (see Example 5.5.6).
5. Show that T ≃ A4 .
(Hint: Consider the action of T on the vertices of the tetrahedron.)
6. Mostre que O ≃ S4 .
(Hint: Consider the action of O on the diagonals of the cube.)
280
CHAPTER 5. FINITE GROUPS
Chapter 6
Modules
6.1
Modules over a ring
Let us recall the definition of a vector space over a field:
Definition 6.1.1. A vector space over a field K is an abelian group
(V, +) with a binary operation K × V → V , written (k, v) 7→ kv, which
satisfies:
(i) k(v 1 + v 2 ) = kv 1 + kv 2 , k ∈ K, v 1 , v 2 ∈ V ;
(ii) (k + l)v = kv + lv, k, l ∈ K, v ∈ V ;
(iii) k(lv) = (kl)v, k, l ∈ K, v ∈ V ;
(iv) 1v = v, v ∈ V .
Now let (G, +) be an abelian group, for which we use the additive notation. Recall that we have a binary operation which to the elements n ∈ Z
and g ∈ G associates the element ng ∈ G. This operation satisfies:
• n(g1 + g2 ) = ng1 + ng2 , n ∈ Z, g1 , g2 ∈ G;
• (n + m)g = ng + mg, n, m ∈ Z, g ∈ G;
• n(mg) = (nm)g, n, m ∈ Z, g ∈ G;
• 1g = g, g ∈ G.
These properties are formally analogous to the axioms in Definition 6.1.1,
where we replace the elements of V (“vectors”) by elements of the group G,
and the elements of the field K (“scalars”) by elements of the ring Z.
281
282
CHAPTER 6. MODULES
Next consider a linear transformation T : Rn → Rn . If p(x) = a0 +
a1 x + · · · + an xn ∈ R[x] is any polynomial and v ∈ Rn is a vector, we let
p(x) · v ∈ Rn be defined by1
p(x) · v := a0 v + a1 T (v) + · · · + an T n (v) =
n
X
ak T k (v),
k=1
where:
T 0 = I,
Tk = T ◦ T ◦ ··· ◦ T
(k − times).
For example, if T : R3 → R3 is the linear transformation whose matrix
relative to the canonical basis e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1) is:


2 0 0
A =  0 3 1 .
0 0 3
and we let p(x) = 3 + 12 x − 2x2 and v = (1, 2, 1), then
1
p(x) · v = 3(1, 2, 1) + T (1, 2, 1) − 2T 2 (1, 2, 1) = (−4, −77/2, −27/2).
2
It is easy to check that this operation satisfies the following properties:
• p(x) · (v 1 + v 2 ) = p(x) · v 1 + p(x) · v 2 ;
• (p(x) + q(x)) · v = p(x) · v + q(x) · v;
• p(x) · (q(x) · v) = (p(x)q(x)) · v;
• 1v = v.
for any p(x), q(x) ∈ R[x], v, v 1 , v 2 ∈ Rn . In this example, the “vectors” are
elements in Rn and the “scalars” are elements of the polynomial ring R[x].
Notice that in each of these three examples all the properties satisfied
by the operation in question involve only the ring structure of the “scalars”
(respectively, K, Z and R[x]) and the additive structures of the “vectors”.
There are other similar circumstances where one can find analogues properties, so it is natural to extend these concepts to an arbitrary ring. The
resulting algebraic structure, called module, is the basic stone to study concepts of Linear Algebra, such as linear independence, linear transformations,
dimension, basis, etc.
1
From now, in order to simplify the notation, we will not use the convention of denoting
an indeterminate by x. Usually, the indeterminates will be denoted by x1 , . . . , xn (or x
if n = 1; or x, y if n = 2; or x, y, z if n = 3) and we will denote by bold the vectors (the
elements of a vector space or, more generally, of a module).
6.1. MODULES OVER A RING
283
Definition 6.1.2. A module M over a ring A or an A-module is an
abelian group (M, +), together with an operation of a unitary ring A on M ,
written (a, v) 7→ av, satisfying the following properties(2 ):
(i) a(v 1 + v 2 ) = av 1 + av 2 , a ∈ A, v 1 , v 2 ∈ M ;
(ii) (a1 + a2 )v = a1 v + a2 v, a1 , a2 ∈ A, v ∈ M ;
(iii) a1 (a2 v) = (a1 a2 )v, a1 , a2 ∈ A, v ∈ M ;
(iv) 1v = v, v ∈ M .
To be precise, the modules we have just defined are “left modules”. We
leave it to the reader the task of defining “right modules”. All the results
in this chapter will be stated for left modules, but they are valid mutatis
mutandis for right modules. For a commutative ring A there is little sense
in distinguishing between a left and a right module.
We will denote by 0A and 0M the identities in (A, +) and (M, +). Since
(M, +) is an abelian group the element nv ∈ M , where n ∈ Z and v ∈ M ,
has the usual meaning. Similarly, we have the element na ∈ A, where n ∈ Z
and a ∈ A. One checks easily the following elementary properties:
Proposition 6.1.3. For any A-module M :
(i) a0M = 0M , a ∈ A;
(ii) 0A v = 0M , v ∈ M ;
(iii) (−a)v = −(av) = a(−v), a ∈ A, v ∈ M ;
(iv) n(av) = a(nv) = (na)v, n ∈ Z, a ∈ A, v ∈ M .
A submodule N of an A-module M is a subgroup of (M, +) which is
closed for multiplication by elements of A: if a ∈ A and v ∈ N , then av ∈ N .
A submodule is obviously an A-module.
Examples 6.1.4.
1. Vector spaces are the same thing as modules over a field. More generally,
one can call a module over a division ring D a vector space. In this case the
submodules are the linear subspaces.
2. Abelian groups are the same thing as modules over Z. In this case, the
submodules coincide with the subgroups of G.
2
One can also consider modules over rings A without an identity 1, where one omits
axiom (iv). We shall only consider modules over unitary rings.
284
CHAPTER 6. MODULES
3. Generalizing the example above, a fixed linear transformation T : V → V
makes the vector space V into a module over the polynomial ring K[x]: given
p(x) = a0 + a1 x · · · + an xn ∈ K[x] and v ∈ V then one sets:
p(x) · v := a0 v + a0 T (v) + · · · + an T n (v).
In this example the submodules are the linear subspaces W ⊂ V which are
invariant under the linear transformation T .
4. If A is a ring and I ⊂ A is a (left) ideal then I is a A-module: if a ∈ A
and b ∈ I, then ab ∈ I. Similarly, A/I is an A-module, since if a ∈ A and
b + I ∈ A/I, we can set:
a(b + I) = ab + I.
5. If A is a ring and B ⊂ A is a subring, then A is a B-module. In particular,
the rings A[x1 , . . . , xn ] and A[[x1 , . . . , xn ]] are A-modules.
6. If G is an abelian group and End(G) is the ring of endomorphisms of G.
Then G is a End(G)-module for the operation
φg ≡ φ(g),
φ ∈ End(G), g ∈ G.
7. Let A and B be rings and φ : A → B a ring homomorphism. If M is a
B-module, then we can make M into A-module, denoted φ∗ M , which has the
same underlying supporting set but a new operation defined by
av ≡ φ(a)v,
a ∈ A, v ∈ M.
Notice that in all these examples by studying the structure of the module
(e.g, the structure of its submodules) one can obtain information about
the underlying objects: the subgroups of an abelian group, the invariant
subspaces, etc. This will be a recurrent theme in this chapter.
Definition 6.1.5. A homomorphism of A-modules φ : M1 → M2 is a
map between A-modules which satisfies:
(i) φ(v 1 + v 2 ) = φ(v 1 ) + φ(v 2 ), v 1 , v 2 ∈ M ;
(ii) φ(av) = aφ(v), a ∈ A, v ∈ M .
One defines in a obvious way monomorphisms, epimorphisms and isomorphisms of A-modules. We will also use interchangeably the terms A-linear
transformations or linear map to denote a homomorphism of A-modules.
Note that if φ : M1 → M2 is a linear transformation, its kernel N (φ)
and its image Im(φ) are submodules of M1 and M2 .
6.1. MODULES OVER A RING
285
Examples 6.1.6.
1. A homomorphism of Z-modules is the same thing as a homomorphism of
abelian groups.
2. If V1 and V2 are vector spaces over K, the K-homomorphisms φ : V1 → V2
are the usual linear transformations.
If M is an A-module and N ⊂ M is a submodule, the inclusion ι : N →
M is an A-linear map. The quotient M/N has a natural structure of an
A-module for which the quotient map π : M → M/N is an A-linear map:
M/N is an abelian group and we can define a scalar multiplication by:
a(v + N ) ≡ (av) + N.
One checks easily that (i)-(iv) are satisfied. The module M/N is called the
quotient module of M by N .
T
If {Ni }i∈I is a family of submodules of an A-module M , then i∈I Ni is a
submodule of M . Hence, if S ⊂ M is a non-empty set, then the intersection
of all submodules of M which contains S is a submodule hSi, called the
module generated by S. The elements of hSi are of the form a1 v 1 +· · ·+ar v r ,
where ai ∈ A and v i ∈ S.3
P If {Ni }i∈I is a family of submodules of an
S A-module M , we denote by
N
the
submodule
generated
by
S
=
i
i∈I
i∈I Ni . If I = {1, . . . , m} is
Pm
finite,
we
write
N
or
N
+
·
·
·
+
N
.
In general, the elements of
1
m
i=1 i
P
i∈I Ni take the form v i1 + · · · + v im , v ij ∈ Nij .
Theorem 6.1.7 (Isomorphism Theorems for Modules).
(i) If φ : M1 → M2 is a homomorphism of A-modules, then there exists
an isomorphism of A-modules: Im(φ) ≃ M1 /N (φ).
(ii) If N1 and N2 are submodules of an A-module M , then there exists an
isomorphism of A-modules:
N1
N1 + N2
≃
.
N2
N1 ∩ N2
(iii) If N and P are submodules of an A-module M and M ⊃ N ⊃ P , then
P is a submodule of N and there is an isomorphism of A-modules:
M/N ≃
M/P
.
N/P
3
Note that this does not hold for
P modulesPover rings without a unit. For this modules,
the elements of hSi are of the form i ai v i + j nj ṽ j , where ai ∈ A, nj ∈ Z and v i , ṽ j ∈ S.
286
CHAPTER 6. MODULES
We leave the easy proofs of this theorems for the exercises.
Q
Let {Mi }i∈I be a family of A-modules. We define the A-module i∈I Mi ,
called direct
product of the family {Mi }i∈I , as follows. The underlying
Q
set
of
M
i is the cartesian product of the Mi . If (v i )i∈I , (w
i∈I
Q
Qi )i∈I ∈
i∈I Mi , then (v i )i∈I + (w i )i∈I denotes the element (v i + w i )i∈I
Q ∈ i∈I Mi ,
and if a ∈ A, then a(v i )i∈I
denotes
the
element
(av
)
∈
i i∈I
i∈I Mi . One
Q
checks immediately that Qi∈I Mi becomes an A-module. If k ∈ I, the
canonical Q
projection πk : i∈I Mi → Mk the A-linear map which maps
(v i )i∈I ∈ i∈I Mi to the element v k ∈ Mk .
L
The direct sum of a family of A-modules
{Mi }i∈I , denoted i∈I Mi , is
Q
the submodule of the direct product i∈I Mi formed by the elements (v i )i∈I
where only a finite number
of v i ’s are non-zero. If k ∈ I, the canonical
L
injection ιk : Mk → Qi∈I Mi is the A-linear map which maps v k ∈ Mk to
the element (v i )i∈I ∈ i∈I Mi with all vi = 0 for i 6= k.
When I = {1, . . . , m} isLfinite, the direct sum and direct product coincide. In this case we write m
i=1 Mi or M1 ⊕ · · · ⊕ Mm .
Proposition 6.1.8. M ≃ M1 ⊕ · · · ⊕ Mm (as A-modules) if and only if
there exist A-linear maps πk : M → Mk and ιk : Mk → M such that:
(i) πk ◦ ιk = idMk , k = 1, . . . , m;
(ii) πk ◦ ιl = 0, k 6= l;
(iii) ι1 ◦ π1 + · · · + ιm ◦ πm = idM .
Proof. Assume that φ : M → M1 ⊕ · · · ⊕ Mm is an isomorphism. Then the
compose of the canonical projections and injections with φ and φ−1 satisfy
(i), (ii) and (iii).
Conversely, if there exist A-linear maps satisfying (i), (ii) and (iii), we
can defined φ : M → M1 ⊕ · · · ⊕ Mm and ψ : M1 ⊕ · · · ⊕ Mm → M by:
φ(x) = (πk (x))k={1,...,m} ,
ψ((xk )k={1,...,m} ) = ι1 (x1 ) + · · · + ιm (xm ).
These are A-linear maps which, by (i), (ii) and (iii), satisfy φ◦ψ = idM1 ⊕···⊕Mm
and ψ ◦ φ = idM . Hence, φ and ψ establish an isomorphism of A-modules
M ≃ M1 ⊕ · · · ⊕ Mm .
If {Ni }i∈I is a family of submodules of an A-module M , we
Lsay that M is
the direct
sum
of
the
submodules
{N
}
,
if
the
map
i i∈I
i∈I Ni → M ,
P
L
(v i ) 7→ i v i , is an isomorphism. In this case, we write M = i∈I Ni . We
have the following result which characterizes when an A-module is a direct
sum of submodules. We leave the proof as an exercise:
6.1. MODULES OVER A RING
287
Proposition 6.1.9. L
Let M be an A-module and {Mi }i∈I a family of submodules. Then M = i∈I Mi if and only if:
P
(i) M = i∈I Mi ;
(ii) Mj ∩ (Mi1 + · · · + Mik ) = {0} for j 6∈ {i1 , . . . , ik }.
The notion of module is also relevant for studying other algebraic structures. One important example is the following:
Definition 6.1.10. Let R be a ring with identity. An algebra over R
or R-algebra is a ring A such that:
(i) (A, +) is an R-module;
(ii) k(ab) = (ka)b = a(kb) for all k ∈ R and a, b ∈ A.
If (A, +, ·) is a division ring, A is called a division algebra.
The notions of subalgebra, homomorphism and isomorphism of R-algebras
are more or less obvious and we leave it to the reader its definition. An algebra over a field K which, as a vector space over K, has finite dimension
is called a finite dimensional algebra over K.
Examples 6.1.11.
1. A field extension k ⊂ K can be viewed as an algebra over k. For example,
the fields Q ⊂ R ⊂ C are algebras over each of the preceding fields. Similarly,
the quaternions H is an algebra over Q and R (what about over C?). These
are all division algebras.
2. Let R be a ring with identity. The set A = Mn (R) of all matrices n × n
with entries in R is an algebra over R, which fails to be a division algebra. If
R = K is a field, Mn (K) is an algebra over K of finite dimension.
3. If V is a vector space over a field K, the set A = EndK (V ) of all the linear
maps V → V is an algebra over K. Note that A is finite dimensional iff V is
finite dimensional: if dim V = n then A is isomorphic to the algebra Mn (K).
4. If A is a commutative ring with identity, the polynomial ring A[x1 , . . . , xn ]
and the power series ring A[[x]] are algebras over A.
Algebras where the product is non-associative are also important (4 ) but
will not be discussed here.
4
For example, Lie algebras and Jordan algebras are non-associative algebras.
288
CHAPTER 6. MODULES
Exercises.
1. Consider the R[x]-module structure on V = Rn defined by the linear transformation T (v1 , . . . , vn ) = (vn , v1 , . . . , vn−1 ). Determine the elements v ∈ Rn
such that (x2 − 1)v = 0.
2. Let φ : M1 → M2 be a homomorphism of A-modules, and Ni ⊂ Mi (i = 1, 2)
submodules such that φ(N1 ) ⊂ N2 . Show that:
(a) There exists one, and only one, homomorphism of A-modules φ̃ : M1 /N1 →
M2 /N2 for which the following diagram commutes:
M1
φ
/ M2
π1
π2
M1 /N1 ❴ ❴ ❴ ❴ ❴ ❴/ M2 /N2
φ̃
(b) φ̃ is an isomorphism if and only if Im(φ) + N2 = M2 and φ−1 (N2 ) ⊂ N1 .
3. Let {Ni }i∈I be a family of A-modules. Show that:
(a) Given an A-module M and homomorphisms
Q {φi : M → Ni }i∈I , there exists a unique homomorphism φ : M → i∈I Ni such that for all k ∈ I
the following diagram commutes:
Q
❴ ❴ ❴ ❴φ ❴ ❴/ i∈I Ni
M P
PPP
PPP
PPP
πk
φk PPPP
P' Nk
(b)
Q
i∈I
Ni is determined up to isomorphism by this property.
4. Let {Ni }i∈I be a family of A-modules. Show that:
(a) Given an A-module M and homomorphisms
{φi : Ni → M }i∈I , there exL
ists a unique homomorphism φ : i∈I Ni → M such that for all k ∈ I
the following diagram commutes:
L
φ
M o❴
gPPP❴ ❴ ❴ ❴ ❴ i∈IO Ni
PPP
PPP
ιk
P
φk PPPP
P
Nk
(b)
L
i∈I
Ni is determined up to isomorphism by this property.
6.1. MODULES OVER A RING
289
5. Let M be
Lan A-module and {Mi }i∈I a family of submodules de M . Show
that M = i∈I Mi if and only the following two conditions hold:
P
(i) M = i∈I Mi ;
(ii) Mj ∩ (Mi1 + · · · + Mik ) = {0} se j 6∈ {i1 , . . . , ik }.
6. A sequence of homomorphisms of A-modules:
M0
φ1
/ M1
φ2
/ ···
/ M2
φn
/ Mn ,
is said to be exact if Im(φi ) = N (φi+1 ), i = 1, . . . , n − 1. Show that:
(a) if N ⊂ M is a submodule, then one has an exact sequence:
/N
0
ι
/M
π
/ M/N
/0
(b) if M1 and M2 are A-modules, then one has an exact sequence:
0
/ M1
ι1
/ M1 ⊕ M2
π2
/0
/ M2
7. (Five Lemma) Consider the following commutative diagram of A-modules
and linear transformations:
M1
φ1
N1
/ M2
/ M3
/ M4
/ M5
φ2
φ3
φ4
φ5
/ N2
/ N3
/ N4
/ N5
Show that, if the rows are exact and φ1 , φ2 , φ4 and φ5 are isomorphisms, then
φ3 is an isomorphism.
8. If M and N are left A-modules, HomA (M, N ) denotes the set of all A-linear
transformations φ : M → N . Show that:
(a) HomA (M, N ) is a Z-module;
(b) HomA (M, A) is a right A-module;
(c) EndA (M ) ≡ HomA (M, M ) is a A-algebra, provided A is abelian.
9. Let M be a left A-module. The dual of M is the right A-module M ∗ ≡
HomA (M, A) (see previous exercise). Show that:
(a) if φ : M → N is A-linear, there exists a dual linear transformation (of
right A-modules) φ∗ : N ∗ → M ∗ ;
Q
L
∗
(b) ( i∈I Ni ) ≃ i∈I Ni ∗ ;
(c) one can have M 6= {0} and M ∗ = {0}.
290
CHAPTER 6. MODULES
6.2
Linear Independence
Let M be an A-module. A set ∅ =
6 S ⊂ M is called linearly independent
if for any finite family {v 1 , . . . , v n } of elements of S and any a1 , . . . , an ∈ A:
a1 v 1 + · · · + an v n = 0
=⇒
a1 = · · · = an = 0.
Otherwise, we call S linearly dependent.
A set S ⊂ M is called generating if M = hSi. In this case any element
v ∈ M can be written as linear combination (in general, non-unique) of
elements de S:
m
X
ai vi , ai ∈ A, v i ∈ S.
v=
i=1
An A-module is said to be of finite type if it has some finite generating
subset.
A set S ⊂ M which is both generating and linearly independent is called
a basis S of the A-module M . Given a basis S, any
P element v ∈ M can be
written in a unique way as a linear combination m
i=1 ai v i , ai ∈ A, v i ∈ S.
An A-module may or may not have a basis (see the examples below). We
will say that an A-module M is free5 if M admits a basis.
Examples 6.2.1.
1. A vector space always contains a basis (exercise).
2. The abelian group Zn , viewed as a Z-module, does not admit a basis. In fact,
given g ∈ Zn , there exists always a 0 6= m ∈ Z such that mg = 0. Hence in Zn
every subset is linearly dependent.
3. The abelian group Zm = Z ⊕ · · · ⊕ Z is free. A basis is given by S =
{e1 , . . . , em }, where ei = (0, . . . , 1, . . . , 0).
4. If D is an integral domain, the polynomial ring D[x] is free. A basis is given
by the monomials S = {1, x, . . . , xn , . . . }.
5. Any ring A with a unit is a free A-module with basis {1}. The submodules
coincide with the ideals of A. In particular, a submodule may not be free, and
even if its free, it may have a basis with more than one element.
We say that an A-module M is cyclic if it is generated by one element,
i.e., if M = hvi for some v ∈ M .6 . If M = hvi is cyclic, then we have an
5
6
This is the analogue for modules of the notion of a free group.
This is the analogue for modules of the notion of a cyclic group
6.2. LINEAR INDEPENDENCE
291
homomorphism of A-modules, A → M , given by a 7→ av. This homomorphism is surjective and, by the First Isomorphism Theorem, M ≃ A/ ann v,
where the annihilator of v is the ideal
ann v := {a ∈ A : av = 0}.
If ann v = {0}, then we say that v is a free element, since in this case
M = hvi ≃ A is free. A non-free element is said to have torsion. The set
of elements of M with torsion is denoted by Tors(M ). When A is a domain,
Tors(M ) ⊂ M is a submodule of M .
If X is any set and A is a ring, letL
us associate to each x ∈ X a copy of
A and form the free A-module M = x∈X A, called the free A-module
generated by the set X. We can represent the elements of M as finite
sums a1 x1 + · · · + ar xr , where xL
1 , . . . , xr ∈ X and ai ∈ A: by such a sum
we mean the element (ax )x∈X ∈ x∈X A, where ax1 = a1 , . . . , axr = ar and
ax = 0 if x 6= xi (i = 1, . . . , r).
The next proposition gives a characterization of free modules. In particular, notice that the free module generated by the set X satisfies the same
universal property that characterizes free groups.
Proposition 6.2.2. Let A be a ring. For an A-module M , the following
statements are equivalent:
(i) M is free.
(ii) There exists a family
L {Ni }i∈I of cyclic submodules of M , with Ni ≃ A,
such that M ≃ i∈I Ni .
(iii) M ≃
L
j∈J
A for some index set J.
(iv) There exists a set X 6= ∅ and a map ι : X → M such that the following
universal property holds: for any A-module N and map φ : X → N
there exists a unique homomorphism of A-modules φ̃ : M → N such
that the following diagram commutes:
ι
/M
X ◆◆
✤
◆◆◆
✤
◆◆◆
φ̃
◆◆◆
φ
◆◆' ✤
N
Proof. We show that (i) ⇒ (ii) ⇒ (iii) ⇒ (iv) ⇒ (i).
292
CHAPTER 6. MODULES
(i) ⇒ (ii) Assume that M is free and let {ei }i∈I be a basis of M . Them,
for each L
i ∈ I, Ni := hei i is a cyclicPsubmodule of M isomorphic to A. The
map φ : i∈I Ni → M , (v i )i∈I 7→ i∈I v i is an isomorphism of A-modules.
(ii) ⇒ (iii) Obvious.L
(iii) ⇒ (iv) Let ψ : j∈J A
L→ M be an isomorphism of A-modules and
ek = (xj )j∈J the element of
j∈J A, with xk = 1 and xj = 0, for j 6= k.
Also, let X = J and consider the map ι : X → M given by ι(j) = ψ(ej ). If
φ : X → N is any map into an A-module N
P, we can define φ̃ : M → N to
be the linear transformation ψ((xj )j∈J ) 7→ j xj φ(j). Clearly, φ̃ makes the
diagram above commute. Since {ψ(ek )} is a basis of M , φ̃ is unique.
(iv) ⇒ (i) We leave as an exercise to check that {ι(x)}x∈X is a basis of
M.
Let M be a free A-moduleL
which admits a finite basis {e1 , . . . , en }. The
proposition shows that M ≃ ni=1 A ≡ An . Is it true that any other basis
of M has the same number of elements? In other words, is it true that
An ≃ Am implies n = m? Maybe somewhat surprising, the answer is no, as
shown by an exercise at the end of this section.
On the other hand, if M is a free A-module which admits an infinite
basis we have the following result:
Proposition 6.2.3. If an A-module M admits a basis with an infinite number of elements, then all the bases of M have the same cardinality.
Proof. Let {ei }i∈I and {f j }j∈J be bases of M and assume that I is infinite.
Then:
(a) J is infinite: Assume, to the contrary, that J is finite, say J =
{1, . . . , m}. Then
P there exist elements cjil ∈ A with il ∈ I, l, j ∈ J,
such that f j = m
l=1 cjil eil . But then {ei1 , . . . , eim } is a set generating M . Hence, if e0 is another basis element distinct from eil there
exist a1 , . . . , am ∈ A such that
e0 = a1 ei1 + · · · + am eim ,
contradicting the linear independence of the {ei }i∈I .
(b) ∃ φ : I → Pfin (J)×N injective: 7 Let ψ : I → Pfin (J) be the map which
to i ∈ I associates {j1 , . . . , jm }, where the j1 , . . . , jm are the (unique)
7
We denote by Pfin (J) the set consisting of the finite subsets of J. It is shown in the
Appendix, that if J is infinite, then Pfin (J) has the same cardinality as J.
6.2. LINEAR INDEPENDENCE
293
indices in J such that:
ei = aj1 f j1 + · · · + ajm f jm
(ajl 6= 0).
The map ψ is not injective, but if P ∈ Pfin (J), then ψ −1 (P ) is finite
(why?). Hence we can order ψ −1 (P ). If i ∈ ψ −1 (P ), set φ(i) ≡ (P, α),
where α is the ordinal number of i in ψ −1 (P ). Since I is the disjoint
union of the ψ −1 (P ), the map φ : I → Pfin (J) × N is injective.
(c) |I| = |J|: Since J is infinite, by (b), we have
|I| ≤ |Pfin (J) × N| = |Pfin (J)| = |J|.
Interchanging the roles of I and J, we conclude that |J| ≤ |I|. By the
Schröder-Bernstein Theorem, we see that |I| = |J|.
The previous results motivate the following definition:
Definition 6.2.4. We say that a ring A has the invariant dimension
property if for any free A-module M , all the bases of M have the same
cardinality. In this case, the common cardinality of the bases of M is called
the dimension of M , denoted dimA M or simply dim M .
We leave as an exercise to check that any division ring has the invariant
dimension property, hence it makes sense to talk about the dimension of a
vector space. We also have:
Proposition 6.2.5. A commutative ring has the invariant dimension property.
Proof. Let {e1 , . . . , en } and {f 1 , . . . , f m } be bases of a free A-module M .
Then there are elements bji , cij ∈ A, i = 1, . . . , n, j = 1, . . . , m such that
X
X
fj =
bji ei ,
ei =
cij f j .
i
By substitution, we conclude that:
X
fj =
bji cil f l ,
il
j
ei =
X
cij bjl el .
jl
Since {e1 , . . . , en } and {f 1 , . . . , f m } are bases of M , if we define the matrices
n,m
B = (bji )m,n
j=1,i=1 and C = (cij )i=1,j=1 , we conclude that BC = Im×m and
CB = In×n .
294
CHAPTER 6. MODULES
If A has zero characteristic, since A is commutative, we have8 :
m = tr(Im×m ) = tr(BC) = tr(CB) = tr(In×n ) = n.
The first and last equality are valid in characteristic zero. We leave the
proof for non-zero characteristic as an exercise.
Exercises.
1. Give an example of an A-module, non-isomorphic to A, where any set with
2 or more elements is linearly dependent.
2. Let A be a commutative ring with unit and M an A-module.
(a) Show that, if v ∈ Tors(M ), then hvi ⊂ Tors(M );
(b) Is Tors(M ) always a submodule of M ?
3. Let A be a commutative ring. Show that EndA (An ) is isomorphic to the ring
Mn (A) of n × n matrices with entries in A.
4. Let M be an A-module, X 6= ∅ a set and ι : X → M a map satisfying the
following property: for any A-module N and map φ : X → N there exists a
unique A-linear transformation φ̃ : M → N such that φ = φ̃ ◦ ι. Show that
{ι(x)}x∈X is a basis of M .
5. Let V be a vector space over a division ring D. Show that V has a basis.
6. Show that a division ring D has the invariant dimension property.
7. Let A be a commutative ring. Complete the proof of Proposition 6.2.5 by
showing that:
(a) if B, C ∈ Mn (A) are n×n matrices, then BC = In×n implies CB = In×n ;
(b) if B is a m×n matrix, C is a n×m matrix, BC = Im×m and CB = In×n ,
then m = n.
L∞
∞
8. Let R∞ =
i=1 R (direct sum of R-modules) and let A = End(R ) be
∞
the ring of all R-linear transformations of R . Show that A ≃ A ⊕ A (as
A-modules), i.e., that A has a basis with 2 elements, so A does not have the
invariant dimension property.
9. Show that any A-module is the quotient of a free A-module.
8
The trace of a n × n matrix A = (aij ) is defined by tr A =
Pn
i=1
aii ∈ A.
6.3. TENSOR PRODUCTS
6.3
295
Tensor Products
In this section A denotes a commutative ring(9 ). In particular, as we saw
before, the free modules have the invariant dimension property.
If M1 , . . . , Mr , N are A-modules, an A-multilinear map is a map µ :
M1 × · · · × Mr → N which is A-linear in each entry:
′
′′
′
′′
µ(v 1 , . . . , av i +bv i , . . . , v r ) = aµ(v 1 , . . . , v i , . . . , v r )+bµ(v 1 , . . . , v i , . . . , v r ).
We will denote by L(M1 , . . . , Mr ; N ) the set of all A-multilinear maps. One
checks immediately that L(M1 , . . . , Mr ; N ) é is an A-module for the usual
operations of addition and multiplication by scalars:
(µ1 + µ2 )(v 1 , . . . , v r ) := µ1 (v 1 , . . . , v r ) + µ2 (v 1 , . . . , v r ),
(aµ)(v 1 , . . . , v r ) := aµ(v 1 , . . . , v r ).
If M1 = · · · = Mr = M we write Lr (M ; N ) instead of L(M, . . . , M ; N ).
Proposition 6.3.1. Let M1 , . . . , Mr be A-modules.
Nr
(i) There exists an A-module
i=1 Mi ≡ M1 ⊗ · · · ⊗ Mr and an Amultilinear map ι : M1 × · · · × Mr → M1 ⊗ · · · ⊗ Mr satisfying the
following universal property: for any A-module N and A-multilinear
map φ : M1 × · · · × Mr → N , there exists a unique A-linear map
φ̃ : M1 ⊗ · · · ⊗ Mr → N which makes the following diagram commute:
ι
/ M1 ⊗ · · · ⊗ Mr
✤
❱❱❱❱
❱❱❱❱
✤
❱❱❱❱
φ̃
❱❱❱❱
φ
❱❱❱❱ ✤
*
M1 × · · · × M❱ r
N
(ii) The A-module M1 ⊗ · · · ⊗ Mr is determine by the universal property
in (i) up to isomorphism.
Proof. Let L be the free A-module generated by the set I = M1 × · · · × Mr ,
i.e.,
M
L=
A,
i∈I
9
One can also define tensor products of modules over non-commutative rings. The
definition is similar, but one needs to distinguish between left and right modules.
296
CHAPTER 6. MODULES
where in this sum there exists a term for each (v 1 , . . . , v r ) ∈ M1 × · · · × Mr .
Denoting by R the submodule of L generated by the elements of the form
′
′′
′
(6.3.1) (v 1 , . . . , av i + bv i , . . . , v r ) − a(v 1 , . . . , v i , . . . , v r )−
′′
b(v 1 , . . . , v i , . . . , v r ),
we let M1 ⊗ · · · ⊗ Mr denote the quotient module L/R. We define ι :
M1 × · · · × Mr → M1 ⊗ · · · ⊗ Mr to be the compose of the canonical injection
M1 × · · · × Mr → L with the quotient map L → L/R.
If φ : M1 × · · · × Mr → N is an A-multilinear map, then we induce a
linear map φ̄ : L → N as follows: if ⊕i∈I ai (v i1 , . . . , v ir ) ∈ L, then
X
φ̄(⊕i∈I ai (v i1 , . . . , v ir )) =
ai φ(v i1 , . . . , v ir ).
i∈I
This map is well-defined, since by definition of the direct sum only a finite
number of the ai ’s is non-zero. Also, it is immediate to check that φ̄ is
A-linear.
Since φ is A-multilinear, the linear map φ̄ vanishes on elements of the
form (6.3.1), hence on the submodule R. Passing to the quotient, we obtain
an A-linear transformation φ̃ : M1 ⊗ · · · ⊗ Mr → N which, by definition,
satisfies φ = φ̃ ◦ ι.N
′
′
r
NrFinally,′ let ( i=1 Mi ) be an A-module and ι : M1 × · · · × Mr →
( i=1 Mi ) an A-multilinear map satisfying the universal property expressed
in (i). We then obtain commutative diagrams:
M1 × · · · ×PMr
ι
/
Nr
PPP
PPP
PPP
PPP
ι′
(
N
(
Mi
i=1
✤
✤
✤
ι̃′
r
′
i=1 Mi )
ι′
M1 × · · · ×PMr
Nr
Mi )′
/(
i=1✤
PPP
PPP
✤
PP
ι PPPP
✤ ι̃
(
N
r
i=1 Mi
The compose of the A-linear maps ι̃ and ι̃′ make the following diagrams
commute:
Nr
N
( ri=1 Mi )′
i=1 Mi
♥♥7
♠♠6
ι′♠♠♠♠
ι♥♥♥♥
♥♥♥
♥♥♥
M1 × · · · ×PMr
PPP
PPP
P
ι PPPP
(
N
ι̃◦ι̃′
r
i=1 Mi
♠♠♠
♠♠♠
M1 × · · · ×◗Mr
◗◗◗
◗◗◗
◗◗◗
◗◗(
ι′
N
(
ι̃′ ◦ι̃
r
′
i=1 Mi )
6.3. TENSOR PRODUCTS
297
Since the identity transformations idM1 ⊗···⊗Mr and id(M1 ⊗···⊗Mr )′ also make
the diagrams commute, the uniqueness requirement in the universal property
implies that ι̃ ◦ ι̃′ =idM1⊗···⊗Mr and N
ι̃′ ◦ ι̃ =id(M1 ⊗···⊗M
N r )′ . Hence, these maps
give an isomorphism of A-modules ri=1 Mi ≃ ( ri=1 Mi )′ .
The A-module M1 ⊗ · · · ⊗ Mr is called the tensor product of the modules M1 , . . . , Mr . If (v 1 , . . . , v r ) ∈ M1 × · · · × Mr , the image ι(v 1 , . . . , v r ) ∈
M1 ⊗ · · · ⊗ Mr is denoted by v 1 ⊗ · · · ⊗ v r . Notice that in this notation, the
following identity holds:
′
′′
′
v 1 ⊗ · · · ⊗ (av i + bv i ) ⊗ · · · ⊗ v r = a(v 1 ⊗ · · · ⊗ v i ⊗ · · · ⊗ v r )
′′
+ b(v 1 ⊗ · · · ⊗ v i ⊗ · · · ⊗ v r ).
An element of M1 ⊗ · · · ⊗ Mr can be written as a linear combination of
elements of the form v 1 ⊗ · · · ⊗ v r , since we saw above that the set of all
such elements is a generating set. However, this representation is very far
from being unique, since ι is not injective.
Example 6.3.2.
In the tensor product (over Z) of Z2 with Z4 , we have the following relations:
0 = 0(n ⊗ m) = 0 ⊗ m = n ⊗ 0,
1 ⊗ 2 = 2(1 ⊗ 1) = 2 ⊗ 1 = 0 ⊗ 1 = 0,
1 ⊗ 1 = 1 ⊗ 1 + 1 ⊗ 2 = 1 ⊗ 3.
Hence, the only element in Z2 ⊗ Z4 which is possibly non-zero is 1 ⊗ 1 = 1 ⊗ 3.
But since the bilinear map Z2 × Z4 → Z2 , (n, m) 7→ nm, maps (1, 1) 7→ 1 6= 0,
we conclude that 1 ⊗ 1 6= 0, and then it follows that Z2 ⊗ Z4 ≃ Z2 .
Using the definition of tensor product we obtain immediately:
N
Proposition 6.3.3. (Properties of ) Given A-modules M , N , P and
{Mi }i∈I , there exist isomorphisms of A-modules:
(i) M ⊗ N ⊗ P ≃ (M ⊗ N ) ⊗ P ≃ M ⊗ (N ⊗ P ) such that v ⊗ w ⊗ z ↔
(v ⊗ w) ⊗ z ↔ v ⊗ (w ⊗ z) for any elements v ∈ M , w ∈ N and
z ∈ P;
(ii) M ⊗ N ≃ N ⊗ M such that v ⊗ w ↔ w ⊗ v, for any v ∈ M and
w ∈ N;
L
L
(iii) ( i∈I Mi ) ⊗ N ≃ i∈I (Mi ⊗ N ) such that (v i )i∈I ⊗ w ↔ (v i ⊗ w)i∈I ,
for any v i ∈ Mi , w ∈ N .
298
CHAPTER 6. MODULES
As we saw in the example above, in general the tensor product M ⊗ N
involves a large number of relations between elements of the form v ⊗ w.
However, int he case of free modules, there are only the “obvious relations”
among these elements, as shown by the following proposition:
Proposition 6.3.4. Let M and N be free A-modules, with bases {v i }i∈I
and {wj }j∈J . Then M ⊗ N is free, with basis {v i ⊗ wj }(i,j)∈I×J .
Proof. Obviously, {v i ⊗ wj }(i,j)∈I×J is a generating set. In order to see that
L
it is also linearly independent, consider the map φ : M × N → (i,j)∈I×J A
defined by:
X
X
φ(
ai v i ,
bj wj ) = (ai bj )(i,j)∈I×J .
i∈I
j∈J
Since φ is A-bilinear, there exists an A-linear map φ̃ : M ⊗N →
such that φ = φ̃ ◦ ι e
L
(i,j)∈I×J
A
φ̃(v k ⊗ wl ) = φ(v k , wl ) = (ekl )(i,j)∈I×J .
where (ekl )ij = 1, if (k, l) = (i, j), and (ekl )ij = 0, otherwise. The elements
(ekl )(i,j)∈I×J form a basis of ⊕(i,j)∈I×J A, hence, the set {v i ⊗ wj }(i,j)∈I×J
is linearly independent.
Corollary 6.3.5. Let M and N be free A-modules with dim M = m and
dim N = n. Then dim(M ⊗ N ) = mn.
If φi : Mi → Ni , i = 1, . . . , r are homomorphisms of A-modules, we have
the homomorphism T (φ1 , . . . , φr ) : M1 ⊗ · · · ⊗ Mr → N1 ⊗ · · · ⊗ Nr defined
as follows: T (φ1 , . . . , φr ) is the unique A-linear map satisfying:
T (φ1 , . . . , φr )(v 1 ⊗ · · · ⊗ v r ) = φ1 (v 1 ) ⊗ · · · ⊗ φr (v r ).
Since the right hand side defines a multilinear expression in the v 1 , . . . , v r
the universal property of tensor product shows that this map is well defined.
In the next results we use the fact that A is commutative to write
HomA (M, N ), EndA (M ) and M ∗ as left A-modules (see Exercises 8 and
9 in Section 6.1). Note that these are all free of finite dimension, provided
M and N are free of finite dimension.
Proposition 6.3.6. Let Mi and Ni , i = 1, . . . , r, be free A-modules of finite
dimension. There exists an isomorphism:
HomA (M1 , N1 ) ⊗ · · · ⊗ HomA (Mr , Nr ) ≃
≃ HomA (M1 ⊗ · · · ⊗ Mr , N1 ⊗ · · · ⊗ Nr ),
which maps the element φ1 ⊗ · · · ⊗ φr to T (φ1 , . . . , φr ).
6.3. TENSOR PRODUCTS
299
Proof. By the associativity property of the tensor product, it is enough to
prove the case r = 2. So let M1 , M2 , N1 and N2 be free A-modules with bases
′
′
′′
′′
′
′
′′
′′
{v 1 , . . . , v m1 }, {v 1 , . . . , v m2 }, {w1 , . . . , wn1 } and {w 1 , . . . , w n2 }, respectively. Define a basis {φij } of HomA (M1 , N1 ) and {ψkl } of HomA (M2 , N2 )
by the formulas:
′
φij (v a ) =
 ′
 wj

if
a = i,
if
a 6= i,
′′
ψkl (v b ) =
0
 ′′
 wl

0
if
b = k,
if
b 6= k.
By the previous proposition, {φij ⊗ ψkl } is a basis of HomA (M1 , N1 ) ⊗
HomA (M2 , N2 ). On the other hand, we see that
′
′′
T (φij , ψkl )(v a ⊗ vb ) =
 ′
′′
 wj ⊗ wl

0
if
(a, b) = (i, k),
if
(a, b) 6= (i, k).
Hence, {T (φij , ψkl )} is a basis of Hom(M1 ⊗ N1 , M2 ⊗ N2 ), and we conclude
that there exists an isomorphism of A-modules such that φ ⊗ ψ 7→ T (φ, ψ).
This result shows that in the case of free A-modules of finite dimension
we can write φ1 ⊗ · · · ⊗ φr instead of T (φ1 , . . . , φr ), without any ambiguity.
Corollary 6.3.7. Let M and N be free A-modules of finite dimension.
There exist isomorphisms:
(i) EndA (M ) ⊗ EndA (N ) ≃ EndA (M ⊗ N );
(ii) M ∗ ⊗ N ∗ ≃ (M ⊗ N )∗ .
These isomorphisms together with the following isomorphism lead to a
more concrete interpretation of the tensor product of free A-modules of finite
dimension.
Corollary 6.3.8. Let M and N be free A-modules of finite dimension.
There exists an isomorphism
M ∗ ⊗ N ≃ HomA (M, N )
mapping an element l ⊗ w to the homomorphism φl,w given by v 7→ l(v)w.
300
CHAPTER 6. MODULES
Proof. If {v 1 , . . . , v m } is a basis for M , let {l1 , . . . , lm } be the dual basis for
M ∗ defined by:

 1 if j = i,
li (v j ) =

0 if j 6= i.
Then, if {w1 , . . . , w n } is a basis for N , the tensor products {li ⊗ wk } form a
basis for M ∗ ⊗ N . On the other hand, the A-linear transformations φli ,wk ∈
HomA (M, N ) satisfy

 wk if j = i,
φli ,wk (v j ) = li (v j )w k =

0
if j 6= i,
hence the {φli ,wk } form a basis of HomA (M, N ), and there exists an isomorphism M ∗ ⊗ N ≃ HomA (M, N ), l ⊗ w 7→ φl,w .
This leads to the following interpretation of the tensor product of free
modules of finite dimension, which is often used in Geometry to characterize
tensor fields and differential forms:
Corollary 6.3.9. Let M and N be free A-modules of finite dimension.
There exists an isomorphism
M ⊗ N ≃ L(M ∗ , N ∗ ; A).
Proof. The diagram in the universal property of tensor product:
ι
/M
❘❘❘
❘❘❘
❘❘❘
❘❘❘
φ
❘❘❘
)
M × N❘
⊗✤ N
✤
φ̃
✤
A
determines an isomorphism:
L(M, N ; A) ≃ (M ⊗ N )∗ ,
φ 7→ φ̃.
In general, this isomorphism does not characterizes the tensor product, since
one can have M ⊗ N 6= {0} and (M ⊗ N )∗ = {0}. However, when M and
N are both free of finite dimension, we obtain:
L(M ∗ , N ∗ ; A) ≃ (M ∗ ⊗ N ∗ )∗ ≃ (M ∗ )∗ ⊗ (N ∗ )∗ ≃ M ⊗ N.
6.3. TENSOR PRODUCTS
301
It is well known that any vector space V over R can be seen as a vector
space over C. One can use tensor products to generalize this construction
and extend the ring of scalars of a given module, as follows.
Let A be a ring and à an extension of A (i.e., A is a subring of Ã). We
can view à as an A-module: if a ∈ A and b ∈ Ã, then the product ab is
defined using the multiplication in Ã. Hence, if M is any A-module, we can
form the A-module Mà = à ⊗A M (10 ) and define an operation of à on MÃ
by:
b(c ⊗ v) := (bc) ⊗ v.
One checks easily that MÃ with this new scalar multiplication operation is
an Ã-module. We say that MÃ is obtained from M by extension of the
ring of scalars. If φ : M → N is an A-linear map, then we obtain an
Ã-linear map φ̃ : MÃ → NÃ by setting:
φ̃(c ⊗ v) := c ⊗ φ̃(v).
Proposition 6.3.10. If M is a free A-module, then MÃ is a free Ã-module
and dimA M = dimà Mà .
Proof. If M ≃ ⊕i∈I A, then
MÃ = Ã ⊗A M
≃ Ã ⊗A (⊕i∈I A)
≃ ⊕i∈I (Ã ⊗A A) ≃ ⊕i∈I Ã,
where the last isomorphism is obtained from the isomorphism à → à ⊗A A,
a 7→ a ⊗ 1.
Examples 6.3.11.
1. If V is a vector space over R, then VC = V ⊗R C (sometimes called the
complexification of V ) is a vector space over C. If {v 1 , . . . , v n } is a basis of
V over R, then {1 ⊗ v1 , . . . , 1 ⊗ v n } is a basis of VC over C. Hence, if V ≃ Rn
then VC ≃ Cn .
2. If one extends the ring of scalars of the Z-module Z to Q, one obtains a
Q-module isomorphic a Q.
3. On the other hand, if one extends the ring of scalars of the Z-module Zn to
Q, one obtains a trivial Q-module (exercise).
10
Whenever one works with more than one ring at the same time, it is convenient to
indicate the ring as a subscript in the symbol of tensor product, so that it is clear in which
ring one is taking the tensor product.
302
CHAPTER 6. MODULES
There are many other constructions where tensor products, Hom and
duality play a crucial role. A few of them are covered in the exercises.
Exercises.
1. Check the basic properties of tensor products given in Proposition 6.3.3.
2. Let ρ1 : G → GL(V1 ) and ρ2 : G → GL(V2 ) be representations of the
same group G in vector spaces V1 and V2 . Show that there exists exactly one
representation ρ : G → GL(V1 ⊗ V2 ) satisfying the following property:
ρ(g)(v 1 ⊗ v 2 ) = ρ1 (g)(v 1 ) ⊗ ρ2 (g)(v 2 ).
3. Show that Zm ⊗Z Zn ≃ Zq . What is the expression of q in terms of m and
n?
4. Show that Q ⊗Z Zn is trivial.
5. Show that if
/ M1
0
/ M2
/ M3
/0
is a short exact sequence of A-modules (see Exercise 6, in Section 6.1) and N
is an A-module, then the sequence of A-modules
M1 ⊗ N
/ M2 ⊗ N
/ M3 ⊗ N
/0
is also exact. Give an example that shows that the first map in this sequence
may fail to be injective.
6. Let M be an A-module, and let R be the submodule de
by elements of the form
Nr
i=1
M generated
v i = v j for some i, j (i 6= j)
Nr
Denote by
M the quotient
i=1 M/R, and by v 1 ∧ · · · ∧ v r the
Vr module
image of v 1 ⊗ · · · ⊗ v r in
M . Show that:
Vr
v 1 ⊗ · · · ⊗ vr ,
(i) the A-multilinear map ι : M × · · · × M → M ∧ · · · ∧ M , (v 1 , . . . , v r ) 7→
v 1 ∧ · · · ∧ v r is alternating, i.e.,
ι(v σ(1) , . . . , v σ(r) ) = sgn σ · ι(v 1 , . . . , v r ),
∀σ ∈ Sr .
(ii) if φ : M × · · · × M → N is any A-multilinear alternating map, there
exists a unique A-linear map φ̃ : M ∧ · · · ∧ M → N making the following
diagram commute:
ι
/ M ∧ ···∧ M
M × ···× M
✤
❯❯❯❯
❯❯❯❯
✤
❯❯❯❯
❯❯❯❯
✤ φ̃
φ
❯❯❯* N
6.3. TENSOR PRODUCTS
303
(iii) The A-module M ∧ · · · ∧ M is determined by the universal property in
(ii) up to isomorphism.
V
n
(iv) If M is free of finite dimension n, then r M is free of dimension
r
if 1 ≤ r ≤ n, and of dimension 0 if r > n.
Vr ∗
(v) If M is free of finite dimension, then
M ≃ Ar (M ) (the module of all
alternating A-multilinear maps ϕ : M
· · × M} → A).
| × ·{z
r vezes
7. Let {Mi }i∈I be a family of A-modules where I is a partial ordered set which
satisfies the following condition 11 :
∀i, j ∈ I, ∃k ∈ I : i ≤ k and j ≤ k.
Assume also that for any pair i, j ∈ I with i ≤ j there exists an A-linear map
φji : Mi → Mj such that whenever i ≤ j ≤ k one has:
φkj ◦ φji = φki ,
φii = id.
Show that:
(a) there exists an A-module M and A-linear maps φi : Mi → M satisfying
the following universal property: if N is an A-module and ϕi : Mi → N
are A-linear maps such that ϕj ◦ φji = ϕi , there exists a unique A-linear
map ϕ : M → N making the following diagram commute:
φj
i
/ Mj
Mi✶ ❇
✶✶❇❇❇ φi
φj ⑤⑤✌✌
✶✶ ❇❇❇
⑤⑤ ✌
✶✶ ❇ }⑤⑤⑤ ✌✌
✶
✌✌
ϕi ✶ M
✶✶ ✤ ✌✌✌ ϕj
✶✶ϕ ✤ ✌✌
✶ ✤ ✌✌
N
S
Show also that M = i∈I φi (Mi ) is unique up to isomorphism. One calls
M the direct limit of the family {Mi , φji } and denotes it by lim Mi ;
−→
(b) if φi (v) = 0 for some v ∈ Mi , then there exists j ≥ i such that φji (v) = 0;
(c) if N is an A-module, then lim(Mi ⊗ N ) = (lim Mi ) ⊗ N ;
−→
−→
(d) if M1 ⊂ M2 ⊂ · · · ⊂ Mi ⊂ . . . are A-modules, find lim Mi .
−→
8. Define inverse limit of a directed family of A-modules, and show that it is
characterized by a universal property analogous to the direct limit where the
sense of the arrows is inverted.
11
A partial ordered set which satisfies this condition is said to be directed or filtered.
304
6.4
CHAPTER 6. MODULES
Modules over Integral Domains
In this section we assume that rings are commutative with a unit and the
cancellation law holds, i.e., they are integral domains. In linear algebra this
property of the ring leads to:
Proposition 6.4.1. Let M be a module over an integral domain D. Then
Tors(M ) is a submodule of M .
Proof. Recall that
Tors(M ) = {v ∈ M : existe a ∈ D com av = 0 and a 6= 0}.
Hence, if v 1 , v 2 ∈ Tors(M ), then there exist a1 , a2 ∈ D, non-zero, such that
a1 v 1 = 0 and a2 v 2 = 0. If d1 , d2 ∈ D, then
a1 a2 (d1 v 1 + d2 v 2 ) = a2 d1 a1 v 1 + a1 d2 a2 v 2 = 0,
with a1 a2 6= 0, for if a1 a2 = 0 then the cancellation law would imply that
a1 = 0 or a2 = 0. Hence, d1 v 1 + d2 v 2 ∈ Tors(M ).
One calls Tors(M ) torsion submodule of M . If M = Tors(M ), then
one says that M is a torsion module. If Tors(M ) = 0, i.e., if all elements
of M are free, then on says that M as module free of torsion.
The next proposition gives elementary properties of the torsion submodule and its proof is left as an exercise.
Proposition 6.4.2. Let D be an integral domain.
(i) If φ : M1 → M2 is a D-linear transformation, then
φ(Tors(M1 )) ⊂ Tors(M2 ).
If φ is injective, then φ(Tors(M1 )) = Tors(M2 ) ∩ Im(φ). If φ is surjective with ker(φ) ⊂ Tors(M1 ), then φ(Tors(M1 )) = Tors(M2 ).
(ii) If M is a D-module, then M/ Tors(M ) is D-module free of torsion.
(iii) If {Mi }i∈I is a family of D-modules, then
M
M
Tors(
Mi ) =
Tors(Mi ).
i∈I
i∈I
The following examples show that one should not confuse the concepts
of a free module and free of torsion.
6.4. MODULES OVER INTEGRAL DOMAINS
305
Examples 6.4.3.
1. If M is a free module over an integral domain, then Tors(M ) = 0 (exercise)
and M is free of torsion.
2. The Z-module Q is free of torsion, but Q is not a free Z-module.
3. The modules Zn are torsion Z-modules.
4. If V is a finite dimension vector space over K and T : V → V is a linear
transformation, then V is a torsion K[x]-module (exercise).
Let M be a module over an integral domain D and denote by K =
Frac(D) the field of fractions of D. Since K is an extension of D, we can
extend the ring of scalars of M to K, resulting in a vector space MK over
K. This vector space reflects the properties of M up to its torsion.
Proposition 6.4.4. Let M be a D-module, K = Frac(D), and φ : M → MK
the D-linear transformation v → 1 ⊗ v. Then:
(i) Every element of MK is of the form
v ∈ M.
1
d φ(v),
where 0 6= d ∈ D and
(ii) The kernel of φ is the torsion submodule Tors(M ).
Proof. In order to prove (i), observe that the module MK is generated by
all elements k ⊗ v, with k ∈ Frac(D) and v ∈ M . Hence, if w ∈ MK , then:
w=
n
X
i=1
ki ⊗ v i =
n
X
ai
i=1
bi
⊗ vi.
Denoting by d the product of the bi ’s, there exist ci ∈ D such that
hence:
!
n
X
1
1
w= ⊗
ci v i = φ(v).
d
d
ai
bi
=
ci
d,
i=1
The proof of (ii) is left for the exercises.
We call the vector space MK over K = Frac(D) the vector space associated with the D-module M . The previous proposition suggests the following
definition:
Definition 6.4.5. If M is a D-module and S ⊂ M , we call the rank of S
the dimension of the linear subspace MK generated by φ(S). In particular,
the rank of M is the dimension dim MK .
306
CHAPTER 6. MODULES
Recall that M is a finite type if it is finitely generated. The proposition
above shows that:
Corollary 6.4.6. Every D-module M of finite type has finite rank.
Proof. If S is a finite set generating M , then φ(S) is finite and contains a
basis of MK , hence dim MK < ∞. In particular, M has finite rank.
Notice that the rank of a D-module is an invariant: if M1 ≃ M2 , then
M1 and M2 have the same rank. The converse in general fails, so the rank
is certainly not a complete invariant.
Examples 6.4.7.
1. Since Q ⊗Z Q = Q the rank of Q, as a Z-module, is 1. Since Q is not finitely
generated over Z, the converse to the corollary does not hold.
2. Since Q ⊗Z Zn = {0}, the rank of Zn is zero. More generally, if M is a
torsion module, then its rank is zero.
Corollary 6.4.8. Let M be a D-module. Then {ei }i∈I ⊂ M is a linearly
independent family over D if and only if {1 ⊗ ei }i∈I ⊂ MK is a linearly
independent family over K.
Proof. If {ei }i∈I ⊂ M is a linearly independent family, the submodule N =
L
i∈I Dei is free of torsion, hence the restriction of φ : M → MK to N is
injective.
It follows from this corollary that if M is a free D-module then its rank
equals its dimension. On the other hand, if M is a free D-module then a
submodule N ⊂ M may fail to be free. In fact, we have:
Proposition 6.4.9. If D is an integral domain such that for any free Dmodule M the submodules N ⊂ M are all free, then D is a PID.
Proof. Since M = D is a free D-module, by assumption its submodules, i.e.,
the ideals I ⊂ D, must be free. Let I ⊂ D be an ideal. A basis of I contains
only one element, since any two elements a, b ∈ I are linearly dependent:
(−b)a + ab = 0.
If {d} is a basis of I, then I = hdi, so I is a principal ideal.
Actually, the principal ideal domains are characterized by the property
expressed in this proposition, since we have the following result:
6.4. MODULES OVER INTEGRAL DOMAINS
307
Theorem 6.4.10. If D is a PID and M is a free D-module, then any
submodule N ⊂ M is free and dim N ≤ dim M .
Proof. Let {ei }i∈I be a basis for M over D and {0} =
6 N ⊂ M a submodule.
If J ⊂ I, consider an ordered pair (NJ , BJ ′ ), where


M
Dej  ,
NJ = N ∩ 
j∈J
and BJ ′ = {f j }j∈J ′ is a basis of NJ , with J ′ ⊂ J. We denote by P the set
formed by all such ordered pairs. In P we have the partial order relation
given by:
(NJ1 , BJ1′ ) ≤ (NJ2 , BJ2′ )
⇔
J1 ⊂ J2 and BJ1′ ⊂ BJ2′ .
We claim that we can apply Zorn’s Lemma to (P, ≤):
(i) P is non-empty:
Since N 6= {0}, there exists
Ln J0 = {j1 , . . . , jn } ⊂ I
Ln−1
such that N ∩ i=1 Deji = {0} and N ∩ i=1 Deji 6= {0}. The set
{a ∈ D : aejn +
n−1
X
i=1
bi e j i ∈ N }
is an ideal in P
D, hence takes the form hd0 i. ItPfollows that there exists
n−1
n−1
f 0 = d0 ejn + i=1
b0i eji ∈ N . If v = aejn + i=1
bi eji ∈ N , we must
have a = kd0 and
v − kf 0 =
n−1
X
i=1
(bi − kb0i )eji ∈ N ∩
n−1
M
i=1
Deji = {0}.
We conclude that B = {f 0 } is a basis of N{J0 } , so P is non-empty.
(ii) Every ascending chain {(NJα , BJα′ )}α∈A in (P, ≤) has an upper bound:
It is enough to consider the pair ∪α∈A NJα , ∪α∈A BJα′ .
Zorn’s Lemma applied to (P, ≤) gives a maximal element (NJˆ, BJˆ′ ). To
finish the proof it is enough to show that Jˆ = I, for this implies that NJˆ = N ,
so that BJˆ′ is a basis for N .
Assume that I − Jˆ 6= ∅. Then there exists l ∈ I − Jˆ and a ∈ D such that
M
(6.4.1)
ael + v a ∈ N for some v a ∈
Dej .
j∈Jˆ
308
CHAPTER 6. MODULES
The elements a ∈ D which satisfy (6.4.1) form an ideal, which must be
principal: a ∈ hd0 i. We claim that BJˆ′ ∪ {f 0 }, where f 0 = d0 el + v d0 , is a
basis for NJ∪{l}
. Leting BJˆ′ = {f j }j∈Jˆ′ we have:
ˆ
is
(a) BJˆ′ ∪ {f 0 } is a generating set: In fact, every element of v ∈ NJ∪{l}
ˆ
of the form (6.4.1), hence:
v = ael + v a
= a′ d0 el + v a = a′ f 0 − a′ v d0 + v a ,
a′ ∈ D.
L
It follows that −a′ v d0 + v a ∈ N ∩ ( j∈Jˆ Dej ) = NJˆ, where
X
v = a′ f 0 +
aj f j ,
j∈Jˆ′
and so BJˆ′ ∪ {f 0 } is generating.
(b) BJˆ′ ∪ {f 0 } is linearly independent: given a linear combination


X
X
X
X
aj f j + af 0 =
bk e k
aj 
cjk ek  + ad0 el + a
j∈Jˆ′
j∈Jˆ′
=
X
k∈Jˆ
k∈Jˆ
k∈Jˆ


X
j∈Jˆ′

aj cjk + abk  ej + ad0 el .
we see that if this linear combination is zero, then ad0 = 0, hence
a = 0. Since the elements {f j } are linearly independent, also aj = 0
and the elements BJˆ′ ∪ {f 0 } are linearly independent.
Therefore the pair (NJ∪{l}
, BJˆ′ ∪{l} ) contradicts the maximality of (NJˆ, BJˆ′ ).
ˆ
ˆ as claimed.
We conclude that I = J,
One consequence of the previous theorem is that for modules of finite
type over a PID, the concepts “free” and “free of torsion” actually coincide:
Proposition 6.4.11. Let M be a module of finite type over a PID. If
Tors(M ) = 0, then M is free.
Proof. Let S be a finite generating set. In S choose a set B = {v 1 , . . . , v n }
which is maximal linearly independent in S. If v ∈ S, there exist av , a1 , . . . ,
an ∈ D (non-unique) such that
av v = a1 v 1 + · · · + an v n
(av 6= 0).
6.4. MODULES OVER INTEGRAL DOMAINS
309
Q
φ
Set a ≡ v∈S av . Since M is torsion free, the map φ : M →
LnM , w 7−→ aw,
is an injective homomorphism. We claim that φ(M ) ⊂
i=1 Dv i . Since
M = hSi it is enough to observe that if w ∈ S, then


Y
aw = 
av  aw w
v∈S−{w}

=
Y
v∈S−{w}

av  (a1 v1 + · · · + an v n ) ∈
n
M
Dv i .
i=1
Hence M is isomorphic to a submodule of a free module free. By Theorem
6.4.10, M is free.
Exercises.
1. Show that Proposition 6.4.2 holds
2. Show that, if M is a free D-module over an integral domain D, then M is
torsion free. Give an example of a free module N over a ring A such that
Tors(N ) 6= 0.
3. Let V be a finite dimensional vector space K and T : V → V a linear
transformation. Show that V is a torsion K[x]-module.
4. Give an example of a free module over a integral domain which admits submodules which are not free.
5. Let D be an integral domain and K = Frac(D), seen as a D-module. In
D − {0} consider the partial order defined by:
d1 ≤ d2 ⇐⇒ Kd1 ⊂ Kd2 ,
where Kd ⊂ K is a D-submodule { da : a ∈ D}.
(a) Show that K = lim Kd .
−→
(b) If M is a D-module and MK is the associated vector space show that
MK = lim(Kd ⊗ M ).
−→
(c) Conclude that 1 ⊗ v ∈ MK is zero vector if and only if v ∈ Tors(M ).
6. Let D be a PID. If M is a free D-module, N ⊂ M is a submodule, and
Tors(M/N ) = M/N , show that dim N = dim M .
310
6.5
CHAPTER 6. MODULES
Finitely Generated Modules over a PID
We will now study modules which are finitely generated over PID, eventually
obtaining its classification. As particular instances of this classification one
obtains the classification of finitely generated abelian groups and canonical
forms forms for linear transformations between finite dimensional vector
spaces.
Recall that in the previous section we have shown that for a PID D every
submodule of a free D-module is free, and that “free” and “free of torsion”
coincide for D-modules of finite type. This leads to the first step in the
classification of modules of finite type over a PID:
Theorem 6.5.1. Let M be a module of finite type over a PID. There exists
a free submodule L ⊂ M such that M = Tors(M ) ⊕ L and dim L = rank M .
Proof. The module M/ Tors(M ) is free of torsion and of finite type, hence
it is free. This means we can choose elements e1 , . . . , en ∈ M , linearly
independent, such that
M/ Tors(M ) =
n
M
Dπ(ei ),
i=1
where π : M → M/ Tors(M ) denotes the quotient map. Let L =
M . Then:
Ln
i=1 Dei
⊂
(a) L∩Tors(M ) = {0}: If v ∈ L∩Tors(M ) there exist scalars d, d1 , . . . , dn ∈
D, with d 6= 0, such that
dv = 0,
v=
n
X
di ei .
i=1
Hence (dd1 )e1 + · · · + (ddn )en = 0 and we conclude that dd1 = · · · =
ddn = 0. By the cancellation law, d1 = · · · = dn = 0, and so v = 0.
(b) M = L + Tors(M ): If v ∈ M define d1 , . . . , dn ∈ D by
π(v) =
n
X
di π(ei ).
i=1
Then v = vT + vL , where v L =
N (π) = Tors(M ).
Pn
i=1 di ei
∈ L and v T = v − v L ∈
6.5. FINITELY GENERATED MODULES OVER A PID
311
By (a) and (b), we have M = Tors(M )⊕L. If K = Frac(D) and φ : M →
MK is the canonical injection, the restriction of φ to L is injective. Since
φ(L) generates MK , we conclude that the rank of M equals the dimension
of L.
In the above decomposition the free factor L is not unique. This is clear
from the proof, where a different choice of a basis {e1 , . . . , en } may lead to
a different factor. However, its dimension coincides with the rank of M and
hence it is an invariant.
The rank of M classifies, up to isomorphism, the free part of M . To
complete the classification, one needs to classify the finitely generated torsion
modules. This will be done in the next paragraphs.
6.5.1
Diagonalization of matrices with entries in a PID
Recall that Mn (D) denotes the ring of n × n matrices with entries in a PID
D. The following result will be very useful to find special basis in a free
module:
Proposition 6.5.2. Let A ∈ Mn (D). There exist invertible matrices P, Q ∈
Mn (D) such that:


d1
0


..
Q−1 AP = 
,
.
0
dn
where d1 | d2 | · · · | dn . The di ’s are unique up to multiplication by units.
Note that this result does not say that a matrix can be diagonalized by
a change of basis: in general, the matrices P and Q are not related in any
way.
The normal form for a matrix given by Proposition 6.5.2 can be obtained
by performing elementary operations in the rows and columns of the matrix,
as in the usual Gauss elimination. Let Eij denote the matrix whose entries
are all zero with the exception of the entry (i, j) which equals 1 ∈ D. Right
and left multiplication by the following invertible matrices leas to the usual
elementary operations:
• Exchange of columns (rows): Pij = I − Eii − Ejj + Eij + Eji ;
• Multiplication of columns (rows) by units: Di (u) = I + (u − 1)Eii
(u ∈ D a unit);
312
CHAPTER 6. MODULES
• Addition of a multiple of a column (row) to another column (row):
Tij (a) = I + aEij (a ∈ D).
Also, we define the length δ(d) os an element 0 6= d ∈ D to be the number
of prime factors in a prime factorization of d.
Given any n × n matrix A = (aij ) we would like to show that A is
equivalent12 to a diagonal matrix. If A = 0, there is nothing to prove.
If not, then there is a non-zero entry of A of minimal length, and using
elementary operations we can transport it to the (1, 1) entry. Let a1k be
some entry such that a11 ∤ a1k . By exchange of columns 2 and k, we can
assume that this entry is a12 . If d = gcd(a11 , a12 ), then there exist elements
p, q ∈ D such that pa11 + qa12 = d. If we let r = a12 d−1 e s = a11 d−1 , then
we see that the matrices




p r
s r
 q −s

 q −p









−1
1
1
P =
P =
,
,




..
..




.
.
1
1
are inverse to each other. Right multiplication of A = (aij ) by the matrix P gives an equivalent matrix whose first row is (d, 0, a13 , . . . , a1n ) and
δ(d) < δ(a11 ). Similarly, if a11 ∤ ak1 , we can obtain a new equivalent matrix
where the new (1, 1) entry d has length δ(d) < δ(a11 ), so the minimal δ has
decreased again. Since δ takes values in N, we can repeat this process a
finite number of times leading to an equivalent matrix where a11 | a1k and
a11 | ak1 , for all k = 2, . . . , n. Performing elementary operations we obtain
finally an equivalent matrix of the form:


d1 0 . . .
0
 0 â22 . . . â2n 


 ..
..  .
..
..
 .
.
. 
.
0
ân2 . . .
ânn
Obviously we can repeat this process again with the second row and column,
etc., so we see that the original matrix is equivalent to a diagonal matrix:


d1
0


..
.

.
0
dn
12
In the following we will say that two matrices A and B are equivalent if there exist
invertible matrices P and Q such that B = P AQ.
6.5. FINITELY GENERATED MODULES OVER A PID
313
If d1 ∤ d2 , then we can add the second row to the first row and repeat
the all process again. Since δ always decreases, we will eventually obtain a
diagonal matrix where d1 | d2 . It should now be clear that we can obtain
an equivalent diagonal matrix where d1 | d2 | · · · | dn .
The elements d1 , . . . , dn in the normal form given by Proposition 6.5.2
are called invariant factors of the matrix A. The uniqueness of the
invariant factors follows from the following result which also gives a more
efficient method to compute them than the elimination that we used above
to prove existence. We leave the proof as an exercise.
Lemma 6.5.3. Let A ∈ Mn (D) and assume that A is equivalent to a diagonal matrix


d1
0


..
,

.
0
dn
with d1 | d2 | · · · | dn . If the rank of A is r, then di = 0, for i > r, and
i
di = ∆∆i−1
for i ≤ r, where ∆0 = 1 and ∆i is the greatest common divisor of
the minors of A of size i.
The formulas in this lemma, lead immediately to the following corollary:
Corollary 6.5.4. The invariant factors are unique up to multiplication by
units. Two matrices are equivalent if and only if they possess the same
invariant factors.
Example 6.5.5.
Let D = C[x] and consider the matrix

x−2 0
A =  −1 x
−2
4

0
−1  .
x−4
Computing the minors given in Lemma 6.5.3, we obtain
∆1 = 1,
∆2 = x − 2,
∆3 = (x − 2)3 .
Hence: d1 = 1, d2 = (x − 2) and d3 = (x − 2)2 . In fact, the method of
314
CHAPTER 6. MODULES
elimination used above gives the following invertible matrices:

0
−1
 −1 −x + 2
1
x−4
6.5.2

0
x−2
0   −1
1
−2


0
0
1 −1 0
x
−1   0
0 1 
4 x−4
0
1 x


1
0
0
.
0
= 0 x−2
0
0
(x − 2)2
Decomposition into invariant factors
Let M be a module over a PID D. For each v ∈ M , one defined the order
ideal of v by
ann v ≡ {d ∈ D : dv = 0}.
This is a principal ideal: ann v = hai, and the element a ∈ D is called the
order of v. The order is defined up to multiplication by units. Obviously,
the cyclic submodule hvi is isomorphic to D/ ann v.
Example 6.5.6.
Let G be an abelian group, viewed as a Z-module. If g ∈ G, then the cyclic
subgroup hgi generated by g is isomorphic to Z/ ann g. The order of g, as
defined above, coincides with the usual notion of order up to a sign (the units
in this case are ±1).
We can now state and prove our first version of the classification of
modules of finite type over a PID:
Theorem 6.5.7 (Invariant factors decomposition). Let D be a PID and let
M be a module of finite type over D. Then
M = hv 1 i ⊕ · · · ⊕ hv k i,
where ann v 1 ⊃ ann v 2 ⊃ · · · ⊃ ann v k . Letting ann v i = hdi i, there is an
isomorphism
M ≃ D/hd1 i ⊕ · · · ⊕ D/hdk i
where d1 | d2 | · · · | dk . The ideals hdi i are uniquely determined by M .
Proof. If rank M = r then
M ≃ Tors(M ) ⊕ D
· · ⊕ D},
| ⊕ ·{z
r terms
6.5. FINITELY GENERATED MODULES OVER A PID
315
hence it is enough to prove the result for a torsion module M .
Let {w 1 , . . . , wn } be a finite set of generators of M . Denote by L the
free module generated by the wi ’s. In L there is a basis {ŵ 1 , . . . , ŵ n } such
that π(ŵ i ) = wi , where π : L → M is the projection. Let R be the kernel
of π, so that M ≃ L/R. Since R is a free submodule of L and since M is
torsion, we must have dim R = dim L = n (see Exercise 6.4.6).
Let {v̂ 1 , . . . , v̂ n } be a basis of R, so there exist scalars aij ∈ D satisfying
the relations
X
v̂ i =
aji ŵ j ,
i = 1, . . . , n.
j
Changing basis in L and R,
X
qji ŵ j ,
ŵ′i =
v̂ ′i =
X
pji v̂ j ,
j
j
we obtain new relations
v̂ ′i =
X
bji ŵ′j ,
i = 1, . . . , n,
j
and its is immediate to verify that the matrices A = (aij ), B = (bij ), P =
(pij ) and Q = (qij ) are related by
B = Q−1 AP.
As we saw above, we can choose invertible matrices P and Q (or, equivalently, we can choose bases of L and R) such that B = diag(d1 , . . . , dn ) with
d1 | d2 | · · · | dn . In this case:
v̂ ′i = di ŵ′i ,
i = 1, . . . n.
If w′i = π(ŵ ′i ), we claim that
M = hw′1 i ⊕ · · · ⊕ hw′n i.
Since ann w′i = hdi i, this
the existence part of the proposition.
P proves
′
Obviously, M = i hwi i, since the ŵ ′i form a generating set of L and
π : L → M is
Psurjective. Hence, to prove the claim, it is enough to show
that hw′k i ∩ i6=k hw ′i i = {0}. Let w be an element in this intersection.
Then there exist ai ∈ D such that
X
w = ak w′k =
ai w′i .
i6=k
316
CHAPTER 6. MODULES
It follows that in L we have:
ak ŵ′k −
X
i6=k
ai ŵ′i ∈ R
so we conclude that there exist bi ∈ D such that ai = bi di , i = 1, . . . , n. But
then w = ak w′k = π(ak ŵ ′k ) = π(bk dk ŵ′k ) = π(bk v̂ ′k ) = 0.
The proof of uniqueness will be given later.
Note that in the decomposition above we can assume that the v k 6 0, i.e.,
that the di are non-invertible. The elements d1 , . . . , dk in the decomposition
above are then defined up to multiplication by a unit, and are called the
invariant factors of the module M .
Corollary 6.5.8. Two modules of finite type over a P ID are isomorphic if
and only if they have the same invariant factors.
6.5.3
Decomposition into primary factors
There is an alternative way of encoding the classification of modules over a
PID, which is based on the prime factorization of elements of D, and which
we now explain.
Recall that a PID is a UFD so that any 0 6= a ∈ D can be expressed in
the form
a = u · p1 · · · pn ,
where u ∈ D is a unit and the pi ∈ D are primes. This factorization is
unique, up to the order of the factors and multiplication by units. Recall
that a, b ∈ D are associate, i.e., differ by multiplication by a unit, we write
a ∼ b.
Lemma 6.5.9. Let D be a PID and assume that gcd(a, b) = 1. Then:
D/habi ≃ D/hai ⊕ D/hbi.
We leave the proof as an exercise. Using this lemma, we can show:
Theorem 6.5.10 (Primary factors decomposition). Let D be a PID and let
M be a module of finite type over D. Then:
M = L ⊕ hw1 i ⊕ · · · ⊕ hw n i ≃ L ⊕ D/hp1 m1 i ⊕ · · · ⊕ D/hpn mn i,
where L is a free submodule with dim L = rank M , ann wi = hpi mi i, and the
elements p1 , . . . , pn ∈ D are all primes. The ideals hp1 m1 i, . . . , hpn mn i are
uniquely determined, up to its order, by M .
6.5. FINITELY GENERATED MODULES OVER A PID
317
Proof. Consider the decomposition of M into invariant factors:
M = hv 1 i ⊕ · · · ⊕ hv k i.
If ann vi = hdi i, then d1 | d2 | · · · | dk and dk−r+1 = · · · = dk = 0, where
r = rank M . Hence, we have
hv k−r+1 i ⊕ · · · ⊕ hv k i = L,
where L is free of dimension r. On the other hand, if p1 m1 , . . . , pn mn are
the prime powers in the prime factorizations of d1 , . . . , dk−r , Lemma 6.5.9
shows that
hv 1 i ⊕ · · · ⊕ hv k i ≃ D/hp1 m1 i ⊕ · · · ⊕ D/hpn mn i ⊕ L.
The proof of uniqueness will be given later.
The elements p1 m1 , . . . , pn mn associated with the module M are defined
up to multiplication by units. They are called the elementary divisors
of M . The elementary divisors together with the rank form a complete set
of invariants of M :
Corollary 6.5.11. Two modules of finite type over a PID are isomorphic
if and only if they have the same list of elementary divisors and the same
rank.
The proof of the theorem shows that the decomposition of M into invariant factors determines a decomposition of M into primary factors.
Conversely, let
M ≃ L ⊕ D/hp1 m1 i ⊕ · · · ⊕ D/hpn mn i
be a decomposition de M into primary factors. Let p1 , . . . , ps be the distinct
(i.e., non-associate) primes appearing in this decomposition. We arrange the
prime powers in the decomposition as follows:
(6.5.1)
p1 n11
p1 n21
..
.
p2 n12
p2 n22
..
.
···
···
ps n1s
ps n2s
..
.
p1 nt1
p2 nt2
···
ps nts ,
where n1i ≤ n2i ≤ · · · ≤ nti , i = 1, . . . , s. Note that one may eventually
need to add some factors 1 = pi 0 . We let dj be the product of all the prime
318
CHAPTER 6. MODULES
power in row j, i.e., dj ≡ p1 nj1 · p2 nj2 · · · ps njs . Then d1 | d2 | · · · | dt , and
since the prime powers in each dj correspond to distinct primes they are
relatively prime, so the preceding lemma gives an isomorphism
Tors(M ) ≃ D/hd1 i ⊕ · · · ⊕ D/hdt i.
If the dimension of the free part L is r, then we add to the list of dj ’s the
elements dt+1 = · · · = dt+r = 0, and this leads to the decomposition of M
into invariant factors.
Given the list of elementary divisors {pi nji }, the invariant factors dk are
determined by the process above. Conversely, given the list of invariant factors {dk }, the elementary divisors {pi nji } are the prime powers in the prime
factorizations of the dk ’s. Hence, the uniqueness of the ideals hd1 i, . . . , hdk i
follows from the uniqueness of the ideals hp1 m1 i, . . . , hpn mn i, which we will
study next.
6.5.4
Primary components and uniqueness of decompositions
If M is a D-module and p ∈ D is a prime, the p-primary component of
M is the submodule
M (p) = {v ∈ M : pk v = 0, for some k ∈ N}.
It is an easy exercise to check that if Tors(M ) = M , then
M
M=
M (p).
p prime
Since M is of finite type, only a finite number of terms in this direct sum is
non-zero.
Given a primary decomposition:
M = L ⊕ D/hp1 m1 i ⊕ · · · ⊕ D/hps ms i
note that the primary components of M are:
M
M (p) =
D/hpi mi i.
{pi : pi ∼p}
Hence, if we are given two primary decompositions
M = L ⊕ D/hp1 m1 i ⊕ · · · ⊕ D/hps ms i
= L ⊕ D/hp̃n1 1 i ⊕ · · · ⊕ D/hp̃nt t i,
6.5. FINITELY GENERATED MODULES OVER A PID
319
we have that:
M (p) =
M
{pi : pi ∼p}
D/hpi mi i =
M
{p̃i :p̃i ∼p}
D/hp̃ni i i.
Hence, to prove the uniqueness of the primary decomposition of M , we
can assume that M = M (p) and that the primes appearing in the two
decompositions coincide:
M (p) = D/hpm1 i ⊕ · · · ⊕ D/hpms i
= D/hpn1 i ⊕ · · · ⊕ D/hpnt i
Then we can order the decompositions so that m1 ≤ m2 ≤ · · · ≤ ms and
n1 ≤ n2 ≤ · · · ≤ nt . If v s ∈ M is such that ann vs = hpms i, then the second
decomposition gives pnt v s = 0, so nt ≥ ms . Similarly, we find that ms ≥ nt ,
so we must have ms = ns . The quotient module M (p)/hv s i admits the
decompositions
M (p)/hv s i ≃ D/hpm1 i ⊕ · · · ⊕ D/hpms−1 i
≃ D/hpn1 i ⊕ · · · ⊕ D/hpnt−1 i
By exhaustion, we conclude that mi = ni and s = t. This proves the
uniqueness of the decompositions into primary factors and invariant factors.
Exercises.
1. An ideal I in a commutative ring A is called a primary ideal if ab ∈ I
implies that either a ∈ I or bn ∈ I, for some n ∈ N. Show that if D is a UFD
and p ∈ D is a prime, then hpn i is a primary ideal. This justifies the name
“primary decomposition” used in Theorem 6.5.10.
2. Deduce the formulas for the invariant factors given in Lemma 6.5.3.
3. Prove Lemma 6.5.9.
4. Determine diagonal matrices equivalent to the matrices
36 12
(a)
over Z;
16 18


x − 1 −2
−1
x
1  over R[x].
(b)  0
0
−2 x − 3
320
CHAPTER 6. MODULES
5. Show that if p ∈ N is a prime, the following two matrices in Mn (Zp ) are
equivalent:

0
 0




 0
1
1
0
0
1
..
.
···
···
0
0
6. Show that M =
···
···
L

0
0 


,

1 
0
p prime M (p)

1 1
 0 1




 0 0
0 0
0
1
..
.
···
···
···
···
1

0
0 


.

1 
1
if and only if Tors M = M .
7. If M = D/hp1 p2 2 p3 i ⊕ D/hp1 p2 3 p3 2 p4 i ⊕ D/hp1 3 p2 2 p4 5 i is a module over a
PID D, where p1 , . . . , p4 are non-associate primes, determine the decompositions of M into invariant factors and primary factors.
8. Let M1 and M2 modules of finite type over a PID D.
(a) Show that if M1 and M2 are cyclic, then M1 ⊗ M2 is cyclic.
(b) Determine the decompositions of M1 ⊗ M2 into invariant and primary
factors in terms of the corresponding decompositions of M1 and M2 .
9. Let M1 and M2 be cyclic modules over a PID D, of order a and b, respectively.
Show that if gcd(a, b) 6= 1, then the invariant factors of M1 ⊕ M2 are gcd(a, b)
and lcm(a, b).
6.6
Classification of Abelian Groups and Canonical Forms for Matrices
We will now show how the structure theory that we developed for modules
of finite type over a PID can be used to classify finitely generated abelian
groups and to establish canonical forms for linear transformations between
vector spaces. This correspond, of course, to consider the cases where D = Z
and D = K[x], with K a algebraically closed field. When D = Z, the
units are ±1, so every non-zero ideal has a non-negative integer as unique
generator. If D = K[x] the units are are the constant, non-zero, polynomials,
so every non-zero ideal has a monic polynomial as a unique generator. Hence,
we have unique representatives for the invariant factors and the elementary
divisors, and the classifications becomes particularly nice.
6.6. CLASSIFICATIONS
6.6.1
321
Classification of Finitely Generated Abelian Groups
Let G be group. We say that G is of finite type if there exist elements
g1 , . . . , gm ∈ G such that
g = g1 n1 · · · gm nm .
∀g ∈ G, ∃n1 , . . . , nm ∈ Z :
If G is an abelian group, then G is of finite type if and only if G is a Zmodule of finite type. Since Z is a PID, the classifications theorems in the
previous sections immediately yield:
Theorem 6.6.1 (Classification of abelian groups of finite type). Let G be
an abelian group of finite type. Then
G ≃ Zd1 ⊕ · · · ⊕ Zdn ,
where d1 , . . . , dn are non-negative integers uniquely determined by the condition d1 | d2 | · · · | dn . If p1 n1 , . . . , ps ns are the prime powers in the prime
factorization of the non-zero di ’s,then
G ≃ Zp1 n1 ⊕ · · · ⊕ Zps ns ⊕ Zr ,
where r is the number of d′i s equal to zero, i.e., the rank of G.
The integers di (respectively, pi ni ) are called the invariant factors (respectively, elementary divisors) of the abelian group G. Together with the
rank, they both form two distinct complete set of invariants of an abelian
group. Notice that one can find one list of invariants, given the other list.
Examples 6.6.2.
1. If n ∈ N admits a prime factorization n = p1 n1 · · · ps ns , then
Zn ≃ Zp1 n1 ⊕ · · · ⊕ Zps ns .
In this case, there exists only one invariant factor n and the elementary divisors
are the powers p1 n1 , . . . , ps ns .
2. Let G = Z6 ⊕ Z15 ⊕ Z18 . The primary decomposition de G is
G ≃ Z2 ⊕ Z3 ⊕ Z3 ⊕ Z5 ⊕ Z2 ⊕ Z32
so the list of elementary divisors is {2, 2, 3, 3, 32, 5}. We can obtain the list of
invariant factors from Table 6.5.1, which in this case reads:
20
2
2
3
3
32
50
50
5.
322
CHAPTER 6. MODULES
The invariant factors are the products of the prime powers in each line of this
table: d1 = 3, d2 = 6, d3 = 90. Hence, the decomposition of G into invariant
factors is:
G ≃ Z3 ⊕ Z6 ⊕ Z90
6.6.2
Jordan Canonical Form
Let K be an algebraically closed field. Let V be a finite-dimensional vector
space over K, and T : V → V a linear transformation. The K[x]-module
structure on V defined by T is defined by: p(x) = an xn + · · · + a0 ∈ K[x]
and v ∈ V , then
p(x) · v ≡ an T n (v) + · · · + a0 v.
We can use this K[x]-module structure to obtain information about T : for
example, Ṽ ⊂ V is a K[x]-submodule if and only if Ṽ linear subspace of V ,
invariant under T (see Exercise 6.1.1).
Since K[x] is a PID and V is of finite type, one of the classification
theorems for modules over PID leads to the following:
Theorem 6.6.3 (Jordan Canonical Form). Let T : V → V be a linear transformation of a finite-dimensional vector space over an algebraically closed
field K. There exists a basis {e1 , . . . , en } for V where the matrix representing the linear transformation T is


J1
0


..
J =
,
.
0
Jm
where Ji is a (ni × ni ) matrix of the form

λi 1

0


.
.

.


0




.

1 
λi
Proof. Since K is algebraically closed a polynomial p(x) ∈ K[x] is prime if
and only if p(x) = x − λ. Hence, the decomposition of V as a K[x]-module
into primary factors takes the form:
V ≃ V1 ⊕ · · · ⊕ Vm ,
6.6. CLASSIFICATIONS
323
where Vi ≃ hv i i are T -invariant subspaces, and ann(v i ) = h(x − λi )ni i. The
vectors
{(x − λi )ni −1 v i , . . . , (x − λi )v i , v i }
form a basis for Vi over K (exercise), and the matrix of T relative to this
basis is precisely Ji .
To compute the the Jordan canonical form one needs only to know the
elementary divisors (or the invariant factors) of the K[x]-module V . These
can be determined as follows, as shown by inspection of the proof of Theorem
6.5.7: Let {f 1 , . . . , f n } be a basis of V over K, and let A = (aij ) be the
matrix representing T in this basis. The set {f 1 , . . . , f n } generates V as a
K[x]-module. Forming the free module L generated by these elements, we
obtain the surjective homomorphism π : L → V which has kernel R. The
elements
X
ei := xf i −
aji f j
j
form a basis for R (as a K[x]-module) and the invariant factors of V can be
obtained by applying Proposition 6.5.2 to the matrix


x − a11 −a12 · · ·
−a1n
 −a21 x − a22 · · ·
−a2n 



.
..
..
.
.


.
.
.
−an1
−an2
···
x − ann
In other words, we have an equivalent diagonal matrix


1


..


.
0




1

,


d1 (x)




..


.
0
ds (x)
where d1 (x) | · · · | ds (x) are the invariant factors.
Example 6.6.4.
Let T : C3 → C3 be the linear transformation defined relative to the canonical
basis by the matrix


2
0 0
0 1 .
A= 1
2 −4 4
324
CHAPTER 6. MODULES
As we saw in Example 6.5.5, we have

0
−1
 −1 −x + 2
1
x−4

0
x−2
0   −1
1
−2


0
0
1 −1 0
x
−1   0
0 1 
4 x−4
0
1 x


1
0
0
,
0
= 0 x−2
0
0
(x − 2)2
and so the elementary divisors are (x − 2)
Jordan canonical form of T is

2 0
J = 0 2
0 0
and (x − 2)2 . We conclude that the

0
1 .
2
The Jordan canonical form is a consequence of the decomposition of the
K[x]-module V into primary factors. The decomposition of V into invariant
factor leads to another canonical form known as the rational canonical
form13 (see Exercise 6, below).
Exercises.
1. Determine all abelian groups of order 120.
2. If K is an algebraically closed field of characteristic zero, show that U = {r :
r is a root of xn − 1 = 0} is an abelian group isomorphic to Zn .
3. Let T : V → V be a linear transformation of a finite-dimensional vector space
over a field K and assume that V ≃ hvi (as K[x]-module), where ann(v) =
h(x − λ)m i. Show that the elements
{(x − λ)m−1 v, . . . , (x − λ)v, v}
form a basis of V over K.
4. Determine the Jordan canonical form of the matrices:


−1
1 −2
4 ,
(a) A =  0 −1
0
0
1
13
Actually, the rational canonical form, contrary to the Jordan canonical form, does not
require K to be algebraically closed.
6.6. CLASSIFICATIONS

0
 1

(b) B = 
0
0
0
0
1
0
325

0 −8
0
16 
.
0 −14 
1
6
5. Let T : V → V be a linear transformation of a finite-dimensional vector space
over a field K, and let d1 (x) | · · · | ds (x) be the invariant factors of K[x]-module
V . One calls m(x) = ds (x) the minimal polynomial and p(x) = d1 (x) · · · ds (x)
the characteristic polynomial of the linear transformation T .
(a) Show that m(x) 6= 0, m(T ) = 0 and that if q(x) is a polynomial such
that q(T ) = 0 then m(x)|q(x);
(b) Show that p(x) 6= 0, p(T ) = 0 and that p(x) = det(xI − T ).
6. (Rational Canonical Form) Let T : V → V be a linear transformation
of a finite-dimensional vector space over a field K. Using the decomposition
of V into invariant factors, as a K[x]-module, show that there exists a basis
{e1 , . . . , en } for V over K where the matrix representing T is


R=
R1
0
..
.
0
Rm


,
where Ri is a (ni × ni ) matrix of the form

0 0

 1 0





0
..
.
0
1
−a0
..
.
..
.
−ani −2
−ani −1




.



One calls R the rational canonical form of the linear transformation T .
7. Recall that a scalar linear differential ordinary equation (o.d.e)
dn−1 y
dy
dn y
+ an−1 n−1 + · · · + a1
+ a0 y = 0
n
dt
dt
dt
is equivalent to a system of first order linear o.d.e’s:
dx
= xA,
dt
326
CHAPTER 6. MODULES
where x = (y, y ′ , y ′′ , . . . , y (n−1) ) and A is

0 0

 1 0


..

.


0
0
1
the companion matrix

−a0

..

.

.
..

.

−an−2 
−an−1
Using the rational canonical form, show the following converse: every system
of first order linear o.d.e’s is equivalent to a system of decoupled scalar linear
o.d.e.’s.
6.7
Categories and Functors
Many of the algebraic structures that we have been studying, albeit distinct,
often admit similar properties. We have seen that groups, rings or modules
share common constructions and that the methods of proof are the same.
We shall now see that one can formalize these similarities in a very precise
way. This relies on the following fundamental concept:
Definition 6.7.1. A category C consists of:
(i) A class of objects.
(ii) For each pair of objects (X, Y ), a set Hom(X, Y ), whose elements are
called morphisms.
(iii) A map Hom(X, Y ) × Hom(Y, Z) → Hom(X, Z), called composition
of morphisms.
The image of the pair (φ, ψ) under composition of morphisms will be denoted
by ψ ◦ φ, and the following properties must hold:
(C1) Associativity: if φ ∈ Hom(X, Y ), ψ ∈ Hom(Y, Z) and τ ∈ Hom(Z, W ),
then τ ◦ (ψ ◦ φ) = (τ ◦ ψ) ◦ φ.
(C2) Identity: for any object X, there exists a morphism 1X ∈ Hom(X, X)
which satisfies
1X ◦ φ = φ and ψ ◦ 1X = ψ,
for any φ ∈ Hom(W, X) and ψ ∈ Hom(X, Y ), where Y and W are
arbitrary objects.
6.7. CATEGORIES AND FUNCTORS
327
Examples 6.7.2.
1. The category S of sets is the category where the objects are the sets, the
morphisms Hom(X, Y ) are the maps φ : X → Y , and the composition of
morphisms is the usual composition of maps.
2. The category G of groups is the category where the objects are the groups,
the morphisms Hom(G, H) are the group homomorphisms φ : G → H, and the
composition of morphisms is the composition of homomorphisms.
3. The category MA of (left/right) modules over a ring A is the category where
the objects are the (left/right) A-modules, the morphisms Hom(M, N ) are the
A-linear transformations φ : M → N , and the composition of morphisms is
the composition of linear transformations.
4. The category T of topological spaces is the category where the objects are the
topological spaces, the morphisms Hom(X, Y ) are the continuous maps, and
the composition of morphisms is the usual composition of continuous maps.
The first example above shows that, in general, the objects of a category
form a classe, rather than a set: one cannot define a “set of all sets” without
getting into paradoxs (e.g., is the set of all sets a set?). This distinction,
whose complete justification requires a deep study of Set Theory 14 , means
that to classes one does not apply the usual operations over sets, such as
passing to subsets. A category where all the objects are elements of a set is
called a small category.
It maybe convenient of think of a morphism φ ∈ Hom(X, Y ) as an arrow
from X to Y . However, it is important to note that, in spite of its name,
a morphism φ ∈ Hom(X, Y ) does not need to be a map from X to Y , as
shown in the following simple example:
Example 6.7.3.
Fix a group G. Let C be the category with only one object {∗} and where the
morphisms Hom(∗, ∗) are the elements of G. The composition of two morphisms is the multiplication in the group G. If G is a non-trivial group, the
morphisms are not maps between the objects.
A concrete category is a category C where every morphism φ ∈
Hom(X, Y ) is actually a map X → Y , where the identity morphism 1X ∈
14
See, e.g., P. R. Halmos, Naive Set Theory, Undergraduate Texts in Mathematics,
Springer-Verlag, 1974.
328
CHAPTER 6. MODULES
Hom(X, Y ) is the identity map X → X, and where composition of morphisms is the usual composition of maps. Most of the categories that we
have found up to now are concrete categories. In any case, we will always
represent a morphism φ ∈ Hom(X, Y ) symbolically as φ : X → Y , keeping
in mind that φ is not necessarily a map from X to Y .
Lets us give some elementary properties of categories.
Proposition 6.7.4. In a category C, for each object X, the identity morphism 1X is unique.
Proof. In fact, if 1X and 1′X are two identities in X, then by the identity
property (C2) applied to both 1X and 1′X , we find
1X ◦ 1′X = 1′X and 1X ◦ 1′X = 1X .
Hence, 1X = 1′X .
In a category C, given a morphism f : X → Y , we say that g : Y → X
is a left inverse of f if
g ◦ f = 1X .
Similarly, one defines a right inverse of f .
Proposition 6.7.5. If f : X → Y had both a left inverse g and a right
inverse g′ , then g = g ′ .
Proof. From the definition of left/right inverse we obtain:
(g ◦ f ) ◦ g ′ = 1X ◦ g ′ = g ′ ,
g ◦ (f ◦ g ′ ) = g ◦ 1Y = g.
Therefore, by the associativity property (C1), we find that g = g′ .
When f : X → Y has both a right and a left inverse g, one calls g
the inverse of f and writes g = f −1 . In this case we say that f is an
isomorphism in the category.
Examples 6.7.6.
1. In Examples 6.7.2 the isomorphisms are the bijections (in the category of
sets), the group isomorphisms (in the category of groups), the linear isomorphisms (in the category of modules over a ring) and the homeomorphisms (in
the category of topological spaces).
2. In Example 6.7.3 every morphism is an isomorphism. A small category where
every morphism is invertible is called a groupoid.
6.7. CATEGORIES AND FUNCTORS
329
Next we introduce a notion which allows us to relate two categories.
Definition 6.7.7. A covariant functor F from a category C to a category D is a map that associates to an object X of C an object F (X) of D,
and to each morphism φ : X → Y , a morphism F (φ) : F (X) → F (Y ), such
that the following properties hold:
(i) F preserves identities: F (1X ) = 1F (X) , for every object X of C;
(ii) F preserves compositions: F (φ ◦ ψ) = F (φ) ◦ F (ψ), for all morphisms
φ and ψ that are composable.
Similarly, a contravariant functor F from a category C to a category
D is a map that associates to each object X of C an object F (X) of D, and
to each morphism φ : X → Y , a morphism F (φ) : F (Y ) → F (X), and:
(i) F preserves identities: F (1X ) = 1F (X) for every object X de C;
(ii’) F inverts compositions: F (φ ◦ ψ) = F (ψ) ◦ F (φ) for all morphisms φ
and ψ that are composable.
Let us see some examples of functors:
Examples 6.7.8.
1. The map that associates to each group its underlying set and to each group
homomorphism the underlying map of sets, is a covariant functor from the
category of groups to the category of sets. More generally, for any concrete
category C, there is a forgetful functor F : C → S from C to the category
of sets S which “forgets” the structure: to each object X of C, the functor F
associates the underlying set of X, and to each morphism φ ∈ Hom(X, Y ) the
functor F associates the underlying map X → Y .
2. The map that associates to each module M over a commutative ring A its
dual M ∗ , and to each A-linear map φ : M → N the transpose φ∗ : N ∗ → M ∗ ,
is a contravariant functor from the category of left A-modules to the category
of right A-modules.
3. The map that associates to each topological space X the k-th homology group
Hk (X, Z) (respectively, the k-th cohomology group H k (X, Z)) and to each continuous map φ : X → Y the group homomorphism φ∗ : Hk (X, Z) → Hk (Y, Z)
(respectively, φ∗ : H k (Y, Z) → H k (X, Z)), is a covariant (respectively, contravariant) functor from the category of topological spaces to the category of
abelian groups.
330
CHAPTER 6. MODULES
Many constructions that we have seen before are special instances of
general constructions in Category Theory. Consider, for example, the notion
of direct product: if {Xi }i∈I is a family of objects in some category C the
product of the objects {Xi }i∈I is a pair (Z, {πi }i∈I ), where Z is an object
in C and each πi : Z → Xi is a morphism in C, satisfying the following
universal property:
• For any object Y and any collection of morphisms {φi : Y → Xi }∈I ,
there exists a unique morphism φ : Y → Z such that, for each i ∈ I,
the following diagram commutes:
φ
Y ❴▼▼ ❴ ❴ ❴ ❴ ❴/ Z
▼
▼▼▼
▼▼▼
φi ▼▼▼▼ &
πi
Xi
It is easy to see that if products exist, then they are unique
up to an isoQ
morphism in C. In this case, we denote the product by i∈I Xi . Note that
the product may or may not exist, depending on the category. For a given
category it is necessary to show the existence, and for that one needs some
concrete model (usually quite obvious). We have followed exactly this procedure in the case of the category of groups (direct product of groups) and in
the case of the category of modules over a ring (direct product of modules).
Another advantage of Category Theory is to make it possible to give a
precise meaning to certain expressions one often uses when discussing some
mathematical formalism. For example, one may say that a certain map is
“induced” or that a certain construction is “natural”. In order to illustrate
this usage of Category Theory, let us see how one can make the notion
of “natural” precise. We recall that this expression is often is used as a
synonymous of “not depending on choices”.
Definition 6.7.9. A natural transformation T between two functors
F : C → D and G : C → D is a map which associates to each object X in a
category C a morphism TX : F (X) → G(X) in the category D, so that for
any morphism φ : X → Y of C the following diagram commutes:
F (X)
TX
/ G(X)
F (φ)
G(φ)
F (Y )
TY
/ G(Y )
6.7. CATEGORIES AND FUNCTORS
331
If TX is an isomorphism for every object X, we say that T is a natural
equivalence.
As a simple illustration, we show in the next example how one can formalize precisely the statement that for any finite-dimensional vector space
V there exists a natural isomorphism between the double dual (V ∗ )∗ and V .
Example 6.7.10.
Let C be the category of (left) modules over a ring A. For each A-module M ,
denote by M ∗∗ the double dual:
M ∗∗ ≡ (M ∗ )∗ ,
and for each A-linear transformation φ : M → N denote by φ∗∗ : M ∗∗ → N ∗∗
the double transpose:
φ∗∗ ≡ (φ∗ )∗ .
This defines a covariant functor from C to C.
Now, for each A-module M , we denote by TM : M → M ∗∗ the A-linear transformation defined as follows: for each v ∈ M , TM (v) : M ∗ → A the linear
transformation
ξ 7→ ξ(v).
One checks easily that M 7→ TM is a natural transformation between the double
dual functor and the identity functor.
If instead of the category of A-modules we consider instead the category of
finite-dimensional vector spaces over a field K, then V 7→ TV defines a natural
equivalence natural between the double dual functor and the identity functor.
We will not make heavy use of Category Theory (but in some situations
it we will be quite useful as a language!). Hence, we will not develop further
Category Theory, but you should keep in mind that Category Theory is an
important subject, that plays a major role in many branches of Mathematics,
including Algebraic Topology, Algebraic Geometry, etc.
Exercises.
1.QLet {Xi }i∈I be a family of objects in a category C. Define coproduct
∗
i∈I Xi of the objects {Xi }i∈I by diverting the direction of the arrows in the
definition of product, and verify that the direct sum of abelian groups and
modules, and the free product of groups, are coproducts in the appropriate
categories.
332
CHAPTER 6. MODULES
2. A set with a marked point is a pair (X, x) where X is a set and x ∈ X. A
morphism (X, x) → (Y, y) between sets with marked points is a map f : X → Y
such that f (x) = y.
(a) Show that sets with a marked point and the morphisms between marked
sets form a category.
(b) Show that products exist in this category and describe them explicitly.
(c) Show that coproducts exist in this category and describe them explicitly.
3. Let L be an object in a category C, S a non-empty set, and ι : X → L a
map. One says that L is a free object in the set S if for each object X
in C and map φ : S → X there exists a unique morphism φ̃ : L → X such that
the following diagram commutes:
ι
/
S ◆◆
◆◆◆
◆◆◆
◆
φ ◆◆◆◆
&
L✤
✤
✤ φ̃
X.
Show that, in any category C, if L is a free object in the set S, L′ is a free
object in the set S ′ and |S| = |S ′ |, then L′ is isomorphic to L.
4. An object I in a category C is called universal or initial if for each object
X in C there exists a unique morphism φ : I → X:
(a) Show that any two initial objects in a category C are isomorphic.
(b) Determine the initial objects in the category of groups.
(c) Show that the coproduct can be consider as an initial object in an appropriate category.
5. Define, similarly to the previous exercise, a co-universal or terminal
object in a category C.
(a) Show that any two terminal objects in a category C are isomorphic.
(b) Determine the terminal objects in the category of groups.
(c) Show that the product can be consider as a terminal object in an appropriate category.
Chapter 7
Galois Theory
The solution of a quadratic equation was known to the mathematicians in
Babylon1 , which knew how to “complete the square”, and was popularized
in the western world during the Renascence in latin translations of the Arab
book by al-Khowarizmi Al-jabr wa’l muqābalah (mentioned in Chapter ??).
In 1545, it was published the monograph Ars Magna by Geronimo Cardano
(1501–1576), also known by the name Cardan, which included formulas to
solve third and forth degree polynomial equations. Cardan attributed these
formulas, respectively, to Niccolo Tartaglia (1500–1565) and Ludovico Ferrari (1522–1565). The discovery of these formulas, and the fight for the
priority in its discovery, has a very curious and amusing history, which can
be found in the books mentioned above.
The formula for the solution of the cubic equation x3 + px = q, known
today as Cardan’s formula, is
qp
qp
3
3
(p/3)3 + (q/2)2 + q/2 −
(p/3)3 + (q/2)2 − q/2.
x=
The general cubic equation y 3 + by 2 + cy + d = 0 can be reduced to this case
by performing the substitution y = x − b/3. One can have a good idea of the
degree of difficulty in solving these equations if one verifies by substitution
that Cardan’s formula does indeed furnish a solution.
A quartic equation can also be reduced to the solution of a cubic. First
one can assume, eventually after some translation, that the quartic is of the
form x4 + px2 + qx + r = 0. Completing the square, one obtains
(x2 + p)2 = px2 − qx − r + p2 .
1
For the historical references in this chapter see, e.g., Carl R. Boyer, A History of
Mathematics, John Wiley & Sons, New York (1968), and Dirk J. Struik, A Concise History
of Mathematics, Dover Books on Mathematics, New York (1987).
333
334
CHAPTER 7. GALOIS THEORY
The trick consists then in observing that, for any y, one has:
(x2 + p + y)2 = px2 − qx − r + p2 + 2y(x2 + p) + y 2
= (p + 2y)x2 − qx + (p2 − r + 2py + y 2 ).
This last equation is quadratic in x, and we can choose y so that we obtain
a perfect square. This is achieved precisely when the discriminant vanishes:
q 2 − 4(p + 2y)(p2 − r + 2py + y 2 ) = 0.
Now the last equation is the cubic in y:
−8y 3 − 20py 2 + (−16p2 + 8r)y + (q 2 − 4p3 + 4pr) = 0,
which can be solved by using Cardan’s formula. For this value of y, the right
hand side of the above auxiliary equation is then a perfect square, so taking
square roots we obtain a quadratic equation which can be easily solved.
The solutions to the cubic and the quartic contained in the Ars Magna
provided a strong stimulus in the search of general formulas for the solutions
of algebraic equations of higher order. These efforts were unsuccessful for
over 300 years, and one had to wait for the beginning of the 19th Century
when Abel and Rufini reach the opposite conclusion: for an equation of the
5th degree there is no general formula that expresses the solutions in terms
of radicals of the coefficients of the equation.
Inspired by Abel’s proof of the impossibility of solving the quintic equation, Galois initiated a systematic study of algebraic equations of arbitrary
degree and showed that it was not only impossible to give a general formula
for equations of any degree higher than 5, but gave also a criterion to decide if a particular equation could be solved and, in the affirmative case, a
method to solve it. Galois investigations, in spite of its premature death,
were fundamental in shaping the field of Algebra as we know it nowadays,
and had much more consequences that the complete solution to the original
problem of solving algebraic equation through radicals.
In order to illustrate Galois ideas, consider the quartic equation with
rational coefficients:
p(x) = x4 + x3 + x2 + x + 1 = 0.
2πk
The roots of this equation are rk = ei 5 (k = 1, . . . , 4) (why?). Consider
all the possible polynomial equations, with rational coefficients, which are
satisfied by these roots, such as:
r1 + r2 + r3 + r4 − 1 = 0,
2
(r1 + r4 ) + (r1 + r4 ) − 1 = 0,
r1 r4 = 1,
5
(r1 ) − 1 = 0,
...
7.1. FIELD EXTENSIONS
335
The key observation is the following: the set of all permutations of the roots
which transform equations of this type into equation of this type form a
group, now called the Galois group of the equation. In the present case
the group is: G = {I, (1243), (14)(23), (1342)}. Galois understood that the
structure of this group is the key to solve the equation 2 .
Consider, for example, the subgroup H = {I, (14)(23)} ⊂ G. One can
check that among the polynomial expressions above the ones that are fixed
by H are precisely the polynomials in w1 = r1 + r4 and w2 = r2 + r3 . Now
w1 and w2 are the roots of the quadratic equation
x2 + x − 1 = 0.
Hence, assuming that we did not know the solutions to the original quartic
equation, we could try to solve it by solving first this quadratic equation,
obtaining:
√
√
−1 − 5
−1 + 5
,
r2 + r3 =
,
r1 + r4 =
2
2
and then solving the quadratic equation:
(x − r1 )(x − r4 ) = x2 − (r1 + r4 )x + r1 r4 = 0,
Note that the last equation is an equation whose coefficients are polynomials
in w1 and w2 , since r1 r4 = 1.
Note that the Galois Group of a polynomial equation p(x) ∈ Q[x] can
be characterized as the group of symmetries of the equation: it consists of
the transformations that take solutions (roots) to solutions, preserving the
algebraic structure of the roots. In the modern formulation of Galois Theory
one constructs the field Q(r1 , . . . , rn ) generated by the roots of p(x) and the
Galois Group consists of the automorphisms of this field 3 . One can say
that Galois Theory consists in replacing questions concerning the structure
of this field into questions about the structure of the Galois group.
7.1
Field Extensions
Recall that a field L is said to be an extension of the field K, if K is a
subfield of L. We will now initiate a systematic study of field extensions.
We start by recalling some results that were deduced in Chapter 3.
2
This discovery is even more remarkable, since Galois had first to invent the concept
of a group, which at the time was not known!
3
The notion of field was only formalized by Dedekind in 1879, more that 50 years after
the tragic death of Galois.
336
CHAPTER 7. GALOIS THEORY
Let K be a field and L an extension of K. Recall that we call L a finite
extension (respectively infinite extension) if L, seen as a vector space over
K, has finite dimension (respectively, infinite dimension). The dimension of
L over K is denoted by [L : K]. If S ⊂ L is some subset, we denote by K(S)
the smallest subfield of L which contains K and S. Obviously, K(S) is an
extension of K generated by S. If S = {u1 , . . . , un }, we write K(u1 , . . . , un )
instead of K({u1 , . . . , un }).
Let us consider in detail the case where S consists of a single element, i.e.,
K(u) where u ∈ L. If x is an indeterminate, there is a ring homomorphism
φ : K[x] → L which associates to a polynomial g(x) its value at u: g(x) 7→
g(u). The kernel ker(φ) is an ideal in K[x], which must be principal. There
are two possibilities:
(i) u is transcendental over K. This corresponds to the case ker(φ) =
{0}, so φ is a monomorphism whose image is K[u] and which extends
uniquely to the field of fractions K(x) = Frac(K[x]). It follows that
K(u) ≃ K(x), and the elements in K(u) take the form g(u)/f (u)
where g(x), f (x) ∈ K[x] and f (x) 6= 0. The extension K(u) has infinite
dimension over K;
(ii) u is algebraic over K. In this case ker(φ) = hp(x)i, where p(x) is
an irreducible polynomial, and K(u) = K[u] ≃ K[x]/hp(x)i. If we
require p(x) to be monic, then p(x) is unique and is called the minimal
polynomial of u. The extension K(u) has dimension [K(u) : K] =
deg p(x). We call this dimension the algebraic degree of u over K.
Definition 7.1.1. An extension L of K is called a simple extension if
there exists u ∈ L such that L = K(u). In this case, one says that u is a
primitive element of L.
A extension L of K is called algebraic (respectively transcendental) if all
the elements of L are algebraic (respectively,there exists a transcendental
element in L) over K. A simple extension L is algebraic or transcendental
exactly when its primitive elements are algebraic or transcendental. A finite
extension L over K is always algebraic (see Exercise 3.4.8), but there are algebraic extensions of infinite dimension over K. A transcendental extension
is always infinite.
Examples 7.1.2.
√
1. Consider L = Q( n 2) ⊂ R and K = Q. By Eisenstein Criterion, the √polynomial p(x) = xn − 2, (n ≥ 2) is
irreducible over Q.
The number u = n 2 is
√
√
n
n
a root of p(x) in R. Hence, [Q( 2) : Q] = n and 2 is algebraic of degree n
over Q.
7.1. FIELD EXTENSIONS
337
2. Consider L = C and K = R. The polynomial x2 + 1 is clearly irreducible
over R. The number i ∈ C is a root of x2 + 1 in C, hence i is algebraic of
degree 2 over R. In fact, C = R[x]/hx2 + 1i = R(i), so C is a simple extension
of R, and i is a primitive element.
3. Consider L = C and K = Q. Then L is a transcendental extensional of K
and [C : Q] = ∞.
We leave the proof of the following proposition as an exercise:
Proposition 7.1.3. Let M ⊃ L ⊃ K be successive extensions of a field K.
Then [M : K] is finite if and only if both [M : L] and [L : K] are finite and
in this case:
[M : K] = [M : L] · [L : K].
Example 7.1.4.
√
q(x) = x2 −3 is irreducible
We saw√above that [Q( 2),
√ Q] = 2. The polynomial
√
over Q( 2): a root a + b 2 of q(x) in Q( 2) must satisfy
√
(a + b 2)2 = 3, a, b ∈ Q.
√
Hence, (a2 + 2b2 ) + 2ab 2 = 3, so√that a2 + 2b2 = 3 and ab = 0. If b = 0, then
2
a2 = 3 which is impossible
since 3 6∈ Q. If a = 0 √
we have
= 6 which is
√
√ (2b) √
also impossible√since
6
∈
6
Q.
We
conclude
that
Q(
2)(
3)
≃
Q(
2)[x]/hx2 −
√
√
3i so that [Q( 2)( 3), Q( 2)] = 2. By Proposition 7.1.3, we conclude that
√ √
√ √
√
√
[Q( 2, 3), Q] = [Q( 2)( 3), Q( 2)] · [Q( 2), Q] = 2 · 2 = 4.
In the examples so far, we have consider only subfields of the field of
complex numbers. In this case, there is no problem in finding extensions
where a given polynomial has roots due to a fundamental property of C:
every polynomial (of degree ≥ 1) with coefficients in C has at least one root,
i.e., C is algebraically closed. We recall:
Proposition 7.1.5. The following statements are equivalent:
(i) L is an algebraically closed field.
(ii) Every polynomial p(x) = an xn + · · ·Q
+ a1 x + a0 ∈ L[x] factors into a
product of linear factors: p(x) = an n1 (x − ri ).
(iii) Every irreducible polynomial in L[x] has degree 1.
(iv) There are no proper algebraic extensions of L.
338
CHAPTER 7. GALOIS THEORY
There are many interesting examples of fields which are not subfields of
C. For example, the number fields Zp , relevant in Number Theory, or the
field of fractions C(z), fundamental in Algebraic Geometry. For such fields it
is not completely obvious that given a polynomial there exists an extension
where the polynomial factors into linear polynomials. We will deal with
these issues in the next sections.
Exercises.
1. Show that the subset of C formed by all algebraic elements over Q is an
algebraic extension of Q of infinite dimension over Q (such elements of C are
usually called algebraic numbers).
2. Give a proof of Proposition 7.1.3.
3. Show that:
(a) every algebraically closed field K is infinite;
(b) if K is an infinite field, show that any algebraic extension of K has the
same cardinality as K;
4. Show that there are elements in C which are not algebraic over Q.
5. Let L ⊂ C be the smallest extension of Q which contains the roots of the
polynomial x3 − 2. Decide if L is simple or not and in the afirmative case give
an example of a primitive element.
6. Prove Proposition 7.1.5.
7. Let K be a field and x an indeterminate. In K(x) consider the element
u = x2 . Show that K(x) is a simple extension of K(u). What is the dimension
of [K(x) : K(u)] ?
7.2
Compass-and-Straightedge Constructions
The mathematicians in Ancient Greece used to express in a geometric manner many mathematical concepts and constructions. In general, a construction was consider to be valid only if it could be obtained by using only a
compass and a straightedge (an unmarked ruler). In spite of their spectacular achievements, there were a few simple constructions that they could not
find a way to perform using only compass and straightedge.
7.2. COMPASS-AND-STRAIGHTEDGE CONSTRUCTIONS
339
Among the most famous constructions were the Greeks failed to find a
compass-and-straightedge construction, were the following:
(i) trisecting an angle;
(ii) duplicating a cube;
(iii) constructing a regular heptagon;
(iv) squaring the circle, i.e., constructing a square with area equal to a
given circle.
The problem of existence of such compass-and-straightedge constructions
can be transformed into a problem in Field Theory. Then Galois Theory
can be applied to solve the problem. We will formulate here a first version
of the theory which allows us already to show that (i)-(iv) do not have a
compass-and-straightedge construction.
Starting with two points in the plane let us attempt to determine all possible compass-and-straightedge constructions based at these points. Without loss of generality, we can assume that the two points are the numbers 0
and 1 in the complex plane C. We define inductively subsets Xm ⊂ C, m =
1, 2, . . . , as follows:
• X1 ≡ {0, 1}, and
• Xm+1 is the union of Xm with
(C1) the points of intersection of straight lines connecting points of
Xm ;
(C2) the points of intersection of straight lines connecting points of
Xm with circumferences with center in points of Xm and radius
segments with end points in Xm ;
(C3) the points of intersection of pairs of circumferences with center
in points of Xm and radius segments with end points in Xm ;
A point z ∈ C is said to be constructible with compass and straightedge if it
belongs to the union:
[
Xm .
C≡
m
We call C the set of constructible points.
Let us give some alternative descriptions of the set C of constructible
points. These will be complemented later with more complete descriptions,
after we study Galois Theory.
340
CHAPTER 7. GALOIS THEORY
Theorem 7.2.1. C is the smallest subfield of C which contains Q, and is
closed for the operations of complex conjugation (z 7→ z̄) and taking square
1
roots (z 7→ z 2 ).
Proof. The proof is split into 3 parts:
(a) C is a subfield of C: Assume that z1 , z2 ∈ C. Then z1 − z2 and
z1 /z2 (z2 6= 0) belong to C, since the usual operations of addition and multiplications of complex can be expressed geometrically exclusively using (C1),
(C2) and (C3) (exercise).
(b) C is closed for the operations of complex conjugation and taking
square roots It is obvious that C is closed for the operation of complex
√
conjugation. On the other hand, if z ∈ C and r = |z|, we can obtain r
using (C1), (C2) and (C3), through the geometric construction sketched
in the figure. Since bisecting an angle can also be performed using these
1
operations, we see that z 2 can be constructed using (C1), (C2) and (C3).
PSfrag replaements
p
r
1
mi
h
e
e~
1h1+i =rh2Zi
h4i
h12i
h6i
h3i
hmd(m; n)i
hmi
hni
hmm(m; n)i
Figure 7.2.1: Finding the square root.
(c) If Q ⊂ C ′ ⊂ C is any subfield closed for the operations of complex
conjugation and taking square roots, then C ⊂ C ′ : We need to show that the
points of intersection of
(i) straight lines connecting points of C ′ ,
(ii) straight lines connecting points of C ′ with circumferences with center
in C ′ and radius segments with end-points in C ′ ,
(iii) pairs of circumferences with center in C ′ and radius segments with
end-points in C ′ ,
7.2. COMPASS-AND-STRAIGHTEDGE CONSTRUCTIONS
341
belong to C ′ . Observe first that if z = x + iy ∈ C ′ , then x and y belong to
C ′ , since C ′ is closed under conjugation. It follows that if ax + by + c = 0
is the equation of a straight line connecting points of C ′ , then a, b, c ∈ C ′ .
Similarly, if x2 + y 2 + dx + ey + f = 0 is the equation of a circumference with
center in C ′ and radius a segment with end-points in C ′ , then d, e, f ∈ C ′ .
Finally, observe that the points of intersection of two objects of this type,
have coordinates certain fractions which, at most, contain square roots in
the coefficients a, b, c, d, e, f , hence, they belong to C ′ .
Next, we observe that one can translate the property of being constructible to an issue about the structure of fields.
Theorem 7.2.2. A complex number z belongs to C if and only if it belongs to
a subfield Q(u1 , u2 , . . . , ur ) ⊂ C, where u1 2 ∈ Q and um+1 2 ∈ Q(u1 , . . . , um )
for m = 1, 2, . . . , r − 1.
Proof. Since C is a field which contains Q and is closed under taking square
roots, it follows that fields of the form Q(u1 , u2 , . . . , ur ), with u1 2 ∈ Q and
um+1 2 ∈ Q(u1 , . . . , um ) (m < r), are subfields of C. To complete the proof
it is enough to check that the complex numbers that belong to some field
Q(u1 , u2 , . . . , ur ) with u1 2 ∈ Q and um+1 2 ∈ Q(u1 , . . . , um ) (m < r), form a
subfield of C, closed under conjugation (z 7→ z̄) and taking square roots (z 7→
1
z 2 ). For this, note that if z ∈ Q(u1 , u2 , . . . , ur ) and z ′ ∈ Q(u′ 1 , u′ 2 , . . . , u′ s ),
then z − z ′ and z/z ′ (z ′ 6= 0) belong to Q(u1 , u2 , . . . , ur , u′ 1 , u′ 2 , . . . , u′ s ).
This field is obviously closed under conjugation and taking square roots is
obvious.
As a corollary, we obtain the following criterion to exclude a complex
number from C:
Corollary 7.2.3. If a number z ∈ C is constructible, then it is algebraic
over Q of degree a power of 2.
Proof. If K(u) is an extension of K where u2 ∈ K, then either K(u) = K
or [K(u) : K] = 2. Hence, the fields Q(u1 , u2 , . . . , ur ) with u1 2 ∈ Q and
um+1 2 ∈ Q(u1 , . . . , um ), if m < r, satisfy [Q(u1 , . . . , ur ) : Q] = 2s , for some
integer s ≤ r. The corollary now follows from the theorem, if one observes
that z ∈ C, then Q(z) ⊂ Q(u1 , . . . , ur ), hence, [Q(z) : Q] = 2t .
We will see later, in our study of Galois Theory, that the converse of
this corollary does not hold. However, we can use the corollary to show
that constructions (i)–(iv) are not possible if one uses only a compass and
straightedge.
342
CHAPTER 7. GALOIS THEORY
Examples 7.2.4.
1. Trisecting an angle. One can construct a 60o degree angle with compass
and straightedge. We claim that one cannot trisect an 60o angle. If this was
possible, then the number cos 20o + i sin 20o would be constructible and, in
particular, cos 20o would be constructible. The trigonometric identity
cos 3θ = 4 cos3 θ − 3 cos θ,
shows that cos 20o is a root of the polynomial 4x3 −3x− 21 = 0. This polynomial
is irreducible over Q (exercise), hence the degree of cos 20o over Q is 3. By the
corollary, cos 20o is not constructible.
√
2. Duplicating a cube. We need to √
show that 3 2 is not√constructible. For
that, it is enough to observe that [Q( 3 2) : Q] = 3, since 3 2 is a root of the
irreducible polynomial x3 − 2.
3. Constructing a regular do heptagon. If this construction exists, then
2π
the number z = cos 2π
7 + i sin 7 is constructible. This number is a root of
7
the polynomial x − 1 = (x − 1)(x6 + x5 + · · · + 1). Since x6 + x5 + · · · + 1 is
irreducible over Q, we conclude that [Q(z) : Q] = 6. Since 6 is not a power of
2, the corollary shows that z is not constructible.
4. Squaring the circle. Starting from a circle of radius 1, we see that an
affirmative answer implies that the number π is constructible. As we know, π
is transcendental over Q, hence is not constructible4 .
There exist algebraic numbers of degree a power of 2 which are not
constructible. We will see later that Galois Theory yields a more efficient
criterion to determine if a given algebraic number is constructible. For now,
we remark that the structure of certain extensions of Q was the relevant
factor to decide about this property.
Exercises.
1. Interpret the usual addition and multiplication operations on complex numbers geometrically, showing that it involves exclusively (C1), (C2) and (C3)
and, hence, maybe constructed with a compass and straightedge.
4
We will not discuss in this book the transcendence of π. The existence of transcendentals numbers was established for the first time by Liouville in 1844. Liouville observed that
algebraic numbers cannotP
be the limit of rational sequences which converge “very fast”.
−n!
For example, the number ∞
cannot be algebraic. Hermite, in 1873, showed that
n=1 10
the base of the natural logarithms “e” is transcendental. Finally, Lindemann, in 1882,
used methods similar to Hermite to prove that π is transcendental. In 1874, Cantor gave
an argument without using limits (see Exercises 3 and 4 in Section 7.1) showing that there
exist transcendental numbers.
7.3. SPLITTING FIELDS
2. Show that the number
√
2
2
343
+i
√
2
2
is constructible.
3. Show that the polynomial q(x) = 4x3 − 3x −
1
2
is irreducible over Q.
4. Show that arccos 71/125 can be trisected.
5. Let p be a prime. Show that the polynomial xp−1 +xp−2 +· · ·+1 is irreducible
over Q.
(Hint: Replace x be x + 1 in the expression xp−1 + xp−2 + · · · + 1 = (xp −
1)/(x − 1) and apply Eisenstein’s Criterion.)
6. Let p be a prime.
(a) Show that if a regular polygon with p sides can be constructed with a
compass and straightedge then p = 2s + 1.
t
(b) Show that if p = 2s + 1 is prime then p is a Fermat number: p = 22 + 1.
(c) Conclude that if a regular polygon with p sides can be constructed with
a compass and straightedge then p is a Fermat number:.
7. Decide if a regular polygon of 9 sides has a compass-and-straightedge construction.
7.3
Splitting Fields
As usual, we assume that K is a field. Given a polynomial p(x) ∈ K[x]
is there some extension L of K where p(x) splits into a product of linear
factors? Since K is not a priori a subfield of an algebraically closed field
this is not completely obvious. Such an extension, if it exists, is necessarily
obtained through some “abstract” construction. In fact, such an extension
always exists and its construction is similar to the construction of the complex field C, as an extension of R, in the form of the “abstract” quotient
R[x]/hx2 + 1i.
Theorem 7.3.1. Let p(x) ∈ K[x] be polynomial of degree n ≥ 1. There
exists an extension L ⊃ K where p(x) splits into a product of linear terms.
Moreover, one can take this extension to be of the form
L = K(r1 , . . . , rn ),
where r1 , . . . , rn are the roots of p(x) in L.
344
CHAPTER 7. GALOIS THEORY
Proof. Without loss of generality, we can assume that p(x) is a monic polynomial of degree n. Denoting by l the number of irreducible factors of p(x),
the proof proceeds by induction on n − l.
If n − l = 0, then p(x) is already a product of linear terms and the roots
of p(x) belong to K, Hence, the result holds in this case.
Assume now that n − l > 0. Then there exists a factor q(x) of p(x) which
is irreducible of degree > 1. The field M = K[x]/hq(x)i is an extension of
K, containing the root r = x + hq(x)i of q(x), and coincides with K(r). In
M , we have the factorization q(x) = (x − r)q̃(x), so the polynomial p(x)
splits into a product of ˜l irreducible factors with ˜l > l. Since n − ˜l < n − l
the induction hypothesis gives an extension L of M , where p(x) splits into
a product of linear terms, and such that:
L = M (r1 , . . . , rn ),
where r1 , . . . , rn are the roots of p(x) in L. Since the root r is among the
roots r1 , . . . , rn , we see that
L = M (r1 , . . . , rn )
= K(r)(r1 , . . . , rn )
= K(r, r1 , . . . , rn ) = K(r1 , . . . , rn ).
This result leads to the following definition.
Definition 7.3.2. Let p(x) ∈ K[x]. An extension L ⊃ K is called a splitting extension of p(x) if:
(i) p(x) splits in L into a product of terms of degree 1;
(ii) L = K(r1 , . . . , rn ) where r1 , . . . , rn are the roots of p(x) in L.
Similarly, if {pi (x)}i∈I ⊂ K[x] is a family of polynomials, an extension
L ⊃ K is called a splitting extension of the family {pi (x)}i∈I if:
(i) each pi (x) splits in L into a product of terms of degree 1;
(ii) L is generated by the roots of the polynomials pi (x).
Often, we will use the term splitting field instead of splitting extension,
if the base field is clear from the context. We will see in the next section
that two splitting extensions of a polynomial (or a family) are isomorphic.
The proof of Theorem 7.3.1 gives a practical algorithm to construct a
splitting extension of a polynomial, as we illustrate in the next examples.
7.3. SPLITTING FIELDS
345
Examples 7.3.3.
1. Consider a polynomial p(x) = x2 + bx + c with coefficients in a field K. If
p(x) has a root in K, then K is already a splitting field of p(x). Assume then
that p(x) is irreducible. Then K[x]/hp(x)i is an extension of K, r1 = x−hp(x)i
is a root of p(x), and K[x]/hp(x)i = K(r1 ). In K(r1 ) we have x2 + bx + c =
(x − r1 )(x − r2 ), hence K(r1 ) = K(r1 , r2 ) is a splitting field of x2 + bx + c and
[K(r1 , r2 ) : K] = 2.
√ √
2. We have seen in a example in the previous section
that Q( 2, 3) ⊂ C is a
√
√
splitting field of (x2 − 2)(x2 − 3) and that [Q( 2, 3) : Q] = 4.
3. The polynomial p(x) = xp − 1 (p a prime) has the root r0 = 1 in Q, so that
xp − 1 = (x − 1)(xp−1 + xp−2 + · · · + 1). The polynomial xp−1 + xp−2 + · · · + 1
is irreducible in Q (exercise). Let r ∈ Q[x]/hxp−1 + xp−2 + · · · + 1i be the
element x + hxp−1 + xp−2 + · · · + 1i. In Q(r)Q
we have (rk )p = rkp = 1, and
p
that r, r2 , . . . , rp−1 are all distinct, so xp − 1 = i=1 (x − ri ). We conclude that
p
Q(r) is a splitting field of x − 1 and that [Q(r) : Q] = p − 1. If z ∈ C is a
complex root of xp −1 distinct from 1, one checks easily that Q(z) is isomorphic
to Q(r).
4. The polynomial p(x) = x3 +x2 +1 is irreducible over Z2 [x]. In fact, p(x) does
not have any roots in Z2 , since p(0) = 0 + 0 + 1 = 1 and p(1) = 1 + 1 + 1 ≡ 1
(mod 2). The field Z2 [x]/hx3 + x2 + 1i is an extension of Z2 where p(x) has
the root r = x + hx3 + x2 + 1i. Hence, in this field we can factor p(x) =
(x − r)(x2 + bx + c). One checks by comparison that b = 1 + r and c = r2 + r.
Using the relations r3 = r2 + 1 and r4 = r2 + r + 1, one check immediately that
r2 is a root of x2 + (r + 1)x + (r2 + r). Hence, p(x) splits into linear factors
in Z2 (r)[x]. This shows that Z2 (r) is a splitting field of x3 + x2 + 1 over Z2 ,
and that [Z2 (r) : Z2 ] = 3.
5. The polynomial p(x) = x3 − 2 is irreducible over Q. We form the extension
L = Q[x]/hx3 − 2i and let r1 = x + hx3 − 2i. Then L = Q(r1 ) and in L
the polynomial x3 − 2 factors as (x − r1 )(x2 + r1 x + r1 2 ). The polynomial
x2 + r1 x + r1 2 is irreducible over Q(r1 ) (exercise), so we can form a new
extension M = Q(r1 )[x]/hx2 + r1 x + r1 2 i. Setting r2 = x + hx2 + r1 x + r1 2 i ∈
M , we find that M = Q(r1 , r2 ) and in Q(r1 , r2 )[x] we have the factorization
x3 − 2 = (x − r1 )(x − r2 )(x − r3 ). We conclude that Q(r1 , r2 ) = Q(r1 , r2 , r3 )
is a splitting field of x3 − 2. Also, by Proposition 7.1.3, we have [Q(r1 , r2 , r2 ) :
Q] = 3 · 2 = 6.
As we have mentioned before, the splitting field Q(r1 , . . . , rn ) of a polynomial p(x) = an xn +· · ·+a1 x+an ∈ Q[x] can always be realized as a subfield
of C, since this field is an algebraically closed extension of Q. Actually, we
can now show that any field admits an algebraically closed extension.
346
CHAPTER 7. GALOIS THEORY
Theorem 7.3.4. Let K be a field. There exists an algebraically closed
extension of K.
Proof. We split the proof into 4 steps:
(a) There exists an extension L1 of K where every polynomial in K[x] of
degree ≥ 1 has a root: For each p(x) ∈ K[x] of degree ≥ 1 choose an
indeterminate xp . Denote by S the set of all such indeterminates. We
claim that in the ring K[S] the polynomials5 p(xp ) generate a proper
ideal. Assume that
(7.3.1)
g1 p1 (xp1 ) + · · · + gm pm (xpm ) = 1,
for some gi = gi (xp1 , . . . , xpm ) ∈ K[S]. There exists an extension
M of K where the polynomials p1 , . . . , pm have roots α1 , . . . , αm , so
replacing xpi 7→ αi in (7.3.1) we obtain 0 = 1, a contradiction. This
proves the claim, and so there exists a maximal ideal I in K[S] which
contains the polynomials p(xp ). The field L1 ≡ K[S]/I is the sought
extension.
(b) By(a), using induction, we construct a chain of extensions K ⊂ L1 ⊂
L2 ⊂ . . . ⊂ Ln ⊂ . . ., where for each k, any polynomial of Lk [x] of
degree ≥ 1 has a root in Lk+1 .
(c) Let L ≡ ∪i Li . Then L is a field: if a, b ∈ L, there exists a k such that
a, b ∈ Lk , and one can define a + b and ab as the sum and product
in Lk . This definition is independent of k, since the Li are successive
extensions.
(d) The field L is a algebraically closed extension: if p(x) ∈ L[x] has
degree ≥ 1, then the coefficients of p(x) belong to some Lk . Hence,
p(x) ∈ Lk [x], so p(x) must have a root in Lk+1 , hence in L.
Corollary 7.3.5. Let K be a field. There exists an algebraic extension
L̃ ⊃ K which is algebraically closed.
Proof. Let L ⊃ K be an algebraically closed extension and denote by L̃ the
set of all algebraic elements of L over K. It is easy to check that L̃ is a
subfield of L which has the desired properties.
5
One should not confuse p(x) with p(xp )!
7.3. SPLITTING FIELDS
347
An algebraic extension of K which is algebraically closed is called an
algebraic closure of K. We will see in the next section that any two
a
such extensions are isomorphic. We will denote by the symbol K any
algebraic closure of K.
The existence of a splitting field of a polynomial (Theorem 7.3.1) can be
easily shown if one appeals to the algebraic closure. However, note that the
proof of the existence of an algebraic closure uses precisely Theorem 7.3.1.
Still, we can now easily prove the existence of a splitting field of an arbitrary
family of polynomials.
Corollary 7.3.6. Let {pi (x)}i∈I ⊂ K[x] be a family of polynomials. There
exists a splitting extension L ⊃ K for the family {pi (x)}i∈I .
a
Proof. It is enough to consider in the algebraic closure K the subfield L
generated by all the roots of the polynomials {pi (x)}i∈I .
Let us consider now the converse problem: when is an extension L of K
a splitting extension of some family of polynomials with coefficients in K?
Definition 7.3.7. An algebraic extension L of K is called a normal extension of K if every irreducible polynomial in K[x] which has a root in L
splits into a product of linear terms in L[x].
Examples 7.3.8.
√
1. Q( 2) is a normal extension of Q. To check
√ p(x) ∈ Q[x] be an
√ this let
irreducible polynomial with a root r = a + b 2 ∈ Q( 2). Then p(x) is a
multiple of the minimal polynomial q(x) of r. Since
q(x) = (x − a)2 − 2b2 ∈ Q[x]
and p(x) is irreducible
over Q, we must have √
p(x) = λq(x),√for some λ ∈ Q.
√
Finally, in Q( 2) we have p(x) = c(x − a − b 2)(x − a + b 2).
√
Q. In √
fact, the polynomial x4 − 2 is
2. Q( 4 2) is not a normal extension of
√
4
irreducible over Q, and has the root √2 in Q( 4 2), However, this polynomial
does not split into linear terms in Q( 4 2), since this field does not contain the
imaginary roots of x4 − 2.
The last example suggest that there must exist some relation between
splitting extensions and normal extensions. The next result clarifies completely this issue:
348
CHAPTER 7. GALOIS THEORY
a
Theorem 7.3.9. Let L ⊂ K be an algebraic extension of K. The following
statements are equivalent:
(i) L is a normal extension of K.
(ii) L contains a splitting field of the minimal polynomial of any u ∈ L.
(iii) L is a splitting extension of a family {pi (x)}(i∈I) ⊂ K[x].
a
(iv) Every monomorphism φ : L → K such that φ|K =id is an automorphism of L: φ(L) = L.
We defer the proof of this theorem for the next section. An immediate
corollary is the following:
Corollary 7.3.10. An extension L of K is a splitting extension of a polynomial p(x) ∈ K[x] if and only if L is a normal, finite-dimensional extension
of K.
We have provide above a simple example of a non normal extension. If
an algebraic extension is not normal, one can try to find a larger normal
extension which is not “too big”.
Proposition 7.3.11. Let L be an algebraic extension of K. There exists
an extension L̃ ⊃ L such that:
(i) L̃ is a normal extension of K.
(ii) for any normal extension M of K such that L ⊂ M ⊂ L̃, one has
M = L̃.
(iii) [L̃ : K] < ∞ if and only if [L : K] < ∞.
Proof. Let S = {ui : i ∈ I} be a basis for L over K and for each i ∈ I denote
by pi (x) ∈ L[x] the minimal polynomial of ui . Let L̃ be a splitting extension
for the family {pi (x)}i∈I . By Theorem 7.3.9, L̃ is a normal extension of K,
so (i) holds. Also by Theorem 7.3.9, any normal extension of K must contain
a splitting extension of the minimal polynomial of any of its elements so we
conclude that L̃ is the smallest normal extension of K that contains L, and
(ii) also holds. Finally, to see that (iii) holds, note that if [L : K] < ∞,
then I is finite and we can replace the family {pi (x)}i∈I by the polynomial
Q
p(x) = i∈I pi (x). It follows that [L̃ : K] < ∞. Conversely, if [L̃ : K] < ∞
then, by Proposition 7.1.3, [L : K] < ∞.
7.3. SPLITTING FIELDS
349
From the uniqueness of the the splitting extension it follows that the extension L̃ given in the previous proposition is unique up to an isomorphism.
n
Such an extension is called the normal closure of L and is denoted L .
Example 7.3.12.
√
of Q.
We saw in a example above that L = Q( 4 2) is not a normal extension
√
n
It is easy to check that its normal closure is the extension L = Q( 4 2, i).
Exercises.
1. Let L = Q[x]/hx3 − 2i and r = x + hx3 − 2i ∈ L. Show that the polynomial
x2 + rx + r2 is irreducible over L.
2. Prove the following extension of Theorem 7.3.1, without using the algebraic
closure: If p1 (x), . . . , pm (x) ∈ K[x], then there exists a splitting extension of
the family of polynomials p1 (x), . . . , pm (x).
3. For the following polynomials over Q find splitting extensions and determine
their dimensions:
(a) x2 + x + 1;
(b) (x3 − 2)(x2 − 1);
(c) x6 + x3 + 1;
(d) x5 − 7.
4. Consider the polynomial x3 − 3. Find a splitting extension and its dimension
over each of the following ground fields:
(a) Q;
(b) Z3 ;
(c) Z5 .
5. If p(x) ∈ K[x] is a polynomial of degree n, and L is a splitting extension of
p(x), show that [L : K] | n!.
n
6. Determine a splitting extension of the polynomial xp − 1 over the field Zp .
7. Show that the extension L̃ constructed in the proof of Corollary 7.3.5 is
algebraically closed.
8. If L is an extension of K and [L : K] = 2, show that L must be a normal
extension of K.
350
CHAPTER 7. GALOIS THEORY
9. If M ⊃ L ⊃ K are successive extensions, with M normal over L and L
normal over K, is it true that M must be normal over K?
10. Show that if M ⊃ L ⊃ K are successive extensions and M is normal over
K, then M is normal over L.
7.4
Homomorphisms of Extensions
Definition 7.4.1. Let L1 ⊃ K1 and L2 ⊃ K2 be extensions. A field homomorphism φ : L1 → L2 such that φ(K1 ) ⊂ K2 is called a homomorphism
of extensions. When K1 = K2 = K and φ|K =id we say that φ is a
homomorphism over K or a K-homomorphism6 .
Notice that a K-homomorphism φ : L1 → L2 is the same thing as a
field homomorphism which is also a linear transformation from L1 to L2 , as
vector spaces over K.
We are interested in following questions: Given an isomorphism of fields
φ : K1 → K2 , when can one extend φ to an isomorphism of extensions
Φ : L1 → L2 ? How many such extensions are there? The answers will
allow us to prove uniqueness of algebraic and normal closures, as well as of
splitting fields.
Proposition 7.4.2. Let φ : K1 → K2 be an isomorphism of fields, L1 ⊃
K1 and L2 ⊃ K2 extensions, and r ∈ L1 algebraic over K1 with minimal
polynomial p(x). The isomorphism φ can be extended to a monomorphism
Φ : K1 (r) → L2 if and only if the polynomial pφ (x) has a root in L2 . The
number of such extensions equals the number of distinct roots of pφ (x) in
L2 .
Proof. Let s be a root of pφ (x) in L2 . It is easy to check that there exist
a unique field homomorphism Φ : K1 (r) → L2 such that Φ|K1 = φ and
Φ(r) = s.
On the other hand, if Φ is any extension of φ, then
pφ (Φ(r)) = Φ(p(r)) = Φ(0) = 0,
so pφ (x) has a root in L2 .
Here is a first answer to the questions posed above:
6
Recall that a homomorphism of fields is necessarily injective, hence in these definitions
one can replace “homomorphism” by “monomorphism”.
7.4. HOMOMORPHISMS OF EXTENSIONS
351
Theorem 7.4.3. Let φ : K1 → K2 be an isomorphism of fields and p(x) ∈
K1 [x]. If L1 and L2 are splitting extensions of p(x) and pφ (x), respectively,
then there exists an isomorphism Φ : L1 → L2 which extends φ. The number
of such extension is ≤ [L1 : K1 ], and equals [L1 : K1 ] when pφ (x) has distinct
roots in L2 .
Proof. We use induction on [L1 : K1Q
].
If [L1 : K1 ] = 1,Q
then p(x) = an ni=1 (x − ri ), where ri ∈ K1 = L1 . But
then pφ (x) = φ(an ) ni=1 (x − φ(ri )), with φ(an ), φ(ri ) ∈ K2 . Since the roots
of a polynomial generate its splitting field we conclude that L2 = K2 , and
that there exists exactly [L1 : K1 ] = 1 extension.
Assume now that [L1 : K1 ] > 1. Then p(x) has an irreducible factor
q(x) ∈ K1 [x] of degree ≥ 1. Let r be a root of q(x) in L1 . By Proposition
7.4.2 the isomorphism φ : K1 → K2 can be extended to a monomorphism
φ̃ : K1 (r) → L2 and there exist as many such extensions as distinct roots of
q φ (x) in L2 . We can also view L1 and L2 as splitting extensions of p(x) and
pφ (x) over K1 (r) and φ̃(K1 (r)), respectively. Since
[L1 : K1 (r)] = [L1 : K1 ]/[K1 (r) : K1 ] = [L1 : K1 ]/ deg q(x) < [L1 : K1 ],
by the induction hypothesis, there exists an extension of φ̃ to an isomorphism
Φ : L1 → L2 , and the number of such extensions is ≤ [L1 : K1 (r)], and equal
[L1 : K1 (r)] if pφ (x) has distinct roots in L2 . It follows that Φ is an extension
of φ, and the number of extensions Φ of this type is
[L1 : K1 (r)] · deg q(x) = [L1 : K1 (r)] · [K1 (r) : K1 ] = [L1 : K1 ],
if pφ (x) has distinct roots in L2 .
Finally, observe that we can obtain any extension of φ in this way. In
fact, given any extension Φ : L1 → L2 its restriction to K1 (r) gives a
monomorphism K1 (r) → L2 , which by Proposition 7.4.2 is an extension of
φ of the type φ̃, as above.
Setting K1 = K2 = K and φ =id, in this theorem, yields:
Corollary 7.4.4. Any two splitting fields of p(x) ∈ K[x] are isomorphic.
Example 7.4.5.
In an example in the previous section, we have constructed an (abstract) splitting field L1 = Q(r1 , r2 , r3 ) of x3 − 2 ∈ Q[x], with [L1 : Q] = 6. Another
splitting field L2 can be obtained
√ taking the√subfield of C generated by Q
√ by
By the theorem, all the 6
and the roots of x3 − 2 in C: 3 2, 3√2/2(−1
√ ± i 3). √
bijections between {r1 , r2 , r3 } and { 3 2, 3 2/2(−1 ± i 3)} can be realized by a
Q-isomorphism L1 → L2 .
352
CHAPTER 7. GALOIS THEORY
Let us consider now the case of arbitrary algebraic extensions:
Theorem 7.4.6. Let φ : K1 → K2 be an isomorphism of fields, L1 ⊃ K1 an
algebraic extension and L2 ⊃ K2 an algebraically closed extension. There
exists a monomorphism Φ : L1 → L2 which extends φ. If L1 is algebraically
closed and L2 is algebraic over K2 , then Φ is an isomorphism L1 ≃ L2 .
Proof. Conside the set P formed by ordered pairs (N, τ ) where N ⊂ L1 is
an extension of K1 , and τ : M → L2 is a monomorphism which extends
φ. In P one defines a partial order relation by declaring (N1 , τ1 ) ≤ (N2 , τ2 )
if and only if N1 ⊂ N2 and τ2 |N1 = τ1 . The set P is non-empty, since
it contains the pair (K1 , φ), and any chain {(Ni , τi )}i∈I of elements of P
is bounded above by the pair (N, τ ), where N ≡ ∪i Ni and τ |Ni ≡ τi . By
Zorn’s Lemma, P contains a maximal element (M, Φ).
We claim that M = L1 , so that Φ gives the desired extension. In fact,
assume that there exists r ∈ L1 − M . We can then form the extension
extension M (r) and, by Proposition 7.4.2, there exists an extension Φ̃ :
M (r) → L2 . The pair (M (r), Φ̃) contradicts the maximality of (M, Φ), so
we must have L1 = M , as claimed.
Finally, if L1 is algebraically closed, then Φ(L1 ) is also algebraically
closed. If L2 is algebraic over K2 , then we must have L2 ⊂ Φ(L1 ), hence Φ
is onto.
The previous result leads to the uniqueness of the algebraic closure and
of the splitting extension of any family of polynomials. For example, if one
takes K1 = K2 = K and φ =id in the theorem, we obtain immediately:
Corollary 7.4.7. Let K be a field, L and L̃ algebraically closed extensions,
algebraic over K. There exists a K-isomorphism Φ : L → L̃.
Since a splitting extension of a family of polynomials can be taken inside
an algebraic closure, and it is generated by the all roots of the polynomials
in the family, we also obtain:
Corollary 7.4.8. Let {pi (x)}i∈I ⊂ K[x] be a family of polynomials. Any
two splitting extensions of the family {pi (x)}i∈I are isomorphic.
Finally, the uniqueness of the normal closure of an extension L of K
follows from Theorem 7.3.9 whose proof we are now able to furnish.
Proof of Theorem 7.3.9. We will prove the implications (i) ⇒ (ii) ⇒ (iii) ⇒
(iv) ⇒ (i).
7.4. HOMOMORPHISMS OF EXTENSIONS
353
(i) ⇒ (ii): Let r ∈ L and denote by p(x) ∈ K[x] the minimalQpolynomial
of r. Since L is normal and p(r) = 0, p(x) splits into a product ni=1 (x − ri )
in L[x]. The subfield K(r1 , . . . , rn ) ⊂ L is a splitting field of p(x).
(ii) ⇒ (iii): L is a splitting extension of the family {pr (x)}r∈L , where
pr (x) is the minimal polynomial of the element r ∈ L.
a
(iii) ⇒ (iv): Let r ∈ L ⊂ K be a root of a polynomial pi (x). Then φ(r)
is also a root of pi (x), since φ|K =id. Since the roots of the family {pi (x)}i∈I
generate the field L, we conclude that φ(L) = L.
(iv) ⇒ (i): Let p(x) ∈ K[x] be an irreducible polynomial with a root
a
a
r ∈ L. If r̃ ∈ K is any other root of p(x), the map φ : K(r) → K(r̃) ⊂ K
which takes r 7→ r̃ and is the identity in K, is a K-isomorphism, which by
Theorem 7.4.6 can be extended to L. Then r̃ = φ(r) ∈ L, hence all the roots
of p(x) belong to L, so p(x) factors into a product of linear terms in L[x].
Hence, L is a normal extension of K.
Exercises.
1. Let φ : K1 → K2 be an isomorphism, L1 ⊃ K1 and L2 ⊃ K2 extensions, and
r ∈ L1 algebraic over K1 with minimal polynomial p(x). If s is a root of pφ (x)
in L2 , show that there exists a unique homomorphism of fields Φ : K1 (r) → L2
such that Φ|K1 = φ and Φ(r) = s.
√
√
2. Consider the extensions Q(i), Q( 2), Q(i, 2) ⊂ C of Q.
√
(a) Show that Q(i) and Q( 2) are isomorphic Q-vector spaces, but are not
isomorphic as fields.
√
(b) Find all the Q-automorphisms of Q(i, 2).
3. Let r be a root of x6 + x3 + 1. Determine all the Q-homomorphisms Φ :
Q(r) → C.
(Hint: x6 + x3 + 1 is a factor of x9 − 1.)
4. Determine all the Z2 -automorphisms of the splitting extension of x3 +x2 +1 ∈
Z2 [x].
5. Show that an algebraic extension L of K is a normal extension if and only if
every irreducible polynomial of K[x] factors in L[x] into a product of irreducible
factors where all the factors have the same degree.
354
7.5
CHAPTER 7. GALOIS THEORY
Separability
As it will be clear in the sequel, the K-automorphisms of an extension L ⊃ K
play a fundamental role in Galois Theory. As we have seen in the previous
section, the precise counting of the number of K-automorphisms depend
on the number of distinct roots of certain polynomials. We will now study
when a given polynomial has multiple roots and we will establish a criterion
for its existence.
Any polynomial p(x) ∈ K[x] splits into a product of linear terms in
a
a
K [x]. Hence, if r ∈ K is a root of p(x), we can define its multiplicity to
a
be the largest integer k such that (x − r)k | p(x) in K [x]. The root is called
simple if k = 1, and multiple if k > 1.
The criterion for the existence of multiple roots will require a notion of
derivative of a polynomial. When the polynomial can be viewed as real or
complex valued function, this is just the usual notion of derivative, but in
general this will be only a formal derivative.
Definition 7.5.1. Let K be a field. The formal derivative operator
of K[x] is the only map D : K[x] → K[x] which satisfies the following
properties:
(i) Linearity: D(p(x) + q(x)) = D(p(x)) + D(q(x)) and D(a · p(x)) =
a · D(p(x)).
(ii) Leibniz Rule: D(p(x)q(x)) = D(p(x))q(x) + p(x)D(q(x)).
(iii) Normalization: D(x) = 1.
One checks easily that there is exactly one operator satisfying these
properties. In fact, if p(x) = an xn + · · · + a1 x + a0 ∈ K[x], then by applying
(i), (ii) and (iii), we find:
D(p(x)) = nan xn−1 + · · · + 2a2 x + a1 .
(7.5.1)
We will often write p′ (x) instead of D(p(x)).
Theorem 7.5.2. Let p(x) ∈ K[x] be a monic polynomial of degree ≥ 1. The
a
roots of p(x) in K are simple if and only if gcd(p(x), p′ (x)) = 1.
a
Proof. If r ∈ K is a multiple root of p(x), then p(x) = (x − r)k q(x), where
a
k > 1 and q(x) ∈ K [x]. Diferenciating:
p′ (x) = (x − r)k q ′ (x) + k(x − r)k−1 q(x),
7.5. SEPARABILITY
355
so we see that (x − r)k−1 | p′ (x), and (x − r) is a common factor of p(x) and
p′ (x). We conclude that if gcd(p(x), p′ (x)) = 1, then p(x) has no multiple
roots.
a
Assume now that p(x) has no multiple roots. Then in K [x] we have
p(x) = (x−r1 )(x−r2 ) · · · (x−rn ), where the ri are all distinct. Differentiating
we obtain:
p′ (x) =
n
X
(x − r1 ) · · · (x − ri−1 )(x − ri+1 ) · · · (x − rn ),
i=1
so that (x − ri ) ∤ p′ (x), for all i = 1, . . . , n. Hence, gcd(p(x), p′ (x)) = 1.
Notice that to apply this criterion we do not need to know the ala
gebraic closure K or even a splitting field of p(x): the computation of
gcd(p(x), q(x)) is independent of the extension where one takes the coefficients of the polynomials p(x) and q(x).
Definition 7.5.3. One calls p(x) ∈ K[x] a separable polynomial if each
of its irreducible factors has no multiple roots. A perfect field is a field
K where every polynomial in K[x] is separable.
The criterions above leads immediately to:
Corollary 7.5.4. Every field of characteristic zero is perfect.
Proof. Let p(x) ∈ K[x] be a monic irreducible polynomial with multiple
roots. Then gcd(p(x), p′ (x)) 6= 1, so we must have p(x) | p′ (x). Since
deg p′ (x) < deg p(x), it follows that p′ (x) = 0. Since the characteristic of K
is assumed to be zero, formula (7.5.1) for the derivative shows that p(x) ∈ K,
a contradiction.
The proof fails in characteristic p, since then the condition q ′ (x) = 0 only
implies that q(x) = a0 + a1 xp + a2 x2p + . . . . One can then find examples of
non-separable polynomials in characteristic p 6= 0 (see the exercises). When
p 6= 0 the study of separability is based on the following simple lemma:
Lemma 7.5.5. If K has characteristic p 6= 0 and a ∈ K, then the polynomial xp − a is either irreducible or takes the form (x − b)p , b ∈ K.
Proof. Suppose that xp − a = g(x)h(x), where g(x) and h(x) are monic
a
polynomials. Let b ∈ K be a root of xp − a. Then
p
X
(pk )xk (−b)p−k = xp − bp = xp − a,
(x − b) =
p
k=0
356
CHAPTER 7. GALOIS THEORY
since for 0 < k < p the coefficients pk are all divisible by p. Hence, g(x) =
(x − b)k and we must have bk ∈ K. Since gcd(k, p) = 1, there exist r, s ∈ Z
such that rk + sp = 1, and it follows that b = brk+sp = (bk )r + (bp )s ∈ K.
This shows that xp − a = (x − b)p , with b ∈ K.
Theorem 7.5.6. A field K of characteristic p 6= 0 is perfect if and only if
K = K p , where K p denotes the subfield of K formed by all the powers ap ,
with a ∈ K.
Proof. Let a ∈ K − K p . By Lemma 7.5.5, xp − a is irreducible and since
(xp − a)′ = pxp−1 = 0, this polynomial is not separable. Hence if K 6= K p
then K is not perfect.
Conversely, assume that K is not perfect, so there exists q(x) ∈ K[x]
irreducible and non-separable. Since q ′ (x) = 0, we have q(x) = a0 + a1 xp +
a2 x2p + . . . . Note that at least one ai 6∈ K p , for if aj = bj p with bj ∈ K for
all j, then q(x) = (b0 + b1 x + b2 x2 + . . . )p , which contradicts the assumption
that q(x) is irreducible. Hence, K 6= K p .
If K is a field of characteristic p 6= 0, the monomorphism φ : K → K,
a→
7 ap , is called the Frobenius endomorphism. The previous theorem
states that K is perfect if and only if the Frobenius endomorphism is an
automorphism.
Corollary 7.5.7. If K is finite of characteristic p 6= 0, then K is a perfect.
Proof. If K is finite, the Frobenius endomorphism is surjective.
In most examples that we will present in our discussion of Galois Theory
the ground field will be a perfect field. When this is not the case, in order
for Galois Theory to work, we need the following extra assumption:
Definition 7.5.8. Let L be an extension of K.
(i) u ∈ L is called separable element over K if u is algebraic over K,
and the minimal polynomial of u is separable.
(ii) L is called a separable extension over K if all elements of L are
separable over K.
Given an algebraic extension L ⊃ K one defines its separable degree,
denoted [L : K]s , to be the cardinality of the set of K-homomorphisms from
a
L to the algebraic closure K :
a
[L : K]s := #{φ : L → K : φ is a K-homomorphism}.
7.5. SEPARABILITY
357
Proposition 7.5.9 (Properties of the separable degree). Let K be a field.
(i) If M ⊃ L ⊃ K are algebraic extensions then
[M : K]s = [M : L]s · [L : K]s .
(ii) If L is an extension of finite dimension over K, then [L : K]s ≤
[L : K], with equality precisely when L is separable over K.
(iii) If L = K(u1 , . . . , um ), then L is separable over K if and only if
u1 , . . . , um are separable over K.
Proof.
a
(i) If φ : M → K is a K-homomorphism, then φ is an extension of
a
the K-homomorphism φ|L : L → K . Hence, it is enough to show that
a
for a K-homomorphism ψ : L → K its extensions to a K-homomorphism
a
φ : M → K are in bijection with the L-homomorphisms τ : M → L̄a .
a
a
In fact, we can extend ψ to an isomorphism ψ̃ : L → K , hence the
a
correspondence that to a L-homomorphism τ : M → L associates the
extension φ = ψ̃ ◦ τ is a bijection: the inverse associates to an extension
a
φ : M → K the L-homomorphism τ = ψ̃ −1 ◦ φ.
(ii) By (i), it is enough to show that [K(u) : K]s ≤ [K(u) : K] for any
algebraic element u over K, with equality if and only if K(u) is separable
over K. Since [K(u) : K]s represents the number of K-homomorphisms
φ : K(u) → K̄ a , it equals the number of distinct roots of the minimal
polynomial of u over K. Hence, [K(u) : K]s ≤ [K(u) : K], with equality if
and only if u is separable over K. But u is separable over K if and only if
K(u) is a separable extension over K: if u′ ∈ K(u) is not separable, then
[K(u) : K]s = [K(u) : K(u′ )]s [K(u′ ) : K]s
< [K(u) : K(u′ )][K(u′ ) : K] = [K(u) : K].
(iii) If the ui are separable over K, then ui is separable over K(u1 , . . . , ui−1 ).
From (i) and (ii), we conclude that
[K(u1 , . . . , um ) : K]s = [K(u1 , . . . , um ) : K],
so K(u1 , . . . , um ) is separable over K.
If K is perfect, then obviously all algebraic extensions of K are separable.
In particular, if K has characteristic 0 or K has characteristic p 6= 0 and
K = K p , then every algebraic extension of K is separable.
358
CHAPTER 7. GALOIS THEORY
Exercises.
1. Show that if K has characteristic 0 and p(x) ∈ K[x] is monic, then q(x) =
−1
p(x)[gcd(p(x), p′ (x))] is a polynomial whose roots are all simple, and that
these are precisely the roots of p(x).
2. Show that if K is a field with characteristic 6= 2, then any quadratic polynomial x2 + ax + b ∈ K[x] is separable.
3. Consider the field Zp (x) of fractions q(x)/r(x), where q(x) and r(x) 6= 0 are
polynomials over Zp .
(a) Show that Zp (x) has characteristic p.
(b) Show that the element x ∈ Zp (x) is not a power of degree p, i.e., there is
no b(x) ∈ Zp (x) such that x = b(x)p .
(c) Given an example of a polynomial over Zp (x) which is not separable.
4. Let K be a field of characteristic p, and q(x) ∈ K[x] an irreducible polynomial. Show that all the roots of q(x) have the same multiplicity pn , for some
integer n.
5. Let L be an extension of K, with [L : K] < ∞. Show that [L : K]s | [L : K].
(Note: One calls [L : K]i = [L : K]/[L : K]s the inseparable degree of L
over K.)
6. Let M ⊃ L ⊃ K be succesive extensions. Show that:
(i) If M is separable over K, then M is separable over L and L is separable
over K;
(ii) If M is separable over L and L is separable over K, then M is separable
over K.
7.6
The Galois Group
As we have already mentioned, Galois’s great insight was to replace questions
about field extensions by questions in group theory. We are now ready to
introduce the relevant groups.
If L is an extension of K, obviously the K-automorphisms of L form
a group under composition: if φ1 and φ2 are K-automorphisms of L, then
φ1 ◦ φ2 is also a K-automorphism.
Definition 7.6.1. The group of K-automorphisms of an extension L of K
is called the Galois group of L over K and is denoted AutK (L).
7.6. THE GALOIS GROUP
359
The next examples show that this group can be of a very different nature,
depending on the extension.
Examples 7.6.2.
√
√
1. Let L = Q( 2). The element 2 has minimal polynomial x2 − 2. Any
Q-automorphism φ : L → L transforms roots of this polynomial into roots.
Hence, we have two automorphisms, namely the identity and
√
√
φ(a + b 2) = a − b 2.
√
We conclude that the Galois group AutQ (Q( 2) is isomorphic to Z2 .
√ √
2. Let L = Q( 2, 3). As in the previous example we see that
√ the
√ Q-automorphisms
of L are completely determined by its action on the set { 2, 3}. There are 4
possibilities: the identity and
√
√
√
√
φ1 ( 2) = − 2, φ1 ( 3) = 3;
√
√
φ2 ( 2) = 2,
√
√
φ2 ( 3) = − 3;
√
√
√
√
φ3 ( 2) = − 2, φ3 ( 3) = − 3.
In this case the Galois group is isomorphic to Z2 ⊕ Z2 .
3. Let K be a field of characteristic p such that K 6= K p . If a 6∈ K p , the
polynomial q(x) = xp − a is irreducible. Let L be a splitting extension of q(x).
In L we have q(x) = (x−r)p , so L = K(r). If φ : L → L is a K-automorphism,
then φ(r) = r and we conclude that φ=id. The Galois group AutK (L) is trivial.
4. Let L = K(x) be the field of fractions p(x)/q(x) with p(x), q(x) ∈ K[x] and
q(x) 6= 0. It is easy to see that the primitive elements of L are all of the form
t=
ax + b
,
cx + d
a, b, c, d ∈ K,
ad − bc 6= 0.
Any K-automorphism of L transforms primitive elements into primitive elements. Hence, a K-automorphism φ : L → L, transforms the element
p(x)/q(x) ∈ L in an element p(t)/q(t) ∈ L. We conclude that the Galois group
is isomorphic to P GL2 (K) := GL2 (K)/Z, where Z is the center of GL2 (K)
consisting of 2 × 2 matrices of the form
a 0
Z={
: a 6= 0}.
0 a
If K is infinite, this group is infinite.
Since we know now that all the splitting fields of a polynomial p(x) are
isomorphic, we can set:
360
CHAPTER 7. GALOIS THEORY
Definition 7.6.3. The Galois group of p(x) ∈ K[x] (or of p(x) = 0) is
the Galois group of any splitting field of p(x) over K.
It is natural to identify the Galois group of an equation p(x) = 0 with the
subgroup of group of permutations of the roots of p(x)7 . If L is a splitting
field of p(x), and S = {r1 , . . . , rn } is the set of distinct roots of p(x), then
L = K(r1 , . . . , rn ). If φ ∈ AutK (L) is an element of the Galois group of
p(x), then φ transforms roots of p(x) into roots, so it induces a permutation
of S. On the other hand, if we know the effect of φ on the roots of p(x), then
we know how φ transforms any element of L = K(r1 , . . . , rn ). Hence, the
map φ 7→ φ|S is a monomorphism AutK (L) → Sn . This shows that we can
identify AutK (L) with a subgroup of the group of permutations the roots.
In general, as it was already clear in the examples
above, AutK (L) ( Sn ,
PSfrag replaements
even when p(x) is irreducible.
Example 7.6.4.
Let L ⊂ C be a splitting extension of polynomial p(x) = x6 − 2 ∈ Q[x].
r2
r1
1 +pr
r
hmi
e
e~
rZ
r3
r4
h1 i =
h2i6
h4 i
h12i
h6 i
h3 i
hmd(m; n)i
hm i
hni
hmm(m; n)i
r5
Figure 7.6.1: Roots of x6 − 2 = 0.
This polynomial is irreducible with roots rk =
that, for example
r3 + r6 = 0.
√
2kπi
6
2e 6 , k = 1, . . . , 6. Notice
7
It was precisely in this way that Galois thought of this group, much before the abstract
notion of group was discovered!
7.6. THE GALOIS GROUP
361
Since r3 + r1 6= 0, there is no automorphism in the Galois group corresponding
to the transposition (16). From the figure below, note also that
(r1 + r5 )6 = r6 6 = 2,
hence there are no automorphisms of the Galois group corresponding to the
permutations (13)(56) and (16)(35). One can exclude many other elements of
S6 in this way. In facto, we will see shortly that |G| = 12.
The computation of the Galois group of an equation p(x) = 0 or of an
extension maybe a very hard task. However, we have already developed the
tools to find its order.
Theorem 7.6.5. Let L be an extension of finite dimension over K, and
G = AutK (L) its Galois group. Then |G| ≤ [L : K]. If L a normal,
separable, extension of K then |G| = [L : K].
Proof. We can assume that L ⊂ K̄ a . If φ ∈ G, then we obtain a Khomomorphism φ : L → K̄ a . The number of such homomorphisms is [L :
K]s ≤ [L : K]. Hence |G| ≤ [L : K].
If L is normal, then every K-homomorphism ψ : L → K̄ a is in fact an
automorphism of L, by Theorem 7.3.9. If L is separable, then [L : K]s =
[L : K], by Proposition 7.5.9. Hence, if L is normal and separable over K,
then |G| = [L : K].
For a separable polynomial we conclude that:
Corollary 7.6.6. If p(x) is a separable polynomial over K with Galois group
G, and L is a splitting extension of p(x), then |G| = [L : K].
Examples 7.6.7.
√
4
2) is a splitting extension of the polynomial√x4 − 2 ∈
1. The extension Q(i,
√
4
Q[x]. Since [Q(i, 2) : Q(i)] = 4√and [Q(i) : Q] = 2, we have [Q(i, 4 2) : Q] =
8, and √
the Galois group of Q(i, 4 2) has order 8. We have Q-automorphisms
of Q(i, 4 2) defined by (check this!)
σ(i) = −i,
√
√
4
4
σ( 2) = 2,
τ (i) = i,
√
√
4
4
τ ( 2) = i 2.
These automorphisms have orders 2 and 4, respectively, and satisfy the relation
τ σ = στ 3 . It is then easy to see that
√
4
AutQ Q(i, 2) = {1, τ, τ 2 , τ 3 , σ, στ, στ 2 , στ 3 } ≡ G.
362
CHAPTER 7. GALOIS THEORY
This group is isomorphic to the dihedral group D4 . In terms of the roots rk =
πk
e 2 i , these automorphisms correspond to the permutations
σ = (13),
τ = (1234).
2. The Galois √
group G of the polynomial p(x)
= x6 − 2 over Q has order order
√
2π
2π
6
6
i
12, since [Q( 2, e 3 ) : Q] = 12 and Q( 2, e 3 i ) is a splitting extension of
p(x). We leave it as an exercise to check that G ≃ D6 and to determine its
representation as a permutation group of the roots.
The previous examples suggest that one should be able to compute the
Galois group of a polynomial xn − a over a field K of characteristic 0. This
depends on the ability to compute the Galois group of xn − 1.
Definition 7.6.8. Let K be a field. A splitting extension of the polynomial
xn − 1 is called the cyclotomic field of order n over K.
The Galois group of a cyclotomic field is much simple in the case of
characteristic zero.
Proposition 7.6.9. If K has characteristic 0, the Galois group of the cyclotomic field of order n is isomorphic to a subgroup of Z∗n .
Proof. Let L be a splitting extension of xn − 1 over K. Since K has characteristic zero, and (xn − 1)′ = nxn−1 6= 0, the roots of xn − 1 are all simple
roots. We conclude that the set of roots U = {r ∈ L : r n − 1 = 0} is isomorphic to Zn (see Exercise 6.6.2.2). On the other hand, if φ ∈ AutK (L) ≡ G,
then φ|U is an automorphism of U and this restriction determines completely
φ. Hence G is isomorphic to a subgroup of Aut(U ) ≃ Aut(Zn ) ≃ Z∗n , the
group of units of the ring Zn .
The previous result shows that in characteristic 0 the Galois group of
a cyclotomic field is abelian. In general, there is nothing else we can say
about the Galois group: for example, if K contains the roots of xn − 1 = 0,
then obviously the cyclotomic field of order n coincides with K, so its Galois
group is trivial. However, when K = Q, the proof above shows that:
Corollary 7.6.10. The Galois group of the cyclotomic field of order n over
Q is Z∗n . In particular, its order is given by the Euler function:
| AutQ (Q(e
2iπ
n
))| = ϕ(n) = n
k Y
i=1
1
1−
pi
where p1 , . . . , pn are the distinct prime factors of n.
,
7.6. THE GALOIS GROUP
363
2iπ
Proof. Let ζ = e n , so U = {ζ, ζ 2 , . . . , ζ n } ≃ Zn is set of n-th roots of the
unit. In this case the map AutQ (ζ) → Aut(Zn ), φ 7→ φ|U is an isomorphism,
so the result follows.
Once one has information about the cyclotomic fields, one can find the
Galois group of an equation of the form xn − a = 0. For example, one has:
Proposition 7.6.11. If K has characteristic 0 and contains the roots of
xn − 1 = 0, then the Galois group of xn − a over K is cyclic of order a
divisor of n.
Proof. Let L be a splitting extension of xn − a over K, and denote again by
U = {z ∈ L : z n − 1 = 0} the set of roots of xn − 1. If r ∈ L is a root of
xn − a, then {zr : z ∈ U } is the set of n roots of xn − a in L. Hence, we
have L = K(r). If φ1 , φ2 ∈ AutK (L) ≡ G, then φ1 (r) = z1 r, φ2 (r) = z2 r,
for some z1 , z2 ∈ U , and φ1 ◦ φ2 (r) = z1 z2 r. Hence, the map φ 7→ z is a
monomorphism form G into the cyclic group U ≃ Zn and we conclude that
G is isomorphic to a subgroup of Zn .
Exercises.
√ √ √
1. Determine the Galois group of the extension Q( 2, 3, 5) over Q.
2. Let L = K(x) be the field of fractions of K[x]. Show that the primitive
elements of L take the form
ax + b
,
a, b, c, d ∈ K, ad − bc 6= 0
t=
cx + d
(Hint: If t = p(x)/q(x), where gcd(p(x), q(x) = 1, define the degree of t to
be the maximum of the degrees of p(x) and q(x). Show that p(w) − yq(w)
is irreducible in K[w, y], hence also in K(y)[w], and that x is algebraic over
K(t) with minimal polynomial a multiple of p(w) − tq(w). Conclude that
[K(x) : K(t)] = 1 (or that K(x) = K(t)) if and only if the degree of t is 1.)
3. Determine the Galois group of x3 − 2 over Q and over Z2 .
4. Find the Galois group of the polynomial xn − 2 ∈ Q[x] and represent it as a
group of permutations of the roots, for n = 4, 5, 6, 7.
5. Determine the Galois group of the polynomial p(x) = 2x3 + 3x2 + 6x + 6 ∈
Q[x].
6. Show that the Galois group of p(x) = x3 + x2 − 2x − 1 ∈ Q[x] is isomorphic
to the alternating group A3 .
(Hint: If r is a root of p(x), then r2 − 2 is also a root.)
364
7.7
CHAPTER 7. GALOIS THEORY
The Galois Correspondence
Finally we are now able to explain how Galois Theory allows one to replace a
problem concerning (extensions of fields of) polynomials by a simpler problem in Group Theory. The correspondence, which goes back to Galois, allows
one to associate to an intermediate extension a subgroup of the Galois group,
as we now explain.
Let L ⊃ K be an extension of K and H ⊂ AutK (L) a subgroup of
the Galois group. We can think of H as a group of transformations of L, in
which case the fixed point set of H is an intermediate field L ⊃ Fix(H) ⊃ K.
Conversely, given an intermediate field L ⊃ K̃ ⊃ K then AutK̃ (L) is a
subgroup of the Galois group AutK (L).
The following properties are immediate for the definitions of these correspondences:
Proposition 7.7.1 (Properties of the Galois Correspondence). Let L be
an extension of K with Galois group G = AutK (L). Let K̃, K̃1 and K̃2
intermediate extensions, and H, H1 , H2 ⊂ G subgroups.
(i) If H1 ⊃ H2 , then Fix(H1 ) ⊂ Fix(H2 ).
(ii) If K̃1 ⊃ K̃2 , then AutK̃1 (L) ⊂ AutK̃2 (L).
(iii) Fix(AutK̃ (L)) ⊃ K̃.
(iv) AutFix(H) (L) ⊃ H.
Example 7.7.2.
√ √
Consider the extension L = Q( 2, 3) of K = Q. As we saw before, the
Galois group of this extension contain 4 elements:
AutK (L) = {id, φ1 , φ2 , φ3 } ≡ G.
Besides the trivial subgroup H0 = {id}, this group has the subgroups H1 =
{id, φ1 }, H2 = {id, φ2 } and H3 = {id, φ3 }. This lattice8 of subgroups can be
represented by the diagram:
G ❏❏
❏❏
ttt
❏❏❏
t
t
t
t
H1 ■
H2
H3
■■
✉
■■
✉✉
✉
■
✉✉
H0
8
A lattice is a partially ordered set in which every subset of two elements has a
supremum and a infimum. The set of subgroups of a group G, ordered by inclusion, is
a lattice. The set of intermediate extensions K ⊂ K̃ ⊂ L of a fixed extension L of K,
ordered by inclusion, is a also a lattice.
7.7. THE GALOIS CORRESPONDENCE
365
The field fixed by the Galois group
√ G√= AutQ (L) is the ground field: Fix(G) =
Q, while obviuosly Fix(H0 ) = Q( 2, 3). On the other hand, one checks easily
that
√
√
√
Fix(H2 ) = Q( 2),
Fix(H3 ) = Q( 6).
Fix(H1 ) = Q( 3),
Hence, the lattice of intermediate extensions is given by the diagram:
♥♥ Q PPPPP
PPP
♥♥♥
♥
♥
PP
♥
♥
♥
√
√
√
Q( 2)
Q( 6)
Q( 3)
PPP
♥
PPP
♥♥♥
♥
P
♥
√ √ ♥
Q( 2, 3)
As in the previous example, extensions with a rich group of automorphisms so that the correspondences above between intermediate extensions
and subgroups are inverse to each other (so in properties (iii) and (iv), inclusions can be replaced by equalities) are specially important and deserve
a special name:
Definition 7.7.3. A finite dimensional extension L ⊃ K is called a Galois
extension of K if Fix(AutK (L)) = K.
We have the following alternative characterizations of a Galois extension.
Proposition 7.7.4. The following statements are equivalent:
(i) L is a Galois extension of K;
(ii) L is a splitting extension of a separable polynomial over K;
(iii) L is a finite dimensional extension, normal and separable over K.
If any of these equivalent conditions hold, then:
(7.7.1)
| AutK (L)| = [L : K].
Proof. Note that (7.7.1) follows from Theorem 7.6.5. So it remains to prove
the implications (i) ⇒ (ii) ⇒ (iii) ⇒ (i).
(i) ⇒ (ii): Since [L : K] < ∞, there exist r1 , . . . , rn ∈ L, algebraic over
K, such that L = K(r1 , . . . , rn ). Let pi (x) be the minimal polynomial of
each ri in K[x], and Oi = {φ(ri ) : φ ∈ AutK (L)} the orbit of ri under the
366
CHAPTER 7. GALOIS THEORY
action of the Galois group. This set is finite and consists of roots of pi (x).
The polynomial
Y
qi (x) ≡
(x − r) ∈ L[x]
r∈Oi
divides pi (x) and is separable, since all its roots are distinct. If φ ∈ AutK (L),
then
Y
Y
(x − φ(r)) =
(x − r) = qi (x),
qiφ (x) =
r∈Oi
r∈Oi
so the coefficients of qi (x) belong to Fix(AutK (L)) = K. We conclude that
qi (x) ∈ K[x], so we must have pi (x) = qi (x). It follows
Q that pi (x) is separable
and all its roots are in L. The polynomial p(x) = i pi (x) is separable and
L is a splitting extension of p(x) over K.
(ii) ⇒ (iii): If L is a splitting extension of a polynomial p(x) over K,
then, by Corollary 7.3.10, L is a normal extension of finite dimension over
K. If p(x) is a separable polynomial, then L = K(r1 , . . . , rm ), where the
r1 , . . . , rm are separable. By Proposition 7.5.9, L is a separable extension.
(iii) ⇒ (i): Let K̃ = Fix(AutK (L)). From the general properties of
the Galois correspondence, K̃ is an intermediate field L ⊃ K̃ ⊃ K. Let
r1 ∈ K̃ − K and denote by p(x) the minimal polynomial of r1 over K. Since
L is normal and separable,
Qm p(x) splits in L[x] into a product of distinct
linear factors: p(x) = i=1 (x − ri ), with m > 1 and the ri all distinct. If
a
φ : K(r1 ) → K is the K-monomorphism mapping r1 to r2 , by Theorem
a
7.4.6, we can extend φ to a monomorphism Φ : L → K . By Theorem 7.3.9,
Φ is actually a K-automorphism of L. It follows that Φ is an element of
AutK (L) which does not fix r1 . This contradicts r1 ∈ K̃ = Fix(AutK (L)).
Hence, we must have K = K̃ = Fix(AutK (L)).
If L fails to be a Galois extension of K, then | AutK (L)| ≤ [L : K]. On
the other hand, we have the following general result:
Lemma 7.7.5 (Artin9 ). Let G be a finite group of automorphisms of a field
L and K = Fix(G). Then
(7.7.2)
[L : K] ≤ |G|.
Proof. Let G = {φ1 = id, φ2 , . . . , φm }. We need to show that any n elements
of L with n > m are linearly dependent over K.
9
Emil Artin (1898-1962) was among the greatest algebraists of the do 20th Century.
The modern approach to Galois Theory was formulated by him, together with Irving
Kaplanski (1917-2006). Artin and Kaplanski where both members of the Bourbaki group.
7.7. THE GALOIS CORRESPONDENCE
367
Let u1 , . . . , un ∈ L and consider the homogeneous linear system:
n
X
ai φj (ui ) = 0,
(j = 1, . . . , m).
i=1
Since there are more variables than equations, elementary Linear Algebra
shows that it must have a non-trivial solution (a1 , . . . , an ) ∈ Ln . After
possibly some reordering, we can choose the solution to be of the form
(a1 , . . . , as , 0, . . . , 0), where ai 6= 0 for i = 1, . . . , s and s ≥ 2 is minimal (i.e.,
there is no non-trivial solution with a smaller s). Dividing by a1 , we can
additionally assume that a1 = 1. We claim that ai ∈ K = Fix(G), for all i.
In fact, if some ai 6∈ K = Fix(G), there exists φ ∈ G such that φ(ai ) 6= ai .
Then the vector (1, φ(a2 ), . . . , φ(as ), 0, . . . , 0) is also a solution of the system
and the difference vector (0, a2 − φ(a2 ), . . . , as − φ(as ), 0, . . . , 0) is a nontrivial solution with more zeros than (1, a2 , . . . , as , 0, . . . , 0), contradicting
the assumption that s was minimal.
We conclude that ai ∈ K, for all i. When j = 1 the system gives the
linear dependence relation
n
X
ai ui = 0.
i=1
Finally we are able to state and prove the key result of Galois Theory.
Theorem 7.7.6 (Fundamental Theorem of Galois Theory). Let L be a
Galois extension of K. There exists a bijective correspondence between the
intermediate extensions K ⊂ K̂ ⊂ L and the subgroups H of the Galois
group G ≡ AutK (L), given by:
K̂ 7→ AutK̂ (L) ≡ H,
H 7→ Fix(H) ≡ K̂.
If one writes H ↔ K̂, this correspondence satisfies:
(i) If H1 ↔ K̂1 and H2 ↔ K̂2 , then H2 ⊂ H1 if and only if K̂2 ⊃ K̂1 . In
this case we have [H1 : H2 ] = [K̂2 : K̂1 ].
(ii) If H ↔ K̂, then H is a normal subgroup of G if and only if K̂ is a
normal extension of K. In this case, K̂ is a Galois extension with
Galois group G/H.
368
CHAPTER 7. GALOIS THEORY
Proof. First we check that the two assignments are inverse to each other:
Fix(AutK̂ (L)) = K̂: Let L ⊃ K̂ ⊃ K be an extension. By Proposition
7.7.4, L is a finite-dimensional extension, normal and separable over K, hence
also over K̂. But then L is a Galois extension of K̂, and we conclude that
K̂ = Fix(H) with H = AutK̂ (L).
AutFix(H) (L) = H: Let H ⊂ G ≡ AutK (L) and K̂ = Fix(H). By Artin’s
Lemma, |H| ≥ [L : Fix(H)] = [L : K̂]. On the other hand, L is a Galois
extension over K̂ and H ⊂ AutK̂ (L). Then by relation (7.7.1) we see that
|H| ≤ | AutK̂ (L)| = [L : K̂]. Hence, |H| = | AutK̂ (L)| and we conclude that
H = AutK̂ (L), with K̂ = Fix(H).
In order to check that (i) holds, notice that from Proposition 7.7.1 and
the fact that the correspondence is bijective, we have H2 ⊂ H1 if and only
if K̂2 ⊃ K̂1 . Since L is a Galois extension over both K̂1 and K̂2 , we find:
[K̂2 : K̂1 ] =
[L : K̂1 ]
[L : K̂2 ]
=
| AutK̂1 (L)|
| AutK̂2 (L)|
=
|H1 |
= [H1 : H2 ].
|H2 |
In order to check that (ii) holds, assume that H ↔ K̂. If φ ∈ G, then
the extension corresponding to φHφ−1 is φ(K̂). Hence, we have:
(a) If H is a normal subgroup of G, then for all φ ∈ G we have φ(K̂) ⊂ K̂.
The map φ 7→ φ|K̂ is a surjective homomorphism of G onto AutK (K̂)
with kernel H. Hence, AutK (K̂) ≃ G/H and
Fix(AutK (K̂)) = Fix(G/H) = Fix(G) = K.
We conclude that K̂ is a Galois extension with Galois group G/H.
(b) Conversely, assume that K̂ is a normal extension. By Theorem 7.3.9,
φ(K̂) = K̂, for every φ in the Galois group G. Since φHφ−1 ↔ φ(K̂),
we conclude that φHφ−1 = H. Hence, H is a normal subgroup of G.
Example 7.7.7.
√
We saw before that Q(i, 4 2) is a Galois extension over Q, with Galois group
√
4
AutQ Q(i, 2) = {1, τ, τ 2 , τ 3 , σ, στ, στ 2 , στ 3 } ≡ G,
√
where σ and τ are the Q-automorphisms of Q(i, 4 2) definided by
σ(i)
√ = −i,√
σ( 4 2) = 4 2,
τ (i)
√ = i, √
τ ( 4 2) = i 4 2.
7.7. THE GALOIS CORRESPONDENCE
369
This group has the following lattice of subgroups:
❥ G ❚❚❚❚❚
❚❚❚❚
❥❥❥❥
❥❥❥❥
2
3
{1, τ , στ, στ }
hτ i
{1, σ, τ 2 , στ 2 }
❙❙❙❙
❙
❙
❦
❦
❙
❦
❙❙❙❙
❙❙❙❙
❦❦❦
❦❦❦
❙❙
❦❦❦
❦❦❦❦
❦
2
3
2
hστ i ❱❱❱
hστ i
hτ i
hσi
✐✐ hστ i
❱❱❱❱
▲▲▲
✐
s
✐
s
✐
✐
❱❱❱❱
ss
✐✐✐✐
❱❱❱❱ ▲▲▲▲
sss ✐✐✐✐✐✐✐
❱❱❱❱ ▲▲▲
s
s
❱❱❱❱ ▲▲
ss ✐✐✐✐
❱❱❱
s
✐✐✐✐
{1}
The Galois correspondence yields a similar lattice of intermediate extensions
of Q:
✐✐✐ Q ◗◗◗◗◗
✐✐✐✐
◗◗◗
✐
✐
✐
✐✐
2
Q(ir ) ❚❚
Q(i)
Q(r2 )
❖❖❖
❚❚❚❚
❤❤❤
♥
❤
♥
❤
♥
❤
❖❖❖
❚❚❚❚
❤❤❤❤
♥♥♥
2
2
2
Q(r)
Q(i, r )
Q(ir , (1 + i)r)
Q(ir , (1 − i)r)
❦ Q(ir)
◆◆◆
❲❲❲❲❲
❦❦❦
✈✈
❲❲❲❲❲
❦
◆◆◆
❦
✈
❦
❲❲❲❲❲
❦❦
✈✈
❲❲❲❲❲ ◆◆◆◆◆
✈✈ ❦❦❦
❲❲❲❲❲ ◆
✈❦✈❦❦❦❦❦
✈
❲
Q(i, r)
√
where r = 4 2. The extensions in the first row are normal extensions, being
extensions of degree 2. They correspond to subgroups of index
√ 2, therefore
normal subgroups. Int the second row, only the extension Q(i, 2) is normal:
it corresponds to the center C(G) = {1, τ 2 }.
Exercises.
√ √ √
1. Determine the Galois correspondence for the extension Q( 2, 3, 5) ⊂ R
over Q.
2. Determine the Galois correspondence for the splitting field of the polynomial
x3 − 2 over Q and over Z2 .
3. Determine the Galois correspondence for the splitting field of the polynomial
(x3 − 2)(x2 − 3) ∈ Q[x].
4. Determine the Galois correspondence for the splitting field of the polynomial
x4 − 4x2 − 1 ∈ Q[x].
370
CHAPTER 7. GALOIS THEORY
5. Let p(x) ∈ Q[x] be a polynomial of degree 3 with Galois group G. If
r1 , r2 , r3 ∈ C denote the roots of p(x) and δ ≡ (r1 − r2 )(r1 − r3 )(r2 − r3 ),
show that:
(a) |G| = 1 if and only if the roots of p(x) belong to Q;
(b) |G| = 2 if and only if p(x) has exactly one rational root;
(c) |G| = 3 if and only if p(x) has no rational roots and δ ∈ Q;
(d) |G| = 6 if and only if p(x) has no rational roots and δ 6∈ Q.
6. If p(x) ∈ K[x] is a polynomial of degree n with roots r1 , . . . , rn , define the
discriminant of p(x) to be ∆ = δ 2 , where
δ=
Y
(ri − rj ).
i<j
Assuming that p(x) is separable and K has characteristic different from 2,
show that
(a) ∆ ∈ K;
(b) ∆ = 0 if and only if p(x) has a multiple root;
(c) ∆ is a perfect square in K if and only if the Galois group of p(x) is
contained in An .
7. If p(x) = x3 + px + q ∈ Q[x], show that the discriminant (see the previous
exercise) is ∆ = −4p3 − 27q 2 . What is the Galois group of x3 + 6x2 − 9x + 3 ?
8. Let L be a Galois extension of K, and L ⊃ K̂ ⊃ K an intermediate extension.
Let H ⊂ AutK (L) be the subgroup formed by the K-automorphisms which
map K̂ into itself. Show that H is the normalizer of AutK̂ (L) in AutK (L).
7.8
Applications
We will now see several applications of Galois Theory. Our first application
is to characterize all symmetric polynomials. Our second application is
to find a criterion to decide if a complex number is constructible or not,
completing our discussion in Section 7.2. Our third and final application is
to the original question that motivated Galois: we will prove Galois criterion
to decide if an algebraic equation is solvable by radicals.
7.8. APPLICATIONS
7.8.1
371
Symmetric Polynomials
Let x1 , . . . , xn be indeterminates. In the field of fractions K(x1 , . . . , xn )
consider the polynomial
n
Y
(x − xi ) = xn − s1 xn−1 + s2 xn−2 − · · · + (−1)n sn .
p(x) =
i=1
The coefficients si of this polynomial are certain polynomials in the indeterminates xi , called elementary symmetric polynomials,. They can be
explicit written as:
s1 =
X
xi ,
i
s2 =
X
xi xj ,
i<j
..
.
s n = x1 · · · xn .
We write their general formula in the form:
si =
X
j1 <···<ji
xj 1 · · · xj i ,
i = 1, . . . , n.
The use of the term “symmetric” is justified by the fact that the polynomials remain the same under any permutation of the indices j1 , . . . , jn .
More generally, we can consider symmetric rational expressions in the variables xi , which can be formalized as follows. Every permutation π ∈ Sn
determines a K-automorphism φπ of K(x1 , . . . , xn ) which transforms xi in
xπ(i) . A symmetric rational expression is any element of K(x1 , . . . , xn )
which is fixed by the group G ≡ {φπ : π ∈ Sn } ≃ Sn . Notice that
G ⊂ AutK (K(x1 , . . . , xn )) and, under the Galois correspondence, the symmetric rational expressions are precisely the elements of the intermediate
extension K ⊂ Fix(G) ⊂ K(x1 , . . . , xn ).
Examples 7.8.1.
Q
1. The polynomial p(x) = ni=1 (x − xi ) is clearly invariant under the action of
the φπ ( i.e., p(x) = pφπ (x)), hence its coefficients si are fixed by φπ and we
have si ∈ Fix(G). In other words, the si are symmetric rational expressions.
372
CHAPTER 7. GALOIS THEORY
2. The fractions
x1 x2
x2 x3
x3 x1
+
+
,
x3
x1
x2
1
1
1
+ 2 + 2,
x1 2
x2
x3
are symmetric rational expressions.
We can use the Galois correspondence to show that any symmetric rational expression can be expressed in terms of the elementary symmetric
polynomials s1 , . . . , sn . More precisely, we have:
Theorem 7.8.2. K(x1 , . . . , xn ) is a Galois extension of K(s1 , . . . , sn ) with
Galois group G = {φπ : π ∈ Sn } ≃ Sn . In particular, Fix(G) = K(s1 , . . . , sn ).
Proof. As we observed above, we have K(s1 , . . . , sn ) ⊂ Fix(G) hence,
G ⊂ AutK(s1,...,sn ) K(x1 , . . . , xn ).
Note that K(x
Q 1 , . . . , xn ) = K(s1 , . . . , sn , x1 , . . . , xn ) is a splitting extension
of p(x) = ni=1 (x − xi ) ∈ K(s1 , . . . , sn )[x]. On the other hand, if φ ∈
AutK(s1 ,...,sn) K(x1 , . . . , xn ), then φ permutes the roots of p(x), hence
AutK(s1 ,...,sn ) K(x1 , . . . , xn ) ⊂ G.
We conclude that the Galois group of K(x1 , . . . , xn ) over K(s1 , . . . , sn ) is
G ≃ Sn . Since K(x1 , . . . , xn ) is a normal and separable extension, it is a
Galois extension and Fix(G) = K(s1 , . . . , sn ).
Example 7.8.3.
The polynomial x1 3 + x2 3 + x3 3 ∈ Q(x1 , x2 , x3 ) is a rational symmetric expression, hence it can be expressed as a rational fraction in s1 = x1 + x2 + x3 ,
s2 = x1 x2 + x1 x3 + x2 x3 and s3 = x1 x2 x3 . Elementary degree considerations
show that there exist rational numbers a1 , a2 and a3 , such that
x1 3 + x2 3 + x3 3 =
= a1 (x1 + x2 + x3 )3 + a2 (x1 + x2 + x3 )(x1 x2 + x1 x3 + x2 x3 ) + a3 x1 x2 x3 .
By choosing convenient values of x1 , x2 and x3 , we can find the coefficients:
3 = 27a1 + 9a2 + a3
2 = 8a1 + 2a2
(if we set x1 = x2 = x3 = 1),
(if we set x1 = x2 = 1, x3 = 0),
1 = a1
(if we set x1 = 1, x2 = x3 = 0).
This linear system has the solution a1 = 1, a2 = −3, a3 = 3. Hence, we
conclude that:
x1 3 + x2 3 + x3 3 = (x1 + x2 + x3 )3 − 3(x1 + x2 + x3 )(x1 x2 + x1 x3 + x2 x3 )
+ 3x1 x2 x3 .
7.8. APPLICATIONS
7.8.2
373
Constructible Numbers
We have study in Section 7.2 the subfield C ⊂ C formed by the numbers that
can be constructed using compass and straightedge. Using Galois Theory
we obtain the following characterization of the constructible numbers:
Theorem 7.8.4. A complex number z ∈ C is constructible if and only if
n
z is algebraic over Q, and the normal closure Q(z) has dimension 2s , for
some s ∈ N.
Proof. Recall that a complex number z ∈ C is constructible if and only if z
belongs to a subfield Q(u1 , . . . , ur ), with u1 2 ∈ Q and um+1 2 ∈ Q(u1 , . . . , um ),
for m = 1, . . . , r − 1. It follows that if z is a constructible, then ]
n
n
Q(z) ⊂ Q(u1 , . . . , ur ) .
n
Letting G = AutQ (Q(u1 , . . . , ur ) ), Theorem 7.3.9 shows that the field
n
Q(u1 , . . . , ur ) is generated by the images φ(Q(u1 , . . . , ur )), with φ ∈ G.
Hence, if G = {φ1 , . . . , φn } we find:
n
Q(u1 , . . . , ur ) = Q(φ1 (u1 ), . . . , φ1 (ur ), . . . , φn (u1 ), . . . , φn (ur )).
n
Since φj (ur )2 = φj (u2r ), we conclude that Q(u1 , . . . , ur ) is an extension of
the form Q(ũ1 , . . . , ũl ) with ũ21 ∈ Q and ũ2m+1 ∈ Q(ũ1 , . . . , ũm ), for m =
1, . . . , l − 1. We can now compute the algebraic degrees as follows:
n
[Q(u1 , . . . , ur ) : Q] =
l−1
Y
[Q(ũ1 , . . . , ũm+1 ) : Q(ũ1 , . . . , ũm )] = 2t .
m=0
It follows that
n
[Q(z) : Q] =
n
[Q(u1 , . . . , ur ) : Q]
n
n
[Q(u1 , . . . , ur ) : Q(z) ]
= 2s ,
for some s ∈ N.
n
n
Conversely, assume that [Q(z) : Q] = 2s . Then Q(z) is a Galois
extension of Q whose Galois group has order |G| = 2s . By the results of
Chapter 5 on p-groups, we know that there exists a normal tower
G = Hs ⊲ Hs−1 ⊲ · · · ⊲ H1 ⊲ H0 = {e},
where [Hm : Hm−1 ] = 2. Galois correspondence, then gives intermediate
extensions
n
Q(z) = Ks ⊃ Ks−1 ⊃ · · · ⊃ K1 ⊃ Q,
374
CHAPTER 7. GALOIS THEORY
where [Km : Km−1 ] = 2. Hence, for each 1 ≤ m ≤ s, there exist um ∈ Km
n
such that Km+1 = Km (um+1 ) and um+1 2 ∈ Km . We conclude that Q(z) =
Q(u1 , . . . , us ), with um+1 2 ∈ Q(u1 , . . . , um ) for 0 ≤ m ≤ s − 1. Hence, z is
constructible.
Galois Theory gives more than a criterion to decide if a number is constructible: for constructible numbers it gives a method to construct the
number! The is illustrated in the next example.
Example 7.8.5.
A regular polygon with 5 sides can be constructed with rule and compass: if the
polygon is inscribed in the unit circle, then this is equivalent to the statement
2πi
that na primitive root of x5 − 1 = 0 is constructible. In fact, if r = e 5 , then
2πi
2
Q(r) = Q(e 5 ), and this extension has algebraic degree 4 = 2 .
In order to give an explicit description of a regular pentagon, we start by observing that (x5 − 1) = (x − 1)(x4 + x3 + x2 + x + 1), hence r is a root of
an irreducible polynomial of 4th degree. The extension L = Q(r) is a splitting
extension of this polynomial, hence is a Galois extension. The Galois group
has order |G| = 22 . The Q-automorphism defined by φ(r) = r2 is an element
2πik
of the Galois group. If we label the roots as zk = e 5 (k = 1, . . . , 4), then
φ(z1 ) = z2 ,
φ(z2 ) = z4 ,
φ(z4 ) = z3 ,
φ(z3 ) = z1 ,
so we see that φ corresponds to the permutation (1243). this element has order
4, so the Galois group is G = {I, φ, φ2 , φ3 }. In terms of permutations of the
roots:
G = {I, (1243), (14)(23), (3241)}.
This group admits the tower of p-subgroups:
G ⊃ H ⊃ {I},
where H = {I, (14)(23)}. Since [G : H] = 2, to H corresponds an extension K̂
of degree 2 over Q. To determine this extension, we observe that an element
u ∈ L is of the form u = a1 z1 + a2 z2 + a3 z3 + a4 z4 , where
φ2 (u) = a1 z4 + a2 z3 + a3 z2 + a4 z1 ,
and u is fixed by (14)(23) if and only if a1 = a4 and a2 = a3 . Hence, K̂ =
Q(ω1 , ω2 ), where ω1 = z1 + z4 and ω2 = z2 + z3 . It is easy to check that
ω1 + ω2 = z1 + z4 + z2 + z3 = r + r2 + r3 + r4 = −1
ω1 ω2 = (r + r4 )(r2 + r3 ) = r + r2 + r3 + r4 = −1,
so we have:
(x − ω1 )(x − ω2 ) = x2 + x − 1.
7.8. APPLICATIONS
375
Solving this equation, we see that
√
−1 + 5
ω1 =
,
2
−1 −
ω2 =
2
√
5
(these values are 2 Re(z1 ) = 2 Re(z4 ) and 2 Re(z2 ) = 2 Re(z3 )).
We can now explain the traditional way of constructing a pentagon. In a
unitary circle, we mark the four points A = (1, 0), B = (0, 1), C = (−1, 0)
and D = (0, −1). Splitting the segment √
OD into two equal parts, we obtain the
compass in E we
point E. The segment EA has length 5/2. Centering the
√
obtain the arc AF . The segment OF has length ω1 = −1+2 5 , and we can mark
the point G in the horizontal axis, so that OG = OF
2 . The point G corresponds
to the x-coordinate of z1 (it is slightly simple to observe that AF has the same
length as the side of a pentagon).
B
z2
z1
F
A
C
O
z3
G
E
D
z4
Figure 7.8.1: Construction of a regular pentagon.
The Greeks knew how to construct regular polygons with 3, 5 and 15
sides and, given a regular polygon with n sides, how to obtain one with 2n
sides (obviously by bisecting the sides). Gauss, when he was only 19 years
old, and before Galois Theory was discovered, found a way to construct a
regular polygon with 17 sides! This discovery made him choose Mathematics
over he study of Languages. In fact, he was so proud of this discovery, that
later in his life he asked that a regular polygon with 17 sides be engraved
in his tombstone. This never happened because the sculptor thought that a
polygon with some many sides could be confused with a circle.
Galois Theory shows that a regular polygon with n sides is constructible
if and only if n = 2r p1 · · · ps where the pi are Fermat primes. Using Galois
Theory, its is known how to construct regular polygons with 257 and 65 537
sides!
376
CHAPTER 7. GALOIS THEORY
7.8.3
Solving algebraic equations by radicals
We will now comeback to the opening discussion in this chapter, concerning
the criterion found by Galois which allows one to decide if an algebraic
equation is solvable by radicals. In order to simplify the discussion, we will
assumed the fields have characteristic 0.
Definition 7.8.6. Let p(x) ∈ K[x] be a monic polynomial. We say that
a equation p(x) = 0 is solvable by radicals if there exists an extension
L ⊃ K which contains a splitting field of p(x) and which is of the form:
L = Km+1 ⊃ · · · ⊃ K2 ⊃ K1 = K,
where Ki+1 = Ki (di ) and di ni ∈ Ki .
Notice the meaning of this definition: any root of p(x) belongs to L
and can be expressed starting from elements of K by a sequence of rational
operations and by finding n-th roots.
Theorem 7.8.7 (Galois Criterion). Let p(x) ∈ K[x] be a monic polynomial.
The equation p(x) = 0 is solvable by radicals if and only if its Galois group
is solvable.
Example 7.8.8.
The equation x5 − 4x + 2 = 0 is not solvable by radicals (in Q). This can be
seen as follows. By Eisenstein’s Criterion, the polynomial p(x) = x5 − 4x + 2
is irreducible over Q. It is easy to check that this polynomial has three real
distinct roots r1 , r2 , and r3 , and two complex conjugate roots r4 and r5 . Let
L = Q(r1 , . . . , r5 ) be the splitting field of p(x) and denote by G its Galois group.
Since [Q(r) : Q] = 5, for any r ∈ {r1 , . . . , r5 }, we have that 5 | [L : Q] = |G|,
the Sylow Theorems, show that the Galois group G ⊂ S5 contains an element
of order 5, i.e., a cycle (i1 , . . . , i5 ). On the other, complex conjugation a+ib 7→
a − ib restricted to L = Q(r1 , . . . , r5 ) gives an element of G of order 2, i.e.,
a transposition. We leave it as an exercise to check that these two elements
generate S5 . Hence, G = S5 , by Galois Criterion the equation is not solvable
by radicals.
Before we prove Galois criterion, we deduce the following corollary:
Corollary 7.8.9 (Theorem of Abel-Rufini). There is no general algebraic
solution by radicals for polynomial equations of degree five or higher.
7.8. APPLICATIONS
377
Proof. A more precise way of formulating this result is that the general
equation
xn − an−1 xn−1 + · · · + (−1)n a0 = 0,
is not solvable by radicals, for n ≥ 5. By “general equation” we mean
that a0 , . . . , an−1 are indeterminates. Hence, we consider the polynomial
p(x) = xn − an−1 xn−1 + · · · + (−1)n a0 over the field K(a0 , . . . , an−1 ) and we
need to show that the Galois group of p(x) over this field is not solvable.
Let L be the splitting field of p(x) over K(a0 , . . . , an−1 ), so that in L we
have the factorization:
p(x) = (x − r1 )(x − r2 ) . . . (x − rn ).
By comparison, we see that
an−1 =
X
ri ,
i
an−2 =
X
ri rj ,
i<j
..
.
a0 = r1 r2 . . . rn .
Hence, L = K(ao , . . . , an−1 )(r1 , . . . , rn ) = K(r1 , . . . , rn ).
Consider a set of indeterminates x1 , . . . , xn , and in the field K(x1 , . . . , xn )
consider the subfield of rational symmetric expressions (see Section 7.8.1):
this subfield is of the form K(s1 , . . . , sn ), where s1 , . . . , sn are the elementary
symmetricQpolynomials in the xi ’s and K(x1 , . . . , xn ) is a splitting extension
of q(x) = i (x − xi ) over K(s1 , . . . , sn ). We also saw that the Galois group
of this extension is Sn .
If there is an isomorphism K(r1 , . . . , rn ) ≃ K(x1 , . . . , xn ) which to the
subfield K(a0 , . . . , an−1 ) associates K(s1 , . . . , sn ), then the Galois group of
the general equation of degree n is Sn , which is not solvable for n ≥ 5, so
we are done. Let us establish the existence of this isomorphism.
Consider the K-homomorphism φ : K[a0 , . . . , an−1 ] → K[s1 , . . . , sn ],
associating ai 7→ sn−i : this homomorphism exists, because the a0 , . . . , an are
indeterminates. Similarly, we have a homomorphism ψ : K[x1 , . . . , xn ] →
K[r1 , . . . , rn ], and the following diagram commutes:
K[a0 , . . . , an ]
φ
_
K[r1 , . . . , rn ] o
ψ
/ K[s1 , . . . , sn ]
_
K[x1 , . . . , xn ]
378
CHAPTER 7. GALOIS THEORY
In fact, we have:

ψ(φ(ai )) = ψ(si ) = ψ 
X
j1 <···<ji

xj 1 · · · xj i  =
X
j1 <···<ji
rj1 · · · rji = ai .
The diagram shows that φ must be a monomorphism. Since φ is clearly
surjective, it follows that φ is an isomorphism. Extending this isomorphism
to the corresponding fields of fractions, we obtain an isomorphism of fields
φ̃ : K(a0 , . . . , an−1 ) → K(s1 , . . . , sn ).
This isomorphism associates to a polynomial p(x) ∈ K(a0 , . . . , an−1 )[x] the
polynomial pφ̃ (x) = q(x) ∈ K(s1 , . . . , sn )[x]. As we saw in Section 7.4, φ̃ extends to an isomorphism of the corresponding splitting fields K(r1 , . . . , rn ) ≃
K(x1 , . . . , xn ), as claimed.
The Abel-Rufini Theorem shows that there exists no general formula to
solve a polynomial equation of of degree n, when n ≥ 5. Still there are
equations which can be solved by radicals, e.g., x5 − 2 = 0. In fact, it could
happen that a general formula did not exist, but still every equation could
be solved by radicals. However, the example of x5 − 4x + 2 = 0 shows that
is not the case.
It remains to prove Galois Criterion. In this proof, the splitting fields
of the equations xn − a = 0 play an essential role. We saw before that the
Galois group of xn − a = 0 is cyclic if K contains all the roots of unit of
order n and is abelian when a = 1. In general, an extension L of K whose
Galois group is abelian (respectively, cyclic) is called an abelian extension
(respectively, cyclic extension) of K.
We need the following preliminary result:
Proposition 7.8.10. Let K be a field which contains the p roots of xp −1 = 0
(p a prime). If L is a cyclic extension cyclic of K with [L : K] = p, then
L = K(r), where r p ∈ K.
Proof. If u ∈ L − K, then L = K(u), since L ⊇ K(u) ) K and [K(u) :
K] | [L : K] = p. If AutK (L) = hφi and {z1 , . . . , zp } ⊂ K are the p-roots of
xp − 1 = 0, consider the elements
(7.8.1)
ri = u + φ(u)zi + φ2 (u)zi 2 + · · · + φp−1 (u)zi p−1 .
Since φ(ri ) = zi −1 ri , we have φ(ri p ) = ri p , and we conclude that ri p ∈ K.
We can always write u as a linear combination of the ri ’s by solving the
7.8. APPLICATIONS
379
linear system (7.8.1) for the unknowns u, φ(u), . . . , φp−1 (u) (this is possible
because the corresponding determinant is a Van der Monde determinant).
Hence, L = K(r1 , . . . , rp ), and for some k0 we must have rk0 6∈ K. Setting
r = rk0 , we have L = K(r), with r p ∈ K.
Proof of the Galois Criterion. Let p(x) ∈ K denote by G = AutK (L) its
Galois group. We will show both implications:
(i) If p(x) = 0 is solvable by radicals, then G is solvable: If p(x) = 0
is solvable by radicals there exists an extension L of K, which contains a
splitting extension of p(x), and which admits a tower of subfields
L = Kl+1 ⊃ · · · ⊃ K2 ⊃ K1 = K,
(7.8.2)
n
where Ki+1 = Ki (di ), with di ni = ai ∈ Ki . The normal closure L of
n
n
L is generated by the φ(L), with φ ∈ AutK (L ). Hence, if AutK (L ) =
{id, φ1 , . . . , φr }, we obtain
n
(7.8.3)
L = K(d1 , . . . , dl , φ1 (d1 ), . . . , φ1 (dl ), . . . , φr (d1 ), . . . , φr (dl )).
n
Let m = lcm(n1 , . . . , nl ). We can extend the tower (7.8.3) to L (z),
n
where z is a primitive root of unit of order m. Since L the splitting field of
n
a polynomial p(x), it follows that L (z) the splitting field of p(x)(xm − 1),
n
and we conclude that L (z) is normal. Reordering terms, we obtain the new
tower:
n
L (z) = Kl (z) ⊃ · · · ⊃ K̃3 = K2 (z) ⊃ K̃2 = K(z) ⊃ K̃1 = K.
This new tower satisfies K̃i+1 = K̃i (d˜i ), with d˜ni i ∈ K̃i , for all i.
n
Let G be the Galois group of p(x) and H = AutK (L (z)). Then K̃i is
an abelian extension of K̃i−1 . If the subgroup Hi ⊂ H corresponds to the
intermediate extension K̃i , we have Hi−1 ⊲ Hi and Hi−1 /Hi is isomorphic
to the Galois group of K̃i over K̃i−1 , i.e., is abelian. We conclude that H
n
admits an abelian tower, so it is a solvable group. Since L (z) contains a
splitting field of p(x), G is a subgroup of H and it must be solvable.
(ii) If G is solvable, then p(x) = 0 is solvable by radicals: Let L be a
splitting field of p(x) = 0 and n = |G| = [L : K]. If K1 = K and K2 = K(z),
where z is a primitive root of xn − 1 = 0, then M = L(z) has Galois group
over K2 isomorphic to a subgroup H of G. Hence, H is solvable and has
a composition series H = H1 ⊲ H2 ⊲ . . . ⊲= {e}, where each Hi /Hi+1 is
cyclic of prime order. By the Galois correspondence, there exists a tower of
subfields K2 ⊂ K3 ⊂ · · · ⊂ M where each Ki+1 is a normal extension over
Ki with Galois group cyclic of order prime pi . Since pi | n and Ki contains a
380
CHAPTER 7. GALOIS THEORY
primitive root of xn − 1 = 0, we see that Ki contains the pi -th roots of unit.
By Proposition 7.8.10, we conclude that Ki+1 = Ki (di ), with di pi ∈ Ki .
This shows that the equation p(x) = 0 is solvable by radicals.
Exercises.
1. Show that x11 3 + x12 3 + x13 3 is a symmetric rational expression and determine
its representation in terms of elementary symmetric polynomials.
2. Determine the integeres 1 ≤ n ≤ 10 for which a regular polygon of n sides
admits a compass-and-straightedge construction.
3. Let G ⊂ Sp , with p prime, be a subgroup which contains a cycle of length p
and a transposition. Show that one must have G = Sp .
4. Let G be any finite group. Show that there exists fields L and K such that
L is an extension of K, with Galois group G.
(Hint: By Cayley’s Theorem, one can assume that G ⊂ Sn for some n.)
5. Using Galois Criterion, show that the Galois group of the equation xn −a = 0
(over Q) is solvable.
Chapter 8
Commutative Algebra (not
yet available)
381
382CHAPTER 8. COMMUTATIVE ALGEBRA (NOT YET AVAILABLE)
Appendix A
Set Theory
The notion of set is the most important of all mathematical notions, and the
foundations of all mathematics rest upon it. The reader is for sure familiar
with the informal notion of set and element of a set, as well as some elementary constructions with sets, such as (unions, intersections, complements,
etc.). On the other hand, statements such as:
• two sets are equal if and only if they have the same elements,
• given two sets, there exists a set containing them,
• given any set, there exists a set formed by all its subsets,
are usually accepted as obvious. However, to fully justified them it is necessary a deeper look into the foundations of Set Theory, which is beyond the
aim of this book. For example, the famous Russel paradox, concerning the
existence of the set of all sets, can only be solved via an axiomtization of Set
Theory. The interested reader can, e.g., consult the concise book by Paul
Halmos escreveu a este respeito1 . In this appendix we will limit ourselves
to a few elementary notions and results of Set Theory which are essentially
to the study of abstract algebra.
A.1
Relations and Functions
The notions of binary relation and map are directly related with the notion
of ordered pair and, in fact, can be defined from the latter notion. From
1
P. R. Halmos, Naive Set Theory, Undergraduate Texts in Mathematics, SpringerVerlag, 1974.
383
384
APPENDIX A. SET THEORY
a practical point of view, the most basic property of an ordered pair is the
equivalence:
(A.1.1)
(x1 , y1 ) = (x2 , y2 )
⇐⇒
x1 = x2 and y1 = y2 .
The properties that an ordered pair should satisfy can be expressed in terms
of even more basic notions of Set Theory, and lead to the previous equivalence:
Definition A.1.1. If X and Y are sets, x ∈ X and y ∈ Y , the ordered
pair (x, y) is the set (x, y) = {x, {x, y}}. The elements x and y are called
the components of the pair (x, y). The set of all ordered pairs (x, y), where
x ∈ X and y ∈ Y , are called the cartesian product of X and Y , and is
denoted by X × Y .
Note that (A.1.1) is an immediate logical consequence of Definition A.1.1,
and in particular the fact that (x, y) is equal to (y, x) if and only if x = y.
One can use the notion of ordered pair to formalize another important notion
which will lead to the notion of map:
Definition A.1.2. A relation between X and Y is a subset R ⊂ X × Y .
We will often write “xRy” instead of “x, y ∈ R”, and when X = Y ,we say
that R is a binary relation on X.
It is immediate to check that a relation R between X and Y has an
associated relation between Y and X, which is obtained by “switching” the
components of every ordered pair in R.
Definition A.1.3. If R is a relation between X and Y , the inverse or
opposite relation of R, denoted by Rop , is defined by setting
Rop = {(y, x) : (x, y) ∈ R}.
There are several important classes of relations. In particular, the notions of order relations, equivalence relations and maps, play an essential
role in many of our considerations.
Definition A.1.4. Let R be a relation on the set X. We say R is an order
relation on X if:
(i) Transitivity: For x, y, z ∈ X, if xRy and yRz, then xRz.
(ii) Anti-symmetry: For x, y ∈ X, if xRy and yRx, then x = y.
A.1. RELATIONS AND FUNCTIONS
385
There are also various kinds of order relations, which are distinguished
by adding one of the terms strict/non-strict e total/partial:
Definition A.1.5. Let R be an order relation on X.
(i) R is called a total order relation (the opposite of a partial
order relation) if it satisfies the trichotomy property: For any x, y ∈
X, one of the following holds: xRy or yRx or x = y.
(ii) R is called a strict order relation (the opposite of a non-strict
order relation) if satisfies the anti-reflexivity property: For x, y ∈
X, if xRy then x 6= y.
The following examples illustrate all the various possibilities:
Examples A.1.6.
1. The usual relation “>” (larger than) between real numbers is a strict order
relation and also a total order relation.
2. The usual relation “≥” (larger or equal) between real numbers is a non-strict
order relation and a total order relation.
3. The relation “⊇” (contains) between subsets of a given fixed set is a non-strict
order relation and a partial order relation.
4. The relation “)” (strictly contains) between subsets of a given fixed set is a
strict order relation and a partial order relation.
Given a partial ordered set X, with order relation denoted by “≤”, any
subset Y ⊂ X is also partially ordered with the order relation induced on X
(which we will still denote by “≤”). Obviously, the order relation induced
on X may have properties that the order relation on X does not satisfy. For
example, it may very well happened that X is partially ordered while the
induced ordered in Y ⊂ X is a total order. In this case, the set Y is called
a chain in X.
Example A.1.7.
In R2 consider the partial order relation defined by:
(x1 , y1 ) ≤ (x2 , y2 ) if and only if y1 = y2 and x1 ≤ x2 ,
where the last inequality is the usual order relation for real numbers. In this
case, any subset {(x, y) ∈ R2 : y = c}, with c ∈ R fixed ( i.e., an horizontal line
recta), is a chain. Note, also, that R2 , with this order relation is not totally
ordered.
386
APPENDIX A. SET THEORY
The next class of binary relations is the following:
Definition A.1.8. Let R be a binary relation on the set X. We call R an
equivalence relation on X if it satisfies:
(i) Reflexivity: For all x ∈ X, xRx.
(ii) Symmetry: For x, y ∈ X, if xRy, then yRx.
(iii) Transitivity: For x, y, z ∈ A, if xRy and yRz, then xRz.
Here are some simple examples of equivalence relations.
Examples A.1.9.
1. The relation of parallelism between lines in the plane is an equivalence relation on the set of all lines.
2. The relation of equality between elements of any set X is an equivalence
relation on X.
3. The relation de congruence modulo m is an equivalence relation on the set
of integers.
Any equivalence relation R on a set X determines an important collection
of subsets of X.
Definition A.1.10. Given an equivalence relation R on X and x ∈ X, the
equivalence class of x, denoted by x or [x], is the set
x = [x] = {y ∈ X : xRy}.
The set of all equivalence classes x is called the quotient set of X by R
and is denoted by X/R, so that:
X/R = {x : x ∈ X} = {{y ∈ X : xRy} : x ∈ X}.
Examples A.1.11.
1. For the relation of parallelism between the lines in the plane, the equivalence
class of a line L is formed by all lines parallel to L.
2. If R is the relation of equality on the set X, the quotient set is X/R = {{x} :
x ∈ X}.
3. If R is the congruence relation modulo m in the set of integers Z and a ∈ Z,
then a = {a + km : k ∈ Z}.
A.1. RELATIONS AND FUNCTIONS
387
It is a basic fact that two equivalence classes either coincide or are disjoint
sets. The proof is elementary.
Proposition A.1.12. Let R be an equivalence relation on X. For x, y ∈ X,
the following statements are equivalent:
(i) xRy;
(ii) x = y;
(iii) x ∩ y 6= ∅.
Finally, we consider the following important class of equivalence relations, usually called “maps”.
Definition A.1.13. A relation f between X and Y is called a map from
X to Y , denoted f : X → Y , of:
(i) for each x ∈ X there exists y ∈ Y such that xf y, and
(ii) if xf y and xf y ′ , then y = y ′ .
Because of (ii), we will write y = f (x) instead of xf y. We then call X is o
domain and Y the image (or range) of the map f .
Other common names to denote a map are function and transformation.
Examples A.1.14.
1. If X is a set, IX : X → X, given by IX (x) = x, is called the identity map
in X.
2. If X ⊃ Y are sets, iY : Y → X, given by iY (y) = y, is called the inclusion
map of Y in X.
3. If R is an equivalence relation on X, the map πX/R : X → X/R, given by
πX/R (x) = x, is called quotient map of X in X/R.
In geral, if X ⊃ X ′ and Y ⊃ Y ′ , given a map f : X → Y , we set:
f (X ′ ) ≡ {f (x) : x ∈ X ′ },
and
f −1 (Y ′ ) ≡ {x ∈ X : f (x) ∈ Y ′ }.
We say that f (X ′ ) is the direct image of X ′ by f , and f −1 (Y ′ ) is a
inverse image of Y ′ by f . In particular, the image of f the set f (X),
which we will often denote by Im f .
388
APPENDIX A. SET THEORY
Example A.1.15.
if f : R → R is the map cos x, its image is the set f (R) = [−1, +1]. The
inverse image of the set {0} is f −1 ({0}) = { 2n+1
2 π : n ∈ Z}.
As we all know certain kinds of maps have special designations:
Definition A.1.16. If f : X → Y is a map, then f is called:
(i) surjective or onto if for all y ∈ Y there exists x ∈ X such that
y = f (x);
(ii) injective if f (x) = f (x′ ) ⇔ x = x′ ;
(iii) bijective or a bijection if it is both injective and surjective. In this
case. we say that sets are equipotent or isomorphic.
Examples A.1.17.
1. The identity map IX : X → X is bijective.
2. The inclusion map iY : Y → X is injective.
3. The quotient map pX/R : X → X/R is surjective.
4. The map cos : R → R is neither injective nor surjective.
Given maps f : X → Y and g : Y → Z, the compose of f and g is the
map g ◦ f : X → Z (read “g after f ”) given by (g ◦ f )(x) = g(f (x)). We
leave for the exercises to verify that:
Proposition A.1.18. If X, Y , Z, and W are sets. Then:
(i) Associativity: If f : X → Y , g : Y → Z and h : Z → W are maps,
then (h ◦ g) ◦ f = h ◦ (g ◦ f );
(ii) Left Inverse: f : X → Y is injective if and only if there exists g : Y →
X such that g ◦ f = IX ;
(iii) Right Inverse: f : X → Y is surjective if and only if there exists
g : Y → X such that f ◦ g = IY ;
(iv) Inverse: f : X → Y is bijective if and only if there exists g : Y → X
such that f ◦ g = IY and g ◦ f = IX . In this case, g is called the
inverse map of f and is denoted by f −1 .
A.1. RELATIONS AND FUNCTIONS
389
Exercises.
1. Use Definition (A.1.1) to prove that (A.1.1) holds.
2. Describe the anti-symmetry and trichotomy properties of a relation R in
terms of R and its inverse Rop .
3. Give a proof of Proposition (A.1.12).
4. Show that, if f : X → Y is a map and {Yi }i∈I is a family of subsets de Y ,
then the following identities hold:
f −1 (
[
i∈I
Yi ) =
[
f −1 (Yi ),
and
f −1 (
i∈I
\
Yi ) =
i∈I
\
f −1 (Yi ).
i∈I
Are these still true if we assume that {Xi }i∈I is a family of subsets of X and
we replace f −1 por f ?
5. Let f : X → Y be a map.
(a) Can there exist more that one map g : Y → X such that f ◦ g = IY ?
(b) Can there exist more that one map g : Y → X such that g ◦ f = IX ?
(c) Can there exist more that one map g : Y → A such that f ◦ g = IY and
g ◦ f = IX ?
6. Prove itens (i) and (ii) of Proposition A.1.182 .
7. Show that the inverse f −1 of a map f : X → Y is:
(a) a map f −1 : f (X) → X if f is injective;
(b) a map f −1 : Y → X if f is bijective.
8. Verify that, if f : X → Y is a bijective map, then g = f −1 is the only map
such that f ◦ g = IY and g ◦ f = IX .
9. Show that, if X 6= ∅ and Y = ∅, there are no maps f : X → Y .
10. If X = ∅ and f : X → Y , then f is injective, and moreover f is surjective
if and only if Y = ∅.
2
The proofs of itens (iii) and (iv) require the Axiom of Choice which will be mentioned
later.
390
A.2
APPENDIX A. SET THEORY
Axiom of Choice, Zorn’s Lemma and Induction
We have mentioned several times the notion of cartesian product of sets.
Using the notion of map, we can introduce define the cartesian product of
any arbitrary number of sets.
The set of of integers will play a relevant role in the discussion and given a
positive integer n ∈ N we will denote in this section by In = {k ∈ N : k ≤ n}
the set consisting of the first n natural numbers. In particular, I0 = ∅.
Given a set X, a map f : I2 → X determines uniquely an ordered pair
with components in X, namely the pair (f (1), f (2)), which we will also write
as (f1 , f2 ). Hence, the set of all maps f : I2 → X is isomorphic to the set of
all ordered pairs with components in X.
A similar observation holds if we consider maps f : I2 → X ∪ Y such
that f (1) ∈ X and f (2) ∈ Y . The set of all such maps f : I2 → X ∪ Y is
isomorphic to the set of ordered pairs (x, y), with x ∈ X and y ∈ Y , i.e.,
to the cartesian product X × Y . This leads us to define arbitrary cartesian
products as sets of maps.
Definition
A.2.1. Let X1 , X2 , . . . , Xn be sets. The cartesian product
Qn
X
is
the
set of all maps f : In → ∪ni=1 Xi such that f (k) ∈ Xk .
i=1 i
Q
If f ∈ ni=1 Xi we write f = (f (1), f (2), . . . , f (n)), or f = (f1 , f2 , . . . , fn ).
n
Also, when
Qnthe sets Xi all coincide with the same set X, we will write X instead of i=1 X, and called it the power n of X. In this case, the elements
of X n are called n-tuples of elements of X.
One can apply the method used in Definition A.2.1 to define the cartesian
product of an arbitrary number of sets. For that, we replace the family of
sets X1 , X2 , . . . , Xn , indexed by a natural number 1 ≤ k ≤ n, by a family
{Xi : i ∈ I}, indexed by a parameter i belonging to an arbitrary set I.3
Definition A.2.2. Given a family {Xi : i ∈ I}, the cartesian product
Q
i∈I Xi is given by
Y
[
Xi = {f : I →
Xi , wiith f (i) ∈ Xi para qualquer i ∈ I}.
i∈I
i∈I
Q
The canonical
Q projection πk : i∈I Xi → Xk4 is the map associating to an
element f ∈ i∈I Xi the element f (k) ∈ Xk .
3
Recall that given some class of sets X, a family indexed by i ∈ I is just a map
f : I → X. We write Xi instead of f (i), just like we write, for example, xn instead of
f (n) when we deal with a sequence of real numbers.
4
We will often write fk , instead of f (k).
A.2. AXIOM OF CHOICE, ZORN’S LEMMA AND INDUCTION
391
Examples A.2.3.
Q
1. Given an set X, the cartesian product n∈N X is just the set of all sequences
with values in X.
Q
2. The cartesian product n∈N In is the set of sequences of natural numbers
f : N → N such that f (n) ≤ n, for all n ∈ N.
Q
3. The cartesian product x∈R [x − 1, x + 1] is the set of all functions f : R → R
such that x − 1 ≤ f (x) ≤ x + 1.
If we consider all maps with
Q a fixed domain X and fixed range Y , this
is just the cartesian product x∈X Y , but it is common to denote it by the
symbol “Y X ”.
Definition A.2.4. Y X = {f : X → Y } is the set of all maps from X to Y .
Examples A.2.5.
1. X N is the set consisting of all sequences in X.
2. RR denotes the set of all real valued functions of one variable.
3. RR×R is the set consisting of all real valued functions of two variables.
4. In general, X X×X is the set of all binary operations on X.
5. One can also denote the set X n by X In .
I
Q
S
,
X
is
a
strict
subset
of
In general, the cartesian product
X
i
i
i∈I
i∈I
S
since not all maps f : I → i∈I Xi satisfy the condition f (i) ∈ Xi , for all
i ∈ I. However, when Xi = X for every i ∈ I, it is obvious that
Y
Y
Xi =
X = XI .
i∈I
i∈I
Hence, the set X I is a special type of cartesian product and this justifies
the use of the “power” notation.
Q
One shows easily (by induction) that the product ni=1 Xi of any finite
number of sets is non-empty, as long as the sets Xi are non-empty. The
very same statement, but for an arbitrary number of sets, is a fundamental
axiom of Set Theory:
Axiom
I (Axiom of Choice). If I 6= ∅ and Xi 6= ∅, for all i ∈ I, then
Q
X
=
6
∅.
i∈I i
392
APPENDIX A. SET THEORY
The reason for the designation for
Q this axiom is easy to explain. An
element f of the cartesian product i∈I Xi amounts to a “choice” of an
element in each of the sets Xi . The axiom states that given an arbitrary
family of sets, there exists always a map which “chooses” exactly one element
from each set.
The next example illustrates the kind of issue that the Axiom of Choice
allows to bypass.
Example A.2.6.
Assume that each Xi is formed a single pair of shoes. Then we can decide to
choose, for example, the right shoe of each pair. In this case. the Axiom of
Choice is useless. On the other hand, if each Xi is formed by a pair of socks,
then there is no criteria for a choice of an element in Xi , and we can invoke
the Axiom of Choice to claim the existence of a set formed by exactly one sock
from each set Xi .
Many results concerning the existence of some set, or of an element in a
set, satisfying some property, can be reformulated in terms of the existence
of a maximal element for an appropriate order relation 5 . In this respect,
the following result often plays a crucial role:
Theorem A.2.7 (Zorn’s Lemma). Let X be a partially ordered, non-empty,
where every chain possui has an upper bound. Then X contains a maximal
element.
One can show that Zorn’s Lemma is equivalent the Axiom of Choice (see
P. Halmos book cited at the beginning of this appendix). The next example
illustrates how Zorn’s Lemma can be applied.
Example A.2.8.
Let A be a ring, I ( A an ideal, and denoted by X the set of all proper ideals
of A that contains I. Since I belongs to X, this set is non-empty. We consider
on X the relation of inclusion, which
S is obviously a partial order relation. If
{Ij : j ∈ J} is a chain in X, then j∈J Ij is an upper bound (you should check
that this is indeed an ideal of A which contains I and hence belongs to X). We
conclude from Zorn’s Lemma that X has a maximal element. This shows in
an arbitrary ring A, every proper ideal is contain in a maximal ideal.
5
The notions of upper/lower bound, supremum/infimum and maximal/minimal element for subsets of partial ordered sets is discussed in Section 2.2.
A.2. AXIOM OF CHOICE, ZORN’S LEMMA AND INDUCTION
393
One can imagine an alternative proof of the result in this example, as
follows: given a proper ideal ⊂ A, if I is not maximal, then there exists
an ideal I0 containing I. Then, if I0 is not maximal, there exists an ideal
I1 which contains I0 , and so on. The problem, of course, is that ”so on”
may not terminate. Zorn’s Lemma is useful precisely to avoid this kind of
problem.
For basically the same reason, a partial ordered set X may fail to contain
a minimal element. Even if a minimal element exists, a subset Y ⊂ X may
fail to have a minimal element. The following definition attempts to prevent
this.
Definition A.2.9. A partial ordered set X is said to be well-ordered if
every non-empty subset S ⊂ X has a minimal element.
Obviously, the order relation in a well-ordered set X is a total order:6 if
x, y ∈ X, the set {x, y} has a minimal element, hence either x ≤ y or y ≤ x.
Examples A.2.10.
1. The set N, with the usual order relation, is a well-ordered set.
2. The set Z, with the usual order relation, is not a well-ordered set since, e.g.,
the subset {n ∈ Z : n ≤ 0} does not have a minimal element.
Another result which is equivalent to the Axiom of Choice and, hence,
also to Zorn’s Lemma, is the following:
Theorem A.2.11 (Well-Ordering Principle). Every set can be well-ordered.
Again we refer to the book of P. Halmos for a proof of this result.
Example A.2.12.
We observed above that the set of integers Z, with the usual order relation, is
not well-ordered. However, we have, for example, the following well-ordering
of Z:
0, 1, −1, 2, −2, . . . , n, −n, . . .
The main feature of well-ordered sets is the possibility of generalizing the
usual induction method to these sets, as we now explain. Given a partial
6
From now on, unless otherwise mentioned, we denote a (non-strict) order relation on
set set X by the symbol “≤”.
394
APPENDIX A. SET THEORY
ordered set let us denote by s(x) the set of all elements strictly less than x,
also called the “segment” ending at x:
s(x) = {y ∈ X : y ≤ x, and y 6= x}.
We then have:
Theorem A.2.13 (Transfinite Induction). Let X be an well-ordered set,
and S ⊂ X a subset such that:
∀x ∈ X, s(x) ⊂ S =⇒ x ∈ S.
Then S = X.
Proof. If X − S is non-empty, denote by x its minimal element. Then the
segmento s(x) is contained in S. By the “induction hypothesis ”, x ∈ S.
Since x cannot belong at the same time to S and to X − S, we must have
X − S empty, and X = S.
The Transfinite Induction method has a large range of application, as it
follows from the Well-Ordering Principle. For now, we will use it to study
recursive definitions, including transfinite recursive definitions.
If A is a set and n ∈ N, consider the set An of all maps f : In → A, i.e.,
all the n-tuples (x1 , x2 , · · · , xn ) in A. Consider also the class Φ = ∪n∈N An .
Definition A.2.14. A map F : Φ → A is called a recursive formula.
The justification for this name is that a map F : Φ → A allows to
associate to a n-tuple fn = (x1 , x2 , · · · , xn ) the (n+1)-tuple given by fn+1 =
(x1 , x2 , · · · , xn , xn+1 ), where xn+1 = F (fn ) = F (x1 , x2 , · · · , xn ).
Theorem A.2.15 (Recursive Definitions). Given an element x1 ∈ A and
a recursive formula F : Φ → A, there exists a unique sequence φ : N → A
such that
(i) φ(1) = x1 , and
(ii) φ(k + 1) = F (φ|Ik ), for all k ∈ N.
where we have denoted by φ|Ik the restriction of φ to the set Ik .
Proof. Using induction, we prove first that for every n ∈ N there exists
fn ∈ An satisfying:
(1) fn (1) = x1 , and
A.2. AXIOM OF CHOICE, ZORN’S LEMMA AND INDUCTION
395
(2) fn (k + 1) = F (fn |Ik ), for all k < n.
The result is obvious for n = 1, if we set f1 (1) = x1 and if we observe that in this case condition (2) is empty. Assume the result holds for
some n ≥ 1: there exists an n-tuple fn = (x1 , x2 , · · · , xn ) ∈ An satisfying
(1) and (2). Then we define fn+1 = (x1 , x2 , · · · , xn , xn+1 ) ∈ An+1 , where
xn+1 = F (fn ) = F (x1 , x2 , · · · , xn ). One checks immediately that fn+1 automatically satisfies (1), and it also satisfies (2) for k < n + 1.
Assume now that fn ∈ An and fm ∈ Am satisfy both (1) and (2). We
assume, without loss of generality, that n < m, and we prove that fn is the
restriction of fm tp In . For that, consider the set D = {k ∈ In : fn (k) 6=
fm (k)}. Assuming D non-empty, let r + 1 be its minimal element, and note
that r ≥ 1, because by assumption fn (1) = fm (1) = x1 . Hence, we have
that fn (k) = fm (k) for all k ≤ r, so the restrictions fn |Ir and fm |Ir are
equal. But then we must have fn (r + 1) = F (fn |Ir ) = F (fm |Ir ) = fm (r + 1),
contradicting that r + 1 ∈ D.
In order to conclude the proof, since we already know that for any n ∈ N
there exists exactly one element fn ∈ An satisfying (1) and (2), we define
φ : N → A by f (n) = fn (n). One checks easily that this is the only sequence
that satisfies (i) and (ii).
The previous result maybe generalized, replacing N by any well-ordered
set X. In this case, the sets In are replaced by the segments s(x) = {y ∈
X : y < x}, Ax = As(x) is then the set of all maps f : s(x) → A and Φ =
∪x∈X Ax . A transfinite recursive formula is again a map F : Φ → A.
The proof of the following generalization of Theorem A.2.15 is left for
the exercises:
Theorem A.2.16 (Transfinite Recursive Definitions). Given a transfinite
recursive formula F : Φ → A, there exists a unique map f : X → A such
that f (x) = F (f |s(x) ), for all x ∈ X.
Exercises.
1. Show that X × Y is isomorphic to Y × X. Under what conditions does the
equality X × Y = Y × X hold?
2. Show that X × (Y × Z) is isomorphic to (X × Y ) × Z.
3. Describe the sets X ∅ , ∅X and ∅∅ .
4. Show that the sets X Y ∪Z and X Y × X Z are isomorphic.
396
APPENDIX A. SET THEORY
Z
5. Show that the sets X Y ×Z and (X Y ) are isomorphic, whenever Y ∩ Z = ∅.
6. Show that (X × Y )Z is isomorphic to X Z × Y Z .
7. Use the Axiom of Choice to prove that f : X → Y is surjective if and only if
there exists g : Y → X such that f ◦ g = IY .
8. Show
Q that, if I 6= ∅ and Xi 6= ∅ for all i ∈ I, then the canonical projections
πk : i∈I Xi → Xk are surjective.
9. Use Zorn’s Lemma to show that for an arbitrary group, every proper subgroup is contained in a maximal subgroup.
10. Prove the following statements:
(a) Every partially ordered set has a maximal chain;
(b) Every chain in a partially ordered set is contained in a maximal chain.
11. Show that the Principle of Transfinite Induction is equivalent to the usual
Principle de Induction (see Chapter 2) when X = N.
12. Give an example of a well-ordering of Q.
13. Show that a totally ordered set X is well-ordered if and only if for all x ∈ X
the segment s(x) is well-ordered.
14. Prove Theorem A.2.16. Explain why the statement of this result does not
mention an element x1 as was the case with Theorem A.2.15.
A.3
Finite Sets
Intuition says that a set X is finite if one can “count” its elements. The
prototype of a finite set with n ≥ 0 elements is given by the set formed by
the first n natural numbers:
In = {1, 2, 3, . . . , n} = {k ∈ N : k ≤ n}.
When n = 0, we have the empty set I0 = ∅. A moment’s thought shows
that by “counting” we really mean establishing a bijective correspondence
(map) between X and In . More formally, we define:
Definition A.3.1. A set X is called finite if it is isomorphic to In , for
some n ≥ 0. If X is not isomorphic to any In , then X is called infinite.
A.3. FINITE SETS
397
Examples A.3.2.
1. The set In is obviously isomorphic to itself, hence it is a finite set.
2. We leave as an exercise to check that a subset X ⊂ Z is finite if and only if
is bounded.
We have mentioned in Chapter 1 that X is infinite if and only if there
exists an injective map φ : X → X which is not surjective. This will be
established in the next section, where we study in some detail infinite sets.
For now, we limit ourselves to finite sets. Let us look first at the sets In .
Lemma A.3.3. If φ : In → In is injective, then φ is surjective.
PSfrag replaements
Proof. We use induction. When n = 0, there is nothing
to prove.7
Assuming that the result holds for some n, let φ : In+1 → In+1 be an
injective map, and let α = φ(n+1). Consider (see Figure A.3.1) the bijective
map ψ : In+1 → In+1 given by ψ(n + 1) = α, ψ(α) = n + 1, and ψ(x) = x
in all other cases (ψ “exchanges” the natural numbers α and n + 1, and is
y
the identity if x 6= α, n + 1, but this last fact is irrelevant forr1the proof).
rr2
rr34
5
In+1
In+1
In+1 r6
#
n+1
n+1
n+1
In
In
= Æ 1 +pr
r
mi
h
e
e~
h1i = Z
h2i
h4i
In
h12i
h6i
h3i
hmd(m; n)i
hmi
hni
hmm(m; n)i
Figure A.3.1: The maps φ, φ∗ and ψ.
Now we define φ∗ = ψ ◦ φ and observe that φ∗ is injective, since it is
the composition of injective maps. Also, it satisfies φ∗ (n + 1) = n + 1,
7
A map f : X → Y is just a set of ordered pairs with some special properties. It is
possible that f is the empty set, and this happens exactly when X is also the empty set.
In this case, f is necessarily injective, and it is surjective if Y is also the empty set.
398
APPENDIX A. SET THEORY
by the definition of ψ. It follows that if x ∈ In (i.e., if x 6= n + 1), then
φ∗ (x) 6= n + 1, so that φ∗ (x) ∈ In , or equivalently φ∗ (In ) ⊆ In .
This shows that the restriction of φ∗ to In is an injective map from In into
In . By the induction hypothesis, this restriction is surjective: φ∗ (In ) = In .
Since φ∗ (n+1) = n+1, conclude that φ∗ (In+1 ) = In+1 , i.e., φ∗ is a surjective
map from In+1 onto In+1 .
Finally, we conclude that φ = ψ −1 ◦ φ∗ is surjective, since it is the
composition of surjective maps.
PSfrag replaements
Proposition A.3.4. If X is finite and φ : X → IX
is injective, then φ is
n
n
+
1
surjective.
Proof. Let Ψ : In → X be a bijection. Note that φ∗ = Ψ−1 ◦ φ ◦ Ψ : In → In
is injective, since it is the composition of injective maps
(see Figure A.3.2).
y
∗
r
1
According to Lemma A.3.3, φ is necessarily surjective.
r2
r3
r4
r5
r6
X pr
r
1+
X
hmi
e
e~ 1
I
= 1 Æ Æ h1i =Z
h 2i
h 4i
h12i
h 6i
I h 3i
hmd(m; n)i
h mi
hmm(m; n)i
Figure A.3.2: The maps φ, φ∗ and Ψ.
It follows that φ = Ψ ◦ φ∗ ◦ Ψ−1 is a composition of surjective maps, and
hence it is surjective.
Corollary A.3.5. If φ : X → X is injective but not surjective, then X is
an infinite set.
Example A.3.6.
In Chapter 2, we observed that the map f : N → N given by f (n) = n + 1 is
injective and not surjective. From the previous result, we conclude that N is
infinite.
Another consequence of the proposition above is that a finite set cannot
be isomorphic to a proper subset of it. The example above shows that this
A.3. FINITE SETS
399
fails for infinite sets. We state this result as follows and leave the proof for
the exercises:
Corollary A.3.7. If X is finite, X ⊇ Y and φ : X → Y is injective, then
X =Y.
It seems obvious that X is isomorphic to In if and only if X has n
elements, in which case X cannot be isomorphic to Im for m 6= n. In fact,
this is a direct consequence of Proposition A.3.4.
Corollary A.3.8. If φ : In → X and ψ : Im → X are both bijections, then
n = m.
Proof. Assume, without loss of generality, that m ≤ n, so that Im ⊂ In .
Then the map Ψ = ψ −1 ◦ φ : In → Im gives an injective map from In into
its subset Im . By Corollary A.3.7, we must have Im = In , i.e., n = m.
Hence, if X is finite there exists a unique non-negative integer n such
that X is isomorphic to In . In this case, we say that X has n elements, and
we call n the cardinal number or cardinality of X. We will use often
the symbol #X to denote the cardinality of X.
We will list a few properties of the cardinality of finite sets. We only
sketch the proofs, leaving the details for the exercises.
Proposition A.3.9. If Y is a subset of the set X, then:
(i) If X is finite, then Y is also finite and #Y ≤ #X.
(ii) If X is finite and #Y = #X, then X = Y .
(iii) If Y is infinite, then X is also infinite.
The proof uses the following two lemmas, the first of which has an elementary proof and so it is left as an exercise.
Lemma A.3.10. If φ : In → X is injective but not surjective, then there
exists φ∗ : In+1 → X injective.
Lemma A.3.11. If φ : X → In is injective, then X is finite and #X ≤ n.
Proof. Let M (X) be the set of integers m ≥ 0 for which there exists an
injective map Ψm : Im → X. Note that 0 belongs to M (X), and we claim
that M (X) is bounded above.
400
APPENDIX A. SET THEORY
If Ψm : Im → X is injective, the composition φ ◦ Ψ : Im → In is also
injective. According to Corollary A.3.7, we cannot have n < m, i.e., n is an
upper bound for M (X).
Hence, M (X) has a maximum k. By Lemma A.3.10, we have k = #X.
Since n is an upper bound for M (X), it follows that #X = k ≤ n.
Our last proposition expresses in a precise form some of our intuition
about the meaning of the addition and the product of natural numbers.
Proposition A.3.12. If X and Y are finite sets, then X ∪ Y and X × Y
are finite sets and we have:
(i) #(X ∪ Y ) ≤ #X + #Y ;
(ii) #(X ∪ Y ) = #X + #Y , if X and Y are disjoint;
(iii) #(X × Y ) = (#X)(#Y ).
Proof. We prove only item (ii), leaving the remaining statements for the
exercises.
Let φ : In → X and ψ : Im → Y be bijections, so that #X = n and
#Y = m. Define a map Ψ : In+m → X ∪ Y by:

if 1 ≤ k ≤ n,
 φ(k)
Ψ(k) =

ψ(k − n)
if n + 1 ≤ k ≤ n + m.
Obviously, Ψ is surjective and injective (since X and Y are disjoint). Hence,
#(X ∪ Y ) = n + m.
Exercises.
In these exercises, the symbols X and Y denote arbitrary sets.
1. Prove Corollary A.3.7.
2. Prove Lemma A.3.10
3. Let φ : In → X be a surjective map. Show that Ψ : X → In , given by
Ψ(x) = max{k ∈ In : φ(k) = x} is injective.
4. Show that the following statements are equivalent:
A.4. INFINITE SETS
401
(a) X is finite and #X ≤ n.
(b) There exists an injective map φ : X → In .
(c) There exists a surjective map ψ : In → X.
5. Let X be a finite set. Show that if either φ : X → Y is surjective or
ψ : Y → X is injective, then Y is finite and #Y ≤ #X.
6. Give a proof of Proposition A.3.9 using Corollary A.3.7 and Lemma A.3.11.
7. Show that if X is finite, then #X = #(X − Y ) + #(X ∩ Y ).
8. Let X and Y be finite. Show that X ∪ Y is also finite and
#(X ∪ Y ) = #X + #Y − #(X ∩ Y ).
9. Let X and Y be finite. Show that X ×Y is finite and #(X ×Y ) = (#X)(#Y ).
10. Let X1 , X2 , . . . , Xn be finite sets. Show that:
Pn
Sn
(a) #( k=1 Xk ) ≤ k=1 #Xk ;
S
P
(b) #( nk=1 Xk ) = nk=1 #Xk , if the sets Xk ’s are disjoint;
Qn
Qn
(c) #( k=1 Xk ) = k=1 #Xk .
11. Assume that #X = n and #Y = n. Find the cardinality of the following
sets:
(a) The set Y X of all maps f : X → Y .
(b) The set of all injective maps f : X → Y .
(c) The set of all surjective maps f : X → Y .
12. Assuming that #X = n, show that:
(a) X has 2n distinct subsets;
n!
subsets with exactly k elements.
(b) X has nk = k!(n−k)!
13. Let X ⊂ Z. Show that X is finite if and only if X is bounded.
A.4
Infinite Sets
We have already seen some elementary results on infinite sets. In particular,
we have seen that N is an infinite set. We will now study some more properties of infinite sets. We start by showing that N is the “smallest” infinite
set.
402
APPENDIX A. SET THEORY
Lemma A.4.1. X is infinite if and only if X contains a subset isomorphic
to N.
Proof. If there exists a subset Y ⊂ X and a bijection φ : N → Y , it follows
that Y is infinite, and hence X is also infinite, according to Proposition
A.3.9 (iii).
Conversely, assume that X is infinite. We claim that there exists an
injective map φ : N → X. Taking Y = φ(N), this completes the proof. We
will define the map φ recursively.
Since X 6= ∅, there exists x1 ∈ X and we set φ(1) = x1 . Assume
that φ has been defined and is injective in {1, 2, . . . , n}. The set Zn =
X − {φ(1), . . . , φ(n)} is non-empty, for otherwise X would be finite. Choose
z some element of Zn , and set φ(n + 1) = z. This defines an injective map
φ : N → X recursively.
The previous result allows to complete Corollary A.3.5, and to justify
the characterization of infinite sets mentioned in the exercises of Section 2.1.
Theorem A.4.2. X is infinite if and only if there existe φ : X → X
injective and not surjective.
Proof. Corollary A.3.5 shows that if there exists φ : X → X injective and
not surjective, then X is infinite. It remains to show that if X is infinite,
then there exists such a map.
If X is infinite, Lemma A.4.1 yields an injective map (sequence) ψ :
N → X. Let Y = ψ(N) so that ψ : N → Y is a bijection. Now we define
φ : X → X as follows:

if x 6∈ Y,
 x
φ(x) =

ψ(ψ −1 (x) + 1)
if x ∈ Y.
One checks easily that φ is injective and not surjective.
All the properties of infinite sets that we have studied so far are, perhaps,
not surprising. Our next result is less obvious: it states that Lemma A.4.1
cannot be improve, i.e., there are infinite sets which are not isomorphic to
N. In what follows we will denote by P(X) the set consisting of the subsets
of X.
Theorem A.4.3 (Cantor). Let Ψ : X → P(X) be any map. Then Ψ is not
surjective.
A.4. INFINITE SETS
403
Proof. The proof is by contradiction, and resembles the ideas behind Russel’s paradox, that we mentioned in Chapter ??.
Let Ψ : X → P(X) and define a subset Y ⊂ X by:
Y = {x ∈ X : x 6∈ Ψ(x)}.
Since Y ∈ P(X), if Ψ is surjective there exists an element y ∈ X such that
Y = Ψ(y). Obviously, either y ∈ Y or y 6∈ Y . We claim that both cases lead
to a contradiction:
(i) If y ∈ Y = Ψ(y), then the definition of Y gives that y 6∈ Y , a contradiction;
(ii) If y 6∈ Y = Ψ(y) it follows from the definition of Y , that y ∈ Y , which
is also a contradiction.
It follows that there is no y ∈ X such that Y = Ψ(y), hence Ψ is not
surjective.
Examples A.4.4.
1. In order to illustrate the proof above, let X = {0, 1}, so that:
P(X) = {∅, {0}, {1}, {0, 1}}.
If Ψ : X → P(X) is given by Ψ(0) = {1} and Ψ(1) = {0, 1}, we have that
Y = {x ∈ X : x 6∈ Ψ(x)} = {0},
so obviously Y 6∈ Ψ(X) = {{1}, {0, 1}}.
2. Consider the set P(N) whose elements are the sets containing only natural
numbers. The result above says that this set is not isomorphic to N. On the
other hand, the map Φ : N → P(N) given by Φ(n) = {n} is obviously injective,
and hence P(N) is infinite.
Definition A.4.5. A set X is called countable if it is finite or isomorphic
to N. Otherwise, X is called (infinite) uncountable or
Therefore, as we saw above, X is an infinite set if and only if it contains
an infinite countable subset, but there are infinite uncountable sets, such as
P(N). On the other hand, there are also uncountable sets, not isomorphic
to N, and also not isomorphic between themselves: for example, the sets
P(N), P(P(N)), P(P(P(N))), etc.
404
APPENDIX A. SET THEORY
There are many infinite uncountable sets which we are used to, such as
R or C. To check that R is indeed uncountable, we will use the following
proposition which is usually presented as a consequence of the Supremum
Axiom.
Proposition A.4.6. Every bounded monotone sequence of real numbers
converges.
For the proof we refere to Chapter 4, where the real numbers are introduced in a constructive way.
Theorem A.4.7. R is an uncountable set.
Proof. Let φ : N → R be any sequence of real numbers. We need to show
that φ is not surjective, i.e., that there exists x ∈ R such that x 6∈ φ(N).
There exist real numbers a1 and b1 such that a1 < b1 < φ(1). In particular, φ(1) 6∈ [a1 , b1 ]. One can define recursively sequences an and bn which
are, respectively, increasing and decreasing, and such that:
an < bn ,
φ(n) 6∈ [an , bn ].
Obviously, both sequences are bounded by a1 and b1 . It follows from Proposition A.4.6 that both sequences converge, and we denote their limits by a
and b, respectively. It is also clear that an ≤ a ≤ b ≤ bn , and we conclude
immediately that a 6= φ(n), for all n ∈ N. Hence, φ is not surjective.
Example A.4.8.
Consider the set {0, 1}N, consisting of all sequences in {0, 1}, (called binary
sequences) is also uncountable. In this case, it is easy to show that {0, 1}N is
isomorphic to P(N). For that, observe that a binary sequence φ : N → {0, 1}
is completely determined by its support, i.e., by the set of natural numbers n
for which φ(n) 6= 0. In other words, the map Ψ : {0, 1}N → P(N), defined by
Ψ(φ) = {n ∈ N : φ(n) 6= 0}, is a bijection.
One can use the bijection in the last example, together with some elementary facts concerning binary representations of real numbers (base two),
to show that both P(N) and {0, 1}N are isomorphic to R (see the exercises).
We will not define “#X” when X is an infinite set. Instead, we will
use the symbol “|X|”, also called cardinal of X, where X denotes any set
(finite or infinite), as part of the expressions “|X| = |Y |”, “|X| ≤ |Y |” and
“|X| < |Y |”, with the following meaning:
A.4. INFINITE SETS
405
Definition A.4.9. We will write:
(i) |X| = |Y | if X and Y are isomorphic, i.e., if there exists a bijection
φ:X →Y;
(ii) |X| ≤ |Y | if X is isomorphic to a subset de Y , i.e., if there exists a
injective map φ : X → Y ;
(iii) |X| < |Y | if |X| ≤ |Y |, and X is not isomorphic to Y .
The equality “|X| = |Y |” and the inequality “|X| ≤ |Y |” are analogues
of the equalities and inequalities between numbers “#(X) = #(Y )” and
“#(X) ≤ #(Y )”, when X and Y are finite sets. For this reason, we could
write #X = |X|. Even when X and Y are infinite sets, the cardinal has
some properties similar to the equalities and inequalities between numbers.
For example, the following two properties are immediate to prove:
Proposition A.4.10. Let X, Y and Z be sets.
(i) If |X| = |Y | and |Y | = |Z|, then |X| = |Z|;
(ii) If |X| ≤ |Y | and |Y | ≤ |Z|, then |X| ≤ |Z|.
It may look obvious that
|X| ≤ |Y | and |Y | ≤ |X| ⇐⇒ |X| = |Y |.
However, the proof of this fact is not at all obvious for infinite sets. In order
to prove this we start by showing that the following lemma holds. Note that
we already know it holds for finite sets, in which case we can add to the
conclusion that “and X = Y ” (but not for infinite sets!).
Lemma A.4.11. If Y ⊂ X and φ : X → Y is injective, then |X| = |Y |.
Proof. We define recursively two sequences of sets as follows: X1 = X,
Y1 = Y , and para n > 1, Xn = φ(Xn−1 ) and Yn = φ(Yn−1 ). We define also
Zn = Xn − Yn , and
∞
[
Zn .
Z=
n=1
Note that (X − Z) ⊂ Y , since if x ∈ X and x 6∈ Y = Y1 , then x ∈ Z1 ,
so x ∈ Z. Moreover, if x ∈ Z, i.e., if there exists n such that x ∈ Zn , then
φ(x) ∈ Zn+1 , so that φ(x) ∈ Z.
PSfrag replaements
406
APPENDIX A. SET THEORY
I
In
n+1
1
= 1 Æ Æ y
1
2
3
4
r5
r6
r
r
Z
Z
1
2
3
Z
r
:::
r
1 +p
r
r
X
hmi
e
e~
h1i = Z
h2i
h4i
h6i
h3i
hmd(m; n)i
hmi
hmm(m; n)i
Figure A.4.1: The sets Z1 , Z2 , Z3 , . . .
Now, define Ψ : X → X by
Ψ(x) =

 φ(x),

x,
if x ∈ Z,
if x 6∈ Z.
Notice that Ψ(X) ⊆ Y , since if x ∈ Z, then Ψ(x) = φ(x) ∈ Y , and if x 6∈ Z,
then x ∈ Y and Ψ(x) = x. We claim that Y ⊆ Ψ(X), so it follows that
Ψ(X) = Y . To prove the claim, consider x ∈ Y and observe that if x 6∈ Z
then x = Ψ(x) ∈ Ψ(X). On the other hand, if x ∈ Z ∩ Y , then x ∈ Zn for
some n > 1 (obviously Y does not contain any element of Z1 ), and hence
x ∈ φ(Zn−1 ), so that x ∈ Ψ(X) (see Figure A.4.1).
Since Ψ(X) = Y , in order to complete the proof, we only need to show
that Ψ is injective. Let x, y ∈ X and assume that x 6= y. Then:
(i) If x, y ∈ Z, then Ψ(x) 6= Ψ(y), since Ψ = φ in Z, and φ is injective.
(ii) If x, y 6∈ Z, then Ψ(x) 6= Ψ(y), since Ψ is the identity in X − Z.
(iii) If x ∈ Z and y 6∈ Z, then Ψ(x) 6= Ψ(y), since Ψ(y) = y 6∈ Z and
Ψ(x) = φ(x) ∈ Z.
We conclude that X and Y are isomorphic.
The construction of the set Y in the previous proof seems ingenious, but
it is actually very simple, as we illustrate in the following example.
A.4. INFINITE SETS
407
Example A.4.12.
Let X = N0 = {n ∈ Z : n ≥ 0} and Y = N and φ(x) = x + 2. Then we
find immediately that Xn = {k ∈ Z : k ≥ 2(n − 1)} and Yn = {k ∈ Z : k ≥
2(n − 1) + 1}. It follows that Zn = {2(n − 1) : n ∈ N}, so Z is the set of even
positive integers. The map Ψ : N0 → N in the previous proof is then given by:

if x = 2n,
 x + 2,
Ψ(x) =

x,
if x = 2n + 1.
This is obviously a bijection from N0 onto N.
We can now show:
Theorem A.4.13 (Schroeder-Bernstein). If |X| ≤ |Y | and |Y | ≤ |X|, then
|X| = |Y |.
Proof. Let φ : X → Y and ψ : Y → X be injective maps and Z = ψ(Y ).
Note that ψ ◦ φ : X → Z is injective and since Z ⊆ X, the previous
lemma shows that there exists a bijection Ψ : Z → X. The composition
Ψ ◦ ψ : Y → X is the desired bijection.
The Schroeder-Bernstein theorem is often used to show that two sets are
isomorphic, without having to exhibit an explicit bijection between the sets.
This is illustrated in the next example.
Example A.4.14.
Consider the sets X = N × N and Y = N. The map φ : N → N × N given
by φ(n) = (n, 1) is injective. According to the Fundamental Theorem of Arithmetic the map ψ : N × N → N given by ψ(n, m) = 2n 3m is also injective. It
follows from the Schroeder-Bernstein theorem that N and N×N are isomorphic.
The following result follows easily from |N × N| = |N| and is left as an
exercise:
Proposition A.4.15. Let X, Xn and Y be countable sets.
(i) If Y 6= ∅ and X is infinite, then |X ×Y | = |X|, i.e., X ×Y is countable.
(ii) The set
S∞
n=1 Xn
is countable.
408
APPENDIX A. SET THEORY
This proposition shows that the behavior of the notion of cardinal under
forming unions and products of infinite sets is slightly peculiar. We state
two general results concerning unions and products, illustrating this. We will
omit the proofs of these statements, since we do not wish to get even more
involved with technical issues in Set Theory (see, however, the exercises at
the end of the section).
Theorem A.4.16. If X is infinite and |Y | ≤ |X|, then:
(i) |X ∪ Y | = |X|;
(ii) |X × Y | = |X|, if Y 6= ∅.
Exercises.
1. Show that Z and Q are countable.
2. Show that the intervals [0, 1], ]0, 1[, and [0, 1[ (in R) are all isomorphic to R.
3. Let X be a countable (finite or infinite) set. For each of the following examples determine if the set is countable (finite or infinite), justifying your answer.
(a) The set X {0,1} of all maps f : {0, 1} → X.
(b) The set X In of all maps f : In → X.
S∞
(c) The set Y = n=1 X In .
(d) The set X N of all maps f : N → X.
(e) The set {0, 1}X of all maps f : X → {0, 1}.
(f) The set of sequences f : N → X which are “eventually constant ”, i.e., for
each sequence there exists N ∈ N such that n > N ⇒ f (n) = x ∈ X.
(g) The set Pfin (X) of finite subsets of X.
4. Show that the set of polynomials with rational coefficients is countable.
5.SLet Xn , n = 1, 2, 3, . . . be countable sets. Show that the sets
∞
n=1 Xn are also countable.
6. Let XnQ
, n = 1, 2, 3, . . . be countable sets. Show that
∞
When is n=1 Xn countable?
QN
n=1
SN
n=1
Xn and
Xn is countable.
7. Show that if X is infinite, Y ⊂ X and Y is finite, then |X| = |X − Y |.
8. Show that if X is infinite, Y ⊂ X, X − Y is infinite and Y is countable, then
|X| = |X − Y |. Conclude that if X is uncountable and Y is countable, then
|X ∪ Y | = |X|.
A.4. INFINITE SETS
409
9. Let {0, 1}N be set of all binary sequences. Show that {0, 1}N is isomorphic
to the interval [0, 1] ⊂ R, and hence isomorphic to R.
10. Let {0, 1}N be set of all binary sequences. Show that {0, 1}N × {0, 1}N is
isomorphic to {0, 1}N, and conclude that Rn is isomorphic to R, for any natural
number n.
11. Assume that |Xn | ≤ |R|. Show that |
S∞
n=1
Xn | ≤ |R|.
12. Consider the sequence of sets X1 = N, Xn+1 = P(Xn ). Show that there
exists a set Y such that |Y | > |Xn |, for every n ∈ N.
13. Let X be an infinite set and let Pfin (X) be set of finite subsets of X. Show
that |X| = |Pfin (X)|.
410
APPENDIX A. SET THEORY
References (not available
yet)
411