Download lecture notes

Introduction to Algebra Rui L. Fernandes University of Illinois at Urbana-Champaign Manuel Ricou Instituto Superior Técnico April 14, 2017 2 Contents 1 Basic Algebraic Structures 1.1 Introduction . . . . . . . . . . . . . . . . . . . 1.2 Groups . . . . . . . . . . . . . . . . . . . . . . 1.3 Permutations . . . . . . . . . . . . . . . . . . 1.4 Homomorphisms and Isomorphisms . . . . . . 1.5 Rings, Integral Domains and Fields . . . . . . 1.6 Homomorphisms and Isomorphisms of Rings . 1.7 The Quaternions . . . . . . . . . . . . . . . . 1.8 Symmetries . . . . . . . . . . . . . . . . . . . 2 The 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 Integers The Axioms of the Integers . . . . . . . . Inequalities and Ordered Rings . . . . . . Principle of Induction . . . . . . . . . . . Summations and Products . . . . . . . . . Factors, Multiples and Division . . . . . . Ideals and Euclides Algorithm . . . . . . . The Fundamental Theorem of Arithmetic Congruences . . . . . . . . . . . . . . . . . Prime Factorizations and Cryptography . 3 Other Examples of Rings 3.1 The Rings Zm . . . . . . . . . . . . . 3.2 Fractions and Rational Numbers . . 3.3 Polynomials and Power Series . . . . 3.4 Polynomial Functions . . . . . . . . 3.5 Division of Polynomials . . . . . . . 3.6 The Ideals of K[x] . . . . . . . . . . 3.7 Divisibility and Prime Factorization 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 7 13 19 24 33 42 50 54 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 65 70 77 83 88 93 103 109 119 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 125 137 142 150 156 164 168 4 CONTENTS 3.8 Factorization in D[x] . . . . . . . . . . . . . . . . . . . . . . . 182 4 Quotients and Isomorphisms 4.1 Groups and Equivalence Relations . . 4.2 Quotient Group and Quotient Ring . . 4.3 Real and Complex Numbers . . . . . . 4.4 Isomorphism Theorems for Groups . . 4.5 Isomorphism Theorems for Rings . . . 4.6 Free Groups, Generators and Relations 5 Finite Groups 5.1 Groups of Transformations . . 5.2 The Sylow Theorems . . . . . . 5.3 Nilpotent and Solvable Groups 5.4 Simple Groups . . . . . . . . . 5.5 Symmetry Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 Modules 6.1 Modules over a ring . . . . . . . . . . . 6.2 Linear Independence . . . . . . . . . . . 6.3 Tensor Products . . . . . . . . . . . . . 6.4 Modules over Integral Domains . . . . . 6.5 Finitely Generated Modules over a PID 6.6 Classifications . . . . . . . . . . . . . . . 6.7 Categories and Functors . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Galois Theory 7.1 Field Extensions . . . . . . . . . . . . . . 7.2 Compass-and-Straightedge Constructions 7.3 Splitting Fields . . . . . . . . . . . . . . . 7.4 Homomorphisms of Extensions . . . . . . 7.5 Separability . . . . . . . . . . . . . . . . . 7.6 The Galois Group . . . . . . . . . . . . . 7.7 The Galois Correspondence . . . . . . . . 7.8 Applications . . . . . . . . . . . . . . . . . 8 Commutative Algebra (not yet available) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 . 189 . 197 . 205 . 213 . 221 . 228 . . . . . 245 . 245 . 251 . 259 . 265 . 270 . . . . . . . 281 . 281 . 290 . 295 . 304 . 310 . 320 . 326 . . . . . . . . 333 . 335 . 338 . 343 . 350 . 354 . 358 . 364 . 370 381 CONTENTS A Set A.1 A.2 A.3 A.4 Theory Relations and Functions . . . . . . . Axiom of Choice, Zorn’s Lemma and Finite Sets . . . . . . . . . . . . . . . Infinite Sets . . . . . . . . . . . . . . 5 . . . . . . Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383 . 383 . 390 . 396 . 401 6 CONTENTS Chapter 1 Basic Algebraic Structures 1.1 Introduction Algebra is still today, as it always has been, the study of operations, rules and methods to solve equations. The origin of the term “Algebra” itself is specially enlightening. According to B. L. van der Waerden, one of the outstanding algebraists of the XX Century, the term was used for the first time by in the IX Century the Arab mathematician al-Khowarizmi, in the title of a monograph 1 which aimed at presenting mathematical knowledge of practical application. The Arab word al-jabr is used in this monograph to denote two basic procedures in solving equations, namely: 1. adding the same positive quantity to both sides of an equation, in order to cancel negative quantities, and 2. multiplying both sides of the equation by the same positive quantity in order to cancel fractions. The aforementioned monograph discusses problems of a very different nature, ranging from geometric problems to the solution of linear and quadratic equations, and includes also a variety of applications to astronomy, to commerce, to calendars, etc. With time, the term al-jabr, or Agebra, became associated with the general area of knowledge that deals with operations and numerical equations. 1 The title of the book was “Al-jabr wa’l muqābala”. His author, Mohammed alKhowarizmi was an astronomer and mathematician, member of the Bait al-hikma (“House of Wisdom”) established in Baghdad, the center of scientific knowledge in those times. The name al-Khowarizmi is also the origin of another common term: algorism! 7 8 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES Nowadays, one sometimes distinguishes Classic and Modern Algebra. Such expressions are not very fortunate, and according the the Italian mathematician F. Severi today’s Modern Algebra will soon become the Classic Algebra of tomorrow. Actually, when one compares, e.g, the XXI Century Algebra with the XVI Century Algebra, and one eliminates differences common to all branches of scientific knowledge (rigor, formalism, notations, the amount of knowledge), the only remaining difference is the generality in which problems are stated and treated. Actually, a tendency to generalization has always been present in Algebra. Initially, this was the case with the successive generalizations of the concept of number: first from natural to positive rational, then to negative numbers, irrational numbers and complex numbers. In the XIX Century, mathematicians realized that the so-called “algebraic ideas” could be applied to objects which are not numbers, such as vectors, matrices and transformations. The initial slow expansion of the scope of Algebra turned into a rapid explosion when it became clear that one could study properties of any algebraic operation without specifying the nature of the objets on which the operation applies or how to calculate the result of the operation. In fact, this study can be performed simply by postulating (i.e., assuming as hypothesis) a certain numbers of basic algebraic properties of the operation, such as commutativity or associativity. Algebra then finally became axiomatic (although with a lag of more than 2000 years relative to Geometry). This was perhaps the most significative achievement of recent times, and justifies the use of the name “Abstract Algebra”, to which nowadays we often refer to Algebra. The axiomatization of Algebra required, first of all, precise definitions of abstract algebraic structures. In its simplest form, an abstract algebraic structure is formed by a non-empty set X, called the support of the structure, and a binary operation on X, which is just a map µ : X × X → X. Distinct sets of hypothesis, or axioms, imposed on the operation µ lead to distinct abstract algebraic structures. Note, however, that the definitions do not include any assumption on the nature of elements of X, nor on the procedure to evaluate the operation µ. There a few conventions that everyone follows. If µ : X × X → X is a binary operation on X, one chooses a symbol such as “+” or “∗” to represent it, and writes “x + y” or “x ∗ y” instead of “µ(x, y)”. Actually, one can even omit the symbol and represent the binary operation by simply juxtaposition, i.e., “xy” instead of “µ(x, y)”. The use of such notations as “x + y” and “xy” does not mean that these symbols represent or bear any 1.1. INTRODUCTION 9 relation with the usual addition and multiplication operations on numbers. In this respect, the only generally accepted convention is that one reserves the symbol “+” to denote commutative operations, i.e., operations such that µ(x, y) = µ(y, x). Whenever we represent a commutative operation by the symbol “+” we will say that we use the additive notation, while in all other cases we say that we use the multiplicative notation. Also, we will make systematic use of the usual conventions with parenthesis, so that we write x ∗ (y ∗ z) = µ(x, µ(y, z)), which, in general, will be distinct from (x ∗ y) ∗ z = µ(µ(x, y), z). As a general rule, imposing few axioms on an algebraic structure lead to results which apply in great generality, for they will be valid for many concrete algebraic structure. However, very general results tend to be not so interesting, precisely because they rely on a small number of hypothesis. On the other hand, if one starts with a richer set of axioms, one obtains in principle more interesting results, but less general, since fewer concrete algebraic structures will satisfy all the axioms. One of the main issues in Abstract Algebra is precisely to determine a set of axioms ( (i.e., definitions of abstract algebraic structures) which are sufficiently general to include many concrete examples and, at the same time, sufficiently rich to lead to interesting results. Let us consider a few important examples illustrating these remarks. Without any additional assumption on a binary operation ∗, we can introduce the notion of identity element, inspired on the properties of the integers 0 and 1 relative to the usual addition and multiplication, respectively. Definition 1.1.1. Let ∗ be a binary operation on a set X. An element e ∈ X is called an identity element for ∗ if x ∗ e = e ∗ x = x for all x ∈ X. It is possible to prove a general result about identity elements which is valid for any binary operation. Proposition 1.1.2. Every binary operation has at most one identity element. Proof. Assume that e and ẽ are both identity elements for the operation ∗. We then have: e ∗ ẽ = ẽ e ∗ ẽ = e We conclude that e = ẽ. (since e is an identity element), (since ẽ is an identity element). 10 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES Hence, we can talk without ambiguity about the identity element, whenever this element exists. Also, it is common to use the terms “zero” and “one” (the last one usually called “identity”) to denote the identity element of ∗. Obviously, we should be consistent: we should reserve the use of the term “zero” (and also the symbol “0”) for the additive notation, and use the term “identity” (and also the symbols “1” or “I”) for the multiplicative notation. Whenever a binary operation ∗ has an identity e, we can introduce the notion of inverse element. The definition is as follows: Definition 1.1.3. Given a binary operation ∗ with identity e, an element x ∈ X is called invertible if there exists y ∈ X such that x ∗ y = y ∗ x = e. In this case, we call y an inverse of x. Again, to be consistent with common usage, if we use the additive notation we call an inverse element of x a symmetric of x. Note that y is an inverse of x if and only if x is an inverse of y, i.e., in other words “being inverse of” is a symmetric relation. If one only has x∗y = e, we say that y is a right inverse of x and that x is a left inverse of y. Obviously, y is an inverse of x if and only if y is both a right and left inverse of x. However, a right inverse may fail to be a left inverse, and vice-versa. Still, when ∗ is an associative operation we can show the following. Proposition 1.1.4. Let ∗ be an associative binary operation on X with identity e. If x ∈ X has a right inverse y and a left inverse z, then y = z and x is invertible. Proof. Assume that y, z ∈ A satisfy x ∗ y = z ∗ x = e. Then: x ∗ y = e ⇒ z ∗ (x ∗ y) = z ⇒ (z ∗ x) ∗ y = z ⇒e∗y =z ⇒y=z (because z ∗ e = z), (because ∗ is associative), (because z ∗ x = e), (because e ∗ y = y). The previous results apply to any binary operation that satisfies the two properties used in the proofs (existence of identity and associativity). These are exactly the assumptions used as axioms in the the following definition: 1.1. INTRODUCTION 11 Definition 1.1.5. An algebraic structure (X, ∗) is called a monoid if it satisfies the following properties: (i) The operation ∗ has identity e on X. (ii) The operation ∗ is associative, i.e., (x ∗ y) ∗ z = x ∗ (y ∗ z), for all x, y, z ∈ X. If ∗ is a commutative operation, i.e., if x ∗ y = y ∗ x for all x, y ∈ X, we say that the monoid is abelian 2 . If the operation is commutative and we use the additive notation, we say that the monoid is additive. From Proposition 1.1.4 we conclude: Proposition 1.1.6. If (X, ∗) is a monoid (with identity e), and x ∈ X is invertible, there exists a unique element y ∈ X such that x ∗ y = y ∗ x = e. Some well-known operations furnish examples of monoids. Examples 1.1.7. 1. The set of all n × n matrices with real entries with the usual product of matrices is a monoid. Hence, if A, B, C are matrices n × n, I is the identity matrix and AB = CA = I, then B = C and the matrix A is invertible. 2. The set RR of all functions 3 f : R → R with “composition product”, defined by (f ◦ g)(x) = f (g(x)) is a monoid. The identity is the function I : R → R given by I(x) = x. Hence, if there exist functions g, h : R → R such that f ◦ g = h ◦ f = I, then g = h and f is invertible ( i.e., is a bijection). 3. The set R of all real numbers with the usual addition is a (additive) monoid. In this case any element is invertible. 4. The set R+ of all positive real numbers with the usual product is a monoid. Again the operation is commutative and every such element is invertible. If an element x in a monoid (X, ∗) is invertible, then the inverse of x is unique. If we use the multiplicative notation, then we denote by “x−1 ” the inverse of x, while in the additive notation we denote by “−x” the symmetric of x. With these conventions, some of the well-known results for symmetrics and inverses of real numbers, actually hold for any monoid. We leave the proofs of the following results as exercises: 2 In honour of Niels Henrik Abel (1802–1829), a Norwegian mathematician who is consider to be one of the founders of Modern Algebra 3 If X and Y are sets, Y X denotes the set of all functions f : X → Y (See Definition A.2.4 in the Appendix). 12 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES Proposition 1.1.8. If (X, ∗) is a monoid, and x, y ∈ X are invertible, then x−1 and xy are invertible, and we have: (x−1 )−1 = x, and (xy)−1 = y −1 x−1 . For an additive monoid, we have: −(−x) = x, and − (x + y) = (−x) + (−y). Exercises. 1. Let X = {x, y} be a set with two elements. How many binary operations are there on X? How many of these operations are (i) commutative, (ii) associative, (iii) have an identity? 2. How many binary operations are there in a set with 10 elements? 3. In (Z, −) is there an identity? Inverses? Is this operation associative? 4. Let RR be the set of all functions mentioned in Example 1.1.7.2, and assume that f ∈ RR . (a) Show that there exists g ∈ RR such that f ◦ g = I if and only if f is surjective. (b) Show that there exists g ∈ RR such that g ◦ f = I if and only if f is injective. (c) If f ◦ g = f ◦ h = I, is true that g = h? 5. Prove Proposition 1.1.8. 6. Let ∗ be binary operation on X, and x, y ∈ X. If n ∈ N is a natural number, define the power xn by induction as follows: x1 = x and, for n ≥ 1, xn+1 = xn ∗ x. Assuming that ∗ is associative, shows that: (a) xn ∗ xm = xn+m , and (xn )m = xnm , for all n, m ∈ N. (b) xn ∗ y n = (x ∗ y)n , for all n ∈ N, if x ∗ y = y ∗ x. How would you express these results in the additive notation? 7. Assume that (X, ∗) is a monoid with identity e, and x ∈ X is invertible. In n this case, we define for n ∈ N x−n = (x−1 ) , x0 = e. Show that the identities (a) and (b) in the previous problem are valid for all n, m ∈ Z. 1.2. GROUPS 1.2 13 Groups The examples we saw in the previous section show that in an arbitrary monoid not every element is invertible. The monoids where every element has an inverse correspond to the one of the most important abstract algebraic structures. Definition 1.2.1. A monoid (G, ∗) is called a group if all the elements in G are invertible. A group is called abelian if the operation ∗ is commutative. The following simple examples already show how general the concept of group is. Examples 1.2.2. 1. (R, +) is an abelian group. 2. (R+ , ·) is also an abelian group. 3. (Rn , +), where + denotes sum of vectors, is an abelian group. 4. The set of all n × n invertible matrices with real entries is a non-abelian group with the usual product of matrices. This group is called the General Linear Group and denoted by GL(n, R)). 5. The complex numbers z ∈ C with |z| = 1 (the unit circle) form an abelian group for complex multiplication, usually denoted by S1 . 6. The complex numbers {1, −1, i, −i}, with the complex multiplication form, a finite abelian group. 7. The set of all functions f : R → R form an abelian group for the usual addition of of functions. We can also consider special classes of functions, such as continuous functions, differentiable functions or integrable functions. All such classes give examples of abelian groups. 8. If X is any set and (G, +) is an abelian group, then the set of all functions f : X → G form an abelian group with operation “+” defined by (f + g)(x) = f (x) + g(x), ∀x ∈ X. (Note that in this expression the symbol “+” is used with two different meanings!) 9. More generally, if X is a set and (G, ∗) is a group, then the functions f : X → G form a group with operation “∗” defined by (f ∗ g)(x) = f (x) ∗ g(x), ∀x ∈ X. (Again, in this expression the symbol “∗” is used with two different meanings.) 14 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES The fourth example above illustrates a general fact: the invertible elements in an arbitrary monoid always form a group. Proposition 1.2.3. Let (X, ∗) be a monoid and G the set of invertible elements in X. Then G is closed for the operation ∗, and (G, ∗) is a group. Proof. The identity e of X is invertible with e−1 = e, since e ∗ e = e. Hence, G is non-empty and contain the identity of X. Proposition 1.1.8 shows that x ∈ G =⇒ x−1 ∈ G (because (x−1 )−1 = x), and also that x, y ∈ G =⇒ x ∗ y ∈ G (because (x ∗ y)−1 = y −1 ∗ x−1 ). Since the operation ∗ is associative (in the original monoid), (G, ∗) is a group. It is not our aim at this point to discuss in depth the theory of groups. For now, we limit ourselves to some elementary results which will be useful later in the study of many other algebraic structures. Proposition 1.2.4. If (G, ∗) is group with identity element e, then4 : (i) Cancellation law: if g1 , g2 , h ∈ G and g1 ∗h = g2 ∗h or h∗g1 = h∗g2 , then g1 = g2 ; (ii) In particular, if g ∗ g = g then g = e; (iii) The equation g ∗ x = h (respectively, x ∗ g = h) has the unique solution x = g−1 ∗ h (respectively, x = h ∗ g −1 ). Proof. We have: g1 ∗ h = g2 ∗ h =⇒ (g1 ∗ h) ∗ h−1 = (g2 ∗ h) ∗ h−1 (because h is invertible), =⇒ g1 ∗ (h ∗ h−1 ) = g2 ∗ (h ∗ h−1 ) (by associativity), =⇒ g1 ∗ e = g2 ∗ e =⇒ g1 = g2 (because h ∗ h−1 = e), (because e is the identity). 4 Note that the results stated in this theorem are just “sophisticated” versions of the al-jabr operations mentioned in the introductory section. 1.2. GROUPS 15 The proof for h ∗ g1 = h ∗ g2 is similar, hence (i) holds. On the other hand, g ∗ g = g =⇒ g ∗ g = g ∗ e =⇒ g = e (because g ∗ e = g), (by the previous result). so (ii) holds. The proof of (iii) is left as an exercise. If (G, ∗) is a group and H ⊂ G is a non-empty set, it is possible that H is closed relative to the operation ∗, i.e., that h1 ∗ h2 ∈ H, whenever h1 , h2 ∈ H. In this case, the operation ∗ is a binary operation on H, and we can ask under what conditions is (H, ∗) a group. When this is the case, we say that (H, ∗) is a subgroup of (G, ∗). The next result gives simple criteria to decide if a given subset H of a group G is a subgroup. Proposition 1.2.5. If (G, ∗) is a group with identity element e, and H ⊂ G is non-empty, then the following are equivalent: (i) (H, ∗) is a subgroup of (G, ∗); (ii) for all h1 , h2 ∈ H, h1 ∗ h2 ∈ H and h−1 1 ∈ H; (iii) for all h1 , h2 ∈ H, h1 ∗ h−1 2 ∈ H. Proof. We will show the implications (i) ⇒ (ii) ⇒ (iii) ⇒ (i). Assume that (i) holds. Then, by definition, H is closed for multiplication: if h1 , h2 ∈ H then h1 ∗ h2 ∈ H. By assumption, H has an identity element ẽ, so that ẽ ∗ ẽ = ẽ. By Proposition 1.2.4 (ii) we conclude that ẽ = e, so H contains the identity of G. If h ∈ H, consider the equation h ∗ x = e. According to Proposition 1.2.4 (iii), this equation has a unique solution x ∈ H, which is also a solution of the same equation in G. Hence, we must have x = h−1 (the inverse of h in the original group G). This shows that if h ∈ H, then h−1 ∈ H. If (ii) holds, then clearly (iii) holds. Finally, assume that (iii) holds. We must show that (H, ∗) is a group. Since H is non-empty, we can choose h ∈ H, and observe that h ∗ h−1 = e ∈ H. Hence, H contains the identity of G. Similarly, if h ∈ H, then e ∗ h−1 = h−1 ∈ H, so H contains the inverses (in G) of all its elements. It follows also that H is closed relative to ∗, for if h1 , h2 ∈ H then we already −1 = h ∗ h ∈ H. Finally, ∗ is know that h−1 ∈ H, so that h1 ∗ (h−1 1 2 2 2 ) associative in H because it was already associative in G. 16 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES In any abelian group, using the additive notation, the difference h1 − h2 is defined by h1 − h2 = h1 + (−h2 ) Hence, in additive notation, the condition “h1 ∗ h−1 2 ∈ H” is written as “h1 − h2 ∈ H. Examples 1.2.6. 1. Consider the group (R, +) and the subset of all integers Z ⊂ R. This subset is non-empty and the difference of two integers is still an integer, so (Z, +) is a subgroup of (R, +). 2. Still in the same group (R, +), consider the subset of natural numbers N ⊂ R. The difference of two natural numbers may fail to be a natural number, so (N, +) is not a subgroup of (R, +). Note that the sum of two natural numbers is still a natural number, so the sum is a binary operation on the set of natural numbers. Let (G, ∗) and (H, ·) be two groups, and consider the cartesian product G × H = {(g, h) : h ∈ G, h ∈ H}. In G × H we define the binary operation (g1 , h1 ) ◦ (g2 , h2 ) = (g1 ∗ g2 , h1 · h2 ). (1.2.1) We leave as an exercise to check that this algebraic structure is a group. It is called the direct product of the groups G and H. Note that G and H can be seen as subgroups of G × H if we identify them with G × {e} = {(g, e) : g ∈ G} and {e} × H = {(e, h) : h ∈ H}. If G and H are abelian groups and the additive notation is used, then we will write G ⊕ H instead of G × H, and we call this group the direct sum of G and H. The notion of direct product or direct sum of groups extends in a more or less straightforward way to any finite number of groups5 . For example, if G, H, and K are groups, the direct product G × H × K can be defined as G × H × K = (G × H) × K. More generally, given groups G1 , G2 , · · · , Gn , we have: ! 1 n n−1 Y Y Y Gk = G1 , and Gk = Gk × Gn . k=1 5 k=1 k=1 We will see later how to treat an infinite number of groups where another distinction between the direct sum and the the direct product of groups will become clear. 1.2. GROUPS 17 Example 1.2.7. We leave it as an exercise to check that the direct sum of (R, +) with itself is (R2 , +). More generally, we have: n R = n M k=1 R = R ⊕ · · · ⊕ R. We can also consider the group (Z, +) and take the direct sum of this group with itself any finite number of times. The resulting group is denoted by Zn = n M k=1 Z = Z ⊕ · · · ⊕ Z. This group, as we shall see in Chapter 4, is the so-called free abelian group in n symbols. It is a subgroup of the group (Rn , +). Exercises. 1. Show that the sets G = {0, 1} and H = {1, −1} with the operations given by the following tables are groups.6 + 0 1 0 0 1 1 1 0 × 1 -1 1 1 -1 -1 -1 1 2. Same as the previous exercise, but now with the sets G = {0, 1, 2} and H = {1, x, x2 }, with operations given by the following tables. 7 + 0 1 2 0 0 1 2 1 1 2 0 2 2 0 1 × 1 x x2 1 1 x x2 x x x2 1 x2 x2 1 x 3. Complete the proof of Proposition 1.2.4 (iii). 4. Give an example showing that the cancelation law, in general, does not hold for monoids. 5. Let (G, ∗) be a group. Show that the map f : G → G, given by f (x) = x−1 , is a bijection of G with G. 6 The group G is usually denoted by (Z2 , +), for reasons to be explained later. The group H is formed by the square roots of the unit. 7 The group G is usually denoted by (Z3 , +). The group H is formed by the cubic roots 2π of the unit, since we can assume, e.g., that x is the complex number e 3 i . 18 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES 6. How can Propositions 1.2.4 and 1.2.5 be stated in additive notation? 7. If (G, ∗) is a group, its center is defined by C(G) = {x ∈ G : g ∗ x = x ∗ g, ∀g ∈ G}. Show that (C(G), ∗) is an abelian subgroup of G. Determine the center of G = GL(n, R). 8. Let (G, ∗) be a group, H1 , H2 subgroups of G. Show that H1 ∩ H2 is a subgroup of G. 9. Show that a group (G, ∗) is abelian if and only if (g ∗ h)2 = g 2 ∗ h2 , for all g, h ∈ G. 10. Let ∗ be an associative binary operation in a set G, which satisfies: (i) there exists e ∈ G such that, for all g ∈ G, g ∗ e = g (right identity). (ii) For all g ∈ G there exists g ′ ∈ G such that g ∗ g ′ = e (right inverses). Show that: (a) (G, ∗) is a group. (Hint: show first that g ∗ g = g ⇒ g = e). (b) Fix some set X and let G be the class of all surjective maps f : X → X, with ∗ the operation of composition of maps. Is this a group? Why doesn’t this example contradicts (a)? 11. Let ∗ be an associative binary operation on a non-empty set G, which satisfies: (i) The equation g ∗ x = h has a solution in G for all g, h ∈ G. (ii) The equation x ∗ g = h has a solution in G for all g, h ∈ G. Show that (G, ∗) is a group. (Hint: show first that if g ∗ e1 = g and e2 ∗ h = h, for some g and h, then one must have e1 = e2 = e and e is the identity element for ∗). 12. Let (G, ∗) and (H, ·) be two groups. Show that the binary operation on G × H defined by (1.2.1) gives this set a group structure. Verify that the direct product (R, +) × (R, +) coincides precisely with (R2 , +). 13. Determine the direct product of the groups described in Exercises 1 and 2. 1.3. PERMUTATIONS 19 14. Consider the group (Z4 , +) given by the following table: + 0 1 2 3 0 0 1 2 3 1 1 2 3 0 2 2 3 0 1 3 3 0 1 2 (a) Determine all its subgroups. (b) Consider the group with support H = {1, −1, i, −i} ⊂ C furnished with the product of complex numbers. Is there a bijection f : Z4 → H such that f (x + y) = f (x)f (y), for all x, y ∈ Z4 ? If yes, how many are there? 1.3 Permutations The bijective maps f : X → X under composition of maps form a group SX called the symmetric group on X. A bijection from X onto X is called a permutation of X, specially when X is a finite set. It should be clear that in this case it does not matter what X is, only its number of elements. We will now study the group of permutations of the set {1, 2, 3, . . . , n}, which is usually denoted by Sn . An elementary counting argument shows that Sn is a finite group with n! elements (here n! denotes the factorial of n, i.e., the product of the first n natural numbers). Examples 1.3.1. 1. The group S2 has only two elements, I and φ, where I is the identity map on the set {1, 2}, and φ “exchanges” 1 with 2, ( i.e., φ(1) = 2 and φ(2) = 1). 2. The map δ : {1, 2, 3} → {1, 2, 3} defined by δ(1) = 2, δ(2) = 3, and δ(3) = 1 is one of the six permutations in S3 . 3. More generally, in Sn we have the permutation π : Sn → Sn which cyclic permutes all elements: π(i) = i + 1 (i = 1, . . . , n − 1) and π(n) = 1. One can represent a permutation π in Sn by a matrix with two rows, where on the first row one writes the variable x and in the second row the value π(x). For example, in the case of S3 its elements can be represented 20 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES by the matrices: 1 2 I= 1 2 1 2 γ= 2 1 3 3 3 3 , α= , δ= 1 2 3 1 3 2 1 2 3 2 3 1 , β= , ε= 1 2 3 3 2 1 1 2 3 3 1 2 , . It is a routine task to find all the products (i.e., composition) of these permutations. The result is given in the following table: I α β γ δ ε I I α β γ δ ε α α I ε δ γ β β β δ I ε α γ γ γ ε δ I β α δ δ β γ α ε I ε ε γ α β I δ Given any permutation π of X and element x ∈ X, the subset of X consisting of all elements obtained from x by applying π repeatedly is denoted by Ox and is called the orbit of the permutation π. Hence, Ox = {x, π(x), π(π(x)), . . . }. Examples 1.3.2. 1. In the case of S3 , we have • orbits of I: O1 = {1}, O2 = {2} e O3 = {3}; • orbits of α: O1 = {1}, O2 = O3 = {2, 3}; • orbits of ε: there is a unique orbit O1 = O2 = O3 = {1, 2, 3}. The structure of the orbits of β and γ is similar to the orbits of α (can you say precisely what they are?), while ε, just like δ, has a unique orbit (which?). 2. The permutation in S4 given by 1 2 π= 2 1 3 4 4 3 has orbits O1 = O2 = {1, 2} and O3 = O4 = {3, 4}. The length of an orbit is simply the number of elements in the orbit. Note that the orbits of a given permutation π of X are disjoint subsets of 1.3. PERMUTATIONS 21 X, whose union is X, so we say that the orbits of π form a partition of the set X. Note also that the identity is the unique permutation with all orbits having length 1. The permutations with at most one orbit of length greater than 1 are called cycles. For example, all permutations in S3 are cycles, but the permutation π in S4 mentioned above is not a cycle: it has two orbits of length 2. Note also that π(x) 6= x, precisely when x belongs to an orbit of π of length greater than 1. A cycle with an orbit of length 2 is called a transposition. If a permutation π is a cycle, it is more neat to represent it by indicating which elements belong to its unique orbit Ox of length greater than 1: we write (x, π(x), π 2 (x), . . . , π k−1 (x)), where k is the length of Ox , i.e., the least natural number such that π k (x) = x. Example 1.3.3. For S3 we have: α = (2, 3) = (3, 2), β = (1, 3) = (3, 1), δ = (1, 2, 3) = (2, 3, 1) = (3, 1, 2), γ = (1, 2) = (2, 1), ε = (1, 3, 2) = (3, 2, 1) = (2, 1, 3). We can represent the identity I, e.g., by I = (1). Note that the inverse permutation of a cycle is the cycle that is obtained by inverting the order in which the elements apear in the corresponding orbit: (i1 , i2 , . . . , ik )−1 = (ik , . . . , i2 , i1 ). In particular, the inverse permutation of a transposition is the same transposition. In the example of S3 , the permutations α, β, and γ coincide with there inverses, and ε and δ are inverses of each other. Two cycles are said to be disjoint if the orbits of length greater than 1 are disjoint. Two disjoint cycles π and ρ commute, i.e., πρ = ρπ, and any permutation is a product of disjoint cycles (one cycle for each of its orbits of length greater than 1). More precisely, in Sn we have the following factorization result, which can be thought of as an analogue of the Fundamental Theorem of Arithmetic 8 : Proposition 1.3.4. Every permutation π in Sn is a product of disjoint cycles. This factorization is unique up to the order of the factors. 8 “Every natural number n ≥ 2 is a product of prime numbers, which are unique up to the order of the factors” (see Chapter 2). 22 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES Notice that, in general, we have (x1 , x2 , . . . , xm ) = (x1 , xm ) . . . (x1 , x3 )(x1 , x2 ). Hence, it is possible to factor permutations in Sn into a product of transpositions. In this case, however, the factors are not unique and the order matters, since its unavoidable to use non-disjoint transpositions. Examples 1.3.5. 1. For the permutation π in S4 above, we can factor it as: 1 2 3 4 1 2 3 4 1 2 3 = 2 1 4 3 2 1 3 4 1 2 4 4 3 , i.e., we can write this permutation in the form π = (1, 2)(3, 4) = (3, 4)(1, 2). 2. Similarly, the cycle (1, 2, 4, 3) can be factor as a product of transpositions: (1, 2, 4, 3) = (1, 3)(1, 4)(1, 2). But this cycle also admits, e.g., the factorizations: (1, 2, 4, 3) = (2, 1)(2, 3)(2, 4) = (1, 3)(1, 4)(1, 2)(2, 4)(1, 3)(2, 4)(1, 3). The previous example shows that in a factorization of a permutation as a product of transpositions, these are not uniquely determined. Note also that the number of transpositions in two distinct factorizations is not unique. In spite of this lack of uniqueness, it is possible to show that the number of factors has a fixed parity, i.e., is either always even or always odd. To see this, if π is a permutation P with orbits O1 , O2 , . . . , OL having lengths n1 , n2 , . . . , nl , we set P (π) = L i=1 (ni − 1), and we show that: Proposition 1.3.6. If π is a permutation and τ is a transposition, then P (πτ ) = P (π) ± 1. Proof. Let τ = (a, b), We must consider two cases: (a) The elements a and b belong to distinct orbits Oi and Oj of π, and (b) The elements a and b belong to the same orbit Oi . For each of these cases it is to check that the following statements hold: 1.3. PERMUTATIONS 23 (a) πτ has the same orbits as π, with the exception of Oi and Oj , which now form a unique orbit of length ni + nj . Hence, P (πτ ) = P (π) + 1; (b) πτ has the same orbits as π, with the exception of Oi , which is split into two orbits. In this case, P (πτ ) = P (π) − 1. We can now show that: Theorem 1.3.7. If τ1 , τ2 , . . . , τm are transpositions such that π = τ1 τ2 · · · τm , then P (π) − m is even, and hence P (π) and m have the same parity (are both even or both odd). Proof. We use induction in m. If m = 1, then π is a transposition and P (π) = 1, so P (π) − m = 0 is even. If m > 1, we let α = τ1 τ2 · · · τm−1 . By the induction hypothesis, we have that P (α)−(m−1) is even, and by the previous proposiition P (π) = P (α)±1. We conclude that P (π) − m = (P (α) ± 1) − m − 1 + 1 = P (α) − (m − 1) − (1 ± 1) is even. The parity of a permutation π is the parity of the number of transpositions in any factorization of π into a product of transpositions or, as we have just seen, the parity of P (π). If P (π) is even (respectively, odd), we say that π is an even (respectively, odd) permutation. The sign of π is +1 (respectively, −1), if π is even (respectively, odd), and is denoted by sgn(π). In particular, any transposition is odd, and also the cycle (1, 2, 4, 3). The identity is an even permutation since, e.g., I = (1, 2)(1, 2). The even permutations in Sn form a subgroup, denoted by An , called the alternating group (in n symbols). We leave for the exercises to check that An contains n! 2 elements. Exercises. 1. Find the factorization of the permutation 1 2 3 4 5 π= 2 3 4 1 6 into a product of disjoint cycles. 6 7 7 5 24 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES 2. What is the parity of the permutation π in the previous exercise? 3. How many transpositions are there in Sn ? 4. How many distinct cycles of length k (1 ≤ k ≤ n) are there in Sn ? 5. Show that, if π, ρ ∈ Sn , then sgn(πρ) = sgn(π) sgn(ρ). 6. Prove that An is a subgroup of Sn . 7. Determine all elements of the group A3 . 8. Determine all the subgroups of the group S3 . 9. Show that in Sn (n > 1) the number of even and odd permutations coincide and conclude that An contains n! 2 elements. 1.4 Homomorphisms and Isomorphisms In order to compare algebraic structures which satisfy the same abstract definition, one uses a fundamental notion of Algebra, namely the notion of an isomorphism. This is a special case of the more general notion of a homomorphism, whose formal definition for monoids is as follows: Definition 1.4.1. If (X, ∗) and (Y, ·) are monoids, a map φ : X → Y is called a homomorphism if φ(x1 ∗ x2 ) = φ(x1 ) · φ(x2 ), ∀x1 , x2 ∈ X. A homomorphism φ which is a bijection is called an isomorphism, in which case the monoids are said to be isomorphic9 One suggestive way to describe the notion of an homomorphism is through the following diagram: X ×X “∗” /X φ×φ φ Y ×Y 9 “·” /Y The use of the following terms is also common: a monomorphism is an injective homomorphism and a epimorphism is a surjective homomorphism. Also, an endomorphism is a homomorphism from an algebraic structure into itself, while an automorphism is an isomorphism of an algebraic structure with itself. 1.4. HOMOMORPHISMS AND ISOMORPHISMS 25 Such a diagram is said to be commutative, because one can follow two distinct paths on it without changing the final result. Examples 1.4.2. 1. The logarithm function φ : R+ → R given by φ(x) = log(x), is a bijection. Since log(xy) = log(x) + log(y), the groups (R+ , ·) and (R, +) are isomorphic. Note that the inverse function, namely the ( exponential) ψ(x) = exp(x) is also an isomorphism. 2. The set of all linear transformations T : Rn → Rn under composition is a monoid. Once a basis of Rn has been fixed, it is possible to find for each linear transformation T its matrix representation M(T ), a certain n× n matrix. It is easy to see that the the map T 7→ M(T ) is an isomorphism of monoids (composition of linear transformations corresponds to the product of their respective matrix representations). If (G, ∗) and (H, ·) are isomorphic groups we will write “(G, ∗) ≃ (H, ·)”, or even just “G ≃ H”, whenever the binary operations are obvious from the discussion. For example, we have (R+ , ·) ≃ (R, +), or R+ ≃ R. 10 Proposition 1.4.3. Let (G, ∗) and (H, ·) be groups, with identities e and ẽ respectively , and let φ : G → H be a homomorphism. Then (i) Invariance of the identity: φ(e) = ẽ. (ii) Invariance of inverses: φ(g−1 ) = (φ(g))−1 , ∀g ∈ G. Proof. (i) Since φ(e) · φ(e) = φ(e ∗ e) = φ(e), the cancellation law shows that φ(e) = ẽ. (ii) Since φ(g) · φ(g−1 ) = φ(g ∗ g−1 ) = φ(e) = ẽ, we conclude that φ(g−1 ) = (φ(g))−1 . Examples 1.4.4. 1. Consider the groups (R, +) and (C∗ , ·), where C∗ denotes the set of all nonzero complex numbers. Define φ : R → C∗ , φ(x) = e2πxi = cos(2πx) + i sin(2πx). 10 We will also use the same symbol ≃ to indicate that two objects in other contexts (e.g., monoids, rings, fields) are isomorphic. 26 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES Then φ is a homomorphism (ez · ew = ez+w , even when z and w are complex numbers11 ). Obviously, φ is not surjective (because φ(x) is a complex number of absolute value 1), and is not injective (because, if x = n is an integer, then φ(n) = 1. Note that φ “wraps” the real line over the unit circle. According to the proposition above, the identity in the domain (the real number 0), is transformed into the identity in the target (the complex number 1), and the image of the symmetric of the real number x is the inverse of the complex number φ(x). 2. Consider the groups (Z, +) and (C∗ , ·) and define φ : Z → C∗ , φ(n) = in . The map φ is a homomorphism which is neither surjective nor injective. The identity in the domain, the integer 0, is transformed into the identity in the target, the complex number 1, and the image of the symmetric of the integer n is the inverse of the complex number φ(n). 3. The previous example can be generalized as follows: if (G, ∗) is any group and g ∈ G is a fixed element, we can define (multiplicative notation): φ : Z → G, φ(n) = g n . The map φ is a group homomorphism from (Z, +) into (G, ∗). Given a group homomorphism φ : G → H, consider now the equation φ(x) = y, where we assume y ∈ H fixed, and x is the unkown to be determined. By analogy with Linear Algebra, if y = ẽ is the identity in the target group we say that we have an homogenous equation, and if y 6= ẽ we say that it is inhomogenous. The set of solutions of the homogenous equation is called the kernel of the homomorphism and will be denoted by N (φ): N (φ) = {g ∈ G : φ(g) = ẽ}. On the other hand, the set of all y ∈ H for which the equation φ(x) = y has a solution x ∈ G, is called the image of the homomorphism and will be denoted by φ(G) (or Im(φ)): φ(G) = {φ(g) ∈ H : g ∈ G}. 11 Recall that if z = x + iy is a complex number, where x, y ∈ R, one defines ez = ex (cos(y) + i sin(y)). 1.4. HOMOMORPHISMS AND ISOMORPHISMS 27 Examples 1.4.5. 1. Continuing the discussion in Examples 1.4.4, the kernel of φ : R → C∗ is precisely the set N (φ) = Z of all integers and the image is the set φ(R) = S1 of all complex numbers of absolute value 1 (the unit circle). 2. Similarly, the kernel of φ : Z → C∗ is precisely the set N (φ) = 4Z of all integers which are a multiple of 4 and the image is the set φ(Z) = {1, −1, i, −i}. The next figure illustrates the concepts of kernel and image. G N () H (G) PSfrag replaements e~ e Figure 1.4.1: Kernel and image of a homomorphism. In the examples above, both the kernel and the image are subgroups of the domain and target groups, respectively. This is a general fact: Proposition 1.4.6. If (G, ∗) and (H, ·) are groups and φ : G → H is a homomorphism, then: (i) The kernel N (φ) is a subgroup of G; (ii) The image φ(G) is a subgroup of H. Proof. (i) The kernel N (φ) is non-empty, since it contains at least the identity of G, by Proposition 1.4.3 (i). Moreover, if g1 , g2 ∈ N (φ) we have φ(g1 ∗ g2−1 ) = φ(g1 ) · φ(g2−1 ) (because φ is a homomorphism), = ẽ · (φ(g2 ))−1 −1 = ẽ · ẽ = ẽ. (by Proposition 1.4.3 and because g1 ∈ N (φ)), (because g2 ∈ N (φ)), 28 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES This shows that for any g1 , g2 ∈ N (φ) we have g1 ∗ g2−1 ∈ N (φ), so N (φ) is a subgroup of G. (ii) φ(G) is non-empty, because G is non-empty. If h1 , h2 ∈ φ(G), there exist g1 , g2 ∈ G such that h1 = φ(g1 ) and h2 = φ(g2 ). Hence: −1 h1 · h−1 = φ(g1 ∗ g2−1 ) ∈ φ(G), 2 = φ(g1 ) · (φ(g2 )) so φ(G) is a subgroup of H. From now on we will stop being explicit about the group operations, except if needed for clarity. Hence, if G and H are groups, g1 , g2 ∈ G and h1 , h2 ∈ H, we will write both products as g1 g2 and h1 h2 . From the context, it will be clear to which operation we are referring too. Similarly, we will denote by e both the identities in G and in H. A very important remark is that the kernel of a group homomorphism is not an arbitrary subgroup. It satisfies the following property: Definition 1.4.7. If H ⊂ G is a subgroup, one says that H is a normal subgroup of G if for all h ∈ H and g ∈ G, one has ghg −1 ∈ H. Examples 1.4.8. 1. When G is an abelian group, it is clear that ghg −1 = hgg −1 = h, so all subgroups of an abelian group are normal subgroups. 2. If G = S3 and H={I, α}, then H is a subgroup of G which is not normal: εαε−1 = γ 6∈ H. 3. If G = S3 and H = A3 = {I, δ, ε}, then H is a normal subgroup. In fact, recall that A3 is formed by the even permutations of S3 , so if π ∈ A3 and σ ∈ S3 , then σπσ −1 is an even permutation (why?), and hence σπσ −1 ∈ A3 . 4. If G × H is the direct product of the groups G and H, then G and H (identified, respectively, with G × {e} and {e} × H) are normal subgroups of G × H. Theorem 1.4.9. If φ : G → H is a group homomorphism, then its kernel N (φ) is a normal subgroup of G. Proof. Let n ∈ N (φ) and g ∈ G. We must show that gng−1 ∈ N (φ), or that φ(gng−1 ) = e, where e is the identity of H. For this, we observe that: φ(gng−1 ) = φ(g)φ(n)φ(g−1 ) = φ(g)φ(g =e −1 ) (definition of homomorphism), (because φ(n) = e, since n ∈ N (φ)), (invariance of inverses). 1.4. HOMOMORPHISMS AND ISOMORPHISMS 29 In Linear Algebra, the number of solutions of the non-homegenous equation φ(x) = y, i.e., the lack of injectivity of φ, depends only on the kernel N (φ). Similarly, we have: Theorem 1.4.10. Let φ : G → H be a homomorphism. Then: (i) φ(g1 ) = φ(g2 ) if and only if g1 g2−1 ∈ N (φ); (ii) φ is injective if and only if N (φ) = {e}; (iii) if x0 is a particular solution of φ(x) = y0 , the general solution is x = x0 n, where n ∈ N (φ). Proof. (i) Notice that: φ(g1 ) = φ(g2 ) ⇔ φ(g1 )(φ(g2 ))−1 = e (multiplication in H by (φ(g2 ))−1 ), ⇔ φ(g1 g2−1 ) = e ⇔ g1 g2−1 (because φ is a homomorphism), ∈ N (φ) (by definition of φ). (ii) φ is injective if and only if φ(g1 ) = φ(g2 ) ⇔ g1 = g2 ⇔ g1 g2−1 = e. From (i), we conclude that this holds if and only if N (φ) = {e}. (iii) If φ(x0 ) = y0 , n ∈ N (φ), and x = x0 n, it is clear that φ(x) = φ(x0 n) = φ(x0 )φ(n) = φ(x0 )e = φ(x0 ) = y0 , so x is also a solution of the inhomogenous equation. On the other hand, if x is a solution of the inhomogenous equation, then φ(x) = φ(x0 ), so −1 xx−1 0 ∈ N (φ). But xx0 = n ⇔ x = x0 n. Examples 1.4.11. 1. Continuing with Examples 1.4.4, we saw that the kernel of φ : R+ → C∗ is the set of all integers, i.e., φ(x) = 1 ⇔ cos(2πx) = 1, sin(2πx) = 0 ⇔ x ∈ Z. If we consider the equation: φ(x) = i an obvious solution is x0 = x = 14 + n, with n ∈ Z. 1 4. ⇔   cos(2πx) = 0,  sin(2πx) = 1, The general solution of this equation is then 30 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES 2. The kernel of φ : Z → C∗ is the set of multiples of 4, i.e., φ(n) = 1 ⇔ in = 1 ⇔ n = 4k, with k ∈ Z. φ(n) = i ⇔ in = i, The equation has the obvious solution n0 = 1. The general solution of this equation is then n = 1 + 4k, with k ∈ Z. 3. The algebraic structures (Rn , +) and (Rm , +), where addition is the usual vector sum, are abelian groups. If T : Rn → Rm is a linear transformation, one checks easily that T is also a group homomorphism, and the usual Linear Algebra result concerning solutions of the equation T (x) = y is just a special case of the previous theorem. The notion of isomorphism between two algebraic structures satisfying the same set of axioms is at the basis of another fundamental problem of Algebra, namely the classification problem for algebraic structures. Loosely speaking, this problem is the following: Given some definition of an (abstract, axiomatic) algebraic structure determine a class C of concrete algebraic structures satisfying the definition, and such that any algebraic structure satisfying the definition is isomorphic to exactly one algebraic structure in the class C. For example, the problem of classification of finite simple groups (a very important class of groups which we will study later in Chapter 5) was solved in the last 20 years, and is consider one of the major accomplishments of modern mathematics. We will see later some less complex classification problems. For now, in order to illustrate these ideas, we discuss a very elementary example: the classification of monoids with exactly two elements. If (X, ∗) is a monoid with two elements, we have X = {I, a}, where I denotes the identity, and I 6= a. Note that the products I ∗ I, I ∗ a and a ∗ I are completely determined by the fact that I is the identity (I ∗ I = I, I ∗ a = a ∗ I = a). It remains to determine the product a ∗ a, and the only possibilities are a ∗ a = a ou a ∗ a = I (in the later case, a is invertible, and hence the monoid is a group). The multiplication tables which describe these two possibilities are: I a I I a a a I I a I I a a a a 1.4. HOMOMORPHISMS AND ISOMORPHISMS 31 In order to check that both cases are possible, consider the sets M = {0, 1}, and G = {1, −1}, with the usual product of integers. It is obvious that in both cases this product is a binary operation which is associative, with identity. Hence, each of these sets with the usual product is a monoid with two elements. Note also, by inspection of the main diagonals of the tables, that these monoids are not isomorphic. 1 -1 1 1 -1 -1 -1 1 1 0 1 1 0 0 0 0 Obviously, any of the first two multplication tables represent a monoid isomorphic to one of these monoids. Indeed, in the case a ∗ a = I, the isomorphism is the map φ : X → G given by φ(I) = 1 and φ(a) = −1. In the case a∗a = a, the isomorphism is the map φ : X → M given by φ(I) = 1 and φ(a) = 0. To summarize this discussion: • G and M are monoids with two elements, • G and M are not isomorphic, and • If X is any monoid with two elements, then either X ≃ G or X ≃ M . Therefore we can say that, “up to isomorphism”, there exist exactly two monoids with two elements, and the classification of monoids with two elements is the family {G, M }. Note that since any group is a monoid, and since only G is a group, we can conclude also that “up to isomorphism”, there exists a unique group with two elements, which is G.12 Exercises. 1. Show that the identity ez+w = ez ew , with z and w complex numbers, is a consequence of the usual identities for ex+y , cos(x + y) and sin(x + y), with x and y real numbers. 2. The complex solutions of the equation xn = 1 are the complex numbers e2πki/n ,where 1 ≤ k ≤ n. These are called the n-th roots of the unit, and they form the vertices of a regular polygon with n sides, inscribed in the unit circle, and with one vertex on the number 1. (a) Show that φ : Z → C∗ given by φ(k) = e2πki/n is a homomorphism. (b) Conclude that the n-th roots of unit form a subgroup of C∗ . 12 This group is obviously isomorphic to the group (Z2 , +) mentioned before. 32 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES (c) Determine the kernel of the homomorphism φ. 3. Assume that G, H and K are groups. Show that: (a) G × H ≃ H × G, and G × (H × K) ≃ (G × H) × K. (b) φ : G → H × K is a group homomorphism if and only if φ(x) = (φ1 (x), φ2 (x)), where φ1 : G → H and φ2 : G → K are homomorphisms. 4. Consider the algebraic structure (R2 , +), where + is the usual sum of vectors. (a) Show that (R2 , +) is an abelian group. (b) Show that any linear transformation T : R2 → R is a homomorphism from group (R2 , +) into the group (R, +). (c) Show that any homomorphism from the group (R2 , +) into the group (R, +) which is continuous is a linear transformation13 . (d) For a fixed a ∈ R2 , define a linear transformation T : R2 → R by T (x) = a · x, where “·” denotes the usual inner product. Find the kernel of T , and check that it is a subgroup and a linear subspace. (e) Given an example of a subgroup of (R2 , +) which is not a linear subspace. 5. Assume that (A, ∗) and (B, ·) are monoids with identities e and ẽ, respectively. If φ : A → B is an isomorphism, show that: (a) φ(e) = ẽ. (b) φ(a−1 ) = (φ(a)) −1 (c) φ −1 if a ∈ A is invertible. : B → A is also an isomorphism. (d) (A, ∗) is a group if and only if (B, ·) is a group. 6. If in the previous exercise one only assumes that φ is an injective (respectively, surjective) homomorphism, which of the statements remain valid? 7. Let G be a group and denote by Aut(G) the set all automorphisms φ : G → G. (a) Show that Aut(G) furnished with composition of maps is a group. (b) If G is the group formed by the complex solutions of x4 = 1, determine Aut(G). 8. Let G be a group. (a) If g ∈ G is fixed, show that φg : G → G given by φg (x) = gxg −1 is an automorphism. 13 There are group homomorphisms which are not continuous and which are not linear transformations. Its existence can only be shown using the Axiom of Choice, which is discussed in the Appendix. 1.5. RINGS, INTEGRAL DOMAINS AND FIELDS 33 (b) Show that the map T : G → Aut(G) defined by T (g) = φg is a group homomorphism. (c) Show that the kernel of T is the center of the group G (see Exercise 7, in the previous section). 9. Classify all the groups with three and with four elements. 10. Show that the map φ : Sn → Z2 , which to a permutation π associates sgn(π), is a group homomorphism with kernel An . 11. Determine all the normal subgroups of S3 . 12. Let G be a group and φ : S3 → G a group homomorphism. Classify the group φ(S3 ) (i.e., list all the possibilities for φ(S3 ) up to isomorphism). 13. Let G be a group and fix g ∈ G. Show that (a) the map Tg : G → G given by Tg (x) = gx, is a permutation on the set G. (b) the map φ(g) : G → SG given by φ(g) = Tg is an injective homomorphism. In particular, G is isomorphic to a subgroup of the group of permutations of the set G. (c) if G is a finite group wth n elements, then there exists a subgroup H ⊆ Sn such that G ≃ H. 1.5 Rings, Integral Domains and Fields The integers, rationals, reals and complex numbers can be added and multiplied and the result is still a number of the same type. Similarly, we can add and multiply square matrices of the same size, linear transformations of a fixed vector space, and many other objects, which are useful in Mathematics and its applications. All these are examples of algebraic structures more complex than groups or monoids, precisely because they involve simultaneously two operations. However, they share a common set of basic properties, which lies at the basis of the definition of an algebraic structure called a ring, that we will now study. We will distinguish some important classes of rings, namely the fields (rings where the product is commutative and a division by non-zero elements is always possible, such as Q, R and C), and integral domains (rings with properties similar to Z). Let A be an non-empty set, and σ, π : A × A → A two binary operations in A. In order to simplify the notation, we will write“a + b” instead of 34 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES “σ(a, b)”, and “a · b” (or still ab) instead of “π(a, b)”. We say that a + b and a · b are respectively the addition and the product of the elements a, b ∈ A. Definition 1.5.1. The triple (A, +, ·) is called a ring if: (i) Addition properties: (A, +) is an abelian group. (ii) Product properties: The product is associative, i.e., ∀a, b, c ∈ A, (a · b) · c = a · (b · c). (iii) Mixed properties: The addition and the product are distributive, i.e., ∀a, b, c ∈ A, a · (b + c) = a · b + a · c, and (b + c) · a = b · a + c · a. Let us list more explicitly all the axioms included in this definition: • Associativity of addition: ∀a, b, c ∈ A, (a + b) + c = a + (b + c). • Commutativity of addition: ∀a, b ∈ A, a + b = b + a. • Identity for addition: ∃0 ∈ A ∀a ∈ A, a + 0 = a. • Symmetrics for addition: ∀a ∈ A ∃b ∈ A : a + b = 0. • Associativity of product: ∀a, b, c ∈ A, (a · b) · c = a · (b · c). • Distributivity: ∀a, b, c ∈ A, a·(b+c) = a·b+a·c, and (b+c)·a = b·a+c·a. As we have already mentioned, Definition 1.5.1 is at least partially inspired by the properties of the set of integers A = Z, where “addition” and “product” are the usual operations on integers, and 0 is the integer zero. In particular, all the axioms in the definition correspond to well-known properties of the integers. However, this does not mean that an arbitrary ring (A, +, ·) behaves just like (Z, +, ·). For example, in general the “product” is not commutative, nor is there any reference in Definition 1.5.1 to an identity for the product (similar to the integer 1). The examples that we will study later show that a ring may have properties dramatically distinct from the ring of integers. If a ring A has an identity (for the product) then (A, ·) is a monoid. In this case we say that (A, +, ·) is a unitary ring. We shall always refer to the (unique) identity for addition has the zero of the ring, reserving the term identity without further qualifications to the (unique) identity for the product, whenever the ring has one (i.e., whenever it is a unitary ring). Also, a commutative or abelian ring is a ring where a · b = b · a, for all a, b ∈ A. 1.5. RINGS, INTEGRAL DOMAINS AND FIELDS 35 Examples 1.5.2. 1. The set Z of all integers with the usual operations of addition and product is an unitary abelian ring. On the other hand, the set of even integers 2Z = {0, ±2, ±4, . . . }, still with the usual operations of addition and product, is an abelian ring without identity. 2. The sets of rational, real and complex numbers, denoted respectively by Q, R and C, also with the usual addition and product are unitary abelian rings. 3. The set of n×n square matrices with entries in either Z, Q, R or C, which we will denote by Mn (A), where A = Z, Q, R or C, still with the usual operations of addition and product of matrices, are rings (non-abelian if n > 1) unitary (the identity is the identity matrix I). More generally, we can consider the ring of n × n matrices Mn (A) with entries in an arbitrary ring A. 4. The set of all functions f : R → R, with pointwise addition and product: (f + g)(x) = f (x) + g(x), (f g)(x) = f (x)g(x), is a unitary abelian ring (the identity is the constant function equal to 1). Similarly, we can consider the ring of continuous functions, differentiable function, etc. 5. The set Z2 = {0, 1}, with addition and product defined by 0 + 0 = 1 + 1 = 0, 0 · 0 = 0 · 1 = 1 · 0 = 0, 0 + 1 = 1 + 0 = 1, 1 · 1 = 1, is a unitary abelian ring. Note that the operations in this ring correspond to the boolean operations of “logical disjunction (OR)” and “logical conjunction (AND)” if we associate 0 → False, 1 → True. The operations in this ring correspond also to the usual “binary” arithmetic ( base 2 arihtmetic), with no transport addition. 6. The set of complex numbers of the form z = n + mi, with n, m ∈ Z, is a unitary abelian ring for the usual complex addition and multiplication. This ring is usually denoted by the symbol Z[i] and its elements are called the Gaussian integers 14 . 14 Carl Friedrich Gauss (1777-1855) was one of the great mathematicians from Göttingen. He was a prodigious child, and when he was only 19 years old he found out a method to construct a regular polygon with 17 sides using only a ruler and a compass (see Chapter 7). For more that 2000 years, since the greek geometers first attempts of such constructions, 36 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES Let us now turn to some basic properties which are shared by any ring. We start with properties which are consequences of results we have seen before in a more general context: Proposition 1.5.3. Let A be a ring. (i) Cancellation law for addition: a + c = b + c ⇒ a = b, and in particular d + d = d ⇒ d = 0. (ii) Uniqueness of symmetrics: The equation a + x = 0 has a unique solution in A, given by x = −a. (iii) Sign rules: −(−a) = a, −(a+b) = (−a)+(−b), and −(a−b) = (−a)+b. We have mentioned above that A is a unitary ring if and only if (A, ·) is a monoid. In this case, we denote by A∗ the set of invertible elements of the monoid (A, ·), also-called the invertible elements of the ring A. Again, recalling a result we proved in a more general context, we have: Proposition 1.5.4. Let A be a unitary ring. Then (A∗ , ·) is a group, and hence: (i) A∗ is closed relative to the product. (ii) If a ∈ A∗ , ax = 1 has the unique solution x = a−1 , where a−1 ∈ A∗ . (iii) If a, b ∈ A∗ , (ab)−1 = b−1 a−1 , and (a−1 )−1 = a. Examples 1.5.5. 1. The unique invertible integers are 1 and −1, i.e., Z∗ = {−1, 1}. 2. All the non-zero rationals, reals and complex numbers are invertible. Hence, we have, e.g., Q∗ = Q − {0} 15 . 3. In the ring Mn (R), the invertible elements are the non-singular matrices, in other words the matrices with non-zero determinat. In general, in a arbitrary ring A with identity 1 6= 0, we can only say that 1 and −1 are invertible, because 1 · 1 = (−1) · (−1) = 1. For the ring Z these the only regular polygons with a prime number of sides that were known to be constructible were the equilateral triangle and the regular pentagon. Some of the most relevant results to be discussed in this book were actually discovered by Gauss, such as the theory of congruences. 15 If A and B are sets, then A − B = {x ∈ A : x 6∈ B}. 1.5. RINGS, INTEGRAL DOMAINS AND FIELDS 37 are the only invertible elements, but at the other extreme, we have rings such as Q, R and C, where all the non-zero elements are invertible. There are intermediate case, such as the ring Mn (A) (A a ring), where determining the invertible elements is more complicated. Let us look next at properties which involve both operations in a ring, which therefore are not consequence of results we have seen so far. The first property, e.g, shows that the zero in a unitary ring is not invertible (so division by zero is not possible), except for the trivial ring A = {0} (where 0 = 1). Proposition 1.5.6. If A is a ring, for any a, b ∈ A, we have: (i) Product by zero: a0 = 0a = 0. (ii) Sign rules: −(ab) = (−a)b = a(−b), and (−a)(−b) = ab. Proof. To show that a0 = 0 we observe that a0 + a0 = a(0 + 0) (distributive property), = a0 (because 0 is the identity for +), ⇒ a0 = 0 (by the cancellation law). The proof of (ii) is left as an exercise. Some of the main differences between examples of rings that we have refer so far, are related with the behavior of the product. The only issue is not just the existence of inverses, but also wether the product satisfies the ”cancellation law”, which can formally be defined as follows: Definition 1.5.7. A ring A satisfies the cancellation law for the product if ∀a, b, c ∈ A, [c 6= 0 and (ac = bc or ca = cb)] ⇒ a = b. The restriction to elements c 6= 0 (which does not appear in the cancellation law for addition) is a consequence of the “product by zero” property. The fact that the cancellation law for the product does not hold in every ring, and hence is not a logical consequence of Definition 1.5.1, can already be seen in the example of M2 (R) where we have the product 0 0 0 0 1 0 0 0 1 0 , = = 0 0 0 0 0 0 0 1 0 0 38 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES so that ca = cb, with c 6= 0 and a 6= b. This example also shows that there are rings were cd = 0 with c 6= 0 and d 6= 0. In this case we call c and d zero divisors. This notion should not be confused with the notion of non-invertible element. However, if ca = cb (or ac = bc) and c is invertible, it is obvious that we can multiply both sides of the equalities by the inverse of c, and conclude a = b. In other words, a zero divisor is always non-invertible, but a ring may have non-invertible elements which are not zero divisors: for example, in Z there are no zero divisors. On the other hand, we leave as an exercise to check that: Proposition 1.5.8. The cancellation law for the product holds in a ring A if and only if A has no zero divisors. In particular, in a ring A where all non-zero elements are invertible (i.e., A∗ = A − {0}), the cancellation law holds. Such a ring is entitled a special name: Definition 1.5.9. A division ring is a unitary ring A where A∗ = A−{0}. An abelian division ring is called a field. The rings Q, R and C are obviously fields. Note that Z2 is a field with two elements, but 2Z or Mn (R) are neither fields, nor division rings. We will describe later a division ring which is not a field. There are also rings which are not division rings, since some non-zero elements are not invertible, but where the cancellation law for the product still holds. Examples of these are the ring of integers Z, the polynomial rings with coefficients in Z, Q, R ou C, and the ring of Gaussian integers. This class of rings is also entitled a special name: Definition 1.5.10. An integral domain is a unitary abelian ring where the cancellation law for the product holds. By Proposition 1.5.8, a unitary abelian ring is an integral domain if and only if it has no zero divisors. The following diagram shows the relationship between the various kinds of rings that we have seen so far (see also the exercises at the end of this section). If (A, +, ·) is a ring and B ⊂ A, it is possible that B is closed for both the addition and the product of A, i.e., it is possible that a, b ∈ B ⇒ a + b ∈ B and a · b ∈ B. If this is the case, then it is possible that (B, +, ·) is a ring. 1.5. RINGS, INTEGRAL DOMAINS AND FIELDS 39 ❘✐ ❣s ■ t❡❣r❛❧ ❉♦♠❛✐ s ❋✐❡❧✂s ❉✐✁✐s✐♦ ❘✐ ❣s Figure 1.5.1: Integral domains, division rings and fields. Definition 1.5.11. Let (A, +, ·) be a ring and let B ⊂ A be a subset closed for the addition and the product of the ring A. We say that B is a subring of A if (B, +, ·) is a ring. In this case, we say also that the ring A is an extension of the ring B. Examples 1.5.12. 1. Z is a subring of Q, and the even integers 2Z is a subring of Z. 2. The set N of all natural numbers (positive integers) is closed for both the addition and product of Z, but it is not a subring of Z. 3. C is an extension of R. 4. The ring Mn (C) is an extension of Mn (Z). According to a result for groups proved in the previous section, if B ⊂ A and B is non-empty, then (B, +) is a subgroup of (A, +) (i.e., it verifies (i) in Definition 1.5.1) if and only if it is closed for the difference. If B is closed for the addition and product of A, then it is obvious that is satisfies (ii) and (iii) in Definition 1.5.1, simply because its operations are the same as the ones of the ring A. We conclude that Proposition 1.5.13. Let A be a ring. A subset B ⊂ A is a subring of A if and only if it is non-empty and it is closed for the difference and the product. If A and B are rings we can form the direct product A × B of the underlying additive groups. Actually, it should be obvious that one can also introduce a product in A × B in a similar fashion, so that: (1.5.1) (a, b) + (a′ , b′ ) = (a + a′ , b + b′ ), (a, b)(a′ , b′ ) = (aa′ , bb′ ). 40 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES We leave it as an exercise to check that A × B with these operations is a ring, which we call the direct sum of the rings A and B, and which we denote by A ⊕ B. It should be clear that we can form the direct sum of any finite number rings (later we will consider the case of an infinite number of rings. Exercises. 1. Check wether the following algebraic structures are rings. If the answer is yes, indicate wether the ring is commutative, has an identity, and/or verifies the cancellation law for the product. If the answer is no, indicate which axioms in Definition 1.5.1 fail: (a) The set of all integers multiple of a fixed integer m, with the usual addition and product of integers; (b) The set of all linear transformations T : Rn → Rn , with the usual addition and the product being composition; (c) The set of all functions f : R → R, with the usual addition and the product being composition; (d) The set of all non-negative integers, with the usual addition and product of integers; (e) The set of all irrational numbers, with the usual addition and product of real numbers; (f) The set Z[i] of Gaussian integers, with the usual addition and product of complex numbers; (g) The set R[x] of all polynomials in the variable x with real coefficients with the usual addition and the product of polynomials. 16 , 2. Show that for any ring 0 = −0. 3. Complete the proof of Theorem 1.5.6. 4. Let A be a set with three distinct elements, which we denote by 0, 1, and 2. How many distinct operations of “addition” and “product” are there, turning A into a unitary ring, with zero the element 0 and identity the element 1? 5. Show that, in a general ring, the difference operation is neither commutative, nor associative, Give examples of rings where this operation satisfies both properties. 16 We will see later that if A is a ring, it is possible to define the ring of polynomials “in the variable x” with coefficients in A, which is usually denoted by A[x]. Note that the use of the symbol Z[i] to denote the ring of Gaussian integers uses the same idea, since a polynomial in the imaginary unit i can always be reduced to a first degree polynomial. 1.5. RINGS, INTEGRAL DOMAINS AND FIELDS 41 6. The equation a + x = b has exactly one solution in a ring A. What can you say about the solutions of the equation ax = b? What about the equation x = −x? 7. Assume that A, B and C are rings. Prove that the following statements hold: (a) The set A × B with the operations defined by (1.5.1) is a ring. (b) If A and B both have more than one element, then A ⊕ B has zero divisors. (c) If A and B are unitary rings then A ⊕ B is also a unitary ring and (A ⊕ B)∗ = A∗ × B ∗ . (d) A⊕B is isomorphic to B⊕A, and A⊕(B⊕C) is isomorphic to (A⊕B)⊕C. (e) φ : A → B ⊕ C is a homomorphism of rings if and only if φ(x) = (φ1 (x), φ2 (x)), with φ1 : A → B and φ2 : A → C homomorphisms of rings. 8. If X is any set and A is a ring, show that the set of all maps f : X → A is a ring with the “usual” operations of addition and product. 9. Use the result of the previous exercise with X = {0, 1} and A = Z2 to obtain an example of a ring with 4 elements. Show that this ring is isomorphic to Z2 ⊕ Z2 . 10. Let A = {0, 1, 2, 3} be a set with four elements. Show that there exists a field structure on A, with zero 0 and identity 1. 11. Let A be a ring. Show that the set of n × n matrices with entries in A, denoted Mn (A), is a ring. Show that if A has an identity, then Mn (A) also has an identity. 12. Let B be a subring of A. Give examples of rings that show that all the following cases are possible: (a) A has identity and B does not have identity. (b) A does not have identity and B has identity. (c) A and B have distinct identities. (Hint: consider subrings of rings of 2 × 2 matrices). 13. Give an example of non-abelian, finite, ring. 14. Show that any subring of a division ring satisfies the cancellation law for the product. Conclude that a subring of a field which contains the identity of the field is an integral domain (but not necessarily a field). 42 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES 15. Show that the cancellation law for the product holds in a ring A if and only if A has no zero divisors. 16. Let A be an integral domain. Do the following statements hold in A? (a) x2 = 1 implies that x = 1 or x = −1; (b) −1 6= 1. 17. Assume that ring A is an extension of the field K, and that K contains the identity of A. Show that A is vector space over K. 18. Determine the invertible elements in the ring Z[i] of Gaussian integers. 19. Show that a finite integral domain A 6= {0} must be a field. 20. Show that M2 (Z)∗ is infinite. (Hint: Show that M2 (Z)∗ = {A ∈ M2 (Z) : det(A) = ±1}.) 21. Let B be a subring of A. Show that: (a) the zero of B is the zero of A; (b) the symmetric of an element of B is the same in B and in A. Assume now that both A and B have an identity. (c) Is it true that B ∗ ⊂ A∗ ? (d) Is it true that the inverse of an element in B ∗ is necessarily the same as the inverse in A∗ ? What if the identities of A and B coincide? 1.6 Homomorphisms and Isomorphisms of Rings We consider now the notions of homomorphisms and isomorphisms associated with the concept of ring. As one could expect, these are maps whose domain and target are rings, and which moreover preserve the algebraic operations of these rings. Observe that in the following definition we use the same symbols to represent the algebraic operations of distinct rings, in spite that these can be very different operations. Note also that a homomorphism of rings is a special case of a group homomorphism. Definition 1.6.1. Let A and B be rings and φ : A → B a map. We say that φ is a homomorphism of rings if: (i) φ(a1 + a2 ) = φ(a1 ) + φ(a2 ), ∀a1 , a2 ∈ A; 1.6. HOMOMORPHISMS AND ISOMORPHISMS OF RINGS 43 (ii) φ(a1 a2 ) = φ(a1 )φ(a2 ), ∀a1 , a2 ∈ A. An isomorphism of rings is a homomorphism which is a bijection17 . We say that the rings A and B are isomorphic if there exists some isomorphism φ : A → B. Examples 1.6.2. 1. Let us denote the complex conjugate of z = x + iy by z = x − iy. We have z + w = z + w, zw = zw. According to the previous definition, the map φ : C → C given by φ(z) = z is an automorphism of the ring C. 2. Consider the map φ : C → M2 (R) defined by x −y φ(x + iy) = . y x Notice that so that, Similarly, so that, x −y y x + x′ y′ −y ′ x′ = x + x′ y + y′ −y − y ′ x + x′ , φ(x + iy) + φ(x′ + iy ′ ) = φ((x + iy) + (x′ + iy ′ )). x −y y x x′ y′ −y ′ x′ = xx′ − yy ′ xy ′ + x′ y −(xy ′ + x′ y) xx′ − yy ′ , φ(x + iy)φ(x′ + iy ′ ) = φ((x + iy)(x′ + iy ′ )). We conclude that φ is a homomorphism of rings. It is easy to check that φ is an injective homomorphism ( i.e., it is a monomorphism) which is not surjective. 3. Let φ : Z → Z2 be given by φ(n) = 0, if n is even, and φ(n) = 1, if n is odd. It is easy to check that φ is a surjective homomorphism of rings ( i.e., it is a epimorphism), which is not injective. 4. Let φ : R → M2 (R) be given by φ(x) = 17 x 0 0 0 . Just like the cases of monoids and groups, we will also use the terms epimorphism and monomorphism of rings, as well as endomorphism and automorphism of rings, which are defined in the obvious way. 44 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES It should be obvious that φ is a monomorphism (which is not surjective). Notice that the matrices in the image of this monomorphism form a subring of M2 (R) with identity distinct from the identity of ring M2 (R). 5. If S, T : Rn → Rn are linear transformations we define the addition S + T and composition ST by (S + T )(x) = S(x) + T (x), (ST )(x) = S(T (x)). We have already observed that these operations make the set of linear transformations of Rn into itself a ring, which we will denote by L(Rn , Rn ). Now fix a basis for Rn and let M(S) denote the matrix of the linear transformation S relative to this basis. It is clear that M(S) is a n × n matrix with real entries, and we know from Linear Algebra that the map M : L(Rn , Rn ) → Mn (R) satisfies the identities M(S + T ) = M(S) + M(T ), M(ST ) = M(S)M(T ). Hence M is an isomorphism of rings. In certain cases, when there exists an “obvious” injective homomorphism φ : A → B, we use the same symbol to denote a and φ(a). Although this is not recommended practice from a logical point of view, it is unavoidable in order not to have a too heavy notation. Examples of this practice are the representation of the real number a and the complex number (a, 0) or the constant polynomial p(x) = a, by the same symbol. Since every homomorphism of rings φ : A → B is also a homomorphism of the additive groups (A, +) and (B, +), we can restate Proposition 1.4.3 as follows: Proposition 1.6.3. Let φ : A → B be a homomorphism of rings. Then: (i) φ(0) = 0; (ii) φ(−a) = −φ(a). According to the definition of ring homomorphism and the previous proposition, a homomorphism φ : A → B “preserves” additions, products, the zero, and symmetrics. However, certain notions related to the product are nor preserved, at least in their most obvious form. One of the examples above shows that if A has identity 1 for the product, then φ(1) may fail to be the identity of the product in B. We will see in the exercises at the end of the section how to formulate a correct statement concerning “preservation” of identities. 1.6. HOMOMORPHISMS AND ISOMORPHISMS OF RINGS 45 The facts concerning the equation φ(x) = y for a ring homomorphism is very similar to the case of group homomorphisms. The kernel of φ is the set of solutions of the homogenous equation φ(x) = 0, i.e., N (φ) = {x ∈ A : φ(x) = 0}, while the image φ(A) is the set of y ∈ B for which the equation φ(x) = y has solutions x ∈ A. Obviously, φ is surjective if and only if φ(A) = B, and φ is injective if and only if N (φ) = {0}. If we restate Theorem 1.4.10 in additive notation, we obtain: Theorem 1.6.4. Let φ : A → B be a homomorphism of rings. Then: (i) φ(x) = φ(x′ ) if and only if x − x′ ∈ N (φ); (ii) φ is injective if and only if N (φ) = {0}; (iii) if x0 is a particular solution of φ(x) = y0 , then the general solution is x = x0 + n, with n ∈ N (φ). Here is a simple example illustrating this result. Example 1.6.5. Consider the homomorphism φ : Z → Z2 given by φ(n) = 0, if n is even, and φ(n) = 1, if n is odd. The kernel of φ is the set of even integers. Since φ(0) = 0 and φ(1) = 1, we conclude that φ(x) = 0 φ(x) = 1 ⇔ x = 2n, with n ∈ Z, ⇔ x = 1 + 2n, with n ∈ Z. Given a homomorphism φ : A → B, we know that N (φ) and φ(A) are subgroups of the additive groups (A, +) and (B, +). It is easy to check that these subgroups are actually subrings: Proposition 1.6.6. If φ : A → B is a ring homomorphism, then N (φ) is a subring of A, and φ(A) is a subring of B. In particular, if φ is injective, then A is isomorphic to φ(A). Proof. In order to show that N (φ) and φ(A) are subrings, we need only to show that they are closed for the respective products. If b1 , b2 ∈ φ(A), there exist a1 , a2 ∈ A such that b1 = φ(a1 ) and b2 = φ(a2 ). Hence, since φ is a homomorphism, we have b1 b2 = φ(a1 )φ(a2 ) = φ(a1 a2 ) ∈ φ(A), 46 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES which shows that φ(A) is closed for the product of B. If a1 , a2 ∈ N (φ), then φ(a1 ) = φ(a2 ) = 0. Again using that φ is a homomorphism, we have φ(a1 a2 ) = φ(a1 )φ(a2 ) = 0 · 0 = 0, hence a1 a2 ∈ N (φ), and N (φ) is closed for the product of A. Finally, it is obvious that if φ is injective, then φ is gives an isomorphism between A and φ(A). Examples 1.6.7. 1. Considere again the homomorphism φ : C → M2 defined by x −y φ(x + iy) = . y x Form the theorem above, we conclude that the set of matrices of the form x −y y x is a subring of M2 (R), isomorphic to the field of complex numbers. 2. If instead we consider the homomorphism φ : R → M2 given by x 0 φ(x) = , 0 0 we conclude that M2 contains also subring isomorphic to the field of real numbers. Note, however, that M2 contains various distinct subrings isomorphic to the field of real numbers: in the previous example we can restrict to the real numbers and obtain the injective homomorphism φ : R → M2 , x 7→ xI, where I is the identity matrix. We have seen that the kernel of a group homomorphism is a special kind of subgroup, namely a normal subgroup. Similarly, the kernel N (φ) of a ring homomorphism φ : A → B is a subring of A of a special kind: not only N (φ) is closed for the product, just like any subring, but also in order for the product ab to belong to N (φ) it is enough that just one of the factors a or b belongs to N (φ): Proposition 1.6.8. Let φ : A → B be a ring homomorphism If a ∈ N (φ) and a′ is any element of A, then both aa′ and a′ a belong to N (φ). 1.6. HOMOMORPHISMS AND ISOMORPHISMS OF RINGS 47 Proof. Note that a1 a2 ∈ N (φ) ⇔ φ(a1 a2 ) = 0 ⇔ φ(a1 )φ(a2 ) = 0. For the product φ(a1 )φ(a2 ) to be zero it is enough for one of the factors φ(a1 ) or φ(a2 ) to be zero, i.e., for one of the elements a1 or a2 to belong to the kernel N (φ). The subrings with this property deserve a special name. Definition 1.6.9. Let A be ring and I ⊂ A a subring. We say that I is an ideal of A if for all a ∈ A and b ∈ I one has ab, ba ∈ I. Not all subrings of a ring are ideals. The following examples illustrate both cases. Examples 1.6.10. 1. It is clear that Z is a subring of R (the difference and the product of integers is always an integer). However, Z is not an ideal of R (the product of an integer by real number may fail to be an integer). 2. Let A = Z and let I be the set of even integers. Obvioulsy, I is a subring of Z (the difference and the product of even integers is an even integer) but I is also an ideal of Z (the product of any integer by an even integer is always an even integer). We will see later that all the subrings of Z are automatically ideals. Note also that I is a subring of R, but obviously not an ideal of R. 3. Every ring A always has the “trivial” ideals {0} and A. 4. In some cases, a ring only has the trivial ideals. For example, this always happens with a field. In fact, if K is a field, and I ⊂ K is an ideal which contains an element x 6= 0 ( i.e., if I 6= {0}), then xx−1 = 1 ∈ I (because x ∈ I and x−1 ∈ K). But then every element y ∈ K belongs to I, because y = 1y, where 1 ∈ I and y ∈ K. In a non-abelian ring A we can consider subrings for which the condition of ideal is only verified for one side: a left ideal of A is a subring B ⊂ A such that if a ∈ A and b ∈ B then ab ∈ B. Similarly, a right ideal of A is a subring B ⊂ A such that if a ∈ A and b ∈ B then ba ∈ B. Obviously I ⊂ A is an ideal in the sense of Definition 1.6.9 if and only if it is both a left and a right ideal. For an abelian ring, all these notions coincide. Due to Proposition 1.6.8, the lateral ideals play a much less important role that the bilateral ideals. 48 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES Exercises. 1. Let A be a ring and φ, ψ : A → A endomorphisms. Show that the composition φ ◦ ψ is an endomorphism, but the sum φ + ψ may fail to be one. In particular, show that the set of endomorphisms of A, denoted by End(A), with the composition operation, form a monoid. 2. Let A be a ring and φ, ψ : A → A automorphisms. Show that the composition φ◦ψ and the inverse φ−1 are automorphisms. In particular, show that the set of all automorphisms of A, denoted by Aut(A), with the composition operation, form a group. 3. Every integer m is of the form m = 3n + r, where n is the result of the division of m by 3 and r is the remainder. Note that n and r are unique, as long as 0 ≤ r < 3. Show that the map φ : Z → Z3 given by φ(m) = r is a homomorphism of rings. What is the kernel of this homomorphism? 4. Show that if A and B are rings, A has identity 1, and φ : A → B is a homomorphism then φ(1) is the identity of φ(A), which may differ from the identity of B. Show also that if a ∈ A∗ , then φ(a−1 ) is the inverse of φ(a) in φ(A), which may not be the inverse of φ(a) in B. In particular, φ(a) may fail to be invertible in B. 5. Show that, if A and B are rings, A has identity 1, and φ : A → B is an isomorphism then φ(1) is the identity of B. Show also that if a ∈ A∗ , then φ(a−1 ) is the inverse of φ(a) in B and φ(A∗ ) = B ∗ . 6. Show that, if A and B are rings, A has identity 1, and φ : A → B is an isomorphism, then B is a field (respectively, division ring, integral domain) if and only if A is a field (respectively, division ring, integral domain). 7. Show that, if K is a field, A is a ring, and φ : K → A is a homomorphism, then either A contains a subring isomorphic to K, or φ is identically 0. 8. Are there any subrings of Z ⊕ Z which are not ideals of Z ⊕ Z? 9. Determine End(A) when A = Z and A = Q. (Hint: Find φ(1) and proceed by induction.) 10. Determine End(R). (Hint: Show that φ(x) ≥ 0 whenever x ≥ 0, so φ is non-decreasing.) 11. Determine all the homomorphisms φ : C → C which satisfy φ(x) ∈ R whenever x ∈ R. (Hint: Show that if φ(1) 6= 0 and x = φ(i) then x2 = −1.) 1.6. HOMOMORPHISMS AND ISOMORPHISMS OF RINGS 12. Assume that I is an ideal of M2 (R) and M2 (R). 1 0 0 0 49 ∈ I. Show that I = 13. Determine all the ideals of M2 (R). 14. Let A be a commutative ring with identity and consider the ring Mn (A). Show that the map det : Mn (A) → A defined by X det(B) = sgn(π)a1π(1) a2π(2) · · · anπ(n) , π∈Sn where B = (aij ), preserve products (but is not a homomorphism of rings). Conclude that a matrix B ∈ Mn (A)∗ if and only if det(B) ∈ A∗ . 15. Assume that C ⊂ B ⊂ A where B is a subring of A. (a) If C is a subring of B, is it true that C is a subring of A? (b) If C is an ideal of B, is it true that C is an ideal of A? (c) If C is an ideal of A, is it true that C is an ideal of B? 16. Show that, if A is a unitary abelian ring and its only ideals are {0} and A, then A is a field. If A is non-abelian, is it true that A is necessarily a division ring? 17. Let A and B be unitary rings. (a) Assume that J is an ideal of A ⊕ B, and show that (a, b′ ), (a′ , b) ∈ J ⇒ (a, b) ∈ J. (b) Show that K ⊂ A × B is an ideal of A ⊕ B if and only if K = K1 × K2 , where K1 is an ideal of A, and K2 is an ideal of B. 18. This exercise concerns the decomposition of rings into direct sums. (a) Assume first, that A is isomorphic to B ⊕ C, and show that there exist ideals I and J of A such that I ∩ J = {0}, and I + J = {i + j : i ∈ I, j ∈ J} = A. (b) Assume now that there exist ideals I and J of A such that I ∩ J = {0}, and I + J = A. Show that A is isomorphic to I ⊕ J as follows: (i) Show that, if i ∈ I, j ∈ J and i + j = 0, then i = j = 0. (ii) Show that, if i ∈ I and j ∈ J, then ij = 0. (iii) Show that the map φ : I ⊕ J → A, given by φ(i, j) = i + j, is an isomorphism of rings. 50 1.7 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES The Quaternions The field of complex numbers is an extension of the field of real numbers. This last field is an extension of the field of rationals, which in turn is an extension of the ring of integers. It is natural to ask if one can continue this sequence, if there is an extension of the field of complex numbers, or to what extend this process of determining successive extensions has a “natural” end. In the XIX Century, W. R. Hamilton 18 asked and tried to solve this question. Z Q R C ?? PSfrag replaements Figure 1.7.1: Hamilton’s Problem. In his first attempt (which lasted 20 years!), Hamilton looked for a field consisting of “numbers” of the form a + ib + cj, where a, b, c ∈ R and i is the imaginary unit. In other words, in modern language, he tried to look for a field structure in the set R3 containing a subfield isomorphic to the field of complex numbers. After many attempts to find a “reasonable” value for the product ij (in the form ij = a + bi + cj), he realized he needed to introduce an additional “number” k, so that ij = k. Hamilton eventually discovered the existence not of a field, but of a division ring structure on the set R4 . The elements of this division ring are now called quaternions, or Hamilton’s numbers. We denote the elements of the canonical basis of the vector space R4 by 1 = (1, 0, 0, 0), 18 i = (0, 1, 0, 0), j = (0, 0, 1, 0), k = (0, 0, 0, 1). William Rowan Hamilton (1805-1865), was a great Irish astronomer and mathematician. Hamilton like Gauss was very precocious: at the age of 5 he could read Greek Hebrew and Latin, and at the age of 10 he was familiar with half a dozen oriental languages! 1.7. THE QUATERNIONS 51 Hence, we can write a quaternion q = (a, b, c, d) as q = a1 + bi + cj + dk, where a, b, c, d are real numbers. We look for a ring structure on R4 for which the injective maps φ : R → R4 and ψ : C → R4 defined by φ(x) = x1 and ψ(x + iy) = x1 + yi, are ring homomorphisms, so that we can identify the real numbers x ∈ R with the quaternions {(x, 0, 0, 0) : x ∈ R}, and the complex numbers x + iy ∈ C with the quaternions {(x, y, 0, 0) : x, y ∈ R}. Given a quaternion q = a1 + bi + cj + dk, we call a1 the real part of q, and bi + cj + dk vectorial part of q. As in the case of complex numbers, we will usually write q = a + bi + cj + dk, so that the quaternion 1 is not included in the notation. The addition of quaternions is the usual vector addition in R4 . Hence, it is obvious that, if x and y are reals and z and w are complex numbers, then φ(x + y) = φ(x) + φ(y), ψ(z + w) = ψ(z) + ψ(w). The product of quaternions is harder to discover. Let us observe first that, if we think of the product of a real number a by the quaternion q as the product (a1)q, then this product should amount to the usual product of a scalar by a vector, and in particular the quaternions form a vector space of dimension 4 over the reals. On the other hand, we should also have i2 = −1, (1.7.1) since i2 = ψ(i)ψ(i) = ψ(i2 ) = ψ(−1) = φ(−1) = −φ(1) = −1. Hamilton found the fundamental identities: (1.7.2) ij = k, jk = i, ki = j. Using these identities, the associativity of the product, and the relation i2 = −1, it is possible to determine the products ji, kj, ik, j 2 and k2 . For example, ij = k ⇒ i(ij) = ik ⇒ (ii)j = ik ⇒ (−1)j = ik ⇒ −j = ik. We leave the details of the other products as an exercise. The final result is: (1.7.3) j 2 = k2 = −1, ji = −k, kj = −i, ik = −j. 52 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES Observe that the product of quaternions is not commutative, so the quaternions do not form a field. From the identities (1.7.1), (1.7.2), and (1.7.3), and using only the associative and distributive properties, it is possible to find the product of two arbitrary quaternions. Instead, we choose to invert the whole process and to define formally the ring of quaternions as follows. Theorem 1.7.1. The set R4 with the usual vector addition and the product of q = a + v and r = b + w defined by19 (1.7.4) (a + v)(b + w) = ab − (v · w) + aw + bv + v × w, is a division ring which will be denoted by H. Proof. We start by observing that the product (q, r) → qr defined by (1.7.4) coincides with the product by scalars if q ∈ R, and also that it is R-bilinear: given a1 , a2 ∈ R, q, q 1 , q 2 ∈ R4 and r 1 , r 2 , r ∈ R4 we have (a1 q 1 + a2 q2 )r = a1 (q 1 r) + a2 (q 2 r), q(a1 r 1 + a2 r 2 ) = a1 (qr1 ) + a2 (qr 2 ). We check also that, with the notation i, j and k for the canonical basis of R3 , the identities (1.7.1), (1.7.2), and (1.7.3) hold. In order to check that associativity holds, we now use the bilinearity and the identities (1.7.1), (1.7.2), and (1.7.3), to find the products (a0 + a1 i + a2 j + a3 k) ((b0 + b1 i + b2 j + b3 k)(c0 + c1 i + c2 j + c3 k)) , and ((a0 + a1 i + a2 j + a3 k)(b0 + b1 i + b2 j + b3 k)) (c0 + c1 i + c2 j + c3 k), and verify that they coincide. Finally, it is clear that q1 = 1q = q, and we leave as an exercise to check that for any non-zero quaternion q = (a, b, c, d) the following identities hold: (1.7.5) 19 qq ′ = q ′ q = 1, where q′ = a − bi − cj − dk . a2 + b2 + c2 + d2 In this expression, (v · w) and v × w denote the usual inner and cross products. Actually, these operations and the notation i, j and k for the canonical basis of R3 are traces that remain from Hamilton’s original work. 1.7. THE QUATERNIONS 53 An interesting fact is that the ring of quaternions H is isomorphic to a subring of the ring M4 (R) of 4 × 4 matrices with real entries. This gives concrete realization of this division ring and a different proof of Theorem 1.7.1. For that, consider the 2 × 2 matrices: 1 0 0 −1 0 0 M= , N= , 0= . 0 1 1 0 0 0 These allow to define a linear transformation ρ : R4 → M4 (R) by setting on the canonical basis: N 0 0 M 0 N ρ(i) = , ρ(j) = , ρ(k) = , 0 −N −M 0 N 0 and ρ(1) = M 0 0 M . Then, we have: Proposition 1.7.2. The map ρ : H → M4 (R) is an injective homomorphism, so the ring M4 (R) has a subring isomorphic to the division ring H. The proof of this proposition is a simple computation. We will see later that as a consequence of the Fundamental Theorem of Algebra (“Every polynomial with complex coefficients of positive degree has one complex root”) there is no field which is an extension of C and at the same time a vector space of finite dimension over R or C. In other words, we now know that the original problem of Hamilton does not have a solution. Exercises. 1. Using formula (1.7.4) for the product of quaternions show that relations (1.7.1), (1.7.2) and (1.7.3) hold. 2. Using only (1.7.4), bilinearity and the identities (1.7.1), (1.7.2), and (1.7.3), prove formula (1.7.5) for the inverse of a quaternion q 6= 0. 3. For any quaternion q = a1 + bi + cj + dk one defines its conjugate to be the quaternion q = a1 − bi − cj − dk. Show that: (a) The map φ : H → H defined by φ(q) = q is an automorphism of (H, +). What can you say about φ(q 1 q 2 )? √ (b) qq = ||q||2 where ||q|| = a2 + b2 + c2 + d2 denotes the norm of the quaternion q = a1 + bi + cj + dk; 54 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES (c) The inverse of the quaternion q is the quaternion q −1 = q ||q||2 . 4. Check that the set formed by all linear combinations of the 4 × 4 matrices: M 0 N 0 0 M 0 N , , , , 0 M 0 −N −M 0 N 0 is a subring of M4 (R). 5. Prove Proposition 1.7.2, i.e., show that the map ρ : H → M4 (R) is is an injective homomorphism. 6. Find all solutions of the equation x2 = −1 in the ring H of quaternions. 7. Assume that {1} ⊂ K ⊂ L ⊂ M are fields, with M an extension of L, and L an extension of K. Recall that M is a vector space over K and over L, and that, in turn, L is a vector space over K. Assume that the dimension of M over L is m, and that the dimension of L over K is n. Show that the dimension of M over K is mn. Conclude that a non-trivial extension of C has at least dimension 4 over R. 8. Consider the quaternions of the form a + bi + cj + dk, where a, b, c, d ∈ Z. Verify that these quaternions form a non-abelian ring, which is not a division ring, but where the cancellation law holds. 9. Verify that the set formed by all the invertible elements of the ring in the previous exercise is a non-abelian group with 8 elements, which will be denoted by H8 . Determine all the subgroups of H8 , and identify the normal subgroups. 1.8 Symmetries In this section we will see how group theory can be used to formalize the notion of symmetry. We will focus mainly on symmetries of planar figures. For that, we start by observing that a “plane figure” is formally a set Ω ⊂ R2 , and we call a symmetry of Ω any map f : R2 → R2 which preserves distances between points of R2 , i.e., such that: ||f (x) − f (y)|| = ||x − y||, ∀x, y ∈ R2 , and which transforms the set Ω in itself, i.e., such that: f (Ω) = Ω. 1.8. SYMMETRIES 55 Example 1.8.1. If Ω is the unit circle centered at the origin, any planar rotation around the origin is a symmetry of Ω. Similarly, any reflection of the plane relative to a line through the origin is also a symmetry of Ω. The symmetries of the plane or, more generally, the symmetries of Rn , are the maps f : Rn → Rn which preserve distances, and which for that reason are called isometries. Examples 1.8.2. 1. Any translation is an isometry of the plane. 2. Any rotation is an isometry of the plane. 3. Any reflection (relative to a line or a point) is an isometry of the plane. Our next aim is to classify all the isometries of Rn . For that, we look first at the isometries f which fix the origin, i.e., such that f (0) = 0. Proposition 1.8.3. For a map f : Rn → Rn such that f (0) = 0, the following statements are equivalent: (i) f is an isometry, i.e., ||f (x) − f (y)|| = ||x − y||, ∀x, y ∈ Rn ; (ii) f preserves inner products, i.e.,20 f (x) · f (y) = x · y, ∀x, y ∈ Rn . Proof. We assume first that f is an isometry, and we observe that ||x|| = ||x − 0|| = ||f (x) − f (0)|| = ||f (x)||. Moreover, we note that ||f (x) − f (y)||2 = ||f (x)||2 + ||f (y)||2 − 2f (x) · f (y), ||x − y||2 = ||x||2 + ||y||2 − 2x · y. Since by assumption ||f (x) − f (y)|| = ||x − y||, and we have already shown that ||x|| = ||f (x)||, it follows immediately that f (x) · f (y) = x · y, for all x, y ∈ Rn . We conclude that (i) implies (ii). We leave as an exercise the proof that (ii) implies (i). 20 In this section we denote by ‘·” the usual inner product of Rn . 56 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES Still considering only isometries fixing the origin, we show next that such an isometry is necessarily a linear transformation. Proposition 1.8.4. If f : Rn → Rn is an isometry and f (0) = 0, then f is a linear transformation. Proof. Denote by {e1 , . . . , en } the canonical basis of Rn and let vk = f (ek ). The vectors v k are unitary (because ||v k || = ||f (ek )|| = ||ek || = 1) and orthogonal (because v i · v j = f (ei ) · f (ej ) = ei · ej ). Hence, the vectors {v 1 , . . . , v n } also form a basis of Rn (why?). Let now x, y ∈ Rn , where y = f (x). Since {e1 , . . . , en } and {v 1 , . . . , v n } are both bases of Rn , there exist scalars x1 , . . . , xn and y1 , . . . , yn such that x= n X xk ek , y= k=1 n X yk v k . k=1 Notice that xk = x · ek and yk = y · v k , and since y · v k = f (x) · f (ek ) = x · ek , we must have xk = yk . Therefore: n n n X X X f (x) = f ( xk ek ) = yk v k = xk f (ek ), k=1 k=1 k=1 so f is a linear transformation. Since an isometry satisfying f (0) = 0 is a linear transformation, it is natural to characterize an isometry in terms of its matrix representation. For that, we recall that a n × n matrix is called orthogonal if AT A = I, or equivalently A−1 = AT . Since det(AT ) = det(A), for any orthogonal matrix A we have [det A]2 = det AT det A = det(AT A) = det I = 1, so that det A = ±1. Proposition 1.8.5. If f : Rn → Rn is any map, the following statements are equivalent: (i) f is an isometry and f (0) = 0; (ii) f is a linear transformation and the matrix of f relative to the canonical basis is orthogonal: f (x) = Ax, with AT A = I. 1.8. SYMMETRIES 57 Proof. (i) ⇒ (ii). If A isP the matrix of f relative to the canonical basis, then A = (aij ), where v j = ni=1 aij ei . Therefore the column j of A is formed by the components of the vector v j in the canonical basis. Since the vectors v j are unitary and orthogonal, we conclude that vj · vk = n X i=1 aij aik =   1  0 if j = k, if j 6= k, which means that AT A = I, This shows that the matrix A is orthogonal. (ii) ⇒ (i). Exercise. The linear transformations which are isometries are usually called orthogonal transformations. We can now characterize an arbitrary isometry of Rn as a orthogonal transformation followed by a translation, as follows: Theorem 1.8.6. If f : Rn → Rn , the following statements are equivalent: (i) f is an isometry, (ii) there exists an orthogonal matrix A and a vector a ∈ Rn such that f (x) = Ax + a. Proof. (i) ⇒ (ii). Let f be an isometry, and a = f (0). The map g(x) = f (x) − a satisfies g(0) = 0 and is an isometry: ||g(x) − g(y)|| = ||f (x) − f (y)|| = ||x − y||. According to Proposition 1.8.5, g is an orthogonal transformation. (ii) ⇒ (i). An orthogonal matrix A defines an orthogonal transformation g(x) = Ax, which we know is an isometry. It is immediate to check that if a ∈ Rn , then f (x) = Ax + a is an isometry. The set of all isometries f : Rn → Rn , with binary operation composition of isometries, form a group E(n, R), called the symmetry group of Rn or euclidean group (why is this a group?). Theorem 1.8.6 shows that an isometry f can be identified with a pair (A, a), formed by an orthogonal matrix A and a vector a ∈ Rn , so one often identifies the euclidean group with the set of all such pairs: E(n, R) = {(A, a) : A ∈ Mn (R), AT A = I, a ∈ Rn }. 58 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES We leave it as an exercise to check that the composition of isometries corresponds to the following binary operations on pairs: (A1 , a1 ) · (A2 , a2 ) = (A1 A2 , A1 a2 + a1 ). The euclidean group has various interesting subgroups. For example, the translations are the isometries of the form f (x) = x + a and they form a subgroup of E(n, R) isomorphic to (Rn , +). Under the identification of isometries with pairs, we have that Rn ⊂ E(n, R) corresponds to the injective homomorphism: a 7→ (I, a). The orthogonal transformations f : Rn → Rn also form a subgroup O(n, R) ⊂ E(n, R), called the orthogonal group. We can view this group as the group of orthogonal matrices: O(n, R) = {A ∈ Mn (R) : AT A = I}. and the inclusion O(n, R) ⊂ E(n, R) corresponds to the injective homomorphism: A 7→ (A, 0). As we have observed above, the determinant of an orthogonal transformation f must be 1 or −1, and the orthogonal transformations f of determinant 1 form a subgroup SO(n, R) ⊂ O(n, R) called the special orthogonal group. The elements of SO(n, R) are called proper rotations ou simply rotations. In terms of matrices: SO(n, R) = {A ∈ Mn (R) : AT A = I, det A = 1}. An isometry f : Rn → Rn which coincides with its own inverse, i.e., such that f ◦ f = I is called a reflection. An isometry f (x) = Ax + a is a reflection if and only if A2 = I and Aa = −a. Composition of reflections may fail to be a reflection, so reflections do not form a group. More generally, if Ω ⊂ Rn , then the isometries of Rn which preserve Ω form a subgroup GΩ ⊂ E(n, R), called the symmetry group of Ω: GΩ = {f : Rn → Rn : f is an isometry and f (Ω) = Ω}. We can talk of the symmetries of Ω which are translations, orthogonal transformations, rotations, reflections, etc. Examples 1.8.7. 1. Let Ω ⊂ R2 be a rectangle centered at origin, with sides (of different lengths) parallel to the coordinate axes. The corresponding symmetry group GΩ has 4 elements: the identity, the reflections in the axes Ox and Oy, and the rotation of 180o around the origin (which is also a reflection relative to the origin). The symmetry group of the rectangle, often called the Klein group, is isomorphic to the direct product Z2 × Z2 . 1.8. SYMMETRIES 59 PSfrag replaements a b Figure 1.8.1: Symmetries of a rectangle. 2. If Ω ⊂ R2 is a regular polygon with n sides centered at origin, the corresponding symmetry group GΩ is called the dihedral group, and has 2n elements: the rotations of 2kπ/n around the origin, the reflections relative to lines through the origin and the vertices, and the reflections relative to lines through the origin and which bisect the sides of the polygon. It is common to denote this group by the symbol Dn . PSfrag replaements a b Figure 1.8.2: Symmetries of an equilateral triangle. For example, for an equilateral triangle (n = 3), there are three rotational symmetries {R, R2 , R3 = I} generated by a rotation R of 2π/3 around the origin. Their matrix representations relative to the canonical basis of R2 are ! √ √ ! 1 3 − 23 − −√12 2 2 2 √ R= , R = . 3 − 23 − 21 − 12 2 We have also three axes of symmetry giving rise to 3 reflections {σ1 , σ2 , σ3 }. If 4πi 2πi we choose the triangle so that its vertices are 1, e 3 , and e 3 , then the matrix representations of the reflections relative to the canonical basis of R2 are: √ ! √ ! 1 1 3 3 − − 1 0 − 2 2 √2 √2 σ1 = , σ = . , σ2 = 3 3 1 1 0 −1 − 3 2 2 2 2 60 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES We leave as an exercise to check that the resulting symmetry group D3 is isomorphic to the symmetric group S3 . In the previous examples, all the figures are bounded. It is also interesting to study groups of symmetry of unbounded figures. Consider, for example, the following subset of the plane: Ω = {na + mb : n, m ∈ Z}, where a, b ∈ R2 are fixed, linearly independent vectors of the plane. Then Ω is a discrete set of points, which can be thought of as a model for a planar lattice of atoms that extend indefinite in the plane (e.g., a crystal). b a PSfrag replaements Figure 1.8.3: A planar lattice Ω. The unbounded planar sets which exhibit symmetry are used extensively in the decoration of surfaces (tilings). Looking at various examples of decorations, one is lead to conjecture that such symmetric tilings are obtained by repetition of one the following figures: equilateral triangle, square, rectangle or hexagon21 . We will not determine the full symmetry group GΩ , but will focus instead on the simpler problem of determining which rotations can be symmetries of Ω. The facts mentioned above suggest that if some rotation is a symmetry of Ω, then it must be a rotation by 60o , 90o , 120o or 180o (besides, of course, the identity, which is also a rotation). To see that this is indeed the case, let f be some rotation which is a symmetry of Ω, and denote by A its matrix 21 For a detail discussion of the notion of symmetry and its relationship with art, see the classical monograph of H. Weyl, Symmetry, Princeton University Press, Princeton N. J. (1952). 1.8. SYMMETRIES 61 PSfrag replaements a b Figure 1.8.4: Symmetries of unbounded planar figures. representation relative to the canonical basis. We can always write A in the form: cos θ − sin θ A= . sin θ cos θ Also, let B be the matrix representation of f relative to the basis {a, b}. In this basis all the elements of Ω have integer coordinates (indeed, the points of Ω are precisely the vectors of R2 whose components in the basis {a, b} are integers). Hence, the matrix B must have integer entries, since these entries represent the vectors f (a) and f (b), which are necessarily points of Ω. We can then write: n11 n12 , B= n21 n22 where the nij are integers. Since A and B represent the same linear transformation relative to distinct basis, we see that A and B are similar matrices i.e., there exists a non-singular matrix S such that S −1 AS = B. Now observe that A and B have the same trace (the sum of the diagonal elements): 2 cos θ = tr(A) = tr(ASS −1 ) = tr(S −1 AS) = tr(B) = n11 + n22 , where we used the fact that tr(CD) = tr(DC), for any square matrices C and D. In particular, 2 cos θ must be an integer. Since −1 ≤ cos θ ≤ 1, we 62 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES conclude that the possible values of cos θ are −1, −1/2, 0, 1/2 or 1, i.e., that θ = 180o , 120o , 90o , 60o or 0o , as claimed. There are many other examples, with concrete important applications, where group theory plays fundamental role in understanding issues related with the idea of symmetry. Examples 1.8.8. 1. By exploring the symmetries of certain unbounded planar regions one can determine the so-called crystallographic groups, which are used in the classification of the crystals that occur in Nature. 2. We will see in Chapter 7 that one can associate to each polynomial a group of symmetries, consisting of certain permutations of its roots, the so-called Galois group of the polynomial. The structure of the Galois group of the polynomial determines whether the roots of the polynomials can be obtained from its coefficients by expressions which only involve fractions and radicals. The application of group theory to Galois theory allows one to explain why there are no formulas to express the solutions of polynomial equations for degree greater or equal to 5. 3. One of the most basic and fruitful ideas in Physics in the principle of “ material objectivity” or “ material frame-indifference”. This principle expresses the idea that different observers, using different frames of reference, should describe the same physical phenomenon using the same laws of Physics. From a mathematical point of view, this forces the laws of Nature to have as symmetry groups the groups of transformations that relate observers in different frames of reference. For example, according to this principle, the laws of Classical Mechanics and the laws of Electromagnetism must have the same symmetry group. It was the careful and systematic use of this idea that led Albert Einstein to the discover the Theory of Relativity, for sure one of the greatest achievements of Science. Exercises. 1. Assume that f : R2 → R2 is a non-trivial isometry of the plane. Show that: (a) if f fixes two points a and b of the plane. then f is a reflection relative to the line determined by a and b; (b) if f fixes a unique point a, then f is a rotation around a; (c) if f does not fix any point, then f is a translation, followed possibly by a rotation and/or a reflection. 2. Conclude the proof of Proposition 1.8.3. 3. Conclude the proof of Proposition 1.8.5. 1.8. SYMMETRIES 63 4. Show that the Klein group is isomorphic to Z2 ⊕ Z2 . 5. Show that the following sets of transformations are groups: (a) The orthogonal transformations and proper rotations f : Rn → Rn . (b) The isometries f : Rn → Rn . (c) The symmetries of a figure Ω ⊂ Rn . 6. Show that the special orthogonal group SO(n, R) is a normal subgroup of O(n, R), but it is not a normal subgroupof E(n, R). 7. Show that D3 (the symmetry group of the equilateral triangle) is isomorphic to the symmetric group S3 . 8. Describe the the group D4 (the symmetry group of a square). 9. The honeycomb in Figure 1.8.4 has rotational symmetries consisting of rotations of 60o around the centers of the faces, as well as rotations of 120o around the vertices. How can one modify this figure so that the only rotational symmetries are rotations of 120o ? 10. Let f : R3 → R3 be a rotation. Show that there exists a basis {v1 , v 2 , v 3 } of R3 relative to which the matrix representation of f is   1 0 0  0 cos θ − sin θ  . 0 sin θ cos θ One calls v 1 the axis of rotation and θ the angle of rotation of f . Can you generalize this result to dimensions n > 3? 11. Let f : R3 → R3 be a reflection that fixes the origin. Show that there exists a basis {v 1 , v 2 , v 3 } of R3 relative to which the matrix representation of f is one of the following:       1 0 0 1 0 0 −1 0 0  0 1 0  ,  0 −1 0  ,  0 −1 0  0 0 −1 0 0 −1 0 0 −1 (reflection through a plane, through a line, through the origin, respectively). What if f (0) 6= 0? Can you generalize this result to dimensions n > 3? 12. Let f : R3 → R3 be an isometry. Show that f can be factored as the composition of a rotation, followed by a reflection, followed by a translation. Is this factorization unique? 64 CHAPTER 1. BASIC ALGEBRAIC STRUCTURES Chapter 2 The Integers 2.1 The Axioms of the Integers We have mentioned (always informally) the ring of integers and some of the properties that these numbers satisfy. So far we have assumed that these properties are well-known and “obvious”. However, it is not possible to develop any mathematical theory in a precise way without a careful examination of its foundations. In particular, one must make a careful distinction between what are the axioms of the theory and what are the consequences of those axioms, i.e., the propositions and theorems. We wish now to discuss which properties of the integers should be consider axioms and which are simply consequences of those axioms. Note, however, that our discussion will not be completely formal: we will continue to use notions and results from set theory in an informal way, without a rigorous formulation. The reason is that they fall outside the domain of Algebra. However, the reader may wish to look at the Appendix in this respect. The choice of axioms that form the base of some theory is, up to some point, arbitrary. It is always possible to chose distinct, but logically equivalent axioms. Hence, the final choice is necessarily a consequence of subjective criteria related to such aesthetical aspects as elegance, brevity, or efficiency of presentation. We choose to start with an axiom which fits easily in the previous discussion: Axiom I. There exists an integral domain Z, whose elements are called integers. 65 66 CHAPTER 2. THE INTEGERS The zero and the identity of Z will be denoted by 0 and 1, respectively. It follows from the results in the previous chapter that a large number of elementary properties of the integers are a direct consequence of Axiom I. In particular, the cancellation laws for addition and product and the sign rules are valid in Z, as well as the property 0 = −0 (proved in the exercises). On the other hand, it should be clear that the this axiom does not characterize completely the integers; for example, it is impossible to decide based solely on this axiom if the property 1 6= −1 holds or not, or to decide if the integers form a finite or a infinite set (why?). In order to complete this with other axioms, we will now look carefully at the properties of the natural numbers. In a informal way, we may think of the natural numbers as integers which are obtained from 1 by “successive addition” of 1, or in other words, the numbers of the form: 1, 2 = 1 + 1, 3 = 2 + 1, 4 = 3 + 1, . . . In particular, denoting by N the set of natural numbers, we should have: 1 ∈ N, and n ∈ N ⇒ n + 1 ∈ N. Taking a more general perspective, which will be useful later, we introduce the following definition. Definition 2.1.1. If A is a unitary ring with identity 1, a subset B ⊂ A is called inductive if: (i) 1 ∈ B; (ii) n ∈ B ⇒ n + 1 ∈ B. The ring A itself is obviously an inductive subset of A. Hence, Z is an inductive subset of itself, so N is not the only inductive subset of Z. However, the heuristic description of N above suggests another property of this set, namely that any inductive subset of Z necessarily contains all the natural numbers. In other words, N is the smallest inductive subset of Z. Coming back to the general case of an arbitrary ring A with identity 1, we formalize these ideas as follows. Since the ring A itself is always an inductive set, the family of all inductive subsets de A is necessarily nonempty. Denoting by N (A) the intersection of all inductive subsets of A, it is clear that N (A) is contained in any inductive subset of A. This remark deserves a name: 2.1. THE AXIOMS OF THE INTEGERS 67 Theorem 2.1.2 (Principle of Finite Induction). Let A be a unitary ring. Then: (i) if B ⊂ A is inductive, then N (A) ⊂ B, and (ii) if B ⊂ N (A) is inductive, then B = N (A). The previous result becomes even more interesting when one observes that: Proposition 2.1.3. If A is a unitary ring, then N (A) is an inductive subset of A. Proof. Since 1 belongs to all inductive subsets of A, it is obvious that 1 ∈ N (A). Assume now that a ∈ N (A) and B is any inductive subset of A. Then a ∈ B (because N (A) ⊂ B) and a + 1 ∈ B (because B is inductive). Since B is arbitrary, it follows that a + 1 belongs to all inductive subsets of A, i.e., a + 1 ∈ N (A). Hence, we conclude that N (A) is inductive. The previous results show that N (A) is an inductive subset which is contained in any inductive subset of A. For this reason we introduce: Definition 2.1.4. For a unitary ring A, the set N (A) is called the smallest inductive subset of A. If A = Z we denote this subset by N instead of N (Z), and call N the set of natural numbers. We will see later that the usual principle of mathematical induction is precisely Theorem 2.1.2 applied to the ring of integers, and we will identify all the possible sets N (A) (up to isomorphism). Note alse that our previous (heuristic) description of N also applies to the set N (A), i.e., this set consists of the elements of A obtained from the identity 1 by “successive addition” of 1 (this will be made more precise later). Examples 2.1.5. 1. If A = C, then N (A) = N. 2. If A = Mn (C), then N (A) = {mI : m ∈ N }. 3. If A = Z2 , then N (A) = Z2 . We know that the addition and the product of natural numbers are still natural numbers. We can now prove this statement as a special instance of a general statement valid for any ring: 68 CHAPTER 2. THE INTEGERS Proposition 2.1.6. The subset N (A) ⊂ A is closed relative to the addition and the product, i.e., ∀a, b ∈ N (A), a + b ∈ N (A) and ab ∈ N (A). Proof. We prove the statement for addition, leaving the case of the product as an exercise. For that, fix a ∈ N (A) and define Ba ⊂ N (A) as the set of all elements b ∈ N (A) such that a+ b ∈ N (A). We must show that Ba = N (A), which will follow from Theorem 2.1.2 (ii) by showing that Ba is inductive: 1. Since N (A) is inductive, it is clear that a + 1 ∈ N (A), and hence 1 ∈ Ba . 2. If b ∈ Ba , then a + b ∈ N (A) and so (a + b) + 1 ∈ N (A), because N (A) is inductive. Since (a + b) + 1 = a + (b + 1), it follows that b + 1 ∈ Ba . We conclude that Ba is inductive. After this long digression, we return to the question of how to characterize axiomatically the ring of integers. Recall that the set of integers is usual describe, informally, as consisting of the natural numbers, the symmetrics of the natural numbers, and zero: Z = {0, 1, −1, 2, −2, 3, −3, . . . }, being understood that in this list there are no repetitions. Our next axiom makes this precise. Axiom II. Every m ∈ Z satisfies exactly one of the following: m = 0 or m ∈ N or − m ∈ N. Note that for all the other examples of rings that we have seen before if we replace in the previous statement Z by A and N by N (A), we obtained a false statement. It is convenient (and standard) to denote the set N ∪ {0} by N0 . Axioms I and II will be the basis of our study of the integers. Eventually, we will show that the usual axioms for the rational and real numbers are logical consequences of these axioms for the integers. The question of knowing if this set of axioms is complete, i.e., if they allow to decide if any “reasonable” statement about the integers is either true or false, and consistent, in the sense that some statement cannot be proved to be both true and false, is a deep and delicate problem studied in Mathematical Logic, so it 2.1. THE AXIOMS OF THE INTEGERS 69 will not be discussed here. We mentioned only that an equivalent question was the subject of Hilbert’s second problem 1 . The solution given by Kurt Gödel 2 in 1930 was one of the most surprising and amazing achievements of 20th Century Mathematics. Gödel showed that these two attributes of the axiomatization of the integers (complete and consistent) are themselves inconsistent: any system of axioms for Z which is consistent must contain statements whose logical value (true or false) cannot be decided based solely on those axioms. It may sound strange at first sight that someone would even ask such questions. Actually, the amazing fact is that their study had profound consequences in our lives. Gödel’s works were followed up by the English mathematician Alan Turing3 , who in 1936 come up with its own theory of a Universal Computing Machine, now known as a Turing Machine. His theoretical findings inspired the Hungarian mathematician John von Neumann 4 , after he moved to the USA, in his efforts to improve the eniac (Electronic Numerical Integrator and Computer ), one of the first digital computers. This machine and other similar machines, which were constructed in the years after the publication of Turing’s work, were the precursors of modern day computers and digital machines. The efforts of these two mathematicians also had profound consequences in other ways. The Second World War started in 1939 and Turing played a fundamental role in the English war efforts, helping to decipher the German secret transmission codes with the help of another digital machine, called enigma. Von Neumann on the other hand, with the help of the eniac, 1 David Hilbert (1862-1943), German mathematician, who was a professor in Göttingen. Hilbert’s communication to the International Congress of Mathematicians held in Paris (1900) included a list of 23 problems which he thought should be consider by mathematicians throughout the XX Century (see Bull. Am. Math. Soc., 2nd ser., vol. 8 (1901-02), pp. 437-79). 2 Kurt Gödel (1906-1978) was born in Austria and emigrated at a young age to the USA, where he became a member of the Institute for Advanced Study, in Princeton. When he solved Hilbert’s second problem he was only 24 years old. 3 Alan Turing (1912-1954) was an English mathematician who towards the end of his life also worked in theoretical biology. He was interested in the development of patterns and shapes in biological organisms (morphogenesis) and the role of the Fibonacci numbers in plant structures. His life is mixed with tragedy, since he was prosecuted and convicted for his sexual orientation, resulting in chemical castration and the removal of his security clearance. These events eventually led him to commit suicide. 4 John von Neumann (1903-1957), was born in Hungary, and lectured in Berlin and Hamburg, before emigrating to the USA. Together with Einstein, he was one of the first members of the Institute for Advanced Study, in Princeton. Von Neumann gave the first axiomatization of the notion of a Hilbert space, a basic notion of Functional Analysis which plays a fundamental role in Quantum Mechanics. 70 CHAPTER 2. THE INTEGERS collaborated on the Manhattan project, which resulted in the construction of the first atomic bomb, and the destruction of the Japanese cities of Hiroshima and Nagasaki, leading to the end of the war. Exercises. 1. Complete the proof of Theorem 2.1.2 and of Proposition 2.1.6. 2. Determine the smallest inductive subset of each of the rings with identity mentioned in Exercise 1.5.1. 3. For which of the rings in the previous exercise does an axiom similar to Axiom II hold if the expression “exactly one” is replaced by “one”? 4. What are the consequences of replacing in Axiom II the expression “exactly one” by “one”, and adding one of the following condition: (a) 0 6∈ N; (b) n ∈ N ⇒ −n 6∈ N; (c) n 6∈ N ⇒ −n ∈ N. 5. A set X is said to be infinite 5 if there exists an injective map Ψ : X → X which is not surjective. Show that N is infinite. 6. Show that, if there exists a bijection Ψ : X → Y , then X is infinite if and only if Y is infinite. 7. Show that, if Y ⊂ X and Y is infinite, then X is also infinite. In particular, show that Z is infinite. 2.2 Inequalities and Ordered Rings Some of the elementary properties of the integer, rational and real numbers, concern the manipulation of inequalities, i.e., the order relation which exists in these rings. In these section we will study order relations in a ring, in an abstract manner, looking for properties which are common to all ordered rings. We will understand why some rings can be ordered, while other cannot, and whether orders are unique or not. In the case of the integers, we will see that there are properties of the order which are specific to this ring, 5 The notion of number of elements (or cardinal) of a set is discussed in the Appendix. 2.2. INEQUALITIES AND ORDERED RINGS 71 and moreover, that they are all consequence of the axioms in the previous section. The abstract notion of an order relation can be defined on any set without any reference whatsoever to some algebraic structure (see the Appendix). Here we are interested in a special kind of an order relation: Definition 2.2.1. A binary relation “>” on a set X is called a strict total order relation if: (i) Transitivity: ∀x, y, z ∈ X, x > y and y > z ⇒ x > z. (ii) Trichotomy: ∀x, y ∈ X, exactly one of the following three cases hold: x > y or y > x or x = y. We call the pair (X, >) an ordered set. Note that condition (ii) states that any two elements of an ordered set (X, >) can be compared, which is not the case for a general partial order relation. Given an ordered set X with an order relation “>”, which one reads “greater than”, one can define, just like for the usual numbers, the relations “<”, “≥” and “≤” (read as usual): • a < b if and only if b > a; • a ≥ b if and only if a > b or a = b; • a ≤ b if and only if a < b or a = b. We recall next some elementary concepts valid for any ordered set (X, >). Definition 2.2.2. Let (X, >) be an ordered set. If Y ⊂ X and x ∈ X, then: (i) x is called an upper bound (respectively lower bound) of Y if x ≥ y (respectively x ≤ y), for all y ∈ Y ; (ii) Y is called bounded above (respectively bounded below) in X if Y has at least one upper bound (respectively lower bound) in X; (iii) Y is called bounded in X if it is bounded above and below in X. Using these notions, we also have the usual notions of maximum, minimum, supremum and infimum, for any ordered set (X, >), which we also state for future reference. 72 CHAPTER 2. THE INTEGERS Definition 2.2.3. Let (X, >) be an ordered set. If Y ⊂ X and x ∈ X, then: (i) if x is an upper bound (respectively lower bound) of Y and x belongs to Y , x is called the maximum (respectively minimum) of Y , and we denote x by max(Y ) (respectively min(Y )); (ii) if the minimum (respectively maximum) of the set of upper bounds of Y exists, it is called the supremum or least upper bound (respectively infimum or largest lower bound) of Y , and it is denoted by sup(Y ) (respectively inf(Y )). It should be clear from the definition of an ordered set that the maximum/minimum of a subset (and hence also the supremum/infimum) if they exist they are unique. Hence, these definitions and notations make sense. The usual notions of intervals, familiar for real numbers, can be defined for any ordered set (X, >). For example, if a, b ∈ X, then ]a, +∞[= {y ∈ X : y > a}, ]a, b] = {y ∈ X : a < y ≤ b}, ... These are also commonly denoted by (a, +∞) and (a, b). Examples 2.2.4. 1. Let X = R with the usual order relation “>” and Y =] − ∞, 0[. In this case, Y has no lower bounds in R, and hence cannot have an infimum or a minimum. The set of upper bounds is the interval [0, +∞[, which has the minimum 0 which does not belong to Y . Obviously, sup(Y ) = 0 and Y has no maximum. 2. Let X = R2 and define a strict, total order “>” as follows:   y1 > y2 , or (x1 , y1 ) > (x2 , y2 ) if and only if  y1 = y2 and x1 > x2 , where > on the right side denotes the usual order in R. One sometimes call this ordered set (X, >) the long line. Now, the interval Y =](1, 0), (0, 2)] (draw a picture of this interval!) has max(Y ) = (0, 2) = sup(Y ) and inf(Y ) = (1, 0), while Y has no minimum. When the set X has some algebraic structure it is natural to consider order relations on X which, in some sense, respect the algebraic structure. In the case of a ring A we will demand that a > b ⇐⇒ a − b > 0, 2.2. INEQUALITIES AND ORDERED RINGS 73 i.e., that a > b if and only if a − b is positive, and that addition and product of positive elements be positive elements. Hence, for a ring it is somewhat more convenient to describe an order relation in terms of the set of positive elements. Definition 2.2.5. A ring A is called an ordered ring if there exists a subset A+ ⊂ A sastifying: (i) Closeness relative to addition and product: a + b, ab ∈ A+ , ∀a, b ∈ A+ . (ii) Trichotomy: For all a ∈ A exactly one of the following three cases hold: a ∈ A+ or a = 0 or −a ∈ A+ . If A is an ordered ring, we define in A an order relation “>” by a > b ⇐⇒ a − b ∈ A+ Notice that this is indeed an order relation in A: transitivity of “>” follows from the fact that A+ is closed for addition, and the trichotomy of “>” is an immediate consequence of (ii). Notice also that one has a > 0 if and only if a ∈ A+ , and therefore we say that A+ is the set positive elements of the ordered ring A. It also clear that a < 0 if and only if −a ∈ A+ , and hence we call the elements of the set A− = {a ∈ A : −a ∈ A+ } the negative elements. The following proposition lists the basic rules to handle inequalities and are valid in any ordered ring. It shows that a large number of properties of the order relation of the integers, rationals and reals coincide, because they are shared by all ordered rings. Proposition 2.2.6. Let A be an ordered ring. For all a, b, c ∈ A, the following properties hold: (i) a > b ⇐⇒ a + c > b + c; (ii) a > b ⇐⇒ −a < −b; (iii) ab > 0 ⇐⇒ (a > 0 and b > 0) or (a < 0 and b < 0); (iv) ab < 0 ⇐⇒ (a > 0 and b < 0) or (a < 0 and b > 0); (v) ac > bc ⇐⇒ (a > b and c > 0) or (a < b and c < 0); (vi) a 6= 0 ⇐⇒ a2 = aa > 0. 74 CHAPTER 2. THE INTEGERS Proof. The proof of these properties is almost immediate from the definition. We illustrate them with the proof of property (i), leaving the rest for the exercises. By Definition 2.2.5, a + c > b + c ⇔ (a + c) − (b + c) ∈ A+ ⇔ a − b ∈ A+ ⇔ a > b. It should now be clear that the ring of integers can be ordered by setting = N, and this corresponds to the usual order of the integers. The trichotomy property of N is exactly Axiom II for the integers, and we have already checked that in any ring with identity the set N (A) is closed relative to addition and the product. Perhaps, more interesting is the fact that the usual order of the integers is the unique order in this ring. In order to show it, we will need one more result valid for any ordered ring A where 1 6= 0. Z+ Theorem 2.2.7. If A is an ordered ring with identity 1 6= 0, then one must have N (A) ⊂ A+ . Proof. We only need to check that A+ is an inductive subset and apply the definition of N (A): (a) Since 1 6= 0 and 1 = (1)(1) = 12 , it follows from property (vi) in Proposition 2.2.6 that 1 > 0, i.e., that 1 ∈ A+ ; (b) Since A+ is closed relative to the addition, a ∈ A+ ⇒ a + 1 ∈ A+ . Hence, A+ is inductive and we conclude that N (A) ⊂ A+ . Obviously there exist rings A such that A+ 6= N (A), i.e. such that N (A) ( A+ . This is the case, e.g., with the rings Q and R. However, as we mentioned above, if A is the ring of integers, then we must have N (A) = A+ . Theorem 2.2.8. The ring of integers Z has only one order, namely the one obtained by setting Z+ = N. Proof. Assume that Z is ordered, possibly differently from the usual order, and let Z+ be the corresponding set of “positive” integers. By the previous theorem, we have that N ⊂ Z+ . We claim that Z+ ⊂ N, from which the result follows. To prove the claim, if m ∈ Z+ , then m 6= 0 and −m 6∈ Z+ , according to trichotomy property in Definition 2.2.5. Since N ⊂ Z+ , it follows that −m 6∈ N. Finally, by Axiom II, we must have m ∈ N. 2.2. INEQUALITIES AND ORDERED RINGS 75 One should not conclude from the previous results that the order of an arbitrary ring is always unique! For example, we shall indicate in the exercises of these section a subring of R that can be ordered in several distinct ways. On the other hand, there are many rings which cannot be ordered. Example 2.2.9. If A = Z2 is the field with two elements, then −1 = 1, so we see that trichotomy cannot be satisfied and Z2 cannot be ordered. As we have already pointed out, the results at the beginning of this section show that a large number of properties of the order of the integers is common to all other ordered rings. The previous result suggests that the properties specific to the order of Z result from the fact that A+ = N (A) for this ring. We illustrate this fact by showing that the positive integers are the integers which are larger than or equal to 1, a statement that is obviously false if we replace the word “integers” by “rationals” or “reals”. Proposition 2.2.10. Z+ = {m ∈ Z : m ≥ 1}. Proof. Let S = {m ∈ Z : m ≥ 1}. If m ∈ S, then m ≥ 1, and since 1 > 0, we have that m > 0. Hence, S ⊂ N. It is obvious that 1 ∈ S and that 1 > 0 ⇒ 1 + 1 > 1 + 0 = 1. Hence, if m ∈ S, then m + 1 ≥ 1 + 1 > 1, and so m + 1 ∈ S, showing that S is inductive. By the Principle of Induction and by Theorem 2.2.8, we conclude that S = N = Z+ . The statement in the previous proposition is equivalent to the fact that in the ring of integers one has ]0, +∞[= [1, +∞[, or also ]0, 1[= ∅. Note that in Q and in R the interval ]0, 1[ is an infinite set. Other properties specific to the order of the integers are discussed in the exercises at the end of this section and the next one. The notion of absolute value or modulus can be introduced without much difficulty in any ordered ring. Definition 2.2.11. Let A be an ordered ring and a ∈ A. The absolute value or modulus of a is the element |a| ∈ A defined by |a| = max{−a, a}. 76 CHAPTER 2. THE INTEGERS It should be clear from this definition that the absolute value of a nonzero number a ∈ A is a positive number |a| ∈ A+ . We list next some properties of the absolute value, which may be shown directly from this definition. Proposition 2.2.12. For any a, b ∈ A, we have: (i) −|a| ≤ a ≤ |a|; (ii) |a + b| ≤ |a| + |b|; (iii) |ab| = |a||b|. The proofs are left to the exercises. Exercises. 1. Complete the proof of Proposition 2.2.6. 2. Show that, if A is a ring with identity 1 6= 0 and there exists in A an element i such that i2 = −1, then A cannot be ordered. 3. Show that, if m ∈ Z, then ]m, m + 1[= ∅ in Z. 4. Show that in Z: (a) m > n ⇔ m ≥ n + 1; (b) mn = 1 ⇔ (m = n = 1) or (m = n = −1); (c) m = sup S if and only if m = max S; (d) m = inf S if and only if m = min S. 5. Prove Proposition 2.2.12. 6. Show that in any ordered ring: (a) |a| ≤ |b| ⇔ −|b| ≤ a ≤ |b| ⇔ a2 ≤ b2 ; (b) ||a| − |b|| ≤ |a − b|; 7. Let B be an ordered ring and φ : A → B an isomorphism of rings. Show that A can be ordered with A+ = {a ∈ A : φ(a) ∈ B + }. √ 8. Let A = {m + n 2 : m, n ∈ Z}. Show that A can be ordered with an order distinct from the one induced √ from R. √ (Hint: Note that φ(m + n 2) = m − n 2 is an automorphism of A and use the previous exercise.) 2.3. PRINCIPLE OF INDUCTION 77 9. Show that, if an ordered ring A is bounded above or bounded below, then A = {0}. 10. Show that any ordered ring A 6= {0} is infinite. 11. Show that the cancellation law for the product holds in any ordered ring. 12. An ordered ring A is called archimedean if for any a, b ∈ A with a = 6 0 there exists x ∈ A such that ax > b. Show that the ring of integers is archimedean. 2.3 Principle of Induction According to the definition of N, discussed in the previous section, the set of natural numbers satisfies the Principle of (finite) Induction: Theorem 2.3.1 (Principle of Induction). If S ⊂ Z is an inductive set, then N ⊂ S. In particular, if S ⊂ N, then S = N. Traditionally, a proof “by induction” obeys the following scheme: given a proposition P(n), depending on some natural n ∈ N, we must (i) show that P(1) is true, and (ii) show that if P(n) is true, then P(n + 1) is also true. The relationship between this procedure and the Principle of Induction can be easily explained by introducing the set S = {n ∈ N : P(n) is true}. Then we have that: • (P(1) is true) ⇐⇒ (1 ∈ S); • (P(n) ⇒ P(n + 1)) ⇐⇒ (n ∈ S ⇒ n + 1 ∈ S). Hence, we see that the usual induction argument amounts to show that the set S of natural numbers for which the proposition is true is an inductive subset de N. Often, one does not mention explicitly the set S, but that should not be a source of any misunderstanding. Example 2.3.2. We say that n ∈ N is even (respectively, odd) if there exists k ∈ Z such that n = 2k (respectively, n = 2k + 1). In order to prove the statement “every natural number is even or odd”, we consider P(n) =“n even or odd”, and observe: 78 CHAPTER 2. THE INTEGERS (i) If n = 1, then 1 = 2 · 0 + 1, hence 1 is odd, so that P(1) is true. (ii) If n is a natural number such that P(n) is true, we have n = 2k or n = 2k+1. It follows that that n+1 = 2k+1 or n+1 = (2k+1)+1 = 2(k+1), and hence P(n + 1) is true. We conclude that P(n) is true for any natural number n. We will often use induction in our proofs. We start illustrating its use by proving a few more properties of the order of the natural numbers and of the integers. Our first application is to prove the Well-ordering Principle. Theorem 2.3.3 (Well-ordering Principle). Any non-empty set of natural numbers has a minimum. Proof. Let S be any non-empty set of natural numbers and consider the statement P(n) =“If S contains a natural k ≤ n, then S has minimum”. Our theorem will follow (why?) if we can show that: “P(n) is true for every n ∈ N”. Therefore, we can try to apply the Principle of Induction: (i) P(1) is true, because, if S contains a natural k ≤ 1, then according to Proposition (2.2.10) we have k = 1, and 1 is obviously the minimum of S, because it is the minimum of N. (ii) Assume now that P(n) is true for some natural n, and assume that S contains a natural k ≤ n + 1. We must show that S has a minimum. If S contains a natural k ≤ n, then it follows from P(n) that S has a minimum. Otherwise, S contains a natural in the interval [1, n + 1], but none in the interval [1, n]. Since we know that the interval ]n, n+1[ is empty, we conclude that n + 1 is the minimum of S. On the other hand, when S is a set of integers, we have Theorem 2.3.4. Any non-empty set of integers which is bounded below (respectively, bounded above) has a minimum (respectively, a maximum). 2.3. PRINCIPLE OF INDUCTION 79 We leave the proof of this theorem has an exercise, and instead we prove the following theorem, which can be used in proofs by induction that “start” at any integer k 6= 1. One should notice that the proof uses Theorems 2.3.3 and 2.3.4, which occasionally are more practical to apply than the Principle of Induction. Proposition 2.3.5. If a statement P(n) is true for some integer n = k and if P(n) ⇒ P(n + 1) for every integer n ≥ k, then P(n) is true for every integer n ≥ k. Proof. Let S = {n ∈ Z : n ≥ k and P(n) is false}. We claim that S is empty, i.e., that P(n) is true for every n ≥ k. Note that S is, by definition, bounded below by k. Hence, according to Theorem 2.3.4, if S is non-empty then it must have a minimum m. Moreover, since m ∈ S, it follows that m ≥ k. By assumption P(k) is true, and hence k 6∈ S, so that m > k. Consider now the integer m − 1. Since m > k, we have m − 1 ≥ k. Since m − 1 is less than the minimum of S, we have m − 1 6∈ S, i.e., P(m − 1) is true. But by assumption P(n) ⇒ P(n + 1) for all n ≥ k so that P(m) is true. This contradicts the fact that m belongs to S. We conclude that S cannot have minimum, and so it must be empty, proving our claim. In certain circunstancies it is more convenient to use yet another form of the Principle of Induction. For example, consider the sequence an consisting of the Fibonacci numbers: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, . . . This sequence is usually defined as the sequence an satisfying6 : (2.3.1) a0 = a1 = 1, and an+1 = an + an−1 for n ≥ 1. It is possible to determine an explicit expression for this sequence. We will not discuss now how to deduce it, but the expression is the following: √ !n √ !n √ √ 5− 5 1− 5 5+ 5 1+ 5 + . (2.3.2) an = 10 2 10 2 6 Fibonacci, also known as Leonardo of Pisa (1180-1250), found this sequence by considering the following problem about rabbit breeding: Beginning with a single pair of rabbits, how many pairs of rabbits are there at the end of one year if each pair of rabbits gives rise to a new pair of rabbits every month, and each new pair starts giving birth at the end of 2 months? 80 CHAPTER 2. THE INTEGERS It looks obvious that the sequence an defined by this last formula is the only sequence that satisfies (2.3.1). However, the proof of this statement is not as easy as it looks at first sight. First of all it is convenient to formalize the notion of sequence in any set X. Definition 2.3.6. A sequence of elements in a set X is a map f : N → X. When dealing with sequences it is common to write fn instead of f (n), and to speak of the “sequence {f1 , f2 , . . . }”, “{fn }”, or even “fn ”, instead of saying “the sequence f ”. We will also commit this abuse of language (why is this an abuse of language?). Note that, if k is an integer, then to a map g : [k, ∞[∩Z → X corresponds the sequence f : N → X given by f (n) = g(n + k − 1). For this reason, we say that g is a sequence defined in [k, ∞[. Hence, the “Fibonacci sequence” above is a sequence defined in [0, ∞[. The result that we wish to prove is the following: Proposition 2.3.7. If f is a sequence of natural numbers defined in N0 , such that f0 = f1 = 1 and fn+1 = fn + fn−1 for n ≥ 1, then f (n) = a(n), where a(n) = an is the sequence defined by (2.3.2). If we let P(n) be the statement “f (n) = a(n)”, we must show P(n) for all integers n ≥ 0. The difficult in applying the usual induction method is that, for obvious reasons, we cannot show that that P(n) ⇒ P(n + 1), but only that (P(n) and P(n−1)) ⇒ P(n+1). In order to facilitate the solution of the to this and similar problems, we introduce the following form of the Principle of Induction: Theorem 2.3.8. Let S be a set of integers such that (i) k ∈ S, and (ii) for all n ≥ k, [k, n] ⊂ S ⇒ [k, n + 1] ⊂ S. Then [k, +∞[⊂ S. In terms of the statement P(n) =“n ∈ S”, the previous result says that, if P(k) is true and if P(n) is true whenever P(m) is true for all k ≤ m < n, (and not just for m = n − 1), then P(n) is true for all n ≥ k. The proof of this statement is not hard if we use the Well-ordering Principle and proceed as in the proof of Proposition 2.3.5. 2.3. PRINCIPLE OF INDUCTION 81 Proof of Theorem 2.3.8. Let F = {n ≥ k : n 6∈ S}. We claim that F is empty, so that [k, +∞[⊂ S. If F 6= ∅, since F is bounded below by definition, we conclude that F has a minimum m ≥ k. Since k ∈ S, it is clear that m > k. Hence, m − 1 ≥ k, and the interval [k, m − 1] does not contain any element of F , because all its elements are less than the minimum of F . In other words, we have [k, m − 1] ⊂ S. It follows now from (ii) that we must have [k, m] ⊂ S, and hence m ∈ S, i.e., m 6∈ F , a contradiction. We conclude that F cannot have minimum, hence F is empty, as claimed. The previous theorem allows for an immediate proof of Proposition 2.3.7, which we leave for the exercises. Note, however, that this result does not eliminate all the logical difficulties with definitions like (2.3.1). This definition is an example of a recursive definition, where the object to be defined appears in the description of the object. As we know since at least back to Epimenides of Crete (circa 600 BC) who become famous with the phrase “All Cretans are liars”, it is possible to create logical paradoxes, or propositions whose logical value cannot be decided, my making statements which in some way refer to themselves. A classical example is Bertrand Russell’s paradox 7 , suggested by the attempts to define the “set of all sets”. Observe that a definition like “U is the set of all sets” is recursive, because U , being a set, is one of the elements which enters its description. In other words, U has the strange property of being an element of itself, which is not at all usual in the sets we are familiar with! If this property of U is unpleasant, we can consider instead the set N of “normal’ sets, i.e., the sets which are not elements of themselves. In symbols, N = {C ∈ U : C 6∈ C}. 7 Bertrand Russell (1872-1970) co-authored together with Alfred N. Whitehead (18611947) the famous treaty Principia Mathematica (3 vols., 1910-13), where they attempted to formalize in an axiomatic manner all the fundamental notions of Arithmetics. This monumental work was the ultimate result of the “logistic” program to formalize Mathematics, which consisted in constructing all Matematical knowledge through logic deduction from a small number of concepts and principles. Although this program was doomed to fail, due to the later work of Gödel, it was fundamental for the advance of Mathematical Logic. 82 CHAPTER 2. THE INTEGERS The question is now: is N a “normal“ set? Unfortunately, if we assume that N is “normal” (i.e., N 6∈ N ) then N belongs to the set of normal sets (i.e., N ∈ N !). If we assume that N is not “normal”, we have that N ∈ N . But then N is an element of the set of normal sets, and hence N is itself normal (N 6∈ N !). In other words, we cannot give a logical value to the statement “N ∈ N ”. At a superficial level, the lesson to learn from this example is simply that one must be careful with recursive definitions, something that the Greek philosophers were already aware of. A similar difficulty occurs when one uses a spreadsheet and creates a closed loop between cells in the spreadsheet, or when one states a “theorem” such as: Theorem 2.3.9. This statement is false. . . It is more or less clear that such difficulties do not occur with definitions like the one for the Fibonacci sequence, and we will often use such recursive definitions (see the examples in the next section and the Appendix for a formalization). However, at a deeper level, the logical difficulties with recursive definitions, or more generally with statements that refer to themselves, are unavoidable and are related with some of the deepest questions tackled by mathematicians and philosophers nowadays: Is it possible to give a (rigorous) definition of “rigorous definition”? Can we understand how our own intelligence works? How can we reconcile the mechanical aspect of logical deductions, exhibit by a computer program, with the apparent infinite adaptive nature of what we call “intelligent behavior”? At the end of the day, this is the central problem of Artificial Intelligence. Exercises. 1. Determine all the inductive subsets of Z. 2. Prove Theorem 2.3.4. 3. Prove Proposition 2.3.7. 4. Consider the statement (obviously false!) “All blonde women have blue eyes”. What is the mistake in the following “proof” by induction? Denote by P(n) the statement “If in a set of n blonde women there exists one with blue eyes, then all have blue eyes”. Then: (a) P(1) is obviously true. 2.4. SUMMATIONS AND PRODUCTS 83 (b) Assume that P(n) is true, and consider a set with n + 1 blonde women L = {M1 , . . . , Mn+1 }. Assume that M1 has blue eyes. Define Ln+1 = {Mk : k 6= n + 1} and Ln = {Mk : k 6= n}. Since P(n) is true, all women in Ln and Ln+1 have blue eyes. Since L = Ln ∪ Ln+1 , all women in L have blue eyes. (c) Since there exists at least one blonde women with blue eyes, we conclude that all blonde women must have blue eyes. 5. Here is another variant of the “blonde women with blue eyes” problem. Consider the statement (obviously false!) “All finite groups are abelian”. We have the following “proof” by induction: Let G be a group and denote by P(n) the statement “In a subset of G with n elements, all elements commute”. (a) P(1) is obviously true. (b) Assume that P(n) is true, and consider a subset of G with n + 1 elements L = {g1 , . . . , gn+1 }. Denote by Li = {gk : k 6= i} the subset formed by all elements of L, with the exception of the element i. Since P(n) is true, each Li is commutative. Since the Li exhaust all the elements of L, we conclude that L is commutative. (c) Since G is finite, we conclude that G is commutative. 6. Formula (2.3.2) for the Fibonacci sequence can be obtained by determining the sequences of the form β n which satisfy equation (2.3.1). What are the sequences of integers which satisfy bn+1 = bn + 6bn−1 and b0 = b1 = 1? 2.4 Summations and Products Often one is interested in adding and multiplying more than two elements. For this reason, it is convenient to generalize these algebraic operations to an arbitrary (finite) number of factors. We start by looking at the definition of products of more than two factors, since the results for summations are obtained by a simple change of notation. In this section, S will denote a set with an associative binary operation. Given a sequence a1 , a2 , . . . of elements of S, the corresponding sequence of partial products is π1 = a1 , π2 = a1 a2 , π3 = (a1 a2 )a3 , . . . . Formally: Definition 2.4.1. Given a sequence a : N → S its sequence of partial products is the sequence π : N → S defined inductively by: π1 = a1 and πn = πn−1 an if n > 1. We write πn = 1 to n. Qn k=1 ak and called it the product of the ak ’s, with k from 84 CHAPTER 2. THE INTEGERS In term of these sequences the associativity property reads: Proposition 2.4.2. If a, b : N → S are sequences in S, we have: Qn Q Qn (i) for any n > m: ( m k=1 ak ; k=1 ak ) k=m+1 ak = Q Q Q (ii) if the operation is commutative: ( nk=1 ak ) ( nk=1 bk ) = nk=1 (ak bk ). The proof (by induction) is left for the exercises. Also, note that the definition above can be easily adapted to products that start at some k > 1. A particular interesting case of Definition 2.4.1 happens when the sequence a : N → S is constant (i.e., when an = c, for all n ∈ N). The product of the first n terms of the sequence a corresponds then to the notion of power with base c and exponent n. Definition 2.4.3. The power of base c and exponent n ∈ N is defined by Q cn = nk=1 c. The power can be thought of as map Φ : N × S → S, (n, c) 7→ cn . If we fix c, we have the exponential map φ : N → S, while if we fix n, we obtain a map ψ : S → S. According to Definitions 2.4.1 and 2.4.3, we have: (2.4.1) c1 = c and cn = (cn−1 )c for n > 1. When S is a monoid with identity I and c is invertible, we define: (2.4.2) c0 = I and c−n = (c−1 )n for n > 1. Hence, for c is invertible, the power cm is well-defined for any integer m. In particular, when S is a group, the map Φ : N × S → S can be extended to a map Φ̃ : Z × S → S. The usual rules for powers are valid in this more general context: Proposition 2.4.4. If S has an associative operation, then for any a, b ∈ S, one has: (i) an am = an+m and (an )m = anm , for n, m ∈ N; (ii) If ab = ba, then (ab)n = an bn , for n ∈ N; If S is a monoid, then for any a, b ∈ S invertible, one has: (iii) an am = an+m and (an )m = anm , for n, m ∈ Z; (iv) If ab = ba, then (ab)n = an bn , for n ∈ Z; 2.4. SUMMATIONS AND PRODUCTS 85 Proof. As one could guess the proof of these results uses the Principle of Induction. We illustrate it in the case of (i), as an example where the argument involves two natural numbers. The proofs of the other cases are left for the exercises. We prove that an am = an+m by induction in the natural number m. Let P(m) be the statement: P(m) = “an am = an+m for any a ∈ S and natural number n”. P(1) follows from Definition 2.4.1, and to prove P(m) ⇒ P(m + 1), we observe that an am+1 = (an )(am a) n m = ((a )(a ))a = (a n+m )a = an+m+1 (by (2.4.1)), (by associativity), (by the induction hypothesis), (by (2.4.1)). Notice that according to (iii) in the previous proposition, and assuming that S is a group, the map f : Z → S, n 7→ an , for some fixed a ∈ S, is a homomorphism of groups. If S is just a monoid, the restriction of this function to N0 is still a homomorphism of monoids. Let us now formulate the results above in the additive notation. If “+” denotes a commutative binary operation on the set S, one should restate Definition 2.4.1 as follows: Definition 2.4.5. Let + be a commutative, associative, binary operation on the set S. Given a sequence a : N → S its sequence of partial sums is the sequence σ : N → S defined inductively by: σ1 = a1 and σn = σn−1 + an if n > 1. OnePcalls σn the summation of the ak ’s, with k from 1 to n, and denotes it by nk=1 ak . PnWhen a : N → S is a constant sequence with an = c, then we write nc = k=1 c. Moreover, if S has identity 0 and the element c has a symmetric −c, we define 0c = 0, (−n)c = n(−c). The operation Φ : N × S → S, (n, c) 7→ nc is called the “product of a natural n by an element c of S”. Also when c has a symmetric, we call 86 CHAPTER 2. THE INTEGERS the operation Ψ : Z × S → S, (n, c) 7→ nc, the “product of an integer n by an element c of S”. This terminology is slightly ambiguous when S = Z: apparently, we obtain in Z two distinct product operations, namely the operation mentioned in Axiom I and the operation we have just introduced in Definition 2.4.5. We leave as simple exercise to check that these two operations actually coincide. In fact, this duplication suggests that all the references to the product in the axiomatization of the integers are superfluous and redundant. Though we have not chosen this path, this is indeed the case: it is possible to present a set of axioms for the integers without mentioning the product operation, and to deduce all the properties of the product as theorems. If (G, +) is an abelian group, the basic algebraic properties of the product of elements of G by integers can be summarized as follows: Proposition 2.4.6. Given an abelian group (G, +), if g, h ∈ G and m, n ∈ Z, then we have: (i) Identity: 1g = g. (ii) Distributivity: (n + m)g = ng + mg and n(g + h) = ng + nh. (iii) Associativity: n(mg) = (nm)g. One may notice that that these properties are formally similar to the properties in the definition of a linear vector space. More precisely, if we replace the elements of the group G by vectors of a linear vector space V over some field and the integers by scalars of the field, then the properties in Proposition 2.4.6 are exactly the conditions on the operation “product of scalar by a vector ” in the definition of a vector space. In fact, we will see in Chapter 6 the more general concept of a module over a ring, which allows to treat at the same level both the notion of an abelian group and the notion of a vector space over a field. If (A, +, ·) is a ring, we can also look at some “mixed” properties combining the addition and the product: Proposition 2.4.7. Let (A, +, ·) be a ring. If a, a1 , . . . , an , b, c ∈ A and n ∈ N, then we have: P P (i) c ( nk=1 ak ) = nk=1 (cak ); P P (ii) ( nk=1 ak ) c = nk=1 (ak c); (iii) n(ab) = (na)b = a(nb). 2.4. SUMMATIONS AND PRODUCTS 87 We mentioned before that, when G is a group and g ∈ G, then the map ψ : Z → G, n 7→ g n is a homomorphism of groups. Obviously, if G is an abelian group, the map ψ : Z → G, n 7→ ng is also a homomorphism of groups. Finally, if (A, +, ·) is a ring and a ∈ A, then ψ : Z → A, n 7→ na is always a homomorphism of groups between (Z, +) and (A, +), which in general fails to be a ring homomorphism, except when a2 = a (why?). Exercises. 1. What is the sequence defined in Z by a1 = 1, an+1 = Pn k=1 ak ? 2. Complete the proofs of the results stated in this section. 3. Show that, if S is a monoid where the cancellation law holds, then the identity ! n ! n n Y Y Y (ak bk ) ak bk = k=1 k=1 k=1 holds if and only if S is abelian. 4. Let G be an abelian group, and assume that n ∈ Z, and g1 , g2 ∈ G. Is it true that one always has n 6= 0 and ng1 = ng2 ⇒ g1 = g2 ? (Hint: Consider the group Z2 .) 5. Show that, if B is a subset of the ring A which is closed relative to the difference in A, then B is also closed relative to the product by integers, i.e., that: If [a, b ∈ B ⇒ a − b ∈ B] then [(n ∈ Z and b ∈ B)⇒ nb ∈ B]. 6. Use the previous result to prove that in the ring of integers, if B ⊂ Z is non-empty, the following statements are equivalent: (a) B is closed relative to the difference; (b) B is a subring of Z; (c) B is an ideal of Z. 7. Show that: (a) if φ : G → H is a homomorphism of additive groups, then φ(ng) = nφ(g), for all n ∈ Z, g ∈ G; 88 CHAPTER 2. THE INTEGERS (b) if φ : Z → G is a homomorphism of additive groups, then φ(n) = ng, for some g ∈ G. How can one generalize these results for non-additive groups? 8. Show that, if G is a group and g ∈ G, then H = {g n : n ∈ Z} is the smallest subgroup of G that contains g. 9. Let A be a ring with identity I, and let φ : Z → A be given by φ(n) = nI. Show that: (a) φ is a ring homomorphism, and φ(Z) = {nI : n ∈ Z} is the smallest subring of A that contains I; (b) For each a ∈ A, {n ∈ Z : na = 0} is an ideal of Z that contains the kernel N (φ); P (c) φ(N) = {nI : n ∈ N} = { nk=1 I : n ∈ N} coincides with the set N (A)8 . 10. Let A 6= {0} be a ring with identity I, and let φ : Z → A be given by φ(n) = nI. Show that, if A is well-ordered (i.e., if A is ordered and every non-empty subset of A+ has a minimum), then A is isomorphic to Z. (Hint: Show the following: (a) The set {a ∈ A : 0 < a < I} is empty; (b) A+ = φ(N): (c) A = φ(Z); (d) φ is injective.) 2.5 Factors, Multiples and Division In an arbitrary ring A the equation ax = b may fail to have a solution, even if a 6= 0 (if a = 0 the equation can only have solutions if b = 0). In order to avoid having to distinguish the equation ax = b from the equation xa = b, we will assume in this section that A is a commutative ring. Definition 2.5.1. If a, b ∈ A, we say that a is factor (or divisor9 ) of b, or that b is multiple of a, and we write “a|b”, if the equation ax = b has some solution x ∈ A. 8 This is the most rigorous way to formulate the idea that the elements of N (A) are obtained by adding the identity I to itself an arbitrary (but finite) number of times. 9 Note that the term divisor is used here in with a slightly different meaning from the one in the term zero divisor. Recall that a 6= 0 is called a zero divisor if the equation ax = 0 has a solution x 6= 0. 2.5. FACTORS, MULTIPLES AND DIVISION 89 Examples 2.5.2. 1. In a ring with identity 1, any element b has at least the factors 1, −1, b, −b, since b = 1b = (−1)(−b) (but note that is possible that 1 = −1 = b = −b!). 2. If K is a field and 0 6= k ∈ K, then every r ∈ K is a multiple of k. 3. If a ∈ Z and a2 6= 1, the set of multiples of a is smaller than Z. 4. The multiples of (x − 1) in the ring R[x] of polynomials with real coefficients in the variable x are precisely {p(x) ∈ R[x] : p(1) = 0}. It is obvious that that the relation “ is a factor of ” is transitive (if a|b and b|c, then a|c), and if c 6= 0 is not a zero divisor, we have that ac|bc if and only if a|b. On the other hand, if A is an ordered ring, it is also clear that a|b if and only if |a| | |b|. In this chapter we are mainly interested in the case A = Z. The study of factorization and divisibility in more general rings will be the done in the next chapter. In the case of the integers, the implication n > 0 ⇒ n ≥ 1 gives us the following: Lemma 2.5.3. If m, n ∈ Z, then: (i) m|n =⇒ (|m| ≤ |n| or n = 0); (ii) (m|n and n|m) ⇐⇒ |m| = |n|. According to the previous lemma, if n and k are natural numbers and k|n, then k < n. As we have already observed, 1 is a factor of any natural number n. Hence, if n and m are natural numbers, the set of common factors (or divisors) of n and m is non-empty and bounded above, so it must have a maximum. Similarly, the set of natural numbers common multiples of n and m, i.e., the set {k ∈ N : n|k and m|k}, is non-empty since nm > 0 is a common multiple of n and m. Hence, according to the the Well-ordering Principle, this set must have a minimum element. Definition 2.5.4. If n, m ∈ N, then: (i) gcd(n, m) = max{k ∈ N : k|n and k|m} is called the greatest common divisor of n and m; (ii) lcm(n, m) = min{k ∈ N : n|k and m|k} is called the least common multiple of n and m. 90 CHAPTER 2. THE INTEGERS Example 2.5.5. If n = 12 and m = 16, the natural numbers which are divisors of n and m form the sets {k ∈ N : k|12} = {1, 2, 3, 4, 6, 12}, {k ∈ N : k|16} = {1, 2, 4, 8, 16}. Therefore, the natural numbers which are common divisors of 12 and 16 form the set {k ∈ N : k|12 and k|16} = {1, 2, 4}. We conclude that the corresponding greatest common divisor is gcd(12, 16) = 4. The natural numbers which are multiples of 12 and 16 are {k ∈ N : 12|k} = {12, 24, 36, 48, . . . }, {k ∈ N : 16|k} = {16, 32, 48, . . . }, so the natural numbers which are common multiples of 12 and 16 form the set {k ∈ N : 12|k and 16|k} = {48, 96, . . . } We conclude that the corresponding least common multiple is lcm(12, 16) = 48. Notice that in this example gcd(m, n) is a multiple of all the common divisors of n and m, while lcm(n, m) is a factor of all the common multiples of n and m, In fact, we have: Proposition 2.5.6. Let n, m, d, l ∈ N. Then: (i) d = gcd(n, m) if and only if : (a) d|n and d|m; (b) for any k ∈ N, (k|n and k|m) ⇒ k|d. (ii) d = lcm(n, m) if and only if : (a) n|d and m|d; (b) for any k ∈ N, (n|k and m|k) ⇒ d|k. We will prove these statements in the next section. For now, we introduce two more definitions for future reference. Note that in our conventions the natural number 1 is not prime. 2.5. FACTORS, MULTIPLES AND DIVISION 91 Definition 2.5.7. If p, m, n ∈ N. (i) We say that p is prime if p > 1 and if, for every k ∈ N such that k|p, we have that k = 1 or k = p. (ii) We say that n and m are relatively prime if gcd(n, m) = 1. Example 2.5.8. It is easy to check by listing all the possibilidades that 4 and 9 are relatively prime, i.e., gcd(4, 9) = 1, and that 13 is a prime number. In order to show that p is prime it is not necessary to test all the numbers k with 1 < k < p: it is enough to stop testing at the greatest integer k such that k 2 < p. For example, in the case p = 13, it is enough to check that 13 is not a multiple of 2 or 3. We saw in the previous section (by using induction!) that any natural number n is either even or odd, i.e.., that given n, there exist integers q and r such that n = 2q + r, with 0 ≤ r < 2. This result is not specific to the natural number 2: for example, we can always write n = 3q ′ + r ′ , with 0 ≤ r ′ < 3, or n = 4q ′′ + r ′′ , with 0 ≤ r ′′ < 4, etc. A moment of thought shows that these statements are a consequence of the Division Algorithm learned in Elementary School! Theorem 2.5.9 (Division Algorithm). If n, m ∈ Z and n 6= 0, there exist unique integers q, r, such that m = nq + r, and 0 ≤ r < |n|. Proof. We will prove only the case n, m ∈ N, leaving as an exercise the generalization to any integers. Note that the argument to prove existence amounts to the usual method to perform the division of two numbers. (i) Existence: Consider the set Q = {x ∈ N0 : nx ≤ m}. Note that Q is non-empty (because 0 ∈ Q) and bounded above (because x ≤ nx ≤ m). Therefore Q has a maximum x = q. It is clear that nq ≤ m < n(q + 1), because q ∈ Q and (q+1) 6∈ Q. Subtracting nq to both sides of the inequality, we obtain that 0 ≤ r ≤ n, since r = m − nq. (ii) Uniqueness: Assume that m = nq + r = nq ′ + r ′ , with 0 ≤ r, r ′ < n. We have −n < r − r ′ < n, or equivalently |r − r ′ | < |n|, and also that n(|q − q ′ |) = |r − r ′ |. If q 6= q ′ , then |q − q ′ | ≥ 1 and |r − r ′ | > n, so we must have q = q ′ . But from n(|q − q ′ |) = |r − r ′ |, we conclude also that r = r ′ . Definition 2.5.10. If n, m ∈ Z, n 6= 0, and m = nq + r with 0 ≤ r < |n|, we say that q and r are, respectively, the quotient and the remainder of the division of m by n. 92 CHAPTER 2. THE INTEGERS The way the quotient and the remainder depend on the algebraic signs of m and n is not entirely obvious. This is illustrated fin the following table by various examples: m n q r 5 3 1 2 5 −3 −1 2 −5 3 −2 1 −5 −3 2 1 Assuming that n 6= 0 is fixed, consider the map ρ : Z → {0, 1, . . . , n − 1}, where ρ(m) is the remainder of the division of m by n. We leave as an exercise to show that the following result holds true: Proposition 2.5.11. If x, y are arbitrary integers, we have (i) ρ(x) = ρ(y) if and only if n|(x − y); (ii) ρ(x ± y) = ρ(ρ(x) ± ρ(y)); (iii) ρ(xy) = ρ(ρ(x)ρ(y)). The Division Algorithm will be used many times in the sequel. For example, we will use in the next section to describe all the ideals of Z. Moreover, in the next Chapter we will show how it can be generalized to other rings, including some polynomial rings. Actually, all the notions that we have introduced in this section, such as prime number, gcd, lcm, etc., will eventually be generalized to a large class of rings. Exercises. 1. Let A be a commutative ring, and a, b, c ∈ A. Show that, if a|b and b|c, then a|c, and that, if c 6= 0 is not a zero divisor, we have ac|bc if and only if a|b. 2. Prove Lemma 2.5.3. 3. Conclude the proof of Theorem 2.5.9. 4. Show that, if m, n, k ∈ N, mn = k and m2 > k, then n2 < k. 5. List all the natural numbers between 100 and 200. Observe that 172 = 289, and remove from the list all the multiples of 2, 3, 5, 7, 11 and 13. Which numbers are left in the list? 10 10 This procedure is called Eratosthenes filter. Eratosthenes (276 a.C.-194 a.C.) was born where is now Libya, and was the third librarian of the famous Library of Alexandria. Among other achievements he established the sphericity of the Earth and found its diameter with great accuracy. 2.6. IDEALS AND EUCLIDES ALGORITHM 93 6. Determine the prime numbers between √ 1950 and 2050. (Hint: Determine first the primes p ≤ 2050). 7. If m, n ∈ Z, n 6= 0 and ρ : Z → {0, 1, . . . , n − 1} is the remainder of the division by n, when is it true that ρ(m) = ρ(−m)? 8. Prove Proposition 2.5.11. What is the relationship between this theorem and the “casting out nines” test of elementary arithmetics? 9. Is Theorem 2.5.9 still valid if one replaces in its statement the ring of integers by the ring formed by the multiples of 2. What if one replaces the ring of integers by the ring real numbers? 10. State and prove the analogue of Theorem 2.5.9 for the ring of Gaussian integers. 2.6 Ideals and Euclides Algorithm Assuming that one has two sand watches (hourglasses), one measuring intervals of 21 minutes and the other measuring intervals of 30 minutes, what intervals of time can one measure using the two watches? Certain intervals are obviously possible, by using successively each watch, such as 30, 60, 90, . . . , 21, 42, 63, . . . , or additions of these numbers, such as 51, 81, 111, . . . , 102, 123, . . . , 132, 153, . . . , etc. By using simultaneously both watches we can also measure the differences of these numbers, such as 9 = 30 − 21, 3 = 63 − 60, etc. Looking carefully at the resulting numbers, one reaches the following conclusions: • It is possible to obtain any natural number of the form x21 + y30 with x, y ∈ Z. • All the numbers of the form x21 + y30 are multiples of 3 (since 3 is the greatest common divisor of 21 and 30). On the other hand, there exist integers x′ and y ′ (e.g., x′ = 3, y ′ = −2) such that 3 = x′ 21 + y ′ 30. In particular, if m = k3 is any multiple of 3, then m = k(x′ 21 + y ′ 30) = k(x′ 21 + y ′ 30) = x′′ 21 + y ′′ 30. In other words: • the numbers of the form x21 + y30 are precisely the multiples of 3; • 3 = gcd(21, 30) is the smallest natural number of the form x21 + y30. 94 CHAPTER 2. THE INTEGERS Our aim now is to explore this remarks, generalizing them to an arbitrary ring. As a byproduct of this work we will be able to determine all the ideals of the ring Z and we will find a method to compute gcd(n, m), the so-called Euclides Algorithm. This algorithm does not require the knowledge of prime factors of n and m, so it is specially useful for computational purposes, and it is also easily generalized to polynomials, If n and m are arbitrary fixed integers, we will consider the set I = {xn + ym : x, y ∈ Z}. It is clear that I is non-empty and closed relative to the difference and the product by arbitrary integers, i.e., if x, y, x′ , y ′ , z ∈ Z, then (xn + ym) ± (x′ n + y ′ m) = (x ± x′ )n + (y ± y ′ )m ∈ I, z(xn + ym) = (xn + ym)z = (zx)n + (zy)m ∈ I. In other words, I is an ideal of Z. On the other hand, if J is an ideal such that n, m ∈ J, it is obvious that xn, ym ∈ J for all x, y ∈ Z, hence also xn + ym ∈ J, so that I ⊂ J. For this reason we say that I is the smallest ideal of Z that contains n and m, or still that I is the ideal generated by n and m. More generally, consider any ring A instead of the ring of integers, and replace the set {n, m} by an arbitrary (non-empty) subset S ⊂ A. Note that the ring A itself is an ideal of A that contains S. Hence, the family of ideals of A which contains S is non-empty. We will denote by hSi the intersection of all the ideals in this family. It is clear that S ⊂ hSi. Also, one verifies directly from the definition of ideal that the intersection of a family of ideals of A is still an ideal of A. Hence, hSi is an ideal of A that contains S. Finally, observe that if S ⊂ I ⊂ A and I is an ideal, then hSi ⊂ I. We leave as an exercise to verify that all these statements hold true. Definition 2.6.1. If A is a ring and S ⊂ A, one calls hSi the ideal generated by S, or the smallest ideal of A that contains S. The elements of S are said to be generators of the ideal hSi, and S is called a generating set of hSi. When S = {a1 , a2 , . . . , an } is a finite subset of a ring A, we will often write ha1 , a2 , . . . , an i instead of hSi. In the case of the integers, we saw above that hn, mi = {xn + ym : x, y ∈ Z}, and it is immediate to show that hni = {xn : x ∈ Z}. However, there are rings where determining the ideal generated by a given set of elements is not so easy. 2.6. IDEALS AND EUCLIDES ALGORITHM 95 Examples 2.6.2. 1. Let A be an abelian ring and a ∈ A. Then {xa : x ∈ A} is a subring of A, for if x, y ∈ A, we have xa − ya = (x − y)a, (xa)(ya) = (xay)a, showing that {xa : x ∈ A} is closed for the difference and the product. Also, if x, b ∈ A, since A is abelian, we have b(xa) = (xa)b = (bx)a, so {xa : x ∈ A} is an ideal. Finally hai contains necessarily the “multiples” of a, and we conclude that hai = {xa : x ∈ A}. 2. Let A = M2 (Z) be the ring of 2 × 2 matrices with entries in Z, and 1 0 a= . 0 0 The set of all “multiples” of a is {ax : x ∈ M2 (Z)} = n 0 m 0 : n, m ∈ Z , but hai = M2 (Z) as it will be clear in the exercises. The ideal of Z generated by 21 and 30 is also generated by 3. This is true for all ideals in the integers, which are always generated by only one of its elements (but it is by no means a property valid for arbitrary rings): Theorem 2.6.3. I is an ideal of Z if and only if there exists d ∈ Z such that I = hdi. Proof. We verify both implications: (a) If I = hdi, then I is obviously an ideal of Z. (b) Let I be an ideal of Z. If I = {0} it obvious that I = h0i, so we can assume that I 6= h0i. In this case, we observe that the ideal I contains positive integers, since if n ∈ I then −n ∈ I. Set I + = {n ∈ I : n > 0}, and let d be the minimum of I + , which exists by the Well-ordering Principle. Since d ∈ I it follows that hdi ⊆ I (why?). Hence, it remains to show that I ⊆ hdi, i.e., that if m ∈ I, then m is a multiple of d. Let m ∈ I, 96 CHAPTER 2. THE INTEGERS and let q and r be the quotient and the remainder of the division of m by d (recall that d > 0, so d 6= 0). We have then that m = qd + r, or r = m − qd. Since qd ∈ hdi ⊂ I, this means that qd ∈ I. But then r = m − qd is the difference of two elements in the ideal I, so we must have r ∈ I. Finally, since 0 ≤ r < d and d is by definition the smallest positive element of I, we must have r = 0, so m is a multiple of d, i.e., m ∈ hdi. Hence, I ⊆ hdi. In an arbitrary ring we have a special designation for those ideals which are generated by one of its elements: Definition 2.6.4. An ideal I in a ring A is called a principal ideal if there exists a ∈ A such that I = hai. According to Theorem 2.6.3, all the ideals in the ring of integers are principal, but we will see later many examples of rings which have ideals that are not principal. However, the rings where all ideals are principal form an important class of rings. The example of the ideal h21, 30i suggests that, if I = hdi = hn, mi, then |d| is the greatest common divisor of n and m. The previous result allows us to prove this result and also part (i) of Proposition 2.5.6. Corollary 2.6.5. If n, m ∈ N, then hn, mi = hdi where d = gcd(n, m). In particular, we have that: (i) the equation xn + ym = d has solutions x, y ∈ Z; (ii) if k is a common divisor of n and m, then k is also divisor of d. Proof. (i) The theorem shows that there exists a natural number d such that hn, mi = hdi. Obviously, d ∈ hdi = hn, mi. Since hn, mi is the set of integers of the form xn + ym, there are integers x′ , y ′ such that d = x′ n + y ′ m. (ii) It is also obvious that n, m ∈ hn, mi = hdi. Hence, n and m are multiples of d, which is a common divisor of n and m. On the other hand, if k ∈ N is any common divisor of n and m, then we have n = kn′ and m = km′ , so that d = x′ n + y ′ m = k(x′ n′ + y ′ m′ ), or k|d. In particular, k ≤ d and d is the greatest common divisor of n and m. This corollary suggest that one can compute gcd(n, m) by searching for the smallest natural number in the ideal hn, mi. The Division Algorithm makes this search possible by using the following lemma. Lemma 2.6.6. If n, m ∈ N and m = qn + r, then hn, mi = hn, ri. 2.6. IDEALS AND EUCLIDES ALGORITHM 97 Proof. On the one hand, k ∈ hn, mi =⇒ k = xm + yn =⇒ k = x(qn + r) + yn =⇒ k = (xq + y)n + xr =⇒ k ∈ hn, ri, so hn, mi ⊂ hn, ri. On the other hand, k ∈ hn, ri =⇒ k = xn + yr =⇒ k = xn + y(m − qn) =⇒ k = (x − yq)n + ym =⇒ k ∈ hn, mi, so hn, ri ⊂ hn, mi. The Euclides Algorithm consists in applying repeatedly the previous lemma until one obtains an exact division (i.e., r = 0). This is a very simple procedure, easy to program in a computer, which we illustrate now for the example that we started this section. Example 2.6.7. If n = 21 and m = 30, then 30 = 1 · 21 + 9 =⇒ h30, 21i = h21, 9i, 21 = 2 · 9 + 3 =⇒ h21, 9i = h9, 3i, 9 = 3 · 3 + 0 =⇒ h9, 3i = h3i. Hence h30, 21i = h3i, and by the previous corollary we have that 3 = gcd(21, 30). In general, if we start with two natural numbers n and m, and assuming that n < m, the general procedure should be clear, and it corresponds to an iterative process which is very easy to program. Observe that it is also possible to determine integers x and y such that gcd(n, m) = xn + ym: from the equations above it follows immediately that 3 = 21 + (−2)9 and 9 = 30 + (−1)21, so: 3 = 21 + (−2)[30 + (−1)21] = (3)21 + (−2)30. 98 CHAPTER 2. THE INTEGERS The next result takes advantage of the fact that gcd(n, m) is a linear combination of n and m. Proposition 2.6.8. Let m, n, p, k ∈ N and assume that mn|kp. If m and p are relatively prime, then m is factor of k. Proof. By assumption, gcd(m, p) = 1, so there exist integers x′ , y ′ such that 1 = x′ m + y ′ p. Hence, k = k(x′ m + y ′ p). Moreover, since mn|kp, there exists an integer z ′ such that kp = z ′ mn. Hence, k = k(x′ m + y ′ p) = kx′ m + y ′ kp = kx′ m + y ′ z ′ mn = (kx′ + y ′ z ′ n)m, which shows that m|k. The least common multiple of two natural numbers maybe found also by applying Theorem 2.6.3. Given natural numbers n and m, we observe that hni∩hmi is the set of common multiples of n and m. Since the intersection of two ideals is an ideal, we conclude from Theorem 2.6.3, that hni ∩ hmi = hli, where l is a natural number. Of course, l is a common multiple of n and m, and any common multiple k of n and m is a multiple of l, so that l ≤ |k|. Using this, and similarly to the case of the greatest common divisor, one can prove part (ii) of Proposition 2.5.6. These remarks also suggest a way to define the greatest common divisor and the least common multiple of two integers: Definition 2.6.9. If m, n, d, l are integers, and d, l ≥ 0, then: (i) d = gcd(n, m) if hdi = hn, mi; (ii) l = lcm(n, m) if hli = hni ∩ hmi. This definition is consistent with Theorem 2.6.3 and lead to the “natural” result that gcd(n, m) = gcd(|n|, |m|) and lcm(n, m) = lcm(|n|, |m|). Notice, by the way, that gcd(n, 0) = |n| and lcm(n, 0) = 0. Moreover, we leave to the exercises to show that for any integers m, n ∈ Z one has: lcm(m, n) gcd(m, n) = |m||n|. Hence, the Euclides Algorithm also leads to the computation of lcm(m, n). We will further explore these results later to extend the notions of greatest common divisor and least common multiple to certain classes of rings. 2.6. IDEALS AND EUCLIDES ALGORITHM 99 If hmi and hni are ideals of Z, it is clear that hni ⊂ hmi if and only if m|n. In other words, to determine all the ideals that contain hni is equivalent to determine all the divisors of n. This is illustrated in the next figure, when n = 12, where each rectangle represents an ideal of Z that contains h12i. Note that if a rectangle is contained in another, then the corresponding ideals also are, and that the ideal generated by 1 is obviously the whole ring Z. h1i PSfrag replaements Z =Z J Zm h2i I h4i hmi h12i e e~ h6i h3i Figure 2.6.1: The ideals of Z that contain h12i. Alternatively, we can represent the ideals that contain h12i as in the following diagram (notice the subdiagram in the right, PSfragexpressing replaements the general Z property used here). h1i = h2i h Z hmd(m; n)i e e~ h3i hm i h4i mi hni h6i h12i hmm(m; n)i Figure 2.6.2: Os divisors of 12. One should note that a diagram of this sort can be extended indefinitely below, but not above. In particular, given an ideal hmi ⊂ Z it is possible that the only ideal strictly containing it is Z itself, and that happens precisely when n is a prime number. The same may happen with an ideal in an arbitrary ring, so we introduce: 100 CHAPTER 2. THE INTEGERS Definition 2.6.10. An ideal I ⊂ A is maximal if for every ideal J ⊂ A: I ⊂ J =⇒ J = I or J = A. Maximal ideals play an important role in many rings, not just in the ring of integers. For example, we will see later than the maximal ideals of an integral domain D can be used to associate to D certain fields. As we mentioned above, it is possible to identify the maximal ideals of Z: Theorem 2.6.11. An ideal hpi in Z is maximal if and only if p = 1 or |p| is a prime number. Proof. If p, q ∈ N and hpi ⊂ hqi, then q|p. If p = 1 or p is prime, we have that q = 1 or q = p, so that hqi = Z or hqi = hpi, and hence hpi is maximal. On the other hand, if p > 1 is not prime, then there exists q ∈ N such that 1 < q < p and q|p. It follows that hpi ( hqi ( Z, so hpi is not a maximal ideal. In the next section we will look more in detail into the properties of prime numbers. Examples 2.6.12. 1. The ideal h0i is maximal in the ring R, but not in the ring Z. √ 2. The√ideal hx2 − 2i is not maximal in R[x], because hx2 − 2i ( hx + 2i, and hx + 2i 6= R[x]. On the other hand, hx2 − 2i is maximal in Z[x] (why?). 3. In the Appendix, using Zorn’s Lemma, we show that in an arbitrary ring A any proper ideal I ( A is contained in a maximal ideal. If A is a ring with identity I, then the map φ : Z → A, n 7→ nI, is a ring homomorphism. Hence, the set of solutions of the homogenous equation nI = 0 (n ∈ Z), i.e., the kernel of φ, is an ideal of Z. Since every ideal of Z is principal, we see that there exists an unique integer m ≥ 0 such that {n ∈ Z : nI = 0} = hmi. Actually, according to the proof of Theorem 2.6.3, if m > 0, then m is simply the smallest positive solution of the equation nI = 0. In any case, the integer m deserves a special name. Definition 2.6.13. If A is a ring with identity I, its characteristic is the unique number m ∈ N0 such that {n ∈ Z : nI = 0} = hmi. 2.6. IDEALS AND EUCLIDES ALGORITHM 101 Examples 2.6.14. 1. The rings Z, Q, R, C and H all have characteristic 0. 2. The ring Z2 has characteristic 2. Exercises. 1. Give examples of natural numbers a, b, n such that ab is not a factor of n, but a|n and b|n. 2. Determine integers x and y such that gcd(135, 1987) = x135 + y1987. 3. How can one measure 1 pint of water, if one only has two jars of sizes, respectively, 15 and 23 pints? 4. Let a, b, d, m, x, y, s, t ∈ Z. Show that: (a) d = gcd(a, b), a = dx, b = dy ⇒ gcd(x, y) = 1; (b) as + bt = 1 ⇒ gcd(a, b) = gcd(a, t) = gcd(s, b) = gcd(s, t) = 1; (c) gcd(ma, mb) = |m| gcd(a, b); (d) gcd(a, m) = gcd(b, m) = 1 ⇔ gcd(ab, m) = 1; (e) a|m and b|m ⇒ ab|m gcd(a, b); (f) |ab| = gcd(a, b) lcm(a, b). 5. Let I = {Iβ }β∈B be a family of ideals of a ring A indexed by some set B. Consider the intersection of all sets in this family: \ I= Iβ = {a ∈ A : a ∈ Iβ , for all β ∈ B}. β∈B (a) Show that I is an ideal of A. (b) Let S ⊂ A and let I be the family of all ideals of A that contains S. Show that I is the smallest ideal of A that contains S. 6. Show that, if S1 ⊂ S2 ⊂ A, then hS1 i ⊂ hS2 i. 7. Assume that m, n ∈ Z, and show that: (a) hni ⊂ hmi ⇐⇒ m|n; (b) hni = hmi ⇐⇒ m = ±n. 102 CHAPTER 2. THE INTEGERS 8. Show that, if A is an abelian ring with identity and a1 , a2 , . . . , an ∈ A, then ( n ) X ha1 , a2 , . . . , an i = xk ak : xk ∈ A, 1 ≤ k ≤ n . k=1 How would you describe ha1 , a2 , . . . , an i if A was abelian without identity? 9. Let {an }n∈N be a sequence of integers. (a) Define dn = gcd(a1 , a2 ,P . . . , an ) and ln = lcm(a1 , a2 , . . . , an ), and show n that the equation dn = k=1 xk ak has solutions xk ∈ Z. (b) Show that dn+1 = gcd(dn , an+1 ) and ln+1 = lcm(ln , an+1 ). (c) For which values of n does the equation 30x+105y +42z = n have integer solutions? 10. Construct a diagram similar to the one for the divisors of 12 for n = 18. 11. Assume that A and B are unitary rings of characteristic, respectively, n and m. Show that the characteristic of A ⊕ B is lcm(n, m). 12. Let A be an abelian ring with identity 1, and let J be an ideal of A. Show that J is maximal if and only if ,for al a 6∈ J, the equation xa + y = 1 has solutions x ∈ A and y ∈ J. (Hint: Check that {xa + y : x ∈ A, y ∈ J} is the ideal generated by J ∪ {a}.) 13. Let A be a ring with identity I and characteristic m > 0. Show that: {nI : n ∈ Z} = {I, 2I, . . . , mI} is a ring with m elements. Show also that, if a ∈ A, then the smallest positive solution of na = 0 is a factor of m. (Hint: If φ : Z → A, n → nI, and n = mq + r, then φ(n) = φ(r).) 14. Let A be a ring of characteristic 4. Determine the addition and multiplication tables of the subring {nI : n ∈ Z} ⊂ A. Decide whether or not this ring is isomorphic to the field with 4 elements mentioned in Chapter 1. 15. Determine the ideals of each of the following rings: (a) (b) (c) (d) The The The The ring ring ring ring 2Z of even integers. Z[i] of Gaussian integers. Z ⊕ Z. Mn (Z). 16. For each of the rings in the previous exercise, determine whether or not its ideals are all principal. 2.7. THE FUNDAMENTAL THEOREM OF ARITHMETIC 2.7 103 The Fundamental Theorem of Arithmetic One of the most important properties of the prime numbers is the fact that they generate, via multiplication, all the natural numbers n ≥ 2. Our aim now is to make this fact precise and to prove it. Its correct formulation is known as the Fundamental Theorem of Arithmetic. We start by showing that: Proposition 2.7.1. Any natural n ≥ 2 has at least one prime divisor p. Proof. The set D = {m ∈ N : m > 1 and m|n} is non-empty, since it contains n. Let p be a minimum of D. If p is not a prime, then p = mk, where 1 < m < p. Since m is obviously a factor of n, p cannot be a minimum of D. We conclude that p must be a prime. A consequence of this proposition is the existence of factorizations into prime numbers for any natural number n ≥ 2. Corollary 2.7.2. If n ≥ 2 is a natural number there exist prime numbers p1 ≤ p2 ≤ · · · ≤ pk such that n= k Y pi . i=1 Proof. We give a proof by induction. If n = 2, it is evident that n has a prime factorization with k = 1 and p1 = 2. Assume now that any natural number m with 2 ≤ m < n has a prime factorization. We wish to show that n also has a prime factorization. Let P = {p ∈ N : p|n, p prime} be the set of prime factors of n. We now that P is bounded above (n is an upper bound) and non-empty (according to Proposition 2.7.1). Hence, P has a maximum q. If q = n, then n is a prime, so we set k = 1 and p1 = q = n. Otherwise, q < n so n = mq, where 2 ≤ m < n. By the induction Q assumption, there exist prime numbers ′ p1 ≤ p2 ≤ · · · ≤ pk′ such that m = ki=1 pi , all of them bounded above by q. In this case, k = k′ + 1, and pk = q. Now that we know the existence of prime factorizations for all natural numbers n > 2, we look into the question of uniqueness of these factorizations (up to the order of the factors). This is a problem of a rather different nature, as the following example illustrates. 104 CHAPTER 2. THE INTEGERS Example 2.7.3. Let us call an element in the ring 2Z of even integers a “prime” if it cannot be expressed as a product of other even integers. For example, according to this definition the elements 2, 6 and 18 are all “primes”. It is left to the exercises to prove the analogues of the previous results for this ring which, just like the integers, is well-ordered. On the other hand, since 36 = 2 · 18 = 6 · 6, it is clear that the “prime” factorizations in this ring are not unique. The fundamental result about the integers that yields the uniqueness of prime factorizations is the following:. Lemma 2.7.4 (Euclides). Let m, n, p ∈ Z and assume that p is a prime number. If p is a factor of the product mn, then p is a factor of m or a factor of n. Proof. Let d = gcd(m, p). Since d is a factor of p and p is prime, we must have d = 1 or d = p. Obviously, if d = p, then p is factor of a m. On the other hand, If d = 1, there exist integers x and y such that 1 = xm + yp. Hence, n = nxm + nyp and since p is a factor of mn, there exists also an integer z such that mn = zp. We conclude that n = xzp+nyp = (zx+ny)p, which shows that p is a factor of n. The example above shows that Euclides Lemma is not valid in the ring of even integers, for the definition of “prime” that we indicated. Example 2.7.5. When the Greek mathematicians found out about the existence of what we now call irrational numbers they were very intrigued and surprised, and even √ tried to hide its existence! We can now show without much difficulty that 2 n 2 is irrational, i.e., that there are no integers n and m such that m = 2, or equivalently n2 = 2m2 . For this we will argue by contradiction. Assuming that n and m exist, we can choose them to be relatively prime (why?). Now observe that by Euclides Lemma we must have: n2 = 2m2 ⇒ 2|n2 ⇒ 2|n, We conclude that n = 2k, for some integer k. Hence, n2 = 4k 2 , so 4k 2 = 2m2 , or still 2k 2 = m2 . Since 2|m2 , it follows again from Euclides Lemma that 2|m. But 2 being a divisor of both m and n contradicts our assumption that they are relatively prime. We conclude that the equation n2 = 2m2 has no integer √ solutions, and hence that 2 is not rational. 2.7. THE FUNDAMENTAL THEOREM OF ARITHMETIC 105 √ We leave it for the exercises to generalize this example to the case n, where n ∈ N is not a perfect square. In other words, if n is a natural number, its square root is either another natural number (case in which n is a perfect square) or an irrational number. It is not hard to generalize Euclides Lemma to any finite product of integers. The proof is by induction in the number of factors and is left to the exercises. Q Corollary 2.7.6. If p ∈ N is prime and p | ki=1 mi , then: (i) p|mj for some j, with 1 ≤ j ≤ k. (ii) If the integers mi are primes, then p = mj , for some j, with 1 ≤ j ≤ k. Finally, we can state and prove the Theorem 2.7.7 (Fundamental Theorem of Arithmetic). Any natural number n > 2 has a prime factorization which is unique up to the order of the factors. Proof. The existence of prime factorizations follows from Corollary 2.7.2. It remains to prove the uniqueness. Assume that k and m are natural numbers, p1 ≤ p2 ≤ · · · ≤ pk and q1 ≤ q2 · · · ≤ qm are primes, and k Y pi = i=1 m Y qj . j=1 We claim that k = m and pi = qi . For that, we argue by induction in k: • If k =1, the result is obvious from the definition of a prime number. • Assume that the result holds for the natural k − 1, and that k Y pi = m Y qj . j=1 i=1 Let P = {pi : 1 ≤ i ≤ k} and Q = {qj : 1 ≤ j ≤ m}. Again by the definition of prime number we must have m > 1, since k > 1. By Corollary 2.7.6 it is clear that pk ∈ Q, so that pk ≤ qm , Similarly qm ∈ P , so that qm ≤ pk . We conclude that pk = qm , and it follows from the cancellation law that k−1 Y i=1 pi = m−1 Y j=1 qj . 106 CHAPTER 2. THE INTEGERS By the induction hypothesis, k − 1 = m − 1 and pi = qi , for i < k, so we conclude that k = m and pi = qi , for i ≤ k. A prime factorization of n can, of course, contain repeated factors. For that reason, one often writes it in the form: m Y pi ei (ei ≥ 1). n= i=1 This is called the prime powers factorization of n. Again, this expression is unique, up to the order of the factors. The Fundamental Theorem of Arithmetic does not imply directly the existence of an infinite number of primes. This last fact was also discovered by Euclides. Theorem 2.7.8 (Euclides). The set of prime numbers is unbounded. Proof. We show that for any natural number n there exists a prime number p > n. Given a natural number n, consider the natural number m = n! + 1, where n! is the factorial of n. It is obvious from the Division Algorithm that the remainder of the division of m by any natural number between 2 and n is 1. In particular, all the factors of m, including its prime factors (which exist according to Proposition 2.7.1), are larger than n. We conclude that there exist prime numbers greater than n. Hence, the prime numbers form an infinite sequence 2, 3, 5, 7, 11, 13, . . . , pn , . . . and this raises immediately some obvious questions. For example, is it possible to determine an explicit formula involving the natural number n, which gives the prime pn ? How many prime numbers are there in the interval of 1 to n? How hard is it to find the prime factors of a given natural n? How large can it be the gap between two successive primes? The answer to the first question above seems to be negative. In particular, no explicit formula is known that produces only prime numbers. One famous attempt is due to Fermat11 and involved the numbers of the form n Fn = 22 + 1, 11 Pierre of Fermat (1601-1665), French mathematician. Fermat, was a lawyer and one of the most fascinating characters of the History of Mathematics. He was one of the founders of Calculus and he discovered independently from Descartes (of whom he was a friend) the principles of Analytic Geometry. However, his most important contributions were, without a doubt, to Number Theory. 2.7. THE FUNDAMENTAL THEOREM OF ARITHMETIC 107 now known as Fermat numbers. It is easy to compute the Fermat numbers corresponding to n = 0, 1, 2 and 3: namely, 3, 5, 17, and 257, which are all primes. If n = 4 one finds the Fermat number 65537, which is still a prime number. There are however Fermat numbers that are not primes, has Euler12 find out in 1732 by taking n = 5. If this looks like a simple observation, note that n = 5 corresponds to the number: 5 22 + 1 = 4 294 967 297, and Euler found that its prime factorization is: 5 22 + 1 = 641 · (6 700 417). The choice of the exponent e = 2n is easy to explain. Assume that F = 2e + 1, and e = ks, where k, s > 1, with s odd. The polynomial p(x) = xs + 1 has the root x = −1, hence can be factor as p(x) = (x + 1)q(x), where q(x) is a polynomial with integer coefficients. Substituting x by 2k , we conclude that s F = 2e + 1 = (2k ) + 1 = (2k + 1)q(2k ), so the number F is not prime. In other words, if F = 2e + 1 is prime then e has no odd factors greater than 1, so its only prime factor is 2, and we have e = 2n , hence F is the Fermat number Fn . In spite of the promising start of the sequence of Fermat numbers, we don’t know any Fermat number with n > 4 that is a prime, and it is known that some of these Fermat numbers are composed. For example, it is known that the smallest prime factor of the Fermat number corresponding to n = 1945 (a number with more than 10582 digits in its decimal expansion!) is a prime number p of 587 digits: p = 5 · 21947 + 1, and it is conjectured that no Fermat number with n > 4 is a prime. In spite of that, we will see in the exercises that these numbers can also be used to prove the existence of an infinite number of primes. Questions concerning the number of primes in the interval [1, n], or concerning the distribution of primes, are related to the probability of a randomly chosen natural number in the interval [1, n] be a prime. It is also 12 Leonhard Euler (1707-1783), Swiss mathematician. Euler was one of the most prodigious mathematicians of all times, having worked in various branches of Pure and Applied Mathematics, such as Analysis, Geometry, Number Theory, Fluid Mechanics, etc. 108 CHAPTER 2. THE INTEGERS related to the problem of finding an explicit formula for the n-esimal prime, as mentioned above. Legendre13 and Gauss were the first mathematicians to suggest an approximate expression π(x) for the number of primes < x. At the end of the XIX Century, Hadamard 14 showed that π(x) x log x → 1 when x → ∞. We will not discuss here any results of this nature that typically require analytic techniques. Some of the problems are among the hardest problems in Mathematics, and illustrate the capabilities and limitations of the human mind. In spite of its abstract origin, these problems have interesting consequences in our day-to-day life. We will mention later some techniques from Cryptography that take advantage of the relative ease in manipulating “large” prime numbers when compared with the difficulties in finding prime factorizations. In practical applications, these “large” numbers can have more that 100 digits, while the determination of their prime factorizations by sequential verification of its possible prime factors may require a number of divisions of the order of 1050 ! The security in the electronic transmission of the most secret communications or of simple financial transactions depends on our ignorance of a practical algorithm for prime factorization. At this point it is hard to mention other “practical” applications where the properties of prime numbers are important. We mention only that questions concerning speech or image recognition require algorithms for the decomposition of signals into its fundamental frequencies, a technique known as Fourier Analysis. The theoretical maximum speed for these algorithms is directly related to the function π(x). Exercises. Qk Qk 1. If n = i=1 pi ei and m = i=1 pi fi with ei , fi ≥ 0 integers, find expressions for the gcd(n, m) and the lcm(n, m). 2. Let p and q be distinct primes, and let n = p2 q 3 . Count all the natural numbers that are factors of n, and show that their sum is (1 + p + p2 )(1 + q + q 2 + q 3 ). 13 Adrien Marie Legendre (1752-1833), French mathematician. Legendre become known for his fundamental contributions to Number Theory and to the study of elliptic function. 14 Jacques Hadamard (1865-1963), was one of the most influential French mathematicians of the second half of the XIX Century and the first half of the XX Century. His work spanned such different domains of Mathematics as, e.g., Number Theory or the Calculus of Variations. 2.8. CONGRUENCES 109 3. Generalize the previous result to the case where n = Qk i=1 pi ei . 4. Prove a version of Corollary 2.7.2 for the ring 2Z of all even integers. 5. Prove Corollary 2.7.6. 6. Show that, for any natural number n, the interval [n + 1, n! + 1] contains at least one prime. 7. The primes of the form 2n − 1 are called Mersenne primes. Show that, if an − 1 is prime and n > 1, then a = 2 and n is prime. 8. Show that the sequence an = n2 − n + 41 is not formed only by primes. 9. Show that, if p(x) is a non-constant polynomial with integer coefficients, then the set of integers n for which an = p(n) is not prime must be infinite. n 10. Let Fn = 22 + 1 be the n-esimal Fermat number (n ≥ 0). Q (a) Show that Fn+1 = 2 + ni=0 Fi ; (b) Show that if n 6= m then gcd(Fn , Fm ) = 1; (c) Explain why the previous result implies the existence of an infinite number of primes. 11. Show that, if n is not a perfect square, then equation x2 = ny 2 has no solutions x, y ∈ Z). 2.8 √ n is irrational (i.e., the Congruences We now turn to the study of other binary relations in Z, the so-called “congruence modulo m”, which are associated to the divisibility relation We will use them to solve equations of the form ax + by = n, where all the variables are integers. In the next chapter, we will also use them to exhibit an important class of finite rings. Definition 2.8.1. Let m ∈ N0 . If x, y ∈ Z, we say that x is congruent modulo m with y if and only if x − y is a multiple of m. The integer m is called the modulus of congruence. If x is congruent with y modulo m, we will write x ≡ y (mod m). Therefore, we have: x≡y (mod m) ⇐⇒ m|(x − y) ⇐⇒ (x − y) ∈ hmi. 110 CHAPTER 2. THE INTEGERS Recall that a binary relation is called an equivalence relation if it is reflexive, symmetric and transitive (see the Appendix). Proposition 2.8.2. Congruence modulo m is an equivalence relation. Proof. We check that congruence modulo m satisfies all three properties: (i) ≡ is reflexive: since 0 is a multiple of m, we have x ≡ x (mod m). (ii) ≡ is symmetric: obviously, x − y = km if and only if y − x = (−k)m, hence x ≡ y (mod m) ⇐⇒ y ≡ x (mod m). (iii) ≡ is transitive: if x ≡ y (mod m) and y ≡ z (mod m), then there exist integers k and n such that x − y = km and y − z = nm. But then, x − z = (x − y) + (y − z) = km + nm = (k + n)m, hence x ≡ z (mod m). Examples 2.8.3. 1. Ignoring the variable y, the equation 3 = 21x + 30y (x, y integers) is written 3 ≡ 21x (mod 30), or 21x ≡ 3 (mod 30). 2. If m = 0, then since 0 is the only multiple of 0, x≡y (mod 0) ⇐⇒ x = y. Hence, the congruence relation modulo 0 is the usual equality relation. At the other extreme, if m = 1, then x ≡ y (mod 1), for all x, y ∈ Z (the integer x − y is always a multiple of 1). 3. Any integer n is even or odd, i.e., n = 0 + 2k or n = 1 + 2k. Hence n ≡ 0 (mod 2), or n ≡ 1 (mod 2). According to the Division Algorithm, if m > 0 and x is any integer, then x = mq + r, where q, r ∈ Z, and these integers are unique if 0 ≤ r < m. In other words, x ≡ r (mod m), with 0 ≤ r < m, if and only if r is the remainder of the division of x by m. Therefore, we have the following generalization to any m > 0 of the example above for m = 2: 2.8. CONGRUENCES 111 Proposition 2.8.4. If m > 0, any x ∈ Z is congruent with exactly one integer r in the set {0, 1, 2, . . . , m − 1}, where r is the remainder of the division of x by m. Example 2.8.5. Any integer x satisfies exactly one of the following: x ≡ 0 (mod 4), x ≡ 1 (mod 4), x ≡ 2 (mod 4) or x ≡ 3 (mod 4). For a fixed m 6= 0, the set r = {x ∈ Z : x ≡ r (mod m)} is called the congruence class or the residue class or simply the residue of the integer r modulo m. We will see in the next chapter that the set of congruence classes (mod m) is the support of the ring Zm , which generalizes to any m the examples of the rings Z2 and Z3 which we have already mentioned. Equations that involve congruence relations can be dealt with just like the ordinary algebraic equations for numbers, with the exception of the “cancellation law” for the product. This is the content of the following proposition: Proposition 2.8.6. If x ≡ x′ (mod m) and y ≡ y ′ (mod m), then the following properties hold: (i) x ± y ≡ x′ ± y ′ (mod m); (ii) xy ≡ x′ y ′ (mod m). In particular, we have that: (iii) x ≡ x′ (mod m) ⇔ x + a ≡ x′ + a (mod m); (iv) x ≡ x′ (mod m) ⇒ ax ≡ ax′ (mod m). Proof. By assumption, both x − x′ and y − y ′ are multiples of m. Therefore, it is obvious that (x − x′ ) ± (y − y ′ ) = (x ± y) − (x′ ± y ′ ) are multiples of m, i.e., x ± y ≡ x′ ± y ′ (mod m). On the other hand, (x − x′ )y and x′ (y ′ − y) are still multiples of m. Hence, (x − x′ )y − x′ (y ′ − y) = xy − x′ y ′ is a multiple of m, or equivalently, xy ≡ x′ y ′ (mod m). The proofs of the remaining statements are left for the exercises. 112 CHAPTER 2. THE INTEGERS Example 2.8.7. Since 10 ≡ 3 (mod 7) and 11 ≡ −3 (mod 7), according to Proposition 2.8.6: 10 + 11 ≡ 3 + (−3) (mod 7) 10 − 11 ≡ 3 − (−3) (mod 7) 10 · 11 ≡ 3 · (−3) (mod 7) ⇐⇒ ⇐⇒ ⇐⇒ 21 ≡ 0 (mod 7), −1 ≡ 6 (mod 7), 110 ≡ −9 (mod 7). On the other hand, note that 4 · 5 ≡ 4 · 8 (mod 6), but 5 6≡ 8 (mod 6). Hence, in general, the cancellation law for the product fails: ax ≡ ax′ (mod m) does not imply that x ≡ x′ (mod m). An important observation is that Proposition 2.8.6 uses precisely the properties of hmi that make this set into an ideal. A corollary of this Proposition is the following, which is left as an exercise. Corollary 2.8.8. If x ≡ y (mod m), then xn ≡ y n (mod m) for all natural numbers n. One of the most simple applications of Proposition 2.8.6 and of its Corollary is to obtain divisibility criteria in terms of the decimal expansion of x. Example 2.8.9. Letting m = 3, we observe that for any natural number k: 10 ≡ 1 (mod 3) =⇒ 10k ≡ 1 (mod 3). For example, if x = 1998, we have that 1998 = 1 · 1000 + 9 · 100 + 9 · 10 + 8, ≡ 1 + 9 + 9 + 8 ≡ 27 ≡ 0 (mod 3). We conclude that 1998 is divisible by 3, without using the Division Algorithm! This example illustrates the following general divisibility criterium: Proposition 2.8.10. A natural number n is divisible by 3 if and only if the addition of all the digits of its decimal representation is divisible by 3. We discuss in the exercises similar divisibility criteria for 2, 5, 9 and 11. 2.8. CONGRUENCES 113 Let us now turn to the study of linear equations of the form ax ≡ b (mod m). Recall that the equation ax = b has integer solutions if and only if b is multiple of a, in which case the solution if unique (except if a = 0). We are interested in knowing for which values of b (in terms of a and m) does the equation ax ≡ b (mod m) have solutions, and how can we find all its solutions. The problem of existence of solutions of ax ≡ b (mod m) is easily solved using the notion of greatest common divisor: Theorem 2.8.11. The equation ax ≡ b (mod m) has solutions if and only if b is a multiple d = gcd(a, m). Proof. Indeed, the equation ax ≡ b (mod m) has solutions if and only if there exist integers x and y such that b − ax = my, i.e., b = ax + my. In other words, ax ≡ b (mod m) has solutions precisely when b is a linear combination of a and m with integers coefficients. As we saw in Section 2.6 (see Definition 2.6.9), the linear combinations of a and m with integers coefficients are exactly the multiples of gcd(a, m). The previous result shows that Euclides Algorithm can be used to decide about the existence of solutions of the equation ax ≡ b (mod m). Actually, since this algorithm allows to obtain d = gcd(a, m) as a linear combination of a and m, it gives also one solution of ax ≡ d (mod m). This leads easily to the solutions of the original equation, as we illustrate in the next example. Example 2.8.12. Consider the equation 15x ≡ b (mod 40). Using Euclides Algorithm, we have 40 = 15 · 2 + 10, or 10 = 40 + 15 · (−2), 15 = 10 · 1 + 5, or 5 = 15 + 10 · (−1) = 15 · 3 + 40 · (−1), 10 = 5 · 2 + 0, hence 5 = gcd(15, 40) = 15 · 3 + 40 · (−1). We conclude from Theorem 2.8.11 that the equation has a solution if and only if b is a multiple of 5. These very same computations show that we have 5 = 15 · 3 + 40 · (−1). Hence, x=3 is a solution of 15x ≡ 5 (mod 40). More generally, by Proposition 2.8.6, x = 3k is solution of 15x ≡ 5k (mod 40). 114 CHAPTER 2. THE INTEGERS For example, if k = 2 we see that x = 6 is a solution of the equation 15x ≡ 10 (mod 40), because: 15 · 3 ≡ 5 (mod 40) =⇒ 15 · 3 · 2 ≡ 5 · 2 (mod 40), Obviously, any integer x that verifies x ≡ 6 (mod 40) is also a solution of 15x ≡ 10 (mod 40), so this last equation has an infinite number of solutions. However, the integers x ≡ 6 (mod 40) do not include all the solutions of 15x ≡ 10 (mod 40). For example, x = −2 is also solution of 15x ≡ 10 (mod 40), but −2 6≡ 6 (mod 40). This is again a consequence of the lack of a “cancellation law” for the product, because 3 · 5 · (−2) ≡ 5 · 2 (mod 40) 6⇒ 6 ≡ −2 (mod 40). We can look under what circunstancies this multiplicity of solutions is not possible. Notice that Theorem 2.8.11 also shows that the equation ax ≡ b (mod m) has solutions for all b precisely when gcd(a, m) = 1, i.e.., when a and m are relatively prime. Obviously, this happens if and only if the equation ax ≡ 1 (mod m) has solutions. Definition 2.8.13. One says that a ∈ Z is invertible (mod m) if ax ≡ 1 (mod m) has a solution, i.e., if a and m are relatively prime. If c is a solution of the equation ax ≡ 1 (mod m), we call c is an inverse (mod m) of a. Examples 2.8.14. 1. The equation 4x ≡ b (mod 9) has solutions for avery b, since gcd(4, 9) = 1. In particular, 4x ≡ 1 (mod 9) has the solution x = −2 because 4(−2) + 9 = 1. Hence, (a) −2 is an inverse of 4 (mod 9), and (b) x = −2b is a solution of 4x ≡ b (mod 9), for every integer b. 2. Since gcd(21, 30) = 3, 21 is not invertible (mod 30). Assume that x = c is a particular solution of ax ≡ b (mod m). As we have observed, any integer x′ congruent with c (mod m) is also solution of the equation: x′ ≡ c (mod m) =⇒ ax′ ≡ ac ≡ b (mod m). It is possible that there are other solutions x′′ , that are not congruent with c, i.e., we may have ax′′ ≡ b (mod m) with x′′ 6≡ c (mod m). 2.8. CONGRUENCES 115 However, when a and m are relatively prime this is impossible, due to the following “cancellation law”: Theorem 2.8.15. If a and m are relatively prime, ax ≡ ay (mod m) ⇐⇒ x≡y (mod m). Proof. We know already that x≡y (mod m) =⇒ ax ≡ ay. On the other hand, let c be an inverse (mod m) of a, so that ac ≡ ca ≡ 1 (mod m). Then ax ≡ ay (mod m) =⇒ c(ax) ≡ c(ay) =⇒ (ca)x ≡ (ca)y =⇒ x ≡ y (mod m) (Proposition 2.8.6), (mod m) (associativity), (mod m) (ca ≡ 1 (mod m). We conclude from Theorems 2.8.11 e 2.8.15 that Theorem 2.8.16. If a and m are relatively prime and b ∈ Z, then: (i) The equation ax ≡ b (mod m) has at least one solution c ∈ Z; (ii) ax ≡ b (mod m) ⇒ x ≡ c (mod m). Therefore, when a and m are relatively prime the solution of the linear equation ax ≡ b (mod m) is very simple. We illustrate it in the following example. Example 2.8.17. Since gcd(4, 7) = 1 and 4 · 2 ≡ 1 (mod 7), we conclude that the solutions of 4x ≡ 1 (mod 7) (the inverses of 4 (mod 7)) are precisely the integers which satisfy x ≡ 2 (mod 7), i.e., the integers of the form x = 2 + 7k. For example, if b = 3, we have 4x ≡ 3 (mod 7) ⇐⇒ x ≡ 6 (mod 7). When a and m are not relatively prime, it is still easy to determine all the solutions of the equation ax ≡ b (mod m). For that, recall that if a = a′ d and m = m′ d with d = gcd(a, m) 6= 0, then a′ and m′ are relatively prime. 116 CHAPTER 2. THE INTEGERS Example 2.8.18. Let us find all solutions of the equation 15x ≡ 10 (mod 40). Obviously, this equation is equivalent to 15x = 10 + 40y and this last equation can be divided by gcd(15, 40) = 5, giving 3x = 2 + 8y, or 3x ≡ 2 (mod 8). Hence, we have: 15x ≡ 10 (mod 40) ⇐⇒ 3x ≡ 2 (mod 8), where in the last equation obviously gcd(3, 8) = 1. We have already seen that x = 6 is a solution of the original equation, and therefore also of 3x ≡ 2 (mod 8). It follows from Theorem 2.8.16, that the solutions of the last equation are the integers which satisfy x ≡ 6 (mod 8). We conclude that the solutions of 15x ≡ 10 (mod 40) are the integers of the form x = 6 + 8k. In particular, this equation has 5 solutions that are not congruents (mod 40), namely, x = 6, 14, 22, 30, 38 (the solution that we found before, x = −2, is congruent to x = 38 (mod 40)). We discuss in the exercises how to determine the number of solutions of ax ≡ b (mod m) that are not congruent (mod m). One can also solve systems of the form x≡a (mod m), (2.8.1) x≡b (mod n). The following result was found by the Chinese mathematician Sun-Tsu in the 1st Century AD! For that reason, it is known as the Chinese Remainder Theorem. Theorem 2.8.19 (Chinese Remainder Theorem). The system (2.8.1) has solutions for all a and b if and only if m and n are relatively prime. In this case, if c is a solution, then (2.8.1) is equivalent to x ≡ c (mod mn). Proof. It is obvious that the integers of the form x = a+ym are the solutions of the equation x ≡ a (mod m). Hence x = a + ym is a solution of (2.8.1) if and only if x≡b (mod n) ⇒ a+ym ≡ b (mod n) ⇒ my ≡ b−a (mod n). We saw above that this last equation has a solution for all a and b if and only if m and n are relatively prime. In this case, Theorem 2.8.16 shows that 2.8. CONGRUENCES 117 the solutions are the integers of the form y = y ′ + zn, where y ′ is any fixed solution, and z is arbitrary. We conclude that the solutions of the system (2.8.1) are the integers of the form x = a + (y ′ + zn)m = c + z(nm), where c = a + y ′ m, i.e., are the solutions of x ≡ c (mod mn). The case where m and n are not relatively prime is dealt with in an exercise at the end of the section. Example 2.8.20. Consider the system x≡1 x≡2 (mod 4), (mod 9). The solutions of the first equation are the integers of the form x = 1 + 4y. They should also satisfy the second equation: 1 + 4y ≡ 2 (mod 9) ⇐⇒ 4y ≡ 1 (mod 9) with solution y = −2 (we saw that −2 is inverse of 4 (mod 9)). We conclude that c = 1+4(−2) = −7 is a particular solution of the system. By the Chinese Remainder Theorem, the system is equivalent to the equation x ≡ −7 (mod 36), so the solutions are the integers of the form x = −7 + 36z, with z ∈ Z. The equations studied in this section have as unknowns integers. Such equations are called Diophantine 15 . Although, as we saw above, solving linear Diophantine equations is relatively simple, some of the hardest (and more famous) problems in Mathematics concern non-linear Diophantine equations. Perhaps the most famous one is “Fermat’s Last Theorem” concerning the integers solutions x, y and z of the equation xn + y n = z n . Although this equation has an infinite number of solutions when n=2 (e.g, x = 3, y = 4 and z = 5), solutions for n > 2 were never found. Fermat wrote in the margins of a copy of the Arithmetica of Diophantus, next to 15 After Diophantus of Alexandria, a Greek mathematician of the III Century, author of a famous series of books called Arithmetica, most of which are now lost. This treaty included, among many things, solutions (some of there extremely ingenious!) of algebraic equations. Diophantus was only interested in rational solutions, and called the irrational ones “impossible”. 118 CHAPTER 2. THE INTEGERS the discussion of the Pythagoras’s theorem, that he knew how to prove the non-existence of solutions for n > 2, but that the margin was too small to describe the argument. Fermat’s proof remains unknown to these days and, in fact, it took more that 300 years, and the efforts of many generations of mathematicians, to reach a complete solution of Fermat’s Last Theorem. The proof, due to the american mathematician Andrew Wiles16 , is no doubt one of the greatest discovers of Mathematics. The degree of sophistication of the proof, which culminates the works of many great mathematicians for more than 200 years, makes it one of the most elaborate intellectual constructions of all the humanity. Exercises. 1. Find all integers solutions of 21x + 30y = 9. 2. For which integers b does the equation 533x ≡ b (mod 4141) have solutions? 3. State and prove criteria of divisibility by 2, 5, 9 and 11, in terms of the decimal representation of a natural number n. 4. Given some natural number k, find the remainder of the division of 3k be 7. 5. Determine all solutions of xy ≡ 0 (mod 12). 6. For a, m ∈ Z, with m 6= 0, show that exactly one of following holds true: (a) ax ≡ 1 (mod m) has solutions (so gcd(a, m) = 1), or (b) ax ≡ 0 (mod m) has solutions x 6≡ 0 (mod m) (so gcd(a, m) 6= 1). 7. Assume that gcd(a, m) = d and m = dn. Show that: (a) ax ≡ 0 (mod m) has d solutions x, with 0 < x ≤ m, namely x = n, 2n, . . . , dn; (b) ax ≡ b (mod m) has either 0 or d solutions x, with 0 < x ≤ m. 8. Show that the equation x2 + 1 ≡ 0 (mod 11) has no solutions. 16 Andrew Wiles announced in the summer of 1993 a proof of Fermat’s Last Theorem, but it was soon recognized that a crucial step was missing. Finally, in September of 1994, Wiles together with Richard Taylor found an argument which allowed to circumvent the missing step. The correct proof was published in the paper “Modular elliptic curves and Fermat’s last theorem”, Ann. of Math. 141 (1995), no. 3, 443–551, and refers to the following paper in the Annals, which is precisely a joint paper of Wiles and Taylor, “Ring-theoretic properties of certain Hecke algebras”, Ann. of Math. 141 (1995), no. 3, 553–572. 2.9. PRIME FACTORIZATIONS AND CRYPTOGRAPHY 119 9. Determine which numbers between 1 and 8 have an inverse (mod 9). 10. Find all solutions of the equation x2 + 1 ≡ 0 (mod 13). 11. Show that x5 − x ≡ 0 (mod 30) for for every integer x. 12. Show that, if a ≡ a′ (mod m), then gcd(a, m) = gcd(a′ , m). 13. Find the values of c ∈ Z for which the following system has solutions: x≡c (mod 14) 2x ≡ 10 (mod 42) 14. Solve the system of equations 2x + 3y ≡ 3 3x + y ≡ 4 (mod 5) (mod 5). 15. Which day of the week was March 15, 1800? (Hint: In the current Western calendar, called the Gregorian calendar, a leap year is one which is divisible by 4, but not 100, or divisible by 400. For example, the years 1700, 1800, and 1900 are not leap years, but 1600 and 2000 are). 16. Five shipwreck survivors arrive at an island where they find a chimpanzee. After they spend one day picking up coconuts, they decide to leave the divison of the coconuts for the next day. During the night each survivor wakes up and goes pickup what he thought was his share of the coconuts. Each of them finds that he cannot divide the coconuts by 5, since there is always one leftover, which they leave for the chimpanzee. The next day, they decide to split the coconuts that were left from their night incursions and now they find that they can split them exactly among them. Knowing that that there were at most 10 000 coconuts in the island, how many coconuts did they pickup? 2.9 Prime Factorizations and Cryptography We will now describe, as an example of the theory we have developed so far, an application to Cryptography. We start with some preliminary results. We will denote by Ckn the “number of arrangements of n elements in groups of k”, so that: k!(n − k)!Ckn = n!. Since n|n!, this equation and Euclides Lemma, imply that whenever n = p is a prime and 0 < k < p, we have p|Ckp . We state this as a lemma for future reference: 120 CHAPTER 2. THE INTEGERS Lemma 2.9.1. If p is a prime and 0 < k < p, then Ckp ≡ 0 (mod p). Recall that the well-known Newton’s binomial formula states that: (2.9.1) (a + b)n = n X Ckn ak bn−k . k=0 (this formula is actually valid in any commutative ring, as can easily be see by applying induction). When n = p is prime, it leads to: Proposition 2.9.2 (“The Freshman’s Formula”). If p is prime, (a + b)p ≡ ap + bp (mod p). Proof. By Lemma 2.9.1, the only terms in the binomial expansion that may fail to be divisible by p correspond to k = 0 and k = n. Theorem 2.9.3 (Fermat). If p is prime, ap ≡ a (mod p). Proof. Let us prove this result by induction in a. The result is obvious if a = 0. Assuming it is true for some integer a ≥ 0, we have that (a + 1)p ≡ ap + 1 (mod p), by the freshman’s formula, and that ap ≡ a (mod p), by the induction hypoteshis. We conclude that (a + 1)p ≡ a + 1 (mod p), so the result is true for any integer a ≥ 0. Since (−1)p ≡ (−1) (mod p) for any prime p (why?), the result holds for all integers. An interesting corollary of this result is a practical method to find inverses (mod p). Corollary 2.9.4. If p is prime and a 6≡ 0 (mod p), then ap−1 ≡ 1 (mod p). In particular, ap−2 is the inverse (mod p) of a. Proof. Obviously, since p is a prime, gcd(a, p) must be 1 or p, so that either a ≡ 0 (mod p) or gcd(a, p) = 1. Since the first case is excluded, we must have gcd(a, p) = 1 and a is invertible (mod p). Using the Fermat’s theorem, we conclude that ap = ap−1 a ≡ a (mod p) ⇒ ap−1 ≡ 1 (mod p). We can now describe a public key encryption mechanism, which is particularly ingenious. The reason for this name is that the encryption method 2.9. PRIME FACTORIZATIONS AND CRYPTOGRAPHY 121 can be known to everyone, but the decryption method is kept secret. Variations of this method of encryption are used, e.g., to make secure financial transactions in the Internet. 17 . The main ingredients are a natural number N = pq, which is the product of two distinct primes p and q, and another natural number r, relatively prime to both p − 1 and q − 1. The numbers N and r are public, but the prime factors of N are kept secret. Hence, this system explores the possibility of generating very large prime numbers, together with our lack of knowledge of a practical, efficient, algorithm to determine the prime factors of large natural numbers. Now the encryption proceeds as follows: assume that we would like to transmit numbers a such that 0 ≤ a < N . Instead of transmitting a, one sends the remainder of the division of ar by N , which we denote by b = ρ(ar , N ). The decryption amounts to finding a given b = ρ(ar , N ). This computation can only be done efficiently if one knows the prime factors of N , in which case it is a simple application of our previous results, as we now explain. On the one hand, setting c = ρ(b, p) and d = ρ(b, q), it is clear that c ≡ ar (mod p) d ≡ ar and (mod q). On the other hand, since r is relatively prime to both p − 1 and q − 1, there exist integers x, y, x′ , y ′ such that 1 = rx + (p − 1)y = rx′ + (q − 1)y ′ . It follows from the corollary above that a = arx+(p−1)y ≡ (ar )x ≡ cx rx′ +(q−1)y ′ a=a x′ ≡ (ar ) (mod p), x′ ≡d (mod q). Note that we can find x and x′ directly by applying Euclides Algorithm. Also, interpreting negative powers of a as positive powers of an inverse of a, it is clear that one can always choose x and x′ to be both positive. We conclude that the original message a satisfies the system a ≡ (ρ(b, p))x (mod p) . ′ a ≡ (ρ(b, q))x (mod q) 17 One of the most commonly used methods is the so-called RSA encryption method (discovered by Rivest, Shamir and Adlemman). Such a public system, besides the encryption mechanism that we describe, includes also a simple signature verification scheme, which is crucial for private communications via public channels. 122 CHAPTER 2. THE INTEGERS According to the Chinese Remainder Theorem, this system determines uniquely the number a (mod pq), i.e., (mod N ). Since we assumed 1 ≤ a < N , we have found a. Solving this system requires only the knowledge of an inverse of p (mod q), and this can again be achieved by applying Euclides Algorithm. We illustrate the method in the following example. Example 2.9.5. Let N = 21 = 3·7 = p·q. The number r = 5 is relatively prime to p−1 = 2 and q−1 = 6. Let us assume that we wish to transmit the message a = 4. According to the method just described, instead of transmitting a, one send instead the remainder of the division of ar = 1024 by N = 21, that is b = ρ(1024, 21) = 16 (since 1024 = 48 · 21 + 16). Now assume that the message b = 16 reaches the receiver, who is wishes to decrypt it and find a. Since 1 = 5 · (−1) + 2 · 3 = 5 · (−1) + 6′ · 1, we conclude that x = −1 and x′ = −1. On the other hand, the numbers c and d are given by c = ρ(b, p) = 1, (since 16 = 3 · 5 + 1) d = ρ(b, q) = 2, (since 16 = 7 · 2 + 2). Hence, the number a satisfies the system a ≡ (1)−1 (mod 3) . a ≡ (2)−1 (mod 7) Noting that 2 · 4 = 8 ≡ 1 (mod 7), the inverse of 2 (mod 7) is 4, and we conclude that a satisfies the system a ≡ 1 (mod 3) . a ≡ 4 (mod 7) The solutions of the first equation are a = 1 + 3n, with n ∈ Z. Replacing in the second equation, one finds 1 + 3n ≡ 4 (mod 7) ⇒ 3n ≡ 3 (mod 7). In order to solve this last equation we note that the inverse of 3 (mod 7) is 5 (since 3 · 5 = 15 ≡ 1 (mod 7)), so that: n ≡ 3 · 5 = 15 (mod 7). The Chinese Remainder Theorem now gives that a ≡ 1 + 3 · 15 = 46 (mod 21). Since 0 ≤ a < 21, the receiver concludes that the message transmitted was a = 4. 2.9. PRIME FACTORIZATIONS AND CRYPTOGRAPHY 123 Exercises. 1. A word is codified so that each letter of the the (english) alphabet corresponds to a natural number: a 7→ 1, b 7→ 2, c 7→ 3, . . . . Then the message “10,21,23,10,16,1” is transmitted in a system of public key with N = 35 and r = 5. What was the original word? 2. Prove the following generalization of the Chinese Remainder Theorem: If d = gcd(m, n) the system x≡a (mod m) x≡b (mod n) has solutions if and only if d|(a − b). In this case, if c is a solution, then the system is equivalent to x ≡ c (mod lcm(n, m)). 124 CHAPTER 2. THE INTEGERS Chapter 3 Other Examples of Rings 3.1 The Rings Zm We have studied the ring Z in detail in Chapter 2. You should also be familiar with the fields Q, R or C, and their propertied. In this chapter we will study other important examples of rings. We start by studying the rings associated with the congruence (mod m), the rings Zm . In Chapter 1 we have seen briefly the cases Z2 and Z3 , without making any reference to the congruence (mod m). However, a more systematic study of these rings requires using these congruence relations. Let us start by observing that the congruence (mod m) can be replaced by an equality if we “identify” (i.e., treat as unique element) all the integers which are congruent between themselves. The method we use is actually valid for any equivalence relation and consists in replacing one element by the class made of all the elements which are equivalent to it. Let us assume then that we have fixed a modulus of congruence m ∈ Z. For any x ∈ Z we introduce: Definition 3.1.1. The equivalence class (mod m) of x is the set x formed by all integers congruent with x (mod m). In other words: x = {y ∈ Z : x ≡ y (mod m)}. Example 3.1.2. For the congruence with modulus m = 3, we have: 0 = {0, ±3, ±6, . . . }, 1 = {1, 1 ± 3, 1 ± 6, . . . }, 2 = {2, 2 ± 3, 2 ± 6, . . . }. 125 126 CHAPTER 3. OTHER EXAMPLES OF RINGS Obviously the notation x is ambiguous since it does not contain any information concerning the modulus of congruence m it refers to. However, it is clear that x = {x + ym : y ∈ Z}, or that: x = {x + z : z ∈ hmi}. For this reason, whenever necessary to clarify the modulus of congruence, we will write x + hmi instead of x. Example 3.1.3. For m = 4 and m = 5 we have, respectively, m=4: m=5: 3 = 3 + h4i = {. . . , −5, −1, 3, 7, 11, . . . }, 3 = 3 + h5i = {. . . , −7, −2, 3, 8, 13, . . . }. In order to replace the congruence x ≡ y (mod m) by an equality we will use the following lemma. Since it is a direct consequence of Proposition 2.8.2, the lemma is actually valid for any equivalence relation. Lemma 3.1.4. For any integers x, y, the following statements are equivalent: (i) x ≡ y (mod m); (ii) x = y (iii) x ∩ y 6= ∅. Proof. We show that (i) ⇒ (ii) ⇒ (iii) ⇒ (i). To prove that (i) ⇒ (ii), observe that if x ≡ y (mod m), the transitivity property of the congruence relation, gives that x ⊇ y. But, by the symmetry property, x ≡ y (mod m) if and only if y ≡ x (mod m), hence also x ⊆ y. We conclude that x = y. To prove that (ii) ⇒ (iii), we observe that the reflexivity property gives that y ∈ y. Hence, if x = y, then x ∩ y 6= ∅. Finally, to prove that (ii) ⇒ (iii), assume that z ∈ x ∩ y 6= ∅. From the definition of equivalence class, we have that x ≡ z (mod m) and y ≡ z (mod m). By the symmetry and transitivity properties, it follows that x ≡ y (mod m). 3.1. THE RINGS ZM 127 According to the reflexivity property, an integer x belongs to the class x, and hence the union of all the equivalence classes is the set Z. Moreover, according to the previous lemma, distinct equivalence classes are necessarily disjoint. For this reason we say the set of all equivalence classes for a given modulus m, i.e., the set {x : x ∈ Z}, is a partition of Z. We will use the notation introduced in the previous chapter and call x a congruence class. Examples 3.1.5. 1. When m = 2, the resulting partition of Z if the usual classification of the integers into even and odd numbers. 2. When m = 3, the resulting partition amounts to the classification of integers in terms of the remainder of the division by 3: Z = {0, ±3, ±6, . . . } ∪ {1, 1 ± 3, 1 ± 6, . . . } ∪ {2 ± 3, 2 ± 6, . . . }. Obviously, an equivalence class x is uniquely determined once we know one of its members. For this reason, any integer y in x is called a representative of the class x, since we have y = x. Example 3.1.6. If m = 3, the integers 1, 4, 7, −2 and −5 are all representatives of 1, and we have 1 = 4 = 7 = −2 = −5. Given an equivalence relation “∼” in a set X, the corresponding set of equivalence classes is called the quotient of X by ∼ and denoted by X/ ∼. The map π : X → X/ ∼ given by π(x) = x, which associates to each element of X its equivalence class, is called the quotient map. In the case where X = Z and ∼ is the equivalence relation modulo m, we denote the quotient set Z/ ∼ by Zm , and the quotient map by πm , or simply by π. Therefore, we have πm (x) = x + hmi = x. More formally: Definition 3.1.7. For a fixed modulus m > 0, we set: Zm = {x : x ∈ Z}, πm : Z → Z m x 7→ x. In this notation, Proposition 2.8.4 amounts simply to count the number of elements of Zm : 128 CHAPTER 3. OTHER EXAMPLES OF RINGS Proposition 3.1.8. If m > 0, Zm = {0, 1, . . . , m − 1} has m elements. Notice that if m = 0, then x = {x}, and hence Z0 is an infinite set. Actually, for the algebraic operations that we will define shortly, Z0 and Z are isomorphic rings. Example 3.1.9. The set Z6 has 6 elements, and we can write Z6 = {0, 1, 2, 3, 4, 5} = {6, 7, 8, 9, 10, 11} = {36, −5, 2, 63, 610, −19}, etc. Notice the ambiguity in notation: when we write Z4 = {0, 1, 2, 3}, the symbols in this list denote objects that are not elements of Z6 . For example, 2 + h4i 6= 2 + h6i i.e., π4 (2) 6= π6 (2). According to Proposition 2.8.6, if x ≡ x′ (mod m) and y ≡ y ′ (mod m), then x + y ≡ x′ + y ′ (mod m) and xy ≡ x′ y ′ (mod m). In terms of equivalence classes, we have x = x′ and y = y ′ =⇒ x + y = x′ + y ′ and xy = x′ y ′ . In other words, the classes x + y and xy do not depend on the choice of representatives x and y, but only on the congruence classes x and y. We can use this fact to define operations of addition and product in Zm . Definition 3.1.10. The addition and product in Zm are defined by x + y = x + y, and x · y = xy. As one could expect, some of the properties of the algebraic operations in Z yield similar properties for the algebraic operations in Zm . For example: x + (y + z) = x + y + z, = x + (y + z), = (x + y) + z, = x + y + z, = (x + y) + z, so we conclude that addition in Zm is associative. We leave as an exercise to check that: 3.1. THE RINGS ZM 129 Theorem 3.1.11. (Zm , +, ·) is an abelian ring with identity. Note, however, that some properties of Z do not transfer to the ring Zm . Example 3.1.12. The addition and multiplication tables in Z4 are: + 0 1 2 3 0 0 1 2 3 1 1 2 3 0 2 2 3 0 1 · 0 1 2 3 3 3 0 1 2 0 0 0 0 0 1 0 1 2 3 2 0 2 0 2 3 0 3 2 1 Now observe the following differences between Z4 and Z: • The equation x = −x has two solutions in Z4 but only one in Z; • Since 2 · 2 = 0 it follows that 2 is a zero divisor, so Z4 is not an integral domain; • The natural multiples of the identity 1 are 1 · 1 = 1, 2 · 1 = 2, 3 · 1 = 3, 4 · 1 = 4 = 0, etc. and so this ring has characteristic 4; • There are 3 subrings of Z4 , which are all principal ideals (as in the ring of integers): h1i = h3i = Z4 , h2i = {0, 2}, and h0i = h4i = {0}. Note that these ideals correspond precisely to the divisors of 4. Returning to the general case of the ring Zm , with m > 0, let us identify the invertible elements of Zm , which is denoted by Z∗m (according to the notation introduced in Chapter 1). Given a ∈ Z, a is invertible in Zm if and only if the equation a · x = 1 has a solution in Zm . From the results in Section 2.8, we have that: a ∈ Z∗m ⇐⇒ a · x = 1 has a solution in Zm , ⇐⇒ ax ≡ 1 (mod m) has solutions in Z, ⇐⇒ gcd(a, m) = 1. Hence, the invertible elements of Zm correspond to the natural numbers k, 1 ≤ k ≤ m, that are relatively prime to m. The number of elements of Z∗m , is denoted by ϕ(m), and the resulting function ϕ : N → N is called the Euler function. 130 CHAPTER 3. OTHER EXAMPLES OF RINGS Example 3.1.13. The invertible elements in the ring Z9 form the set Z∗9 = {1, 2, 4, 5, 7, 8}, hence, ϕ(9) = 6. We will see later how to compute the value of the Euler function ϕ(m) from the prime factorization of m. For now we observe that for any prime p we have ϕ(p) = p − 1, and that all the non-zero elements of Zp are invertible: Theorem 3.1.14. If p is a prime, then Zp is a finite field with p elements. The characteristic of the rings Zm is also very easy to find. We already observed that Z4 has characteristic 4. In fact: Theorem 3.1.15. The ring Zm has characteristic m. Proof. Using induction, we have that ∀n ∈ N, a ∈ Z : na = n · a = na. (3.1.1) In particular for a = 1, we have n1 = 0 ⇐⇒ n=0 ⇐⇒ n ∈ hmi, so the result follows. We have indicated in the example above all the subrings and ideals of Z4 . In order to extend this to any value of m, let us start by looking at the ideals generated by each of the elements of Zm . The following proposition follows immeadately from the commutativity of the product in Zm and (3.1.1). Proposition 3.1.16. If a ∈ Z, then hai = {a · n : n ∈ Zm } = {na : n ∈ Z}. Hence, given a generator, it is very easy to list the elements of an ideal of Zm . Examples 3.1.17. 1. In Z40 we have: h15i = {15, 30, 45 = 5, 20, 35, 50 = 10, 25, 40 = 0}. 3.1. THE RINGS ZM 131 2. In Z21 we have: h15i = {15, 30 = 9, 24 = 3, 18, 33 = 12, 27 = 6, 21 = 0}. Looking more carefully, one finds that the elements of the ideal hai correspond to the integers b for for which the equation ax ≡ b (mod m) has solutions. This integers are, as we know, the multiples of d = gcd(a, m): Proposition 3.1.18. If d = gcd(a, m), then we have que hai = hdi in Zm . Proof. On the one hand, since d = ax + my, we have d = ax, so that d ∈ hai and: hai ⊃ hdi. On the other hand, a = dz and we have a = dz, so that a ∈ hdi, and: hdi ⊃ hai. This result gives immediately the number of elements of hai: if d = gcd(a, m), then m = dk, and hai = hdi has k elements1 . Examples 3.1.19. 1. In Z40 we have: h15i = h5i = {5, 10, 15, 20, 25, 30, 35, 0}, with 40 5 = 8 elements. 2. In Z21 we have with 21 3 h15i = h3i = {3, 6, 9, 12, 15, 18, 0}, = 7 elements. In Z4 all subrings are principal ideals, just like for the ring of integers. In order to see that this is a general property of the rings Zm . we start by relating the subrings of Zm and the subrings of Z, via the quotient map π : Z → Zm . 1 This shows that the number of elements of the ideal hai is a factor of the number of elements the ring Zm . We will see in the next chapter that this is a special case of Lagrange’s Thoerem. 132 CHAPTER 3. OTHER EXAMPLES OF RINGS Z Zm J hmi PSfrag Ireplaements e~ e Figure 3.1.1: Subrings of Z and of Zm . Proposition 3.1.20. If I is um subring of Zm and J = π −1 (I), then I is a subring of Zm if and only if J is a subring of Z that contains hmi. Proof. Note that: J = π −1 (I) = {a ∈ Z : a ∈ I}. We will show that if I is a subring then J is also a subring, leaving the rest as an exercise. Obviously, if a, b ∈ J, then a, b ∈ I. Hence: a − b = a − b ∈ I =⇒ a − b ∈ J, a · b = ab ∈ I =⇒ ab ∈ J. Corollary 3.1.21. If I is a subset of Zm , the following statements are equivalent: (i) I is a subring of Zm ; (ii) I is an ideal of Zm ; (iii) there exists d ∈ Z such that d|m and I = hdi. Proof. The implications (iii)⇒(ii)⇒(i) are obvious. Therefore, it remains to prove that (i)⇒(iii), which we leave for the exercises. It follows from this corollary that Zm has exactly one subring (which is necessarily a principal ideal) for each divisor of m. One can use Proposition 3.1.18 to find the generators of each of these ideals. 3.1. THE RINGS ZM 133 h1i h1i h5i h2i h5i h2i h10i h4i h10i h4i h8i h8i h20i replaements PSfrag h0i h20i h0i Figure 3.1.2: The ideals of Z40 . Example 3.1.22. The generators of h1i = Z40 correspond to the solutions of gcd(x, 40) = 1: h1i = h3i = h7i = h9i = h11i = h13i = h17i = h19i = h21i = h23i = h27i = h29i = h31i = h33i = h37i = h39i. The generators of h2i correspond to the solutions of gcd(x, 40) = 2: h2i = h6i = h14i = h18i = h22i = h26i = h34i = h38i. The generators of h4i correspond to the solutions of gcd(x, 40) = 4: h4i = h12i = h28i = h36i. The generators of h8i correspond to the solutions of gcd(x, 40) = 8: h8i = h16i = h24i = h32i. The generators of h5i correspond to the solutions of gcd(x, 40) = 5: h5i = h15i = h25i = h35i. The generators of h10i correspond to the solutions of gcd(x, 40) = 10: h10i = h30i. The ideals h20i and h0i obviously have a unique generator. Assume now that n|m, and that B is a subring of Zm with n elements. We have the following natural questions: • Is the additive group B isomorphic to the group Zn ? • Is the ring B always isomorphic to the ring Zn ? 134 CHAPTER 3. OTHER EXAMPLES OF RINGS The first of these questions is very easy to answer: Proposition 3.1.23. If n|m and B is a subring of Zm with n elements, then the additive groups B and Zn are isomorphic. Proof. Let m = dn, so that B = hdi. We define φ : Zn → Zm by setting φ(x) = dx, where here x ∈ Zn , and dx ∈ Zm .2 Notice that one must show that φ is a well defined function: one must show that the right side of φ(x) = dx does not depend on the choice of the integer x representing the class x on the left side. For that it is enough to observe that bx = x′ in Zn ⇐⇒ n|(x − x′ ) ′ ⇐= =⇒ m = dn|(dx − dx ) =⇒ dx = dx′ in Zm . Now it is immediate to check that φ is a group homomorphism and also that φ(Zn ) = {dx : x ∈ Z} = hdi = B. It remains to show that φ is an isomorphism, i.e.that φ is injective. For that we need only to compute its kernel: φ(x) = 0 ⇐⇒ ⇐⇒ dx = 0 (in Zm ) n|x ⇐⇒ ⇐⇒ dn = m|dx x = 0 (in Zn ). ⇐⇒ Since the kernel of φ is trivial, φ is an isomorphism between B and Zn . We will see later that this result is a general property of the so called cyclic groups. The second question above, concerning ring isomorphisms is not so simple. We illustrate its complexity with a few more examples. Examples 3.1.24. 1. Consider in Z4 the subring B = h2i = {2, 0}. We have just seen that (B, +) ≃ (Z2 , +). However, the rings B and Z2 are not isomorphic, since the product in B is always zero: x, y ∈ B ⇒ xy = 0. 2. Consider in Z6 the subrings B = h2i = {2, 4, 0}, and C = h3i = {3, 0}. Again, we have (B, +) ≃ (Z3 , +) and (C, +) ≃ (Z2 , +), in this case, these are also ring isomorphisms, although this is not obvious. The following proposition gives a complete answer to this question: 2 We could also wrote, maybe more precisely, that φ(πn (x)) = πm (dx). 3.1. THE RINGS ZM 135 Proposition 3.1.25. If m = nd and B = hdi is the subring of Zm with n elements, then the following statements are equivalent: (i) The rings B and Zn are isomorphic, (ii) The ring B is unitary, (iii) gcd(n, d) = 1. In this case, the identity of B is the unique x ∈ Zm such that x ≡ 0 (mod d) , and x≡1 (mod n). Example 3.1.26. The ring Z36 has 9 subrings, because 36 has 9 divisors. With the exception of the trivial subrings h0i = {0} and h1i = Z36 , only the subrings B = h4i, with 9 elements, and C = h9i, with 4 elements, have identity. The solution of the system x ≡ 0 (mod 4) and x ≡ 1 (mod 9) is x ≡ 28 (mod 36), and hence the identity of B is x = 28. Similarly, the solution of x ≡ 0 (mod 9) and x ≡ 1 (mod 4) is x ≡ 9 (mod 36), and hence the identity of C is x = 9. Exercises. 1. Prove Theorem 3.1.11. 2. Show that the only subrings of Z4 are h1i, h2i e h0i. 3. Show that, if m > 1, then Zm is either a field or has zero divisors. 4. Show that na = na (in particular, n = n1). 5. Show that, if n > 1, then Mn (Zm ) is a non-abelian ring, with characteristic 2 m, and mn elements. 6. Consider the ring consisting of all maps f : Z → Zm and show that for any m > 1 there exist infinite rings with characteristic m. 7. Show that A ∈ Mn (Zm ) is invertible if and only if det(A) ∈ Z∗m 3 . 3 The determinant of a matrix A = (aij ) of size n × n with entries in a commutative ring is defined as usual by: X det A = sgn(π)a1π(1) · · · anπ(n) . π∈Sn 136 CHAPTER 3. OTHER EXAMPLES OF RINGS 8. Give an example of a finite vector space over a finite field (compare Rn with Znp ). 9. Determine all matrices in GL(2, Z2 ) (2 × 2 invertible matrices with entries in Z2 ). 10. What is the cardinal of GL(n, Zp ), if p is prime? (Hint: Note that the rows of the matrix M ∈ GL(n, Zp ), that are vectors of Znp , must be linearly independent.) 11. Find the inverse of the matrix   1 0 0  2 3 4  ∈ GL(3, Z5 ). 3 2 4 12. Solve the system in Z5 . x + 2y −3x + 3y = a = b 13. Show that hai = Zm if and only if a ∈ Z∗m . 14. Complete the proofs of Proposition 3.1.20 and of Corollary 3.1.21. 15. Find all the elements of the ideal h85i ⊂ Z204 . Which ones are generators of this ideal? 16. How many elements does the ideal h28, 52i ⊂ Z204 have? 17. Determine all the ideals of Z30 . Which of these ideals are unitary rings, and what are the respective identities? 18. If p is a prime number, and n ∈ N, show that ϕ(pn ) = pn − pn−1 . Hint: Show that x ∈ Z∗pn ⇐⇒ x 6∈ hpi. 19. Find ϕ(3000). Hint: Show that x ∈ Z∗3000 ⇐⇒ x 6∈ (h2i ∪ h3i ∪ h5i). 20. Assume that d = gcd(a, m), m = dn, and φ : Zm → Zm is given by φ(x) = ax. (a) Show that φ is a homomorphism of groups. (b) Show that the kernel of φ is hni, and φ(Zm ) has n elements. 3.2. FRACTIONS AND RATIONAL NUMBERS 137 (c) Assuming that m = 12, for which a is φ a group automorphism? (d) Assuming that m = 12, for which a is φ a ring homomorphism? 21. Assuming that n|m, show that the function φ : Zm → Zn given by φ(x) = x, i.e., φ(πm (x)) = πn (x), with x ∈ Z,is well defined, and that it is a homomorphism of rings. What is its kernel? 22. Prove Proposition 3.1.25 proceeding as follows: (a) Show that “(i) =⇒ (ii)”. (b) Show that “(ii) =⇒ (iii)”, by showing first that if a is the identity of B then hai = hdi, and a2 = a. Conclude that a ≡ 0 (mod d), and a ≡ 1 (mod n). (c) Solve the system a ≡ 0 (mod d), and a ≡ 1 (mod n), and consider the function φ : Zn → Zm given by φ(x) = ax. Show that φ is well defined, is an injective ring homomorphism, and φ(Zn ) = B, therefore completing the proof. 23. Consider maps φ : Z4 → Z36 . (a) Find the group homomorphisms φ. Which ones are injective? (b) List the ring homomorphisms φ. Which ones are injective? 24. Assume that a ∈ Z∗m , and consider Ψ : Z∗m → Zm given by Ψ(x) = a · x. (a) Show that Ψ is injective, and that in fact Ψ(x) ∈ Z∗m , for all x ∈ Z∗m . (b) Writing Z∗m = {x1 , x2 , x3 , . . . , xk }, where k = φ(m) and φ is the Euler Qk Qk function, show that i=1 Ψ(xi ) = i=1 xi , and use this fact to prove the Theorem of Euler : ak = 1. (c) Prove also the following Theorem of Fermat : If m = p is prime, then ap = a. 3.2 Fractions and Rational Numbers Our main aim in this section is to define the field of rational numbers and to show that its properties (often introduced in an axiomatic form) are logical consequences of the axioms of the integers studied in Chapter 2. At the same time, we will see that the method to define the rational numbers from the integer numbers can be applied to any abelian ring where the cancellation law holds. This will allow us to define later other important fields. 138 CHAPTER 3. OTHER EXAMPLES OF RINGS The rational numbers (fractions, roots, etc.) often are informally defined as “expressions of the type m n ”, where m and n are integers, and n 6= 0. From a more formal point of view, we can observe that an ordered pair of integers (m, n) determines a rational number, provided that n 6= 0. On the other hand, we know that distinct ordered pairs can correspond to the same m′ rational number: we can have (m, n) 6= (m′ , n′ ) and m n = n′ , This happens precisely when mn′ = m′ n. These remarks suggest that instead of defining the rational numbers as ordered pairs of integers we can define them as equivalence classes of ordered pairs of integers. As we will see, the fact that this works does not depend on specific properties of the integers, but only on the fact that Z is an abelian ring with more that one element, where the cancellation law holds. Therefore, we will formalize this construction in this more general abstract setting. In what follows, A 6= {0} denotes an abelian ring where the cancellation law for the product is valid (i.e., without zero divisors). Definition 3.2.1. Let B = {(a, b) : a, b ∈ A, and b 6= 0}. We denote by “≃” the binary relation in B defined by (a, b) ≃ (a′ , b′ ) ⇐⇒ ab′ = a′ b. The relation just defined is, of course, suggested by the equality of fractions that we mentioned before. Its usefulness depends on being an equivalence relation, which we now check. Lemma 3.2.2. The relation ≃ is an equivalence relation. Proof. Obviously, the relation “≃” is reflexive and symmetric. To check that is also transitivity, assume that (a, b) ≃ (a′ , b′ ), and that (a′ , b′ ) ≃ (a′′ , b′′ ). Using the commutativity and associativity of the product, we find: (a′ , b′ ) ≃ (a′′ , b′′ ) (a, b) ≃ (a′ , b′ ) ⇐⇒ ⇐⇒ a′ b′′ = a′′ b′ ab′ = a′ b =⇒ =⇒ a′ bb′′ = a′′ bb′ , a′ bb′′ = ab′′ b′ . We conclude that a′′ bb′ = ab′′ b′ and, hence, that a′′ b = ab′′ , e (a, b) ≃ (a′′ , b′′ ), since b′ 6= 0 and A has no zero divisors. An equivalence relation “≃” in B determines a partition of B into equivalence classes. If (a, b) ∈ B, we call the corresponding equivalence class (a, b) a fraction of A and denoted it by “a/b”, or by “ ab ”. More formally: 3.2. FRACTIONS AND RATIONAL NUMBERS 139 Definition 3.2.3. Let A 6= {0} be an abelian ring where teh cancellation law holds. (i) If a, b ∈ A and b 6= 0, the fraction a b is defined by a = {(a′ , b′ ) ∈ A × A : b′ 6= 0 and ab′ = a′ b}; b (ii) The set of all fractions a b is denoted by Frac(A); (iii) When A = Z, a fraction ab is called a rational number, and the set Frac(Z) is denoted by Q. When A = Z, the set of rational numbers Frac(Z) = Q is a ring. For the general case, we define on the set of fractions of A algebraic operations of addition and product by copying simply the usual operations with rational numbers. These are given by: (3.2.1) (3.2.2) ad + bc a c + = , b d bd ac ac = . bd bd In order to formalize these definitions, which represent binary operations on equivalence classes, we must check that each of this operations is independent of the choice of representative for each class. We state this in the following lemma, whose proof is left as an exercise: Lemma 3.2.4. If (a, b) ≃ (a′ , b′ ), and (c, d) ≃ (c′ , d′ ), then (ad + bc, bd) ≃ (a′ d′ + b′ c′ , b′ d′ ), (ac, bd) ≃ (a′ c′ , b′ d′ ). According to this lemma, (3.2.1) and (3.2.2) do indeed define addition and product operations on the set of fractions. It is obvious that both operations are commutative, and that the product is associative. With a bit more work, one can verify that addition is associative and that the product is distributive relative to addition. The existence of identities for both operations also does not offer any difficulties. Indeed, just like for the rationals, we have: Theorem 3.2.5. The set Frac(A) with the algebraic operations (3.2.1) and (3.2.2) is a field, called the field of fractions of A. 140 CHAPTER 3. OTHER EXAMPLES OF RINGS Proof. Let b 6= 0 be any non-zero element in A. We set 0′ = We leave as an exercise to check : 0′ = {(0, c) : c 6= 0} and 0 b and 1′ = bb . 1′ = {(c, c) : c 6= 0} are, respectively, identities for the addition and the product in Frac(A). Note-se that the elements 0′ and 1′ are independent of the choice of b 6= 0 in A. In particular, and assuming that y 6= 0, we have: x = 0′ ⇐⇒ x = 0, and y x = 1′ ⇐⇒ x = y. y Note also that the existence of an identity for the product of fractions does not depend of the existence of an identity for the product in the original ring A, and observe finally that a (−a) + = 0′ , b b and that if a b 6= 0′ , then a 6= 0, so b a is a fraction, and ab = 1′ . ba When A = Z is the ring of integers and Frac(A) = Q, we are used to consider Z as subring of Q. For an arbitrary ring this “identification” is still possible: the field of fractions of a ring A contains a subring isomorphic to the ring A. Proposition 3.2.6. Let a 6= 0 be a fixed element in the ring A. Then: (i) ι : A → Frac(A), x 7→ ax a is a ring isomorphism between A and ι(A); (ii) the isomorphism ι is independent of the choice of element a ∈ A − {0}. Proof. The identities ι(x + y) = ι(x) + ι(y) and ι(xy) = ι(x)ι(y) are easily verified, so ι is a homomorphism of rings and ι(A) is a subring of Frac(A). Moreover, ι is an isomorphism between A and ι(A) since: ι(x) = 0′ ⇐⇒ ⇐⇒ ⇐⇒ ax = 0′ a ax = 0 x = 0. 3.2. FRACTIONS AND RATIONAL NUMBERS 141 ′ ax On the other hand, if a, a′ 6= 0 and x ∈ A, then ax a = a′ , because (ax)a′ = (a′ x)a. Hence, ι is independent of the choice of a ∈ A − {0}. According to this result, the elements of Frac(A) of the form ax a , with a, x ∈ A and a 6= 0 fixed, are “copies” of the elements x ∈ A. For this reason, whenever one works with elements of the field Frac(A), the fraction ax a is denoted by “x”, and one says that x is an element of the ring A. We also write Frac(A) ⊃ A, so we consider Frac(A) as an extension of the ring A. This abuse of language is fully justified by the proposition above, and allows for a much lighter notation. For the same reason, we will use the same symbol 0 to denote the zeros of A and Frac(A). Similarly, when A as identity 1 we use the same symbol to denote the identity of Frac(A). A similar issue arises when one considers “fractions of fractions”. When dealing with rational numbers, we have the equality of fractions ad a/b = . c/d bc However, according to our formal definition these two fractions belong to two different rings: the fraction ad bc is an element of Q, while the fraction a/b c/d is an element of Frac(Q). Hence, this equality cannot hold literally. The sense in which it holds is explained in the following proposition: Proposition 3.2.7. If K is a field, the map ι : K → Frac(K) defined in Proposition 3.2.6 is surjective, and hence is a ring isomorphism. Proof. The proof reduces to the observation that xy = ι(xy −1 ), which according with the conventions above is written in the form “ xy = xy −1 ”. In the case of the composite fraction above (x = ab and y = dc ), strictly speaking, we have that: ad a c −1 ad a/b =ι =ι =ι . c/d b d bc bc ad Using the identification of x with ι(x), we find that a/b c/d = bc . In the previous section we saw that the existence of the finite fields Zp follows from the axioms for the integers. We have just seen that the existence of the field Q is another consequence of those axioms. We will see later that these fields are, in some sense, the smallest fields: we will show that any field of characteristic p contains necessarily a subfield isomorphic to Zp (if p 6= 0) or isomorphic to Q (if p = 0). 142 CHAPTER 3. OTHER EXAMPLES OF RINGS Exercises. 1. Prove Lemma 3.2.4. 2. Complete the proof of Theorem 3.2.5 3. What is the field of fractions of the ring of Gaussian integers? 4. If Frac(A) is isomorphic to Frac(B), is it true that A is isomorphic to B? If A is isomorphic to B, is it true that Frac(A) is isomorphic to Frac(B)? 5. Show that, if a field K is an extension of a ring A, then K contains a subfield isomorphic to Frac(A) (this result, generalizes Proposition 3.2.6, that Frac(A) is the smallest field that contains A). (Hint: If K is a field, the intersection of all subfields of K is the smallest subfield of K.) 6. Show that, if A is countable, then Frac(A) is countable. 7. Assume that A is an integral domain. Show that then the characteristics of Frac(A) and A are equal. 8. Show that Q can only be ordered so that: m n > 0 ⇔ mn > 0. (Hint: In an ordered field one always has x > 0 ⇔ x−1 > 0.) 9. Show that, if A is an ordered ring, then Frac(A) is also an ordered field. 3.3 Polynomials and Power Series The polynomials with real coefficients are often defined as the maps p : R → R of the form N X p n xn , p(x) = n=1 where the real numbers pn are called the coefficients of the polynomial p. However, one cannot define in this way polynomials with coefficients in a arbitrary ring A if we want to keep the property that polynomials with distinct coefficients correspond to distinct polynomials. In fact, if A has more than one element, there is an infinite number of choices for the coefficients of polynomials. However, if A is finite, there are only a finite number of maps f : A → A, so these cannot be used to define all the polynomials with coefficients in A. 3.3. POLYNOMIALS AND POWER SERIES 143 Example 3.3.1. The set underlying the ring Z2 is {0, 1}, so has two elements. Therefore the set of maps f : Z2 → Z2 has 4 elements. On the other hand, if polynomials with distinct coefficients correspond to distinct polynomials, there exist an infinite number of polynomials with coefficients in Z2 . This problem is solved identifying a polynomial with the sequence of its coefficients. We will come back later to the issue of defining an associated map. In fact, we will consider polynomials (which have a finite number of non-zero coefficients) as a special case of a “power series”. The last ones will be defined without any reference to convergence issues, so they will be “formal” power series. In what follows, we denote by A an abelian ring with identity 1. Definition 3.3.2. A (formal) power series in A is a map s : N0 → A. A power series is called a polynomial if and only if there exists a N ∈ N0 such that s(n) = 0 for all n > N . Examples 3.3.3. 1. The following polynomials with coefficients in A have special roles: • 0 = (0, 0, 0, . . . ), called the zero polynomial; • 1 = (1, 0, 0, . . . ), called the identity polynomial; • x = (0, 1, 0, . . . ), which we will call the indeterminate x. 2. The polynomial a = (a, 0, 0, . . . ), where a ∈ A, is called a constant polynomial. In particular, the zero and identity polynomials are constant polynomials. 3. The set of all power series in Z2 is infinite and non-contable (it should be obvious that it is a set in bijection with P(N)), and the set of all polynomials in Z2 is infinite but countable. The terms s(0), s(1), s(2), . . . of a formal power series s : N0 → A are called the coefficients of the series In order to avoid confusion with the values of a function associated with the power series, we will always denotes these coefficients by s0 , s1 , s2 , etc. We will denote by the symbols A[[x]] and A[x], respectively, the sets of power series and polynomials, respectively, 144 CHAPTER 3. OTHER EXAMPLES OF RINGS with coefficients in A. Obviously, A[x] ⊂ A[[x]]4 . The addition and product of polynomials with real coefficients is certainly familiar to us. For example, if we consider polynomials of degree 2, we have: (a0 + a1 x + a2 x2 ) + (b0 + b1 x + b2 x2 ) = (a0 + b0 ) + (a1 + b1 )x + (a2 + b2 )x2 (a0 + a1 x + a2 x2 )(b0 + b1 x + b2 x2 ), = (a0 b0 ) + (a0 b1 + a1 b0 )x +(a0 b2 + a1 b1 +a2 b0 )x2 + (a1 b2 + a2 b1 )x3 + (a2 b2 )x4 . Our next definition amounts to recognize that the the operations over the coefficients of the polynomials, on the right hand side of the previous identities, make sense in any ring. Notice that the resulting addition operation is just the usual addition of sequences, while the product is distinct. When there is some risk of ambiguity, we will call the product defined below the convolution product , and denote it by s ⋆ t instead of st. Definition 3.3.4. Let s, t : N0 → A be power series, the sum s + t and the (convolution) product s ⋆ t are the power series: (3.3.1) (3.3.2) (s + t)n = sn + tn , and, n X sk tn−k . (s ⋆ t)n = k=0 Examples 3.3.5. 1. If a = (a, 0, 0, . . . ) and b = (b, 0, 0, . . . ) are constant polynomials, their sum and product are given by a + b = (a + b, 0, 0, . . . ) and a ⋆ b = ab = (ab, 0, 0, . . . ). Hence, the set of constant polynomials is a ring isomorphic to the ring A. 2. If a = (a, 0, 0, . . . ) is constant and s = (s0 , s1 , s2 , . . . ) is any power series, the Pn product a ⋆ s is the power series (as0 , as1 , as2 , . . . ) = as, since the sum k=0 ak sn−k reduces the term with k = 0. . . ) with coefficients 3. Consider the power series s = (1, 1, 1, .P Pn in Z2 . In order n to compute its square, note that (ss)n = k=0 sk sn−k = k=0 1 = n + 1. We conclude that (1, 1, 1, . . . )(1, 1, 1, . . . ) = (1, 0, 1, 0, . . . ). 4 Note that the use of the letter “x” in the symbols A[x] and A[[x]] is caused by the choice of this letter to denote the indeterminate (0, 1, 0, . . . ). In some situations we may denote this indeterminate by some other letter, e.g., y, in which case we will use the notation A[y] or A[[y]]. 3.3. POLYNOMIALS AND POWER SERIES 145 4. If s = (s0 , s1 , s2 , . . . ) is any power series, the product xs is the power series obtained from s pby translation of all its coefficients to the right: (xs)0 = x0 s0 = 0, (xs)n+1 = n+1 X xk sn+1−k = sn . k=0 When s = x, we conclude that x2 = (0, 0, 1, 0, . . . ), x3 = (0, 0, 0, 1, 0, . . . ), etc. We extend this to the case n = 0 by stipulating by convention that x0 = (1, 0, 0, . . . ) = 1. The next result has a straightforward proof, and so it is left as an exercise. Theorem 3.3.6. If A is an abelian ring with identity, then both A[[x]] and A[x]5 are abelian rings with identity for the sum and product defined by (3.3.1) and (3.3.2)), and A[x] is a subring of A[[x]]. We have already observed that the constant polynomials form a ring isomorphic to A. Therefore, this justifies using the same symbol a to denote an element of the ring A, and the corresponding constant polynomial (a, 0, 0, . . . ). We also say that A[x] and A[[x]] are extensions of A. Notice that the expression axn is then representing the product of the constant polynomial (a, 0, 0, . . . ) by the n-th power of the indeterminate x, i.e., xn . This polynomial, according to what we saw above, has only one non-zero coefficient namely (axn )n = a (note that here, a has two distinct meanings!) We conclude that if p = (p0 , p1 , p2 , . . . , pN , 0, . . . ), then p = p0 + p1 x + · · · + pN x N = N X p n xn . n=0 The expression on the right hand side is called the canonical form of the polynomial p. As usual, the coefficient is omitted whenever it equals 1. Example 3.3.7. Assume that p, q ∈ Z4 [x] are given by p = 1 + x + 2x2 and q = 1 + 2x2 . In order to add and multiply these polynomials, we proceed exactly as we do 5 The ring of power series A[[x]] and the ring of sequences in A have the underlying set, and the same addition operation, differing only on the multiplication operation. Whenever is necessary to distinguish the power series s in A[[x]] from the corresponding sequence a in S, one often says that s is the z-transform of a. 146 CHAPTER 3. OTHER EXAMPLES OF RINGS for polynomials with real coefficients, since the usual procedure only uses the algebraic properties common to any ring. One can easily recognize these apart from the specific properties of the ring Z4 in the following computations: (1 + x + 2x2 ) + (1 + 2x2 ) = (1 + 1) + (1 + 0)x + (2 + 2)x2 = 2 + x, 2 2 (1 + x + 2x )(1 + 2x ) = (1 + x + 2x2 )1 + (1 + x + 2x2 )2x2 = (1 + x + 2x2 ) + (2x2 + 2x3 ) = 1 + x + 2x3 . P Sometimes it is possible to give a meaning to “infinite sums” ∞ n=0 sn , where each sn is a power series. If snk denotes the coefficient k P of the power series sn , the simplest meaning one can give to the “sum” t = ∞ n=0 sn is t= ∞ X n=0 sn ⇐⇒ tk = ∞ X n=0 snk , for all k ∈ N0 . We will always use this definition whenever the sequence snk is eventually zero for all k ≥ 0, interpreting the “infinite sum” on the right and the (finite) sum of all the non-zero terms. In particular, we will write for any power series s: ∞ X s= s n xn . n=0 Definition 3.3.8. If p 6= 0 is a polynomial, the degree of p is the integer deg p defined by deg p = max{n ∈ N0 : pn 6= 0}. When p = 0 we convention that deg p = −∞. Example 3.3.9. Obviously, deg xn = n, for any n ≥ 0. The example above with the product of polynomials in Z4 [x] shows that it is not always true that the degree of the product of two polynomials is the sum of of the degrees of the factors. The next result clarifies this issue, which is related with the presence of zero divisors. In order to avoid dealing separately with the zero polynomial, we convention that deg p + deg q = −∞, whenever p = 0 or q = 0. 3.3. POLYNOMIALS AND POWER SERIES 147 Proposition 3.3.10. Let p, q ∈ A[x]. Then: (i) deg(p + q) ≤ max{deg p, deg q}, and deg(pq) ≤ deg p + deg q; (ii) If A is an integral domain, then deg(pq) = deg p + deg q, and A[x] and A[[x]] are also integral domains. We leave the proof as an exercise. Notice that, according to (ii), when A is an integral domain we can form the field of fractions of A[x] (i.e., Frac(A[x]) in the notation of the previous section). This field of fractions if the formal abstract analogue of the usual field of rational functions. Definition 3.3.11. If A is an integral domain, A(x) (respectively, A((x))) denotes the field of fractions de A[x] (respectively, A[[x]]). Example 3.3.12. When A = Z, the field Z(x) consist of fractions whose numerators and denominators are polynomials with integer coefficients . In this field, we have x2 − 1 = x − 1. x+1 2 −1 and g(x) = x − 1 are not the Notice, however, that the functions f (x) = xx+1 same, since their domains of definition are distinct. If φ : A → B is a ring homomorphism, then the map Φ : A[x] → B[x] defined by ! N N X X φ(pn )xn p n xn = Φ n=0 n=0 is also a ring homomorphism. Frequently, for p ∈ A[x], we will denote the polynomial q = Φ(p) by pφ (x). Sometimes (example, if A ⊂ B is a subring and φ : A → B is the inclusion) we may use the same letter to denote this two polynomials, if it is clear from the context to which ring the coefficients belong to. The next examples shows that this is a reasonable practice, which we are already used to! Example 3.3.13. Let p = 1 + 2x + 3x2 ∈ Z[x] be a polynomial with integer coefficients. Obviously, Z ⊂ Q and we can consider the canonical inclusion ι : Z → Q = Frac(Z) and we have pι = 1 + 2x + 3x2 ∈ Q[x]. Notice that the symbols “1”, “2” 148 CHAPTER 3. OTHER EXAMPLES OF RINGS and “3” denote now rational numbers, rather than integers. This is a reasonable abuse of notation, as we saw in our treatment of the field of fractions: the fraction ax a is denoted by x. Therefore, it is natural to represent this new polynomial by the same letter as the original polynomial. Finally, observe that if φ : A → B is an injective homomorphism of rings and p ∈ A[x], then p and pφ have the same degree. Exercises. 1. Compute the following product in Zm [x], when m = 2, 3 and 6: (1 + x + 2x2 )(2 + 3x + 2x2 ). 2. Proof Theorem 3.3.6. 3. Show that the ideal h2, xi in Z[x] is not a principal ideal. 4. Give a proof of Proposition 3.3.10. 5. What are the invertible elements of A[x], when A is an integral domain? 6. Assume that A is a commutative ring with identity. Show that the characteristic of A[x] and the characteristic of A are equal. 7. The polynomials in several variables can be defined in several ways. For example, in the case of two variables, we can consider: (i) The ring A[x] and the polynomials with coefficients in A[x], which we denote by A[x][y] (in this case we denote the new indeterminate by y); (ii) The maps s : N0 × N0 → A, with an appropriate definition of sum and convolution product, and indeterminates x and y, giving a ring which is denoted by A[x, y]. Both ways of defining polynomials in several variables are useful and the following exercises show that they are equivalent. (a) Give a detailed description of the definition sketched in (ii). (b) Show that the two definitions are equivalent, i.e., they lead to isomorphic rings. (c) How should one generalize these definitions for polynomials in several variables x1 , . . . , xn ? (d) How should one generalize these definitions for polynomials in the variables xα , with α ∈ I, where I can be an infinite set? 3.3. POLYNOMIALS AND POWER SERIES 149 8. Show that the following statements are equivalent: (a) (b) (c) (d) A is an integral domain; A[x] is an integral domain; A[[x]] is an integral domain; For any p, q ∈ A[x], deg(pq) = deg p + deg q. 9. Show that there exist rings with characteristic p, a prime, which are not fields, and fields with characteristic p which are infinite (both countable and uncountable). PN 10. If p ∈ A[x] is a polynomial p = n=0 pn xn , its (formal) derivative p′ is the polynomial X p′ = npn xn−1 . n=1 ′ Is it true that p = 0 implies that p is a constant polynomial? 11. Use the previous exercise and the homomorphism φ : Z → Zp defined by φ(n) = n to deduce the following generalization of Euclides Lemma: if p ∈ N is a prime, a, b ∈ Z[x] and p|ab, then either p|a or p|b. 12. Let D be an integral domain and K = Frac(D) its field of fractions. Show that D(x) is isomorphic to K(x). 13. Define a ring, denoted A[1/x, x]], with underlying set consists of all power P n 6 series of the form ∞ n=k sn x , where k ∈ Z is arbitrary . Show that if the coefficients belong to a field K, then the ring K[1/x, x]] is a field isomorphic to K((x)) = Frac(K[[x]]). 14. Show that in K[[x]] one has ∞ X 1 = xn . 1 − x n=0 (Notice the abuse of notation, which is justified by the previous exercise!) 15. Determine the power series inverse to (a− x) and to (a− x)(b − x) in K[[x]]. 16. Show that in Z2 [[x]], one has 6 ∞ X 1 = x2n . (1 − x)2 n=0 These series are usually called Laurent series. The corresponding ring, denoted A[1/x, x]], is called the ring of Laurent series with coefficients in A. The special case where A = C plays a crucial role in Complex Analysis and in Algebraic Geometry. 150 CHAPTER 3. OTHER EXAMPLES OF RINGS 17. One can solve problems like the one of the Fibonacci sequence in Section 2.3 by formal power series computations. Recalling that this sequence is defined recursively by: an+2 = an+1 + an , (n ≥ 0), one notes that is equivalent to ∞ X an+2 xn = n=0 and also: ∞ X If we let s = an+2 xn+2 = x ∞ X n=0 n=0 an+1 xn + n=0 n=0 P∞ ∞ X ∞ X ∞ X n=0 an+1 xn+1 + x2 n=0 n an x n , ∞ X an x n . n=0 an x , then an+2 x n+2 = s − a0 − a1 x, ∞ X n=0 an+1 xn+1 = s − a0 , and we conclude that the recurrence relation above is equivalent to: s − a0 − a1 x = x(s − a0 ) + x2 s. Solving this equation for s, we obtain: s= −a0 + (a0 − a1 )x . x2 + x − 1 If α and β are the roots of x2 + x − 1, we can factorize this rational fraction as: −a0 + (a0 − a1 )x A B = + , x2 + x − 1 α−x β−x The exercise above now allows to compute explicitly the coefficients of s, i.e., the terms of the Fibonacci sequence. Verify all these statements and compute the coefficients of the Fibonacci sequence. 3.4 Polynomial Functions We have argued in the previous sections that one should not define polynomials with coefficients in A as functions of some sort with domain and values in A. However, there is nothing forbidding us to define functions A → A from polynomials in A[x]. 3.4. POLYNOMIAL FUNCTIONS 151 P n Definition 3.4.1. If p = N n=0 pn xPis a polynomial in A[x], the function N n p∗ : A → A defined by p∗ (a) = n=0 pn a is called the polynomial function associated with p. Example 3.4.2. Let A = Z2 and p = 1 + x + x2 (7 ). The polynomial function associated to the polynomial p is p∗ : Z2 → Z2 given by p∗ (a) = 1 + a + a2 , for any a ∈ Z2 . In this case, we have p∗ (0) = p∗ (1) = 1, and hence p∗ is a constant function, although p is not a constant polynomial. In particular, when q = 1, we have p 6= q and p∗ = q ∗ . Given a ring A we will denote by AA the set of all functions f : A → A. In Chapter ?? we already observed that AA is a ring, with the usual addition and multiplication of functions. We have a map Ψ : A[x] → AA , p 7→ p∗ , which associates to each polynomial p ∈ A[x] the corresponding function p∗ : A → A. Proposition 3.4.3. Ψ : A[x] → AA is a ring homomorphism. PN P n n Proof. Let p, q ∈ A[x]. We can write p = N n=0 qn x , n=0 pn x and q = by eventually letting some of the coefficients be zero. If a ∈ A, we need to show that the following identities hold: (p + q)∗ (a) = p∗ (a) + q ∗ (a), e (pq)∗ (a) = p∗ (a)q ∗ (a). A more general version of the first identity was already proved in Chapter ??. The second identity can also be proved easily by induction. Examples 3.4.4. 1. Given a polynomial p ∈ A[x], the associated function p∗ is defined not just in the ring A, but also in an extension of A. For example, if p = 1 + 2x + 3x2 ∈ Z[x], the associated polynomial function p∗ : Z → Z is of course p∗ (n) = 1 + 2n + 3n2 , for any n ∈ Z. Since Z ⊂ Q we can also consider q ∗ : Q → Q, given also by q ∗ (r) = 1 + 2r + 3r2 , for any r ∈ Q. 7 From now, we will not distinguish between the integer n and the class n. It should be clear from the context if we are referring to an integer (element of Z) or an equivalence classe in some Zm . 152 CHAPTER 3. OTHER EXAMPLES OF RINGS 2. More formally, if the ring B is an extension of the ring A, so that there exists an injective homomorphism φ : A → B, then φ(A) is a subring of B P n isomorphic to A. Hence, if p = N n=0 pn x is a polynomial in A[x], then p determines a polynomial function q ∗ : B → B, namely: q ∗ (b) = N X φ(pn )bn . n=0 Of course, p also determines a polynomial q ∈ B[x] which was denoted pφ in the previous section, so in fact we have: q ∗ = (pφ )∗ . As we have observed before, it is a matter of common sense to decide to distinguish between p and q, to write different symbols for the respective functions p∗ and q ∗ , or to use different notations for their coefficients pn and φ(pn ). 3. Continuing with Example 1, recall that M2 (Z) denotes the ring of 2 × 2 matrices with integer entries. The matrices of the form nI, where I denotes the identity matrix and n ∈ Z, is isomorphic to Z: indeed we have an injective homomorphism φ : Z → M2 (Z) given by φ(n) = nI. If p = 1 + 2x+ 3x2 ∈ Z[x] we have that (pφ )∗ : M2 (Z) → M2 (Z) is given by (pφ )∗ (C) = φ(1) + φ(2)C + φ(3)C 2 , where C denotes now any matrix in M2 (Z). Obviously, we can simplify the notation and write the expression φ(1) + φ(2)C + φ(3)C 2 as I + 2C + 3C 2 . This is formally the same expression as in Example 1, with the small subtlety that the symbol “1” is replaced by the symbol “I”, representing the identity matrix. In the first example, the identification of Z with its image in Q is quite natural. In the last example, one does not need to substitute the coefficients of degree > 0 (because the product of a matrix by the integer n is the same as the product by the matrix nI), but we must replace the coefficient of degree zero, because the same of matrix with an integer is not a priori defined. We can use this circle of ideas to formalize the notion of root of a polynomial: Definition 3.4.5. Let p ∈ A[x] and let B be an extension of A. We say that b ∈ B is a root of p if p∗ (b) = 0. When we write p∗ (a) we usually think of p∗ as fixed and a as a variable (the “independent variable” of the function p∗ ). However, we can also consider p∗ as an “independent variable”. Still assuming that A is any abelian ring with identity and that B is an extension of A, we make the following definition (recall the remarks above about identifications!): 3.4. POLYNOMIAL FUNCTIONS 153 Definition 3.4.6. The map Val : A[x] × B → B is defined by Val(p, b) := p∗ (b). Val(p, b) is called the evaluation of the polynomial p at b. The function Val has two independent variables: the polynomial p and the element b. If we fix the polynomial p, then Val is just the function p∗ : B → B associated to the polynomial p. We can also fix the element b, resulting in an evaluation map ψ : A[x] → B, p 7→ p∗ (b). Examples 3.4.7. 1. Let A = Z, B = C and b = i. Then ψ : Z[x] → C is given by ψ(p) = p∗ (i). In order to find out the image ψ(Z[x]) notice that the division of any polynomial p ∈ Z[x] by 1 + x2 , results in a reminder of degree less than 2. In other words, p = (1 + x2 )q + (a0 + a1 x), where q ∈ Z[x] and a0 , a1 ∈ Z. Hence, p∗ (i) = a0 + a1 i so ψ(Z[x]) consists of all Gaussian integers. √ √ 2. Let A = Q, B = R and b = 2. Then ψ(p) = p∗ ( 2) Since we can always 2 write √ + (a0 + a1 x), where q ∈ Q[x] and a0 , a1 ∈ Q, we have that √ p = (2 − x )q 2, so we conclude that ψ(Q[x]) consists of all real numbers p∗ ( 2) = a0 + a1 √ of the form a0 + a1 2 with a0 , a1 ∈ Q. 3. The ring A[x] is always an extension of the ring A. Hence, if we let A be any ring, B = A[x] and b = x, it is obvious that ψ(p) = p∗ (x) = p. In other words, ψ : A[x] → A[x] is the identity function. For this reason, we can denote the polynomial p by the symbol p∗ (x), which usually simplified to p(x). Motivated by the last example, whenever B is an extension of A, b ∈ B, and ψ : A[x] → B is the evaluation map ψ(p) = p∗ (b), we will denote the image ψ(A[x]) by the symbol A[b]. This is the reason why we used the symbol Z[i] for the Gaussian integers. In the case of Example 3.4.7.2, we have √ √ √ Q[ 2] = {p∗ ( 2) : p ∈ Q} = {a0 + a1 2 : a0 , a1 ∈ Q}. The previous examples suggest that A[b] is always a subring of B. It is easy to check that this always holds: Proposition 3.4.8. For any fixed b ∈ B, the evaluation map ψ : A[x] → B, p 7→ p∗ (b), is a ring homomorphism. Hence, A[b] is a subring of B. Proof. Proceed as in the proof of Proposition 3.4.3. 154 CHAPTER 3. OTHER EXAMPLES OF RINGS Also, in analogy with Definition 3.3.11, whenever A[b] is an integral domain, we will denote its field of fractions by A(b). Notice that A[b] always contains the values of the constant polynomials. These form a subring of A[b] isomorphic to A. Hence, A[b] is always an extension√ of A: the ring of Gaussian integers contains a subring isomorphic to Z, Q[ 2] contains a subring isomorphic to Q, etc.). On the other hand, in all Examples 3.4.7, only for the last one is the evaluation map ψ injective. Therefore, only for the last of these examples, we have A[b] isomorphic to A[x]. Of course, ψ is injective if and only if its kernel N (ψ) consist of the zero polynomial zero. Since: N (ψ) = {p ∈ A[x] : p∗ (b) = 0}, we conclude that ψ is injective if and only if b is not a root of any non-zero polynomial with coefficients in A. Definition 3.4.9. Let B be an extension of A and b ∈ B. We say that b is algabraic over A if there exists a non-zero polynomial p ∈ A[x] such that p∗ (b) = 0. Otherwise, we say that b is transcendental over A . Example 3.4.10. √ As we saw above, i ∈ C is algebraic over Z, and 2 ∈ R is algebraic over Q (and also over Z). The polynomial x ∈ A[x] is always transcendental over A. In general, an extension B of A may (i) contain both algebraic elements and transcendental elements over A, or (ii) may contain only algebraic elements over A. We distinguish these two possibilities: Definition 3.4.11. We call B an algebraic extension of A if all its elements are algebraic over A. Otherwise, B is called a transcendental extension of A. Examples 3.4.12. 1. The ring of Gaussian integers is an algebraic extension of Z. In fact, any Gaussian integer m + ni is a root of the non-zero polynomial with integer coefficients (x − m)2 + n2 = x2 − 2mx + m2 + n2 . √ √ 2. Q[ 2] is an algebraic extension of Q, since a + b 2 is a root of the non-zero polynomial with rational coefficients (x − a)2 − 2b2 . 3. Q is an algebraic extension of Z, because m n is a root of nx − m ∈ Z[x]. 4. C is an algebraic extension of R, because a+bi is a root of (x−a)2 +b2 ∈ R[x]. 3.4. POLYNOMIAL FUNCTIONS 155 5. A[x] is a transcendental extension of A. 6. We will show in the next chapter that R is a transcendental extension of Q. We close this section with the following result which expresses the fact that A[x] is the smallest transcendental extension of A. We leave its proof for the exercises. Theorem 3.4.13. Any transcendental extension of A contains a subring isomorphic to A[x]. Exercises. 1. Conclude the proof of Proposition 3.4.3. √ √ 2. Assume that n is irrational. Show that Q[ n] is an algebraic extension of Q, a subfield of R, and a vector space of dimension 2 over Q.8 3. Prove Theorem 3.4.13. 4. Assume that A ⊂ B are integral domains and b ∈ B. Show that: (a) A[b] is the smallest integral domain containing A and b. (b) A(b) is the smallest field containing A and b. 5. Let K be a field and fix n points (ak , bk ) in K × K with ai 6= aj for i 6= j. Show that : (a) There exists a polynomial pi ∈ K[x] such that  se j = i,  1, pi (aj ) =  0, se j = 6 i. (Hint: Modify the polynomial q i = Qn k=1 (x−ak ) (x−ai ) .) (b) There exists a polynomial p(x) ∈ K[x] of degree ≤ n − 1 such that p(ak ) = bk , for all k = 1, . . . , n. The formula defining p is called Lagrange interpolation formula. 6. Let K be a finite field. Show that any function K → K is polynomial. √ The fields Q[ a] where a is not a perfect square are called quadratic fields, and play an important role in Number Theory. 8 156 CHAPTER 3. OTHER EXAMPLES OF RINGS 7. Let K be a subfield of L, and assume that L is a finite dimensional vector space over K. Show that L is an algebraic extension of K. (Hint: if the dimension of L over K is n and a ∈ L then {ak : 0 ≤ k ≤ n} cannot be a linearly independent set.) 8. Let K be a subfield of L and let b ∈ L be algebraic over K. Use the previous exercise to show that K(b) is an algebraic extension of K. 3.5 Division of Polynomials In this and in the next sections we will study in some detail the ring of polynomials A[x]. Underlying this study, playing a fundamental role, there will be the division algorithm for polynomials. Therefore, we need to find sufficient conditions on the ring A for this algorithm to hold. Many of the results that we will obtain are then analogous to the results we have obtained in Chapter ?? for the ring of integers. According to Example 3.4.7.3, we will adopt the following convention: a polynomial p ∈ A[x] is represented by the symbol p(x) and the value of the polynomial p at a ∈ A will now be represented by p(a) instead of p∗ (a). Definition 3.5.1. A polynomial p(x) ∈ A[x] is called monic if pn = 1, where deg p(x) = n and 1 is the identity in the ring A. Example 3.5.2. The polynomial 5 + 3x + 2x2 + x4 ∈ Z[x] is monic. The division algorithm presents obvious issues in rings with zero divisors. On the other hand, if D is a integral domain, then by Proposition 3.3.10(b) D[x] is also an integral domain. In fact, we have: Theorem 3.5.3 (Division Algorithm). Let D be an integral domain. If p(x), d(x) ∈ D[x] and d(x) is monic, there exist unique polynomials q(x) and r(x), with deg r(x) < deg d(x), such that p(x) = q(x)d(x) + r(x). Similarly to the case of the integers, the polynomials q(x) and r(x) are called, respectively, the quotient and the remainder of the division of p(x) by d(x). The case where r(x) = 0 corresponds, of course, to the case where d(x) is a divisor (or factor) of p(x). We recall that in this case we write d(x)|p(x). 3.5. DIVISION OF POLYNOMIALS 157 Proof of Theorem 3.5.3. We will show separately the existence and uniqueness. Existence: Let R = {p(x) − a(x)d(x) : a(x) ∈ D[x]}. There are two cases, depending if 0 belongs to R or not: (a) If 0 ∈ R there exists a0 (x) ∈ D[x] such that p(x) − a0 (x)d(x) = 0. In this case we set q(x) = a0 (x) and r(x) = 0. It follows that: p(x) = q(x)d(x) + r(x), with deg r(x) < deg d(x). (b) If 0 6∈ R the set G = {deg(p(x) − a(x)d(x)) : a(x) ∈ D[x]} ⊂ N0 , formed by the degrees of the polynomials in R must have has a minimum m. Let a0 (x) be a polynomial with deg(p(x) − a0 (x)d(x)) = m. If we set q(x) = a0 (x) and r(x) = p(x) − q(x)d(x), it follows that: p(x) = q(x)d(x) + r(x), and we just need to check that m = deg r(x) < deg d(x). Assume that n = deg d(x) ≤ m. Then we set k = m − n and r ′ (x) = r(x) − rm xk d(x). On the one hand, r ′ (x) = p(x) − (q(x) + rm xk )d(x) ∈ R. On the other hand, since d(x) is monic, we have deg r ′ (x) < m = deg r(x), contradicting the definition of m. Therefore, we must have m < n, i.e, deg r(x) < deg d(x). Uniqueness: Let p(x) = q(x)d(x) + r(x) = q ′ (x)d(x) + r ′ (x), where deg r(x) and deg r ′ (x) are both smaller than deg d(x). We have that q(x) − q ′ (x) d(x) = r ′ (x) − r(x). If q(x) 6= q ′ (x), then which contradicts deg r ′ (x) − r(x) ≥ deg d(x), deg r ′ (x) − r(x) ≤ max{deg r ′ (x), deg r(x)} < deg d(x). Hence we must have q(x) = q ′ (x), which also implies r(x) = r ′ (x). The argument above for the existence part of the proof can be implemented as an algorithm as follows: given a polynomial p(x) = an xn + an−1 xn−1 + · · · + a0 ∈ D[x] of degree n denote by ptop (x) = an xn the highest degree term of p(x). To divide the polynomial p(x) by a polynomial d(x) one iterates as follows: 158 CHAPTER 3. OTHER EXAMPLES OF RINGS • Starting with q(x) = 0 and r(x) = p(x), one replaces at each step q(x) → q(x) + r top (x) , dtop (x) r(x) → r(x) − r top (x) d(x). dtop (x) The iteration ends when deg r(x) < deg d(x). Example 3.5.4. Let D = Z5 . The division of p(x) = 2x4 +4x3 +x2 +2x+3 by d(x) = x2 +x+1 results in a quotient q(x) = 2x2 + 2x + 2, with remainder r(x) = 3x + 1. Assuming that a ∈ D, the polynomial d(x) = (x−a) is monic of degree 1. Hence, any polynomial p(x) ∈ D[x] can be divided by (x−a) and, according to the division algorithm, the remainder is a constant polynomial. The next corollary states that the corresponding element of D is just the value of p(x) at a. We leave the proof as an exercise: Corollary 3.5.5 (Remainder Theorem). If p(x) ∈ D[x] and a ∈ D, the remainder of the division of p(x) by (x − a) is the constant polynomial r(x) = p(a). In particular, (x − a)|p(x) if and only if a is a root of p(x). Examples 3.5.6. 1. Consider the polynomial p(x) = 2x4 + 4x3 + x2 + 2x + 3 in Z5 [x]. Since p(1) = 12 ≡ 2 (mod 5), it follows that the remainder of the division of 2x4 + 4x3 + x2 + 2x + 3 by (x − 1) = (x + 4) is r(x) = 2. 2. If we assume instead that p(x) = 2x4 + 4x3 + x2 + 2x + 3 is a polynomial with coefficients in Z3 , we have that p(1) = 12 ≡ 0 (mod 3), so in this case (x − 1) = (x + 2) is a factor of 2x4 + 4x3 + x2 + 2x + 3. Another consequence of the Division Algorithm (or, more precisely, of Corollary 3.5.5) is the classical result about the maximal number of roots of a non-zero polynomial: Proposition 3.5.7. If p(x) ∈ D[x] and deg p(x) = n ≥ 0, then p(x) has at most n roots in D. Proof. We use induction in the degree of the polynomial p(x). If n = 0, the polynomial p(x) is constant and non-zero. It is then obvious that it has no roots. Assume now that the result hods for an integer n ≥ 0. Let deg p(x) = n+1 and assume that a is a root of p(x) (if p(x) has no roots there is nothing 3.5. DIVISION OF POLYNOMIALS 159 to prove!). According to the Remainder Theorem, p(x) = q(x)(x−a), where of course deg q(x) = n. On the one hand, by the induction assumption, q(x) has at most n roots. On the other hand, if b ∈ D is distinct from a, then p(b) = q(b)(b − a). Since D is an integral domain, we can only have p(b) = 0 if q(b) = 0. In other words, the remaining roots of p(x) must also be roots of q(x). Therefore, p(x) has at most n + 1 roots. When D is an integral domain, the only invertible polynomials q(x) ∈ D[x] are the constant polynomials q(x) = a, with a ∈ D invertible. These polynomials can always be used to obtain trivial factorizations of any polynomial p(x): if a(x)b(x) = 1 then p(x) = a(x)b(x)p(x). For this reason, if q(x)|p(x), one often says that q(x) is a proper factor of p(x) if and only if p(x) = a(x)q(x), where neither a(x) nor q(x) are invertible. Hence, a factorization is non-trivial if and only if it contains at least on proper factor. Our next definition, which should be compared with the definition of prime numbers presented in Chapter ??, says that an irreducible polynomial is a non-invertible polynomial with only trivial factorizations. Notice that the restricting to a non-invertible polynomial is entirely analogous to the exclusion of 1 from the set of prime numbers. Definition 3.5.8. A non-invertible polynomial p(x) ∈ D[x] is said to be irreducible in D[x] when it has no proper factors in D[x]. In other words, if for q1 (x), q2 (x) ∈ D[x], p(x) = q1 (x)q2 (x) =⇒ q1 (x) or q2 (x) is invertible in D[x]. We say that p(x) is reducible in D[x] if it is not irreducible. Examples 3.5.9. 1. p(x) = x − a is always irreducible, no matter what the domain D is. 2. If D = Z, then p(x) = 2x − 3 is irreducible (check this!), but q(x) = 2x + 6 is reducible, because 2x + 6 = 2(x + 3), and both 2 and x + 3 are not invertible in Z[x]. 3. If deg p(x) ≥ 2 and p(x) has at least one root in D, it follows from the Remainder Theorem that p(x) is necessarily reducible in D[x]. 4. If p(x) is monic of degree 2 or 3, then p(x) is reducible in D[x] if and only if it had at least one root in D (why?). 5. It is possible for p(x) to have no roots in D and be reducible in D[x]: for example, x4 + 2x2 + 1 is reducible in Z[x], since x4 + 2x2 + 1 = (x2 + 1)2 , but it has no roots in Z. 160 CHAPTER 3. OTHER EXAMPLES OF RINGS 6. You probably have learned in elementary Algebra that the only irreducible polynomials in R[x] are the polynomials of degree 1 and the quadratic polynomials p(x) = ax2 + bx + c, with negative discriminant ∆ = b2 − 4ac. We will see later that this is a consequence of the Fundamental Theorem of Algebra. 7. The property of a polynomial being irreducible depends heavily on the domain D. For example, the polynomial x2 + 1 is irreducible in R[x], but reducible in C[x] ⊃ R[x]. On the other hand, the polynomial p(x) = 2x + 6 is reducible in Z[x], but irreducible in Q[x] ⊃ Z[x]. In same cases it is possible to describe all the irreducible polynomials in D[x], as in the case D = R. However, there are cases where it is practically impossible to describe them. The next results illustrate how complex this issue is in the case of Z[x] and Q[x]. Given p(x) ∈ Z[x], p(x) = a0 + a1 x + · · · + an xn , we will say that c(p) = gcd(a0 , a1 , . . . , an ) is content of p(x). We say that p(x) is primitive if its coefficients are relatively prime, i.e., if c(p) = 1. Obviously p(x) is primitive if and only if it has no factorizations of type p(x) = kq(x), with k ∈ Z, k 6= ±1 and q(x) ∈ Z[x]. Lemma 3.5.10. If p(x) ∈ Z[x] and p(x) = a(x)b(x), with a(x), b(x) ∈ Q[x], then there exist polynomials a′ (x), b′ (x) ∈ Z[x], and k ∈ Q, such that p(x) = a′ (x)b′ (x), a(x) = ka′ (x) and b(x) = k−1 b′ (x). Proof. Obviously, there are integers n, m such that ã(x) = na(x) ∈ Z[x], b̃(x) = mb(x) ∈ Z[x], and nmp(x) = ã(x)b̃(x). Let q be any prime factor of nm. By the generalization of Euclides Lemma given in Exercise 11, Section 3.3: q|ã(x)b̃(x) =⇒ q|ã(x) or q|b̃(x). Hence, we can divide both sides of the identity nmp(x) = ã(x)b̃(x) by q, still obtaining on the right hand side a product of two polynomials in Z[x]. If we repeat this procedure for all prime factors of nm, we arrive at an identity p(x) = a′ (x)b′ (x), where a′ (x), b′ (x) ∈ Z[x], ã(x) = sa′ (x) and b̃(x) = tb′ (x), with s, t ∈ Z. We conclude that a(x) = ns a′ (x), b(x) = mt b′ (x), and ns mt = 1. Lemma 3.5.11 (Gauss). If p(x) ∈ Z[x] is not constant, then p(x) is irreducible in Z[x] if and only if it is primitive and irreducible in Q[x]. 3.5. DIVISION OF POLYNOMIALS 161 Proof. Assume first that p(x) is reducible and primitive in Z[x]. We claim that p(x) is reducible in Q[x]. Indeed, in this case, p(x) = a(x)b(x), with a(x), b(x) ∈ Z[x]. Note that if any of the polynomials a(x) or b(x) is constant then it must be invertible, i.e., ±1, since p(x) is primitive. We conclude that a(x) and b(x) are not constant, hence not invertible in Q[x]. Therefore, the factorization p(x) = a(x)b(x) is non-trivial in Q[x], so p(x) is reducible in Q[x], as claimed. If p(x) is not primitive, it is obvious that it is reducible in Z[x]. Therefore, it remains to show that if p(x) is reducible in Q[x] it is also reducible in Z[x]. In this case, p(x) = a(x)b(x), where a(x), b(x) ∈ Q[x] are non constant. According to Lemma 3.5.10, there exist polynomials a′ (x), b′ (x) ∈ Z[x] such that p(x) = a′ (x)b′ (x), and a′ (x), b′ (x) are not constant. Hence, p(x) is reducible in Z[x]. The next theorem and the previous lemma allows one to easily obtain examples of irreducible polynomials in Z[x] and Q[x]. Theorem 3.5.12 (Eisenstein’s Criterion). Let a(x) = a0 +a1 x+· · ·+an xn ∈ Z[x] be a polynomial of degree n. If there exists a prime p ∈ Z such that ak ≡ 0 (mod p) for 0 ≤ k < n, an 6≡ 0 (mod p) and a0 6≡ 0 (mod p2 ) then a(x) is irreducible in Q[x]. Proof. Assume that in Q[x] we have a factorization a(x) = b(x)c(x) = (b0 + b1 x + . . . )(c0 + c1 x + . . . ). According to Lemma 3.5.10, we may assume without loss of generality that b(x), c(x) ∈ Z[x]. If b0 ≡ c0 ≡ 0 (mod p), then a0 = b0 c0 ≡ 0 (mod p2 ), which contradicts the assumption that a0 6≡ 0 (mod p2 ). Hence we can assume, e.g., that c0 6≡ 0 (mod p). It is obvious that if p|b(x) then p|a(x), which is impossible because an 6≡ 0 (mod p). We conclude that the set {k ≥ 0 : bk 6≡ 0 (mod p)} is non-empty. Let us denote by m its minimum. Notice that am = m X i=0 bi cm−i ≡ bm c0 6≡ 0 (mod p), so we must have m = n, since an is the only coefficient of a(x) not divisible by p. Therefore, deg b(x) ≥ deg a(x), and since a(x) = b(x)c(x), we have deg b(x) = deg a(x), and c(x) must be constant. This shows that a(x) is irreducible in Q[x]. 162 CHAPTER 3. OTHER EXAMPLES OF RINGS Examples 3.5.13. 1. If p ∈ Z is a prime, q(x) = xn − p is irreducible in Z[x] and in Q[x]. 2. Eisenstein’s Criterion does not apply to the polynomial q(x) = x6 + x3 + 1. However, noting that q(x + 1) = (x + 1)6 + (x + 1)3 + 1 = x6 + 6x5 + 15x4 + 21x3 + 18x2 + 9x + 3, Eisenstein’s Criterion show that q(x + 1) is irreducible in Q[x]. We conclude that q(x) is irreducible in Q[x]. Since q(x) is monic, it is also irreducible in Z[x]. The examples and remarks above lead to the following conclusions: • In any domain, the polynomials x − a are irreducible. • There are domains for which there exist irreducible polynomials of degree as large as we wish. • There are domains for which there are irreducible polynomials only up to same degree larger or equal to 1. It remains to show that there are fields where the only irreducible polynomials take the form x − a. We leave it as an exercise to check that if this is true then the any polynomial of degree > 1 must have a root. For this reason, we introduce the following: Definition 3.5.14. A field K is called algebraically closed if every nonconstant polynomial p(x) ∈ K[x] has at least one root in K. We will not prove the next result, which will only be used in examples and exercises. One can find a proof in any text of Complex Analysis. We leave as an exercise to use it to determine the irreducible polynomials em R. Theorem 3.5.15 (Fundamental Theorem of Algebra). The field C of complex numbers is algebraically closed, i.e., any non-constant complex polynomial has at least one complex root. Exercises. 1. If p(x) ∈ D[x], p(x) 6= 0, and a ∈ D is a root of p(x), the largest natural number m such that p(x) is a multiple of (x − a)m is called the multiplicity of the root a. Show that the sum of the multiplicities of the roots of p(x) is ≤ deg p(x). 3.5. DIVISION OF POLYNOMIALS 163 2. Show that p(x) ∈ A[x] can have more than deg p(x) roots, if A has zero divisors. 3. Show that x2 + 1 is irreducible in Z3 [x]. 4. Determine all the irreducible polynomials p(x) ∈ Z3 [x] with deg p(x) ≤ 2. 5. How many irreducible polynomials are there in Z5 [x] with degree 5? (Hint: count the reducible and irreducible polynomials in Z5 [x] of degrees ≤ 5.) 6. Show that the following statements are equivalent: (a) The field K is algebraically closed. (b) Any polynomial in K[x] of degree ≥ 1 is a product of polynomials of degree 1. 7. Assume that K is an algebraically closed field and show that if p(x) ∈ K[x] and deg p(x) = n ≥ 1, then the sum of the multiplicities of the roots of p(x) is exactly n. 8. Assume that p(x) ∈ R[x] and using the Fundamental Theorem of Algebra show that: (a) If c ∈ C is a root of p(x), the complex conjugate of c is also a root of p(x). (b) If p(x) is irreducible in R[x] and deg p(x) > 1, then p(x) = ax2 + bx + c and b2 − 4ac < 0. 9. If K is an algebraically closed field and D is an integral domain and an algebraic extension de K, show that K = D.9 10. If p(x) = a0 + a1 x + · · · + αn xn ∈ Z[x], and c(p) = gcd(ao , a1 , . . . , αn ), show that: (a) p(x) = c(p)q(x) where q(x) is primitive, (b) If p(x) and q(x) are primitive then p(x)q(x) is primitive. 9 In particular, according to the FundamentalTheorem of Algebra, there is no field L which is an algebraic extension of C. This gives a complete answer to Hamilton’s problem, discussed in Chapter ??. 164 3.6 CHAPTER 3. OTHER EXAMPLES OF RINGS The Ideals of K[x] As we saw in Chapter ??, the structure of the ideals in Z is specially simple, a fact which is behind Euclides Algorithm for the computation of the greatest common divisor and least common multiple of two integers. For a general integral domainD the structure of the ideals of D[x] can be very complex, as the example of Z[x] shows. In general, there is no analogue of the Euclides Algorithm. However, if D = K is a field, then the structure of the ideals in K[x] is easy to describe, as we shall see now.. We start by remarking that the Division Algorithm, as stated in Theorem 3.5.3, in the case of K[x] can be strengthen as follows: Theorem 3.6.1. If p(x), d(x) ∈ K[x] and d(x) 6= 0, there exist unique polynomials q(x) and r(x), such that p(x) = q(x)d(x)+r(x) and deg r(x) < deg d(x). This result is similar to the result we have proved in Chapter ?? for the integers, and so can be used to establish various analogies between the rings K[x] and Z. In particular: Theorem 3.6.2. Every ideal in K[x] is principal. Proof. Assume that I ⊂ K[x] is an ideal. If I = {0}, then I = h0i is a principal ideal. Therefore, assume that I 6= {0}, so there exists some non-zero polynomial p(x) ∈ I, and consider the set N = {n ∈ N0 : ∃p(x) ∈ I, n = deg p(x)} ⊂ N0 . The set N is non-empty, so it must have a minimum. Let m(x) ∈ I be such that deg m(x) = min N . Then (3.6.1) r(x) ∈ I and r(x) 6= 0 =⇒ deg m(x) ≤ deg r(x). Since m(x) ∈ I, obviously hm(x)i ⊂ I. On the other hand, if p(x) ∈ I the Division Algorithm gives p(x) = m(x)d(x) + r(x), where deg r(x) < deg m(x). Since I is an ideal, it follows that r(x) = p(x) − m(x)d(x) ∈ I, and we conclude from (3.6.1) that r(x) = 0 (otherwise we would have deg r(x) < deg m(x), a contradiction). Hence, p(x) ∈ hm(x)i and I = hm(x)i. 3.6. THE IDEALS OF K[X] 165 In Chapter ?? we have used the fact that all ideals in Z are principal to construct Euclides Algorithm, which allows one to find the greatest common divisor and the least common multiple of any two integers. We will now apply the same set of ideas to the ring K[x]. We start with the following proposition, whose proof is left for the exercises: Proposition 3.6.3. Let I = hp(x)i and J = hq(x)i be ideals in K[x]. Then: (a) I ⊂ J if and only if q(x)|p(x); (b) I is maximal if and only if p(x) is irreducible; (c) If I = J and p(x) and q(x) are monic or zero, then p(x) = q(x). If p(x), q(x) ∈ K[x], then I = hp(x), q(x)i is the ideal in K[x], given by: I = {a(x)p(x) + b(x)q(x) : a(x), b(x) ∈ K[x]}. According to Theorem 3.6.2, this ideal is principal. Therefore, there exists a polynomial d(x) ∈ K[x] such that hd(x)i = hp(x), q(x)i. Note that d(x) satisfies: • d(x)|p(x) and d(x)|q(x); • there exist polynomials a(x) and b(x) such that d(x) = a(x)p(x) + b(x)q(x); • if c(x)|p(x) and c(x)|q(x), then c(x)|d(x). In other words, d(x) is a common divisor of p(x) and q(x), and it is a multiple of any other common divisor of these two polynomials. Similarly, hp(x)i∩hq(x)i is a principal ideal, so there exists m(x) ∈ K[x] such that hm(x)i = hp(x)i ∩ hq(x)i. In this case, we have: • p(x)|m(x) and q(x)|m(x); • if p(x)|n(x) and q(x)|n(x), then m(x)|n(x). Hence, m(x) is a common multiple of p(x) and q(x), and is a divisor of any other polynomial which is a common multiple of these two polynomials. Finally, note that according to Proposition 3.6.3 (c), if p(x) and q(x) are monic polynomials or zero and hp(x)i = hq(x)i, then p(x) = q(x). Hence we can introduce: 166 CHAPTER 3. OTHER EXAMPLES OF RINGS Definition 3.6.4. Let p(x), q(x) ∈ K[x]. (i) If hd(x)i = hp(x), q(x)i, with d(x) monic or zero, then d(x) is called the greatest common divisor of p(x) and q(x), abbreviated to d(x) = gcd(p(x), q(x)). (ii) If hm(x)i = hp(x)i ∩ hq(x)i, with m(x) monic or zero, then m(x) is called the least common multiple of p(x) and q(x), abbreviated to m(x) = lcm(p(x), q(x)). Similarly to the integers, notice that p(x) = q(x)a(x) + r(x) =⇒ hp(x), q(x)i = hq(x), r(x)i, so Euclides Algorithm remains valid in K[x]. We illustrate it with an example in Z5 [x]. Example 3.6.5. In order to find the greatest common divisor of p(x) = x4 + x3 + 2x2 + x + 1 and q(x) = x3 + 3x2 + x + 3 in Z5 [x], we proceed as follows: 1. Divide p(x) by q(x), so one obtains x4 + x3 + 2x2 + x + 1 = (x3 + 3x2 + x + 3)(x + 3) + 2x2 + 2. Hence: hx4 + x3 + 2x2 + x + 1, x3 + 3x2 + x + 3i = hx3 + 3x2 + x + 3, 2x2 + 2i. 2. Then divide x3 + 3x2 + x + 3 by 2x2 + 2, leading to x3 + 3x2 + x + 3 = (2x2 + 2)(3x + 4). Hence: hx3 + 3x2 + x + 3, 2x2 + 2i = h2x2 + 2i = hx2 + 1i. 3. One concludes that gcd(x4 + x3 + 2x2 + x + 1, x3 + 3x2 + x + 3) = x2 + 1. The exact some argument that we used for the integers, also leads to the following result: Lemma 3.6.6. If p(x), q(x) ∈ K[x] are monic polynomials, then gcd(p(x), q(x)) lcm(p(x), q(x)) = p(x)q(x). 3.6. THE IDEALS OF K[X] 167 Example 3.6.7. We saw above that the greatest common divisor of p(x) = x4 +x3 +2x2 +x+1 and q(x) = x3 + 3x2 + x + 3 em Z5 [x] is d(x) = x2 + 1. Hence, we conclude that the least common multiple of these polynomials is the polynomial (x4 + x3 + 2x2 + x + 1)(x3 + 3x2 + x + 3) x2 + 1 5 4 2 = x + 4x + 2x + 4x + 3. m(x) = Exercises. 1. Prove Theorem 3.6.1. 2. Prove Proposition 3.6.3. 3. Let p(x), q(x) ∈ K[x]. Show that I = hp(x), q(x)i = {a(x)p(x) + b(x)q(x) : a(x), b(x) ∈ K[x]}. 4. Let p(x), q(x) ∈ K[x]. Show that if hd(x)i = hp(x), q(x)i, then: (a) There exist a(x), b(x) ∈ K[x] such that d(x) = a(x)p(x) + b(x)q(x). (b) d(x)|p(x) and d(x)|q(x). (c) If c(x)|p(x) and c(x)|q(x), then c(x)|d(x), hence deg c(x) ≤ deg d(x). 5. Show that the following generalization of Euclides Lemma holds: if p(x), q1 (x), q2 (x) ∈ K[x], p(x) is irreducible and p(x)|q1 (x)q2 (x), then p(x)|q1 (x) or p(x)|q2 (x). 6. If p(x), q(x) ∈ K[x], shows that p(x) = q(x)a(x) + r(x) =⇒ hp(x), q(x)i = hq(x), r(x)i. 7. Let d(x) be the greatest common divisor of x4 + x3 + 2x2 + x + 1 and x3 + 3x2 + x + 3 in Z5 [x]. Determine a(x) and b(x) in Z5 [x] such that d(x) = a(x)(x4 + x3 + 2x2 + x + 1) + b(x)(x3 + 3x2 + x + 3). 8. Let p(x), q(x) ∈ K[x], d(x) = gcd(p(x), q(x)) and m(x) = lcm(p(x), q(x)). Show that: (a) If p(x)|r(x) and q(x)|r(x), then p(x)q(x)|r(x)d(x). (b) There exists k ∈ K such that kd(x)m(x) = p(x)q(x). 168 CHAPTER 3. OTHER EXAMPLES OF RINGS 9. Let q(x) ∈ K[x] be non-zero and non-invertible. Prove the following statements (recall the Fundamental Theorem of Arithmetic): (a) There exist irreducible polynomials p(x) such that p(x)|q(x). (b) There exist irreducible monic polynomials m1 (x), . . . , mk (x) ∈ K[x] and Qk a0 ∈ K such that q(x) = a0 i=1 mi (x). (c) The decomposition in (b) is unique up to the order of the factors. 10. Show that the ideal hx, yi in K[x, y] is not principal. 11. Assume that the ring A is an extension of the field K, and let a ∈ A be algebraic over K. If J = {p(x) ∈ K[x] : p(a) = 0}, show that: (a) J = hm(x)i is an ideal of K[x] 10 . (b) If A does not have zero divisors, then m(x) is irreducible, and K[a] = K(a) is a field. (c) If A does not have zero divisors, and B is the set of all elements in A which are algebraic over K, then B is a field and it is the largest algebraic extension of K in A. √ √ 12. Show that Q[ 3 2] and Q[ 4 2] are algebraic extensions of Q and subfields of R. What are their dimensions, when viewed as vector spaces over Q? 13. Let A ⊂ R be the set of all real numbers which are algebraic over Q. Show that: (a) A is a countable field. (b) A is an algebraic extension of Q. (c) A has infinite dimension, when viewed as a vector space over Q. 3.7 Divisibility and Prime Factorization We saw before in our studies of the ring of integers Z and of the ring of polynomials K[x] that in these rings any non-zero element which is noninvertible has an essential unique factorization has a product irreducible or prime elements. It is natural to wonder if this fact holds in other rings. We will study in this and in the next section how the notions of divisibility and factorization can be extended to any integral domain D. 10 One say that m(x) is the minimal polynomial of the element a. 3.7. DIVISIBILITY AND PRIME FACTORIZATION 169 Recall that if a, b ∈ D we say that a divides (or is a factor of) b if there exists d ∈ D such that b = da. In this case we write “a|b”11 . The following notions are partially inspired by the usual concepts we have seen in Z and K[x], and they play a basic role. Definition 3.7.1. Let D be an integral domain, a, b, p ∈ D, and p a noninvertible element. We say that: (i) a is associate to b, if a|b and b|a; (ii) p is prime, if p 6= 0 and p|ab ⇒ p|a or p|b; (iii) p is irreducible, if p = ab ⇒ a is invertible or b is invertible. Notice that in this more general context, Euclides Lemma becomes a definition of a prime element, and the irreducible elements are those elements that only admit trivial factorizations. We will see below that in the rings Z and K[x] the prime elements in the sense of the above definition are exactly the irreducible elements. It is only for historical reasons that we use the terms “prime” in Z and “irreducible” in K[x]. For a general integral domain, the notion of prime and irreducible may not coincide, but we will identify many classes of domains where these notions are equivalent, and where one can establish appropriate generalizations of the Fundamental Theorem of Arithmetics, concerning the existence and uniqueness of factorizations into irreducible elements. The binary relation associate to is actually an equivalence relation: it is easy to check that a is associate to b if and only if a = ub for some invertible element u. Hence, if a, b ∈ D are associates, we write a ∼ b. In Factorization Theory, it is common to call the invertible elements units, and we will follow this practice. Notice that the units are precisely those elements which are associate to the identity of D, and that given p, q ∈ D, if p ∼ q, then p is a prime (respectively irreducible) if and only if q is a prime (respectively, irreducible). In particular, if p is a prime then all the elements obtained from p by multiplying by units are also prime elements (and this is not the usual convention in Z). We start by observing the following general fact: Lemma 3.7.2. In any integral domain, a prime element is always irreducible. 11 We also use the symbol “a ∤ b” to say that a does not divide b. 170 CHAPTER 3. OTHER EXAMPLES OF RINGS Proof. If p ∈ D is prime and p = ab, then either p|a or p|b. If, for example, p|a, then there exists x ∈ D such that a = px, and we conclude that p = ab =⇒ p = pxb, and since p 6= 0, =⇒ 1 = xb, =⇒ b is invertible. Similarly, if p|b, we conclude that a is invertible. Examples 3.7.3. 1. In Z the units are {1, −1}. An element p ∈ Z is irreducible in the sense of Definition 3.7.1 if and only if the natural number |p| is prime in the sense of Chapter ??. Also, if p|n then |p||n, so we see that Euclides Lemma shows that if p is irreducible then p|ab =⇒ |p||ab =⇒ |p||a or |p||b =⇒ p|a or p|b. We conclude that in Z irreducible and prime elements, in the sense of Definition 3.7.1, coincide. 2. The units in K[x] are the zero degree polynomials, while an irreducible polynomial in the sense of Definition 3.7.1 is just an irreducible polynomial in the usual sense. The generalization of Euclides Lemma given in Exercise 5 shows that if p(x) ∈ K[x] is irreducible then it is prime. We conclude again that in K[x] irreducible and prime elements, in the sense of Definition 3.7.1, coincide. 3. The units in the ring of Gaussian integers Z[i] are {1, −1, i, −i}. The element 2 ∈ Z[i] is not irreducible in Z[i] (although it is irreducible in Z) since we have: 2 = (1 + i)(1 − i), with 1 ± i non invertible. We claim that 1 + i and 1 − i are irreducible. In order to show this, we observe that the function N : Z[i] → N0 defined by N (a + bi) = |a + bi|2 = a2 + b2 . satisfy the following two properties: (a) if z1 , z2 ∈ Z[i], then N (z1 z2 ) = N (z1 )N (z2 ); (b) N (z) = 1 if and only if z is invertible. To check, for example, that 1 + i is irreducible, assume that 1 + i = z1 z2 in Z[i]. By property (a), we find that 2 = N (1 + i) = N (z1 z2 ) = N (z1 )N (z2 ). 3.7. DIVISIBILITY AND PRIME FACTORIZATION 171 Since 2 is irreducible in Z, we must have N (z1 ) = 1 or N (z2 ) = 1. By property (b), we conclude that z1 or z2 are invertible, and 1 + i is irreducible in Z[i]. Observe that −(1 + i) = −1 − i, i(1 + i) = −1 + i and −i(1 + i) are also irreducible. We will show later that in Z[i] any irreducible element is also a prime, and how one can determine all the irreducible in Z[i]. 4. There are integral domains where there are irreducible elements which are not prime. We have already observed in Chapter ?? that in the ring of even integers 2Z we have two distinct factorizations 36 = 6 × 6 = 2 × 18, so 2 divides 6×6 but it does not divide 6 (note however, that we assume integral domains √ to have a unit, which is not the case with 2Z). Consider instead the ring Z[ −5], which is an integral domain. In this ring, we have: √ √ √ √ 9 = 3 · 3 = (2 + −5)(2 − −5) =⇒ 3|(2 + −5)(2 − −5). We leave it to the √ exercises to check √ that 3 is irreducible √ but is not a factor neither of (2 + −5) nor of (2 − −5). Hence 3 ∈ Z[ −5] is irreducible but not a prime. Using these elementary notions we can introduce domains where factorizations exist: Definition 3.7.4. A domain D is called a unique factorization domain (abbreviated UFD) if: satisfeitas: (i) If 0 6= d ∈ D is non-invertible there exists irreducible elements p1 , · · · , pn such that (3.7.1) d= n Y pi . i=1 Q Q ′ (ii) If p1 , · · · pn , and p′1 · · · p′m are irreducible, and ni=1 pi = m i=1 pi , then n = m, and there exists a permutation π ∈ Sn such that p′i ∼ pπ(i) . In other words, in a UFD every non-zero and non-invertible element has a factorization into a product of irreducible elements and this factorization is unique up to the order of the factors and multiplication of eachQfactor by an appropriate unit (note that if p′i = ui pπ(i) , then we must have ni=1 ui = 1). Note that the factorization 3.7.1 can also be expressed in powers of de irreducible elements, but in this case we may need to include a unit in the factorization: (3.7.2) d = u · pe11 · · · penn . 172 CHAPTER 3. OTHER EXAMPLES OF RINGS Examples 3.7.5. 1. The ring Z is a UFD: it follows from the Fundamental Theorem of Arithmetics that every integer can be factored in the form m = p1 · · · pm , where each pi is irreducible (i.e., |pi | is a prime). This factorization is unique up to the order of the factor and multiplication by ±1. For example, we have: −15 = (3) · (−3) = (−1)32 , where 3 and −3 are both irreducible. 2. By Exercise 9 in Section 3.6, given a polynomial q(x) ∈ K[x], there exist irreducible polynomials p1 (x), . . . , pn (x) ∈ K[x] such that q(x) = n Y pi (x). i=1 This factorization is unique up to the order of the factor and multiplication by units. Hence, K[x] is a UFD. For example, in Q[x] we have 2x2 + 4x + 2 = (2x + 2)(x + 1) = 2(x + 1)2 , where 2x + 2 and x + 1 are both irreducible. 3. We will see below that the ring Z[i] of Gaussian integers is a UFD. It is useful to observe that all the elementary properties concerning factors and divisibility introduced above can be expressed in terms of ideals. For that, we will say that an ideal 0 ( P ( D is a prime ideal if, for any ideals I, J ⊂ D, IJ ⊂ P =⇒ I ⊂ P or J ⊂ P. Proposition 3.7.6. Let a, b, p, u ∈ D. Then: (i) a|b if and only if hai ⊃ hbi; (ii) a ∼ b if and only if hai = hbi; (iii) u is a unit if and only if hui = D; (iv) p is a prime if and only if hpi is a prime ideal; (v) p is irreducible if and only if hpi is maximal among all principal ideals in D. 3.7. DIVISIBILITY AND PRIME FACTORIZATION 173 Proof. We leave the proof of (i), (ii) and (iii) as an easy exercise. In order to show that (iv) holds, let p ∈ D be a prime, and I, J ⊂ D ideals such that IJ ⊂ hpi. If I 6⊂ hpi, then there exists a ∈ I such that a 6∈ hpi, i.e., such that p ∤ a (by (i)). Hence, for all b ∈ J, we have ab ∈ hpi ⇔ p|ab and p ∤ a. Since p is a prime, we must have p|b, i.e., b ∈ hpi (by (i)). We conclude that J ⊂ hpi. This proves that hpi is a prime ideal. For the converse, assume that hpi is a prime ideal p|ab. Then habi = haihbi ⊂ hpi =⇒ hai ⊂ hpi or hbi ⊂ hpi. By (i), this means that either p|a or p|b, so p is a prime. In order to show that (v) holds, consider first an irreducible element p ∈ D and and let hpi ⊂ hai, for some a ∈ D. Then p = ax hence either a is a unit or x is a unit. If a is a unit, then by (iii) hai = D. If x is a unit, then p ∼ a and, by (ii), hpi = hai. Hence, hpi is maximal among all principal ideals of D. Conversely, assume that hpi is maximal among all principal ideals of D. If p = ab, since hpi ⊂ hai, we see that either hai = D and a is invertible (by (iii)), or hpi = hai and a ∼ p (by (ii)). In the later case, there exists a unit u ∈ D such that a = pu, so that p = ab =⇒ p = pub, =⇒ 1 = ub =⇒ b is invertible. Therefore, either a or b is invertible, which shows that p is irreducible. The previous proposition suggests one maybe able to construct a Factorization Theory based in the ideals of D rather than the elements of D. This is indeed true and useful, and the name “ideal” derives historically from this idea (see Chapter 7). For now, we consider only factorization of elements of D, but we use ideals to give a characterization of a UFD. Given a domain D we will say that it satisfies the ascending chain condition on principal ideals if every chain of principal ideals: hd1 i ⊂ hd2 i ⊂ · · · ⊂ hdn i ⊂ . . . stabilizes, i.e., after some natural number n0 ∈ N we have hdn0 i = hdn0 +1 i = . . . . This property turns out to be crucial for factorizations to exist: 174 CHAPTER 3. OTHER EXAMPLES OF RINGS Theorem 3.7.7. Let D be an integral domain. Then D is a UFD if and only if the following two conditions hold: (i) every irreducible element is prime; (ii) D satisfies the ascending chain condition on principal ideals. Proof. Let D be a UFD and p ∈ D an irreducible element. If p|ab, then there exist x ∈ D such that px = ab, where x, a and b admit factorizations of the type (3.7.1): x = p1 · · · pr , a = p′1 · · · p′s , b = p′′1 · · · p′′t with pi , p′j , p′′k irreducible in D. Hence: p · p1 · · · pr = p′1 · · · p′s · p′′1 · · · p′′t , and, by uniqueness of the factorization, we must have p ∼ p′i or p ∼ p′′j . In the first case, p|a and and in the second case p|b. Hence, p is a prime. On the other hand, consider an ascending chain condition of principal ideals hd1 i ⊂ hd2 i ⊂ · · · ⊂ hdn i ⊂ . . . We can assume, without loss of generality (why?), that d1 6= 0 and di is non-invertible, ∀i. Since di |d1 , for all i, the factorizations of d1 and di take the form d1 = p1 · · · pr , di = p′1 · · · p′s . The irreducible factors of di are factors of d1 , so s ≤ r. In particular, the chain cannot contain more than r distinct ideals, so there exists a natural n0 such that hdn0 i = hdk i, for all k ≥ n0 . This completes half of the proof. Conversely, let D be an integral domain where conditions (i) and (ii) in the statement hold. Let d ∈ D be a non-zero and non-invertible element. Let us assume that d does not admit a factorization, and see that this leads to a contradiction. By induction, we construct as follows a sequence {dn }n∈N where d1 = d, dn+1 |dn , dn 6∼ dn+1 and none of the elements dn admits a factorization into a product of irreducible elements. The case n = 1 is trivial, so assume that n > 0, and we have already constructed elements d1 , · · · , dn . Since dn cannot be irreducible, we have dn = an bn , where an and bn are non-invertible. Obviously, if both an and bn can be factored into a product of irreducible elements, then dn can also be factored. So assume, without loss of generality that bn cannot be factored. 3.7. DIVISIBILITY AND PRIME FACTORIZATION 175 Then we set dn+1 = bn , and observe that obviously dn+1 |dn , and dn 6∼ dn+1 . This shows that the sequence {dn }n∈N exists. Now, the principal ideals generated by the dn ’s satisfy hd1 i ( hd2 i ( · · · ( hdn i ( . . . but this contradicts the ascending chain condition. Hence, we conclude that every non-zero and non-invertible element can be factored into irreducible elements. In order to verify the uniqueness of factorization, assume that p1 · · · pn = p′1 · · · p′m , where, say, n ≤ m. Since the pi , p′j are irreducible, by (i) they are also prime. Since pn |p′1 · · · p′m , we have that pn is associate to some p′j , which we denote by p′π(n) . Removing these two elements from the products, and repeating the argument, we obtain by exhaustion that n = m and pi ∼ p′π(i) for some permutation π ∈ Sn . This theorem justifies the use of the term “prime factorization” to denote factorizations of the type (3.7.1) or (3.7.2). As we have already observed, the crucial property of the rings Z and K[x], in what concerns factorization, is that all its ideals are principal. Definition 3.7.8. An integral domain D is called a principal ideal domain (abbreviated PID) if all its ideals are principal, i.e., of the form hdi. As a consequence of Theorem 3.7.7 we have: Corollary 3.7.9. Every PID is a UFD. Proof. First we show that the irreducible elements in D are prime. For that, assume that p ∈ D is irreducible and that p|ab. The ideal ha, pi is principal, so there exists d ∈ D such that hdi = ha, pi. Since hpi ⊂ hdi ⊂ D and hpi is maximal (Proposition 3.7.6), we have hpi = hdi or hdi = D. In the first case, p ∼ d and since d|a, we conclude that p|a. In the second case there exist r, s ∈ D such that 1 = ra + sp, hence b = 1 · b = (ra + sp)b = rab + spb. 176 CHAPTER 3. OTHER EXAMPLES OF RINGS Since p divides each of the terms in the right hand side, we conclude that p|b. In any case, p|a or p|b, so p is a prime. Last we verify the ascending chain condition on principal ideals. Consider the chain: hd1 i ⊂ hd2 i ⊂ · · · ⊂ hdn i ⊂ . . . Note that ∪∞ i=1 hdi i is an ideal that, by assumption, must be principal: ∪∞ hd i = hd 0 i. Obviously, there exists a natural number n0 such that i=1 i d0 ∈ hdn0 i, and it follows that: hdn0 i = hdn0 +1 i = . . . . We have verified both conditions in Theorem 3.7.7, so D is a UFD. Examples 3.7.10. 1. The ring of Gaussian integers is a UFD, since we know that Z[i] is a PID, by Exercise ?? in Section 2.6. 2. We will see in the next section that if D is a UFD then D[x] is also a UFD. In particular, Z[x] is a UFD, although it is not a PID. In geral, the problem of determining if a given integral domain is a UFD maybe hard to solve. For example, it is known that the quadratic domains √ Z[ m], for m < 0, are UFD if and only if m = −1, −2, −3, −7 and −11, but this is a very non-trivial result. For m > 0 is it not known which domains √ Z[ m] are UFD. Also, even if one knows that D is a UFD, it maybe very hard to determine its prime elements. We will illustrate this problem in the case of the Gaussian integers. In order to simplify the exposition, we will call Euclidean primes the prime natural numbers in Z, and Gaussian primes the prime Gaussian integers in Z[i]. Proposition 3.7.11. Let p ∈ Z be an Euclidean prime. (i) If p = n2 + m2 has solutions n, m ∈ Z, then z = n + mi is a Gaussian prime; (ii) If p = n2 + m2 has no solutions in Z, then p is a Gaussian prime; In particular, z ∈ Z[i] is a Gaussian prime if and only if z ∼ p, where p ∈ Z is a Gaussian prime, or z ∼ n + mi, where n2 + m2 = p is an Euclidean prime. 3.7. DIVISIBILITY AND PRIME FACTORIZATION 177 Proof. Let N (n + mi) = n2 + m2 denote the norm of the Gaussian integer n + mi. To show that (i) holds, assume that p = n2 + m2 is an Euclidean prime, and consider a Gaussian factorization: n + mi = zw. Then we have: p = N (n + im) = N (z)N (w). Hence, either N (z) = 1 and z is invertible, or N (w) = 1 and w is invertible. Hence, we conclude that n + mi is a Gaussian prime. Next, to show that (ii) holds, observe that if p is not a Gaussian prime then there exist non-invertible Gaussian integers z and w such that p = zw. Then p2 = N (p) = N (z)N (w), and since N (z) and N (w) are integers 6= ±1, one must have p = N (z) = N (w). Hence the equation p = n2 + m2 has solutions. Finally, let z = a + bi be a Gaussian prime, so N (z) > 1. If p is any prime factor (in Z) of N (z), there exists w ∈ Z[i] such that N (z) = (a + bi)(a − bi) = pw. Since p ∈ Z, we have that p|a + bi = z if and only if p|a − bi. Hence, one of the following alternative must hold: (a) p is also a Gaussian prime: Then either p|a + bi or p|a − bi, i.e., p is a factor of z. Hence, z ∼ p, since z and p are both Gaussian primes; (b) p is not a Gaussian prime: We conclude from (ii) that p = n2 + m2 has solutions, and we then have N (z) = (a + bi)(a − bi) = pw = (n + mi)(n − mi)w. From (i), n + mi is a Gaussian prime, and we conclude that either a + bi ∼ n + mi or a + bi ∼ n − mi. Examples 3.7.12. 1. Obviously, the equation 3 = n2 + m2 has no solutions in Z, so 3 is both an Euclidean prime and a Gaussian prime. 2. Since 5 = 12 + 22 , it follows that 5 is not a Gaussian prime, but the Gaussian integers ±1 ± 2i and ±2 ± i are all Gaussian primes. 178 CHAPTER 3. OTHER EXAMPLES OF RINGS As we have just seen, the identification of Gaussian primes relies on the solutions of p = n2 + m2 , where p is an Euclidean prime. Fermat found a very elegant result concerning the values of p for which this equation has solutions: Theorem 3.7.13 (Fermat). Let p be an Euclidean prime. Then the following statements are equivalent: (i) The equation p = n2 + m2 has solutions in Z, (ii) p 6≡ 3 (mod 4), (iii) The equation x2 = −1 has solutions in Zp . Proof. We leave it as exercise to check that “(i) =⇒ (ii)”. To show that “(ii) =⇒ (iii)”, notice that we can assume p = 6 2, since 2 x = 1 is a solution of x = −1 = 1. Let 2 6= p ≡ 1 (mod 4). If x ∈ Zp set C(x) = {x, −x, x−1 , −x−1 } and define: x ≈ y ⇐⇒ C(x) = C(y). One checks easily that “≈” is an equivalence relation in Z∗p , and that x 6= −x for any x ∈ Z∗p (because p 6= 2). The equivalence class of x is just the set C(x). Denoting by #(C(x)) the number of elements of the class C(x), notice that the possible values of #(C(x)) are: #(C(x)) = 2, if x = x−1 , i.e., if x = ±1, = 2, if x = −x−1 , i.e., if x and x−1 are the solutions of x2 = −1, = 4, if x is not a solution of x2 = ±1 (or, equivalently, of x4 = 1). The equivalence classes of ≈ give a partition of Z∗p , which has p − 1 elements. We have just saw that there exists at least one class with 2 elements, namely C(1) = {1, −1}. Possibly, there exists another class with 2 elements, formed by the roots of x2 + 1, if this polynomial has roots in Z∗p . If n is the number of equivalence classes with 4 elements, we conclude that the p − 1 elements of Z∗p are assembled as follows: • if there are no solutions of x2 = −1, then p − 1 = 2 + 4n ⇔ p = 4n + 3, since there exists only one equivalence class with 2 elements, and the remaining n classes have 4 elements each, or • there exist solutions of x2 = −1 and then p − 1 = 2 + 2 + 4n ⇔ p = 4(n + 1) + 1, because there are 2 equivalence classes each with 2 elements, and the remaining n classes have 4 elements. 3.7. DIVISIBILITY AND PRIME FACTORIZATION 179 Since p 6= 2 is a prime, we have that p 6≡ 3 (mod 4) =⇒ p ≡ 1 (mod 4) and we conclude that the equation x2 = −1 has solutions in Z∗p . Finally, to show that “(iii) =⇒ (i)” we note that x2 = −1 has solutions in Zp if and only if there exists an integer k such that p|1+k2 = (1+ki)(1−ki). If p is a Gaussian prime then p|1 + ki or p|1 − ki, a contradiction, since p 6 |1. Hence p is not a Gaussian prime, and according to Proposition 3.7.11, the equation p = n2 + m2 must have solutions. Examples 3.7.14. 1. The Euclidean primes 7, 11 and 19 are also Gaussian primes. 2. 1973 is an Euclidean prime which is not a Gaussian prime because p ≡ 1 (mod 4). It follows that the equation 1973 = n2 + m2 has solutions n, m ∈ Z, which is not an immediate fact (check that n = 23 and m = 38 is a solution). An important property of a UFD is that any two elements have a greatest common divisor and a least common multiple. In order to discuss this, we need a more abstract definition of greatest common divisor and least common multiple, valid for any integral domain. Definition 3.7.15. Let D be an integral domain, and a1 , . . . , an ∈ D. (i) d ∈ D is called a greatest common divisor of a1 , . . . , an if d|ai (i = 1, . . . , n) and for all b ∈ D such that b|ai (i = 1, . . . , n) we have that b|d; (ii) m ∈ D is called a least common multiple of a1 , . . . , an if ai |m (i = 1, . . . , n) and for all b ∈ D such that ai |b (i = 1, . . . , n) we have that m|b. Note the lack of uniqueness in these notions: if d is a greatest common divisor, and c ∼ d, then c is also a greatest common divisor, and the same holds for a least common multiple. In Z and in K[x], we could ensure uniqueness by requiring that d and m be non negative in Z, and monic in K[x]. Apart from this detail, the definitions above are compatible with the definitions introduced in Chapters 2 and 3. It is not at all obvious that given elements a1 , . . . , an ∈ D a greatest common divisor and/or a least common multiple exist. However, if a1 , . . . , an have at least one greatest common divisor (respectively, least common multiple) we will denote by gcd(a1 , . . . , an ) (respectively, lcm(a1 , . . . , an )) any such element. We leave the proof of the following elementary properties as an exercise: 180 CHAPTER 3. OTHER EXAMPLES OF RINGS Lemma 3.7.16. Let a, b, c ∈ D. Then: (i) gcd(a, gcd(b, c)) ∼ gcd(gcd(a, b), c) ∼ gcd(a, b, c); (ii) gcd(ca, cb) ∼ c gcd(a, b). Similarly, with gcd replaced by lcm. When D is a UFD the gcd(a1 , . . . , an ) and lcm(a1 , . . . , an ) always exist: Proposition 3.7.17. Let D be a UFD, and a, b ∈ D. Then: (i) gcd(a, b), lcm(a, b) exist and ab ∼ gcd(a, b) lcm(a, b); (ii) If D is a PID, then gcd(a, b) = ra + sb for some r, s ∈ D. Proof. (i) If a = 0 then gcd(a, b) = b and lcm(a, b) = 0. If a is invertible, then gcd(a, b) = a and lcm(a, b) = b. Assume then that both a and b are non-zero and non-invertible. The prime factorizations of a and b can be written as a = u · pn1 a1 · · · pns as , and b = u′ · p1nb1 · · · pns bs , where the pi are non-associate, nai ≥ 0 and nbi ≥ 0. Setting for i = 1, . . . , s mi = min{nai , nbi } and Mi = max{nai , nbi }, we see immediately that we can choose: M1 ms Ms 1 gcd(a, b) = pm 1 · · · ps , and lcm(a, b) = p1 · · · ps . (ii) Let D be a PID. If a, b ∈ D, by (i) there exists gcd(a, b). We leave as an exercise to check that since D is a PID one has: ha, bi = hgcd(a, b)i. This means that there exist r, s ∈ D such that gcd(a, b) = ra + sb. Exercises. 1. Verify itens (i)-(iii) in Proposition 3.7.6. √ √ 2. In the ring Z[ −5] show that the elements 3 and 2 ± −5 are irreducible. √ 3. Show that the ring Z[ 10] is not a UFD. 4. If p is a Euclidean prime and there are integers n and m such that p = n2 +m2 , show that p 6≡ 3 (mod 4). 5. Show that if p is a Euclidean prime and n, m, a, b ∈ Z satisfy p = n2 + m2 = a2 + b2 then {n2 , m2 } = {a2 , b2 }. 3.7. DIVISIBILITY AND PRIME FACTORIZATION 181 6. An integral domain D is called an Euclidean domain if there exists a map δ : D → N with the following property: ∀a, b ∈ D − {0} there exist q, r ∈ D such that a = qb + r, with r = 0 or δ(r) < δ(b). Such a map is called an Euclidean function . Show that: (a) Z and K[x] are Euclidean domains; (b) The ring of Gaussian integers Z[i] is an Euclidean domain; (c) A Euclidean domain D always admits an Euclidean function δ̃ : D → N satisfying δ̃(a) ≤ δ̃(ab) for all a, b ∈ D − {0} and δ̃(u) = δ̃(1) if and only if u is a unit; Hint: Define a new Euclidean function δ̃(a) = min{δ(xa) : x ∈ D −{0}}. (d) A Euclidean domain is a UFD (without using neither Theorem 3.7.7 or its Corollary); (e) An Euclidean domain is a PID. √ One can show that the algebraic integers 12 in the field Q( −19) form a PID which is not an Euclidean domain, so the following inclusions are strict: Euclidean Domains ( PID ( UFD ( Integral Domains 7. Let D be an integral domain. (a) Show that if D satisfies the ascending chain condition on principal ideals, then every non-zero and non-invertible element of D has a factorization into irreducible elements (possibly, non unique). (b) Give an example of an integral domain D which does not satisfy the ascending chain condition on principal ideals. 8. Prove Lemma 3.7.16. 9. Show that if D is a PID then for any a, b ∈ D one has ha, bi = hgcd(a, b)i, hai ∩ hbi = hlcm(a, b)i and gcd(a, b) lcm(a, b) ∼ ab. 10. Assume that k ∈ N and show that k = n2 + m2 has solutions in Z if and only if any prime factor p of k with p ≡ 3 (mod 4) satisfies p2N |k. What is the smallest natural number k for which the equation k = n2 + m2 = s2 + t2 has solutions n, m, s, t such that {n2 , m2 } 6= {s2 , t2 }? 12 If K is a field extension of Q, d ∈ K is called an algebraic integer if d is a root of some monic polynomial p(x) ∈ Z[x]. 182 CHAPTER 3. OTHER EXAMPLES OF RINGS 11. Let P be the set of Euclidean primes and G the set of Gaussian primes. Show that both P − G and G − P are infinite sets. In other words, show that there are infinite Euclidean primes of the form p = 4n + 1 and of the form p = 4n + 3. Hint: Consider the natural numbers of the form k = (2N !) − 1 and of the Q 2 N form k = + 4. i=1 (2i − 1) 3.8 Factorization in D[x] When K is a field, the ring of polynomials K[x] is a UFD, since it is a principal ideal domain. For an arbitrary domain D the structure of the ideals of D[x] can be very complex. In fact, D[x] is a PID if and only if D is a field, and this explains why some example of integral domains that we have mentioned before, like Z[x] or K[x, y] = K[y][x], are not PIDs (see Exercise 10 in Section 3.6). In any case, it is obvious that if D is not a UFD then D[x] is also not a UFD. In fact, we will see in this section that the ring D[x] is a UFD if and only if D is a UFD, and this shows in particular that Z[x] and K[x, y] are UFDs. In this section we will always assume that D is a UFD, so greatest common divisors and least common multiples always exist in D. Also, we will denote by K the field of fractions Frac(D). The definition of content of a polynomial, introduced in Section 3.5 for polynomials p(x) ∈ Z[x], can be extended without changes to D[x]. Definition 3.8.1. If p(x) = a0 + a1 x + · · · + an xn ∈ D[x], we say that c(p) ∈ D is the content of p(x) if and only if (3.8.1) c(p) = gcd(a0 , . . . , an ). It should be clear that, just like the greatest common divisor, the content of a polynomial is only defined up to a multiple by a unit. Again we will call a polynomial p(x) ∈ D[x] primitive if c(p) ∼ 1. Lemma 3.8.2. Let p(x) ∈ D[x]. Then: (i) There exists q(x) ∈ D[x] primitive such that p(x) = c(p)q(x). (ii) If p(x) = dq(x), with q(x) ∈ D[x] primitive and d ∈ D, then d ∼ c(p). Proof. Part (i) is obvious. 3.8. FACTORIZATION IN D[X] 183 In order to show that (ii) holds, let p(x) = a0 + a1 x + · · · + an xn and q(x) = b0 + b1 x + · · · + bn xn , with q(x) a primitive polynomial, and assume that p(x) = dq(x). Then ai = dbi and, by Lemma 3.7.16, we conclude that c(p) = gcd(a0 , . . . , an ) ∼ d gcd(b0 , . . . , bn ) ∼ d. The next two results are useful to express a polynomial p(x) ∈ K[x] in terms of a primitive polynomial q(x) ∈ D[x]. Lemma 3.8.3. Let 0 6= p(x) ∈ K[x]. Then: (i) There exist q(x) ∈ D[x] primitive and k ∈ K such that p(x) = kq(x); (ii) If p(x) = kq(x) = k̃q̃(x), with k, k̃ ∈ K and q(x), q̃(x) ∈ D[x] primitives, then q̃(x) = uq(x) and k̃ = u−1 k, where u ∈ D is a unit. Proof. (i) If p(x) = α0 + α1 x + · · · + αn xn = an a0 a1 + x + · · · + xn ∈ K[x], b0 b1 bn Q we let b = ni=1 bi . Clearly, r(x) = bp(x) ∈ D[x]. If c = c(r), by Lemma 3.8.2, there exists q(x) ∈ D[x] primitive such that r(x) = cq(x) and we have p(x) = kq(x), where k = c ∈ K. b (ii) Exercise. Corollary 3.8.4. If p(x), q(x) ∈ D[x] are primitives and p(x) ∼ q(x) in K[x], then p(x) ∼ q(x) in D[x]. Proof. If p(x) ∼ q(x) in K[x], then p(x) = αq(x), with α ∈ K. The corollary then follows from Lemma 3.8.3, item (ii). The next two results are just generalizations to UFD of results that we have obtained before in the cases D = Z and K = Q. Lemma 3.8.5. Let p(x), q(x), r(x) ∈ D[x] be such that p(x) = q(x)r(x). Then: (i) If d ∈ D is prime, then d|c(p) ⇐⇒ d|c(q) or d|c(r), and (ii) p(x) is primitive if and only if q(x) and r(x) are both primitive. 184 CHAPTER 3. OTHER EXAMPLES OF RINGS Proof. Part (ii) follow from (i). To prove (i), note first that if d is a prime and d|c(q) or d|c(r), obviously d|c(p). So it remains to prove the converse. Write p(x) = a0 + a1 x + · · · , q(x) = b0 + b1 x + · · · , and r(x) = c0 + c1 x + · · · . Let d ∈ D be a prime such that d|c(p) and assume, by contradiction, that d 6 |c(q) and d 6 |c(r). Let: s := min{k ≥ 0 : d 6 |bk }, and t =: min{k ≥ 0 : d 6 |ck }. If m = s + t, then we observe that: am = m X bk cm−k = k=0 s−1 X bk cm−k + bs ct + k=0 = s−1 X m X bk cm−k , k=s+1 bk cm−k + bs ct + k=0 t−1 X bm−k ck , k=0 where the last sums are assume to represent zero, when s = 0 or t = 0. It follows that: s−1 t−1 X X bs ct = am − bk cm−k − bm−k ck . k=0 k=0 The right hand side of the last identity is a multiple of d, while the left hand side cannot be a multiple of d. Hence, we must have either d|c(q) or d|c(r). Example 3.8.6. The polynomials p(x) = 3x2 + 2x + 5 and q(x) = 5x2 + 2x + 3 in Z[x] are both primitive since gcd(3, 2, 5) = 1. Its product is the primitive polynomial p(x)q(x) = 15x4 + 16x3 + 38x2 + 16x + 15. We now generalize Lemma 3.5.10 from D = Z to any UFD: the factors a(x) ∈ K[x] of a polynomial p(x) ∈ D[x] are associate in K[x] of the factors of p(x) in the original ring D[x]. Lemma 3.8.7. If p(x) ∈ D[x], and p(x) = a(x)b(x) with a(x), b(x) ∈ K[x], then there exist ã(x), b̃(x) ∈ D[x], and k ∈ K, such that p(x) = ã(x)b̃(x), a(x) = kã(x), and b(x) = k−1 b̃(x). Proof. Lemma 3.8.3 (i) shows that a(x) = sa′ (x) and b(x) = tb′ (x), where s, t ∈ K, and a′ (x) and b′ (x) are primitive polynomials in D[x]. On the 3.8. FACTORIZATION IN D[X] 185 other hand, we have from Lemma 3.8.2 that p(x) = c(p)p′ (x), where p′ (x) is also primitive in D[x]. We conclude that p(x) = c(p)p′ (x) = sta′ (x)b′ (x). By Lemma 3.8.5 (ii), a′ (x)b′ (x) is primitive and it follows from Lemma 3.8.3 (ii) that c(p)u = st and p′ (x) = u−1 a′ (x)b′ (x), for some unit u ∈ D. Let us set, for example, ã(x) = c(p)ua′ (x) and b̃(x) = b′ (x). Then ã(x)b̃(x) = c(p)ua′ (x)b′ (x) = sta′ (x)b′ (x) = p(x). It follows also that the constant in the statement can be taken to be k = and k−1 = 1t . c(p)u s In this more general context Gauss’s Lemma takes the same form as in the case D = Z. Corollary 3.8.8 (Gauss’s Lemma). If p(x) ∈ D[x] is not constant, then p(x) is irreducible in D[x] if and only if it is primitive in D[x] and irreducible in K[x]. Proof. If p(x) is not primitive then it has a non-trivial factorization in D[x], of the form p(x) = c(p)p̃(x). On the other hand, if p(x) = a(x)b(x) is a non-trivial factorization in K[x], then p(x) has a non-trivial factorization in D[x], by Lemma 3.8.7. If p(x) is reducible in D[x], then it has a non-trivial factorization p(x) = a(x)b(x) in D[x]. If any of these non-trivial factors is constant, then p(x) is not primitive. If all factors are non-constant, we obtain a non-trivial factorization in K[x]. Finally, we can prove Theorem 3.8.9. D[x] is a UFD if and only if D is a UFD. Proof. Assume that D[x] is a UFD. Since any 0 6= d ∈ D can be identified with a zero degree polynomial, it is obvious that D must be a UFD. Conversely, let D be a UFD. We will show that any q(x) ∈ D[x] of degree > 0 has a unique factorization. Existence: We have that q(x) = c(q)p(x), whereQp(x) primitive. Since K[x] is a UFD, q(x) has a factorization q(x) = nk=1 qk (x), where the polynomials qk (x) ∈ K[x] are irreducible in K[x] and deg(qk (x)) ≥ 1. As we saw above, there exist primitive polynomials pk (x) ∈ D[x] and constant sk ∈ K such that qk (x) = sk pk (x). By Gauss’s Lemma, the polynomials pk (x) are irreducible in D[x], and we have q(x) = c(q)p(x) = s n Y k=1 pk (x), where s = n Y k=1 sk . 186 CHAPTER 3. OTHER EXAMPLES OF RINGS Q On the other hand, according to Lemma 3.8.5 (ii), nk=1 pk (x) is primitive. By Lemma 3.8.3 (ii), there exists aQunit u ∈ D such that s = c(p)u and, in particular, s ∈ D. If we factor s = m k=1 ck into irreducible elements ck ∈ D we have that ! n ! m Y Y q(x) = ck pk (x) k=1 k=1 is a factorization of q(x) into irreducible elements in D[x]. Q ′ ′ Qn′ ′ Uniqueness: Assume that q(x) = m k=1 pk (x) is another fack=1 ck torization of q(x) into irreducible polynomials in D[x], where c′k ∈ D and deg(p′k (x)) ≥ 1. By Gauss’s Lemma the polynomials p′k (x) are primitive and irreducible in K[x]. Hence: • Qm Q ′ ′ is primitive, so m k=1 ck em D. Since D k=1 ck ∼ c(q) ∼ is a UFD, we have m = m′ and, after an appropriate permutation of these factorizations, we have ck ∼ c′k in D. • Qn Qn′ ′ k=1 pk (x) Q ′ ∼ nk=1 p′k (x) in K[x]. Since K[x] is a UFD, we have that n′ = n, and also after an appropriate permutation of these factorizations pk (x) ∼ p′k (x) in K[x]. By Corollary 3.8.4, we also have that pk (x) ∼ p′k (x) in D[x]. k=1 pk (x) Exercises. 1. If p(x) ∈ Z[x] is a monic polynomial with integers coefficients show that any rational root p(x) is actually an integer. 2. Let D be an integral domain which has an non-invertible element d 6= 0. Show that D[x] is not a PID. 3. Prove the following generalization of Eisenstein’s Criterion: let D be a UFD, K = Frac(D) and p(x) = a0 + a1 x + · · · + an xn ∈ D[x] with n ≥ 1. If p ∈ D is a prime such that p|ak for 0 ≤ k < n, p ∤ an and p2 ∤ a0 , then p(x) is irreducible in K[x]. 4. Show that the polynomial p(x, y) = x3 + x2 y + xy 2 + y is irreducible in K[x, y]. 5. Show that the polynomial p(x) = ix3 + 4x2 + 16ix + (3 − i) is irreducible in Z[i][x]. 3.8. FACTORIZATION IN D[X] 187 6. Show that when D is a UFD and p(x) ∈ D[x] is monic, then any monic factor of p(x) in K[x] belongs to D[x]. 7. Let D be a UFD and p(x), q(x) ∈ D[x]. (a) Can one use Euclides Algorithm to compute gcd(p(x), q(x)) in D[x]? (b) Does the equation gcd(p(x), q(x)) = a(x)p(x) + b(x)q(x) always have solutions a(x), b(x) ∈ D[x]? 8. Let D be an UFD. Which of the following rings are UFD? (a) The ring D[[x]] of power series with coefficients in D. (b) The ring D[1/x, x] of fractions p(x)/q(x), with p(x), q(x) ∈ D[x] and q(x) 6= 0. (c) The ring D[1/x, x]] of Laurent series. (d) The ring D[x1 , x2 , . . . ] of polynomials in an infinite number of indeterminates. 188 CHAPTER 3. OTHER EXAMPLES OF RINGS Chapter 4 Quotients and Isomorphisms 4.1 Groups and Equivalence Relations In defining the ring operations in Zm one takes the following steps: (i) First, one defines congruence modulo m (x ≡ y (mod m) ⇔ y − x ∈ hmi), and shows that it is an equivalence relation. (ii) Then, one considers the set Zm consisting of the equivalence classes x = {x + z : z ∈ hmi}, which one often writes as x = x + hmi. (iii) Finally, one defines operations on these equivalence classes, from the usual operations on the integers, through the identities x + y = x + y and xy = xy. We shall now see that this procedure can be extended to much more general contexts, and hence can be used to provide many new examples of algebraic structures. We start by replacing the additive group (Z, +) by an arbitrary group (G, ·). We use the multiplicative notation since the fact that addition was commutative plays here no role. We also replace the subgroup hmi ⊂ (Z, +) by an arbitrary subgroup H ⊂ G, and we generalize the congruent relations modulo m as follows: Definition 4.1.1. If (G, ·) is a group and H ⊂ G is a subgroup, the congruence relation modulo H is given by: g1 ≡ g2 (mod H) ⇐⇒ 189 g2−1 · g1 ∈ H. 190 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Note that, in fact, congruence modulo m is the special case of this definition where G = (Z, +) and H = hmi. We leave it as an exercise to check that congruence modulo H is indeed an equivalence relation. Proposition 4.1.2. If (G, ·) is a group and H ⊂ G is a subgroup, then ≡ (mod H) is an equivalence relation in G. Proceeding as we did for the case of the integers, we observe that if g ∈ G, the equivalence class of g, denoted by g, can be written as: g = {g′ ∈ G : g ′ ≡ g}, = {g′ ∈ G : g −1 g′ = h ∈ H}, = {g′ ∈ G : g ′ = gh, h ∈ H}. If A and B are any subsets of the group G, we will denote by AB the set of all products of elements of A and elements of B: AB = {ab : a ∈ A and b ∈ B}. If A = {a} (respectively, B = {b}) is a singleton, we will write aB (respectively, Ab) instead of AB. We leave it as an exercise to check that A(BC) = (AB)C and that, in general, AB 6= BA. With these conventions, we will denote by gH the equivalence class of g ∈ G for the congruence (mod H), and call it the left coset of H 1 . The set of all equivalence classes {gH : g ∈ G} is called the quotient set of G by H, and denoted G/H. Therefore, we have that: G/H = {gH : g ∈ G}. The number of elements of G/H, which is just the number of left cosets, is called the index of the subgroup H in G, and is denoted by [G : H]. Examples 4.1.3. 1. Consider the symmetric group G = S3 = {I, α, β, γ, δ, ε} and take as a subgroup the alternating group H = A3 = {I, δ, ε}. Note that: • The equivalence class of I is the coset I = IH = H = {I, δ, ε}. We conclude that I ≡ δ ≡ ε, and I = δ = ε. Also, H = δH = εH. • If we let g = α, it is immediate that α = αH = {αI, αδ, αε} and a simple computation shows that α = {α, β, γ}. It follows that α ≡ β ≡ γ, or α = β = γ = {α, β, γ} = αH = βH = γH. 1 If G is an additive group with operation +, it is convenient to write A + B instead of AB and g + H instead of gH. In this case, we have A + B = B + A and g + H = H + g. 4.1. GROUPS AND EQUIVALENCE RELATIONS 191 We conclude that in this example there are 2 distinct cosets each with 3 elements. Hence, the quotient set is S3 /A3 = {I, α} = {A3 , αA3 }, and the index of A3 in S3 is [S3 , A3 ] = 2. 2. Consider again the group S3 , but let us fix now the subgroup H = {I, α}. Then: • The equivalence class of I is the coset I = H, hence I ≡ α and I = α; • The equivalence class of β is the coset β = βH = {βI, βα} = {β, ε}, hence β ≡ ε and β = ε; • The equivalence class of γ is the coset γ = γH = {γI, γα} = {γ, δ}, hence γ ≡ δ and γ = δ. We conclude that there are now 3 distinct cosets, each with 2 elements. The quotient set is S3 /H = {I, β, γ} = {H, βH, γH} while the index is [S3 : H] = 3. 3. Obviously the index of the subgroup hmi in Z is the number of elements of Zm , i.e., [Z : hmi] = m. Instead of the binary relation in Definition 4.1.1, we can consider instead the binary relation defined by g1 ≡ g2 (mod H) ⇔ g1 g2−1 ∈ H. We still obtain an equivalence relation (distinct from the previous one, if the group multiplication is not commutative), and the equivalence class of g ∈ G is now given by g = {g ′ ∈ G : g ′ ≡ g} = {g ′ ∈ G : g ′ g −1 = h ∈ H} = {g ′ ∈ G : g ′ = hg, h ∈ H}. We denote this equivalence class by Hg, and call it a right coset of H. The set of right cosets is denoted by H\G(2 ). One may notice that the left and right cosets of a subgroup maybe equal, i.e., Hg = gH, for every g ∈ G, as in Example 4.1.3.1, or distinct, as in Example 4.1.3.2. We leave this easy check for the exercises. If ≡ is an equivalence relation in a set X, we know that the corresponding equivalence classes give a partition of X. In other words, the equivalence classes are distinct, disjoint, subsets of X, and their unions is the set X. 2 By default, unless otherwise stated, we will always use left cosets. When it is clear G . which cosets (left or right) one is using, we may also use the notation H 192 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Of course, if X is a finite set, each equivalence class is also a finite set, and there are only a finite number of distinct equivalence classes. Denoting by X1 , X2 , . . . , Xn the equivalence classes of the equivalence relation ≡ in the set X, we have: |X| = |X1 | + · · · + |Xn | = (4.1.1) n X i=1 |Xi |. This identity is sometimes referred to as the class equation. When X = G and one considers the equivalence relation (mod H), for some subgroup H ⊂ G, the number of elements in each equivalence class is always the same: Proposition 4.1.4. If H is a finite subgroup of G, then |gH| = |Hg| = |H|, for all g ∈ G. Proof. Fix an element g ∈ G. The map φ : H → gH defined by φ(h) = g · h is obviously surjective. On the other hand, by the cancelation law, it is clear that φ is also 1:1: φ(h) = φ(h′ ) ⇒ g · h = g · h′ ⇒ h = h′ . Therefore, φ is a bijection between H and gH, so that |H| = |gH|. Similarly, one shows that |Hg| = |H|. The previous proposition, combined with the class equation (4.1.1), yields: Theorem 4.1.5 (Lagrange). If G is a finite group and H ⊂ G is a subgroup, then: |G| = [G : H]|H|. In particular, both |H| and [G : H] are factors of |G|. Proof. Since both G and H are finite, there exists only a finite number of left cosets so that [G : H] = n. Denote by g1 H, . . . , gn H these left cosets. The class equation (4.1.1) gives |G| = n X i=1 |gi H| = n X i=1 |H| = n|H| = [G : H]|H|. 4.1. GROUPS AND EQUIVALENCE RELATIONS 193 The number of elements of a group G is usually called the order of the group G. Hence, by Lagrange’s Theorem, the order of a finite group G is a multiple of the order of any of its subgroups. Analogously, if g ∈ G is any element, then the order of the element g is the order of the subgroup generated by the element g, i.e., the order of hgi = {g n : n ∈ Z}. It follows from Lagrange’s Theorem that the order of any element of G is also a factor of the order of G. Examples 4.1.6. 1. In Example 4.1.3.1 we find that |G| = 6, |H| = 3, and [G : H] = 2. In the case of Example 4.1.3.2, we find that |G| = 6, but |H| = 2 and [G : H] = 3. 2. If π ∈ S3 , it is easy to see that the order of π can be 3 (δ and ε), 2 (α, β, and γ), or 1 (I). Of course, in any of these cases, the order of π is a factor of the order of S3 . Note that 6 is also a factor of the order of S3 , but there is no element in S3 of order 6. 3. In the study of the rings Zm we saw that the subgroups of Zm take the form hdi, where d is a divisor of m. Obviously, in this case, the number of elements 3 of the subgroup hdi is m d , which is a factor of m. Note also that d = [Zm : hdi] . 4. If (A, +, ·) is any ring with identity 1, the order of the additive subgroup generated by 1 is precisely the characteristic of the ring A. Hence, for a finite ring A its characteristic is a always a factor of the number of elements of A. In many cases it is important to study the factorization of groups into smaller pieces, i.e., to establish conditions for a group G to factor as a direct product of groups K and H. These two groups then are more “elementary blocks”, which can lead to a better knowledge of the structure of the group, an idea which we will pursue in the next chapter. We give now some results in this direction. Their application is often made easier by Lagrange’s Theorem. Lemma 4.1.7. Let G be a group with identity e. If K and H are normal subgroups in G such that K ∩ H={e}, then kh = hk for any k ∈ K and h ∈ H. Proof. Let k ∈ K and h ∈ H. Consider the element k−1 h−1 kh. We have that h−1 kh ∈ K, since K is a normal subgroup of G. Hence, k−1 h−1 kh ∈ K. Similarly, we have that k−1 h−1 k ∈ H, since H is a normal subgroup, and 3 When we refer to the “group Zm ”, with no further qualitative, we always mean the additive group (Zm , +). 194 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS hence k−1 h−1 kh ∈ H. But k−1 h−1 kh ∈ K ∩ H = {e}, so we conclude that k−1 h−1 kh = e, i.e., that kh = hk. Theorem 4.1.8. Let H and K be subgroups of G, and assume that: (i) H and K are normal in G; (ii) G = HK, and (iii) H ∩ K = {e}. Then G ≃ H × K. Proof. Recall that the group H × K as underlying set the Cartesian product H × K = {(h, k) : h ∈ H, k ∈ K}, and binary operation (h1 , k1 )(h2 , k2 ) = (h1 h2 , k1 k2 ). Define φ : H × K → G by φ(h, k) = hk. Using Lemma 4.1.7, one checks easily that φ is a group homomorphism. The assumption G = HK, implies immediately that φ onto. In order to determine the kernel of φ, observe that if φ(h, k) = e, then hk = e. This means that h = k−1 , hence we conclude that h, k ∈ H ∩ K = {e}. It follows that h = k = e, so the kernel of φ consists of (e, e). Hence, φ is a group isomorphism and G ≃ H × K. As we have mentioned above, in the case of finite groups Lagrange’s Theorem is sometimes useful in verifying the assumptions of the previous theorem. For example, if G is a finite group, then |H ∩ K| is a factor of |H| and |K|. Hence, if |H| and |K| happen to be relative primes then we must have |H ∩ K| = 1. If this is the case, the homomorphism φ used in the proof of the theorem is 1:1 and we can conclude that |HK| = |H||K|, so we can check if HK = G. Here is an example in the case of abelian groups, where any subgroup is obviously a normal subgroup: Example 4.1.9. In the group Z6 consider the subgroups H = {0, 3} and K = {0, 2, 4}. We know that H is isomorphic to Z2 and K is isomorphic to Z3 . We have then H ∩ K = {0}, and |G| = |H||K|, so we conclude that: Z6 ≃ Z2 ⊕ Z3 . More generally, suppose that gcd(n, d) = 1 and recall Proposition ??: if m = nd, the group Zm has subgroups B ≃ Zn and C ≃ Zd . Since |B ∩ C| is a factor of both n and d, it follows immediately that |B ∩ C| = 1. Hence, |B + C| = |B||C| = nd = m = |Zm | and so B + C = Zm . We conclude that Zm ≃ B ⊕ C ≃ Zn ⊕ Zd . 4.1. GROUPS AND EQUIVALENCE RELATIONS 195 The theorem above can be generalized to direct products with more than two factors. The proof is analogous and is left as an exercise: Theorem 4.1.10. Let H1 , . . . , Hn be subgroups of a group G such that: (i) Each Hi is normal in G; (ii) G = H1 · · · Hn , and (iii) Hi ∩ n Q k=1 k6=i Hk = {e}, for i = 1, . . . , n. Then: G ≃ H1 × · · · × Hn . Exercises. 1. Proof Proposition 4.1.2. 2. Show that {gH : g ∈ G} 6= {Hg : g ∈ G}, when G = S3 and H = {I, α}. 3. Show that if e is the identity of G, then g ≡ e (mod H) if and only if g ∈ H. 4. Find the quotient set G/H when G = Z6 and H = h2i. 5. Find the quotient set G/H when G = S4 and H = h(1234)i. 6. Determine the order of every element in the groups S3 , (Z6 , +) and (Z∗9 , ·). 7. Let G = Sn and H = An . Show that π ≡ σ (mod H) if and only if π and σ are permutations with the same parity. Conclude that [Sn : An ] = 2. 8. Show that the function φ : G/H → H\G given by φ(gH) = Hg −1 is a well defined bijection. Conclude that [G : H] coincides also with the number of right cosets. 9. Show that if K ⊂ H, where K and H are subgroups of a finite group G, then [G : K] = [G : H][H : K]. 10. If A, B and C are any subsets of a group G, show that: (a) (AB)C = A(BC). (b) AA = A when A is a subgroup of G. (c) If A and B are subgroups of G, then AB = BA if and only if AB is a subgroup of G. 196 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS 11. Show that, if A and B are finite subgroups of G, then |AB||A ∩ B| = |A||B|. (Hint: Use the fact that A ∩ B is a subgroup of both A and B.) 12. Show that any permutation of S3 is of the form π = αn εm , without explicitly computing these powers. 13. If |G| = p, where p is a prime, what are the subgroups of G, and what is the order of the elements of G? 14. Give an example of an infinite group, where all elements have order 2, with the exception of the identity. 15. Give an example of an infinite group, where all elements have finite order, but such that for each n ∈ N it has elements of order n. 16. Prove Theorem 4.1.10. 17. Show the following converse of Theorem 4.1.8: if G is a group isomorphic to the direct productH × K, then G has normal subgroups H ′ and K ′ such that H ′ K ′ = G and H ′ ∩ K ′ = {e}, where e is the identity in G. 18. Let A be a finite ring with identity. Show that the characteristic of A is a factor of |A|. 19. Classify the rings with unit with 2, 3, 4 and 5 elements. (Hint: use the previous exercise.) 20. Let G be an abelian group with 9 elements, which has no element of order 9. Show that G ≃ Z3 ⊕ Z3 . 21. Show that if G is a finite group where all elements, except the identity, have order 2, then G is abelian. What can you say if all elements distinct from the identity have order 3? 22. If G is a group with order 2n, show that there exists at least one element in G which has order 2. (Hint: If x ∈ G, define C(x) := {x, x−1 }. Show that the binary relation x ∼ y ⇔ C(x) = C(y) is an equivalence relation and that C(x) is the equivalence class of x). 4.2. QUOTIENT GROUP AND QUOTIENT RING 4.2 197 Quotient Group and Quotient Ring The ring operations in Zm were defined from the ring operations in Z. When can one generalize the procedure adopted for Zm , to define operations in the quotient G/H, from given operations in G? We will see that this is possible only under some restrictions in the subgroup H. In fact, as we will show next, it is possible to define a group operation in the quotient G/H, provided H is a normal subgroup of G 4 . Recall that in the case of Zm , the addition operation was defined because of the following property of addition in Z: x ≡ x′ (mod m) (4.2.1) If , thenx + y ≡ x′ + y ′ (mod m). y ≡ y ′ (mod m) Indeed, this implies that for any x and y in Zm , we can define x + y := x + y without any ambiguity caused by a choice of representatives x and y for each of the equivalence classes. The following example shows that property (4.2.1) does not generalize for congruence (mod H). Example 4.2.1. If G = S3 and H = {I, α}, we can have g1 ≡ g1′ (mod H) and g2 ≡ g2′ (mod H) without having g1 g2 ≡ g1′ g2′ (mod H): for example, take g1 = g1′ = α and g2 = γ, g2′ = δ; then γ ≡ δ (mod H), but αγ = ε and αδ = γ are not equivalent (mod H). In fact, property (4.2.1) generalizes only in the case of normal subgroups: Proposition 4.2.2. Let H be a subgroup of G. The following statements are equivalent: (i) H is a normal subgroup of G; (ii) gHg−1 = H for all g ∈ G; (iii) (g1 H)(g2 H) = (g1 g2 )H for all g1 , g2 ∈ G; (iv) if g1 ≡ g1′ (mod H) and g2 ≡ g2′ (mod H), then g1 g2 ≡ g1′ g2′ (mod H) for all g1 , g2 , g1′ , g2′ ∈ G. 4 Note that this restriction has no consequences for Zm : since (Z, +) is an abelian group, any of its subgroups is a normal subgroup. 198 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Proof. Let us show that (i) ⇒ (ii) ⇒ (iii) ⇒ (iv) ⇒ (i). (i) ⇒ (ii): From the definition of a normal subgroup we have: gHg−1 ⊂ H. Since g is arbitrary, we can replace g by g−1 obtaining g−1 Hg ⊂ H. Note also that g −1 Hg ⊂ H =⇒ g(g −1 Hg)g−1 ⊂ gHg−1 =⇒ H ⊂ gHg−1 . Since we know that gHg−1 ⊂ H, we conclude that gHg−1 = H. (ii) ⇒ (iii): Since gHg−1 = H, one sees immediately that (gHg−1 )g = Hg, i.e, gH = Hg. Hence, we find: (g1 H)(g2 H) =((g1 H)g2 )H = (g1 (Hg2 ))H = (g1 (g2 H))H =((g1 g2 )H)H = (g1 g2 )(HH) = (g1 g2 )H. (iii) ⇒ (iv): g1 ≡ g1′ (mod H) and g2 ≡ g2′ (mod H) ⇒ g1′ ∈ g1 H and g2′ ∈ g2 H. Hence g1′ g2′ ∈ (g1 H)(g2 H). Since(g1 H)(g2 H) = (g1 g2 )H, we find g1′ g2′ ∈ (g1 g2 )H ⇔ g1 g2 ≡ g1′ g2′ (mod H). (iv) ⇒ (i): If g ∈ G and h ∈ H, we must show that ghg −1 ∈ H. For this, observe that gh ≡ g (mod H) and g −1 ≡ g−1 (mod H). Hence we must have ghg−1 ≡ gg −1 = e (mod H). It follows that ghg−1 ∈ H as claimed. By the previous proposition, if H is a normal subgroup of G, then (g1 H)(g2 H) = (g1 g2 )H defines a binary operation in G/H. It is easy to check that: Theorem 4.2.3. If H is a normal subgroup of G, then G/H is a group for the binary operation defined by (g1 H)(g2 H) = (g1 g2 )H. Moreover, the quotient map π : G → G/H defined by π(g) = g = gH is a surjective group homomorphism with kernel N (π) = H. 4.2. QUOTIENT GROUP AND QUOTIENT RING 199 Proof. We saw above that we have a well defined binary operation G/H × G/H → G/H given by (g1 H, g2 H) → (g1 H)(g2 H) = (g1 g2 )H. This binary operation is associative since: ((g1 H)(g2 H)) (g3 H) = ((g1 g2 )H) (g3 H) = ((g1 g2 )g3 ) H = (g1 (g2 g3 )) H = (g1 H) ((g2 g3 )H) = (g1 H) ((g2 H)(g3 H)) . If e is the identity in G, obviously eH = H and (gH)H = H(gH) = gH. Hence, H is the identity in G/H. It is also clear that (gH)(g−1 H) = (g−1 H)(gH) = eH = H. This means that every element in G/H has an inverse. This shows that G/H is a group. Let π : G → G/H be defined by π(g) = g = gH. It is immediate to check that π(g1 )π(g2 ) = (g1 H)(g2 H) = (g1 g2 )H = π(g1 g2 ), so π is a group homomorphism. Finally, since H is the identity in the group G/H, we have that the kernel of π is N (π) = {g ∈ G : gH = H} = H. Examples 4.2.4. 1. Let G = S3 . The subgroup H = A3 is normal, so that G/H = {I, α} is a group. We easily find the multiplication table: I I = α α = I, I α = α I = α. 2. When G = Z6 and H = h2i = {0, 2, 4}, we find G/H = {0, 1}, where: 0 = 0 + h2i = {0, 2, 4} 1 = 1 + h2i = {1, 3, 5}. In this case the multiplication table is given by 0 + 0 = 1 + 1 = 0, 0 + 1 = 1 + 0 = 1. Obviously, the groups S3 /A3 and Z6 /{0, 2, 4} are isomorphic (there exists only one group with two elements!). 200 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS We leave as an exercise the proof of the following result: Theorem 4.2.5. If H is a normal subgroup of G, then the subgroups (respectively, normal subgroups) of G/H take the form K/H, where H ⊆ K ⊆ G, and K is a subgroup (respectively, normal subgroup) of G. Let us turn now to the study of quotient rings, which is slightly more involved. On the one hand, if B ⊂ A is a subring of A, then (B, +) is a subgroup of (A, +), so we can form the quotient group A/B: this follows immediately from the fact that (A, +) is an abelian group, so any of its subgroups is normal. The “addition” in A/B is given by: a1 + a2 = a1 + a2 , or (a1 + B) + (a2 + B) = (a1 + a2 )B. This does not imply that A/B is a ring, since we need also a “multiplication” in A/B, such that the axioms of a ring are satisfied. Let us recall that in the case of Zm the definition of multiplication relied on the following property: x ≡ x′ (mod m) (4.2.2) If , then xy ≡ x′ y ′ (mod m). y ≡ y ′ (mod m) This property shows that given elements x and y in Zm , the binary operation x · y = xy is well defined independent of the choice of representatives x and y for each equivalence class. This suggests that for an arbitrary ring A with a subring B ⊂ A we should set: a1 a2 = a1 a2 or (a1 + B)(a2 + B) = a1 a2 + B. However, this definition is consistent only if whenever a1 ≡ a′1 (mod B) and a2 ≡ a′2 (mod B) we also have a1 a2 ≡ a′1 a′2 (mod B). This is clarified by the next result: Proposition 4.2.6. Let B be a subring of A and set a ≡ a′ (mod B) if and only if a′ − a ∈ B. Then the following statements are equivalent: (i) B is an ideal in A; (ii) If a1 ≡ a′1 (mod B) and a2 ≡ a′2 (mod B) then a1 a2 ≡ a′1 a′2 (mod B), for any a1 , a′1 , a2 , a′2 ∈ A. 4.2. QUOTIENT GROUP AND QUOTIENT RING 201 Proof. We will show that both implications (i) ⇒ (ii) and (ii) ⇒ (i), hold. (i) ⇒ (ii): If a1 ≡ a′1 (mod B) and a2 ≡ a′2 (mod B), then a′1 = a1 + b1 and a′2 = a2 + b2 , where b1 , b2 ∈ B. Hence, a′1 a′2 = (a1 + b1 )(a2 + b2 ) = a1 a2 + a1 b2 + a2 b1 + b1 b2 . It is clear that b1 b2 ∈ B, because B is a subring, and a1 b2 , a2 b1 ∈ B, because B is an ideal. We conclude that a′1 a′2 = a1 a2 +b, where b = a1 b2 + a2 b1 + b1 b2 ∈ B, and hence a1 a2 ≡ a′1 a′2 (mod B). (ii) ⇒ (i): We must show that, if a ∈ A and b ∈ B, then ab, ba ∈ B. For that, we observe that b ∈ B if and only if b ≡ 0 (mod B), where 0 denotes the zero in the ring A. According to (ii), we then have ab ≡ a0 (mod B) and ba ≡ 0a (mod B), or equivalently ab ≡ 0 (mod B) and ba ≡ 0 (mod B). We conclude that ab, ba ∈ B, so B is an ideal. We can now state an analogue of Theorem 4.2.3, for rings. We leave its proof as an exercise. Theorem 4.2.7. If I ⊂ A is an ideal in a ring A, then A/I is a ring with the operations: a1 + a2 := a1 + a2 a1 · a2 := a1 · a2 . If A is abelian (respectively, with identity 1), then A/I is abelian (respectively, with identity 1). Moreover, the quotient map π : A → A/I defined by π(a) = a = a + I is a ring homomorphism with kernel N (π) = I. The next examples show that often properties of the ring A do not pass to the quotient. Examples 4.2.8. 1. The rings Zm are examples of the situation described in the theorem, where A = Z and I = hmi, where m is a fixed natural number. In this case, A is always an integral domain, while the quotient A/I is not, if m is not a prime number: if d is a factor of m then d is a zero divisor in Zm . 2. Let A = Q[x] and I = hm(x)i where m(x) = x2 + 1. Given any p(x) ∈ Q[x], the division algorithm shows that there exists q(x) ∈ Q[x] such that p(x) = q(x)(x2 + 1) + (a + bx), (a, b ∈ Q). Since obviously q(x)(x2 + 1) ∈ I, we conclude that p(x) ≡ a + bx, i.e., p(x) = a + bx, so that Q[x] = {a + bx : a, b ∈ Q}. hx2 + 1i 202 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS One can find easily the operations in this ring. For the sum we have: a + bx + a′ + b′ x = a + bx + a′ + b′ x = (a + a′ ) + (b + b′ )x. To find the product we observe first that x2 + 1 ≡ 0, i.e., that x2 = −1. Therefore: (a + bx)(a′ + b′ x) = (a + bx)(a′ + b′ x), = aa′ + (ba′ + ab′ )x + bb′ x2 , = (aa′ − bb′ ) + (ba′ + ab′ )x. In order to make the notation lighter, let us right a instead of a, and i instead of x (note that a = b if and only if a = b). With these conventions, the two operations can be written: (a + bi) + (a′ + b′ i) = (a + a′ ) + (b + b′ )i, (a + bi)(a′ + b′ i) = (aa′ − bb′ ) + (ab′ + a′ b)i. It should now be clear that Q[x]/hx2 + 1i is actually isomorphic to Q[i], a “coincidence” which will be explained later. Observe also that Q[x]/hx2 + 1i is a field extension of Q, while the original ring Q[x] was not a field. In this field the polynomial x2 + 1 has roots and is reducible. 3. The previous example, extends to rings of polynomials over finite fields. Let A = Z2 [x] and I = hx2 + x + 1i. Just as before, if p(x) ∈ Z2 [x] there exists q(x) ∈ Z2 [x] such that p(x) = q(x)(x2 + x + 1) + (a + bx). Therefore, once more p(x) ≡ a + bx, i.e., p(x) = a + bx. In this case, this leads to a finite ring with 4 elements: hx2 Z2 [x] = {a + bx : a, b ∈ Z2 } = {0, 1, x, 1 + x}. + x + 1i Again, lets us write a instead of a, and α instead of x, where 1 + α + α2 = 0, or still α2 = −1 − α = 1 + α. (Note that since a = −a in Z2 , we also have a = −a in the quotient ring). We can then exhibit the tables for addition and multiplication in the quotient ring (we write β = α2 = 1 + α for convenience): + 0 1 α β 0 0 1 α β 1 1 0 β α α α β 0 1 β β α 1 0 · 0 1 α β 0 0 0 0 0 1 0 1 α β α 0 α β 1 β 0 β 1 α These are the same tables as the field with 4 elements mentioned in an exercise in Chapter ??. It is a field extension of Z2 , and in this field the polynomial x2 + x + 1 has roots and is reducible. 4.2. QUOTIENT GROUP AND QUOTIENT RING 203 In the exercises one is asked to check that, in general, the quotient ring K[x]/hm(x)i is always an extension of the field K and also a vector space over K of dimension n, where n is the degree of the polynomial m(x)). In the last two examples above the quotient ring is actually a field. On the other hand, we know that the rings Zm are fields when m is a prime, and this occurs precisely when hmi is a maximal ideal in Z. These facts are indeed related: Theorem 4.2.9. If A is an abelian ring with identity and I ( A is an ideal, the quotient A/I is a field if and only if I is a maximal ideal in A. Proof. Let us assume first that I is a maximal ideal in A. We need to show that A/I has an identity 1 6= 0, and that if a 6= 0, there exists x ∈ A such that ax = 1. Let us observe first that 1 6∈ I, i.e., 1 6= 0, for otherwise we would have I = A. Next, given a 6= 0, i.e., a 6∈ I, consider the set J = {ax + b : x ∈ A and b ∈ I}. Obviously, a ∈ J and so J 6= I. It is also obvious that I ⊂ J. Since A is abelian, one checks immediately that J is an ideal in A. Since I is assumed to be a maximal ideal we must have J = A. Therefore, we have 1 ∈ J which means that there exists x ∈ A and b ∈ I such that 1 = ax + b, or ax = 1. Conversely, assume now that A/I is a field and let π : A → A/I be the quotient map. If J ) I is an ideal in A containing I, then {0} = 6 π(J) ⊂ A/I is an ideal. It follows that there exists a ∈ J such that a ≡ 1, i.e., 1 = a + b, for some b ∈ I. Since I ⊂ J, we conclude that 1 ∈ J, so that J = A. This shows that I is a maximal ideal. Whenever D is an integral domain, the polynomial ring D[x] is also an integral domain, and m(x) is irreducible if and only if hm(x)i is a maximal ideal maximal among all principal ideals in D[x] (see Proposition ??). When D = K is a field, all ideals in K[x] are principal, so the theorem gives: Corollary 4.2.10. If K is a field, the quotient ring K[x]/hm(x)i is a field if and only if m(x) is an irreducible polynomial in K[x]. This corollary explains why in the examples above the quotient rings are fields. As in those examples, this corollary can be used to obtain field extensions of known fields and, in particular, to construct new fields. On the other hand, Theorem 4.2.9 can be used to define the real numbers from the rationals numbers and to verify that the axiomatics of the real numbers is also a consequence of the axioms of the integers, presented in Chapter ??. We shall do this in the next section, where we will also present a formal 204 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS definition of the complex numbers, identified as the quotient of R[x] by the ideal hx2 + 1i. Exercises. 1. Show that if H is a subgroup of G and [G : H] = 2, then H is a normal subgroup of G. 2. Show that N is a normal subgroupo of G if and only if there exists a group H and a homomorphism φ : G → H whose kernel is N . 3. Let N be a normal subgroup of G, and π : G → G/N the quotient map. (a) Show that if N ⊂ H ⊂ G where H is a subgroup of G, then N is a normal subgroup of H, and H/N is a subgroup of G/N . (b) Show that the subgroups of G/N take the form H/N , where H is a subgroup of G which contains N . 4. Let N be a normal subgroup of G and x ∈ G. (a) Assume that the order of x in G is finite and equal to m. Show that the order of x in G/N is finite and divides m. (b) Find examples where x has infinite order in G and x in G/N has (i) finite order and (ii) infinite order. 5. Let A be a ring and I an ideal in A, verify that the product in the quotient ring A/I, which is defined as (a1 + I)(a2 + I) = a1 a2 + I, does not in general correspond to the product of sets, which was defined as CD = {cd : c ∈ C and d ∈ D}. 6. Prove Theorem 4.2.7. 7. Show that the function φ : Q ⊕ Q → Q[x]/hx2 + 1i, given by φ(a, b) = ax + b, is a bijection. 8. Find the tables for addition and multiplication of the quotient ring Z2 [x]/hx2 + 1i. Check that this ring is not a field. Why doesn’t this contradict Theorem 4.2.9? 9. Consider the ring Q[x]/hm(x)i, where m(x) = x6 + x4 + x2 + 1. Determine the inverse of x + 2. Check if this ring has any zero divisors and if yes, give an example of one such element. 10. Show that Q[x]/hx2 − 3x + 2i is isomorphic to Q ⊕ Q. Hint: Show that the map φ : Q[x]/hx2 − 3x + 2i → Q ⊕ Q given by φ(p(x)) = (p(1), p(2)) is well defined and yields an isomorphism of rings. 4.3. REAL AND COMPLEX NUMBERS 205 11. Let L = Z2 [x]/hx2 + x + 1i. Find the factorization of the polynomial x2 + x + 1 in L[x]. 12. Let m(x) be an irreducible polynomial of degree n in K[x], and let L = K[x]/hm(x)i. Show that: (a) L is a field and a vector of dimension n over K; (b) The field L is an extension of K; (c) m(x) has at least one root in L; (d) There exists a field extension of K, where m(x) is a product of factors of degree 1. 13. Show that the polynomial x3 + x2 + 1 is irreducible in Z2 [x]. Use this fact to give an example of a field L with 8 elements. Find the factorization of the polynomial x3 + x2 + 1 in L. 14. Let I ⊂ A be an ideal, and π : A → A/I the quotient map π(a) = a. Show that J ⊂ A/I is an ideal in A/I if and only if J = π(J), where J is an ideal in A and I ⊂ J. 15. Find all the ideals of Z2 [x]/hx2 + 1i. 16. Show that a non-abelian group G with 6 elements is isomorphic to S3 , by showing that: (a) G has an element x of order 3 and H =< x > is normal in G. (b) G has an element y of order 2 and y 6∈ H. (c) Show that yx = xy 2 (use yx ∈ xH). Conclude that G ≃ S3 . 4.3 Real and Complex Numbers It is more or less obvious that the rational numbers can by represented by points in a line: to determine which point corresponds to a given rational number one just needs to fix two points in the line that correspond to the numbers 0 and 1. This correspondence was of course known to the Ancient Greeks, who also discovered an interesting phenomena: although to each rational number corresponds a point in the line, there are points in the line which do not correspond to any rational number. The Greeks interpreted this as mistake by the Gods, since it suggested that the rational numbers, a byproduct of the natural numbers, were in some sense insufficient. In fact, for some time they even tried to hide this phenomena from general 206 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS knowledge, afraid that it could cause the wrath of the Gods. Actually, as we shall see in this section, the Greeks were wrong since the real numbers, which correspond to all points in the line, can be defined from the rational numbers, and therefore also from the natural numbers. In modern language, the “insufficiency” of the rational numbers can be expressed in terms of Cauchy sequences. We start by recalling its definition, adapted to the special case of the rational numbers. Definition 4.3.1. Let x = (x1 , x2 , . . . ) be a sequence in Q. The sequence is called: (a) bounded, if there exists M ∈ Q such that |xn | ≤ M, ∀n ∈ N. (b) convergent in Q with limit l ∈ Q, if ∀ε ∈ Q+ , ∃N ∈ N : n ≥ N =⇒ |xn − l| < ε. (c) Cauchy, if ∀ε ∈ Q+ , ∃N ∈ N : n, m ≥ N =⇒ |xn − xm | < ε. As usual, if x = (x1 , x2 , . . . ) is a convergent sequence with limit l, then one writes xn → l or limn→∞ xn = l. One has the usual results about convergence of sums, products and differences of convergent sequences. Also, it is easy to show that in Q: (i) Every convergent sequence is Cauchy, and (ii) Every Cauchy sequence is bounded. On the other hand, there are Cauchy sequences in Q which are not convergent, as illustrated in the following example: Example 4.3.2. Consider the map f : Q → Q defined by f (x) = that f (x) > 1, because (x − 1)2 + 1 > 0 =⇒ =⇒ =⇒ x2 +2 2x . If x > 0, we observe x2 − 2x + 2 > 0, x2 + 2 > 2x, x2 + 2 > 1. 2x 4.3. REAL AND COMPLEX NUMBERS 207 If x, y > 0, we also observe that f (x) − f (y) = (xy − 2)(x − y) xy − 2 x − y = . 2xy xy 2 If, additionally, x, y ≥ 1, it is easy to check that −1 ≤ g(z) = 1 − 2z is increasing for z > 0. Therefore, |f (x) − f (y)| ≤ xy−2 xy < 1, since 1 |x − y|. 2 Now let {xn }n∈N be the sequence in Q defined by x1 = 1, and xn+1 = f (xn ) se n ∈ N. For n > 1 we have: |xn+1 − xn | = |f (xn ) − f (xn−1 )| ≤ 1 |xn − xn−1 |, 2 so that 1 |x2 − x1 |. 2n−1 We leave as an exercise to verify that if m > n then |xn+1 − xn | ≤ |xm − xn | ≤ 1 |x2 − x1 |, 2n−2 so the sequence {xn }n∈N is a Cauchy sequence. Although this sequence is a Cauchy sequence, it is not convergent in Q. In fact, we have xn+1 = f (xn ) =⇒ 2xn+1 xn = x2n + 2, so that if xn → x, then 2x2 = x2 + 2, or x2 = 2, an equation that we know has no solutions in Q. Although the sequence in the previous example is not√convergent in Q, it is obviously convergent in R to the irrational number 2. You probably know that every Cauchy sequence in Q converges to a real number, which may or may not be a rational number. This leads to the following motto: • Every Cauchy sequence in Q determines a real number. 5 . Of course, distinct Cauchy sequences can determine the same real number, i.e., can have the same limit. This happens precisely when the difference of the two sequences converges to zero. In other words: 5 You should compare this with the idea used to define the rational numbers from the integers: each pair (m, n) of integers with n 6= 0 determines a rational number 208 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS • The Cauchy sequences {xn }n∈N and {yn }n∈N determine the same real number if and only if (xn − yn ) → 0. We define two Cauchy sequences, {xn }n∈N e {yn }n∈N , to be equivalent if (xn − yn ) → 0. This leads to the main idea we will use to define the real numbers out of the rational numbers: a real number is an equivalence classe of Cauchy sequences in Q. In order to pursue this idea6 , we need the following proposition, which is in fact a special case of the general theory developed in the previous section. Its proof is left as an exercise. Theorem 4.3.3. Let A be the set of all sequences of rational numbers. Then: (i) The set A with the operations of addition and product of sequences is a ring. (ii) The subset B ⊂ A consisting of all Cauchy sequences in Q is a subring of A. (iii) The set I formed by all sequences in Q which converge to 0 is a subring of A and an ideal in B. If x, y ∈ B are Cauchy sequences in Q, then x and y determine the same real number if and only if x − y converges to 0, i.e., if and only if x − y ∈ I. This suggests: Definition 4.3.4 (Cantor). The ring B/I is denoted by R. Its elements (which are just equivalence classes of Cauchy sequences in Q) are called real numbers. It should be clear that R is an extension of the ring Q: given any rational q ∈ Q, we can take the constant sequence q given by qn = q, n ∈ N (obviously a Cauchy sequence), and the map ι : Q → R given by ι(q) = q is an injective homomorphism. Observe also that the zero in R is the equivalence class of the zero sequence (the ideal I), and the identity is the equivalence class of the constant sequence with all terms equal to 1. Of course, any sequence of rationals converging to 0 also represents I = 0, and any sequence of rationals converging to 1 is a representative of 1. 6 This way of defining the real numbers is due to Georg Cantor (1845-1918), a German mathematician who also made fundamental discoveries in Set Theory, including the theory of “transfinite numbers”. 4.3. REAL AND COMPLEX NUMBERS 209 In order to show that R is a field (which amounts to show that I is a maximal ideal in B), one needs to show that if x ∈ R − {0} then there exists y ∈ R such that xy = 1. In terms of Cauchy sequences in Q, this means we need to show that: Proposition 4.3.5. If x is a Cauchy sequence in Q which does not converge to 0, then there exists a Cauchy sequence y in Q such that xn yn → 1. Proof. Let x be a Cauchy sequence in Q which does not converge to 0. We leave as an exercise to show that there exists a rational δ > 0 and a natural number N ∈ N such that |xn | > δ for n ≥ N . Next, we define a sequence y ∈ Q by yn = 0, 1 xn , se n ≤ N se n > N. Notice that if n > N then |yn | = | x1n | ≤ 1δ , hence for n, m > N we have |ym − yn | = 1 |xn − xm | ≤ 2 |xn − xm | → 0, |xn xm | δ This shows that y is a Cauchy sequence in Q. Since xn yn = 1 for n > N , it is obvious that xn yn → 1. Next we show that R is an ordered field. For that we need to define a set R+ such that: 1. x, y ∈ R+ ⇒ x + y ∈ R+ and xy ∈ R+ ; 2. If x ∈ R+ , exactly one of the following 3 conditions hold: x ∈ R+ or x = 0, or − x ∈ R+ . After a little thought, one sees there is really just one way to proceed: Definition 4.3.6. If x ∈ R (i.e., x is a Cauchy sequence in Q), we say that x is positive if there exists a rational number ε > 0 and N ∈ N, such that n > N ⇒ xn ≥ ε. The set of positive real numbers is denoted by R+ . It is now easy to show that Theorem 4.3.7. R is an ordered field. 210 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS It follows that, as it was explained in Chapter ?? in our discussion of ordered fields, that one can define |x| = max{x, −x} for any x ∈ R. If q ∈ Q is a rational number we will denote by q the constant sequence where qn = q for all n ∈ N. This is both a Cauchy sequence and a convergent sequence in Q. We denote by q the corresponding real number (the equivalence class determined by q). As we have already mentioned, the map f : Q → R given by f (q) = q is a ring homomorphism, which allows us to say that the field R is an extension of the field Q. In Analysis we learn that any real number can be approximated by a rational number, to an arbitrary small error, i.e., that “Q is dense in R”. In our approach to the real number, this idea is formalized precisely as follows: Proposition 4.3.8. If x and ε are real numbers and ε > 0, then there exists a rational q ∈ Q such that |x − q| < ε. Proof. We start by choosing representatives for x and ε, i.e., Cauchy sequences x = (x1 , x2 , . . . ) and ε = (ε1 , ε2 , . . . ) in Q. Since ε > 0, there exists a rational number r > 0 such that εn ≥ r for all n ≥ N1 , where N1 ∈ N. We can now obtain the sought “rational number” q by replacing x by a constant sequence whose general term is one of the terms of x of high enough order. In fact, since x is a Cauchy sequence, there exists N2 ∈ N such that r n, m ≥ N2 =⇒ |xn − xm | < . 2 If we let q = xN2 , a rational number, we have that 2r < xn − q ≤ 2r for n ≥ max{N1 , N2 }. Hence, we conclude that −r < x − q < r, and this implies that |x − q| < ε. The basic properties of the real numbers, which form the foundations of Analysis, are usually introduced in an axiomatic form. The usual minimal set of axioms that one considers includes only the axioms that R is an ordered field and a completeness axiom, such us the Least Upper Bound Axiom or Supremum Axiom. This is invoked, for example, to show that every Cauchy sequence in R is convergent, contrary to what happens in Q. Our approach to the real numbers in a constructive (as opposed to an axiomatic) approach. We have already showed that R is an ordered field and it remains to show that the “Supremum Axiom” is another consequence of our definition. For this, one shows directly that every Cauchy sequence in R is convergent: Theorem 4.3.9. Every Cauchy sequence in R is convergent7 . 7 For this reason, we say that R is a complete field. 4.3. REAL AND COMPLEX NUMBERS 211 We leave the proof as an exercise. Using this result one can show the “Supremum Axiom” holds in R. Corollary 4.3.10 (Supremum Axiom). Every non-empty subset of R having an upper bound must have a least upper bound (a supremum). Proof. Assume A ⊂ R is a non-empty set having an upper bound: this means there exists M ∈ R such that x ≤ M , for all x ∈ A. One defines a sequence in R, by successive bisections, a common method in Analysis. We let x1 = M . Since A 6= ∅, there exist a ∈ A and we let a1 := a.Obviously, a1 ≤ x1 , 1 so we can set a2 := a1 +x 2 . The number a2 splits the interval [a1 , x1 ] into 2 subintervals of equal length. There are two possibilities: (i) There exists x ∈ A such that x > a2 . In this case we let x2 := x1 ; (ii) One has x ≤ a2 for all x ∈ A. In this case, we let x2 := a2 . We repeat this procedure, obtaining a sequence of real numbers which is easily shown to be a Cauchy sequence. By Theorem 4.3.9, this sequence converges and one shows easily that its limit is a least upper bound of A. This finishes the proof that the real numbers can be constructed from the rational numbers (and hence, from the integers) and that its properties are a consequence of the axioms for the integers that were discussed in Chapter ??. One can construct the complex numbers from the real numbers with not so much difficulty: since R is an ordered field, it is obvious that the polynomial x2 + 1 is irreducible in R[x]. Therefore the ring C= R[x] hx2 + 1i is a field, called the field of complex numbers. The imaginary unit i is, by definition, the equivalence class of the polynomial x. It clearly satisfies the identity i2 = −1 in C. We will not discuss in detail the properties of C, but one can show that C is also a complete field. Exercises. 1. Let A be an ordered field. Show that any convergent sequence in A is a Cauchy sequence and that any Cauchy sequence is bounded. 212 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS 2. Show that if one sets x1 := 1 and xn+1 := f (xn ), where f is the function in 1 |x2 − x1 |. Example 4.3.2, then |xn − xm | ≤ 2n−2 3. Show that the set of all Cauchy sequences in Q is a subring of the ring of all sequences in Q. 4. Show that the set of all sequences in Q which converge to 0 is an ideal in the ring of all Cauchy sequences in Q. 5. Let x be a Cauchy sequence in Q. Show that the following statements are equivalent: (a) x does not converge to 0; (b) there exists a rational number ε > 0 and subsequence xnk such that |xnk | ≥ ε for k large enough; (c) there exists a rational number d > 0 such that |xn | ≥ d for n large enough. 6. Let x, y ∈ R. (a) Show that if x, y ∈ R+ , then x + y ∈ R+ and xy ∈ R+ . (b) Show that either x ∈ R+ , x = 0, or −x ∈ R+ , but none of these hold simultaneously. 7. Give a proof of Theorem 4.3.9 and finish the proof of Corollary 4.3.10. 8. Show that the order of the real numbers is unique, i.e., show that if R is an ordered field then x ∈ R+ if and only if there exists y ∈ R − {0} such that x = y2. 9. Show that R is not countable. Conclude that R is a transcendental extension of Q (and a vector space of infinite dimension over Q). 10. Show that if x is a real number and 0 ≤ x < 1, then there exists a sequence a1 , a2 , . . . such that 0 ≤ an ≤ 9 for all n ∈ N and x= 11. Show that C is a complete field. ∞ X an . 10n n=1 4.4. ISOMORPHISM THEOREMS FOR GROUPS 4.4 213 Isomorphism Theorems for Groups Let G and H be groups and let K ⊆ G be a normal subgroup. It is natural to ask what is the relationship between the group homomorphisms φ : G → H and the group homomorphisms φ̃ : G/K → H. On the one hand, since the quotient map π : G → G/K, given by π(x) = x = xK is a group homomorphism, it is obvious that for any group homomorphism φ̃ : G/K → H the composition φ := φ̃ ◦ π : G → H is a group homomorphism and that φ̃(x) = φ(x). On the other hand, given a group homomorphism φ : G → H one may wonder if there exists some homomorphism φ̃ : G/K → H, such that φ̃(x) = φ(x). This is illustrated by the following commutative diagram, where the dot arrow is used to indicate that we seek the existence of the corresponding homomorphism: G π G/K φ 7/ H ♦♦ ♦ ♦♦ ♦ ♦ φ̃ Let us assume that there exists a homomorphism φ̃ : G/K → H such that φ̃(x) = φ(x). If x ∈ K then x = K is the identity in G/K, so φ̃(x) is the identity in H. Then φ(x) = φ̃(x) is also the identity in H so that x ∈ N (φ). In other words, a necessary condition for the existence of φ̃ : G/K → H is that K ⊆ N (φ). Actually, this condition is also sufficient: Proposition 4.4.1. Let K ⊂ G be a normal subgroup. Given a group homomorphism φ : G → H there exists a group homomorphism φ̃ : G/K → H such that φ̃(x) = φ(x) if and only if K ⊆ N (φ). Moreover, every group homomorphism φ̃ : G/K → H arises in this way. Proof. We have already observed that if φ̃ : G/K → H is a group homomorphism then φ = φ̃ ◦ π is a group homomorphism φ : G → H with kernel N (φ) ⊇ K. For the converse, suppose that φ : G → H is a group homomorphism with K ⊆ N (φ). Then we define φ̃ : G/K → H by setting: φ̃(xK) := φ(x). This map is well defined because if x′ = xk, with k ∈ K, then φ(x′ ) = φ(xk) = φ(x)φ(k) = φ(x). Also, φ̃ is a group homomorphism because: φ̃(x) · φ̃(x′ ) = φ(x)φ(x′ ) = φ(x · x′ ) = φ̃(x · x′ ). 214 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Examples 4.4.2. 1. Let G = Z, H = Zn and K = hki. Denoting the quotient map π : Z → Zm by πm , we consider the homomorphism φ = πn : Z → Zn given by πn (x) = x ∈ Zn : φ=πn / ♣8 Zn ♣ ♣ π=πk ♣ ♣φ̃ ♣ ♣ Zk Z Since the kernel of φ is N (φ) = hni, there exists a homomorphism φ̃ : Zk → Zn such that φ̃(πk (x)) = πn (x) if and only if hki ⊆ hni, i.e., i.e., if and only if n|k. We can write φ̃(x) = x but note, in general, φ̃ is not the identity. For example, if k = 4 and n = 2, we have that φ̃(0) = φ̃(2) = 0, and φ̃(1) = φ̃(3) = 1. 2. Let H = {1, i, −1, −i} and let φ : Z → H be the group homomorphism φ(n) = in . Its kernel is N (φ) = h4i, so taking K = N , we conclude that there exists a group homomorphism φ̃ : Z4 → H such that φ̃(n) = in . Since φ̃(0) = 1, φ̃(1) = i, φ̃(2) = −1, and φ̃(3) = −i, its is clear that φ̃ is an isomophism. φ / {1, i, −1, −i} 6 ❧❧ ❧ π ❧ ❧φ̃ ❧❧❧ Z4 Z 3. Let G = Z, H = Z210 , K = hki, and consider the homomoprhism φ : Z → Z210 given by φ(x) = 36x. The kernel of φ is N (φ) = {x ∈ Z : 210|36x} = {x ∈ Z : 35|6x} = h35i. We conclude that there exists a group homomorphism φ̃ : Zk → Z210 such that φ̃(x) = 36x if and only if hki ⊆ h35i, i.e., if and only if 35|k. In particular, φ̃ : Z70 → Z210 , given by φ̃(x) = 36x, is a well defined group homomorphism. Obviously, if φ and φ̃ are as in Proposition 4.4.1, then they have the same image, so φ̃ is surjective if and only if φ is surjective. The injectivity of φ̃ is discussed in the following: Proposition 4.4.3. Let φ : G → H be a group homomorphism with kernel N (φ) ⊇ K, for some normal subgroup K ⊆ G. Denote by π : G → G/K the quotient map and φ̃ : G/K → H the induced homomorphism. Then the kernel of φ̃ is N (φ)/K = π(N (φ)). In particular, φ̃ is injective if and only K = N (φ). 4.4. ISOMORPHISM THEOREMS FOR GROUPS 215 Proof. Let e be the identity of H. Then: N (φ̃) ={x ∈ G/K : φ̃(x) = e} ={x ∈ G/K : φ(x) = e} ={x ∈ G/K : x ∈ N (φ)} = π(N (φ)) = N (φ)/K. Examples 4.4.4. 1. In Example 4.4.2.3, where G = Z, H = Z210 , K = h70i, and φ : Z → Z210 is given by φ(x) = 36x, we saw that the kernel of φ is N (φ) = h35i. Hence, it follows that the kernel of the induced homomorphism φ̃ : Z70 → Z210 is N (φ)/K = π70 (h35i) = h35i = {35, 0}. 2. If in the previous example we let instead K = h35i, we conclude that the group homomorphism φ̃ : Z35 → Z210 , x 7−→ 36x, is injective. If φ : G → H is a surjective group homomorphism and K = N (φ) the previous proposition reduces to an important basic result in Group Theory which we shall use repeatedly: Theorem 4.4.5 (First Isomorphism Theorem). If φ : G → H is a surjective group homomorphism and K = N (φ) , then G/N (φ) and H are isomorphic: there exists a group isomorphism φ̃ : G/N → H given by φ̃(x) = φ(x), for all x ∈ G: G π G/N φ / ♦7 H ♦ ♦ ♦♦ ♦ φ̃ ♦ The First Isomorphism Theorem allows one to establish that groups of a very distinct nature are actually isomorphic. Note that even when a homomorphism φ : G → H is not surjective we can still apply the theorem, replacing H by φ(G), and concluding that there is an isomorphism: φ(G) ≃ G/N (φ). 216 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Examples 4.4.6. 1. We saw in Example 4.4.2.2 that the multiplicative group H = {1, i, −1, −i} is isomorphic to the additive group Z4 . More generally, consider the multiplicative group Rn consisting of the nth-roots of the unit Rn = {αk : k ∈ Z} = hαi, 2π where α = e n i . The group homomorphism φ : Z → Rn defined by φ(k) = αk is surjective and its kernel is N (φ) = {k ∈ Z : αk = 1} = hni. Hence, Rn is isomorphic to the additive group Zn . 2. Let φ : Sn → Z2 be the surjective homomorphism given by φ(ρ) = sgn(ρ). Its kernel (by definition) is the alternating group An . Hence we conclude that Sn /An is isomorphic to Z2 . 3. Let φ : Z → Zm ⊕ Zn be the group homomorphism φ(x) = (πm (x), πn (x)). Its kernel is given by: N (φ) = {x ∈ Z : x ≡ 0 (mod m) and x ≡ 0 (mod n)} = {x ∈ Z : m|x and x|n} = hlcm(m, n)i. It follows that the homomorphism φ̃ : Zlcm(m,n) → Zm ⊕ Zn , x 7→ (x, x) is injective, and that Zlcm(m,n) ≃ φ̃(Zlcm(m,n) ). In particular, when n and m are relatively prime numbers, so that we have lcm(m, n) = mn, it follows that Zmn and Zm ⊕ Zn are isomorphic, since both groups have the same number of elements. We leave it to the exercises to show that this map is actually the only isomorphism of rings between Zmn and Zm ⊕ Zn , and to determine its inverse. The example above concerning the roots of the unit is much more general that it looks at first sight. In fact, if G is a multiplicative group with identity e and α ∈ G, we know that the group generated by α is hαi = {αk : k ∈ Z}. The group homomorphism φ : Z → hαi given by φ(k) = αk is always surjective and its kernel is N (φ) = {k ∈ Z : αk = e}. Since N (φ) is a subgroup of Z, we have that N (φ) = hni, where n ≥ 0. There are two possibilities for n: (i) n = 0 ⇐⇒ N (φ) = {0}: this means that φ is injective, so we conclude that hαi ≃ Z, and hαi is an infinite group. The element α has infinite order. (ii) n > 0 ⇐⇒ N (φ) 6= {0}: then n is the smallest positive integer in N (φ), i.e., the smallest positive solution of the equation αk = e. In this case, hαi ≃ Z/hni = Zn , and hαi had n elements. Therefore, α has order n, and the order of an element α is precisely the smallest natural number such that αk = e. 4.4. ISOMORPHISM THEOREMS FOR GROUPS 217 Examples 4.4.7. 1. Consider the permutation ε in S3 . We have ε1 = ε, ε2 = δ and ε3 = I. Hence, ε has order 3, and hεi = {ε, δ, I} = A3 ≃ Z3 . 2. Recall that D5 is the symmetry group of a regular pentagon. The order of a non-trivial rotation r ∈ D5 is 5, so that hri ≃ Z5 . More generally, the group Dn has a subgroup H ≃ Zn , and this is a normal subgroup in Dn (why?). Definition 4.4.8. A group G is called a cyclic group if there exists some g ∈ G such that hgi = G. The element g is called a generator of G. Examples 4.4.9. 1. The group Z is cyclic, with generators 1 and −1. 2. A3 is a cyclic group: we can take g = ε or g = δ. 3. The group {1, i, −1, −i} is cyclic: we can take either g = i or g = −i. 4. The groups Zn are cyclic: any element of Z∗n is a generator. 5. The Z2 ⊕ Z4 is not cyclic (why?). The next result identifies all cyclic groups and it is just summarizes what we saw above: Corollary 4.4.10 (Classification of cyclic groups). If G is a cyclic group then either: (i) G is infinite and G ≃ Z, or (ii) G is finite with n elements and G ≃ Zn . Also, applying Lagrange’s Theorem, we obtain the classification of all groups of prime order: Corollary 4.4.11 (Classification of groups of prime order). If G is a finite group of order p, with p prime, then G ≃ Zp . There are also other isomorphism theorems for groups, which are less used than the First Isomorphism Theorem, but which are still useful. They are actually consequences of the First Isomorphism Theorem. 218 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Theorem 4.4.12 (Second Isomorphism Theorem). Let N and H be subgroups of G, and assume that N is normal in G. Then HN is a subgroup of G, N is normal in HN , H ∩ N is normal in H, and HN H ≃ . N H ∩N Proof. Exercise 10 in Section 4.1 shows that if H and N are subgroups of G, then HN is a subgroup of G if and only if HN = N H. Since N is normal in G, we have that HN = N H, and we conclude that HN is a subgroup of G. Now take the quotient homomorphism π : G → G/N and restrict it to H: one obtains a group homomorphism φ : H → G/N given by φ(x) = π(x) = x, (x ∈ H). The kernel of φ is N (φ) = {x ∈ H : x ∈ N } = H ∩ N . On the other hand, its image φ(H) is a subgroup of G/N , so that φ(H) = K/N , where K is a subgroup of G which contains both H and N . Therefore HN ⊆ K. But also any element in K is equivalent to an element in H, i.e., if k ∈ K then there exists h ∈ H and n ∈ N such that k = hn. Hence we have that K = HN . Now the First Isomorphism Theorem applied to φ gives: HN H ≃ . N H ∩N Notice, as a special case of the Second Isomorphism Theorem, that if H ∩ N = {e} then HN/N ≃ H. Theorem 4.4.13 (Third Isomorphism Theorem). If K ⊂ H ⊂ G are normal subgroups of G, then K is a normal subgroup of H, H/K is a normal subgroup of G/K, and G/K G ≃ . H/K H Proof. Assume that K ⊆ H ⊆ G, where K and H are normal in G. Let πK : G → G/K and πH : G → G/H be the usual quotient homomorphisms, so that πK has kernel K and πH has kernel H. Consider the diagram: G π=πK G/K φ=πH / 7 G/H ♦ ♦ ♦♦ ♦ ♦ ♦ φ̃ 4.4. ISOMORPHISM THEOREMS FOR GROUPS 219 The existence of φ̃ is a consequence of the assumption K ⊆ H (see Proposition 4.4.1). The homomorphism φ̃ is clearly surjective, since πH is surjective. Applying Proposition 4.4.3, the kernel of φ̃ is the group H/K. Now the First Isomorphism Theorem applied to the homomorphism φ̃ : G/K → G/H yields the conclusion of the theorem. This result states that the quotients of quotients of G are isomorphic to quotients of G. Notice, by the way, that this is just a generalization of the remarks in Example 4.4.2.1. Examples 4.4.14. 1. Let G = Z, H = h3i, and K = h6i. Obviously K ⊂ H, and since both K and H are normal subgroups of Z, we find G/K = Z/h6i = Z6 , H/K = h3i/h6i = h3i ⊂ Z6 , and G/H = Z/h3i = Z3 . According to the Third Isomorphism Theorem, we conclude that Z6 /h3i and Z3 are isomorphic. 2. The previous example is a special instance of the following general fact: if n|m then Zm /hni ≃ Zn . In fact, let G = Z, H = hni, and K = hmi, so that K ⊂ H are normal subgroups in Z. Then G/K = Zm , G/H = Zn , and H/K = hni ⊆ Zm , so that the Third Isomorphism Theorem yields our claim. Exercises. 1. Let H = hgi = {g n : n ∈ Z} be a cyclic group. Show that (a) if H is infinite, its only generators are g and g −1 ; (b) if H has m elements, the order of g n is m d, where d = gcd(n, m); n (c) if H has m elements, g is a generator of H if and only if gcd(n, m) = 1. 2. Assume that g1 and g2 belong to an abelian group G and have order, respectively, n and m. Show that the order of g1 g2 divides lcm(n, m). Conclude that the subset formed by all elements of finite order is a subgroup of G. 3. Decide which of the groups Z4 , Z2 ⊕ Z2 , Z8 , Z4 ⊕ Z2 and Z2 ⊕ Z2 ⊕ Z2 are isomorphic. 4. Let G = Z ⊕ Z and N = {(n, n) : n ∈ Z}. Show that G/N ≃ Z. 220 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS iπ 5. Consider the multiplicative group H = he 4 i ⊂ C formed by the complex solutions of z 8 = 1. (a) Determine all the generators and subgroups of H; (b) Determine all the automorphisms φ : H → H. 6. Determine all the generators and subgroups of H = h(123456)i ⊂ S6 . 7. Show that Aut(Zn ) ≃ Z∗n . 8. Show that Zmn and Zm ⊕ Zn are isomorphic groups/rings if and only if m and n are relatively prime. 9. Assume that G is a finite group, H is a normal subgroup of G, K is a subgroup of G, G = HK, and G/H is isomorphic to K. Show that H ∩ K = {e}. 10. Show that if n > 1, then Zpn is not isomorphic to ⊕nk=1 Zp . 11. Show that the quotient Z40 /h15i is isomorphic to Zn and find n. 12. Assume that the only subgroups of G are the trivial groups {1} and G. Show that G is a cyclic group of prime order. 13. Show that if G is an abelian group of order pq, when p and q are both prime, then G is cyclic. 14. Classify the groups with 2p elements, where p > 2 is a prime. (Hint: Show that there exists an element x of order p, and that every element of order p belongs to < x >). 15. Classify the groups with 8 elements, by proceeding as follows: (a) Show that if G is abelian, then it is isomorphic to one of the groups Z8 , Z4 ⊕ Z2 , or Z2 ⊕ Z2 ⊕ Z2 , and these are non isomorphic groups. (b) Assuming that G is not abelian, show that: (i) G has an element x of order 4, and H =< x > is normal in G. (ii) Assuming y 6∈ H, show that y 2 ∈ H, where y 2 = 1 or y 2 = x2 . (iii) Finally show that yx ∈ Hy, where yx = x3 y (note that the order of yxy −1 is the order of x). (iv) Compare the result with the tables of D4 and H8 . 16. Let G and H be groups, with normal subgroups K ⊂ G and N ⊂ H. Show that (G × H)/(K × N ) is isomorphic to (G/K) × (H/N ). 4.5. ISOMORPHISM THEOREMS FOR RINGS 4.5 221 Isomorphism Theorems for Rings When A and B are rings, I ⊆ A is an ideal of A, and φ : A → B is a homomorphism of rings, we may apply the results from the previous section to φ as a homomorphism of the additive groups (A, +) and (B, +). In particular, we know that if the kernel of φ contains I, then there exists a group homomorphism φ̃ : A/I → B such that we have a commutative diagram. A π A/I φ / ♣7 B ♣ ♣ ♣♣ ♣ φ̃ ♣ However, we note that if φ is a homomorphism of rings then we have φ̃(x) · φ̃(x′ ) = φ(x)φ(x′ ) = φ(x · x′ ) = φ̃(x · x′ ). Hence, the group homomorphism φ̃ is a ring homomorphism as long as the original homomorphism φ is a ring homomorphism. This allows us to obtain immediately the analogues for rings of Propositions 4.4.1 and 4.4.3. Proposition 4.5.1. Let A and B be rings, I ⊆ A an ideal in A, and denote by π : A → A/I the quotient map, π(x) = x + I. (i) The ring homomorphisms φ̃ : A/I → B are the maps of the form φ̃(π(x)) = φ(x), where φ : A → B is a ring homomorphism with kernel N (φ) ⊇ I. (ii) If φ : A → B is a ring homomorphism with kernel N (φ) ⊇ I, then the induced ring homomorphism φ̃ : A/I → B has kernel N (φ)/I = π(N (φ)). In particular, φ̃ is injective if and only if I = N (φ). Given a ring A, a map φ : Z → A is a homomorphism of the underlying additive groups if and only if h(n) = na, for some fixed a ∈ A. It is easy to check that φ is a ring homomorphism if and only if additionally a = φ(1) is a solution of x2 = x in A. We make use of this simple remark to look back at the examples of the previous section, now from the perspective of their ring structure. 222 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Examples 4.5.2. 1. Let A = Z, B = Zn , and I = hki. The quotient φ = πn : Z → Zn is a homomorphism of rings. If n|k then φ=πn / ♣8 Zn ♣ ♣ π=πk ♣ ♣φ̃ ♣ ♣ Zk Z where φ̃ : Zk → Zn , given by φ̃(πk (x)) = πn (x) is a homomorphism of rings. 2. Let A = Z, B = Z210 and I = hki. The equation x2 = x has (non-obvious) solutions in Z210 , such as x = 21. Hence, the homomorphism φ : Z → Z210 given by φ(x) = π210 (21x) is a ring homomorphism. It is easy to check that its kernel is N (φ) = h10i. We conclude that there exists a ring homomorphism φ̃ : Zk → Z210 such that φ̃(x) = 21x if and only if 10|k. Obviously, φ̃ is injective when k = 10. In fact, when k = 10 we have that φ(Z) = φ̃(Z10 ) = h21i is unitary subring of Z210 , isomorphic to Z10 . Z π=π10 Z10 φ / 7 Z210 ♦ ♦ ♦♦ ♦ ♦ ♦ φ̃ 3. Let A = Q[x], B = Q, and denote by φ : Q[x] → Q the evaluation map φ(p(x)) = p(1). By the Polynomial Remainder Theorem, we know that φ is a homomorphism of rings with kernel N (φ) = hx − 1i. If I = hm(x)i is the ideal of Q[x] generated by the polynomial m(x), it follows that there exists a homomorphism of rings φ̃ : Q[x]/I → Q, defined by φ̃(p(x)) = p(1), if and only if (x − 1)|m(x), i.e, if and only if p(1) = 0. Using Proposition 4.5.1 we obtain immediately the First Isomorphism Theorem for for rings: Theorem 4.5.3 (First Isomorphism Theorem for rings). If φ : A → B is a surjective ring homomorphism and I is the kernel of φ, the the rings A/I and B are isomorphic. In particular, there is an isomorphism of rings φ̃ : A/I → B such that φ̃(a) = φ(a), for all a ∈ A. The other isomorphism theorem for rings follow by a simple application of the First Isomorphism Theorem for rings, exactly in the same manner as we did in the case of groups. We state them and leave the easy proofs for the exercises: 4.5. ISOMORPHISM THEOREMS FOR RINGS 223 Theorem 4.5.4 (Second Isomorphism Theorem for rings). Let A be a ring, I an ideal of A, and B a subring of A. Then I + B is a subring of A, I is an ideal in I + B, I ∩ B is an ideal in B, and there is an isomorphism of rings I +B B ≃ . I I ∩B Theorem 4.5.5 (Third Isomorphism Theorem for rings). Let A be a ring, I, J ideals of A with I ⊂ J. Then I is an ideal of J, J/I is an ideal of A/I, and there is an isomorphism of rings A A/I ≃ . J/I J Examples 4.5.6. 1. Let us check that when n and m are relatively prime then the rings Znm and Zn ⊕ Zm are isomorphic. Just like in the case of groups, it is enough to observe that φ = πnm : Z → Zn ⊕ Zm given by φ(k) = (πn (k), πm (k) is a homomorphism of rings. Hence, by the First Isomorphism Theorem for rings, the isomorphism of groups φ̃ : Znm → Zn ⊕ Zm in Example 4.4.6.3 is also an isomorphism of rings. 2. Let A = Z and assume that n|m. If we let I = hmi and J = hni, then J ⊇ I and I and J are ideals in Z. In this case, A/I = Zm , A/J = Zn , and J/I = hni ⊆ Zm . We conclude from the Third Isomorphism Theorem that the rings Zm /hni and Zn are isomorphic. In particular, the quotient rings formed from the rings Zm are always rings Zn . A I = Zm π A/I J/I = Zm hni φ / Zn = ♠6 ♠ ♠ ♠ ♠ ♠ φ̃ A J 3. Let α ∈ C be algebraic over Q, α 6∈ Q. Denote by m(x) its minimal polynomial. Recall that the evaluation map φ : Q[x] → C given by φ(p(x)) = p(α) is a homomorphism of rings, with kernel N (φ) = hm(x)i and that Q[α] = φ(Q[x]). We conclude by the First Isomorphism Theorem for rings that Q[α] ≃ Q[x] . hm(x)i Notice that since m(x) is an irreducible polynomial, we have that Q[x]/hm(x)i is a field. Hence (why?): Q[α] ≃ Q[x] ≃ Q(α). hm(x)i 224 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS We now illustrate the Isomorphism Theorems for rings with two applications. First, we can use the result of Example 4.5.6.1 to compute the Euler function ϕ : N → N. Recall that this function is defined as ϕ(n) = |Z∗n |, i.e., ϕ(n) is the number of invertible elements in the ring Zn , or also, the number of integers 1 ≤ k ≤ n which are relatively prime to n. Lemma 4.5.7. If n1 , . . . , nk are relatively prime, then ϕ(n1 · · · nk ) = ϕ(n1 ) · · · ϕ(nk ). Proof. We give a proof only for the case k = 2. The cases k > 2 can be obtain by a simple induction procedure. Recall that if A and B are rings with units, then (A ⊕ B)∗ = A∗ × B ∗ . Hence, if C ≃ A ⊕ B, and the rings are finite, it is obvious that |C ∗ | = |A∗ ||B ∗ |. We apply this to A = Zn , B = Zm , and C = Znm , assuming that n and m are relative primes. Since Znm ≃ Zn ⊕Zm , we conclude immediately that: ϕ(nm) = ϕ(n)ϕ(m). The next theorem shows that one can compute ϕ(n) in a straightforward fashion, as long as one knows the prime factors of n: Q Theorem 4.5.8. If n = ki=1 pei i is the prime factorization of n then ϕ(n) = n k Y i=1 1− 1 pi . Proof. The lemmaQabove shows that if n is a natural number with prime factorization n = ki=1 pei i , then ϕ(n) = k Y ϕ(pei i ). i=1 Now if p is a prime number and m is a natural number, one finds easily ϕ(pm ): the elements of Zpm which are not invertible are just the elements of the ideal hpi in Zpm . This ideal has exactly pm /p = pm−1 elements (why?). Therefore Z∗pm = Zpm − hpi has pm − pm−1 = pm (1 − p1 ) elements. In other words: 1 ϕ(pei i ) = pei i − piei −1 = pei i (1 − ). pi 4.5. ISOMORPHISM THEOREMS FOR RINGS 225 From this it follows that: ϕ(n) = k Y i=1 ϕ(pei i ) = k Y i=1 pei i 1 1− pi =n k Y i=1 1 1− pi . Example 4.5.9. The prime factors of 9000 are 2, 3 and 5. Therefore: 1 1 1 1− 1− = 2400. ϕ(9000) = 9000 1 − 2 3 5 As another application of the Isomorphism Theorems for rings, we will now show that the fields Zp and Q are, in some sense, the smallest fields with a given characteristic. In other words, we will show that any field necessarily contains a subfield isomorphic to either Zp or Q. In what follows, we will denote by K a field and denote the identity by 1. Definition 4.5.10. We say that K is a primitive field if its does not contain any strictly smaller subfield (i.e., 6= K). Obviously there exist primitive fields (e.g., Z2 ) and non-primitive fields (e.g., R). Moreover, any field K has exactly one primitive subfield: the intersection of all the subfields of K is necessarily a primitive field primitive. One calls it the primitive subfield of K. Example 4.5.11. Obviously, Q is the primitive subfield √ √ of R and of C. Similarly, Q is also the primitive subfield of Q[ 2] = {a + b 2 : a, b ∈ Q}. Our next result identifies all the possible primitive fields: Theorem 4.5.12. Let m be the characteristic of K, where m = 0 or m = p, for a prime number p. Then: (i) If m = 0, the primitive subfield of K is isomorphic to Q; (ii) Se m = p, the primitive subfield of K is isomorphic to Zp . 226 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Proof. We give the proof in the case m = p, leaving the case m = 0 for the exercises. Consider the homomorphism φ : Z → K defined by φ(n) = n1. It is easy to check that any subfield of K must contain 1 and, hence, also φ(Z). It was shown in Chapter ??, that φ(Z) is isomorphic to Zp , and so it is a field. Since φ(Z) is primitive, it must be the primitive subfield of K. A field K is always a vector space over its primitive subfield, which may have finite or infinite dimension. If K is a finite field, its characteristic is necessary a prime p > 0, so that its primitive subfield J has p elements and is isomorphic to Zp . The dimension of K as a vector space over J is finite, for otherwise K would be infinite. Therefore, there exists a natural number n such that K is isomorphic to the vector space J n . Hence, we have proved: Theorem 4.5.13. Any finite field K has pn elements, where p is a prime, namely the characteristic of K. We know that there are finite fields with exactly p elements, namely Zp . One can show that, if p is a prime and n is a natural number, there exist fields with pn elements, and all fields with pn elements are isomorphic. Hence, up to isomorphism, there exists exactly one field with pn elements, called the Galois field 8 of order pn , denoted CG(pn ). We will not prove these statements now, but we remark that if p(x) ∈ Zp [x] is an irreducible polynomial of degree n, then K = Zp [x]/hp(x)i is a field with pn elements. Therefore, it must be the Galois field CG(pn ). In this way, one can reduce the existence of Galois fields to the existence of irreducible polynomials of arbitrary degree n in Zp .9 Exercises. 1. Prove the Isomorphism Theorems for rings. 2. Let A be a ring with unit which has n elements. Show that: (a) A ≃ Zn as rings if and only if A has characteristic n. (b) A ≃ Zn as rings if and only if (A, +) ≃ (Zn , +) as groups. 8 After Évariste Galois (1811-1832). Galois, was a main contributor to a major mathematical discovery of the 19th Century, the Theory of Groups. Galois is also a tragic figure of the History of Mathematics, for he died at the young age of 21, in a duel because of a women of dubious reputation. We will discuss in Chapter 7 the theory of Galois. 9 See also Exercise 8 in this section. 4.5. ISOMORPHISM THEOREMS FOR RINGS 227 3. The statement that Zn ⊕ Zm ≃ Znm , if gcd(n, m) = 1, expresses the Chinese Remainder Theorem in terms of isomorphisms of rings. How does one expresses the Fundamental Theorem of Arithmetic in terms of isomorphisms of rings? 4. Assume that n, m ∈ N, d = gcd(n, m) and k = lcm(n, m). Show that Zn ⊕ Zm ≃ Zd ⊕ Zk . 5. Assuming that n and m are relatively prime, show that: (a) There exists exactly one isomorphism of rings φ : Zmn → Zm ⊕ Zn , and (b) The inverse isomorphism φ−1 : Zm ⊕ Zn → Zmn is given by φ−1 (x, y) = φm (x) + φn (y), where φk : Zk → Zmn are injective homomorphisms such that φk (πk (x)) = πmn (ak x). Find the integers ak and explain the relationship between φ−1 and the Chinese Remainder Theorem. 6. Consider homomorphisms of rings φ : Zn → Z210 . (a) For which values of n are there injective homomorphisms φ? (b) For which values of n are there surjective homomorphisms φ? 7. Solve the equation ϕ(m) = 6, where ϕ is the Euler function, using the following steps: (a) Show that the prime factors of m must be 2, 3, or 7. (b) Show that if 7|m, then m = 7 or m = 14. (c) Show that if 7 is not a factor of m, then 3|m and 9|m. (d) Determine all solutions of of ϕ(m) = 6. 8. Assume that K ⊂ L are fields, u ∈ L is algebraic over K, and m(x) is the minimal polynomial of u in K[x]. Show that K[u], K(u) and K[x]/hm(x)i are isomorphic fields. 9. Let I ⊂ A be an ideal of A. Is it true that A is necessarily isomorphic to the ring I ⊕ A/I? On the other hand, show that if A is isomorphic to I ⊕ J, then J is isomorphic to A/I. 10. Let K be a field, and assume that p(x) = q(x)d(x) holds in K[x]. Show that K[x]/hp(x)i is isomorphic to K[x]/hq(x)i ⊕ K[x]/hd(x)i if and only if gcd(q(x), d(x)) = 1. 11. Consider p(x) = (x2 + x + 1)(x3 + x + 1) ∈ Z2 [x]. How many invertible elements are there in Z2 [x]/hp(x)i? And in Z2 [x]/hp(x)2 i? 12. Finish the proof of Theorem 4.5.12. 228 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS 13. Let K be a primitive field, let L and M be extensions of K, and let φ : L → M be a non-zero homomorphism. Show that: (a) For all a ∈ K: φ(a) = a. (b) If p(x) ∈ K[x], b ∈ L and p(b) = 0, then p(φ(b)) = 0, i.e., φ takes roots of p(x) into roots of p(x). (c) Q[x]/hx3 − 2i is not isomorphic to Q[x]/hx3 − 3i. 14. Show that any ordered field is an extension of Q, i.e., Q is the smallest ordered field. 15. Show that any complete ordered field is an extension of the reals, i.e., R is the smallest complete ordered field. 4.6 Free Groups, Generators and Relations If X is a subset of a group G, the subgroup generated by X is the intersection of all subgroups of G which contain X and is denoted by hXi. Similarly to the case of ideals in rings, if X = {x1 , x2 , · · · , xn } is a finite set, we also write hXi = hx1 , x2 , · · · , xn i. Example 4.6.1. If G = S3 then we see immediately that: h(12)i = {I, (12)} h(123)i = h(321) =i{I, (123), (321)} On the other hand, h(12), (123)i = S3 . In fact, for any subset X ⊂ S3 containing a transposition and a 3-cycle we have hXi = S3 . These examples make it clear that a subgroup can have many different sets of generators. The set X is called a generating set of the group G if and only if hXi = G. This condition is equivalent to say that any element of G can be written, in multiplicative notation, as a product of positive and negative powers of elements of X: g = xn1 1 · · · xnk k , xi ∈ X, ni ∈ Z. If the group G has a finite generating set, i.e., if G = hx1 , x2 , · · · , xk i, then G is called a finitely generated group. 4.6. FREE GROUPS, GENERATORS AND RELATIONS 229 In the case of an abelian group we use the additive notation. So, for example, if G is a finitely generated abelian group with generators x1 , . . . , xk , then we can write any element of G as: g = n 1 x1 + n 2 x2 + · · · + n k xk , ni ∈ Z. Examples 4.6.2. 1. A cyclic group G = hαi is, by definition, finitely generated with one generator. Any element g ∈ G is of the form g = αn , possibly for multiple values of n. 2. Any finite group is finitely generated, since we can always take X = G as a generating set. 3. The group G = Z ⊕ Z2 ⊕ Z4 is neither cyclic nor finite, but it is finitely generated: a generating set is X = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}. 4. The abelian group Zk := ⊕ki=1 Z is finitely generated. A generating set is given by the n vectors of the usual canonical basis of Rk , e1 , e2 , · · · , ek , where ei has all components zero but the i-th component which is 1. Any element of g ∈ Zk can be written uniquely as: g= k X ni e i , i=1 ni ∈ Z. One can also define the normal subgroup of G generated by a subset X ⊂ G as the intersection of all normal subgroups which contain X. In the abelian case, of course, this is the same as the subgroup generated by X, but it the non-abelian the two subgroups are, in general, distinct. For example, in G = S3 if we let X = {σ}, where σ is any transposition, then the subgroup generated by X is H = {1, σ}, but the normal subgroup generated by X is S3 itself. If X is a generating set for a group G, in general, there are multiple products of elements of X which yield the identity 1 ∈ G. For example: (a) For all x ∈ X, we have xx−1 = 1; (b) If G is cyclic of order m, and X = {x} is a generator, then xm = 1. One calls expressions formed by products of elements of G which equal the identity a relation. There are trivial relations, as in example (a), which are consequences of the groups axioms, and non-trivial relations, as in example (b), which depend on the specific group G and generating set X. We will make these concepts more precise later. 230 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Many groups can be completely described, in a very succinct form, by specifying a set of generators X and relations between these generators which we write (in multiplicative notation) as: xc11 xc22 · · · xckk = 1. Examples 4.6.3. 1. The cyclic group G of order n is completely described by specifying the generator X = {α}, and the relation αn = 1. 2. The group S3 is generated by X = {α, δ}. Its multiplicative table can be obtained form the relations α2 = 1, δ 3 = 1, and δα = αδ −1 . The last relation can also be written as αδαδ = 1. 3. The group H8 is generated by X = {i, j}, and is completely described by the relations i2 j 2 = i4 = 1 and iji = j. 4. The group Dn of symmetries of a regular polygon of n sides is generated be two elements σ and a ρ satisfying the relations σ 2 = 1, ρn = 1, σρσρ = 1. The element ρ represents a rotation by 2π/n, while the element σ represents a reflection on a symmetry axis of the polygon. In order to make easier the comparison of two distinct groups G and H, using a single set of generators X, we will also say that G is generated by a set X if there exists an injective map ι : X → G such that G is generated by the set ι(X), in the sense we used before. When the map ι is clear from the context, in order to simplify the notation, we will use the same symbol to denote both an element x ∈ X and the corresponding element ι(x) ∈ G. Example 4.6.4. According to these conventions, the groups S3 , H8 , and Z⊕ Z are all generated by X = {x1 , x2 }. Assume now that G is generated by X = {x1 , x2 , . . . , xn }, and let H be an arbitrary group. A moment of thought shows that: • Any homomorphism φ : G → H is uniquely determined on all elements of G, by the values that φ takes on the generators xk , but 4.6. FREE GROUPS, GENERATORS AND RELATIONS 231 • The values yk = φ(xk ) are not arbitrary, because whatever relations hold for x1 , . . . , xn in G they must also hold for the y1 , . . . , yn in H. Example 4.6.5. Any group homomorphism φ : S3 → H is uniquely determined by its values ρ = φ(α) and ξ = φ(δ). However, the elements ρ, ξ ∈ H cannot be arbitrary, and must satisfy the same relations that α and δ satisfy, namely: ρ2 = ξ 3 = 1 and ξρξρ = 1. Still from an informal point of view, a group G generated by X is free from relations between its generators, if there always exist a homomorphism φ : G → H, whatever the values φ(x), x ∈ X. We make these ideas more precise as follows: Definition 4.6.6. For a set X, we call F a free group (respectively, free abelian group) on X, if F is a group (respectively, abelian group) and there exists a map ι : X → F such that the following property holds: for any group (respectively, abelian group) H and any map φ : X → H there exists a unique group homomorphism φ̃ : F → H such that the following diagram commutes: ι /F X ◆◆ ✤ ◆◆◆ ✤ ◆◆◆ ◆ φ ◆◆◆◆ ✤ & φ̃ H Example 4.6.7. Ln If X = {x1 , . . . , xn } is a finite set, we consider the group Zn = i=1 Z and the map ι : X → Zn defined by ι(xi ) = ei , i = 1, . . . , n. If H is any abelian group and φ : X → H is any map, then we have a group homomorphism φ̃ : Zn → H given by: φ̃(k1 , . . . , kn ) = k1 φ(x1 ) + · · · + kn φ(xn ). This makes the following diagram commute: ι / Zn {x1 , . . . , xn } ✤ ❘❘❘ ❘❘❘ ✤ ❘❘❘ φ̃ ❘❘❘ φ ❘❘❘ ✤ ( H It is easy to see that φ̃ : Zn → H is the only group homomorphism with this property. Hence, Zn is a free abelian group on X = {x1 , . . . , xn }. 232 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS We leave it as an exercise to check that the definition of free group F on the set X implies that ι : X → F is injective and F is generated by X. Notice that, if the group H is also generated by X, so there exists a map φ : X → H such that hφ(X)i = H, then the homomorphism φ̃ is onto. In particular, by the First Isomorphism Theorem, the groups F/N and H are isomorphic, where N denotes the kernel of φ̃. This shows the relevance of the free group F for the classification of groups: Any group generated by X is a group quotient of the free group F . Example 4.6.8. Any abelian group generated by X = {x1 , . . . , xn } is a group quotient of the group Zn . We will explore this remark later, in order to obtain a classification of all finitely generated abelian groups. We will see now that for a set X, there exists (up to isomorphism) exactly one free non-abelian group and one free abelian group generated by X, except when X = {x1 } consists of a single element (where both groups are abelian and isomorphic to Z). We start by showing that the free group on X is unique, up to isomorphism. Proposition 4.6.9. Let F and F ′ be free groups on a set X, relative to maps ι : X → L and ι′ : X → L′ , respectively. Both in the abelian and in the non-abelian case, there exists a unique isomorphism ψ : F → F ′ making the following diagram commute: > F✤ ⑥⑥ ✤ ⑥ ⑥ ⑥⑥ ✤ ⑥⑥ ✤ X❆ ✤ψ ❆❆ ❆❆ ✤ ❆❆ ✤ ′ ❆ ι ι F′ Proof. From Definition 4.6.6, if we let H = F ′ and φ = ι′ , we conclude that there exists a unique homomorphism ψ : F → F ′ which makes the diagram in the statement commutative. Similarly, interchanging the roles of ι and ι′ , we conclude that there exists a unique homomorphism ψ ′ : F ′ → F which 4.6. FREE GROUPS, GENERATORS AND RELATIONS 233 makes the following diagram commutative: ′ F ⑥> ✤ ⑥ ′ ι ⑥⑥ ✤ ⑥⑥ ✤ ⑥⑥ ✤ ′ X❇ ✤ψ ❇❇ ❇❇ ✤ ❇ ι ❇❇ ✤ F It follows that the following diagrams are also commutative: F ⑦> ✤ ι ⑦⑦⑦ ✤ ⑦⑦ ✤ ⑦⑦ ✤ ′ X❅ ✤ ψ◦ψ ❅❅ ❅❅ ✤ ❅ ι ❅❅ ✤ F ′ F ⑥> ✤ ι′ ⑥⑥⑥ ✤ ⑥⑥ ✤ ⑥⑥ ✤ ′ X❆ ✤ ψ ◦ψ ❆❆ ❆❆ ✤ ❆❆ ι′ ❆ ✤ F′ Finally, observe that if in the last two diagrams we replace ψ ◦ ψ ′ and ψ ′ ◦ ψ by the identity homomorphisms we also obtain commutative diagrams. The uniqueness in the defining property of a free group, allows us to conclude that ψ ◦ ψ ′ = idF and ψ ′ ◦ ψ = idF ′ . Hence, the homomorphism ψ has a unique inverse and so it is an isomorphism. Let us turn now to the existence of a free abelian group generated by X. We already know that this group exists if X is finite with n elements and it is isomorphic to Zn . For infinite sets we need the following: Definition 4.6.10. Let {Gi }i∈I be a family of groups. Q (i) The direct product of the Gi ’s, denoted group Q by i∈I Gi , is the 10 with supporting set the cartesian product i∈I Gi of the groups and group operation Q defined as follows: if g = (g Qi )i∈I and h = (hi )i∈I are elements of i∈I Gi , then gh ≡ (gi hi )i∈I ∈ i∈I Gi . L (ii) The direct sum of the G ’s, denoted of the i i∈I Gi , is the subgroup Q Q direct product i∈I Gi consisting of elements (gi )i∈I ∈ i∈I Gi where only a finite number of gi ’s is different from the identity (in Gi ). Obviously, when the set of indices is finite the direct sum and direct product coincide, and are the same as the product defined in Chapter 1. We leave the proof of the following proposition for the exercises: 10 See the Definition ?? in the Appendix. 234 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS L Proposition 4.6.11. If X is any set the direct sum F = x∈X Z is a free abelian group generated by X relative to the map ι : X → F which to each x0 ∈ X associates the element (gx )x∈X ∈ L, with gx0 = 1 and gx = 0, if x 6= x0 . Notice that the map ι in this Proposition is injective. For that reason it is called the canonical injection. If we identify each element L L xi ∈ X with its image ι(xi ) ∈ x∈X Z then X can be seen as a subset of x∈X Z. Moreover, L we can express every element g 6= 0 of x∈X Z in additive notation in the form g = n 1 x i1 + n 2 x i2 + · · · + n k x ik , where the indices i1 , . . . , ik are all distinct and n1 , . . . , nk are non-zero integers. This expression is unique, up to the order Lof the factors, and every expression of this form represents an element of x∈X Z. Let us explain now how to construct the non-abelian free group on a set X with more than one element. First we index the elements of X, so that X = {xi : i ∈ I}, and then we form an “appropriate product” of all free groups Fi in the sets Xi = {xi }. The product that we need is called the free product of groups, and is discussed in the following proposition. Proposition 4.6.12. Let {Gi }i∈I be a family of groups. There exists a Q Q group ∗i∈I Gi and group homomorphisms φi : Gi → ∗i∈I Gi with the following property: given any group H and group homomorphisms ψi : Gi → H, Q there exists a unique homomorphism of groups ψ : ∗i∈I Gi → H such that the following diagram commutes for all i ∈ I: Q φi / ∗ Gi Gi PP i∈I✤ PPP PPP ✤ψ PP ψi PPPP ✤ P' H Proof. Let {Gi }i∈I be a family of groups. We call a word in the Gi ’s any finite sequence (g1 , . . . , gn ), where each gk belongs to some Gi . The natural number n is called the word length, and we conventioneer that the empty word, denoted 1, has length zero. A reduced word is any word (g1 , . . . , gn ) which satisfies the following two properties: (a) none of the gk is the identity element of a group Gi ; (b) none of the successive terms in the word belong to the same group Gi ; 4.6. FREE GROUPS, GENERATORS AND RELATIONS 235 Q We denote by ∗i∈I Gi the set of all reduced words, including the empty word 1. Q In the set ∗i∈I Gi , we define a binary operation as follows. Let g = (g1 , . . . , gn ) and h = (h1 , . . . , hm ) be two reduced words with n ≤ m. Let 0 ≤ N ≤ n be the smallest integer such that N < k ≤ n and both gk and hn−k+1 belong to the same group Gi , while gk hn−k+1 is the identity in Gi . Then the product gh is the reduced word defined by  if N > 0 and gN , hn−N +1 do not   (g1 , . . . , gN , hn−N +1 , . . . , hm )   belong to the same group,          (g1 , . . . , gN −1 , gN hn−N +1 , hn−N +2 , . . . , hm ) if N > 0, and gN , hn−N +1 gh = belong to the same group,        (hn−N +1 , . . . , hm ) if N = 0 and n < m,       1 if N = 0 and n = m. By definition the product of a reduced word g with the empty word 1 is g1 Q = 1g = g. It is easy to check that this operation defines a group structure in ∗i∈I Gi , with identity the empty word 1, and where the inverse of the reduced word g = (g1 ,Q . . . , gn ) is the reduced word g−1 = (gn−1 , . . . , g1−1 ). Now let φi : Gi → ∗i∈I Gi be the map which to an element g ∈ Gi , with g 6= e associates the reduced word (g), and to e associates 1. Its is obvious that φi is a group homomorphism. Finally, given any group H and group Q∗ homomorphisms ψi : Gi → H, we define a group homomorphism ψ : i∈I Gi → H as follows: ψ(1) = e (the identity in H) and ψ(g1 , . . . , gn ) := ψi1 (g1 ) · · · ψin (gn ), if gk ∈ Gik . One checks immediately that this is the unique group homomorphism which makes the following diagram commute for all i ∈ I: Q φi / ∗ Gi Gi PP i∈I✤ PPP PPP ✤ψ PP ψi PPPP ✤ P' H Q The group ∗i∈I Gi is called the free product of the groups Gi . From now on, we will use the multiplicative notation to write the word (g1 , . . . , gn ) in the form g1 · · · gn . 236 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Our next example illustrates the difference between the free product and the direct product of groups. Example 4.6.13. Let G = {1, g} and H = {1, h} be two cyclic groups of order 2. An element in the free product G ∗ H can be written as a finite alternating sequence of products of g and h. For example, the following expressions are elements in the free product: g, h, gh, hg, ghg, hgh, ghgh, . . . Notice that gh 6= hg and that both these elements have infinite order! By contrast, the direct product G × H is an abelian of order 4! Q Given any set X = {xi : i ∈ I} we consider the free product F ≡ ∗i∈I Fi of all free groups Fi in the sets {xi } (recall that Fi ≃ Z). We have an injective map ι : X → F which associates to the element xi the reduced word (xi ). Q Proposition 4.6.14. If X = {xi : i ∈ I}, the free product F = ∗i∈I Fi is a free group generated by X relative to the map ι : X → F . Proof. We need to show that for any group H and map φ : X → H there exists a unique homomorphism of groups φ̃ : F → H so that we have a commutative diagram: ι /F X ◆◆ ✤ ◆◆◆ ✤ ◆◆◆ ◆ φ ◆◆◆◆ ✤ & φ̃ H The elements of F are reduced words of the form xk11 · · · xknn , where the k1 , . . . , kn are non-zero integers. It is easy to see that φ̃ must be defined by φ̃(xk11 , . . . , xknn ) = φ(x1 )k1 · · · φ(xn )kn . Any element in the free group generated by the set X = {xi : i ∈ I} can be written in the reduced form xk11 · · · xknn , 4.6. FREE GROUPS, GENERATORS AND RELATIONS 237 and two elements written in this form can be multiplied in the obvious form, by juxtaposition and elimination, using multiplication in the groups Gi and cancelation of units. There is a relationship between the non-abelian and the abelian free groups in a set X. To see this, let us denote by (G, G) ⊂ G the smallest subgroup of G which contains all commutator elements: (g, h) ≡ g −1 h−1 gh, g, h ∈ G. It is easy to see that (G, G) is a normal subgroup of G and that the quotient G/(G, G) is abelian. We will study this commutator group in Chapter 5. We leave the proof of the following proposition as an exercise: Proposition 4.6.15. If F is a free group on X relative to ι : X → F , then F/(F, F ) is the free abelian group on X relative to the map ῑ : X → F/(F, F ) given by ῑ = π ◦ ι, where π : F → F/(F, F ) is the quotient map. The existence of free groups allows us to formalize the notion of relation, and to clarify the difference between trivial and non-trivial relations. Moreover, we can define precisely what we mean by complete set of relations. In what follows, H will denote a group generated by X ⊂ H and F denotes the free group on X relative to a map ι : X → F . If H is abelian, we will assume naturally that F is the abelian free group. The map i : X ֒→ H denote the canonical inclusion (i(x) = x). As we have already observed: Proposition 4.6.16. There exists a surjective homomorphism φ : F → H. This is summarized by the following diagram: ι /F X ◆◆ ✤ ◆◆◆ ✤ ◆◆◆ φ ◆ i ◆◆◆◆ ✤ & H Definition 4.6.17. A non-trivial relation of H is any element r ∈ ker φ distinct from the identity 1 ∈ F . Notice that r takes the form ι(xi1 )x1 ι(xi2 )n2 · · · ι(xik )nk , where xj ∈ X. Moreover, to say that r ∈ ker φ is equivalent to say that xni11 xni22 · · · xnikk = 1, and so we will continue to say that the last identity is a “relation”. 238 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Given a set of relations R = {ri }i∈I , we say that a relation r is a consequence of the relations ri ’s, if r belongs to the normal subgroup of F generated by the ri ’s. Moreover, we say the set R is a complete set of relations if the kernel of φ is the normal subgroup of F generated by R. Given a complete set of relations R = {ri }i∈I the group H is completely determined, up to isomorphism, by the set of generators X and by the set R. In fact, H is isomorphic to quotient of F by the normal subgroup generated by R. One calls the pair (X, R) a presentation of the group H. Two groups with the same presentation are obviously isomorphic. On the other hand, in general, a group admits many distinct presentations. Example 4.6.18. Consider the group H = Z2 ⊕Z3 . It is easy to see that this group is generated by the element x1 = (1, 1). The free group F generated by X = {x1 } is isomorphic to Z, where we identify x1 with the element 1. There exists a unique surjective group homomorphism φ : Z → Z2 ⊕ Z3 . This homomorphism maps 1 to (1, 1) and the element 6 belongs to its kernel. In fact, 6 generates the kernel, so R = {6} is a complete set of relations for H, and the pair {{x1 }, R} is a presentation of H. Using the multiplicative notation we can write: {{x1 }, x61 = 1} Now note that x1 = (1, 0) and x2 = (0, 1) is an equally valid choice of generators for Z2 ⊕ Z3 . In this case, we should take instead the free group F = Z ⋆ Z, and the kernel of φ : Z ⋆ Z → Z2 ⊕ Z3 is the normal group generated by x21 , x32 −1 and x1 x2 x−1 1 x2 . Therefore, we obtain as a different presentation for H: −1 {{x1 , x2 }, x21 = 1, x32 = 1, x1 x2 x−1 1 x2 = 1}. Unfortunately, the characterization of a group by presentations does not solve the classification problem for groups. In the 1950’s, Adyan and Rabin11 have shown, independently, that there is no algorithm to decide if two group presentations represent or not isomorphic groups. However, in the case of finitely generated abelian groups one can indeed use presentations to classify them, as we will now sketch. We will see in Chapter 6, in the context of the theory of modules, that this procedure can be generalized to obtain a classification of finitely generated modules over a principal ideal domain D (the case of abelian groups corresponds to taking D = Z). 11 S. Adyan, “The unsolvability of certain algorithmic problems in the theory of groups”, Trudy Moskov. Mat. Obsc. 6 (1957), 231-298, and M. Rabin, “Recursive unsolvability of group theoretic properties”, Ann. of Math. 67 (1958), 172-174. 4.6. FREE GROUPS, GENERATORS AND RELATIONS 239 Let us assume then that A is a finitely generated abelian group, with a set of generators X = {x1 , x2 , · · · , xn } ⊆ A. Let Zn = ⊕nk=1 Z be the free abelian group on X, with ι : X → Zn given by ι(xk ) = ek . The generators ni of the kernel of φ : Zn → A take the form ni = (ri1 , ri2 , · · · , rin ), where rik ∈ Z, and correspond to relations of the form ri1 x1 + ri2 x2 + · · · + rin xn = 0. We can assemble these generators into a matrix R of dimension m × n with entries rik ∈ Z. Hence, in practical terms, a presentation of the finitely generated abelian group A is simply a matrix R ∈ Mm,n (Z). As we have discussed before, we can replace X by any other set of generators of A, which has the effect of changing the matrix R. We leave as an exercise to check that: Proposition 4.6.19. Let A be an abelian group generated by the set X = {x1 , x2 , · · · , xn }, and let R ∈ Mm,n (Z) be the matrix with the corresponding list of relations. Then: (i) If = (skj ) is an invertible matrix in Mn (Z), then the elements yk = PS n k=1 skj xj , are also generators of A, and (ii) The matrix RS −1 gives the list of relations for the generators {y1 , . . . , yn }. The rows l1 , l2 , · · · , lm of the matrix R are the generators of kernel of φ in It should now be clear that if P = (pkj ) is an invertible matrix in Mm (Z), then the elements l1 , l2 , · · · , lm can be replaced by elements l1′ , l2′ , · · · , λ′m , P m where lk′ = j=1 pkj lj . This amounts to a having a new presentation of A with matrix P R. In other words, we see that: Zn . Proposition 4.6.20. Let A be an abelian group generated by the set X = {x1 , x2 , · · · , xn }, and let R ∈ Mm,n (Z) be the matrix with the corresponding list of relations. If P and Q are any invertible matrices, respectively in Mm (Z) and in Mn (Z), then P RQ is also the matrix of a presentation of A. In particular, we can choose P and Q to perform the usual elementary operations of Gauss elimination on the rows and columns of R. Note, however, we can only use invertible operations in Mn (Z), which allows us to: • Switch rows or columns, • Add a row (or column) a multiple of another row (or column), • Multiply a row (or column) by −1. 240 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Examples 4.6.21. 1. Let A be an abelian group generated by X = {x1 , x2 }. Assume that these generators satisfy the relations 6x1 − 6x2 = 0 and 12x1 + 20x2 = 0. We can perform the following sequence of operations on the resulting matrix: 6 −6 6 −6 96 0 96 0 −→ −→ −→ . 12 20 30 2 30 2 0 2 We conclude that A is an abelian group which has a set of generators y1 and y2 , satisfying 2y1 = 0 and 96y2 = 0. Hence, A ≃ Z2 ⊕ Z96 . More precisely, there exists a surjective homomorphism φ : Z2 → A with kernel N = h2i⊕h96i, so it follows that A ≃ Z2 /N ≃ Z/h2i ⊕ Z/h96i = Z2 ⊕ Z96 . 2. The group Z6 ⊕ Z16 has generators x1 = (1, 0) and x2 = (0, 1), and these satisfy the relations 6x1 = 0 and 16x2 = 0. Again, we can perform the following operations: 6 0 6 0 6 −12 −→ −→ −→ 0 16 6 16 6 4 18 −12 2 4 −→ 0 2 −48 4 −→ 2 0 0 48 to conclude that: Z6 ⊕ Z16 ≃ Z2 ⊕ Z48 . In both these examples, the first aim was to find the greatest common divisor of all entries of the matrix. In fact, we have: Proposition 4.6.22. Let R ∈ Mn (Z) and d = gcd(R) be the greatest common divisor of all entries of R. There exist invertible matrices P, Q ∈ Mn (Z) ′ = gcd(R′ ) = d. such that the matrix R′ = P RQ has R11 The proof of this proposition is not hard, but we will see in Chapter 6 a much more general result so we omit it. Note that once we have obtained a matrix R with an entry equal to gcd(R), then we can use it to eliminate all other entries in the same row and column. In the examples above, where we had 2 × 2 matrices, the elimination stops after this step. For matrices of dimension n×n, with n > 2, we can move the gcd to upper left corner, as in the previous proposition, and eliminate all entries in the first row and column. One starts all over 4.6. FREE GROUPS, GENERATORS AND RELATIONS 241 with now a matrix R′ of dimension (n − 1) × (n − 1), formed by the elements rij , with i > 1 and j > 1, where there exist, possibly, non-zero elements. The gcd of the entries of R′ is a multiple of the gcd of the entries of R, so this process yields a sequence of integers d′1 |d′2 | · · · |d′n . Example 4.6.23. We illustrate the algorithm with a 3 × 3 matrix.        3 0 0 3 0 0 3 0 0 3  9 6 12  −→  0 6 12  −→  0 6 12  −→  0 12 6 24 0 6 24 0 0 12 0  0 0 6 0  0 12 After this elimination process, the homomorphism φ : Zn → A is given by φ(k1 , k2 , · · · , kn ) = k1 y1 + k2 y2 + · · · + kn yn , and has kernel the subgroup N = hd′1 i ⊕ · · · hd′n i. Hence, we conclude that A ≃ Zn /N ≃ Zd′1 ⊕ · · · Zd′n . It is possible that 1 = d′1 = d′2 = · · · = d′k , for some k ≤ n. The correspondent quotients Zd′i are trivial and we can ignore them. We can also have 0 = d′j = d′j+1 = · · · = d′n for some j ≤ n. The corresponding quotients are Zd′i ≃ Z. We conclude that: Theorem 4.6.24 (Classification of finitely generated abelian groups). If A is a finitely generated abelian group, then A ≃ Zd1 ⊕ · · · ⊕ Zdm ⊕ Zr , where 1 < d1 |d2 | · · · |dn . The integer r is the number of integers d′j which are zero and is called the characteristic of the abelian group A. The integers d1 , . . . , dn are called invariant factors or torsion coefficients of the abelian group A. These integers characterize the abelian group up to isomorphism. For this reason, we say that they form a a complete set of invariants of an abelian group. In general, for any abelian group A, the subset of A formed by all elements of finite order is a subgroup called the torsion subgroup of A, which we will denote by Tors(A). If Tors(A) is trivial, the group is said to be torsion free, and if Tors(A) = A, then we say that A is a torsion group. In any case, the quotient A/ Tors(A) is always a torsion free group. 242 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Example 4.6.25. In Examples 4.6.21 both groups are torsion. The group A has invariant factors 2 and 96, while the invariant factors of Z6 ⊕ Z16 are 2 and 48. Exercises. 1. Let G be a group and X ⊆ G. Show that the intersection of all subgroups (respectively, normal subgroups) of G which contain X is the smallest subgroup (respectively, normal subgroup) of G that contains X. 2. Show directly from the definition of a free abelian group F on a set X, that the image ι(X) is a set generating F . 3. Prove Proposition 4.6.11. 4. Verify that if A1 and A2 are free abelian groups generated by finite sets, then A1 and A2 are isomorphic if and only if they have the same characteristic. 5. Show that a group with two generators a and b and relations aba−1 b−1 = 1, an = 1, bm = 1, is isomorphic to the group Zn ⊕ Zm . 6. Show that a group with two generators σ and ρ and relations σ 2 = 1, ρn = 1, ρσ = σρ−1 , is isomorphic to the group Dn of symmetries of a regular polygon with n sides. 7. Show that a group with two generators a and b and relations a4 = 1, a2 b2 = 1, abab−1 = 1, is isomorphic to the group H8 . 8. Let G be a group with two generators a and b satisfying the relation a3 b−2 = 1, and let H be a group with two generators x and y satisfying the relation xyxy −1 x−1 y −1 = 1. Show that these two groups are isomorphic. 9. Give an example of two non-isomorphic abelian groups A1 and A2 such that Tors(A1 ) ≃ Tors(A2 ), and A1 / Tors(A1 ) ≃ A2 / Tors(A2 ). 10. Find the invariant factors of Z2 ⊕ Z4 ⊕ Z6 and Z16 ⊕ Z3 ? Are these two groups isomorphic? 4.6. FREE GROUPS, GENERATORS AND RELATIONS 243 11. Consider the groups in Examples 4.6.21. In each case, there exist generators {y1 , y2 } such that n1 y1 = n2 y2 = 0, where n1 , n2 are the invariant factors of the group. What is the relation between these generators and the “original” generators x1 and x2 ? Find the group homomorphisms φ which allow one to conclude that these two groups are isomorphic to Z2 ⊕ Z96 and Z2 ⊕ Z48 , respectively. 12. Proof Proposition 4.6.15. 13. Let {Gi : i ∈ I} be a family non-trivial groups with #I > 1. Show that Qof ∗ the non-abelian free product i∈I Gi has elements of infinite order and that its center is trivial. 14. Let G and H be groups. Show that if some element g 6= 1 in the free product G ∗ H has finite order, then g is conjugate to an element of G or of H. 15. Give proofs for Propositions 4.6.19 and 4.6.20. 16. In the proof of Theorem 4.6.24 we have assumed that the number of generators of the kernel of φ (denoted m) equals the number of generators of the group A (denoted n). Does this assumption imply any loss of generality? 244 CHAPTER 4. QUOTIENTS AND ISOMORPHISMS Chapter 5 Finite Groups 5.1 Groups of Transformations In this Chapter we study one of the most classic themes of Algebra, the structure of finite groups1 . We know already many examples of finite groups, such as the finite cycles groups Zn , the symmetric groups Sn or the dihedral groups Dn . These groups are very distinct, although one can certainly find some connections between them. For example, Dn contains a normal subgroup isomorphic to Zn (the subgroup of rotations), and S3 ≃ D3 . Our aim now is, precisely, to study the finite groups in a more systematic way, trying to make such connections more evident. We have already studied some of the properties of the symmetric group Sn , the group of bijections of the set {1, . . . , n}. This group has a central role in the study of finite groups, since as we will see shortly, Cayley”s Theorem states that any finite group is isomorphic to a subgroup of Sn . More generally, we call a group of transformations of a set X any subgroup of the group SX of bijections of X. Often, it is helpful to represent an abstract group as a group of transformations, since some properties of the group become more intuitive and geometric. In fact, the notion of a group of transformations is so natural that historically it preceded the abstract notion of a group: the great mathematicians of the 19th Century that many fundamental discoveries in Group Theory, such as Galois and Lie only worked with transformation groups and did not know of the abstract 1 Although this is a classic theme, one of the greatest achievements of modern days mathematics was the classification of finite simple groups, a class of groups we will study later. This classification occupies several hundreds of pages of mathematical literature. For a brief account see, e.g., the excellent article by R. Solomon, “On finite simple groups and their classification”, Notices of the American Mathematical Society 42, 231–239 (1995) 245 246 CHAPTER 5. FINITE GROUPS notion of a group, which would only be formalized later in the beginning of the 20th Century. The passage from a an abstract group to a group of transformations is done through the notion of group action, whose formal definition is as follows: Definition 5.1.1. An action of a group G on a set X is a map G×X → X, written (g, x) 7→ gx, which satisfy the following properties: (i) ∀x ∈ X, ex = x; (ii) ∀g1 , g2 ∈ G, ∀x ∈ X, g1 (g2 x) = (g1 · g2 )x. A slightly different, but equivalent, point of view is the following. Assume that a group G acts on a set X. For each g ∈ G define a transformation T (g) : X → X by setting T (g)(x) ≡ gx. Then conditions (i) and (ii) in the definition of an action are equivalent to, respectively, (i’) T (e) = I (identity transformation); PSfrag replaements Z (ii’) ∀g1 , g2 ∈ G, T (g1 · g2 ) = T (g1 ) ◦ T (g2 ). h mi Conversely, given a transformation T (g) : X → X, for each g ∈ee~ G, satisfying h1i = Zgx ≡ T (g)(x). (i’) and (ii’), one obtains an action of G on X by the formula h2i Note, also, that each transformation T (g) is bijective. h4i h12i h6i X g2 x T (g2 ) h3i m; n)i hmi hni T (gh1mm( ) m; n)i x hmd( g1 (g2 x) T (g1 g2 ) Figure 5.1.1: Action. Hence, the map g 7→ T (g) is a group homomorphism from G to the group SX of bijections of the set X. We will call T the associated homomorphism of action. Therefore, an action of G on X realizes G as a 5.1. GROUPS OF TRANSFORMATIONS PSfrag replaements Z h 247 mi group of transformations of X. An action is called effective if the associe ated homomorphism T is injective, i.e., if the kernel of the homomorphism e~ =Z T : G → SX is {e}. The kernel of the homomorphism T his1icalled the kernel h2i of the action. h4i h12i When G acts in two distinct sets X1 and X2 , with associated homomorh6i phisms T1 and T2 , we call the actions equivalent if there exists a bijection h3i φ : X1 → X2 such that hmd(m; n)i mi ni hmm(m; n)i h φ ◦ T1 (g) = T2 (g) ◦ φ, (5.1.1) X1 ∀g ∈ G. h X2 gx X (gx) T 1 (g ) T2 (g ) g2 x g1 (g2 x) T (g 2 ) (xT) (g1 ) T (g1 g2 ) x Figure 5.1.2: Equivalent actions. This equation can also be expressed by the commutativity of the diagram: X1 φ / X2 T1 (g) T2 (g) X1 φ / X2 Examples 5.1.2. 1. Consider the group O(n) consisting of n×n orthogonal matrices, i.e., AAT = I. There is an obvious action of O(n) on Rn given by the action of a matrix on a vector: (A, x) 7→ Ax. This is the usual way we view O(n) as a group of transformations of Rn . We saw in Chapter ?? that for each g ∈ O(n) the transformation T (g) is an isometry of Rn which fixes the origin. This action is effective (why?). 2. Similarly, consider the Euclidean group E(n) formed by all pairs (A, b), where A ∈ O(n) and b ∈ Rn , with binary operation (A1 , b1 ) · (A2 , b2 ) ≡ 248 CHAPTER 5. FINITE GROUPS (A1 A2 , A1 b2 + b1 ). The group E(n) acts on Rn by setting ((A, b), x) 7→ Ax + b. We saw in Chapter ?? that the transformations T (g), where g ∈ E(n), are isometries of Rn and that any isometry of Rn is realized by a T (g). This is also an effective action. 3. A group G acts on itself by left translations : G× G → G, (g, x) 7→ g ·x. This action is effective (why?). We can also define an action of G on itself by right translations by setting (g, x) 7→ xg −1 . These actions are equivalent: an equivalence is provided by the inverse map φ : G → G, x 7→ x−1 . 4. Another action of G on itself is the action by conjugation, which is defined by: (g, x) 7→ g x ≡ gxg −1 , (5.1.2) g, x ∈ G. It is easy to check that conditions (i) and (ii) in Definition 5.1.1 are satisfied and that they can be written as: e ( x) = g1 ·g2 x. g1 g2 x = x, We leave it as an exercise to check that the kernel of this action is the center C(G) of the group G. 5. Let G be a group and H ⊂ G a subgroup. Then G acts on the set of left cosets G/H by setting (g, xH) 7→ (g · x)H. An element g ∈ G belongs to the kernel of this action if and only if: (g · x)H = xH, ∀x ∈ G ⇐⇒ g ∈ xHx−1 , ∀x ∈ G. Hence the kernel of this action is \ xHx−1 . x∈G We leave it as an exercise to check that this is the largest normal subgroup of G which is contained in H. Hence, the action is effective if and only if there is no non-trivial subgroup of H, normal in G. We will explore the actions of a group on itself, as in the previous examples, to obtain information about the structure of the group. A simple application of this idea yields the following: Theorem 5.1.3 (Cayley). Let G be a finite group of order n. Then G is isomorphic to a subgroup of Sn . Proof. Consider the action of G on itself by left translations. Since this action is effective, the associated homomorphism T : G → SG ≃ Sn is injective. 5.1. GROUPS OF TRANSFORMATIONS 249 PSfragofreplaements When G acts on a set X we obtain a partition X as follows. One deZ an element fines an equivalence relation ∼ on X where x ∼ y if there exists g ∈ G such that x = gy. An equivalence class of ∼ is called a G-orbit. Hence, we obtain a partition of X into G-orbits, where the G-orbit containe ing the element x ∈ X is e~ hmi h1i = Z h2i Ox ≡ {gx : g ∈ G}. h4i h12i The set of G-orbits is denoted by X/G. When X is a finiteh6set there exists i h3i a finite number of orbits O1 , . . . , On , and each orbit has a finite number of hmd(m; n)i elements. Therefore, counting elements, we obtain a class hequation: mi |X| = |O1 | + · · · + |On |. (5.1.3) hmm( hni m; n)i X X2 1 X. An action is called transitive if it has only one orbit, namely gx X Ox x g1 x gk x g2 x (x) (gx) T1 (g ) T2 (g ) g1 (g2 x) T (g 2 ) T (g 1 ) T (g1 g2 ) Figure 5.1.3: A G-orbit of x. Examples 5.1.4. 1. The orbits of the action of the group O(n) on Rn are the spheres Sr = {|x| = r : x ∈ Rn } (r > 0) and the origin {0}. 2. The action of G on itself by (right or left) translations has only one orbit, hence it is transitive. 3. Let H ⊂ G be a subgroup. The H-orbits of the action of H on G by left (respectively, right) translations are the right (respectively, left) cosets of H. In Chapter ?? we saw that the orbits of this action all have the same cardinal and that from this fact, and the class equation (5.1.3), it follows immediately Lagrange’s Theorem. 250 CHAPTER 5. FINITE GROUPS 4. Consider the action of G on itself by conjugation. The G-orbits are usually called conjugacy classes. We will denote the conjugacy class containing the element x ∈ G by G x. Two elements x and y are said to be conjugate if they belong to the same conjugacy class, i.e., if y = gxg −1 , for some g ∈ G. An element x belongs to the center C(G) if and only if G x = {x}, so the center C(G) is the union of all conjugacy classes containing a single element. Definition 5.1.5. If G acts on X, the isotropy subgroup or stabilizer group of an element x ∈ X is the subgroup Gx ⊂ G given by Gx = {g ∈ G : gx = x}. (5.1.4) When two elements x, y ∈ X belong to the same orbit, the corresponding isotropy subgroups are conjugate. In fact, if y = gx for some g ∈ G, then h ∈ Gx ⇐⇒ ⇐⇒ ⇐⇒ hx = x, hg −1 y = g−1 y, ghg −1 y = y ⇐⇒ ghg −1 ∈ Gy . Hence, the isotropy subgroups of x and of y = gx are related by Gy = gGx g−1 . Moreover, the orbits are determined by the isotropy subgroups, as shown by the following proposition: Proposition 5.1.6. Assume a group G acts on a set X. For any x ∈ X, the map φ : G/Gx → Ox , hGx 7→ hx, gives an equivalence between the G-action on G/Gx and the G-action on the orbit through x. We leave the proof for the exercises. In particular, for an action on a finite set we conclude that: Corollary 5.1.7. If G acts on a finite set X with orbits Ox1 , . . . , Oxn then (5.1.5) |X| = n X i=1 n X [G : Gxi ]. |G/Gxi | = i=1 The actions that we have described are sometimes called left actions, because they satisfy (ii) in Definition (5.1.1). One can also consider right actions where one writes the action as X × G → X, (x, g) 7→ xg, and where (ii) is replaced by ∀g1 , g2 ∈ G, ∀x ∈ X, (xg1 )g2 = x(g1 · g2 ). 5.2. THE SYLOW THEOREMS 251 Unless otherwise mentioned, we will always consider left actions. You should check how to modify the examples in this section in order to obtain right actions. Exercises. 1. Show that the kernel of the action by conjugation of G on itself is the center C(G). 2. Let H be a subgroup of G. Show that subgroup of G which is contained in H. T −1 x∈G xHx is the largest normal 3. Give a proof of Proposition 5.1.6. 4. An action of a group G on a set X is called a free action if any e 6= g ∈ G acts without fixed points, i.e,, if the isotropy subgroups Gx are trivial for all x ∈ X. Show that a free action is effective and determined which of the actions in Examples 5.1.2 are free. 5. One says that a group G acts by automorphisms on a group K if there exists an action of G on K such that, for each g ∈ G, the map k 7→ gk is an automorphism of K. Assuming that G acts by automorphisms on K, show that: (a) the binary operation (g1 , k1 ) ∗ (g2 , k2 ) = (g1 g2 , k1 (g1 k2 )) defines a group (G× K, ∗), called the semidirect product of G by K, which is denoted by G ⋉ K; (b) the maps G → G⋉K : g 7→ (g, e) e K → G⋉K : k 7→ (e, k) are monomorphisms. The image of second monomorphism is a normal subgroup of G ⋉ K. 6. Consider the action of O(n) on (Rn , +). Show that this action is by automorphisms, and describe the semidirect product O(n) ⋉ Rn . 7. Determine the partition of the symmetric group Sn into conjugacy classes. (Hint: Consider first the case n = 3.) 5.2 The Sylow Theorems If G is a finite group, then the order of a subgroup divides the order of the group. When G is a cyclic group, then for each divisor d of |G| there exists exactly one subgroup of G of order d. In general, for a finite group G, it is natural to ask: 252 CHAPTER 5. FINITE GROUPS • Given a factor d of |G|, is there a subgroup of G of order d? In this section we will explore the action by conjugation of the group G on itself, and the corresponding class equation (5.1.3), to obtain some answers to this question. Let x ∈ G. The isotropy subgroup of x for the action by conjugation is precisely: {g ∈ G : g · x · g −1 = x} = {g ∈ G : g · x = x · g}, i.e., the set of all elements of G which commute with x. This group is usually called the centralizer of x, and will be denoted by CG (x). Recall that G x denotes the G-orbit of x, i.e., the conjugacy class which contains x. By Proposition 5.1.6, G x is isomorphic to G/CG (x) and we conclude that: |G x| = [G : CG (x)]. Moreover, it follows that: Proposition 5.2.1. For any finite group G one has: (5.2.1) |G| = |C(G)| + n X [G : CG (xi )], i=1 where x1 , . . . , xn are representatives of each of the conjugacy classes of G with more than one element. Proof. As we have noted before, the center C(G) is the union of all the conjugacy classes of G with exactly one element. Equation (5.2.1) now follows from the class equation (5.1.5). As we shall see now, formula (5.2.1) is specially useful to exclude the existence of subgroups of certain orders. For example, as a first elementary application we show the existence of a family of groups whose centers are non-trivial. Proposition 5.2.2. If |G| = pm , where p is a prime, then the center of G has order pk , where k ≥ 1. Proof. By Lagrange’s Theorem the order of the center of C(G) divides the order of G. Hence, from the class equation (5.2.1), one obtains pm = pk + n X i=1 [G : CG (xi )]. 5.2. THE SYLOW THEOREMS 253 Since xi does not belong to the center, CG (xi ) 6= G and we conclude that [G : CG (xi )] = pmi , where mi ≥ 1. Hence: k m p =p − n X i=1 pmi , m, mi ≥ 1 and we must have k ≥ 1. Corollary 5.2.3. Every group G with |G| = p2 is abelian. Proof. The previous proposition implies that |C(G)| = p or p2 . We leave it as an exercise to check that the case |C(G)| = p is not possible.. The next result gives a partial answer to the question raised at the beginning of this section in the case of abelian groups. Theorem 5.2.4 (Cauchy). If G is a finite abelian group group and p is a prime factor of |G|, then G has an element g of order p. Proof. We use induction on the order |G| of G. If |G| = p then the result is obvious. Assume that |G| > p and assume the result holds for all abelian groups of order less than |G. Choose an element e 6= a ∈ G. Two cases are possible: (i) a is an element of order divisible by p. In this case, the cyclic group hai has an element g of order p, so the theorem holds. (ii) p does not divide the order of a. In this case, the group G/hai has order divisible by p. By the induction hypothesis, this group has an element bhai of order p. The order s of b is divisible by p, since hai = bs hai = (bhai)s . Hence, the cyclic subgroup hbi contains an element g of order p. Cauchy’s Theorem holds for finite abelian groups. We will see later that these groups can by classified. This classification will clarify completely what are the possible subgroups of a finite abelian group. We now turn to nonabelian groups for which we have the following generalization of Cauchy’s Theorem2 . 2 The results that follow are due to the Norwegian mathematician Ludvig Sylow (18321918) and appeared in the paper “Théorèmes sur les groups de substitutions”, Math. Ann., 5 (1872). 254 CHAPTER 5. FINITE GROUPS Theorem 5.2.5 (Sylow I). Let G be a finite group. If a prime power pk is a factor of the order of |G|, then there exists s subgroup H of G of order pk . Proof. We shall use again induce on the order |G|. Again, starting from the class equation (5.2.1): n X [G : CG (xi )]. |G| = |C(G)| + i=1 we observe that: (i) If p ∤ |C(G)|, then for some i ∈ {1, . . . , n} we have that p ∤ [G : CG (xi )]. It follows that CG (xi ) is a subgroup of G whose order is less than |G| and divisible by pk . By the induction hypothesis, there exists a subgroup H of CG (xi ) of order pk . (ii) If p | |C(G)|, by Cauchy’s Theorem there exists an element g ∈ C(G) of order p. If k = 1 we are done. If not, the subgroup hgi is normal in G and the group G/hgi has order smaller than |G| and his divisible by pk−1 . By induction, G/hgi contains a subgroup of order pk−1 . This subgroup takes the form H/hgi, where H is a subgroup of G, and we have: |H| = [H : hgi] |hgi| = pk−1 p = pk . The previous theorem motivates the following definitions: Definition 5.2.6. A group of order pk is called a p-group (of exponent k). A p-subgroup H ⊂ G where the exponent k is maximal is called a Sylow p-subgroup. The Sylow p-subgroups of a group G are, in same sense, the analogues of the subgroups of a cyclic group, as shown by the following theorem: Theorem 5.2.7 (Sylow II). Let G be a finite group and assume p | |G|. Then: (i) The Sylow p-subgroups of G are unique up to conjugation. (ii) The number of Sylow p-subgroups of G is a divisor of the index of any Sylow p-subgroup and equals ≡ 1 (mod p). (iii) Any subgroup of G of order pk is a subgroup of some Sylow p-subgroup of G. 5.2. THE SYLOW THEOREMS 255 The proof of the first Sylow’s Theorem was based on the action of G on itself by conjugation. For the proof of the second Sylow’s Theorem we will use the action of G by conjugation of the set of its subgroups: if H ⊂ G is a subgroup, then gHg−1 , g ∈ G, is a subgroup of G and |gHg−1 | = |H|. For this action, the isotropy subgroup of H ⊂ G is precisely: NG (H) ≡ {g ∈ G : gHg−1 = H}. This subgroup is usually called the normalizer of H in G. Note that H is normal in NG (H) and, in fact, we leave it as an exercise to check that NG (H) is the largest subgroup of G containing H as a normal subgroup. We can also restrict this action of G to an action on the set Π of Sylow p-subgroups of G. Lemma 5.2.8. Let P be a Sylow p-subgroup of G, and H ⊂ NG (P ) a subgroup of order pk . Then H ⊂ P . Proof of Lema 5.2.8. Since P is a normal subgroup of NG (P ), we have a homomorphism π : NG (P ) → NG (P )/P . Since H is a subgroup of NG (P ), it follows that π(H) is a subgroup of NG (P )/P of order a power of p. Since P is a Sylow p-subgroup of G, is it also a Sylow p-subgroup of NG (P ) and we conclude that p 6 ||NG (P )/P |. But then we must have π(H) = {e}, in other words H ⊂ P . Proof of Sylow II. Consider the action by conjugation of G on the set Π of Sylow p-subgroups of G. Denote by OP the orbit of a Sylow p-subgroup P . Then: (a) |OP | ≡ 1 (mod p) : Consider the action of the group P on OP , obtained by restriction of the action of G (note that the G-action on OP is transitive by definition, but not the P -action on OP ). The P -orbits which do not contain P have more than one element, since if {P̃ } is a P -orbit distinct from P , then P̃ is a Sylow p-subgroup distinct from P and P ⊂ NG (P̃ ), contradicting Lemma 5.2.8. On the other hand, P all the P -orbits have cardinality a power of p. Hence, |OP | = 1 + i pki . (b) OP = Π : Assume that P̃ ∈ Π − OP . The same argument as above applied to the action of P̃ on OP , shows that |OP | ≡ 0 (mod p), which contradicts (a). Part (i) of Sylow II is equivalent to (b). On the other hand, to show that (ii) holds, we note that (b) implies that |Π| = |G/NG (P )| = [G : NG (P )]. 256 CHAPTER 5. FINITE GROUPS Since P ⊂ NG (P ) ⊂ G, we conclude that: [G : P ] = [G : NG (P )][NG (P ) : P ], so the number of p-subgroups is a divisor of [G : P ]. By (a) and (b), we also have |Π| ≡ 1 (mod p). Finally, to show that (iii) holds, note that if H ⊂ G is a subgroup of order k p , then the orbits of the action by conjugation of H on Π have cardinality a power of p. But |Π| ≡ 1 (mod p), so at least one of the orbits is of the form {P̃ }. This means that H ⊂ NG (P̃ ). By Lemma 5.2.8, H ⊂ P̃ , as claimed in (iii). Examples 5.2.9. 1. Consider the symmetric group S3 = {I, α, β, γ, δ, ε}. Since |S3 | = 6, by Sylow I there exist Sylow p-subgroups of orders 2 and 3. By Sylow II, the number of Sylow 3-subgroups must be 1 (mod 3) and a divisor of 2. Hence, there exists 1 subgroup of order 3. Obviously, this subgroup of S3 is P = A3 = {I, δ, ε}. Similarly, the number of Sylow 2-subgroups must be 1 (mod 2) and a divisor of 3. Hence, we can have either 1 or 3 subgroups of order 2. Obviously, there are 3 subgroups of order 2: P1 = {I, α}, P2 = {I, β}, P3 = {I, γ}. It is easy to check that these subgroups are all conjugate: P1 = δP2 δ −1 = εP3 ε−1 . 2. The subgroup A4 of S4 formed by the even permutations has order 12. By Sylow I, A4 has Sylow p-subgroups of orders 3 and 4. By Sylow II, the number of Sylow 3-subgroups can be 1 or 4. It is easy to find the following 4 Sylow 3-subgroups: P1 = {I, (123), (321)} P3 = {I, (134), (431)} P2 = {I, (124), (421)} P4 = {I, (234), (432)}. We leave it as an exercise to check that all these subgroups are conjugate and to determine the Sylow 2-subgroups (which have order 4). 3. Let G be a group of order pq, where p < q are prime numbers. Assume that p 6 | (q − 1). Then we claim that G ≃ Zpq . 5.2. THE SYLOW THEOREMS 257 To see this, observe that the number of Sylow p-subgroups of G is 1 + kp and divides pq, so it is must be 1. On the other hand, the number of q-Sylow subgroups is 1 + lq and divides pq, so it must be 1. It follows that G has two normal subgroups H1 , H2 ⊂ G of order p and q, respectively. Obviously, H1 ∩ H2 = {e} and G = H1 H2 , so we conclude that G ≃ H1 ⊕ H2 ≃ Zp ⊕ Zq ≃ Zpq . For example, any group of order 15 is isomorphic to Z15 . One can also show that when p < q are prime numbers and p | (q − 1) there are exactly two groups of order pq: the abelian group Zpq and a non-abelian group with presentation: {a, b, {ap = 1, bq = 1, ab = bs a}}, where s ∈ N satisfies s 6≡ 1 (mod q) and sp ≡ 1 (mod q) (see exercises). Using the techniques developed so far, it is possible to classify all groups of order less or equal to 15, up to isomorphism. The list is as follows: Order 1 2 3 4 5 6 7 8 9 10 Groups {e} Z2 Z3 Z4 , Z2 ⊕ Z2 Z5 Z6 , D3 Z7 Z8 , Z4 ⊕ Z2 , Z2 ⊕ Z2 ⊕ Z2 , H8 , D4 Z9 , Z3 ⊕ Z3 Z10 , D5 Order 11 12 13 14 15 Groups Z11 Z12 , Z2 ⊕ Z6 , A4 , D6 , E Z13 Z14 , D7 Z15 In this list, the group E is a group of order 12 with presentation: {{a, b}, a6 = 1, a3 b2 = 1, abab−1 = 1}. By contrast, there are 14 distinct groups of order 16, 51 distinct groups of order 32, etc. Although a finite group of order n is a subgroup of Sn , there is no known formula for the number of distinct groups of order n. 258 CHAPTER 5. FINITE GROUPS Exercises. 1. Show that if G is a finite group and |G| = p2 , then G is isomorphic to Zp2 or to Zp ⊕ Zp . 2. Show that, if G is a finite abelian where all elements, excepting the identity e, have order p, then |G| = pn and G ≃ Zp ⊕ · · · ⊕ Zp . 3. Classify the finite groups of order ≤ 7. 4. Given an example of a group G where the converse to Lagrange’s Theorem fails, i.e., such that |G| has a factor d but G has no subgroup of order d. Hint: Consider G = A4 . 5. Show that the normalizer NG (H) of a subgroup H ⊂ G is the largest subgroup of G containing H as a normal subgroup. 6. Determine the Sylow p-subgroups of the alternating group A4 and their conjugacy relations. 7. Determine all the p-subgroups of the group of quaternions H8 . 8. Determine the Sylow p-subgroups of the dihedral group Dp when p is a prime. 9. Let φ : G1 → G2 be a surjective group homomorphism between finite groups. Show that is P ⊂ G1 is a Sylow p-subgroup, then φ(P ) is a Sylow p-subgroup of G2 . 10. Show that if P ⊂ G is a Sylow subgroup then NG (NG (P )) = NG (P ). 11. Let p and q be distinct prime numbers such that p|(q − 1) and s a natural number such that s 6≡ 1 (mod q) and sp ≡ 1 (mod q) (one can show that such an s always exists). Show that: (a) The map α : Zq → Zq , k 7→ sk, is an automorphism of Zq ; (b) The map θ : Zp × Zq → Zq , (i, k) → αi (k), is an action of Zp on Zq by automorphisms; (c) The semi-direct product Zp ⋉ Zq is a group with two generators a and b satisfying the relations: ap = 1, bq = 1, ab = bs a. 5.3. NILPOTENT AND SOLVABLE GROUPS 5.3 259 Nilpotent and Solvable Groups The p-groups, as we saw in the last section, play a crucial role in the study of the structure of finite groups. A p-group is an example of a nilpotent group. In this section we will study this class of groups, as well as the largest class of solvable groups. These two classes of groups appear naturally when one looks at the failure of elements of a group in commuting. Let G be any group. The commutator of two elements g1 , g2 ∈ G is the element g1 −1 g2 −1 g1 g2 ∈ G. We denote this element by (g1 , g2 ) 3 , so that: g1 g2 = g2 g1 (g1 , g2 ). The element (g1 , g2 ) measures the failure of g1 and g2 in commuting. The next result gives some elementary properties of commutators whose verification is a simple exercise. Proposition 5.3.1 (Properties of commutators). Let g1 , g2 , g3 ∈ G be any elements in a group. Then: (i) (g1 , g2 )−1 = (g2 , g1 ); (ii) (g1 , g2 ) = e if and only if g1 and g2 commute; (iii) g (g1 , g2 ) = (g g1 , g g2 ); (iv) (g1 g2 , g3 ) · (g2 g3 , g1 ) · (g3 g1 , g2 ) = e; (v) g1 ((g −1 , g ), g ) · g3 ((g −1 , g ), g ) 1 2 2 3 3 1 · g2 ((g −1 , g ), g ) 3 1 2 = e; (vi) If φ : G → H is a group homomorphism, then φ((g1 , g2 )) = (φ(g1 ), φ(g2 )). Let A, B ⊂ G be subgroups. We will denote by (A, B) the subgroup of G generated by all commutators (a, b), where a ∈ A and b ∈ B. So, by definition, (A, B) is the smallest subgroup of G which contains all elements (a, b), with a ∈ A, b ∈ B. Notice that since (A, B) is a group, if (a, b) ∈ (A, B) then (b, a) = (a, b)−1 ∈ (A, B), so that (A, B) = (B, A). Notice also that there can exist elements in (A, B) which are not commutators. In general, the elements of (A, B) are of the form (a1 , b1 )±1 · (a2 , b2 )±1 · · · (as , bs )±1 , ai ∈ A, bi ∈ B, where s ≥ 1. 3 Often the commutator is also denoted by [g1 , g2 ], but we will reserve this notation for the commutator in the context of the so called Lie algebras. 260 CHAPTER 5. FINITE GROUPS Definition 5.3.2. The derived group G is the subgroup (G, G) of G. We denote this group by D(G)4 . One also calls D(G) = (G, G) the commutator group of G but this name is a little bit misleading since, as we have already observed, there can exist elements in D(G) which are not commutators. Proposition 5.3.3 (Properties of the derived group). Let G, G1 and G2 be groups. (i) If φ : G1 → G2 is a group homomorphism, then φ(D(G1 )) ⊂ D(G2 ), and if φ is surjective, then φ(D(G1 )) = D(G2 ). (ii) D(G) is a normal subgroup of G. (iii) G/D(G) is an abelian group and for every homomorphism φ : G → A into an abelian abelian group A there exists a unique homomorphism φ̃ : G/D(G) → A making the following diagram commute: G π φ /6 A ♥ ♥ ♥ ♥ ♥ ♥ φ̃ ♥ G/D(G) Proof. (i) Obvious from property (vi) of the commutators. (ii) For each g ∈ G, the map h 7→ ghg −1 is a group automorphism of G. Hence, by (i), it follows that gD(G)g−1 ⊂ D(G), so D(G) is normal in G. (iii) Since g · h = h · g · (g, h) it is obvious that G/D(G) is an abelian group. If φ : G → A is a group homomorphism from G into an abelian group A, we have ḡ = g · (a, b) =⇒ φ(ḡ) = φ(g · (a, b)) = φ(g) · (φ(a), φ(b)) = φ(g). Hence, we can define φ̃ : G/D(G) → A by setting φ̃(gD(G)) ≡ φ(g). It is very easy to check that φ̃ is a group homomorphism. By construction, φ = φ̃ ◦ π, where π : G → G/D(G) is the quotient map. Note that properties (ii) and (iii) actually characterize the derived group. 4 Some authors also denote the derived group of G by G′ . 5.3. NILPOTENT AND SOLVABLE GROUPS 261 Examples 5.3.4. 1. Obviously, a group G is abelian if and only it its derived group is D(G) = {e}. 2. For the group H8 = {1, i, j, k, −1, −i, −j, −k} the derived group is D(H8 ) = {1, −1} ≃ Z2 , since the commutators of elements in H8 are either 1 or −1. This group is a normal subgroup of H8 and the quocient H8 /D(H8 ) is isomorphic to Z2 ⊕ Z2 (exercise). 3. For the symmetric group S3 = {I, α, β, γ, δ, ε} the derived group is the alternating group A3 = {I, δ, ε}. In fact, the commutator of any two permutations is necessarily an even permutation and one checks easily that, for example, δ = (α, γ) and ε = (γ, α), so all even permutations are commutators. The group A3 is normal in S3 and S3 /A3 ≃ Z2 . For a group G one defines its lower central series {C k (G)}k≥0 by: C 0 (G) ≡ G, C k+1 (G) ≡ (G, C k (G)) (k ≥ 0). One obtains a chain of groups: G = C 0 (G) ⊃ C 1 (G) ⊃ · · · ⊃ C i (G) ⊃ · · · When there exists an n such that C n (G) = C n+k (G) for all k ≥ 0, we say that the series stabilizes. Notice that this will be the case if C n (G) = C n+1 (G). For an infinite group, the series may never stabilize. Definition 5.3.5. A group G is called nilpotent if theres exists some n such that C n (G) = {e}. The smallest such integer n is called the nilpotency class of G. Examples 5.3.6. 1. A group is nilpotent of class ≤ 1 if and only it is abelian. 2. The group H8 is nilpotent of class 2, since we have C 0 (H8 ) = H8 , C 1 (H8 ) = (H8 , H8 ) = Z2 , C 2 (H8 ) = (H8 , Z2 ) = {e}. 3. The dihedral group S3 is not nilpotent: one finds that C 0 (S3 ) = S3 , C 1 (S3 ) = A3 and C 2 (A3 ) = A3 , so the lower central series stabilizes in A3 at k = 2. 4. The subgroup of GL(n, R) formed by the upper triangular matrices with 1s in the main diagonal is nilpotent of class n − 1 (exercise). 5. Any subgroup and any quotient of a nilpotent group is nilpotent (exercise). 262 CHAPTER 5. FINITE GROUPS In general, to try to check directly from the definition that a group is nilpotent, involves a considerably amount of computation. It is convenient to have alternative characterizations of a nilpotent group. For that, we will call a tower of subgroups of G a sequence G = G0 ⊃ G1 ⊃ · · · ⊃ Gm . Also, by a normal tower we mean a tower where Gk+1 is normal in Gk , for all k. In this case we will often write: G = G0 ⊲ G1 ⊲ . . . ⊲ Gm . Finally, by an abelian tower we mean a normal tower where Gk /Gk+1 is abelian, for all k. Proposition 5.3.7. The following statements are equivalent: (i) G is nilpotent of class ≤ n. (ii) There exists a tower of subgroups G = G0 ⊃ G1 ⊃ · · · ⊃ Gn = {e} where Gk+1 ⊃ (G, Gk ). (iii) There exists a subgroup A of the center C(G) such that G/A is nilpotent of class ≤ n − 1. Proof. (i) ⇔ (ii): If G is nilpotent of class ≤ n, then the tower Gk ≡ C k (G) satisfies (ii). Conversely, given a tower as in (ii), one shows easily by induction that C k (G) ⊂ Gk : in fact, C 0 (G) = G = G0 and C k (G) ⊂ Gk ⇒ C k+1 (G) = (G, C k (G)) ⊂ (G, Gk ) ⊂ Gk+1 . Therefore, C n (G) ⊂ Gn = {e}, so G is nilpotent of class ≤ n. (i) ⇔ (iii): If G is nilpotent of class ≤ n, we have (G, C n−1 (G)) = n C (G) = {e}, hence A = C n−1 (G) is a central subgroup. We leave as an exercise to check that G/A is nilpotent of class ≤ n − 1. Conversely, let A be a subgroup of C(G) such that G/A is nilpotent of class ≤ n−1. The quotient map π : G → G/A is surjective, so π(G, G) = (G/A, G/A). Iterating, we conclude that π(C n−1 (G)) = C n−1 (G/A) = {e}, so that C n−1 (G) ⊂ A. Since A is central, it follows that: C n (G) = (G, C n−1 (G)) ⊂ (G, A) = {e}. This shows that G is nilpotent of class ≤ n. 5.3. NILPOTENT AND SOLVABLE GROUPS 263 Another series associated with a group is its derived series {D k (G)}k∈N , which is defined inductively by D 0 (G) ≡ G, D k+1 (G) ≡ D(D k (G)), (k ≥ 0). Note that D 0 (G) = C 0 (G) = G and D 1 (G) = C 1 (G) = (G, G). We leave it to the exercises to check that: (5.3.1) k−1 D k (G) ⊂ C 2 (G), (k ≥ 0). Similarly, to what we did for the central series, we now define: Definition 5.3.8. A group G is called solvable if for some n, D n (G) = {e}. The smallest such integer n is called the derived length of G. Examples 5.3.9. 1. A group is solvable of length ≤ 1 if and only if it is abelian. 2. Every nilpotent group of class ≤ 2n−1 is solvable of length ≤ n. 3. The group S3 is solvable of class 2: its derived series has terms: D0 (S3 ) = S3 , D1 (S3 ) = A3 and D2 (S3 ) = {e}. 4. The subgroup of GL(n, R) formed by all invertible n × n triangular matrices is solvable (what is its derived length?). 5. Any subgroup and any quotient of a solvable group is solvable. We also have alternative characterizations of solvable groups. The proof of the following proposition is left for the exercises. Proposition 5.3.10. The following statements are equivalent: (i) G is solvable of length ≤ n. (ii) There exists a tower G = G0 ⊃ G1 ⊃ · · · ⊃ Gn = {e} where Gk is normal em G and Gk /Gk+1 is abelian, for all k ≥ 0. (iii) There exists an abelian tower G = G1 ⊲ G2 ⊲ . . . ⊲ Gn = {e}. (iv) There exists an abelian normal subgroup A ⊂ G such that G/A is solvable of length ≤ n − 1. 264 CHAPTER 5. FINITE GROUPS Example 5.3.11. We saw above that the group S3 is solvable (but not nilpotent). The group S4 is also solvable, for it admits the following abelian tower of subgroups: S4 ⊲ A4 ⊲ H ⊲ {e}, where H = {I, (12)(34), (13)(24), (14)(23)}. Exercises. 1. If π : G1 → G2 is a group homomorphism, show that π(C k (G1 )) = C k (π(G1 )) and π(Dk (G1 )) = Dk (π(G1 )). 2. Is the dihedral group Dn nilpotent? Solvable? (you answer may depend on n). 3. Show that the subgroup of GL(n, R) formed by the upper triangular matrices with 1s in the main diagonal is nilpotent of class n − 1. 4. Show that the subgroup of GL(n, R) formed by the invertible upper triangular matrices is solvable. What is its derived length? 5. Let G be a group. Show that: (a) If H1 , H2 , H3 ⊂ G are normal subgroups, then (H1 , (H2 , H3 )) ⊂ (H3 , (H2 , H1 )) · (H2 , (H1 , H3 )); (Hint: Use property (v) of the commutators.) (b) for all m, n ∈ N (c) for all n ∈ N (C m (G), C n (G)) ⊂ C m+n (G); n−1 Dn (G) ⊂ C 2 (G). 6. Show that any subgroup and any quotient of a nilpotent (respectively, solvable) group is a nilpotent (respectively, solvable) group. What about direct products? 7. Show that if G is nilpotent (respectively, solvable) of class (respectively, length) ≤ n, then G/C n−1 (G) (respectively, G/Dn−1 (G)) is nilpotent (respectively, solvable) of class (respectively, length) ≤ n − 1. 8. Give a proof of Proposition 5.3.10. 5.4. SIMPLE GROUPS 265 9. The upper central series of a group G is the tower {Ck (G)}k∈N defined inductively as follows: C1 (G) = C(G) and Ck (G) is the largest normal subgroup of G such that Ck (G)/Ck−1 (G) is the center of G/Ck−1 (G). Show that: (i) Ck (G) = {g ∈ G : (g, h) ∈ Ck−1 (G), ∀h ∈ G}; (ii) A group is nilpotent if and only if G = Cn (G), for some n. 10. Show that a p-group is nilpotent. 11. Show that a finite group is nilpotent if and only if it is a direct product of p-subgroups. (Hint: If G is a nilpotent group, show that: (a) If H ( G is a subgroup, then H ( NG (H); (b) Every Sylow subgroup P ⊂ G is normal; (c) G is the direct product of its Sylow subgroups.) 5.4 Simple Groups Contrary to what Example 5.3.11 may suggest, the symmetric groups Sn , for n ≥ 5, are not solvable. They are related to a different class of groups, which in some sense is on the other extreme relative to the class of solvable groups. Notice that if one looks for an abelian tower of subgroups in some group G, then one may end up with a non-abelian subgroup which has no normal subgroups, besides the trivial ones. This motivates partially the following definition: Definition 5.4.1. A group G is called simple if it has no other normal subgroups besides the trivial subgroups {e} and G. In other words, the simple groups are the the groups for which there exists only one congruence equivalence relation. Examples 5.4.2. 1. A subgroup of an abelian group is always a normal subgroup. Hence, an abelian group G is simple if it is cyclic of prime order p, i.e., Zp . 2. If a solvable group G is simple, then its derived group must be D(G) must be trivial, so G is abelian. Hence, the only simple solvable groups are the groups Zp with p prime. 3. If G is a simple non-abelian group then G = D(G). 266 CHAPTER 5. FINITE GROUPS The most elementary, non-abelian, finite simple groups are the alternating groups An , with n ≥ 5. Galois found that the fact that An is simple for n ≥ 5 is the reason why there is no general algebraic formula for the roots of algebraic equations of order greater or equal to 5. We will study these issues in Chapter 7. Theorem 5.4.3. The alternating groups An are simple if n ≥ 5. Proof. We will show that if {I} 6= N ⊂ An is a normal subgroup then N = An . We divide the proof into 3 steps: (i) The group An is generated by 3-cycles: Any representation of π ∈ An as product of transposition has an even number of terms. On the other hand, any product of 2 distinct transpositions can be written as a product of 3-cycles (for example, (12)(23) = (123) and (12)(34) = (123)(234)). (ii) If N contains a 3-cycle, then N = An : Assume, for example, that (123) ∈ N . Then conjugation by the element 1 2 3 4 5 ··· δ= i j k l m ··· gives δ(123)δ−1 = (ijk). This shows that the claim holds as long as δ ∈ An . This can always be achieved, eventually by replacing δ by δ → (lm)δ. (iii) N contains a 3-cycle: We choose an element α 6= I in N which has the property that moves the smallest number of integers: (PM) The number of integers i such that α(i) = i is maximal among all elements of N . We show that α must be a 3-cycle by contradiction. Suppose that α is not a 3-cycle. Since we cannot have α = (123l) (this permutation is odd), the expression of α as a product of disjoint cycles takes either of the following forms: α = (123 . . . ) . . . (. . . ), ou α = (12)(34 . . . ) . . . where in the first case α permutes at least two more elements (for example, 4 and 5). One checks easily that if we let β = (345), then γ = (α, β) belongs to N and has the following properties: (a) if i > 5 and α(i) = i, then γ(i) = i; (b) γ(1) = 1; (c) in the second case, γ(2) = 2; 5.4. SIMPLE GROUPS 267 This shows that γ is an element contradicting property (MP) of α. Hence, α must be a 3-cycle, as claimed. Corollary 5.4.4. Sn is not solvable if n ≥ 5. Proof. Assume that Sn is solvable. The theorem would imply that An ⊂ Sn is a solvable simple group, i.e., would be abelian, a contradiction. The simple groups are, in some sense, indecomposable groups and that is the reason behind their name. One can decompose any finite group into “simple components”. In fact, if G is a finite group then it has a normal tower G = A0 ⊲ A1 ⊲ . . . ⊲ Am = {e}, where Ak /Ak+1 is a simple group, for all k: one chooses for A1 any normal subgroup of A0 = G which is not contain in any normal subgroup of A0 , chooses for A2 any normal subgroup of A1 which is not contain in any normal subgroup of A1 , and so on. Definition 5.4.5. A composition series of a group G is a normal tower: G = A0 ⊲ A1 ⊲ . . . ⊲ Am = {e}, where Ak /Ak+1 is a simple group. Hence a finite group G always admits a composition series. We will see below that the Jordan-Hölder Theorem states that composition series are “essentially unique”. In order to state it precisely, consider two normal towers of a group G: G = A0 ⊲ A1 ⊲ . . . ⊲ As , G = B0 ⊲ B1 ⊲ . . . ⊲ Br. One says that: s r • the tower {Ai }i=0 is a refinement of the tower {B j }j=0 if r < s and for each i there exists a ji such that B ji −1 ⊇ Ai ⊇ B ji . s r • The towers {Ai }i=0 and {B j }j=0 are called equivalent if r = s and there exists a permutation of the indices i 7→ i′ such that ′ ′ Ai /Ai+1 ≃ B i /B i +1 . 268 CHAPTER 5. FINITE GROUPS Theorem 5.4.6 (Schreier). Let G be a group. Any two normal towers of subgroups of G which end with the trivial subgroup {e} admit equivalent refinements. s r Proof. Let {Ai }i=0 and {B j }j=0 be two normal towers ending in the trivial group. Define: Āij ≡ Ai+1 (B j ∩ Ai ). Since Āir = Ai+1 , Āi0 = Ai and Āij+1 is a normal subgroup of Āij , it follows that {Āij } is a refinement of {Ai }. Simlarly, defining B̄ ji ≡ B j+1 (Ai ∩ B j ), one obtains a refinement of {B j }. The proof is completed by invoking the following lemma whose proof is left as an exercise: Lemma 5.4.7 (Zassenhaus). If G ⊃ S ⊲ S ′ and G ⊃ T ⊲ T ′ , then S ′ (S ∩ T ′ ) is normal in S ′ (S ∩ T ), T ′ (S ′ ∩ T ) is normal in T ′ (S ∩ T ) and there is an isomorphism: T ′ (S ∩ T ) S ′ (S ∩ T ) ≃ . S ′ (S ∩ T ′ ) T ′ (S ′ ∩ T ) T′ If in Zassenhaus’s Lemma we let S = Ai , S ′ = Ai+1 , T = B j and = B j+1 , it follows that: Āij Ai+1 (Ai ∩ B j ) B̄ ji B j+1 (Ai ∩ B j ) = . ≃ = Ai+1 (Ai ∩ B j+1 ) B j+1 (Ai+1 ∩ B j ) Āij+1 B̄ ji+1 Hence, {Āij } and {B̄ ji } are equivalent refinements. As a corollary of Schreier’s Theorem, we obtain: Theorem 5.4.8 (Jordan-Hölder). Two composition series of a finite group G are equivalent. Proof. Notice that a composition series is precisely a normal tower of subgroups ending with the trivial group, and which do not admit any refinement. Therefore, by Schreier’s Theorem, any two such towers must be equivalent. The Jordan-Hölder Theorem shows that the composition series is an invariant of a finite group, i.e., any two isomorphic finite groups have equivalent composition series. Hence, they can be used to decide if two groups 5.4. SIMPLE GROUPS 269 are isomorphic. For example, two groups which have composition series of different lengths cannot be isomorphic. The length of a composition series is a numeric invariant of a finite group: any two isomorphic finite groups have the same length. Examples 5.4.9. 1. The cyclic group G = hai of order pm has composition series of length m: m−1 G = hai ⊲ hap i ⊲ . . . hap m i ⊲ hap i = {e}. 2. The group H8 = {1, i, j, k, −1, −i, −j, −k} has the following composition series of length 3: H8 ⊲ {1, i, −1, −i} ⊲ {1, −1} ⊲ {1}, H8 ⊲ {1, j, −1, −j} ⊲ {1, −1} ⊲ {1}, H8 ⊲ {1, k, −1, −k} ⊲ {1, −1} ⊲ {1}. These composition series are all isomorphic: the quotient groups Gi /Gi+1 are all isomorphic to Z2 . 3. By definition, a group is simple if has a unique trivial composition series: G ⊲ {e}. 4. A finite group G is solvable if and only if for any of its composition series G = G0 ⊲ G1 ⊲ . . . ⊲ Gm = {e} the factors Gk /Gk+1 are all cyclic groups of prime order, i.e., isomorphic to some Zpk , with pk prime. In fact, if G is solvable, all subgroups Gk are solvable and also its quotients Gk /Gk+1 are solvable (cf. Exercise 6 in Section 5.3). But any solvable simple group is abelian, hence isomorphic to some Zpk . The results in this section suggest a program to classify all finite groups: one should first classify all finite simple groups and then classify all the possible composition towers made of this groups. The classification of all finite simple groups is one of the greatest achievements of modern days Mathematics. The classification theorem can be roughly stated as follows: Theorem 5.4.10 (Classification of finite simple groups). Up to isomorphism there are 18 infinite families of simple groups and 26 sporadic simple groups. 270 CHAPTER 5. FINITE GROUPS Examples of infinite families are the groups Zp , with p prime, and the groups An , with n ≥ 5. Another example is the projective linear group PL(n, K) of some finite field K: it is obtained by taking the quotient of the group SL(n, K) of n × n matrices of determinant 1 by its center. The sporadic simple groups are more mysterious groups. For example, the largest sporadic simple group has order: 246 · 320 · 59 · 76 · 112 · 133 · 17 · 19 · 23 · 29 · 31 · 41 · 47 · 59 · 71 ∼ 8 · 1053 and it is called the monster group. Exercises. 1. Proof Zassenhaus’s Lemma. (Hint: Use the Second Isomorphism Theorem to show that each of the quotients that appear in the statement of the lemma are both isomorphic to S ∩ T /(S ∩ T ′ )(S ′ ∩ T ).) 2. Show that a cyclic p-group has a unique composition series. 3. Show that an abelian group has a composition series if and only if it is finite. 4. Determine the composition series of the following groups: (a) Z6 × Z5 ; (b) S4 ; (c) G where |G| = pq (p and q prime). 5. If a simple group G has a subgroup of index n > 1, show that the order of G divides n!. Use this to prove that A5 has no subgroup of order 15. 6. Show that a group of order pq 2 , where p and q are primes, is solvable. 7. Classify all groups of order 20. 8. Show that there exists no non-abelian simple group of order smaller than 30. 5.5 Symmetry Groups We introduced in Chapter 1 the notion of symmetry group of of a subset Ω ⊂ Rn . We will now apply the results we have obtained before on the structure of finite groups to the classification of symmetry groups. 5.5. SYMMETRY GROUPS 271 Recall that the group of symmetries of Rn is, by definition, the euclidean group E(n) consisting of all isometries of Rn . We saw in Exercise 5.1.5, that this group is isomorphic to the semi-direct product O(n) ⋉ Rn . In fact, by Theorem 1.8.6, an isometry f ∈ E(n) takes the form f (x) = Ax + b, where A ∈ O(n) represents an orthogonal transformation and b ∈ Rn a translation. Recall also that an orthogonal transformation is called a rotation if det A = 1. If Ω ⊂ Rn its group of symmetries GΩ ⊂ E(n) consists of all isometries which map Ω into itself: GΩ = {f ∈ E(n) : f (Ω) = Ω}. When Ω is bounded, it is obvious that group GΩ contains only orthogonal transformations. Moroever, we have: Proposition 5.5.1. If GΩ is the group of symmetries of a bounded set Ω ⊂ Rn , then one of the following statements holds: (i) GΩ contains only rotations; (ii) The rotations of GΩ form an index 2 subgroup (hence, normal) in GΩ . Proof. Since Ω is bounded, GΩ ⊂ O(n) and GΩ is formed by rotations iff GΩ ⊂ SO(n). If GΩ 6⊂ SO(n), then H = GΩ ∩ SO(n) is the subgroup of rotations in GΩ , and coincides with the kernel of the epimorphism det : GΩ → {1, −1}. By the First Isomorphism Theorem, it follows that [GΩ : H] = |GΩ /H| = |{1, −1}| = 2. In the reminder of this section we will consider only finite symmetry groups. We will see that the the results we have obtained so far on the structure of finite groups allows us to completely classify the symmetry groups of planar (n = 2) and tridimensional (n = 3) bounded figures. 5.5.1 Symmetry groups of plane figures Before we get to the classification of symmetry groups of plane bounded figures, let us recall some examples of plane figures whose symmetry groups we have encountered before. 272 CHAPTER 5. FINITE GROUPS Examples 5.5.2. 1. In Example 1.8.7.2 we saw that D3 , the group of symmetries of an equilateral 4π triangle, contains 6 elements: 3 rotations (I, 2π 3 , 3 ) and 3 reflections across its lines of symmetry. More generally, the dihedral group Dn , the symmetry group of a regular polygon with n sides, has 2n elements: n rotations and n reflections across the lines of symmetry of the polygon. If we denote by ρ a rotation of 2π n and by σ a reflection across one line of symmetry of the polygon, then: Dn = hρ, σi = {I, ρ, . . . , ρn−1 , σ, σρ, . . . , σρn−1 }. As we saw in Example 4.6.3.4, this group has the presentation {{ρ, σ}, σ 2 = I, ρn = I, ρσ = σρ−1 }. 2. Consider the group of symmetries of the sails of a windmill. It is a cyclic group of order 4, generated by a rotation ρ by π2 : C4 = {I, ρ, ρ2 , ρ3 }. More generally one can consider a windmill with n sails, leading to a cyclic group of symmetries with n elements: Cn = {I, ρ, . . . , ρn−1 }5 Figure 5.5.1: Plane figure with cyclic group of symmetries. The symmetry groups that one finds in these examples exhaust all the possibilities since one has the following: Theorem 5.5.3. A finite group of symmetries of a plane figure Ω ⊂ R2 is isomorphic to either Cn or Dn . Proof. Let Ω ⊂ R2 and let G be its group of symmetries, which we assume to be finite. According to Proposition 5.5.1, we need to consider two cases: 5 In the study of symmetries it is common to denote by Cn the cyclic group of order n, which we have denoted before by Zn . 5.5. SYMMETRY GROUPS 273 (a) G contains only rotations: Let ρ ∈ G be a rotation by same angle θρ . We can assume that θρ is the smallest possible among all rotations of G (which exists, since G is finite). Then {I, ρ, ρ2 , . . . } ⊂ G. On the other hand, if ρ̃ ∈ G is a rotation by an angle θρ̃ which is not among the powers ρn , then there exists some positive integer k such that: kθρ < θρ̃ < (k + 1)θρ . It follows that ρ̃ρ−k is a rotation by an angle smaller than ρ, a contradiction. Hence, G = {I, ρ, . . . , ρn−1 } = Cn . (b) G contains a reflection σ: By (a) the subgroup H ⊂ G of proper rotations is of the form H = {I, ρ, . . . , ρn−1 }. Since [G : H] = 2, we find that G = {I, ρ, . . . , ρn−1 , σ, σρ . . . , σρn−1 }. We leave as an exercise to check that in this case G ≃ Dn . If n = p is prime, |Dp | = 2p and the Sylow Theorems show that the subgroups of Dp are: (a) p subgroups of order 2: {I, σ}, {I, σρ}, . . . , {I, σρp−1 }; PSfrag replaements (b) 1 subgroup of order p: Cp = {I, ρ, . . . , ρp−1 }; Z hmi e The subgroup of order p is normal. Since p is a prime, the subgroups of e~ order 2 are conjugate via a rotation (exercise). Geometrically, this means that we can obtain any reflection from a fixed reflection by conjugating it by rotations. For example, the figure below illustrates in the case p = 5 how the reflection ρσ can be obtain from the reflection σ, conjugating by 3 the rotation ρ2 . md(m; n) h i h i mi ni hmm(m; n)i X D X1 h B C 2 D D h C A B E A A C E E A B D Ox X2 x gx g1 x gk x2 (x) (gx) B T1(g) T2 (g ) g2 x g1 (g2 x) T (g2 ) C T (g1) T (g1 g2 ) E Figure 5.5.2: Symmetries of a pentagon. The structure of the subgroups of the dihedral group Dn when n is not a prime is more complex and will not be discussed here. 274 5.5.2 CHAPTER 5. FINITE GROUPS Symmetry groups of tridimensional figures We consider now the symmetry groups G of a tridimensional figure. We start by looking at the rotational symmetries, i.e., the case where G ⊂ SO(3). Theorem 5.5.4. A finite group of rotational symmetries of a figure Ω ⊂ R3 is isomorphic to one of the following rotational symmetry groups: (i) The symmetry group Cn of a windmill with n sails; (ii) The symmetry group Dn of a regular polygon with n sides; (iii) The rotational symmetry group T of a regular tetrahedron; (iv) The rotational symmetry group O of a cube or a regular octahedron: (v) The rotational symmetry group I of a regular dodecahedron or a regular icosahedron: Proof. Let G ⊂ SO(3) be a finite subgroup. The idea of the proof consists in introducing an action of G in a fine set P and then exploring the class equation (5.1.3). 5.5. SYMMETRY GROUPS 275 For the finite set P we will take the set of poles of G: an element p ∈ = {x ∈ R3 : |x| = 1} is called a pole if there exists a non-trivial rotation g ∈ G, such that g · p = p. Therefore, p is a pole fixed by g ∈ G if and only if p ∈ S 2 ∩ L, where L the axis of rotation of g. In particular, to each g ∈ G one associates two poles. Obviously, P 6= ∅ if G is non-trivial. The group G acts on the set of its poles P : if p ∈ P is such that g · p = p for some g ∈ G, then for any h ∈ G, we have that S2 (hgh−1 )h · p = h · p. Since hgh−1 6= e of g 6= e, it follows that h · p ∈ P . The study of the action of G on P leads to the following lemma: Lemma 5.5.5. The action of G on the set of its poles P satisfies X 1 2 (5.5.1) (1 − ) =2− , rpi N i where the sum is over the orbits Oi of the action, pi is a pole representing Oi , rpi is the order of the isotropy subgroup Gpi and N is the order of G. Proof of the Lemma. For each pole p ∈ P , the isotropy subgroup Gp consists of those elements g ∈ G that fix p. Let N = |G| and rp = |Gp |. For each g ∈ G − {e} there exist 2 poles associated to g, hence we find: X X 2= (rp − 1). (5.5.2) 2(N − 1) = g∈G g6=e p∈P Let us group together the elements of P in terms of the orbits of G. If Oi is an orbit, we choose a representative pi ∈ Oi and we write ni = |Oi |. Then equation (5.5.2) yields X ni rpi − |P | = 2(N − 1), i where the sum is now over the orbits Oi of the action. Since Oi ≃ G/Gpi , it follows that ni rpi = N , and we conclude: X (5.5.3) N − |P | = 2(N − 1). i On the other hand, the class equation (5.1.3) gives X X N . (5.5.4) |P | = |Oi | = rpi i i Replacing (5.5.4) in (5.5.3), we obtain (5.5.1). 276 CHAPTER 5. FINITE GROUPS Equation (5.5.1) puts restrictions over G which eventually lead to its classification. A first remark is that there can be at most 3 orbits. Indeed, the right hand side of (5.5.1) is < 2, while each term on the left side is ≥ 21 . It follows that we have the following possibilities: (i) 1 orbit: We would have 1− 2 1 =2− . r1 N This equation has no solutions (ii) 2 orbits: We obtain 1 2 1 + = . r1 r2 N The only solution is r1 = r2 = N . This means that we have 2 poles p1 and p2 which are fixed by all elements of G. Hence, we have G = CN , the group of rotations around the axis that goes through p1 and p2 . (iii) 3 orbits: In this case we obtain 1 1 1 2 + + −1= . r1 r2 r3 N We can assume, without loss of generality, that r1 ≤ r2 ≤ r3 . It follows that r1 = 2 and we obtain the following subgroups: (a) r1 = r2 = 2, N = 2r3 . |O1 | = |O2 | = N 2, |O3 | = 2; (b) r1 = 2, r2 = r3 = 3, N = 12. |O1 | = 6, |O2 | = |O3 | = 4; (c) r1 = 2, r2 = 3, r3 = 4, N = 24. |O1 | = 12, |O2 | = 8, |O3 | = 6; (d) r1 = 2, r2 = 3, r3 = 5, N = 60. |O1 | = 30, |O2 | = 20, |O3 | = 12. We leave as an exercise to check that the cases (a), (b), (c) and (d) correspond, respectively, to the groups G ≃ D N , G ≃ T , G ≃ O and G ≃ I. 2 In case (a), the poles are the intersections of the symmetry lines of the regular polygon with the unit sphere and the intersection of the line through the center of the polygon and perpendicular to its plane with the unit sphere. In cases (b)-(d), the poles are the intersections of the symmetry axis of the regular polyhedra with the unit sphere. 5.5. SYMMETRY GROUPS 277 Proposition 5.5.1 now shows that the full symmetry groups take one of the following forms: (a) If −I ∈ G, then G = H ∪ −H, where H = {I, ρ1 , . . . , ρn−1 } the subgroup of rotations in G and −H ≡ {−I, −ρ1 , . . . , −ρn−1 }. (b) If −I 6∈ G, then G = H ∪ H̃, where H is the subgroup of rotations in G and, if −ρ ∈ H̃ then ρ has even order and ρ2 ∈ H. Since we already know what H can be, by Theorem 5.5.4, this leads to the classification of the finite groups of symmetries of a tridimensional figure Ω ⊂ R3 . Podemos descrever, of forma mais explı́cita, The symmetry groups of the regular polyhedra can be described even more explicitly. We illustrate with the case of a regular dodecahedron (for the remaining regular polyhedra, see the exercises at the end of the section). Example 5.5.6. The group of symmetries I of a dodecahedron (or its dual, a regular icosahedron) has order 60 = 22 ×3×5. By the Sylow Theorems we obtain the following subgroups: (i) Subgroups of order 5: the number of subgroups of order 5 divides 12 and is equal to 1 (mod 5). Hence, we can have either 1 or 6 subgroups of order 5. There are 6 subgroups each generated by a rotation by 2π 5 around a line that connects the centers of two parallel faces of the dodecahedron. These subgroups are precisely the isotropy subgroups of the orbit O3 . (ii) Subgroups of order 3: the number of subgroups of order 3 divides 20 and is equal to 1 (mod 3). We can have 1, 4 or 10 subgroups of order 3. There 10 subgroups each generated by a rotation by 2π 3 around a line through opposite vertices of the dodecahedron. These subgroups are precisely the isotropy subgroups of the orbit O2 . (iii) Subgroups of order 4: the number of subgroups of order 4 divides 15 and is equal to 1 (mod 2). We can have 1, 3, 5 ou 15 subgroups. There are 15 subgroups of order 2 corresponding to rotations by π around the lines through the centers of parallel edges of the dodecahedron. These subgroups are precisely the isotropy subgroups of the orbit O1 , and they give rise to 5 subgroups of order 4, formed by the rotations associated to 3 orthogonal edges of the dodecahedron (in the figure below, these are the edges parallel to the edges of the cube). This list of rotations of order 2, 3 and 5, exhaust all the elements of the group I, since: (5.5.5) 60 = |I| = 1 + |{z} 15 + |{z} 20 + |{z} 24 . order 2 order 3 order 5 278 CHAPTER 5. FINITE GROUPS This description of the elements of I also allows to show that I is a simple group. In fact, if H ⊂ G is a normal subgroup and H contains an element of order r, then H must contains all the elements of order r, since the Sylow subgroups are all conjugate and the subgroups of order 2 are not normal. Hence, the order of H is a sum of terms in equation (5.5.5). However, there is no integer that is a sum of terms of (5.5.5) and that divides 60. Hence, we must have H = I, so this is a simple group. Our study of the alternating group An suggests that I ≃ A5 . To see that this is indeed the case, consider the 5 cubes inscribed in the dodecahedron (see figure). The action of I in the vertices of the dodecahedron transforms the vertices of the cubes into vertices of the cubes (preserving their orientation). This gives an action of I in a set of 5 elements: T (R)(cube) = R(cube) Figure 5.5.3: One of the cubes inscribed in a dodecahedron. Since I is simple, this action is effective: N (T ) = {e}. Since I contains only rotations which preserve orientations, we have Im(T ) ⊂ A5 . Finally, since |I| = 60 = |A5 |, we conclude that I ≃ Im(T ) = A5 . The classification of finite groups of symmetries of figures Ω ⊂ Rn , for n > 3, is only known for small values of n. A special important case is the groups generated by reflections in hyperplanes of Rn , the so-called Coxeter groups. Its classification, obtained by Coxeter in 19346 , is related to the classification of Lie algebras, and has many applications in several domains of Mathematics and Physics. 6 H. S. M. Coxeter, Discrete Groups Generated by Reflections, Ann. Math. 35, (1934) 588-621. 5.5. SYMMETRY GROUPS 279 Exercises. 1. Complete the proof of Theorem 5.5.3. 2. Let Dp be the group of symmetries of a regular polygon with p sides. Show that if p is a prime, all subgroups of Dp of order 2 are conjugate via a rotation. 3. Complete the proof of Theorem 5.5.4. 4. Show that the action of the symmetry group I on the set of 5 cubes inscribed in the dodecahedron is effective (see Example 5.5.6). 5. Show that T ≃ A4 . (Hint: Consider the action of T on the vertices of the tetrahedron.) 6. Mostre que O ≃ S4 . (Hint: Consider the action of O on the diagonals of the cube.) 280 CHAPTER 5. FINITE GROUPS Chapter 6 Modules 6.1 Modules over a ring Let us recall the definition of a vector space over a field: Definition 6.1.1. A vector space over a field K is an abelian group (V, +) with a binary operation K × V → V , written (k, v) 7→ kv, which satisfies: (i) k(v 1 + v 2 ) = kv 1 + kv 2 , k ∈ K, v 1 , v 2 ∈ V ; (ii) (k + l)v = kv + lv, k, l ∈ K, v ∈ V ; (iii) k(lv) = (kl)v, k, l ∈ K, v ∈ V ; (iv) 1v = v, v ∈ V . Now let (G, +) be an abelian group, for which we use the additive notation. Recall that we have a binary operation which to the elements n ∈ Z and g ∈ G associates the element ng ∈ G. This operation satisfies: • n(g1 + g2 ) = ng1 + ng2 , n ∈ Z, g1 , g2 ∈ G; • (n + m)g = ng + mg, n, m ∈ Z, g ∈ G; • n(mg) = (nm)g, n, m ∈ Z, g ∈ G; • 1g = g, g ∈ G. These properties are formally analogous to the axioms in Definition 6.1.1, where we replace the elements of V (“vectors”) by elements of the group G, and the elements of the field K (“scalars”) by elements of the ring Z. 281 282 CHAPTER 6. MODULES Next consider a linear transformation T : Rn → Rn . If p(x) = a0 + a1 x + · · · + an xn ∈ R[x] is any polynomial and v ∈ Rn is a vector, we let p(x) · v ∈ Rn be defined by1 p(x) · v := a0 v + a1 T (v) + · · · + an T n (v) = n X ak T k (v), k=1 where: T 0 = I, Tk = T ◦ T ◦ ··· ◦ T (k − times). For example, if T : R3 → R3 is the linear transformation whose matrix relative to the canonical basis e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1) is:   2 0 0 A =  0 3 1 . 0 0 3 and we let p(x) = 3 + 12 x − 2x2 and v = (1, 2, 1), then 1 p(x) · v = 3(1, 2, 1) + T (1, 2, 1) − 2T 2 (1, 2, 1) = (−4, −77/2, −27/2). 2 It is easy to check that this operation satisfies the following properties: • p(x) · (v 1 + v 2 ) = p(x) · v 1 + p(x) · v 2 ; • (p(x) + q(x)) · v = p(x) · v + q(x) · v; • p(x) · (q(x) · v) = (p(x)q(x)) · v; • 1v = v. for any p(x), q(x) ∈ R[x], v, v 1 , v 2 ∈ Rn . In this example, the “vectors” are elements in Rn and the “scalars” are elements of the polynomial ring R[x]. Notice that in each of these three examples all the properties satisfied by the operation in question involve only the ring structure of the “scalars” (respectively, K, Z and R[x]) and the additive structures of the “vectors”. There are other similar circumstances where one can find analogues properties, so it is natural to extend these concepts to an arbitrary ring. The resulting algebraic structure, called module, is the basic stone to study concepts of Linear Algebra, such as linear independence, linear transformations, dimension, basis, etc. 1 From now, in order to simplify the notation, we will not use the convention of denoting an indeterminate by x. Usually, the indeterminates will be denoted by x1 , . . . , xn (or x if n = 1; or x, y if n = 2; or x, y, z if n = 3) and we will denote by bold the vectors (the elements of a vector space or, more generally, of a module). 6.1. MODULES OVER A RING 283 Definition 6.1.2. A module M over a ring A or an A-module is an abelian group (M, +), together with an operation of a unitary ring A on M , written (a, v) 7→ av, satisfying the following properties(2 ): (i) a(v 1 + v 2 ) = av 1 + av 2 , a ∈ A, v 1 , v 2 ∈ M ; (ii) (a1 + a2 )v = a1 v + a2 v, a1 , a2 ∈ A, v ∈ M ; (iii) a1 (a2 v) = (a1 a2 )v, a1 , a2 ∈ A, v ∈ M ; (iv) 1v = v, v ∈ M . To be precise, the modules we have just defined are “left modules”. We leave it to the reader the task of defining “right modules”. All the results in this chapter will be stated for left modules, but they are valid mutatis mutandis for right modules. For a commutative ring A there is little sense in distinguishing between a left and a right module. We will denote by 0A and 0M the identities in (A, +) and (M, +). Since (M, +) is an abelian group the element nv ∈ M , where n ∈ Z and v ∈ M , has the usual meaning. Similarly, we have the element na ∈ A, where n ∈ Z and a ∈ A. One checks easily the following elementary properties: Proposition 6.1.3. For any A-module M : (i) a0M = 0M , a ∈ A; (ii) 0A v = 0M , v ∈ M ; (iii) (−a)v = −(av) = a(−v), a ∈ A, v ∈ M ; (iv) n(av) = a(nv) = (na)v, n ∈ Z, a ∈ A, v ∈ M . A submodule N of an A-module M is a subgroup of (M, +) which is closed for multiplication by elements of A: if a ∈ A and v ∈ N , then av ∈ N . A submodule is obviously an A-module. Examples 6.1.4. 1. Vector spaces are the same thing as modules over a field. More generally, one can call a module over a division ring D a vector space. In this case the submodules are the linear subspaces. 2. Abelian groups are the same thing as modules over Z. In this case, the submodules coincide with the subgroups of G. 2 One can also consider modules over rings A without an identity 1, where one omits axiom (iv). We shall only consider modules over unitary rings. 284 CHAPTER 6. MODULES 3. Generalizing the example above, a fixed linear transformation T : V → V makes the vector space V into a module over the polynomial ring K[x]: given p(x) = a0 + a1 x · · · + an xn ∈ K[x] and v ∈ V then one sets: p(x) · v := a0 v + a0 T (v) + · · · + an T n (v). In this example the submodules are the linear subspaces W ⊂ V which are invariant under the linear transformation T . 4. If A is a ring and I ⊂ A is a (left) ideal then I is a A-module: if a ∈ A and b ∈ I, then ab ∈ I. Similarly, A/I is an A-module, since if a ∈ A and b + I ∈ A/I, we can set: a(b + I) = ab + I. 5. If A is a ring and B ⊂ A is a subring, then A is a B-module. In particular, the rings A[x1 , . . . , xn ] and A[[x1 , . . . , xn ]] are A-modules. 6. If G is an abelian group and End(G) is the ring of endomorphisms of G. Then G is a End(G)-module for the operation φg ≡ φ(g), φ ∈ End(G), g ∈ G. 7. Let A and B be rings and φ : A → B a ring homomorphism. If M is a B-module, then we can make M into A-module, denoted φ∗ M , which has the same underlying supporting set but a new operation defined by av ≡ φ(a)v, a ∈ A, v ∈ M. Notice that in all these examples by studying the structure of the module (e.g, the structure of its submodules) one can obtain information about the underlying objects: the subgroups of an abelian group, the invariant subspaces, etc. This will be a recurrent theme in this chapter. Definition 6.1.5. A homomorphism of A-modules φ : M1 → M2 is a map between A-modules which satisfies: (i) φ(v 1 + v 2 ) = φ(v 1 ) + φ(v 2 ), v 1 , v 2 ∈ M ; (ii) φ(av) = aφ(v), a ∈ A, v ∈ M . One defines in a obvious way monomorphisms, epimorphisms and isomorphisms of A-modules. We will also use interchangeably the terms A-linear transformations or linear map to denote a homomorphism of A-modules. Note that if φ : M1 → M2 is a linear transformation, its kernel N (φ) and its image Im(φ) are submodules of M1 and M2 . 6.1. MODULES OVER A RING 285 Examples 6.1.6. 1. A homomorphism of Z-modules is the same thing as a homomorphism of abelian groups. 2. If V1 and V2 are vector spaces over K, the K-homomorphisms φ : V1 → V2 are the usual linear transformations. If M is an A-module and N ⊂ M is a submodule, the inclusion ι : N → M is an A-linear map. The quotient M/N has a natural structure of an A-module for which the quotient map π : M → M/N is an A-linear map: M/N is an abelian group and we can define a scalar multiplication by: a(v + N ) ≡ (av) + N. One checks easily that (i)-(iv) are satisfied. The module M/N is called the quotient module of M by N . T If {Ni }i∈I is a family of submodules of an A-module M , then i∈I Ni is a submodule of M . Hence, if S ⊂ M is a non-empty set, then the intersection of all submodules of M which contains S is a submodule hSi, called the module generated by S. The elements of hSi are of the form a1 v 1 +· · ·+ar v r , where ai ∈ A and v i ∈ S.3 P If {Ni }i∈I is a family of submodules of an S A-module M , we denote by N the submodule generated by S = i i∈I i∈I Ni . If I = {1, . . . , m} is Pm finite, we write N or N + · · · + N . In general, the elements of 1 m i=1 i P i∈I Ni take the form v i1 + · · · + v im , v ij ∈ Nij . Theorem 6.1.7 (Isomorphism Theorems for Modules). (i) If φ : M1 → M2 is a homomorphism of A-modules, then there exists an isomorphism of A-modules: Im(φ) ≃ M1 /N (φ). (ii) If N1 and N2 are submodules of an A-module M , then there exists an isomorphism of A-modules: N1 N1 + N2 ≃ . N2 N1 ∩ N2 (iii) If N and P are submodules of an A-module M and M ⊃ N ⊃ P , then P is a submodule of N and there is an isomorphism of A-modules: M/N ≃ M/P . N/P 3 Note that this does not hold for P modulesPover rings without a unit. For this modules, the elements of hSi are of the form i ai v i + j nj ṽ j , where ai ∈ A, nj ∈ Z and v i , ṽ j ∈ S. 286 CHAPTER 6. MODULES We leave the easy proofs of this theorems for the exercises. Q Let {Mi }i∈I be a family of A-modules. We define the A-module i∈I Mi , called direct product of the family {Mi }i∈I , as follows. The underlying Q set of M i is the cartesian product of the Mi . If (v i )i∈I , (w i∈I Q Qi )i∈I ∈ i∈I Mi , then (v i )i∈I + (w i )i∈I denotes the element (v i + w i )i∈I Q ∈ i∈I Mi , and if a ∈ A, then a(v i )i∈I denotes the element (av ) ∈ i i∈I i∈I Mi . One Q checks immediately that Qi∈I Mi becomes an A-module. If k ∈ I, the canonical Q projection πk : i∈I Mi → Mk the A-linear map which maps (v i )i∈I ∈ i∈I Mi to the element v k ∈ Mk . L The direct sum of a family of A-modules {Mi }i∈I , denoted i∈I Mi , is Q the submodule of the direct product i∈I Mi formed by the elements (v i )i∈I where only a finite number of v i ’s are non-zero. If k ∈ I, the canonical L injection ιk : Mk → Qi∈I Mi is the A-linear map which maps v k ∈ Mk to the element (v i )i∈I ∈ i∈I Mi with all vi = 0 for i 6= k. When I = {1, . . . , m} isLfinite, the direct sum and direct product coincide. In this case we write m i=1 Mi or M1 ⊕ · · · ⊕ Mm . Proposition 6.1.8. M ≃ M1 ⊕ · · · ⊕ Mm (as A-modules) if and only if there exist A-linear maps πk : M → Mk and ιk : Mk → M such that: (i) πk ◦ ιk = idMk , k = 1, . . . , m; (ii) πk ◦ ιl = 0, k 6= l; (iii) ι1 ◦ π1 + · · · + ιm ◦ πm = idM . Proof. Assume that φ : M → M1 ⊕ · · · ⊕ Mm is an isomorphism. Then the compose of the canonical projections and injections with φ and φ−1 satisfy (i), (ii) and (iii). Conversely, if there exist A-linear maps satisfying (i), (ii) and (iii), we can defined φ : M → M1 ⊕ · · · ⊕ Mm and ψ : M1 ⊕ · · · ⊕ Mm → M by: φ(x) = (πk (x))k={1,...,m} , ψ((xk )k={1,...,m} ) = ι1 (x1 ) + · · · + ιm (xm ). These are A-linear maps which, by (i), (ii) and (iii), satisfy φ◦ψ = idM1 ⊕···⊕Mm and ψ ◦ φ = idM . Hence, φ and ψ establish an isomorphism of A-modules M ≃ M1 ⊕ · · · ⊕ Mm . If {Ni }i∈I is a family of submodules of an A-module M , we Lsay that M is the direct sum of the submodules {N } , if the map i i∈I i∈I Ni → M , P L (v i ) 7→ i v i , is an isomorphism. In this case, we write M = i∈I Ni . We have the following result which characterizes when an A-module is a direct sum of submodules. We leave the proof as an exercise: 6.1. MODULES OVER A RING 287 Proposition 6.1.9. L Let M be an A-module and {Mi }i∈I a family of submodules. Then M = i∈I Mi if and only if: P (i) M = i∈I Mi ; (ii) Mj ∩ (Mi1 + · · · + Mik ) = {0} for j 6∈ {i1 , . . . , ik }. The notion of module is also relevant for studying other algebraic structures. One important example is the following: Definition 6.1.10. Let R be a ring with identity. An algebra over R or R-algebra is a ring A such that: (i) (A, +) is an R-module; (ii) k(ab) = (ka)b = a(kb) for all k ∈ R and a, b ∈ A. If (A, +, ·) is a division ring, A is called a division algebra. The notions of subalgebra, homomorphism and isomorphism of R-algebras are more or less obvious and we leave it to the reader its definition. An algebra over a field K which, as a vector space over K, has finite dimension is called a finite dimensional algebra over K. Examples 6.1.11. 1. A field extension k ⊂ K can be viewed as an algebra over k. For example, the fields Q ⊂ R ⊂ C are algebras over each of the preceding fields. Similarly, the quaternions H is an algebra over Q and R (what about over C?). These are all division algebras. 2. Let R be a ring with identity. The set A = Mn (R) of all matrices n × n with entries in R is an algebra over R, which fails to be a division algebra. If R = K is a field, Mn (K) is an algebra over K of finite dimension. 3. If V is a vector space over a field K, the set A = EndK (V ) of all the linear maps V → V is an algebra over K. Note that A is finite dimensional iff V is finite dimensional: if dim V = n then A is isomorphic to the algebra Mn (K). 4. If A is a commutative ring with identity, the polynomial ring A[x1 , . . . , xn ] and the power series ring A[[x]] are algebras over A. Algebras where the product is non-associative are also important (4 ) but will not be discussed here. 4 For example, Lie algebras and Jordan algebras are non-associative algebras. 288 CHAPTER 6. MODULES Exercises. 1. Consider the R[x]-module structure on V = Rn defined by the linear transformation T (v1 , . . . , vn ) = (vn , v1 , . . . , vn−1 ). Determine the elements v ∈ Rn such that (x2 − 1)v = 0. 2. Let φ : M1 → M2 be a homomorphism of A-modules, and Ni ⊂ Mi (i = 1, 2) submodules such that φ(N1 ) ⊂ N2 . Show that: (a) There exists one, and only one, homomorphism of A-modules φ̃ : M1 /N1 → M2 /N2 for which the following diagram commutes: M1 φ / M2 π1 π2 M1 /N1 ❴ ❴ ❴ ❴ ❴ ❴/ M2 /N2 φ̃ (b) φ̃ is an isomorphism if and only if Im(φ) + N2 = M2 and φ−1 (N2 ) ⊂ N1 . 3. Let {Ni }i∈I be a family of A-modules. Show that: (a) Given an A-module M and homomorphisms Q {φi : M → Ni }i∈I , there exists a unique homomorphism φ : M → i∈I Ni such that for all k ∈ I the following diagram commutes: Q ❴ ❴ ❴ ❴φ ❴ ❴/ i∈I Ni M P PPP PPP PPP πk φk PPPP P' Nk (b) Q i∈I Ni is determined up to isomorphism by this property. 4. Let {Ni }i∈I be a family of A-modules. Show that: (a) Given an A-module M and homomorphisms {φi : Ni → M }i∈I , there exL ists a unique homomorphism φ : i∈I Ni → M such that for all k ∈ I the following diagram commutes: L φ M o❴ gPPP❴ ❴ ❴ ❴ ❴ i∈IO Ni PPP PPP ιk P φk PPPP P Nk (b) L i∈I Ni is determined up to isomorphism by this property. 6.1. MODULES OVER A RING 289 5. Let M be Lan A-module and {Mi }i∈I a family of submodules de M . Show that M = i∈I Mi if and only the following two conditions hold: P (i) M = i∈I Mi ; (ii) Mj ∩ (Mi1 + · · · + Mik ) = {0} se j 6∈ {i1 , . . . , ik }. 6. A sequence of homomorphisms of A-modules: M0 φ1 / M1 φ2 / ··· / M2 φn / Mn , is said to be exact if Im(φi ) = N (φi+1 ), i = 1, . . . , n − 1. Show that: (a) if N ⊂ M is a submodule, then one has an exact sequence: /N 0 ι /M π / M/N /0 (b) if M1 and M2 are A-modules, then one has an exact sequence: 0 / M1 ι1 / M1 ⊕ M2 π2 /0 / M2 7. (Five Lemma) Consider the following commutative diagram of A-modules and linear transformations: M1 φ1 N1 / M2 / M3 / M4 / M5 φ2 φ3 φ4 φ5 / N2 / N3 / N4 / N5 Show that, if the rows are exact and φ1 , φ2 , φ4 and φ5 are isomorphisms, then φ3 is an isomorphism. 8. If M and N are left A-modules, HomA (M, N ) denotes the set of all A-linear transformations φ : M → N . Show that: (a) HomA (M, N ) is a Z-module; (b) HomA (M, A) is a right A-module; (c) EndA (M ) ≡ HomA (M, M ) is a A-algebra, provided A is abelian. 9. Let M be a left A-module. The dual of M is the right A-module M ∗ ≡ HomA (M, A) (see previous exercise). Show that: (a) if φ : M → N is A-linear, there exists a dual linear transformation (of right A-modules) φ∗ : N ∗ → M ∗ ; Q L ∗ (b) ( i∈I Ni ) ≃ i∈I Ni ∗ ; (c) one can have M 6= {0} and M ∗ = {0}. 290 CHAPTER 6. MODULES 6.2 Linear Independence Let M be an A-module. A set ∅ = 6 S ⊂ M is called linearly independent if for any finite family {v 1 , . . . , v n } of elements of S and any a1 , . . . , an ∈ A: a1 v 1 + · · · + an v n = 0 =⇒ a1 = · · · = an = 0. Otherwise, we call S linearly dependent. A set S ⊂ M is called generating if M = hSi. In this case any element v ∈ M can be written as linear combination (in general, non-unique) of elements de S: m X ai vi , ai ∈ A, v i ∈ S. v= i=1 An A-module is said to be of finite type if it has some finite generating subset. A set S ⊂ M which is both generating and linearly independent is called a basis S of the A-module M . Given a basis S, any P element v ∈ M can be written in a unique way as a linear combination m i=1 ai v i , ai ∈ A, v i ∈ S. An A-module may or may not have a basis (see the examples below). We will say that an A-module M is free5 if M admits a basis. Examples 6.2.1. 1. A vector space always contains a basis (exercise). 2. The abelian group Zn , viewed as a Z-module, does not admit a basis. In fact, given g ∈ Zn , there exists always a 0 6= m ∈ Z such that mg = 0. Hence in Zn every subset is linearly dependent. 3. The abelian group Zm = Z ⊕ · · · ⊕ Z is free. A basis is given by S = {e1 , . . . , em }, where ei = (0, . . . , 1, . . . , 0). 4. If D is an integral domain, the polynomial ring D[x] is free. A basis is given by the monomials S = {1, x, . . . , xn , . . . }. 5. Any ring A with a unit is a free A-module with basis {1}. The submodules coincide with the ideals of A. In particular, a submodule may not be free, and even if its free, it may have a basis with more than one element. We say that an A-module M is cyclic if it is generated by one element, i.e., if M = hvi for some v ∈ M .6 . If M = hvi is cyclic, then we have an 5 6 This is the analogue for modules of the notion of a free group. This is the analogue for modules of the notion of a cyclic group 6.2. LINEAR INDEPENDENCE 291 homomorphism of A-modules, A → M , given by a 7→ av. This homomorphism is surjective and, by the First Isomorphism Theorem, M ≃ A/ ann v, where the annihilator of v is the ideal ann v := {a ∈ A : av = 0}. If ann v = {0}, then we say that v is a free element, since in this case M = hvi ≃ A is free. A non-free element is said to have torsion. The set of elements of M with torsion is denoted by Tors(M ). When A is a domain, Tors(M ) ⊂ M is a submodule of M . If X is any set and A is a ring, letL us associate to each x ∈ X a copy of A and form the free A-module M = x∈X A, called the free A-module generated by the set X. We can represent the elements of M as finite sums a1 x1 + · · · + ar xr , where xL 1 , . . . , xr ∈ X and ai ∈ A: by such a sum we mean the element (ax )x∈X ∈ x∈X A, where ax1 = a1 , . . . , axr = ar and ax = 0 if x 6= xi (i = 1, . . . , r). The next proposition gives a characterization of free modules. In particular, notice that the free module generated by the set X satisfies the same universal property that characterizes free groups. Proposition 6.2.2. Let A be a ring. For an A-module M , the following statements are equivalent: (i) M is free. (ii) There exists a family L {Ni }i∈I of cyclic submodules of M , with Ni ≃ A, such that M ≃ i∈I Ni . (iii) M ≃ L j∈J A for some index set J. (iv) There exists a set X 6= ∅ and a map ι : X → M such that the following universal property holds: for any A-module N and map φ : X → N there exists a unique homomorphism of A-modules φ̃ : M → N such that the following diagram commutes: ι /M X ◆◆ ✤ ◆◆◆ ✤ ◆◆◆ φ̃ ◆◆◆ φ ◆◆' ✤ N Proof. We show that (i) ⇒ (ii) ⇒ (iii) ⇒ (iv) ⇒ (i). 292 CHAPTER 6. MODULES (i) ⇒ (ii) Assume that M is free and let {ei }i∈I be a basis of M . Them, for each L i ∈ I, Ni := hei i is a cyclicPsubmodule of M isomorphic to A. The map φ : i∈I Ni → M , (v i )i∈I 7→ i∈I v i is an isomorphism of A-modules. (ii) ⇒ (iii) Obvious.L (iii) ⇒ (iv) Let ψ : j∈J A L→ M be an isomorphism of A-modules and ek = (xj )j∈J the element of j∈J A, with xk = 1 and xj = 0, for j 6= k. Also, let X = J and consider the map ι : X → M given by ι(j) = ψ(ej ). If φ : X → N is any map into an A-module N P, we can define φ̃ : M → N to be the linear transformation ψ((xj )j∈J ) 7→ j xj φ(j). Clearly, φ̃ makes the diagram above commute. Since {ψ(ek )} is a basis of M , φ̃ is unique. (iv) ⇒ (i) We leave as an exercise to check that {ι(x)}x∈X is a basis of M. Let M be a free A-moduleL which admits a finite basis {e1 , . . . , en }. The proposition shows that M ≃ ni=1 A ≡ An . Is it true that any other basis of M has the same number of elements? In other words, is it true that An ≃ Am implies n = m? Maybe somewhat surprising, the answer is no, as shown by an exercise at the end of this section. On the other hand, if M is a free A-module which admits an infinite basis we have the following result: Proposition 6.2.3. If an A-module M admits a basis with an infinite number of elements, then all the bases of M have the same cardinality. Proof. Let {ei }i∈I and {f j }j∈J be bases of M and assume that I is infinite. Then: (a) J is infinite: Assume, to the contrary, that J is finite, say J = {1, . . . , m}. Then P there exist elements cjil ∈ A with il ∈ I, l, j ∈ J, such that f j = m l=1 cjil eil . But then {ei1 , . . . , eim } is a set generating M . Hence, if e0 is another basis element distinct from eil there exist a1 , . . . , am ∈ A such that e0 = a1 ei1 + · · · + am eim , contradicting the linear independence of the {ei }i∈I . (b) ∃ φ : I → Pfin (J)×N injective: 7 Let ψ : I → Pfin (J) be the map which to i ∈ I associates {j1 , . . . , jm }, where the j1 , . . . , jm are the (unique) 7 We denote by Pfin (J) the set consisting of the finite subsets of J. It is shown in the Appendix, that if J is infinite, then Pfin (J) has the same cardinality as J. 6.2. LINEAR INDEPENDENCE 293 indices in J such that: ei = aj1 f j1 + · · · + ajm f jm (ajl 6= 0). The map ψ is not injective, but if P ∈ Pfin (J), then ψ −1 (P ) is finite (why?). Hence we can order ψ −1 (P ). If i ∈ ψ −1 (P ), set φ(i) ≡ (P, α), where α is the ordinal number of i in ψ −1 (P ). Since I is the disjoint union of the ψ −1 (P ), the map φ : I → Pfin (J) × N is injective. (c) |I| = |J|: Since J is infinite, by (b), we have |I| ≤ |Pfin (J) × N| = |Pfin (J)| = |J|. Interchanging the roles of I and J, we conclude that |J| ≤ |I|. By the Schröder-Bernstein Theorem, we see that |I| = |J|. The previous results motivate the following definition: Definition 6.2.4. We say that a ring A has the invariant dimension property if for any free A-module M , all the bases of M have the same cardinality. In this case, the common cardinality of the bases of M is called the dimension of M , denoted dimA M or simply dim M . We leave as an exercise to check that any division ring has the invariant dimension property, hence it makes sense to talk about the dimension of a vector space. We also have: Proposition 6.2.5. A commutative ring has the invariant dimension property. Proof. Let {e1 , . . . , en } and {f 1 , . . . , f m } be bases of a free A-module M . Then there are elements bji , cij ∈ A, i = 1, . . . , n, j = 1, . . . , m such that X X fj = bji ei , ei = cij f j . i By substitution, we conclude that: X fj = bji cil f l , il j ei = X cij bjl el . jl Since {e1 , . . . , en } and {f 1 , . . . , f m } are bases of M , if we define the matrices n,m B = (bji )m,n j=1,i=1 and C = (cij )i=1,j=1 , we conclude that BC = Im×m and CB = In×n . 294 CHAPTER 6. MODULES If A has zero characteristic, since A is commutative, we have8 : m = tr(Im×m ) = tr(BC) = tr(CB) = tr(In×n ) = n. The first and last equality are valid in characteristic zero. We leave the proof for non-zero characteristic as an exercise. Exercises. 1. Give an example of an A-module, non-isomorphic to A, where any set with 2 or more elements is linearly dependent. 2. Let A be a commutative ring with unit and M an A-module. (a) Show that, if v ∈ Tors(M ), then hvi ⊂ Tors(M ); (b) Is Tors(M ) always a submodule of M ? 3. Let A be a commutative ring. Show that EndA (An ) is isomorphic to the ring Mn (A) of n × n matrices with entries in A. 4. Let M be an A-module, X 6= ∅ a set and ι : X → M a map satisfying the following property: for any A-module N and map φ : X → N there exists a unique A-linear transformation φ̃ : M → N such that φ = φ̃ ◦ ι. Show that {ι(x)}x∈X is a basis of M . 5. Let V be a vector space over a division ring D. Show that V has a basis. 6. Show that a division ring D has the invariant dimension property. 7. Let A be a commutative ring. Complete the proof of Proposition 6.2.5 by showing that: (a) if B, C ∈ Mn (A) are n×n matrices, then BC = In×n implies CB = In×n ; (b) if B is a m×n matrix, C is a n×m matrix, BC = Im×m and CB = In×n , then m = n. L∞ ∞ 8. Let R∞ = i=1 R (direct sum of R-modules) and let A = End(R ) be ∞ the ring of all R-linear transformations of R . Show that A ≃ A ⊕ A (as A-modules), i.e., that A has a basis with 2 elements, so A does not have the invariant dimension property. 9. Show that any A-module is the quotient of a free A-module. 8 The trace of a n × n matrix A = (aij ) is defined by tr A = Pn i=1 aii ∈ A. 6.3. TENSOR PRODUCTS 6.3 295 Tensor Products In this section A denotes a commutative ring(9 ). In particular, as we saw before, the free modules have the invariant dimension property. If M1 , . . . , Mr , N are A-modules, an A-multilinear map is a map µ : M1 × · · · × Mr → N which is A-linear in each entry: ′ ′′ ′ ′′ µ(v 1 , . . . , av i +bv i , . . . , v r ) = aµ(v 1 , . . . , v i , . . . , v r )+bµ(v 1 , . . . , v i , . . . , v r ). We will denote by L(M1 , . . . , Mr ; N ) the set of all A-multilinear maps. One checks immediately that L(M1 , . . . , Mr ; N ) é is an A-module for the usual operations of addition and multiplication by scalars: (µ1 + µ2 )(v 1 , . . . , v r ) := µ1 (v 1 , . . . , v r ) + µ2 (v 1 , . . . , v r ), (aµ)(v 1 , . . . , v r ) := aµ(v 1 , . . . , v r ). If M1 = · · · = Mr = M we write Lr (M ; N ) instead of L(M, . . . , M ; N ). Proposition 6.3.1. Let M1 , . . . , Mr be A-modules. Nr (i) There exists an A-module i=1 Mi ≡ M1 ⊗ · · · ⊗ Mr and an Amultilinear map ι : M1 × · · · × Mr → M1 ⊗ · · · ⊗ Mr satisfying the following universal property: for any A-module N and A-multilinear map φ : M1 × · · · × Mr → N , there exists a unique A-linear map φ̃ : M1 ⊗ · · · ⊗ Mr → N which makes the following diagram commute: ι / M1 ⊗ · · · ⊗ Mr ✤ ❱❱❱❱ ❱❱❱❱ ✤ ❱❱❱❱ φ̃ ❱❱❱❱ φ ❱❱❱❱ ✤ * M1 × · · · × M❱ r N (ii) The A-module M1 ⊗ · · · ⊗ Mr is determine by the universal property in (i) up to isomorphism. Proof. Let L be the free A-module generated by the set I = M1 × · · · × Mr , i.e., M L= A, i∈I 9 One can also define tensor products of modules over non-commutative rings. The definition is similar, but one needs to distinguish between left and right modules. 296 CHAPTER 6. MODULES where in this sum there exists a term for each (v 1 , . . . , v r ) ∈ M1 × · · · × Mr . Denoting by R the submodule of L generated by the elements of the form ′ ′′ ′ (6.3.1) (v 1 , . . . , av i + bv i , . . . , v r ) − a(v 1 , . . . , v i , . . . , v r )− ′′ b(v 1 , . . . , v i , . . . , v r ), we let M1 ⊗ · · · ⊗ Mr denote the quotient module L/R. We define ι : M1 × · · · × Mr → M1 ⊗ · · · ⊗ Mr to be the compose of the canonical injection M1 × · · · × Mr → L with the quotient map L → L/R. If φ : M1 × · · · × Mr → N is an A-multilinear map, then we induce a linear map φ̄ : L → N as follows: if ⊕i∈I ai (v i1 , . . . , v ir ) ∈ L, then X φ̄(⊕i∈I ai (v i1 , . . . , v ir )) = ai φ(v i1 , . . . , v ir ). i∈I This map is well-defined, since by definition of the direct sum only a finite number of the ai ’s is non-zero. Also, it is immediate to check that φ̄ is A-linear. Since φ is A-multilinear, the linear map φ̄ vanishes on elements of the form (6.3.1), hence on the submodule R. Passing to the quotient, we obtain an A-linear transformation φ̃ : M1 ⊗ · · · ⊗ Mr → N which, by definition, satisfies φ = φ̃ ◦ ι.N ′ ′ r NrFinally,′ let ( i=1 Mi ) be an A-module and ι : M1 × · · · × Mr → ( i=1 Mi ) an A-multilinear map satisfying the universal property expressed in (i). We then obtain commutative diagrams: M1 × · · · ×PMr ι / Nr PPP PPP PPP PPP ι′ ( N ( Mi i=1 ✤ ✤ ✤ ι̃′ r ′ i=1 Mi ) ι′ M1 × · · · ×PMr Nr Mi )′ /( i=1✤ PPP PPP ✤ PP ι PPPP ✤ ι̃ ( N r i=1 Mi The compose of the A-linear maps ι̃ and ι̃′ make the following diagrams commute: Nr N ( ri=1 Mi )′ i=1 Mi ♥♥7 ♠♠6 ι′♠♠♠♠ ι♥♥♥♥ ♥♥♥ ♥♥♥ M1 × · · · ×PMr PPP PPP P ι PPPP ( N ι̃◦ι̃′ r i=1 Mi ♠♠♠ ♠♠♠ M1 × · · · ×◗Mr ◗◗◗ ◗◗◗ ◗◗◗ ◗◗( ι′ N ( ι̃′ ◦ι̃ r ′ i=1 Mi ) 6.3. TENSOR PRODUCTS 297 Since the identity transformations idM1 ⊗···⊗Mr and id(M1 ⊗···⊗Mr )′ also make the diagrams commute, the uniqueness requirement in the universal property implies that ι̃ ◦ ι̃′ =idM1⊗···⊗Mr and N ι̃′ ◦ ι̃ =id(M1 ⊗···⊗M N r )′ . Hence, these maps give an isomorphism of A-modules ri=1 Mi ≃ ( ri=1 Mi )′ . The A-module M1 ⊗ · · · ⊗ Mr is called the tensor product of the modules M1 , . . . , Mr . If (v 1 , . . . , v r ) ∈ M1 × · · · × Mr , the image ι(v 1 , . . . , v r ) ∈ M1 ⊗ · · · ⊗ Mr is denoted by v 1 ⊗ · · · ⊗ v r . Notice that in this notation, the following identity holds: ′ ′′ ′ v 1 ⊗ · · · ⊗ (av i + bv i ) ⊗ · · · ⊗ v r = a(v 1 ⊗ · · · ⊗ v i ⊗ · · · ⊗ v r ) ′′ + b(v 1 ⊗ · · · ⊗ v i ⊗ · · · ⊗ v r ). An element of M1 ⊗ · · · ⊗ Mr can be written as a linear combination of elements of the form v 1 ⊗ · · · ⊗ v r , since we saw above that the set of all such elements is a generating set. However, this representation is very far from being unique, since ι is not injective. Example 6.3.2. In the tensor product (over Z) of Z2 with Z4 , we have the following relations: 0 = 0(n ⊗ m) = 0 ⊗ m = n ⊗ 0, 1 ⊗ 2 = 2(1 ⊗ 1) = 2 ⊗ 1 = 0 ⊗ 1 = 0, 1 ⊗ 1 = 1 ⊗ 1 + 1 ⊗ 2 = 1 ⊗ 3. Hence, the only element in Z2 ⊗ Z4 which is possibly non-zero is 1 ⊗ 1 = 1 ⊗ 3. But since the bilinear map Z2 × Z4 → Z2 , (n, m) 7→ nm, maps (1, 1) 7→ 1 6= 0, we conclude that 1 ⊗ 1 6= 0, and then it follows that Z2 ⊗ Z4 ≃ Z2 . Using the definition of tensor product we obtain immediately: N Proposition 6.3.3. (Properties of ) Given A-modules M , N , P and {Mi }i∈I , there exist isomorphisms of A-modules: (i) M ⊗ N ⊗ P ≃ (M ⊗ N ) ⊗ P ≃ M ⊗ (N ⊗ P ) such that v ⊗ w ⊗ z ↔ (v ⊗ w) ⊗ z ↔ v ⊗ (w ⊗ z) for any elements v ∈ M , w ∈ N and z ∈ P; (ii) M ⊗ N ≃ N ⊗ M such that v ⊗ w ↔ w ⊗ v, for any v ∈ M and w ∈ N; L L (iii) ( i∈I Mi ) ⊗ N ≃ i∈I (Mi ⊗ N ) such that (v i )i∈I ⊗ w ↔ (v i ⊗ w)i∈I , for any v i ∈ Mi , w ∈ N . 298 CHAPTER 6. MODULES As we saw in the example above, in general the tensor product M ⊗ N involves a large number of relations between elements of the form v ⊗ w. However, int he case of free modules, there are only the “obvious relations” among these elements, as shown by the following proposition: Proposition 6.3.4. Let M and N be free A-modules, with bases {v i }i∈I and {wj }j∈J . Then M ⊗ N is free, with basis {v i ⊗ wj }(i,j)∈I×J . Proof. Obviously, {v i ⊗ wj }(i,j)∈I×J is a generating set. In order to see that L it is also linearly independent, consider the map φ : M × N → (i,j)∈I×J A defined by: X X φ( ai v i , bj wj ) = (ai bj )(i,j)∈I×J . i∈I j∈J Since φ is A-bilinear, there exists an A-linear map φ̃ : M ⊗N → such that φ = φ̃ ◦ ι e L (i,j)∈I×J A φ̃(v k ⊗ wl ) = φ(v k , wl ) = (ekl )(i,j)∈I×J . where (ekl )ij = 1, if (k, l) = (i, j), and (ekl )ij = 0, otherwise. The elements (ekl )(i,j)∈I×J form a basis of ⊕(i,j)∈I×J A, hence, the set {v i ⊗ wj }(i,j)∈I×J is linearly independent. Corollary 6.3.5. Let M and N be free A-modules with dim M = m and dim N = n. Then dim(M ⊗ N ) = mn. If φi : Mi → Ni , i = 1, . . . , r are homomorphisms of A-modules, we have the homomorphism T (φ1 , . . . , φr ) : M1 ⊗ · · · ⊗ Mr → N1 ⊗ · · · ⊗ Nr defined as follows: T (φ1 , . . . , φr ) is the unique A-linear map satisfying: T (φ1 , . . . , φr )(v 1 ⊗ · · · ⊗ v r ) = φ1 (v 1 ) ⊗ · · · ⊗ φr (v r ). Since the right hand side defines a multilinear expression in the v 1 , . . . , v r the universal property of tensor product shows that this map is well defined. In the next results we use the fact that A is commutative to write HomA (M, N ), EndA (M ) and M ∗ as left A-modules (see Exercises 8 and 9 in Section 6.1). Note that these are all free of finite dimension, provided M and N are free of finite dimension. Proposition 6.3.6. Let Mi and Ni , i = 1, . . . , r, be free A-modules of finite dimension. There exists an isomorphism: HomA (M1 , N1 ) ⊗ · · · ⊗ HomA (Mr , Nr ) ≃ ≃ HomA (M1 ⊗ · · · ⊗ Mr , N1 ⊗ · · · ⊗ Nr ), which maps the element φ1 ⊗ · · · ⊗ φr to T (φ1 , . . . , φr ). 6.3. TENSOR PRODUCTS 299 Proof. By the associativity property of the tensor product, it is enough to prove the case r = 2. So let M1 , M2 , N1 and N2 be free A-modules with bases ′ ′ ′′ ′′ ′ ′ ′′ ′′ {v 1 , . . . , v m1 }, {v 1 , . . . , v m2 }, {w1 , . . . , wn1 } and {w 1 , . . . , w n2 }, respectively. Define a basis {φij } of HomA (M1 , N1 ) and {ψkl } of HomA (M2 , N2 ) by the formulas: ′ φij (v a ) =  ′  wj  if a = i, if a 6= i, ′′ ψkl (v b ) = 0  ′′  wl  0 if b = k, if b 6= k. By the previous proposition, {φij ⊗ ψkl } is a basis of HomA (M1 , N1 ) ⊗ HomA (M2 , N2 ). On the other hand, we see that ′ ′′ T (φij , ψkl )(v a ⊗ vb ) =  ′ ′′  wj ⊗ wl  0 if (a, b) = (i, k), if (a, b) 6= (i, k). Hence, {T (φij , ψkl )} is a basis of Hom(M1 ⊗ N1 , M2 ⊗ N2 ), and we conclude that there exists an isomorphism of A-modules such that φ ⊗ ψ 7→ T (φ, ψ). This result shows that in the case of free A-modules of finite dimension we can write φ1 ⊗ · · · ⊗ φr instead of T (φ1 , . . . , φr ), without any ambiguity. Corollary 6.3.7. Let M and N be free A-modules of finite dimension. There exist isomorphisms: (i) EndA (M ) ⊗ EndA (N ) ≃ EndA (M ⊗ N ); (ii) M ∗ ⊗ N ∗ ≃ (M ⊗ N )∗ . These isomorphisms together with the following isomorphism lead to a more concrete interpretation of the tensor product of free A-modules of finite dimension. Corollary 6.3.8. Let M and N be free A-modules of finite dimension. There exists an isomorphism M ∗ ⊗ N ≃ HomA (M, N ) mapping an element l ⊗ w to the homomorphism φl,w given by v 7→ l(v)w. 300 CHAPTER 6. MODULES Proof. If {v 1 , . . . , v m } is a basis for M , let {l1 , . . . , lm } be the dual basis for M ∗ defined by:   1 if j = i, li (v j ) =  0 if j 6= i. Then, if {w1 , . . . , w n } is a basis for N , the tensor products {li ⊗ wk } form a basis for M ∗ ⊗ N . On the other hand, the A-linear transformations φli ,wk ∈ HomA (M, N ) satisfy   wk if j = i, φli ,wk (v j ) = li (v j )w k =  0 if j 6= i, hence the {φli ,wk } form a basis of HomA (M, N ), and there exists an isomorphism M ∗ ⊗ N ≃ HomA (M, N ), l ⊗ w 7→ φl,w . This leads to the following interpretation of the tensor product of free modules of finite dimension, which is often used in Geometry to characterize tensor fields and differential forms: Corollary 6.3.9. Let M and N be free A-modules of finite dimension. There exists an isomorphism M ⊗ N ≃ L(M ∗ , N ∗ ; A). Proof. The diagram in the universal property of tensor product: ι /M ❘❘❘ ❘❘❘ ❘❘❘ ❘❘❘ φ ❘❘❘ ) M × N❘ ⊗✤ N ✤ φ̃ ✤ A determines an isomorphism: L(M, N ; A) ≃ (M ⊗ N )∗ , φ 7→ φ̃. In general, this isomorphism does not characterizes the tensor product, since one can have M ⊗ N 6= {0} and (M ⊗ N )∗ = {0}. However, when M and N are both free of finite dimension, we obtain: L(M ∗ , N ∗ ; A) ≃ (M ∗ ⊗ N ∗ )∗ ≃ (M ∗ )∗ ⊗ (N ∗ )∗ ≃ M ⊗ N. 6.3. TENSOR PRODUCTS 301 It is well known that any vector space V over R can be seen as a vector space over C. One can use tensor products to generalize this construction and extend the ring of scalars of a given module, as follows. Let A be a ring and Ã an extension of A (i.e., A is a subring of Ã). We can view Ã as an A-module: if a ∈ A and b ∈ Ã, then the product ab is defined using the multiplication in Ã. Hence, if M is any A-module, we can form the A-module MÃ = Ã ⊗A M (10 ) and define an operation of Ã on MÃ by: b(c ⊗ v) := (bc) ⊗ v. One checks easily that MÃ with this new scalar multiplication operation is an Ã-module. We say that MÃ is obtained from M by extension of the ring of scalars. If φ : M → N is an A-linear map, then we obtain an Ã-linear map φ̃ : MÃ → NÃ by setting: φ̃(c ⊗ v) := c ⊗ φ̃(v). Proposition 6.3.10. If M is a free A-module, then MÃ is a free Ã-module and dimA M = dimÃ MÃ . Proof. If M ≃ ⊕i∈I A, then MÃ = Ã ⊗A M ≃ Ã ⊗A (⊕i∈I A) ≃ ⊕i∈I (Ã ⊗A A) ≃ ⊕i∈I Ã, where the last isomorphism is obtained from the isomorphism Ã → Ã ⊗A A, a 7→ a ⊗ 1. Examples 6.3.11. 1. If V is a vector space over R, then VC = V ⊗R C (sometimes called the complexification of V ) is a vector space over C. If {v 1 , . . . , v n } is a basis of V over R, then {1 ⊗ v1 , . . . , 1 ⊗ v n } is a basis of VC over C. Hence, if V ≃ Rn then VC ≃ Cn . 2. If one extends the ring of scalars of the Z-module Z to Q, one obtains a Q-module isomorphic a Q. 3. On the other hand, if one extends the ring of scalars of the Z-module Zn to Q, one obtains a trivial Q-module (exercise). 10 Whenever one works with more than one ring at the same time, it is convenient to indicate the ring as a subscript in the symbol of tensor product, so that it is clear in which ring one is taking the tensor product. 302 CHAPTER 6. MODULES There are many other constructions where tensor products, Hom and duality play a crucial role. A few of them are covered in the exercises. Exercises. 1. Check the basic properties of tensor products given in Proposition 6.3.3. 2. Let ρ1 : G → GL(V1 ) and ρ2 : G → GL(V2 ) be representations of the same group G in vector spaces V1 and V2 . Show that there exists exactly one representation ρ : G → GL(V1 ⊗ V2 ) satisfying the following property: ρ(g)(v 1 ⊗ v 2 ) = ρ1 (g)(v 1 ) ⊗ ρ2 (g)(v 2 ). 3. Show that Zm ⊗Z Zn ≃ Zq . What is the expression of q in terms of m and n? 4. Show that Q ⊗Z Zn is trivial. 5. Show that if / M1 0 / M2 / M3 /0 is a short exact sequence of A-modules (see Exercise 6, in Section 6.1) and N is an A-module, then the sequence of A-modules M1 ⊗ N / M2 ⊗ N / M3 ⊗ N /0 is also exact. Give an example that shows that the first map in this sequence may fail to be injective. 6. Let M be an A-module, and let R be the submodule de by elements of the form Nr i=1 M generated v i = v j for some i, j (i 6= j) Nr Denote by M the quotient i=1 M/R, and by v 1 ∧ · · · ∧ v r the Vr module image of v 1 ⊗ · · · ⊗ v r in M . Show that: Vr v 1 ⊗ · · · ⊗ vr , (i) the A-multilinear map ι : M × · · · × M → M ∧ · · · ∧ M , (v 1 , . . . , v r ) 7→ v 1 ∧ · · · ∧ v r is alternating, i.e., ι(v σ(1) , . . . , v σ(r) ) = sgn σ · ι(v 1 , . . . , v r ), ∀σ ∈ Sr . (ii) if φ : M × · · · × M → N is any A-multilinear alternating map, there exists a unique A-linear map φ̃ : M ∧ · · · ∧ M → N making the following diagram commute: ι / M ∧ ···∧ M M × ···× M ✤ ❯❯❯❯ ❯❯❯❯ ✤ ❯❯❯❯ ❯❯❯❯ ✤ φ̃ φ ❯❯❯* N 6.3. TENSOR PRODUCTS 303 (iii) The A-module M ∧ · · · ∧ M is determined by the universal property in (ii) up to isomorphism. V n (iv) If M is free of finite dimension n, then r M is free of dimension r if 1 ≤ r ≤ n, and of dimension 0 if r > n. Vr ∗ (v) If M is free of finite dimension, then M ≃ Ar (M ) (the module of all alternating A-multilinear maps ϕ : M · · × M} → A). | × ·{z r vezes 7. Let {Mi }i∈I be a family of A-modules where I is a partial ordered set which satisfies the following condition 11 : ∀i, j ∈ I, ∃k ∈ I : i ≤ k and j ≤ k. Assume also that for any pair i, j ∈ I with i ≤ j there exists an A-linear map φji : Mi → Mj such that whenever i ≤ j ≤ k one has: φkj ◦ φji = φki , φii = id. Show that: (a) there exists an A-module M and A-linear maps φi : Mi → M satisfying the following universal property: if N is an A-module and ϕi : Mi → N are A-linear maps such that ϕj ◦ φji = ϕi , there exists a unique A-linear map ϕ : M → N making the following diagram commute: φj i / Mj Mi✶ ❇ ✶✶❇❇❇ φi φj ⑤⑤✌✌ ✶✶ ❇❇❇ ⑤⑤ ✌ ✶✶ ❇ }⑤⑤⑤ ✌✌ ✶ ✌✌ ϕi ✶ M ✶✶ ✤ ✌✌✌ ϕj ✶✶ϕ ✤ ✌✌ ✶ ✤ ✌✌ N S Show also that M = i∈I φi (Mi ) is unique up to isomorphism. One calls M the direct limit of the family {Mi , φji } and denotes it by lim Mi ; −→ (b) if φi (v) = 0 for some v ∈ Mi , then there exists j ≥ i such that φji (v) = 0; (c) if N is an A-module, then lim(Mi ⊗ N ) = (lim Mi ) ⊗ N ; −→ −→ (d) if M1 ⊂ M2 ⊂ · · · ⊂ Mi ⊂ . . . are A-modules, find lim Mi . −→ 8. Define inverse limit of a directed family of A-modules, and show that it is characterized by a universal property analogous to the direct limit where the sense of the arrows is inverted. 11 A partial ordered set which satisfies this condition is said to be directed or filtered. 304 6.4 CHAPTER 6. MODULES Modules over Integral Domains In this section we assume that rings are commutative with a unit and the cancellation law holds, i.e., they are integral domains. In linear algebra this property of the ring leads to: Proposition 6.4.1. Let M be a module over an integral domain D. Then Tors(M ) is a submodule of M . Proof. Recall that Tors(M ) = {v ∈ M : existe a ∈ D com av = 0 and a 6= 0}. Hence, if v 1 , v 2 ∈ Tors(M ), then there exist a1 , a2 ∈ D, non-zero, such that a1 v 1 = 0 and a2 v 2 = 0. If d1 , d2 ∈ D, then a1 a2 (d1 v 1 + d2 v 2 ) = a2 d1 a1 v 1 + a1 d2 a2 v 2 = 0, with a1 a2 6= 0, for if a1 a2 = 0 then the cancellation law would imply that a1 = 0 or a2 = 0. Hence, d1 v 1 + d2 v 2 ∈ Tors(M ). One calls Tors(M ) torsion submodule of M . If M = Tors(M ), then one says that M is a torsion module. If Tors(M ) = 0, i.e., if all elements of M are free, then on says that M as module free of torsion. The next proposition gives elementary properties of the torsion submodule and its proof is left as an exercise. Proposition 6.4.2. Let D be an integral domain. (i) If φ : M1 → M2 is a D-linear transformation, then φ(Tors(M1 )) ⊂ Tors(M2 ). If φ is injective, then φ(Tors(M1 )) = Tors(M2 ) ∩ Im(φ). If φ is surjective with ker(φ) ⊂ Tors(M1 ), then φ(Tors(M1 )) = Tors(M2 ). (ii) If M is a D-module, then M/ Tors(M ) is D-module free of torsion. (iii) If {Mi }i∈I is a family of D-modules, then M M Tors( Mi ) = Tors(Mi ). i∈I i∈I The following examples show that one should not confuse the concepts of a free module and free of torsion. 6.4. MODULES OVER INTEGRAL DOMAINS 305 Examples 6.4.3. 1. If M is a free module over an integral domain, then Tors(M ) = 0 (exercise) and M is free of torsion. 2. The Z-module Q is free of torsion, but Q is not a free Z-module. 3. The modules Zn are torsion Z-modules. 4. If V is a finite dimension vector space over K and T : V → V is a linear transformation, then V is a torsion K[x]-module (exercise). Let M be a module over an integral domain D and denote by K = Frac(D) the field of fractions of D. Since K is an extension of D, we can extend the ring of scalars of M to K, resulting in a vector space MK over K. This vector space reflects the properties of M up to its torsion. Proposition 6.4.4. Let M be a D-module, K = Frac(D), and φ : M → MK the D-linear transformation v → 1 ⊗ v. Then: (i) Every element of MK is of the form v ∈ M. 1 d φ(v), where 0 6= d ∈ D and (ii) The kernel of φ is the torsion submodule Tors(M ). Proof. In order to prove (i), observe that the module MK is generated by all elements k ⊗ v, with k ∈ Frac(D) and v ∈ M . Hence, if w ∈ MK , then: w= n X i=1 ki ⊗ v i = n X ai i=1 bi ⊗ vi. Denoting by d the product of the bi ’s, there exist ci ∈ D such that hence: ! n X 1 1 w= ⊗ ci v i = φ(v). d d ai bi = ci d, i=1 The proof of (ii) is left for the exercises. We call the vector space MK over K = Frac(D) the vector space associated with the D-module M . The previous proposition suggests the following definition: Definition 6.4.5. If M is a D-module and S ⊂ M , we call the rank of S the dimension of the linear subspace MK generated by φ(S). In particular, the rank of M is the dimension dim MK . 306 CHAPTER 6. MODULES Recall that M is a finite type if it is finitely generated. The proposition above shows that: Corollary 6.4.6. Every D-module M of finite type has finite rank. Proof. If S is a finite set generating M , then φ(S) is finite and contains a basis of MK , hence dim MK < ∞. In particular, M has finite rank. Notice that the rank of a D-module is an invariant: if M1 ≃ M2 , then M1 and M2 have the same rank. The converse in general fails, so the rank is certainly not a complete invariant. Examples 6.4.7. 1. Since Q ⊗Z Q = Q the rank of Q, as a Z-module, is 1. Since Q is not finitely generated over Z, the converse to the corollary does not hold. 2. Since Q ⊗Z Zn = {0}, the rank of Zn is zero. More generally, if M is a torsion module, then its rank is zero. Corollary 6.4.8. Let M be a D-module. Then {ei }i∈I ⊂ M is a linearly independent family over D if and only if {1 ⊗ ei }i∈I ⊂ MK is a linearly independent family over K. Proof. If {ei }i∈I ⊂ M is a linearly independent family, the submodule N = L i∈I Dei is free of torsion, hence the restriction of φ : M → MK to N is injective. It follows from this corollary that if M is a free D-module then its rank equals its dimension. On the other hand, if M is a free D-module then a submodule N ⊂ M may fail to be free. In fact, we have: Proposition 6.4.9. If D is an integral domain such that for any free Dmodule M the submodules N ⊂ M are all free, then D is a PID. Proof. Since M = D is a free D-module, by assumption its submodules, i.e., the ideals I ⊂ D, must be free. Let I ⊂ D be an ideal. A basis of I contains only one element, since any two elements a, b ∈ I are linearly dependent: (−b)a + ab = 0. If {d} is a basis of I, then I = hdi, so I is a principal ideal. Actually, the principal ideal domains are characterized by the property expressed in this proposition, since we have the following result: 6.4. MODULES OVER INTEGRAL DOMAINS 307 Theorem 6.4.10. If D is a PID and M is a free D-module, then any submodule N ⊂ M is free and dim N ≤ dim M . Proof. Let {ei }i∈I be a basis for M over D and {0} = 6 N ⊂ M a submodule. If J ⊂ I, consider an ordered pair (NJ , BJ ′ ), where   M Dej  , NJ = N ∩  j∈J and BJ ′ = {f j }j∈J ′ is a basis of NJ , with J ′ ⊂ J. We denote by P the set formed by all such ordered pairs. In P we have the partial order relation given by: (NJ1 , BJ1′ ) ≤ (NJ2 , BJ2′ ) ⇔ J1 ⊂ J2 and BJ1′ ⊂ BJ2′ . We claim that we can apply Zorn’s Lemma to (P, ≤): (i) P is non-empty: Since N 6= {0}, there exists Ln J0 = {j1 , . . . , jn } ⊂ I Ln−1 such that N ∩ i=1 Deji = {0} and N ∩ i=1 Deji 6= {0}. The set {a ∈ D : aejn + n−1 X i=1 bi e j i ∈ N } is an ideal in P D, hence takes the form hd0 i. ItPfollows that there exists n−1 n−1 f 0 = d0 ejn + i=1 b0i eji ∈ N . If v = aejn + i=1 bi eji ∈ N , we must have a = kd0 and v − kf 0 = n−1 X i=1 (bi − kb0i )eji ∈ N ∩ n−1 M i=1 Deji = {0}. We conclude that B = {f 0 } is a basis of N{J0 } , so P is non-empty. (ii) Every ascending chain {(NJα , BJα′ )}α∈A in (P, ≤) has an upper bound: It is enough to consider the pair ∪α∈A NJα , ∪α∈A BJα′ . Zorn’s Lemma applied to (P, ≤) gives a maximal element (NJˆ, BJˆ′ ). To finish the proof it is enough to show that Jˆ = I, for this implies that NJˆ = N , so that BJˆ′ is a basis for N . Assume that I − Jˆ 6= ∅. Then there exists l ∈ I − Jˆ and a ∈ D such that M (6.4.1) ael + v a ∈ N for some v a ∈ Dej . j∈Jˆ 308 CHAPTER 6. MODULES The elements a ∈ D which satisfy (6.4.1) form an ideal, which must be principal: a ∈ hd0 i. We claim that BJˆ′ ∪ {f 0 }, where f 0 = d0 el + v d0 , is a basis for NJ∪{l} . Leting BJˆ′ = {f j }j∈Jˆ′ we have: ˆ is (a) BJˆ′ ∪ {f 0 } is a generating set: In fact, every element of v ∈ NJ∪{l} ˆ of the form (6.4.1), hence: v = ael + v a = a′ d0 el + v a = a′ f 0 − a′ v d0 + v a , a′ ∈ D. L It follows that −a′ v d0 + v a ∈ N ∩ ( j∈Jˆ Dej ) = NJˆ, where X v = a′ f 0 + aj f j , j∈Jˆ′ and so BJˆ′ ∪ {f 0 } is generating. (b) BJˆ′ ∪ {f 0 } is linearly independent: given a linear combination   X X X X aj f j + af 0 = bk e k aj  cjk ek  + ad0 el + a j∈Jˆ′ j∈Jˆ′ = X k∈Jˆ k∈Jˆ k∈Jˆ   X j∈Jˆ′  aj cjk + abk  ej + ad0 el . we see that if this linear combination is zero, then ad0 = 0, hence a = 0. Since the elements {f j } are linearly independent, also aj = 0 and the elements BJˆ′ ∪ {f 0 } are linearly independent. Therefore the pair (NJ∪{l} , BJˆ′ ∪{l} ) contradicts the maximality of (NJˆ, BJˆ′ ). ˆ ˆ as claimed. We conclude that I = J, One consequence of the previous theorem is that for modules of finite type over a PID, the concepts “free” and “free of torsion” actually coincide: Proposition 6.4.11. Let M be a module of finite type over a PID. If Tors(M ) = 0, then M is free. Proof. Let S be a finite generating set. In S choose a set B = {v 1 , . . . , v n } which is maximal linearly independent in S. If v ∈ S, there exist av , a1 , . . . , an ∈ D (non-unique) such that av v = a1 v 1 + · · · + an v n (av 6= 0). 6.4. MODULES OVER INTEGRAL DOMAINS 309 Q φ Set a ≡ v∈S av . Since M is torsion free, the map φ : M → LnM , w 7−→ aw, is an injective homomorphism. We claim that φ(M ) ⊂ i=1 Dv i . Since M = hSi it is enough to observe that if w ∈ S, then   Y aw =  av  aw w v∈S−{w}  = Y v∈S−{w}  av  (a1 v1 + · · · + an v n ) ∈ n M Dv i . i=1 Hence M is isomorphic to a submodule of a free module free. By Theorem 6.4.10, M is free. Exercises. 1. Show that Proposition 6.4.2 holds 2. Show that, if M is a free D-module over an integral domain D, then M is torsion free. Give an example of a free module N over a ring A such that Tors(N ) 6= 0. 3. Let V be a finite dimensional vector space K and T : V → V a linear transformation. Show that V is a torsion K[x]-module. 4. Give an example of a free module over a integral domain which admits submodules which are not free. 5. Let D be an integral domain and K = Frac(D), seen as a D-module. In D − {0} consider the partial order defined by: d1 ≤ d2 ⇐⇒ Kd1 ⊂ Kd2 , where Kd ⊂ K is a D-submodule { da : a ∈ D}. (a) Show that K = lim Kd . −→ (b) If M is a D-module and MK is the associated vector space show that MK = lim(Kd ⊗ M ). −→ (c) Conclude that 1 ⊗ v ∈ MK is zero vector if and only if v ∈ Tors(M ). 6. Let D be a PID. If M is a free D-module, N ⊂ M is a submodule, and Tors(M/N ) = M/N , show that dim N = dim M . 310 6.5 CHAPTER 6. MODULES Finitely Generated Modules over a PID We will now study modules which are finitely generated over PID, eventually obtaining its classification. As particular instances of this classification one obtains the classification of finitely generated abelian groups and canonical forms forms for linear transformations between finite dimensional vector spaces. Recall that in the previous section we have shown that for a PID D every submodule of a free D-module is free, and that “free” and “free of torsion” coincide for D-modules of finite type. This leads to the first step in the classification of modules of finite type over a PID: Theorem 6.5.1. Let M be a module of finite type over a PID. There exists a free submodule L ⊂ M such that M = Tors(M ) ⊕ L and dim L = rank M . Proof. The module M/ Tors(M ) is free of torsion and of finite type, hence it is free. This means we can choose elements e1 , . . . , en ∈ M , linearly independent, such that M/ Tors(M ) = n M Dπ(ei ), i=1 where π : M → M/ Tors(M ) denotes the quotient map. Let L = M . Then: Ln i=1 Dei ⊂ (a) L∩Tors(M ) = {0}: If v ∈ L∩Tors(M ) there exist scalars d, d1 , . . . , dn ∈ D, with d 6= 0, such that dv = 0, v= n X di ei . i=1 Hence (dd1 )e1 + · · · + (ddn )en = 0 and we conclude that dd1 = · · · = ddn = 0. By the cancellation law, d1 = · · · = dn = 0, and so v = 0. (b) M = L + Tors(M ): If v ∈ M define d1 , . . . , dn ∈ D by π(v) = n X di π(ei ). i=1 Then v = vT + vL , where v L = N (π) = Tors(M ). Pn i=1 di ei ∈ L and v T = v − v L ∈ 6.5. FINITELY GENERATED MODULES OVER A PID 311 By (a) and (b), we have M = Tors(M )⊕L. If K = Frac(D) and φ : M → MK is the canonical injection, the restriction of φ to L is injective. Since φ(L) generates MK , we conclude that the rank of M equals the dimension of L. In the above decomposition the free factor L is not unique. This is clear from the proof, where a different choice of a basis {e1 , . . . , en } may lead to a different factor. However, its dimension coincides with the rank of M and hence it is an invariant. The rank of M classifies, up to isomorphism, the free part of M . To complete the classification, one needs to classify the finitely generated torsion modules. This will be done in the next paragraphs. 6.5.1 Diagonalization of matrices with entries in a PID Recall that Mn (D) denotes the ring of n × n matrices with entries in a PID D. The following result will be very useful to find special basis in a free module: Proposition 6.5.2. Let A ∈ Mn (D). There exist invertible matrices P, Q ∈ Mn (D) such that:   d1 0   .. Q−1 AP =  , . 0 dn where d1 | d2 | · · · | dn . The di ’s are unique up to multiplication by units. Note that this result does not say that a matrix can be diagonalized by a change of basis: in general, the matrices P and Q are not related in any way. The normal form for a matrix given by Proposition 6.5.2 can be obtained by performing elementary operations in the rows and columns of the matrix, as in the usual Gauss elimination. Let Eij denote the matrix whose entries are all zero with the exception of the entry (i, j) which equals 1 ∈ D. Right and left multiplication by the following invertible matrices leas to the usual elementary operations: • Exchange of columns (rows): Pij = I − Eii − Ejj + Eij + Eji ; • Multiplication of columns (rows) by units: Di (u) = I + (u − 1)Eii (u ∈ D a unit); 312 CHAPTER 6. MODULES • Addition of a multiple of a column (row) to another column (row): Tij (a) = I + aEij (a ∈ D). Also, we define the length δ(d) os an element 0 6= d ∈ D to be the number of prime factors in a prime factorization of d. Given any n × n matrix A = (aij ) we would like to show that A is equivalent12 to a diagonal matrix. If A = 0, there is nothing to prove. If not, then there is a non-zero entry of A of minimal length, and using elementary operations we can transport it to the (1, 1) entry. Let a1k be some entry such that a11 ∤ a1k . By exchange of columns 2 and k, we can assume that this entry is a12 . If d = gcd(a11 , a12 ), then there exist elements p, q ∈ D such that pa11 + qa12 = d. If we let r = a12 d−1 e s = a11 d−1 , then we see that the matrices     p r s r  q −s   q −p          −1 1 1 P = P = , ,     .. ..     . . 1 1 are inverse to each other. Right multiplication of A = (aij ) by the matrix P gives an equivalent matrix whose first row is (d, 0, a13 , . . . , a1n ) and δ(d) < δ(a11 ). Similarly, if a11 ∤ ak1 , we can obtain a new equivalent matrix where the new (1, 1) entry d has length δ(d) < δ(a11 ), so the minimal δ has decreased again. Since δ takes values in N, we can repeat this process a finite number of times leading to an equivalent matrix where a11 | a1k and a11 | ak1 , for all k = 2, . . . , n. Performing elementary operations we obtain finally an equivalent matrix of the form:   d1 0 . . . 0  0 â22 . . . â2n     .. ..  . .. ..  . . .  . 0 ân2 . . . ânn Obviously we can repeat this process again with the second row and column, etc., so we see that the original matrix is equivalent to a diagonal matrix:   d1 0   .. .  . 0 dn 12 In the following we will say that two matrices A and B are equivalent if there exist invertible matrices P and Q such that B = P AQ. 6.5. FINITELY GENERATED MODULES OVER A PID 313 If d1 ∤ d2 , then we can add the second row to the first row and repeat the all process again. Since δ always decreases, we will eventually obtain a diagonal matrix where d1 | d2 . It should now be clear that we can obtain an equivalent diagonal matrix where d1 | d2 | · · · | dn . The elements d1 , . . . , dn in the normal form given by Proposition 6.5.2 are called invariant factors of the matrix A. The uniqueness of the invariant factors follows from the following result which also gives a more efficient method to compute them than the elimination that we used above to prove existence. We leave the proof as an exercise. Lemma 6.5.3. Let A ∈ Mn (D) and assume that A is equivalent to a diagonal matrix   d1 0   .. ,  . 0 dn with d1 | d2 | · · · | dn . If the rank of A is r, then di = 0, for i > r, and i di = ∆∆i−1 for i ≤ r, where ∆0 = 1 and ∆i is the greatest common divisor of the minors of A of size i. The formulas in this lemma, lead immediately to the following corollary: Corollary 6.5.4. The invariant factors are unique up to multiplication by units. Two matrices are equivalent if and only if they possess the same invariant factors. Example 6.5.5. Let D = C[x] and consider the matrix  x−2 0 A =  −1 x −2 4  0 −1  . x−4 Computing the minors given in Lemma 6.5.3, we obtain ∆1 = 1, ∆2 = x − 2, ∆3 = (x − 2)3 . Hence: d1 = 1, d2 = (x − 2) and d3 = (x − 2)2 . In fact, the method of 314 CHAPTER 6. MODULES elimination used above gives the following invertible matrices:  0 −1  −1 −x + 2 1 x−4 6.5.2  0 x−2 0   −1 1 −2   0 0 1 −1 0 x −1   0 0 1  4 x−4 0 1 x   1 0 0 . 0 = 0 x−2 0 0 (x − 2)2 Decomposition into invariant factors Let M be a module over a PID D. For each v ∈ M , one defined the order ideal of v by ann v ≡ {d ∈ D : dv = 0}. This is a principal ideal: ann v = hai, and the element a ∈ D is called the order of v. The order is defined up to multiplication by units. Obviously, the cyclic submodule hvi is isomorphic to D/ ann v. Example 6.5.6. Let G be an abelian group, viewed as a Z-module. If g ∈ G, then the cyclic subgroup hgi generated by g is isomorphic to Z/ ann g. The order of g, as defined above, coincides with the usual notion of order up to a sign (the units in this case are ±1). We can now state and prove our first version of the classification of modules of finite type over a PID: Theorem 6.5.7 (Invariant factors decomposition). Let D be a PID and let M be a module of finite type over D. Then M = hv 1 i ⊕ · · · ⊕ hv k i, where ann v 1 ⊃ ann v 2 ⊃ · · · ⊃ ann v k . Letting ann v i = hdi i, there is an isomorphism M ≃ D/hd1 i ⊕ · · · ⊕ D/hdk i where d1 | d2 | · · · | dk . The ideals hdi i are uniquely determined by M . Proof. If rank M = r then M ≃ Tors(M ) ⊕ D · · ⊕ D}, | ⊕ ·{z r terms 6.5. FINITELY GENERATED MODULES OVER A PID 315 hence it is enough to prove the result for a torsion module M . Let {w 1 , . . . , wn } be a finite set of generators of M . Denote by L the free module generated by the wi ’s. In L there is a basis {ŵ 1 , . . . , ŵ n } such that π(ŵ i ) = wi , where π : L → M is the projection. Let R be the kernel of π, so that M ≃ L/R. Since R is a free submodule of L and since M is torsion, we must have dim R = dim L = n (see Exercise 6.4.6). Let {v̂ 1 , . . . , v̂ n } be a basis of R, so there exist scalars aij ∈ D satisfying the relations X v̂ i = aji ŵ j , i = 1, . . . , n. j Changing basis in L and R, X qji ŵ j , ŵ′i = v̂ ′i = X pji v̂ j , j j we obtain new relations v̂ ′i = X bji ŵ′j , i = 1, . . . , n, j and its is immediate to verify that the matrices A = (aij ), B = (bij ), P = (pij ) and Q = (qij ) are related by B = Q−1 AP. As we saw above, we can choose invertible matrices P and Q (or, equivalently, we can choose bases of L and R) such that B = diag(d1 , . . . , dn ) with d1 | d2 | · · · | dn . In this case: v̂ ′i = di ŵ′i , i = 1, . . . n. If w′i = π(ŵ ′i ), we claim that M = hw′1 i ⊕ · · · ⊕ hw′n i. Since ann w′i = hdi i, this the existence part of the proposition. P proves ′ Obviously, M = i hwi i, since the ŵ ′i form a generating set of L and π : L → M is Psurjective. Hence, to prove the claim, it is enough to show that hw′k i ∩ i6=k hw ′i i = {0}. Let w be an element in this intersection. Then there exist ai ∈ D such that X w = ak w′k = ai w′i . i6=k 316 CHAPTER 6. MODULES It follows that in L we have: ak ŵ′k − X i6=k ai ŵ′i ∈ R so we conclude that there exist bi ∈ D such that ai = bi di , i = 1, . . . , n. But then w = ak w′k = π(ak ŵ ′k ) = π(bk dk ŵ′k ) = π(bk v̂ ′k ) = 0. The proof of uniqueness will be given later. Note that in the decomposition above we can assume that the v k 6 0, i.e., that the di are non-invertible. The elements d1 , . . . , dk in the decomposition above are then defined up to multiplication by a unit, and are called the invariant factors of the module M . Corollary 6.5.8. Two modules of finite type over a P ID are isomorphic if and only if they have the same invariant factors. 6.5.3 Decomposition into primary factors There is an alternative way of encoding the classification of modules over a PID, which is based on the prime factorization of elements of D, and which we now explain. Recall that a PID is a UFD so that any 0 6= a ∈ D can be expressed in the form a = u · p1 · · · pn , where u ∈ D is a unit and the pi ∈ D are primes. This factorization is unique, up to the order of the factors and multiplication by units. Recall that a, b ∈ D are associate, i.e., differ by multiplication by a unit, we write a ∼ b. Lemma 6.5.9. Let D be a PID and assume that gcd(a, b) = 1. Then: D/habi ≃ D/hai ⊕ D/hbi. We leave the proof as an exercise. Using this lemma, we can show: Theorem 6.5.10 (Primary factors decomposition). Let D be a PID and let M be a module of finite type over D. Then: M = L ⊕ hw1 i ⊕ · · · ⊕ hw n i ≃ L ⊕ D/hp1 m1 i ⊕ · · · ⊕ D/hpn mn i, where L is a free submodule with dim L = rank M , ann wi = hpi mi i, and the elements p1 , . . . , pn ∈ D are all primes. The ideals hp1 m1 i, . . . , hpn mn i are uniquely determined, up to its order, by M . 6.5. FINITELY GENERATED MODULES OVER A PID 317 Proof. Consider the decomposition of M into invariant factors: M = hv 1 i ⊕ · · · ⊕ hv k i. If ann vi = hdi i, then d1 | d2 | · · · | dk and dk−r+1 = · · · = dk = 0, where r = rank M . Hence, we have hv k−r+1 i ⊕ · · · ⊕ hv k i = L, where L is free of dimension r. On the other hand, if p1 m1 , . . . , pn mn are the prime powers in the prime factorizations of d1 , . . . , dk−r , Lemma 6.5.9 shows that hv 1 i ⊕ · · · ⊕ hv k i ≃ D/hp1 m1 i ⊕ · · · ⊕ D/hpn mn i ⊕ L. The proof of uniqueness will be given later. The elements p1 m1 , . . . , pn mn associated with the module M are defined up to multiplication by units. They are called the elementary divisors of M . The elementary divisors together with the rank form a complete set of invariants of M : Corollary 6.5.11. Two modules of finite type over a PID are isomorphic if and only if they have the same list of elementary divisors and the same rank. The proof of the theorem shows that the decomposition of M into invariant factors determines a decomposition of M into primary factors. Conversely, let M ≃ L ⊕ D/hp1 m1 i ⊕ · · · ⊕ D/hpn mn i be a decomposition de M into primary factors. Let p1 , . . . , ps be the distinct (i.e., non-associate) primes appearing in this decomposition. We arrange the prime powers in the decomposition as follows: (6.5.1) p1 n11 p1 n21 .. . p2 n12 p2 n22 .. . ··· ··· ps n1s ps n2s .. . p1 nt1 p2 nt2 ··· ps nts , where n1i ≤ n2i ≤ · · · ≤ nti , i = 1, . . . , s. Note that one may eventually need to add some factors 1 = pi 0 . We let dj be the product of all the prime 318 CHAPTER 6. MODULES power in row j, i.e., dj ≡ p1 nj1 · p2 nj2 · · · ps njs . Then d1 | d2 | · · · | dt , and since the prime powers in each dj correspond to distinct primes they are relatively prime, so the preceding lemma gives an isomorphism Tors(M ) ≃ D/hd1 i ⊕ · · · ⊕ D/hdt i. If the dimension of the free part L is r, then we add to the list of dj ’s the elements dt+1 = · · · = dt+r = 0, and this leads to the decomposition of M into invariant factors. Given the list of elementary divisors {pi nji }, the invariant factors dk are determined by the process above. Conversely, given the list of invariant factors {dk }, the elementary divisors {pi nji } are the prime powers in the prime factorizations of the dk ’s. Hence, the uniqueness of the ideals hd1 i, . . . , hdk i follows from the uniqueness of the ideals hp1 m1 i, . . . , hpn mn i, which we will study next. 6.5.4 Primary components and uniqueness of decompositions If M is a D-module and p ∈ D is a prime, the p-primary component of M is the submodule M (p) = {v ∈ M : pk v = 0, for some k ∈ N}. It is an easy exercise to check that if Tors(M ) = M , then M M= M (p). p prime Since M is of finite type, only a finite number of terms in this direct sum is non-zero. Given a primary decomposition: M = L ⊕ D/hp1 m1 i ⊕ · · · ⊕ D/hps ms i note that the primary components of M are: M M (p) = D/hpi mi i. {pi : pi ∼p} Hence, if we are given two primary decompositions M = L ⊕ D/hp1 m1 i ⊕ · · · ⊕ D/hps ms i = L ⊕ D/hp̃n1 1 i ⊕ · · · ⊕ D/hp̃nt t i, 6.5. FINITELY GENERATED MODULES OVER A PID 319 we have that: M (p) = M {pi : pi ∼p} D/hpi mi i = M {p̃i :p̃i ∼p} D/hp̃ni i i. Hence, to prove the uniqueness of the primary decomposition of M , we can assume that M = M (p) and that the primes appearing in the two decompositions coincide: M (p) = D/hpm1 i ⊕ · · · ⊕ D/hpms i = D/hpn1 i ⊕ · · · ⊕ D/hpnt i Then we can order the decompositions so that m1 ≤ m2 ≤ · · · ≤ ms and n1 ≤ n2 ≤ · · · ≤ nt . If v s ∈ M is such that ann vs = hpms i, then the second decomposition gives pnt v s = 0, so nt ≥ ms . Similarly, we find that ms ≥ nt , so we must have ms = ns . The quotient module M (p)/hv s i admits the decompositions M (p)/hv s i ≃ D/hpm1 i ⊕ · · · ⊕ D/hpms−1 i ≃ D/hpn1 i ⊕ · · · ⊕ D/hpnt−1 i By exhaustion, we conclude that mi = ni and s = t. This proves the uniqueness of the decompositions into primary factors and invariant factors. Exercises. 1. An ideal I in a commutative ring A is called a primary ideal if ab ∈ I implies that either a ∈ I or bn ∈ I, for some n ∈ N. Show that if D is a UFD and p ∈ D is a prime, then hpn i is a primary ideal. This justifies the name “primary decomposition” used in Theorem 6.5.10. 2. Deduce the formulas for the invariant factors given in Lemma 6.5.3. 3. Prove Lemma 6.5.9. 4. Determine diagonal matrices equivalent to the matrices 36 12 (a) over Z; 16 18   x − 1 −2 −1 x 1  over R[x]. (b)  0 0 −2 x − 3 320 CHAPTER 6. MODULES 5. Show that if p ∈ N is a prime, the following two matrices in Mn (Zp ) are equivalent:  0  0      0 1 1 0 0 1 .. . ··· ··· 0 0 6. Show that M = ··· ··· L  0 0    ,  1  0 p prime M (p)  1 1  0 1      0 0 0 0 0 1 .. . ··· ··· ··· ··· 1  0 0    .  1  1 if and only if Tors M = M . 7. If M = D/hp1 p2 2 p3 i ⊕ D/hp1 p2 3 p3 2 p4 i ⊕ D/hp1 3 p2 2 p4 5 i is a module over a PID D, where p1 , . . . , p4 are non-associate primes, determine the decompositions of M into invariant factors and primary factors. 8. Let M1 and M2 modules of finite type over a PID D. (a) Show that if M1 and M2 are cyclic, then M1 ⊗ M2 is cyclic. (b) Determine the decompositions of M1 ⊗ M2 into invariant and primary factors in terms of the corresponding decompositions of M1 and M2 . 9. Let M1 and M2 be cyclic modules over a PID D, of order a and b, respectively. Show that if gcd(a, b) 6= 1, then the invariant factors of M1 ⊕ M2 are gcd(a, b) and lcm(a, b). 6.6 Classification of Abelian Groups and Canonical Forms for Matrices We will now show how the structure theory that we developed for modules of finite type over a PID can be used to classify finitely generated abelian groups and to establish canonical forms for linear transformations between vector spaces. This correspond, of course, to consider the cases where D = Z and D = K[x], with K a algebraically closed field. When D = Z, the units are ±1, so every non-zero ideal has a non-negative integer as unique generator. If D = K[x] the units are are the constant, non-zero, polynomials, so every non-zero ideal has a monic polynomial as a unique generator. Hence, we have unique representatives for the invariant factors and the elementary divisors, and the classifications becomes particularly nice. 6.6. CLASSIFICATIONS 6.6.1 321 Classification of Finitely Generated Abelian Groups Let G be group. We say that G is of finite type if there exist elements g1 , . . . , gm ∈ G such that g = g1 n1 · · · gm nm . ∀g ∈ G, ∃n1 , . . . , nm ∈ Z : If G is an abelian group, then G is of finite type if and only if G is a Zmodule of finite type. Since Z is a PID, the classifications theorems in the previous sections immediately yield: Theorem 6.6.1 (Classification of abelian groups of finite type). Let G be an abelian group of finite type. Then G ≃ Zd1 ⊕ · · · ⊕ Zdn , where d1 , . . . , dn are non-negative integers uniquely determined by the condition d1 | d2 | · · · | dn . If p1 n1 , . . . , ps ns are the prime powers in the prime factorization of the non-zero di ’s,then G ≃ Zp1 n1 ⊕ · · · ⊕ Zps ns ⊕ Zr , where r is the number of d′i s equal to zero, i.e., the rank of G. The integers di (respectively, pi ni ) are called the invariant factors (respectively, elementary divisors) of the abelian group G. Together with the rank, they both form two distinct complete set of invariants of an abelian group. Notice that one can find one list of invariants, given the other list. Examples 6.6.2. 1. If n ∈ N admits a prime factorization n = p1 n1 · · · ps ns , then Zn ≃ Zp1 n1 ⊕ · · · ⊕ Zps ns . In this case, there exists only one invariant factor n and the elementary divisors are the powers p1 n1 , . . . , ps ns . 2. Let G = Z6 ⊕ Z15 ⊕ Z18 . The primary decomposition de G is G ≃ Z2 ⊕ Z3 ⊕ Z3 ⊕ Z5 ⊕ Z2 ⊕ Z32 so the list of elementary divisors is {2, 2, 3, 3, 32, 5}. We can obtain the list of invariant factors from Table 6.5.1, which in this case reads: 20 2 2 3 3 32 50 50 5. 322 CHAPTER 6. MODULES The invariant factors are the products of the prime powers in each line of this table: d1 = 3, d2 = 6, d3 = 90. Hence, the decomposition of G into invariant factors is: G ≃ Z3 ⊕ Z6 ⊕ Z90 6.6.2 Jordan Canonical Form Let K be an algebraically closed field. Let V be a finite-dimensional vector space over K, and T : V → V a linear transformation. The K[x]-module structure on V defined by T is defined by: p(x) = an xn + · · · + a0 ∈ K[x] and v ∈ V , then p(x) · v ≡ an T n (v) + · · · + a0 v. We can use this K[x]-module structure to obtain information about T : for example, Ṽ ⊂ V is a K[x]-submodule if and only if Ṽ linear subspace of V , invariant under T (see Exercise 6.1.1). Since K[x] is a PID and V is of finite type, one of the classification theorems for modules over PID leads to the following: Theorem 6.6.3 (Jordan Canonical Form). Let T : V → V be a linear transformation of a finite-dimensional vector space over an algebraically closed field K. There exists a basis {e1 , . . . , en } for V where the matrix representing the linear transformation T is   J1 0   .. J = , . 0 Jm where Ji is a (ni × ni ) matrix of the form  λi 1  0   . .  .   0     .  1  λi Proof. Since K is algebraically closed a polynomial p(x) ∈ K[x] is prime if and only if p(x) = x − λ. Hence, the decomposition of V as a K[x]-module into primary factors takes the form: V ≃ V1 ⊕ · · · ⊕ Vm , 6.6. CLASSIFICATIONS 323 where Vi ≃ hv i i are T -invariant subspaces, and ann(v i ) = h(x − λi )ni i. The vectors {(x − λi )ni −1 v i , . . . , (x − λi )v i , v i } form a basis for Vi over K (exercise), and the matrix of T relative to this basis is precisely Ji . To compute the the Jordan canonical form one needs only to know the elementary divisors (or the invariant factors) of the K[x]-module V . These can be determined as follows, as shown by inspection of the proof of Theorem 6.5.7: Let {f 1 , . . . , f n } be a basis of V over K, and let A = (aij ) be the matrix representing T in this basis. The set {f 1 , . . . , f n } generates V as a K[x]-module. Forming the free module L generated by these elements, we obtain the surjective homomorphism π : L → V which has kernel R. The elements X ei := xf i − aji f j j form a basis for R (as a K[x]-module) and the invariant factors of V can be obtained by applying Proposition 6.5.2 to the matrix   x − a11 −a12 · · · −a1n  −a21 x − a22 · · · −a2n     . .. .. . .   . . . −an1 −an2 ··· x − ann In other words, we have an equivalent diagonal matrix   1   ..   . 0     1  ,   d1 (x)     ..   . 0 ds (x) where d1 (x) | · · · | ds (x) are the invariant factors. Example 6.6.4. Let T : C3 → C3 be the linear transformation defined relative to the canonical basis by the matrix   2 0 0 0 1 . A= 1 2 −4 4 324 CHAPTER 6. MODULES As we saw in Example 6.5.5, we have  0 −1  −1 −x + 2 1 x−4  0 x−2 0   −1 1 −2   0 0 1 −1 0 x −1   0 0 1  4 x−4 0 1 x   1 0 0 , 0 = 0 x−2 0 0 (x − 2)2 and so the elementary divisors are (x − 2) Jordan canonical form of T is  2 0 J = 0 2 0 0 and (x − 2)2 . We conclude that the  0 1 . 2 The Jordan canonical form is a consequence of the decomposition of the K[x]-module V into primary factors. The decomposition of V into invariant factor leads to another canonical form known as the rational canonical form13 (see Exercise 6, below). Exercises. 1. Determine all abelian groups of order 120. 2. If K is an algebraically closed field of characteristic zero, show that U = {r : r is a root of xn − 1 = 0} is an abelian group isomorphic to Zn . 3. Let T : V → V be a linear transformation of a finite-dimensional vector space over a field K and assume that V ≃ hvi (as K[x]-module), where ann(v) = h(x − λ)m i. Show that the elements {(x − λ)m−1 v, . . . , (x − λ)v, v} form a basis of V over K. 4. Determine the Jordan canonical form of the matrices:   −1 1 −2 4 , (a) A =  0 −1 0 0 1 13 Actually, the rational canonical form, contrary to the Jordan canonical form, does not require K to be algebraically closed. 6.6. CLASSIFICATIONS  0  1  (b) B =  0 0 0 0 1 0 325  0 −8 0 16  . 0 −14  1 6 5. Let T : V → V be a linear transformation of a finite-dimensional vector space over a field K, and let d1 (x) | · · · | ds (x) be the invariant factors of K[x]-module V . One calls m(x) = ds (x) the minimal polynomial and p(x) = d1 (x) · · · ds (x) the characteristic polynomial of the linear transformation T . (a) Show that m(x) 6= 0, m(T ) = 0 and that if q(x) is a polynomial such that q(T ) = 0 then m(x)|q(x); (b) Show that p(x) 6= 0, p(T ) = 0 and that p(x) = det(xI − T ). 6. (Rational Canonical Form) Let T : V → V be a linear transformation of a finite-dimensional vector space over a field K. Using the decomposition of V into invariant factors, as a K[x]-module, show that there exists a basis {e1 , . . . , en } for V over K where the matrix representing T is   R= R1 0 .. . 0 Rm   , where Ri is a (ni × ni ) matrix of the form  0 0   1 0      0 .. . 0 1 −a0 .. . .. . −ani −2 −ani −1     .    One calls R the rational canonical form of the linear transformation T . 7. Recall that a scalar linear differential ordinary equation (o.d.e) dn−1 y dy dn y + an−1 n−1 + · · · + a1 + a0 y = 0 n dt dt dt is equivalent to a system of first order linear o.d.e’s: dx = xA, dt 326 CHAPTER 6. MODULES where x = (y, y ′ , y ′′ , . . . , y (n−1) ) and A is  0 0   1 0   ..  .   0 0 1 the companion matrix  −a0  ..  .  . ..  .  −an−2  −an−1 Using the rational canonical form, show the following converse: every system of first order linear o.d.e’s is equivalent to a system of decoupled scalar linear o.d.e.’s. 6.7 Categories and Functors Many of the algebraic structures that we have been studying, albeit distinct, often admit similar properties. We have seen that groups, rings or modules share common constructions and that the methods of proof are the same. We shall now see that one can formalize these similarities in a very precise way. This relies on the following fundamental concept: Definition 6.7.1. A category C consists of: (i) A class of objects. (ii) For each pair of objects (X, Y ), a set Hom(X, Y ), whose elements are called morphisms. (iii) A map Hom(X, Y ) × Hom(Y, Z) → Hom(X, Z), called composition of morphisms. The image of the pair (φ, ψ) under composition of morphisms will be denoted by ψ ◦ φ, and the following properties must hold: (C1) Associativity: if φ ∈ Hom(X, Y ), ψ ∈ Hom(Y, Z) and τ ∈ Hom(Z, W ), then τ ◦ (ψ ◦ φ) = (τ ◦ ψ) ◦ φ. (C2) Identity: for any object X, there exists a morphism 1X ∈ Hom(X, X) which satisfies 1X ◦ φ = φ and ψ ◦ 1X = ψ, for any φ ∈ Hom(W, X) and ψ ∈ Hom(X, Y ), where Y and W are arbitrary objects. 6.7. CATEGORIES AND FUNCTORS 327 Examples 6.7.2. 1. The category S of sets is the category where the objects are the sets, the morphisms Hom(X, Y ) are the maps φ : X → Y , and the composition of morphisms is the usual composition of maps. 2. The category G of groups is the category where the objects are the groups, the morphisms Hom(G, H) are the group homomorphisms φ : G → H, and the composition of morphisms is the composition of homomorphisms. 3. The category MA of (left/right) modules over a ring A is the category where the objects are the (left/right) A-modules, the morphisms Hom(M, N ) are the A-linear transformations φ : M → N , and the composition of morphisms is the composition of linear transformations. 4. The category T of topological spaces is the category where the objects are the topological spaces, the morphisms Hom(X, Y ) are the continuous maps, and the composition of morphisms is the usual composition of continuous maps. The first example above shows that, in general, the objects of a category form a classe, rather than a set: one cannot define a “set of all sets” without getting into paradoxs (e.g., is the set of all sets a set?). This distinction, whose complete justification requires a deep study of Set Theory 14 , means that to classes one does not apply the usual operations over sets, such as passing to subsets. A category where all the objects are elements of a set is called a small category. It maybe convenient of think of a morphism φ ∈ Hom(X, Y ) as an arrow from X to Y . However, it is important to note that, in spite of its name, a morphism φ ∈ Hom(X, Y ) does not need to be a map from X to Y , as shown in the following simple example: Example 6.7.3. Fix a group G. Let C be the category with only one object {∗} and where the morphisms Hom(∗, ∗) are the elements of G. The composition of two morphisms is the multiplication in the group G. If G is a non-trivial group, the morphisms are not maps between the objects. A concrete category is a category C where every morphism φ ∈ Hom(X, Y ) is actually a map X → Y , where the identity morphism 1X ∈ 14 See, e.g., P. R. Halmos, Naive Set Theory, Undergraduate Texts in Mathematics, Springer-Verlag, 1974. 328 CHAPTER 6. MODULES Hom(X, Y ) is the identity map X → X, and where composition of morphisms is the usual composition of maps. Most of the categories that we have found up to now are concrete categories. In any case, we will always represent a morphism φ ∈ Hom(X, Y ) symbolically as φ : X → Y , keeping in mind that φ is not necessarily a map from X to Y . Lets us give some elementary properties of categories. Proposition 6.7.4. In a category C, for each object X, the identity morphism 1X is unique. Proof. In fact, if 1X and 1′X are two identities in X, then by the identity property (C2) applied to both 1X and 1′X , we find 1X ◦ 1′X = 1′X and 1X ◦ 1′X = 1X . Hence, 1X = 1′X . In a category C, given a morphism f : X → Y , we say that g : Y → X is a left inverse of f if g ◦ f = 1X . Similarly, one defines a right inverse of f . Proposition 6.7.5. If f : X → Y had both a left inverse g and a right inverse g′ , then g = g ′ . Proof. From the definition of left/right inverse we obtain: (g ◦ f ) ◦ g ′ = 1X ◦ g ′ = g ′ , g ◦ (f ◦ g ′ ) = g ◦ 1Y = g. Therefore, by the associativity property (C1), we find that g = g′ . When f : X → Y has both a right and a left inverse g, one calls g the inverse of f and writes g = f −1 . In this case we say that f is an isomorphism in the category. Examples 6.7.6. 1. In Examples 6.7.2 the isomorphisms are the bijections (in the category of sets), the group isomorphisms (in the category of groups), the linear isomorphisms (in the category of modules over a ring) and the homeomorphisms (in the category of topological spaces). 2. In Example 6.7.3 every morphism is an isomorphism. A small category where every morphism is invertible is called a groupoid. 6.7. CATEGORIES AND FUNCTORS 329 Next we introduce a notion which allows us to relate two categories. Definition 6.7.7. A covariant functor F from a category C to a category D is a map that associates to an object X of C an object F (X) of D, and to each morphism φ : X → Y , a morphism F (φ) : F (X) → F (Y ), such that the following properties hold: (i) F preserves identities: F (1X ) = 1F (X) , for every object X of C; (ii) F preserves compositions: F (φ ◦ ψ) = F (φ) ◦ F (ψ), for all morphisms φ and ψ that are composable. Similarly, a contravariant functor F from a category C to a category D is a map that associates to each object X of C an object F (X) of D, and to each morphism φ : X → Y , a morphism F (φ) : F (Y ) → F (X), and: (i) F preserves identities: F (1X ) = 1F (X) for every object X de C; (ii’) F inverts compositions: F (φ ◦ ψ) = F (ψ) ◦ F (φ) for all morphisms φ and ψ that are composable. Let us see some examples of functors: Examples 6.7.8. 1. The map that associates to each group its underlying set and to each group homomorphism the underlying map of sets, is a covariant functor from the category of groups to the category of sets. More generally, for any concrete category C, there is a forgetful functor F : C → S from C to the category of sets S which “forgets” the structure: to each object X of C, the functor F associates the underlying set of X, and to each morphism φ ∈ Hom(X, Y ) the functor F associates the underlying map X → Y . 2. The map that associates to each module M over a commutative ring A its dual M ∗ , and to each A-linear map φ : M → N the transpose φ∗ : N ∗ → M ∗ , is a contravariant functor from the category of left A-modules to the category of right A-modules. 3. The map that associates to each topological space X the k-th homology group Hk (X, Z) (respectively, the k-th cohomology group H k (X, Z)) and to each continuous map φ : X → Y the group homomorphism φ∗ : Hk (X, Z) → Hk (Y, Z) (respectively, φ∗ : H k (Y, Z) → H k (X, Z)), is a covariant (respectively, contravariant) functor from the category of topological spaces to the category of abelian groups. 330 CHAPTER 6. MODULES Many constructions that we have seen before are special instances of general constructions in Category Theory. Consider, for example, the notion of direct product: if {Xi }i∈I is a family of objects in some category C the product of the objects {Xi }i∈I is a pair (Z, {πi }i∈I ), where Z is an object in C and each πi : Z → Xi is a morphism in C, satisfying the following universal property: • For any object Y and any collection of morphisms {φi : Y → Xi }∈I , there exists a unique morphism φ : Y → Z such that, for each i ∈ I, the following diagram commutes: φ Y ❴▼▼ ❴ ❴ ❴ ❴ ❴/ Z ▼ ▼▼▼ ▼▼▼ φi ▼▼▼▼ & πi Xi It is easy to see that if products exist, then they are unique up to an isoQ morphism in C. In this case, we denote the product by i∈I Xi . Note that the product may or may not exist, depending on the category. For a given category it is necessary to show the existence, and for that one needs some concrete model (usually quite obvious). We have followed exactly this procedure in the case of the category of groups (direct product of groups) and in the case of the category of modules over a ring (direct product of modules). Another advantage of Category Theory is to make it possible to give a precise meaning to certain expressions one often uses when discussing some mathematical formalism. For example, one may say that a certain map is “induced” or that a certain construction is “natural”. In order to illustrate this usage of Category Theory, let us see how one can make the notion of “natural” precise. We recall that this expression is often is used as a synonymous of “not depending on choices”. Definition 6.7.9. A natural transformation T between two functors F : C → D and G : C → D is a map which associates to each object X in a category C a morphism TX : F (X) → G(X) in the category D, so that for any morphism φ : X → Y of C the following diagram commutes: F (X) TX / G(X) F (φ) G(φ) F (Y ) TY / G(Y ) 6.7. CATEGORIES AND FUNCTORS 331 If TX is an isomorphism for every object X, we say that T is a natural equivalence. As a simple illustration, we show in the next example how one can formalize precisely the statement that for any finite-dimensional vector space V there exists a natural isomorphism between the double dual (V ∗ )∗ and V . Example 6.7.10. Let C be the category of (left) modules over a ring A. For each A-module M , denote by M ∗∗ the double dual: M ∗∗ ≡ (M ∗ )∗ , and for each A-linear transformation φ : M → N denote by φ∗∗ : M ∗∗ → N ∗∗ the double transpose: φ∗∗ ≡ (φ∗ )∗ . This defines a covariant functor from C to C. Now, for each A-module M , we denote by TM : M → M ∗∗ the A-linear transformation defined as follows: for each v ∈ M , TM (v) : M ∗ → A the linear transformation ξ 7→ ξ(v). One checks easily that M 7→ TM is a natural transformation between the double dual functor and the identity functor. If instead of the category of A-modules we consider instead the category of finite-dimensional vector spaces over a field K, then V 7→ TV defines a natural equivalence natural between the double dual functor and the identity functor. We will not make heavy use of Category Theory (but in some situations it we will be quite useful as a language!). Hence, we will not develop further Category Theory, but you should keep in mind that Category Theory is an important subject, that plays a major role in many branches of Mathematics, including Algebraic Topology, Algebraic Geometry, etc. Exercises. 1.QLet {Xi }i∈I be a family of objects in a category C. Define coproduct ∗ i∈I Xi of the objects {Xi }i∈I by diverting the direction of the arrows in the definition of product, and verify that the direct sum of abelian groups and modules, and the free product of groups, are coproducts in the appropriate categories. 332 CHAPTER 6. MODULES 2. A set with a marked point is a pair (X, x) where X is a set and x ∈ X. A morphism (X, x) → (Y, y) between sets with marked points is a map f : X → Y such that f (x) = y. (a) Show that sets with a marked point and the morphisms between marked sets form a category. (b) Show that products exist in this category and describe them explicitly. (c) Show that coproducts exist in this category and describe them explicitly. 3. Let L be an object in a category C, S a non-empty set, and ι : X → L a map. One says that L is a free object in the set S if for each object X in C and map φ : S → X there exists a unique morphism φ̃ : L → X such that the following diagram commutes: ι / S ◆◆ ◆◆◆ ◆◆◆ ◆ φ ◆◆◆◆ & L✤ ✤ ✤ φ̃ X. Show that, in any category C, if L is a free object in the set S, L′ is a free object in the set S ′ and |S| = |S ′ |, then L′ is isomorphic to L. 4. An object I in a category C is called universal or initial if for each object X in C there exists a unique morphism φ : I → X: (a) Show that any two initial objects in a category C are isomorphic. (b) Determine the initial objects in the category of groups. (c) Show that the coproduct can be consider as an initial object in an appropriate category. 5. Define, similarly to the previous exercise, a co-universal or terminal object in a category C. (a) Show that any two terminal objects in a category C are isomorphic. (b) Determine the terminal objects in the category of groups. (c) Show that the product can be consider as a terminal object in an appropriate category. Chapter 7 Galois Theory The solution of a quadratic equation was known to the mathematicians in Babylon1 , which knew how to “complete the square”, and was popularized in the western world during the Renascence in latin translations of the Arab book by al-Khowarizmi Al-jabr wa’l muqābalah (mentioned in Chapter ??). In 1545, it was published the monograph Ars Magna by Geronimo Cardano (1501–1576), also known by the name Cardan, which included formulas to solve third and forth degree polynomial equations. Cardan attributed these formulas, respectively, to Niccolo Tartaglia (1500–1565) and Ludovico Ferrari (1522–1565). The discovery of these formulas, and the fight for the priority in its discovery, has a very curious and amusing history, which can be found in the books mentioned above. The formula for the solution of the cubic equation x3 + px = q, known today as Cardan’s formula, is qp qp 3 3 (p/3)3 + (q/2)2 + q/2 − (p/3)3 + (q/2)2 − q/2. x= The general cubic equation y 3 + by 2 + cy + d = 0 can be reduced to this case by performing the substitution y = x − b/3. One can have a good idea of the degree of difficulty in solving these equations if one verifies by substitution that Cardan’s formula does indeed furnish a solution. A quartic equation can also be reduced to the solution of a cubic. First one can assume, eventually after some translation, that the quartic is of the form x4 + px2 + qx + r = 0. Completing the square, one obtains (x2 + p)2 = px2 − qx − r + p2 . 1 For the historical references in this chapter see, e.g., Carl R. Boyer, A History of Mathematics, John Wiley & Sons, New York (1968), and Dirk J. Struik, A Concise History of Mathematics, Dover Books on Mathematics, New York (1987). 333 334 CHAPTER 7. GALOIS THEORY The trick consists then in observing that, for any y, one has: (x2 + p + y)2 = px2 − qx − r + p2 + 2y(x2 + p) + y 2 = (p + 2y)x2 − qx + (p2 − r + 2py + y 2 ). This last equation is quadratic in x, and we can choose y so that we obtain a perfect square. This is achieved precisely when the discriminant vanishes: q 2 − 4(p + 2y)(p2 − r + 2py + y 2 ) = 0. Now the last equation is the cubic in y: −8y 3 − 20py 2 + (−16p2 + 8r)y + (q 2 − 4p3 + 4pr) = 0, which can be solved by using Cardan’s formula. For this value of y, the right hand side of the above auxiliary equation is then a perfect square, so taking square roots we obtain a quadratic equation which can be easily solved. The solutions to the cubic and the quartic contained in the Ars Magna provided a strong stimulus in the search of general formulas for the solutions of algebraic equations of higher order. These efforts were unsuccessful for over 300 years, and one had to wait for the beginning of the 19th Century when Abel and Rufini reach the opposite conclusion: for an equation of the 5th degree there is no general formula that expresses the solutions in terms of radicals of the coefficients of the equation. Inspired by Abel’s proof of the impossibility of solving the quintic equation, Galois initiated a systematic study of algebraic equations of arbitrary degree and showed that it was not only impossible to give a general formula for equations of any degree higher than 5, but gave also a criterion to decide if a particular equation could be solved and, in the affirmative case, a method to solve it. Galois investigations, in spite of its premature death, were fundamental in shaping the field of Algebra as we know it nowadays, and had much more consequences that the complete solution to the original problem of solving algebraic equation through radicals. In order to illustrate Galois ideas, consider the quartic equation with rational coefficients: p(x) = x4 + x3 + x2 + x + 1 = 0. 2πk The roots of this equation are rk = ei 5 (k = 1, . . . , 4) (why?). Consider all the possible polynomial equations, with rational coefficients, which are satisfied by these roots, such as: r1 + r2 + r3 + r4 − 1 = 0, 2 (r1 + r4 ) + (r1 + r4 ) − 1 = 0, r1 r4 = 1, 5 (r1 ) − 1 = 0, ... 7.1. FIELD EXTENSIONS 335 The key observation is the following: the set of all permutations of the roots which transform equations of this type into equation of this type form a group, now called the Galois group of the equation. In the present case the group is: G = {I, (1243), (14)(23), (1342)}. Galois understood that the structure of this group is the key to solve the equation 2 . Consider, for example, the subgroup H = {I, (14)(23)} ⊂ G. One can check that among the polynomial expressions above the ones that are fixed by H are precisely the polynomials in w1 = r1 + r4 and w2 = r2 + r3 . Now w1 and w2 are the roots of the quadratic equation x2 + x − 1 = 0. Hence, assuming that we did not know the solutions to the original quartic equation, we could try to solve it by solving first this quadratic equation, obtaining: √ √ −1 − 5 −1 + 5 , r2 + r3 = , r1 + r4 = 2 2 and then solving the quadratic equation: (x − r1 )(x − r4 ) = x2 − (r1 + r4 )x + r1 r4 = 0, Note that the last equation is an equation whose coefficients are polynomials in w1 and w2 , since r1 r4 = 1. Note that the Galois Group of a polynomial equation p(x) ∈ Q[x] can be characterized as the group of symmetries of the equation: it consists of the transformations that take solutions (roots) to solutions, preserving the algebraic structure of the roots. In the modern formulation of Galois Theory one constructs the field Q(r1 , . . . , rn ) generated by the roots of p(x) and the Galois Group consists of the automorphisms of this field 3 . One can say that Galois Theory consists in replacing questions concerning the structure of this field into questions about the structure of the Galois group. 7.1 Field Extensions Recall that a field L is said to be an extension of the field K, if K is a subfield of L. We will now initiate a systematic study of field extensions. We start by recalling some results that were deduced in Chapter 3. 2 This discovery is even more remarkable, since Galois had first to invent the concept of a group, which at the time was not known! 3 The notion of field was only formalized by Dedekind in 1879, more that 50 years after the tragic death of Galois. 336 CHAPTER 7. GALOIS THEORY Let K be a field and L an extension of K. Recall that we call L a finite extension (respectively infinite extension) if L, seen as a vector space over K, has finite dimension (respectively, infinite dimension). The dimension of L over K is denoted by [L : K]. If S ⊂ L is some subset, we denote by K(S) the smallest subfield of L which contains K and S. Obviously, K(S) is an extension of K generated by S. If S = {u1 , . . . , un }, we write K(u1 , . . . , un ) instead of K({u1 , . . . , un }). Let us consider in detail the case where S consists of a single element, i.e., K(u) where u ∈ L. If x is an indeterminate, there is a ring homomorphism φ : K[x] → L which associates to a polynomial g(x) its value at u: g(x) 7→ g(u). The kernel ker(φ) is an ideal in K[x], which must be principal. There are two possibilities: (i) u is transcendental over K. This corresponds to the case ker(φ) = {0}, so φ is a monomorphism whose image is K[u] and which extends uniquely to the field of fractions K(x) = Frac(K[x]). It follows that K(u) ≃ K(x), and the elements in K(u) take the form g(u)/f (u) where g(x), f (x) ∈ K[x] and f (x) 6= 0. The extension K(u) has infinite dimension over K; (ii) u is algebraic over K. In this case ker(φ) = hp(x)i, where p(x) is an irreducible polynomial, and K(u) = K[u] ≃ K[x]/hp(x)i. If we require p(x) to be monic, then p(x) is unique and is called the minimal polynomial of u. The extension K(u) has dimension [K(u) : K] = deg p(x). We call this dimension the algebraic degree of u over K. Definition 7.1.1. An extension L of K is called a simple extension if there exists u ∈ L such that L = K(u). In this case, one says that u is a primitive element of L. A extension L of K is called algebraic (respectively transcendental) if all the elements of L are algebraic (respectively,there exists a transcendental element in L) over K. A simple extension L is algebraic or transcendental exactly when its primitive elements are algebraic or transcendental. A finite extension L over K is always algebraic (see Exercise 3.4.8), but there are algebraic extensions of infinite dimension over K. A transcendental extension is always infinite. Examples 7.1.2. √ 1. Consider L = Q( n 2) ⊂ R and K = Q. By Eisenstein Criterion, the √polynomial p(x) = xn − 2, (n ≥ 2) is irreducible over Q. The number u = n 2 is √ √ n n a root of p(x) in R. Hence, [Q( 2) : Q] = n and 2 is algebraic of degree n over Q. 7.1. FIELD EXTENSIONS 337 2. Consider L = C and K = R. The polynomial x2 + 1 is clearly irreducible over R. The number i ∈ C is a root of x2 + 1 in C, hence i is algebraic of degree 2 over R. In fact, C = R[x]/hx2 + 1i = R(i), so C is a simple extension of R, and i is a primitive element. 3. Consider L = C and K = Q. Then L is a transcendental extensional of K and [C : Q] = ∞. We leave the proof of the following proposition as an exercise: Proposition 7.1.3. Let M ⊃ L ⊃ K be successive extensions of a field K. Then [M : K] is finite if and only if both [M : L] and [L : K] are finite and in this case: [M : K] = [M : L] · [L : K]. Example 7.1.4. √ q(x) = x2 −3 is irreducible We saw√above that [Q( 2), √ Q] = 2. The polynomial √ over Q( 2): a root a + b 2 of q(x) in Q( 2) must satisfy √ (a + b 2)2 = 3, a, b ∈ Q. √ Hence, (a2 + 2b2 ) + 2ab 2 = 3, so√that a2 + 2b2 = 3 and ab = 0. If b = 0, then 2 a2 = 3 which is impossible since 3 6∈ Q. If a = 0 √ we have = 6 which is √ √ (2b) √ also impossible√since 6 ∈ 6 Q. We conclude that Q( 2)( 3) ≃ Q( 2)[x]/hx2 − √ √ 3i so that [Q( 2)( 3), Q( 2)] = 2. By Proposition 7.1.3, we conclude that √ √ √ √ √ √ [Q( 2, 3), Q] = [Q( 2)( 3), Q( 2)] · [Q( 2), Q] = 2 · 2 = 4. In the examples so far, we have consider only subfields of the field of complex numbers. In this case, there is no problem in finding extensions where a given polynomial has roots due to a fundamental property of C: every polynomial (of degree ≥ 1) with coefficients in C has at least one root, i.e., C is algebraically closed. We recall: Proposition 7.1.5. The following statements are equivalent: (i) L is an algebraically closed field. (ii) Every polynomial p(x) = an xn + · · ·Q + a1 x + a0 ∈ L[x] factors into a product of linear factors: p(x) = an n1 (x − ri ). (iii) Every irreducible polynomial in L[x] has degree 1. (iv) There are no proper algebraic extensions of L. 338 CHAPTER 7. GALOIS THEORY There are many interesting examples of fields which are not subfields of C. For example, the number fields Zp , relevant in Number Theory, or the field of fractions C(z), fundamental in Algebraic Geometry. For such fields it is not completely obvious that given a polynomial there exists an extension where the polynomial factors into linear polynomials. We will deal with these issues in the next sections. Exercises. 1. Show that the subset of C formed by all algebraic elements over Q is an algebraic extension of Q of infinite dimension over Q (such elements of C are usually called algebraic numbers). 2. Give a proof of Proposition 7.1.3. 3. Show that: (a) every algebraically closed field K is infinite; (b) if K is an infinite field, show that any algebraic extension of K has the same cardinality as K; 4. Show that there are elements in C which are not algebraic over Q. 5. Let L ⊂ C be the smallest extension of Q which contains the roots of the polynomial x3 − 2. Decide if L is simple or not and in the afirmative case give an example of a primitive element. 6. Prove Proposition 7.1.5. 7. Let K be a field and x an indeterminate. In K(x) consider the element u = x2 . Show that K(x) is a simple extension of K(u). What is the dimension of [K(x) : K(u)] ? 7.2 Compass-and-Straightedge Constructions The mathematicians in Ancient Greece used to express in a geometric manner many mathematical concepts and constructions. In general, a construction was consider to be valid only if it could be obtained by using only a compass and a straightedge (an unmarked ruler). In spite of their spectacular achievements, there were a few simple constructions that they could not find a way to perform using only compass and straightedge. 7.2. COMPASS-AND-STRAIGHTEDGE CONSTRUCTIONS 339 Among the most famous constructions were the Greeks failed to find a compass-and-straightedge construction, were the following: (i) trisecting an angle; (ii) duplicating a cube; (iii) constructing a regular heptagon; (iv) squaring the circle, i.e., constructing a square with area equal to a given circle. The problem of existence of such compass-and-straightedge constructions can be transformed into a problem in Field Theory. Then Galois Theory can be applied to solve the problem. We will formulate here a first version of the theory which allows us already to show that (i)-(iv) do not have a compass-and-straightedge construction. Starting with two points in the plane let us attempt to determine all possible compass-and-straightedge constructions based at these points. Without loss of generality, we can assume that the two points are the numbers 0 and 1 in the complex plane C. We define inductively subsets Xm ⊂ C, m = 1, 2, . . . , as follows: • X1 ≡ {0, 1}, and • Xm+1 is the union of Xm with (C1) the points of intersection of straight lines connecting points of Xm ; (C2) the points of intersection of straight lines connecting points of Xm with circumferences with center in points of Xm and radius segments with end points in Xm ; (C3) the points of intersection of pairs of circumferences with center in points of Xm and radius segments with end points in Xm ; A point z ∈ C is said to be constructible with compass and straightedge if it belongs to the union: [ Xm . C≡ m We call C the set of constructible points. Let us give some alternative descriptions of the set C of constructible points. These will be complemented later with more complete descriptions, after we study Galois Theory. 340 CHAPTER 7. GALOIS THEORY Theorem 7.2.1. C is the smallest subfield of C which contains Q, and is closed for the operations of complex conjugation (z 7→ z̄) and taking square 1 roots (z 7→ z 2 ). Proof. The proof is split into 3 parts: (a) C is a subfield of C: Assume that z1 , z2 ∈ C. Then z1 − z2 and z1 /z2 (z2 6= 0) belong to C, since the usual operations of addition and multiplications of complex can be expressed geometrically exclusively using (C1), (C2) and (C3) (exercise). (b) C is closed for the operations of complex conjugation and taking square roots It is obvious that C is closed for the operation of complex √ conjugation. On the other hand, if z ∈ C and r = |z|, we can obtain r using (C1), (C2) and (C3), through the geometric construction sketched in the figure. Since bisecting an angle can also be performed using these 1 operations, we see that z 2 can be constructed using (C1), (C2) and (C3). PSfrag replaements p r 1 mi h e e~ 1h1+i =rh2Zi h4i h12i h6i h3i hmd(m; n)i hmi hni hmm(m; n)i Figure 7.2.1: Finding the square root. (c) If Q ⊂ C ′ ⊂ C is any subfield closed for the operations of complex conjugation and taking square roots, then C ⊂ C ′ : We need to show that the points of intersection of (i) straight lines connecting points of C ′ , (ii) straight lines connecting points of C ′ with circumferences with center in C ′ and radius segments with end-points in C ′ , (iii) pairs of circumferences with center in C ′ and radius segments with end-points in C ′ , 7.2. COMPASS-AND-STRAIGHTEDGE CONSTRUCTIONS 341 belong to C ′ . Observe first that if z = x + iy ∈ C ′ , then x and y belong to C ′ , since C ′ is closed under conjugation. It follows that if ax + by + c = 0 is the equation of a straight line connecting points of C ′ , then a, b, c ∈ C ′ . Similarly, if x2 + y 2 + dx + ey + f = 0 is the equation of a circumference with center in C ′ and radius a segment with end-points in C ′ , then d, e, f ∈ C ′ . Finally, observe that the points of intersection of two objects of this type, have coordinates certain fractions which, at most, contain square roots in the coefficients a, b, c, d, e, f , hence, they belong to C ′ . Next, we observe that one can translate the property of being constructible to an issue about the structure of fields. Theorem 7.2.2. A complex number z belongs to C if and only if it belongs to a subfield Q(u1 , u2 , . . . , ur ) ⊂ C, where u1 2 ∈ Q and um+1 2 ∈ Q(u1 , . . . , um ) for m = 1, 2, . . . , r − 1. Proof. Since C is a field which contains Q and is closed under taking square roots, it follows that fields of the form Q(u1 , u2 , . . . , ur ), with u1 2 ∈ Q and um+1 2 ∈ Q(u1 , . . . , um ) (m < r), are subfields of C. To complete the proof it is enough to check that the complex numbers that belong to some field Q(u1 , u2 , . . . , ur ) with u1 2 ∈ Q and um+1 2 ∈ Q(u1 , . . . , um ) (m < r), form a subfield of C, closed under conjugation (z 7→ z̄) and taking square roots (z 7→ 1 z 2 ). For this, note that if z ∈ Q(u1 , u2 , . . . , ur ) and z ′ ∈ Q(u′ 1 , u′ 2 , . . . , u′ s ), then z − z ′ and z/z ′ (z ′ 6= 0) belong to Q(u1 , u2 , . . . , ur , u′ 1 , u′ 2 , . . . , u′ s ). This field is obviously closed under conjugation and taking square roots is obvious. As a corollary, we obtain the following criterion to exclude a complex number from C: Corollary 7.2.3. If a number z ∈ C is constructible, then it is algebraic over Q of degree a power of 2. Proof. If K(u) is an extension of K where u2 ∈ K, then either K(u) = K or [K(u) : K] = 2. Hence, the fields Q(u1 , u2 , . . . , ur ) with u1 2 ∈ Q and um+1 2 ∈ Q(u1 , . . . , um ), if m < r, satisfy [Q(u1 , . . . , ur ) : Q] = 2s , for some integer s ≤ r. The corollary now follows from the theorem, if one observes that z ∈ C, then Q(z) ⊂ Q(u1 , . . . , ur ), hence, [Q(z) : Q] = 2t . We will see later, in our study of Galois Theory, that the converse of this corollary does not hold. However, we can use the corollary to show that constructions (i)–(iv) are not possible if one uses only a compass and straightedge. 342 CHAPTER 7. GALOIS THEORY Examples 7.2.4. 1. Trisecting an angle. One can construct a 60o degree angle with compass and straightedge. We claim that one cannot trisect an 60o angle. If this was possible, then the number cos 20o + i sin 20o would be constructible and, in particular, cos 20o would be constructible. The trigonometric identity cos 3θ = 4 cos3 θ − 3 cos θ, shows that cos 20o is a root of the polynomial 4x3 −3x− 21 = 0. This polynomial is irreducible over Q (exercise), hence the degree of cos 20o over Q is 3. By the corollary, cos 20o is not constructible. √ 2. Duplicating a cube. We need to √ show that 3 2 is not√constructible. For that, it is enough to observe that [Q( 3 2) : Q] = 3, since 3 2 is a root of the irreducible polynomial x3 − 2. 3. Constructing a regular do heptagon. If this construction exists, then 2π the number z = cos 2π 7 + i sin 7 is constructible. This number is a root of 7 the polynomial x − 1 = (x − 1)(x6 + x5 + · · · + 1). Since x6 + x5 + · · · + 1 is irreducible over Q, we conclude that [Q(z) : Q] = 6. Since 6 is not a power of 2, the corollary shows that z is not constructible. 4. Squaring the circle. Starting from a circle of radius 1, we see that an affirmative answer implies that the number π is constructible. As we know, π is transcendental over Q, hence is not constructible4 . There exist algebraic numbers of degree a power of 2 which are not constructible. We will see later that Galois Theory yields a more efficient criterion to determine if a given algebraic number is constructible. For now, we remark that the structure of certain extensions of Q was the relevant factor to decide about this property. Exercises. 1. Interpret the usual addition and multiplication operations on complex numbers geometrically, showing that it involves exclusively (C1), (C2) and (C3) and, hence, maybe constructed with a compass and straightedge. 4 We will not discuss in this book the transcendence of π. The existence of transcendentals numbers was established for the first time by Liouville in 1844. Liouville observed that algebraic numbers cannotP be the limit of rational sequences which converge “very fast”. −n! For example, the number ∞ cannot be algebraic. Hermite, in 1873, showed that n=1 10 the base of the natural logarithms “e” is transcendental. Finally, Lindemann, in 1882, used methods similar to Hermite to prove that π is transcendental. In 1874, Cantor gave an argument without using limits (see Exercises 3 and 4 in Section 7.1) showing that there exist transcendental numbers. 7.3. SPLITTING FIELDS 2. Show that the number √ 2 2 343 +i √ 2 2 is constructible. 3. Show that the polynomial q(x) = 4x3 − 3x − 1 2 is irreducible over Q. 4. Show that arccos 71/125 can be trisected. 5. Let p be a prime. Show that the polynomial xp−1 +xp−2 +· · ·+1 is irreducible over Q. (Hint: Replace x be x + 1 in the expression xp−1 + xp−2 + · · · + 1 = (xp − 1)/(x − 1) and apply Eisenstein’s Criterion.) 6. Let p be a prime. (a) Show that if a regular polygon with p sides can be constructed with a compass and straightedge then p = 2s + 1. t (b) Show that if p = 2s + 1 is prime then p is a Fermat number: p = 22 + 1. (c) Conclude that if a regular polygon with p sides can be constructed with a compass and straightedge then p is a Fermat number:. 7. Decide if a regular polygon of 9 sides has a compass-and-straightedge construction. 7.3 Splitting Fields As usual, we assume that K is a field. Given a polynomial p(x) ∈ K[x] is there some extension L of K where p(x) splits into a product of linear factors? Since K is not a priori a subfield of an algebraically closed field this is not completely obvious. Such an extension, if it exists, is necessarily obtained through some “abstract” construction. In fact, such an extension always exists and its construction is similar to the construction of the complex field C, as an extension of R, in the form of the “abstract” quotient R[x]/hx2 + 1i. Theorem 7.3.1. Let p(x) ∈ K[x] be polynomial of degree n ≥ 1. There exists an extension L ⊃ K where p(x) splits into a product of linear terms. Moreover, one can take this extension to be of the form L = K(r1 , . . . , rn ), where r1 , . . . , rn are the roots of p(x) in L. 344 CHAPTER 7. GALOIS THEORY Proof. Without loss of generality, we can assume that p(x) is a monic polynomial of degree n. Denoting by l the number of irreducible factors of p(x), the proof proceeds by induction on n − l. If n − l = 0, then p(x) is already a product of linear terms and the roots of p(x) belong to K, Hence, the result holds in this case. Assume now that n − l > 0. Then there exists a factor q(x) of p(x) which is irreducible of degree > 1. The field M = K[x]/hq(x)i is an extension of K, containing the root r = x + hq(x)i of q(x), and coincides with K(r). In M , we have the factorization q(x) = (x − r)q̃(x), so the polynomial p(x) splits into a product of ˜l irreducible factors with ˜l > l. Since n − ˜l < n − l the induction hypothesis gives an extension L of M , where p(x) splits into a product of linear terms, and such that: L = M (r1 , . . . , rn ), where r1 , . . . , rn are the roots of p(x) in L. Since the root r is among the roots r1 , . . . , rn , we see that L = M (r1 , . . . , rn ) = K(r)(r1 , . . . , rn ) = K(r, r1 , . . . , rn ) = K(r1 , . . . , rn ). This result leads to the following definition. Definition 7.3.2. Let p(x) ∈ K[x]. An extension L ⊃ K is called a splitting extension of p(x) if: (i) p(x) splits in L into a product of terms of degree 1; (ii) L = K(r1 , . . . , rn ) where r1 , . . . , rn are the roots of p(x) in L. Similarly, if {pi (x)}i∈I ⊂ K[x] is a family of polynomials, an extension L ⊃ K is called a splitting extension of the family {pi (x)}i∈I if: (i) each pi (x) splits in L into a product of terms of degree 1; (ii) L is generated by the roots of the polynomials pi (x). Often, we will use the term splitting field instead of splitting extension, if the base field is clear from the context. We will see in the next section that two splitting extensions of a polynomial (or a family) are isomorphic. The proof of Theorem 7.3.1 gives a practical algorithm to construct a splitting extension of a polynomial, as we illustrate in the next examples. 7.3. SPLITTING FIELDS 345 Examples 7.3.3. 1. Consider a polynomial p(x) = x2 + bx + c with coefficients in a field K. If p(x) has a root in K, then K is already a splitting field of p(x). Assume then that p(x) is irreducible. Then K[x]/hp(x)i is an extension of K, r1 = x−hp(x)i is a root of p(x), and K[x]/hp(x)i = K(r1 ). In K(r1 ) we have x2 + bx + c = (x − r1 )(x − r2 ), hence K(r1 ) = K(r1 , r2 ) is a splitting field of x2 + bx + c and [K(r1 , r2 ) : K] = 2. √ √ 2. We have seen in a example in the previous section that Q( 2, 3) ⊂ C is a √ √ splitting field of (x2 − 2)(x2 − 3) and that [Q( 2, 3) : Q] = 4. 3. The polynomial p(x) = xp − 1 (p a prime) has the root r0 = 1 in Q, so that xp − 1 = (x − 1)(xp−1 + xp−2 + · · · + 1). The polynomial xp−1 + xp−2 + · · · + 1 is irreducible in Q (exercise). Let r ∈ Q[x]/hxp−1 + xp−2 + · · · + 1i be the element x + hxp−1 + xp−2 + · · · + 1i. In Q(r)Q we have (rk )p = rkp = 1, and p that r, r2 , . . . , rp−1 are all distinct, so xp − 1 = i=1 (x − ri ). We conclude that p Q(r) is a splitting field of x − 1 and that [Q(r) : Q] = p − 1. If z ∈ C is a complex root of xp −1 distinct from 1, one checks easily that Q(z) is isomorphic to Q(r). 4. The polynomial p(x) = x3 +x2 +1 is irreducible over Z2 [x]. In fact, p(x) does not have any roots in Z2 , since p(0) = 0 + 0 + 1 = 1 and p(1) = 1 + 1 + 1 ≡ 1 (mod 2). The field Z2 [x]/hx3 + x2 + 1i is an extension of Z2 where p(x) has the root r = x + hx3 + x2 + 1i. Hence, in this field we can factor p(x) = (x − r)(x2 + bx + c). One checks by comparison that b = 1 + r and c = r2 + r. Using the relations r3 = r2 + 1 and r4 = r2 + r + 1, one check immediately that r2 is a root of x2 + (r + 1)x + (r2 + r). Hence, p(x) splits into linear factors in Z2 (r)[x]. This shows that Z2 (r) is a splitting field of x3 + x2 + 1 over Z2 , and that [Z2 (r) : Z2 ] = 3. 5. The polynomial p(x) = x3 − 2 is irreducible over Q. We form the extension L = Q[x]/hx3 − 2i and let r1 = x + hx3 − 2i. Then L = Q(r1 ) and in L the polynomial x3 − 2 factors as (x − r1 )(x2 + r1 x + r1 2 ). The polynomial x2 + r1 x + r1 2 is irreducible over Q(r1 ) (exercise), so we can form a new extension M = Q(r1 )[x]/hx2 + r1 x + r1 2 i. Setting r2 = x + hx2 + r1 x + r1 2 i ∈ M , we find that M = Q(r1 , r2 ) and in Q(r1 , r2 )[x] we have the factorization x3 − 2 = (x − r1 )(x − r2 )(x − r3 ). We conclude that Q(r1 , r2 ) = Q(r1 , r2 , r3 ) is a splitting field of x3 − 2. Also, by Proposition 7.1.3, we have [Q(r1 , r2 , r2 ) : Q] = 3 · 2 = 6. As we have mentioned before, the splitting field Q(r1 , . . . , rn ) of a polynomial p(x) = an xn +· · ·+a1 x+an ∈ Q[x] can always be realized as a subfield of C, since this field is an algebraically closed extension of Q. Actually, we can now show that any field admits an algebraically closed extension. 346 CHAPTER 7. GALOIS THEORY Theorem 7.3.4. Let K be a field. There exists an algebraically closed extension of K. Proof. We split the proof into 4 steps: (a) There exists an extension L1 of K where every polynomial in K[x] of degree ≥ 1 has a root: For each p(x) ∈ K[x] of degree ≥ 1 choose an indeterminate xp . Denote by S the set of all such indeterminates. We claim that in the ring K[S] the polynomials5 p(xp ) generate a proper ideal. Assume that (7.3.1) g1 p1 (xp1 ) + · · · + gm pm (xpm ) = 1, for some gi = gi (xp1 , . . . , xpm ) ∈ K[S]. There exists an extension M of K where the polynomials p1 , . . . , pm have roots α1 , . . . , αm , so replacing xpi 7→ αi in (7.3.1) we obtain 0 = 1, a contradiction. This proves the claim, and so there exists a maximal ideal I in K[S] which contains the polynomials p(xp ). The field L1 ≡ K[S]/I is the sought extension. (b) By(a), using induction, we construct a chain of extensions K ⊂ L1 ⊂ L2 ⊂ . . . ⊂ Ln ⊂ . . ., where for each k, any polynomial of Lk [x] of degree ≥ 1 has a root in Lk+1 . (c) Let L ≡ ∪i Li . Then L is a field: if a, b ∈ L, there exists a k such that a, b ∈ Lk , and one can define a + b and ab as the sum and product in Lk . This definition is independent of k, since the Li are successive extensions. (d) The field L is a algebraically closed extension: if p(x) ∈ L[x] has degree ≥ 1, then the coefficients of p(x) belong to some Lk . Hence, p(x) ∈ Lk [x], so p(x) must have a root in Lk+1 , hence in L. Corollary 7.3.5. Let K be a field. There exists an algebraic extension L̃ ⊃ K which is algebraically closed. Proof. Let L ⊃ K be an algebraically closed extension and denote by L̃ the set of all algebraic elements of L over K. It is easy to check that L̃ is a subfield of L which has the desired properties. 5 One should not confuse p(x) with p(xp )! 7.3. SPLITTING FIELDS 347 An algebraic extension of K which is algebraically closed is called an algebraic closure of K. We will see in the next section that any two a such extensions are isomorphic. We will denote by the symbol K any algebraic closure of K. The existence of a splitting field of a polynomial (Theorem 7.3.1) can be easily shown if one appeals to the algebraic closure. However, note that the proof of the existence of an algebraic closure uses precisely Theorem 7.3.1. Still, we can now easily prove the existence of a splitting field of an arbitrary family of polynomials. Corollary 7.3.6. Let {pi (x)}i∈I ⊂ K[x] be a family of polynomials. There exists a splitting extension L ⊃ K for the family {pi (x)}i∈I . a Proof. It is enough to consider in the algebraic closure K the subfield L generated by all the roots of the polynomials {pi (x)}i∈I . Let us consider now the converse problem: when is an extension L of K a splitting extension of some family of polynomials with coefficients in K? Definition 7.3.7. An algebraic extension L of K is called a normal extension of K if every irreducible polynomial in K[x] which has a root in L splits into a product of linear terms in L[x]. Examples 7.3.8. √ 1. Q( 2) is a normal extension of Q. To check √ p(x) ∈ Q[x] be an √ this let irreducible polynomial with a root r = a + b 2 ∈ Q( 2). Then p(x) is a multiple of the minimal polynomial q(x) of r. Since q(x) = (x − a)2 − 2b2 ∈ Q[x] and p(x) is irreducible over Q, we must have √ p(x) = λq(x),√for some λ ∈ Q. √ Finally, in Q( 2) we have p(x) = c(x − a − b 2)(x − a + b 2). √ Q. In √ fact, the polynomial x4 − 2 is 2. Q( 4 2) is not a normal extension of √ 4 irreducible over Q, and has the root √2 in Q( 4 2), However, this polynomial does not split into linear terms in Q( 4 2), since this field does not contain the imaginary roots of x4 − 2. The last example suggest that there must exist some relation between splitting extensions and normal extensions. The next result clarifies completely this issue: 348 CHAPTER 7. GALOIS THEORY a Theorem 7.3.9. Let L ⊂ K be an algebraic extension of K. The following statements are equivalent: (i) L is a normal extension of K. (ii) L contains a splitting field of the minimal polynomial of any u ∈ L. (iii) L is a splitting extension of a family {pi (x)}(i∈I) ⊂ K[x]. a (iv) Every monomorphism φ : L → K such that φ|K =id is an automorphism of L: φ(L) = L. We defer the proof of this theorem for the next section. An immediate corollary is the following: Corollary 7.3.10. An extension L of K is a splitting extension of a polynomial p(x) ∈ K[x] if and only if L is a normal, finite-dimensional extension of K. We have provide above a simple example of a non normal extension. If an algebraic extension is not normal, one can try to find a larger normal extension which is not “too big”. Proposition 7.3.11. Let L be an algebraic extension of K. There exists an extension L̃ ⊃ L such that: (i) L̃ is a normal extension of K. (ii) for any normal extension M of K such that L ⊂ M ⊂ L̃, one has M = L̃. (iii) [L̃ : K] < ∞ if and only if [L : K] < ∞. Proof. Let S = {ui : i ∈ I} be a basis for L over K and for each i ∈ I denote by pi (x) ∈ L[x] the minimal polynomial of ui . Let L̃ be a splitting extension for the family {pi (x)}i∈I . By Theorem 7.3.9, L̃ is a normal extension of K, so (i) holds. Also by Theorem 7.3.9, any normal extension of K must contain a splitting extension of the minimal polynomial of any of its elements so we conclude that L̃ is the smallest normal extension of K that contains L, and (ii) also holds. Finally, to see that (iii) holds, note that if [L : K] < ∞, then I is finite and we can replace the family {pi (x)}i∈I by the polynomial Q p(x) = i∈I pi (x). It follows that [L̃ : K] < ∞. Conversely, if [L̃ : K] < ∞ then, by Proposition 7.1.3, [L : K] < ∞. 7.3. SPLITTING FIELDS 349 From the uniqueness of the the splitting extension it follows that the extension L̃ given in the previous proposition is unique up to an isomorphism. n Such an extension is called the normal closure of L and is denoted L . Example 7.3.12. √ of Q. We saw in a example above that L = Q( 4 2) is not a normal extension √ n It is easy to check that its normal closure is the extension L = Q( 4 2, i). Exercises. 1. Let L = Q[x]/hx3 − 2i and r = x + hx3 − 2i ∈ L. Show that the polynomial x2 + rx + r2 is irreducible over L. 2. Prove the following extension of Theorem 7.3.1, without using the algebraic closure: If p1 (x), . . . , pm (x) ∈ K[x], then there exists a splitting extension of the family of polynomials p1 (x), . . . , pm (x). 3. For the following polynomials over Q find splitting extensions and determine their dimensions: (a) x2 + x + 1; (b) (x3 − 2)(x2 − 1); (c) x6 + x3 + 1; (d) x5 − 7. 4. Consider the polynomial x3 − 3. Find a splitting extension and its dimension over each of the following ground fields: (a) Q; (b) Z3 ; (c) Z5 . 5. If p(x) ∈ K[x] is a polynomial of degree n, and L is a splitting extension of p(x), show that [L : K] | n!. n 6. Determine a splitting extension of the polynomial xp − 1 over the field Zp . 7. Show that the extension L̃ constructed in the proof of Corollary 7.3.5 is algebraically closed. 8. If L is an extension of K and [L : K] = 2, show that L must be a normal extension of K. 350 CHAPTER 7. GALOIS THEORY 9. If M ⊃ L ⊃ K are successive extensions, with M normal over L and L normal over K, is it true that M must be normal over K? 10. Show that if M ⊃ L ⊃ K are successive extensions and M is normal over K, then M is normal over L. 7.4 Homomorphisms of Extensions Definition 7.4.1. Let L1 ⊃ K1 and L2 ⊃ K2 be extensions. A field homomorphism φ : L1 → L2 such that φ(K1 ) ⊂ K2 is called a homomorphism of extensions. When K1 = K2 = K and φ|K =id we say that φ is a homomorphism over K or a K-homomorphism6 . Notice that a K-homomorphism φ : L1 → L2 is the same thing as a field homomorphism which is also a linear transformation from L1 to L2 , as vector spaces over K. We are interested in following questions: Given an isomorphism of fields φ : K1 → K2 , when can one extend φ to an isomorphism of extensions Φ : L1 → L2 ? How many such extensions are there? The answers will allow us to prove uniqueness of algebraic and normal closures, as well as of splitting fields. Proposition 7.4.2. Let φ : K1 → K2 be an isomorphism of fields, L1 ⊃ K1 and L2 ⊃ K2 extensions, and r ∈ L1 algebraic over K1 with minimal polynomial p(x). The isomorphism φ can be extended to a monomorphism Φ : K1 (r) → L2 if and only if the polynomial pφ (x) has a root in L2 . The number of such extensions equals the number of distinct roots of pφ (x) in L2 . Proof. Let s be a root of pφ (x) in L2 . It is easy to check that there exist a unique field homomorphism Φ : K1 (r) → L2 such that Φ|K1 = φ and Φ(r) = s. On the other hand, if Φ is any extension of φ, then pφ (Φ(r)) = Φ(p(r)) = Φ(0) = 0, so pφ (x) has a root in L2 . Here is a first answer to the questions posed above: 6 Recall that a homomorphism of fields is necessarily injective, hence in these definitions one can replace “homomorphism” by “monomorphism”. 7.4. HOMOMORPHISMS OF EXTENSIONS 351 Theorem 7.4.3. Let φ : K1 → K2 be an isomorphism of fields and p(x) ∈ K1 [x]. If L1 and L2 are splitting extensions of p(x) and pφ (x), respectively, then there exists an isomorphism Φ : L1 → L2 which extends φ. The number of such extension is ≤ [L1 : K1 ], and equals [L1 : K1 ] when pφ (x) has distinct roots in L2 . Proof. We use induction on [L1 : K1Q ]. If [L1 : K1 ] = 1,Q then p(x) = an ni=1 (x − ri ), where ri ∈ K1 = L1 . But then pφ (x) = φ(an ) ni=1 (x − φ(ri )), with φ(an ), φ(ri ) ∈ K2 . Since the roots of a polynomial generate its splitting field we conclude that L2 = K2 , and that there exists exactly [L1 : K1 ] = 1 extension. Assume now that [L1 : K1 ] > 1. Then p(x) has an irreducible factor q(x) ∈ K1 [x] of degree ≥ 1. Let r be a root of q(x) in L1 . By Proposition 7.4.2 the isomorphism φ : K1 → K2 can be extended to a monomorphism φ̃ : K1 (r) → L2 and there exist as many such extensions as distinct roots of q φ (x) in L2 . We can also view L1 and L2 as splitting extensions of p(x) and pφ (x) over K1 (r) and φ̃(K1 (r)), respectively. Since [L1 : K1 (r)] = [L1 : K1 ]/[K1 (r) : K1 ] = [L1 : K1 ]/ deg q(x) < [L1 : K1 ], by the induction hypothesis, there exists an extension of φ̃ to an isomorphism Φ : L1 → L2 , and the number of such extensions is ≤ [L1 : K1 (r)], and equal [L1 : K1 (r)] if pφ (x) has distinct roots in L2 . It follows that Φ is an extension of φ, and the number of extensions Φ of this type is [L1 : K1 (r)] · deg q(x) = [L1 : K1 (r)] · [K1 (r) : K1 ] = [L1 : K1 ], if pφ (x) has distinct roots in L2 . Finally, observe that we can obtain any extension of φ in this way. In fact, given any extension Φ : L1 → L2 its restriction to K1 (r) gives a monomorphism K1 (r) → L2 , which by Proposition 7.4.2 is an extension of φ of the type φ̃, as above. Setting K1 = K2 = K and φ =id, in this theorem, yields: Corollary 7.4.4. Any two splitting fields of p(x) ∈ K[x] are isomorphic. Example 7.4.5. In an example in the previous section, we have constructed an (abstract) splitting field L1 = Q(r1 , r2 , r3 ) of x3 − 2 ∈ Q[x], with [L1 : Q] = 6. Another splitting field L2 can be obtained √ taking the√subfield of C generated by Q √ by By the theorem, all the 6 and the roots of x3 − 2 in C: 3 2, 3√2/2(−1 √ ± i 3). √ bijections between {r1 , r2 , r3 } and { 3 2, 3 2/2(−1 ± i 3)} can be realized by a Q-isomorphism L1 → L2 . 352 CHAPTER 7. GALOIS THEORY Let us consider now the case of arbitrary algebraic extensions: Theorem 7.4.6. Let φ : K1 → K2 be an isomorphism of fields, L1 ⊃ K1 an algebraic extension and L2 ⊃ K2 an algebraically closed extension. There exists a monomorphism Φ : L1 → L2 which extends φ. If L1 is algebraically closed and L2 is algebraic over K2 , then Φ is an isomorphism L1 ≃ L2 . Proof. Conside the set P formed by ordered pairs (N, τ ) where N ⊂ L1 is an extension of K1 , and τ : M → L2 is a monomorphism which extends φ. In P one defines a partial order relation by declaring (N1 , τ1 ) ≤ (N2 , τ2 ) if and only if N1 ⊂ N2 and τ2 |N1 = τ1 . The set P is non-empty, since it contains the pair (K1 , φ), and any chain {(Ni , τi )}i∈I of elements of P is bounded above by the pair (N, τ ), where N ≡ ∪i Ni and τ |Ni ≡ τi . By Zorn’s Lemma, P contains a maximal element (M, Φ). We claim that M = L1 , so that Φ gives the desired extension. In fact, assume that there exists r ∈ L1 − M . We can then form the extension extension M (r) and, by Proposition 7.4.2, there exists an extension Φ̃ : M (r) → L2 . The pair (M (r), Φ̃) contradicts the maximality of (M, Φ), so we must have L1 = M , as claimed. Finally, if L1 is algebraically closed, then Φ(L1 ) is also algebraically closed. If L2 is algebraic over K2 , then we must have L2 ⊂ Φ(L1 ), hence Φ is onto. The previous result leads to the uniqueness of the algebraic closure and of the splitting extension of any family of polynomials. For example, if one takes K1 = K2 = K and φ =id in the theorem, we obtain immediately: Corollary 7.4.7. Let K be a field, L and L̃ algebraically closed extensions, algebraic over K. There exists a K-isomorphism Φ : L → L̃. Since a splitting extension of a family of polynomials can be taken inside an algebraic closure, and it is generated by the all roots of the polynomials in the family, we also obtain: Corollary 7.4.8. Let {pi (x)}i∈I ⊂ K[x] be a family of polynomials. Any two splitting extensions of the family {pi (x)}i∈I are isomorphic. Finally, the uniqueness of the normal closure of an extension L of K follows from Theorem 7.3.9 whose proof we are now able to furnish. Proof of Theorem 7.3.9. We will prove the implications (i) ⇒ (ii) ⇒ (iii) ⇒ (iv) ⇒ (i). 7.4. HOMOMORPHISMS OF EXTENSIONS 353 (i) ⇒ (ii): Let r ∈ L and denote by p(x) ∈ K[x] the minimalQpolynomial of r. Since L is normal and p(r) = 0, p(x) splits into a product ni=1 (x − ri ) in L[x]. The subfield K(r1 , . . . , rn ) ⊂ L is a splitting field of p(x). (ii) ⇒ (iii): L is a splitting extension of the family {pr (x)}r∈L , where pr (x) is the minimal polynomial of the element r ∈ L. a (iii) ⇒ (iv): Let r ∈ L ⊂ K be a root of a polynomial pi (x). Then φ(r) is also a root of pi (x), since φ|K =id. Since the roots of the family {pi (x)}i∈I generate the field L, we conclude that φ(L) = L. (iv) ⇒ (i): Let p(x) ∈ K[x] be an irreducible polynomial with a root a a r ∈ L. If r̃ ∈ K is any other root of p(x), the map φ : K(r) → K(r̃) ⊂ K which takes r 7→ r̃ and is the identity in K, is a K-isomorphism, which by Theorem 7.4.6 can be extended to L. Then r̃ = φ(r) ∈ L, hence all the roots of p(x) belong to L, so p(x) factors into a product of linear terms in L[x]. Hence, L is a normal extension of K. Exercises. 1. Let φ : K1 → K2 be an isomorphism, L1 ⊃ K1 and L2 ⊃ K2 extensions, and r ∈ L1 algebraic over K1 with minimal polynomial p(x). If s is a root of pφ (x) in L2 , show that there exists a unique homomorphism of fields Φ : K1 (r) → L2 such that Φ|K1 = φ and Φ(r) = s. √ √ 2. Consider the extensions Q(i), Q( 2), Q(i, 2) ⊂ C of Q. √ (a) Show that Q(i) and Q( 2) are isomorphic Q-vector spaces, but are not isomorphic as fields. √ (b) Find all the Q-automorphisms of Q(i, 2). 3. Let r be a root of x6 + x3 + 1. Determine all the Q-homomorphisms Φ : Q(r) → C. (Hint: x6 + x3 + 1 is a factor of x9 − 1.) 4. Determine all the Z2 -automorphisms of the splitting extension of x3 +x2 +1 ∈ Z2 [x]. 5. Show that an algebraic extension L of K is a normal extension if and only if every irreducible polynomial of K[x] factors in L[x] into a product of irreducible factors where all the factors have the same degree. 354 7.5 CHAPTER 7. GALOIS THEORY Separability As it will be clear in the sequel, the K-automorphisms of an extension L ⊃ K play a fundamental role in Galois Theory. As we have seen in the previous section, the precise counting of the number of K-automorphisms depend on the number of distinct roots of certain polynomials. We will now study when a given polynomial has multiple roots and we will establish a criterion for its existence. Any polynomial p(x) ∈ K[x] splits into a product of linear terms in a a K [x]. Hence, if r ∈ K is a root of p(x), we can define its multiplicity to a be the largest integer k such that (x − r)k | p(x) in K [x]. The root is called simple if k = 1, and multiple if k > 1. The criterion for the existence of multiple roots will require a notion of derivative of a polynomial. When the polynomial can be viewed as real or complex valued function, this is just the usual notion of derivative, but in general this will be only a formal derivative. Definition 7.5.1. Let K be a field. The formal derivative operator of K[x] is the only map D : K[x] → K[x] which satisfies the following properties: (i) Linearity: D(p(x) + q(x)) = D(p(x)) + D(q(x)) and D(a · p(x)) = a · D(p(x)). (ii) Leibniz Rule: D(p(x)q(x)) = D(p(x))q(x) + p(x)D(q(x)). (iii) Normalization: D(x) = 1. One checks easily that there is exactly one operator satisfying these properties. In fact, if p(x) = an xn + · · · + a1 x + a0 ∈ K[x], then by applying (i), (ii) and (iii), we find: D(p(x)) = nan xn−1 + · · · + 2a2 x + a1 . (7.5.1) We will often write p′ (x) instead of D(p(x)). Theorem 7.5.2. Let p(x) ∈ K[x] be a monic polynomial of degree ≥ 1. The a roots of p(x) in K are simple if and only if gcd(p(x), p′ (x)) = 1. a Proof. If r ∈ K is a multiple root of p(x), then p(x) = (x − r)k q(x), where a k > 1 and q(x) ∈ K [x]. Diferenciating: p′ (x) = (x − r)k q ′ (x) + k(x − r)k−1 q(x), 7.5. SEPARABILITY 355 so we see that (x − r)k−1 | p′ (x), and (x − r) is a common factor of p(x) and p′ (x). We conclude that if gcd(p(x), p′ (x)) = 1, then p(x) has no multiple roots. a Assume now that p(x) has no multiple roots. Then in K [x] we have p(x) = (x−r1 )(x−r2 ) · · · (x−rn ), where the ri are all distinct. Differentiating we obtain: p′ (x) = n X (x − r1 ) · · · (x − ri−1 )(x − ri+1 ) · · · (x − rn ), i=1 so that (x − ri ) ∤ p′ (x), for all i = 1, . . . , n. Hence, gcd(p(x), p′ (x)) = 1. Notice that to apply this criterion we do not need to know the ala gebraic closure K or even a splitting field of p(x): the computation of gcd(p(x), q(x)) is independent of the extension where one takes the coefficients of the polynomials p(x) and q(x). Definition 7.5.3. One calls p(x) ∈ K[x] a separable polynomial if each of its irreducible factors has no multiple roots. A perfect field is a field K where every polynomial in K[x] is separable. The criterions above leads immediately to: Corollary 7.5.4. Every field of characteristic zero is perfect. Proof. Let p(x) ∈ K[x] be a monic irreducible polynomial with multiple roots. Then gcd(p(x), p′ (x)) 6= 1, so we must have p(x) | p′ (x). Since deg p′ (x) < deg p(x), it follows that p′ (x) = 0. Since the characteristic of K is assumed to be zero, formula (7.5.1) for the derivative shows that p(x) ∈ K, a contradiction. The proof fails in characteristic p, since then the condition q ′ (x) = 0 only implies that q(x) = a0 + a1 xp + a2 x2p + . . . . One can then find examples of non-separable polynomials in characteristic p 6= 0 (see the exercises). When p 6= 0 the study of separability is based on the following simple lemma: Lemma 7.5.5. If K has characteristic p 6= 0 and a ∈ K, then the polynomial xp − a is either irreducible or takes the form (x − b)p , b ∈ K. Proof. Suppose that xp − a = g(x)h(x), where g(x) and h(x) are monic a polynomials. Let b ∈ K be a root of xp − a. Then p X (pk )xk (−b)p−k = xp − bp = xp − a, (x − b) = p k=0 356 CHAPTER 7. GALOIS THEORY since for 0 < k < p the coefficients pk are all divisible by p. Hence, g(x) = (x − b)k and we must have bk ∈ K. Since gcd(k, p) = 1, there exist r, s ∈ Z such that rk + sp = 1, and it follows that b = brk+sp = (bk )r + (bp )s ∈ K. This shows that xp − a = (x − b)p , with b ∈ K. Theorem 7.5.6. A field K of characteristic p 6= 0 is perfect if and only if K = K p , where K p denotes the subfield of K formed by all the powers ap , with a ∈ K. Proof. Let a ∈ K − K p . By Lemma 7.5.5, xp − a is irreducible and since (xp − a)′ = pxp−1 = 0, this polynomial is not separable. Hence if K 6= K p then K is not perfect. Conversely, assume that K is not perfect, so there exists q(x) ∈ K[x] irreducible and non-separable. Since q ′ (x) = 0, we have q(x) = a0 + a1 xp + a2 x2p + . . . . Note that at least one ai 6∈ K p , for if aj = bj p with bj ∈ K for all j, then q(x) = (b0 + b1 x + b2 x2 + . . . )p , which contradicts the assumption that q(x) is irreducible. Hence, K 6= K p . If K is a field of characteristic p 6= 0, the monomorphism φ : K → K, a→ 7 ap , is called the Frobenius endomorphism. The previous theorem states that K is perfect if and only if the Frobenius endomorphism is an automorphism. Corollary 7.5.7. If K is finite of characteristic p 6= 0, then K is a perfect. Proof. If K is finite, the Frobenius endomorphism is surjective. In most examples that we will present in our discussion of Galois Theory the ground field will be a perfect field. When this is not the case, in order for Galois Theory to work, we need the following extra assumption: Definition 7.5.8. Let L be an extension of K. (i) u ∈ L is called separable element over K if u is algebraic over K, and the minimal polynomial of u is separable. (ii) L is called a separable extension over K if all elements of L are separable over K. Given an algebraic extension L ⊃ K one defines its separable degree, denoted [L : K]s , to be the cardinality of the set of K-homomorphisms from a L to the algebraic closure K : a [L : K]s := #{φ : L → K : φ is a K-homomorphism}. 7.5. SEPARABILITY 357 Proposition 7.5.9 (Properties of the separable degree). Let K be a field. (i) If M ⊃ L ⊃ K are algebraic extensions then [M : K]s = [M : L]s · [L : K]s . (ii) If L is an extension of finite dimension over K, then [L : K]s ≤ [L : K], with equality precisely when L is separable over K. (iii) If L = K(u1 , . . . , um ), then L is separable over K if and only if u1 , . . . , um are separable over K. Proof. a (i) If φ : M → K is a K-homomorphism, then φ is an extension of a the K-homomorphism φ|L : L → K . Hence, it is enough to show that a for a K-homomorphism ψ : L → K its extensions to a K-homomorphism a φ : M → K are in bijection with the L-homomorphisms τ : M → L̄a . a a In fact, we can extend ψ to an isomorphism ψ̃ : L → K , hence the a correspondence that to a L-homomorphism τ : M → L associates the extension φ = ψ̃ ◦ τ is a bijection: the inverse associates to an extension a φ : M → K the L-homomorphism τ = ψ̃ −1 ◦ φ. (ii) By (i), it is enough to show that [K(u) : K]s ≤ [K(u) : K] for any algebraic element u over K, with equality if and only if K(u) is separable over K. Since [K(u) : K]s represents the number of K-homomorphisms φ : K(u) → K̄ a , it equals the number of distinct roots of the minimal polynomial of u over K. Hence, [K(u) : K]s ≤ [K(u) : K], with equality if and only if u is separable over K. But u is separable over K if and only if K(u) is a separable extension over K: if u′ ∈ K(u) is not separable, then [K(u) : K]s = [K(u) : K(u′ )]s [K(u′ ) : K]s < [K(u) : K(u′ )][K(u′ ) : K] = [K(u) : K]. (iii) If the ui are separable over K, then ui is separable over K(u1 , . . . , ui−1 ). From (i) and (ii), we conclude that [K(u1 , . . . , um ) : K]s = [K(u1 , . . . , um ) : K], so K(u1 , . . . , um ) is separable over K. If K is perfect, then obviously all algebraic extensions of K are separable. In particular, if K has characteristic 0 or K has characteristic p 6= 0 and K = K p , then every algebraic extension of K is separable. 358 CHAPTER 7. GALOIS THEORY Exercises. 1. Show that if K has characteristic 0 and p(x) ∈ K[x] is monic, then q(x) = −1 p(x)[gcd(p(x), p′ (x))] is a polynomial whose roots are all simple, and that these are precisely the roots of p(x). 2. Show that if K is a field with characteristic 6= 2, then any quadratic polynomial x2 + ax + b ∈ K[x] is separable. 3. Consider the field Zp (x) of fractions q(x)/r(x), where q(x) and r(x) 6= 0 are polynomials over Zp . (a) Show that Zp (x) has characteristic p. (b) Show that the element x ∈ Zp (x) is not a power of degree p, i.e., there is no b(x) ∈ Zp (x) such that x = b(x)p . (c) Given an example of a polynomial over Zp (x) which is not separable. 4. Let K be a field of characteristic p, and q(x) ∈ K[x] an irreducible polynomial. Show that all the roots of q(x) have the same multiplicity pn , for some integer n. 5. Let L be an extension of K, with [L : K] < ∞. Show that [L : K]s | [L : K]. (Note: One calls [L : K]i = [L : K]/[L : K]s the inseparable degree of L over K.) 6. Let M ⊃ L ⊃ K be succesive extensions. Show that: (i) If M is separable over K, then M is separable over L and L is separable over K; (ii) If M is separable over L and L is separable over K, then M is separable over K. 7.6 The Galois Group As we have already mentioned, Galois’s great insight was to replace questions about field extensions by questions in group theory. We are now ready to introduce the relevant groups. If L is an extension of K, obviously the K-automorphisms of L form a group under composition: if φ1 and φ2 are K-automorphisms of L, then φ1 ◦ φ2 is also a K-automorphism. Definition 7.6.1. The group of K-automorphisms of an extension L of K is called the Galois group of L over K and is denoted AutK (L). 7.6. THE GALOIS GROUP 359 The next examples show that this group can be of a very different nature, depending on the extension. Examples 7.6.2. √ √ 1. Let L = Q( 2). The element 2 has minimal polynomial x2 − 2. Any Q-automorphism φ : L → L transforms roots of this polynomial into roots. Hence, we have two automorphisms, namely the identity and √ √ φ(a + b 2) = a − b 2. √ We conclude that the Galois group AutQ (Q( 2) is isomorphic to Z2 . √ √ 2. Let L = Q( 2, 3). As in the previous example we see that √ the √ Q-automorphisms of L are completely determined by its action on the set { 2, 3}. There are 4 possibilities: the identity and √ √ √ √ φ1 ( 2) = − 2, φ1 ( 3) = 3; √ √ φ2 ( 2) = 2, √ √ φ2 ( 3) = − 3; √ √ √ √ φ3 ( 2) = − 2, φ3 ( 3) = − 3. In this case the Galois group is isomorphic to Z2 ⊕ Z2 . 3. Let K be a field of characteristic p such that K 6= K p . If a 6∈ K p , the polynomial q(x) = xp − a is irreducible. Let L be a splitting extension of q(x). In L we have q(x) = (x−r)p , so L = K(r). If φ : L → L is a K-automorphism, then φ(r) = r and we conclude that φ=id. The Galois group AutK (L) is trivial. 4. Let L = K(x) be the field of fractions p(x)/q(x) with p(x), q(x) ∈ K[x] and q(x) 6= 0. It is easy to see that the primitive elements of L are all of the form t= ax + b , cx + d a, b, c, d ∈ K, ad − bc 6= 0. Any K-automorphism of L transforms primitive elements into primitive elements. Hence, a K-automorphism φ : L → L, transforms the element p(x)/q(x) ∈ L in an element p(t)/q(t) ∈ L. We conclude that the Galois group is isomorphic to P GL2 (K) := GL2 (K)/Z, where Z is the center of GL2 (K) consisting of 2 × 2 matrices of the form a 0 Z={ : a 6= 0}. 0 a If K is infinite, this group is infinite. Since we know now that all the splitting fields of a polynomial p(x) are isomorphic, we can set: 360 CHAPTER 7. GALOIS THEORY Definition 7.6.3. The Galois group of p(x) ∈ K[x] (or of p(x) = 0) is the Galois group of any splitting field of p(x) over K. It is natural to identify the Galois group of an equation p(x) = 0 with the subgroup of group of permutations of the roots of p(x)7 . If L is a splitting field of p(x), and S = {r1 , . . . , rn } is the set of distinct roots of p(x), then L = K(r1 , . . . , rn ). If φ ∈ AutK (L) is an element of the Galois group of p(x), then φ transforms roots of p(x) into roots, so it induces a permutation of S. On the other hand, if we know the effect of φ on the roots of p(x), then we know how φ transforms any element of L = K(r1 , . . . , rn ). Hence, the map φ 7→ φ|S is a monomorphism AutK (L) → Sn . This shows that we can identify AutK (L) with a subgroup of the group of permutations the roots. In general, as it was already clear in the examples above, AutK (L) ( Sn , PSfrag replaements even when p(x) is irreducible. Example 7.6.4. Let L ⊂ C be a splitting extension of polynomial p(x) = x6 − 2 ∈ Q[x]. r2 r1 1 +pr r hmi e e~ rZ r3 r4 h1 i = h2i6 h4 i h12i h6 i h3 i hmd(m; n)i hm i hni hmm(m; n)i r5 Figure 7.6.1: Roots of x6 − 2 = 0. This polynomial is irreducible with roots rk = that, for example r3 + r6 = 0. √ 2kπi 6 2e 6 , k = 1, . . . , 6. Notice 7 It was precisely in this way that Galois thought of this group, much before the abstract notion of group was discovered! 7.6. THE GALOIS GROUP 361 Since r3 + r1 6= 0, there is no automorphism in the Galois group corresponding to the transposition (16). From the figure below, note also that (r1 + r5 )6 = r6 6 = 2, hence there are no automorphisms of the Galois group corresponding to the permutations (13)(56) and (16)(35). One can exclude many other elements of S6 in this way. In facto, we will see shortly that |G| = 12. The computation of the Galois group of an equation p(x) = 0 or of an extension maybe a very hard task. However, we have already developed the tools to find its order. Theorem 7.6.5. Let L be an extension of finite dimension over K, and G = AutK (L) its Galois group. Then |G| ≤ [L : K]. If L a normal, separable, extension of K then |G| = [L : K]. Proof. We can assume that L ⊂ K̄ a . If φ ∈ G, then we obtain a Khomomorphism φ : L → K̄ a . The number of such homomorphisms is [L : K]s ≤ [L : K]. Hence |G| ≤ [L : K]. If L is normal, then every K-homomorphism ψ : L → K̄ a is in fact an automorphism of L, by Theorem 7.3.9. If L is separable, then [L : K]s = [L : K], by Proposition 7.5.9. Hence, if L is normal and separable over K, then |G| = [L : K]. For a separable polynomial we conclude that: Corollary 7.6.6. If p(x) is a separable polynomial over K with Galois group G, and L is a splitting extension of p(x), then |G| = [L : K]. Examples 7.6.7. √ 4 2) is a splitting extension of the polynomial√x4 − 2 ∈ 1. The extension Q(i, √ 4 Q[x]. Since [Q(i, 2) : Q(i)] = 4√and [Q(i) : Q] = 2, we have [Q(i, 4 2) : Q] = 8, and √ the Galois group of Q(i, 4 2) has order 8. We have Q-automorphisms of Q(i, 4 2) defined by (check this!) σ(i) = −i, √ √ 4 4 σ( 2) = 2, τ (i) = i, √ √ 4 4 τ ( 2) = i 2. These automorphisms have orders 2 and 4, respectively, and satisfy the relation τ σ = στ 3 . It is then easy to see that √ 4 AutQ Q(i, 2) = {1, τ, τ 2 , τ 3 , σ, στ, στ 2 , στ 3 } ≡ G. 362 CHAPTER 7. GALOIS THEORY This group is isomorphic to the dihedral group D4 . In terms of the roots rk = πk e 2 i , these automorphisms correspond to the permutations σ = (13), τ = (1234). 2. The Galois √ group G of the polynomial p(x) = x6 − 2 over Q has order order √ 2π 2π 6 6 i 12, since [Q( 2, e 3 ) : Q] = 12 and Q( 2, e 3 i ) is a splitting extension of p(x). We leave it as an exercise to check that G ≃ D6 and to determine its representation as a permutation group of the roots. The previous examples suggest that one should be able to compute the Galois group of a polynomial xn − a over a field K of characteristic 0. This depends on the ability to compute the Galois group of xn − 1. Definition 7.6.8. Let K be a field. A splitting extension of the polynomial xn − 1 is called the cyclotomic field of order n over K. The Galois group of a cyclotomic field is much simple in the case of characteristic zero. Proposition 7.6.9. If K has characteristic 0, the Galois group of the cyclotomic field of order n is isomorphic to a subgroup of Z∗n . Proof. Let L be a splitting extension of xn − 1 over K. Since K has characteristic zero, and (xn − 1)′ = nxn−1 6= 0, the roots of xn − 1 are all simple roots. We conclude that the set of roots U = {r ∈ L : r n − 1 = 0} is isomorphic to Zn (see Exercise 6.6.2.2). On the other hand, if φ ∈ AutK (L) ≡ G, then φ|U is an automorphism of U and this restriction determines completely φ. Hence G is isomorphic to a subgroup of Aut(U ) ≃ Aut(Zn ) ≃ Z∗n , the group of units of the ring Zn . The previous result shows that in characteristic 0 the Galois group of a cyclotomic field is abelian. In general, there is nothing else we can say about the Galois group: for example, if K contains the roots of xn − 1 = 0, then obviously the cyclotomic field of order n coincides with K, so its Galois group is trivial. However, when K = Q, the proof above shows that: Corollary 7.6.10. The Galois group of the cyclotomic field of order n over Q is Z∗n . In particular, its order is given by the Euler function: | AutQ (Q(e 2iπ n ))| = ϕ(n) = n k Y i=1 1 1− pi where p1 , . . . , pn are the distinct prime factors of n. , 7.6. THE GALOIS GROUP 363 2iπ Proof. Let ζ = e n , so U = {ζ, ζ 2 , . . . , ζ n } ≃ Zn is set of n-th roots of the unit. In this case the map AutQ (ζ) → Aut(Zn ), φ 7→ φ|U is an isomorphism, so the result follows. Once one has information about the cyclotomic fields, one can find the Galois group of an equation of the form xn − a = 0. For example, one has: Proposition 7.6.11. If K has characteristic 0 and contains the roots of xn − 1 = 0, then the Galois group of xn − a over K is cyclic of order a divisor of n. Proof. Let L be a splitting extension of xn − a over K, and denote again by U = {z ∈ L : z n − 1 = 0} the set of roots of xn − 1. If r ∈ L is a root of xn − a, then {zr : z ∈ U } is the set of n roots of xn − a in L. Hence, we have L = K(r). If φ1 , φ2 ∈ AutK (L) ≡ G, then φ1 (r) = z1 r, φ2 (r) = z2 r, for some z1 , z2 ∈ U , and φ1 ◦ φ2 (r) = z1 z2 r. Hence, the map φ 7→ z is a monomorphism form G into the cyclic group U ≃ Zn and we conclude that G is isomorphic to a subgroup of Zn . Exercises. √ √ √ 1. Determine the Galois group of the extension Q( 2, 3, 5) over Q. 2. Let L = K(x) be the field of fractions of K[x]. Show that the primitive elements of L take the form ax + b , a, b, c, d ∈ K, ad − bc 6= 0 t= cx + d (Hint: If t = p(x)/q(x), where gcd(p(x), q(x) = 1, define the degree of t to be the maximum of the degrees of p(x) and q(x). Show that p(w) − yq(w) is irreducible in K[w, y], hence also in K(y)[w], and that x is algebraic over K(t) with minimal polynomial a multiple of p(w) − tq(w). Conclude that [K(x) : K(t)] = 1 (or that K(x) = K(t)) if and only if the degree of t is 1.) 3. Determine the Galois group of x3 − 2 over Q and over Z2 . 4. Find the Galois group of the polynomial xn − 2 ∈ Q[x] and represent it as a group of permutations of the roots, for n = 4, 5, 6, 7. 5. Determine the Galois group of the polynomial p(x) = 2x3 + 3x2 + 6x + 6 ∈ Q[x]. 6. Show that the Galois group of p(x) = x3 + x2 − 2x − 1 ∈ Q[x] is isomorphic to the alternating group A3 . (Hint: If r is a root of p(x), then r2 − 2 is also a root.) 364 7.7 CHAPTER 7. GALOIS THEORY The Galois Correspondence Finally we are now able to explain how Galois Theory allows one to replace a problem concerning (extensions of fields of) polynomials by a simpler problem in Group Theory. The correspondence, which goes back to Galois, allows one to associate to an intermediate extension a subgroup of the Galois group, as we now explain. Let L ⊃ K be an extension of K and H ⊂ AutK (L) a subgroup of the Galois group. We can think of H as a group of transformations of L, in which case the fixed point set of H is an intermediate field L ⊃ Fix(H) ⊃ K. Conversely, given an intermediate field L ⊃ K̃ ⊃ K then AutK̃ (L) is a subgroup of the Galois group AutK (L). The following properties are immediate for the definitions of these correspondences: Proposition 7.7.1 (Properties of the Galois Correspondence). Let L be an extension of K with Galois group G = AutK (L). Let K̃, K̃1 and K̃2 intermediate extensions, and H, H1 , H2 ⊂ G subgroups. (i) If H1 ⊃ H2 , then Fix(H1 ) ⊂ Fix(H2 ). (ii) If K̃1 ⊃ K̃2 , then AutK̃1 (L) ⊂ AutK̃2 (L). (iii) Fix(AutK̃ (L)) ⊃ K̃. (iv) AutFix(H) (L) ⊃ H. Example 7.7.2. √ √ Consider the extension L = Q( 2, 3) of K = Q. As we saw before, the Galois group of this extension contain 4 elements: AutK (L) = {id, φ1 , φ2 , φ3 } ≡ G. Besides the trivial subgroup H0 = {id}, this group has the subgroups H1 = {id, φ1 }, H2 = {id, φ2 } and H3 = {id, φ3 }. This lattice8 of subgroups can be represented by the diagram: G ❏❏ ❏❏ ttt ❏❏❏ t t t t H1 ■ H2 H3 ■■ ✉ ■■ ✉✉ ✉ ■ ✉✉ H0 8 A lattice is a partially ordered set in which every subset of two elements has a supremum and a infimum. The set of subgroups of a group G, ordered by inclusion, is a lattice. The set of intermediate extensions K ⊂ K̃ ⊂ L of a fixed extension L of K, ordered by inclusion, is a also a lattice. 7.7. THE GALOIS CORRESPONDENCE 365 The field fixed by the Galois group √ G√= AutQ (L) is the ground field: Fix(G) = Q, while obviuosly Fix(H0 ) = Q( 2, 3). On the other hand, one checks easily that √ √ √ Fix(H2 ) = Q( 2), Fix(H3 ) = Q( 6). Fix(H1 ) = Q( 3), Hence, the lattice of intermediate extensions is given by the diagram: ♥♥ Q PPPPP PPP ♥♥♥ ♥ ♥ PP ♥ ♥ ♥ √ √ √ Q( 2) Q( 6) Q( 3) PPP ♥ PPP ♥♥♥ ♥ P ♥ √ √ ♥ Q( 2, 3) As in the previous example, extensions with a rich group of automorphisms so that the correspondences above between intermediate extensions and subgroups are inverse to each other (so in properties (iii) and (iv), inclusions can be replaced by equalities) are specially important and deserve a special name: Definition 7.7.3. A finite dimensional extension L ⊃ K is called a Galois extension of K if Fix(AutK (L)) = K. We have the following alternative characterizations of a Galois extension. Proposition 7.7.4. The following statements are equivalent: (i) L is a Galois extension of K; (ii) L is a splitting extension of a separable polynomial over K; (iii) L is a finite dimensional extension, normal and separable over K. If any of these equivalent conditions hold, then: (7.7.1) | AutK (L)| = [L : K]. Proof. Note that (7.7.1) follows from Theorem 7.6.5. So it remains to prove the implications (i) ⇒ (ii) ⇒ (iii) ⇒ (i). (i) ⇒ (ii): Since [L : K] < ∞, there exist r1 , . . . , rn ∈ L, algebraic over K, such that L = K(r1 , . . . , rn ). Let pi (x) be the minimal polynomial of each ri in K[x], and Oi = {φ(ri ) : φ ∈ AutK (L)} the orbit of ri under the 366 CHAPTER 7. GALOIS THEORY action of the Galois group. This set is finite and consists of roots of pi (x). The polynomial Y qi (x) ≡ (x − r) ∈ L[x] r∈Oi divides pi (x) and is separable, since all its roots are distinct. If φ ∈ AutK (L), then Y Y (x − φ(r)) = (x − r) = qi (x), qiφ (x) = r∈Oi r∈Oi so the coefficients of qi (x) belong to Fix(AutK (L)) = K. We conclude that qi (x) ∈ K[x], so we must have pi (x) = qi (x). It follows Q that pi (x) is separable and all its roots are in L. The polynomial p(x) = i pi (x) is separable and L is a splitting extension of p(x) over K. (ii) ⇒ (iii): If L is a splitting extension of a polynomial p(x) over K, then, by Corollary 7.3.10, L is a normal extension of finite dimension over K. If p(x) is a separable polynomial, then L = K(r1 , . . . , rm ), where the r1 , . . . , rm are separable. By Proposition 7.5.9, L is a separable extension. (iii) ⇒ (i): Let K̃ = Fix(AutK (L)). From the general properties of the Galois correspondence, K̃ is an intermediate field L ⊃ K̃ ⊃ K. Let r1 ∈ K̃ − K and denote by p(x) the minimal polynomial of r1 over K. Since L is normal and separable, Qm p(x) splits in L[x] into a product of distinct linear factors: p(x) = i=1 (x − ri ), with m > 1 and the ri all distinct. If a φ : K(r1 ) → K is the K-monomorphism mapping r1 to r2 , by Theorem a 7.4.6, we can extend φ to a monomorphism Φ : L → K . By Theorem 7.3.9, Φ is actually a K-automorphism of L. It follows that Φ is an element of AutK (L) which does not fix r1 . This contradicts r1 ∈ K̃ = Fix(AutK (L)). Hence, we must have K = K̃ = Fix(AutK (L)). If L fails to be a Galois extension of K, then | AutK (L)| ≤ [L : K]. On the other hand, we have the following general result: Lemma 7.7.5 (Artin9 ). Let G be a finite group of automorphisms of a field L and K = Fix(G). Then (7.7.2) [L : K] ≤ |G|. Proof. Let G = {φ1 = id, φ2 , . . . , φm }. We need to show that any n elements of L with n > m are linearly dependent over K. 9 Emil Artin (1898-1962) was among the greatest algebraists of the do 20th Century. The modern approach to Galois Theory was formulated by him, together with Irving Kaplanski (1917-2006). Artin and Kaplanski where both members of the Bourbaki group. 7.7. THE GALOIS CORRESPONDENCE 367 Let u1 , . . . , un ∈ L and consider the homogeneous linear system: n X ai φj (ui ) = 0, (j = 1, . . . , m). i=1 Since there are more variables than equations, elementary Linear Algebra shows that it must have a non-trivial solution (a1 , . . . , an ) ∈ Ln . After possibly some reordering, we can choose the solution to be of the form (a1 , . . . , as , 0, . . . , 0), where ai 6= 0 for i = 1, . . . , s and s ≥ 2 is minimal (i.e., there is no non-trivial solution with a smaller s). Dividing by a1 , we can additionally assume that a1 = 1. We claim that ai ∈ K = Fix(G), for all i. In fact, if some ai 6∈ K = Fix(G), there exists φ ∈ G such that φ(ai ) 6= ai . Then the vector (1, φ(a2 ), . . . , φ(as ), 0, . . . , 0) is also a solution of the system and the difference vector (0, a2 − φ(a2 ), . . . , as − φ(as ), 0, . . . , 0) is a nontrivial solution with more zeros than (1, a2 , . . . , as , 0, . . . , 0), contradicting the assumption that s was minimal. We conclude that ai ∈ K, for all i. When j = 1 the system gives the linear dependence relation n X ai ui = 0. i=1 Finally we are able to state and prove the key result of Galois Theory. Theorem 7.7.6 (Fundamental Theorem of Galois Theory). Let L be a Galois extension of K. There exists a bijective correspondence between the intermediate extensions K ⊂ K̂ ⊂ L and the subgroups H of the Galois group G ≡ AutK (L), given by: K̂ 7→ AutK̂ (L) ≡ H, H 7→ Fix(H) ≡ K̂. If one writes H ↔ K̂, this correspondence satisfies: (i) If H1 ↔ K̂1 and H2 ↔ K̂2 , then H2 ⊂ H1 if and only if K̂2 ⊃ K̂1 . In this case we have [H1 : H2 ] = [K̂2 : K̂1 ]. (ii) If H ↔ K̂, then H is a normal subgroup of G if and only if K̂ is a normal extension of K. In this case, K̂ is a Galois extension with Galois group G/H. 368 CHAPTER 7. GALOIS THEORY Proof. First we check that the two assignments are inverse to each other: Fix(AutK̂ (L)) = K̂: Let L ⊃ K̂ ⊃ K be an extension. By Proposition 7.7.4, L is a finite-dimensional extension, normal and separable over K, hence also over K̂. But then L is a Galois extension of K̂, and we conclude that K̂ = Fix(H) with H = AutK̂ (L). AutFix(H) (L) = H: Let H ⊂ G ≡ AutK (L) and K̂ = Fix(H). By Artin’s Lemma, |H| ≥ [L : Fix(H)] = [L : K̂]. On the other hand, L is a Galois extension over K̂ and H ⊂ AutK̂ (L). Then by relation (7.7.1) we see that |H| ≤ | AutK̂ (L)| = [L : K̂]. Hence, |H| = | AutK̂ (L)| and we conclude that H = AutK̂ (L), with K̂ = Fix(H). In order to check that (i) holds, notice that from Proposition 7.7.1 and the fact that the correspondence is bijective, we have H2 ⊂ H1 if and only if K̂2 ⊃ K̂1 . Since L is a Galois extension over both K̂1 and K̂2 , we find: [K̂2 : K̂1 ] = [L : K̂1 ] [L : K̂2 ] = | AutK̂1 (L)| | AutK̂2 (L)| = |H1 | = [H1 : H2 ]. |H2 | In order to check that (ii) holds, assume that H ↔ K̂. If φ ∈ G, then the extension corresponding to φHφ−1 is φ(K̂). Hence, we have: (a) If H is a normal subgroup of G, then for all φ ∈ G we have φ(K̂) ⊂ K̂. The map φ 7→ φ|K̂ is a surjective homomorphism of G onto AutK (K̂) with kernel H. Hence, AutK (K̂) ≃ G/H and Fix(AutK (K̂)) = Fix(G/H) = Fix(G) = K. We conclude that K̂ is a Galois extension with Galois group G/H. (b) Conversely, assume that K̂ is a normal extension. By Theorem 7.3.9, φ(K̂) = K̂, for every φ in the Galois group G. Since φHφ−1 ↔ φ(K̂), we conclude that φHφ−1 = H. Hence, H is a normal subgroup of G. Example 7.7.7. √ We saw before that Q(i, 4 2) is a Galois extension over Q, with Galois group √ 4 AutQ Q(i, 2) = {1, τ, τ 2 , τ 3 , σ, στ, στ 2 , στ 3 } ≡ G, √ where σ and τ are the Q-automorphisms of Q(i, 4 2) definided by σ(i) √ = −i,√ σ( 4 2) = 4 2, τ (i) √ = i, √ τ ( 4 2) = i 4 2. 7.7. THE GALOIS CORRESPONDENCE 369 This group has the following lattice of subgroups: ❥ G ❚❚❚❚❚ ❚❚❚❚ ❥❥❥❥ ❥❥❥❥ 2 3 {1, τ , στ, στ } hτ i {1, σ, τ 2 , στ 2 } ❙❙❙❙ ❙ ❙ ❦ ❦ ❙ ❦ ❙❙❙❙ ❙❙❙❙ ❦❦❦ ❦❦❦ ❙❙ ❦❦❦ ❦❦❦❦ ❦ 2 3 2 hστ i ❱❱❱ hστ i hτ i hσi ✐✐ hστ i ❱❱❱❱ ▲▲▲ ✐ s ✐ s ✐ ✐ ❱❱❱❱ ss ✐✐✐✐ ❱❱❱❱ ▲▲▲▲ sss ✐✐✐✐✐✐✐ ❱❱❱❱ ▲▲▲ s s ❱❱❱❱ ▲▲ ss ✐✐✐✐ ❱❱❱ s ✐✐✐✐ {1} The Galois correspondence yields a similar lattice of intermediate extensions of Q: ✐✐✐ Q ◗◗◗◗◗ ✐✐✐✐ ◗◗◗ ✐ ✐ ✐ ✐✐ 2 Q(ir ) ❚❚ Q(i) Q(r2 ) ❖❖❖ ❚❚❚❚ ❤❤❤ ♥ ❤ ♥ ❤ ♥ ❤ ❖❖❖ ❚❚❚❚ ❤❤❤❤ ♥♥♥ 2 2 2 Q(r) Q(i, r ) Q(ir , (1 + i)r) Q(ir , (1 − i)r) ❦ Q(ir) ◆◆◆ ❲❲❲❲❲ ❦❦❦ ✈✈ ❲❲❲❲❲ ❦ ◆◆◆ ❦ ✈ ❦ ❲❲❲❲❲ ❦❦ ✈✈ ❲❲❲❲❲ ◆◆◆◆◆ ✈✈ ❦❦❦ ❲❲❲❲❲ ◆ ✈❦✈❦❦❦❦❦ ✈ ❲ Q(i, r) √ where r = 4 2. The extensions in the first row are normal extensions, being extensions of degree 2. They correspond to subgroups of index √ 2, therefore normal subgroups. Int the second row, only the extension Q(i, 2) is normal: it corresponds to the center C(G) = {1, τ 2 }. Exercises. √ √ √ 1. Determine the Galois correspondence for the extension Q( 2, 3, 5) ⊂ R over Q. 2. Determine the Galois correspondence for the splitting field of the polynomial x3 − 2 over Q and over Z2 . 3. Determine the Galois correspondence for the splitting field of the polynomial (x3 − 2)(x2 − 3) ∈ Q[x]. 4. Determine the Galois correspondence for the splitting field of the polynomial x4 − 4x2 − 1 ∈ Q[x]. 370 CHAPTER 7. GALOIS THEORY 5. Let p(x) ∈ Q[x] be a polynomial of degree 3 with Galois group G. If r1 , r2 , r3 ∈ C denote the roots of p(x) and δ ≡ (r1 − r2 )(r1 − r3 )(r2 − r3 ), show that: (a) |G| = 1 if and only if the roots of p(x) belong to Q; (b) |G| = 2 if and only if p(x) has exactly one rational root; (c) |G| = 3 if and only if p(x) has no rational roots and δ ∈ Q; (d) |G| = 6 if and only if p(x) has no rational roots and δ 6∈ Q. 6. If p(x) ∈ K[x] is a polynomial of degree n with roots r1 , . . . , rn , define the discriminant of p(x) to be ∆ = δ 2 , where δ= Y (ri − rj ). i<j Assuming that p(x) is separable and K has characteristic different from 2, show that (a) ∆ ∈ K; (b) ∆ = 0 if and only if p(x) has a multiple root; (c) ∆ is a perfect square in K if and only if the Galois group of p(x) is contained in An . 7. If p(x) = x3 + px + q ∈ Q[x], show that the discriminant (see the previous exercise) is ∆ = −4p3 − 27q 2 . What is the Galois group of x3 + 6x2 − 9x + 3 ? 8. Let L be a Galois extension of K, and L ⊃ K̂ ⊃ K an intermediate extension. Let H ⊂ AutK (L) be the subgroup formed by the K-automorphisms which map K̂ into itself. Show that H is the normalizer of AutK̂ (L) in AutK (L). 7.8 Applications We will now see several applications of Galois Theory. Our first application is to characterize all symmetric polynomials. Our second application is to find a criterion to decide if a complex number is constructible or not, completing our discussion in Section 7.2. Our third and final application is to the original question that motivated Galois: we will prove Galois criterion to decide if an algebraic equation is solvable by radicals. 7.8. APPLICATIONS 7.8.1 371 Symmetric Polynomials Let x1 , . . . , xn be indeterminates. In the field of fractions K(x1 , . . . , xn ) consider the polynomial n Y (x − xi ) = xn − s1 xn−1 + s2 xn−2 − · · · + (−1)n sn . p(x) = i=1 The coefficients si of this polynomial are certain polynomials in the indeterminates xi , called elementary symmetric polynomials,. They can be explicit written as: s1 = X xi , i s2 = X xi xj , i<j .. . s n = x1 · · · xn . We write their general formula in the form: si = X j1 <···<ji xj 1 · · · xj i , i = 1, . . . , n. The use of the term “symmetric” is justified by the fact that the polynomials remain the same under any permutation of the indices j1 , . . . , jn . More generally, we can consider symmetric rational expressions in the variables xi , which can be formalized as follows. Every permutation π ∈ Sn determines a K-automorphism φπ of K(x1 , . . . , xn ) which transforms xi in xπ(i) . A symmetric rational expression is any element of K(x1 , . . . , xn ) which is fixed by the group G ≡ {φπ : π ∈ Sn } ≃ Sn . Notice that G ⊂ AutK (K(x1 , . . . , xn )) and, under the Galois correspondence, the symmetric rational expressions are precisely the elements of the intermediate extension K ⊂ Fix(G) ⊂ K(x1 , . . . , xn ). Examples 7.8.1. Q 1. The polynomial p(x) = ni=1 (x − xi ) is clearly invariant under the action of the φπ ( i.e., p(x) = pφπ (x)), hence its coefficients si are fixed by φπ and we have si ∈ Fix(G). In other words, the si are symmetric rational expressions. 372 CHAPTER 7. GALOIS THEORY 2. The fractions x1 x2 x2 x3 x3 x1 + + , x3 x1 x2 1 1 1 + 2 + 2, x1 2 x2 x3 are symmetric rational expressions. We can use the Galois correspondence to show that any symmetric rational expression can be expressed in terms of the elementary symmetric polynomials s1 , . . . , sn . More precisely, we have: Theorem 7.8.2. K(x1 , . . . , xn ) is a Galois extension of K(s1 , . . . , sn ) with Galois group G = {φπ : π ∈ Sn } ≃ Sn . In particular, Fix(G) = K(s1 , . . . , sn ). Proof. As we observed above, we have K(s1 , . . . , sn ) ⊂ Fix(G) hence, G ⊂ AutK(s1,...,sn ) K(x1 , . . . , xn ). Note that K(x Q 1 , . . . , xn ) = K(s1 , . . . , sn , x1 , . . . , xn ) is a splitting extension of p(x) = ni=1 (x − xi ) ∈ K(s1 , . . . , sn )[x]. On the other hand, if φ ∈ AutK(s1 ,...,sn) K(x1 , . . . , xn ), then φ permutes the roots of p(x), hence AutK(s1 ,...,sn ) K(x1 , . . . , xn ) ⊂ G. We conclude that the Galois group of K(x1 , . . . , xn ) over K(s1 , . . . , sn ) is G ≃ Sn . Since K(x1 , . . . , xn ) is a normal and separable extension, it is a Galois extension and Fix(G) = K(s1 , . . . , sn ). Example 7.8.3. The polynomial x1 3 + x2 3 + x3 3 ∈ Q(x1 , x2 , x3 ) is a rational symmetric expression, hence it can be expressed as a rational fraction in s1 = x1 + x2 + x3 , s2 = x1 x2 + x1 x3 + x2 x3 and s3 = x1 x2 x3 . Elementary degree considerations show that there exist rational numbers a1 , a2 and a3 , such that x1 3 + x2 3 + x3 3 = = a1 (x1 + x2 + x3 )3 + a2 (x1 + x2 + x3 )(x1 x2 + x1 x3 + x2 x3 ) + a3 x1 x2 x3 . By choosing convenient values of x1 , x2 and x3 , we can find the coefficients: 3 = 27a1 + 9a2 + a3 2 = 8a1 + 2a2 (if we set x1 = x2 = x3 = 1), (if we set x1 = x2 = 1, x3 = 0), 1 = a1 (if we set x1 = 1, x2 = x3 = 0). This linear system has the solution a1 = 1, a2 = −3, a3 = 3. Hence, we conclude that: x1 3 + x2 3 + x3 3 = (x1 + x2 + x3 )3 − 3(x1 + x2 + x3 )(x1 x2 + x1 x3 + x2 x3 ) + 3x1 x2 x3 . 7.8. APPLICATIONS 7.8.2 373 Constructible Numbers We have study in Section 7.2 the subfield C ⊂ C formed by the numbers that can be constructed using compass and straightedge. Using Galois Theory we obtain the following characterization of the constructible numbers: Theorem 7.8.4. A complex number z ∈ C is constructible if and only if n z is algebraic over Q, and the normal closure Q(z) has dimension 2s , for some s ∈ N. Proof. Recall that a complex number z ∈ C is constructible if and only if z belongs to a subfield Q(u1 , . . . , ur ), with u1 2 ∈ Q and um+1 2 ∈ Q(u1 , . . . , um ), for m = 1, . . . , r − 1. It follows that if z is a constructible, then ] n n Q(z) ⊂ Q(u1 , . . . , ur ) . n Letting G = AutQ (Q(u1 , . . . , ur ) ), Theorem 7.3.9 shows that the field n Q(u1 , . . . , ur ) is generated by the images φ(Q(u1 , . . . , ur )), with φ ∈ G. Hence, if G = {φ1 , . . . , φn } we find: n Q(u1 , . . . , ur ) = Q(φ1 (u1 ), . . . , φ1 (ur ), . . . , φn (u1 ), . . . , φn (ur )). n Since φj (ur )2 = φj (u2r ), we conclude that Q(u1 , . . . , ur ) is an extension of the form Q(ũ1 , . . . , ũl ) with ũ21 ∈ Q and ũ2m+1 ∈ Q(ũ1 , . . . , ũm ), for m = 1, . . . , l − 1. We can now compute the algebraic degrees as follows: n [Q(u1 , . . . , ur ) : Q] = l−1 Y [Q(ũ1 , . . . , ũm+1 ) : Q(ũ1 , . . . , ũm )] = 2t . m=0 It follows that n [Q(z) : Q] = n [Q(u1 , . . . , ur ) : Q] n n [Q(u1 , . . . , ur ) : Q(z) ] = 2s , for some s ∈ N. n n Conversely, assume that [Q(z) : Q] = 2s . Then Q(z) is a Galois extension of Q whose Galois group has order |G| = 2s . By the results of Chapter 5 on p-groups, we know that there exists a normal tower G = Hs ⊲ Hs−1 ⊲ · · · ⊲ H1 ⊲ H0 = {e}, where [Hm : Hm−1 ] = 2. Galois correspondence, then gives intermediate extensions n Q(z) = Ks ⊃ Ks−1 ⊃ · · · ⊃ K1 ⊃ Q, 374 CHAPTER 7. GALOIS THEORY where [Km : Km−1 ] = 2. Hence, for each 1 ≤ m ≤ s, there exist um ∈ Km n such that Km+1 = Km (um+1 ) and um+1 2 ∈ Km . We conclude that Q(z) = Q(u1 , . . . , us ), with um+1 2 ∈ Q(u1 , . . . , um ) for 0 ≤ m ≤ s − 1. Hence, z is constructible. Galois Theory gives more than a criterion to decide if a number is constructible: for constructible numbers it gives a method to construct the number! The is illustrated in the next example. Example 7.8.5. A regular polygon with 5 sides can be constructed with rule and compass: if the polygon is inscribed in the unit circle, then this is equivalent to the statement 2πi that na primitive root of x5 − 1 = 0 is constructible. In fact, if r = e 5 , then 2πi 2 Q(r) = Q(e 5 ), and this extension has algebraic degree 4 = 2 . In order to give an explicit description of a regular pentagon, we start by observing that (x5 − 1) = (x − 1)(x4 + x3 + x2 + x + 1), hence r is a root of an irreducible polynomial of 4th degree. The extension L = Q(r) is a splitting extension of this polynomial, hence is a Galois extension. The Galois group has order |G| = 22 . The Q-automorphism defined by φ(r) = r2 is an element 2πik of the Galois group. If we label the roots as zk = e 5 (k = 1, . . . , 4), then φ(z1 ) = z2 , φ(z2 ) = z4 , φ(z4 ) = z3 , φ(z3 ) = z1 , so we see that φ corresponds to the permutation (1243). this element has order 4, so the Galois group is G = {I, φ, φ2 , φ3 }. In terms of permutations of the roots: G = {I, (1243), (14)(23), (3241)}. This group admits the tower of p-subgroups: G ⊃ H ⊃ {I}, where H = {I, (14)(23)}. Since [G : H] = 2, to H corresponds an extension K̂ of degree 2 over Q. To determine this extension, we observe that an element u ∈ L is of the form u = a1 z1 + a2 z2 + a3 z3 + a4 z4 , where φ2 (u) = a1 z4 + a2 z3 + a3 z2 + a4 z1 , and u is fixed by (14)(23) if and only if a1 = a4 and a2 = a3 . Hence, K̂ = Q(ω1 , ω2 ), where ω1 = z1 + z4 and ω2 = z2 + z3 . It is easy to check that ω1 + ω2 = z1 + z4 + z2 + z3 = r + r2 + r3 + r4 = −1 ω1 ω2 = (r + r4 )(r2 + r3 ) = r + r2 + r3 + r4 = −1, so we have: (x − ω1 )(x − ω2 ) = x2 + x − 1. 7.8. APPLICATIONS 375 Solving this equation, we see that √ −1 + 5 ω1 = , 2 −1 − ω2 = 2 √ 5 (these values are 2 Re(z1 ) = 2 Re(z4 ) and 2 Re(z2 ) = 2 Re(z3 )). We can now explain the traditional way of constructing a pentagon. In a unitary circle, we mark the four points A = (1, 0), B = (0, 1), C = (−1, 0) and D = (0, −1). Splitting the segment √ OD into two equal parts, we obtain the compass in E we point E. The segment EA has length 5/2. Centering the √ obtain the arc AF . The segment OF has length ω1 = −1+2 5 , and we can mark the point G in the horizontal axis, so that OG = OF 2 . The point G corresponds to the x-coordinate of z1 (it is slightly simple to observe that AF has the same length as the side of a pentagon). B z2 z1 F A C O z3 G E D z4 Figure 7.8.1: Construction of a regular pentagon. The Greeks knew how to construct regular polygons with 3, 5 and 15 sides and, given a regular polygon with n sides, how to obtain one with 2n sides (obviously by bisecting the sides). Gauss, when he was only 19 years old, and before Galois Theory was discovered, found a way to construct a regular polygon with 17 sides! This discovery made him choose Mathematics over he study of Languages. In fact, he was so proud of this discovery, that later in his life he asked that a regular polygon with 17 sides be engraved in his tombstone. This never happened because the sculptor thought that a polygon with some many sides could be confused with a circle. Galois Theory shows that a regular polygon with n sides is constructible if and only if n = 2r p1 · · · ps where the pi are Fermat primes. Using Galois Theory, its is known how to construct regular polygons with 257 and 65 537 sides! 376 CHAPTER 7. GALOIS THEORY 7.8.3 Solving algebraic equations by radicals We will now comeback to the opening discussion in this chapter, concerning the criterion found by Galois which allows one to decide if an algebraic equation is solvable by radicals. In order to simplify the discussion, we will assumed the fields have characteristic 0. Definition 7.8.6. Let p(x) ∈ K[x] be a monic polynomial. We say that a equation p(x) = 0 is solvable by radicals if there exists an extension L ⊃ K which contains a splitting field of p(x) and which is of the form: L = Km+1 ⊃ · · · ⊃ K2 ⊃ K1 = K, where Ki+1 = Ki (di ) and di ni ∈ Ki . Notice the meaning of this definition: any root of p(x) belongs to L and can be expressed starting from elements of K by a sequence of rational operations and by finding n-th roots. Theorem 7.8.7 (Galois Criterion). Let p(x) ∈ K[x] be a monic polynomial. The equation p(x) = 0 is solvable by radicals if and only if its Galois group is solvable. Example 7.8.8. The equation x5 − 4x + 2 = 0 is not solvable by radicals (in Q). This can be seen as follows. By Eisenstein’s Criterion, the polynomial p(x) = x5 − 4x + 2 is irreducible over Q. It is easy to check that this polynomial has three real distinct roots r1 , r2 , and r3 , and two complex conjugate roots r4 and r5 . Let L = Q(r1 , . . . , r5 ) be the splitting field of p(x) and denote by G its Galois group. Since [Q(r) : Q] = 5, for any r ∈ {r1 , . . . , r5 }, we have that 5 | [L : Q] = |G|, the Sylow Theorems, show that the Galois group G ⊂ S5 contains an element of order 5, i.e., a cycle (i1 , . . . , i5 ). On the other, complex conjugation a+ib 7→ a − ib restricted to L = Q(r1 , . . . , r5 ) gives an element of G of order 2, i.e., a transposition. We leave it as an exercise to check that these two elements generate S5 . Hence, G = S5 , by Galois Criterion the equation is not solvable by radicals. Before we prove Galois criterion, we deduce the following corollary: Corollary 7.8.9 (Theorem of Abel-Rufini). There is no general algebraic solution by radicals for polynomial equations of degree five or higher. 7.8. APPLICATIONS 377 Proof. A more precise way of formulating this result is that the general equation xn − an−1 xn−1 + · · · + (−1)n a0 = 0, is not solvable by radicals, for n ≥ 5. By “general equation” we mean that a0 , . . . , an−1 are indeterminates. Hence, we consider the polynomial p(x) = xn − an−1 xn−1 + · · · + (−1)n a0 over the field K(a0 , . . . , an−1 ) and we need to show that the Galois group of p(x) over this field is not solvable. Let L be the splitting field of p(x) over K(a0 , . . . , an−1 ), so that in L we have the factorization: p(x) = (x − r1 )(x − r2 ) . . . (x − rn ). By comparison, we see that an−1 = X ri , i an−2 = X ri rj , i<j .. . a0 = r1 r2 . . . rn . Hence, L = K(ao , . . . , an−1 )(r1 , . . . , rn ) = K(r1 , . . . , rn ). Consider a set of indeterminates x1 , . . . , xn , and in the field K(x1 , . . . , xn ) consider the subfield of rational symmetric expressions (see Section 7.8.1): this subfield is of the form K(s1 , . . . , sn ), where s1 , . . . , sn are the elementary symmetricQpolynomials in the xi ’s and K(x1 , . . . , xn ) is a splitting extension of q(x) = i (x − xi ) over K(s1 , . . . , sn ). We also saw that the Galois group of this extension is Sn . If there is an isomorphism K(r1 , . . . , rn ) ≃ K(x1 , . . . , xn ) which to the subfield K(a0 , . . . , an−1 ) associates K(s1 , . . . , sn ), then the Galois group of the general equation of degree n is Sn , which is not solvable for n ≥ 5, so we are done. Let us establish the existence of this isomorphism. Consider the K-homomorphism φ : K[a0 , . . . , an−1 ] → K[s1 , . . . , sn ], associating ai 7→ sn−i : this homomorphism exists, because the a0 , . . . , an are indeterminates. Similarly, we have a homomorphism ψ : K[x1 , . . . , xn ] → K[r1 , . . . , rn ], and the following diagram commutes: K[a0 , . . . , an ] φ _ K[r1 , . . . , rn ] o ψ / K[s1 , . . . , sn ] _ K[x1 , . . . , xn ] 378 CHAPTER 7. GALOIS THEORY In fact, we have:  ψ(φ(ai )) = ψ(si ) = ψ  X j1 <···<ji  xj 1 · · · xj i  = X j1 <···<ji rj1 · · · rji = ai . The diagram shows that φ must be a monomorphism. Since φ is clearly surjective, it follows that φ is an isomorphism. Extending this isomorphism to the corresponding fields of fractions, we obtain an isomorphism of fields φ̃ : K(a0 , . . . , an−1 ) → K(s1 , . . . , sn ). This isomorphism associates to a polynomial p(x) ∈ K(a0 , . . . , an−1 )[x] the polynomial pφ̃ (x) = q(x) ∈ K(s1 , . . . , sn )[x]. As we saw in Section 7.4, φ̃ extends to an isomorphism of the corresponding splitting fields K(r1 , . . . , rn ) ≃ K(x1 , . . . , xn ), as claimed. The Abel-Rufini Theorem shows that there exists no general formula to solve a polynomial equation of of degree n, when n ≥ 5. Still there are equations which can be solved by radicals, e.g., x5 − 2 = 0. In fact, it could happen that a general formula did not exist, but still every equation could be solved by radicals. However, the example of x5 − 4x + 2 = 0 shows that is not the case. It remains to prove Galois Criterion. In this proof, the splitting fields of the equations xn − a = 0 play an essential role. We saw before that the Galois group of xn − a = 0 is cyclic if K contains all the roots of unit of order n and is abelian when a = 1. In general, an extension L of K whose Galois group is abelian (respectively, cyclic) is called an abelian extension (respectively, cyclic extension) of K. We need the following preliminary result: Proposition 7.8.10. Let K be a field which contains the p roots of xp −1 = 0 (p a prime). If L is a cyclic extension cyclic of K with [L : K] = p, then L = K(r), where r p ∈ K. Proof. If u ∈ L − K, then L = K(u), since L ⊇ K(u) ) K and [K(u) : K] | [L : K] = p. If AutK (L) = hφi and {z1 , . . . , zp } ⊂ K are the p-roots of xp − 1 = 0, consider the elements (7.8.1) ri = u + φ(u)zi + φ2 (u)zi 2 + · · · + φp−1 (u)zi p−1 . Since φ(ri ) = zi −1 ri , we have φ(ri p ) = ri p , and we conclude that ri p ∈ K. We can always write u as a linear combination of the ri ’s by solving the 7.8. APPLICATIONS 379 linear system (7.8.1) for the unknowns u, φ(u), . . . , φp−1 (u) (this is possible because the corresponding determinant is a Van der Monde determinant). Hence, L = K(r1 , . . . , rp ), and for some k0 we must have rk0 6∈ K. Setting r = rk0 , we have L = K(r), with r p ∈ K. Proof of the Galois Criterion. Let p(x) ∈ K denote by G = AutK (L) its Galois group. We will show both implications: (i) If p(x) = 0 is solvable by radicals, then G is solvable: If p(x) = 0 is solvable by radicals there exists an extension L of K, which contains a splitting extension of p(x), and which admits a tower of subfields L = Kl+1 ⊃ · · · ⊃ K2 ⊃ K1 = K, (7.8.2) n where Ki+1 = Ki (di ), with di ni = ai ∈ Ki . The normal closure L of n n L is generated by the φ(L), with φ ∈ AutK (L ). Hence, if AutK (L ) = {id, φ1 , . . . , φr }, we obtain n (7.8.3) L = K(d1 , . . . , dl , φ1 (d1 ), . . . , φ1 (dl ), . . . , φr (d1 ), . . . , φr (dl )). n Let m = lcm(n1 , . . . , nl ). We can extend the tower (7.8.3) to L (z), n where z is a primitive root of unit of order m. Since L the splitting field of n a polynomial p(x), it follows that L (z) the splitting field of p(x)(xm − 1), n and we conclude that L (z) is normal. Reordering terms, we obtain the new tower: n L (z) = Kl (z) ⊃ · · · ⊃ K̃3 = K2 (z) ⊃ K̃2 = K(z) ⊃ K̃1 = K. This new tower satisfies K̃i+1 = K̃i (d˜i ), with d˜ni i ∈ K̃i , for all i. n Let G be the Galois group of p(x) and H = AutK (L (z)). Then K̃i is an abelian extension of K̃i−1 . If the subgroup Hi ⊂ H corresponds to the intermediate extension K̃i , we have Hi−1 ⊲ Hi and Hi−1 /Hi is isomorphic to the Galois group of K̃i over K̃i−1 , i.e., is abelian. We conclude that H n admits an abelian tower, so it is a solvable group. Since L (z) contains a splitting field of p(x), G is a subgroup of H and it must be solvable. (ii) If G is solvable, then p(x) = 0 is solvable by radicals: Let L be a splitting field of p(x) = 0 and n = |G| = [L : K]. If K1 = K and K2 = K(z), where z is a primitive root of xn − 1 = 0, then M = L(z) has Galois group over K2 isomorphic to a subgroup H of G. Hence, H is solvable and has a composition series H = H1 ⊲ H2 ⊲ . . . ⊲= {e}, where each Hi /Hi+1 is cyclic of prime order. By the Galois correspondence, there exists a tower of subfields K2 ⊂ K3 ⊂ · · · ⊂ M where each Ki+1 is a normal extension over Ki with Galois group cyclic of order prime pi . Since pi | n and Ki contains a 380 CHAPTER 7. GALOIS THEORY primitive root of xn − 1 = 0, we see that Ki contains the pi -th roots of unit. By Proposition 7.8.10, we conclude that Ki+1 = Ki (di ), with di pi ∈ Ki . This shows that the equation p(x) = 0 is solvable by radicals. Exercises. 1. Show that x11 3 + x12 3 + x13 3 is a symmetric rational expression and determine its representation in terms of elementary symmetric polynomials. 2. Determine the integeres 1 ≤ n ≤ 10 for which a regular polygon of n sides admits a compass-and-straightedge construction. 3. Let G ⊂ Sp , with p prime, be a subgroup which contains a cycle of length p and a transposition. Show that one must have G = Sp . 4. Let G be any finite group. Show that there exists fields L and K such that L is an extension of K, with Galois group G. (Hint: By Cayley’s Theorem, one can assume that G ⊂ Sn for some n.) 5. Using Galois Criterion, show that the Galois group of the equation xn −a = 0 (over Q) is solvable. Chapter 8 Commutative Algebra (not yet available) 381 382CHAPTER 8. COMMUTATIVE ALGEBRA (NOT YET AVAILABLE) Appendix A Set Theory The notion of set is the most important of all mathematical notions, and the foundations of all mathematics rest upon it. The reader is for sure familiar with the informal notion of set and element of a set, as well as some elementary constructions with sets, such as (unions, intersections, complements, etc.). On the other hand, statements such as: • two sets are equal if and only if they have the same elements, • given two sets, there exists a set containing them, • given any set, there exists a set formed by all its subsets, are usually accepted as obvious. However, to fully justified them it is necessary a deeper look into the foundations of Set Theory, which is beyond the aim of this book. For example, the famous Russel paradox, concerning the existence of the set of all sets, can only be solved via an axiomtization of Set Theory. The interested reader can, e.g., consult the concise book by Paul Halmos escreveu a este respeito1 . In this appendix we will limit ourselves to a few elementary notions and results of Set Theory which are essentially to the study of abstract algebra. A.1 Relations and Functions The notions of binary relation and map are directly related with the notion of ordered pair and, in fact, can be defined from the latter notion. From 1 P. R. Halmos, Naive Set Theory, Undergraduate Texts in Mathematics, SpringerVerlag, 1974. 383 384 APPENDIX A. SET THEORY a practical point of view, the most basic property of an ordered pair is the equivalence: (A.1.1) (x1 , y1 ) = (x2 , y2 ) ⇐⇒ x1 = x2 and y1 = y2 . The properties that an ordered pair should satisfy can be expressed in terms of even more basic notions of Set Theory, and lead to the previous equivalence: Definition A.1.1. If X and Y are sets, x ∈ X and y ∈ Y , the ordered pair (x, y) is the set (x, y) = {x, {x, y}}. The elements x and y are called the components of the pair (x, y). The set of all ordered pairs (x, y), where x ∈ X and y ∈ Y , are called the cartesian product of X and Y , and is denoted by X × Y . Note that (A.1.1) is an immediate logical consequence of Definition A.1.1, and in particular the fact that (x, y) is equal to (y, x) if and only if x = y. One can use the notion of ordered pair to formalize another important notion which will lead to the notion of map: Definition A.1.2. A relation between X and Y is a subset R ⊂ X × Y . We will often write “xRy” instead of “x, y ∈ R”, and when X = Y ,we say that R is a binary relation on X. It is immediate to check that a relation R between X and Y has an associated relation between Y and X, which is obtained by “switching” the components of every ordered pair in R. Definition A.1.3. If R is a relation between X and Y , the inverse or opposite relation of R, denoted by Rop , is defined by setting Rop = {(y, x) : (x, y) ∈ R}. There are several important classes of relations. In particular, the notions of order relations, equivalence relations and maps, play an essential role in many of our considerations. Definition A.1.4. Let R be a relation on the set X. We say R is an order relation on X if: (i) Transitivity: For x, y, z ∈ X, if xRy and yRz, then xRz. (ii) Anti-symmetry: For x, y ∈ X, if xRy and yRx, then x = y. A.1. RELATIONS AND FUNCTIONS 385 There are also various kinds of order relations, which are distinguished by adding one of the terms strict/non-strict e total/partial: Definition A.1.5. Let R be an order relation on X. (i) R is called a total order relation (the opposite of a partial order relation) if it satisfies the trichotomy property: For any x, y ∈ X, one of the following holds: xRy or yRx or x = y. (ii) R is called a strict order relation (the opposite of a non-strict order relation) if satisfies the anti-reflexivity property: For x, y ∈ X, if xRy then x 6= y. The following examples illustrate all the various possibilities: Examples A.1.6. 1. The usual relation “>” (larger than) between real numbers is a strict order relation and also a total order relation. 2. The usual relation “≥” (larger or equal) between real numbers is a non-strict order relation and a total order relation. 3. The relation “⊇” (contains) between subsets of a given fixed set is a non-strict order relation and a partial order relation. 4. The relation “)” (strictly contains) between subsets of a given fixed set is a strict order relation and a partial order relation. Given a partial ordered set X, with order relation denoted by “≤”, any subset Y ⊂ X is also partially ordered with the order relation induced on X (which we will still denote by “≤”). Obviously, the order relation induced on X may have properties that the order relation on X does not satisfy. For example, it may very well happened that X is partially ordered while the induced ordered in Y ⊂ X is a total order. In this case, the set Y is called a chain in X. Example A.1.7. In R2 consider the partial order relation defined by: (x1 , y1 ) ≤ (x2 , y2 ) if and only if y1 = y2 and x1 ≤ x2 , where the last inequality is the usual order relation for real numbers. In this case, any subset {(x, y) ∈ R2 : y = c}, with c ∈ R fixed ( i.e., an horizontal line recta), is a chain. Note, also, that R2 , with this order relation is not totally ordered. 386 APPENDIX A. SET THEORY The next class of binary relations is the following: Definition A.1.8. Let R be a binary relation on the set X. We call R an equivalence relation on X if it satisfies: (i) Reflexivity: For all x ∈ X, xRx. (ii) Symmetry: For x, y ∈ X, if xRy, then yRx. (iii) Transitivity: For x, y, z ∈ A, if xRy and yRz, then xRz. Here are some simple examples of equivalence relations. Examples A.1.9. 1. The relation of parallelism between lines in the plane is an equivalence relation on the set of all lines. 2. The relation of equality between elements of any set X is an equivalence relation on X. 3. The relation de congruence modulo m is an equivalence relation on the set of integers. Any equivalence relation R on a set X determines an important collection of subsets of X. Definition A.1.10. Given an equivalence relation R on X and x ∈ X, the equivalence class of x, denoted by x or [x], is the set x = [x] = {y ∈ X : xRy}. The set of all equivalence classes x is called the quotient set of X by R and is denoted by X/R, so that: X/R = {x : x ∈ X} = {{y ∈ X : xRy} : x ∈ X}. Examples A.1.11. 1. For the relation of parallelism between the lines in the plane, the equivalence class of a line L is formed by all lines parallel to L. 2. If R is the relation of equality on the set X, the quotient set is X/R = {{x} : x ∈ X}. 3. If R is the congruence relation modulo m in the set of integers Z and a ∈ Z, then a = {a + km : k ∈ Z}. A.1. RELATIONS AND FUNCTIONS 387 It is a basic fact that two equivalence classes either coincide or are disjoint sets. The proof is elementary. Proposition A.1.12. Let R be an equivalence relation on X. For x, y ∈ X, the following statements are equivalent: (i) xRy; (ii) x = y; (iii) x ∩ y 6= ∅. Finally, we consider the following important class of equivalence relations, usually called “maps”. Definition A.1.13. A relation f between X and Y is called a map from X to Y , denoted f : X → Y , of: (i) for each x ∈ X there exists y ∈ Y such that xf y, and (ii) if xf y and xf y ′ , then y = y ′ . Because of (ii), we will write y = f (x) instead of xf y. We then call X is o domain and Y the image (or range) of the map f . Other common names to denote a map are function and transformation. Examples A.1.14. 1. If X is a set, IX : X → X, given by IX (x) = x, is called the identity map in X. 2. If X ⊃ Y are sets, iY : Y → X, given by iY (y) = y, is called the inclusion map of Y in X. 3. If R is an equivalence relation on X, the map πX/R : X → X/R, given by πX/R (x) = x, is called quotient map of X in X/R. In geral, if X ⊃ X ′ and Y ⊃ Y ′ , given a map f : X → Y , we set: f (X ′ ) ≡ {f (x) : x ∈ X ′ }, and f −1 (Y ′ ) ≡ {x ∈ X : f (x) ∈ Y ′ }. We say that f (X ′ ) is the direct image of X ′ by f , and f −1 (Y ′ ) is a inverse image of Y ′ by f . In particular, the image of f the set f (X), which we will often denote by Im f . 388 APPENDIX A. SET THEORY Example A.1.15. if f : R → R is the map cos x, its image is the set f (R) = [−1, +1]. The inverse image of the set {0} is f −1 ({0}) = { 2n+1 2 π : n ∈ Z}. As we all know certain kinds of maps have special designations: Definition A.1.16. If f : X → Y is a map, then f is called: (i) surjective or onto if for all y ∈ Y there exists x ∈ X such that y = f (x); (ii) injective if f (x) = f (x′ ) ⇔ x = x′ ; (iii) bijective or a bijection if it is both injective and surjective. In this case. we say that sets are equipotent or isomorphic. Examples A.1.17. 1. The identity map IX : X → X is bijective. 2. The inclusion map iY : Y → X is injective. 3. The quotient map pX/R : X → X/R is surjective. 4. The map cos : R → R is neither injective nor surjective. Given maps f : X → Y and g : Y → Z, the compose of f and g is the map g ◦ f : X → Z (read “g after f ”) given by (g ◦ f )(x) = g(f (x)). We leave for the exercises to verify that: Proposition A.1.18. If X, Y , Z, and W are sets. Then: (i) Associativity: If f : X → Y , g : Y → Z and h : Z → W are maps, then (h ◦ g) ◦ f = h ◦ (g ◦ f ); (ii) Left Inverse: f : X → Y is injective if and only if there exists g : Y → X such that g ◦ f = IX ; (iii) Right Inverse: f : X → Y is surjective if and only if there exists g : Y → X such that f ◦ g = IY ; (iv) Inverse: f : X → Y is bijective if and only if there exists g : Y → X such that f ◦ g = IY and g ◦ f = IX . In this case, g is called the inverse map of f and is denoted by f −1 . A.1. RELATIONS AND FUNCTIONS 389 Exercises. 1. Use Definition (A.1.1) to prove that (A.1.1) holds. 2. Describe the anti-symmetry and trichotomy properties of a relation R in terms of R and its inverse Rop . 3. Give a proof of Proposition (A.1.12). 4. Show that, if f : X → Y is a map and {Yi }i∈I is a family of subsets de Y , then the following identities hold: f −1 ( [ i∈I Yi ) = [ f −1 (Yi ), and f −1 ( i∈I \ Yi ) = i∈I \ f −1 (Yi ). i∈I Are these still true if we assume that {Xi }i∈I is a family of subsets of X and we replace f −1 por f ? 5. Let f : X → Y be a map. (a) Can there exist more that one map g : Y → X such that f ◦ g = IY ? (b) Can there exist more that one map g : Y → X such that g ◦ f = IX ? (c) Can there exist more that one map g : Y → A such that f ◦ g = IY and g ◦ f = IX ? 6. Prove itens (i) and (ii) of Proposition A.1.182 . 7. Show that the inverse f −1 of a map f : X → Y is: (a) a map f −1 : f (X) → X if f is injective; (b) a map f −1 : Y → X if f is bijective. 8. Verify that, if f : X → Y is a bijective map, then g = f −1 is the only map such that f ◦ g = IY and g ◦ f = IX . 9. Show that, if X 6= ∅ and Y = ∅, there are no maps f : X → Y . 10. If X = ∅ and f : X → Y , then f is injective, and moreover f is surjective if and only if Y = ∅. 2 The proofs of itens (iii) and (iv) require the Axiom of Choice which will be mentioned later. 390 A.2 APPENDIX A. SET THEORY Axiom of Choice, Zorn’s Lemma and Induction We have mentioned several times the notion of cartesian product of sets. Using the notion of map, we can introduce define the cartesian product of any arbitrary number of sets. The set of of integers will play a relevant role in the discussion and given a positive integer n ∈ N we will denote in this section by In = {k ∈ N : k ≤ n} the set consisting of the first n natural numbers. In particular, I0 = ∅. Given a set X, a map f : I2 → X determines uniquely an ordered pair with components in X, namely the pair (f (1), f (2)), which we will also write as (f1 , f2 ). Hence, the set of all maps f : I2 → X is isomorphic to the set of all ordered pairs with components in X. A similar observation holds if we consider maps f : I2 → X ∪ Y such that f (1) ∈ X and f (2) ∈ Y . The set of all such maps f : I2 → X ∪ Y is isomorphic to the set of ordered pairs (x, y), with x ∈ X and y ∈ Y , i.e., to the cartesian product X × Y . This leads us to define arbitrary cartesian products as sets of maps. Definition A.2.1. Let X1 , X2 , . . . , Xn be sets. The cartesian product Qn X is the set of all maps f : In → ∪ni=1 Xi such that f (k) ∈ Xk . i=1 i Q If f ∈ ni=1 Xi we write f = (f (1), f (2), . . . , f (n)), or f = (f1 , f2 , . . . , fn ). n Also, when Qnthe sets Xi all coincide with the same set X, we will write X instead of i=1 X, and called it the power n of X. In this case, the elements of X n are called n-tuples of elements of X. One can apply the method used in Definition A.2.1 to define the cartesian product of an arbitrary number of sets. For that, we replace the family of sets X1 , X2 , . . . , Xn , indexed by a natural number 1 ≤ k ≤ n, by a family {Xi : i ∈ I}, indexed by a parameter i belonging to an arbitrary set I.3 Definition A.2.2. Given a family {Xi : i ∈ I}, the cartesian product Q i∈I Xi is given by Y [ Xi = {f : I → Xi , wiith f (i) ∈ Xi para qualquer i ∈ I}. i∈I i∈I Q The canonical Q projection πk : i∈I Xi → Xk4 is the map associating to an element f ∈ i∈I Xi the element f (k) ∈ Xk . 3 Recall that given some class of sets X, a family indexed by i ∈ I is just a map f : I → X. We write Xi instead of f (i), just like we write, for example, xn instead of f (n) when we deal with a sequence of real numbers. 4 We will often write fk , instead of f (k). A.2. AXIOM OF CHOICE, ZORN’S LEMMA AND INDUCTION 391 Examples A.2.3. Q 1. Given an set X, the cartesian product n∈N X is just the set of all sequences with values in X. Q 2. The cartesian product n∈N In is the set of sequences of natural numbers f : N → N such that f (n) ≤ n, for all n ∈ N. Q 3. The cartesian product x∈R [x − 1, x + 1] is the set of all functions f : R → R such that x − 1 ≤ f (x) ≤ x + 1. If we consider all maps with Q a fixed domain X and fixed range Y , this is just the cartesian product x∈X Y , but it is common to denote it by the symbol “Y X ”. Definition A.2.4. Y X = {f : X → Y } is the set of all maps from X to Y . Examples A.2.5. 1. X N is the set consisting of all sequences in X. 2. RR denotes the set of all real valued functions of one variable. 3. RR×R is the set consisting of all real valued functions of two variables. 4. In general, X X×X is the set of all binary operations on X. 5. One can also denote the set X n by X In . I Q S , X is a strict subset of In general, the cartesian product X i i i∈I i∈I S since not all maps f : I → i∈I Xi satisfy the condition f (i) ∈ Xi , for all i ∈ I. However, when Xi = X for every i ∈ I, it is obvious that Y Y Xi = X = XI . i∈I i∈I Hence, the set X I is a special type of cartesian product and this justifies the use of the “power” notation. Q One shows easily (by induction) that the product ni=1 Xi of any finite number of sets is non-empty, as long as the sets Xi are non-empty. The very same statement, but for an arbitrary number of sets, is a fundamental axiom of Set Theory: Axiom I (Axiom of Choice). If I 6= ∅ and Xi 6= ∅, for all i ∈ I, then Q X = 6 ∅. i∈I i 392 APPENDIX A. SET THEORY The reason for the designation for Q this axiom is easy to explain. An element f of the cartesian product i∈I Xi amounts to a “choice” of an element in each of the sets Xi . The axiom states that given an arbitrary family of sets, there exists always a map which “chooses” exactly one element from each set. The next example illustrates the kind of issue that the Axiom of Choice allows to bypass. Example A.2.6. Assume that each Xi is formed a single pair of shoes. Then we can decide to choose, for example, the right shoe of each pair. In this case. the Axiom of Choice is useless. On the other hand, if each Xi is formed by a pair of socks, then there is no criteria for a choice of an element in Xi , and we can invoke the Axiom of Choice to claim the existence of a set formed by exactly one sock from each set Xi . Many results concerning the existence of some set, or of an element in a set, satisfying some property, can be reformulated in terms of the existence of a maximal element for an appropriate order relation 5 . In this respect, the following result often plays a crucial role: Theorem A.2.7 (Zorn’s Lemma). Let X be a partially ordered, non-empty, where every chain possui has an upper bound. Then X contains a maximal element. One can show that Zorn’s Lemma is equivalent the Axiom of Choice (see P. Halmos book cited at the beginning of this appendix). The next example illustrates how Zorn’s Lemma can be applied. Example A.2.8. Let A be a ring, I ( A an ideal, and denoted by X the set of all proper ideals of A that contains I. Since I belongs to X, this set is non-empty. We consider on X the relation of inclusion, which S is obviously a partial order relation. If {Ij : j ∈ J} is a chain in X, then j∈J Ij is an upper bound (you should check that this is indeed an ideal of A which contains I and hence belongs to X). We conclude from Zorn’s Lemma that X has a maximal element. This shows in an arbitrary ring A, every proper ideal is contain in a maximal ideal. 5 The notions of upper/lower bound, supremum/infimum and maximal/minimal element for subsets of partial ordered sets is discussed in Section 2.2. A.2. AXIOM OF CHOICE, ZORN’S LEMMA AND INDUCTION 393 One can imagine an alternative proof of the result in this example, as follows: given a proper ideal ⊂ A, if I is not maximal, then there exists an ideal I0 containing I. Then, if I0 is not maximal, there exists an ideal I1 which contains I0 , and so on. The problem, of course, is that ”so on” may not terminate. Zorn’s Lemma is useful precisely to avoid this kind of problem. For basically the same reason, a partial ordered set X may fail to contain a minimal element. Even if a minimal element exists, a subset Y ⊂ X may fail to have a minimal element. The following definition attempts to prevent this. Definition A.2.9. A partial ordered set X is said to be well-ordered if every non-empty subset S ⊂ X has a minimal element. Obviously, the order relation in a well-ordered set X is a total order:6 if x, y ∈ X, the set {x, y} has a minimal element, hence either x ≤ y or y ≤ x. Examples A.2.10. 1. The set N, with the usual order relation, is a well-ordered set. 2. The set Z, with the usual order relation, is not a well-ordered set since, e.g., the subset {n ∈ Z : n ≤ 0} does not have a minimal element. Another result which is equivalent to the Axiom of Choice and, hence, also to Zorn’s Lemma, is the following: Theorem A.2.11 (Well-Ordering Principle). Every set can be well-ordered. Again we refer to the book of P. Halmos for a proof of this result. Example A.2.12. We observed above that the set of integers Z, with the usual order relation, is not well-ordered. However, we have, for example, the following well-ordering of Z: 0, 1, −1, 2, −2, . . . , n, −n, . . . The main feature of well-ordered sets is the possibility of generalizing the usual induction method to these sets, as we now explain. Given a partial 6 From now on, unless otherwise mentioned, we denote a (non-strict) order relation on set set X by the symbol “≤”. 394 APPENDIX A. SET THEORY ordered set let us denote by s(x) the set of all elements strictly less than x, also called the “segment” ending at x: s(x) = {y ∈ X : y ≤ x, and y 6= x}. We then have: Theorem A.2.13 (Transfinite Induction). Let X be an well-ordered set, and S ⊂ X a subset such that: ∀x ∈ X, s(x) ⊂ S =⇒ x ∈ S. Then S = X. Proof. If X − S is non-empty, denote by x its minimal element. Then the segmento s(x) is contained in S. By the “induction hypothesis ”, x ∈ S. Since x cannot belong at the same time to S and to X − S, we must have X − S empty, and X = S. The Transfinite Induction method has a large range of application, as it follows from the Well-Ordering Principle. For now, we will use it to study recursive definitions, including transfinite recursive definitions. If A is a set and n ∈ N, consider the set An of all maps f : In → A, i.e., all the n-tuples (x1 , x2 , · · · , xn ) in A. Consider also the class Φ = ∪n∈N An . Definition A.2.14. A map F : Φ → A is called a recursive formula. The justification for this name is that a map F : Φ → A allows to associate to a n-tuple fn = (x1 , x2 , · · · , xn ) the (n+1)-tuple given by fn+1 = (x1 , x2 , · · · , xn , xn+1 ), where xn+1 = F (fn ) = F (x1 , x2 , · · · , xn ). Theorem A.2.15 (Recursive Definitions). Given an element x1 ∈ A and a recursive formula F : Φ → A, there exists a unique sequence φ : N → A such that (i) φ(1) = x1 , and (ii) φ(k + 1) = F (φ|Ik ), for all k ∈ N. where we have denoted by φ|Ik the restriction of φ to the set Ik . Proof. Using induction, we prove first that for every n ∈ N there exists fn ∈ An satisfying: (1) fn (1) = x1 , and A.2. AXIOM OF CHOICE, ZORN’S LEMMA AND INDUCTION 395 (2) fn (k + 1) = F (fn |Ik ), for all k < n. The result is obvious for n = 1, if we set f1 (1) = x1 and if we observe that in this case condition (2) is empty. Assume the result holds for some n ≥ 1: there exists an n-tuple fn = (x1 , x2 , · · · , xn ) ∈ An satisfying (1) and (2). Then we define fn+1 = (x1 , x2 , · · · , xn , xn+1 ) ∈ An+1 , where xn+1 = F (fn ) = F (x1 , x2 , · · · , xn ). One checks immediately that fn+1 automatically satisfies (1), and it also satisfies (2) for k < n + 1. Assume now that fn ∈ An and fm ∈ Am satisfy both (1) and (2). We assume, without loss of generality, that n < m, and we prove that fn is the restriction of fm tp In . For that, consider the set D = {k ∈ In : fn (k) 6= fm (k)}. Assuming D non-empty, let r + 1 be its minimal element, and note that r ≥ 1, because by assumption fn (1) = fm (1) = x1 . Hence, we have that fn (k) = fm (k) for all k ≤ r, so the restrictions fn |Ir and fm |Ir are equal. But then we must have fn (r + 1) = F (fn |Ir ) = F (fm |Ir ) = fm (r + 1), contradicting that r + 1 ∈ D. In order to conclude the proof, since we already know that for any n ∈ N there exists exactly one element fn ∈ An satisfying (1) and (2), we define φ : N → A by f (n) = fn (n). One checks easily that this is the only sequence that satisfies (i) and (ii). The previous result maybe generalized, replacing N by any well-ordered set X. In this case, the sets In are replaced by the segments s(x) = {y ∈ X : y < x}, Ax = As(x) is then the set of all maps f : s(x) → A and Φ = ∪x∈X Ax . A transfinite recursive formula is again a map F : Φ → A. The proof of the following generalization of Theorem A.2.15 is left for the exercises: Theorem A.2.16 (Transfinite Recursive Definitions). Given a transfinite recursive formula F : Φ → A, there exists a unique map f : X → A such that f (x) = F (f |s(x) ), for all x ∈ X. Exercises. 1. Show that X × Y is isomorphic to Y × X. Under what conditions does the equality X × Y = Y × X hold? 2. Show that X × (Y × Z) is isomorphic to (X × Y ) × Z. 3. Describe the sets X ∅ , ∅X and ∅∅ . 4. Show that the sets X Y ∪Z and X Y × X Z are isomorphic. 396 APPENDIX A. SET THEORY Z 5. Show that the sets X Y ×Z and (X Y ) are isomorphic, whenever Y ∩ Z = ∅. 6. Show that (X × Y )Z is isomorphic to X Z × Y Z . 7. Use the Axiom of Choice to prove that f : X → Y is surjective if and only if there exists g : Y → X such that f ◦ g = IY . 8. Show Q that, if I 6= ∅ and Xi 6= ∅ for all i ∈ I, then the canonical projections πk : i∈I Xi → Xk are surjective. 9. Use Zorn’s Lemma to show that for an arbitrary group, every proper subgroup is contained in a maximal subgroup. 10. Prove the following statements: (a) Every partially ordered set has a maximal chain; (b) Every chain in a partially ordered set is contained in a maximal chain. 11. Show that the Principle of Transfinite Induction is equivalent to the usual Principle de Induction (see Chapter 2) when X = N. 12. Give an example of a well-ordering of Q. 13. Show that a totally ordered set X is well-ordered if and only if for all x ∈ X the segment s(x) is well-ordered. 14. Prove Theorem A.2.16. Explain why the statement of this result does not mention an element x1 as was the case with Theorem A.2.15. A.3 Finite Sets Intuition says that a set X is finite if one can “count” its elements. The prototype of a finite set with n ≥ 0 elements is given by the set formed by the first n natural numbers: In = {1, 2, 3, . . . , n} = {k ∈ N : k ≤ n}. When n = 0, we have the empty set I0 = ∅. A moment’s thought shows that by “counting” we really mean establishing a bijective correspondence (map) between X and In . More formally, we define: Definition A.3.1. A set X is called finite if it is isomorphic to In , for some n ≥ 0. If X is not isomorphic to any In , then X is called infinite. A.3. FINITE SETS 397 Examples A.3.2. 1. The set In is obviously isomorphic to itself, hence it is a finite set. 2. We leave as an exercise to check that a subset X ⊂ Z is finite if and only if is bounded. We have mentioned in Chapter 1 that X is infinite if and only if there exists an injective map φ : X → X which is not surjective. This will be established in the next section, where we study in some detail infinite sets. For now, we limit ourselves to finite sets. Let us look first at the sets In . Lemma A.3.3. If φ : In → In is injective, then φ is surjective. PSfrag replaements Proof. We use induction. When n = 0, there is nothing to prove.7 Assuming that the result holds for some n, let φ : In+1 → In+1 be an injective map, and let α = φ(n+1). Consider (see Figure A.3.1) the bijective map ψ : In+1 → In+1 given by ψ(n + 1) = α, ψ(α) = n + 1, and ψ(x) = x in all other cases (ψ “exchanges” the natural numbers α and n + 1, and is y the identity if x 6= α, n + 1, but this last fact is irrelevant forr1the proof). rr2 rr34 5 In+1 In+1 In+1 r6 # n+1 n+1 n+1 In In = Æ 1 +pr r mi h e e~ h1i = Z h2i h4i In h12i h6i h3i hmd(m; n)i hmi hni hmm(m; n)i Figure A.3.1: The maps φ, φ∗ and ψ. Now we define φ∗ = ψ ◦ φ and observe that φ∗ is injective, since it is the composition of injective maps. Also, it satisfies φ∗ (n + 1) = n + 1, 7 A map f : X → Y is just a set of ordered pairs with some special properties. It is possible that f is the empty set, and this happens exactly when X is also the empty set. In this case, f is necessarily injective, and it is surjective if Y is also the empty set. 398 APPENDIX A. SET THEORY by the definition of ψ. It follows that if x ∈ In (i.e., if x 6= n + 1), then φ∗ (x) 6= n + 1, so that φ∗ (x) ∈ In , or equivalently φ∗ (In ) ⊆ In . This shows that the restriction of φ∗ to In is an injective map from In into In . By the induction hypothesis, this restriction is surjective: φ∗ (In ) = In . Since φ∗ (n+1) = n+1, conclude that φ∗ (In+1 ) = In+1 , i.e., φ∗ is a surjective map from In+1 onto In+1 . Finally, we conclude that φ = ψ −1 ◦ φ∗ is surjective, since it is the composition of surjective maps. PSfrag replaements Proposition A.3.4. If X is finite and φ : X → IX is injective, then φ is n n + 1 surjective. Proof. Let Ψ : In → X be a bijection. Note that φ∗ = Ψ−1 ◦ φ ◦ Ψ : In → In is injective, since it is the composition of injective maps (see Figure A.3.2). y ∗ r 1 According to Lemma A.3.3, φ is necessarily surjective. r2 r3 r4 r5 r6 X pr r 1+ X hmi e e~ 1 I = 1 Æ Æ h1i =Z h 2i h 4i h12i h 6i I h 3i hmd(m; n)i h mi hmm(m; n)i Figure A.3.2: The maps φ, φ∗ and Ψ. It follows that φ = Ψ ◦ φ∗ ◦ Ψ−1 is a composition of surjective maps, and hence it is surjective. Corollary A.3.5. If φ : X → X is injective but not surjective, then X is an infinite set. Example A.3.6. In Chapter 2, we observed that the map f : N → N given by f (n) = n + 1 is injective and not surjective. From the previous result, we conclude that N is infinite. Another consequence of the proposition above is that a finite set cannot be isomorphic to a proper subset of it. The example above shows that this A.3. FINITE SETS 399 fails for infinite sets. We state this result as follows and leave the proof for the exercises: Corollary A.3.7. If X is finite, X ⊇ Y and φ : X → Y is injective, then X =Y. It seems obvious that X is isomorphic to In if and only if X has n elements, in which case X cannot be isomorphic to Im for m 6= n. In fact, this is a direct consequence of Proposition A.3.4. Corollary A.3.8. If φ : In → X and ψ : Im → X are both bijections, then n = m. Proof. Assume, without loss of generality, that m ≤ n, so that Im ⊂ In . Then the map Ψ = ψ −1 ◦ φ : In → Im gives an injective map from In into its subset Im . By Corollary A.3.7, we must have Im = In , i.e., n = m. Hence, if X is finite there exists a unique non-negative integer n such that X is isomorphic to In . In this case, we say that X has n elements, and we call n the cardinal number or cardinality of X. We will use often the symbol #X to denote the cardinality of X. We will list a few properties of the cardinality of finite sets. We only sketch the proofs, leaving the details for the exercises. Proposition A.3.9. If Y is a subset of the set X, then: (i) If X is finite, then Y is also finite and #Y ≤ #X. (ii) If X is finite and #Y = #X, then X = Y . (iii) If Y is infinite, then X is also infinite. The proof uses the following two lemmas, the first of which has an elementary proof and so it is left as an exercise. Lemma A.3.10. If φ : In → X is injective but not surjective, then there exists φ∗ : In+1 → X injective. Lemma A.3.11. If φ : X → In is injective, then X is finite and #X ≤ n. Proof. Let M (X) be the set of integers m ≥ 0 for which there exists an injective map Ψm : Im → X. Note that 0 belongs to M (X), and we claim that M (X) is bounded above. 400 APPENDIX A. SET THEORY If Ψm : Im → X is injective, the composition φ ◦ Ψ : Im → In is also injective. According to Corollary A.3.7, we cannot have n < m, i.e., n is an upper bound for M (X). Hence, M (X) has a maximum k. By Lemma A.3.10, we have k = #X. Since n is an upper bound for M (X), it follows that #X = k ≤ n. Our last proposition expresses in a precise form some of our intuition about the meaning of the addition and the product of natural numbers. Proposition A.3.12. If X and Y are finite sets, then X ∪ Y and X × Y are finite sets and we have: (i) #(X ∪ Y ) ≤ #X + #Y ; (ii) #(X ∪ Y ) = #X + #Y , if X and Y are disjoint; (iii) #(X × Y ) = (#X)(#Y ). Proof. We prove only item (ii), leaving the remaining statements for the exercises. Let φ : In → X and ψ : Im → Y be bijections, so that #X = n and #Y = m. Define a map Ψ : In+m → X ∪ Y by:  if 1 ≤ k ≤ n,  φ(k) Ψ(k) =  ψ(k − n) if n + 1 ≤ k ≤ n + m. Obviously, Ψ is surjective and injective (since X and Y are disjoint). Hence, #(X ∪ Y ) = n + m. Exercises. In these exercises, the symbols X and Y denote arbitrary sets. 1. Prove Corollary A.3.7. 2. Prove Lemma A.3.10 3. Let φ : In → X be a surjective map. Show that Ψ : X → In , given by Ψ(x) = max{k ∈ In : φ(k) = x} is injective. 4. Show that the following statements are equivalent: A.4. INFINITE SETS 401 (a) X is finite and #X ≤ n. (b) There exists an injective map φ : X → In . (c) There exists a surjective map ψ : In → X. 5. Let X be a finite set. Show that if either φ : X → Y is surjective or ψ : Y → X is injective, then Y is finite and #Y ≤ #X. 6. Give a proof of Proposition A.3.9 using Corollary A.3.7 and Lemma A.3.11. 7. Show that if X is finite, then #X = #(X − Y ) + #(X ∩ Y ). 8. Let X and Y be finite. Show that X ∪ Y is also finite and #(X ∪ Y ) = #X + #Y − #(X ∩ Y ). 9. Let X and Y be finite. Show that X ×Y is finite and #(X ×Y ) = (#X)(#Y ). 10. Let X1 , X2 , . . . , Xn be finite sets. Show that: Pn Sn (a) #( k=1 Xk ) ≤ k=1 #Xk ; S P (b) #( nk=1 Xk ) = nk=1 #Xk , if the sets Xk ’s are disjoint; Qn Qn (c) #( k=1 Xk ) = k=1 #Xk . 11. Assume that #X = n and #Y = n. Find the cardinality of the following sets: (a) The set Y X of all maps f : X → Y . (b) The set of all injective maps f : X → Y . (c) The set of all surjective maps f : X → Y . 12. Assuming that #X = n, show that: (a) X has 2n distinct subsets; n! subsets with exactly k elements. (b) X has nk = k!(n−k)! 13. Let X ⊂ Z. Show that X is finite if and only if X is bounded. A.4 Infinite Sets We have already seen some elementary results on infinite sets. In particular, we have seen that N is an infinite set. We will now study some more properties of infinite sets. We start by showing that N is the “smallest” infinite set. 402 APPENDIX A. SET THEORY Lemma A.4.1. X is infinite if and only if X contains a subset isomorphic to N. Proof. If there exists a subset Y ⊂ X and a bijection φ : N → Y , it follows that Y is infinite, and hence X is also infinite, according to Proposition A.3.9 (iii). Conversely, assume that X is infinite. We claim that there exists an injective map φ : N → X. Taking Y = φ(N), this completes the proof. We will define the map φ recursively. Since X 6= ∅, there exists x1 ∈ X and we set φ(1) = x1 . Assume that φ has been defined and is injective in {1, 2, . . . , n}. The set Zn = X − {φ(1), . . . , φ(n)} is non-empty, for otherwise X would be finite. Choose z some element of Zn , and set φ(n + 1) = z. This defines an injective map φ : N → X recursively. The previous result allows to complete Corollary A.3.5, and to justify the characterization of infinite sets mentioned in the exercises of Section 2.1. Theorem A.4.2. X is infinite if and only if there existe φ : X → X injective and not surjective. Proof. Corollary A.3.5 shows that if there exists φ : X → X injective and not surjective, then X is infinite. It remains to show that if X is infinite, then there exists such a map. If X is infinite, Lemma A.4.1 yields an injective map (sequence) ψ : N → X. Let Y = ψ(N) so that ψ : N → Y is a bijection. Now we define φ : X → X as follows:  if x 6∈ Y,  x φ(x) =  ψ(ψ −1 (x) + 1) if x ∈ Y. One checks easily that φ is injective and not surjective. All the properties of infinite sets that we have studied so far are, perhaps, not surprising. Our next result is less obvious: it states that Lemma A.4.1 cannot be improve, i.e., there are infinite sets which are not isomorphic to N. In what follows we will denote by P(X) the set consisting of the subsets of X. Theorem A.4.3 (Cantor). Let Ψ : X → P(X) be any map. Then Ψ is not surjective. A.4. INFINITE SETS 403 Proof. The proof is by contradiction, and resembles the ideas behind Russel’s paradox, that we mentioned in Chapter ??. Let Ψ : X → P(X) and define a subset Y ⊂ X by: Y = {x ∈ X : x 6∈ Ψ(x)}. Since Y ∈ P(X), if Ψ is surjective there exists an element y ∈ X such that Y = Ψ(y). Obviously, either y ∈ Y or y 6∈ Y . We claim that both cases lead to a contradiction: (i) If y ∈ Y = Ψ(y), then the definition of Y gives that y 6∈ Y , a contradiction; (ii) If y 6∈ Y = Ψ(y) it follows from the definition of Y , that y ∈ Y , which is also a contradiction. It follows that there is no y ∈ X such that Y = Ψ(y), hence Ψ is not surjective. Examples A.4.4. 1. In order to illustrate the proof above, let X = {0, 1}, so that: P(X) = {∅, {0}, {1}, {0, 1}}. If Ψ : X → P(X) is given by Ψ(0) = {1} and Ψ(1) = {0, 1}, we have that Y = {x ∈ X : x 6∈ Ψ(x)} = {0}, so obviously Y 6∈ Ψ(X) = {{1}, {0, 1}}. 2. Consider the set P(N) whose elements are the sets containing only natural numbers. The result above says that this set is not isomorphic to N. On the other hand, the map Φ : N → P(N) given by Φ(n) = {n} is obviously injective, and hence P(N) is infinite. Definition A.4.5. A set X is called countable if it is finite or isomorphic to N. Otherwise, X is called (infinite) uncountable or Therefore, as we saw above, X is an infinite set if and only if it contains an infinite countable subset, but there are infinite uncountable sets, such as P(N). On the other hand, there are also uncountable sets, not isomorphic to N, and also not isomorphic between themselves: for example, the sets P(N), P(P(N)), P(P(P(N))), etc. 404 APPENDIX A. SET THEORY There are many infinite uncountable sets which we are used to, such as R or C. To check that R is indeed uncountable, we will use the following proposition which is usually presented as a consequence of the Supremum Axiom. Proposition A.4.6. Every bounded monotone sequence of real numbers converges. For the proof we refere to Chapter 4, where the real numbers are introduced in a constructive way. Theorem A.4.7. R is an uncountable set. Proof. Let φ : N → R be any sequence of real numbers. We need to show that φ is not surjective, i.e., that there exists x ∈ R such that x 6∈ φ(N). There exist real numbers a1 and b1 such that a1 < b1 < φ(1). In particular, φ(1) 6∈ [a1 , b1 ]. One can define recursively sequences an and bn which are, respectively, increasing and decreasing, and such that: an < bn , φ(n) 6∈ [an , bn ]. Obviously, both sequences are bounded by a1 and b1 . It follows from Proposition A.4.6 that both sequences converge, and we denote their limits by a and b, respectively. It is also clear that an ≤ a ≤ b ≤ bn , and we conclude immediately that a 6= φ(n), for all n ∈ N. Hence, φ is not surjective. Example A.4.8. Consider the set {0, 1}N, consisting of all sequences in {0, 1}, (called binary sequences) is also uncountable. In this case, it is easy to show that {0, 1}N is isomorphic to P(N). For that, observe that a binary sequence φ : N → {0, 1} is completely determined by its support, i.e., by the set of natural numbers n for which φ(n) 6= 0. In other words, the map Ψ : {0, 1}N → P(N), defined by Ψ(φ) = {n ∈ N : φ(n) 6= 0}, is a bijection. One can use the bijection in the last example, together with some elementary facts concerning binary representations of real numbers (base two), to show that both P(N) and {0, 1}N are isomorphic to R (see the exercises). We will not define “#X” when X is an infinite set. Instead, we will use the symbol “|X|”, also called cardinal of X, where X denotes any set (finite or infinite), as part of the expressions “|X| = |Y |”, “|X| ≤ |Y |” and “|X| < |Y |”, with the following meaning: A.4. INFINITE SETS 405 Definition A.4.9. We will write: (i) |X| = |Y | if X and Y are isomorphic, i.e., if there exists a bijection φ:X →Y; (ii) |X| ≤ |Y | if X is isomorphic to a subset de Y , i.e., if there exists a injective map φ : X → Y ; (iii) |X| < |Y | if |X| ≤ |Y |, and X is not isomorphic to Y . The equality “|X| = |Y |” and the inequality “|X| ≤ |Y |” are analogues of the equalities and inequalities between numbers “#(X) = #(Y )” and “#(X) ≤ #(Y )”, when X and Y are finite sets. For this reason, we could write #X = |X|. Even when X and Y are infinite sets, the cardinal has some properties similar to the equalities and inequalities between numbers. For example, the following two properties are immediate to prove: Proposition A.4.10. Let X, Y and Z be sets. (i) If |X| = |Y | and |Y | = |Z|, then |X| = |Z|; (ii) If |X| ≤ |Y | and |Y | ≤ |Z|, then |X| ≤ |Z|. It may look obvious that |X| ≤ |Y | and |Y | ≤ |X| ⇐⇒ |X| = |Y |. However, the proof of this fact is not at all obvious for infinite sets. In order to prove this we start by showing that the following lemma holds. Note that we already know it holds for finite sets, in which case we can add to the conclusion that “and X = Y ” (but not for infinite sets!). Lemma A.4.11. If Y ⊂ X and φ : X → Y is injective, then |X| = |Y |. Proof. We define recursively two sequences of sets as follows: X1 = X, Y1 = Y , and para n > 1, Xn = φ(Xn−1 ) and Yn = φ(Yn−1 ). We define also Zn = Xn − Yn , and ∞ [ Zn . Z= n=1 Note that (X − Z) ⊂ Y , since if x ∈ X and x 6∈ Y = Y1 , then x ∈ Z1 , so x ∈ Z. Moreover, if x ∈ Z, i.e., if there exists n such that x ∈ Zn , then φ(x) ∈ Zn+1 , so that φ(x) ∈ Z. PSfrag replaements 406 APPENDIX A. SET THEORY I In n+1 1 = 1 Æ Æ y 1 2 3 4 r5 r6 r r Z Z 1 2 3 Z r ::: r 1 +p r r X hmi e e~ h1i = Z h2i h4i h6i h3i hmd(m; n)i hmi hmm(m; n)i Figure A.4.1: The sets Z1 , Z2 , Z3 , . . . Now, define Ψ : X → X by Ψ(x) =   φ(x),  x, if x ∈ Z, if x 6∈ Z. Notice that Ψ(X) ⊆ Y , since if x ∈ Z, then Ψ(x) = φ(x) ∈ Y , and if x 6∈ Z, then x ∈ Y and Ψ(x) = x. We claim that Y ⊆ Ψ(X), so it follows that Ψ(X) = Y . To prove the claim, consider x ∈ Y and observe that if x 6∈ Z then x = Ψ(x) ∈ Ψ(X). On the other hand, if x ∈ Z ∩ Y , then x ∈ Zn for some n > 1 (obviously Y does not contain any element of Z1 ), and hence x ∈ φ(Zn−1 ), so that x ∈ Ψ(X) (see Figure A.4.1). Since Ψ(X) = Y , in order to complete the proof, we only need to show that Ψ is injective. Let x, y ∈ X and assume that x 6= y. Then: (i) If x, y ∈ Z, then Ψ(x) 6= Ψ(y), since Ψ = φ in Z, and φ is injective. (ii) If x, y 6∈ Z, then Ψ(x) 6= Ψ(y), since Ψ is the identity in X − Z. (iii) If x ∈ Z and y 6∈ Z, then Ψ(x) 6= Ψ(y), since Ψ(y) = y 6∈ Z and Ψ(x) = φ(x) ∈ Z. We conclude that X and Y are isomorphic. The construction of the set Y in the previous proof seems ingenious, but it is actually very simple, as we illustrate in the following example. A.4. INFINITE SETS 407 Example A.4.12. Let X = N0 = {n ∈ Z : n ≥ 0} and Y = N and φ(x) = x + 2. Then we find immediately that Xn = {k ∈ Z : k ≥ 2(n − 1)} and Yn = {k ∈ Z : k ≥ 2(n − 1) + 1}. It follows that Zn = {2(n − 1) : n ∈ N}, so Z is the set of even positive integers. The map Ψ : N0 → N in the previous proof is then given by:  if x = 2n,  x + 2, Ψ(x) =  x, if x = 2n + 1. This is obviously a bijection from N0 onto N. We can now show: Theorem A.4.13 (Schroeder-Bernstein). If |X| ≤ |Y | and |Y | ≤ |X|, then |X| = |Y |. Proof. Let φ : X → Y and ψ : Y → X be injective maps and Z = ψ(Y ). Note that ψ ◦ φ : X → Z is injective and since Z ⊆ X, the previous lemma shows that there exists a bijection Ψ : Z → X. The composition Ψ ◦ ψ : Y → X is the desired bijection. The Schroeder-Bernstein theorem is often used to show that two sets are isomorphic, without having to exhibit an explicit bijection between the sets. This is illustrated in the next example. Example A.4.14. Consider the sets X = N × N and Y = N. The map φ : N → N × N given by φ(n) = (n, 1) is injective. According to the Fundamental Theorem of Arithmetic the map ψ : N × N → N given by ψ(n, m) = 2n 3m is also injective. It follows from the Schroeder-Bernstein theorem that N and N×N are isomorphic. The following result follows easily from |N × N| = |N| and is left as an exercise: Proposition A.4.15. Let X, Xn and Y be countable sets. (i) If Y 6= ∅ and X is infinite, then |X ×Y | = |X|, i.e., X ×Y is countable. (ii) The set S∞ n=1 Xn is countable. 408 APPENDIX A. SET THEORY This proposition shows that the behavior of the notion of cardinal under forming unions and products of infinite sets is slightly peculiar. We state two general results concerning unions and products, illustrating this. We will omit the proofs of these statements, since we do not wish to get even more involved with technical issues in Set Theory (see, however, the exercises at the end of the section). Theorem A.4.16. If X is infinite and |Y | ≤ |X|, then: (i) |X ∪ Y | = |X|; (ii) |X × Y | = |X|, if Y 6= ∅. Exercises. 1. Show that Z and Q are countable. 2. Show that the intervals [0, 1], ]0, 1[, and [0, 1[ (in R) are all isomorphic to R. 3. Let X be a countable (finite or infinite) set. For each of the following examples determine if the set is countable (finite or infinite), justifying your answer. (a) The set X {0,1} of all maps f : {0, 1} → X. (b) The set X In of all maps f : In → X. S∞ (c) The set Y = n=1 X In . (d) The set X N of all maps f : N → X. (e) The set {0, 1}X of all maps f : X → {0, 1}. (f) The set of sequences f : N → X which are “eventually constant ”, i.e., for each sequence there exists N ∈ N such that n > N ⇒ f (n) = x ∈ X. (g) The set Pfin (X) of finite subsets of X. 4. Show that the set of polynomials with rational coefficients is countable. 5.SLet Xn , n = 1, 2, 3, . . . be countable sets. Show that the sets ∞ n=1 Xn are also countable. 6. Let XnQ , n = 1, 2, 3, . . . be countable sets. Show that ∞ When is n=1 Xn countable? QN n=1 SN n=1 Xn and Xn is countable. 7. Show that if X is infinite, Y ⊂ X and Y is finite, then |X| = |X − Y |. 8. Show that if X is infinite, Y ⊂ X, X − Y is infinite and Y is countable, then |X| = |X − Y |. Conclude that if X is uncountable and Y is countable, then |X ∪ Y | = |X|. A.4. INFINITE SETS 409 9. Let {0, 1}N be set of all binary sequences. Show that {0, 1}N is isomorphic to the interval [0, 1] ⊂ R, and hence isomorphic to R. 10. Let {0, 1}N be set of all binary sequences. Show that {0, 1}N × {0, 1}N is isomorphic to {0, 1}N, and conclude that Rn is isomorphic to R, for any natural number n. 11. Assume that |Xn | ≤ |R|. Show that | S∞ n=1 Xn | ≤ |R|. 12. Consider the sequence of sets X1 = N, Xn+1 = P(Xn ). Show that there exists a set Y such that |Y | > |Xn |, for every n ∈ N. 13. Let X be an infinite set and let Pfin (X) be set of finite subsets of X. Show that |X| = |Pfin (X)|. 410 APPENDIX A. SET THEORY References (not available yet) 411

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download lecture notes