ABSTRACT ALGEBRA I NOTES
MICHAEL PENKAVA

1. Peano Postulates of the Natural Numbers

1.1. The Principle of Mathematical Induction. The principle of mathematical induction is usually stated as follows:

Theorem 1.1. Let $P_n$ be a sequence of statements indexed by the positive integers n ∈ P. Suppose that
• $P_1$ is true.
• If $P_n$ is true, then $P_{n+1}$ is true.
Then $P_n$ is true for all n ∈ P.

This formulation makes mathematical induction a property of statements. At a deeper level, however, the principle is a property of the positive integers themselves. Let us state it as a property of the set of positive integers.

Theorem 1.2. Let S be a subset of P satisfying the following:
• 1 ∈ S.
• If n ∈ S then n + 1 ∈ S.
Then S = P.

We do not give a proof of either of these versions of the principle of mathematical induction. However, it is not difficult to show that the two versions are equivalent. That is to say, if the version described in terms of statements is true, then the version given in terms of subsets of the positive integers is true, and vice versa. Instead, we will give an axiomatic construction of the positive integers, including the notions of addition and multiplication of such numbers, in terms of what are called the Peano Postulates. Before giving this axiomatic construction, we will give some simple examples of how to use the principle of mathematical induction to prove some explicit formulae.

We begin with an apocryphal story about the mathematician Carl Friedrich Gauss, 1777–1855, who was one of the most significant contributors to modern mathematics. According to the story, Gauss was an elementary school student who was constantly disrupting his class, and his teacher decided to give him a task to occupy his time: adding up the numbers from 1 to 100. Unfortunately for his teacher, Gauss was able to give the answer immediately: "They sum to 5050".
In various versions of the story, his teacher doubted the answer, but Gauss was able to give a simple explanation for his result. There are 100 numbers from 1 to 100, and they can be divided into 50 pairs of numbers, each of which sums to 101: 1 and 100, 2 and 99, etc. Thus, the sum is 50 · 101 = 5050.

The above reasoning is certainly clever, but we can give a very general answer to the question of how to sum the numbers from 1 to n using mathematical induction. In general, mathematical induction can be used to prove a conjecture, but usually the conjecture itself cannot be discovered by inductive methods. It may seem strange that in order to prove a general result you first have to know the answer, but this is a deep mystery of mathematics: seeing what is true and being able to show it are very different activities. The statement of the sum formula is as follows.

Theorem 1.3. $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$.

Proof. We use the principle of mathematical induction. Let $P_n$ be the statement $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$. We first show that $P_1$ is true. To see this, note that if n = 1, the left hand side of $P_1$ is simply the sum from k = 1 to 1 of k, which is just 1. On the other hand, the right hand side of $P_1$ is $\frac{1(1+1)}{2}$, which is also equal to 1. Thus we have shown that $P_1$ is true.

Next, assume that $P_n$ is true. Now $P_{n+1}$ is the statement $\sum_{k=1}^{n+1} k = \frac{(n+1)(n+2)}{2}$. Let us compute

$$\sum_{k=1}^{n+1} k = n + 1 + \sum_{k=1}^{n} k = n + 1 + \frac{n(n+1)}{2} = \frac{2(n+1)}{2} + \frac{n(n+1)}{2} = \frac{(n+2)(n+1)}{2} = \frac{(n+1)(n+2)}{2}.$$

Notice that in the second equality above, we used the statement $P_n$. By the principle of mathematical induction, the statement $P_n$ is true for all n.

Exercise 1.4. Prove the following statements using mathematical induction.
(1) $\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}$.
(2) $\sum_{k=1}^{n} k^3 = \frac{n^2(n+1)^2}{4}$.
(3) 3n > 2n for all positive integers n.

1.2. The Peano Postulates.
The Peano postulates for the natural numbers were first given by the mathematician Giuseppe Peano, 1858–1932, in the year 1889. These axioms were the culmination of about a century of work in developing the notions of arithmetic as a system of formal reasoning. Here, we will give the axioms and constructions for the set P of positive integers. Peano's axioms were originally stated for the natural numbers N = {0, 1, . . . }.

Definition 1.5. The positive integers P is a set with a map s : P → P, called the successor map, satisfying:
(1) There is an element 1 ∈ P such that 1 ≠ s(n) for any n ∈ P.
(2) If s(m) = s(n) then m = n.
(3) If S is a subset of P such that
  • 1 ∈ S.
  • If n ∈ S, then s(n) ∈ S.
  Then S = P.

It is not hard to show that if P and P′ are two such sets, then there is a unique bijection φ : P → P′ such that φ(1) = 1 and φ(s(n)) = s(φ(n)) for all n ∈ P. This means that, in some sense, the Peano postulates uniquely determine the set of positive integers. The construction of all of the elementary arithmetic operations from the Peano postulates was given in the Principia Mathematica, a three volume tome written by the mathematicians Alfred North Whitehead, 1861–1947, and Bertrand Russell, 1872–1970, consisting of thousands of pages. Clearly, in a course on Abstract Algebra, there is not enough time to give this kind of in-depth treatment of elementary arithmetic, so we will only establish a few of the highlights of the material.

Theorem 1.6. If n ∈ P, then n ≠ s(n).

Proof. Let S be the set of all elements of P which are not equal to their successors, that is, all n ∈ P such that n ≠ s(n). If we can show that S = P, then the theorem is true. First we show that 1 ∈ S. This is true because, by the first postulate, 1 is not a successor of any element, so in particular 1 ≠ s(1). Now suppose that n ∈ S. Then n ≠ s(n). If s(n) = s(s(n)), then it would follow from the second postulate that n = s(n), since both n and s(n) would have the same successor. However, by assumption n ≠ s(n). Thus s(n) ≠ s(s(n)).
It follows that s(n) ∈ S. From this we conclude that S = P.

The proof above illustrates a common technique in the theory of arithmetic on P. We use the inductive property of the positive integers to show the property we wish to establish. We give one more example of such a proof.

Theorem 1.7. Let n ∈ P and suppose that n ≠ s(m) for any m ∈ P. Then n = 1. In other words, the only element of P which is not a successor is 1.

Proof. Let S be the set of all elements n ∈ P such that either n = 1 or n = s(m) for some m ∈ P. Notice that 1 ∈ S by the definition of S. Let us suppose that n ∈ S. Then s(n) ∈ S, since we have s(n) = s(m) where m = n. It follows that S = P. Hence if n ∈ P, then n ∈ S, so that if n ≠ 1, then n = s(m) for some m ∈ P. Therefore, if n ≠ s(m) for all m ∈ P, we must have n = 1.

Definition 1.8 (Recursion). A function f defined on P is said to be defined recursively if f is defined as follows. First, f(1) is explicitly given. Secondly, the value f(s(n)) is given by some rule that depends only on the value of f(n).

The principle of mathematical induction shows that a recursive definition gives a well-defined function, provided that the rule for f(s(n)) can always be evaluated. The rules for addition and multiplication of positive integers are given by recursive definitions.

Definition 1.9 (Definition of addition). Addition is a binary operation on P given by the following rules:
• m + 1 = s(m).
• m + s(n) = s(m + n).

Notice that there can be no conflict between the two rules because 1 is not a successor. The fact that addition is well-defined is an elementary exercise. One shows that, for each fixed m, the set S of all n such that m + n is defined satisfies the induction hypotheses, so is all of P.

From the definition of addition, we are able to show the properties of associativity and commutativity of addition. The order in which these two properties are established is quite important.
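The two rules of Definition 1.9 can be tried out on a computer. The following is a minimal sketch, not part of the original notes: it assumes a hypothetical encoding in which 1 is the empty tuple and s(n) wraps n in a one-element tuple, and the helper names `from_int` and `to_int` are invented purely for reading results.

```python
# Hypothetical encoding (an assumption of this sketch):
# the numeral 1 is the empty tuple, and s(n) is the 1-tuple containing n.
ONE = ()


def s(n):
    """Successor map: wrap n in one more tuple layer."""
    return (n,)


def add(m, n):
    """Addition by the two rules: m + 1 = s(m) and m + s(k) = s(m + k)."""
    if n == ONE:
        return s(m)
    (k,) = n  # n is not 1, so it is a successor: n = s(k)
    return s(add(m, k))


def from_int(i):
    """Build the numeral for a Python int i >= 1 (used only for testing)."""
    n = ONE
    for _ in range(i - 1):
        n = s(n)
    return n


def to_int(n):
    """Read a numeral back as a Python int by counting successor layers."""
    count = 1
    while n != ONE:
        (n,) = n
        count += 1
    return count


print(to_int(add(from_int(3), from_int(2))))  # prints 5
```

The recursion in `add` is on the second argument only, mirroring the way the definition fixes m and recurses on n.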
One of the difficulties that Whitehead and Russell encountered in writing the Principia Mathematica was that there is a certain natural order in which the properties need to be established, and the difficulty is determining that natural order.

Theorem 1.10 (Associativity of addition). (a + b) + c = a + (b + c), for all positive integers a, b and c.

Proof. The first difficulty one has to overcome in this proof is that there are three variables, but mathematical induction gives conditions for a subset S of P to be all of P. This means we should somehow reduce our proof to a one variable proof. One way to do this is to imagine that a and b are fixed numbers, and to show that the set S consisting of all c ∈ P such that (a + b) + c = a + (b + c) is all of P.

First we show that 1 ∈ S. To see this, note that (a + b) + 1 = s(a + b) by the first rule of addition. Secondly, a + (b + 1) = a + s(b) = s(a + b), by the second rule of addition. It follows that (a + b) + 1 = s(a + b) = a + (b + 1). This shows that 1 ∈ S.

Next, suppose that c ∈ S, so that (a + b) + c = a + (b + c). Then
(a + b) + s(c) = s((a + b) + c) = s(a + (b + c)) = a + s(b + c) = a + (b + s(c)).
But this means that s(c) ∈ S. By the inductive principle of the positive integers, we have S = P. Finally, we note that although we fixed a and b to establish this property for c, we did not use any special properties of a and b, so the formula for associativity holds for all positive integers a, b and c.

Theorem 1.11 (Commutativity of addition). For all positive integers m, n ∈ P, m + n = n + m.

Sometimes it helps to prove a technical or special case of a theorem, which will help in the general proof, as a separate result. Such a result is usually called a lemma. Of course, a lemma is a theorem, but we usually reserve the word lemma for results which are primarily useful in proving a more important result. However, there are cases where an important result is also called a lemma, so one has to be careful.
Lemma 1.12. For all positive integers m, m + 1 = 1 + m.

Proof of the lemma. Let S be the subset of all m ∈ P such that m + 1 = 1 + m. Evidently 1 ∈ S, since 1 + 1 = 1 + 1. Now suppose that m ∈ S. Then
1 + s(m) = s(1 + m) = s(m + 1) = m + s(1) = m + (1 + 1) = (m + 1) + 1 = s(m) + 1.
Thus, by induction S = P, and the lemma holds.

Notice that we used associativity in the proof of this lemma, so it was important that the associative law of addition was established first.

Proof of the theorem. Fix m ∈ P. Let S be the subset of all n ∈ P such that m + n = n + m. Then by the lemma, 1 ∈ S. Suppose that n ∈ S. Then m + n = n + m. As a consequence,
m + s(n) = s(m + n) = s(n + m) = n + s(m) = n + (m + 1) = n + (1 + m) = (n + 1) + m = s(n) + m.
Thus S = P and the commutative law of addition holds.

Definition 1.13 (Definition of multiplication). Multiplication is a binary operation on P given by the following rules:
• m · 1 = m.
• m · s(n) = m · n + m.

There are two properties of multiplication, associativity and commutativity, and a property involving addition and multiplication called the distributive law.

Theorem 1.14 (The distributive law). For all a, b and c in P we have a · (b + c) = a · b + a · c.

Proof. Once again, we prove this result by fixing a and b and showing that the set S of all c ∈ P for which the equation above holds satisfies the induction hypotheses. First, if c = 1, we note that a · (b + 1) = a · s(b) = a · b + a = a · b + a · 1, so 1 ∈ S. Suppose now that c ∈ S. Then
a · (b + s(c)) = a · (b + (c + 1)) = a · ((b + c) + 1) = a · (b + c) + a · 1 = (a · b + a · c) + a · 1 = a · b + (a · c + a · 1) = a · b + a · s(c).
Thus s(c) ∈ S, so by induction S = P.

Actually, there are two distributive laws. The one stated above is often called the left distributive law. The right distributive law is stated as follows: (a + b) · c = a · c + b · c. When the commutative law of multiplication holds, the right distributive law follows directly from the left distributive law.
However, many structures with addition and multiplication do not have a commutative multiplication, so in those cases, the left and right distributive laws do not follow directly from each other, and other methods of proof are necessary. It also may seem strange that we first proved the distributive law, instead of the laws involving multiplication alone, but it will emerge that we use the distributive law in proving the other properties. An interesting feature of the proof of the distributive law is that all the work seems to be in moving parentheses around. This is a key feature in proofs in algebra.

Exercise 1.15. Prove the right distributive law: For all positive integers a, b and c, we have (a + b) · c = a · c + b · c.

Theorem 1.16 (Associative law of multiplication). For all a, b and c in P, we have a · (b · c) = (a · b) · c.

Proof. As usual, we fix a and b and show that the set S of all c ∈ P such that a · (b · c) = (a · b) · c is all of P. Now a · (b · 1) = a · b = (a · b) · 1, so 1 ∈ S. Now suppose c ∈ S. Then
(a · b) · s(c) = (a · b) · c + (a · b) = a · (b · c) + a · b = a · (b · c + b) = a · (b · s(c)).
Thus s(c) ∈ S, so S = P.

Notice that in the proof of the associative law of multiplication we used the left distributive law. Finally, we are ready to prove the commutative law of multiplication.

Theorem 1.17. For all positive integers a and b we have a · b = b · a.

To simplify the proof, we first state and prove the following lemma.

Lemma 1.18. For all positive integers a, we have a · 1 = 1 · a.

Proof of the lemma. Let S be the set of all positive integers a such that a · 1 = 1 · a. Then 1 ∈ S because 1 · 1 = 1 · 1. Now suppose that a ∈ S. Then
1 · s(a) = 1 · a + 1 = a · 1 + 1 · 1 = (a + 1) · 1 = s(a) · 1.
Thus s(a) ∈ S, so S = P.

Proof of the theorem. Fix a and let S be the set of all b ∈ P such that a · b = b · a. Then by the lemma, 1 ∈ S. Suppose now that b ∈ S. Then
a · s(b) = a · b + a · 1 = b · a + 1 · a = (b + 1) · a = s(b) · a.
This shows that S = P, so the commutative law of multiplication holds for the positive integers.

Next, we introduce the notion of inequality for the positive integers.

Definition 1.19 (Definition of inequality). We say that a < b, a is less than b, precisely when there is some c ∈ P such that b = a + c.

Although we won't develop the properties of inequalities, we point out that the usual properties of inequalities involving positive integers can all be established using the properties of addition and multiplication which we have developed thus far. To illustrate this principle, we state and prove the following theorem.

Theorem 1.20. If a < b then a + c < b + c for any c ∈ P.

Proof. Suppose that a < b. Then there is some x ∈ P such that b = a + x. Thus b + c = (a + x) + c = a + (x + c) = a + (c + x) = (a + c) + x. It follows that a + c < b + c.

1.3. Well Ordering and Strong Induction.

Definition 1.21. A set X is ordered provided it is equipped with a binary relation < satisfying:
(1) If a, b ∈ X, then exactly one of the following holds:
  • a < b.
  • a = b.
  • b < a.
(2) If a < b and b < c then a < c.
For an ordered set X, we write a ≤ b if a < b or a = b.

Definition 1.22. An ordered set X satisfies the Principle of Strong Induction if, given any subset S of X which satisfies:
• If x ∈ S for all x < n, then n ∈ S,
we have S = X. A subset of an ordered set X is said to be strongly inductive if it satisfies the condition above. One can restate the principle of strong induction in the form: X satisfies the principle of strong induction if every strongly inductive subset is all of X.

Theorem 1.23 (Strong Induction). P satisfies the principle of strong induction.

Proof. Let S ⊆ P be a strongly inductive subset of P. We need to show that S = P. To see this, we will show that a certain subset of S is already all of P. Let Y be the subset of S consisting of all elements n ∈ S such that x ∈ S for all x < n. We show that Y satisfies the inductive hypotheses.
First, note that x ∈ S for all x < 1, since there are no such values of x. Therefore 1 ∈ S. Furthermore, it is clear that 1 ∈ Y as well. Next, suppose that n ∈ Y. Then for all x < n, x ∈ S, and since n ∈ S, it follows that for all x < s(n), x ∈ S. Thus s(n) ∈ S. It follows that s(n) ∈ Y. Since Y satisfies the hypotheses of induction, Y = P. It follows that S = P as well.

Definition 1.24. If X is an ordered set, and Q is a subset of X, then c is called a least element of Q if c ∈ Q and c ≤ x for all x ∈ Q. An ordered set X is well ordered or satisfies the least element property provided that any nonempty subset Q of X has a least element.

Theorem 1.25. The set P satisfies the least element property.

Proof. Let Q be a subset of P which does not have a least element, and let S be the subset of P consisting of all x ∈ P such that y ∉ Q for all y ≤ x. We show that S satisfies the hypothesis of strong induction, which implies it is all of P. Suppose that x ∈ S for all x < n. Then x ∉ Q for all x < n. If n ∈ Q, it would be the least element of Q. Thus n ∉ Q, so n ∈ S. Thus S must be all of P. Consequently Q is empty, so every nonempty subset of P has a least element.

2. Equivalence of forms of induction and well ordering

Both the Principle of Strong Induction and the Well Ordering Principle refer only to an ordering on a set X. The Principle of Mathematical Induction which we gave as part of the Peano Postulates, also known as weak induction, requires a successor operation, and there must be a connection between the ordering and the successor operation. We have already shown that the set of positive integers, with the ordering given by the construction from the Peano postulates, satisfies the Well Ordering Principle and the Principle of Strong Induction.

Theorem 2.1. Let X be an ordered set. Then X is well ordered if and only if it satisfies the principle of strong induction.

Proof. We show that well ordering implies the principle of strong induction. We leave the reverse direction as an exercise.
Suppose that X is well ordered and S is a strongly inductive subset of X. We must show that S = X. Let Q be the complement of S. It is enough to show that Q must be the empty set. Suppose that it is not empty. Then Q has a least element c. It follows that for all x < c, x is not an element of Q, which means that x is in S. Thus for all x < c, x ∈ S. Since S is strongly inductive, it follows that c ∈ S. But this contradicts the fact that c ∈ Q. This shows that Q is empty.

Exercise 2.2. Show that an ordered set satisfying the principle of strong induction is well ordered.

It can be shown that every set X can be well-ordered, using the axiom of choice, which is an axiom of a certain set theory, called Zermelo-Fraenkel set theory with Choice, often denoted as ZFC. To understand this construction would take us too far into the realm of set theory for this course. However, we note that if X is well ordered, then the principle of strong induction holds, by the theorem above. Transfinite Induction refers to proofs using the principle of strong induction on a well ordered set. Since every set can be well ordered, transfinite induction can be used to prove many interesting results in set theory; in particular, it is used to study ordinal numbers.

3. The Division Algorithm

From the positive integers, the integers are constructed in a straightforward manner, and all of the usual properties of addition, multiplication and inequalities can be established in a routine manner. Nevertheless, the construction involves a lot of detail and would take too long to carry out in this course. We will assume that all of these basic properties have been shown, and will begin our analysis of the integers with the division algorithm.

Theorem 3.1. Suppose that m, n ∈ Z and m ≠ 0. Then there are unique q, r ∈ Z such that 0 ≤ r < |m| and n = qm + r.

Proof. We first show uniqueness of q and r. Suppose that n = mq + r and n = mq′ + r′, where 0 ≤ r < |m| and 0 ≤ r′ < |m|.
If r = r′, it follows that mq = mq′, so m(q − q′) = 0. By the zero product property of the integers, either q − q′ = 0 or m = 0. Since we have explicitly assumed that m ≠ 0, it follows that q − q′ = 0, so q = q′. Now, let us assume that r ≠ r′. Then we can assume without loss of generality that r′ > r, so that r′ − r > 0. But m(q − q′) = r′ − r, so |m||q − q′| = r′ − r. However r′ − r < |m| − r ≤ |m|, but |m||q − q′| ≥ |m| unless q = q′. It follows that q = q′, so r = r′, contradicting our assumption. This proves uniqueness.

We will use the least element property of P to prove the existence of q and r satisfying the properties. If n = mq for some q ∈ Z, we may take r = 0, so assume that n is not a multiple of m. Let X = {n − mq | q ∈ Z} ∩ P. Because m ≠ 0, X ≠ ∅. Therefore X has a least element r. We have n = mq + r for some q. Suppose that r ≥ |m|. Then r′ = r − |m| ≥ 0. If m > 0, then n = mq + r = m(q + 1) + r′, and if m < 0 then n = m(q − 1) + r′. In either case r′ ≠ 0, because n is not a multiple of m, so r′ ∈ X. But this contradicts the fact that r is the least element of X, since r′ < r. Hence 0 < r < |m|.

Definition 3.2. Let a, b ∈ Z. We say that a divides b, and denote this by a|b, provided that there is some integer x such that ax = b. Note that a|b is a statement, not a number.

Definition 3.3. Let m, n ∈ Z. Then c is called a greatest common divisor of m and n provided that
(1) c|m and c|n.
(2) If d|m and d|n then d|c.

Notice that we did not define the greatest common divisor. In fact, in general, a greatest common divisor is only determined up to multiplication by ±1, as we shall show. However, this fact does allow us to define the greatest common divisor as the unique greatest common divisor which is nonnegative, which is exactly what most textbooks do. Note also that the definition of a greatest common divisor does not imply that such a thing exists. It simply gives a criterion for determining whether a number c is a greatest common divisor. It is common to write c = gcd(m, n) to express that c is a greatest common divisor of m and n, even though there is some ambiguity about c.
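As a hedged illustration of Theorem 3.1 and Definition 3.3, the following Python sketch computes the pair (q, r) with 0 ≤ r < |m| for either sign of m, and tests the two gcd conditions by brute force. The function names `division`, `divides` and `is_gcd` are hypothetical and are not part of the notes. The brute-force check is only meant to make the definition concrete; it is not an efficient method.

```python
def division(n, m):
    """Return (q, r) with n == q*m + r and 0 <= r < |m|, for m != 0."""
    if m == 0:
        raise ValueError("m must be nonzero")
    # With a positive divisor, Python's divmod returns 0 <= r < |m|.
    q, r = divmod(n, abs(m))
    if m < 0:
        q = -q  # n = q*|m| + r = (-q)*m + r when m is negative
    return q, r


def divides(a, b):
    """a | b: there is an integer x with a*x == b."""
    if a == 0:
        return b == 0
    return b % a == 0


def is_gcd(c, m, n):
    """Check both conditions of the gcd definition by brute force."""
    if m == 0 and n == 0:
        # Every integer (including 0) divides both, so c must be 0.
        return c == 0
    bound = max(abs(m), abs(n))
    common = [d for d in range(-bound, bound + 1)
              if divides(d, m) and divides(d, n)]
    return (divides(c, m) and divides(c, n)
            and all(divides(d, c) for d in common))


print(division(-7, 3))    # (-3, 2), since -7 = (-3)*3 + 2
print(division(7, -3))    # (-2, 1), since 7 = (-2)*(-3) + 1
print(is_gcd(6, 78, 30), is_gcd(-6, 78, 30))  # True True
```

The last line shows the ambiguity discussed above: both 6 and −6 satisfy the definition of a greatest common divisor of 78 and 30.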
Proposition 3.4. Suppose that a|b and b|a. Then b = ±a.

Proof. Let a|b and b|a. Then there are x, y ∈ Z such that b = ax and a = by. It follows that b = byx, so b(1 − yx) = 0. If b = 0, then a = 0, so b = a. Otherwise we must have 1 − yx = 0, so xy = 1. In particular, x has a multiplicative inverse. But the only integers which have a multiplicative inverse are ±1, so x = ±1, and b = ±a.

Theorem 3.5. Let c and d be two greatest common divisors of m and n. Then d = ±c.

Proof. Since c is a gcd of m and n, we have c|m and c|n. Since d is a gcd of m and n, it follows that c|d. Similarly, d|c. Thus, according to Proposition 3.4, d = ±c.

Proposition 3.6. Let m ∈ Z. Then
(1) gcd(m, 0) = m.
(2) gcd(m, 1) = 1.

Exercise 3.7. Prove Proposition 3.6.

Definition 3.8. Let m, n ∈ Z. Then m and n are said to be relatively prime if gcd(m, n) = 1. In other words, 1 is a greatest common divisor of m and n.

Notice that m and 1 are relatively prime for any m ∈ Z, by Proposition 3.6. Now we will show that given m, n ∈ Z, there is always a greatest common divisor of m and n. In other words, greatest common divisors exist!

Theorem 3.9. Let m, n ∈ Z, and suppose that n ≠ 0. Let X = {rm + sn | r, s ∈ Z} ∩ P. Then X has a least element c, and this least element is a greatest common divisor of m and n. Moreover, for any m, n ∈ Z, if c is a gcd of m and n, then c = rm + sn for some r, s ∈ Z.

Proof. Since n ≠ 0, |n| ∈ P. Moreover, |n| = sn where s = 1 or s = −1. Thus |n| = 0 · m + sn ∈ X, so X is nonempty. As a consequence, it has a least element c, and since c ∈ X, c = rm + sn for some r, s ∈ Z. Since c ≠ 0, there are unique q, d such that 0 ≤ d < c and m = cq + d. But then d = m − cq = m − (rm + sn)q = (1 − rq)m + (−sq)n. If d > 0, it follows that d ∈ X and d < c, which contradicts our assumption that c is the least element of X. Thus d = 0, so m = cq. Thus c|m. Similarly, c|n.

Now suppose that d ∈ Z satisfies d|m and d|n. Then m = xd and n = yd for some x, y ∈ Z.
Thus c = rxd + syd = (rx + sy)d. It follows that d|c. Thus c is a gcd of m and n. Finally, from what we have shown, when n ≠ 0, we have constructed a gcd c of m and n which satisfies c = rm + sn for some r, s ∈ Z. If d is another gcd of m and n, then either d = c or d = −c. But −c = (−r)m + (−s)n, so d can be expressed in the required form. We still have to address the case when n = 0, but then gcd(m, n) = m, so any gcd of m and n is of the form rm + sn where r = ±1 and s = 0.

Corollary 3.10. Let m, n ∈ Z. Then m and n are relatively prime if and only if there are r, s ∈ Z such that 1 = rm + sn. In other words, we can express 1 as a linear combination of m and n.

Proof. If m and n are relatively prime, then 1 is a gcd of m and n. Thus, by the theorem, 1 = rm + sn for some r, s ∈ Z. On the other hand, suppose 1 = rm + sn for some r, s ∈ Z. Now, by the theorem, the least element in X = {rm + sn | r, s ∈ Z} ∩ P is a gcd of m and n, and by assumption, 1 ∈ X. It follows that 1 must be the least element in X, so 1 is a gcd of m and n.

Theorem 3.11 (Euclidean Algorithm). Suppose that n, m ∈ Z, and n = mq + r. Then gcd(m, n) = gcd(m, r).

Proof. Let c = gcd(m, n) and d = gcd(m, r). Then m = xd and r = yd for some x, y ∈ Z. Thus n = mq + r = (qx + y)d, so d|n. Since d|m and c is a gcd of m and n, it follows that d|c. Next, note that m = uc and n = vc for some u, v ∈ Z, so r = n − mq = (v − uq)c. It follows that c|r and c|m, so c|d. Therefore d = ±c. Thus gcd(m, n) = gcd(m, r).

It may seem that the Euclidean algorithm is not an algorithm at all, since it does not tell one how to compute the gcd of m and n. The trick is to notice that if we first express n = mq + r with 0 ≤ r < |m|, and then we express $m = q_1 r + r_1$ with $0 ≤ r_1 < r$, and continue this process, we obtain a decreasing sequence of elements $r > r_1 > \cdots > r_n$. Eventually, this process must terminate with some $r_{n+1} = 0$.
But we have $\gcd(m, n) = \gcd(m, r) = \gcd(r, r_1) = \cdots = \gcd(r_n, r_{n+1}) = r_n$, since $r_{n+1} = 0$. Thus, the Euclidean algorithm computes the gcd of m and n. In fact, the Euclidean algorithm is efficient in this computation. Moreover, we can adapt the Euclidean algorithm to find numbers x and y so that the gcd c of m and n satisfies c = xm + yn. Dr. Nick Passell, a professor emeritus of the department of mathematics at the University of Wisconsin-Eau Claire, developed an efficient algorithm, which we illustrate below.

Let us find the gcd c of 78 and 30, as well as x and y such that c = 30x + 78y. First make a table with 4 columns, with headings r, −q, m and n. We will use it to keep track, in each row, of how the element r can be expressed as a linear combination of m and n. For simplicity, we start with the largest element n = 78, and the first row expresses that it is zero times m = 30 plus 1 times n. In the next row, before filling in the −q column, first note that m = 1 · m + 0 · n, so put a 1 in the m column and a 0 in the n column. Now, note that when we use the division algorithm to express n = mq + r, with 0 ≤ r < |m|, we have q = 2, so write −2 in the −q column, and put r = 18 in the r column in the next row. To figure out the m column in this new row, add the m column from 2 rows above and −q times the m column in the row above, and do similarly for the n column. Then we begin again by figuring out how to express 30 in the form 30 = 18 · q + r. We write the −q, which in this case is −1, in the −q column, and proceed as before. In each case, we determine the m column by adding the value in the m column two rows above plus −q times the value in the m column in the row above, and similarly for the n column. Finally, when the number c in the r column divides the number in the r column in the row above, that c is the gcd, and the numbers we calculate in the m and n columns become the x and y so that c = xm + yn.
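The bookkeeping just described can be sketched in Python as follows. This is only an illustrative reading of the procedure, assuming m and n are positive integers; the function name `gcd_table` and the row layout [r, −q, x, y] are choices made for this sketch, not notation from the notes.

```python
def gcd_table(m, n):
    """Build the rows [r, -q, x, y] of the table, where r = x*m + y*n.

    Assumes m and n are positive integers. The -q entry of a row is
    filled in when that row's r is used as a divisor; it stays None in
    the first row and in the final row.
    """
    rows = [[n, None, 0, 1],   # n = 0*m + 1*n
            [m, None, 1, 0]]   # m = 1*m + 0*n
    # Stop as soon as the last r divides the r in the row above it.
    while rows[-2][0] % rows[-1][0] != 0:
        (r0, _, x0, y0), (r1, _, x1, y1) = rows[-2], rows[-1]
        q, r = divmod(r0, r1)
        rows[-1][1] = -q
        # New coefficients: value two rows above plus (-q) times the row above.
        rows.append([r, None, x0 - q * x1, y0 - q * y1])
    return rows


table = gcd_table(30, 78)
c, _, x, y = table[-1]
print(c, x, y)  # prints 6 -5 2, i.e. 6 = -5*30 + 2*78
```

Each row keeps the invariant r = x·m + y·n, which is why the last row delivers the coefficients along with the gcd.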
The complete calculation is given in the table below.

   r   −q    m    n
  78         0    1
  30   −2    1    0
  18   −1   −2    1
  12   −1    3   −1
   6        −5    2

From this calculation we see that 6 is the gcd of 78 and 30, and that 6 = −5 · 30 + 2 · 78.

Definition 3.12. An element a ∈ Z is called a unit if a has a multiplicative inverse. Of course we already know that the units in Z are precisely the numbers ±1.

Definition 3.13. Let p ∈ Z and suppose that p is not zero and not a unit. Then
• p is said to be irreducible if whenever p = ab then either a or b is a unit.
• p is said to be prime if whenever p|ab then p|a or p|b.

We will show that the notions of primeness and irreducibility coincide for the integers.

Theorem 3.14. Let p ∈ Z be prime. Then p is irreducible.

Proof. Suppose that p is prime and p = ab. Then p|ab so either p|a or p|b. Suppose that p|a. Then a = px for some x and thus p = pxb. It follows that p(1 − xb) = 0. Since p ≠ 0, we must have xb = 1, so b is a unit. Similarly, if p|b, then we can show that a is a unit. It follows that p is irreducible.

Theorem 3.15. Let p be irreducible and a ∈ Z. Then either gcd(p, a) = 1 or p|a.

Proof. Let c = gcd(p, a). Then c|p and c|a, so p = cx and a = cy for some x, y ∈ Z. Since p is irreducible, either c or x is a unit. If c is a unit, then gcd(p, a) = 1. Otherwise, x is a unit, so $a = cy = p x^{-1} y$. Thus p|a.

Theorem 3.16. Suppose that p is irreducible. Then p is prime. As a consequence, p is prime if and only if p is irreducible.

Proof. Suppose p is irreducible and p|ab. Then ab = xp for some x ∈ Z. If p ∤ a, then gcd(p, a) = 1, so 1 = rp + sa for some r, s ∈ Z. Thus b = brp + sab = brp + sxp = (br + sx)p. It follows that p|b. Thus p is prime.

Proposition 3.17. Suppose that a and b are relatively prime and that a|bx. Then a|x.

Exercise 3.18. Prove the above proposition.

4. Modular Arithmetic

Modular arithmetic is also called clock arithmetic, because the rules of addition resemble the rules for addition on a clock.
In order to give a rigorous definition, we will first introduce the notion of an equivalence relation. A relation on a set X is a subset of X × X, that is, a set of pairs (a, b) with a, b ∈ X. If we have a relation, we often denote it by introducing some symbol R, and write xRy to mean that (x, y) lies in the relation. For example, the relation equality is given by the symbol "=" and we write a = b to mean that (a, b) lies in the relation equality. Other examples of relations given by symbols are <, ≤, ⊆. If ∼ is the symbol of a relation, we will usually just call the relation ∼, rather than say that it is the symbol of the relation.

Definition 4.1. Suppose ∼ is a relation on a set X. Then ∼ is called an equivalence relation provided that
(1) a ∼ a for all a ∈ X. (Reflexivity)
(2) If a ∼ b then b ∼ a. (Symmetry)
(3) If a ∼ b and b ∼ c then a ∼ c. (Transitivity)

Definition 4.2. If ∼ is an equivalence relation on X and b ∈ X, then the equivalence class of b, denoted by $\bar{b}$, is $\bar{b} = \{a \in X \mid a \sim b\}$. The set of all equivalence classes of elements in X is denoted by X/∼ or sometimes $\bar{X}$.

Theorem 4.3. Let ∼ be an equivalence relation on X. Then the following properties hold:
(1) If a ∈ X, then $a \in \bar{a}$. Thus $\bar{a} \neq \emptyset$.
(2) If $\bar{a} \cap \bar{b} \neq \emptyset$, then $\bar{a} = \bar{b}$.
(3) $\bigcup \{\bar{a} \mid a \in X\} = X$.

Proof. Since a ∼ a by reflexivity, it follows that $a \in \bar{a}$. Thus $\bar{a} \neq \emptyset$. Suppose that $x \in \bar{a} \cap \bar{b}$. Then x ∼ b and x ∼ a. Then by symmetry, b ∼ x. Let $y \in \bar{b}$. Then y ∼ b, and by transitivity y ∼ x, and applying the transitive rule a second time, we have y ∼ a. It follows that $y \in \bar{a}$. This shows $\bar{b} \subseteq \bar{a}$. By a similar argument $\bar{a} \subseteq \bar{b}$. Thus we must have $\bar{a} = \bar{b}$. Finally, let x ∈ X. Then $x \in \bar{x}$, so $x \in \bigcup \{\bar{a} \mid a \in X\}$. It follows that $\bigcup \{\bar{a} \mid a \in X\} = X$.

Definition 4.4. Let X be a set and C be a collection of subsets of X. Then C is said to be a partition of X provided that
(1) If A ∈ C, then A ≠ ∅.
(2) If A and B are in C, and A ∩ B ≠ ∅, then A = B.
(3) If x ∈ X then x ∈ A for some A ∈ C.

Theorem 4.5.
If ∼ is an equivalence relation on a nonempty set X, then the collection $\bar{X}$ is a partition of X.

Exercise 4.6. Prove the above theorem.

Definition 4.7. Let n ∈ Z and define a relation on Z by x = y (mod n) if y − x = kn for some k ∈ Z.

Theorem 4.8. The relation = (mod n) is an equivalence relation.

Proof. First, note that a = a (mod n), because a − a = 0 = 0 · n. Suppose that a = b (mod n), so b − a = kn for some k ∈ Z. But then a − b = (−k)n, which shows that b = a (mod n). Finally, suppose that a = b (mod n) and b = c (mod n). Then b − a = kn and c − b = ln for some k, l ∈ Z. Thus c − a = (c − b) + (b − a) = ln + kn = (l + k)n. It follows that a = c (mod n).

Definition 4.9. For the equivalence relation = (mod n), the set of equivalence classes is denoted by $Z_n$. (Some authors denote it by Z/nZ.)

Theorem 4.10. There is a well-defined binary operation + on $Z_n$ given by $\bar{a} + \bar{b} = \overline{a + b}$. Moreover, this operation satisfies the following properties.
(1) $\bar{a} + (\bar{b} + \bar{c}) = (\bar{a} + \bar{b}) + \bar{c}$. (Associativity)
(2) $\bar{a} + \bar{b} = \bar{b} + \bar{a}$. (Commutativity)
(3) $\bar{a} + \bar{0} = \bar{a}$. (Existence of additive identity)
(4) $\bar{a} + \overline{-a} = \bar{0}$. (Existence of additive inverse)

Proof. It turns out that the hard part is showing that the addition is well defined. What causes the problem is that the set $\bar{a}$ does not determine the element a. So what the operation actually says is that to add the two sets, take arbitrary elements a and b out of the sets and form the set $\overline{a + b}$. The problem is that we need to show that the set $\overline{a + b}$ does not depend on the choice of a and b. To do this, let $a_1 \in \bar{a}$ and $b_1 \in \bar{b}$. We need to show that $\overline{a_1 + b_1} = \overline{a + b}$. Now $a_1 = a$ (mod n), and $b_1 = b$ (mod n), so $a - a_1 = kn$ and $b - b_1 = ln$ for some k, l ∈ Z. It follows that
$(a + b) - (a_1 + b_1) = a - a_1 + b - b_1 = kn + ln = (k + l)n$.
Thus $a + b = a_1 + b_1$ (mod n). It follows that $a + b \in \overline{a_1 + b_1}$, and since $a + b \in \overline{a + b}$, we see that $\overline{a_1 + b_1} \cap \overline{a + b} \neq \emptyset$. Therefore, $\overline{a_1 + b_1} = \overline{a + b}$. This shows that addition is well defined.
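The well-definedness argument can be checked numerically in a small case. The sketch below is only an illustration under stated assumptions: n is taken to be positive, each class is stored as its representative in the range 0, …, n − 1, and the names `cls` and `add_classes` are hypothetical.

```python
def cls(a, n):
    """Representative in 0..n-1 of the class of a mod n (n > 0 assumed)."""
    return a % n


def add_classes(a, b, n):
    """Add the classes of a and b by adding representatives."""
    return cls(a + b, n)


n = 12
a, b = 7, 9
expected = add_classes(a, b, n)
# Replace a and b by other members a + k*n and b + l*n of the same classes;
# the class of the sum should not change.
same = all(add_classes(a + k * n, b + l * n, n) == expected
           for k in range(-3, 4) for l in range(-3, 4))
print(expected, same)  # prints 4 True
```

With n = 12 this is the clock picture: 7 o'clock plus 9 hours lands on 4, whichever representatives are used.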
Now, to show the associative law, we proceed as follows: $\bar{a} + (\bar{b} + \bar{c}) = \bar{a} + \overline{b + c} = \overline{a + (b + c)} = \overline{(a + b) + c} = \overline{a + b} + \bar{c} = (\bar{a} + \bar{b}) + \bar{c}$. To show commutativity: $\bar{a} + \bar{b} = \overline{a + b} = \overline{b + a} = \bar{b} + \bar{a}$. Next, we compute $\bar{a} + \bar{0} = \overline{a + 0} = \bar{a}$. Finally, $\bar{a} + \overline{-a} = \overline{a + (-a)} = \bar{0}$.

Theorem 4.11. There is a well defined binary operation · on Z_n, called multiplication, given by $\bar{a} \cdot \bar{b} = \overline{ab}$. This operation satisfies the following properties:
(1) $\bar{a} \cdot (\bar{b} \cdot \bar{c}) = (\bar{a} \cdot \bar{b}) \cdot \bar{c}$. (Associativity)
(2) $\bar{a} \cdot \bar{b} = \bar{b} \cdot \bar{a}$. (Commutativity)
(3) $\bar{a} \cdot (\bar{b} + \bar{c}) = \bar{a} \cdot \bar{b} + \bar{a} \cdot \bar{c}$. (Distributive Law)
(4) $\bar{a} \cdot \bar{1} = \bar{a}$. (Existence of a multiplicative identity)

Proof. As usual, well definedness is the hard part. Suppose that a_1 = a (mod n) and b_1 = b (mod n). We need to show that a_1 b_1 = ab (mod n). Now a_1 = a + kn and b_1 = b + ln for some k, l ∈ Z. Thus a_1 b_1 − ab = a_1 b_1 − a_1 b + a_1 b − ab = a_1 (b_1 − b) + (a_1 − a)b = a_1 ln + knb = (a_1 l + kb)n. Thus a_1 b_1 = ab (mod n) and it follows that multiplication is well defined. The properties are straightforward to show and are left as an exercise.

Theorem 4.12. Let a ∈ Z. Then $\bar{a}$ is a unit in Z_n precisely when gcd(a, n) = 1. In that case, if we express 1 = xa + yn, then $(\bar{a})^{-1} = \bar{x}$. In particular, Z_p is a field if and only if p is prime.

Exercise 4.13. Prove the theorem above.

Theorem 4.14 (Freshmen Exponentiation). Let p ∈ P be prime and a, b ∈ Z. Then (a + b)^p = a^p + b^p (mod p).

Proof. Recall the binomial theorem for n ∈ P: $(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k$. Note that $\binom{n}{k} = \frac{n!}{k!(n-k)!}$, and that $\binom{n}{k} \in P$. When n = p is prime, we note that for any 1 ≤ x < p, we have gcd(x, p) = 1. But this means that gcd(k!, p) = 1 and gcd((p − k)!, p) = 1 if 1 ≤ k < p. Therefore gcd(k!(p − k)!, p) = 1 if 1 ≤ k < p, and since (k!(p − k)!) | p! and p! = p · (p − 1)!, it follows that (k!(p − k)!) | (p − 1)!. But this means that $p \mid \binom{p}{k}$ and thus $\binom{p}{k} = 0$ (mod p) for 1 ≤ k < p. It follows that every term in the binomial formula is equal to zero mod p except for the terms with k = 0 and k = p.
But the term corresponding to k = 0 is a^p and the term corresponding to k = p is b^p. This gives the exponentiation formula in the theorem.

Theorem 4.15 (Fermat's Little Theorem). Suppose that p ∈ P is prime. Then if a ∈ Z, a^p = a (mod p). In particular, if gcd(a, p) = 1, then a^{p−1} = 1 (mod p).

Proof. We first show the statement is true whenever a ∈ P. For a = 1, the statement is trivial. Suppose that a^p = a (mod p). Then, by Theorem 4.14, (a + 1)^p = a^p + 1^p = a + 1 (mod p). Thus by induction, we see that the statement is true for all a ∈ P. Next, note that 0^p = 0, so the statement holds for a = 0. If p is odd and a ∈ P, we have (−a)^p = (−1)^p a^p = −a (mod p), so the statement holds for negative integers when p is odd. Thus we only have to show the case of negative integers when p = 2. But if a ∈ P, then −a = a (mod 2), since a − (−a) = 2a is divisible by 2. Thus (−a)^2 = a^2 = a = −a (mod 2). Thus the statement holds for negative integers when p = 2. Finally, suppose that gcd(a, p) = 1. Now a^p = a (mod p), so a(a^{p−1} − 1) = 0 (mod p). Since Z_p is a field and a ≠ 0 (mod p), it follows that a^{p−1} = 1 (mod p).

Theorem 4.16 (Chinese Remainder Theorem). Suppose that m and n are relatively prime and a, b ∈ Z. Then there is an x ∈ Z such that
x = a (mod m),
x = b (mod n).

Proof. If there is an x satisfying the statement above, then x = a + km and x = b + ln for some k, l ∈ Z. As a consequence a + km = b + ln. This means that b − a = km − ln. Conversely, any k, l ∈ Z with b − a = km − ln give a solution x = a + km = b + ln. Since gcd(m, n) = 1, we know that 1 = rm + sn for some r, s ∈ Z. It follows that b − a = (b − a)rm − (a − b)sn. Thus if we set k = (b − a)r and l = (a − b)s, we have expressed b − a in the required format.

Theorem 4.17 (General Chinese Remainder Theorem). Let m_1, …, m_n ∈ Z be pairwise coprime; that is, gcd(m_i, m_j) = 1 if i ≠ j. Let a_1, …, a_n ∈ Z. Then there is an integer x such that x = a_i (mod m_i) for i = 1, …, n.

Proof. Let M = ∏_{i=1}^{n} m_i, and M_i = ∏_{j≠i} m_j. Then m_i M_i = M. Moreover m_i and M_i are relatively prime, so there are integers r_i, s_i such that r_i m_i + s_i M_i = 1.
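The coefficients r, s with rm + sn = 1 that appear in Theorem 4.16 (and the r_i, s_i just introduced) can be computed with the extended Euclidean algorithm. The sketch below is an illustration only, under the assumption that the moduli are positive; the function names `bezout` and `crt_pair` are hypothetical and do not come from the notes. `crt_pair` follows the proof of Theorem 4.16 by taking k = (b − a)r and returning x = a + km, reduced mod mn.

```python
def bezout(m, n):
    """Extended Euclidean algorithm for positive m, n.
    Returns (g, r, s) with g = gcd(m, n) and r*m + s*n == g."""
    old_rem, rem = m, n
    old_r, r = 1, 0
    old_s, s = 0, 1
    while rem != 0:
        q = old_rem // rem
        old_rem, rem = rem, old_rem - q * rem
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    return old_rem, old_r, old_s


def crt_pair(a, m, b, n):
    """Solve x = a (mod m), x = b (mod n) for coprime positive m, n."""
    g, r, _s = bezout(m, n)
    if g != 1:
        raise ValueError("moduli must be relatively prime")
    k = (b - a) * r          # the choice of k made in the proof of Theorem 4.16
    return (a + k * m) % (m * n)


if __name__ == "__main__":
    x = crt_pair(2, 3, 3, 5)
    print(x, x % 3, x % 5)
```

Reducing the result mod mn is a design choice that picks one representative; any integer in the same class mod mn solves both congruences.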
Let e_i = s_i M_i. Then r_i m_i + e_i = 1, so e_i = 1 (mod m_i). Moreover, if j ≠ i, then m_j | M_i, so m_j | e_i, and e_i = 0 (mod m_j). Let x = ∑_{i=1}^{n} a_i e_i. It follows that x = a_i (mod m_i) for all i.

Exercise 4.18. Suppose that gcd(a, n) = 1. Show that the equation ax = b (mod n) has a solution for any b ∈ Z. Moreover, if $\bar{x}$ is the equivalence class (mod n) of a particular solution x to the equation, then the solutions to the equation are precisely the elements of $\bar{x}$.

Exercise 4.19. Let gcd(a, n) = c, and express a = ca′ and n = cn′. Show that the equation ax = b (mod n) has a solution if and only if c | b. In that case, if b = cb′, and x is a solution to a′x = b′ (mod n′), then x is a solution to ax = b (mod n).

5. Permutations

Definition 5.1. If f : X → Y is a map, then
• f is injective if f(x) = f(x′) implies that x = x′.
• f is surjective if given any y ∈ Y, there is some x ∈ X such that f(x) = y.
f is said to be a bijection if f is both injective and surjective.

Theorem 5.2. Suppose that f : Y → Z and g : X → Y are maps. Then
• If f and g are both injective then f ◦ g is injective.
• If f and g are both surjective then f ◦ g is surjective.
• If f and g are both bijective then f ◦ g is bijective.
If h : W → X is another map, then (f ◦ g) ◦ h = f ◦ (g ◦ h).

Proof. Suppose that both f and g are injective and (f ◦ g)(x) = (f ◦ g)(x′). Then f(g(x)) = f(g(x′)), and since f is injective it follows that g(x) = g(x′). But then, since g is injective, we see that x = x′. Thus f ◦ g is injective. Next, suppose that f and g are surjective, and z ∈ Z. Then since f is surjective, there is some y ∈ Y such that f(y) = z. Since g is surjective, there is some x ∈ X such that g(x) = y. Then (f ◦ g)(x) = f(g(x)) = f(y) = z. Thus f ◦ g is surjective. Putting the two results together, we see that if f and g are bijective, then f ◦ g is bijective.
Finally, the associativity of function composition is easy to see and is left as an exercise to the reader.

Definition 5.3. Let X be a set. Then the set S_X = {f : X → X | f is bijective} is called the permutation group of X. The permutation group of the set {1, …, n} is denoted simply as S_n. Often, this permutation group is denoted by Σ_n instead of S_n.

Theorem 5.4. Function composition is a well defined binary operation S_X × S_X → S_X. This operation, called the product of permutations, is usually denoted by juxtaposition instead of the composition symbol ◦. It satisfies the following properties.
(1) (στ)ϕ = σ(τϕ). (Associativity)
(2) The identity map 1_X, defined by 1_X(x) = x, is a permutation and σ · 1_X = 1_X · σ = σ for all σ ∈ S_X. (Existence of identity)
(3) The inverse map σ^{-1} to σ, defined by σ^{-1}(y) = x if and only if σ(x) = y, is a permutation of X and σ · σ^{-1} = σ^{-1} · σ = 1_X. (Existence of inverse)

Proof. Since the composition of bijections is a bijection, we see that the product of permutations is well defined. Since function composition is associative, the product is associative. Clearly, 1_X is a bijection. We have (σ · 1_X)(x) = σ(1_X(x)) = σ(x) for any x ∈ X. Thus σ · 1_X = σ. Similarly, 1_X · σ = σ. Finally, the inverse map of a bijection is again a bijection, and σ · σ^{-1} = σ^{-1} · σ = 1_X follows directly from the definition of σ^{-1}.

The identity element in S_n is often denoted as e, since the notation 1_{S_X} is cumbersome. Note that with this notation, there is some ambiguity about which S_n the element e belongs to, which needs to be determined by context.

Definition 5.5 (Matrix Notation for Permutations). If σ ∈ S_n, then the matrix notation for σ is

( 1     ···  n    )
( σ(1)  ···  σ(n) )

Definition 5.6. Let a_1, …, a_k be a sequence of distinct elements of X. Then the cycle σ associated to the sequence is the map σ : X → X given by

σ(x) = a_{i+1}   if x = a_i and 1 ≤ i < k,
σ(x) = a_1       if x = a_k,
σ(x) = x         if x ∉ {a_1, …, a_k}.

We say that the cycle σ has length k, and we denote it by σ = (a_1, …, a_k). If τ = (b_1, . . .
, b_ℓ) is another cycle, then the cycles σ and τ are said to be disjoint if the sets {a_1, …, a_k} and {b_1, …, b_ℓ} are disjoint.

Exercise 5.7. Show that a cycle σ : X → X is actually a permutation of X.

Theorem 5.8. The product of disjoint cycles commutes.

Proof. Let σ = (a_1, …, a_k) and τ = (b_1, …, b_ℓ) be two disjoint cycles. Let ϕ = στ and ψ = τσ, so that ϕ(x) = σ(τ(x)) and ψ(x) = τ(σ(x)). Let x ∈ X. Then exactly one of three possibilities holds: x ∈ {a_1, …, a_k}, x ∈ {b_1, …, b_ℓ}, or x ∉ {a_1, …, a_k, b_1, …, b_ℓ}. Let us examine what happens in each case.

Case 1: x ∈ {a_1, …, a_k}. In this case σ(x) ∈ {a_1, …, a_k}, so σ(x) ∉ {b_1, …, b_ℓ} and ψ(x) = τ(σ(x)) = σ(x). Moreover, τ(x) = x, so ϕ(x) = σ(τ(x)) = σ(x). Thus ϕ(x) = ψ(x).

Case 2: x ∈ {b_1, …, b_ℓ}. In this case τ(x) ∈ {b_1, …, b_ℓ}, so τ(x) ∉ {a_1, …, a_k} and ϕ(x) = σ(τ(x)) = τ(x). Moreover, σ(x) = x, so ψ(x) = τ(σ(x)) = τ(x). Thus ϕ(x) = ψ(x).

Case 3: x ∉ {a_1, …, a_k, b_1, …, b_ℓ}. In this case, both σ(x) = x and τ(x) = x, so ψ(x) = x = ϕ(x).

Since ϕ(x) = ψ(x) for all x, we see that σ and τ commute.

We can generalize the result above and combine it with the associative law to see that if σ_1, …, σ_m is a sequence of disjoint cycles, then the order of multiplication does not affect their product.

Theorem 5.9. If X is a nonempty finite set, then every permutation of X can be written as a product of disjoint cycles so that every element of X appears in one of the cycles. Moreover, this product is unique up to order.

Exercise 5.10. Prove the above theorem.

Note that there is some ambiguity about which S_n a permutation written in disjoint cycle notation belongs to. For example σ = (1, 3, 2) might belong to S_n for any n ≥ 3. Sometimes this ambiguity is advantageous. Note that there is no ambiguity about the n when a permutation is expressed in matrix notation.

Theorem 5.11. Let σ = (a_1, …, a_k) be a cycle. Then σ^{-1} = (a_k, …, a_1).
In other words, to compute the inverse of a cycle, you just reverse the order of the elements in the cycle.

Exercise 5.12. Prove the theorem above.

Theorem 5.13. If σ, τ ∈ S_X, then (στ)^{-1} = τ^{-1}σ^{-1}.

Proof. If f : X → Y and g : Y → X, then we know that g = f^{-1} precisely when g ◦ f = 1_X and f ◦ g = 1_Y. Thus we compute (στ)(τ^{-1}σ^{-1}) = σ(ττ^{-1})σ^{-1} = σ · 1_X · σ^{-1} = σσ^{-1} = 1_X. Similarly, (τ^{-1}σ^{-1})(στ) = 1_X. Thus (στ)^{-1} = τ^{-1}σ^{-1}.

In the study of linear algebra, you learned that if A, B are invertible n × n matrices, then (AB)^{-1} = B^{-1}A^{-1}. The rule for computing the inverse of a product of permutations is analogous to the rule for matrix inverse computation. If you combine the rule for computing the inverse of a cycle with the rule for computing the inverse of a product of permutations, you obtain a simple method for computing the inverse of a product of any number of cycles, whether they are disjoint or not. In the case when one has a product of disjoint cycles, this gives a very simple method of computing the inverse.

Example 5.14. Let σ = (1, 3, 5, 6)(2, 4, 8). Then σ^{-1} = (6, 5, 3, 1)(8, 4, 2). Notice that we don't have to reverse the order because the two cycles are disjoint, so their inverses are also disjoint, and thus can be multiplied in any order.

It is also easy to multiply permutations which are expressed in cycle notation. In fact, one can compute the product of a number of permutations in a very quick fashion. It is also easy to convert the matrix notation for a permutation into disjoint cycle notation.

Exercise 5.15. Let

σ = ( 1 2 3 4 5 6 7 8 )
    ( 3 5 4 6 1 8 7 2 )

Then σ = (1, 3, 4, 6, 8, 2, 5). Notice that σ = (1, 3, 4, 6, 8, 2, 5)(7) as well, but it is customary to drop the singleton cycles from the expression for σ, as they are not necessary. Now let τ = (1, 4, 5)(2, 3)(7, 8) be a permutation in S_8 expressed in cycle notation. Then the matrix notation for τ is

τ = ( 1 2 3 4 5 6 7 8 )
    ( 4 3 2 5 1 6 8 7 )

To find the product of σ and τ write

στ = (1, 3, 4, 6, 8, 2, 5)(1, 4, 5)(2, 3)(7, 8) = (1, 6, 8, 7, 2, 4)(3, 5).

To calculate this, remember that when computing the product of permutations, the permutation on the right acts first. To get the right hand side of the equation, you first start a cycle with any number. We started with 1, so we first wrote "(1". Now reading from right to left, we track down where 1 goes to. First, the cycle (1, 4, 5) acts on 1, taking it to 4. Then the cycle (1, 3, 4, 6, 8, 2, 5) takes 4 to 6. Thus we put a comma, followed by a 6, so we have "(1, 6" so far. Next we do the same thing as we did with 1, but starting with 6, and find that 6 goes to 8. We continue in this manner until we have "(1, 6, 8, 7, 2, 4". When we repeat the process with 4, we find 4 goes to 5, which then goes to 1. Since 4 goes to 1, which is the first element in the cycle, we have computed a cycle in the product. Next, we look for a number which is not in the first cycle. 3 is such a number, so we can start a new cycle with "(3". In this manner, we compute the product.

Note that the method above can be applied when multiplying more than two permutations together. Thus it is a very efficient method of computing products of permutations.

One might ask, if the disjoint cycle notation is so advantageous for computing inverses and products of cycles, what is the value of the matrix notation? It turns out that the matrix notation has some applications which do not arise in a course in abstract algebra, and it is in common use as well, so it is valuable to learn.

Definition 5.16. If σ ∈ S_X, then the order of σ, denoted o(σ), is the least positive integer m such that σ^m = 1_X. If there is no such integer, then we say that the order of σ is ∞ and write o(σ) = ∞.

Theorem 5.17. Let σ = (a_1, …, a_k) be a cycle. Then o(σ) = k.

Proof. Suppose that 1 ≤ i < k.
Then it is a straightforward induction to see that σ^i(a_1) = a_{i+1}. Since a_{i+1} ≠ a_1, it follows that σ^i ≠ e. Moreover, σ^k(a_1) = σ(a_k) = a_1. Since σ = (a_j, …, a_k, a_1, …, a_{j−1}) for any 1 ≤ j ≤ k, it follows that σ^k(a_j) = a_j for any 1 ≤ j ≤ k. Finally, if x ∉ {a_1, …, a_k}, then σ(x) = x, so σ^k(x) = x. It follows that σ^k = e, and therefore o(σ) = k.

Recall that if n_1, …, n_ℓ ∈ P, then lcm(n_1, …, n_ℓ) is the least common multiple of n_1, …, n_ℓ. It is the smallest positive integer x such that n_i | x for all i = 1, …, ℓ.

Corollary 5.18. Let σ = σ_1 ··· σ_m be a product of disjoint cycles σ_1, …, σ_m. Then o(σ) = lcm(o(σ_1), …, o(σ_m)).

Definition 5.19. A cycle of the form (a_1, a_2) is called a transposition.

Theorem 5.20. Let k > 1 and σ = (a_1, …, a_k) be a cycle. Then σ = (a_1, a_2)(a_2, a_3) ··· (a_{k−1}, a_k). As a consequence, any element of S_n can be written as a product of transpositions when n > 1.

Proof. That σ = (a_1, a_2)(a_2, a_3) ··· (a_{k−1}, a_k) is a matter of calculation. If an element σ ∈ S_n is not the identity, then it can be written as a product of disjoint cycles, each of which has length at least 2. Thus, after factoring each of them as a product of transpositions, we have found a factorization of σ in the desired form. It remains to consider the case when σ = e. But e = (1, 2)(1, 2), so it is a product of transpositions.

Definition 5.21. Let n > 1. Then a permutation σ ∈ S_n is said to be even if it can be expressed as a product of an even number of transpositions. A permutation which is not even is said to be odd.

Note that if n > 1, then a permutation which is odd can be expressed as a product of an odd number of transpositions, since every permutation is a product of transpositions. What is not so obvious is that a permutation cannot be expressed both as a product of an even number of transpositions and as a product of an odd number of transpositions.
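The right-to-left multiplication procedure of Exercise 5.15 and the order formula of Corollary 5.18 are mechanical enough to sketch in code. The following is an illustration under stated assumptions, not the author's method: a permutation of {1, …, n} is stored as a Python dictionary, and the function names `from_cycles`, `disjoint_cycles` and `order` are hypothetical. As in the text, the rightmost cycle acts first, and singleton cycles are dropped from the output.

```python
from functools import reduce
from math import gcd


def from_cycles(cycles, n):
    """Product of the given cycles as a dict on {1..n}; the rightmost cycle acts first."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in reversed(cycles):
        step = {cyc[i]: cyc[(i + 1) % len(cyc)] for i in range(len(cyc))}
        # Apply this cycle after everything already accumulated to its right.
        perm = {x: step.get(perm[x], perm[x]) for x in perm}
    return perm


def disjoint_cycles(perm):
    """Disjoint cycle decomposition, omitting cycles of length 1."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = perm[x]
        if len(cyc) > 1:
            cycles.append(tuple(cyc))
    return cycles


def order(perm):
    """Order of perm as the lcm of the lengths of its disjoint cycles."""
    lengths = [len(c) for c in disjoint_cycles(perm)]
    return reduce(lambda a, b: a * b // gcd(a, b), lengths, 1)


if __name__ == "__main__":
    product = from_cycles([(1, 3, 4, 6, 8, 2, 5), (1, 4, 5), (2, 3), (7, 8)], 8)
    print(disjoint_cycles(product), order(product))
```

Run on the product from Exercise 5.15, the decomposition agrees with the hand computation (1, 6, 8, 7, 2, 4)(3, 5), whose order is lcm(6, 2) = 6.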
To prove that the parity of a permutation is unambiguous, we need to develop some further properties of permutations.

Definition 5.22. Suppose that σ ∈ S_n can be expressed as a product of k disjoint cycles so that every number 1 ≤ i ≤ n appears in one of the cycles. Then the orbit number of σ is n − k.

Notice that since the decomposition of σ into such a product is unique up to the order of the cycles, the orbit number is well defined. Also, the orbit number of the identity element e is zero, since it is a product of n disjoint cycles.

Theorem 5.23. Suppose that σ ∈ S_n and let τ = (a, b) be a transposition. Then the orbit number of στ is either 1 larger or 1 less than the orbit number of σ. More precisely, if a and b lie in the same cycle of σ, then στ has one more cycle than σ, so its orbit number is one less, and if a and b lie in different cycles of σ, then στ has one less cycle, so its orbit number is one more.

Proof. Suppose that a and b belong to the same cycle of σ. We can suppose that σ = (a, a_1, …, a_k, b, b_1, …, b_ℓ), as the other cycles in σ will not influence the outcome of the product. Then

στ = (a, a_1, …, a_k, b, b_1, …, b_ℓ)(a, b) = (b, a_1, …, a_k)(a, b_1, …, b_ℓ),

so στ has one more cycle. Next, suppose that a and b belong to different cycles. Then we can suppose that σ = (a, a_1, …, a_k)(b, b_1, …, b_ℓ). In this case, we have

στ = (a, a_1, …, a_k)(b, b_1, …, b_ℓ)(a, b) = (a, b_1, …, b_ℓ, b, a_1, …, a_k),

so that στ has one less cycle.

Corollary 5.24. If n > 1, then an element σ ∈ S_n has a factorization as a product of an even number of transpositions or an odd number of transpositions, but not both. In fact, σ is even precisely when its orbit number is even. Moreover, we have the following:
• The product of two even elements is even.
• The product of an even element and an odd element in either order is an odd element.
• The product of two odd elements is even.
• The inverse of an even element is even.
• The inverse of an odd element is odd.

6. Groups

Definition 6.1.
A set G, equipped with a binary operation ⋆, called the product or group operation, is called a group provided that
(1) a ⋆ (b ⋆ c) = (a ⋆ b) ⋆ c for all a, b, c ∈ G. (Associativity)
(2) There is an element e ∈ G such that a ⋆ e = e ⋆ a = a for all a ∈ G. (Existence of identity)
(3) Given a ∈ G there is some b ∈ G such that a ⋆ b = b ⋆ a = e. (Existence of inverse)

Frequently, the group operation is indicated by juxtaposition; i.e. we write gh instead of g ⋆ h. If we wish to emphasize the group operation, we sometimes say (G, ⋆) is a group. This may be important when the set G is equipped with more than one operation. It is also common for the operation to be written as +, but in that case, we almost always require the operation to be commutative, which we define below.

Definition 6.2. A group G with product ⋆ is said to be commutative provided that a ⋆ b = b ⋆ a for all a, b ∈ G.

Examples of commutative groups are (Z, +), (Z_n, +), (Q, +), (R, +), and any vector space over any field k with the operation of addition. In all of these cases the identity element is called 0. Commutative groups whose group operation is not written as + are (Z^∗, ·), (Z_n^∗, ·), (R^∗, ·), where the ∗ means the subset of elements invertible under multiplication. The set GL(n, k) of invertible n × n matrices with coefficients in a field k is a group under matrix multiplication, which is not commutative if n > 1. The permutation group S_X is a group under composition of maps, which is also not commutative when X has more than two elements.

A careful reading of the definition of a group reveals that it does not state that there is only one identity element or only one element satisfying the inverse property. Luckily, we can prove this uniqueness of identity and inverse.

Theorem 6.3 (Uniqueness of Identity). Suppose that G is a group and e, e′ both satisfy the condition of identity in the second axiom of a group. Then e = e′.
In fact, if e is the identity and e′ ⋆ a = a or a ⋆ e′ = a for some a ∈ G, then e′ = e.

Proof. Suppose that e ⋆ a = a ⋆ e = a for all a ∈ G, and that e′ ⋆ a = a for some a ∈ G. By the third axiom of groups, there is some b ∈ G such that a ⋆ b = e. Then e′ = e′ ⋆ e = e′ ⋆ (a ⋆ b) = (e′ ⋆ a) ⋆ b = a ⋆ b = e. The proof is similar if we assume a ⋆ e′ = a for some a ∈ G.

Theorem 6.4 (Uniqueness of Inverse). Suppose that G is a group, a, b ∈ G and a ⋆ b = b ⋆ a = e. Let b′ ∈ G satisfy b′ ⋆ a = e or a ⋆ b′ = e. Then b′ = b.

Proof. Let a, b be as in the statement of the theorem, and suppose that b′ ⋆ a = e. Then b′ = b′ ⋆ e = b′ ⋆ (a ⋆ b) = (b′ ⋆ a) ⋆ b = e ⋆ b = b. A similar argument holds when a ⋆ b′ = e.

As a consequence of the above theorem, we can give the definition below.

Definition 6.5. If G is a group with identity e and g ∈ G, then the inverse of g is the unique element h such that g ⋆ h = h ⋆ g = e. When the group operation of G is written in some multiplicative form (either by juxtaposition or ⋆), we denote the inverse of g by g^{-1}. When the group operation of G is a commutative operation written as +, we write the inverse of g as −g.

Most of the time, we will assume that the group in question is written multiplicatively, so we will state our results in that form. Later, we will give a table which compares the multiplicative forms of our results to their additively written counterparts.

Theorem 6.6. Let G be a group (written multiplicatively). Then
• If g ∈ G, then (g^{-1})^{-1} = g.
• If g, h ∈ G, then (gh)^{-1} = h^{-1}g^{-1}.

Exercise 6.7. Prove the above theorem.

Definition 6.8 (Exponentiation). Let G be a group. For n ∈ P we define the power g^n for g ∈ G recursively as follows:
• g^1 = g.
• g^{s(n)} = g^n g.
This definition is extended to all n ∈ Z as follows:
• g^0 = e.
• g^{−n} = (g^n)^{-1} if n ∈ P.

Lemma 6.9. Let G be a group, g ∈ G, and m, n ∈ P. Then
(1) g^m g^n = g^{m+n}.
(2) (g^m)^n = g^{mn}.

Proof.
To establish the first equation, we show that the set S = {n ∈ P | g^m g^n = g^{m+n} for all m ∈ P} is an inductive subset of P. Note g^{m+1} = g^m g = g^m g^1, so 1 ∈ S. Suppose that n ∈ S. Then g^{m+s(n)} = g^{(m+n)+1} = g^{m+n} g = g^m g^n g = g^m g^{n+1} = g^m g^{s(n)}. Thus S is inductive, so it follows that S = P. Next, we show that the set S = {n ∈ P | (g^m)^n = g^{mn} for all m ∈ P} is an inductive subset of P. Note (g^m)^1 = g^m = g^{m·1}, so 1 ∈ S. Suppose that n ∈ S. Then (g^m)^{s(n)} = (g^m)^{n+1} = (g^m)^n (g^m)^1 = g^{mn} g^m = g^{mn+m} = g^{m(n+1)} = g^{m·s(n)}. Thus S is inductive, so that S = P.

Theorem 6.10. Let G be a group, g ∈ G and n ∈ P. Then g^{−n} = (g^{-1})^n.

Proof. We proceed by induction. Let S be the set of all n ∈ P such that g^{−n} = (g^{-1})^n. Since g^{-1} = (g^{-1})^1 by the definition of exponentiation, it follows that 1 ∈ S. Suppose that n ∈ S. Then g^{−s(n)} = (g^{s(n)})^{-1} = (g^n g)^{-1} = g^{-1}(g^n)^{-1} = g^{-1}(g^{-1})^n = (g^{-1})^{n+1} = (g^{-1})^{s(n)}. Thus s(n) ∈ S, so S = P.

Now we are ready to show that the statements of Lemma 6.9 hold for all integers.

Theorem 6.11. Let G be a group, g ∈ G, and m, n ∈ Z. Then
(1) g^m g^n = g^{m+n}.
(2) (g^m)^n = g^{mn}.

Proof. Let us first note that both formulas are immediate whenever m or n is equal to zero, and that Lemma 6.9 covers the case when both are positive. Thus, we can restrict to the case when either both m and n are negative, or when one is positive and the other is negative. Let us examine the case when both exponents are negative, so m = −k and n = −ℓ for some k, ℓ ∈ P. Then g^{m+n} = g^{n+m} = g^{−(ℓ+k)} = (g^{ℓ+k})^{-1} = (g^ℓ g^k)^{-1} = (g^k)^{-1}(g^ℓ)^{-1} = g^{−k} g^{−ℓ} = g^m g^n. Next, (g^m)^n = (g^{−k})^{−ℓ} = ((g^{−k})^ℓ)^{-1} = (((g^k)^{-1})^ℓ)^{-1} = (((g^k)^ℓ)^{-1})^{-1} = (g^k)^ℓ = g^{kℓ} = g^{mn}. Thus, both formulae hold when m and n are negative. Next, suppose that m, n ∈ P and m ≥ n. Then g^m g^{−n} = g^{m−n} g^n (g^n)^{-1} = g^{m−n}. A similar formula holds when 1 ≤ m < n, but this time, we factor g^{−n} = g^{−m} g^{−(n−m)}. It is an exercise for the reader to extend this to the case g^{−m} g^n when m, n ∈ P.
Finally, let us consider the multiplicative formula. Let m, n ∈ P. Then (g^{−m})^n = ((g^m)^{-1})^n = (g^m)^{−n} = ((g^m)^n)^{-1} = (g^{mn})^{-1} = g^{−mn}. The case (g^m)^{−n} is handled similarly.

Theorem 6.12. Suppose that G is a group and g, h ∈ G. Then if gh = hg, we have (gh)^m = g^m h^m for all m ∈ Z. Moreover, g and h commute precisely when (gh)^2 = g^2 h^2, so if g and h fail to commute, the formula does not hold for all m ∈ Z.

Exercise 6.13. Prove the above theorem.

Let us consider a group G with a commutative operation +. In this case, it is natural to write a + a as 2a rather than a^2. We could define na for n ∈ Z using a recursive definition for n ∈ P, together with 0a = 0 and (−n)a = −(na). Then we could prove properties corresponding to the exponentiation rules we have shown above. However, it is not really necessary to do this, as these rules are just a translation of the power rules into additive notation. In the table below, we give a comparison between the exponential properties of a group, written multiplicatively, and the properties of multiplication by integers in a group with the group operation given as +, which we assume is commutative.

Property              Exponential Notation                  Additive Notation
Power                 g^n                                   ng
Inverse               g^{-1}                                −g
Sum Rule              g^{m+n} = g^m g^n                     (m + n)g = mg + ng
Multiplication Rule   (g^m)^n = g^{mn}                      (mn)g = m(ng)
Power of products     If gh = hg then (gh)^m = g^m h^m      m(g + h) = mg + mh

Proposition 6.14 (Cancelation Laws). Let G be a group and a, b, c ∈ G.
(1) If ab = ac then b = c. (Left cancelation)
(2) If ba = ca then b = c. (Right cancelation)

Exercise 6.15. Prove the above proposition.

Proposition 6.16. If G is a group and a, b ∈ G, then
• The equation ax = b has the unique solution x = a^{-1}b.
• The equation xa = b has the unique solution x = ba^{-1}.

Exercise 6.17. Prove the above proposition.

Definition 6.18. If G is a finite group with n elements, then a Cayley Table of the group is an n × n matrix whose columns and rows are headed by elements of the group ordered as g_1, . . .
, g_n, and whose entry in the ith row and jth column is g_i g_j. It is conventional to list the elements in the same order in the rows as in the columns and to list the identity element e first. Cayley tables first appeared in an 1854 paper by Arthur Cayley (1821–1895).

Example 6.19. Recall that the symmetric group S_3 is given by S_3 = {e, ρ, ρ^2, σ, ρσ, ρ^2σ}, where ρ = (1, 2, 3) and σ = (1, 2) are given in cycle notation. With the convention that the entry in the row of g and the column of h is gh, and that the permutation on the right acts first, a Cayley table for S_3 is

        e      ρ      ρ^2    σ      ρσ     ρ^2σ
e       e      ρ      ρ^2    σ      ρσ     ρ^2σ
ρ       ρ      ρ^2    e      ρσ     ρ^2σ   σ
ρ^2     ρ^2    e      ρ      ρ^2σ   σ      ρσ
σ       σ      ρ^2σ   ρσ     e      ρ^2    ρ
ρσ      ρσ     σ      ρ^2σ   ρ      e      ρ^2
ρ^2σ    ρ^2σ   ρσ     σ      ρ^2    ρ      e

In the example above, note that each element appears exactly once in each column and row of the table. This is a basic property of Cayley tables, which follows from Proposition 6.16. We can exploit this to find Cayley tables for groups of small size.

Example 6.20. Let G be a group of order 2, e be its identity element, and g be the nonidentity element. Then its Cayley Table is

      e    g
e     e    g
g     g    e

Example 6.21. Let G be a group of order 3, e be its identity element and g be a nonidentity element. Let h be the third element. If g^2 were equal to e, then consider what the Cayley Table would look like:

      e    g    h
e     e    g    h
g     g    e    x
h     h    y    z

There is no way to assign a value to either x or y which is consistent with the observation that the rows and columns must have each element appearing only once. Thus we must have g^2 = h and we can write the Cayley Table as follows.

      e      g      g^2
e     e      g      g^2
g     g      g^2    e
g^2   g^2    e      g

We have presented Cayley tables for groups of order 2 and 3, but there is a problem with the exposition. The table alone does not mean that there is a group with the structure given in the table. For example, if we have a group of order 3, then its Cayley table must look like the one given in the example above, but that is not enough to show that there is such a group. The problem is that we have not verified the axioms.
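To see what verifying the axioms on a table involves, one can check them mechanically. The sketch below is an illustration and not part of the notes: it encodes the order 3 table forced by the argument of Example 6.21 (rows e, g, g^2; the last row reads g^2, e, g), with the entry in row a and column b standing for the product ab as in Definition 6.18. The names `TABLE`, `mul`, `rows_and_columns_are_permutations` and `is_associative` are hypothetical. Associativity is tested by brute force over all triples.

```python
from itertools import product

ELEMENTS = ["e", "g", "g2"]

# TABLE[a][b] is the product a*b (row a, column b).
TABLE = {
    "e":  {"e": "e",  "g": "g",  "g2": "g2"},
    "g":  {"e": "g",  "g": "g2", "g2": "e"},
    "g2": {"e": "g2", "g": "e",  "g2": "g"},
}


def mul(a, b):
    return TABLE[a][b]


def rows_and_columns_are_permutations():
    """Each element appears exactly once in every row and every column."""
    full = set(ELEMENTS)
    rows_ok = all({mul(a, b) for b in ELEMENTS} == full for a in ELEMENTS)
    cols_ok = all({mul(a, b) for a in ELEMENTS} == full for b in ELEMENTS)
    return rows_ok and cols_ok


def is_associative():
    """Compare (a*b)*c with a*(b*c) for all n^3 triples."""
    return all(
        mul(mul(a, b), c) == mul(a, mul(b, c))
        for a, b, c in product(ELEMENTS, repeat=3)
    )


if __name__ == "__main__":
    print(rows_and_columns_are_permutations(), is_associative())
```

For three elements this is only 27 triples, but the number of triples grows as the cube of the order of the group.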
In particular, the axiom of associativity for a binary operation on a set G is time consuming to verify. For a group of order n, there are n^3 expressions of the form (a ⋆ b) ⋆ c, and another n^3 expressions of the form a ⋆ (b ⋆ c). To check associativity, we have to compare the two sequences of products, so one needs to compute 2n^3 products to verify the associativity axiom directly.

On the other hand, it is possible to determine if the other two axioms are satisfied by examining the Cayley table, so checking associativity is the main problem. As a consequence, we will have to find other methods to check if a binary operation is associative. We already encountered this problem for the groups Z and Z_n. We proved associativity of addition on P by verifying that it is a consequence of the Peano postulates, and we reduced the problem of verifying the associativity of addition on Z_n to the associativity in Z. We did not give a proof of associativity in Z, because we did not give the construction of Z from P, but this can be done.

These remarks still are not enough to verify that there is a group whose Cayley table corresponds to the one which we gave for a group of order 3. It would be enough to show that there is some group of order 3, because our argument shows that the table we gave applies to any group of order 3. We can exhibit such a group easily, as we show in the theorem below.

Theorem 6.22. The group given by addition on Z_n for n ∈ P has order n.

Exercise 6.23. Prove the above theorem.

There is a second problem with the idea that there is only one group of order 3, and it is more substantial. In fact, there are many groups of order 3. If we take a set G = {x, y, z} of three elements, then we can give it the structure of a group of order 3 by identifying e = x, g = y and g^2 = z. Since there is clearly more than one set with three elements, there is more than one group of order 3.
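The identification e = x, g = y, g^2 = z can be made concrete by relabelling a group we already have. The sketch below is an illustration under assumptions: it picks a bijection from the three labels to Z_3 (the dictionary `PHI`, a hypothetical name, sending x, y, z to 0, 1, 2) and defines the product of two labels by adding the corresponding classes in Z_3 and translating the answer back to a label. Any other three-element set and any other bijection would work the same way.

```python
# Hypothetical labels for a three-element set G = {x, y, z}.
PHI = {"x": 0, "y": 1, "z": 2}            # chosen bijection G -> Z_3
PHI_INV = {v: k for k, v in PHI.items()}  # inverse bijection Z_3 -> G


def mul(g, h):
    """Product on G: add the images in Z_3, then relabel the result."""
    return PHI_INV[(PHI[g] + PHI[h]) % 3]


if __name__ == "__main__":
    for g in "xyz":
        # One row of the resulting Cayley table, columns in the order x, y, z.
        print(g, [mul(g, h) for h in "xyz"])
```

In the resulting table x plays the role of e, y the role of g, and z the role of g^2, which is exactly the identification described in the text.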
Yet, in some sense, we would like to say that all groups of order 3 are essentially the same group. To make this idea precise, we introduce the notion of isomorphism.

Definition 6.24. Let G and G′ be two sets equipped with binary operations (which we will denote by juxtaposition). Then an isomorphism ϕ from G to G′ is a bijection ϕ : G → G′ which satisfies ϕ(gh) = ϕ(g)ϕ(h) for all g, h ∈ G.

We can interpret ϕ as a relabeling function. The key idea is that it doesn't matter whether you multiply the elements in G and then take the image of their product, or multiply their images in G′, because you obtain the same result. We can also express the property of isomorphism in the form gh = ϕ^{-1}(ϕ(g)ϕ(h)). From this point of view, to compute the product in G, first map the two elements to G′, compute their product and then map back. We can use this formula to define a product on G given a product on G′ and a bijection between G and G′.

Notice that we did not require G or G′ to be a group to define an isomorphism. However, there is an important fact we can establish relating group structures and isomorphisms.

Theorem 6.25. Let ϕ : G → G′ be an isomorphism between two sets which are equipped with binary operations. Then
• The product on G is associative if and only if the product on G′ is associative.
• There is an identity element in G if and only if there is one in G′.
• G is a group if and only if G′ is a group.

Proof. Suppose that x, y, z ∈ G′. Let g, h, k ∈ G be such that ϕ(g) = x, ϕ(h) = y and ϕ(k) = z. If the product on G is associative, then

(xy)z = (ϕ(g)ϕ(h))ϕ(k) = ϕ(gh)ϕ(k) = ϕ((gh)k) = ϕ(g(hk)) = ϕ(g)ϕ(hk) = x(ϕ(h)ϕ(k)) = x(yz).

Suppose that g, h, k ∈ G and let x, y, z ∈ G′ be such that ϕ(g) = x, ϕ(h) = y and ϕ(k) = z. If the product on G′ is associative, then

(gh)k = ϕ^{-1}(ϕ((gh)k)) = ϕ^{-1}(ϕ(gh)ϕ(k)) = ϕ^{-1}((ϕ(g)ϕ(h))z) = ϕ^{-1}((xy)z) = ϕ^{-1}(x(yz)) = ϕ^{-1}(ϕ(g)(ϕ(h)ϕ(k))) = ϕ^{-1}(ϕ(g)ϕ(hk)) = ϕ^{-1}(ϕ(g(hk))) = g(hk).
This shows that the product on G is associative precisely when the product on G′ is associative. Next, suppose that eg = ge = g for all g ∈ G. Let e′ = ϕ(e). Suppose that x ∈ G′. Then x = ϕ(g) for some g ∈ G, so we have e′x = ϕ(e)ϕ(g) = ϕ(eg) = ϕ(g) = x, and similarly xe′ = x. Thus e′ is an identity element in G′. On the other hand, if e′ ∈ G′ satisfies e′x = xe′ = x for all x ∈ G′, then let e ∈ G be such that ϕ(e) = e′. Let g ∈ G. Then eg = ϕ^{-1}(ϕ(eg)) = ϕ^{-1}(ϕ(e)ϕ(g)) = ϕ^{-1}(e′ϕ(g)) = ϕ^{-1}(ϕ(g)) = g, and similarly ge = g. Now finally, note that if either G or G′ is a group, then the product on both of them is associative and there are identity elements e ∈ G and ϕ(e) = e′ ∈ G′. Assume G is a group, and let x ∈ G′. Then there is some g ∈ G such that ϕ(g) = x and some h ∈ G such that gh = hg = e. Let y = ϕ(h). Then xy = ϕ(g)ϕ(h) = ϕ(gh) = ϕ(e) = e′ and similarly yx = e′. Thus G′ is a group. We leave for the reader the case of showing that if G′ is a group then G is a group.

7. Subgroups

Definition 7.1. If H ⊆ G is a subset of a group G, then H is said to be a subgroup of G provided that H is a group under the same binary operation as in G. If H is a subgroup of G, we denote this fact by H ≤ G.

It is very important that we use the same binary operation. For example Q is a group under addition, and Q^∗, the set of nonzero elements of Q, is a group under multiplication. But Q^∗ is not a subgroup of Q, because the binary operation is not the same.

Since a subgroup H of a group G must be a group, it cannot be empty, since it must have an identity element. In fact, if e is the identity in G and e′ is the identity in H, then if h ∈ H, we have e′h = h. But by Uniqueness of Identity in G, it follows that e′ = e. Thus the identity in H must be the same identity as in G. Similarly, the inverse of an element in H must coincide with the inverse of that element in G, by the Uniqueness of Inverse.
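The remarks above can be tried out on a small example. The sketch below is an illustration only, using the additive group Z_n with classes represented by the integers 0, …, n − 1; the function name `is_subgroup_of_Zn` is hypothetical. It checks that a subset is closed under the operation inherited from Z_n, that the identity 0 of Z_n lies in it, and that the inverse in Z_n of each of its elements lies in it.

```python
def is_subgroup_of_Zn(H, n):
    """Check whether the subset H of {0, ..., n-1} is a subgroup of (Z_n, +)."""
    H = set(H)
    closed = all((a + b) % n in H for a in H for b in H)
    has_identity = 0 in H                         # the identity of Z_n must lie in H
    has_inverses = all((-a) % n in H for a in H)  # inverses computed in Z_n
    return closed and has_identity and has_inverses


if __name__ == "__main__":
    print(is_subgroup_of_Zn({0, 4, 8}, 12))
    print(is_subgroup_of_Zn({0, 4}, 12))
```

The subset {0, 4, 8} of Z_12 passes all three checks, while {0, 4} fails because 4 + 4 = 8 is not in it.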
The theorem below gives a powerful criterion for determining whether a subset of a group is a subgroup.

Theorem 7.2. Let H be a subset of a group G. Then H is a subgroup of G if and only if it satisfies the three properties below.
(1) H is not empty.
(2) If a, b ∈ H, then ab ∈ H. (H is closed under the group operation.)
(3) If a ∈ H, then a^{-1} ∈ H. (H is closed under inverse.)

Proof. If H is a subgroup of G, then the three properties follow from the definition of a subgroup together with the fact that inverses in H coincide with inverses in G. Conversely, suppose that the three properties hold. By the second property, the binary operation is well-defined on H. Although we don’t state the condition of well-definedness of the binary operation as one of the three axioms of a group, it is implicitly required because the definition of a group is based on the existence of a binary operation satisfying three requirements, so at least we must verify that the binary operation is well-defined. The axiom of associativity is automatic, because associativity holds in G, so the product of elements in H must also satisfy associativity. The existence of an identity in H is verified as follows. By the first property, there is some h ∈ H. By the third property, h^{-1} ∈ H. Thus by the second property e = hh^{-1} lies in H. Since e is the identity in G, it also is the identity in H. Finally, the third property guarantees that h^{-1} ∈ H if h ∈ H. Moreover, h^{-1} is the inverse of h in H, since it is the inverse of h in G.

Students often make the mistake of trying to show that H is a subgroup of G by proving that an identity exists in H or an inverse exists in H. This is not the idea. Existence of the identity and existence of the inverse of an element in H are already guaranteed. It is the location of the identity and inverse we are concerned with. You want to show the identity lies in H and the inverse of an element in H lies in H, not that these elements exist. Another common mistake is to prove that the product in H is associative. We already know that, so such a “proof” is irrelevant to showing that H is a subgroup of G.
The key is to prove that the three properties in the theorem hold!

Corollary 7.3. Suppose that H is a finite subset of a group G and it satisfies the first two properties in the theorem. Then H is a subgroup of G.

Proof. Since the first two properties hold, we need only show that the third holds. Let h ∈ H. Now the set P is infinite, and h^k ∈ H for any k ∈ P by a straightforward induction argument, so, since H is finite, there must be some m < n in P such that h^m = h^n. Then e = h^m h^{-m} = h^n h^{-m} = h^{n−m}. Let k = n − m. If k = 1, then h = e so h = h^{-1} and h^{-1} ∈ H. Otherwise k > 1, so k − 1 ∈ P. Now e = h h^{k−1}, so h^{-1} = h^{k−1} ∈ H.

Corollary 7.4. Let G be a finite group. Then if g ∈ G, g^{-1} = g^k for some k ∈ P.

Proof. Let H = {g^k | k ∈ P}. Then H is not empty, since g ∈ H. Suppose that x, y ∈ H. Then x = g^k and y = g^ℓ for some k, ℓ ∈ P. Then xy = g^{k+ℓ} ∈ H. Thus H satisfies the first two properties in the theorem, and since G is finite, so is H. Thus by the corollary above H is a subgroup of G. But this implies g^{-1} ∈ H, so g^{-1} = g^k for some k ∈ P.

Theorem 7.5. Let H, K ≤ G. Then H ∩ K ≤ G. More generally, let Λ be a set and suppose that we have a collection of subgroups H_λ of G for λ ∈ Λ. Define
⋂_{λ∈Λ} H_λ = {g ∈ G | g ∈ H_λ for all λ ∈ Λ}.
Then ⋂_{λ∈Λ} H_λ ≤ G.

Proof. We prove the more general statement, since it includes the case of the intersection of two subgroups. Let H = ⋂_{λ∈Λ} H_λ. First, since H_λ is a subgroup for all λ ∈ Λ, we must have e ∈ H_λ for all λ ∈ Λ. Thus e ∈ H and H is not empty. Now suppose that a, b ∈ H. Then a, b ∈ H_λ for all λ, so ab ∈ H_λ for all λ. Thus ab ∈ H. Finally, a^{-1} ∈ H_λ for all λ, so it also follows that a^{-1} ∈ H. Thus H ≤ G.

Theorem 7.6. Let H ≤ G and suppose that g ∈ G. Then the set gHg^{-1} = {ghg^{-1} | h ∈ H} is a subgroup of G. Such a subgroup is called a conjugate of H (by g).

Proof. e ∈ H, so e = geg^{-1} ∈ gHg^{-1}. Thus gHg^{-1} ≠ ∅. Suppose that x, y ∈ gHg^{-1}. Then x = gag^{-1} and y = gbg^{-1} for some a, b ∈ H.
Thus xy = gag^{-1}gbg^{-1} = gabg^{-1} ∈ gHg^{-1}, since ab ∈ H. Finally x^{-1} = (gag^{-1})^{-1} = (g^{-1})^{-1}a^{-1}g^{-1} = ga^{-1}g^{-1} ∈ gHg^{-1}, since a^{-1} ∈ H.

Definition 7.7. Let G be a group and a_λ ∈ G for λ ∈ Λ. Then the intersection of all subgroups H of G such that a_λ ∈ H for all λ ∈ Λ is denoted by ⟨a_λ, λ ∈ Λ⟩ and is called the subgroup generated by the a_λ.

Exercise 7.8. Show that ⟨a_λ, λ ∈ Λ⟩ is actually a subgroup of G.

Exercise 7.9. Let G = S_3. Show that the only proper, nontrivial subgroups of G are ⟨(1, 2)⟩, ⟨(1, 3)⟩, ⟨(2, 3)⟩ and ⟨(1, 2, 3)⟩. Thus, counting the trivial subgroup and G itself, there are precisely 6 subgroups of S_3.

Definition 7.10. Let G be a group. Then the center of G, denoted Z(G), is the subset of all g ∈ G such that gx = xg for all x ∈ G.

Exercise 7.11. Prove that the center of G is a subgroup of G.

Exercise 7.12. Find Z(S_3).

Definition 7.13. Let G be a group and S ⊆ G. Then the centralizer of S in G is the set C_G(S) = {g ∈ G | gs = sg for all s ∈ S}.

When the context is clear, we also denote the centralizer of S by C(S), and when S = {a} is a singleton, we usually denote C({a}) more compactly as C(a). Note that C(G) = Z(G) is just the center of G. Also, note that C(e) = G, since every element of G commutes with e.

Proposition 7.14. Let S ⊆ G. Then C(S) is a subgroup of G; i.e., C(S) ≤ G.

Proof. First note that e ∈ C(S), since e commutes with any element of G, so it certainly commutes with every element in S. Thus C(S) ≠ ∅. Suppose that g, h ∈ C(S). If s ∈ S, then gs = sg and hs = sh, so (gh)s = g(hs) = g(sh) = (gs)h = (sg)h = s(gh), so gh ∈ C(S). Finally, if g ∈ C(S) and s ∈ S, then g^{-1}s = g^{-1}se = g^{-1}sgg^{-1} = g^{-1}gsg^{-1} = esg^{-1} = sg^{-1}. Thus g^{-1} commutes with s, so g^{-1} ∈ C(S). This shows that C(S) ≤ G.

8. Cyclic Groups

Definition 8.1. Let G be a group and g ∈ G. The cyclic subgroup generated by g, denoted by ⟨g⟩, is the set {g^k | k ∈ Z}. For a group G with commutative operation +, the cyclic subgroup ⟨g⟩ is given by ⟨g⟩ = {kg | k ∈ Z}.
This is because kg is the analogue of g^k for groups with operation +.

Example 8.2. We show that ⟨g⟩ is actually a subgroup of G. First, it is not empty, since g = g^1 ∈ ⟨g⟩. Next, suppose that x, y ∈ ⟨g⟩. Then x = g^k and y = g^ℓ for some k, ℓ ∈ Z, so xy = g^k g^ℓ = g^{k+ℓ} ∈ ⟨g⟩. Finally x^{-1} = g^{-k} ∈ ⟨g⟩.

Theorem 8.3. Let G be a group and g ∈ G. Then the cyclic subgroup ⟨g⟩ generated by g is the intersection of all subgroups of G containing g. As a consequence, if H ≤ G and g ∈ H, then ⟨g⟩ ≤ H.

Proof. Let ℋ be the collection of all subgroups of G containing g. Then ⟨g⟩ ∈ ℋ, so ⋂{H | H ∈ ℋ} ⊆ ⟨g⟩. On the other hand, suppose that H ∈ ℋ. We show by induction that g^m ∈ H for all m ∈ P. Clearly, g^1 = g ∈ H, since H ∈ ℋ. Suppose that g^m ∈ H. Then g^{m+1} = g^m g ∈ H, since H is closed under products. Thus, by the principle of mathematical induction, g^m ∈ H for all m ∈ P. Moreover, if m ∈ P, then g^{-m} = (g^m)^{-1} ∈ H, since H is closed under inverses. Finally, g^0 = e ∈ H, since every subgroup contains the identity. Thus ⟨g⟩ ⊆ H. It follows from this that ⟨g⟩ ⊆ ⋂{H | H ∈ ℋ}. Thus equality holds. The second statement of the theorem follows immediately from the first.

Example 8.4. Let G be a group. Then ⟨e⟩ = {e} is the cyclic subgroup generated by the identity element. It is called the trivial subgroup of G. The group G is also a subgroup of G, called the improper subgroup of G. Note that every group has these two subgroups, and they are distinct unless G is a one element group. All other subgroups of G are called proper, nontrivial subgroups.

Definition 8.5. A group G is called a cyclic group if there is an element g ∈ G such that G = ⟨g⟩. An element g such that G = ⟨g⟩ is called a generator of the group G.

Example 8.6. The group Z is cyclic, because Z = ⟨1⟩. Moreover, the group Z_n is also cyclic, since Z_n = ⟨1̄⟩.

Definition 8.7. Suppose that G and G′ are groups.
If there is an isomorphism ϕ : G → G′ we say that G and G′ are isomorphic groups (or simply that G and G′ are isomorphic). We denote this by G ≅ G′.

Exercise 8.8. Show that ≅ is an equivalence relation.

Actually, the collection of all groups is not a set, but is something called a class in set theory. Since we require an equivalence relation to be a relation on a set, it is not technically correct that ≅ is an equivalence relation, since there is no set of all groups. But we can extend the notion of an equivalence relation to classes, and if we do so, then it is true that ≅ is an equivalence relation.

Definition 8.9. If X is a finite set, the order of X, denoted as o(X) or |X|, is the number of elements in X. Otherwise, we say that X has infinite order and denote o(X) = ∞.

Theorem 8.10. Let G be a cyclic group. If G is infinite, then G ≅ Z. If G is finite, then G ≅ Z_n where o(G) = n.

Proof. Let G = ⟨g⟩. Suppose that o(G) = n < ∞. The sequence g = g^1, . . . , g^{n+1} has n + 1 terms, so it must contain some duplicates, so there is some 1 ≤ k < ℓ ≤ n + 1 such that g^k = g^ℓ. Then g^{ℓ−k} = e. Moreover ℓ − k ≤ n + 1 − 1 = n. Since the set X = {x ∈ P | g^x = e} includes ℓ − k, X is not empty and so has a least element m. Moreover, m ≤ n. We claim that G = {g^1, . . . , g^m}, from which it follows that m = n.

It cannot happen that there are any duplicates in the set T = {g^1, . . . , g^m}, because if g^k = g^ℓ for some 1 ≤ k < ℓ ≤ m, then g^{ℓ−k} = e, and 1 ≤ ℓ − k < m, which would contradict the minimality of m. On the other hand, we claim that g^k ∈ T for all k ∈ Z. To see this, we first show this fact for k ∈ P. Let S = {k ∈ P | g^k ∈ T}. Suppose that x ∈ S for all x < k. If 1 ≤ k ≤ m, then clearly k ∈ S, so assume k > m. Then g^k = g^{k−m} g^m = g^{k−m} e = g^{k−m}. Now x = k − m < k, so by assumption x ∈ S, so g^k = g^x ∈ T. Thus k ∈ S. By the principle of strong induction, S = P. Next, note that g^0 = g^m ∈ T.
Finally, note that if k ∈ P, then mk − k = (m − 1)k ≥ 0. Moreover g^{mk−k} = (g^m)^k g^{-k} = e^k g^{-k} = e g^{-k} = g^{-k}, so g^{-k} = g^{mk−k} ∈ T. This shows that G = T, and thus m = n, since T has m elements and G has n elements.

Define a map ϕ : Z_n → G by ϕ(k̄) = g^k. We have to show that this map is well defined. Suppose that k̄ = ℓ̄. Then ℓ = k + nr for some r ∈ Z, so g^ℓ = g^{k+nr} = g^k (g^n)^r = g^k. Thus the right hand side in the definition of ϕ is independent of the choice of representative for k̄. Clearly, ϕ is a bijection, since it is surjective and o(Z_n) = o(G). We show that ϕ is an isomorphism. Now ϕ(k̄ + ℓ̄) = ϕ(\overline{k + ℓ}) = g^{k+ℓ} = g^k g^ℓ = ϕ(k̄)ϕ(ℓ̄). Thus ϕ satisfies the required condition of an isomorphism.

Next, we consider the case when G is infinite. Define ϕ : Z → G by ϕ(k) = g^k. By the definition of a cyclic group, ϕ is surjective. If ϕ were not injective, then there would have to be integers k and ℓ such that k < ℓ and g^k = g^ℓ. But then g^{ℓ−k} = e, and it would follow that G is finite, by the kind of argument we gave above. Therefore ϕ is injective, so it is a bijection. To see ϕ is an isomorphism, note that ϕ(k + ℓ) = g^{k+ℓ} = g^k g^ℓ = ϕ(k)ϕ(ℓ). Thus ϕ meets all of the conditions to be an isomorphism.

The theorem above shows that we do not encounter any new kinds of groups when considering cyclic groups. Note that if G is any group and g ∈ G, then the cyclic subgroup ⟨g⟩ is cyclic, so it must be either isomorphic to Z or to Z_n for some n ∈ P. This means there is a short menu for what the structure of ⟨g⟩ could be, and this is a very useful fact.

Theorem 8.11. Let m ∈ Z and d = gcd(m, n). Then ⟨m̄⟩ = ⟨d̄⟩ in Z_n. Moreover, if n ∈ P, this subgroup has order n/d. As a consequence, the number of subgroups of Z_n is equal to the number of distinct nonnegative divisors of n.

Proof. Let H = ⟨m̄⟩ and K = ⟨d̄⟩. We show that H ⊆ K and K ⊆ H, from which it follows that H = K. Express d = rm + sn. Thus d̄ = r m̄, so that d̄ ∈ ⟨m̄⟩.
It follows that K ⊆ H, since any multiple of d̄ is a multiple of m̄. On the other hand, since m is a multiple of d, m̄ is a multiple of d̄. Thus m̄ ∈ K, and it follows that H ⊆ K. To see that o(⟨d̄⟩) = n/d, note that if 1 ≤ k < n/d, then 0 < kd < n, so k d̄ ≠ 0̄. Moreover, if 1 ≤ k < ℓ ≤ n/d, we could not have k d̄ = ℓ d̄, because in that case we would have (ℓ − k) d̄ = 0̄, and since 1 ≤ ℓ − k < n/d, this is impossible by the previous remark. Since (n/d) d̄ = n̄ = 0̄, the distinct elements d̄, 2d̄, . . . , (n/d) d̄ make up all of ⟨d̄⟩. It follows that o(⟨d̄⟩) = n/d.

Corollary 8.12. Every subgroup of a cyclic group is cyclic.

Proof. First, note that the map ϕ : Z → Z_0, given by ϕ(a) = ā, is surjective, since every element in Z_n for any n is of the form ā for some a ∈ Z. Next note that a = b (mod 0) if and only if b − a = k · 0 = 0, that is, if and only if a = b. Thus ϕ is injective, since if ϕ(a) = ϕ(b) then ā = b̄, which only happens if a = b. Thus Z ≅ Z_0.

Now let H ≤ Z_n. If H = ⟨0̄⟩, it is cyclic. Otherwise, let X = {x ∈ P | x̄ ∈ H}. Since H ≠ ⟨0̄⟩, there must be some nonzero x ∈ Z such that x̄ ∈ H. If x ∈ P then x ∈ X. If x < 0, then \overline{−x} = −x̄ ∈ H, so −x ∈ X. Thus X ≠ ∅, so it must have a least element d. Suppose that m̄ ∈ H. Express m = qd + r where 0 ≤ r < d. Since r̄ = m̄ − q d̄, r̄ ∈ H. Now if r > 0, it follows that r ∈ X, which would contradict the minimality of d. Thus r = 0 and m̄ = q d̄ ∈ ⟨d̄⟩, so H ⊆ ⟨d̄⟩. Since ⟨d̄⟩ ⊆ H, it follows that H = ⟨d̄⟩. Thus H is cyclic.

Because of our remarks above that Z ≅ Z_0, we see that the theorem holds both for Z and Z_n for all n. Of course, since Z is not finite, the counting part of the theorem does not apply. However, we can easily see that ⟨k⟩ is an infinite cyclic group, so is isomorphic to Z, except when k = 0, in which case ⟨0⟩ = {0} ≅ Z_1, the cyclic group with exactly one element.

The moral of all this discussion of cyclic groups is that they are not very interesting, since they are all isomorphic to groups of the form Z_n, which we already have studied. Moreover, they are all abelian, and all subgroups of cyclic groups are cyclic.

Definition 8.13.
Given a group G, its subgroup lattice is the collection ℋ of all subgroups of G, partially ordered by inclusion. The lattice diagram corresponding to the subgroup lattice is the graph with one vertex for each subgroup, with edges corresponding to subgroup inclusion. These edges occur between subgroups K and H provided that K ≤ H and there is no subgroup X such that K ≤ X ≤ H except for the trivial cases when X = H or X = K. We also draw the graph so that if K ≤ H, and K ≠ H, then the vertex corresponding to K lies lower in the graph than the vertex corresponding to H. A little thought reveals that there is one top vertex, corresponding to the improper subgroup G, and one bottom vertex, corresponding to the trivial subgroup ⟨e⟩. We give two examples below of lattice diagrams, for the groups S_3 and Z_45.

[Figure 1. Subgroup Lattice for S_3, where σ = (1, 2, 3) and τ = (1, 2). Top vertex: ⟨σ, τ⟩; middle row: ⟨τ⟩, ⟨στ⟩, ⟨σ⟩, ⟨σ^2 τ⟩; bottom vertex: ⟨e⟩.]

[Figure 2. Subgroup Lattice for Z_45. Vertices: ⟨1⟩, ⟨3⟩, ⟨5⟩, ⟨9⟩, ⟨15⟩, ⟨0⟩.]

9. Morphisms of Groups

Definition 9.1. Let G, G′ be groups. Then a morphism of groups or homomorphism is a map ϕ : G → G′ which satisfies ϕ(gh) = ϕ(g)ϕ(h) for all g, h ∈ G.

Example 9.2. If ϕ : G → G′ is an isomorphism, then it is a homomorphism, since the morphism condition above is one of the two conditions of an isomorphism.

Example 9.3. If G, G′ are groups, then the trivial morphism from G to G′ is the map ϕ(g) = e for all g ∈ G. It is clearly a homomorphism, since ϕ(gh) = e = e · e = ϕ(g)ϕ(h), for any g, h ∈ G. Thus there is always at least one morphism between any two groups G and G′.

Example 9.4. The identity morphism 1_G : G → G is the map 1_G(g) = g for all g ∈ G. It is easy to show it is a morphism of groups.

Proposition 9.5. Let ϕ : G → G′ be a homomorphism. Then
(1) ϕ(e) = e.
(2) ϕ(g^{-1}) = ϕ(g)^{-1}.
(3) ϕ(g^m) = ϕ(g)^m for any m ∈ Z.

Proof. First, note that ϕ(e) = ϕ(e · e) = ϕ(e)ϕ(e).
By uniqueness of identity in G′, this forces ϕ(e) = e. Next, note that e = ϕ(e) = ϕ(gg^{-1}) = ϕ(g)ϕ(g^{-1}), for any g ∈ G. By uniqueness of inverse, it follows that ϕ(g^{-1}) = ϕ(g)^{-1}. Finally, we prove the last result by first showing it for m ∈ P by induction. Since ϕ(g^1) = ϕ(g) = ϕ(g)^1, it holds for m = 1. Suppose that ϕ(g^m) = ϕ(g)^m. Then ϕ(g^{m+1}) = ϕ(g^m g) = ϕ(g^m)ϕ(g) = ϕ(g)^m ϕ(g) = ϕ(g)^{m+1}. Thus the formula holds for all m ∈ P. Next ϕ(g^0) = ϕ(e) = e = ϕ(g)^0, so it holds for m = 0. Finally, if m ∈ P, then ϕ(g^{-m}) = ϕ((g^{-1})^m) = ϕ(g^{-1})^m = (ϕ(g)^{-1})^m = ϕ(g)^{-m}. Thus the formula holds for all m ∈ Z.

Theorem 9.6. Suppose that ϕ : G → G′ and ψ : G′ → G′′ are morphisms of groups. Then ψ ◦ ϕ : G → G′′ is also a morphism of groups.

Proof. Let g, h ∈ G. Then
(ψ ◦ ϕ)(gh) = ψ(ϕ(gh)) = ψ(ϕ(g)ϕ(h)) = ψ(ϕ(g))ψ(ϕ(h)) = (ψ ◦ ϕ)(g)(ψ ◦ ϕ)(h).

Corollary 9.7. Let ϕ : G → G′ and ψ : G′ → G′′ be isomorphisms. Then ψ ◦ ϕ is an isomorphism.

Proof. We already showed that the composition of two bijections is a bijection. By the theorem, the composition of two isomorphisms is a morphism. Since an isomorphism is just a bijective morphism, this shows that ψ ◦ ϕ is an isomorphism.

Definition 9.8. Let ϕ : G → G′ be a homomorphism of groups. Then the kernel of ϕ, denoted ker(ϕ), is the subset of G given by ker(ϕ) = {g ∈ G | ϕ(g) = e}.

The notion of a kernel of a morphism of groups is parallel to that of the kernel of a linear transformation. In fact, the kernel of a linear transformation is a special case of the kernel of a morphism, since a linear map λ between two vector spaces is a morphism of their underlying group operation of addition, and the kernel of λ, considered as a linear transformation, coincides with the kernel of the map λ, considered as a morphism of groups!

Proposition 9.9. The kernel of a morphism ϕ : G → G′ is a subgroup of G.

Proof. Let ϕ : G → G′ be a homomorphism, and let H = ker(ϕ). Now e ∈ H since ϕ(e) = e, so H ≠ ∅.
Suppose that g, h ∈ H. Then ϕ(gh) = ϕ(g)ϕ(h) = e · e = e, so gh ∈ H. Also ϕ(g^{-1}) = ϕ(g)^{-1} = e^{-1} = e, so g^{-1} ∈ H. Thus H ≤ G.

An interesting question is what kinds of subgroups of G turn up as kernels of morphisms. Since the kernel of the identity map 1_G is just {e}, we see that the trivial subgroup is the kernel of a morphism. Moreover, if we consider the trivial morphism from G to G, it is easy to see that the improper subgroup of G is the kernel of a morphism. We will see later, in the section on normal subgroups, that the kinds of subgroups which are kernels of morphisms are quite special!

One of the most powerful results about kernels of morphisms is the following.

Theorem 9.10. Let ϕ : G → G′ be a group homomorphism. Then ϕ is injective if and only if ker(ϕ) = {e}.

Proof. Suppose ϕ is injective and g ∈ ker(ϕ). Then ϕ(g) = e = ϕ(e), so g = e. Thus ker(ϕ) = {e}. On the other hand, suppose that ker(ϕ) = {e} and ϕ(g) = ϕ(h). Then ϕ(gh^{-1}) = ϕ(g)ϕ(h)^{-1} = e, so gh^{-1} = e and g = h. Thus ϕ is injective.

What makes the theorem above so powerful is that it reduces the proof of injectivity to the consideration of when ϕ(g) = e, instead of having to study when ϕ(g) = ϕ(h) in general. In many cases, computation of the kernel of a morphism is easy, so determination of whether the morphism is injective is equally easy.

Example 9.11. Let ϕ : Z → Z_n be defined by ϕ(a) = ā. Since ϕ(a + b) = \overline{a + b} = ā + b̄ = ϕ(a) + ϕ(b), it follows that ϕ is a homomorphism. Moreover a ∈ ker(ϕ) if and only if ϕ(a) = 0̄, if and only if ā = 0̄, if and only if a ∈ ⟨n⟩. Thus we have identified ker(ϕ) = ⟨n⟩.

Theorem 9.12. Suppose that ϕ : G → G′ is a morphism of groups and H ≤ G. Then ϕ(H) ≤ G′.

Proof. Since e = ϕ(e) ∈ ϕ(H), ϕ(H) ≠ ∅. Let x, y ∈ ϕ(H). Then x = ϕ(g) and y = ϕ(h) for some g, h ∈ H. But then xy = ϕ(g)ϕ(h) = ϕ(gh) ∈ ϕ(H), since gh ∈ H. Finally x^{-1} = ϕ(g)^{-1} = ϕ(g^{-1}) ∈ ϕ(H), since g^{-1} ∈ H.

Definition 9.13.
If ϕ : X → Y is a map and S ⊆ Y, then ϕ^{-1}(S) = {x ∈ X | ϕ(x) ∈ S} is called the inverse image of S under ϕ.

Theorem 9.14. Let ϕ : G → G′ be a group homomorphism and H′ ≤ G′. Then ϕ^{-1}(H′) ≤ G.

Proof. Let H = ϕ^{-1}(H′). Since ϕ(e) = e ∈ H′, e ∈ H. This shows that H ≠ ∅. Suppose that g, h ∈ H. Then ϕ(gh) = ϕ(g)ϕ(h) ∈ H′, since H′ is a subgroup and ϕ(g), ϕ(h) ∈ H′. Thus gh ∈ H. Moreover ϕ(g^{-1}) = ϕ(g)^{-1} ∈ H′, so g^{-1} ∈ H. Thus H ≤ G.

Definition 9.15. Let G be a group. Then the set Aut(G) of all isomorphisms from G onto G is called the group of automorphisms of G, or automorphism group of G.

Theorem 9.16. Aut(G) is a group under composition. In fact, Aut(G) ≤ S_G; i.e., the automorphism group of G is a subgroup of the permutation group of G.

Proof. Since the second statement implies the first, we prove that Aut(G) ≤ S_G. Since an automorphism is a bijection by definition, Aut(G) ⊆ S_G. Moreover, composition is the group operation in S_G. We already showed that the identity 1_G is a morphism, and it is a bijection, so it is in Aut(G), which implies that Aut(G) ≠ ∅. Suppose that ϕ, ψ ∈ Aut(G). Then we already showed that ϕ ◦ ψ is an isomorphism, so it lies in Aut(G). Finally, we need to show that ϕ^{-1} ∈ Aut(G), which follows if we can show it is a morphism. But
ϕ^{-1}(gh) = ϕ^{-1}(ϕ(ϕ^{-1}(g))ϕ(ϕ^{-1}(h))) = ϕ^{-1}(ϕ(ϕ^{-1}(g)ϕ^{-1}(h))) = ϕ^{-1}(g)ϕ^{-1}(h).
Thus ϕ^{-1} is a morphism, which shows it is in Aut(G).

Theorem 9.17 (Cayley’s Theorem). The map ϕ : G → S_G given by ϕ(g)(h) = gh is an injective morphism of groups. Thus G is isomorphic to a subgroup of S_G. In particular, every group is isomorphic to a subgroup of a permutation group.

Proof. First, we need to show that if g ∈ G, then ϕ(g) ∈ S_G. To see that ϕ(g) is injective, suppose ϕ(g)(h) = ϕ(g)(h′). Then gh = gh′ so by left cancelation h = h′. Thus ϕ(g) is injective. To see that ϕ(g) is surjective, suppose y ∈ G. Then ϕ(g)(g^{-1}y) = gg^{-1}y = y. Thus ϕ(g) is surjective, so it is bijective. This shows that ϕ(g) ∈ S_G.
Next, suppose that g, g′ ∈ G. Then ϕ(gg′)(x) = gg′x = ϕ(g)(g′x) = ϕ(g)(ϕ(g′)(x)) = (ϕ(g) ◦ ϕ(g′))(x). Thus ϕ(gg′) = ϕ(g) ◦ ϕ(g′), which shows that ϕ is a morphism of groups. Suppose that g ∈ ker(ϕ). Then ϕ(g) = 1_G, so g = ge = ϕ(g)(e) = 1_G(e) = e. Thus ker(ϕ) = {e}, which shows that ϕ is injective. Finally, consider the restriction of the codomain of ϕ to the image ϕ(G). Then the induced map G → ϕ(G) is still injective, and is clearly surjective. This means that ϕ : G → ϕ(G) is an isomorphism.

In older textbooks, it was pointed out that Cayley’s Theorem is interesting from a theoretical viewpoint, but is utterly useless as a calculation tool. However, times have changed. The computational algebra software Maple has a group theory package that can help analyze properties of groups, but only if they are given as groups of permutations, that is, as subgroups of some permutation group. Cayley’s Theorem can be used to express a finite group as a permutation group in a very concrete manner.

Example 9.18. Let G be a group and g ∈ G. Then the map c_g : G → G given by c_g(x) = gxg^{-1} is called conjugation by g. We claim that c_g is an automorphism of G. To see this, first note that c_g(xy) = gxyg^{-1} = gxg^{-1}gyg^{-1} = c_g(x)c_g(y), so c_g is a morphism. Thus, to show that c_g is injective, we need only show that ker(c_g) = {e}. Suppose that x ∈ ker(c_g). Then e = c_g(x), so gxg^{-1} = e, and multiplying both sides on the left by g^{-1} and on the right by g yields x = e. To see that c_g is surjective, let y ∈ G. We compute c_g(g^{-1}yg) = gg^{-1}ygg^{-1} = y. Thus c_g is an isomorphism from G onto G, which is precisely what an automorphism of G is.

Theorem 9.19. Let ϕ : G → G′ be a surjective morphism of groups and K = ker(ϕ). Then the map H′ ↦ ϕ^{-1}(H′) is an order preserving bijection between the set of subgroups of G′ and the set of subgroups of G which contain K.

Proof.
First, note that if H = ϕ^{-1}(H′) for some H′ ≤ G′, then if g ∈ ker(ϕ), we have ϕ(g) = e ∈ H′, which implies that g ∈ ϕ^{-1}(H′). This shows that ker(ϕ) ⊆ ϕ^{-1}(H′), which means that H′ ↦ ϕ^{-1}(H′) is a well defined map between subgroups of G′ and subgroups of G containing ker(ϕ).

Now suppose that H is a subgroup of G containing ker(ϕ). Then H′ = ϕ(H) is a subgroup of G′. We claim that ϕ^{-1}(H′) = H. To see this, suppose that a ∈ ϕ^{-1}(H′). Then ϕ(a) ∈ H′ = ϕ(H), so there is some h ∈ H such that ϕ(a) = ϕ(h). But then ϕ(ah^{-1}) = e, so ah^{-1} ∈ H, since H contains the kernel of ϕ. From this we conclude that a = ah^{-1}h ∈ H. Thus ϕ^{-1}(H′) ⊆ H. However, if h ∈ H then ϕ(h) ∈ ϕ(H) = H′, so h ∈ ϕ^{-1}(H′). Thus H ⊆ ϕ^{-1}(H′), and we see that H = ϕ^{-1}(H′). This shows that the map H′ ↦ ϕ^{-1}(H′) is surjective to the set of all subgroups of G containing ker(ϕ).

Next, we show it is injective. Let H′, K′ be subgroups of G′ and suppose that H′ ≠ K′. Suppose first that there is some x ∈ H′ with x ∉ K′. Then there is some g ∈ G such that ϕ(g) = x, since ϕ is surjective. We have g ∈ ϕ^{-1}(H′) but g ∉ ϕ^{-1}(K′). Thus ϕ^{-1}(H′) ≠ ϕ^{-1}(K′). If instead we have some x ∈ K′ which does not lie in H′, we deduce by a similar argument that the two inverse images are not equal. This shows that the map is injective.

Notice that in the proof above, we only needed that ϕ was surjective to show that H′ ↦ ϕ^{-1}(H′) is an injective map to the set of subgroups of G containing ker(ϕ). When ϕ is not surjective, the proof above shows that the map is still well defined and surjective.

10. Cosets

In this section we consider a fixed subgroup H of a group G.

Definition 10.1. Let H ≤ G. We define a relation ∼_L, called left equivalence, by a ∼_L b if and only if b^{-1}a ∈ H. Similarly, we define right equivalence by a ∼_R b if and only if ab^{-1} ∈ H.

Theorem 10.2. The relations left equivalence and right equivalence are equivalence relations.

Proof.
We prove the result for left equivalence, which we will denote by ∼ instead of ∼_L, and we leave the proof for right equivalence, which is similar, to the reader. First, note that a ∼ a because a^{-1}a = e ∈ H. Next, suppose that a ∼ b. Then b^{-1}a = h ∈ H. Thus a^{-1}b = (b^{-1}a)^{-1} = h^{-1} ∈ H. This shows that b ∼ a. Finally, suppose that a ∼ b and b ∼ c. Then b^{-1}a = h ∈ H and c^{-1}b = h′ ∈ H. It follows that c^{-1}a = c^{-1}bb^{-1}a = h′h ∈ H. Thus a ∼ c.

Definition 10.3. For the equivalence relation ∼_L, we call the equivalence class of an element a the left coset of a. The left coset of a is often denoted as aH, although, for simplicity, we will usually just denote it as ā. Similarly, the equivalence class of a under ∼_R is called the right coset of a, and is often denoted as Ha. The set of left cosets of H in G is denoted by G/H. The number of elements of G/H is called the index of H in G, and is denoted by [G : H]; in other words, [G : H] = o(G/H).

In fact, there is some ambiguity in the notation, as ā = {b ∈ G | a ∼_L b} by definition, but aH = {ah | h ∈ H}. However, it is easy to see that these two sets are the same, so the ambiguity is only apparent, not real.

Lemma 10.4. The map ϕ : ā → b̄, given by ϕ(x) = ba^{-1}x, is a well defined bijection of aH onto bH.

Proof. Let x ∈ ā. Then x = ah for some h ∈ H, so ϕ(x) = ba^{-1}ah = bh ∈ b̄. This shows that ϕ is well defined. Suppose that ϕ(x) = ϕ(y) for some x, y ∈ ā. Then x = ah and y = ah′ for some h, h′ ∈ H. We compute that ϕ(x) = bh and ϕ(y) = bh′, so bh = bh′ and by left cancelation, we conclude that h = h′. This forces x = y. Thus ϕ is injective. Now let y ∈ b̄. Then y = bh for some h ∈ H. Let x = ah. Then x ∈ ā and clearly ϕ(x) = y. Thus ϕ is surjective. This shows that ϕ is a bijection of ā onto b̄.

Theorem 10.5 (Lagrange’s Theorem). Suppose that o(G) < ∞, and H ≤ G. Then o(H) | o(G).

Proof. By the lemma, it follows that the number of elements in ā is independent of a. Note that ē = H, so we have o(H) = o(ā) for all a ∈ G.
Recall that the sets ā for a ∈ G are either disjoint or coincide, and that G is the union of all such sets. Since G is finite, it follows that G is a finite union of disjoint sets, each of which has o(H) elements. If there are m distinct such sets, we obtain that o(G) = m · o(H). It follows that o(H) | o(G).

Corollary 10.6. If G is a finite group and g ∈ G, then o(g) | o(G).

Proof. Recall that the order of an element g is the least positive integer n such that g^n = e. But by the classification of cyclic groups, we also know that ⟨g⟩ = {g^1, . . . , g^n}, so o(g) = o(⟨g⟩). Thus, by the theorem, o(g) | o(G).

Example 10.7. We show that there are exactly two groups of order 4, up to isomorphism. Let G be a group of order 4. If there is an element g of order 4 in G, then G is cyclic, so G ≅ Z_4. Otherwise, every non identity element in G has order 2. Let e be the identity element in G. Then there are three non identity elements in G. Denote two such distinct elements by a and b. Then ab cannot equal a, e or b, so the fourth element is ab. Thus, we have the following Cayley Table for G.

      e    a    b    ab
 e    e    a    b    ab
 a    a    e    ab   b
 b    b    ab   e    a
 ab   ab   b    a    e

Notice that the Cayley Table is symmetric, so the group corresponding to the Cayley Table, if there is such a group, is abelian. However, to check that the Cayley Table above gives a group would require 128 calculations to check associativity. Thus, it is more convenient if we can find a group with this Cayley Table. The group described by this table is not cyclic, nor is it isomorphic to S_n, for any n. Nevertheless, we can find a subgroup of S_4 which has this Cayley Table. Let a = (1, 2) and b = (3, 4). Then a^2 = b^2 = e. Moreover ab = ba. It follows from this fact that the set H = {a, b, ab, e} is a subgroup of S_4 whose Cayley Table is given above.

The group with Cayley Table above has several names in the literature. It is called the Klein 4-group, after Felix Klein, 1849-1925.
It is also isomorphic to the group Z_2 × Z_2, a group which we will understand when we discuss product groups. It is also isomorphic to the Dihedral Group D_2, which is also denoted by some as D_4. We will discuss Dihedral groups later.

11. Normal Subgroups and Quotient Groups

Proposition 11.1. Let K = ker(ϕ) be the kernel of a morphism ϕ : G → G′. Then if g ∈ G and x ∈ K, gxg^{-1} ∈ K. In other words, c_g(K) = K for every g ∈ G, where c_g is the conjugation by g morphism introduced in Example 9.18.

Proof. Let g ∈ G and x ∈ K. Then ϕ(x) = e. We compute
ϕ(gxg^{-1}) = ϕ(g)ϕ(x)ϕ(g)^{-1} = ϕ(g)eϕ(g)^{-1} = e.
Thus gxg^{-1} ∈ K. To see the second statement, we note that c_g(K) ⊆ K by the first statement. To see that equality holds, suppose y ∈ K and g ∈ G. Then x = g^{-1}yg = c_{g^{-1}}(y) ∈ K. Moreover, c_g(x) = y. Thus K ⊆ c_g(K) and equality holds.

From the proposition above, we see that kernels of morphisms have a special property among subgroups. This property turns out to be very important, leading to the definition below.

Definition 11.2. A subgroup H ≤ G is said to be normal in G if gxg^{-1} ∈ H whenever x ∈ H and g ∈ G. When H is normal in G, we denote this by H ▹ G.

Since we defined a subgroup to be normal by requiring it to have a property which we already showed that kernels of morphisms satisfy, it follows that every kernel of a morphism is automatically a normal subgroup.

Theorem 11.3. Let H ≤ G. Then the following are equivalent.
(1) H ▹ G.
(2) gxg^{-1} ∈ H for all x ∈ H and g ∈ G.
(3) c_g(H) ⊆ H for all g ∈ G.
(4) c_g(H) = H for all g ∈ G.

Proof. The equivalence of conditions (1) and (2) is a matter of definition. That (2) is equivalent to (3) is straightforward. That (4) implies (3) is clear. To see that (3) implies (4), the same argument as in Proposition 11.1 works.

Example 11.4. We give an example of a subgroup which is not normal. Let H = ⟨(1, 2)⟩ = {(1, 2), e} be the subgroup of S_3 generated by the transposition (1, 2).
Then c_{(1,3)}((1, 2)) = (1, 3)(1, 2)(1, 3) = (2, 3) ∉ H, so H is not normal in S_3.

The fact that we chose a nonabelian group in order to find a subgroup which was not normal in the group is no accident.

Theorem 11.5. Every subgroup of an abelian group is normal in the group.

Proof. Let H ≤ G and suppose G is abelian. If x ∈ H and g ∈ G, then c_g(x) = gxg^{-1} = xgg^{-1} = x ∈ H, so H ▹ G.

It is very important to notice that the theorem does not imply that an abelian subgroup of a group is normal in the group. In fact, in Example 11.4, the subgroup H of S_3 is abelian, because it is a cyclic subgroup, but it is not normal in S_3.

We would like to explore the possibility of giving a group structure to the set of left cosets G/H of a subgroup H of G. Remember that we denote the left coset of g by ḡ. How would we define ā · b̄? There seems to be only one natural definition: ā · b̄ = \overline{ab}. The problem with this “definition” is that it is not obvious that it is well-defined, since this rule can be stated as “Take some element out of the first set, some element out of the second set, then multiply them and take the resulting coset”. Of course, the answer may depend on which elements we take from the two sets. We encountered this problem when defining addition and multiplication on Z_n and were able to resolve it. It turns out that in this situation, the problem may not be insurmountable. We will break the problem down into some steps and see that some condition on H is necessary in order for this definition to work.

Theorem 11.6. Suppose H ≤ G and that the multiplication of cosets on G/H given by ā · b̄ = \overline{ab} is well-defined. Then
• G/H is a group under this multiplication with identity ē and inverse given by (ā)^{-1} = \overline{a^{-1}}.
• The map π : G → G/H given by π(a) = ā is a surjective morphism with kernel H.
• H is normal in G.
• If G is finite, then o(G/H) = o(G)/o(H).
Thus, in order for the product above to be well defined, a necessary condition is that H ▹ G.
Conversely, when $H \triangleleft G$, the product above is well defined, so this condition is sufficient as well.

Proof. Assume $\bar a \cdot \bar b = \overline{ab}$ is well defined. Then
$$(\bar a \cdot \bar b) \cdot \bar c = \overline{ab} \cdot \bar c = \overline{abc} = \bar a \cdot \overline{bc} = \bar a \cdot (\bar b \cdot \bar c),$$
$$\bar a \cdot \bar e = \overline{ae} = \bar a = \overline{ea} = \bar e \cdot \bar a,$$
$$\bar a \cdot \overline{a^{-1}} = \overline{aa^{-1}} = \bar e = \overline{a^{-1}a} = \overline{a^{-1}} \cdot \bar a.$$
Thus $G/H$ is a group. Now define $\pi$ as above, and we compute
$$\pi(ab) = \overline{ab} = \bar a \cdot \bar b = \pi(a)\pi(b).$$
Thus $\pi$ is a morphism of groups, and it is surjective because every coset has the form $\bar a = \pi(a)$. We have $\pi(a) = \bar e$ precisely when $\bar a = \bar e$, which happens if and only if $a \in \bar e = H$. Thus $\ker(\pi) = H$. Since kernels of morphisms are normal in their group, $H \triangleleft G$. The fact that $o(G/H) = o(G)/o(H)$ doesn't depend on the group structure of $G/H$, but is simply a coset counting formula, which we already used in establishing Lagrange's Theorem.

Now, let us show that when $H \triangleleft G$, the product above is well defined. Suppose that $a_1 \in \bar a$ and $b_1 \in \bar b$. Then $a_1 = ah$ and $b_1 = bh'$ for some $h, h' \in H$. We then compute
$$a_1 b_1 = ahbh' = ab\,(b^{-1}hb)\,h' \in \overline{ab},$$
since $b^{-1}hb \in H$, so $b^{-1}hbh' \in H$. This shows that $\overline{a_1 b_1} = \overline{ab}$, and the product is well defined.

Definition 11.7. If $H \triangleleft G$, then the group $G/H$, equipped with the induced product above, is called a quotient group or factor group of $G$.

Example 11.8. In any group $G$, the improper subgroup $G$ and the trivial subgroup $\{e\}$ are both normal in $G$. The group $G/G$ has one element, while the group $G/\{e\}$ is isomorphic to $G$. Thus, we have identified the isomorphism classes of these two quotient groups.

Example 11.9. Let $G = S_3$. Then the complete list of subgroups of $G$ is $G$, $\langle(1,2)\rangle$, $\langle(1,3)\rangle$, $\langle(2,3)\rangle$, $\langle(1,2,3)\rangle$, and $\{e\}$. Of these subgroups, only $G$, $\langle(1,2,3)\rangle$, and $\{e\}$ are normal in $G$; the three subgroups generated by a transposition are not, by the same computation as in Example 11.4.

Example 11.10. Let $n \in \mathbb{Z}$, and let $H = \langle n \rangle = n\mathbb{Z}$ be the subgroup of $\mathbb{Z}$ generated by $n$. Then we claim that $\mathbb{Z}/n\mathbb{Z} = \mathbb{Z}_n$. This shows that $\mathbb{Z}_n$ is a quotient group of $\mathbb{Z}$. To see this claim, note that $a \equiv b \pmod{n}$ precisely when $b - a \in \langle n \rangle$. Thus the equivalence classes mod $n$ are precisely the left cosets of $H$. Also, the formula for addition of cosets is the same in both cases. This shows that every quotient group of $\mathbb{Z}$ is cyclic.
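The role of normality in Theorem 11.6 can be checked by brute force in a small group. The following is a minimal sketch, not part of the original notes: it assumes permutations of $\{0, 1, 2\}$ stored as Python tuples as a model of $S_3$, and the helper names (`compose`, `left_coset`, `coset_product_is_well_defined`) are hypothetical. It tests whether the coset of a product depends on the chosen representatives, once for the normal subgroup $\langle(1,2,3)\rangle$ and once for the non-normal subgroup $\langle(1,2)\rangle$ of Example 11.4.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q (apply q first); permutations are tuples."""
    return tuple(p[q[i]] for i in range(len(q)))


def left_coset(a, H):
    """Return the left coset aH as a frozenset."""
    return frozenset(compose(a, h) for h in H)


def coset_product_is_well_defined(G, H):
    """Check that the coset of a1*b1 is the same for all a1 in aH, b1 in bH."""
    for a in G:
        for b in G:
            target = left_coset(compose(a, b), H)
            for a1 in left_coset(a, H):
                for b1 in left_coset(b, H):
                    if left_coset(compose(a1, b1), H) != target:
                        return False
    return True


S3 = list(permutations(range(3)))   # S_3 acting on {0, 1, 2}
e = (0, 1, 2)
H_transposition = [e, (1, 0, 2)]    # <(1 2)>, not normal (Example 11.4)
A3 = [e, (1, 2, 0), (2, 0, 1)]      # <(1 2 3)>, normal

print(coset_product_is_well_defined(S3, A3))               # True
print(coset_product_is_well_defined(S3, H_transposition))  # False
```

The check is exhaustive, so it is only practical for very small groups, but it makes the "choice of representatives" issue concrete.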
Theorem 11.11. Let $n \in \mathbb{P}$ and let $H = \langle d \rangle$ be the subgroup of $\mathbb{Z}_n$ generated by a positive divisor $d$ of $n$. Then $\mathbb{Z}_n/H \cong \mathbb{Z}_d$. In particular, every quotient group of $\mathbb{Z}_n$ is cyclic.

Proof. We already showed that $o(H) = n/d$. Moreover, it is easy to see that $\mathbb{Z}_n/H$ is generated by the image of 1, so $\mathbb{Z}_n/H$ is cyclic. To determine what cyclic group it is isomorphic to, we need only find $o(\mathbb{Z}_n/H)$. But we have
$$o(\mathbb{Z}_n/H) = \frac{o(\mathbb{Z}_n)}{o(H)} = \frac{n}{n/d} = d.$$
This shows that $\mathbb{Z}_n/H \cong \mathbb{Z}_d$. Since every subgroup of $\mathbb{Z}_n$ is of the form $H = \langle d \rangle$, where $d$ is a positive divisor of $n$, it follows that every quotient group of $\mathbb{Z}_n$ is cyclic.

Theorem 11.12. Let $\varphi : G \to G'$ be a morphism of groups and let $H \le \ker(\varphi)$ be a normal subgroup of $G$. Then there is an induced morphism $\bar\varphi : G/H \to G'$, given by $\bar\varphi(\bar a) = \varphi(a)$.

Proof. Let $a_1 \in \bar a$, so that $a_1 = ah$ for some $h \in H$. Then
$$\varphi(a_1) = \varphi(ah) = \varphi(a)\varphi(h) = \varphi(a)e = \varphi(a).$$
This shows that $\bar\varphi$ is well defined. Next, we see that
$$\bar\varphi(\bar a \cdot \bar b) = \bar\varphi(\overline{ab}) = \varphi(ab) = \varphi(a)\varphi(b) = \bar\varphi(\bar a) \cdot \bar\varphi(\bar b),$$
so $\bar\varphi$ is a morphism.

Theorem 11.13 (First Isomorphism Theorem for Groups). Let $\varphi : G \to G'$ be a morphism of groups and $H = \ker(\varphi)$. Then $\varphi(G) \cong G/H$.

Proof. The map $\bar\varphi : G/H \to G'$ is well defined, and its image is $\varphi(G)$. Thus we obtain a morphism $G/H \to \varphi(G)$, which is surjective, since if $y \in \varphi(G)$, then $y = \varphi(g)$ for some $g \in G$, so that $y = \bar\varphi(\bar g)$. Moreover, this map is injective, since if $\bar\varphi(\bar a) = e$, then $\varphi(a) = e$, so that $a \in H$ and thus $\bar a = \bar e$. Thus $\ker(\bar\varphi) = \{\bar e\}$.

Using the first isomorphism theorem, we see that every morphism $\varphi : G \to G'$ factors as the composition of a surjective map, an isomorphism, and an injective map. The injective map is the inclusion $\varphi(G) \hookrightarrow G'$. (We use the symbol $\hookrightarrow$ to indicate an injective morphism.) The surjective map is the map $G \to G/\ker(\varphi)$. The isomorphism is the map $G/H \cong \varphi(G)$ given in the First Isomorphism Theorem for groups. Thus $\varphi$ is given by the composition
$$G \to G/\ker(\varphi) \cong \varphi(G) \hookrightarrow G'.$$
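The factorization above can be traced through on a concrete morphism. The sketch below is an illustration only and is not taken from the notes: it assumes the map $a \mapsto 4a \pmod{12}$ on $\mathbb{Z}_{12}$ as a sample morphism, and all variable names are hypothetical. It computes the kernel, the image, and the cosets of the kernel, and then checks that the induced map on cosets is well defined and matches the cosets bijectively with the image.

```python
n = 12


def phi(a):
    """Sample morphism Z_12 -> Z_12 given by a -> 4a (mod 12)."""
    return (4 * a) % n


G = list(range(n))

# phi respects addition mod 12, so it is a morphism of groups.
is_morphism = all(phi((a + b) % n) == (phi(a) + phi(b)) % n for a in G for b in G)

kernel = [a for a in G if phi(a) == 0]
image = sorted({phi(a) for a in G})

# Cosets a + ker(phi), and the set of values phi takes on each coset.
cosets = {frozenset((a + h) % n for h in kernel) for a in G}
induced = {coset: {phi(a) for a in coset} for coset in cosets}

# Well defined: phi is constant on each coset.
well_defined = all(len(values) == 1 for values in induced.values())
# Bijective onto the image: each image element arises from exactly one coset.
bijective = sorted(next(iter(v)) for v in induced.values()) == image

print(kernel, image, len(cosets), well_defined, bijective)
```

Here the kernel has 4 elements and the image has 3, in agreement with $o(G/H) = o(G)/o(H) = 12/4$.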
One consequence of all this gyration is that we can understand all morphisms if we understand surjective and injective morphisms. Moreover, to find the set of morphisms from $G$ to $G'$, we should study the quotient groups of $G$ and the subgroups of $G'$. Every morphism is given by an isomorphism from one of the quotient groups of $G$ to a subgroup of $G'$.

Example 11.14. We classify all morphisms from $\mathbb{Z}_4$ to $S_3$ and all morphisms from $S_3$ to $\mathbb{Z}_4$. We know that $\mathbb{Z}_4$ is abelian, so every subgroup is normal in $\mathbb{Z}_4$. There are precisely 3 subgroups of $\mathbb{Z}_4$, namely $H_0 = \langle 0 \rangle$, $H_1 = \langle 1 \rangle$, and $H_2 = \langle 2 \rangle$, since there are only 3 positive divisors of 4. There are 6 subgroups of $S_3$: $K_0 = \langle e \rangle$, $K_1 = \langle(1,2)\rangle$, $K_2 = \langle(1,3)\rangle$, $K_3 = \langle(2,3)\rangle$, $K_4 = \langle(1,2,3)\rangle$, and $S_3$ itself. Only the subgroups $K_0$, $K_4$ and $S_3$ are normal in $S_3$.

To find all the morphisms from $\mathbb{Z}_4$ to $S_3$, note that the quotient groups of $\mathbb{Z}_4$ are isomorphic to $\mathbb{Z}_1$, $\mathbb{Z}_2$ and $\mathbb{Z}_4$. The group $\mathbb{Z}_1$ is isomorphic to the subgroup $K_0$ of $S_3$, and there is exactly one isomorphism between these groups. Thus we obtain the map $\varphi_0 : \mathbb{Z}_4 \to S_3$, given by $\varphi_0(a) = e$ for all $a \in \mathbb{Z}_4$, which is, of course, the trivial morphism. Next, the group $\mathbb{Z}_2$ is isomorphic to each of the subgroups $K_1$, $K_2$ and $K_3$, and there is precisely one isomorphism from $\mathbb{Z}_2$ to each of them. This gives three more morphisms $\varphi_1$, $\varphi_2$ and $\varphi_3$. The first one is given by $\varphi_1(a) = (1,2)^a$, the second by $\varphi_2(a) = (1,3)^a$, and the third by $\varphi_3(a) = (2,3)^a$. Finally, we note that none of the subgroups of $S_3$ are isomorphic to $\mathbb{Z}_4$. Thus we have found exactly 4 morphisms from $\mathbb{Z}_4$ to $S_3$.

To find all the morphisms from $S_3$ to $\mathbb{Z}_4$, note that the quotients of $S_3$ are $S_3/S_3 \cong \mathbb{Z}_1$, $S_3/K_4 \cong \mathbb{Z}_2$, and $S_3/K_0 \cong S_3$. The first quotient is isomorphic to the subgroup $H_0$, and gives rise to the trivial morphism $S_3 \to \mathbb{Z}_4$. The second quotient is isomorphic to $H_2$, and gives rise to a morphism $\varphi : S_3 \to \mathbb{Z}_4$ given by
$$\varphi(\sigma) = \begin{cases} 2 & \text{if } \sigma \in \{(1,2), (1,3), (2,3)\}, \\ 0 & \text{otherwise.} \end{cases}$$
Finally, the third quotient group is not isomorphic to any subgroup of $\mathbb{Z}_4$. Thus there are exactly 2 morphisms $S_3 \to \mathbb{Z}_4$.

This example shows that even when the groups are small, the description of the morphisms between them is quite involved.

Theorem 11.15. Let $H$ and $K$ be two normal subgroups of $G$. Then $H \cap K \triangleleft G$.

Exercise 11.16. Prove the above theorem.

Definition 11.17. Let $H, K \le G$. Then define the product of the two subgroups by $HK = \{hk \mid h \in H, k \in K\}$.

The product of two subgroups need not be a subgroup of $G$. To see this, consider $H = \langle(1,2)\rangle$ and $K = \langle(1,3)\rangle$ in $S_3$. It is an easy exercise to show that $HK$ is not a subgroup of $S_3$. For example, by direct computation, we see that $HK$ has 4 elements, while $S_3$ has 6 elements, so by Lagrange's Theorem, it cannot be a subgroup. However, we do have the following important result.

Theorem 11.18. Let $H, K \le G$ and suppose that either $H \triangleleft G$ or $K \triangleleft G$. Then $HK$ is a subgroup of $G$.

Exercise 11.19. Prove the above theorem.

Theorem 11.20 (Second Isomorphism Theorem for groups). Let $H, K \le G$ and suppose that $K \triangleleft G$. Then $H \cap K \triangleleft H$, $K \triangleleft HK$, and
$$H/(H \cap K) \cong HK/K.$$

Proof. By Theorem 11.18, we know that $HK \le G$. Since $c_g(K) = K$ for all $g \in G$, this also holds for all $g \in HK$. Thus $K \triangleleft HK$. If $h \in H$ and $x \in H \cap K$, then $c_h(x) = hxh^{-1} \in H$ and $c_h(x) \in K$. Thus $c_h(x) \in H \cap K$. This shows that $H \cap K \triangleleft H$. Define $\varphi : H \to HK/K$ by $\varphi(h) = \bar h$. This definition makes sense since $H \subseteq HK$: because $e \in K$, we have $h = he \in HK$ for all $h \in H$. The map $\varphi$ is a morphism, being the restriction to $H$ of the projection $HK \to HK/K$. Suppose that $y \in HK/K$. Then $y = \overline{hk}$ for some $h \in H$ and $k \in K$. Thus
$$\varphi(h) = \bar h = \bar h\,\bar e = \bar h\,\bar k = \overline{hk} = y.$$
This shows that $\varphi$ is surjective. Let $h \in \ker(\varphi)$. Then $\bar h = \bar e$, so $h \in K$. Thus $h \in H \cap K$. Since any element in $H \cap K$ is in the kernel of $\varphi$, it follows that $\ker(\varphi) = H \cap K$. Thus by the first isomorphism theorem, $H/(H \cap K) \cong \varphi(H) = HK/K$.

To prepare for the next isomorphism theorem, we need a definition.

Definition 11.21. Let $X \subseteq G$ and $H \triangleleft G$. Then by $X/H$ we mean the subset $\{\bar x \mid x \in X\}$ of $G/H$.
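The claims about products of subgroups of $S_3$ made above (after Definition 11.17 and in Theorems 11.18 and 11.20) are small enough to verify by enumeration. The following is a rough sketch under stated assumptions: permutations of $\{0, 1, 2\}$ are stored as tuples, and the helper names `compose`, `product_set` and `is_subgroup` are hypothetical. It uses the fact that a nonempty finite subset of a group is a subgroup exactly when it is closed under the product.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first); permutations are tuples."""
    return tuple(p[q[i]] for i in range(len(q)))


def product_set(H, K):
    """Return the set HK = {hk : h in H, k in K}."""
    return {compose(h, k) for h in H for k in K}


def is_subgroup(X):
    """A nonempty finite subset of a group is a subgroup iff it is closed."""
    return bool(X) and all(compose(x, y) in X for x in X for y in X)


e = (0, 1, 2)
H = {e, (1, 0, 2)}               # <(1 2)>
K = {e, (2, 1, 0)}               # <(1 3)>
A3 = {e, (1, 2, 0), (2, 0, 1)}   # <(1 2 3)>, normal in S_3

HK = product_set(H, K)    # neither factor is normal
HA3 = product_set(H, A3)  # second factor is normal

print(len(HK), is_subgroup(HK))    # 4 False
print(len(HA3), is_subgroup(HA3))  # 6 True

# Orders on both sides of the Second Isomorphism Theorem for H and A3.
print(len(H) // len(H & A3), len(HA3) // len(A3))  # 2 2
```

With $K$ replaced by the normal subgroup $\langle(1,2,3)\rangle$, the product is all of $S_3$, and both quotients in Theorem 11.20 have order 2.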
Proposition 11.22. If $H \triangleleft G$ and $H \le K$, then $K/H \le G/H$. Moreover, if $K \triangleleft G$, then $K/H \triangleleft G/H$.

Proof. Since $\bar e \in K/H$, $K/H$ is nonempty. Let $\bar a, \bar b \in K/H$. Then $a = k_1 h_1$ and $b = k_2 h_2$ for some $k_1, k_2 \in K$ and $h_1, h_2 \in H$. Then
$$ab = k_1 h_1 k_2 h_2 = k_1 k_2 (k_2^{-1} h_1 k_2) h_2,$$
so $\overline{ab} \in K/H$. But $\bar a \cdot \bar b = \overline{ab}$, so this shows that $K/H$ is closed under multiplication. Moreover $a^{-1} = h_1^{-1} k_1^{-1} = k_1^{-1} (k_1 h_1^{-1} k_1^{-1})$, so $(\bar a)^{-1} = \overline{a^{-1}} \in K/H$. Thus $K/H \le G/H$.

Theorem 11.23 (Third Isomorphism Theorem for groups). Let $H \le K \le G$ and suppose that both $H$ and $K$ are normal subgroups of $G$, so that in particular, $H \triangleleft K$. Then
$$(G/H)/(K/H) \cong G/K.$$

Proof. The tricky part of the proof is to give a good notation for the equivalence classes of elements of $G$ in the two different quotients $G/K$ and $G/H$. Let us denote the image of $a \in G$ in $G/H$ by $\bar a$, and in $G/K$ by $\hat a$. Define a map $\varphi : G/H \to G/K$ by $\varphi(\bar a) = \hat a$. To see that this is well defined, note that if $\pi : G \to G/K$ is the projection $\pi(a) = \hat a$, then $H \subseteq \ker(\pi)$, since $H \subseteq K$. Let $\varphi = \bar\pi$ be the induced map $\varphi(\bar a) = \bar\pi(\bar a) = \hat a$. Clearly $\varphi$ is surjective. Now suppose that $\bar a \in \ker(\varphi)$. Then $a \in K$, so $\bar a \in K/H$. Moreover, if $\bar a \in K/H$, then $a = kh$ for some $k \in K$ and $h \in H$. Since $H \subseteq K$, it follows that $a \in K$, so that $\varphi(\bar a) = \hat e$. Thus $\ker(\varphi) = K/H$, and the induced map $\bar\varphi : (G/H)/(K/H) \to G/K$ is surjective and injective, so is an isomorphism.

11.1. The Commutator Subgroup.

Definition 11.24. If $G$ is a group and $g, h \in G$, then the commutator of $g$ and $h$, denoted $[g, h]$, is the element $[g, h] = ghg^{-1}h^{-1}$. The commutator subgroup of $G$, denoted by $G'$ or $[G, G]$, is the smallest subgroup of $G$ containing all commutators.

It is not true in general that the set of commutators of a group $G$ is a subgroup. Instead, we have the following characterization of $[G, G]$.

Theorem 11.25. If $a, b \in G$, then $[a, b]^{-1} = [b, a]$. As a consequence,
$$[G, G] = \Big\{ \prod_{i=1}^{n} [a_i, b_i] \;\Big|\; a_i, b_i \in G,\ n \in \mathbb{P} \Big\}.$$

Proof. To see the first statement, note that $[a, b][b, a] = aba^{-1}b^{-1}bab^{-1}a^{-1} = e$.
In general, if $\emptyset \ne S \subseteq G$, we have $\langle S \rangle = \big\{ \prod_{i=1}^{n} x_i \mid x_i^{\pm 1} \in S,\ n \in \mathbb{P} \big\}$. But in this case, since the set of commutators is closed under inverses, we have the simpler description above.

Theorem 11.26. If $a, b \in G$, then $c_g([a, b]) = [c_g(a), c_g(b)]$ for all $g \in G$. As a consequence, the commutator subgroup is a normal subgroup of $G$.

Theorem 11.27. Let $\varphi : G \to G'$ be a morphism, and suppose that $G'$ is abelian. Then $[G, G] \le \ker(\varphi)$. Moreover, $G/[G, G]$ is abelian.

Proof. First, note that
$$\varphi([a, b]) = \varphi(aba^{-1}b^{-1}) = \varphi(a)\varphi(b)\varphi(a)^{-1}\varphi(b)^{-1} = \varphi(a)\varphi(a)^{-1}\varphi(b)\varphi(b)^{-1} = e.$$
Thus $[a, b] \in \ker(\varphi)$ for every commutator. It follows that any product of commutators is in $\ker(\varphi)$, and since every element in the commutator subgroup is a product of commutators, we have $[G, G] \le \ker(\varphi)$.

Let $\pi : G \to G/[G, G]$ be the natural projection $\pi(a) = \bar a$. Then if $\bar a, \bar b \in G/[G, G]$, we have
$$\bar a\,\bar b\,(\bar a)^{-1}(\bar b)^{-1} = \overline{aba^{-1}b^{-1}} = \overline{[a, b]} = \bar e.$$
But this means that $\bar a\,\bar b = \bar b\,\bar a$, so $G/[G, G]$ is abelian.

12. Dihedral Groups

The dihedral group $D_n$ is often defined as the group of symmetries of the regular $n$-gon. By symmetries, we mean rotations and reflections which preserve the set of vertices and edges of the $n$-gon. There are several different ways of representing the dihedral group, and in this section, we will discuss several of them.

[Figure 3. Hexagon centered at the origin with one vertex at (1, 0)]

The picture above illustrates a regular hexagon, centered at the origin, with one vertex at the point $(1, 0)$. The vertices are at the points $(\cos(k\pi/3), \sin(k\pi/3))$, for $k = 0, \ldots, 5$. More generally, for a regular $n$-gon, we would have $n$ vertices at the points $(\cos(2k\pi/n), \sin(2k\pi/n))$, for $k = 0, \ldots, n - 1$.

There are $n$ rotations which preserve the $n$-gon, generated by the rotation $\rho$ through the angle $2\pi/n$, which is conventionally chosen to be a counterclockwise rotation. The $n$ rotations are $\rho, \rho^2, \ldots$
, $\rho^n$, where $\rho^n$ is the identity element, since a rotation by the angle $2\pi$ is considered as the identity.

For a hexagon, there are 6 reflections, three of them across the lines determined by pairs of midpoints of opposite edges, and three of them across the lines determined by pairs of opposite vertices. The same pattern holds for any $n$-gon when $n$ is even, but for $n$ odd, there is a different pattern. There are still $n$ reflections, but each is given by a reflection across a line through a vertex and the midpoint of the opposite edge. If we denote the reflection of the plane across the $x$-axis by $\sigma$, we note that for any $n$-gon, this reflection is one of the symmetries. Moreover, every reflection is of the form $\rho^k\sigma$, for $k = 0, \ldots, n - 1$. Thus the complete set of symmetries of the $n$-gon is
$$\{e, \rho, \ldots, \rho^{n-1}, \sigma, \rho\sigma, \ldots, \rho^{n-1}\sigma\}.$$
This means that there are $2n$ symmetries of the $n$-gon.

We would like to show that the set $D_n$ of symmetries of the $n$-gon forms a group under composition. We have already used composition to aid in the description of the symmetries, but we still need to show that any composition of symmetries is another symmetry. This is clear from the geometric point of view, since the composition of maps which preserve the edges and vertices should also preserve the vertices and edges. However, we would like to understand this idea from an algebraic point of view as well.

Let us see what we can work out directly from the definitions of $\rho$ and $\sigma$. Clearly, the elements $\sigma_k = \rho^k\sigma$ are not rotations, because we have flipped the plane over with $\sigma$, and the rotation $\rho^k$ does not flip it over again. In fact, it is not hard to see that the two points $\pm(\cos(k\pi/n), \sin(k\pi/n))$ are preserved under $\sigma_k$, so that $\sigma_k$ preserves the line through these two points, and therefore must be a reflection across that line. Suppose that $\varphi$ is any element of the dihedral group.
To determine it, it is enough to know whether it is a rotation or reflection and what it does to the vertex $(1, 0)$. We use this fact to compute $\sigma\rho$. Since $\rho$ takes $(1, 0)$ to $(\cos(2\pi/n), \sin(2\pi/n))$, and $\sigma$ has the effect of negating the $y$-coordinate,
$$(\sigma\rho)(1, 0) = (\cos(2\pi/n), -\sin(2\pi/n)) = (\cos(2\pi/n), \sin(-2\pi/n)).$$
However, we can directly compute that
$$(\rho^{n-1}\sigma)(1, 0) = \rho^{n-1}(1, 0) = (\cos(2(n-1)\pi/n), \sin(2(n-1)\pi/n)) = (\cos(2\pi/n), \sin(-2\pi/n)).$$
It follows that $\sigma\rho = \rho^{n-1}\sigma$, since both of these maps are reflections and they take $(1, 0)$ to the same vertex.

All of the $2n$ symmetries of the $n$-gon are given by orthogonal transformations of the plane. The set of all orthogonal transformations of the plane is a group, denoted by $O(2, \mathbb{R})$, or just $O(2)$. It consists of rotations, which are given by matrices of the form
$$\rho_\theta = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix},$$
and reflections, which are given by matrices of the form
$$\sigma_\theta = \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ \sin(\theta) & -\cos(\theta) \end{bmatrix}.$$
Let $\sigma = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$. Then $\sigma_\theta = \rho_\theta\sigma$, by direct computation. Moreover, by direct computation, we obtain that $\sigma\rho_\theta = \rho_{-\theta}\sigma$. Now, let $\rho = \rho_\theta$ for $\theta = 2\pi/n$. It is easily computed that $n$ is the least positive integer such that $\rho^n = I$, the identity matrix, and that the set of $2n$ elements
$$D_n = \{I, \rho, \ldots, \rho^{n-1}, \sigma, \rho\sigma, \ldots, \rho^{n-1}\sigma\}$$
is closed under multiplication and inverses and is nonempty. Thus $D_n$ is a subgroup of $O(2)$.

The last incarnation of the dihedral group that we will study is as operators on the complex plane. By an operator on the complex plane we mean a map $\mathbb{C} \to \mathbb{C}$. Let $\rho$ be the operator defined by $\rho(z) = e^{2\pi i/n}z$, in other words, multiplication by the complex number $e^{2\pi i/n}$, where $e^{i\theta} = \cos(\theta) + i\sin(\theta)$, a property of complex numbers that you can verify by expanding the power series for $e^{i\theta}$ and using the fact that $i^2 = -1$. Let $\sigma$ be the conjugation operator $\sigma(z) = \bar z$. Then the dihedral group $D_n$ is the subgroup of operators on the complex plane generated by $\rho$ and $\sigma$.
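As a quick sanity check on the relation $\sigma\rho = \rho^{n-1}\sigma$ obtained above, one can track what $\rho$ and $\sigma$ do to the vertices. The sketch below is an illustration rather than part of the notes. It assumes $n \ge 3$, so that a symmetry is determined by its effect on the vertices, and it labels the vertex $e^{2\pi i k/n}$ by the index $k$; under this labeling $\rho$ acts as $k \mapsto k + 1 \pmod n$ and complex conjugation $\sigma$ acts as $k \mapsto -k \pmod n$. The function names are hypothetical.

```python
def compose(p, q):
    """Return the map p∘q on vertex indices (apply q first)."""
    return tuple(p[q[i]] for i in range(len(q)))


def power(p, k):
    """Return p composed with itself k times (k >= 0)."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(p, result)
    return result


def dihedral_on_vertices(n):
    """Generate the group of vertex maps generated by rho and sigma."""
    rho = tuple((k + 1) % n for k in range(n))   # rotation by 2*pi/n
    sigma = tuple((-k) % n for k in range(n))    # complex conjugation
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in (rho, sigma):
            new = compose(s, g)
            if new not in elements:
                elements.add(new)
                frontier.append(new)
    return rho, sigma, elements


rho, sigma, D6 = dihedral_on_vertices(6)
print(len(D6))                                                # 12
print(compose(sigma, rho) == compose(power(rho, 5), sigma))   # True
```

Because only integer arithmetic on vertex indices is used, the check avoids floating-point comparisons of complex numbers.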
Since $\rho$ and $\sigma$ are invertible maps, with the inverse of $\rho$ given by multiplication by $e^{-2\pi i/n}$, and $\sigma$ being its own inverse, $D_n$ is a subgroup of the permutations of $\mathbb{C}$. Moreover, $\rho^k$ is multiplication by $e^{2k\pi i/n}$, so $\rho^n = 1$, the operator of multiplication by 1, which is the identity operator on $\mathbb{C}$. Next, we compute
$$(\sigma \circ \rho)(z) = \sigma(\rho(z)) = \sigma(e^{2\pi i/n}z) = \overline{e^{2\pi i/n}z} = \overline{e^{2\pi i/n}}\,\bar z = e^{-2\pi i/n}\bar z = \rho^{-1}(\sigma(z)) = (\rho^{-1} \circ \sigma)(z).$$
From this we conclude that $\sigma\rho = \rho^{-1}\sigma$. It follows from this relation that
$$D_n = \{e, \rho, \ldots, \rho^{n-1}, \sigma, \rho\sigma, \ldots, \rho^{n-1}\sigma\},$$
and that these $2n$ elements are distinct.

No matter which of the three models of $D_n$ one uses, we obtain that it has $2n$ elements, that $\rho$ has order $n$, that $\sigma$ has order 2, and that $\sigma\rho = \rho^{-1}\sigma$. In fact, these three properties are sufficient to compute all products in $D_n$, and therefore to construct the Cayley table of $D_n$.

For some low values of $n$, we obtain some isomorphisms between $D_n$ and some other groups. $D_1$ has exactly 2 elements, $e$ and $\sigma$, so $D_1 \simeq \mathbb{Z}_2$. $D_2$ is isomorphic to the Klein 4-group. $D_3 \simeq S_3$, which can be verified by comparing Cayley tables. There is a morphism $D_n \to S_n$, since every symmetry of the $n$-gon induces a permutation of the $n$ vertices of the $n$-gon. When $n \ge 3$, this map is injective. The isomorphism between $D_3$ and $S_3$ can be seen in this way. When $n > 3$, the morphism $D_n \to S_n$ is not surjective, as $o(D_n) = 2n < n! = o(S_n)$ when $n > 3$.

The model for $D_n$ as the symmetries of the $n$-gon has some difficulty when $n$ is 1 or 2. To rescue the model, let us consider the $n$-gon to be defined as follows. Take the unit circle and mark $n$ equally spaced points on it, in other words, the points $(\cos(2k\pi/n), \sin(2k\pi/n))$ for $k = 0, \ldots, n - 1$. We will call these $n$ points the vertices of the $n$-gon. Define the group $D_n$ as the symmetries of the circle which preserve this set of $n$ vertices.
In this way, the 1-gon is a circle with one point marked on it, and the 2-gon is a circle with two diametrically opposite points marked on it.

Note that some mathematicians denote the group $D_n$ as $D_{2n}$. This kind of conflicting terminology is not unusual in mathematics and reflects traditions coming from different branches of mathematics.

13. Direct Products and Semidirect Products

Definition 13.1. Let $G$ and $H$ be groups. Then the set $G \times H$, equipped with the binary operation $(g, h)(g', h') = (gg', hh')$, is called the direct product of $G$ and $H$. More generally, if $G_\lambda$ is a collection of groups for $\lambda \in \Lambda$, then $\prod_{\lambda \in \Lambda} G_\lambda$, equipped with the binary operation $(g_\lambda)(g'_\lambda) = (g_\lambda g'_\lambda)$, is called the direct product of the groups $G_\lambda$.

Theorem 13.2. The direct product of groups is a group.

Proof. We give the proof for two groups $G$ and $H$. The proof for a collection of groups is similar. First, we check that the binary operation is associative.
$$(g, h)((g', h')(g'', h'')) = (g, h)(g'g'', h'h'') = (gg'g'', hh'h'') = (gg', hh')(g'', h'') = ((g, h)(g', h'))(g'', h'').$$
Next we check that $(e, e)$ is the identity in $G \times H$.
$$(e, e)(g, h) = (eg, eh) = (g, h) = (ge, he) = (g, h)(e, e).$$
Finally we show that $(g, h)^{-1} = (g^{-1}, h^{-1})$.
$$(g, h)(g^{-1}, h^{-1}) = (gg^{-1}, hh^{-1}) = (e, e) = (g^{-1}g, h^{-1}h) = (g^{-1}, h^{-1})(g, h).$$

Theorem 13.3. $G \times H$ is abelian if and only if both $G$ and $H$ are abelian groups.

Exercise 13.4. Prove the above theorem.

Theorem 13.5. If $(g, h) \in G \times H$, then $(g, h)^k = (g^k, h^k)$ for all $k \in \mathbb{Z}$.

Exercise 13.6. Prove the theorem above.

Example 13.7. Let $G = \mathbb{Z}_2$ and $H = \mathbb{Z}_3$. Then $o(G \times H) = o(G)o(H) = 6$. We have found two groups of order 6, up to isomorphism, $\mathbb{Z}_6$ and $S_3$. It turns out that these are the only two groups of order 6 up to isomorphism. However, we have not shown that, so let us see what we can determine. First, since both $\mathbb{Z}_2$ and $\mathbb{Z}_3$ are abelian, we know that $G \times H$ is abelian.
Thus we can rule out the possibility that $G \times H$ is isomorphic to $S_3$. On the other hand, if $G \times H \cong \mathbb{Z}_6$, it would have to have an element of order 6. This motivates us to study the orders of the elements in $G \times H$. It is easy to calculate that $k(1, 1) = (k, k)$, so this is $(0, 0)$ precisely when $k \equiv 0 \pmod 2$ and $k \equiv 0 \pmod 3$. But this means that both 2 and 3 must divide $k$. Since 2 and 3 are relatively prime, this forces their product to divide $k$, so $k$ is a multiple of 6. The least positive multiple of 6 is 6, so we see that $(1, 1)$ has order 6. But that means that $G \times H$ is cyclic and so it is isomorphic to $\mathbb{Z}_6$.

Theorem 13.8. Let $g \in G$ and $h \in H$ be elements of finite order. Then the order of $(g, h)$ in $G \times H$ is the least common multiple of $o(g)$ and $o(h)$. On the other hand, if either $g$ or $h$ has infinite order, then so does $(g, h)$.

Proof. Let $m = o(g)$, $n = o(h)$, and $c = \operatorname{lcm}(m, n)$. Now $c = mx$ and $c = ny$ for some $x, y \in \mathbb{Z}$, so
$$(g, h)^c = (g^c, h^c) = (g^{mx}, h^{ny}) = ((g^m)^x, (h^n)^y) = (e^x, e^y) = (e, e).$$
This shows that $o(g, h) \mid c$. On the other hand, if $(g, h)^k = (e, e)$, then $g^k = e$ and $h^k = e$, so $m \mid k$ and $n \mid k$, which means that $c \mid k$, since $c$ is the least common multiple of $m$ and $n$. Since the order of the element is the least positive power $k$ such that $(g, h)^k = (e, e)$, it follows that $c \mid o(g, h)$. Thus the two are equal. When $g$ or $h$ has infinite order, it is impossible that $(g, h)^k = (e, e)$ for any positive integer $k$, because in that case we would have $g^k = e$ and $h^k = e$, which would imply that both of them have finite order.

In the example above, we saw that $\mathbb{Z}_2 \times \mathbb{Z}_3 \cong \mathbb{Z}_6$. In fact, this is a consequence of the more general result below.

Theorem 13.9. $\mathbb{Z}_m \times \mathbb{Z}_n \cong \mathbb{Z}_{mn}$ precisely when $\gcd(m, n) = 1$.

Proof. Suppose that $\mathbb{Z}_m \times \mathbb{Z}_n \cong \mathbb{Z}_{mn}$. Then there is an element $(a, b)$ of order $mn$. But the maximum possible order of any element in $\mathbb{Z}_m \times \mathbb{Z}_n$ is $\operatorname{lcm}(m, n) \le mn$. It follows that $\operatorname{lcm}(m, n) = mn$. Now $\operatorname{lcm}(m, n) = \frac{mn}{\gcd(m, n)}$, so this forces $\gcd(m, n) = 1$.
On the other hand, when $\gcd(m, n) = 1$, then $o(1, 1) = \operatorname{lcm}(m, n) = mn$, so $\mathbb{Z}_m \times \mathbb{Z}_n$ is a cyclic group of order $mn$, and thus it is isomorphic to $\mathbb{Z}_{mn}$.

One problem with the notion of direct product is that in order for a group to be such a direct product, it needs to consist of ordered pairs. This is extremely unlikely, so at first glance, the notion of direct product does not seem to be so powerful. But the theorem above shows that the key idea is not being a direct product, but being isomorphic to one. Because we know that $\mathbb{Z}_{mn}$ is isomorphic to $\mathbb{Z}_m \times \mathbb{Z}_n$ when $m$ and $n$ are relatively prime, it tells us a lot about the structure of that group. The following theorem gives a method of characterizing when a group is isomorphic to a direct product.

Theorem 13.10 (Fundamental Theorem on Direct Products). Let $H$ and $K$ be subgroups of a group $G$. Assume the following three conditions hold.
(1) $H \cap K = \{e\}$.
(2) $H \triangleleft G$ and $K \triangleleft G$.
(3) $HK = G$.
Then the map $\varphi : H \times K \to G$, given by $\varphi(h, k) = hk$, is an isomorphism between $H \times K$ and $G$. We can replace (2) by
(2′) $hk = kh$ for all $h \in H$, $k \in K$.
Moreover, when $o(G) < \infty$, we can replace (3) by
(3′) $o(G) = o(H)o(K)$.

Proof. We first show that if (1) and (2) hold, then (2′) holds. Let $h \in H$ and $k \in K$. Note that $hk = kh$ if and only if $hkh^{-1}k^{-1} = e$. Now $hkh^{-1}k^{-1}$ lies in $K$ because $hkh^{-1}$ is in $K$. Similarly, it lies in $H$ because $kh^{-1}k^{-1}$ is in $H$. Thus $hkh^{-1}k^{-1}$ lies in $H \cap K$, so we must have $hkh^{-1}k^{-1} = e$.

Next, we show that if (2′) and (3) hold, then (2) holds. Let $g \in G$. By (3), we must have $g = hk$ for some $h \in H$ and $k \in K$. Let $x \in H$. Then
$$gxg^{-1} = hkxk^{-1}h^{-1} = hkk^{-1}xh^{-1} = hxh^{-1} \in H.$$
Thus $H \triangleleft G$. By a similar argument $K \triangleleft G$.

Suppose that (1) holds. Then we claim that $\varphi$ is injective. Suppose that $hk = h'k'$ for some $h, h' \in H$ and $k, k' \in K$. Then $(h')^{-1}h = k'k^{-1} \in H \cap K$, which implies that $(h')^{-1}h = k'k^{-1} = e$, from which we deduce that $h = h'$ and $k = k'$. Thus $\varphi$ is injective.
Suppose (1) and (3) hold, and that $o(G) < \infty$. Since (1) implies that $\varphi$ is injective, and (3) implies that $\varphi$ is surjective, we have that $\varphi$ is a bijection, so $o(G) = o(H)o(K)$. Thus (3′) holds. On the other hand, suppose (1) and (3′) hold and that $o(G) < \infty$. Since $\varphi$ is injective, and the sets $G$ and $H \times K$ have the same number of elements, $\varphi$ must be surjective. Thus $G = HK$, so (3) holds.

Now, we have already seen that if (1) holds, $\varphi$ is injective, and if (3) holds, $\varphi$ is surjective. Thus it remains to show that $\varphi$ is a morphism. If (1) and (2) hold, then we know that (2′) holds. Thus we will use (2′) to show that $\varphi$ is a morphism. We have
$$\varphi((h, k)(h', k')) = \varphi((hh', kk')) = hh'kk' = hkh'k' = \varphi((h, k))\varphi((h', k')).$$
Thus $\varphi$ is a morphism.

Notice that in the above proof, if conditions (1) and (2′) hold, then $\varphi$ is still an injective morphism. Thus, whether (3) holds or not, $\varphi$ is an isomorphism of $H \times K$ onto its image, which is a subgroup of $G$. However, we need to be a bit careful here: conditions (1) and (2′) alone do not imply (2), since it was condition (2′) that we needed to show that $\varphi$ was a morphism, and condition (3) was needed to deduce (2) from (2′).

Since we are mainly concerned with finite groups, it is usually much easier to show that condition (3′) holds than condition (3). In fact, if we don't know what $o(H)$ and $o(K)$ are, we probably don't have enough information about the two subgroups to conclude anything.

Definition 13.11. If $H$ and $K$ are subgroups of $G$ and the map $\varphi : H \times K \to G$ given by $\varphi(h, k) = hk$ is an isomorphism of groups, then we say that $G$ is the internal direct product of $H$ and $K$, and we also write $G = H \times K$.

Notice that when we express an internal direct product in the form $G = H \times K$, we don't mean that $G$ consists of ordered pairs of elements of $H$ and $K$.
Thus there is some ambiguity about whether $H \times K$ means the internal direct product or the external direct product, which is given by these ordered pairs of elements. However, this ambiguity is not much of a problem, as the two groups are isomorphic in a natural manner.

Example 13.12. Consider the group $\mathbb{R}^*$ of nonzero real numbers (under multiplication). Let $H = \{\pm 1\}$ and $K = \mathbb{R}^+$, the subgroup of positive real numbers. Clearly $H \cap K = \{1\}$. Moreover, every subgroup of an abelian group is normal, so condition (2) is satisfied. (Of course, condition (2′) is even more obvious.) Finally, every element of $\mathbb{R}^*$ is in $HK$. Thus $\mathbb{R}^* = H \times K$. We also express this as $\mathbb{R}^* = \mathbb{Z}_2 \times \mathbb{R}^+$, since $H \cong \mathbb{Z}_2$.

Example 13.13. We give another proof that $\mathbb{Z}_{mn} \cong \mathbb{Z}_m \times \mathbb{Z}_n$ when $\gcd(m, n) = 1$. We note that $m$ has order $n$ and $n$ has order $m$ in $\mathbb{Z}_{mn}$. Let $H = \langle m \rangle$ and $K = \langle n \rangle$, so that $o(H) = n$ and $o(K) = m$. Suppose $x \in H \cap K$. Then $o(x)$ must divide both $m$ and $n$, since it divides the order of each of these subgroups, being a member of both of them. Thus $o(x) = 1$, so $x = 0$. It follows that $H \cap K = \{0\}$, so condition (1) holds. Condition (2) holds since $\mathbb{Z}_{mn}$ is abelian. Condition (3′) holds since $o(\mathbb{Z}_{mn}) = mn = o(H)o(K)$. Thus $\mathbb{Z}_{mn} = H \times K \cong \mathbb{Z}_n \times \mathbb{Z}_m \cong \mathbb{Z}_m \times \mathbb{Z}_n$.

Exercise 13.14. Show that $D_{2n} \cong D_n \times \mathbb{Z}_2$ when $n$ is odd. To do this, show that if we let $r = \rho^2$, then the subgroup $H = \langle r, \sigma \rangle \cong D_n$. Let $K = \langle \rho^n \rangle$. Show that $K \cong \mathbb{Z}_2$. Then show that the hypotheses of the Fundamental Theorem on Direct Products are satisfied.

Exercise 13.15. Recall that $GL(n, \mathbb{R})$ is the group of $n \times n$ invertible matrices with real coefficients, and $SL(n, \mathbb{R})$ is the subgroup of matrices with determinant 1. Suppose that $n$ is odd, so that $\det(-I) = -1$. Let $H = \{\pm I\}$ and $K = SL(n, \mathbb{R})$. Show that $GL(n, \mathbb{R}) = H \times K$.

Before the introduction of the notion of direct product, we did not know many groups. We have studied the groups $\mathbb{Z}_n$, $S_n$, $D_n$ and some matrix groups. With the introduction of the direct product, we obtain a lot more groups.
For example, every finite abelian group (actually every finitely generated abelian group) can be expressed as a direct product of cyclic groups. This statement is called the Fundamental Theorem of Finitely Generated Abelian Groups. The proof of this theorem is long, but the applications of the theorem are fairly straightforward.

The construction of groups by direct products is still not enough to classify all finite groups. We need a more powerful tool, called the semidirect product. Even this tool, which we introduce next, is not enough to classify finite groups, but it is a much more powerful tool than the direct product.

The basic strategy in constructing finite groups is as follows. Suppose that you want to construct all groups of order $n$, and you do know all groups of order less than $n$, up to isomorphism. Then we would like to be able to construct groups of order $n$ from groups of smaller order. The first case we need to consider is the case when our group has no proper nontrivial normal subgroups.

Definition 13.16. Let $G$ be a finite group. Then $G$ is said to be a simple group if there are no proper nontrivial normal subgroups of $G$.

The first step in classifying groups is to determine all of the simple groups. The classification of simple groups turned out to be a very hard problem. Its solution was originally announced in the 1980s, but there were some problems with this solution. Then, in the 1990s, it was thought to have been completely solved, but again there were some problems with the proofs. According to Wikipedia, the complete classification of finite simple groups was completed in 2008. Nevertheless, the classification remains a difficult problem, and a complete proof has not yet been published.

The complete list of simple groups contains several families of simple groups, and 26 special cases, called sporadic groups.
We already know one of the families, $\mathbb{Z}_p$ for $p$ prime, because the only subgroups of $\mathbb{Z}_p$ are the trivial subgroup and the improper subgroup. This is because if $1 \le m < p$, then $m$ is a generator of $\mathbb{Z}_p$. The family of groups $A_n$, the subgroup of even permutations in $S_n$, is another family of simple groups for $n \ge 5$. It was a very important discovery of Évariste Galois (1811-1832) that $A_5$ is a simple group. This was the key fact in his proof that there can be no formula for obtaining the solutions of a general quintic polynomial in terms of radicals.

The largest sporadic group, called the Monster Group, was only constructed in 1982 (although its existence was predicted in the 1970s). It has order
$$2^{46} \cdot 3^{20} \cdot 5^{9} \cdot 7^{6} \cdot 11^{2} \cdot 13^{3} \cdot 17 \cdot 19 \cdot 23 \cdot 29 \cdot 31 \cdot 41 \cdot 47 \cdot 59 \cdot 71 \approx 8 \times 10^{53}.$$
This group is so large that even calculating things like the product of two elements in the group is extremely complicated. A great deal of study of this monster group is still ongoing.

If $\alpha : K \to \operatorname{Aut}(H)$ is a morphism of $K$ to the automorphism group of a group $H$, then it is typical to denote the automorphism $\alpha(k)$ by $\alpha_k$.

Definition 13.17. Suppose that $\alpha : K \to \operatorname{Aut}(H)$ is a morphism between the group $K$ and the automorphism group of the group $H$. Then the semidirect product of $H$ and $K$ determined by $\alpha$, denoted by $H \rtimes_\alpha K$, is the set $H \times K$ equipped with the binary operation
$$(h, k)(h', k') = (h\alpha_k(h'), kk').$$
When the map $\alpha$ is implicit, we usually write $H \rtimes K$ instead of $H \rtimes_\alpha K$.

Theorem 13.18. The semidirect product $H \rtimes_\alpha K$ is a group under the binary operation introduced above.

Proof. To see that associativity holds, we compute
$$((h, k)(h', k'))(h'', k'') = (h\alpha_k(h'), kk')(h'', k'') = (h\alpha_k(h')\alpha_{kk'}(h''), kk'k'')$$
$$= (h\alpha_k(h')\alpha_k(\alpha_{k'}(h'')), kk'k'') = (h\alpha_k(h'\alpha_{k'}(h'')), kk'k'')$$
$$= (h, k)(h'\alpha_{k'}(h''), k'k'') = (h, k)((h', k')(h'', k'')).$$
It is natural to guess that the identity is $(e, e)$, and we verify this by
$$(e, e)(h, k) = (e\alpha_e(h), ek) = (\alpha_e(h), k) = (1_H(h), k) = (h, k),$$
$$(h, k)(e, e) = (h\alpha_k(e), ke) = (he, k) = (h, k).$$
It is not so obvious what $(h, k)^{-1}$ should be, so let us solve for it. Suppose that $(h, k)(x, y) = (e, e)$. Then
$$(e, e) = (h, k)(x, y) = (h\alpha_k(x), ky).$$
Thus $y = k^{-1}$ and $\alpha_k(x) = h^{-1}$, so applying $\alpha_{k^{-1}}$ to both sides, we obtain that $x = \alpha_{k^{-1}}(h^{-1})$. Thus $(x, y) = (\alpha_{k^{-1}}(h^{-1}), k^{-1})$. We need to verify that $(x, y)(h, k) = (e, e)$. But
$$(x, y)(h, k) = (\alpha_{k^{-1}}(h^{-1}), k^{-1})(h, k) = (\alpha_{k^{-1}}(h^{-1})\alpha_{k^{-1}}(h), k^{-1}k) = (\alpha_{k^{-1}}(h^{-1}h), e) = (\alpha_{k^{-1}}(e), e) = (e, e).$$

Note that a direct product is a special case of a semidirect product, where the map $\alpha$ is the trivial morphism between $K$ and $\operatorname{Aut}(H)$, because in that case we have
$$(h, k)(h', k') = (h\alpha_k(h'), kk') = (h1_H(h'), kk') = (hh', kk').$$
As is the case for direct products, it is uncommon for a group to actually consist of ordered pairs, so there is little chance that a group fits the description of a semidirect product. However, what is more important is when a group is isomorphic to a semidirect product. The following theorem characterizes when $G$ is isomorphic to a semidirect product.

Theorem 13.19. Suppose that $H$ and $K$ are subgroups of $G$ satisfying
(1) $H \cap K = \{e\}$.
(2) $H \triangleleft G$.
(3) $HK = G$.
Let $\alpha : K \to \operatorname{Aut}(H)$ be given by $\alpha_k(h) = khk^{-1}$, that is, $\alpha_k$ is the automorphism of $H$ obtained by restricting conjugation by $k \in K$ to $H$. Then the map $H \rtimes_\alpha K \to G$ given by $(h, k) \mapsto hk$ is an isomorphism. If $o(G) < \infty$, then we may replace condition (3) by the condition
(3′) $o(G) = o(H)o(K)$.

Proof. The fact that (3) is equivalent to (3′) if (1) holds is proved in the same way it was for direct products. Let $\varphi : H \rtimes_\alpha K \to G$ be given by $\varphi(h, k) = hk$. Then
$$\varphi((h, k)(h', k')) = \varphi(hkh'k^{-1}, kk') = hkh'k^{-1}kk' = hkh'k' = \varphi(h, k)\varphi(h', k').$$
Injectivity of $\varphi$ follows from (1) and surjectivity from (3).

Example 13.20.
Suppose that n ≥ 2. Let H = A_n be the alternating subgroup of the permutation group S_n. We know that A_n ◁ S_n and that o(A_n) = n!/2. Let K = ⟨(12)⟩ = {(12), e} ≅ Z_2. Since o(H)o(K) = n! = o(S_n) and H ∩ K = {e}, we see that S_n = A_n ⋊ Z_2. Thus S_n is a semidirect product. Since A_n is simple for n ≥ 5, this gives a decomposition of S_n as a semidirect product of two simple groups.

Exercise 13.21. Show that D_n ≃ Z_n ⋊ Z_2.

Example 13.22. Let V be a vector space over a field k. Then GL(V) is the group of invertible linear transformations from V to V. Let v ∈ V and A ∈ GL(V). Then the map T_{A,v} : V → V given by T_{A,v}(x) = Ax + v is called an affine transformation of V. We show that the set Aff(V) of affine transformations of V is a group under composition. Since
(T_{A,v} ∘ T_{B,w})(x) = T_{A,v}(T_{B,w}(x)) = T_{A,v}(Bx + w) = A(Bx + w) + v = (AB)x + Aw + v = T_{AB,Aw+v}(x),
we see that the composition of two affine transformations is another affine transformation. Clearly, the identity map T_{I,0} is an affine transformation, and
T_{A,v} ∘ T_{I,0} = T_{A,v} = T_{I,0} ∘ T_{A,v},
so e = T_{I,0} is the identity in Aff(V). Finally,
T_{A,v} ∘ T_{A^{-1},-A^{-1}v} = T_{I,0} = T_{A^{-1},-A^{-1}v} ∘ T_{A,v},
so T_{A,v}^{-1} = T_{A^{-1},-A^{-1}v}. Thus Aff(V) is a group.

Next, we let H = {T_{I,v} | v ∈ V}. Denote an element T_{I,v} by T_v and call it the translation by v. It is easy to see that T_v ∘ T_w = T_{v+w}, T_v^{-1} = T_{-v}, and that e ∈ H, so H is a subgroup of Aff(V). It is also straightforward to see that the map T : V → H given by T(v) = T_v is an isomorphism, so H ≃ V. Next, we compute
T_{A,v} ∘ T_w ∘ T_{A^{-1},-A^{-1}v} = T_{A,v} ∘ T_{A^{-1},w-A^{-1}v} = T_{I,Aw},
which shows that H ◁ Aff(V).

Let K = {T_{A,0} | A ∈ GL(V)}, which we identify with GL(V) via A ↦ T_{A,0}. We have
T_{A,0} ∘ T_{B,0} = T_{AB,0}, T_{A,0}^{-1} = T_{A^{-1},0},
and e ∈ K, which shows that K ≤ Aff(V). Next, note that if T_{A,0} = T_v, then A = I and v = 0, so H ∩ K = {e}. Finally, note that T_{A,v} = T_v ∘ T_{A,0}, which shows that HK = Aff(V). Thus all three conditions of the theorem on semidirect products (Theorem 13.19) are satisfied and Aff(V) = H ⋊ K.
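The identities used in this example can be spot-checked by machine. The following is a minimal sketch, not part of the notes, under the assumption that V = Q^2, with an affine map T_{A,v} stored as a pair (A, v) of a 2×2 matrix and a vector. The helper names (compose, inverse, translation, and so on) are hypothetical and chosen only for illustration. Exact rational arithmetic is used so that equalities can be tested directly.

```python
from fractions import Fraction

# Assumption: V = Q^2. An affine map T_{A,v}(x) = A x + v is stored as (A, v),
# where A is a tuple of two rows and v is a tuple of two entries.

def mat_vec(A, x):
    """Matrix-vector product A x."""
    return tuple(sum(A[i][j] * x[j] for j in range(2)) for i in range(2))

def mat_mul(A, B):
    """Matrix product A B."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def vec_add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def mat_inv(A):
    """Inverse of an invertible 2x2 matrix, computed exactly."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return ((d / det, -b / det), (-c / det, a / det))

def apply(T, x):
    """Evaluate T_{A,v} at x."""
    A, v = T
    return vec_add(mat_vec(A, x), v)

def compose(S, T):
    """S after T, using T_{A,v} o T_{B,w} = T_{AB, Aw+v}."""
    (A, v), (B, w) = S, T
    return (mat_mul(A, B), vec_add(mat_vec(A, w), v))

def inverse(T):
    """Inverse of T_{A,v}, namely T_{A^{-1}, -A^{-1}v}."""
    A, v = T
    A_inv = mat_inv(A)
    return (A_inv, tuple(-c for c in mat_vec(A_inv, v)))

I2 = ((1, 0), (0, 1))
IDENTITY = (I2, (0, 0))

def translation(v):
    """The translation T_v = T_{I,v}."""
    return (I2, v)

# Arbitrary sample data for the checks.
A, v = ((2, 1), (1, 1)), (3, -1)
S = (A, v)
T = (((0, 1), (-1, 0)), (5, 2))
w = (5, 2)
x = (7, -4)

# The composition rule agrees with composing the maps pointwise.
assert apply(compose(S, T), x) == apply(S, apply(T, x))
# The claimed inverse is a two-sided inverse.
assert compose(S, inverse(S)) == IDENTITY == compose(inverse(S), S)
# Conjugating a translation by T_{A,v} gives the translation by A w.
assert compose(compose(S, translation(w)), inverse(S)) == translation(mat_vec(A, w))
# Every affine map factors as a translation after a linear map.
assert compose(translation(v), (A, (0, 0))) == S
```

These checks only test particular elements, so they illustrate the formulas rather than prove them; the proofs are the computations in the example itself.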
Since H ≃ V and K ≃ GL(V), we express the decomposition Aff(V) = H ⋊ K in the form Aff(V) = V ⋊ GL(V).

Department of Mathematics, University of Wisconsin-Eau Claire, Eau Claire, WI 54729 USA
E-mail address: [email protected]