MA 311 NUMBER THEORY
BUTLER UNIVERSITY
FALL 2008
SCOTT T. PARSELL

1. Introduction

Number theory could be defined fairly concisely as the study of the natural numbers: 1, 2, 3, 4, 5, 6, .... We usually denote this set by $\mathbb{N}$. The set of all integers (including 0 and the negatives) is denoted by $\mathbb{Z}$. Is there anything about the natural numbers that's worth studying? It seems that we have a pretty good understanding of them once we've learned to count! Perhaps surprisingly, this turns out to be a rich and fascinating field of study, bursting with unsolved problems. A good starting point for our investigations is to look at how the natural numbers factor.

Primes. A prime number is a number greater than 1 that cannot be written as the product of two smaller natural numbers. The first few primes are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, .... Integers exceeding 1 that are not prime are called composite. The primes are important because each natural number greater than 1 can be written as a product of primes, and this factorization is unique (up to the order of the factors). For example, $24 = 2^3 \cdot 3$ and $105 = 3 \cdot 5 \cdot 7$. It is fairly easy to show that there are infinitely many prime numbers; we'll prove this in a later section. However, there remain many interesting unsolved (or partially solved) questions about the primes and how they are distributed. For example,

• How precisely can we estimate the number of primes less than $x$? (We know that $x/\log x$ gives a good first approximation.) What about primes of the form $4k + 1$, of the form $4k + 3$, etc.?
• Are there infinitely many primes of the form $n^2 + 1$? How about of the form $2^n - 1$? Of the form $2^n + 1$?
• Is there an efficient algorithm for finding a number's prime factorization or proving that a number is prime? (The difficulty of factoring efficiently is the basis of the security of RSA encryption.)
• Are there infinitely many pairs of "twin primes", i.e., primes whose difference is two (such as 3 and 5)? If not, can anything be said about small gaps between primes asymptotically?
• (Goldbach's problem) Can every even integer exceeding 2 be written as the sum of two primes?

Questions about the distribution of primes usually fall under the heading of analytic number theory because many of the techniques are based on real and complex analysis (i.e., mathematics related to calculus).

Divisibility and congruences. Along with the idea of factoring integers comes the notion of divisibility. We say that $a$ divides $b$ if there exists an integer $k$ such that $ak = b$. For example, 4 divides 24 since $4 \cdot 6 = 24$, and 15 divides 105 since $15 \cdot 7 = 105$. Divisibility leads to the important idea of congruences. We say that $a$ is congruent to $b$ modulo $m$ if $m$ divides $a - b$. In this case, we write $a \equiv b \pmod{m}$. For example, $3 \equiv 75 \pmod{24}$ and $8 \equiv 38 \pmod{10}$. Arithmetic with congruences (sometimes called modular arithmetic) is useful for detecting certain types of periodic phenomena. For example, one could use arithmetic mod 24 to keep track of the hour of day (in military time) without regard to minutes, seconds, or day. One could use arithmetic mod 10 to keep track of the last digit of a positive number (or mod 100 to keep track of the last two digits). If $m$ objects are arranged in a circle, then arithmetic mod $m$ can be used to keep track of the positions of the objects as they are rearranged. We'll see some more interesting uses of congruences later on. For instance, they can be used to construct check-digit schemes to minimize errors in data entry. Facts about the computation of powers modulo $m$ form the basis for constructing an RSA cryptosystem.

Rings and fields. If one is doing arithmetic with congruences, say modulo 6, then effectively there are only 6 distinct "numbers" to work with, usually denoted by 0, 1, 2, 3, 4, and 5.
Under this scheme, the number 0 actually stands for the set
\[ [0]_6 = \{\ldots, -24, -18, -12, -6, 0, 6, 12, 18, 24, \ldots\}. \]
Similarly, 1 stands for
\[ [1]_6 = \{\ldots, -17, -11, -5, 1, 7, 13, 19, \ldots\}, \]
and so on. However, it is convenient to pick one small integer (usually either the smallest positive integer or the one of smallest absolute value) to represent each "congruence class". The integers themselves are an example of an abstract algebraic structure called a ring, which is basically a set equipped with addition and multiplication operations satisfying basic properties like associativity and the distributive law (we omit the precise definition of a ring here). The set of congruence classes $\{0, 1, 2, 3, 4, 5\}$ can be viewed as a ring in its own right, sometimes denoted by $\mathbb{Z}/6\mathbb{Z}$ or $\mathbb{Z}_6$, with addition and multiplication defined modulo 6. For example, $2 + 5 = 1$ and $2 \cdot 3 = 0$ in the ring $\mathbb{Z}_6$.

One defect of rings is that multiplicative inverses do not exist in general. For example, 2 does not have a multiplicative inverse in $\mathbb{Z}$, nor in $\mathbb{Z}_6$. However, 2 does have a multiplicative inverse in $\mathbb{Z}_7$, since $2 \cdot 4 = 1$ under mod 7 arithmetic. Special rings in which all nonzero elements have multiplicative inverses (such as the rational numbers, real numbers, and complex numbers) are called fields. It turns out that $\mathbb{Z}_m = \{0, 1, 2, \ldots, m - 1\}$ under arithmetic modulo $m$ is a field if and only if $m$ is prime. Our algebra with congruences will be influenced by these considerations. Just as the equation $2x = 1$ can be solved over the rationals but not over the integers, the congruence $2x \equiv 1 \pmod{m}$ can be solved when $m = 7$ but not when $m = 6$ (in other words, the equation $2x = 1$ has a solution over $\mathbb{Z}_7$ but not over $\mathbb{Z}_6$).

One can construct further examples of rings by "adjoining" irrational or complex numbers to the set of integers.
For example, if $i = \sqrt{-1}$, then the set $\mathbb{Z}[i]$ of all complex numbers of the form $a + bi$, where $a$ and $b$ are integers, forms a ring, known as the ring of Gaussian integers. One can ask whether such a ring has any number-theoretic properties in common with the integers, such as unique factorization. It turns out that this ring does have unique factorization, but not all the integer primes remain prime in $\mathbb{Z}[i]$. For instance, $2 = (1 + i)(1 - i)$, but 3 remains irreducible. The numbers $1 + i$ and $1 - i$ are primes in $\mathbb{Z}[i]$, and the number 6 has the unique prime factorization $6 = (1 + i) \cdot (1 - i) \cdot 3$.

If we let $\delta = \sqrt{-5}$, then we can construct the ring $\mathbb{Z}[\delta]$, which is the set of all complex numbers of the form $a + b\delta$, where $a$ and $b$ are integers. Something bizarre happens when we try to factor 6 in this ring. We obviously have
\[ 6 = 2 \cdot 3 \quad\text{and}\quad 6 = (1 + \delta)(1 - \delta), \]
and one can show that 2, 3, $1 + \delta$, and $1 - \delta$ are all irreducible in the ring $\mathbb{Z}[\delta]$. Thus we have two different factorizations for 6, which means that unique factorization fails in this ring! The study of primes and factorization in rings such as $\mathbb{Z}[i]$ and $\mathbb{Z}[\delta]$ forms the basis for much of algebraic number theory. Here one makes heavy use of general results from modern algebra, so we won't pursue this branch of the subject very deeply.

Diophantine equations. One area of number theory that we hope to touch on later in the course overlaps with both analytic and algebraic number theory. A diophantine equation is simply an equation (usually a polynomial in two or more variables) for which we seek integer (or sometimes rational) solutions; a classic example is the equation $x^2 + y^2 = z^2$. This equation has many integer solutions, such as (3, 4, 5) and (5, 12, 13). In fact, it can be shown that there are infinitely many integer solutions, and all the solutions can be described by an explicit parametrization.
These are the so-called Pythagorean triples, which correspond to the lengths of the sides in right triangles. Interestingly, things become dramatically different if we change the equation to $x^3 + y^3 = z^3$. Here the only integer solutions are the "trivial" ones with $xyz = 0$. In fact, Fermat's Last Theorem asserts that if $n$ is any integer exceeding 2 then the diophantine equation $x^n + y^n = z^n$ has only trivial solutions. This seemingly innocent conjecture remained unproven for over 300 years until deep work of Wiles resolved it in 1995.

As another example, consider the diophantine equation $y^2 = x^3 + 17$. This is an example of an elliptic curve, which more generally has the form $y^2 = f(x)$, where $f$ is a cubic polynomial. It turns out that the rational points lying on such a curve have an additive group structure, and this can be used as the basis for an encryption scheme and also for an efficient factoring algorithm. Wiles also exploited connections with elliptic curves in his proof of Fermat's Last Theorem. All this work on diophantine equations in few variables uses primarily algebraic techniques, so the detailed study of these topics is best left for a more advanced course.

A theorem of Lagrange states that every positive integer can be expressed as the sum of four perfect squares. In other words, the diophantine equation $x_1^2 + x_2^2 + x_3^2 + x_4^2 = n$ can be solved for every positive integer $n$. For instance, when $n = 31$ we can take $x_1 = 5$, $x_2 = 2$, $x_3 = 1$, and $x_4 = 1$. A generalization of this question known as Waring's problem asks what happens with higher powers. For instance, how large does $s$ have to be in order to represent all positive integers as sums of $s$ perfect cubes? (The answer turns out to be 9.) What if we only need to represent all sufficiently large integers? Here we know that 7 cubes suffice, but it's conjectured that 4 would be enough!
The type of diophantine equation involved in Waring's problem typically has enough variables that it can be attacked by analytic methods, and this has been a very active area of research over the past 20 years. We'll discuss some of the underlying ideas later in the course.

In Waring's problem, one could also ask what happens if the variables are restricted to be primes. For example, the Goldbach problem mentioned earlier amounts to solving the equation $p_1 + p_2 = n$ in primes $p_1$ and $p_2$ for every even $n > 2$. The general Waring-Goldbach problem considers the solubility of the diophantine equation $p_1^k + \cdots + p_s^k = n$ in primes $p_1, \ldots, p_s$ for every $n$ for which the underlying congruences are feasible.

A variation known as a diophantine inequality arises when attempting to approximate irrational numbers by rational numbers. For instance, if we want to find a rational number close to $\sqrt{2}$, then we are looking for integer solutions to the inequality $|x/y - \sqrt{2}| < \varepsilon$, where $\varepsilon$ is a small positive number. Dirichlet's theorem on diophantine approximation actually tells us that we can solve this inequality with $\varepsilon$ replaced by an explicit function of the denominator, namely $1/y^2$. Thus we can solve the diophantine inequality $|x - \sqrt{2}\,y| < 1/y$. More general inequalities (for example, involving sums of $k$th powers) are a subject of current research interest.

Where do we begin? We've only scratched the surface of number theory by mentioning some of the important ideas and some of the interesting unsolved problems. In the next section, we'll start laying the foundations for our study by developing some actual machinery on divisibility, primes, and congruences. This will lead us to our first main goal, which is to understand RSA cryptography. Following that, we hope to touch on some of the more advanced topics mentioned above, such as the distribution of primes, the algebraic structure of $\mathbb{Z}_m$, Waring's problem, and diophantine approximation.

2. Divisibility

Recall that if $a, b \in \mathbb{Z}$, we say that $a$ divides $b$ (and write $a \mid b$) if there exists $k \in \mathbb{Z}$ such that $b = ak$. For example, 2 divides 6, but 4 does not divide 6. When $a$ divides $b$, we say that $b$ is a multiple of $a$ and that $b$ is divisible by $a$. Two easy properties of divisibility that we'll find useful are given in the following lemma.

Lemma 2.1. Let $a$, $b$, and $c$ be integers.
(a) If $a \mid b$ and $b \mid c$, then $a \mid c$.
(b) If $a \mid b$ and $a \mid c$, then $a \mid (bs + ct)$ for all integers $s$ and $t$.

Proof. If $a \mid b$ and $b \mid c$, then we can write $b = ak$ and $c = bl$ for some integers $k$ and $l$. We then have $c = a(kl)$, which shows that $a \mid c$. Similarly, if $a \mid b$ and $a \mid c$, then we can write $b = ak$ and $c = al$ for some integers $k$ and $l$. If $s$ and $t$ are arbitrary integers, we have $bs + ct = aks + alt = a(ks + lt)$, which shows that $a \mid (bs + ct)$. □

The following divisibility exercise gives us a chance to review proof by mathematical induction.

Example 2.2. Prove that $n^5 - n$ is divisible by 5 for every positive integer $n$.

Solution. We proceed by induction on $n$. First of all, we have $1^5 - 1 = 0$, which is clearly divisible by 5, since $0 = 5 \cdot 0$. This establishes the base case. Now suppose that $n \ge 1$ is an integer and that $n^5 - n$ is divisible by 5. Then by the binomial theorem one has
\begin{align*}
(n + 1)^5 - (n + 1) &= n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 - n - 1 \\
&= (n^5 - n) + 5(n^4 + 2n^3 + 2n^2 + n).
\end{align*}
Here the first term on the right is divisible by 5 according to the induction hypothesis, and the second term is clearly divisible by 5 since $n^4 + 2n^3 + 2n^2 + n$ is an integer. We therefore deduce from part (b) of Lemma 2.1 that $(n + 1)^5 - (n + 1)$ is divisible by 5, and the result now follows by induction. □

In the future, we will not always be quite so pedantic in writing, but the above solution serves as a good model for constructing proofs of this type.
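A numerical check is no substitute for the induction proof, but it can catch algebra slips. The following Python sketch tests the claim of Example 2.2 and the identity used in the induction step for small $n$; the function names are chosen here for illustration and do not come from the notes.

```python
def fifth_power_gap(n: int) -> int:
    """Return n^5 - n, the quantity studied in Example 2.2."""
    return n**5 - n


def induction_step_identity_holds(n: int) -> bool:
    """Check (n+1)^5 - (n+1) == (n^5 - n) + 5(n^4 + 2n^3 + 2n^2 + n)."""
    lhs = (n + 1)**5 - (n + 1)
    rhs = (n**5 - n) + 5 * (n**4 + 2 * n**3 + 2 * n**2 + n)
    return lhs == rhs


if __name__ == "__main__":
    # Spot-check the statement and the algebraic identity for n = 1, ..., 1000.
    assert all(fifth_power_gap(n) % 5 == 0 for n in range(1, 1001))
    assert all(induction_step_identity_holds(n) for n in range(1, 1001))
    print("checked n = 1, ..., 1000")
```

Such a check only covers finitely many cases; the induction argument is what establishes the result for every positive integer.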
In general, to prove that a statement $P(n)$ holds for all positive integers $n$, one must first establish $P(1)$ and then prove the implication $P(n) \Rightarrow P(n + 1)$. This principle is one of the fundamental axioms about the integers. It is equivalent to the well-ordering principle, which states that every non-empty subset of the positive integers has a smallest element.

Greatest common divisors. The greatest common divisor of $a$ and $b$ is the largest positive integer that divides both $a$ and $b$. It is denoted by $\gcd(a, b)$, or sometimes just $(a, b)$ when there is no danger of confusion with an ordered pair. For example, $\gcd(4, 6) = 2$, $\gcd(12, 51) = 3$, and $\gcd(9, 16) = 1$. If $\gcd(a, b) = 1$, then we say that $a$ and $b$ are relatively prime (or coprime). We note that $\gcd(a, 0) = |a|$ for every non-zero integer $a$ and that $\gcd(0, 0)$ is undefined.

The least common multiple of $a$ and $b$ is the smallest positive integer that is a multiple of both $a$ and $b$. It is denoted by $\mathrm{lcm}(a, b)$ or $[a, b]$. For example, $\mathrm{lcm}(4, 6) = 12$. It is fairly easy to see that $\gcd(a, b)\,\mathrm{lcm}(a, b) = ab$ when $a$ and $b$ are positive.

When $a$ and $b$ are small, one can compute $\gcd(a, b)$ fairly easily by looking at the prime factorizations of $a$ and $b$ and picking out the parts in common. For instance, $24 = 2^3 \cdot 3$ and $180 = 2^2 \cdot 3^2 \cdot 5$, so $\gcd(24, 180) = 2^2 \cdot 3 = 12$. However, since factoring is expensive computationally, this is not an efficient method when $a$ and $b$ are large. A better method is based on the division with remainder algorithm learned in grade school.

Theorem 2.3. (Division with remainder) For any integers $a$ and $b$ with $b > 0$, there exist unique integers $q$ and $r$ such that $a = bq + r$ and $0 \le r < b$.

Proof. We first prove the existence of $q$ and $r$. Consider the list of integers
\[ \ldots,\ a - 3b,\ a - 2b,\ a - b,\ a,\ a + b,\ a + 2b,\ a + 3b,\ \ldots. \]
Since $b > 0$, we can select one with the smallest non-negative value, say $r = a - qb$.
If $r \ge b$, then we find that $r - b = a - qb - b = a - (q + 1)b$ is a non-negative number on our list with a smaller value than $r$, which contradicts our choice of $r$. Thus we have $0 \le r < b$ and $a = bq + r$. To check uniqueness, suppose there are integers $q_1$, $q_2$, $r_1$, and $r_2$ with
\[ a = q_1 b + r_1 = q_2 b + r_2 \quad\text{and}\quad 0 \le r_1, r_2 < b. \]
Then we have $b(q_1 - q_2) = r_2 - r_1$, and we may suppose without loss of generality that $r_1 \le r_2$. Then $0 \le r_2 - r_1 < b - r_1 \le b$, and hence $0 \le b(q_1 - q_2) < b$, which implies that $q_1 - q_2 = 0$. Thus $q_1 = q_2$, and it follows that $r_1 = r_2$. □

For example, if $a = 48$ and $b = 9$, then we can write $48 = 5 \cdot 9 + 3$, so we can take $q = 5$ and $r = 3$ in Theorem 2.3. We call $q$ the quotient and $r$ the remainder. Notice that $r = 0$ if and only if $b$ divides $a$.

Theorem 2.4. Let $a$ and $b$ be nonzero integers. Then $\gcd(a, b)$ is the smallest positive integral linear combination of $a$ and $b$. That is, $\gcd(a, b)$ is the smallest positive value of $as + bt$, where $s$ and $t$ are integers.

Proof. By taking $s = a$ and $t = b$, we see that positive integral linear combinations exist (since $a^2 + b^2 > 0$), so we can let $d$ denote the smallest such value. Write $d = as_0 + bt_0$. By Theorem 2.3, we can write
\[ a = qd + r = q(as_0 + bt_0) + r, \]
where $0 \le r < d$. Solving for $r$, we get
\[ r = a(1 - qs_0) + b(-qt_0), \]
so $r$ is an integral linear combination of $a$ and $b$, and since $r < d$, the minimality of $d$ implies that $r = 0$. Thus we see that $d$ divides $a$, and we can apply a similar argument to deduce that $d$ divides $b$. Thus $d$ is a common divisor of $a$ and $b$. Moreover, if $c$ is any common divisor of $a$ and $b$, then $c$ divides both $as_0$ and $bt_0$, so $c$ divides $d$. Thus we conclude that $d = \gcd(a, b)$. □

Corollary 2.5. The integers $a$ and $b$ are relatively prime if and only if there exist integers $s$ and $t$ such that $as + bt = 1$.

Proof. If $\gcd(a, b) = 1$, then it follows from Theorem 2.4 that $as + bt = 1$ for some integers $s$ and $t$.
Conversely, suppose that 1 can be expressed as a linear combination of $a$ and $b$. Since Theorem 2.4 ensures that $\gcd(a, b)$ is the smallest positive integer with this property, we may conclude that $\gcd(a, b) = 1$. □

For example, we have $9 \cdot (-7) + 16 \cdot 4 = 1$, which shows that $\gcd(9, 16) = 1$. An efficient algorithm for computing $\gcd(a, b)$ is based on the following simple result.

Lemma 2.6. If $a = bq + r$, then $\gcd(a, b) = \gcd(b, r)$.

Proof. If $d$ divides both $a$ and $b$, then $d$ clearly divides $r = a - bq$, so $d$ is a common divisor of $b$ and $r$. Conversely, if $d$ divides both $b$ and $r$, then $d$ clearly divides $a = bq + r$, so $d$ is a common divisor of $a$ and $b$. Therefore the set of common divisors of $a$ and $b$ is identical to the set of common divisors of $b$ and $r$, so the greatest common divisors must be equal. □

The Euclidean Algorithm. We can compute the greatest common divisor very efficiently by successively applying Theorem 2.3 and Lemma 2.6. The gcd is the last non-zero remainder in this process. That is, to compute $\gcd(a, b)$, we write
\begin{align*}
a &= bq_1 + r_1 & (0 < r_1 < b) \\
b &= r_1 q_2 + r_2 & (0 < r_2 < r_1) \\
r_1 &= r_2 q_3 + r_3 & (0 < r_3 < r_2) \\
&\ \ \vdots \\
r_{n-2} &= r_{n-1} q_n + r_n & (0 < r_n < r_{n-1}) \\
r_{n-1} &= r_n q_{n+1},
\end{align*}
so that $\gcd(a, b) = r_n$.

Example 2.7. Use the Euclidean algorithm to compute $d = \gcd(630, 132)$, and find integers $s$ and $t$ such that $d = 630s + 132t$.

Solution. We have
\begin{align*}
630 &= 132 \cdot 4 + 102 \\
132 &= 102 \cdot 1 + 30 \\
102 &= 30 \cdot 3 + 12 \\
30 &= 12 \cdot 2 + 6 \\
12 &= 6 \cdot 2,
\end{align*}
so the algorithm terminates with $n = 4$, and we have $\gcd(630, 132) = r_4 = 6$. We can now work backwards through these equations to find the required integers $s$ and $t$. We have
\begin{align*}
6 &= 30 - 12 \cdot 2 \\
&= 30 - (102 - 30 \cdot 3) \cdot 2 = 30 \cdot 7 - 102 \cdot 2 \\
&= (132 - 102) \cdot 7 - 102 \cdot 2 = 132 \cdot 7 - 102 \cdot 9 \\
&= 132 \cdot 7 - (630 - 132 \cdot 4) \cdot 9 = 132 \cdot 43 - 630 \cdot 9,
\end{align*}
so we can take $s = -9$ and $t = 43$. □

There is another way to organize the computations in the Euclidean algorithm that produces $\gcd(a, b)$ and the integers $s$ and $t$ simultaneously.
The idea is to set up an augmented matrix consisting of a $2 \times 2$ identity matrix, followed by $a$ and $b$ in the third column. One then subtracts a multiple of one row from the other until the entries in the third column divide one another. The multiples we use are exactly the quotients $q_1, q_2, \ldots, q_n$. Thus Example 2.7 could be handled as follows:
\[
\left[\begin{array}{rr|r} 1 & 0 & 630 \\ 0 & 1 & 132 \end{array}\right]
\to
\left[\begin{array}{rr|r} 1 & -4 & 102 \\ 0 & 1 & 132 \end{array}\right]
\to
\left[\begin{array}{rr|r} 1 & -4 & 102 \\ -1 & 5 & 30 \end{array}\right]
\to
\left[\begin{array}{rr|r} 4 & -19 & 12 \\ -1 & 5 & 30 \end{array}\right]
\to
\left[\begin{array}{rr|r} 4 & -19 & 12 \\ -9 & 43 & 6 \end{array}\right].
\]
Every row $[x \ \ y \mid z]$ of every matrix in this computation has the property that $630x + 132y = z$, because this is satisfied by the initial matrix and is preserved by the row operations. Therefore, the required integers $s$ and $t$ appear to the left of $\gcd(a, b)$ in the final matrix.

In the worst case, the Euclidean algorithm takes on the order of $\log N$ steps to compute $\gcd(a, b)$, where $N = \max(|a|, |b|)$. The function $\log N$ grows very slowly as $N \to \infty$, so the algorithm runs very quickly on a computer.

Primes. Recall that an integer $p > 1$ is said to be prime if its only positive factors are 1 and $p$. One can generate all the primes up to $N$ using the Sieve of Eratosthenes to successively strike out all the proper multiples of 2, 3, 5, etc. If a composite integer is less than $N$, then it has a prime divisor less than $\sqrt{N}$, so one can terminate this process at $\sqrt{N}$. The integers that remain uncrossed are the primes up to $N$.

Lemma 2.8. (Euclid's Lemma) Let $a$ and $b$ be integers, and let $p$ be a prime. If $p \mid ab$, then $p \mid a$ or $p \mid b$.

Proof. Suppose that $p$ divides $ab$ but that $p$ does not divide $a$. Since $p$ is prime, we must have $\gcd(a, p) = 1$, so by Theorem 2.4 there exist integers $s$ and $t$ such that $as + pt = 1$. Multiplying through by $b$, we obtain $abs + pbt = b$. Since $p \mid ab$ and $p \mid p$, we deduce from part (b) of Lemma 2.1 that $p \mid b$. □

Note that Lemma 2.8 fails if $p$ is not prime. For example, $6 \mid 12 = 3 \cdot 4$, but 6 does not divide 3 or 4.
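The matrix form of the Euclidean algorithm described earlier in this section translates directly into code. The Python sketch below is one possible rendering, assuming positive inputs; the name `extended_gcd` is chosen here for illustration. Unlike the hand computation, it keeps reducing until the third column contains a zero, which leaves the gcd and its coefficients in the other row.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, s, t) with d = gcd(a, b) = a*s + b*t, for positive a and b.

    Each row (x, y, z) maintained below satisfies a*x + b*y == z,
    mirroring the rows of the augmented matrix.
    """
    row1 = (1, 0, a)
    row2 = (0, 1, b)
    while row2[2] != 0:
        q = row1[2] // row2[2]  # quotient from division with remainder
        # Replace row1 by row1 - q*row2, then swap the roles of the two rows.
        row1, row2 = row2, tuple(u - q * v for u, v in zip(row1, row2))
    s, t, d = row1
    return d, s, t


if __name__ == "__main__":
    d, s, t = extended_gcd(630, 132)
    assert 630 * s + 132 * t == d
    print(d, s, t)  # prints 6 -9 43, matching Example 2.7
```

The invariant `a*x + b*y == z` is the same observation the notes make about every row of every matrix in the computation.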
One can easily show by induction that Lemma 2.8 can be extended to products of more than two integers. That is, if $p$ is a prime dividing the product $a_1 \cdots a_k$, then $p$ must divide at least one of the $a_i$. As a simple application of Euclid's Lemma, we perform the following entertaining exercise.

Example 2.9. Prove that $\sqrt{2}$ is irrational.

Solution. We proceed by contradiction. If $\sqrt{2}$ were rational, then we could write $\sqrt{2} = a/b$ for some positive integers $a$ and $b$ with $(a, b) = 1$. After squaring both sides and clearing denominators, we find that $2b^2 = a^2$, and hence in particular that $2 \mid a^2$. Since 2 is prime, it now follows from Euclid's Lemma that $2 \mid a$, so we can write $a = 2k$ for some integer $k$. Substituting this into our previous equation yields $2b^2 = 4k^2$, or $b^2 = 2k^2$. Thus $2 \mid b^2$ and hence by Euclid's Lemma we have $2 \mid b$. We have now deduced that both $a$ and $b$ are divisible by 2, contradicting our original assumption that $(a, b) = 1$. This contradiction forces us to conclude that $\sqrt{2}$ is in fact irrational. □

Note that there is little difficulty in generalizing the argument to handle $\sqrt{p}$, where $p$ is any prime. In fact it is not hard to see that $\sqrt{n}$ is irrational if and only if $n$ fails to be a perfect square, but this requires information about factoring composite integers. The following result is the most important application of Euclid's Lemma and, as its name suggests, is fundamental to our study of number theory.

Theorem 2.10. (Fundamental Theorem of Arithmetic) Every integer $n > 1$ can be written as a product of prime factors, and this factorization is unique up to the order of the factors.

Proof. The existence of factorizations follows easily by induction on the size of the integer $n$. For the base case, it suffices to note that $n = 2$ is prime. Now suppose that $n \ge 2$ and that every integer $m$ with $2 \le m \le n - 1$ has a factorization into primes. If $n$ is prime, then we are done.
Otherwise, we may write $n = ab$ where $2 \le a, b \le n - 1$, and the induction hypothesis shows that $a$ and $b$ both have factorizations, which combine to produce a factorization of $n$.

To prove uniqueness, we induct on the number of factors. Suppose that
\[ n = p_1 \cdots p_r = q_1 \cdots q_s, \]
where the $p_i$ and $q_j$ are primes, and we may assume without loss of generality that $r \le s$. If $r = 1$, then clearly $s = 1$, so $p_1 = q_1$. Now let $r > 1$, and suppose that unique factorization holds for all integers with fewer than $r$ prime factors. Since $p_1 \mid q_1 \cdots q_s$, we have $p_1 \mid q_j$ (and hence $p_1 = q_j$) for some $j$ by an easy extension of Euclid's Lemma. By relabeling, we may suppose that $j = 1$, and hence we may divide through by $p_1$ to get $p_2 \cdots p_r = q_2 \cdots q_s$. The induction hypothesis now implies that $r = s$ and that $q_2, \ldots, q_s$ is a permutation of $p_2, \ldots, p_r$, and the uniqueness follows. □

In rings where unique factorization fails, like $\mathbb{Z}[\sqrt{-5}]$, the problem is that the notions of "irreducible" and "prime" do not correspond. The property in Lemma 2.8 is used as the definition of prime, but there are irreducible elements that don't satisfy this property. For example, 2 is irreducible in $\mathbb{Z}[\sqrt{-5}]$, but it is not prime in this ring because 2 divides $6 = (1 + \sqrt{-5})(1 - \sqrt{-5})$, but 2 does not divide $1 + \sqrt{-5}$ or $1 - \sqrt{-5}$.

Theorem 2.11. There are infinitely many primes.

Proof. Assume to the contrary that there are only finitely many primes, say $p_1, p_2, \ldots, p_k$, and let $N = p_1 p_2 \cdots p_k + 1$. We know from Theorem 2.10 that $N$ has at least one prime factor, say $p$. We cannot have $p = p_i$ for some $i$ because this would imply that $p$ divides $1 = N - p_1 p_2 \cdots p_k$. Thus $p$ is a prime that is not on our list. This is a contradiction, so we conclude that there must be infinitely many primes. □

This theorem was first proved by Euclid, and we've given his original proof. Many other proofs have been discovered since Euclid's time.
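The construction in the proof of Theorem 2.11 can be tried out on an actual finite list of primes. The Python sketch below first generates primes with the Sieve of Eratosthenes mentioned earlier in this section, then forms $N = p_1 p_2 \cdots p_k + 1$ and extracts a prime factor by trial division. The function names are illustrative choices, not part of the notes, and starting the strike-out at $p^2$ is a common shortcut rather than something the notes prescribe.

```python
from math import prod


def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: return all primes p <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:  # it suffices to sieve with primes up to sqrt(n)
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


def smallest_prime_factor(n: int) -> int:
    """Return the smallest prime factor of n > 1 by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def new_prime_from(primes: list[int]) -> int:
    """Return a prime factor of p_1 * ... * p_k + 1; it cannot be in the list."""
    return smallest_prime_factor(prod(primes) + 1)


if __name__ == "__main__":
    known = primes_up_to(13)       # [2, 3, 5, 7, 11, 13]
    p = new_prime_from(known)      # 30031 = 59 * 509, so p is 59
    assert p not in known
    print(known, p)
```

Note that $N$ itself need not be prime, as the case $30031 = 59 \cdot 509$ shows; the proof only needs a prime factor of $N$ that is missing from the list.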
A more general theorem of Dirichlet states that there are infinitely many primes of the form $p = ak + b$ whenever $a$ and $b$ are relatively prime. For example, there are infinitely many primes of the form $p = 4k + 1$ and also of the form $p = 4k + 3$. A weak version of the prime number theorem states that if $\pi(x)$ denotes the number of primes up to $x$, then $\pi(x) \sim x/\log x$ asymptotically, in the sense that
\[ \lim_{x \to \infty} \frac{\pi(x)}{x/\log x} = 1. \]
One could interpret this by saying that the probability that the integer $x$ is prime is roughly $1/\log x$. Throughout these notes $\log x$ denotes the natural (base $e$) logarithm.

Theorem 2.12. There are arbitrarily large gaps between consecutive primes.

Proof. Given an integer $k > 1$, we'll construct a list of $k$ consecutive composite numbers. If we let $n = (k + 1)! + 2$, then the $k$ numbers
\[ n,\ n + 1,\ n + 2,\ \ldots,\ n + k - 1 \]
are all composite, since $j + 2$ divides $n + j = (k + 1)! + (j + 2)$ for $j = 0, 1, 2, \ldots, k - 1$. □

At the other extreme, the Twin Primes Conjecture states that there are infinitely many pairs of primes whose difference is 2, for instance
\[ (3, 5),\ (5, 7),\ (11, 13),\ (17, 19),\ (29, 31),\ (41, 43),\ \ldots. \]
Those familiar with analysis may wish to observe that if $p_n$ denotes the $n$th prime then Theorem 2.12 is equivalent to the statement that
\[ \limsup_{n \to \infty}\,(p_{n+1} - p_n) = \infty, \]
while the Twin Primes Conjecture asserts that
\[ \liminf_{n \to \infty}\,(p_{n+1} - p_n) = 2. \]
In spite of some recent breakthroughs in this area, as of this writing (fall 2008) we do not even know for sure that $\liminf\,(p_{n+1} - p_n) < \infty$. This indicates that we're not very close to a proof of the Twin Primes Conjecture!

Perfect numbers and Mersenne primes. A positive integer is said to be perfect if it is the sum of its proper positive divisors (that is, not including the number itself). For example,
\[ 6 = 1 + 2 + 3 \quad\text{and}\quad 28 = 1 + 2 + 4 + 7 + 14 \]
are perfect. The first few perfect numbers are 6, 28, 496, 8128, 33550336. It is believed that there are infinitely many perfect numbers, but this is not known.
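The definition of a perfect number is easy to test by machine for small values. The following Python sketch, with illustrative function names not taken from the notes, sums proper divisors by pairing each divisor $d \le \sqrt{n}$ with $n/d$ and then searches a small range.

```python
def proper_divisor_sum(n: int) -> int:
    """Sum of the positive divisors of n, excluding n itself."""
    if n <= 1:
        return 0
    total = 1  # 1 is a proper divisor of every n > 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:  # avoid counting a square root twice
                total += n // d
        d += 1
    return total


def is_perfect(n: int) -> bool:
    """True if n equals the sum of its proper positive divisors."""
    return n > 1 and proper_divisor_sum(n) == n


if __name__ == "__main__":
    print([n for n in range(1, 10000) if is_perfect(n)])  # [6, 28, 496, 8128]
```

A brute-force search like this quickly becomes impractical; the next perfect number, 33550336, already lies well beyond the range scanned here.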
Another open problem is to determine whether there are any odd perfect numbers (it's believed that the answer is no).

Theorem 2.13. A positive even integer $n$ is perfect if and only if we can write $n = 2^{k-1}(2^k - 1)$, where $2^k - 1$ is prime.

Proof. First suppose that $p = 2^k - 1$ is prime. We need to show that $n = 2^{k-1} p$ is perfect. The proper positive divisors of $n$ are
\[ 1,\ 2,\ 4,\ 8,\ \ldots,\ 2^{k-1},\ p,\ 2p,\ 4p,\ 8p,\ \ldots,\ 2^{k-2} p, \]
so their sum is
\[ 2^k - 1 + p(2^{k-1} - 1) = p + (2^{k-1} - 1)p = 2^{k-1} p = n. \]
This shows that $n$ is perfect.

Conversely, suppose that $n$ is an even perfect number. We need to show that there is an integer $k$ such that $n = 2^{k-1}(2^k - 1)$ and $2^k - 1$ is prime. Since $n$ is even, we can write $n = 2^m t$, where $m \ge 1$ and $t$ is odd. Let $\sigma$ denote the sum of all the positive divisors of $t$ (i.e., the sum of the odd positive divisors of $n$). Since $n$ is perfect, we know that the sum of all the positive divisors of $n$ is equal to $2n$, so we have
\[ 2n = \sigma + 2\sigma + 4\sigma + 8\sigma + \cdots + 2^m \sigma = (2^{m+1} - 1)\sigma, \]
and thus
\[ \sigma = \frac{2n}{2^{m+1} - 1} = \frac{2^{m+1} t}{2^{m+1} - 1} = \frac{(2^{m+1} - 1)t + t}{2^{m+1} - 1} = t + \frac{t}{2^{m+1} - 1}. \]
Since $\sigma$ and $t$ are integers, we see that $u = t/(2^{m+1} - 1)$ is an integer, and $u < t$ since $m \ge 1$. Thus $u$ and $t$ are two distinct divisors of $t$. Since $\sigma = t + u$ is the sum of all the positive divisors of $t$, it follows that they are the only positive divisors of $t$, whence $t$ is prime and $u = 1$. Thus we have $t = 2^{m+1} - 1$, so on setting $k = m + 1$ we get $n = 2^{k-1} t = 2^{k-1}(2^k - 1)$, where $2^k - 1$ is prime. □

Primes of the form $2^k - 1$ are called Mersenne primes. As a result of Theorem 2.13, finding even perfect numbers is equivalent to finding Mersenne primes. Notice that $6 = 2^1(2^2 - 1)$, $28 = 2^2(2^3 - 1)$, $496 = 2^4(2^5 - 1)$, $8128 = 2^6(2^7 - 1)$, and $33550336 = 2^{12}(2^{13} - 1)$. The following theorem restricts the possibilities somewhat.

Theorem 2.14. If $2^n - 1$ is prime, then $n$ is prime.

Proof. We prove the contrapositive. Suppose that $n$ is composite.
Then we can write $n = ab$ for some integers $a$ and $b$ with $1 < a, b < n$. Then we have
\[ 2^n - 1 = 2^{ab} - 1 = (2^a)^b - 1 = (2^a - 1)(1 + 2^a + 2^{2a} + \cdots + 2^{(b-1)a}). \]
Here we have used the factorization
\[ x^b - 1 = (x - 1)(1 + x + x^2 + \cdots + x^{b-1}) \]
with $x = 2^a$. Since $1 < a < n$, we have $1 < 2^a - 1 < 2^n - 1$, and hence we conclude that $2^n - 1$ is composite. □

The converse of Theorem 2.14 is false. That is, there exist primes $p$ for which $2^p - 1$ is not prime. The smallest example is $2^{11} - 1 = 2047 = 23 \cdot 89$. As of fall 2008, there are 46 known Mersenne primes, the largest of which is $2^{43,112,609} - 1$. This was discovered in August 2008 and has 12,978,189 digits. The largest known perfect number is therefore $2^{43,112,608}(2^{43,112,609} - 1)$. This world-record prime was actually the 45th Mersenne prime to be discovered. The 46th one was found about two weeks later but has only 11,185,272 digits. To join the Great Internet Mersenne Prime Search (GIMPS), go to http://www.mersenne.org.

3. Congruences

Let $m$ be a positive integer, and let $a$ and $b$ be arbitrary integers. We say that $a$ and $b$ are congruent modulo $m$ if $m$ divides $a - b$. In this case, we write
\[ a \equiv b \pmod{m}. \]
For example, we have $37 \equiv 2 \pmod{5}$, $37 \equiv -3 \pmod{5}$, and $24 \equiv 0 \pmod{6}$. Notice that $a \equiv 0 \pmod{m}$ if and only if $m \mid a$ and that $a \equiv b \pmod{m}$ if and only if we can write $a = b + km$ for some integer $k$.

Lemma 3.1. If $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$, then $a + c \equiv b + d \pmod{m}$ and $ac \equiv bd \pmod{m}$.

Proof. Suppose that $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$. Then there exist integers $k$ and $l$ such that $a = b + km$ and $c = d + lm$. We then have
\[ a + c = b + d + (k + l)m \quad\text{and}\quad ac = bd + (bl + dk + klm)m, \]
which shows that $a + c \equiv b + d \pmod{m}$ and $ac \equiv bd \pmod{m}$. □

This lemma allows us to manipulate congruences algebraically as we do with equations.

Example 3.2. For what integers $x$ does the congruence $4x + 1 \equiv 3 \pmod{7}$ hold?

Solution.
Subtracting 1 from both sides shows that the congruence is equivalent to $4x \equiv 2 \pmod{7}$. Multiplying both sides by 2 now gives $8x \equiv 4 \pmod{7}$, which is the same as $x \equiv 4 \pmod{7}$, since $8 \equiv 1 \pmod{7}$. Hence the congruence is satisfied by all integers $x$ of the form $x = 4 + 7k$, where $k$ is an integer. □

Lemma 3.3. (Cancellation) If $ac \equiv bc \pmod{m}$ and $(c, m) = 1$, then $a \equiv b \pmod{m}$.

Proof. Suppose that $ac \equiv bc \pmod{m}$ and $(c, m) = 1$. Then $m$ divides $ac - bc = c(a - b)$. Since $(c, m) = 1$, it follows by imitating the proof of Euclid's Lemma that $m$ divides $a - b$ (exercise). Thus we have $a \equiv b \pmod{m}$. □

Note that Lemma 3.3 may fail without the assumption that $(c, m) = 1$. For instance, we have $2 \cdot 5 \equiv 2 \cdot 14 \pmod{6}$, but $5 \not\equiv 14 \pmod{6}$.

Example 3.4. For what values of $x$ does the congruence $4x + 1 \equiv 5 \pmod{7}$ hold?

Solution. Here the congruence is equivalent to $4x \equiv 4 \pmod{7}$, and since $(4, 7) = 1$ we may apply Lemma 3.3 to conclude that $x \equiv 1 \pmod{7}$. Hence the congruence holds for all integers $x$ of the form $x = 1 + 7k$, where $k$ is an integer. □

Residue Classes. It is easy to see that congruence modulo $m$ defines an equivalence relation on the set of integers and therefore partitions the integers into equivalence classes. Our solutions to Examples 3.2 and 3.4 indicate how these are defined. In Example 3.4, for instance, the solution was the set of all integers congruent to 1 modulo 7, that is, all integers $x$ that can be expressed in the form $x = 1 + 7k$ for some integer $k$. We call this set the residue class of 1 modulo 7. It is sometimes denoted by $[1]$ or $[1]_7$. Thus
\[ [1]_7 = \{\ldots, -20, -13, -6, 1, 8, 15, 22, \ldots\}. \]
Similarly, the solution of Example 3.2 is the set of all integers in the residue class
\[ [4]_7 = \{\ldots, -17, -10, -3, 4, 11, 18, \ldots\}. \]
In general, we let $[a]$ or $[a]_m$ denote the residue class of $a$ modulo $m$, which is defined to be the set of all integers of the form $a + km$, where $k \in \mathbb{Z}$.
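Because a congruence modulo $m$ only depends on the residue class of $x$, it can be solved by simply testing the $m$ representatives $0, 1, \ldots, m - 1$. The Python sketch below checks Examples 3.2 and 3.4 this way and lists a window of a residue class; the helper names are illustrative and not taken from the notes.

```python
def solve_by_search(a: int, b: int, c: int, m: int) -> list[int]:
    """Return all x in {0, ..., m-1} with a*x + b ≡ c (mod m)."""
    return [x for x in range(m) if (a * x + b - c) % m == 0]


def residue_class_window(a: int, m: int, radius: int) -> list[int]:
    """Return the members a + k*m of [a]_m for k = -radius, ..., radius."""
    return [a + k * m for k in range(-radius, radius + 1)]


if __name__ == "__main__":
    print(solve_by_search(4, 1, 3, 7))     # Example 3.2: [4]
    print(solve_by_search(4, 1, 5, 7))     # Example 3.4: [1]
    print(residue_class_window(1, 7, 3))   # [-20, -13, -6, 1, 8, 15, 22]
    # Cancellation can fail when (c, m) > 1: 2*5 ≡ 2*14 but 5 is not ≡ 14 (mod 6).
    assert (2 * 5 - 2 * 14) % 6 == 0 and (5 - 14) % 6 != 0
```

Exhaustive search is only sensible for small moduli; for large $m$ one would use inverses computed with the Euclidean algorithm instead.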
It is often convenient to view each residue class as a single element in a number system. Therefore, we let $\mathbb{Z}_n$ denote the set of residue classes modulo $n$. Technically, we have
\[ \mathbb{Z}_n = \{[0]_n, [1]_n, [2]_n, \ldots, [n-1]_n\}, \]
but Lemma 3.1 allows us to work with any set of representatives, such as $\{0, 1, 2, \ldots, n-1\}$, when doing computations. Thus we often dispense with the brackets and just think of $\mathbb{Z}_n$ as the set $\{0, 1, 2, \ldots, n-1\}$ under mod $n$ arithmetic. With this viewpoint, we could say that the congruence in Example 3.4 has the unique solution $x = 1$ in $\mathbb{Z}_7$. Addition and multiplication in $\mathbb{Z}_7$ can be represented by the following tables:
\[
\begin{array}{c|ccccccc}
+ & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
0 & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
1 & 1 & 2 & 3 & 4 & 5 & 6 & 0 \\
2 & 2 & 3 & 4 & 5 & 6 & 0 & 1 \\
3 & 3 & 4 & 5 & 6 & 0 & 1 & 2 \\
4 & 4 & 5 & 6 & 0 & 1 & 2 & 3 \\
5 & 5 & 6 & 0 & 1 & 2 & 3 & 4 \\
6 & 6 & 0 & 1 & 2 & 3 & 4 & 5
\end{array}
\qquad
\begin{array}{c|ccccccc}
\times & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
2 & 0 & 2 & 4 & 6 & 1 & 3 & 5 \\
3 & 0 & 3 & 6 & 2 & 5 & 1 & 4 \\
4 & 0 & 4 & 1 & 5 & 2 & 6 & 3 \\
5 & 0 & 5 & 3 & 1 & 6 & 4 & 2 \\
6 & 0 & 6 & 5 & 4 & 3 & 2 & 1
\end{array}
\]
A set such as $\{0, 1, 2, \ldots, n-1\}$ that contains exactly one representative of each equivalence class is called a complete residue system modulo $n$. Complete residue systems are not unique; for instance $\{0, 1, 2, 3, 4, 5, 6\}$ and $\{-3, -2, -1, 0, 1, 2, 3\}$ are equally valid complete residue systems modulo 7, and either one could be used to represent $\mathbb{Z}_7$.

Solving Linear Congruences. We want to develop a systematic procedure for finding the solutions of a congruence of the shape $ax \equiv b \pmod n$. The following lemma is an important starting point.

Lemma 3.5. (Multiplicative Inverses) If $(a, n) = 1$, then there is an integer $b$ such that $ab \equiv 1 \pmod n$. Moreover, the residue class of $b$ modulo $n$ is unique.

Proof. Since $(a, n) = 1$, we know from Corollary 2.5 that there exist integers $s$ and $t$ with $as + nt = 1$. We then have $as = 1 - nt$, which shows that $as \equiv 1 \pmod n$, so we can take $b = s$. Now suppose that $b'$ is any other integer with $b'a \equiv 1 \pmod n$. Then
\[ b' \equiv b'(ab) \equiv (b'a)b \equiv b \pmod n, \]
and the uniqueness claim follows.
□

If $ab \equiv 1 \pmod n$, then we say that $b$ is the inverse of $a$ modulo $n$, and we sometimes write $b = a^{-1}$ or $b = a^{-1} \bmod n$. Lemma 3.5 shows that when $(a, n) = 1$, the congruence $ax \equiv b \pmod n$ has a unique solution in $\mathbb{Z}_n$, given by $x = a^{-1}b$. In view of Corollary 2.5, it is easy to see that Lemma 3.5 can be strengthened to an "if and only if" statement. That is, $a$ has a multiplicative inverse modulo $n$ if and only if $(a, n) = 1$. In order to find $a^{-1}$ when $(a, n) = 1$, we apply the Euclidean algorithm to find integers $s$ and $t$ with $as + nt = 1$. We then have $as \equiv 1 \pmod n$, so $s \equiv a^{-1} \pmod n$. For small values of $n$, we can often find inverses by inspection without resorting to the Euclidean algorithm.

Example 3.6. Solve the congruence $4x \equiv 3 \pmod 9$.

Solution. Since $(4, 9) = 1$ we know that 4 has a multiplicative inverse modulo 9, and we find by inspection that $4^{-1} = 7$ in $\mathbb{Z}_9$ since $4 \cdot 7 = 28 \equiv 1 \pmod 9$. Multiplying through by 7 now gives $x \equiv 21 \equiv 3 \pmod 9$, and hence $x = 3$ is the unique solution in $\mathbb{Z}_9$. □

Example 3.7. Solve the congruence $91x \equiv 5 \pmod{64}$.

Solution. We can start by observing that $91 \equiv 27 \pmod{64}$, so the congruence is equivalent to $27x \equiv 5 \pmod{64}$. Since $(27, 64) = 1$, we can again find a unique solution modulo 64 by multiplying through by $27^{-1}$, but finding the inverse by inspection is not quite as easy as it was in Example 3.6. Thus we apply the Euclidean algorithm:
\[
\begin{bmatrix} 1 & 0 & 64 \\ 0 & 1 & 27 \end{bmatrix}
\to
\begin{bmatrix} 1 & -2 & 10 \\ 0 & 1 & 27 \end{bmatrix}
\to
\begin{bmatrix} 1 & -2 & 10 \\ -2 & 5 & 7 \end{bmatrix}
\to
\begin{bmatrix} 3 & -7 & 3 \\ -2 & 5 & 7 \end{bmatrix}
\to
\begin{bmatrix} 3 & -7 & 3 \\ -8 & 19 & 1 \end{bmatrix}.
\]
This shows that $64 \cdot (-8) + 27 \cdot 19 = 1$ and hence that $27 \cdot 19 \equiv 1 \pmod{64}$. Hence we have $27^{-1} = 19$ in $\mathbb{Z}_{64}$. Thus $x \equiv 5 \cdot 19 \equiv 31$ is the unique solution modulo 64. □

What, if anything, can we say about the solutions to the congruence $ax \equiv b \pmod n$ when $(a, n) > 1$? The following theorem provides the answer.

Theorem 3.8. Write $d = (a, n)$. The congruence $ax \equiv b \pmod n$ has a solution if and only if $d$ divides $b$.
In this case, there are exactly $d$ solutions modulo $n$, spaced $n/d$ apart.

Proof. If $x$ is a solution to the congruence, then we have $ax = b + kn$ for some integer $k$, and thus $b = ax - kn$. Since $d \mid a$ and $d \mid n$, we must have $d \mid b$ by Lemma 2.1. Therefore the congruence has no solution if $d$ does not divide $b$. Now suppose that $d \mid b$. Then since $ax - b = kn$ if and only if $\frac{a}{d}x - \frac{b}{d} = k\frac{n}{d}$, we see that the congruence is equivalent to
\[ \frac{a}{d}\,x \equiv \frac{b}{d} \pmod{\frac{n}{d}}. \]
Since $(a/d, n/d) = 1$, Lemma 3.5 shows that there is a unique solution $x_0$ modulo $n/d$ and hence $d$ distinct solutions modulo $n$, given by $x = x_0 + k(n/d)$ for $0 \le k \le d - 1$. □

Example 3.9. Describe the solutions of the congruence $6x \equiv 5 \pmod 9$.

Solution. We have $(6, 9) = 3$, which fails to divide 5, so Theorem 3.8 tells us that there is no solution. □

Example 3.10. Describe the solutions of the congruence $24x \equiv 9 \pmod{33}$.

Solution. We have $(24, 33) = 3$, which divides 9, so the proof of Theorem 3.8 shows that the congruence is equivalent to $8x \equiv 3 \pmod{11}$. Since $8^{-1} = 7$ in $\mathbb{Z}_{11}$, we find that $x = 10$ is the unique solution modulo 11. It follows that there are exactly 3 solutions modulo 33, represented by the residue classes $x = 10$, $x = 21$, and $x = 32$. □

Applications to check digit schemes. Congruences can be used to construct a method for reducing errors in data entry. Suppose we have a list of 9-digit identification numbers of the form $x_1 x_2 \ldots x_9$ to enter into a computer. We can add a 10th digit $x_{10}$ satisfying the congruence
\[ x_{10} \equiv x_1 + \cdots + x_9 \pmod{10}; \]
that is, $x_{10}$ is the sum of the previous 9 digits modulo 10. We can now enter our ID numbers in the form $x_1 x_2 \ldots x_{10}$ and program our computer to reject our entry if the above congruence is not satisfied. For example, the number 129-28-5468 would be entered as 129-28-5468-5. The number $x_{10}$ (in this case 5) is called a check digit.
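The digit-sum rule just described is short enough to express directly in code. The Python sketch below is one possible rendering; the names `check_digit_mod10` and `is_valid_id` are hypothetical, and hyphens are simply ignored when reading the digits.

```python
def check_digit_mod10(digits):
    """Return (x1 + ... + xk) mod 10 for a string of decimal digits (hyphens ignored)."""
    return sum(int(ch) for ch in digits if ch.isdigit()) % 10


def is_valid_id(entry):
    """Accept an entry whose last digit equals the sum of the earlier digits mod 10."""
    ds = [int(ch) for ch in entry if ch.isdigit()]
    return len(ds) >= 2 and sum(ds[:-1]) % 10 == ds[-1]


if __name__ == "__main__":
    print(check_digit_mod10("129-28-5468"))  # check digit for the ID in the text
    print(is_valid_id("129-28-5468-5"))      # the full entry passes the test
```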
This scheme will catch any errors in which only a single digit is mistyped; for instance, the erroneous entry 126-28-5468-5 for the ID number above would be rejected. Many other errors will be caught as well, and this scheme can be applied to data strings of any length. One notable disadvantage is that it does not detect errors in which two digits are interchanged; for example, the entry 129-28-4568-5 would be accepted by our computer as a valid ID even though it may have resulted from mistyping 54 as 45.

In order to detect errors resulting from interchanging digits, one can employ a more sophisticated scheme. We illustrate by examining the International Standard Book Number (ISBN) system. These numbers are 10 digits long and come in 4 blocks; for instance, the ISBN for Niven, Zuckerman, and Montgomery, Introduction to the Theory of Numbers, 5th edition, is 0-471-62546-9. The first digit indicates the country of publication, the second block encodes the publisher (Wiley), the third block identifies the title and edition, and the fourth block is a check digit. If the first nine digits are $x_1, \ldots, x_9$, then the check digit $x_{10}$ is determined by the congruence
\[ x_{10} \equiv \sum_{i=1}^{9} i x_i \equiv x_1 + 2x_2 + 3x_3 + \cdots + 9x_9 \pmod{11}. \]
Thus in the above case, we would compute
\[ x_{10} \equiv 0 + 2 \cdot 4 + 3 \cdot 7 + 4 \cdot 1 + 5 \cdot 6 + 6 \cdot 2 + 7 \cdot 5 + 8 \cdot 4 + 9 \cdot 6 \equiv 196 \equiv 9 \pmod{11}. \]
We find $x_{10}$ by reducing the above expression modulo 11 to obtain one of the standard representatives $0, 1, 2, \ldots, 9, 10$. (In the event that $x_{10} = 10$, the ISBN uses X instead.) It turns out that this scheme protects both against mistyping a single digit and against interchanging two unequal digits, as long as only one of these errors occurs in a given entry.

Theorem 3.11. If $A = x_1 x_2 \ldots x_{10}$ is a valid ISBN and $B = x'_1 x'_2 \ldots x'_{10}$ is obtained from $A$ by altering exactly one digit or interchanging two unequal digits, then $B$ is not a valid ISBN.

Proof.
Note that since $10 \equiv -1 \pmod{11}$ our check digit test for a valid ISBN is equivalent to the congruence
\[ \sum_{i=1}^{10} i x_i \equiv 0 \pmod{11}. \]
Suppose that $B$ is obtained from $A$ by replacing some digit $x_j$ by $x'_j$, where $x_j \ne x'_j$. Then
\[ \sum_{i=1}^{10} i x'_i = \Bigl(\sum_{i=1}^{10} i x_i\Bigr) - j x_j + j x'_j \equiv j(x'_j - x_j) \not\equiv 0 \pmod{11} \]
by Euclid's Lemma, since 11 does not divide $j$ or $x_j - x'_j$. Suppose instead that $B$ is obtained from $A$ by interchanging the $j$th and $k$th digits, where $j \ne k$ and $x_j \ne x_k$. Then we can write $x'_j = x_k$ and $x'_k = x_j$, and hence
\[ \sum_{i=1}^{10} i x'_i \equiv \Bigl(\sum_{i=1}^{10} i x_i\Bigr) + j x_k + k x_j - j x_j - k x_k \equiv (j - k)(x_k - x_j) \not\equiv 0 \pmod{11} \]
by Euclid's Lemma, since 11 does not divide $j - k$ or $x_k - x_j$. □

Example 3.12. The code number 5-382-14572-2 was obtained from a valid ISBN by interchanging two adjacent digits. What was the original ISBN?

Solution. Adopting the notation from the proof of Theorem 3.11, we have
\[ \sum_{i=1}^{10} i x'_i = 5 + 6 + 24 + 8 + 5 + 24 + 35 + 56 + 18 + 20 = 201 \equiv 3 \pmod{11}. \]
Suppose the adjacent digits $x_j$ and $x_{j+1}$ were interchanged in the original ISBN. Then by applying the last displayed equation in the proof of Theorem 3.11 with $k = j + 1$, we see that
\[ x'_{j+1} - x'_j = x_j - x_{j+1} \equiv 3 \pmod{11}. \]
In the given code, we have $x'_6 - x'_5 = 3$, and there is no other pair of adjacent digits with this property, so these must be the ones that were interchanged. It follows that the original ISBN was 5-382-41572-2. □

In the above example, we were able to use the ISBN scheme not only to detect an error but also to correct it, assuming we were fairly confident that the error involved transposing adjacent digits. Of course, if there was more than one adjacent pair $(x'_j, x'_{j+1})$ in the erroneous code with $x'_{j+1} - x'_j = 3$, then we'd be less successful.

Recently, the above system (known as ISBN-10) has been phased out in favor of a 13-digit code that is compatible with the UPC/EAN scheme.
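The ISBN-10 test from Theorem 3.11 and the correction step of Example 3.12 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are ours, hyphens are ignored, and the character X is read as the value 10 wherever it appears, although in a real ISBN it is only legitimate in the final position.

```python
def isbn10_weighted_sum(isbn):
    """Return (1*x1 + 2*x2 + ... + 10*x10) mod 11; hyphens are ignored, 'X' counts as 10."""
    chars = [ch for ch in isbn.upper() if ch.isdigit() or ch == "X"]
    if len(chars) != 10:
        raise ValueError("expected 10 ISBN characters")
    values = [10 if ch == "X" else int(ch) for ch in chars]
    return sum(i * x for i, x in enumerate(values, start=1)) % 11


def is_valid_isbn10(isbn):
    """A code is valid exactly when its weighted sum is 0 modulo 11."""
    return isbn10_weighted_sum(isbn) == 0


def fix_adjacent_swap(isbn):
    """Return every valid code obtained by swapping one pair of adjacent unequal characters."""
    chars = [ch for ch in isbn.upper() if ch.isdigit() or ch == "X"]
    fixes = []
    for j in range(len(chars) - 1):
        if chars[j] != chars[j + 1]:
            candidate = chars[:]
            candidate[j], candidate[j + 1] = candidate[j + 1], candidate[j]
            if is_valid_isbn10("".join(candidate)):
                fixes.append("".join(candidate))
    return fixes


if __name__ == "__main__":
    print(is_valid_isbn10("0-471-62546-9"))      # the Niven-Zuckerman-Montgomery ISBN
    print(isbn10_weighted_sum("5-382-14572-2"))  # weighted sum of the corrupted code
    print(fix_adjacent_swap("5-382-14572-2"))    # candidates for the original ISBN
```

When `fix_adjacent_swap` returns exactly one candidate, the transposition can be corrected unambiguously, which is the situation in Example 3.12.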
Here the check digit is determined by the congruence
\[ x_1 + 3x_2 + x_3 + 3x_4 + x_5 + \cdots + 3x_{12} + x_{13} \equiv 0 \pmod{10}, \]
and a 12-digit UPC is converted to this form by putting an extra 0 at the beginning. Since the arithmetic now occurs in $\mathbb{Z}_{10}$, there is no need to allow X as a possible check digit. This scheme (known as ISBN-13) still detects all single-digit errors but unfortunately no longer detects all transpositions. Many recent books contain both the ISBN-10 and ISBN-13 codes.

Fermat's Little Theorem. In many applications of congruences, it is important to be able to compute powers of an integer efficiently modulo some number $n$. In the case where $n$ is a prime, we have the following useful result.

Theorem 3.13. (Fermat's Little Theorem) If $p$ is a prime not dividing $a$, then $a^{p-1} \equiv 1 \pmod p$.

Proof. Suppose that $p$ does not divide $a$, and consider the product
\[ P = a \cdot 2a \cdot 3a \cdots (p-1)a = a^{p-1}[1 \cdot 2 \cdot 3 \cdots (p-1)] = a^{p-1}(p-1)!. \]
Suppose that $1 \le i, j \le p - 1$ and that $ia \equiv ja \pmod p$. Since $(a, p) = 1$, Lemma 3.3 implies that $i \equiv j \pmod p$, and hence that $i = j$. Therefore, the integers $a, 2a, 3a, \ldots, (p-1)a$ represent all the non-zero residue classes modulo $p$, and hence their product, $P$, must be congruent modulo $p$ to $1 \cdot 2 \cdot 3 \cdots (p-1) = (p-1)!$. That is, we have
\[ a^{p-1}(p-1)! \equiv (p-1)! \pmod p. \]
Now since all the prime factors of $(p-1)!$ are smaller than $p$, we find that $p$ and $(p-1)!$ are relatively prime, and thus Lemma 3.3 implies that $a^{p-1} \equiv 1 \pmod p$. □

We can use Fermat's Little Theorem to compute powers modulo a prime very efficiently by applying division with remainder to the exponent. Usually we are interested in the least non-negative representative for a particular residue class; this is sometimes called the residue and denoted by the MOD symbol. For instance, the residue of 8 modulo 5 is $8\ \mathrm{MOD}\ 5 = 3$.

Example 3.14. Compute $2^{2008}\ \mathrm{MOD}\ 13$.

Solution.
Since 13 is prime and doesn't divide 2, Theorem 3.13 implies that $2^{12} \equiv 1 \pmod{13}$. Moreover, division with remainder yields $2008 = 12 \cdot 167 + 4$, so
\[ 2^{2008} = 2^{12 \cdot 167 + 4} = (2^{12})^{167} \cdot 2^4 \equiv 2^4 \equiv 3 \pmod{13}. \]
Thus we have $2^{2008}\ \mathrm{MOD}\ 13 = 3$. □

Fermat's Little Theorem also yields a negative test for primality, which is often faster than trial division. If $a$ is a positive integer not divisible by $n$ and we can show that $a^{n-1} \not\equiv 1 \pmod n$, then we may conclude that $n$ is not prime. However, the converse of this is false. For example, $2^{340} \equiv 1 \pmod{341}$, and yet $341 = 11 \cdot 31$ is not prime. So this does not give a way to prove that an integer is prime. We'll return to this topic in the next section.

Reduced residues and Euler's Theorem. Recall that $a$ has a multiplicative inverse modulo $n$ if and only if $(a, n) = 1$. When $n$ is prime, the residues with this property are just $1, 2, 3, \ldots, n - 1$. In general, we write $\phi(n)$ for the number of positive integers less than or equal to $n$ that are relatively prime to $n$. This is known as Euler's phi function. For instance, we have $\phi(1) = 1$, $\phi(2) = 1$, $\phi(3) = 2$, $\phi(4) = 2$, $\phi(5) = 4$, $\phi(6) = 2$, $\phi(7) = 6$, $\phi(8) = 4$, $\phi(9) = 6$, and $\phi(10) = 4$. Notice that $\phi(p) = p - 1$ whenever $p$ is prime. The property of being relatively prime to $n$ depends only on the residue class of an integer, since $(a, n) = (a + kn, n)$ for any integer $k$ by Lemma 2.6. Therefore, we can view $\phi(n)$ as the number of residue classes modulo $n$ that are relatively prime to $n$. Any set of representatives for these classes is called a reduced residue system modulo $n$. For instance, $\{1, 2, 3, 4\}$ is a reduced residue system modulo 5, while $\{1, 3, 7, 9\}$ and $\{-3, -1, 1, 3\}$ are reduced residue systems modulo 10. We often use $\mathbb{Z}_n^*$ to denote a reduced residue system modulo $n$. Those familiar with abstract algebra may wish to note that $\mathbb{Z}_n^*$ forms a group under multiplication. The following result generalizes Fermat's Little Theorem to the case of composite moduli.

Theorem 3.15.
(Euler's Theorem) If $a$ and $n$ are positive integers with $(a, n) = 1$, then $a^{\phi(n)} \equiv 1 \pmod n$.

Proof. Let $r_1, \ldots, r_{\phi(n)}$ denote the positive integers less than or equal to $n$ that are relatively prime to $n$, and let $b_i = a r_i\ \mathrm{MOD}\ n$ be the residue of $a r_i$ modulo $n$. Suppose that $1 \le i, j \le \phi(n)$ and $b_i = b_j$. Then $a r_i \equiv a r_j \pmod n$, which implies that $r_i \equiv r_j \pmod n$ since $(a, n) = 1$. Since $r_1, \ldots, r_{\phi(n)}$ are distinct integers between 1 and $n$, we must have $i = j$. This shows that $b_1, \ldots, b_{\phi(n)}$ are distinct. Moreover, it is clear that $(b_i, n) = 1$ for each $i$, so $\{b_1, \ldots, b_{\phi(n)}\}$ is a reduced residue system modulo $n$. In particular, we have
\[ r_1 \cdots r_{\phi(n)} \equiv b_1 \cdots b_{\phi(n)} \equiv a r_1 \cdots a r_{\phi(n)} \equiv a^{\phi(n)} r_1 \cdots r_{\phi(n)} \pmod n. \]
Since $r_1 \cdots r_{\phi(n)}$ is relatively prime to $n$, we conclude that $a^{\phi(n)} \equiv 1 \pmod n$, as desired. □

Example 3.16. Compute $5^{999}\ \mathrm{MOD}\ 12$.

Solution. We have $\phi(12) = 4$ and $(5, 12) = 1$, so Theorem 3.15 implies that $5^4 \equiv 1 \pmod{12}$. Since $999 = 4 \cdot 249 + 3$, we have
\[ 5^{999} = 5^{4 \cdot 249 + 3} = (5^4)^{249} \cdot 5^3 \equiv 5^3 \equiv 5 \pmod{12}. \]
Thus we have $5^{999}\ \mathrm{MOD}\ 12 = 5$. □

It turns out that $\phi(n)$ can be computed easily provided that the prime factorization of $n$ is known. This follows from the following important theorem about simultaneous congruences. We say that integers $m_1, \ldots, m_r$ are pairwise relatively prime if $(m_i, m_j) = 1$ whenever $i \ne j$.

Theorem 3.17. (Chinese Remainder Theorem) Let $m_1, \ldots, m_r$ be pairwise relatively prime positive integers, and let $a_1, \ldots, a_r$ be any integers. There exists an integer $x$ satisfying the system of congruences
\[ x \equiv a_1 \pmod{m_1}, \quad x \equiv a_2 \pmod{m_2}, \quad \ldots, \quad x \equiv a_r \pmod{m_r}, \]
and $x$ is unique modulo $m_1 \cdots m_r$.

Proof. Let $M = m_1 \cdots m_r$, and for each $i$ write $M_i = M/m_i$. Since the $m_i$ are pairwise relatively prime, we have $(M_i, m_i) = 1$, and thus Theorem 3.8 shows that there is a unique integer $s_i$ modulo $m_i$ satisfying the congruence $M_i s_i \equiv a_i \pmod{m_i}$.
It is easy to check that the integer
\[ x = M_1 s_1 + M_2 s_2 + \cdots + M_r s_r \]
satisfies our system of congruences. If $x'$ is another solution to the system, then we have $x \equiv x' \pmod{m_i}$ for each $i$, and hence $x - x'$ is divisible by $m_i$. Since the $m_i$ are pairwise relatively prime, it follows easily that $x - x'$ is divisible by $M$, which establishes uniqueness modulo $M$. □

Example 3.18. Solve the system of congruences
\[ x \equiv 1 \pmod 5, \quad 2x \equiv 4 \pmod 6, \quad 3x \equiv 2 \pmod 7. \]

Solution. We first rewrite the system in a form to which Theorem 3.17 applies. In view of Theorem 3.8, we see that the system is equivalent to
\[ x \equiv 1 \pmod 5, \quad x \equiv 2 \pmod 3, \quad x \equiv 3 \pmod 7, \]
and we may now employ the proof of Theorem 3.17 with $m_1 = 5$, $m_2 = 3$, and $m_3 = 7$ to produce a unique solution modulo $M = 105$. We must find integers $s_1$, $s_2$, and $s_3$ satisfying the congruences
\[ 21 s_1 \equiv 1 \pmod 5, \quad 35 s_2 \equiv 2 \pmod 3, \quad 15 s_3 \equiv 3 \pmod 7. \]
We see easily by inspection that $s_1 = 1$, $s_2 = 1$, and $s_3 = 3$ are solutions, and thus
\[ x = 21 \cdot 1 + 35 \cdot 1 + 15 \cdot 3 = 101 \]
is the unique solution of the original system modulo 105. Hence the solutions are precisely the integers of the form $x = 101 + 105k$, where $k \in \mathbb{Z}$. □

The Chinese Remainder Theorem also allows us to deal with systems of congruences in which the moduli are not pairwise relatively prime. The technique is to convert the system to an equivalent one in which all the moduli are distinct prime powers.

Example 3.19. Find all solutions of the system $x \equiv 1 \pmod{36}$ and $x \equiv 5 \pmod{56}$.

Solution. By the Chinese Remainder Theorem, the first congruence is equivalent to the pair $x \equiv 1 \pmod 4$ and $x \equiv 1 \pmod 9$, and the second congruence is equivalent to the pair $x \equiv 5 \pmod 8$ and $x \equiv 5 \pmod 7$. The congruences modulo powers of 2 must contain either redundant or contradictory information, so we examine these more carefully.
If $x \equiv 5 \pmod 8$, then we can write $x = 8k + 5 = 4(2k + 1) + 1$ for some $k \in \mathbb{Z}$, and it follows that $x \equiv 1 \pmod 4$. Since $x \equiv 5 \pmod 8$ implies $x \equiv 1 \pmod 4$, the latter congruence is redundant and may be eliminated from consideration. We have therefore reduced to the system
\[ x \equiv 5 \pmod 8, \quad x \equiv 1 \pmod 9, \quad x \equiv 5 \pmod 7, \]
and here the moduli are pairwise relatively prime, so Theorem 3.17 applies. We know that the unique solution modulo $M = 504$ is given by $x = 63 s_1 + 56 s_2 + 72 s_3$, where $s_1$, $s_2$, and $s_3$ are integers satisfying
\[ 63 s_1 \equiv 5 \pmod 8, \quad 56 s_2 \equiv 1 \pmod 9, \quad 72 s_3 \equiv 5 \pmod 7, \]
or equivalently,
\[ 7 s_1 \equiv 5 \pmod 8, \quad 2 s_2 \equiv 1 \pmod 9, \quad 2 s_3 \equiv 5 \pmod 7. \]
We see that $s_1 = 3$, $s_2 = 5$, and $s_3 = 6$ satisfy these congruences, and thus
\[ x = 63 \cdot 3 + 56 \cdot 5 + 72 \cdot 6 = 901 \equiv 397 \pmod{504} \]
is the unique solution modulo 504. □

Example 3.20. Find all solutions of the system $x \equiv 1 \pmod{36}$ and $x \equiv 3 \pmod{56}$.

Solution. As in the previous example, the Chinese Remainder Theorem implies that the system is equivalent to
\[ x \equiv 1 \pmod 4, \quad x \equiv 3 \pmod 8, \quad x \equiv 1 \pmod 9, \quad x \equiv 3 \pmod 7. \]
But if $x \equiv 3 \pmod 8$, then we have $x = 8k + 3 = 4(2k) + 3$ for some integer $k$, which shows that $x \equiv 3 \pmod 4$. Hence these two congruences are inconsistent, and we conclude that the system has no solution. □

One way of viewing the Chinese Remainder Theorem is that it gives a bijection between the integers $x$ with $0 \le x < M$ and the integral $r$-tuples $(a_1, \ldots, a_r)$ with $0 \le a_i < m_i$. The correspondence is given by
\[ x \mapsto (x\ \mathrm{MOD}\ m_1, \ldots, x\ \mathrm{MOD}\ m_r). \]
The CRT is what allows us to recover $x$ uniquely modulo $M$ from the numbers $a_i = x\ \mathrm{MOD}\ m_i$. In fact, this yields a bijection between the $\phi(M)$ reduced residue classes modulo $M$ and the $\phi(m_1) \cdots \phi(m_r)$ $r$-tuples of reduced residue classes modulo $m_1, \ldots, m_r$. This observation allows us to prove the following important multiplicative property of Euler's phi function.

Theorem 3.21.
If $(m, n) = 1$, then $\phi(mn) = \phi(m)\phi(n)$.

Proof. By the Chinese Remainder Theorem, there is a one-to-one correspondence,
\[ x \mapsto (x\ \mathrm{MOD}\ m,\ x\ \mathrm{MOD}\ n), \]
between the integers $x$ with $0 \le x < mn$ and the pairs $(a, b)$ with $0 \le a < m$ and $0 \le b < n$. Now suppose that $x$ is one of the $\phi(mn)$ integers with $(x, mn) = 1$. Then one clearly has $(x, m) = (x, n) = 1$, so Lemma 2.6 implies that $(x\ \mathrm{MOD}\ m, x\ \mathrm{MOD}\ n)$ is one of the $\phi(m)\phi(n)$ pairs $(a, b)$ with $(a, m) = (b, n) = 1$. On the other hand, if $x \equiv a \pmod m$ and $x \equiv b \pmod n$, where $(a, m) = (b, n) = 1$, then Lemma 2.6 shows that $(x, m) = (x, n) = 1$ and hence that $(x, mn) = 1$. It follows that the CRT bijection specializes to a bijection among reduced residue classes. □

To help visualize the correspondence used in the proof of Theorem 3.21, we illustrate it explicitly for the case $m = 8$, $n = 9$. In the column labeled $a$ and the row labeled $b$ we write the unique integer $x$ with $0 \le x < 72$ that satisfies $x \equiv a \pmod 8$ and $x \equiv b \pmod 9$. The reduced residues modulo 8, 9, and 72 are indicated by stars, and we see that $\phi(72) = 24 = 4 \cdot 6 = \phi(8)\phi(9)$.
\[
\begin{array}{c|cccccccc}
 & 0 & 1^* & 2 & 3^* & 4 & 5^* & 6 & 7^* \\ \hline
0 & 0 & 9 & 18 & 27 & 36 & 45 & 54 & 63 \\
1^* & 64 & 1^* & 10 & 19^* & 28 & 37^* & 46 & 55^* \\
2^* & 56 & 65^* & 2 & 11^* & 20 & 29^* & 38 & 47^* \\
3 & 48 & 57 & 66 & 3 & 12 & 21 & 30 & 39 \\
4^* & 40 & 49^* & 58 & 67^* & 4 & 13^* & 22 & 31^* \\
5^* & 32 & 41^* & 50 & 59^* & 68 & 5^* & 14 & 23^* \\
6 & 24 & 33 & 42 & 51 & 60 & 69 & 6 & 15 \\
7^* & 16 & 25^* & 34 & 43^* & 52 & 61^* & 70 & 7^* \\
8^* & 8 & 17^* & 26 & 35^* & 44 & 53^* & 62 & 71^*
\end{array}
\]

Corollary 3.22. If $n = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$, where $p_1, \ldots, p_k$ are distinct primes, then
\[ \phi(n) = \left(p_1^{\alpha_1} - p_1^{\alpha_1 - 1}\right) \cdots \left(p_k^{\alpha_k} - p_k^{\alpha_k - 1}\right) = n\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_k}\right). \]

Proof. Applying Theorem 3.21 repeatedly gives $\phi(n) = \phi(p_1^{\alpha_1}) \cdots \phi(p_k^{\alpha_k})$. Now if $p$ is prime, then the only positive integers less than or equal to $p^t$ that are not relatively prime to $p^t$ are the multiples of $p$, namely $p, 2p, 3p, \ldots, p^{t-1}p$. Since there are $p^{t-1}$ such multiples, we have $\phi(p^t) = p^t - p^{t-1} = p^t(1 - 1/p)$, and the result follows. □

Example 3.23. Compute $\phi(21000)$.

Solution.
We have $21000 = 2^3 \cdot 3 \cdot 5^3 \cdot 7$, so Corollary 3.22 gives
\[ \phi(21000) = (2^3 - 2^2)(3 - 1)(5^3 - 5^2)(7 - 1) = 4 \cdot 2 \cdot 100 \cdot 6 = 4800. \]
□

Example 3.24. Find the last two digits of $3^{2008}$.

Solution. The last two digits are determined by the residue class modulo 100. Since $100 = 2^2 \cdot 5^2$, we have $\phi(100) = (4 - 2)(25 - 5) = 40$ by Corollary 3.22. Moreover, one has $2008 = 40 \cdot 50 + 8$, so Euler's Theorem gives
\[ 3^{2008} = (3^{40})^{50} \cdot 3^8 \equiv 3^8 \equiv 61 \pmod{100}. \]
Therefore the last two digits are 61. □

4. Public-key cryptography

We can use Euler's Theorem to devise a scheme for public-key encryption. In such a system, each individual creates and publishes some unique data (known as a public key) that allows them to receive encrypted messages from other users. The system we'll describe was developed at MIT in 1977 by Rivest, Shamir, and Adleman and is commonly known as RSA.

To construct the code, we choose two large primes $p$ and $q$, say around 200 digits each. We then compute $n = pq$ and use Corollary 3.22 to calculate $\phi(n) = (p - 1)(q - 1)$. Next we choose an integer $e > 1$ that is relatively prime to $\phi(n)$ and use the Euclidean algorithm to find $d = e^{-1}\ \mathrm{MOD}\ \phi(n)$. We make the pair $(n, e)$ publicly available but keep $d$ secret. Obviously, we keep $p$ and $q$ secret as well, since knowing them would enable one to find $\phi(n)$, and hence $d$. The security of the system rests on the fact that it is essentially impossible to factor $n$ in a reasonable amount of time with current technology.

To encrypt a message to a user whose public key is $(n, e)$, we first create a digital version of the message, say $M$, using some character-to-integer scheme such as ASCII. For simplicity, we use the conversions
A = 01, B = 02, C = 03, D = 04, ..., Y = 25, Z = 26,
so that each letter of the alphabet corresponds to a 2-digit integer, and we use 27 to represent a space. If desired, we could introduce additional integers to stand for punctuation marks and other symbols.
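The key construction and the letter-to-integer conversion described above can be sketched in Python. Everything here is illustrative: the helper names `extended_gcd`, `mod_inverse`, and `encode` are our own, the toy primes 59 and 83 and the exponent 19 are assumed values that are far too small for real use, and `encode` assumes the text contains only letters and spaces.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t == g (Euclidean algorithm)."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def mod_inverse(a, n):
    """Return the inverse of a modulo n as a residue in {0, ..., n-1}."""
    g, s, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError("no inverse exists because gcd(a, n) != 1")
    return s % n


def encode(text):
    """Convert letters to 01..26 and spaces to 27, then read the digits as one integer."""
    codes = [27 if ch == " " else ord(ch) - ord("A") + 1 for ch in text.upper()]
    return int("".join(f"{c:02d}" for c in codes))


if __name__ == "__main__":
    p, q, e = 59, 83, 19          # assumed toy parameters, not secure
    n = p * q
    phi = (p - 1) * (q - 1)
    d = mod_inverse(e, phi)       # secret decryption exponent
    print("public key:", (n, e), "private exponent:", d)
    print("encoded message:", encode("YO"))
```

One design caveat of this sketch: a message beginning with A through I loses its leading zero when converted to an integer, so a real implementation would need to track the block length separately.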
If $M \ge n$, we break the message into blocks so that each block is smaller than $n$. We then encrypt the message by computing
\[ E = M^e\ \mathrm{MOD}\ n. \]
The recipient then decrypts the message by computing $E^d\ \mathrm{MOD}\ n$, using Euler's Theorem and the fact that $de = 1 + k\phi(n)$ for some integer $k$. One has
\[ E^d \equiv (M^e)^d \equiv M^{de} \equiv M^{1 + k\phi(n)} \equiv (M^{\phi(n)})^k \cdot M \equiv M \pmod n, \]
and thus $E^d\ \mathrm{MOD}\ n = M$. In applying Euler's Theorem, we implicitly assumed that $(M, n) = 1$, but the probability that this fails is negligible when $n$ is composed of two 200-digit primes. The point of RSA is that, given $n$ and $e$, one cannot compute the decryption key $d$ without knowing $\phi(n)$, which is equivalent to knowing the factorization $n = pq$.

Example 4.1. Decode the encrypted message 0828, which was generated using RSA with public key (4897, 19).

Solution. Here the integer $n = 4897$ is far too small to create a secure cryptosystem, and after some trial division we easily obtain the factorization $n = pq$, where $p = 59$ and $q = 83$. It now follows from Corollary 3.22 that $\phi(n) = 58 \cdot 82 = 4756$. Next we calculate the decryption key $d = 19^{-1}\ \mathrm{MOD}\ 4756$ using the Euclidean Algorithm:
\[
\begin{bmatrix} 1 & 0 & 4756 \\ 0 & 1 & 19 \end{bmatrix}
\to
\begin{bmatrix} 1 & -250 & 6 \\ 0 & 1 & 19 \end{bmatrix}
\to
\begin{bmatrix} 1 & -250 & 6 \\ -3 & 751 & 1 \end{bmatrix}.
\]
Hence we have $d = 751$, so we can decrypt the message by computing $M = 828^{751}\ \mathrm{MOD}\ 4897$. This computation is doable on a calculator by successive squaring. We write the exponent 751 in binary as
\[ 1011101111_2 = 512 + 128 + 64 + 32 + 8 + 4 + 2 + 1 \]
and square $c = 828$ repeatedly modulo 4897. This gives $c^2 \equiv 4$, $c^4 \equiv 16$, $c^8 \equiv 256$, $c^{16} \equiv 1875$, $c^{32} \equiv -421$, $c^{64} \equiv 949$, $c^{128} \equiv -447$, $c^{256} \equiv -968$, and $c^{512} \equiv 1697$, and it follows that
\[ M \equiv c^{512} c^{128} c^{64} c^{32} c^{8} c^{4} c^{2} c \equiv 2515 \pmod{4897}, \]
so the message was "YO". □

The method of successive squaring used in the above example gives a good way of performing fast modular exponentiation, which has been implemented in many software packages. For instance, Mathematica has the function PowerMod[a,b,n], which quickly computes $a^b\ \mathrm{MOD}\ n$.
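For readers without Mathematica, the successive-squaring idea from Example 4.1 can be sketched in a few lines of Python. The function name `power_mod` is our own; Python's built-in three-argument `pow(a, b, n)` performs the same computation and is what one would normally use in practice.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by successive squaring.

    The exponent is read in binary from its lowest bit upward; `square` runs
    through base, base^2, base^4, ... and is multiplied into the result
    whenever the current bit of the exponent is 1.
    """
    result = 1 % modulus
    square = base % modulus
    while exponent > 0:
        if exponent & 1:
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1
    return result


if __name__ == "__main__":
    print(bin(751))                    # binary expansion of the exponent
    print(power_mod(828, 751, 4897))   # decryption step of Example 4.1
    print(power_mod(2, 2008, 13))      # Example 3.14
```

Only about $\log_2$ of the exponent many squarings are needed, which is why the method stays practical even for exponents with hundreds of digits.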
A useful tool for finding modular inverses is the function ExtendedGCD[a,b], which returns $\gcd(a, b)$, together with integers $s$ and $t$ satisfying $\gcd(a, b) = as + bt$. Mathematica has built-in arbitrary precision, so it is great for handling long integers without the fear of truncation. Programs that only store, say, the first 16 digits of a number give more than sufficient accuracy for many applications, but losing even a single digit of an integer is obviously devastating for number theory.

Digital Signatures. We can also apply the RSA encryption principle to authenticate digital signatures. If your public key is $(n, e)$ and you send me your signature $S$ in the form $D = S^d\ \mathrm{MOD}\ n$, where $d$ is your personal decryption key, then I can recover $S$ by computing
\[ D^e\ \mathrm{MOD}\ n = S^{de}\ \mathrm{MOD}\ n = S^{1 + k\phi(n)}\ \mathrm{MOD}\ n = S. \]
Moreover, I know that the signature is authentic since you're the only one who knows $d$. If the signature had been encrypted using an incorrect $d$ then I would most likely obtain gibberish when attempting to recover $S$.

Example 4.2. You receive the digital signature 20496 from a user with initials S. P. and public key (21311, 41). Does it appear to be authentic?

Solution. We compute $20496^{41}\ \mathrm{MOD}\ 21311$ using Mathematica and get

PowerMod[20496, 41, 21311] = 1916.

Since S = 19 and P = 16, the message must have come from S. P., or at least someone with access to his decryption key. Of course, the numbers are once again so small that anyone could have figured out the decryption key and sent a phony signature. □

The digital signature process described above presumes that the signer is not concerned about the possibility of his or her signature being viewed by a third party. The goal is simply to provide a method for the recipient to verify the signer's identity. One can transmit sensitive information and verify the identity of the sender by nesting an encryption within the digital signature process.
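The basic sign-and-verify round trip described above can be sketched in Python using the built-in `pow`. The key below is an assumed toy example (the modulus 4897 = 59 · 83 with exponents 19 and 751, which satisfy $19 \cdot 751 = 1 + 3 \cdot 4756$); the function names `sign` and `recover` are ours, and nothing here is suitable for real use.

```python
def sign(signature_value, d, n):
    """Signer's step: raise the signature S to the private exponent d modulo n."""
    return pow(signature_value, d, n)


def recover(signed_value, e, n):
    """Recipient's step: raise D to the public exponent e modulo n to get S back."""
    return pow(signed_value, e, n)


if __name__ == "__main__":
    n, e, d = 4897, 19, 751   # assumed toy key: e*d = 1 + 3*phi(n), phi(n) = 4756
    S = 1916                  # initials "S. P." under A=01, ..., Z=26
    D = sign(S, d, n)
    print("transmitted value:", D)
    print("recovered signature:", recover(D, e, n))
    print("authentic:", recover(D, e, n) == S)
```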
Suppose that Alice, whose public key is $(n_A, e_A)$, wants to send a message $M$ to Bob, whose public key is $(n_B, e_B)$, and Bob wants to be sure that the message is really coming from Alice. Alice first "signs" the message using her own decryption key, $d_A$, and then encrypts it using Bob's public key. Thus she computes $S = M^{d_A}\ \mathrm{MOD}\ n_A$ and sends Bob $E = S^{e_B}\ \mathrm{MOD}\ n_B$. When Bob receives $E$, he uses his decryption key $d_B$ to compute $S = E^{d_B}\ \mathrm{MOD}\ n_B$ and then recovers the message as $M = S^{e_A}\ \mathrm{MOD}\ n_A$. At this point, he knows the contents of the message and can be sure that it was sent by Alice and not someone pretending to be Alice.

Primality testing. One issue in implementing RSA is that we need to find large integers that are known to be prime. Fortunately, there are alternatives to trial division for investigating primality. As mentioned in §3, Fermat's Little Theorem can be used to show that an integer is not prime. For example, if $p$ is an odd prime, then the theorem tells us that $2^{p-1} \equiv 1 \pmod p$. The converse of this statement is false; that is, there exist odd composite integers $n$ with the property that $2^{n-1} \equiv 1 \pmod n$. However, it turns out that such integers are fairly rare, so there is a good chance that an integer $n$ satisfying this congruence will in fact be prime. An odd composite integer $n$ satisfying this congruence is called a pseudoprime. The only pseudoprimes less than 1000 are $341 = 11 \cdot 31$, $561 = 3 \cdot 11 \cdot 17$, and $645 = 3 \cdot 5 \cdot 43$. More generally, if $n$ is an odd composite integer with $(a, n) = 1$ and $a^{n-1} \equiv 1 \pmod n$, then we say that $n$ is a pseudoprime for the base $a$. If we want to know whether $n$ is prime, we could first test for divisibility by 2, 3, 5, and 7 and then compute $2^{n-1}\ \mathrm{MOD}\ n$, $3^{n-1}\ \mathrm{MOD}\ n$, $5^{n-1}\ \mathrm{MOD}\ n$, and $7^{n-1}\ \mathrm{MOD}\ n$, for instance. If any of these is not equal to 1, then Fermat's Theorem implies that $n$ is not prime. If they are all equal to 1, then $n$ is very likely (but not certain) to be prime.
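The procedure just outlined (divide by a few small primes, then check Fermat's congruence for each of them as a base) can be sketched in Python as follows. The function name `fermat_test` and its default base list are illustrative assumptions; a return value of `True` means only that no base exposed $n$ as composite, not that $n$ has been proved prime.

```python
def fermat_test(n, bases=(2, 3, 5, 7)):
    """Return False if some base proves n composite, True if n passes every base.

    The bases are assumed to be primes, so n equal to a base is reported as prime.
    """
    if n < 2:
        return False
    for a in bases:
        if n == a:
            return True
        if n % a == 0:
            return False               # n has the small prime factor a
        if pow(a, n - 1, n) != 1:
            return False               # Fermat's Little Theorem fails, so n is composite
    return True


if __name__ == "__main__":
    print(fermat_test(341, bases=(2,)))   # 341 is a pseudoprime for the base 2
    print(fermat_test(341))               # but the base 3 exposes it as composite
    print(fermat_test(97))                # a genuine prime passes every base
```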
Interestingly, and perhaps unfortunately, there are odd composite integers $n$ that are pseudoprimes for every base $a$ with $(a, n) = 1$. Such numbers are called Carmichael numbers, and the smallest one is 561. Carmichael numbers are very sparse (561 is the only one less than 1000), but it was proved in 1994 by Alford, Granville, and Pomerance that there are infinitely many! In fact, they showed that there are at least $x^{2/7}$ Carmichael numbers not exceeding $x$ for all sufficiently large $x$.

The concept of pseudoprimes can be strengthened by making the following simple observation. Suppose that $a^{n-1} \equiv 1 \pmod n$. If $n$ is odd, we can write $n = 2m + 1$ for some integer $m$, and we see that $n$ divides $a^{2m} - 1 = (a^m - 1)(a^m + 1)$. If $n$ is prime, it now follows from Euclid's Lemma that $n$ divides $a^m - 1$ or $a^m + 1$. Thus if $a^m \not\equiv \pm 1 \pmod n$ then we can conclude that $n$ is not prime. On the other hand, if $a^m \equiv 1 \pmod n$ and $m$ is even, then we can apply the same reasoning within the factorization $a^m - 1 = (a^{m/2} - 1)(a^{m/2} + 1)$.

Example 4.3. Show how to deduce that 341 is not prime without using its prime factorization.

Solution. One easily computes that $2^{340} \equiv 1 \pmod{341}$, so 341 divides
\[ 2^{340} - 1 = (2^{170} - 1)(2^{170} + 1) = (2^{85} - 1)(2^{85} + 1)(2^{170} + 1). \]
We further compute that $2^{170} \equiv 1 \pmod{341}$ and $2^{85} \equiv 32 \pmod{341}$. But if 341 were prime then it would have to divide $2^{85} - 1$, $2^{85} + 1$, or $2^{170} + 1$ by Euclid's Lemma, which would mean that $2^{85} \equiv \pm 1 \pmod{341}$ or $2^{170} \equiv -1 \pmod{341}$. Since none of these conclusions holds, we may conclude that 341 is not prime. □

In general, if $n$ is an odd integer exceeding 1, we can write $n = 2^s t + 1$, where $t$ is odd and $s \ge 1$. Then one has the factorization
\[ a^{n-1} - 1 = (a^t - 1)(a^t + 1)(a^{2t} + 1)(a^{4t} + 1) \cdots (a^{2^{s-1}t} + 1). \tag{4.1} \]
In Example 4.3 we had $s = 2$ and $t = 85$. An odd composite integer $n = 2^s t + 1$ is called a strong pseudoprime for the base $a$ if $(a, n) = 1$ and $n$ divides one of the factors on the right-hand side of (4.1).
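Checking whether $n$ divides one of the factors in (4.1) amounts to computing $a^t$ modulo $n$ and then squaring at most $s - 1$ times. The Python sketch below follows that recipe; the function name `divides_factor_in_4_1` is our own, and the code assumes that $n$ is odd, $n > 2$, and $(a, n) = 1$.

```python
def divides_factor_in_4_1(n, a):
    """Return True if odd n > 2 divides one of the factors on the right of (4.1).

    Writes n - 1 = 2^s * t with t odd, then checks a^t ≡ ±1 (mod n) and
    a^(2^i * t) ≡ -1 (mod n) for i = 1, ..., s - 1.
    """
    s, t = 0, n - 1
    while t % 2 == 0:
        s += 1
        t //= 2
    x = pow(a, t, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = (x * x) % n
        if x == n - 1:
            return True
    return False


if __name__ == "__main__":
    print(divides_factor_in_4_1(341, 2))    # Example 4.3: 341 divides none of the factors
    print(divides_factor_in_4_1(2047, 2))   # 2047 = 23 * 89 divides 2^1023 - 1
    print(divides_factor_in_4_1(13, 2))     # every odd prime divides some factor
```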
Any integer (prime or composite) with this property is said to have passed the strong pseudoprime test. Strong pseudoprimes are considerably more scarce than ordinary pseudoprimes. For the base $a = 2$, for example, there are 5597 pseudoprimes up to $10^9$ but only 1282 strong pseudoprimes, the smallest of which is 2047.

Example 4.4. Show that 2047 is a strong pseudoprime for the base 2.

Solution. We observe that 2046 is not divisible by 4, so we have $s = 1$ and $t = 1023$ in the above notation. Moreover, Mathematica shows that $2^{1023} \equiv 1 \pmod{2047}$, so 2047 divides $2^{1023} - 1$ and hence passes the strong pseudoprime test. Finally, we note that $2047 = 23 \cdot 89$ is composite and is therefore in fact a strong pseudoprime. □

The results are more striking if we apply the strong pseudoprime test for several different bases. There is only one integer less than $2.5 \times 10^{10}$, namely
\[ 3{,}215{,}031{,}751 = 151 \times 751 \times 28351, \]
that is a strong pseudoprime for bases 2, 3, 5, and 7. Moreover, there is no "strong" analogue of the Carmichael numbers. That is, every composite number $n$ fails the strong pseudoprime test for some base $a$ with $(a, n) = 1$. Such an $a$ is called a witness to the compositeness of $n$. In fact, it can be shown that at least half of the bases $a \le n$ with $(a, n) = 1$ are witnesses when $n$ is composite, so this procedure can be used to identify primes with near certainty.

In 2004, Agrawal, Kayal, and Saxena developed an algorithm that proves the primality or compositeness of $n$ with a running time that is polynomial in $\log n$. For comparison, trial division requires $O(\sqrt{n})$ steps to prove that an integer is prime. Many software packages have built-in functions that implement various primality tests. In Mathematica, for example, PrimeQ[n] returns true or false according to whether or not $n$ is prime, while Prime[k] returns the $k$th prime number.

Factorization algorithms. Attacks on RSA could be made if efficient factoring algorithms were known.
As with primality testing, there are algorithms that are far more efficient than trial division, but no current algorithm comes close to breaking RSA with 200-digit primes, except in very special cases that can easily be avoided. We briefly explore some of the ideas involved in these factorization techniques.

Fermat's factoring method was to try to express $n$ as the difference of two squares. If we can find positive integers $x$ and $y$ such that $n = x^2 - y^2 = (x - y)(x + y)$, then we've found a factorization of $n$, provided that $x - y \ne 1$. Kraitchik realized that one could apply the spirit of Fermat's idea more efficiently by instead looking for integers $x$ and $y$ satisfying the weaker condition $x^2 \equiv y^2 \pmod{n}$, so that $n$ divides $(x - y)(x + y)$. This no longer ensures a factorization of $n$, but there is a reasonable chance that both $x - y$ and $x + y$ contain some of the prime factors of $n$. For example, if $n$ is the product of two distinct primes $p$ and $q$, one would expect roughly a 50% chance that $p$ and $q$ split among the two factors $x - y$ and $x + y$. In this case $\gcd(x - y, n)$ will be a non-trivial factor of $n$, and this can be computed efficiently via the Euclidean algorithm. If both $p$ and $q$ divide the same factor, then one can simply try different values for $x$ and $y$. Powerful recent factoring methods like the quadratic sieve are based on finding suitable integers $x$ and $y$ to carry out this principle.

Pollard's so-called "rho" method is based on generating a quasi-random sequence of numbers that are distinct modulo the integer $n$ to be factored but not distinct modulo its smallest prime divisor $p$. Suppose we generate "random" integers $x_1, \ldots, x_k$, where $k$ is large by comparison with $\sqrt{p}$ but small by comparison with $\sqrt{n}$. For example, we could take $k \approx 10 n^{1/4}$. Then the probability that the $x_i$ are distinct modulo $p$ is very small, so $\gcd(x_i - x_j, n)$ will most likely produce the factor $p$ for some $i$ and $j$.
This leads to a factorization of $n$ with expected running time $O(n^{1/4})$. When the method works, the numbers $x_k\ \mathrm{MOD}\ p$ are eventually periodic and can thus be written in a shape resembling the Greek letter $\rho$.

Example 4.5. Use Pollard's rho method to factor the integer $n = 36287$.

Solution. First note that $2^{36286} \equiv 35799 \not\equiv 1 \pmod{36287}$, so Fermat's Little Theorem implies that $n$ is composite. We construct our quasi-random sequence of integers recursively by taking $x_0 = 1$ and $x_{k+1} = (x_k^2 + 1)\ \mathrm{MOD}\ n$. The first few terms of the sequence are
$$1,\ 2,\ 5,\ 26,\ 677,\ 22886,\ 2439,\ 33941,\ 24380,\ 3341,\ 22173,\ 25654,\ 26685.$$
In particular, one has $x_5 = 22886$ and $x_{12} = 26685$, which gives $\gcd(x_{12} - x_5, n) = 131$. Thus we obtain the factorization $36287 = 131 \cdot 277$. $\square$

Suppose that $n$ is a large composite integer with no small prime factors but that all the prime factors of $p - 1$ are small for some prime $p \mid n$. For example, suppose that $p - 1$ divides $10000!$. Then by Fermat's Little Theorem one has $2^{10000!} \equiv 1 \pmod{p}$, and thus $p$ divides $\gcd(2^{10000!} - 1, n)$. Thus we can attempt to find $p$ by computing $\gcd(2^{k!} - 1, n)$ for various values of $k$. This is known as Pollard's $p - 1$ method, and it can be applied with bases other than 2 as well.

Example 4.6. Use Pollard's $p - 1$ method to factor the integer $n = 69841$.

Solution. We have $2^{n-1} \equiv 37073 \not\equiv 1 \pmod{n}$, so $n$ is composite. With $k = 5$ in the above notation, we obtain $\gcd(2^{120} - 1, 69841) = 331$, which gives us a nontrivial divisor of $n$. It is now easily checked that $n = 211 \cdot 331$ is the desired prime factorization. Note that the method was effective here because $p - 1 = 330 = 2 \cdot 3 \cdot 5 \cdot 11$ divides $11!$. One would typically expect to test up to $k = 11$ before finding $p$, but we happened to find it sooner. $\square$

One important consequence of the Pollard $p - 1$ method is that RSA can be broken if the primes $p$ and $q$ are chosen in such a way that $p - 1$ or $q - 1$ has only small prime factors.
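Both of Pollard's methods fit in a few lines of Python. The sketch below is illustrative rather than definitive. The function names, the step limits `max_steps` and `max_k`, and the use of Floyd's cycle-finding trick (comparing $x_{2i}$ with $x_i$ instead of storing the whole sequence) are assumptions made here and are not taken from the notes.

```python
from math import gcd


def pollard_rho(n, x0=1, max_steps=10_000):
    """Search for a nontrivial factor of n using x -> (x^2 + 1) MOD n.

    Floyd's cycle-finding is assumed here: 'slow' holds x_i and 'fast' holds x_(2i).
    Returns a factor, or None if this starting value fails.
    """
    slow = fast = x0
    for _ in range(max_steps):
        slow = (slow * slow + 1) % n
        fast = (fast * fast + 1) % n
        fast = (fast * fast + 1) % n
        d = gcd(abs(fast - slow), n)
        if d == n:          # the sequence cycled modulo n itself; give up
            return None
        if d > 1:
            return d
    return None


def pollard_p_minus_1(n, base=2, max_k=50):
    """Return (k, d) for the first k with 1 < d = gcd(base^(k!) - 1, n) < n, else None."""
    x = base
    for k in range(1, max_k + 1):
        x = pow(x, k, n)    # x is now base^(k!) MOD n
        d = gcd(x - 1, n)
        if 1 < d < n:
            return k, d
    return None


print(pollard_rho(36287))        # Example 4.5
print(pollard_p_minus_1(69841))  # Example 4.6
```

For $n = 36287$ the Floyd variant detects the factor by comparing $x_{14}$ with $x_7$ rather than $x_{12}$ with $x_5$; both pairs of indices differ by 7, the period of the sequence modulo 131.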
When constructing an RSA public key, one must therefore take care to avoid primes $p$ for which $p - 1$ has only small prime factors. We also mention that neither of Pollard's algorithms will prove primality if they are applied to prime integers. Hence they should only be used on integers that are known to be composite, for example by failing a pseudoprime test. The Mathematica function FactorInteger[n] implements a variety of advanced algorithms to attempt to determine the prime factorization of $n$, but it typically becomes extremely slow when the smallest prime factor of $n$ is large.

5. Primitive roots

When $(a, m) = 1$, we know from Euler's Theorem that $a^{\varphi(m)} \equiv 1 \pmod{m}$. However, there may be smaller powers of $a$ that are congruent to 1 modulo $m$. We define the order of $a$ modulo $m$ (or the order of $a$ in $\mathbb{Z}_m^*$) to be the smallest positive integer $h$ such that $a^h \equiv 1 \pmod{m}$. For example, the elements 3, 5, and 7 all have order 2 in $\mathbb{Z}_8^*$. The elements 2 and 3 have orders 3 and 6, respectively, in $\mathbb{Z}_7^*$. In view of Euler's Theorem, the order of an element in $\mathbb{Z}_m^*$ is at most $\varphi(m)$.

Note that if $a^k \equiv 1 \pmod{m}$ for some positive integer $k$, then we can write $a^{k-1} a + mn = 1$ for some $n \in \mathbb{Z}$, so Corollary 2.5 implies that $(a, m) = 1$. Thus if $(a, m) > 1$ then there is no positive power of $a$ that is 1 modulo $m$. Hence order is not defined for the elements of $\mathbb{Z}_m$ that are not relatively prime to $m$.

Theorem 5.1. If $a$ has order $h$ in $\mathbb{Z}_m^*$ and $k$ is a positive integer with $a^k \equiv 1 \pmod{m}$, then $h$ divides $k$.

Proof. We use division with remainder (Theorem 2.3) to write $k = hq + r$, where $q$ and $r$ are integers with $0 \le r < h$. Then we have
$$1 \equiv a^k \equiv a^{hq + r} \equiv (a^h)^q \cdot a^r \equiv a^r \pmod{m},$$
so the minimality of $h$ implies that $r = 0$, and hence $k = hq$, as required. $\square$

Corollary 5.2. The order of every element of $\mathbb{Z}_m^*$ divides $\varphi(m)$.

Proof. In view of Euler's Theorem, this follows from Theorem 5.1 with $k = \varphi(m)$.
$\square$

For example, it is easy to check directly that each element of $\mathbb{Z}_7^*$ has order 1, 2, 3, or 6 and that each element of $\mathbb{Z}_8^*$ has order 1 or 2. If the order of $a$ modulo $m$ happens to be $\varphi(m)$ then we say that $a$ is a primitive root modulo $m$.

Example 5.3. Determine the primitive roots modulo 7 and modulo 8.

Solution. The elements 3 and 5 are primitive roots modulo 7 because they both have order $6 = \varphi(7)$. It is easily checked that all other elements of $\mathbb{Z}_7^*$ have order less than 6, so there are no other primitive roots. Finally, there are no primitive roots modulo 8 because there are no elements of order $4 = \varphi(8)$. $\square$

A primitive root is sometimes called a generator because computing successive powers of it generates the whole of $\mathbb{Z}_m^*$. For example, 3 is a generator for $\mathbb{Z}_7^*$ because $3^1 = 3$, $3^2 = 2$, $3^3 = 6$, $3^4 = 4$, $3^5 = 5$, and $3^6 = 1$. For this reason, we sometimes use the letter $g$ to denote a primitive root. In algebraic terms, the existence of a primitive root modulo $m$ means that $\mathbb{Z}_m^*$ is a cyclic group under multiplication. Example 5.3 shows that $\mathbb{Z}_7^*$ is cyclic but that $\mathbb{Z}_8^*$ is not. The following theorem shows that primitive roots are generators.

Theorem 5.4. If $g$ is a primitive root modulo $m$ and $(a, m) = 1$, then we have $a \equiv g^k \pmod{m}$ for some integer $k$ with $1 \le k \le \varphi(m)$.

Proof. Consider the $\varphi(m)$ integers $g, g^2, g^3, \ldots, g^{\varphi(m)}$. If $g^i \equiv g^j \pmod{m}$ for some $i$ and $j$ with $1 \le i < j \le \varphi(m)$, then we would have $g^{j-i} \equiv 1 \pmod{m}$, which is impossible since $g$ has order $\varphi(m)$ and $0 < j - i < \varphi(m)$. Therefore, the integers $g, g^2, g^3, \ldots, g^{\varphi(m)}$ all lie in distinct residue classes modulo $m$. Since each $g^i$ is also relatively prime to $m$, we deduce that the set $\{g, g^2, g^3, \ldots, g^{\varphi(m)}\}$ forms a reduced residue system modulo $m$. Hence there is some exponent $k$ for which $a \equiv g^k \pmod{m}$. $\square$

Theorem 5.5. If $a$ has order $h$ modulo $m$, then $a^k$ has order $h/(h, k)$ modulo $m$.

Proof. Let $j$ denote the order of $a^k$ modulo $m$.
First of all, we have
$$(a^k)^{h/(h,k)} \equiv (a^h)^{k/(h,k)} \equiv 1 \pmod{m},$$
so Theorem 5.1 implies that $j$ divides $h/(h, k)$. Moreover, we have $a^{kj} \equiv (a^k)^j \equiv 1 \pmod{m}$, so Theorem 5.1 further implies that $h$ divides $kj$, and hence that $h/(h, k)$ divides $kj/(h, k)$. Since $h/(h, k)$ and $k/(h, k)$ are relatively prime, it follows from a homework exercise that $h/(h, k)$ divides $j$. Since $j$ divides $h/(h, k)$ and $h/(h, k)$ divides $j$, and both quantities are positive, we may conclude that $j = h/(h, k)$, as desired. $\square$

Corollary 5.6. If $\mathbb{Z}_m^*$ contains a primitive root, then the total number of primitive roots in $\mathbb{Z}_m^*$ is $\varphi(\varphi(m))$. In other words, if $\mathbb{Z}_m^*$ is cyclic, then it has $\varphi(\varphi(m))$ generators.

Proof. If $g$ is a primitive root modulo $m$, then Theorem 5.5 shows that $g^k$ is a primitive root if and only if $(\varphi(m), k) = 1$. Hence there are $\varphi(\varphi(m))$ choices for $k$. By Theorem 5.4, all elements of $\mathbb{Z}_m^*$ can be expressed as $g^k$ for some $k$, so this completes the proof. $\square$

The following theorem, due to Gauss, completely characterizes the integers $m$ for which $\mathbb{Z}_m^*$ has a primitive root.

Theorem 5.7. There exists a primitive root modulo $m$ if and only if $m = 1, 2, 4, p^k$, or $2p^k$, where $p$ is an odd prime and $k$ is a positive integer.

Example 5.8. What can you say about the existence of primitive roots modulo $m$ when $9 \le m \le 20$? How many primitive roots are there modulo 18 and 19?

Solution. In view of Theorem 5.7, there are primitive roots modulo 9, 10, 11, 13, 14, 17, 18, and 19, while there are no primitive roots modulo 12, 15, 16, or 20. By Corollary 5.6, the number of primitive roots modulo 18 is $\varphi(\varphi(18)) = \varphi(6) = 2$, and the number of primitive roots modulo 19 is $\varphi(\varphi(19)) = \varphi(18) = 6$. $\square$

The full proof of Theorem 5.7 is somewhat time-consuming, although it is accessible with elementary techniques. Rather than giving the complete argument, which involves a number of separate cases, we will be content to prove the existence of primitive roots for prime moduli.
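For small moduli, the statements of Example 5.3, Corollary 5.6, and Example 5.8 can be confirmed by brute force. The following Python sketch does this by computing orders directly from the definition. The helper names `order`, `phi`, and `primitive_roots` are illustrative, and the approach is only sensible for small $m \ge 2$.

```python
from math import gcd


def order(a, m):
    """Smallest positive h with a^h ≡ 1 (mod m); assumes m >= 2 and gcd(a, m) == 1."""
    if m < 2 or gcd(a, m) != 1:
        raise ValueError("order is defined only for units modulo m >= 2")
    h, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        h += 1
    return h


def phi(m):
    """Euler phi function by direct counting (fine for small m)."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def primitive_roots(m):
    """All primitive roots modulo m among 1, ..., m - 1, found by exhaustive search."""
    target = phi(m)
    return [a for a in range(1, m) if gcd(a, m) == 1 and order(a, m) == target]


print(primitive_roots(7), primitive_roots(8))
print([m for m in range(9, 21) if primitive_roots(m)])
print(len(primitive_roots(18)), len(primitive_roots(19)))
```

The exhaustive search takes on the order of $m \cdot \varphi(m)$ multiplications, so it is a check on the theory for small cases rather than a practical way to find primitive roots of large moduli.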
Before doing this, we need some auxiliary results. The following theorem, due to Lagrange, concerns solutions of polynomial congruences modulo a prime.

Theorem 5.9. Let $f(x)$ be a polynomial of degree $n$ with integer coefficients, and let $p$ be a prime not dividing the leading coefficient of $f(x)$. Then the congruence $f(x) \equiv 0 \pmod{p}$ has at most $n$ distinct solutions modulo $p$.

Proof. We proceed by induction on $n$. When $n = 0$, the polynomial $f(x)$ is a constant not divisible by $p$, so the congruence has no solutions. Now suppose that $n > 0$ and that the result holds for all polynomials of degree less than $n$. Let $f(x)$ be a polynomial of degree $n$, with $p$ not dividing the leading coefficient, and suppose that $f(a) \equiv 0 \pmod{p}$. Using division with remainder for polynomials, we can write
$$f(x) = q(x)(x - a) + r,$$
where $q(x)$ is a polynomial of degree $n - 1$, and where $r$ is an integer. (Since $x - a$ has degree one, the remainder has degree zero.) Moreover, $p$ does not divide the leading coefficient of $q(x)$, since this is the same as the leading coefficient of $f(x)$. We have $r = f(a) \equiv 0 \pmod{p}$, which means that $p \mid r$, and thus for any integer $x$ we have
$$f(x) \equiv q(x)(x - a) \pmod{p}.$$
Now if $f(b) \equiv 0 \pmod{p}$, then $p$ divides $q(b)(b - a)$, so Euclid's Lemma implies that $p$ divides $q(b)$ or $p$ divides $b - a$. In the first case, we have $q(b) \equiv 0 \pmod{p}$, so the induction hypothesis ensures that there are at most $n - 1$ choices for $b$. In the second case, we have $b \equiv a \pmod{p}$, which gives one additional possibility. Thus $f(x) \equiv 0 \pmod{p}$ has at most $n$ solutions in total. $\square$

Note that the theorem fails for composite moduli. For example, the congruence $x^2 - 1 \equiv 0 \pmod{8}$ has four solutions, $x = 1, 3, 5, 7$, but the polynomial $f(x) = x^2 - 1$ has degree two.

The next lemma establishes an interesting relationship between an integer and the Euler phi function of its divisors.
To illustrate, notice that the positive divisors of 12 are 1, 2, 3, 4, 6, and 12 and that
$$\varphi(1) + \varphi(2) + \varphi(3) + \varphi(4) + \varphi(6) + \varphi(12) = 1 + 1 + 2 + 2 + 2 + 4 = 12.$$
The positive divisors of 17 are 1 and 17, and we have $\varphi(1) + \varphi(17) = 1 + 16 = 17$. The positive divisors of 20 are 1, 2, 4, 5, 10, and 20, and we have
$$\varphi(1) + \varphi(2) + \varphi(4) + \varphi(5) + \varphi(10) + \varphi(20) = 1 + 1 + 2 + 4 + 4 + 8 = 20.$$
We now show that this phenomenon occurs in general.

Lemma 5.10. Let $n$ be a positive integer, and let $d_1, d_2, \ldots, d_t$ denote the positive divisors of $n$. Then
$$\varphi(d_1) + \varphi(d_2) + \cdots + \varphi(d_t) = \sum_{d \mid n} \varphi(d) = n.$$

Proof. Let $S(d)$ denote the number of integers $k$ with $1 \le k \le n$ and $(k, n) = d$. Since $S(d) = 0$ unless $d$ is a divisor of $n$, we can write
$$S(d_1) + S(d_2) + \cdots + S(d_t) = \sum_{d \mid n} S(d) = n.$$
Consider an integer $k$ counted by $S(d)$. Then $(k, n) = d$, so in particular we have $d \mid k$ and $d \mid n$, and furthermore $(k/d, n/d) = 1$. Thus we can write $k = dj$ for a unique integer $j$ with $1 \le j \le n/d$ and $(j, n/d) = 1$. Hence the number of choices for $j$ is $\varphi(n/d)$. Since this also gives the number of possibilities for $k$, we deduce that $S(d) = \varphi(n/d)$. Notice that the numbers $n/d_1, n/d_2, \ldots, n/d_t$ are just the divisors $d_1, d_2, \ldots, d_t$, listed in a different order. Thus we have
$$\sum_{d \mid n} \varphi(d) = \sum_{d \mid n} \varphi(n/d) = \sum_{d \mid n} S(d) = n,$$
as desired. $\square$

We can now demonstrate the existence of primitive roots modulo a prime $p$. The following theorem actually makes the stronger assertion that $\mathbb{Z}_p^*$ contains elements of all orders dividing $p - 1$ (including $p - 1$ itself). Note that Corollary 5.2 implies that no other orders are permissible.

Theorem 5.11. If $p$ is prime and $d$ is a positive integer dividing $p - 1$, then there are exactly $\varphi(d)$ elements of order $d$ in $\mathbb{Z}_p^*$.

Proof. Let $d$ be a divisor of $p - 1$, and let $\psi(d)$ be the number of elements of order $d$ in $\mathbb{Z}_p^*$. If $\psi(d) > 0$, then there exists some $a \in \mathbb{Z}_p^*$ of order $d$. The integers $a, a^2, a^3, \ldots$
, $a^d$ are distinct modulo $p$, since otherwise we would have $a^{j-i} \equiv 1 \pmod{p}$, where $0 < j - i < d$, violating the definition of order. Moreover, for each $k$ with $1 \le k \le d$ we have $(a^k)^d \equiv (a^d)^k \equiv 1 \pmod{p}$, so each $a^k$ is a solution of the congruence $x^d - 1 \equiv 0 \pmod{p}$, and Theorem 5.9 implies that these are the only solutions. Furthermore, every element of order $d$ satisfies the congruence and must therefore be a power of $a$. We know from Theorem 5.5 that $a^k$ has order $d$ if and only if $(k, d) = 1$, so there are exactly $\varphi(d)$ elements of order $d$. Thus we've shown that either $\psi(d) = 0$ or $\psi(d) = \varphi(d)$ whenever $d \mid (p - 1)$. Since there are $p - 1$ elements in $\mathbb{Z}_p^*$, we deduce from Lemma 5.10 that
$$\sum_{d \mid (p-1)} \psi(d) = p - 1 = \sum_{d \mid (p-1)} \varphi(d).$$
Since $\psi(d) \le \varphi(d)$ for each $d$, we must actually have $\psi(d) = \varphi(d)$ for each $d$, and this completes the proof. $\square$

Corollary 5.12. There are exactly $\varphi(p - 1)$ primitive roots in $\mathbb{Z}_p^*$.

Proof. This follows immediately by taking $d = p - 1$ in Theorem 5.11. $\square$

The Lucas primality test. Suppose the integer $n$ has passed a strong pseudoprime test and is therefore suspected to be prime. It turns out that we can then use primitive roots to try to prove that $n$ is prime. Suppose that $n$ passes the ordinary pseudoprime test for the base $a$, so that $a^{n-1} \equiv 1 \pmod{n}$, and further that we are able to factor $n - 1$, say $n - 1 = p_1^{e_1} \cdots p_r^{e_r}$. Theorem 5.1 implies that the order of $a$ modulo $n$ divides $n - 1$, so if we can show that $a^{(n-1)/p_i} \not\equiv 1 \pmod{n}$ for each $i$ then we may conclude that the order of $a$ is actually $n - 1$. On the other hand, we know from Euler's Theorem that the order of $a$ cannot exceed $\varphi(n)$, so we have $n - 1 \le \varphi(n)$. But it follows easily from Corollary 3.22 that this can only happen if $n$ is prime, in which case $\varphi(n) = n - 1$.

Example 5.13. Use the Lucas test to prove that $n = 631$ is prime.

Solution. We have $3^{630} \equiv 1 \pmod{631}$, so the order of 3 modulo 631 must divide 630.
Moreover we have $630 = 2 \cdot 3^2 \cdot 5 \cdot 7$ and
$$3^{(n-1)/2} = 3^{315} \equiv -1 \pmod{631}, \qquad 3^{(n-1)/3} = 3^{210} \equiv -44 \pmod{631},$$
$$3^{(n-1)/5} = 3^{126} \equiv 242 \pmod{631}, \qquad 3^{(n-1)/7} = 3^{90} \equiv 269 \pmod{631},$$
which shows that the order of 3 modulo 631 is actually equal to 630. We may therefore conclude that 631 is prime and that 3 is a primitive root modulo 631. $\square$

If we find an element $a$ of order $n - 1$ in $\mathbb{Z}_n^*$, then the above argument shows that $n$ is prime and hence that $a$ is a primitive root modulo $n$. So the success of the test depends in part on being able to find primitive roots quickly. However, Corollary 5.12 implies that there are $\varphi(n - 1)$ primitive roots in $\mathbb{Z}_n^*$ when $n$ is prime, and a bit of elementary analytic number theory shows that $\varphi(n) \approx \frac{6}{\pi^2} n$ on average. Heuristically, then, the proportion of numbers $a \le n$ that are primitive roots modulo $n$ averages about $6/\pi^2 \approx 0.608$ for large prime values of $n$. Thus we have a good chance of finding a suitable $a$ fairly quickly if $n$ is in fact prime.

A more serious issue is that it may not be easy to factor $n - 1$. If we're lucky, it will have several relatively small prime factors, but there might be a large factor remaining whose primality needs to be established. In this case, we can iterate the Lucas test until our numbers $n - 1$ are small enough to be factored by trial division.

The Diffie-Hellman key exchange. The first secure method for public-key cryptography was actually developed about two years before the RSA breakthrough. One of the fundamental problems with classical cryptography is the difficulty of agreeing on the key for a particular cipher without having this information intercepted. Diffie and Hellman resolved this by generating a large prime $p$ and then choosing a primitive root $g$ modulo $p$. Note that Corollary 5.12 ensures that there are $\varphi(p - 1)$ possible choices for $g$. The pair $(p, g)$ is public information.
Now if Alice wants to communicate with Bob, she chooses a number $a$ at random between 2 and $p - 2$ and sends him $\alpha = g^a\ \mathrm{MOD}\ p$. Bob then chooses a number $b$ at random between 2 and $p - 2$ and sends Alice $\beta = g^b\ \mathrm{MOD}\ p$. Since $g$ is a primitive root modulo $p$, we know that neither $\alpha$ nor $\beta$ will equal 1. We observe that
$$K = g^{ab}\ \mathrm{MOD}\ p = \beta^a\ \mathrm{MOD}\ p = \alpha^b\ \mathrm{MOD}\ p$$
can now be calculated by both Alice and Bob and can be used as the key for whatever cryptosystem they employ.

Example 5.14. Using the public prime $p = 197$ and public base $g = 31$, show how Alice and Bob can agree on a common key for secure communication.

Solution. Suppose that Alice randomly chooses $a = 72$. Then she sends $\alpha = 31^{72}\ \mathrm{MOD}\ 197 = 76$ to Bob. If Bob randomly chooses $b = 109$, then he sends $\beta = 31^{109}\ \mathrm{MOD}\ 197 = 147$ to Alice. At this point, Alice computes
$$\beta^{72}\ \mathrm{MOD}\ 197 = 147^{72}\ \mathrm{MOD}\ 197 = 28$$
and Bob computes
$$\alpha^{109}\ \mathrm{MOD}\ 197 = 76^{109}\ \mathrm{MOD}\ 197 = 28,$$
so they've agreed on the key $K = 28$. $\square$

The natural way for Eve (who is eavesdropping) to obtain the key $K$ would be to solve the two congruences
$$g^a \equiv \alpha \pmod{p} \quad \text{and} \quad g^b \equiv \beta \pmod{p} \tag{5.1}$$
for $a$ and $b$. Solving either one of these is known as the discrete log problem, and there is no known efficient algorithm for handling it. It is believed that no such algorithm exists, but this has not been proven. What Eve really needs is an efficient algorithm for finding $g^{ab}\ \mathrm{MOD}\ p$ from $g^a$ and $g^b$, which is known as the Diffie-Hellman problem. Its solution would obviously follow from a solution to the discrete log problem, but it's not known whether the two problems are equivalent.

In view of Theorem 5.4, the fact that $g$ is a primitive root modulo $p$ ensures that the congruences (5.1) have unique solutions $a$ and $b$ between 1 and $p - 1$ for every choice of $\alpha$ and $\beta$ between 1 and $p - 1$. The uniqueness of the solutions obviously makes it less likely that Eve will stumble upon one quickly by trial and error.

The ElGamal Cryptosystem.
One disadvantage of the Diffie-Hellman method is that Alice has to wait for a response from Bob before she can calculate the key and initiate secure communication. However, ElGamal showed that the protocol can be adapted to create a self-contained public-key cryptosystem. In addition to the public prime $p$ and base $g$, suppose that Alice and Bob publish their numbers $\alpha$ and $\beta$ in a directory. If Alice wants to send a message $x$ to Bob, she generates a random session key $k$ between 2 and $p - 2$ and sends Bob
$$t = g^k\ \mathrm{MOD}\ p \quad \text{and} \quad y = \beta^k x\ \mathrm{MOD}\ p.$$
He then recovers the message by computing
$$y (t^b)^{-1}\ \mathrm{MOD}\ p = (x g^{bk}) \cdot (g^{kb})^{-1} = x,$$
provided that $x \le p - 1$. Longer messages can of course be broken into blocks prior to encryption.

Example 5.15. Suppose that Alice and Bob use ElGamal with public prime $p = 11881379$ and base $g = 23$, and that Alice has published $\alpha = 10442571$. How can Bob discreetly ask Alice to tea?

Solution. Bob first needs to pick a random session key, say $k = 101$. He then converts TEA to digital form, say $x = 200501$, and calculates
$$t = 23^{101}\ \mathrm{MOD}\ p = 3054634 \quad \text{and} \quad y = 200501 \cdot 10442571^{101}\ \mathrm{MOD}\ p = 3497868.$$
Bob now sends the pair $(3054634, 3497868)$ to Alice, and she recovers the message using her private key, $a = 8137$:
$$3497868 \cdot (3054634^{8137})^{-1}\ \mathrm{MOD}\ p = 3497868 \cdot 7225717\ \mathrm{MOD}\ p = 200501.$$
$\square$

6. Quadratic reciprocity

Quadratic Residues. Having studied linear congruences in §3, it is natural to ask about solving quadratic congruences. Let $p$ be a prime and let $a \in \mathbb{Z}_p^*$. We say that $a$ is a quadratic residue modulo $p$ if there exists $x$ such that $x^2 \equiv a \pmod{p}$. If no such $x$ exists, then $a$ is called a quadratic non-residue modulo $p$. We sometimes denote the sets of quadratic residues and non-residues in $\mathbb{Z}_p^*$ by $Q$ and $N$, respectively.

Example 6.1. Identify the quadratic residues and non-residues in $\mathbb{Z}_5^*$, $\mathbb{Z}_7^*$, and $\mathbb{Z}_{11}^*$.

Solution. In $\mathbb{Z}_5^*$, we have $Q = \{1, 4\}$ and $N = \{2, 3\}$. In $\mathbb{Z}_7^*$, we have $Q = \{1, 2, 4\}$ and $N = \{3, 5, 6\}$.
In $\mathbb{Z}_{11}^*$, we have $Q = \{1, 3, 4, 5, 9\}$ and $N = \{2, 6, 7, 8, 10\}$. $\square$

The next theorem shows that there are always equal numbers of quadratic residues and non-residues modulo an odd prime.

Theorem 6.2. If $p$ is an odd prime, then $|Q| = |N| = \frac{1}{2}(p - 1)$.

Proof. If $x^2 \equiv y^2 \pmod{p}$, then $p$ divides $x^2 - y^2 = (x - y)(x + y)$, so Euclid's Lemma implies that $p$ divides $x - y$ or $x + y$, and thus $x \equiv \pm y \pmod{p}$. Thus every quadratic residue in $\mathbb{Z}_p^*$ has exactly two distinct square roots, which implies that the set $Q = \{x^2 : x \in \mathbb{Z}_p^*\}$ contains exactly half the elements of $\mathbb{Z}_p^*$. $\square$

How do we determine whether a particular element of $\mathbb{Z}_p^*$ is a quadratic residue? One answer is given by the following theorem.

Theorem 6.3. (Euler's Criterion) Let $p$ be an odd prime, and let $a \in \mathbb{Z}_p^*$. Then $a$ is a quadratic residue modulo $p$ if and only if $a^{(p-1)/2} \equiv 1 \pmod{p}$.

Proof. By Theorem 5.11, we know that there exists a primitive root $g$ modulo $p$. If $a$ is a quadratic residue modulo $p$, then there exists $x \in \mathbb{Z}_p^*$ with $x^2 \equiv a \pmod{p}$. By Theorem 5.4, we have $x \equiv g^k \pmod{p}$ for some integer $k$, and thus $a \equiv g^{2k} \pmod{p}$. It follows that
$$a^{(p-1)/2} \equiv (g^{2k})^{(p-1)/2} \equiv (g^{p-1})^k \equiv 1 \pmod{p}.$$
Conversely, if $a$ is a quadratic non-residue modulo $p$, then $a$ cannot be an even power of $g$, so Theorem 5.4 implies that $a \equiv g^{2k+1} \pmod{p}$ for some integer $k$. Thus we have
$$a^{(p-1)/2} \equiv (g^{2k+1})^{(p-1)/2} \equiv (g^{p-1})^k g^{(p-1)/2} \equiv g^{(p-1)/2} \not\equiv 1 \pmod{p},$$
since $g$ has order $p - 1$. In fact, we can deduce that $a^{(p-1)/2} \equiv -1 \pmod{p}$, since $a^{(p-1)/2}$ is a solution of the congruence $x^2 \equiv 1 \pmod{p}$. $\square$

As a result, the congruence $x^2 \equiv a \pmod{p}$ has two solutions if $a^{(p-1)/2} \equiv 1 \pmod{p}$ and no solutions if $a^{(p-1)/2} \equiv -1 \pmod{p}$.

Example 6.4. Decide whether the congruence $x^2 \equiv 6 \pmod{37}$ has solutions.

Solution. We have $6^{18} \equiv (6^2)^9 \equiv (-1)^9 \equiv -1 \pmod{37}$, so Euler's Criterion implies that the congruence has no solution.
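Euler's Criterion turns the question of whether $a$ is a quadratic residue into a single modular exponentiation. The following Python sketch compares the criterion with a direct listing of squares for the primes of Example 6.1 and repeats the computation of Example 6.4. The function names are illustrative, and $p$ is assumed to be an odd prime not dividing $a$.

```python
def is_quadratic_residue(a, p):
    """Euler's Criterion: a is a residue mod the odd prime p iff a^((p-1)/2) ≡ 1 (mod p).

    Assumes p is an odd prime and p does not divide a.
    """
    return pow(a, (p - 1) // 2, p) == 1


def residues_by_squaring(p):
    """The set Q of quadratic residues in Z_p^*, listed directly as squares."""
    return sorted({(x * x) % p for x in range(1, p)})


for p in (5, 7, 11):
    by_euler = [a for a in range(1, p) if is_quadratic_residue(a, p)]
    print(p, by_euler, by_euler == residues_by_squaring(p))

print(pow(6, 18, 37), is_quadratic_residue(6, 37))   # Example 6.4
```

Python's built-in three-argument `pow` performs the exponentiation modulo $p$, so a result of $p - 1$ corresponds to $-1$ in the notation of the text.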
$\square$

When $a$ is an integer and $p$ is an odd prime, we define the Legendre symbol $\left(\frac{a}{p}\right)$ by
$$\left(\frac{a}{p}\right) = \begin{cases} 0 & \text{if } p \mid a, \\ 1 & \text{if } a \in Q, \\ -1 & \text{if } a \in N. \end{cases}$$
Note that this definition only depends on the residue class of $a$ modulo $p$, so replacing $a$ by $a + kp$ does not change the value. For example, we have $\left(\frac{2}{7}\right) = 1$, $\left(\frac{3}{7}\right) = -1$, and $\left(\frac{7}{7}\right) = 0$. The Legendre symbol (sometimes read as "$a$ on $p$") has the following useful properties:

Theorem 6.5. Let $a$ and $b$ be integers, and let $p$ be an odd prime. Then
(i) $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$
(ii) $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$
(iii) $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$
(iv) $\left(\frac{a^2}{p}\right) = 1$ if $a$ is not divisible by $p$.

Proof. Fermat's Little Theorem gives $a^{p-1} \equiv 1 \pmod{p}$ when $(a, p) = 1$, so property (i) follows immediately from Euler's Criterion and Euclid's Lemma. Properties (ii), (iii), and (iv) follow easily from (i). $\square$

Note that property (ii) implies that $-1$ is a quadratic residue mod $p$ if and only if $p \equiv 1 \pmod{4}$. For example, the congruence $x^2 \equiv -1$ is solvable modulo 73 but not modulo 71.

The following criterion is a key ingredient in proving the law of quadratic reciprocity, which provides an efficient method for computing the Legendre symbol.

Theorem 6.6. (Gauss' Criterion) Let $p$ be an odd prime, and let $a$ be a positive integer not divisible by $p$. For $1 \le j \le \frac{1}{2}(p - 1)$, let $r_j = a(2j - 1)\ \mathrm{MOD}\ p$, and let $t$ be the number of $r_j$ that are even. Then we have
$$\left(\frac{a}{p}\right) = (-1)^t.$$

Example 6.7. Use Gauss' criterion to calculate $\left(\frac{2}{7}\right)$, $\left(\frac{2}{11}\right)$, $\left(\frac{2}{13}\right)$, and $\left(\frac{2}{17}\right)$.

Solution. For $p = 7$, we have $r_1 = 2 \cdot 1 = 2$, $r_2 = 2 \cdot 3 = 6$, and $r_3 = 2 \cdot 5 = 3$. Hence the number of even residues is $t = 2$, and Gauss' Criterion gives $\left(\frac{2}{7}\right) = (-1)^2 = 1$. Similarly, for $p = 11$ we have $r_1 = 2$, $r_2 = 6$, $r_3 = 10$, $r_4 = 3$, and $r_5 = 7$, which yields $t = 3$ and $\left(\frac{2}{11}\right) = (-1)^3 = -1$. For $p = 13$, we get $r_1 = 2$, $r_2 = 6$, $r_3 = 10$, $r_4 = 1$, $r_5 = 5$, and $r_6 = 9$, so we again have $t = 3$ and $\left(\frac{2}{13}\right) = -1$.
Finally, for $p = 17$ we have $r_1 = 2$, $r_2 = 6$, $r_3 = 10$, $r_4 = 14$, $r_5 = 1$, $r_6 = 5$, $r_7 = 9$, and $r_8 = 13$, which gives $t = 4$ and $\left(\frac{2}{17}\right) = 1$. $\square$

The result of Example 6.7 may be generalized as follows.

Corollary 6.8. If $p$ is an odd prime, then $\left(\frac{2}{p}\right) = (-1)^{(p^2 - 1)/8}$.

Proof. It is an easy exercise to check that $(p^2 - 1)/8$ is even if $p \equiv \pm 1 \pmod{8}$ and odd if $p \equiv \pm 3 \pmod{8}$. The proof therefore splits into four cases. First of all, suppose that $p \equiv 1 \pmod{8}$ so that $p = 8k + 1$ for some positive integer $k$. The numbers
$$2 \cdot 1,\ 2 \cdot 3,\ 2 \cdot 5,\ \ldots,\ 2(4k - 1)$$
are all less than $p$ (since $8k - 2 < 8k + 1$), so their residues are all clearly even. On the other hand, the numbers
$$2(4k + 1),\ 2(4k + 3),\ 2(4k + 5),\ \ldots,\ 2(8k - 1)$$
all lie between $p$ and $2p$, so their residues are $1, 5, 9, \ldots, 8k - 3$, which are all odd. The number of even residues in Gauss' criterion is therefore $t = 2k$, since $2j - 1$ ranges from 1 to $4k - 1$ as $j$ ranges from 1 to $2k$, and thus $(2/p) = (-1)^{2k} = 1$.

Next suppose that $p \equiv 3 \pmod{8}$, so that $p = 8k + 3$. Then the numbers
$$2 \cdot 1,\ 2 \cdot 3,\ 2 \cdot 5,\ \ldots,\ 2(4k + 1)$$
are all less than $p$ (since $8k + 2 < 8k + 3$), so their residues are even. The numbers
$$2(4k + 3),\ 2(4k + 5),\ 2(4k + 7),\ \ldots,\ 2(8k + 1)$$
all lie between $p$ and $2p$, so their residues are $3, 7, 11, \ldots, 8k - 1$, which are all odd. We therefore have $t = 2k + 1$ and hence $(2/p) = (-1)^{2k+1} = -1$. The remaining two cases are left as exercises. $\square$

Proof of Gauss' criterion: Write $s = \frac{1}{2}(p - 1)$. We re-index the residues so that $r_1, r_2, \ldots, r_t$ are even and $r_{t+1}, r_{t+2}, \ldots, r_s$ are odd. Let $b_1, b_2, \ldots, b_s$ be the positive odd integers less than $p$, re-ordered so that $r_j = a b_j\ \mathrm{MOD}\ p$. The numbers
$$p - r_1,\ p - r_2,\ \ldots,\ p - r_t,\ r_{t+1},\ r_{t+2},\ \ldots,\ r_s$$
are positive odd integers less than $p$; we claim that they are distinct and hence a re-ordering of $b_1, \ldots, b_s$.
To show this, we consider three cases:

(i) If $r_i = r_j$, where $t + 1 \le i, j \le s$, then $a b_i \equiv a b_j \pmod{p}$, so Lemma 3.3 gives $b_i \equiv b_j \pmod{p}$. But $b_1, \ldots, b_s$ are distinct positive integers less than $p$, so we deduce that $i = j$.

(ii) If $p - r_i = p - r_j$, where $1 \le i, j \le t$, then $r_i = r_j$, so the above argument gives $i = j$.

(iii) If $p - r_i = r_j$, where $1 \le i \le t$ and $t + 1 \le j \le s$, then $r_i + r_j \equiv 0 \pmod{p}$, so $a(b_i + b_j) \equiv 0 \pmod{p}$, and thus $b_i + b_j \equiv 0 \pmod{p}$. Since $0 < b_i + b_j < 2p$, it follows that $b_i + b_j = p$, which is impossible since $b_i + b_j$ is even.

We therefore have
$$b_1 \cdots b_s \equiv (p - r_1) \cdots (p - r_t)\, r_{t+1} \cdots r_s \equiv (-1)^t r_1 \cdots r_s \equiv (-1)^t a^s (b_1 \cdots b_s) \pmod{p}.$$
Since $p$ does not divide $b_1 \cdots b_s$, we deduce that $a^s \equiv (-1)^t \pmod{p}$, and the result now follows from part (i) of Theorem 6.5. $\square$

We are now ready to state the main theorem of this section, which is one of the most important and beautiful results in elementary number theory.

Theorem 6.9. (Quadratic Reciprocity) If $p$ and $q$ are distinct odd primes, then
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{(p-1)(q-1)}{4}} = \begin{cases} 1 & \text{if } p \equiv 1 \pmod{4} \text{ or } q \equiv 1 \pmod{4}, \\ -1 & \text{if } p \equiv q \equiv 3 \pmod{4}. \end{cases}$$

The proof of Theorem 6.9 uses Gauss' criterion but requires a somewhat technical argument to count the even residues $q(2j - 1)\ \mathrm{MOD}\ p$ and $p(2j - 1)\ \mathrm{MOD}\ q$. There are actually many ways of proving quadratic reciprocity; over 200 different proofs have appeared in print since Gauss' original work in the early 1800s. Before launching into a proof, we illustrate with some typical applications.

Example 6.10. Use quadratic reciprocity to calculate $\left(\frac{11}{31}\right)$.

Solution. Since 11 and 31 are both primes congruent to 3 mod 4, quadratic reciprocity gives $\left(\frac{11}{31}\right) = -\left(\frac{31}{11}\right)$. Now since $31 \equiv 9 \pmod{11}$, we have $\left(\frac{31}{11}\right) = \left(\frac{9}{11}\right) = \left(\frac{3^2}{11}\right) = 1$. We therefore conclude that $\left(\frac{11}{31}\right) = -1$ and hence that 11 is a quadratic non-residue modulo 31. $\square$

Example 6.11. Use quadratic reciprocity to calculate $\left(\frac{42}{61}\right)$.

Solution.
We first apply Theorem 6.5 (iii) to write
$$\left(\frac{42}{61}\right) = \left(\frac{2}{61}\right)\left(\frac{3}{61}\right)\left(\frac{7}{61}\right).$$
Since $61 \equiv 5 \pmod{8}$, Corollary 6.8 gives $\left(\frac{2}{61}\right) = -1$. Next, since $61 \equiv 1 \pmod{4}$, we may apply quadratic reciprocity to obtain
$$\left(\frac{3}{61}\right) = \left(\frac{61}{3}\right) = \left(\frac{1}{3}\right) = 1 \quad \text{and} \quad \left(\frac{7}{61}\right) = \left(\frac{61}{7}\right) = \left(\frac{5}{7}\right) = -1,$$
by the result of Example 6.1. Thus we conclude that $\left(\frac{42}{61}\right) = (-1) \cdot (1) \cdot (-1) = 1$, and hence that 42 is a quadratic residue modulo 61. $\square$

Quadratic reciprocity can be used to determine a general criterion for 3 to be a quadratic residue modulo a prime $p > 3$. The result is somewhat reminiscent of the analogous criterion for $(2/p)$ given in Corollary 6.8, except that here the conclusion depends on the residue class of $p$ modulo 12 rather than modulo 8.

Corollary 6.12. One has
$$\left(\frac{3}{p}\right) = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod{12}, \\ -1 & \text{if } p \equiv \pm 5 \pmod{12}. \end{cases}$$

Proof. If $p \equiv 1 \pmod{12}$, then $p \equiv 1 \pmod{3}$ and $p \equiv 1 \pmod{4}$, so quadratic reciprocity gives
$$\left(\frac{3}{p}\right) = \left(\frac{p}{3}\right) = \left(\frac{1}{3}\right) = 1.$$
Similarly, if $p \equiv -1 \pmod{12}$, then $p \equiv 2 \pmod{3}$ and $p \equiv 3 \pmod{4}$, so quadratic reciprocity yields
$$\left(\frac{3}{p}\right) = -\left(\frac{p}{3}\right) = -\left(\frac{2}{3}\right) = -(-1) = 1.$$
We leave the remaining two cases as exercises. $\square$

A proof of quadratic reciprocity. We now describe an argument that leads from Gauss' criterion to the conclusion of Theorem 6.9. Let $p$ and $q$ be distinct odd primes, and define
$$r_j = q(2j - 1)\ \mathrm{MOD}\ p \quad \left(1 \le j \le \tfrac{p-1}{2}\right) \quad \text{and} \quad s_j = p(2j - 1)\ \mathrm{MOD}\ q \quad \left(1 \le j \le \tfrac{q-1}{2}\right).$$
By Theorem 6.6, we have $\left(\frac{q}{p}\right) = (-1)^t$, where $t$ is the number of even $r_j$, and $\left(\frac{p}{q}\right) = (-1)^u$, where $u$ is the number of even $s_j$. It follows that
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{t+u}. \tag{6.1}$$
It therefore suffices to show that $t + u$ is odd if and only if $p \equiv q \equiv 3 \pmod{4}$.

We now let $S$ denote the set of all integers of the form $x = qa - pb$, where $a$ and $b$ are odd integers with $1 \le a < p$ and $1 \le b < q$. For example, if $p = 7$ and $q = 11$, each element of $S$ has the form $x = 11a - 7b$ where $a \in \{1, 3, 5\}$ and $b \in \{1, 3, 5, 7, 9\}$.
Taking $a = 1$ gives $x = 4, -10, -24, -38, -52$, while $a = 3$ gives $x = 26, 12, -2, -16, -30$, and finally $a = 5$ gives $x = 48, 34, 20, 6, -8$, for a total of 15 elements.

Lemma 6.13. The elements of $S$ are nonzero even integers, and one has $|S| = \frac{1}{4}(p - 1)(q - 1)$.

Proof. Suppose that $x = qa - pb \in S$. Then $qa$ and $pb$ are odd, so $x$ is clearly even. Moreover, if $qa = pb$, then $p \mid qa$, and hence $p \mid a$, which is impossible since $1 \le a < p$. Finally, if $qa - pb = qa' - pb'$, then $q(a - a') = p(b - b')$, which implies that $p \mid (a - a')$ and hence that $a = a'$ and $b = b'$, since $-p < a - a' < p$. Hence these expressions are all distinct, and $|S|$ is just the number of ordered pairs $(a, b)$. $\square$

Next, we let $T = \{x \in S : -q < x < p\}$. For example, when $p = 7$ and $q = 11$, we have $T = \{-10, -8, -2, 4, 6\}$.

Lemma 6.14. One has $|T| = t + u$.

Proof. First suppose that $x \in T$ and that $0 < x < p$. Then $x \equiv qa \pmod{p}$ for some odd integer $a$ with $1 \le a < p$, and we can write $a = 2j - 1$ with $1 \le j \le \frac{1}{2}(p - 1)$. But since $x < p$, we must actually have $x = r_j$, and Lemma 6.13 shows that this is one of the even residues counted by $t$. On the other hand, if $r_j = q(2j - 1)\ \mathrm{MOD}\ p$ is even, then $0 < r_j < p$, and $r_j \equiv qa \pmod{p}$, where $a = 2j - 1$ is odd and $1 \le a < p$. It then follows that $qa - r_j = pb$ for some $b \in \mathbb{Z}$, and clearly $b$ must be odd and positive. Moreover, $pb < qa < qp$ and hence $b < q$. This shows that $r_j \in T$. We may therefore conclude that the elements $x \in T$ with $0 < x < p$ are precisely the even residues $r_j$ counted by $t$. A similar argument shows that the elements $x \in T$ with $-q < x < 0$ are precisely the negatives of the even residues $s_j$ counted by $u$, and the lemma follows immediately. $\square$

To determine whether $t + u$ is even, we attempt to pair up the elements of $T$ via the correspondence
$$qa - pb \longmapsto q(p - 1 - a) - p(q - 1 - b).$$
(6.2)

For example, when $p = 7$ and $q = 11$, we have $11r - 7s \mapsto 11(6 - r) - 7(10 - s)$, which gives the pairs $(4, -8)$, $(-10, 6)$, and $(-2, -2)$. On the other hand, if $p = 5$ and $q = 7$, then $S = \{-18, -8, -4, 2, 6, 16\}$ and $T = \{-4, 2\}$, so the correspondence $7r - 5s \mapsto 7(4 - r) - 5(6 - s)$ yields the obvious pair $(2, -4)$. We now aim to show that this correspondence gives the desired parity result for $|T|$.

Lemma 6.15. The pairs arising from the correspondence (6.2) consist of distinct elements unless $p \equiv q \equiv 3 \pmod 4$, in which case a single element is paired with itself.

Proof. We first note that if $qr - ps \in T$ then one has
$$-q = -q + p - p < -q + p - (qr - ps) < -q + p + q = p,$$
which shows that $q(p - 1 - r) - p(q - 1 - s) = -q + p - (qr - ps) \in T$. Moreover, Lemma 6.13 shows that the expressions $qr - ps$ are distinct, so if an element is paired with itself in (6.2), we must have $r = p - 1 - r$ and $s = q - 1 - s$, which gives $r = \frac{1}{2}(p - 1)$ and $s = \frac{1}{2}(q - 1)$. But these values are both odd if and only if $p \equiv q \equiv 3 \pmod 4$, in which case the corresponding element $\frac{1}{2}(p - q)$ does lie in $T$, and this completes the proof. $\square$

The proof of quadratic reciprocity is now within our grasp. By Lemmas 6.14 and 6.15, we see that $|T| = t + u$ is odd if and only if $p \equiv q \equiv 3 \pmod 4$, so the result follows from (6.1).

The Jacobi symbol. There is a generalization of the Legendre symbol, called the Jacobi symbol, that is defined whenever the bottom entry is odd. If $n = p_1 \cdots p_k$, where the $p_i$ are (not necessarily distinct) odd primes, then we define
$$\left(\frac{a}{n}\right) = \left(\frac{a}{p_1}\right) \cdots \left(\frac{a}{p_k}\right),$$
where the factors on the right are Legendre symbols. It turns out that the Jacobi symbol enjoys many of the same properties as the Legendre symbol, including the law of quadratic reciprocity.

Theorem 6.16. The results of Theorem 6.5 (ii), (iii), Corollary 6.8, and Theorem 6.9 hold with the Legendre symbol replaced by the Jacobi symbol and the odd primes $p$ and $q$ replaced by odd positive integers.

Proof.
It suffices to write out the prime factorizations of the odd integers in question and apply the definition of the Jacobi symbol in combination with the corresponding properties of the Legendre symbol. We leave the details as an exercise. $\square$

Note that part (iv) of Theorem 6.5 does not quite hold for the Jacobi symbol. The correct analogue is that $(a^2/n) = 1$ if $(a, n) = 1$. Theorem 6.16 often allows us to perform computations with Legendre symbols more efficiently than was previously possible. For instance, in Example 6.11, we could apply quadratic reciprocity for Jacobi symbols to obtain
$$\left(\frac{21}{61}\right) = \left(\frac{61}{21}\right) = \left(\frac{19}{21}\right) = \left(\frac{21}{19}\right) = \left(\frac{2}{19}\right) = -1$$
rather than dealing with $(3/61)$ and $(7/61)$ separately. Unfortunately, a value of 1 for the Jacobi symbol $(a/n)$ does not guarantee that $a$ is a square mod $n$. For example, $(2/9) = (2/3)(2/3) = 1$, but 2 is not a square modulo 9.

7. Some diophantine equations

A diophantine equation usually refers to a polynomial equation with integer coefficients to which we seek integer solutions. As a simple example, consider the equation
$$9x + 6y = 20.$$
This is a linear diophantine equation in two variables. A moment's thought reveals that this equation has no integer solutions, since $9x + 6y$ is divisible by 3 for any integers $x$ and $y$ while 20 is not divisible by 3. From another point of view, notice that solving the above equation is equivalent to solving the congruence $9x \equiv 20 \pmod 6$, and we know from Theorem 3.8 that this has no solution since $(9, 6) = 3$ does not divide 20. On the other hand, the equation $2x + 3y = 7$ has infinitely many integer solutions, given by $x = -1 + 3k$ and $y = 3 - 2k$ for any $k \in \mathbb{Z}$. The following theorem characterizes the solutions of the linear diophantine equation $ax + by = c$.

Theorem 7.1. Let $a$, $b$, and $c$ be integers, and write $d = (a, b)$. The equation $ax + by = c$ has integer solutions if and only if $d \mid c$. Moreover, if $(x_0, y_0)$ is any particular solution, then the set of solutions is given by $x = x_0 + bk/d$,
$y = y_0 - ak/d$ $(k \in \mathbb{Z})$.

Proof. The equation $ax + by = c$ is equivalent to the congruence $ax \equiv c \pmod b$, and Theorem 3.8 shows that this is solvable if and only if $(a, b)$ divides $c$. If $x_0$ is any solution of the congruence, then we have $ax_0 = c - by_0$ for some integer $y_0$, so $(x_0, y_0)$ solves the equation. Moreover, any solution $(x, y)$ satisfies the congruences
$$\frac{a}{d}\,x \equiv \frac{c}{d} \pmod{\tfrac{b}{d}} \qquad\text{and}\qquad \frac{b}{d}\,y \equiv \frac{c}{d} \pmod{\tfrac{a}{d}},$$
which have unique solutions modulo $b/d$ and $a/d$, respectively. Therefore we have $x = x_0 + bk/d$ and $y = y_0 + al/d$ for some integers $k$ and $l$. Substituting into the equation $ax + by = c$, we find that $(x, y)$ is a solution if and only if $l = -k$. $\square$

Example 7.2. Describe all integer solutions of the diophantine equations $35x + 49y = 64$ and $35x + 49y = 63$.

Solution. Here $(35, 49) = 7$, which divides 63 but not 64. In view of Theorem 7.1, the equation $35x + 49y = 64$ has no integer solutions, but the equation $35x + 49y = 63$ has solutions $x = -1 + 7k$ and $y = 2 - 5k$ for every $k \in \mathbb{Z}$. $\square$

Notice that the solubility of our linear diophantine equation was closely connected to the solubility of the underlying congruences. This is a fairly general principle that is useful to keep in mind when studying higher degree equations.

Example 7.3. Determine all integer solutions of the diophantine equation $x^2 + y^2 = 1999$.

Solution. Notice that 0 and 1 are the only perfect squares modulo 4, and no two of these add up to 3, which is congruent to 1999 modulo 4. We therefore conclude that the equation has no integer solutions. $\square$

Example 7.4. Determine all integer solutions of the equation $x^2 + 7y^2 + 35z^2 = 70493$.

Solution. If $(x, y, z)$ were a solution, then $x$ would satisfy the congruence $x^2 \equiv 70493 \equiv 3 \pmod 7$. But 3 is a quadratic non-residue modulo 7, so we conclude that there are no integer solutions. $\square$

Pythagorean triples. A famous quadratic diophantine equation in three variables is the Pythagorean equation
$$x^2 + y^2 = z^2.$$
(7.1)

Notice that this equation has many "trivial" solutions, $(0, y, \pm y)$ and $(x, 0, \pm x)$, obtained by setting one of the variables on the left-hand side equal to zero. These solutions are not very interesting. Of course, there are some well-known right triangles with integer side lengths, which give non-trivial solutions such as $(3, 4, 5)$ and $(5, 12, 13)$. A solution to (7.1) is sometimes called a Pythagorean triple. The equation (7.1) also has a special property called homogeneity, which means that if $(x, y, z)$ is a solution, then so is $(kx, ky, kz)$ for any integer $k$. For this reason, we usually restrict attention to the so-called primitive solutions, in which $x$, $y$, and $z$ have no non-trivial common factors. It turns out that we can express all primitive solutions of this equation as a two-parameter family. It is easy to show that in any primitive Pythagorean triple we must have $z$ odd and exactly one of $x$ and $y$ even. By interchanging $x$ and $y$ if necessary, we may suppose without loss of generality that $x$ is even.

Theorem 7.5. If $(x, y, z)$ is a primitive Pythagorean triple, where $x$ is even and $x$, $y$, and $z$ are positive, then
$$x = 2st, \qquad y = s^2 - t^2, \qquad\text{and}\qquad z = s^2 + t^2,$$
for some relatively prime positive integers $s$ and $t$. Conversely, if $s$ and $t$ are relatively prime, $s > t > 0$, and $s$ or $t$ is even, then $(2st, s^2 - t^2, s^2 + t^2)$ is a primitive Pythagorean triple.

Proof. Let $(x, y, z)$ be a positive primitive Pythagorean triple with $x$ even and $y$ and $z$ odd. Then we have $x^2 = z^2 - y^2 = (z + y)(z - y)$, and both $z + y$ and $z - y$ are even, so we can write
$$\left(\frac{x}{2}\right)^2 = \left(\frac{z + y}{2}\right)\left(\frac{z - y}{2}\right),$$
where all three factors are integers. Any common divisor of $(z + y)/2$ and $(z - y)/2$ would have to divide their sum and difference, $z$ and $y$, but we know that $z$ and $y$ are relatively prime and hence so are $(z + y)/2$ and $(z - y)/2$.
It follows easily that both $(z + y)/2$ and $(z - y)/2$ must be perfect squares, say
$$\frac{z + y}{2} = s^2 \qquad\text{and}\qquad \frac{z - y}{2} = t^2.$$
The equations $x = 2st$, $y = s^2 - t^2$, and $z = s^2 + t^2$ follow immediately. Conversely, it is easy to check that $(2st)^2 + (s^2 - t^2)^2 = (s^2 + t^2)^2$. Moreover, any odd prime dividing both $2st$ and $s^2 - t^2$ would have to divide either $s$ or $t$ and either $s + t$ or $s - t$, and in all of these cases the prime would divide both $s$ and $t$. Finally, 2 is not a common factor, since $s^2 - t^2$ is odd when exactly one of $s$ and $t$ is even. Thus if $(s, t) = 1$ then the above triple is primitive. $\square$

Example 7.6. Find all positive primitive Pythagorean triples with one of the variables equal to 15.

Solution. Since 15 is odd and is not the sum of two squares, Theorem 7.5 implies that $y = s^2 - t^2$ is the only variable that could take the value 15. So we seek positive integers $s > t$ such that $s^2 - t^2 = (s + t)(s - t) = 15$. Clearly, the only possibilities are
$$s + t = 15,\ s - t = 1 \qquad\text{and}\qquad s + t = 5,\ s - t = 3,$$
which yield $s = 8$, $t = 7$ and $s = 4$, $t = 1$. Hence the only Pythagorean triples of this type are $(112, 15, 113)$ and $(8, 15, 17)$. $\square$

Theorem 7.7. The equation $x^4 + y^4 = z^2$ has no integer solutions with $xyz \neq 0$.

Proof. If $(x, y, z)$ is a solution with $\gcd(x, y) = d$, then $z^2$ is divisible by $d^4$ and hence $z$ is divisible by $d^2$, so we obtain a new solution $(x/d, y/d, z/d^2)$ with the first two variables relatively prime. Therefore we may suppose that $(x, y, z)$ is a solution with $x$, $y$, and $z$ positive, $\gcd(x, y) = 1$ and $z$ as small as possible. We will show how to construct a solution with a smaller value of $z$, thereby producing a contradiction. Since $(x^2, y^2, z)$ is a positive primitive Pythagorean triple, we may apply Theorem 7.5 (after possibly interchanging $x$ and $y$) to write
$$x^2 = 2st, \qquad y^2 = s^2 - t^2, \qquad\text{and}\qquad z = s^2 + t^2,$$
where $s > t > 0$ and $\gcd(s, t) = 1$.
Since $y$ is odd, it follows that $s$ is odd and $t$ is even (otherwise we would have $y^2 = s^2 - t^2 \equiv -1 \pmod 4$), so we in fact have $\gcd(s, 2t) = 1$, and thus $s$ and $2t$ are both perfect squares, say $s = u^2$ and $2t = v^2$. Furthermore, we have $t^2 + y^2 = s^2$, so $(t, y, s)$ is another primitive Pythagorean triple, and we can apply Theorem 7.5 again to write
$$t = 2ab, \qquad y = a^2 - b^2, \qquad\text{and}\qquad s = a^2 + b^2,$$
where $a > b > 0$ and $\gcd(a, b) = 1$. We now have $ab = t/2 = (v/2)^2$, which implies that $a$ and $b$ are both perfect squares, say $a = X^2$ and $b = Y^2$. But now $X^4 + Y^4 = a^2 + b^2 = s = u^2$ and $u^2 = s < (s^2 + t^2)^2 = z^2$, so $u < z$, and taking $Z = u$ gives a new solution $(X, Y, Z)$ with $Z < z$. $\square$

Corollary 7.8. The equation $x^4 + y^4 = z^4$ has no integer solutions with $xyz \neq 0$.

Proof. If $(x, y, z)$ were a solution with $xyz \neq 0$, then we would have $x^4 + y^4 = (z^2)^2$, contradicting Theorem 7.7. $\square$

Theorem 7.9. (Fermat's Last Theorem) If $n$ is an integer with $n \ge 3$, then the equation $x^n + y^n = z^n$ has no integer solutions with $xyz \neq 0$.

Note that this follows easily from Corollary 7.8 when $n$ is a multiple of 4. The proof for arbitrary $n$ is extremely hard and was completed by Wiles in 1995. The following symmetric generalization of Fermat's Last Theorem is still unsolved.

Conjecture 7.10. If $k$ is an integer with $k \ge 5$, then the equation $x^k + y^k = z^k + w^k$ has no non-trivial integer solutions.

Notice that there are non-trivial solutions to this equation when $k = 2$, 3, and 4. For instance, one has $1^2 + 7^2 = 5^2 + 5^2$, $1^3 + 12^3 = 9^3 + 10^3$, and $133^4 + 134^4 = 158^4 + 59^4$.

Equations in "many" variables. A general theme illustrated above is that diophantine equations in few variables (relative to the degree) tend to have few, if any, non-trivial solutions. Conversely, equations in sufficiently many variables (relative to the degree) tend to have many non-trivial solutions.
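The scarcity of solutions in few variables can be seen numerically for the equation $x^k + y^k = z^k + w^k$ of Conjecture 7.10. The sketch below is a brute-force search with hypothetical function names and an arbitrary search bound; it only examines positive integers up to that bound, so an empty result for $k = 5$ is merely consistent with the conjecture and proves nothing.

```python
from collections import defaultdict


def two_way_sums(k, bound):
    """Map n -> list of pairs (x, y), 1 <= x <= y <= bound, with x^k + y^k = n,
    keeping only those n that have at least two such representations."""
    reps = defaultdict(list)
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            reps[x ** k + y ** k].append((x, y))
    return {n: pairs for n, pairs in reps.items() if len(pairs) >= 2}


def smallest_two_way(k, bound):
    """Smallest n found with two representations, together with those pairs."""
    sums = two_way_sums(k, bound)
    n = min(sums)
    return n, sums[n]


print(smallest_two_way(2, 10))    # involves 1^2 + 7^2 and 5^2 + 5^2
print(smallest_two_way(3, 12))    # involves 1^3 + 12^3 and 9^3 + 10^3
print(smallest_two_way(4, 158))   # involves 59^4 + 158^4 and 133^4 + 134^4
print(len(two_way_sums(5, 60)))   # number of coincidences found for k = 5
```

Storing every sum in a dictionary keeps the search quadratic in the bound, which is ample for reproducing the three identities quoted above.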
One of the most interesting problems here is to try to quantify the phrase "sufficiently many." As an example, we look at the problem of representing integers as sums of $k$th powers.

Theorem 7.11. (Lagrange's four squares theorem) Every positive integer can be written as the sum of four squares.

For example, we have $31 = 5^2 + 2^2 + 1^2 + 1^2$ and $120 = 10^2 + 4^2 + 2^2 + 0^2$. We leave it as an exercise to show that there are infinitely many positive integers that cannot be represented as sums of three squares.

Lemma 7.12. If $m$ and $n$ are sums of four squares, then so is $mn$.

Proof. Suppose that $m = x^2 + y^2 + z^2 + w^2$ and $n = a^2 + b^2 + c^2 + d^2$. Then it is easy (but somewhat tedious) to verify that
$$mn = (xa + yb + zc + wd)^2 + (xb - ya + zd - wc)^2 + (xc - za + wb - yd)^2 + (xd - wa + yc - zb)^2.$$
We leave this algebra as an exercise. $\square$

Lemma 7.13. If $p$ is an odd prime, then there exist integers $x$, $y$, and $m$, with $0 < m < p$, such that $x^2 + y^2 + 1 = mp$.

Proof. It suffices to find integers $x$ and $y$ with
$$x^2 + y^2 + 1 \equiv 0 \pmod p \qquad\text{and}\qquad x^2 + y^2 + 1 < p^2. \qquad (7.2)$$
We divide the proof into two cases.

If $p \equiv 1 \pmod 4$, then Theorem 6.5 (ii) implies that $\left(\frac{-1}{p}\right) = 1$, so $-1$ is a quadratic residue modulo $p$. Therefore we can find $x$ with $0 < x < p/2$ such that $x^2 \equiv -1 \pmod p$, and (7.2) is satisfied with $y = 0$.

If $p \equiv 3 \pmod 4$, then Theorem 6.5 (ii) implies that $\left(\frac{-1}{p}\right) = -1$. Now let $a$ be the smallest quadratic non-residue modulo $p$. Then we have
$$\left(\frac{-a}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{a}{p}\right) = (-1)(-1) = 1$$
by Theorem 6.5 (iii), so $-a$ is a quadratic residue modulo $p$. Therefore we can find $x$ with $0 < x < p/2$ such that $x^2 \equiv -a \pmod p$. Furthermore, the minimality of $a$ ensures that $a - 1$ is a quadratic residue modulo $p$, so we can find $y$ with $0 < y < p/2$ such that $y^2 \equiv a - 1 \pmod p$. It is easy to check that $x$ and $y$ satisfy (7.2), so this completes the proof.
$\square$

Proof of Lagrange's Theorem. In view of Lemma 7.12 and the fact that $2 = 1^2 + 1^2 + 0^2 + 0^2$, it suffices to prove that every odd prime $p$ is the sum of four squares. By Lemma 7.13, we can find integers $x$, $y$, $z$, and $w$ such that
$$x^2 + y^2 + z^2 + w^2 = mp \qquad (7.3)$$
for some positive integer $m < p$. For instance, take $x$ and $y$ as in the lemma, $z = 1$, and $w = 0$. We employ a descent argument to show that we can find a solution to (7.3) with $m = 1$. To do this, we suppose that we have a solution with $m > 1$ and demonstrate how to construct a solution with a smaller value of $m$.

First of all, if $m$ is even, then an even number of the variables on the left-hand side are odd, so by relabeling if necessary we may suppose that $x \pm y$ and $z \pm w$ are even, and
$$\left(\frac{x + y}{2}\right)^2 + \left(\frac{x - y}{2}\right)^2 + \left(\frac{z + w}{2}\right)^2 + \left(\frac{z - w}{2}\right)^2 = (m/2)\,p.$$
If $m/2$ is even, then we can repeat the argument until we obtain a solution to (7.3) with $m$ odd, so we may suppose from now on that $m$ is odd.

Now let $a$, $b$, $c$, and $d$ denote the least absolute value residues of $x$, $y$, $z$, and $w$ modulo $m$. That is,
$$a \equiv x \pmod m, \quad b \equiv y \pmod m, \quad c \equiv z \pmod m, \quad d \equiv w \pmod m,$$
where $|a|, |b|, |c|, |d| < m/2$ since $m$ is odd. Then we have
$$a^2 + b^2 + c^2 + d^2 \equiv x^2 + y^2 + z^2 + w^2 \equiv 0 \pmod m,$$
so we can write $a^2 + b^2 + c^2 + d^2 = mn$ for some integer $n$, and $a^2 + b^2 + c^2 + d^2 < m^2$, so we have $n < m$. If $n = 0$ then we would have $a = b = c = d = 0$, which would imply that $mp = x^2 + y^2 + z^2 + w^2$ is divisible by $m^2$. This cannot occur when $1 < m < p$ since $p$ is prime, so we conclude that $n > 0$. Now by the proof of Lemma 7.12 we can write
$$(mp)(mn) = X^2 + Y^2 + Z^2 + W^2,$$
where
$$X = xa + yb + zc + wd, \quad Y = xb - ya + zd - wc, \quad Z = xc - za + wb - yd, \quad W = xd - wa + yc - zb,$$
and it is easy to check that $X$, $Y$, $Z$, and $W$ are each divisible by $m$. It follows that
$$(X/m)^2 + (Y/m)^2 + (Z/m)^2 + (W/m)^2 = np,$$
which gives a solution of (7.3) with $0 < n < m$.
This completes the descent and shows that there is in fact a solution with $m = 1$. $\square$

Waring's problem. One might ask whether similar results exist for higher powers. That is, given a positive integer $k$, can we find a positive integer $s$ such that all positive integers $n$ can be written in the form
$$n = x_1^k + x_2^k + \cdots + x_s^k \qquad (7.4)$$
for some non-negative integers $x_1, x_2, \ldots, x_s$? This question was posed by Waring in 1770 (around the same time as Lagrange's Theorem was proved) and has received considerable attention over the past century. The original version of the problem seeks to determine $g(k)$, which is defined to be the smallest integer $s$ such that the above equation can be solved for every positive integer $n$. For example, one has $g(2) = 4$. It is also known that $g(3) = 9$, $g(4) = 19$, and $g(5) = 37$ and that
$$g(k) \ge 2^k + \left\lfloor \left(\tfrac{3}{2}\right)^k \right\rfloor - 2 \qquad (7.5)$$
for all $k$, where $\lfloor x \rfloor$ denotes the greatest integer less than or equal to $x$.

Notice that the integer 23 really does require 9 cubes in order to achieve a representation. Since $23 < 3^3$ and $23 < 2^3 + 2^3 + 2^3$, the most efficient decomposition is
$$23 = 2^3 + 2^3 + 1^3 + 1^3 + 1^3 + 1^3 + 1^3 + 1^3 + 1^3.$$
Amazingly, it turns out that 23 and 239 are the only two integers that actually require 9 cubes. In fact, there are only finitely many integers that require 8 cubes, and it follows that every sufficiently large integer can be expressed as the sum of 7 cubes. In general, we define $G(k)$ to be the smallest integer $s$ such that every sufficiently large integer $n$ can be represented in the form (7.4). For example, it is known that $G(2) = 4$, $4 \le G(3) \le 7$, $G(4) = 16$, $6 \le G(5) \le 17$, and $9 \le G(6) \le 24$. It turns out that $G(k)$ grows much slower than $g(k)$ as $k \to \infty$, reflecting the fact that the representation of small integers poses some unusual difficulties that do not persist in the long run.
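The statements about small integers can be explored with a short dynamic-programming computation. The Python sketch below is an illustration with hypothetical function names and an arbitrary search limit of 1000: it finds, for each $n$ up to the limit, the least number of $k$th powers of non-negative integers summing to $n$, and evaluates the right-hand side of (7.5). A finite search of this kind cannot by itself establish the values of $g(k)$ or $G(k)$.

```python
def min_kth_powers(limit, k):
    """best[n] = least number of positive k-th powers summing to n (0 <= n <= limit).
    Zero terms in (7.4) contribute nothing, so this is the least s needed for n."""
    powers = []
    x = 1
    while x ** k <= limit:
        powers.append(x ** k)
        x += 1
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # Remove one k-th power p <= n and reuse the answer for n - p.
        best[n] = 1 + min(best[n - p] for p in powers if p <= n)
    return best


def waring_lower_bound(k):
    """Right-hand side of (7.5): 2^k + floor((3/2)^k) - 2, in exact integer arithmetic."""
    return 2 ** k + (3 ** k) // (2 ** k) - 2


cubes = min_kth_powers(1000, 3)
print([n for n in range(1, 1001) if cubes[n] == 9])   # integers up to 1000 needing 9 cubes
print(max(min_kth_powers(1000, 2)[1:]))               # most squares needed up to 1000
print([waring_lower_bound(k) for k in (2, 3, 4, 5)])
```

Within this range the computation agrees with the claims in the text: only 23 and 239 need nine cubes, and no integer needs more than four squares.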
In fact, it was shown by Wooley in 1992 that $G(k)$ grows no faster than $k \log k$ asymptotically, whereas (7.5) shows that the growth of $g(k)$ is exponential in $k$. In the 1920s, Hardy and Littlewood devised a method for counting the number of representations of $n$ in the form (7.4) by using a definite integral. Refinements of this strategy due to Vinogradov, Davenport, Vaughan, Wooley, and others have led to the sharpest available upper bounds for $G(k)$ when $k \ge 3$. Notice that even in the cubic case, the existing technology still leaves fairly large gaps between what is conjectured and what can be proved!

We give a very brief outline of the Hardy-Littlewood method. When $\alpha$ is a real number and $i = \sqrt{-1}$, write $e(\alpha) = e^{2\pi i \alpha} = \cos(2\pi\alpha) + i \sin(2\pi\alpha)$. If $h$ is an integer, then it is easy to verify the orthogonality relations
$$\int_0^1 e(\alpha h)\,d\alpha = \int_0^1 \cos(2\pi\alpha h)\,d\alpha + i \int_0^1 \sin(2\pi\alpha h)\,d\alpha = \begin{cases} 1 & \text{if } h = 0, \\ 0 & \text{if } h \neq 0. \end{cases}$$
If we let $N = \lfloor n^{1/k} \rfloor$ and introduce the exponential sum
$$f(\alpha) = \sum_{x=1}^{N} e(\alpha x^k),$$
then the fact that $e(a)e(b) = e(a + b)$ gives
$$\int_0^1 f(\alpha)^s e(-\alpha n)\,d\alpha = \sum_{x_1=1}^{N} \cdots \sum_{x_s=1}^{N} \int_0^1 e\big(\alpha(x_1^k + \cdots + x_s^k - n)\big)\,d\alpha,$$
and the orthogonality relations show that each term in the sum is 1 or 0 according to whether or not $x_1^k + \cdots + x_s^k = n$. The integral on the left therefore counts the representations of $n$ in this form (with each $x_i \ge 1$), and demonstrating the existence of representations amounts to showing that the integral is positive. This is a non-trivial task that involves dissecting the interval $[0, 1]$ into two subsets according to the nature of the rational approximations to $\alpha$ and applying several types of estimates for the exponential sum $f(\alpha)$.

Notice that as the real variable $\alpha$ runs from 0 to 1, the complex variable $z = e^{2\pi i \alpha}$ traces out the unit circle $|z| = 1$. The original set-up devised by Hardy and Littlewood actually takes the latter perspective, using integrals over circles in the complex plane.
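The counting identity can be tested numerically for small $n$. The sketch below is an illustration under stated assumptions, not part of the Hardy-Littlewood method itself: the function names are hypothetical, and the integral is replaced by an equally spaced Riemann sum with $M = sn + 1$ sample points. Because the integrand is a finite sum of terms $e(\alpha h)$ with integers $|h| < M$, that Riemann sum reproduces the integral exactly up to rounding error, by the discrete analogue of the orthogonality relations.

```python
import cmath
from itertools import product


def e(alpha):
    """e(alpha) = exp(2*pi*i*alpha)."""
    return cmath.exp(2j * cmath.pi * alpha)


def kth_root_floor(n, k):
    """N = floor(n^(1/k)), computed with exact integer arithmetic."""
    N = 0
    while (N + 1) ** k <= n:
        N += 1
    return N


def count_by_integral(n, k, s):
    """Evaluate the integral of f(alpha)^s e(-alpha n) over [0, 1] by a Riemann sum."""
    N = kth_root_floor(n, k)
    M = s * n + 1                      # more sample points than any frequency |h|
    total = 0
    for j in range(M):
        # alpha = j/M; exponents are reduced mod M before dividing, for accuracy.
        f = sum(e(((j * x ** k) % M) / M) for x in range(1, N + 1))
        total += f ** s * e(((-j * n) % M) / M)
    return round((total / M).real)


def count_directly(n, k, s):
    """Count ordered s-tuples of positive integers with x_1^k + ... + x_s^k = n."""
    N = kth_root_floor(n, k)
    return sum(1 for xs in product(range(1, N + 1), repeat=s)
               if sum(x ** k for x in xs) == n)


for n, k, s in [(50, 2, 2), (1729, 3, 2), (23, 3, 9)]:
    print(n, k, s, count_by_integral(n, k, s), count_directly(n, k, s))
```

Both columns count ordered representations with positive entries, so for example $50 = 1^2 + 7^2 = 7^2 + 1^2 = 5^2 + 5^2$ contributes three.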
For this reason, the technique is often referred to as the circle method, and the two subsets mentioned above are called major and minor arcs.

8. Irrationality and transcendence

We have already seen in §2 that irrational numbers exist; for instance, $\sqrt{2} \notin \mathbb{Q}$. In fact, almost all real numbers are irrational, since the rationals form a countable set while the reals are uncountable. On the other hand, given any two real numbers $\alpha < \beta$, we can find a rational number lying between them. To see this, let $q$ be an integer with $q > 1/(\beta - \alpha)$, so that $q\beta - q\alpha > 1$. Clearly there must be an integer $p$ between $q\alpha$ and $q\beta$, and it follows that the rational number $p/q$ lies between $\alpha$ and $\beta$. In particular, by choosing $\beta$ sufficiently close to $\alpha$, we can find a rational number that approximates $\alpha$ to any desired degree of accuracy. This property is often expressed by saying that the rationals are dense in the reals.

In number theory, we often desire more quantitative information about rational approximations. For instance, how does the quality of the approximation improve as we allow the denominator to increase? This is the type of information that determines how we dissect into major and minor arcs in the Hardy-Littlewood method. One simple answer is given by the following theorem.

Theorem 8.1. (Dirichlet's theorem on diophantine approximation) Given a real number $\alpha$ and an integer $Q \ge 2$, there exist integers $a$ and $q$ with $(a, q) = 1$ and $1 \le q \le Q - 1$ such that
$$\left| \alpha - \frac{a}{q} \right| \le \frac{1}{qQ}.$$

Proof. It suffices to prove the result for $\alpha \in [0, 1]$, since the general case can then be obtained by replacing $a/q$ by $\lfloor \alpha \rfloor + a/q$. We divide the interval $[0, 1]$ into $Q$ subintervals, each of length $1/Q$, and consider the values of $q\alpha - \lfloor q\alpha \rfloor$ as $q$ runs over the integers $1, 2, 3, \ldots, Q - 1$. First of all, if $q\alpha - \lfloor q\alpha \rfloor$ lies in the interval $[0, 1/Q]$ for some $q$, then taking $a = \lfloor q\alpha \rfloor$ gives $|q\alpha - a| \le 1/Q$.
Similarly, if $q\alpha - \lfloor q\alpha \rfloor$ lies in the interval $[1 - 1/Q, 1]$ for some $q$, then taking $a = \lfloor q\alpha \rfloor + 1$ gives $|q\alpha - a| \le 1/Q$. If none of these $Q - 1$ values lies in the first or last subinterval, then the pigeonhole principle ensures that two of them must lie in one of the remaining $Q - 2$ subintervals. That is, we have
$$\big| (q_2\alpha - \lfloor q_2\alpha \rfloor) - (q_1\alpha - \lfloor q_1\alpha \rfloor) \big| \le 1/Q$$
for some integers $q_1$ and $q_2$ with $1 \le q_1 < q_2 \le Q - 1$. Taking $q = q_2 - q_1$ and $a = \lfloor q_2\alpha \rfloor - \lfloor q_1\alpha \rfloor$ again gives $|q\alpha - a| \le 1/Q$. Finally, if $(a, q) = d$ then setting $a' = a/d$ and $q' = q/d$ gives $(a', q') = 1$ and $|q'\alpha - a'| \le 1/(dQ) \le 1/Q$, which completes the proof. $\square$

Corollary 8.2. If $\alpha$ is an irrational number, then there are infinitely many rational numbers $a/q$ for which
$$\left| \alpha - \frac{a}{q} \right| < \frac{1}{q^2}.$$

Proof. If there were only finitely many such rational approximations to $\alpha$, then we could find one, say $a/q$, with $\delta = |\alpha - a/q|$ minimal. Since $\alpha \notin \mathbb{Q}$, we have $\delta > 0$, so we may let $Q = \lfloor 1/\delta \rfloor + 1 > 1/\delta$. By Theorem 8.1, we can find a rational number $b/r$ with $1 \le r < Q$ and
$$\left| \alpha - \frac{b}{r} \right| \le \frac{1}{rQ} < \frac{1}{r^2}.$$
Since $1/(rQ) < \delta$, this contradicts the minimality of $\delta$. $\square$

Note that if $\alpha$ is rational then the inequality in Corollary 8.2 has only finitely many solutions. To see this, write $\alpha = b/r$ and note that if $a/q \neq b/r$ then we have
$$\left| \frac{b}{r} - \frac{a}{q} \right| = \frac{|bq - ar|}{rq} \ge \frac{1}{rq} \ge \frac{1}{q^2}$$
whenever $q \ge r$, so the only possible solutions come from $1 \le q < r$. A theorem of Hurwitz shows that there are in fact infinitely many solutions of
$$\left| \alpha - \frac{a}{q} \right| < \frac{1}{\sqrt{5}\, q^2}$$
when $\alpha$ is irrational. This turns out to be best possible in the sense that the result fails if the constant $1/\sqrt{5}$ is replaced by anything smaller. However, the golden ratio $\alpha = \frac{1}{2}(1 + \sqrt{5})$ provides essentially the only counterexample!

Continued fractions. One way of generating good rational approximations to an irrational number $\alpha$ is to construct the continued fraction expansion
$$\alpha = x_0 + \cfrac{1}{x_1 + \cfrac{1}{x_2 + \cfrac{1}{x_3 + \cfrac{1}{x_4 + \cdots}}}}.$$
To save space, the expansion $x_0 + \cfrac{1}{x_1 + \cfrac{1}{x_2 + \cdots}}$ is sometimes denoted by $\alpha = [x_0; x_1, x_2, x_3, x_4, \ldots]$. We can construct continued fractions for rational numbers as well, but in this case the expansion is finite.

Example 8.3. Express the rational number $\frac{125}{54}$ as a finite continued fraction.

Solution. We first split off the integer part by writing
$$\frac{125}{54} = 2 + \frac{17}{54}.$$
Next we take the reciprocal of the fractional part and repeat the process. We have
$$\frac{54}{17} = 3 + \frac{3}{17}, \qquad \frac{17}{3} = 5 + \frac{2}{3}, \qquad\text{and}\qquad \frac{3}{2} = 1 + \frac{1}{2}.$$
Thus we have
$$\frac{125}{54} = 2 + \cfrac{1}{3 + \cfrac{1}{5 + \cfrac{1}{1 + \cfrac{1}{2}}}} = [2; 3, 5, 1, 2]. \qquad \square$$

Example 8.4. What real number is represented by the continued fraction $[1; 1, 1, 1, 1, \ldots]$?

Solution. If $\alpha = [1; 1, 1, 1, 1, \ldots]$ then we have
$$\alpha = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}} = 1 + \frac{1}{\alpha}.$$
It follows that $\alpha^2 - \alpha - 1 = 0$, and since $\alpha$ is clearly positive we may conclude that
$$\alpha = \frac{1 + \sqrt{5}}{2}. \qquad \square$$

To generate the continued fraction for $\alpha$, we first take $x_0 = \lfloor \alpha \rfloor$ and then write
$$\alpha_1 = \frac{1}{\alpha - x_0} \qquad\text{and}\qquad x_1 = \lfloor \alpha_1 \rfloor.$$
In general, if $\alpha_n$ and $x_n$ have been defined, we take
$$\alpha_{n+1} = \frac{1}{\alpha_n - x_n} \qquad\text{and}\qquad x_{n+1} = \lfloor \alpha_{n+1} \rfloor.$$

Example 8.5. Compute the continued fraction for $\sqrt{2}$.

Solution. First of all, we have $x_0 = \lfloor \sqrt{2} \rfloor = 1$. Next, we have
$$\alpha_1 = \frac{1}{\sqrt{2} - 1} = \sqrt{2} + 1,$$
and hence $x_1 = 2$. Furthermore,
$$\alpha_2 = \frac{1}{\alpha_1 - 2} = \frac{1}{\sqrt{2} - 1} = \alpha_1,$$
and hence $x_2 = 2$. Since $\alpha_{n+1}$ depends only on $\alpha_n$ and $x_n$, we can conclude that $x_n = 2$ for all $n \ge 1$. Therefore, $\sqrt{2} = [1; 2, 2, 2, 2, \ldots] = [1; \overline{2}]$. $\square$

By truncating the continued fraction obtained above, we can obtain rational approximations to $\sqrt{2}$, for instance
$$\frac{p_1}{q_1} = 1 + \frac{1}{2} = \frac{3}{2}, \qquad \frac{p_2}{q_2} = 1 + \cfrac{1}{2 + \cfrac{1}{2}} = \frac{7}{5}, \qquad\text{and}\qquad \frac{p_3}{q_3} = 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2}}} = \frac{17}{12}.$$
The rational number $p_n/q_n$ is called the $n$th convergent to $\alpha$, and the integer $x_n$ is called the $n$th partial quotient of $\alpha$. It turns out that the convergents satisfy some simple recurrence relations, which make them easy to compute once the partial quotients are known.

Theorem 8.6.
If $\alpha$ has the continued fraction expansion $[x_0; x_1, x_2, x_3, \ldots]$, then the $n$th convergent to $\alpha$ is the rational number $p_n/q_n$ defined by the recurrence relations
$$p_n = x_n p_{n-1} + p_{n-2} \qquad\text{and}\qquad q_n = x_n q_{n-1} + q_{n-2} \qquad (n \ge 0),$$
where we take $p_{-1} = 1$, $q_{-1} = 0$, $p_{-2} = 0$, and $q_{-2} = 1$.

Proof. We regard the convergents $p_n/q_n$ as functions of the partial quotients. That is, $p_n = p_n(x_0, x_1, \ldots, x_n)$ and $q_n = q_n(x_0, x_1, \ldots, x_n)$. The result is clear for $n = 0$, since the recursions give $p_0 = x_0$ and $q_0 = 1$. Now suppose that $[x_0; x_1, \ldots, x_{n-1}] = p_{n-1}/q_{n-1}$ for all (not necessarily integral) values of the partial quotients. Then we can write
$$[x_0; x_1, \ldots, x_{n-1}, x_n] = \left[x_0; x_1, \ldots, x_{n-1} + \tfrac{1}{x_n}\right] = \frac{p_{n-1}\big(x_0, x_1, \ldots, x_{n-1} + \tfrac{1}{x_n}\big)}{q_{n-1}\big(x_0, x_1, \ldots, x_{n-1} + \tfrac{1}{x_n}\big)}.$$
Applying the above recurrence relations, we obtain
$$[x_0; x_1, \ldots, x_{n-1}, x_n] = \frac{\big(x_{n-1} + \tfrac{1}{x_n}\big) p_{n-2} + p_{n-3}}{\big(x_{n-1} + \tfrac{1}{x_n}\big) q_{n-2} + q_{n-3}} = \frac{x_n x_{n-1} p_{n-2} + p_{n-2} + x_n p_{n-3}}{x_n x_{n-1} q_{n-2} + q_{n-2} + x_n q_{n-3}}$$
$$= \frac{x_n (x_{n-1} p_{n-2} + p_{n-3}) + p_{n-2}}{x_n (x_{n-1} q_{n-2} + q_{n-3}) + q_{n-2}} = \frac{x_n p_{n-1} + p_{n-2}}{x_n q_{n-1} + q_{n-2}} = \frac{p_n}{q_n}.$$
The result follows by induction. $\square$

Example 8.7. Find the continued fraction expansion for $\sqrt{29}$, and compute the first 6 convergents.

Solution. We have $x_0 = \lfloor \sqrt{29} \rfloor = 5$, and thus
$$\alpha_1 = \frac{1}{\sqrt{29} - 5} = \frac{\sqrt{29} + 5}{4} = 2 + \frac{\sqrt{29} - 3}{4}.$$
It follows that $x_1 = 2$ and
$$\alpha_2 = \frac{4}{\sqrt{29} - 3} = \frac{\sqrt{29} + 3}{5} = 1 + \frac{\sqrt{29} - 2}{5}.$$
This in turn gives $x_2 = 1$ and
$$\alpha_3 = \frac{5}{\sqrt{29} - 2} = \frac{\sqrt{29} + 2}{5} = 1 + \frac{\sqrt{29} - 3}{5},$$
which yields $x_3 = 1$ and
$$\alpha_4 = \frac{5}{\sqrt{29} - 3} = \frac{\sqrt{29} + 3}{4} = 2 + \frac{\sqrt{29} - 5}{4}.$$
Now we have $x_4 = 2$ and
$$\alpha_5 = \frac{4}{\sqrt{29} - 5} = \sqrt{29} + 5 = 10 + (\sqrt{29} - 5),$$
and from this we see that $x_5 = 10$ and $\alpha_6 = \alpha_1$, which means that the continued fraction becomes periodic. We therefore conclude that $\sqrt{29} = [5; \overline{2, 1, 1, 2, 10}]$, and we can use Theorem 8.6 to compute the convergents. We have $p_0 = 5$, $p_1 = 2 \cdot 5 + 1 = 11$, $p_2 = 1 \cdot 11 + 5 = 16$, $p_3 = 1 \cdot 16 + 11 = 27$, $p_4 = 2 \cdot 27 + 16 = 70$, and $p_5 = 10 \cdot 70 + 27 = 727$.
Similarly, we get $q_0 = 1$, $q_1 = 2$, $q_2 = 1 \cdot 2 + 1 = 3$, $q_3 = 1 \cdot 3 + 2 = 5$, $q_4 = 2 \cdot 5 + 3 = 13$, and $q_5 = 10 \cdot 13 + 5 = 135$. Hence the first 6 convergents are
$$5, \quad \frac{11}{2}, \quad \frac{16}{3}, \quad \frac{27}{5}, \quad \frac{70}{13}, \quad\text{and}\quad \frac{727}{135}. \qquad \square$$

Algebraic and transcendental numbers. A real number that is a root of a non-trivial polynomial with integer coefficients is said to be algebraic. More precisely, if $\alpha$ is a root of a polynomial of degree $d$ with integer coefficients that is irreducible over $\mathbb{Q}$, then we say that $\alpha$ is algebraic of degree $d$. Note that any rational number $a/q$ is algebraic of degree one, since it is a root of the polynomial $f(x) = qx - a$. Any real number of the form $a \pm b\sqrt{n}$, where $a$, $b$, and $n$ are rational, $b \neq 0$, and $n > 0$ is not a perfect square, is algebraic of degree two. For instance, $\frac{1}{2}(1 + \sqrt{5})$ is algebraic of degree two, since it is a root of $f(x) = x^2 - x - 1$. Algebraic numbers of degree two are sometimes called quadratic irrationals. It turns out that a number is a quadratic irrational if and only if it has an eventually periodic continued fraction expansion.

The set of algebraic numbers is closed under addition and multiplication, but the set of algebraic numbers of degree $d$ is not. For instance, $\sqrt{2}$ and $\sqrt{3}$ are algebraic of degree 2, but $\sqrt{2} + \sqrt{3}$ is algebraic of degree 4 and $\sqrt{2} \cdot \sqrt{2} = 2$ is algebraic of degree 1.

Example 8.8. Prove that $\alpha = \sqrt{2} + \sqrt{3}$ is algebraic.

Solution. First of all, we have $\alpha^2 = 2 + 2\sqrt{6} + 3$, and hence $\alpha^2 - 5 = 2\sqrt{6}$. Squaring both sides gives $\alpha^4 - 10\alpha^2 + 25 = 24$, or $\alpha^4 - 10\alpha^2 + 1 = 0$. Thus $\alpha$ is a root of the polynomial $f(x) = x^4 - 10x^2 + 1$ and hence is algebraic of degree at most 4. One can in fact show that $f$ is irreducible over $\mathbb{Q}$ and hence that $\alpha$ is algebraic of degree 4. $\square$

Real numbers that are not algebraic are called transcendental. Probably the two most famous transcendental numbers are $\pi$ and $e$.
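Returning briefly to continued fractions, the recurrences of Theorem 8.6 and the expansions in Examples 8.3, 8.5, and 8.7 can be reproduced mechanically. The Python sketch below uses hypothetical function names. For $\sqrt{n}$ it keeps each $\alpha_k$ exactly in the form $(\sqrt{n} + m)/d$ with integers $m$ and $d$, as in Example 8.7; that integer bookkeeping is a standard device assumed here, not a method stated in these notes.

```python
from math import isqrt


def cf_rational(num, den):
    """Partial quotients of num/den (den > 0), by repeated division with remainder."""
    quotients = []
    while den:
        x, r = divmod(num, den)
        quotients.append(x)
        num, den = den, r
    return quotients


def cf_sqrt(n, count):
    """First `count` partial quotients of sqrt(n), for n a positive non-square."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, x = 0, 1, a0          # alpha_k = (sqrt(n) + m) / d, x = floor(alpha_k)
    quotients = [a0]
    while len(quotients) < count:
        m = d * x - m           # rationalize 1 / (alpha_k - x)
        d = (n - m * m) // d
        x = (a0 + m) // d
        quotients.append(x)
    return quotients


def convergents(quotients):
    """Pairs (p_n, q_n) from p_n = x_n p_{n-1} + p_{n-2}, q_n = x_n q_{n-1} + q_{n-2}."""
    p_prev, p = 0, 1            # p_{-2}, p_{-1}
    q_prev, q = 1, 0            # q_{-2}, q_{-1}
    result = []
    for x in quotients:
        p_prev, p = p, x * p + p_prev
        q_prev, q = q, x * q + q_prev
        result.append((p, q))
    return result


print(cf_rational(125, 54))
print(cf_sqrt(29, 6))
print(convergents(cf_sqrt(29, 6)))
```

Working with the integers $m$ and $d$ avoids floating-point error, which would otherwise corrupt the partial quotients after a modest number of steps.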
Proving the transcendence of $\pi$ and $e$ is beyond the scope of the course; however, it is not too difficult to show that $e$ is irrational.

Theorem 8.9. The number $e$ is irrational.

Proof. Suppose to the contrary that $e$ is rational, say $e = a/q$, where $a$ and $q$ are integers with $q \ge 1$. We recall that $e$ can be expressed as the infinite series
$$e = \sum_{k=0}^{\infty} \frac{1}{k!}.$$
Let $n \ge 2q$ be an integer, and let $s_n$ denote the $n$th partial sum of this series; that is,
$$s_n = \sum_{k=0}^{n} \frac{1}{k!} = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \cdots + \frac{1}{n!}.$$
Clearly $s_n$ is rational, and we can write $s_n = b/n!$ for some integer $b$. Moreover, we have $e > s_n$, and thus
$$e - s_n = \frac{a}{q} - \frac{b}{n!} = \frac{a\,n! - bq}{q\,n!} \ge \frac{1}{q\,n!}.$$
On the other hand, we have
$$e - s_n = \sum_{k=n+1}^{\infty} \frac{1}{k!} = \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \frac{1}{(n+3)!} + \cdots < \frac{1}{(n+1)!}\left(1 + \frac{1}{n} + \frac{1}{n^2} + \cdots\right) = \frac{1}{(n+1)!} \cdot \frac{1}{1 - 1/n} \le \frac{2}{(n+1)!}$$
since $n \ge 2$. Combining our two inequalities, we obtain
$$\frac{1}{q\,n!} \le e - s_n < \frac{2}{(n+1)!},$$
which implies that $n \le 2q - 1$, a contradiction. $\square$

The idea of the preceding proof may be summarized by saying that $e$ has rational approximations (namely $s_n$) that are "too good" to allow $e$ to be rational, since two distinct rationals differ by at least the reciprocal of the product of the denominators. The following theorem may be viewed as a generalization of this idea. It states that algebraic numbers cannot have fantastically good rational approximations.

Theorem 8.10. (Liouville's Theorem) Suppose that $\alpha$ is an algebraic number of degree $d \ge 2$. Then there exists a positive constant $c_\alpha$ such that
$$\left| \alpha - \frac{a}{q} \right| > \frac{c_\alpha}{q^d}$$
for all integers $a$ and $q$ with $q \ge 1$.

Proof. Suppose that $\alpha$ is a root of the irreducible polynomial
$$f(x) = c_d x^d + c_{d-1} x^{d-1} + \cdots + c_1 x + c_0,$$
where $d \ge 2$, and let $a$ and $q$ be integers with $q \ge 1$. First of all, we note that $f(a/q) \neq 0$, since $f$ is irreducible of degree at least two. Furthermore, it is clear that $q^d f(a/q)$ is an integer and hence that $q^d |f(a/q)| \ge 1$.
Since $\alpha$ is a root of $f$, we may write $f(x) = (x - \alpha)g(x)$, where $g$ is a polynomial of degree $d - 1$, not necessarily with integer coefficients. Since $g$ is a continuous function, we know that it attains maximum and minimum values on any closed, bounded interval. Therefore, there exists $M_\alpha > 0$ such that $|g(x)| \le M_\alpha$ for all $x \in [\alpha - 1, \alpha + 1]$. We set $c_\alpha = (1 + M_\alpha)^{-1}$ and consider two cases. If $|\alpha - a/q| \le 1$, then we have
$$q^{-d} \le |f(a/q)| = |\alpha - a/q|\,|g(a/q)| \le |\alpha - a/q|\,M_\alpha < |\alpha - a/q|\,c_\alpha^{-1},$$
which gives $|\alpha - a/q| > c_\alpha q^{-d}$, as required. If $|\alpha - a/q| > 1$, then the desired inequality follows from the observation that $c_\alpha \le 1$. $\square$

Example 8.11. Find an admissible value for $c_\alpha$ in Liouville's Theorem when $\alpha = \sqrt[3]{2}$.

Solution. In the notation of the above proof, we have
$$f(x) = x^3 - 2 = (x - \alpha)(x^2 + \alpha x + \alpha^2) = (x - \alpha)g(x).$$
Since $g'(x) = 2x + \alpha$, we find that $g$ is increasing on the interval $[\alpha - 1, \alpha + 1]$ and hence that $g(\alpha - 1) \le g(x) \le g(\alpha + 1)$ for all $x$ in the interval $[\alpha - 1, \alpha + 1]$. Since $g(\alpha - 1) > 0$ and $g(\alpha + 1) = 3\alpha^2 + 3\alpha + 1 < 9.542$, we can take $M_\alpha = 9.542$ and thus any $c_\alpha < (10.542)^{-1}$ is admissible. For example, one has
$$\left| \sqrt[3]{2} - \frac{a}{q} \right| > \frac{1}{11 q^3}$$
for all integers $a$ and all positive integers $q$. $\square$

One might hope that the proof of Theorem 8.9 could be modified to show that $e$ is transcendental using the contrapositive of Liouville's Theorem. However, the quality of the rational approximations $s_n$ is not sufficient to make this argument work. We note that $s_n$ has denominator $q = n!$, but $2/(n+1)! > 1/(n!)^2 = 1/q^2$, so the inequality $|e - s_n| < 2/(n+1)!$ doesn't even rule out the possibility that $e$ is a quadratic irrational! Therefore a more sophisticated argument is required to prove that $e$ is transcendental. However, we can establish the existence of transcendental numbers by working with a series that converges much faster.

Theorem 8.12.
The number $\alpha = \sum_{k=1}^{\infty} 10^{-k!} = 0.11000100000000000000000100000000\ldots$ is transcendental.

Proof. We write
\[ \frac{a_m}{b_m} = \sum_{k=1}^{m} 10^{-k!}, \]
where $b_m = 10^{m!}$. We then have
\[ \left| \alpha - \frac{a_m}{b_m} \right| = \sum_{k=m+1}^{\infty} 10^{-k!} = 10^{-(m+1)!} + 10^{-(m+2)!} + 10^{-(m+3)!} + \cdots < 10^{-(m+1)!}\left(1 + 10^{-1} + 10^{-2} + \cdots\right) = \frac{10}{9} \cdot 10^{-(m+1)!} = \frac{10}{9}\, b_m^{-(m+1)}. \]
If $\alpha$ is algebraic of degree $d \ge 2$, then Liouville's Theorem implies that there is a constant $c > 0$ such that $|\alpha - a_m/b_m| > c\, b_m^{-d}$ for all $m$. This statement holds for $d = 1$ as well since $\alpha \ne a_m/b_m$ and hence $\alpha = a/b \implies |\alpha - a_m/b_m| \ge (b\, b_m)^{-1}$, whence we can take $c = 1/(b + 1)$. Thus if $\alpha$ is algebraic of degree $d$ we have
\[ c\, b_m^{-d} < \left| \alpha - \frac{a_m}{b_m} \right| < \frac{10}{9}\, b_m^{-(m+1)}, \]
and thus $b_m^{m+1-d} < 10/(9c)$. But $b_m \to \infty$ as $m \to \infty$, so we obtain a contradiction by taking $m$ sufficiently large in terms of $c$ and $d$. □

Some open questions. A real number $\alpha$ is said to be badly approximable if there is a positive constant $c_\alpha$ such that $|\alpha - a/b| > c_\alpha b^{-2}$ for all integers $a$ and $b$ with $b \ge 1$. Liouville's Theorem shows that all algebraic numbers of degree two (i.e., all quadratic irrationals) are badly approximable. It is conjectured that no algebraic numbers of degree greater than two are badly approximable, but this has not been proven. It turns out that a number is badly approximable if and only if the partial quotients in its continued fraction expansion are bounded. For instance, $e$ is not badly approximable, for it can be shown that
\[ e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, 1, 10, 1, 1, 12, \ldots]. \]
By contrast, it is unknown whether $\pi$ is badly approximable (the conjecture is that it's not). We do know that most real numbers are not badly approximable, in the sense that the badly approximable numbers have measure zero in the real line. However, the set of badly approximable numbers is uncountable (like the reals), whereas the set of algebraic numbers is countable (like the integers and the rationals).
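The continued fraction expansion of $e$ quoted above can be checked with exact rational arithmetic. The following Python sketch is only an illustration (the function names and the default cutoff `n = 30` are assumptions, not part of the notes). It brackets $e$ between $s_n$ and $s_n + 2/(n+1)!$, using the estimate from the proof of Theorem 8.9, and keeps only the partial quotients on which both endpoints agree, since every real number between the endpoints shares that common prefix.

```python
from fractions import Fraction
from math import factorial


def continued_fraction(x, max_terms):
    """Return up to max_terms partial quotients of the positive rational x."""
    terms = []
    for _ in range(max_terms):
        a = x.numerator // x.denominator  # integer part of x
        terms.append(a)
        remainder = x - a
        if remainder == 0:
            break  # the expansion of a rational number terminates
        x = 1 / remainder
    return terms


def partial_quotients_of_e(num_terms, n=30):
    """Partial quotients of e that are certified by the bracket
    s_n < e < s_n + 2/(n+1)!  (valid for n >= 2, as in Theorem 8.9)."""
    lower = sum(Fraction(1, factorial(k)) for k in range(n + 1))
    upper = lower + Fraction(2, factorial(n + 1))
    certified = []
    for lo, hi in zip(continued_fraction(lower, num_terms),
                      continued_fraction(upper, num_terms)):
        if lo != hi:
            break  # endpoints disagree, so later terms are not certified
        certified.append(lo)
    return certified


if __name__ == "__main__":
    print(partial_quotients_of_e(18))
```

With the default cutoff the bracket has width about $2/31!$, which is far narrower than needed to certify the first eighteen partial quotients; the unbounded entries $2, 4, 6, 8, \ldots$ are what prevent $e$ from being badly approximable.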
Therefore, there are uncountably many badly approximable transcendental numbers, but producing a single specific example seems to be non-trivial.

On a more basic level, much is still unknown about the irrationality and transcendence of familiar numbers. For instance, it is not known whether the numbers $\pi \pm e$, $\pi/e$, $\pi^e$, $\pi^{\sqrt{2}}$, $2^e$, and $\log \pi$ are irrational. As another example, consider the Riemann zeta function
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \]
for $s > 1$. When $s$ is an even integer, it is known that $\zeta(s)$ is a rational multiple of $\pi^s$ (and hence transcendental); for example, $\zeta(2) = \pi^2/6$. Much less is known when $s$ is odd. It was proved by Apéry in 1979 that $\zeta(3)$ is irrational, but it is unknown whether $\zeta(3)$ is transcendental. It is unknown whether $\zeta(5)$ is irrational, although it has been shown that at least one of $\zeta(5)$, $\zeta(7)$, $\zeta(9)$, and $\zeta(11)$ must be irrational (Zudilin, 2001). In fact, it is known that there are infinitely many odd integers $s$ for which $\zeta(s)$ is irrational (Rivoal, 2000), but the irrationality is not known for any particular odd $s > 3$.

9. The distribution of primes

Suppose that $p_n$ denotes the $n$th prime, so that $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, and so on. One of the central problems in analytic number theory is to obtain precise information about the behavior of $p_n$ as $n \to \infty$. A simple way to get an idea of how the sequence $(p_n)$ is distributed is to look at the sum of the reciprocals of the terms:
\[ \sum_{n=1}^{\infty} \frac{1}{p_n} = \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \frac{1}{13} + \frac{1}{17} + \cdots. \tag{9.1} \]
As a starting point, we recall from calculus that the harmonic series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$ diverges. The following lemma provides quantitative information about the growth rate of the partial sums. Specifically, it says that the sum of the first $N$ terms is roughly $\log N$.

Lemma 9.1. For all positive integers $N$, one has
\[ 0 < 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{N} - \log N \le 1. \]

Proof.
By using a left-hand Riemann sum to over-estimate the area under the graph of $y = 1/t$, we find that
\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{N} > \int_1^{N+1} \frac{dt}{t} = \log(N + 1) > \log N. \]
By considering a right-hand Riemann sum we similarly obtain
\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{N} \le 1 + \int_1^{N} \frac{dt}{t} = 1 + \log N, \]
and the result follows by subtracting $\log N$ from each side of the above inequalities. □

It turns out that the quantity considered above actually approaches a limit as $N \to \infty$, known as Euler's constant:
\[ \gamma = \lim_{N \to \infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{N} - \log N \right) = 0.57721566490153286060651209008240243\ldots \]
Although Euler's constant is known to over 1,000,000 decimal places, it is still unknown whether it is irrational. It is conjectured to be transcendental.

We now return to the prime harmonic series (9.1). It turns out that the $N$th partial sum of this series is on the order of $\log \log N$ rather than $\log N$. In what follows, it is useful to call an integer square-free if it is not divisible by the square of any prime. In other words, $n$ is square-free if we can write $n = p_1 \cdots p_k$, where $p_1, \ldots, p_k$ are distinct primes. For instance, the integer $42 = 2 \cdot 3 \cdot 7$ is square-free but $45 = 3^2 \cdot 5$ is not. From now on, the letter $p$ is reserved to denote a prime unless otherwise indicated.

Theorem 9.2. For every integer $N > 1$, one has
\[ \sum_{p \le N} \frac{1}{p} > \log \log N - \log 2. \]

Proof. Every positive integer $n$ can be written uniquely in the form $n = k m^2$, where $k$ and $m$ are positive integers with $k$ square-free. Using Lemma 9.1, we obtain
\[ \log N < \sum_{n \le N} \frac{1}{n} = \sum_{\substack{k \le N \\ k \text{ squarefree}}} \; \sum_{m \le \sqrt{N/k}} \frac{1}{k m^2} \le \Bigg( \sum_{\substack{k \le N \\ k \text{ squarefree}}} \frac{1}{k} \Bigg) \Bigg( \sum_{m=1}^{\infty} \frac{1}{m^2} \Bigg). \]
Furthermore, we have
\[ \sum_{m=1}^{\infty} \frac{1}{m^2} < 1 + \int_1^{\infty} \frac{dt}{t^2} = 2, \]
and the inequality $1 + u \le e^u$ yields
\[ \sum_{\substack{k \le N \\ k \text{ squarefree}}} \frac{1}{k} \le \prod_{p \le N} \left( 1 + \frac{1}{p} \right) \le \prod_{p \le N} e^{1/p} = \exp\Bigg( \sum_{p \le N} \frac{1}{p} \Bigg). \]
We therefore deduce that
\[ \log N < 2 \exp\Bigg( \sum_{p \le N} \frac{1}{p} \Bigg), \]
and taking logarithms gives
\[ \log \log N < \log 2 + \sum_{p \le N} \frac{1}{p}, \]
as required. □

Corollary 9.3.
The prime harmonic series $\sum_{n=1}^{\infty} \frac{1}{p_n}$ diverges.

Proof. This follows immediately from Theorem 9.2, since $\lim_{N \to \infty} (\log \log N) = \infty$. □

We may interpret the divergence of $\sum p_n^{-1}$ to mean that the primes are not all that sparsely distributed. For instance, if the primes were as sparse as the sequence of perfect squares then the series would converge by comparison with $\sum n^{-2}$. On the other hand, comparing the orders of growth of the partial sums in Lemma 9.1 and Theorem 9.2 indicates that the primes are, at least in some sense, significantly sparser than the integers themselves.

In order to obtain more precise information about the growth of $p_n$, it is useful to define $\pi(n)$ to be the number of primes $p$ with $p \le n$. We aim to derive some elementary bounds for $\pi(n)$ due to Chebyshev and then use our results to obtain bounds on $p_n$. We begin with two simple combinatorial lemmas.

Lemma 9.4. One has
\[ 2^n \le \binom{2n}{n} < 4^n \]
for all positive integers $n$.

Proof. By the binomial theorem, we have
\[ 4^n = (1 + 1)^{2n} = \sum_{k=0}^{2n} \binom{2n}{k} > \binom{2n}{n}. \]
The other inequality may be established by a simple induction argument and is left as an exercise. □

Lemma 9.5. One has
\[ \log n! = \sum_{p \le n} \sum_{k=1}^{\lfloor \log_p n \rfloor} \left\lfloor \frac{n}{p^k} \right\rfloor \log p. \]

Proof. Since $n! = 1 \cdot 2 \cdot 3 \cdots (n - 1) \cdot n$, it is clear that there are no primes $p > n$ dividing $n!$, so we may write $n! = \prod_{p \le n} p^{\alpha_p}$, where $\alpha_p$ is a non-negative integer representing the exact power of $p$ that divides $n!$. Taking logarithms gives
\[ \log n! = \sum_{p \le n} \alpha_p \log p, \]
so it remains to find a formula for $\alpha_p$. Among the integers $1, 2, 3, \ldots, n - 1, n$, there are $\lfloor n/p \rfloor$ multiples of $p$. Of these, $\lfloor n/p^2 \rfloor$ are also multiples of $p^2$, and in general $\lfloor n/p^k \rfloor$ of them are multiples of $p^k$. Since $p^k > n$ when $k > \log_p n$, we see that
\[ \alpha_p = \left\lfloor \frac{n}{p} \right\rfloor + \left\lfloor \frac{n}{p^2} \right\rfloor + \left\lfloor \frac{n}{p^3} \right\rfloor + \cdots = \sum_{k=1}^{\lfloor \log_p n \rfloor} \left\lfloor \frac{n}{p^k} \right\rfloor, \]
and this completes the proof. □

Theorem 9.6. For every integer $n \ge 2$ one has
\[ \frac{n}{6 \log n} < \pi(n) < \frac{6n}{\log n}. \]

Proof.
By taking logarithms in the result of Lemma 9.4, we obtain
\[ n \log 2 \le \log (2n)! - 2 \log n! < n \log 4, \]
since $\binom{2n}{n} = (2n)!/(n!)^2$. Lemma 9.5 therefore gives
\[ n \log 2 \le \sum_{p \le 2n} \sum_{k=1}^{\lfloor \log_p 2n \rfloor} \left( \left\lfloor \frac{2n}{p^k} \right\rfloor - 2 \left\lfloor \frac{n}{p^k} \right\rfloor \right) \log p < n \log 4. \tag{9.2} \]
Now since $\lfloor 2x \rfloor - 2 \lfloor x \rfloor$ is $0$ if $0 \le x - \lfloor x \rfloor < 1/2$ and $1$ otherwise, we find that
\[ \frac{n}{2} < n \log 2 \le \sum_{p \le 2n} (\log_p 2n)(\log p) = \pi(2n) \log 2n, \]
and hence $\pi(2n) > 2n/(4 \log 2n)$, which establishes the lower bound for even integers. Since $2n \ge \frac{2}{3}(2n + 1)$ for $n \ge 1$, we also have
\[ \pi(2n + 1) \ge \pi(2n) > \frac{2n}{4 \log 2n} > \frac{2n + 1}{6 \log(2n + 1)}, \]
which proves the lower bound for odd integers.

For the upper bound, we delete all but the $k = 1$ term from (9.2) to obtain
\[ \sum_{p \le 2n} \left( \left\lfloor \frac{2n}{p} \right\rfloor - 2 \left\lfloor \frac{n}{p} \right\rfloor \right) \log p < n \log 4. \]
Let $\theta(n) = \sum_{p \le n} \log p$. Since $\lfloor 2n/p \rfloor - 2 \lfloor n/p \rfloor = 1$ when $n < p \le 2n$, we deduce that
\[ \theta(2n) - \theta(n) = \sum_{n < p \le 2n} \log p < n \log 4. \]
Now if $n$ is a particular integer with $n \ge 2$, there is a positive integer $r$ such that $2^r \le n < 2^{r+1}$. We then have
\[ \theta(n) \le \theta(2^{r+1}) = \sum_{k=0}^{r} \left( \theta(2^{k+1}) - \theta(2^k) \right) < \sum_{k=0}^{r} 2^k \log 4 = (2^{r+1} - 1) \log 4 < 4n \log 2, \]
since the first summation telescopes and $\theta(1) = 0$. On the other hand, we have
\[ \theta(n) \ge \sum_{n^{2/3} < p \le n} \log p \ge \left( \pi(n) - \pi(n^{2/3}) \right) \log(n^{2/3}) \ge \tfrac{2}{3} \left( \pi(n) - n^{2/3} \right) \log n. \]
Combining the previous two inequalities yields $(\pi(n) - n^{2/3}) \log n < 6n \log 2$, and hence
\[ \pi(n) < \frac{6n \log 2}{\log n} + n^{2/3} = \frac{n}{\log n} \left( 6 \log 2 + \frac{\log n}{n^{1/3}} \right). \]
It is a simple calculus exercise to show that the function $(\log x)/x^{1/3}$ takes its maximum value at $x = e^3$ and hence that $(\log n)/n^{1/3} \le 3/e$ for all $n \ge 1$. We therefore have
\[ \pi(n) < \frac{n}{\log n} \left( 6 \log 2 + 3/e \right) < \frac{6n}{\log n}, \]
as required. □

We now deduce upper and lower bounds on the size of the $n$th prime.

Theorem 9.7. For every integer $n \ge 2$, one has
\[ \tfrac{1}{6}\, n \log n < p_n < 18\, n \log n. \]

Proof. Suppose that $p_n = m$. By Theorem 9.6, we have
\[ n = \pi(m) < \frac{6m}{\log m} = \frac{6 p_n}{\log p_n}, \]
and thus $p_n > \frac{1}{6} n \log p_n > \frac{1}{6} n \log n$, which gives the lower bound.
Similarly, Theorem 9.6 gives
\[ n = \pi(m) > \frac{m}{6 \log m} = \frac{p_n}{6 \log p_n}, \]
and thus $p_n < 6n \log p_n$. We recall from the proof of Theorem 9.6 that $\log x \le (3/e) x^{1/3}$, which gives $p_n < (18/e)\, n\, p_n^{1/3}$ and thus $p_n^{2/3} < 18n/e$. Taking logarithms gives
\[ \tfrac{2}{3} \log p_n < \log n + \log(18/e) < 2 \log n, \]
provided that $n > 6$. We therefore obtain $p_n < 18 n \log n$ when $n > 6$, and it is easy to check that this holds for $2 \le n \le 6$ as well. □

Even more precise information is known about $\pi(n)$ and $p_n$ asymptotically as $n \to \infty$. Before mentioning some of these results, we discuss some of the common asymptotic notation. We say that $f(x) \sim g(x)$ as $x \to \infty$ if
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = 1. \]
Furthermore, we write $f(x) = o(g(x))$ if
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = 0. \]
Finally, we write $f(x) = O(g(x))$ if there is a constant $C$ such that $|f(x)| \le C |g(x)|$ for all $x$. Notice that $f = o(g)$ implies that $f = O(g)$.

Theorem 9.8. (The Prime Number Theorem) As $n \to \infty$ one has
\[ \pi(n) \sim \frac{n}{\log n} \quad \text{and} \quad p_n \sim n \log n. \]

The proof of the prime number theorem is beyond the scope of the course, as the most direct method requires the theory of complex variables. If $\pi(n; q, a)$ denotes the number of primes $p \le n$ with $p \equiv a \pmod{q}$, then it is also known that
\[ \pi(n; q, a) \sim \frac{1}{\varphi(q)}\, \pi(n) \sim \frac{1}{\varphi(q)} \cdot \frac{n}{\log n} \]
whenever $(a, q) = 1$. This is called the prime number theorem for arithmetic progressions. In particular, it shows that there are infinitely many primes in each reduced residue class modulo $q$ and that the primes are equally distributed among the residue classes asymptotically. For example, roughly half of the odd primes are congruent to 1 mod 4 and roughly half are congruent to 3 mod 4.

The prime number theorem may be interpreted by saying that the probability that the integer $n$ is prime is roughly $1/\log n$. In fact, this interpretation leads to an approximation for $\pi(n)$ that is more accurate than $n/\log n$. It is known that $\pi(n) \sim \operatorname{li}(n)$, where
\[ \operatorname{li}(x) = \int_2^x \frac{dt}{\log t}. \]
We may think of $\operatorname{li}(x)$ as a sort of cumulative distribution function for the density function $f(t) = 1/\log t$. It is known that $|\pi(x) - \operatorname{li}(x)| = o(x)$ as $x \to \infty$, and in fact one can make the error term more explicit. A classical quantitative version of the prime number theorem states that
\[ |\pi(x) - \operatorname{li}(x)| = O\!\left( x \exp\!\left( -c \sqrt{\log x} \right) \right) \]
for some constant $c > 0$. However, it is easy to show that this error term grows more rapidly than $x^{1-\delta}$ for every $\delta > 0$, so this is actually a fairly weak result in some sense. It is conjectured that the true error term is just slightly larger than $\sqrt{x}$.

Conjecture 9.9. (The Riemann Hypothesis) One has
\[ |\pi(x) - \operatorname{li}(x)| = O(\sqrt{x} \log x). \]

This is one of the most notorious unsolved problems in mathematics, and even establishing an error term of $O(x^{1-\delta})$ for some positive $\delta$ would be considered a major breakthrough. The usual statement of the Riemann hypothesis concerns the zeta function mentioned at the end of §8. This is a function of a complex variable, which is defined by the infinite series
\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s} \]
when $\operatorname{Re}(s) > 1$. The above series fails to converge when $\operatorname{Re}(s) \le 1$, but it turns out that the zeta function has a unique extension (called an analytic continuation) to the whole complex plane, apart from a simple pole at $s = 1$. This extension of $\zeta(s)$ has so-called "trivial" zeros at the negative even integers, and the Riemann hypothesis is equivalent to the assertion that all the remaining zeros of $\zeta(s)$ lie on the line $\operatorname{Re}(s) = 1/2$.

Twin Primes and Mersenne Primes. It is conjectured that $\pi_2(n)$, the number of twin prime pairs $(p, p + 2)$ with $p + 2 \le n$, is asymptotic to $Cn/(\log n)^2$ for some constant $C > 0$, but we don't even know that $\pi_2(n) \to \infty$. This latter statement is known as the Twin Prime Conjecture. In some sense, the twin primes are very sparse, as it can be shown that the sum of their reciprocals,
\[ \left( \frac{1}{3} + \frac{1}{5} \right) + \left( \frac{1}{5} + \frac{1}{7} \right) + \left( \frac{1}{11} + \frac{1}{13} \right) + \left( \frac{1}{17} + \frac{1}{19} \right) + \cdots, \]
converges, in contrast to the conclusion of Corollary 9.3. The value of the above sum, known as Brun's constant, is quite difficult to estimate precisely because of the slow convergence; however, its value appears to be around 1.902160583. In 1994, Nicely discovered inconsistencies in his computations of Brun's constant, which turned out to result from a subtle flaw in Intel's new Pentium processor. This led to an embarrassing recall and provided one of the more surprising applications of number theory. It is not known whether Brun's constant is rational; of course, its irrationality would imply the Twin Prime Conjecture since a finite sum of rational numbers is rational.

Recall that the Mersenne numbers are integers of the form $2^p - 1$ where $p$ is prime. It is conjectured that the number of Mersenne primes up to $n$ is asymptotic to $e^{\gamma} \log_2(\log n)$, where $\gamma$ is Euler's constant. However, only 46 Mersenne primes have been discovered as of November 2008, and proving that there are infinitely many seems completely out of reach. The computational evidence certainly suggests that the Mersenne primes are sparsely distributed among the Mersenne numbers; that is, for most primes $p$ the number $2^p - 1$ turns out to be composite. However, it also remains an open problem to establish that there are infinitely many composite Mersenne numbers. It seems inconceivable that this would fail, since then all sufficiently large Mersenne numbers would be prime! Nevertheless, the existing technology does not seem to be capable of generating a proof.

References

[1] G. E. Andrews, Number Theory, Dover, 1994.
[2] T. M. Apostol, Introduction to analytic number theory, Undergraduate Texts in Mathematics, Springer-Verlag, 1976.
[3] T. H. Barr, Invitation to cryptology, Prentice Hall, 2002.
[4] D. M. Bressoud, Factorization and primality testing, Undergraduate Texts in Mathematics, Springer-Verlag, New York, 1989.
[5] E. B.
Burger, Exploring the number theory jungle: A journey into diophantine analysis, AMS Student Mathematical Library, Volume 8, 2000.
[6] M. Erickson and A. Vazzana, Introduction to number theory, Discrete Mathematics and its Applications, Chapman & Hall/CRC, Boca Raton, 2008.
[7] J. A. Gallian, Contemporary abstract algebra, 6th ed, Houghton Mifflin, 2006.
[8] G. H. Hardy and E. M. Wright, An introduction to the theory of numbers, 6th ed, Oxford University Press, 2008.
[9] J. F. Humphreys and M. Y. Prest, Numbers, groups, and codes, Cambridge University Press, 1989.
[10] K. Ireland and M. Rosen, A classical introduction to modern number theory, 2nd ed, Graduate Texts in Mathematics, 84, Springer-Verlag, 1990.
[11] N. Koblitz, A course in number theory and cryptography, 2nd ed, Graduate Texts in Mathematics, 114, Springer-Verlag, 1994.
[12] H. L. Montgomery and R. C. Vaughan, Multiplicative number theory I. Classical theory, Cambridge University Press, 2007.
[13] M. B. Nathanson, Additive number theory I: The classical bases, Graduate Texts in Mathematics, 164, Springer-Verlag, 1996.
[14] I. Niven, H. S. Zuckerman, and H. L. Montgomery, An introduction to the theory of numbers, 5th ed, Wiley, 1991.
[15] K. H. Rosen, Elementary number theory and its applications, 5th ed, Pearson Addison Wesley, 2005.
[16] J. H. Silverman, A friendly introduction to number theory, 3rd ed, Pearson Prentice Hall, 2006.
[17] G. Tenenbaum and M. Mendès France, The prime numbers and their distribution, AMS Student Mathematical Library, Volume 6, 2000.
[18] R. C. Vaughan, The Hardy-Littlewood method, 2nd ed, Cambridge University Press, 1997.
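
As a numerical companion to Section 9, the elementary bounds of Theorems 9.6 and 9.7 can be checked directly for small $n$. The Python sketch below is an illustration only: the function names and the cutoff used in the example run are assumptions, not part of the notes. It uses a sieve of Eratosthenes to tabulate $\pi(n)$ and $p_n$ and then tests both pairs of inequalities.

```python
from math import isqrt, log


def primes_up_to(limit):
    """Sieve of Eratosthenes: return the list of primes p <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, isqrt(limit) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def check_chebyshev_bounds(limit):
    """Check Theorem 9.6 for 2 <= n <= limit and Theorem 9.7 for every
    prime p_n <= limit with n >= 2. Returns True if no bound fails."""
    primes = primes_up_to(limit)
    prime_set = set(primes)

    # Theorem 9.6: n / (6 log n) < pi(n) < 6 n / log n.
    count = 0  # running value of pi(n)
    for n in range(2, limit + 1):
        if n in prime_set:
            count += 1
        if not (n / (6 * log(n)) < count < 6 * n / log(n)):
            return False

    # Theorem 9.7: (1/6) n log n < p_n < 18 n log n for n >= 2.
    for n, p in enumerate(primes, start=1):
        if n >= 2 and not (n * log(n) / 6 < p < 18 * n * log(n)):
            return False
    return True


if __name__ == "__main__":
    limit = 10_000  # hypothetical cutoff chosen for a quick run
    primes = primes_up_to(limit)
    print(len(primes), check_chebyshev_bounds(limit))
    # Ratio pi(n) / (n / log n), which Theorem 9.8 says tends to 1.
    print(len(primes) * log(limit) / limit)
```

The constants 6 and 18 are far from sharp, so the check passes with a wide margin; the last printed ratio gives a rough sense of how slowly $\pi(n)$ approaches $n/\log n$.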