California State University, Fresno
MATH 216T TOPICS IN NUMBER THEORY
Spring 2008
Instructor : Stefaan Delcroix

Chapter 1 Diophantine Equations

Definition 1.1 Let $f(x_1, \dots, x_n)$ be a polynomial with integral coefficients in $x_1, \dots, x_n$. We call the equation $f(x_1, \dots, x_n) = 0$ a diophantine equation if we are looking for all solutions $(x_1, \dots, x_n) \in \mathbb{Z}^n$ or $(x_1, \dots, x_n) \in \mathbb{Q}^n$.

1.1 The Diophantine Equation $ax + by = c$

Theorem 1.2 Let $a, b \in \mathbb{Z}_0$ and $c \in \mathbb{Z}$. Then the following holds about the diophantine equation $ax + by = c$ :
(a) The equation has a solution if and only if $\gcd(a, b) \mid c$.
(b) Suppose that $(x_0, y_0)$ is a solution. Then all the solutions are given by
$$x = x_0 - t\,\frac{b}{d}, \qquad y = y_0 + t\,\frac{a}{d}, \qquad t \in \mathbb{Z}$$
where $d = \gcd(a, b)$.

Proof : Put $d = \gcd(a, b)$.
(a) Suppose first that $(x_0, y_0)$ is a solution. Then $ax_0 + by_0 = c$. Since $d \mid a$ and $d \mid b$, we get that $d \mid (ax_0 + by_0)$ and so $d \mid c$.
Suppose next that $d \mid c$. Then $c = dk$ for some $k \in \mathbb{Z}$. We've seen in MATH 116 that the greatest common divisor of $a$ and $b$ can be written as a linear combination of $a$ and $b$. So there exist $u, v \in \mathbb{Z}$ with $au + bv = d$. Hence $a(uk) + b(vk) = dk = c$. So the equation $ax + by = c$ has a solution (namely $(x, y) = (uk, vk)$).
(b) Suppose that $(x_0, y_0)$ is a solution. Then $d \mid c$ by (a). Pick $t \in \mathbb{Z}$. Put $x = x_0 - t\frac{b}{d}$ and $y = y_0 + t\frac{a}{d}$. Then
$$ax + by = a\left(x_0 - t\frac{b}{d}\right) + b\left(y_0 + t\frac{a}{d}\right) = ax_0 - t\frac{ab}{d} + by_0 + t\frac{ba}{d} = ax_0 + by_0 = c$$
So $(x, y)$ is a solution.
Next, we show that every solution is of this form. Suppose that $(x, y)$ is a solution. Then
$$ax + by = c = ax_0 + by_0$$
Hence $b(y - y_0) = a(x_0 - x)$. Put $a = da_0$ and $b = db_0$. Then $db_0(y - y_0) = da_0(x_0 - x)$ and so $b_0(y - y_0) = a_0(x_0 - x)$. But $\gcd(a_0, b_0) = 1$. Since $b_0 \mid a_0(x_0 - x)$, we get that $b_0 \mid (x_0 - x)$. Hence $x_0 - x = b_0 t$ for some $t \in \mathbb{Z}$. Then $y - y_0 = a_0 t$. So $x = x_0 - b_0 t$ and $y = y_0 + a_0 t$. □

Example : Find all the solutions of the diophantine equation $457x + 67y = 60714$. Are there any solutions $(x, y) \in \mathbb{N}^2$?
We start by calculating $\gcd(457, 67)$ and writing it as a linear combination of 457 and 67. Each row $(u, v, r)$ below satisfies $457u + 67v = r$.
$$\begin{array}{l|rrr|l}
R_1 & 1 & 0 & 457 & \\
R_2 & 0 & 1 & 67 & \\
R_3 & -1 & 7 & 12 & 457 = 67 \cdot 7 + (-12), \text{ so } R_3 = R_1 - 7R_2 \text{ and change signs} \\
R_4 & -6 & 41 & 5 & 67 = 12 \cdot 6 + (-5), \text{ so } R_4 = R_2 - 6R_3 \text{ and change signs} \\
R_5 & 11 & -75 & 2 & 12 = 5 \cdot 2 + 2, \text{ so } R_5 = R_3 - 2R_4 \\
R_6 & -28 & 191 & 1 & 5 = 2 \cdot 2 + 1, \text{ so } R_6 = R_4 - 2R_5 \\
 & & & & 2 = 1 \cdot 2 + 0, \text{ so we stop}
\end{array}$$
Hence
$$\gcd(457, 67) = 1 = (-28) \cdot 457 + 191 \cdot 67$$
Multiplying both sides by 60714, we get that
$$(-1699992) \cdot 457 + 11596374 \cdot 67 = 60714$$
So $(-1699992, 11596374)$ is a particular solution of $457x + 67y = 60714$. Hence all integral solutions of $457x + 67y = 60714$ are
$$x = -1699992 + 67t, \qquad y = 11596374 - 457t, \qquad t \in \mathbb{Z}$$
Are there any solutions $(x, y) \in \mathbb{N}^2$? For such a solution we need
$$-1699992 + 67t \geq 0 \quad\text{and}\quad 11596374 - 457t \geq 0$$
Hence we get
$$t \geq \frac{1699992}{67} \approx 25373.0149 \quad\text{and}\quad t \leq \frac{11596374}{457} \approx 25374.9978$$
Since $t$ is an integer, we get that $t = 25374$. So
$$x = -1699992 + 67 \cdot 25374 = 66 \quad\text{and}\quad y = 11596374 - 457 \cdot 25374 = 456$$
So the only natural solution of $457x + 67y = 60714$ is $(x, y) = (66, 456)$.

We have the following result for natural solutions of $ax + by = c$.

Proposition 1.3 Let $a, b, c \in \mathbb{N}_0$ with $\gcd(a, b) = 1$ and $c > ab$. Then the equation $ax + by = c$ has a solution $(x, y) \in \mathbb{N}_0^2$.

Proof : Let $(x_0, y_0)$ be a solution of $ax + by = c$. Then all the solutions are given by
$$x_t = x_0 - tb, \qquad y_t = y_0 + ta, \qquad t \in \mathbb{Z}$$
Hence the solutions are equally spaced along the line $L$ with equation $ax + by = c$ : for all $t \in \mathbb{Z}$, the distance between $(x_{t+1}, y_{t+1})$ and $(x_t, y_t)$ is
$$\sqrt{\big([x_0 - (t+1)b] - [x_0 - tb]\big)^2 + \big([y_0 + (t+1)a] - [y_0 + ta]\big)^2} = \sqrt{a^2 + b^2}$$
Note that the X-intercept (resp. Y-intercept) of $L$ is
$$\left(\frac{c}{a}, 0\right) \quad\text{resp.}\quad \left(0, \frac{c}{b}\right)$$
The distance between these two intercepts (which is the length of the part of $L$ that is in the first quadrant) is given by
$$\sqrt{\frac{c^2}{a^2} + \frac{c^2}{b^2}} = \frac{c}{ab}\sqrt{a^2 + b^2} > \sqrt{a^2 + b^2}$$
since $c > ab$. Hence there exists $t \in \mathbb{Z}$ with $(x_t, y_t) \in \mathbb{N}_0^2$ (so $(x_t, y_t)$ is in the first quadrant). □
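The procedure of this section (extended Euclidean algorithm, particular solution, general solution, then restriction to natural solutions) can be sketched in Python. This is only an illustration under stated assumptions: the function names `extended_gcd` and `natural_solutions` are hypothetical and not part of the notes, and the sketch uses the standard iterative form of the algorithm rather than the "change signs" variant of the table above, so the particular solution it finds may differ from the one in the example.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and a*u + b*v == g (a, b positive)."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        # Same row operation as in the table: new row = older row - q * newer row.
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def natural_solutions(a, b, c):
    """List all (x, y) with x, y >= 0 and a*x + b*y == c, for positive a and b."""
    d, u, v = extended_gcd(a, b)
    if c % d != 0:
        return []  # Theorem 1.2(a): no integral solution exists
    x0, y0 = u * (c // d), v * (c // d)  # particular solution
    # Theorem 1.2(b): x = x0 - (b/d) t and y = y0 + (a/d) t.
    step_x, step_y = b // d, a // d
    t_max = x0 // step_x        # x >= 0  <=>  t <= floor(x0 / step_x)
    t_min = -(y0 // step_y)     # y >= 0  <=>  t >= ceil(-y0 / step_y)
    return [(x0 - step_x * t, y0 + step_y * t) for t in range(t_min, t_max + 1)]


print(natural_solutions(457, 67, 60714))
```

For the example above, the call lists the single natural solution $(66, 456)$, whatever particular solution the algorithm starts from.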
1.2 Unique Prime Factorization over $\mathbb{N}$

The fact that we can write every $n \in \mathbb{N}$ with $n \geq 2$ uniquely (up to order) as a product of primes can be a very powerful tool. We illustrate this with a question from the International Math Olympiad. The original question was :

(IMO 1997, B2) : Find all $(a, b) \in \mathbb{N}_0^2$ with $a^{b^2} = b^a$.

We start with an easy lemma :

Lemma 1.4 Let $n \in \mathbb{N}$ with $n \geq 2$. Then we have the following :
(a) $n^{k-2} > k$ for all $k \in \mathbb{N}$ with $k \geq 5$.
(b) $n^{2k-1} > k$ for all $k \in \mathbb{N}_0$.

Proof : (a) We use induction on $k$. For $k = 5$, we have $n^{5-2} = n^3 \geq 2^3 = 8 > 5$. So suppose that $n^{k-2} > k$ for $k = 5, 6, \dots, m$ for some $m \in \mathbb{N}$ with $m \geq 5$. Then using induction, we get
$$n^{(m+1)-2} = n^{m-1} = n \cdot n^{m-2} > n \cdot m \geq 2m \geq m + 1$$
since $m \geq 5$.
(b) Similar to the proof in (a). □

Now we tackle the question. One easily checks that $a = 1 \Leftrightarrow b = 1$. So we may assume that $a, b > 1$. Hence we can consider the prime factorization of $a$ and $b$. We only use the primes that show up in either factorization :
$$a = \prod_{i=1}^{n} p_i^{\alpha_i} \quad\text{and}\quad b = \prod_{i=1}^{n} p_i^{\beta_i}$$
where $p_1 < p_2 < \cdots < p_n$ are primes, $\alpha_i, \beta_i \in \mathbb{N}$ and not both $\alpha_i$ and $\beta_i$ are zero for $i = 1, 2, \dots, n$. Substituting this into $a^{b^2} = b^a$, we find
$$\prod_{i=1}^{n} p_i^{\alpha_i b^2} = \prod_{i=1}^{n} p_i^{\beta_i a}$$
Using unique prime factorization, we get
$$\alpha_i b^2 = \beta_i a \quad\text{for } i = 1, 2, \dots, n \qquad (*)$$
Suppose first that $\frac{a}{b^2} \geq 2$. From (*), we get that $\alpha_i \geq 2\beta_i$ for $i = 1, 2, \dots, n$. Hence
$$b^2 = \prod_{i=1}^{n} p_i^{2\beta_i} \quad\text{divides}\quad \prod_{i=1}^{n} p_i^{\alpha_i} = a$$
So $a = kb^2$ for some $k \in \mathbb{N}$ with $k \geq 2$. Hence by (*), $\alpha_i = k\beta_i$ for $i = 1, 2, \dots, n$. So
$$a = \prod_{i=1}^{n} p_i^{\alpha_i} = \prod_{i=1}^{n} p_i^{k\beta_i} = b^k$$
Putting everything together, we find
$$kb^2 = a = b^k \quad\text{and so}\quad b^{k-2} = k$$
By Lemma 1.4(a), $k \in \{2, 3, 4\}$. One easily checks that $k = 2$ does not lead to a solution while $k = 3, 4$ lead to the solutions $(a, b) \in \{(27, 3), (16, 2)\}$.
Suppose next that $\frac{a}{b^2} < 2$. From (*), we get that $\alpha_i < 2\beta_i$ for $i = 1, 2, \dots, n$. Hence
$$a = \prod_{i=1}^{n} p_i^{\alpha_i} \quad\text{divides but doesn't equal}\quad \prod_{i=1}^{n} p_i^{2\beta_i} = b^2$$
So $b^2 = ka$ for some $k \in \mathbb{N}$ with $k \geq 2$.
Hence by (*), $\beta_i = k\alpha_i$ for $i = 1, 2, \dots, n$. So
$$b = \prod_{i=1}^{n} p_i^{\beta_i} = \prod_{i=1}^{n} p_i^{k\alpha_i} = a^k$$
Putting everything together, we find
$$ka = b^2 = (a^k)^2 = a^{2k} \quad\text{and so}\quad a^{2k-1} = k$$
By Lemma 1.4(b), there are no solutions.
All couples $(a, b) \in \mathbb{N}_0^2$ with $a^{b^2} = b^a$ are $\{(1, 1), (16, 2), (27, 3)\}$. □

1.3 Pythagorean Triples

In this section, we will find all the solutions $(x, y, z) \in \mathbb{N}_0^3$ of the diophantine equation
$$x^2 + y^2 = z^2$$

Definition 1.5 (1) A triple $(x, y, z) \in \mathbb{N}_0^3$ is a Pythagorean triple if $x^2 + y^2 = z^2$.
(2) A Pythagorean triple $(x, y, z)$ is primitive if $\gcd(x, y, z) = 1$.

Remarks : (1) If $(x, y, z)$ is a Pythagorean triple, then so is $(dx, dy, dz)$ for any $d \in \mathbb{N}_0$.
(2) Let $(x, y, z)$ be a Pythagorean triple. Put $d = \gcd(x, y, z)$. Then $x = dx'$, $y = dy'$ and $z = dz'$ for some $x', y', z' \in \mathbb{N}_0$. So $(dx')^2 + (dy')^2 = (dz')^2$. Hence $x'^2 + y'^2 = z'^2$. Note that $\gcd(x', y', z') = 1$. So $(x', y', z')$ is a primitive Pythagorean triple and $(x, y, z) = (dx', dy', dz')$. Hence it is enough to find all primitive Pythagorean triples.
(3) Let $(x, y, z)$ be a Pythagorean triple. Then $\gcd(x, y, z) = 1 \Leftrightarrow \gcd(x, y) = 1 \Leftrightarrow \gcd(y, z) = 1 \Leftrightarrow \gcd(x, z) = 1$.

Lemma 1.6 Let $(x, y, z)$ be a primitive Pythagorean triple. Then either $x$ is even and $y$ is odd or $x$ is odd and $y$ is even.

Proof : Since $\gcd(x, y) = 1$, we get that $x$ and $y$ can not both be even. Suppose $x$ and $y$ are both odd. Then
$$z^2 \equiv x^2 + y^2 \equiv 1 + 1 \equiv 2 \pmod 4$$
a contradiction since 2 is not a quadratic residue modulo 4. So either $x$ is even and $y$ is odd or $x$ is odd and $y$ is even. □

If $(x, y, z)$ is a primitive Pythagorean triple, then so is $(y, x, z)$. By Lemma 1.6, it is enough to find all primitive Pythagorean triples with $x$ even and $y$ odd.

Theorem 1.7 Let $(x, y, z) \in \mathbb{N}_0^3$ with $x$ even. Then $(x, y, z)$ is a primitive Pythagorean triple if and only if
$$(x, y, z) = (2mn, m^2 - n^2, m^2 + n^2)$$
for some $m, n \in \mathbb{N}_0$ with $m > n$, $\gcd(m, n) = 1$ and $m$ and $n$ not both odd.

Proof : Suppose $(x, y, z)$ is a primitive Pythagorean triple.
By Lemma 1.6, we get that $y$ is odd. Since $x^2 + y^2 = z^2$, we get that $z$ is odd. Hence $z + y$ and $z - y$ are even. We get that $x^2 = z^2 - y^2 = (z + y)(z - y)$ and so
$$\left(\frac{x}{2}\right)^2 = \left(\frac{z+y}{2}\right)\left(\frac{z-y}{2}\right)$$
Put $d = \gcd\left(\frac{z+y}{2}, \frac{z-y}{2}\right)$. Then
$$d \mid \left(\frac{z+y}{2} + \frac{z-y}{2}\right) \quad\text{and}\quad d \mid \left(\frac{z+y}{2} - \frac{z-y}{2}\right)$$
So $d \mid z$ and $d \mid y$. Hence $d \mid \gcd(z, y)$. But $\gcd(z, y) = 1$. So $d = 1$. By HW3 #1, there exist $m, n \in \mathbb{N}$ such that
$$\frac{z+y}{2} = m^2 \quad\text{and}\quad \frac{z-y}{2} = n^2$$
Then $\gcd(m, n) = 1$ since $\gcd\left(\frac{z+y}{2}, \frac{z-y}{2}\right) = 1$. We easily get that
$$x = 2mn, \quad y = m^2 - n^2 \quad\text{and}\quad z = m^2 + n^2$$
Since $y > 0$, we have that $m > n$. Since $\gcd(m, n) = 1$, $m$ and $n$ can not both be even. If $m$ and $n$ are both odd, then $y$ is even, a contradiction. So $m$ and $n$ are not both odd.
One easily checks that $(2mn, m^2 - n^2, m^2 + n^2)$ is a primitive Pythagorean triple if $m, n \in \mathbb{N}_0$ such that $\gcd(m, n) = 1$, $m > n$ and not both $m$ and $n$ are odd. □

Remark : It follows now that all Pythagorean triples $(x, y, z)$ are given by
$$(x, y, z) \in \{d(2mn, m^2 - n^2, m^2 + n^2),\ d(m^2 - n^2, 2mn, m^2 + n^2)\}$$
where $d, m, n \in \mathbb{N}_0$, $m > n$, $\gcd(m, n) = 1$ and not both $m$ and $n$ are odd.

Example : Find all Pythagorean triples $(x, y, z)$ such that one of the variables has value 15.
We may assume that $(x, y, z) = d(2mn, m^2 - n^2, m^2 + n^2)$ where $d, m, n \in \mathbb{N}_0$, $m > n$, $\gcd(m, n) = 1$ and not both $m$ and $n$ are odd. Since $2dmn$ is even, either $d(m^2 - n^2) = 15$ or $d(m^2 + n^2) = 15$. So $d \mid 15$. Hence $d \in \{1, 3, 5, 15\}$.
Suppose first that $d(m^2 + n^2) = 15$. We easily get (by going over all the cases for $d$) that the only solution is $(d, m, n) = (3, 2, 1)$ and so $(x, y, z) = (12, 9, 15)$.
Suppose next that $d(m^2 - n^2) = 15$. Then $d(m - n)(m + n) = 15$. Note that $m - n < m + n$. This leads to the following possibilities :
$$(d, m - n, m + n) \in \{(1, 1, 15), (1, 3, 5), (3, 1, 5), (5, 1, 3)\}$$
Hence $(d, m, n) \in \{(1, 8, 7), (1, 4, 1), (3, 3, 2), (5, 2, 1)\}$.
So
$$(x, y, z) \in \{(112, 15, 113), (8, 15, 17), (36, 15, 39), (20, 15, 25)\}$$
All Pythagorean triples $(x, y, z)$ such that one of the variables has value 15 are given by
$$(12, 9, 15),\ (112, 15, 113),\ (8, 15, 17),\ (36, 15, 39),\ (20, 15, 25)$$
$$(9, 12, 15),\ (15, 112, 113),\ (15, 8, 17),\ (15, 36, 39),\ (15, 20, 25)$$

1.4 The Chord-Tangent Method

In this section, we illustrate with an example how to find all rational points on an irreducible quadratic curve if one rational point is known. This method is called the Chord-Tangent Method of Diophantus.

We want to find all the rational points on the circle $C$ with equation $x^2 + y^2 = 1$. Clearly, the point $P = (0, -1)$ is a rational point on $C$. Suppose $Q$ is another rational point on $C$. Then the line through $P$ and $Q$ is either vertical or has a rational slope. Conversely, let $l$ be a line through $P$ that is either vertical or has a rational slope. Then this line $l$ will intersect the circle $C$ in a second point $Q$, which turns out to be rational. So all the rational points on $C$ can be found by intersecting $C$ with lines through $P$ that are either vertical or have a rational slope.

We now find that second point of intersection between the circle $C$ and such a line $l$. The vertical line through $(0, -1)$ intersects the circle in $(0, -1)$ and $(0, 1)$. So suppose $l$ has a rational slope. Then an equation for $l$ is $y = mx - 1$ where $m \in \mathbb{Q}$. So the points of intersection between the circle $C$ and the line $l$ are the solutions of
$$y = mx - 1, \qquad x^2 + y^2 = 1$$
We get that
$$x^2 + (mx - 1)^2 = 1 \quad\text{or}\quad (m^2 + 1)x^2 - 2mx = 0$$
This is a quadratic equation in $x$ and we know that $x = 0$ is a solution. The second solution is
$$x = \frac{2m}{m^2 + 1}$$
So
$$y = mx - 1 = m \cdot \frac{2m}{m^2 + 1} - 1 = \frac{m^2 - 1}{m^2 + 1}$$
Hence the second point of intersection is
$$Q = \left(\frac{2m}{m^2 + 1}, \frac{m^2 - 1}{m^2 + 1}\right)$$
Notice that we get the point $(0, 1)$ if we consider the limit as $m \to +\infty$ in the expression for $Q$. This is normal since we can view a vertical line as a line with infinite slope.
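The formula for $Q$ can be evaluated with exact rational arithmetic. The following is a minimal Python sketch; the function name `second_point` is a hypothetical choice made for this illustration, and the slope is assumed to be given as an integer or a `Fraction`.

```python
from fractions import Fraction


def second_point(m):
    """Second intersection of the line y = m*x - 1 with the circle x^2 + y^2 = 1."""
    m = Fraction(m)
    denominator = m * m + 1          # never zero for a rational slope
    return (2 * m / denominator, (m * m - 1) / denominator)


# A few rational slopes; every resulting point lies exactly on the circle.
for slope in [Fraction(1, 2), Fraction(2), Fraction(-3, 7)]:
    x, y = second_point(slope)
    print(slope, (x, y), x * x + y * y == 1)
```

The slope $m = 0$ returns $P = (0, -1)$ itself, since the line $y = -1$ is tangent to $C$ at $P$.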
We get that all the rational points on $x^2 + y^2 = 1$ are
$$\{(0, 1)\} \cup \left\{\left(\frac{2m}{m^2 + 1}, \frac{m^2 - 1}{m^2 + 1}\right) : m \in \mathbb{Q}\right\}$$

1.5 The Method of Infinite Descent

In this section, we prove that the diophantine equation $x^4 + y^4 = z^4$ has no solutions with $xyz \neq 0$. We begin with the following lemma.

Lemma 1.8 Let $x, y, z \in \mathbb{N}_0$ with $x^4 + y^4 = z^2$. Then there exist $a, b, c \in \mathbb{N}_0$ such that $a^4 + b^4 = c^2$ and $c < z$.

Proof : Suppose first that $\gcd(x, y) > 1$. Then there exists a prime $p$ such that $p \mid x$ and $p \mid y$. So $x = pa$ and $y = pb$ for some $a, b \in \mathbb{N}_0$. Hence $z^2 = x^4 + y^4 = p^4(a^4 + b^4)$. So $p^4 \mid z^2$. Hence $p^2 \mid z$. So $z = p^2 c$ for some $c \in \mathbb{N}_0$. Then $a^4 + b^4 = c^2$ and $c < z$.
Suppose next that $\gcd(x, y) = 1$. Note that $(x^2)^2 + (y^2)^2 = z^2$ and that $\gcd(x^2, y^2, z) = 1$. So $(x^2, y^2, z)$ is a primitive Pythagorean triple. We may assume that $y^2$ is even. Then by Theorem 1.7, we get that there exist $m, n \in \mathbb{N}_0$ such that $\gcd(m, n) = 1$, $m > n$, not both $m$ and $n$ are odd and
$$x^2 = m^2 - n^2, \quad y^2 = 2mn \quad\text{and}\quad z = m^2 + n^2$$
So $x^2 + n^2 = m^2$. Since $\gcd(m, n) = 1$, we get that $(x, n, m)$ is a primitive Pythagorean triple. Since $y^2$ is even, we get that $x^2$ (and so also $x$) is odd. Hence $n$ is even. By Theorem 1.7, there exist $r, s \in \mathbb{N}_0$ such that $\gcd(r, s) = 1$, $r > s$, not both $r$ and $s$ are odd and
$$x = r^2 - s^2, \quad n = 2rs \quad\text{and}\quad m = r^2 + s^2$$
Since $x^2 + n^2 = m^2$, we get that $m^2$ (and so also $m$) is odd. Since $\gcd(m, n) = 1$ and $m$ is odd, we get that $\gcd(m, 2n) = 1$. But $y^2 = m(2n)$. So there exist $w, c \in \mathbb{N}_0$ such that $2n = w^2$ and $m = c^2$. Clearly, $w$ is even, say $w = 2v$ with $v \in \mathbb{N}_0$. Then we have that $4v^2 = w^2 = 2n = 4rs$ and so $rs = v^2$. Since $\gcd(r, s) = 1$, there exist $a, b \in \mathbb{N}_0$ such that $r = a^2$ and $s = b^2$. Since $r^2 + s^2 = m$, we have that
$$a^4 + b^4 = c^2$$
Note that $z = m^2 + n^2 > m^2 \geq m = c^2 \geq c$. □

Corollary 1.9 Let $x, y, z \in \mathbb{N}$ such that $x^4 + y^4 = z^2$. Then $xyz = 0$.

Proof : Suppose that $xyz \neq 0$. Put $x_0 = x$, $y_0 = y$ and $z_0 = z$. By Lemma 1.8, there exist $x_1, y_1, z_1 \in \mathbb{N}_0$ such that $x_1^4 + y_1^4 = z_1^2$ and $z_1 < z_0$.
Continuing to apply Lemma 1.8, we get that for all $n \in \mathbb{N}$, there exist $x_n, y_n, z_n \in \mathbb{N}_0$ such that $x_n^4 + y_n^4 = z_n^2$ and $z_0 > z_1 > z_2 > \cdots$. This is impossible since $z_n \in \mathbb{N}_0$ for all $n \in \mathbb{N}$. Hence $xyz = 0$. □

Theorem 1.10 (Fermat) Let $x, y, z \in \mathbb{N}$ with $x^4 + y^4 = z^4$. Then $xyz = 0$.

Proof : Note that $x^4 + y^4 = (z^2)^2$. By Corollary 1.9, $xyz^2 = 0$. Hence $xyz = 0$. □

1.6 Sums of Two Squares

In this section, we study the diophantine equation
$$x^2 + y^2 = n$$
For which $n$ does this equation have a solution?

Definition 1.11 Let $n \in \mathbb{N}$. We say that $n$ is the sum of two squares if there exist $x, y \in \mathbb{N}$ with $x^2 + y^2 = n$.

We start with a little lemma that tells us that certain numbers are not the sum of two squares.

Lemma 1.12 Let $n \in \mathbb{N}$ with $n \equiv 3 \pmod 4$. Then $n$ is not the sum of two squares.

Proof : Suppose that $n = x^2 + y^2$ for some $x, y \in \mathbb{N}$. Note that $a^2 \equiv 0 \pmod 4$ or $a^2 \equiv 1 \pmod 4$ for all $a \in \mathbb{N}$. Hence $x^2 + y^2 \equiv 0, 1$ or $2 \pmod 4$, a contradiction since $n \equiv 3 \pmod 4$. □

Being the sum of two squares is a 'multiplicative' property :

Lemma 1.13 Let $m, n \in \mathbb{N}$. If $m$ and $n$ are both the sum of two squares then $mn$ is also the sum of two squares.

Proof : Suppose that $m = a^2 + b^2$ and $n = c^2 + d^2$ for some $a, b, c, d \in \mathbb{N}$. One easily checks that
$$mn = (a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2$$
So $mn$ is the sum of two squares. □

Corollary 1.14 Let $a_i \in \mathbb{N}$ for $i = 1, 2, \dots, n$. If $a_i$ is the sum of two squares for $i = 1, 2, \dots, n$ then $a_1 a_2 \cdots a_n$ is the sum of two squares.

Proof : This follows from Lemma 1.13 by induction on $n$. □

Lemma 1.13 makes the following question quite natural : which primes are the sum of two squares? If $p$ is an odd prime that is the sum of two squares then $p \equiv 1 \pmod 4$ by Lemma 1.12. It turns out that this condition is also sufficient. In order to prove this, we use a special type of complex numbers.

Definition 1.15 (1) A Gaussian integer is a complex number of the form $a + bi$ with $a, b \in \mathbb{Z}$.
(2) The set of all Gaussian integers is denoted by $\mathbb{Z}[i]$. Note that $\mathbb{Z} \subset \mathbb{Z}[i]$.
(3) Let $\alpha, \beta \in \mathbb{Z}[i]$ with $\beta \neq 0$. We say that $\beta$ divides $\alpha$ (notation : $\beta \mid \alpha$) if $\alpha = \beta\gamma$ for some $\gamma \in \mathbb{Z}[i]$.
(4) Let $\alpha \in \mathbb{Z}[i]$. Then $\alpha$ is a unit if $\alpha\beta = 1$ for some $\beta \in \mathbb{Z}[i]$.
(5) Let $\alpha := a + bi \in \mathbb{Z}[i]$. We define the norm of $\alpha$ (notation : $N(\alpha)$) as the natural number $N(\alpha) = a^2 + b^2$.

The norm function $N$ is quite a powerful tool to convert Gaussian integers into natural numbers.

Lemma 1.16 The following holds about Gaussian integers and the norm $N$ :
(a) $N(\alpha\beta) = N(\alpha)N(\beta)$ for all $\alpha, \beta \in \mathbb{Z}[i]$.
(b) Let $\alpha \in \mathbb{Z}[i]$. Then $\alpha$ is a unit if and only if $N(\alpha) = 1$.
(c) The units of $\mathbb{Z}[i]$ are $\{1, -1, i, -i\}$.

Proof : (a) Note that $N(\alpha) = |\alpha|^2$ (where $|\ |$ stands for the modulus of a complex number) for all $\alpha \in \mathbb{Z}[i]$. Since $|z_1 z_2| = |z_1||z_2|$ for all $z_1, z_2 \in \mathbb{C}$, we get that $N(\alpha\beta) = N(\alpha)N(\beta)$ for all $\alpha, \beta \in \mathbb{Z}[i]$.
(b) Suppose first that $\alpha$ is a unit. Hence $\alpha\beta = 1$ for some $\beta \in \mathbb{Z}[i]$. Applying the norm $N$ to both sides and using (a), we get
$$N(\alpha)N(\beta) = N(\alpha\beta) = N(1) = 1^2 + 0^2 = 1$$
Since $N(\alpha), N(\beta) \in \mathbb{N}$, we have that $N(\alpha) = N(\beta) = 1$.
Suppose next that $N(\alpha) = 1$. Put $\alpha = a + bi$ with $a, b \in \mathbb{Z}$. Then $a^2 + b^2 = 1$. Put $\beta = a - bi$. Then $\beta \in \mathbb{Z}[i]$ and
$$\alpha\beta = (a + bi)(a - bi) = a^2 + b^2 = 1$$
So $\alpha$ is a unit.
(c) Let $\alpha := a + bi \in \mathbb{Z}[i]$. By (b), we get that
$$\alpha \text{ is a unit} \Leftrightarrow N(\alpha) = 1 \Leftrightarrow a^2 + b^2 = 1$$
Since $a, b \in \mathbb{Z}$, we have $(a, b) \in \{(1, 0), (-1, 0), (0, 1), (0, -1)\}$. Hence $\alpha \in \{1, -1, i, -i\}$. □

The Gaussian integers $\mathbb{Z}[i]$ behave pretty much like the integers $\mathbb{Z}$. We need the concept of a Gaussian prime number in order to proceed. There are two ways of thinking of prime numbers over the integers. In general domains, this leads to two different concepts.

Definition 1.17 Let $\alpha \in \mathbb{Z}[i]$ such that $\alpha \neq 0$ and $\alpha$ is not a unit.
(a) We say that $\alpha$ is irreducible if
$$\forall \beta, \gamma \in \mathbb{Z}[i] : \alpha = \beta\gamma \Rightarrow \beta \text{ is a unit or } \gamma \text{ is a unit}$$
(b) $\alpha$ is reducible if $\alpha$ is not irreducible. Hence $\alpha$ is reducible if and only if $\alpha = \beta\gamma$ for some $\beta, \gamma \in \mathbb{Z}[i] \setminus \{1, -1, i, -i\}$.
(c) $\alpha$ is prime if
$$\forall \beta, \gamma \in \mathbb{Z}[i] : \alpha \mid (\beta\gamma) \Rightarrow \alpha \mid \beta \text{ or } \alpha \mid \gamma$$
To distinguish between $p \in \mathbb{Z}$ being prime and $\alpha \in \mathbb{Z}[i]$ being prime, we call a prime $p \in \mathbb{Z}$ a rational prime while a prime $\alpha \in \mathbb{Z}[i]$ is called a Gaussian prime.

In Math 251, we proved the following relation between being irreducible and being a prime in $\mathbb{Z}[i]$.

Proposition 1.18 Let $\alpha \in \mathbb{Z}[i]$. Then $\alpha$ is irreducible if and only if $\alpha$ is a Gaussian prime.

Examples
(a) 5 is a rational prime but not a Gaussian prime.
Note that $5 = (2 + i)(2 - i)$. Since neither $2 + i$ nor $2 - i$ is a unit, we get that 5 is reducible and hence not a Gaussian prime.
(b) 3 is both a rational prime and a Gaussian prime.
Suppose that $3 = \alpha\beta$ for some $\alpha, \beta \in \mathbb{Z}[i]$. Applying the norm $N$ to both sides, we get that
$$9 = 3^2 + 0^2 = N(3) = N(\alpha\beta) = N(\alpha)N(\beta)$$
Since $N(\alpha), N(\beta) \in \mathbb{N}$, we get that $(N(\alpha), N(\beta)) \in \{(1, 9), (9, 1), (3, 3)\}$. Note that there are no Gaussian integers with norm 3 (indeed, let $a, b \in \mathbb{Z}$ with $N(a + bi) = 3$; then $a^2 + b^2 = 3$, a contradiction since $a, b \in \mathbb{Z}$). Hence $N(\alpha) = 1$ or $N(\beta) = 1$. So $\alpha$ is a unit or $\beta$ is a unit. Hence 3 is irreducible and so a Gaussian prime.
(c) $2 + i$ is a Gaussian prime.
Suppose that $2 + i = \alpha\beta$ for some $\alpha, \beta \in \mathbb{Z}[i]$. Applying the norm $N$ to both sides, we get that
$$5 = 2^2 + 1^2 = N(2 + i) = N(\alpha\beta) = N(\alpha)N(\beta)$$
Since $N(\alpha), N(\beta) \in \mathbb{N}$, we get that $(N(\alpha), N(\beta)) \in \{(1, 5), (5, 1)\}$. Hence $N(\alpha) = 1$ or $N(\beta) = 1$. So $\alpha$ is a unit or $\beta$ is a unit. Hence $2 + i$ is irreducible and so a Gaussian prime.
Note that this example generalizes : if $\alpha \in \mathbb{Z}[i]$ such that $N(\alpha)$ is a rational prime, then $\alpha$ is a Gaussian prime.
(d) $1 - 3i$ is not a Gaussian prime.
How can we come up with a factorization of $1 - 3i$? The answer : use the norm! Suppose that $1 - 3i = \alpha\beta$ for some $\alpha, \beta \in \mathbb{Z}[i]$. Applying the norm $N$ to both sides, we get that
$$10 = 1^2 + (-3)^2 = N(1 - 3i) = N(\alpha\beta) = N(\alpha)N(\beta)$$
Since $N(\alpha), N(\beta) \in \mathbb{N}$, we may assume that $(N(\alpha), N(\beta)) \in \{(1, 10), (5, 2)\}$. If we can exclude $(5, 2)$ then $1 - 3i$ will be irreducible.
First, we'll find all $\alpha \in \mathbb{Z}[i]$ with $N(\alpha) = 5$. Putting $\alpha = a + bi$, we get that $N(\alpha) = a^2 + b^2 = 5$. Since $a, b \in \mathbb{Z}$ we get
$$(a, b) \in \{(1, 2), (1, -2), (-1, 2), (-1, -2), (2, 1), (2, -1), (-2, 1), (-2, -1)\}$$
So
$$\alpha \in \{1 + 2i, 1 - 2i, -1 + 2i, -1 - 2i, 2 + i, 2 - i, -2 + i, -2 - i\}$$
We don't have to check all eight possibilities. If $1 - 3i$ is divisible by $2 + i$, then it is also divisible by any unit times $2 + i$. So we can divide the eight possibilities in groups of four :
$$\{1 \cdot (2 + i), (-1) \cdot (2 + i), i \cdot (2 + i), (-i) \cdot (2 + i)\} = \{2 + i, -2 - i, -1 + 2i, 1 - 2i\}$$
$$\{1 \cdot (2 - i), (-1) \cdot (2 - i), i \cdot (2 - i), (-i) \cdot (2 - i)\} = \{2 - i, -2 + i, 1 + 2i, -1 - 2i\}$$
Now we try one element of each of these groups of four. If neither group works, $1 - 3i$ is irreducible; if one of the groups works then we have found a factorization of $1 - 3i$.
First, we try $2 + i$ : is $1 - 3i$ divisible by $2 + i$? We easily get
$$\frac{1 - 3i}{2 + i} = \frac{(1 - 3i)(2 - i)}{(2 + i)(2 - i)} = \frac{-1 - 7i}{5} = -\frac{1}{5} - \frac{7}{5}i \notin \mathbb{Z}[i]$$
So $1 - 3i$ is not divisible by $2 + i$. Next we try $2 - i$ :
$$\frac{1 - 3i}{2 - i} = \frac{(1 - 3i)(2 + i)}{(2 - i)(2 + i)} = \frac{5 - 5i}{5} = 1 - i \in \mathbb{Z}[i]$$
Hence we came up with the following factorization :
$$1 - 3i = (2 - i)(1 - i)$$
Since neither $2 - i$ nor $1 - i$ is a unit (in fact, they are Gaussian primes since their norms are rational primes), we get that $1 - 3i$ is reducible and hence not a Gaussian prime.

The following proposition shows the relation between rational primes being Gaussian primes and sums of squares.

Proposition 1.19 A rational prime $p$ is a Gaussian prime if and only if $p$ is not the sum of two squares.

Proof : Let $p$ be a rational prime. We will prove that $p$ is reducible if and only if $p$ is the sum of two squares.
Suppose first that $p$ is the sum of two squares, say $p = a^2 + b^2$ with $a, b \in \mathbb{N}$. Then $p = a^2 + b^2 = (a + bi)(a - bi)$. Note that neither $a + bi$ nor $a - bi$ is a unit (since $p$ is a rational prime, we have that $a \neq 0 \neq b$). Hence $p$ is reducible.
Suppose next that $p$ is reducible.
So $p = \alpha\beta$ for some $\alpha, \beta \in \mathbb{Z}[i] \setminus \{1, -1, i, -i\}$. Applying the norm to both sides, we find
$$p^2 = p^2 + 0^2 = N(p) = N(\alpha\beta) = N(\alpha)N(\beta)$$
Since $N(\alpha), N(\beta) \in \mathbb{N} \setminus \{1\}$ and $p$ is a rational prime, we get that $N(\alpha) = N(\beta) = p$. Put $\alpha = a + bi$ with $a, b \in \mathbb{Z}$. Then $p = N(\alpha) = a^2 + b^2$. So $p$ is the sum of two squares. □

Corollary 1.20 Let $p$ be a rational prime with $p \equiv 3 \pmod 4$. Then $p$ is a Gaussian prime.

Proof : By Lemma 1.12, $p$ is not the sum of two squares. So by Proposition 1.19, $p$ is a Gaussian prime. □

Next, we prove that a rational prime $p$ with $p \equiv 1 \pmod 4$ is not a Gaussian prime. We need Euler's Criterion (seen in Math 116).

Proposition 1.21 (Euler's Criterion) Let $p$ be an odd prime and $a \in \mathbb{Z}$ with $\gcd(a, p) = 1$. Then $a$ is a square modulo $p$ if and only if $a^{\frac{p-1}{2}} \equiv 1 \pmod p$.

Corollary 1.22 Let $p$ be an odd prime. Then $-1$ is a square modulo $p$ if and only if $p \equiv 1 \pmod 4$.

Proof : Suppose first that $p \equiv 1 \pmod 4$. Then $p = 4k + 1$ for some $k \in \mathbb{N}$. Hence
$$(-1)^{\frac{p-1}{2}} \equiv (-1)^{2k} \equiv 1 \pmod p$$
So by Euler's Criterion, $-1$ is a square modulo $p$.
Suppose next that $p \equiv 3 \pmod 4$. Then $p = 4k + 3$ for some $k \in \mathbb{N}$. Hence
$$(-1)^{\frac{p-1}{2}} \equiv (-1)^{2k+1} \equiv -1 \not\equiv 1 \pmod p$$
So by Euler's Criterion, $-1$ is not a square modulo $p$. □

Proposition 1.23 Let $p$ be a rational prime with $p \equiv 1 \pmod 4$. Then $p$ is not a Gaussian prime.

Proof : Suppose that $p$ is a Gaussian prime. By Corollary 1.22, there exists $n \in \mathbb{N}$ with $n^2 \equiv -1 \pmod p$. Hence $p \mid (n^2 + 1)$. So $p \mid ((n + i)(n - i))$. Since $p$ is a Gaussian prime, we get that $p \mid (n + i)$ or $p \mid (n - i)$. Hence there exist $a, b \in \mathbb{Z}$ with
$$n \pm i = p(a + bi) = pa + pbi$$
Hence $n = pa$ and $|pb| = 1$, a contradiction since $b \in \mathbb{Z}$ and $p$ is a rational prime. So $p$ is not a Gaussian prime. □

Theorem 1.24 Let $p$ be a rational prime. Then the following are equivalent :
(a) $p$ is a Gaussian prime.
(b) $p \equiv 3 \pmod 4$.
(c) $p$ is not the sum of two squares.

Proof : This follows from Proposition 1.19, Corollary 1.20, Proposition 1.23 and the fact that 2 is not a Gaussian prime (since $2 = (1 + i)(1 - i)$). □
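The equivalence of (b) and (c) in Theorem 1.24 can be checked by brute force for small primes. The Python sketch below is only a numerical sanity check, not a proof; the helper names `is_prime`, `is_sum_of_two_squares` and `counterexamples` are hypothetical names chosen for this illustration.

```python
from math import isqrt


def is_prime(n):
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


def is_sum_of_two_squares(n):
    """True if n = a^2 + b^2 for some natural numbers a, b (zero allowed)."""
    for a in range(isqrt(n) + 1):
        b_squared = n - a * a
        b = isqrt(b_squared)
        if b * b == b_squared:
            return True
    return False


def counterexamples(limit):
    """Primes p < limit for which 'p = 3 mod 4' and 'p is not a sum of two squares' disagree."""
    return [p for p in range(2, limit)
            if is_prime(p) and (p % 4 == 3) != (not is_sum_of_two_squares(p))]


print(counterexamples(1000))
```

An empty list is what Theorem 1.24 predicts for every choice of `limit`.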
We are now able to describe which natural numbers are the sum of two squares. Note that 0 and 1 are the sum of two squares.

Theorem 1.25 Let $n \in \mathbb{N}$ with $n \geq 2$ and let $n = \prod_{i=1}^{k} p_i^{\alpha_i}$ be the prime factorization of $n$. Then $n$ is the sum of two squares if and only if for $i = 1, 2, \dots, k$, we have that $\alpha_i$ is even if $p_i \equiv 3 \pmod 4$.

Proof : Assume first that for $i = 1, 2, \dots, k$, we have that $\alpha_i$ is even if $p_i \equiv 3 \pmod 4$. Note that 2 is the sum of two squares. If $p$ is an odd rational prime, then $p$ is the sum of two squares if $p \equiv 1 \pmod 4$ by Theorem 1.24 and $p^2$ is the sum of two squares if $p \equiv 3 \pmod 4$. So $n$ is the sum of two squares by Corollary 1.14.
Assume next that $n$ is the sum of two squares. Suppose that there exists $i \in \{1, 2, \dots, k\}$ such that $p_i \equiv 3 \pmod 4$ and $\alpha_i$ is odd, say $i = 1$. Then $n = a^2 + b^2$ for some $a, b \in \mathbb{N}$. Let $f = \gcd(a, b)$. Put $a = fc$ and $b = fd$, so that $\gcd(c, d) = 1$. Then
$$n = a^2 + b^2 = f^2c^2 + f^2d^2 = f^2(c^2 + d^2)$$
So $f^2 \mid n$. Put $n = f^2 m$. Then we have that
$$m = c^2 + d^2$$
Let $f = \prod_{i=1}^{k} p_i^{\delta_i}$ be the prime factorization of $f$. Then
$$m = \frac{n}{f^2} = \frac{\prod_{i=1}^{k} p_i^{\alpha_i}}{\prod_{i=1}^{k} p_i^{2\delta_i}} = \prod_{i=1}^{k} p_i^{\alpha_i - 2\delta_i}$$
Since $\alpha_1$ is odd, we get that $\alpha_1 - 2\delta_1 \geq 1$. So $p_1 \mid m$. If $p_1 \mid c$, then $p_1 \mid d^2$ since $d^2 = m - c^2$ and so $p_1 \mid d$ since $p_1$ is a prime, a contradiction since $\gcd(c, d) = 1$. Hence $\gcd(p_1, c) = 1$. As seen in Math 116, there exists $t \in \mathbb{N}$ with $ct \equiv d \pmod{p_1}$ (indeed, the equation $cx \equiv d \pmod{p_1}$ has a solution). So we get
$$0 \equiv m \equiv c^2 + d^2 \equiv c^2 + (ct)^2 \equiv c^2(1 + t^2) \pmod{p_1}$$
Since $\gcd(p_1, c) = 1$, we get that
$$1 + t^2 \equiv 0 \pmod{p_1}$$
So $-1$ is a square modulo $p_1$, a contradiction to Corollary 1.22. Hence for $i = 1, 2, \dots, k$, we have that $\alpha_i$ is even if $p_i \equiv 3 \pmod 4$. □

1.7 Sums of k-th Powers

In the previous section, we found all the natural numbers that can be written as a sum of two squares. We now could ask the question : which natural numbers can be written as the sum of three squares? The answer is a theorem similar to Theorem 1.25.
Note that not every natural number can be written as the sum of three squares (7, 15, 23, ... are all examples; in fact, if $n \equiv 7 \pmod 8$ then $n$ can not be written as the sum of three squares). Lagrange however proved the following amazing result :

Theorem 1.26 (Lagrange, 1770) Every natural number can be written as the sum of four squares.

In 1770, Edward Waring suggested the following generalization : Given $k \in \mathbb{N}$, does there exist $g(k) \in \mathbb{N}$ such that every $n \in \mathbb{N}$ can be written as the sum of $g(k)$ $k$-th powers of natural numbers? He also conjectured that $g(3) = 9$ and $g(4) = 19$. In 1909, Hilbert proved that the function $g(k)$ indeed exists. In the same year, it was proven that $g(3) = 9$. Only in 1986 was it shown that $g(4) = 19$.

We finish this section with a result from Euler (that we can write now as) :

Theorem 1.27 $g(k) \geq 2^k + \left[\left(\frac{3}{2}\right)^k\right] - 2$ for all $k \geq 2$.

Notice that in Euler's time, it was only conjectured that $g(k)$ existed. Euler's proof is constructive but his lower bound is truly remarkable :
• Mahler proved in 1957 that $g(k) \neq 2^k + \left[\left(\frac{3}{2}\right)^k\right] - 2$ for at most a finite number of values for $k$.
• Stemmler, Kubina and Wunderlich showed in 1990 that $g(k) = 2^k + \left[\left(\frac{3}{2}\right)^k\right] - 2$ for all $k \leq 471{,}600{,}000$.

1.8 The Pell Equation $x^2 - ny^2 = 1$

Let $n \in \mathbb{N}$ be a number that is not a perfect square. Put
$$\mathbb{Q}(\sqrt{n}) = \{a + b\sqrt{n} \mid a, b \in \mathbb{Q}\} \quad\text{and}\quad \mathbb{Z}[\sqrt{n}] = \{a + b\sqrt{n} \mid a, b \in \mathbb{Z}\}$$
Since $\sqrt{n}$ is irrational, every element of $\mathbb{Q}(\sqrt{n})$ can be written uniquely as $a + b\sqrt{n}$ for some $a, b \in \mathbb{Q}$. Recall that the norm of $a + b\sqrt{n}$ is $N(a + b\sqrt{n}) = a^2 - nb^2$.
We proved that $\alpha \in \mathbb{Z}[\sqrt{n}]$ is a unit if and only if $N(\alpha) = \pm 1$. Putting $\alpha = x + y\sqrt{n}$ where $x, y \in \mathbb{Z}$, this leads to solving the Diophantine equations
$$x^2 - ny^2 = -1 \quad\text{or}\quad x^2 - ny^2 = 1$$
In this section, we concentrate on the latter :
$$x^2 - ny^2 = 1$$
This Diophantine equation is called the Pell equation or Pell-Fermat equation. We want to find all $(x, y) \in \mathbb{N}^2$ with $x^2 - ny^2 = 1$. Clearly $(x, y) = (1, 0)$ is a natural solution. This solution is called the trivial solution.
We also put $(x_0, y_0) = (1, 0)$. Are there any other solutions? We will prove that there are infinitely many solutions that can be described in terms of a 'smallest' or 'fundamental' solution.

Suppose that $(x_1, y_1), (x_2, y_2) \in \mathbb{N}^2$ are solutions of $x^2 - ny^2 = 1$. Then we have that
$$x_1 < x_2 \Leftrightarrow x_1^2 < x_2^2 \Leftrightarrow 1 + ny_1^2 < 1 + ny_2^2 \Leftrightarrow y_1^2 < y_2^2 \Leftrightarrow y_1 < y_2$$
So we can order the natural solutions of $x^2 - ny^2 = 1$ :
$$(x_1, y_1) < (x_2, y_2) \Leftrightarrow x_1 < x_2 \Leftrightarrow y_1 < y_2$$
We start by proving that the equation $x^2 - ny^2 = 1$ always has a non-trivial solution. We give a non-constructive existence proof that relies heavily on the Pigeonhole Principle :
• If we distribute more than $k$ pigeons into $k$ holes then at least one hole contains more than one pigeon.
• If we distribute infinitely many pigeons into finitely many holes then at least one hole contains infinitely many pigeons.
The main step is Dirichlet's Approximation Theorem.

Theorem 1.28 [Dirichlet] Let $n, B \in \mathbb{N}_0$ with $n$ not a perfect square. Then there exist $a, b \in \mathbb{Z}$ such that $1 \leq b \leq B$ and $|a - b\sqrt{n}| < \frac{1}{B}$.

Proof : For $k = 1, 2, \dots, B + 1$, put $\alpha_k = k\sqrt{n} - [k\sqrt{n}]$ (where for $x \in \mathbb{R}$, $[x]$ is the integral part of $x$ : it is the largest integer smaller than or equal to $x$). Note that $\alpha_k$ is irrational and $0 < \alpha_k < 1$ for $k = 1, 2, \dots, B + 1$. Moreover, $\{\alpha_1, \alpha_2, \dots, \alpha_{B+1}\}$ are $B + 1$ different numbers because $\sqrt{n}$ is irrational. By the Pigeonhole Principle, we have that
$$|\alpha_k - \alpha_l| \leq \frac{1}{B} \quad\text{for some } 1 \leq k < l \leq B + 1$$
Put $a = [l\sqrt{n}] - [k\sqrt{n}]$ and $b = l - k$. Then $a, b \in \mathbb{Z}$, $1 \leq b = l - k \leq B$ and
$$|a - b\sqrt{n}| = |([l\sqrt{n}] - [k\sqrt{n}]) - (l - k)\sqrt{n}| = |(k\sqrt{n} - [k\sqrt{n}]) - (l\sqrt{n} - [l\sqrt{n}])| = |\alpha_k - \alpha_l| \leq \frac{1}{B}$$
Since $b \neq 0$, we have that $a - b\sqrt{n}$ is irrational and so $|a - b\sqrt{n}| < \frac{1}{B}$. □

Corollary 1.29 Let $n \in \mathbb{N}$ with $n$ not a perfect square. Then there are infinitely many couples $(a, b) \in \mathbb{Z} \times \mathbb{N}_0$ with $|a - b\sqrt{n}| < \frac{1}{b}$.

Proof : Suppose that there are only finitely many couples $(a, b) \in \mathbb{Z} \times \mathbb{N}_0$ with $|a - b\sqrt{n}| < \frac{1}{b}$, say $\{(a_1, b_1), \dots, (a_k, b_k)\}$.
Pick $B \in \mathbb{N}$ with $\frac{1}{B} \leq \min\{1, |a_1 - b_1\sqrt{n}|, \dots, |a_k - b_k\sqrt{n}|\}$. By Dirichlet's Approximation Theorem, there exists $(a, b) \in \mathbb{Z} \times \mathbb{N}_0$ such that $1 \leq b \leq B$ and $|a - b\sqrt{n}| < \frac{1}{B}$. Since $1 \leq b \leq B$, we have that $|a - b\sqrt{n}| < \frac{1}{B} \leq \frac{1}{b}$. Hence $(a, b) = (a_i, b_i)$ for some $1 \leq i \leq k$. But then
$$|a_i - b_i\sqrt{n}| = |a - b\sqrt{n}| < \frac{1}{B} \leq |a_i - b_i\sqrt{n}|$$
a contradiction. Hence there are infinitely many couples $(a, b) \in \mathbb{Z} \times \mathbb{N}_0$ with $|a - b\sqrt{n}| < \frac{1}{b}$. □

Corollary 1.30 Let $n \in \mathbb{N}$ with $n$ not a perfect square. Then there are infinitely many numbers $a - b\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ with the same norm.

Proof : By Corollary 1.29 there are infinitely many couples $(a, b) \in \mathbb{Z} \times \mathbb{N}_0$ with $|a - b\sqrt{n}| < \frac{1}{b}$. If $(a, b)$ is such a couple then
$$|a + b\sqrt{n}| = |(a - b\sqrt{n}) + 2b\sqrt{n}| \leq |a - b\sqrt{n}| + 2b\sqrt{n} < \frac{1}{b} + 2b\sqrt{n} < 3b\sqrt{n}$$
and so
$$|N(a - b\sqrt{n})| = |a^2 - nb^2| = |a - b\sqrt{n}|\,|a + b\sqrt{n}| < \frac{1}{b} \cdot 3b\sqrt{n} = 3\sqrt{n}$$
Hence there are infinitely many numbers $a - b\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ with $|N(a - b\sqrt{n})| < 3\sqrt{n}$. Since the norm is an integer, it follows from the Pigeonhole Principle that there are infinitely many numbers $a - b\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ with the same norm. □

Corollary 1.31 There exists an integer $N$ and two different numbers $a_1 - b_1\sqrt{n},\ a_2 - b_2\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ such that $N(a_1 - b_1\sqrt{n}) = N(a_2 - b_2\sqrt{n}) = N$, $a_1 \equiv a_2 \pmod{|N|}$, $b_1 \equiv b_2 \pmod{|N|}$ and $(a_1 - b_1\sqrt{n})(a_2 - b_2\sqrt{n}) > 0$.

Proof : By Corollary 1.30, there exists $N \in \mathbb{Z}$ and infinitely many numbers $a - b\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ with $N(a - b\sqrt{n}) = N$. Since there are only finitely many congruence classes modulo $|N|$, it follows from the Pigeonhole Principle that there exist $p, q \in \{0, 1, \dots, |N| - 1\}$ such that infinitely many of these numbers $a - b\sqrt{n}$ satisfy $a \equiv p \pmod{|N|}$ and $b \equiv q \pmod{|N|}$. Since every number is either positive or negative, we get (using the Pigeonhole Principle one more time) that infinitely many of these numbers $a - b\sqrt{n}$ have the same sign.
So among these numbers there are two different numbers $a_1 - b_1\sqrt{n},\ a_2 - b_2\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ such that $N(a_1 - b_1\sqrt{n}) = N(a_2 - b_2\sqrt{n}) = N$, $a_1 \equiv a_2 \pmod{|N|}$, $b_1 \equiv b_2 \pmod{|N|}$ and $(a_1 - b_1\sqrt{n})(a_2 - b_2\sqrt{n}) > 0$. □

We are now ready to prove the existence of a non-trivial solution to the Pell-Fermat equation.

Theorem 1.32 Let $n \in \mathbb{N}$ with $n$ not a perfect square. Then the equation $x^2 - ny^2 = 1$ has a non-trivial solution. So there exists $(x, y) \in \mathbb{N}^2$ such that $x^2 - ny^2 = 1$ and $(x, y) \neq (1, 0)$.

Proof : Let $N$ and $a_1 - b_1\sqrt{n},\ a_2 - b_2\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ be as in Corollary 1.31. Since $a_2^2 - nb_2^2 = N(a_2 - b_2\sqrt{n}) = N$, we get that
$$\frac{a_1 - b_1\sqrt{n}}{a_2 - b_2\sqrt{n}} = \frac{(a_1 - b_1\sqrt{n})(a_2 + b_2\sqrt{n})}{(a_2 - b_2\sqrt{n})(a_2 + b_2\sqrt{n})} = \frac{a_1a_2 - nb_1b_2}{N} + \frac{a_1b_2 - b_1a_2}{N}\sqrt{n}$$
Recall that $a_1^2 - nb_1^2 = N(a_1 - b_1\sqrt{n}) = N$, $a_1 \equiv a_2 \pmod{|N|}$ and $b_1 \equiv b_2 \pmod{|N|}$. Hence
$$a_1a_2 - nb_1b_2 \equiv a_1a_1 - nb_1b_1 \equiv a_1^2 - nb_1^2 \equiv 0 \pmod{|N|}$$
So $a := \frac{a_1a_2 - nb_1b_2}{N} \in \mathbb{Z}$. Similarly, we get that $b := \frac{a_1b_2 - b_1a_2}{N} \in \mathbb{Z}$. Hence
$$a + b\sqrt{n} = \frac{a_1 - b_1\sqrt{n}}{a_2 - b_2\sqrt{n}} \in \mathbb{Z}[\sqrt{n}]$$
Since $N(a_1 - b_1\sqrt{n}) = N(a_2 - b_2\sqrt{n})$, we get that
$$a^2 - nb^2 = N(a + b\sqrt{n}) = N\left(\frac{a_1 - b_1\sqrt{n}}{a_2 - b_2\sqrt{n}}\right) = \frac{N(a_1 - b_1\sqrt{n})}{N(a_2 - b_2\sqrt{n})} = 1$$
Since $(a_1 - b_1\sqrt{n})(a_2 - b_2\sqrt{n}) > 0$, we get that $a + b\sqrt{n} > 0$. Finally, since $a_1 - b_1\sqrt{n}$ and $a_2 - b_2\sqrt{n}$ are different numbers, we have that $a + b\sqrt{n} \neq 1$. So $(|a|, |b|)$ is a non-trivial solution of $x^2 - ny^2 = 1$. □

So the equation $x^2 - ny^2 = 1$ has a non-trivial solution. Recall that we can order the natural solutions of $x^2 - ny^2 = 1$. We call the smallest non-trivial natural solution of $x^2 - ny^2 = 1$ the fundamental solution of $x^2 - ny^2 = 1$ and denote it by $(x_1, y_1)$.

By trial and error, we compiled a list of the fundamental solution for some small square-free values of $n$ :
$$\begin{array}{c|cccccc}
n & 2 & 3 & 5 & 6 & 7 & 10 \\
\hline
(x_1, y_1) & (3, 2) & (2, 1) & (9, 4) & (5, 2) & (8, 3) & (19, 6)
\end{array}$$
Later, we will use continued fractions to find the fundamental solution which can be enormous, even for relatively small values of $n$.
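The trial-and-error search behind this list can be sketched in Python. This is a hedged illustration only: the function name `fundamental_solution` and the cut-off parameter `y_limit` are assumptions introduced here, and the search simply tries $y = 1, 2, 3, \dots$ until $1 + ny^2$ is a perfect square (the smallest such $y$ gives the smallest non-trivial solution, by the ordering of the solutions).

```python
from math import isqrt


def fundamental_solution(n, y_limit=10**6):
    """Smallest (x, y) with y >= 1 and x^2 - n*y^2 == 1, or None if y_limit is exceeded."""
    root = isqrt(n)
    if root * root == n:
        raise ValueError("n must not be a perfect square")
    for y in range(1, y_limit + 1):
        x_squared = 1 + n * y * y
        x = isqrt(x_squared)
        if x * x == x_squared:       # 1 + n*y^2 is a perfect square
            return (x, y)
    return None                      # no solution found below the cut-off


for n in [2, 3, 5, 6, 7, 10]:
    print(n, fundamental_solution(n))
```

A direct search of this kind becomes slow as soon as the fundamental solution is large.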
For example, the fundamental solution of $x^2 - 61y^2 = 1$ is $(1766319049, 226153980)$.

Let $(x_1, y_1)$ be the fundamental solution of $x^2 - ny^2 = 1$. For $k \in \mathbb{N}$, we have that $(x_1 + y_1\sqrt{n})^k \in \mathbb{Z}[\sqrt{n}]$. Hence we can define $x_k, y_k \in \mathbb{N}$ by
\[ x_k + y_k\sqrt{n} = (x_1 + y_1\sqrt{n})^k \quad \text{for } k = 0, 1, 2, \dots \]
Note that it follows from Newton's binomial theorem that $x_k, y_k$ are indeed natural numbers for all $k \ge 0$. We now prove that all the natural solutions of $x^2 - ny^2 = 1$ are $\{(x_k, y_k) \mid k = 0, 1, 2, \dots\}$.

One final remark : let $x, y \in \mathbb{Q}$ with $x^2 - ny^2 = 1$ (so $N(x + y\sqrt{n}) = 1$). Then
\[ \frac{1}{x + y\sqrt{n}} = \frac{x - y\sqrt{n}}{(x + y\sqrt{n})(x - y\sqrt{n})} = \frac{x - y\sqrt{n}}{x^2 - ny^2} = x - y\sqrt{n}. \]

Theorem 1.33 Let $n \in \mathbb{N}$ with $n$ not a perfect square and let $(x_1, y_1)$ be the fundamental solution of $x^2 - ny^2 = 1$. For $k \ge 0$, define $x_k, y_k \in \mathbb{N}$ by $x_k + y_k\sqrt{n} = (x_1 + y_1\sqrt{n})^k$. Then the following holds :

(a) All the natural solutions of $x^2 - ny^2 = 1$ are given by $\{(x_k, y_k) \mid k = 0, 1, 2, \dots\}$.

(b) For all $k \ge 0$, we have that
\[ x_k = \frac{(x_1 + y_1\sqrt{n})^k + (x_1 - y_1\sqrt{n})^k}{2}, \qquad y_k = \frac{(x_1 + y_1\sqrt{n})^k - (x_1 - y_1\sqrt{n})^k}{2\sqrt{n}}. \]

(c) For all $k \ge 1$, we have that
\[ x_k = x_1x_{k-1} + ny_1y_{k-1}, \qquad y_k = y_1x_{k-1} + x_1y_{k-1}. \]

Proof : (a) Using the norm, we get that for all $k \ge 0$
\[ x_k^2 - ny_k^2 = N(x_k + y_k\sqrt{n}) = N\big((x_1 + y_1\sqrt{n})^k\big) = \big(N(x_1 + y_1\sqrt{n})\big)^k = (x_1^2 - ny_1^2)^k = 1^k = 1. \]
So $(x_k, y_k)$ is a natural solution of $x^2 - ny^2 = 1$ for all $k \ge 0$.

Suppose that $x, y \in \mathbb{N}$ with $x^2 - ny^2 = 1$. Since $(x_1 + y_1\sqrt{n})^0 = 1$ and $\lim_{k \to +\infty} (x_1 + y_1\sqrt{n})^k = +\infty$, it follows that there exists a unique $k \in \mathbb{N}$ such that
\[ (x_1 + y_1\sqrt{n})^k \le x + y\sqrt{n} < (x_1 + y_1\sqrt{n})^{k+1} \qquad (1) \]
Note that
\[ \frac{x + y\sqrt{n}}{(x_1 + y_1\sqrt{n})^k} = \frac{x + y\sqrt{n}}{x_k + y_k\sqrt{n}} = (x + y\sqrt{n})(x_k - y_k\sqrt{n}) = xx_k - nyy_k + (yx_k - xy_k)\sqrt{n} := a + b\sqrt{n} \]
where $a, b \in \mathbb{Z}$.
Applying the norm, we find
\[ a^2 - nb^2 = N(a + b\sqrt{n}) = N\big((x + y\sqrt{n})(x_k - y_k\sqrt{n})\big) = N(x + y\sqrt{n})\,N(x_k - y_k\sqrt{n}) = (x^2 - ny^2)(x_k^2 - ny_k^2) = 1. \]
It follows from (1) that
\[ 1 = \frac{(x_1 + y_1\sqrt{n})^k}{(x_1 + y_1\sqrt{n})^k} \le \frac{x + y\sqrt{n}}{(x_1 + y_1\sqrt{n})^k} = a + b\sqrt{n} < \frac{(x_1 + y_1\sqrt{n})^{k+1}}{(x_1 + y_1\sqrt{n})^k} = x_1 + y_1\sqrt{n} \qquad (2) \]
Thus
\[ \frac{1}{x_1 + y_1\sqrt{n}} < \frac{1}{a + b\sqrt{n}} \le 1. \]
Hence
\[ x_1 - y_1\sqrt{n} < a - b\sqrt{n} \le 1 \]
and so
\[ -1 \le -a + b\sqrt{n} < -x_1 + y_1\sqrt{n} < 0 \qquad (3) \]
Adding (2) and (3), we get that
\[ 0 \le 2b\sqrt{n} < 2y_1\sqrt{n}. \]
So $0 \le b < y_1$. Since $a^2 - nb^2 = 1$ and $(x_1, y_1)$ is the fundamental solution of $X^2 - nY^2 = 1$, we get that $(|a|, b) = (1, 0)$. It follows from (2) that $a = 1$. So
\[ x + y\sqrt{n} = (a + b\sqrt{n})(x_1 + y_1\sqrt{n})^k = (x_1 + y_1\sqrt{n})^k = x_k + y_k\sqrt{n}. \]
Hence $(x, y) = (x_k, y_k)$, which proves (a).

(b) Pick $k \ge 0$. Recall that
\[ x_k + y_k\sqrt{n} = (x_1 + y_1\sqrt{n})^k \qquad (4) \]
Hence
\[ x_k - y_k\sqrt{n} = \frac{1}{x_k + y_k\sqrt{n}} = \frac{1}{(x_1 + y_1\sqrt{n})^k} = \left(\frac{1}{x_1 + y_1\sqrt{n}}\right)^{k} = (x_1 - y_1\sqrt{n})^k \qquad (5) \]
Solving equations (4) and (5) for $x_k$ and $y_k$, we get
\[ x_k = \frac{(x_1 + y_1\sqrt{n})^k + (x_1 - y_1\sqrt{n})^k}{2} \quad \text{and} \quad y_k = \frac{(x_1 + y_1\sqrt{n})^k - (x_1 - y_1\sqrt{n})^k}{2\sqrt{n}} \]
which proves (b).

(c) Pick $k \ge 1$. Then
\[ \begin{aligned} x_k + y_k\sqrt{n} &= (x_1 + y_1\sqrt{n})^k \\ &= (x_1 + y_1\sqrt{n})(x_1 + y_1\sqrt{n})^{k-1} \\ &= (x_1 + y_1\sqrt{n})(x_{k-1} + y_{k-1}\sqrt{n}) \\ &= x_1x_{k-1} + ny_1y_{k-1} + \sqrt{n}\,(x_1y_{k-1} + y_1x_{k-1}). \end{aligned} \]
Hence
\[ x_k = x_1x_{k-1} + ny_1y_{k-1} \quad \text{and} \quad y_k = x_1y_{k-1} + y_1x_{k-1} \]
which proves (c). $\square$

Example : Consider the equation $x^2 - 3y^2 = 1$. We 'see' that $(2, 1)$ is the fundamental solution. Hence all the natural solutions of $x^2 - 3y^2 = 1$ are given by
\[ \left\{ \left( \frac{(2 + \sqrt{3})^k + (2 - \sqrt{3})^k}{2},\ \frac{(2 + \sqrt{3})^k - (2 - \sqrt{3})^k}{2\sqrt{3}} \right) \ \middle|\ k \in \mathbb{N} \right\}. \]
We can also get these solutions by considering
\[ x_k + y_k\sqrt{3} = (2 + \sqrt{3})^k \quad \text{for } k = 0, 1, 2, \dots \]

Chapter 2 Analytic Number Theory

2.1 The Riemann-Zeta Function

2.1.1 The Riemann-Zeta Function and the Euler Product

Riemann saw that there was a connection between the distribution of primes and the zeros of a function, now called the zeta function. The zeta function is a function of a complex variable traditionally denoted by $s$ with real part $\sigma$ and imaginary part $t$ (so $s = \sigma + it$).
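Lemma 2.1 The series $\sum_{n=1}^{+\infty} \frac{1}{n^s}$ converges absolutely and uniformly on any compact subset of the half plane $\mathrm{Re}(s) > 1$.

Proof : Let $T$ be a compact subset of the half plane $\mathrm{Re}(s) > 1$. Then there exists $p > 1$ such that $\mathrm{Re}(s) \ge p$ for all $s \in T$. Note that
\[ \left|\frac{1}{n^s}\right| = \frac{1}{|n^s|} = \frac{1}{|e^{\ln(n)s}|} = \frac{1}{e^{\ln(n)\sigma}} = \frac{1}{n^\sigma} \le \frac{1}{n^p} \quad \text{for all } s \in T. \]
Since the series $\sum_{n=1}^{+\infty} \frac{1}{n^p}$ converges (it's a $p$-series with $p > 1$), we get that the series $\sum_{n=1}^{+\infty} \frac{1}{n^s}$ converges absolutely and uniformly on $T$ by the Weierstrass $M$-test. $\square$

Definition 2.2 We define the Riemann-zeta function $\zeta(s)$ by
\[ \zeta(s) = \sum_{n=1}^{+\infty} \frac{1}{n^s} \quad \text{for all } s \in \mathbb{C} \text{ with } \mathrm{Re}(s) > 1. \]
It follows from Lemma 2.1 that $\zeta(s)$ is an analytic function on the half plane $\mathrm{Re}(s) > 1$. The following property of the zeta function will be used to prove that certain sets of prime numbers are infinite.

Lemma 2.3 For $s \in \mathbb{R}$ with $s > 1$, we have
\[ \lim_{s \to 1^+} \zeta(s) = +\infty. \]

Proof : For all $s \in (1, +\infty)$ and all integers $N > 1$, we have that
\[ \sum_{n=2}^{N} \frac{1}{n^s} < \int_1^N \frac{dx}{x^s} = \frac{1 - N^{1-s}}{s - 1} < \sum_{n=1}^{N-1} \frac{1}{n^s}. \]
Taking the limit as $N$ goes to $+\infty$, we find
\[ \sum_{n=2}^{+\infty} \frac{1}{n^s} \le \frac{1}{s - 1} \le \sum_{n=1}^{+\infty} \frac{1}{n^s} \quad \text{for all } s \in (1, +\infty). \]
Hence
\[ \zeta(s) - 1 \le \frac{1}{s - 1} \le \zeta(s) \quad \text{for all } s \in (1, +\infty). \]
So
\[ \frac{1}{s - 1} \le \zeta(s) \le \frac{1}{s - 1} + 1 \quad \text{for all } s \in (1, +\infty). \]
Hence $\lim_{s \to 1^+} \zeta(s) = +\infty$. $\square$

The following theorem gives a relation between the zeta function and prime numbers.

Theorem 2.4 (Euler Product for $\zeta(s)$) For all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$, we have that
\[ \zeta(s) = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}. \]

Proof : Pick $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Let $p_1 < p_2 < p_3 < \cdots$ be the list of all primes. Then by definition, we have
\[ \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} = \lim_{k \to +\infty} \prod_{i=1}^{k} \frac{1}{1 - p_i^{-s}}. \]
Note that for any prime $p$,
\[ \sum_{j=0}^{+\infty} p^{-js} = \frac{1}{1 - p^{-s}} \]
where the series converges absolutely. Using the Cauchy Product for absolutely convergent series, we get that for any $k \ge 1$
\[ \prod_{i=1}^{k} \frac{1}{1 - p_i^{-s}} = \prod_{i=1}^{k} \left( \sum_{j=0}^{+\infty} p_i^{-js} \right) = \sum_{\alpha_1, \dots, \alpha_k = 0}^{+\infty} (p_1^{\alpha_1} \cdots p_k^{\alpha_k})^{-s} = \sum_{n \in \mathbb{N}_k} \frac{1}{n^s} \]
where $\mathbb{N}_k$ is the set of natural numbers that have no prime divisors bigger than $p_k$.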
Note that $\mathbb{N}_1 \subseteq \mathbb{N}_2 \subseteq \mathbb{N}_3 \subseteq \cdots$ and $\bigcup_{k=1}^{+\infty} \mathbb{N}_k = \mathbb{N}$. Since the series $\sum_{n=1}^{+\infty} \frac{1}{n^s}$ converges absolutely, we get
\[ \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} = \lim_{k \to +\infty} \prod_{i=1}^{k} \frac{1}{1 - p_i^{-s}} = \lim_{k \to +\infty} \sum_{n \in \mathbb{N}_k} \frac{1}{n^s} = \sum_{n=1}^{+\infty} \frac{1}{n^s} = \zeta(s). \qquad \square \]

This Euler Product leads to an easy analytic proof of Euclid's Theorem.

Theorem 2.5 (Euclid) There are infinitely many prime numbers.

Proof : Suppose that there are only finitely many prime numbers, say $p_1 < p_2 < \cdots < p_k$. Then it follows from the Euler Product that
\[ \lim_{s \to 1^+} \zeta(s) = \lim_{s \to 1^+} \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} = \lim_{s \to 1^+} \prod_{i=1}^{k} \frac{1}{1 - p_i^{-s}} = \prod_{i=1}^{k} \frac{1}{1 - p_i^{-1}} < +\infty. \]
This is a contradiction to Lemma 2.3. Hence there are infinitely many prime numbers. $\square$

2.1.2 Analytic Continuation of ζ(s) and the Riemann Hypothesis

The zeta function $\zeta(s)$ is only defined for $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Euler proved that the series
\[ \eta(s) = \sum_{n=1}^{+\infty} \frac{(-1)^{n-1}}{n^s} \]
is an analytic function on the set $\{s \in \mathbb{C} \setminus \{1\} \mid \mathrm{Re}(s) > 0\}$ and that
\[ \zeta(s) = \frac{\eta(s)}{1 - 2^{1-s}} \quad \text{for all } s \in \mathbb{C} \text{ with } \mathrm{Re}(s) > 1. \]
So $\frac{\eta(s)}{1 - 2^{1-s}}$ is an analytic continuation of $\zeta(s)$. It follows from the theory of analytic continuation that this is the only way to extend the definition of $\zeta(s)$ in an analytic way to $\mathrm{Re}(s) > 0$.

Riemann showed that there exists an analytic continuation $\tilde{\zeta}(s)$ of $\zeta(s)$ that is defined over $\mathbb{C} \setminus \{1\}$ (again, this continuation is unique). For $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 0$, put
\[ \Gamma(s) = \int_0^{+\infty} x^{s-1} e^{-x}\,dx. \]
Then $\Gamma(s)$ extends to an analytic function on $\mathbb{C} \setminus \{0, -1, -2, \dots\}$. Riemann proved that
\[ \tilde{\zeta}(s) = 2^s \pi^{s-1} \sin\!\left(\frac{\pi s}{2}\right) \Gamma(1 - s)\, \zeta(1 - s) \quad \text{for all } s \in \mathbb{C} \text{ with } \mathrm{Re}(s) < 0 \qquad (*) \]
It follows from this formula that $\tilde{\zeta}(s) = 0$ for $s = -2, -4, -6, \dots$. These are called the trivial zeros of $\tilde{\zeta}(s)$. Using the Euler Product, one can show that $\zeta(s) \ne 0$ for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Since $\Gamma(s) \ne 0$ for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 0$, it follows from (*) that $\tilde{\zeta}(s)$ has no non-trivial zeros with $\mathrm{Re}(s) < 0$. So if $s$ is a non-trivial zero of $\tilde{\zeta}(s)$ then $0 \le \mathrm{Re}(s) \le 1$. It turns out that the non-trivial zeros of $\tilde{\zeta}(s)$ are symmetric around the line $\mathrm{Re}(s) = \frac{1}{2}$ : for $s$ in the strip $0 \le \mathrm{Re}(s) \le 1$ we have $\tilde{\zeta}(s) = 0 \Leftrightarrow \tilde{\zeta}(1 - s) = 0$. This leads to Riemann's famous conjecture :

Riemann Hypothesis : If $s$ is a non-trivial zero of $\tilde{\zeta}(s)$ then $\mathrm{Re}(s) = \frac{1}{2}$.

In 1896, Hadamard and de la Vallée-Poussin were able to prove the following : If $s$ is a non-trivial zero of $\tilde{\zeta}(s)$ then $\mathrm{Re}(s) \ne 1$.

2.2 The Prime Number Theorem

For $x \ge 0$, let $\pi(x)$ be the number of primes less than or equal to $x$. In 1859, Riemann found a connection between $\pi(x)$ and the non-trivial zeros of $\tilde{\zeta}(s)$. He also showed how a proof of his conjecture would result in a proof that $\pi(x)$ is asymptotic to $\frac{x}{\ln(x)}$. In 1896, this major result in number theory was finally proven by Hadamard and de la Vallée-Poussin (although they were unable to prove Riemann's conjecture, they did prove a related result).

Theorem 2.6 (Prime Number Theorem)
\[ \lim_{x \to +\infty} \frac{\pi(x)}{\dfrac{x}{\ln(x)}} = \lim_{x \to +\infty} \frac{\pi(x)}{\displaystyle\int_2^x \frac{dt}{\ln(t)}} = 1. \]

In 1949, Selberg and Erdős found elementary proofs of the Prime Number Theorem (in number theory, an elementary proof is a proof that does not use complex analysis or abstract algebra; it can still be extremely complicated).

2.3 Dirichlet's Theorem

This section is devoted to the proof of the following theorem (due to Dirichlet) : Let $a, m \in \mathbb{N}$ with $\gcd(a, m) = 1$. Then there exist infinitely many primes $p$ with $p \equiv a \bmod m$.

For certain values of $a$ and $m$, one can prove quite easily that there are infinitely many primes $p$ with $p \equiv a \bmod m$. For example, if $a = 1$ and $m = 2$, we need to prove that there are infinitely many odd primes. This easily follows from Euclid's Theorem (there are infinitely many primes) and the fact that 2 is the only even prime. The following theorem gives an elementary proof of Dirichlet's Theorem in the special case that $a = 3$ and $m = 4$.

Theorem 2.7 There are infinitely many primes $p$ with $p \equiv 3 \bmod 4$.
Proof : Suppose there are only finitely many primes $p$ with $p \equiv 3 \bmod 4$, say $p_1 < p_2 < \cdots < p_k$. Put $N = p_1^2 p_2^2 \cdots p_k^2 + 2$. Modulo 4, we find
\[ N \equiv p_1^2 p_2^2 \cdots p_k^2 + 2 \equiv 3^2 \cdot 3^2 \cdots 3^2 + 2 \equiv 1 \cdot 1 \cdots 1 + 2 \equiv 1 + 2 \equiv 3 \bmod 4. \]
In particular, $N$ is odd. So 2 does not divide $N$. If $p_i$ divides $N$ for some $i = 1, 2, \dots, k$ then $p_i \mid (N - p_1^2 p_2^2 \cdots p_k^2)$ and so $p_i$ divides 2, a contradiction. Hence if a prime $p$ divides $N$ then $p \equiv 1 \bmod 4$. So the prime factorization of $N$ is of the form
\[ N = q_1 q_2 \cdots q_n \]
where $q_i$ is a prime and $q_i \equiv 1 \bmod 4$ for $i = 1, 2, \dots, n$. Again considering modulo 4, we find that
\[ N \equiv q_1 q_2 \cdots q_n \equiv 1 \cdot 1 \cdots 1 \equiv 1 \bmod 4, \]
a contradiction since $N \equiv 3 \bmod 4$. Hence there are infinitely many primes $p$ with $p \equiv 3 \bmod 4$. $\square$

2.3.1 Group Characters

Definition 2.8 Let $(G, \cdot)$ be a finite abelian group.

(1) A character of $G$ is a homomorphism $\varphi : G \to (\mathbb{C}_0, \cdot)$.

(2) $\hat{G}$ is the set of all characters of $G$.

(3) For $\varphi, \psi \in \hat{G}$, we define the map $\varphi \cdot \psi : G \to \mathbb{C}_0 : g \mapsto \varphi(g)\psi(g)$.

One easily checks that

• $(\hat{G}, \cdot)$ is an abelian group with identity element $\varphi_0 : G \to \mathbb{C}_0 : g \mapsto 1$ (also called the trivial character)

• the inverse of $\varphi \in \hat{G}$ is the character $\varphi^{-1} : G \to \mathbb{C}_0 : g \mapsto \frac{1}{\varphi(g)}$

The following proposition gives us some basic properties about characters.

Proposition 2.9 Let $G$ be a finite abelian group. Then the following holds :

(a) Let $\varphi \in \hat{G}$, $g \in G$ and $n \in \mathbb{N}$ with $g^n = 1_G$. Then $\varphi(g)$ is an $n$-th root of unity. In particular, $\frac{1}{\varphi(g)} = \overline{\varphi(g)}$.

(b) $|\hat{G}| = |G|$

(c) Let $1_G \ne g \in G$. Then there exists $\varphi \in \hat{G}$ with $\varphi(g) \ne 1$.

Proof : (a) Since $\varphi$ is a homomorphism, we get
\[ (\varphi(g))^n = \varphi(g^n) = \varphi(1_G) = 1. \]
Hence $|\varphi(g)| = 1$. Put $\varphi(g) = a + bi$ with $a, b \in \mathbb{R}$. Then $a^2 + b^2 = |\varphi(g)|^2 = 1$. So
\[ \frac{1}{\varphi(g)} = \frac{1}{a + bi} = \frac{a - bi}{(a + bi)(a - bi)} = \frac{a - bi}{a^2 + b^2} = a - bi = \overline{\varphi(g)}. \]

(b) (c) It follows from the Fundamental Theorem for Finite Abelian Groups that $G$ is the direct sum of a finite number of cyclic subgroups (say $t$). So there exist $g_1, \dots, g_t \in G$ and $n_1, \dots, n_t \in \mathbb{N} \setminus \{0, 1\}$ such that $g_i$ is of order $n_i$ for $i = 1, 2, \dots, t$ and every $g \in G$ can be written uniquely as $g = g_1^{m_1} \cdots g_t^{m_t}$ with $0 \le m_i < n_i$ for $i = 1, 2, \dots, t$ (hence $G = \langle g_1 \rangle \times \langle g_2 \rangle \times \cdots \times \langle g_t \rangle$).

(b) For $n \in \mathbb{N} \setminus \{0\}$, put $C_n = \{c \in \mathbb{C} \mid c^n = 1\}$. Consider the map
\[ \theta : \hat{G} \to C_{n_1} \times C_{n_2} \times \cdots \times C_{n_t} : \varphi \mapsto (\varphi(g_1), \varphi(g_2), \dots, \varphi(g_t)). \]
By (a), $\theta$ is well-defined. Suppose that $\varphi, \psi \in \hat{G}$ with $\theta(\varphi) = \theta(\psi)$. So $\varphi(g_i) = \psi(g_i)$ for $i = 1, 2, \dots, t$. Pick $g \in G$. Then $g = g_1^{m_1} \cdots g_t^{m_t}$ where $0 \le m_i < n_i$ for $i = 1, 2, \dots, t$. Since $\varphi, \psi$ are homomorphisms, we get
\[ \varphi(g) = \varphi(g_1^{m_1} \cdots g_t^{m_t}) = (\varphi(g_1))^{m_1} \cdots (\varphi(g_t))^{m_t} = (\psi(g_1))^{m_1} \cdots (\psi(g_t))^{m_t} = \psi(g_1^{m_1} \cdots g_t^{m_t}) = \psi(g). \]
Hence $\varphi = \psi$ and $\theta$ is one-to-one. Pick $(\alpha_1, \dots, \alpha_t) \in C_{n_1} \times \cdots \times C_{n_t}$. Define the map
\[ \varphi : G \to \mathbb{C}_0 : g_1^{m_1} \cdots g_t^{m_t} \mapsto \alpha_1^{m_1} \cdots \alpha_t^{m_t} \]
where $0 \le m_i < n_i$ for $i = 1, 2, \dots, t$. One easily checks that $\varphi \in \hat{G}$ and $\theta(\varphi) = (\alpha_1, \dots, \alpha_t)$. So $\theta$ is onto. Hence $\theta$ is a bijection. So
\[ |\hat{G}| = |C_{n_1} \times \cdots \times C_{n_t}| = |C_{n_1}| \cdots |C_{n_t}| = n_1 \cdots n_t = |\langle g_1 \rangle| \cdots |\langle g_t \rangle| = |\langle g_1 \rangle \times \cdots \times \langle g_t \rangle| = |G| \]
which proves (b).

(c) We can write $g = g_1^{k_1} \cdots g_t^{k_t}$ with $0 \le k_i < n_i$ for $i = 1, 2, \dots, t$. Since $g \ne 1_G$, there exists $j \in \{1, 2, \dots, t\}$ with $k_j \ne 0$. Define the map
\[ \varphi : G \to \mathbb{C}_0 : g_1^{m_1} \cdots g_t^{m_t} \mapsto e^{\frac{2\pi m_j}{n_j} i} \]
where $0 \le m_i < n_i$ for $i = 1, 2, \dots, t$. One easily checks that $\varphi \in \hat{G}$ and $\varphi(g) = e^{\frac{2\pi k_j}{n_j} i} \ne 1$. $\square$

The characters of a finite abelian group satisfy some nice relations.

Proposition 2.10 (Orthogonality Relations) Let $G$ be a finite abelian group and $\varphi_0$ the trivial character. Then the following relations hold :

(a) For all $\varphi \in \hat{G}$, we have that
\[ \sum_{g \in G} \varphi(g) = \begin{cases} |G| & \text{if } \varphi = \varphi_0 \\ 0 & \text{if } \varphi \ne \varphi_0 \end{cases} \]

(b) For all $g \in G$, we have that
\[ \sum_{\varphi \in \hat{G}} \varphi(g) = \begin{cases} |G| & \text{if } g = 1_G \\ 0 & \text{if } g \ne 1_G \end{cases} \]

Proof : (a) Pick $\varphi \in \hat{G}$. If $\varphi = \varphi_0$ then
\[ \sum_{g \in G} \varphi(g) = \sum_{g \in G} 1 = |G|. \]
So we may assume that $\varphi \ne \varphi_0$. Then there exists $h \in G$ with $\varphi(h) \ne 1$. Note that $G = \{hg \mid g \in G\}$. Since $\varphi$ is a homomorphism, we get
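\[ \sum_{g \in G} \varphi(g) = \sum_{g \in G} \varphi(hg) = \sum_{g \in G} \varphi(h)\varphi(g) = \varphi(h) \left( \sum_{g \in G} \varphi(g) \right). \]
Hence $(\varphi(h) - 1) \left( \sum_{g \in G} \varphi(g) \right) = 0$. Since $\varphi(h) \ne 1$, we get that $\sum_{g \in G} \varphi(g) = 0$.

(b) Pick $g \in G$. If $g = 1_G$ then by Proposition 2.9(b)
\[ \sum_{\varphi \in \hat{G}} \varphi(g) = \sum_{\varphi \in \hat{G}} 1 = |\hat{G}| = |G|. \]
So we may assume that $g \ne 1_G$. By Proposition 2.9(c), there exists $\psi \in \hat{G}$ with $\psi(g) \ne 1$. Note that $\hat{G} = \{\psi\varphi \mid \varphi \in \hat{G}\}$. Then we get
\[ \sum_{\varphi \in \hat{G}} \varphi(g) = \sum_{\varphi \in \hat{G}} (\psi\varphi)(g) = \sum_{\varphi \in \hat{G}} \psi(g)\varphi(g) = \psi(g) \left( \sum_{\varphi \in \hat{G}} \varphi(g) \right). \]
Hence $(\psi(g) - 1) \left( \sum_{\varphi \in \hat{G}} \varphi(g) \right) = 0$. Since $\psi(g) \ne 1$, we get that $\sum_{\varphi \in \hat{G}} \varphi(g) = 0$. $\square$

2.3.2 Dirichlet Characters and L-Functions

Throughout this section, $m \in \mathbb{N}$ with $m \ge 2$.

Definition 2.11

(a) For $n \in \mathbb{Z}$, we put $\bar{n} = n + m\mathbb{Z} \in \mathbb{Z}/m\mathbb{Z}$.

(b) Put $\mathbb{Z}_m^* = \{\bar{n} \mid n \in \mathbb{Z},\ \gcd(m, n) = 1\}$. Note that $(\mathbb{Z}_m^*, \cdot)$ is an abelian group of order $\phi(m)$ (where $\phi$ is the Euler phi function).

(c) A Dirichlet character (mod $m$) is a character of $\mathbb{Z}_m^*$. We denote the trivial character of $\mathbb{Z}_m^*$ by $\chi_0$.

(d) Let $\chi$ be a Dirichlet character. We extend the definition of $\chi$ to $\mathbb{N}$ as follows :
\[ \chi : \mathbb{N} \to \mathbb{C} : n \mapsto \begin{cases} \chi(\bar{n}) & \text{if } \gcd(m, n) = 1 \\ 0 & \text{if } \gcd(m, n) \ne 1 \end{cases} \]
One easily checks that $\chi(n_1 n_2) = \chi(n_1)\chi(n_2)$ for all $n_1, n_2 \in \mathbb{N}$.

(e) The L-function associated to the Dirichlet character $\chi$ is the series
\[ L(s, \chi) = \sum_{n=1}^{+\infty} \frac{\chi(n)}{n^s}. \]

First, we prove that $L(s, \chi)$ is an analytic function on the half plane $\mathrm{Re}(s) > 0$ whenever $\chi$ is not the trivial Dirichlet character.

Theorem 2.12 Let $\langle a_n \rangle_{n \ge 1}$ be a sequence of complex numbers such that the sequence of partial sums $\left\langle \sum_{k=1}^{n} a_k \right\rangle_{n \ge 1}$ is bounded. Then the series $\sum_{n=1}^{+\infty} \frac{a_n}{n^s}$ converges to an analytic function on the half plane $\mathrm{Re}(s) > 0$.

Proof : (omitted) $\square$

Corollary 2.13 Let $\chi$ be a non-trivial Dirichlet character. Then $L(s, \chi)$ is an analytic function on the half plane $\mathrm{Re}(s) > 0$.

Proof : Pick $n \in \mathbb{N}$. Put $n = qm + r$ with $q, r \in \mathbb{N}$ and $0 \le r < m$. Then
\[ \sum_{k=1}^{n} \chi(k) = \left( \sum_{k=1}^{qm} \chi(k) \right) + \left( \sum_{k=1}^{r} \chi(qm + k) \right) = q \left( \sum_{k=1}^{m} \chi(k) \right) + \left( \sum_{k=1}^{r} \chi(k) \right). \]
By Proposition 2.10(a),
\[ \sum_{k=1}^{m} \chi(k) = \sum_{\substack{k=1 \\ \gcd(k,m)=1}}^{m} \chi(\bar{k}) = 0. \]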
So
\[ \left| \sum_{k=1}^{n} \chi(k) \right| = \left| \sum_{k=1}^{r} \chi(k) \right| \le \sum_{k=1}^{r} |\chi(k)| \le r < m \]
by Proposition 2.9(a). Hence the sequence $\left\langle \sum_{k=1}^{n} \chi(k) \right\rangle_{n \ge 1}$ is bounded. So by Theorem 2.12, the series $L(s, \chi) = \sum_{n=1}^{+\infty} \frac{\chi(n)}{n^s}$ is an analytic function on the half plane $\mathrm{Re}(s) > 0$. $\square$

For the trivial character, we can prove the following proposition.

Proposition 2.14 Let $\chi_0$ be the trivial Dirichlet character. Then $L(s, \chi_0)$ is an analytic function on the half plane $\mathrm{Re}(s) > 1$.

Proof : It's enough to prove that the series $\sum_{n=1}^{+\infty} \frac{\chi_0(n)}{n^s}$ converges uniformly on any compact subset of the half plane $\mathrm{Re}(s) > 1$. Let $T$ be a compact subset of the half plane $\mathrm{Re}(s) > 1$. Then there exists $p > 1$ such that $\mathrm{Re}(s) \ge p$ for all $s \in T$. So by Proposition 2.9(a), we get
\[ \left| \frac{\chi_0(n)}{n^s} \right| = \frac{|\chi_0(n)|}{|n^s|} \le \frac{1}{|n^s|} = \frac{1}{|e^{\ln(n)s}|} = \frac{1}{e^{\ln(n)\sigma}} = \frac{1}{n^\sigma} \le \frac{1}{n^p} \quad \text{for all } s \in T. \]
Since the series $\sum_{n=1}^{+\infty} \frac{1}{n^p}$ converges (it's a $p$-series with $p > 1$), we get that the series $\sum_{n=1}^{+\infty} \frac{\chi_0(n)}{n^s}$ converges uniformly on $T$ by the Weierstrass $M$-test. $\square$

As with the Euler Product for the zeta function, there is a relation between $L(s, \chi)$ and the prime numbers.

Theorem 2.15 (Euler Product for $L(s, \chi)$) Let $\chi$ be a Dirichlet character. Then for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$, we have that
\[ L(s, \chi) = \prod_{p \text{ prime}} \frac{1}{1 - \chi(p)p^{-s}}. \]

Proof : Pick $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Let $p_1 < p_2 < p_3 < \cdots$ be the list of all primes. Then by definition, we have
\[ \prod_{p \text{ prime}} \frac{1}{1 - \chi(p)p^{-s}} = \lim_{k \to +\infty} \prod_{i=1}^{k} \frac{1}{1 - \chi(p_i)p_i^{-s}}. \]
Note that for any prime $p$,
\[ \sum_{j=0}^{+\infty} \chi(p)^j p^{-js} = \frac{1}{1 - \chi(p)p^{-s}} \]
where the series converges absolutely. Using the Cauchy Product for absolutely convergent series and the fact that $\chi$ is completely multiplicative, we get
\[ \prod_{i=1}^{k} \frac{1}{1 - \chi(p_i)p_i^{-s}} = \prod_{i=1}^{k} \left( \sum_{j=0}^{+\infty} \chi(p_i)^j p_i^{-js} \right) = \sum_{\alpha_1, \dots, \alpha_k = 0}^{+\infty} \big(\chi(p_1)^{\alpha_1} \cdots \chi(p_k)^{\alpha_k}\big)(p_1^{\alpha_1} \cdots p_k^{\alpha_k})^{-s} = \sum_{n \in \mathbb{N}_k} \frac{\chi(n)}{n^s} \]
where $\mathbb{N}_k$ is the set of natural numbers that have no prime divisors bigger than $p_k$. Note that $\mathbb{N}_1 \subseteq \mathbb{N}_2 \subseteq \mathbb{N}_3 \subseteq \cdots$ and $\bigcup_{k=1}^{+\infty} \mathbb{N}_k = \mathbb{N}$.
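Since the series $\sum_{n=1}^{+\infty} \frac{\chi(n)}{n^s}$ converges absolutely (because $\left| \frac{\chi(n)}{n^s} \right| \le \frac{1}{|n^s|}$ and the series $\sum_{n=1}^{+\infty} \frac{1}{n^s}$ converges absolutely), we get
\[ \prod_{p \text{ prime}} \frac{1}{1 - \chi(p)p^{-s}} = \lim_{k \to +\infty} \prod_{i=1}^{k} \frac{1}{1 - \chi(p_i)p_i^{-s}} = \lim_{k \to +\infty} \sum_{n \in \mathbb{N}_k} \frac{\chi(n)}{n^s} = \sum_{n=1}^{+\infty} \frac{\chi(n)}{n^s} = L(s, \chi). \qquad \square \]

The following property of $L(s, \chi)$ is not so easy to prove.

Theorem 2.16 Let $\chi$ be a non-trivial Dirichlet character. Then $L(1, \chi) \ne 0$.

Proof : (omitted) $\square$

The trivial Dirichlet character behaves quite differently.

Theorem 2.17 Let $\chi_0$ be the trivial Dirichlet character. Then for $s \in \mathbb{R}$ with $s > 1$, we have
\[ \lim_{s \to 1^+} L(s, \chi_0) = +\infty. \]

Proof : Pick $s \in \mathbb{R}$ with $s > 1$. Using the Euler Product for $L(s, \chi_0)$ and $\zeta(s)$, we get
\[ \begin{aligned} L(s, \chi_0) &= \prod_{p \text{ prime}} \frac{1}{1 - \chi_0(p)p^{-s}} \\ &= \prod_{\substack{p \text{ prime} \\ p \nmid m}} \frac{1}{1 - p^{-s}} \\ &= \left( \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} \right) \left( \prod_{\substack{p \text{ prime} \\ p \mid m}} (1 - p^{-s}) \right) \\ &= \zeta(s) \prod_{\substack{p \text{ prime} \\ p \mid m}} (1 - p^{-s}). \end{aligned} \]
Note that
\[ \lim_{s \to 1^+} \prod_{\substack{p \text{ prime} \\ p \mid m}} (1 - p^{-s}) = \prod_{\substack{p \text{ prime} \\ p \mid m}} (1 - p^{-1}) \]
is a finite strictly positive real number. Hence it follows from Lemma 2.3 that $\lim_{s \to 1^+} L(s, \chi_0) = +\infty$. $\square$

Proposition 2.18 Let $\chi$ be a Dirichlet character. Then the following holds :

(a) The series $\sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right)$ converges to an analytic function $M(s, \chi)$ on the half plane $\mathrm{Re}(s) > 1$.

(b) $e^{M(s, \chi)} = L(s, \chi)$ for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$.

(c) There exists a function $\Phi(s, \chi)$ defined on the half plane $\mathrm{Re}(s) > 1$ such that $\Phi(s, \chi)$ is bounded on the half plane $\mathrm{Re}(s) > 1$ and
\[ M(s, \chi) = \Phi(s, \chi) + \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} \]
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$.

(d) $M(s, \chi)$ is bounded on $(1, +\infty)$ if $\chi$ is not the trivial character.

Proof : (a) Let $T$ be a compact subset of the half plane $\mathrm{Re}(s) > 1$. Then there exists $\sigma > 1$ such that $\mathrm{Re}(s) \ge \sigma$ for all $s \in T$. Note that
\[ \left| \frac{\chi(p)^k}{kp^{ks}} \right| \le \frac{1}{kp^{k\sigma}} \quad \text{for all } k \ge 1, \text{ all primes } p \text{ and all } s \in T \]
and
\[ \sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{1}{kp^{k\sigma}} \right) \le \sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{1}{p^{k\sigma}} \right) = \sum_{p \text{ prime}} \frac{1}{p^\sigma - 1} < 2 \sum_{p \text{ prime}} \frac{1}{p^\sigma} < 2 \sum_{n=1}^{+\infty} \frac{1}{n^\sigma} < +\infty \]
since the latter series is a $p$-series with $p = \sigma > 1$. Hence the series $\sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right)$ converges absolutely and uniformly on $T$.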
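Since $T$ was an arbitrary compact subset of the half plane $\mathrm{Re}(s) > 1$, we get that the series $\sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right)$ converges to an analytic function $M(s, \chi)$ on the half plane $\mathrm{Re}(s) > 1$.

(b) Let $p_1 < p_2 < p_3 < \cdots$ be the list of all primes. The above also shows that for $i = 1, 2, \dots$, the series $\sum_{k=1}^{+\infty} \frac{\chi(p_i)^k}{kp_i^{ks}}$ converges to an analytic function $f_i(s)$ on the half plane $\mathrm{Re}(s) > 1$. Recall the following from complex analysis :

The series $\sum_{k=1}^{+\infty} \frac{z^k}{k}$ converges to an analytic function $g(z)$ on $\{z \in \mathbb{C} \mid |z| < 1\}$ with $e^{g(z)} = \frac{1}{1 - z}$ for all $z \in \mathbb{C}$ with $|z| < 1$.

Note that $\left| \frac{\chi(p_i)}{p_i^s} \right| \le \frac{1}{p_i^{\mathrm{Re}(s)}} < 1$ for $i = 1, 2, \dots$ and all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Hence
\[ e^{f_i(s)} = \frac{1}{1 - \chi(p_i)p_i^{-s}} \quad \text{for } i = 1, 2, \dots \text{ and all } s \in \mathbb{C} \text{ with } \mathrm{Re}(s) > 1. \]
Since the sequence $\left\langle \sum_{i=1}^{n} f_i(s) \right\rangle_{n \ge 1}$ converges to $M(s, \chi)$ on the half plane $\mathrm{Re}(s) > 1$ and $e^z$ is analytic on $\mathbb{C}$, we get that the sequence $\left\langle e^{\sum_{i=1}^{n} f_i(s)} \right\rangle_{n \ge 1}$ converges to $e^{M(s, \chi)}$ on the half plane $\mathrm{Re}(s) > 1$. But
\[ e^{\sum_{i=1}^{n} f_i(s)} = \prod_{i=1}^{n} e^{f_i(s)} = \prod_{i=1}^{n} \frac{1}{1 - \chi(p_i)p_i^{-s}} \quad \text{for all } s \in \mathbb{C} \text{ with } \mathrm{Re}(s) > 1. \]
Hence by Theorem 2.15, we get
\[ e^{M(s, \chi)} = \lim_{n \to +\infty} \prod_{i=1}^{n} \frac{1}{1 - \chi(p_i)p_i^{-s}} = \prod_{p \text{ prime}} \frac{1}{1 - \chi(p)p^{-s}} = L(s, \chi). \]

(c) For $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$, put
\[ \Phi(s, \chi) = \sum_{p \text{ prime}} \left( \sum_{k=2}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right). \]
Since the series $\sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right)$ converges absolutely, we get that
\[ \begin{aligned} M(s, \chi) &= \sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right) \\ &= \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} + \sum_{p \text{ prime}} \left( \sum_{k=2}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right) \\ &= \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} + \Phi(s, \chi) \end{aligned} \]
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Again because of absolute convergence, we get that
\[ \begin{aligned} |\Phi(s, \chi)| &= \left| \sum_{p \text{ prime}} \left( \sum_{k=2}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right) \right| \\ &\le \sum_{p \text{ prime}} \left( \sum_{k=2}^{+\infty} \left| \frac{\chi(p)^k}{kp^{ks}} \right| \right) \\ &\le \sum_{p \text{ prime}} \left( \sum_{k=2}^{+\infty} \frac{1}{p^k} \right) \\ &= \sum_{p \text{ prime}} \frac{1}{p^2 - p} \\ &\le 2 \sum_{p \text{ prime}} \frac{1}{p^2} \\ &\le 2 \sum_{n=1}^{+\infty} \frac{1}{n^2} = 2\,\zeta(2) \end{aligned} \]
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. So $\Phi(s, \chi)$ is bounded on the half plane $\mathrm{Re}(s) > 1$.

(d) Suppose that $\chi$ is not the trivial character. Then $L(1, \chi) \ne 0$ by Theorem 2.16.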
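Hence, since $L(s, \chi)$ is analytic on the half plane $\mathrm{Re}(s) > 0$ by Corollary 2.13, there exists $0 < \varepsilon < 1$ such that $\frac{L'(s, \chi)}{L(s, \chi)}$ is analytic on $B(1, \varepsilon) := \{s \in \mathbb{C} \mid |s - 1| < \varepsilon\}$. By the Antiderivative Theorem, there exists a function $N(s, \chi)$ such that $N(s, \chi)$ is analytic on $B(1, \varepsilon)$ and $N'(s, \chi) = \frac{L'(s, \chi)}{L(s, \chi)}$ for all $s \in B(1, \varepsilon)$. Since $e^{M(s, \chi)} = L(s, \chi)$, we get that $M'(s, \chi) = \frac{L'(s, \chi)}{L(s, \chi)}$ for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. So $M'(s, \chi) = N'(s, \chi)$ for all $s \in B(1, \varepsilon)$ with $\mathrm{Re}(s) > 1$. Hence there exists $c \in \mathbb{C}$ such that
\[ M(s, \chi) = N(s, \chi) + c \quad \text{for all } s \in B(1, \varepsilon) \text{ with } \mathrm{Re}(s) > 1. \]
Since $N(s, \chi)$ is analytic on $B(1, \varepsilon)$, we get that $M(s, \chi)$ is bounded on $(1, 1 + \frac{\varepsilon}{2}]$. It follows from (a) that $M(s, \chi)$ is bounded on $[1 + \frac{\varepsilon}{2}, +\infty)$. So $M(s, \chi)$ is bounded on $(1, +\infty)$. $\square$

Theorem 2.19 (Dirichlet) Let $a, m \in \mathbb{N}$ with $\gcd(a, m) = 1$. Then there exist infinitely many primes $p$ with $p \equiv a \bmod m$.

Proof : Let $p$ be a prime. If $p \mid m$ then
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)}\,\chi(p) = 0. \]
So suppose that $\gcd(p, m) = 1$. By Proposition 2.9(a), we get that
\[ \overline{\chi(a)}\,\chi(p) = \overline{\chi(\bar{a})}\,\chi(\bar{p}) = (\chi(\bar{a}))^{-1}\chi(\bar{p}) = \chi(\bar{a}^{-1})\chi(\bar{p}) = \chi(\bar{a}^{-1}\bar{p}) \]
for all Dirichlet characters $\chi$. Hence by Proposition 2.10(b), we have
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)}\,\chi(p) = \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \chi(\bar{a}^{-1}\bar{p}) = \begin{cases} \phi(m) & \text{if } \bar{a}^{-1}\bar{p} = \bar{1} \\ 0 & \text{if } \bar{a}^{-1}\bar{p} \ne \bar{1} \end{cases} \]
Note that
\[ \bar{a}^{-1}\bar{p} = \bar{1} \iff \bar{p} = \bar{a} \iff p \equiv a \bmod m. \]
So we get
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)}\,\chi(p) = \begin{cases} \phi(m) & \text{if } p \equiv a \bmod m \\ 0 & \text{if } p \not\equiv a \bmod m \end{cases} \qquad (*) \]
Using Proposition 2.18(c), we find
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)}\,M(s, \chi) = \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} \left( \Phi(s, \chi) + \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} \right) \]
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. By Proposition 2.18(c), $\Omega(s) := \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)}\,\Phi(s, \chi)$ is bounded on the half plane $\mathrm{Re}(s) > 1$. Using (*) and absolute convergence, we get
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} \left( \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} \right) = \sum_{p \text{ prime}} \frac{1}{p^s} \left( \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)}\,\chi(p) \right) = \phi(m) \sum_{\substack{p \text{ prime} \\ p \equiv a \bmod m}} \frac{1}{p^s} \]
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Putting everything together, we get
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)}\,M(s, \chi) = \Omega(s) + \phi(m) \sum_{\substack{p \text{ prime} \\ p \equiv a \bmod m}} \frac{1}{p^s} \qquad (**) \]
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$.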
By Proposition 2.18(d),
\[ \sum_{\chi_0 \ne \chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)}\,M(s, \chi) \quad \text{is bounded on } (1, +\infty). \]
By Theorem 2.17 and Proposition 2.18(a)(b), we have that
\[ \lim_{s \to 1^+} \overline{\chi_0(a)}\,M(s, \chi_0) = \lim_{s \to 1^+} M(s, \chi_0) = +\infty. \]
Since $\Omega(s)$ is bounded on the half plane $\mathrm{Re}(s) > 1$, it follows from (**) that
\[ \phi(m) \sum_{\substack{p \text{ prime} \\ p \equiv a \bmod m}} \frac{1}{p^s} \]
is not bounded over $(1, +\infty)$. A sum of finitely many terms $\frac{1}{p^s}$ would be bounded on $(1, +\infty)$, so this sum must be an infinite series. So there exist infinitely many primes $p$ with $p \equiv a \bmod m$. $\square$

Chapter 3 Continued Fractions

3.1 Finite Continued Fractions

Definition 3.1 :

(1) Let $n \in \mathbb{N}$, $a_0 \in \mathbb{R}$ and $a_1, \dots, a_n \in \mathbb{R}_0^+$. The finite continued fraction with partial denominators $a_1, \dots, a_n$ (notation : $[a_0; a_1, \dots, a_n]$) is the number
\[ a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{a_n}}}}} \]

(2) The continued fraction $[a_0; a_1, \dots, a_n]$ is called simple if $a_0, a_1, \dots, a_n$ are integers. We will use abbreviations like FSCF for "finite, simple continued fraction", etc.

(3) For $x \in \mathbb{R}$, the integral part of $x$ (notation : $[x]$) is the biggest integer smaller than or equal to $x$. So $[\sqrt{2}] = 1$ and $[-3.2] = -4$.

Example : We get that
\[ [3; 2, 1, 2, 6] = 3 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{6}}}} = \frac{172}{51}. \]

Clearly, every FSCF is a rational number. The next theorem shows that the converse is also true.

Theorem 3.2 Every rational number can be written as a FSCF.

Proof : Note that every rational number $q$ can be written as $\frac{a}{b}$ where $a \in \mathbb{Z}$ and $b \in \mathbb{N}_0$. We prove the theorem by induction on $b$. If $b = 1$ then $q = a = [a]$. So suppose that every rational number of the form $\frac{a}{b}$ with $a \in \mathbb{Z}$ and $b \in \mathbb{N}_0$ can be written as a FSCF if $b = 1, 2, \dots, m - 1$ for some $m \ge 2$. Let $a \in \mathbb{Z}$. Put $q = \frac{a}{m}$. Using the Division Algorithm, we can write $a = mk + r$ with $k, r \in \mathbb{Z}$ and $0 \le r < m$. If $r = 0$ then $q = \frac{a}{m} = k = [k]$. So we may assume that $r \ne 0$. Then
\[ q = \frac{a}{m} = \frac{mk + r}{m} = k + \cfrac{1}{\frac{m}{r}}. \]
By induction, $\frac{m}{r}$ can be written as a FSCF, say $\frac{m}{r} = [b_0; b_1, b_2, \dots, b_n]$. So
\[ q = \frac{a}{m} = k + \cfrac{1}{\frac{m}{r}} = k + \cfrac{1}{b_0 + \cfrac{1}{b_1 + \cfrac{1}{\ddots + \cfrac{1}{b_{n-1} + \cfrac{1}{b_n}}}}} \]
Hence $q = [k; b_0, b_1, \dots, b_n]$. $\square$
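The induction in the proof above is effectively a procedure: divide, record the quotient $k$, and continue with $\frac{m}{r}$. A minimal Python sketch of this idea follows. The names `fscf` and `evaluate` are illustrative choices and not notation from these notes; `evaluate` is included only to check the result by folding the continued fraction back into an ordinary fraction.

```python
from fractions import Fraction


def fscf(a, b):
    """Partial quotients [a0, a1, ..., an] of the rational number a/b, b >= 1.

    Follows the proof of Theorem 3.2: write a = b*k + r with 0 <= r < b,
    record k, and continue with b/r until the remainder is 0.
    """
    if b < 1:
        raise ValueError("the denominator b must be a positive integer")
    terms = []
    while True:
        k, r = divmod(a, b)   # for b > 0 Python guarantees 0 <= r < b, even if a < 0
        terms.append(k)
        if r == 0:
            return terms
        a, b = b, r           # a/b = k + 1/(b/r), so continue with b/r


def evaluate(terms):
    """Value of the finite continued fraction [a0; a1, ..., an] as a Fraction."""
    value = Fraction(terms[-1])
    for t in reversed(terms[:-1]):
        value = t + 1 / value
    return value


print(fscf(172, 51))                 # [3, 2, 1, 2, 6]
print(evaluate([3, 2, 1, 2, 6]))     # 172/51
```

Because the remainders strictly decrease, the loop stops after finitely many steps, which mirrors the induction on the denominator.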
This proof provides us with an algorithm to represent a rational number $q$ as a FSCF : $a_0 = [q]$; $a_1 = \left[ \frac{1}{q - [q]} \right]$, etc.

Writing a Rational Number x as a FSCF

(1) Put $x_0 = x$ and $a_0 = [x_0]$.

(2) Suppose that we already have $x_0, x_1, \dots, x_n$ and $a_0, a_1, \dots, a_n$ for some $n \ge 0$.

    (a) If $a_n = x_n$, the algorithm stops. We get that $x = [a_0; a_1, a_2, \dots, a_n]$.

    (b) If $a_n \ne x_n$, put $x_{n+1} = \frac{1}{x_n - a_n}$ and $a_{n+1} = [x_{n+1}]$.

Example : Write $-\frac{31}{25}$ as a FSCF.

    $x_0 = -\frac{31}{25}$, so $a_0 = -2$
    $x_1 = \dfrac{1}{-\frac{31}{25} - (-2)} = \frac{25}{19}$, so $a_1 = 1$
    $x_2 = \dfrac{1}{\frac{25}{19} - 1} = \frac{19}{6}$, so $a_2 = 3$
    $x_3 = \dfrac{1}{\frac{19}{6} - 3} = 6$, so $a_3 = 6$ and we stop

Hence we get
\[ -\frac{31}{25} = [-2; 1, 3, 6]. \]

We end this section by noting that $[a_0; a_1, a_2, \dots, a_{n-1}, a_n, 1] = [a_0; a_1, a_2, \dots, a_{n-1}, a_n + 1]$. So we can write a rational number in at least two ways as a FSCF. It turns out that these are the only ways to represent a rational number as a FSCF.

3.2 Convergents of a Continued Fraction

The previous section might have left the impression that evaluating finite continued fractions involves a lot of calculations. Moreover, if we extend our continued fraction, we have to start all over again : in order to calculate $[1; 2, 3, 4]$ we don't use any of the calculations we needed to evaluate $[1; 2, 3]$. In this section, we develop an iterative method to evaluate finite continued fractions that resolves these problems.

Definition 3.3 : Let $[a_0; a_1, \dots, a_n]$ be a FCF.

(1) For $k = 0, 1, \dots, n$, we define the $k$-th convergent of $[a_0; a_1, \dots, a_n]$ (notation : $C_k$) as the number $[a_0; a_1, \dots, a_k]$.

(2) For $k = -2, -1, 0, 1, \dots, n$, we define numbers $p_k$ and $q_k$ as follows :
\[ \begin{aligned} p_{-2} &= 0, & p_{-1} &= 1, & p_k &= a_k p_{k-1} + p_{k-2} \\ q_{-2} &= 1, & q_{-1} &= 0, & q_k &= a_k q_{k-1} + q_{k-2} \end{aligned} \qquad \text{for } k = 0, 1, 2, \dots, n \]

Remark : It's easy to see that $q_k > 0$ for all $k \ge 0$.

Lemma 3.4 Let $[a_0; a_1, \dots, a_n]$ be a FCF. Then
\[ [a_0; a_1, \dots, a_k, x] = \frac{xp_k + p_{k-1}}{xq_k + q_{k-1}} \]
for $k = 0, 1, \dots, n$ and all $x \in \mathbb{R}_0^+$.

Proof : The proof is by induction on $k$.
Let $k = 0$ and $x \in \mathbb{R}_0^+$. Then we get that
\[ [a_0; x] = a_0 + \frac{1}{x} = \frac{xa_0 + 1}{x} = \frac{xp_0 + p_{-1}}{xq_0 + q_{-1}}. \]
So assume that $k \ge 1$. By induction, we have that
\[ [a_0; a_1, \dots, a_{k-1}, y] = \frac{yp_{k-1} + p_{k-2}}{yq_{k-1} + q_{k-2}} \quad \text{for all } y \in \mathbb{R}_0^+. \]
Let $x \in \mathbb{R}_0^+$. Using the above formula with '$y = a_k + \frac{1}{x}$', we find
\[ \begin{aligned} [a_0; a_1, \dots, a_{k-1}, a_k, x] &= \left[ a_0; a_1, \dots, a_{k-1}, a_k + \frac{1}{x} \right] \\ &= \frac{\left( a_k + \frac{1}{x} \right) p_{k-1} + p_{k-2}}{\left( a_k + \frac{1}{x} \right) q_{k-1} + q_{k-2}} \\ &= \frac{x(a_k p_{k-1} + p_{k-2}) + p_{k-1}}{x(a_k q_{k-1} + q_{k-2}) + q_{k-1}} \\ &= \frac{xp_k + p_{k-1}}{xq_k + q_{k-1}}. \end{aligned} \qquad \square \]

The following theorem shows that the numbers $p_0, p_1, \dots, p_n$ and $q_0, q_1, \dots, q_n$ are an efficient way of calculating finite continued fractions.

Theorem 3.5 Let $[a_0; a_1, \dots, a_n]$ be a FCF. Then
\[ C_k = [a_0; a_1, \dots, a_k] = \frac{p_k}{q_k} \quad \text{for } k = 0, 1, \dots, n. \]

Proof : Note that $C_0 = a_0 = \frac{p_0}{q_0}$. Pick $k \in \{1, 2, \dots, n\}$. Using Lemma 3.4 (applied with $k - 1$ in place of $k$) with '$x = a_k$', we get that
\[ C_k = [a_0; a_1, \dots, a_{k-1}, a_k] = \frac{a_k p_{k-1} + p_{k-2}}{a_k q_{k-1} + q_{k-2}} = \frac{p_k}{q_k}. \qquad \square \]

We describe an algorithm to calculate the numbers $p_0, p_1, \dots, p_n$ and $q_0, q_1, \dots, q_n$. It is quite similar to our implementation of the Euclidean Algorithm. Let $[a_0; a_1, a_2, \dots, a_n]$ be a FCF. We start by writing down the following 'matrix' :

    0    1
    1    0
              a_0
              a_1
              ...
              a_{n-1}
              a_n

We calculate the rows $R_0, R_1, \dots, R_{n-1}$ and $R_n$ using the formula $R_k = a_k R_{k-1} + R_{k-2}$ for $k = 0, 1, \dots, n$. Then the first column contains the numbers $p_0, p_1, \dots, p_n$ while the second column contains the numbers $q_0, q_1, \dots, q_n$.

Example : Calculate $[1; 2]$, $[1; 2, 3]$, $[1; 2, 3, 4]$ and $[1; 2, 3, 4, 5]$. We easily get the following table :

    p_k    q_k    a_k
      0      1
      1      0
      1      1     1
      3      2     2
     10      7     3
     43     30     4
    225    157     5

Hence $[1; 2] = \frac{3}{2}$, $[1; 2, 3] = \frac{10}{7}$, $[1; 2, 3, 4] = \frac{43}{30}$ and $[1; 2, 3, 4, 5] = \frac{225}{157}$.

Notice in our example that $\gcd(p_k, q_k) = 1$ for $k = 0, 1, 2, 3, 4$. This is always the case.

Lemma 3.6 Let $[a_0; a_1, \dots, a_n]$ be a FCF. Then
\[ p_k q_{k-1} - q_k p_{k-1} = (-1)^{k-1} \quad \text{for } k = -1, 0, 1, \dots, n. \]

Proof : The proof is by induction on $k$.
For $k = -1$, we get that $p_{-1}q_{-2} - q_{-1}p_{-2} = 1 \cdot 1 - 0 \cdot 0 = 1 = (-1)^{-1-1}$. So we may assume that $k \ge 0$. Then we have that
\[ p_k q_{k-1} - q_k p_{k-1} = (a_k p_{k-1} + p_{k-2})q_{k-1} - (a_k q_{k-1} + q_{k-2})p_{k-1} = -(p_{k-1}q_{k-2} - q_{k-1}p_{k-2}). \]
By induction, we get that $p_{k-1}q_{k-2} - q_{k-1}p_{k-2} = (-1)^{(k-1)-1}$. Hence
\[ p_k q_{k-1} - q_k p_{k-1} = -(-1)^{k-2} = (-1)^{k-1}. \qquad \square \]

Corollary 3.7 Let $[a_0; a_1, \dots, a_n]$ be a FSCF. Then $\gcd(p_k, q_k) = 1$ for $k = 0, 1, \dots, n$.

Proof : Pick $k \in \{0, 1, \dots, n\}$. Put $d = \gcd(p_k, q_k)$. By Lemma 3.6, we have that $p_k q_{k-1} - q_k p_{k-1} = (-1)^{k-1}$. Since $d \mid p_k$ and $d \mid q_k$, we get that $d \mid (-1)^{k-1}$ and so $d = 1$. $\square$

The convergents of a finite continued fraction show an alternating pattern.

Theorem 3.8 Let $[a_0; a_1, \dots, a_n]$ be a FCF. Then the following holds :

(a) The convergents with even subscripts form a strictly increasing sequence : $C_0 < C_2 < C_4 < \cdots$

(b) The convergents with odd subscripts form a strictly decreasing sequence : $C_1 > C_3 > C_5 > \cdots$

(c) Any convergent with an odd subscript is greater than any convergent with an even subscript : $C_0 < C_2 < C_4 < \cdots < C_5 < C_3 < C_1$

Proof : (a)(b) Pick $k \in \{0, 1, 2, \dots, n - 2\}$. Using Theorem 3.5 and Lemma 3.6, we easily get that
\[ \begin{aligned} C_{k+2} - C_k &= \frac{p_{k+2}}{q_{k+2}} - \frac{p_k}{q_k} \\ &= \frac{q_k p_{k+2} - p_k q_{k+2}}{q_k q_{k+2}} \\ &= \frac{q_k(a_{k+2}p_{k+1} + p_k) - p_k(a_{k+2}q_{k+1} + q_k)}{q_k q_{k+2}} \\ &= \frac{a_{k+2}(p_{k+1}q_k - q_{k+1}p_k)}{q_k q_{k+2}} \\ &= \frac{a_{k+2}(-1)^k}{q_k q_{k+2}}. \end{aligned} \]
If $k$ is even, we see that $C_{k+2} - C_k > 0$ and so $C_{k+2} > C_k$. Hence we get that $C_0 < C_2 < C_4 < \cdots$. If $k$ is odd, then $C_{k+2} - C_k < 0$ and so $C_{k+2} < C_k$. Hence we get that $C_1 > C_3 > C_5 > \cdots$.

(c) Using Theorem 3.5 and Lemma 3.6, we get that
\[ C_n - C_{n-1} = \frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{p_n q_{n-1} - q_n p_{n-1}}{q_{n-1}q_n} = \frac{(-1)^{n-1}}{q_{n-1}q_n}. \]
Let $r$ and $s$ be integers such that $0 \le 2r,\ 2s - 1 \le n$. Suppose first that $n$ is even, say $n = 2m$. Then $C_{2m} - C_{2m-1} = \frac{(-1)^{2m-1}}{q_{2m-1}q_{2m}} < 0$ and so $C_{2m} < C_{2m-1}$. By (a) and (b), we get that
\[ C_{2r} \le C_{2m} < C_{2m-1} \le C_{2s-1}. \]
Suppose next that $n$ is odd, say $n = 2m - 1$.
Similarly, we get that $C_{2m-2} < C_{2m-1}$ and so $C_{2r} \le C_{2m-2} < C_{2m-1} \le C_{2s-1}$. $\square$

3.3 Infinite Continued Fractions

Suppose that $a_0 \in \mathbb{R}$ and $a_k \in \mathbb{R}_0^+$ for all $k \ge 1$. We still put $C_k = [a_0; a_1, \dots, a_k]$ for all $k \ge 0$. As in Definition 3.3, we define $p_k$ and $q_k$ for all $k \ge -2$. Then all the theorems we've proven about $C_k$, $p_k$ and $q_k$ are still valid.

Lemma 3.9 Let $a_0 \in \mathbb{R}$ and $a_k \in \mathbb{R}$ such that $a_k \ge 1$ for all $k \ge 1$. Then $q_k \ge k$ for all $k \ge 0$.

Proof : The proof is by induction on $k$. Note that $q_0 = 1 > 0$, $q_1 = a_1 \ge 1$ and $q_2 = a_2 q_1 + q_0 \ge 1 + 1 = 2$. For $k \ge 3$, we get by induction that
\[ q_k = a_k q_{k-1} + q_{k-2} \ge q_{k-1} + q_{k-2} \ge (k - 1) + (k - 2) = 2k - 3 \ge k. \qquad \square \]

Theorem 3.10 Let $a_0 \in \mathbb{R}$ and $a_k \in \mathbb{R}$ such that $a_k \ge 1$ for all $k \ge 1$. Then $\lim_{k \to \infty} C_k$ exists (so the sequence $(C_k)_{k \ge 0}$ converges) and
\[ C_0 < C_2 < C_4 < \cdots < \lim_{k \to \infty} C_k < \cdots < C_5 < C_3 < C_1. \]

Proof : By Theorem 3.8(a), we have that $C_0 < C_2 < C_4 < \cdots$. So $(C_{2k})_{k \ge 0}$ is a strictly increasing sequence. By Theorem 3.8(c), this sequence is bounded (by $C_1$). Hence this sequence converges. Call the limit $\alpha$. Similarly, we get that the sequence $(C_{2k+1})_{k \ge 0}$ converges. Call the limit $\beta$. Note that $C_{2k} < \alpha$ and $\beta < C_{2k+1}$ for all $k \ge 0$. For all $k, l \ge 0$, we have that $C_{2k} < C_{2l+1}$ by Theorem 3.8(c) and so $C_{2k} \le \beta$. Since this is true for all $k \ge 0$, we get that $\alpha \le \beta$.

Pick $k \ge 1$. We have that $\beta < C_{2k+1}$ and $C_{2k} < \alpha$. Similarly as in the proof of Theorem 3.8(c) and by Lemma 3.9, we get that
\[ 0 \le \beta - \alpha < C_{2k+1} - C_{2k} = \frac{(-1)^{2k}}{q_{2k+1}q_{2k}} = \frac{1}{q_{2k+1}q_{2k}} \le \frac{1}{2k(2k + 1)}. \]
Since this is true for all $k \ge 1$, we get that $0 \le \beta - \alpha \le 0$. So $\beta = \alpha$. Hence the sequence $(C_k)_{k \ge 0}$ converges (to $\alpha = \beta$). $\square$

Definition 3.11

(1) Let $a_0 \in \mathbb{R}$ and $a_k \in \mathbb{R}$ such that $a_k \ge 1$ for all $k \ge 1$. The infinite continued fraction with partial denominators $a_1, a_2, \dots$ (abbreviation : ICF; notation : $[a_0; a_1, a_2, \dots]$) is the number $\lim_{k \to \infty} [a_0; a_1, \dots, a_k]$. Note that this limit exists by Theorem 3.10.

(2) If $a_0 \in \mathbb{Z}$ and $a_k \in \mathbb{N}_0$ for all $k \ge 1$, then the ICF $[a_0; a_1, a_2, \dots]$
is called an infinite, simple continued fraction (abbreviation : ISCF).

We've seen that an FSCF represents a rational number and that every rational number can be written as an FSCF. We will now prove that an ISCF represents an irrational number and that every irrational number can be written uniquely as an ISCF.

Theorem 3.12 Every ISCF is an irrational number.

Proof : Let $x$ be an ISCF, say $x = [a_0; a_1, a_2, \ldots]$. Pick $k \ge 0$. Using Theorem 3.10, we get that $C_k < x < C_{k+1}$ if $k$ is even and $C_{k+1} < x < C_k$ if $k$ is odd. Using Theorem 3.5 and Lemma 3.6, we get that

$$0 < |x - C_k| < |C_{k+1} - C_k| = \left| \frac{p_{k+1}}{q_{k+1}} - \frac{p_k}{q_k} \right| = \frac{|p_{k+1} q_k - q_{k+1} p_k|}{q_k q_{k+1}} = \frac{|(-1)^k|}{q_k q_{k+1}} = \frac{1}{q_k q_{k+1}}$$

Suppose that $x$ is rational. Then there exist $a \in \mathbb{Z}$ and $b \in \mathbb{N}_0$ such that $x = \frac{a}{b}$. Hence

$$0 < \left| \frac{a}{b} - \frac{p_k}{q_k} \right| < \frac{1}{q_k q_{k+1}} \quad \text{or} \quad 0 < |a q_k - b p_k| < \frac{b}{q_{k+1}}$$

Note that this formula is true for all $k \ge 0$. By Lemma 3.9, we can pick $k \ge 0$ such that $q_{k+1} > b$. For this $k$, we get that

$$0 < |a q_k - b p_k| < \frac{b}{q_{k+1}} < 1$$

a contradiction since $a q_k - b p_k$ is an integer. Hence $x$ is irrational. □

Before we prove that every irrational number can be written uniquely as an ISCF, we remark the following : Let $[a_0; a_1, a_2, \ldots]$ be an ISCF. Then

$$\begin{aligned}
[a_0; a_1, a_2, \ldots] &= \lim_{k \to \infty} [a_0; a_1, a_2, \ldots, a_k] \\
&= \lim_{k \to \infty} \left( a_0 + \frac{1}{[a_1; a_2, a_3, \ldots, a_k]} \right) \\
&= a_0 + \frac{1}{\lim_{k \to \infty} [a_1; a_2, a_3, \ldots, a_k]} \\
&= a_0 + \frac{1}{[a_1; a_2, a_3, \ldots]} \\
&= [a_0; [a_1; a_2, \ldots]]
\end{aligned}$$

Similarly, we get that

$$[a_0; a_1, a_2, \ldots] = [a_0; a_1, \ldots, a_k, [a_{k+1}; a_{k+2}, a_{k+3}, \ldots]] \quad \text{for all } k \ge 1$$

Example : Calculate $[3; 1, 2, 1, 2, 1, 2, \ldots]$.

We have that $[3; 1, 2, 1, 2, 1, 2, \ldots] = 3 + \frac{1}{[1; 2, 1, 2, 1, 2, \ldots]}$. Put $y = [1; 2, 1, 2, 1, 2, \ldots]$. Then we have that

$$y = [1; 2, 1, 2, 1, 2, \ldots] = 1 + \cfrac{1}{2 + \cfrac{1}{[1; 2, 1, 2, 1, 2, \ldots]}} = 1 + \cfrac{1}{2 + \cfrac{1}{y}} = \frac{3y + 1}{2y + 1}$$

So $y$ is a solution of the equation

$$y = \frac{3y + 1}{2y + 1} \quad \text{or} \quad 2y^2 - 2y - 1 = 0$$

We get that $y = \frac{1 \pm \sqrt{3}}{2}$. Since $y > 0$, we have that $y = \frac{1 + \sqrt{3}}{2}$. Hence we get that

$$[3; 1, 2, 1, 2, 1, 2, \ldots] = 3 + \frac{1}{y} = 3 + \frac{2}{1 + \sqrt{3}} = \frac{5 + 3\sqrt{3}}{1 + \sqrt{3}} = 2 + \sqrt{3}$$

Next, we prove that an irrational number can be written as at most one ISCF. We need a little lemma that will also show us how to write an irrational number as an ISCF.

Lemma 3.13 Let $x = [a_0; a_1, a_2, \ldots]$ be an ISCF. Then $a_0 = [x]$.

Proof : By Theorem 3.10, we get that $C_0 < x < C_1$ and so

$$a_0 < x < \frac{p_1}{q_1} = \frac{a_0 a_1 + 1}{a_1} = a_0 + \frac{1}{a_1} \le a_0 + 1$$

since $a_1 \ge 1$. Hence $a_0 = [x]$. □

Corollary 3.14 An irrational number can be written as at most one ISCF.

Proof : Suppose that there are two ISCF's that are the same irrational number $x$, say

$$x = [a_0; a_1, a_2, \ldots] = [b_0; b_1, b_2, \ldots]$$

By Lemma 3.13, we get that $a_0 = [x] = b_0$. But

$$a_0 + \frac{1}{[a_1; a_2, a_3, \ldots]} = [a_0; a_1, a_2, \ldots] = x = [b_0; b_1, b_2, \ldots] = b_0 + \frac{1}{[b_1; b_2, b_3, \ldots]}$$

Since $a_0 = b_0$, we get that $[a_1; a_2, a_3, \ldots] = [b_1; b_2, b_3, \ldots]$. Repeating this process, we get that $a_k = b_k$ for all $k \ge 0$. □

Let $x$ be an irrational number. If we can write $x$ as an ISCF, then this ISCF is unique. Note that Lemma 3.13 also gives us the only possible candidate : $a_0 = [x]$. Hence we define an ISCF (related to $x$) and prove that it actually equals $x$.

Definition 3.15 : Let $x$ be an irrational number. We define real numbers $x_k$ and integers $a_k$ for $k \ge 0$ as follows :

(*) $x_0 = x$ and $a_0 = [x_0]$

(*) Suppose we already have $x_0, \ldots, x_k$ and $a_0, \ldots, a_k$ for some $k \ge 0$. Then $x_{k+1} = \frac{1}{x_k - a_k}$ and $a_{k+1} = [x_{k+1}]$.

We get that $x_k$ is an irrational number and $a_k$ is an integer for all $k \ge 0$; moreover $x_k > 1$ and $a_k \ge 1$ for all $k \ge 1$.

Theorem 3.16 Let $x$ be an irrational number. For all $k \ge 0$, define $x_k$ and $a_k$ as above. Then the following holds :

(a) $x = [a_0; a_1, a_2, \ldots]$.

(b) $\left| x - \frac{p_k}{q_k} \right| < \frac{1}{q_k q_{k+1}}$ for all $k \ge 0$.

Proof : We first prove the following claim : $x = [a_0; a_1, \ldots, a_k, x_{k+1}]$ for all $k \ge 0$.

Note that we can rewrite $x_{k+1} = \frac{1}{x_k - a_k}$ as $x_k = a_k + \frac{1}{x_{k+1}}$. The proof of the claim is by induction on $k$. For $k = 0$, we get that $x = x_0 = a_0 + \frac{1}{x_1} = [a_0; x_1]$. So assume that $k \ge 1$. Then we get that

$$x = [a_0; a_1, \ldots, a_{k-1}, x_k] = \left[ a_0; a_1, \ldots, a_{k-1}, a_k + \frac{1}{x_{k+1}} \right] = [a_0; a_1, \ldots, a_{k-1}, a_k, x_{k+1}]$$

which proves the claim.

Pick $k \ge 1$. By Lemma 3.4 with $x = x_{k+1}$, we get that

$$x = [a_0; a_1, \ldots, a_k, x_{k+1}] = \frac{x_{k+1} p_k + p_{k-1}}{x_{k+1} q_k + q_{k-1}}$$

Hence we have

$$x - C_k = \frac{x_{k+1} p_k + p_{k-1}}{x_{k+1} q_k + q_{k-1}} - \frac{p_k}{q_k} = \frac{-(p_k q_{k-1} - p_{k-1} q_k)}{q_k (x_{k+1} q_k + q_{k-1})} = \frac{(-1)^k}{q_k (x_{k+1} q_k + q_{k-1})}$$

by Lemma 3.6. Since $x_{k+1} > a_{k+1}$, we get by Lemma 3.9 that

$$|x - C_k| = \frac{1}{q_k (x_{k+1} q_k + q_{k-1})} < \frac{1}{q_k (a_{k+1} q_k + q_{k-1})} = \frac{1}{q_k q_{k+1}} \le \frac{1}{k(k+1)}$$

Hence $\lim_{k \to \infty} |x - C_k| = 0$. So $x = \lim_{k \to \infty} C_k = \lim_{k \to \infty} [a_0; a_1, a_2, \ldots, a_k] = [a_0; a_1, a_2, \ldots]$. □

Example : Write $\sqrt{6}$ as an ISCF.

We calculate $x_k$ and $a_k$ for $k \ge 0$.

$x_0 = \sqrt{6} \approx 2.4$ and so $a_0 = 2$

$x_1 = \frac{1}{\sqrt{6} - 2} = \frac{\sqrt{6} + 2}{2} \approx 2.2$ and so $a_1 = 2$

$x_2 = \frac{1}{\frac{\sqrt{6} + 2}{2} - 2} = \frac{2}{\sqrt{6} - 2} = \sqrt{6} + 2 \approx 4.4$ and so $a_2 = 4$

$x_3 = \frac{1}{(\sqrt{6} + 2) - 4} = \frac{1}{\sqrt{6} - 2} = \frac{\sqrt{6} + 2}{2} = x_1$

Since $x_3 = x_1$, we get that $2 = a_1 = a_3 = a_5 = a_7 = \cdots$ and $4 = a_2 = a_4 = a_6 = a_8 = \cdots$ Hence $\sqrt{6} = [2; 2, 4, 2, 4, 2, 4, \ldots]$.

3.4 Rational Approximations of Irrational Numbers

In this section, we prove that the convergents of an ISCF are very good rational approximations for the ISCF. Moreover, any 'good' rational approximation of an irrational number $x$ is a convergent of the ISCF for $x$.

Lemma 3.17 Let $x$ be the ISCF $[a_0; a_1, a_2, \ldots]$, $k \ge 0$ and $a, b \in \mathbb{Z}$ such that $1 \le b < q_{k+1}$. Then $|q_k x - p_k| \le |bx - a|$ and we have equality only if $a = p_k$ and $b = q_k$.

Proof : Consider the following system of linear equations in $\alpha$ and $\beta$ :

$$\begin{cases} p_k \alpha + p_{k+1} \beta = a \\ q_k \alpha + q_{k+1} \beta = b \end{cases}$$

Using Lemma 3.6 and Cramer's Rule, we find that

$$\alpha = (-1)^{k+1} (a q_{k+1} - b p_{k+1}) \quad \text{and} \quad \beta = (-1)^{k+1} (b p_k - a q_k)$$

Note that $\alpha \ne 0$. Indeed, if $\alpha = 0$, then $a q_{k+1} = b p_{k+1}$ and so $q_{k+1} \mid b p_{k+1}$; but $\gcd(p_{k+1}, q_{k+1}) = 1$ by Corollary 3.7; so $q_{k+1} \mid b$ and $b \ge q_{k+1}$, a contradiction.

Assume that $\beta = 0$.
Similarly as above, we get that $q_k \mid b$ and so $b = m q_k$ for some $m \in \mathbb{N}_0$. Then $a = m p_k$ and so $|bx - a| = m |q_k x - p_k| \ge |q_k x - p_k|$. Note that we have equality if and only if $m = 1$ (and so $a = p_k$ and $b = q_k$).

Hence we may assume that $\beta \ne 0$. If $\beta < 0$ then $q_k \alpha = b - q_{k+1} \beta > 0$ and so $\alpha > 0$; if $\beta > 0$ then $b < q_{k+1} \le \beta q_{k+1}$, hence $q_k \alpha = b - \beta q_{k+1} < 0$ and so $\alpha < 0$. Hence $\alpha$ and $\beta$ have opposite signs. By Theorem 3.10, we get that $C_k < x < C_{k+1}$ if $k$ is even and $C_{k+1} < x < C_k$ if $k$ is odd. Hence $x - C_k$ and $x - C_{k+1}$ have opposite signs. So $q_k x - p_k$ and $q_{k+1} x - p_{k+1}$ have opposite signs. We conclude that $\alpha (q_k x - p_k)$ and $\beta (q_{k+1} x - p_{k+1})$ both have the same sign. Hence we get that

$$\begin{aligned}
|bx - a| &= |(q_k \alpha + q_{k+1} \beta) x - (p_k \alpha + p_{k+1} \beta)| \\
&= |\alpha (q_k x - p_k) + \beta (q_{k+1} x - p_{k+1})| \\
&= |\alpha| |q_k x - p_k| + |\beta| |q_{k+1} x - p_{k+1}| \\
&> |\alpha| |q_k x - p_k| \\
&\ge |q_k x - p_k|
\end{aligned}$$ □

We can now prove our first result related to convergents of an ISCF and rational approximations.

Theorem 3.18 Let $x$ be the ISCF $[a_0; a_1, a_2, \ldots]$, $k \ge 0$ and $a, b \in \mathbb{Z}$ such that $1 \le b \le q_k$. Then $\left| x - \frac{p_k}{q_k} \right| \le \left| x - \frac{a}{b} \right|$ and we have equality only if $a = p_k$ and $b = q_k$.

Proof : Suppose that $\left| x - \frac{p_k}{q_k} \right| \ge \left| x - \frac{a}{b} \right|$. Then we get that

$$|q_k x - p_k| = q_k \left| x - \frac{p_k}{q_k} \right| \ge q_k \left| x - \frac{a}{b} \right| \ge b \left| x - \frac{a}{b} \right| = |bx - a|$$

By Lemma 3.17, this is only possible if $a = p_k$ and $b = q_k$. □

Note that the converse of Theorem 3.18 is false : $\frac{179}{57}$ is the best rational approximation of $\pi$ among all the fractions with denominator less than or equal to 57. Yet $\frac{179}{57}$ is not a convergent of $\pi$.

The following result is an easy consequence of Theorem 3.16.

Theorem 3.19 Let $x$ be an irrational number. Then $\left| x - \frac{p_k}{q_k} \right| < \frac{1}{q_k^2}$ for all $k \ge 0$.

Proof : Recall that the sequence $\langle q_k \rangle_{k \ge 0}$ is increasing. Hence by Theorem 3.16, we get that

$$\left| x - \frac{p_k}{q_k} \right| < \frac{1}{q_k q_{k+1}} \le \frac{1}{q_k^2}$$ □

Convergents of an irrational number are not the only fractions with this property :

$$\left| \pi - \frac{19}{6} \right| < \frac{1}{6^2}$$

yet $\frac{19}{6}$ is not a convergent of $\pi$.
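However, very 'good' rational approximations of an irrational number $x$ have to be convergents of $x$.

Theorem 3.20 Let $x$ be an irrational number, $a \in \mathbb{Z}$ and $b \in \mathbb{N}_0$ such that $\left| x - \frac{a}{b} \right| \le \frac{1}{2b^2}$. Then $\frac{a}{b}$ is a convergent of the ISCF for $x$.

Proof : For $k \ge 0$, let $p_k$ and $q_k$ be the integers defined using the ISCF for $x$. Note that $(q_k)_{k \ge 1}$ is a strictly increasing sequence of positive integers ($q_{k+1} = a_{k+1} q_k + q_{k-1} > q_k$ for all $k \ge 1$) and $q_0 = 1$. Hence there exists a unique $k \ge 0$ such that $q_k \le b < q_{k+1}$. By Lemma 3.17, we get that

$$|q_k x - p_k| \le |bx - a| = b \left| x - \frac{a}{b} \right| \le \frac{1}{2b}$$

Hence

$$\left| x - \frac{p_k}{q_k} \right| = \frac{1}{q_k} |q_k x - p_k| \le \frac{1}{2 b q_k}$$

Since $x$ is irrational, we actually have a strict inequality. Suppose that $b p_k \ne a q_k$. Then $|b p_k - a q_k| \ge 1$ and so

$$\frac{1}{b q_k} \le \frac{|b p_k - a q_k|}{b q_k} = \left| \frac{p_k}{q_k} - \frac{a}{b} \right| \le \left| \frac{p_k}{q_k} - x \right| + \left| x - \frac{a}{b} \right| < \frac{1}{2 b q_k} + \frac{1}{2 b^2}$$

Hence we get that

$$\frac{1}{2 b q_k} < \frac{1}{2 b^2} \quad \text{and so} \quad b < q_k$$

a contradiction since $b \ge q_k$. Hence $b p_k = a q_k$. So $\frac{a}{b} = \frac{p_k}{q_k} = C_k$. □

One can prove (without using the theory of continued fractions) that given any irrational number $x$, there are infinitely many rational numbers $\frac{a}{b}$ (with $a \in \mathbb{Z}$ and $b \in \mathbb{N}_0$) such that $\left| x - \frac{a}{b} \right| < \frac{1}{\sqrt{5}\, b^2}$. Note that all these rational numbers must be convergents of $x$ by Theorem 3.20. We prove that infinitely many convergents of the ISCF for $x$ have this property.

Theorem 3.21 Let $x$ be an irrational number and $n \in \mathbb{N}$. Then $\left| x - \frac{p_k}{q_k} \right| < \frac{1}{2 q_k^2}$ for some $k \in \{n, n+1\}$.

Proof : Suppose that $\left| x - \frac{p_k}{q_k} \right| \ge \frac{1}{2 q_k^2}$ for $k = n, n+1$. By Theorem 3.10, $\frac{p_n}{q_n} < x < \frac{p_{n+1}}{q_{n+1}}$ if $n$ is even and $\frac{p_{n+1}}{q_{n+1}} < x < \frac{p_n}{q_n}$ if $n$ is odd. Hence

$$\left| \frac{p_n}{q_n} - \frac{p_{n+1}}{q_{n+1}} \right| = \left| \frac{p_n}{q_n} - x \right| + \left| x - \frac{p_{n+1}}{q_{n+1}} \right| > \frac{1}{2 q_n^2} + \frac{1}{2 q_{n+1}^2}$$

where the inequality is strict because $x$ is irrational, so $\left| x - \frac{p_k}{q_k} \right|$ cannot equal the rational number $\frac{1}{2 q_k^2}$. By Lemma 3.6, we have that

$$\left| \frac{p_n}{q_n} - \frac{p_{n+1}}{q_{n+1}} \right| = \frac{|p_n q_{n+1} - p_{n+1} q_n|}{q_n q_{n+1}} = \frac{1}{q_n q_{n+1}}$$

Putting everything together, we find

$$\frac{1}{q_n q_{n+1}} > \frac{1}{2 q_n^2} + \frac{1}{2 q_{n+1}^2}$$

Hence $2 q_n q_{n+1} > q_{n+1}^2 + q_n^2$. So $0 > (q_{n+1} - q_n)^2$, a contradiction. □

We can do a little bit better.

Theorem 3.22 Let $x$ be an irrational number and $n \in \mathbb{N}_0$.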
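Then $\left| x - \frac{p_k}{q_k} \right| < \frac{1}{\sqrt{5}\, q_k^2}$ for some $k \in \{n-1, n, n+1\}$.

Proof : Suppose that $\left| x - \frac{p_k}{q_k} \right| \ge \frac{1}{\sqrt{5}\, q_k^2}$ for $k = n-1, n, n+1$.

Pick $k \in \{n-1, n\}$. As in the proof of Theorem 3.21, we get that

$$\frac{1}{q_k q_{k+1}} = \left| \frac{p_k}{q_k} - x \right| + \left| x - \frac{p_{k+1}}{q_{k+1}} \right| \ge \frac{1}{\sqrt{5}\, q_k^2} + \frac{1}{\sqrt{5}\, q_{k+1}^2}$$

Putting $b_k = \frac{q_{k+1}}{q_k}$, we find

$$\sqrt{5} > q_k q_{k+1} \left( \frac{1}{q_k^2} + \frac{1}{q_{k+1}^2} \right) = \frac{q_{k+1}}{q_k} + \frac{q_k}{q_{k+1}} = b_k + \frac{1}{b_k}$$

(the inequality is strict because $\sqrt{5}$ is irrational while the right-hand side is rational). So

$$b_k^2 - \sqrt{5}\, b_k + 1 < 0$$

Solving this inequality, we get

$$\frac{\sqrt{5} - 1}{2} < b_k < \frac{\sqrt{5} + 1}{2} \quad \text{for } k = n-1, n \qquad (*)$$

Hence

$$\frac{\sqrt{5} + 1}{2} > b_n = \frac{q_{n+1}}{q_n} = \frac{a_{n+1} q_n + q_{n-1}}{q_n} \ge \frac{q_n + q_{n-1}}{q_n} = 1 + \frac{1}{b_{n-1}}$$

So

$$b_{n-1} > \frac{1}{\frac{\sqrt{5} + 1}{2} - 1} = \frac{2}{\sqrt{5} - 1} = \frac{\sqrt{5} + 1}{2}$$

a contradiction to (*). □

This is as good as it gets in the following sense : There exist irrational numbers $\alpha$ such that for all $c > \sqrt{5}$, there are only finitely many rational numbers $\frac{a}{b}$ with $\left| \alpha - \frac{a}{b} \right| \le \frac{1}{c\, b^2}$.

3.5 Periodic Continued Fractions

On page 46, we represented $\sqrt{6}$ as an ISCF and noted that this ISCF has a repeating part. And on page 44, we calculated $[3; 1, 2, 1, 2, 1, 2, \ldots]$ and noted that square roots were involved. In this section, we will describe all the irrational numbers whose ISCF has a repeating part : they all contain (somehow) a square root.

3.5.1 Quadratic Irrationals

Definition 3.23 Let $d \in \mathbb{N}$ be a number that is not a perfect square.

(a) We put $\mathbb{Q}(\sqrt{d}) = \{ q_1 + q_2 \sqrt{d} \mid q_1, q_2 \in \mathbb{Q} \}$.

(b) If $q_1, q_2 \in \mathbb{Q}$ then the conjugate of $q_1 + q_2 \sqrt{d}$ (notation : $\overline{q_1 + q_2 \sqrt{d}}$) is the number $q_1 - q_2 \sqrt{d}$.

Remark : Note that every element of $\mathbb{Q}(\sqrt{d})$ can be written uniquely as $q_1 + q_2 \sqrt{d}$ with $q_1, q_2 \in \mathbb{Q}$. Indeed, suppose that $p_1 + p_2 \sqrt{d} = q_1 + q_2 \sqrt{d}$ where $p_1, p_2, q_1, q_2 \in \mathbb{Q}$. Then $(p_2 - q_2) \sqrt{d} = q_1 - p_1$. If $p_2 \ne q_2$, then $\sqrt{d} = \frac{q_1 - p_1}{p_2 - q_2} \in \mathbb{Q}$, a contradiction since $\sqrt{d}$ is irrational (because $d$ is not a perfect square). Hence $q_2 = p_2$ and so also $p_1 = q_1$.

Theorem 3.24 Let $d \in \mathbb{N}$ such that $d$ is not a perfect square. Then the following holds :

(a) $\mathbb{Q}(\sqrt{d})$ is a field.

(b) For all $x, y, z \in \mathbb{Q}(\sqrt{d})$ with $z \ne 0$, we have that $\overline{xy} = \overline{x}\, \overline{y}$, $\overline{x \pm y} = \overline{x} \pm \overline{y}$ and $\overline{z^{-1}} = \overline{z}^{-1}$.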
Proof : (a) Since $\mathbb{Q}(\sqrt{d}) \subset \mathbb{R}$ and $\mathbb{R}$ is a field, we only need to prove the following

• $x - y \in \mathbb{Q}(\sqrt{d})$ for all $x, y \in \mathbb{Q}(\sqrt{d})$

• $x y^{-1} \in \mathbb{Q}(\sqrt{d})$ for all $x, y \in \mathbb{Q}(\sqrt{d})$ with $y \ne 0$

We will prove the second statement. Let $x = p_1 + p_2 \sqrt{d}$ and $y = q_1 + q_2 \sqrt{d}$ where $p_1, p_2, q_1, q_2 \in \mathbb{Q}$ and $(q_1, q_2) \ne (0, 0)$. Then we have that

$$\begin{aligned}
x y^{-1} &= \frac{p_1 + p_2 \sqrt{d}}{q_1 + q_2 \sqrt{d}} \\
&= \frac{(p_1 + p_2 \sqrt{d})(q_1 - q_2 \sqrt{d})}{(q_1 + q_2 \sqrt{d})(q_1 - q_2 \sqrt{d})} \\
&= \frac{(p_1 q_1 - d p_2 q_2) + (p_2 q_1 - p_1 q_2) \sqrt{d}}{q_1^2 - d q_2^2} \\
&= \frac{p_1 q_1 - d p_2 q_2}{q_1^2 - d q_2^2} + \frac{p_2 q_1 - p_1 q_2}{q_1^2 - d q_2^2} \sqrt{d}
\end{aligned}$$

So $x y^{-1} \in \mathbb{Q}(\sqrt{d})$.

(b) We will prove that $\overline{xy} = \overline{x}\, \overline{y}$ for all $x, y \in \mathbb{Q}(\sqrt{d})$. The other formulas are proven similarly. Let $x = p_1 + p_2 \sqrt{d}$ and $y = q_1 + q_2 \sqrt{d}$ where $p_1, p_2, q_1, q_2 \in \mathbb{Q}$. Then

$$\begin{aligned}
\overline{xy} &= \overline{(p_1 + p_2 \sqrt{d})(q_1 + q_2 \sqrt{d})} \\
&= \overline{(p_1 q_1 + d p_2 q_2) + (p_1 q_2 + p_2 q_1) \sqrt{d}} \\
&= (p_1 q_1 + d p_2 q_2) - (p_1 q_2 + p_2 q_1) \sqrt{d} \\
&= (p_1 - p_2 \sqrt{d})(q_1 - q_2 \sqrt{d}) \\
&= \overline{x}\, \overline{y}
\end{aligned}$$ □

Definition 3.25 A number $x$ is a quadratic irrational if there exists $d \in \mathbb{N}$ such that $d$ is not a perfect square and $x \in \mathbb{Q}(\sqrt{d}) \setminus \mathbb{Q}$.

3.5.2 Periodic Continued Fractions

Definition 3.26 (a) An infinite continued fraction $[a_0; a_1, a_2, \ldots]$ is called periodic if there exist $p \in \mathbb{N}_0$ and $N \in \mathbb{N}$ such that $a_{k+p} = a_k$ for all $k \ge N$. We write

$$[a_0; a_1, a_2, \ldots] = [a_0; a_1, a_2, \ldots, a_{N-1}, \overline{a_N, a_{N+1}, \ldots, a_{N+p-1}}]$$

(b) The smallest $p$ with this property is called the period of the periodic continued fraction.

(c) A periodic continued fraction is purely periodic if we can choose $N = 0$.

Example : $[1; 3, 5, 3, 5, 3, 5, \ldots] = [1; \overline{3, 5}]$. This ISCF is periodic with period 2 but is not purely periodic.

We want to prove that an ISCF is periodic if and only if it is a quadratic irrational. One direction is quite easy to prove.

Proposition 3.27 Let $x = [a_0; a_1, a_2, \ldots]$ be a periodic ISCF. Then $x$ is a quadratic irrational.

Proof : Suppose that $x = [a_0; a_1, a_2, \ldots, a_{N-1}, \overline{a_N, a_{N+1}, \ldots, a_{N+p-1}}]$. Put $y = [\overline{a_N; a_{N+1}, \ldots, a_{N+p-1}}]$. By Lemma 3.4, we get that

$$y = [a_N; a_{N+1}, \ldots, a_{N+p-1}, y] = \frac{p_{p-1}\, y + p_{p-2}}{q_{p-1}\, y + q_{p-2}}$$

where $\langle p_k \rangle_{k \ge -1}$ and $\langle q_k \rangle_{k \ge -1}$ are associated to the ISCF $[a_N; a_{N+1}, \ldots]$. So we have that

$$q_{p-1}\, y^2 - (p_{p-1} - q_{p-2})\, y - p_{p-2} = 0$$

Put $a = q_{p-1}$, $b = p_{p-1} - q_{p-2}$, $c = p_{p-2}$ and $d = (p_{p-1} - q_{p-2})^2 + 4 q_{p-1} p_{p-2}$. Since $y > 0$, we get that

$$y = \frac{b + \sqrt{d}}{2a}$$

Note that $d \in \mathbb{N}$ and that $d$ is not a perfect square since $y$ is irrational by Theorem 3.12. Using Lemma 3.4, we find that

$$x = [a_0; a_1, \ldots, a_{N-1}, \overline{a_N, a_{N+1}, \ldots, a_{N+p-1}}] = [a_0; a_1, \ldots, a_{N-1}, y] = \frac{p_{N-1}\, y + p_{N-2}}{q_{N-1}\, y + q_{N-2}}$$

where $\langle p_k \rangle_{k \ge -2}$ and $\langle q_k \rangle_{k \ge -2}$ are associated to the ISCF $[a_0; a_1, \ldots]$. Since $y$ is irrational, $q_{N-1}\, y + q_{N-2} \ne 0$. By Theorem 3.24(a), $\mathbb{Q}(\sqrt{d})$ is a field and so $x \in \mathbb{Q}(\sqrt{d})$. Hence $x$ is a quadratic irrational since $x$ is irrational by Theorem 3.12. □

Note that a quadratic irrational can be written in a special form.

Lemma 3.28 Let $x$ be a quadratic irrational. Then there exist $P, Q \in \mathbb{Z}$ and $D \in \mathbb{N}$ such that $Q \ne 0$, $D$ is not a perfect square, $Q \mid (D - P^2)$ and $x = \frac{P + \sqrt{D}}{Q}$.

Proof : Since $x$ is a quadratic irrational, there exists $d \in \mathbb{N}$ such that $d$ is not a perfect square and $x \in \mathbb{Q}(\sqrt{d}) \setminus \mathbb{Q}$. So $x = q_1 + q_2 \sqrt{d}$ for some $q_1, q_2 \in \mathbb{Q}$. Note that $q_2 \ne 0$ since $x$ is irrational. Putting $q_1$ and $q_2$ on a common denominator, we see that there exist $a, b \in \mathbb{Z}$ with $b \ne 0$ and $c \in \mathbb{N}_0$ such that

$$x = \frac{a + b \sqrt{d}}{c} = \frac{c (a + b \sqrt{d})}{c^2} = \frac{ac + bc \sqrt{d}}{c^2} = \frac{ac + \varepsilon \sqrt{b^2 c^2 d}}{c^2} = \frac{\varepsilon ac + \sqrt{b^2 c^2 d}}{\varepsilon c^2}$$

where $\varepsilon = \operatorname{sgn}(bc) \in \{-1, 1\}$. Put $P = \varepsilon ac$, $Q = \varepsilon c^2$ and $D = b^2 c^2 d$. Then $P, Q \in \mathbb{Z}$ with $Q \ne 0$, $D \in \mathbb{N}$ and $x = \frac{P + \sqrt{D}}{Q}$. Note that $D$ is not a perfect square since $x$ is irrational. Moreover, $D - P^2 = b^2 c^2 d - a^2 c^2 = \varepsilon c^2 (\varepsilon b^2 d - \varepsilon a^2)$. So $Q \mid (D - P^2)$. □

We can now prove the converse of Proposition 3.27.

Theorem 3.29 An ISCF is periodic if and only if it is a quadratic irrational.

Proof : If an ISCF is periodic then it is a quadratic irrational by Proposition 3.27. So let $x$ be a quadratic irrational.
By Lemma 3.28, there exist $P, Q \in \mathbb{Z}$ and $D \in \mathbb{N}$ such that $Q \ne 0$, $D$ is not a perfect square, $Q \mid (D - P^2)$ and $x = \frac{P + \sqrt{D}}{Q}$. Define $\langle x_k \rangle_{k \ge 0}$ and $\langle a_k \rangle_{k \ge 0}$ as in Definition 3.15. Now define $\langle P_k \rangle_{k \ge 0}$ and $\langle Q_k \rangle_{k \ge 0}$ as follows :

$$\begin{cases}
P_0 = P, \quad Q_0 = Q \\
P_k = a_{k-1} Q_{k-1} - P_{k-1} & \text{for all } k \ge 1 \\
Q_k = \dfrac{D - P_k^2}{Q_{k-1}} & \text{for all } k \ge 1
\end{cases}$$

Claim 1 : For all $k \ge 0$, we have that $P_k, Q_k \in \mathbb{Z}$ with $Q_k \ne 0$ and $x_k = \frac{P_k + \sqrt{D}}{Q_k}$.

The proof of Claim 1 is by induction on $k$. For $k = 0$, we have that $P_0 = P \in \mathbb{Z}$, $Q_0 = Q \in \mathbb{Z}_0$ and $x = \frac{P + \sqrt{D}}{Q}$. So suppose that the claim is true for $k = 0, 1, \ldots, n$ for some $n \ge 0$. Then $P_{n+1} = a_n Q_n - P_n \in \mathbb{Z}$ since $P_n, Q_n \in \mathbb{Z}$ by induction. We easily get that

$$Q_{n+1} = \frac{D - P_{n+1}^2}{Q_n} = \frac{D - (a_n Q_n - P_n)^2}{Q_n} = \frac{D - P_n^2}{Q_n} + 2 a_n P_n - a_n^2 Q_n$$

If $n = 0$ then $\frac{D - P_0^2}{Q_0} \in \mathbb{Z}$ since $Q \mid (D - P^2)$; if $n > 0$ then $\frac{D - P_n^2}{Q_n} = Q_{n-1} \in \mathbb{Z}$ by induction. Hence $Q_{n+1} \in \mathbb{Z}$. Note that $Q_{n+1} \ne 0$ since $D$ is not a perfect square. Using induction, we find that

$$x_{n+1} = \frac{1}{x_n - a_n} = \frac{1}{\frac{P_n + \sqrt{D}}{Q_n} - a_n} = \frac{Q_n}{\sqrt{D} - (a_n Q_n - P_n)} = \frac{Q_n}{\sqrt{D} - P_{n+1}}$$

Hence

$$x_{n+1} = \frac{Q_n (\sqrt{D} + P_{n+1})}{(\sqrt{D} - P_{n+1})(\sqrt{D} + P_{n+1})} = \frac{Q_n (\sqrt{D} + P_{n+1})}{D - P_{n+1}^2} = \frac{P_{n+1} + \sqrt{D}}{\frac{D - P_{n+1}^2}{Q_n}} = \frac{P_{n+1} + \sqrt{D}}{Q_{n+1}}$$

which proves Claim 1.

Claim 2 : There exists $N > 0$ such that $\overline{x_k} < 0$ for all $k > N$.

Pick $k \ge 1$. Using the proof of Theorem 3.16 and Lemma 3.4, we get that

$$x = [a_0; a_1, \ldots, a_{k-1}, x_k] = \frac{p_{k-1} x_k + p_{k-2}}{q_{k-1} x_k + q_{k-2}}$$

Taking conjugates of both sides and using Theorem 3.24(b), we have that

$$\overline{x} = \frac{p_{k-1} \overline{x_k} + p_{k-2}}{q_{k-1} \overline{x_k} + q_{k-2}}$$

Solving this for $\overline{x_k}$, we find that

$$\overline{x_k} = \frac{q_{k-2} \overline{x} - p_{k-2}}{p_{k-1} - q_{k-1} \overline{x}}$$

We can rewrite this as

$$\overline{x_k} = -\frac{q_{k-2}}{q_{k-1}} \cdot \frac{\overline{x} - \frac{p_{k-2}}{q_{k-2}}}{\overline{x} - \frac{p_{k-1}}{q_{k-1}}}$$

Since $\lim_{k \to \infty} \frac{p_k}{q_k} = x$, we get that

$$\lim_{k \to \infty} \frac{\overline{x} - \frac{p_{k-2}}{q_{k-2}}}{\overline{x} - \frac{p_{k-1}}{q_{k-1}}} = \frac{\overline{x} - x}{\overline{x} - x} = 1$$

Hence there exists $N \ge 1$ such that

$$\left| -\frac{q_{k-1}}{q_{k-2}} \overline{x_k} - 1 \right| < 1 \quad \text{for all } k > N$$

So we have that

$$-1 < -\frac{q_{k-1}}{q_{k-2}} \overline{x_k} - 1 < 1 \quad \text{for all } k > N$$

In particular, we get that

$$-\frac{q_{k-1}}{q_{k-2}} \overline{x_k} > 0 \quad \text{for all } k > N$$

Since $q_{k-1}$ and $q_{k-2}$ are positive for all $k > N$, we get that

$$\overline{x_k} < 0 \quad \text{for all } k > N$$

which proves Claim 2.

Claim 3 : $-\sqrt{D} < P_k < \sqrt{D}$ and $0 < Q_k < 2\sqrt{D}$ for all $k > N$.

Pick $k > N$. By Claim 1, we have that

$$\frac{P_k + \sqrt{D}}{Q_k} = x_k > 1 \qquad (*)$$

By Claim 2, we get that

$$\frac{-P_k + \sqrt{D}}{Q_k} = -\overline{x_k} > 0 \qquad (**)$$

Multiplying (*) and (**), we get that

$$\frac{D - P_k^2}{Q_k^2} > 0$$

So $D - P_k^2 > 0$. Hence

$$-\sqrt{D} < P_k < \sqrt{D}$$

Adding (*) and (**), we get that

$$\frac{2\sqrt{D}}{Q_k} > 0$$

So $Q_k > 0$. Hence it follows from (*) that

$$Q_k < P_k + \sqrt{D} < 2\sqrt{D}$$

which proves Claim 3.

We can now finish the proof of the theorem. By Claim 1, $(P_k, Q_k) \in \mathbb{Z}^2$ for all $k \ge 0$. By Claim 3, the set $\{ (P_k, Q_k) \mid k \ge 0 \}$ is finite. Hence there exist $m, n \in \mathbb{N}_0$ such that $m < n$ and $(P_m, Q_m) = (P_n, Q_n)$. We prove by induction on $k$ that $x_{k+n-m} = x_k$ and $a_{k+n-m} = a_k$ for all $k \ge m$. Let $k = m$. Using Claim 1, we get that

$$x_n = \frac{P_n + \sqrt{D}}{Q_n} = \frac{P_m + \sqrt{D}}{Q_m} = x_m$$

Hence $a_n = [x_n] = [x_m] = a_m$. Assume now that $k > m$. Again using Claim 1 and induction, we get that

$$x_{k+n-m} = \frac{1}{x_{k-1+n-m} - a_{k-1+n-m}} = \frac{1}{x_{k-1} - a_{k-1}} = x_k$$

So $a_{k+n-m} = [x_{k+n-m}] = [x_k] = a_k$. Hence

$$x = [a_0; a_1, a_2, \ldots] = [a_0; a_1, \ldots, a_{m-1}, \overline{a_m, a_{m+1}, \ldots, a_{n-1}}]$$

So the ISCF for $x$ is periodic. □

The final result in this section is the description of quadratic irrationals with a purely periodic ISCF.

Definition 3.30 Let $x$ be a quadratic irrational. Then $x$ is reduced if $x > 1$ and $-1 < \overline{x} < 0$.

Examples : $\sqrt{5}$ and $\frac{1 + \sqrt{5}}{2}$ are both quadratic irrationals but $\sqrt{5}$ is not reduced while $\frac{1 + \sqrt{5}}{2}$ is reduced.

Reduced quadratic irrationals have the following property.

Lemma 3.31 Let $x$ be a reduced quadratic irrational. Put $y = \frac{1}{x - [x]}$. Then $y$ is reduced.

Proof : By definition, $x > 1$ and $-1 < \overline{x} < 0$. Since $x$ is irrational, we have that $0 < x - [x] < 1$ and so

$$y = \frac{1}{x - [x]} > 1$$

Since $x$ is a quadratic irrational, we get that there exists $d \in \mathbb{N}$ such that $d$ is not a perfect square and $x \in \mathbb{Q}(\sqrt{d})$. By Theorem 3.24, $\mathbb{Q}(\sqrt{d})$ is a field. Clearly, $y$ is irrational and so $y$ is a quadratic irrational. Since $x > 1$ and $\overline{x} < 0$, we have that $\overline{x} - [x] < -1$. So

$$-1 < \frac{1}{\overline{x} - [x]} < 0$$

Using Theorem 3.24(b), we get that

$$\overline{y} = \overline{\left( \frac{1}{x - [x]} \right)} = \frac{1}{\overline{x - [x]}} = \frac{1}{\overline{x} - \overline{[x]}} = \frac{1}{\overline{x} - [x]}$$

Hence $-1 < \overline{y} < 0$. So $y$ is reduced. □

Theorem 3.32 An ISCF is purely periodic if and only if it is a reduced quadratic irrational.

Proof : Suppose first that $x := [a_0; a_1, a_2, \ldots]$ is a purely periodic ISCF, say $x = [\overline{a_0; a_1, \ldots, a_n}]$ for some $n \ge 0$. Note that $x$ is a quadratic irrational by Theorem 3.29. By Lemma 3.4, we get that

$$x = [\overline{a_0; a_1, \ldots, a_n}] = [a_0; a_1, \ldots, a_n, x] = \frac{p_n x + p_{n-1}}{q_n x + q_{n-1}}$$

Hence

$$q_n x^2 - (p_n - q_{n-1}) x - p_{n-1} = 0$$

Applying the conjugate to this equation and using Theorem 3.24(b), we get that

$$q_n \overline{x}^2 - (p_n - q_{n-1}) \overline{x} - p_{n-1} = 0$$

So $x$ and $\overline{x}$ are the roots of the quadratic function $f(t) = q_n t^2 - (p_n - q_{n-1}) t - p_{n-1}$. Note that

$$f(-1) = q_n + p_n - q_{n-1} - p_{n-1} = (q_n - q_{n-1}) + (p_n - p_{n-1}) > 0$$

Since $a_0 > 0$, we get that

$$f(0) = -p_{n-1} < 0$$

By the Intermediate Value Theorem, $f(t)$ has a root in $(-1, 0)$. By Theorem 3.10, we have that

$$1 \le a_0 = C_0 < x$$

Since the only roots of $f(t)$ are $x$ and $\overline{x}$, we get that

$$-1 < \overline{x} < 0$$

So $x$ is reduced.

Suppose next that $x$ is a reduced quadratic irrational. Define $\langle x_k \rangle_{k \ge 0}$ and $\langle a_k \rangle_{k \ge 0}$ as in Definition 3.15. Using Lemma 3.31 and induction on $k$, we see that $x_k$ is a reduced quadratic irrational for all $k \ge 0$.

Pick $k \ge 0$.
Since $x_{k+1} = \frac{1}{x_k - a_k}$, we get that

$$x_k = a_k + \frac{1}{x_{k+1}} \qquad (*)$$

Taking the conjugates of both sides and using Theorem 3.24(b), we find that

$$\overline{x_k} = \overline{a_k + \frac{1}{x_{k+1}}} = \overline{a_k} + \overline{\left( \frac{1}{x_{k+1}} \right)} = a_k + \frac{1}{\overline{x_{k+1}}}$$

Note that $-1 < \overline{x_k} < 0$ since $x_k$ is reduced. Hence we get

$$-1 < a_k + \frac{1}{\overline{x_{k+1}}} < 0$$

We can rewrite this as

$$a_k < -\frac{1}{\overline{x_{k+1}}} < a_k + 1$$

Hence

$$\left[ -\frac{1}{\overline{x_{k+1}}} \right] = a_k \qquad (**)$$

It follows from (the proof of) Theorem 3.29 that $x = [a_0; a_1, a_2, \ldots]$ is periodic and that we can choose $m \in \mathbb{N}$ minimal such that there exists $n > m$ with $x_m = x_n$. Suppose that $m \ge 1$. Using (**), we get that

$$a_{m-1} = \left[ -\frac{1}{\overline{x_m}} \right] = \left[ -\frac{1}{\overline{x_n}} \right] = a_{n-1}$$

Hence by (*), we have that

$$x_{m-1} = a_{m-1} + \frac{1}{x_m} = a_{n-1} + \frac{1}{x_n} = x_{n-1}$$

a contradiction to the minimality of $m$. Hence $m = 0$ and

$$x = [a_0; a_1, a_2, \ldots] = [\overline{a_0; a_1, a_2, \ldots, a_{n-1}}]$$

So the ISCF for $x$ is purely periodic. □

Chapter 4 Elliptic Curves

In this chapter, we give a very brief introduction to elliptic curves. We will mention most results without proof.

4.1 Cubic Curves

Definition 4.1 (a) A curve is a subset of $\mathbb{C}^2$ of the form $\{ (a, b) \in \mathbb{C}^2 \mid f(a, b) = 0 \}$ where $f(x, y) \in \mathbb{C}[x, y]$.

(b) Let $C$ be the curve $f(x, y) = 0$ and $(a, b) \in C$. Then $(a, b)$ is a singular point on $C$ if

$$\frac{\partial f}{\partial x}(a, b) = \frac{\partial f}{\partial y}(a, b) = 0$$

(c) A cubic curve is a curve of the form $y^2 = a x^3 + b x^2 + c x + d$ where $a, b, c, d \in \mathbb{C}$ with $a \ne 0$.

Note that we are mostly interested in finding rational points on the curve $f(x, y) = 0$ where $f(x, y) \in \mathbb{Q}[x, y]$.

Lemma 4.2 Let $f(x) \in \mathbb{C}[x]$ and $C$ the curve $y^2 = f(x)$. Then $(x_0, y_0)$ is a singular point on $C$ if and only if $y_0 = 0$ and $f(x_0) = f'(x_0) = 0$. In particular, a cubic curve has at most one singular point.

Proof : Let $(x_0, y_0) \in C$. Then $(x_0, y_0)$ is a singular point on $C$ if and only if

$$\begin{cases} 2 y_0 = 0 \\ f'(x_0) = 0 \end{cases}$$

So $y_0 = 0$. Since $(x_0, y_0) \in C$, we have that

$$y_0^2 = f(x_0)$$

Hence $(x_0, y_0)$ is a singular point on $C$ if and only if $y_0 = 0$ and $f(x_0) = f'(x_0) = 0$. This means that $x_0$ is a root of $f(x)$ of multiplicity at least two.

Suppose now that $C$ is a cubic curve.
Then $f(x)$ is a polynomial of degree three. So $f(x)$ can have at most one root of multiplicity at least two. Hence $C$ has at most one singular point. □

Let $a, b, c, d, e, f, g \in \mathbb{Q}$ with $a, d \ne 0$. Then there exist $q_1, q_2, q_3, q_4 \in \mathbb{Q}$ such that the substitution

$$\begin{cases} x = q_1 X + q_2 \\ y = q_3 Y + q_4 \end{cases}$$

transforms the curve

$$a y^2 + b y + c = d x^3 + e x^2 + f x + g$$

into a curve of the form

$$Y^2 = X^3 + A X + B$$

where $A, B \in \mathbb{Q}$. So we concentrate on cubic curves of the latter form.

Lemma 4.3 Let $C$ be the cubic curve $y^2 = x^3 + a x + b$ where $a, b \in \mathbb{C}$. Then $C$ has a singular point if and only if $4 a^3 + 27 b^2 = 0$.

Proof : By Lemma 4.2, $C$ has a singular point if and only if $x^3 + a x + b$ has a root of multiplicity at least two. This happens if and only if $x^3 + a x + b = (x - r)^2 (x - s)$ for some $r, s \in \mathbb{C}$. Note that

$$(x - r)^2 (x - s) = x^3 - (s + 2r) x^2 + (2rs + r^2) x - r^2 s \quad \text{for all } r, s \in \mathbb{C}$$

So $C$ has a singular point if and only if the system of equations

$$S := \begin{cases} s + 2r = 0 & (1) \\ 2rs + r^2 = a & (2) \\ -r^2 s = b & (3) \end{cases}$$

has a solution $(s, r) \in \mathbb{C}^2$.

Suppose first that $S$ has a solution $(s, r) \in \mathbb{C}^2$. From (1), we get that $s = -2r$. Substituting this in (2) and (3), we have that

$$-3 r^2 = a \quad \text{and} \quad 2 r^3 = b$$

Hence

$$4 a^3 + 27 b^2 = 4 (-3 r^2)^3 + 27 (2 r^3)^2 = 0$$

Suppose next that $4 a^3 + 27 b^2 = 0$. If $a = 0$ then $b = 0$ and $(r, s) = (0, 0)$ is a solution of $S$. So we may assume that $a \ne 0$. One easily checks that $(r, s) = \left( -\frac{3b}{2a}, \frac{3b}{a} \right)$ is a solution of $S$. □

4.2 Elliptic Curves and the Group Law

Definition 4.4 An elliptic curve is a curve of the form

$$\{ (x, y) \in \mathbb{C}^2 \mid y^2 = x^3 + a x + b \}$$

where $a, b \in \mathbb{C}$ with $4 a^3 + 27 b^2 \ne 0$. An elliptic curve over $\mathbb{Q}$ is an elliptic curve where $a, b \in \mathbb{Q}$.

Let $E$ be an elliptic curve. What are the points at infinity of $E$? For this we need some projective geometry. Suppose that $E$ is the curve $y^2 = x^3 + a x + b$ where $4 a^3 + 27 b^2 \ne 0$. The associated homogeneous equation is

$$Y^2 Z = X^3 + a X Z^2 + b Z^3$$

The projective line at infinity has equation $Z = 0$.
So the points at infinity of $E$ are the solutions of

$$\begin{cases} Y^2 Z = X^3 + a X Z^2 + b Z^3 \\ Z = 0 \end{cases}$$

We easily get that the point $(0, 1, 0)$ is the only solution. We denote this point by $e$. Note that a line in the $XY$-plane that goes through $e$ is a vertical line.

What are the points of intersection between a line and the elliptic curve $y^2 = x^3 + a x + b$?

Suppose first that the line is not vertical. Then the line has an equation of the form $y = mx + c$ where $m, c \in \mathbb{C}$. Hence the points of intersection are the solutions of

$$\begin{cases} y = mx + c \\ y^2 = x^3 + a x + b \end{cases}$$

Substituting '$y = mx + c$' into the second equation and simplifying, we get that

$$\begin{cases} y = mx + c \\ x^3 - m^2 x^2 + (a - 2mc) x + b - c^2 = 0 \end{cases} \qquad (*)$$

Hence, counting multiplicities, we see that a non-vertical line intersects an elliptic curve $E$ in three points on $E$.

Suppose next that the line is vertical. Then the line has an equation of the form $x = c$ where $c \in \mathbb{C}$. Hence the points of intersection are the solutions of

$$\begin{cases} x = c \\ y^2 = x^3 + a x + b \end{cases}$$

Substituting '$x = c$' into the second equation, we get that

$$\begin{cases} x = c \\ y^2 = c^3 + a c + b \end{cases}$$

Hence, counting multiplicities, we see that a vertical line intersects an elliptic curve $E$ in two points on $E$ and the point $e$ (at infinity).

Definition 4.5 Let $E$ be the elliptic curve $y^2 = x^3 + a x + b$.

(a) We put

$$E(\mathbb{C}) = \{ (x, y) \in \mathbb{C}^2 \mid y^2 = x^3 + a x + b \} \cup \{ e \}$$

$$E(\mathbb{Q}) = \{ (x, y) \in \mathbb{Q}^2 \mid y^2 = x^3 + a x + b \} \cup \{ e \}$$

(b) Let $P_1, P_2 \in E(\mathbb{C})$. We define a third point in $E(\mathbb{C})$, which we denote by $P_1 + P_2$, as follows :

1. $P_1 \ne e \ne P_2$ and $P_1 \ne P_2$
Suppose first that the line $P_1 P_2$ is not vertical. Then the line $P_1 P_2$ intersects $E$ in a third point $(x, y)$. We put $P_1 + P_2 = (x, -y)$.
Suppose next that the line $P_1 P_2$ is vertical. We put $P_1 + P_2 = e$.

2. $P_1 \ne e \ne P_2$ and $P_1 = P_2$
Suppose first that the tangent line to $E$ at $P_1$ is not vertical. Then this tangent line intersects $E$ in a third point $(x, y)$. We put $P_1 + P_2 = (x, -y)$.
Suppose next that the tangent line to $E$ at $P_1$ is vertical. We put $P_1 + P_2 = e$.

3. $P_1 = e$ or $P_2 = e$
We put $P_1 + e = P_1$ and $e + P_2 = P_2$.

The following theorem is far from obvious.

Theorem 4.6 Let $E$ be an elliptic curve. Then $(E(\mathbb{C}), +)$ is an abelian group. Moreover, if $E$ is an elliptic curve over $\mathbb{Q}$ then $E(\mathbb{Q})$ is a subgroup of $E(\mathbb{C})$.

Proof : Clearly, $e$ is an identity element and $P_1 + P_2 = P_2 + P_1$ for all $P_1, P_2 \in E(\mathbb{C})$. The inverse of $(x, y) \in E$ is $(x, -y)$ while the inverse of $e$ is $e$. Associativity is quite hard to prove. If $E$ is an elliptic curve over $\mathbb{Q}$, then it follows from (*) on page 61 that $P_1 - P_2 \in E(\mathbb{Q})$ for all $P_1, P_2 \in E(\mathbb{Q})$; so $E(\mathbb{Q})$ is a subgroup of $E(\mathbb{C})$. □

The following propositions allow us to perform practical calculations in $E(\mathbb{C})$.

Proposition 4.7 Let $E$ be the elliptic curve $y^2 = x^3 + a x + b$ and $P = (x_1, y_1), Q = (x_2, y_2) \in E$ with $x_1 \ne x_2$. Then $P + Q = (x_3, y_3)$ where

$$\begin{cases}
x_3 = \left( \dfrac{y_2 - y_1}{x_2 - x_1} \right)^2 - x_1 - x_2 \\
y_3 = -\left( \dfrac{y_2 - y_1}{x_2 - x_1} \right) (x_3 - x_1) - y_1
\end{cases}$$

Proof : The line through $P$ and $Q$ has equation

$$y = \frac{y_2 - y_1}{x_2 - x_1} (x - x_1) + y_1$$

It follows from (*) on page 61 that

$$x_1 + x_2 + x_3 = \left( \frac{y_2 - y_1}{x_2 - x_1} \right)^2$$

The formulas for $x_3$ and $y_3$ are now easily deduced. □

Proposition 4.8 Let $E$ be the elliptic curve $y^2 = x^3 + a x + b$ and $P = (x_1, y_1) \in E$ with $y_1 \ne 0$. Then $P + P = (x_2, y_2)$ where

$$\begin{cases}
x_2 = \left( \dfrac{3 x_1^2 + a}{2 y_1} \right)^2 - 2 x_1 \\
y_2 = -\left( \dfrac{3 x_1^2 + a}{2 y_1} \right) (x_2 - x_1) - y_1
\end{cases}$$

Proof : Using implicit differentiation, we get that

$$2y \frac{dy}{dx} = 3 x^2 + a$$

Hence the tangent line to $E$ at $P$ has equation

$$y = \frac{3 x_1^2 + a}{2 y_1} (x - x_1) + y_1$$

It follows from (*) on page 61 that

$$x_1 + x_1 + x_2 = \left( \frac{3 x_1^2 + a}{2 y_1} \right)^2$$

The formulas for $x_2$ and $y_2$ are now easily deduced. □

Example : Consider the elliptic curve

$$E : y^2 = x^3 + 17$$

We easily see the following integral points on $E$ :

$$P = (-1, 4) \quad \text{and} \quad Q = (2, 5)$$

We can now use the group law to construct new rational points on $E$ :

$$P + P = \left( \frac{137}{64}, -\frac{2651}{512} \right), \quad Q + Q = \left( -\frac{64}{25}, \frac{59}{125} \right)$$

$$P + Q = \left( -\frac{8}{9}, -\frac{109}{27} \right), \quad P - Q = (8, 23)$$

We finish this section by mentioning without proof some major results.
Note that the proof of most of these results is extremely complicated.

Theorem 4.9 Let $E$ be the elliptic curve $y^2 = x^3 + a x + b$ with $a, b \in \mathbb{Q}$. Then the following holds :

(a) (Mordell, 1922) The group $(E(\mathbb{Q}), +)$ is finitely generated.

(b) (Siegel, 1929) There are only finitely many integral points on $E$.

(c) (Lutz-Nagell, 1937) If $a, b \in \mathbb{Z}$ and $(x, y) \in E(\mathbb{Q})$ has finite order, then $x, y \in \mathbb{Z}$ and either $y = 0$ or $y \mid (4 a^3 + 27 b^2)$.

(d) (Mazur, 1976) The torsion subgroup of $E(\mathbb{Q})$ (this is the set of all elements in $E(\mathbb{Q})$ of finite order) is isomorphic to one of the following groups : $\mathbb{Z}/m\mathbb{Z}$ with $m \in \{1, 2, \ldots, 9, 10, 12\}$ or $\mathbb{Z}/2m\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z}$ with $m \in \{1, 2, 3, 4\}$.

4.3 Sums of Two Cubes

Let $k \in \mathbb{N}_0$. Does there exist a number $n$ that can be written as the sum of two (positive) cubes in $k$ different ways? Elliptic curves give us a construction to answer this question in the affirmative.

We want to find numbers $n \ne 0$ such that the Diophantine equation

$$X^3 + Y^3 = n$$

has $k$ different integral solutions (note that if $(a, b)$ is a solution, we do not consider $(b, a)$ to be a different solution). We make the substitution

$$\begin{cases}
X = \dfrac{36n + y}{6x} \\
Y = \dfrac{36n - y}{6x}
\end{cases}$$

After making this substitution and simplifying, we get the elliptic curve

$$y^2 = x^3 - 432 n^2$$

Note that we can solve our substitution rules for $x$ and $y$ :

$$\begin{cases}
x = \dfrac{12n}{X + Y} \\
y = \dfrac{36n (X - Y)}{X + Y}
\end{cases}$$

Hence there is a bijection between the rational points on the elliptic curve $y^2 = x^3 - 432 n^2$ and the rational points on the curve $X^3 + Y^3 = n$ (note that integral points on one curve do not necessarily lead to integral points on the other curve). Starting with a certain value for $X$, $Y$ and $n$, we get a rational point (mostly of infinite order) on the elliptic curve; using the group law, we easily get more rational points on the elliptic curve which lead to rational points on $X^3 + Y^3 = n$; clearing the denominators will lead to integral solutions of $X^3 + Y^3 = m$ for some $m \in \mathbb{N}$.

We illustrate this with the case $n = 9$.
We easily get that $(X_1, Y_1) = (2, 1)$ is a solution of

$$X^3 + Y^3 = 9$$

The associated elliptic curve is

$$y^2 = x^3 - 34992$$

which has the rational point

$$P = (36, 108)$$

We easily get that

$$P + P = (252, -3996)$$

The associated rational point on $X^3 + Y^3 = 9$ is

$$(X_2, Y_2) = \left( -\frac{17}{7}, \frac{20}{7} \right)$$

So we have

$$\begin{cases}
2^3 + 1^3 = 9 \\
\left( -\dfrac{17}{7} \right)^3 + \left( \dfrac{20}{7} \right)^3 = 9
\end{cases}$$

Multiplying both equations by $7^3$, we get that

$$(2 \cdot 7)^3 + (1 \cdot 7)^3 = 9 \cdot 7^3 = (-17)^3 + 20^3$$

So $9 \cdot 7^3 = 3087$ is a number that can be written as the sum of two cubes in (at least) two different ways. We can keep going :

$$P + P + P = (73, 595) \quad \text{and} \quad (X_3, Y_3) = \left( \frac{919}{438}, -\frac{271}{438} \right)$$

Hence we have

$$\begin{cases}
2^3 + 1^3 = 9 \\
\left( -\dfrac{17}{7} \right)^3 + \left( \dfrac{20}{7} \right)^3 = 9 \\
\left( \dfrac{919}{438} \right)^3 + \left( -\dfrac{271}{438} \right)^3 = 9
\end{cases}$$

Multiplying these equations by $(7 \cdot 438)^3$, we get that

$$(2 \cdot 7 \cdot 438)^3 + (1 \cdot 7 \cdot 438)^3 = (-17 \cdot 438)^3 + (20 \cdot 438)^3 = (919 \cdot 7)^3 + (-271 \cdot 7)^3 = 9 \cdot (7 \cdot 438)^3$$

So $9 \cdot (7 \cdot 438)^3 = 259393423464$ is a number that can be written as the sum of two cubes in (at least) three different ways.

What if we only want to use positive cubes? Suppose that the point $P$ has infinite order (it is rather rare for $P$ to have finite order). For $k \ge 1$, we can calculate $(X_k, Y_k)$ (the rational point associated to $kP$). One can prove that $X_k, Y_k > 0$ for infinitely many $k$'s. We list the first couple of values of $(X_k, Y_k)$ in our case ($n = 9$) :

$(X_1, Y_1) = (2, 1)$

$(X_2, Y_2) = \left( -\frac{17}{7}, \frac{20}{7} \right)$

$(X_3, Y_3) = \left( \frac{919}{438}, -\frac{271}{438} \right)$

$(X_4, Y_4) = \left( -\frac{36520}{90391}, \frac{188479}{90391} \right)$

$(X_5, Y_5) = \left( \frac{169748279}{53023559}, -\frac{152542262}{53023559} \right)$

$(X_6, Y_6) = \left( \frac{415280564497}{348671682660}, \frac{676702467503}{348671682660} \right)$
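The list above can be reproduced with exact rational arithmetic. The following is a minimal sketch, not part of the original notes: it assumes the chord and tangent formulas of Propositions 4.7 and 4.8 and the substitution $X = (36n + y)/(6x)$, $Y = (36n - y)/(6x)$ from this section. The names `add`, `multiples` and `to_XY` are hypothetical helpers chosen for illustration, and `None` is used to stand for the point $e$ at infinity.

```python
from fractions import Fraction

N = 9   # we study X^3 + Y^3 = N
A = 0   # the curve is y^2 = x^3 + A*x - 432*N^2, so A = 0 here

def add(P, Q, a=A):
    """Group law on y^2 = x^3 + a*x + b; None stands for the point e."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:
        if y1 + y2 == 0:
            return None                      # vertical line through P and Q
        m = (3 * x1 * x1 + a) / (2 * y1)     # tangent slope (Proposition 4.8)
    else:
        m = (y2 - y1) / (x2 - x1)            # chord slope (Proposition 4.7)
    x3 = m * m - x1 - x2
    y3 = -m * (x3 - x1) - y1
    return (x3, y3)

def multiples(P, count):
    """Return [P, 2P, ..., count*P] by repeated addition."""
    out, R = [], None
    for _ in range(count):
        R = add(R, P)
        out.append(R)
    return out

def to_XY(R, n=N):
    """Map a finite point (x, y) on the elliptic curve back to X^3 + Y^3 = n."""
    x, y = R
    return ((36 * n + y) / (6 * x), (36 * n - y) / (6 * x))

P = (Fraction(36), Fraction(108))            # the point associated to (2, 1)
points = [to_XY(R) for R in multiples(P, 6)]
for k, (X, Y) in enumerate(points, start=1):
    print(k, X, Y, X**3 + Y**3)              # the last column should always be 9
```

Because `Fraction` keeps every intermediate value exact, the check $X_k^3 + Y_k^3 = 9$ is an exact identity rather than a floating-point approximation. The sketch assumes that none of the multiples computed is $e$; for a point of finite order, `to_XY` would need to handle `None`.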
