California State University, Fresno
MATH 216T
TOPICS IN NUMBER THEORY
Spring 2008
Instructor : Stefaan Delcroix
Chapter 1
Diophantine Equations
Definition 1.1 Let f (x1 , . . . , xn ) be a polynomial with integral coefficients in x1 , . . . , xn . We
call the equation f (x1 , . . . , xn ) = 0 a diophantine equation if we are looking for all solutions
(x1 , . . . , xn ) ∈ Zⁿ or (x1 , . . . , xn ) ∈ Qⁿ .
1.1
The Diophantine Equation ax + by = c
Theorem 1.2 Let a, b ∈ Z0 and c ∈ Z. Then the following holds about the diophantine equation
ax + by = c :
(a) The equation has a solution if and only if gcd(a, b)|c.
(b) Suppose that (x0 , y0 ) is a solution. Then all the solutions are given by
\[ x = x_0 - \frac{b}{d}\,t, \qquad y = y_0 + \frac{a}{d}\,t, \qquad t \in \mathbb{Z} \]
where d = gcd(a, b).
Proof : Put d = gcd(a, b).
(a) Suppose first that (x0 , y0 ) is a solution. Then ax0 + by0 = c. Since d|a and d|b, we get that
d|(ax0 + by0 ) and so d|c. Suppose next that d|c. Then c = dk for some k ∈ Z. We’ve seen in
MATH 116 that the greatest common divisor of a and b can be written as a linear combination
of a and b. So there exist u, v ∈ Z with au + bv = d. Hence a(uk) + b(vk) = dk = c. So the
equation ax + by = c has a solution (namely (x, y) = (uk, vk)).
(b) Suppose that (x0 , y0 ) is a solution. Then d|c by (a).
Pick t ∈ Z. Put x = x0 − (b/d)t and y = y0 + (a/d)t. Then
\[ ax + by = a\left(x_0 - \frac{b}{d}\,t\right) + b\left(y_0 + \frac{a}{d}\,t\right) = ax_0 - \frac{ab}{d}\,t + by_0 + \frac{ba}{d}\,t = ax_0 + by_0 = c \]
So (x, y) is a solution.
Next, we show that every solution is of this form. Suppose that (x, y) is a solution. Then
ax + by = c = ax0 + by0
Hence b(y − y0 ) = a(x0 − x). Put a = da0 and b = db0 . Then db0 (y − y0 ) = da0 (x0 − x) and so
b0 (y − y0 ) = a0 (x0 − x). But gcd(a0 , b0 ) = 1. Since b0 |a0 (x0 − x), we get that b0 |(x0 − x). Hence
x0 − x = b0 t for some t ∈ Z. Substituting this into b0 (y − y0 ) = a0 (x0 − x), we get y − y0 = a0 t. So x = x0 − b0 t and y = y0 + a0 t, where b0 = b/d and a0 = a/d.
□
Example : Find all the solutions of the diophantine equation 457x + 67y = 60714. Are there
any solutions (x, y) ∈ N² ?
We start by calculating gcd(457, 67) and writing it as a linear combination of 457 and 67. Each row (u, v, r) below satisfies 457u + 67v = r.

Row    u      v      r      Division                  Next step
R1     1      0      457
R2     0      1      67     457 = 67 · 7 + (−12)      so R3 = R1 − 7R2 and change signs
R3     −1     7      12     67 = 12 · 6 + (−5)        so R4 = R2 − 6R3 and change signs
R4     −6     41     5      12 = 5 · 2 + 2            so R5 = R3 − 2R4
R5     11     −75    2      5 = 2 · 2 + 1             so R6 = R4 − 2R5
R6     −28    191    1      2 = 1 · 2 + 0             so we stop

Hence
gcd(457, 67) = 1 = (−28) · 457 + 191 · 67
Multiplying both sides by 60714, we get that
(−1699992) · 457 + 11596374 · 67 = 60714
So (−1699992, 11596374) is a particular solution of 457x + 67y = 60714. Hence all integral solutions of 457x + 67y = 60714 are
x = −1699992 + 67t , y = 11596374 − 457t , t ∈ Z
Are there any solutions (x, y) ∈ N² ? We need x ≥ 0 and y ≥ 0, so
−1699992 + 67t ≥ 0 and 11596374 − 457t ≥ 0
Hence we get
t ≥ 1699992/67 ≈ 25373.0149 and t ≤ 11596374/457 ≈ 25374.9978
Since t is an integer, we get that t = 25374. So
x = −1699992 + 67 · 25374 = 66 and y = 11596374 − 457 · 25374 = 456
Hence the only natural solution of 457x + 67y = 60714 is (x, y) = (66, 456).
□
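The computation in this example can be checked mechanically. The following is a minimal Python sketch, not part of the original notes: the function names extended_gcd and natural_solutions are hypothetical, and the row update is the standard extended Euclidean algorithm (with nonnegative remainders) rather than the sign-changing variant used in the table above. It assumes a, b > 0.

```python
def extended_gcd(a, b):
    """Return (d, u, v) with d = gcd(a, b) = a*u + b*v (assumes a, b > 0)."""
    u0, v0, r0 = 1, 0, a          # row (u, v, r) with a*u + b*v = r
    u1, v1, r1 = 0, 1, b
    while r1 != 0:
        q = r0 // r1
        # new row = old row - q * current row
        u0, v0, r0, u1, v1, r1 = u1, v1, r1, u0 - q * u1, v0 - q * v1, r0 - q * r1
    return r0, u0, v0


def natural_solutions(a, b, c):
    """All (x, y) with x >= 0, y >= 0 and a*x + b*y = c (assumes a, b > 0)."""
    d, u, v = extended_gcd(a, b)
    if c % d != 0:
        return []                 # Theorem 1.2(a): no solution unless d | c
    x0, y0 = u * (c // d), v * (c // d)
    step_x, step_y = b // d, a // d
    # Theorem 1.2(b): x = x0 - step_x*t, y = y0 + step_y*t
    t_min = -(y0 // step_y)       # smallest t with y >= 0, i.e. ceil(-y0/step_y)
    t_max = x0 // step_x          # largest t with x >= 0, i.e. floor(x0/step_x)
    return [(x0 - step_x * t, y0 + step_y * t) for t in range(t_min, t_max + 1)]


print(natural_solutions(457, 67, 60714))   # [(66, 456)]
```

The particular solution found by the code may differ from the one in the text, but the set of natural solutions does not depend on that choice.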
We have the following result for natural solutions of ax + by = c.
Proposition 1.3 Let a, b, c ∈ N0 with gcd(a, b) = 1 and c > ab. Then the equation ax + by = c
has a solution (x, y) ∈ (N0)².
Proof : Let (x0 , y0 ) be a solution of ax + by = c. Then all the solutions are given by
\[ x_t = x_0 - tb, \qquad y_t = y_0 + ta, \qquad t \in \mathbb{Z} \]
Hence the solutions are equally spaced along the line L ↔ ax + by = c : for all t ∈ Z, the
distance between (xt+1 , yt+1 ) and (xt , yt ) is
\[ \sqrt{\bigl([x_0 - (t+1)b] - [x_0 - tb]\bigr)^2 + \bigl([y_0 + (t+1)a] - [y_0 + ta]\bigr)^2} = \sqrt{a^2 + b^2} \]
Note that the X-intercept (resp. Y-intercept) of L is
\[ \left(\frac{c}{a},\, 0\right) \quad \text{resp.} \quad \left(0,\, \frac{c}{b}\right) \]
The distance between these two intercepts (which is the length of the part of L that is in the
first quadrant) is given by
\[ \sqrt{\frac{c^2}{a^2} + \frac{c^2}{b^2}} = \frac{c}{ab}\sqrt{a^2 + b^2} > \sqrt{a^2 + b^2} \]
since c > ab. So the part of L strictly inside the first quadrant is longer than the distance between consecutive solutions, and therefore it contains at least one solution. Hence there exists t ∈ Z with (xt , yt ) ∈ (N0)². □
1.2
Unique Prime Factorization over N
The fact that we can write every n ∈ N with n ≥ 2 uniquely (up to order) as a product of
primes can be a very powerful tool. We illustrate this with a question from the International
Math Olympiad.
The original question was :
(IMO 1997, B2) : Find all (a, b) ∈ (N0)² with a^(b²) = b^a .
We start with an easy lemma :
Lemma 1.4 Let n ∈ N with n ≥ 2. Then we have the following :
(a) n^(k−2) > k for all k ∈ N with k ≥ 5.
(b) n^(2k−1) > k for all k ∈ N0 .
Proof : (a) We use induction on k. For k = 5, we have n^(5−2) = n³ ≥ 2³ = 8 > 5. So suppose
that n^(k−2) > k for k = 5, 6, . . . , m for some m ∈ N with m ≥ 5. Then using induction, we get
n^((m+1)−2) = n^(m−1) = n · n^(m−2) > n · m ≥ 2m ≥ m + 1
since m ≥ 5.
(b) Similar to the proof in (a).
□
Now we tackle the question. One easily checks that a = 1 ⇔ b = 1. So we may assume that
a, b > 1. Hence we can consider the prime factorization of a and b. We only use the primes
that show up in either factorization :
\[ a = \prod_{i=1}^{n} p_i^{\alpha_i} \quad \text{and} \quad b = \prod_{i=1}^{n} p_i^{\beta_i} \]
where p1 < p2 < · · · < pn are primes, αi , βi ∈ N and not both αi and βi are zero for i =
1, 2, . . . , n. Substituting this into a^(b²) = b^a , we find
\[ \prod_{i=1}^{n} p_i^{\alpha_i b^2} = \prod_{i=1}^{n} p_i^{\beta_i a} \]
Using unique prime factorization, we get
\[ \alpha_i b^2 = \beta_i a \quad \text{for } i = 1, 2, \ldots, n \qquad (*) \]
Suppose first that a/b² ≥ 2. From (*), we get that αi ≥ 2βi for i = 1, 2, . . . , n. Hence
\[ b^2 = \prod_{i=1}^{n} p_i^{2\beta_i} \quad \text{divides} \quad \prod_{i=1}^{n} p_i^{\alpha_i} = a \]
So a = kb² for some k ∈ N with k ≥ 2. Hence by (*), αi = kβi for i = 1, 2, . . . , n. So
\[ a = \prod_{i=1}^{n} p_i^{\alpha_i} = \prod_{i=1}^{n} p_i^{k\beta_i} = b^k \]
Putting everything together, we find kb² = a = b^k and so
\[ b^{k-2} = k \]
By Lemma 1.4(a), k ∈ {2, 3, 4}. One easily checks that k = 2 does not lead to a solution (it would give 1 = 2), while k = 3 gives b = 3, a = 27 and k = 4 gives b = 2, a = 16. So we find the solutions (a, b) ∈ {(27, 3), (16, 2)}.
Suppose next that a/b² < 2. From (*), we get that αi < 2βi for i = 1, 2, . . . , n. Hence
\[ a = \prod_{i=1}^{n} p_i^{\alpha_i} \quad \text{divides but doesn't equal} \quad \prod_{i=1}^{n} p_i^{2\beta_i} = b^2 \]
So b² = ka for some k ∈ N with k ≥ 2. Hence by (*), βi = kαi for i = 1, 2, . . . , n. So
\[ b = \prod_{i=1}^{n} p_i^{\beta_i} = \prod_{i=1}^{n} p_i^{k\alpha_i} = a^k \]
Putting everything together, we find ka = b² = (a^k)² = a^(2k) and so
\[ a^{2k-1} = k \]
By Lemma 1.4(b), there are no solutions.
All couples (a, b) ∈ (N0)² with a^(b²) = b^a are {(1, 1), (16, 2), (27, 3)}. □
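As a sanity check on this result, one can search a finite box by brute force. This is only an illustrative Python sketch (the function name solutions and the search bounds are arbitrary choices, not part of the notes); it confirms the list on the searched range and of course proves nothing beyond it.

```python
def solutions(max_a, max_b):
    """All (a, b) with 1 <= a <= max_a, 1 <= b <= max_b and a**(b*b) == b**a."""
    return [(a, b)
            for a in range(1, max_a + 1)
            for b in range(1, max_b + 1)
            if a ** (b * b) == b ** a]   # exact comparison with Python integers


print(solutions(200, 20))   # [(1, 1), (16, 2), (27, 3)]
```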
1.3
Pythagorean Triples
In this section, we will find all the solutions (x, y, z) ∈ (N0)³ of the diophantine equation
x² + y² = z²
Definition 1.5
(1) A triple (x, y, z) ∈ (N0)³ is a Pythagorean triple if x² + y² = z² .
(2) A Pythagorean triple (x, y, z) is primitive if gcd(x, y, z) = 1.
Remarks :
(1) If (x, y, z) is a Pythagorean triple, then so is (dx, dy, dz) for any d ∈ N0 .
(2) Let (x, y, z) be a Pythagorean triple. Put d = gcd(x, y, z). Then x = dx′ , y = dy′ and
z = dz′ for some x′ , y′ , z′ ∈ N0 . So (dx′)² + (dy′)² = (dz′)² . Hence x′² + y′² = z′² . Note
that gcd(x′ , y′ , z′) = 1. So (x′ , y′ , z′) is a primitive Pythagorean triple and (x, y, z) =
(dx′ , dy′ , dz′). Hence it is enough to find all primitive Pythagorean triples.
(3) Let (x, y, z) be a Pythagorean triple. Then
gcd(x, y, z) = 1 ⇔ gcd(x, y) = 1 ⇔ gcd(y, z) = 1 ⇔ gcd(x, z) = 1.
Lemma 1.6 Let (x, y, z) be a primitive Pythagorean triple. Then either x is even and y is odd
or x is odd and y is even.
Proof : Since gcd(x, y) = 1, we get that x and y can not both be even. Suppose x and y are
both odd. Then
z² ≡ x² + y² ≡ 1 + 1 ≡ 2 mod 4
a contradiction since 2 is not a quadratic residue modulo 4. So either x is even and y is odd or
x is odd and y is even.
□
If (x, y, z) is a primitive Pythagorean triple, then so is (y, x, z). By Lemma 1.6, it is enough to
find all primitive Pythagorean triples with x even and y odd.
Theorem 1.7 Let (x, y, z) ∈ (N0)³ with x even. Then (x, y, z) is a primitive Pythagorean triple
if and only if
(x, y, z) = (2mn, m² − n² , m² + n²)
for some m, n ∈ N0 with m > n, gcd(m, n) = 1 and m and n not both odd.
Proof : Suppose (x, y, z) is a primitive Pythagorean triple. By Lemma 1.6, we get that y is
odd. Since x² + y² = z² , we get that z is odd. Hence z + y and z − y are even. We get that
x² = z² − y² = (z + y)(z − y) and so
\[ \left(\frac{x}{2}\right)^2 = \left(\frac{z+y}{2}\right)\left(\frac{z-y}{2}\right) \]
Put d = gcd((z + y)/2, (z − y)/2). Then
\[ d \,\Big|\, \left(\frac{z+y}{2} + \frac{z-y}{2}\right) \quad \text{and} \quad d \,\Big|\, \left(\frac{z+y}{2} - \frac{z-y}{2}\right) \]
So d|z and d|y. Hence d| gcd(z, y). But gcd(z, y) = 1. So d = 1. So (z + y)/2 and (z − y)/2 are coprime natural numbers whose product is the perfect square (x/2)². By HW3 #1, there exist
m, n ∈ N such that
\[ \frac{z+y}{2} = m^2 \quad \text{and} \quad \frac{z-y}{2} = n^2 \]
Then gcd(m, n) = 1 since gcd((z + y)/2, (z − y)/2) = 1. We easily get that
x = 2mn , y = m² − n² and z = m² + n²
Since y > 0, we have that m > n. Since gcd(m, n) = 1, m and n can not both be even. If m
and n are both odd, then y is even, a contradiction. So m and n are not both odd.
One easily checks that (2mn, m² − n² , m² + n²) is a primitive Pythagorean triple if m, n ∈ N0
such that gcd(m, n) = 1, m > n and not both m and n are odd.
□
Remark : It follows now that all Pythagorean triples (x, y, z) are given by
(x, y, z) ∈ {d(2mn, m² − n² , m² + n²), d(m² − n² , 2mn, m² + n²)}
where d, m, n ∈ N0 , m > n, gcd(m, n) = 1 and not both m and n are odd.
Example : Find all Pythagorean triples (x, y, z) such that one of the variables has value 15.
We may assume that (x, y, z) = d(2mn, m² − n² , m² + n²) where d, m, n ∈ N0 , m > n,
gcd(m, n) = 1 and not both m and n are odd. Since 2dmn is even, either d(m² − n²) = 15 or d(m² + n²) = 15.
So d|15. Hence d ∈ {1, 3, 5, 15}.
Suppose first that d(m² + n²) = 15. We easily get (by going over all the cases for d) that the
only solution is (d, m, n) = (3, 2, 1) and so (x, y, z) = (12, 9, 15).
Suppose next that d(m² − n²) = 15. Then d(m − n)(m + n) = 15. Note that m − n < m + n.
This leads to the following possibilities :
(d, m − n, m + n) ∈ {(1, 1, 15), (1, 3, 5), (3, 1, 5), (5, 1, 3)}
Hence (d, m, n) ∈ {(1, 8, 7), (1, 4, 1), (3, 3, 2), (5, 2, 1)}. So
(x, y, z) ∈ {(112, 15, 113), (8, 15, 17), (36, 15, 39), (20, 15, 25)}
All Pythagorean triples (x, y, z) such that one of the variables has value 15 are given by
(12, 9, 15), (112, 15, 113), (8, 15, 17), (36, 15, 39), (20, 15, 25)
(9, 12, 15), (15, 112, 113), (15, 8, 17), (15, 36, 39), (15, 20, 25)
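The search in this example can be reproduced from the parametrization of Theorem 1.7. The sketch below is an illustration only: the function name triples_containing is hypothetical, and the loop bound m ≤ t is a deliberately generous choice (if t equals 2mn, m² − n² or m² + n², then m ≤ t).

```python
from math import gcd


def triples_containing(value):
    """All Pythagorean triples (x, y, z) in which one entry equals `value`,
    generated as d*(2mn, m^2 - n^2, m^2 + n^2) and its swap (Theorem 1.7)."""
    found = set()
    for d in range(1, value + 1):
        if value % d != 0:
            continue
        t = value // d                      # entry of the primitive triple
        for m in range(2, t + 1):
            for n in range(1, m):
                # gcd(m, n) = 1 and m, n not both odd
                if gcd(m, n) != 1 or (m - n) % 2 == 0:
                    continue
                x, y, z = 2 * m * n, m * m - n * n, m * m + n * n
                if t in (x, y, z):
                    found.add((d * x, d * y, d * z))
                    found.add((d * y, d * x, d * z))
    return sorted(found)


for triple in triples_containing(15):
    print(triple)
```

For value = 15 this returns the ten triples listed above.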
1.4
The Chord-Tangent Method
In this section, we illustrate with an example how to find all rational points on an irreducible
quadratic curve if one rational point is known. This method is called the Chord-Tangent
Method of Diophantus.
We want to find all the rational points on the circle C with equation x² + y² = 1. Clearly, the
point P = (0, −1) is a rational point on C. Suppose Q is another rational point on C. Then
the line through P and Q is either vertical or has a rational slope. Conversely, let l be a line
through P that is either vertical or has a rational slope. Then this line l will intersect the circle
C in a second point Q, which turns out to be rational.
So all the rational points on C can be found by intersecting C with lines through P that are
either vertical or have a rational slope.
We now find that second point of intersection between the circle C and such a line l. The
vertical line through (0, −1) intersects the circle in (0, −1) and (0, 1). So suppose l has a
rational slope. Then an equation for l is y = mx − 1 where m ∈ Q. So the points of intersection
between the circle C and the line l are the solutions of
y = mx − 1
x² + y² = 1
We get that
x² + (mx − 1)² = 1
or
(m² + 1)x² − 2mx = 0
This is a quadratic equation in x and we know that x = 0 is a solution. The second solution is
\[ x = \frac{2m}{m^2 + 1} \]
So
\[ y = mx - 1 = m \cdot \frac{2m}{m^2 + 1} - 1 = \frac{m^2 - 1}{m^2 + 1} \]
Hence the second point of intersection is
\[ Q = \left( \frac{2m}{m^2 + 1},\; \frac{m^2 - 1}{m^2 + 1} \right) \]
(For m = 0 the line is tangent to C and we get Q = P.) Notice that we get the point (0, 1) if we consider the limit as m → +∞ in the expression for
Q. This is normal since we can view a vertical line as a line with infinite slope.
We get that all the rational points on x² + y² = 1 are
\[ \{(0, 1)\} \cup \left\{ \left( \frac{2m}{m^2 + 1},\; \frac{m^2 - 1}{m^2 + 1} \right) : m \in \mathbb{Q} \right\} \]
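The parametrization is easy to test with exact rational arithmetic. The following is a small Python sketch (the function name circle_point is a hypothetical choice) that uses the standard library class fractions.Fraction, so no rounding is involved.

```python
from fractions import Fraction


def circle_point(m):
    """Second intersection of the line y = m*x - 1 with the circle x^2 + y^2 = 1."""
    m = Fraction(m)
    denom = m * m + 1
    return (2 * m / denom, (m * m - 1) / denom)


for slope in (Fraction(1, 2), Fraction(3), Fraction(-2, 5)):
    x, y = circle_point(slope)
    print(slope, (x, y), x * x + y * y == 1)
```

For example, the slope m = 1/2 gives the point (4/5, −3/5), and m = 0 gives back P = (0, −1).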
1.5
The Method of Infinite Descent
In this section, we prove that the diophantine equation x⁴ + y⁴ = z⁴ has no solutions with
xyz ≠ 0.
We begin with the following lemma.
Lemma 1.8 Let x, y, z ∈ N0 with x⁴ + y⁴ = z² . Then there exist a, b, c ∈ N0 such that
a⁴ + b⁴ = c² and c < z.
Proof : Suppose first that gcd(x, y) > 1. Then there exists a prime p such that p|x and p|y.
So x = pa and y = pb for some a, b ∈ N0 . Hence z² = x⁴ + y⁴ = p⁴(a⁴ + b⁴). So p⁴|z² . Hence
p²|z. So z = p²c for some c ∈ N0 . Then a⁴ + b⁴ = c² and c < z.
Suppose next that gcd(x, y) = 1. Note that (x²)² + (y²)² = z² and that gcd(x² , y² , z) = 1. So
(x² , y² , z) is a primitive Pythagorean triple. We may assume that y² is even. Then by Theorem
1.7, we get that there exist m, n ∈ N0 such that gcd(m, n) = 1, m > n, not both m and n are
odd and
x² = m² − n² , y² = 2mn and z = m² + n²
So x² + n² = m² . Since gcd(m, n) = 1, we get that (x, n, m) is a primitive Pythagorean triple.
Since y² is even, we get that x² (and so also x) is odd. Hence n is even. By Theorem 1.7, there
exist r, s ∈ N0 such that gcd(r, s) = 1, r > s, not both r and s are odd and
x = r² − s² , n = 2rs and m = r² + s²
Since x² + n² = m² , we get that m² (and so also m) is odd. Since gcd(m, n) = 1 and m is odd,
we get that gcd(m, 2n) = 1. But y² = m(2n). So there exist w, c ∈ N0 such that 2n = w² and
m = c² . Clearly, w is even, say w = 2v with v ∈ N0 . Then we have that 4v² = w² = 2n = 4rs
and so rs = v² . Since gcd(r, s) = 1, there exist a, b ∈ N0 such that r = a² and s = b² . Since
r² + s² = m, we have that
a⁴ + b⁴ = c²
Note that z = m² + n² > m² ≥ m = c² ≥ c, so c < z.
□
Corollary 1.9 Let x, y, z ∈ N such that x⁴ + y⁴ = z² . Then xyz = 0.
Proof : Suppose that xyz ≠ 0. Put x0 = x, y0 = y and z0 = z. By Lemma 1.8, there exist
x1 , y1 , z1 ∈ N0 such that x1⁴ + y1⁴ = z1² and z1 < z0 . Continuing to apply Lemma 1.8, we get
that for all n ∈ N, there exist xn , yn , zn ∈ N0 such that xn⁴ + yn⁴ = zn² and z0 > z1 > z2 > · · · .
This is impossible since zn ∈ N0 for all n ∈ N : a strictly decreasing sequence of natural numbers can not be infinite.
Hence xyz = 0.
□
Theorem 1.10 (Fermat) Let x, y, z ∈ N with x⁴ + y⁴ = z⁴ . Then xyz = 0.
Proof : Note that x⁴ + y⁴ = (z²)² . By Corollary 1.9, xyz² = 0. Hence xyz = 0.
□
1.6
Sums of Two Squares
In this section, we study the diophantine equation
x² + y² = n
For which n does this equation have a solution?
Definition 1.11 Let n ∈ N. We say that n is the sum of two squares if there exist x, y ∈ N
with x² + y² = n.
We start with a little lemma that tells us that certain numbers are not the sum of two squares.
Lemma 1.12 Let n ∈ N with n ≡ 3 mod 4. Then n is not the sum of two squares.
Proof : Suppose that n = x² + y² for some x, y ∈ N. Note that a² ≡ 0 mod 4 or a² ≡ 1
mod 4 for all a ∈ N. Hence x² + y² ∈ {0, 1, 2} mod 4, a contradiction since n ≡ 3 mod 4. □
Being the sum of two squares is a ‘multiplicative’ property :
Lemma 1.13 Let m, n ∈ N. If m and n are both the sum of two squares then mn is also the
sum of two squares.
Proof : Suppose that m = a² + b² and n = c² + d² for some a, b, c, d ∈ N. One easily checks
that
mn = (a² + b²)(c² + d²) = (ac − bd)² + (ad + bc)²
So mn is the sum of two squares (namely of the squares of the natural numbers |ac − bd| and ad + bc).
□
Corollary 1.14 Let ai ∈ N for i = 1, 2, . . . , n. If ai is the sum of two squares for i = 1, 2, . . . , n
then a1 a2 · · · an is the sum of two squares.
Proof : This follows from Lemma 1.13 by induction on n.
□
Lemma 1.13 makes the following question quite natural : which primes are the sum of two
squares? If p is an odd prime that is the sum of two squares then p ≡ 1 mod 4 by Lemma
1.12. It turns out that this condition is also sufficient. In order to prove this, we use a special
type of complex numbers.
Definition 1.15
(1) A Gaussian integer is a complex number of the form a+bi with a, b ∈ Z.
(2) The set of all Gaussian integers is denoted by Z[i]. Note that Z ⊂ Z[i].
(3) Let α, β ∈ Z[i] with β ≠ 0. We say that β divides α (notation : β|α) if α = βγ for some
γ ∈ Z[i].
(4) Let α ∈ Z[i]. Then α is a unit if αβ = 1 for some β ∈ Z[i].
(5) Let α := a + bi ∈ Z[i]. We define the norm of α (notation : N (α)) as the natural number
N (α) = a² + b² .
The norm function N is quite a powerful tool to convert Gaussian integers into natural numbers.
Lemma 1.16 The following holds about Gaussian integers and the norm N :
(a) N (αβ) = N (α)N (β) for all α, β ∈ Z[i].
(b) Let α ∈ Z[i]. Then α is a unit if and only if N (α) = 1.
(c) The units of Z[i] are {1, −1, i, −i}.
Proof : (a) Note that N (α) = |α|² (where | | stands for the modulus of a complex number) for
all α ∈ Z[i]. Since |z1 z2 | = |z1 ||z2 | for all z1 , z2 ∈ C, we get that N (αβ) = N (α)N (β) for all
α, β ∈ Z[i].
(b) Suppose first that α is a unit. Hence αβ = 1 for some β ∈ Z[i]. Applying the norm N to
both sides and using (a), we get
N (α)N (β) = N (αβ) = N (1) = 1² + 0² = 1
Since N (α), N (β) ∈ N, we have that N (α) = N (β) = 1.
Suppose next that N (α) = 1. Put α = a + bi with a, b ∈ Z. Then a² + b² = 1. Put β = a − bi.
Then β ∈ Z[i] and
αβ = (a + bi)(a − bi) = a² + b² = 1
So α is a unit.
(c) Let α := a + bi ∈ Z[i]. By (b), we get that
α is a unit ⇔ N (α) = 1 ⇔ a² + b² = 1
Since a, b ∈ Z, we have (a, b) ∈ {(1, 0), (−1, 0), (0, 1), (0, −1)}. Hence α ∈ {1, −1, i, −i}.
□
The Gaussian integers Z[i] behave pretty much like the integers Z. We need the concept of a
Gaussian prime number in order to proceed. There are two ways of thinking of prime numbers
over the integers. In general domains, this leads to two different concepts.
Definition 1.17 Let α ∈ Z[i] such that α ≠ 0 and α is not a unit.
(1) We say that α is irreducible if
∀β, γ ∈ Z[i] : α = βγ ⇒ β is a unit or γ is a unit
(2) α is reducible if α is not irreducible. Hence α is reducible if and only if α = βγ for some
β, γ ∈ Z[i] \ {1, −1, i, −i}.
(3) α is prime if
∀β, γ ∈ Z[i] : α|(βγ) ⇒ α|β or α|γ
To distinguish between p ∈ Z being prime and α ∈ Z[i] being prime, we call a prime p ∈ Z a
rational prime while a prime α ∈ Z[i] is called a Gaussian prime.
In Math 251, we proved the following relation between being irreducible and being a prime in
Z[i].
Proposition 1.18 Let α ∈ Z[i]. Then α is irreducible if and only if α is a Gaussian prime.
Examples
(a) 5 is a rational prime but not a Gaussian prime
Note that 5 = (2 + i)(2 − i). Since neither 2 + i nor 2 − i is a unit, we get that 5 is
reducible and hence not a Gaussian prime.
(b) 3 is both a rational prime and a Gaussian prime.
Suppose that 3 = αβ for some α, β ∈ Z[i]. Applying the norm N to both sides, we get
that
9 = 3² + 0² = N (3) = N (αβ) = N (α)N (β)
Since N (α), N (β) ∈ N, we get that (N (α), N (β)) ∈ {(1, 9), (9, 1), (3, 3)}. Note that there
are no Gaussian integers with norm 3 (indeed, let a, b ∈ Z with N (a + bi) = 3; then
a² + b² = 3, a contradiction since a, b ∈ Z). Hence N (α) = 1 or N (β) = 1. So α is a unit
or β is a unit. Hence 3 is irreducible and so a Gaussian prime.
(c) 2 + i is a Gaussian prime
Suppose that 2 + i = αβ for some α, β ∈ Z[i]. Applying the norm N to both sides, we
get that
5 = 2² + 1² = N (2 + i) = N (αβ) = N (α)N (β)
Since N (α), N (β) ∈ N, we get that (N (α), N (β)) ∈ {(1, 5), (5, 1)}. Hence N (α) = 1 or
N (β) = 1. So α is a unit or β is a unit. Hence 2 + i is irreducible and so a Gaussian
prime.
Note that this example generalizes : if α ∈ Z[i] such that N (α) is a rational prime, then
α is a Gaussian prime.
(d) 1 − 3i is not a Gaussian prime
How can we come up with a factorization of 1 − 3i? The answer : use the norm! Suppose
that 1 − 3i = αβ for some α, β ∈ Z[i]. Applying the norm N to both sides, we get that
10 = 1² + (−3)² = N (1 − 3i) = N (αβ) = N (α)N (β)
Since N (α), N (β) ∈ N, we may assume that (N (α), N (β)) ∈ {(1, 10), (5, 2)}. If we can
exclude (5, 2) then 1 − 3i will be irreducible. First, we’ll find all α ∈ Z[i] with N (α) = 5.
Putting α = a + bi, we get that N (α) = a² + b² = 5. Since a, b ∈ Z we get
(a, b) ∈ {(1, 2), (1, −2), (−1, 2), (−1, −2), (2, 1), (2, −1), (−2, 1), (−2, −1)}
So
α ∈ {1 + 2i, 1 − 2i, −1 + 2i, −1 − 2i, 2 + i, 2 − i, −2 + i, −2 − i}
We don’t have to check all eight possibilities. If 1 − 3i is divisible by 2 + i, then it is also
divisible by any unit times 2 + i. So we can divide the eight possibilities in groups of
four :
{1 · (2 + i), (−1) · (2 + i), i · (2 + i), (−i) · (2 + i)} = {2 + i, −2 − i, −1 + 2i, 1 − 2i}
{1 · (2 − i), (−1) · (2 − i), i · (2 − i), (−i) · (2 − i)} = {2 − i, −2 + i, 1 + 2i, −1 − 2i}
Now we try one element of each of these groups of four. If neither group works, 1 − 3i is irreducible;
if one of the groups works then we have found a factorization of 1 − 3i.
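The divisibility test used in this example (multiply numerator and denominator by the conjugate of the divisor and check whether the result has integer coordinates) can be sketched in Python as follows. This is an illustration, not part of the notes: Gaussian integers a + bi are represented as pairs (a, b), and the function name gaussian_divide is hypothetical.

```python
def gaussian_divide(alpha, beta):
    """Return alpha / beta as a pair (re, im) if beta divides alpha in Z[i],
    and None otherwise. Gaussian integers a + bi are given as pairs (a, b)."""
    a, b = alpha
    c, d = beta
    norm = c * c + d * d                      # N(beta) = beta * conj(beta)
    # alpha * conj(beta) = (a + bi)(c - di) = (ac + bd) + (bc - ad)i
    re, im = a * c + b * d, b * c - a * d
    if re % norm == 0 and im % norm == 0:
        return (re // norm, im // norm)
    return None


print(gaussian_divide((1, -3), (2, 1)))    # None: 2 + i does not divide 1 - 3i
print(gaussian_divide((1, -3), (2, -1)))   # (1, -1): 1 - 3i = (2 - i)(1 - i)
```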
First, we try 2 + i : is 1 − 3i divisible by 2 + i? We easily get
\[ \frac{1 - 3i}{2 + i} = \frac{(1 - 3i)(2 - i)}{(2 + i)(2 - i)} = \frac{-1 - 7i}{5} = -\frac{1}{5} - \frac{7}{5}\,i \notin \mathbb{Z}[i] \]
So 1 − 3i is not divisible by 2 + i. Next we try 2 − i :
\[ \frac{1 - 3i}{2 - i} = \frac{(1 - 3i)(2 + i)}{(2 - i)(2 + i)} = \frac{5 - 5i}{5} = 1 - i \in \mathbb{Z}[i] \]
Hence we came up with the following factorization :
1 − 3i = (2 − i)(1 − i)
Since neither 2 − i nor 1 − i is a unit (in fact, they are Gaussian primes since their norms
are rational primes), we get that 1 − 3i is reducible and hence not a Gaussian prime.
The following proposition shows the relation between rational primes being Gaussian primes
and sums of squares.
Proposition 1.19 A rational prime p is a Gaussian prime if and only if p is not the sum of
two squares.
Proof : Let p be a rational prime. We will prove that p is reducible if and only if p is the sum
of two squares.
Suppose first that p is the sum of two squares, say p = a² + b² with a, b ∈ N. Then p = a² + b² =
(a + bi)(a − bi). Note that neither a + bi nor a − bi is a unit (since p is a rational prime, we
have that a ≠ 0 ≠ b). Hence p is reducible.
Suppose next that p is reducible. So p = αβ for some α, β ∈ Z[i] \ {1, −1, i, −i}. Applying the
norm to both sides, we find
p² = p² + 0² = N (p) = N (αβ) = N (α)N (β)
Since N (α), N (β) ∈ N \ {1} and p is a rational prime, we get that N (α) = N (β) = p. Put
α = a + bi with a, b ∈ Z. Then p = N (α) = a² + b² . So p is the sum of two squares.
□
Corollary 1.20 Let p be a rational prime with p ≡ 3 mod 4. Then p is a Gaussian prime.
Proof : By Lemma 1.12, p is not the sum of two squares. So by Proposition 1.19, p is a
Gaussian prime.
□
Next, we prove that a rational prime p with p ≡ 1 mod 4 is not a Gaussian prime. We need
Euler’s Criterion (seen in Math 116).
Proposition 1.21 (Euler’s Criterion) Let p be an odd prime and a ∈ Z with gcd(a, p) = 1.
Then a is a square modulo p if and only if a^((p−1)/2) ≡ 1 mod p.
Corollary 1.22 Let p be an odd prime. Then −1 is a square modulo p if and only if p ≡ 1
mod 4.
Proof : Suppose first that p ≡ 1 mod 4. Then p = 4k + 1 for some k ∈ N. Hence
(−1)^((p−1)/2) ≡ (−1)^(2k) ≡ 1 mod p
So by Euler’s Criterion, −1 is a square modulo p.
Suppose next that p ≡ 3 mod 4. Then p = 4k + 3 for some k ∈ N. Hence
(−1)^((p−1)/2) ≡ (−1)^(2k+1) ≡ −1 ≢ 1 mod p
So by Euler’s Criterion, −1 is not a square modulo p.
□
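Corollary 1.22 can be checked numerically for small primes. The sketch below is illustrative only (the helper names is_prime, euler_says_square and is_square_bruteforce are hypothetical); it compares Euler’s Criterion, evaluated with Python’s three-argument pow, against a direct search for a square root of −1 modulo p.

```python
def is_prime(p):
    """Trial division; adequate for the small range used here."""
    return p >= 2 and all(p % q for q in range(2, int(p ** 0.5) + 1))


def euler_says_square(a, p):
    """Euler's Criterion: a is a square mod the odd prime p iff a^((p-1)/2) = 1 mod p."""
    return pow(a % p, (p - 1) // 2, p) == 1


def is_square_bruteforce(a, p):
    """Direct search for x with x^2 = a mod p."""
    return any((x * x - a) % p == 0 for x in range(1, p))


for p in range(3, 60, 2):
    if is_prime(p):
        print(p, p % 4, euler_says_square(-1, p), is_square_bruteforce(-1, p))
```

In every row the two tests agree, and both are true exactly when p ≡ 1 mod 4.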
Proposition 1.23 Let p be a rational prime with p ≡ 1 mod 4. Then p is not a Gaussian
prime.
Proof : Suppose that p is a Gaussian prime. By Corollary 1.22, there exists n ∈ N with
n² ≡ −1 mod p. Hence p|(n² + 1). So p|((n + i)(n − i)). Since p is a Gaussian prime, we get
that p|(n + i) or p|(n − i). Hence there exist a, b ∈ Z with
n ± i = p(a + bi) = pa + pbi
Hence n = pa and |pb| = 1, a contradiction since b ∈ Z and p is a rational prime.
So p is not a Gaussian prime.
□
Theorem 1.24 Let p be a rational prime. Then the following are equivalent :
(a) p is a Gaussian prime.
(b) p ≡ 3 mod 4.
(c) p is not the sum of two squares.
Proof : This follows from Proposition 1.19, Corollary 1.20, Proposition 1.23 and the fact that
2 is not a Gaussian prime (since 2 = (1 + i)(1 − i)).
□
We are now able to describe which natural numbers are the sum of two squares. Note that 0
and 1 are the sum of two squares.
Theorem 1.25 Let n ∈ N with n ≥ 2 and let $n = \prod_{i=1}^{k} p_i^{\alpha_i}$ be the prime factorization of n.
Then n is the sum of two squares if and only if for i = 1, 2, . . . , k, we have that αi is even if
pi ≡ 3 mod 4.
Proof : Assume first that for i = 1, 2, . . . , k, we have that αi is even if pi ≡ 3 mod 4. Note
that 2 is the sum of two squares. If p is an odd rational prime, then p is the sum of two squares
if p ≡ 1 mod 4 by Theorem 1.24 and p² = p² + 0² is the sum of two squares if p ≡ 3 mod 4. So n is the
sum of two squares by Corollary 1.14.
Assume next that n is the sum of two squares. Suppose that there exists i ∈ {1, 2, . . . , k}
such that pi ≡ 3 mod 4 and αi is odd, say i = 1. Then n = a² + b² for some a, b ∈ N. Let
f = gcd(a, b). Put a = f c and b = f d, so that gcd(c, d) = 1. Then
n = a² + b² = f²c² + f²d² = f²(c² + d²)
So f²|n. Put n = f²m. Then we have that
m = c² + d²
Let $f = \prod_{i=1}^{k} p_i^{\delta_i}$ (with δi ∈ N) be the prime factorization of f . Then
\[ m = \frac{n}{f^2} = \frac{\prod_{i=1}^{k} p_i^{\alpha_i}}{\left(\prod_{i=1}^{k} p_i^{\delta_i}\right)^2} = \prod_{i=1}^{k} p_i^{\alpha_i - 2\delta_i} \]
Since α1 is odd, we get that α1 − 2δ1 ≥ 1. So p1 |m. If p1 |c, then p1 |d² since d² = m − c² and
so p1 |d since p1 is a prime, a contradiction since gcd(c, d) = 1. Hence gcd(p1 , c) = 1. As seen
in Math 116, there exists t ∈ N with ct ≡ d mod p1 (indeed, the equation cx ≡ d mod p1 has
a solution). So we get
0 ≡ m ≡ c² + d² ≡ c² + (ct)² ≡ c²(1 + t²) mod p1
Since gcd(p1 , c) = 1, we get that
1 + t² ≡ 0 mod p1
So −1 is a square modulo p1 , a contradiction to Corollary 1.22.
Hence for i = 1, 2, . . . , k, we have that αi is even if pi ≡ 3 mod 4. □
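Theorem 1.25 can be compared against a direct search on a finite range. In the Python sketch below (function names are hypothetical; the factorization is by simple trial division, which is only meant for small n), the criterion of the theorem and the brute-force test agree on every n that is checked.

```python
from math import isqrt


def is_sum_of_two_squares_bruteforce(n):
    """Search for x with n - x^2 a perfect square."""
    return any(isqrt(n - x * x) ** 2 == n - x * x for x in range(isqrt(n) + 1))


def is_sum_of_two_squares_criterion(n):
    """Theorem 1.25: every prime p = 3 mod 4 must occur to an even power in n."""
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if p % 4 == 3 and exponent % 2 == 1:
            return False
        p += 1
    # What is left is 1 or a single prime factor occurring to the power 1.
    return n % 4 != 3


mismatches = [n for n in range(2, 2000)
              if is_sum_of_two_squares_bruteforce(n) != is_sum_of_two_squares_criterion(n)]
print(mismatches)   # []
```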
1.7
Sums of k-th Powers
In the previous section, we found all the natural numbers that can be written as a sum of two
squares. We now could ask the question : which natural numbers can be written as the sum of
three squares? The answer is a theorem similar to Theorem 1.25. Note that not every natural
number can be written as the sum of three squares (7, 15, 23, ... are all counterexamples; in fact,
if n ≡ 7 mod 8 then n can not be written as the sum of three squares). Lagrange however
proved the following amazing result :
Theorem 1.26 (Lagrange, 1770) Every natural number can be written as the sum of four
squares.
In 1770, Edward Waring suggested the following generalization :
Given k ∈ N, does there exist g(k) ∈ N such that every n ∈ N can be written as
the sum of g(k) k-th powers of natural numbers?
He also conjectured that g(3) = 9 and g(4) = 19.
In 1909, Hilbert proved that the function g(k) indeed exists. In the same year, it was proven that
g(3) = 9. Only in 1986 was it shown that g(4) = 19.
We finish this section with a result from Euler (that we can write now as) :
Theorem 1.27 g(k) ≥ 2^k + [(3/2)^k] − 2 for all k ≥ 2.
Notice that in Euler’s time, it was only conjectured that g(k) existed. Euler’s proof is constructive but his lower bound is truly remarkable :
• Mahler proved in 1957 that g(k) ≠ 2^k + [(3/2)^k] − 2 for at most a finite number of values
for k.
• Stemmler, Kubina and Wunderlich showed in 1990 that g(k) = 2^k + [(3/2)^k] − 2 for all
k ≤ 471,600,000.
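The lower bound of Theorem 1.27 is easy to evaluate, and for small k it can be seen to be attained by a specific number. The Python sketch below is an illustration under stated assumptions: the notes do not give the witness, so the choice n = 2^k · [(3/2)^k] − 1 is the standard one from the literature, and the function names are hypothetical. The dynamic program computes the least number of positive k-th powers that sum to n.

```python
def euler_lower_bound(k):
    """2^k + [(3/2)^k] - 2, using exact integer arithmetic for the integral part."""
    return 2 ** k + 3 ** k // 2 ** k - 2


def min_kth_powers(n, k):
    """Least number of positive k-th powers with sum n (simple dynamic program)."""
    powers = [j ** k for j in range(1, n + 1) if j ** k <= n]
    best = [0] * (n + 1)
    for total in range(1, n + 1):
        best[total] = 1 + min(best[total - q] for q in powers if q <= total)
    return best[n]


for k in (2, 3, 4):
    witness = 2 ** k * (3 ** k // 2 ** k) - 1     # assumed witness: 7, 23, 79
    print(k, euler_lower_bound(k), witness, min_kth_powers(witness, k))
```

For k = 2, 3, 4 the bound evaluates to 4, 9 and 19, matching Lagrange’s theorem and Waring’s conjectured values, and the witnesses 7, 23 and 79 need exactly that many k-th powers.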
1.8
The Pell Equation x² − ny² = 1
Let n ∈ N be a natural number that is not a perfect square. Put
Q(√n) = {a + b√n | a, b ∈ Q} and Z[√n] = {a + b√n | a, b ∈ Z}
Since √n is irrational, every element of Q(√n) can be written uniquely as a + b√n for some
a, b ∈ Q. Recall that the norm of a + b√n is N (a + b√n) = a² − nb² .
We proved that α ∈ Z[√n] is a unit if and only if N (α) = ±1. Putting α = x + y√n where
x, y ∈ Z, this leads to solving the Diophantine equations
x² − ny² = −1
or
x² − ny² = 1
In this section, we concentrate on the latter :
x² − ny² = 1
This Diophantine equation is called the Pell equation or Pell-Fermat equation.
We want to find all (x, y) ∈ N² with x² − ny² = 1. Clearly (x, y) = (1, 0) is a natural solution.
This solution is called the trivial solution. We also put (x0 , y0 ) = (1, 0). Are there any other
solutions? We will prove that there are infinitely many solutions that can be described in terms
of a ‘smallest’ or ‘fundamental’ solution.
Suppose that (x1 , y1 ), (x2 , y2 ) ∈ N² are solutions of x² − ny² = 1. Then we have that
x1 < x2 ⇔ x1² < x2² ⇔ 1 + ny1² < 1 + ny2² ⇔ y1² < y2² ⇔ y1 < y2
So we can order the natural solutions of x² − ny² = 1 :
(x1 , y1 ) < (x2 , y2 ) ⇔ x1 < x2 ⇔ y1 < y2
We start by proving that the equation x2 − ny 2 = 1 always has a non-trivial solution. We give
a non-constructive existence proof that relies heavily on the Pigeonhole Principle :
• If we distribute more than k pigeons into k holes then at least one hole contains more
than one pigeon.
• If we distribute infinitely many pigeons into finitely many holes then at least one hole
contains infinitely many pigeons.
The main step is Dirichlet’s Approximation Theorem.
Theorem 1.28 [Dirichlet] Let n, B ∈ N0 with n not a perfect square. Then there exist a, b ∈ Z
such that 1 ≤ b ≤ B and |a − b√n| < 1/B.
Proof : For k = 1, 2, . . . , B + 1, put αk = k√n − [k√n] (where for x ∈ R, [x] is the integral
part of x : it is the largest integer smaller than or equal to x). Note that αk is irrational and
0 < αk < 1 for k = 1, 2, . . . , B + 1. Moreover, {α1 , α2 , . . . , αB+1 } are B + 1 different numbers
because √n is irrational. By the Pigeonhole Principle (divide the interval from 0 to 1 into B subintervals of length 1/B), we have that
|αk − αl | ≤ 1/B for some 1 ≤ k < l ≤ B + 1
Put a = [l√n] − [k√n] and b = l − k. Then a, b ∈ Z, 1 ≤ b = l − k ≤ B and
|a − b√n| = |([l√n] − [k√n]) − (l − k)√n| = |(k√n − [k√n]) − (l√n − [l√n])| = |αk − αl | ≤ 1/B
Since b ≠ 0, we have that a − b√n is irrational and so |a − b√n| < 1/B.
□
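The proof above is non-constructive, but for given n and B a pair (a, b) as in Dirichlet’s Approximation Theorem can simply be searched for. The following Python sketch is an illustration only (the function name dirichlet_pair is hypothetical). To avoid floating-point issues it tests |a − b√n| < 1/B in the equivalent integer form (Ba − 1)² < n(Bb)² < (Ba + 1)², which is valid because Ba ≥ 1 for the candidates considered.

```python
from math import isqrt


def dirichlet_pair(n, B):
    """Return some (a, b) with 1 <= b <= B and |a - b*sqrt(n)| < 1/B,
    for n >= 2 not a perfect square (exact integer arithmetic)."""
    for b in range(1, B + 1):
        floor_val = isqrt(n * b * b)            # [b*sqrt(n)]
        for a in (floor_val, floor_val + 1):    # the two integers nearest to b*sqrt(n)
            A, C = B * a, B * b
            # |a - b*sqrt(n)| < 1/B  <=>  A - 1 < C*sqrt(n) < A + 1
            if (A - 1) ** 2 < n * C * C < (A + 1) ** 2:
                return a, b
    return None


print(dirichlet_pair(2, 10))   # (7, 5): |7 - 5*sqrt(2)| is about 0.071 < 1/10
```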
Corollary 1.29 Let n ∈ N with n not a perfect square. Then there are infinitely many couples
(a, b) ∈ Z × N0 with |a − b√n| < 1/b.
Proof : Suppose that there are only finitely many couples (a, b) ∈ Z × N0 with |a − b√n| < 1/b,
say {(a1 , b1 ), . . . , (ak , bk )}. Pick B ∈ N0 with 1/B ≤ min{1, |a1 − b1√n|, . . . , |ak − bk√n|} (this is possible since every |ai − bi√n| is positive, √n being irrational). By
Dirichlet’s Approximation Theorem, there exists (a, b) ∈ Z × N0 such that 1 ≤ b ≤ B and
|a − b√n| < 1/B. Since 1 ≤ b ≤ B, we have that |a − b√n| < 1/B ≤ 1/b. Hence (a, b) = (ai , bi ) for
some 1 ≤ i ≤ k. But then
|ai − bi√n| = |a − b√n| < 1/B ≤ |ai − bi√n|
a contradiction.
Hence there are infinitely many couples (a, b) ∈ Z × N0 with |a − b√n| < 1/b.
□
Corollary 1.30 Let n ∈ N with n not a perfect square. Then there are infinitely many numbers
a − b√n ∈ Z[√n] with the same norm.
Proof : By Corollary 1.29 there are infinitely many couples (a, b) ∈ Z × N0 with |a − b√n| < 1/b.
If (a, b) is such a couple then
|a + b√n| = |(a − b√n) + 2b√n| ≤ |a − b√n| + 2b√n < 1/b + 2b√n < 3b√n
and so
|N (a − b√n)| = |a² − nb²| = |a − b√n| |a + b√n| < (1/b) · 3b√n = 3√n
Hence there are infinitely many numbers a − b√n ∈ Z[√n] with |N (a − b√n)| < 3√n. Since
the norm is an integer, it follows from the Pigeonhole Principle that there are infinitely many
numbers a − b√n ∈ Z[√n] with the same norm.
□
Corollary 1.31 There exists an integer N and two different numbers a1 − b1√n, a2 − b2√n ∈
Z[√n] such that N (a1 − b1√n) = N (a2 − b2√n) = N , a1 ≡ a2 mod |N |, b1 ≡ b2 mod |N | and
(a1 − b1√n)(a2 − b2√n) > 0.
Proof : By Corollary 1.30, there exists N ∈ Z and infinitely many numbers a − b√n ∈ Z[√n]
with N (a − b√n) = N . Note that N ≠ 0 since √n is irrational. Since there are only finitely many congruence classes modulo |N |,
it follows from the Pigeonhole Principle that there exist p, q ∈ {0, 1, . . . , |N | − 1} such that
infinitely many of these numbers a − b√n satisfy a ≡ p mod |N | and b ≡ q mod |N |. Since
every one of these numbers is either positive or negative, we get (using the Pigeonhole Principle one more
time) that infinitely many of these numbers a − b√n have the same sign. So there are two
different numbers a1 − b1√n, a2 − b2√n ∈ Z[√n] such that N (a1 − b1√n) = N (a2 − b2√n) = N ,
a1 ≡ a2 mod |N |, b1 ≡ b2 mod |N | and (a1 − b1√n)(a2 − b2√n) > 0.
□
We are now ready to prove the existence of a non-trivial solution to the Pell-Fermat equation.
Theorem 1.32 Let n ∈ N with n not a perfect square. Then the equation x2 − ny 2 = 1 has a
non-trivial solution. So there exists (x, y) ∈ N2 such that x2 − ny 2 = 1 and (x, y) 6= (1, 0).
Proof : Let $N$ and $a_1 - b_1\sqrt{n}, a_2 - b_2\sqrt{n} \in \mathbb{Z}[\sqrt{n}]$ be as in Corollary 1.31. Since $a_2^2 - nb_2^2 = N(a_2 - b_2\sqrt{n}) = N$, we get that
\[ \frac{a_1 - b_1\sqrt{n}}{a_2 - b_2\sqrt{n}} = \frac{(a_1 - b_1\sqrt{n})(a_2 + b_2\sqrt{n})}{(a_2 - b_2\sqrt{n})(a_2 + b_2\sqrt{n})} = \frac{a_1 a_2 - n b_1 b_2}{N} + \frac{a_1 b_2 - b_1 a_2}{N} \sqrt{n} \]
Recall that $a_1^2 - nb_1^2 = N(a_1 - b_1\sqrt{n}) = N$, $a_1 \equiv a_2 \bmod |N|$ and $b_1 \equiv b_2 \bmod |N|$. Hence
\[ a_1 a_2 - n b_1 b_2 \equiv a_1 a_1 - n b_1 b_1 \equiv a_1^2 - n b_1^2 \equiv 0 \mod |N| \]
So $a := \frac{a_1 a_2 - n b_1 b_2}{N} \in \mathbb{Z}$. Similarly, we get that $b := \frac{a_1 b_2 - b_1 a_2}{N} \in \mathbb{Z}$. Hence
\[ a + b\sqrt{n} = \frac{a_1 - b_1\sqrt{n}}{a_2 - b_2\sqrt{n}} \in \mathbb{Z}[\sqrt{n}] \]
Since $N(a_1 - b_1\sqrt{n}) = N(a_2 - b_2\sqrt{n})$, we get that
\[ a^2 - nb^2 = N(a + b\sqrt{n}) = N\left( \frac{a_1 - b_1\sqrt{n}}{a_2 - b_2\sqrt{n}} \right) = \frac{N(a_1 - b_1\sqrt{n})}{N(a_2 - b_2\sqrt{n})} = 1 \]
Since $(a_1 - b_1\sqrt{n})(a_2 - b_2\sqrt{n}) > 0$, we get that $a + b\sqrt{n} > 0$. Finally, since $a_1 - b_1\sqrt{n}$ and $a_2 - b_2\sqrt{n}$ are different numbers, we have that $a + b\sqrt{n} \neq 1$. So $(|a|, |b|)$ is a non-trivial solution of $x^2 - ny^2 = 1$.
□
So the equation x2 − ny 2 = 1 has a non-trivial solution. Recall that we can order the natural
solutions of x2 − ny 2 = 1. We call the smallest non-trivial natural solution of x2 − ny 2 = 1 the
fundamental solution of x2 − ny 2 = 1 and denote it by (x1 , y1 ).
By trial and error, we compiled a list of the fundamental solution for some small square-free
values of n :
n           2        3        5        6        7        10
(x1 , y1 )  (3, 2)   (2, 1)   (9, 4)   (5, 2)   (8, 3)   (19, 6)
Later, we will use continued fractions to find the fundamental solution which can be enormous,
even for relatively small values of n. For example, the fundamental solution of x2 − 61y 2 = 1 is
(1766319049, 226153980)
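The trial-and-error search behind the table can be sketched in a few lines of Python. This is only an illustration: the function name `fundamental_solution` and the search bound `y_max` are assumptions made for this sketch, not part of the notes. It tries y = 1, 2, 3, ... and stops at the first y for which ny² + 1 is a perfect square.

```python
from math import isqrt

def fundamental_solution(n, y_max=10**6):
    """Return the natural solution (x, y) of x^2 - n*y^2 = 1 with the smallest y >= 1.

    Plain trial and error over y; returns None if nothing is found up to y_max
    (y_max is an arbitrary bound chosen for this sketch).
    """
    if isqrt(n) ** 2 == n:
        raise ValueError("n must not be a perfect square")
    for y in range(1, y_max + 1):
        x_squared = n * y * y + 1
        x = isqrt(x_squared)
        if x * x == x_squared:
            return (x, y)
    return None

for n in (2, 3, 5, 6, 7, 10):
    print(n, fundamental_solution(n))

# The solution quoted for n = 61 lies far beyond y_max, but it is easy to verify.
print(1766319049**2 - 61 * 226153980**2)
```

For n = 61 this naive search would need more than 2 · 10⁸ iterations, which is why a better method is worth having.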
Let $(x_1, y_1)$ be the fundamental solution of $x^2 - ny^2 = 1$. For $k \in \mathbb{N}$, we have that $(x_1 + y_1\sqrt{n})^k \in \mathbb{Z}[\sqrt{n}]$. Hence we can define $x_k, y_k \in \mathbb{N}$ by
\[ x_k + y_k\sqrt{n} = (x_1 + y_1\sqrt{n})^k \quad \text{for } k = 0, 1, 2, \ldots \]
Note that it follows from the Binomial Theorem that $x_k, y_k$ are indeed natural numbers for all $k \ge 0$.
We now prove that all the natural solutions of x2 − ny 2 = 1 are {(xk , yk ) | k = 0, 1, 2, . . .}.
One final remark : let $x, y \in \mathbb{Q}$ with $x^2 - ny^2 = 1$ (so $N(x + y\sqrt{n}) = 1$). Then
\[ \frac{1}{x + y\sqrt{n}} = \frac{x - y\sqrt{n}}{(x + y\sqrt{n})(x - y\sqrt{n})} = \frac{x - y\sqrt{n}}{x^2 - ny^2} = x - y\sqrt{n} \]
Theorem 1.33 Let $n \in \mathbb{N}$ with $n$ not a perfect square and let $(x_1, y_1)$ be the fundamental solution of $x^2 - ny^2 = 1$. For $k \ge 0$, define $x_k, y_k \in \mathbb{N}$ by $x_k + y_k\sqrt{n} = (x_1 + y_1\sqrt{n})^k$. Then the following holds :
(a) All the natural solutions of $x^2 - ny^2 = 1$ are given by $\{(x_k, y_k) \mid k = 0, 1, 2, \ldots\}$
(b) For all $k \ge 0$, we have that
\[ x_k = \frac{(x_1 + y_1\sqrt{n})^k + (x_1 - y_1\sqrt{n})^k}{2} \quad \text{and} \quad y_k = \frac{(x_1 + y_1\sqrt{n})^k - (x_1 - y_1\sqrt{n})^k}{2\sqrt{n}} \]
(c) For all $k \ge 1$, we have that
\[ x_k = x_1 x_{k-1} + n y_1 y_{k-1} \quad \text{and} \quad y_k = y_1 x_{k-1} + x_1 y_{k-1} \]
Proof : (a) Using the norm, we get that for all $k \ge 0$
\[ x_k^2 - ny_k^2 = N(x_k + y_k\sqrt{n}) = N\left((x_1 + y_1\sqrt{n})^k\right) = \left(N(x_1 + y_1\sqrt{n})\right)^k = (x_1^2 - ny_1^2)^k = 1^k = 1 \]
So $(x_k, y_k)$ is a natural solution of $x^2 - ny^2 = 1$ for all $k \ge 0$.
Suppose that $x, y \in \mathbb{N}$ with $x^2 - ny^2 = 1$. Since $(x_1 + y_1\sqrt{n})^0 = 1$ and $\lim_{k \to +\infty} (x_1 + y_1\sqrt{n})^k = +\infty$, it follows that there exists a unique $k \in \mathbb{N}$ such that
\[ (x_1 + y_1\sqrt{n})^k \le x + y\sqrt{n} < (x_1 + y_1\sqrt{n})^{k+1} \tag{1} \]
Note that
\[ \frac{x + y\sqrt{n}}{(x_1 + y_1\sqrt{n})^k} = \frac{x + y\sqrt{n}}{x_k + y_k\sqrt{n}} = (x + y\sqrt{n})(x_k - y_k\sqrt{n}) = xx_k - nyy_k + (yx_k - xy_k)\sqrt{n} =: a + b\sqrt{n} \]
where $a, b \in \mathbb{Z}$. Applying the norm, we find
\begin{align*}
a^2 - nb^2 &= N(a + b\sqrt{n}) \\
&= N((x + y\sqrt{n})(x_k - y_k\sqrt{n})) \\
&= N(x + y\sqrt{n}) \, N(x_k - y_k\sqrt{n}) \\
&= (x^2 - ny^2)(x_k^2 - ny_k^2) \\
&= 1
\end{align*}
It follows from (1) that
\[ 1 = \frac{(x_1 + y_1\sqrt{n})^k}{(x_1 + y_1\sqrt{n})^k} \le \frac{x + y\sqrt{n}}{(x_1 + y_1\sqrt{n})^k} = a + b\sqrt{n} < \frac{(x_1 + y_1\sqrt{n})^{k+1}}{(x_1 + y_1\sqrt{n})^k} = x_1 + y_1\sqrt{n} \tag{2} \]
Thus
\[ \frac{1}{x_1 + y_1\sqrt{n}} < \frac{1}{a + b\sqrt{n}} \le 1 \]
Hence
\[ x_1 - y_1\sqrt{n} < a - b\sqrt{n} \le 1 \]
and so
\[ -1 \le -a + b\sqrt{n} < -x_1 + y_1\sqrt{n} < 0 \tag{3} \]
Adding (2) and (3), we get that
\[ 0 \le 2b\sqrt{n} < 2y_1\sqrt{n} \]
So
\[ 0 \le b < y_1 \]
Since $a^2 - nb^2 = 1$ and $(x_1, y_1)$ is the fundamental solution of $X^2 - nY^2 = 1$, we get that $(|a|, b) = (1, 0)$. It follows from (2) that $a = 1$. So
\[ x + y\sqrt{n} = (a + b\sqrt{n})(x_1 + y_1\sqrt{n})^k = (x_1 + y_1\sqrt{n})^k = x_k + y_k\sqrt{n} \]
Hence $(x, y) = (x_k, y_k)$, which proves (a).
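(b) Pick $k \ge 0$. Recall that
\[ x_k + y_k\sqrt{n} = (x_1 + y_1\sqrt{n})^k \tag{4} \]
Hence
\[ x_k - y_k\sqrt{n} = \frac{1}{x_k + y_k\sqrt{n}} = \frac{1}{(x_1 + y_1\sqrt{n})^k} = \left( \frac{1}{x_1 + y_1\sqrt{n}} \right)^k = (x_1 - y_1\sqrt{n})^k \tag{5} \]
Solving equations (4) and (5) for $x_k$ and $y_k$, we get
\[ x_k = \frac{(x_1 + y_1\sqrt{n})^k + (x_1 - y_1\sqrt{n})^k}{2} \quad \text{and} \quad y_k = \frac{(x_1 + y_1\sqrt{n})^k - (x_1 - y_1\sqrt{n})^k}{2\sqrt{n}} \]
which proves (b).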
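(c) Pick $k \ge 1$. Then
\begin{align*}
x_k + y_k\sqrt{n} &= (x_1 + y_1\sqrt{n})^k \\
&= (x_1 + y_1\sqrt{n})(x_1 + y_1\sqrt{n})^{k-1} \\
&= (x_1 + y_1\sqrt{n})(x_{k-1} + y_{k-1}\sqrt{n}) \\
&= x_1 x_{k-1} + n y_1 y_{k-1} + \sqrt{n}(x_1 y_{k-1} + y_1 x_{k-1})
\end{align*}
Hence
\[ x_k = x_1 x_{k-1} + n y_1 y_{k-1} \quad \text{and} \quad y_k = x_1 y_{k-1} + y_1 x_{k-1} \]
which proves (c).
□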
Example : Consider the equation
x2 − 3y 2 = 1
We ‘see’ that (2, 1) is the fundamental solution. Hence all the natural solutions of x2 − 3y 2 = 1
are given by
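\[ \left\{ \left( \frac{(2 + \sqrt{3})^k + (2 - \sqrt{3})^k}{2}, \; \frac{(2 + \sqrt{3})^k - (2 - \sqrt{3})^k}{2\sqrt{3}} \right) \;\middle|\; k \in \mathbb{N} \right\} \]
We can also get these solutions by considering
\[ x_k + y_k\sqrt{3} = (2 + \sqrt{3})^k \quad \text{for } k = 0, 1, 2, \ldots \]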
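In practice the recurrence of Theorem 1.33(c) is the most convenient way to list solutions, since it only uses integer arithmetic. The following Python sketch (the function name `pell_solutions` is chosen here for illustration) generates the first few solutions from a given fundamental solution.

```python
def pell_solutions(n, x1, y1, count):
    """First `count` pairs (x_k, y_k), k = 0, 1, ..., built with recurrence (c).

    (x1, y1) is assumed to be the fundamental solution of x^2 - n*y^2 = 1.
    """
    solutions = []
    x, y = 1, 0  # (x_0, y_0)
    for _ in range(count):
        solutions.append((x, y))
        # x_k = x1*x_{k-1} + n*y1*y_{k-1},  y_k = y1*x_{k-1} + x1*y_{k-1}
        x, y = x1 * x + n * y1 * y, y1 * x + x1 * y
    return solutions

for x, y in pell_solutions(3, 2, 1, 5):
    print((x, y), x * x - 3 * y * y)
```

For n = 3 this lists (1, 0), (2, 1), (7, 4), (26, 15), (97, 56), and each pair satisfies x² − 3y² = 1.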
Chapter 2
Analytic Number Theory
2.1
The Riemann-Zeta Function
2.1.1
The Riemann-Zeta Function and the Euler Product
Riemann saw that there was a connection between the distribution of primes and the zeros of
a function, now called the zeta function. The zeta function is a function of a complex variable
traditionally denoted by s with real part σ and imaginary part t (so s = σ + it).
Lemma 2.1 The series $\sum_{n=1}^{+\infty} \frac{1}{n^s}$ converges absolutely and uniformly on any compact subset of the half plane $\mathrm{Re}(s) > 1$.
Proof : Let $T$ be a compact subset of the half plane $\mathrm{Re}(s) > 1$. Then there exists $p > 1$ such that $\mathrm{Re}(s) \ge p$ for all $s \in T$. Note that
\[ \left| \frac{1}{n^s} \right| = \frac{1}{|n^s|} = \frac{1}{|e^{\ln(n)s}|} = \frac{1}{e^{\ln(n)\sigma}} = \frac{1}{n^\sigma} \le \frac{1}{n^p} \quad \text{for all } s \in T \]
Since the series $\sum_{n=1}^{+\infty} \frac{1}{n^p}$ converges (it's a p-series with $p > 1$), we get that the series $\sum_{n=1}^{+\infty} \frac{1}{n^s}$ converges absolutely and uniformly on $T$ by the Weierstrass M-test.
□
Definition 2.2 We define the Riemann-zeta function $\zeta(s)$ by
\[ \zeta(s) = \sum_{n=1}^{+\infty} \frac{1}{n^s} \quad \text{for all } s \in \mathbb{C} \text{ with } \mathrm{Re}(s) > 1 \]
It follows from Lemma 2.1 that ζ(s) is an analytic function on the half plane Re(s) > 1.
The following property of the zeta function will be used to prove that certain sets of prime
numbers are infinite.
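Lemma 2.3 For $s \in \mathbb{R}$ with $s > 1$, we have
\[ \lim_{s \to 1^+} \zeta(s) = +\infty \]
Proof : For all $s \in (1, +\infty)$ and all integers $N > 1$, we have that
\[ \sum_{n=2}^{N} \frac{1}{n^s} < \int_1^N \frac{dx}{x^s} = \frac{1 - N^{1-s}}{s - 1} < \sum_{n=1}^{N-1} \frac{1}{n^s} \]
Taking the limit as $N$ goes to $+\infty$, we find
\[ \sum_{n=2}^{+\infty} \frac{1}{n^s} \le \frac{1}{s - 1} \le \sum_{n=1}^{+\infty} \frac{1}{n^s} \quad \text{for all } s \in (1, +\infty) \]
Hence
\[ \zeta(s) - 1 \le \frac{1}{s - 1} \le \zeta(s) \quad \text{for all } s \in (1, +\infty) \]
So
\[ \frac{1}{s - 1} \le \zeta(s) \le \frac{1}{s - 1} + 1 \quad \text{for all } s \in (1, +\infty) \]
Hence $\lim_{s \to 1^+} \zeta(s) = +\infty$.
□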
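The two-sided bound obtained in this proof can be checked numerically. The Python sketch below is an illustration only: the helper name `zeta_bracket` and the number of terms are assumptions. It encloses ζ(s) between a partial sum plus a lower and an upper integral estimate of the tail (the same integral comparison as in the proof), and then compares with 1/(s − 1) and 1/(s − 1) + 1.

```python
def zeta_bracket(s, terms=100_000):
    """Return (lower, upper) with lower <= zeta(s) <= upper, for real s > 1.

    The tail sum over n > terms lies between the integrals of x^(-s)
    over [terms + 1, +inf) and over [terms, +inf).
    """
    partial = sum(n ** -s for n in range(1, terms + 1))
    lower = partial + (terms + 1) ** (1 - s) / (s - 1)
    upper = partial + terms ** (1 - s) / (s - 1)
    return lower, upper

for s in (2.0, 1.5, 1.1, 1.01):
    lower, upper = zeta_bracket(s)
    print(s, 1 / (s - 1), (lower, upper), 1 / (s - 1) + 1)
```

As s approaches 1 from the right, the printed enclosure of ζ(s) stays within 1 of 1/(s − 1), which is what forces ζ(s) to go to +∞.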
The following theorem gives a relation between the zeta function and prime numbers.
Theorem 2.4 (Euler Product for ζ(s)) For all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$, we have that
\[ \zeta(s) = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} \]
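Proof : Pick $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Let $p_1 < p_2 < p_3 < \cdots$ be the list of all primes. Then by definition, we have
\[ \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} = \lim_{k \to +\infty} \prod_{i=1}^{k} \frac{1}{1 - p_i^{-s}} \]
Note that for any prime $p$,
\[ \sum_{j=0}^{+\infty} p^{-js} = \frac{1}{1 - p^{-s}} \]
where the series converges absolutely. Using the Cauchy Product for absolutely convergent series, we get that for any $k \ge 1$
\[ \prod_{i=1}^{k} \frac{1}{1 - p_i^{-s}} = \prod_{i=1}^{k} \left( \sum_{j=0}^{+\infty} p_i^{-js} \right) = \sum_{\alpha_1, \ldots, \alpha_k = 0}^{+\infty} (p_1^{\alpha_1} \cdots p_k^{\alpha_k})^{-s} = \sum_{n \in N_k} \frac{1}{n^s} \]
where $N_k$ is the set of natural numbers that have no prime divisors bigger than $p_k$. Note that $N_1 \subseteq N_2 \subseteq N_3 \subseteq \cdots$ and $\cup_{k=1}^{+\infty} N_k = \mathbb{N}$. Since the series $\sum_{n=1}^{+\infty} \frac{1}{n^s}$ converges absolutely, we get
\[ \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} = \lim_{k \to +\infty} \prod_{i=1}^{k} \frac{1}{1 - p_i^{-s}} = \lim_{k \to +\infty} \sum_{n \in N_k} \frac{1}{n^s} = \sum_{n=1}^{+\infty} \frac{1}{n^s} = \zeta(s) \]
□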
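As a numerical illustration of the Euler Product, one can compare a truncated product over the primes up to some bound with a known value of the zeta function. The Python sketch below uses the classical value ζ(2) = π²/6, which is not derived in these notes; the function names and the bound 10 000 are choices made for this sketch.

```python
from math import pi

def primes_up_to(limit):
    """All primes <= limit, by the sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [p for p, flag in enumerate(is_prime) if flag]

def euler_product(s, limit):
    """Product of 1 / (1 - p^(-s)) over all primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= 1.0 / (1.0 - p ** -s)
    return product

for limit in (10, 100, 1000, 10_000):
    print(limit, euler_product(2, limit))
print("zeta(2) =", pi ** 2 / 6)
```

The truncated products increase with the bound and approach π²/6 from below, in line with the limit over k in the proof.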
This Euler Product leads to an easy analytic proof of Euclid’s Theorem.
Theorem 2.5 (Euclid) There are infinitely many prime numbers.
Proof : Suppose that there are only finitely many prime numbers, say $p_1 < p_2 < \cdots < p_k$. Then it follows from the Euler Product that
\[ \lim_{s \to 1^+} \zeta(s) = \lim_{s \to 1^+} \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} = \lim_{s \to 1^+} \prod_{i=1}^{k} \frac{1}{1 - p_i^{-s}} = \prod_{i=1}^{k} \frac{1}{1 - p_i^{-1}} < +\infty \]
This is a contradiction to Lemma 2.3. Hence there are infinitely many prime numbers.
□
2.1.2
Analytic Continuation of ζ(s) and the Riemann Hypothesis
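The zeta function $\zeta(s)$ is only defined for $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Euler proved that the series
\[ \eta(s) = \sum_{n=1}^{+\infty} \frac{(-1)^{n-1}}{n^s} \]
is an analytic function on the set $\{s \in \mathbb{C} \setminus \{1\} \mid \mathrm{Re}(s) > 0\}$ and that
\[ \zeta(s) = \frac{\eta(s)}{1 - 2^{1-s}} \quad \text{for all } s \in \mathbb{C} \text{ with } \mathrm{Re}(s) > 1 \]
So $\frac{\eta(s)}{1 - 2^{1-s}}$ is an analytic continuation of $\zeta(s)$. It follows from the theory of analytic continuation that this is the only way to extend the definition of $\zeta(s)$ in an analytic way to $\mathrm{Re}(s) > 0$.
Riemann showed that there exists an analytic continuation $\tilde{\zeta}(s)$ of $\zeta(s)$ that is defined over $\mathbb{C} \setminus \{1\}$ (again, this continuation is unique). For $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 0$, put $\Gamma(s) = \int_0^{+\infty} x^{s-1} e^{-x} \, dx$. Then $\Gamma(s)$ extends to an analytic function on $\mathbb{C} \setminus \{0, -1, -2, \ldots\}$. Riemann proved that
\[ \tilde{\zeta}(s) = 2^s \pi^{s-1} \sin\left( \frac{\pi s}{2} \right) \Gamma(1 - s) \, \zeta(1 - s) \quad \text{for all } s \in \mathbb{C} \text{ with } \mathrm{Re}(s) < 0 \tag{$*$} \]
It follows from this formula that $\tilde{\zeta}(s) = 0$ for $s = -2, -4, -6, \ldots$. These are called the trivial zeros of $\tilde{\zeta}(s)$.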
Using the Euler Product, one can show that $\zeta(s) \neq 0$ for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Since $\Gamma(s) \neq 0$ for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 0$, it follows from (*) that $\tilde{\zeta}(s)$ has no non-trivial zeros with $\mathrm{Re}(s) < 0$. So if $s$ is a non-trivial zero of $\tilde{\zeta}(s)$ then $0 \le \mathrm{Re}(s) \le 1$. It turns out that the non-trivial zeros of $\tilde{\zeta}(s)$ are symmetric around the line $\mathrm{Re}(s) = \frac{1}{2}$ : if $s$ is a non-trivial zero of $\tilde{\zeta}(s)$ then so is $1 - s$, that is, $\tilde{\zeta}(s) = 0 \Leftrightarrow \tilde{\zeta}(1 - s) = 0$. This leads to Riemann's famous conjecture :
Riemann Hypothesis : If $s$ is a non-trivial zero of $\tilde{\zeta}(s)$ then $\mathrm{Re}(s) = \frac{1}{2}$.
In 1896, Hadamard and de la Vallée-Poussin were able to prove the following :
If $s$ is a non-trivial zero of $\tilde{\zeta}(s)$ then $\mathrm{Re}(s) \neq 1$.
2.2
The Prime Number Theorem
For $x \ge 0$, let $\pi(x)$ be the number of primes less than or equal to $x$.
In 1859, Riemann found a connection between $\pi(x)$ and the non-trivial zeros of $\tilde{\zeta}(s)$. He also showed how a proof of his conjecture would result in a proof that $\pi(x)$ is asymptotic to $\frac{x}{\ln(x)}$. In 1896, this major result in number theory was finally proven by Hadamard and de la Vallée-Poussin (although they were unable to prove Riemann's conjecture, they did prove a related result).
Theorem 2.6 (Prime Number Theorem)
\[ \lim_{x \to +\infty} \frac{\pi(x)}{\dfrac{x}{\ln(x)}} = \lim_{x \to +\infty} \frac{\pi(x)}{\displaystyle\int_2^x \frac{dt}{\ln(t)}} = 1 \]
In 1949, Selberg and Erdős found elementary proofs of the Prime Number Theorem (in number
theory, an elementary proof is a proof that does not use complex analysis or abstract algebra;
it can still be extremely complicated).
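To get a feeling for the Prime Number Theorem, one can tabulate π(x) against x/ln(x) for a few values of x. The Python sketch below is only an illustration (the function name `prime_pi` is chosen here); it counts primes with a sieve of Eratosthenes.

```python
from math import log

def prime_pi(x):
    """Number of primes <= x (x a non-negative integer), by a sieve."""
    if x < 2:
        return 0
    is_prime = [True] * (x + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, x + 1, p):
                is_prime[multiple] = False
    return sum(is_prime)

for x in (10**2, 10**3, 10**4, 10**5, 10**6):
    count = prime_pi(x)
    print(x, count, count / (x / log(x)))
```

The printed ratio decreases slowly towards 1; the convergence in the theorem is known to be slow, so values noticeably above 1 at this range are expected.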
2.3
Dirichlet’s Theorem
This section is devoted to the proof of the following theorem (due to Dirichlet) :
Let a, m ∈ N with gcd(a, m) = 1. Then there exist infinitely many primes p with
p ≡ a mod m.
For certain values of a and m, one can prove quite easily that there are infinitely many primes p
with p ≡ a mod m. For example, if a = 1 and m = 2, we need to prove that there are infinitely
many odd primes. This easily follows from Euclid’s Theorem (there are infinitely many primes)
and the fact that 2 is the only even prime.
The following theorem gives an elementary proof of Dirichlet’s Theorem in the special case that
a = 3 and m = 4.
Theorem 2.7 There are infinitely many primes $p$ with $p \equiv 3 \bmod 4$.
Proof : Suppose there are only finitely many primes $p$ with $p \equiv 3 \bmod 4$, say $p_1 < p_2 < \cdots < p_k$. Put $N = p_1^2 p_2^2 \cdots p_k^2 + 2$. Modulo 4, we find
\[ N \equiv p_1^2 p_2^2 \cdots p_k^2 + 2 \equiv 3^2 \cdot 3^2 \cdots 3^2 + 2 \equiv 1 \cdot 1 \cdots 1 + 2 \equiv 1 + 2 \equiv 3 \mod 4 \]
In particular, $N$ is odd. So 2 does not divide $N$. If $p_i$ divides $N$ for some $i = 1, 2, \ldots, k$ then $p_i \mid (N - p_1^2 p_2^2 \cdots p_k^2)$ and so $p_i$ divides 2, a contradiction. Hence if a prime $p$ divides $N$ then $p \equiv 1 \bmod 4$. So the prime factorization of $N$ is of the form
\[ N = q_1 q_2 \cdots q_n \]
where $q_i$ is a prime and $q_i \equiv 1 \bmod 4$ for $i = 1, 2, \ldots, n$. Again considering modulo 4, we find that
\[ N \equiv q_1 q_2 \cdots q_n \equiv 1 \cdot 1 \cdots 1 \equiv 1 \mod 4 \]
a contradiction since $N \equiv 3 \bmod 4$.
Hence there are infinitely many primes $p$ with $p \equiv 3 \bmod 4$.
□
2.3.1
Group Characters
Definition 2.8 Let (G, ·) be a finite abelian group.
(1) A character of G is a homomorphism φ : G → (C0 , ·)
(2) Ĝ is the set of all characters of G.
(3) For ϕ, ψ ∈ Ĝ, we define the map ϕ · ψ : G → C0 : g → ϕ(g)ψ(g)
One easily checks that
• (Ĝ, ·) is an abelian group with identity element ϕ0 : G → C0 : g → 1 (also called the
trivial character )
• the inverse of ϕ ∈ Ĝ is the character ϕ−1 : G → C0 : g → 1/ϕ(g)
The following proposition gives us some basic properties about characters.
Proposition 2.9 Let G be a finite abelian group. Then the following holds :
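(a) Let $\varphi \in \hat{G}$, $g \in G$ and $n \in \mathbb{N}$ with $g^n = 1_G$. Then $\varphi(g)$ is an $n$-th root of unity. In particular, $\frac{1}{\varphi(g)} = \overline{\varphi(g)}$.
(b) $|\hat{G}| = |G|$
(c) Let $1_G \neq g \in G$. Then there exists $\varphi \in \hat{G}$ with $\varphi(g) \neq 1$.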
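Proof : (a) Since $\varphi$ is a homomorphism, we get
\[ (\varphi(g))^n = \varphi(g^n) = \varphi(1_G) = 1 \]
Hence $|\varphi(g)| = 1$. Put $\varphi(g) = a + bi$ with $a, b \in \mathbb{R}$. Then $a^2 + b^2 = |\varphi(g)|^2 = 1$. So
\[ \frac{1}{\varphi(g)} = \frac{1}{a + bi} = \frac{a - bi}{(a + bi)(a - bi)} = \frac{a - bi}{a^2 + b^2} = a - bi = \overline{\varphi(g)} \]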
(b) (c) It follows from the Fundamental Theorem for Finite Abelian Groups that $G$ is the direct sum of a finite number of cyclic subgroups (say $t$). So there exist $g_1, \ldots, g_t \in G$ and $n_1, \ldots, n_t \in \mathbb{N} \setminus \{0, 1\}$ such that $g_i$ is of order $n_i$ for $i = 1, 2, \ldots, t$ and every $g \in G$ can be written uniquely as $g = g_1^{m_1} \cdots g_t^{m_t}$ with $0 \le m_i < n_i$ for $i = 1, 2, \ldots, t$ (hence $G = \langle g_1 \rangle \times \langle g_2 \rangle \times \cdots \times \langle g_t \rangle$).
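(b) For $n \in \mathbb{N} \setminus \{0\}$, put $C_n = \{c \in \mathbb{C} \mid c^n = 1\}$. Consider the map
\[ \theta : \hat{G} \to C_{n_1} \times C_{n_2} \times \cdots \times C_{n_t} : \varphi \to (\varphi(g_1), \varphi(g_2), \ldots, \varphi(g_t)) \]
By (a), $\theta$ is well-defined.
Suppose that $\varphi, \psi \in \hat{G}$ with $\theta(\varphi) = \theta(\psi)$. So $\varphi(g_i) = \psi(g_i)$ for $i = 1, 2, \ldots, t$. Pick $g \in G$. Then $g = g_1^{m_1} \cdots g_t^{m_t}$ where $0 \le m_i < n_i$ for $i = 1, 2, \ldots, t$. Since $\varphi, \psi$ are homomorphisms, we get
\[ \varphi(g) = \varphi(g_1^{m_1} \cdots g_t^{m_t}) = (\varphi(g_1))^{m_1} \cdots (\varphi(g_t))^{m_t} = (\psi(g_1))^{m_1} \cdots (\psi(g_t))^{m_t} = \psi(g_1^{m_1} \cdots g_t^{m_t}) = \psi(g) \]
Hence $\varphi = \psi$ and $\theta$ is one-to-one.
Pick $(\alpha_1, \ldots, \alpha_t) \in C_{n_1} \times \cdots \times C_{n_t}$. Define the map
\[ \varphi : G \to \mathbb{C}_0 : g_1^{m_1} \cdots g_t^{m_t} \to \alpha_1^{m_1} \cdots \alpha_t^{m_t} \]
where $0 \le m_i < n_i$ for $i = 1, 2, \ldots, t$. One easily checks that $\varphi \in \hat{G}$ and $\theta(\varphi) = (\alpha_1, \ldots, \alpha_t)$. So $\theta$ is onto.
Hence $\theta$ is a bijection. So
\[ |\hat{G}| = |C_{n_1} \times \cdots \times C_{n_t}| = |C_{n_1}| \cdots |C_{n_t}| = n_1 \cdots n_t = |\langle g_1 \rangle| \cdots |\langle g_t \rangle| = |\langle g_1 \rangle \times \cdots \times \langle g_t \rangle| = |G| \]
which proves (b).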
(c) We can write $g = g_1^{k_1} \cdots g_t^{k_t}$ with $0 \le k_i < n_i$ for $i = 1, 2, \ldots, t$. Since $g \neq 1_G$, there exists $j \in \{1, 2, \ldots, t\}$ with $k_j \neq 0$. Define the map
\[ \varphi : G \to \mathbb{C}_0 : g_1^{m_1} \cdots g_t^{m_t} \to e^{\frac{2\pi m_j}{n_j} i} \]
where $0 \le m_i < n_i$ for $i = 1, 2, \ldots, t$. One easily checks that $\varphi \in \hat{G}$ and $\varphi(g) = e^{\frac{2\pi k_j}{n_j} i} \neq 1$.
□
The characters of a finite abelian group satisfy some nice relations.
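Proposition 2.10 (Orthogonality Relations) Let $G$ be a finite abelian group and $\varphi_0$ the trivial character. Then the following relations hold :
(a) For all $\varphi \in \hat{G}$, we have that
\[ \sum_{g \in G} \varphi(g) = \begin{cases} |G| & \text{if } \varphi = \varphi_0 \\ 0 & \text{if } \varphi \neq \varphi_0 \end{cases} \]
(b) For all $g \in G$, we have that
\[ \sum_{\varphi \in \hat{G}} \varphi(g) = \begin{cases} |G| & \text{if } g = 1_G \\ 0 & \text{if } g \neq 1_G \end{cases} \]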
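Proof : (a) Pick $\varphi \in \hat{G}$. If $\varphi = \varphi_0$ then
\[ \sum_{g \in G} \varphi(g) = \sum_{g \in G} 1 = |G| \]
So we may assume that $\varphi \neq \varphi_0$. Then there exists $h \in G$ with $\varphi(h) \neq 1$. Note that $G = \{hg \mid g \in G\}$. Since $\varphi$ is a homomorphism, we get
\[ \sum_{g \in G} \varphi(g) = \sum_{g \in G} \varphi(hg) = \sum_{g \in G} \varphi(h)\varphi(g) = \varphi(h) \left( \sum_{g \in G} \varphi(g) \right) \]
Hence $(\varphi(h) - 1) \left( \sum_{g \in G} \varphi(g) \right) = 0$. Since $\varphi(h) \neq 1$, we get that $\sum_{g \in G} \varphi(g) = 0$.
(b) Pick $g \in G$. If $g = 1_G$ then by Proposition 2.9(b)
\[ \sum_{\varphi \in \hat{G}} \varphi(g) = \sum_{\varphi \in \hat{G}} 1 = |\hat{G}| = |G| \]
So we may assume that $g \neq 1_G$. By Proposition 2.9(c), there exists $\psi \in \hat{G}$ with $\psi(g) \neq 1$. Note that $\hat{G} = \{\psi\varphi \mid \varphi \in \hat{G}\}$. Then we get
\[ \sum_{\varphi \in \hat{G}} \varphi(g) = \sum_{\varphi \in \hat{G}} (\psi\varphi)(g) = \sum_{\varphi \in \hat{G}} \psi(g)\varphi(g) = \psi(g) \sum_{\varphi \in \hat{G}} \varphi(g) \]
Hence $(\psi(g) - 1) \sum_{\varphi \in \hat{G}} \varphi(g) = 0$. Since $\psi(g) \neq 1$, we get that $\sum_{\varphi \in \hat{G}} \varphi(g) = 0$.
□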
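The orthogonality relations are easy to check numerically for a cyclic group. The Python sketch below works with the additive group Z/nZ (so the identity element is 0), whose n characters send g to e^{2πijg/n} for j = 0, 1, ..., n − 1, as in the construction used for Proposition 2.9. The function names are chosen for this illustration.

```python
import cmath

def character(n, j):
    """Character phi_j of the additive group Z/nZ: g -> exp(2*pi*i*j*g/n)."""
    return lambda g: cmath.exp(2j * cmath.pi * j * g / n)

def sum_over_group(n, j):
    """Relation (a): sum of phi_j(g) over all g in Z/nZ."""
    return sum(character(n, j)(g) for g in range(n))

def sum_over_characters(n, g):
    """Relation (b): sum of phi_j(g) over all characters phi_j."""
    return sum(character(n, j)(g) for j in range(n))

n = 6
print([round(abs(sum_over_group(n, j)), 9) for j in range(n)])
print([round(abs(sum_over_characters(n, g)), 9) for g in range(n)])
```

Up to floating-point rounding, both lists are n in position 0 (trivial character, respectively identity element) and 0 elsewhere.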
2.3.2
Dirichlet Characters and L-Functions
Throughout this section, m ∈ N with m ≥ 2.
Definition 2.11
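(a) For $n \in \mathbb{Z}$, we put $\overline{n} = n + m\mathbb{Z} \in \mathbb{Z}/m\mathbb{Z}$.
(b) Put $\mathbb{Z}_m^* = \{\overline{n} \mid n \in \mathbb{Z}, \gcd(m, n) = 1\}$. Note that $(\mathbb{Z}_m^*, \cdot)$ is an abelian group of order $\varphi(m)$ (where $\varphi$ is the Euler-Phi function).
(c) A Dirichlet character (mod $m$) is a character of $\mathbb{Z}_m^*$. We denote the trivial character of $\mathbb{Z}_m^*$ by $\chi_0$.
(d) Let $\chi$ be a Dirichlet character. We extend the definition of $\chi$ to $\mathbb{N}$ as follows :
\[ \chi : \mathbb{N} \to \mathbb{C} : n \to \begin{cases} \chi(\overline{n}) & \text{if } \gcd(m, n) = 1 \\ 0 & \text{if } \gcd(m, n) \neq 1 \end{cases} \]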
One easily checks that χ(n1 n2 ) = χ(n1 )χ(n2 ) for all n1 , n2 ∈ N.
(e) The L-function associated to the Dirichlet character $\chi$ is the series
\[ L(s, \chi) = \sum_{n=1}^{+\infty} \frac{\chi(n)}{n^s} \]
First, we prove that L(s, χ) is an analytic function on the half plane Re(s) > 0 whenever χ is
not the trivial Dirichlet character.
Theorem 2.12 Let $\langle a_n \rangle_{n \ge 1}$ be a sequence of complex numbers such that the sequence of partial sums $\left\langle \sum_{k=1}^{n} a_k \right\rangle_{n \ge 1}$ is bounded. Then the series $\sum_{n=1}^{+\infty} \frac{a_n}{n^s}$ converges to an analytic function on the half plane $\mathrm{Re}(s) > 0$.
Proof :
□
Corollary 2.13 Let χ be a non-trivial Dirichlet character. Then L(s, χ) is an analytic function
on the half plane Re(s) > 0.
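Proof : Pick $n \in \mathbb{N}$. Put $n = qm + r$ with $q, r \in \mathbb{N}$ and $0 \le r < m$. Then
\[ \sum_{k=1}^{n} \chi(k) = \left( \sum_{k=1}^{qm} \chi(k) \right) + \left( \sum_{k=1}^{r} \chi(qm + k) \right) = q \left( \sum_{k=1}^{m} \chi(k) \right) + \left( \sum_{k=1}^{r} \chi(k) \right) \]
By Proposition 2.10(a),
\[ \sum_{k=1}^{m} \chi(k) = \sum_{\substack{k=1 \\ \gcd(k,m)=1}}^{m} \chi(k) = 0 \]
So
\[ \left| \sum_{k=1}^{n} \chi(k) \right| = \left| \sum_{k=1}^{r} \chi(k) \right| \le \sum_{k=1}^{r} |\chi(k)| \le r < m \]
by Proposition 2.9(a).
Hence the sequence $\left\langle \sum_{k=1}^{n} \chi(k) \right\rangle_{n \ge 1}$ is bounded. So by Theorem 2.12, the series $L(s, \chi) = \sum_{n=1}^{+\infty} \frac{\chi(n)}{n^s}$ is an analytic function on the half plane $\mathrm{Re}(s) > 0$.
□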
For the trivial character, we can prove the following proposition.
Proposition 2.14 Let $\chi_0$ be the trivial Dirichlet character. Then $L(s, \chi_0)$ is an analytic function on the half plane $\mathrm{Re}(s) > 1$.
Proof : It's enough to prove that the series $\sum_{n=1}^{+\infty} \frac{\chi_0(n)}{n^s}$ converges uniformly on any compact subset of the half plane $\mathrm{Re}(s) > 1$. Let $T$ be a compact subset of the half plane $\mathrm{Re}(s) > 1$. Then there exists $p > 1$ such that $\mathrm{Re}(s) \ge p$ for all $s \in T$. So by Proposition 2.9(a), we get
\[ \left| \frac{\chi_0(n)}{n^s} \right| = \frac{|\chi_0(n)|}{|n^s|} \le \frac{1}{|n^s|} = \frac{1}{|e^{\ln(n)s}|} = \frac{1}{e^{\ln(n)\sigma}} = \frac{1}{n^\sigma} \le \frac{1}{n^p} \quad \text{for all } s \in T \]
Since the series $\sum_{n=1}^{+\infty} \frac{1}{n^p}$ converges (it's a p-series with $p > 1$), we get that the series $\sum_{n=1}^{+\infty} \frac{\chi_0(n)}{n^s}$ converges uniformly on $T$ by the Weierstrass M-test.
□
Similarly as the Euler Product for the zeta function, there is a relation between L(s, χ) and
the prime numbers.
Theorem 2.15 (Euler Product for L(s, χ)) Let $\chi$ be a Dirichlet character. Then for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$, we have that
\[ L(s, \chi) = \prod_{p \text{ prime}} \frac{1}{1 - \chi(p)p^{-s}} \]
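Proof : Pick $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Let $p_1 < p_2 < p_3 < \cdots$ be the list of all primes. Then by definition, we have
\[ \prod_{p \text{ prime}} \frac{1}{1 - \chi(p)p^{-s}} = \lim_{k \to +\infty} \prod_{i=1}^{k} \frac{1}{1 - \chi(p_i)p_i^{-s}} \]
Note that for any prime $p$,
\[ \sum_{j=0}^{+\infty} \chi(p)^j p^{-js} = \frac{1}{1 - \chi(p)p^{-s}} \]
where the series converges absolutely. Using the Cauchy Product for absolutely convergent series and the fact that $\chi$ is completely multiplicative, we get
\[ \prod_{i=1}^{k} \frac{1}{1 - \chi(p_i)p_i^{-s}} = \prod_{i=1}^{k} \left( \sum_{j=0}^{+\infty} \chi(p_i)^j p_i^{-js} \right) = \sum_{\alpha_1, \ldots, \alpha_k = 0}^{+\infty} \left( \chi(p_1)^{\alpha_1} \cdots \chi(p_k)^{\alpha_k} \right) (p_1^{\alpha_1} \cdots p_k^{\alpha_k})^{-s} = \sum_{n \in N_k} \frac{\chi(n)}{n^s} \]
where $N_k$ is the set of natural numbers that have no prime divisors bigger than $p_k$. Note that $N_1 \subseteq N_2 \subseteq N_3 \subseteq \cdots$ and $\cup_{k=1}^{+\infty} N_k = \mathbb{N}$. Since the series $\sum_{n=1}^{+\infty} \frac{\chi(n)}{n^s}$ converges absolutely (because $\left| \frac{\chi(n)}{n^s} \right| \le \frac{1}{|n^s|}$ and the series $\sum_{n=1}^{+\infty} \frac{1}{n^s}$ converges absolutely), we get
\[ \prod_{p \text{ prime}} \frac{1}{1 - \chi(p)p^{-s}} = \lim_{k \to +\infty} \prod_{i=1}^{k} \frac{1}{1 - \chi(p_i)p_i^{-s}} = \lim_{k \to +\infty} \sum_{n \in N_k} \frac{\chi(n)}{n^s} = \sum_{n=1}^{+\infty} \frac{\chi(n)}{n^s} = L(s, \chi) \]
□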
The following property of L(s, χ) is not so easy to prove.
Theorem 2.16 Let $\chi$ be a non-trivial Dirichlet character. Then $L(1, \chi) \neq 0$.
Proof :
□
The trivial Dirichlet character behaves quite differently.
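Theorem 2.17 Let $\chi_0$ be the trivial Dirichlet character. Then for $s \in \mathbb{R}$ with $s > 1$, we have
\[ \lim_{s \to 1^+} L(s, \chi_0) = +\infty \]
Proof : Pick $s \in \mathbb{R}$ with $s > 1$. Using the Euler Product for $L(s, \chi_0)$ and $\zeta(s)$, we get
\begin{align*}
L(s, \chi_0) &= \prod_{p \text{ prime}} \frac{1}{1 - \chi_0(p)p^{-s}} \\
&= \prod_{\substack{p \text{ prime} \\ p \nmid m}} \frac{1}{1 - p^{-s}} \\
&= \left( \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} \right) \prod_{\substack{p \text{ prime} \\ p \mid m}} (1 - p^{-s}) \\
&= \zeta(s) \prod_{\substack{p \text{ prime} \\ p \mid m}} (1 - p^{-s})
\end{align*}
Note that
\[ \lim_{s \to 1^+} \prod_{\substack{p \text{ prime} \\ p \mid m}} (1 - p^{-s}) = \prod_{\substack{p \text{ prime} \\ p \mid m}} (1 - p^{-1}) \]
is a finite strictly positive real number. Hence it follows from Lemma 2.3 that $\lim_{s \to 1^+} L(s, \chi_0) = +\infty$.
□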
Proposition 2.18 Let $\chi$ be a Dirichlet character. Then the following holds :
(a) The series $\sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right)$ converges to an analytic function $M(s, \chi)$ on the half plane $\mathrm{Re}(s) > 1$
(b) $e^{M(s,\chi)} = L(s, \chi)$ for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$
(c) There exists a function $\Phi(s, \chi)$ defined on the half plane $\mathrm{Re}(s) > 1$ such that $\Phi(s, \chi)$ is bounded on the half plane $\mathrm{Re}(s) > 1$ and
\[ M(s, \chi) = \Phi(s, \chi) + \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} \]
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$.
(d) $M(s, \chi)$ is bounded on $(1, +\infty)$ if $\chi$ is not the trivial character.
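Proof : (a) Let $T$ be a compact subset of the half plane $\mathrm{Re}(s) > 1$. Then there exists $\sigma > 1$ such that $\mathrm{Re}(s) \ge \sigma$ for all $s \in T$. Note that
\[ \left| \frac{\chi(p)^k}{kp^{ks}} \right| \le \frac{1}{kp^{k\sigma}} \quad \text{for all } k \ge 1, \text{ all primes } p \text{ and all } s \in T \]
and
\[ \sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{1}{kp^{k\sigma}} \right) \le \sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{1}{p^{k\sigma}} \right) = \sum_{p \text{ prime}} \frac{1}{p^\sigma - 1} < 2 \sum_{p \text{ prime}} \frac{1}{p^\sigma} < 2 \sum_{n=1}^{+\infty} \frac{1}{n^\sigma} < +\infty \]
since the latter series is a p-series with $p = \sigma > 1$. Hence the series $\sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right)$ converges absolutely and uniformly on $T$. Since $T$ was an arbitrary compact subset of the half plane $\mathrm{Re}(s) > 1$, we get that the series $\sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right)$ converges to an analytic function $M(s, \chi)$ on the half plane $\mathrm{Re}(s) > 1$.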
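(b) Let $p_1 < p_2 < p_3 < \cdots$ be the list of all primes. The above also shows that for $i = 1, 2, \ldots$, the series $\sum_{k=1}^{+\infty} \frac{\chi(p_i)^k}{kp_i^{ks}}$ converges to an analytic function $f_i(s)$ on the half plane $\mathrm{Re}(s) > 1$.
Recall the following from complex analysis :
The series $\sum_{k=1}^{+\infty} \frac{z^k}{k}$ converges to an analytic function $g(z)$ on $\{z \in \mathbb{C} \mid |z| < 1\}$ with $e^{g(z)} = \frac{1}{1 - z}$ for all $z \in \mathbb{C}$ with $|z| < 1$.
Note that $\left| \frac{\chi(p_i)}{p_i^s} \right| \le \frac{1}{p_i^{\mathrm{Re}(s)}} < 1$ for $i = 1, 2, \ldots$ and all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Hence
\[ e^{f_i(s)} = \frac{1}{1 - \chi(p_i)p_i^{-s}} \quad \text{for } i = 1, 2, \ldots \text{ and all } s \in \mathbb{C} \text{ with } \mathrm{Re}(s) > 1 \]
Since the sequence $\left\langle \sum_{i=1}^{n} f_i(s) \right\rangle_{n \ge 1}$ converges to $M(s, \chi)$ on the half plane $\mathrm{Re}(s) > 1$ and $e^z$ is analytic on $\mathbb{C}$, we get that the sequence $\left\langle e^{\sum_{i=1}^{n} f_i(s)} \right\rangle_{n \ge 1}$ converges to $e^{M(s,\chi)}$ on the half plane $\mathrm{Re}(s) > 1$. But
\[ e^{\sum_{i=1}^{n} f_i(s)} = \prod_{i=1}^{n} e^{f_i(s)} = \prod_{i=1}^{n} \frac{1}{1 - \chi(p_i)p_i^{-s}} \quad \text{for all } s \in \mathbb{C} \text{ with } \mathrm{Re}(s) > 1 \]
Hence by Theorem 2.15, we get
\[ e^{M(s,\chi)} = \lim_{n \to +\infty} \prod_{i=1}^{n} \frac{1}{1 - \chi(p_i)p_i^{-s}} = \prod_{p \text{ prime}} \frac{1}{1 - \chi(p)p^{-s}} = L(s, \chi) \]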
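(c) For $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$, put
\[ \Phi(s, \chi) = \sum_{p \text{ prime}} \left( \sum_{k=2}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right) \]
Since the series $\sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right)$ converges absolutely, we get that
\begin{align*}
M(s, \chi) &= \sum_{p \text{ prime}} \left( \sum_{k=1}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right) \\
&= \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} + \sum_{p \text{ prime}} \left( \sum_{k=2}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right) \\
&= \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} + \Phi(s, \chi)
\end{align*}
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$.
Again because of absolute convergence, we get that
\begin{align*}
|\Phi(s, \chi)| &= \left| \sum_{p \text{ prime}} \left( \sum_{k=2}^{+\infty} \frac{\chi(p)^k}{kp^{ks}} \right) \right| \\
&\le \sum_{p \text{ prime}} \left( \sum_{k=2}^{+\infty} \left| \frac{\chi(p)^k}{kp^{ks}} \right| \right) \\
&\le \sum_{p \text{ prime}} \left( \sum_{k=2}^{+\infty} \frac{1}{p^k} \right) \\
&= \sum_{p \text{ prime}} \frac{1}{p^2 - p} \\
&\le 2 \sum_{p \text{ prime}} \frac{1}{p^2} \\
&\le 2 \sum_{n=1}^{+\infty} \frac{1}{n^2} \\
&= 2\,\zeta(2)
\end{align*}
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. So $\Phi(s, \chi)$ is bounded on the half plane $\mathrm{Re}(s) > 1$.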
(d) Suppose that $\chi$ is not the trivial character. Then $L(1, \chi) \neq 0$ by Theorem 2.16. Hence there exists $0 < \epsilon < 1$ such that $\frac{L'(s, \chi)}{L(s, \chi)}$ is analytic on $B(1, \epsilon) := \{s \in \mathbb{C} \mid |s - 1| < \epsilon\}$. By the Antiderivative Theorem, there exists a function $N(s, \chi)$ such that $N(s, \chi)$ is analytic on $B(1, \epsilon)$ and $N'(s, \chi) = \frac{L'(s, \chi)}{L(s, \chi)}$ for all $s \in B(1, \epsilon)$. Since $e^{M(s,\chi)} = L(s, \chi)$, we get that $M'(s, \chi) = \frac{L'(s, \chi)}{L(s, \chi)}$ for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. So $M'(s, \chi) = N'(s, \chi)$ for all $s \in B(1, \epsilon)$ with $\mathrm{Re}(s) > 1$. Hence there exists $c \in \mathbb{C}$ such that
\[ M(s, \chi) = N(s, \chi) + c \quad \text{for all } s \in B(1, \epsilon) \text{ with } \mathrm{Re}(s) > 1 \]
Since $N(s, \chi)$ is analytic on $B(1, \epsilon)$, we get that $M(s, \chi)$ is bounded on $(1, 1 + \frac{\epsilon}{2}]$. It follows from (a) that $M(s, \chi)$ is bounded on $[1 + \frac{\epsilon}{2}, +\infty)$. So $M(s, \chi)$ is bounded on $(1, +\infty)$.
□
Theorem 2.19 (Dirichlet) Let a, m ∈ N with gcd(a, m) = 1. Then there exist infinitely
many primes p with p ≡ a mod m.
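Proof : Let $p$ be a prime. If $p \mid m$ then
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} \chi(p) = 0 \]
So suppose that $\gcd(p, m) = 1$. By Proposition 2.9(a), we get that
\[ \overline{\chi(a)} \chi(p) = \overline{\chi(\overline{a})} \chi(\overline{p}) = (\chi(\overline{a}))^{-1} \chi(\overline{p}) = \chi(\overline{a}^{-1}) \chi(\overline{p}) = \chi(\overline{a}^{-1} \overline{p}) \]
for all Dirichlet characters $\chi$. Hence by Proposition 2.10(b), we have
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} \chi(p) = \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \chi(\overline{a}^{-1} \overline{p}) = \begin{cases} \phi(m) & \text{if } \overline{a}^{-1} \overline{p} = \overline{1} \\ 0 & \text{if } \overline{a}^{-1} \overline{p} \neq \overline{1} \end{cases} \]
Note that
\[ \overline{a}^{-1} \overline{p} = \overline{1} \Leftrightarrow \overline{p} = \overline{a} \Leftrightarrow p \equiv a \mod m \]
So we get
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} \chi(p) = \begin{cases} \phi(m) & \text{if } p \equiv a \mod m \\ 0 & \text{if } p \not\equiv a \mod m \end{cases} \tag{$*$} \]
Using Proposition 2.18(c), we find
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} M(s, \chi) = \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} \left( \Phi(s, \chi) + \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} \right) \]
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. By Proposition 2.18(c), $\Omega(s) := \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} \, \Phi(s, \chi)$ is bounded on the half plane $\mathrm{Re}(s) > 1$. Using (*) and absolute convergence, we get
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} \left( \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} \right) = \sum_{p \text{ prime}} \frac{\sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} \chi(p)}{p^s} = \phi(m) \sum_{\substack{p \text{ prime} \\ p \equiv a \bmod m}} \frac{1}{p^s} \]
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. Putting everything together, we get
\[ \sum_{\chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} M(s, \chi) = \Omega(s) + \phi(m) \sum_{\substack{p \text{ prime} \\ p \equiv a \bmod m}} \frac{1}{p^s} \tag{$**$} \]
for all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$. By Proposition 2.18(d),
\[ \sum_{\chi_0 \neq \chi \in \widehat{\mathbb{Z}_m^*}} \overline{\chi(a)} M(s, \chi) \text{ is bounded on } (1, +\infty) \]
By Theorem 2.17 and Proposition 2.18(a)(b), we have that
\[ \lim_{s \to 1^+} \overline{\chi_0(a)} \, M(s, \chi_0) = \lim_{s \to 1^+} M(s, \chi_0) = +\infty \]
Since $\Omega(s)$ is bounded on the half plane $\mathrm{Re}(s) > 1$, it follows from (**) that
\[ \phi(m) \sum_{\substack{p \text{ prime} \\ p \equiv a \bmod m}} \frac{1}{p^s} \]
is not bounded over $(1, +\infty)$. This means that we have an infinite series. So there exist infinitely many primes $p$ with $p \equiv a \bmod m$.
□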
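Dirichlet's Theorem can be observed experimentally by counting the primes up to some bound in each residue class a mod m with gcd(a, m) = 1. The Python sketch below is only an illustration (the function name `primes_by_residue` is chosen here); it proves nothing, but it shows every admissible class filling up.

```python
from collections import Counter
from math import gcd

def primes_by_residue(m, limit):
    """Count the primes p <= limit in each class a mod m with gcd(a, m) = 1."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    counts = Counter(p % m for p in range(limit + 1) if is_prime[p])
    return {a: counts[a] for a in range(m) if gcd(a, m) == 1}

print(primes_by_residue(4, 100))
print(primes_by_residue(10, 100_000))
```

Primes dividing m (such as 2 for m = 4) fall outside the admissible classes and are not reported.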
Chapter 3
Continued Fractions
3.1
Finite Continued Fractions
Definition 3.1 :
(1) Let $n \in \mathbb{N}$, $a_0 \in \mathbb{R}$ and $a_1, \ldots, a_n \in \mathbb{R}_0^+$. The finite continued fraction with partial denominators $a_1, \ldots, a_n$ (notation : $[a_0; a_1, \ldots, a_n]$) is the number
\[ a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cfrac{1}{\ddots \; a_{n-2} + \cfrac{1}{a_{n-1} + \cfrac{1}{a_n}}}}}} \]
(2) The continued fraction $[a_0; a_1, \ldots, a_n]$ is called simple if $a_0, a_1, \ldots, a_n$ are integers. We will use abbreviations like FSCF for “finite, simple continued fraction”, etc.
(3) For $x \in \mathbb{R}$, the integral part of $x$ (notation : $[x]$) is the biggest integer smaller than or equal to $x$. So $[\sqrt{2}] = 1$ and $[-3.2] = -4$.
Example : We get that
\[ [3; 2, 1, 2, 6] = 3 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{6}}}} = \frac{172}{51} \]
Clearly, every FSCF is a rational number. The next theorem shows that the converse is also true.
Theorem 3.2 Every rational number can be written as a FSCF.
Proof : Note that every rational number $q$ can be written as $\frac{a}{b}$ where $a \in \mathbb{Z}$ and $b \in \mathbb{N}_0$. We prove the theorem by induction on $b$. If $b = 1$ then $q = a = [a]$. So suppose that every rational number of the form $\frac{a}{b}$ with $a \in \mathbb{Z}$ and $b \in \mathbb{N}_0$ can be written as a FSCF if $b = 1, 2, \ldots, m - 1$ for some $m \ge 2$. Let $a \in \mathbb{Z}$. Put $q = \frac{a}{m}$. Using the Division Algorithm, we can write $a = mk + r$ with $k, r \in \mathbb{Z}$ and $0 \le r < m$. If $r = 0$ then $q = \frac{a}{m} = k = [k]$. So we may assume that $r \neq 0$. Then
\[ q = \frac{a}{m} = \frac{mk + r}{m} = k + \cfrac{1}{\frac{m}{r}} \]
By induction, $\frac{m}{r}$ can be written as a FSCF, say $\frac{m}{r} = [b_0; b_1, b_2, \ldots, b_n]$. So
\[ q = \frac{a}{m} = k + \cfrac{1}{\frac{m}{r}} = k + \cfrac{1}{b_0 + \cfrac{1}{b_1 + \cfrac{1}{b_2 + \cfrac{1}{\ddots \; b_{n-1} + \cfrac{1}{b_n}}}}} \]
Hence $q = [k; b_0, b_1, \ldots, b_n]$.
□
This proof provides us with an algorithm to represent a rational number $q$ as a FSCF : $a_0 = [q]$; $a_1 = \left[ \frac{1}{q - [q]} \right]$, etc.
Writing a Rational Number x as a FSCF
(1) Put $x_0 = x$ and $a_0 = [x_0]$.
(2) Suppose that we already have $x_0, x_1, \ldots, x_n$ and $a_0, a_1, \ldots, a_n$ for some $n \ge 0$.
(a) If $a_n = x_n$, the algorithm stops. We get that $x = [a_0; a_1, a_2, \ldots, a_n]$.
(b) If $a_n \neq x_n$, put $x_{n+1} = \frac{1}{x_n - a_n}$ and $a_{n+1} = [x_{n+1}]$.
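Example : Write $-\frac{31}{25}$ as a FSCF.
\begin{align*}
x_0 &= -\frac{31}{25} && \text{so } a_0 = -2 \\
x_1 &= \frac{1}{-\frac{31}{25} - (-2)} = \frac{25}{19} && \text{so } a_1 = 1 \\
x_2 &= \frac{1}{\frac{25}{19} - 1} = \frac{19}{6} && \text{so } a_2 = 3 \\
x_3 &= \frac{1}{\frac{19}{6} - 3} = 6 && \text{so } a_3 = 6 \text{ and we stop}
\end{align*}
Hence we get
\[ -\frac{31}{25} = [-2; 1, 3, 6] \]
We end this section by noting that $[a_0; a_1, a_2, \ldots, a_{n-1}, a_n, 1] = [a_0; a_1, a_2, \ldots, a_{n-1}, a_n + 1]$.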
So we can write a rational number in at least two ways as a FSCF. It turns out that these are
the only ways to represent a rational number as a FSCF.
3.2
Convergents of a Continued Fraction
The previous section might have left the impression that evaluating finite continued fractions
involves a lot of calculations. Moreover, if we extend our continued fraction, we have to start
all over again : in order to calculate [1; 2, 3, 4] we don’t use any of the calculations we needed
to evaluate [1; 2, 3]. In this section, we develop an iterative method to evaluate finite continued
fractions that resolves these problems.
Definition 3.3 : Let [a0 ; a1 , . . . , an ] be a FCF.
(1) For k = 0, 1, . . . , n, we define the k-th convergent of [a0 ; a1 , . . . , an ] (notation : Ck ) as the
number [a0 ; a1 , . . . , ak ].
(2) For $k = -2, -1, 0, 1, \ldots, n$, we define numbers $p_k$ and $q_k$ as follows :
\begin{align*}
p_{-2} &= 0 & p_{-1} &= 1 & p_k &= a_k p_{k-1} + p_{k-2} \\
q_{-2} &= 1 & q_{-1} &= 0 & q_k &= a_k q_{k-1} + q_{k-2}
\end{align*}
for $k = 0, 1, 2, \ldots, n$
Remark : It’s easy to see that $q_k > 0$ for all $k \ge 0$.
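Lemma 3.4 Let $[a_0; a_1, \ldots, a_n]$ be a FCF. Then
\[ [a_0; a_1, \ldots, a_k, x] = \frac{xp_k + p_{k-1}}{xq_k + q_{k-1}} \]
for $k = 0, 1, \ldots, n$ and all $x \in \mathbb{R}_0^+$.
Proof : The proof is by induction on $k$.
Let $k = 0$ and $x \in \mathbb{R}_0^+$. Then we get that
\[ [a_0; x] = a_0 + \frac{1}{x} = \frac{xa_0 + 1}{x} = \frac{xp_0 + p_{-1}}{xq_0 + q_{-1}} \]
So assume that $k \ge 1$. By induction, we have that
\[ [a_0; a_1, \ldots, a_{k-1}, y] = \frac{yp_{k-1} + p_{k-2}}{yq_{k-1} + q_{k-2}} \quad \text{for all } y \in \mathbb{R}_0^+ \]
Let $x \in \mathbb{R}_0^+$. Using the above formula with ‘$y = a_k + \frac{1}{x}$’, we find
\begin{align*}
[a_0; a_1, \ldots, a_{k-1}, a_k, x] &= \left[ a_0; a_1, \ldots, a_{k-1}, a_k + \frac{1}{x} \right] \\
&= \frac{\left( a_k + \frac{1}{x} \right) p_{k-1} + p_{k-2}}{\left( a_k + \frac{1}{x} \right) q_{k-1} + q_{k-2}} \\
&= \frac{x(a_k p_{k-1} + p_{k-2}) + p_{k-1}}{x(a_k q_{k-1} + q_{k-2}) + q_{k-1}} \\
&= \frac{xp_k + p_{k-1}}{xq_k + q_{k-1}}
\end{align*}
□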
The following theorem shows that the numbers p0 , p1 , . . . , pn and q0 , q1 , . . . , qn are an efficient
way of calculating finite continued fractions.
Theorem 3.5 Let $[a_0; a_1, \ldots, a_n]$ be a FCF. Then
\[ C_k = [a_0; a_1, \ldots, a_k] = \frac{p_k}{q_k} \quad \text{for } k = 0, 1, \ldots, n \]
Proof : Note that $C_0 = a_0 = \frac{p_0}{q_0}$. Pick $k \in \{1, 2, \ldots, n\}$. Using Lemma 3.4 with ‘$x = a_k$’, we get that
\[ C_k = [a_0; a_1, \ldots, a_{k-1}, a_k] = \frac{a_k p_{k-1} + p_{k-2}}{a_k q_{k-1} + q_{k-2}} = \frac{p_k}{q_k} \]
□
We describe an algorithm to calculate the numbers p0 , p1 , . . . , pn and q0 , q1 , . . . , qn . It is quite
similar to our implementation of the Euclidean Algorithm.
Let [a0 ; a1 , a2 , . . . , an ] be a FCF. We start by writing down the following ‘matrix’ :
 a_k  |  p_k   q_k
      |   0     1
      |   1     0
 a0   |
 a1   |
 ...  |
 an−1 |
 an   |
We calculate the rows R0 , R1 , . . ., Rn−1 and Rn using the formula Rk = ak Rk−1 + Rk−2 for
k = 0, 1, . . . , n.
Then the first column are the numbers p0 , p1 , . . . , pn while the second column are the numbers
q0 , q1 , . . . , qn .
Example : Calculate [1; 2], [1; 2, 3], [1; 2, 3, 4] and [1; 2, 3, 4, 5]
We easily get the following table :
 a_k  |  p_k   q_k
      |    0     1
      |    1     0
  1   |    1     1
  2   |    3     2
  3   |   10     7
  4   |   43    30
  5   |  225   157
Hence $[1; 2] = \frac{3}{2}$, $[1; 2, 3] = \frac{10}{7}$, $[1; 2, 3, 4] = \frac{43}{30}$ and $[1; 2, 3, 4, 5] = \frac{225}{157}$.
Notice in our example that gcd(pk , qk ) = 1 for k = 0, 1, 2, 3, 4. This is always the case.
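The row recurrence R_k = a_k R_{k−1} + R_{k−2} is straightforward to program. The Python sketch below (the function name `convergents` is chosen here for illustration) returns the pairs (p_k, q_k) for a list [a_0, a_1, ..., a_n], starting from the two initial rows (0, 1) and (1, 0), and also checks the gcd observation on the example.

```python
from math import gcd

def convergents(terms):
    """Return [(p_0, q_0), ..., (p_n, q_n)] for the continued fraction [a0; a1, ..., an]."""
    p_prev2, p_prev1 = 0, 1   # p_{-2}, p_{-1}
    q_prev2, q_prev1 = 1, 0   # q_{-2}, q_{-1}
    rows = []
    for a in terms:
        p = a * p_prev1 + p_prev2
        q = a * q_prev1 + q_prev2
        rows.append((p, q))
        p_prev2, p_prev1 = p_prev1, p
        q_prev2, q_prev1 = q_prev1, q
    return rows

rows = convergents([1, 2, 3, 4, 5])
print(rows)
print([gcd(p, q) for p, q in rows])
```

Each new term costs only two multiplications and two additions, so extending [1; 2, 3] to [1; 2, 3, 4] reuses all earlier work.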
Lemma 3.6 Let [a0 ; a1 , . . . , an ] be a FCF. Then
\[ p_k q_{k-1} - q_k p_{k-1} = (-1)^{k-1} \quad \text{for } k = -1, 0, 1, \ldots, n \]
Proof : The proof is by induction on $k$.
For $k = -1$, we get that $p_{-1} q_{-2} - q_{-1} p_{-2} = 1 \cdot 1 - 0 \cdot 0 = 1 = (-1)^{-1-1}$.
So we may assume that $k \ge 0$. Then we have that
\[ p_k q_{k-1} - q_k p_{k-1} = (a_k p_{k-1} + p_{k-2}) q_{k-1} - (a_k q_{k-1} + q_{k-2}) p_{k-1} = -(p_{k-1} q_{k-2} - q_{k-1} p_{k-2}) \]
By induction, we get that $p_{k-1} q_{k-2} - q_{k-1} p_{k-2} = (-1)^{(k-1)-1}$. Hence
\[ p_k q_{k-1} - q_k p_{k-1} = -(-1)^{k-2} = (-1)^{k-1} \]
□
Corollary 3.7 Let [a0 ; a1 , . . . , an ] be a FSCF. Then gcd(pk , qk ) = 1 for k = 0, 1, . . . , n.
Proof : Pick $k \in \{0, 1, \ldots, n\}$. Put $d = \gcd(p_k, q_k)$. By Lemma 3.6, we have that $p_k q_{k-1} - q_k p_{k-1} = (-1)^{k-1}$. Since $d \mid p_k$ and $d \mid q_k$, we get that $d \mid (-1)^{k-1}$ and so $d = 1$.
□
The convergents of a finite continued fraction show an alternating pattern.
Theorem 3.8 Let [a0 ; a1 , . . . , an ] be a FCF. Then the following holds :
(a) The convergents with even subscripts form a strictly increasing sequence :
C0 < C2 < C4 < · · ·
(b) The convergents with odd subscripts form a strictly decreasing sequence :
C1 > C3 > C5 > · · ·
(c) Any convergent with an odd subscript is greater than any convergent with an even subscript :
C0 < C2 < C4 < · · · < C5 < C3 < C1
Proof : (a)(b) Pick $k \in \{0, 1, 2, \ldots, n - 2\}$. Using Theorem 3.5 and Lemma 3.6, we easily get that
\begin{align*}
C_{k+2} - C_k &= \frac{p_{k+2}}{q_{k+2}} - \frac{p_k}{q_k} \\
&= \frac{q_k p_{k+2} - p_k q_{k+2}}{q_k q_{k+2}} \\
&= \frac{q_k(a_{k+2} p_{k+1} + p_k) - p_k(a_{k+2} q_{k+1} + q_k)}{q_k q_{k+2}} \\
&= \frac{a_{k+2}(p_{k+1} q_k - q_{k+1} p_k)}{q_k q_{k+2}} \\
&= \frac{a_{k+2}(-1)^k}{q_k q_{k+2}}
\end{align*}
If $k$ is even, we see that $C_{k+2} - C_k > 0$ and so $C_{k+2} > C_k$. Hence we get that $C_0 < C_2 < C_4 < \cdots$.
If $k$ is odd, then $C_{k+2} - C_k < 0$ and so $C_{k+2} < C_k$. Hence we get that $C_1 > C_3 > C_5 > \cdots$.
(c) Using Theorem 3.5 and Lemma 3.6, we get that
\[ C_n - C_{n-1} = \frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{p_n q_{n-1} - q_n p_{n-1}}{q_{n-1} q_n} = \frac{(-1)^{n-1}}{q_{n-1} q_n} \]
Let $r$ and $s$ be integers such that $0 \le 2r, 2s - 1 \le n$. Suppose first that $n$ is even, say $n = 2m$. Then $C_{2m} - C_{2m-1} = \frac{(-1)^{2m-1}}{q_{2m-1} q_{2m}} < 0$ and so $C_{2m} < C_{2m-1}$. By (a) and (b), we get that
\[ C_{2r} \le C_{2m} < C_{2m-1} \le C_{2s-1} \]
Suppose next that $n$ is odd, say $n = 2m - 1$. Similarly, we get that $C_{2m-2} < C_{2m-1}$ and so $C_{2r} \le C_{2m-2} < C_{2m-1} \le C_{2s-1}$.
□
3.3
Infinite Continued Fractions
Suppose that $a_0 \in \mathbb{R}$ and $a_k \in \mathbb{R}_0^+$ for all k ≥ 1. We still put $C_k = [a_0; a_1, \ldots, a_k]$ for all k ≥ 0.
As on page 38, we define pk and qk for all k ≥ −2. Then all the theorems we’ve proven about
Ck , pk and qk are still valid.
Lemma 3.9 Let a0 ∈ R and ak ∈ R such that ak ≥ 1 for all k ≥ 1. Then qk ≥ k for all k ≥ 0.
Proof : The proof is by induction on k. Note that q0 = 1 > 0, q1 = a1 ≥ 1 and q2 = a2 q1 +q0 ≥
1 + 1 = 2. For k ≥ 3, we get by induction that
qk = ak qk−1 + qk−2 ≥ qk−1 + qk−2 ≥ (k − 1) + (k − 2) = 2k − 3 ≥ k
2
Theorem 3.10 Let $a_0 \in \mathbb{R}$ and $a_k \in \mathbb{R}$ such that $a_k \ge 1$ for all k ≥ 1. Then $\lim_{k \to \infty} C_k$ exists (so the sequence $(C_k)_{k \ge 0}$ converges) and
\[ C_0 < C_2 < C_4 < \cdots < \lim_{k \to \infty} C_k < \cdots < C_5 < C_3 < C_1 \]
Proof : By Theorem 3.8(a), we have that C0 < C2 < C4 < · · · . So (C2k )k≥0 is a strictly
increasing sequence. By Theorem 3.8(c), this sequence is bounded (by C1 ). Hence this sequence
converges. Call the limit α.
Similarly, we get that the sequence (C2k+1 )k≥0 converges. Call the limit β. Note that C2k < α
and β < C2k+1 for all k ≥ 0.
For all k, l ≥ 0, we have that C2k < C2l+1 by Theorem 3.8(c) and so C2k ≤ β. Since this is true
for all k ≥ 0, we get that α ≤ β.
Pick k ≥ 1. We have that $\beta < C_{2k+1}$ and $C_{2k} < \alpha$. Similarly as in the proof of Theorem 3.8(c) and by Lemma 3.9, we get that
\[ 0 \le \beta - \alpha < C_{2k+1} - C_{2k} = \frac{(-1)^{2k}}{q_{2k+1} q_{2k}} = \frac{1}{q_{2k+1} q_{2k}} \le \frac{1}{2k(2k+1)} \]
Since this is true for all k ≥ 1, letting k → ∞ gives 0 ≤ β − α ≤ 0. So β = α. Hence the sequence $(C_k)_{k \ge 0}$ converges (to α = β).
□
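The squeezing argument in this proof can be watched numerically. The sketch below is only an illustration: it takes the particular choice $a_k = 1$ for all k (an assumption made for the example, not something fixed by the theorem), computes the first twenty convergents exactly, and checks the three facts used in the proof. The helper name `convergent_values` is hypothetical.

```python
from fractions import Fraction

def convergent_values(partial_quotients):
    """Return C_0, C_1, ..., C_n as exact fractions."""
    p_prev, p, q_prev, q = 0, 1, 1, 0          # p_{-2}, p_{-1}, q_{-2}, q_{-1}
    values = []
    for a in partial_quotients:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        values.append(Fraction(p, q))
    return values

C = convergent_values([1] * 20)
even, odd = C[0::2], C[1::2]

# Even convergents increase, odd convergents decrease, and every even
# convergent lies below every odd one (Theorem 3.8).
assert all(x < y for x, y in zip(even, even[1:]))
assert all(x > y for x, y in zip(odd, odd[1:]))
assert max(even) < min(odd)

# The gap C_{2k+1} - C_{2k} is at most 1/(2k(2k+1)) for k >= 1.
for k in range(1, 10):
    assert C[2 * k + 1] - C[2 * k] <= Fraction(1, 2 * k * (2 * k + 1))

print(float(even[-1]), float(odd[-1]))
```

The two printed values bracket the common limit α = β from below and from above.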
Definition 3.11 (1) Let $a_0 \in \mathbb{R}$ and $a_k \in \mathbb{R}$ such that $a_k \ge 1$ for all k ≥ 1. The infinite continued fraction with partial denominators $a_1, a_2, \ldots$ (abbreviation : ICF; notation : $[a_0; a_1, a_2, \ldots]$) is the number $\lim_{k \to \infty} [a_0; a_1, \ldots, a_k]$. Note that this limit exists by Theorem 3.10.
(2) If $a_0 \in \mathbb{Z}$ and $a_k \in \mathbb{N}_0$ for all k ≥ 1, then the ICF $[a_0; a_1, a_2, \ldots]$ is called an infinite, simple continued fraction (abbreviation : ISCF).
We’ve seen that an FSCF represents a rational number and that every rational number can be
written as an FSCF.
We will now prove that an ISCF represents an irrational number and that every irrational
number can be written uniquely as an ISCF.
Theorem 3.12 Every ISCF is an irrational number.
Proof : Let x be an ISCF, say x = [a0 ; a1 , a2 , . . .]. Pick k ≥ 0. Using Theorem 3.8, we get that
$C_k < x < C_{k+1}$ if k is even and $C_{k+1} < x < C_k$ if k is odd. Using Theorem 3.5 and Lemma 3.6, we get that
\[ 0 < |x - C_k| < |C_{k+1} - C_k| = \left| \frac{p_{k+1}}{q_{k+1}} - \frac{p_k}{q_k} \right| = \frac{|p_{k+1} q_k - q_{k+1} p_k|}{q_k q_{k+1}} = \frac{|(-1)^k|}{q_k q_{k+1}} = \frac{1}{q_k q_{k+1}} \]
Suppose that x is rational. Then there exist a ∈ Z and b ∈ N0 such that $x = \frac{a}{b}$. Hence
\[ 0 < \left| \frac{a}{b} - \frac{p_k}{q_k} \right| < \frac{1}{q_k q_{k+1}} \quad \text{or} \quad 0 < |a q_k - b p_k| < \frac{b}{q_{k+1}} \]
Note that this formula is true for all k ≥ 0. By Lemma 3.9, we can pick k ≥ 0 such that $q_{k+1} > b$. For this k, we get that
\[ 0 < |a q_k - b p_k| < \frac{b}{q_{k+1}} < 1 \]
a contradiction since $a q_k - b p_k$ is an integer.
Hence x is irrational.
□
Before we prove that every irrational number can be written uniquely as an ISCF, we remark
the following :
Let [a0 ; a1 , a2 , . . .] be an ISCF. Then
\begin{align*}
[a_0; a_1, a_2, \ldots] &= \lim_{k \to \infty} [a_0; a_1, a_2, \ldots, a_k] \\
&= \lim_{k \to \infty} \left( a_0 + \frac{1}{[a_1; a_2, a_3, \ldots, a_k]} \right) \\
&= a_0 + \frac{1}{\lim_{k \to \infty} [a_1; a_2, a_3, \ldots, a_k]} \\
&= a_0 + \frac{1}{[a_1; a_2, a_3, \ldots]} \\
&= [a_0; [a_1; a_2, \ldots]]
\end{align*}
Similarly, we get that
\[ [a_0; a_1, a_2, \ldots] = [a_0; a_1, \ldots, a_k, [a_{k+1}; a_{k+2}, a_{k+3}, \ldots]] \quad \text{for all } k \ge 1 \]
Example : Calculate [3; 1, 2, 1, 2, 1, 2, . . .].
We have that $[3; 1, 2, 1, 2, 1, 2, \ldots] = 3 + \frac{1}{[1; 2, 1, 2, 1, 2, \ldots]}$. Put $y = [1; 2, 1, 2, 1, 2, \ldots]$. Then we have that
\[ y = [1; 2, 1, 2, 1, 2, \ldots] = 1 + \cfrac{1}{2 + \cfrac{1}{[1; 2, 1, 2, 1, 2, \ldots]}} = 1 + \cfrac{1}{2 + \cfrac{1}{y}} = \frac{3y + 1}{2y + 1} \]
So y is a solution of the equation
\[ y = \frac{3y + 1}{2y + 1} \quad \text{or} \quad 2y^2 - 2y - 1 = 0 \]
We get that $y = \frac{1 \pm \sqrt{3}}{2}$. Since y > 0, we have that $y = \frac{1 + \sqrt{3}}{2}$. Hence we get that
\[ [3; 1, 2, 1, 2, 1, 2, \ldots] = 3 + \frac{1}{y} = 3 + \cfrac{1}{\frac{1 + \sqrt{3}}{2}} = \frac{5 + 3\sqrt{3}}{1 + \sqrt{3}} = 2 + \sqrt{3} \]
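As a sanity check on this example, one can truncate the continued fraction and compare with $2 + \sqrt{3}$. The snippet below is a small sketch; the helper name `evaluate` and the choice of 31 partial quotients are arbitrary assumptions made for illustration.

```python
import math
from fractions import Fraction

def evaluate(partial_quotients):
    """Evaluate the finite continued fraction [a_0; a_1, ..., a_n] exactly."""
    value = Fraction(partial_quotients[-1])
    for a in reversed(partial_quotients[:-1]):
        value = a + 1 / value      # peel the fraction from the inside out
    return value

# Truncation of [3; 1, 2, 1, 2, ...] after 31 partial quotients.
approx = evaluate([3] + [1, 2] * 15)
print(float(approx), 2 + math.sqrt(3))
```

The two printed numbers agree to the precision of a floating-point number, as the exact calculation above predicts.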
Next, we prove that an irrational number can be written as at most one ISCF. We need a little
lemma that will also show us how to write an irrational number as an ISCF.
Lemma 3.13 Let x = [a0 ; a1 , a2 , . . .] be an ISCF. Then a0 = [x].
Proof : By Theorem 3.10, we get that $C_0 < x < C_1$ and so
\[ a_0 < x < \frac{p_1}{q_1} = \frac{a_0 a_1 + 1}{a_1} = a_0 + \frac{1}{a_1} \le a_0 + 1 \]
since $a_1 \ge 1$. Hence $a_0 = [x]$.
□
Corollary 3.14 An irrational number can be written as at most one ISCF.
Proof : Suppose that there are two ISCF’s that are the same irrational number x, say
x = [a0 ; a1 , a2 , . . .] = [b0 ; b1 , b2 , . . .]
By Lemma 3.13, we get that a0 = [x] = b0 . But
\[ a_0 + \frac{1}{[a_1; a_2, a_3, \ldots]} = [a_0; a_1, a_2, \ldots] = x = [b_0; b_1, b_2, \ldots] = b_0 + \frac{1}{[b_1; b_2, b_3, \ldots]} \]
Since $a_0 = b_0$, we get that $[a_1; a_2, a_3, \ldots] = [b_1; b_2, b_3, \ldots]$.
Repeating this process, we get that $a_k = b_k$ for all k ≥ 0.
□
Let x be an irrational number. If we can write x as an ISCF, then this ISCF is unique. Note
that Lemma 3.13 also gives us the only possible candidate : a0 = [x]. Hence we define an ISCF
(related to x) and prove that it actually equals x.
Definition 3.15 : Let x be an irrational number. We define real numbers $x_k$ and integers $a_k$ for k ≥ 0 as follows :
(*) $x_0 = x$ and $a_0 = [x_0]$
(*) Suppose we already have $x_0, \ldots, x_k$ and $a_0, \ldots, a_k$ for some k ≥ 0. Then $x_{k+1} = \frac{1}{x_k - a_k}$ and $a_{k+1} = [x_{k+1}]$.
We get that xk is an irrational number and ak is an integer for all k ≥ 0; moreover xk > 1 and
ak ≥ 1 for all k ≥ 1.
Theorem 3.16 Let x be an irrational number. For all k ≥ 0, define xk and ak as above. Then
the following holds :
(a) x = [a0 ; a1 , a2 , . . .].
(b) $\left| x - \frac{p_k}{q_k} \right| < \frac{1}{q_k q_{k+1}}$ for all k ≥ 0.
Proof : We first prove the following claim :
\[ x = [a_0; a_1, \ldots, a_k, x_{k+1}] \quad \text{for all } k \ge 0. \]
Note that we can rewrite $x_{k+1} = \frac{1}{x_k - a_k}$ as $x_k = a_k + \frac{1}{x_{k+1}}$. The proof of the claim is by induction on k. For k = 0, we get that $x = x_0 = a_0 + \frac{1}{x_1} = [a_0; x_1]$. So assume that k ≥ 1. Then we get that
\[ x = [a_0; a_1, \ldots, a_{k-1}, x_k] = \left[ a_0; a_1, \ldots, a_{k-1}, a_k + \frac{1}{x_{k+1}} \right] = [a_0; a_1, \ldots, a_{k-1}, a_k, x_{k+1}] \]
which proves the claim.
Pick k ≥ 1. By Lemma 3.4 with $x = x_{k+1}$, we get that
\[ x = [a_0; a_1, \ldots, a_k, x_{k+1}] = \frac{x_{k+1} p_k + p_{k-1}}{x_{k+1} q_k + q_{k-1}} \]
Hence we have
\[ x - C_k = \frac{x_{k+1} p_k + p_{k-1}}{x_{k+1} q_k + q_{k-1}} - \frac{p_k}{q_k} = \frac{-(p_k q_{k-1} - p_{k-1} q_k)}{q_k (x_{k+1} q_k + q_{k-1})} = \frac{(-1)^k}{q_k (x_{k+1} q_k + q_{k-1})} \]
by Lemma 3.6. Since $x_{k+1} > a_{k+1}$, we get by Lemma 3.9 that
\[ |x - C_k| = \frac{1}{q_k (x_{k+1} q_k + q_{k-1})} < \frac{1}{q_k (a_{k+1} q_k + q_{k-1})} = \frac{1}{q_k q_{k+1}} \le \frac{1}{k(k+1)} \]
Hence $\lim_{k \to \infty} |x - C_k| = 0$. So $x = \lim_{k \to \infty} C_k = \lim_{k \to \infty} [a_0; a_1, a_2, \ldots, a_k] = [a_0; a_1, a_2, \ldots]$.
□
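Example : Write $\sqrt{6}$ as an ISCF.
We calculate $x_k$ and $a_k$ for k ≥ 0.
\begin{align*}
x_0 &= \sqrt{6} \approx 2.4 && \text{and so } a_0 = 2 \\
x_1 &= \frac{1}{\sqrt{6} - 2} = \frac{\sqrt{6} + 2}{2} \approx 2.2 && \text{and so } a_1 = 2 \\
x_2 &= \cfrac{1}{\frac{\sqrt{6} + 2}{2} - 2} = \frac{2}{\sqrt{6} - 2} = \sqrt{6} + 2 \approx 4.4 && \text{and so } a_2 = 4 \\
x_3 &= \frac{1}{(\sqrt{6} + 2) - 4} = \frac{1}{\sqrt{6} - 2} = \frac{\sqrt{6} + 2}{2} = x_1
\end{align*}
Since $x_3 = x_1$, we get that $2 = a_1 = a_3 = a_5 = a_7 = \cdots$ and $4 = a_2 = a_4 = a_6 = a_8 = \cdots$
Hence $\sqrt{6} = [2; 2, 4, 2, 4, 2, 4, \ldots]$.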
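The procedure of Definition 3.15 translates directly into a loop. The following is a rough sketch using floating-point numbers, so it is only trustworthy for the first few partial quotients (rounding errors are amplified at every step); the function name `continued_fraction_terms` is an illustrative choice.

```python
import math

def continued_fraction_terms(x, count):
    """First `count` partial quotients a_0, a_1, ... of x (floating point)."""
    terms = []
    for _ in range(count):
        a = math.floor(x)          # a_k = [x_k]
        terms.append(a)
        x = 1 / (x - a)            # x_{k+1} = 1 / (x_k - a_k)
    return terms

print(continued_fraction_terms(math.sqrt(6), 7))
```

For $\sqrt{6}$ the first seven terms come out as 2, 2, 4, 2, 4, 2, 4, matching the hand calculation. An exact method that avoids floating point is possible for quadratic irrationals and relies on the integers $P_k$ and $Q_k$ introduced in the proof of Theorem 3.29.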
3.4
Rational Approximations of Irrational Numbers
In this section, we prove that the convergents of an ISCF are very good rational approximations
for the ISCF. Moreover, any ‘good’ rational approximation of an irrational number x is a
convergent of the ISCF for x.
Lemma 3.17 Let x be the ISCF [a0 ; a1 , a2 , . . .], k ≥ 0 and a, b ∈ Z such that 1 ≤ b < qk+1 .
Then |qk x − pk | ≤ |bx − a| and we have equality only if a = pk and b = qk .
Proof : Consider the following system of linear equations in α and β :
pk α + pk+1 β = a
qk α + qk+1 β = b
Using Lemma 3.6 and Cramer’s Rule, we find that
\[ \alpha = (-1)^{k+1} (a q_{k+1} - b p_{k+1}) \quad \text{and} \quad \beta = (-1)^{k+1} (b p_k - a q_k) \]
Note that α ≠ 0. Indeed, if α = 0, then $a q_{k+1} = b p_{k+1}$ and so $q_{k+1} \mid b p_{k+1}$; but $\gcd(p_{k+1}, q_{k+1}) = 1$ by Corollary 3.7; so $q_{k+1} \mid b$ and $b \ge q_{k+1}$, a contradiction.
Assume that β = 0. Similarly as above, we get that $q_k \mid b$ and so $b = m q_k$ for some m ∈ N0. Then $a = m p_k$ and so $|bx - a| = m |q_k x - p_k| \ge |q_k x - p_k|$. Note that we have equality if and only if m = 1 (and so $a = p_k$ and $b = q_k$).
Hence we may assume that β ≠ 0. If β < 0 then $q_k \alpha = b - q_{k+1} \beta > 0$ and so α > 0; if β > 0 then $b < q_{k+1} \le \beta q_{k+1}$, hence $q_k \alpha = b - \beta q_{k+1} < 0$ and so α < 0. Hence α and β have opposite signs. By Theorem 3.8, we get that $C_k < x < C_{k+1}$ if k is even and $C_{k+1} < x < C_k$ if k is odd. Hence $x - C_k$ and $x - C_{k+1}$ have opposite signs. So $q_k x - p_k$ and $q_{k+1} x - p_{k+1}$ have opposite signs. We conclude that $\alpha (q_k x - p_k)$ and $\beta (q_{k+1} x - p_{k+1})$ both have the same sign. Hence we get that
\begin{align*}
|bx - a| &= |(q_k \alpha + q_{k+1} \beta) x - (p_k \alpha + p_{k+1} \beta)| \\
&= |\alpha (q_k x - p_k) + \beta (q_{k+1} x - p_{k+1})| \\
&= |\alpha| |q_k x - p_k| + |\beta| |q_{k+1} x - p_{k+1}| \\
&> |\alpha| |q_k x - p_k| \\
&\ge |q_k x - p_k|
\end{align*}
□
We can now prove our first result related to convergents of an ISCF and rational approximations.
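Theorem 3.18 Let x be the ISCF $[a_0; a_1, a_2, \ldots]$, k ≥ 0 and a, b ∈ Z such that $1 \le b \le q_k$. Then $\left| x - \frac{p_k}{q_k} \right| \le \left| x - \frac{a}{b} \right|$ and we have equality only if $a = p_k$ and $b = q_k$.
Proof : Suppose that $\left| x - \frac{p_k}{q_k} \right| \ge \left| x - \frac{a}{b} \right|$. Then we get that
\[ |q_k x - p_k| = q_k \left| x - \frac{p_k}{q_k} \right| \ge q_k \left| x - \frac{a}{b} \right| \ge b \left| x - \frac{a}{b} \right| = |bx - a| \]
By Lemma 3.17, this is only possible if $a = p_k$ and $b = q_k$.
□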
Note that the converse of Theorem 3.18 is false : $\frac{179}{57}$ is the best rational approximation of π among all the fractions with denominator less than or equal to 57. Yet $\frac{179}{57}$ is not a convergent of π.
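This counterexample can be confirmed by brute force. The sketch below assumes the well-known expansion π = [3; 7, 15, 1, . . .], whose first convergents are 3, 22/7, 333/106 and 355/113; the function name `best_approximation` is illustrative and the search uses floating-point arithmetic.

```python
import math
from fractions import Fraction

def best_approximation(x, max_denominator):
    """Closest fraction a/b to x among all denominators 1 <= b <= max_denominator."""
    best = None
    for b in range(1, max_denominator + 1):
        a = round(x * b)                       # nearest numerator for this b
        if best is None or abs(x - a / b) < abs(x - float(best)):
            best = Fraction(a, b)
    return best

best = best_approximation(math.pi, 57)
pi_convergents = [Fraction(3), Fraction(22, 7), Fraction(333, 106), Fraction(355, 113)]
print(best, best in pi_convergents)
```

The search returns 179/57, which does not appear in the list of convergents.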
The following result is an easy consequence of Theorem 3.16.
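Theorem 3.19 Let x be an irrational number with convergents $\frac{p_k}{q_k}$. Then $\left| x - \frac{p_k}{q_k} \right| < \frac{1}{q_k^2}$ for all k ≥ 0.
Proof : Recall that the sequence $\langle q_k \rangle_{k \ge 0}$ is increasing. Hence by Theorem 3.16, we get that
\[ \left| x - \frac{p_k}{q_k} \right| < \frac{1}{q_k q_{k+1}} \le \frac{1}{q_k^2} \]
□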
Convergents of an irrational number are not the only fractions with this property :
\[ \left| \pi - \frac{19}{6} \right| < \frac{1}{6^2} \]
yet $\frac{19}{6}$ is not a convergent of π. However, very 'good' rational approximations of an irrational number x have to be convergents of x.
Theorem 3.20 Let x be an irrational number, a ∈ Z and b ∈ N0 such that $\left| x - \frac{a}{b} \right| \le \frac{1}{2b^2}$. Then $\frac{a}{b}$ is a convergent of the ISCF for x.
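Proof : For k ≥ 0, let $p_k$ and $q_k$ be the integers, defined using the ISCF for x.
Note that $(q_k)_{k \ge 1}$ is a strictly increasing sequence of positive integers ($q_{k+1} = a_{k+1} q_k + q_{k-1} > q_k$ for all k ≥ 1) and $q_0 = 1$. Hence there exists a unique k ≥ 0 such that $q_k \le b < q_{k+1}$. By Lemma 3.17, we get that
\[ |q_k x - p_k| \le |bx - a| = b \left| x - \frac{a}{b} \right| \le \frac{1}{2b} \]
Hence
\[ \left| x - \frac{p_k}{q_k} \right| = \frac{1}{q_k} |q_k x - p_k| \le \frac{1}{2 b q_k} \]
Since x is irrational, we actually have a strict inequality.
Suppose that $b p_k \ne a q_k$. Then $|b p_k - a q_k| \ge 1$ and so
\[ \frac{1}{b q_k} \le \frac{|b p_k - a q_k|}{b q_k} = \left| \frac{p_k}{q_k} - \frac{a}{b} \right| \le \left| \frac{p_k}{q_k} - x \right| + \left| x - \frac{a}{b} \right| < \frac{1}{2 b q_k} + \frac{1}{2 b^2} \]
Hence we get that
\[ \frac{1}{2 b q_k} < \frac{1}{2 b^2} \quad \text{and so} \quad b < q_k \]
a contradiction since $b \ge q_k$.
Hence $b p_k = a q_k$. So $\frac{a}{b} = \frac{p_k}{q_k} = C_k$.
□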
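Theorem 3.20 can be tested experimentally. The sketch below takes $x = \sqrt{2}$, assuming the standard expansion $\sqrt{2} = [1; 2, 2, 2, \ldots]$, collects every fraction a/b with b ≤ 100 that satisfies the hypothesis of the theorem, and compares the result with the set of convergents. It uses floating-point comparisons, which is adequate for denominators this small.

```python
import math
from fractions import Fraction

x = math.sqrt(2)

# All fractions a/b with b <= 100 and |x - a/b| <= 1/(2 b^2).
# Only the two integers next to x*b can possibly qualify as numerators.
good = set()
for b in range(1, 101):
    for a in (math.floor(x * b), math.ceil(x * b)):
        if abs(x - a / b) <= 1 / (2 * b * b):
            good.add(Fraction(a, b))

# Convergents of [1; 2, 2, 2, ...] from the usual recurrence.
p_prev, p, q_prev, q = 0, 1, 1, 0
conv = set()
for a in [1] + [2] * 10:
    p_prev, p = p, a * p + p_prev
    q_prev, q = q, a * q + q_prev
    conv.add(Fraction(p, q))

print(sorted(good))
assert good <= conv      # every good approximation is a convergent
```

In this particular example every convergent with denominator at most 100 also satisfies the inequality, but Theorem 3.20 only asserts the inclusion checked in the last line.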
One can prove (without using the theory of continued fractions) that given any irrational number x, there are infinitely many rational numbers $\frac{a}{b}$ (with a ∈ Z and b ∈ N0) such that $\left| x - \frac{a}{b} \right| < \frac{1}{\sqrt{5}\, b^2}$. Note that all these rational numbers must be convergents of x by Theorem 3.20.
We prove that infinitely many convergents of the ISCF for x have this property.
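Theorem 3.21 Let x be an irrational number and n ∈ N. Then $\left| x - \frac{p_k}{q_k} \right| < \frac{1}{2 q_k^2}$ for some k ∈ {n, n + 1}.
Proof : Suppose that $\left| x - \frac{p_k}{q_k} \right| \ge \frac{1}{2 q_k^2}$ for k = n, n + 1. Since x is irrational, both inequalities are in fact strict. By Theorem 3.8(c), $\frac{p_n}{q_n} < x < \frac{p_{n+1}}{q_{n+1}}$ if n is even and $\frac{p_{n+1}}{q_{n+1}} < x < \frac{p_n}{q_n}$ if n is odd. Hence
\[ \left| \frac{p_n}{q_n} - \frac{p_{n+1}}{q_{n+1}} \right| = \left| \frac{p_n}{q_n} - x \right| + \left| x - \frac{p_{n+1}}{q_{n+1}} \right| > \frac{1}{2 q_n^2} + \frac{1}{2 q_{n+1}^2} \]
By Lemma 3.6, we have that
\[ \left| \frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n} \right| = \frac{|p_{n+1} q_n - p_n q_{n+1}|}{q_n q_{n+1}} = \frac{1}{q_n q_{n+1}} \]
Putting everything together, we find
\[ \frac{1}{q_n q_{n+1}} > \frac{1}{2 q_n^2} + \frac{1}{2 q_{n+1}^2} \]
Hence $2 q_n q_{n+1} > q_{n+1}^2 + q_n^2$. So $0 > (q_{n+1} - q_n)^2$, a contradiction.
□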
We can do a little bit better.
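Theorem 3.22 Let x be an irrational number and n ∈ N0. Then $\left| x - \frac{p_k}{q_k} \right| < \frac{1}{\sqrt{5}\, q_k^2}$ for some k ∈ {n − 1, n, n + 1}.
Proof : Suppose that $\left| x - \frac{p_k}{q_k} \right| \ge \frac{1}{\sqrt{5}\, q_k^2}$ for k = n − 1, n, n + 1.
Pick k ∈ {n − 1, n}. As in the proof of Theorem 3.21, we get that
\[ \frac{1}{q_k q_{k+1}} = \left| \frac{p_k}{q_k} - x \right| + \left| x - \frac{p_{k+1}}{q_{k+1}} \right| \ge \frac{1}{\sqrt{5}\, q_k^2} + \frac{1}{\sqrt{5}\, q_{k+1}^2} \]
Putting $b_k = \frac{q_{k+1}}{q_k}$, we find
\[ \sqrt{5} > q_k q_{k+1} \left( \frac{1}{q_k^2} + \frac{1}{q_{k+1}^2} \right) = \frac{q_{k+1}}{q_k} + \frac{q_k}{q_{k+1}} = b_k + \frac{1}{b_k} \]
(the inequality is strict since the right-hand side is rational while $\sqrt{5}$ is not). So
\[ b_k^2 - \sqrt{5}\, b_k + 1 < 0 \]
Solving this inequality, we get
\[ \frac{\sqrt{5} - 1}{2} < b_k < \frac{\sqrt{5} + 1}{2} \quad \text{for } k = n - 1, n \qquad (*) \]
Hence
\[ \frac{\sqrt{5} + 1}{2} > b_n = \frac{q_{n+1}}{q_n} = \frac{a_{n+1} q_n + q_{n-1}}{q_n} \ge \frac{q_n + q_{n-1}}{q_n} = 1 + \frac{1}{b_{n-1}} \]
So
\[ b_{n-1} > \cfrac{1}{\frac{\sqrt{5} + 1}{2} - 1} = \frac{2}{\sqrt{5} - 1} = \frac{\sqrt{5} + 1}{2} \]
a contradiction to (*).
□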
This is as good as it gets in the following sense :
There exist irrational numbers α such that for all $c > \sqrt{5}$, there are only finitely many rational numbers $\frac{a}{b}$ with $\left| \alpha - \frac{a}{b} \right| \le \frac{1}{c\, b^2}$.
3.5
Periodic Continued Fractions
On page 46, we represented $\sqrt{6}$ as an ISCF and noted that this ISCF has a repeating part.
And on page 44, we calculated [3; 1, 2, 1, 2, 1, 2, . . .] and noted that square roots were involved.
In this section, we will describe all the irrational numbers whose ISCF has a repeating part :
they all contain (somehow) a square root.
3.5.1
Quadratic Irrationals
Definition 3.23 Let d ∈ N be a number that is not a perfect square.
(a) We put $\mathbb{Q}(\sqrt{d}) = \{ q_1 + q_2 \sqrt{d} \mid q_1, q_2 \in \mathbb{Q} \}$.
(b) If $q_1, q_2 \in \mathbb{Q}$ then the conjugate of $q_1 + q_2 \sqrt{d}$ (notation : $\overline{q_1 + q_2 \sqrt{d}}$) is the number $q_1 - q_2 \sqrt{d}$.
Remark : Note that every element of $\mathbb{Q}(\sqrt{d})$ can be written uniquely as $q_1 + q_2 \sqrt{d}$ with $q_1, q_2 \in \mathbb{Q}$. Indeed, suppose that $p_1 + p_2 \sqrt{d} = q_1 + q_2 \sqrt{d}$ where $p_1, p_2, q_1, q_2 \in \mathbb{Q}$. Then $(p_2 - q_2) \sqrt{d} = q_1 - p_1$. If $p_2 \ne q_2$, then $\sqrt{d} = \frac{q_1 - p_1}{p_2 - q_2} \in \mathbb{Q}$, a contradiction since $\sqrt{d}$ is irrational (because d is not a perfect square). Hence $q_2 = p_2$ and so also $p_1 = q_1$.
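Theorem 3.24 Let d ∈ N such that d is not a perfect square. Then the following holds :
(a) $\mathbb{Q}(\sqrt{d})$ is a field.
(b) For all $x, y, z \in \mathbb{Q}(\sqrt{d})$ with z ≠ 0, we have that $\overline{xy} = \overline{x}\, \overline{y}$, $\overline{x \pm y} = \overline{x} \pm \overline{y}$ and $\overline{z^{-1}} = \overline{z}^{\,-1}$.
Proof : (a) Since $\mathbb{Q}(\sqrt{d}) \subset \mathbb{R}$ and R is a field, we only need to prove the following
• $x - y \in \mathbb{Q}(\sqrt{d})$ for all $x, y \in \mathbb{Q}(\sqrt{d})$
• $x y^{-1} \in \mathbb{Q}(\sqrt{d})$ for all $x, y \in \mathbb{Q}(\sqrt{d})$ with y ≠ 0
We will prove the second statement. Let $x = p_1 + p_2 \sqrt{d}$ and $y = q_1 + q_2 \sqrt{d}$ where $p_1, p_2, q_1, q_2 \in \mathbb{Q}$ and $(q_1, q_2) \ne (0, 0)$. Then we have that
\begin{align*}
x y^{-1} &= \frac{p_1 + p_2 \sqrt{d}}{q_1 + q_2 \sqrt{d}} \\
&= \frac{(p_1 + p_2 \sqrt{d})(q_1 - q_2 \sqrt{d})}{(q_1 + q_2 \sqrt{d})(q_1 - q_2 \sqrt{d})} \\
&= \frac{(p_1 q_1 - d p_2 q_2) + (p_2 q_1 - p_1 q_2) \sqrt{d}}{q_1^2 - d q_2^2} \\
&= \frac{p_1 q_1 - d p_2 q_2}{q_1^2 - d q_2^2} + \frac{p_2 q_1 - p_1 q_2}{q_1^2 - d q_2^2} \sqrt{d}
\end{align*}
So $x y^{-1} \in \mathbb{Q}(\sqrt{d})$.
(b) We will prove that $\overline{xy} = \overline{x}\, \overline{y}$ for all $x, y \in \mathbb{Q}(\sqrt{d})$. The other formulas are proven similarly.
Let $x = p_1 + p_2 \sqrt{d}$ and $y = q_1 + q_2 \sqrt{d}$ where $p_1, p_2, q_1, q_2 \in \mathbb{Q}$. Then
\begin{align*}
\overline{xy} &= \overline{(p_1 + p_2 \sqrt{d})(q_1 + q_2 \sqrt{d})} \\
&= \overline{(p_1 q_1 + d p_2 q_2) + (p_1 q_2 + p_2 q_1) \sqrt{d}} \\
&= (p_1 q_1 + d p_2 q_2) - (p_1 q_2 + p_2 q_1) \sqrt{d} \\
&= (p_1 - p_2 \sqrt{d})(q_1 - q_2 \sqrt{d}) \\
&= \overline{x}\, \overline{y}
\end{align*}
□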
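The arithmetic of $\mathbb{Q}(\sqrt{d})$ is easy to model with pairs of rational numbers. The class below is a hypothetical sketch (the name `QuadraticNumber` and its methods are illustrative choices): it stores $q_1 + q_2 \sqrt{d}$ as the pair $(q_1, q_2)$, implements the product and inverse formulas from the proof, and spot-checks the conjugation rules of Theorem 3.24(b) on one pair of elements.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class QuadraticNumber:
    """The number q1 + q2*sqrt(d), with q1, q2 rational and d a fixed non-square."""
    q1: Fraction
    q2: Fraction
    d: int

    def __mul__(self, other):
        # (p1 + p2 sqrt(d))(q1 + q2 sqrt(d)) = (p1 q1 + d p2 q2) + (p1 q2 + p2 q1) sqrt(d)
        return QuadraticNumber(
            self.q1 * other.q1 + self.d * self.q2 * other.q2,
            self.q1 * other.q2 + self.q2 * other.q1,
            self.d,
        )

    def conjugate(self):
        return QuadraticNumber(self.q1, -self.q2, self.d)

    def inverse(self):
        # Multiply numerator and denominator by the conjugate.
        norm = self.q1 ** 2 - self.d * self.q2 ** 2   # nonzero unless q1 = q2 = 0
        return QuadraticNumber(self.q1 / norm, -self.q2 / norm, self.d)

x = QuadraticNumber(Fraction(1, 2), Fraction(3), 6)
y = QuadraticNumber(Fraction(-2), Fraction(5, 7), 6)
one = QuadraticNumber(Fraction(1), Fraction(0), 6)

assert x * x.inverse() == one
assert (x * y).conjugate() == x.conjugate() * y.conjugate()
assert x.inverse().conjugate() == x.conjugate().inverse()
```

A check on two sample elements is of course no substitute for the proof; it only guards against sign slips in the formulas.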
Definition 3.25 A number x is a quadratic irrational if there exists d ∈ N such that d is not a perfect square and $x \in \mathbb{Q}(\sqrt{d}) \setminus \mathbb{Q}$.
3.5.2
Periodic Continued Fractions
Definition 3.26 (a) An infinite continued fraction $[a_0; a_1, a_2, \ldots]$ is called periodic if there exist p ∈ N0 and N ∈ N such that $a_{k+p} = a_k$ for all k ≥ N. We write
\[ [a_0; a_1, a_2, \ldots] = [a_0; a_1, a_2, \ldots, a_{N-1}, \overline{a_N, a_{N+1}, \ldots, a_{N+p-1}}] \]
(b) The smallest p with this property is called the period of the periodic continued fraction.
(c) A periodic continued fraction is purely periodic if we can choose N = 0.
Example : $[1; 3, 5, 3, 5, 3, 5, \ldots] = [1; \overline{3, 5}]$. This ISCF is periodic with period 2 but is not purely periodic.
We want to prove that an ISCF is periodic if and only if it is a quadratic irrational. One
direction is quite easy to prove.
Proposition 3.27 Let x = [a0 ; a1 , a2 , . . .] be a periodic ISCF. Then x is a quadratic irrational.
Proof : Suppose that $x = [a_0; a_1, a_2, \ldots, a_{N-1}, \overline{a_N, a_{N+1}, \ldots, a_{N+p-1}}]$. Put $y = [\overline{a_N; a_{N+1}, \ldots, a_{N+p-1}}]$.
By Lemma 3.4, we get that
\[ y = [a_N; a_{N+1}, \ldots, a_{N+p-1}, y] = \frac{p_{p-1}\, y + p_{p-2}}{q_{p-1}\, y + q_{p-2}} \]
where $\langle p_k \rangle_{k \ge -1}$ and $\langle q_k \rangle_{k \ge -1}$ are associated to the ISCF $[a_N; a_{N+1}, \ldots]$.
So we have that
\[ q_{p-1}\, y^2 - (p_{p-1} - q_{p-2})\, y - p_{p-2} = 0 \]
Put $a = q_{p-1}$, $b = p_{p-1} - q_{p-2}$, $c = p_{p-2}$ and $d = (p_{p-1} - q_{p-2})^2 + 4 q_{p-1} p_{p-2}$. Since y > 0, we get that
\[ y = \frac{b + \sqrt{d}}{2a} \]
Note that d ∈ N and that d is not a perfect square since y is irrational by Theorem 3.12.
Using Lemma 3.4, we find that
\[ x = [a_0; a_1, \ldots, a_{N-1}, \overline{a_N, a_{N+1}, \ldots, a_{N+p-1}}] = [a_0; a_1, \ldots, a_{N-1}, y] = \frac{p_{N-1}\, y + p_{N-2}}{q_{N-1}\, y + q_{N-2}} \]
where $\langle p_k \rangle_{k \ge -2}$ and $\langle q_k \rangle_{k \ge -2}$ are associated to the ISCF $[a_0; a_1, \ldots]$. Since y is irrational, $q_{N-1}\, y + q_{N-2} \ne 0$. By Theorem 3.24(a), $\mathbb{Q}(\sqrt{d})$ is a field and so $x \in \mathbb{Q}(\sqrt{d})$. Hence x is a quadratic irrational since x is irrational by Theorem 3.12.
□
Note that a quadratic irrational can be written in a special form.
Lemma 3.28 Let x be a quadratic irrational. Then there exist P, Q ∈ Z and D ∈ N such that Q ≠ 0, D is not a perfect square, $Q \mid (D - P^2)$ and $x = \frac{P + \sqrt{D}}{Q}$.
Proof : Since x is a quadratic irrational, there exists d ∈ N such that d is not a perfect square and $x \in \mathbb{Q}(\sqrt{d}) \setminus \mathbb{Q}$. So $x = q_1 + q_2 \sqrt{d}$ for some $q_1, q_2 \in \mathbb{Q}$. Note that $q_2 \ne 0$ since x is irrational.
Putting $q_1$ and $q_2$ on a common denominator, we see that there exist a, b ∈ Z with b ≠ 0 and c ∈ N0 such that
\[ x = \frac{a + b \sqrt{d}}{c} = \frac{c (a + b \sqrt{d})}{c^2} = \frac{ac + bc \sqrt{d}}{c^2} = \frac{ac + \varepsilon \sqrt{b^2 c^2 d}}{c^2} = \frac{\varepsilon a c + \sqrt{b^2 c^2 d}}{\varepsilon c^2} \]
where ε = sgn(bc) ∈ {−1, 1}. Put $P = \varepsilon a c$, $Q = \varepsilon c^2$ and $D = b^2 c^2 d$. Then P, Q ∈ Z with Q ≠ 0, D ∈ N and $x = \frac{P + \sqrt{D}}{Q}$. Note that D is not a perfect square since x is irrational.
Moreover, $D - P^2 = b^2 c^2 d - a^2 c^2 = \varepsilon c^2 (\varepsilon b^2 d - \varepsilon a^2)$. So $Q \mid (D - P^2)$.
□
We can now prove the converse of Proposition 3.27.
Theorem 3.29 An ISCF is periodic if and only if it is a quadratic irrational.
Proof : If an ISCF is periodic then it is a quadratic irrational by Proposition 3.27.
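So let x be a quadratic irrational. By Lemma 3.28, there exist P, Q ∈ Z and D ∈ N such that Q ≠ 0, D is not a perfect square, $Q \mid (D - P^2)$ and $x = \frac{P + \sqrt{D}}{Q}$. Define $\langle x_k \rangle_{k \ge 0}$ and $\langle a_k \rangle_{k \ge 0}$ as in Definition 3.15. Now define $\langle P_k \rangle_{k \ge 0}$ and $\langle Q_k \rangle_{k \ge 0}$ as follows :
\begin{align*}
P_0 &= P, \quad Q_0 = Q \\
P_k &= a_{k-1} Q_{k-1} - P_{k-1} && \text{for all } k \ge 1 \\
Q_k &= \frac{D - P_k^2}{Q_{k-1}} && \text{for all } k \ge 1
\end{align*}
Claim 1 : For all k ≥ 0, we have that $P_k, Q_k \in \mathbb{Z}$ with $Q_k \ne 0$ and $x_k = \frac{P_k + \sqrt{D}}{Q_k}$.
The proof of Claim 1 is by induction on k. For k = 0, we have that $P_0 = P \in \mathbb{Z}$, $Q_0 = Q \in \mathbb{Z}_0$ and $x = \frac{P + \sqrt{D}}{Q}$. So suppose that the claim is true for k = 0, 1, . . . , n for some n ≥ 0. Then $P_{n+1} = a_n Q_n - P_n \in \mathbb{Z}$ since $P_n, Q_n \in \mathbb{Z}$ by induction. We easily get that
\[ Q_{n+1} = \frac{D - P_{n+1}^2}{Q_n} = \frac{D - (a_n Q_n - P_n)^2}{Q_n} = \frac{D - P_n^2}{Q_n} + 2 a_n P_n - a_n^2 Q_n \]
If n = 0 then $\frac{D - P_0^2}{Q_0} \in \mathbb{Z}$ since $Q \mid (D - P^2)$; if n > 0 then $\frac{D - P_n^2}{Q_n} = Q_{n-1} \in \mathbb{Z}$ by induction. Hence $Q_{n+1} \in \mathbb{Z}$. Note that $Q_{n+1} \ne 0$ since D is not a perfect square. Using induction, we find that
\[ x_{n+1} = \frac{1}{x_n - a_n} = \cfrac{1}{\frac{P_n + \sqrt{D}}{Q_n} - a_n} = \frac{Q_n}{\sqrt{D} - (a_n Q_n - P_n)} = \frac{Q_n}{\sqrt{D} - P_{n+1}} \]
Hence
\[ x_{n+1} = \frac{Q_n (\sqrt{D} + P_{n+1})}{(\sqrt{D} - P_{n+1})(\sqrt{D} + P_{n+1})} = \frac{Q_n (\sqrt{D} + P_{n+1})}{D - P_{n+1}^2} = \cfrac{P_{n+1} + \sqrt{D}}{\frac{D - P_{n+1}^2}{Q_n}} = \frac{P_{n+1} + \sqrt{D}}{Q_{n+1}} \]
which proves Claim 1.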
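Claim 2 : There exists N > 0 such that $\overline{x_k} < 0$ for all k > N.
Pick k ≥ 1. Using the proof of Theorem 3.16 and Lemma 3.4, we get that
\[ x = [a_0; a_1, \ldots, a_{k-1}, x_k] = \frac{p_{k-1} x_k + p_{k-2}}{q_{k-1} x_k + q_{k-2}} \]
Taking conjugates of both sides and using Theorem 3.24(b), we have that
\[ \overline{x} = \frac{p_{k-1} \overline{x_k} + p_{k-2}}{q_{k-1} \overline{x_k} + q_{k-2}} \]
Solving this for $\overline{x_k}$, we find that
\[ \overline{x_k} = \frac{q_{k-2} \overline{x} - p_{k-2}}{p_{k-1} - q_{k-1} \overline{x}} \]
We can rewrite this as
\[ \overline{x_k} = - \frac{q_{k-2}}{q_{k-1}} \cdot \cfrac{\overline{x} - \frac{p_{k-2}}{q_{k-2}}}{\overline{x} - \frac{p_{k-1}}{q_{k-1}}} \]
Since $\lim_{k \to \infty} \frac{p_k}{q_k} = x$, we get that
\[ \lim_{k \to \infty} \cfrac{\overline{x} - \frac{p_{k-2}}{q_{k-2}}}{\overline{x} - \frac{p_{k-1}}{q_{k-1}}} = \frac{\overline{x} - x}{\overline{x} - x} = 1 \]
Hence there exists N ≥ 1 such that
\[ \left| - \frac{q_{k-1}}{q_{k-2}} \overline{x_k} - 1 \right| < 1 \quad \text{for all } k > N \]
So we have that
\[ -1 < - \frac{q_{k-1}}{q_{k-2}} \overline{x_k} - 1 < 1 \quad \text{for all } k > N \]
In particular, we get that
\[ - \frac{q_{k-1}}{q_{k-2}} \overline{x_k} > 0 \quad \text{for all } k > N \]
Since $q_{k-1}$ and $q_{k-2}$ are positive for all k > N, we get that
\[ \overline{x_k} < 0 \quad \text{for all } k > N \]
which proves Claim 2.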
Claim 3 : $-\sqrt{D} < P_k < \sqrt{D}$ and $0 < Q_k < 2\sqrt{D}$ for all k > N.
Pick k > N. By Claim 1, we have that
\[ \frac{P_k + \sqrt{D}}{Q_k} = x_k > 1 \qquad (*) \]
By Claim 2, we get that
\[ \frac{-P_k + \sqrt{D}}{Q_k} = -\overline{x_k} > 0 \qquad (**) \]
Multiplying (*) and (**), we get that
\[ \frac{D - P_k^2}{Q_k^2} > 0 \]
So $D - P_k^2 > 0$. Hence
\[ -\sqrt{D} < P_k < \sqrt{D} \]
Adding (*) and (**), we get that
\[ \frac{2\sqrt{D}}{Q_k} > 0 \]
So $Q_k > 0$. Hence it follows from (*) that
\[ Q_k < P_k + \sqrt{D} < 2\sqrt{D} \]
which proves Claim 3.
We can now finish the proof of the theorem. By Claim 1, $(P_k, Q_k) \in \mathbb{Z}^2$ for all k ≥ 0. By Claim 3, the set $\{ (P_k, Q_k) \mid k \ge 0 \}$ is finite. Hence there exist m, n ∈ N0 such that m < n and $(P_m, Q_m) = (P_n, Q_n)$. We prove by induction on k that $x_{k+n-m} = x_k$ and $a_{k+n-m} = a_k$ for all k ≥ m. Let k = m. Using Claim 1, we get that
\[ x_n = \frac{P_n + \sqrt{D}}{Q_n} = \frac{P_m + \sqrt{D}}{Q_m} = x_m \]
Hence $a_n = [x_n] = [x_m] = a_m$. Assume now that k > m. Again using Claim 1 and induction, we get that
\[ x_{k+n-m} = \frac{1}{x_{k-1+n-m} - a_{k-1+n-m}} = \frac{1}{x_{k-1} - a_{k-1}} = x_k \]
So $a_{k+n-m} = [x_{k+n-m}] = [x_k] = a_k$.
Hence
\[ x = [a_0; a_1, a_2, \ldots] = [a_0; a_1, \ldots, a_{m-1}, \overline{a_m, a_{m+1}, \ldots, a_{n-1}}] \]
So the ISCF for x is periodic.
□
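The proof is constructive: iterating the recurrences for $P_k$ and $Q_k$ until a pair $(P_k, Q_k)$ repeats yields the pre-period and the period. The following is a sketch of that idea in exact integer arithmetic. The function name `periodic_expansion` is an illustrative choice, and the way the floor $a_k = [x_k]$ is computed from the integer part of $\sqrt{D}$ is an implementation detail not taken from the notes.

```python
from math import isqrt

def periodic_expansion(P, Q, D):
    """Continued fraction of (P + sqrt(D)) / Q.

    Assumes Q != 0, Q divides D - P*P and D is a positive non-square.
    Returns (pre_period, period) as two lists of partial quotients.
    """
    if Q == 0 or (D - P * P) % Q != 0 or isqrt(D) ** 2 == D:
        raise ValueError("need Q != 0, Q | (D - P^2) and D not a perfect square")
    s = isqrt(D)                 # integer part of sqrt(D)
    seen = {}                    # maps the pair (P_k, Q_k) to the index k
    terms = []
    while (P, Q) not in seen:
        seen[(P, Q)] = len(terms)
        # a_k = floor((P_k + sqrt(D)) / Q_k), using integers only;
        # for Q_k < 0 the quotient is negative and never an integer.
        a = (P + s) // Q if Q > 0 else -((P + s) // -Q) - 1
        terms.append(a)
        P = a * Q - P            # P_{k+1} = a_k Q_k - P_k
        Q = (D - P * P) // Q     # Q_{k+1} = (D - P_{k+1}^2) / Q_k
    start = seen[(P, Q)]
    return terms[:start], terms[start:]

print(periodic_expansion(0, 1, 6))   # sqrt(6)
print(periodic_expansion(1, 2, 5))   # (1 + sqrt(5)) / 2
```

For $\sqrt{6}$ this returns the pre-period [2] and the period [2, 4], in agreement with the earlier example. Termination of the loop is exactly the finiteness statement that follows from Claim 3.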
The final result in this section is the description of quadratic irrationals with a purely periodic
ISCF.
Definition 3.30 Let x be a quadratic irrational. Then x is reduced if x > 1 and $-1 < \overline{x} < 0$.
Examples : $\sqrt{5}$ and $\frac{1 + \sqrt{5}}{2}$ are both quadratic irrationals but $\sqrt{5}$ is not reduced while $\frac{1 + \sqrt{5}}{2}$ is reduced.
Reduced quadratic irrationals have the following property.
Lemma 3.31 Let x be a reduced quadratic irrational. Put $y = \frac{1}{x - [x]}$. Then y is reduced.
Proof : By definition, x > 1 and $-1 < \overline{x} < 0$. Since x is irrational, we have that 0 < x − [x] < 1 and so
\[ y = \frac{1}{x - [x]} > 1 \]
Since x is a quadratic irrational, we get that there exists d ∈ N such that d is not a perfect square and $x \in \mathbb{Q}(\sqrt{d})$. By Theorem 3.24, $\mathbb{Q}(\sqrt{d})$ is a field. Clearly, y is irrational and so y is a quadratic irrational.
Since x > 1 (so [x] ≥ 1) and $\overline{x} < 0$, we have that $\overline{x} - [x] < -1$. So
\[ -1 < \frac{1}{\overline{x} - [x]} < 0 \]
Using Theorem 3.24(b), we get that
\[ \overline{y} = \overline{\left( \frac{1}{x - [x]} \right)} = \frac{1}{\overline{x - [x]}} = \frac{1}{\overline{x} - \overline{[x]}} = \frac{1}{\overline{x} - [x]} \]
Hence $-1 < \overline{y} < 0$. So y is reduced.
□
Theorem 3.32 An ISCF is purely periodic if and only if it is a reduced quadratic irrational.
Proof : Suppose first that $x := [a_0; a_1, a_2, \ldots]$ is a purely periodic ISCF, say $x = [\overline{a_0; a_1, \ldots, a_n}]$ for some n ≥ 0. Note that x is a quadratic irrational by Theorem 3.29. By Lemma 3.4, we get that
\[ x = [\overline{a_0; a_1, \ldots, a_n}] = [a_0; a_1, \ldots, a_n, x] = \frac{p_n x + p_{n-1}}{q_n x + q_{n-1}} \]
Hence
\[ q_n x^2 - (p_n - q_{n-1}) x - p_{n-1} = 0 \]
Applying the conjugate to this equation and using Theorem 3.24(b), we get that
\[ q_n \overline{x}^2 - (p_n - q_{n-1}) \overline{x} - p_{n-1} = 0 \]
So x and $\overline{x}$ are the roots of the quadratic function $f(t) = q_n t^2 - (p_n - q_{n-1}) t - p_{n-1}$. Note that
\[ f(-1) = q_n + p_n - q_{n-1} - p_{n-1} = (q_n - q_{n-1}) + (p_n - p_{n-1}) > 0 \]
Since $a_0 > 0$, we get that
\[ f(0) = -p_{n-1} < 0 \]
By the Intermediate Value Theorem, f(t) has a root in (−1, 0). By Theorem 3.10, we have that
\[ 1 \le a_0 = C_0 < x \]
Since the only roots of f(t) are x and $\overline{x}$, we get that
\[ -1 < \overline{x} < 0 \]
So x is reduced.
Suppose next that x is a reduced quadratic irrational. Define $\langle x_k \rangle_{k \ge 0}$ and $\langle a_k \rangle_{k \ge 0}$ as in Definition 3.15. Using Lemma 3.31 and induction on k, we see that $x_k$ is a reduced quadratic irrational for all k ≥ 0.
Pick k ≥ 0. Since $x_{k+1} = \frac{1}{x_k - a_k}$, we get that
\[ x_k = a_k + \frac{1}{x_{k+1}} \qquad (*) \]
Taking the conjugates of both sides and using Theorem 3.24(b), we find that
\[ \overline{x_k} = \overline{a_k + \frac{1}{x_{k+1}}} = \overline{a_k} + \overline{\left( \frac{1}{x_{k+1}} \right)} = a_k + \frac{1}{\overline{x_{k+1}}} \]
Note that $-1 < \overline{x_k} < 0$ since $x_k$ is reduced. Hence we get
\[ -1 < a_k + \frac{1}{\overline{x_{k+1}}} < 0 \]
We can rewrite this as
\[ a_k < - \frac{1}{\overline{x_{k+1}}} < a_k + 1 \]
Hence
\[ \left[ - \frac{1}{\overline{x_{k+1}}} \right] = a_k \qquad (**) \]
It follows from (the proof of) Theorem 3.29 that $x = [a_0; a_1, a_2, \ldots]$ is periodic and that we can choose m ∈ N minimal such that there exists n > m with $x_m = x_n$. Suppose that m ≥ 1. Using (**), we get that
\[ a_{m-1} = \left[ - \frac{1}{\overline{x_m}} \right] = \left[ - \frac{1}{\overline{x_n}} \right] = a_{n-1} \]
Hence by (*), we have that
\[ x_{m-1} = a_{m-1} + \frac{1}{x_m} = a_{n-1} + \frac{1}{x_n} = x_{n-1} \]
a contradiction to the minimality of m.
Hence m = 0 and
\[ x = [a_0; a_1, a_2, \ldots] = [\overline{a_0; a_1, a_2, \ldots, a_{n-1}}] \]
So the ISCF for x is purely periodic.
□
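Theorem 3.32 can be illustrated on a few small cases. The sketch below is hypothetical helper code: `is_reduced` tests Definition 3.30 in floating point for $x = \frac{P + \sqrt{D}}{Q}$, and `first_terms` lists partial quotients using the integer recurrences from the proof of Theorem 3.29, under the extra assumption that every $Q_k$ is positive (which holds for the examples used here).

```python
import math

def is_reduced(P, Q, D):
    """Check x > 1 and -1 < conjugate(x) < 0 for x = (P + sqrt(D)) / Q."""
    root = math.sqrt(D)
    x, conjugate = (P + root) / Q, (P - root) / Q
    return x > 1 and -1 < conjugate < 0

def first_terms(P, Q, D, count):
    """First `count` partial quotients of (P + sqrt(D)) / Q, assuming all Q_k > 0."""
    s = math.isqrt(D)
    terms = []
    for _ in range(count):
        if Q <= 0:
            raise ValueError("this sketch only handles Q_k > 0")
        a = (P + s) // Q
        terms.append(a)
        P = a * Q - P
        Q = (D - P * P) // Q
    return terms

# 2 + sqrt(6) is reduced and its expansion repeats from the very first term.
print(is_reduced(2, 1, 6), first_terms(2, 1, 6, 6))
# sqrt(6) is not reduced and its expansion has the pre-period [2].
print(is_reduced(0, 1, 6), first_terms(0, 1, 6, 6))
```

The first line shows the purely periodic pattern 4, 2, 4, 2, . . . while the second shows 2 followed by the repeating block 2, 4.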
Chapter 4
Elliptic Curves
In this chapter, we give a very brief introduction to elliptic curves. We will mention most
results without proof.
4.1
Cubic Curves
Definition 4.1
(a) A curve is a subset of $\mathbb{C}^2$ of the form
\[ \{ (a, b) \in \mathbb{C}^2 \mid f(a, b) = 0 \} \]
where f(x, y) ∈ C[x, y].
(b) Let C be the curve f(x, y) = 0 and (a, b) ∈ C. Then (a, b) is a singular point on C if
\[ \frac{\partial f}{\partial x}(a, b) = \frac{\partial f}{\partial y}(a, b) = 0 \]
(c) A cubic curve is a curve of the form
\[ y^2 = a x^3 + b x^2 + c x + d \]
where a, b, c, d ∈ C with a ≠ 0.
Note that we are mostly interested in finding rational points on the curve f (x, y) = 0 where
f (x, y) ∈ Q[x, y].
Lemma 4.2 Let f(x) ∈ C[x] and C the curve $y^2 = f(x)$. Then $(x_0, y_0)$ is a singular point on C if and only if $y_0 = 0$ and $f(x_0) = f'(x_0) = 0$. In particular, a cubic curve has at most one singular point.
Proof : Let $(x_0, y_0) \in C$. Then $(x_0, y_0)$ is a singular point on C if and only if
\[ 2 y_0 = 0 \quad \text{and} \quad f'(x_0) = 0 \]
So $y_0 = 0$. Since $(x_0, y_0) \in C$, we have that
\[ y_0^2 = f(x_0) \]
Hence $(x_0, y_0)$ is a singular point on C if and only if $y_0 = 0$ and $f(x_0) = f'(x_0) = 0$. This means that $x_0$ is a root of f(x) of multiplicity at least two.
Suppose now that C is a cubic curve. Then f(x) is a polynomial of degree three. So f(x) can have at most one root of multiplicity at least two. Hence C has at most one singular point. □
Let a, b, c, d, e, f, g ∈ Q with a, d ≠ 0. Then there exist $q_1, q_2, q_3, q_4 \in \mathbb{Q}$ such that the substitution
\[ x = q_1 X + q_2, \qquad y = q_3 Y + q_4 \]
transforms the curve
\[ a y^2 + b y + c = d x^3 + e x^2 + f x + g \]
into a curve of the form
\[ Y^2 = X^3 + A X + B \]
where A, B ∈ Q. So we concentrate on cubic curves of the latter form.
Lemma 4.3 Let C be the cubic curve $y^2 = x^3 + ax + b$ where a, b ∈ C. Then C has a singular point if and only if $4a^3 + 27b^2 = 0$.
Proof : By Lemma 4.2, C has a singular point if and only if $x^3 + ax + b$ has a root of multiplicity at least two. This happens if and only if
\[ x^3 + ax + b = (x - r)^2 (x - s) \]
for some r, s ∈ C. Note that
\[ (x - r)^2 (x - s) = x^3 - (s + 2r) x^2 + (2rs + r^2) x - r^2 s \]
for all r, s ∈ C. So C has a singular point if and only if the system of equations
\[ S := \begin{cases} s + 2r = 0 & (1) \\ 2rs + r^2 = a & (2) \\ -r^2 s = b & (3) \end{cases} \]
has a solution $(r, s) \in \mathbb{C}^2$.
Suppose first that S has a solution $(r, s) \in \mathbb{C}^2$. From (1), we get that s = −2r. Substituting this in (2) and (3), we have that
\[ -3r^2 = a \quad \text{and} \quad 2r^3 = b \]
Hence
\[ 4a^3 + 27b^2 = 4(-3r^2)^3 + 27(2r^3)^2 = 0 \]
Suppose next that $4a^3 + 27b^2 = 0$. If a = 0 then b = 0 and (r, s) = (0, 0) is a solution of S. So we may assume that a ≠ 0. One easily checks that $(r, s) = \left( -\frac{3b}{2a}, \frac{3b}{a} \right)$ is a solution of S. □
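Lemma 4.3 and its proof give a direct test for singularity. The sketch below restricts to rational coefficients (an assumption made so that exact arithmetic can be used; the lemma itself is stated over C), and the function name `singular_point` is illustrative. It returns the singular point $(r, 0)$ with $r = -\frac{3b}{2a}$ when the curve is singular.

```python
from fractions import Fraction

def singular_point(a, b):
    """Singular point of y^2 = x^3 + a*x + b, or None if 4a^3 + 27b^2 != 0."""
    a, b = Fraction(a), Fraction(b)
    if 4 * a ** 3 + 27 * b ** 2 != 0:
        return None
    # r is the double root of x^3 + a*x + b found in the proof of Lemma 4.3.
    r = Fraction(0) if a == 0 else -3 * b / (2 * a)
    return (r, Fraction(0))

# x^3 - 3x + 2 = (x - 1)^2 (x + 2) has the double root 1.
print(singular_point(-3, 2))
# y^2 = x^3 + 17 has no singular point.
print(singular_point(0, 17))
```

The first curve is singular at (1, 0), while the second is an elliptic curve in the sense of the next section.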
4.2
Elliptic Curves and the Group Law
Definition 4.4 An elliptic curve is a curve of the form
\[ \{ (x, y) \in \mathbb{C}^2 \mid y^2 = x^3 + ax + b \} \]
where a, b ∈ C with $4a^3 + 27b^2 \ne 0$. An elliptic curve over Q is an elliptic curve where a, b ∈ Q.
Let E be an elliptic curve. What are the points at infinity of E? For this we need some projective geometry. Suppose that E is the curve
\[ y^2 = x^3 + ax + b \]
where $4a^3 + 27b^2 \ne 0$. The associated homogeneous equation is
\[ Y^2 Z = X^3 + a X Z^2 + b Z^3 \]
The projective line at infinity has equation Z = 0. So the points at infinity of E are the solutions of
\[ Y^2 Z = X^3 + a X Z^2 + b Z^3, \qquad Z = 0 \]
We easily get that the point (0, 1, 0) is the only solution. We denote this point by e. Note that
a line in the XY -plane that goes through e is a vertical line.
What are the points of intersection between a line and the elliptic curve y 2 = x3 + ax + b?
Suppose first that the line is not vertical. Then the line has an equation of the form y = mx + c
where m, c ∈ C. Hence the points of intersections are the solutions of
\[ y = mx + c, \qquad y^2 = x^3 + ax + b \]
Substituting 'y = mx + c' into the second equation and simplifying, we get that
\[ y = mx + c, \qquad x^3 - m^2 x^2 + (a - 2mc) x + b - c^2 = 0 \qquad (*) \]
Hence, counting multiplicities, we see that a non-vertical line intersects an elliptic curve E in three points on E.
Suppose next that the line is vertical. Then the line has an equation of the form x = c where c ∈ C. Hence the points of intersections are the solutions of
\[ x = c, \qquad y^2 = x^3 + ax + b \]
Substituting 'x = c' into the second equation, we get that
\[ x = c, \qquad y^2 = c^3 + ac + b \]
Hence, counting multiplicities, we see that a vertical line intersects an elliptic curve E in two points on E and the point e (at infinity).
Definition 4.5 Let E be the elliptic curve y 2 = x3 + ax + b.
(a) We put
\[ E(\mathbb{C}) = \{ (x, y) \in \mathbb{C}^2 \mid y^2 = x^3 + ax + b \} \cup \{ e \} \]
\[ E(\mathbb{Q}) = \{ (x, y) \in \mathbb{Q}^2 \mid y^2 = x^3 + ax + b \} \cup \{ e \} \]
(b) Let $P_1, P_2 \in E(\mathbb{C})$. We define a third point in E(C), which we denote by $P_1 + P_2$, as follows :
1. $P_1 \ne e \ne P_2$ and $P_1 \ne P_2$
Suppose first that the line $P_1 P_2$ is not vertical. Then the line $P_1 P_2$ intersects E in a third point (x, y). We put
\[ P_1 + P_2 = (x, -y) \]
Suppose next that the line $P_1 P_2$ is vertical. We put
\[ P_1 + P_2 = e \]
2. $P_1 \ne e \ne P_2$ and $P_1 = P_2$
Suppose first that the tangent line to E at $P_1$ is not vertical. Then this tangent line intersects E in a third point (x, y). We put
\[ P_1 + P_2 = (x, -y) \]
Suppose next that the tangent line to E at $P_1$ is vertical. We put
\[ P_1 + P_2 = e \]
3. $P_1 = e$ or $P_2 = e$
We put
\[ P_1 + e = P_1 \quad \text{and} \quad e + P_2 = P_2 \]
The following theorem is far from obvious.
Theorem 4.6 Let E be an elliptic curve. Then (E(C), +) is an abelian group. Moreover, if E is an elliptic curve over Q then E(Q) is a subgroup of E(C).
Proof : Clearly, e is an identity element and $P_1 + P_2 = P_2 + P_1$ for all $P_1, P_2 \in E(\mathbb{C})$. The inverse of (x, y) ∈ E is (x, −y) while the inverse of e is e. Associativity is quite hard to prove.
If E is an elliptic curve over Q, then it follows from (*) on page 61 that $P_1 - P_2 \in E(\mathbb{Q})$ for all $P_1, P_2 \in E(\mathbb{Q})$; so E(Q) is a subgroup of E(C).
□
The following propositions allow us to perform practical calculations in E(C).
Proposition 4.7 Let E be the elliptic curve $y^2 = x^3 + ax + b$ and $P = (x_1, y_1)$, $Q = (x_2, y_2) \in E$ with $x_1 \ne x_2$. Then $P + Q = (x_3, y_3)$ where
\[ x_3 = \left( \frac{y_2 - y_1}{x_2 - x_1} \right)^2 - x_1 - x_2, \qquad y_3 = - \left( \frac{y_2 - y_1}{x_2 - x_1} \right) (x_3 - x_1) - y_1 \]
Proof : The line through P and Q has equation
\[ y = \frac{y_2 - y_1}{x_2 - x_1} (x - x_1) + y_1 \]
It follows from (*) on page 61 that
\[ x_1 + x_2 + x_3 = \left( \frac{y_2 - y_1}{x_2 - x_1} \right)^2 \]
The formulas for $x_3$ and $y_3$ are now easily deduced.
□
Proposition 4.8 Let E be the elliptic curve $y^2 = x^3 + ax + b$ and $P = (x_1, y_1) \in E$ with $y_1 \ne 0$. Then $P + P = (x_2, y_2)$ where
\[ x_2 = \left( \frac{3 x_1^2 + a}{2 y_1} \right)^2 - 2 x_1, \qquad y_2 = - \left( \frac{3 x_1^2 + a}{2 y_1} \right) (x_2 - x_1) - y_1 \]
Proof : Using implicit differentiation, we get that
\[ 2y \frac{dy}{dx} = 3x^2 + a \]
Hence the tangent line to E at P has equation
\[ y = \frac{3 x_1^2 + a}{2 y_1} (x - x_1) + y_1 \]
It follows from (*) on page 61 that
\[ x_1 + x_1 + x_2 = \left( \frac{3 x_1^2 + a}{2 y_1} \right)^2 \]
The formulas for $x_2$ and $y_2$ are now easily deduced.
□
Example : Consider the elliptic curve
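\[ E : y^2 = x^3 + 17 \]
We easily see the following integral points on E :
\[ P = (-1, 4) \quad \text{and} \quad Q = (2, 5) \]
We can now use the group law to construct new rational points on E :
\[ P + P = \left( \frac{137}{64}, -\frac{2651}{512} \right), \qquad Q + Q = \left( -\frac{64}{25}, \frac{59}{125} \right) \]
\[ P + Q = \left( -\frac{8}{9}, -\frac{109}{27} \right), \qquad P - Q = (8, 23) \]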
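The formulas of Propositions 4.7 and 4.8 can be turned into a short routine. The following sketch uses exact rational arithmetic; the function name `add_points` and the convention of representing the point e by `None` are assumptions made for illustration. It reproduces the four points computed in the example.

```python
from fractions import Fraction

def add_points(P, Q, a):
    """Add two points on y^2 = x^3 + a*x + b; None stands for the point e."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 != x2:
        slope = Fraction(y2 - y1) / (x2 - x1)          # Proposition 4.7
    elif y1 == y2 and y1 != 0:
        slope = Fraction(3 * x1 * x1 + a) / (2 * y1)   # Proposition 4.8
    else:
        return None                                    # vertical line: the sum is e
    x3 = slope ** 2 - x1 - x2
    y3 = -slope * (x3 - x1) - y1
    return (x3, y3)

# The curve y^2 = x^3 + 17 has a = 0.
P, Q = (-1, 4), (2, 5)
minus_Q = (2, -5)
print(add_points(P, P, 0))
print(add_points(Q, Q, 0))
print(add_points(P, Q, 0))
print(add_points(P, minus_Q, 0))
```

Note that the coefficient b never enters the formulas; it only matters for checking that the input points lie on the curve.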
We finish this section by mentioning without proof some major results. Note that the proof of
most of these results is extremely complicated.
Theorem 4.9 Let E be the elliptic curve $y^2 = x^3 + ax + b$ with a, b ∈ Q. Then the following holds :
(a) (Mordell, 1922) The group (E(Q), +) is finitely generated.
(b) (Siegel, 1929) There are only finitely many integral points on E.
(c) (Lutz-Nagell, 1937) If a, b ∈ Z and (x, y) ∈ E(Q) has finite order, then x, y ∈ Z and either y = 0 or $y \mid (4a^3 + 27b^2)$.
(d) (Mazur, 1976) The torsion subgroup of E(Q) (this is the set of all elements in E(Q) of finite order) is isomorphic to one of the following groups : Z/mZ with m ∈ {1, 2, . . . , 9, 10, 12} or Z/2mZ ⊕ Z/2Z with m ∈ {1, 2, 3, 4}.
4.3
Sums of Two Cubes
Let k ∈ N0 . Does there exist a number n that can be written as the sum of k (positive)
cubes in k different ways? Elliptic curves give us a construction to answer this question in the
affirmative.
We want to find numbers $n \neq 0$ such that the Diophantine equation
$$X^3 + Y^3 = n$$
has $k$ different integral solutions (note that if $(a, b)$ is a solution, we do not consider $(b, a)$ to be a different solution). We make the substitution
$$X = \frac{36n + y}{6x}, \qquad Y = \frac{36n - y}{6x}$$
After making this substitution and simplifying, we get the elliptic curve
$$y^2 = x^3 - 432n^2$$
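For readers who want to see the simplification, here is one way to carry it out. The shorthand $s = X + Y$ and $d = X - Y$ is introduced only for this computation and does not appear elsewhere in the notes.

```latex
s = X + Y = \frac{12n}{x}, \qquad d = X - Y = \frac{y}{3x}

X^3 + Y^3 = \frac{s\,(s^2 + 3d^2)}{4}
          = \frac{3n}{x}\left(\frac{144n^2}{x^2} + \frac{y^2}{3x^2}\right)
          = \frac{n\,(432n^2 + y^2)}{x^3}

X^3 + Y^3 = n \ (n \neq 0) \iff x^3 = 432n^2 + y^2 \iff y^2 = x^3 - 432n^2
```

The first identity uses $s^2 + 3d^2 = 4(X^2 - XY + Y^2)$ together with $X^3 + Y^3 = (X + Y)(X^2 - XY + Y^2)$.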
Note that we can solve our substitution rules for $x$ and $y$ :
$$x = \frac{12n}{X + Y}, \qquad y = \frac{36n(X - Y)}{X + Y}$$
Hence there is a bijection between the rational points on the elliptic curve $y^2 = x^3 - 432n^2$ and the rational points on the curve $X^3 + Y^3 = n$ (note that integral points on one curve do not necessarily lead to integral points on the other curve).
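The two directions of this change of variables can be sketched as a pair of small functions. This is an illustration only: the names `cubic_to_curve` and `curve_to_cubic` are made up here, and the inputs are assumed to be `Fraction` values so that all arithmetic stays exact. Note that for $n \neq 0$ neither denominator can vanish: $X + Y = 0$ would force $X^3 + Y^3 = 0$, and $x = 0$ would force $y^2 = -432n^2 < 0$.

```python
from fractions import Fraction

def cubic_to_curve(X, Y, n):
    """Map (X, Y) on X^3 + Y^3 = n to (x, y) on y^2 = x^3 - 432*n^2."""
    s = X + Y                      # nonzero whenever n != 0
    return (12 * n / s, 36 * n * (X - Y) / s)

def curve_to_cubic(x, y, n):
    """Map (x, y) on y^2 = x^3 - 432*n^2 back to (X, Y) on X^3 + Y^3 = n."""
    return ((36 * n + y) / (6 * x), (36 * n - y) / (6 * x))

n = Fraction(9)
x, y = cubic_to_curve(Fraction(2), Fraction(1), n)   # 2^3 + 1^3 = 9
assert y * y == x ** 3 - 432 * n * n                 # the image lies on the elliptic curve
assert curve_to_cubic(x, y, n) == (2, 1)             # the round trip recovers (X, Y)
print(x, y)   # 36 108
```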
Starting with a certain value for $X$, $Y$ and $n$, we get a rational point (mostly of infinite order) on the elliptic curve; using the group law, we easily get more rational points on the elliptic curve which lead to rational points on $X^3 + Y^3 = n$; clearing the denominators will lead to integral solutions of $X^3 + Y^3 = m$ for some $m \in \mathbb{N}$.
We illustrate this with the case n = 9.
We easily get that $(X_1, Y_1) = (2, 1)$ is a solution of
$$X^3 + Y^3 = 9$$
The associated elliptic curve is
$$y^2 = x^3 - 34992$$
which has the rational point
P = (36, 108)
We easily get that
P + P = (252, −3996)
The associated rational point on $X^3 + Y^3 = 9$ is
$$(X_2, Y_2) = \left(-\frac{17}{7}, \frac{20}{7}\right)$$
So we have
$$2^3 + 1^3 = 9$$
$$\left(-\frac{17}{7}\right)^3 + \left(\frac{20}{7}\right)^3 = 9$$
Multiplying both equations by $7^3$, we get that
$$(2 \cdot 7)^3 + (1 \cdot 7)^3 = 9 \cdot 7^3 = (-17)^3 + 20^3$$
So $9 \cdot 7^3 = 3087$ is a number that can be written as the sum of two cubes in (at least) two different ways.
We can keep going :
P + P + P = (73, 595)
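Hence we have
$$(X_3, Y_3) = \left(\frac{919}{438}, -\frac{271}{438}\right)$$
and
$$2^3 + 1^3 = 9$$
$$\left(-\frac{17}{7}\right)^3 + \left(\frac{20}{7}\right)^3 = 9$$
$$\left(\frac{919}{438}\right)^3 + \left(-\frac{271}{438}\right)^3 = 9$$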
Multiplying all three equations by $(7 \cdot 438)^3$, we get that
$$(2 \cdot 7 \cdot 438)^3 + (1 \cdot 7 \cdot 438)^3 = (-17 \cdot 438)^3 + (20 \cdot 438)^3 = (919 \cdot 7)^3 + (-271 \cdot 7)^3 = 9 \cdot (7 \cdot 438)^3$$
So $9 \cdot (7 \cdot 438)^3 = 259393423464$ is a number that can be written as the sum of two cubes in (at least) three different ways.
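The whole procedure for $n = 9$ (compute multiples of $P$, map them back to the cubic, clear denominators) can be sketched as follows. This is a hedged illustration and not the notes' own code: the helper names `add` and `to_cubic` and the constant `N` are chosen here, and `add` assumes that the sum is never the point at infinity, which is the case for the multiples used below.

```python
from fractions import Fraction
from math import gcd

N = 9                                  # we work with X^3 + Y^3 = 9

def add(A, B):
    """Chord-and-tangent sum on y^2 = x^3 - 432*N^2 (so a = 0).

    Assumes the result is an affine point (no division by zero occurs).
    """
    (x1, y1), (x2, y2) = A, B
    m = 3 * x1 * x1 / (2 * y1) if A == B else (y2 - y1) / (x2 - x1)
    x3 = m * m - x1 - x2
    return (x3, -(m * (x3 - x1) + y1))

def to_cubic(pt):
    """Map (x, y) on the elliptic curve to (X, Y) on X^3 + Y^3 = N."""
    x, y = pt
    return ((36 * N + y) / (6 * x), (36 * N - y) / (6 * x))

P = (Fraction(36), Fraction(108))
multiples = [P]
for _ in range(2):
    multiples.append(add(multiples[-1], P))    # appends 2P, then 3P

solutions = [to_cubic(pt) for pt in multiples]

# Least common multiple of all denominators that occur.
D = 1
for X, Y in solutions:
    for q in (X.denominator, Y.denominator):
        D = D * q // gcd(D, q)

m = N * D ** 3
reps = [(int(X * D), int(Y * D)) for X, Y in solutions]
assert all(u ** 3 + v ** 3 == m for u, v in reps)
print(D, m)    # 3066 259393423464
print(reps)    # [(6132, 3066), (-7446, 8760), (6433, -1897)]
```

Increasing the range of the loop produces more representations, at the cost of rapidly growing denominators.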
What if we only want to use positive cubes? Suppose that the point $P$ has infinite order (it is rather rare for $P$ to have finite order). For $k \geq 1$, we can calculate $(X_k, Y_k)$ (the rational point associated to $kP$). One can prove that $X_k, Y_k > 0$ for infinitely many values of $k$. We list the first few values of $(X_k, Y_k)$ in our case ($n = 9$) :
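$$(X_1, Y_1) = (2, 1)$$
$$(X_2, Y_2) = \left(-\frac{17}{7}, \frac{20}{7}\right)$$
$$(X_3, Y_3) = \left(\frac{919}{438}, -\frac{271}{438}\right)$$
$$(X_4, Y_4) = \left(-\frac{36520}{90391}, \frac{188479}{90391}\right)$$
$$(X_5, Y_5) = \left(\frac{169748279}{53023559}, -\frac{152542262}{53023559}\right)$$
$$(X_6, Y_6) = \left(\frac{415280564497}{348671682660}, \frac{676702467503}{348671682660}\right)$$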
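A short search confirms which entries of this list have both coordinates positive. The sketch below is illustrative only: `first_positive`, its parameters `start` and `limit`, and the helper names are assumptions made here, and `add` again assumes that no multiple of $P$ in the searched range is the point at infinity.

```python
from fractions import Fraction

N = 9                                  # we work with X^3 + Y^3 = 9
P = (Fraction(36), Fraction(108))      # point on y^2 = x^3 - 432*N^2

def add(A, B):
    """Chord-and-tangent sum on y^2 = x^3 - 432*N^2; assumes an affine result."""
    (x1, y1), (x2, y2) = A, B
    m = 3 * x1 * x1 / (2 * y1) if A == B else (y2 - y1) / (x2 - x1)
    x3 = m * m - x1 - x2
    return (x3, -(m * (x3 - x1) + y1))

def to_cubic(pt):
    """Map (x, y) on the elliptic curve to (X, Y) on X^3 + Y^3 = N."""
    x, y = pt
    return ((36 * N + y) / (6 * x), (36 * N - y) / (6 * x))

def first_positive(start=2, limit=50):
    """Smallest k in [start, limit] with X_k > 0 and Y_k > 0, or None."""
    kP = P
    for k in range(1, limit + 1):
        if k > 1:
            kP = add(kP, P)            # kP = (k-1)P + P
        X, Y = to_cubic(kP)
        if k >= start and X > 0 and Y > 0:
            return k, X, Y
    return None

k, X, Y = first_positive()
assert X ** 3 + Y ** 3 == N
print(k)       # 6
print(X, Y)
```

Since $(X_1, Y_1) = (2, 1)$ and $(X_6, Y_6)$ are both positive, multiplying by the cube of the denominator $348671682660$ of $X_6$ and $Y_6$ gives an integer that is a sum of two positive cubes in two different ways.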