Number Theory (part 6): Continued Fractions and Diophantine Equations
(by Evan Dummit, 2014, v. 1.00)
Contents
6 Continued Fractions and Diophantine Equations
6.1 Linear Diophantine Equations
6.1.1 The Frobenius Coin Problem
6.2 The Equation $x^2 + y^2 = z^2$: Pythagorean Triples
6.3 Continued Fractions
6.3.1 Convergents to Continued Fractions
6.3.2 Infinite Continued Fractions
6.3.3 Periodic Continued Fractions
6.4 The Farey Sequences
6.5 Rational Approximation
6.6 Pell's Equation
6.7 An Assortment of Diophantine Equations
6.8 Some Remarks on Fermat's Last Theorem

6 Continued Fractions and Diophantine Equations
In this chapter, we discuss Diophantine equations, which are concerned with the problem of solving equations over the integers: one of the earliest nontrivial examples was posed by Diophantus, whence the general name of Diophantine equation for this class of problems. One of the most famous Diophantine equations is the Fermat equation $x^n + y^n = z^n$, which we will make a central focus of study. Another important equation is Pell's equation $x^2 - Dy^2 = 1$, whose solution is related to another important topic in classical number theory: continued fractions and rational approximation. We begin by analyzing a few simple Diophantine equations, then discuss the ideas behind rational approximation and develop the theory of continued fractions. Finally, we apply these results, and others, to solve various Diophantine equations.

There is no general procedure for deciding whether a given Diophantine equation possesses any solutions, or (even if existence is known) for finding them all. (This can in fact be formalized and proven using methods of mathematical logic.) Thus, many of the methods for solving Diophantine equations are rather ad hoc, and so our goals in this chapter are primarily to provide a survey of elementary techniques. One recurring theme, however, will be the structure of $\mathbb{Z}[\sqrt{D}]$ and closely related rings.

6.1 Linear Diophantine Equations
• The simplest equations are linear equations in two variables. (Solving a linear equation in one variable is, of course, trivial.)
◦ The general form of such an equation is $ax + by = c$, for some fixed integers $a$, $b$, and $c$: our goal is to determine when this equation has an integral solution $(x, y)$, and then to characterize all the solutions.
• Theorem: Let $a, b, c$ be integers with $ab \neq 0$, and set $d = \gcd(a, b)$. If $d \nmid c$, then the equation $ax + by = c$ has no solutions in integers $(x, y)$. If $d \mid c$, then the equation has infinitely many solutions, and if $(x_0, y_0)$ is one solution, then all the others are $(x_0 - bt/d,\ y_0 + at/d)$, for some integer $t$.
◦ Remark: If $a = b = 0$, then the equation is either trivially true (if $c = 0$) or trivially false (if $c \neq 0$), so we can assume that the gcd exists. If one of $a, b$ is zero, the equation is also trivial, so assuming that $ab \neq 0$ loses nothing.
◦ Proof: Observe that there is an integral solution to $ax + by = c$ if and only if there is a solution to the congruence $ax \equiv c \pmod{b}$, since then $y = \dfrac{c - ax}{b}$.
◦ Now from our discussion of linear congruences, we know that $ax \equiv c \pmod{b}$ has a solution if and only if $d = \gcd(a, b)$ divides $c$.
◦ In this case, if we set $a' = a/d$, $b' = b/d$, and $c' = c/d$, the set of all such $x$ is given by the residue class $x_0$ modulo $b'$, where $x_0 \equiv c' \cdot (a')^{-1} \pmod{b'}$.
◦ Now if $(x, y)$ is any solution, then by the above, we see that $x = x_0 - bt/d$ for some integer $t$, and then $y = y_0 + at/d$.
• Example: Find all solutions to $14x + 18y = 12$ in integers $(x, y)$.
◦ First, we compute $\gcd(14, 18) = 2$, and then divide through by the gcd to get $7x + 9y = 6$.
◦ This is equivalent to solving $7x \equiv 6 \pmod{9}$.
◦ We compute (via the Euclidean algorithm) that the inverse of 7 mod 9 is 4, so multiplying both sides by 4 yields $x \equiv 24 \equiv 6 \pmod{9}$.
◦ Hence one solution is $(x, y) = (6, -4)$. The set of all solutions is then $(x, y) = (6 - 9t,\ -4 + 7t)$ for $t \in \mathbb{Z}$.
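The theorem and example above can be checked computationally. The following is a minimal sketch, not part of the original notes: it finds one solution with the extended Euclidean algorithm (rather than by inverting $a$ modulo $b$ as in the proof) and returns the step $(-b/d,\ a/d)$ from the theorem. The function names `extended_gcd` and `solve_linear_diophantine` are illustrative choices.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) >= 0 and a*u + b*v == g."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    if old_r < 0:  # normalize so that the gcd is nonnegative
        old_r, old_u, old_v = -old_r, -old_u, -old_v
    return old_r, old_u, old_v


def solve_linear_diophantine(a, b, c):
    """Solve a*x + b*y = c over the integers, assuming a*b != 0.

    Returns None if there is no solution; otherwise returns
    (x0, y0, dx, dy), meaning that every solution has the form
    (x0 + dx*t, y0 + dy*t) for an integer t.
    """
    if a == 0 or b == 0:
        raise ValueError("the theorem assumes ab != 0")
    d, u, v = extended_gcd(a, b)
    if c % d != 0:
        return None  # d does not divide c, so no integer solutions exist
    k = c // d
    # (u*k, v*k) is one solution; the step is (-b/d, a/d) as in the theorem.
    return u * k, v * k, -b // d, a // d
```

For $14x + 18y = 12$ this particular sketch starts from a different base solution than the worked example, but the family it describes is the same: the solution $(6, -4)$ appears for a suitable value of $t$.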
• We could also solve equations of this form using changes of variable:
• Example: Find all solutions to the equation $4x + 13y = 5$.
◦ By the division algorithm, we have $13 = 3 \cdot 4 + 1$, so we can write the equation in the form $4x + (3 \cdot 4 + 1)y = 5$, and rearrange this into the form $4(x + 3y) + 1y = 5$.
◦ If we substitute $u = x + 3y$, this new equation becomes $4u + y = 5$, which we can easily solve to get $y = 5 - 4u$.
◦ Substituting back yields $x = u - 3y = u - 3(5 - 4u) = -15 + 13u$.
◦ Thus, we obtain the general solution $(x, y) = (-15 + 13u,\ 5 - 4u)$, where $u$ is an arbitrary integer.
• This latter method, using changes of variable, is the proper way to solve systems of linear Diophantine equations.
◦ The study of the solutions to a system of linear equations (with coefficients in a field, often $\mathbb{Q}$, $\mathbb{R}$, or $\mathbb{C}$) is a basic problem of linear algebra.
◦ The standard solution technique is to convert the system into matrix form, and then perform row and column operations on the matrix until it is in a sufficiently simple form that the solution to the original system is obvious.
◦ The general procedure for solving a system of linear equations over $\mathbb{Z}$ is essentially the same, except for the added complication that all of the row and column operations need to be done over $\mathbb{Z}$.
◦ As with a system of equations over a field, the end result will be either that the system has no solution, a unique solution, or an infinite family of solutions with some number of free parameters.
◦ We will not go into the technical details, since the procedure falls more properly into a course in linear algebra or abstract algebra. Instead, we will just give an example.
• Example: Find all solutions to $3x + 5y + 7z = 11$ in integers $(x, y, z)$.
◦ Motivated by the division algorithm, we rewrite the equation as $3(x + y + 2z) + 2y + z = 11$, and then substitute $w = x + y + 2z$.
◦ The new equation is $3w + 2y + z = 11$, which we can easily solve, obtaining $z = 11 - 2y - 3w$.
◦ Solving for $x$ yields $x = w - y - 2z = -22 + 7w + 3y$, so we obtain the general solution $(x, y, z) = (-22 + 7w + 3y,\ y,\ 11 - 2y - 3w)$, where $w, y$ are arbitrary integers.
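As a sanity check on the three-variable example, the short sketch below (an illustration only; the helper names are hypothetical) evaluates the general solution for given parameters and recovers the parameters $(w, y)$ from any solution using the substitution $w = x + y + 2z$.

```python
def parametrized_solution(w, y):
    """General solution of 3x + 5y + 7z = 11 from the example, for integers w, y."""
    return (-22 + 7 * w + 3 * y, y, 11 - 2 * y - 3 * w)


def recover_parameters(x, y, z):
    """Invert the change of variable: w = x + y + 2z, with y unchanged."""
    return (x + y + 2 * z, y)


def is_solution(x, y, z):
    """Check whether (x, y, z) satisfies 3x + 5y + 7z = 11."""
    return 3 * x + 5 * y + 7 * z == 11
```

Because the change of variable is invertible over the integers, every integer solution arises from exactly one pair $(w, y)$.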
6.1.1 The Frobenius Coin Problem
• Above, we characterized when there exists a solution to the equation $ax + by = c$ in integers $(x, y)$. In various settings, we are interested in knowing for which values of $c$ this equation has a solution in nonnegative integers $(x, y)$.
◦ If $a$ and $b$ are not relatively prime, clearly $c$ must be divisible by their gcd, and (by dividing through by the gcd) we can reduce to the case where $a$ and $b$ are relatively prime.
• One version of this problem uses postage stamps, which often cost irregular amounts: if, for example, there are postage stamps worth 5 cents and stamps worth 13 cents, is it possible to use them to put exactly 79 cents' worth of postage on an envelope?
◦
The most obvious method is simply to make a list of totals that are attainable: 0, 5, 10, 13, 15, 18, 20,
23, 25, 26, 28, 30, 31, 33, 35, 36, 38, 39, 40, 41, 43, 44, 45, 46, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, ....
◦
Based on our list, it seems that every value above 47 is attainable. (Indeed, it is easy to see that since
48, 49, 50, 51, and 52 are all attainable, we can obtain any larger number by adding additional 5-cent
stamps.)
◦
Thus, for example, we get exactly 79 cents of postage by using three 13-cent stamps and eight 5-cent
stamps.
•
Another version occurs in sports: In American football, a team can score 3 points for a field goal, or 7 points
for a touchdown. What possible scores can a team obtain? (Ignore safeties, missed extra points, and so forth.)
◦
Like above, we can simply list the totals that are attainable: 0, 3, 6, 7, 9, 10, 12, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, ....
◦
Based on our list, only a few values are unattainable: 1, 2, 4, 5, 8, and 11. Indeed, it is easy to see that
since 12, 13, and 14 are attainable, any larger number is also attainable by adding more field goals.
•
The problem of describing the largest integer that cannot be written as a nonnegative linear combination of
two integers (sometimes called the Frobenius coin problem) was first solved by Sylvester:
• Theorem: If $a$ and $b$ are relatively prime positive integers, then there are exactly $\frac{1}{2}(a - 1)(b - 1)$ nonnegative integers that cannot be written in the form $ax + by$ with $x, y \geq 0$, and the largest such integer is $ab - a - b$.
◦ Proof: For brevity, we say an integer is representable if it can be written in the form $ax + by$ with $x, y \geq 0$.
◦ Without loss of generality, assume $a < b$. Arrange the nonnegative integers in an array in the following manner:

    0        1          2          ···   a − 1
    a        a + 1      a + 2      ···   2a − 1
    2a       2a + 1     2a + 2     ···   3a − 1
    ⋮        ⋮          ⋮                ⋮
    ab − a   ab − a + 1 ab − a + 2 ···   ab − 1

◦ Now we use the array to mark all of the representable integers. We first circle all of the multiples of $b$: then an integer is representable precisely if it appears in the same column as some multiple of $b$, lower down.
◦ For illustration, here is the array with $a = 3$ and $b = 5$, in which the multiples of $b$ (namely 0, 5, and 10) are the circled entries:

    0    1    2
    3    4    5
    6    7    8
    9    10   11
    12   13   14

  Here the entries left unmarked are 1, 4, 7 (above 10) and 2 (above 5).
◦ Since $a$ and $b$ are relatively prime, the integers $0, b, 2b, \dots, (a - 1)b$ all lie in different columns. Thus, the largest element that is left unmarked is the element one row above $(a - 1)b$, which is $ab - a - b$, so this is the largest integer not expressible as $ax + by$ with $x, y \geq 0$.
◦ For the other part, we simply count the number of unmarked integers in the array: the number of integers lying above $kb$ is $\lfloor kb/a \rfloor$, so there are a total of $\displaystyle\sum_{k=0}^{a-1} \left\lfloor \frac{kb}{a} \right\rfloor$ unmarked integers in the array.
∗ We can interpret this as the number of lattice points lying under the line $y = \dfrac{b}{a}x$, with $1 \leq x \leq a - 1$, which is (equivalently) the number of lattice points lying below the diagonal of the rectangle with vertices $(0, 0)$, $(a, 0)$, $(a, b)$, and $(0, b)$ and not on any of the sides.
∗ By symmetry, since there are no lattice points on the interior of the diagonal, this is exactly half of the lattice points lying strictly inside the $a \times b$ rectangle, and there are $(a - 1)(b - 1)$ such lattice points.
◦ Remark: An alternate method for obtaining this count is to prove that if $0 \leq c \leq ab - a - b$, then $c$ is representable if and only if $ab - a - b - c$ is not representable.
• For $a = 5$ and $b = 13$, we see that the largest non-representable integer is $5 \cdot 13 - 13 - 5 = 47$, which is indeed what we found above.
• We could of course generalize this problem, to ask: for given integers $a_1, a_2, \dots, a_k$, what is the largest integer $n$ that cannot be written as a nonnegative integer linear combination of the $a_i$?
◦ It turns out that there is no general formula when $k > 2$. In fact, for large $k$, there is not even a known algorithm for finding the answer that is appreciably faster than merely attempting to list the possibilities!
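Sylvester's theorem for two coin values is easy to confirm by direct enumeration. The sketch below (an illustration, with the hypothetical name `non_representable`) lists the nonnegative integers up to $ab$ that are not of the form $ax + by$ with $x, y \geq 0$. By the theorem nothing beyond $ab - a - b$ is missed, so searching up to $ab$ is enough.

```python
from math import gcd


def non_representable(a, b):
    """List the nonnegative integers not of the form a*x + b*y with x, y >= 0.

    Assumes a and b are relatively prime positive integers. The search stops
    at a*b, which is safely beyond the bound ab - a - b from the theorem.
    """
    if a <= 0 or b <= 0 or gcd(a, b) != 1:
        raise ValueError("a and b must be relatively prime positive integers")
    bound = a * b
    representable = [False] * (bound + 1)
    representable[0] = True
    for n in range(1, bound + 1):
        # n is representable if removing one coin of either value leaves
        # a representable total.
        if (n >= a and representable[n - a]) or (n >= b and representable[n - b]):
            representable[n] = True
    return [n for n in range(bound + 1) if not representable[n]]
```

For the football example, `non_representable(3, 7)` reproduces the list 1, 2, 4, 5, 8, 11 found above, and for the stamps the largest missing value is 47.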
6.2 The Equation $x^2 + y^2 = z^2$: Pythagorean Triples
• Now that we have discussed solving linear equations in integers, we turn our attention to the simplest quadratic Diophantine equation: characterizing integer triples $(x, y, z)$ such that $x^2 + y^2 = z^2$.
◦ Such triples are naturally called Pythagorean triples because (by the Pythagorean theorem) they form the sides of a right triangle: a familiar example is the nearly-ubiquitous $(3, 4, 5)$ triangle.
◦ Since the defining equation is homogeneous (i.e., all of the terms have the same degree), if we have one Pythagorean triple, we can create more by scaling $x$, $y$, and $z$: thus we obtain $(6, 8, 10)$, $(9, 12, 15)$, and so forth from $(3, 4, 5)$.
◦ To exclude these essentially repetitious cases, we say a Pythagorean triple $(x, y, z)$ is primitive if $\gcd(x, y, z) = 1$, and would like to characterize the primitive triples.
• First we note that if $(x, y, z)$ is a primitive Pythagorean triple, $x$ and $y$ cannot both be even, since then $z$ would also be even. Also, $x$ and $y$ cannot both be odd, since then $x^2 + y^2 \equiv 2 \pmod{4}$, but 2 is not a square modulo 4. So we conclude that $z$ must be odd, and that exactly one of $x$ and $y$ is also odd.
• Theorem: Every primitive Pythagorean triple $(x, y, z)$ with $x$ even is of the form $(x, y, z) = (2st,\ s^2 - t^2,\ s^2 + t^2)$, for some relatively prime integers $s > t$ of opposite parity, and (conversely) any such triple is Pythagorean and primitive.
◦ For the reverse direction, it is easy to see that $(2st)^2 + (s^2 - t^2)^2 = (s^2 + t^2)^2$ simply by multiplying out. Furthermore, it is easy to check that if $s$ and $t$ are relatively prime and have opposite parity, then $\gcd(s^2 - t^2, s^2 + t^2) = 1$, so this triple is primitive.
◦ We will give three proofs of the nontrivial direction: one using the arithmetic of $\mathbb{Z}$, one using the arithmetic of $\mathbb{Z}[i]$, and one using geometry.
◦ The central idea in the first proof is to rearrange the equation and use the arithmetic of $\mathbb{Z}$. The central idea in the second proof is to exploit the arithmetic of $\mathbb{Z}[i]$, while the central idea in the third proof is to use the geometry of the dehomogenized curve $x^2 + y^2 = 1$ to study the rational solutions.
◦ Proof 1: Since $y$ and $z$ are both odd and $x$ is even, we can rewrite the equation as $\dfrac{z - y}{2} \cdot \dfrac{z + y}{2} = \left(\dfrac{x}{2}\right)^2$.
∗ Now we claim that $\dfrac{z - y}{2}$ and $\dfrac{z + y}{2}$ are relatively prime: their gcd divides their sum $z$ and their difference $y$, and since $y$ and $z$ are relatively prime, the gcd must be 1.
∗ Since $\dfrac{z - y}{2}$ and $\dfrac{z + y}{2}$ share no prime divisors and their product is a square, each of them must individually be a square, by the uniqueness of prime factorization.
∗ Hence there exist integers $s$ and $t$ such that $\dfrac{z - y}{2} = t^2$ and $\dfrac{z + y}{2} = s^2$.
∗ Then $z = s^2 + t^2$ and $y = s^2 - t^2$, and then we also obtain $x = 2st$, as claimed. Furthermore, $s$ and $t$ are necessarily relatively prime and have opposite parity, since $(x, y, z)$ is primitive.
◦ Proof 2: In $\mathbb{Z}[i]$, we factor the equation as $(x + iy)(x - iy) = z^2$.
∗ Now we claim that $x + iy$ and $x - iy$ are relatively prime: any gcd must divide $2x$ and $2y$, hence divide 2. However, $x + iy$ is not divisible by the prime $1 + i$, since $x$ and $y$ are of opposite parity.
∗ Hence, since $x + iy$ and $x - iy$ are relatively prime and have product equal to a square, by the uniqueness of prime factorization in $\mathbb{Z}[i]$, there exists some $s + it \in \mathbb{Z}[i]$ and some unit $u \in \{1, i, -1, -i\}$ such that $x + iy = u(s + it)^2$.
∗ Multiplying out yields $x + iy = u\left[(s^2 - t^2) + (2st)i\right]$. Since $x$ is even and $y$ is odd, we must have $u = \pm i$: then writing out the various possibilities yields the given parametrization.
◦ Proof 3: Dividing by $z^2$ yields the equivalent equation $\left(\dfrac{x}{z}\right)^2 + \left(\dfrac{y}{z}\right)^2 = 1$, so it is sufficient to describe all points $(a, b)$ on the unit circle $x^2 + y^2 = 1$ whose coordinates are both rational numbers.
∗ To do this, consider all non-vertical lines passing through the point $(-1, 0)$. Such a line will intersect the circle $x^2 + y^2 = 1$ in exactly one other point. If the coordinates of this point are rational, then the line will have rational slope.
∗ Conversely, if the line has rational slope $\dfrac{t}{s}$, its equation is $y = \dfrac{t}{s}(x + 1)$, so we can simply compute the other intersection point to see that it is $(x, y) = \left(\dfrac{s^2 - t^2}{s^2 + t^2},\ \dfrac{2st}{s^2 + t^2}\right)$, which is rational.
∗ Thus, the rational points on the unit circle are those of the form $\left(\dfrac{s^2 - t^2}{s^2 + t^2},\ \dfrac{2st}{s^2 + t^2}\right)$ for some integers $s$ and $t$. Clearing the denominator yields the desired Pythagorean triples.
◦ Remark: The third proof is closely related to the Weierstrass substitution $u = \tan(\theta/2)$, which transforms an integral of any rational function of $\sin(\theta)$ and $\cos(\theta)$ into an integral of a rational function of $u$, which can then be evaluated using partial fraction decomposition. (With the notation above, $u$ is the slope $t/s$ of the line.)
• Using the characterization above, we can easily generate a list of all the primitive Pythagorean triples: ordered by the length of the hypotenuse, the first few are (3,4,5), (5,12,13), (8,15,17), (7,24,25), (20,21,29), (12,35,37), (9,40,41), (28,45,53), ....
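The parametrization in the theorem translates directly into a generator for primitive triples. The sketch below is an illustration (the function name `primitive_triples` and the hypotenuse bound are assumptions, not part of the notes); it returns triples in the form $(2st,\ s^2 - t^2,\ s^2 + t^2)$ with the even leg first, sorted by hypotenuse.

```python
from math import gcd


def primitive_triples(max_hypotenuse):
    """Primitive Pythagorean triples (2st, s^2 - t^2, s^2 + t^2) with z <= bound.

    Iterates over relatively prime pairs s > t >= 1 of opposite parity,
    as in the theorem, and sorts the results by hypotenuse.
    """
    triples = []
    s = 2
    while s * s + 1 <= max_hypotenuse:
        for t in range(1, s):
            z = s * s + t * t
            if z > max_hypotenuse:
                break
            if (s - t) % 2 == 1 and gcd(s, t) == 1:
                triples.append((2 * s * t, s * s - t * t, z))
        s += 1
    triples.sort(key=lambda triple: triple[2])
    return triples
```

Calling `primitive_triples(53)` yields eight triples whose hypotenuses are 5, 13, 17, 25, 29, 37, 41, 53, matching the list above (up to the order of the two legs).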
6.3 Continued Fractions
• Definition: A finite continued fraction is an expression of the form
$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots + \cfrac{1}{a_{k-1} + \cfrac{1}{a_k}}}}},$$
where the $a_i$ are positive real numbers. For brevity, we will denote this expression using the much more compact notation $[a_0, a_1, \cdots, a_k]$. If the $a_i$ are integers, we term it a simple continued fraction.
◦ Example: $[2, 3, 4] = 2 + \cfrac{1}{3 + \cfrac{1}{4}} = \dfrac{30}{13}$.
◦ We note a few very basic properties: $[a_0, a_1, \cdots, a_k] = a_0 + \dfrac{1}{[a_1, \cdots, a_k]} = \left[a_0, a_1, \cdots, a_{k-1} + \dfrac{1}{a_k}\right]$.
• Clearly, every simple continued fraction is a rational number. Conversely, every rational number can be written as a simple continued fraction: if $p/q$ is any positive rational number in lowest terms, then if we apply the Euclidean algorithm to write
$$\begin{aligned}
p &= q_1 q + r_1 \\
q &= q_2 r_1 + r_2 \\
r_1 &= q_3 r_2 + r_3 \\
&\ \ \vdots \\
r_{k-2} &= q_k r_{k-1} + 1 \\
r_{k-1} &= q_{k+1} \cdot 1
\end{aligned}$$
where we know the last nonzero remainder will be 1 since $p/q$ is in lowest terms, then it is an easy induction to verify that $\dfrac{p}{q} = [q_1, q_2, \cdots, q_k, q_{k+1}]$.
◦ Furthermore, by the uniqueness of the Euclidean algorithm, all of the quotients are unique, so the expression is unique, except for the fact that we can write $[q_1, q_2, \cdots, q_k] = [q_1, q_2, \cdots, q_k - 1, 1]$.
◦ If we exclude the case where the final term is equal to 1, then every positive rational number can be written uniquely as a continued fraction.
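• Example: To convert $\dfrac{17}{7}$ into a continued fraction, we first compute
$$\begin{aligned}
17 &= 2 \cdot 7 + 3 \\
7 &= 2 \cdot 3 + 1 \\
3 &= 3 \cdot 1
\end{aligned}$$
so that, by the above, $\dfrac{17}{7} = [2, 2, 3]$.
◦ Another way of doing this is just to work it out explicitly, by writing $\dfrac{17}{7} = 2 + \dfrac{3}{7} = 2 + \cfrac{1}{7/3} = 2 + \cfrac{1}{2 + \cfrac{1}{3}}$.
• Example: To convert $\dfrac{67}{19}$ into a continued fraction, we compute
$$\begin{aligned}
67 &= 3 \cdot 19 + 10 \\
19 &= 1 \cdot 10 + 9 \\
10 &= 1 \cdot 9 + 1 \\
9 &= 9 \cdot 1
\end{aligned}$$
so that $\dfrac{67}{19} = [3, 1, 1, 9]$.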
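The Euclidean-algorithm procedure used in these examples can be sketched in a few lines. This is an illustrative sketch rather than part of the notes; the names `continued_fraction` and `evaluate` are hypothetical, and exact rational arithmetic comes from Python's standard `fractions` module.

```python
from fractions import Fraction


def continued_fraction(p, q):
    """Terms of the simple continued fraction of p/q, for positive integers p, q.

    Each step of the Euclidean algorithm contributes one quotient.
    """
    if p <= 0 or q <= 0:
        raise ValueError("p and q must be positive integers")
    terms = []
    while q != 0:
        quotient, remainder = divmod(p, q)
        terms.append(quotient)
        p, q = q, remainder
    return terms


def evaluate(terms):
    """Evaluate [a_0, a_1, ..., a_k] exactly, working from the innermost term out."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value
```

The two functions are inverse to each other on rationals: evaluating the terms returned for $p/q$ gives back $p/q$.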
6.3.1 Convergents to Continued Fractions
• Definition: If $C = [a_0, a_1, \cdots, a_k]$ is given, then the continued fraction $C_n = [a_0, a_1, \cdots, a_n]$ for $n \leq k$ is called the $n$th convergent to $C$.
◦ Example: For $\dfrac{117}{101} = [1, 6, 3, 5]$, the successive convergents are $[1] = 1$, $[1, 6] = \dfrac{7}{6}$, $[1, 6, 3] = \dfrac{22}{19}$, and $[1, 6, 3, 5] = \dfrac{117}{101}$.
◦ Observe that $\dfrac{117}{101} \approx 1.1584$, while $\dfrac{7}{6} \approx 1.1667$ and $\dfrac{22}{19} \approx 1.1579$.
◦ Notice that the convergents are fairly close to the actual value of the continued fraction, and their accuracy improves as we take higher convergents. We will return to this idea in a moment.
• There is a simple recursive relation that allows us to compute the convergents $C_i$.
• Proposition: Let $C = [a_0, a_1, \cdots, a_k]$, and define $p_{-1} = 1$, $q_{-1} = 0$, $p_0 = a_0$, $q_0 = 1$, and for $1 \leq n \leq k$, set
$$p_n = a_n p_{n-1} + p_{n-2}, \qquad q_n = a_n q_{n-1} + q_{n-2}.$$
Then $C_n = p_n / q_n$.
◦ Proof: We use induction. The cases $n = 0$ and $n = 1$ are trivial to verify, since $[a_0] = a_0/1$ and $[a_0, a_1] = a_0 + 1/a_1 = (a_0 a_1 + 1)/a_1$, as claimed.
◦ For the inductive step, suppose we know that the result holds for $n \leq m$. By hypothesis, for any $x$, it is the case that $[a_0, a_1, \cdots, a_{m-1}, x] = \dfrac{p_{m-1} x + p_{m-2}}{q_{m-1} x + q_{m-2}}$.
◦ Now observe that $[a_0, a_1, \cdots, a_{m-1}, a_m, a_{m+1}] = \left[a_0, a_1, \cdots, a_{m-1}, a_m + \dfrac{1}{a_{m+1}}\right]$: now apply the above result with $x = a_m + \dfrac{1}{a_{m+1}}$ to obtain
$$\begin{aligned}
\left[a_0, a_1, \cdots, a_{m-1}, a_m + \frac{1}{a_{m+1}}\right]
&= \frac{\left(a_m + \frac{1}{a_{m+1}}\right) p_{m-1} + p_{m-2}}{\left(a_m + \frac{1}{a_{m+1}}\right) q_{m-1} + q_{m-2}} \\
&= \frac{(a_m p_{m-1} + p_{m-2}) + p_{m-1}/a_{m+1}}{(a_m q_{m-1} + q_{m-2}) + q_{m-1}/a_{m+1}} \\
&= \frac{p_m + p_{m-1}/a_{m+1}}{q_m + q_{m-1}/a_{m+1}} \\
&= \frac{a_{m+1} p_m + p_{m-1}}{a_{m+1} q_m + q_{m-1}} = \frac{p_{m+1}}{q_{m+1}},
\end{aligned}$$
which is the desired result.
• Using the expressions above, we can deduce a few simple corollaries about the $p_i$ and $q_i$:
• Corollary: Let $C = [a_0, a_1, \cdots, a_k]$, with the $p_i$ and $q_i$ as above. Then $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}$ and $p_n q_{n-2} - p_{n-2} q_n = (-1)^{n-2} a_n$.
◦ Proof: For the first statement, by the recursion we can write
$$\begin{aligned}
p_n q_{n-1} - p_{n-1} q_n &= (a_n p_{n-1} + p_{n-2}) q_{n-1} - p_{n-1}(a_n q_{n-1} + q_{n-2}) \\
&= -(p_{n-1} q_{n-2} - p_{n-2} q_{n-1}),
\end{aligned}$$
so since $p_1 q_0 - p_0 q_1 = 1$, by a trivial induction we see that $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}$.
◦ The second statement follows in the same way. (We skip the algebra.)
• Using the result above, we can show that the convergents actually converge to the value of the original continued fraction $C$:
• Corollary: Let $C = [a_0, a_1, \cdots, a_k]$, with the $p_i$ and $q_i$ as above. For $C_i = p_i/q_i$, we have $C_n - C_{n-1} = \dfrac{(-1)^{n-1}}{q_{n-1} q_n}$ and $C_n - C_{n-2} = \dfrac{(-1)^{n-2} a_n}{q_{n-2} q_n}$. Furthermore, $C_1 > C_3 > C_5 > \cdots > C_6 > C_4 > C_2$, and $|C - C_n| \leq \dfrac{1}{q_n q_{n+1}} < \dfrac{1}{q_n^2}$.
◦ The first two statements follow immediately by dividing the relations $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}$ and $p_n q_{n-2} - p_{n-2} q_n = (-1)^{n-2} a_n$ by $q_n q_{n-1}$ and $q_n q_{n-2}$ respectively.
◦ From the statement $C_n - C_{n-2} = \dfrac{(-1)^{n-2} a_n}{q_{n-2} q_n}$, we see that $C_n < C_{n-2}$ if $n$ is odd, and $C_n > C_{n-2}$ if $n$ is even.
◦ Hence, by a trivial induction, we see $C_1 > C_3 > C_5 > \cdots$ and $\cdots > C_6 > C_4 > C_2$.
◦ Furthermore, since $C_{2n+1} > C_{2n}$ for every $n$, we can combine the two chains of inequalities to obtain the third statement.
◦ For the last statement, we simply observe that the inequalities above imply that $C$ is between $C_n$ and $C_{n+1}$ for every $n$, hence the triangle inequality implies $|C - C_n| \leq |C_{n+1} - C_n| = \dfrac{1}{q_n q_{n+1}} < \dfrac{1}{q_n^2}$.
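The recursion for the convergents is straightforward to run. The sketch below is an illustration that assumes the indexing convention $p_{-1} = 1$, $q_{-1} = 0$, $p_0 = a_0$, $q_0 = 1$ (so that the first convergent is $C_0 = a_0$); the function name `convergents` is hypothetical.

```python
def convergents(terms):
    """Return the list of (p_n, q_n) for the continued fraction [a_0, ..., a_k].

    Uses p_n = a_n p_{n-1} + p_{n-2} and q_n = a_n q_{n-1} + q_{n-2},
    seeded with p_{-1} = 1, q_{-1} = 0, p_0 = a_0, q_0 = 1 (assumed convention).
    """
    p_prev, q_prev = 1, 0   # p_{-1}, q_{-1}
    p, q = terms[0], 1      # p_0, q_0
    result = [(p, q)]
    for a in terms[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result
```

On $[1, 6, 3, 5]$ this reproduces the convergents $1, 7/6, 22/19, 117/101$ from the example, and consecutive pairs satisfy $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}$.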
6.3.2 Infinite Continued Fractions
• We have discussed finite continued fraction expansions (of rational numbers) and shown that their convergents obey some nice relations, and we can compute the simple continued fraction expansion of any rational number using the Euclidean algorithm.
◦ We would now like to extend our discussion to include irrational numbers: what, for example, does it mean to ask for the continued fraction expansion of $\sqrt{2}$? Or of $\pi$, or $\ln(2)$?
◦ None of these is a rational number, so any such expansion cannot be finite.
◦ To handle this, we simply extend our definition of continued fraction to an infinite continued fraction by taking a limit.
• Definition: Given a sequence $a_0, a_1, \dots$ of positive integers, we define the infinite continued fraction $[a_0, a_1, \cdots]$ to be the limit $\alpha = \displaystyle\lim_{n \to \infty} [a_0, a_1, \cdots, a_n]$ of its finite continued fraction convergents.
◦ It is worthwhile explaining why this limit exists. From our work above, if $C_n = [a_0, a_1, \cdots, a_n]$, then $C_1 > C_3 > C_5 > \cdots > C_6 > C_4 > C_2$.
◦ So the sequence $C_1, C_3, C_5, \dots$ is monotone decreasing and bounded below (by $C_2$), hence it has a limit.
◦ Similarly, the sequence $C_2, C_4, C_6, \dots$ is monotone increasing and bounded above (by $C_1$), hence it also has a limit.
◦ These two limits must be equal because $|C_n - C_{n+1}| < \dfrac{1}{q_n^2}$, which tends to zero as $n \to \infty$.
◦ Alternatively, we could appeal to the fact that the intervals $[C_{2n}, C_{2n-1}]$ form a set of nested closed intervals of lengths tending to zero, so by standard properties of $\mathbb{R}$, their intersection is a single point $C$ equal to the limit of the sequence $C_i$.
• Proposition: If $\alpha = [a_0, a_1, \cdots]$ is an infinite continued fraction and $C_n = [a_0, a_1, \cdots, a_n]$, then $|\alpha - C_n| \leq \dfrac{1}{q_n q_{n+1}} < \dfrac{1}{q_n^2}$.
◦ The proof follows identically from the finite case, since $\alpha$ lies between $C_n$ and $C_{n+1}$.
• Since rational numbers have a finite continued fraction expansion (unique up to replacing the last term in an essentially trivial way), we would expect that any infinite continued fraction must be irrational. We would also expect that distinct infinite continued fractions have different values.
• Proposition: If $\alpha = [a_0, a_1, a_2, \cdots]$ is an infinite continued fraction, then $\alpha$ is irrational. Furthermore, any two different infinite continued fractions converge to different values.
◦ Proof: For the first statement, suppose $\alpha = p/q$ were rational. By the proposition above, we know that $0 < \left|\dfrac{p}{q} - \dfrac{p_n}{q_n}\right| < \dfrac{1}{q_n^2}$, meaning that $0 < |p q_n - p_n q| < \dfrac{q}{q_n}$.
◦ However, $\dfrac{q}{q_n}$ goes to zero as $n \to \infty$, since $q$ is fixed but $q_n$ is a strictly increasing sequence. This is impossible, since if $q_n > q$ the expression $|p q_n - p_n q|$ would be an integer between 0 and 1.
◦ For the second statement, first observe that $C_0 < \alpha < C_1$, meaning that $a_0 < \alpha < a_0 + \dfrac{1}{a_1}$, so we see that $\lfloor \alpha \rfloor = a_0$.
◦ Next, observe that $\alpha = \displaystyle\lim_{n \to \infty} [a_0, a_1, \cdots, a_n] = \lim_{n \to \infty} \left(a_0 + \frac{1}{[a_1, \cdots, a_n]}\right) = a_0 + \frac{1}{[a_1, a_2, \cdots]}$.
◦ Now suppose $\beta = [b_0, b_1, \cdots]$ and $\beta = \alpha$. By taking floors, we see that $b_0 = a_0$.
◦ Then $[b_1, b_2, \cdots] = \dfrac{1}{\beta - b_0} = \dfrac{1}{\alpha - a_0} = [a_1, a_2, \cdots]$. Taking floors again shows $b_1 = a_1$.
◦ Repeating the argument yields $b_i = a_i$ for every $i$, so $\alpha$ and $\beta$ are identical.
• So far, we have discussed the ideas behind infinite continued fractions, but we have not actually computed any!
◦ It is not hard, from the above, to work out the procedure for converting an irrational number $\alpha$ into an infinite continued fraction $[a_0, a_1, a_2, \cdots]$.
◦ First, we must have $a_0 = \lfloor \alpha \rfloor$, as we observed above.
◦ Then, as we also observed, $[a_1, a_2, \cdots] = \dfrac{1}{\alpha - a_0}$, so if we define $\alpha_1 = \dfrac{1}{\alpha - a_0}$ (which is greater than 1 because $0 < \alpha - a_0 < 1$ by irrationality of $\alpha$ and the definition of the floor function), we must have $a_1 = \lfloor \alpha_1 \rfloor$.
◦ Now we repeat: we set $\alpha_2 = \dfrac{1}{\alpha_1 - a_1}$ and take $a_2 = \lfloor \alpha_2 \rfloor$.
◦ In general, we obtain the terms recursively, via the relations $a_0 = \lfloor \alpha \rfloor$, $\alpha_i = \dfrac{1}{\alpha_{i-1} - a_{i-1}}$, and $a_i = \lfloor \alpha_i \rfloor$. As noted above, each of the $a_i$ with $i \geq 1$ will be a positive integer, because $\alpha_i$ will always be greater than 1.
◦ We should verify that the resulting continued fraction $[a_0, a_1, a_2, \cdots]$ actually converges to $\alpha$. To do this, we observe (essentially by the definition) that $\alpha = [a_0, a_1, \cdots, a_n, \alpha_{n+1}]$, and then compute
$$\begin{aligned}
\left|\alpha - [a_0, a_1, \cdots, a_n]\right| &= \left|\frac{p_n \alpha_{n+1} + p_{n-1}}{q_n \alpha_{n+1} + q_{n-1}} - \frac{p_n}{q_n}\right| \\
&= \frac{1}{q_n (q_n \alpha_{n+1} + q_{n-1})} \\
&< \frac{1}{q_n q_{n-1}}
\end{aligned}$$
because $\alpha_{n+1}$ is positive. Since this tends to zero as $n \to \infty$, we see that $\alpha$ is indeed equal to $\displaystyle\lim_{n \to \infty} [a_0, a_1, \cdots, a_n]$.
• Example: Find the continued fraction expansion of $\sqrt{2}$.
◦ With $\alpha = \sqrt{2}$, we find, successively,

    n              0              1              2              ···
    α_n            $\sqrt{2}$     $\sqrt{2}+1$   $\sqrt{2}+1$   ···
    a_n            1              2              2              ···
    α_n − a_n      $\sqrt{2}-1$   $\sqrt{2}-1$   $\sqrt{2}-1$   ···

and since each term after this will repeat, we see that $\sqrt{2} = [1, 2, 2, 2, 2, \cdots]$.
• Example: Find the continued fraction expansion of $\sqrt{7}$.
◦ With $\alpha = \sqrt{7}$, we find, successively,

    n              0              1                        2                        3                        4              ···
    α_n            $\sqrt{7}$     $\frac{\sqrt{7}+2}{3}$   $\frac{\sqrt{7}+1}{2}$   $\frac{\sqrt{7}+1}{3}$   $\sqrt{7}+2$   ···
    a_n            2              1                        1                        1                        4              ···
    α_n − a_n      $\sqrt{7}-2$   $\frac{\sqrt{7}-1}{3}$   $\frac{\sqrt{7}-1}{2}$   $\frac{\sqrt{7}-2}{3}$   $\sqrt{7}-2$   ···

and since each term after this will repeat, we see that $\sqrt{7} = [2, 1, 1, 1, 4, 1, 1, 1, 4, \cdots]$.
• Example: Find the continued fraction expansion of $\pi$.
◦ With $\alpha = \pi$, we find, numerically, that the first ten terms are $[3, 7, 15, 1, 292, 1, 1, 1, 2, 1, \cdots]$. This is easy to do even with a hand calculator: simply subtract off the integer part, reciprocate, and repeat.
◦ There is no apparent pattern, and the sequence does not seem to repeat, in the nice way that the two previous examples did.
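For square roots, the "subtract the integer part, reciprocate, repeat" procedure can be carried out exactly in integer arithmetic, avoiding floating-point error. The sketch below uses a standard integer recurrence that is not stated in these notes: it tracks $\alpha_n = (m_n + \sqrt{D})/d_n$ with integers $m_n, d_n$. The function name `sqrt_continued_fraction` is hypothetical.

```python
from math import isqrt


def sqrt_continued_fraction(D, count):
    """First `count` terms of the continued fraction of sqrt(D), D not a square.

    Tracks alpha_n = (m_n + sqrt(D)) / d_n exactly, via
        m_{n+1} = d_n * a_n - m_n,
        d_{n+1} = (D - m_{n+1}^2) / d_n,
        a_{n+1} = floor((a_0 + m_{n+1}) / d_{n+1}),
    a standard recurrence assumed here rather than taken from the text.
    """
    a0 = isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    terms = [a0]
    m, d, a = 0, 1, a0
    while len(terms) < count:
        m = d * a - m
        d = (D - m * m) // d   # this division is always exact
        a = (a0 + m) // d
        terms.append(a)
    return terms
```

For $D = 7$ the successive pairs $(m_n, d_n)$ are $(2, 3), (1, 2), (1, 3), (2, 1)$, matching the values $\alpha_1, \dots, \alpha_4$ in the table above.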
6.3.3 Periodic Continued Fractions
• Two of the continued fractions above eventually repeat. We will give this situation a special name:
• Definition: An infinite continued fraction $[a_0, a_1, a_2, \cdots]$ is periodic if there is some positive integer $n$ such that $a_r = a_{n+r}$ for all sufficiently large $r$. We employ the notation $[a_0, a_1, a_2, \cdots, a_k, \overline{a_{k+1}, a_{k+2}, \cdots, a_{k+n}}]$ to indicate that the block of integers under the bar repeats indefinitely.
◦ This is the same notation used for repeating decimals. (This is reasonable, since it is essentially the same situation, too.)
• Example: Find the real number $\alpha = [\overline{1}]$.
◦ By the periodicity of the expansion, we know that $\alpha = 1 + \dfrac{1}{\alpha} = \dfrac{\alpha + 1}{\alpha}$.
◦ This yields a quadratic equation for $\alpha$, namely $\alpha^2 = \alpha + 1$, whose solutions are $\alpha = \dfrac{1 \pm \sqrt{5}}{2}$.
◦ Since $\alpha > 1$, we need the plus sign, so $\alpha = \dfrac{1 + \sqrt{5}}{2}$. (This is the famous golden ratio.)
◦ Remark: It is also worth computing the convergents to $\alpha$: they are $1, 2, \dfrac{3}{2}, \dfrac{5}{3}, \dfrac{8}{5}, \dfrac{13}{8}, \dfrac{21}{13}, \dfrac{34}{21}, \dfrac{55}{34}$, and so forth. Notice that these are simply ratios of consecutive Fibonacci numbers, which (once noticed) is easy to prove by induction. In fact, our results about convergence provide a proof that $\displaystyle\lim_{n \to \infty} \frac{F_{n+1}}{F_n} = \frac{1 + \sqrt{5}}{2}$.
• Example: Find the real number $\alpha = [\overline{2, 5}]$.
◦ By the periodicity of the expansion, we know that $\alpha = 2 + \cfrac{1}{5 + \cfrac{1}{\alpha}} = 2 + \dfrac{\alpha}{5\alpha + 1} = \dfrac{11\alpha + 2}{5\alpha + 1}$.
◦ This yields a quadratic equation for $\alpha$, namely $\alpha(5\alpha + 1) = 11\alpha + 2$, whose solutions are $\alpha = \dfrac{5 \pm \sqrt{35}}{5}$.
◦ Since $\alpha > 1$, we need the plus sign, so $\alpha = \dfrac{5 + \sqrt{35}}{5}$.
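The closed forms in these two examples can be compared against truncations of the periodic expansions. The sketch below is an illustration (the name `truncated_value` is hypothetical): it evaluates a finite truncation exactly with `fractions.Fraction` and leaves the comparison with the quadratic-formula value to floating point.

```python
from fractions import Fraction
from math import sqrt


def truncated_value(prefix, period, repeats):
    """Exact value of [prefix, period, period, ..., period] with `repeats` copies.

    This is a finite truncation of the periodic continued fraction
    [prefix, overline(period)], evaluated from the innermost term outward.
    """
    terms = list(prefix) + list(period) * repeats
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value


golden = truncated_value([], [1], 10)         # tenth convergent of [1, 1, 1, ...]
approx = truncated_value([], [2, 5], 20)      # truncation of the second example
closed_form = (5 + sqrt(35)) / 5              # value found from the quadratic
```

Here `golden` is the Fibonacci ratio $89/55$, the next term after $55/34$ in the list of convergents, and `float(approx)` agrees with `closed_form` to within floating-point precision.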
So far, all of the periodic continued fractions we have seen have been solutions of a quadratic polynomial in
Q[x].
This is not an accident:
• Theorem: If $\alpha$ is a periodic continued fraction, then $\alpha$ is an irrational root of a quadratic polynomial in $\mathbb{Q}[x]$. Conversely, if $\alpha > 0$ is irrational and a root of a quadratic polynomial in $\mathbb{Q}[x]$, then $\alpha$ has a periodic continued fraction expansion.
◦ Proof (forward): Let $\alpha = [a_0, a_1, a_2, \cdots, a_k, \overline{a_{k+1}, a_{k+2}, \cdots, a_{k+n}}]$, and $\gamma = [\overline{a_{k+1}, a_{k+2}, \cdots, a_{k+n}}]$.
◦ Then by the periodicity of the expansion, we have $\gamma = [a_{k+1}, \cdots, a_{k+n}, \gamma]$.
◦ Expanding this out yields $\gamma = \dfrac{p_{n-1}\gamma + p_{n-2}}{q_{n-1}\gamma + q_{n-2}}$ (where the $p_i$ and $q_i$ are computed from the block $a_{k+1}, \cdots, a_{k+n}$), which is a quadratic equation for $\gamma$.
◦ Since $\gamma$ is irrational (being an infinite continued fraction), we conclude that $\gamma = \dfrac{b + \sqrt{c}}{d}$ for some integers $b$, $c$, and $d$.
◦ Then $\alpha = [a_0, a_1, a_2, \cdots, a_k, \gamma]$ is also a rational function in $\gamma$ (and irrational), so clearing the denominator shows that $\alpha = \dfrac{e + \sqrt{f}}{g}$ for some integers $e$, $f$, and $g$, which is also a root of a quadratic polynomial.
◦ Remark: For the reverse direction, the idea is to show that for a given $\alpha = \dfrac{b + \sqrt{c}}{d}$, there are only a finite number of possible $\alpha_i$ that can appear when computing the terms in the continued fraction expansion, and so eventually a repetition must occur. There does not seem to be an especially brief argument doing this, so we will omit the details.
6.4 The Farey Sequences
• If $\alpha$ is an irrational number, we have seen that the continued fraction convergents to $\alpha$ provide very good rational approximations of $\alpha$.
◦ Specifically, if $p/q$ is a convergent to $\alpha$, then we showed earlier that $\left|\alpha - \dfrac{p}{q}\right| < \dfrac{1}{q^2}$.
• We would like to know how good of an approximation this gives, compared to choosing some other rational numbers.
•
One thing we might rst look at is the set of rational numbers of small denominator.
Since we want to
understand distances between nearby numbers, we should arrange the rationals in increasing order.
•
Denition: The Farey sequence of order
(in lowest terms) are
≤ n,
n is the set of rational numbers between 0 and 1 whose denominators
arranged in increasing order.
0 1 1 1 2 3 1
,
,
,
,
,
,
.
1 4 3 2 3 4 1
◦
Example: The Farey sequence of order 4 is
◦
Example: To obtain the Farey sequence of order 5, we simply insert the terms with denominator 5 in
the proper locations:
◦
0 1 1 1 2 1 3 2 3 4 1
,
,
,
,
,
,
,
,
,
,
.
1 5 4 3 5 2 5 3 4 5 1
There are several natural and immediate questions about this sequence: for example, how many terms
does it have?
Are consecutive terms related to each other?
In going from the
(n − 1)st
to the
nth
sequence, how many terms are added and where do they go?
◦ Several of these can be immediately answered: of all rational numbers of the form $\frac{k}{n}$, the ones in lowest terms will be those with $\gcd(k, n) = 1$. Thus, the number of such terms added in going from the $(n-1)$st Farey sequence to the $n$th is simply $\varphi(n)$.
◦ By a trivial induction, we see that the length of the Farey sequence of order $n$ is $1 + \sum_{d=1}^{n} \varphi(d)$. It is a nontrivial problem to estimate the rate of growth of this function, but it turns out to be approximately $\frac{3}{\pi^2} n^2$.
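The definition and the length formula are easy to try out by machine. The following is a minimal brute-force sketch, not an efficient implementation; the function names `farey`, `phi`, and `farey_length` are illustrative choices and do not come from the notes.

```python
from fractions import Fraction
from math import gcd


def farey(n):
    """Farey sequence of order n: all fractions in [0, 1] with denominator <= n, sorted."""
    # Fraction reduces to lowest terms, so the set removes duplicates such as 2/4 = 1/2.
    terms = {Fraction(k, d) for d in range(1, n + 1) for k in range(d + 1)}
    return sorted(terms)


def phi(n):
    """Euler's totient by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def farey_length(n):
    """Length predicted by the formula 1 + sum_{d=1}^{n} phi(d)."""
    return 1 + sum(phi(d) for d in range(1, n + 1))


if __name__ == "__main__":
    print([str(t) for t in farey(5)])
    for n in (4, 5, 20):
        # The two counts should agree for every n.
        print(n, len(farey(n)), farey_length(n))
```

Printing consecutive terms side by side is one way to carry out the numerical experimentation mentioned next.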
◦
A little numerical experimentation quickly leads to a discovery of the following result:
• Proposition: If $\frac{a}{b}$ and $\frac{c}{d}$ are consecutive terms in the Farey sequence of order $n$, then $bc - ad = 1$.
◦ Proof: Suppose $\frac{a}{b}$ and $\frac{c}{d}$ are consecutive terms in the Farey sequence.
∗ In the plane, draw the triangle whose vertices are $(0, 0)$, $(b, a)$, and $(d, c)$. By Pick's Theorem, the area of this lattice-point triangle is $\frac{1}{2}B + I - 1$, where $B$ is the number of lattice points on the boundary and $I$ is the number of points in the interior. We claim $B = 3$ and $I = 0$.
∗ To see this, suppose there were a lattice point $(x, y)$ in the interior, where (necessarily) $x \le \max(b, d)$. Then the slope of the line joining $(0, 0)$ to $(x, y)$ would lie strictly between $a/b$ and $c/d$: but then $y/x$ would be between $a/b$ and $c/d$ in the Farey sequence, which by hypothesis it is not.
∗ Now suppose there were a lattice point on the boundary not equal to one of the vertices. It cannot lie on the side joining $(0, 0)$ and $(b, a)$, since $a$ and $b$ are relatively prime. Similarly, it cannot lie on the side joining $(0, 0)$ and $(d, c)$. If it were on the side joining $(b, a)$ and $(d, c)$, then by the same argument given above, there would be a term between $a/b$ and $c/d$ in the Farey sequence.
∗ Thus, $B = 3$ and $I = 0$, so the triangle has area $\frac{1}{2}$. By basic geometry, the area of this triangle is $\frac{1}{2}|ad - bc|$, so since $bc > ad$ we conclude immediately that $bc - ad = 1$.
•
From this proposition, we obtain a number of very useful corollaries.
• Corollary: If $\frac{a}{b}$, $\frac{e}{f}$, and $\frac{c}{d}$ are three consecutive terms in the Farey sequence, then $\frac{e}{f} = \frac{a+c}{b+d}$.
◦ Notation: This last expression is sometimes called the mediant of $a/b$ and $c/d$. (It is also occasionally called the baseball average, since it is the expression, frequently computed in baseball, used when combining several hit percentages into a single statistic.)
◦ Proof: By the result just proven, we know that $be - af = 1$ and that $cf - de = 1$. This is a system of two linear equations in the two variables $e$ and $f$, so solving it yields $e = (a + c)/(bc - ad)$ and $f = (b + d)/(bc - ad)$: thus, $\frac{e}{f} = \frac{a+c}{b+d}$, as claimed.
• Corollary: If $\frac{a}{b}$ and $\frac{c}{d}$ are rational numbers between 0 and 1 with $bc - ad = 1$, then these two terms are consecutive entries in the Farey sequence of order $\max(b, d)$. The first term that will appear between them (in a later sequence) is $\frac{a+c}{b+d}$, and this first occurs in the Farey sequence of order $b + d$.
◦ Proof: Suppose $\frac{e}{f}$ is the term immediately following $\frac{a}{b}$ in the sequence of order $\max(b, d)$. Then $be - af = 1$, so subtracting from $bc - ad = 1$ yields $b(c - e) - a(d - f) = 0$, so $b(c - e) = a(d - f)$. Since $a$ and $b$ are relatively prime, we conclude that $b$ divides $d - f$. Since $f \le \max(b, d) < b + d$, the only possibility is that $f = d$, and then $e = c$.
∗ Alternatively, we could have observed that both $(e, f)$ and $(c, d)$ are solutions to the linear Diophantine equation $bx - ay = 1$, and used the structure of the solutions to deduce this result.
◦ For the second statement, we just showed that $a/b$ and $c/d$ are consecutive in the Farey sequence of order $\max(b, d)$. Now increase the order of the sequence in increments of 1: if $e/f$ is the first term to appear between $a/b$ and $c/d$, then as shown above, it would necessarily be the case that $e = (a + c)/(bc - ad) = a + c$ and $f = (b + d)/(bc - ad) = b + d$.
◦ One can check directly that $\frac{a+c}{b+d}$ appears in the Farey sequence of order $b + d$.
• Corollary: The rational numbers $\frac{a}{b}$ and $\frac{c}{d}$ are consecutive terms in the Farey sequence of order $n$ if and only if $bc - ad = 1$ and $b + d > n$.
◦ Proof: If the first condition fails, then there is a term between $a/b$ and $c/d$ whose denominator is at most $\max(b, d)$. If the second condition fails, then $\frac{a+c}{b+d}$ is a term between $a/b$ and $c/d$.
◦ If both conditions hold, then the corollary above immediately implies that there are no terms between $a/b$ and $c/d$ in the Farey sequence of order $n$.
• Corollary: If $a/b$ and $e/f$ are consecutive terms in the Farey sequence of order $n$, the term immediately following $e/f$ is $c/d$, where $c = \left\lfloor \frac{n+b}{f} \right\rfloor e - a$ and $d = \left\lfloor \frac{n+b}{f} \right\rfloor f - b$.
◦ Proof: By the mediant property, we know that $\frac{e}{f} = \frac{a+c}{b+d}$. Thus, there must exist some integer $k$ such that $a + c = ke$ and $b + d = kf$, so that $c = ke - a$ and $d = kf - b$. Since the closest term to $e/f$ will have $k$ as large as possible, and since $d \le n$, the largest possible value of $k$ is $\left\lfloor \frac{n+b}{f} \right\rfloor$.
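The corollaries above can be checked by brute force for small orders. This is only a sketch under the assumption that exhaustive checking over small `n` is acceptable; the helper names `farey`, `is_consecutive_pair`, and `check_order` are hypothetical.

```python
from fractions import Fraction


def farey(n):
    """Farey sequence of order n as a sorted list of Fractions."""
    terms = {Fraction(k, d) for d in range(1, n + 1) for k in range(d + 1)}
    return sorted(terms)


def is_consecutive_pair(x, y, n):
    """Criterion from the corollary: bc - ad = 1 and b + d > n, for x = a/b < y = c/d."""
    a, b, c, d = x.numerator, x.denominator, y.numerator, y.denominator
    return b * c - a * d == 1 and b + d > n


def check_order(n):
    """Return True if the criterion and the mediant property both hold in order n."""
    seq = farey(n)
    # The criterion should be true exactly for adjacent terms.
    for i, x in enumerate(seq):
        for j in range(i + 1, len(seq)):
            if is_consecutive_pair(x, seq[j], n) != (j == i + 1):
                return False
    # Each interior term should equal the mediant of its two neighbours.
    for left, mid, right in zip(seq, seq[1:], seq[2:]):
        mediant = Fraction(left.numerator + right.numerator,
                           left.denominator + right.denominator)
        if mid != mediant:
            return False
    return True


if __name__ == "__main__":
    print(all(check_order(n) for n in range(1, 13)))
```

Note that `Fraction` reduces the mediant automatically, which matches the statement $\frac{e}{f} = \frac{a+c}{b+d}$ as an equality of rational numbers.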
Using the above results, we can construct the portion of any Farey sequence around any desired rational
number, without needing to compute all of the terms in the sequence.
• Example: Find the first three terms after $11/202$ in the Farey sequence of order 500.
◦ By the above results, if $11/202$ and $c/d$ are consecutive terms, then $\frac{c}{d} - \frac{11}{202} = \frac{1}{202d}$ and $202c - 11d = 1$.
◦ Solving this Diophantine equation using the Euclidean algorithm produces the solutions $(c, d) = (3 + 11k, 55 + 202k)$ for $k \in \mathbb{Z}$.
◦ The larger the value of $k$ is, the smaller the value of $\frac{c}{d} - \frac{11}{202}$ will be. The largest possible value for $k$ (keeping $d \le 500$) is $k = 2$, so the first term is $\frac{25}{459}$.
◦ Now we can apply the two-term recursion to $11/202$ and $25/459$ to quickly find the next terms: they are $14/257$ and $17/312$.
◦ Thus, the three terms are $\frac{25}{459}, \frac{14}{257}, \frac{17}{312}$.
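The computation in this example can be reproduced with a few lines of code. This is a sketch assuming $a/b < 1$ and $b \le n$; the names `first_successor`, `next_term`, and `terms_after` are hypothetical, and the search for the first successor is a plain linear scan rather than the Euclidean algorithm used in the text.

```python
def first_successor(a, b, n):
    """Term c/d right after a/b in the order-n Farey sequence:
    the solution of b*c - a*d = 1 with the largest denominator d <= n."""
    for d in range(n, 0, -1):
        if (1 + a * d) % b == 0:
            return (1 + a * d) // b, d
    raise ValueError("no successor found")


def next_term(a, b, e, f, n):
    """Two-term recursion: given consecutive terms a/b < e/f, return the following term."""
    k = (n + b) // f
    return k * e - a, k * f - b


def terms_after(a, b, n, count):
    """The first `count` terms following a/b in the order-n Farey sequence."""
    out = [first_successor(a, b, n)]
    prev = (a, b)
    while len(out) < count:
        cur = out[-1]
        out.append(next_term(prev[0], prev[1], cur[0], cur[1], n))
        prev = cur
    return out


if __name__ == "__main__":
    print(terms_after(11, 202, 500, 3))  # [(25, 459), (14, 257), (17, 312)]
```

Each consecutive pair in the output satisfies $bc - ad = 1$ and $b + d > 500$, as the criterion for consecutive terms requires.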
• Similarly, we can fill in the portion of any desired Farey sequence between any two given rational numbers simply by taking mediants.
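A minimal recursive sketch of this mediant fill-in follows. It assumes the two endpoints $a/b < c/d$ already satisfy $bc - ad = 1$ (so that every mediant produced is automatically in lowest terms); the name `fill_between` is hypothetical.

```python
def fill_between(a, b, c, d, n):
    """All terms of the order-n Farey sequence from a/b to c/d inclusive,
    as (numerator, denominator) pairs, assuming b*c - a*d == 1."""
    if b + d > n:
        # The mediant's denominator exceeds n, so nothing lies between the endpoints.
        return [(a, b), (c, d)]
    m, k = a + c, b + d  # the mediant m/k
    left = fill_between(a, b, m, k, n)
    right = fill_between(m, k, c, d, n)
    return left + right[1:]  # drop the duplicated mediant


if __name__ == "__main__":
    # Starting from 0/1 and 1/1 recovers the whole Farey sequence of order 5.
    print(fill_between(0, 1, 1, 1, 5))
```

When the endpoints are not neighbours in any Farey sequence, one would first split the interval at a term for which both halves have determinant 1 and then apply the function to each half.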
• Example: Find all terms between $7/33$ and $14/65$ in the Farey sequence of order 100.
◦ First, we notice that these terms are not consecutive (in any Farey sequence), because $14 \cdot 33 - 7 \cdot 65 = 7$, not 1.
◦ We start by finding terms between them: the mediant of these two terms is $21/98 = 3/14$.
◦ Now $7/33$ and $3/14$ are consecutive in the Farey sequence of order 33, since $33 \cdot 3 - 7 \cdot 14 = 1$.
◦ Also, $3/14$ and $14/65$ are consecutive in the Farey sequence of order 65, since $14 \cdot 14 - 3 \cdot 65 = 1$.
◦ Now we just need to fill in the missing terms. By the results above, these are all given by computing mediants of the terms already found.
◦ We obtain the sequence $\frac{7}{33}, \frac{17}{80}, \frac{10}{47}, \frac{13}{61}, \frac{16}{75}, \frac{19}{89}, \frac{3}{14}, \frac{20}{93}, \frac{17}{79}, \frac{14}{65}$.
6.5 Rational Approximation
• Now that we have discussed the Farey sequences, we can use them to discuss the approximation of irrational numbers by rational numbers, and compare the results to the ones we obtained using continued fractions.
• Proposition: If $n$ is a positive integer and $\alpha$ is any real number, then there is a rational number $\frac{p}{q}$ such that $0 < q \le n$ and $\left| \alpha - \frac{p}{q} \right| \le \frac{1}{q(n+1)}$.
◦ Proof: By replacing $\alpha$ with $\alpha - \lfloor \alpha \rfloor$ as necessary, we can assume that $\alpha$ lies in $[0, 1]$.
◦ Now consider the Farey sequence of order $n$ and let $\frac{a}{b}$ and $\frac{c}{d}$ be two consecutive terms such that $\frac{a}{b} \le \alpha \le \frac{c}{d}$. By our earlier results, we know that $bc - ad = 1$ and $b + d \ge n + 1$.
◦ The number $\alpha$ either lies in the interval $\left[ \frac{a}{b}, \frac{a+c}{b+d} \right]$ or in $\left[ \frac{a+c}{b+d}, \frac{c}{d} \right]$.
∗ In the first case, $\left| \alpha - \frac{a}{b} \right| \le \left| \frac{a}{b} - \frac{a+c}{b+d} \right| = \frac{|ad - bc|}{b(b+d)} \le \frac{1}{b(n+1)}$.
∗ In the second case, $\left| \alpha - \frac{c}{d} \right| \le \left| \frac{c}{d} - \frac{a+c}{b+d} \right| = \frac{|ad - bc|}{d(b+d)} \le \frac{1}{d(n+1)}$.
◦ Thus, in either case, we obtain a rational number $\frac{p}{q}$ such that $\left| \alpha - \frac{p}{q} \right| \le \frac{1}{q(n+1)}$.
• Corollary: If $\alpha$ is any irrational real number, then there are infinitely many distinct rational numbers $\frac{p}{q}$ such that $\left| \alpha - \frac{p}{q} \right| < \frac{1}{q^2}$.
◦ Proof: Apply the previous result to the Farey sequence of order $n$, for each $n$: this yields a collection of rational numbers $\frac{p_n}{q_n}$ such that $\left| \alpha - \frac{p_n}{q_n} \right| < \frac{1}{q_n(n+1)} < \frac{1}{q_n^2}$, and with $q_n \le n$.
◦ Since $\alpha$ is irrational, none of these differences can be zero, and so there must be infinitely many different terms $\frac{p_n}{q_n}$, since the distances $\left| \alpha - \frac{p_n}{q_n} \right|$ become arbitrarily small, but remain nonzero.
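The proof of the proposition is constructive, and the construction can be sketched in code. The sketch below assumes $\alpha$ is supplied as an exact `Fraction` (for instance a long decimal truncation of an irrational number), so that all comparisons are exact; the names `farey_neighbours` and `approximate` are hypothetical. The neighbours are located by repeatedly taking mediants, starting from $0/1$ and $1/1$.

```python
from fractions import Fraction
from math import floor


def farey_neighbours(x, n):
    """Consecutive terms a/b <= x <= c/d of the order-n Farey sequence, for x in [0, 1]."""
    a, b, c, d = 0, 1, 1, 1
    while b + d <= n:
        m, k = a + c, b + d  # the mediant is the next term to appear between a/b and c/d
        if Fraction(m, k) <= x:
            a, b = m, k
        else:
            c, d = m, k
    return (a, b), (c, d)


def approximate(alpha, n):
    """Return (p, q) with 0 < q <= n and |alpha - p/q| <= 1/(q(n+1))."""
    shift = floor(alpha)
    x = alpha - shift  # reduce to [0, 1) as in the proof
    (a, b), (c, d) = farey_neighbours(x, n)
    # Pick the endpoint on the same side of the mediant as x.
    p, q = (a, b) if x <= Fraction(a + c, b + d) else (c, d)
    return p + shift * q, q


if __name__ == "__main__":
    alpha = Fraction(314159265, 100000000)  # a rational stand-in for pi
    for n in (5, 10, 100):
        p, q = approximate(alpha, n)
        print(n, p, q, abs(alpha - Fraction(p, q)) <= Fraction(1, q * (n + 1)))
```

The loop in `farey_neighbours` may take up to about $n$ steps, which is fine for illustration but not tuned for large $n$.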
• Notice that we have already obtained this same result, but earlier we obtained it using continued fraction expansions. One reasonable question, then, is: is there any relation between our two methods?
◦ It seems reasonable to guess that a sufficiently good rational approximation of an irrational number should be related to the continued fraction expansion somehow.
◦ In fact, the terms of the continued fraction expansion give the best rational approximation:
• Proposition: Suppose $\alpha$ is any irrational real number and $p/q$ is any rational number. If $p_n/q_n$ is the $n$th continued fraction convergent to $\alpha$, and $\left| \alpha - \frac{p}{q} \right| < \left| \alpha - \frac{p_n}{q_n} \right|$, then $q > q_n$. In fact, if $|q\alpha - p| < |q_n \alpha - p_n|$, then $q \ge q_{n+1}$.
◦ Observe that the first statement says that the best rational approximation to $\alpha$, among all terms in the Farey sequence of order $q_n$, is the convergent $p_n/q_n$.
◦ Proof: Consider the Farey sequence of order $q_n$: since $|p_{n-1} q_n - p_n q_{n-1}| = 1$, we see that $\frac{p_{n-1}}{q_{n-1}}$ and $\frac{p_n}{q_n}$ are consecutive in this sequence. Hence, there is no rational number with denominator less than $q_n$ that lies between them.
◦ For the second statement, suppose that $q < q_{n+1}$. By basic linear algebra, there exist integers $x$ and $y$ such that $p = x p_n + y p_{n+1}$ and $q = x q_n + y q_{n+1}$. (They are integers because the determinant is $\pm 1$.)
◦ Notice that since $q < q_{n+1}$, the second equation requires that $x \ne 0$, and that either $y = 0$ or one of $x, y$ is positive and the other is negative. Since $\alpha - \frac{p_n}{q_n}$ and $\alpha - \frac{p_{n+1}}{q_{n+1}}$ also have opposite signs, we conclude that $x \left( \alpha - \frac{p_n}{q_n} \right)$ and $y \left( \alpha - \frac{p_{n+1}}{q_{n+1}} \right)$ have the same sign (or the second is zero).
◦ Then we can write
$|q\alpha - p| = |(x q_n + y q_{n+1})\alpha - (x p_n + y p_{n+1})|$
$= |x(q_n \alpha - p_n) + y(q_{n+1} \alpha - p_{n+1})|$
$= |x| \cdot |q_n \alpha - p_n| + |y| \cdot |q_{n+1} \alpha - p_{n+1}|$
$\ge |q_n \alpha - p_n|$,
which establishes the contrapositive of the desired result.
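• Proposition: If $\alpha$ is any irrational real number, then there are infinitely many distinct rational numbers $\frac{p}{q}$ such that $\left| \alpha - \frac{p}{q} \right| < \frac{1}{2q^2}$. Conversely, any rational number $\frac{p}{q}$ satisfying this bound is a continued fraction convergent to $\alpha$.
◦ Remark: The constant 2 in the first part is not sharp. In fact, it is a result of Hurwitz that the 2 can be replaced with $\sqrt{5}$, but with no larger constant.
◦ Proof: For the first statement, let $\frac{p_n}{q_n}$ be the $n$th continued fraction convergent to $\alpha$.
∗ We claim that at least one of $\frac{p_n}{q_n}$ and $\frac{p_{n+1}}{q_{n+1}}$ satisfies the desired inequality, so suppose that neither does.
∗ Then since $\alpha$ lies between $\frac{p_n}{q_n}$ and $\frac{p_{n+1}}{q_{n+1}}$, we have
$\left| \frac{p_n}{q_n} - \frac{p_{n+1}}{q_{n+1}} \right|^2 = \left( \left| \frac{p_n}{q_n} - \alpha \right| + \left| \frac{p_{n+1}}{q_{n+1}} - \alpha \right| \right)^2 > 4 \left| \frac{p_n}{q_n} - \alpha \right| \cdot \left| \frac{p_{n+1}}{q_{n+1}} - \alpha \right| \ge 4 \cdot \frac{1}{2q_n^2} \cdot \frac{1}{2q_{n+1}^2} = \frac{1}{q_n^2 q_{n+1}^2}$,
where in the middle step we used the inequality $(x + y)^2 \ge 4xy$ (which is equivalent to $(x - y)^2 \ge 0$, and equality cannot hold in our case because $\alpha$ is irrational).
∗ Taking the square root gives $\left| \frac{p_n}{q_n} - \frac{p_{n+1}}{q_{n+1}} \right| > \frac{1}{q_n q_{n+1}}$, but this is false since these quantities are equal.
◦ For the second statement, suppose that $p/q$ is not a convergent and that $\left| \alpha - \frac{p}{q} \right| < \frac{1}{2q^2}$.
∗ Let $n$ be such that $q_n \le q < q_{n+1}$.
∗ By the previous proposition, it must be the case that $|q_n \alpha - p_n| < |q\alpha - p| = q \left| \alpha - \frac{p}{q} \right| < \frac{1}{2q}$.
∗ Thus, we conclude that $\left| \alpha - \frac{p_n}{q_n} \right| < \frac{1}{2q q_n}$.
∗ Now we get $\frac{1}{q q_n} \le \frac{|p_n q - p q_n|}{q q_n} = \left| \frac{p}{q} - \frac{p_n}{q_n} \right| \le \left| \frac{p}{q} - \alpha \right| + \left| \alpha - \frac{p_n}{q_n} \right| < \frac{1}{2q^2} + \frac{1}{2q q_n}$.
∗ But this implies $q < q_n$, which is a contradiction.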
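The converse direction can be illustrated numerically for $\alpha = \sqrt{2}$, whose expansion is $[1, 2, 2, 2, \ldots]$. The sketch below is an exhaustive check for small denominators under that specific choice of $\alpha$; the names `within_bound`, `sqrt2_convergents`, and `close_fractions` are hypothetical. Integer arithmetic is used throughout to avoid floating-point issues.

```python
from math import gcd, isqrt


def within_bound(p, q):
    """Exact test of |sqrt(2) - p/q| < 1/(2 q^2) for positive integers p, q."""
    # |sqrt2 - p/q| = |p^2 - 2q^2| / (q (p + q*sqrt2)), so the bound is equivalent to
    # 2q|p^2 - 2q^2| - p < q*sqrt2, which is decided by squaring when the left side is >= 0.
    lhs = 2 * q * abs(p * p - 2 * q * q) - p
    return lhs < 0 or lhs * lhs < 2 * q * q


def sqrt2_convergents(max_q):
    """Convergents (p, q) of sqrt(2) with q <= max_q, via p_k = 2 p_{k-1} + p_{k-2}."""
    p0, q0, p1, q1 = 1, 0, 1, 1
    out = []
    while q1 <= max_q:
        out.append((p1, q1))
        p0, q0, p1, q1 = p1, q1, 2 * p1 + p0, 2 * q1 + q0
    return out


def close_fractions(max_q):
    """All reduced p/q with q <= max_q satisfying the 1/(2 q^2) bound."""
    found = set()
    for q in range(1, max_q + 1):
        r = isqrt(2 * q * q)  # floor(q * sqrt 2); only r or r + 1 can be close enough
        for p in (r, r + 1):
            if gcd(p, q) == 1 and within_bound(p, q):
                found.add((p, q))
    return found


if __name__ == "__main__":
    print(sorted(close_fractions(100), key=lambda t: t[1]))
    print(sqrt2_convergents(100))
```

Every pair in the first list should also appear in the second, as the proposition predicts.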
• We will remark at this point that the continued fraction expansion provides a good way to detect approximations of rational numbers using a decimal expansion: simply compute the continued fraction of the decimal, and then round off appropriately.
• Example: Find a rational number of small denominator with decimal expansion $0.4614379084967\ldots$.
◦ We compute the continued fraction expansion of $\alpha = 0.4614379084967$, which is easy to do with a calculator or computer.
◦ We obtain the exact expression $\alpha = [0, 2, 5, 1, 57, 1, 53354674, 4, 1, 1, 6, 4]$.
◦ We truncate just before the huge term in the middle to get a guess of $\alpha = [0, 2, 5, 1, 57, 1] = \frac{353}{765}$.
◦ And indeed, we see that $\frac{353}{765} \approx 0.461437908496732$.
◦ From our results above, we can see that any rational number that is a closer approximation will have denominator roughly on order of the next convergent $[0, 2, 5, 1, 57, 1, 53354674] = \frac{18834200269}{40816326362}$, so the number we found is clearly the simplest.
◦ It is interesting to note that the period of the decimal expansion of $\frac{353}{765}$ is 16, so in fact we have identified the rational number before the expansion started repeating!
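The computation in this example is straightforward to reproduce with exact rational arithmetic. This is a small sketch; the function names `continued_fraction` and `convergent` are illustrative and not part of the notes.

```python
from fractions import Fraction


def continued_fraction(x):
    """Continued fraction terms [a0, a1, ...] of a rational number x (given as a Fraction)."""
    terms = []
    while True:
        a = x.numerator // x.denominator  # floor of x
        terms.append(a)
        x -= a
        if x == 0:
            return terms
        x = 1 / x


def convergent(terms):
    """Evaluate the finite continued fraction [a0, a1, ..., ak] as a Fraction."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value


if __name__ == "__main__":
    alpha = Fraction(4614379084967, 10**13)
    cf = continued_fraction(alpha)
    print(cf)                  # [0, 2, 5, 1, 57, 1, 53354674, 4, 1, 1, 6, 4]
    print(convergent(cf[:6]))  # 353/765
```

Truncating just before an unusually large term is a heuristic: the large term signals that the preceding convergent is an exceptionally good approximation.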
6.6 Pell's Equation
• The goal of this section is to study the integer solutions $(x, y)$ to Pell's equation $x^2 - Dy^2 = r$, for $D$ a positive squarefree integer and $r$ an arbitrary integer.
◦ One reason that we should care about solving this equation is that $x^2 - Dy^2 = r$ is equivalent to $N(x + y\sqrt{D}) = r$: thus, solving this equation in integers $(x, y)$ will tell us which elements in $\mathbb{Z}[\sqrt{D}]$ have norm $r$.
◦ In particular, if we can write down all solutions when $r = \pm 1$, they will correspond to the units in $\mathbb{Z}[\sqrt{D}]$, since an element is a unit in $\mathbb{Z}[\sqrt{D}]$ iff its norm is a unit in $\mathbb{Z}$.
◦ It is possible that there are no solutions to $x^2 - Dy^2 = -1$: for example, if $D \equiv 3 \pmod 4$, then modulo $D$ this would say $x^2 \equiv -1 \pmod D$, which we know to be false.
◦ However, we will show that there are always solutions to $x^2 - Dy^2 = 1$ for any $D$.
• Our first result relates Pell's equation to our results with continued fractions:
• Proposition: Let $D$ be a positive squarefree integer and $r$ any integer with $r^2 + |r| < D$. If $x$ and $y$ are positive integers with $x^2 - Dy^2 = r$, then $\frac{x}{y}$ is a continued fraction convergent to $\sqrt{D}$.
◦ Proof: We want to show that $\left| \frac{x}{y} - \sqrt{D} \right| < \frac{1}{2y^2}$. By our previous results, this will show that $\frac{x}{y}$ is a convergent to $\sqrt{D}$.
◦ Then we can write
$\left| \frac{x}{y} - \sqrt{D} \right| = \frac{|x^2 - Dy^2|}{y^2 \left( x/y + \sqrt{D} \right)} = \frac{|r|}{y^2 \left( \sqrt{r/y^2 + D} + \sqrt{D} \right)} < \frac{|r|}{y^2 \left( \sqrt{|r|^2} + |r| \right)} = \frac{1}{2y^2}$,
where we used the fact that $\sqrt{r/y^2 + D}$ is at least $\sqrt{|r|^2} = |r|$ (because $r/y^2 + D \ge D - |r| > r^2$) and $\sqrt{D} > |r|$, by the given information.
• Since small values of $x^2 - Dy^2$ all arise from continued fraction convergents, we should use the convergents to study the solutions to $x^2 - Dy^2 = r$. A first step is to show that the Pell equation with $r = 1$ always has a nontrivial solution:
• Proposition: If $D$ is a positive squarefree integer, then the equation $x^2 - Dy^2 = 1$ always has a nontrivial solution in integers $(x, y)$, and $\mathbb{Z}[\sqrt{D}]$ has a nontrivial unit $x + y\sqrt{D}$.
◦ Proof: If $\frac{p}{q}$ is a convergent to $\sqrt{D}$, then $\left| \frac{p}{q} + \sqrt{D} \right| < 1 + 2\sqrt{D}$, so we see that $|p^2 - Dq^2| = q^2 \left| \frac{p}{q} - \sqrt{D} \right| \cdot \left| \frac{p}{q} + \sqrt{D} \right| < \frac{1 + 2\sqrt{D}}{q^2} \cdot q^2 = 1 + 2\sqrt{D}$.
◦ Since $\sqrt{D}$ is irrational, there are an infinite number of convergents but only a finite number of possible values for $p^2 - Dq^2$, so by the pigeonhole principle there is some $r$ such that $p^2 - Dq^2 = r$ has infinitely many solutions.
◦ Choose such an $r$: then there are again only finitely many possible pairs for the reduction of $(p, q)$ modulo $r$, so there exist two distinct convergents $x/y$ and $s/t$ such that $x^2 - Dy^2 = s^2 - Dt^2 = r$, $x \equiv s \pmod r$, and $y \equiv t \pmod r$.
◦ Now we compute $u = \frac{x + y\sqrt{D}}{s + t\sqrt{D}} = \frac{xs - Dyt}{r} + \frac{-xt + ys}{r} \sqrt{D}$, and observe that both $xs - Dyt \equiv x^2 - Dy^2 \equiv 0 \pmod r$ and $-xt + ys \equiv 0 \pmod r$, so the quotient $u$ is of the form $a + b\sqrt{D}$ where $a, b \in \mathbb{Z}$.
◦ But now $N(u) = \frac{N(x + y\sqrt{D})}{N(s + t\sqrt{D})} = 1$, so $u$ is a unit in $\mathbb{Z}[\sqrt{D}]$, and $\left( \frac{xs - Dyt}{r}, \frac{-xt + ys}{r} \right)$ is a nontrivial solution to Pell's equation.
• Now that we know there exists a nontrivial unit $u$ in $\mathbb{Z}[\sqrt{D}]$, we can write down all of the units.
• Definition: For a fixed positive squarefree $D$, a fundamental solution $(x_1, y_1)$ to Pell's equation is a pair $(x_1, y_1)$ of positive integers such that $x_1^2 - Dy_1^2 = \pm 1$ and $x_1 + y_1\sqrt{D}$ is minimal. The fundamental unit of $\mathbb{Z}[\sqrt{D}]$ is $u = x_1 + y_1\sqrt{D}$.
◦ Note that this is well-defined: there will be a unique minimal positive value for $x_1 + y_1\sqrt{D}$ over all pairs $(x_1, y_1)$ satisfying $x_1^2 - Dy_1^2 = \pm 1$.
◦ Example: For $D = 2$, it is easy to see that $(1, 1)$ is a fundamental solution to Pell's equation $x^2 - 2y^2 = \pm 1$, and the corresponding fundamental unit is $u = 1 + \sqrt{2}$.
• Proposition: Let $u = x_1 + y_1\sqrt{D}$ be the fundamental unit in $\mathbb{Z}[\sqrt{D}]$. If $w$ is an arbitrary unit in $\mathbb{Z}[\sqrt{D}]$, then $w = \pm u^n$ for some integer $n$ (possibly negative). Equivalently, each pair $(x_n, y_n)$ defined by $x_n + y_n\sqrt{D} = (x_1 + y_1\sqrt{D})^n$ is a solution to $x^2 - Dy^2 = \pm 1$, and these are all of the solutions up to changing the signs of $x_n$ or $y_n$.
◦ Proof: By scaling by $-1$ (as necessary), assume $w$ is positive. Then there exists a unique integer $n$ such that $w \in [u^n, u^{n+1})$ since $u$ is a real number greater than 1.
◦ Now observe that $w \cdot u^{-n} \in [1, u)$, and $w \cdot u^{-n}$ is also a unit in $\mathbb{Z}[\sqrt{D}]$.
◦ If this unit $x + y\sqrt{D}$ were not equal to 1, then (possibly after flipping signs on one of its terms) it would yield a positive solution $(x, y)$ to Pell's equation $x^2 - Dy^2 = \pm 1$ such that $x + y\sqrt{D} < u$.
◦ But this contradicts the minimality of $u$, so in fact we must have $w \cdot u^{-n} = 1$, whence $w = u^n$.
◦ The second statement is simply a rephrasing of the first, after converting the statements about units to statements about solutions to Pell's equation (by taking norms).
◦ Remark (for those who like group theory): This result says that the group of units of $\mathbb{Z}[\sqrt{D}]$ is isomorphic to $(\mathbb{Z}/2\mathbb{Z}) \times \mathbb{Z}$. (The first term is from the $\pm$ sign and the second term is the power of the fundamental unit $u$.)
• Example: Since the fundamental unit of $\mathbb{Z}[\sqrt{2}]$ is $u = 1 + \sqrt{2}$, the set of units of $\mathbb{Z}[\sqrt{2}]$ are the elements $\pm(1 + \sqrt{2})^n$ for $n \in \mathbb{Z}$.
◦ Taking the fifth power (say) yields the element $41 + 29\sqrt{2}$, and we can indeed compute that $41^2 - 2 \cdot 29^2 = -1$. (The sign is $-1$ because $u$ has norm $-1$ and the exponent is odd.)
• We have almost completely described the units of $\mathbb{Z}[\sqrt{D}]$: all that is left is to explain how to find the fundamental unit using the continued fraction expansion of $\sqrt{D}$:
• Proposition: If $D$ is a squarefree positive integer, the continued fraction expansion of $\sqrt{D}$ is periodic and of the form $[a_0, \overline{a_1, a_2, \cdots, a_{r-1}, 2a_0}]$ with $a_0 = \lfloor \sqrt{D} \rfloor$. Then the fundamental unit $u = x_1 + y_1\sqrt{D}$ satisfies $x_1 = p_{r-1}$ and $y_1 = q_{r-1}$. Furthermore, there exists a solution to $x^2 - Dy^2 = -1$ if and only if $r$ is odd.
◦ We will omit the details of this proof, since it is fairly lengthy.
◦ We will content ourselves with verifying the result in one or two cases.
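The statement of the proposition translates directly into a short computation. The sketch below uses the standard integer recurrence for the continued fraction of $\sqrt{D}$ (an assumption on my part, since the notes do not spell out an algorithm), stopping at the first partial quotient equal to $2a_0$. The names `sqrt_cf_period`, `fundamental_unit`, and `square_unit` are hypothetical.

```python
from math import isqrt


def sqrt_cf_period(D):
    """Return (a0, [a1, ..., ar]) where sqrt(D) = [a0, overline(a1, ..., ar)], D not a square."""
    a0 = isqrt(D)
    m, d, a = 0, 1, a0
    period = []
    while a != 2 * a0:  # the period ends at the first partial quotient equal to 2*a0
        m = d * a - m
        d = (D - m * m) // d
        a = (a0 + m) // d
        period.append(a)
    return a0, period


def fundamental_unit(D):
    """Return (x1, y1, r): the convergent [a0, ..., a_{r-1}] = x1/y1 and the period length r."""
    a0, period = sqrt_cf_period(D)
    terms = [a0] + period[:-1]
    p_prev, q_prev, p, q = 1, 0, terms[0], 1
    for a in terms[1:]:
        p_prev, q_prev, p, q = p, q, a * p + p_prev, a * q + q_prev
    return p, q, len(period)


def square_unit(x, y, D):
    """Coefficients of (x + y sqrt(D))^2."""
    return x * x + D * y * y, 2 * x * y


if __name__ == "__main__":
    for D in (2, 7, 13):
        x, y, r = fundamental_unit(D)
        print(D, (x, y), r, x * x - D * y * y)
    print(square_unit(18, 5, 13))  # (649, 180)
```

The printed norm $x_1^2 - Dy_1^2$ should be $-1$ exactly when the period length $r$ is odd.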
• Example: Find the fundamental unit in $\mathbb{Z}[\sqrt{7}]$.
◦ Earlier, we computed the expansion as $\sqrt{7} = [2, \overline{1, 1, 1, 4}]$. This has the desired form with $r = 4$, so we conclude there is no solution to $x^2 - 7y^2 = -1$.
◦ Then the desired convergent is $C_4 = [2, 1, 1, 1] = \frac{8}{3}$, and we can indeed verify that $8^2 - 7 \cdot 3^2 = 1$. We conclude that the fundamental unit of $\mathbb{Z}[\sqrt{7}]$ is $8 + 3\sqrt{7}$.
• Example: Find the fundamental unit in $\mathbb{Z}[\sqrt{13}]$.
◦ We compute the continued fraction expansion of $\sqrt{13} = [3, \overline{1, 1, 1, 1, 6}]$. This has the desired form with $r = 5$, so we conclude there is a solution to $x^2 - 13y^2 = -1$.
◦ Then the desired convergent is $C_5 = [3, 1, 1, 1, 1] = \frac{18}{5}$, and we can indeed verify that $18^2 - 13 \cdot 5^2 = -1$. We conclude that the fundamental unit of $\mathbb{Z}[\sqrt{13}]$ is $18 + 5\sqrt{13}$.
◦ If we want a solution to Pell's equation $x^2 - 13y^2 = 1$ instead, we simply square the fundamental unit: $(18 + 5\sqrt{13})^2 = 649 + 180\sqrt{13}$: then the minimal nontrivial solution is given by $(649, 180)$.
6.7 An Assortment of Diophantine Equations
• In this section we discuss a number of unrelated Diophantine equations.
◦ Several will have the form $y^2 = x^3 + ax^2 + bx + c$ for fixed integers $a, b, c$.
◦ Such equations are examples of elliptic curves, which are quite important in modern number theory. Over a field of characteristic zero, up to a change of variables, an elliptic curve has the general form $y^2 = x^3 + Ax + B$.
◦
Elliptic curves appear in a tremendous number of places, and are fundamentally important to modern
number theory.
◦
For example, they are the basis for elliptic curve factorization (currently one of the fastest integer
factorization algorithms), elliptic curve cryptography (a frequently-used cryptosystem which is often
used instead of RSA), and they are a key ingredient in the proof of Fermat's Last Theorem (see the next
section).
• Proposition: There are no solutions to the Diophantine equation $x^2 + y^2 + z^2 = 4^a(8b + 7)$, for nonnegative integers $a$ and $b$.
◦ The idea of this proof is to use modular arithmetic.
◦ Proof: We prove the result by induction on $a$.
∗ For the base case $a = 0$, consider the equation modulo 8.
∗ Each square is either 0, 1, or 4 mod 8, so it is not possible to obtain a sum of 7 mod 8 by adding three of them.
∗ Now suppose there are no solutions for $a \le k$, and take $a = k + 1$.
∗ Consider the equation $x^2 + y^2 + z^2 = 4^{k+1}(8b + 7)$ modulo 4. Each of the squares is 0 or 1, while the term $4^{k+1}(8b + 7)$ is 0 mod 4, so all of the squares must be 0 mod 4.
∗ Then $\left( \frac{x}{2} \right)^2 + \left( \frac{y}{2} \right)^2 + \left( \frac{z}{2} \right)^2 = 4^k(8b + 7)$, but this is a contradiction since by induction, this equation has no solutions.
◦ Remark: In fact, these are the only integers that cannot be written as a sum of three squares, as first proven by Legendre. (Gauss gave a formula for the number of such representations, similar to Fermat's formula for the number of ways of writing an integer as a sum of two squares.)
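Both the proposition and Legendre's converse can be checked by brute force over a small range. This is a sketch only; the function names `is_sum_of_three_squares` and `has_excluded_form` are hypothetical.

```python
from math import isqrt


def is_sum_of_three_squares(n):
    """True if n = x^2 + y^2 + z^2 for some integers x, y, z (search with 0 <= x <= y)."""
    for x in range(isqrt(n) + 1):
        for y in range(x, isqrt(n - x * x) + 1):
            rest = n - x * x - y * y
            z = isqrt(rest)
            if z * z == rest:
                return True
    return False


def has_excluded_form(n):
    """True if n = 4^a (8b + 7) for some nonnegative integers a, b (n positive)."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7


if __name__ == "__main__":
    excluded = [n for n in range(1, 100) if not is_sum_of_three_squares(n)]
    print(excluded)
    print(all(has_excluded_form(n) for n in excluded))
```

The first list should consist exactly of the integers of the form $4^a(8b+7)$ below 100.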
• Proposition: The Diophantine equation $y^2 = x^4 + 4x^3 + x^2 + 2x + 1$ has the solutions $(x, y) = (-4, \pm 3), (0, \pm 1), (1, \pm 3), (6, \pm 47)$, and no others.
◦ The idea of this result is to attempt to complete the square of the $x$-terms, and then use some simple inequalities to bound how big $x$ and $y$ can be.
◦ Proof: We complete the square of the $x$-terms and obtain $x^4 + 4x^3 + x^2 + 2x + 1 = \left( x^2 + 2x - \frac{3}{2} \right)^2 + 8x - \frac{5}{4}$.
∗ Thus, we should try comparing $y^2$ to $(x^2 + 2x - 2)^2 = x^4 + 4x^3 - 8x + 4$ and $(x^2 + 2x - 1)^2 = x^4 + 4x^3 + 2x^2 - 4x + 1$.
∗ We see that $y^2 - (x^2 + 2x - 2)^2 = x^2 + 10x - 3$ is positive outside $[-10.3, 0.3]$, while $(x^2 + 2x - 1)^2 - y^2 = x^2 - 6x$ is positive outside $[0, 6]$.
∗ Hence, if $x \notin [-10, 6]$, then we have the strict inequalities $(x^2 + 2x - 2)^2 < y^2 < (x^2 + 2x - 1)^2$, which is impossible if $x$ and $y$ are both integers.
∗ Thus it must be the case that $x \in [-10, 6]$. Now it is (reasonably) easy to check the 17 cases to find the solutions as listed above.
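The 17 remaining cases, and the sandwich inequality used to exclude everything else, can be checked mechanically. This is a small sketch; the names `quartic` and `solutions` are hypothetical.

```python
from math import isqrt


def quartic(x):
    """The right-hand side x^4 + 4x^3 + x^2 + 2x + 1."""
    return x**4 + 4 * x**3 + x**2 + 2 * x + 1


def solutions(lo=-10, hi=6):
    """All (x, y) with y >= 0 and y^2 = quartic(x), for lo <= x <= hi."""
    found = []
    for x in range(lo, hi + 1):
        value = quartic(x)
        if value >= 0:
            y = isqrt(value)
            if y * y == value:
                found.append((x, y))
    return found


if __name__ == "__main__":
    print(solutions())  # [(-4, 3), (0, 1), (1, 3), (6, 47)]
    # Outside [-10, 6] the value lies strictly between two consecutive squares.
    outside = list(range(-200, -10)) + list(range(7, 200))
    print(all((x * x + 2 * x - 2) ** 2 < quartic(x) < (x * x + 2 * x - 1) ** 2
              for x in outside))
```

Only nonnegative $y$ is reported; each solution with $y > 0$ also gives the solution $(x, -y)$.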
◦ Remark: More generally, one can adapt this proof method to show that there are only finitely many solutions to any equation of the form $y^2 = x^4 + ax^3 + bx^2 + cx + d$ for fixed integers $a, b, c, d$.
• Proposition: All solutions to the Diophantine equation $x^2 + y^2 = z^3$ with $\gcd(x, y) = 1$ are of the form $(x, y, z) = (a^3 - 3ab^2, 3a^2b - b^3, a^2 + b^2)$ for relatively prime integers $a, b$ of opposite parity.
◦ The idea of this proof is to work over the Gaussian integers $\mathbb{Z}[i]$.
◦ Proof: First observe that if $x, y$ were both odd, then $z^3 \equiv 2 \pmod 4$, but 2 is not a cube modulo 4.
∗ Since $x, y$ are not both even since $\gcd(x, y) = 1$, we conclude that one is even and the other is odd.
∗ Now, over $\mathbb{Z}[i]$, factor the equation as $(x + iy)(x - iy) = z^3$.
∗ We claim that $x + iy$ and $x - iy$ are relatively prime: any common divisor would divide both $2x$ and $2y$, hence divide 2. But $1 + i$ (the only Gaussian prime dividing 2) does not divide $x + iy$, since $x, y$ are of opposite parity.
∗ Now, by the uniqueness of prime factorization in $\mathbb{Z}[i]$, we conclude that $x + iy$ must be a unit times a cube.
∗ But since each unit in $\mathbb{Z}[i]$ is actually a cube, we conclude that $x + iy = (a + bi)^3$ for some $a + bi \in \mathbb{Z}[i]$.
∗ Equating real and imaginary parts yields $x = a^3 - 3ab^2$, $y = 3a^2b - b^3$, and then $z = (a + bi)(a - bi) = a^2 + b^2$, as claimed.
◦ Remark: More generally, one can use an almost identical argument to write down the solutions to $x^2 + y^2 = z^d$ for any positive integer $d$.
• Corollary: The only solution to the Diophantine equation $y^2 = x^3 - 1$ is $(x, y) = (1, 0)$.
◦ Proof: Clearly, $\gcd(x, y) = 1$. Rearranging the equation into the form $1 + y^2 = x^3$ and applying the previous result shows that $1 = a^3 - 3ab^2$ for $a, b \in \mathbb{Z}$. Factoring this gives $1 = a(a^2 - 3b^2)$. Clearly, $a = \pm 1$, and then the only solution is easily seen to be $(a, b) = (1, 0)$, yielding $(x, y) = (1, 0)$.
• We do not need to restrict to considering Diophantine equations where the terms involved are polynomials in the variables: we can also include variables in the exponents.
• Proposition: The Diophantine equation $3^a - 2^b = 1$ has the two solutions $(a, b) = (1, 1), (2, 3)$, and no others.
◦ The idea is to use congruence conditions.
◦ Proof: Clearly $a$ and $b$ must be nonnegative, else the denominators of the rational numbers involved could not be equal. Clearly $b = 0$ fails, and $b = 1$ gives $a = 1$.
∗ Now suppose $b \ge 2$ and consider the equation modulo 4: we obtain $3^a \equiv 1 \pmod 4$, meaning that $a$ is even, say, $a = 2k$.
∗ Then we have $2^b = 3^{2k} - 1 = (3^k + 1)(3^k - 1)$, so $3^k + 1$ and $3^k - 1$ must both be powers of 2.
∗ But their difference is 2, and so they must be 4 and 2 respectively. Thus, the only other solution is $(a, b) = (2, 3)$.
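A quick search over small exponents is consistent with the proof, and the key step (both $3^k + 1$ and $3^k - 1$ being powers of 2) can be inspected directly. This is a sketch with arbitrary search bounds; the names `pell_like_solutions` and `is_power_of_two` are hypothetical.

```python
def pell_like_solutions(max_a=60, max_b=100):
    """All (a, b) with 0 <= a <= max_a, 0 <= b <= max_b and 3^a - 2^b = 1."""
    return [(a, b) for a in range(max_a + 1) for b in range(max_b + 1)
            if 3**a - 2**b == 1]


def is_power_of_two(n):
    """True if n = 2^j for some j >= 0."""
    return n > 0 and n & (n - 1) == 0


if __name__ == "__main__":
    print(pell_like_solutions())  # [(1, 1), (2, 3)]
    # Values of k for which 3^k + 1 and 3^k - 1 are both powers of 2.
    print([k for k in range(1, 60)
           if is_power_of_two(3**k + 1) and is_power_of_two(3**k - 1)])  # [1]
```

The finite search proves nothing by itself; it only confirms that the bounds chosen here contain no solutions beyond the two found in the proof.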
◦ Remark: This result is a special case of a result called Catalan's conjecture (recently proven in 2002 by Mihailescu) that 8 and 9 are the only perfect powers that are consecutive: in other words, the only solution to $x^a - y^b = 1$ in integers greater than 1 is $(a, b, x, y) = (2, 3, 3, 2)$.
• Proposition: The Diophantine equation $y^2 = x^3 + 7$ has no solutions.
◦ The idea of this result is to rewrite the equation slightly, exploit congruence conditions, and then quadratic reciprocity to obtain a contradiction.
◦ Proof: If $x$ is even, then this equation yields $y^2 \equiv 3 \pmod 4$, which is not possible.
∗ Then $x$ is odd and $y$ is even, so $x^3 + 7 \equiv 0 \pmod 4$, meaning $x \equiv 1 \pmod 4$.
∗ Now we write the equation as $y^2 + 1 = x^3 + 8 = (x + 2)(x^2 - 2x + 4)$.
∗ By our results about quadratic residues, any prime divisor $p$ of $y^2 + 1$ must be congruent to 1 modulo 4, since $y^2 + 1 \equiv 0 \pmod p$ is equivalent to $-1$ being a quadratic residue modulo $p$.
∗ Thus, any prime divisor (hence any divisor) of $y^2 + 1 = x^3 + 8$ must be congruent to 1 modulo 4.
∗ But $x + 2$ is a divisor of $x^3 + 8$ congruent to 3 modulo 4, so we have a contradiction.
◦ Remark: This result is notable because there always is a solution to the equation $y^2 = x^3 + 7$ modulo $p$ for every prime $p$. (This is not trivial to prove.)
• Theorem (Fermat): The Diophantine equation $x^4 + y^4 = z^2$ has no solutions with $xyz \ne 0$.
◦ This result, as noted, is originally due to Fermat. Note in particular that it is a stronger result than the $n = 4$ case of Fermat's Last Theorem, which asks for solutions to $x^4 + y^4 = z^4$.
◦ We show the result using an infinite descent technique: we consider the smallest nontrivial solution of the equation, and use it to construct a smaller solution.
◦ Proof: If the equation has nontrivial solutions, let $u$ be the smallest positive integer such that $x^4 + y^4 = u^2$ has a solution.
∗ Observe that $\gcd(x, y) = 1$, otherwise we could replace $x, y, u$ with $x/d, y/d, u/d^2$ to obtain a smaller solution.
∗ By reducing both sides modulo 4, we see that one of $x, y$ is even and the other is odd: by interchanging, assume $x$ is even.
∗ Then $(x^2, y^2, u)$ is a primitive Pythagorean triple, so from our parametrization we see that $x^2 = 2st$, $y^2 = s^2 - t^2$, and $u = s^2 + t^2$ for some relatively prime integers $s > t > 0$ of opposite parity.
∗ Since $y^2 = s^2 - t^2$, it must be the case that $s$ is odd and $t$ is even: otherwise, $y^2 = s^2 - t^2$ would be congruent to $-1$ modulo 4.
∗ If we set $t = 2k$, then we see that $\left( \frac{x}{2} \right)^2 = sk$ where $\gcd(s, k) = 1$, so each of $s$ and $k$ is a perfect square.
∗ Setting $s = a^2$ and $k = b^2$ yields the system $y^2 = s^2 - t^2 = a^4 - 4b^4$, so that $y^2 + (2b^2)^2 = a^4$.
∗ Then $(y, 2b^2, a^2)$ is a primitive Pythagorean triple, so there exist relatively prime integers $m$ and $n$ such that $2b^2 = 2mn$, $y = m^2 - n^2$, and $a^2 = m^2 + n^2$.
∗ The first equation gives $b^2 = mn$, so $m$ and $n$ are both squares: say, $m = v^2$ and $n = w^2$.
∗ Then, at last, we see that $a^2 = v^4 + w^4$, meaning that we have a new solution $(v, w, a)$ to the original equation. Clearly $a \le a^2 = s < s^2 + t^2 = u$, so this solution is smaller. We have obtained a contradiction.
6.8 Some Remarks on Fermat's Last Theorem
• One of the most famous Diophantine equations is Fermat's equation $x^n + y^n = z^n$, for a fixed integer $n \ge 3$. Clearly, there are solutions if one of the variables is equal to 0: the question is whether this equation possesses any other solutions.
◦ It is enough to prove that there are no solutions in the cases $n = 4$ and $n = p$ where $p$ is an odd prime, since any $n > 2$ is divisible by 4 or an odd prime.
• This result was famously conjectured by Fermat in 1637, who wrote (in the margin of his book, in Latin) "It is impossible to separate a cube into two cubes, or a fourth power into two fourth powers, or in general, any power higher than the second, into two like powers. I have discovered a truly marvellous proof of this, which this margin is too narrow to contain."
◦ It is now believed that Fermat probably did not have a correct proof of this result.
◦ In the previous section, we gave a proof that there are no solutions to the Fermat equation when $n = 4$ (as a consequence of the stronger result for $x^4 + y^4 = z^2$).
• For the case $n = 3$, the idea is to factor the equation $x^3 + y^3 = z^3$ in the ring $\mathbb{Z}[\rho]$, where $\rho = \frac{-1 + \sqrt{-3}}{2}$ is a non-real cube root of unity.
◦ The elements of this ring are of the form $a + b\rho$ for $a, b \in \mathbb{Z}$, where $\rho^2 = -\rho - 1$.
◦ It can be shown that the ring $\mathbb{Z}[\rho]$ has a division algorithm using the same method we used to show $\mathbb{Z}[i]$ had a division algorithm, so it is a Euclidean domain and thus has unique factorization.
◦ Then we can factor the equation as $(x + y)(x + \rho y)(x + \rho^2 y) = z^3$. Now the idea is to show that, up to small issues, $x + y$, $x + \rho y$, $x + \rho^2 y$ are relatively prime in $\mathbb{Z}[\rho]$, and then to use this to derive a contradiction. (The procedure is analogous to the analysis of $x^2 + y^2 = z^2$ using the arithmetic of $\mathbb{Z}[i]$.)
• The argument in the $n = 3$ case lends itself to a natural generalization, namely, factoring $x^n + y^n = z^n$ over the ring $\mathbb{Z}[\zeta_n]$ where $\zeta_n = e^{2\pi i/n}$ is an $n$th root of unity.
◦ However, quite unfortunately, the ring $\mathbb{Z}[\zeta_n]$ is not a unique factorization domain for most $n$!
◦ So (alas) this technique will not work in general. However, determining when this approach can succeed was one of the original motivations for studying unique factorization in general rings.
• The cases $n = 5$ and $n = 7$ were shown in the 1800s by various mathematicians using various techniques. A number of other cases were shown individually, and then results of Germain and others established infinite classes of prime $n$ for which there are no nontrivial solutions to the equation.
• However, the lack of a solution to Fermat's equation for every $n > 2$ was not established until 1995, with Andrew Wiles's celebrated proof of the Taniyama-Shimura-Weil conjecture. (Wiles announced his result in 1993, but a gap was discovered later that year. Wiles, working with Richard Taylor, closed the gap by 1994.)
◦ The connection between Fermat's equation and elliptic curves stems from an observation made by Frey in 1984: if there is a nonzero solution to $a^p + b^p = c^p$, then the elliptic curve $y^2 = x(x - a^p)(x + b^p)$ has discriminant equal, up to a power of 2, to $a^{2p} b^{2p} (a^p + b^p)^2 = (abc)^{2p}$.
Such a curve would have a number of unusual properties, and (in particular) is what is called a semistable
elliptic curve, and it would also fail to be modular. (To understand what these terms mean would require
at least a graduate-level course in number theory.)
◦
Wiles's results proved that every semistable elliptic curve is modular, which, when combined with Frey's
observations, shows that the Fermat equation cannot have a solution in nonzero integers.
◦
It is worth noting that, as with most major mathematical advances, the fundamental ideas put forward
in Wiles's work are just as important as the end result of his proof.
Well, you're at the end of my handout. Hope it was helpful.
Copyright notice: This material is copyright Evan Dummit, 2014. You may not reproduce or distribute this material
without my express permission.