Download 5 The Pell equation

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Location arithmetic wikipedia , lookup

Quadratic reciprocity wikipedia , lookup

Addition wikipedia , lookup

Fermat's Last Theorem wikipedia , lookup

Elementary mathematics wikipedia , lookup

Fundamental theorem of algebra wikipedia , lookup

Number theory wikipedia , lookup

Partial differential equation wikipedia , lookup

Proofs of Fermat's little theorem wikipedia , lookup

System of polynomial equations wikipedia , lookup

Transcript
5
5.1
The Pell equation
Side and diagonal numbers
In
Hence the discovery that (1)
√ ancient time, only rational numbers were thought of as numbers.
√
2 is the length of a hypoteneuse of a right triangle, and (2) 2 is irrational (which we proved
in Section 3, but the Greeks also had another proof for) was quite perplexing. Hence the ancient
Greeks studied the Diophantine equation
x2 − 2y 2 = 1
√
to try to understand 2. They were able to produce a sequence of increasingly large solutions
(xi , yi ). Note that we can rewrite such an equation as
r
√
x2
1
1
x
= 2 + 2 ⇐⇒
= 2+ 2 → 2
2
y
y
y
y
√
as y → ∞. Hence the solutions (xi , yi ) provide increasingly good rational approximations to 2 as
yi gets large.
We will study the solutions in Z to the more general Pell equation
x2 − ny 2 = 1
for any n ∈ N. Note Pell’s equation always has the “trivial” solutions (±1, 0). Further, the case
where n is a square is easy:
Exercise 5.1. If n ∈ N is a square, show the only solutions of x2 − ny 2 = 1 are (±1, 0). (Cf.
Exercises 5.1.3, 5.1.4.)
√
Hence, from now on, we will assume n is not a square. Then we know n is irrational from
Section 2.5.
To understand this equation thoroughly over Z, we need to work with another number system
√
√
Z[ n] = a + b n : a, b ∈ Z .
This idea of working with a larger number system than Z to study problems about Z is the basis of
algebraic number theory. (Note we could also think of each Z/mZ a smaller number system than Z,
which we used in Chapter 3 to study problems over Z as well. However because Z/mZ is smaller, it
can typically only be used to limit the kinds of solutions we might have to an equation over Z, and
√
not actually prove the existence of solutions.) Here we take for granted the existence of n, which
as pointed out above, was not always known (or thought) to be a number. The reason we want to
√
work specifically with the ring1 Z[ n] is of course so we can factor Pell’s equation:
√
√
x2 − ny 2 = (x + y n)(x − y n) = 1.
√
If x = a + b n, we say a is the rational part and b is the irrational part of x. (This is
√
analogous to real and imaginary parts of complex numbers.) Note that two numbers x, y ∈ Z[ n]
1
A ring is, roughly, a number system in which you can add, subtract and multiply, but not necessarily divide. We
will give the formal definition in Chapter 10. For now, you don’t need to know anything about rings; we will just get
in the habit of calling some number systems rings to stress that they are somehow similar to Z.
29
are equal if and only if their rational and irrational parts are equal. The (⇐) direction is obvious.
√
√
To prove the (⇒) direction, write x = a1 + b1 n and y = a2 + b2 n. Then
√
x = y ⇐⇒ a1 − a2 = (b2 − b1 ) n.
√
2
If b2 − b1 6= 0, then n = ab12 −a
−b1 ∈ Q which is a contradiction. Hence b2 = b1 , and therefore a1 = a2
also.
You might wonder about the title of this section. I won’t cover it. See the text.
5.2
The equation x2 − 2y 2 = 1
It is simple to determine all rational solutions of x2 − 2y 2 = 1 using the rational slope (Diophantus
chord) method. However determining the integer solutions is a different matter. First we observe
by trial-and-error that the smallest non-trivial solution is (3, 2).
Exercise 5.2. Check the following composition rule holds:
(x21 − 2y12 )(x22 − 2y22 ) = x23 − 2y32
where
x3 = x1 x2 + 2y1 y2 , y3 = x1 y2 + y1 x2 .
Hence if (x1 , y1 ) and (x2 , y2 ) are to solutions to x2 − 2y 2 = 1, so is there composition (x3 , y3 ),
defined as above. We denote
(x3 , y3 ) = (x1 , y1 ) · (x2 , y2 ).
We will see in the next section that the solutions to x2 − 2y 2 = 1 form a group under this operation.
Further, since the definition of composition is symmetric in (x1 , y1 ) and (x2 , y2 ) this will be an
abelian group.
Example. (x1 , y1 ) · (±1, 0) = (±x1 , ±y1 ).
Example. (3, 2) · (3, 2) = (9 + 8, 12) = (17, 12)
Example. (3, 2)3 = (3, 2) · (17, 12) = (99, 70).
Through composition (the “powers” of (3, 2)) we can see that we can get infinitely many solutions,
each getting larger. The first three powers give the sequence of approximations
3
= 1.5
2
17
= 1.416
12
99
= 1.41428757
70
√
≈ 2 = 1.4142135623 . . .
Exercise 5.3. Compute (3, 2)4 . Use this to obtain a decimal approximation for
digits is it accurate? (Use a calculator/computer.)
30
√
2. To how many
5.3
The group of solutions
This section shows that the solutions of x2 − 2y 2 = 1 form a group under the composition defined
above, and this group is generated by (3, 2) and (−1, 0). However it is subsumed in the next section
which treats x2 − ny 2 = 1 using norms, so I will not treat the case n = 2 separately.
5.4
√
The general Pell equation and Z[ n]
To put the ideas we will use in context, let us recall some things about complex numbers. Let z ∈ C.
Then we can write z = x + yi where x, y ∈ R. The complex conjugate of z is z = x − yi, and
zz = (x + yi)(x − yi) = x2 + y 2 .
Drawing z as a vector in the complex plane, we see that zz is the square of the length of this vector,
i.e., zz = |z|2 . Define the norm of z to be
N (z) = zz.
Since complex conjugation respects multiplication,
N (z1 z2 ) = z1 z2 z1 z2 = z1 z 1 z2 z 2 = N (z1 )N (z2 ),
i.e, the norm is multiplicative.
√
√
√
Similar to the complex case, define the conjugate of α = x + y n ∈ Z[ n] to be α = x − y n,
and the norm of α to be
√
√
N (α) = αα = (x + y n)(x − y n) = x2 − ny 2 .
Note N (α) = N (α).
The following lemma is clear.
√
Lemma 5.1. Solutions of x2 − ny 2 = 1 are in 1-1 correspondence with the elements in Z[ n] of
√
norm 1. The correspondence is given by (x, y) ↔ x + y n.
Now we want to know a basic property of norms.
√
Lemma 5.2. For α, β ∈ Z[ n], we have N (αβ) = N (α)N (β), i.e., N is multiplicative.
√
√
Proof. Write α = x1 + y1 n, β = x2 + y2 n. Note
√
√
√
αβ = x1 x2 + ny1 y2 − (x1 y2 + y1 x2 ) n = (x1 − y1 n)(x2 − y2 n) = α · β.
Hence
N (αβ) = αβαβ = αα · ββ = N (α)N (β).
Hence if α and β have norm 1, so does αβ. In light of Lemma 5.1, this says if we have two
solutions to x2 −ny 2 = 1, we can compose them to construct a third. Precisely, we can say something
stronger.
31
Corollary 5.3. (Brahmagupta composition rule) If (x1 , y1 ) and (x2 , y2 ) are solutions to
x21 − ny12 = a, x22 − ny 2 = b.
Then the composition
(x3 , y3 ) = (x1 , y1 ) · (x2 , y2 ) := (x1 x2 + ny1 y2 , x1 y2 + y1 x2 )
is a solution of
x23 − ny32 = ab.
Proof. We simply translate the above into a statement about norms. The hypothesis says N (x1 +
√
√
y1 n) = a and N (x2 + y2 n) = b. Now observe that
√
√
√
√
(x1 + y1 n)(x2 + y2 n) = x1 x2 + ny1 y2 + (x1 y2 + y1 x2 ) n = x3 + y3 n.
Hence by the multiplicative property of the norm,
√
√
√
x23 − ny32 = N (x3 + y3 n) = N (x1 + y1 n)N (x2 + y2 n) = ab.
Note this is much nicer than the straightforward proof given on p. 82.
Aside: this result says that if a and b are of the form x2 − ny 2 , so is ab. Hence if we want to
ask the question which integers are of the form x2 − ny 2 , we should first determine which primes
are of the form x2 − ny 2 . We will not pursue this now, however we will return to this idea when
considering which numbers are sums of squares, or more generally, of the form x2 + ny 2 . (The “+”
case turns out to be simpler, but still not easy.)
√
√
Since Z[ n] ⊆ R, there is a natural order on Z[ n]. By Lemma 5.1, this gives us a way to order
the solutions to Pell’s equation. As the case of n = 2 suggests, we want to first look for a “smallest”
non-trivial solution and try to obtain all other solutions from that.
If (x, y) is a solution to x2 − ny 2 = 1, so are (±x, ±y). So when we say we want a “smallest”
solution, we should make a restriction like x, y > 0. Thinking in terms of elements of norm 1, note
√
√
that conjugates and negatives of α = x + y n give ±x ± y n. Hence we want to look for the
smallest element of norm 1 such that x, y > 0.
√
√
√
Definition 5.4. The fundamental +unit2 of Z[ n] is the smallest = x + y n ∈ Z[ n] such
that x, y > 0 and N () = 1.
√
Lemma 5.5. The fundamental +unit of Z[ n] is well defined and always exists.
√
Proof. We will show in the next section that there is always some 6= ±1 in Z[ n] such that
N () = 1. By possibly taking the negative and/or conjugate of , we may assume x, y > 0. So at
√
least one candidate exists. Since the set of all a + b n with a, b ∈ N is discrete in R, there must be
a minimal such (i.e., such cannot get arbitrarly close to 1).
2
This is not standard terminology. One normally defines the fundamental unit, which can have norm ±1. If it
has norm +1, this coincides with our definition; if it has norm -1, its square is what we are calling the fundamental
+unit.
32
√
√
Example. The fundamental +unit of Z[ 2] is 3 + 2 2.
Exercise 5.4. An alternative definition of fundamental unit is the smallest > 1 such that N () = 1.
√
Prove that this is equivalent to the above definition as follows. Suppose = x + y n > 1 and
N () = 1. Show (i) 0 < < 1. Then deduce (ii) x, y > 0.
√
Theorem 5.6. Let U + = {α ∈ Z[ n] : N (α) = 1}. Then U + is an infinite abelian group under
√
multplication. Furthermore, it is generated by the fundamental unit of Z[ n] and −1.
Proof. Clearly the identity 1 ∈ U + and multiplication on U + is associative. Note that for any
α ∈ U + , N (α) = αα = 1 implies that α = α−1 . Since N (α) = 1 also, we have α ∈ U + . Also by the
multiplicative property of the norm, if α, β ∈ U + then N (αβ) = N (α)N (β) = 1 so αβ ∈ U + . This
√
shows U + is a group, and it is clearly abelian because multiplication in Z[ n] is commutative.
√
Now let be the fundamental +unit of Z[ n]. Suppose there exists α ∈ U + such that α 6= ±m
for any m ∈ Z. By taking the negative and/or conjugate if need be, we may assume α > 1. Since is minimal and m → ∞ as m → ∞, there must be some m > 0 such that m < α < m + 1. But
then 1 < α−m < and N (α−m ) = 1, contradicting the minimality of . Hence each α ∈ U + is
(±) a power of .
√
√
Remark. All α ∈ U + are called units of Z[ n], because like ±1, they are invertible in Z[ n].
√
The actual definition of the units of Z[ n] is the set of invertible elements, which is easy to see is
precisely the set of elements of norm ±1.
Hence the solutions of Pell’s equation are given by ±m where m ∈ Z. Since we know = −1 , then
−m = m . Thus m and −m give essentially the same solutions.
√
√
Corollary 5.7. Suppose = x0 +y0 n is a fundamental +unit of Z[ n]. Then all integer solutions
√
√
to Pell’s equation x2 − ny 2 = 1 are of the form (±x, ±y) where x + y n = (x0 + y0 n)m and m ≥ 0.
Equivalently, up to sign, all solutions to Pell’s equations are given by non-negative powers (in the
sense of Brahmagupta composition) of the fundamental solution (x0 , y0 ).
√
√ m
Example. Up to sign, all non-trivial solutions of x2 − 2y 2 = 1 are given
by
(x
+
y
2)
=
(3
+
2
2)
√ m
for m > 0, i.e., x and y are the rational and irrational parts of (3 + 2 2) .
The book says little about how to find fundamental solutions (called smallest positive solutions
in the text). By rewriting Pell’s equation as
x2 = ny 2 + 1
it becomes clear that we can find the fundamental solution (or fundamental +unit) by finding the
smallest y > 0 such that ny 2 + 1 is a square. This will give the smallest x > 0 which solves
√
x2 − ny 2 = 1, i.e., x and y are simultaneously minimal for this solution, making x + y n minimal
(with x, y > 0) among U + .
2
2
Example. Since 3 · 12 + 1 is a square, the smallest
positive
√
√ (fundamental) solution to x − 3y = 1
is (2, 1). Hence the fundamental +unit of Z[ 3] is 2 + 3. Up to sign, all solutions are powers of
(2, 1), e.g., (2, 1)2 = (7, 4) and (2, 1)3 = (26, 15). This provides the successive approximations 12 , 74 ,
√
26
3.
15 for
33
2
2
Exercise 5.5.
√ Find the fundamental solution (x0 , y0 ) to x − 5y = 1. What is the fundamental
+unit of Z[ 5]? Compute the solutions
√ given by the square and the cube of (x0 , y0 ). What rational
number decimal approximations to 5 do they yield? To how many digits are they accurate? (Use
a calculator.)
Exercise 5.6. Exercises 5.4.4, 5.4.5.
5.5
The pigeonhole argument
The simple-minded method for determining fundamental solutions above is only practical for small
n. For instance, when n = 61, the fundamental solution is
(1766319049, 226153980)
(Bhaskara II, 12th century; Fermat). In general, one can, for instance, use the classical theory of
continued fractions. We will not go into this here, but we will prove the existence of a non-trivial
solution for all nonsquare n, which is due to Lagrange in 1768. However, we will give a proof due
to Dirichlet (ca. 1840). It uses the
Pigeonhole principle. If m > k pigeons go into k boxes, at least one must box must contain more
than 1 pigeon (finite version). If infinitely many pigeons go into k boxes, at least one box must
contain infinitely many pigeons (infinite version).
Proposition 5.8. (Dirichlet’s approximation theorem) For any nonsquare n and integer B >
1, there exist a, b ∈ Z such that 0 < b < B and
√
1
|a − b n| < .
B
(This says that
a
b
is close to
√
n.)
Proof. Consider the B − 1 irrational numbers
√
√
√
n, 2 n, . . . , (B − 1) n.
√
For each such k n, let ak ∈ N be such that
√
0 < ak − k n < 1.
Partition the interval [0, 1] into B subintervals of length B1 . Then, of the B + 1 numbers
√
√
√
0, a1 − n, a2 − n, . . . , aB−1 − (B − 1) n, 1
in [0, 1] two of them must be in the same subinterval of length B1 . Hence they are less than distance
√
1
1
B apart, i.e., their difference satisfies |a − b n| < B . Further their irrational parts must be distinct,
so we have −B < b < B with b 6= 0. If b > 0 we are done; if b < 0, simply multiply a and b by
−1.
34
√
Step 1. Fix B1 = B. Then by above, there exists |a1 − b1 n| < B1 < b11 . Let B2 > B1 such that
√
1
B2 < |a1 − b1 n|. Applying Dirichlet’s approximation again, we get a new pair (a2 , b2 ) of integers
such that
√
1
1
|a2 + b2 n| <
< .
B2
b2
√
Repeating this we see there an infinite sequence of integer pairs (a, b) such that |a − b n| gets
smaller and smaller, and
√
1
|a − b n| < .
b
for all (a, b). (This is gives a infinite sequence of increasingly good approximations.)
√
Step 2. Assume (a, b) satisfy |a − b n| < 1b . Note that
√
√
√
√
√
|a + b n| ≤ |a − b n| + |2b n| ≤ 1 + 2b n ≤ 3b n.
Then
√
√
√ 1
√
|a2 − nb2 | = |a + b n||a − b n| ≤ 3b n = 3 n.
b
√
√
√
Hence there are infinitely many a − b n ∈ Z[ n] whose norm, in absolute values, is at most 3 n.
Step 3. By successive applications of the (infinite) pigeonhole principle, we have
√
√
(i) infinitely many a − b n with the same norm N , where |N | ≤ 3 n (the norm is always an
integer)
√
(ii) infinitely many a − b n with norm N and a ≡ a0 mod N for some a0 .
√
(iii) infinitely many a − b n with norm N , a ≡ a0 mod N , b ≡ b0 mod N for some b0 .
√
√
In particular, we have two a1 − b1 n, a2 − b2 n such that they both have norm N , a1 ≡
√
√
a2 mod N , b1 ≡ b2 mod N , and a1 − b1 n 6= ±(a2 − b2 n). (It’s possible N < 0, and we define
mod N for negative N to be the same as mod |N |. However, N 6= 0 because 0 is the only element
√
of Z[ n] of norm 0.)
Step 4. Consider
√
√
√
√
a1 − b1 n
(a − b1 n)(a2 − b2 n)
a1 a2 − nb1 b2 a1 b2 − b1 a2 √
√ = 1
a+b n=
+
n.
=
2
2
N
N
a2 − b2 n
a2 − nb2
√
√
√
Since a1 − b1 n 6= ±(a2 − b2 n), surely a + b n 6= ±1. If we know a, b ∈ Z, then since
√
√
√
N (a + b n) = N (a1 − b1 n)N (a2 − b2 n)−1 = N N −1 = 1,
√
√
we get that a + b n is an element of Z[ n] of norm 1 which is not ±1.
To show that a is an integer, observe that N |a1 a2 − nb1 b2 because
a1 a2 − nb1 b2 ≡ a1 a1 − nb1 b1 ≡ a21 − nb21 ≡ 0 mod N.
The first congruence holds because a1 ≡ a2 mod N and b1 ≡ b2 mod N . Similarly, b is an integer
because
a1 b2 − b1 a2 ≡ a1 b1 − b1 a1 ≡ 0 mod N.
This proves
Theorem 5.9. If n ∈ N is nonsquare, then x2 − ny 2 = 1 has a nontrivial solution in Z, i.e., a
solution besides (±1, 0).
35
5.6
*Quadratic forms
Note: This section does not AT ALL follow what is in the text.
The ideas above can be put into a more general context. We say Q(x, y) is a binary quadratic form
if
Q(x, y) = ax2 + bxy + cy 2 .
The basic questions are, which numbers k are of the form
k = Q(x, y),
and for such n, what are the solutions (or at least, how many are there?). We answered the question
thoroughly for Q(x, y) = x2 − ny 2 (n > 0) and k = 1: 1 is always of the form x2 − ny 2 —in two
ways if n is a square and in infinitely many ways otherwise, and we showed how to determine all
solutions.
Assuming n is not a square, if k is of the form x2 − ny 2 , then there are infinitely many solutions
to
x2 − ny 2 = k,
and they are generated from a fundamental solution. The reason is that such solutions correspond
√
to elements of Z[ n] of norm k. If α has norm k, then so does µα for any µ of norm 1, and we
showed that there are infinitely many elements of norm 1 in Section 5.4. We will not deal with the
question of which k are of the form x2 − ny 2 here, but it was treated in Gauss’ Disquistiones.
The form x2 − ny 2 is called an indefinite form because it takes on positive and negative values.
The general theory of indefinite forms is similar, and another interesting example is the case of the
form
Q(x, y) = x2 + xy − y 2 .
Here the solutions to Q(x, y) = 1 are given by (F2n+1 , F2n+2 ) where Fn is the n-th Fibonacci
number
√
1+ 5
(cf. Exercise 5.8.4; F1 = F2 = 1). The form Q(x, y) is the norm of the element x + y 2 in
(
)
√
√
1+ 5
1+ 5
Z[
] := a + b
: a, b ∈ Z .
2
2
Here, the golden ratio
√
1+ 5
2
√
is a fundamental unit for Z[ 1+2 5 ], but this has norm −1. In fact the
√
solutions are generated by the powers of the fundamental +unit, 1 + 1+2
an interesting way of computing the Fibonacci numbers:
√ !n
√
3+ 5
1+ 5
= F2n+1 + F2n+2
.
2
2
5
=
√
3+ 5
2 .
Hence this gives
In fact, proving this relation (say by induction) is an alternative way of showing Exercise 5.8.4.
√ n
√
Exercise 5.7. Check that 3+2 5
= F2n−1 + F2n 1+2 5 holds for n = 1, 2, 3.
Opposed to the indefinite forms, we have the definite forms. We say Q(x, y) is positive
definite (resp. negative definite) if Q(x, y) ≥ 0 (resp Q(x, y) ≤ 0) for all x, y ∈ Z. For example,
36
x2 + ny 2 for n ∈ N is a positive definite form. (The negative definite forms are just the negatives of
positive definite forms, so it makes sense to study just the positive ones.)
In contrast to the indefinite case, it is clear that if k is of the form
x2 + ny 2 = k
there are only finitely many solutions for (x, y).
√ These are in 1-1 correspondence with the elements
of norm k in the imaginary quadratic ring Z[ −n]. While the point of view of norms is similar to
the indefinite case, the definite and indefinite cases have a rather different flavor (with the definite
case being the more easy of the two).
In the next chapter, we will study the ring of Gaussian integers Z[i], with the goal in mind of
determining which numbers are the sum of two squares x2 + y 2 . Brahmagupta composition, as we
remarked earlier, suggests that we can reduce the problem to the question of which primes are sums
of two squares, for which the pattern becomes much more apparent.
5.7
*The map of primitive vectors
5.8
*Periodicity in the map of x2 − ny 2
The material in these two optional sections is an introduction to Conway’s recent (in the last 20
years or so) new insights into a visual approach to binary quadratic forms. While the material is
interesting, we will focus on other things in this class. If you are interested in learning about it, I
recommend Conway’s own (small) book, The Sensual Quadratic Form.
5.9
Discussion
Probably worth reading.
37