Download 5 The Pell equation

5 5.1 The Pell equation Side and diagonal numbers In Hence the discovery that (1) √ ancient time, only rational numbers were thought of as numbers. √ 2 is the length of a hypoteneuse of a right triangle, and (2) 2 is irrational (which we proved in Section 3, but the Greeks also had another proof for) was quite perplexing. Hence the ancient Greeks studied the Diophantine equation x2 − 2y 2 = 1 √ to try to understand 2. They were able to produce a sequence of increasingly large solutions (xi , yi ). Note that we can rewrite such an equation as r √ x2 1 1 x = 2 + 2 ⇐⇒ = 2+ 2 → 2 2 y y y y √ as y → ∞. Hence the solutions (xi , yi ) provide increasingly good rational approximations to 2 as yi gets large. We will study the solutions in Z to the more general Pell equation x2 − ny 2 = 1 for any n ∈ N. Note Pell’s equation always has the “trivial” solutions (±1, 0). Further, the case where n is a square is easy: Exercise 5.1. If n ∈ N is a square, show the only solutions of x2 − ny 2 = 1 are (±1, 0). (Cf. Exercises 5.1.3, 5.1.4.) √ Hence, from now on, we will assume n is not a square. Then we know n is irrational from Section 2.5. To understand this equation thoroughly over Z, we need to work with another number system √ √ Z[ n] = a + b n : a, b ∈ Z . This idea of working with a larger number system than Z to study problems about Z is the basis of algebraic number theory. (Note we could also think of each Z/mZ a smaller number system than Z, which we used in Chapter 3 to study problems over Z as well. However because Z/mZ is smaller, it can typically only be used to limit the kinds of solutions we might have to an equation over Z, and √ not actually prove the existence of solutions.) Here we take for granted the existence of n, which as pointed out above, was not always known (or thought) to be a number. The reason we want to √ work specifically with the ring1 Z[ n] is of course so we can factor Pell’s equation: √ √ x2 − ny 2 = (x + y n)(x − y n) = 1. √ If x = a + b n, we say a is the rational part and b is the irrational part of x. (This is √ analogous to real and imaginary parts of complex numbers.) Note that two numbers x, y ∈ Z[ n] 1 A ring is, roughly, a number system in which you can add, subtract and multiply, but not necessarily divide. We will give the formal definition in Chapter 10. For now, you don’t need to know anything about rings; we will just get in the habit of calling some number systems rings to stress that they are somehow similar to Z. 29 are equal if and only if their rational and irrational parts are equal. The (⇐) direction is obvious. √ √ To prove the (⇒) direction, write x = a1 + b1 n and y = a2 + b2 n. Then √ x = y ⇐⇒ a1 − a2 = (b2 − b1 ) n. √ 2 If b2 − b1 6= 0, then n = ab12 −a −b1 ∈ Q which is a contradiction. Hence b2 = b1 , and therefore a1 = a2 also. You might wonder about the title of this section. I won’t cover it. See the text. 5.2 The equation x2 − 2y 2 = 1 It is simple to determine all rational solutions of x2 − 2y 2 = 1 using the rational slope (Diophantus chord) method. However determining the integer solutions is a different matter. First we observe by trial-and-error that the smallest non-trivial solution is (3, 2). Exercise 5.2. Check the following composition rule holds: (x21 − 2y12 )(x22 − 2y22 ) = x23 − 2y32 where x3 = x1 x2 + 2y1 y2 , y3 = x1 y2 + y1 x2 . Hence if (x1 , y1 ) and (x2 , y2 ) are to solutions to x2 − 2y 2 = 1, so is there composition (x3 , y3 ), defined as above. We denote (x3 , y3 ) = (x1 , y1 ) · (x2 , y2 ). We will see in the next section that the solutions to x2 − 2y 2 = 1 form a group under this operation. Further, since the definition of composition is symmetric in (x1 , y1 ) and (x2 , y2 ) this will be an abelian group. Example. (x1 , y1 ) · (±1, 0) = (±x1 , ±y1 ). Example. (3, 2) · (3, 2) = (9 + 8, 12) = (17, 12) Example. (3, 2)3 = (3, 2) · (17, 12) = (99, 70). Through composition (the “powers” of (3, 2)) we can see that we can get infinitely many solutions, each getting larger. The first three powers give the sequence of approximations 3 = 1.5 2 17 = 1.416 12 99 = 1.41428757 70 √ ≈ 2 = 1.4142135623 . . . Exercise 5.3. Compute (3, 2)4 . Use this to obtain a decimal approximation for digits is it accurate? (Use a calculator/computer.) 30 √ 2. To how many 5.3 The group of solutions This section shows that the solutions of x2 − 2y 2 = 1 form a group under the composition defined above, and this group is generated by (3, 2) and (−1, 0). However it is subsumed in the next section which treats x2 − ny 2 = 1 using norms, so I will not treat the case n = 2 separately. 5.4 √ The general Pell equation and Z[ n] To put the ideas we will use in context, let us recall some things about complex numbers. Let z ∈ C. Then we can write z = x + yi where x, y ∈ R. The complex conjugate of z is z = x − yi, and zz = (x + yi)(x − yi) = x2 + y 2 . Drawing z as a vector in the complex plane, we see that zz is the square of the length of this vector, i.e., zz = |z|2 . Define the norm of z to be N (z) = zz. Since complex conjugation respects multiplication, N (z1 z2 ) = z1 z2 z1 z2 = z1 z 1 z2 z 2 = N (z1 )N (z2 ), i.e, the norm is multiplicative. √ √ √ Similar to the complex case, define the conjugate of α = x + y n ∈ Z[ n] to be α = x − y n, and the norm of α to be √ √ N (α) = αα = (x + y n)(x − y n) = x2 − ny 2 . Note N (α) = N (α). The following lemma is clear. √ Lemma 5.1. Solutions of x2 − ny 2 = 1 are in 1-1 correspondence with the elements in Z[ n] of √ norm 1. The correspondence is given by (x, y) ↔ x + y n. Now we want to know a basic property of norms. √ Lemma 5.2. For α, β ∈ Z[ n], we have N (αβ) = N (α)N (β), i.e., N is multiplicative. √ √ Proof. Write α = x1 + y1 n, β = x2 + y2 n. Note √ √ √ αβ = x1 x2 + ny1 y2 − (x1 y2 + y1 x2 ) n = (x1 − y1 n)(x2 − y2 n) = α · β. Hence N (αβ) = αβαβ = αα · ββ = N (α)N (β). Hence if α and β have norm 1, so does αβ. In light of Lemma 5.1, this says if we have two solutions to x2 −ny 2 = 1, we can compose them to construct a third. Precisely, we can say something stronger. 31 Corollary 5.3. (Brahmagupta composition rule) If (x1 , y1 ) and (x2 , y2 ) are solutions to x21 − ny12 = a, x22 − ny 2 = b. Then the composition (x3 , y3 ) = (x1 , y1 ) · (x2 , y2 ) := (x1 x2 + ny1 y2 , x1 y2 + y1 x2 ) is a solution of x23 − ny32 = ab. Proof. We simply translate the above into a statement about norms. The hypothesis says N (x1 + √ √ y1 n) = a and N (x2 + y2 n) = b. Now observe that √ √ √ √ (x1 + y1 n)(x2 + y2 n) = x1 x2 + ny1 y2 + (x1 y2 + y1 x2 ) n = x3 + y3 n. Hence by the multiplicative property of the norm, √ √ √ x23 − ny32 = N (x3 + y3 n) = N (x1 + y1 n)N (x2 + y2 n) = ab. Note this is much nicer than the straightforward proof given on p. 82. Aside: this result says that if a and b are of the form x2 − ny 2 , so is ab. Hence if we want to ask the question which integers are of the form x2 − ny 2 , we should first determine which primes are of the form x2 − ny 2 . We will not pursue this now, however we will return to this idea when considering which numbers are sums of squares, or more generally, of the form x2 + ny 2 . (The “+” case turns out to be simpler, but still not easy.) √ √ Since Z[ n] ⊆ R, there is a natural order on Z[ n]. By Lemma 5.1, this gives us a way to order the solutions to Pell’s equation. As the case of n = 2 suggests, we want to first look for a “smallest” non-trivial solution and try to obtain all other solutions from that. If (x, y) is a solution to x2 − ny 2 = 1, so are (±x, ±y). So when we say we want a “smallest” solution, we should make a restriction like x, y > 0. Thinking in terms of elements of norm 1, note √ √ that conjugates and negatives of α = x + y n give ±x ± y n. Hence we want to look for the smallest element of norm 1 such that x, y > 0. √ √ √ Definition 5.4. The fundamental +unit2 of Z[ n] is the smallest = x + y n ∈ Z[ n] such that x, y > 0 and N () = 1. √ Lemma 5.5. The fundamental +unit of Z[ n] is well defined and always exists. √ Proof. We will show in the next section that there is always some 6= ±1 in Z[ n] such that N () = 1. By possibly taking the negative and/or conjugate of , we may assume x, y > 0. So at √ least one candidate exists. Since the set of all a + b n with a, b ∈ N is discrete in R, there must be a minimal such (i.e., such cannot get arbitrarly close to 1). 2 This is not standard terminology. One normally defines the fundamental unit, which can have norm ±1. If it has norm +1, this coincides with our definition; if it has norm -1, its square is what we are calling the fundamental +unit. 32 √ √ Example. The fundamental +unit of Z[ 2] is 3 + 2 2. Exercise 5.4. An alternative definition of fundamental unit is the smallest > 1 such that N () = 1. √ Prove that this is equivalent to the above definition as follows. Suppose = x + y n > 1 and N () = 1. Show (i) 0 < < 1. Then deduce (ii) x, y > 0. √ Theorem 5.6. Let U + = {α ∈ Z[ n] : N (α) = 1}. Then U + is an infinite abelian group under √ multplication. Furthermore, it is generated by the fundamental unit of Z[ n] and −1. Proof. Clearly the identity 1 ∈ U + and multiplication on U + is associative. Note that for any α ∈ U + , N (α) = αα = 1 implies that α = α−1 . Since N (α) = 1 also, we have α ∈ U + . Also by the multiplicative property of the norm, if α, β ∈ U + then N (αβ) = N (α)N (β) = 1 so αβ ∈ U + . This √ shows U + is a group, and it is clearly abelian because multiplication in Z[ n] is commutative. √ Now let be the fundamental +unit of Z[ n]. Suppose there exists α ∈ U + such that α 6= ±m for any m ∈ Z. By taking the negative and/or conjugate if need be, we may assume α > 1. Since is minimal and m → ∞ as m → ∞, there must be some m > 0 such that m < α < m + 1. But then 1 < α−m < and N (α−m ) = 1, contradicting the minimality of . Hence each α ∈ U + is (±) a power of . √ √ Remark. All α ∈ U + are called units of Z[ n], because like ±1, they are invertible in Z[ n]. √ The actual definition of the units of Z[ n] is the set of invertible elements, which is easy to see is precisely the set of elements of norm ±1. Hence the solutions of Pell’s equation are given by ±m where m ∈ Z. Since we know = −1 , then −m = m . Thus m and −m give essentially the same solutions. √ √ Corollary 5.7. Suppose = x0 +y0 n is a fundamental +unit of Z[ n]. Then all integer solutions √ √ to Pell’s equation x2 − ny 2 = 1 are of the form (±x, ±y) where x + y n = (x0 + y0 n)m and m ≥ 0. Equivalently, up to sign, all solutions to Pell’s equations are given by non-negative powers (in the sense of Brahmagupta composition) of the fundamental solution (x0 , y0 ). √ √ m Example. Up to sign, all non-trivial solutions of x2 − 2y 2 = 1 are given by (x + y 2) = (3 + 2 2) √ m for m > 0, i.e., x and y are the rational and irrational parts of (3 + 2 2) . The book says little about how to find fundamental solutions (called smallest positive solutions in the text). By rewriting Pell’s equation as x2 = ny 2 + 1 it becomes clear that we can find the fundamental solution (or fundamental +unit) by finding the smallest y > 0 such that ny 2 + 1 is a square. This will give the smallest x > 0 which solves √ x2 − ny 2 = 1, i.e., x and y are simultaneously minimal for this solution, making x + y n minimal (with x, y > 0) among U + . 2 2 Example. Since 3 · 12 + 1 is a square, the smallest positive √ √ (fundamental) solution to x − 3y = 1 is (2, 1). Hence the fundamental +unit of Z[ 3] is 2 + 3. Up to sign, all solutions are powers of (2, 1), e.g., (2, 1)2 = (7, 4) and (2, 1)3 = (26, 15). This provides the successive approximations 12 , 74 , √ 26 3. 15 for 33 2 2 Exercise 5.5. √ Find the fundamental solution (x0 , y0 ) to x − 5y = 1. What is the fundamental +unit of Z[ 5]? Compute the solutions √ given by the square and the cube of (x0 , y0 ). What rational number decimal approximations to 5 do they yield? To how many digits are they accurate? (Use a calculator.) Exercise 5.6. Exercises 5.4.4, 5.4.5. 5.5 The pigeonhole argument The simple-minded method for determining fundamental solutions above is only practical for small n. For instance, when n = 61, the fundamental solution is (1766319049, 226153980) (Bhaskara II, 12th century; Fermat). In general, one can, for instance, use the classical theory of continued fractions. We will not go into this here, but we will prove the existence of a non-trivial solution for all nonsquare n, which is due to Lagrange in 1768. However, we will give a proof due to Dirichlet (ca. 1840). It uses the Pigeonhole principle. If m > k pigeons go into k boxes, at least one must box must contain more than 1 pigeon (finite version). If infinitely many pigeons go into k boxes, at least one box must contain infinitely many pigeons (infinite version). Proposition 5.8. (Dirichlet’s approximation theorem) For any nonsquare n and integer B > 1, there exist a, b ∈ Z such that 0 < b < B and √ 1 |a − b n| < . B (This says that a b is close to √ n.) Proof. Consider the B − 1 irrational numbers √ √ √ n, 2 n, . . . , (B − 1) n. √ For each such k n, let ak ∈ N be such that √ 0 < ak − k n < 1. Partition the interval [0, 1] into B subintervals of length B1 . Then, of the B + 1 numbers √ √ √ 0, a1 − n, a2 − n, . . . , aB−1 − (B − 1) n, 1 in [0, 1] two of them must be in the same subinterval of length B1 . Hence they are less than distance √ 1 1 B apart, i.e., their difference satisfies |a − b n| < B . Further their irrational parts must be distinct, so we have −B < b < B with b 6= 0. If b > 0 we are done; if b < 0, simply multiply a and b by −1. 34 √ Step 1. Fix B1 = B. Then by above, there exists |a1 − b1 n| < B1 < b11 . Let B2 > B1 such that √ 1 B2 < |a1 − b1 n|. Applying Dirichlet’s approximation again, we get a new pair (a2 , b2 ) of integers such that √ 1 1 |a2 + b2 n| < < . B2 b2 √ Repeating this we see there an infinite sequence of integer pairs (a, b) such that |a − b n| gets smaller and smaller, and √ 1 |a − b n| < . b for all (a, b). (This is gives a infinite sequence of increasingly good approximations.) √ Step 2. Assume (a, b) satisfy |a − b n| < 1b . Note that √ √ √ √ √ |a + b n| ≤ |a − b n| + |2b n| ≤ 1 + 2b n ≤ 3b n. Then √ √ √ 1 √ |a2 − nb2 | = |a + b n||a − b n| ≤ 3b n = 3 n. b √ √ √ Hence there are infinitely many a − b n ∈ Z[ n] whose norm, in absolute values, is at most 3 n. Step 3. By successive applications of the (infinite) pigeonhole principle, we have √ √ (i) infinitely many a − b n with the same norm N , where |N | ≤ 3 n (the norm is always an integer) √ (ii) infinitely many a − b n with norm N and a ≡ a0 mod N for some a0 . √ (iii) infinitely many a − b n with norm N , a ≡ a0 mod N , b ≡ b0 mod N for some b0 . √ √ In particular, we have two a1 − b1 n, a2 − b2 n such that they both have norm N , a1 ≡ √ √ a2 mod N , b1 ≡ b2 mod N , and a1 − b1 n 6= ±(a2 − b2 n). (It’s possible N < 0, and we define mod N for negative N to be the same as mod |N |. However, N 6= 0 because 0 is the only element √ of Z[ n] of norm 0.) Step 4. Consider √ √ √ √ a1 − b1 n (a − b1 n)(a2 − b2 n) a1 a2 − nb1 b2 a1 b2 − b1 a2 √ √ = 1 a+b n= + n. = 2 2 N N a2 − b2 n a2 − nb2 √ √ √ Since a1 − b1 n 6= ±(a2 − b2 n), surely a + b n 6= ±1. If we know a, b ∈ Z, then since √ √ √ N (a + b n) = N (a1 − b1 n)N (a2 − b2 n)−1 = N N −1 = 1, √ √ we get that a + b n is an element of Z[ n] of norm 1 which is not ±1. To show that a is an integer, observe that N |a1 a2 − nb1 b2 because a1 a2 − nb1 b2 ≡ a1 a1 − nb1 b1 ≡ a21 − nb21 ≡ 0 mod N. The first congruence holds because a1 ≡ a2 mod N and b1 ≡ b2 mod N . Similarly, b is an integer because a1 b2 − b1 a2 ≡ a1 b1 − b1 a1 ≡ 0 mod N. This proves Theorem 5.9. If n ∈ N is nonsquare, then x2 − ny 2 = 1 has a nontrivial solution in Z, i.e., a solution besides (±1, 0). 35 5.6 *Quadratic forms Note: This section does not AT ALL follow what is in the text. The ideas above can be put into a more general context. We say Q(x, y) is a binary quadratic form if Q(x, y) = ax2 + bxy + cy 2 . The basic questions are, which numbers k are of the form k = Q(x, y), and for such n, what are the solutions (or at least, how many are there?). We answered the question thoroughly for Q(x, y) = x2 − ny 2 (n > 0) and k = 1: 1 is always of the form x2 − ny 2 —in two ways if n is a square and in infinitely many ways otherwise, and we showed how to determine all solutions. Assuming n is not a square, if k is of the form x2 − ny 2 , then there are infinitely many solutions to x2 − ny 2 = k, and they are generated from a fundamental solution. The reason is that such solutions correspond √ to elements of Z[ n] of norm k. If α has norm k, then so does µα for any µ of norm 1, and we showed that there are infinitely many elements of norm 1 in Section 5.4. We will not deal with the question of which k are of the form x2 − ny 2 here, but it was treated in Gauss’ Disquistiones. The form x2 − ny 2 is called an indefinite form because it takes on positive and negative values. The general theory of indefinite forms is similar, and another interesting example is the case of the form Q(x, y) = x2 + xy − y 2 . Here the solutions to Q(x, y) = 1 are given by (F2n+1 , F2n+2 ) where Fn is the n-th Fibonacci number √ 1+ 5 (cf. Exercise 5.8.4; F1 = F2 = 1). The form Q(x, y) is the norm of the element x + y 2 in ( ) √ √ 1+ 5 1+ 5 Z[ ] := a + b : a, b ∈ Z . 2 2 Here, the golden ratio √ 1+ 5 2 √ is a fundamental unit for Z[ 1+2 5 ], but this has norm −1. In fact the √ solutions are generated by the powers of the fundamental +unit, 1 + 1+2 an interesting way of computing the Fibonacci numbers: √ !n √ 3+ 5 1+ 5 = F2n+1 + F2n+2 . 2 2 5 = √ 3+ 5 2 . Hence this gives In fact, proving this relation (say by induction) is an alternative way of showing Exercise 5.8.4. √ n √ Exercise 5.7. Check that 3+2 5 = F2n−1 + F2n 1+2 5 holds for n = 1, 2, 3. Opposed to the indefinite forms, we have the definite forms. We say Q(x, y) is positive definite (resp. negative definite) if Q(x, y) ≥ 0 (resp Q(x, y) ≤ 0) for all x, y ∈ Z. For example, 36 x2 + ny 2 for n ∈ N is a positive definite form. (The negative definite forms are just the negatives of positive definite forms, so it makes sense to study just the positive ones.) In contrast to the indefinite case, it is clear that if k is of the form x2 + ny 2 = k there are only finitely many solutions for (x, y). √ These are in 1-1 correspondence with the elements of norm k in the imaginary quadratic ring Z[ −n]. While the point of view of norms is similar to the indefinite case, the definite and indefinite cases have a rather different flavor (with the definite case being the more easy of the two). In the next chapter, we will study the ring of Gaussian integers Z[i], with the goal in mind of determining which numbers are the sum of two squares x2 + y 2 . Brahmagupta composition, as we remarked earlier, suggests that we can reduce the problem to the question of which primes are sums of two squares, for which the pattern becomes much more apparent. 5.7 *The map of primitive vectors 5.8 *Periodicity in the map of x2 − ny 2 The material in these two optional sections is an introduction to Conway’s recent (in the last 20 years or so) new insights into a visual approach to binary quadratic forms. While the material is interesting, we will focus on other things in this class. If you are interested in learning about it, I recommend Conway’s own (small) book, The Sensual Quadratic Form. 5.9 Discussion Probably worth reading. 37

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download 5 The Pell equation