Script: Diophantine Approximation
A. Kresch
Spring 2016

1 Elementary theory

Diophantine approximation is the study of approximations of real numbers by rational numbers. Given a real number $\alpha$, we ask, how well can we approximate $\alpha$ by a rational number $p/q$? For a fixed value of $q$, the real number $q\alpha$ differs from some integer by an amount less than 1:
\[ \forall\, q \in \mathbb{N}_{>0}\ \ \exists\, p \in \mathbb{Z} : \left|\alpha - \frac{p}{q}\right| < \frac{1}{q}. \]
Suppose we no longer fix the value of $q$, but allow $q$ to vary. Our first task will be to obtain an improved bound, in which $1/q$ is replaced by $1/q^2$; when $\alpha$ is irrational the improved bound
\[ \exists\, p \in \mathbb{Z} : \left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2} \]
will be satisfied for infinitely many values of $q$. When we attempt to replace 2 by a greater exponent on the right-hand side, the validity of such a bound is connected with the nature of the number $\alpha$, and particularly whether $\alpha$ is algebraic or transcendental. We will obtain results that, when $\alpha$ is algebraic, restrict the exponents for which such a bound may be attained for more than just finitely many values of $q$. The first of these, Liouville's theorem, is elementary in nature, having been presented in connection with the historically first transcendental numbers to be exhibited (Liouville 1844).

1.1 Dirichlet's approximation theorem

The proof of Dirichlet's theorem on Diophantine approximation, which furnishes the improved bound mentioned above, is a beautiful application of the pigeonhole principle, the assertion that, given a positive integer $Q$, any function mapping a set of cardinality greater than $Q$ to a set of cardinality $Q$ must assign the same value to some pair of distinct elements of the domain.

Proposition 1.1 (Dirichlet's approximation theorem). Let $\alpha \in \mathbb{R}$ and $Q \in \mathbb{N}_{>0}$. Then there exist integers $p$ and $q$ with $1 \le q \le Q$, such that
\[ \left|\alpha - \frac{p}{q}\right| < \frac{1}{Qq}. \]

Proof. We consider the function from $\{0, 1, \dots, Q\}$ to $\{0, 1, \dots, Q-1\}$, defined by
\[ q \mapsto \lfloor Q(q\alpha - \lfloor q\alpha \rfloor) \rfloor. \]
In other words, the fractional part $q\alpha - \lfloor q\alpha \rfloor$ of $q\alpha$ is rounded down to the nearest multiple of $1/Q$, and the function records the multiple of $1/Q$ obtained in this manner. By the pigeonhole principle, there exist integers $0 \le q < q' \le Q$ where the function has the same value; in particular,
\[ |q'\alpha - q\alpha - p| < \frac{1}{Q} \]
for some integer $p$. It follows that
\[ \left|\alpha - \frac{p}{q' - q}\right| < \frac{1}{Q(q' - q)}. \]
Since $1 \le q' - q \le Q$, we have the desired conclusion.

Corollary 1.2. For $\alpha \in \mathbb{R}$ the set of rational numbers $p/q$ satisfying
\[ \left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2}, \tag{1} \]
where $p$ and $q$ are relatively prime integers, is:
(i) infinite, if $\alpha$ is irrational;
(ii) finite, if $\alpha$ is rational, in which case there exists real $C > 0$ such that for any integers $p$ and $q$ with $q > 0$ and $p/q \ne \alpha$ we have
\[ \left|\alpha - \frac{p}{q}\right| \ge \frac{C}{q}, \]
and more generally for any real number $s > 1$ the inequality, as in (1) but with exponent 2 replaced by $s$, is also satisfied for only finitely many rational numbers $p/q$.

Proof. For any $Q \in \mathbb{N}_{>0}$, the $p$ and $q$ that we obtain from Proposition 1.1 may be taken to be relatively prime (by dividing out their gcd) and yield $p/q$ satisfying (1). If $\alpha \notin \mathbb{Q}$ then for any finite set $S$ of rational numbers, with sufficiently large $Q$ the conclusion of Proposition 1.1 does not hold for any element of $S$. Therefore the set of rational numbers satisfying (1) is infinite. If $\alpha \in \mathbb{Q}$, say, $\alpha = p_0/q_0$ for integers $p_0$ and $q_0$ with $q_0 > 0$, then
\[ \alpha - \frac{p}{q} = \frac{p_0 q - p q_0}{q_0 q}. \]
So (ii) holds with $C = 1/q_0$ and implies, for any integers $p$ and $q$ with $q \ge q_0^{1/(s-1)}$ and $p/q \ne \alpha$, that $|\alpha - p/q| \ge 1/(q_0 q) \ge 1/q^s$.

Notice that Corollary 1.2(ii) yields an amusingly simple proof of the irrationality of the number $e$:
\[ 0 < e - \sum_{k=0}^{n} \frac{1}{k!} = \sum_{k=n+1}^{\infty} \frac{1}{k!} < \frac{2}{(n+1)!} = \frac{2}{n+1} \cdot \frac{1}{n!} \]
for $n \in \mathbb{N}$. (The proof of the transcendence of $e$, as presented in the Algebra I lecture and discussed below, is more sophisticated and makes use of a nontrivial auxiliary function.)
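The pigeonhole argument in the proof of Proposition 1.1 is constructive, and can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `dirichlet_pair` is hypothetical, and $\alpha$ is passed as an exact `Fraction` so that floors and comparisons involve no rounding error.

```python
from fractions import Fraction
from math import floor


def dirichlet_pair(alpha, Q):
    """Return integers (p, q) with 1 <= q <= Q and |alpha - p/q| < 1/(Q*q).

    Hypothetical helper following the pigeonhole proof: the Q + 1 values
    q = 0, ..., Q are sorted into Q boxes according to
    floor(Q * fractional part of q*alpha); two of them must share a box.
    """
    seen = {}  # box index -> the value of q that landed there first
    for q in range(Q + 1):
        frac_part = q * alpha - floor(q * alpha)
        box = floor(Q * frac_part)  # an integer in {0, ..., Q - 1}
        if box in seen:
            q_small = seen[box]
            # The difference of the two fractional parts is below 1/Q.
            p = floor(q * alpha) - floor(q_small * alpha)
            return p, q - q_small
        seen[box] = q
    raise AssertionError("unreachable: the pigeonhole principle guarantees a collision")


if __name__ == "__main__":
    alpha = Fraction(31415926535, 10**10)  # a rational stand-in for pi
    for Q in (1, 10, 100):
        p, q = dirichlet_pair(alpha, Q)
        print(Q, p, q, float(abs(alpha - Fraction(p, q))))
```

The returned pair is not reduced to lowest terms; dividing out the gcd, as in the proof of Corollary 1.2, only improves the bound.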
The $n$th Farey sequence is the increasing sequence of rational numbers between 0 and 1 (including the endpoints), with denominator less than or equal to $n$. E.g., for $n = 5$,
\[ 0,\ \frac15,\ \frac14,\ \frac13,\ \frac25,\ \frac12,\ \frac35,\ \frac23,\ \frac34,\ \frac45,\ 1. \]

Lemma 1.3. If $p/q$ and $p'/q'$ are adjacent terms in the $n$th Farey sequence for some $n$, where $p$, $q$, $p'$, $q'$ are integers with $\gcd(p, q) = \gcd(p', q') = 1$, then $|p'q - pq'| = 1$.

Proof. There is no loss of generality in assuming that $p$, $q$, $p'$, $q'$ are nonnegative, and
\[ \frac{p}{q} < \frac{p'}{q'}. \]
We first observe, there are uniquely determined integers $r$ and $s$ satisfying $qr - ps = 1$ and $1 \le s \le q$. Since $q - p \ge 1 = qr - ps$, which implies $q(s-1) \ge p(s-1) \ge q(r-1)$, we have $s \ge r > 0$. In particular, $r/s$ occurs in the $n$th Farey sequence, somewhere to the right of $p/q$.

We prove the lemma by contradiction. Suppose $p'q - pq' \ge 2$. Since $p'/q'$ is adjacent to $p/q$ in the $n$th Farey sequence, we must have
\[ \frac{p'}{q'} < \frac{r}{s}. \]
We claim, $p' - (p'q - pq' - 1)r$ and $q' - (p'q - pq' - 1)s$ are positive, relatively prime integers satisfying
\[ \frac{p}{q} < \frac{p' - (p'q - pq' - 1)r}{q' - (p'q - pq' - 1)s} < \frac{p'}{q'}. \]
Indeed,
\[ p'q - (p'q - pq' - 1)qr - \bigl(pq' - (p'q - pq' - 1)ps\bigr) = p'q - pq' - (p'q - pq' - 1) = 1, \]
\[ p'q' - (p'q - pq' - 1)p's - \bigl(p'q' - (p'q - pq' - 1)q'r\bigr) = (p'q - pq' - 1)(q'r - p's) > 0, \]
\[ q' - (p'q - pq' - 1)s = q(q'r - p's) + s > 0. \]
The claim stands in contradiction to the hypothesis that $p'/q'$ is adjacent to $p/q$ in the $n$th Farey sequence.

Lemma 1.4. (i) If $p/q$ and $p'/q'$ are adjacent terms in the $n$th Farey sequence, for some integers $p$, $q$, $p'$, $q'$ with $q, q' > 0$ and $\gcd(p, q) = \gcd(p', q') = 1$, then
\[ (p' + p)q - p(q + q') = p'(q + q') - q'(p + p') = 1, \]
and the rational number
\[ \frac{p + p'}{q + q'} \]
lies between $p/q$ and $p'/q'$.
(ii) If $p/q$, $p''/q''$, and $p'/q'$ are three adjacent terms in the $n$th Farey sequence, for some integers $p$, $q$, $p''$, $q''$, $p'$, $q'$ with $q, q'', q' > 0$ and $\gcd(p, q) = \gcd(p'', q'') = \gcd(p', q') = 1$, then
\[ \frac{p''}{q''} = \frac{p + p'}{q + q'}. \]

Proof. For (i) we may suppose without loss of generality, $p/q < p'/q'$. So $p'q - pq' = 1$ by Lemma 1.3. This implies $(p' + p)q - p(q + q') = 1$ and $p'(q + q') - q'(p + p') = 1$. As a consequence,
\[ \frac{p}{q} < \frac{p + p'}{q + q'} < \frac{p'}{q'}. \]
For (ii) we may suppose without loss of generality, $p/q < p''/q'' < p'/q'$. By Lemma 1.3, $p''q - pq'' = p'q'' - p''q' = 1$. So
\[ pq'' + p'q'' - p''q - p''q' = -1 + 1 = 0. \]

Proposition 1.5. If $p/q$ and $p'/q'$ are adjacent terms in the $n$th Farey sequence, for some integers $p$, $q$, $p'$, $q'$ with $q, q' > 0$ and $\gcd(p, q) = \gcd(p', q') = 1$, then for every real number $\alpha$ between $p/q$ and $p'/q'$ at least one of the following inequalities holds:
\[ \left|\alpha - \frac{p}{q}\right| < \frac{1}{\sqrt5\, q^2}, \qquad \left|\alpha - \frac{p + p'}{q + q'}\right| < \frac{1}{\sqrt5\, (q + q')^2}, \qquad \left|\alpha - \frac{p'}{q'}\right| < \frac{1}{\sqrt5\, q'^2}. \]

Proof. As before we may suppose $p/q < p'/q'$. Define $p'' := p + p'$ and $q'' := q + q'$. We argue by contradiction. Suppose, first, we have
\[ \frac{p}{q} < \alpha < \frac{p''}{q''} \]
and all three inequalities fail, i.e.,
\[ \alpha - \frac{p}{q} \ge \frac{1}{\sqrt5\, q^2}, \qquad \frac{p''}{q''} - \alpha \ge \frac{1}{\sqrt5\, q''^2}, \qquad \frac{p'}{q'} - \alpha \ge \frac{1}{\sqrt5\, q'^2}. \]
Adding pairs of inequalities and applying Lemmas 1.3 and 1.4(i), we obtain
\[ \frac{1}{qq''} = \frac{p''}{q''} - \frac{p}{q} \ge \frac{1}{\sqrt5}\left(\frac{1}{q^2} + \frac{1}{q''^2}\right) \]
and
\[ \frac{1}{qq'} = \frac{p'}{q'} - \frac{p}{q} \ge \frac{1}{\sqrt5}\left(\frac{1}{q^2} + \frac{1}{q'^2}\right). \]
So we have
\[ \sqrt5\, qq'' \ge q^2 + q''^2 \qquad\text{and}\qquad \sqrt5\, qq' \ge q^2 + q'^2. \]
Adding, expanding, and rearranging, we obtain
\[ 0 \ge 2\left(\frac{\sqrt5 - 1}{2}\, q - q'\right)^2, \]
which is impossible. The argument for the case $p''/q'' < \alpha < p'/q'$ is similar.

Corollary 1.6 (Hurwitz). For every irrational number $\alpha \in \mathbb{R}$ there exist infinitely many rational numbers $p/q$, where $p$ and $q$ are relatively prime integers satisfying
\[ \left|\alpha - \frac{p}{q}\right| < \frac{1}{\sqrt5\, q^2}. \]

Proof. There is no loss of generality in supposing that $0 < \alpha < 1$.
Then, for every $n \in \mathbb{N}_{>0}$, Proposition 1.5 supplies a rational number $p/q$ that satisfies the desired condition. It remains to show that there are infinitely many such rational numbers. Given any finite set $S$ of rational numbers, we take $n$ to be large enough, so that $|\gamma - \alpha| > 1/n$ for every $\gamma \in S$. The adjacent terms $p/q$ and $p'/q'$ in the $n$th Farey sequence, with $p/q < \alpha < p'/q'$, are therefore not in $S$, and as well $(p + p')/(q + q')$ is not in $S$ by Lemma 1.4(i). So Proposition 1.5 supplies a rational number $p/q$ that satisfies the desired condition and is not in $S$.

The constant in Corollary 1.6 is optimal. Indeed, when
\[ \alpha = \frac{\sqrt5 - 1}{2} \]
we have, for any $\varepsilon > 0$, that there are only finitely many rational numbers $p/q$, with $p$ and $q$ relatively prime integers satisfying
\[ \left|\alpha - \frac{p}{q}\right| < \frac{1}{(\sqrt5 + \varepsilon)q^2}. \tag{2} \]
We see this by writing
\[ x^2 + x - 1 = \left(x - \frac{\sqrt5 - 1}{2}\right)\left(x - \frac{-\sqrt5 - 1}{2}\right). \]
So,
\[ \left|\alpha - \frac{p}{q}\right| = \frac{|p^2 + pq - q^2|}{\left|\frac{-\sqrt5 - 1}{2} - \frac{p}{q}\right| q^2}. \tag{3} \]
When $|(-\sqrt5 - 1)/2 - p/q| > \sqrt5 + \varepsilon$ we have $|\alpha - p/q| > \varepsilon$, and this is compatible with (2) only for $q^2 < \varepsilon^{-1}(\sqrt5 + \varepsilon)^{-1}$, i.e., only for finitely many values of $q$. When $|(-\sqrt5 - 1)/2 - p/q| \le \sqrt5 + \varepsilon$, the right-hand side of (3) is at least $1/((\sqrt5 + \varepsilon)q^2)$.

We recall, an algebraic number is a root of a nontrivial polynomial with integer coefficients. An algebraic number $\alpha$ has a minimal polynomial, the unique monic polynomial in $\mathbb{Q}[x]$ that divides every polynomial $f \in \mathbb{Q}[x]$ satisfying $f(\alpha) = 0$. The degree of an algebraic number $\alpha$ is the degree of the minimal polynomial of $\alpha$.

Proposition 1.7 (Liouville). Let $\alpha \in \mathbb{R}$ be an algebraic number of degree $n \ge 2$. Then there exists a positive constant $C$ such that every rational number $p/q$, where $p$ and $q$ are relatively prime integers with $q > 0$, satisfies
\[ \left|\alpha - \frac{p}{q}\right| \ge \frac{C}{q^n}. \]

The proof follows the same argument as that given to establish the optimality of the constant in Hurwitz's theorem, with the minimal polynomial of $\alpha$ factored into $n$ linear factors in $\mathbb{C}[x]$.

Proof. Let $f \in \mathbb{Q}[x]$ be the minimal polynomial of $\alpha$, and let $M \in \mathbb{N}_{>0}$ be a common denominator of the coefficients of $f$, so that we may write
\[ f(x) = \frac{1}{M} \sum_{i=0}^{n} A_i x^i \]
with $A_i \in \mathbb{Z}$ for all $i$. We factor $f(x)$ over the complex numbers as
\[ f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n), \]
with $\alpha_1, \dots, \alpha_n \in \mathbb{C}$ and $\alpha_1 = \alpha$. We have the following equality, where the numerator on the right-hand side is a nonzero integer:
\[ \left|\alpha - \frac{p}{q}\right| = \frac{|A_n p^n + A_{n-1} p^{n-1} q + \cdots + A_0 q^n|}{M \left|\alpha_2 - \frac{p}{q}\right| \cdots \left|\alpha_n - \frac{p}{q}\right| q^n}. \]
We claim, the result holds with
\[ C := \frac{1}{M \prod_{i=2}^{n} (|\alpha_i - \alpha| + 1)}. \]
Indeed, we have
\[ \left|\alpha - \frac{p}{q}\right| \ge \frac{1}{M q^n \prod_{i=2}^{n} \left|\alpha_i - \frac{p}{q}\right|} \ge \frac{1}{M q^n \prod_{i=2}^{n} \left(|\alpha_i - \alpha| + \left|\alpha - \frac{p}{q}\right|\right)}, \]
and this implies the result.

Corollary 1.8. Let $\alpha \in \mathbb{R}$ be an algebraic number of degree $n \ge 2$. Then for every real number $s > n$, the inequality
\[ \left|\alpha - \frac{p}{q}\right| < \frac{1}{q^s} \]
is satisfied for at most finitely many rational numbers $p/q$, with $p$ and $q$ relatively prime integers and $q > 0$.

Proof. Let $C$ be as in Proposition 1.7, and let $s > n$. For any positive integer $q$ with
\[ q^{s-n} \ge \frac{1}{C}, \]
we have, for any integer $p$,
\[ \left|\alpha - \frac{p}{q}\right| \ge \frac{C}{q^n} \ge \frac{1}{q^s}. \]
So for any rational number $p/q$ satisfying the inequality in the statement we must have $q < C^{-1/(s-n)}$.

Corollary 1.9. Let $\alpha \in \mathbb{R}$ be an irrational number. Suppose that there exist a sequence $(p_i/q_i)$ of rational numbers tending to $\alpha$, with $p_i$ and $q_i$ relatively prime integers and $q_i > 0$ for every $i$, and an unbounded increasing sequence of real numbers $(s_i)$, with
\[ \left|\alpha - \frac{p_i}{q_i}\right| < \frac{1}{q_i^{s_i}} \]
for every $i$. Then $\alpha$ is transcendental.

Proof. By Corollary 1.2(ii), $\alpha$ is irrational. If we suppose that $\alpha$ is algebraic of degree $n \ge 2$, then by taking $i$ such that $s_i > n$ we obtain a contradiction to Corollary 1.8 for $s = s_i$.

Example. For Liouville's number
\[ \sum_{j=1}^{\infty} \frac{1}{10^{j!}}, \]
we may take for $p_i/q_i$ the partial sums. We have, then, $q_i = 10^{i!}$. The hypothesis of Corollary 1.9 is satisfied with $s_i = i$, hence Liouville's number is transcendental.

Definition.
The irrationality measure of a real number $\alpha$ is the supremum of the set of real numbers $s$ such that
\[ \left|\alpha - \frac{p}{q}\right| < \frac{1}{q^s} \]
holds for infinitely many rational numbers $p/q$, with relatively prime integers $p$ and $q$ and $q > 0$.

The observations and results stated so far tell us:
• Every real number has irrationality measure at least 1.
• If $\alpha \in \mathbb{Q}$ then the irrationality measure of $\alpha$ is equal to 1.
• If $\alpha \notin \mathbb{Q}$ then the irrationality measure of $\alpha$ is at least 2.
• If $\alpha$ is algebraic of degree $n \ge 2$ then the irrationality measure of $\alpha$ is $\le n$.
• As consequences:
  – Quadratic irrational numbers have irrationality measure 2.
  – If $\alpha$ has infinite irrationality measure, then $\alpha$ is transcendental.

It is also a nice exercise to show that the real numbers with irrationality measure greater than 2 form a set of Lebesgue measure zero. A highlight of this lecture will be Roth's theorem, which is the statement that irrational algebraic numbers all have irrationality measure equal to 2.

1.2 Continued fractions

An important source of approximations of an irrational number $\alpha \in \mathbb{R}$ is the continued fraction expansion, an expression of the form
\[ a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}} \tag{1} \]
with $a_0 \in \mathbb{Z}$ and $a_i \in \mathbb{N}_{>0}$ for $i \ge 1$. In the other direction, given the sequence $(a_i)$ we may define the above expression by truncating and passing to the limit.

Definition. Let $a_0, a_1, \dots$ be a sequence of integers with $a_i > 0$ for $i \ge 1$. We define rational numbers $[a_0, \dots, a_n]$ for $n \in \mathbb{N}$, called convergents, recursively by
\[ [a_0] := a_0, \qquad [a_0, a_1, \dots, a_n] := a_0 + \frac{1}{[a_1, \dots, a_n]}. \]
We will see soon that the sequence of numbers $[a_0, \dots, a_n]$ does, in fact, converge. First we record some basic properties.

Proposition 1.10. Given a sequence of integers $a_0, a_1, \dots$ with $a_i > 0$ for $i \ge 1$, we define
\[ p_{-1} := 1, \quad q_{-1} := 0, \quad p_0 := a_0, \quad q_0 := 1, \quad c_0 := a_0, \]
and recursively, for $n > 0$,
\[ p_n := a_n p_{n-1} + p_{n-2}, \qquad q_n := a_n q_{n-1} + q_{n-2}, \qquad c_n := \frac{p_n}{q_n}. \]
Then $(q_n)_{n \in \mathbb{N}_{>0}}$ is an increasing sequence of positive integers, $p_n$ and $q_n$ are relatively prime for every $n$, and the following identities are valid for $n \in \mathbb{N}_{>0}$:
\[ \begin{pmatrix} p_n & p_{n-1} \\ q_n & q_{n-1} \end{pmatrix} = \begin{pmatrix} p_{n-1} & p_{n-2} \\ q_{n-1} & q_{n-2} \end{pmatrix} \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix} \tag{2} \]
\[ p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1} \tag{3} \]
\[ p_n q_{n-2} - p_{n-2} q_n = (-1)^n a_n \tag{4} \]
\[ c_n = [a_0, a_1, \dots, a_n]. \tag{5} \]

Proof. That $(q_n)_{n \in \mathbb{N}_{>0}}$ is an increasing sequence is clear from the definition, as is equation (2). Evaluating determinants, we obtain (3) by an inductive argument, and deduce as a consequence that $p_n$ and $q_n$ are relatively prime. A matrix equation, similar to (2) but with the two subscripts $n-1$ changed to $n-2$ on the left-hand side and the 0 and 1 in the rightmost column swapped on the right-hand side, lets us deduce (by evaluating determinants) equation (4). To obtain (5), we define a new sequence by $\tilde a_n := a_{n+1}$, and with this, sequences $(\tilde p_n)$ and $(\tilde q_n)$, which by an inductive argument are seen to satisfy $p_n = a_0 \tilde p_{n-1} + \tilde q_{n-1}$ and $q_n = \tilde p_{n-1}$. (This inductive argument takes the previous two cases as induction hypothesis.) Now (5) follows by straightforward induction on $n$.

Corollary 1.11. With notation as above, the numbers $c_n = [a_0, a_1, \dots, a_n]$ form a convergent sequence. If we define $\alpha := \lim_{n\to\infty} c_n$ then the following properties hold.
(i) The subsequence $(c_{2n})$ increases toward $\alpha$.
(ii) The subsequence $(c_{2n+1})$ decreases toward $\alpha$.
(iii) We have $\alpha \notin \mathbb{Q}$.
(iv) For every $n \in \mathbb{N}$ we have
\[ \frac{1}{(a_{n+1} + 2)q_n^2} < \left|\alpha - \frac{p_n}{q_n}\right| < \frac{1}{a_{n+1} q_n^2}. \]

Proof. We have, by (3) and (4),
\[ c_0 < c_2 < c_4 < \cdots < c_5 < c_3 < c_1, \]
and
\[ |c_{n+1} - c_n| = \frac{1}{(a_{n+1} q_n + q_{n-1}) q_n} \le \frac{1}{a_{n+1} q_n^2} \]
for $n \in \mathbb{N}$. So $\lim_{n\to\infty} c_n$ exists, and the limit $\alpha$ satisfies properties (i), (ii), and the portion $|\alpha - c_n| < 1/(a_{n+1} q_n^2)$ of (iv), which by Corollary 1.2(ii) implies (iii). As well, $|c_{n+2} - c_n| < |\alpha - c_n|$, and by the calculation
\[ |c_{n+2} - c_n| = \frac{a_{n+2}}{(a_{n+2} a_{n+1} + 1) q_n^2 + a_{n+2} q_{n-1} q_n} \ge \frac{a_{n+2}}{(a_{n+2} a_{n+1} + a_{n+2} + 1) q_n^2} \ge \frac{1}{(a_{n+1} + 2) q_n^2} \]
we obtain the remaining part of (iv).
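The recursion of Proposition 1.10 translates directly into code. The following Python sketch is an illustration only (the function name `convergents` and the sample partial quotients are choices made here, not part of the script); it computes the pairs $(p_n, q_n)$ and checks identity (3) and the interlacing $c_0 < c_2 < \cdots < c_3 < c_1$ from the proof of Corollary 1.11 on a sample sequence, using exact rational arithmetic.

```python
from fractions import Fraction


def convergents(a):
    """Return [(p_0, q_0), ..., (p_n, q_n)] for partial quotients a = [a_0, ..., a_n].

    Uses the recursion p_n = a_n p_{n-1} + p_{n-2}, q_n = a_n q_{n-1} + q_{n-2}
    with starting values p_{-1} = 1, q_{-1} = 0, p_0 = a_0, q_0 = 1.
    """
    p_prev, q_prev = 1, 0  # p_{-1}, q_{-1}
    p, q = a[0], 1         # p_0, q_0
    result = [(p, q)]
    for a_n in a[1:]:
        p, p_prev = a_n * p + p_prev, p
        q, q_prev = a_n * q + q_prev, q
        result.append((p, q))
    return result


if __name__ == "__main__":
    # Sample partial quotients, chosen for illustration.
    pairs = convergents([2, 1, 2, 1, 1, 4, 1, 1, 6])

    # Identity (3): p_n q_{n-1} - p_{n-1} q_n = (-1)^(n-1) for n >= 1.
    for n in range(1, len(pairs)):
        (p, q), (p_before, q_before) = pairs[n], pairs[n - 1]
        assert p * q_before - p_before * q == (-1) ** (n - 1)

    # Even-indexed convergents increase, odd-indexed ones decrease.
    c = [Fraction(p, q) for p, q in pairs]
    assert c[0] < c[2] < c[4] < c[5] < c[3] < c[1]
    print(pairs[-1])
```

Identity (3) also shows that each pair is automatically in lowest terms, so `Fraction(p, q)` performs no reduction here.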
Definition. Given a sequence of integers $a_0, a_1, \dots$ with $a_i > 0$ for $i \ge 1$, the corresponding infinite continued fraction shown in (1) and denoted as well by $[a_0, a_1, \dots]$, is defined to be the limit $\alpha$ appearing in Corollary 1.11.

Corollary 1.12. We have
\[ [a_0, a_1, \dots] = a_0 + \frac{1}{[a_1, \dots]}. \]

Proof. The sequence of convergents $[a_0, a_1, \dots, a_n]$ converges to $[a_0, a_1, \dots]$, and the sequence of quantities $a_0 + 1/[a_1, \dots, a_n]$ converges to $a_0 + 1/[a_1, \dots]$.

Example. When $a_i = 1$ for every $i$ we find
\[ [1, 1, \dots] = \frac{\sqrt5 + 1}{2}. \]
Indeed, $\alpha := [1, 1, \dots]$ satisfies $\alpha = 1 + 1/\alpha$, which implies $\alpha = (\pm\sqrt5 + 1)/2$, and since $1 < \alpha < 2$ (Corollary 1.11) the sign must be $+$. This continued fraction is connected with the Fibonacci numbers, given by $F_0 := 0$, $F_1 := 1$, and $F_n := F_{n-1} + F_{n-2}$; specifically, $p_n = F_{n+2}$ and $q_n = F_{n+1}$. As well we have (solution to linear recurrence relation)
\[ F_n = \frac{1}{\sqrt5}\left(\left(\frac{1 + \sqrt5}{2}\right)^{n} - \left(\frac{1 - \sqrt5}{2}\right)^{n}\right). \]

Given a finite sequence of integers $a_0, a_1, \dots, a_n$ for some $n \in \mathbb{N}$, with $a_i > 0$ for $1 \le i \le n$, we have the rational number $[a_0, \dots, a_n]$ (defined as above), which may be described as a finite continued fraction. The next result shows that every rational number may be expressed as a finite continued fraction, in a manner that is unique up to the substitution of $(a_n - 1) + 1/1$ for $a_n \ge 2$, and every irrational number is in a unique way an infinite continued fraction.

Proposition 1.13. Let $\beta \in \mathbb{R}$. We define $a_0 := \lfloor\beta\rfloor$, and if $\beta \in \mathbb{Z}$ we set $n := 0$, otherwise we define $\beta_1 := 1/(\beta - a_0)$. Recursively, given $\beta_k$, we define $a_k := \lfloor\beta_k\rfloor$, and if $\beta_k \in \mathbb{Z}$ we set $n := k$, otherwise $\beta_{k+1} := 1/(\beta_k - a_k)$. If $\beta \in \mathbb{Q}$ then the procedure terminates with $[a_0, \dots, a_n] = \beta$ with $n = 0$ or $a_n \ge 2$, and $[a_0, \dots, a_n]$ and $[a_0, \dots, a_n - 1, 1]$ are the only ways to express $\beta$ as a finite continued fraction. If $\beta \notin \mathbb{Q}$, then the procedure defines an infinite sequence $a_0, a_1, \dots$
, uniquely characterized by the property $[a_0, a_1, \dots] = \beta$. Moreover, if $\beta \in \mathbb{Q}$ then $[a_i, \dots, a_n] = \beta_i$ for $i \le n$; if $\beta \notin \mathbb{Q}$ then $[a_i, a_{i+1}, \dots] = \beta_i$ for all $i$.

Proof. For $i \ge 1$, we have $\beta_i > 1$, with $\beta_i \in \mathbb{Q}$ if and only if $\beta \in \mathbb{Q}$. Furthermore, we claim
\[ \begin{cases} [a_0, \dots, a_i] \le \beta < [a_0, \dots, a_{i-1}], & \text{if $i$ is even,} \\ [a_0, \dots, a_{i-1}] < \beta \le [a_0, \dots, a_i], & \text{if $i$ is odd,} \end{cases} \]
with equality in each case if and only if $\beta_i \in \mathbb{Z}$. Indeed, we have
\[ \beta - [a_0, \dots, a_i] = (-1)^i \frac{\beta_i - a_i}{\beta_1 \cdots \beta_i\, [a_1, \dots, a_i] \cdots [a_{i-1}, a_i]\, a_i} \]
by an inductive argument, where we apply the induction hypothesis in the form of the expression for $\beta_1 - [a_1, \dots, a_i]$. The procedure either stops with $a_0, \dots, a_n$ with $n = 0$ or $a_n \ge 2$, and $\beta = [a_0, \dots, a_n] = [a_0, \dots, a_n - 1, 1]$, or yields an infinite sequence with $\beta = [a_0, a_1, \dots]$. Finally, if we apply the procedure to $\beta := [a_0, \dots, a_n]$ with $n = 0$ or $a_n \ge 2$, then we obtain $\beta_i = [a_i, \dots, a_n]$ and $\lfloor\beta_i\rfloor = a_i$ for $i = 1, \dots, n$, and with $\beta := [a_0, a_1, \dots]$ we obtain $\beta_i = [a_i, a_{i+1}, \dots]$ and $\lfloor\beta_i\rfloor = a_i$ for all $i \in \mathbb{N}_{>0}$. (The latter assertion uses Corollary 1.12.) These observations justify the uniqueness assertions.

Lemma 1.14. Given $\beta \in \mathbb{R}$, let sequences $(\beta_k)$, $(p_k)$, and $(q_k)$ be as in Proposition 1.13 and $(c_k)$ the associated sequence (finite or infinite) of convergents. Let $k \in \mathbb{N}$ be such that $c_k = p_k/q_k$ and $c_{k+1} = p_{k+1}/q_{k+1}$ are defined (always the case, if $\beta \notin \mathbb{Q}$).
(i) We have
\[ \beta = \frac{\beta_{k+1} p_k + p_{k-1}}{\beta_{k+1} q_k + q_{k-1}}. \]
(ii) For any integers $p$ and $q$ with $0 < q < q_{k+1}$, we have
\[ |p - q\beta| \ge |p_k - q_k\beta|. \]

Proof. We prove (i) by induction on $k$. The case $k = 0$ is clear. For the inductive step, from $\beta_k = a_k + 1/\beta_{k+1}$ and the induction hypothesis we have
\[ \beta = \frac{\beta_{k+1}(a_k p_{k-1} + p_{k-2}) + p_{k-1}}{\beta_{k+1}(a_k q_{k-1} + q_{k-2}) + q_{k-1}} = \frac{\beta_{k+1} p_k + p_{k-1}}{\beta_{k+1} q_k + q_{k-1}}. \]
For (ii), we define $r := p_k q - q_k p$ and $s := p_{k+1} q - q_{k+1} p$.
By Proposition 1.10, $s p_k - r p_{k+1} = (-1)^k p$ and $s q_k - r q_{k+1} = (-1)^k q$. Notice, because of the hypothesis imposed on $q$, we must have $s \ne 0$. For the same reason we must have $rs \ge 0$. So by Corollary 1.11(i)–(ii) we have
\[ |p - q\beta| = |(s p_k - r p_{k+1}) - (s q_k - r q_{k+1})\beta| = |s(p_k - q_k\beta) - r(p_{k+1} - q_{k+1}\beta)| = |s(p_k - q_k\beta)| + |r(p_{k+1} - q_{k+1}\beta)|, \]
and the desired inequality follows.

Proposition 1.15. Given $\beta \in \mathbb{R}$ the associated sequence of convergents contains any rational number $p/q$ with integers $p$ and $q$ satisfying
\[ \left|\beta - \frac{p}{q}\right| < \frac{1}{2q^2}. \]

Proof. The result is clear if $\beta = p/q$, so we suppose the contrary. There is no loss of generality in supposing that $p$ and $q$ are relatively prime, with $q$ positive. We let $k \in \mathbb{N}$ be such that $q_k \le q < q_{k+1}$. By Lemma 1.14(ii) we have $|p - q\beta| \ge |p_k - q_k\beta|$. So
\[ \left|\frac{p}{q} - \frac{p_k}{q_k}\right| \le \left|\frac{p}{q} - \beta\right| + \left|\frac{p_k}{q_k} - \beta\right| < \frac{1}{2q^2} + \frac{1}{2qq_k} \le \frac{1}{qq_k}, \]
and this forces $p/q = p_k/q_k$.

Proposition 1.15 is a powerful result. It tells us that if we want to check some property about approximations of an irrational number by rational numbers, it suffices to focus our attention just on the convergents, at least for questions about approximations by $p/q$ to within $1/(2q^2)$. We illustrate this with the next result. Let us say that an irrational number $\alpha \in \mathbb{R}$ is badly approximable if there exists a positive constant $C$ such that
\[ \left|\alpha - \frac{p}{q}\right| < \frac{C}{q^2} \]
holds for only finitely many rational numbers $p/q$, with $p$ and $q$ relatively prime integers.

Proposition 1.16. An irrational number $\alpha = [a_0, a_1, \dots]$ is badly approximable if and only if the sequence $(a_n)$ is bounded.

Proof. Suppose that $\alpha$ is badly approximable, and let $C$ be as in the definition. So in particular, the convergents $p_n/q_n$ for sufficiently large $n$ satisfy $|\alpha - p_n/q_n| \ge C/q_n^2$. Then for such $n$ we have $a_{n+1} < 1/C$ by Corollary 1.11(iv), and thus the sequence $(a_n)$ is bounded.
For the reverse implication, we suppose that $(a_n)$ is bounded and suppose specifically that $n_0, K \in \mathbb{N}$ are such that $a_n \le K$ for all $n > n_0$. We claim, the definition of badly approximable for $\alpha$ is satisfied with $C = 1/(K + 2)$. Indeed, any rational number $p/q$ satisfying the inequality is a convergent, by Proposition 1.15. For the convergents $p_n/q_n$ with $n \ge n_0$ we have $|\alpha - p_n/q_n| > C/q_n^2$ by Corollary 1.11(iv).

A classical fact is that an irrational number $\alpha$ has (eventually) periodic continued fraction expansion $[a_0, \dots, a_{m-1}, \overline{a_m, \dots, a_n}]$ if and only if $\alpha$ belongs to a quadratic extension of $\mathbb{Q}$. We give a proof of this fact below; then Proposition 1.16 will tell us that all such numbers are badly approximable, a fact that also follows directly from Liouville's theorem (Proposition 1.7).

Let $d \ge 2$ be a squarefree integer and $\mathbb{Q}(\sqrt d)$ the corresponding quadratic extension, with nontrivial automorphism (sending $\sqrt d$ to $-\sqrt d$) denoted by $\beta \mapsto \beta'$. An element $\beta \in \mathbb{Q}(\sqrt d)$ is said to be reduced if
\[ \beta > 1 \qquad\text{and}\qquad -1 < \beta' < 0. \]
We define
\[ \iota(\beta) := -\frac{1}{\beta'} \]
for $\beta \in \mathbb{Q}(\sqrt d)^\times$ and observe that $\iota(\iota(\beta)) = \beta$, and if $\beta$ is reduced then so is $\iota(\beta)$.

Lemma 1.17. The formula
\[ \beta \mapsto \frac{1}{\beta - \lfloor\beta\rfloor} \]
defines a bijective map from the set of reduced elements of $\mathbb{Q}(\sqrt d)$ to itself, with inverse given by
\[ \beta \mapsto \iota\!\left(\frac{1}{\iota(\beta) - \lfloor\iota(\beta)\rfloor}\right). \]

Proof. If $\beta$ is reduced, then in particular $\beta$ is irrational. We have $0 < \beta - \lfloor\beta\rfloor < 1$, and hence $1/(\beta - \lfloor\beta\rfloor) > 1$. As well, $\lfloor\beta\rfloor$ is positive, so $\beta' - \lfloor\beta\rfloor < -1$, and hence $1/(\beta' - \lfloor\beta\rfloor)$ lies between $-1$ and 0. Applying $\iota$ to $1/(\beta - \lfloor\beta\rfloor)$, we obtain $-\beta' + \lfloor\beta\rfloor$. It is then straightforward to verify the formula for the inverse.

For an irrational number $\beta$ in $\mathbb{Q}(\sqrt d)$, we consider the minimal polynomial of $\beta$, scaled to have integer coefficients with gcd 1
\[ Ax^2 + Bx + C \tag{6} \]
and define the discriminant of $\beta$ to be $B^2 - 4AC$.

Lemma 1.18.
The map in Lemma 1.17 preserves the discriminant, and for given $D \in \mathbb{N}_{>0}$ there are only finitely many reduced elements of $\mathbb{Q}(\sqrt d)$ of discriminant $D$.

Proof. Subtracting an integer from $\beta$ does not change the discriminant, as we may see by an explicit calculation. The discriminant of $1/\beta$ is equal to the discriminant of $\beta$, since the minimal polynomial (rescaled, as above) has the same coefficients, in reverse order. Since the map in Lemma 1.17 is built out of these operations, the discriminant is preserved. For the second assertion, we consider a (scaled) minimal polynomial (6), which without loss of generality may be assumed to have positive leading coefficient $A$. The roots are $(-B \pm \sqrt D)/2A$. Only the larger root $(-B + \sqrt D)/2A$ has a chance to be reduced, and this requires $\sqrt D > 2A + B$ and $B > -\sqrt D$, which imply $|B| < \sqrt D$ and $A < \sqrt D$. Finally, $C$ is determined by $A$, $B$, and $D$, so there are only finitely many reduced elements of $\mathbb{Q}(\sqrt d)$ of discriminant $D$.

Proposition 1.19. An irrational number $\beta \in \mathbb{R}$ has periodic continued fraction expansion if and only if $\beta$ belongs to a quadratic extension of $\mathbb{Q}$.

Proof. It is straightforward to see that any periodic continued fraction belongs to a quadratic extension of $\mathbb{Q}$ (by arguing, essentially, as in the example $[1, 1, \dots] = (\sqrt5 + 1)/2$). It remains to show that any irrational number in $\mathbb{Q}(\sqrt d)$ has periodic continued fraction (where $d \ge 2$ is a squarefree integer). The argument proceeds in two steps: (i) we show that the sequence $(\beta_n)$ of Proposition 1.13 contains a reduced element; (ii) we establish the periodicity of the subsequence of reduced elements and hence of the corresponding integers $a_n = \lfloor\beta_n\rfloor$.

For (i) we have, for irrational $\beta \in \mathbb{Q}(\sqrt d)$, the following consequence of Lemma 1.14(i):
\[ -\frac{1}{\beta_{n+1}} = \frac{\beta q_n - p_n}{\beta q_{n-1} - p_{n-1}} = \frac{q_n}{q_{n-1}} + \frac{(-1)^n}{\left(\beta - \frac{p_{n-1}}{q_{n-1}}\right) q_{n-1}^2}, \]
where the second equality uses (3) of Proposition 1.10.
We apply the nontrivial automorphism of $\mathbb{Q}(\sqrt d)$ to the left- and right-hand sides to obtain a new relation containing the expression $\beta' - p_{n-1}/q_{n-1}$, which converges to $\beta' - \beta$ and hence for sufficiently large $n$ has constant sign. It follows that $\beta_n$ is reduced, for some $n$. For (ii) we observe by Lemma 1.17, if $\beta_n$ is reduced then so is $\beta_{n+1}$. In combination with Lemma 1.18, this shows that the sequence $\beta_n, \beta_{n+1}, \dots$ is periodic.

Next, in this brief treatment of continued fractions, we recall the proof of the transcendence of $e$ by means of auxiliary functions, which as we will see, following H. Cohn, A short proof of the simple continued fraction expansion of e, American Mathematical Monthly volume 113 (2006), pp. 57–62, may also be used to deduce the continued fraction expansion.

Let $d \in \mathbb{N}_{>0}$ and $c_0, \dots, c_d \in \mathbb{Z}$ be given, with $c_0 \ne 0$; we assert that
\[ \sum_{j=0}^{d} c_j e^j \ne 0, \]
and since $d$ may be increased arbitrarily (by introducing additional coefficients, equal to zero), we may suppose that
\[ A := \sum_{j=0}^{d} c_j \prod_{\substack{i=0 \\ i \ne j}}^{d} (j - i) \]
is nonzero. We choose an integer $n > |A|$ so that
\[ C e^d (d^{d+2})^n < n! \qquad\text{where}\qquad C := \sum_{j=0}^{d} |c_j|. \]
Let us introduce the auxiliary function (a function of $j$)
\[ \frac{1}{n!}\, e^j \int_0^j x^n (x-1)^n \cdots (x-d)^n e^{-x}\,dx. \]
For $j \in \{0, 1, \dots, d\}$ we have
\[ \left|\frac{1}{n!}\, e^j \int_0^j x^n (x-1)^n \cdots (x-d)^n e^{-x}\,dx\right| \le \frac{1}{n!}\, e^j j\, d^{n(d+1)} \le \frac{1}{n!}\, e^d d^{n(d+2)}. \]
For the linear combination of function values with coefficients $c_j$, then,
\[ \left|\sum_{j=0}^{d} \frac{c_j}{n!}\, e^j \int_0^j x^n (x-1)^n \cdots (x-d)^n e^{-x}\,dx\right| \le \frac{C}{n!}\, e^d (d^{d+2})^n < 1. \tag{7} \]
If $g(x)$ is any polynomial and we write $g + g' + \cdots$ for the sum of derivatives of all orders (a finite sum since $g^{(k)} = 0$ for sufficiently large $k$), then
\[ \frac{d}{dx}\Bigl[(g + g' + \cdots)e^{-x}\Bigr] = -g(x)e^{-x}. \]
This observation lets us evaluate the integral in the auxiliary function. Setting
\[ g(x) := x^n (x-1)^n \cdots (x-d)^n, \]
the linear combination of function values in (7) evaluates to
\[ \frac{1}{n!} \sum_{j=0}^{d} c_j e^j \int_0^\infty g(x)e^{-x}\,dx \;-\; \sum_{j=0}^{d} \frac{c_j}{n!}\bigl(g(j) + g'(j) + \cdots\bigr). \tag{8} \]
The second sum in (8) is an integer, since the derivatives to order less than $n$ contribute nothing and $n!$ divides $g^{(n)}$. The remainder of the argument is number-theoretic in nature. We may suppose that $n = p$, a prime number, and we examine the value of the second sum in (8) mod $p$. By expanding $g^{(k)}$ for $k \ge p$ with the iterated Leibniz rule and discarding all terms divisible by $x - j$ or by $p \cdot p!$ we are left to consider only the contribution from $g^{(p)}$, and from the general fact $a^p \equiv a \bmod p$ (Fermat's little theorem) we find
\[ \sum_{j=0}^{d} \frac{c_j}{p!}\bigl(g(j) + g'(j) + \cdots\bigr) \equiv A \bmod p. \]
Since $p > |A|$, the second sum in (8) is a nonzero integer. Compatibility with (7) requires the first sum to be nonzero, as desired.

We now focus on the case $d = 1$ of the above argument, where the value at $j = 1$ of the auxiliary function is
\[ A_n := \frac{e}{n!} \int_0^1 x^n (x-1)^n e^{-x}\,dx. \]
We introduce some similar expressions:
\[ B_n := \frac{e}{n!} \int_0^1 x^n (x-1)^{n+1} e^{-x}\,dx, \qquad C_n := \frac{e}{n!} \int_0^1 x^{n+1} (x-1)^n e^{-x}\,dx. \]
Each of these quantities is a $\mathbb{Z}$-linear combination of 1 and $e$. For instance with $g(x)$ as above,
\[ A_n = \frac{e}{n!}\bigl(g(0) + g'(0) + \cdots\bigr) - \frac{1}{n!}\bigl(g(1) + g'(1) + \cdots\bigr). \]
There is also the obvious relation $C_n = A_n + B_n$, which follows from $x^{n+1}(x-1)^n - x^n(x-1)^{n+1} = x^n(x-1)^n$. We obtain two further relations, for $n \ge 1$:
\[ A_n = B_{n-1} + C_{n-1}, \qquad B_n = 2nA_n + C_{n-1}, \]
as consequences of
\[ \frac{d}{dx}\Bigl[x^n (x-1)^n e^{-x}\Bigr] = n x^{n-1}(x-1)^n e^{-x} + n x^n (x-1)^{n-1} e^{-x} - x^n (x-1)^n e^{-x}, \]
\[ \frac{d}{dx}\Bigl[x^{n+1} (x-1)^n e^{-x}\Bigr] = n x^n (x-1)^n e^{-x} + x^n (x-1)^n e^{-x} + n x^n (x-1)^n e^{-x} + n x^n (x-1)^{n-1} e^{-x} - x^{n+1} (x-1)^n e^{-x}. \]

Proposition 1.20. We have
\[ e = [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, \dots]. \]

Proof. Let the displayed continued fraction define $p_n$ and $q_n$ for $n = -1, 0, 1, \dots$. Now the above relations, plus the starting data $B_0 = -1$ and $C_0 = -2 + e$ (which we obtain by direct computation) yield, by an inductive argument,
\[ A_n = -p_{3n-2} + q_{3n-2}\, e, \qquad B_n = -p_{3n-1} + q_{3n-1}\, e, \qquad C_n = -p_{3n} + q_{3n}\, e. \]
As a consequence of (7), the sequence $(A_n)$ is bounded (in absolute value). This implies, since $(q_n)_{n \in \mathbb{N}_{>0}}$ is an increasing sequence, that the convergents $p_{3n-2}/q_{3n-2}$ tend to $e$. It follows that the continued fraction is equal to $e$.

The proof of the following corollary uses the general observation that the sequence $(q_n)$ of any infinite continued fraction grows at least exponentially. Indeed, in the example $[1, 1, \dots]$ this sequence consists of Fibonacci numbers, whose growth is asymptotically exponential. The Fibonacci sequence (with a shift in index by one) is thus a lower bound for the sequence in general, since in the general case we have $q_n = a_n q_{n-1} + q_{n-2} \ge q_{n-1} + q_{n-2}$.

Corollary 1.21. The number $e$ has irrationality measure equal to 2.

Proof. Since $e$ is irrational, the irrationality measure must be at least 2. So it remains only to show, for $\varepsilon > 0$, that the inequality
\[ \left|e - \frac{p}{q}\right| < \frac{1}{q^{2+\varepsilon}} \]
holds for only finitely many rational numbers $p/q$ (where $p$ and $q$ are relatively prime integers, with $q > 0$). When $q \ge 2^{1/\varepsilon}$, the inequality implies $|e - p/q| < 1/(2q^2)$, which by Proposition 1.15 implies that $p/q$ appears in the sequence of convergents to $e$. For $n \in \mathbb{N}$ the terms $a_n$ in the continued fraction expansion given in Proposition 1.20 satisfy
\[ a_{n+1} + 2 \le \frac{2n + 10}{3}, \]
hence by Corollary 1.11,
\[ \left|e - \frac{p_n}{q_n}\right| > \frac{3}{(2n + 10) q_n^2}. \]
Since the sequence $(q_n)$ grows at least exponentially, the linear expression $(2n + 10)/3$ is smaller than $q_n^\varepsilon$ for all sufficiently large $n$, and we are done.

Liouville's number, and other numbers whose definition follows a similar pattern, have continued fraction expansion that follows a regular pattern. This was observed by J. Shallit in a pair of articles published in the Journal of Number Theory (volume 11, pages 209–217 and volume 14, pages 228–231).

Lemma 1.22. For positive integers $a_1, a_2, \dots$ with $a_1 \ge 2$,
\[ -[0, a_1, \dots, a_n] = [-1, 1, a_1 - 1, a_2, \dots, a_n] \]
for all $n \in \mathbb{N}_{>0}$, and
\[ -[0, a_1, a_2, \dots] = [-1, 1, a_1 - 1, a_2, \dots]. \]

Proof. This results by comparing, with a shift by one, the respective sequences of convergents.

Lemma 1.23. Let $x := [0, a_1, \dots, a_n]$ be a finite continued fraction for some positive integer $n$, with $a_n \ge 2$; we write $x = p/q$ where $p$ and $q$ are positive relatively prime integers. Then for any integer $m \ge 2$,
\[ x + \frac{(-1)^n}{mq^2} = [0, a_1, \dots, a_n, m - 1, 1, a_n - 1, a_{n-1}, \dots, a_1]. \]

Proof. By Lemma 1.22, we have
\[ [m - 1, 1, a_n - 1, a_{n-1}, \dots, a_1] = m - [0, a_n, \dots, a_1]. \]
Let us adopt the notation of Proposition 1.10, so e.g., $p = p_n$ and $q = q_n$. From (2) we have the matrix identity
\[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_1 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} a_n & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} p_n & p_{n-1} \\ q_n & q_{n-1} \end{pmatrix}. \]
Transposing and conjugating by the non-identity $2 \times 2$ permutation matrix, we find
\[ [0, a_n, \dots, a_1] = \frac{q_{n-1}}{q_n}. \]
Now by Lemma 1.14(i),
\[ [0, a_1, \dots, a_n, m - 1, 1, a_n - 1, a_{n-1}, \dots, a_1] = \frac{\left(m - \frac{q_{n-1}}{q_n}\right) p_n + p_{n-1}}{\left(m - \frac{q_{n-1}}{q_n}\right) q_n + q_{n-1}} = \frac{m p_n q_n - p_n q_{n-1} + p_{n-1} q_n}{m q_n^2} = \frac{p_n}{q_n} + \frac{(-1)^n}{m q_n^2}, \]
where in the last step we have used (3).

Example. Starting with $1/10 + 1/100 = [0, 9, 11]$ and applying Lemma 1.23 repeatedly we obtain the continued fraction expansion of Liouville's number
\[ [0, 9, 11, 10^{1\cdot 2!} - 1, 1, 10, 9, 10^{2\cdot 3!} - 1, 1, 8, 10, 1, 10^{1\cdot 2!} - 1, 11, 9, \dots]. \]
Similarly, starting with $1/10 + 1/1000 = [0, 9, 1, 9, 10]$ we obtain, for $\sum_{i=0}^{\infty} 10^{-3^i}$, the continued fraction expansion
\[ [0, 9, 1, 9, 10, 10^{3^1} - 1, 1, 9, 9, 1, 9, 10^{3^2} - 1, 1, 8, 1, 9, 9, 1, 10^{3^1} - 1, 10, 9, 1, 9, \dots]. \]
It is a nice exercise to use the continued fraction expansion and Proposition 1.15 to show that the number $\sum_{i=0}^{\infty} 10^{-3^i}$ has irrationality measure 3.

1.3 Heights and coefficients

The classical auxiliary functions in §1.1 and §1.2, minimal polynomials of algebraic numbers and (linear combinations of) residuals of Taylor series approximations to $e^x$, are naturally suggested by the respective irrational numbers under consideration.
The more modern results in Diophantine approximation that will be presented make use of auxiliary functions which cannot be written down explicitly. Instead, one keeps careful track of the sizes of coefficients and deduces the existence of suitable auxiliary functions from clever application of the pigeonhole principle. Here we introduce some notions and results that will be used in demonstrating the existence of and working with such auxiliary functions.

Let $f \in \mathbb{Z}[x]$ be a nonzero polynomial. The most elementary measure of size of $f$ is the degree, which is applicable as well to an algebraic number $\alpha$ (as the degree of the minimal polynomial). Now we define the height of $f = a_nx^n + \dots + a_1x + a_0$ ($a_i \in \mathbb{Z}$ for all $i$) to be the maximum of the absolute values of the coefficients:
\[ H(f) := \max_{0 \le i \le n} |a_i|; \]
the height of an integer polynomial in several variables is, similarly, the maximum of the absolute values of the coefficients of all the monomials. If $\alpha$ is an algebraic number, then as mentioned before, the minimal polynomial may be scaled to have integer coefficients with gcd 1, and the maximum of the absolute values of the coefficients is the height $H(\alpha)$. (The scaled polynomial is unique up to sign, so the absolute values of coefficients are well-defined.) For example, for relatively prime integers $p$ and $q$ with $q \ne 0$, the rational number $p/q$ has height $\max(|p|, |q|)$.

The following important property is clear from the fact that there are only finitely many integer polynomials with degree and absolute value of coefficients bounded by given quantities.

Fact (Northcott property). Let $n$ be a positive integer and $T$ a positive real number. Among all algebraic numbers $\alpha$ of degree at most $n$, those with $H(\alpha) \le T$ are finite in number.
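The finiteness in the Northcott property can be made concrete by brute force in very small cases. The following is a minimal sketch, restricted to degree at most 2, where irreducibility over $\mathbb{Q}$ can be read off from the discriminant; the function names are illustrative and not part of the notes.

```python
from itertools import product
from math import gcd, isqrt


def is_irreducible_primitive(coeffs):
    """coeffs = (a0, ..., ad) with ad > 0 and d in {1, 2}.

    Returns True when the polynomial has coefficient gcd 1 and is
    irreducible over Q.
    """
    g = 0
    for c in coeffs:
        g = gcd(g, abs(c))
    if g != 1:
        return False
    if len(coeffs) == 2:
        return True
    a0, a1, a2 = coeffs
    disc = a1 * a1 - 4 * a2 * a0
    # A quadratic is reducible over Q exactly when its discriminant is a square.
    return disc < 0 or isqrt(disc) ** 2 != disc


def count_algebraic_numbers(max_degree, max_height):
    """Count algebraic numbers of degree <= max_degree and height <= max_height."""
    if max_degree not in (1, 2):
        raise ValueError("this sketch only handles degree 1 or 2")
    total = 0
    for d in range(1, max_degree + 1):
        for coeffs in product(range(-max_height, max_height + 1), repeat=d + 1):
            if coeffs[-1] <= 0:
                continue  # fix the sign: positive leading coefficient
            if is_irreducible_primitive(coeffs):
                total += d  # an irreducible polynomial of degree d has d distinct roots
    return total
```

Each algebraic number is counted once because its scaled minimal polynomial is unique up to sign, and the sign is fixed by requiring a positive leading coefficient.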
For instance, the set of algebraic numbers of degree at most 2 and height at most 1 consists of three rational numbers, $0$ and $\pm 1$, the real quadratic numbers $(\sqrt{5} \pm 1)/2$ and $(-\sqrt{5} \pm 1)/2$, and the non-real 4th and 6th roots of unity.

Let $\alpha$ be an algebraic integer of degree $n$. Then $\mathbb{Z}[\alpha]$ is a free $\mathbb{Z}$-module of rank $n$; the standard basis is $1, \alpha, \alpha^2, \dots, \alpha^{n-1}$. Now we are interested in the coefficients of a power of $\alpha$ with respect to this basis. Since the powers up to $n - 1$ of $\alpha$ are the basis elements, we make a statement that applies to powers at least $n$.

Lemma 1.24. Let $\alpha$ be an algebraic integer of degree $n$ and height $H$. Then for $i \ge n$ the coefficients of $\alpha^i$ with respect to the standard $\mathbb{Z}$-basis $1, \alpha, \dots, \alpha^{n-1}$ of $\mathbb{Z}[\alpha]$ have absolute value at most $(H + 1)^{i-n+1}$.

Proof. We let $x^n + a_{n-1}x^{n-1} + \dots + a_0$ be the minimal polynomial of $\alpha$, with $a_i \in \mathbb{Z}$ for $i = 0, \dots, n - 1$, and we prove the statement by induction on $i$. For the base case $i = n$ we have
\[ \alpha^n = -a_0 - a_1\alpha - \dots - a_{n-1}\alpha^{n-1}, \]
with coefficients at most $H$ in absolute value, by the definition of height. For the inductive step we suppose $i > n$, with
\[ \alpha^{i-1} = b_0 + b_1\alpha + \dots + b_{n-1}\alpha^{n-1} \]
for some integers $b_0, \dots, b_{n-1}$ with $|b_j| \le (H + 1)^{i-n}$ for all $j$. Then
\[ \alpha^i = -a_0b_{n-1} + (b_0 - a_1b_{n-1})\alpha + \dots + (b_{n-2} - a_{n-1}b_{n-1})\alpha^{n-1}, \]
with coefficient of 1 of absolute value at most $H(H + 1)^{i-n}$ and other coefficients of absolute value at most $(H + 1)^{i-n+1}$.

Lemma 1.25. Let $H$, $m$, and $n$ be positive integers, and let $(a_{ij}) \in \mathrm{Mat}(m \times n, \mathbb{Z})$ with $|a_{ij}| \le H$ for all $i$ and $j$. Assume that $m < n$. Then there exists a nontrivial solution $(x_1, \dots, x_n) \in \mathbb{Z}^n$ to the system of linear Diophantine equations
\[ \sum_{j=1}^{n} a_{ij}x_j = 0 \qquad (1 \le i \le m) \]
with, for every $j$,
\[ |x_j| \le (nH)^{\frac{m}{n-m}}. \]

Proof. Let $B := \lfloor (nH)^{m/(n-m)} \rfloor$, and let
\[ S := \{(x_1, \dots, x_n) \in \mathbb{Z}^n \mid 0 \le x_j \le B \ \forall\, j\}, \]
so the cardinality of $S$ is $(B + 1)^n$. Let $A := (a_{ij})$, and let $n_i^+$, respectively $n_i^-$, denote the number of positive, respectively negative entries in the $i$th row of $A$. So, $n_i^+ + n_i^- \le n$, and for $x = (x_1, \dots, x_n) \in S$ the $i$th entry of $Ax$ lies between $-n_i^-BH$ and $n_i^+BH$. This means, the linear map corresponding to $A$ maps $S$ to a set of cardinality at most $(nBH + 1)^m$. Now $(B + 1)^{n-m} > (nH)^m$, and hence $(B + 1)^n > (nBH + 1)^m$. We may apply the pigeonhole principle to deduce that there exist distinct $x$ and $x'$ in $S$ with $Ax = Ax'$, and $x - x'$ is a solution to the system of linear Diophantine equations that obeys the stated bound.

Lemma 1.26. (i) Let $f \in \mathbb{Z}[x]$ be a polynomial of degree $d$ and height $H$. For $k \in \mathbb{N}$, $(1/k!)f^{(k)}$ has integer coefficients and height at most $2^dH$.
(ii) Let $f \in \mathbb{Z}[x_1, \dots, x_n]$ have degree $d_j$ in the variable $x_j$ for all $j$ and height $H$. For $i_1, \dots, i_n \in \mathbb{N}$,
\[ \frac{1}{i_1! \cdots i_n!} \frac{\partial^{i_1+\dots+i_n}}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}} f \]
has integer coefficients and height at most $2^{d_1+\dots+d_n}H$.

Proof. For (i), if $a_\ell$ denotes the coefficient of $x^\ell$, then for $k \le \ell \le d$ the coefficient of $x^{\ell-k}$ of $(1/k!)f^{(k)}$ is $\binom{\ell}{k}a_\ell$. Since the binomial coefficient is bounded by $2^\ell$, this gives (i). The multivariable generalization (ii) is clear by the same reasoning.

2 Thue’s theorem

Algebraic numbers of degree 2, we have seen, are badly approximable, which implies (and is stronger than) irrationality measure 2. For real numbers that are algebraic of degree $n \ge 3$, there is a gap between the lower bound of 2 for the irrationality measure (by Dirichlet’s approximation theorem) and the upper bound of $n$ (coming from Liouville’s theorem). The first step toward closing this gap is Thue’s theorem.

Theorem 2.1 (Thue). Let $\alpha \in \mathbb{R}$ be an algebraic number of degree $n \ge 3$. For every $\varepsilon > 0$ the inequality
\[ \left| \alpha - \frac{p}{q} \right| < \frac{1}{q^{n/2+1+\varepsilon}} \]
holds for only finitely many rational numbers $p/q$, where $p$ and $q$ are relatively prime integers with $q > 0$.
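To get a feel for the inequality in Theorem 2.1, one can list its solutions with small denominator for a specific number. The sketch below does this for $\alpha = \sqrt[3]{2}$ (so $n = 3$) with the exponent $2.6 = n/2 + 1 + \varepsilon$, $\varepsilon = 0.1$. The choice of $\alpha$, of $\varepsilon$, of the search bound, and the function name are all assumptions made for illustration; the search says nothing about denominators beyond the bound.

```python
from decimal import Decimal, getcontext, ROUND_FLOOR
from math import gcd

getcontext().prec = 60  # generous working precision for the comparisons


def thue_type_approximations(alpha, kappa, q_max):
    """All (p, q) with gcd(p, q) = 1, 1 <= q <= q_max, |alpha - p/q| < q**(-kappa).

    alpha is assumed irrational and kappa > 1, both given as Decimal.
    """
    hits = []
    for q in range(1, q_max + 1):
        bound = Decimal(q) ** (-kappa)
        base = int((alpha * q).to_integral_value(rounding=ROUND_FLOOR))
        # q * bound <= 1, so only floor(q*alpha) and floor(q*alpha) + 1 can qualify.
        for p in (base, base + 1):
            if gcd(p, q) == 1 and abs(alpha * q - p) < bound * q:
                hits.append((p, q))
    return hits


alpha = Decimal(2) ** (Decimal(1) / Decimal(3))  # cube root of 2, degree n = 3
kappa = Decimal("2.6")                           # n/2 + 1 + eps with eps = 0.1
hits = thue_type_approximations(alpha, kappa, 2000)
```

The comparison is carried out in the equivalent form $|q\alpha - p| < q \cdot q^{-\kappa}$, which avoids a division.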
As a first reduction step, we point out that it suffices to prove Thue’s theorem under the assumption that $\alpha$ is an algebraic integer. Indeed, suppose that we know the result for algebraic integers. If $\alpha$ is a general algebraic number, then $m\alpha$ is an algebraic integer for some $m \in \mathbb{N}_{>0}$, and hence
\[ \left| m\alpha - \frac{p}{q} \right| < \frac{1}{q^{n/2+1+\varepsilon/2}} \]
holds for only finitely many rational numbers $p/q$, where $p$ and $q$ are relatively prime integers with $q > 0$. Since for $q \ge m^{2/\varepsilon}$ we have
\[ \frac{m}{q^{n/2+1+\varepsilon}} \le \frac{1}{q^{n/2+1+\varepsilon/2}}, \]
we deduce the desired result for $\alpha$.

The exposition follows K. B. Stolarsky, Algebraic Numbers and Diophantine Approximation.

2.1 A class of auxiliary functions

We suppose that $\alpha$ is an algebraic integer of degree $n \ge 3$, with minimal polynomial $x^n + a_{n-1}x^{n-1} + \dots + a_1x + a_0$ of height $H := \max(|a_0|, \dots, |a_{n-1}|)$. The auxiliary functions used in the proof of Thue’s theorem will be nonzero polynomials in two variables of the form
\[ P(x) - yQ(x) \in \mathbb{Z}[x, y] \]
such that, when $\alpha$ is substituted for $y$, there is an identity of the form
\[ P(x) - \alpha Q(x) = (x - \alpha)^h \bigl(F_0(x) + \alpha F_1(x) + \dots + \alpha^{n-1}F_{n-1}(x)\bigr) \tag{1} \]
in $(\mathbb{Z}[\alpha])[x]$, for some integer $h \ge 2$. Notice, given (1) and supposing $Q(\alpha) \ne 0$ for simplicity, that for $\beta \in \mathbb{R}$ with $|\beta - \alpha|$ sufficiently small, $\gamma := P(\beta)/Q(\beta)$ satisfies $|\gamma - \alpha| < |\beta - \alpha|$; when $\beta$ is a rational number, then so is $\gamma$.

Let us postulate that each $F_i$ has degree at most some given $k \in \mathbb{N}$ (where we also allow $F_i$ to be zero for some $i$) and write
\[ F_i(x) = \sum_{j=0}^{k} c_{jn+i+1}x^j \]
for $i = 0, \dots, n - 1$. Let us as well write the powers of $\alpha$ in terms of the standard $\mathbb{Z}$-basis of $\mathbb{Z}[\alpha]$:
\[ \alpha^m = \sum_{i=0}^{n-1} b_i^{(m)}\alpha^i. \]
Then (1) translates into the system of linear equations in the coefficients $c_i$:
\[ \sum_{i=0}^{n-1} \sum_{\substack{j=0 \\ j \ge m-h}}^{\min(k,m)} (-1)^j \binom{h}{m-j} b_\ell^{(h+i+j-m)} c_{jn+i+1} = 0 \qquad (2 \le \ell \le n-1,\ 0 \le m \le h+k). \]
In every summand, the binomial coefficient is at most $2^h$, and by Lemma 1.24 the quantity $|b_\ell^{(h+i+j-m)}|$ is at most $(H + 1)^h$.

Proposition 2.2.
With the notation as above, suppose that
\[ \Bigl(\frac{n}{2} - 1\Bigr)h < k + 1 < (n - 1)h. \]
Then, setting
\[ r := \frac{k+1}{(n/2 - 1)h}, \]
there exist $\mathbb{Z}$-linearly independent polynomials $P$, $Q \in \mathbb{Z}[x]$ satisfying (1) for some $F_0, \dots, F_{n-1} \in \mathbb{Z}[x]$, each (zero or) of degree at most $k$ and of height at most
\[ \bigl(n(k + 1)(2H + 2)^h\bigr)^{\frac{n}{2}\frac{r}{r-1} - 1}. \]

Proof. We view (1), as above, as a system of $(n - 2)(h + k + 1)$ linear Diophantine equations in $c_1, \dots, c_{n(k+1)}$ with coefficients of absolute value $\le (2H + 2)^h$. Then by Lemma 1.25, there exists a nontrivial identity (1) where $F_0, \dots, F_{n-1}$ have degree at most $k$ and height at most
\[ \bigl(n(k + 1)(2H + 2)^h\bigr)^{(n-2)(h+k+1)/(2(k+1)-(n-2)h)}. \]
The exponent is $(n/2)(r/(r - 1)) - 1$.

It remains only to exclude that $P$ and $Q$ in the identity (1) are $\mathbb{Z}$-linearly dependent. A $\mathbb{Z}$-linear dependence would imply that $P$ and $Q$ are divisible by the $h$th power of the minimal polynomial of $\alpha$. But the degrees of $P$ and $Q$ can be at most $h + k$, and by the hypothesis, this is less than $hn$.

For our purposes it will suffice to fix a value of $r$ slightly more than 1. E.g., we may take $r$ equal to (or close to) $1 + 1/N$ for a positive integer $N$.

Corollary 2.3. Suppose that $\alpha$ is an algebraic integer of degree $n \ge 3$ and height $H$, let $N$ be a positive integer, let $h \ge 4N$, and define
\[ k := \Bigl\lfloor \Bigl(1 + \frac{1}{N}\Bigr)\Bigl(\frac{n}{2} - 1\Bigr)h \Bigr\rfloor - 1. \]
Then there exist $\mathbb{Z}$-linearly independent polynomials $P$, $Q \in \mathbb{Z}[x]$ of height at most
\[ (2nH + 2n)^{(N+\frac{1}{2})nh}, \]
satisfying an identity (1) for some $F_0, \dots, F_{n-1} \in \mathbb{Z}[x]$, each (zero or) of degree at most $k$ and height at most
\[ (2nH + 2n)^{((N+\frac{1}{2})n-1)h}. \]

Proof. We have
\[ k + 1 = \Bigl(1 + \frac{1}{N}\Bigr)\Bigl(\frac{n}{2} - 1\Bigr)h - \frac{i}{2N} \]
for some $0 \le i < 2N$. Define $r := (k + 1)/((n/2 - 1)h)$, as in Proposition 2.2. Then $r = 1 + 1/N - i/(N(n - 2)h)$, hence $1 + 1/(2N) < r \le 1 + 1/N$ and
\[ \frac{n}{2}\frac{r}{r-1} - 1 < \Bigl(N + \frac{1}{2}\Bigr)n - 1. \]
As well, $n(k + 1) \le n(n - 2)h \le n^h$. Now we apply Proposition 2.2, and from the height bound in that statement we obtain $H(F_i) \le ((2nH + 2n)^h)^{(N+1/2)n-1}$ for all $i$.
Furthermore, each coefficient of $P$ or $Q$ is a linear function of the coefficients of the polynomials $F_i$, with coefficients of absolute value at most $(2H + 2)^h$. So the absolute value of each coefficient is at most
\[ n(k + 1)(2H + 2)^h \bigl((2nH + 2n)^{(N+1/2)n-1}\bigr)^h, \]
and this is $\le ((2nH + 2n)^{(N+1/2)n})^h$.

2.2 Excluded approximations from an approximation

Let $\alpha$ be an algebraic integer of degree $n \ge 3$ and height $H$, let $N$ be a positive integer, and let $P$, $Q \in \mathbb{Z}[x]$ be as in Corollary 2.3 for some $h \ge 4N$. Let $\kappa > n/2 + 1$, and let $p$ and $q$ be relatively prime integers with $q > 0$ and
\[ \left| \alpha - \frac{p}{q} \right| < \frac{1}{q^\kappa}. \tag{1} \]
We will use $P$ and $Q$ to exhibit an interval of integers, depending on $q$, that other rational approximations satisfying (1) must avoid.

Lemma 2.4. Let $\alpha$ be an algebraic integer of degree $n \ge 3$ and height $H$, let $N$ be a positive integer, set
\[ C := (4nH + 4n)^{(2N+1)n}, \]
let polynomials $P$, $Q \in \mathbb{Z}[x]$ be as in Corollary 2.3 for some $h \ge 4N$, let $\kappa > n/2 + 1$, and let $p$ and $q$ be relatively prime integers with $q > 0$ satisfying (1). Let $t$ denote the multiplicity of $p/q$ as a root of $P'(x)Q(x) - P(x)Q'(x)$; in particular, $t$ is 0 if $p/q$ is not a root of $P'(x)Q(x) - P(x)Q'(x)$. Then
\[ q^t \le C^h. \]

We remark that $P'(x)Q(x) - P(x)Q'(x) = Q(x)^2 (d/dx)(P(x)/Q(x))$ is not the zero polynomial by the $\mathbb{Z}$-linear independence of $P$ and $Q$. So the multiplicity $t$ in the statement of Lemma 2.4 is well-defined.

Proof. By Gauss’s lemma, we have
\[ P'(x)Q(x) - P(x)Q'(x) = (qx - p)^t G(x) \tag{2} \]
for some $G \in \mathbb{Z}[x]$. It follows that the leading coefficient of $P'(x)Q(x) - P(x)Q'(x)$ is divisible by $q^t$, hence is at least $q^t$ in absolute value. But the coefficients of $P'(x)Q(x) - P(x)Q'(x)$ have absolute value at most
\[ 2(h + k)^2 (2nH + 2n)^{(2N+1)nh}, \]
and this is less than $(4nH + 4n)^{(2N+1)nh}$.

With the notation of Lemma 2.4, we have
\[ \frac{d^t}{dx^t}\bigl(P'(x)Q(x) - P(x)Q'(x)\bigr)\Big|_{x = p/q} \ne 0. \]
By the general Leibniz rule we deduce that there exist natural numbers $i$ and $j$ summing to $t + 1$, such that
\[ P^{(i)}\Bigl(\frac{p}{q}\Bigr)Q^{(j)}\Bigl(\frac{p}{q}\Bigr) - P^{(j)}\Bigl(\frac{p}{q}\Bigr)Q^{(i)}\Bigl(\frac{p}{q}\Bigr) \ne 0. \]
Now let $p'$ and $q'$ be relatively prime integers with $q' > q$ and
\[ \left| \alpha - \frac{p'}{q'} \right| < \frac{1}{q'^\kappa}. \]
In this situation, we either have $Q^{(i)}(p/q)Q^{(j)}(p/q) = 0$, in which case by swapping $i$ and $j$ if necessary we may suppose $Q^{(i)}(p/q) = 0$ and as a consequence $P^{(i)}(p/q) \ne 0$, or we have $Q^{(i)}(p/q)Q^{(j)}(p/q) \ne 0$, in which case $P^{(i)}(p/q)/Q^{(i)}(p/q)$ and $P^{(j)}(p/q)/Q^{(j)}(p/q)$ are distinct rational numbers, and by swapping $i$ and $j$ if necessary we may suppose $Q^{(i)}(p/q) \ne 0$ and $P^{(i)}(p/q)/Q^{(i)}(p/q) \ne p'/q'$. Summarizing, since $i \le t + 1$ and $q^t \le C^h$ by Lemma 2.4, there exists $i \in \mathbb{N}$ satisfying:
\[ i \le \frac{\log C}{\log q}h + 1 \quad\text{and}\quad \begin{cases} Q^{(i)}(p/q) = 0,\ P^{(i)}(p/q) \ne 0, & \text{or} \\[4pt] Q^{(i)}(p/q) \ne 0,\ \dfrac{P^{(i)}(p/q)}{Q^{(i)}(p/q)} \ne \dfrac{p'}{q'}. \end{cases} \tag{3} \]

Lemma 2.5. Let $U$, $V$, and $W$ be positive real numbers and $\kappa > 1$. Assume that
\[ 2U^{1/\kappa}V^{1-1/\kappa} < W. \]
Then for $r \in \mathbb{R}_{>0}$,
\[ Ur^{-\kappa+1} + Vr \ge W \]
implies
\[ r \notin \left[ \Bigl(\frac{2U}{W}\Bigr)^{\frac{1}{\kappa-1}}, \frac{W}{2V} \right]. \]

Proof. By the assumption, $(U/V)^{1/\kappa}$ lies in the interior of the stated interval. If $r < (U/V)^{1/\kappa}$ then $Vr < Ur^{-\kappa+1}$, hence the inequality implies $2Ur^{-\kappa+1} > W$. If $r > (U/V)^{1/\kappa}$ then $Vr > Ur^{-\kappa+1}$, hence the inequality implies $2Vr > W$.

Lemma 2.6. Let $\alpha$ be an algebraic integer of degree $n$ and height $H$, let $h \in \mathbb{N}$, and let $S \in (\mathbb{Z}[\alpha])[x]$ be a polynomial of degree $d$, with $S = S_0 + S_1\alpha + \dots + S_{n-1}\alpha^{n-1}$. We define $R(x) := (x - \alpha)^hS(x)$ and for $i = 0, \dots, h$ define $S^{[i]} \in (\mathbb{Z}[\alpha])[x]$ by $R^{(i)} = (x - \alpha)^{h-i}S^{[i]}(x)$, written as
\[ S^{[i]}(x) = S_0^{[i]}(x) + \alpha S_1^{[i]}(x) + \dots + \alpha^{n-1}S_{n-1}^{[i]}(x). \]
Then each polynomial $S_j^{[i]}$ has coefficients divisible by $i!$ and height at most
\[ i!\,(H + 1)^{i+n-1}\, 2^{d+h+i+3} \max(H(S_0), \dots, H(S_{n-1})). \]

Proof. By the general Leibniz rule,
\[ R^{(i)}(x) = (x - \alpha)^{h-i} \sum_{c=0}^{i} i!\binom{h}{i-c}\frac{1}{c!}(x - \alpha)^c S^{(c)}(x). \]
We expand using the binomial theorem and apply Lemmas 1.24 and 1.26(i).

Proposition 2.7.
Let $\alpha$ be an algebraic integer of degree $n \ge 3$ and height $H$, and let $N$ be a positive integer. Then there is a positive constant $\delta$ such that for $h \ge 4N$, polynomials $P$, $Q \in \mathbb{Z}[x]$ as in Corollary 2.3, real number $\kappa > n/2 + 1$, relatively prime integers $p$ and $q$ satisfying (1) with $q \ge (4nH + 4n)^{(2N+1)2n}$, relatively prime integers $p'$ and $q'$ satisfying $q' > q$ and
\[ \left| \alpha - \frac{p'}{q'} \right| < \frac{1}{q'^\kappa}, \]
and natural number $i$ as in (3), we have
\[ q^{h+\lfloor(1+\frac{1}{N})(\frac{n}{2}-1)h\rfloor-1-i}\, q'^{-\kappa+1} + q^{h+\lfloor(1+\frac{1}{N})(\frac{n}{2}-1)h\rfloor-1-i-(h-i)\kappa}\, q' \ge \delta^h. \]

Proof. As in Corollary 2.3 we define $k := \lfloor(1 + 1/N)(n/2 - 1)h\rfloor - 1$. We define
\[ A := \frac{q^{h+k-i}}{i!}P^{(i)}\Bigl(\frac{p}{q}\Bigr), \qquad B := \frac{q^{h+k-i}}{i!}Q^{(i)}\Bigl(\frac{p}{q}\Bigr). \]
Since $P^{(i)}/i!$ and $Q^{(i)}/i!$ are integer polynomials of degree at most $h + k - i$, we know that $A$ and $B$ must be integers, and by (3) they satisfy $Bp' - Aq' \ne 0$. By Lemma 1.26(i) and (1) there is a constant $D$ depending only on $\alpha$ and $N$, such that $|A|, |B| \le q^{h+k-i}D^h$. Now by application of Lemma 2.6 there is a constant $E$, also depending only on $\alpha$ and $N$, such that
\begin{align*}
1 \le |Bp' - Aq'| &\le |B||p' - \alpha q'| + q'|A - \alpha B| \\
&\le q^{h+k-i}q'^{-\kappa+1}D^h + q^{h+k-i}q' \left| \alpha - \frac{p}{q} \right|^{h-i} E^h \\
&\le \bigl(q^{h+k-i}q'^{-\kappa+1} + q^{h+k-i-(h-i)\kappa}q'\bigr)\max(D, E)^h.
\end{align*}
Setting $\delta := 1/\max(D, E)$, we obtain the desired inequality.

In order for Proposition 2.7 to be useful, Lemma 2.5 should be applicable. The following result describes choices that make Lemma 2.5 applicable.

Proposition 2.8. Let an integer $n \ge 3$ and real numbers $\kappa > n/2 + 1$ and $C > 1$ be given. Then, for any $N \in \mathbb{N}_{>0}$ satisfying
\[ \frac{n}{2} + \frac{1}{N}\Bigl(\frac{n}{2} - 1\Bigr) < \kappa - 1, \tag{4} \]
real number $\beta > 1$ satisfying
\[ \Bigl(\frac{n}{2} + \frac{1}{N}\Bigl(\frac{n}{2} - 1\Bigr)\Bigr)\beta < \kappa - 1, \tag{5} \]
integer $h_0 \ge 4N$ satisfying
\[ 1 - \frac{1}{h_0} > \frac{1}{\beta}, \tag{6} \]
and positive real number $\delta$ there exists an integer $q_0 \ge C^2$ such that for all integers $h \ge h_0$ and $q \ge q_0$ and all $i \in \mathbb{N}$ with
\[ i \le \frac{\log C}{\log q}h + 1 \tag{7} \]
we have
\[ \frac{h}{h-i} < \beta, \tag{8} \]
\[ \frac{h + \lfloor(1 + \frac{1}{N})(\frac{n}{2} - 1)h\rfloor - 1 - i}{h-i} < \Bigl(\frac{n}{2} + \frac{1}{N}\Bigl(\frac{n}{2} - 1\Bigr)\Bigr)\beta, \tag{9} \]
\[ 2U^{1/\kappa}V^{1-1/\kappa} < W, \tag{10} \]
where
\[ U := q^{h+\lfloor(1+\frac{1}{N})(\frac{n}{2}-1)h\rfloor-1-i}, \tag{11} \]
\[ V := q^{h+\lfloor(1+\frac{1}{N})(\frac{n}{2}-1)h\rfloor-1-i-(h-i)\kappa}, \tag{12} \]
\[ W := \delta^h. \tag{13} \]

Proof. Without loss of generality we have $\delta \le 1$. We may choose $q_0 \ge C^2$ so that
\[ 1 - \frac{1}{h_0} - \frac{\log C}{\log q_0} > \frac{1}{\beta}. \]
Then
\[ \frac{h}{h-i} \le \frac{1}{1 - \frac{\log C}{\log q} - \frac{1}{h}} < \beta. \]
With $k := \lfloor(1 + 1/N)(n/2 - 1)h\rfloor - 1$, then,
\[ \frac{h+k-i}{h-i} \le \frac{h+k}{h}\cdot\frac{h}{h-i} < \Bigl(\frac{n}{2} + \frac{1}{N}\Bigl(\frac{n}{2} - 1\Bigr)\Bigr)\beta < \kappa - 1. \]
Now the condition $2U^{1/\kappa}V^{1-1/\kappa} < W$ is equivalent to
\[ \frac{2^{1/(h-i)}}{\delta^{h/(h-i)}}\, q^{\frac{h+k-i}{h-i}} < q^{\kappa-1}, \tag{14} \]
and we see, the first factor on the left-hand side must lie between $\delta^{-1}$ and $2\delta^{-\beta}$. So, after suitably increasing $q_0$ if needed, we have that (14) holds for all $q \ge q_0$.

Corollary 2.9. Let $\alpha$ be an algebraic integer of degree $n \ge 3$, and let $\kappa > n/2 + 1$. Then there exist integers $h_0$ and $q_0$ and a positive constant $\gamma < 1$ such that for any relatively prime integers $p$ and $q$ satisfying $q \ge q_0$ and (1) and any integer $h \ge h_0$, if $p'$ and $q'$ are relatively prime integers with $q' > q$ and
\[ \left| \alpha - \frac{p'}{q'} \right| < \frac{1}{q'^\kappa}, \]
then
\[ q' \notin [q^{\gamma h}, q^h]. \]

Proof. We make choices of constants as in Proposition 2.8. So, first, $N \in \mathbb{N}_{>0}$ should be chosen to satisfy (4), and as in Lemma 2.4 we set $C := (4nH + 4n)^{(2N+1)n}$, where $H$ denotes the height of $\alpha$. Now, to get the conclusion stated here, we impose the requirement that $\beta > 1$ should satisfy
\[ \beta\Bigl(1 + \frac{n}{2} + \frac{1}{N}\Bigl(\frac{n}{2} - 1\Bigr)\Bigr) < \kappa; \tag{15} \]
notice that this implies (5). We let $h_0$ be as in (6) and $\delta$ as in Proposition 2.7.
Now Proposition 2.8 supplies an integer $q_0 \ge C^2$ such that for all $h \ge h_0$, $q \ge q_0$, and $i \in \mathbb{N}$ satisfying (7), inequalities (8)–(10) are satisfied; let $U$, $V$, and $W$ be as defined in (11)–(13). We observe, by (8) and (9),
\begin{align*}
\frac{W}{2V} &= \frac{\delta^h}{2}\, q^{(h-i)\bigl(\kappa - \frac{h+\lfloor(1+1/N)(n/2-1)h\rfloor-1-i}{h-i}\bigr)} \\
&> \frac{\delta^h}{2}\, q^{\frac{h}{\beta}\bigl(\kappa - (\frac{n}{2}+\frac{1}{N}(\frac{n}{2}-1))\beta\bigr)} \\
&= \Bigl(\frac{\delta}{2^{1/h}}\, q^{\frac{\kappa}{\beta} - (\frac{n}{2}+\frac{1}{N}(\frac{n}{2}-1))}\Bigr)^h.
\end{align*}
In the last expression, the exponent of $q$ (within the outer brackets) is, by (15), greater than 1. So, for sufficiently large $q$ we have $W/(2V) > q^h$. Similarly, using (9),
\begin{align*}
\Bigl(\frac{2U}{W}\Bigr)^{\frac{1}{\kappa-1}} &= 2^{\frac{1}{\kappa-1}}\delta^{-\frac{h}{\kappa-1}}\, q^{\frac{h+\lfloor(1+1/N)(n/2-1)h\rfloor-1-i}{\kappa-1}} \\
&< 2^{\frac{1}{\kappa-1}}\delta^{-\frac{h}{\kappa-1}}\, q^{(h-i)\frac{(n/2+(1/N)(n/2-1))\beta}{\kappa-1}} \\
&\le 2^{\frac{1}{\kappa-1}}\delta^{-\frac{h}{\kappa-1}}\, q^{h\frac{(n/2+(1/N)(n/2-1))\beta}{\kappa-1}}.
\end{align*}
By reasoning as above, if we fix $\gamma$ such that
\[ \frac{(\frac{n}{2} + \frac{1}{N}(\frac{n}{2} - 1))\beta}{\kappa-1} < \gamma < 1, \]
then for sufficiently large $q$ we have $(2U/W)^{1/(\kappa-1)} < q^{\gamma h}$. The result now follows from Lemma 2.5 and Propositions 2.7 and 2.8.

2.3 Proof of theorem

Given the last result established in §2.2, the proof of Theorem 2.1 goes quite quickly.

Proof of Theorem 2.1. We have already seen that for the proof it suffices to consider the case that $\alpha$ is an algebraic integer of degree $n \ge 3$. Given $\varepsilon > 0$ we set $\kappa := n/2 + 1 + \varepsilon$ and let $h_0$, $q_0$, and $\gamma$ be as in Corollary 2.9. Increasing $h_0$ if necessary, we may suppose that
\[ h_0 \ge \frac{\gamma}{1-\gamma}. \]
We need to show that there are only finitely many rational numbers $p/q$, where $p$ and $q$ are relatively prime integers with $q > 0$, such that
\[ \left| \alpha - \frac{p}{q} \right| < \frac{1}{q^\kappa}. \]
We are, of course, done if all such $p/q$ satisfy $q < q_0$. So we suppose that there is some such rational approximation $p/q$ with $q \ge q_0$. By applying Corollary 2.9 to $h = h_0, h_0 + 1, \dots$ and observing from our additional requirement on $h_0$ that the adjacent pairs of excluded intervals overlap each other, we find that for any relatively prime integers $p'$ and $q'$ with $q' > q$ and
\[ \left| \alpha - \frac{p'}{q'} \right| < \frac{1}{q'^\kappa}, \]
we have $q' < q^{\gamma h_0}$. This observation completes the proof of Thue’s theorem.
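The overlap of adjacent excluded intervals, used in the last step, can be unpacked as follows. This is only a spelling-out of the requirement $h_0 \ge \gamma/(1-\gamma)$, with the intervals taken from Corollary 2.9 and using $q > 1$ and $0 < \gamma < 1$.

```latex
% Consecutive excluded intervals from Corollary 2.9, for h and h + 1:
\[
  [q^{\gamma h},\, q^{h}]
  \quad\text{and}\quad
  [q^{\gamma (h+1)},\, q^{h+1}].
\]
% They overlap exactly when the left end of the second interval
% does not exceed the right end of the first:
\[
  q^{\gamma (h+1)} \le q^{h}
  \iff \gamma (h+1) \le h
  \iff h \ge \frac{\gamma}{1-\gamma}.
\]
% Hence, for h_0 \ge \gamma/(1-\gamma), the excluded intervals cover a whole ray:
\[
  \bigcup_{h \ge h_0} [q^{\gamma h},\, q^{h}] = [q^{\gamma h_0},\, \infty).
\]
```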
A natural question is, for given $\alpha$ and $\varepsilon$, how in practice to list all the rational approximations satisfying the bound in Thue’s theorem. One can search for such approximations, by letting $q$ run from 1 up to any given positive integer and testing for each $q$ if some relatively prime integer $p$ satisfies the inequality. One can (after reducing to the case that $\alpha$ is an algebraic integer) also compute constants $h_0$, $q_0$, and $\gamma$ as in Corollary 2.9: we have made constants explicit in §2.1 and throughout most of §2.2, and one could continue this effort and produce explicit forms for constants $D$ and $E$ in Proposition 2.7 (which determine $\delta$) and $q_0$ in Proposition 2.8. However, one is left with the logical alternative in the proof of Theorem 2.1, that either all approximations $p/q$ meeting the bound in Thue’s theorem satisfy $q < q_0$, in which case we just search this far for approximations, or some such $p/q$ exists with $q \ge q_0$, in which case it suffices to search up to $q^{\gamma h_0}$. But it is not known, in general, how to distinguish these cases. So we say that Thue’s theorem is an ineffective result, meaning that it describes a finite set that we have no way of determining.

For a certain class of Diophantine equations, Thue’s theorem implies that the set of solutions is finite. The Diophantine equation in the next result is the Thue equation.

Theorem 2.10. Let $n \ge 3$, let $F(x, y)$ be an irreducible homogeneous polynomial of degree $n$ whose coefficients are integers with gcd 1, and let $m$ be a nonzero integer. Then
\[ F(x, y) = m \]
has only finitely many solutions in integers $x$ and $y$.

Proof. The solutions with $y = 0$ are easily determined, so we assume $y \ne 0$ throughout the proof. Let $f(x) := F(x, 1)$ be the non-homogeneous form of the given polynomial, with leading coefficient $a$ and roots $\alpha_1, \dots, \alpha_n \in \mathbb{C}$. The given Diophantine equation is equivalent to
\[ \frac{a}{m}\prod_{i=1}^{n}\Bigl(\frac{x}{y} - \alpha_i\Bigr) = \frac{1}{y^n}. \tag{1} \]
Let
\[ \delta := \frac{1}{2}\min_{1 \le i < j \le n} |\alpha_i - \alpha_j|. \]
Notice, if $|x/y - \alpha_i| \ge \delta$ for all $i$ then (1) implies $|y| \le |m/a|^{1/n}/\delta$; there are thus only finitely many integer solutions $(x, y)$ obeying this bound. It remains to treat the case that for some $i$ we have $|x/y - \alpha_i| < \delta$. The other $n - 1$ factors in the product in (1) all have absolute value greater than $\delta$. So, we have
\[ \left| \alpha_i - \frac{x}{y} \right| < \frac{|m/a|\,\delta^{1-n}}{|y|^n}. \tag{2} \]
Let $\kappa$ be any real number with
\[ \frac{n}{2} + 1 < \kappa < n. \]
By Thue’s theorem there are only finitely many pairs of relatively prime integers $p$ and $q$ with $q > 0$, such that
\[ \left| \alpha_i - \frac{p}{q} \right| < \frac{1}{q^\kappa}. \]
This implies that there are only finitely many pairs of integers $x$ and $y$ such that
\[ \left| \alpha_i - \frac{x}{y} \right| < \frac{1}{|y|^\kappa}, \tag{3} \]
since to each rational approximation $p/q$ obeying the given bound there correspond just finitely many pairs of integers $x$ and $y$ with $x/y = p/q$ satisfying (3). Since $\kappa < n$, the finiteness assertion for (3) implies that there are only finitely many pairs of integers $x$ and $y$ satisfying (2).

3 Roth’s theorem

Roth’s theorem, proved in 1955 and for which Roth was awarded the Fields medal in 1958, says that every irrational algebraic number in $\mathbb{R}$ has irrationality measure 2. Since the case of quadratic irrational numbers has already been discussed, the theorem is stated for algebraic numbers of degree $\ge 3$.

Theorem 3.1 (Roth). Let $\alpha \in \mathbb{R}$ be an algebraic number of degree $n \ge 3$. For every $\varepsilon > 0$ the inequality
\[ \left| \alpha - \frac{p}{q} \right| < \frac{1}{q^{2+\varepsilon}} \]
holds for only finitely many rational numbers $p/q$, where $p$ and $q$ are relatively prime integers with $q > 0$.

Roth’s theorem is also called the Thue-Siegel-Roth theorem, since it stands as the final improvement to Thue’s theorem (1909), which Siegel (1921) had strengthened, replacing the exponent $n/2 + 1 + \varepsilon$ by $n/k + k - 1 + \varepsilon$ for $k \in \mathbb{N}$, $(k - 1)k < n \le k(k + 1)$. This way, the chapter on Thue’s theorem is logically unnecessary, though many of the same techniques are used here in a more sophisticated setting.
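To compare the exponents named so far, the following sketch tabulates, for a given degree $n$, the limiting values (as $\varepsilon \to 0$) of the exponents of Liouville ($n$), Thue ($n/2 + 1$), Siegel ($n/k + k - 1$ with $k$ determined by $(k-1)k < n \le k(k+1)$), and Roth ($2$). The function names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def liouville_exponent(n):
    return Fraction(n)


def thue_exponent(n):
    return Fraction(n, 2) + 1


def siegel_k(n):
    """The unique k >= 1 with (k - 1) * k < n <= k * (k + 1)."""
    k = 1
    while k * (k + 1) < n:
        k += 1
    return k


def siegel_exponent(n):
    k = siegel_k(n)
    return Fraction(n, k) + k - 1


def exponent_table(degrees):
    """Rows (n, Liouville, Thue, Siegel, Roth), each exponent without the +epsilon."""
    return [
        (n, liouville_exponent(n), thue_exponent(n), siegel_exponent(n), Fraction(2))
        for n in degrees
    ]
```

For $n = 3$ Siegel's exponent coincides with Thue's (both $5/2$); the two separate from $n = 7$ onward, for example $13/3$ against $9/2$.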
Exactly as for Thue’s theorem, we make the initial reduction step, that it suffices to prove Roth’s theorem under the assumption that $\alpha$ is an algebraic integer.

For the proof we follow J. W. S. Cassels, An Introduction to Diophantine Approximation.

3.1 Wronskians

The proof of Roth’s theorem uses, as auxiliary functions, polynomials in several variables. We need a generalization of the fact, used in Lemma 2.4, that for a pair of polynomials in one variable, neither a constant multiple of the other, the $2 \times 2$ matrix with the polynomials and their derivatives has nonzero determinant. We work over $\mathbb{Q}$ here, though this could be replaced by any field of characteristic 0.

Let $m \in \mathbb{N}_{>0}$, and let $K := \mathbb{Q}(x_1, \dots, x_m)$. The next result relates two notions, the $\mathbb{Q}$-linear independence of elements $f_1, \dots, f_\ell \in K$ and the $K$-linear independence of vectors
\[ (D_1f_1, \dots, D_1f_\ell), \dots, (D_\ell f_1, \dots, D_\ell f_\ell) \in K^\ell \]
for differential operators $D_1, \dots, D_\ell$, each a monomial in $\partial/\partial x_1, \dots, \partial/\partial x_m$. The latter can be detected by $\ell \times \ell$ determinants, which are called generalized Wronskians. A monomial in differential operators
\[ \frac{\partial}{\partial x_{a_1}} \cdots \frac{\partial}{\partial x_{a_r}} \]
has a well-defined degree $r$. The identity operator is the unique such differential operator having degree 0.

Proposition 3.2. Let $m \in \mathbb{N}_{>0}$, and let $K := \mathbb{Q}(x_1, \dots, x_m)$. For $\ell \in \mathbb{N}_{>0}$ and elements $f_1, \dots, f_\ell \in K$ the following are equivalent:
(i) The elements $f_1, \dots, f_\ell$ are $\mathbb{Q}$-linearly independent.
(ii) There exist differential operators $D_1, \dots, D_\ell$, where for each $i$ the operator $D_i$ is a monomial in $\partial/\partial x_1, \dots, \partial/\partial x_m$ of degree less than $i$, such that the generalized Wronskian
\[ \det(D_if_j)_{1 \le i,j \le \ell} \tag{1} \]
is nonzero.

Proof. The implication (ii) $\Rightarrow$ (i) is obvious. We prove (i) $\Rightarrow$ (ii) by induction on $\ell$. The case $\ell = 1$ is trivial. Given $\ell > 1$ and assuming the result known for $(\ell - 1)$-tuples of elements of $K$, we suppose that $f_1, \dots, f_\ell$ are such that the generalized Wronskians (1) all vanish.
We may further suppose that $f_1, \dots, f_{\ell-1}$ are $\mathbb{Q}$-linearly independent, since if $f_1, \dots, f_{\ell-1}$ are $\mathbb{Q}$-linearly dependent then so are $f_1, \dots, f_\ell$. Applying the induction hypothesis, then, some generalized Wronskian of size $(\ell - 1) \times (\ell - 1)$ is nonzero:
\[ \det(D_if_j)_{1 \le i,j \le \ell-1} \ne 0. \tag{2} \]
We consider the $(\ell - 1) \times \ell$ matrix over $K$
\[ \begin{pmatrix} D_1f_1 & \dots & D_1f_\ell \\ \vdots & & \vdots \\ D_{\ell-1}f_1 & \dots & D_{\ell-1}f_\ell \end{pmatrix}. \]
By (2), this is a matrix of rank $\ell - 1$, whose kernel is the $K$-span of $(a_1, \dots, a_{\ell-1}, 1)$ for some $a_1, \dots, a_{\ell-1} \in K$. By assumption the generalized Wronskians (1) vanish, so
\[ a_1Df_1 + \dots + a_{\ell-1}Df_{\ell-1} + Df_\ell = 0 \tag{3} \]
for every monomial $D$ of degree less than $\ell$ in $\partial/\partial x_1, \dots, \partial/\partial x_m$. We apply $\partial/\partial x_h$ to (3):
\[ \frac{\partial a_1}{\partial x_h}Df_1 + \dots + \frac{\partial a_{\ell-1}}{\partial x_h}Df_{\ell-1} + a_1\frac{\partial}{\partial x_h}Df_1 + \dots + a_{\ell-1}\frac{\partial}{\partial x_h}Df_{\ell-1} + \frac{\partial}{\partial x_h}Df_\ell = 0. \tag{4} \]
When $D$ has degree less than $\ell - 1$, the operator $(\partial/\partial x_h)D$ has degree less than $\ell$, and by (3) the rightmost $\ell$ terms in (4) sum to zero. This is the case when $D \in \{D_1, \dots, D_{\ell-1}\}$; by these instances of (4), using (2), we have $\partial a_j/\partial x_h = 0$ for $1 \le j \le \ell - 1$ and $1 \le h \le m$. This implies $a_j \in \mathbb{Q}$ for $j = 1, \dots, \ell - 1$, and now $a_1f_1 + \dots + a_{\ell-1}f_{\ell-1} + f_\ell = 0$ is a nontrivial $\mathbb{Q}$-linear relation among $f_1, \dots, f_\ell$.

3.2 Multivariable auxiliary functions

Let $\alpha$ be an algebraic integer of degree $n \ge 3$. The auxiliary functions used in the proof of Roth’s theorem will be polynomials $f \in \mathbb{Q}[x_1, \dots, x_m]$ where $m$ depends on $n$ and the constant $\varepsilon$ in the statement of Roth’s theorem, vanishing to high order (according to a suitable measure of order of vanishing) at $(\alpha, \dots, \alpha)$.

Definition. Let $F \in \mathbb{Q}[x_1, \dots, x_m]$ and $\alpha_1, \dots, \alpha_m \in \mathbb{C}$. The index of $F$ at $(\alpha_1, \dots, \alpha_m)$ relative to given positive integers $r_1, \dots, r_m$ is defined to be
\[ \operatorname{ind}(F) = \operatorname{ind}_{(\alpha_1,\dots,\alpha_m)}(F) := \min_{(i_1,\dots,i_m)} \Bigl(\frac{i_1}{r_1} + \dots + \frac{i_m}{r_m}\Bigr), \]
where the minimum is taken over all $(i_1, \dots, i_m)$ such that the coefficient of $(x_1 - \alpha_1)^{i_1} \cdots (x_m - \alpha_m)^{i_m}$ in the Taylor expansion of $F$ around $(\alpha_1, \dots, \alpha_m)$ is nonzero. (When $F = 0$ we make the convention that the index is infinite.) Equivalently, the index is the minimum of the given expression over all $(i_1, \dots, i_m)$ such that
\[ \frac{\partial^{i_1+\dots+i_m}F}{\partial x_1^{i_1} \cdots \partial x_m^{i_m}}(\alpha_1, \dots, \alpha_m) \ne 0. \]

Proposition 3.3. Given $\alpha_1, \dots, \alpha_m \in \mathbb{C}$ and $r_1, \dots, r_m \in \mathbb{N}_{>0}$ the index at $(\alpha_1, \dots, \alpha_m)$ relative to $r_1, \dots, r_m$ satisfies the following properties:
(i) $\operatorname{ind}(\partial^{i_1+\dots+i_m}F/\partial x_1^{i_1} \cdots \partial x_m^{i_m}) \ge \operatorname{ind}(F) - \sum_\nu i_\nu/r_\nu$;
(ii) $\operatorname{ind}(F + G) \ge \min(\operatorname{ind}(F), \operatorname{ind}(G))$;
(iii) $\operatorname{ind}(FG) = \operatorname{ind}(F) + \operatorname{ind}(G)$;
(iv) $\operatorname{ind}(F)$ is, for $F \in \mathbb{Q}[x_1, \dots, x_k]$ with $k < m$, equal to the index of $F$ at $(\alpha_1, \dots, \alpha_k)$ relative to $r_1, \dots, r_k$.

Proof. By the definition of index as stated in terms of vanishing of partial derivatives of $F$ at $(\alpha_1, \dots, \alpha_m)$, we have (i). Assertion (ii) is obvious, and (iii) is immediate from the observation that $\operatorname{ind}(F)$ is the least exponent of $t$ occurring after substituting $x_i := \alpha_i + t^{1/r_i}y_i$. Under the assumption in (iv) all monomials with nonzero coefficient in the Taylor expansion of $F$ around $(\alpha_1, \dots, \alpha_m)$ have $i_{k+1} = \dots = i_m = 0$, and from this the property is immediate.

Lemma 3.4. Let $m$ be a positive integer. Given positive integers $r_1, \dots, r_m$ and real $0 < \delta < 1$, the number of $m$-tuples $(i_1, \dots, i_m)$ of natural numbers satisfying $i_\nu \le r_\nu$ for all $\nu$ and
\[ \frac{i_1}{r_1} + \dots + \frac{i_m}{r_m} \le \frac{1}{2}m(1 - \delta) \]
is at most
\[ \frac{(r_1 + 1) \cdots (r_m + 1)}{\delta}\sqrt{\frac{2}{m}}. \]

Proof. We prove the result by induction on $m$. The result is trivial if $\delta \le \sqrt{2/m}$ and in particular is trivial for $m = 1$ and $m = 2$. So we suppose $m > 2$ and $\delta > \sqrt{2/m}$. Fixing $i_m$, by the induction hypothesis the number of $(i_1, \dots, i_{m-1})$ satisfying the given inequality is at most
\[ (r_1 + 1) \cdots (r_{m-1} + 1)\frac{\sqrt{2m - 2}}{\delta m - 1 + 2\frac{i_m}{r_m}}. \]
We have
\begin{align*}
\sum_{i=0}^{r_m} \frac{1}{\delta m - 1 + 2\frac{i}{r_m}} &= \frac{1}{2}\sum_{i=0}^{r_m}\Bigl(\frac{1}{\delta m - 1 + 2\frac{i}{r_m}} + \frac{1}{\delta m + 1 - 2\frac{i}{r_m}}\Bigr) \\
&= \sum_{i=0}^{r_m} \frac{\delta m}{\delta^2m^2 - (1 - 2\frac{i}{r_m})^2} \\
&\le (r_m + 1)\frac{\delta m}{\delta^2m^2 - 1} \\
&< (r_m + 1)\frac{1}{\delta\sqrt{m(m - 1)}},
\end{align*}
where in the last step we have used the assumption $\delta > \sqrt{2/m}$:
\[ \delta^2m^2 - 1 > \delta^2m^2\Bigl(1 - \frac{1}{2m}\Bigr) > \delta^2m^2\sqrt{1 - \frac{1}{m}}. \]
This gives the desired bound.

The following result supplies the auxiliary functions used in the proof of Roth’s theorem.

Proposition 3.5. Let $\alpha$ be an algebraic integer of degree $n \ge 3$ and height $H$, and let $\delta > 0$. For any integer $m$ with
\[ m > \frac{8n^2}{\delta^2} \]
and $r_1, \dots, r_m \in \mathbb{N}_{>0}$, there exists nonzero $F \in \mathbb{Z}[x_1, \dots, x_m]$ having degree in $x_j$ at most $r_j$ for all $j$, index at least
\[ \frac{1}{2}m(1 - \delta) \]
at $(\alpha, \dots, \alpha)$ relative to $r_1, \dots, r_m$, and height at most
\[ (4H + 4)^{r_1+\dots+r_m}. \]

Proof. We write
\[ F = \sum c_{i_1 \dots i_m} x_1^{i_1} \cdots x_m^{i_m} \]
where the multi-index ranges over $0 \le i_1 \le r_1, \dots, 0 \le i_m \le r_m$. The number of unknown coefficients is $(r_1 + 1) \cdots (r_m + 1)$. To have index at least $(1/2)m(1 - \delta)$ imposes the constraints
\[ \frac{1}{i_1! \cdots i_m!}\frac{\partial^{i_1+\dots+i_m}F}{\partial x_1^{i_1} \cdots \partial x_m^{i_m}}(\alpha, \dots, \alpha) = 0 \]
for all $i_1, \dots, i_m$ with
\[ \frac{i_1}{r_1} + \dots + \frac{i_m}{r_m} < \frac{1}{2}m(1 - \delta). \]
The number of constraints is at most the quantity appearing in Lemma 3.4; each constraint consists of $n$ linear equations (one for every basis element of $\mathbb{Z}[\alpha]$) in the $c_{i_1 \dots i_m}$ with coefficients of absolute value at most $(2H + 2)^{r_1+\dots+r_m}$. Indeed, differentiating and dividing by the appropriate factorials introduces coefficients bounded by $2^{r_1+\dots+r_m}$ (cf. Lemma 1.26(ii)), and the expression of powers of $\alpha$ in terms of the standard basis of $\mathbb{Z}[\alpha]$ multiplies these coefficients by at most $(H + 1)^{r_1+\dots+r_m}$ (by Lemma 1.24). We apply Lemma 1.25. By the hypothesis on $m$, the exponent in Lemma 1.25 is less than 1. So a solution exists, corresponding to nonzero $F$ meeting the degree and index requirements, with height at most $(r_1 + 1) \cdots (r_m + 1)(2H + 2)^{r_1+\dots+r_m}$.
Bounding $r_i + 1$ by $2^{r_i}$ leads to the height bound in the statement.

3.3 Index at nearby rational points

Given a collection of good rational approximations to $\alpha$ we will obtain a lower bound on the index of an auxiliary function as in §3.2. The lower bound will be applicable when the denominators of the rational approximations are large and close to each other in a weighted sense depending on the given $r_1, \dots, r_m$ and will be paired with a general upper bound, obtained using the machinery of Wronskians.

Proposition 3.6. Let $\alpha$ be an algebraic integer of degree $n \ge 3$ and height $H$, and let $\delta$ and $\varepsilon$ be positive real numbers with
\[ 15\delta < \varepsilon < \frac{1}{12}. \]
Given positive integers $r_1, \dots, r_m$ and pairs of relatively prime integers $p_i$ and $q_i$ with $q_i > 0$ and
\[ \left| \alpha - \frac{p_i}{q_i} \right| < \frac{1}{q_i^{2+\varepsilon}} \]
for $i = 1, \dots, m$ such that, for every $i$,
\[ q_i^\delta > 64(H + 1)\max(1, |\alpha|), \]
and
\[ r_1\log q_1 \le r_i\log q_i \le (1 + \delta)r_1\log q_1, \]
we have
\[ \operatorname{ind}_{(p_1/q_1,\dots,p_m/q_m)}(F) \ge \frac{1}{8}\varepsilon m \]
for all $F \in \mathbb{Z}[x_1, \dots, x_m]$ with $x_j$-degree $\le r_j$ for all $j$, $\operatorname{ind}_{(\alpha,\dots,\alpha)}(F) \ge (1/2)m(1 - \delta)$, and $H(F) \le (4H + 4)^{r_1+\dots+r_m}$.

Proof. Given $F \in \mathbb{Z}[x_1, \dots, x_m]$ as in the statement and $i_1, \dots, i_m$ such that
\[ \frac{i_1}{r_1} + \dots + \frac{i_m}{r_m} < \frac{1}{8}\varepsilon m \]
we need to show that
\[ G := \frac{1}{i_1! \cdots i_m!}\frac{\partial^{i_1+\dots+i_m}F}{\partial x_1^{i_1} \cdots \partial x_m^{i_m}} \]
satisfies
\[ G\Bigl(\frac{p_1}{q_1}, \dots, \frac{p_m}{q_m}\Bigr) = 0. \]
By Lemma 1.26(ii) we have $H(G) \le (8H + 8)^{r_1+\dots+r_m}$. By Proposition 3.3(i), $\operatorname{ind}_{(\alpha,\dots,\alpha)}(G) \ge (1/2)m(1 - \delta) - (1/8)\varepsilon m > (1/2)m(1 - \varepsilon/3)$. We express $G(p_1/q_1, \dots, p_m/q_m)$ using the Taylor expansion around $(\alpha, \dots, \alpha)$, which by the index bound reduces to
\[ \sum_{\frac{j_1}{r_1}+\dots+\frac{j_m}{r_m} \ge \frac{1}{2}m(1-\frac{\varepsilon}{3})} \frac{1}{j_1! \cdots j_m!}\frac{\partial^{j_1+\dots+j_m}G}{\partial x_1^{j_1} \cdots \partial x_m^{j_m}}(\alpha, \dots, \alpha)\Bigl(\frac{p_1}{q_1} - \alpha\Bigr)^{j_1} \cdots \Bigl(\frac{p_m}{q_m} - \alpha\Bigr)^{j_m}. \]
Notice,
\[ \left| \frac{1}{j_1! \cdots j_m!}\frac{\partial^{j_1+\dots+j_m}G}{\partial x_1^{j_1} \cdots \partial x_m^{j_m}}(\alpha, \dots, \alpha) \right| \le \bigl(32(H + 1)\max(1, |\alpha|)\bigr)^{r_1+\dots+r_m}, \]
by Lemma 1.26(ii) and the bound $(r_1 + 1) \cdots (r_m + 1) \le 2^{r_1+\dots+r_m}$ on the number of terms in $G$.
As well,
\[
\begin{aligned}
-\log\left|\left(\frac{p_1}{q_1} - \alpha\right)^{j_1} \cdots \left(\frac{p_m}{q_m} - \alpha\right)^{j_m}\right|
&\ge (2+\varepsilon)\sum_{\nu=1}^{m} j_\nu \log q_\nu\\
&\ge (2+\varepsilon)\,\frac{1}{2}m\left(1 - \frac{\varepsilon}{3}\right) r_1 \log q_1\\
&\ge \left(1 + \frac{\varepsilon}{2}\right)\left(1 - \frac{\varepsilon}{3}\right)(1+\delta)^{-1}\sum_{\nu=1}^{m} r_\nu \log q_\nu.
\end{aligned}
\]
Since $(1+\varepsilon/2)(1-\varepsilon/3) = 1 + (1/6)\varepsilon(1-\varepsilon) > (1+\delta)^2$, we have
\[
\left|\left(\frac{p_1}{q_1} - \alpha\right)^{j_1} \cdots \left(\frac{p_m}{q_m} - \alpha\right)^{j_m}\right| < \left|q_1^{r_1} \cdots q_m^{r_m}\right|^{-1-\delta}.
\]
Combining, we have at most $2^{r_1+\cdots+r_m}$ terms, each of absolute value at most
\[
\bigl(32(H+1)\max(1, |\alpha|)\bigr)^{r_1+\cdots+r_m}\left|q_1^{r_1} \cdots q_m^{r_m}\right|^{-1-\delta}.
\]
It follows that the absolute value of the integer
\[
G(p_1/q_1, \dots, p_m/q_m)\, q_1^{r_1} \cdots q_m^{r_m}
\]
is at most
\[
\bigl(64(H+1)\max(1, |\alpha|)\bigr)^{r_1+\cdots+r_m}\left|q_1^{r_1} \cdots q_m^{r_m}\right|^{-\delta},
\]
and by the hypothesis on $q_i^{\delta}$ this is less than 1. Hence this integer is zero, and so $G(p_1/q_1, \dots, p_m/q_m) = 0$.

Proposition 3.7. Let $m$ be a positive integer and $\delta \in \mathbb{R}$ with $0 < \delta < 1/12$, and set
\[
\gamma := \frac{24}{2^m}\left(\frac{\delta}{12}\right)^{2^{m-1}}.
\]
Given positive integers $r_1, \dots, r_m$ satisfying $r_{i+1} \le \gamma r_i$ for $i = 1, \dots, m-1$ and pairs of relatively prime integers $p_i$ and $q_i$ with $q_i > 0$ for $i = 1, \dots, m$ such that for every $i$,
\[
q_i^{\gamma} \ge 8^m \quad\text{and}\quad r_1 \log q_1 \le r_i \log q_i,
\]
we have
\[
\operatorname{ind}_{(p_1/q_1, \dots, p_m/q_m)}(F) \le \delta
\]
for every $0 \ne F \in \mathbb{Z}[x_1, \dots, x_m]$ with $x_j$-degree $\le r_j$ for all $j$ and $H(F) \le q_1^{\gamma r_1}$.

Proof. We prove the result by induction on $m$. For the base case $m = 1$ we write
\[
F(x) = (qx - p)^t\, G(x)
\]
(where we have omitted the subscript 1 from $p$, $q$, and $x$), with $G(p/q) \ne 0$. We have $G \in \mathbb{Z}[x]$ (Gauss's lemma). So $q^t$ divides the leading coefficient of $F$, and thus (writing $r$ for $r_1$, and noting that $\gamma = \delta$ when $m = 1$)
\[
q^t \le H(F) \le q^{\delta r}.
\]
This gives the result, since $\operatorname{ind}_{p/q}(F) = t/r$.

For the inductive step, we suppose $m > 1$ and the result known for smaller values of $m$. We may write
\[
F = \sum_{i=1}^{\ell} g_i(x_1, \dots, x_{m-1})\, h_i(x_m),
\]
where $g_i$ and $h_i$ are polynomials with rational coefficients in the indicated variables and $\ell$ is minimal; certainly $\ell \le r_m + 1$. A $\mathbb{Q}$-linear dependence among $g_1, \dots, g_\ell$ or among $h_1, \dots, h_\ell$ would allow us to write $F$ as above with fewer than $\ell$ summands. So the minimality of $\ell$ implies that $g_1, \dots, g_\ell$ are $\mathbb{Q}$-linearly independent and $h_1, \dots, h_\ell$ are $\mathbb{Q}$-linearly independent.

We apply Proposition 3.2. Applied to $g_1, \dots, g_\ell$, it shows that there exist $D_1, \dots, D_\ell$, where $D_i$ is a monomial in $\partial/\partial x_1, \dots, \partial/\partial x_{m-1}$ of degree less than $i$ for every $i$, such that
\[
\det(D_i g_j)_{1 \le i,j \le \ell} \ne 0.
\]
In the one-variable case (usual Wronskians), the $D_i h_j$ are just $h_j, h_j', \dots$, and we obtain
\[
\det\bigl(h_j^{(i-1)}\bigr)_{1 \le i,j \le \ell} \ne 0.
\]
Writing $D_i$ as $\partial^{a_{i1}+\cdots+a_{i(m-1)}}/\partial x_1^{a_{i1}} \cdots \partial x_{m-1}^{a_{i(m-1)}}$, we introduce
\[
\Delta_i := \frac{1}{a_{i1}! \cdots a_{i(m-1)}!}\,\frac{\partial^{a_{i1}+\cdots+a_{i(m-1)}}}{\partial x_1^{a_{i1}} \cdots \partial x_{m-1}^{a_{i(m-1)}}}
\]
and
\[
u := \det(\Delta_i g_j)_{1 \le i,j \le \ell}.
\]
Similarly, we introduce
\[
v := \det\left(\frac{1}{(i-1)!}\, h_j^{(i-1)}\right)_{1 \le i,j \le \ell}.
\]
By the multiplicativity of the determinant, the polynomial with integer coefficients
\[
W := \det\left(\Delta_i\, \frac{1}{(j-1)!}\,\frac{\partial^{j-1} F}{\partial x_m^{j-1}}\right)_{1 \le i,j \le \ell}
\]
satisfies $W = uv$. There exists $c \in \mathbb{Q}$, unique up to sign, so that $V := cv$ has integer coefficients with gcd 1. Then $U := c^{-1}u$ as well has integer coefficients. So we have the factorization of integer polynomials
\[
W(x_1, \dots, x_m) = U(x_1, \dots, x_{m-1})\, V(x_m). \tag{1}
\]
Since $W$ is the determinant of an $\ell \times \ell$ matrix whose entries are polynomials with $x_j$-degree at most $r_j$ for every $j$, the $x_j$-degree of $W$ is at most $\ell r_j$. By Lemma 1.26(ii) and the hypothesis, the entries of the matrix have height at most $2^{r_1+\cdots+r_m} q_1^{\gamma r_1}$. The determinant is a sum of $\ell! \le \ell^{r_m} \le 2^{\ell r_m}$ terms, each of height at most
\[
(r_1+1)^{\ell} \cdots (r_m+1)^{\ell}\,\bigl(2^{r_1+\cdots+r_m} q_1^{\gamma r_1}\bigr)^{\ell},
\]
which is $\le 4^{(r_1+\cdots+r_m)\ell}\, q_1^{\gamma r_1 \ell}$. So
\[
H(W) \le 8^{(r_1+\cdots+r_m)\ell}\, q_1^{\gamma r_1 \ell} \le q_1^{2\gamma r_1 \ell},
\]
where for the second inequality we have used $r_1 + \cdots + r_m \le m r_1$ and $8^m \le q_1^{\gamma}$. In the factorization (1) every coefficient of $W$ is a product of corresponding coefficients of $U$ and $V$, so
\[
H(U),\ H(V) \le q_1^{2\gamma r_1 \ell}.
\]
The hypotheses of the proposition are satisfied for the quantities
\[
m' := m - 1, \qquad \delta' := \frac{\delta^2}{12}, \qquad r_i' := \ell r_i \quad (i = 1, \dots, m-1),
\]
with corresponding quantity $\gamma'$ related by $\gamma' = 2\gamma$.
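The relation $\gamma' = 2\gamma$ can be confirmed with exact rational arithmetic. The following Python sketch is an illustration only and not part of the proof. It assumes that the definition in Proposition 3.7 reads $\gamma = (24/2^m)(\delta/12)^{2^{m-1}}$, the reading consistent with the base case ($\gamma = \delta$ for $m = 1$) and with the relation stated here. The function name `gamma` and the sample value of $\delta$ are hypothetical choices. The sketch also checks $\gamma \le \delta^2/24$ for $m \ge 2$, which is what allows $r_m/r_{m-1} \le \gamma$ to be compared with $\delta^2/24$ in the remainder of the proof.

```python
from fractions import Fraction

def gamma(m, delta):
    """Assumed form of gamma in Proposition 3.7:
    gamma = (24 / 2^m) * (delta / 12)^(2^(m-1))."""
    return Fraction(24, 2**m) * (Fraction(delta) / 12) ** (2 ** (m - 1))

# Hypothetical sample value with 0 < delta < 1/12.
delta = Fraction(1, 20)

# Base case of the induction: gamma equals delta when m = 1.
assert gamma(1, delta) == delta

for m in range(2, 6):
    # Passing to m' = m - 1 and delta' = delta^2 / 12 doubles gamma.
    assert gamma(m - 1, delta**2 / 12) == 2 * gamma(m, delta)
    # For m >= 2, gamma is at most delta^2 / 24.
    assert gamma(m, delta) <= delta**2 / 24
```

Exact fractions are used rather than floating point because $\gamma$ decreases doubly exponentially in $m$, so equality tests on floats would be unreliable.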
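So by the induction hypothesis and Proposition 3.3(iv),
\[
\operatorname{ind}_{(p_1/q_1, \dots, p_m/q_m)}(U) \le \ell\,\frac{\delta^2}{12}.
\]
Similarly, by appealing to the single-variable case of the proposition, we have
\[
\operatorname{ind}_{(p_1/q_1, \dots, p_m/q_m)}(V) \le \ell\,\frac{\delta^2}{12}.
\]
Now by Proposition 3.3(ii),
\[
\operatorname{ind}_{(p_1/q_1, \dots, p_m/q_m)}(W) \le \ell\,\frac{\delta^2}{6}. \tag{2}
\]
The proof concludes by relating the index of $F$ to that of $W$. This is possible by expanding the determinant and applying Proposition 3.3(i)–(iii). We let
\[
\theta := \operatorname{ind}_{(p_1/q_1, \dots, p_m/q_m)}(F).
\]
So by Proposition 3.3(i) and the inequality
\[
\begin{aligned}
\frac{a_{i1}}{r_1} + \cdots + \frac{a_{i(m-1)}}{r_{m-1}} + \frac{j-1}{r_m}
&\le \frac{a_{i1} + \cdots + a_{i(m-1)}}{r_{m-1}} + \frac{j-1}{r_m}\\
&\le \frac{\ell - 1}{r_{m-1}} + \frac{j-1}{r_m}\\
&\le \frac{r_m}{r_{m-1}} + \frac{j-1}{r_m}\\
&\le \gamma + \frac{j-1}{r_m}
\end{aligned}
\]
we have
\[
\operatorname{ind}_{(p_1/q_1, \dots, p_m/q_m)}\left(\Delta_i\,\frac{\partial^{j-1} F}{\partial x_m^{j-1}}\right) \ge \max\left(\theta - \frac{\delta^2}{24} - \frac{j-1}{r_m},\ 0\right)
\]
and hence by Proposition 3.3(ii)–(iii) and (2),
\[
\sum_{j=1}^{\ell} \max\left(\theta - \frac{\delta^2}{24} - \frac{j-1}{r_m},\ 0\right) \le \ell\,\frac{\delta^2}{6}. \tag{3}
\]
There are now two cases. If $\theta \ge (\ell - 1)/r_m$ then, directly, we obtain $\theta \le \delta^2/2 < \delta$. If $\theta < (\ell - 1)/r_m$, then the terms with $j > \theta r_m + 1$ contribute trivially, and hence (3) implies
\[
(\lfloor \theta r_m \rfloor + 1)\,\theta - \ell\,\frac{\delta^2}{24} - \frac{\lfloor \theta r_m \rfloor (\lfloor \theta r_m \rfloor + 1)}{2 r_m} \le \ell\,\frac{\delta^2}{6}.
\]
So
\[
r_m\,\frac{\theta^2}{2} \le \frac{\theta(\lfloor \theta r_m \rfloor + 1)}{2} = (\lfloor \theta r_m \rfloor + 1)\,\theta - \frac{\theta(\lfloor \theta r_m \rfloor + 1)}{2} \le \ell\,\frac{\delta^2}{4} \le r_m\,\frac{\delta^2}{2},
\]
which implies $\theta \le \delta$.

Proof of Theorem 3.1. We may suppose $\varepsilon < 1/12$. Let $n$ denote the degree and $H$ the height of $\alpha$. We prove the result by contradiction. Suppose that there exist infinitely many rational numbers $p/q$, where $p$ and $q$ are relatively prime integers with $q > 0$, such that
\[
\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^{2+\varepsilon}}. \tag{4}
\]
Then we let $\delta$ be a positive real number less than $\varepsilon/15$, and we choose an integer $m > 8n^2/\delta^2$. We define $\gamma$ as in Proposition 3.7. We choose, among the rational approximations satisfying (4), one whose denominator is greater than both $(4H+4)^{m/\gamma}$ and $(64(H+1)\max(1, |\alpha|))^{1/\delta}$; this we call $p_1/q_1$. Now we let $p_2/q_2, \dots, p_m/q_m$ be further such approximations, chosen so that
\[
q_{i+1} > q_i^{2/\gamma}
\]
for every $i$.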
Then we take $r_1$ to be an integer satisfying
\[
r_1 \ge \frac{1}{\delta}\,\frac{\log q_m}{\log q_1}
\]
and define
\[
r_i := \left\lfloor \frac{r_1 \log q_1}{\log q_i} \right\rfloor + 1
\]
for $i = 2, \dots, m$. Since $m > 8n^2/\delta^2$, we may apply Proposition 3.5 to obtain $0 \ne F \in \mathbb{Z}[x_1, \dots, x_m]$ with $x_j$-degree $\le r_j$ for all $j$, index at least $(1/2)m(1-\delta)$ at $(\alpha, \dots, \alpha)$, and height at most $(4H+4)^{r_1+\cdots+r_m}$.

The hypotheses of Propositions 3.6 and 3.7 are satisfied: we have
\[
q_m > \cdots > q_1 > \bigl(64(H+1)\max(1, |\alpha|)\bigr)^{1/\delta},
\]
we easily verify
\[
r_1 \log q_1 \le r_i \log q_i \le (1+\delta)\, r_1 \log q_1
\]
for all $i$, from which follows
\[
\gamma r_i > 2\,\frac{\log q_i}{\log q_{i+1}}\, r_i \ge \frac{2}{1+\delta}\, r_{i+1} \ge r_{i+1};
\]
as well, we have
\[
H(F) \le (4H+4)^{m r_1} \le q_1^{\gamma r_1}.
\]
By Proposition 3.6 the index of $F$ at $(p_1/q_1, \dots, p_m/q_m)$ is at least $\varepsilon m/8$. By Proposition 3.7,
\[
\operatorname{ind}_{(p_1/q_1, \dots, p_m/q_m)}(F) \le \delta < \varepsilon/15,
\]
and we have a contradiction.
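The choice of the exponents $r_i$ in the proof above can be checked numerically. The following Python sketch is an illustration and not part of the argument: it computes $r_1, \dots, r_m$ from a list of denominators and verifies the inequality $r_1 \log q_1 \le r_i \log q_i \le (1+\delta) r_1 \log q_1$. The function name `choose_r` and the sample denominators are hypothetical. In particular, the sample denominators are far too small to satisfy the growth condition $q_{i+1} > q_i^{2/\gamma}$ and serve only to exercise the formulas.

```python
import math

def choose_r(qs, delta):
    """Return [r_1, ..., r_m] for increasing denominators qs = [q_1, ..., q_m].

    r_1 is the least integer with r_1 >= (1/delta) * log(q_m) / log(q_1),
    and r_i = floor(r_1 * log(q_1) / log(q_i)) + 1 for i >= 2.
    """
    log_q = [math.log(q) for q in qs]
    r1 = math.ceil(log_q[-1] / (delta * log_q[0]))
    rs = [r1]
    for lq in log_q[1:]:
        rs.append(math.floor(r1 * log_q[0] / lq) + 1)
    return rs

# Hypothetical sample data, chosen only for illustration.
qs = [7, 50, 3000]
delta = 0.05
rs = choose_r(qs, delta)

base = rs[0] * math.log(qs[0])
for r, q in zip(rs, qs):
    weight = r * math.log(q)
    # r_1 log q_1 <= r_i log q_i <= (1 + delta) r_1 log q_1
    assert base <= weight <= (1 + delta) * base
```

The lower bound holds because $\lfloor x \rfloor + 1 > x$, and the upper bound because $r_i \log q_i \le r_1 \log q_1 + \log q_m$ together with $\log q_m \le \delta r_1 \log q_1$. Floating-point logarithms are adequate here as long as no quotient $r_1 \log q_1 / \log q_i$ is an exact integer.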