Chapter 1 Basic Arithmetic

All rings in this lecture are unitary and commutative.

In this chapter, we analyze basic arithmetic operations from the point of view of implementing them on a computer. All computers provide basic integer arithmetic, but in general only in a very specific way. More precisely, computers usually support arithmetic in $\mathbb{Z}/2^k\mathbb{Z}$ for a fixed $k$, the machine word-width, where $k \in \{8, 16, 32, 64, 128\}$. (Most modern computers have $k = 64$.) Since $2^{64} = 18446744073709551616$ is quite large, this is often no restriction. But sometimes, especially when doing computations in mathematics, numbers tend to grow larger than that. If one still wants to know the precise result, one has to invest some work to simulate larger integers with what the CPU supports. The basic approach to do this is covered in Section 1.1.

Afterwards, we will introduce some notation from complexity theory to simplify the discussion of the running time needed for arithmetic. We will count the running time in different units, most prominently basic b-operations, which correspond to what the CPU can do natively with one operation, and ring or field operations, which depend on the underlying ring or field.

The next topics we will cover are univariate polynomials and the Euclidean Algorithm, one of the most important algorithms in arithmetic. The later sections in this chapter are merely applications of the content of the first four sections, which will enable us to do arithmetic in the rational numbers $\mathbb{Q}$, in residue class rings $\mathbb{Z}/n\mathbb{Z}$ of the integers, in residue class rings of $K[X]$, and in finite fields.

1.1 Integer Arithmetic

A very fundamental problem in computer algebra is integer arithmetic. The basic questions are:
• How to represent integers on a computer?
• How to do basic arithmetic (addition, subtraction, multiplication, division with remainder) in this representation?
• How fast can we do arithmetic?
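As a small illustration of the word-size arithmetic in $\mathbb{Z}/2^k\mathbb{Z}$ described in the chapter introduction, the following Python sketch simulates a 64-bit register by masking. The names WORD_BITS, MASK, word_add and word_mul are hypothetical and chosen only for this illustration; Python's own integers do not wrap around, which is why the mask is needed.

```python
# Sketch: simulate arithmetic in Z/2^k Z for k = 64 by masking results.
# WORD_BITS, MASK, word_add and word_mul are illustrative names only.
WORD_BITS = 64
MASK = (1 << WORD_BITS) - 1  # 2^64 - 1, the largest 64-bit word


def word_add(a, c):
    """Add two residues modulo 2^64, as a 64-bit register would."""
    return (a + c) & MASK


def word_mul(a, c):
    """Multiply two residues modulo 2^64 (the high half is discarded)."""
    return (a * c) & MASK


print(word_add(MASK, 1))            # wraps around to 0
print(word_mul(1 << 32, 1 << 32))   # 2^64 is congruent to 0 modulo 2^64
```

The wrap-around is exactly why exact computations with larger numbers need the multi-digit representation developed in this section.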
The most widely used representation is positional notation;¹ the decimal notation we learned in school is one example.

¹ The first counting system with positional notation was used by the Babylonians, with base 60.

Definition 1.1.1. Let $N \in \mathbb{N}$ be a natural number and $b \ge 2$ be another natural number. An $(n+1)$-tuple $(a_n, \dots, a_0) \in \{0, \dots, b-1\}^{n+1}$ is called a b-adic representation of $N$ if and only if
\[ N = \sum_{i=0}^{n} a_i b^i. \]
In case $n = 0$ or $a_n \ne 0$, we say that this representation is reduced. We say that this representation has size $n$. The number system we use is the one of reduced 10-adic representations.

Remarks 1.1.2.

(a) For every $b \ge 2$, every natural number has a unique reduced b-adic representation.

(b) One could also accept $n = -1$, which corresponds to the empty list (). This would introduce another representation of zero (compare Section 1.3). The reason why this is usually not done for integers is that in this way, one can work more efficiently with integers which fit into one digit, because no list size changes are necessary as long as the result fits into one digit.

(c) If $(a_n, \dots, a_0)$ and $(c_m, \dots, c_0)$ are b-adic representations of the same number, then $a_i = c_i$ for $i = 0, \dots, \min\{n, m\}$, and $a_i = 0$ for $m < i \le n$ and $c_j = 0$ for $n < j \le m$.

(d) The maximal number b-adically representable with $(a_n, \dots, a_0)$ is
\[ \sum_{i=0}^{n} (b-1) b^i = (b-1) \sum_{i=0}^{n} b^i = (b-1) \cdot \frac{b^{n+1} - 1}{b - 1} = b^{n+1} - 1. \]
If $N$ is an arbitrary number, this shows that it can be represented b-adically by $(a_n, \dots, a_0)$ if and only if
\[ N \le b^{n+1} - 1 \iff N + 1 \le b^{n+1} \iff \log_b(N+1) \le n + 1 \iff \log_b(N+1) - 1 \le n. \]
Moreover, since $N$ is an integer and $(b^{n+1} - 1) + 1 = b^{n+1}$, we also obtain
\[ N \le b^{n+1} - 1 \iff \log_b N < n + 1. \]

(e) Therefore, the size of the reduced b-adic representation of $N > 0$ is
\[ \mathrm{size}_b(N) := \lceil \log_b(N+1) - 1 \rceil = \lceil \log_b(N+1) \rceil - 1 = \lfloor \log_b N \rfloor. \]
Moreover, for $N > 0$,
\[ |\mathrm{size}_b(N) - C_b \log N| \le D_b \]
for $C_b := \frac{1}{\log b} > 0$ and $D_b := \max\{1, \frac{\log 2}{\log b}\} = 1$. For $N = 0$, we have $\mathrm{size}_b(0) = 0$, whereas the formula $\lceil \log_b(N+1) \rceil - 1$ would give $-1$.

(f) Internally, a computer uses non-reduced 2-adic representations with a fixed $n$. This $n$ is usually one less than a power of two, for example $31 = 2^5 - 1$ or $63 = 2^6 - 1$. Therefore, a computer can internally represent the natural numbers $\{0, \dots, 2^{32} - 1\}$ respectively $\{0, \dots, 2^{64} - 1\}$. Such an internal representation is called a CPU integer. We exploit this to represent larger numbers by choosing $b$ as $2^{32}$ respectively $2^{64}$. An arbitrary natural number is thus given as a list of CPU integers together with its length. Note that we identify the set of CPU integers with $\mathbb{Z}/2^{32}\mathbb{Z}$ respectively $\mathbb{Z}/2^{64}\mathbb{Z}$. Also note that negative numbers can be represented as well by choosing a fitting system of representatives of the residue class ring $\mathbb{Z}/2^m\mathbb{Z}$, for example $-2^{m-1}, -2^{m-1} + 1, \dots, -1, 0, 1, \dots, 2^{m-1} - 2, 2^{m-1} - 1$ (for $m \ge 1$).²

(g) To represent arbitrary integers and not just natural numbers, we add a sign bit, an element of $\{+, -\}$ which indicates the sign of the number. Note that the number 0 has two such representations.

We will now explain how to do basic arithmetic with such b-adic representations. We will assume that we already know how to do all required operations with arguments in the range $\{0, \dots, b-1\}$; we refer to such operations as basic b-operations. We define them in more detail in the following:

Remarks 1.1.3. Let $a, c \in \{0, \dots, b-1\}$.

(a) Let $\gamma \in \{0, 1\}$. The sum $a + c + \gamma$ is either in $\{0, \dots, b-1\}$, or $a + c + \gamma - b \in \{0, \dots, b-1\}$. Write $a + c + \gamma = q \cdot b + r$ with $q, r \in \mathbb{N}$ and $r < b$; thus $q \in \{0, 1\}$. We define $a \oplus_{b,\gamma} c := r$ and $\mathrm{Carry}_{\mathrm{add},b}(a, c, \gamma) := q \in \{0, 1\}$. The latter is called the carry bit.³

(b) Let $\gamma \in \{0, 1\}$. In case $a - c - \gamma \ge 0$, define $a \ominus_{b,\gamma} c := a - c - \gamma$, and in case $a - c - \gamma < 0$, define $a \ominus_{b,\gamma} c := b + a - c - \gamma$.
We also define⁴ $\mathrm{Carry}_{\mathrm{sub},b}(a, c, \gamma) := 0$ in case $a - c - \gamma \ge 0$ and $\mathrm{Carry}_{\mathrm{sub},b}(a, c, \gamma) := 1$ in case $a - c - \gamma < 0$.⁵

(c) Write $a \cdot c = q \cdot b + r$ with $q, r \in \mathbb{N}$ and $r < b$. We define $a \odot_b c := r$ and $\mathrm{Carry}_{\mathrm{mul},b}(a, c) := q \in \{0, \dots, b-2\}$.⁶ (Note that $a, c \le b - 1$ implies $a \cdot c \le (b-1)^2 = b^2 - 2b + 1$, whence $q \le b - 2$.)

² This system of representing negative numbers is called two's complement. If an element of $\mathbb{Z}/2^m\mathbb{Z}$ is represented by a bitstring, one can compute the negative by flipping all bits and adding 1 to the result.

³ In x86 assembler, this is done by the ADC instruction (add with carry). If $\gamma = 0$ is known, one can use the ADD instruction. The value of $\mathrm{Carry}_{\mathrm{add},b}$ is returned via the carry flag.

⁴ Instead of "carry", a more meaningful name would be "borrow". To minimize the number of different names, we use "carry" here as well. In x86 assembler, the "borrow" bit is stored in and retrieved from the "carry" flag.

⁵ In x86 assembler, this is done by the SBB instruction (subtraction with borrow). If $\gamma = 0$ is known, one can use the SUB instruction. The value of $\mathrm{Carry}_{\mathrm{sub},b}$ is returned via the carry flag.

⁶ In x86 assembler, this is done by the MUL instruction (unsigned multiplication). This instruction returns $q$ and $r$ simultaneously, and indicates via the overflow and carry flags whether $q \ne 0$ or not.

(d) Note that $(\{0, \dots, b-1\}, \oplus_{b,0}, \odot_b)$ is a model of the residue class ring $\mathbb{Z}/b\mathbb{Z}$. Also note that $\ominus_{b,0}$ equals subtraction of natural numbers modulo $b$.⁷

(e) Assume that $c \ne 0$ and let $a' \in \{0, \dots, b^2 - 1\}$. We assume that $a'$ is given in the form $a'_1 b + a'_2$, where $a'_1, a'_2 \in \{0, \dots, b-1\}$, and we assume that $a'/c < b$ (if this is not satisfied, the values of the following expressions are not defined). Let $q, r \in \mathbb{N}$ with $0 \le r < c$ such that $a' = q \cdot c + r$. We denote $a' \,\mathrm{div}_b\, c := q$ and $a' \,\mathrm{rem}_b\, c := r$.
Obtaining $q$ and $r$ is also called Euclidean (long) division.⁸

(f) By basic b-operation, we mean the operations $\oplus_{b,\bullet}$, $\ominus_{b,\bullet}$, $\odot_b$, $\mathrm{div}_b$, $\mathrm{rem}_b$, $\mathrm{Carry}_{\mathrm{add},b}$, $\mathrm{Carry}_{\mathrm{sub},b}$ and $\mathrm{Carry}_{\mathrm{mul},b}$, as well as comparisons of elements in $\{0, \dots, b-1\}$.

Using these basic operations, we can explain how to do arithmetic with b-adic representations:

Theorem 1.1.4 (Basic Integer Arithmetic). Let $b \ge 2$ be a natural number, and let $N$ and $M$ be two natural numbers with reduced b-adic representations $(a_n, \dots, a_0)$ of $N$ and $(c_m, \dots, c_0)$ of $M$.

(a) Testing whether $N < M$, $N = M$ or $N > M$ can be done in at most $\min\{n, m\} + 2$ comparisons of elements of $\{0, \dots, b-1\}$.

(b) Computing a reduced b-adic representation of $N + M$ can be done in at most $\max\{n, m\} + 1$ evaluations of $\oplus_{b,\bullet}$ and $\mathrm{Carry}_{\mathrm{add},b}$.

(c) Computing a reduced b-adic representation of $|N - M|$ together with⁹ $\mathrm{sgn}(N - M)$ can be done in at most $\max\{n, m\} + 2$ evaluations of $\ominus_{b,\bullet}$ and $\mathrm{Carry}_{\mathrm{sub},b}$ and $\min\{n, m\} + 2$ comparisons of elements of $\{0, \dots, b-1\}$.

(d) Computing a reduced b-adic representation of $N \cdot M$ can be done in at most $(m+1)(n+1)$ evaluations of $\odot_b$ and $\mathrm{Carry}_{\mathrm{mul},b}$, at most $(2m+1)(n+1) + m$ evaluations of $\oplus_{b,\bullet}$ and at most $(2m+1)n + 2m$ evaluations of $\mathrm{Carry}_{\mathrm{add},b}$.

(e) Assume that $M \ne 0$. Computing reduced b-adic representations of two numbers $Q, R \in \mathbb{N}$ with $N = Q \cdot M + R$ and $R < M$ can be done in at most
• $\max\{0, 2m(n-m) + 4n + 2m + 6\}$ evaluations of $\odot_b$ and $\mathrm{Carry}_{\mathrm{mul},b}$,
• $\max\{0, 5m(n-m) + 12n + 3m + 21\}$ evaluations of $\oplus_{b,\bullet}$,
• $\max\{0, 5m(n-m) + 9n + 6m + 12\}$ evaluations of $\mathrm{Carry}_{\mathrm{add},b}$,
• $\max\{0, 3m(n-m) + 9n - 6m + 9\}$ evaluations of $\ominus_{b,\bullet}$ and $\mathrm{Carry}_{\mathrm{sub},b}$ and comparisons of elements of $\{0, \dots, b-1\}$,
• $n + 2$ evaluations of $\mathrm{div}_b$ and $m + 1$ evaluations of $\mathrm{rem}_b$.

We will later see that, except for comparison, addition and subtraction, these operations can be made faster, at least in case $n$ and $m$ are sufficiently large.

⁷ The x86 assembler also features a "signed multiplication" instruction, IMUL. If the result modulo $b = 2^\ell$ is taken, where $\ell$ is the bit-width of the CPU, the output is identical to that of MUL. The main difference is the handling of the flags and sign extension for the $q$ part of the result.

⁸ In x86 assembler, this is done by the DIV instruction (unsigned division). The numerator can actually be in the range $\{0, \dots, b^2 - 1\}$, and the instruction computes both $q$ and $r$. In case the quotient does not fit into the range $\{0, \dots, b-1\}$, or if the divisor is zero, an error will occur. If the numerator is in the range $\{0, \dots, b-1\}$ and the denominator is $\ne 0$, no error will occur.

⁹ Here, $\mathrm{sgn}(x) = -1$ for $x < 0$, $\mathrm{sgn}(0) = 0$ and $\mathrm{sgn}(x) = 1$ for $x > 0$.

Proof of Theorem 1.1.4.

(a) In case $n \ne m$, we know right away that $N > M$ (in case $n > m$) or $N < M$ (in case $n < m$). If $n = m$, we find the largest index $i$ with $a_i \ne c_i$. If no such index exists, $N = M$. Otherwise, we know that $N < M$ in case $a_i < c_i$, and $N > M$ in case $a_i > c_i$. Therefore, we need at most $n + 1$ comparisons to find the index $i$, and at most one more comparison to distinguish between $a_i < c_i$ and $a_i > c_i$.

(b) For $i > n$, we interpret $a_i = 0$, and for $i > m$, we interpret $c_i = 0$. To compute a reduced b-adic representation of $N + M$, we start with $\gamma := 0$ and iterate $i$ from 0 up to $\max\{n, m\}$. In step $i$, we compute $d_i := a_i \oplus_{b,\gamma} c_i$ and $\gamma := \mathrm{Carry}_{\mathrm{add},b}(a_i, c_i, \gamma)$. After the last iteration, $(d_{\max\{n,m\}}, \dots, d_0)$ is the reduced b-adic representation of $N + M$ in case $\gamma = 0$, and $(1, d_{\max\{n,m\}}, \dots, d_0)$ is the reduced b-adic representation of $N + M$ in case $\gamma = 1$. In case $i > n$ (or $i > m$) and $\gamma = 0$, we do not need to use $\oplus_{b,\gamma}$ and $\mathrm{Carry}_{\mathrm{add},b}$ anymore to determine the remaining $d_i$'s. In fact, in case $i > n$, the resulting reduced b-adic representation is $(c_m, \dots, c_{i+1}, d_i, \dots, d_0)$. This shows that we need at most $\max\{n, m\} + 1$ evaluations of $\oplus_{b,\gamma}$ and $\mathrm{Carry}_{\mathrm{add},b}$.
(c) We proceed as in (a) to find out whether $N < M$, $N = M$ or $N > M$, and to find the largest $k$ with $a_k \ne c_k$ in case $N \ne M$, where again we interpret $a_i = 0$ for $i > n$ and $c_i = 0$ for $i > m$. In case $N = M$, return (). Otherwise, swap $N$ and $M$ if necessary so that $N > M$, set $\gamma := 0$ and iterate $i = 0, \dots, k$. In iteration $i$, compute $d_i := a_i \ominus_{b,\gamma} c_i$ and $\gamma := \mathrm{Carry}_{\mathrm{sub},b}(a_i, c_i, \gamma)$. After all iterations, we must have $\gamma = 0$ as $N > M$. We then find the largest index $i \in \{0, \dots, k\}$ with $d_i \ne 0$ and return $(d_i, \dots, d_0)$. Since $k \le \max\{n, m\}$, we need at most $\max\{n, m\} + 1$ evaluations of $\ominus_{b,\bullet}$ and $\mathrm{Carry}_{\mathrm{sub},b}$. For the comparison, we need at most $k$ (for finding $i$) $+\, (\max\{n, m\} - k + 2)$ (for finding $k$) comparisons, which yields $\le \max\{n, m\} + 2$ comparisons in total.

(d) Note that $M = \sum_{i=0}^{m} c_i b^i$. Therefore, $MN = \sum_{i=0}^{m} (c_i N) b^i$. Multiplication of $N$ by $b^i$ is easy: in case $N > 0$, $N \cdot b^i$ has the reduced representation $(a_n, \dots, a_0, 0, \dots, 0)$, where at the end we have added $i$ zeros. We also call such multiplications shifts by $i$ positions.¹⁰ Therefore, to compute $NM$, we compute $c_i N$ for $i = 0, \dots, m$ (as long as $c_i \ne 0$), and add the correctly shifted results together.

¹⁰ Computers usually offer special instructions to multiply with powers of 2, since they can be implemented very efficiently in the same way. The same is true for division by powers of 2. In x86 assembler, the instructions SHL and SHR execute shifts to the left (multiplication) and right (division).

To compute $c_i N$, we use $\odot_b$ and $\mathrm{Carry}_{\mathrm{mul},b}$ together with $\oplus_{b,\bullet}$ and $\mathrm{Carry}_{\mathrm{add},b}$: first note that $c_i N = \sum_{j=0}^{n} a_j c_i b^j$. The following diagram shows how a (not necessarily reduced) b-ary representation $(d_{n+1}, d_n, \dots, d_0)$ of $c_i N$ can be computed:

[Diagram: each product $a_j \cdot c_i$, $j = 0, \dots, n$, is drawn as a long box; the boxes of neighbouring products are combined by additions $\oplus$, with bent arrows passing from each addition to the next one on its left, and the results are the digits $d_{n+1}, d_n, \dots, d_3, d_2, d_1, d_0$.]

(The long boxes with the multiplications inside should be interpreted as two entries, the left one being $\mathrm{Carry}_{\mathrm{mul},b}(a_j, c_i)$ and the right one being $a_j \odot_b c_i$. The bent arrows should be interpreted as $\mathrm{Carry}_{\mathrm{add},b}$.)

Here, we execute $n + 1$ evaluations of $\odot_b$ and $\mathrm{Carry}_{\mathrm{mul},b}$, one assignment, $n + 1$ evaluations of $\oplus_{b,\bullet}$, and $n$ evaluations of $\mathrm{Carry}_{\mathrm{add},b}$ (note that in the last addition, which computes $d_{n+1}$, there will be no carry, since $\mathrm{Carry}_{\mathrm{mul},b}(a_n, c_i) \le b - 2$).

If $\sum_{i=0}^{k-1} c_i N b^i$ is already computed and we just computed $c_k N$, we need at most $n + 2$ evaluations of $\oplus_{b,\bullet}$ and $\mathrm{Carry}_{\mathrm{add},b}$ to compute $\sum_{i=0}^{k} c_i N b^i$ (see (b)). This is only needed for $k = 1, \dots, m$.

Therefore, the maximum total numbers of operations are:
• at most $(m+1)(n+1)$ evaluations of $\odot_b$ and $\mathrm{Carry}_{\mathrm{mul},b}$,
• $(m+1)(n+1) + m(n+2) = (2m+1)(n+1) + m$ evaluations of $\oplus_{b,\bullet}$, and
• $(m+1)n + m(n+2) = (2m+1)n + 2m$ evaluations of $\mathrm{Carry}_{\mathrm{add},b}$.

(e) [Knu81, Section 4.3.1] In case $n < m$, we set $Q = (0)$ and $R = (a_n, \dots, a_0)$ and we are done. If this is not the case, we proceed with two simplifications:

• We first normalize $N$ and $M$ so that $c_m \ge \lfloor b/2 \rfloor$. If this is not already the case, we multiply¹¹ both $N$ and $M$ by $x := \lfloor b/(c_m + 1) \rfloor \in \{1, \dots, \lfloor b/2 \rfloor\}$, and after the whole computation, we divide $R$ by $x$.

• We then reduce to $n - m + 1$ divisions of two numbers of almost the same size as follows: first, compute $Q', R'$ such that
\[ (0, a_n, \dots, a_{n-m}) = (c_m, \dots, c_0) \cdot Q' + R' \]
with $(r'_m, \dots, r'_0) := R' < (c_m, \dots, c_0)$; then $0 \le Q' \le b - 1$. Multiplying this identity by $b^{n-m}$ and adding $(a_{n-m-1}, \dots, a_0)$, we obtain
\[ (a_n, \dots, a_0) = (c_m, \dots, c_0) \cdot (Q', \underbrace{0, \dots, 0}_{n-m \text{ times}}) + (r'_m, \dots, r'_0, a_{n-m-1}, \dots, a_0), \]
where
\[ (r'_m, \dots, r'_0, a_{n-m-1}, \dots, a_0) < (c_m, \dots, c_0)\, b^{n-m}. \]
We then continue by dividing $(r'_m, \dots, r'_0, a_{n-m-1})$ by $(c_m, \dots, c_0)$ to compute the next digit of $Q$. This is iterated until we are done.

¹¹ To see that this works, first note that $c_m x \ge \lfloor b/2 \rfloor$: this is clear for $c_m = 1$, and for $c_m \ge 2$, we consider the two cases $c_m \ge b/4$ and $c_m < b/4$. In case $c_m < b/4$, $2c_m^2/(c_m - 1) \le 4c_m < b$ implies $c_m (b - c_m)/(c_m + 1) \ge b/2$, which in turn implies $c_m \lfloor b/(c_m + 1) \rfloor \ge \lfloor b/2 \rfloor$. In case $c_m \ge b/4$ we have $b/4 < c_m + 1 \le \lfloor b/2 \rfloor$, which yields $2 \le b/\lfloor b/2 \rfloor < b/(c_m + 1) < 4$, whence $x \in \{2, 3\}$. Now $\lfloor b/2 \rfloor \le b/2 \le 2c_m \le 3c_m$ completes the proof that $c_m x \in \{\lfloor b/2 \rfloor, \dots, b - 1\}$. We now want to show that $xM < b^{m+1}$, which shows that the reduced b-adic representation of $xM$ has size $m$; the above then shows that the leading digit is at least $\lfloor b/2 \rfloor$. Note that $M/b^m < c_m + 1$, whence $xM < x \cdot (c_m + 1) b^m \le b/(c_m + 1) \cdot (c_m + 1) b^m = b^{m+1}$.

For the above, we have to discuss two special cases:

(i) Exact division by a number in $\{1, \dots, b-1\}$;

(ii) Long division with remainder of $(a_{m+1}, \dots, a_0)$ by $(c_m, \dots, c_0)$, where $(a_{m+1}, \dots, a_0) < (c_m, \dots, c_0)\, b$ and $c_m \ge \lfloor b/2 \rfloor$.

Note that for (ii), the quotient will be in $\{0, \dots, b-1\}$. We now discuss the two steps (i) and (ii) in more detail.

(i) We can essentially proceed as in the second reduction mentioned above, since the long division of a two-digit b-adic number by a one-digit b-adic number is a basic b-operation. Given $N = (e_k, \dots, e_0)$ and $q \in \{2, \dots, b-1\}$, we want to compute $N/q$, knowing that $N/q \in \mathbb{N}$. We proceed by extending $N$ to $(e_{k+1}, e_k, \dots, e_0)$ with $e_{k+1} = 0$; then $(e_{k+1}, e_k, \dots, e_0) < q b^{k+1}$. We compute $(q_k, \dots, q_0) \in \{0, \dots, b-1\}^{k+1}$ by iterating $i = k, \dots, 0$. In iteration $i$, we compute $q_i := (e_{i+1} b + e_i) \,\mathrm{div}_b\, q$ and $e_i := (e_{i+1} b + e_i) \,\mathrm{rem}_b\, q$. Before iteration $i$, if we have $(e_{i+1}, e_i, \dots, e_0) < q b^{i+1}$, then $(e_{i+1}, e_i) < qb$, whence $(e_{i+1} b + e_i) \,\mathrm{div}_b\, q$ and $(e_{i+1} b + e_i) \,\mathrm{rem}_b\, q$ are defined.
Since $((e_{i+1} b + e_i) \,\mathrm{rem}_b\, q) \le q - 1$, we will have that $(e_i, e_{i-1}, \dots, e_0) < q b^i$ after $e_i$ is redefined. Hence, during all iterations, $\mathrm{div}_b$ and $\mathrm{rem}_b$ are well-defined. Note that here, we have used $k + 1$ evaluations of $\mathrm{div}_b$ and $\mathrm{rem}_b$.

This shows that the whole normalization process needs at most
• $2(n+1) + 2(m+1)$ evaluations of $\odot_b$ and $\mathrm{Carry}_{\mathrm{mul},b}$,
• $5(n+1) + 5(m+1) + 4$ evaluations of $\oplus_{b,\bullet}$,
• $5n + 5m + 8$ evaluations of $\mathrm{Carry}_{\mathrm{add},b}$ (to multiply $N$ and $M$ by $x$; compare (d)), and
• $m + 1$ evaluations of $\mathrm{div}_b$ and $\mathrm{rem}_b$ (to divide $R$ by $x$).

(ii) Define $\hat{q} := \min\{\lfloor (a_{m+1} b + a_m)/c_m \rfloor, b - 1\}$. If $q$ is the quotient $q = \lfloor (a_{m+1}, \dots, a_0)/(c_m, \dots, c_0) \rfloor$, then $\hat{q}$ is a good approximation of $q$:

Claim: $\hat{q} \ge q \ge \hat{q} - 2$. [Knu81, p. 256, Theorems A and B]

For the first inequality, assume that $\hat{q} = \lfloor (a_{m+1} b + a_m)/c_m \rfloor$, since $q < b$ implies the claim when $\hat{q} = b - 1$. Now $c_m \hat{q} \ge a_{m+1} b + a_m - (c_m - 1)$, whence
\[ (a_{m+1}, \dots, a_0) - \hat{q}\,(c_m, \dots, c_0) \le \sum_{i=0}^{m+1} a_i b^i - \hat{q} c_m b^m \le \sum_{i=0}^{m+1} a_i b^i - (a_{m+1} b + a_m - c_m + 1)\, b^m = (c_m - 1)\, b^m + \sum_{i=0}^{m-1} a_i b^i < c_m b^m \le (c_m, \dots, c_0), \]
which implies $\hat{q} \ge q$.

For the other inequality, assume that $\hat{q} - 3 \ge q$. We have
\[ \hat{q} \le \frac{a_{m+1} b + a_m}{c_m} = \frac{a_{m+1} b^{m+1} + a_m b^m}{c_m b^m} \le \frac{(a_{m+1}, \dots, a_0)}{c_m b^m} < \frac{(a_{m+1}, \dots, a_0)}{(c_m, \dots, c_0) - b^m}, \]
where the last denominator is $> 0$ since otherwise, $c_m = 1$ and $c_{m-1} = \dots = c_0 = 0$ would imply $q = \hat{q}$. As $q > (a_{m+1}, \dots, a_0)/(c_m, \dots, c_0) - 1$,
\[ 3 \le \hat{q} - q < \frac{(a_{m+1}, \dots, a_0)}{(c_m, \dots, c_0) - b^m} - \frac{(a_{m+1}, \dots, a_0)}{(c_m, \dots, c_0)} + 1 = \frac{(a_{m+1}, \dots, a_0)}{(c_m, \dots, c_0)} \cdot \frac{b^m}{(c_m, \dots, c_0) - b^m} + 1. \]
This implies
\[ \frac{(a_{m+1}, \dots, a_0)}{(c_m, \dots, c_0)} > 2\, \frac{(c_m, \dots, c_0) - b^m}{b^m} = 2(c_m - 1) + 2 \sum_{i=0}^{m-1} c_i b^{i-m} \ge 2(c_m - 1) \in \mathbb{Z}. \]
We conclude with
\[ b - 4 \ge \hat{q} - 3 \ge q = \left\lfloor \frac{(a_{m+1}, \dots, a_0)}{(c_m, \dots, c_0)} \right\rfloor \ge 2(c_m - 1) \ge 2 \cdot \frac{b - 1}{2} - 2 = b - 3, \]
a contradiction. Therefore, $\hat{q} - 3 < q$, whence $q \ge \hat{q} - 2$.
Thus we first compute $(a_{m+1}, \dots, a_0) - \max\{0, \hat{q} - 2\} \cdot (c_m, \dots, c_0)$ and compare it to $(c_m, \dots, c_0)$. If the former is not less than the latter, we subtract $(c_m, \dots, c_0)$, and repeat if necessary. After at most two subtractions, the remainder will be less than $(c_m, \dots, c_0)$. For the one multiplication and (at most) two comparisons and three subtractions, we need at most the following numbers of basic b-operations (compare (c) and (d)):
• $2(m+1)$ evaluations of $\odot_b$ and $\mathrm{Carry}_{\mathrm{mul},b}$,
• $5(m+1) + 2$ evaluations of $\oplus_{b,\bullet}$,
• $5m + 4$ evaluations of $\mathrm{Carry}_{\mathrm{add},b}$,
• $3(m+3)$ evaluations of $\ominus_{b,\bullet}$ and $\mathrm{Carry}_{\mathrm{sub},b}$,
• $3(m+3)$ comparisons of elements of $\{0, \dots, b-1\}$.
For the computation of $\hat{q}$, we need one evaluation of $\mathrm{div}_b$.

Now, to sum up, we need to do the normalization step at most once, and the division in (ii) at most $n - m + 1$ times. Therefore, the total numbers of operations needed are at most:
• $2(n+1) + 2(m+1) + 2(m+1)(n-m+1) = 2m(n-m) + 4n + 2m + 6$ evaluations of $\odot_b$ and $\mathrm{Carry}_{\mathrm{mul},b}$,
• $5(n+1) + 5(m+1) + 4 + (5(m+1) + 2)(n-m+1) = 5m(n-m) + 12n + 3m + 21$ evaluations of $\oplus_{b,\bullet}$,
• $5n + 5m + 8 + (5m + 4)(n-m+1) = 5m(n-m) + 9n + 6m + 12$ evaluations of $\mathrm{Carry}_{\mathrm{add},b}$,
• $3(m+3)(n-m+1) = 3m(n-m) + 9n - 6m + 9$ evaluations of $\ominus_{b,\bullet}$ and $\mathrm{Carry}_{\mathrm{sub},b}$ and comparisons of elements of $\{0, \dots, b-1\}$,
• $m + 1 + (n - m + 1) = n + 2$ evaluations of $\mathrm{div}_b$,
• $m + 1$ evaluations of $\mathrm{rem}_b$.

Note that in general it is not recommended to implement such integer arithmetic oneself; one should rather use libraries which provide it. A prime example is the GNU Multiprecision library [GMP]. In some programming languages, such as Python [Py], support for multiprecision arithmetic is already included. In fact, while Python 2 offers two integer types, int (for CPU integers) and long (for arbitrary precision integers), values of type int will automatically¹² turn into type long if the result of an expression does not fit into int.
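To make the schoolbook procedures from the proof of Theorem 1.1.4 (b) and (d) concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the implementation behind the operation counts: the function names are hypothetical, digit lists are stored least significant digit first (so index $i$ holds $a_i$, the reverse of the tuple notation $(a_n, \dots, a_0)$), and the multiplication accumulates each row $c_i N$ directly into the result instead of first computing it and then adding it.

```python
# Sketch of schoolbook arithmetic on b-adic digit lists.
# Assumption: digits are stored least significant first, i.e. digits[i] = a_i.
# All function names are illustrative, not taken from the lecture notes.

def to_digits(N, b):
    """Reduced b-adic representation of N >= 0 (zero is represented as [0])."""
    digits = [N % b]
    N //= b
    while N > 0:
        digits.append(N % b)
        N //= b
    return digits


def from_digits(digits, b):
    """Evaluate sum(a_i * b^i) with Horner's rule."""
    value = 0
    for d in reversed(digits):
        value = value * b + d
    return value


def add_digits(a, c, b):
    """Addition with carry, as in the proof of Theorem 1.1.4 (b)."""
    result, carry = [], 0
    for i in range(max(len(a), len(c))):
        ai = a[i] if i < len(a) else 0   # interpret missing digits as 0
        ci = c[i] if i < len(c) else 0
        s = ai + ci + carry
        result.append(s % b)             # the digit a_i (+)_{b,gamma} c_i
        carry = s // b                   # Carry_add, always 0 or 1
    if carry:
        result.append(1)
    return result


def mul_digits(a, c, b):
    """Schoolbook multiplication, adding the shifted rows c_i * N on the fly."""
    result = [0] * (len(a) + len(c))
    for i, ci in enumerate(c):
        if ci == 0:
            continue                     # rows with c_i = 0 contribute nothing
        carry = 0
        for j, aj in enumerate(a):
            t = result[i + j] + aj * ci + carry   # at most b^2 - 1
            result[i + j] = t % b
            carry = t // b
        result[i + len(a)] = carry       # this position is still 0 before row i
    while len(result) > 1 and result[-1] == 0:
        result.pop()                     # make the representation reduced
    return result
```

For example, with $b = 10$ the call add_digits(to_digits(999, 10), to_digits(1, 10), 10) yields the digit list of 1000. In practice one would choose $b = 2^{32}$ or $2^{64}$ and rely on a library such as GMP, as recommended above.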
¹² This was implemented in Python 2.2. In Python 3, there is only one integer type. Also see Section 2.3.2.

1.2 A Bit of Complexity Theory

In the last section we have seen how to do integer arithmetic using b-adic representations. For example, multiplying two b-adic representations of integers $N$ and $M$ requires (assuming reduced representations) $(\mathrm{size}_b(M) + 1)(\mathrm{size}_b(N) + 1)$ evaluations of $\odot_b$ and $\mathrm{Carry}_{\mathrm{mul},b}$, at most $(2\,\mathrm{size}_b(M) + 1)(\mathrm{size}_b(N) + 1) + \mathrm{size}_b(M)$ evaluations of $\oplus_{b,\bullet}$ and at most $(2\,\mathrm{size}_b(M) + 1)\,\mathrm{size}_b(N) + 2\,\mathrm{size}_b(M)$ evaluations of $\mathrm{Carry}_{\mathrm{add},b}$.

Carrying these numbers around is quite annoying, and in many cases does not say a lot. If we just use basic b-operations as the measure, it gets simpler: multiplication of two reduced b-adic representations of integers $N$ and $M$ requires at most
\[ 2(\mathrm{size}_b(M) + 1)(\mathrm{size}_b(N) + 1) + (2\,\mathrm{size}_b(M) + 1)(\mathrm{size}_b(N) + 1) + \mathrm{size}_b(M) + (2\,\mathrm{size}_b(M) + 1)\,\mathrm{size}_b(N) + 2\,\mathrm{size}_b(M) \]
\[ = 6\,\mathrm{size}_b(N)\,\mathrm{size}_b(M) + 7\,\mathrm{size}_b(M) + 4\,\mathrm{size}_b(N) + 3 \]
basic b-operations. This expression is still somewhat complicated. Moreover, since we have seen in Remarks 1.1.2 (e) that $\mathrm{size}_b(N) \approx \frac{\log N}{\log b}$, we see that the number of basic b-operations is
\[ \approx \frac{6}{(\log b)^2} \log N \cdot \log M + \frac{7}{\log b} \log M + \frac{4}{\log b} \log N + 3 \approx \frac{6}{(\log b)^2} \log N \cdot \log M, \]
where for the last "$\approx$", we assume that both $N$ and $M$ are large.

Note that except for the constant, the expression only depends on $\log N$ and $\log M$, and not at all on the choice of $b$. Thus, in case $b$ is fixed (which it usually is, after fixing a concrete architecture on which we implement algorithms), what matters is only the expression following the constant, namely $\log N \cdot \log M$.

In the following, for a set $K$, we will consider functions $f : \mathbb{N}^K \to \mathbb{R}$. If $K = \{1, \dots, m\}$, then $f : \mathbb{N}^K \to \mathbb{R}$ is a function in $m$ natural variables to the reals. For $a, b \in \mathbb{N}^K$ we will write $a \le b$ if and only if $a_k \le b_k$ for all $k \in K$.

Definition 1.2.1.
Let $K$ be an index set and let $f, g : \mathbb{N}^K \to \mathbb{R}$ be two functions.

(a) We write $f \in O(g)$ if and only if
\[ \exists n_0 \in \mathbb{N}^K \ \exists c > 0 \ \forall n \in \mathbb{N}^K : n \ge n_0 \Rightarrow |f(n)| \le c \cdot |g(n)|. \]
We say that $f$ is in "big-O of $g$".

(b) We write $f \in o(g)$ if and only if $f \in O(g) \wedge g \notin O(f)$. We say that $f$ is in "little-o of $g$".

(c) We write $f \in \Theta(g)$ if and only if $f \in O(g) \wedge g \in O(f)$. We say that $f$ is in "Theta of $g$".

Remarks 1.2.2. For (a), (b) and (c), assume that $g(n) \ne 0$ for all $n \in \mathbb{N}^K$.

(a) Then $f \in o(g)$ if and only if $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$. In particular, the limit exists in this case.¹³

(b) Moreover, $f \in O(g)$ if and only if $\limsup_{n \to \infty} \frac{f(n)}{g(n)} < \infty$.

(c) Finally, $f \in \Theta(g)$ if and only if $g \in \Theta(f)$, which is the case if and only if $0 < \liminf_{n \to \infty} \frac{f(n)}{g(n)} \le \limsup_{n \to \infty} \frac{f(n)}{g(n)} < \infty$.

(d) If $f = a_m X^m + a_{m-1} X^{m-1} + \dots + a_0$ is a univariate polynomial with $a_m \ne 0$, then $f \in O(g)$, where $g(x) := x^m$. That is, we only take the largest term of the polynomial and are only interested in the exponent, but not in its coefficient.

¹³ Here, with $n \to \infty$, we mean that $n = (n_k \mid k \in K)$ is a family of variables which all converge to $\infty$ uniformly. For $|K| < \infty$, this is equivalent to pointwise convergence of the $n_k$.

Using the big-O notation, we can simply state that multiplication of two reduced b-adic representations of positive integers $N$ and $M$ can be done with $O(\log N \cdot \log M)$ basic b-operations. A restatement of Theorem 1.1.4 using the new notation is the following:

Corollary 1.2.3 (Basic Integer Arithmetic). Let $b \ge 2$ be a natural number, and let $N$ and $M$ be two integers. Assume that we are given reduced b-adic representations of $N$ and $M$. We assume that the sign is given separately as a value in $\{-1, +1\}$.

(a) Testing whether $N < M$, $N = M$ or $N > M$ using the reduced b-adic representations of $N$ and $M$ can be done in $O(\min\{\log N, \log M\})$ basic b-operations.
(b) Computing a reduced b-adic representation of $N + M$ using the reduced b-adic representations of $N$ and $M$ can be done in $O(\max\{\log N, \log M\})$ basic b-operations.

(c) Computing a reduced b-adic representation of $N - M$ using the reduced b-adic representations of $N$ and $M$ can be done in $O(\max\{\log N, \log M\})$ basic b-operations.

(d) Computing a reduced b-adic representation of $N \cdot M$ using the reduced b-adic representations of $N$ and $M$ can be done in $O(\log M \cdot \log N)$ basic b-operations.

(e) Assume that $M \ne 0$. Computing reduced b-adic representations of two numbers $Q, R \in \mathbb{N}$ with $N = Q \cdot M + R$ and $R < M$ using the reduced b-adic representations of $N$ and $M$ can be done in $O(\max\{\log M \cdot (\log N - \log M + 1), 1\})$ basic b-operations.

1.3 Polynomial Arithmetic

Two fundamental arithmetics in computer algebra are arithmetic in $\mathbb{Z}$ and polynomial arithmetic. As with integers, one first has to think about how to represent polynomials on a computer. There are essentially two general representations:

• A dense representation: a polynomial $\sum_{i=0}^{n} a_i X^i$ over a ring $R$ is specified by a list $(a_n, \dots, a_0) \in R^{n+1}$ of coefficients.

• A sparse representation: a polynomial $\sum_{i=1}^{n} a_i X^{e_i}$ with $e_1 < \dots < e_n$ is specified by a list $((a_1, e_1), \dots, (a_n, e_n))$ of pairs $(a_i, e_i) \in R \times \mathbb{N}$.

Both representations have advantages and disadvantages, depending on how they are used. In some cases, mixing these two representations can yield improvements. For example, when representing the polynomial $X^q - X$ for a huge prime power $q$, the sparse representation is the representation of choice. On the other hand, representing the quotient $\frac{X^n - 1}{X - 1}$ of the two sparse polynomials $X^n - 1$ and $X - 1$ as a polynomial, one obtains $\sum_{i=0}^{n-1} X^i$, for which a dense representation is better suited.

In this lecture, and in particular for the rest of this section, we will almost exclusively work with dense representations of polynomials.

Remark 1.3.1.
(a) Similar to b-adic representations of natural numbers, the dense representation $(a_n, \dots, a_0)$ of $\sum_{i=0}^{n} a_i X^i$ can be made unique by requiring $n = -1$ (the empty list "()") or $a_n \ne 0$. As before, we call such representations reduced.

(b) We define the degree of the zero polynomial as $-1$. (As opposed to $-\infty$, which is common in many parts of mathematics.) Then the degree of the polynomial equals the length of the list minus one for the reduced dense representation.

(c) We denote the leading coefficient $a_n$ of a polynomial $f = \sum_{i=0}^{n} a_i X^i$ with $a_n \ne 0$ by $\mathrm{LC}(f)$. In case $f = 0$, we define $\mathrm{LC}(0) := 0$.

(d) If $(a_n, \dots, a_0)$ represents the polynomial $f = \sum_{i=0}^{n} a_i X^i$, we will often write $f = (a_n, \dots, a_0)$ in the following.

The four basic operations for polynomials are
• addition and subtraction,
• multiplication, and
• Euclidean (long) division with remainder.

We present some simple algorithms for these operations and analyze their running time in terms of operations in the underlying ring $R$. Note that in particular for $R = \mathbb{Z}$ and $R = \mathbb{Q}$, these operations can be very expensive, and for such rings, special algorithms can perform much faster.

Theorem 1.3.2 (Polynomial Arithmetic). Let $f, g \in R[X]$ be two polynomials with $n = \deg f$ and $m = \deg g$, given by their dense representations. Let $\lambda \in R$.

(a) We can compute a dense representation of $\lambda f$ using at most $n + 1$ multiplications and $n + 2$ comparisons in $R$. (If $R$ is zero-divisor free, one comparison suffices to check whether $\lambda \ne 0$.) The total number of operations in $R$ is thus in $O(n)$.

(b) We can compute a dense representation of $f + g$ and $f - g$ using $\min\{n, m\} + 1$ additions respectively subtractions in $R$, $\max\{n, m\} - \min\{n, m\}$ duplications of elements or negations, and at most $\min\{n, m\} + 1$ comparisons. The total number of operations in $R$ is thus in $O(\max\{n, m\})$.
(c) We can compute a dense representation of $f \cdot g$ using $mn$ additions and $(m+1)(n+1)$ multiplications in $R$ and at most $n + m + 1$ comparisons in $R$. The total number of operations in $R$ is thus in $O(mn)$.

(d) Assume that $\mathrm{LC}(g)$ is a unit in $R$. We can compute a dense representation of $q, r \in R[X]$ with $f = qg + r$ and $\deg r < m = \deg g$ using one inversion of a unit, at most $(n - m + 1)m$ subtractions, at most $(n - m + 1)(m + 1)$ multiplications in $R$, at most $n + 1$ duplications of elements of $R$ and at most $m$ comparisons in $R$. The total number of operations in $R$ is thus in $O(nm)$. In case $\mathrm{LC}(g) = 1$, it suffices to do at most $(n - m + 1)m$ subtractions, $(n - m + 1)m$ multiplications and $m$ comparisons in $R$.

Proof. Let $f = (a_n, \dots, a_0)$ and $g = (b_m, \dots, b_0)$.

(a) In case $\lambda = 0$, the result is (). Otherwise, $\lambda f = (\lambda a_n, \dots, \lambda a_0)$. In case $R$ is zero-divisor free, this representation is already reduced; otherwise, one can start with $\lambda a_n$; if $\lambda a_n = \dots = \lambda a_{i+2} = 0$ and $\lambda a_{i+1} \ne 0$, then $(\lambda a_{i+1}, \dots, \lambda a_0)$ is reduced and can be returned. In case no such $i$ exists, return (). This algorithm clearly requires $n + 1$ multiplications in $R$ (in case $\lambda \ne 0$). Normalization of the result requires at most $n + 1$ comparisons of the coefficients to $0 \in R$.

(b) Without loss of generality, assume that $n \le m$. Then
\[ f \pm g = (\pm b_m, \dots, \pm b_{n+1}, a_n \pm b_n, \dots, a_0 \pm b_0). \]
In case $m > n$, this is already a reduced representation; in case $m = n$, one starts with the largest index $i$ such that $a_i \pm b_i \ne 0$, or returns () in case no such index exists. We need at most $\min\{n, m\} + 1$ comparisons to $0 \in R$ for this. In case $n = m$, we have $n + 1$ additions resp. subtractions and 0 duplications. In case $n < m$, we have $n + 1$ additions resp. subtractions and $m - n$ duplications resp. negations (for $(-b_m, \dots, -b_{n+1})$). In case $n > m$, we have $m + 1$ additions resp. subtractions and $n - m$ duplications.

(c) Note that $f \cdot g = (c_{m+n}, \dots, c_0)$, where
\[ c_i = \sum_{j=\max\{0, i-m\}}^{\min\{i, n\}} a_j b_{i-j}. \]
This coefficient can be computed using $N_i := \min\{i, n\} - \max\{0, i - m\}$ additions and $N_i + 1$ multiplications. Therefore, all coefficients together can be computed using
\[ \sum_{i=0}^{n+m} N_i = \sum_{i=0}^{n+m} \min\{i, n\} - \sum_{i=0}^{n+m} \max\{0, i - m\} = \Bigl( \sum_{i=0}^{n} i + \sum_{i=n+1}^{n+m} n \Bigr) - \Bigl( \sum_{i=0}^{m} 0 + \sum_{i=m+1}^{m+n} (i - m) \Bigr) \]
\[ = \tfrac{1}{2} n(n+1) + mn - \Bigl( \sum_{i=0}^{m+n} i - \sum_{i=0}^{m} i - nm \Bigr) = \tfrac{1}{2} \bigl( n(n+1) + m(m+1) + 4mn - (m+n)(m+n+1) \bigr) = mn \]
additions and $\sum_{i=0}^{n+m} (N_i + 1) = mn + (n + m + 1) = (m+1)(n+1)$ multiplications. Finally, to normalize the result, we need at most $n + m + 1$ comparisons with $0 \in R$.

(d) In case $n < m$, we set $q := ()$ and $r := f$. This requires $n + 1$ duplications. Now assume $n \ge m$. Then $\deg q = n - m$. In case $\mathrm{LC}(g) \ne 1$, we invert $\mathrm{LC}(g)$ and store it, say in $u$. We start with $r := f = (a_n, \dots, a_0) =: (r_n, \dots, r_0)$ and $q := (c_{n-m}, \dots, c_0)$ with all $c_i = 0$ in the beginning. We proceed iteratively from $i = n - m$ down to $i = 0$. Before and after each iteration, $f = q \cdot g + r$. For each $i$, we set $c_i := u r_{m+i}$ and subtract $c_i X^i g$ from $r$. For the latter, we subtract $c_i b_j$ from $r_{j+i}$, $j = 0, \dots, m$. (Note that $c_i b_m = r_{m+i} u b_m = r_{m+i}$, whence after this, $r_{m+i} = 0$. In fact, we do not subtract $c_i b_m$ from $r_{i+m}$, but simply set $r_{i+m} = 0$.) Therefore, in iteration $i$, we do $m + 1$ (or just $m$ if $u = 1$) multiplications and $m$ subtractions in case $c_i \ne 0$, and nothing in case $c_i = 0$. After all iterations, we will have $r_m = \dots = r_{n+m} = 0$, and we determine the smallest $i$ such that $r_i = \dots = r_{n+m} = 0$ and set $r = (r_{i-1}, \dots, r_0)$; in case no such $i$ exists, we set $r = ()$. As we have $n - m + 1$ iterations, we perform at most $(n - m + 1)(m + 1)$ multiplications (or just $(n - m + 1)m$ if $u = 1$), at most $(n - m + 1)m$ subtractions and at most $m$ comparisons (to normalize $r$) to compute $q$ and $r$.

1.4 Euclidean Algorithm

A class of integral domains with very nice division properties are Euclidean rings, rings in which one has a Euclidean division.
Prime examples are the integers ($\mathbb{Z}$) and polynomial rings over fields ($K[x]$).

Definition 1.4.1. An integral domain $R$ is called Euclidean if there exists a function $\nu : R \to \mathbb{Z} \cup \{-\infty\}$, $\nu(R \setminus \{0\}) \subseteq \mathbb{N}$ such that
(a) for every $a, b \in R$, $b \neq 0$, we have $\nu(a) \le \nu(ab)$; and
(b) for every $a, b \in R$, $b \neq 0$, there exist $q, r \in R$ with $a = qb + r$ and $\nu(r) < \nu(b)$.
We will call $\nu$ a (Euclidean) valuation.

Remarks 1.4.2.
(a) [Rog71] Note that any ring $R$ with valuation $\nu$ which satisfies the definition except part (a) can be made Euclidean by defining $\hat\nu(x) := \min\{\nu(xa) \mid a \in R,\ a \neq 0\}$: then $(R, \hat\nu)$ satisfies both (a) and (b).
(b) [vzGG03, p. 60, Exercise 3.5] Moreover, one can consider the set $X_R := \{\nu \mid \nu \text{ satisfies condition (b)}\}$ and define $\nu_{\min}(x) := \min\{\nu(x) \mid \nu \in X_R\}$. Then $\nu_{\min}$ satisfies both (a) and (b).
(c) Note that if $a, b \in R$ are associated, i.e. $a = be$ with a unit $e \in R^*$, then $\nu(a) = \nu(b)$ (see footnote 14).
(d) Moreover, note that $\nu(0) < \nu(1) < \nu(a)$ for every non-zero non-unit $a \in R \setminus (R^* \cup \{0\})$ (see footnote 15).

Example 1.4.3. For $R = \mathbb{Z}$, the function $\nu : \mathbb{Z} \to \mathbb{N}$, $x \mapsto |x|$ satisfies the conditions of the definition. Here, $\nu(a) = \nu(b)$ if and only if $a$ and $b$ are associated, which is the case if and only if $a = \pm b$. One can show that the minimal function $\nu_{\min}$ is $\nu_{\min}(x) = \lfloor \log_2 |x| \rfloor$ with $\nu_{\min}(0) = -\infty$ [vzGG03, p. 61, Exercise 3.5 (vi)].

Example 1.4.4. For $R = K[x]$, $\nu = \deg$ satisfies the conditions of the definition. (In fact, $\nu = \nu_{\min}$ [vzGG03, p. 61, Exercise 3.5 (vi)].) Here, $q$ and $r$ are uniquely determined by $a$ and $b$: if $a = qb + r = q'b + r'$ for $q, q', r, r' \in K[x]$ with $\deg r, \deg r' < \deg b$, then $r - r' = b(q' - q)$ implies $\deg(r - r') = \deg b + \deg(q' - q)$, which is only possible if $r - r' = 0 = q' - q$. In this case, $\nu(a) = \nu(b)$ does not imply that $a$ and $b$ are associated: for example, consider $a = x$ and $b = x + 1$.
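To make Example 1.4.4 concrete, the following is a minimal sketch of division with remainder in $\mathbb{Q}[x]$ on dense representations (coefficient lists, highest degree first, the empty list for 0). The function name `poly_divmod` and the choice of `fractions.Fraction` for the field $K = \mathbb{Q}$ are assumptions made for illustration; the loop follows the elimination scheme from the proof of part (d) above.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide f by g (coefficient lists, highest degree first; [] is 0).

    Returns (q, r) with f = q*g + r and deg r < deg g.
    Hypothetical helper for illustration, working over Q via Fraction.
    """
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    r = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    n, m = len(r) - 1, len(g) - 1
    if n < m:
        return [], r
    u = 1 / g[0]                    # inverse of the leading coefficient
    q = []
    for i in range(n - m + 1):      # step i eliminates the coefficient r[i]
        c = u * r[i]
        q.append(c)
        for j in range(m + 1):
            r[i + j] -= c * g[j]
    r = r[n - m + 1:]               # the first n-m+1 entries are now zero
    while r and r[0] == 0:          # normalize: drop leading zeros
        r.pop(0)
    return q, r

# a = x and b = x + 1 have the same degree but are not associated:
print(poly_divmod([1, 0], [1, 1]))
```

For $a = x$ and $b = x + 1$ the sketch yields the quotient $1$ and the remainder $-1$, so the remainder is a non-zero constant even though $\nu(a) = \nu(b)$.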
Footnote 14: We have $\nu(a) = \nu(be) \le \nu(bee^{-1}) = \nu(b)$ by part (a), and conversely, $\nu(b) = \nu(ae^{-1}) \le \nu(ae^{-1}e) = \nu(a)$.

Footnote 15: First, $\nu(1) \le \nu(1 \cdot a) = \nu(a)$ shows that every non-zero non-unit's valuation is at least $\nu(1)$. Hence, $0 = q \cdot 1 + r$ with $\nu(r) < \nu(1)$ implies that $r = 0$ and $\nu(0) < \nu(1)$. Finally, consider $1 = q \cdot a + r$ with $\nu(r) < \nu(a)$. It is not possible that $r = 0$, as this implies $1 = q \cdot a$, which is absurd as $a$ is a non-unit. Therefore, $\nu(1) \le \nu(r) < \nu(a)$.

Euclidean domains are principal ideal domains and thus also factorial, which means that every non-zero element factors uniquely as a product of a unit and prime elements. Another advantage is that they allow greatest common divisors to be computed very efficiently. The basic technique for doing this was first described by Euclid in Books VII and X of his Elements. His original treatment only uses subtraction and comparison of integers, but it can be sped up dramatically by using Euclidean (long) division. This yields the modern form of one of the most fundamental algorithms, which we will now state in full detail:

Theorem 1.4.5 (Euclidean Algorithm). Let $R$ be a Euclidean domain with valuation $\nu$. Assume that we are given $a_0, a_1 \in R \setminus \{0\}$. Define sequences $(a_i)_{i \ge 0}$, $(b_i)_{i \ge 0}$, $(c_i)_{i \ge 0}$, $(q_i)_{i \ge 0}$, $(T_i)_{i \ge 0}$ as follows:
• define $\begin{pmatrix} b_0 & c_0 \\ b_1 & c_1 \end{pmatrix} := \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$;
• if $a_{i-2}$ and $a_{i-1}$ are defined, define $q_i$, $a_i$, $T_i$, $b_i$ and $c_i$ as follows:
(i) let $q_i, a_i \in R$ with $a_{i-2} = q_i a_{i-1} + a_i$ such that $\nu(a_i) < \nu(a_{i-1})$ (Euclidean division) in case $a_{i-1} \neq 0$, and $q_i = a_i = 0$ in case $a_{i-1} = 0$;
(ii) define $T_i := \begin{pmatrix} 0 & 1 \\ 1 & -q_i \end{pmatrix} \in R^{2 \times 2}$;
(iii) define $(b_i, c_i) := (1, -q_i) \begin{pmatrix} b_{i-2} & c_{i-2} \\ b_{i-1} & c_{i-1} \end{pmatrix}$.

Then there exists an index $i$ with $a_i = 0$. Let $n \ge 1$ be the largest index such that $a_n \neq 0$. Then the following holds:
(a) for $1 \le j \le i \le n+1$,
$$\begin{pmatrix} a_{i-1} & b_{i-1} & c_{i-1} \\ a_i & b_i & c_i \end{pmatrix} = T_i T_{i-1} \cdots T_{j+1} \begin{pmatrix} a_{j-1} & b_{j-1} & c_{j-1} \\ a_j & b_j & c_j \end{pmatrix};$$
(b) for $i = 1, \dots, n+1$,
$$A_i := \begin{pmatrix} b_{i-1} & c_{i-1} \\ b_i & c_i \end{pmatrix} = T_i T_{i-1} \cdots T_2$$
and $\det A_i = (-1)^{i-1}$;
(c) for $i = 1, \dots, n+1$, $a_i = b_i a_0 + c_i a_1$ and $b_i, c_i$ are coprime;
(d) $a_n = b_n a_0 + c_n a_1$ (Bézout equation) is a greatest common divisor of $a_0$ and $a_1$;
(e) $(-1)^{i-1} a_0 = c_i a_{i-1} - c_{i-1} a_i$ and $(-1)^i a_1 = b_i a_{i-1} - b_{i-1} a_i$ for $i = 1, \dots, n+1$; in particular, $a_0 = (-1)^n c_{n+1} a_n$ and $a_1 = (-1)^{n+1} b_{n+1} a_n$;
(f) we have $q_i \neq 0$ for $2 < i \le n+1$, and for $i = 2$, we have $q_i = 0$ only if $\nu(a_1) > \nu(a_0)$.

During the rest of this section, as well as the next section, we will always use the notation from the theorem. Note that if we are only interested in a greatest common divisor of $a_0$ and $a_1$, we do not have to carry $b_i, c_i$ around. In fact, one can implement this algorithm in very few lines, as the following Python example for integers shows:

Listing 1.1: GCD of Integers

    def gcd(a, b):
        "Compute GCD of its two inputs"
        while b != 0:
            a, b = b, a % b
        return a

If we want to compute $x, y \in \mathbb{Z}$ such that $ax + by = \gcd(a, b)$, we need to invest more work. This process is often called the Extended Euclidean Algorithm. Fortunately, we can read off the required formulas directly from the theorem:

Listing 1.2: Extended GCD of Integers

    def gcdex(a, b):
        """Compute extended GCD (with Bézout equation) of its two inputs.

        Returns the GCD followed by the coefficients of the linear combination.
        """
        ai = b      # ai stands for: a with index i
        aim1 = a    # aim1 stands for: a with index i-1
        bi = 0      # bi stands for: b with index i
        bim1 = 1    # bim1 stands for: b with index i-1
        ci = 1      # ci stands for: c with index i
        cim1 = 0    # cim1 stands for: c with index i-1
        while ai != 0:
            q, r = divmod(aim1, ai)  # compute both quotient and remainder
            aim1, ai = ai, r
            bim1, bi = bi, bim1 - q * bi
            cim1, ci = ci, cim1 - q * ci
        return aim1, bim1, cim1

Note that divmod(a, b) for integers a, b does a Euclidean division: it returns a pair (q, r) with a = q * b + r and abs(r) < abs(b). In fact, $q = \lfloor a/b \rfloor$.
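Since $q = \lfloor a/b \rfloor$, the remainder returned by divmod carries the sign of the divisor, not of the dividend. The sketch below shows this for all sign combinations and adds a variant that rounds the quotient toward zero, so that the dividend and the remainder never have opposite signs. The name `divmod_trunc` is a hypothetical helper introduced here for illustration; it is not part of Python's standard library.

```python
def divmod_trunc(a, b):
    """Euclidean division with the quotient rounded toward zero.

    Returns (q, r) with a == q*b + r, abs(r) < abs(b) and a*r >= 0.
    Hypothetical helper built on top of the built-in divmod.
    """
    q, r = divmod(a, b)
    # divmod floors the quotient; if a and b have different signs and the
    # division is not exact, move the quotient one step toward zero.
    if r != 0 and (a < 0) != (b < 0):
        q, r = q + 1, r - b
    return q, r

for a, b in [(7, 5), (-7, 5), (7, -5), (-7, -5)]:
    print((a, b), divmod(a, b), divmod_trunc(a, b))
```

Both variants satisfy a = q * b + r with abs(r) < abs(b); they differ only when a and b have different signs and the division is not exact.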
Instead of q, r = divmod(aim1, ai), we could have also written q, r = aim1 // ai, aim1 % ai, or q = aim1 // ai; r = aim1 % ai (where // denotes integer division with the quotient rounded down). We can accelerate the algorithm, since in the first loop iteration, bi and ci are very easy to compute:

Listing 1.3: Optimized Extended GCD of Integers

    def gcdex(a, b):
        """Compute extended GCD (with Bézout equation) of its two inputs.

        Returns the GCD followed by the coefficients of the linear combination.
        """
        ai = b      # ai stands for: a with index i
        aim1 = a    # aim1 stands for: a with index i-1
        # We can accelerate the first step
        if ai != 0:
            q, r = divmod(aim1, ai)  # compute both quotient and remainder
            aim1, ai = ai, r
            bim1, bi = 0, 1          # before: bi = 0, bim1 = 1
            cim1, ci = 1, -q         # before: ci = 1, cim1 = 0
            # Now continue
            while ai != 0:
                q, r = divmod(aim1, ai)  # compute both quotient and remainder
                aim1, ai = ai, r
                bim1, bi = bi, bim1 - q * bi
                cim1, ci = ci, cim1 - q * ci
        else:
            bim1 = 1
            cim1 = 0
        return aim1, bim1, cim1

Here, in the four statements directly after if ai != 0:, we essentially unroll (see footnote 16) the first loop iteration and specialize it.

Footnote 16: Unrolling loops is an important optimization done by many compilers, which translate (human-readable) program code into machine code. The rationale is that conditional jumps (like: repeat the code block if the counter is less than something) are costly on modern CPUs, since the CPUs try to "look ahead" to what will happen soon and might even already execute later instructions (out-of-order execution) under very special circumstances. In case of a conditional jump, the CPU does not know in advance whether it will jump or not, and so cannot really look ahead. Therefore, code such as

    for i in range(4):
        x[i] = i

will often automatically be translated to

    x[0] = 0
    x[1] = 1
    x[2] = 2
    x[3] = 3

by the compiler.

We can test the function as follows:

    a, b = 5, 7
    d, x, y = gcdex(a, b)
    print(d == x * a + y * b)
    a, b = 98450410312301241987513412837123123, 79347851293412371287123183713712471927132
    d, x, y = gcdex(a, b)
    print(d == x * a + y * b)

Python instantaneously prints out the answer True in both cases. Note that
1 = 11487751908344251135520265893748021674635 · 98450410312301241987513412837123123 + (−14253365031402041390769848991456372) · 79347851293412371287123183713712471927132,
whence the two large numbers are coprime.

Proof. Note that $\nu(a_1) > \nu(a_2) > \nu(a_3) > \cdots$. Since $\nu(a_i) \ge 0$ is an integer while $a_i \neq 0$, there must be some index $i$ such that $a_i = 0$. As further $a_0 \neq 0 \neq a_1$, a maximal $n \ge 1$ exists with $a_n \neq 0$.

(a) One checks this quickly for $j = i-1$, and it is trivial for $j = i$; the general case then follows by induction on $i - j$.

(b) Apply (a) for $j = 1$ and use $\begin{pmatrix} b_0 & c_0 \\ b_1 & c_1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$; this yields $A_i = T_i T_{i-1} \cdots T_2$. Now $\det T_i = -1$, whence $\det A_i = (-1)^{i-1}$.

(c) Since $(0, 1) \begin{pmatrix} * & * & * \\ a_i & * & * \end{pmatrix} (1, 0, 0)^T = a_i$, we can apply (a) and (b) to obtain
$$a_i = (0, 1) \begin{pmatrix} b_{i-1} & c_{i-1} \\ b_i & c_i \end{pmatrix} \begin{pmatrix} a_0 & b_0 & c_0 \\ a_1 & b_1 & c_1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = (b_i, c_i) \begin{pmatrix} a_0 \\ a_1 \end{pmatrix},$$
which simplifies to $b_i a_0 + c_i a_1$. Since $\pm 1 = \det A_i = b_{i-1} c_i - c_{i-1} b_i$, we see that $c_i$ and $b_i$ are coprime.

(d) That $a_n = b_n a_0 + c_n a_1$ follows from (c); this equation implies that every common divisor of $a_0$ and $a_1$ also divides $a_n$. So we are left to show $a_n \mid a_0, a_1$. For that, we show by induction on $j$ that $a_n$ divides both $a_{n+1-j}$ and $a_{n-j}$, for $j = 0, \dots, n$. For $j = 0$, $a_{n+1-j} = a_{n+1} = 0$ and $a_{n-j} = a_n$, whence this is clear. Now assume $a_n \mid a_{n+1-j}, a_{n-j}$. As $a_{n+1-(j+1)} = a_{n-j}$ and $a_{n-(j+1)} = a_{n-j-1} = q_{n+1-j} a_{n-j} + a_{n+1-j}$, we see using the induction hypothesis that $a_n$ also divides $a_{n+1-(j+1)}$ and $a_{n-(j+1)}$.

(e) Note that for $i = 1, \dots, n+1$, we have
$$A_i^{-1} = (-1)^{i-1} \begin{pmatrix} c_i & -c_{i-1} \\ -b_i & b_{i-1} \end{pmatrix}$$
(since $\det A_i = (-1)^{i-1}$). Using this together with parts (a) and (b), we obtain
$$(-1)^{i-1} a_0 = c_i a_{i-1} - c_{i-1} a_i \quad\text{and}\quad (-1)^i a_1 = b_i a_{i-1} - b_{i-1} a_i. \tag{1.1}$$
Plugging in $i = n+1$, we obtain $(-1)^n a_0 = c_{n+1} a_n$ and $(-1)^{n+1} a_1 = b_{n+1} a_n$.

(f) If $\nu(a_{i-2}) > \nu(a_{i-1})$ (which is always satisfied for $i = 3, \dots, n+1$), then we cannot have $q_i = 0$, as otherwise $a_i = a_{i-2}$ and thus $\nu(a_i) > \nu(a_{i-1})$, which contradicts the construction of $q_i$ and $a_i$.

In the following lemma, we prove two further results for the special cases of integers (with a more rigid definition of Euclidean division) and polynomials over a field. These will be very useful when analyzing the algorithm in the next section.

Lemma 1.4.6. Assume that we have either of the following two cases:
• $R = \mathbb{Z}$, $\nu(x) = |x|$ and for Euclidean division $a = qb + r$, we assume that $|a - r| < |a|$ in case $r \neq 0$, which is equivalent to $ar \ge 0$;
• $R = K[X]$ for some field $K$ and $\nu(f) = \deg f$.
Further assume that $\nu(a_0) \ge \nu(a_1)$. Then during the Euclidean algorithm, the following properties hold:
(a) For $i = 2, \dots, n+1$, $\nu(b_{i-1}) \le \nu(b_i)$ and $\nu(c_{i-1}) \le \nu(c_i)$. In the case of $R = \mathbb{Z}$, we have strict inequality except possibly $\nu(b_3) = \nu(b_2)$, which happens if and only if $\nu(q_3) = 1$, and $\nu(c_2) = \nu(c_1)$, which happens if and only if $\nu(q_2) = 1$. In the case of $R = K[X]$, we have strict inequality except possibly $\nu(c_1) = \nu(c_2)$, which happens if and only if $\nu(a_0) = \nu(a_1)$.
(b) For $i = 1, \dots, n+1$, $\nu(b_i a_{i-1}) \le \nu(a_1)$ and $\nu(c_i a_{i-1}) \le \nu(a_0)$, and in the case of $R = K[X]$, we have equality.

Remark 1.4.7. Note that in case $R = \mathbb{Z}$ of Lemma 1.4.6, we need the additional assumption that $a \cdot r \ge 0$ (which is equivalent to $|a - r| < |a|$ if $r \neq 0$). Consider $a_0 = 7$ and $a_1 = 5$. Since $7 = 2 \cdot 5 - 3$ and $|-3| < |5|$, we could choose $q_2 = 2$ and $a_2 = -3$. This would result in $b_2 = 1$ and $c_2 = -2$, whence $\nu(c_2 a_1) = 10 > 7 = \nu(a_0)$. If we continue as follows: $5 = (-2) \cdot (-3) - 1$ since $|-1| < |-3|$, we obtain $q_3 = -2$, $a_3 = -1$, $b_3 = 2$ and $c_3 = -3$. Here, both $\nu(c_3 a_2) = 9 > 7 = \nu(a_0)$ and $\nu(b_3 a_2) = 6 > 5 = \nu(a_1)$. This violates part (b) of the lemma.
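The effect described in Remark 1.4.7 can be reproduced numerically. The following sketch runs the recurrences of Theorem 1.4.5 on integers with two different choices of Euclidean division and checks the bounds of Lemma 1.4.6 (b). All function names (`xgcd_trace`, `div_same_sign`, `div_opposite_sign`, `lemma_b_holds`) are hypothetical helpers introduced only for this illustration.

```python
def div_same_sign(a, b):
    """Euclidean division whose remainder r satisfies a*r >= 0."""
    q, r = divmod(a, b)
    if r != 0 and a * r < 0:
        q, r = q + 1, r - b
    return q, r

def div_opposite_sign(a, b):
    """Euclidean division whose remainder r satisfies a*r <= 0."""
    q, r = divmod(a, b)
    if r != 0 and a * r > 0:
        q, r = q + 1, r - b
    return q, r

def xgcd_trace(a0, a1, division):
    """Return the rows (a_i, b_i, c_i) for i = 0, ..., n+1."""
    rows = [(a0, 1, 0), (a1, 0, 1)]
    while rows[-1][0] != 0:
        (a_prev, b_prev, c_prev), (a, b, c) = rows[-2], rows[-1]
        q, r = division(a_prev, a)
        rows.append((r, b_prev - q * b, c_prev - q * c))
    return rows

def lemma_b_holds(rows):
    """Check |b_i a_{i-1}| <= |a_1| and |c_i a_{i-1}| <= |a_0| for all i >= 1."""
    a0, a1 = rows[0][0], rows[1][0]
    return all(
        abs(rows[i][1] * rows[i - 1][0]) <= abs(a1)
        and abs(rows[i][2] * rows[i - 1][0]) <= abs(a0)
        for i in range(1, len(rows))
    )

print(xgcd_trace(7, 5, div_opposite_sign))
print(lemma_b_holds(xgcd_trace(7, 5, div_opposite_sign)))
print(lemma_b_holds(xgcd_trace(7, 5, div_same_sign)))
```

With `div_opposite_sign` the trace starts with the divisions $7 = 2 \cdot 5 - 3$ and $5 = (-2)(-3) - 1$ from the remark, and the bound check fails; with `div_same_sign` the bounds hold for this input.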
Proof of Lemma 1.4.6. Note that since $\det T_i = -1$, we have that $\gcd(b_{i-1}, b_i) = 1 = \gcd(c_{i-1}, c_i)$ for $i = 2, \dots, n+1$; we also have the coprimeness for $i = 1$, since $b_0 = 1 = c_1$. Therefore, for $i = 2, \dots, n+1$, we have $b_i = b_{i-2} - q_i b_{i-1} \neq 0$ and $c_i = c_{i-2} - q_i c_{i-1} \neq 0$ except possibly $c_2 = 0$ if $q_2 = 0$. But since $\nu(a_0) \ge \nu(a_1)$ we must have $q_2 \neq 0$ as well, whence also $c_2 \neq 0$. We consider the two cases $R = \mathbb{Z}$ and $R = K[X]$ separately.

• We begin with $R = K[X]$. First note that except possibly for $i = 2$ (where it could happen that $\deg a_0 = \deg a_1$), $\deg q_i > 0$ since $\deg a_{i-2} > \deg a_{i-1}$.
(a) We proceed by induction on $i$. For $i = 2$, $b_1 = 0$, $b_2 = 1$ and $c_1 = 1$, $c_2 = -q_2 \neq 0$. Now assume that the statement holds for some $i \ge 2$. Since $b_{i+1} = b_{i-1} - q_{i+1} b_i$ and $\deg b_{i-1} < \deg b_i$, $\deg b_{i+1} = \deg(q_{i+1} b_i) = \deg q_{i+1} + \deg b_i > \deg b_i$ since $\deg q_{i+1} > 0$. The claim holds analogously for $c_{i+1}$.
(b) Since $(-1)^{i-1} a_0 = c_i a_{i-1} - c_{i-1} a_i$ and $\deg c_i > \deg c_{i-1}$ and $\deg a_{i-1} > \deg a_i$, $\deg a_0 = \deg(c_i a_{i-1})$. The claim holds analogously for $\deg a_1 = \deg(b_i a_{i-1})$.

• Now we consider $R = \mathbb{Z}$. We begin with proving some auxiliary claims. Claim (iii) directly implies (a), and claim (v) directly implies (b).

(i) Claim: If $a_0$ and $a_1$ have the same sign, then all $q_i \ge 0$, $i = 2, \dots, n$; otherwise, all $q_i \le 0$, $i = 2, \dots, n$.
First note that $a_i$ always has the same sign as $a_{i+2}$. If both $a_0$ and $a_1$ have the same sign, then all $a_i$ have that sign. As $|a_i| < |a_{i-1}| \le |a_{i-2}|$ and $a_{i-2} = q_i a_{i-1} + a_i$, it must be that $q_i \ge 0$. If $a_0 a_1 < 0$, we have that $a_{i-2}$ and $q_i a_{i-1}$ have the same sign. Since $a_{i-2} a_{i-1} \le 0$, it must be that $q_i \le 0$.

(ii) Claim: If $a_0$ and $a_1$ have the same sign, $(-1)^i b_i, (-1)^{i+1} c_i \ge 0$ for $i = 0, \dots, n+1$, and if $a_0$ and $a_1$ have different signs, $b_i, c_i \ge 0$ for $i = 0, \dots, n+1$.
First assume that $a_0 a_1 > 0$. We prove the claim by induction on $(i-1, i)$. For $i = 0$, $(-1)^i b_i = 1 \ge 0$ and $(-1)^{i+1} c_i = 0 \ge 0$. For $i = 1$, $(-1)^i b_i = 0 \ge 0$ and $(-1)^{i+1} c_i = 1 \ge 0$.
Now assume that the claim is true for $i-1$ and $i$. Then
$$(-1)^{i+1} b_{i+1} = (-1)^{i+1} b_{i-1} - q_{i+1} (-1)^{i+1} b_i = (-1)^{i-1} b_{i-1} + q_{i+1} (-1)^i b_i.$$
Since by claim (i), $q_{i+1} \ge 0$, and by induction hypothesis, $(-1)^{i-1} b_{i-1}, (-1)^i b_i \ge 0$, we get $(-1)^{i+1} b_{i+1} \ge 0$. Similarly,
$$(-1)^{(i+1)+1} c_{i+1} = (-1)^{i+2} c_{i-1} - q_{i+1} (-1)^{i+2} c_i = (-1)^{(i-1)+1} c_{i-1} + q_{i+1} (-1)^{i+1} c_i \ge 0.$$
Now assume that $a_0 a_1 < 0$. We again proceed by induction on $(i-1, i)$. For $i = 0$ and $i = 1$, all $b_i$ and $c_i$ are $\ge 0$ by definition. Now assume that the claim is true for $i-1$ and $i$. Then $b_{i+1} = b_{i-1} - q_{i+1} b_i \ge 0$ since by induction hypothesis, $b_{i-1}, b_i \ge 0$ and by claim (i), $q_{i+1} \le 0$. The proof for $c_{i+1} \ge 0$ proceeds the same way.

(iii) Claim: We have $|b_i| = |q_i b_{i-1}| + |b_{i-2}| > |b_{i-1}|$ and $|c_i| = |q_i c_{i-1}| + |c_{i-2}| > |c_{i-1}|$ for $i = 2, \dots, n+1$.
First assume that $a_0 a_1 > 0$. By claim (ii), $|b_i| = (-1)^i b_i$, whence
$$|b_i| = (-1)^i b_{i-2} - (-1)^i q_i b_{i-1} = (-1)^{i-2} b_{i-2} + q_i (-1)^{i-1} b_{i-1}.$$
Using claim (ii) again yields $|b_{i-2}| = (-1)^{i-2} b_{i-2}$ and $|b_{i-1}| = (-1)^{i-1} b_{i-1}$, and by claim (i), $|q_i| = q_i$. Therefore, $|b_i| = |q_i||b_{i-1}| + |b_{i-2}|$. Similarly, $|c_i| = |q_i||c_{i-1}| + |c_{i-2}|$.
Now assume that $a_0 a_1 < 0$. By claim (ii), $|b_i| = b_i = b_{i-2} - q_i b_{i-1} = b_{i-2} + (-q_i) b_{i-1}$. By claim (ii), $|b_{i-2}| = b_{i-2}$ and $|b_{i-1}| = b_{i-1}$, and by claim (i), $|q_i| = -q_i$. Therefore, $|b_i| = |q_i||b_{i-1}| + |b_{i-2}|$. Similarly, $|c_i| = |q_i||c_{i-1}| + |c_{i-2}|$.
We now prove the statement on the inequality by strong induction on $i$. Note that $b_0 = 1$, $b_1 = 0$, $b_2 = 1$ and $b_3 = -q_3$. Therefore, $|b_2| > |b_1|$ and $|q_3| = |b_3| \ge |b_2| = 1$ as $q_3 \neq 0$ (part (f) of the theorem); moreover, we have $|b_3| > |b_2|$ if and only if $|q_3| \neq 1$. Now $c_0 = 0$, $c_1 = 1$, $c_2 = -q_2$ and $c_3 = 1 + q_2 q_3$. As $q_2 q_3 \ge 0$ by claim (i) and $q_2, q_3 \neq 0$ by part (f) of the theorem, $c_3 = 1 + |q_2 q_3| \ge 1 + |q_2| > |q_2| = |c_2| \ge |c_1| = 1$; moreover, we have $|c_2| > |c_1|$ if and only if $|q_2| \neq 1$.
Finally, assume that $i > 3$ and that the (not necessarily strict) inequality statement is true for all smaller indices. The above shows that $b_i \neq 0 \neq c_i$ for $i > 1$, whence $|b_i| = |q_i||b_{i-1}| + |b_{i-2}| > |q_i||b_{i-1}| \ge |b_{i-1}|$ since $q_i \neq 0$. Similarly, $|c_i| = |q_i||c_{i-1}| + |c_{i-2}| > |q_i||c_{i-1}| \ge |c_{i-1}|$.

(iv) Claim: If $i, j \in \{0, \dots, n+1\}$ with $i \not\equiv j \pmod 2$, we have
$$(-1)^j \frac{a_i b_j}{a_1} \ge 0 \quad\text{and}\quad (-1)^i \frac{a_i c_j}{a_0} \ge 0.$$
We have $(-1)^i a_i$ has the same sign as $a_0$ and $(-1)^{i+1} a_i$ has the same sign as $a_1$. (See note in claim (i).) Now first consider that $a_0$ and $a_1$ have the same sign; then all $a_i$ have the same sign. By claim (ii), $(-1)^j b_j, (-1)^{j+1} c_j \ge 0$. Combining this yields
$$0 \le (-1)^j b_j \cdot \frac{a_i}{a_1} \quad\text{and}\quad 0 \le (-1)^{j+1} c_j \cdot \frac{a_i}{a_0}.$$
Now assume that $a_0$ and $a_1$ have different signs. By claim (ii), $b_j, c_j \ge 0$ for all $j$. Now
$$0 \le b_j \cdot \frac{(-1)^{i+1} a_i}{a_1} \quad\text{and}\quad 0 \le c_j \cdot \frac{(-1)^i a_i}{a_0},$$
and since $(-1)^{i+1} = (-1)^j$ and $(-1)^{j+1} = (-1)^i$ the claim follows.

(v) Claim: For $i = 1, \dots, n+1$, $|b_i| \le |a_1 / a_{i-1}|$ and $|c_i| \le |a_0 / a_{i-1}|$.
We show this claim by induction over $i$. For $i = 1$, $b_i = 0$, $c_i = 1$ and $a_0 / a_{i-1} = a_0 / a_0 = 1$. Hence, $|b_i| = 0 \le |a_1 / a_{i-1}|$ and $|c_i| = 1 = |a_0 / a_{i-1}|$. Now assume that the statement is true for some $i$. As $|a_{i+1}| < |a_{i-1}|$, the induction hypothesis yields
$$\left|\frac{a_{i+1} b_i}{a_1}\right| \le \left|\frac{a_{i-1} b_i}{a_1}\right| \le 1 \quad\text{and}\quad \left|\frac{a_{i+1} c_i}{a_0}\right| < \left|\frac{a_{i-1} c_i}{a_0}\right| \le 1.$$
By claim (iv), $(-1)^i \frac{a_{i+1} b_i}{a_1} \ge 0$, whence the above yields
$$\frac{|a_{i+1} b_i - (-1)^i a_1|}{|a_1|} = \left|(-1)^i \frac{a_{i+1} b_i}{a_1} - 1\right| \le 1.$$
Since $a_{i+1} b_i - (-1)^i a_1 = a_i b_{i+1}$ by part (e) of the theorem, we obtain $|a_i b_{i+1}| \le |a_1|$. Similarly, by claim (iv), $(-1)^{i+1} \frac{a_{i+1} c_i}{a_0} \ge 0$, whence the above yields
$$\frac{|a_{i+1} c_i + (-1)^i a_0|}{|a_0|} = \left|(-1)^{i+1} \frac{a_{i+1} c_i}{a_0} - 1\right| \le 1.$$
Since $a_{i+1} c_i + (-1)^i a_0 = a_i c_{i+1}$ by part (e) of the theorem, we finally obtain $|a_i c_{i+1}| \le |a_0|$.

1.4.1 Analyzing the Euclidean Algorithm

In this section, we still use the notation of Theorem 1.4.5.

Theorem 1.4.8 (Euclidean Algorithm for Polynomials).
Let $K$ be a field and $f, g \in K[x] \setminus \{0\}$ two non-zero polynomials given via their dense representation. Assume that we use the basic polynomial arithmetic described in Section 1.3 with dense representations.
(a) Then the Euclidean algorithm needs at most $2 \deg f \cdot \deg g + \deg f + \deg g + 1$ field operations (without inversion), $\min\{\deg f, \deg g\} + 1$ comparisons (with $0 \in K$) and at most $\min\{\deg f, \deg g\} - 2$ inversions in $K$ to compute a dense representation of a greatest common divisor of $f$ and $g$.
(b) The Euclidean algorithm needs at most $6 \deg f \cdot \deg g - 2 \max\{\deg f, \deg g\} - 4 \min\{\deg f, \deg g\} + 6$ field operations (without inversion), $\min\{\deg f, \deg g\} + 1$ comparisons (with $0 \in K$) and at most $\min\{\deg f, \deg g\} - 2$ inversions in $K$ to compute both a dense representation of a greatest common divisor $d \in K[x]$ of $f$ and $g$ as well as dense representations of $x, y \in K[x]$ such that $d = fx + gy$.

Proof. Set $m_i := \deg a_i$, $0 \le i \le n+1$, and assume that $m_0 \ge m_1$; then $m_0 \ge m_1 > m_2 > \cdots > m_{n+1}$. Since the number of field operations to compute Euclidean long division for polynomials of degree $m_{i-1}$ and $m_i$ is at most $(2m_i + 1)(m_{i-1} - m_i + 1)$ (Theorem 1.3.2 (d)), and additionally $m_i - m_{i+1}$ comparisons (since $m_{i+1}$ is the degree of the remainder), the total number of field operations required for the Euclidean algorithm for just computing a GCD is at most $\sum_{i=1}^{n} (2m_i + 1)(m_{i-1} - m_i + 1)$ and $\sum_{i=1}^{n} (m_i - m_{i+1}) = m_1 + 1$ comparisons.

For a sequence $m_0 \ge m_1 > \cdots > m_n \ge 0$, define $f(m_0, \dots, m_n) := \sum_{i=1}^{n} (2m_i + 1)(m_{i-1} - m_i + 1)$. We will show the following three claims:

(i) Claim: For $x \ge y$, we have $f(x, y, y-1, y-2, \dots, 2, 1, 0) = 2xy + x + y + 1$.
We have
$$\begin{aligned}
f(x, y, y-1, y-2, \dots, 2, 1, 0) &= (2y+1)(x-y+1) + f(y, y-1, \dots, 1, 0) \\
&= 2(x-y)y + x + y + 1 + \sum_{i=1}^{y} (2(y-i)+1)\bigl((y-i+1) - (y-i) + 1\bigr) \\
&= 2(x-y)y + x + y + 1 + 2 \sum_{i=1}^{y} (2y - 2i + 1) \\
&= 2(x-y)y + x + y + 1 + 4y^2 - 2y(y+1) + 2y \\
&= 2xy + x + y + 1.
\end{aligned}$$
(ii) Claim: Assume that $m_0 \ge \cdots \ge m_n$ and that there exists $i \in \{1, \dots, n\}$ and $m$ with $m_{i-1} \ge m \ge m_i$. Then $f(m_0, \dots, m_n) \le f(m_0, \dots, m_{i-1}, m, m_i, \dots, m_n)$.
We have
$$\begin{aligned}
&f(m_0, \dots, m_{i-1}, m, m_i, \dots, m_n) - f(m_0, \dots, m_n) \\
&\quad= (2m+1)(m_{i-1} - m + 1) + (2m_i+1)(m - m_i + 1) - (2m_i+1)(m_{i-1} - m_i + 1) \\
&\quad= 2(m - m_i)(m_{i-1} - m) + 2m + 1 > 0
\end{aligned}$$
since $m_{i-1} \ge m$ and $m \ge m_i$.

(iii) Claim: Assume that $m_0 \ge \cdots \ge m_n$ and that $m_n \ge m \ge 0$. Then $f(m_0, \dots, m_n) \le f(m_0, \dots, m_n, m)$.
We have $f(m_0, \dots, m_n, m) - f(m_0, \dots, m_n) = (2m+1)(m_n - m + 1) > 0$ since $m_n \ge m$.

Finally, note that since $m_{i+1} \le m_i - 1$, we have $m_n \le m_1 - (n-1)$, whence $m_n \ge 0$ implies $n - 1 \le m_1$. Therefore, we have at most $\min\{\deg f, \deg g\} - 2$ loop iterations and also at most these many inversions, and need at most
$$\sum_{i=1}^{n} (2m_i + 1)(m_{i-1} - m_i + 1) \le 2m_0 m_1 + m_0 + m_1 + 1 = 2 \deg f \cdot \deg g + \deg f + \deg g + 1$$
field operations.

We now need to investigate the additional complexity of computing the Bézout equation. By Lemma 1.4.6, $\deg b_i = \deg a_1 - \deg a_{i-1} = m_1 - m_{i-1}$ and $\deg c_i = \deg a_0 - \deg a_{i-1} = m_0 - m_{i-1}$ for $i \ge 2$. Note that since $b_i = b_{i-2} - q_i b_{i-1}$, the computation of $b_i$ requires at most $(\deg b_i + 1) + (2 \deg q_i \cdot \deg b_{i-1} + \deg q_i + \deg b_{i-1} + 1)$ field operations for $i \ge 3$ by Theorem 1.3.2 (b) and (c). Similarly, the computation of $c_i$ requires at most $2 \deg q_i \cdot \deg c_{i-1} + \deg q_i + \deg c_{i-1} + \deg c_i + 2$ field operations. For $i = 2$, $b_2 = 1 - q_2 \cdot 0 = 1$ and $c_2 = 0 - q_2 \cdot 1 = -q_2$, whence $\deg b_2 = 0$ and $\deg c_2 = \deg q_2 = \deg a_0 - \deg a_1 = m_0 - m_1$. Note that no comparisons need to be done since $K$ is zero-divisor free and no cancellations occur during the subtractions. Now $\deg q_i = \deg a_{i-2} - \deg a_{i-1} = m_{i-2} - m_{i-1}$.
This yields that the total number of field operations to compute both $b_i$ and $c_i$, $3 \le i \le n+1$, is at most
$$\begin{aligned}
&\bigl(2(m_{i-2} - m_{i-1})(m_0 - m_{i-2}) + (m_{i-2} - m_{i-1}) + (m_0 - m_{i-2}) + (m_0 - m_{i-1}) + 2\bigr) \\
&\quad+ \bigl(2(m_{i-2} - m_{i-1})(m_1 - m_{i-2}) + (m_{i-2} - m_{i-1}) + (m_1 - m_{i-2}) + (m_1 - m_{i-1}) + 2\bigr) \\
&= 2\bigl((m_{i-2} - m_{i-1}) \cdot ((m_0 + m_1) - 2m_{i-2}) + ((m_0 + m_1) - 2m_{i-1}) + 2\bigr),
\end{aligned}$$
while for $i = 2$, we need $m_0 - m_1 + 1$ negations. Therefore, the total time needed to compute all $b_i, c_i$, $2 \le i \le n$ (note that we do not need $b_{n+1}$ and $c_{n+1}$), equals
$$m_0 - m_1 + 1 + 2 \sum_{i=3}^{n} \bigl((m_{i-2} - m_{i-1}) \cdot ((m_0 + m_1) - 2m_{i-2}) + ((m_0 + m_1) - 2m_{i-1}) + 2\bigr).$$
We proceed as before: for a sequence $m_0 \ge m_1 > \cdots > m_n \ge 0$, define $f(m_0, \dots, m_n)$ as
$$m_0 - m_1 + 1 + 2 \sum_{i=3}^{n} \bigl((m_{i-2} - m_{i-1}) \cdot ((m_0 + m_1) - 2m_{i-2}) + ((m_0 + m_1) - 2m_{i-1}) + 2\bigr).$$
We will show the following three claims:

(i) Claim: For $x \ge y$, we have $f(x, y, y-1, y-2, \dots, 2, 1, 0) = 4xy - 3x - 5y + 5$.
Note that since $m_i = y - i + 1$, we have $n = y + 1$ and $2\bigl((m_{i-2} - m_{i-1}) \cdot ((m_0 + m_1) - 2m_{i-2}) + ((m_0 + m_1) - 2m_{i-1}) + 2\bigr) = 4(x - y + 2i - 5)$. Therefore,
$$\begin{aligned}
f(x, y, y-1, y-2, \dots, 2, 1, 0) &= x - y + 1 + 4 \sum_{i=3}^{y+1} (x - y + 2i - 5) \\
&= (1 + 4(y-1))(x - y) + 1 - 20(y-1) + 8\bigl(\tfrac12 (y+1)(y+2) - 1 - 2\bigr) \\
&= 4xy - 3x - 5y + 5.
\end{aligned}$$

(ii) Claim: Assume that $m_0 \ge \cdots \ge m_n$ and that there exists $i \in \{2, \dots, n\}$ and $m$ with $m_{i-1} \ge m \ge m_i$. Then $f(m_0, \dots, m_n) \le f(m_0, \dots, m_{i-1}, m, m_i, \dots, m_n)$.
We have
$$\begin{aligned}
&\tfrac12 \bigl(f(m_0, \dots, m_{i-1}, m, m_i, \dots, m_n) - f(m_0, \dots, m_n)\bigr) \\
&\quad= \bigl((m_{i-1} - m) \cdot ((m_0 + m_1) - 2m_{i-1}) + ((m_0 + m_1) - 2m) + 2\bigr) \\
&\qquad+ \bigl((m - m_i) \cdot ((m_0 + m_1) - 2m) + ((m_0 + m_1) - 2m_i) + 2\bigr) \\
&\qquad- \bigl((m_{i-1} - m_i) \cdot ((m_0 + m_1) - 2m_{i-1}) + ((m_0 + m_1) - 2m_i) + 2\bigr) \\
&\quad= 2(m - m_i)(m_{i-1} - m) + (m_0 - m) + (m_1 - m) + 2 > 0
\end{aligned}$$
since $m_{i-1} \ge m$, $m \ge m_i$, and $m \le m_0, m_1$.

(iii) Claim: Assume that $m_0 \ge \cdots \ge m_n$ and that $m_n \ge m \ge 0$. Then $f(m_0, \dots, m_n) \le f(m_0, \dots, m_n, m)$.
We have
$$f(m_0, \dots, m_n, m) - f(m_0, \dots, m_n) = 2\bigl((m_{n-1} - m_n) \cdot ((m_0 + m_1) - 2m_{n-1}) + ((m_0 + m_1) - 2m_n) + 2\bigr) \ge 0$$
since $m_0 \ge m_1 \ge m_{n-1} \ge m_n$.

This shows that the number of field operations is bounded by $4m_0 m_1 - 3m_0 - 5m_1 + 5$, and the number of field operations (excluding inversions) for both computing the GCD as well as the Bézout equation is bounded by $6m_0 m_1 - 2m_0 - 4m_1 + 6$.

Theorem 1.4.9 (Euclidean Algorithm for Integers). Let $a, c \in \mathbb{Z} \setminus \{0\}$ be two non-zero integers given as reduced $b$-adic representations. Assume that we use the basic multiprecision integer arithmetic described in Section 1.1, and assume that during long division $x = qy + r$ with $|r| < |y|$, we always have $xr \ge 0$.
(a) Then the Euclidean algorithm needs at most $O(\log |a| \cdot \log |c|)$ basic operations to compute a $b$-adic representation of a greatest common divisor of $a$ and $c$.
(b) The Euclidean algorithm needs at most $O(\log |a| \cdot \log |c|)$ basic operations to compute both a $b$-adic representation of a greatest common divisor $d \in \mathbb{Z}$ of $a$ and $c$ as well as $b$-adic representations of $x, y \in \mathbb{Z}$ such that $d = ax + cy$.

Proof. Note that by Corollary 1.2.3, two $b$-adic representations of two integers $x, y$ can be added and subtracted in $O(\max\{\log |x|, \log |y|\})$ basic $b$-operations and a long division $x = qy + r$, $|r| < |y|$ with $xr \ge 0$ can be computed in $O(\max\{\log |y| \cdot (\log |x| - \log |y| + 1), 1\})$ basic $b$-operations.

Without loss of generality, assume that $|a_0| \ge |a_1|$, and define $m_i := \log |a_i|$, $0 \le i \le n$. Next, note that during the Euclidean algorithm, $|a_{i+2}| \le \tfrac12 |a_i|$ for $i \ge 1$, whence $\log |a_i|$ decreases in two steps by at least $\log 2$. This yields $|a_{2i+1}| \le 2^{-i} |a_1|$. Choosing $i = \lfloor (n-1)/2 \rfloor$, so that $2i + 1 \le n$, we see that $|a_n| \ge 1$ implies $1 \le |a_n| \le |a_{2i+1}| \le 2^{-\lfloor (n-1)/2 \rfloor} |a_1| \le 2^{-(n-2)/2} |a_1|$. Then $2^{(n-2)/2} \le |a_1|$ yields $n \le 2 + \frac{2}{\log 2} \log |a_1| \in O(m_1)$.

To just compute a GCD, we need to do long divisions. Each long division $a_{i-2} = q_i a_{i-1} + a_i$, $|a_i| < |a_{i-1}|$, $a_{i-2} a_i \ge 0$ can be computed in $O(m_{i-1}(m_{i-2} - m_{i-1} + 1))$ basic $b$-operations.
Therefore, the total number of basic $b$-operations required is in
$$O\Bigl(\sum_{i=2}^{n+1} m_{i-1}(m_{i-2} - m_{i-1} + 1)\Bigr).$$
Now note that
$$\begin{aligned}
\sum_{i=1}^{n} m_i (m_{i-1} - m_i + 1) &= \sum_{i=1}^{n} m_i m_{i-1} - \sum_{i=1}^{n} m_i m_i + \sum_{i=1}^{n} m_i \\
&= m_0 m_1 + \sum_{i=2}^{n} m_i m_{i-1} - \sum_{i=2}^{n} m_{i-1} m_{i-1} - m_n m_n + \sum_{i=2}^{n} m_{i-1} + m_n \\
&= m_0 m_1 + \sum_{i=2}^{n} (\underbrace{m_i - m_{i-1}}_{\le 0} + 1)\, m_{i-1} - m_n m_n + m_n \\
&\le m_0 m_1 + \sum_{i=1}^{n} m_i \le m_0 m_1 + n m_1.
\end{aligned}$$
Since $n = O(m_1)$, we obtain that the number of basic $b$-operations is bounded by (see footnote 17) $O(m_0 m_1 + m_1^2) = O(m_0 m_1)$ since $m_1 \le m_0$.

For (b), note that $|q_i| = \lfloor |a_{i-2}| / |a_{i-1}| \rfloor \le |a_{i-2}| / |a_{i-1}|$, whence $\log |q_i| \le m_{i-2} - m_{i-1}$. Moreover, by Lemma 1.4.6 (b), $|b_i| \le |a_1| / |a_{i-1}|$ and $|c_i| \le |a_0| / |a_{i-1}|$ for $i = 1, \dots, n+1$; thus, $\log |b_i| \le m_1 - m_{i-1}$ and $\log |c_i| \le m_0 - m_{i-1}$. Finally, by Lemma 1.4.6 (a), the sequences $(|b_i|)_i$ and $(|c_i|)_i$ are increasing, except for $i = 0, 1$, but there, the values are in $\{0, 1\}$. Therefore, computing $b_i = b_{i-2} - q_i b_{i-1}$ and $c_i = c_{i-2} - q_i c_{i-1}$ requires $O(m_0 + m_1 - 2m_{i-1})$ basic $b$-operations for the subtractions and $O((m_{i-2} - m_{i-1}) \cdot (m_0 + m_1 - 2m_{i-1}))$ basic $b$-operations for the multiplications. For $i = 2$ though, $b_2 = 1$ and $c_2 = -q_2$, whence for this step, we need only $O(\log |q_2|) \subseteq O(m_0 - m_1)$ basic $b$-operations. Therefore, the total number of basic $b$-operations is in
$$O\Bigl(m_0 - m_1 + \sum_{i=3}^{n} \bigl((m_0 + m_1 - 2m_{i-1}) + (m_{i-2} - m_{i-1}) \cdot (m_0 + m_1 - 2m_{i-1})\bigr)\Bigr).$$
The sum equals
$$\begin{aligned}
(m_0 - m_1) + \sum_{i=3}^{n} (m_0 + m_1 - 2m_{i-1})(m_{i-2} - m_{i-1} + 1) &\le (m_0 - m_1) + (m_0 + m_1) \sum_{i=3}^{n} (m_{i-2} - m_{i-1} + 1) \\
&= (m_0 - m_1) + (m_0 + m_1)(m_1 - m_{n-1} + (n-2)).
\end{aligned}$$
Since $m_{n-1} \le m_1$, $m_1 \le m_0$ and $n = O(m_1)$, the total number of basic $b$-operations to additionally compute the Bézout equation is $O(m_0 m_1)$.

1.4.2 Normalizing Euclidean Algorithm

Note that as soon as $|R^*| > 1$, two non-zero elements of $R$ do not have the greatest common divisor, but only a greatest common divisor. If $g$ is such a greatest common divisor, then so is $ge$, where $e \in R^*$. In fact, these are all greatest common divisors of these two elements.
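As a small illustration of this non-uniqueness in $R = \mathbb{Z}$, where $R^* = \{1, -1\}$: the function from Listing 1.1 returns one of the two associated greatest common divisors, and which one depends on the signs and the order of the inputs. The sketch below repeats that function so that it is self-contained and compares the results with `math.gcd` from the standard library, which always returns the non-negative representative.

```python
import math

def gcd(a, b):
    "Compute a GCD of its two inputs (as in Listing 1.1)"
    while b != 0:
        a, b = b, a % b
    return a

for a, b in [(12, 18), (12, -18), (-18, 12)]:
    d = gcd(a, b)
    # d and -d divide both inputs, so both are greatest common divisors.
    print((a, b), d, math.gcd(a, b), a % d == 0 and b % d == 0)
```

For the inputs (12, −18) the loop ends with −6, while for (−18, 12) it ends with 6; both are greatest common divisors of 12 and 18 in the sense of the text.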
Footnote 17: Note that a similar technique could be applied in the proof of Theorem 1.4.8. Unfortunately, it would yield a leading coefficient of 4 instead of 2 in part (a).

To make greatest common divisors unique, we need to introduce a normal form. For this, we take a selection $M$ of elements which we will call monic, such that every non-zero ring element $a \in R$ is associated to precisely one $m \in M$. Let $e \in R^*$ be the unit with $me = a$; we then write $\mathrm{LU}(a) := e$ and call $\mathrm{LU}(a)$ the leading unit of $a$. We call $m = \mathrm{LU}(a)^{-1} a \in M$ the normalization of $a$. We always assume that $\mathrm{LU}(1) = 1$, and for $a = 0$, we define $\mathrm{LU}(a) = 1$.

For $R = \mathbb{Z}$, we take $M$ as the set of positive integers. Then the leading unit of an integer $z \in \mathbb{Z} \setminus \{0\}$ is its sign $\mathrm{sgn}(z)$, and its normalization is its absolute value $|z|$.

For $R = K[X]$, we take $M$ as the set of monic polynomials, i.e. the polynomials of the form $X^n + a_{n-1} X^{n-1} + \cdots + a_0 \in K[X]$. Then the leading unit $\mathrm{LU}(f)$ of a polynomial is identical to the leading coefficient $\mathrm{LC}(f)$ as defined in Remark 1.3.1. The normalization of $f$ is obtained by dividing all coefficients by the leading coefficient; thus the normalization of $-2X^4 + 4X^2 - 3X + 1 \in \mathbb{Q}[X]$ is $X^4 - 2X^2 + \tfrac32 X - \tfrac12$.

If $R$ is some ring which is not a field, and $\mathrm{LU}$ is defined for $R$, we can extend $\mathrm{LU}$ to $R[X]$ by setting $\mathrm{LU}(f) = \mathrm{LU}(a_n)$ if $f = \sum_{i=0}^{n} a_i X^i$ and $a_n \neq 0$. If $R$ is a field, we are forced to take $\mathrm{LU}(a) = a$ for $a \neq 0$ (as we require $\mathrm{LU}(1) = 1$), whence the normalization of everything non-zero is 1; then this construction for $R[X]$ yields the same leading unit as above for $K[X]$. But in case $R = \mathbb{Z}$, this construction just flips the sign of the polynomial such that the leading term is positive. That is, for $f = -2X^4 + 4X^2 - 3X + 1$, $\mathrm{LU}(f) = -1$ and $\mathrm{LU}(f)^{-1} f = 2X^4 - 4X^2 + 3X - 1$. (After all, we cannot get rid of the leading coefficient 2 as above for $\mathbb{Q}[X]$, since 2 is not a unit in $R = \mathbb{Z}$.)

We can now describe the normalized Euclidean algorithm.
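The leading unit and the normalization for the rings discussed above can be sketched in a few lines. Polynomials are represented densely as coefficient lists with the highest degree first and the empty list for 0; the function names and the use of `fractions.Fraction` for $\mathbb{Q}$ are assumptions made for this illustration only.

```python
from fractions import Fraction

def lu_int(z):
    """Leading unit of an integer: its sign, with LU(0) = 1."""
    return -1 if z < 0 else 1

def normalize_int(z):
    """Normalization LU(z)^(-1) * z; the units 1 and -1 are self-inverse."""
    return lu_int(z) * z

def normalize_poly_over_Q(f):
    """Over the field Q, LU(f) = LC(f): divide by the leading coefficient."""
    if not f:
        return []
    lc = Fraction(f[0])
    return [Fraction(c) / lc for c in f]

def normalize_poly_over_Z(f):
    """Over Z, LU(f) = LU(a_n) is the sign of the leading coefficient."""
    if not f:
        return []
    u = lu_int(f[0])
    return [u * c for c in f]

f = [-2, 0, 4, -3, 1]              # -2X^4 + 4X^2 - 3X + 1
print(normalize_int(-12))
print(normalize_poly_over_Q(f))
print(normalize_poly_over_Z(f))
```

For the polynomial from the text, the sketch reproduces $X^4 - 2X^2 + \tfrac32 X - \tfrac12$ over $\mathbb{Q}$ and $2X^4 - 4X^2 + 3X - 1$ over $\mathbb{Z}$.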
Note that in case $d$ is a greatest common divisor of $a$ and $b$, we define $\gcd(a, b) := \mathrm{LU}(d)^{-1} d$, so that $\gcd(a, b)$ is normalized. By going through Theorem 1.4.5 and Lemma 1.4.6, we obtain the following corollary:

Corollary 1.4.10 (Normalized Euclidean Algorithm). Let $R$ be a Euclidean domain with valuation $\nu$ and assume that we have normalizations. Assume that we are given $f, g \in R \setminus \{0\}$ with $\nu(f) \ge \nu(g)$. Define sequences $(a_i)_{i \ge 0}$, $(b_i)_{i \ge 0}$, $(c_i)_{i \ge 0}$, $(q_i)_{i \ge 0}$, $(\rho_i)_{i \ge 0}$, $(T_i)_{i \ge 0}$ as follows:
• define $\rho_0 := \mathrm{LU}(f)$ and $a_0 := \rho_0^{-1} f$ and $\rho_1 := \mathrm{LU}(g)$ and $a_1 := \rho_1^{-1} g$;
• define $\begin{pmatrix} b_0 & c_0 \\ b_1 & c_1 \end{pmatrix} := \begin{pmatrix} \rho_0^{-1} & 0 \\ 0 & \rho_1^{-1} \end{pmatrix}$;
• if $a_{i-2}$ and $a_{i-1}$ are defined, define $q_i$, $a_i$, $\rho_i$, $T_i$, $b_i$ and $c_i$ as follows:
(i) let $q_i, \hat a_i \in R$ with $a_{i-2} = q_i a_{i-1} + \hat a_i$ such that $\nu(\hat a_i) < \nu(a_{i-1})$ (Euclidean division) and $\rho_i = \mathrm{LU}(\hat a_i)$, $a_i = \rho_i^{-1} \hat a_i$ in case $a_{i-1} \neq 0$, and $q_i = a_i = 0$ and $\rho_i = 1$ in case $a_{i-1} = 0$; then $\rho_i a_i = a_{i-2} - q_i a_{i-1}$;
(ii) define $T_i := \begin{pmatrix} 0 & 1 \\ \rho_i^{-1} & -q_i \rho_i^{-1} \end{pmatrix} \in R^{2 \times 2}$;
(iii) define $(b_i, c_i) := (\rho_i^{-1}, -q_i \rho_i^{-1}) \begin{pmatrix} b_{i-2} & c_{i-2} \\ b_{i-1} & c_{i-1} \end{pmatrix}$.

Then there exists an index $i$ with $a_i = 0$. Let $n \ge 1$ be the largest index such that $a_n \neq 0$. Then the following holds:
(a) for $1 \le j \le i \le n+1$,
$$\begin{pmatrix} a_{i-1} & b_{i-1} & c_{i-1} \\ a_i & b_i & c_i \end{pmatrix} = T_i T_{i-1} \cdots T_{j+1} \begin{pmatrix} a_{j-1} & b_{j-1} & c_{j-1} \\ a_j & b_j & c_j \end{pmatrix};$$
(b) for $i = 1, \dots, n+1$,
$$A_i := \begin{pmatrix} b_{i-1} & c_{i-1} \\ b_i & c_i \end{pmatrix} = T_i T_{i-1} \cdots T_2 \begin{pmatrix} \rho_0^{-1} & 0 \\ 0 & \rho_1^{-1} \end{pmatrix}$$
and $\det A_i = (-1)^{i-1} \prod_{j=0}^{i} \rho_j^{-1} \in R^*$;
(c) for $i = 1, \dots, n+1$, $a_i = b_i f + c_i g$ and $\gcd(b_i, c_i) = 1$;
(d) $\gcd(a_0, a_1) = a_n = b_n f + c_n g$ (Bézout equation);
(e) $(-1)^{i-1} f \prod_{j=0}^{i} \rho_j^{-1} = c_i a_{i-1} - c_{i-1} a_i$ and $(-1)^i g \prod_{j=0}^{i} \rho_j^{-1} = b_i a_{i-1} - b_{i-1} a_i$ for $i = 1, \dots, n+1$; in particular (using $\rho_{n+1} = 1$),
$$f = (-1)^n c_{n+1} a_n \prod_{j=0}^{n} \rho_j \quad\text{and}\quad g = (-1)^{n+1} b_{n+1} a_n \prod_{j=0}^{n} \rho_j;$$
(f) we have $q_i \neq 0$, $2 \le i \le n+1$;
(g) Assume that we have either of the following two cases:
• $R = \mathbb{Z}$, $\nu(x) = |x|$ and for Euclidean division $a = qb + r$, we assume that $r \ge 0$ if $a, b \ge 0$;
• $R = K[X]$ for some field $K$ and $\nu(f) = \deg f$.
Then:
(i) For $i = 2, \dots, n+1$, $\nu(b_{i-1}) \le \nu(b_i)$ and $\nu(c_{i-1}) \le \nu(c_i)$. In the case of $R = \mathbb{Z}$, we have strict inequality except possibly $\nu(b_3) = \nu(b_2)$, which happens if and only if $\nu(q_3) = 1$, and $\nu(c_2) = \nu(c_1)$, which happens if and only if $\nu(q_2) = 1$. In the case of $R = K[X]$, we have strict inequality except possibly $\nu(c_1) = \nu(c_2)$, which happens if and only if $\nu(a_0) = \nu(a_1)$.
(ii) For $i = 1, \dots, n+1$, $\nu(b_i a_{i-1}) \le \nu(a_1)$ and $\nu(c_i a_{i-1}) \le \nu(a_0)$, and in the case of $R = K[X]$, we have equality.

The corresponding algorithm is the following:

Input: Euclidean domain $R$ with normalization, and $f, g \in R \setminus \{0\}$
Output: $d, x, y \in R$ such that $d = \gcd(f, g) = xf + yg$
1. If $\nu(f) \ge \nu(g)$, then:
   • $\rho_0 := \mathrm{LU}(f)$, $a_0 := \rho_0^{-1} f$, $b_0 := \rho_0^{-1}$, $c_0 := 0$;
   • $\rho_1 := \mathrm{LU}(g)$, $a_1 := \rho_1^{-1} g$, $b_1 := 0$, $c_1 := \rho_1^{-1}$;
   else:
   • $\rho_0 := \mathrm{LU}(g)$, $a_0 := \rho_0^{-1} g$, $b_0 := \rho_0^{-1}$, $c_0 := 0$;
   • $\rho_1 := \mathrm{LU}(f)$, $a_1 := \rho_1^{-1} f$, $b_1 := 0$, $c_1 := \rho_1^{-1}$;
2. Set $n := 0$;
3. Repeat while $a_{n+1} \neq 0$:
   (a) Set $n := n + 1$;
   (b) Write $a_{n-1} = q_{n+1} a_n + \hat a$ with $q_{n+1}, \hat a \in R$, $\nu(\hat a) < \nu(a_n)$;
   (c) Set $\rho_{n+1} := \mathrm{LU}(\hat a)$;
   (d) Set $a_{n+1} := \rho_{n+1}^{-1} \hat a$;
   (e) Set $b_{n+1} := \rho_{n+1}^{-1} (b_{n-1} - q_{n+1} b_n)$;
   (f) Set $c_{n+1} := \rho_{n+1}^{-1} (c_{n-1} - q_{n+1} c_n)$;
4. If $\nu(f) \ge \nu(g)$:
   • then return $a_n, b_n, c_n$;
   • else return $a_n, c_n, b_n$.
Algorithm 1.1: Normalized Euclidean Algorithm

Theorem 1.4.11 (Euclidean Algorithms for Polynomials). [vzGG03, Theorem 3.16] For polynomials over a field, the number of field operations used by Algorithm 1.1 can almost be bounded by the bounds in Theorem 1.4.8. In particular, assuming that we use the basic polynomial arithmetic described in Section 1.3 with dense representations, the algorithm needs at most $6 \deg f \cdot \deg g - 2 \max\{\deg f, \deg g\} - 4 \min\{\deg f, \deg g\} + 5$ additions, subtractions and multiplications in $K$ and at most $\min\{\deg f, \deg g\} + 2$ inversions in $K$.
In case we are not interested in $x$ and $y$, so that none of the $b_i$ and $c_i$ are computed, it suffices to do $2 \deg f \cdot \deg g + \deg f + \deg g + 1$ additions, subtractions and multiplications in $K$ as well as at most $\min\{\deg f, \deg g\} + 2$ inversions in $K$.

1.5 Arithmetic for Rational Numbers, Residue Class Rings and Finite Fields

1.5.1 Rational Numbers

Arithmetic for rational numbers is quite easy to describe. Given two rational numbers $a/b$ and $c/d$ with $a, c \in \mathbb{Z}$, $b, d \in \mathbb{N}_{>0}$, we can compute
$$\frac{a}{b} \pm \frac{c}{d} = \frac{ad \pm bc}{bd}, \qquad \left(\frac{a}{b}\right)^{-1} = \frac{\operatorname{sgn}(a) \cdot b}{|a|} \qquad\text{and}\qquad \frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}.$$
Comparing fractions can be done using the identities
$$\frac{a}{b} = \frac{c}{d} \Leftrightarrow ad = bc, \qquad \frac{a}{b} < \frac{c}{d} \Leftrightarrow ad < bc,$$
etc. The only question is how efficiently these operations can be computed.

First of all, note that $\frac{a}{b} = \frac{c}{d}$ does not imply $a = c$ and $b = d$. This is true only if $\gcd(a, b) = 1 = \gcd(c, d)$ (since we assumed $b, d > 0$). Therefore, one usually stores rational numbers in normal form by keeping numerator and denominator coprime, and the denominator positive. For the special rational number 0, we use the representation $0/1$. With this representation, testing for equality is simpler than testing inequality, since for the latter, two products have to be computed.

If $a/b$ is given by $(a, b) \in \mathbb{Z} \times \mathbb{N}_{>0}$ with $\gcd(a, b) = 1$, we can define$^{18}$
$$\operatorname{size}(a/b) := \max\{\log |a|, \log |b|\}.$$
With the convention $\log 0 = -\infty$ this is a well-defined nonnegative real number. If we represent $a/b$ on a computer using arbitrary precision integers (see Section 1.1), the size used to represent the numbers is $\operatorname{size}_c(a) + \operatorname{size}_c(b)$ (where $c \ge 2$ is the base used), and by Remark 1.1.2 (e), $\operatorname{size}_c(a) + \operatorname{size}_c(b) \in \Theta(\operatorname{size}(a/b))$.

Theorem 1.5.1 (Rational Arithmetic). Let $x, y$ be two rational numbers represented as described above, i.e. numerator and denominator are stored as reduced $c$-adic representations and numerator and denominator are coprime.

(a) We can test whether $x = 0$ or $x = 1$ in $O(1)$ basic $c$-operations.
(b) We can test whether $x = y$ in $O(\max\{\operatorname{size}(x), \operatorname{size}(y)\})$ basic $c$-operations.

(c) We can test whether $x < y$, $x \le y$, $x = y$, $x \ge y$ or $x > y$ in $O(\operatorname{size}(x) \operatorname{size}(y))$ basic $c$-operations.

(d) We can compute a representation of $x^{-1}$ (assuming $x \neq 0$) in $O(1)$ basic $c$-operations. (Note that $O(\operatorname{size}(x))$ duplications of $c$-adic digits are needed.)

(e) We can compute a representation of $x \pm y$ in $O(\max\{\operatorname{size}(x)^2, \operatorname{size}(y)^2\})$ basic $c$-operations.

(f) We can compute a representation of $x \cdot y$ in $O(\operatorname{size}(x) \operatorname{size}(y))$ basic $c$-operations.

$^{18}$ Note that this is the logarithm of the naive height used in Diophantine approximations.

Proof. For (a), note that $x = 0$ if and only if the numerator equals 0, which can be tested by at most one comparison of $c$-adic digits. Testing whether $x = 1$ amounts to testing whether both numerator and denominator equal 1, which can be done by two comparisons of $c$-adic digits.

For (b) we can simply compare numerators and denominators since our representation is unique.

For (c), we have to compare $ae$ to $bd$ if $x = a/b$ and $y = d/e$. The comparison requires $O(\min\{\operatorname{size}_c(ae), \operatorname{size}_c(bd)\}) = O(\min\{\log |a| + \log |e|, \log |b| + \log |d|\}) \subseteq O(\operatorname{size}(x) + \operatorname{size}(y))$ basic $c$-operations by Corollary 1.2.3 (a), and computation of the products can be done in $O(\operatorname{size}(x) \operatorname{size}(y))$ basic $c$-operations by Corollary 1.2.3 (d).

For (d), it suffices to flip signs twice in case the numerator is negative, and to swap numerator and denominator. If we store the result into a new variable, we have to duplicate the $c$-adic representations, which requires $O(\operatorname{size}(x))$ assignments of $c$-adic digits.

For (e) we need to compute products of numbers of sizes $\operatorname{size}(x)$ and $\operatorname{size}(y)$, and add respectively subtract them. Since multiplying costs $O(\operatorname{size}(x) \operatorname{size}(y))$ basic $c$-operations (Corollary 1.2.3 (d)) and the results are of size $O(\max\{\operatorname{size}(x), \operatorname{size}(y)\})$, the time required for the addition or subtraction is dominated by the time required for the multiplications (Corollary 1.2.3 (b) and (c)).
Finally, to reduce the representation, we have to compute the GCD of numerator and denominator. By Theorem 1.4.9, this can be done in $O(\max\{\operatorname{size}(x), \operatorname{size}(y)\}^2) = O(\max\{\operatorname{size}(x)^2, \operatorname{size}(y)^2\})$ basic $c$-operations.

For (f), we could do the same as in (e) (except adding resp. subtracting) and obtain the same running time. But we can do better. Assume that $x = a/b$ and $y = d/e$ with $\gcd(a, b) = 1 = \gcd(d, e)$. We first compute the GCDs $g = \gcd(a, e)$ and $h = \gcd(b, d)$. By Theorem 1.4.9, this can be done in $O(\operatorname{size}(x) \operatorname{size}(y))$ basic $c$-operations. Then
$$x \cdot y = \frac{(a/g) \cdot (d/h)}{(b/h) \cdot (e/g)} \qquad\text{and}\qquad \gcd\bigl((a/g) \cdot (d/h), (b/h) \cdot (e/g)\bigr) = 1.$$
Therefore, we do not need to reduce the result. The divisions $a/g$, $d/h$, $b/h$ and $e/g$ can be computed by Corollary 1.2.3 (e) in $O(\operatorname{size}(x) \operatorname{size}(y))$ basic $c$-operations (note that the divisor is of size $O(\min\{\operatorname{size}(x), \operatorname{size}(y)\})$), and the final multiplications in $O(\operatorname{size}(x) \operatorname{size}(y))$ basic $c$-operations.

Note that one can speed up the addition $a/b + d/e$ by first computing $g = \gcd(b, e)$, $b' = b/g$ and $e' = e/g$. Then
$$\frac{a}{b} + \frac{d}{e} = \frac{a e' + d b'}{b e'}.$$
(Unfortunately, as opposed to multiplication, it could still be that $\gcd(a e' + d b', b e') > 1$, whence we have to compute another GCD to compute the unique representation.) In case $b$ and $e$ are not coprime, this can drastically decrease the size of the numbers which have to be multiplied. The asymptotic running time of this approach is the same as for the "naive" addition $a/b + d/e = (ae + db)/(be)$.

1.5.2 Residue Class Rings

If $R$ is a ring and $I$ an ideal of $R$, then $R/I$ is called a residue class ring. Basic arithmetic in $R/I$ is defined via representatives:
$$(a + I) \pm (b + I) = (a \pm b) + I, \qquad -(a + I) = (-a) + I \qquad\text{and}\qquad (a + I) \cdot (b + I) = (a \cdot b) + I.$$
Inversion and comparisons (equality or inequality) on the other hand are much more complicated in general. For example, $a + I = b + I$ if and only if $a - b \in I$. But testing membership of an ideal can be complicated; just consider ideals in$^{19}$ $K[X, Y]$.
Also, since $R$ could be infinite even if $R/I$ is finite, one has to take care of the size of the representatives. For example, $12398123 + 2\mathbb{Z} = 1 + 2\mathbb{Z}$, but the latter is a much better representative of a residue class modulo $2\mathbb{Z}$ than the first.

$^{19}$ Note that in $K[X, Y]$ one can also compute unique representatives modulo ideals. For this, one needs to fix a so-called Gröbner basis of the ideal. Gröbner bases are used in many areas of algorithmic Commutative Algebra and Algebraic Geometry; more information can be found in [vzGG03, Chapter 21]. We will not cover Gröbner bases in this course.

In this section, we want to consider residue class rings of (certain) Euclidean domains. If $R$ is Euclidean with valuation $\nu$, then any ideal $I$ of $R$ is principal, i.e. there exists some $f \in R$ with $fR = \langle f \rangle = I$, and $f$ can be determined by $\nu(f) = \min\{\nu(g) \mid g \in I, g \neq 0\}$. In fact, any such $f \in I$ where $\nu(f)$ attains the minimum will do. Moreover, $R/I$ can be represented by $\{g + I \mid g \in R, \nu(g) < \nu(f)\}$. In case Euclidean division is unique (as in the case of univariate polynomials over a field), this yields a unique representation of $R/I$. In the case of $R = \mathbb{Z}$, long division can be made unique by specifying $a = qb + r$ with $0 \le r < |b|$. Then $\mathbb{Z}/f\mathbb{Z} = \{a + f\mathbb{Z} \mid a \in \mathbb{Z}, 0 \le a < |f|\}$ allows unique representation of $R/I = \mathbb{Z}/f\mathbb{Z}$.

Note that in any principal ideal domain, $g + I$ is a unit in $R/I$ with $I = fR$ if and only if $\gcd(f, g) = 1$, and any Bézout equation $xf + yg = 1$ with $x, y \in R$ yields $(g + I)^{-1} = y + I$. In Euclidean domains, computing the GCD as well as Bézout equations can be done with the Euclidean Algorithm (compare Section 1.4). Therefore, in Euclidean domains where arithmetic is effective, we can effectively do arithmetic in $R/I$.

We will now use the results from the previous sections to analyze the costs of arithmetic in case $R = K[X]$ and $R = \mathbb{Z}$.
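To make the remark on Bézout equations concrete for $R = \mathbb{Z}$, here is a minimal sketch that inverts $g + f\mathbb{Z}$ using a plain (non-normalized) extended Euclidean algorithm on Python integers. The function names `extended_gcd` and `mod_inverse` are illustrative choices and do not appear in the text; the sketch assumes $f \neq 0$.

```python
def extended_gcd(f, g):
    """Return (d, x, y) with d = gcd(f, g) >= 0 and d == x*f + y*g."""
    a0, a1 = f, g      # remainders
    b0, b1 = 1, 0      # coefficients of f
    c0, c1 = 0, 1      # coefficients of g
    # Invariant: a0 == b0*f + c0*g and a1 == b1*f + c1*g.
    while a1 != 0:
        q = a0 // a1
        a0, a1 = a1, a0 - q * a1
        b0, b1 = b1, b0 - q * b1
        c0, c1 = c1, c0 - q * c1
    if a0 < 0:         # normalize the gcd to be nonnegative
        a0, b0, c0 = -a0, -b0, -c0
    return a0, b0, c0


def mod_inverse(g, f):
    """Return y with 0 <= y < |f| and g*y = 1 (mod f), or None if g is no unit."""
    d, _x, y = extended_gcd(f, g)
    if d != 1:
        return None    # gcd(f, g) != 1, so g + fZ is not invertible
    return y % abs(f)  # reduce to the unique representative
```

For instance, `mod_inverse(3, 7)` yields the representative of $(3 + 7\mathbb{Z})^{-1}$, while `mod_inverse(4, 6)` reports that $4 + 6\mathbb{Z}$ is not a unit.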
Corollary 1.2.3 and Theorem 1.4.9 yield the following result on arithmetic in $\mathbb{Z}/n\mathbb{Z}$:

Corollary 1.5.2 (Residue Class Ring Arithmetic over the Integers). Fix some natural number $b \ge 2$. Let $R = \mathbb{Z}$ and let $n \in \mathbb{Z} \setminus \{0\}$ be some integer. We represent elements in $\mathbb{Z}/n\mathbb{Z}$ by reduced $b$-adic representations of natural numbers $a \in \mathbb{N}$ with $a < |n|$; this representation is unique.

(a) Each element of $\mathbb{Z}/n\mathbb{Z}$ can be represented using $O(\log |n|)$ $b$-adic digits.

(b) Testing equality and inequality can be done in $O(\log |n|)$ basic $b$-operations.

(c) Computing sum and difference of two elements in $\mathbb{Z}/n\mathbb{Z}$ can be done in $O(\log |n|)$ basic $b$-operations.

(d) Computing the product of two elements in $\mathbb{Z}/n\mathbb{Z}$ can be done in $O((\log |n|)^2)$ basic $b$-operations.

(e) Determining if an element of $\mathbb{Z}/n\mathbb{Z}$ is invertible and computing its inverse (if it exists) can be done in $O((\log |n|)^2)$ basic $b$-operations.

Theorem 1.3.2 and Theorem 1.4.8 yield the following result on arithmetic in $K[X]/\langle f \rangle$:

Corollary 1.5.3 (Residue Class Ring Arithmetic over Polynomial Rings over Fields). Let $K$ be a field. Let $R = K[X]$ and let $f \in R \setminus \{0\}$ be some polynomial. We represent elements in $K[X]/\langle f \rangle$ by univariate polynomials $g$ over $K$ with $\deg g < \deg f$ in dense representation (compare Section 1.3); this representation is unique.

(a) Each element of $K[X]/\langle f \rangle$ can be represented using $\deg f$ elements of $K$.

(b) Testing equality and inequality can be done in $\deg f$ comparisons of elements in $K$.

(c) Computing sum and difference of two elements in $K[X]/\langle f \rangle$ can be done in $\deg f$ additions respectively subtractions in $K$.

(d) Computing the product of two elements in $K[X]/\langle f \rangle$ can be done in at most $2(\deg f)^2 + O(\deg f)$ additions and multiplications in $K$.

(e) Determining if an element of $K[X]/\langle f \rangle$ is invertible and computing its inverse (if it exists) can be done in at most $6(\deg f)^2 + O(\deg f)$ additions and multiplications and at most $\deg f − 2$ inversions in $K$.
1.5.3 Finite Fields

It is well-known that every finite field $K$ is of the form $(\mathbb{Z}/p\mathbb{Z})[X]/\langle f \rangle$, where $p$ is a prime and $f \in (\mathbb{Z}/p\mathbb{Z})[X]$ is irreducible. The resulting field has $p^{\deg f}$ elements. (Note that in case $\deg f = 1$, we can simply use $\mathbb{Z}/p\mathbb{Z}$ to represent the field.) Any two finite fields of cardinality $p^n$ are isomorphic.

Corollary 1.5.4 (Arithmetic in Finite Fields). Let $p$ be a prime and $f \in (\mathbb{Z}/p\mathbb{Z})[X]$ be an irreducible polynomial. Let $n = \deg f$ and $q = p^n$. Assume that we represent elements in $K := (\mathbb{Z}/p\mathbb{Z})[X]/\langle f \rangle$ by polynomials of degree $< n$ whose coefficients are natural numbers $a \in \{0, \dots, p - 1\}$, which are represented by reduced $b$-adic representations, where $b \ge 2$ is fixed. This representation is unique and combines the representations of Corollaries 1.5.2 and 1.5.3.

(a) Each element of $K$ can be represented using $O(\log q)$ $b$-adic digits.

(b) Testing equality and inequality can be done in $O(\log q)$ basic $b$-operations.

(c) Computing sum and difference of two elements in $K$ can be done in $O(\log q)$ basic $b$-operations.

(d) Computing the product of two elements in $K$ can be done in $O((\log q)^2)$ basic $b$-operations.

(e) Inverting a non-zero element of $K$ can be done in $O((\log q)^2)$ basic $b$-operations.

Proof. (a) Every element can be represented by $n$ elements in $\mathbb{Z}/p\mathbb{Z}$ (Corollary 1.5.3 (a)), which each need $O(\log p)$ $b$-adic digits (Corollary 1.5.2 (a)). Hence, the total number of $b$-adic digits is $O(n \log p)$. Now $n \log p = \log p^n = \log q$, whence the claim follows.

(b) Follows from Corollary 1.5.3 (b) and Corollary 1.5.2 (b).

(c) Follows from Corollary 1.5.3 (c) and Corollary 1.5.2 (c).

(d) Follows from Corollary 1.5.3 (d) and Corollary 1.5.2 (d), since $n^2 (\log p)^2 = (\log q)^2$.

(e) All arithmetic operations in $\mathbb{Z}/p\mathbb{Z}$ needed to execute the Euclidean algorithm in $(\mathbb{Z}/p\mathbb{Z})[X]$ can be done in $O((\log p)^2)$ basic $b$-operations. Since $O((\deg f)^2)$ such operations are needed, the total number of basic $b$-operations is again $O((\log p^n)^2)$.
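The representation used in Corollary 1.5.4 can be sketched as follows. This is only an illustration under simplifying assumptions: elements are dense coefficient lists of length $n$ (lowest degree first) with entries in $\{0, \dots, p-1\}$, the modulus $f$ is assumed to be monic so that no inversions are needed, and the names `poly_rem`, `field_add` and `field_mul` are hypothetical helpers, not notation from the text.

```python
def poly_rem(g, f, p):
    """Remainder of g modulo the monic polynomial f over Z/pZ.

    Polynomials are dense coefficient lists, lowest degree first.
    The result always has exactly deg f coefficients.
    """
    g = [c % p for c in g]
    n = len(f) - 1                      # degree of f
    for i in range(len(g) - 1, n - 1, -1):
        lead = g[i]
        if lead:
            # Subtract lead * X^(i-n) * f to cancel the coefficient of X^i.
            for j in range(n + 1):
                g[i - n + j] = (g[i - n + j] - lead * f[j]) % p
    r = g[:n]
    return r + [0] * (n - len(r))       # pad to length n


def field_add(g, h, p):
    """Sum of two field elements: n additions in Z/pZ."""
    return [(x + y) % p for x, y in zip(g, h)]


def field_mul(g, h, f, p):
    """Product of two field elements: schoolbook product, then reduction mod f."""
    prod = [0] * (len(g) + len(h) - 1)
    for i, x in enumerate(g):
        for j, y in enumerate(h):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_rem(prod, f, p)
```

As a usage example, take $p = 3$ and $f = X^2 + 1$, i.e. `f = [1, 0, 1]`, which is irreducible over $\mathbb{Z}/3\mathbb{Z}$ because $-1$ is not a square modulo 3. Then `field_mul([0, 1], [0, 1], f, 3)` computes $X \cdot X = X^2 = -1$ in the field with 9 elements.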
1.6 Exponentiation and Fast Scalar Multiplication

Assume that we want to compute $1234567^{9876543} \bmod 56473829$. The result is a number in the interval $\{0, \dots, 56473828\}$. We could compute it by multiplying 1234567 by itself 9876542 times, and then reducing the result modulo 56473829. This is a very bad idea, since the $b$-adic size of $1234567^{9876543}$ is $\approx \frac{9876543 \log 1234567}{\log b} \approx \frac{138530672.2}{\log b}$, while the $b$-adic size of the result is $\approx \frac{17.85}{\log b}$. A more efficient approach is to reduce modulo 56473829 after every operation. This still leaves us with 9876542 multiplications and remainders of numbers of size $\approx \frac{17.85}{\log b}$.

A far more efficient approach is binary exponentiation. We first describe it slightly more generally, for some base $c \ge 2$; specializing to $c = 2$ yields binary exponentiation. Assume that we are given a $c$-adic representation $(e_n, \dots, e_0)$ of the exponent $e$. Then we can compute $a^e$ by the following algorithm:

Listing 1.4: c-ary Exponentiation

    def power(a, e, c):
        """Compute a^e, where e is the list of c-adic digits of the exponent.

        e[k] is the digit belonging to c^k, so e[-1] is the leading digit.
        """
        result = 1
        # Process the digits from the most significant one downwards.
        for k in range(len(e) - 1, -1, -1):
            result = result ** c
            if e[k] > 0:
                result = result * (a ** e[k])
        return result

Note that in Python, a**b computes $a^b$, while a^b computes the bitwise "exclusive or" of a and b. This is different in Sage [S+13]: there, both a**b and a^b denote $a^b$.

This algorithm is most efficient in case $c = 2$: in that case, exponentiating result by c corresponds to multiplying result by itself, and e[k] $\in \{0, 1\}$ shows that we either do nothing in the second step, or multiply result by a. Therefore, in every loop iteration, we do at least one and at most two multiplications. Now noting that $n = \operatorname{size}_2(e) \approx \log_2 e = \frac{\log e}{\log 2}$, we obtain:

Proposition 1.6.1 (Binary Exponentiation). If $a \in R$ and $e \in \mathbb{N}$, then computing $a^e$ can be done in at most $2\lfloor \log_2 e \rfloor \in O(\log e)$ multiplications in $R$ given a 2-adic reduced representation of $e$.
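The case $c = 2$ in a residue class ring $\mathbb{Z}/m\mathbb{Z}$ can be sketched as follows, reducing modulo $m$ after every multiplication. The function name `power_mod` is an illustrative choice; the sketch assumes $e \ge 0$ and $m \ge 1$ and reads the binary digits of $e$ from Python's `bin`. It performs trivial multiplications by 1 for the leading digit, which the count in Proposition 1.6.1 does not include.

```python
def power_mod(a, e, m):
    """Compute a^e mod m by binary exponentiation (assumes e >= 0, m >= 1)."""
    result = 1 % m
    # bin(e) looks like '0b1011'; skip the prefix and read the binary
    # digits of e from the most significant one downwards.
    for bit in bin(e)[2:]:
        result = (result * result) % m      # squaring step
        if bit == "1":
            result = (result * a) % m       # multiply step for a digit 1
    return result
```

The result of `power_mod(1234567, 9876543, 56473829)` can be cross-checked against Python's built-in three-argument `pow(1234567, 9876543, 56473829)`, which computes the same residue.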
If we use $R = \mathbb{Z}/56473829\mathbb{Z}$, then binary exponentiation allows us to compute $1234567^{9876543} \bmod 56473829$ by doing at most $2\lfloor \log_2 9876543 \rfloor = 46$ multiplications in $R$. This is far less than doing 9876542 multiplications in $R$ by the naive method.$^{20}$

One application is to do encryption and decryption in the RSA cryptosystem [RSA78]. For this, we have a modulus $n \approx 2^k$, where $k \approx 1024$, 2048 or 4096. To decrypt or encrypt, one needs to compute $a^c \bmod n$, where $a$ and $c$ satisfy $0 \le a, c < n$. Combining Corollary 1.5.2 with Proposition 1.6.1 yields that encryption and decryption can be done in $O((\log n)^2 \cdot \log n) = O(\log^3 n) = O(k^3)$ basic $b$-operations. We will see later in Chapter 3 that this can be improved to almost $O(k^2)$ basic $b$-operations.

Note that exponentiation of an element $a \in R$ is the action of $\mathbb{N}$ on the multiplicative semigroup of $R$: $(e, a) \mapsto a^e$. In case $G$ is an additively written (semi-)group with neutral element $0_G$, one also often considers the additive action $\mathbb{N} \times G \to G$, $(e, g) \mapsto e \cdot g$, where $0 \cdot g := 0_G$ and $(n + 1) \cdot g := (n \cdot g) + g$. This group action can be evaluated similarly to the exponentiation above, and we obtain the same result:

Proposition 1.6.2 (Fast Scalar Multiplication). Let $(G, +)$ be a semigroup, $a \in G$ and $e \in \mathbb{N}_{>0}$. The computation of $e \cdot a$ can be done in $2\lfloor \log_2 e \rfloor \in O(\log e)$ semigroup operations in $G$, if a 2-adic reduced representation of $e$ is given.

Note that we can also use this to do multiplication in $\mathbb{N}$, by considering $G = \mathbb{N}$. This kind of multiplication is also known as Russian peasant multiplication. For $b = 2$, it is essentially the algorithm in Theorem 1.1.4 (d).

1.7 Finding Irreducible Polynomials over Finite Fields

To work with a finite field $\mathbb{F}_{p^n}$, where $p$ is a prime and $n$ a positive natural number, we need to have an irreducible polynomial $f \in \mathbb{F}_p[X] = (\mathbb{Z}/p\mathbb{Z})[X]$ of degree $n$. A basic result from analytic number theory states that such polynomials are easy to find:

Theorem 1.7.1 (Prime Number Theorem for Polynomials).
Fix a finite field $K$ of $q$ elements and a degree $n$. Let $P$ be the set of all polynomials in $K[X]$ of degree $n$, and let $I$ be the set of all irreducible polynomials in $K[X]$ of degree $n$. Then
$$\left| \frac{|I|}{|P|} - \frac{1}{n} \right| \le \frac{1}{2 q^{n/2}}.$$

Proof. See, for example, [Ros02, Theorem 2.2] or [Fon11, Korollar 4.4.17].

Note that $q \ge 2$ implies $\frac{1}{2 q^{n/2}} \le 2^{-1-n/2}$. This shows that the probability that a random polynomial of degree $n$ over $K$ is irreducible is at least $\frac{0.46}{n}$ and at most $\frac{1.54}{n}$.

A simple algorithm to find an irreducible polynomial over $\mathbb{Z}/p\mathbb{Z}$ of degree $n$ is to pick a random (monic) polynomial $f \in (\mathbb{Z}/p\mathbb{Z})[X]$ of degree $n$ and test whether it is irreducible. The above result states that we will be successful with probability about $\frac{1}{n}$, i.e. on average we need to try about $n$ polynomials.$^{21}$ This leaves open the question of how to check a polynomial over a finite field for irreducibility. This is answered by the following result:

$^{20}$ The result is 31787348. Using modular exponentiation, MAPLE computes this in a fraction of a second.

$^{21}$ Such a picking process is modeled by the geometric distribution. The expected value of the geometric distribution with probability $\lambda$ of success equals $\frac{1}{\lambda}$.

Lemma 1.7.2. Let $K$ be a finite field of $q$ elements and $f \in K[X]$ of degree $n$. Then $f$ is irreducible if and only if $n = 1$, or $n > 1$ and for all $k \in \mathbb{N}$ with $1 \le k \le \frac{1}{2} n$, $f$ and $X^{q^k} - X$ are coprime.

Proof. (See, for example, [Fon11, Beispiel 4.4.18].) In case $f$ is not irreducible, $f$ must have an irreducible factor $p \in K[X]$ with $k := \deg p \le \frac{1}{2} \deg f$. Now $K[X]/\langle p \rangle$ is a finite field of $q^k$ elements, whence every element of it is a zero of $X^{q^k} - X$. Therefore, $p$ must divide $X^{q^k} - X$, and $\gcd(f, X^{q^k} - X)$ is non-trivial.

Conversely, if $\gcd(f, X^{q^k} - X)$ is non-trivial for some $k \le \frac{1}{2} \deg f$, then there exists an irreducible polynomial $p \in K[X]$ such that $p \mid f$ and $p \mid (X^{q^k} - X)$.
But this shows that $p$ has a zero in $\mathbb{F}_{q^k}$, which implies that $\deg p \le k$, whence $f$ cannot be irreducible.

As mentioned in Section 1.3, the dense representation of polynomials is not very useful in all cases. One example mentioned there was polynomials of the form $X^n - X$ for large $n$. Now, assume that $q = 19$ and $n = 6$; these are rather small choices. In this case, we have to try $k = 1, 2, 3$, and for $k = 3$, we obtain the polynomial $X^{6859} - X$, which is of degree 6859 and whose reduced dense representation consists of two nonzero coefficients and 6858 zeros. One could certainly store this polynomial and do division with remainder by $f$ to obtain the remainder (we do not need the quotient here), but this is a waste of memory. What is worse is that if we use the classical algorithm of Theorem 1.3.2 (d) to compute the remainder, we spend $O(q^n)$ operations for this computation. This is far from efficient.

Note that to compute $X^{q^k} - X$ modulo $f$, it suffices to first compute $X^{q^k}$ modulo $f$, and then subtract $X$ from the result. Now computing $X^{q^k}$ modulo $f$ amounts to computing a reduced representation of $X + fK[X]$ raised to the power $q^k$ in the ring $R_f := K[X]/fK[X]$. By Proposition 1.6.1, this can be done in $O(\log q^k) = O(k \log q)$ operations in $R_f$.

Proposition 1.7.3. Let $K$ be a finite field of $q$ elements and assume that $f \in K[X]$ is a polynomial of degree $n$. We can test whether $f$ is irreducible in $2 \log_2 q \cdot (\deg f)^4 + O(\log q \cdot (\deg f)^3)$ operations in $K$.

Proof. By Corollary 1.5.3, one multiplication in $R_f := K[X]/\langle f \rangle$ can be done in $2(\deg f)^2 + O(\deg f)$ operations in $K$. Since, by Proposition 1.6.1, we need at most $2 \log_2 e$ operations in $R_f$ to compute $X^e$ modulo $f$, we need a total of at most $4 \log_2 e \cdot (\deg f)^2 + O(\log e \cdot \deg f)$ operations in $K$ to compute $(X^e - X) \bmod f$. Then, by Theorem 1.4.8, we need an additional $(\deg f)^2 + O(\deg f)$ operations in $K$ to finish the computation of $\gcd(X^e - X, f)$.
Therefore, we can check whether $\gcd(X^e - X, f) = 1$ in $(4 \log_2 e + 1)(\deg f)^2 + \log e \cdot O(\deg f)$ operations in $K$. Now by Lemma 1.7.2, it suffices to check this for $e \in \{q, q^2, \dots, q^{\lfloor (\deg f)/2 \rfloor}\}$. Therefore, with $\ell = \lfloor \frac{1}{2} \deg f \rfloor \approx \frac{1}{2} \deg f$, the total number of operations is at most
$$\begin{aligned} &\sum_{k=1}^{\ell} \Bigl[ (4 \log_2 q^k + 1)(\deg f)^2 + \log q^k \cdot O(\deg f) \Bigr] \\ ={} &\sum_{k=1}^{\ell} k \cdot \Bigl[ 4 \log_2 q \cdot (\deg f)^2 + \log q \cdot O(\deg f) \Bigr] + \ell (\deg f)^2 \\ ={} &\tfrac{1}{2} \ell (\ell + 1) \cdot \Bigl[ 4 \log_2 q \cdot (\deg f)^2 + \log q \cdot O(\deg f) \Bigr] + \ell (\deg f)^2 \\ \le{} &2 \log_2 q \cdot (\deg f)^4 + O(\log q \cdot (\deg f)^3) \end{aligned}$$
operations in $K$.

If we combine this with Theorem 1.7.1, we obtain:

Corollary 1.7.4 (Ben-Or's Algorithm [BO81]). Let $p$ be a prime and $n$ a positive natural number. We can construct $\mathbb{F}_{p^n}$ by finding an irreducible polynomial $f \in (\mathbb{Z}/p\mathbb{Z})[X]$ of degree $n$ with a Las Vegas$^{22}$ algorithm, which on average requires $O(n^5 (\log p)^3)$ basic $b$-ary operations.

For large $p$ and for large $n$ this can be sped up, as we will see in Chapter 3.

$^{22}$ Two important classes of non-deterministic algorithms are Las Vegas algorithms and Monte Carlo algorithms. Las Vegas algorithms yield a result only with a certain success probability, but if they return a result, it is correct. On the other hand, Monte Carlo algorithms always return a result, which is correct only with a certain probability, or which is distributed by a certain probability distribution (depending on the algorithm) centered around the correct result. In the case of Ben-Or's algorithm, the success probability is around $1/n$, whence one expects that one has to execute the algorithm around $n$ times until it succeeds. But if it succeeds, Proposition 1.7.3 guarantees that the result is correct.
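The test of Lemma 1.7.2 and the random search of Corollary 1.7.4 can be sketched for $K = \mathbb{Z}/p\mathbb{Z}$ as follows. This is a minimal sketch under stated assumptions, not the author's implementation: polynomials are coefficient lists with the lowest degree first and the zero polynomial is `[]`, all function names are hypothetical, inverses in $\mathbb{Z}/p\mathbb{Z}$ are taken via Fermat's little theorem, and $X^{p^k} \bmod f$ is obtained by raising the previous value $X^{p^{k-1}} \bmod f$ to the $p$-th power, a slight variation of the proof above that avoids recomputing each power from scratch.

```python
import random


def trim(a):
    """Drop trailing zero coefficients in place and return the list."""
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_rem(a, f, p):
    """Remainder of a modulo the nonzero polynomial f over Z/pZ (p prime)."""
    a = trim([c % p for c in a])
    f = trim([c % p for c in f])
    inv_lead = pow(f[-1], p - 2, p)     # LC(f)^(-1) by Fermat's little theorem
    while len(a) >= len(f):
        factor = a[-1] * inv_lead % p
        shift = len(a) - len(f)
        for j, c in enumerate(f):       # cancel the leading coefficient of a
            a[shift + j] = (a[shift + j] - factor * c) % p
        trim(a)
    return a


def poly_mulmod(a, b, f, p):
    """Product of a and b, reduced modulo f over Z/pZ."""
    if not a or not b:
        return []
    prod = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_rem(prod, f, p)


def poly_powmod(a, e, f, p):
    """a^e modulo f by binary exponentiation (compare Proposition 1.6.1)."""
    result = poly_rem([1], f, p)
    for bit in bin(e)[2:]:
        result = poly_mulmod(result, result, f, p)
        if bit == "1":
            result = poly_mulmod(result, a, f, p)
    return result


def poly_gcd(a, b, p):
    """A (not normalized) greatest common divisor of a and b over Z/pZ."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_rem(a, b, p)
    return a


def is_irreducible(f, p):
    """Test of Lemma 1.7.2 for f over Z/pZ, assuming deg f >= 1."""
    f = trim([c % p for c in f])
    n = len(f) - 1
    power = poly_rem([0, 1], f, p)      # X^(p^0) mod f
    for _ in range(n // 2):
        power = poly_powmod(power, p, f, p)     # now X^(p^k) mod f
        diff = list(power) + [0] * max(0, 2 - len(power))
        diff[1] = (diff[1] - 1) % p             # (X^(p^k) - X) mod f
        if len(poly_gcd(f, diff, p)) != 1:      # gcd is not a nonzero constant
            return False
    return True


def random_irreducible(n, p, rng=random):
    """Las Vegas search: draw random monic degree-n polynomials until one passes."""
    while True:
        f = [rng.randrange(p) for _ in range(n)] + [1]
        if is_irreducible(f, p):
            return f
```

For example, `is_irreducible([1, 1, 1], 2)` tests $X^2 + X + 1$ over $\mathbb{Z}/2\mathbb{Z}$, and `random_irreducible(4, 3)` returns the coefficient list of some monic irreducible polynomial of degree 4 over $\mathbb{Z}/3\mathbb{Z}$; by Theorem 1.7.1 one expects roughly $n$ draws.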