Integer Factorization Methods

Christopher Koch
CSE489-01 Algorithms in CS & IT
New Mexico Institute of Mining and Technology
May 9, 2014

Contents

1 Introduction
  1.1 Notation
2 Integer Factorization
  2.1 Complexity of Integer Factorization
  2.2 Trial Division
    2.2.1 Analysis
  2.3 Pollard's p − 1 Method
    2.3.1 Worked Example
    2.3.2 Analysis
  2.4 Pollard's ρ Method
    2.4.1 Cycles in Z/nZ
    2.4.2 Floyd's Cycle-finding Algorithm
    2.4.3 Pollard's ρ Method
    2.4.4 Analysis
3 Strong Primes
4 References

List of Theorems and Definitions

1.1 Theorem (Fundamental theorem of arithmetic)
1.1 Definition (Landau notation)
2.1 Theorem (Prime number theorem)
2.1 Definition (p-adic order)
2.2 Theorem (Fermat's little theorem)
2.2 Definition (Periodic sequences)
2.3 Proposition

List of Algorithms

2.1 Trial Division
2.2 Pollard's p − 1 method
2.3 Modular exponentiation and GCD of Pollard's p − 1
2.4 Floyd's cycle-finding algorithm
2.5 Pollard's ρ method (Monte Carlo factorization)

List of Figures

2.1 Number of primes less than n versus n: the prime counting function π(n). For example, there are 550 primes less than or equal to 4000.
2.2 Rho-shaped ultimately periodic sequence

1 Introduction

Every positive integer (whole number) greater than one can be written as a product of prime numbers, for example $10 = 2 \times 5$. The fundamental theorem of arithmetic states that every integer greater than 1 has such a "unique prime factorization"; however, it does not specify how to find that factorization. Integer factorization methods are procedures that find a given integer's prime factors.

Theorem 1.1 (Fundamental theorem of arithmetic). Let $n$ be an integer greater than 1. Then there exist not necessarily distinct prime numbers $p_1, p_2, \ldots, p_i$ such that $n = p_1 \times p_2 \times \cdots \times p_i$, and this factorization is unique up to the order of the factors.

The fundamental theorem of arithmetic was first proven by the Greek mathematician Euclid around 300 BC in his treatise Elements [1]. While every positive integer greater than 1 has a prime factorization, finding it is a hard problem.
The naïve approach would be to take the given integer $n$ and to try to divide it by every prime between 1 and $n$. If $n$ is divisible by a prime, then that prime is one of the prime factors of $n$. One must consider that $n$ may be divisible by a prime more than once; for example, $20 = 2 \times 2 \times 5$ has the prime 2 twice in its prime factorization. One must also consider that finding the prime numbers between 1 and $n$ is itself a difficult problem. While simple, this approach takes time exponential in the number of digits of $n$.

    The problem of distinguishing prime numbers from composite numbers and of resolving the latter into their prime factors is known to be one of the most important and useful in arithmetic. [...] Further, the dignity of the science itself seems to require that every possible means be explored for the solution of a problem so elegant and celebrated.
    (Carl Friedrich Gauß [2, Article 329])

Traditionally, prime numbers and all that they involve, such as integer factorization, had few applications outside of pure mathematics. Even the community of number theorists viewed colleagues who occupied themselves with integer factorization as incurably obsessive [3, p. 675]. Some results about prime numbers were only discovered because of some mathematician's obsession with primes; for example, Gauß conjectured the famous prime number theorem when he was 15, but did not discuss it until almost 60 years later because he thought it was not a significant result [4, p. 54].

All this changed entirely in the 1970s, when the concepts of public-key cryptography took practical form in RSA [5], an encryption scheme named after its inventors Ronald Rivest, Adi Shamir, and Leonard Adleman. RSA specifically relies on the difficulty of factoring large integers into primes.
In addition, complexity theory became popular [6]: it enabled researchers to express rigorously that using a certain method to factor a number $n$ will take at most time $f(n)$ for some function $f$. While some of the notation of complexity theory (such as Landau notation) had been around since the beginning of the 20th century, it was now being applied to computational complexity. This meant that algorithms, such as integer factorization algorithms, could be compared to each other in terms of speed.

The security of RSA encryption lies in the difficulty of factoring large integers; specifically, no known factorization algorithm for classical computers runs in polynomial time. While the mathematics behind RSA is sound, the ability to factor large integers quickly (in polynomial time) would make RSA insecure. This would be fatal: RSA is the "most widely deployed public-key cryptosystem" [7]; it is used, for example, for secure web browsing, digital signatures, and secure e-mail communications.

A lot of research has been done on integer factorization methods. In the last 40 years, a number of methods were developed and optimized; for example, Pollard's p − 1 algorithm in 1974 [10], Pollard's ρ algorithm in 1975 [8] [9, ch. 31], and Lenstra's elliptic curve factorization algorithm in 1987 [11]. To this day, the three best practical methods of integer factorization are the general number field sieve, the quadratic sieve, and the elliptic curve factorization algorithm. There is also Shor's algorithm for quantum computers; however, to date no quantum computer has been built that can run Shor's algorithm on integers of cryptographically relevant size.

1.1 Notation

The greatest common divisor (GCD) of two integers $a$ and $b$ will be denoted $\gcd(a, b)$ and their least common multiple (LCM) $\operatorname{lcm}(a, b)$.

We remind the reader of the (modern) limit definition of the notation due to Bachmann and Landau (Big O) [12, p. 401].
Of course, Bachmann did not define it this specifically when he first used it; the limit definition came later.

Definition 1.1 (Landau notation). We say that a function $f(n)$ is $O(g(n))$ as $n$ goes to $\infty$ (infinity) if

$$\lim_{n \to \infty} \left| \frac{f(n)}{g(n)} \right| < \infty.$$

We may write $f(n) \in O(g(n))$ or $f(n) = O(g(n))$.

We use $\lg n$ to denote the logarithm of $n$ with base 2. The time to multiply two $n$-bit numbers will be denoted by $M(n)$. Schoolbook multiplication has $M(n) = O(n^2)$, while multiplication due to Schönhage and Strassen has $M(n) = O(n \lg n \lg\lg n)$. The time to find the GCD of two $n$-bit numbers will be denoted $G(n)$. The Euclidean algorithm has $G(n) = O(n^2)$, while the GCD algorithm due to Schönhage takes $G(n) = O(M(n) \lg n)$.

2 Integer Factorization

Given an integer $n$, many integer factorization methods will only find one divisor of $n$, and it may not be a prime. It is then necessary to test whether the resulting divisor is prime. To find more than one factor of $n$, the respective algorithm has to be used repeatedly. In this paper, only the trial division algorithm gives all prime divisors of $n$.

Generally, integer factorization algorithms can be divided into two categories: general-purpose algorithms, whose runtime depends only on the size of the integer to be factored, and special-purpose algorithms, whose runtime may depend not only on the size of the integer to be factored but also on some possibly unknown properties of that integer. The algorithms discussed here come from both categories: trial division is a general-purpose algorithm, while the other algorithms discussed are special-purpose algorithms. We will discuss the properties that an integer must possess for each special-purpose algorithm to work well.

2.1 Complexity of Integer Factorization

Integer factorization is traditionally seen as a hard problem by computer scientists and mathematicians [3, p. 676].
In this context, "hard" means that the problem is computationally intensive: there is no known integer factorization algorithm for Turing machines that runs in polynomial time. Of course, a proposed factorization can be verified in polynomial time, which means the problem is in the NP complexity class. While all known integer factorization algorithms for Turing machines take exponential or subexponential time, we also know that integer factorization is in the BQP complexity class (bounded-error quantum polynomial time). Peter Shor's factoring algorithm for quantum computers showed this in 1994. It is suspected that integer factorization is outside of P and is neither NP-complete nor coNP-complete. If that turns out to be true, factorization would be in the NP-intermediate class; however, if we knew that to be true, we would also know that P ≠ NP.

2.2 Trial Division

Trial division is the naïve method of integer factorization. Given an integer $n > 1$, we know that any proper factor of $n$ lies between 2 and $n/2$. The easiest thing to do would be to test every prime number between 2 and $n/2$ for divisibility of $n$. However, we can make one more observation to improve trial division: at most one prime factor of $n$ can be greater than $\sqrt{n}$. If there were two prime factors $p, q > \sqrt{n}$, we would have $n \geq pq > \sqrt{n}\,\sqrt{n} = n$, a contradiction.

We therefore test every prime number between 2 and $\sqrt{n}$ inclusive to see whether it divides $n$, and divide $n$ by that prime as long as it does. In the end, since there may be one prime factor greater than $\sqrt{n}$, we test whether the remaining value of $n$ is greater than 1 and use it as a factor if it is.

Thus, we know that $n$ is prime if trial division returns a list of prime factors that contains only $n$ itself. Hence, trial division can also be used for primality testing. However, there are much more efficient methods of primality testing than trial division, and it should not be used for such a task.
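The procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the paper's own implementation: the function name `trial_division` is hypothetical, and for simplicity the sketch tests 2 and then every odd candidate instead of generating primes only. Composite candidates can never divide the remaining value, because their prime factors have already been divided out.

```python
def trial_division(n):
    """Return the prime factors of n (with multiplicity) in ascending order."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    p = 2
    # Candidates above sqrt(n) are never needed; n shrinks as factors are removed.
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1 if p == 2 else 2  # after 2, try odd candidates only
    if n > 1:
        # At most one prime factor larger than the square root can remain.
        factors.append(n)
    return factors


print(trial_division(20))  # [2, 2, 5]
```

A prime input returns a list that contains only the input itself, which is the primality criterion mentioned above.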
Algorithm 2.1 details the steps to be taken for the trial division algorithm.

Algorithm 2.1 Trial Division
Input: an integer n to be factored
Output: a list of prime factors of n

    function TrialDivision(n)
        D ← ()                              ▷ empty list
        for all p in primes(√n) do
            while n mod p = 0 do
                append(D, p)
                n ← n/p
        if n > 1 then
            append(D, n)
        return D

2.2.1 Analysis

To analyze the trial division algorithm, we must find out how many times the for-loop repeats. In fact, it repeats once for every prime number up to $\sqrt{n}$, so we must find out how many prime numbers there are up to $\sqrt{n}$. This is where the prime number theorem comes into play.

Theorem 2.1 (Prime number theorem). Let $n \in \mathbb{Z}^+$. Let $\pi(n)$ be the number of primes less than or equal to $n$; it is called the prime-counting function. Then,

$$\lim_{n \to \infty} \frac{\pi(n)}{n / \ln(n)} = 1.$$

This means that $\pi(n) \in O\!\left(\frac{n}{\ln(n)}\right)$ by the limit definition of Landau notation. For example, $\pi(10) = 4$, since the primes less than or equal to 10 are 2, 3, 5, and 7.

Figure 2.1: Number of primes less than n versus n: the prime counting function π(n). For example, there are 550 primes less than or equal to 4000.

Figure 2.1 shows how close the values of $\pi(n)$ and $\frac{n}{\ln(n)}$ lie together. The prime number theorem was conjectured by Carl Friedrich Gauß in the year 1792 [4, p. 54], but it was not proved until 1896 by Jacques Hadamard and Charles Jean de la Vallée-Poussin. In fact, Gauß did not reveal his conjecture until almost 60 years later in a letter, because he did not think his discovery was very important [4, p. 54].

This means that the number of primes up to $\sqrt{n}$ is of order $O\!\left(\frac{\sqrt{n}}{\ln(\sqrt{n})}\right)$. Therefore, that is how many times the for-loop is repeated. We then have to consider how many times the while-loop is repeated for each $p$.

Definition 2.1 (p-adic order).
The p-adic order of an integer $n > 0$ is the highest integer $v$ such that $p^v$ divides $n$:

$$\nu_p(n) = \max\{v \in \mathbb{Z}_{\geq 0} : p^v \mid n\}.$$

Then for each $p$, the body of the while-loop is executed $\nu_p(n)$ times. The largest possible value of $\nu_p(n)$ occurs when $n = p^{\nu_p(n)}$, so $\nu_p(n) \leq \log_p(n) \leq \log_2(n)$ since $p \geq 2$. So, the while-loop runs $O(\lg n)$ times for each $p$. Since $\lg(n) / \ln(\sqrt{n}) = 2 / \ln 2$ is a constant, trial division is

$$O\!\left(\pi(\sqrt{n}) \lg(n) M(\lg n)\right) = O\!\left(\frac{\sqrt{n}}{\ln(\sqrt{n})} \lg(n) M(\lg n)\right) = O\!\left(\sqrt{n}\, M(\lg n)\right).$$

2.3 Pollard's p − 1 Method

In 1974, John M. Pollard invented the p − 1 method almost as a by-product of a paper on primality testing. Pollard's p − 1 method is based on a famous theorem by Pierre de Fermat called Fermat's little theorem [10,11]. As with most of his number theory conjectures, Fermat did not actually prove this theorem [13]. Many years after its conjecture, Leonhard Euler proved it in 1736 and generalized it to a theorem known today as the Euler totient theorem [13].

Theorem 2.2 (Fermat's little theorem). Let $p$ be a prime and $a$ be an integer coprime to $p$. Then,

$$a^{p-1} \equiv 1 \pmod{p}.$$

This is equivalent to $p \mid (a^{p-1} - 1)$. While it may not be intuitive at first, the theorem also implies that for any positive integer $m$,

$$a^{m(p-1)} = \left(a^{p-1}\right)^m \equiv 1 \pmod{p}.$$

The idea of Pollard's p − 1 method is to select $K$ as a product of powers of many small primes and to calculate $\gcd(a^K - 1, n)$ in the hope that $(p - 1) \mid K$ for some prime $p \mid n$. If both conditions hold, then $p \mid (a^K - 1)$, so $\gcd(a^K - 1, n) \geq p$, and unless this GCD equals $n$ itself we have found a non-trivial divisor.
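The idea above can be sketched in Python as follows. This is a hedged illustration, not the paper's implementation: the names `small_primes` and `pollard_p_minus_1` are hypothetical, the base defaults to $a = 2$ instead of a random value so that runs are reproducible, and $K$ is assumed to be the product of the largest powers of the primes up to a bound that do not exceed $n$. The sketch never forms $K$ explicitly; it raises the running value to one prime power at a time modulo $n$.

```python
from math import gcd, isqrt
from typing import Optional


def small_primes(bound: int) -> list:
    """Return all primes <= bound using a sieve of Eratosthenes."""
    if bound < 2:
        return []
    sieve = [True] * (bound + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(bound) + 1):
        if sieve[i]:
            for j in range(i * i, bound + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def pollard_p_minus_1(n: int, bound: int, a: int = 2) -> Optional[int]:
    """Try to find a non-trivial divisor of n; return None on failure."""
    g = gcd(a, n)
    if g > 1:
        # The base already shares a factor with n.
        return g if g < n else None
    x = a % n
    for p in small_primes(bound):
        # Largest power of p not exceeding n, i.e. p^floor(log_p n).
        power = 1
        while power * p <= n:
            power *= p
        # Step-by-step exponentiation yields a^K mod n without building K.
        x = pow(x, power, n)
    g = gcd(x - 1, n)
    # g == 1: no p with (p - 1) | K; g == n: every prime factor was caught at once.
    return g if 1 < g < n else None


print(pollard_p_minus_1(540143, 8))  # 421, since 420 = 2 * 2 * 3 * 5 * 7
```

Returning `None` both when the GCD is 1 and when it is $n$ is a design choice of this sketch: in the first case a larger bound may help, in the second a smaller bound or a different base is needed.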
Algorithm 2.2 Pollard's p − 1 method
Input: an integer n and a smoothness bound B
Output: a non-trivial divisor of n or failure

     1: function PollardP-1(n, B)
     2:     a ← random integer coprime to n
     3:     K ← ∏_{primes p ≤ B} p^⌊log_p(n)⌋
     4:     m ← (a^K − 1) mod n                 ▷ modular exponentiation
     5:     g ← gcd(m, n)
     6:     if g = 1 then
     7:         either increase B and return PollardP-1(n, B)
     8:         or return failure
     9:     else
    10:         return g                        ▷ g must be a divisor of n

We notice that to find a prime divisor $p$ of $n$, $K$ must be a multiple of $p - 1$. For the algorithm to run quickly, we want $K$ to be a small number, but to find all prime factors of $n$, the bound $B$ that determines $K$ may need to be as large as $\sqrt{n}$. The algorithm then only works fast for numbers $n$ that have a prime divisor $p$ whose neighbor $p - 1$ is a product of small primes. Since we cannot predict whether a given number $n$ has this property, we cannot say in advance whether it is feasible to use this algorithm for that number.

This introduces the concept of strong primes: in certain cryptographic schemes such as RSA, we need to use an integer that is hard to factor. If we consider all known integer factorization algorithms and their properties, then we can designate a strong prime to be a prime number that exhibits none of the properties that the factorization algorithms need to work well. For the p − 1 algorithm, that means that a strong prime is a prime $p$ whose neighbor $p - 1$ has at least one large prime factor, so that $K$ has to be a big number and the p − 1 algorithm runs slowly for any $n$ that has $p$ as a prime divisor.

Note that Pollard's p − 1 algorithm only finds one divisor of $n$, and this divisor is not necessarily prime. Either the divisor is prime or it needs to be factored further. To find all factors of $n$, the algorithm needs to be used repeatedly.

2.3.1 Worked Example

We want to factor $n = 540\,143$ and we choose $B = 8$.
Since the computation of $K$ is extensive, for the example we choose a similar measure: $K = \operatorname{lcm}(2, 3, \ldots, B)$, the lowest common multiple of the numbers between 2 and $B$. Then, we have

$$K = \operatorname{lcm}(2, 3, 4, 5, 6, 7, 8) = 840.$$

With $a = 2$, we also have

$$(2^K - 1) \bmod 540\,143 = (2^{840} - 1) \bmod 540\,143 = 53\,046.$$

To find a divisor, we compute

$$d = \gcd(53\,046, 540\,143) = 421.$$

Indeed, we have that $540\,143 = 421 \times 1283$. Just as noted above, this only worked because $421 - 1 = 420$ is a product of small primes: $420 = 2 \times 2 \times 3 \times 5 \times 7$. All these factors are also factors of $K$, and this is why the method worked.

If we now tried to factor $491\,389$ into its prime factors 383 and 1283, we notice that $383 - 1 = 382$ is not a product of small primes, since $382 = 2 \times 191$. Then, we would have to choose at least $B = 191$ for the method to work. At this point, the computation of $K$ becomes really expensive.

2.3.2 Analysis

The runtime of Pollard's p − 1 algorithm depends entirely on the exponentiation and modular exponentiation of lines 2-4. To analyze those, we give a specific algorithm for finding the modular exponentiation and GCD in Algorithm 2.3. Of course, the runtime of these arithmetic operations depends on both the number $n$ to be factored and the choice of $B$.

Algorithm 2.3 Modular exponentiation and GCD of Pollard's p − 1

    K ← a
    for all p in primes(B) do
        pc ← p
        while pc < n do
            K ← K^p (mod n)
            pc ← pc ∗ p
    g ← gcd(K − 1, n)

The for-loop repeats $\pi(B)$ times, where $\pi$ is the prime-counting function as defined in Theorem 2.1. The while-loop repeats $\lfloor \log_p(n) \rfloor$ times for each $p$. Then, we have $\sum_{\text{primes } p \leq B} \lfloor \log_p(n) \rfloor$ modular exponentiations and multiplications. Each modular exponentiation is $O(\lg(p) M(\lg n))$, while each multiplication is $M(\lg n)$.
We also find the GCD once; however, $G(\lg n) = O(\lg\lg n \, M(\lg n))$, which is already less than the time taken for the modular exponentiations. Then, the cost of the modular exponentiations and multiplications is

$$\begin{aligned}
\sum_{\text{primes } p \leq B} \lfloor \log_p(n) \rfloor \, O(\lg(p) M(\lg n))
&\leq \sum_{\text{primes } p \leq B} O(\log_p(n) \lg(p) M(\lg n)) \\
&\leq \sum_{\text{primes } p \leq B} O(\lg(n) M(\lg n)) \\
&= O(\pi(B) \lg(n) M(\lg n)). \qquad (1)
\end{aligned}$$

Then, equation (1) is the runtime of Pollard's p − 1 algorithm. The source at [14] agrees with this version of the analysis. Just like trial division, this algorithm ends up being exponential time in the size of $n$. However, we now have a parameter $B$ that lets us trade performance against reliability. Larger values of $B$ are more likely to produce a result, but the algorithm will take longer, while smaller values of $B$ might not produce any result, but the algorithm will run more quickly.

We notice that if $B = \sqrt{n}$, Pollard's p − 1 is $O(\pi(\sqrt{n}) \lg(n) M(\lg n))$, which is the same complexity as trial division. However, that would be just one iteration of Pollard's p − 1 algorithm, which finds only one divisor of $n$, while trial division finds all prime divisors of $n$. It becomes apparent that to use Pollard's p − 1 algorithm feasibly, we have to choose small values of $B$.

2.4 Pollard's ρ Method

To talk about Pollard's ρ method, we need to develop a few small results about cycles in $\mathbb{Z}/n\mathbb{Z}$. Pollard's ρ method works by trying to find cycles modulo $p$ while doing a functional iteration modulo $n$, where $n$ is the integer to be factored and $p$ is a prime factor.

2.4.1 Cycles in Z/nZ

Consider the function $f(x) = (x^2 + 1) \bmod 6$ and define a sequence using this function as its functional iteration. That is, let $x_0 = 0$ be the start of the sequence, and apply $f$ to get the next value in the sequence. So, $x_1 = f(x_0) = (0^2 + 1) \bmod 6 = 1$, $x_2 = f(x_1) = (1^2 + 1) \bmod 6 = 2$, $x_3 = 5$, etc.
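The first terms of this functional iteration can be generated with a short Python sketch. The helper name `iterate` is hypothetical and only serves this illustration; `f` is the function from the example above.

```python
def iterate(f, x0, count):
    """Return the first `count` terms x0, f(x0), f(f(x0)), ... as a list."""
    terms = [x0]
    for _ in range(count - 1):
        terms.append(f(terms[-1]))
    return terms


def f(x):
    # The example function: x^2 + 1 reduced modulo 6.
    return (x * x + 1) % 6


print(iterate(f, 0, 8))  # [0, 1, 2, 5, 2, 5, 2, 5]
```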
If we take that sequence and start writing it out, we get

$$(0, 1, 2, 5, 2, 5, 2, 5, \cdots)$$

and we notice that eventually, 2 and 5 start cycling. Once $f(5) = 2$, it is just like jumping back two elements, since we will simply apply the same functional iteration to 2 again. We already know that $f(2) = 5$, so why do that computation again? A sequence that cycles like this is called periodic, and the "jumping" distance is called the period of the sequence. In this case, the sequence is actually ultimately periodic, since the cycling does not start until the third element of the sequence.

Definition 2.2 (Periodic sequences). A sequence $\{X_i\}_{i \geq 0}$ is considered ultimately periodic if there exist integers $\lambda > 0$ (called the period) and $N \geq 0$ such that $X_{n+\lambda} = X_n$ for all $n \geq N$. If $N = 0$, the sequence is just called periodic.

Figure 2.2: Rho-shaped ultimately periodic sequence. (TikZ drawing taken from https://github.com/mmaker/bachelor/blob/master/book/pollardrho.tex)

We can generalize the previous example: given any function $f$ modulo $n$ ($f \colon \mathbb{Z}/n\mathbb{Z} \to \mathbb{Z}/n\mathbb{Z}$) that is applied repeatedly to an element to form a sequence (a functional iteration), we can say that the sequence will be ultimately periodic just like the sequence above. The reason for this is simple: $\mathbb{Z}/n\mathbb{Z}$ is a finite set; it has exactly $n$ elements. So if $f$ is applied to some $x_0$ at least $n$ times, one of the values in $\mathbb{Z}/n\mathbb{Z}$ has to repeat (this follows from the pigeonhole principle). An ultimately periodic sequence induces a ρ-shaped picture as in Figure 2.2, where the tail is the segment of the sequence before it becomes periodic, and the head is the cycle.

Proposition 2.3. Let $f \colon \mathbb{Z}/n\mathbb{Z} \to \mathbb{Z}/n\mathbb{Z}$ and consider the functional iteration (sequence) $\{X_i\}_{i \geq 0}$ where $X_i \in \mathbb{Z}/n\mathbb{Z}$ and $X_{m+1} = f(X_m)$. This sequence is ultimately periodic.

Proof. Let $m$ be the smallest positive integer such that $X_m$ equals an earlier term, so that $X_0, X_1, \ldots, X_{m-1}$ are distinct.
By the pigeonhole principle, we know that $m \leq n$. Then, $X_m = X_\mu$ for some $0 \leq \mu \leq m - 1$. Let $\lambda = m - \mu$ be the period. By induction, we need to show that $X_{k+\lambda} = X_k$ for all $k \geq \mu$. For the base case, let $k = \mu$. Then, $X_{\mu+\lambda} = X_m = X_\mu$. We then assume $X_{k+\lambda} = X_k$ for some $k \geq \mu$. Then, $X_{k+1+\lambda} = f(X_{k+\lambda}) = f(X_k) = X_{k+1}$. Hence, the sequence is ultimately periodic with period $\lambda$ for $k \geq \mu$.

2.4.2 Floyd's Cycle-finding Algorithm

The naïve way of finding a cycle in a functional iteration would be to test $x_i$ against every $x_j$ for $j < i$ until a cycle is found. However, in The Art of Computer Programming, Knuth gives a much more elegant way of finding cycles and attributes it to R. W. Floyd [15]. Algorithm 2.4 details a simple procedure to find the cycle given a functional iteration $f$ on a finite set and a starting element $x_0$.

The algorithm is sometimes also called the "tortoise-hare algorithm," where $x$ is the tortoise and $y$ is the hare: since we know that repeatedly applying $f$ eventually becomes periodic, we let $x$ walk slowly and $y$ walk twice as fast as $x$. Eventually both are inside the cycle, and when they meet, we have found the cycle. If we keep track of how many iterations we have gone through, we can also figure out the period of the functional iteration. Formally, this works because $X_{2i} = X_i$ whenever $i \geq \mu$ is a multiple of the period $\lambda$, where $X$ is the functional iteration and $\mu$ is the length of the tail.

Algorithm 2.4 Floyd's cycle-finding algorithm
Input: function f and start-value x0
Output: halts when a cycle is found

    function FloydCycle(f, x0)
        x ← f(x0), y ← f(f(x0))
        while x ≠ y do
            x ← f(x)
            y ← f(f(y))

2.4.3 Pollard's ρ Method

While trial division is guaranteed to find all prime divisors of a given integer $n$, it does so very slowly. On the other hand, Pollard's ρ method executes quickly, but it may fail to find any divisor of $n$.

Suppose $n$ is the integer to be factored and $p$ is a prime divisor of $n$.
Pollard's ρ method works by trying to find cycles modulo $p$ in a sequence of pseudorandom numbers modulo $n$. But why is a cycle modulo $p$ important? Let's say we have found two elements in the sequence where $x_i \neq x_j$ and $x_i \equiv x_j \pmod{p}$. This implies that $x_i - x_j$ is a multiple of $p$. We have that $n$ is also a multiple of $p$, so $\gcd(|x_i - x_j|, n) \geq p$. Thus, by finding a cycle modulo $p$ in the sequence, we have found a non-trivial divisor of $n$.

We also need to generate a sequence of pseudorandom numbers modulo $n$. This is generally done by using $f(x) = x^2 + b \pmod{n}$ for some $b \in \mathbb{Z}/n\mathbb{Z}$ with $b \neq 0, -2$ (Algorithm 2.5 uses $b = -1$). We assume that every output that $f(x)$ could give has equal probability; i.e., even though $f(x)$ clearly defines a dependence, we assume that the probability of each output is the same. The choice of $f$ is based only on empirical results, whereas $b = 0, -2$ did not produce a uniform distribution.

The method, of course, takes its name from the ρ shape that the sequence induces. Pollard created it in 1975 [8], calling it a Monte Carlo approach to factorization and saying in the paper that he considered naming it the ρ method. Evidently, that name came to be used for it anyway.

Algorithm 2.5 Pollard's ρ method (Monte Carlo factorization)
Input: an integer n
Output: a non-trivial divisor of n or failure

    function PollardRho(n)
        x ← 2, y ← 2, g ← 1
        while g = 1 do
            x ← x² − 1 (mod n)
            y ← (y² − 1)² − 1 (mod n)
            g ← gcd(|x − y|, n)
        if g = n then
            return failure
        else
            return g                        ▷ g must be a divisor of n

2.4.4 Analysis

The analysis of Pollard's ρ method relies on finding the expected number of iterations needed to find a congruence modulo $p$ in the functional iteration modulo $n$. Keeping in mind that we assume that the output of $f$ is a pseudorandom number modulo $n$, we can draw an analogy to the birthday paradox problem: How many people need to be in a room for there to be a probability of $m$ that at least two people have the same birthday?
Usually, the problem is posed with $m = 1/2$. If we replace some terminology, we can rephrase the problem this way: How many values do we need to draw from $f$ for two elements to have the same remainder modulo $p$ with probability $m$? Doing this analysis [9, Chapter 5.4] [16], we find that we can expect a cycle after approximately

$$\frac{1}{2} \sqrt{1 - 8p \ln(1 - m)}$$

iterations. Thus, for probability $1/2$, we need approximately $1.177\sqrt{p}$ iterations to find a cycle modulo $p$.

In fact, this analysis leads us to conclude that for any fixed probability $m$, a cycle is found in $\Theta(\sqrt{p})$ steps.

3 Strong Primes

Cryptographically, some methods of integer factorization introduce a concept of strong primes. For certain cryptographic schemes, it is desirable to choose primes whose product is hard to factor with most methods of integer factorization. It is then of advantage that the methods discussed here are either slow or work well only for integers whose prime factors have special properties. A strong prime is then a prime that does not possess any of these special properties, so that an integer built from it can only be factored slowly. For example, one of the conditions for a prime $p$ to be strong is that $p - 1$ has large prime factors, due to Pollard's p − 1 algorithm. Similarly, Williams' p + 1 algorithm means that $p + 1$ needs to have large prime factors for $p$ to be a strong prime.

However, it is argued by some, such as Ronald Rivest, that strong primes bring no additional security to a cryptographic scheme, because fast methods such as Lenstra's elliptic curve method do not rely on any such special properties and thus will outperform algorithms like Pollard's p − 1 in any case [17].

4 References

[1] Euclid of Alexandria, "Elements of geometry," in Book VII, J. Heiberg and R. Fitzpatrick, Eds. Richard Fitzpatrick, Dec. 2007. [Online]. Available: http://farside.ph.utexas.edu/euclid.html
[2] C. F. Gauß, Disquisitiones Arithmeticae. Berlin, Germany: Springer, 1801 / 1986, translation by Arthur A. Clarke and William C. Waterhouse.
[3] J. van Leeuwen, Ed., Handbook of Theoretical Computer Science. Amsterdam, Netherlands: Elsevier, 1990, vol. A.
[4] H. Mania, Gauß – Eine Biographie (Gauß – A Biography). Hamburg, Germany: Rowohlt, 2009.
[5] R. L. Rivest, A. Shamir, and L. M. Adleman, "A method for obtaining digital signatures and public-key cryptosystems," CACM, vol. 21, no. 2, pp. 120–126, Feb. 1978. [Online]. Available: http://doi.acm.org/10.1145/359340.359342
[6] L. Fortnow and S. Homer, "A short history of computational complexity," Bulletin of the European Association for Theoretical Computer Science, vol. 80, pp. 95–133, 2002. [Online]. Available: http://people.cs.uchicago.edu/~fortnow/papers/history.pdf
[7] C. F. Cid, "Cryptanalysis of RSA: A survey," 2003. [Online]. Available: http://www.sans.org/reading-room/whitepapers/vpns/cryptanalysis-rsa-survey-1006
[8] J. M. Pollard, "A Monte Carlo method for factorization," BIT Numerical Mathematics, vol. 15, no. 3, pp. 331–334, 1975. [Online]. Available: http://dx.doi.org/10.1007/BF01933667
[9] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 2nd ed. Boston, MA: McGraw-Hill, 2001.
[10] J. M. Pollard, "Theorems on factorization and primality testing," Mathematical Proceedings of the Cambridge Philosophical Society, vol. 76, pp. 521–528, Nov. 1974. [Online]. Available: http://journals.cambridge.org/article_S0305004100049252
[11] H. W. Lenstra Jr., "Factoring integers with elliptic curves," Annals of Mathematics, vol. 126, no. 3, pp. 649–673, 1987.
[12] P. Bachmann, Zahlentheorie – Zweiter Teil. Stuttgart, Germany: B. G. Teubner, 1894, reprinted by Johnson Reprint, New York, NY, 1968.
[13] L. Euler, "Theorematum quorundam ad numeros primos spectantium demonstratio," Commentarii academiae scientiarum Petropolitanae, vol. 8, pp. 141–146, 1741, English translation available. [Online]. Available: http://eulerarchive.maa.org/pages/E054.html
[14] A. V. Sutherland, "18.783 elliptic curves – lecture notes #12," MIT Course Website, Mar. 2013, http://math.mit.edu/classes/18.783/LectureNotes12.pdf.
[15] D. E. Knuth, The Art of Computer Programming - Seminumerical Algorithms, 3rd ed. Reading, MA: Addison-Wesley, 1997.
[16] C. Koch, "Integer factorization methods," Apr. 2014, presentation given in a seminar class at New Mexico Tech (CSE489/589). [Online]. Available: https://cs.nmt.edu/~ckoch/presentation-factorization-notes.pdf
[17] R. Rivest and R. Silverman, "Are 'strong' primes needed for RSA," Cryptology ePrint Archive, Report 2001/007, 2001, http://eprint.iacr.org/.