Download Integer Factorization Methods

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Big O notation wikipedia , lookup

Addition wikipedia , lookup

Vincent's theorem wikipedia , lookup

Elementary mathematics wikipedia , lookup

Arithmetic wikipedia , lookup

Collatz conjecture wikipedia , lookup

Quadratic reciprocity wikipedia , lookup

List of prime numbers wikipedia , lookup

Algorithm wikipedia , lookup

Proofs of Fermat's little theorem wikipedia , lookup

Factorization of polynomials over finite fields wikipedia , lookup

Transcript
Integer Factorization Methods
Christopher Koch
CSE489-01 Algorithms in CS & IT
New Mexico Institute of Mining and Technology
May 9, 2014
Contents
List of Theorems and Definitions
iii
List of Algorithms
iii
List of Figures
iii
1 Introduction
1
1.1
Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 Integer Factorization
2
3
2.1
Complexity of Integer Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3
2.2
Trial Division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4
2.2.1
Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4
Pollard’s p − 1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6
2.3.1
Worked Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7
2.3.2
Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
8
Pollard’s ρ Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
9
2.4.1
Cycles in Z/nZ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
9
2.4.2
Floyd’s Cycle-finding Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4.3
Pollard’s ρ Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4.4
Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3
2.4
3 Strong Primes
13
4 References
14
ii
List of Theorems and Definitions
1.1
Theorem (Fundamental theorem of arithmetic) . . . . . . . . . . . . . . . . . . . . . . .
1
1.1
Definition (Landau notation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2
2.1
Theorem (Prime number theorem) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4
2.1
Definition (p-adic order) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5
2.2
Theorem (Fermat’s little theorem) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6
2.2
Definition (Periodic sequences) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
9
2.3
Proposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
List of Algorithms
2.1
Trial Division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4
2.2
Pollard’s p − 1 method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6
2.3
Modular exponentiation and GCD of Pollard’s p − 1 . . . . . . . . . . . . . . . . . . . .
8
2.4
Floyd’s cycle-finding algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.5
Pollard’s ρ method (Monte Carlo factorization) . . . . . . . . . . . . . . . . . . . . . . . 11
List of Figures
2.1
2.2
Number of primes less than n versus n: the prime counting function π(n). For example, there are 550 primes less than or equal to 4000. . . . . . . . . . . . . . . . . . .
1
Rho-shaped ultimately periodic sequence . . . . . . . . . . . . . . . . . . . . . . . . . .
iii
5
9
1
1
INTRODUCTION
Introduction
VERY positive integer (whole number) greater than one can be written as the multiplication
of prime numbers, for example 10 = 2 × 5. The fundamental theorem of arithmetic details that
every positive integer greater than 1 must have such a “unique prime factorization;” however, it
does not specify how to find such factorization. Integer factorization methods are procedures that
find a given integer’s unique prime factors.
E
Theorem 1.1 (Fundamental theorem of arithmetic). Let n be an integer greater than 1. Then,
there exist not necessarily distinct prime numbers p1 , p2 , . . . , pi such that
n = p1 × p2 × ⋯ × pi .
The fundamental theorem of arithmetic was first proven by the Greek mathematician Euclid around
300 BC in his treatise Elements [1].
While every positive integer greater than 1 has a prime factorization, finding it is a hard problem.
The naïve approach would be to take the given integer n and to try to divide it by every prime
between 1 and n. If it is divisible by a prime, then that prime is one of the prime factors of n.
One must consider that an integer n may be divisible by a prime more than once, for example
20 = 2 × 2 × 5 has the prime 2 twice in its prime factorization. One must also consider that finding
the prime numbers between 1 and n is also a difficult problem. While simple, this approach is an
exponential time algorithm.
The problem of distinguishing prime numbers from composite numbers and of resolving
the latter into their prime factors is known to be one of the most important and useful
in arithmetic. [. . . ] Further, the dignity of the science itself seems to require that every
possible means be explored for the solution of a problem so elegant and celebrated.
— Carl Friedrich Gauß [2, Article 329]
Traditionally, prime numbers and all that they involve, such as integer factorization, had few applications outside of pure mathematics. Even the community of number theorists viewed colleagues
that occupied themselves with integer factorization as incurably obsessive [3, p. 675]. Some results
and theorems known about prime numbers were only discovered due to some mathematician’s obsession with primes; for example, Gauß conjectured the famous prime number theorem when he
was 15, but did not discuss it until almost 60 years later because he thought it was not a significant
result [4, p. 54].
All this changed entirely in the 1970s, when the concepts of public-key cryptography were invented
through RSA [5], an encryption scheme named after its inventors Ronald Rivest, Adi Shamir, and
Leonard Adleman. RSA specifically relied on the difficulty of factoring large integers into primes. In
addition to that, complexity theory became popular [6]: complexity theory enabled researchers to
express rigorously that using a certain method to factor a number n will take at most time f (n) for
some function f . While the notation of some of complexity theory (such as Landau notation) had
been around since the beginning of the 20th century, it was now being applied to computational
complexity: this meant that algorithms, such as integer factorization algorithms, could be compared
to each other in terms of speed.
Christopher Koch
1
1
INTRODUCTION
1.1
Notation
The security of RSA encryption lies in the difficulty of factoring large integers; specifically in that
all known integer factorization algorithms are exponential time algorithms. While the mathematics
behind RSA cannot be broken, the ability to factor large integers quickly (in polynomial time)
would make the use of RSA infeasible. This would be fatal: RSA is the “most widely deployed
public-key cryptosystem” [7]; it is for example used for secure web browsing, digital signatures, and
secure e-mail communications.
To this day, a lot of research has been done in integer factorization methods. In the last 40 years, a
number of methods were developed and optimized; for example, Pollard’s ρ algorithm in 1975 [8] [9,
ch. 31], Pollard’s p − 1 algorithm in 1974 [10], or Lenstra’s elliptic curve factorization algorithm in
1987 [11]. To this day, the three best practical methods of integer factorization are the general
number field sieve, the quadratic sieve, and the elliptic curve factorization algorithm. There is also
Shor’s algorithm for quantum computers; however, to date no quantum computer has been built
that can run Shor’s algorithm.
1.1
Notation
The GCD of two integers a and b will be denoted gcd(a, b) and the LCM as lcm(a, b).
We remind the reader of the (modern) limit definition of the notation due to Bachmann and Landau
(Big O) [12, p. 401]. Of course, Bachmann did not define it this specifically when he first used it;
the limit definition came to exist later.
Definition 1.1 (Landau notation). We say that a function f (n) is O(g(n)) as n goes to ∞ (infinity)
if
f (n)
∣ < ∞.
lim ∣
n→∞ g(n)
We may write f (n) ∈ O(g(n)) or f (n) = O(g(n)).
We use lg n to denote the logarithm with base 2 of n.
The time to multiply a n-bit number will be denoted by M (n). Schoolbook multiplication is M (n) =
O(n2 ), while mutliplication due to Strassen and Schönhage is M (n) = O(n lg n lg lg n).
The time to find the GCD of two n-bit numbers will be denoted G(n). The Euclidean algorithm
would be G(n) = O(n2 ), while the GCD due to Schönhage takes G(n) = O(M (n) lg n).
Christopher Koch
2
2
INTEGER FACTORIZATION
2
Integer Factorization
Given an integer n, many integer factorization methods will only find one divisor of n and it may
not be a prime. It is then necessary to test whether the resulting divisor is a prime or not. To find
more than one factor of n, the respective algorithm would have to be used repeatedly. In this paper,
only the trial division algorithm will give all prime divisors of n.
Generally, integer factorization algorithms can be divided into two categories: general-purpose algorithms, whose runtime just depends on the size of the integer to be factored, and special-purpose
algorithms, whose runtime may depend not only on the size of the integer to be factored but also
on some possibly unknown properties of that integer. The algorithms discussed here will be from
both categories: trial division is a general-purpose algorithm, while the other algorithms discussed
are special-purpose algorithms. We will discuss the properties that an integer must possess for each
special-purpose algorithm to work well.
2.1
Complexity of Integer Factorization
Integer factorization is traditionally seen as a hard problem by computer scientists and mathematicians [3, p. 676]. In this context, “hard” means that the problem is computationally intensive:
There is no known integer factorization algorithm for Turing machines that factors in polynomial
time. Of course, integer factorization can be verified in polynomial time, which means it is in the
NP complexity class.
While all known integer factorization algorithms for Turing machines are exponential or subexponential time, we also know that integer factorization is in the BQP complexity class – bounded
error quantum polynomial time. Peter Shor’s factoring algorithm for quantum computers showed
this in 1995.
It is suspected that integer factorization is outside of P, NP-complete, and coNP-complete. If that
turns out to be true, factorization would be in the NP-intermediate class; however, if we knew that
to be true, we would also know that P ≠ NP.
Christopher Koch
3
2
INTEGER FACTORIZATION
2.2
2.2
Trial Division
Trial Division
Trial division is the naïve method of integer factorization. Given an integer n > 1, we know that
factors of n can be between 2 and n/2. The easiest thing to do would be to test every prime number
between 2 and n/2 for divisibility of n. However, we can make one more observation to improve
trial division: We also know that there may only be zero or one prime factor of n that is greater
√
√
√ √
than n. If there were two factors p, q > n, we would have pq > n n = n.
√
Then, we simply test every prime number between 2 and n inclusive to see whether it divides
n and divide n by that number if it is divisible. In the end, since there may be one prime factor
√
greater than n, we test whether n > 1 and use that as a factor if it is.
Thus, we know that n is prime if trial division returns a list of prime factors that just contains n
itself. Hence, trial division can also be used for primality testing. However, there are much more
efficient methods of primality testing than trial division and it should not be used for such a task.
Algorithm 2.1 details the steps to be taken for the trial division algorithm.
Algorithm 2.1 Trial Division
Input: an integer n to be factored
Output: a list of prime factors of n
1:
2:
3:
4:
5:
6:
7:
8:
9:
function TrialDivision(n)
D ← ()
√
for all p in primes( n) do
while n mod p = 0 do
append(D, p)
n ← n/p
if n > 1 then
append(D, n)
return D
2.2.1
▷ empty list
Analysis
To analyze the trial division algorithm, we must find out how many times the for-loop repeats. In
√
fact, it repeats for every prime number less than n, so we must find out how many prime numbers
√
there are less than n. This is where the prime number theorem comes into play.
Theorem 2.1 (Prime number theorem). Let n ∈ Z+ . Let π(n) be the number of primes less than
n; it is called the prime-counting function. Then,
π(n)
= 1.
n→∞ n/ ln(n)
lim
n
This means that π(n) ∈ O ( ln(n)
) by the limit definition of Landau notation.
For example, π(10) = 4, since the primes less than or equal to 10 are 2, 3, 5, and 7. Thus, there are
four primes less than 10.
Christopher Koch
4
2
INTEGER FACTORIZATION
2.2
Trial Division
Figure 2.1: Number of primes less than n versus n: the prime counting function π(n). For example, there
are 550 primes less than or equal to 4000.
n
Figure 2.1 shows how close the values of π(n) and ln(n)
lie together. The prime number theorem
was conjectured by Carl Friedrich Gauß in the year 1792 [4, p. 54], but it was not proved until
1896 by Jacques Hadamard and Charles Jean de la Vallée-Poussin. In fact, Gauß did not reveal his
conjecture until almost 60 years later in a letter, because he did not think his discovery was very
√
√
important [4, p. 54]. This means that the number of primes lower than n is of order O ( ln(√nn) ).
Therefore, that is how many times the for-loop is repeated.
We then have to consider how many times the while-loop is repeated for each p.
Definition 2.1 (p-adic order). The p-adic order of an integer n > 0 is the highest integer v such
that pv divides n:
νp (n) = max{v ∈ Z+ ∶ pv ∣n}
Then for each p, the while-loop repeats νp (n) times. An upper bound for νp (n) would be if n = pνp (n) ,
so νp (n) ≤ logp (n) ≤ log2 (n) since p ≥ 2. So, the while-loop runs O(lg n) for each p.
Then, trial division is
√
√
⎛
⎞
n
O(π( n) lg(n)M (lg n)) = O
√ lg(n)M (lg n) = O ( nM (lg n)) .
⎝ ln ( n)
⎠
√
Christopher Koch
5
2
INTEGER FACTORIZATION
2.3
2.3
Pollard’s p − 1 Method
Pollard’s p − 1 Method
In 1974, John M. Pollard invented the p − 1 method almost as just a by-product of a paper on
primality testing.
Pollard’s p − 1 method is based on a famous theorem by Pierre de Fermat called Fermat’s little
theorem [10,11]. As with most of his number theory conjectures, Fermat did not actually prove this
theorem [13]. Many years after its conjecture, Leonhard Euler proved it in 1736 and generalized it
to a theorem known as the Euler totient theorem today [13].
Theorem 2.2 (Fermat’s little theorem). Let p be a prime and a be an integer coprime to p. Then,
ap−1 ≡ 1 (mod p).
This is equivalent to p∣(ap−1 − 1).
While it may not be intuitive at first, the theorem also implies that for some prime p and any
integer m,
am(p−1) ≡ 1 (mod p).
The idea of Pollard’s p − 1 method is to select K as the multiplication of many small primes and
calculate the gcd(aK − 1, n) in the hope that (p − 1)∣K for some p∣n, since if those two conditions
are true, we have that p∣(aK − 1). Then, gcd(aK − 1, n) ≥ p and we have found a non-trivial divisor.
Algorithm 2.2 Pollard’s p − 1 method
Input: an integer n and a smoothness bound B
Output: a non-trivial divisor of n or failure
function PollardP-1(n, B)
a ← random integer coprime to n
3:
K←
p⌊logp (n)⌋
∏
1:
2:
4:
5:
6:
7:
8:
9:
10:
primes p≤B
K
m ← (a − 1) mod n
g ← gcd (m, n)
if g = 1 then
either increase B and return PollardP-1(n, B)
or return failure
else
return g
▷ modular exponentiation
▷ g must be a divisor of n
We notice that to find a prime divisor p of n, the number p − 1 must be a multiple of K. For the
algorithm to run quickly, we want K to be a small number, but to find all prime factors of n, we
√
need K to be n. The algorithm then only works fast for numbers n where a prime divisor p’s
neighbor p − 1 is divisible by small primes. Since we cannot predict whether any number n has this
property, we cannot say whether it is feasible to use this algorithm or not for that number.
This introduces the concept of strong primes: in certain cryptographic schemes such as RSA, we
need to use an integer that is hard to factor. If we consider all known integer factorization algorithms
and their properties, then we can designate a strong prime to be a prime number that exhibits none
Christopher Koch
6
2
INTEGER FACTORIZATION
2.3
Pollard’s p − 1 Method
of the properties that the factorization algorithms need to work well. For the p − 1 algorithm that
means that a strong prime is a prime whose prime factor p’s neighbor p − 1 has at least one large
prime factor, so that K has to be a big number and the p − 1 algorithm runs slowly for any n that
has p as a prime divisor.
It is to note that Pollard’s p − 1 algorithm only finds one divisor of n and this divisor is not
necessarily prime. It may either be that the divisor is prime or it needs to be factored further. To
find all factors of n, the algorithm needs to be used repeatedly.
2.3.1
Worked Example
We want to factor n = 540 143 and we choose B = 8.
Since the computation of K is extensive, for the example we choose a similar measure: K =
lcm(2, 3, . . . , B), the lowest common multiple of the numbers between 2 and B.
Then, we have
K = lcm(2, 3, 4, 5, 6, 7, 8) = 840.
We also have
(2K − 1) mod 540 143 = (2840 − 1) mod 540 143 = 53 046.
To find a divisor, we compute
d = gcd(53 046, 540 143) = 421.
Indeed, we have that 540 143 = 421 × 1283. Just as noted above, we notice that this only worked
because 421 − 1 = 420 was the factorization of many small primes: 420 = 2 × 2 × 3 × 5 × 7. All these
factors were also factors of K and this is why the method worked.
If we now tried to factor 491 389 into its prime factors 383 and 1283, we notice that 383 − 1 = 382
is not the factorization of many small prime factors, since 382 = 2 × 191. Then, we would have
to choose at least B = 191 for the method to work. At this point, the computation of finding K
becomes really expensive.
Christopher Koch
7
2
INTEGER FACTORIZATION
2.3.2
2.3
Pollard’s p − 1 Method
Analysis
The runtime of Pollard’s p − 1 algorithm entirely depends on the exponentiation and modular
exponentiation of lines 2-4. To analyze those, we give a specific algorithm for finding the modular
exponentiation and GCD as in Algorithm 2.3. Of course, the runtime of these arithmetic operations
is dependent on both the number n to be factored and the choice of B.
Algorithm 2.3 Modular exponentiation and GCD of Pollard’s p − 1
1: K ← a
2: for all p in primes(B) do
3:
pc ← p
4:
while pc < n do
5:
K ← K p (mod n)
6:
pc ← pc ∗ p
7: g ← gcd(K − 1, n)
The for-loop repeats π(B) times, where π is the prime-counting function as defined in Theorem 2.1.
Each while-loop repeats ⌊logp (n)⌋ times for each p. Then, we have ∑primes p≤B ⌊logp (n)⌋ modular
exponentiations and multiplications.
Each modular exponentiation is O(lg(p)M (lg n)) while each multiplication is M (lg n). We also
find the GCD once; however, G(lg n) = O(lg lg nM (lg n)), which is already less than the time taken
for modular exponentiation.
Then, the cost of the modular exponentiations and multiplications is
∑
⌊logp (n)⌋O(lg(p)M (lg n)) ≤
∑
O(logp (n) lg(p)M (lg n))
∑
O(lg(n)M (lg n))
primes p≤B
primes p≤B
≤
primes p≤B
= O(π(B) lg(n)M (lg n))
(1)
Then, equation (1) is the runtime of Pollard’s p − 1 algorithm. The source at [14] agrees with this
version of the analysis.
Just like trial division, this algorithm ends up being exponential time in the size of n. However,
we now have a factor B to play with that allows us to choose performance and reliability of the
algorithm. Larger values of B will likely get results, but the algorithm will take longer, while smaller
values of B might not get any results, but the algorithm will run quicker.
√
√
We notice that if B = n, Pollard’s p − 1 is O(π( n) lg(n)M (lg n)), which is the same complexity
as trial division. However, that would be just one iteration of Pollard’s p − 1 algorithm, which finds
only one divisor of n while trial division finds all divisors of n. It becomes apparent that to use
Pollard’s p − 1 algorithm feasibly, we have to choose small values of B.
Christopher Koch
8
2
INTEGER FACTORIZATION
2.4
2.4
Pollard’s ρ Method
Pollard’s ρ Method
To talk about Pollard’s ρ method, we need to develop a few small results about cycles in Z/nZ.
Pollard’s ρ method basically works by trying to find cycles modulo p while doing a functional
iteration modulo n, where n is the integer to be factored and p is a prime factor.
2.4.1
Cycles in Z/nZ
Consider a function f (x) = (x2 +1) mod 6 and define a sequence using this function as its functional
iteration. That is, let x0 = 0 be the start of the sequence, and apply f to get the next value in the
sequence. So, x1 = f (x0 ) = 01 + 1 mod 6 = 1, x2 = f (x1 ) = 12 + 1 mod 6 = 2, x3 = 5, etc.
If we take that sequence and start writing it out, we get
(0, 1, 2, 5, 2, 5, 2, 5, ⋯)
and we notice that eventually, 2 and 5 start cycling. Once f (5) = 2, it is just like jumping back two
elements, since we will simply apply the same functional iteration to 2 again. We already know that
f (2) = 5, so why do that computation again? A sequence that cycles like this is called periodic and
the “jumping” distance is called the period of the sequence. In this case, the sequence is actually
ultimately periodic, since the cycling does not start until the third element of the sequence.
Definition 2.2 (Periodic sequences). A sequence {Xi }i≥0 is considered ultimately periodic if there
exist integers λ (called period) and N such that Xn+λ = Xn for all n ≥ N .
If N = 0, the sequence is just called periodic.
xi+1
xi ≡ xj
xi+2
period (mod n)
xi−1
xj−1
xi−2
x1
Figure 2.2: Rho-shaped ultimately periodic sequence2
We can generalize the previous example: given any function f modulo n (f ∶ Z/nZ → Z/nZ) that
is applied to an element to form a sequence (a functional iteration), we can say that it will be
ultimately periodic just like the sequence above. The reason for this is simple: Z/nZ is a finite set;
it has exactly n elements. So if f is applied to some x0 at least n times, one of the values in Z/nZ
has to eventually repeat itself (this follows from the Pidgeonhole principle).
An ultimately periodic sequence induces a ρ-shaped picture as in Figure 2.2, where the tail is the
segment of the sequence before it becomes periodic, and the head is the cycle.
2
TikZ drawing taken from https://github.com/mmaker/bachelor/blob/master/book/pollardrho.tex
Christopher Koch
9
2
INTEGER FACTORIZATION
2.4
Pollard’s ρ Method
Proposition 2.3. Let f ∶ Z/nZ → Z/nZ and consider the functional iteration (sequence) {Xi }i≥0
where Xi ∈ Z/nZ and Xm+1 = f (Xm ). This sequence is ultimately periodic.
Proof. Assume X0 , X1 , . . . , Xm−1 distinct for some positive integer m and Xm not. By the Pidgeonhole principle, we know that m ≤ n.
Then, Xm = Xµ for some 0 ≤ µ ≤ m − 1. Let λ = m − µ be the period.
By induction, we need to show that Xn+λ = Xn for all n ≥ µ.
For the base case, let n = µ. Then, Xµ+λ = Xm = Xµ . We then assume Xn+λ = Xn for some n.
Then, Xn+1+λ = f (Xn+λ ) = f (Xn ) = Xn+1 .
Hence, the sequence is ultimately periodic with period λ for n ≥ µ.
2.4.2
Floyd’s Cycle-finding Algorithm
The naïve way of finding a cycle in a functional iteration would be to test xi against every xj for
j < i until a cycle is found.
However, in The Art of Computer Programming, Knuth gives a much more elegant way of finding
cycles and attributes it to R. W. Floyd [15]. Algorithm 2.4 details a simple procedure to find the
cycle given a functional iteration f on a finite set and a starting element x0 .
The algorithm is sometimes also called the “tortoise-hare algorithm,” where x is the tortoise and
y is the hare: since we know that applying f means we will eventually be periodic, we let x walk
slowly in the “cycle” and y walk twice as fast as x in the “cycle.” Eventually, when they meet up, we
have found the cycle. If we kept track of how many iterations we have gone through, then we can
figure out what the period of the functional iteration is. Formally, this works because X2λ = Xλ ,
where X is the functional iteration and λ is the period.
Algorithm 2.4 Floyd’s cycle-finding algorithm
Input: function f and start-value x0
Output: halts when a cycle is found
function FloydCycle(f, x0 )
x ← f (x0 ), y ← f (f (x0 ))
3:
while x ≠ y do
4:
x ← f (x)
5:
y ← f (f (y))
1:
2:
Christopher Koch
10
2
INTEGER FACTORIZATION
2.4.3
2.4
Pollard’s ρ Method
Pollard’s ρ Method
While trial division guarantees to find all divisors of a given integer n, it does so very slowly. On
the other hand, Pollard’s ρ method executes quickly, but it may not find any divisors of n and fail.
Suppose n is the integer to be factored and p is a prime divisor of n. Pollard’s ρ method works by
trying to find cycles modulo p in a sequence of pseudorandom numbers modulo n.
But why is a cycle modulo p important? Let’s say we have found two elements in the sequence
where xi ≠ xj and xi ≡ xj (mod p). This implies that xi − xj is a multiple of p. We have that n is
also a multiple of p, so gcd(∣xi − xj ∣, n) ≥ p. Thus, by finding a cycle modulo p in the sequence, we
have found a non-trivial divisor of n.
We also need to generate a sequence of pseudorandom numbers modulo n. This is generally done
by using f (x) = x2 + b (mod n) for b ∈ Z/nZ ≠ 0, 2. We assume that every output that f (x) could
give has equal probability; i.e. even though f (x) clearly seems to define a dependence, we assume
that the probability of each output is the same. Using f as defined previously is only based on
empirical results, whereas b = 0, 2 did not produce a uniform distribution.
The method, of course, takes its name from the ρ shape that it induces with the sequence. Pollard
created it in 1975 [8], calling it a Monte Carlo approach to factorization and saying in the paper
that he considered naming it the ρ method. Evidently, that name was used for it eventually anyway.
Algorithm 2.5 Pollard’s ρ method (Monte Carlo factorization)
Input: an integer n
Output: a non-trivial divisor of n or failure
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
function PollardRho(n)
x ← 2, y ← 2, g ← 1
while g = 1 do
x ← x2 − 1 (mod n)
2
y ← (y 2 − 1) − 1 (mod n)
g ← gcd(∣x − y∣, n)
if g = n then
return failure
else
return g
2.4.4
▷ g must be a divisor of n
Analysis
The analysis of Pollard’s ρ method relies on finding the expected number of iterations needed to
find a congruency modulo p in the functional iteration modulo n.
Keeping in mind that we assume that the output of f is a pseudorandom number modulo n, we
can draw an analogy to the birthday paradox problem: How many people need to be in a room for
there to be a probability of m that at least two people have the same birthday?
Usually, the problem is posed with m = 1/2.
If we replace some terminology, we can rephrase the problem this way: How many iterations do we
Christopher Koch
11
2
INTEGER FACTORIZATION
2.4
Pollard’s ρ Method
need to draw from f for two elements to have the same remainder modulo p with probability m?
Doing this analysis [9, Chapter 5.4] [16], we will find that we can find a cycle after approximately
1√
1 − 8p ln(1 − m)
2
√
iterations. Thus, for probability 1/2, we have that we need approximately 1.177 p iterations to find
a cycle modulo p.
√
In fact, this analysis leads us to conclude that one can always find a cycle in θ( p) steps for any
probability m.
Christopher Koch
12
3
3
STRONG PRIMES
Strong Primes
Cryptographically, some methods of integer factorization introduce a concept of strong primes. For
certain cryptographical schemes, it is desirable to choose primes that are hard to factor in most
methods of integer factorization. It is then of advantage that some methods of integer factorization
are either slow or work well only for primes with special properties, but never both. A strong prime
is then a prime that does not possess any of these special properties and can thus only be factored
slowly.
For example, one of the conditions for a prime p to be strong is if p − 1 has large prime factors,
due to Pollard’s p − 1 algorithm. Similarly, Williams’ p + 1 algorithm means that p + 1 needs to have
large factors for p to be a strong prime.
However, it is argued by some, such as Ronald Rivest, that strong primes bring no additional
security to a cryptographical scheme, because fast methods such as Lenstra’s elliptic curve method
do not rely on any special properties and thus, will outperform algorithms like Pollard’s p − 1 in
any case [17].
Christopher Koch
13
4
REFERENCES
4
References
[1] Euclid of Alexandria, “Elements of geometry,” in Book VII, J. Heiberg and R. Fitzpatrick, Eds.
Richard Fitzpatrick, Dec. 2007. [Online]. Available: http://farside.ph.utexas.edu/euclid.html
[2] C. F. Gauß, Disquisitiones Arithmeticae. Berlin, Germany: Springer, 1801 / 1986, translation
by Arthur A. Clarke and William C. Waterhouse.
[3] J. van Leeuwen, Ed., Handbook of Theoretical Computer Science.
Elsevier, 1990, vol. A.
[4] H. Mania, Gauß – Eine Biographie (Gauß – A Biography).
2009.
Amsterdam, Netherlands:
Hamburg, Germany: Rowohlt,
[5] R. L. Rivest, A. Shamir, and L. M. Adleman, “A method for obtaining digital signatures and
public-key cryptosystems,” CACM, vol. 21, no. 2, pp. 120–126, Feb. 1978. [Online]. Available:
http://doi.acm.org/10.1145/359340.359342
[6] L. Fortnow and S. Homer, “A short history of computational complexity,” Bulletin of the
European Association for Theoretical Computer Science, vol. 80, pp. 95–133, 2002. [Online].
Available: http://people.cs.uchicago.edu/~fortnow/papers/history.pdf
[7] C. F. Cid, “Cryptanalysis of RSA: A survey,” 2003. [Online]. Available: http:
//www.sans.org/reading-room/whitepapers/vpns/cryptanalysis-rsa-survey-1006
[8] J. M. Pollard, “A monte carlo method for factorization,” BIT Numerical Mathematics,
vol. 15, no. 3, pp. 331–334, 1975. [Online]. Available: http://dx.doi.org/10.1007/BF01933667
[9] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 2nd ed.
Boston, MA: McGraw-Hill, 2001.
[10] J. M. Pollard, “Theorems on factorization and primality testing,” Mathematical Proceedings
of the Cambridge Philosophical Society, vol. 76, pp. 521–528, Nov. 1974. [Online]. Available:
http://journals.cambridge.org/article_S0305004100049252
[11] H. W. Lenstra Jr., “Factoring integers with elliptic curves,” Annals of mathematics, vol. 126,
no. 3, pp. 649–673, 1987.
[12] P. Bachmann, Zahlentheorie – Zweiter Teil.
Stuttgart, Germany: B. G. Teubner, 1894,
reprinted by Johnson Reprint, New York, NY, 1968.
[13] L. Euler, “Theorematum quorundam ad numeros primos spectantium demonstratio,”
Commentarii academiae scientiarum Petropolitanae, vol. 8, pp. 141–146, 1741, english
translation available. [Online]. Available: http://eulerarchive.maa.org/pages/E054.html
[14] A. V. Sutherland, “18.783 elliptic curves – lecture notes #12,” MIT Course Website, Mar.
2013, http://math.mit.edu/classes/18.783/LectureNotes12.pdf.
[15] D. E. Knuth, The Art of Computer Programming - Seminumerical Algorithms, 3rd ed. Reading, MA: Addison-Wesley, 1997.
[16] C. Koch, “Integer factorization methods,” Apr. 2014, presentation given in a seminar
class at New Mexico Tech (CSE489/589). [Online]. Available: https://cs.nmt.edu/~ckoch/
presentation-factorization-notes.pdf
Christopher Koch
14
4
REFERENCES
[17] R. Rivest and R. Silverman, “Are ’strong’ primes needed for rsa,” Cryptology ePrint Archive,
Report 2001/007, 2001, http://eprint.iacr.org/.
Christopher Koch
15