The PRIME Problem
Sanjeet Tiwana
Computer Science and
Mathematics
Session 2002/2003
The candidate confirms that the work submitted is their own and the appropriate
credit has been given where reference has been made to the work of others.
I understand that failure to attribute material which is obtained from another source
may be considered as plagiarism.
(Signature of student) _______________________________
Summary
The problem of trying to determine if a given number is prime is an ancient problem, which
has riddled some of the world’s greatest minds for centuries. This problem has been
investigated since the days of the famous mathematicians Euclid (ca. 325 – ca. 270 BC) and
Eratosthenes (ca. 248 – ca. 192 BC) due to the fundamental importance of prime numbers in
number theory. But it is only recently that computer scientists have also developed a
fascination with prime numbers, due to their use in encryption algorithms such as the RSA
algorithm. All such systems presently use probabilistic primality-testing algorithms. This is
mainly due to the fact that these probabilistic primality-testing algorithms can be used to obtain
extremely fast and confident results. No deterministic primality-testing algorithm could have
been used in such systems since, until recently, no one had been able to devise an
unconditional deterministic primality-testing algorithm with polynomial time
complexity.
It was on 6th August 2002 that three Indian computer scientists, M. Agrawal, N. Kayal, and N.
Saxena, distributed a report “PRIMES is in P” which contained a deterministic polynomial
time primality-testing algorithm. This algorithm was an incredible discovery, and it solved an
age-old question of whether primality could be tested in polynomial time.
This report contains details of how this project has dealt with the problem of
understanding and evaluating this deterministic primality-testing algorithm, as well as
determining its possible impact on the field of cryptography. The project
has also examined other primality-testing algorithms and techniques and has explored the
computational issues associated with them, and also with the factorisation problem,
another problem faced by computer scientists which, if solved, would have a
tremendous immediate impact in the field of cryptography. The report also contains a
description of possible future work and enhancements that could be made to this area of
research.
Acknowledgements
This project could not have been a success without the help, constant support and guidance of the
project's supervisor, Prof. Martin Dyer.
I would therefore like to take this opportunity to thank Prof. Martin Dyer for all of his
assistance, without which I would have encountered countless difficulties in
successfully completing this project.
Table of Contents
Chapter 1
1.1 Introduction to the problem……………………………………………………………..…1
1.2 Statement of the Problem……………………………………………………………….…2
1.3 Objectives………………………………………………………………………...………. 2
1.4 Minimum requirements…………………………………………………………………... 3
1.5 Possible enhancements…………………………………………………………………. ...3
1.6 Relevance to degree program……………………………………………………………...3
1.7 Solution to the problem……………………………………………………………………3
Chapter 2
2.1 Summary of press coverage of the AKS algorithm.……………………………………….6
Chapter 3
Basic principles
3.1 Number Theory. …………………………………………………………………………..8
3.2 Group and Field Theory…………………………………………………………………..12
3.3 Time Complexity Theory…………………………………………………………………13
3.4 Other Important Definitions………………………………………………………………13
Chapter 4
The AKS algorithm
4.1 Basis of the AKS algorithm………………………………………………………………16
4.2 Proof of Correctness……………………………………………………………………...17
4.3 Time Complexity Analysis……………………………………………………………….21
Chapter 5
Sieve of Eratosthenes
5.1 Introduction…………… ……………………………………………………………… ..23
5.2 Experimental Analysis……………………………………………………………………23
Chapter 6
Miller-Rabin algorithm
6.1 Introduction…………… ………………………………………………………………...25
6.2 Experimental Analysis……………………………………………………………………26
Chapter 7
7.1 The impact of the AKS algorithm in the field of cryptography.………………………… 28
Chapter 8
Brief description and investigation of the factorisation problem. ……………………………29
Chapter 9
9.1 Conclusion. …………………………………………………………………………… ...30
9.2 Future Work ……………………………………………………………………………...30
Appendix A
Personal Reflection………………………………...…………………………………………31
Appendix B
1.1 Sieve of Eratosthenes Experimental Analysis Results…...………………………………32
2.1 Miller-Rabin Experimental Analysis Results……………………………………………35
REFERENCES………………………………………………...…………………………… 38
Chapter 1
This chapter gives an introduction to the problem of determining primality, and outlines some
previous attempts made to solve the problem.
1.1 Introduction to the problem
A number is said to be prime if it is greater than 1 and its only positive factors are 1 and
itself. Since a number can be defined to be prime so simply, it can be difficult to
comprehend exactly why mathematicians have struggled for centuries to devise
methods of establishing whether or not a number is prime.
The theoretical concepts of prime numbers have interested many ancient mathematicians such
as Euclid (ca. 325 – ca. 270 BC) [1, 86] and Eratosthenes (ca. 248 – ca. 192 BC) [2, 55], since prime
numbers are of fundamental importance in number theory. Prime numbers are the building
blocks of integers, as all integers (greater than 1) are either prime or can be expressed as a
product of primes.
There are many unanswered questions regarding prime numbers. Ever since it was established
that there are infinitely many primes, hordes of people have been trying to get into the
record books by finding the largest prime number yet known. Although they realise that
they might not stay in the record books for long, this realisation still does not seem to lessen
their eagerness to find such numbers. At present the largest known prime is 2^13466917 − 1, which
was discovered by twenty-year-old Michael Cameron from Canada on 14/11/2001 [3]. The
main questions surrounding primes have been related to their distribution, and whether a
formula could be developed which could verify a number to be prime.
As well as mathematicians, computer scientists have also developed an interest in prime
numbers, due to the use of prime numbers in cryptography. Cryptosystems such as the RSA
system depend on secret prime numbers. Thus it has become immensely important for
computer scientists as well as mathematicians to be able to determine accurately and
efficiently whether a given number is prime.
These tests, which are used to determine whether or not a particular number is prime, are called
primality tests. At present probabilistic primality-testing algorithms are the ones most generally used in
cryptosystems. Probabilistic primality-testing algorithms such as the Miller-Rabin algorithm
determine whether a number is prime with an arbitrarily small probability of
error, e.g. less than 2^−100. Although probabilistic primality tests can be used to obtain extremely
fast and confident results, there is still obviously a small chance of error, which cannot be
completely disregarded.
People have dedicated vast amounts of their lives to the study of prime numbers, but
unfortunately not all of these efforts have been successful. Many have worked on and devised
algorithms for deterministic primality testing, from the ancient Chinese and Greeks to
Adleman, Pomerance, and Rumely, who in 1983 developed a deterministic primality-testing
algorithm with running time (log n)^O(log log log n) [4, 1]. But until recently, no one had been
able to devise an unconditional deterministic primality-testing algorithm with
polynomial running time.
It was on 6th August 2002 that three Indian computer scientists, M. Agrawal, N. Kayal, and N.
Saxena, distributed a report “PRIMES is in P” [4] which contains a deterministic polynomial-time algorithm (now known as the AKS algorithm) they devised that is able to test whether a
given input n is prime or composite. This result has tremendous significance, since not only
have these scientists solved a problem that many intellects have been trying to solve for
centuries but also because this result may pave the way to solving other such problems such
as the factorisation problem and it may also have an impact in the field of cryptography.
1.2 Statement of the problem
The aim of this project is to understand the problem and relevance of distinguishing between
prime and composite numbers.
1.3 Objectives
The project's objectives were defined as follows:
1. To understand the underlying mathematical and computational principles involved in
primality testing.
2. To understand the computational problems that arise in primality testing.
3. To establish the importance of primality testing.
4. To investigate the use of primality testing in different fields.
5. To investigate the report written by Agrawal, Kayal, and Saxena. “PRIMES is in P”
6. To establish the importance of the algorithm developed by Agrawal, Kayal, and
Saxena.
7. To investigate other possible outcomes the algorithm may have in various fields.
1.4 Minimum Requirements
From the objectives the minimum requirements for the project were identified to be:
1. To understand the computational issues related to primality testing.
2. To write an account of the issues related to primality testing with particular reference
to the report “PRIMES is in P” by Agrawal, Kayal, and Saxena.
3. To implement at least one primality testing algorithm and to study it experimentally.
The deliverables of the project were identified to be:
1. A written account of issues related to primality testing with particular reference to the
report “PRIMES is in P” by Agrawal, Kayal, and Saxena.
2. One or more implemented primality-testing algorithms, and the results of studying them
experimentally.
1.5 Possible Enhancements
If the minimum requirements were exceeded, then the possible enhancements were identified
as being:
1. To investigate the relevance of primality testing in cryptography.
2. To briefly investigate and describe the factorisation problem.
3. To briefly investigate and describe the discrete logarithm problem.
1.6 Relevance to degree program
The project involved applying as well as learning many new computational and mathematical
techniques. It required understanding and analysing algorithms by building on knowledge
developed in previously studied modules such as COMP1360 Introduction to Algorithms
and Data Structures, COMP2360 Theory of Computation, and COMP3370 Algorithms and
Complexity. The project also involved developing and using skills learnt in modules
which involved programming, such as COMP2650 Object Oriented Programming and
COMP2660 Software Project Management.
1.7 Solution to the problem
A solution to the problem was developed through the following stages:
1.7.1 Feasibility study
The feasibility study involved reading through the report “PRIMES is in P” by Agrawal,
Kayal, and Saxena, in order to decide whether or not the subject of the project was
appropriate for a student studying computer science and mathematics, and whether it could be
carried out realistically within the given time scale with all of the available resources.
1.7.2 Project management
The next stage of the project involved deciding on the project's aims, objectives, minimum
requirements, deliverables and further enhancements. After this all milestones were identified
and a schedule was decided on, stating the order in which each milestone had to be met and by what
date. The following project schedule was produced:
Date        Work to be done
14/02/03    Complete background research of the problem
28/02/03    Complete the investigation of the report “PRIMES is in P”
07/03/03    Start preliminary study of other existing primality testing algorithms in view of implementing them
14/03/03    Complete and submit table of contents and draft chapter
21/03/03    Complete write-up of the investigation of the report “PRIMES is in P”
04/04/03    Complete the preliminary study of other existing primality testing algorithms and their implementation
02/05/03    Complete and submit final report with deliverables
As shown above, all the minimum requirements and deliverables were scheduled to be
completed in such a way that there would be ample time to review the work and/or to allow
extra time to complete any unfinished tasks, and also to allow for the possibility of making
further enhancements to the project.
This project schedule was then reviewed and modified when the table of contents was decided
on, to allow some extra time for the investigation of the report “PRIMES is in P”. This was
primarily due to some of the changes that were made in the report by the inventors of the
algorithm.
1.7.3 Production of Deliverables
The project was successfully completed by producing all of the required deliverables on time,
with additional enhancements.
Chapter 2
This chapter describes the impact that the AKS algorithm has had among Computer Scientists
and Mathematicians worldwide and the media coverage it has received.
2.1 Summary of press coverage of the AKS algorithm
The “PRIMES is in P” report [4], produced by Professor Manindra Agrawal together with two of
his students, Neeraj Kayal and Nitin Saxena, from the Indian Institute of Technology in Kanpur,
has caused quite a stir in the fields of theoretical computer science and mathematics.
On August 6th 2002, they made available the report, which contained the solution to an
ancient problem that has puzzled some of the world’s greatest minds for centuries. In this
report they had devised a deterministic polynomial time algorithm, which could determine
whether or not a number was prime without any limitations and with one hundred percent
accuracy. More than 30,000 people worldwide had downloaded the paper within 24 hours of
it being made available online on August 7th 2002 [5].
The algorithm developed by Agrawal, Kayal and Saxena, now known as the AKS algorithm
is truly a remarkable achievement. It is not at all surprising that the AKS algorithm and its
developers have received such tremendous worldwide recognition.
As described in the New York Times [6] and the Times of India [7], the development of this
amazing algorithm has caused a huge amount of excitement among uncountable people
working and researching in this area. Even some of the world’s most renowned
mathematicians and computer scientists such as Dr. Carl Pomerance at Bell Labs, exhibited
great enthusiasm over the development of AKS algorithm.
Dr. C. Pomerance, who has himself vastly researched this field, was among those who
were emailed a draft version of the paper containing the result by the Indian scientists. On the
very day of receiving this paper, Dr. Pomerance established its correctness and arranged an
impromptu seminar to discuss the result that very afternoon. Dr. Pomerance's reaction of
holding a seminar at such short notice was justified by him stating it was “a measure of how
wonderfully elegant” this algorithm was and by him describing the AKS algorithm as being
“beautiful”.
Shafi Goldwasser, who is a professor of computer science at the Massachusetts Institute of
Technology and the Weizmann Institute of Science in Israel, was also among the many who
have exhibited great appreciation of the groundbreaking algorithm. The professor described it
as being “the best result heard of in over ten years”.
The article in the New York Times [6] also elaborates on the point that despite the vital role
of primality testing in encryption systems, the AKS algorithm [6] “has no immediate
applications, since existing ones are faster and their error probability can be made so small
that it is practically zero.”
These “existing” algorithms are probabilistic primality-testing algorithms, such as the Miller-Rabin primality test, which has polynomial time complexity; the deterministic variant of
Miller's test relies upon the validity of the unproved Extended Riemann Hypothesis.
But despite the algorithm not having an immediate practical use, the excitement surrounding
the result has not at all been dampened, as it is believed that subsequent improvements and
refinements to the algorithm will make it more practical.
Chapter 3
This chapter describes some of the basic mathematical and computational principles needed
to understand primality testing algorithms such as the AKS algorithm and the Miller-Rabin
algorithm.
3. Basic Principles
3.1 Number Theory
Number theory is a branch of mathematics in which different types of numbers, and the
relationships between these numbers, are studied. Prime numbers form the central concept of
number theory, since prime numbers are the building blocks of all other numbers. In
order to define prime numbers more formally it is necessary to know the following definition.
3.1.1 Definition
“Let a, b be integers. We say that b is divisible by a, or that a divides b, if there exists c ∈ Z
such that ac = b. We then call a a factor or divisor of b and b a multiple of a and write a|b.”
[8, 19]
For example 4|64, since 4 × 16 = 64, and −9|81, since −9 × −9 = 81. But 3 does not divide 2 in Z
since there is no c ∈ Z such that 3c = 2.
From the above definition of divisibility, prime numbers can now be defined as follows:
3.1.2 Definition
“The integer p is prime iff (i) p ≠ −1, 0, 1 and (ii) the only divisors of p in Z are −1, 1, p, and
−p.” [8, 19]
All integers greater than 1 which are not prime are said to be composite.
For example the first five prime numbers are 2, 3, 5, 7, 11 and the first five composite
numbers are 4, 6, 8, 9, and 10.
As well as prime numbers, relatively prime numbers are also immensely significant in
number theory and in computer science due to their use in cryptosystems such as the RSA
algorithm. So it is necessary to understand what coprime numbers are and how they can be
calculated efficiently.
3.1.3 Definition
Two integers are said to be relatively prime or coprime iff they share no positive factors
except 1. Two coprime numbers a, b are denoted a ⊥ b.
An important property of coprime numbers is that for every coprime pair a, b,
gcd(a, b) = 1. It can be established efficiently whether or not a pair of numbers is coprime by
calculating gcd(a, b) using the Euclidean algorithm, as the sketch below illustrates.
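As an illustration, the Euclidean algorithm takes only a few lines of C++ (a minimal sketch; the function name euclideanGcd is chosen here purely for illustration):

    // Computes gcd(a, b) by the Euclidean algorithm, using gcd(a, b) = gcd(b, a mod b).
    // a and b are assumed to be non-negative and not both zero.
    long long euclideanGcd(long long a, long long b)
    {
        while (b != 0) {
            long long r = a % b;   // remainder of a divided by b
            a = b;
            b = r;
        }
        return a;                  // gcd(a, 0) = a
    }

Two numbers a and b are then coprime precisely when euclideanGcd(a, b) returns 1; the number of iterations is O(log min(a, b)), which is why the test is efficient.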
Now after establishing what prime numbers are, it is important for the purpose of primality
testing to establish how many prime numbers exist and whether or not they are distributed
evenly.
The first problem, of determining how many prime numbers exist, was solved by Euclid [1,
86]. Euclid proved that there exist infinitely many primes. This theorem is known as
Euclid's second Theorem. But before proving this theorem it needs to be shown that any
integer greater than 1 can be expressed as a product of primes.
3.1.4 Theorem (modified from [10, 22])
If a is any integer greater than 1, a can be expressed as a product of positive prime numbers.
Proof
If a = 2, then the statement is obviously true.
Let the statement be true for all a ≤ n − 1.
Now considering n:
Case 1: n is prime.
In this case the statement obviously holds.
Case 2: n is composite.
In this case n has factors other than −1, 1, −n, and n.
Therefore let n = lm where l and m are both positive and neither l nor m is equal to 1 or
n. Then 1 < l < n and 1 < m < n. Therefore by the induction hypothesis both l and m are products of positive primes ⇒ n
is a product of positive prime numbers.
The above result can now be used to prove Euclid’s second Theorem.
3.1.5 Theorem (modified from [8, 25-26])
There exist infinitely many primes.
Proof
The statement can be proved by contradiction, assuming that there only exist a finite number
of primes.
Let p1, p2, …, ps be all the finitely many primes.
Then let n be the number n = p1·p2·…·ps + 1.
Since any number greater than 1 can be expressed as a product of primes, n can be
factorised as n = q1·q2·…·qr.
Now since q1 must be in the complete list of primes p1, p2, …, ps, let q1 = pi
where 1 ≤ i ≤ s.
Then q1|n and q1|(n − 1), since n − 1 = p1·p2·…·ps.
Therefore q1|n − (n − 1), i.e. q1|1.
But q1 cannot divide 1 since q1 is prime, therefore the initial assumption is false.
Hence there exist infinitely many primes.
So now, after establishing that there are infinitely many prime numbers, it would be very useful
if there were some method to determine how these prime numbers are distributed, since if
they were distributed regularly then there might be some formula for expressing a
prime number. But unfortunately this is not the case. The distribution of prime numbers can
be explained by the prime number theorem, which uses the function π(x) defined below.
3.1.6 Definition
The function π(x) where x is a positive real number denotes the number of primes not
exceeding x.
Examples of values of π(x) can be seen in section 1.1.2 of appendix B.
3.1.7 Theorem (modified from [1, 71])
The Prime Number Theorem. As x approaches infinity, the ratio of π(x) to x /log x approaches
1.
This theorem has been used in the experimental analysis of the Sieve of Eratosthenes
algorithm in chapter 5.
The next definition, of congruence, allows us to work with divisibility relationships, and is used
as the basis of many important theorems such as Fermat's little theorem.
3.1.8 Definition
Two integers a, b are called congruent modulo the integer m if and only if m divides (a − b),
i.e. a ≡ b (mod m) if and only if m|(a − b); otherwise they are said to be incongruent modulo m.
The above definition can now be used to prove Fermat's little theorem, which not only forms
the basis of the identity behind the AKS algorithm but also forms the basis of many
probabilistic primality-testing algorithms, such as the Miller-Rabin algorithm, which are presently
being used in encryption systems.
3.1.9 Theorem
Fermat's Little Theorem. If p is a prime number and a is a positive number such that gcd(p, a)
= 1, then a^(p−1) ≡ 1 (mod p).
Proof
The above statement can be rewritten as a^p ≡ a (mod p), i.e. p | (a^p − a).
This can be proved by using induction on the size of a.
Let a = 1; then 1^p ≡ 1 (mod p), therefore the statement holds for a = 1.
Now assume that the statement is true for a = k, i.e. k^p ≡ k (mod p), and consider k + 1:
(k + 1)^p = k^p + C(p, 1)·k^(p−1) + C(p, 2)·k^(p−2) + …… + C(p, p−1)·k + 1 (using the binomial
expansion)
⇒ (k + 1)^p ≡ k^p + 0 + 0 + …… + 1 (mod p)
since p|C(p, r) if 1 ≤ r ≤ p − 1.
⇒ (k + 1)^p ≡ k^p + 1 (mod p), but by the induction hypothesis k^p ≡ k (mod p)
∴ (k + 1)^p ≡ k + 1 (mod p).
Hence by mathematical induction the result is true for all a.
Now although Fermat's little theorem appears to be a simple and good way of testing for
primality, there are certain composite numbers which also pass the test.
For example, consider the number 341. Since 11 divides 341 it obviously cannot be prime.
But if we apply Fermat's little theorem to it with a = 2 we get 2^341 ≡ 2 (mod 341) ⇒ 341
passes Fermat's test even though it is not prime! But if we apply the test with a = 7, it
is seen that 7^341 ≠ 7 (mod 341), so with a = 7 we get the expected result, as the sketch below verifies.
It has been shown that although such numbers are rare, there are infinitely many composite
numbers which pass Fermat's test by behaving like primes for specific values of a,
and these numbers are called pseudoprimes.
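The example above can be checked mechanically using fast modular exponentiation. The following C++ sketch (the function name powMod is hypothetical) computes a^e mod m by repeated squaring and applies the Fermat test to 341 with the bases 2 and 7:

    #include <iostream>

    // Computes (base^exp) mod m using O(log exp) multiplications (repeated squaring).
    unsigned long long powMod(unsigned long long base, unsigned long long exp,
                              unsigned long long m)
    {
        unsigned long long result = 1 % m;
        base %= m;
        while (exp > 0) {
            if (exp & 1)
                result = (result * base) % m;  // multiply in the current bit of exp
            base = (base * base) % m;          // square for the next bit
            exp >>= 1;
        }
        return result;
    }

    int main()
    {
        // Fermat test: if n were prime then a^n ≡ a (mod n) for every base a.
        std::cout << powMod(2, 341, 341) << std::endl;  // prints 2: 341 passes base 2
        std::cout << powMod(7, 341, 341) << std::endl;  // prints a value other than 7
        return 0;
    }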
3.1.10 Definition
A pseudoprime to the base a is a composite integer n which, for a positive integer a with
gcd(a, n) = 1, satisfies the equation a^n ≡ a (mod n).
So a pseudoprime can be shown to be composite by repeated application of Fermat's little
theorem with different bases. But if we apply this procedure to the composite number 561, it can be observed that
561 satisfies the equation a^561 ≡ a (mod 561) for all values of a such that gcd(a, 561) = 1. This
leads to the following definition.
3.1.11 Definition (adapted from [11, 207])
A Carmichael number or absolute pseudoprime is a composite integer n which satisfies the
equation a^n ≡ a (mod n) for all positive integer values of a where gcd(a, n) = 1.
As described in [11, 208], Carmichael numbers were discovered by Robert Carmichael in the
early part of the 20th century, and it was only shown as recently as 1992 that there are actually
infinitely many Carmichael numbers.
3.2 Group and field theory
Basic group and field theory has been used in the AKS algorithm.
3.2.1 Definition
A group comprises a set G together with an operation *, which together satisfy the
four properties of closure, identity, inverses and associativity.
3.2.2 Definition (modified from [17, 31])
A field f is a set which carries a multiplication and an addition operation satisfying
associativity and commutativity of multiplication and addition, the distributive law,
the existence of an additive identity 0 and a multiplicative identity 1, and multiplicative inverses
for all elements in the set except 0.
An example of a field is Z/pZ, which is the field of integers modulo a prime number p [17, 31].
The following definitions are related to the time complexity analysis of algorithms.
3.3 Time complexity theory
3.3.1 Definition
A Turing machine is an abstract model of computer execution and storage.
3.3.2 Definition
“A Turing Machine M is said to have time complexity T(n) if whenever M is given input w of
length n, M halts after making at most T(n) moves, regardless of whether or not M
accepts.”[11, 414]
3.3.3 Definition
“A language L is in class P if there is some polynomial T(n) such that L = L(M) for some
deterministic Turing machine M of time complexity T(n).”[11, 414].
3.3.4 Definition
“A language L is in the class NP (nondeterministic polynomial) if there is a nondeterministic
Turing Machine M and a polynomial time complexity T(n) such that L = L(M), and when M is
given an input of length n there are no sequences of more than T(n) moves of M.”[11, 419]
3.3.5 Definition (adapted from [9, 881-887])
The Θ notation is used to bound a function asymptotically from above and below. It is a
notation used to describe the running time of an algorithm, which can be
established by inspecting the structure of the algorithm. Two one-sided variants are used:
1. O-notation: this gives an asymptotic upper bound on a function.
2. Ω-notation: this gives an asymptotic lower bound on a function.
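More formally (these are the standard definitions, stated here for completeness):
f(n) = O(g(n)) iff there exist constants c > 0 and n0 such that f(n) ≤ c·g(n) for all n ≥ n0;
f(n) = Ω(g(n)) iff there exist constants c > 0 and n0 such that f(n) ≥ c·g(n) for all n ≥ n0;
f(n) = Θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n)).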
Other definitions and lemmas related directly to the analysis of the AKS algorithm are stated
below.
3.4 Other Important Definitions
All of the following definitions and lemmas are required in order to establish the correctness
and time complexity of the AKS algorithm, which has been explained in the next chapter.
3.4.1 Definition
The greatest integer (floor) function ⌊x⌋ gives the largest integer less than or equal to x.
For example ⌊1.4⌋ = 1 and ⌊−2.6⌋ = −3.
3.4.2 Definition
The least integer (ceiling) function ⌈x⌉ gives the smallest integer greater than or equal to x.
For example ⌈1.4⌉ = 2 and ⌈−2.6⌉ = −2.
3.4.3 Definition
A number is said to be a prime power if it is of the form p^k for some prime p and integer k ≥ 1.
3.4.4 Definition
Let n be a natural number and a an integer such that gcd(a, n) = 1. Then the order of a
modulo n is the smallest number k such that a^k ≡ 1 (mod n). It is denoted by o_n(a).
For example the order of 2 modulo 7 is 3, since 2^1 = 2, 2^2 = 4, and 2^3 = 8 ≡ 1 (mod 7).
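For the small moduli needed later (the parameter r of the AKS algorithm), the order can be computed naively, as in the following C++ sketch (the function name is illustrative, and euclideanGcd is the function sketched in section 3.1):

    // Returns the smallest k >= 1 with a^k ≡ 1 (mod n), or 0 if gcd(a, n) != 1.
    // Takes at most n - 1 steps, which is acceptable for small moduli n.
    long long multiplicativeOrder(long long a, long long n)
    {
        a %= n;                                     // a and n assumed positive, n >= 2
        if (euclideanGcd(a, n) != 1) return 0;      // order only defined when gcd(a, n) = 1
        long long value = a;
        for (long long k = 1; k < n; ++k) {
            if (value == 1) return k;               // smallest k with a^k ≡ 1 (mod n)
            value = (value * a) % n;
        }
        return 0;                                   // never reached when gcd(a, n) = 1
    }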
3.4.5 Definition
Let n be a natural number; then Euler's totient function Φ(n) is the number of numbers less
than n that are relatively prime to n.
Now it can be seen from a theorem of Euler [12, 4-5], generalising Fermat's Little
Theorem, that a^Φ(n) ≡ 1 (mod n). From this it can easily be derived that for any a, n such that
gcd(a, n) = 1, o_n(a) | Φ(n).
3.4.6 Definition (adapted from [14, 20-21])
A cyclotomic polynomial is a polynomial given by Φ_r(x) = ∏_k (x − ζ_k), where k ranges over 1, …, r with
gcd(k, r) = 1, and ζ_k = e^(2πik/r) are the r-th roots of unity in C.
Φ_r(x) is an integer polynomial (i.e. all coefficients of the x terms are integers) and an
irreducible polynomial (i.e. a polynomial which cannot be factorised in the same field), with
degree Φ(r).
The first three cyclotomic polynomials are Φ_1(x) = x − 1, Φ_2(x) = x + 1, and Φ_3(x) = x^2 + x + 1.
Since x^r − 1 is a polynomial in only one variable x, its roots are also called its zeros [14, 378]. The r-th roots of unity
are precisely the zeros of the polynomial x^r − 1, and similarly the primitive r-th roots of unity
are precisely the zeros of the r-th cyclotomic polynomial Φ_r(x).
Now since every r-th root of unity is a primitive d-th root of unity for exactly one positive
divisor d of r [15], this implies that x^r − 1 = ∏_{d|r} Φ_d(x).
This formula illustrates that:
3.4.7 Lemma
The r-th cyclotomic polynomial Φ_r(x) divides x^r − 1. Over the field f_p, Φ_r(x) factorises into
irreducible factors of degree o_r(p) [4, 5]. Letting h(x) be one such irreducible factor, x is a
primitive r-th root of unity in the field f_p[x]/h(x).
3.4.8 Definition
The number m ∈ N is said to be “introspective” [4, 4] for a polynomial f(x) if [f(x)]^m ≡ f(x^m)
(mod x^r − 1, p).
Furthermore it can easily be shown that introspective numbers are closed under multiplication,
and that for each introspective number m ∈ N, the set of polynomials for which m is introspective is
also closed under multiplication.
These two properties give rise to the following lemma.
3.4.9 Lemma
Every number in I = {n^i · p^j | i, j ≥ 0} is introspective for every polynomial in
P = {∏_{a=1..l} (x + a)^(e_a) | e_a ≥ 0}.
Proof
The proof of this lemma is immediate from the two properties stated above.
Chapter 4
This chapter analyses the correctness and the time complexity of the AKS algorithm.
4. Deterministic primality testing algorithm (AKS algorithm)
The AKS algorithm can determine conclusively whether or not a number is prime. The basic
idea behind the algorithm is a version of Fermat’s Little Theorem. All the proofs in this
chapter have been established by modifying and elaborating on the proofs given in the report
containing the algorithm [4].
4.1 Basis of the AKS algorithm
The following lemma forms the basis of the AKS algorithm
4.1.1 Lemma: Suppose that a is an integer, and n is a natural number with n ≥ 2 such that a and n are
coprime (i.e. gcd(a, n) = 1). Then n is prime iff:
(x + a)^n ≡ (x^n + a) (mod n)      (1)
Proof:
We need to show that (x + a)^n ≡ (x^n + a) (mod n) is true for all prime numbers n
and false for all composite numbers n. Let 0 < i < n.
(x + a)^n ≡ (x^n + a) (mod n) is true iff n | ((x + a)^n − (x^n + a));
now (x + a)^n − (x^n + a) = x^n + n·x^(n−1)·a + …… + a^n − (x^n + a), by using the binomial
expansion.
Therefore the coefficient of x^i in (x + a)^n − (x^n + a) is C(n, i)·a^(n−i).
(i) Let n be prime.
If n is prime then the coefficients of all the x^i terms are ≡ 0 (mod n), since C(n, i) ≡ 0 (mod n) for 0 < i < n.
Also a^n − a ≡ 0 (mod n), using Fermat's Little Theorem.
∴ The expression is identically zero over Z_n for prime numbers.
(ii) Let n be composite.
Let q be a prime factor of n and let k be such that q^k || n (i.e. q^k | n but q^(k+1) does not divide n).
Then ∃b such that n = b·q^k, with q not dividing b.
Now consider the coefficient C(n, q) = (n(n−1)…(n−q+1))/q! of x^q in the expansion of (x + a)^n − (x^n + a).
Since n = b·q^k,
(n(n−1)…(n−q+1))/q! = (b·q^k (b·q^k − 1)…(b·q^k − q + 1))/(q·(q − 1)!)
= (b·q^(k−1) (b·q^k − 1)…(b·q^k − q + 1))/((q − 1)!)
Of the factors in the numerator only b·q^(k−1) carries powers of q, so q^k does not divide C(n, q);
and since gcd(a^(n−q), q) = 1, the coefficient C(n, q)·a^(n−q) is not divisible by n.
⇒ if n is composite then n does not divide ((x + a)^n − (x^n + a)).
Hence (x + a)^n ≡ (x^n + a) (mod n) iff n is prime.
4.1.2 Time analysis of (1)
In the worst possible case, n coefficients on the left-hand side would have to be evaluated, which
takes time Ω(n), i.e. the algorithm would have exponential running time in the length of n.
But this time was reduced in the AKS algorithm by evaluating both sides of equation
(1) modulo a polynomial of the form x^r − 1, for a small number of a's. So the following equation
was used instead of equation (1):
(x + a)^n ≡ x^n + a (mod x^r − 1, n)      (2)
(where r is an appropriately chosen small number).
It is immediate from equation (1) that all prime numbers satisfy equation (2).
Equation (2) takes less time to compute since the number of coefficients is reduced by
evaluating both sides of the equation modulo the polynomial x^r − 1.
But the drawback in using equation (2) is that for certain values of a and r some composite
numbers also satisfy the equation. This problem can be rectified, since it can be
shown that for an appropriately chosen r, n must be a prime power if (2) is satisfied for
several a's. So a deterministic polynomial time algorithm can be obtained, since the
appropriate a's and r are bounded by a polynomial in log n. The sketch below illustrates how
equation (2) can be evaluated.
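To make this concrete, the following C++ sketch (an illustrative, unoptimised reconstruction; the names are hypothetical) evaluates the left-hand side of equation (2) by representing a polynomial modulo x^r − 1 as a vector of r coefficients, so that x^i·x^j reduces to x^((i+j) mod r):

    #include <vector>
    typedef unsigned long long ull;
    typedef std::vector<ull> Poly;   // Poly[i] is the coefficient of x^i, 0 <= i < r

    // Multiplies two polynomials modulo (x^r - 1, n) in O(r^2) coefficient operations.
    // Assumes n < 2^32 so that coefficient products fit in 64 bits.
    Poly polyMulMod(const Poly& f, const Poly& g, ull n)
    {
        size_t r = f.size();
        Poly h(r, 0);
        for (size_t i = 0; i < r; ++i) {
            if (f[i] == 0) continue;
            for (size_t j = 0; j < r; ++j)
                h[(i + j) % r] = (h[(i + j) % r] + f[i] * g[j]) % n;  // uses x^r = 1
        }
        return h;
    }

    // Computes (x + a)^e modulo (x^r - 1, n) by repeated squaring.
    Poly powXPlusA(ull a, ull e, size_t r, ull n)
    {
        Poly base(r, 0), result(r, 0);
        base[0] = a % n;                       // base is the polynomial x + a
        base[1 % r] = (base[1 % r] + 1) % n;   // (1 % r handles the degenerate case r = 1)
        result[0] = 1;                         // result starts as the polynomial 1
        while (e > 0) {
            if (e & 1) result = polyMulMod(result, base, n);
            base = polyMulMod(base, base, n);
            e >>= 1;
        }
        return result;
    }

Equation (2) then holds for a given a exactly when powXPlusA(a, n, r, n) equals the polynomial x^(n mod r) + a with coefficients reduced modulo n.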
4.2 Proof of Correctness
The AKS algorithm is as follows [4, 3] (a sketch of the perfect-power test in line 1 is given after the listing):
Input: integer n > 1
1. If (n = a^b for a ∈ N and b > 1), output COMPOSITE.
2. Find the smallest number r such that o_r(n) > 4·log^2 n.
3. If 1 < gcd(a, n) < n for some a ≤ r, output COMPOSITE.
4. If n ≤ r, output PRIME.
5. For a = 1 to 2·sqrt(Φ(r))·log n do
       if ((x + a)^n ≠ x^n + a (mod x^r − 1, n)), output COMPOSITE.
6. Output PRIME.
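As a small illustration of line 1, n = a^b for some b > 1 can be detected by trying every exponent b up to log2(n) and testing whether the integer b-th root of n is exact. The C++ sketch below (names hypothetical) takes a floating-point root and then corrects it by exact integer checks:

    #include <cmath>
    typedef unsigned long long ull;

    // Returns true iff n = c^b for some integers c >= 2 and b >= 2 (line 1 of the algorithm).
    bool isPerfectPower(ull n)
    {
        for (ull b = 2; b < 64 && (1ULL << b) <= n; ++b) {    // c >= 2 forces 2^b <= n
            ull guess = (ull)(pow((double)n, 1.0 / b) + 0.5); // approximate b-th root
            for (ull c = (guess > 2 ? guess - 1 : 2); c <= guess + 1; ++c) {
                ull p = 1;
                bool overflow = false;
                for (ull i = 0; i < b && !overflow; ++i) {
                    if (p > n / c) overflow = true;           // p * c would exceed n
                    else p *= c;
                }
                if (!overflow && p == n) return true;         // n = c^b exactly
            }
        }
        return false;
    }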
In order to determine that the algorithm is correct it is necessary to show that the algorithm
returns PRIME if and only if the integer entered is prime.
This can be done in two stages by showing that if n is prime, the algorithm returns PRIME,
and if the algorithm returns PRIME then n is prime.
4.2.1 Lemma
If n is prime, the algorithm returns PRIME.
Proof
It is obvious that since n is a prime number, the if statements in lines 1 and 3 evaluate to
false, so the algorithm never returns COMPOSITE there, and by lemma 4.1.1 it has already been
shown that for each value of a the for loop in line 5 will also never return COMPOSITE. So
the algorithm will output PRIME in line 4 or line 6.
The remaining proof is based on the following properties:
1. ∃r such that r = O(log^3 n) with o_r(n) > 4·log^2 n.
2. Let p be a prime divisor of n, with p > r.
3. p, n ∈ Z_r*.
4. Let l = 2·sqrt(Φ(r))·log n.
The first property bounds the magnitude of r. This is essential, as an appropriate r has to be
found in line 2 of the algorithm. It has been shown in lemma 5.3 of [4] that such an r exists.
The second property relates to lines 3 and 4 of the algorithm. Here p has to be greater than r,
since if it were not, then lines 3 and 4 would already determine whether n was prime or composite.
Now for the same reason it is obvious that n and r are coprime, and since p is a prime divisor
of n ⇒ p, n ∈ Z_r*. Now in line 5 of the algorithm, assuming the number is prime, the
algorithm keeps r fixed and verifies l equations, one for each value of a with 1 ≤ a ≤ l:
⇒ (x + a)^n ≡ x^n + a (mod x^r − 1, n)  ∀a where 1 ≤ a ≤ l
⇒ (x + a)^n ≡ x^n + a (mod x^r − 1, p)  ∀a where 1 ≤ a ≤ l (since p is a prime divisor of n)
⇒ (x + a)^p ≡ x^p + a (mod x^r − 1, p)  ∀a where 1 ≤ a ≤ l (by lemma 4.1.1)
⇒ n and p are both introspective numbers for each x + a (from definition 3.4.8).
The remaining correctness of the algorithm can be established by considering two groups based on
the two sets I and P defined in lemma 3.4.9.
1. Let the first group G1 be the set of all residues modulo r of numbers in I. Then:
   • G1 is a subgroup of Z_r* (since gcd(n, r) = gcd(p, r) = 1).
   • |G1| > 4·log^2 n. This is because o_r(n) > 4·log^2 n and G1 contains all the powers of
     n modulo r.
2. Let the second group G2 be the set of non-zero residues of polynomials in P modulo
   h(x) and p, where h(x) is an irreducible factor of the cyclotomic polynomial Φ_r(x) as
   described by lemma 3.4.7. Then:
   • G2 is a subgroup of the multiplicative group of f = f_p[x]/h(x).
   • G2 is generated by the polynomials x + 1, x + 2, …, x + l in the field f.
Now that the groups G1 and G2 have been defined, it is possible to analyse the size of the
group G2.
4.2.2 Lemma
There exist at least C(t + l − 2, t − 1) distinct polynomials of degree less than t in G2, where t = |G1|.
Proof
This lemma can be proved by contradiction as follows.
Assume there exist two distinct polynomials f(x) and g(x) such that f(x), g(x) ∈ P, the
degrees of f(x) and g(x) are less than t, and f(x) = g(x) in f. Then for m ∈ I,
[f(x)]^m = [g(x)]^m in f
⇒ f(x^m) = g(x^m) in f
(since m is introspective for f(x) and g(x), and since h(x) | x^r − 1)
⇒ 0 = f(x^m) − g(x^m) in f
⇒ x^m is a root of the polynomial Q(y) = f(y) − g(y) for all m ∈ G1.
But since G1 is a subgroup of Z_r* ⇒ gcd(m, r) = 1 ⇒ x^m is a primitive r-th root of unity for all
such m ∈ G1, and these roots are distinct for distinct m ∈ G1.
Therefore the number of distinct roots of the polynomial Q in f is at least |G1| = t. But since the
degrees of f(x) and g(x) are less than t, the degree of Q also has to be
less than t. This is a contradiction.
⇒ f(x) cannot be equal to g(x) in f.
It is also seen that the elements x + 1, x + 2, …, x + l are all distinct in f, since for i ≠ j the
difference (x + i) − (x + j) = i − j is non-zero in f_p, because
l = 2·sqrt(Φ(r))·log n < 2·sqrt(r)·log n < r and p > r ⇒ l < p. At most one element x + a with
a ≤ l can be zero in f, and such an element cannot be included in
the group G2. Therefore there are at least l − 1 distinct polynomials of degree one in G2 ⇒ there are
at least C(t + l − 2, t − 1) distinct polynomials of degree less than t in G2,
i.e. |G2| ≥ C(t + l − 2, t − 1).
After showing that the size of the group G2 is exponential in t, it can now also be shown that the size
of G2 is upper bounded by an exponential function in t when n is not a power of p.
4.2.3 Lemma
|G2| < (1/2)·n^(2√t) if n is not a power of p.
Proof
Let I1 be the subset of I = {n^i · p^j | i, j ≥ 0} obtained by restricting both exponents to be at most √t.
In other words I1 = {n^i · p^j | 0 ≤ i, j ≤ ⌊√t⌋}.
Now supposing that n is not a power of p (and since p is a prime factor of n), all these products
are distinct, which implies that the number of elements in the set I1 is
(⌊√t⌋ + 1)·(⌊√t⌋ + 1) = (⌊√t⌋ + 1)^2 > t.
This means that |I1| > |G1| (since |G1| = t), so there are at least two elements in I1 which are
congruent modulo r. Let these two elements be k and l, with k > l.
So x^k ≡ x^l (mod x^r − 1), and for a polynomial f(x) ∈ P:
[f(x)]^k ≡ f(x^k) (mod x^r − 1, p)   (since k is introspective)
≡ f(x^l) (mod x^r − 1, p)
≡ [f(x)]^l (mod x^r − 1, p)
⇒ in the field f (f = f_p[x]/h(x)), [f(x)]^k = [f(x)]^l for every f(x) ∈ G2
⇒ 0 = [f(x)]^k − [f(x)]^l in f.
Now let q be the polynomial q(y) = y^k − y^l over the field f; then every f(x) ∈ G2 is a root of
q(y), so q(y) has at least |G2| distinct roots in f.
But the degree of q(y) is k ≤ (np)^√t < (1/2)·n^(2√t), since p ≤ n/2 (p is a proper prime factor of
n, because by the initial assumption n is not a power of p).
⇒ |G2| < (1/2)·n^(2√t).
Now using the last two lemmas it is possible to complete the remaining part of the correctness
proof.
4.2.4 Lemma
If the algorithm returns PRIME then n is prime.
Proof
Let the algorithm return PRIME.
Now from lemma 4.2.2 there exist at least C(t + l − 2, t − 1) distinct polynomials of degree less
than t in G2, where l = 2·sqrt(Φ(r))·log n and t = |G1|.
i.e.
|G2| ≥ C(t + l − 2, t − 1), but (t − 1) > 2·sqrt(t)·log n (since t > 4·log^2 n)
⇒ |G2| ≥ C(2·sqrt(t)·log n + l − 1, 2·sqrt(t)·log n).
Now we know that l = 2·sqrt(Φ(r))·log n ≥ 2·sqrt(t)·log n (since t = |G1| ≤ Φ(r)), so substituting this in the above
inequality gives:
|G2| ≥ C(2·2·sqrt(t)·log n − 1, 2·sqrt(t)·log n).
But 2·sqrt(t)·log n ≥ 3 (due to the ranges of t and n), so the inequality now becomes:
|G2| ≥ 2^(2·sqrt(t)·log n) = n^(2√t)
⇒ |G2| ≥ (1/2)·n^(2√t).
Now from lemma 4.2.3, |G2| < (1/2)·n^(2√t) if n is not a power of p.
⇒ n is a power of p, so n must be of the form n = p^k for some k ≥ 1. But from line 1 of
the algorithm, if n = a^b for a ∈ N and b > 1 then the algorithm outputs COMPOSITE, so since
the algorithm does not output COMPOSITE, k must be equal to 1
⇒ n = p
∴ n is equal to its prime divisor ⇒ n is prime.
Hence if the algorithm returns PRIME then n is prime.
Therefore from lemma 4.2.1 and lemma 4.2.4 it can be concluded that the algorithm outputs
the correct result.
4.3 Time Complexity Analysis
The time complexity of the AKS algorithm can be analysed by examining each line
of the algorithm, as is done in theorem 5.1 of [4, 6]. In this analysis it is seen that the
total running time is dominated by line 5 of the algorithm, so the asymptotic time complexity
of the algorithm (i.e. the behaviour of the execution time as n approaches infinity) can be
determined from the time taken to perform line 5. In line 5, the equation of lemma 4.1.1 has to
be verified for values of a from a = 1 to 2·sqrt(Φ(r))·log n.
⇒ 2·sqrt(Φ(r))·log n equations have to be evaluated.   (1)
Now the next thing to consider is the time taken to evaluate each of these equations:
the number of multiplications required by each equation is O(log n) [13, 102];   (2)
the size of the coefficients is O(log n) [13, 98];   (3)
and the polynomial has degree r.   (4)
So from (2), (3) and (4) the time taken to evaluate each equation is O~(r·log n·log n) =
O~(r·log^2 n).   (5)
And from (1) and (5) the total asymptotic time complexity for evaluating the 2·sqrt(Φ(r))·log n
equations is O~(r·sqrt(Φ(r))·log^3 n) = O~(r^(3/2)·log^3 n) = O~((log^3 n)^(3/2)·log^3 n), because the
bound on the magnitude of r is O(log^3 n), where o_r(n) > 4·log^2 n.
⇒ Asymptotic time complexity = O~((log n)^(9/2)·log^3 n) = O~((log n)^(4.5)·log^3 n) = O~(log^(7.5) n).
It can therefore be concluded that the AKS algorithm is a deterministic primality-testing
algorithm with time complexity O~(log^(7.5) n).
Chapter 5
This chapter describes and analyses the time complexity of the Sieve of Eratosthenes
deterministic primality test.
5. Sieve of Eratosthenes
5.1 Introduction
The Sieve of Eratosthenes is a deterministic primality test, which was invented by
Eratosthenes of Cyrene around 200 B.C. [2, 55]. It is an efficient method of finding all
small prime numbers - say less than 10,000,000. The procedure is based
essentially on the fact that an integer is prime if its only positive divisors are 1 and itself. This
method uses this fact to “sieve” out all prime numbers less than or equal to a given number as
follows (a C++ sketch of the procedure is given after the steps).
1. Make a list of all integers from 2 to n.
2. Mark 2 as prime and cross out all multiples of 2.
3. Move to the next integer which has not been crossed out, mark it as prime and cross
out all multiples of it.
4. Repeat step 3 until all the numbers in the list have either been marked as prime or
crossed out.
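A minimal C++ sketch of these four steps (an illustrative reconstruction along the lines of the program sieve.cc described below, not the project's actual source) is:

    #include <iostream>
    #include <vector>

    int main()
    {
        int n = 100;                                // list all primes <= n
        std::vector<bool> isPrime(n + 1, true);     // step 1: initially assume 2..n prime
        isPrime[0] = isPrime[1] = false;
        for (int p = 2; p * p <= n; ++p)            // only p <= sqrt(n) is needed (see 5.2)
            if (isPrime[p])                         // steps 2-3: next number not crossed out
                for (int j = p * p; j <= n; j += p)
                    isPrime[j] = false;             // cross out the multiples of p
        for (int i = 2; i <= n; ++i)                // step 4 done: survivors are the primes
            if (isPrime[i]) std::cout << i << " ";
        std::cout << std::endl;
        return 0;
    }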
5.2 Experimental Analysis
When implemented as a C++ program (see Appendix B section 1.1 for pseudo-code) called
sieve.cc, the program produced a vector containing all prime numbers ≤ n. The correctness of
this program was verified by comparing the list of prime numbers ≤ 100000 produced by the
program with a list of known prime numbers at [15].
Now an important point to note about this method is that the while loop in the implemented
code of the algorithm only needs to be entered while p^2 ≤ n (see Appendix B section 1.1), that
is, while the number being examined is at most the square root of n. The reason for this can be
explained by the following theorem: if n is a composite integer, then n has a prime factor
not exceeding the square root of n (the proof of this can be seen in [11, 67]).
It can be seen more clearly how this result is used in the algorithm by considering an
example. Let n = 144; then the square root of n is 12. So from the above theorem it can be
deduced that all composite numbers less than 144 must have a prime factor less than or equal
to 12. So in this case it is only necessary to cross out all numbers less than 144 which have one
of the prime factors 2, 3, 5, 7, and 11. All the remaining numbers in the list will be prime.
So the Sieve of Eratosthenes basically works for an integer n by checking for divisibility by
all prime numbers not exceeding the square root of n.
Now although this algorithm works efficiently for small numbers, the time complexity of the
algorithm grows exponentially with the length (number of digits) of the input n, since the
running time is roughly proportional to n itself. This can be seen from the
table of results and the graph produced in Appendix B section 1.1.3. These results were
produced by running the program 100,000 times, and then dividing the times obtained by
100,000 to estimate the run time of a single execution of the program. This was necessary for
estimating the time taken to execute the program for small values of n.
From the table it can be seen that the program runs very quickly and efficiently for small
values of n. It took just 2.2 milliseconds to output all prime numbers less than or equal to 5,
but as larger values of n were tested the time taken by the program started to increase rapidly:
it took 1186.5 milliseconds for n = 1500. This is the reason why it is not practical to use the
Sieve of Eratosthenes when large prime numbers are required to be tested. One of the reasons
why this method has such growth is that it requires all numbers less than or equal to n to be
stored in some sort of array or vector. So the larger the value of n, the larger the size of the
storage vector, which results in greater running time.
The program sieve.cc also outputs the value of π(n) for input n. The table of results showing
this can be seen in section 1.1.2 of appendix B. The relationship between π(n) and n is given
by the Prime Number Theorem (theorem 3.1.7). Using this it can be seen that the
probability of a random positive integer x being prime is approximately (x/log x)/x = 1/log x.
This shows that prime numbers become scarcer as numbers get larger, which is one of the
reasons why determining large prime numbers is such a difficult task.
From all of the above analysis it can be concluded that the Sieve of Eratosthenes is a
deterministic primality-testing algorithm, which can be used efficiently to determine primality
for small numbers.
Chapter 6
This chapter describes and analyses the time complexity of the Miller-Rabin primality test.
6. The Miller-Rabin Algorithm
6.1 Introduction
The Miller-Rabin algorithm is a probabilistic primality-testing algorithm, which was developed
by Michael Rabin based on Gary Miller's ideas. It is in some ways similar to
Fermat's primality test. The test is actually a strong pseudoprime test.
A strong pseudoprime to the base a is an odd composite number n which, when written in the
form n − 1 = 2^s·m where m is odd, has the property that either a^m ≡ 1 (mod n) or a^t ≡ −1 (mod n),
where t = m·2^k for some k ∈ [0, s) [17, 129]. The Miller-Rabin test is able to correctly
distinguish between strong pseudoprimes and prime numbers with a very small probability of
error.
An in depth description of the algorithm can be found at [9, 889-896] and [11, 209-212] but in
brief the main procedure behind this algorithm can be summarized from the above sources as
follows:
Let n be the integer to be tested, where n is an odd number greater than 1. (We only need to
test the primality of odd integers since 2 is the only even prime number.) A C++ sketch of one
round follows the steps.
1. Find an odd m such that n − 1 = 2^k·m.
2. Pick a random integer a such that 1 < a < n − 1.
3. If a^m ≡ ±1 (mod n), or a^t ≡ −1 (mod n) where t = m·2^r for at least one r with 1 ≤ r ≤ k − 1,
then n might be prime. Otherwise n is definitely composite.
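A compact C++ sketch of one round of this procedure is given below (illustrative only: it assumes n is odd, greater than 3 and fits in 64 bits, uses the compiler-provided __int128 type for intermediate products, and uses std::rand() as a stand-in for a proper random source):

    #include <cstdlib>
    typedef unsigned long long ull;

    ull mulMod(ull a, ull b, ull n) { return (ull)((__int128)a * b % n); }

    ull powMod(ull a, ull e, ull n)                   // a^e mod n by repeated squaring
    {
        ull r = 1;
        a %= n;
        while (e > 0) {
            if (e & 1) r = mulMod(r, a, n);
            a = mulMod(a, a, n);
            e >>= 1;
        }
        return r;
    }

    // One Miller-Rabin round: returns false if n is definitely composite,
    // true if n might be prime.
    bool millerRabinRound(ull n)
    {
        ull m = n - 1;
        int k = 0;
        while ((m & 1) == 0) { m >>= 1; ++k; }        // step 1: n - 1 = 2^k * m with m odd
        ull a = 2 + std::rand() % (n - 3);            // step 2: random base 1 < a < n - 1
        ull x = powMod(a, m, n);
        if (x == 1 || x == n - 1) return true;        // step 3: a^m ≡ ±1 (mod n)
        for (int r = 1; r < k; ++r) {
            x = mulMod(x, x, n);                      // x = a^(m * 2^r) mod n
            if (x == n - 1) return true;              // a^t ≡ -1 (mod n) for some t
        }
        return false;                                 // otherwise n is definitely composite
    }

Repeating millerRabinRound with fresh random bases drives the error probability down, exactly as described in the next paragraph.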
So if the algorithm determines the number to be composite, it does so with 100% accuracy.
On the other hand, if the algorithm determines the number to be prime, then there is still a
small chance that the number is in fact composite. But the probability of a number being
established as prime actually being composite is so minute that it can be virtually ignored. It
has been shown in [17, 130-131] that a single round of the algorithm has at most a 1 in 4 chance
of failing to detect that the number is composite. But this probability of error can be greatly
reduced by repeated iterations of the algorithm with different values of a. For example, if the
test were repeated 100 times by picking 100 different random values of a between 1 and n − 1,
then the probability of a composite number being mistaken for a prime would be reduced to
4^−100 ≈ 10^−60, which is even smaller than the probability of a computer error being made
during the execution of the algorithm [12, 211].
6.2 Experimental Analysis
This algorithm was implemented as a C++ program called mr.cc using the previous
description of the algorithm, together with additional information obtained from [16]. The
pseudo-code of the implemented algorithm is given in appendix B section 2.1.1. The
implemented program takes in a number n greater than 1 and outputs COMPOSITE if the
number is definitely composite, or PRIME if the number is prime with a small
possibility of error. This possibility of error is equal to (1/2)^s, where s is the number of times
the routine described in section 2.1.1 of appendix B is repeated (see [16] for proof). The
program mr.cc was made to repeat the routine 20 times, which reduced the probability of
an error occurring to (0.5)^20 ≈ 9.5×10^−7 < 0.000001. In other words this increased the
probability of correctly testing a prime number to 99.9999%.
The accuracy of the program mr.cc was then tested by running the program 40 times
consecutively for the same values of n, to test whether or not the program distinguished
between prime and composite numbers correctly. The program gave the correct output for all
the data values of n tested.
The execution time of the program for different input values was then examined using the
same approach as with the program sieve.cc described in chapter 5. The only difference in
the technique used was that the average result of 40 trials was used in the analysis. This
had to be done since the results of individual trials for the same value of n were too
varied. The table and graph of the obtained results can be seen in section 2.1.2
of appendix B.
From the table and graph it can be seen that the program runs extremely efficiently for
different values of n. Unlike the program implemented using the Sieve of Eratosthenes
algorithm, sieve.cc, the program mr.cc was able to quickly determine primality for large
values of n. The time taken by the program did not grow very quickly as the value of
n increased; instead it grew roughly linearly in the number of digits of n, i.e. logarithmically in n.
The time complexity of the Miller-Rabin algorithm can be related to a famous number-theoretic
conjecture called the Generalised Riemann Hypothesis. Although this conjecture
has not yet been proved, it is widely believed to be true. Assuming the conjecture is true
would imply (from [11, 212]) that the deterministic variant of the algorithm would use
O((log₂ n)^5) bit operations to determine whether a positive integer n was prime. It is due to
the fact that the Miller-Rabin test has this polynomial time complexity and such a low
probability of error that the algorithm forms the basis of many encryption and decryption
systems, such as the RSA cryptosystem, and is also used in commercial computer algebra
systems such as Mathematica.
Chapter 7
This chapter investigates any possible impact the AKS algorithm may have had in the field of
cryptography
7. The impact of the AKS algorithm in the field of cryptography
Cryptography can be defined as the study of writing and deciphering secret code. As described in
[17, 54-55], cryptography is a method of taking a message that is required to be sent (called
the plaintext) and disguising it, using some sort of coding technique, into a coded
message (called the ciphertext), and then sending it to the recipient. The recipient
knows how the message was coded, so the recipient has all the information required to
decode the message; but if some third person intercepts the coded message during
transmission, then this person will not be able to decipher the message, so the secrecy of the
message is preserved.
Now prime numbers play a vital role in many of the presently used cryptosystems, such as the
RSA cryptosystem. A brief description of the algorithm is as follows (adapted from [9, 881-887]);
a toy numerical illustration is given after the steps:
1. Take two large prime numbers p and q such that p ≠ q and multiply them together,
so that n = pq.
2. Compute φ(n) = (p − 1)(q − 1) and select a small odd integer e which is coprime to φ(n).
3. Compute d as the multiplicative inverse of e mod φ(n).
4. Publish the pair P = (e, n) as the RSA public key and keep secret the pair S = (d, n)
as the RSA secret key.
5. Then a one-way function based on the public key is used to encrypt the message, and a
one-way function based on the secret key is used to decrypt the message.
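A toy C++ illustration of these five steps, using tiny well-known textbook values (p = 61, q = 53, e = 17, d = 2753, message 65; real RSA of course uses primes hundreds of digits long and a proper encoding of the message):

    #include <iostream>
    typedef unsigned long long ull;

    ull powMod(ull a, ull e, ull n)                  // a^e mod n by repeated squaring
    {
        ull r = 1;
        a %= n;
        while (e > 0) {
            if (e & 1) r = r * a % n;
            a = a * a % n;
            e >>= 1;
        }
        return r;
    }

    int main()
    {
        ull p = 61, q = 53;                          // step 1: two (tiny) distinct primes
        ull n = p * q;                               // n = 3233
        ull phi = (p - 1) * (q - 1);                 // step 2: phi(n) = 3120, gcd(17, 3120) = 1
        ull e = 17;
        ull d = 2753;                                // step 3: (e * d) % phi == 1
        ull message = 65;
        ull cipher  = powMod(message, e, n);         // step 5: encrypt with public key (e, n)
        ull decoded = powMod(cipher, d, n);          // decrypt with secret key (d, n)
        std::cout << cipher << " " << decoded << std::endl;   // prints 2790 65
        return 0;
    }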
So it can be seen from the above that the algorithm is required to find two large prime numbers
and multiply them together. At present systems like these use probabilistic primality tests such
as the Miller-Rabin primality test, but as described in the last chapter there is always a small
chance with such tests that a composite number may be selected instead of a prime.
This is the reason why it would be much preferred to use a deterministic primality-testing
algorithm in encryption systems.
However, at present it is not practically feasible to use the newly developed AKS algorithm in
encryption systems, since even though it has polynomial time complexity, existing randomised
algorithms can be made to run considerably faster with negligible error probability.
Chapter 8
This Chapter briefly states and investigates the factorisation problem
8. Brief description and investigation of the factorization problem.
The factorization problem is the problem of expressing a positive number n as a product of
primes. It has already been shown by the fundamental theorem of arithmetic that all
positive integers can be uniquely factorised (see [17, 12] for proof). But even today no one
has been able to devise a technique for quickly and efficiently factorizing a number. This
problem is considered to be a lot harder than the PRIME problem.
One of the main reasons why this problem is so important is its role in encryption
systems, such as the RSA encryption system described in the last chapter. The security of
this system relies on the fact that no one knows how to factorize efficiently. If someone does
manage to devise a method of doing so, then the security of internet communications and
transactions would be at serious risk.
Just as with prime numbers, mathematicians and computer scientists are trying very hard to
develop efficient factorization algorithms. Many algorithms have been devised, all of
which are extremely complex, and none of which are very efficient. At present the fastest deterministic
algorithm for factorizing is the Pollard-Strassen algorithm [17, 192], but as stated in [9, 896],
even with one of today's supercomputers this algorithm would not be able to feasibly
factorize an arbitrary 1024-bit number.
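To give a flavour of such algorithms, one of the simplest is Pollard's rho method (a different, randomised algorithm, much simpler than Pollard-Strassen and given here only as an illustrative C++ sketch for numbers fitting in 64 bits):

    #include <numeric>   // std::gcd (C++17)
    typedef unsigned long long ull;

    // Pollard's rho: attempts to return a non-trivial factor of an odd composite n.
    // It iterates x -> x^2 + c (mod n); the sequence cycles modulo any hidden prime
    // factor p of n after roughly p^(1/2) steps, and the cycle is detected via gcd.
    ull pollardRho(ull n, ull c)
    {
        ull x = 2, y = 2, d = 1;
        while (d == 1) {
            x = (ull)(((__int128)x * x + c) % n);     // tortoise: one step
            y = (ull)(((__int128)y * y + c) % n);     // hare: two steps
            y = (ull)(((__int128)y * y + c) % n);
            d = std::gcd(x > y ? x - y : y - x, n);
        }
        return d;   // a factor of n, possibly n itself (in which case retry with a new c)
    }

For example, pollardRho(8051, 1) quickly finds a non-trivial factor of 8051 = 83 × 97.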
So, due to the inability to factorize integers efficiently, our internet communications and
transactions are currently safe!
Chapter 9
9.1 Conclusion
This report contains a detailed description of the underlying mathematical and computational
issues surrounding primality testing. It has investigated in detail various primality testing
techniques including the recently developed AKS algorithm. The report also contains the
results of the experimental study of two other primality testing algorithms as well as a brief
description of the factorization problem. So it can therefore be concluded that the project has
met all of its minimum requirements and it has also made further enhancements.
9.2 Future work
There is ample scope for future work in this research area, such as the implementation and
experimental study of the AKS algorithm itself, which could not be done in this project due to
the limited time available. There are also many other aspects of the AKS algorithm which have
not been explored, such as the effect on the time complexity of the algorithm if the Sophie
Germain prime conjecture holds. Further research could also be done into the factorization
problem. Due to the enormity of the field of study, the list of further enhancements to this
project area is endless.
Appendix A
Project reflection
Overall I found this project extremely interesting. The AKS algorithm, on which my project
mainly focussed, had only recently been discovered, so it was an extremely new and
exciting area to study. But the main difficulty I had to face with this project was trying to
understand the basic mathematical principles behind the algorithm. I had to do a lot of
background reading to understand the algorithm and I had to teach myself many new
mathematical concepts from number theory. This was very time consuming, and meant that I fell
a little behind schedule and had to revise the initial project schedule.
The other difficulty I had was with the write-up of the project. This report was written using
Microsoft Word, which is not very easy to use if you need to include a lot of mathematical
symbols. But I did eventually figure out a few short-cut techniques and tricks
which enabled me to type up the second half of the project a lot quicker than the first half.
But despite these difficulties I thoroughly enjoyed researching this area of study; it is the sort
of field in which unanswered questions keep on arising, and as soon as such a question is
answered it leads to many other questions being asked. This is probably the main reason
why I found it so interesting and fascinating to study.
Appendix B
1.1 Experimental Analysis of the Sieve of Eratosthenes
1.1.1 Pseudo-code
The algorithm can be written in pseudo-code as follows (based on a description of the
algorithm at [2, 56-57]):
Eratosthenes(n)
{
    primeNumbers[0] := 0                       // index i represents the number i + 1; 1 is not prime
    for i := 1 to (n − 1) do primeNumbers[i] := 1
    p := 2
    while p^2 ≤ n do {
        j := 2p                                // cross out all proper multiples of p
        while (j ≤ n) do {
            primeNumbers[j − 1] := 0
            j := j + p
        }
        repeat p := p + 1 until primeNumbers[p − 1] = 1
    }
    return (primeNumbers)
}
1.1.2 Results
The following results were obtained for π(n) using the program sieve.cc based on the Sieve of
Eratosthenes.

n          π(n)
1          0
2          1
3          2
4          2
5          3
10         4
25         9
50         15
100        25
500        95
1000       168
5000       669
10000      1229
50000      5133
100000     9592
5000000    348513
1.1.3 Time analysis
The execution times illustrated in the table below may not be very accurate due to factors such as
other programs running on the computers at the same time as the analysis was taking place;
nevertheless, every effort was made to ensure that the results obtained were as precise as possible.
Input Number    Time Taken (milliseconds)
0               0
2               0.5
3               0.8
4               1.8
5               2.2
6               2.9
7               3.2
8               3.8
9               5.3
10              6
25              16.2
50              34.8
100             68
150             107.8
250             185.1
500             371.7
1000            763.7
1500            1186.5
Graphical representation of results:
[Graph: Sieve of Eratosthenes, Time Taken (milliseconds) against Input Number]
2.1 Experimental Analysis of the Miller-Rabin Randomised Primality Test
2.1.1 Pseudo-code
The pseudo-code for the Miller-Rabin randomised primality test is as follows (based on the
description of the algorithm in [9, 890-896] and [16]):
Miller-Rabin(n, a)                             // a is a randomly chosen base, 1 < a < n − 1
{
    d := 1
    for i := k down to 0 {                     // k is the index of the highest bit of n − 1
        bit := the i-th bit in the binary representation of (n − 1)
        x := d
        d := d · d (mod n)
        if (d = 1) and (x ≠ n − 1) and (x ≠ 1)
            then return TRUE                   // x is a non-trivial square root of 1: n is composite
        if (bit = 1) then d := d · a (mod n)
    }                                          // end of for loop; d now equals a^(n−1) mod n
    if (d ≠ 1) then return TRUE                // Fermat test failed: n is composite
    return FALSE                               // n is probably prime
}
2.1.2 Time analysis
The execution times illustrated in the table below may not be very accurate due to factors such as
other programs running on the computers at the same time as the analysis was taking place;
nevertheless, every effort was made to ensure that the results obtained were as precise as possible.
Input Number    Time Taken (milliseconds)
0               0
2               1.69
3               1.61
4               1.9
5               2.22
6               2.53
7               2.35
8               2.35
9               2.39
10              2.6
25              3.2
50              3.69
100             4.38
150             4.89
250             5.49
500             6.44
1000            6.93
1500            7.37
2000            7.68
2500            7.72
5000            8.23
10000           9
50000           10.29
100000          11.03
500000          12.19
1000000         12.74
Graphical representation of the results:
[Graph: Miller-Rabin, Time Taken (milliseconds) against Input Number]
References
[1] K. H. Rosen (2000), Elementary Number Theory, 4th edition, Addison Wesley
Longman Inc.
[2] P. Giblin (1993), Primes and Programming, Cambridge University Press.
[3] The Prime pages.
URL http://www.utm.edu/research/primes/notes/13466917/index.html [09/04/03]
[4] M. Agrawal, N. Kayal, N. Saxena. PRIMES is in P.
URL http://www.cse.iitk.ac.in/news/primality.html [28th Jan 2003]
[5] Danny Kingsley, News in Science, ABC Science Online.
URL http://www.abc.net.au/science/news/stories/s647647.htm [03/04/03]
[6] S. Robinson, New York Times, Section A, Page 20, Column 1, August 8, 2002.
URL http://www.nytimes.com/2002/08/08/science/08MATH.html [03/04/03]
[7] Chidanand Rajghatta Times News Network. [August 12, 2002 10:06:25 PM]
URL http://timesofindia.indiatimes.com/articleshow.asp?artid=18891466 [3/04/03]
[8] R B J T Allenby and E J Redfern (1989), Introduction To Number Theory With
Computing, Edward Arnold.
[9] T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein (2002), Introduction to
Algorithms, 2nd edition, MIT Press.
[10] R. B. J. T. Allenby (1991), Rings, Fields and Groups, 2nd edition, Edward Arnold.
[11] J. E. Hopcroft, R. Motwani, J. D. Ullman (2001), Introduction to Automata Theory,
Languages, and Computation, 2nd edition, Addison-Wesley.
[12] E. Kranakis (1987), Primality And Cryptography, reprint, John Wiley & Sons Ltd.
[13] J. von zur Gathen, J. Gerhard (1999), Modern Computer Algebra,
Cambridge University Press, Cambridge.
[14] Joseph J. Rotman(2002), Advanced Modern Algebra, Pearson Education Inc.
[15] The Prime pages.
URL http://www.utm.edu/research/primes [12/10/03]
[16] Dr David Wessels.
URL http://engr.uark.edu/~wessels/algs/notes/prime.html [12/10/03]
[17] N. Koblitz (1994), A Course In Number Theory And Cryptography, 2nd edition,
Springer-Verlag New York Inc.