The PRIME Problem

Sanjeet Tiwana
Computer Science and Mathematics
Session 2002/2003

The candidate confirms that the work submitted is their own and the appropriate credit has been given where reference has been made to the work of others. I understand that failure to attribute material which is obtained from another source may be considered as plagiarism.

(Signature of student) _______________________________

Summary

The problem of determining whether a given number is prime is an ancient one, which has puzzled some of the world’s greatest minds for centuries. It has been investigated since the days of the famous mathematicians Euclid (ca. 325 – ca. 270 BC) and Eratosthenes (ca. 248 – ca. 192 BC), due to the fundamental importance of prime numbers in number theory. Only recently have computer scientists also developed a fascination with prime numbers, due to their use in encryption algorithms such as the RSA algorithm. All such systems presently use probabilistic primality-testing algorithms, mainly because these algorithms give extremely fast and confident results. No deterministic primality-testing algorithm could have been used in such systems since, until recently, no one had been able to devise an unconditional deterministic primality-testing algorithm with polynomial time complexity.

On 6th August 2002, three Indian computer scientists, M. Agrawal, N. Kayal, and N. Saxena, distributed a report “PRIMES is in P” which contained a deterministic polynomial-time primality-testing algorithm. This algorithm was an incredible discovery, and it answered the age-old question of whether primality could be tested in polynomial time.
This report describes how the project has dealt with the problem of understanding and evaluating this deterministic primality-testing algorithm, and of determining its possible consequences for the field of cryptography. The project has also examined other primality-testing algorithms and techniques and has explored the computational issues associated with them, as well as with the factorisation problem, another problem faced by computer scientists which, if solved, would have a tremendous immediate impact on the field of cryptography. The report also describes possible future work and enhancements that could be made in this area of research.

Acknowledgements

This project could not have been a success without the help, constant support and guidance of the project’s supervisor, Prof Martin Dyer. I would therefore like to take this opportunity to thank Prof Martin Dyer for all of his assistance, without which I would have encountered countless difficulties in completing this project.

Table of Contents

Chapter 1
1.1 Introduction to the problem
1.2 Statement of the Problem
1.3 Objectives
1.4 Minimum requirements
1.5 Possible enhancements
1.6 Relevance to degree program
1.7 Solution to the problem
Chapter 2
2.1 Summary of press coverage of the AKS algorithm
Chapter 3 Basic principles
3.1 Number Theory
3.2 Group and Field Theory
3.3 Time Complexity Theory
3.4 Other Important Definitions
Chapter 4 The AKS algorithm
4.1 Basis of the AKS algorithm
4.2 Proof of Correctness
4.3 Time Complexity Analysis
Chapter 5 Sieve of Eratosthenes
5.1 Introduction
5.2 Experimental Analysis
Chapter 6 Miller-Rabin algorithm
6.1 Introduction
6.2 Experimental Analysis
Chapter 7
7.1 The impact of the AKS algorithm in the field of cryptography
Chapter 8
Brief description and investigation of the factorisation problem
Chapter 9
9.1 Conclusion
9.2 Future Work
Appendix A
Personal Reflection
Appendix B
1.1 Sieve of Eratosthenes Experimental Analysis Results
1.2 Miller-Rabin Experimental Analysis Results
REFERENCES

Chapter 1

This chapter gives an introduction to the problem of determining primality, and outlines some previous attempts made to solve the problem.

1.1 Introduction to the problem

A number is said to be prime if it is greater than 1 and its only positive factors are 1 and itself. Since a prime number can be defined so simply, it can be difficult to comprehend why mathematicians have struggled for centuries to devise methods of establishing whether or not a number is prime. The theoretical concepts of prime numbers interested many ancient mathematicians, such as Euclid (ca. 325 – ca. 270 BC) [1, 86] and Eratosthenes (ca. 248 – ca. 192 BC) [2, 55], since prime numbers are of fundamental importance in number theory.
Prime numbers are the building blocks of the integers, as every integer greater than 1 is either prime or can be expressed as a product of primes. There are many unanswered questions regarding prime numbers. Ever since it was established that there are infinitely many primes, hordes of people have been trying to get into the record books by finding the largest prime number yet known. Although they realise that they might not stay in the record books for long, this does not seem to lessen their eagerness to find such numbers. At present the largest known prime is $2^{13466917} - 1$, which was discovered by twenty-year-old Michael Cameron from Canada on 14/11/2001 [3].

The main questions surrounding primes have been related to their distribution, and to whether a formula could be developed which could verify a number to be prime. As well as mathematicians, computer scientists have also developed an interest in prime numbers, due to the use of prime numbers in cryptography. Cryptosystems such as the RSA system depend on secret prime numbers. Thus it has become immensely important, for computer scientists as well as mathematicians, to be able to determine accurately and efficiently whether a given number is prime. The tests used to determine whether or not a particular number is prime are called primality tests.

At present probabilistic primality-testing algorithms are the ones most generally used in cryptosystems. Probabilistic primality-testing algorithms, such as the Miller-Rabin algorithm, determine whether a number is prime with an arbitrarily small probability of error, e.g. less than $2^{-100}$. Although probabilistic primality tests can be used to obtain extremely fast and confident results, there is still a small chance of error, which cannot be completely disregarded. People have dedicated vast amounts of their lives to the study of prime numbers, but unfortunately not all of these efforts have been successful.
Many have worked on and devised algorithms for deterministic primality testing, from the ancient Chinese and Greeks to Adleman, Pomerance, and Rumely, who in 1983 developed a deterministic primality-testing algorithm with running time $(\log n)^{O(\log \log \log n)}$ [4, 1]. But until recently, no one had been able to devise an unconditional deterministic primality-testing algorithm with polynomial running time.

On 6th August 2002, three Indian computer scientists, M. Agrawal, N. Kayal, and N. Saxena, distributed a report “PRIMES is in P” [4] which contains a deterministic polynomial-time algorithm (now known as the AKS algorithm) that tests whether a given input n is prime or composite. This result has tremendous significance, not only because these scientists have solved a problem that many intellects have been trying to solve for centuries, but also because the result may pave the way to solving other problems such as the factorisation problem, and may have an impact in the field of cryptography.

1.2 Statement of the problem

The aim of this project is to understand the problem and relevance of distinguishing between prime and composite numbers.

1.3 Objectives

The project’s objectives were defined as follows:
1. To understand the underlying mathematical and computational principles involved in primality testing.
2. To understand the computational problems that arise in primality testing.
3. To establish the importance of primality testing.
4. To investigate the use of primality testing in different fields.
5. To investigate the report “PRIMES is in P” written by Agrawal, Kayal, and Saxena.
6. To establish the importance of the algorithm developed by Agrawal, Kayal, and Saxena.
7. To investigate other possible outcomes the algorithm may have in various fields.

1.4 Minimum Requirements

From the objectives, the minimum requirements for the project were identified to be:
1. To understand the computational issues related to primality testing.
2. To write an account of the issues related to primality testing with particular reference to the report “PRIMES is in P” by Agrawal, Kayal, and Saxena.
3. To implement at least one primality-testing algorithm and to study it experimentally.

The deliverables of the project were identified to be:
1. A written account of issues related to primality testing with particular reference to the report “PRIMES is in P” by Agrawal, Kayal, and Saxena.
2. One or more implemented primality-testing algorithms, and the results of studying them experimentally.

1.5 Possible Enhancements

If the minimum requirements were exceeded, the possible enhancements were identified as being:
1. To investigate the relevance of primality testing in cryptography.
2. To briefly investigate and describe the factorisation problem.
3. To briefly investigate and describe the logarithm problem.

1.6 Relevance to degree program

The project involved learning and applying many new computational and mathematical techniques. It required understanding and analysing algorithms by building on knowledge developed in previously studied modules such as COMP1360 Introduction to algorithms and data structures, COMP2360 Theory of computation, and COMP3370 Algorithms and complexity. The project also involved developing and using skills learnt in the programming modules, such as COMP2650 Object oriented programming and COMP2660 Software project management.
1.7 Solution to the problem

A solution to the problem was developed through the following stages:

1.7.1 Feasibility study

The feasibility study involved reading through the report “PRIMES is in P” by Agrawal, Kayal, and Saxena, in order to decide whether or not the subject of the project was appropriate for a student studying computer science and mathematics, and whether it could be carried out realistically within the given time scale with the available resources.

1.7.2 Project management

The next stage of the project involved deciding on the project’s aims, objectives, minimum requirements, deliverables and further enhancements. After this, all milestones were identified and a schedule was decided on, stating the order in which the milestones had to be met and by what date. The following project schedule was produced:

Date        Work to be done
14/02/03    Complete background research of the problem
28/02/03    Complete the investigation of the report “PRIMES is in P”
07/03/03    Start preliminary study of other existing primality-testing algorithms in view of implementing them
14/03/03    Complete and submit table of contents and draft chapter
21/03/03    Complete write up of the investigation of the report “PRIMES is in P”
04/04/03    Complete the preliminary study of other existing primality-testing algorithms and their implementation
02/05/03    Complete and submit final report with deliverables

As shown above, all the minimum requirements and deliverables were scheduled so that there would be ample time to review the work, to complete any unfinished tasks, and to allow for the possibility of making further enhancements to the project. This project schedule was then reviewed and modified when the table of contents was decided on, to allow some extra time for the investigation of the report “PRIMES is in P”. This was primarily due to some of the changes that were made to the report by the inventors of the algorithm.
1.7.3 Production of Deliverables

The project was successfully completed by producing all of the required deliverables on time, with additional enhancements.

Chapter 2

This chapter describes the impact that the AKS algorithm has had among computer scientists and mathematicians worldwide, and the media coverage it has received.

2.1 Summary of press coverage of the AKS algorithm

The “PRIMES is in P” report [4], produced by Professor Manindra Agrawal with two of his students, Neeraj Kayal and Nitin Saxena, from the Indian Institute of Technology in Kanpur, has caused quite a stir in the fields of theoretical computer science and mathematics. On August 6th 2002 they made available the report, which contained the solution to an ancient problem that has puzzled some of the world’s greatest minds for centuries. In this report they presented a deterministic polynomial-time algorithm which determines whether or not a number is prime without any limitations and with one hundred percent accuracy. More than 30,000 people worldwide had downloaded the paper within 24 hours of it being made available online on August 7th 2002 [5].

The algorithm developed by Agrawal, Kayal and Saxena, now known as the AKS algorithm, is truly a remarkable achievement. It is not at all surprising that the AKS algorithm and its developers have received such tremendous worldwide recognition. As described in the New York Times [6] and the Times of India [7], the development of this algorithm has caused a huge amount of excitement among the many people working and researching in this area. Even some of the world’s most renowned mathematicians and computer scientists, such as Dr. Carl Pomerance at Bell Labs, exhibited great enthusiasm over the development of the AKS algorithm. Dr. Pomerance, who has himself researched this field extensively, was amongst those who were emailed a draft version of the paper containing the result by the Indian scientists.
On the very day of receiving this paper, Dr. Pomerance established its correctness and arranged an impromptu seminar to discuss the result that very afternoon. Dr. Pomerance justified holding a seminar at such short notice by stating that it was “a measure of how wonderfully elegant” this algorithm was, and he described the AKS algorithm as “beautiful”.

Shafi Goldwasser, a professor of computer science at the Massachusetts Institute of Technology and the Weizmann Institute of Science in Israel, was also among the many who have exhibited great appreciation of the groundbreaking algorithm. The professor described it as being “the best result heard of in over ten years”.

The article in the New York Times [6] also elaborates on the point that, despite the vital role of primality testing in encryption systems, the AKS algorithm [6] “has no immediate applications, since existing ones are faster and their error probability can be made so small that it is practically zero.” These “existing” algorithms are probabilistic primality-testing algorithms, such as the Miller-Rabin primality test, which has polynomial time complexity; the deterministic form of this test (due to Miller) depends on the validity of the unproved Extended Riemann Hypothesis. But despite the algorithm not having an immediate practical use, the excitement surrounding the result has not been dampened at all, as it is believed that subsequent improvements and refinements to the algorithm will make it more practical.

Chapter 3

This chapter describes some of the basic mathematical and computational principles needed to understand primality-testing algorithms such as the AKS algorithm and the Miller-Rabin algorithm.

3. Basic Principles

3.1 Number Theory

Number theory is a branch of mathematics in which different types of numbers and the relationships between these numbers are studied.
Prime numbers form the central concept of number theory, since prime numbers are the building blocks of all other numbers. In order to define prime numbers more formally it is necessary to know the following definition.

3.1.1 Definition
“Let a, b be integers. We say that b is divisible by a, or that a divides b, if there exists $c \in \mathbb{Z}$ such that $ac = b$. We then call a a factor or divisor of b and b a multiple of a and write $a \mid b$.” [8, 19]

For example $4 \mid 64$, since $4 \times 16 = 64$, and $-9 \mid 81$, since $(-9) \times (-9) = 81$. But 3 does not divide 2 in $\mathbb{Z}$, since there is no $c \in \mathbb{Z}$ such that $3c = 2$. From the above definition of divisibility, prime numbers can now be defined as follows:

3.1.2 Definition
“The integer p is prime iff (i) $p \neq -1, 0, 1$ and (ii) the only divisors of p in $\mathbb{Z}$ are $-1$, $1$, $p$, and $-p$.” [8, 19]

All integers other than $-1$, 0 and 1 which are not prime are said to be composite. For example the first five prime numbers are 2, 3, 5, 7, 11 and the first five positive composite numbers are 4, 6, 8, 9, and 10.

As well as prime numbers, relatively prime numbers are also immensely significant in number theory and in computer science, due to their use in cryptosystems such as the RSA algorithm. So it is necessary to understand what coprime numbers are and how coprimality can be checked efficiently.

3.1.3 Definition
Two integers are said to be relatively prime or coprime iff they share no positive factors except 1. Two coprime numbers a, b are denoted as $a \perp b$.

An important property of coprime numbers is that for all pairs of coprime numbers a, b, $\gcd(a, b) = 1$. Whether or not a pair of numbers is coprime can therefore be established efficiently by calculating $\gcd(a, b)$ using the Euclidean algorithm.

Having established what prime numbers are, it is important for the purpose of primality testing to establish how many prime numbers exist and whether or not they are distributed evenly. The first problem, of determining how many prime numbers exist, was solved by Euclid [1, 86].
Euclid proved that there exist infinitely many primes. This theorem is known as Euclid’s second Theorem. But before proving this theorem it needs to be shown that any integer greater than 1 can be expressed as a product of primes.

3.1.4 Theorem (modified from [10, 22])
If a is any integer greater than 1, a can be expressed as a product of positive prime numbers.

Proof
The proof is by induction on a. If $a = 2$, then the statement is obviously true. Let the statement be true for all a with $2 \le a \le n - 1$, and consider n.
Case 1: n is prime. In this case the statement obviously holds.
Case 2: n is composite. In this case n has factors other than $-1$, $1$, $-n$, and $n$. Therefore let $n = lm$ where l and m are both positive and neither l nor m is equal to 1 or n. Then $1 < l < n$ and $1 < m < n$. By the induction hypothesis both l and m can be expressed as products of positive primes ⇒ $n = lm$ is a product of positive prime numbers.

The above result can now be used to prove Euclid’s second Theorem.

3.1.5 Theorem (modified from [8, 25-26])
There exist infinitely many primes.

Proof
The statement can be proved by contradiction, assuming that there exist only finitely many primes. Let $p_1, p_2, \dots, p_s$ be all of these primes, and let n be the number
$n = p_1 p_2 \cdots p_s + 1$.
Since any integer greater than 1 can be expressed as a product of primes, n can be factorised as $n = q_1 q_2 \cdots q_r$. Now since $q_1$ must be in the complete list of primes $p_1, p_2, \dots, p_s$, let $q_1 = p_i$ where $1 \le i \le s$. Then $q_1 \mid n$ and $q_1 \mid (n - 1)$, since $n - 1 = p_1 p_2 \cdots p_s$. Therefore $q_1 \mid n - (n - 1)$, i.e. $q_1 \mid 1$. But $q_1$ cannot divide 1 since $q_1$ is prime, therefore the initial assumption is false. Hence there exist infinitely many primes.

Having established that there are infinitely many prime numbers, it would be very useful to have some method of determining how these prime numbers are distributed, since if they were distributed regularly there might be some formula for expressing a prime number. But unfortunately this is not the case.
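As an illustration of Theorem 3.1.4 and of the construction used in Euclid’s argument, the following Python sketch factorises an integer by trial division and applies it to the product of a finite list of primes plus one. The function names prime_factors and euclid_number are illustrative only and are not taken from the project’s own code; trial division is used because it is the simplest method, not because it is efficient.

```python
from math import prod


def prime_factors(n):
    """Return the prime factorisation of n > 1 as a sorted list (trial division)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:      # divide out each prime factor completely
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # whatever remains is itself prime
        factors.append(n)
    return factors


def euclid_number(primes):
    """Product of the given primes plus one, as in the standard form of Euclid's proof."""
    return prod(primes) + 1


primes = [2, 3, 5, 7, 11, 13]
n = euclid_number(primes)
print(n, prime_factors(n))     # 30031 [59, 509]

# No prime factor of n can appear in the original list.
assert all(q not in primes for q in prime_factors(n))
```

The example shows that the constructed number need not itself be prime (30031 = 59 × 509); the argument only requires that its prime factors lie outside the assumed complete list.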
The distribution of prime numbers can be explained by the prime number theorem, which uses the function $\pi(x)$ defined below.

3.1.6 Definition
The function $\pi(x)$, where x is a positive real number, denotes the number of primes not exceeding x. Examples of values of $\pi(x)$ can be seen in section 1.1.1 of appendix B.

3.1.7 Theorem (modified from [1, 71])
The Prime Number Theorem. As x approaches infinity, the ratio of $\pi(x)$ to $x / \log x$ approaches 1.

This theorem has been used in the experimental analysis of the Sieve of Eratosthenes algorithm in chapter 5. The next definition, of congruence, allows us to work with divisibility relationships, and is used as the basis of many important theorems such as Fermat’s little theorem.

3.1.7 Definition
Two integers a, b are called congruent modulo the integer m if and only if m divides $(a - b)$, i.e. $a \equiv b \bmod m$ if and only if $m \mid (a - b)$; otherwise they are said to be incongruent modulo m.

The above definition can now be used to prove Fermat’s little theorem, which not only forms the basis of the identity behind the AKS algorithm but also forms the basis of many probabilistic primality-testing algorithms, such as the Miller-Rabin algorithm, which are presently being used in encryption systems.

3.1.8 Theorem
Fermat’s Little Theorem. If p is a prime number and a is a positive integer such that $\gcd(p, a) = 1$, then $a^{p-1} \equiv 1 \bmod p$.

Proof
The above statement can be rewritten as $a^p \equiv a \bmod p$ ⇒ $p \mid (a^p - a)$. This can be proved by induction on the size of a.
Let $a = 1$; then $1^p \equiv 1 \bmod p$, therefore the statement holds for $a = 1$.
Now assume that the statement is true for $a = k$ ⇒ $k^p \equiv k \bmod p$, and consider $k + 1$:
$(k + 1)^p = k^p + C(p, 1)\,k^{p-1} + C(p, 2)\,k^{p-2} + \dots + C(p, p-1)\,k + 1$ (using the binomial expansion)
⇒ $(k + 1)^p \equiv k^p + 0 + 0 + \dots + 0 + 1 \pmod{p}$, since $p \mid C(p, r)$ if $1 \le r \le p - 1$
⇒ $(k + 1)^p \equiv k^p + 1 \pmod{p}$, but by the induction hypothesis $k^p \equiv k \bmod p$
∴ $(k + 1)^p \equiv k + 1 \pmod{p}$.
Hence by mathematical induction $a^p \equiv a \bmod p$ is true for all a, and when $\gcd(p, a) = 1$ a factor of a may be cancelled to give $a^{p-1} \equiv 1 \bmod p$.
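Fermat’s little theorem can be checked numerically with fast modular exponentiation. The sketch below uses Python’s built-in three-argument pow; the helper name fermat_passes is illustrative and not part of the project’s code. It tests the congruence in the form $a^n \equiv a \bmod n$, first for a few primes and then for the composite number 341, which is discussed in the text that follows.

```python
def fermat_passes(n, a):
    """True if a^n ≡ a (mod n), the rewritten form of Fermat's little theorem."""
    return pow(a, n, n) == a % n


# Every prime passes for every base, as the theorem guarantees.
assert all(fermat_passes(p, a) for p in (2, 3, 5, 7, 11, 13) for a in range(1, 20))

# 341 = 11 * 31 is composite, yet it passes for base 2 and fails for base 7.
print(fermat_passes(341, 2))   # True
print(fermat_passes(341, 7))   # False
```

A single passing base is therefore only evidence of primality, while a single failing base is a proof of compositeness.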
Now although Fermat’s little theorem appears to be a simple and good way of testing for primality, there are certain composite numbers which also pass the test. For example, consider the number 341: 11 divides 341, therefore it obviously cannot be prime. But if we apply Fermat’s little theorem to it with $a = 2$ we get $2^{341} \equiv 2 \pmod{341}$ ⇒ 341 passes Fermat’s little theorem even though it is not prime! But if we apply the test with $a = 7$, it is seen that $7^{341} \not\equiv 7 \pmod{341}$, so with $a = 7$ we get the expected result. It has been shown that, although such numbers are rare, there are infinitely many composite numbers which pass Fermat’s little theorem by behaving like primes for specific values of a, and these numbers are called pseudoprimes.

3.1.9 Definition
A pseudoprime to the base a is a composite integer n which, for a positive integer a with $\gcd(a, n) = 1$, satisfies the equation $a^n \equiv a \pmod{n}$.

So a pseudoprime can be shown to be composite by repeated application of Fermat’s little theorem with different values of a. But if we apply this procedure to the composite number 561, it can be observed that 561 satisfies the equation $a^{561} \equiv a \pmod{561}$ for all values of a such that $\gcd(a, 561) = 1$. This leads to the following definition.

3.1.10 Definition (adapted from [11, 207])
A Carmichael number or absolute pseudoprime is a composite integer n which satisfies the equation $a^n \equiv a \pmod{n}$ for all positive integer values of a where $\gcd(a, n) = 1$.

As described in [11, 208], Carmichael numbers were discovered by Robert Carmichael in the early part of the 20th century, and it was shown only recently, in 1992, that there are infinitely many Carmichael numbers.

3.2 Group and field theory

Basic group and field theory has been used in the AKS algorithm.

3.2.1 Definition
A group consists of a set G together with an operation *, which together satisfy the four properties of closure, identity, inverse and associativity.
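The four group properties can be checked by brute force for small finite examples. The sketch below is an illustration only; the function names is_group and units are hypothetical and not taken from the report. It tests the residues coprime to n under multiplication modulo n, a group that reappears later in the analysis as $Z_r^*$, and contrasts it with a set that fails closure.

```python
from math import gcd


def is_group(elements, op):
    """Brute-force check of closure, associativity, identity and inverses."""
    elems = set(elements)
    # Closure: the operation never leaves the set.
    if not all(op(a, b) in elems for a in elems for b in elems):
        return False
    # Associativity: (a*b)*c == a*(b*c) for every triple.
    if not all(op(op(a, b), c) == op(a, op(b, c))
               for a in elems for b in elems for c in elems):
        return False
    # Identity: some e with e*a == a*e == a for all a.
    identities = [e for e in elems
                  if all(op(e, a) == a and op(a, e) == a for a in elems)]
    if not identities:
        return False
    e = identities[0]
    # Inverses: every a has some b with a*b == b*a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elems) for a in elems)


def units(n):
    """Residues in 1..n-1 that are coprime to n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


print(is_group(units(9), lambda a, b: (a * b) % 9))      # True
print(is_group(range(1, 6), lambda a, b: (a * b) % 6))   # False: 2 * 3 ≡ 0 is outside the set
```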
3.2.2 Definition (modified from [17, 31])
A field F is a set equipped with a multiplication and an addition operation which satisfy the rules of associativity and commutativity of multiplication and addition and the distributive law, and for which there exist an additive identity 0, a multiplicative identity 1, and a multiplicative inverse for every element of the set except 0.

An example of a field is $\mathbb{Z}/p\mathbb{Z}$, the field of integers modulo a prime number p [17, 31].

The following definitions are related to the time complexity analysis of algorithms.

3.3 Time complexity theory

3.3.1 Definition
A Turing machine is an abstract model of a computer’s execution and storage system.

3.3.2 Definition
“A Turing Machine M is said to have time complexity T(n) if whenever M is given input w of length n, M halts after making at most T(n) moves, regardless of whether or not M accepts.” [11, 414]

3.3.3 Definition
“A language L is in class P if there is some polynomial T(n) such that L = L(M) for some deterministic Turing machine M of time complexity T(n).” [11, 414]

3.3.4 Definition
“A language L is in the class NP (nondeterministic polynomial) if there is a nondeterministic Turing Machine M and a polynomial time complexity T(n) such that L = L(M), and when M is given an input of length n there are no sequences of more than T(n) moves of M.” [11, 419]

3.3.5 Definition (adapted from [9, 881-887])
The Θ-notation is used to bound a function asymptotically from above and below. It is used to describe the running time of an algorithm, which can be established by inspecting the structure of the algorithm. The two one-sided bounds are written as follows:
1. O-notation: this gives an asymptotic upper bound on a function.
2. Ω-notation: this gives an asymptotic lower bound on a function.

Other definitions and lemmas related directly to the analysis of the AKS algorithm are stated below.
3.4 Other Important Definitions

All of the following definitions and lemmas are required in order to establish the correctness and time complexity of the AKS algorithm, which is explained in the next chapter.

3.4.1 Definition
The greatest integer function $\lfloor x \rfloor$ gives the largest integer less than or equal to x. For example $\lfloor 1.4 \rfloor = 1$ and $\lfloor -2.6 \rfloor = -3$.

3.4.2 Definition
The least integer function $\lceil x \rceil$ gives the smallest integer greater than or equal to x. For example $\lceil 1.4 \rceil = 2$ and $\lceil -2.6 \rceil = -2$.

3.4.3 Definition
A prime power of a number n is the highest power of a given prime that divides n.

3.4.4 Definition
Let n be a natural number and a be an integer such that $\gcd(a, n) = 1$. Then the order of a modulo n is the smallest positive integer k such that $a^k \equiv 1 \bmod n$. It is denoted by $o_n(a)$. For example the order of 2 modulo 7 is 3, since $2^1 = 2$, $2^2 = 4$, and $2^3 = 8 \equiv 1 \bmod 7$.

3.4.5 Definition
Let n be a natural number; then Euler’s totient function $\varphi(n)$ is the number of positive integers less than n that are relatively prime to n.

It can be seen from a theorem generalised by Euler [12, 4-5], based on Fermat’s Little Theorem, that $a^{\varphi(n)} \equiv 1 \bmod n$ whenever $\gcd(a, n) = 1$. From this it can easily be derived that for any a, n such that $\gcd(a, n) = 1$, $o_n(a) \mid \varphi(n)$.

3.4.6 Definition (adapted from [14, 20-21])
A cyclotomic polynomial is a polynomial given by
$\Phi_r(x) = \prod_{1 \le k \le r,\ \gcd(k, r) = 1} (x - \zeta_k)$,
where $\zeta_k = e^{2\pi i k / r}$ are r-th roots of unity in $\mathbb{C}$. $\Phi_r(x)$ is an integer polynomial (i.e. all of its coefficients are integers) and an irreducible polynomial (i.e. a polynomial which cannot be factorised into polynomials of lower degree over the same field) with degree $\varphi(r)$. The first three cyclotomic polynomials are $\Phi_1(x) = x - 1$, $\Phi_2(x) = x + 1$, and $\Phi_3(x) = x^2 + x + 1$.

Let $h(x) = x^r - 1$. Since h(x) is a polynomial in only one variable x, the roots of h(x) are also called the zeros of h(x) [14, 378]. From this it can be deduced that the r-th roots of unity are precisely the zeros of the polynomial $h(x) = x^r - 1$.
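The quantities defined in this section can be computed directly for small values. The following sketch is illustrative only: the function names are hypothetical, the methods are brute force, and floating-point arithmetic is used for the complex roots, so the check that each root is a zero of $x^r - 1$ is made only up to a small tolerance.

```python
import cmath
from math import gcd


def totient(n):
    """Euler's totient: how many k in 1..n are coprime to n (brute force)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def multiplicative_order(a, n):
    """Smallest k >= 1 with a^k ≡ 1 (mod n); requires gcd(a, n) = 1."""
    if gcd(a, n) != 1:
        raise ValueError("a and n must be coprime")
    k, power = 1, a % n
    while power != 1 % n:
        power = (power * a) % n
        k += 1
    return k


def roots_of_unity(r):
    """The complex numbers e^(2*pi*i*k/r) for k = 1..r."""
    return [cmath.exp(2j * cmath.pi * k / r) for k in range(1, r + 1)]


print(multiplicative_order(2, 7))                 # 3, as in Definition 3.4.4
print(totient(7) % multiplicative_order(2, 7))    # 0: the order divides the totient

# Each root is a zero of h(x) = x^r - 1, up to rounding error.
r = 5
assert all(abs(z ** r - 1) < 1e-9 for z in roots_of_unity(r))
```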
Similarly the primitive r-th roots of unity are precisely the zeros of the r-th cyclotomic polynomial $\Phi_r(x)$. Now since every r-th root of unity is a primitive d-th root of unity for exactly one positive divisor d of r [15], this implies that
$x^r - 1 = \prod_{d \mid r} \Phi_d(x)$.
This formula illustrates the following.

3.4.7 Definition
For the r-th cyclotomic polynomial $\Phi_r(x)$ over $F_p$, $\Phi_r(x) \mid x^r - 1$, and $\Phi_r(x)$ factorises into irreducible factors. It has also been shown that these factors have degree $o_r(p)$. If h(x) now denotes one such irreducible factor of $\Phi_r(x)$, then x is a primitive r-th root of unity in the field $F_p[x]/(h(x))$ [4, 5].

3.4.8 Definition
The number $m \in \mathbb{N}$ is said to be “introspective” [4, 4] for a polynomial f(x) if
$[f(x)]^m \equiv f(x^m) \pmod{x^r - 1,\ p}$.

Furthermore it can easily be shown that:
1. Introspective numbers are closed under multiplication.
2. For each introspective number $m \in \mathbb{N}$, the set of polynomials for which m is introspective is also closed under multiplication.

These properties give rise to the following lemma.

3.4.9 Lemma
Every number in $I = \{ n^i \cdot p^j \mid i, j \ge 0 \}$ is introspective for every polynomial in $P = \{ \prod_{a=1}^{l} (x + a)^{e_a} \mid e_a \ge 0 \}$.

Proof
The proof of this lemma is immediate from the two properties stated above.

Chapter 4

This chapter analyses the correctness and the time complexity of the AKS algorithm.

4. Deterministic primality testing algorithm (AKS algorithm)

The AKS algorithm can determine conclusively whether or not a number is prime. The basic idea behind the algorithm is a version of Fermat’s Little Theorem. All the proofs in this chapter have been established by modifying and elaborating on the proofs given in the report containing the algorithm [4].

4.1 Basis of the AKS algorithm

The following lemma forms the basis of the AKS algorithm.

4.1.1 Lemma: Suppose that a is an integer, n is a natural number with $n \ge 2$, and a and n are coprime (i.e. $\gcd(a, n) = 1$).
Then n is prime iff:
$(x + a)^n \equiv x^n + a \pmod{n}$   (1)

Proof: We need to show that (1) is true for all prime numbers and false for all composite numbers. (1) is true iff $n \mid ((x + a)^n - (x^n + a))$. Now, by the binomial expansion,
$(x + a)^n - (x^n + a) = x^n + n x^{n-1} a + \dots + a^n - (x^n + a)$.
Therefore, for $0 < i < n$, the coefficient of $x^i$ in $(x + a)^n - (x^n + a)$ is $C(n, i)\,a^{n-i}$, and the constant term is $a^n - a$.

(i) Let n be prime.
If n is prime then the coefficients of all $x^i$ terms with $0 < i < n$ are $\equiv 0 \pmod{n}$, since $C(n, i) \equiv 0 \bmod n$. Also $a^n - a \equiv 0 \pmod{n}$ by Fermat’s Little Theorem.
∴ The expression is identically zero over $Z_n$ for prime numbers.

(ii) Let n be composite.
Let q be a prime factor of n and let k be such that $q^k \,\|\, n$, i.e. $q^k$ divides n but $q^{k+1}$ does not. Then ∃b such that $n = b q^k$. Now consider the coefficient of $x^q$ in the expansion of $(x + a)^n - (x^n + a)$, which is $C(n, q)\,a^{n-q}$ with $C(n, q) = n(n-1)\cdots(n-q+1)/q!$. Since $n = b q^k$,
$C(n, q) = \dfrac{b q^k (b q^k - 1) \cdots (b q^k - q + 1)}{q\,(q-1)!} = \dfrac{b q^{k-1} (b q^k - 1) \cdots (b q^k - q + 1)}{(q-1)!}$.
None of the factors $b q^k - 1, \dots, b q^k - q + 1$ is divisible by q, so $C(n, q)$ is not divisible by $q^k$; and since $\gcd(a, n) = 1$, neither is $C(n, q)\,a^{n-q}$. This coefficient is therefore not divisible by n
⇒ if n is composite then n does not divide $(x + a)^n - (x^n + a)$.
Hence $(x + a)^n \equiv x^n + a \pmod{n}$ iff n is prime.

4.1.2 Time analysis of (1)

In the worst possible case, n coefficients on the left hand side would have to be evaluated, which takes time Ω(n), i.e. the algorithm would have running time exponential in the length of n. This time was reduced in the AKS algorithm by evaluating both sides of equation (1) modulo a polynomial of the form $x^r - 1$, for a small number of a’s. So the following equation was used instead of equation (1):
$(x + a)^n \equiv x^n + a \pmod{x^r - 1,\ n}$   (2)
(where r is an appropriately chosen small number).

It is immediate from equation (1) that all prime numbers satisfy equation (2). Equation (2) takes less time to compute, since the number of coefficients is reduced by evaluating both sides of the equation modulo the polynomial $x^r - 1$.
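Both congruences can be tested directly for small inputs. The sketch below is a minimal illustration and not the project’s implementation: the function names are hypothetical, polynomials are stored as coefficient lists of length r, and schoolbook multiplication is used rather than any fast method assumed in a complexity analysis. The first function checks equation (1) coefficient by coefficient; the others check equation (2) by repeated squaring modulo $(x^r - 1, n)$.

```python
from math import comb


def satisfies_identity_1(n, a=1):
    """Check (x + a)^n ≡ x^n + a (mod n) by examining all n - 1 middle coefficients."""
    if (pow(a, n, n) - a) % n != 0:          # constant term a^n - a
        return False
    return all((comb(n, i) * pow(a, n - i, n)) % n == 0 for i in range(1, n))


def polymul_mod(f, g, r, n):
    """Multiply f and g modulo (x^r - 1, n): exponents wrap around modulo r."""
    h = [0] * r
    for i, fi in enumerate(f):
        if fi == 0:
            continue
        for j, gj in enumerate(g):
            h[(i + j) % r] = (h[(i + j) % r] + fi * gj) % n
    return h


def polypow_mod(f, e, r, n):
    """Compute f^e modulo (x^r - 1, n) by repeated squaring."""
    result = [1 % n] + [0] * (r - 1)
    base = f[:]
    while e > 0:
        if e & 1:
            result = polymul_mod(result, base, r, n)
        base = polymul_mod(base, base, r, n)
        e >>= 1
    return result


def satisfies_identity_2(n, a, r):
    """Check (x + a)^n ≡ x^n + a (mod x^r - 1, n)."""
    lhs_base = [0] * r
    lhs_base[0] = a % n
    lhs_base[1 % r] = (lhs_base[1 % r] + 1) % n      # the x term
    rhs = [0] * r
    rhs[0] = a % n
    rhs[n % r] = (rhs[n % r] + 1) % n                # x^n reduces to x^(n mod r)
    return polypow_mod(lhs_base, n, r, n) == rhs


print(satisfies_identity_1(7), satisfies_identity_1(9))   # True False
print(satisfies_identity_2(13, 2, 5))                      # True: 13 is prime
print(satisfies_identity_2(341, 1, 1))                     # True although 341 = 11 * 31
```

The last line shows why the choice of r matters: with r = 1 the test collapses to the base-2 Fermat test, which the composite number 341 passes.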
But the drawback in using equation (2) is that for certain values of a and r some composite numbers also satisfy the equation. This problem can be rectified, since it can be shown that for an appropriately chosen r, n must be a prime power if (2) is satisfied for several a’s. So a deterministic polynomial time algorithm can be obtained, since the number of a’s and the appropriate r are both bounded by a polynomial in log n.

4.2 Proof of Correctness
The AKS algorithm is as follows [4, 3]:
Input: integer n > 1
1. If (n = a^b for a ∈ N and b > 1), output COMPOSITE.
2. Find the smallest number r such that o_r(n) > 4 log^2 n.
3. If 1 < gcd(a, n) < n for some a ≤ r, output COMPOSITE.
4. If n ≤ r, output PRIME.
5. For a = 1 to 2√φ(r) log n do
   if ((x + a)^n ≢ x^n + a (mod x^r − 1, n)), output COMPOSITE;
6. Output PRIME;

In order to determine that the algorithm is correct it is necessary to show that the algorithm returns PRIME if and only if the integer entered is prime. This can be done in two stages, by showing that if n is prime the algorithm returns PRIME, and that if the algorithm returns PRIME then n is prime.

4.2.1 Lemma
If n is prime, the algorithm returns PRIME.
Proof
Since n is a prime number, the if statements in lines 1 and 3 evaluate to false, so the algorithm never returns COMPOSITE there, and by lemma 4.1.1 the for loop in line 5 will also never return COMPOSITE for any value of a. So the algorithm will output PRIME in line 4 or line 6.

The remaining proof is based on the following properties:
1. ∃r such that r = O(log^3 n) and o_r(n) > 4 log^2 n.
2. Let p be a prime divisor of n, with p > r.
3. p, n ∈ Z_r*.
4. Let l = 2√φ(r) log n.
The first property bounds the magnitude of r. This is essential, as an appropriate r has to be found in line 2 of the algorithm. It has been shown in lemma 5.3 of [4, 34] that such an r exists. The second property relates to lines 3 and 4 of the algorithm.
Here p has to be greater than r, since if it were not, then lines 3 and 4 would already have determined whether n was prime or composite. For the same reason n and r are coprime, and since p is a prime divisor of n ⇒ p, n ∈ Z_r*.
Now in line 5 of the algorithm, assuming the algorithm does not output COMPOSITE, the algorithm keeps r fixed and verifies l equations, one for each value of a where 1 ≤ a ≤ l.
⇒ (x + a)^n ≡ x^n + a (mod x^r − 1, n) ∀a where 1 ≤ a ≤ l.
⇒ (x + a)^n ≡ x^n + a (mod x^r − 1, p) ∀a where 1 ≤ a ≤ l (since p is a prime divisor of n).
Also (x + a)^p ≡ x^p + a (mod x^r − 1, p) ∀a where 1 ≤ a ≤ l (by lemma 4.1.1, since p is prime).
⇒ n and p are both introspective numbers for x + a, 1 ≤ a ≤ l (from definition 3.4.8).
The remaining part of the correctness proof considers two groups based on the two sets I and P defined in lemma 3.4.9.
1. Let the first group G1 be the set of all residues of numbers in I modulo r, and let t = |G1|. Then:
• G1 is a subgroup of Z_r* (since gcd(n, r) = gcd(p, r) = 1).
• |G1| > 4 log^2 n. This is because o_r(n) > 4 log^2 n and G1 contains the residues of all powers of n modulo r.
2. Let the second group G2 be the set of non-zero residues of polynomials in P modulo h(x) and p, where h(x) is an irreducible factor of the cyclotomic polynomial Φ_r(x) as described in 3.4.7. Then:
• G2 is a subgroup of the multiplicative group of the field F = F_p[x]/(h(x)).
• G2 is generated by the polynomials x + 1, x + 2, …, x + l in the field F.
Now that the groups G1 and G2 have been defined, it is possible to analyse the size of the group G2.

4.2.2 Lemma
There exist at least C(t + l − 2, t − 1) distinct polynomials of degree less than t in G2.
Proof
This lemma can be proved by contradiction as follows. Assume there exist two distinct polynomials f(x) and g(x) such that f(x), g(x) ∈ P, the degrees of f(x) and g(x) are less than t, and f(x) = g(x) in F. Then for m ∈ I, [f(x)]^m = [g(x)]^m in F.
⇒ f(x^m) = g(x^m) in F (since m is introspective for f(x) and g(x), and since h(x) | x^r − 1)
⇒ 0 = f(x^m) − g(x^m) in F
⇒ x^m is a root of the polynomial p(y) = f(y) − g(y) for all m ∈ G1 (here p(y) denotes a polynomial, not the prime p).
But since G1 is a subgroup of Z_r* ⇒ gcd(m, r) = 1 ⇒ x^m is a primitive rth root of unity for every m ∈ G1, and distinct m ∈ G1 give distinct roots. Therefore the number of distinct roots of the polynomial p(y) in F is at least |G1| = t. But since the degrees of f(x) and g(x) are less than t by the initial assumption, the degree of p(y) also has to be less than t, so p(y) has fewer than t roots in F. This is a contradiction.
⇒ f(x) cannot be equal to g(x) in F.
The elements x + 1, x + 2, …, x + l are also all distinct in F, since l = 2√φ(r) log n < 2√r log n < r and p > r ⇒ l < p, so that i ≠ j in F_p for 1 ≤ i ≠ j ≤ l. However, one of these elements, x + a for some a ≤ l, may be zero in F (when h(x) = x + a), and this element is not included in the group G2. Therefore there are at least l − 1 distinct polynomials of degree one in G2,
⇒ there are at least C(t + l − 2, t − 1) distinct polynomials of degree less than t in G2, i.e. |G2| ≥ C(t + l − 2, t − 1).

After showing that the size of the group G2 is at least exponential in t, it can now also be shown that the size of G2 is upper bounded by an exponential function in t when n is not a power of p.

4.2.3 Lemma
|G2| < (1/2)·n^(2√t) if n is not a power of p.
Proof
Let I1 be the subset of I = {n^i · p^j | i, j ≥ 0} in which both i and j are restricted to be less than or equal to √t. In other words I1 = {n^i · p^j | 0 ≤ i, j ≤ ⌊√t⌋}. Now supposing that n is not a power of p (and since p is a prime factor of n), the elements n^i · p^j of I1 are all distinct, so the number of elements in the set I1 is (⌊√t⌋ + 1)(⌊√t⌋ + 1) = (⌊√t⌋ + 1)^2 > t. This means that |I1| > |G1| (since |G1| = t), so there are at least two elements in I1 which are equal modulo r. Therefore let these two elements be k and l, and also let k > l.
So x^k = x^l (mod x^r − 1), and also for a polynomial f(x) ∈ P:
[f(x)]^k ≡ f(x^k) (mod x^r − 1, p) (since k is introspective)
         ≡ f(x^l) (mod x^r − 1, p)
         ≡ [f(x)]^l (mod x^r − 1, p) (since l is introspective)
⇒ In the field F = F_p[x]/(h(x)), [f(x)]^k = [f(x)]^l for every f(x) ∈ G2
⇒ 0 = [f(x)]^k − [f(x)]^l in F.
Now let q be the polynomial q(y) = y^k − y^l over the field F. Then every f(x) ∈ G2 is a root of q(y), so q(y) has at least |G2| distinct roots in F. But the degree of q(y) is k ≤ (np)^√t < (1/2)·n^(2√t), since p < n (indeed p ≤ n/2, because p is a prime factor of n and by the initial assumption n is not a power of p).
⇒ |G2| < (1/2)·n^(2√t).
Now using the last two lemmas it is possible to complete the remaining part of the correctness proof.

4.2.4 Lemma
If the algorithm returns PRIME then n is prime.
Proof
Let the algorithm return PRIME. From lemma 4.2.2 there exist at least C(t + l − 2, t − 1) distinct polynomials of degree less than t in G2, where l = 2√φ(r) log n and t = |G1|, i.e.
|G2| ≥ C(t + l − 2, t − 1),
but (t − 1) > 2√t log n
⇒ |G2| ≥ C(2√t log n + l − 1, 2√t log n).
Now l = 2√φ(r) log n ≥ 2√t log n (since t = |G1| ≤ φ(r)), so substituting this in the above inequality gives:
|G2| ≥ C(2·2√t log n − 1, 2√t log n).
But 2√t log n ≥ 3 (due to the ranges of t and n), so the inequality now becomes:
|G2| ≥ 2^(2√t log n)
⇒ |G2| ≥ (1/2)·n^(2√t).
Now from lemma 4.2.3, |G2| < (1/2)·n^(2√t) if n is not a power of p.
⇒ n is a power of p, so n must be of the form n = p^k for some k > 0. But from line 1 of the algorithm, if n = a^b for a ∈ N and b > 1 then the algorithm outputs COMPOSITE, so since the algorithm does not output COMPOSITE, k must be equal to 1.
⇒ n = p^1 ⇒ n = p, ∴ n is equal to its prime divisor ⇒ n is prime.
Hence if the algorithm returns PRIME then n is prime.
Therefore from lemma 4.2.1 and lemma 4.2.4 it can be concluded that the algorithm outputs the correct result.
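The six lines of the algorithm in section 4.2 can be sketched in Python as follows. This is only an illustrative sketch and not an implementation produced in this project: all function names are hypothetical, log is assumed to be the base 2 logarithm, the bound in line 5 is rounded down to an integer, and polynomial multiplication is done naively. It is therefore practical only for very small n.

```python
from math import gcd, log2, sqrt


def is_perfect_power(n):
    """Line 1: True if n = a^b for some integers a > 1 and b > 1."""
    for b in range(2, n.bit_length() + 1):
        lo, hi = 2, 1 << (n.bit_length() // b + 1)
        while lo <= hi:  # binary search for an exact b-th root
            mid = (lo + hi) // 2
            power = mid ** b
            if power == n:
                return True
            if power < n:
                lo = mid + 1
            else:
                hi = mid - 1
    return False


def order_mod(n, r):
    """o_r(n): least k >= 1 with n^k ≡ 1 (mod r), or 0 if gcd(n, r) != 1."""
    if gcd(n, r) != 1:
        return 0
    k, x = 1, n % r
    while x != 1:
        x = (x * n) % r
        k += 1
    return k


def phi(r):
    """Euler's totient function, by direct counting."""
    return sum(1 for k in range(1, r + 1) if gcd(k, r) == 1)


def poly_mul(f, g, r, n):
    """Multiply coefficient lists f and g modulo (x^r - 1, n)."""
    out = [0] * r
    for i, fi in enumerate(f):
        if fi:
            for j, gj in enumerate(g):
                out[(i + j) % r] = (out[(i + j) % r] + fi * gj) % n
    return out


def satisfies_eq2(n, a, r):
    """Check (x + a)^n ≡ x^n + a (mod x^r - 1, n) by repeated squaring."""
    result = [1] + [0] * (r - 1)
    base = [a % n, 1] + [0] * (r - 2)
    e = n
    while e:
        if e & 1:
            result = poly_mul(result, base, r, n)
        base = poly_mul(base, base, r, n)
        e >>= 1
    rhs = [0] * r
    rhs[n % r] = 1                 # x^n reduces to x^(n mod r)
    rhs[0] = (rhs[0] + a) % n
    return result == rhs


def aks(n):
    """Follow lines 1-6 of the algorithm; returns "PRIME" or "COMPOSITE"."""
    if n < 2:
        raise ValueError("n must be greater than 1")
    if is_perfect_power(n):                                  # line 1
        return "COMPOSITE"
    log_n = log2(n)
    r = 2
    while order_mod(n, r) <= 4 * log_n ** 2:                 # line 2
        r += 1
    if any(1 < gcd(a, n) < n for a in range(1, r + 1)):      # line 3
        return "COMPOSITE"
    if n <= r:                                               # line 4
        return "PRIME"
    for a in range(1, int(2 * sqrt(phi(r)) * log_n) + 1):    # line 5
        if not satisfies_eq2(n, a, r):
            return "COMPOSITE"
    return "PRIME"                                           # line 6
```

For small inputs the r found in line 2 is larger than n, so the answer is returned in line 3 or line 4 and the polynomial test in line 5 is never reached; satisfies_eq2 can be called directly with a small r to see equation (2) at work, for example satisfies_eq2(13, 1, 5).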
4.3 Time Complexity Analysis
The time complexity of the AKS algorithm can be analysed by examining each line of the algorithm, as has been done in theorem 5.1 of [4, 6]. In this analysis it is seen that the total running time is dominated by line 5 of the algorithm, so the asymptotic time complexity of the algorithm (i.e. the behaviour of the execution time as n approaches infinity) can be determined from the time taken to perform line 5.
In line 5, the congruence of section 4.1.2, (x + a)^n ≡ x^n + a (mod x^r − 1, n), has to be verified for values of a from a = 1 to 2√φ(r) log n
⇒ 2√φ(r) log n equations have to be evaluated. (1)
The next thing to consider is the time taken to evaluate each of these equations.
The number of multiplications required by each equation = O(log n) [13, 102]. (2)
The size of the coefficients = O(log n) [13, 98]. (3)
Also the polynomials have degree less than r. (4)
So from (2), (3) and (4) the time taken to evaluate each equation = O~(r · log n · log n) = O~(r · log^2 n). (5)
And from (1) and (5) the total asymptotic time complexity for evaluating 2√φ(r) log n equations
= O~(r · √φ(r) · log^3 n) = O~(r^(3/2) · log^3 n) = O~((log^3 n)^(3/2) · log^3 n),
since the magnitude of r is bounded by O(log^3 n), where o_r(n) > 4 log^2 n.
⇒ Asymptotic time complexity = O~((log n)^(9/2) · log^3 n) = O~((log n)^4.5 · log^3 n) = O~(log^7.5 n).
It can therefore be concluded that the AKS algorithm is a deterministic primality testing algorithm which has time complexity O~(log^7.5 n).

Chapter 5
This chapter describes and analyses the time complexity of the Sieve of Eratosthenes deterministic primality test.

5. Sieve of Eratosthenes
5.1 Introduction
The Sieve of Eratosthenes is a deterministic primality test, which was invented by Eratosthenes of Cyrene at around 200 B.C. [2, 55]. It is an efficient method of finding all small prime numbers, say those less than 10,000,000.
The procedure behind this is based essentially on the fact that an integer greater than 1 is prime if its only positive divisors are 1 and itself. The method uses this fact to “sieve” out all prime numbers less than or equal to a given number n as follows.
1. Make a list of all integers from 2 to n.
2. Mark 2 as prime and cross out all multiples of 2.
3. Move to the next integer which has not been crossed out, mark it as prime and cross out all multiples of it.
4. Repeat step 3 until all the numbers in the list have either been marked as prime or crossed out.

5.2 Experimental Analysis
When implemented as a C++ program called sieve.cc (see Appendix B section 1.1 for pseudo-code), the program produced a vector containing all prime numbers ≤ n. The correctness of this program was verified by comparing the list of prime numbers ≤ 100000 produced by the program with a list of known prime numbers at [15].
An important point to note about this method is that the while loop in the implemented code of the algorithm only needs to be entered when p^2 ≤ n (see Appendix B section 1.1), that is, when the number being examined is at most the square root of n. The reason for this can be explained by the following theorem.
- If n is a composite integer, then n has a prime factor not exceeding the square root of n (the proof of this can be seen in [11, 67]).
It can be seen more clearly how this result is used in the algorithm by considering an example. Let n = 144, so the square root of n is 12. From the above theorem it can be deduced that all composite numbers less than or equal to 144 must have a prime factor less than or equal to 12. So in this case it is only necessary to cross out all numbers up to 144 which are proper multiples of the primes 2, 3, 5, 7 and 11. All the remaining numbers in the list will be prime. So the Sieve of Eratosthenes basically works for an integer n by checking for divisibility by all prime numbers not exceeding the square root of n.
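The sieving steps and the p^2 ≤ n stopping rule described above can be sketched in Python as follows. This is a minimal illustration and is not the project's sieve.cc; the function name sieve_of_eratosthenes is an assumption made for this example.

```python
def sieve_of_eratosthenes(n):
    """Return a list of all primes less than or equal to n."""
    if n < 2:
        return []
    # is_prime[i] records whether i is still uncrossed; 0 and 1 are not prime.
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    # Only primes p with p^2 <= n need to be used for crossing out.
    while p * p <= n:
        if is_prime[p]:
            # Cross out every proper multiple of p.
            for multiple in range(2 * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, uncrossed in enumerate(is_prime) if uncrossed]


print(sieve_of_eratosthenes(30))
```

For n = 144 the outer loop only crosses out multiples of 2, 3, 5, 7 and 11, as in the worked example above.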
Now although this algorithm works efficiently for small numbers, the time complexity of the algorithm grows exponentially with the size (number of digits) of the input n. This can be seen from the table of results and the graph in Appendix B section 1.1.3. These results were produced by running the program 100 000 times and then dividing the total time obtained by 100 000 to estimate the run time of a single execution of the program. This was necessary for estimating the time taken to execute the program for small values of n. From the table it can be seen that the program runs very quickly for small values of n. It took just 2.2 milliseconds to output all prime numbers less than or equal to 5, but as larger values of n were tested the time taken by the program started to increase rapidly. It took 1186.5 milliseconds for n = 1500. This is the reason why it is not practical to use the Sieve of Eratosthenes when large numbers are required to be tested for primality. One of the reasons why this method has exponential growth is that it requires all numbers less than or equal to n to be stored in some sort of array or vector. So the larger the value of n, the larger the size of the storage vector, which results in greater time complexity.
The program sieve.cc also outputs the value of π(n) for input n. The table of results showing this can be seen in section 1.1.2 of appendix B. The relationship between π(n) and n has been established by the Prime number theorem (theorem 3.1.7). Using this it can be seen that the probability of a random positive integer x being prime is approximately (x / log x) / x = 1 / log x. This shows that prime numbers become scarcer as numbers get larger, which is one of the reasons why determining large prime numbers is such a difficult task.
From all of the above analysis it can be concluded that the Sieve of Eratosthenes is a deterministic primality-testing algorithm, which can be used efficiently to determine primality for small numbers.
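The relationship between π(n) and n / log n used above can be explored with a short Python sketch. It is an illustration only: the function name prime_count is hypothetical, log is taken to be the natural logarithm (as in the Prime number theorem), and the inner loop starts crossing out at p^2 rather than 2p, which is a common shortcut that does not change the result.

```python
from math import log


def prime_count(n):
    """π(n): the number of primes less than or equal to n, found by sieving."""
    if n < 2:
        return 0
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Smaller multiples of p were already crossed out by smaller primes.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return sum(is_prime)


# Compare the observed density π(n)/n with the estimate 1/log(n).
for n in (100, 1000, 10000, 100000):
    pi_n = prime_count(n)
    print(f"n={n}  pi(n)={pi_n}  n/log(n)={n / log(n):.1f}  "
          f"pi(n)/n={pi_n / n:.4f}  1/log(n)={1 / log(n):.4f}")
```

The values of π(n) produced this way can be compared with the table in section 1.1.2 of appendix B, for example π(1000) = 168 and π(100000) = 9592.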
Chapter 6
This chapter describes and analyses the time complexity of the Miller-Rabin primality test.

6. The Miller-Rabin Algorithm
6.1 Introduction
The Miller-Rabin algorithm is a probabilistic primality-testing algorithm, which was developed by Michael Rabin but was based on Gary Miller’s ideas. It is in some ways similar to Fermat’s primality test. This test is actually a strong pseudoprime test. A strong pseudoprime to the base a is an odd composite number n which, when n − 1 is written in the form n − 1 = m·2^s where m is odd, has the property that either a^m ≡ 1 (mod n) or a^t ≡ −1 (mod n), where t = m·2^k for some k ∈ [0, s) [17, 129]. The Miller-Rabin test is able to correctly distinguish between strong pseudoprimes and prime numbers with a very small probability of error. An in-depth description of the algorithm can be found in [9, 889-896] and [11, 209-212], but in brief the main procedure behind this algorithm can be summarized from the above sources as follows.
Let n be the integer to be tested, where n is an odd number greater than 1. (We only need to test for the primality of odd integers since 2 is the only even prime number.)
1. Find an odd m such that n − 1 = 2^k·m.
2. Find a random integer a such that 1 < a < n − 1.
3. If a^m ≡ ±1 (mod n), or a^t ≡ −1 (mod n) where t = m·2^r for at least one r with 1 ≤ r ≤ k − 1, then n might be prime. Otherwise n is definitely composite.
So if the algorithm determines the number to be composite, it does so with 100% accuracy. On the other hand, if the algorithm determines the number to be prime, then there is still a small chance that the number is in fact composite. But the probability of a number established as prime actually being composite is so minute that it can be virtually ignored. It has been shown in [17, 130-131] that a single run of the algorithm has at most a 1 in 4 chance of failing to detect that the number is composite. This probability of error can be greatly reduced by repeated iterations of the algorithm with different values of a.
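Steps 1 to 3 above, together with the repetition over several random bases, can be sketched in Python as follows. This is a hedged illustration and not the project's mr.cc: the function name miller_rabin and the default of 20 rounds are assumptions made for this example, and the built-in pow(a, m, n) is used for modular exponentiation.

```python
import random


def miller_rabin(n, rounds=20):
    """Return "COMPOSITE" if n is certainly composite, otherwise "PRIME".

    A "PRIME" answer can be wrong, but only with a very small probability
    that shrinks with every additional round.
    """
    if n < 2:
        raise ValueError("n must be greater than 1")
    if n in (2, 3):
        return "PRIME"
    if n % 2 == 0:
        return "COMPOSITE"
    # Step 1: write n - 1 = 2^k * m with m odd.
    k, m = 0, n - 1
    while m % 2 == 0:
        m //= 2
        k += 1
    for _ in range(rounds):
        # Step 2: pick a random base a with 1 < a < n - 1.
        a = random.randrange(2, n - 1)
        # Step 3: test a^m ≡ ±1, then a^(m * 2^r) ≡ -1 for 1 <= r <= k - 1.
        x = pow(a, m, n)
        if x == 1 or x == n - 1:
            continue  # n might be prime for this base
        for _ in range(k - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break  # n might be prime for this base
        else:
            return "COMPOSITE"  # this base proves that n is composite
    return "PRIME"


print(miller_rabin(104729), miller_rabin(104731))
```

A prime input is never reported as COMPOSITE, so only the PRIME answer carries a chance of error, and that chance falls with each extra round.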
As an example of this reduction in error, if the test were repeated 100 times by picking 100 different random values of a between 1 and n − 1, then the probability of a composite number being mistaken for a prime would be reduced to less than 10^−60, which is even smaller than the probability of a computer error being made during the execution of the algorithm [12, 211].

6.2 Experimental Analysis
This algorithm was implemented as a C++ program called mr.cc using the previous description of the algorithm, together with the additional information obtained from [16]. The pseudo-code of the implemented algorithm is given in appendix B section 2.1.1. The implemented program takes in a number n greater than 1 and outputs COMPOSITE if the number is definitely composite, or PRIME if the number is prime with a small possibility of error. This possibility of error is at most (1/2)^s, where s is the number of times the routine described in section 2.1.1 of appendix B is repeated (see [16] for proof). The program mr.cc was made to repeat the routine test 20 times, which reduced the probability of an error occurring to (0.5)^20 = 0.954 × 10^−6 < 0.000001. In other words this increased the probability of correctly testing for a prime number to more than 99.9999%.
The accuracy of the program mr.cc was then tested by running the program 40 times consecutively for the same values of n, to test whether or not the program distinguished between prime and composite numbers correctly. The program gave the correct output for all the values of n tested.
The execution time of the program for different input values was then examined using the same approach as with the program sieve.cc described in chapter 5. The only difference in the technique was that the average result of 40 trials was used in the analysis. This had to be done since the results of individual trials for the same value of n were too varied.
The table of results and the graph of the obtained results can be seen in section 2.1.2 of appendix B. From the table and graph it can be seen that the program runs extremely efficiently for different values of n. Unlike sieve.cc, the program implemented using the Sieve of Eratosthenes algorithm, the program mr.cc was able to determine primality quickly for large values of n. The time taken by the program did not grow very quickly as the value of n increased; instead it grew at a roughly linear rate as the size (number of digits) of n grew. For example, the table gives 6.93 milliseconds for n = 1000 and 12.74 milliseconds for n = 1000000.
The time complexity of the Miller-Rabin algorithm can be determined from a famous number theory conjecture called the Generalised Riemann Hypothesis. Although this conjecture has not yet been proved, it is widely believed to be true. Assuming this conjecture is true would imply (from [11, 212]) that the algorithm would use O((log_2 n)^5) bit operations to determine whether a positive integer n is prime. It is because the Miller-Rabin algorithm has this polynomial time complexity and such a low probability of error that it forms the basis of primality testing in many encryption and decryption systems, such as the RSA cryptosystem, and is also used in commercial computer algebra systems such as Mathematica.

Chapter 7
This chapter investigates any possible impact the AKS algorithm may have had in the field of cryptography.

7. The impact of the AKS algorithm in the field of cryptography
Cryptography can be defined as the study of analysing and deciphering code. As described in [17, 54-55], cryptography is a method of taking a message that is required to be sent (this is called the plaintext), disguising it using some sort of coding technique into a coded message (this is called the ciphertext), and then sending it to the recipient.
The recipient knows how the message was coded, so the recipient has all the information required to decode the message, but if some third person intercepts the coded message during transmission, then this person will not be able to decipher the message, so the secrecy of the message is preserved.
Prime numbers play a vital role in many of the presently used cryptosystems, such as the RSA cryptosystem. A brief description of the RSA algorithm is as follows (adapted from [9, 881-887]):
1. Take two large prime numbers p and q such that p ≠ q and multiply them together, so that n = pq.
2. Select a small integer e which is coprime to φ(n) = (p − 1)(q − 1).
3. Compute d as the multiplicative inverse of e mod φ(n).
4. Publish the pair P = (e, n) as the RSA public key and keep secret the pair S = (d, n) as the RSA secret key.
5. Then a one-way function based on the public key is used to encrypt the message, and a one-way function based on the secret key is used to decrypt the message.
So it can be seen from the above that the algorithm is required to find two large prime numbers and multiply them together. At present systems like these use probabilistic primality tests such as the Miller-Rabin primality test, but as described in the last chapter there is always a small chance of error with such tests, so that a composite number may be selected instead of a prime. This is the reason why it would be preferable to use a deterministic primality testing algorithm in encryption systems. However at present it is not practically feasible to use the newly developed AKS algorithm in encryption systems since, even though it has polynomial time complexity, existing randomised algorithms can be made to run considerably faster with negligible error probability.

Chapter 8
This chapter briefly states and investigates the factorisation problem.

8. Brief description and investigation of the factorization problem.
The factorization problem is the problem of expressing a positive integer n as a product of primes. It has already been shown by the fundamental theorem of arithmetic that every integer greater than 1 can be uniquely factorised (see [17, 12] for proof). But even today no one has been able to devise a technique for factorizing a number quickly and efficiently. This problem is considered to be a lot harder than the prime problem. One of the main reasons why this problem is so important is its role in encryption systems, such as the RSA encryption system described in the last chapter. The security of this system relies on the fact that no one knows how to factorize efficiently. If someone does manage to devise a method of doing so then the security of internet communications and transactions would be at serious risk.
Just as with prime numbers, mathematicians and computer scientists are trying very hard to develop efficient factorization algorithms. Many algorithms have been devised, all of which are extremely complex and not very efficient. At present the fastest deterministic algorithm for factorizing is the Pollard-Strassen algorithm [17, 192], but as stated in [9, 896] even with one of today’s supercomputers this algorithm would not be able to feasibly factorize an arbitrary 1024-bit number. So due to the inability to factorize integers efficiently our internet communications and transactions are currently safe!

Chapter 9
9.1 Conclusion
This report contains a detailed description of the underlying mathematical and computational issues surrounding primality testing. It has investigated in detail various primality testing techniques, including the recently developed AKS algorithm. The report also contains the results of the experimental study of two other primality testing algorithms as well as a brief description of the factorization problem.
So it can therefore be concluded that the project has met all of its minimum requirements and has also made further enhancements.

9.2 Future work
There is ample scope for future work in this research area, such as the implementation and experimental study of the AKS algorithm. This could not be done in this project due to the limited time available. There are also many other aspects of the AKS algorithm which have not been explored, such as the effect on the time complexity of the algorithm if the Sophie Germain primes conjecture holds. Further research could also be done into the factorization problem. Due to the enormity of the field of study, the list of further enhancements to this project area is endless.

Appendix A
Project reflection
Overall I found this project extremely interesting. The AKS algorithm, which my project mainly focussed on, had only recently been discovered, so it was an extremely new and exciting area to study. But the main difficulty I had to face with this project was to try and understand the basic mathematical principles behind the algorithm. I had to do a lot of background reading to understand the algorithm and I had to teach myself many new mathematical concepts of number theory. This was very time consuming, and meant that I fell a little behind schedule and had to revise the initial project schedule.
The other difficulty I had was with the write up of the project. This report was written using Microsoft Word, which is not very easy to use if you need to include a lot of mathematical symbols in the write up. But I did eventually figure out a few short cut techniques and tricks which enabled me to type up the second half of the project a lot quicker than the first half.
But despite these difficulties I thoroughly enjoyed researching this area of study; it’s the sort of field in which unanswered questions keep on arising, and as soon as such a question is answered it leads to loads of other questions being asked.
This is probably the main reason why I found it so interesting and fascinating to study.

Appendix B
1.1 Experimental Analysis of the Sieve of Eratosthenes
1.1.1 Pseudo-code
The algorithm can be written in pseudo-code as follows (based on a description of the algorithm at [2, 56-57]):

Eratosthene(n)
{
    primeNumbers[0] = 0
    for i := 1 to (n-1) do
        primeNumbers[i] = 1;
    p := 2
    while p^2 ≤ n do
    {
        j := 2p
        while (j ≤ n) do
        {
            primeNumbers[j-1] = 0
            j := j + p
        }
        repeat p := p + 1 until primeNumbers[p-1] = 1;
    }
    return (primeNumbers)
}

1.1.2 Results
The following results were obtained for π(n) using the program sieve.cc based on the Sieve of Eratosthenes.

n          π(n)
1          0
2          1
3          2
4          2
5          3
10         4
25         9
50         15
100        25
500        95
1000       168
5000       669
10000      1229
50000      5133
100000     9592
5000000    348513

1.1.3 Time analysis
The execution times illustrated in the table may not be very accurate due to factors such as other programs running on the computers at the same time as the analysis was taking place, although every effort was made to ensure that the results obtained were as precise as possible.
Input Number    Time Taken in milliseconds
0               0
2               0.5
3               0.8
4               1.8
5               2.2
6               2.9
7               3.2
8               3.8
9               5.3
10              6
25              16.2
50              34.8
100             68
150             107.8
250             185.1
500             371.7
1000            763.7
1500            1186.5

Graphical representation of results:
[Graph: Sieve of Eratosthenes. Time taken (milliseconds) plotted against input number.]

2.1 Experimental Analysis of the Miller-Rabin Randomised Primality test
2.1.1 Pseudo-code
The pseudo-code for the Miller-Rabin Randomised Primality test is as follows (based on the description of the algorithm in [9, 890-896] and [16]). Here a is the randomly chosen base described in section 6.1, k is the index of the highest bit of n − 1, and a return value of TRUE indicates that n is definitely composite.

Miller-Rabin(n)
{
    d = 1;
    for i = k down to 0
        bit = the ith bit in the binary representation of (n - 1)
        x = d
        d = d * d (mod n)
        if (d == 1) && (x != n - 1) && (x != 1) then return TRUE
        if (bit == 1) then d = d * a (mod n)
    //end of for loop
    if (d != 1) then return TRUE
    return FALSE
}

2.1.2 Time analysis
The execution times illustrated in the table may not be very accurate due to factors such as other programs running on the computers at the same time as the analysis was taking place, although every effort was made to ensure that the results obtained were as precise as possible.

Input Number    Time Taken in milliseconds
0               0
2               1.69
3               1.61
4               1.9
5               2.22
6               2.53
7               2.35
8               2.35
9               2.39
10              2.6
25              3.2
50              3.69
100             4.38
150             4.89
250             5.49
500             6.44
1000            6.93
1500            7.37
2000            7.68
2500            7.72
5000            8.23
10000           9
50000           10.29
100000          11.03
500000          12.19
1000000         12.74

Graphical representation of the results:
[Graph: Miller-Rabin. Time taken (milliseconds) and input number.]

References
[11] K. H. Rosen (2000), Elementary Number Theory, 4th edition, Addison Wesley Longman Inc.
[2] P. Giblin (1993), Primes and Programming, Cambridge University Press.
[3] The Prime Pages. URL http://www.utm.edu/research/primes/notes/13466917/index.html [09/04/03]
[4] M. Agrawal, N. Kayal, N. Saxena. PRIMES is in P.
URL http://www.cse.iitk.ac.in/news/primality.html [28th Jan 2003]
[5] Danny Kingsley, News in Science, ABC Science online. URL http://www.abc.net.au/science/news/stories/s647647.htm [03/04/03]
[6] S. Robinson, New York Times, Section A, Page 20, Column 1, August 8, 2002. URL http://www.nytimes.com/2002/08/08/science/08MATH.html [03/04/03]
[7] Chidanand Rajghatta, Times News Network [August 12, 2002 10:06:25 PM]. URL http://timesofindia.indiatimes.com/articleshow.asp?artid=18891466 [3/04/03]
[8] R. B. J. T. Allenby and E. J. Redfern (1989), Introduction To Number Theory With Computing, Edward Arnold.
[9] T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein (2002), Introduction to Algorithms, 2nd edition, Prentice Hall.
[10] R. B. J. T. Allenby (1991), Rings, Fields and Groups, 2nd edition, Edward Arnold.
[11] J. E. Hopcroft, R. Motwani, J. D. Ullman (2001), Introduction to Automata Theory, Languages, and Computation, 2nd edition, Addison-Wesley.
[12] E. Kranakis (1987), Primality And Cryptography, reprint, John Wiley & Sons Ltd.
[13] J. von zur Gathen, J. Gerhard (1999), Modern Computer Algebra, Cambridge University Press, Cambridge.
[14] Joseph J. Rotman (2002), Advanced Modern Algebra, Pearson Education Inc.
[15] The Prime Pages. URL http://www.utm.edu/research/primes [12/10/03]
[16] Dr David Wessels. URL http://engr.uark.edu/~wessels/algs/notes/prime.html [12/10/03]
[17] N. Koblitz (1994), A Course In Number Theory And Cryptography, 2nd edition, Springer-Verlag New York Inc.