Chapter Three Notes
The Fundamentals: Algorithms, the Integers, Matrices
Based on: Discrete Math & Its Applications - Kenneth Rosen
CSC125 - Spring 2010
3.1 Algorithms & 3.3 Algorithm Complexity
It is important to understand the resource requirements of various algorithms. The
main resources we need to consider are time and space. First, we look at an example
of time requirements.
Exercises:
1. Download this Visual Basic program and run it for the following inputs:
ab
abc
abcd
abcde
...
abcdefghi
2. Analytically estimate its run time for the following inputs:
abcdefghijklmn
abcdefghijklmnopqr
abcdefghijklmnopqrstuvwxyz
Introduction to Tractability
There are some algorithms for which we are unable to perform the computation
within the resources available to us for certain inputs. In such a case we say that the
computation is intractable.
Dictionary.com defines tractable and intractable thus:
tractable \TRAK-tuh-bul\, adjective:
Capable of being easily led, taught, or managed; docile; manageable; governable; as, tractable
children; a tractable learner.
Tractable derives from Latin tractabilis, from tractare, to handle, to manage, frequentative of
traho, to draw, to drag.
http://dictionary.reference.com/wordoftheday/archive/2000/07/27.html
intractable \in-TRAK-tuh-buhl\, adjective:
1. Not easily governed, managed, or directed; stubborn; obstinate; as, "an intractable child."
2. Not easily wrought or manipulated; as, "intractable materials."
3. Not easily remedied, relieved, or dealt with; as, "intractable problems."
Intractable is from Latin intractabilis, from in-, "not" + tractabilis, "manageable," from trahere, "to
draw (along), to drag, to pull."
http://dictionary.reference.com/wordoftheday/archive/2002/07/23.html
In computer science, a programming task is considered to be tractable if it can
be accomplished in a reasonable period of time or with a reasonable supply of
physical resources (usually space). Otherwise it is intractable. Since the actual run
time of a program varies from machine to machine, it is better to develop an
analytical measure whereby we can compare one algorithm to another. The tools
used for this are the Big-O, Big-Ω, and Big-Θ classifications (see Sections 3.2 &
3.3 of Rosen).
The most often used categories are:
O(1)       ::  constant
O(log n)   ::  logarithmic
O(n)       ::  linear
O(n log n) ::  linearithmic
O(n^k)     ::  polynomial
O(b^n)     ::  exponential
O(n!)      ::  factorial
O(n^n)     ::  polyexponential
Section 3.2 :: Growth of functions
Studying the chart below, it is not immediately clear that the categories are
arranged in ascending order. In the early going numbers in the lower categories are
larger than their counterparts in the higher categories. Little by little, though, the
higher categories begin to assert their dominance over the lower ones. As we begin
to approach n=10, 2^n has established dominance over n^2 and n! has done the same
with 2^n. The proper placement of n^10 and n^100, however, does not become clear until
we reach the larger numbers found in the chart you are to fill in for homework as
Exercise 3.
The picture that begins to emerge is that these categories can be seen as chutes.
Just as an animal cannot jump from one chute to another, so the values obtained in
each category as n grows larger travel along an established path. And, the numbers
found on a path of higher category eventually overtake those on a lower order path.
Even more, they then begin to exceed the lower order ones by exorbitant amounts.
Therefore, it is important for the computer scientist to know to which chute a
particular algorithm belongs, especially if large input values are anticipated.
There is nothing worse than offering your employer a program that initially runs like
a Ferrari, but begins to wheeze and cough and falter and sputter as it is applied to
bigger problems. The moral of the story is: Know which chute your program falls in,
because once it is there, it cannot get out!
The method for establishing which chute an algorithm – and therefore any
program that uses that algorithm – falls in is given in Definitions 1, 2 & 3 on pages
180 and 189 of Rosen. It depends on establishing two constants, C and k, called
witnesses, which certify that two functions, f and g, belong in the same chute
(category). The key insight is that there is a point, given by k, beyond which the values
given by function f are within a fixed multiple of those given by the category function
g. Most discussion of these categories centers on Big-O. For completeness, though,
there are two other means of categorization, Big Omega and Big Theta, which are
discussed below.
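To make the witness idea concrete, here is a small Python sketch (the functions f and g and the witnesses C = 10, k = 1 are illustrative assumptions, not taken from Rosen) that checks the defining inequality |f(x)| <= C|g(x)| for x > k over a range of values:

def f(x):
    return 3 * x**2 + 5 * x + 2    # an "algorithm cost" function

def g(x):
    return x**2                    # the category function

C, k = 10, 1                       # candidate witnesses
print(all(abs(f(x)) <= C * abs(g(x)) for x in range(k + 1, 10000)))   # True

Checking finitely many values is of course not a proof, but it is a handy sanity check that a proposed pair of witnesses is plausible.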
Growth of Functions Chart for Small Numbers

n   log n   n log n   n^2   n^10       n^100       2^n   n!
1   ~       ~         1     1          1           2     1
2   .301    .602      4     ~ 10^3     1.3x10^30   4     2
3   .477    1.43      9     5.9x10^4   5x10^47     8     6
4   .602    2.4       16    ~ 10^6     1.6x10^60   16    24
5   .699    3.49      25    ~ 10^7     7.9x10^69   32    120
6   .778    4.67      36    6x10^7     6.5x10^77   64    720
7   .845    5.92      49    2.8x10^8   3.2x10^84   128   5040
8   .903    7.22      64    ~ 10^9     2x10^90     256   40320
9   .954    8.59      81    3.5x10^9   2.7x10^95   512   362880
Big-O, Big Omega, Big Theta
Big-O establishes an upper bound on the trajectory of a function.
Big-Ω establishes a lower bound on the trajectory of a function.
Big-Θ asserts that the function is an element of both Big-O and Big-Ω.

f(x) ∈ Θ(g(x)) iff f(x) ∈ O(g(x)) and f(x) ∈ Ω(g(x))

If there exist C1 and k1 such that |f(x)| ≤ |C1*g(x)| for all x > k1,
then f(x) ∈ O(g(x)).
If there exist C2 and k2 such that |f(x)| ≥ |C2*g(x)| for all x > k2,
then f(x) ∈ Ω(g(x)).
If f(x) ∈ O(g(x)) and f(x) ∈ Ω(g(x)),
then f(x) ∈ Θ(g(x)).

C and k are called witnesses to the above relationships.
The categories established are known as complexity classes. It is common, though, to
speak of these categories simply as Big-O. Rosen gives the most
common complexity classes in order of increasing complexity in Table 1, p. 196. Be
sure to learn that Table.
Logarithms
For our purposes, it is easiest to understand logarithms in this manner. If a number x
is expressed as another number b raised to an exponent e, then the logarithm of x is e,
and b is called the base of the logarithm. For example:
10000 = 10^4, so log10 10000 = 4
4096 = 2^12, so log2 4096 = 12
Exponents, and the resulting logarithms, need not be integers. Thus,
199526 ≈ 10^5.3, so log10 199526 ≈ 5.3
For our purposes we will not be as interested in that degree of precision,
so we will be more likely to say
⌊log10 199526⌋ = 5 and ⌈log10 199526⌉ = 6, or
5 ≤ log10 199526 ≤ 6.
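As a quick numerical check of these statements, here is a short Python sketch (purely illustrative) using the standard math module:

import math

print(math.log10(10000))    # 4.0
print(math.log2(4096))      # 12.0
print(math.log10(199526))   # about 5.3

# For complexity estimates the integer bounds are usually enough:
print(math.floor(math.log10(199526)), math.ceil(math.log10(199526)))  # 5 6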
Exercises
Exercise 3:
Fill in the cells of this table. For each column you may leave the remaining cells
empty once you have determined that the rest of the column is intractable. For
simplicity you may assume the logarithm is to base 10.
n        log n    n log n    n^2    n^10    n^100    2^n    n!
10^1
10^2
10^3
10^6
10^9
10^12
Exercise 4.
What is the runtime Big-O classification of GenPerms, from the assignment above?
Exercise 5.
What is the space usage Big-O classification of GenPerms, from the assignment
above?
Why Analysis of Algorithms?
Among the most important reasons for studying algorithms are these:
(1) To understand what to expect of our programs, i.e., how long they will take
and how much space will be required for various inputs, and
(2) to choose the best possible algorithms for our programs.
To gain some insight we will study the linear search, binary search, bubble sort and
exchange sort algorithms. Three of these are found in Rosen, pp. 170-174.
An important rule of thumb for establishing the complexity class of an algorithm is
this – zero in on the highest order term and drop any constant coefficients. Thus, if
the run time of an algorithm is given as 15x^5 + 45x^3 - 231x, that algorithm is O(x^5).
With that in mind, examine Algorithm 2, p. 170 of Rosen. What is its complexity
class in terms of n, the number of items in the search list? It is easy to see that if the
item sought is not in the list the while loop will be executed n times. If the item is in
the list and the items we seek are in random locations, that loop will be executed, on
the average, 1/2 n times. Further, there is a constant number of steps in the loop and
a constant number of steps before and after the loop. Using our rule of thumb, we
conclude that the complexity class is O(n).
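For reference, here is a Python sketch of a linear search in the spirit of Rosen's Algorithm 2 (an illustrative reimplementation, not the book's pseudocode verbatim); the single pass over the n items is what makes it O(n):

def linear_search(x, a):
    """Return the 1-based location of x in the list a, or 0 if x is absent."""
    for i, value in enumerate(a, start=1):   # at most n comparisons
        if value == x:
            return i
    return 0

print(linear_search(19, [2, 15, 19, 33, 44, 49, 51, 66, 77, 78]))   # 3
print(linear_search(23, [2, 15, 19, 33, 44, 49, 51, 66, 77, 78]))   # 0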
Let us now look at Algorithm 3, p. 172 of Rosen. Here the analysis takes a bit
more work, since there is no obvious count to the executions of the while loop. We
need an insight. Notice that on each iteration we are cutting the search space in half.
Thus, if we start with 32 items to search, on subsequent iterations we will have 16, 8,
4, 2, and 1 item(s) in our search list. Thus, there were 5 iterations. Notice also that
2^5 = 32. So we are looking at log2 n. But for our example n was an exact power of 2.
What if that is not the case? Suppose n = 637? Since we cannot divide exactly in
half, we will have two segments, one slightly larger than the other. Assuming in each
case that the item sought is in the larger of the two segments, the size of the search
list will be: 319, 160, 80, 40, 20, 10, 5, 3, 2, 1. This gives us 10 iterations. So we see
that the number of iterations is ⌈log2 637⌉ = 10. We conclude, then, that the
complexity class of binary search is O(log2 n).
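Here is a Python sketch that mirrors Rosen's Algorithm 3 (an illustrative translation, with the list 0-indexed internally); because the search interval [i, j] is roughly halved on every pass, the while loop runs about log2 n times:

def binary_search(x, a):
    """Return the 1-based location of x in the sorted list a, or 0 if absent."""
    if not a:
        return 0
    i, j = 1, len(a)                  # endpoints of the search interval
    while i < j:
        m = (i + j) // 2              # midpoint
        if x > a[m - 1]:              # a[m - 1] is the book's a_m
            i = m + 1
        else:
            j = m
    return i if a[i - 1] == x else 0

print(binary_search(51, [2, 15, 19, 33, 44, 49, 51, 66, 77, 78]))   # 7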
We look now at two fairly simple sorting algorithms. Study the description of
Exchange Sort and a trace of its execution. Also study the description of Bubble
Sort, a shorter version of that description, and a trace of its execution.
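Since those descriptions and traces are separate handouts, here are short Python sketches of the two sorts (illustrative versions consistent with the traces worked out in the answer section below; both are O(n^2)):

def exchange_sort(a):
    """Each 'beta sweep' i places the smallest remaining item into position i."""
    a = list(a)
    n = len(a)
    for i in range(n - 1):                 # one beta sweep per position
        for j in range(i + 1, n):
            if a[j] < a[i]:
                a[i], a[j] = a[j], a[i]
    return a

def bubble_sort(a):
    """Each pass bubbles the largest remaining item to the end of the list."""
    a = list(a)
    n = len(a)
    for k in range(n - 1):                 # one pass per unsorted suffix
        for j in range(n - 1 - k):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

print(exchange_sort([92, 8, 73, 47, 23, 39, 44]))   # [8, 23, 39, 44, 47, 73, 92]
print(bubble_sort([42, 8, 73, 97, 23, 39, 5]))      # [5, 8, 23, 39, 42, 73, 97]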
Exercise 6:
Give the initial values of i and j for the binary search algorithm (Alg. #3, p. 172),
and the values of m, am, i and j after each iteration of the while loop when
seeking the positions of 19, 23, and 51 in this sorted set of integers:
2, 15, 19, 33, 44, 49, 51, 66, 77, 78
Exercise 7:
Produce a trace of the execution of both Exchange Sort and Bubble Sort on the
following two lists of items:
92 8 73 47 23 39 44
42 8 73 97 23 39 5
Section 3.4
Divisibility
Modular Arithmetic
Hashing
Section 3.5
Primes
Prime theorems
Prime conjectures
Cryptography – RSA
Recursion – basic structure
1. Base case
2. Recursive call
Dominant control structure – if-then-else to check for base case
Each recursive call must bring us closer to the base case
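A minimal Python sketch of this structure (an illustrative example, not part of the course materials):

def factorial(n):
    """Compute n! recursively."""
    if n == 0:                        # base case
        return 1
    return n * factorial(n - 1)       # recursive call moves closer to the base case

print(factorial(5))   # 120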
Mathematical Induction – basic structure
1. Base case
2. Inductive step
Mathematical Induction – basic structure
1. Prove base case
2. Assume true for k & prove that it then follows for k+1
Mathematical Induction - problem
Prove: The sum of the first n even nonzero integers is n(n+1).
In other words, prove: ∑_{i=1}^{n} 2i = n(n+1)
Prove base case:
To prove: ∑_{i=1}^{1} 2i = 1(1+1)
Proof: ∑_{i=1}^{1} 2i = 2*1 = 2 ✓
If n = 1, then n(n+1) = 1*(1+1) = 2 ✓
Inductive step:
Assume: ∑_{i=1}^{k} 2i = k(k+1)
To prove: ∑_{i=1}^{k+1} 2i = (k+1)[(k+1) + 1]
Proof: ∑_{i=1}^{k+1} 2i = ∑_{i=1}^{k} 2i + 2(k+1) = k(k+1) + 2(k+1)
= (k+1)(k+2) = (k+1)[(k+1) + 1] ∎
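A quick Python sanity check of the formula (an illustration, not a substitute for the proof):

for n in range(1, 20):
    assert sum(2 * i for i in range(1, n + 1)) == n * (n + 1)
print("formula holds for n = 1 through 19")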
Integers, Primes, Number Theory
To know (Section 3.4):
Definitions 1, 2 & 3 (pp. 201-203)
Theorems 1-4 (pp. 202-204)
Corollaries 1 & 2 (pp. 202 & 205)
To know (Section 3.5):
Definitions 1, 2, 3 & 5 (pp. 210, 215-217)
Theorems 1-5 (pp. 211-213, 217)
Prime decomposition (prime factorization)
To know (Section 3.6):
Algorithm 5 (p. 226)
Lemma 1 (p. 228)
Euclidean algorithm (pp. 229 & 313)
Sorting & Searching
Exchange Sort – time is O(n^2)
Bubble Sort – time is O(n^2)
Linear Search – time is O(n)
Binary Search – time is O(log2 n)
Give the results that would be produced at the end of each beta sweep if the
integers below were sorted using Exchange Sort.
92 8 73 47 23 39 44
Answer:
After beta sweep # 1:  8 92 73 47 23 39 44
After beta sweep # 2:  8 23 92 73 47 39 44
After beta sweep # 3:  8 23 39 92 73 47 44
After beta sweep # 4:  8 23 39 44 92 73 47
After beta sweep # 5:  8 23 39 44 47 92 73
After beta sweep # 6:  8 23 39 44 47 73 92
Show the process of sorting the numbers below using Bubble Sort by writing
out the result after the completion of each inner loop.
42 8 73 97 23 39 5
Answer:
After pass # 1:  8 42 73 23 39 5 97
After pass # 2:  8 42 23 39 5 73 97
After pass # 3:  8 23 39 5 42 73 97
After pass # 4:  8 23 5 39 42 73 97
After pass # 5:  8 5 23 39 42 73 97
After pass # 6:  5 8 23 39 42 73 97
Give the initial values of i and j for the binary search algorithm (Alg. #3, p. 172),
and the values of m, am, i and j after each iteration of the while loop when
seeking the positions of 19, 23, and 51 in this sorted set of integers:
2, 15, 19, 33, 44, 49, 51, 66, 77, 78
Answer: In all 3 cases the initial values are: i = 1 and j = 10

x = 19
iteration   m   am   i   j
1           5   44   1   5
2           3   19   1   3
3           2   15   3   3
The loop ends with i = j = 3; a3 = 19 = x, so the location is 3.

x = 23
iteration   m   am   i   j
1           5   44   1   5
2           3   19   4   5
3           4   33   4   4
The loop ends with i = j = 4; a4 = 33 ≠ 23, so the location is 0.

x = 51
iteration   m   am   i   j
1           5   44   6   10
2           8   66   6   8
3           7   51   6   7
4           6   49   7   7
The loop ends with i = j = 7; a7 = 51 = x, so the location is 7.
Counting
To know (Section 5.1):
Product Rule
Sum Rule
Subtraction Principle (principle of inclusion-exclusion)
Tree diagrams
To know (Section 5.2):
Pigeonhole Principle
Generalized Pigeonhole Principle
To know (Section 5.3):
Permutations
r-permutations – Corollary 1 (p. 356)
Combinations
r-combinations – Theorem 2 (p. 358) – denoted C(n, r)
To know (Section 5.4):
Binomial Theorem (p. 363)
Pascal’s Triangle
To know
Theorem 1 (p.184) in Rosen.
Table 1, p. 196. Be sure to learn that Table.
Table 2, p. 198.
Answers
Exercises:
1. Download this Visual Basic program and run it for the following inputs:
ab
abc
abcd
abcde
...
abcdefghi
The actual values depend on the machine, OS, etc. Here is one set of
answers::
ab - less than 1 millisecond
abc - less than 1 millisecond
abcd - 0.0078125
5 * 0.0078125 = 0.0390625
abcde - 0.015625
6 * 0.015625 = 0.09375 ≈ 0.109375
abcdef - 0.109375
7 * 0.109375 = 0.765625 ≈ 0.4921875
abcdefg - 0.4921875
8 * 0.4921875 = 3.9375 ≈ 4.085938
abcdefgh - 4.085938
9 * 4.085938 = 36.773442 ≈ 38.40625
abcdefghi - 38.40625
10 * 38.40625 = 384.0625 ≈ 379.8906
abcdefghij - 379.8906
Analysis:
Although there are two exceptions (abcd => abcde and abcdef => abcdefg), the run
time for a string of length n+1 is roughly (n+1) times the run time for a string of length n.
Since few computers are fully dedicated to one process, we can expect exceptions for
those cases where the total run time is short.
Input string was ab
Begin time was 73678.66
End time was 73678.66
Elapsed time in seconds:: 0
Elapsed time (min:sec):: 0 : 0
Input string was abc
Begin time was 73682.41
End time was 73682.41
Elapsed time in seconds:: 0
Elapsed time (min:sec):: 0 : 0
Input string was abcd
Begin time was 73686.9
End time was 73686.91
Elapsed time in seconds:: 0.0078125
Elapsed time (min:sec):: 0 : 0
Input string was abcde
Begin time was 73691.13
End time was 73691.15
Elapsed time in seconds:: 0.015625
Elapsed time (min:sec):: 0 : 0
Input string was abcdef
Begin time was 73696.72
End time was 73696.83
Elapsed time in seconds:: 0.109375
Elapsed time (min:sec):: 0 : 0
Input string was abcdefg
Begin time was 73701.62
End time was 73702.11
Elapsed time in seconds:: 0.4921875
Elapsed time (min:sec):: 0 : 0
Input string was abcdefgh
Begin time was 73707.52
End time was 73711.61
Elapsed time in seconds:: 4.085938
Elapsed time (min:sec):: 0 : 4
Input string was abcdefghi
Begin time was 73721.95
End time was 73760.36
Elapsed time in seconds:: 38.40625
Elapsed time (min:sec):: 0 : 38
Input string was abcdefghij
Begin time was 73768.93
End time was 74148.82
Elapsed time in seconds:: 379.8906
Elapsed time (min:sec):: 6 : 20
2. Analytically estimate its run time for the following inputs:
abcdefghijklmn
Run time for abcdefghij = 379.8906 sec.
So, run time for abcdefghijklmn = 379.8906*11*12*13*14
= 9126491.7744 sec. = 152108.19624 min. = 2535.136604 hrs.
= 105.6 days
abcdefghijklmnopqr
Run time for abcdefghijklmn = 105.6 days
So, run time for abcdefghijklmnopqr = 105.6*15*16*17*18
= 7757518 days = 21253.5 years = 212.535 centuries
abcdefghijklmnopqrstuvwxyz
Run time for abcdefghijklmnopqr = 212.535 centuries
So, run time for abcdefghijklmnopqrstuvwxyz
= 26!/18! * 212.535 centuries = 62990928000 * 212.535
= 13,387,776,882,480 centuries
= 1.34 x 10^13 centuries
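The arithmetic above can be reproduced with a few lines of Python (illustrative only; the starting value 379.8906 seconds is the measured time reported earlier):

seconds_10 = 379.8906                         # measured run time for "abcdefghij"
seconds_14 = seconds_10 * 11 * 12 * 13 * 14   # estimate for "abcdefghijklmn"
print(seconds_14, seconds_14 / 3600, seconds_14 / 86400)   # sec, hours, ~105.6 days

days_18 = (seconds_14 / 86400) * 15 * 16 * 17 * 18          # "abcdefghijklmnopqr"
print(days_18, days_18 / 365, days_18 / 36500)              # days, years, centuries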
One further note:
There is a second VB program whose runtime varies significantly from the one we
just now analyzed. You will notice that when it is run, it does not print out the
permutations that it generates. Thus, we see that for input of abcdefgh it runs about
17 times faster. Why? The first program runs slower because it takes time to place
the output to the screen. In the long run, though, it does not help us overcome the
tractability problem. With a string only one letter longer, we are again waiting
minutes for the program to finish. So, for input of abcdefghijklmnopqrstuvwxyz we
are still waiting centuries for the program to end.
abcde - less than 1 millisecond
abcdef - 0.0078125
7 * 0.0078125 = 0.0546875 ≈ 0.046875
abcdefg - 0.046875
8 * 0.046875 = 0.375 > 0.2421875
abcdefgh - 0.2421875
9 * 0.2421875 = 2.1796875 ≈ 2.070313
abcdefghi - 2.070313
10 * 2.070313 = 20.70313 ≈ 19.97656
abcdefghij - 19.97656
11 * 19.97656 = 219.74216 ≈ 207.6172
abcdefghijk - 207.6172
Input string was abcde
Begin time was 75178.52
End time was 75178.52
Elapsed time in seconds:: 0
Elapsed time (min:sec):: 0 : 0
Input string was abcdef
Begin time was 75183.4
End time was 75183.41
Elapsed time in seconds:: 0.0078125
Elapsed time (min:sec):: 0 : 0
Input string was abcdefg
Begin time was 75188.99
End time was 75189.04
Elapsed time in seconds:: 0.046875
Elapsed time (min:sec):: 0 : 0
Input string was abcdefgh
Begin time was 75195.17
End time was 75195.41
Elapsed time in seconds:: 0.2421875
Elapsed time (min:sec):: 0 : 0
Input string was abcdefghi
Begin time was 75201.56
End time was 75203.63
Elapsed time in seconds:: 2.070313
Elapsed time (min:sec):: 0 : 2
Input string was abcdefghij
Begin time was 75211.86
End time was 75231.84
Elapsed time in seconds:: 19.97656
Elapsed time (min:sec):: 0 : 20
Input string was abcdefghijk
Begin time was 75242.13
End time was 75449.74
Elapsed time in seconds:: 207.6172
Elapsed time (min:sec):: 3 : 28
Exercise 4.
What is the runtime Big-O classification of GenPerms, from the assignment above?
If we build a tree with branches representing calls to the function Perms,
for input of abcde that tree would have 5 nodes at level 1, 5*4 = 20 at level 2,
5*4*3 = 60 at level 3, 5*4*3*2 = 120 at level 4, and 5*4*3*2*1 = 120 at level 5.
So the total number of calls to Perms = 5 + 20 + 60 + 120 + 120 = 325.
But notice that 5 = 5!/4!, 20 = 5!/3!, 60 = 5!/2!, 120 = 5!/1!, and 120 = 5!/0!.
So we have ∑_{j=0}^{4} 5!/j! = 5! * ∑_{j=0}^{4} 1/j!. In the general case we have
∑_{j=0}^{n-1} n!/j! = n! * ∑_{j=0}^{n-1} 1/j!. It turns out that ∑_{j=0}^{n-1} 1/j!
approaches e = 2.71828182845904523536..., which is the base of the natural
logarithms and is sometimes called Euler's number.
Thus, we find that the run time is approximately e*n!, so its classification is O(n!).
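To see these counts concretely, here is a hedged Python sketch of a recursive permutation generator in the style of Perms (the actual GenPerms/Perms code is the course's Visual Basic program, which is not reproduced here; the names perms, prefix, and rest are illustrative); a counter confirms the 325 calls for a 5-character input:

calls = 0

def perms(prefix, rest, output):
    """Extend prefix with every permutation of rest; count each call."""
    global calls
    calls += 1
    if not rest:                          # base case: a full permutation
        output.append(prefix)
        return
    for i, ch in enumerate(rest):         # one recursive call per remaining letter
        perms(prefix + ch, rest[:i] + rest[i + 1:], output)

result = []
for i, ch in enumerate("abcde"):          # level 1 of the tree: 5 calls
    perms(ch, "abcde"[:i] + "abcde"[i + 1:], result)

print(len(result), calls)                 # 120 permutations, 325 calls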
Exercise 5.
What is the space usage Big-O classification of GenPerms, from the assignment
above?
It would be very natural to expect that the space usage would be in a
similar category, but that is not the case. Going back to the tree we built to
show the function calls to Perms, we observe that whenever there are nested
function calls we only need to allocate memory for the functions that have
not yet completed execution. Therefore, as we go from branch to branch on
our tree, we only keep one branch in memory at a time. If the tree displays
execution in a left-to-right manner, branches to the left of the current
branch have completed execution (and no longer require memory) and
branches to the right represent future function calls (and do not yet require
memory).
Thus, the amount of memory needed is proportional to the height of the
tree (or, if you prefer, the depth of the nested function calls), and this value is
n. Therefore, the space usage is O(n).
Growth of Functions Chart

n       log n   n log n     n^2     n^10     n^100     2^n             n!
10^1    1       10^1        10^2    10^10    10^100    ~ 10^3          3.6x10^6
10^2    2       2x10^2      10^4    10^20    10^200    ~ 10^30         ~ 10^158
10^3    3       3x10^3      10^6    10^30    10^300    ~ 10^300        ~ 10^2567
10^6    6       6x10^6      10^12   10^60    10^600    ~ 10^(3x10^5)   ~ 10^(5x10^6)
10^9    9       9x10^9      10^18   10^90    10^900    ~ 10^(3x10^8)   ~ 10^(8x10^9)
10^12   12      1.2x10^13   10^24   10^120   10^1200   ~ 10^(3x10^11)  ~ 10^(11x10^12)
Logarithms
No need to be out of sorts
1. Exchange Sort
   Algorithm
   Trace
   Complexity class
2. Bubble Sort
   Algorithm & shorter version
   Trace
   Complexity class
Searching
1. Linear Search - Rosen, p. 170
2. Binary Search - Rosen, p. 172
Integers, Primes, Number Theory
Section 3.4
Divisibility
Modular Arithmetic
Hashing
To know (Section 3.4):
Definitions 1, 2 & 3 (pp. 201-203)
Theorems 1-4 (pp. 202-204)
Corollaries 1 & 2 (pp. 202 & 205)
Section 3.5
Primes
Prime theorems
Prime conjectures
Cryptography – RSA
To know (Section 3.5):
Definitions 1, 2, 3 & 5 (pp. 210, 215-217)
Theorems 1-5 (pp. 211-213, 217)
Prime decomposition (prime factorization)
To know (Section 3.6):
Lemma 1 (p. 228)
Euclidean algorithm (pp. 229 & 313)
Recursion & Mathematical Induction
Recursion – basic structure
1. Base case
2. Recursive call
Dominant control structure – if-then-else to check for base case
Each recursive call must bring us closer to the base case
Mathematical Induction – basic structure
1. Base case
2. Inductive step
Mathematical Induction – basic structure
1. Prove base case
2. Assume true for k & prove that it then follows for k+1
Mathematical Induction - problem
Prove: The sum of the first n even nonzero integers is n(n+1).
In other words, prove: ∑_{i=1}^{n} 2i = n(n+1)
Prove base case:
To prove: ∑_{i=1}^{1} 2i = 1(1+1)
Proof: ∑_{i=1}^{1} 2i = 2*1 = 2 ✓
If n = 1, then n(n+1) = 1*(1+1) = 2 ✓
Inductive step:
Assume: ∑_{i=1}^{k} 2i = k(k+1)
To prove: ∑_{i=1}^{k+1} 2i = (k+1)[(k+1) + 1]
Proof: ∑_{i=1}^{k+1} 2i = ∑_{i=1}^{k} 2i + 2(k+1) = k(k+1) + 2(k+1)
= (k+1)(k+2) = (k+1)[(k+1) + 1] ∎
Counting
To know (Section 5.1):
Product Rule
Sum Rule
Subtraction Principle (principle of inclusion-exclusion)
Tree diagrams
To know (Section 5.2):
Pigeonhole Principle
Generalized Pigeonhole Principle
To know (Section 5.3):
Permutations
r-permutations – Corollary 1 (p. 356)
Combinations
r-combinations – Theorem 2 (p. 358) – denoted C(n, r)
To know (Section 5.4):
Binomial Theorem (p. 363)
Pascal’s Triangle