Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CSC401 – Analysis of Algorithms Chapter 1 Algorithm Analysis Objectives: Introduce algorithm and algorithm analysis Discuss algorithm analysis methodologies Introduce pseudo code of algorithms Asymptotic notion of algorithm efficiency Mathematics foundation for algorithm analysis Amortization analysis techniques What is an algorithm ? Input Algorithm Output An algorithm is a step-by-step procedure for solving a problem in a finite amount of time. What is algorithm analysis ? Two aspects: • Running time – How much time is taken to complete the algorithm execution? • Storage requirement – How much memory is required to execute the program? 2 Running Time – Easier to analyze – Crucial to applications such as games, finance and robotics best case average case worst case 120 100 Running Time Most algorithms transform input objects into output objects. The running time of an algorithm typically grows with the input size. Average case time is often difficult to determine. We focus on the worst case running time. 80 60 40 20 0 1000 2000 3000 4000 Input Size 3 How Is An Algorithm Analyzed? Two methodologies 9000 8000 – Experimental analysis Time (ms) Method 7000 – Write a program implementing the algorithm – Run the program with inputs of varying size and composition – Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time – Plot the results 6000 5000 4000 3000 2000 1000 0 0 50 100 Input Size Limitations – It is necessary to implement the algorithm, which may be difficult – Results may not be indicative of the running time on other inputs not included in the experiment. – In order to compare two algorithms, the same hardware and software environments must be used 4 How Is An Algorithm Analyzed? – Theoretical Analysis Method – Uses a high-level description of the algorithm instead of an implementation – Characterizes running time as a function of the input size, n. – Takes into account all possible inputs – Allows us to evaluate the speed of an algorithm independent of the hardware/software environment Characteristics – A description language -- Pseudo code – Mathematics 5 Pseudocode Example: find max element of an array High-level description of an algorithm More structured than Algorithm arrayMax(A, n) English prose Input array A of n integers Output maximum element of A Less detailed than a program currentMax A[0] Preferred notation for for i 1 to n 1 do describing algorithms if A[i] currentMax then Hides program design currentMax A[i] issues return currentMax 6 Pseudocode Details Control flow – – – – – – if … then … [else …] switch … while … do … repeat … until … for … do … Indentation replaces braces Method declaration Algorithm method (arg [, arg…]) Input … Output … Method call var.method (arg [, arg…]) Return value return expression Expressions Assignment (like in Java) Equality testing (like in Java) n2 Superscripts and other mathematical formatting allowed Array indexing: A[i] 7 The Random Access Machine (RAM) Model A CPU An potentially unbounded bank of memory cells, each of which can hold an arbitrary number or character 2 0 1 Memory cells are numbered and accessing any cell in memory takes unit time. 8 Primitive Operations Basic computations performed by an algorithm Identifiable in pseudocode Largely independent from the programming language Exact definition not important (we will see why later) Assumed to take a constant amount of time in the RAM model Examples: – Evaluating an expression – Assigning a value to a variable – Performing an arithmetic operation – Comparing two numbers – Indexing into an array – Following an object reference – Calling a method – Returning from a method 9 Counting Primitive Operations By inspecting the pseudocode, we can determine the maximum number of primitive operations executed by an algorithm, as a function of the input size Algorithm arrayMax(A, n) currentMax A[0] for i 1 to n 1 do if A[i] currentMax then currentMax A[i] { increment counter i } return currentMax # operations 2 1+n 2(n 1) 2(n 1) 2(n 1) 1 Total 7n 2 10 Estimating Running Time Algorithm arrayMax executes 7n 2 primitive operations in the worst case. Define: a = Time taken by the fastest primitive operation b = Time taken by the slowest primitive operation Let T(n) be the worst-case time of arrayMax. Then a (7n 1) T(n) b(7n 1) Hence, the running time T(n) is bounded by two linear functions 11 Growth Rate of Running Time Changing the hardware/ software environment – Affects T(n) by a constant factor, but – Does not alter the growth rate of T(n) – Linear n – Quadratic n2 – Cubic n3 In a log-log chart, the slope of the line corresponds to the growth rate of the function T (n ) The linear growth rate of the running time T(n) is an intrinsic property of algorithm arrayMax 1E+30 Growth rates of 1E+28 Cubic 1E+26 functions: Quadratic 1E+24 1E+22 1E+20 1E+18 1E+16 1E+14 1E+12 1E+10 1E+8 1E+6 1E+4 1E+2 1E+0 1E+0 Linear 1E+2 1E+4 1E+6 n 1E+8 1E+10 12 Constant Factors The growth rate is not affected by Examples – 102n + 105 is a linear function – 105n2 + 108n is a quadratic function T (n ) – constant factors or – lower-order terms 1E+26 1E+24 1E+22 1E+20 1E+18 1E+16 1E+14 1E+12 1E+10 1E+8 1E+6 1E+4 1E+2 1E+0 1E+0 Quadratic Quadratic Linear Linear 1E+2 1E+4 1E+6 1E+8 1E+10 n 13 Big-Oh Notation Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that f(n) cg(n) for n n0 Example: 2n + 10 is O(n) – – – – 3n 2n+10 1,000 n 100 10 1 1 2n + 10 cn (c 2) n 10 n 10/(c 2) Pick c 3 and n0 10 Example: the function not O(n) 10,000 10 100 1,000 n 1,000,000 n^2 100n 100,000 n2 is – n2 cn – n c – The above inequality cannot be satisfied since c must be a constant 10n n 10,000 1,000 100 10 14 1 1 10 100 1,000 More Big-Oh Examples 7n-2 7n-2 is O(n) need c > 0 and n0 1 such that 7n-2 c•n for n n0 this is true for c = 7 and n0 = 1 3n3 + 20n2 + 5 3n3 + 20n2 + 5 is O(n3) need c > 0 and n0 1 such that 3n3 + 20n2 + 5 c•n3 for n n0 this is true for c = 4 and n0 = 21 3 log n + log log n 3 log n + log log n is O(log n) need c > 0 and n0 1 such that 3 log n + log log n c•log n for n n0 this is true for c = 4 and n0 = 2 15 Big-Oh and Growth Rate The big-Oh notation gives an upper bound on the growth rate of a function The statement “f(n) is O(g(n))” means that the growth rate of f(n) is no more than the growth rate of g(n) We can use the big-Oh notation to rank functions according to their growth rate g(n) grows more f(n) grows more Same growth f(n) is O(g(n)) g(n) is O(f(n)) Yes No Yes No Yes Yes 16 Big-Oh Rules If is f(n) a polynomial of degree d, then f(n) is O(nd), i.e., 1.Drop lower-order terms 2.Drop constant factors Use the smallest possible class of functions – Say “2n is O(n)” instead of “2n is O(n2)” Use the simplest expression of the class – Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)” 17 Asymptotic Algorithm Analysis The asymptotic analysis of an algorithm determines the running time in big-Oh notation To perform the asymptotic analysis – We find the worst-case number of primitive operations executed as a function of the input size – We express this function with big-Oh notation Example: – We determine that algorithm arrayMax executes at most 7n 2 primitive operations – We say that algorithm arrayMax “runs in O(n) time” Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations 18 Summations Summation: b f (i) f (a) + f (a + 1) + f (a + 2) + + f (b) i a 1 a n+1 a 1+ a + a ++ a Geometric summation: 1 a i 0 1 i 2 n a 1+ a + a ++ a + 1 a i 0 n i 2 n n Natural summation: i 1 + 2 + 3 + + n i 0 n(n + 1) 2 1 1 1 1 Harmonic number: 1 + + + + + ln n 2 3 n i 0 i Split summation: n k n i 0 i 1 i k +1 f (i) f (i) + f (i) 19 Logarithms and Exponents Logarithms – properties of logarithms: logb(xy) = logbx + logby logb (x/y) = logbx - logby logbxa = alogbx logba = logxa/logxb Exponents – properties of exponentials: a(b+c) = aba c abc = (ab)c ab /ac = a(b-c) b = a logab bc = a c*logab Floor and ceiling functions – – x the largest integer less than or equal to x x the smallest integer greater than or equal to x 20 Proof Techniques By Example -- counterexample Contrapositive – Principle: To justify “if p is true, then q is true”, show “if q is not true, then p is not true”. – Example: To justify “if ab is odd, a is odd or b is even”, assume “’a is odd or b is odd’ is not true”, then “a is even and b is odd”, then “ab is even”, then “’ab is odd’ is not true” Contradiction – Principle: To justify “p is true”, show “if p is not true, then there exists a contradiction”. – Example: To justify “if ab is odd, then a is odd or b is even”, let “ab is odd”, and assume “’a is odd or b is even’ is not true”, then “a is even and b is odd”, thus “ab is even” which is a contradiction to “ab is odd”. 21 Proof by Induction Induction -- To justify S(n) for n>=n0 – Principle Base cases: Justify S(n) is true for n0<=n<=n1 Assumption: Assume S(n) is true for n=N>=n1; or Assume S(n) is true for n1<=n<=N Induction: Justify S(n) is true for n=N+1 – Example: Question: Fibonacci sequence is defined as F(1)=1, F(2)=1, and F(n)=F(n-1)+F(n-2) for n>2. Justify F(n)<2^n for all n>=1 Proof by induction where n0=1. Let n1=2. – Base cases: For n0<=n<=n1, n=1 or n=2. If n=1, F(1)=1<2=2^1. If n=2, F(2)=2<4=2^2. It holds. – Assumption: Assume F(n)<2^n for n=N>=2. – Induction: For n=N+1, F(N+1)=F(N)+F(N1)<2^N+2^(N-1)<2 2^N=2^(N+1). It holds. 22 Proof by Loop Invariants Principle: – To prove a statement S about a loop is correct, define S in terms of a series of smaller statements S0, S1, …, Sk, where The initial claim S0 is true before the loop begins If Si-1 is true before iteration i begins, then show that Si is true after iteration i is over The final statement Sk is true Example: Consider the algorithm arrayMax – Statement S: max is the maximum number when finished – A series of smaller statements: Si: max is the maximum Algorithm arrayMax(A, n) in the first i+1 elements max A[0] of the array for i 1 to n 1 do S0 is true before the loop if A[i] max then If Si is true, then easy to max A[i] return max show Si+1 is true 23 S=Sn-1 is also true Basic Probability Sample space: the set of all possible outcomes from some experiment Probability space: a sample space S together with a probability function that maps subsets of S to real numbers between 0 and 1 Event: Each subset of A of S called an event Properties of the probability function Pr – – – – Pr(Ø)=0 Pr(S)=1 0<=Pr(A)<=1 for any subset A of S If A and B are subsets of S and AB=, then Pr(AB)=Pr(A)+Pr(B) Independence – A and B are independent if Pr(AB)=Pr(A)Pr(B) – A1, A2, …, An are mutually independent if Pr(A1A2… An)=Pr(A1)Pr(A2)…Pr(An) 24 Basic Probability Conditional probability – The conditional probability that A occurs, given B, is defined as: Pr(A|B)=Pr(AB)/Pr(B), assuming Pr(B)>0 Random variables – Intuitively, Variables whose values depend on the outcomes of some experiment – Formally, a function X that maps outcomes from some sample space S to real numbers – Indicator random variable: a variable that maps outcomes to either 0 or 1 Expectation:a random variable has random values – Intuitively, average value of a random variable – Formally, expected value of a random variable X is defined as E(X)=xxPr(X=x) – Properties: Linearity: E(X+Y) = E(X) + E(Y) Independence: If X and Y are independent, that is, Pr(X=x|Y=y)=Pr(X=x), then E(XY)=E(X)E(Y) 25 Computing Prefix Averages We further illustrate asymptotic analysis with two algorithms for prefix averages The i-th prefix average of an array X is average of the first (i + 1) elements of X: A[i] (X[0] + X[1] + … + X[i])/(i+1) Computing the array A of prefix averages of another array X has applications to financial analysis 35 30 X A 25 20 15 10 5 0 1 2 3 4 5 6 7 26 Prefix Averages (Quadratic) The following algorithm computes prefix averages in quadratic time by applying the definition Algorithm prefixAverages1(X, n) Input array X of n integers Output array A of prefix averages of X #operations A new array of n integers n for i 0 to n 1 do n s X[0] n for j 1 to i do 1 + 2 + …+ (n 1) s s + X[j] 1 + 2 + …+ (n 1) A[i] s / (i + 1) n return A 1 27 Arithmetic Progression The running time of prefixAverages1 is O(1 + 2 + …+ n) The sum of the first n integers is n(n + 1) / 2 – There is a simple visual proof of this fact Thus, algorithm prefixAverages1 runs in O(n2) time 7 6 5 4 3 2 1 0 1 2 3 4 5 6 28 Prefix Averages (Linear) The following algorithm computes prefix averages in linear time by keeping a running sum Algorithm prefixAverages2(X, n) Input array X of n integers Output array A of prefix averages of X A new array of n integers s0 for i 0 to n 1 do s s + X[i] A[i] s / (i + 1) return A #operations n 1 n n n 1 Algorithm prefixAverages2 runs in O(n) time 29 Amortization Amortization – Typical data structure supports a wide variety of operations for accessing and updating the elements – Each operation takes a varying amount of running time – Rather than focusing on each operation – Consider the interactions between all the operations by studying the running time of a series of these operations – Average the operations’ running time Amortized running time – The amortized running time of an operation within a series of operations is defined as the worst-case running time of the series of operations divided by the number of operations – Some operations may have much higher actual running time than its amortized running time, while some others 30 have much lower The Clearable Table Data Structure The clearable table – An ADT Storing a table of elements Being accessing by their index in the table – Two methods: add(e) -- add an element e to the next available cell in the table clear() -- empty the table by removing all elements Consider a series of operations (add and clear) performed on a clearable table S – Each add takes O(1) – Each clear takes O(n) – Thus, a series of operations takes O(n2), because it may consist of only clears 31 Amortization Analysis Theorem: – A series of n operations on an initially empty clearable table implemented with an array takes O(n) time Proof: – Let M0, M1, …, Mn-1 be the series of operations performed on S, where k operations are clear – Let Mi0, Mi1, …, Mik-1 be the k clear operations within the series, and others be the (n-k) add operations – Define i-1=-1,Mij takes ij-ij-1, because at most ij-ij-1-1 elements are added by add operations between Mij-1 and Mij – The total time of the series is: (n-k) + ∑k-1j=0(ij-ij-1) = n-k + (ik-1 - i-1) <= 2n-k Total time is O(n) Amortized time is O(1) 32 Accounting Method The method – Use a scheme of credits and debits: each operation pays an amount of cyber-dollar – Some operations overpay --> credits – Some operations underpay --> debits – Keep the balance at any time at least 0 Example: the clearable table – Each operation pays two cyber-dollars – add always overpays one dollar -- one credit – clear may underpay a variety of dollars the underpaid amount equals the number of add operations since last clear - 2 – Thus, the balance is at least 0 – So the total cost is 2n -- may have credits 33 Potential Functions Based on energy model – associate a value with the structure, Ø, representing the current energy state – each operation contributes to Ø a certain amount of energy t’ and consumes a varying amount of energy t – Ø0 -- the initial energy, Øi -- the energy after the i-th operation – for the i-th operation ti -- the actual running time t’i -- the amortized running time ti = t’i + Øi-1 - Øi – Overall: T=∑(ti), T’=∑(t’i) – The total actual time T = T’ + Ø0 - Øn – As long as Ø0 <= Øn, T<=T’ 34 Potential Functions Example -- The clearable table – the current energy state Ø is defined as the number of elements in the table, thus Ø>=0 – each operation contributes to Ø t’=2 – Ø0 = 0 – Øi = Øi-1 + 1, if the i-th operation is add add: t = t’+ Øi-1 - Øi = 2 - 1 = 1 – Øi = 0, if the i-th operation is clear clear: t =t’+ Øi-1 = 2 + Øi-1 – Overall: T=∑ (t), T’=∑(t’) – The total actual time T = T’ + Ø0 - Øn – Because Ø0 = 0 <= Øn, T<=T’ =2n 35 Extendable Array ADT: Extendable array is an array with extendable size. One of its methods is add to add an element to the array. If the array is not full, the element is added to the first available cell. When the array is full, the add method performs the following – – – – Allocate a new array with double size Copy elements from the old array to the new array Add the element to the first available cell in the new array Replace the old array with the new array Question: What is the amortization time of add? – Two situations of add: add -- O(1) Extend and add -- (n) – Amortization analysis: Each add deposits 3 dollars and spends 1 dollar Each extension spends k dollars from the size k to 2k (copy) 36 Amortization time of add is O(1) Relatives of Big-Oh big-Omega – f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0 1 such that f(n) c•g(n) for n n0 big-Theta – f(n) is (g(n)) if there are constants c’ > 0 and c’’ > 0 and an integer constant n0 1 such that c’•g(n) f(n) c’’•g(n) for n n0 little-oh – f(n) is o(g(n)) if, for any constant c > 0, there is an integer constant n0 0 such that f(n) c•g(n) for n n0 little-omega – f(n) is (g(n)) if, for any constant c > 0, there is an integer constant n0 0 such that f(n) c•g(n) for n n0 37 Intuition for Asymptotic Notation Big-Oh – f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n) big-Omega – f(n) is (g(n)) if f(n) is asymptotically greater than or equal to g(n) big-Theta – f(n) is (g(n)) if f(n) is asymptotically equal to g(n) little-oh – f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n) little-omega – f(n) is (g(n)) if is asymptotically strictly greater than g(n) 38 Example Uses of the Relatives of Big-Oh 5n2 is (n2) f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0 1 such that f(n) c•g(n) for n n0 let c = 5 and n0 = 1 5n2 is (n) f(n) is (g(n)) if there is a constant c > 0 and an integer constant n0 1 such that f(n) c•g(n) for n n0 let c = 1 and n0 = 1 5n2 is (n) f(n) is (g(n)) if, for any constant c > 0, there is an integer constant n0 0 such that f(n) c•g(n) for n n0 need 5n02 c•n0 given c, the n0 that satisfies this is n0 c/5 0 39