Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
COMP 2402/2002 Abstract Data Types and Algorithms Prof: Eduardo Mesa Office: HP 5347 Email: [email protected] Office hours: Tuesday and Thursday 4:30pm – 5:30pm Web Site: http://people.scs.carleton.ca/~eamesaba/ and also WebCT Textbook: Open Data Structures (in Java). The pdf can be downloaded from the website TAs Andrew Trenholm Tawfic Abdul-Fatah • No office hours for TAs • Prompt answer to questions in: − Web-CT discussion − Carleton Computer Science Society forum. Evaluation Assignments 30% 3 assignments 10% each Midterm: 30% 2 Midterms 15% each Final Exam: 40% Active 5% Participation: bonus Assignments No late assignment will be accepted. • 1 Theory assignment − Must be handled first thing in class. − All pages must be stapled • 2 Programming Assignments − Must be uploaded on Web CT − Make a folder (<Student ID>_<First Name>_<Last Name>_ <Assignment #>) − Put all the source files in the folder − Add a text file with your Id and full name − Zip the folder End of the Introduction Begining of the Lecture Data Type a data storage format that can contain a specific type or range of values characterized by a set of operations that satisfy a set of specific properties. Example Data Type Int 31 31 Range of Value: -(2 ) to (2 ) Operations: +, -, *, / Properties: •Symmetry: a+b = b+a, a*b = b*a •Associative a+(b+c) = (a+b)+c •Definition: a/b , b must not be 0 Data Structures How to organize data to be able to perform operations on those data efficiently. A variable is the simplest data structure. Example Electronic Phone Book Contains different DATA: - Names - phone numbers - addresses Need to perform certain OPERATIONS: - add - delete - look for a phone number - look for an address Example Electronic Phone Book How to organize the data so to optimize the efficiency of the operations List Binary Search Tree Dictionary A B X Z Example Finding the best route for an email message in a network Contains DATA: - Network + Traffic Need to perform certain OPERATIONS: - Find the best route 10 12 7 43 8 16 21 18 14 9 Example Electronic Phone Book How to represent the data Adjacency Matrix Adjacency List Abstract Data Type (ADT) (interfaces) Define what operation can be done. Implementations (algorithms) Define how each operation is performed. A same ADT could have several different implementations. ADT (Shelf) Peter Lucy GetBook ( Book b ) Binary Search Random Search Data Structures So to perform the operations efficiently we need: •Identify your data •Identify the operations you need to perform (and how often each operation is performed) •Define efficiency •Choose the best structure for your data. Analysis of Algorithms Input Algorithm Output An algorithm is a step-by-step procedure for solving a problem in a finite amount of time. Analyze an algorithm = determine its efficiency Running time … Efficiency ? Memory … Quality of the result …. Simplicity …. Generally, while improving the efficiency in one of these aspects we diminish the efficiency in the others Running time Best case Average case Worst case The running time depends on the input size 100 Running Time It also depends on the input data: Different inputs can have different running times 120 80 60 40 20 0 1000 2000 3000 Input Size 4000 Running Time of an algorithm – Easier to analyze – Crucial to applications such as games, finance and robotics 120 100 Running Time • Average case time is often difficult to determine. • We focus on the worst case running time. best case average case worst case 80 60 40 20 0 1000 2000 3000 4000 Input Size 19 Example …. 15 4 If x is odd return x If x is even compute the sum S of the first x integers return S 15 10 20 Measuring the Running Time • How should we measure the running time of an algorithm? • Approach 1: Experimental Study t (ms) 60 50 40 30 20 10 0 50 100 n 21 Beyond Experimental Studies • Experimental studies have several limitations: – need to implement – limited set of inputs – hardware and software environments. 22 Theoretical Analysis • We need a general methodology that: - Uses a high-level description of the algorithm (independent of implementation). Characterizes running time as a function of the input size. Takes into account all possible inputs. Is independent of the hardware and software environment. 23 Analysis of Algorithms • Primitive Operations: Low-level computations independent from the programming language can be identified in pseudocode. • Examples: – calling a method and returning from a method – arithmetic operations (e.g. addition) – comparing two numbers, etc. By inspecting the pseudo-code, we can count the number of primitive operations executed by an algorithm. 24 Example: Algorithm arrayMax(A, n): Input: An array A storing n integers. Output: The maximum element in A. currentMax A[0] for i 1 to n -1 do if currentMax < A[i] then currentMax A[i] return currentMax 25 A 5 13 4 7 6 2 3 8 1 2 currentMax currentMax A[0] for i 1 to n -1 do if currentMax < A[i] then currentMax A[i] return currentMax 26 A 5 13 4 currentMax 7 6 2 3 8 1 2 5 currentMax A[0] for i 1 to n -1 do if currentMax < A[i] then currentMax A[i] return currentMax 27 A 5 13 4 currentMax 7 6 2 3 8 1 2 5 currentMax A[0] for i 1 to n -1 do if currentMax < A[i] then currentMax A[i] return currentMax 28 What are the primitive operations to count ? A 5 13 4 currentMax 7 6 13 2 3 8 1 2 Comparisons Assignments to currentMax currentMax A[0] for i 1 to n -1 do if currentMax < A[i] then currentMax A[i] return currentMax 29 currentMax A[0] for i 1 to n -1 do if currentMax < A[i] then currentMax A[i] 1 assignment n-1 n-1 comparisons assignments (worst case) return currentMax 5 7 8 10 11 12 14 16 17 20 30 In the best case ? 15 1 12 3 9 7 6 currentMax A[0] for i 1 to n -1 do if currentMax < A[i] then currentMax A[i] return currentMax 4 2 2 1 assignment n-1 0 comparisons assignments Summarizing: Worst Case: n-1 comparisons n assignments Best Case: n-1 comparisons 1 assignment Compute the exact number of primitive operations could be difficult We compare the asymptotic behaviour of the running time when the size of the input rise. Big-Oh (upper bound) – given two functions f(n) and g(n), we say that f(n) is O(g(n)) if and only if there are positive constants c and n0 such that f(n) c g(n) for n n0 c • g(n) f(n) n0 n 33 3 g(n) = 3 n What does it mean c g(n) ? 2 g(n) = 2 n Example: g(n) = n n 34 34 g(n) = n2 Graphical example … f(n) = 2n+1 f(n) is O(n2) ? n n0≈2.5 c=1 f(n) c g(n) for n n0 35 But also f(n) = 2n+1 g(n) = n n f(n) c g(n) for n n0 36 But also f(n) = 2n+1 2 g(n) = 2 n n f(n) c g(n) for n n0 37 But also f(n) = 2n+1 3 g(n) = 3 n f(n) is O(n) c = 3 and n0 = 1 n f(n) c g(n) for n n0 38 On the other hand… n2 is not O(n) because there is no c and n0 such that: n2 cn for n n0 ( no matter how large a c is chosen there is an n big enough (n > c) that n2 > c n ). n2 39 n0 n Notice: O(g(n)) is a set of functions •When we say f(n) = O(g(n)) we really mean f(n) ϵ O(g(n)) Formal definition of big-Oh: O(g(n)) = {f(n) : there exists positive constants c and n0 such that f(n) cg(n) for all n n0 } Example: Prove that f(n) = 60n2 + 5n + 1 is O(n2) We must find a constant c and a constant n0 such that: 60n2 + 5n + 1 ≤ c n2 for all n≥n0 5n ≤ 5n2 for all n≥1 1 ≤ n2 for all n≥1 f(n) ≤ 66n2 for all n≥1 f(n) ≤ 60n2 +5n2 + n2 for all n≥1 c= 66 et n0=1 => f(n) = O(n2) 41 Example: Prove f(n) = 5n log2 n + 8n - 200 = O(n log2 n) 5n log2 n + 8n - 200 ≤ 5n log2 n + 8n ≤ 5n log2 n + 8n log2 n for n ≥ 2 (log2 n ≥ 1) ≤ 13n log2 n f(n) ≤ 13n log2 n for all n ≥ 2 f(n) ϵ O(n log2 n ) [ c = 13, n0 = 2 ] Some commons relations c1 c2 O(n ) ⊂ O(n ) for any c1 < c2 For any constants a; b; c > 0, O(a) ⊂ O(log n) ⊂ O(nb) ⊂ O(cn) These make things faster 2 log2 n + 2 = O(log n) n + 2 = O(n) 2n + 15n1/2 = O(n) We can multiply these to learn about other functions, O(an) = O(n) ⊂ O(n log n) ⊂ O(n1+b) ⊂ O(ncn) Examples: O(n1/5) ⊂ O(n1/5 log n) Theorem: If g(n) is O(f(n)) , then for any constant c > 0 g(n) is also O(c f(n)) Theorem: O(f(n) + g(n)) = O(max(f(n), g(n))) Ex 1: 2n3 + 3n2 = O (max(2n3, 3n2)) = O(2n3) = O(n3) Ex 2: n2 + 3 log n – 7 = O(max(n2, 3 log n – 7)) = O(n2) Simple Big Oh Rule: Drop lower order terms and constant factors 7n-3 is O(n) 8n2log n + 5n2 + n is 12n3 + 5000n2 + 2n4 O(n2log n) is O(n4) 45 Other Big Oh Rules: •Use the smallest possible class of functions –Say “2n is O(n)” instead of “2n is O(n2)” •Use the simplest expression of the class –Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)” 46 Asymptotic Notation (terminology) • Special classes of algorithms: constant: O(1) logarithmic: O(log n) linear: O(n) quadratic: O(n2) cubic: O(n3) polynomial: O(nk), k >0 exponential: O(an), n > 1 Example of Asymptotic Analysis An algorithm for computing prefix averages The i-th prefix average of an array X is average of the first (i + 1) elements of X A[i] = X[0] + X[1] + … + X[i] 5 13 4 5 (i + 1) 8 6 2 3 8 1 2 9 7.3 7.5 … … … … … … 48 Example of Asymptotic Analysis Algorithm prefixAverages1(X, n) Input array X of n integers Output array A of prefix averages of X #operations A new array of n integers for i 0 to n 1 do s X[0] for j 1 to i do s s + X[j] A[i] s / (i + 1) return A 49 i 5 13 4 8 6 2 3 8 1 2 j=0 5 50 i 5 13 4 8 6 2 3 8 1 2 j = 0,1 5 9 51 i 5 13 4 8 6 2 3 8 1 2 j = 0,1,2 5 9 7.3 52 i 5 13 4 8 6 2 3 8 1 2 j = 0,1,2,3 5 9 7.3 7.5 53 Example of Asymptotic Analysis Algorithm prefixAverages1(X, n) Input array X of n integers Output array A of prefix averages of X A new array of n integers for i 0 to n 1 do s X[0] for j 1 to i do s s + X[j] A[i] s / (i + 1) return A #operations n n n 1 + 2 + …+ (n 1) 1 + 2 + …+ (n 1) n 1 54 • The running time of prefixAverages1 is O(1 + 2 + …+ n) • The sum of the first n integers is n(n + 1) / 2 – There is a simple visual proof of this fact 7 6 5 4 3 2 1 0 1 2 3 4 5 6 55 Thus, algorithm prefixAverages1 runs in time O(n(n + 1) / 2) which is O(n2) TO REMEMBER 1 + 2 + …+ n = n(n+1) 2 56 Another Example: A better algorithm for computing prefix averages Algorithm prefixAverages2(X): Input: An n-element array X of numbers. Output: An n -element array A of numbers such that A[i] is the average of elements X[0], ... , X[i]. Let X be an array of n numbers. s 0 for i 0 to n-1 do s s + X[i] A[i] s/(i+ 1) return array A 57 i 5 13 4 8 6 2 3 8 1 2 s=0 5 58 i 5 13 4 8 6 2 3 8 1 2 s=5 5 9 59 i 5 13 4 8 6 2 3 8 1 2 s=18 5 9 7.3 60 i 5 13 4 8 6 2 3 8 1 2 s=22 5 9 7.3 7.5 61 Another Example: A better algorithm for computing prefix averages Let X be an array of n numbers. s 0 for i 0 to n-1 do s s + X[i] A[i] s/(i+ 1) return array A # operations 1 n n n 1 O(n) time 62 big-Omega (lower bound) f(n) is (g(n)) if there exist c > 0 and n0 > 0 such that f(n) c • g(n) for all n n0 (thus, f(n) is (g(n)) iff g(n) is O(f(n)) ) f(n) c • g(n) n0 n 63 big-Theta … is big theta … g(n) is (f(n)) <===> if g(n) O(f(n)) AND f(n) O(g(n)) 64 Example: We have seen that f(n) = 60n2 + 5n + 1 is O(n2) but 60n2 + 5n + 1 60n2 So: with c = 60 f(n) c • n2 for n 1 and n0 = 1 for all n 1 f(n) is (n2) f(n) is O(n2) AND f(n) is (n2) f(n) is (n2) 65 Intuition for Asymptotic Notation Big-Oh – f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n) big-Omega – f(n) is (g(n)) if f(n) is asymptotically greater than or equal to g(n) big-Theta – f(n) is (g(n)) if f(n) is asymptotically equal to g(n) 66 Math You Need to Review Logarithms and Exponents properties of logarithms: logb(xy) = logbx + logby logb (x/y) = logbx - logby logbxa = alogbx logbax = (1/a)logbx logba= logxa/logxb Natural logarithm: ln k = ∫ (1/x)dx k 1 e = limk→∞(1+1/n)n ≈ 2.71828 67 Math You Need to Review Logarithms and Exponents properties of exponentials: a(b+c) = aba c abc = (ab)c ab /ac = a(b-c) b = a logab bc = a c*logab 68 More Math to Review • Floor: x = the largest integer ≤ x • Ceiling: x = the smallest integer ≥ x • Summations: – Arithmetic progression: – Geometric progression: 69 More Math to Review Arithmetic Progression n S = di = 0 + d i=0 2S = + 2d + … + nd = nd+(n-1)d+(n-2)d + … + 0 nd + nd + nd + …+ nd = (n+1) nd S = d/2 n(n+1) for d=1 S = 1/2 n(n+1) 70 More Math to Review Geometric Progression n S = ri = 1 + r + r 2 + … + r n rS = i=0 r + r2 + … + rn + rn+1 n rS - S = ri = -1 - r - r2 - … - rn i=0 r + r2 + … + rn + rn+1 rS - S = (r-1)S = rn+1 - 1 S = (rn+1-1)/(r-1) If r=2, S = (2n+1-1) 71 Math You Need to Review Randomization and Probability Expected value E[X] = ∑x ϵ U(x*Pr{X=x}) Properties E[X + Y ] = E[X] + E[Y ] E[∑i =1..k(Xi)] = ∑i =1..k (E[Xi]) 72