Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Analysis CS 367 – Introduction to Data Structures Overview • Why study data structures? – biggest reason for a computer is to store, analyze, and retrieve data – data structures are used to store data – type of data structure chosen effects • amount of data that can be stored • retrieval speed • speed of analysis Data Structures • Data structure selection involves tradeoffs – speed vs. capacity vs. flexibility vs. complexity • Consider an array versus a vector – which is faster? – which has a larger storage capacity? – which has more flexibility in data placement? – which is more complex? – which one is better? • Must always consider which characteristics are most important for the data being handled Algorithms • Always do some kind of work on data – at least sort or search through it • Many different routes to same destination – consider 10 different people writing the same program • Picking the best route requires analysis – most concerned with how long an algorithm will take Analysis • Computation Complexity – how much effort and cost is associated with a specific algorithm • How do we measure computational complexity? – run algorithm on a set of data and time it? • results depend on speed of computer used – a PC vs. a Cray supercomputer • results depend on language algorithm written in – C vs. Java vs. Basic Analysis • Need a mathematical technique to analyze algorithms – needs to be free of any real-time units (ms) – should relate logical time unit to data size • t = cn t->logical time, c->constant, n->data size • notice that time does not need to given in seconds • Several different techniques have been suggested – we will only consider a few of them Big-O Notation • Formal, mathematical definition f(n) is O(g(n)) if there exist positive numbers c and N such that f(n) ≤ cg(n) for all n ≥ N. – Wow! That probably makes no sense at all • Now let’s put it into English – we want to find some function (g(n)) that is always greater than or equal to the original function (f(n)) • for small values of f(n), g(n) may not be equal or greater • However, once n reaches a certain size, then g(n) will always be greater than or equal to f(n) Big-O Notation • Still confused? Don’t feel bad, let’s see an example – consider the following function: f(n) = n2 + 2n + 35 – now consider this next function: • g(n) = n2 – by the previous definition, we need to find a c and an N that satisfy the following: • cg(n) ≥ f(n) for all n ≥ N Big-O Notation • So let’s find a c and an N – if c=2 then for what values of n is cg(n) ≥ f(n)? • solve the following equation for n: – cn2 ≥ n2 + 2n + 35 • n ≥ 7 (this means N = 7 from previous definition) • it is clear that there are an infinite number of c and N possibilities – what’s important, is g(n) is the same for all of them – but there’s an infinite number of possible g(n), too • g(n) = n2, n3, n4, … – pick the smallest g(n) that works Big-O Notation • The previous slide showed that g(n) = n2 – Hence, our Big-O notation is O(n2) for the function f(n) = n2 + 2n + 35 • Notice that all but the first term of f(n) were discarded to find g(n) – this works because for large n, the largest exponent dominates – consider n=1, n=5, n=100 Analysis • So how does all this math help algorithm analysis? – consider the following piece of code for(i=sum=0; i<n; i++) sum += a[i]; – before loop starts, 2 assignments (i and sum) – 2 assignments within the loop (sum and i) – equation relating time to n is: f(n) = 2 + 2n – computational complexity is: O(n) Analysis • Consider another example int rowsum[n]; int firstcolsum; for(i=firstcolsum=0; i<n; i++) { firstcolsum += a2d[i][0]; for(j=rowsum[i]=0; j<n; j++) rowsum[i] += a2d[i][j]; } – before loop starts: 2 assignments – inside outer loop: 4 assignments – inside inner loop: 2 assignments – equation: f(n) = 2 + 4n + 2n*n = 2 + 4n + 2n2 – computational complexity: O(n2) Analysis • Analysis can get much more sophisticated • Let’s compare 2 array search algorithms – brute force • check the first element, then second, third, etc. – binary search • • • • • requires a sorted array start at the center and divide the array into two if less than center, search lower half if greater than center, search upper half repeat until the number is found Brute Force Analysis • Code int bruteforce(int [] array, int key) { for(int i=0; i<array.length; i++) { if(array[i] == key) return i; } return -1; • Analysis – best case: iterations = 1 – worst case: iterations = n – average case: iterations = n/2 – average computational complexity = O(n) Binary Search Analysis • Code int binarySearch(int [] array, int key) { int lo = 0, hi = array.length – 1, mid; while(lo <= hi) { mid = (lo + hi) / 2; if(key < array[mid]) hi = mid – 1; if(key > array[mid]) lo = mid + 1; else return mid; } return -1; Binary Search Analysis • Analysis – best case: iterations = 1 – worst case: iterations = log2 n – average case: give equation • don’t worry about how this was achieved • see the book if you want details • it should be clear that calculating complexity is not always easy or straightforward – average computational complexity: O(log2 n) Big-Ω Notation • Formal, mathematical definition f(n) is Ω(g(n)) if there exist positive numbers c and N such that f(n) ≥ cg(n) for all n ≥ N. – only real difference between Big-Ω and Big-O is that Big-O is an upper bound on a function and Big-Ω is a lower bound – all the descriptions of Big-O apply – just reverse the “≤” sign in Big-O to a “≥” sign for Big- Ω Big-Θ Notation • Formal, mathematical definition f(n) is Θ(g(n)) if there exist positive numbers c1, c2, and N such that c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ N. • Again, we consider the following equation – f(n) = n2 + 2n + 35 • We have shown that this is O(n2) • It is easy to show that it is also Ω(n2) – if c=1 and g(n)=n2 then cg(n) ≤ f(n) • If c1=1, c2=2, and N=7, we satisfy the above definition and the computational complexity (using Big-Θ) becomes Θ(n2) Words of Warning • There are many other, more sophisticated methods of computational complexity • Those shown here do not always paint a complete picture • However, these techniques can provide some good insight into an algorithm