CSE 101: Algorithm Design and Analysis
Miles Jones and Russell Impagliazzo
[email protected] [email protected]
Lecture 12: Implementing Kruskal's Algorithm Using Data Structures for Disjoint Sets

ALGORITHMS WITH VERY LARGE OUTPUTS
Challenge: think of a succinctly describable algorithm A (say, one that fits with large writing on an index card).
No system calls.
On each integer, A terminates and outputs an integer, not infinity.
Goal: make A(10) as big as possible.

ACKERMANN FUNCTIONS
A_1(n, m) = n + m
A_{i+1}(n, m) = n if m = 1, and A_i(n, A_{i+1}(n, m-1)) if m > 1
So A_{i+1} applies A_i to n "m times".
Example: A_2(n, m) = n + A_2(n, m-1), so A_2(n, m) = n * m.
A_3(n, m) = n * A_3(n, m-1), so A_3(n, m) = n^m.

TOWER FUNCTION
OK, A_4(n, m) = n^{A_4(n, m-1)}, so A_4(n, m) = Tower(n, m) = n^(n^(...^n)), a tower of n's, m high.

THE TOWER FUNCTION GROWS PRETTY LARGE
T(2,2) = 2^2 = 4
T(2,3) = 2^4 = 16
T(2,4) = 2^16 = 65,536
T(2,5) = 2^65536, much larger than the number of particles times time quanta in the universe's history
T(2,6): even if you used each particle in each time quantum in the universe, you couldn't write this number in binary
T(2,7): even if, in each time quantum, each quantum event split the universe into parallel universes, and you used each particle in each time quantum in each parallel universe to store a bit, you couldn't write this number down: a pretty big number

INVERSES OF LARGE FUNCTIONS
If we have a quickly growing function, its inverse defines a non-constant but very slowly growing function.
Example: Exp(n) = 2^n; its inverse is log(n).
The inverse of Tower(2, n) is called log* n: the number of times we take log before we get below 1.
log* n > 7 only for numbers n that are too big to be written on all particles in all parallel universes, so that doesn't come up too often.

BUT THE TOWER FUNCTION IS JUST THE FOURTH ACKERMANN FUNCTION
A_5(2, 2) = 4
A_5(2, 3) is a tower of 2's 4 high: 65,536
A_5(2, 4) is a tower of 2's 65,536 high.
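The recursive definition of the A_i hierarchy above translates directly into code. A minimal sketch (the function name is my own; the checks use the closed forms derived above):

```python
def A(i, n, m):
    """i-th Ackermann-hierarchy function: A_1 is addition,
    A_2 multiplication, A_3 exponentiation, A_4 the tower."""
    if i == 1:
        return n + m
    if m == 1:
        return n
    return A(i - 1, n, A(i, n, m - 1))

# Sanity checks against the closed forms above.
assert A(2, 3, 4) == 3 * 4        # A_2(n, m) = n * m
assert A(3, 2, 5) == 2 ** 5       # A_3(n, m) = n^m
assert A(4, 2, 3) == 16           # Tower(2, 3) = 2^(2^2) = 16
# Even A(4, 2, 4) = Tower(2, 4) = 65,536 already overflows
# Python's default recursion stack if computed this way.
```

That last comment is the point of the slide: the definitions are tiny, but the values explode almost immediately.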
(Start with that huge number, and exponentiate it another 65,532 times.)
A_5(2, 5) is a tower of 2's A_5(2, 4) high... (I'll let you imagine.)

THE ACTUAL ACKERMANN FUNCTION
Ack(i, n, m) = A_i(n, m) can be defined by the following simple recursion:
Ack(i, n, m):
    IF i = 1, return n + m
    IF m = 1, return n
    Return Ack(i-1, n, Ack(i, n, m-1))
Ack(n) = Ack(n, n, n)
Imagine what Ack(10) is. It's pretty big.
Of course, if there's room left on the index card, we could keep going, say, looking at Ack composed with itself n times...

INVERSE ACKERMANN
α(n) = the smallest j so that Ack(j) ≥ n. It goes to infinity as n grows, but very, very slowly.

HIGH-LEVEL KRUSKAL'S ALGORITHM
Instance: an undirected graph G, with edge weights w(e)
Output: a subset of edges X that forms a spanning tree
Start X as the empty set of edges.
Go through the edges from smallest weight to highest weight.
For each edge e = {u, v}: if u is not already connected to v in X, add e to X.
Return X.

DATA STRUCTURE FOR DISJOINT SETS
Main complication: we want to check whether u is connected to v efficiently.
The tree X divides the vertices into disjoint sets of connected components.
u is connected to v if they are in the same set.
Adding e to X merges the set containing u with the set containing v.
So we need a data structure that:
Represents a partition of a set V into disjoint subsets.
We'll pick one element L from each subset to be the "leader" of the subset, in order to give the subsets distinct names.
Has an operation find(u) that returns the leader of u's set.
Has an operation union(u, v) that replaces the two sets containing u and v with their union.

KRUSKAL'S ALGORITHM USING A DSDS

MAIN FACTORS IN TIME FOR KRUSKAL
Sorting all edges: O(m log m) = O(m log n).
Two find operations per edge: O(m) * Time_find.
A union each time we add an edge to X, which happens n - 1 times (because a tree with n vertices always has n - 1 edges): O(n) * Time_union.

SUBROUTINES OF KRUSKAL'S DSDS, VERSION 1
Keep an array Leader(u) indexed by element.
In each array position, keep the leader of its set. Initialize each to self.
find(u): return Leader(u). O(1).
union(u, v): for each array position, if it currently holds Leader(v), change it to Leader(u). O(n) time.
Total time: O(m log n) for the sort + O(m) for finds + (n - 1) unions at O(n) each = O(m log n + n^2) total.

VERSION 2.0: LISTS
In addition to the array, keep each set in a doubly linked list, so that once we have one element, we can find all of the other elements in constant time per element.
find stays the same.
union: add links between the tail of Set(u) and the head of Set(v), where Set(u) is the set u is currently in, and change all of the array values in Set(v). Time: O(|Set(v)|).
But if we, say, always add a single element u into a growing Set(v), it is still order n^2 total time.

VERSION 2.1: STILL LISTS
How should we avoid this?

DSDS 2.1: LISTS WITH SIZES
Keep the size of the list, Size(L), at the leader L.
When we perform a union, if Size(u) < Size(v), swap u and v, so that we are always updating pointers for the smaller of the two lists. Then add the two sizes and store the sum at the leader.

WORST-CASE TIME
find is still constant time.
If at least one set is small, the merge becomes faster.
But in the worst case, we could be merging two sets of size n/2.
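Version 2.1 (leader array, member lists, always relabeling the smaller set) together with the high-level Kruskal loop can be sketched as follows. This is a sketch under my own naming, not the lecture's exact code; Python lists stand in for the doubly linked lists:

```python
class DisjointSets:
    """Leader array plus per-set member lists, merging smaller into larger."""
    def __init__(self, elements):
        self.leader = {v: v for v in elements}      # Leader(u), initialized to self
        self.members = {v: [v] for v in elements}   # each leader's member list

    def find(self, u):                 # O(1): just read the array
        return self.leader[u]

    def union(self, u, v):
        lu, lv = self.find(u), self.find(v)
        if lu == lv:
            return
        # Always relabel the smaller set, so each element is
        # relabeled at most O(log n) times over all unions.
        if len(self.members[lu]) < len(self.members[lv]):
            lu, lv = lv, lu
        for x in self.members[lv]:
            self.leader[x] = lu
        self.members[lu].extend(self.members[lv])
        del self.members[lv]

def kruskal(vertices, weighted_edges):
    """weighted_edges: list of (w, u, v). Returns the tree edges chosen."""
    ds = DisjointSets(vertices)
    X = []
    for w, u, v in sorted(weighted_edges):   # smallest weight first
        if ds.find(u) != ds.find(v):         # u not yet connected to v
            X.append((u, v))
            ds.union(u, v)
    return X
```

Each element changes its leader only when its set at least doubles in size, so all the unions together cost O(n log n); with the sort, this Kruskal sketch runs in O(m log n + n log n).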
In that case, we'd still need to update n/2 pointers to the leader, so the worst-case time for a union is still Ω(n).

BUT DOES WORST-CASE TIME GIVE US THE TOTAL PICTURE?
While we do want a bound on the worst-case time of our algorithm, the time for our algorithm might be better than (the worst-case time for a step) times (the number of steps).
If only a few steps take their worst case, the sum of the times for all steps might be much less than this upper bound.
In this case, how many times can we have the worst-case behavior? What must have happened in previous steps to work up to this worst case?

WHAT MIGHT HAVE HAPPENED
[Diagram: all n elements split into two sets of n/2 elements, each built from two sets of n/4 elements, and so on.]

AMORTIZED ANALYSIS
Techniques for bounding the total cost of data structure operations that improve over the worst case.
Simple example: we have a register keeping track of total deposits. Bills get deposited: $1, $10, $100, $1000, ..., up to 10^n.
We need to perform carries in our register. So if we have $999,999 and one more dollar gets added, we need to perform 7 steps to make the new value $1,000,000.
So the worst-case cost of a deposit is Ω(n).
But I claim any m deposits take at most O(n + m) time.

AMORTIZED ANALYSIS: THE POTENTIAL METHOD
Worst case: k 9's become a 1 followed by k 0's. But once we create 0's, many deposits must happen before we get back to the worst case.
We'll define P_t to be a "potential function", measuring how far along we are in the process of building up at time t.
AT_t = (time for the t'th operation) + P_t - P_{t-1}. Then
Σ_t AT_t = Σ_t (time for the t'th operation) + (P_T - P_{T-1}) + (P_{T-1} - P_{T-2}) + ... + (P_1 - P_0)
         = total time + P_T - P_0.
In other words, as long as P_T ≥ 0, the total time is at most the total amortized time plus the initial potential.

ACCOUNT SUM EXAMPLE
In the register problem, what should P_t be? How can we measure how close we are to possible cascades?
A natural choice: P_t = the number of 9's in the register at time t.

ACCOUNTING METHOD
Find the total cost by dividing responsibility among data elements, rather than among operations.
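The register claim (any m deposits cost O(n + m) total) can be checked by simulation. A sketch restricted to $1 deposits for simplicity, counting one step per digit written; the counter is stored little-endian, lowest digit first:

```python
def deposit(digits):
    """Add $1 to a little-endian decimal register; return digits written."""
    steps, i = 0, 0
    while i < len(digits) and digits[i] == 9:
        digits[i] = 0          # carry: a 9 becomes a 0
        steps += 1
        i += 1
    if i == len(digits):
        digits.append(1)       # grow the register
    else:
        digits[i] += 1
    return steps + 1           # the final non-carry write

digits, total, m = [0], 0, 10000
for _ in range(m):
    total += deposit(digits)

# Worst single deposit is Ω(n), but the total over m deposits is O(n + m):
# each carry consumes a 9, and each deposit creates at most one new 9.
assert total <= 2 * m
```

The worst single deposit writes many digits, but each carry consumes a 9 that some earlier deposit built up, which is exactly what the potential function P_t measures.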
Our example: charge each position in the counter 1 each time there is a carry from that position to the next, and let C_i be the total charge for position i.
Observation: if we just look at the value of position i, it only increases when there is a carry into position i (or, for position 0, a deposit), and it must increase ten times before it can carry again. So each position carries out at most one tenth as often as it is carried into, and the total charge Σ_i C_i over m deposits is O(m).

A DATA STRUCTURE FOR KRUSKAL'S (DIRECTED TREES WITH RANKS)
[Diagrams: each set is a directed tree with edges pointing toward the root; the root is the set's leader and stores a rank.]

SUBROUTINES
To save on runtime, we must keep the heights of the trees short. So a union of two different ranks points the smaller-rank root to the bigger-rank root; that way, the tree stays the same height. If the ranks are equal, then it increments one rank and points the smaller to the bigger. (This is the only way a rank can increase.)
makeset = O(1)
find = O(height of the tree containing x)
union = O(find)

EXAMPLE
makeset({A,B,C,D,E,F,G})
union(A,D), union(B,E), union(B,F), union(A,G), union(D,G), union(B,D), union(C,E)

EXAMPLE
makeset({A,B,C,D,E,F,G})
union(B,C), union(E,G), union(D,F), union(A,B), union(D,G), union(A,F)

HEIGHT OF TREE: ANCESTORS OF RANK K
Any vertex has at most one ancestor of rank k.
Proof: each vertex has one pointer, and ranks strictly increase along paths, so each element has at most one ancestor of each rank.

NUMBER OF VERTICES OF A GIVEN RANK

HEIGHT OF TALLEST TREE (MAXIMUM RANK)
The maximum rank is log(n).
Proof (sketch): how many vertices of rank log(n) can there be? A root only reaches rank k by merging two roots of rank k - 1, so a rank-k root has at least 2^k vertices in its tree; with n vertices in total, no rank can exceed log(n).

RUNTIME
makeset = O(1)
find = O(log(n))
union = O(log(n))

PRIM'S ALGORITHM
The cut property assures us that any algorithm following the guideline of continuously adding the next lightest edge between two disjoint sets will work.
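The directed-trees-with-ranks subroutines described above can be sketched as follows (no path compression, matching the slides; the class name is my own):

```python
class RankForest:
    """Disjoint sets as directed trees: each node points toward its root."""
    def __init__(self, elements):
        self.parent = {v: v for v in elements}
        self.rank = {v: 0 for v in elements}

    def find(self, u):                     # O(height) = O(log n)
        while self.parent[u] != u:
            u = self.parent[u]
        return u

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        if self.rank[ru] < self.rank[rv]:  # point smaller rank at bigger
            ru, rv = rv, ru
        self.parent[rv] = ru
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1             # the only way a rank increases
```

Since a root reaches rank k only by merging two rank-(k-1) trees, a rank-k root has at least 2^k nodes beneath it, which is why find stays O(log n).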
Prim's algorithm always picks the two disjoint sets according to which vertices are already connected to the growing tree and which are not.
On each iteration, the subtree grows by one edge.

PRIM'S VS. DIJKSTRA'S
Prim's algorithm is just like Dijkstra's, using the value cost(v) instead of dist(v). We can use the same algorithm, changing dist to cost, and we can use the same data structures.

PRIM'S EXAMPLE

RUNTIME OF PRIM'S

RUNTIME OF KRUSKAL'S

SET COVER
Suppose you have a county with n towns. If you put a school in town x, then all towns within a 30-mile radius can send their kids to that school. What is the least number of schools you could build to accommodate all towns?
Make a graph where the towns are vertices and two vertices are connected if they are within 30 miles of each other.
Greedy approach: pick the town that accommodates the largest number of other towns. Delete all towns it accommodates from the graph and repeat until all towns are accommodated.
Is 4 the optimal solution?

HOW BAD IS THE GREEDY APPROACH?
Claim: suppose B contains n elements and the optimal solution consists of k sets. Then the greedy approach will use at most k ln(n) sets.
In our previous example, k = 3 and n = 11, so the greedy approach will not use more than 3 ln(11) ≈ 7.2, i.e., at most 7 sets. Is it worth it to use the greedy approach?

PROOF OF CLAIM
Claim: suppose B contains n elements and the optimal solution consists of k sets. Then the greedy approach will use at most k ln(n) sets.
Let n_t be the number of elements still uncovered after t greedy steps, with n_0 = n. The k optimal sets cover all n_t remaining elements, so some one of them covers at least n_t / k of them, and the greedy choice covers at least as many. Hence n_{t+1} ≤ n_t (1 - 1/k), so n_t ≤ n (1 - 1/k)^t < n e^{-t/k}.
When t = k ln(n), n_t < n e^{-ln(n)} = 1, so no more elements remain uncovered.

IS GREEDY WORTH IT?
Is k ln(n) that much bigger than k? The ratio between the greedy solution and the optimal solution is always less than ln(n). It turns out that, assuming P ≠ NP, there does not exist a polynomial-time algorithm that gives a substantially better approximation!
Runtime of the greedy algorithm: depends on the data structure used.

GREEDY ALGORITHM FOR SET COVER
for all v in V: makeset(v)
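The greedy approach (continuing the fragment "for all v in V makeset(v)" above) can be completed as a sketch; the dict representation and names are my own:

```python
def greedy_set_cover(universe, sets):
    """Greedy set cover: sets maps a name to the set of elements it covers.
    Repeatedly pick the set covering the most still-uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sets, key=lambda name: len(sets[name] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("remaining elements cannot be covered")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen
```

By the claim proved above, if an optimal cover uses k sets, this loop stops after at most k ln(n) iterations.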