CSE 101
Algorithm Design and Analysis
Miles Jones and Russell Impagliazzo
[email protected]
[email protected]
Lecture 12: Implementing Kruskal’s
Algorithm Using Data Structures for Disjoint
Sets
ALGORITHMS WITH VERY LARGE OUTPUTS
 Challenge: Think of a succinctly describable algorithm A (say, one that fits
with large writing on an index card)
 No system calls
 On each integer, A terminates and outputs an integer, not infinity
 Goal: Make A(10) as big as possible
ACKERMANN FUNCTIONS
 A_1(n, m) = n + m
 A_{i+1}(n, m) = n if m = 1, A_i(n, A_{i+1}(n, m-1)) if m > 1
 So A_{i+1} does A_i to n ``m times’’
 Example: A_2(n, m) = n + A_2(n, m-1), so A_2(n, m) = n·m
 A_3(n, m) = n · A_3(n, m-1), so A_3(n, m) = n^m
TOWER FUNCTION
 OK, A_4(n, m) = n^{A_4(n, m-1)}, so A_4(n, m) = Tower(n, m) = n^(n^(…^n)), a
tower of n’s that is m high
THE TOWER FUNCTION GROWS PRETTY LARGE
 T(2,2) = 2^2 = 4
 T(2,3) = 2^4 = 16
 T(2,4) = 2^{16} = 65,536
 T(2,5) = 2^{65536}, much larger than the number of particles times time
quanta in the universe’s history
 T(2,6): even if you used each particle in each time quantum in the universe’s
history, you can’t write this number in binary
 T(2,7): even if, in each time quantum, each quantum event split the
universe into parallel universes, and you used each particle in each
time quantum in each parallel universe to store a bit, you still couldn’t write
this number down: a pretty big number
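A quick sketch of the tower function in Python (the name tower and the exact-integer arithmetic are my choices; only the first few values are computable):

def tower(n, m):
    """Tower(n, m): a tower of n's that is m high, computed bottom-up."""
    result = n
    for _ in range(m - 1):
        result = n ** result
    return result

assert tower(2, 4) == 65536   # matches T(2,4) above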
INVERSES OF LARGE FUNCTIONS
 If we have a quickly growing function, its inverse defines a non-constant, but very slowly growing, function
 Example: Exp(n) = 2^n; its inverse is log(n)
 The inverse of Tower(2, n) is called log* n: the number of times we take log
before we get below 1
 log* n > 7 only for numbers n that are too big to be written on all
particles in all parallel universes, so it doesn’t come up too often
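For concreteness, a minimal Python sketch of log* (the name log_star and the choice of base-2 logs are mine), following the slide’s definition of counting logs until the value drops below 1:

import math

def log_star(n):
    """Number of times we take log2 before the value drops below 1."""
    count = 0
    while n >= 1:
        n = math.log2(n)
        count += 1
    return count

print(log_star(65536))   # 65536 -> 16 -> 4 -> 2 -> 1 -> 0.0: prints 5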
BUT THE TOWER FUNCTION IS JUST THE FOURTH
ACKERMANN FUNCTION
 A_5(2,2) = 4
 A_5(2,3) is a tower of 2’s 4 high: 65,536
 A_5(2,4) is a tower of 2’s 65,536 high. (start with that huge number,
and exponentiate it another 65,532 times)
 A_5(2,5) is a tower of 2’s A_5(2,4) high….. (I’ll let you imagine)
THE ACTUAL ACKERMANN FUNCTION
 Ack(i, n, m) = A_i(n, m) can be defined by the following simple
recursion:
 Ack(i, n, m):
 IF i = 1 return n + m
 IF m = 1 return n
 Return Ack(i-1, n, Ack(i, n, m-1))
 Ack(n) = Ack(n, n, n)
 Imagine what Ack(10) is. It’s pretty big
 Of course, if there’s room left on the index card, we could keep going,
 say, looking at Ack composed with itself n times….
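A direct Python transcription of this recursion (a sketch only; it is hopeless to actually evaluate anything like Ack(10)):

def Ack(i, n, m):
    """The i-th Ackermann function A_i(n, m), as defined on the slide."""
    if i == 1:
        return n + m
    if m == 1:
        return n
    return Ack(i - 1, n, Ack(i, n, m - 1))

def Ack1(n):
    """The one-argument version Ack(n) = Ack(n, n, n)."""
    return Ack(n, n, n)

assert Ack(2, 3, 4) == 12   # A_2 is multiplication
assert Ack(3, 2, 4) == 16   # A_3 is exponentiation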
INVERSE ACKERMANN
 α(n) = smallest j so that Ack(j) ≥ n.
 Goes to infinity as n grows, but very, very slowly
HIGH-LEVEL KRUSKAL’S ALGORITHM
 Instance: Undirected graph G, with edge weights w(e)
 Output: A subset of edges X that forms a minimum-weight spanning tree
 Start X as the empty set of edges
 Go through the edges from smallest weight to highest weight
 For each edge e={u,v}, if u is not already connected to v in X, add e to
X
 Return X
DATA STRUCTURE FOR DISJOINT SETS
 Main complication: Want to check if u is connected to v efficiently
 Tree T divides vertices into disjoint sets of connected components
 u is connected to v if they are in the same set
 Adding e to T merges the set containing u with the set containing v
 So we need a data structure that:
 Represents a partition of a set V into disjoint subsets. We’ll pick one
element L from each subset to be the ``leader’’ of the subset, in order to
give the subsets distinct names
 Has an operation find(u) that returns the leader of u’s set
 Has an operation union(u,v) that replaces the two sets containing u and
v with their union
KRUSKAL’S ALGORITHM USING DSDS
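The body of this slide appears to have been an image; here is a minimal Python sketch of the algorithm, assuming a disjoint-set object ds with the find/union interface from the previous slide:

def kruskal(edges, ds):
    """edges: list of (weight, u, v) tuples; ds: a disjoint-set structure
    with find(u) and union(u, v). Returns the list of tree edges X."""
    X = []
    for w, u, v in sorted(edges):          # smallest weight to highest
        if ds.find(u) != ds.find(v):       # u not already connected to v
            ds.union(u, v)
            X.append((u, v))
    return X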
MAIN FACTORS IN TIME FOR KRUSKAL
 Sorting all edges: O(m log m) = O(m log n)
 2 find operations per edge, O(m) * Time_find
 union when we add an edge to X happens n-1 times (because a tree
with n vertices always has n-1 edges): O(n) · Time_union
SUBROUTINES OF KRUSKAL’S

DSDS VERSION 1
 Keep an array Leader(u) indexed by element
 In each array position, keep the leader of its set
 Initialize to self
 Find(u) : return Leader(u), O(1)
 union(u,v): For each array position, if it currently holds Leader(v), change
it to Leader(u). (O(n) time)
 Total time: O(m log n) sort + O(1)·m finds + O(n)·O(n) unions
 = O(n^2 + m log n) total
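A Python sketch of version 1 (the class name is mine):

class LeaderArray:
    """DSDS version 1: an array holding each element's leader."""
    def __init__(self, n):
        self.leader = list(range(n))       # initialize to self

    def find(self, u):                     # O(1)
        return self.leader[u]

    def union(self, u, v):                 # O(n): scan the whole array
        lu, lv = self.find(u), self.find(v)
        for x in range(len(self.leader)):
            if self.leader[x] == lv:
                self.leader[x] = lu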
VERSION 2.0: LISTS
 In addition to the array, keep each set in a doubly linked list, so that once
we have one element, we can find all of the other elements in constant
time per element.
 Find stays the same.
 union: add links between the tail of Set(u) and the head of Set(v),
where Set(u) is the set u is currently in, and change all of the array
values in Set(v).
 Time: O(|Set(v)|)
 But if we, say, always add a single u into a growing Set(v), that is still order
n^2 total time
VERSION 2.1: STILL LISTS
 How should we avoid this?
DSDS 2.1 LISTS WITH SIZES
 Keep the size of the list, Size(L), at the leader L
 When we perform a union, if Size(u) < Size(v), swap
u and v, so that we are always updating pointers for the
smaller of the two lists.
Then add the two sizes and store the result at u.
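A Python sketch of version 2.1, with ordinary Python lists standing in for the doubly linked lists (class and field names are mine):

class LeaderArrayWithSizes:
    """DSDS version 2.1: leader array plus a member list per set;
    always relabel the smaller of the two sets."""
    def __init__(self, n):
        self.leader = list(range(n))
        self.members = [[x] for x in range(n)]

    def find(self, u):                         # still O(1)
        return self.leader[u]

    def union(self, u, v):
        lu, lv = self.find(u), self.find(v)
        if lu == lv:
            return
        if len(self.members[lu]) < len(self.members[lv]):
            lu, lv = lv, lu                    # swap: relabel the smaller set
        for x in self.members[lv]:             # O(size of smaller set)
            self.leader[x] = lu
        self.members[lu].extend(self.members[lv])
        self.members[lv] = []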
WORST CASE TIME
 Find is still constant time
If at least one set is small, the merge becomes faster.
But in the worst case, we could be merging two sets of size n/2.
In that case, we’d still need to update n/2 pointers to the leader,
so the total worst-case time is still Ω(n)
BUT DOES WORST-CASE TIME GIVE US THE TOTAL
PICTURE?
 While we do want a bound on the worst-case time of our algorithm,
the total time might be better than the worst-case time for a
step times the number of steps. If only a few steps take their worst case, the sum of the time for all steps might be much less than that upper
bound.
In this case, how many times can we have the worst-case behavior?
What must have happened in previous steps to work up to this worst case?
WHAT MIGHT HAVE HAPPENED
[Diagram: all n elements arise from merging two sets of n/2 elements, each of which arose from merging two sets of n/4 elements, and so on.]
AMORTIZED ANALYSIS
 Techniques for bounding the total cost of data structure operations
that improve over the worst case
 Simple example: We have a register keeping track of total deposits.
 Bills get deposited: $1, $10, $100, $1000, … up to $10^n
 We need to perform carries in our register.
 So if we have $999,999 and one more dollar gets added, we need to perform
7 steps to make the new value $1,000,000
So the worst-case cost of a deposit is Ω(n)
But I claim any m deposits take at most O(n + m) time
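A small simulation of the register (names are mine) that counts digit-writes and illustrates the O(n + m) claim:

def add_bill(register, i):
    """Add a bill worth 10**i to the register (a list of decimal digits,
    least significant first); return the number of digit-writes."""
    writes = 0
    while register[i] == 9:        # carry: a 9 becomes 0, move one position left
        register[i] = 0
        writes += 1
        i += 1
    register[i] += 1
    return writes + 1

register = [0] * 10                # n = 10 digit positions
total = sum(add_bill(register, 0) for _ in range(10**6))
print(total)                       # 1111111: about 1.1 writes per deposit, O(n + m)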
AMORTIZED ANALYSIS, POTENTIAL METHOD
 Worst case: k 9’s become a 1 followed by k 0’s. But once we create 0’s, it takes many
deposits before we get back to the worst case.
 We’ll define P_t to be a ``potential function’’, measuring how far along
we are in the process of building up at time t.
 AT_t = (time for t’th operation) + P_t - P_{t-1}.
 Then Σ_t AT_t = Σ_t (time for t’th operation) + (P_T - P_{T-1}) + (P_{T-1} - P_{T-2}) + … + (P_1 - P_0) = total time + P_T - P_0
 In other words, total time is at most total amortized time plus the initial
potential
ACCOUNT SUM EXAMPLE
 In the register problem, what should P_t be? How can we measure how
close we are to possible cascades?
 P_t = the number of 9’s in the register at time t
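 As a sanity check of this choice: suppose a deposit causes k carries. The actual time is k + 1 digit-writes, while k 9’s become 0’s and at most one new 9 is created, so P_t - P_{t-1} ≤ 1 - k. Then AT_t ≤ (k + 1) + (1 - k) = 2, so any m deposits take at most 2m + P_0 = O(n + m) time, matching the claim.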
ACCOUNTING METHOD
 Find total cost by dividing responsibility among data elements, rather
than operations.
 Our example: Charge each position in the counter 1 each time there
is a carry from that position to the next; let C_i be the total charge for
position i.
 Observation: If we just look at the value of position i, it only increases
if there is a carry into position i or a $10^i bill is deposited, and it must
increase ten times between consecutive carries out of position i, so the total
charge over all positions is O(m).
A DATA STRUCTURE FOR KRUSKAL’S (DIRECTED TREES
WITH RANKS)

SUBROUTINES.

 To save on runtime, we must keep
the heights of the trees short.
 So union points the root of smaller
rank to the root of bigger rank; that
way, the tree stays the same
height.
 If the ranks are equal, then it
points one root to the other and
increments the rank of the new root.
(This is the only way a rank can increase.)
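A Python sketch of directed trees with ranks (class name mine; no path compression, matching the slides):

class RankedForest:
    """Each element points toward the root of its tree; roots carry ranks."""
    def __init__(self, n):
        self.parent = list(range(n))   # makeset: everyone is its own root
        self.rank = [0] * n

    def find(self, u):                 # O(height): walk up to the root
        while self.parent[u] != u:
            u = self.parent[u]
        return u

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru            # point the smaller rank at the bigger
        self.parent[rv] = ru
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1         # the only way a rank can increase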
SUBROUTINES.

makeset=O(1)
find=O(height of tree containing x)
union=O(find)
EXAMPLE
 makeset({A,B,C,D,E,F,G})
 union(A,D), union(B,E), union(B,F), union(A,G), union(D,G),union(B,D),
union(C,E)
EXAMPLE
 makeset({A,B,C,D,E,F,G})
 union(B,C), union(E,G), union(D,F), union(A,B), union(D,G),union(A,F)
HEIGHT OF TREE

ANCESTORS OF RANK K
 Any vertex has at most one ancestor of rank k
Proof:
each vertex has one outgoing pointer and ranks strictly increase along paths, so
each element has at most one ancestor of each rank.
NUMBER OF VERTICES OF A GIVEN RANK
 A root of rank k heads a tree of at least 2^k vertices (a rank-k root is only created by merging two rank-(k-1) trees of at least 2^{k-1} vertices each), so there are at most n/2^k vertices of rank k.
HEIGHT OF TALLEST TREE (MAXIMUM RANK)
 The maximum rank is log(n)
Proof: (sort of)
How many vertices of rank log(n) can there be? At most n/2^{log(n)} = 1; a
vertex of rank k needs 2^k ≤ n vertices in its tree, so k ≤ log(n).
RUNTIME

makeset=O(1)
find=O(log(n))
union=O(log(n))
So Kruskal’s with this structure runs in O(m log n) overall: the sort, 2m finds, and n-1 unions.
PRIM’S ALGORITHM
 The cut property assures us that any algorithm that repeatedly adds the
next lightest edge between two disjoint sets will work.
 Prim’s algorithm always picks the cut between the vertices already
connected to the tree and those not yet connected
PRIM’S ALGORITHM
 On each iteration, the subtree grows by one edge
PRIM/DIJKSTRA’S
 Prim’s algorithm is just like Dijkstra’s, but using the value
cost(v) instead of dist(v).
 We can use the same algorithm, changing dist to cost, and we can use the
same data structures.
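A Python sketch of this Dijkstra-style implementation with a binary heap (the adjacency-list format {u: [(v, w), ...]} is an assumption):

import heapq

def prim(graph, s):
    """graph: undirected adjacency lists {u: [(v, w), ...]}; s: start vertex.
    Keeps cost(v) = lightest edge into the tree, via lazy heap entries."""
    in_tree, X = {s}, []
    heap = [(w, s, v) for v, w in graph[s]]
    heapq.heapify(heap)
    while heap:
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue                   # stale entry: v was connected already
        in_tree.add(v)
        X.append((u, v, w))            # the subtree grows by one edge
        for x, wx in graph[v]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return X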
PRIM’S EXAMPLE
RUNTIME OF PRIM’S
 Same as Dijkstra’s with a binary heap: O((n + m) log n)
RUNTIME OF KRUSKAL’S
 O(m log n), dominated by sorting the edges
SET COVER
 Suppose you have a county with n towns. If you put a school in town x,
then all towns within a 30-mile radius can send their kids to that school.
 What is the least number of schools you could build to accommodate all
towns?
SET COVER
 Make a graph where the towns are vertices and two vertices are
connected if they are within 30 miles of each other.
SET COVER
 Greedy approach: Pick the town that accommodates the largest
number of other towns. Delete all towns it accommodates from the
graph and repeat until all towns are accommodated.
SET COVER
 Is 4 the optimal solution?
HOW BAD IS THE GREEDY APPROACH?
 Claim: Suppose B contains n elements and that the optimal solution
consists of k sets. Then the greedy approach will use at most k·ln(n) sets.
 In our previous example, k = 3 and n = 11, so the greedy approach will not
use more than 3·ln(11) ≈ 7.2, i.e., at most 7 sets.
 Is it worth it to use the greedy approach?
PROOF OF CLAIM
 Claim: Suppose B contains n elements and that the optimal solution
consists of k sets. Then the greedy approach will use at most k·ln(n)
sets.
PROOF OF CLAIM
 Let n_t be the number of elements not yet covered after t sets are chosen;
n_0 = n.
 The optimal solution covers these n_t elements with k sets, so some
remaining set covers at least n_t/k of them, and the greedy choice covers at
least as many.
 So n_{t+1} ≤ n_t(1 - 1/k), which gives n_t ≤ n(1 - 1/k)^t < n·e^{-t/k}.
When t = k·ln(n), n_t < n·e^{-ln(n)} = 1, so no more elements
are covered.
IS GREEDY WORTH IT?
 Is k·ln(n) that much bigger than k?
 The ratio between the greedy solution and the optimal solution is always
less than ln(n).
 It turns out that there does not exist a polynomial-time algorithm that
can give you a better approximation!
 Runtime of greedy algorithm: depends on the data structure used.
GREEDY ALGORITHM FOR SET COVER
 for all v in V
makeset(v)
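The transcript cuts off after these two lines; a minimal Python sketch of the full greedy loop, using plain set operations in place of the makeset-based bookkeeping the slide begins to set up:

def greedy_set_cover(universe, sets):
    """universe: set of elements; sets: dict mapping a name to the set of
    elements it covers. Assumes the sets together cover the universe."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # pick the set covering the most still-uncovered elements
        best = max(sets, key=lambda s: len(sets[s] & uncovered))
        chosen.append(best)
        uncovered -= sets[best]
    return chosen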