PAMBAYANG DALUBHASAAN NG MARILAO
Abangan Norte, Marilao, Bulacan
INFORMATION TECHNOLOGY DEPARTMENT
Minimum Spanning Trees
Spanning trees
A spanning tree of a graph is just a subgraph that contains all the vertices and is a tree. A graph may
have many spanning trees; for instance the complete graph on four vertices
o---o
|\ /|
| X |
|/ \|
o---o
has sixteen spanning trees:
o---o    o---o    o   o    o---o
|   |    |        |   |        |
|   |    |        |   |        |
|   |    |        |   |        |
o   o    o---o    o---o    o---o

o---o    o   o    o   o    o   o
 \ /     |\ /      \ /      \ /|
  X      | X        X        X |
 / \     |/ \      / \      / \|
o   o    o   o    o---o    o   o

o   o    o---o    o   o    o---o
|\  |       /     |  /|     \
| \ |      /      | / |      \
|  \|     /       |/  |       \
o   o    o---o    o   o    o---o

o---o    o   o    o   o    o---o
|\       |  /      \  |       /|
| \      | /        \ |      / |
|  \     |/          \|     /  |
o   o    o---o    o---o    o   o
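The count of sixteen can be checked by brute force. The sketch below is only an illustration: it numbers the four vertices 0 to 3 (a labeling assumed here, not taken from the notes), tries every set of three edges, and keeps the sets that contain no cycle. The helper names are hypothetical.

```python
from itertools import combinations

def is_spanning_tree(n, edges):
    """True if `edges` has n-1 edges and none of them closes a cycle."""
    parent = list(range(n))

    def find(x):
        # Walk up to the representative of x's component.
        while parent[x] != x:
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:              # u and v already connected: cycle
            return False
        parent[ru] = rv
    return len(edges) == n - 1

def spanning_trees(n):
    """All spanning trees of the complete graph on vertices 0..n-1."""
    all_edges = list(combinations(range(n), 2))
    return [t for t in combinations(all_edges, n - 1) if is_spanning_tree(n, t)]

def is_path(n, tree):
    """A tree is a path exactly when no vertex has degree above 2."""
    degree = [0] * n
    for u, v in tree:
        degree[u] += 1
        degree[v] += 1
    return max(degree) <= 2

trees = spanning_trees(4)
paths = [t for t in trees if is_path(4, t)]
print(len(trees), len(paths))  # 16 12
```

The four trees that are not paths are the "stars", in which one vertex is joined to the other three.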
Minimum spanning trees
Now suppose the edges of the graph have weights or lengths. The weight of a tree is just the sum of
weights of its edges. Different spanning trees can have different lengths. The problem: how to find the
minimum length spanning tree?
This problem can be solved by many different algorithms. It is the topic of some very recent
research. There are several "best" algorithms, depending on the assumptions you make:

A randomized algorithm can solve it in linear expected time. [Karger, Klein, and Tarjan, "A
randomized linear-time algorithm to find minimum spanning trees", J. ACM, vol. 42, 1995, pp.
321-328.]
IT 213 – Data Structures and Algorithms

It can be solved in linear worst case time if the weights are small integers. [Fredman and Willard,
"Trans-dichotomous algorithms for minimum spanning trees and shortest paths", 31st IEEE
Symp. Foundations of Comp. Sci., 1990, pp. 719--725.]
Otherwise, the best solution is very close to linear but not exactly linear. The exact bound is O(m
log beta(m,n)) where the beta function has a complicated definition: the smallest i such that
log(log(log(...log(n)...))) is less than m/n, where the logs are nested i times. [Gabow, Galil,
Spencer, and Tarjan, Efficient algorithms for finding minimum spanning trees in undirected and
directed graphs. Combinatorica, vol. 6, 1986, pp. 109--122.]
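To get a feel for how slowly the beta function grows, here is a small sketch that follows the definition above literally. The base-2 logarithm and the function name `beta` are assumptions made for this illustration; the cited paper should be consulted for the exact definition.

```python
import math

def beta(m, n):
    """Smallest i such that log nested i times of n is less than m/n.

    Assumes m > 0, n > 0 and base-2 logarithms (an assumption of this sketch).
    """
    ratio = m / n
    x = float(n)
    i = 0
    while x >= ratio:      # x >= ratio > 0, so the logarithm is defined
        x = math.log2(x)
        i += 1
    return i

print(beta(64, 16))        # 2
print(beta(65536, 65536))  # 5
```

Even for a sparse graph with 65536 vertices and as many edges the value is only 5, which is why the bound is described as very close to linear.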
These algorithms are all quite complicated, and probably not that great in practice unless you're
looking at really huge graphs. The book tries to keep things simpler, so it only describes one algorithm
but (in my opinion) doesn't do a very good job of it. I'll go through two simple classical algorithms
(spending not so much time on each one).
Why minimum spanning trees?
The standard application is to a problem like phone network design. You have a business with several
offices; you want to lease phone lines to connect them up with each other; and the phone company
charges different amounts of money to connect different pairs of cities. You want a set of lines that
connects all your offices with a minimum total cost. It should be a spanning tree, since if a network isn't
a tree you can always remove some edges and save money.
A less obvious application is that the minimum spanning tree can be used to approximately solve the
traveling salesman problem. A convenient formal way of defining this problem is to find the shortest
path that visits each point at least once.
Note that if you have a path visiting all points exactly once, it's a special kind of tree. For instance, in the
example above, twelve of sixteen spanning trees are actually paths. If you have a path visiting some
vertices more than once, you can always drop some edges to get a tree. So in general the MST weight is
at most the TSP weight, because it's a minimization over a strictly larger set.
On the other hand, if you draw a path tracing around the minimum spanning tree, you trace each edge
twice and visit all points, so the TSP weight is at most twice the MST weight. Therefore this tour is
within a factor of two of optimal. There is a more complicated way (Christofides' heuristic) of using
minimum spanning trees to find a tour within a factor of 1.5 of optimal; I won't describe this here but it
might be covered in ICS 163 (graph algorithms) next year.
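The factor-of-two argument can be tried out on a few points in the plane. This is a rough sketch under stated assumptions: the points and the function names are made up for illustration, distances are Euclidean (so skipping an already visited vertex never lengthens the walk), and the tree is built with a simple quadratic version of Prim's algorithm.

```python
import math

def mst_edges(points):
    """Simple Prim's algorithm on the complete Euclidean graph.

    Returns tree edges as (parent, child) pairs, rooted at point 0.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection to the tree
    link = [None] * n          # tree vertex that gives that connection
    best[0] = 0.0
    edges = []
    for _ in range(n):
        v = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[v] = True
        if link[v] is not None:
            edges.append((link[v], v))
        for u in range(n):
            d = math.dist(points[v], points[u])
            if not in_tree[u] and d < best[u]:
                best[u], link[u] = d, v
    return edges

def mst_tour(points):
    """Trace around the tree depth-first, listing each vertex once."""
    children = {i: [] for i in range(len(points))}
    for parent, child in mst_edges(points):
        children[parent].append(child)
    order, stack = [], [0]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(reversed(children[v]))
    return order

def path_length(points, order):
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))

points = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 0.5)]   # made-up sample data
tree_weight = sum(math.dist(points[a], points[b]) for a, b in mst_edges(points))
tour = mst_tour(points)
print(tour, tree_weight, path_length(points, tour))
```

The printed path length should fall between the tree weight and twice the tree weight, matching the two bounds in the text.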
Kruskal's algorithm
Kruskal's algorithm is an algorithm in graph theory that finds a minimum spanning tree for
a connected weighted graph. This means it finds a subset of the edges that forms a tree that includes
every vertex, where the total weight of all the edges in the tree is minimized. If the graph is not
connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected
component). Kruskal's algorithm is an example of a greedy algorithm.
This algorithm first appeared in Proceedings of the American Mathematical Society, pp. 48–50 in 1956,
and was written by Joseph Kruskal.
We'll start with Kruskal's algorithm, which is easiest to understand and probably the best one for
solving problems by hand.
Kruskal's algorithm:
    sort the edges of G in increasing order by length
    keep a subgraph S of G, initially empty
    for each edge e in sorted order
        if the endpoints of e are disconnected in S
            add e to S
    return S
Note that, whenever you add an edge (u,v), it's always the smallest connecting the part of S reachable
from u with the rest of G, so by the cut lemma (for any set X of vertices, the smallest edge connecting X
to the rest of G belongs to a minimum spanning tree) it must be part of the MST.
This algorithm is known as a greedy algorithm, because it chooses at each step the cheapest edge to
add to S. You should be very careful when trying to use greedy algorithms to solve other problems, since
the greedy approach usually doesn't work. For example, if you want to find a shortest path from a to b,
it might be a bad idea to keep taking the shortest edges. The greedy idea only works in Kruskal's
algorithm because of that lemma.
Analysis: The line testing whether two endpoints are disconnected looks like it should be slow (linear
time per iteration, or O(mn) total). But actually there are some complicated data structures that let us
perform each test in close to constant time; this is known as the union-find problem and is discussed in
Baase section 8.5 (I won't get to it in this class, though). The slowest part turns out to be the sorting
step, which takes O(m log n) time.
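As a rough illustration of the analysis, here is one way the pseudocode might look with a basic union-find structure doing the connectivity test. This is a sketch, not the version from Baase: the function name, the `(weight, u, v)` edge format, and the small sample graph are all assumptions made for the example.

```python
def kruskal(vertices, edges):
    """Return (total_weight, tree_edges); edges are (weight, u, v) triples."""
    parent = {v: v for v in vertices}
    size = {v: 1 for v in vertices}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, total = [], 0
    for w, u, v in sorted(edges):       # the O(m log n) sorting step
        ru, rv = find(u), find(v)
        if ru != rv:                    # endpoints are disconnected in S
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru             # union by size: smaller under larger
            size[ru] += size[rv]
            tree.append((u, v, w))
            total += w
    return total, tree

# Made-up sample graph on four vertices.
edges = [(1, "a", "b"), (4, "a", "c"), (2, "b", "c"), (7, "c", "d"), (3, "b", "d")]
print(kruskal("abcd", edges))
# (6, [('a', 'b', 1), ('b', 'c', 2), ('b', 'd', 3)])
```

On a disconnected graph the same loop simply returns a spanning forest, one tree per component.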
Example. The graph used below has vertices A to G and edges AB 7, AD 5, BC 8, BD 9, BE 7, CE 5,
DE 15, DF 6, EF 8, EG 9 and FG 11. The original figures are not reproduced; each paragraph
describes one step.
This is our original graph. The numbers near the arcs indicate
their weight. None of the arcs are highlighted.
AD and CE are the shortest arcs, with length 5, and AD has
been arbitrarily chosen, so it is highlighted.
CE is now the shortest arc that does not form a cycle, with
length 5, so it is highlighted as the second arc.
The next arc, DF with length 6, is highlighted using much the
same method.
The next-shortest arcs are AB and BE, both with length 7. AB is
chosen arbitrarily, and is highlighted. The arc BD has been
highlighted in red, because there already exists a path (in green)
between B and D, so it would form a cycle (ABD) if it were
chosen.
The process continues to highlight the next-smallest arc, BE with
length 7. Many more arcs are highlighted in red at this
stage: BC because it would form the loop BCE, DE because it
would form the loop DEBA, and FE because it would
form FEBAD.
Finally, the process finishes with the arc EG of length 9, and the
minimum spanning tree is found.
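The walkthrough can be replayed in code. In this sketch the edge weights are the ones that appear in the step descriptions and in the table in the Prim's algorithm section of this handout; the function name is hypothetical. It uses the slow relabeling test for connectivity rather than union-find, which is fine for a graph this small, and it lists tied edges in an order that reproduces the arbitrary choices made above.

```python
# Edge weights collected from the walkthroughs in this handout.
EDGES = [("A", "B", 7), ("A", "D", 5), ("B", "C", 8), ("B", "D", 9),
         ("B", "E", 7), ("C", "E", 5), ("D", "E", 15), ("D", "F", 6),
         ("E", "F", 8), ("E", "G", 9), ("F", "G", 11)]

def kruskal_trace(edges):
    """Return (chosen, rejected, total) with edges named like 'AD'."""
    component = {}                      # vertex -> label of its component
    for u, v, _ in edges:
        component.setdefault(u, u)
        component.setdefault(v, v)
    chosen, rejected, total = [], [], 0
    # sorted() is stable, so edges of equal weight keep their input order.
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if component[u] == component[v]:
            rejected.append(u + v)      # would close a cycle
            continue
        old, new = component[v], component[u]
        for x in component:             # linear-time merge of two components
            if component[x] == old:
                component[x] = new
        chosen.append(u + v)
        total += w
    return chosen, rejected, total

chosen, rejected, total = kruskal_trace(EDGES)
print(chosen, total)   # ['AD', 'CE', 'DF', 'AB', 'BE', 'EG'] 39
print(rejected)        # ['BC', 'EF', 'BD', 'FG', 'DE']
```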
The proof consists of two parts. First, it is proved that the algorithm produces a spanning tree. Second, it
is proved that the constructed spanning tree is of minimal weight.
Prim's algorithm
In computer science, Prim's algorithm is a greedy algorithm that finds a minimum spanning tree for
a connected weighted undirected graph. This means it finds a subset of the edges that forms a tree that
includes every vertex, where the total weight of all the edges in the tree is minimized. The algorithm was
developed in 1930 by Czech mathematician Vojtěch Jarník and later independently by computer
scientist Robert C. Prim in 1957 and rediscovered by Edsger Dijkstra in 1959. Therefore it is also
sometimes called the DJP algorithm, the Jarník algorithm, or the Prim–Jarník algorithm.
Rather than build a subgraph one edge at a time, Prim's algorithm builds a tree one vertex at a time.
Prim's algorithm:
    let T be a single vertex x
    while (T has fewer than n vertices)
    {
        find the smallest edge connecting T to G-T
        add it to T
    }
Since each edge added is the smallest connecting T to G-T, the same lemma shows that we only add
edges that should be part of the MST.
Again, it looks like the loop has a slow step in it. But again, some data structures can be used to speed
this up. The idea is to use a heap to remember, for each vertex, the smallest edge connecting T with that
vertex.
Prim with heaps:
    make a heap of values (vertex,edge,weight(edge))
        initially (v,-,infinity) for each vertex
    let tree T be empty
    while (T has fewer than n vertices)
    {
        let (v,e,weight(e)) have the smallest weight in the heap
        remove (v,e,weight(e)) from the heap
        add v and e to T
        for each edge f=(u,v)
            if u is not already in T
                find value (u,g,weight(g)) in heap
                if weight(f) < weight(g)
                    replace (u,g,weight(g)) with (u,f,weight(f))
    }
Analysis: We perform n steps in which we remove the smallest element in the heap, and at most 2m
steps in which we examine an edge f=(u,v). For each of those steps, we might replace a value on the
heap, reducing its weight. (You also have to find the right value on the heap, but that can be done easily
enough by keeping a pointer from the vertices to the corresponding values.) I haven't described how to
reduce the weight of an element of a binary heap, but it's easy to do in O(log n) time, for a total of
O(m log n). Alternatively, by using a more complicated data structure known as a Fibonacci heap, you can
reduce the weight of an element in constant time. The result is a total time bound of O(m + n log n).
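Python's standard `heapq` module has no operation for reducing the weight of an entry, so the sketch below swaps in a common variant: instead of replacing a heap value, it pushes a new entry and skips stale ones when they come out. This is not the pseudocode above line for line, and it runs in O(m log n) rather than the Fibonacci-heap bound. The adjacency-list format, function name, and sample graph are assumptions for illustration.

```python
import heapq

def prim(graph, start):
    """graph maps each vertex to a list of (neighbor, weight) pairs.

    Returns (total_weight, tree_edges) for the component containing start.
    """
    in_tree = {start}
    heap = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(heap)
    tree, total = [], 0
    while heap and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(heap)   # smallest edge leaving the tree so far
        if v in in_tree:
            continue                    # stale entry: v was reached more cheaply
        in_tree.add(v)
        tree.append((u, v, w))
        total += w
        for x, wx in graph[v]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return total, tree

# Made-up sample graph on four vertices.
graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("a", 1), ("c", 2), ("d", 3)],
    "c": [("a", 4), ("b", 2), ("d", 7)],
    "d": [("b", 3), ("c", 7)],
}
print(prim(graph, "a"))
# (6, [('a', 'b', 1), ('b', 'c', 2), ('b', 'd', 3)])
```

The heap here can hold up to one entry per edge direction instead of one per vertex, which is the price of skipping the decrease-weight step.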
Example (the original figures are not reproduced; each paragraph describes one step):
This is our original weighted graph. The numbers near the edges
indicate their weight.
Vertex D has been arbitrarily chosen as a starting point.
Vertices A, B, E and F are connected to D through a single edge. A is
the vertex nearest to D and will be chosen as the second vertex along
with the edge AD.
The next vertex chosen is the vertex nearest to either D or A. B is 9
away from D and 7 away from A, E is 15, and F is 6. F is the smallest
distance away, so we highlight the vertex F and the arc DF.
The algorithm carries on as above. Vertex B, which is 7 away from A,
is highlighted.
In this case, we can choose between C, E, and G. C is 8 away
from B, E is 7 away from B, and G is 11 away from F. E is nearest, so
we highlight the vertex E and the arc BE.
Here, the only vertices available are C and G. C is 5 away from E,
and G is 9 away from E. C is chosen, so it is highlighted along with the
arc EC.
Vertex G is the only remaining vertex. It is 11 away from F, and 9
away from E. E is nearer, so we highlight G and the arc EG.
Now all the vertices have been selected and the minimum spanning
tree is shown in green. In this case, it has weight 39.
U                   Edge(u,v)               V \ U

{}                                          {A,B,C,D,E,F,G}

{D}                 (D,A) = 5   chosen      {A,B,C,E,F,G}
                    (D,B) = 9
                    (D,E) = 15
                    (D,F) = 6

{A,D}               (D,B) = 9               {B,C,E,F,G}
                    (D,E) = 15
                    (D,F) = 6   chosen
                    (A,B) = 7

{A,D,F}             (D,B) = 9               {B,C,E,G}
                    (D,E) = 15
                    (A,B) = 7   chosen
                    (F,E) = 8
                    (F,G) = 11

{A,B,D,F}           (B,C) = 8               {C,E,G}
                    (B,E) = 7   chosen
                    (D,B) = 9   cycle
                    (D,E) = 15
                    (F,E) = 8
                    (F,G) = 11

{A,B,D,E,F}         (B,C) = 8               {C,G}
                    (D,B) = 9   cycle
                    (D,E) = 15  cycle
                    (E,C) = 5   chosen
                    (E,G) = 9
                    (F,E) = 8   cycle
                    (F,G) = 11

{A,B,C,D,E,F}       (B,C) = 8   cycle       {G}
                    (D,B) = 9   cycle
                    (D,E) = 15  cycle
                    (E,G) = 9   chosen
                    (F,E) = 8   cycle
                    (F,G) = 11

{A,B,C,D,E,F,G}     (B,C) = 8   cycle       {}
                    (D,B) = 9   cycle
                    (D,E) = 15  cycle
                    (F,E) = 8   cycle
                    (F,G) = 11  cycle
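The same run can be reproduced with a direct, unoptimized version of Prim's algorithm that rescans the candidate edges at every step, much as the table does. The edge weights are taken from the table; the function name and the way edges are named (for example 'CE' for the edge the table writes as (E,C)) are choices made for this sketch.

```python
# Edge weights taken from the table above.
EDGES = {("A", "B"): 7, ("A", "D"): 5, ("B", "C"): 8, ("B", "D"): 9,
         ("B", "E"): 7, ("C", "E"): 5, ("D", "E"): 15, ("D", "F"): 6,
         ("E", "F"): 8, ("E", "G"): 9, ("F", "G"): 11}

def prim_trace(edges, start):
    """Return the picked edges, in order, as (name, weight) pairs."""
    vertices = {x for e in edges for x in e}
    U = {start}                          # vertices already in the tree
    picked = []
    while U != vertices:
        # Candidates: edges with exactly one endpoint inside U.
        crossing = [(w, e) for e, w in edges.items() if (e[0] in U) != (e[1] in U)]
        w, (a, b) = min(crossing)
        U.add(b if a in U else a)        # bring the outside endpoint into U
        picked.append((a + b, w))
    return picked

picked = prim_trace(EDGES, "D")
print(picked)
# [('AD', 5), ('DF', 6), ('AB', 7), ('BE', 7), ('CE', 5), ('EG', 9)]
print(sum(w for _, w in picked))  # 39
```

Edges marked "cycle" in the table never show up as candidates here, because both of their endpoints are already in U.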
Proof of Correctness
Let P be a connected, weighted graph. At every iteration of Prim's algorithm, an edge must be found
that connects a vertex in a subgraph to a vertex outside the subgraph. Since P is connected, there will
always be a path to every vertex. The output Y of Prim's algorithm is a tree, because the edge and vertex
added to Y are connected. Let Y1 be a minimum spanning tree of P. If Y1=Y then Y is a minimum spanning
tree. Otherwise, let e be the first edge added during the construction of Y that is not in Y1, and V be the
set of vertices connected by the edges added before e. Then one endpoint of e is in V and the other is
not. Since Y1 is a spanning tree of P, there is a path in Y1 joining the two endpoints. As one travels along
the path, one must encounter an edge f joining a vertex in V to one that is not in V. Now, at the iteration
when e was added to Y, f could also have been added and it would be added instead of e if its weight
was less than that of e. Since f was not added, we conclude that w(f) ≥ w(e), where w denotes edge weight.
Let Y2 be the graph obtained by removing f from Y1 and adding e to it. It is easy to show that Y2 is
connected, has the same number of edges as Y1, and the total weight of its edges is not larger than
that of Y1, therefore it is also a minimum spanning tree of P and it contains e and all the edges
added before it during the construction of Y. Repeat the steps above and we will eventually obtain a
minimum spanning tree of P that is identical to Y. This shows Y is a minimum spanning tree.
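The weight comparison in the exchange step can be written out in one line. Here w(T) stands for the total edge weight of a tree T, a notation assumed for this note. Prim's algorithm picked e at a moment when f was also available, so w(e) ≤ w(f), and swapping f for e gives:

```latex
w(Y_2) = w(Y_1) - w(f) + w(e) \le w(Y_1)
```

Since Y1 is already a minimum spanning tree, w(Y2) cannot be smaller than w(Y1), so the two weights are equal and Y2 is a minimum spanning tree as well.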
Find the minimum spanning tree using Kruskal’s algorithm:
[Figure: weighted graph for this exercise. The transcript preserves only the vertex labels A, D, E, F, G, H, I and the edge weights 3, 5, 2, 4, 2, 4, 6, 4, 5, 5, 3, not which edges they belong to.]
Find the minimum spanning tree using Prim’s algorithm:
[Figure: graph for this exercise. The transcript preserves only the vertex labels A through I, not the edges or their weights.]