Chapter 12 - Graphs and Trees
12.1 Definitions
Graph Definitions
Tree Definitions
12.2 Tree Data Representations
Generic Tree
Binary Tree
Static Representation
Dynamic Data Representation
12.3 Graph Data Representations
Adjacency Matrix
Edge List
12.4 Traversals
Depth-First Traversals
Breadth-First Traversals
12.5 Applications
Implementing a Depth-First Traversal
Implementing a Breadth-First Traversal
Minimum Spanning Tree
Chapter 12 - Graphs and Trees
In this chapter we give a formal introduction to graphs and trees as abstract data types. We
investigate the various machine representations for trees and graphs and review a number of
sample applications using these representations.
12.1 Graphs
Graph Definition - A simple undirected graph G consists of a set of vertices V and a set of edges
E. The elements of E are pairs of elements of V, ek = (u,v), such that u is not equal to v. If
(u,v) is an element of E then (v,u) is also an element of E (in other words, (u,v) and (v,u)
represent the same edge).
Graphs can be represented pictorially by nodes and lines as shown below:
Figure 12-1: Types of Graphs (Graph, Multigraph, Simple Connected Graph, Directed Acyclic Graph)
Multigraphs allow multiple edges between the same pair of vertices and edges from and to the
same vertex.
The edges of a directed graph are called arcs and have a direction as indicated by an arrow.
Unlike an undirected graph, an arc (u,v) in a directed graph does not imply that the arc (v,u) is
also in the directed graph.
An acyclic graph is a graph with no cycles. That is, there is no path along edges in the graph (or
along arcs in a directed graph) that leads from a vertex back to the same vertex.
Two vertices u, v in a graph are said to be adjacent if there is an edge e (or arc) connecting u to
v. The vertices u and v are called the endpoints of e.
The degree of a vertex v is given as deg(v) and is the number of edges incident with v. That is,
the number of edges for which v is an endpoint.
A simple but useful theorem: $\sum_{v \in V} \deg(v) = 2|E|$ for any graph G=(V,E), where |E| is
the cardinality of E (i.e. the number of members of the set E). In words, this theorem states that
the sum of the degrees of the vertices in a graph is equal to twice the number of edges in the
graph. This is true since each edge has two endpoints and therefore contributes 2 to the total
degree count.
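The theorem is easy to check numerically. The sketch below is written in Python rather than Ada to keep it short; the edge list is a made-up example (a square with one diagonal), not one of the chapter's figures, and the function name degrees is our own.

```python
from collections import Counter

def degrees(edges):
    """Return a Counter mapping each vertex to its degree (simple undirected graph)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1  # each edge adds 1 to the degree of both of its endpoints
        deg[v] += 1
    return deg

# Hypothetical example: square a-b-c-d plus the diagonal a-c.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]
deg = degrees(edges)
print(sum(deg.values()), 2 * len(edges))  # both values are 10
```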
A complete graph is one in which there is an edge between every pair of vertices.
A cycle or ring is a connected graph in which there is exactly one path from any node back to
itself. Many local area networks (LANs) are based on ring topologies in which each computer is
represented by a node in the cycle or ring graph. In a ring, each node is connected to two other
nodes.
An n-vertex wheel is a graph composed of an (n-1)-vertex cycle graph plus one additional vertex
connected to the other n-1 vertices. The additional node has degree n-1 and in the application of
a computer network provides an additional level of connectivity (redundancy).
A bipartite graph G=(V,E) is one in which V can be partitioned into two disjoint subsets V1 and V2
such that every edge connects a vertex in V1 and a vertex in V2. Notice that no two nodes in the
same group are connected. Determining if a particular graph is bipartite is equivalent to the
problem of finding if there are any cycles of odd length in the graph. The simplest case is a
3-cycle: if there are edges (u,v), (u,w) and (v,w) in the edge list E of a graph G=(V,E) for any
three vertices u, v and w in V then G cannot be bipartite. Why?
Figure 12-2: Special Graphs (cycle or ring, wheel, complete graph K5, bipartite graph)
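One standard way to test whether a graph is bipartite is to try to assign every vertex to one of two groups so that no edge joins two vertices of the same group. The Python sketch below does this with a queue; the function name is_bipartite and the three small example graphs are our own assumptions. The pentagon has no 3-cycle but is still rejected, because any cycle of odd length rules out a two-group partition.

```python
from collections import deque

def is_bipartite(adj):
    """adj maps each vertex to a list of its neighbors (undirected graph)."""
    color = {}
    for start in adj:                      # cover every connected component
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # place v in the opposite group
                    queue.append(v)
                elif color[v] == color[u]:
                    return False              # an edge inside one group
    return True

triangle = {"u": ["v", "w"], "v": ["u", "w"], "w": ["u", "v"]}
square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
pentagon = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(is_bipartite(triangle), is_bipartite(square), is_bipartite(pentagon))
```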
A hypercube is a 2^n-vertex graph in which each vertex is connected to exactly n other vertices.
There are a number of parallel computer algorithms that run efficiently on a collection of
processors arranged in a hypercube topology.
Actually we can define a virtual hypercube topology for a collection of 2^n processors that are
connected in any type of physical (i.e. actual) network. All we need to do is to define which
machines are adjacent (connected to each other) in our hypercube. We can keep track of
processor adjacency using a clever processor addressing scheme. For 2^n processors we will
use an n-bit address, ensuring that adjacent (connected) processors have addresses that
differ in exactly one bit.
For example, in a 2^4 = 16 processor (i.e. 4-dimensional) hypercube the processor with address
0000 is connected to the processors with addresses 0001, 0010, 0100 and 1000.
What are the addresses of the processors connected to processor 1101 in a 4-dimensional
hypercube?
Figure 12-3: Hypercubes for n = 0, 1, 2, 3 and 4 (2^0 = 1, 2^1 = 2, 2^2 = 4, 2^3 = 8 and 2^4 = 16 vertices)
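Because adjacent processors differ in exactly one address bit, the neighbors of a processor can be generated by flipping each bit in turn. The following Python sketch (the function name neighbors is our own choice) reproduces the example given for processor 0000.

```python
def neighbors(address, n):
    """Return the n-bit addresses adjacent to `address` in an n-dimensional hypercube."""
    node = int(address, 2)
    # XOR with a single set bit flips that bit: one neighbor per dimension.
    return [format(node ^ (1 << k), f"0{n}b") for k in range(n)]

print(neighbors("0000", 4))  # ['0001', '0010', '0100', '1000']
```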
Sometimes we are interested in the minimum distance or number of steps between any two
processors in a computer network. The minimum separation between two processors in the
hypercube is equal to the number of bits difference between their respective addresses, therefore
we can directly compare processor addresses to calculate this distance. We can also generate a
sequence of connected processors through which to move data between processors. For
example to send data from processor 0000 to 1111 we can follow the path 0000 -> 0001 -> 0101
-> 1101 -> 1111. How many different minimal-length paths are there between processor
0000 and 1111?
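The distance calculation and one possible routing rule can be sketched in Python as follows. The names distance and route are our own, and the rule used here (correct the differing bits from the lowest bit upward) is only one of several equally short choices, so it produces a different minimal path from the one listed above.

```python
def distance(a, b):
    """Minimum number of hops between two addresses: the number of differing bits."""
    return bin(int(a, 2) ^ int(b, 2)).count("1")

def route(a, b):
    """One minimal path from a to b, correcting differing bits from the lowest bit up."""
    n = len(a)
    cur, target = int(a, 2), int(b, 2)
    path = [a]
    for k in range(n):
        bit = 1 << k
        if (cur ^ target) & bit:   # the two addresses disagree in bit k
            cur ^= bit             # hop to the neighbor that fixes this bit
            path.append(format(cur, f"0{n}b"))
    return path

print(distance("0000", "1111"))  # 4
print(route("0000", "1111"))     # ['0000', '0001', '0011', '0111', '1111']
```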
Using Graphs to Represent State Machines
Another important use for graphs in computer science is in the representation of finite state
machines (FSMs). A finite state machine is a model of computation composed of (1) a set of
states with (2) a particular state defined as the start state and one or more states designated as
accept states, (3) an input alphabet or list of valid symbols that must be considered, and (4) a list
of transitions to the next state based on the current state and the current symbol (from the
alphabet) being read.
The input is scanned by the FSM and for each symbol read the state is updated according to the
state transition list. When the entire input string has been scanned the final state is compared to
the set of accept states and, if the final state is one of the accept states, the input string is
accepted by the FSM. At this point an example would be helpful. Let the FSM be defined over the
alphabet {0,1} (i.e. binary strings), let the start state be State=0 and let the accept state be
State=3. Consider the following FSM for recognizing binary strings containing at least three 1's.
Figure 12-4: An FSM to Recognize Binary Strings with at least Three 1's
In this FSM the double circle indicates the accept state, the start state is labeled and the
transitions between states are labeled according to the input symbol currently being read. The
binary strings,
01010000     reject
01001110     accept
0111         accept
110000       reject
are labeled according to whether the FSM accepts or rejects them as a member of the class of
binary strings containing at least three 1's.
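A finite state machine like this one can be simulated with a transition table. In the Python sketch below the table layout and the names TRANSITIONS, ACCEPT and accepts are our own; the transitions are written from the description in the text (each 1 advances the count, each 0 leaves the state unchanged, and State 3 is never left).

```python
# TRANSITIONS[state][symbol] gives the next state.
# States 0..3 record how many 1's have been read so far (3 means "three or more").
TRANSITIONS = {
    0: {"0": 0, "1": 1},
    1: {"0": 1, "1": 2},
    2: {"0": 2, "1": 3},
    3: {"0": 3, "1": 3},
}
ACCEPT = {3}

def accepts(string, transitions=TRANSITIONS, start=0, accept=ACCEPT):
    """Scan the string one symbol at a time and test the final state."""
    state = start
    for symbol in string:
        state = transitions[state][symbol]
    return state in accept

for s in ["01010000", "01001110", "0111", "110000"]:
    print(s, "accept" if accepts(s) else "reject")
```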
Now consider the FSM for the class of binary strings containing at least three consecutive 1's. In
this case we need to keep a record of the 1's encountered and return to the start state when a 0
is read, if we have not read at least three consecutive 1's. Once we have read the necessary
three consecutive 1's we will remain in the accept state regardless of the remainder of the binary
string. This FSM has the following graphical representation.
Figure 12-5: An FSM to Recognize Binary Strings with at least Three Consecutive 1's
Applying this FSM to the sample binary strings listed below gives the indicated results.
0000110011001100     reject
0001110000000000     accept
1010101010000000     reject
11111111             accept
This FSM is similar to the previous example except that we must restart our count each time we
encounter a zero, until we have achieved the required result of three 1's in a row. Once we reach
this goal we will accept the string.
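The only change from a plain count of 1's is that a 0 read before the accept state is reached sends the machine back to the start state. A Python sketch under the same assumptions as before (the table layout and the names CONSECUTIVE and accepts_consecutive are ours):

```python
# CONSECUTIVE[state][symbol] gives the next state.
# States 0..2 hold the length of the current run of 1's; state 3 is the accept state.
CONSECUTIVE = {
    0: {"0": 0, "1": 1},
    1: {"0": 0, "1": 2},   # a 0 restarts the count
    2: {"0": 0, "1": 3},
    3: {"0": 3, "1": 3},   # once accepted, stay accepted
}

def accepts_consecutive(string):
    state = 0
    for symbol in string:
        state = CONSECUTIVE[state][symbol]
    return state == 3

samples = ["0000110011001100", "0001110000000000", "1010101010000000", "11111111"]
for s in samples:
    print(s, "accept" if accepts_consecutive(s) else "reject")
```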
Let's look at one more example of how directed graphs are used to represent FSMs. Consider
the FSM for accepting a string that can be interpreted as an integer. In this case our alphabet is
expanded to include any valid ASCII character. This will include the digits 0 through 9, the upper
and lowercase letters and all punctuation. We will define a valid integer to be any number of
leading blanks, followed by a possible + or - sign, followed by one or more digits, followed by any
number of trailing blanks. Some examples of valid and non-valid integers are listed below:
Valid Integers     Not Integers
123                123.456
123456             12+345
-543               Hello There
9                  9 3 5
The FSM for recognizing (accepting) integer strings as defined above is given by,
Figure 12-6: An FSM to Accept Valid Integer Strings
(b = blank, 0..9 = a digit, +/- = plus or minus sign, else = any other ASCII character)
where 0..9 is any ASCII digit, +/- is either a plus sign or a minus sign, b stands for spaces or
blanks and else represents any ASCII character not explicitly referenced in the transitions for the
current state. It is important to note that else refers to a different symbol set depending on the
current state. In State 0, else represents any ASCII character that is not a digit, a blank
space or a sign. In State 2, else represents any ASCII character that is not a digit. In State 5, else
represents any ASCII character. Finally, in this example we have two accept states (3 and 4).
If the FSM is left in either of these states the scanned string can be interpreted as an integer.
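The integer recognizer can be sketched in Python as a table keyed by symbol class. The roles assigned to the states here are an assumption chosen to agree with the description above (0 = start, 1 = leading blanks, 2 = sign read, 3 = digits, 4 = trailing blanks, 5 = reject); the names classify, TABLE and is_integer are our own.

```python
def classify(ch):
    """Map a character to the symbol classes used in the transition table."""
    if ch == " ":
        return "b"
    if ch in "+-":
        return "sign"
    if ch in "0123456789":
        return "digit"
    return "else"

# Any (state, class) pair missing from the table is an "else" transition to State 5.
TABLE = {
    0: {"b": 1, "sign": 2, "digit": 3},
    1: {"b": 1, "sign": 2, "digit": 3},
    2: {"digit": 3},
    3: {"digit": 3, "b": 4},
    4: {"b": 4},
    5: {},
}

def is_integer(text):
    state = 0
    for ch in text:
        state = TABLE[state].get(classify(ch), 5)
    return state in (3, 4)   # the two accept states

for s in ["123", "  -543  ", "123.456", "12+345", "Hello There", "9 3 5"]:
    print(repr(s), is_integer(s))
```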
12.2 Trees
Trees are special types of graphs that contain no cycles. A tree T is defined as a set of nodes V
with one node designated as the root node, and a list of edges E connecting the nodes without
creating cycles. Each of the other nodes can be considered as the root node of its own subtree.
Figure 12-7: Graphical Representation of a Tree (nodes A through M, with the root, a child of A, a
degree-4 parent of G, H, I and J, the subtrees and the leaf nodes marked; the height of the tree is 3)
The edges connect parent nodes to child nodes. When two nodes are connected by an edge the
node closer to the root node is called the parent and the other node is called the child. The
height of a tree is the number of edges from the root to the most distant leaf node. If a node is
not the root and not a leaf node then it is called an internal node.
There are three equivalent definitions of a tree in graph theory:
(1) a connected acyclic graph
(2) a graph with exactly one path between every pair of nodes
(3) a connected graph with exactly one more node than edges
Each of these definitions has its advantages for different applications involving trees.
One type of tree is of particular importance to computer science. A binary tree is a tree in which
each node has at most two children. The children are named the left child and the right child (if
they exist).
Figure 12-8: A Binary Tree (root, left child, right child)
Tree Representations and the Tree ADT - In computer science we learn about many different ways
to represent trees. For now we will see how a binary tree can be embedded in a list in a manner
similar to that used to represent stacks and queues. We can embed the nodes of a binary tree
into a one-dimensional list by defining a relationship between the position of each parent node
and the position of its children, with the root placed at position 1.
1. left_child of node i is 2*i
2. right_child of node i is 2*i+1
3. parent of node i is i/2 (integer division)
Figure 12-9: A List Representation of a Binary Tree
In order to move around in the tree we can develop functions that use these positional
relationships. We can also build a few functions and procedures to modify and return values in
the tree.
parent(T,a_node) - a function that returns the parent node of a_node in the tree T
left_child(T,a_node) - a function that returns the left child node of a_node in the tree T
right_child(T,a_node) - a function that returns the right child node of a_node in the tree T
root(T) - a function that returns the root node of tree T
is_avail(T,a_node) - a boolean function that tests if a_node is in the tree T
place_val(T,a_node,x) - a procedure that places value x at a_node in tree T
this_val(T,a_node) - a function that returns the value at a_node in tree T
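The positional rules and the operations listed above can be sketched in Python as a small class. This is only an illustration: the class name ListTree, the use of None to mark an empty position, and the decision to leave position 0 unused are our own assumptions.

```python
class ListTree:
    """Binary tree embedded in a list; node i has children at 2*i and 2*i+1."""

    def __init__(self, capacity):
        self.slots = [None] * (capacity + 1)   # position 0 is never used

    def root(self):
        return 1

    def parent(self, i):
        return i // 2          # integer division

    def left_child(self, i):
        return 2 * i

    def right_child(self, i):
        return 2 * i + 1

    def is_avail(self, i):
        """True if position i lies inside the list and holds a value."""
        return 1 <= i < len(self.slots) and self.slots[i] is not None

    def place_val(self, i, x):
        self.slots[i] = x

    def this_val(self, i):
        return self.slots[i]

t = ListTree(7)
t.place_val(t.root(), "A")
t.place_val(t.left_child(t.root()), "B")
t.place_val(t.right_child(t.root()), "C")
print(t.this_val(t.parent(3)), t.is_avail(2), t.is_avail(4))  # A True False
```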
In many applications, the binary tree is sparse. This means that many of the nodes do not have
two children. If we use a list to hold a binary tree of height n, the list must be capable of holding
around 2^(n+1) nodes, since a tree of height n can have as many as 2^(n+1) - 1 nodes. In this
situation many of the positions in the list would remain empty. It would be more efficient to
represent a sparse binary tree using dynamic memory.
Just as we created a record for elements of a linear linked list, we can create records to represent
the nodes of a binary tree.
type node;
type pointer is access node;
type node is record
   data : data_type;
   left_child : pointer;
   right_child : pointer;
end record;
These records can be linked together to construct a binary tree using the same programming
constructs used to create linked-lists in the stack and queue ADTs. The example below shows
the months of the year as the data elements in a binary tree. What is the significance of this
arrangement?
Figure 12-10: A Dynamic Memory Representation of a Binary Tree
In this representation, we have allocated twelve records, one for each month. Compare this
representation with the list representation defined above. Since the height of this tree is 5, its
deepest node lies 5 edges below the root, so we would need a list of size 2^6 = 64 to hold these
12 data values.
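To compare the two representations concretely, the Python sketch below links node records together and then reports how large a list the same tree would need. It assumes, for illustration only, that the months are inserted in calendar order into an alphabetically ordered binary search tree; this may not match the figure exactly. The names Node, insert, height and max_position are our own.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left_child = None
        self.right_child = None

def insert(root, data):
    """Insert data into an alphabetically ordered binary search tree; return the root."""
    if root is None:
        return Node(data)
    if data < root.data:
        root.left_child = insert(root.left_child, data)
    else:
        root.right_child = insert(root.right_child, data)
    return root

def height(node):
    """Number of edges from node to its most distant leaf (-1 for an empty tree)."""
    if node is None:
        return -1
    return 1 + max(height(node.left_child), height(node.right_child))

def max_position(node, i=1):
    """Largest list position needed if children of position i sit at 2*i and 2*i+1."""
    if node is None:
        return 0
    return max(i,
               max_position(node.left_child, 2 * i),
               max_position(node.right_child, 2 * i + 1))

months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
root = None
for m in months:
    root = insert(root, m)
print(height(root), max_position(root))  # 5 60
```

Under these assumptions twelve records are enough for the linked version, while the list version must reach position 60.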
12.3 Graph Data Representations
For graph data structures to be helpful in computer applications we must have an effective
method for representing them in a computer program. We will review a few of the more common
methods for representing graphs.
Adjacency Matrix - For a graph G with n vertices we create an nxn boolean (or equivalent) matrix.
We label the row and columns of the matrix with the names of the vertices. Each element of the
matrix represents a potential edge in the graph as defined by its associated vertex pair. We set
an element of the matrix to TRUE if there is an edge connecting the two corresponding vertices,
otherwise we set the element to FALSE. We can also use 1's and 0's to indicate the presence or
absence of an edge.
Figure 12-11: A Graph and Its Adjacency Matrix
In the example above the 1's represent the presence of an edge and a blank indicates that there
is no edge connecting the corresponding pair of vertices. The dashes indicate that there can be
no edge connecting a vertex to itself. For a weighted graph the edge value will replace the 1's.
Adjacency List - For each vertex we list the vertices connected to this vertex by an edge in
the graph. For an ordinary n-vertex connected graph the number of vertices connected to a
particular vertex is no greater than n-1. Therefore the edge list can be as large as n(n-1), or
n(n-1)/2 if each edge is represented only once.
If the graph has many more edges than vertices (approaching a complete graph) then the
adjacency matrix is the preferred method of representation. If the graph is sparse (i.e. the
number of edges is much less than the maximum number of edges) then an edge list is probably
preferred.
Figure 12-12: Sample Adjacency List
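The two representations carry the same information, so one can be built from the other. The Python sketch below uses a small made-up graph (four vertices, four edges) rather than the chapter's six-node sample graph; the function names to_matrix and to_adjacency_list are our own.

```python
def to_matrix(labels, edges):
    """Build an n x n adjacency matrix of 0's and 1's from a list of (u, v) edges."""
    index = {name: i for i, name in enumerate(labels)}
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1   # undirected: record both directions
    return matrix

def to_adjacency_list(labels, matrix):
    """For each vertex, list the vertices joined to it by an edge."""
    return {labels[i]: [labels[j] for j, bit in enumerate(row) if bit]
            for i, row in enumerate(matrix)}

labels = "ABCD"
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]   # hypothetical example
matrix = to_matrix(labels, edges)
adj = to_adjacency_list(labels, matrix)
print(adj)  # {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B', 'D'], 'D': ['C']}
```

Note that the adjacency list stores each edge twice, once for each endpoint.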
Dynamic Memory Representations - The graph representation methods described above use
static data structures such as arrays and lists. Sometimes we need to use a method of
representation that permits arbitrary growth and/or restructuring of the graph. We can use
dynamic memory to accomplish this. Representing graphs using records and pointers
should be considered only in those cases in which the adjacency matrix or adjacency list
representations are not feasible.
A graphical representation of the dynamic memory data structure for our six-node sample graph
is provided below. In this representation we have two types of nodes. There is a node record for
each vertex as shown in the node list. A node record has a pointer to the next node (this pointer
does not represent an edge in the graph) and another pointer to an incidence list. The incidence
record contains two pointers. One pointer points to the next incidence record for this node and
the other points to one of the vertices that is adjacent to this vertex.
Figure 12-13: A Dynamic Memory Graph Representation
In our example vertex A is adjacent to 4 vertices so its incidence list has 4 records. The
edge pointers for vertex A point to records B, C, D and E. The complexity of the data
structure for this simple graph indicates that this representation method should be
considered only if warranted by the application. One area where such representations are
helpful is computer graphics, in which it is important to be able to access adjacent facets
in a data model of a three-dimensional object. Each facet corresponds to a record containing
the facet normal vector. The type of shading to be applied to the surface depends on the
normals of the neighboring facets, which are directly accessible through the incidence list.
Figure 12-14: 3D Wire frame Data Model (showing facet normals)
12.4 Traversals
An important operation on graphs and trees is the ability to move through the data representation,
testing each vertex and/or edge value. This evaluate-and-move operation is referred to as a
traversal. There are two popular methods of traversal used in solving many of the problems
involving graphs and trees. They are called depth-first traversal and breadth-first traversal.
In any traversal we need to establish a convention for ordering the vertices. In our examples we
will use alpha-numeric ordering of the vertex labels. This means that if two or more vertices can
be chosen next, we will select the vertex with a label that would come first in an alpha-numeric
sort.
Depth-First Traversal - In a depth-first traversal (DFT) the first child of a vertex is expanded
before any of its siblings. A depth-first traversal is best implemented using a recursive algorithm
(implies the use of a stack data structure). The ordering rule is applied recursively so that we
obtain the following order of vertices in a depth-first traversal from the root of the tree shown
below. ABEMNFGCHDIOPQJKLR.
Figure 12-15: An Example Generic Tree (nodes A through R)
Starting with the root node A we have a choice of B,C and D to expand. We choose B (by our
alpha-numeric ordering convention) and place B at top of our stack. Next we consider the
children of B and choose E. The children of E are M and N so we choose M. At this point we
have reached a leaf node (i.e. a node with no children) so we pop the stack returning to node E
and expand the next child of E which is node N. Node N is also a leaf node so we pop the stack
back to E. However, E has no more unexpanded children so we pop the stack again back to
node B. The unexpanded children of B are F and G which are added to our list in order. Popping
back to A gives us access to the second child of A which is C. Node C has a single child H which is
added to the list. We pop back to A to pick up its remaining child node D and so on.
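The walk-through above maps directly onto a short recursive routine. In the Python sketch below the tree is described by listing the children of each node; the shape of the tree is inferred from the traversal order given in the text, and the placement of the nodes beneath D in particular is an assumption. The names TREE and depth_first are our own.

```python
# Children of each internal node (leaf nodes are simply absent from the table).
TREE = {
    "A": ["B", "C", "D"],
    "B": ["E", "F", "G"],
    "C": ["H"],
    "D": ["I", "J", "K", "L"],
    "E": ["M", "N"],
    "I": ["O", "P", "Q"],
    "L": ["R"],
}

def depth_first(node, tree, order=None):
    """Visit node, then fully expand each child (in alpha-numeric order) before the next."""
    if order is None:
        order = []
    order.append(node)
    for child in sorted(tree.get(node, [])):
        depth_first(child, tree, order)   # the call stack plays the role of our stack
    return order

print("".join(depth_first("A", TREE)))  # ABEMNFGCHDIOPQJKLR
```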
A DFT can be implemented for a graph as well as a tree. In a graph traversal we need to keep a
record of the nodes that have been traversed so that they are not entered more than once. In the
example below we will perform a depth-first search of the graph shown below starting at node 1.
Figure 12-16: Sample Graph Showing the Beginning of a Depth-First Traversal (DFT)
The nodes adjacent to node 1 are 2, 3 and 4 so we choose 2 (by our convention). From node 2
we choose 4 since node 1 has already been expanded. From node 4 we choose node 5 since it
is the first unexpanded node in an alpha-numeric ordering of 1, 2 and 5 (actually it's the only
unexpanded node reachable from node 4). From node 5 we choose 3 and then 6.
Figure 12-17: Completed DFT
Breadth-First Traversal
In a breadth-first traversal (BFT) all the children of the current node (i.e. the node currently being
evaluated) are placed onto the queue before any of their children are considered. A breadth-first
traversal is obtained by following the stepwise procedure below:
Step 1: Place the starting node onto the queue.
Step 2: If the queue is not empty retrieve the node that is at the front of the queue
and add its label to the order-of-traversal list, otherwise STOP.
Step 3: Place all the un-encountered children of this node onto the queue (in alphanumeric order) and tag them as having been encountered.
Step 4: Return to Step 2.
The order-of-traversal list is a list of all the reachable nodes in the graph or tree arranged in the
order in which they were encountered in the breadth-first traversal. Note that, just as in the
depth-first traversal, we have to keep track of which nodes have been encountered so that we
don't evaluate any node more than once.
Let's try this procedure on the sample graph shown below. We will start with node E and use our
alpha-numeric ordering convention for arranging the nodes encountered at each step.
We choose node E as our arbitrary start node (the actual starting node would be determined by
the requirements of the application or problem being solved). We place E onto our queue and
tag it as a used node.
Queue: E <-front of queue
Tagged: E
Order-Of-Traversal: [empty]
Figure 12-18: Sample BFT Graph (nodes A through I)
The node at the front of the queue is E, so we remove it and add its label to the
Order-Of-Traversal (OOT) list. The children of E are A, C and H. Since none of these nodes
has been encountered, they are all placed onto the queue.
Queue: H C A <-front of queue
Tagged: A C E H
Order-Of-Traversal: E
Node A is now at the front of the queue so we remove it, add its label to the OOT list and place
the unused children of A (B and D) onto the queue.
Queue: D B H C <-front of queue
Tagged: A B C D E H
Order-Of-Traversal: E A
Node C is the next node at the front of the queue. We remove C, add its label to the OOT list and
place its child, node I, onto the queue.
Queue: I D B H <-front of queue
Tagged: A B C D E H I
Order-Of-Traversal: E A C
Node H is the next node at the front of the queue. The children of H are D, E, F, G and I but only
F and G are new nodes, so we place these two nodes onto the queue.
Queue: G F I D B <-front of queue
Tagged: A B C D E F G H I
Order-Of-Traversal: E A C H
Since all the nodes have been encountered we can flush the queue as we add the nodes to the
OOT list. The final list is,
Order-Of-Traversal: E A C H B D I F G
12.5 Applications
Implementing a Depth-First Traversal - There are many ways to implement a depth-first traversal
of a graph. One of the most efficient is a recursive algorithm. We will represent the graph in an
adjacency matrix.
type graphtype is array(1..100,1..100) of integer;
type nodelistype is array(1..100) of integer;
We have also created a 1-dimensional array data type for the node list and the traversal list. The
node list will be used to keep track of which nodes have been placed in the traversal list. Now we
need to declare variables.
graph      : graphtype;
nodes      : nodelistype;
start_node : integer;
trav_list  : nodelistype;
trav_num   : integer;
num        : integer;
The array graph is the adjacency matrix, nodes is the node list, start_node is the label of the node
at which we will begin our traversals, trav_list will hold the list of node labels indicating the order
in which the nodes are evaluated, trav_num will be used as the index of the next available
position in trav_list and num is the number of nodes in the graph.
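procedure DFT(node : in integer) is
begin
   nodes(node):=1;                 -- tag this node as placed in the traversal list
   trav_list(trav_num):=node;      -- record it at the next available position
   trav_num:=trav_num+1;
   for col in 1..num loop
      -- recurse into each adjacent node that has not yet been tagged
      if graph(node,col)=1 and nodes(col)=0 then
         DFT(col);
      end if;
   end loop;
end DFT;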
In our example, we call DFT(1) which places node 1 in the trav_list and checks for other unused
nodes adjacent to node 1. Nodes 2, 3 and 4 are all adjacent to node 1 but as soon as DFT(2) is
called the copy of DFT(1) is pushed onto the process control stack and DFT(2) begins its
operation. This means that DFT(4) is called by DFT(2) before DFT(3) is called by DFT(1). You
should work through the execution of this program by hand in order to better understand this
process.
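For readers who would like to check a hand trace, here is a Python sketch of the same recursive procedure. The edge set is an assumption: it is one six-node graph consistent with the walk-through of the depth-first traversal given earlier (edges 1-2, 1-3, 1-4, 2-4, 3-5, 4-5 and 5-6), and the names dft and visit are our own.

```python
def dft(graph, start):
    """graph is an adjacency matrix (lists of 0/1); nodes are numbered from 1."""
    num = len(graph)
    visited = [False] * (num + 1)   # plays the role of the node list
    trav_list = []

    def visit(node):
        visited[node] = True
        trav_list.append(node)
        for col in range(1, num + 1):
            if graph[node - 1][col - 1] == 1 and not visited[col]:
                visit(col)

    visit(start)
    return trav_list

# Assumed sample graph, stored as a symmetric adjacency matrix.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 4), (3, 5), (4, 5), (5, 6)]
graph = [[0] * 6 for _ in range(6)]
for u, v in EDGES:
    graph[u - 1][v - 1] = graph[v - 1][u - 1] = 1

print(dft(graph, 1))  # [1, 2, 4, 5, 3, 6]
```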
Implementing a Breadth-First Traversal - In the following section we develop an implementation
of the breadth-first traversal algorithm as an Ada procedure. As usual, our first task is to choose
a data structure. Since we want to use a queue to maintain our order of nodes we will naturally
make use of our ADT generic adt_queue. We will instantiate an integer queue as part of our BFT
program declaration.
package my_queue is new adt_queue(integer);
use my_queue;
We will use an adjacency matrix to hold the graph description. The data table below will be
saved as a text file named bft_graph.dat.
9
ABCDEFGHI
0 1 0 1 1 0 0 0 0
1 0 0 0 0 1 1 0 0
0 0 0 0 1 0 0 0 1
1 0 0 0 0 0 0 1 0
1 0 1 0 0 0 0 1 0
0 1 0 0 0 0 0 1 0
0 1 0 0 0 0 0 1 1
0 0 0 1 1 1 1 0 1
0 0 1 0 0 0 1 1 0
The first number is the number of nodes in the graph, which we will call nnodes. The next line is
a list of 9 node labels. We are using the same example graph that we used in our previous
example as shown in Figure 12-18. The adjacency matrix is the 9x9 array of 1's and 0's. Each
row and column correspond to a node while the 1's represent edges connecting pairs of nodes.
For example, the first row (which is associated with node A) has 1's in columns 2, 4 and 5
corresponding to edges connecting node A to nodes B, D and E.
We need a data structure to hold these data in our BFT program.
maxnodes : constant integer :=9;
type adjmatype is array(1..maxnodes,1..maxnodes)of integer;
adjmat : adjmatype;
nnodes : integer;
Although we have only 9 nodes in our sample graph we can set maxnodes to handle larger
graphs. We must be careful to use maxnodes everywhere we refer to the maximum number of
nodes so that we will have to change only one value.
We also need data structures to hold the list of labels (node symbols A through I), and lists for
the order of traversal and to keep a record of the tagged (used) nodes.
type symlistype is array(1..maxnodes)of character;
type numlistype is array(1..maxnodes)of integer;
label : symlistype;
oot   : symlistype;
noot  : integer := 0;
used  : numlistype;
The oot array will hold the list of node labels in the order they are encountered in the BFT. This
will be the output of our program. The integer noot will keep track of how many node labels have
been placed on the oot list as the program runs.
The used list will be initially set to all 0's. As nodes are added to the back of the queue the
corresponding element of the used list will be set to 1. The BFT program will refer to these
values to see if a node has been used before it is added to the queue.
Before performing the traversal we will need to load the data from the text file into the program
data structures and to initialize the used list. Once the data has been loaded we will need to
select a starting node for the traversal. In our demonstration program we will ask the user to
choose a starting node (by number). We will then enqueue this starting index into Q and tag this
node as used.
put("Enter index of starting node... ");
get(start);
enqueue(Q,start);
used(start):=1;
Once the starting node has been placed into the queue we can exercise our BFT algorithm by
dequeuing each node (index) from the queue, adding this node to the order of traversal (oot) list
and placing the unused children of this node at the back of the queue, in order.
while not(is_empty(Q)) loop
   dequeue(Q,curval);
   noot:=noot+1;
   oot(noot):=label(curval);
   for j in 1..nnodes loop
      if adjmat(curval,j)=1 and used(j)=0 then
         enqueue(Q,j);
         used(j):=1;
      end if;
   end loop;
end loop;
At the top of the while..loop we take the next node index from the front of the queue Q and place
its label into the oot list. Note that we use the index curval to choose the correct node label from
the label list.
At the bottom of the while..loop we scan all the nodes in the graph, testing to see if a node is
adjacent to the current node and if it has not yet been used. If both of these conditions are true
then we enqueue this node and tag it as used.
This process repeats until the queue is empty. Note that once all the nodes have been placed in
the queue the for..loop at the bottom of the while..loop could be skipped and the remaining
nodes in the queue could be dequeued and placed into the oot list. The data below shows the
output for our BFT demonstration program.
Enter graph file name... bft_graph.dat

    A  B  C  D  E  F  G  H  I
 A  0  1  0  1  1  0  0  0  0
 B  1  0  0  0  0  1  1  0  0
 C  0  0  0  0  1  0  0  0  1
 D  1  0  0  0  0  0  0  1  0
 E  1  0  1  0  0  0  0  1  0
 F  0  1  0  0  0  0  0  1  0
 G  0  1  0  0  0  0  0  1  1
 H  0  0  0  1  1  1  1  0  1
 I  0  0  1  0  0  0  1  1  0

  A  B  C  D  E  F  G  H  I
  1  2  3  4  5  6  7  8  9

Enter index of starting node... 5
E A C H B D I F G
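The same traversal can be reproduced in a few lines of Python, which is a convenient way to check a hand trace. This sketch is not the chapter's Ada program: it uses collections.deque in place of the adt_queue package, holds the matrix in the source instead of reading bft_graph.dat, and the function name bft is our own.

```python
from collections import deque

LABELS = "ABCDEFGHI"
ADJMAT = [
    [0, 1, 0, 1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0, 1, 1, 0],
]

def bft(adjmat, labels, start):
    """Breadth-first traversal; start is a 1-based node index as in the Ada program."""
    used = [False] * len(labels)
    queue = deque([start - 1])
    used[start - 1] = True
    oot = []                              # order-of-traversal list
    while queue:
        cur = queue.popleft()             # take the node at the front of the queue
        oot.append(labels[cur])
        for j, edge in enumerate(adjmat[cur]):
            if edge == 1 and not used[j]:
                queue.append(j)           # unused neighbors go to the back
                used[j] = True
    return oot

print(" ".join(bft(ADJMAT, LABELS, 5)))  # E A C H B D I F G
```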
A complete listing of the BFT_demo.adb program is given below.
with ada.text_io, ada.integer_text_io, adt_queue;
use ada.text_io, ada.integer_text_io;
procedure BFT_demo is
   maxnodes : constant integer :=30;
   type adjmatype is array(1..maxnodes,1..maxnodes)of integer;
   type symlistype is array(1..maxnodes)of character;
   type numlistype is array(1..maxnodes)of integer;
   package my_queue is new adt_queue(integer);
   use my_queue;
   fname  : string(1..20);
   fleng  : integer;
   datin  : file_type;
   Q      : qtype;
   start  : integer;
   curval : integer;
   adjmat : adjmatype;
   nnodes : integer;
   oot    : symlistype;
   noot   : integer:=0;
   label  : symlistype;
   used   : numlistype;
   -- read the node count, the labels and the adjacency matrix, then echo the matrix
   procedure loadgraph is
   begin
      put("Enter graph file name... ");
      get_line(fname,fleng);
      open(datin,in_file,fname(1..fleng));
      get(datin,nnodes);
      for i in 1..nnodes loop
         get(datin,label(i));
         used(i):=0;
      end loop;
      for i in 1..nnodes loop
         for j in 1..nnodes loop
            get(datin,adjmat(i,j));
         end loop;
      end loop;
      close(datin);
      new_line(2);
      put("   ");
      for i in 1..nnodes loop
         put(label(i));
         put(" ");
      end loop;
      new_line;
      for i in 1..nnodes loop
         put(" ");
         put(label(i));
         for j in 1..nnodes loop
            put(adjmat(i,j),3);
         end loop;
         new_line;
      end loop;
      new_line;
   end loadgraph;
begin
   loadgraph;
   -- list the labels with their indices so the user can pick a starting node
   for i in 1..nnodes loop
      put(" ");
      put(label(i));
   end loop;
   new_line;
   for i in 1..nnodes loop
      put(i,3);
   end loop;
   new_line(2);
   put("Enter index of starting node... ");
   get(start);
   enqueue(Q,start);
   used(start):=1;
   while not(is_empty(Q)) loop
      dequeue(Q,curval);
      noot:=noot+1;
      oot(noot):=label(curval);
      for j in 1..nnodes loop
         if adjmat(curval,j)=1 and used(j)=0 then
            enqueue(Q,j);
            used(j):=1;
         end if;
      end loop;
   end loop;
   -- print only the noot entries that were filled (nodes reachable from start)
   for i in 1..noot loop
      put(oot(i));
      put(" ");
   end loop;
end BFT_demo;
Graphs and trees are important structures in computer science because they give us efficient
methods for modeling complex systems. In the next chapter, we will use these structures and the
associated traversal methods to implement efficient algorithms for managing large data sets.
Minimum Spanning Tree - The graph shown below is called a weighted graph because its edges
have values or weights assigned to them. The Minimum Spanning Tree (MST) problem is to find
a tree that is made up of all the nodes in the graph and a subset of the edges such that the sum
of the edge weights is a minimum.
To implement an algorithm for the minimum spanning tree problem we need a data
representation for weighted graphs. To solve the problem manually, we can look at a picture of
the graph, but for computer analysis an edge list or adjacency matrix representation is preferred.
As shown in the figure above the weight of each edge is included as a parameter in the edge list
and the weight replaces the boolean value in an adjacency matrix. The MST problem can be
implemented using either of these data structures.
Prim's Algorithm is an efficient method for finding the minimum spanning tree in a weighted
graph. This algorithm is stated formally as follows:
Given a weighted graph G consisting of a set of vertices V and a set of edges E with weights
$w_{i,j}$, where,
$$G = \{V, E\}$$
$$V = \{\text{set of all } v_i,\ i = 1..n\}$$
$$E = \{\text{set of all } e_{i,j} = (v_i, v_j, w_{i,j})\}$$
Prepare a vertex list VP and an edge list EP (both initially empty) to hold the vertices and
edges selected by Prim's Algorithm.
1. Choose any starting vertex vi and place it in the vertex list VP.
2. Find the smallest-weight edge ei,j incident with a vertex in the vertex list whose inclusion
in the edge list will not create a cycle. This can be done by verifying that the other vertex
vj is not already in the vertex list.
3. Include this edge in the edge list EP and the associated vertex vj in the vertex list VP.
4. Repeat Steps 2 and 3 until all vertices of the graph are in the vertex list VP.
The solution to the MST problem is the edge list EP, and the sum of the weights of the edges in
EP is the weight of the minimum spanning tree (sometimes we say minimal, since there may be
more than one spanning tree with this minimum weight).
The type of graph traversal being performed in Prim's Algorithm is neither purely depth-first nor
breadth-first since the only edges that are examined are those incident with vertices in the
selected vertex list. A simple way to implement Steps 2 and 3 is in a loop as shown in the
pseudo-code below:
wmin = some_large_value
for every edge ek=(vi,vj,wij) in E loop
   if [(vi in VP and vj not in VP) or (vi not in VP and vj in VP)] and wij<wmin then
      wmin = wij
      emin = ek
   end if
end loop
Include emin in EP and both endpoints of emin in VP.
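A complete version of this loop, repeated until every vertex has been selected, might look like the following Python sketch. The weighted graph used here is a made-up five-vertex example, not the one in the chapter's figure, and the function name prim is our own.

```python
def prim(vertices, edges, start):
    """edges is a list of (vi, vj, wij) triples describing an undirected weighted graph."""
    vp = {start}   # vertex list VP
    ep = []        # edge list EP
    while len(vp) < len(vertices):
        emin = None
        for vi, vj, wij in edges:
            # An edge is a candidate only if exactly one endpoint is already in VP,
            # which guarantees that adding it cannot create a cycle.
            crosses = (vi in vp) != (vj in vp)
            if crosses and (emin is None or wij < emin[2]):
                emin = (vi, vj, wij)
        if emin is None:
            raise ValueError("graph is not connected")
        ep.append(emin)
        vp.update(emin[:2])   # add both endpoints; one of them is already present
    return ep

# Hypothetical weighted graph.
vertices = ["A", "B", "C", "D", "E"]
edges = [("A", "B", 4), ("A", "C", 1), ("B", "C", 2), ("B", "D", 5),
         ("C", "D", 8), ("D", "E", 3), ("C", "E", 9)]
tree = prim(vertices, edges, "A")
print(tree, sum(w for _, _, w in tree))
# [('A', 'C', 1), ('B', 'C', 2), ('B', 'D', 5), ('D', 'E', 3)] 11
```

Scanning the whole edge list on every pass keeps the code close to the pseudo-code; it is simple rather than fast.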
Exercises
1. How many edges are in,
a. an N-node complete graph?
b. an N-node cycle?
c. an N-node wheel?
d. an N-level hypercube?
2. Create finite-state machines to recognize the following sets. Assume an alphabet {0,1}.
a. all binary strings with an even number of 1's.
b. all binary strings with an odd number of 1's and an even number of 0's.
c. all binary strings containing the substring 1101.
d. all binary strings that do not contain the substring 1001.
3. Modify the integer recognizer FSM below to reject strings with leading or trailing blanks.
(FSM diagram as in Figure 12-6: b = blank, 0..9 = a digit, +/- = plus or minus sign, else = any
other ASCII character)
4. Give a fixed-size (linear) list showing the placement of the nodes in a list representation of the
binary tree shown in Figure 12-10. You should not show any edges in your sketch.
5. Write an Ada program that determines the maximum degree of any node in the graph. Your
program should return the label of the node and its degree. Choose an appropriate graph data
representation (adjacency matrix or adjacency list) and explain your choice.
6. Write an Ada program that finds cycles in a directed graph.
7. Write an Ada program that determines the height of a binary tree given in a list representation.
8. Which traversal method (DFT or BFT) would be best for each of the following operations on a
graph?
a. Finding the shortest path between a pair of nodes.
b. Finding the shortest cycle from a node back to the same node.
c. Finding a cycle involving all the nodes in the graph (a Hamiltonian cycle).
d. Finding a subset of edges that make up a spanning tree (connects all nodes).