Chapter 12 - Graphs and Trees

12.1 Definitions
    Graph Definitions
    Tree Definitions
12.2 Tree Data Representations
    Generic Tree
    Binary Tree
    Static Representation
    Dynamic Data Representation
12.3 Graph Data Representations
    Adjacency Matrix
    Edge List
12.4 Traversals
    Depth-First Traversals
    Breadth-First Traversals
12.5 Applications
    Implementing a Depth-First Traversal
    Implementing a Breadth-First Traversal
    Minimum Spanning Tree

In this chapter we give a formal introduction to graphs and trees as abstract data types. We investigate the various machine representations for trees and graphs and review a number of sample applications using these representations.

12.1 Graphs

Graph Definition - A simple undirected graph G consists of a set of vertices V and a set of edges E. The elements of E are pairs of elements of V, ek = (u,v), such that u is not equal to v, and (u,v) being an element of E implies that (v,u) is also an element of E. (In other words, (u,v) and (v,u) represent the same edge.) Graphs can be represented pictorially by nodes and lines as shown below:

[Figure 12-1: Types of Graphs (panel labels: Simple Connected Graph, Directed Acyclic Graph, Graph, Multigraph)]

Multigraphs allow multiple edges between the same pair of vertices, as well as edges that start and end at the same vertex. The edges of a directed graph are called arcs and have a direction, indicated by an arrow. Unlike in an undirected graph, an arc (u,v) in a directed graph does not imply that the arc (v,u) is also in the directed graph. An acyclic graph is a graph with no cycles. That is, there is no path along edges in the graph (or along arcs in a directed graph) that leads from a vertex back to the same vertex.

Two vertices u and v in a graph are said to be adjacent if there is an edge e (or arc) connecting u to v. The vertices u and v are called the endpoints of e. The degree of a vertex v is written deg(v) and is the number of edges incident with v, that is, the number of edges for which v is an endpoint.
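The definitions above translate fairly directly into code. The following is a minimal sketch, written in Python rather than the Ada used later in this chapter, assuming a hypothetical four-vertex graph. The vertex labels, the edge set and the helper names adjacent and degree are illustrative assumptions, not part of the chapter.

```python
# Hypothetical example graph: the labels and edges are illustrative only.
vertices = {"a", "b", "c", "d"}

# Each undirected edge is stored once as an unordered pair (a frozenset),
# so (u,v) and (v,u) are automatically the same edge.
edges = {frozenset(pair) for pair in [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]}


def adjacent(u, v):
    """Return True if an edge connects u and v."""
    return frozenset((u, v)) in edges


def degree(v):
    """Return the number of edges for which v is an endpoint."""
    return sum(1 for e in edges if v in e)


if __name__ == "__main__":
    for v in sorted(vertices):
        print(v, degree(v))
```

Storing each edge as an unordered pair is one way to guarantee that (u,v) and (v,u) are treated as a single edge. A simple graph has no edge from a vertex to itself, so adjacent(v, v) is always False in this sketch.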
A simple but useful theorem: $2|E| = \sum_{v \in V} \deg(v)$ for G=(V,E), where |E| is the cardinality of E (i.e. the number of members of the set E). In words, this theorem states that the sum of the degrees of the vertices in a graph is equal to twice the number of edges in the graph. This is true since each edge has two endpoints and therefore contributes 2 to the total degree count.

A complete graph is one in which there is an edge between every pair of vertices. A cycle or ring is a connected graph in which there is exactly one path from any node back to itself. Many local area networks (LANs) are based on ring topologies in which each computer is represented by a node in the cycle or ring graph. In a ring, each node is connected to two other nodes. An n-vertex wheel is a graph composed of an (n-1)-vertex cycle plus one additional vertex connected to the other n-1 vertices. The additional node has degree n-1 and, in a computer network, provides an additional level of connectivity (redundancy).

A bipartite graph G=(V,E) is one in which V can be partitioned into two disjoint subsets V1 and V2 such that every edge connects a vertex in V1 and a vertex in V2. Notice that no two nodes in the same subset are connected. Determining whether a particular graph is bipartite is equivalent to determining whether the graph contains any cycle of odd length. The simplest case is a 3-cycle: if there are edges (u,v), (u,w) and (v,w) in the edge list E of a graph G=(V,E) for any three vertices u, v and w in V, then G cannot be bipartite. Why?

[Figure 12-2: Special Graphs (cycle or ring, wheel, complete graph K5, bipartite graph)]

A hypercube is a 2^n-vertex graph in which each vertex is connected to exactly n other vertices. There are a number of parallel computer algorithms that run efficiently on a collection of processors arranged in a hypercube topology. Actually we can define a virtual hypercube topology for a collection of 2^n processors that are connected in any type of physical (i.e.
actual) network. All we need to do is to define which machines are adjacent (connected to each other) in our hypercube. We can keep track of processor adjacency using a clever processor addressing scheme. For 2^n processors we will use an n-bit address, ensuring that adjacent (connected) processors have addresses that differ in exactly one bit. For example, in a 2^4 = 16 processor (i.e. 4-dimensional) hypercube the processor with address 0000 is connected to the processors with addresses 0001, 0010, 0100 and 1000. What are the addresses of the processors connected to processor 1101 in a level 4 hypercube?

[Figure 12-3: Hypercubes for N = 1, 2, 3 and 4 (panels labeled n=0, 2^0=1; n=1, 2^1=2; n=2, 2^2=4; n=3, 2^3=8; n=4, 2^4=16; vertices labeled with binary addresses)]

Sometimes we are interested in the minimum distance or number of steps between any two processors in a computer network. The minimum separation between two processors in the hypercube is equal to the number of bit positions in which their addresses differ, so we can compare processor addresses directly to calculate this distance. For example, addresses 0000 and 1111 differ in all four bits, so these two processors are four steps apart. We can also generate a sequence of connected processors through which to move data between processors. For example, to send data from processor 0000 to 1111 we can follow the path 0000 -> 0001 -> 0101 -> 1101 -> 1111. How many different minimal-length paths are there between processor 0000 and 1111?

Using Graphs to Represent State Machines

Another important use for graphs in computer science is in the representation of finite state machines (FSMs). A finite state machine is a model of computation composed of (1) a set of states, with (2) a particular state defined as the start state and one or more states designated as accept states, (3) an input alphabet or list of valid symbols that must be considered, and (4) a list of transitions to the next state based on the current state and the current symbol (from the alphabet) being read.
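The four components of this definition map naturally onto a small table-driven simulator. The sketch below is written in Python for illustration (the chapter's own listings are in Ada) and assumes a hypothetical two-state machine that accepts binary strings whose last symbol is 1. The function name run_fsm, the table ENDS_IN_ONE and the machine itself are assumptions made for this example.

```python
def run_fsm(transitions, start, accept, text):
    """Scan text one symbol at a time, then test the final state.

    transitions maps (current state, symbol) to the next state,
    start is the start state and accept is the set of accept states.
    """
    state = start
    for symbol in text:
        state = transitions[(state, symbol)]
    return state in accept


# Hypothetical machine over the alphabet {0,1}:
# state 1 means "the last symbol read was 1", state 0 means anything else.
ENDS_IN_ONE = {
    (0, "0"): 0, (0, "1"): 1,
    (1, "0"): 0, (1, "1"): 1,
}

if __name__ == "__main__":
    for s in ["0101", "10", ""]:
        print(repr(s), run_fsm(ENDS_IN_ONE, 0, {1}, s))
```

Keeping the transitions in a table, rather than in nested if statements, means that a different machine over the same alphabet needs only a different table.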
The input is scanned by the FSM and, for each symbol read, the state is updated according to the state transition list. When the entire input string has been scanned, the final state is compared to the set of accept states and, if the final state is one of the accept states, the input string is accepted by the FSM.

At this point an example would be helpful. Let the FSM be defined over the alphabet {0,1} (i.e. binary strings), let the start state be State=0 and let the accept state be State=3. Consider the following FSM for recognizing binary strings containing at least three 1's.

[Figure 12-4: An FSM to Recognize Binary Strings with at least Three 1's (states 0 to 3, transitions labeled 0 and 1)]

In this FSM the double circle indicates the accept state, the start state is labeled, and the transitions between states are labeled according to the input symbol currently being read. The binary strings below are labeled according to whether the FSM accepts or rejects them as members of the class of binary strings containing at least three 1's.

    01010000    reject
    01001110    accept
    0111        accept
    110000      reject

Now consider the FSM for the class of binary strings containing at least three consecutive 1's. In this case we need to keep a record of the 1's encountered and return to the start state when a 0 is read, if we have not yet read three consecutive 1's. Once we have read the necessary three consecutive 1's we will remain in the accept state regardless of the remainder of the binary string. This FSM has the following graphical representation.

[Figure 12-5: An FSM to Recognize Binary Strings with at least Three Consecutive 1's (states 0 to 3, transitions labeled 0 and 1)]

Applying this FSM to the sample binary strings listed below gives the indicated results.
    0000110011001100    reject
    0001110000000000    accept
    1010101010000000    reject
    11111111            accept

This FSM is similar to the previous example except that we must restart our count each time we encounter a zero, until we have achieved the required result of three 1's in a row. Once we reach this goal we will accept the string.

Let's look at one more example of how directed graphs are used to represent FSMs. Consider the FSM for accepting a string that can be interpreted as an integer. In this case our alphabet is expanded to include any valid ASCII character. This includes the digits 0 through 9, the uppercase and lowercase letters and all punctuation. We will define a valid integer to be any number of leading blanks, followed by an optional + or - sign, followed by one or more digits, followed by any number of trailing blanks. Some examples of valid and non-valid integers are listed below:

    Valid Integers    Not Integers
    123               123.456
    123456            12+345
    -543              Hello There
    9                 9 3 5

The FSM for recognizing (accepting) integer strings as defined above is given by,

[Figure 12-6: An FSM to Accept Valid Integer Strings (states 0 to 5; legend: b = blank, 0..9 = a digit, +/- = plus or minus sign, else = any other ASCII character)]

where 0..9 is any ASCII digit, +/- is either a plus sign or a minus sign, b stands for a space or blank, and else represents any ASCII character not explicitly referenced in the transitions for the current state. It is important to note that else refers to a different symbol set depending on the current state. In State 0, else represents any ASCII character that is not a digit, a blank space or a sign. In State 2, else represents any ASCII character that is not a digit. In State 5, else represents any ASCII character. Finally, in this example we have two accept states (3 and 4). If the FSM is left in either of these states the scanned string can be interpreted as an integer.
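The integer recognizer can be sketched as a table of transitions keyed by character class. The Python code below is a hedged illustration rather than the chapter's Ada. It assumes a state layout reconstructed from the description above: 0 is the start state, 1 is entered after leading blanks, 2 after a sign, 3 while reading digits, 4 while reading trailing blanks, and 5 is the reject state reached by every else transition. The names classify, TRANSITIONS and is_integer are hypothetical.

```python
BLANK, SIGN, DIGIT, OTHER = "b", "+/-", "0..9", "else"


def classify(ch):
    """Map one character to the symbol class used on the transition labels."""
    if ch == " ":
        return BLANK
    if ch in "+-":
        return SIGN
    if ch in "0123456789":
        return DIGIT
    return OTHER


# Assumed state layout (see the lead-in); any class missing from a row is an
# "else" transition to the reject state 5.
TRANSITIONS = {
    0: {BLANK: 1, SIGN: 2, DIGIT: 3},
    1: {BLANK: 1, SIGN: 2, DIGIT: 3},
    2: {DIGIT: 3},
    3: {DIGIT: 3, BLANK: 4},
    4: {BLANK: 4},
    5: {},
}
ACCEPT = {3, 4}


def is_integer(text):
    """Return True if the FSM ends in an accept state after scanning text."""
    state = 0
    for ch in text:
        state = TRANSITIONS[state].get(classify(ch), 5)
    return state in ACCEPT


if __name__ == "__main__":
    for s in ["123", "123456", "-543", "9", "123.456", "12+345", "Hello There", "9 3 5"]:
        print(repr(s), is_integer(s))
```

Because state 5 has no outgoing entries, every symbol read there falls through to the default and the machine stays in state 5, which matches the rule that else in State 5 represents any ASCII character.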
12.2 Trees

Trees are special types of graphs which contain no cycles. A tree T is defined as a set of nodes V with one node designated as the root node, and a list of edges E connecting the nodes without creating cycles. Each of the other nodes can be considered as the root node of its own subtree.

[Figure 12-7: Graphical Representation of a Tree (nodes A through M; annotations: root, child of A, parent of G, H, I and J, degree = 4, subtrees, leaf nodes, height of tree is 3)]

The edges connect parent nodes to child nodes. When two nodes are connected by an edge, the node closer to the root node is called the parent and the other node is called the child. The height of a tree is the number of edges from the root to the most distant leaf node. If a node is neither the root nor a leaf node then it is called an internal node.

There are three equivalent definitions of a tree in graph theory:

    (1) a connected acyclic graph
    (2) a graph with exactly one path between every pair of nodes
    (3) a connected graph with exactly one more node than edges

Each of these definitions has its advantages for different applications involving trees.

One type of tree is of particular importance to computer science. A binary tree is a tree in which each node has at most two children. The children are named the left child and the right child (if they exist).

[Figure 12-8: A Binary Tree (labels: root, left child, right child)]

Tree Representations and the Tree ADT

In computer science we learn about many different ways to represent trees. For now we will see how a binary tree can be embedded in a list in a manner similar to that used to represent stacks and queues. We can embed the nodes of a binary tree into a one-dimensional list by defining a relationship between the position of each parent node and the position of its children.

    1. left_child of node i is 2*i
    2. right_child of node i is 2*i+1
    3. parent of node i is i/2 (integer division)

[Figure 12-9: A List Representation of a Binary Tree]

In order to move around in the tree we can develop functions that use these positional relationships. We can also build a few functions and procedures to modify and return values in the tree.

    parent(T,a_node) - a function that returns the parent node of a_node in the tree T
    left_child(T,a_node) - a function that returns the left child node of a_node in the tree T
    right_child(T,a_node) - a function that returns the right child node of a_node in the tree T
    root(T) - a function that returns the root node of tree T
    is_avail(T,a_node) - a boolean function that tests if a_node is in the tree T
    place_val(T,a_node,x) - a procedure that places value x at a_node in tree T
    this_val(T,a_node) - a function that returns the value at a_node in tree T

In many applications, the binary tree is sparse. This means that many of the nodes do not have two children. If we use a list to hold a binary tree of height n, the list must be capable of holding on the order of 2^n nodes (up to 2^(n+1) - 1 positions when the height is counted in edges). In this situation many of the positions in the list would remain empty. It would be more efficient to represent a sparse binary tree using dynamic memory. Just as we created a record for elements of a linear linked list, we can create records to represent the nodes of a binary tree.

    type node;
    type pointer is access node;
    type node is record
       data        : data_type;
       left_child  : pointer;
       right_child : pointer;
    end record;

[Record layout: data | left-child pointer | right-child pointer]

These records can be linked together to construct a binary tree using the same programming constructs used to create linked lists in the stack and queue ADTs. The example below shows the months of the year as the data elements in a binary tree. What is the significance of this arrangement?
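The positional rules of the list representation described earlier in this section can be sketched in a few lines. The code below uses Python for illustration instead of the chapter's Ada, and a hypothetical four-node tree holding the letters d, b, f and a. The function names mirror the chapter's tree ADT, but the simplified signatures (the index functions take no tree argument) and the helper make_tree are assumptions of this sketch.

```python
def left_child(i):
    return 2 * i


def right_child(i):
    return 2 * i + 1


def parent(i):
    # Integer division; parent(1) == 0 signals that the root has no parent.
    return i // 2


def make_tree(height):
    """Allocate an empty list for a tree whose height is counted in edges.

    Index 0 is left unused so that the root sits at position 1; the deepest
    level then ends at index 2**(height + 1) - 1.
    """
    return [None] * (2 ** (height + 1))


def is_avail(tree, i):
    """True if position i lies inside the list and holds a value."""
    return 1 <= i < len(tree) and tree[i] is not None


def place_val(tree, i, x):
    tree[i] = x


def this_val(tree, i):
    return tree[i]


# Hypothetical example: root "d" with children "b" and "f"; "a" is the left child of "b".
tree = make_tree(2)
place_val(tree, 1, "d")
place_val(tree, left_child(1), "b")
place_val(tree, right_child(1), "f")
place_val(tree, left_child(left_child(1)), "a")

if __name__ == "__main__":
    print(tree)
```

Only four of the seven usable positions are filled in this small example, which hints at why a sparse tree wastes space in the list representation.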
[Figure 12-10: A Dynamic Memory Representation of a Binary Tree (twelve linked records holding the month names jan through dec; unused child pointers are null)]

In this representation, we have allocated twelve records, one for each month. Compare this representation with the list representation defined above. Since the height of this tree is 5, its deepest node lies five edges below the root, so the list would need between 2^5 = 32 and 2^6 - 1 = 63 positions to hold these 12 data values.

12.3 Graph Data Representations

For graph data structures to be helpful in computer applications we must have an effective method for representing them in a computer program. We will review a few of the more common methods for representing graphs.

Adjacency Matrix - For a graph G with n vertices we create an n x n boolean (or equivalent) matrix. We label the rows and columns of the matrix with the names of the vertices. Each element of the matrix represents a potential edge in the graph as defined by its associated vertex pair. We set an element of the matrix to TRUE if there is an edge connecting the two corresponding vertices, otherwise we set the element to FALSE. We can also use 1's and 0's to indicate the presence or absence of an edge.

[Figure 12-11: A Graph and Its Adjacency Matrix]

In the example above the 1's represent the presence of an edge and a blank indicates that there is no edge connecting the corresponding pair of vertices. The dashes indicate that there can be no edge connecting a vertex to itself. For a weighted graph the edge weight replaces the 1.

Adjacency List - For each vertex we list the vertices connected to this vertex by an edge in the graph. For an ordinary n-vertex connected graph the number of vertices connected to a particular vertex is no greater than n-1. Therefore the edge list can be as large as n(n-1) entries, or n(n-1)/2 if each edge is represented only once. If the graph has many more edges than vertices (approaching a complete graph) then the adjacency matrix is the preferred method of representation.
If the graph is sparse (i.e. the number of edges is much less than the maximum number of edges) then an edge list is probably preferred.

[Figure 12-12: Sample Adjacency List]

Dynamic Memory Representations - The graph representation methods described above use static data structures such as arrays and lists. Sometimes we need a method of representation that permits arbitrary growth and/or restructuring of the graph. We can use dynamic memory to accomplish this. Representing graphs using records and pointers should be considered only in those cases in which the adjacency matrix or adjacency list representations are not feasible.

A graphical representation of the dynamic memory data structure for our six-node sample graph is provided below. In this representation we have two types of records. There is a node record for each vertex, as shown in the node list. A node record has a pointer to the next node (this pointer does not represent an edge in the graph) and another pointer to an incidence list. Each incidence record contains two pointers. One points to the next incidence record for this node and the other points to one of the vertices that is adjacent to this vertex.

[Figure 12-13: A Dynamic Memory Graph Representation]

In our example vertex A is adjacent to 4 vertices, so its incidence list has 4 records. The edge pointers for vertex A point to records B, C, D and E. The complexity of the data structure for this simple graph indicates that this representation method should be considered only if warranted by the application. One area where such representations are helpful is computer graphics, in which it is important to be able to access adjacent facets in a data model of a three-dimensional object. Each facet corresponds to a record containing the facet normal vector. The type of shading to be applied to the surface depends on the normals of the neighboring facets, which are directly accessible through the incidence list.
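To make the two static representations from this section concrete, the following sketch converts between an adjacency matrix and an adjacency list. It is written in Python for illustration (not the chapter's Ada) and assumes a hypothetical four-vertex graph; the labels, the matrix contents and the function names matrix_to_list and list_to_matrix are all assumptions of this example.

```python
# Hypothetical graph with edges A-B, A-C, B-C and C-D.
labels = ["A", "B", "C", "D"]
matrix = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]


def matrix_to_list(labels, matrix):
    """Build an adjacency list: for each vertex, the vertices it shares an edge with."""
    return {
        labels[i]: [labels[j] for j, bit in enumerate(row) if bit]
        for i, row in enumerate(matrix)
    }


def list_to_matrix(labels, adj):
    """Rebuild the n x n matrix of 1's and 0's from an adjacency list."""
    index = {name: k for k, name in enumerate(labels)}
    result = [[0] * len(labels) for _ in labels]
    for u, neighbors in adj.items():
        for v in neighbors:
            result[index[u]][index[v]] = 1
    return result


if __name__ == "__main__":
    adj = matrix_to_list(labels, matrix)
    print(adj)
    print(list_to_matrix(labels, adj) == matrix)
```

For this graph the matrix always occupies n x n = 16 cells, while the adjacency list holds 8 entries, two per edge. The gap widens as a graph becomes sparser, which is the trade-off described above.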
[Figure 12-14: 3D Wire frame Data Model (showing facet normals)]

12.4 Traversals

An important operation on graphs and trees is the ability to move through the data representation, testing each vertex and/or edge value. This evaluate-and-move operation is referred to as a traversal. There are two popular methods of traversal used in solving many of the problems involving graphs and trees. They are called depth-first traversal and breadth-first traversal. In any traversal we need to establish a convention for ordering the vertices. In our examples we will use alpha-numeric ordering of the vertex labels. This means that if two or more vertices can be chosen next, we will select the vertex whose label would come first in an alpha-numeric sort.

Depth-First Traversal - In a depth-first traversal (DFT) the first child of a vertex is expanded before any of its siblings. A depth-first traversal is best implemented using a recursive algorithm (which implies the use of a stack data structure). The ordering rule is applied recursively, so we obtain the following order of vertices in a depth-first traversal from the root of the tree shown below: A B E M N F G C H D I O P Q J K L R.

[Figure 12-15: An Example Generic Tree (nodes A through R)]

Starting with the root node A we have a choice of B, C and D to expand. We choose B (by our alpha-numeric ordering convention) and place B at the top of our stack. Next we consider the children of B and choose E. The children of E are M and N, so we choose M. At this point we have reached a leaf node (i.e. a node with no children), so we pop the stack, returning to node E, and expand the next child of E, which is node N. Node N is also a leaf node, so we pop the stack back to E. However, E has no more unexpanded children, so we pop the stack again back to node B. The unexpanded children of B are F and G, which are added to our list in order. Popping back to A gives us access to the second child of A, which is C. Node C has a single child H, which is added to the list.
We pop back to A to pick up its remaining child node D, and so on.

A DFT can be implemented for a graph as well as a tree. In a graph traversal we need to keep a record of the nodes that have been traversed so that they are not entered more than once. In the example below we will perform a depth-first traversal of the graph shown below, starting at node 1.

[Figure 12-16: Sample Graph Showing the Beginning of a Depth-First Traversal (DFT)]

The nodes adjacent to node 1 are 2, 3 and 4, so we choose 2 (by our convention). From node 2 we choose 4, since node 1 has already been expanded. From node 4 we choose node 5, since it is the first unexpanded node in an alpha-numeric ordering of 1, 2 and 5 (actually it's the only unexpanded node reachable from node 4). From node 5 we choose 3 and then 6.

[Figure 12-17: Completed DFT]

Breadth-First Traversal - In a breadth-first traversal (BFT) all the children of the current node (i.e. the node currently being evaluated) are placed onto the queue before any of their children are considered. A breadth-first traversal is obtained by following the stepwise procedure below:

    Step 1: Place the starting node onto the queue.
    Step 2: If the queue is not empty, retrieve the node that is at the front of the queue and add its label to the order-of-traversal list, otherwise STOP.
    Step 3: Place all the un-encountered children of this node onto the queue (in alpha-numeric order) and tag them as having been encountered.
    Step 4: Return to Step 2.

The order-of-traversal list is a list of all the reachable nodes in the graph or tree, arranged in the order in which they were encountered in the breadth-first traversal. Note that, just as in the depth-first traversal, we have to keep track of which nodes have been encountered so that we don't evaluate any node more than once.

Let's try this procedure on the sample graph shown below. We will start with node E and use our alpha-numeric ordering convention for arranging the nodes encountered at each step.
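Before working through the example by hand, both traversals can be sketched in code. The Python below is an illustration only (the chapter's own implementations are written in Ada in Section 12.5). The dictionary GRAPH is an adjacency list transcribed from the adjacency matrix that the chapter gives for its nine-node sample graph; the function names depth_first and breadth_first are assumptions of this sketch.

```python
from collections import deque

# Nine-node sample graph, transcribed from the chapter's adjacency matrix.
GRAPH = {
    "A": ["B", "D", "E"],
    "B": ["A", "F", "G"],
    "C": ["E", "I"],
    "D": ["A", "H"],
    "E": ["A", "C", "H"],
    "F": ["B", "H"],
    "G": ["B", "H", "I"],
    "H": ["D", "E", "F", "G", "I"],
    "I": ["C", "G", "H"],
}


def depth_first(graph, start, visited=None):
    """Recursive DFT; the visited list doubles as the order of traversal."""
    if visited is None:
        visited = []
    visited.append(start)
    for neighbor in sorted(graph[start]):      # alpha-numeric convention
        if neighbor not in visited:
            depth_first(graph, neighbor, visited)
    return visited


def breadth_first(graph, start):
    """BFT following Steps 1 to 4: queue, tag on enqueue, record on dequeue."""
    queue = deque([start])                     # Step 1
    tagged = {start}
    order = []
    while queue:                               # Step 2 (STOP when empty)
        node = queue.popleft()
        order.append(node)
        for neighbor in sorted(graph[node]):   # Step 3
            if neighbor not in tagged:
                tagged.add(neighbor)
                queue.append(neighbor)
    return order                               # Step 4 is the loop itself


if __name__ == "__main__":
    print(" ".join(breadth_first(GRAPH, "E")))
    print(" ".join(depth_first(GRAPH, "E")))
```

Starting from node E, breadth_first returns E A C H B D I F G, which is the order-of-traversal obtained by hand for this graph. Tagging a node when it is placed on the queue, rather than when it is removed, is what prevents a node from being enqueued twice.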
We choose node E as our arbitrary start node (the actual starting node would be determined by the requirements of the application or problem being solved). We place E onto our queue and tag it as a used node.

    Queue: E <- front of queue
    Tagged: E
    Order-Of-Traversal: [empty]

[Figure 12-18: Sample BFT Graph (nodes A through I)]

The node at the front of the queue is E, so we remove it and add its label to the Order-Of-Traversal (OOT) list. The children of E are A, C and H. Since none of these nodes have been encountered, they are all placed onto the queue.

    Queue: H C A <- front of queue
    Tagged: A C E H
    Order-Of-Traversal: E

Node A is now at the front of the queue, so we remove it, add its label to the OOT list and place the unused children of A onto the queue.

    Queue: D B H C <- front of queue
    Tagged: A B C D E H
    Order-Of-Traversal: E A

Node C is the next node at the front of the queue. We remove C, add its label to the OOT list and place its child, node I, onto the queue.

    Queue: I D B H <- front of queue
    Tagged: A B C D E H I
    Order-Of-Traversal: E A C

Node H is the next node at the front of the queue. The children of H are D, E, F, G and I, but only F and G are new nodes, so we place these two nodes onto the queue.

    Queue: G F I D B <- front of queue
    Tagged: A B C D E F G H I
    Order-Of-Traversal: E A C H

Since all the nodes have been encountered, we can flush the queue as we add the nodes to the OOT list. The final list is,

    Order-Of-Traversal: E A C H B D I F G

12.5 Applications

Implementing a Depth-First Traversal - There are many ways to implement a depth-first traversal of a graph. One of the most efficient is a recursive algorithm. We will represent the graph in an adjacency matrix.
    type graphtype is array(1..100,1..100) of integer;
    type nodelistype is array(1..100) of integer;

We have also created a 1-dimensional array data type for the node list and the traversal list. The node list will be used to keep track of which nodes have been placed in the traversal list. Now we need to declare variables.

    graph      : graphtype;
    nodes      : nodelistype;
    start_node : integer;
    trav_list  : nodelistype;
    trav_num   : integer;
    num        : integer;

The array graph is the adjacency matrix, nodes is the node list, start_node is the label of the node at which we will begin our traversal, trav_list will hold the list of node labels indicating the order in which the nodes are evaluated, trav_num will be used as the index of the next available position in trav_list, and num is the number of nodes in the graph.

    procedure DFT(node : in integer) is
    begin
       nodes(node):=1;                   -- tag this node as used
       trav_list(trav_num):=node;        -- record it in the traversal list
       trav_num:=trav_num+1;
       for col in 1..num loop
          -- expand each adjacent node that has not yet been used
          if graph(node,col)=1 and nodes(col)=0 then
             DFT(col);
          end if;
       end loop;
    end DFT;

In our example, we call DFT(1), which places node 1 in the trav_list and checks for other unused nodes adjacent to node 1. Nodes 2, 3 and 4 are all adjacent to node 1, but as soon as DFT(2) is called the copy of DFT(1) is pushed onto the process control stack and DFT(2) begins its operation. This means that DFT(4) is called by DFT(2) before DFT(3) is called by DFT(1). You should work through the execution of this program by hand in order to better understand this process.

Implementing a Breadth-First Traversal - In the following section we develop an implementation of the breadth-first traversal algorithm as an Ada procedure. As usual, our first task is to choose a data structure. Since we want to use a queue to maintain our order of nodes, we will naturally make use of our generic ADT adt_queue. We will instantiate an integer queue as part of our BFT program declaration.
    package my_queue is new adt_queue(integer);
    use my_queue;

We will use an adjacency matrix to hold the graph description. The data table below will be saved as a text file named bft_graph.dat.

    9
    ABCDEFGHI
    0 1 0 1 1 0 0 0 0
    1 0 0 0 0 1 1 0 0
    0 0 0 0 1 0 0 0 1
    1 0 0 0 0 0 0 1 0
    1 0 1 0 0 0 0 1 0
    0 1 0 0 0 0 0 1 0
    0 1 0 0 0 0 0 1 1
    0 0 0 1 1 1 1 0 1
    0 0 1 0 0 0 1 1 0

The first number is the number of nodes in the graph, which we will call nnodes. The next line is a list of 9 node labels. We are using the same example graph that we used in our previous example, as shown in Figure 12-18. The adjacency matrix is the 9x9 array of 1's and 0's. Each row and column corresponds to a node, while the 1's represent edges connecting pairs of nodes. For example, the first row (which is associated with node A) has 1's in columns 2, 4 and 5, corresponding to edges connecting node A to nodes B, D and E. We need a data structure to hold these data in our BFT program.

    maxnodes : constant integer := 9;
    type adjmatype is array(1..maxnodes,1..maxnodes) of integer;
    adjmat : adjmatype;
    nnodes : integer;

Although we have only 9 nodes in our sample graph, we can set maxnodes to handle larger graphs. We must be careful to use maxnodes everywhere we refer to the maximum number of nodes so that we will have to change only one value. We also need data structures to hold the list of labels (node symbols A through I), the order of traversal, and a record of the tagged (used) nodes.

    type symlistype is array(1..maxnodes) of character;
    type numlistype is array(1..maxnodes) of integer;
    label : symlistype;
    oot   : symlistype;
    noot  : integer := 0;
    used  : numlistype;

The oot array will hold the list of node labels in the order they are encountered in the BFT. This will be the output of our program. The integer noot will keep track of how many node labels have been placed on the oot list as the program runs. The used list will be initially set to all 0's.
As nodes are added to the back of the queue, the corresponding element of the used list will be set to 1. The BFT program will refer to these values to see if a node has been used before it is added to the queue.

Before performing the traversal we will need to load the data from the text file into the program data structures and initialize the used list. Once the data has been loaded we will need to select a starting node for the traversal. In our demonstration program we will ask the user to choose a starting node (by number). We will then enqueue this starting index into Q and tag this node as used.

    put("Enter index of starting node... ");
    get(start);
    enqueue(Q,start);
    used(start):=1;

Once the starting node has been placed into the queue we can exercise our BFT algorithm by dequeuing each node (index) from the queue, adding this node to the order-of-traversal (oot) list and placing the unused children of this node at the back of the queue, in order.

    while not(is_empty(Q)) loop
       dequeue(Q,curval);
       noot:=noot+1;
       oot(noot):=label(curval);
       for j in 1..nnodes loop
          if adjmat(curval,j)=1 and used(j)=0 then
             enqueue(Q,j);
             used(j):=1;
          end if;
       end loop;
    end loop;

At the top of the while..loop we take the next node index from the front of the queue Q and place its label into the oot list. Note that we use the index curval to choose the correct node label from the label list. At the bottom of the while..loop we scan all the nodes in the graph, testing whether each node is adjacent to the current node and has not yet been used. If both of these conditions are true then we enqueue this node and tag it as used. This process repeats until the queue is empty. Note that once all the nodes have been placed in the queue, the for..loop at the bottom of the while..loop could be skipped and the remaining nodes in the queue could be dequeued and placed into the oot list.

The data below shows the output for our BFT demonstration program.

    Enter graph file name... bft_graph.dat

         A  B  C  D  E  F  G  H  I
      A  0  1  0  1  1  0  0  0  0
      B  1  0  0  0  0  1  1  0  0
      C  0  0  0  0  1  0  0  0  1
      D  1  0  0  0  0  0  0  1  0
      E  1  0  1  0  0  0  0  1  0
      F  0  1  0  0  0  0  0  1  0
      G  0  1  0  0  0  0  0  1  1
      H  0  0  0  1  1  1  1  0  1
      I  0  0  1  0  0  0  1  1  0

      A  B  C  D  E  F  G  H  I
      1  2  3  4  5  6  7  8  9

    Enter index of starting node... 5
    E A C H B D I F G

A complete listing of the BFT_demo.adb program is given below.

    with ada.text_io, ada.integer_text_io, adt_queue;
    use ada.text_io, ada.integer_text_io;

    procedure BFT_demo is

       maxnodes : constant integer := 30;
       type adjmatype is array(1..maxnodes,1..maxnodes) of integer;
       type symlistype is array(1..maxnodes) of character;
       type numlistype is array(1..maxnodes) of integer;

       package my_queue is new adt_queue(integer);
       use my_queue;

       fname  : string(1..20);
       fleng  : integer;
       datin  : file_type;
       Q      : qtype;
       start  : integer;
       curval : integer;
       adjmat : adjmatype;
       nnodes : integer;
       oot    : symlistype;
       noot   : integer := 0;
       label  : symlistype;
       used   : numlistype;

       -- read the node count, labels and adjacency matrix, then echo them
       procedure loadgraph is
       begin
          put("Enter graph file name... ");
          get_line(fname,fleng);
          open(datin,in_file,fname(1..fleng));
          get(datin,nnodes);
          for i in 1..nnodes loop
             get(datin,label(i));
             used(i):=0;
          end loop;
          for i in 1..nnodes loop
             for j in 1..nnodes loop
                get(datin,adjmat(i,j));
             end loop;
          end loop;
          close(datin);
          new_line(2);
          put(" ");
          for i in 1..nnodes loop
             put(label(i));
             put(" ");
          end loop;
          new_line;
          for i in 1..nnodes loop
             put(" ");
             put(label(i));
             for j in 1..nnodes loop
                put(adjmat(i,j),3);
             end loop;
             new_line;
          end loop;
          new_line;
       end loadgraph;

    begin
       loadgraph;
       for i in 1..nnodes loop
          put(" ");
          put(label(i));
       end loop;
       new_line;
       for i in 1..nnodes loop
          put(i,3);
       end loop;
       new_line(2);
       put("Enter index of starting node... ");
       get(start);
       enqueue(Q,start);
       used(start):=1;
       while not(is_empty(Q)) loop
          dequeue(Q,curval);
          noot:=noot+1;
          oot(noot):=label(curval);
          for j in 1..nnodes loop
             if adjmat(curval,j)=1 and used(j)=0 then
                enqueue(Q,j);
                used(j):=1;
             end if;
          end loop;
       end loop;
       -- print only the noot entries that were filled (the reachable nodes)
       for i in 1..noot loop
          put(oot(i));
          put(" ");
       end loop;
    end BFT_demo;

Graphs and trees are important structures in computer science because they give us efficient methods for modeling complex systems. In the next chapter, we will use these structures and the associated traversal methods to implement efficient algorithms for managing large data sets.

Minimum Spanning Tree - The graph shown below is called a weighted graph because its edges have values or weights assigned to them. The Minimum Spanning Tree (MST) problem is to find a tree that is made up of all the nodes in the graph and a subset of the edges such that the sum of the edge weights is a minimum.

To implement an algorithm for the minimum spanning tree problem we need a data representation for weighted graphs. To solve the problem manually we can look at a picture of the graph, but for computer analysis an edge list or adjacency matrix representation is preferred. As shown in the figure above, the weight of each edge is included as a parameter in the edge list, and the weight replaces the boolean value in an adjacency matrix. The MST problem can be solved using either of these data structures.

Prim's Algorithm is an efficient method for finding the minimum spanning tree in a weighted graph. This algorithm is stated formally as follows:

Given a weighted graph G consisting of a set of vertices V and a set of edges E with weights $w_{i,j}$, where,

    $G = \{V, E\}$
    $V = \{\text{set of all } v_i,\ i = 1..n\}$
    $E = \{\text{set of all } e_{i,j} = (v_i, v_j, w_{i,j})\}$

Prepare a vertex list VP and an edge list EP (that are initially empty) to hold the vertices and edges selected by Prim's Algorithm.

    1. Choose any starting vertex vi and place it in the vertex list VP.
    2. Find the smallest-weight edge ei,j incident with a vertex in the vertex list whose inclusion in the edge list will not create a cycle. This can be done by verifying that the other vertex vj is not already in the vertex list.
    3. Include this edge in the edge list EP and the associated vertex vj in the vertex list VP.
    4. Repeat Steps 2 and 3 until all vertices of the graph are in the vertex list VP.

The solution to the MST problem is the edge list EP, and the sum of the weights of the edges in EP is the weight of the minimum spanning tree (sometimes we say minimal, since there may be more than one spanning tree with the minimum weight). The type of graph traversal being performed in Prim's Algorithm is neither purely depth-first nor breadth-first, since the only edges that are examined are those incident with vertices in the selected vertex list. A simple way to implement Steps 2 and 3 is in a loop, as shown in the pseudo-code below:

    wmin = some_large_value
    for every edge ek=(vi,vj,wij) in E loop
       if [(vi in VP and vj not in VP) or (vi not in VP and vj in VP)] and wij<wmin then
          wmin = wij
          emin = ek
       end if
    end loop
    Include emin in EP and both endpoints of emin in VP.

Exercises

1. How many edges are in,
   a. an N-node complete graph?
   b. an N-node cycle?
   c. an N-node wheel?
   d. an N-level hypercube?

2. Create finite-state machines to recognize the following sets. Assume an alphabet {0,1}.
   a. all binary strings with an even number of 1's.
   b. all binary strings with an odd number of 1's and an even number of 0's.
   c. all binary strings containing the substring 1101.
   d. all binary strings that do not contain the substring 1001.

3. Modify the integer recognizer FSM below to reject strings with leading or trailing blanks.

   [Figure: the integer recognizer FSM (states 0 to 5; b = blank, 0..9 = a digit, +/- = plus or minus sign, else = any other ASCII character)]

4. Give a fixed-size (linear) list showing the placement of the nodes in a list representation of the binary tree shown in Figure 12-10.
   You should not show any edges in your sketch.

5. Write an Ada program that determines the maximum degree of any node in a graph. Your program should return the label of the node and its degree. Choose an appropriate graph data representation (adjacency matrix or adjacency list) and explain your choice.

6. Write an Ada program that finds cycles in a directed graph.

7. Write an Ada program that determines the height of a binary tree given in a list representation.

8. Which traversal method (DFT or BFT) would be best for each of the following operations on a graph?
   a. Finding the shortest path between a pair of nodes.
   b. Finding the shortest cycle from a node back to the same node.
   c. Finding a cycle involving all the nodes in the graph (Hamiltonian cycle).
   d. Finding a subset of edges that make up a spanning tree (connects all nodes).
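Returning to Prim's Algorithm as stated in Section 12.5, the four steps and the edge-scanning loop can be sketched as follows. This is a minimal illustration in Python rather than Ada, assuming a hypothetical five-vertex weighted graph stored as an edge list of (vi, vj, wij) triples. The function name prim, the sample weights and the connectivity check are assumptions of this sketch, not part of the chapter.

```python
def prim(vertices, edges, start):
    """Return (EP, total weight) for a connected weighted graph.

    edges is a list of (vi, vj, wij) triples, each edge listed once.
    """
    vp = {start}                      # Step 1: vertex list VP
    ep = []                           # edge list EP
    while len(vp) < len(vertices):    # Step 4: repeat until VP holds every vertex
        emin = None
        for (vi, vj, wij) in edges:   # Step 2: scan every edge
            # Exactly one endpoint in VP, so adding the edge cannot create a cycle.
            crosses = (vi in vp) != (vj in vp)
            if crosses and (emin is None or wij < emin[2]):
                emin = (vi, vj, wij)
        if emin is None:
            raise ValueError("graph is not connected")
        ep.append(emin)               # Step 3: include the edge and its new vertex
        vp.update(emin[:2])
    return ep, sum(w for _, _, w in ep)


# Hypothetical weighted graph used only for this example.
VERTICES = ["A", "B", "C", "D", "E"]
EDGES = [
    ("A", "B", 4), ("A", "C", 1), ("B", "C", 2),
    ("B", "D", 5), ("C", "D", 8), ("D", "E", 3),
]

if __name__ == "__main__":
    tree, weight = prim(VERTICES, EDGES, "A")
    print(tree, weight)
```

Using None for the initial emin plays the role of some_large_value in the pseudo-code. Rescanning the whole edge list on every pass keeps the sketch close to the pseudo-code at the cost of efficiency.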