* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Primal And Dual Neural Networks For Shortest
Cracking of wireless networks wikipedia , lookup
Multiprotocol Label Switching wikipedia , lookup
Piggybacking (Internet access) wikipedia , lookup
Computer network wikipedia , lookup
Recursive InterNetwork Architecture (RINA) wikipedia , lookup
Network tap wikipedia , lookup
List of wireless community networks by region wikipedia , lookup
Airborne Networking wikipedia , lookup
Dijkstra's algorithm wikipedia , lookup
IEEE 802.1aq wikipedia , lookup
864 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998 [26] T. Whalen, “Decisionmaking under uncertainty with various assumptions about available information,” IEEE Trans. Syst., Man, Cybern., vol. SMC-14, pp. 888–900, 1984. [27] R. J. Wonnacott and T. H. Wonnacott, Introductory Statistics. New York: Wiley, 1985. [28] R. R. Yager, “A procedure for ordering fuzzy subsets of the unit interval,” Inf. Sci., vol. 24, pp. 143–161, 1981. [29] Y. Yuan, “Criteria for evaluating fuzzy ranking methods,” Fuzzy Sets Syst., vol. 44, pp. 139–157, 1991. [30] L. A. Zadeh, “Fuzzy sets,” Inf. Contr., vol. 8, pp. 338–353, 1965. [31] , “A fuzzy-algorithmic approach to the definition of complex or imprecise concepts,” Int. J. Man–Mach. Stud., vol. 8, pp. 249–291, 1976. Primal and Dual Neural Networks for Shortest-Path Routing Jun Wang Abstract— This paper presents two recurrent neural networks for solving the shortest path problem. Simplifying the architecture of a recurrent neural network based on the primal problem formulation, the first recurrent neural network called the primal routing network has less complex connectivity than its predecessor. Based on the dual problem formulation, the second recurrent neural network called the dual routing network has even much simpler architecture. While being simple in architecture, the primal and dual routing networks are capable of shortest-path routing like their predecessor. Index Terms—Neural networks, optimization, shortest path problem. I. INTRODUCTION The shortest path problem is concerned with finding the shortest path from a specified origin to a specified destination in a given network while minimizing the total cost associated with the path. The shortest path problem is an archetypal combinatorial optimization problem having widespread applications in a variety of settings. The applications of the shortest path problem include vehicle routing in transportation systems [1], traffic routing in communication networks [2], [3], and path planning in robotic systems [4]–[6]. Furthermore, the shortest path problem also has numerous variations such as the minimum weight problem, the quickest path problem, the most reliable path problem, and so on. The shortest path problem has been investigated extensively. The well-known algorithms for solving the shortest path problem include the O(n2 ) Bellman’s dynamic programming algorithm for directed acycle networks, the O(n2 ) Dijkstra-like labeling algorithm and the O(n3 ) Bellman–Ford successive approximation algorithm for networks with nonnegative cost coefficients only, where n denotes the number of vertices in the network. See [7] for a comprehensive coverage of these algorithms. Besides the classical methods, many new and modified methods have been developed during the past few years. For large-scale and real-time applications such as traffic Manuscript received June 27, 1996; revised December 7, 1997. This work was supported by the Hong Kong Research Grants Council under Grant CUHK381/96E. The author is with the Department of Mechanical and Automation Engineering, the Chinese University of Hong Kong, Shatin, N.T., Hong Kong, R.O.C. Publisher Item Identifier S 1083-4427(98)06216-X. routing and path planning, the existing series algorithms may not be effective and efficient due to the limitation of sequential processing in computational time. Therefore, parallel solution methods are more desirable. Since Hopfield and Tank’s seminal work [8], [9], neural networks for solving optimization problems have been a major area in neural network research. While the mainstream of neural network approach to optimization focuses on solving the NP-complete problems such as the traveling salesman problem [8], there have been a few direct attacks of the shortest path problem using neural network [10]–[12]. These investigations have shown the sufficient potentials for the neural network approach to the shortest path problem. In the present paper, two recurrent neural networks for shortestpath routing, called the primal and dual routing networks, are presented. These recurrent neural networks are capable of routing the shortest path in networks with mixed positive and negative cost coefficients. These recurrent neural networks are also much simpler in architecture, hence much easier for hardware implementation. II. PATH REPRESENTATIONS A. Problem Statement Given a weighted direct graph G = (V ; E ) where V is a set of vertices and E is an ordered set of m edges. A fixed cost cij is associated with the edge from vertices i to j in the graph G. In a transportation or a robotic system, for example, the physical meaning of the cost can be the distance between the vertices, the time or energy needed for travel from one vertex to another. In a telecommunication system, the cost can be determined according to the transmission time and the link capacity from one vertex to another. In general, the cost coefficients matrix [cij ] is not necessarily symmetric, i.e., the cost from vertices i to j may not be equal to the cost from vertices j to i. Furthermore, the edges between some vertices may not exist, i.e., m may be less than n2 (i.e., m < n2 ). The values of cost coefficients for the n2 0 m nonexistent edges are defined as infinity. More generally, a cost coefficient can be either positive or negative. A positive cost coefficient represents a loss, whereas a negative one represents a gain. It is admittedly more difficult to determine the shortest path for a network with mixed positive and negative cost coefficients [7]. For an obvious reason, we assume that there are neither negative cycles nor negative loops in the networks (i.e., 8 i; j ; cii 0, c 0). Hence the total cost of the shortest path is i j ij bounded from below. Since the vertices in a network can be labeled arbitrarily, without loss of generality, we assume hereafter vertices 1 and n are origin and destination, respectively. n B. Vertex Path Representation A path in a given network can be represented in different ways and the way of path representation in turn affects the effectiveness and efficiency of a solution procedure. In the literature, there are two typical path representations: vertex representation and edge representation. Rauch and Winarske [10] and Lee and Chang [11] used an n 2 p binary matrix to represent a path, where p is the number of vertices in the shortest path and which is assumed to be known. Specifically, the binary matrix of vertex path representation [xij ] contains only 1083–4427/98$10.00 1998 IEEE Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply. IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998 n “0” and “1” elements, and xij can be defined as 1; 0; xij = if the j th vertex is the ith stop in the path; otherwise. xkn = 1 2 f0; 1g; xij xij = minimize n =1 j =1; j 6=i cij xij (9) i n subject to k =1; k6=i xik xij 6 0 n =1; l6=i xli l 1; 0; 1; = if i = 1, if i = 2; 3; if i = n, 0 2 f0; 1g; i = j ; i; j = 1; 2; 1 1 1 ; n 0 1, 111; n (10) (11) where xij denotes the decision variable associated with the edge from vertices i to j , as defined in (2). The objective function to be minimized, (9), is also the total cost for the path. The equality constraint coefficients and the right-hand sides are 01, 0, or 1. Equation (10) ensures that a continuous path starts from a specified origin and ends at a specified destination. Because of the total unimodulity property of the constraint coefficient matrix defined in (10) [14], the integrality constraint in the shortest path problem formulation can be equivalently replaced with the non-negativity constraint, if the shortest path is unique. In other words, the optimal solutions of the equivalent linear programming problem are composed of zero and one integers if a unique optimum exists [14]. The equivalent linear programming problem based on the simplified edge path representation can be described as follows: 01 n III. PROBLEM FORMULATION (8) Based on the edge path representation, the primal shortest path problem can be formulated as a linear integer program as follows [14]: (2) Similar to the vertex path representation, each row and each column in the edge representation can contain no more than one “1” element, if each vertex can be visited at most once. Since the cost coefficients of loops are assumed to be nonnegative (i.e., cii 0 for i = 1; 2; 1 1 1 ; n), the elements in the main diagonal line of the edge path representation are always zero. Hence, the edge path representation can be simplified by excluding the diagonal elements xii (i = 1; 2; 1 1 1 ; n). In addition, the first column and last row of the decision variable matrix are always zero vectors, they can be removed to reduce the matrix size. Consequently, the simplified edge representation has only (n 0 1)(n 0 2) binary elements. Because the edge path representation can represent the shortest path with more than n vertices in a general network with mixed positive and negative cost coefficients, it is more desirable. The desirability of the edge representation will be more apparent in problem formulation that follows. 111; n B. Primal Formulation Based on Edge Path Representation n if the edge from vertices i to j is in the path; otherwise. i; j = 1; 2; (7) where xij denotes the decision variable associated with the ith stop and the j th vertex as defined in (1). The objective function to be minimized, (3), is the total cost associated with the path. Equation (4) ensures that each vertex is visited at most once. Equation (5) ensures that each stop contains at most one vertex. Equations (6) and (7) define the initial condition for starting and ending vertices. Equation (8) is simply the integrality constraint. C. Edge Path Representation In contrast to the vertex path representation, the edge path representation uses an n 2 n binary matrix to represent the edges in a path [12], [13]. Specifically, the binary matrix of edge path representation [xij ] also contains only “0” and “1” elements, and xij can be defined as (6) =1 x11 = 1 k (1) Obviously, each row and each column in the binary matrix of vertex path representation contains no more than one “1” element, respectively, if each vertex can be visited no more than once. Since the binary matrix provides the path information in terms of vertices, this path representation is called a vertex representation. A limitation of the vertex path representation is that it requires determining the number of vertices in the shortest path p a priori. In a dynamic environment, it is almost impossible to determine p a priori due to the time-varying nature of cost coefficients. If p is assigned with a number smaller than the number of vertices in the shortest path, then the resultant vertex representation cannot represent the shortest path and any solution procedure based on the representation cannot determine the shortest path. 1; 0; 865 minimize n =1 j =2; j 6=i n01 cij xij (12) i A. Formulation Based on Vertex Path Representation The shortest path problem is to find the shortest (least costly) possible directed path from a specified origin vertex to a specified destination vertex. The cost of the path is the sum of the costs on the edges in the path. Based on the vertex path representation, the shortest path problem can be mathematically formulated as a quadratic integer program as follows: 01 n minimize k n =1 i=1 j =1 n subject to n =1 cij xki xk+1; j (3) xki 1; k = 2; 3; 111; n 0 1 (4) xki 1; i = 1; 2; 111; n 01 (5) i n k =1 xik subject to k 0 n xli =1; k6=i l=2; l6=i = i1 0 in ; i = 1; 2; 1 1 1 ; n xij 0; i 6= j ; i = 1; 2; 1 1 1 ; n 0 1 111; n j = 2; 3; (13) (14) where pq is the Kronecker delta function defined as pq = 1 if p = q and pq = 0 if p = q . 6 C. Dual Formulation Based on Edge Path Representation Since the number of equality constraints n is much less than the number of decision variables (n 0 1)(n 0 2), it is more desirable to formulate and solve the dual of the primal shortest path problem. Based on the edge path representation, the dual shortest path problem Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply. 866 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998 can be formulated as a linear programming problem as follows [14]: yn 0 y1 yj 0 yi cij ; maximize subject to (15) i 6= j i; j = 1; 2; 1 1 1 ; n (16) where yi denotes the dual decision variable associated with vertex i. Note that the value of the objective function at its maximum is the total cost of the shortest path [14]. Since the objective function and constraints in the dual problem involves variable differences only, an equivalent dual shortest path problem with n 0 1 variables can be formulated by defining zi = yi 0 y1 for i = 1; 2; 1 1 1 ; n zn zj 0 zi cij ; maximize subject to (17) i 6= j ; i; j = 1; 2; 1 1 1 ; n IV. PRIMAL ROUTING NETWORK A. Energy Function An energy function for the primal problem based on the simplified edge path representation can be defined as follows: 2 i =1 + k 01 n 6= [xik (t) 2 i i n =1 j =2; j 6=i i 0 x (t)] 0 1 + in ki cij 0t= )x (t) exp( Let duij (t)=dt = 0@Ep [t; x(t)]=@xij , based on the simplified edge path representation, the state dynamical equation and the output function of the recurrent neural network presented in [13] is as follows: for i 6= ji = 1; 2; 1 1 1 ; n 0 1; j = 2; 3; 1 1 1 ; n; duij (t) dt = n 0w ij (19) =1; l6=i n01 xli (t) l n +w 01 n xik (t) + w =2; k6=i k xjp (t) 0 w =2; p6=j q =1; q 6=j + w (i1 0 in 0 j 1 + jn ) xqj (t) p 0 c ij 0t= ) exp( xij (t) = f [uij (t)] uij (t) denotes the net (20) (21) where input to neuron (i; j ), f (u) is a nonnegative and nondecreasing activation function defined as f (u) > 0 if u > 0 or f (u) = 0 otherwise, and df (u)=du 0. To reduce the complexity of the resulting neural network architecture, the dynamical equation and output function of the present primal routing network can be described by using n instrumental variables (vi (t); i = 1; 2; 1 1 1 ; n) duij (t) dt 0wv (t) + wv (t) 0 c exp(0t= ) i= 6 j ; i = 1; 2; 1 1 1 ; n 0 1 j = 2; 3; 1 1 1 ; n 01 v (t) = x (t) 0 x (t) = i j ij n i (22) n ik =2; k6=i 0 i1 + in ; k k =1; k6=i ki i = 1; 2; 1 1 1 ; n xij (t) = f [uij (t)] i 6= j ; i = 1; 2; 1 1 1 ; n 0 1 j = 2; 3; 1 1 1 ; n: C. Network Architecture Because the shortest path problem based on the edge path representation is formulated as a linear program, it can be solved by the neural networks proposed for linear programming. In [15], [16], a recurrent neural network called the deterministic annealing network is presented and demonstrated to be capable of solving linear programming problems. The primal and dual routing networks are tailored from the deterministic annealing network [15], [16]. Let the decision variables of the primal and dual shortest path problem be represented, respectively, by the activation states of the primal and dual routing networks. For simplicity of notations, the same symbols [xij ] and [zi ] are used to denote both the decision variables and corresponding activation states. n B. Dynamical Equation (18) where z1 0. The value of the objective function at its maximum is still the total cost of the shortest path. Moreover, zn0q+i (1 i q ) is the cost of the shortest path with i edges to the destination where q is the number of edges in the shortest path. In addition, if zi is used as the objective function to be maximized, then at optimum it is the shortest path from vertex 1 to vertex i [14]. Although the last component of the optimal dual solution gives the total cost of the shortest path, the optimal dual solution needs postprocessing to decode the optimal primal solution in terms of edges. According to the Complementary Slackness Theorem [14]: given the feasible solutions of xij and zi to the primal and dual problems, respectively, the solutions are optimal if and only if 1) xij = 1 implies zj 0 zi = cij and 2) xij = 0 is implied by zj 0 zi < cij for i; j = 1; 2; 1 1 1 ; n. The Complementary Slackness Theorem can be used as the basis for the post-processing as will be discussed in Section V. The shortest path problem formulation based on the edge path representation, (12)–(14) or (17) and (18), is a linear programming problem, whereas the problem based on the vertex path representation, (3)–(8), is an integer nonconvex quadratic programming problem. The advantages of the edge representation over the vertex representation become more obvious. The subsequent development of this paper is thus based on the simplified edge path representation. Ep [t; x(t)] = w where w; , and are positive scaling constants and exp(0t= ) is a decaying temperature parameter. The role of the temperature parameter in (19) is explained at length in [26] and [27]. (23) (24) The primal routing network consists of n2 0 2n + 2 neurons arranged spatially in two layers: an output layer and a hidden layer. The output layer consists of an (n 0 1) 2 (n 0 1) two-dimensional array of output neurons without the diagonal elements representing [xij ]. The hidden layer consists of an n-vector of hidden neurons representing instrumental variables [vi ]. The first two terms in the right-hand side of (22) define the connectivity from the n hidden neurons to the (n 0 1)(n 0 2) output neurons. The third term in the right-hand side of (22) defines a decaying external input exp(0t= )cij to the output layer. Similarly, the first two terms in the right-hand side of (23) define the connectivity from the (n 0 1)(n 0 2) output neurons to the n hidden neurons. The third term in the right-hand side of (23) defines a constant positive and negative unity input (bias) to the hidden neurons corresponding, respectively, to the specified origin and destination. Moreover, (22) and (23) also show the connectivity of the primal routing network: 1) there exist (n 0 1)(n 0 2) excitatory connections with weight of w to xij (t) in the output layer from vi (t) in the hidden layer; 2) there exist (n 0 1)(n 0 2) inhibitory connections with weight of 0w to xij (t) in the output layer from vi (t) in the hidden layers; Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply. IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998 867 Fig. 2. Architecture of the dual routing network. Fig. 1. Architecture of the primal routing network. 3) there exist n(n 0 1) excitatory connections with unity weight to vi (t) in the hidden layer from xi1 (t); xi2 (t); 1 1 1 ; xin (t) in the output layer (i = 1; 2; 1 1 1 ; n); 4) there exist n(n 0 1) inhibitory connections with negative unity weight to vi (t) in the hidden layer from x1i (t); x2i (t); 1 1 1 ; xni (t) in the output layer (i = 1; 2; 1 1 1 ; n); 5) there is no lateral connection among neurons in either the output layer nor in the hidden layer. The spatial complexity of the primal routing network is characterized by O(n2 ) neurons and O(n2 ) connections. Specifically, there are 2 4(n 0 1) unidirectional connections in the primal routing network compared with the n(n 0 1)(2n 0 3) connections in the single-layer recurrent neural network [13], a reduction of (n 0 1)(2n2 0 7n + 4) connections for n 3. Fig. 1 illustrates the architecture of primal routing network. From (22), it is easy to see that the convergence rate of primal routing network depends also on the parameter . The parameter serves as a time constant for the decaying bias to reinforce the effect of cost minimization. The time constant has to be sufficiently large to sustain cost minimization and ensure solution optimality. Since the transient time of primal routing network depends on the reciprocals of the sensitivity parameter and the minimum absolute value of the nonzero eigenvalues of the connection weight matrix nw, a lower bound on the value of is 4=(nwxmax ), i.e., > 4=(nwxmax ) if the unipolar sigmoid activation function is used, as discussed in [15]. Based on a similar analysis, a design rule is that > 1=(nw) if the Heaviside activation function is used. The role of is to balance the effects of constraint satisfaction and cost minimization. Let cmax = maxfcij ; i; j; = 1; 2; 1 1 1 ; ng < 1. A design rule is to select such that w=cmax . V. DUAL ROUTING NETWORK A. Energy Functions Similarly to the primal shortest path problem, an energy function for the dual problem can be formulated as Ed [t; z (t)] = w 2 n i=1 j 6=i fg[zj (t) 0 zi (t) 0 cij ]g2 0 exp(0t= )zn (t) w; ; > 0, g(1) is a nonnegative (25) where and nondecreasing activation function defined as g (z ) > 0 if z > 0 or g (z ) = 0, otherwise, and exp(0t= ) is a decaying temperature parameter. D. Activation Function The role of the activation function f (1) is twofold: to enforce the nonnegative constraint on xij as described in (24) and to scale the sensitivity of the activation of xij (t). The basic requirements for the activation function of primal routing network are nonnegativity and nondecreasing monotonicity, i.e., f [uij (t)] 0 and df [uij (t)]=duij 0 for i; j = 1; 2; 1 1 1 ; n. The unipolar sigmoid function, f [uij (t)] = xmax =f1+exp[0uij (t)]g, is a good candidate of the activation function, where xmax 1 is the supremum of xij and is a positive scaling constant. Another simple activation function (Heaviside function) can be defined as f [uij (t)] = uij (t) for uij (t) > 0 and f [uij (t)] = 0 otherwise, where > 0 is the slope of the activation function in the positive half-space. Similar to the sigmoid activation function, the design parameter determines the sensitivity of activation. B. Dynamical Equation Let dzi (t)=dt = 0@Ed [t; z (t)]=@zi , the dynamical equation and output function of the dual routing network are as follows: dzi (t) = dt 0w j 6=i fg[zj (t) 0 zi (t) 0 cij ] 0 g[zi (t) 0 zj (t) 0 cji ]g + in exp(0t= ); xij (t) = h[zj (t) 0 zi (t) 0 cij ] where h(u) is the output function defined as h(u) = 1 if = 0 otherwise. h(u) C. Network Architectures E. Design Parameters It can be shown that all the nonzero eigenvalues of the system matrix of the linearized primal routing network are equal to 02nw. Therefore, a large value of w should be used to expedite the convergence. Furthermore, since the nonzero eigenvalues are directly proportional to n, the average convergence time of the primal routing network with given design parameters decreases as n increases. This feature has been demonstrated for the predecessor of the primal routing network in [13]. i = 2; 3; 1 1 1 ; n (26) (27) u = 0, or The dual routing network consists of n 0 1 neurons representing arranged spatially in a layer. The dynamical equation (26) of the dual routing network shows that there is an inhibitory connection with weight of 0w and an excitatory connection with weight of w from every pair of zi (t) (i = 2; 3; 1 1 1 ; n). That is, the number of connections is 4(n 0 1)2 , the same as the primal routing network. The dynamical equation also shows that only the neuron corresponding to the destination has a decaying external input exp(0t= ). Fig. 2 illustrates the architecture of the dual routing network. [zi ] Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply. 868 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998 D. Design Parameters Because the constraint coefficient matrix in a dual problem is the transpose of that in the primal problem, it can be seen that the nonzero eigenvalues of the system matrix of the linearized dual routing network are also 02nw. Large w can expedite the convergence of the dual routing network as well. Similar to in the primal routing network, the role of is to balance the effects of constraint satisfaction and objective maximization. It is usually set wcmax . Similar to that in the primal routing network, the role of the activation function in the dual routing network is to enforce the inequality constraints (18) and scale the sensitivity of the activation. The Heaviside activation function discussed in the preceding section is suitable for the dual routing network. E. Convergence Analysis The asymptotic stability of the deterministic annealing network is proven in [16]. The necessary and sufficient condition for the state variables of the dual routing network to converge to a feasible solution will be shown to be the absence of negative cycles in the given network G. A negative cycle is characterized by the negative sum of cost coefficients around a closed circuit in a network, i.e., 9i, such that cij + cjk + 1 1 1 + cli < 0. To prove the necessity, adding both sides of the constraints associated with indices i; j; k; 1 1 1 ; l, zj 0 zi cij ; zk 0 zj cjk ; 1 1 1 ; zi 0 zl cli , results in cij + cjk + 1 1 1 + cli 0. The sufficiency can be proved by showing that the convergence to an infeasible solution will lead to the presence of at least one negative cycle. Examining the dynamical equation (26) of the dual routing network, we can see that the only cause of an infeasibility convergence results from the existence of an offset effect in the first term on the right-hand side, since the second (last) term vanishes as time approaches infinity and the left-side hand approaches zero as the states converge. Specifically, an infeasibility implies that at least one term in both j 6=i g [zj (t) 0 zi (t) 0 cij ] and j 6=i g [zi (t) 0 zj (t) 0 cji ] are positive. According to the definition of g(1) in the dual routing network, g [zj (1) 0 zi (1) 0 cij ] > 0 if and only if zj (1) 0 zi (1) 0 cij > 0. If the cause of infeasibility is that g[zj (1) 0 zi (1) 0 cij ] > 0 and g [zi (1) 0 zj (1) 0 cji ] > 0 (i.e., zj (1) 0 zi (1) 0 cij > 0 and zi (1) 0 zj (1) 0 cji > 0), adding both sides of the two inequalities results in a negative cycle cij + cji < 0. If an infeasibility is caused by zj (1) 0 zi (1) 0 cij > 0 and zi (1) 0 zk (1) 0 cki > 0, zj (1) 0 zi (1) 0 cij > 0 implies 9m such that zm (1) 0 zj (1) 0 cjm > 0 and zi (1)0zk (1)0cki > 0 implies 9n such that zk (1)0zn (1)0 cnk > 0. These inequalities in turn result in two other inequalities. Eventually, adding both sides of all the resulting inequalities lead to the presence of a negative cycle cij + cjm + 1 1 1 + cnk + cki < 0. The case that an infeasibility results from the cancellation of more than two terms can be proven in the similar way. F. Post Processing Using the Complementary Slackness Theorem via the output function (27), the optimal primal solution in terms of edges can be decoded from the optimal dual solution. The activation states from the output function of the dual routing network, however, sometimes results in an infeasible primal solution (specifically, more “1” than required). This infeasibility can be easily removed by checking every element of the resulting primal solution matrix [x] from origin to destination and converting inconsistent “1” to “0.” Specifically, 8 xij = 1, if 8 k; xki = xjk = 0 then set xij = 0. VI. ILLUSTRATIVE EXAMPLES Example 1: Consider the shortest path problem with 10 vertices (Example 1 in [13] or Example 2 in [17]) where the origin and Fig. 3. Transient states of the dual routing network in Example 1. destination vertices are, respectively, vertices 1 and 10. The Euclidean distances are used as the cost coefficients. The shortest path of this problem is fe12 ; e23 ; e3; 10 g where eij (i = 1; 2; 1 1 1 ; n) denotes the edge between vertices i and j in the network. The total cost of the shortest path is 1.149 896. Let w = = 108 , = 1007 , and c1 = 10. Fig. 3 depicts the transient behavior of the activation states of the dual routing network. The steady-state vector of the output neurons is [0.325 914, 0.510 996, 0.585 029, 1.207 522, 0.476 889, 0.701 207, 0.555 255, 0.831 225, 1.149 902]. After post-processing to remove inconsistency and ensure feasibility, the neural network solution to the problem represents the shortest path. The simulated recurrent neural network takes about 0.5 s to converge. Unlike the popular Dijksdra’s labeling algorithm which can solve the shortest path problem with nonnegative cost coefficients only, the primal and dual routing networks are capable of solving the shortest path problem with mixed-sign cost coefficients as will be shown in the following example. Example 2: Consider the 10-vertex directed network with mixed positive and negative cost coefficients (Example 2 in [13]). The cost coefficient matrix is asymmetric and there are no loops or negative cycles in this network. The shortest path of this problem is fe19 ; e92 ; e25 ; e57 ; e7; 10 g, with the total cost of 0.360 952. Fig. 4 depicts the transient behavior of the activation states of the dual routing network simulated using the same design parameters in Example 1. The steady-state vector of the output neurons is [0.053 502, 0.041 353, 0.268 410, 0.037 663, 0.345 777, 0.212 687, 0.211 472, 0.117 621, 0.360 951]. After post-processing to remove inconsistency and ensure feasibility, the neural network solution to the problem represents the shortest path. It takes less than 0.5 s (5 ) for the simulated dual routing network to converge. VII. CONCLUDING REMARKS In this paper, the primal routing network with O(n2 ) neurons and connections and the dual routing network with O(n) neurons and O(n2 ) connections for solving the shortest path problem have been presented. The dynamics and architectures of the primal and dual routing networks have been described. The design methodology and guidelines have been delineated. It has been shown that the primal and dual routing networks are capable of shortest-path routing for directed Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply. IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998 869 a VLSI circuit can serve as coprocessors for onboard planning in dynamic decision environments. REFERENCES Fig. 4. Transient states of the dual routing network in Example 2. networks with arbitrary cost coefficients. The convergence rate of the primal and dual routing networks is nondecreasing with respect to the size of the shortest path problem and can be expedited by properly scaling design parameters. These features make the primal and dual routing networks suitable for solving large-scale shortest path problems in real-time applications. One salient advantage of the primal and dual routing networks is the independence of the connection weight matrix upon specific problems. Specifically, only the constant biases are different for different origins and/or destinations of the same network, and only the initial values of the decaying biases are different for different networks with the same number of vertices, and the same origin and destination. By energizing different vertices, the primal and dual routing networks can be used to generate all-pair shortest paths and minimal spanning trees. These desirable features facilitate the VLSI implementation of the primal and dual routing networks. The primal and dual routing networks implemented in [1] L. Bodin, B. L. Golden, A. Assad, and M. Ball, “Routing and scheduling of vehicles and crews: The state of the art,” Comput. Oper. Res., vol. 10, pp. 63–211, 1983. [2] A. Ephremides and S. Verdu, “Control and optimization methods in communication network problems,” IEEE Trans. Automat. Contr., vol. 34, pp. 930–942, 1989. [3] J. K. Antonio, G. M. Huang, and W. K. Tsai, “A fast distributed shortest path algorithm for a class of hierarchically clustered data networks,” IEEE Trans. Comput., vol. 41, pp. 710–724, 1992. [4] S. Jun and K. G. Shin, “Shortest path planning in distributed workspace using dominance relation,” IEEE Trans. Robot. Automat., vol. 7, pp. 342–350, 1991. [5] P. L. Lin and S. Chang, “A shortest path algorithm for a nonrotating object among obstacles of arbitrary shapes,” IEEE Trans. Syst., Man, Cybern., vol. 23, pp. 825–833, 1993. [6] P. Soueres and J.-P. Laumond, “Shortest paths synthesis for a car-like robot,” IEEE Trans. Automat. Contr., vol. 41, pp. 672–688, 1996. [7] E. L. Lawler, Combinatorial Optimization: Networks and Matroids. New York: Holt, Rinehart, and Winston, 1976, pp. 59–108. [8] J. J. Hopfield and D. W. Tank, “Neural computation of decisions in optimization problems,” Biol. Cybern., vol. 52, pp. 141–152, 1985. [9] D. W. Tank and J. J. Hopfield, “Simple neural optimization networks, an A/D converter, signal decision circuit, and a linear programming circuit,” IEEE Trans. Circuits Syst., vol. CAS-33, pp. 533–541, 1986. [10] H. E. Rauch and T. Winarske, “Neural networks for routing communication traffic,” IEEE Contr. Syst. Mag., vol. 8, pp. 26–30, 1988. [11] S.-L. Lee and S. Chang, “Neural networks for routing communication networks with unreliable components,” IEEE Trans. Neural Networks, vol. 4, pp. 854–863, 1993. [12] M. Ali and F. Kamoun, “Neural networks for shortest path computation and routing in computer networks,” IEEE Trans. Neural Networks, vol. 4, pp. 941–954, 1993. [13] J. Wang, “A recurrent neural network for solving the shortest path problem,” IEEE Trans. Circuits Syst., vol. 43, pp. 482–486, 1996. [14] M. S. Bazaraa, J. J. Jarvis, and H. D. Sherali, Linear Programming and Network Flows, 2nd ed. New York: Wiley, 1990. [15] J. Wang, “Analysis and design of a recurrent neural network for linear programming,” IEEE Trans. Circuits Syst., vol. 40, pp. 613–618, 1993. , “A deterministic annealing neural network for convex program[16] ming,” Neural Networks, vol. 7, pp. 629–641, 1994. [17] , “Primal and dual assignment networks,” IEEE Trans. Neural Networks, vol. 8, pp. 784–790, 1997. Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply.