Download Primal And Dual Neural Networks For Shortest

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Cracking of wireless networks wikipedia , lookup

Multiprotocol Label Switching wikipedia , lookup

Peering wikipedia , lookup

Piggybacking (Internet access) wikipedia , lookup

Computer network wikipedia , lookup

Recursive InterNetwork Architecture (RINA) wikipedia , lookup

Network tap wikipedia , lookup

List of wireless community networks by region wikipedia , lookup

Airborne Networking wikipedia , lookup

Dijkstra's algorithm wikipedia , lookup

IEEE 802.1aq wikipedia , lookup

Routing in delay-tolerant networking wikipedia , lookup

Routing wikipedia , lookup

Transcript
864
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998
[26] T. Whalen, “Decisionmaking under uncertainty with various assumptions about available information,” IEEE Trans. Syst., Man, Cybern.,
vol. SMC-14, pp. 888–900, 1984.
[27] R. J. Wonnacott and T. H. Wonnacott, Introductory Statistics. New
York: Wiley, 1985.
[28] R. R. Yager, “A procedure for ordering fuzzy subsets of the unit
interval,” Inf. Sci., vol. 24, pp. 143–161, 1981.
[29] Y. Yuan, “Criteria for evaluating fuzzy ranking methods,” Fuzzy Sets
Syst., vol. 44, pp. 139–157, 1991.
[30] L. A. Zadeh, “Fuzzy sets,” Inf. Contr., vol. 8, pp. 338–353, 1965.
[31]
, “A fuzzy-algorithmic approach to the definition of complex or
imprecise concepts,” Int. J. Man–Mach. Stud., vol. 8, pp. 249–291, 1976.
Primal and Dual Neural Networks
for Shortest-Path Routing
Jun Wang
Abstract— This paper presents two recurrent neural networks for
solving the shortest path problem. Simplifying the architecture of a
recurrent neural network based on the primal problem formulation,
the first recurrent neural network called the primal routing network
has less complex connectivity than its predecessor. Based on the dual
problem formulation, the second recurrent neural network called the dual
routing network has even much simpler architecture. While being simple
in architecture, the primal and dual routing networks are capable of
shortest-path routing like their predecessor.
Index Terms—Neural networks, optimization, shortest path problem.
I. INTRODUCTION
The shortest path problem is concerned with finding the shortest
path from a specified origin to a specified destination in a given
network while minimizing the total cost associated with the path.
The shortest path problem is an archetypal combinatorial optimization
problem having widespread applications in a variety of settings. The
applications of the shortest path problem include vehicle routing in
transportation systems [1], traffic routing in communication networks
[2], [3], and path planning in robotic systems [4]–[6]. Furthermore,
the shortest path problem also has numerous variations such as
the minimum weight problem, the quickest path problem, the most
reliable path problem, and so on.
The shortest path problem has been investigated extensively. The
well-known algorithms for solving the shortest path problem include
the O(n2 ) Bellman’s dynamic programming algorithm for directed
acycle networks, the O(n2 ) Dijkstra-like labeling algorithm and
the O(n3 ) Bellman–Ford successive approximation algorithm for
networks with nonnegative cost coefficients only, where n denotes
the number of vertices in the network. See [7] for a comprehensive
coverage of these algorithms. Besides the classical methods, many
new and modified methods have been developed during the past
few years. For large-scale and real-time applications such as traffic
Manuscript received June 27, 1996; revised December 7, 1997. This work
was supported by the Hong Kong Research Grants Council under Grant
CUHK381/96E.
The author is with the Department of Mechanical and Automation Engineering, the Chinese University of Hong Kong, Shatin, N.T., Hong Kong,
R.O.C.
Publisher Item Identifier S 1083-4427(98)06216-X.
routing and path planning, the existing series algorithms may not be
effective and efficient due to the limitation of sequential processing
in computational time. Therefore, parallel solution methods are more
desirable.
Since Hopfield and Tank’s seminal work [8], [9], neural networks
for solving optimization problems have been a major area in neural
network research. While the mainstream of neural network approach
to optimization focuses on solving the NP-complete problems such
as the traveling salesman problem [8], there have been a few direct
attacks of the shortest path problem using neural network [10]–[12].
These investigations have shown the sufficient potentials for the
neural network approach to the shortest path problem.
In the present paper, two recurrent neural networks for shortestpath routing, called the primal and dual routing networks, are
presented. These recurrent neural networks are capable of routing
the shortest path in networks with mixed positive and negative cost
coefficients. These recurrent neural networks are also much simpler
in architecture, hence much easier for hardware implementation.
II. PATH REPRESENTATIONS
A. Problem Statement
Given a weighted direct graph G = (V ; E ) where V is a set of
vertices and E is an ordered set of m edges. A fixed cost cij is
associated with the edge from vertices i to j in the graph G. In a
transportation or a robotic system, for example, the physical meaning
of the cost can be the distance between the vertices, the time or energy
needed for travel from one vertex to another. In a telecommunication
system, the cost can be determined according to the transmission time
and the link capacity from one vertex to another. In general, the cost
coefficients matrix [cij ] is not necessarily symmetric, i.e., the cost
from vertices i to j may not be equal to the cost from vertices j
to i. Furthermore, the edges between some vertices may not exist,
i.e., m may be less than n2 (i.e., m < n2 ). The values of cost
coefficients for the n2 0 m nonexistent edges are defined as infinity.
More generally, a cost coefficient can be either positive or negative.
A positive cost coefficient represents a loss, whereas a negative one
represents a gain. It is admittedly more difficult to determine the
shortest path for a network with mixed positive and negative cost
coefficients [7]. For an obvious reason, we assume that there are
neither negative cycles nor negative loops in the networks (i.e., 8 i; j ;
cii 0,
c
0). Hence the total cost of the shortest path is
i
j ij
bounded from below. Since the vertices in a network can be labeled
arbitrarily, without loss of generality, we assume hereafter vertices 1
and n are origin and destination, respectively.
n
B. Vertex Path Representation
A path in a given network can be represented in different ways
and the way of path representation in turn affects the effectiveness
and efficiency of a solution procedure. In the literature, there are
two typical path representations: vertex representation and edge
representation.
Rauch and Winarske [10] and Lee and Chang [11] used an n 2 p
binary matrix to represent a path, where p is the number of vertices
in the shortest path and which is assumed to be known. Specifically,
the binary matrix of vertex path representation [xij ] contains only
1083–4427/98$10.00  1998 IEEE
Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply.
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998
n
“0” and “1” elements, and xij can be defined as
1;
0;
xij =
if the j th vertex is the ith stop in the path;
otherwise.
xkn = 1
2 f0; 1g;
xij
xij =
minimize
n
=1 j =1; j 6=i
cij xij
(9)
i
n
subject to
k
=1; k6=i
xik
xij
6
0
n
=1; l6=i
xli
l
1;
0;
1;
=
if i = 1,
if i = 2; 3;
if i = n,
0
2 f0; 1g;
i = j ; i; j = 1; 2;
1 1 1 ; n 0 1,
111; n
(10)
(11)
where xij denotes the decision variable associated with the edge
from vertices i to j , as defined in (2). The objective function to
be minimized, (9), is also the total cost for the path. The equality
constraint coefficients and the right-hand sides are 01, 0, or 1.
Equation (10) ensures that a continuous path starts from a specified
origin and ends at a specified destination.
Because of the total unimodulity property of the constraint coefficient matrix defined in (10) [14], the integrality constraint in the
shortest path problem formulation can be equivalently replaced with
the non-negativity constraint, if the shortest path is unique. In other
words, the optimal solutions of the equivalent linear programming
problem are composed of zero and one integers if a unique optimum
exists [14]. The equivalent linear programming problem based on the
simplified edge path representation can be described as follows:
01
n
III. PROBLEM FORMULATION
(8)
Based on the edge path representation, the primal shortest path
problem can be formulated as a linear integer program as follows
[14]:
(2)
Similar to the vertex path representation, each row and each column in the edge representation can contain no more than one “1”
element, if each vertex can be visited at most once. Since the cost
coefficients of loops are assumed to be nonnegative (i.e., cii 0
for i = 1; 2; 1 1 1 ; n), the elements in the main diagonal line of
the edge path representation are always zero. Hence, the edge path
representation can be simplified by excluding the diagonal elements
xii (i = 1; 2; 1 1 1 ; n). In addition, the first column and last row
of the decision variable matrix are always zero vectors, they can be
removed to reduce the matrix size. Consequently, the simplified edge
representation has only (n 0 1)(n 0 2) binary elements. Because
the edge path representation can represent the shortest path with
more than n vertices in a general network with mixed positive and
negative cost coefficients, it is more desirable. The desirability of the
edge representation will be more apparent in problem formulation
that follows.
111; n
B. Primal Formulation Based on Edge Path Representation
n
if the edge from vertices i to j is in the path;
otherwise.
i; j = 1; 2;
(7)
where xij denotes the decision variable associated with the ith stop
and the j th vertex as defined in (1). The objective function to be
minimized, (3), is the total cost associated with the path. Equation
(4) ensures that each vertex is visited at most once. Equation (5)
ensures that each stop contains at most one vertex. Equations (6)
and (7) define the initial condition for starting and ending vertices.
Equation (8) is simply the integrality constraint.
C. Edge Path Representation
In contrast to the vertex path representation, the edge path representation uses an n 2 n binary matrix to represent the edges in a path
[12], [13]. Specifically, the binary matrix of edge path representation
[xij ] also contains only “0” and “1” elements, and xij can be defined
as
(6)
=1
x11 = 1
k
(1)
Obviously, each row and each column in the binary matrix of
vertex path representation contains no more than one “1” element,
respectively, if each vertex can be visited no more than once. Since
the binary matrix provides the path information in terms of vertices,
this path representation is called a vertex representation. A limitation
of the vertex path representation is that it requires determining the
number of vertices in the shortest path p a priori. In a dynamic
environment, it is almost impossible to determine p a priori due to
the time-varying nature of cost coefficients. If p is assigned with
a number smaller than the number of vertices in the shortest path,
then the resultant vertex representation cannot represent the shortest
path and any solution procedure based on the representation cannot
determine the shortest path.
1;
0;
865
minimize
n
=1 j =2; j 6=i
n01
cij xij
(12)
i
A. Formulation Based on Vertex Path Representation
The shortest path problem is to find the shortest (least costly)
possible directed path from a specified origin vertex to a specified
destination vertex. The cost of the path is the sum of the costs on the
edges in the path. Based on the vertex path representation, the shortest
path problem can be mathematically formulated as a quadratic integer
program as follows:
01
n
minimize
k
n
=1 i=1 j =1
n
subject to
n
=1
cij xki xk+1; j
(3)
xki
1;
k = 2; 3;
111; n 0 1
(4)
xki
1;
i = 1; 2;
111; n 01
(5)
i
n
k
=1
xik
subject to
k
0
n
xli
=1; k6=i
l=2; l6=i
= i1 0 in ;
i = 1; 2; 1 1 1 ; n
xij
0; i 6= j ; i = 1; 2; 1 1 1 ; n 0 1
111; n
j = 2; 3;
(13)
(14)
where pq is the Kronecker delta function defined as pq = 1 if
p = q and pq = 0 if p = q .
6
C. Dual Formulation Based on Edge Path Representation
Since the number of equality constraints n is much less than the
number of decision variables (n 0 1)(n 0 2), it is more desirable
to formulate and solve the dual of the primal shortest path problem.
Based on the edge path representation, the dual shortest path problem
Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply.
866
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998
can be formulated as a linear programming problem as follows [14]:
yn 0 y1
yj 0 yi cij ;
maximize
subject to
(15)
i 6= j
i; j = 1; 2; 1 1 1 ; n
(16)
where yi denotes the dual decision variable associated with vertex i.
Note that the value of the objective function at its maximum is the
total cost of the shortest path [14]. Since the objective function and
constraints in the dual problem involves variable differences only, an
equivalent dual shortest path problem with n 0 1 variables can be
formulated by defining zi = yi 0 y1 for i = 1; 2; 1 1 1 ; n
zn
zj 0 zi cij ;
maximize
subject to
(17)
i 6= j ;
i; j = 1; 2; 1 1 1 ; n
IV. PRIMAL ROUTING NETWORK
A. Energy Function
An energy function for the primal problem based on the simplified
edge path representation can be defined as follows:
2
i
=1
+
k
01
n
6=
[xik (t)
2
i
i
n
=1 j =2; j 6=i
i
0 x (t)] 0 1 + in
ki
cij
0t= )x (t)
exp(
Let duij (t)=dt = 0@Ep [t; x(t)]=@xij , based on the simplified
edge path representation, the state dynamical equation and the output
function of the recurrent neural network presented in [13] is as
follows: for i 6= ji = 1; 2; 1 1 1 ; n 0 1; j = 2; 3; 1 1 1 ; n;
duij (t)
dt
=
n
0w
ij
(19)
=1; l6=i
n01
xli (t)
l
n
+w
01
n
xik (t) + w
=2; k6=i
k
xjp (t) 0 w
=2; p6=j
q =1; q 6=j
+ w (i1 0 in 0 j 1 + jn )
xqj (t)
p
0 c
ij
0t= )
exp(
xij (t) = f [uij (t)]
uij (t) denotes the net
(20)
(21)
where
input to neuron (i; j ), f (u) is a
nonnegative and nondecreasing activation function defined as f (u) >
0 if u > 0 or f (u) = 0 otherwise, and df (u)=du 0.
To reduce the complexity of the resulting neural network architecture, the dynamical equation and output function of the present primal
routing network can be described by using n instrumental variables
(vi (t); i = 1; 2; 1 1 1 ; n)
duij (t)
dt
0wv (t) + wv (t) 0 c exp(0t= )
i=
6 j ; i = 1; 2; 1 1 1 ; n 0 1
j = 2; 3; 1 1 1 ; n
01
v (t) =
x (t) 0
x (t)
=
i
j
ij
n
i
(22)
n
ik
=2; k6=i
0 i1 + in ;
k
k
=1; k6=i
ki
i = 1; 2; 1 1 1 ; n
xij (t) = f [uij (t)]
i 6= j ; i = 1; 2; 1 1 1 ; n 0 1
j = 2; 3; 1 1 1 ; n:
C. Network Architecture
Because the shortest path problem based on the edge path representation is formulated as a linear program, it can be solved
by the neural networks proposed for linear programming. In [15],
[16], a recurrent neural network called the deterministic annealing
network is presented and demonstrated to be capable of solving linear
programming problems. The primal and dual routing networks are
tailored from the deterministic annealing network [15], [16]. Let the
decision variables of the primal and dual shortest path problem be
represented, respectively, by the activation states of the primal and
dual routing networks. For simplicity of notations, the same symbols
[xij ] and [zi ] are used to denote both the decision variables and
corresponding activation states.
n
B. Dynamical Equation
(18)
where z1 0. The value of the objective function at its maximum is
still the total cost of the shortest path. Moreover, zn0q+i (1 i q )
is the cost of the shortest path with i edges to the destination where q
is the number of edges in the shortest path. In addition, if zi is used
as the objective function to be maximized, then at optimum it is the
shortest path from vertex 1 to vertex i [14].
Although the last component of the optimal dual solution gives the
total cost of the shortest path, the optimal dual solution needs postprocessing to decode the optimal primal solution in terms of edges.
According to the Complementary Slackness Theorem [14]: given the
feasible solutions of xij and zi to the primal and dual problems,
respectively, the solutions are optimal if and only if 1) xij = 1
implies zj 0 zi = cij and 2) xij = 0 is implied by zj 0 zi < cij
for i; j = 1; 2; 1 1 1 ; n. The Complementary Slackness Theorem can
be used as the basis for the post-processing as will be discussed in
Section V.
The shortest path problem formulation based on the edge path
representation, (12)–(14) or (17) and (18), is a linear programming
problem, whereas the problem based on the vertex path representation, (3)–(8), is an integer nonconvex quadratic programming
problem. The advantages of the edge representation over the vertex
representation become more obvious. The subsequent development
of this paper is thus based on the simplified edge path representation.
Ep [t; x(t)] = w
where w; , and are positive scaling constants and exp(0t= )
is a decaying temperature parameter. The role of the temperature
parameter in (19) is explained at length in [26] and [27].
(23)
(24)
The primal routing network consists of n2 0 2n + 2 neurons
arranged spatially in two layers: an output layer and a hidden layer.
The output layer consists of an (n 0 1) 2 (n 0 1) two-dimensional
array of output neurons without the diagonal elements representing
[xij ]. The hidden layer consists of an n-vector of hidden neurons
representing instrumental variables [vi ].
The first two terms in the right-hand side of (22) define the
connectivity from the n hidden neurons to the (n 0 1)(n 0 2)
output neurons. The third term in the right-hand side of (22) defines
a decaying external input exp(0t= )cij to the output layer.
Similarly, the first two terms in the right-hand side of (23) define
the connectivity from the (n 0 1)(n 0 2) output neurons to the n
hidden neurons. The third term in the right-hand side of (23) defines a
constant positive and negative unity input (bias) to the hidden neurons
corresponding, respectively, to the specified origin and destination.
Moreover, (22) and (23) also show the connectivity of the primal
routing network:
1) there exist (n 0 1)(n 0 2) excitatory connections with weight of
w to xij (t) in the output layer from vi (t) in the hidden layer;
2) there exist (n 0 1)(n 0 2) inhibitory connections with weight
of 0w to xij (t) in the output layer from vi (t) in the hidden
layers;
Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply.
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998
867
Fig. 2. Architecture of the dual routing network.
Fig. 1. Architecture of the primal routing network.
3) there exist n(n 0 1) excitatory connections with unity weight
to vi (t) in the hidden layer from xi1 (t); xi2 (t); 1 1 1 ; xin (t) in
the output layer (i = 1; 2; 1 1 1 ; n);
4) there exist n(n 0 1) inhibitory connections with
negative unity weight to vi (t) in the hidden layer
from x1i (t); x2i (t); 1 1 1 ; xni (t) in the output layer
(i = 1; 2; 1 1 1 ; n);
5) there is no lateral connection among neurons in either the output
layer nor in the hidden layer.
The spatial complexity of the primal routing network is characterized by O(n2 ) neurons and O(n2 ) connections. Specifically, there are
2
4(n 0 1) unidirectional connections in the primal routing network
compared with the n(n 0 1)(2n 0 3) connections in the single-layer
recurrent neural network [13], a reduction of (n 0 1)(2n2 0 7n + 4)
connections for n 3. Fig. 1 illustrates the architecture of primal
routing network.
From (22), it is easy to see that the convergence rate of primal
routing network depends also on the parameter . The parameter serves as a time constant for the decaying bias to reinforce the effect
of cost minimization. The time constant has to be sufficiently large
to sustain cost minimization and ensure solution optimality. Since the
transient time of primal routing network depends on the reciprocals
of the sensitivity parameter and the minimum absolute value of the
nonzero eigenvalues of the connection weight matrix nw, a lower
bound on the value of is 4=(nwxmax ), i.e., > 4=(nwxmax ) if
the unipolar sigmoid activation function is used, as discussed in [15].
Based on a similar analysis, a design rule is that > 1=(nw)
if the Heaviside activation function is used. The role of is to
balance the effects of constraint satisfaction and cost minimization.
Let cmax = maxfcij ; i; j; = 1; 2; 1 1 1 ; ng < 1. A design rule is
to select such that w=cmax .
V. DUAL ROUTING NETWORK
A. Energy Functions
Similarly to the primal shortest path problem, an energy function
for the dual problem can be formulated as
Ed [t; z (t)] = w
2
n
i=1 j 6=i
fg[zj (t) 0 zi (t) 0 cij ]g2
0 exp(0t= )zn (t)
w; ; > 0, g(1) is a nonnegative
(25)
where
and nondecreasing
activation function defined as g (z ) > 0 if z > 0 or g (z ) = 0,
otherwise, and exp(0t= ) is a decaying temperature parameter.
D. Activation Function
The role of the activation function f (1) is twofold: to enforce
the nonnegative constraint on xij as described in (24) and to scale
the sensitivity of the activation of xij (t). The basic requirements
for the activation function of primal routing network are nonnegativity and nondecreasing monotonicity, i.e., f [uij (t)] 0 and
df [uij (t)]=duij 0 for i; j = 1; 2; 1 1 1 ; n. The unipolar sigmoid
function, f [uij (t)] = xmax =f1+exp[0uij (t)]g, is a good candidate
of the activation function, where xmax 1 is the supremum of
xij and is a positive scaling constant. Another simple activation
function (Heaviside function) can be defined as f [uij (t)] = uij (t)
for uij (t) > 0 and f [uij (t)] = 0 otherwise, where > 0 is the
slope of the activation function in the positive half-space. Similar to
the sigmoid activation function, the design parameter determines
the sensitivity of activation.
B. Dynamical Equation
Let dzi (t)=dt = 0@Ed [t; z (t)]=@zi , the dynamical equation and
output function of the dual routing network are as follows:
dzi (t) =
dt
0w
j 6=i
fg[zj (t) 0 zi (t) 0 cij ]
0 g[zi (t) 0 zj (t) 0 cji ]g
+ in exp(0t= );
xij (t) = h[zj (t) 0 zi (t) 0 cij ]
where h(u) is the output function defined as h(u) = 1 if
= 0 otherwise.
h(u)
C. Network Architectures
E. Design Parameters
It can be shown that all the nonzero eigenvalues of the system
matrix of the linearized primal routing network are equal to 02nw.
Therefore, a large value of w should be used to expedite the
convergence. Furthermore, since the nonzero eigenvalues are directly
proportional to n, the average convergence time of the primal routing
network with given design parameters decreases as n increases. This
feature has been demonstrated for the predecessor of the primal
routing network in [13].
i = 2; 3; 1 1 1 ; n
(26)
(27)
u = 0, or
The dual routing network consists of n 0 1 neurons representing
arranged spatially in a layer. The dynamical equation (26) of
the dual routing network shows that there is an inhibitory connection
with weight of 0w and an excitatory connection with weight of w
from every pair of zi (t) (i = 2; 3; 1 1 1 ; n). That is, the number of
connections is 4(n 0 1)2 , the same as the primal routing network. The
dynamical equation also shows that only the neuron corresponding
to the destination has a decaying external input exp(0t= ). Fig. 2
illustrates the architecture of the dual routing network.
[zi ]
Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply.
868
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998
D. Design Parameters
Because the constraint coefficient matrix in a dual problem is the
transpose of that in the primal problem, it can be seen that the nonzero
eigenvalues of the system matrix of the linearized dual routing
network are also 02nw. Large w can expedite the convergence
of the dual routing network as well. Similar to in the primal
routing network, the role of is to balance the effects of constraint
satisfaction and objective maximization. It is usually set wcmax .
Similar to that in the primal routing network, the role of the
activation function in the dual routing network is to enforce the
inequality constraints (18) and scale the sensitivity of the activation.
The Heaviside activation function discussed in the preceding section
is suitable for the dual routing network.
E. Convergence Analysis
The asymptotic stability of the deterministic annealing network is
proven in [16]. The necessary and sufficient condition for the state
variables of the dual routing network to converge to a feasible solution
will be shown to be the absence of negative cycles in the given
network G. A negative cycle is characterized by the negative sum of
cost coefficients around a closed circuit in a network, i.e., 9i, such
that cij + cjk + 1 1 1 + cli < 0. To prove the necessity, adding both sides
of the constraints associated with indices i; j; k; 1 1 1 ; l, zj 0 zi cij ; zk 0 zj cjk ; 1 1 1 ; zi 0 zl cli , results in cij + cjk + 1 1 1 + cli 0. The sufficiency can be proved by showing that the convergence
to an infeasible solution will lead to the presence of at least one
negative cycle. Examining the dynamical equation (26) of the dual
routing network, we can see that the only cause of an infeasibility
convergence results from the existence of an offset effect in the first
term on the right-hand side, since the second (last) term vanishes as
time approaches infinity and the left-side hand approaches zero as the
states converge. Specifically, an infeasibility implies that at least one
term in both j 6=i g [zj (t) 0 zi (t) 0 cij ] and j 6=i g [zi (t) 0 zj (t) 0
cji ] are positive. According to the definition of g(1) in the dual routing
network, g [zj (1) 0 zi (1) 0 cij ] > 0 if and only if zj (1) 0 zi (1) 0
cij > 0. If the cause of infeasibility is that g[zj (1) 0 zi (1) 0 cij ] >
0 and g [zi (1) 0 zj (1) 0 cji ] > 0 (i.e., zj (1) 0 zi (1) 0 cij > 0 and
zi (1) 0 zj (1) 0 cji > 0), adding both sides of the two inequalities
results in a negative cycle cij + cji < 0. If an infeasibility is caused
by zj (1) 0 zi (1) 0 cij > 0 and zi (1) 0 zk (1) 0 cki > 0, zj (1) 0
zi (1) 0 cij > 0 implies 9m such that zm (1) 0 zj (1) 0 cjm > 0
and zi (1)0zk (1)0cki > 0 implies 9n such that zk (1)0zn (1)0
cnk > 0. These inequalities in turn result in two other inequalities.
Eventually, adding both sides of all the resulting inequalities lead to
the presence of a negative cycle cij + cjm + 1 1 1 + cnk + cki < 0.
The case that an infeasibility results from the cancellation of more
than two terms can be proven in the similar way.
F. Post Processing
Using the Complementary Slackness Theorem via the output
function (27), the optimal primal solution in terms of edges can be
decoded from the optimal dual solution. The activation states from
the output function of the dual routing network, however, sometimes
results in an infeasible primal solution (specifically, more “1” than
required). This infeasibility can be easily removed by checking every
element of the resulting primal solution matrix [x] from origin
to destination and converting inconsistent “1” to “0.” Specifically,
8 xij = 1, if 8 k; xki = xjk = 0 then set xij = 0.
VI. ILLUSTRATIVE EXAMPLES
Example 1: Consider the shortest path problem with 10 vertices
(Example 1 in [13] or Example 2 in [17]) where the origin and
Fig. 3. Transient states of the dual routing network in Example 1.
destination vertices are, respectively, vertices 1 and 10. The Euclidean
distances are used as the cost coefficients. The shortest path of this
problem is fe12 ; e23 ; e3; 10 g where eij (i = 1; 2; 1 1 1 ; n) denotes
the edge between vertices i and j in the network. The total cost of
the shortest path is 1.149 896. Let w = = 108 , = 1007 , and
c1 = 10. Fig. 3 depicts the transient behavior of the activation states
of the dual routing network. The steady-state vector of the output
neurons is [0.325 914, 0.510 996, 0.585 029, 1.207 522, 0.476 889,
0.701 207, 0.555 255, 0.831 225, 1.149 902]. After post-processing
to remove inconsistency and ensure feasibility, the neural network
solution to the problem represents the shortest path. The simulated
recurrent neural network takes about 0.5 s to converge.
Unlike the popular Dijksdra’s labeling algorithm which can solve
the shortest path problem with nonnegative cost coefficients only, the
primal and dual routing networks are capable of solving the shortest
path problem with mixed-sign cost coefficients as will be shown in
the following example.
Example 2: Consider the 10-vertex directed network with mixed
positive and negative cost coefficients (Example 2 in [13]). The
cost coefficient matrix is asymmetric and there are no loops or
negative cycles in this network. The shortest path of this problem is fe19 ; e92 ; e25 ; e57 ; e7; 10 g, with the total cost of 0.360 952.
Fig. 4 depicts the transient behavior of the activation states of the
dual routing network simulated using the same design parameters
in Example 1. The steady-state vector of the output neurons is
[0.053 502, 0.041 353, 0.268 410, 0.037 663, 0.345 777, 0.212 687,
0.211 472, 0.117 621, 0.360 951]. After post-processing to remove
inconsistency and ensure feasibility, the neural network solution to
the problem represents the shortest path. It takes less than 0.5 s (5 )
for the simulated dual routing network to converge.
VII. CONCLUDING REMARKS
In this paper, the primal routing network with O(n2 ) neurons and
connections and the dual routing network with O(n) neurons and
O(n2 ) connections for solving the shortest path problem have been
presented. The dynamics and architectures of the primal and dual
routing networks have been described. The design methodology and
guidelines have been delineated. It has been shown that the primal and
dual routing networks are capable of shortest-path routing for directed
Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply.
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 28, NO. 6, NOVEMBER 1998
869
a VLSI circuit can serve as coprocessors for onboard planning in
dynamic decision environments.
REFERENCES
Fig. 4. Transient states of the dual routing network in Example 2.
networks with arbitrary cost coefficients. The convergence rate of the
primal and dual routing networks is nondecreasing with respect to the
size of the shortest path problem and can be expedited by properly
scaling design parameters. These features make the primal and dual
routing networks suitable for solving large-scale shortest path problems in real-time applications. One salient advantage of the primal
and dual routing networks is the independence of the connection
weight matrix upon specific problems. Specifically, only the constant
biases are different for different origins and/or destinations of the
same network, and only the initial values of the decaying biases are
different for different networks with the same number of vertices,
and the same origin and destination. By energizing different vertices,
the primal and dual routing networks can be used to generate all-pair
shortest paths and minimal spanning trees. These desirable features
facilitate the VLSI implementation of the primal and dual routing
networks. The primal and dual routing networks implemented in
[1] L. Bodin, B. L. Golden, A. Assad, and M. Ball, “Routing and scheduling
of vehicles and crews: The state of the art,” Comput. Oper. Res., vol.
10, pp. 63–211, 1983.
[2] A. Ephremides and S. Verdu, “Control and optimization methods in
communication network problems,” IEEE Trans. Automat. Contr., vol.
34, pp. 930–942, 1989.
[3] J. K. Antonio, G. M. Huang, and W. K. Tsai, “A fast distributed shortest
path algorithm for a class of hierarchically clustered data networks,”
IEEE Trans. Comput., vol. 41, pp. 710–724, 1992.
[4] S. Jun and K. G. Shin, “Shortest path planning in distributed workspace
using dominance relation,” IEEE Trans. Robot. Automat., vol. 7, pp.
342–350, 1991.
[5] P. L. Lin and S. Chang, “A shortest path algorithm for a nonrotating
object among obstacles of arbitrary shapes,” IEEE Trans. Syst., Man,
Cybern., vol. 23, pp. 825–833, 1993.
[6] P. Soueres and J.-P. Laumond, “Shortest paths synthesis for a car-like
robot,” IEEE Trans. Automat. Contr., vol. 41, pp. 672–688, 1996.
[7] E. L. Lawler, Combinatorial Optimization: Networks and Matroids.
New York: Holt, Rinehart, and Winston, 1976, pp. 59–108.
[8] J. J. Hopfield and D. W. Tank, “Neural computation of decisions in
optimization problems,” Biol. Cybern., vol. 52, pp. 141–152, 1985.
[9] D. W. Tank and J. J. Hopfield, “Simple neural optimization networks,
an A/D converter, signal decision circuit, and a linear programming
circuit,” IEEE Trans. Circuits Syst., vol. CAS-33, pp. 533–541, 1986.
[10] H. E. Rauch and T. Winarske, “Neural networks for routing communication traffic,” IEEE Contr. Syst. Mag., vol. 8, pp. 26–30, 1988.
[11] S.-L. Lee and S. Chang, “Neural networks for routing communication
networks with unreliable components,” IEEE Trans. Neural Networks,
vol. 4, pp. 854–863, 1993.
[12] M. Ali and F. Kamoun, “Neural networks for shortest path computation
and routing in computer networks,” IEEE Trans. Neural Networks, vol.
4, pp. 941–954, 1993.
[13] J. Wang, “A recurrent neural network for solving the shortest path
problem,” IEEE Trans. Circuits Syst., vol. 43, pp. 482–486, 1996.
[14] M. S. Bazaraa, J. J. Jarvis, and H. D. Sherali, Linear Programming and
Network Flows, 2nd ed. New York: Wiley, 1990.
[15] J. Wang, “Analysis and design of a recurrent neural network for linear
programming,” IEEE Trans. Circuits Syst., vol. 40, pp. 613–618, 1993.
, “A deterministic annealing neural network for convex program[16]
ming,” Neural Networks, vol. 7, pp. 629–641, 1994.
[17]
, “Primal and dual assignment networks,” IEEE Trans. Neural
Networks, vol. 8, pp. 784–790, 1997.
Authorized licensed use limited to: Akira Imada. Downloaded on April 2, 2009 at 00:08 from IEEE Xplore. Restrictions apply.