Multi-class scheduling algorithms for supporting QoS in an
interconnected WDM rings network
A. Bianco1, D. Careglio2, J. Finochietto1, G. Galante1, E. Leonardi1, F. Neri1,
J. Solé-Pareta2, S. Spadaro2
1 Dipartimento di Elettronica, Politecnico di Torino, Italy,
(e-mail: {bianco, finochietto, galante, leonardi, neri}@mail.tlc.polito.it)
2 Advanced Broadband Communications Centre (CCABA), Universitat Politècnica de Catalunya, Spain,
(e-mail: {careglio,pareta}@ac.upc.es, [email protected])
Abstract— This paper is placed in the context of the interconnected WDM rings metro network, which
has been proposed within IST DAVID project. The DAVID metro network consists of several WDM
rings interconnected via an optical switch called Hub; the Hub provides optical interconnection among
rings using a scheduling algorithm based on a fixed-size frame while nodes belonging to the same ring
access the same set of shared resources using a MAC protocol. In this paper we address the problem of
designing efficient scheduling algorithms for supporting multi-class traffic services. In particular, two
categories of traffic are considered, namely best-effort traffic and high-priority traffic, the latter receiving bandwidth-guaranteed service. We formally define the scheduling at the Hub as an ILP problem, showing that, in
general, it is NP-hard. We propose a complex optimal algorithm based on flow maximization and three
simpler heuristic algorithms to solve the scheduling problem. The performances of the suggested
algorithms are evaluated by simulation, and the obtained results are discussed to assess the merits of
each proposed solution.
Keywords—optical metro networks, WDM ring, QoS provisioning, MAC protocol, scheduling
algorithm.
I. INTRODUCTION
Networking is currently experiencing explosive growth in the service layer, which is
placing heavy demands on the transport layer and resulting in a complex, multi-layer
structure. This situation is challenging the capacity limits of existing transport infrastructure based on
SONET/SDH and ATM. Therefore, in the new telecommunication network optical technology is
expected to play a stronger role, not only for increasing the transmission capacity, but also for
networking operation. In fact, optics is an interesting solution since it can provide a direct IP over
WDM platform with less complexity in resource management and allocation algorithms [1]. On the
other hand, the need of Network Operators for reducing investments and operative costs (CAPEX and
OPEX) represents one of the major drivers moving the evolution of the public network infrastructures
towards optical shared medium networks, in particular in the metro environment [2].
In this direction, the IST DAVID (Data And Voice Integration over DWDM) project [3] funded by
the European Commission aims at designing an optical packet-switched network by developing
networking concepts and technologies for future optical networks. The DAVID network architecture
encompasses both metropolitan and long-haul geographical scales.
Fig. 1. DAVID metro network architecture (figure omitted: WDM rings i, j and n, each connecting IP routers/switches, interconnected through the Hub)
At the metropolitan scale, the DAVID network consists of several uni-directional WDM rings
interconnected in a star topology by an optical switch called Hub as shown in Fig. 1. On each fiber, a
fixed number of wavelengths are available by WDM partitioning. Rings can either be physically
disjoint (i.e., run on different fibers), or be logically obtained by partitioning the optical bandwidth of
one fiber into disjoint portions. In the remainder of the paper we use the term ring to identify a logical
ring; any reference to physical rings will be explicit. Nodes belonging to the same ring access the same
set of shared resources using a Medium Access Control (MAC) protocol. No packet buffering in the
optical domain is used in the metro network; buffering is done only in electronics at access nodes. The
Hub provides optical interconnection among rings and to one (or more) optical packet routers in the
Backbone. Being bufferless, the Hub behaves like a space switch between rings. Ring interconnections
are dynamically modified at the Hub following a scheduling algorithm, based on a fixed-size frame.
The aim of the scheduling algorithm is to provide to node pairs an amount of bandwidth close to
instantaneous (short-term) bandwidth requirements. The scheduling is based on explicit bandwidth
requests issued by nodes and periodically collected at the Hub to create traffic request matrices.
Since it is reasonable to expect that multimedia and interactive applications will be a dominant
part of ISP services in future metro networks, providing QoS is becoming increasingly important. A
simple way to provide QoS is, as in legacy LANs and MANs, to introduce different traffic priorities [4].
A drawback of this strategy is that if high priority traffic is not shaped, policed or controlled otherwise
(e.g., through any kind of CAC), it has absolute priority over low priority (e.g., best effort) traffic. To
avoid this situation in public networks, bandwidth management functions are required. Network
Operators need to have the possibility to control the amount of high priority traffic injected in the
(optical) MAN segments of their network infrastructures [2].
In this context, this paper addresses the problem of designing a scheduling algorithm at the Hub (a
focal point for network management) for supporting multi-class traffic. In particular, two service
categories are considered, namely a conventional Best-Effort service (BE), and an on demand
(subjected to an admission control policy) bandwidth and delay service to support High-Priority traffic
(HP) for applications such as Internet telephony, video broadcasting, and video-conferences.
The rest of the paper is organized as follows. We first provide a more detailed description of the
metro network architecture (Section II). In Section III, we discuss the scheduling problem. Then, we
describe an optimal scheduling algorithm with polynomial computational complexity for a particular
logical configuration of the metro network (Section IV). Three heuristic solutions that
aim at reducing the computational complexity are then described in Section V. Finally, we present
simulation results to assess the performance of the proposed schemes (Section VI). Section VII
concludes the paper.
II. DAVID METRO NETWORK ARCHITECTURE
In this Section we describe the network architecture that serves as the basis for our work. In
particular, we focus on the assumptions required to support multi-class traffic. We do not tackle issues
related to the feasibility of the metro network in this paper, nor do we discuss the components that should be
used or the physical limitations that should be taken into account when dimensioning the network. All
these issues have been analyzed in depth in the DAVID project and have been discussed in [3].
The number of rings in the metro is denoted by R, the number of nodes in each ring, assumed to be
equal on each ring for simplicity, is denoted by n and the total number of nodes is N = R × n. While the
number of wavelengths on each ring can be different, we assume in this paper that it is equal for all
rings and denoted by W. We also assume that all the nodes of a ring can transmit and receive on any
wavelength used in that ring. The latter is an essential assumption, since the scheduling would be much
more complex if nodes had a limited tunability on the wavelengths of the ring they belong to.
Ring resources are shared by the nodes of the metro network using a statistical
time/wavelength/space division scheme. Each wavelength is time slotted (TDM), with a slot
duration in the range of 500 ns to 1 μs; several slots are transmitted simultaneously through wavelength
division (WDM); and rings can be disjoint in space (SDM).
Time slots are aligned on all wavelengths of the same ring, so that a multi-slot is available at each
node in each time slot. Slot misalignments among different rings are solved optically at the Hub; we
assume for simplicity that the propagation delay on each ring is an integer multiple of the slot size.
Slots are organized in fixed-size frames of length F slots.
One wavelength (hence one slot in each multi-slot) is devoted to the network control channel. We assume
that this control channel can be read and written by all nodes independently of their data transmissions
and receptions in the other slots of the multi-slot. The control information contained in a multi-slot refers
to the data slots of the same multi-slot; a delay is added to the optical data path at each node to process
the information contained in the control slot.
A. The role of the Hub
The Hub is a buffer-less space optical switch with R×W input and R×W output ports. Wavelengths
are (dynamically) assigned to ring-to-ring interconnections by the Hub on a slot-by-slot basis: a packet
received at an input ring on a given wavelength can be forwarded to any output ring on any available
wavelength. Being all-optical, the Hub includes only a space switching stage, a wavelength conversion
stage, and a WDM synchronization; 3R regeneration may be added if necessary.
The Hub operates a permutation from input ports to output ports in every time slot, as depicted in
Fig. 2 for the case of two rings and three wavelengths. The permutation is known for each time slot in
each ring: the Hub labels the multi-slots (using the control slot) to identify the destination (rings,
wavelengths)-tuples for the next time they return to the Hub. In such a way, the nodes can fill in the
empty slots knowing the scheduled destination (ring, wavelength)-tuple. The computation of the
sequence of permutations operated by the Hub is a scheduling problem [5] [6], as shown in Fig. 2.
Since we are assuming that the number of wavelengths in each ring is the same, no congestion can
occur at the Hub: each incoming multi-slot can be forwarded to the Hub outputs. The Hub may have to
perform wavelength conversion when the wavelengths used in the input ring are different from those
used in the output ring.
Fig. 2. Scheduling problem at the Hub (figure omitted: the permutations applied in successive time slots between the six input ports and six output ports of ring 1 and ring 2)
Several approaches can be envisaged to solve this scheduling problem ranging from complex
optimization to simple heuristics [7] [8], generally driven by the estimation of the traffic pattern as in
[9]-[11]. This estimation can be based upon a priori knowledge of the traffic pattern, measurements of
the traffic load, or reservation requests issued by nodes. This problem is discussed in detail in Section
III.
B. Architecture of the nodes
We assume that the nodes are equipped with one fixed transceiver, to operate on the control channel,
and one tunable transceiver, to send and receive packets; thus a node can only transmit and receive data
on one wavelength in each multi-slot. Nodes must also have a selective erasure capability in order to
perform destination stripping, which allows space reuse. For sake of simplicity, we assume that all
tuning latencies are negligible compared to the slot time.
A MAC protocol arbitrates the access of nodes to the slots, regulating both time and wavelength
dimensions. Information of the status of each slot of the multi-slots is reflected in the control slot.
Therefore, a node that has packets to send must monitor the control slot and, at the same time, look
for any instance of its own address for possible receptions. As a consequence, only the in-transit
information of the control channel is converted to the electrical domain for processing at the nodes;
the rest of the information remains in the optical domain along the entire source-destination node path.
In the electrical domain, to avoid the head-of-line blocking phenomenon [11], we assume that
packets are organized per traffic class and stored in per-destination-node queues, similar to the
Virtual Output Queue architecture adopted in Input-Queued switches [12].
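As an illustration of this queueing organization, the following minimal Python sketch keeps one FIFO per (traffic class, destination node) pair at an access node. The class name NodeQueues, its methods, and the policy of serving HP before BE for a given destination are assumptions made for the example; they are not specified in the text.

```python
from collections import deque


class NodeQueues:
    """Per-class, per-destination FIFOs at one access node (VOQ-style).

    Hypothetical helper: names and service policy are illustrative only.
    """

    CLASSES = ("HP", "BE")  # served in this order for a given destination

    def __init__(self, node_id, num_nodes):
        self.node_id = node_id
        # One queue per (traffic class, destination node), excluding ourselves.
        self.queues = {
            (cls, dst): deque()
            for cls in self.CLASSES
            for dst in range(num_nodes)
            if dst != node_id
        }

    def enqueue(self, cls, dst, packet):
        """Store a packet in the queue of its class and destination."""
        self.queues[(cls, dst)].append(packet)

    def dequeue_for(self, dst):
        """Return the next packet for dst (HP first), or None if none waits."""
        for cls in self.CLASSES:
            queue = self.queues[(cls, dst)]
            if queue:
                return queue.popleft()
        return None
```

Because each destination has its own queues, a packet waiting for a slot towards one node never delays a packet that could be sent towards another node, which is the head-of-line blocking that the organization is meant to avoid.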
C. Network behavior
Given the Hub behavior, each multi-slot traverses a sequence of rings, e.g. as illustrated in Fig. 3,
where Roman numerals indicate successive positions of the multi-slot. For the sake of simplicity, in this
figure we assume that all slots of the multi-slot have the same destination ring. In the figure, the upper
slot is the control slot, where the destination ring of the multi-slot is written, and the numbers within the
slots represent node destinations. Nodes of ring x transmit data to be received by nodes of ring y (Steps
II to IV). Ring x can be viewed as the upstream ring, where transmissions occur, while ring y can be
viewed as the downstream ring, where receptions occur. Note however that when the considered multi-slot traverses the downstream ring y (Steps VI to VIII), it gathers transmissions for the next ring, say
ring z, so that the ring traversal can be viewed as a downstream path for transmissions in the previous
ring, and as an upstream path for receptions in the following ring.
Fig. 3. Multi-slot forwarding in the metro network (figure omitted: successive positions (I) to (VIII) of a multi-slot travelling along ring x, through the Hub, and along ring y)
To summarize, neither packet buffering nor packet switching in the time domain is done at the Hub,
and, therefore, no packet losses occur in the optical domain.
III. SCHEDULING PROBLEM FORMALIZATION
We consider two classes of traffic: a high-priority (HP) class, with guaranteed bandwidth, and a
best-effort (BE) class. We aim to provide bandwidth reservation on a per-session (i.e., flow, virtual
circuit, etc.) basis. The bandwidths allocated to both HP traffic and BE traffic are based on explicit
bandwidth requests issued by nodes and received at the Hub through the signaling channel. Thus, two
traffic request matrices are available at the Hub, one for each traffic type: matrix A for HP requests and
matrix B for BE requests. HP traffic has higher priority than BE traffic: therefore, the scheduling must
satisfy all HP requests first, up to the limit of bandwidth availability, and then BE requests.
The scheduling is based on a fixed-size frame of length F slots, which is considered the most suited
solution for a system with guaranteed bandwidth allocation: in fact, a reservation issued in terms of
bit/s can easily be translated into slot/frame with a fixed frame size. Each request matrix will store the
number of slots that must be transmitted from a node to any other node within a frame. Only in the
heuristic approach, to simplify the scheduling, we schedule ring-to-ring request matrices for BE traffic;
the ring-to-ring request matrix can be easily derived from the larger node-to-node request matrix.
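The two conversions mentioned above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the slot payload size, and the use of a `ring_of` list implementing the node-to-ring mapping are assumptions introduced for the example.

```python
def slots_per_frame(rate_bps, slot_payload_bits, slot_duration_ns, frame_slots):
    """Slots per frame needed to carry rate_bps, rounded up.

    slot_payload_bits is a hypothetical parameter (useful bits per slot).
    Integer arithmetic is used so that the rounding is exact.
    """
    # bits per frame = rate * frame duration = rate * F * slot_duration
    numerator = rate_bps * frame_slots * slot_duration_ns
    denominator = 10**9 * slot_payload_bits
    return -(-numerator // denominator)  # ceiling division


def ring_to_ring(node_matrix, ring_of, num_rings):
    """Aggregate a node-to-node request matrix into a ring-to-ring matrix.

    ring_of[i] is the ring on which node i is located.
    """
    ring_matrix = [[0] * num_rings for _ in range(num_rings)]
    for i, row in enumerate(node_matrix):
        for j, slots in enumerate(row):
            ring_matrix[ring_of[i]][ring_of[j]] += slots
    return ring_matrix
```

For instance, with four nodes split over two rings (`ring_of = [0, 0, 1, 1]`), all requests issued by nodes 0 and 1 towards nodes 2 and 3 are summed into the single ring-to-ring entry (0, 1).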
To solve the scheduling problem we must take into account the architecture of the DAVID metro
network, which introduces specific problems.
Firstly, the scheduling is subject to the following constraints:
• Each node is equipped with only one tunable transceiver; therefore it can transmit and receive at
most one packet in each multi-slot.
• Atomicity in HP request allocation: an HP request, assumed to be non-elastic, is accepted only
when it can be fully satisfied, i.e., when it is possible to allocate all the required slots; otherwise
the request is refused.
• BE requests can be partially satisfied since they are considered elastic requests.
• Transparency: an already existing connection cannot be blocked by the allocation of a new
connection.
On the other hand, the scheduling at the Hub has to avoid possible collisions between the packets
scheduled for a given node and the in-transit packets that will be received by downstream nodes on the
same ring. This also holds for packets injected in the ring after having been switched at the Hub. This
problem can be solved in two different ways:
• Algorithmically: in this case, to avoid in-transit packet interference, transmissions should be
scheduled at the Hub with knowledge of the node positions along the rings;
• By decoupling the reception and the transmission phases either in frequency or in time. In this case,
the upstream ring and the downstream ring are logically disjoint, and therefore the Hub can
schedule the requests without knowing the node positions along the rings.
Obviously, the first solution requires a more complex scheduling algorithm, while the second entails
a drastic simplification of the scheduling procedures but can lead to some bandwidth waste on the network
links due to the fixed partitioning of the resources between up-links and down-links. Nevertheless, in the
particular technological context considered here, in which the bandwidth on network links is not a bottleneck, inefficient bandwidth usage has no significant effect on either network performance or
network costs.
In the following sections we describe some solutions, with different degrees of complexity, to the
scheduling problem, highlighting both the strengths and the weaknesses of the different
approaches.
More precisely, we present:
• a formalization of the problem as an Integer Linear Programming (ILP) formulation and an
optimal algorithm based on flow maximization that may be used at the Hub to schedule node-to-node transmissions on the basis of node-to-node bandwidth requests, with frequency separation of
transmissions and receptions and no atomicity constraint (Section IV);
• a discussion of three heuristic sub-optimal scheduling approaches that try to achieve a good trade-off between performance and complexity (Section V).
IV. OPTIMUM SCHEDULING ALGORITHM
In this section we first present an ILP formulation of the multi-class resource allocation problem for
providing bandwidth-guaranteed and best-effort services. Then, we discuss the feasibility of this
approach showing that it is in general NP-hard. Later we present an optimal scheduling algorithm with
no atomicity constraint associated to the HP traffic requests which can be solved in polynomial time.
A. ILP Formulation
The following notation will be used in the ILP formulation and in the scheduling problems:
N        denotes the total number of nodes
R        denotes the number of rings
W        denotes the number of wavelengths per ring
F        denotes the number of slots in a frame
A        is a node-to-node HP traffic request matrix, whose element A_{i,j} represents the number of slots that must be transmitted in a frame from node i on ring r_i to node j on ring r_j
B        is a node-to-node BE traffic request matrix, whose element B_{i,j} represents the number of slots that must be transmitted in a frame from node i on ring r_i to node j on ring r_j
i and j  denote network nodes
x        denotes network rings
l        addresses slots inside the frame
r(i)     denotes the nodal location function which, for each node i, returns the ring on which node i is located, i.e. r_i = r(i). In the same way, r^{-1}(x) denotes the set of nodes located on ring x
The problem of optimally scheduling node-to-node transmission requests in a frame of fixed size F at
the Hub can be easily formalized as follows.
GIVEN the HP request matrix A and the BE request matrix B;
FIND a slot allocation within a fixed-length frame of F slots
SUCH THAT
• first, the number of satisfied HP traffic requests is maximized;
• then, the number of slots allocated to BE traffic is maximized.
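The multi-class scheduling algorithm can be formulated in terms of ILP. Let $t^l_{i,j}$, $s^l_{i,j}$ and $x_{i,j}$ be binary
variables defined as follows:
\[
t^l_{i,j} = \begin{cases} 1 & \text{if node } i \text{ is transmitting an HP packet to node } j \text{ in slot } l \\ 0 & \text{otherwise} \end{cases}
\]
\[
s^l_{i,j} = \begin{cases} 1 & \text{if node } i \text{ is transmitting a BE packet to node } j \text{ in slot } l \\ 0 & \text{otherwise} \end{cases}
\]
\[
x_{i,j} = \begin{cases} 1 & \text{if the HP request } i \to j \text{ has been satisfied} \\ 0 & \text{otherwise} \end{cases}
\]
Variables $t^l_{i,j}$, $s^l_{i,j}$ and $x_{i,j}$ must satisfy the following constraints:
\[ \sum_{j} \left( t^l_{i,j} + s^l_{i,j} \right) \le 1 \qquad \forall i, l \tag{1} \]
\[ \sum_{i} \left( t^l_{i,j} + s^l_{i,j} \right) \le 1 \qquad \forall j, l \tag{2} \]
\[ \sum_{i \in r^{-1}(x)} \sum_{j} \left( t^l_{i,j} + s^l_{i,j} \right) \le W \qquad \forall x, l \tag{3} \]
\[ \sum_{j \in r^{-1}(x)} \sum_{i} \left( t^l_{i,j} + s^l_{i,j} \right) \le W \qquad \forall x, l \tag{4} \]
\[ \sum_{l} t^l_{i,j} = A_{i,j} \, x_{i,j} \qquad \forall i, j \tag{5} \]
\[ \sum_{l} s^l_{i,j} \le B_{i,j} \qquad \forall i, j \tag{6} \]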
Constraints (1) and (2) enforce that each node in the network can transmit and receive at most one
packet in each time slot (since only one data transceiver is assumed to be available at each node).
Constraints (3) and (4) enforce that the aggregate number of packets transmitted or received by nodes
belonging to the same ring cannot exceed the number of available wavelengths on the ring. Constraint (5)
enforces atomicity of the allocation. Constraint (6) enforces that BE requests can be partially satisfied,
since they are considered elastic requests.
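To make constraints (1) to (4) concrete, the following Python sketch checks them for the transmissions scheduled in a single time slot. The function name check_slot and the data layout (a list of (source, destination) pairs covering HP and BE packets together, and a `ring_of` list standing for r(i)) are assumptions made for this example.

```python
def check_slot(transmissions, ring_of, num_wavelengths):
    """Return True if one slot satisfies constraints (1)-(4).

    transmissions: list of (src, dst) node pairs, HP and BE together.
    ring_of[i]: ring on which node i is located, i.e. r(i).
    """
    tx_nodes, rx_nodes = set(), set()
    tx_per_ring, rx_per_ring = {}, {}
    for src, dst in transmissions:
        # (1) and (2): one transmission and one reception per node and slot.
        if src in tx_nodes or dst in rx_nodes:
            return False
        tx_nodes.add(src)
        rx_nodes.add(dst)
        tx_per_ring[ring_of[src]] = tx_per_ring.get(ring_of[src], 0) + 1
        rx_per_ring[ring_of[dst]] = rx_per_ring.get(ring_of[dst], 0) + 1
    # (3) and (4): at most W packets sent from, or received on, each ring.
    return all(c <= num_wavelengths for c in tx_per_ring.values()) and all(
        c <= num_wavelengths for c in rx_per_ring.values()
    )
```

With six nodes on two rings (`ring_of = [0, 0, 0, 1, 1, 1]`) and W = 2, the slot `[(0, 3), (1, 4)]` is admissible, whereas `[(0, 3), (1, 4), (2, 5)]` is not, because three packets would leave ring 0 in the same slot.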
The optimal scheduling is defined as the set of $x_{i,j}$ and $s^l_{i,j}$ that maximize the following objective
function:
\[
\max \left( \sum_{i} \sum_{j} x_{i,j} + \varepsilon \sum_{i} \sum_{j} \sum_{l} s^l_{i,j} \right)
\]
where ε is a positive constant smaller than 1/(F N^2). In this way, the number of allocated HP
requests is maximized first, and then the number of slots allocated to BE traffic is maximized.
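The bound on ε can be checked with a short argument, given here as a sketch. The BE term contains F N^2 binary variables, so

\[
\varepsilon \sum_{i} \sum_{j} \sum_{l} s^l_{i,j} \;\le\; \varepsilon \, F N^2 \;<\; 1 .
\]

Accepting one additional HP request increases the first term of the objective by exactly 1, which therefore outweighs any possible change in the BE term; BE slots only break ties among allocations that satisfy the same number of HP requests.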
From the application point of view, the previous scheduling approach has the drawback that, in the
case of connection-oriented guaranteed-bandwidth traffic, it does not guarantee transparency, i.e., it
does not distinguish between new HP allocation requests and already existing allocations. Thus, an
already existing connection may be blocked by the allocation of a new connection. To avoid this
undesirable behavior and obtain a so-called transparent scheduling algorithm, we need to distinguish
between a request matrix $A^{old}$, whose element $A^{old}_{i,j}$ represents the number of slots associated with
already existing connections that must be transmitted in frame F from node i to node j (after having
removed the connections that ended in the previous frame), and a request matrix $\Delta$, whose element $\Delta_{i,j}$
represents the number of slots associated with new connections that must be scheduled in frame F from
node i to node j. Finally, let $A^{new}$ be the sum of $A^{old}$ and $\Delta$. Thus, while taking into account the
atomicity constraint, the model can be easily extended by distinguishing between the allocation of old
connections $x^{old}_{i,j}$ and the allocation of new connections $x^{new}_{i,j}$, and by slightly modifying the objective
function as follows:
\[
\max \left( \sum_{i} \sum_{j} x^{old}_{i,j} + \varepsilon \sum_{i} \sum_{j} x^{new}_{i,j} + \varepsilon^{2} \sum_{i} \sum_{j} \sum_{l} s^l_{i,j} \right)
\]
The problem as stated before can be easily proved to be NP-hard, since it reduces to a generalization
of the well-known Knapsack problem [13] when only HP traffic requests are issued. On the contrary,
the problem can be solved in polynomial time if we relax the atomicity constraint associated to the HP
traffic requests.
The fixed-frame scheduling problem without atomicity constraint can be divided into two sub-problems,
each exhibiting polynomial complexity: the F-matching problem and the Time Slot
Assignment (TSA) problem.
The F-matching problem consists in finding the maximum subset of admissible HP and BE requests
subject to the constraints listed in the formulation. The outcome of an F-matching problem is a request
matrix that can be scheduled in a frame of length F; to obtain the schedulable matrix, some of the
requests may be dropped by the F-matching algorithm.
The problem of scheduling at the Hub an admissible matrix of requests can be reduced to the TSA
problem in a Hierarchical Switching System (HSS). According to the logical representation of a HSS
proposed in [14], a HSS comprises a central non-blocking switch which interconnects groups of
transmitters and receivers (see Fig. 4). In the DAVID metro network, we can consider the Hub as a
non-blocking W R × W R switch interconnecting a set of logical inputs (each input is associated with a
wavelength channel on a specific ring) to a set of logical outputs (each output is again associated with a
wavelength channel on a specific ring). All nodes on ring k are connected to the Hub through a non-blocking multiplexing n_k × W stage, where n_k represents the number of nodes on ring k. Thus, the
problem of scheduling node-to-node communications at the Hub can be optimally solved by applying
techniques and algorithms that were developed for the scheduling of packets in hierarchical switching
systems.
Fig. 4. Hierarchical Switching System (figure omitted: MUX and DEMUX stages for rings 1, ..., R, with n_1, ..., n_R nodes and W_1, ..., W_R wavelengths, around the central WR × WR switch)
B. Optimum Scheduling Algorithm
We present in this section two optimal algorithms, one for the F-matching problem and one for the TSA
problem. Both make use of the Dinic algorithm [15] to calculate a maximum flow through a graph
associated to the DAVID metro network, as represented in Fig. 5. It is a directed graph with 2N
+ 2R + 2 vertices: a source vertex s, with R outgoing edges linking it to the R vertices that represent the
rings to which source nodes belong; R vertices (called VI_1, ..., VI_R), each one connected to as many
vertices as there are nodes on the corresponding ring; N vertices (called I_1, ..., I_N) that represent source nodes,
each one connected to every possible destination vertex by means of N edges; N destination vertices
(called O_1, ..., O_N); R vertices (called VO_1, ..., VO_R) representing the rings to which destination nodes
belong; and a sink t with R incoming edges. In the following sections we show how to use this graph
for the solution of both the F-matching and the TSA problem.
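Fig. 5. Graph associated to the DAVID metro network (figure omitted: example with R = 4 rings and N = 12 nodes, showing vertices VI_1, ..., VI_R, I_1, ..., I_N, O_1, ..., O_N and VO_1, ..., VO_R, with edge capacities W F, F and t_{i,j})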
C. Matching Algorithm
Matching problems fall in the class of single-commodity network flow problems, that can be solved
by applying a maximum flow algorithm (the Dinic algorithm). To solve the scheduling problem in the
DAVID metro network, we repeat the same algorithm twice: the first time to allocate HP requests, the
second time to allocate BE requests on the residual bandwidth.
Looking at the graph depicted in Fig. 5, we start by assigning a capacity to every graph edge. Each of
the N^2 edges connecting a source node i to a destination node j is assigned a capacity equal to the
corresponding element A_{i,j} of the HP traffic matrix A. Edges connecting each node to its corresponding ring have
capacity equal to F, and edges connecting rings to the source s or to the sink t have capacity W F. Then, the
Dinic algorithm is applied to find a maximum flow in this directed graph with the described capacities
associated to each edge. As a result we obtain a flow associated to each edge of the graph. The flows
associated to the N^2 edges connecting source and destination vertices give the values of the F-matched matrix for HP traffic.
The same algorithm is applied a second time to schedule BE traffic, on a graph whose node-to-node capacities are
given by the best-effort traffic matrix B and whose remaining capacities are reduced by the number of slots already assigned to HP traffic.
We obtain the F-matched matrix for BE traffic from the flows associated to the graph edges.
If transparency must be taken into account, i.e., we do not want new requests to block already existing
connections, we need to apply the Dinic algorithm three times instead of two: the first time to allocate
existing HP connections (the A^{old} matrix), the second time to allocate new HP connections (the A^{new}
matrix), and the third time to allocate BE traffic (the B matrix).
The overall matching complexity is O(N^{2.5} F).
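The two-pass F-matching procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions: for brevity it uses a plain breadth-first augmenting-path (Edmonds-Karp) maximum flow rather than the Dinic algorithm, and the function names, the vertex labels and the `used` argument (the matrix matched in a previous pass, deducted from node and ring capacities) are introduced only for the example.

```python
from collections import defaultdict, deque


def max_flow(cap, s, t):
    """Edmonds-Karp max flow on {(u, v): capacity}; returns flow per edge.

    Assumes the graph has no pair of antiparallel edges (true for Fig. 5).
    """
    res = defaultdict(int)  # residual capacities
    adj = defaultdict(set)
    for (u, v), c in cap.items():
        res[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[e] for e in path)
        for u, v in path:
            res[(u, v)] -= push
            res[(v, u)] += push
    return {e: c - res[e] for e, c in cap.items()}


def f_match(requests, ring_of, num_rings, W, F, used=None):
    """Return the F-matched matrix for `requests` on the residual bandwidth."""
    n = len(requests)
    used = used or [[0] * n for _ in range(n)]
    cap = {}
    for r in range(num_rings):
        tx = sum(used[i][j] for i in range(n) for j in range(n) if ring_of[i] == r)
        rx = sum(used[i][j] for i in range(n) for j in range(n) if ring_of[j] == r)
        cap[("s", ("VI", r))] = W * F - tx
        cap[(("VO", r), "t")] = W * F - rx
    for i in range(n):
        cap[(("VI", ring_of[i]), ("I", i))] = F - sum(used[i])
        cap[(("O", i), ("VO", ring_of[i]))] = F - sum(row[i] for row in used)
        for j in range(n):
            if requests[i][j] > 0:
                cap[(("I", i), ("O", j))] = requests[i][j]
    flow = max_flow(cap, "s", "t")
    return [[flow.get((("I", i), ("O", j)), 0) for j in range(n)] for i in range(n)]
```

A typical use is `hp = f_match(A, ring_of, R, W, F)` followed by `be = f_match(B, ring_of, R, W, F, used=hp)`, which mirrors the HP-then-BE order described above.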
D. Time Slot Assignment Algorithm
Once the two F-matched matrices are obtained from the algorithm presented in the previous section,
we need to allocate them in the frame. The algorithm allocates HP and BE traffic at the same time. We
create a matrix T which is the sum of the two matrices (one for HP requests and one for BE requests)
obtained from the F-matching algorithms. The algorithm works with full matrices only, i.e. with
matrices T that present the following characteristics:
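\[ \sum_{i \in I_r} \sum_{j=1}^{N} t_{i,j} = W F \qquad \forall r,\ 1 \le r \le R \]
\[ \sum_{i=1}^{N} \sum_{j \in O_s} t_{i,j} = W F \qquad \forall s,\ 1 \le s \le R \]
thus:
\[ \sum_{i=1}^{N} \sum_{j=1}^{N} t_{i,j} = W F R \]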
where I_r is the group of lines (rows) of T corresponding to nodes that are on ring r (there are R groups of
lines in the matrix). In the same way, O_s is the group of columns corresponding to nodes on ring s. If matrix T is not a full
matrix, we add dummy traffic to obtain a full matrix, named T^{full}. It can be proved that matrix T^{full} can
be allocated in F time slots, exactly like T. The aim of the algorithm is to obtain a sequence of F full
binary switching matrices. This means that each switching matrix will have exactly W R non-null
elements, that each group of lines and each group of columns will contain W non-null elements, and that the matrix will cover all critical
lines and columns (i.e., lines and columns that correspond to nodes that need to transmit or receive in
every remaining time slot).
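The full-matrix property and the notion of critical lines and columns can be illustrated with a short Python sketch. The function names and the `ring_of` list (ring of each node) are assumptions made for the example; a line or column is taken to be critical when its sum equals the number of remaining time slots.

```python
def is_full(T, ring_of, num_rings, W, F):
    """Check that every group of lines and columns of T sums to W * F."""
    n = len(T)
    for r in range(num_rings):
        line_group = sum(
            T[i][j] for i in range(n) if ring_of[i] == r for j in range(n)
        )
        column_group = sum(
            T[i][j] for j in range(n) if ring_of[j] == r for i in range(n)
        )
        if line_group != W * F or column_group != W * F:
            return False
    return True


def critical_lines(T, remaining_slots):
    """Return (lines, columns) whose sum equals the remaining slots."""
    n = len(T)
    lines = [i for i in range(n) if sum(T[i]) == remaining_slots]
    columns = [
        j for j in range(n) if sum(T[i][j] for i in range(n)) == remaining_slots
    ]
    return lines, columns
```

A node on a critical line must transmit in every remaining slot, otherwise its traffic cannot be completed within the frame; the same reasoning applies to receptions on critical columns.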
The problem of obtaining each of the F switching matrices can be seen as a problem of maximum
flow with lower bounds. Lower bounds are used to enforce a minimum flow on edges that are critical,
i.e. edges that necessarily have to be used in every time-slot and that correspond to the critical lines and
columns of the T matrix.
In a generic graph, the problem of calculating the maximum flow f with lower bound constraints can
be solved in three steps, returning respectively flows f1, f2 and f3, so that f = f1 + f2 + f3.
1. We translate the lower bound constraints into excesses and deficits. For each edge (u, v) we set
the flow on that edge equal to the lower bound (f(u, v) = l(u, v)) and we add a positive excess to v
and a negative deficit to u, equal to the lower bound (l(u, v) and -l(u, v) respectively). The flow
obtained on each graph edge gives f1.
2. We add an auxiliary source s' and an auxiliary sink t'. We connect s' to all nodes with excess by edges with
capacity equal to the excess, and connect all nodes with deficit to t' by edges with capacity
equal to the deficit. We add an infinite-capacity bi-directional edge between s and t. We then
compute a max flow f' on the resulting graph, using for instance the Dinic algorithm. f' saturates
all the edges outgoing from s' and incoming to t' if and only if a solution to the original problem
exists. In this case, f2 can be obtained by considering the components of f' that correspond to the
original graph edges. The basic idea is that, once the minimum required flow f1
corresponding to the lower bounds has been obtained, we impose a new source in the residual graph that pushes the
same flow out of the vertices that are receiving it. Conversely, the sink t' forces vertices
"in debt" to receive some flow. The infinite-capacity edges are added to make it possible to find a
solution in any case.
3. We set the capacities of the original graph equal to c(u,v) - f1(u,v) - f2(u,v). We compute a max flow f3
on this residual graph.
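Step 1 and the graph construction of step 2 can be sketched in Python as follows. The sketch only builds the auxiliary graph and does not run the maximum flow itself; the function name, the vertex labels "s'" and "t'", and the representation of edges as a dictionary of (lower bound, capacity) pairs are assumptions made for the example.

```python
from collections import defaultdict

INF = float("inf")


def lower_bound_transform(edges, s, t):
    """Build the auxiliary graph for a max flow with lower bounds.

    edges: {(u, v): (lower_bound, capacity)}.
    Returns (aux_cap, f1): capacities of the auxiliary graph and the flow
    f1 that puts exactly the lower bound on every edge.
    """
    f1 = {edge: low for edge, (low, _) in edges.items()}
    balance = defaultdict(int)  # positive: excess, negative: deficit
    aux_cap = {}
    for (u, v), (low, cap) in edges.items():
        aux_cap[(u, v)] = cap - low  # capacity left once f1 is routed
        balance[v] += low
        balance[u] -= low
    for node, b in balance.items():
        if b > 0:
            aux_cap[("s'", node)] = b  # s' feeds the vertices with excess
        elif b < 0:
            aux_cap[(node, "t'")] = -b  # vertices with deficit drain to t'
    # Infinite-capacity edges between s and t, in both directions.
    aux_cap[(t, s)] = INF
    aux_cap[(s, t)] = INF
    return aux_cap, f1
```

A maximum flow from s' to t' on `aux_cap` then gives f', and the lower bounds are feasible if and only if f' saturates every edge leaving s' and entering t'.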
We apply these three steps to the network graph described above, in which capacities have been
modified and lower bound constraints added as follows:
• edges (s, VI_i) have capacity W and lower bound W; the same holds for edges (VO_i, t);
• edges (VI_i, I_j) have unit capacity and lower bound equal to 1 if line j of the T^{full} matrix is critical,
zero otherwise;
• node-to-node edges (I_i, O_j) have capacity T^{full}_{i,j} and lower bound zero;
• edges (O_j, VO_i) have unit capacity and lower bound equal to 1 if column j of the T^{full} matrix is
critical, zero otherwise.
We apply the algorithm once; the flows obtained on node-to-node edges give the first switching
matrix. We subtract this matrix from T^{full} and repeat the process until F switching matrices have been obtained. In each
switching matrix, wherever we find a non-null element we assign it to HP traffic if there was an HP
request; otherwise we assign it to BE traffic if there was a BE request. If there were no requests
for either BE or HP traffic, the element corresponds to the added dummy traffic and may be
dropped.
The overall TSA complexity is O(N^{2.5} F).
V. HEURISTIC SCHEDULING ALGORITHMS
For very large networks, the complexity of the optimal algorithm may be too high. The scheduling
can be simplified using heuristic solutions, although their greedy behavior yields only sub-optimal
performance.
We assume that a matrix P of F × (W × R) slots is available at the Hub, where W is the number of
wavelengths per ring, F is the frame length and R is the number of rings. If a slot is reserved for an HP
or BE packet, the addresses of source and destination nodes are stored in P as shown in Fig. 6. The aim
of the heuristic algorithms is to fill this matrix so as to maximize the number of satisfied HP
requests, and then to use the unreserved bandwidth to insert as much BE traffic as possible.
[Figure: matrix P at the Hub, organized per ring (1 to R, W wavelengths each) over the F slots of
the frame. Reserved slots hold entries such as (7)-(4,2,5)HP: node 7 on ring 1 will use
wavelength 2 (indicated by the slot position in the matrix) to transmit an HP packet to node 5 on
ring 4.]
Fig. 6. An example of matrix P status at the Hub
The following heuristic solutions are considered:
1. Nodes share wavelength resources (up-link channels and down-link channels share the same
wavelengths), and the Hub knows the position of the nodes along the rings. In Section V-A,
we propose a heuristic scheduling algorithm (called First-Fit, FF) that uses a transparent,
incremental, and atomic allocation for HP traffic, and a non-transparent, elastic allocation for
BE traffic.
2. Nodes access the ring using separate sets of wavelength channels for transmission and
reception (as in Section IV). In Section V-B, we propose a heuristic scheduling algorithm
(called Frequency De-coupling, FD) that uses a transparent and atomic allocation for HP
traffic and a non-transparent allocation, based on ring-to-ring permutations, for BE traffic.
3. Nodes access the ring using only half of each frame for transmission. In Section V-C,
we propose a heuristic scheduling algorithm (called Time De-coupling, TD) that forces the
allocation of HP traffic to be transparent and atomic, and the allocation of BE traffic
to be non-transparent, elastic, and based on ring-to-ring permutations.
A. Heuristic I: First-Fit (FF)
This approach assumes that the Hub knows the position of the nodes along the rings and assigns the
resources to the HP and BE requests in a first-fit fashion, taking into account the atomicity and
transparency constraints and the interference of in-transit packets. It is an on-line algorithm: at
each point in time, it assigns resources to the current request based only on past information,
with no knowledge of future requests, in contrast with an off-line algorithm, which assumes that
the entire sequence of requests is available.
Two matrices are available at the Hub: Pnew and Pold. For the current frame, the Hub applies the Pold
permutation sequence, while Pnew is used to allocate the incoming requests and will be used for the
next frame, i.e., at the end of the frame, Pnew is copied to Pold.
When a new HP request arrives:
1. If the new request for a node pair is for fewer slots than those already allocated (i.e., an
existing connection is reduced or closed), slots are released until the slots allocated to the
node pair equal the new request. Slots are released on the basis of a round-robin scan of the
matrix Pnew.
2. If the request opens a new connection or increases an existing one, the matrix Pnew is scanned
in a round-robin way, trying to allocate all the required slots while satisfying all constraints,
i.e., no more than one packet to/from the same node, no more than W packets to/from a given ring
in the same time slot, and no interference with in-transit packets. For the latter constraint, a
slot can be assigned to the request only if it does not collide with an already allocated slot
that will be injected in the same position after having been switched at the Hub. Likewise, the
slot must also be empty after having been switched at the Hub, i.e., along its downstream path it
must not collide with the transmission of another packet. During this operation, slots allocated
to BE traffic can be preempted to allow the allocation of HP traffic, because HP packets have
higher priority than BE packets. This scheduling step ends when either all the requested slots
have been satisfied or all slots have been considered. If the request is completely satisfied, the
slots are definitively allocated; otherwise, the request is rejected and no slots are allocated.
When a new BE request arrives, the algorithm performs the same steps but the allocation of BE
traffic cannot preempt the HP allocations.
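The atomic, round-robin allocation described above can be illustrated with the following simplified sketch. It is a hypothetical model rather than the authors' implementation: the names Alloc, fits and first_fit are invented for illustration, nodes are identified by (ring, node) pairs, each time slot is a plain list of allocations, and both the in-transit interference check and the preemption of BE slots by HP requests are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alloc:
    src: tuple  # (ring, node) of the transmitter
    dst: tuple  # (ring, node) of the receiver
    cls: str    # "HP" or "BE"


def fits(slot, src, dst, W):
    """Check the per-slot constraints for a new (src, dst) packet."""
    # No more than one packet from the same source or to the same destination.
    if any(a.src == src or a.dst == dst for a in slot):
        return False
    # No more than W packets from, or to, a given ring in the same time slot.
    from_ring = sum(a.src[0] == src[0] for a in slot)
    to_ring = sum(a.dst[0] == dst[0] for a in slot)
    return from_ring < W and to_ring < W


def first_fit(P_new, src, dst, n_slots, cls, W, start=0):
    """Try to allocate n_slots atomically, scanning the frame round-robin from start."""
    F = len(P_new)
    chosen = []
    for k in range(F):
        t = (start + k) % F
        if fits(P_new[t], src, dst, W):
            chosen.append(t)
            if len(chosen) == n_slots:
                break
    if len(chosen) < n_slots:
        return False  # atomicity: the request is rejected and nothing is allocated
    for t in chosen:
        P_new[t].append(Alloc(src, dst, cls))
    return True
```

With a frame of three slots and W = 1, a two-slot request from ring 0 fills slots 0 and 1; a second two-slot request from another node of ring 0 is then rejected as a whole, because only slot 2 still satisfies the ring constraint.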
The heuristic algorithm described above is non-optimal for both HP and BE traffic, due to the
greedy approach in the allocation of network resources. Assuming that each node can issue at most
one request to each other node in each frame, the FF complexity is O(N^2 F).
B. Heuristic II: Frequency-decoupling (FD)
In order to achieve a simple, fully distributed medium access protocol at the nodes, at least for
BE traffic, it is necessary to guarantee that simultaneous transmissions by nodes located on
different rings do not conflict. If the Hub performs individual wavelength-to-wavelength
permutations, nodes located on different rings cannot orthogonalize their transmissions in a
distributed fashion (i.e., without exchanging signaling information and without pre-coordination),
since they cannot know how many packets will be directed to the destination ring, nor to a
particular destination node. When the Hub performs ring-to-ring permutations instead, a
distributed solution of receiver contentions is possible and easy to achieve, since downstream
nodes can easily acquire information on the transmission decisions made by upstream nodes on the
ring without signaling (see [9] for the case where only BE traffic is considered). For these
reasons, we prefer solutions in which the scheduling algorithm allocates, at least for BE traffic,
all the free slots (i.e., slots that have not been reserved for HP traffic) of each multi-slot
(i.e., the set of simultaneous slots on different wavelengths injected by the Hub on the up-link
channels of a ring) to transmissions directed to nodes belonging to the same ring.
Under this extra constraint, the allocation of BE slots at the Hub can be performed on a
ring-to-ring transmission request matrix B, whose element Bi,j represents the aggregate amount of
traffic that nodes on ring i must transmit to nodes on ring j (i.e., the identities of the source
and destination nodes are ignored during slot allocation at the Hub). Source and destination
contentions can then be resolved by each transmitter in a distributed fashion.
We emphasize, however, that this class of partially distributed scheduling algorithms is largely
sub-optimal and may lead to performance degradation when the number K of transceivers per node is
less than W. Assume, for example, that K = 1 and that only node n on ring i is transmitting BE
packets to nodes belonging to ring j. The Hub operates on the ring-to-ring slot request
information, ignoring node identities and receiver contentions; assume that x slots are requested
from ring i to ring j. The Hub will allocate x/W multi-slots for transmissions from ring i to
ring j. Node n, however, will be able to access only one slot per multi-slot, so that only x/W out
of x packets will be transmitted by node n.
Remember also that BE traffic must coexist with HP traffic, for which the allocation of slots must be
performed on a node-to-node request matrix basis, since receiver and transmission contentions must be
solved in the allocation phase at the Hub, in order to guarantee the effective transmission of all
required slots.
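The K = 1 example above can be checked numerically with a short sketch. The function name be_packets_delivered is hypothetical, x is assumed to be a multiple of W, and the extension to K > 1 (a node accessing at most min(K, W) slots per multi-slot) is an assumption made here for illustration; the text only discusses K = 1.

```python
def be_packets_delivered(x, W, K=1):
    """Return (multi-slots allocated by the Hub, packets a single node can send)."""
    multi_slots = x // W            # x/W multi-slots, assuming x is a multiple of W
    per_multi_slot = min(K, W)      # assumption: one slot per transceiver per multi-slot
    return multi_slots, min(x, multi_slots * per_multi_slot)


# x = 20 requested slots on W = 4 wavelengths with a single transceiver:
# 5 multi-slots are allocated but only 5 of the 20 packets are transmitted.
print(be_packets_delivered(20, 4, K=1))
```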
The proposed approach is an off-line algorithm and runs through a number of steps at the end of each
frame. Before starting the slot allocation, all the slots of matrix P that were dedicated to the BE
class in the previous frame are set free, because HP traffic is scheduled first and all resources
can be exploited by this traffic class.
HP traffic is then scheduled as follows:
1. If the new request for a node pair is for fewer slots than those already allocated (i.e., the
corresponding element in Δ is negative), slots are released until the slots allocated to the
node pair equal the new request. Slots are released on the basis of a round-robin scan of the
time frame.
2. Requests that require an increase in slot allocation are served: each empty slot is filled,
considering one request after another, if all the constraints at the Hub are satisfied, i.e., no
more than one packet to/from the same node, and no more than W packets to/from a given ring
in the same time slot. These requests (i.e., those corresponding to a positive entry in Δ) are
scanned according to a round-robin criterion, so that when an empty slot is found, it is assigned
to an unsatisfied request that meets the above constraints and refers to the node pair that follows
(according to any node pair ordering) the last pair served in the previous time slot. Slots reserved
for the HP traffic class are allocated uniformly along the frame, to avoid concentrating HP traffic
in a portion of the frame. This is achieved by scanning the slots according to the time index
first, and to the wavelength index next. This scheduling step ends when either all empty slots
have been considered or all the allocation requests have been satisfied. At the same time, the
matrix Anew is updated, so that at the end of the process each entry represents the actual amount
of resources allocated to each node pair in the current frame; it then becomes the Aold matrix
used in the next iteration of the scheduling algorithm.
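As a small illustration of how the two HP steps are driven by Δ, the sketch below splits the requests into releases and increases. It assumes that Δ is the element-wise difference between the new request matrix and Aold, which matches the sign convention used in the steps above; the dictionary representation and the name split_by_delta are hypothetical.

```python
def split_by_delta(requests, a_old):
    """Split node pairs into slots to release (negative delta) and to allocate (positive)."""
    pairs = set(requests) | set(a_old)
    delta = {p: requests.get(p, 0) - a_old.get(p, 0) for p in pairs}
    to_release = {p: -d for p, d in delta.items() if d < 0}   # handled in step 1
    to_allocate = {p: d for p, d in delta.items() if d > 0}   # handled in step 2
    return delta, to_release, to_allocate


requests = {("a", "b"): 3, ("a", "c"): 1}   # slots requested for the next frame
a_old = {("a", "b"): 5, ("c", "d"): 2}      # slots currently allocated
delta, to_release, to_allocate = split_by_delta(requests, a_old)
```

Here the pair (a, b) shrinks by two slots, the pair (c, d) is closed, and the pair (a, c) asks for one new slot.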
Slots that are not used for HP traffic are available at the next step for BE allocations. The scheduling
of BE traffic runs through two sub-steps:
1. Matrix B is scheduled, independently of HP traffic, through an iterated Critical Maximum
Weight Matching [5], to obtain a set of ring-to-ring permutations that must be fitted in the frame.
Note that the allocation for BE traffic is recomputed every time; hence no incremental allocation
is implemented, nor is transparency enforced for BE traffic.
2. In each time slot, a permutation is selected and removed from the previously computed
set of permutations. The selected permutation is the one that allows the maximum number of slots
to be allocated to BE traffic on all rings, under the constraint that no more than W packets
(either HP or BE) are directed towards the destination ring specified by the selected permutation.
To meet the latter constraint, some slots may remain unused, possibly leading to throughput
degradation.
The complexity of this algorithm is O(N^2 F).
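The second BE sub-step of FD (choosing, in each time slot, the permutation that admits the most BE slots) can be sketched as follows. This is one possible reading of the rule, not the authors' code: a permutation is a tuple where perm[i] is the destination ring of source ring i, hp_from[i] and hp_to[j] count the HP packets already scheduled in the slot from ring i and towards ring j, and the names be_slots and pick_permutation are hypothetical.

```python
def be_slots(perm, hp_from, hp_to, W):
    """BE slots usable in one time slot if ring i sends all its free slots to ring perm[i]."""
    total = 0
    for i, j in enumerate(perm):
        free_on_source = W - hp_from[i]    # slots of the multi-slot not taken by HP
        room_at_destination = W - hp_to[j] # keeps HP + BE packets towards ring j within W
        total += min(free_on_source, room_at_destination)
    return total


def pick_permutation(perms, hp_from, hp_to, W):
    """Select and remove the permutation that maximizes the BE slots in this time slot."""
    best = max(perms, key=lambda p: be_slots(p, hp_from, hp_to, W))
    perms.remove(best)
    return best


perms = [(0, 1), (1, 0)]   # two ring-to-ring permutations for R = 2 rings
hp_from, hp_to = [1, 0], [0, 3]
best = pick_permutation(perms, hp_from, hp_to, W=4)
```

In this example the identity permutation admits 4 BE slots while the swap admits 5, so the swap is selected and removed from the set; the slots lost to the min() are the unused slots mentioned in the text.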
C. Heuristic III: Time-decoupling (TD)
This approach uses the same algorithm as FD but logically separates the upstream and downstream
rings in time. Time is organized in Round-Trip Times (RTTs, i.e., the time needed by a slot to
circulate around a ring, assumed to be the same for all rings), and nodes can use only one RTT
out of every two to transmit packets. In this way, each transmitted packet is switched at the Hub
and injected into the downstream ring during an RTT in which transmission is not allowed, avoiding
any in-transit interference. Moreover, nodes can exploit space reuse (i.e., transmission and
reception on the same ring), since reception is always allowed.
As before, the complexity of this algorithm is O(N^2 F).
D. Fairness control of BE traffic
The distributed access of BE traffic can exhibit bandwidth unfairness under unbalanced traffic,
since upstream nodes generally have better access opportunities than downstream nodes. This is
not the case for HP traffic, since entry policing at the nodes and connection acceptance at the
Hub can control the HP traffic load.
In order to enforce throughput fairness for BE traffic, a credit-based scheme, such as MetaRing,
discussed in [16] for a single ring, can be extended to multi-ring networks.
In the MetaRing network, a control signal, called SAT, is circulated in store-and-forward mode from
node to node along the ring. In our architecture, several rings exist. Therefore a multi-SAT mechanism
(one SAT for each pair of rings, which leads to R^2 SATs) is needed. When a node on ring i receives
SATij, it is granted permission to send Q (transmission quota) BE packets to ring j; if it has no
more packets, or Q packets have been sent since the last SATij reception, SATij is forwarded along
the ring. The SAT information is also carried in the control channel associated with the
multi-slots. A detailed description of this mechanism is included in [9] and its effectiveness is
demonstrated in [3].
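The quota rule for a single SATij can be illustrated with the minimal sketch below. It models only the per-node counter for one destination ring j and is an assumption-laden illustration: the class BENode and its methods are hypothetical, the SAT is assumed to be held by a node until the forwarding condition is met, and the control-channel signaling described in [9] is not modeled.

```python
class BENode:
    """BE transmitter state of one node for a single destination ring j."""

    def __init__(self, quota):
        self.quota = quota   # Q, the transmission quota granted per SAT reception
        self.sent = 0        # BE packets sent to ring j since the last SAT reception
        self.backlog = 0     # BE packets queued for ring j

    def send(self):
        """Send one BE packet if the queue is non-empty and the quota is not exhausted."""
        if self.backlog == 0 or self.sent >= self.quota:
            return False
        self.backlog -= 1
        self.sent += 1
        return True

    def on_sat(self):
        """Return True if the SAT is forwarded (and the quota renewed), False if held."""
        if self.backlog == 0 or self.sent >= self.quota:
            self.sent = 0
            return True
        return False


node = BENode(quota=2)
node.backlog = 3
sends = [node.send(), node.send(), node.send()]  # third attempt exceeds the quota
forwarded = node.on_sat()                        # quota used up: SAT is forwarded
```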
VI. PERFORMANCE EVALUATION
A. Simulation scenario
In order to evaluate the performance of the solutions proposed above, we set up a simulator
with R = 4 rings and n = 10 nodes per ring (N = 40 nodes in total), with the nodes on each ring
sharing W = 4 wavelengths. We set the slot size to Ps = 1 μs, the ring round-trip time to
RTT = 0.5 ms (i.e., the propagation delay on each ring is 500 times the slot duration), and the
frame duration to F = 10 ms (20 RTT). The queues at the nodes are assumed long enough that no
packets are lost.
We consider several traffic patterns: uniform, diagonal, client-server, power-of-ten, and very
unbalanced. To describe these traffic patterns, we use a matrix K, of size R×R, where Ki,j is a real
number between 0 and 1 that represents the fraction of traffic generated on ring i towards
ring j with respect to the total network load. For the five considered traffic patterns the matrix K is:
uniform:
\[ K = \begin{pmatrix}
1/4 & 1/4 & 1/4 & 1/4 \\
1/4 & 1/4 & 1/4 & 1/4 \\
1/4 & 1/4 & 1/4 & 1/4 \\
1/4 & 1/4 & 1/4 & 1/4
\end{pmatrix} \]
diagonal:
\[ K = \begin{pmatrix}
7/10 & 1/10 & 1/10 & 1/10 \\
1/10 & 7/10 & 1/10 & 1/10 \\
1/10 & 1/10 & 7/10 & 1/10 \\
1/10 & 1/10 & 1/10 & 7/10
\end{pmatrix} \]
client-server:
\[ K = \begin{pmatrix}
3/4 & 0 & 0 & 1/4 \\
0 & 3/4 & 0 & 1/4 \\
0 & 0 & 3/4 & 1/4 \\
1/4 & 1/4 & 1/4 & 1/4
\end{pmatrix} \]
power-of-ten:
\[ K = \frac{1}{1111} \begin{pmatrix}
1 & 10 & 100 & 1000 \\
1000 & 1 & 10 & 100 \\
100 & 1000 & 1 & 10 \\
10 & 100 & 1000 & 1
\end{pmatrix} \]
very unbalanced:
\[ K = \begin{pmatrix}
1/2 & 0 & 0 & 0 \\
1/2 & 1/10 & 1/3 & 1/15 \\
0 & 0 & 1/3 & 0 \\
0 & 0 & 1/3 & 0
\end{pmatrix} \]
The traffic load is always distributed uniformly among the nodes of each ring.
In all the scenarios, packets are generated at nodes according to a Bernoulli distribution whose
average is derived from the weight matrix described above, and the packet size is matched to the
slot size. How do you generate the HP and BE requests?
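One way to read the Bernoulli generation rule is sketched below. The normalization is an assumption, since the text does not specify it: in each slot, a packet towards ring j is generated with probability load × Ki,j, where load is the offered load and the row of K belongs to the source ring. The function name arrival is hypothetical, and the uniform random draw u is passed in explicitly so that the example is deterministic.

```python
import random


def arrival(k_row, load, u):
    """Return the destination ring of the packet generated in this slot, or None.

    k_row: row of K for the source ring; load: offered load in [0, 1];
    u: uniform random number in [0, 1).
    """
    acc = 0.0
    for j, weight in enumerate(k_row):
        acc += load * weight   # assumed per-slot probability of a packet towards ring j
        if u < acc:
            return j
    return None                # no packet is generated in this slot


diagonal_row = [0.7, 0.1, 0.1, 0.1]   # first row of the diagonal pattern
rng = random.Random(1)
destination = arrival(diagonal_row, load=0.5, u=rng.random())
```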
B. Simulation results
In all the following figures, we plot the throughput (ratio between used and available slots) for each
destination ring reachable from ring 0. Note that all the points of the plots are steady-state values
obtained from statistically significant measurements in the simulations. Also note that, although we
plot the throughput for a single ring (ring 0), the same behavior holds for all the other rings due to
traffic symmetries. The only exception is the very unbalanced traffic pattern, for which we plot the
rings of interest.
In Fig. 7 we report the throughput from ring 0 to each destination ring and the overall normalized
network throughput as a function of the offered load, under diagonal traffic. The curves for ring 0 to
ring 1, ring 0 to ring 2, and ring 0 to ring 3 overlap. In these simulations only HP traffic is present in
the network, to demonstrate that the scheduling algorithms can offer enough resources to HP traffic, as
long as the network is not overloaded. Note that some degree of fairness between competing traffic
flows is enforced: when the network load grows, the amount of admitted ring 0 to ring 0 traffic is
reduced to equalize the amount of traffic transmitted from each ring. (Some comparison comments)
Fig. 7. Throughput as a function of HP traffic offered load under diagonal traffic pattern,
comparing a) Optimal, b) FF, c) FD, and d) TD solutions
We then analyze network performance when both traffic classes are present, under all the
proposed traffic patterns. Figs. 8 to 12 show the throughput as a function of the fraction
of HP traffic present in the network, with the total offered load always equal to 1. In other words,
when the HP traffic load on the horizontal axis of the figures is equal to 0.2, the BE traffic load
equals 0.8. In all figures we show the throughput from ring 0 to each destination ring for HP traffic
(white markers), the total HP throughput (black square markers), and the total throughput on that
ring (solid line without markers). Furthermore, the figures depict the maximum theoretical
throughput reachable by BE traffic (dashed line) as a reference for the actual throughput achieved
by BE traffic (black rhombus markers).
Comments on results.
Fig. 8. Throughput as a function of HP traffic relative load percentage at total load of 1 under uniform
traffic pattern, comparing a) Optimal, b) FF, c) FD, and d) TD solutions
Fig. 9. Throughput as a function of HP traffic relative load percentage at total load of 1 under diagonal
traffic pattern, comparing a) Optimal, b) FF, c) FD, and d) TD solutions
Fig. 10. Throughput as a function of HP traffic relative load percentage at total load of 1 under
client-server traffic pattern, comparing a) Optimal, b) FF, c) FD, and d) TD solutions
Fig. 11. Throughput as a function of HP traffic relative load percentage at total load of 1 under
power-of-ten traffic pattern, comparing a) Optimal, b) FF, c) FD, and d) TD solutions
Fig. 12. Throughput as a function of HP traffic relative load percentage at total load of 1 under very
unbalanced traffic pattern, comparing a) Optimal, b) FF, c) FD, and d) TD solutions
VII. CONCLUSION (TO BE FILLED IN)
ACKNOWLEDGMENT
This work was partially funded by the European Commission under the IST DAVID project (IST
1999-11742). This work was also partially funded by MCYT (Spanish Ministry of Science and
Technology) under contract FEDER-TIC2002-04344-C02-02.
REFERENCES
[1] B. Rajagopalan, D. Pendarakis, D. Saha, R. S. Ramamoorthy, and K. Bala, “IP over optical
networks: architectural aspects”, IEEE Commun. Mag., vol. 38, no. 9, Sept. 2000, pp. 94-102.
[2] S. Spadaro, J. Solé-Pareta, D. Careglio, K. Wajda, and A. Szymanski, “Positioning of RPR
standard in contemporary operators’ environment”, submitted to IEEE Network.
[3] L. Dittmann et al., “The European IST project DAVID: a viable approach towards optical packet
switching”, to be published in IEEE J. Select. Areas Commun.
[4] W. Stallings, Local and Metropolitan Area Networks, Sixth Edition, Prentice Hall, 2000.
[5] B. Hajek, and T. Weller, “Scheduling non-uniform traffic in a packet-switching system with small
propagation delay”, IEEE/ACM Trans. Networking, vol. 5, no. 6, Dec. 1997, pp. 813-823.
[6] T. Inukai, “An efficient SS/TDMA time slot assignment algorithm”, IEEE Trans. Commun., vol.
27, no. 10, Oct. 1979, pp. 1449-1455.
[7] A. C. Kam, and K.-Y. Siu, “Supporting bursty traffic with bandwidth guarantee in WDM
distributed networks”, IEEE J. Select. Areas Commun., vol. 18, no. 10, Oct. 2000, pp. 2029-2040.
[8] A. Bianco, E. Leonardi, M. Mellia, and F. Neri, “Network controller design for SONATA – a
large-scale all-optical passive network”, IEEE J. Select. Areas Commun., vol. 18, no. 10, Oct.
2000, pp. 2017-2028.
[9] A. Bianco, G. Galante, E. Leonardi, and F. Neri, “Measurement Based Resource Allocation for
Interconnected WDM Rings”, Photonic Network Communications, vol. 5, no. 1, January 2003,
pp. 5-22.
[10] C. S. Chang, W. J. Chen, and H. Y. Huang, “Birkhoff-von Neumann input buffered crossbar
switches”, in Proc. IEEE INFOCOM 2000, Tel Aviv, Israel, Mar. 2000, pp. 1614-1623.
[11] M. Karol, M. Hluchyj, and S. Morgan, “Input versus output queueing on a space division switch”,
IEEE Trans. Commun., vol. 35, no. 12, Dec. 1987, pp. 1347-1356.
[12] N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand, “Achieving 100% throughput in
an input-queued switch”, IEEE Trans. Commun., vol. 47, no. 8, Aug. 1999, pp. 1260-1267.
[13] C. H. Papadimitriou, and K. Steiglitz, Combinatorial optimization: algorithms and complexity,
Dover 1998.
[14] A. Varma, and S. Chalasani, “An incremental algorithm for TDM switching assignments in
satellite and terrestrial networks”, IEEE J. Select. Areas Commun., vol. 10, no. 2, Feb. 1992, pp.
364-377.
[15] R.E. Tarjan, Data structures and network algorithms, Society for Industrial and Applied
Mathematics, Pennsylvania, November 1988.
[16] I. Cidon, and Y. Ofek, “MetaRing – A full-duplex ring with fairness and spatial reuse”, IEEE
Trans. Commun., vol. 41, no. 1, Jan. 1993, pp. 110-120.