Multi-class scheduling algorithms for supporting QoS in an interconnected WDM rings network

A. Bianco1, D. Careglio2, J. Finochietto1, G. Galante1, E. Leonardi1, F. Neri1, J. Solé-Pareta2, S. Spadaro2
1 Dipartimento di Elettronica, Politecnico di Torino, Italy (e-mail: {bianco, finochietto, galante, leonardi, neri}@mail.tlc.polito.it)
2 Advanced Broadband Communications Centre (CCABA), Universitat Politècnica de Catalunya, Spain (e-mail: {careglio,pareta}@ac.upc.es, [email protected])

Abstract: This paper addresses the interconnected WDM rings metro network proposed within the IST DAVID project. The DAVID metro network consists of several WDM rings interconnected via an optical switch called Hub; the Hub provides optical interconnection among rings using a scheduling algorithm based on a fixed-size frame, while nodes belonging to the same ring access the same set of shared resources using a MAC protocol. In this paper we address the problem of designing efficient scheduling algorithms for supporting multi-class traffic services. In particular, two categories of traffic are considered, namely best-effort traffic and high-priority traffic with bandwidth-guaranteed service. We formally define the scheduling at the Hub as an ILP problem, showing that, in general, it is NP-hard. We propose a complex optimal algorithm based on flow maximization and three simpler heuristic algorithms to solve the scheduling problem. The performance of the proposed algorithms is evaluated by simulation, and the results are discussed to assess the merits of each solution.

Keywords: optical metro networks, WDM ring, QoS provisioning, MAC protocol, scheduling algorithm.

I. INTRODUCTION

Networking is currently experiencing explosive growth in the service layer, which places heavy demands on the transport layer and results in a complex, multi-layer structure.
This situation is challenging the capacity limits of the existing transport infrastructure based on SONET/SDH and ATM. Therefore, in new telecommunication networks optical technology is expected to play a stronger role, not only in increasing transmission capacity but also in networking operation. In fact, optics is an attractive solution since it can provide a direct IP-over-WDM platform with less complexity in resource management and allocation algorithms [1]. On the other hand, the need of Network Operators to reduce investment and operating costs (CAPEX and OPEX) is one of the major drivers moving public network infrastructures towards optical shared-medium networks, in particular in the metro environment [2].

In this direction, the IST DAVID (Data And Voice Integration over DWDM) project [3], funded by the European Commission, aims at designing an optical packet-switched network by developing networking concepts and technologies for future optical networks. The DAVID network architecture encompasses both metropolitan and long-haul geographical scales.

Fig. 1. DAVID metro network architecture

At the metropolitan scale, the DAVID network consists of several uni-directional WDM rings interconnected in a star topology by an optical switch called Hub, as shown in Fig. 1. On each fiber, a fixed number of wavelengths is available through WDM partitioning. Rings can either be physically disjoint (i.e., run on different fibers), or be logically obtained by partitioning the optical bandwidth of one fiber into disjoint portions. In the remainder of the paper we use the term ring to identify a logical ring; any reference to physical rings will be explicit. Nodes belonging to the same ring access the same set of shared resources using a Medium Access Control (MAC) protocol.
No packet buffering in the optical domain is used in the metro network; buffering is done only in electronics at access nodes. The Hub provides optical interconnection among rings and to one (or more) optical packet routers in the Backbone. Being bufferless, the Hub behaves like a space switch between rings. Ring interconnections are dynamically modified at the Hub following a scheduling algorithm based on a fixed-size frame. The aim of the scheduling algorithm is to provide node pairs with an amount of bandwidth close to their instantaneous (short-term) bandwidth requirements. The scheduling is based on explicit bandwidth requests issued by nodes and periodically collected at the Hub to create traffic request matrices.

Since it is reasonable to expect that multimedia and interactive applications will be a dominant part of ISP services in future metro networks, providing QoS is becoming increasingly important. A simple way to provide QoS is, as in legacy LANs and MANs, to introduce different traffic priorities [4]. A drawback of this strategy is that if high-priority traffic is not shaped, policed or otherwise controlled (e.g., through some kind of CAC), it has absolute priority over low-priority (e.g., best-effort) traffic. To avoid this situation in public networks, bandwidth management functions are required. Network Operators need to be able to control the amount of high-priority traffic injected in the (optical) MAN segments of their network infrastructures [2].

In this context, this paper addresses the problem of designing a scheduling algorithm at the Hub (a focal point for network management) for supporting multi-class traffic. In particular, two service categories are considered, namely a conventional Best-Effort (BE) service, and an on-demand (subject to an admission control policy) bandwidth and delay service to support High-Priority (HP) traffic for applications such as Internet telephony, video broadcasting, and video-conferencing.
The rest of the paper is organized as follows. We first provide a more detailed description of the metro network architecture (Section II). In Section III, we discuss the scheduling problem. Then, we describe an optimal scheduling algorithm with polynomial computational complexity for a particular logical configuration of the metro network (Section IV). Three heuristic solutions that aim at reducing the computational complexity are described in Section V. Finally, we present simulation results to assess the performance of the proposed schemes (Section VI). Section VII concludes the paper.

II. DAVID METRO NETWORK ARCHITECTURE

In this Section we describe the network architecture that serves as the basis for our work. In particular, we focus on the assumptions required to support multi-class traffic. We do not tackle issues related to the feasibility of the metro network in this paper, nor do we discuss the components that should be used or the physical limitations that should be taken into account when dimensioning the network. All these issues have been analyzed in depth in the DAVID project and are discussed in [3].

The number of rings in the metro network is denoted by R; the number of nodes in each ring, assumed to be equal on each ring for simplicity, is denoted by n; and the total number of nodes is N = R × n. While the number of wavelengths on each ring can be different, we assume in this paper that it is equal for all rings and denote it by W. We also assume that all the nodes of a ring can transmit and receive on any wavelength used in that ring. The latter is an essential assumption, since the scheduling would be much more complex if nodes had limited tunability on the wavelengths of the ring they belong to.

Ring resources are shared by the nodes of the metro network using a statistical time/wavelength/space division scheme.
Indeed, each wavelength is time slotted (TDM), with a slot duration in the range 500 ns to 1 µs; several slots are simultaneously transmitted through wavelength division (WDM); and rings can be disjoint in space (SDM). Time slots are aligned on all wavelengths of the same ring, so that a multi-slot is available at each node in each time slot. Slot misalignments among different rings are resolved optically at the Hub; we assume for simplicity that the propagation delay on each ring is an integer multiple of the slot size. Slots are organized in fixed-size frames of length F slots.

One wavelength (hence one slot in each multi-slot) is devoted to the network control channel. We assume that this control channel can be read and written by all nodes independently of their data transmissions and receptions in other slots of the multi-slot. The control information contained in a multi-slot refers to the data slots of the same multi-slot; a delay is added to the optical data path at each node to process the information contained in the control slot.

A. The role of the Hub

The Hub is a bufferless optical space switch with R×W input and R×W output ports. Wavelengths are (dynamically) assigned to ring-to-ring interconnections by the Hub on a slot-by-slot basis: a packet received from an input ring on a given wavelength can be forwarded to any output ring on any available wavelength. Being all-optical, the Hub includes only a space switching stage, a wavelength conversion stage, and a WDM synchronization stage; 3R regeneration may be added if necessary.

The Hub operates a permutation from input ports to output ports in every time slot, as depicted in Fig. 2 for the case of two rings and three wavelengths. The permutation is known for each time slot in each ring: the Hub labels the multi-slots (using the control slot) to identify the destination (ring, wavelength) tuples for the next time they return to the Hub.
In this way, nodes can fill the empty slots knowing the scheduled destination (ring, wavelength) tuple. The computation of the sequence of permutations operated by the Hub is a scheduling problem [5] [6], as shown in Fig. 2. Since we assume that the number of wavelengths is the same in each ring, no congestion can occur at the Hub: each incoming multi-slot can be forwarded to the Hub outputs. The Hub may have to perform wavelength conversion when the wavelengths used in the input ring are different from those used in the output ring.

Fig. 2. Scheduling problem at the Hub

Several approaches can be envisaged to solve this scheduling problem, ranging from complex optimization to simple heuristics [7] [8], generally driven by an estimate of the traffic pattern as in [9]-[11]. This estimate can be based upon a priori knowledge of the traffic pattern, measurements of the traffic load, or reservation requests issued by nodes. This problem is discussed in depth in Section III.

B. Architecture of the nodes

We assume that the nodes are equipped with one fixed transceiver, to operate on the control channel, and one tunable transceiver, to send and receive packets; thus a node can only transmit and receive data on one wavelength in each multi-slot. Nodes must also have a selective erasure capability in order to perform destination stripping, which allows space reuse. For the sake of simplicity, we assume that all tuning latencies are negligible compared to the slot time.

A MAC protocol arbitrates the access of nodes to the slots, regulating both the time and the wavelength dimensions. Information on the status of each slot of a multi-slot is reflected in the control slot.
Therefore, a node that has packets to send must monitor the control slot and, at the same time, look for any instance of its address for possible receptions. As a consequence, only the in-transit information of the control channel is converted to the electrical domain for processing at nodes; the rest of the information remains in the optical domain along the entire source-destination node path.

In the electrical domain, to avoid the head-of-line blocking phenomenon [11], we assume that packets are organized per traffic class and stored in per-destination-node queues, similar to the Virtual Output Queue architecture adopted in Input-Queued switches [12].

C. Network behavior

Given the Hub behavior, each multi-slot traverses a sequence of rings, e.g. as illustrated in Fig. 3, where Roman numerals indicate successive positions of the multi-slot. For the sake of simplicity, in this figure we assume that all slots of the multi-slot have the same destination ring. The upper slot is the control slot, where the destination ring of the multi-slot is written, and the numbers within the slots represent destination nodes. Nodes of ring x transmit data to be received by nodes of ring y (Steps II to IV). Ring x can be viewed as the upstream ring, where transmissions occur, while ring y can be viewed as the downstream ring, where receptions occur. Note however that when the considered multi-slot traverses the downstream ring y (Steps VI to VIII), it gathers transmissions for the next ring, say ring z, so that a ring traversal can be viewed as a downstream path for transmissions in the previous ring, and as an upstream path for receptions in the following ring.

Fig. 3.
Multi-slot forwarding in the metro network

To summarize, neither packet buffering nor packet switching in the time domain is done at the Hub, and, therefore, no packet losses occur in the optical domain.

III. SCHEDULING PROBLEM FORMALIZATION

We consider two classes of traffic: high-priority (HP) traffic, with guaranteed bandwidth, and best-effort (BE) traffic. We aim to provide bandwidth reservation on a per-session (i.e., flow, virtual circuit, etc.) basis. The bandwidths allocated to both HP and BE traffic are based on explicit bandwidth requests issued by nodes and received at the Hub through the signaling channel. Thus, two traffic request matrices are available at the Hub, one for each traffic type: matrix A for HP requests and matrix B for BE requests. HP traffic has higher priority than BE traffic: therefore, the scheduling must satisfy all HP requests first, up to the limit of bandwidth availability, and then BE requests.

The scheduling is based on a fixed-size frame of length F slots, which is considered the most suitable solution for a system with guaranteed bandwidth allocation: in fact, with a fixed frame size a reservation issued in terms of bit/s can easily be translated into slots per frame. Each request matrix stores the number of slots that must be transmitted from a node to any other node within a frame. Only in the heuristic approach, to simplify the scheduling, do we schedule ring-to-ring request matrices for BE traffic; the ring-to-ring request matrix can be easily derived from the larger node-to-node request matrix.

To solve the scheduling problem we must take into account the architecture of the DAVID metro network, which introduces specific problems. Firstly, the scheduling is subject to the following constraints:
Each node is equipped with only one tunable transceiver; therefore it can transmit and receive at most one packet in each multi-slot.
Atomicity of HP request allocation: an HP request, assumed to be non-elastic, is accepted only when it can be fully satisfied, i.e. when it is possible to allocate all the required slots; otherwise the request is refused. BE requests can be partially satisfied since they are considered elastic requests.
Transparency: an already existing connection cannot be blocked by the allocation of a new connection.

Secondly, the scheduling at the Hub has to avoid possible collisions between the packets scheduled for a given node and the in-transit packets that will be received by downstream nodes on the same ring. This holds also for packets injected in the ring after having been switched at the Hub. This problem can be solved in two different ways:
Algorithmically: in this case, to avoid in-transit packet interference, transmissions should be scheduled at the Hub with knowledge of node positions along the rings;
Decoupling the reception and the transmission phases either in frequency or in time: in this case, the upstream ring and the downstream ring are logically disjoint and therefore the Hub can schedule the requests without knowing the node positions along the rings.

Obviously, the first solution requires a more complex scheduling algorithm, while the second drastically simplifies the scheduling procedures but can lead to some bandwidth waste on the network links due to the fixed partitioning of resources between up-links and down-links. Nevertheless, in the technological context considered here, in which link bandwidth is not a bottleneck, inefficient bandwidth usage has no significant effect on either network performance or network costs.

In the following sections we describe some solutions to the scheduling problem, with different degrees of complexity, highlighting both the strengths and the weaknesses of the different approaches.
More precisely we present:
a formalization of the problem as an Integer Linear Programming (ILP) formulation, and an optimal algorithm based on flow maximization that may be used at the Hub to schedule node-to-node transmissions on the basis of node-to-node bandwidth requests, with frequency separation of transmissions and receptions and no atomicity constraint (Section IV);
a discussion of three heuristic sub-optimal scheduling approaches that try to achieve a good tradeoff between performance and complexity (Section V).

IV. OPTIMUM SCHEDULING ALGORITHM

In this section we first present an ILP formulation of the multi-class resource allocation problem for providing bandwidth-guaranteed and best-effort services. Then, we discuss the feasibility of this approach, showing that the problem is in general NP-hard. Finally, we present an optimal scheduling algorithm for the case with no atomicity constraint on HP traffic requests, which can be solved in polynomial time.

A. ILP Formulation

The following notation will be used in the ILP formulation and in the scheduling problems:
N denotes the total number of nodes
R denotes the number of rings
W denotes the number of wavelengths per ring
F denotes the number of slots in a frame
A is a node-to-node HP traffic request matrix, whose element $A_{i,j}$ represents the number of slots that must be transmitted in a frame from node i on ring $r_i$ to node j on ring $r_j$
B is a node-to-node BE traffic request matrix, whose element $B_{i,j}$ represents the number of slots that must be transmitted in a frame from node i on ring $r_i$ to node j on ring $r_j$
i and j denote network nodes
x denotes network rings
l addresses slots inside the frame
r(i) denotes the nodal location function which, for each node i, returns the ring on which node i is located, i.e. $r_i = r(i)$.
In the same way, $r^{-1}(x)$ denotes the set of nodes located on ring x.

The problem of optimally scheduling node-to-node transmission requests at the Hub in a frame of fixed size F can be formalized as follows.
GIVEN the HP request matrix A and the BE request matrix B;
FIND a slot allocation within a fixed-length frame F
SUCH THAT first, the number of satisfied HP traffic requests is maximized; then, the number of slots allocated to BE traffic is maximized.

The multi-class scheduling problem can be formulated in terms of ILP. Let $t^l_{i,j}$, $s^l_{i,j}$ and $x_{i,j}$ be binary variables defined as follows:

$$t^l_{i,j} = \begin{cases} 1 & \text{if node } i \text{ is transmitting an HP packet to node } j \text{ in slot } l \\ 0 & \text{otherwise} \end{cases}$$

$$s^l_{i,j} = \begin{cases} 1 & \text{if node } i \text{ is transmitting a BE packet to node } j \text{ in slot } l \\ 0 & \text{otherwise} \end{cases}$$

$$x_{i,j} = \begin{cases} 1 & \text{if HP request } i \to j \text{ has been satisfied} \\ 0 & \text{otherwise} \end{cases}$$

Variables $t^l_{i,j}$, $s^l_{i,j}$ and $x_{i,j}$ must satisfy the following constraints:

$$\sum_{j} \left( t^l_{i,j} + s^l_{i,j} \right) \le 1 \quad \forall i, l \tag{1}$$

$$\sum_{i} \left( t^l_{i,j} + s^l_{i,j} \right) \le 1 \quad \forall j, l \tag{2}$$

$$\sum_{i \in r^{-1}(x)} \sum_{j} \left( t^l_{i,j} + s^l_{i,j} \right) \le W \quad \forall x, l \tag{3}$$

$$\sum_{j \in r^{-1}(x)} \sum_{i} \left( t^l_{i,j} + s^l_{i,j} \right) \le W \quad \forall x, l \tag{4}$$

$$\sum_{l} t^l_{i,j} = A_{i,j} \, x_{i,j} \quad \forall i, j \tag{5}$$

$$\sum_{l} s^l_{i,j} \le B_{i,j} \quad \forall i, j \tag{6}$$

Constraints (1) and (2) enforce that each node in the network can transmit and receive at most one packet in each time slot (since only one data transceiver is assumed to be available at each node). Constraints (3) and (4) enforce that the aggregate number of packets transmitted or received by nodes belonging to the same ring cannot exceed the number of available wavelengths on the ring. Constraint (5) enforces atomicity of the HP allocation. Constraint (6) states that BE requests can be partially satisfied, since they are considered elastic requests.

The optimal scheduling is defined as the set of $x_{i,j}$ and $s^l_{i,j}$ that maximize the following objective function:

$$\max \; \sum_{i} \sum_{j} x_{i,j} + \varepsilon \sum_{i} \sum_{j} \sum_{l} s^l_{i,j}$$

where ε is a positive constant smaller than $1/(F N^2)$. Since at most $F N^2$ BE slots can be allocated, the BE term is always smaller than 1; in this way, the number of allocated HP requests is maximized first, and then the number of slots allocated to BE traffic is maximized.
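As an illustration, constraints (1)-(6) and the objective function can be checked on a candidate frame with a few lines of Python. This is only a sketch under stated assumptions: the encoding of a schedule as a list of slots, each holding (source, destination, class) tuples, and the names `is_feasible` and `objective` are hypothetical and not part of the original formulation. The code verifies a given allocation; it does not solve the ILP.

```python
from collections import Counter


def is_feasible(schedule, ring_of, W):
    """Check constraints (1)-(4) on a candidate frame.

    schedule: list of F slots; each slot is a list of (src, dst, cls)
              tuples with cls in {"HP", "BE"} (hypothetical encoding).
    ring_of:  dict mapping each node to its ring, i.e. the function r(i).
    W:        number of wavelengths per ring.
    """
    for slot in schedule:
        tx = Counter(src for src, _, _ in slot)
        rx = Counter(dst for _, dst, _ in slot)
        ring_tx = Counter(ring_of[src] for src, _, _ in slot)
        ring_rx = Counter(ring_of[dst] for _, dst, _ in slot)
        if any(c > 1 for c in tx.values()):       # (1) one transmission per node
            return False
        if any(c > 1 for c in rx.values()):       # (2) one reception per node
            return False
        if any(c > W for c in ring_tx.values()):  # (3) at most W packets sent per ring
            return False
        if any(c > W for c in ring_rx.values()):  # (4) at most W packets received per ring
            return False
    return True


def objective(schedule, A, B, eps):
    """Return the objective value, or None if (5) or (6) is violated.

    A, B: dicts mapping (src, dst) to the requested slots per frame.
    Assumption: only pairs with a positive HP request count as satisfied.
    """
    hp = Counter((s, d) for slot in schedule for s, d, c in slot if c == "HP")
    be = Counter((s, d) for slot in schedule for s, d, c in slot if c == "BE")
    # (5) atomicity: an HP pair receives either all requested slots or none.
    if any(n != A.get(pair, 0) for pair, n in hp.items()):
        return None
    # (6) elasticity: BE pairs may receive at most what they requested.
    if any(n > B.get(pair, 0) for pair, n in be.items()):
        return None
    satisfied = sum(1 for pair, n in A.items() if n > 0 and hp[pair] == n)
    return satisfied + eps * sum(be.values())
```

With N = 4 nodes and F = 2 slots, any `eps` below 1/(F N²) = 1/32 keeps the BE term below the weight of a single HP request.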
From the application point of view, the previous scheduling approach has the drawback that, in the case of connection-oriented guaranteed-bandwidth traffic, it does not guarantee transparency, i.e. it does not distinguish between new HP allocation requests and already existing allocations. Thus, an already existing connection may be blocked by the allocation of a new connection. To avoid this undesirable behavior and obtain a so-called transparent scheduling algorithm, we need to distinguish between a request matrix $A^{old}$, whose element $A^{old}_{i,j}$ represents the number of slots associated with already existing connections that must be transmitted in a frame from node i to node j (after having removed the connections that ended in the previous frame), and a request matrix Δ, whose element $\Delta_{i,j}$ represents the number of slots associated with new connections that must be scheduled in a frame from node i to node j. Finally, let $A^{new}$ be the sum of $A^{old}$ and Δ. Thus, while taking into account the atomicity constraint, the model can be easily extended by distinguishing between the allocation of old connections $x^{old}_{i,j}$ and the allocation of new connections $x^{new}_{i,j}$, and by slightly modifying the objective function as follows:

$$\max \; \sum_{i} \sum_{j} x^{old}_{i,j} + \varepsilon \sum_{i} \sum_{j} x^{new}_{i,j} + \varepsilon^2 \sum_{i} \sum_{j} \sum_{l} s^l_{i,j}$$

The problem as stated above can be easily proved to be NP-hard since, when only HP traffic requests are issued, it is a generalization of the well-known Knapsack problem [13]. On the contrary, the problem can be solved in polynomial time if we relax the atomicity constraint associated with the HP traffic requests.

The fixed-frame scheduling problem without atomicity constraint can be divided into two sub-problems, each exhibiting polynomial complexity: the F-matching problem, and the Time Slot Assignment (TSA) problem. The F-matching problem consists in finding the maximum subset of admissible HP and BE requests subject to the constraints listed in the formulation.
The outcome of an F-matching problem is a request matrix that can be scheduled in a frame of length F; to obtain the schedulable matrix, some of the requests may be dropped by the F-matching algorithm.

The problem of scheduling an admissible matrix of requests at the Hub can be reduced to the TSA problem in a Hierarchical Switching System (HSS). According to the logical representation of an HSS proposed in [14], an HSS comprises a central non-blocking switch which interconnects groups of transmitters and receivers (see Fig. 4). In the DAVID metro network, we can consider the Hub as a non-blocking WR × WR switch interconnecting a set of logical inputs (each input is associated with a wavelength channel on a specific ring) to a set of logical outputs (each output is again associated with a wavelength channel on a specific ring). All nodes on ring k are connected to the Hub through a non-blocking $n_k \times W$ multiplexing stage, where $n_k$ represents the number of nodes on ring k. Thus, the problem of scheduling node-to-node communications at the Hub can be optimally solved by applying techniques and algorithms that were developed for the scheduling of packets in hierarchical switching systems.

Fig. 4. Hierarchical Switching System

B. Optimum Scheduling Algorithm

We present in this section two optimal algorithms, one for the F-matching problem and one for the TSA problem. Both make use of the Dinic algorithm [15] to calculate a maximum flow through a graph associated with the DAVID metro network, as represented in Fig. 5.
It is a directed graph with 2N + 2R + 2 vertices: a source vertex s, with R outgoing edges linking it to the R vertices that represent the rings to which source nodes belong; R vertices (called VI1, ..., VIR), each one connected to as many vertices as the number of nodes on the corresponding ring; N vertices (called I1, ..., IN) that represent source nodes, each one connected to every possible destination vertex by means of N edges; N destination vertices (called O1, ..., ON); R vertices (called VO1, ..., VOR) representing the rings to which destination nodes belong; and a sink t with R incoming edges. In the following sections we show how to use this graph to solve both the F-matching and the TSA problem.

Fig. 5. Graph associated to the DAVID metro network (example with R = 4 rings and N = 12 nodes; edge capacities WF, F and $t_{i,j}$)

C. Matching Algorithm

Matching problems fall in the class of single-commodity network flow problems, which can be solved by applying a maximum flow algorithm (here, the Dinic algorithm). To solve the scheduling problem in the DAVID metro network, we run the same algorithm twice: the first time to allocate HP requests, the second time to allocate BE requests on the residual bandwidth.

Looking at the graph depicted in Fig. 5, we start by assigning a capacity to every graph edge. A capacity equal to the corresponding element of the HP traffic matrix A is assigned to each of the N² edges connecting source and destination nodes. Edges connecting each node to its corresponding ring have capacity equal to F, and edges connecting rings to the source s or to the sink t have capacity W F. Then, the Dinic algorithm is applied to find a maximum flow in this directed graph with the described capacities. As a result we obtain a flow associated with each edge of the graph. The flows associated with the N² edges connecting source and destination vertices give the values of the F-matched matrix for HP traffic.
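The HP pass just described can be sketched in Python as follows. The code is a minimal illustration, not the authors' implementation: the `Dinic` class, the function `f_match`, the vertex numbering and the argument names are assumptions introduced here, while the graph layout and the capacities W F, F and A_{i,j} follow the description above.

```python
from collections import deque


class Dinic:
    """Textbook Dinic maximum-flow algorithm on an edge-list residual graph."""

    def __init__(self, n):
        self.n = n
        self.adj = [[] for _ in range(n)]  # per-vertex lists of edge indices
        self.to, self.cap = [], []

    def add_edge(self, u, v, c):
        """Add edge u->v with capacity c; return the index of the forward edge."""
        self.adj[u].append(len(self.to)); self.to.append(v); self.cap.append(c)
        self.adj[v].append(len(self.to)); self.to.append(u); self.cap.append(0)
        return len(self.to) - 2

    def _bfs(self, s, t):
        # Build the level graph; return True if t is still reachable.
        self.level = [-1] * self.n
        self.level[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for e in self.adj[u]:
                v = self.to[e]
                if self.cap[e] > 0 and self.level[v] < 0:
                    self.level[v] = self.level[u] + 1
                    queue.append(v)
        return self.level[t] >= 0

    def _dfs(self, u, t, f):
        # Push up to f units along one path of the level graph.
        if u == t:
            return f
        while self.it[u] < len(self.adj[u]):
            e = self.adj[u][self.it[u]]
            v = self.to[e]
            if self.cap[e] > 0 and self.level[v] == self.level[u] + 1:
                d = self._dfs(v, t, min(f, self.cap[e]))
                if d > 0:
                    self.cap[e] -= d
                    self.cap[e ^ 1] += d  # the reverse edge accumulates the flow
                    return d
            self.it[u] += 1
        return 0

    def max_flow(self, s, t):
        flow = 0
        while self._bfs(s, t):
            self.it = [0] * self.n
            while True:
                pushed = self._dfs(s, t, float("inf"))
                if pushed == 0:
                    break
                flow += pushed
        return flow


def f_match(A, ring_of, R, W, F):
    """One F-matching pass: return the matched matrix for request matrix A.

    A[i][j] is the number of requested slots, ring_of[i] the ring of node i.
    """
    N = len(A)
    s, t = 0, 1
    vi0, i0, o0, vo0 = 2, 2 + R, 2 + R + N, 2 + R + 2 * N  # vertex offsets
    g = Dinic(2 * N + 2 * R + 2)
    for r in range(R):
        g.add_edge(s, vi0 + r, W * F)   # source -> VI_r
        g.add_edge(vo0 + r, t, W * F)   # VO_r -> sink
    for i in range(N):
        g.add_edge(vi0 + ring_of[i], i0 + i, F)  # VI_r(i) -> I_i
        g.add_edge(o0 + i, vo0 + ring_of[i], F)  # O_i -> VO_r(i)
    edge = {}
    for i in range(N):
        for j in range(N):
            if A[i][j] > 0:
                edge[i, j] = g.add_edge(i0 + i, o0 + j, A[i][j])  # I_i -> O_j
    g.max_flow(s, t)
    matched = [[0] * N for _ in range(N)]
    for (i, j), e in edge.items():
        matched[i][j] = g.cap[e ^ 1]  # flow carried by the node-to-node edge
    return matched
```

Calling `f_match` once per traffic class, with the capacities of the second call reduced by what the first call allocated, would mirror the multi-pass procedure; that bookkeeping is left out of this sketch.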
The same algorithm is applied a second time to schedule BE traffic on a graph whose capacities are based on the best-effort traffic matrix B, reduced by the amount of slots already assigned to HP traffic. We obtain the F-matched matrix for BE traffic from the flows associated with the graph edges.

If transparency must be taken into account, i.e. if we do not want new requests to block already existing connections, we need to apply the Dinic algorithm three times instead of two: the first time to allocate existing HP connections (the $A^{old}$ matrix), the second time to allocate new HP connections (the $A^{new}$ matrix), and the third time to allocate BE traffic (the B matrix). The overall matching complexity is $O(N^{2.5} F)$.

D. Time Slot Assignment Algorithm

Once the two F-matched matrices have been obtained from the algorithm presented in the previous section, we need to allocate them in the frame. The algorithm allocates HP and BE traffic at the same time. We create a matrix T which is the sum of the two matrices (one for HP requests and one for BE requests) obtained from the F-matching algorithm. The algorithm works with full matrices only, i.e. with matrices T that have the following characteristics:

$$\sum_{i \in I_r} \sum_{j=1}^{N} t_{i,j} = W F \quad \forall r, \; 1 \le r \le R$$

$$\sum_{i=1}^{N} \sum_{j \in O_s} t_{i,j} = W F \quad \forall s, \; 1 \le s \le R$$

thus:

$$\sum_{i=1}^{N} \sum_{j=1}^{N} t_{i,j} = W F R$$

where $I_r$ is the group of lines of T corresponding to nodes that are on ring r (there are R groups of lines in the matrix). In the same way, $O_s$ is the group of columns corresponding to nodes on ring s. If matrix T is not a full matrix, we add dummy traffic to obtain a full matrix, named $T^{full}$. It can be proved that matrix $T^{full}$ can be allocated in F slot times, exactly like T. The aim of the algorithm is to obtain a sequence of F full binary switching matrices.
This means that each switching matrix will have exactly WR non-null elements, that each group of lines and each group of columns will contain exactly W non-null elements, and that the matrix will cover all critical lines and columns (i.e., lines and columns that correspond to nodes that need to transmit or receive in each of the remaining time slots).

The problem of obtaining each of the F switching matrices can be seen as a maximum flow problem with lower bounds. Lower bounds are used to enforce a minimum flow on edges that are critical, i.e. edges that necessarily have to be used in every time slot and that correspond to the critical lines and columns of the T matrix. In a generic graph, the problem of calculating the maximum flow f with lower bound constraints can be solved in three steps, returning respectively flows f1, f2 and f3, so that f = f1 + f2 + f3.
1. We translate the lower bound constraints into excesses and deficits. For each edge (u, v) we set the flow on that edge equal to the lower bound (f(u, v) = l(u, v)) and we add a positive excess to v and a negative deficit to u, equal to the lower bound (l(u, v) and -l(u, v) respectively). The flow obtained on each graph edge gives f1.
2. We add an auxiliary source s' and an auxiliary sink t'. We connect s' to all nodes with excess by edges with capacity equal to the excess, and we connect all nodes with deficit to t' by edges with capacity equal to the deficit. We add an infinite-capacity bi-directional edge between s and t. We then compute a max flow f' on the resulting graph, using for instance the Dinic algorithm. f' saturates all the edges outgoing from s' and incoming to t' if and only if a solution to the original problem exists. In this case, f2 can be obtained by considering the components of f' that correspond to the original graph edges. The basic idea is that, once the minimum requested flow f1 corresponding to the lower bounds has been obtained, we impose a new source in the residual graph that pushes the same flow out of the vertices that are receiving it.
The opposite holds for the sink t', which forces vertices "in debt" to receive some flow. The infinite-capacity edges are added to make it possible to find a solution in any case.
3. We set the capacities of the original graph equal to c(u,v) - f1(u,v) - f2(u,v). We compute a max flow f3 on this residual graph.

We apply these three steps to the network graph described above, in which capacities have been modified and lower bound constraints added as follows: edges (s, VIi) have capacity W and lower bound W, and the same holds for edges (VOi, t); edges (VIi, Ij) have unit capacity and lower bound equal to 1 if line j of the $T^{full}$ matrix is critical, zero otherwise; node-to-node edges (Ii, Oj) have capacity $T^{full}_{i,j}$ and lower bound zero; edges (Oj, VOi) have unit capacity and lower bound equal to 1 if column j of the $T^{full}$ matrix is critical, zero otherwise.

We apply the algorithm once; the flows obtained on the node-to-node edges give the first switching matrix. We subtract this matrix from $T^{full}$ and repeat the process F times. Starting from the first switching matrix, wherever we find a non-null element we assign it to HP traffic if there was an HP request. Otherwise, we assign it to BE traffic if there was a BE request. If there were no requests for either BE or HP traffic, we are considering the added dummy traffic and we may drop it. The overall TSA complexity is $O(N^{2.5} F)$.

V. HEURISTIC SCHEDULING ALGORITHMS

In the case of very large networks, the complexity of the optimal algorithm may be too high. We may simplify the scheduling using heuristic solutions, although they only achieve sub-optimal performance due to their greedy behavior. We assume that a matrix P of F × (W × R) slots is available at the Hub, where W is the number of wavelengths per ring, F is the frame length and R is the number of rings. If a slot is reserved for an HP or BE packet, the addresses of the source and destination nodes are stored in P as shown in Fig. 6.
The aim of the heuristic algorithms is to fill this matrix, maximizing the satisfied HP requests first and then using the unreserved bandwidth to insert as much BE traffic as possible.

[Figure 6: matrix P, with rows grouped by ring (1 to R), W wavelengths per ring, over the F slots of the frame. Cells hold entries such as (7)-(4,2,5)HP, (8)-(2,4,2)BE and (1)-(3,4,5)HP. Annotation: node 7 on ring 1 will use wavelength 2 (indicated by the slot position in the matrix) to transmit an HP packet to node 5 on ring 4.]

Fig. 6. An example of matrix P status at the Hub

The following heuristic solutions will be considered:

1. Nodes share wavelength resources (up-link channels and down-link channels share the same wavelengths), and the Hub knows the position of the nodes along the rings. In Section V-A, we propose a heuristic scheduling algorithm (called First-Fit, FF) that uses a transparent, incremental, and atomic allocation for HP traffic, and a non-transparent, elastic allocation for BE traffic.

2. Nodes access the ring using separate sets of wavelength channels for transmission and reception (as in Section IV). In Section V-B, we propose a heuristic scheduling algorithm (called Frequency De-coupling, FD) that uses a transparent and atomic allocation for HP traffic and a non-transparent allocation, based on ring-to-ring permutations, for BE traffic.

3. Nodes access the ring using only half of each frame for transmission. In Section V-C, we propose a heuristic scheduling algorithm (called Time De-coupling, TD) that forces the allocation of HP traffic to be transparent and atomic, and the allocation of BE traffic to be non-transparent, elastic, and based on ring-to-ring permutations.

A. Heuristic I: First-Fit (FF)

This approach assumes that the Hub knows the position of the nodes along the rings and assigns the resources to the HP and BE requests in a first-fit fashion, taking into account the atomicity and transparency constraints and the interference of in-transit packets. It is an online algorithm: at each point in time, it assigns the resources to the current request based only on past information and with no knowledge whatsoever about future requests, in contrast with an off-line algorithm, which assumes the availability of the entire sequence of requests.

Two matrices are available at the Hub: P_new and P_old. For the current frame, the Hub applies the P_old permutation sequence, while P_new is used to allocate the incoming requests and will be used for the next frame, i.e., at the end of the frame, P_new is copied to P_old.

When a new HP request arrives:

1. If the new request for a node pair is for fewer slots than the slots already allocated (an old connection is decreased or closed), slots are released until the slots allocated to the considered node pair are equal to the new request. Slots are released on the basis of a round-robin scan of the matrix P_new.

2. If the request opens a new connection or increases an old one, the matrix P_new is scanned in a round-robin way, trying to allocate all the required slots while satisfying all constraints, i.e., no more than one packet to/from the same node, no more than W packets to/from a given ring in the same time slot, and no interference with in-transit packets. For the latter constraint, a slot can be assigned to the request if it does not collide with an already allocated slot that will be injected in the same position after having been switched at the Hub. In the same way, the slot must also be empty after having been switched at the Hub, i.e., during its downstream path it must not collide with the transmission of another packet.
During this operation, the slots allocated for BE traffic can be preempted to allow the allocation of HP traffic, because HP packets have higher priority than BE packets. This scheduling step ends when either all the requested slots have been satisfied or all slots have been considered. If the request is completely satisfied, the slots are definitively allocated; otherwise, the request is rejected, and the slots are not allocated.

When a new BE request arrives, the algorithm performs the same steps, but the allocation of BE traffic cannot preempt the HP allocations.

The heuristic algorithm described above is non-optimal both for HP and for BE traffic, due to the greedy approach in the allocation of network resources. Assuming that each node can issue at most one request to each other node in each frame, the FF complexity is O(N^2 F).

B. Heuristic II: Frequency-decoupling (FD)

In order to exploit the possibility of achieving a simple, fully distributed medium access protocol at nodes, at least for BE traffic, it is necessary to guarantee that simultaneous transmissions by nodes located on different rings do not conflict. If the Hub performs individual wavelength-to-wavelength permutations, nodes located on different rings cannot orthogonalize their transmissions in a distributed fashion (i.e., without the exchange of signaling information and pre-coordination), since it is not possible to know how many packets will be directed to the destination ring, nor to a particular destination node. When the Hub performs ring-to-ring permutations instead of individual wavelength-to-wavelength permutations, a distributed solution of receiver contentions is possible and easy to achieve, since downstream nodes can easily acquire the information on the transmission decisions taken by upstream nodes on the ring without signaling (see [9] for the case where only BE traffic is considered).
For these reasons, we prefer to consider solutions in which the scheduling algorithm allocates, at least to BE traffic, all the free slots (i.e., slots that have not been reserved for HP traffic) of each multi-slot (i.e., the set of simultaneous slots on different wavelengths injected by the Hub on the up-link channels of a ring) to transmissions directed to nodes belonging to the same ring. Under this extra constraint, the allocation of BE slots at the Hub can be performed on a ring-to-ring transmission request matrix whose element B_{i,j} represents the aggregate amount of traffic that nodes on ring i must transmit to nodes on ring j (i.e., the identity of the source node and the destination node is ignored during the process of slot allocation at the Hub). The solution of source and destination contentions can then be performed by each transmitter in a distributed fashion.

We emphasize, however, that this class of partially distributed scheduling algorithms is largely suboptimal and may lead to performance degradations when the number K of transceivers per node is less than W. Assume, for example, that K = 1, and that only node n on ring i is transmitting BE packets to nodes belonging to ring j. The Hub operates on the ring-to-ring slot request information, ignoring node identities and receiver contentions; assume that x slots are requested from ring i to ring j. The Hub will allocate x/W multi-slots for transmissions from ring i to ring j. Node n will, however, be able to access only one slot per multi-slot, so that only x/W out of x packets will be transmitted by node n.

Remember also that BE traffic must coexist with HP traffic, for which the allocation of slots must be performed on a node-to-node request matrix basis, since receiver and transmission contentions must be solved in the allocation phase at the Hub, in order to guarantee the effective transmission of all required slots.
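The K = 1 example above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function name `be_packets_sent` is hypothetical, the number of multi-slots is rounded up when x is not a multiple of W, and a node with K transceivers is assumed to use at most min(K, W) slots of each multi-slot. Neither assumption is stated in the paper.

```python
def be_packets_sent(x: int, W: int, K: int) -> int:
    """Packets one node can send when the Hub grants about x / W multi-slots.

    x: slots requested from ring i to ring j (all by a single node here)
    W: wavelengths per ring, i.e. slots per multi-slot
    K: transceivers per node
    """
    multi_slots = (x + W - 1) // W   # x / W, rounded up (assumption)
    per_multi_slot = min(K, W)       # slots the node can use at once
    return min(x, multi_slots * per_multi_slot)


# With W = 4 and x = 40 requested slots, the Hub grants 10 multi-slots.
sent_k1 = be_packets_sent(40, W=4, K=1)  # one transceiver
sent_k4 = be_packets_sent(40, W=4, K=4)  # as many transceivers as wavelengths
```

Under these assumptions a single-transceiver node sends x/W packets, as in the text, while a node with K = W transceivers is not penalized by the ring-to-ring allocation.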
The proposed approach is an off-line algorithm and runs through a number of steps at the end of each frame. Before starting the slot allocation, all the slots of matrix P that were dedicated to the BE class in the previous frame are set free, because HP traffic is scheduled first and all resources can be exploited by this traffic class. HP traffic is then scheduled as follows:

1. If the new request for a node pair is for fewer slots than the slots already allocated (i.e., the corresponding element in Δ is negative), slots are released until the slots allocated to the considered node pair are equal to the new request. Slots are released on the basis of a round-robin scan of the time frame.

2. Requests that require an increase in slot allocation are served: each empty slot is filled, considering one request after another, if all the constraints at the Hub are satisfied, i.e., no more than one packet to/from the same node, and no more than W packets to/from a given ring in the same time slot. These requests (i.e., those corresponding to a positive entry in Δ) are scanned according to a round-robin criterion, so that when an empty slot is found, it is assigned to an unsatisfied request that meets the above constraints and refers to the node pair that follows (according to any node pair ordering) the last pair served in the previous time slot. Slots reserved for the HP traffic class are allocated uniformly along the frame, to avoid a concentration of HP traffic in a portion of the frame. This is achieved by scanning the slots according to the time index first, and to the wavelength index next. This scheduling step ends when either all empty slots have been considered, or all the allocation requests have been satisfied.
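A minimal sketch of step 2 is given below, under strong simplifying assumptions. The names `fits` and `allocate_hp` are hypothetical. Wavelengths are not assigned individually: the sketch only checks the per-slot Hub constraints (one packet to/from a node, at most W packets to/from a ring). The uniform spreading of HP slots along the frame is not modelled. It illustrates the round-robin scan; it is not the authors' implementation.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]  # (source node, destination node)


def fits(slot: List[Pair], pair: Pair, ring_of: Dict[int, int], W: int) -> bool:
    """Check the Hub constraints for adding pair to one time slot."""
    src, dst = pair
    if any(s == src or d == dst for s, d in slot):
        return False  # at most one packet from and one packet to each node
    from_ring = sum(ring_of[s] == ring_of[src] for s, _ in slot)
    to_ring = sum(ring_of[d] == ring_of[dst] for _, d in slot)
    return from_ring < W and to_ring < W  # at most W packets per ring


def allocate_hp(frame: List[List[Pair]], delta: Dict[Pair, int],
                ring_of: Dict[int, int], W: int) -> Dict[Pair, int]:
    """Greedily serve the positive entries of delta, one time slot at a time."""
    pairs = sorted(p for p, extra in delta.items() if extra > 0)
    pending = {p: delta[p] for p in pairs}
    granted = {p: 0 for p in pairs}
    start = 0  # round-robin pointer: first pair examined in the next slot
    for slot in frame:
        if not any(pending.values()):
            break  # every request has been satisfied
        last = None
        for k in range(len(pairs)):
            idx = (start + k) % len(pairs)
            pair = pairs[idx]
            if pending[pair] > 0 and fits(slot, pair, ring_of, W):
                slot.append(pair)
                pending[pair] -= 1
                granted[pair] += 1
                last = idx
        if last is not None:
            start = (last + 1) % len(pairs)  # resume after the last pair served
    return granted


# Two rings with two nodes each and a single wavelength per ring.
ring_of = {0: 0, 1: 0, 2: 1, 3: 1}
frame: List[List[Pair]] = [[] for _ in range(3)]
delta = {(0, 2): 2, (1, 3): 1, (2, 0): 1, (0, 1): -1}
granted = allocate_hp(frame, delta, ring_of, W=1)
```

In this toy run the pair (1, 3) has to wait until the third time slot, because with W = 1 ring 0 can inject only one packet per slot and node 0 is served first.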
During this step, the matrix A_new is updated, so that at the end of the process each entry represents the actual amount of resources allocated to each node pair in the current frame; it then becomes the A_old matrix that will be used in the next iteration of the scheduling algorithm. Slots that are not used for HP traffic are available at the next step for BE allocations.

The scheduling of BE traffic runs through two sub-steps:

1. Matrix B is scheduled, independently from HP traffic, through an iterated Critical Maximum Weight Matching [5], to obtain a set of ring-to-ring permutations that must be fitted in the frame. Note that the allocation for BE traffic is recomputed every time; hence no incremental allocation is implemented, nor is transparency enforced for BE traffic.

2. In each time slot a particular permutation is selected and removed from the previously computed set of permutations. The selected permutation is the one that allocates the maximum number of slots to BE traffic on all rings, under the constraint that no more than W packets (either HP or BE) are directed towards the destination ring specified by the selected permutation. To meet the latter constraint, some slots can remain unused, possibly leading to throughput degradations.

The complexity of this algorithm is O(N^2 F).

C. Heuristic III: Time-decoupling (TD)

This approach uses the same algorithm as FD but logically separates upstream and downstream rings in time. Time is organized in Round-Trip Times (RTT, i.e., the time needed by each slot to circulate around a ring, assumed to be the same for all rings), and the nodes can use only one RTT out of every two to transmit packets. In this way, each transmitted packet is switched at the Hub and injected into the downstream ring during an RTT in which transmission is not allowed, avoiding any in-transit interference. Moreover, the nodes can exploit space reuse (i.e., transmission and reception on the same ring), since reception is always allowed. As before, the complexity of this algorithm is O(N^2 F).

D. Fairness control of BE traffic

The distributed access of BE traffic can exhibit bandwidth unfairness problems under unbalanced traffic, since upstream nodes generally have better access chances than downstream nodes. This is not the case for HP traffic, since entry policing at the nodes and connection acceptance at the Hub can control the load of the HP traffic.

In order to enforce throughput fairness for BE traffic, a credit-based scheme, such as MetaRing, discussed in [16] for a single ring, can be extended to multi-ring networks. In the MetaRing network, a control signal, called SAT, is circulated in store-and-forward mode from node to node along the ring. In our architecture, several rings exist. Therefore a multi-SAT mechanism (one SAT per pair of rings, which leads to R^2 SATs) is needed. When a node on ring i receives SAT_ij, it is granted permission to send Q (transmission quota) BE packets to ring j; if it has no more packets, or Q packets have been sent since the last SAT_ij reception, SAT_ij is forwarded along the ring. The SAT information is also carried in the control channel associated with the multi-slots. A detailed description of this mechanism is included in [9] and its effectiveness is demonstrated in [3].

VI. PERFORMANCE EVALUATION

A. Simulation scenario

In order to evaluate the performance of the solutions suggested above, we set up a simulator of a network consisting of R = 4 rings and n = 10 nodes per ring (N = 40 nodes in total), with the nodes of each ring sharing W = 4 wavelengths. We set the slot size to Ps = 1 μs, the ring round-trip time to RTT = 0.5 ms, which means setting the propagation delay on each ring to 500 times the slot duration, and the frame duration to F = 10 ms (20 RTT). The queues at the nodes are assumed to be long enough not to lose any packets.
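The scenario parameters are mutually consistent, as the short check below shows. The variable names are illustrative only, and the last quantity (the number of cells of matrix P per frame) is derived here from the F × (W × R) dimensions given in Section V rather than stated in the paper.

```python
# Scenario parameters from the text, expressed in microseconds.
SLOT_US = 1         # slot duration Ps = 1 us
RTT_US = 500        # ring round-trip time, 0.5 ms
FRAME_US = 10_000   # frame duration F = 10 ms
R = 4               # rings
NODES_PER_RING = 10
W = 4               # wavelengths per ring

slots_per_rtt = RTT_US // SLOT_US       # propagation delay in slots
rtts_per_frame = FRAME_US // RTT_US     # frame length in RTTs
slots_per_frame = FRAME_US // SLOT_US   # time slots per frame
total_nodes = R * NODES_PER_RING        # N
cells_in_P = slots_per_frame * W * R    # size of the F x (W x R) matrix

print(slots_per_rtt, rtts_per_frame, slots_per_frame, total_nodes, cells_in_P)
```

The first two values reproduce the figures quoted in the text (500 slots per RTT and 20 RTT per frame).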
We consider several traffic patterns: uniform, diagonal, client-server, power-of-ten, and very unbalanced. To describe these traffic patterns, we use a matrix K, of size R×R, where K_{i,j} is a real number ranging between 0 and 1, which represents the percentage of traffic generated on ring i towards ring j with respect to the total network load. For the five considered traffic patterns the matrix K is:

uniform:

\[
K = \begin{pmatrix}
1/4 & 1/4 & 1/4 & 1/4 \\
1/4 & 1/4 & 1/4 & 1/4 \\
1/4 & 1/4 & 1/4 & 1/4 \\
1/4 & 1/4 & 1/4 & 1/4
\end{pmatrix}
\]

diagonal:

\[
K = \begin{pmatrix}
7/10 & 1/10 & 1/10 & 1/10 \\
1/10 & 7/10 & 1/10 & 1/10 \\
1/10 & 1/10 & 7/10 & 1/10 \\
1/10 & 1/10 & 1/10 & 7/10
\end{pmatrix}
\]

client-server: [matrix layout not recoverable; K has three entries equal to 3/4, seven entries equal to 1/4, and six entries equal to 0]

power-of-ten: [matrix layout not recoverable; K has the entries 1, 10, 100, and 1000, four of each, scaled by 1/1111]

very unbalanced: [matrix layout not recoverable; extracted entries: 1, 0, 0, 0, 1/2, 1/2, 1/10, 1/3, 1/15, 0, 0, 1/3, 0, 0, 0, 1/3, 0]

Concerning the distribution of the traffic load among the nodes of the rings, we always considered it uniform. In all the scenarios, packets are generated at nodes according to a Bernoulli distribution whose average is derived from the weight matrix described above, and the packet size is matched to the slot size. How do you generate the HP and BE requests?

B. Simulation results

In all the following figures, we plot the throughput (ratio between used and available slots) for each destination ring reachable from ring 0. Note that all the points of the plots are steady-state values obtained from statistically significant measurements of the simulation results. Also note that, although we plot the throughput for a single ring (ring 0), the same behavior holds for all the other rings due to traffic symmetries. The only exception is the very unbalanced traffic pattern, for which we plot the rings of interest.

In Fig. 7 we report the throughput for each destination ring on ring 0 and the overall normalized network throughput as a function of the offered load, under diagonal traffic.
The curves for ring 0 to ring 1, ring 0 to ring 2, and ring 0 to ring 3 overlap. In these simulations only HP traffic is present in the network, to demonstrate that the scheduling algorithms can offer enough resources to HP traffic, as long as the network is not overloaded. Note that some degree of fairness between competing traffic flows is enforced: when the network load grows, the amount of admitted ring 0 to ring 0 traffic is reduced to equalize the amount of traffic transmitted from each ring. (Some comparison comments)

Fig. 7. Throughput as a function of HP traffic offered load under diagonal traffic pattern and comparing a) Optimal, b) FF, c) FD, and d) TD solutions

We then analyze network performance when both traffic classes are present, under all proposed traffic patterns. From Fig. 8 to Fig. 12 we show the throughput as a function of the percentage of HP traffic present in the network, considering that the total offered load is always 1. In other words, when the HP traffic load on the horizontal axis of the figures is equal to 0.2, the BE traffic load is equal to 0.8. In all figures we show the throughput for each destination ring on ring 0 for HP traffic (white markers), the total HP throughput (black square markers), and the total throughput on that ring (solid line without markers). Furthermore, the figures depict the maximum theoretical throughput reachable by BE traffic (dashed line) as a reference to compare with the actual throughput achieved by BE traffic (black rhombus markers). Comments on results.

Fig. 8. Throughput as a function of HP traffic relative load percentage at total load of 1 under uniform traffic pattern and comparing a) Optimal, b) FF, c) FD, and d) TD solutions

Fig. 9. Throughput as a function of HP traffic relative load percentage at total load of 1 under diagonal traffic pattern and comparing a) Optimal, b) FF, c) FD, and d) TD solutions

Fig. 10. Throughput as a function of HP traffic relative load percentage at total load of 1 under client-server traffic pattern and comparing a) Optimal, b) FF, c) FD, and d) TD solutions

Fig. 11. Throughput as a function of HP traffic relative load percentage at total load of 1 under power-of-ten traffic pattern and comparing a) Optimal, b) FF, c) FD, and d) TD solutions

Fig. 12. Throughput as a function of HP traffic relative load percentage at total load of 1 under very unbalanced traffic pattern and comparing a) Optimal, b) FF, c) FD, and d) TD solutions

VII. CONCLUSION

(TO BE FILLED IN)

ACKNOWLEDGMENT

This work was partially funded by the European Commission under the IST DAVID project (IST 1999-11742). This work was also partially funded by MCYT (Spanish Ministry of Science and Technology) under contract FEDER-TIC2002-04344-C02-02.

REFERENCES

[1] B. Rajagopalan, D. Pendarakis, D. Saha, R.S. Ramamoorthy, and K. Bala, “IP over optical networks: architectural aspects”, IEEE Commun. Mag., vol. 38, no. 9, Sept. 2000, pp. 94-102.
[2] S. Spadaro, J. Solé-Pareta, D. Careglio, K. Wajda, and A. Szymanski, “Positioning of RPR standard in contemporary operators’ environment”, submitted to IEEE Network.
[3] L. Dittman et al., “The European IST project DAVID: a viable approach towards optical packet switching”, to be published in IEEE J. Select. Areas Commun.
[4] W. Stallings, Local and Metropolitan Area Networks, Sixth Edition, Prentice Hall, 2000.
[5] B. Hajek and T. Weller, “Scheduling non-uniform traffic in a packet-switching system with small propagation delay”, IEEE/ACM Trans. Networking, vol. 5, no. 6, Dec. 1997, pp. 813-823.
[6] T. Inukai, “An efficient SS/TDMA time slot assignment algorithm”, IEEE Trans. Commun., vol. 27, no. 10, Oct. 1979, pp. 1449-1455.
[7] A. C. Kam and K.-Y. Siu, “Supporting bursty traffic with bandwidth guarantee in WDM distributed networks”, IEEE J. Select. Areas Commun., vol. 18, no. 10, Oct. 2000, pp. 2029-2040.
[8] A. Bianco, E. Leonardi, M. Mellia, and F. Neri, “Network controller design for SONATA – a large-scale all-optical passive network”, IEEE J. Select. Areas Commun., vol. 18, no. 10, Oct. 2000, pp. 2017-2028.
[9] A. Bianco, G. Galante, E. Leonardi, and F. Neri, “Measurement Based Resource Allocation for Interconnected WDM Rings”, Photonic Network Communications, vol. 5, no. 1, January 2003, pp. 5-22.
[10] C. S. Chang, W. J. Chen, and H. Y. Huang, “Birkhoff-von Neumann input buffered crossbar switches”, in Proc. IEEE INFOCOM 2000, Tel Aviv, Israel, Mar. 2000, pp. 1614-1623.
[11] M. Karol, M. Hluchyj, and S. Morgan, “Input versus output queueing on a space division switch”, IEEE Trans. Commun., vol. 35, no. 12, Dec. 1987, pp. 1347-1356.
[12] N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand, “Achieving 100% throughput in an input-queued switch”, IEEE Trans. Commun., vol. 47, no. 8, Aug. 1999, pp. 1260-1267.
[13] C. H. Papadimitriou and K. Steiglitz, Combinatorial optimization: algorithms and complexity, Dover, 1998.
[14] A. Varma and S. Chalasani, “An incremental algorithm for TDM switching assignments in satellite and terrestrial networks”, IEEE J. Select. Areas Commun., vol. 10, no. 2, Feb. 1992, pp. 364-377.
[15] R.E. Tarjan, Data structures and network algorithms, Society for Industrial and Applied Mathematics, Pennsylvania, November 1988.
[16] I. Cidon and Y. Ofek, “MetaRing – A full-duplex ring with fairness and spatial reuse”, IEEE Trans. Commun., vol. 41, no. 1, Jan. 1993, pp. 110-120.