Efficient Data Structures for Online QoS-Constrained Data Transfer Scheduling
Mugurel Ionut Andreica, Nicolae Tapus
Politehnica University of Bucharest, Computer Science Department

Summary
• Motivation
• Online Data Transfer Scheduling Model
• Scheduling over Time on a Single Link
  – Time slot array (basic data structures)
  – Disjoint sets
  – Balanced tree for maximal time slot intervals
  – Block Partition (algorithmic framework)
  – Segment Tree (algorithmic framework)
  – Batched Updates (followed by queries)
• Scheduling over Time on a Path Network
  – Multidimensional Data Structures
  – Multidimensional Batched Updates
• (Other) Practical Application Scenarios
• Conclusions

Motivation
• QoS guarantees – strictly necessary
  – Multimedia streams (minimum required bandwidth, constant latency)
  – Large file transfers (bandwidth, earliest start time, latest finish time)
• Efficient resource management
  – Scheduling (Grid schedulers, bandwidth brokers)
  – Resource availability
  – Resource reservations

Online Data Transfer Scheduling Model
• One (centralized) resource manager
  – Knows the network topology (structure)
  – Has full control over the network
• Many data transfer requests
  – Duration (D) (non-preemptive = a contiguous time interval)
  – Earliest start time (ES)
  – Latest finish time (LF)
  – Minimum required bandwidth (Bmin)
  – Source (src)
  – Destination (dst)
• Simple greedy strategy
  – Handle the requests in the order of arrival
  – Verify if the request can be granted (satisfying the request’s constraints/parameters)
  – Grant the request (resource allocation/reservation)
  – Low response times
    • a complex strategy would take too long
    • even a simple strategy may take too long!
    => need some efficient techniques (data structures)

Scheduling over Time on a Single Link
• Two nodes, connected by a single link
• Two models
  – (1) One transfer at a time (mutual exclusion) => use the whole bandwidth of the link
  – (2) Multiple simultaneous data transfers (each uses some amount of bandwidth)
• Time horizon
  – divided into m time slots of equal length
  – good performance = fine-grained time division (=> m can be quite large)

The Time Slot Array
• The most basic data structure
• An entry for each time slot (ts[t] for time slot t)
  – model (1): ts[t]=0 (unoccupied) or 1 (occupied)
  – model (2): ts[t]=available bandwidth during time slot t
• Query(ES, LF, D)
  – model (1): find an interval of D unoccupied time slots, fully included in [ES,LF] => O(m)
  – model (2): find an interval of D time slots, fully included in [ES,LF], for which the minimum available bandwidth is maximum => O(m) (requires the use of a double-ended queue)
• Update(tstart, D, value)
  – model (1): ts[t]=value, tstart≤t≤tstart+D-1 => O(m)
  – model (2): ts[t]+=value, tstart≤t≤tstart+D-1 (value can be negative) => O(m)

Disjoint Sets (1/2)
• Only for model (1)
• Uncancelable requests (once a time slot is occupied, it cannot be un-occupied)
• Disjoint Sets
  – Tree representation
  – Every element has a parent
  – Tree root = the representative of the set
  – Find(i) = finds the representative of the set containing element i
  – Union(i,j) = joins the sets containing elements i and j
  – O(log*(m)) amortized time per operation

Disjoint Sets (2/2)
• Maximal 1-intervals
  – e.g.
    0 0 1 1 1 1 0 0 1 1 1 0 1 0 0
• Maintain the maximal 1-intervals using the disjoint sets data structure
  – The representative of a set contains the left and right time slot of the 1-interval
• Query(ES, LF, D)
  – Find an interval of D unoccupied time slots
  – Jump over whole maximal 1-intervals => reducing the time complexity (only in practice)
• Update(tstart, D, value=1)
  – Set the time slot entries to 1
  – Join the corresponding disjoint sets
  – O(m·log*(m)) overall (for all the update operations)

Balanced Tree of Maximal Time Slot Intervals
• Only for model (1)
• Decomposition of the m time slots into 0-intervals and 1-intervals
  – stored in a balanced tree T (red-black, AVL, scapegoat tree, 2-3-4, ...)
• Query(ES, LF, D)
  – Obtain (in O(log(m)) time) from T
    • each 0-interval intersecting [ES,LF] (and compare its length to D)
    • each 1-interval intersecting [ES,LF] (and jump over it)
    • time complexity: O(m·log(m)); better, in practice (due to jumps over large intervals)
• Update(tstart, D, value)
  – Remove from T the maximal intervals intersected by [tstart, tstart+D-1]
  – Insert into T the new intervals
  – At most one removal + at most 3 (re)insertions => time complexity O(log(m))
• if ((all ES=1) and (all LF=m)) (no earliest start time and latest finish time constraints)
  – Maintain a heap with the lengths of the 0-intervals
  – Obtain the longest 0-interval in O(1) time => O(1) per query
  – Update heap whenever T is updated (on removals and insertions)

Block Partitioning Method (1/2)
• Divide the m time slots into m/k blocks of k time slots each (possibly fewer in the last block)
• For each block B
  – [B.left,B.right]=the interval of time slots corresponding to block B
  – uagg=update aggregate of all the update parameters of update calls whose ranges include [B.left, B.right]
  – qagg=query aggregate for the current
    block (the answer if the query’s range is [B.left, B.right])

Block Partitioning Method (2/2)
• Framework
  – qFunc(x,y): range query(a,b) => compute qFunc(ts[a], qFunc(ts[a+1], ..., qFunc(ts[b-1], ts[b])...))
  – uFunc(x,y): range update(u, a, b) => ts[t]=uFunc(u, ts[t]), a≤t≤b
• Pairs of range updates and range queries
  – Range addition update, range sum query
  – Range addition update, range minimum/maximum query (useful for model (2), when the start time and finish time are given as request parameters)
  – Range set update, range maximum sum segment query (useful for model (1))
  – Range set update, range sum/min/max query
• Time complexity: O(k+m/k) for each range update/range query call
  – k=sqrt(m) => O(sqrt(m))

The Segment Tree Data Structure (1/3)
• Binary tree structure used for performing operations on an array v with m cells
• Node p
  – associated interval: [p.left, p.right]
  – two sons: the left son (p.lson) and the right son (p.rson)
  – left son’s interval: [p.left, mid]
  – right son’s interval: [mid+1, p.right]
  – where: mid=floor((p.left+p.right)/2)
  – leaves: [x,x] interval (only one cell)
• Range queries and updates => similar to the block partitioning method

The Segment Tree Data Structure (2/3)

The Segment Tree Data Structure (3/3)
• Algorithmic framework (new)
  – uagg and qagg maintained at each tree node
    • uagg=update aggregate of all the update calls which “stopped” at the node (did not go further down in the tree)
    • qagg=query aggregate for the node’s interval
  – More troublesome for several update and query functions (update aggregates have to be “pushed” further down in the tree, “piggybacking” on future update and query calls)
• Same pairs of
  updates + queries => O(log(m)) time complexity per operation (update/query)

Batched Updates
• Multiple reservations (updates) performed before querying the data structure (only some types of update functions)
• q updates (ui, [ai, bi])
  – Addition update (folklore):
    • taux[ai]+=ui (Start entry)
    • taux[bi+1]-=ui (Finish entry)
    • ts[t]=taux[1]+taux[2]+...+taux[t]
  – Set update:
    • taux[ai].add(Start, ui)
    • taux[bi+1].add(Finish, ui)
    • traverse taux from left to right + maintain a max-heap (the most recent “active” set update)
      – for each Finish entry => remove update
      – for each Start entry => add update
    • ts[t]=heap.top()
• O(m) for all the q updates
• Preprocess the array for subsequent queries (without updates => easier, if there is no need to support updates)

Scheduling over Time on a Path Network
• A path network, composed of n vertices v1, v2, ..., vn
• (vi,vi+1) connected by a link (1≤i≤n-1)
• A request=2D range [src,dst] x [ES,LF]
  – first dimension=the path interval
  – second dimension=the time slots interval
• n 1-dimensional data structures => performance decreases by a factor of O(n)
• 2-dimensional data structures => better (sublinear performance degradation)

d-dimensional Data Structures
• d-dimensional Block partition
  – O((k+m/k)^d) per update/query
• d-dimensional Segment tree
  – O(log^d(m)) per update/query
• d-dimensional batched updates
  – Update range=[l1,h1] x ...
    x [ld,hd]
  – Generate all the 2^d numbers K of d bits
    • Generate a position x=(x1, x2, ..., xd)
      – xi=li, if Ki=0; xi=hi+1, if Ki=1
    • B=the number of 1-bits in K
      – if B is even => add a Start entry at taux[x]
      – if B is odd => add a Finish entry at taux[x]
  – Based on the inclusion-exclusion principle

(Other) Practical Application Scenarios
• requests asking for a fixed amount of bandwidth B during a fixed time interval
  – range addition update + range minimum query
• requests asking for a range of consecutive frequencies (out of n available) + maximum sum of transfer rates (for just one time slot)
  – range maximum sum segment query + point update
• requests asking for a range of frequency numbers + range of time slots (time interval) + maximum total bandwidth
  – 2D data structure: range addition update + range sum query

Conclusions
• Particular online scheduling model
• Efficient data structures
  – Segment tree framework (online)
  – Block partitioning framework (online)
  – Batched updates (semi-online)
  – use of disjoint sets/balanced trees/heaps (online, restricted situations)
• Types of networks
  – single link; paths
  – can be used on any kind of network, but with decreasing performance (e.g. one data structure per link)

Thank You!
http://hipergrid.grid.pub.ro/
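
Backup sketch: Batched Updates followed by a Time Slot Array query
The batched addition update (Start entry +u at a, Finish entry -u at b+1, then prefix sums) and the model (2) Query(ES, LF, D) with a double-ended queue can be illustrated together. The code below is a minimal Python sketch, not the authors' implementation. The function names apply_batched_additions and best_window, the 1-based slot indexing, the link capacity of 100, and the three example reservations are all assumptions made for illustration.

```python
from collections import deque

def apply_batched_additions(m, updates):
    """Build ts[1..m] from q range-addition updates (u, a, b) in O(m + q).

    taux is the auxiliary array from the Batched Updates slide:
    +u at the Start entry a, -u at the Finish entry b + 1;
    ts[t] is the prefix sum taux[1] + ... + taux[t].
    """
    taux = [0] * (m + 2)      # 1-based; index m + 1 absorbs b + 1 when b == m
    for u, a, b in updates:
        taux[a] += u          # Start entry
        taux[b + 1] -= u      # Finish entry
    ts = [0] * (m + 1)        # ts[0] is unused padding
    running = 0
    for t in range(1, m + 1):
        running += taux[t]
        ts[t] = running
    return ts

def best_window(ts, es, lf, d):
    """Model (2) Query(ES, LF, D) in O(LF - ES + 1) time.

    Among all windows of d consecutive slots inside [es, lf], return
    (start, bandwidth) for the window whose minimum ts value is largest,
    or None if no window fits. The deque stores slot indices whose ts
    values are strictly increasing, so its front is the window minimum.
    """
    if lf - es + 1 < d:
        return None
    window = deque()
    best = None
    for t in range(es, lf + 1):
        while window and ts[window[-1]] >= ts[t]:
            window.pop()              # older slot is dominated by slot t
        window.append(t)
        start = t - d + 1
        if window[0] < start:
            window.popleft()          # front slot slid out of the window
        if start >= es:
            bw = ts[window[0]]
            if best is None or bw > best[1]:
                best = (start, bw)
    return best

# Hypothetical example: 10 slots, capacity 100, three reservations
# (negative values consume bandwidth).
M, CAPACITY = 10, 100
reservations = [(-30, 2, 5), (-50, 4, 8), (-20, 9, 10)]
ts = apply_batched_additions(M, [(CAPACITY, 1, M)] + reservations)
result = best_window(ts, es=1, lf=10, d=3)   # (1, 70)
```

In this example the available bandwidth per slot is 100, 70, 70, 20, 20, 50, 50, 50, 80, 80, so the best 3-slot window starts at slot 1 with 70 units guaranteed. Ties are resolved in favor of the earliest start, which is one possible choice rather than something the slides prescribe.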
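
Backup sketch: Segment Tree with uagg and qagg
For the pair "range addition update, range minimum query", the uagg/qagg framework from the Segment Tree slides needs no push-down: a node's qagg can be kept as the minimum of its sons' qagg values plus its own uagg. The following is a minimal Python sketch under that assumption, not the authors' implementation. The class name SegmentTree, the array-based node layout (sons of node p at 2p and 2p+1), the helper try_reserve, and the example requests are hypothetical.

```python
class SegmentTree:
    """Range addition update + range minimum query over slots 1..m.

    uagg[p] = sum of the updates that "stopped" at node p
    qagg[p] = minimum over p's interval, counting uagg of p and of its
              descendants, but not of its ancestors
    """

    def __init__(self, m, initial=0):
        self.m = m
        self.uagg = [0] * (4 * m)
        self.qagg = [initial] * (4 * m)

    def update(self, a, b, u, p=1, left=1, right=None):
        """ts[t] += u for a <= t <= b, assuming 1 <= a <= b <= m."""
        if right is None:
            right = self.m
        if a <= left and right <= b:       # the update stops at this node
            self.uagg[p] += u
            self.qagg[p] += u
            return
        mid = (left + right) // 2
        if a <= mid:
            self.update(a, b, u, 2 * p, left, mid)
        if b > mid:
            self.update(a, b, u, 2 * p + 1, mid + 1, right)
        self.qagg[p] = min(self.qagg[2 * p], self.qagg[2 * p + 1]) + self.uagg[p]

    def query(self, a, b, p=1, left=1, right=None):
        """Return min(ts[a..b]), assuming 1 <= a <= b <= m."""
        if right is None:
            right = self.m
        if a <= left and right <= b:
            return self.qagg[p]
        mid = (left + right) // 2
        best = float("inf")
        if a <= mid:
            best = min(best, self.query(a, b, 2 * p, left, mid))
        if b > mid:
            best = min(best, self.query(a, b, 2 * p + 1, mid + 1, right))
        return best + self.uagg[p]         # add the updates that stopped here

def try_reserve(tree, start, finish, bmin):
    """Hypothetical model (2) admission test for a request with fixed
    start and finish slots: grant it only if bmin is available throughout."""
    if tree.query(start, finish) < bmin:
        return False
    tree.update(start, finish, -bmin)
    return True

# Hypothetical example: 10 slots, capacity 100, three requests in arrival order.
tree = SegmentTree(10, initial=100)
requests = [(2, 5, 30), (4, 8, 50), (3, 6, 40)]
granted = [try_reserve(tree, s, f, b) for s, f, b in requests]
# granted == [True, True, False]: only 20 units remain in slots 4 and 5
```

Both operations visit O(log(m)) nodes, matching the bound stated on the slide. Update/query pairs that do not combine this simply (for example range set updates) would need the push-down step the slide mentions, which this sketch does not cover.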