Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Advanced Data Structures Lecture 7 Temporal Data Structures Zhuang Bingbing Temporal Data Structures ● ● View and/or modify the data structure at various points in the past is allowed. Two models of temporal data structures ● Persistence: branching-universe model of time travel. – ● Making changes in the past creates a new branch of the data structure. Retroactivity: round -trip time travel. – Making changes in the past changes the present data. Persistence ● ● Keep all versions of the data structure available for updates and queries. Four levels ① Partial Persistence: versions linearly ordered – Query any previous version, but only update the latest version. ② Full Persistence: branching tree – Query any version, and update any version. ③ Confluent Persistence: DAG – Combine input of p>1 previous versions to output a new single version. ④ Functional Persistence – Revisions create new nodes. Persistence Functional Persistence Use the combinators to append a new combined version. Confluent Persistence Do not use combinators. Full Persistence Only update the most recent version. Partial Persistence Ephemeral (update destroys the current version) Data for traffic flow Time 11:35 Version1 Time 11:40 Version2 Destroyed Destroyed Time 11:50 Version3 Time 11:55 Version4 Destroyed Update (write) now Data for traffic flow Time 11:35 Version1 Access (read) Partial Persistence Time 11:40 Version2 Access (read) Time 11:50 Version3 Access (read) Time 11:55 Version4 Access (read) Update (write) now Partial Persistence Three naïve methods [Overmars 1981] n: size of each version, m: number of updates (# versions) 1. Store every version explicitly: best for access, but Ω(n) time per update Version0 v1 v2 vm ... 2. Store only the seq. of updates: Ω (i) time for access to vi, best for update Version0 ... update 1 update 2 update m 3. Store all updates and every kth version :O(k) time for access, Ω (n/k) time for update Version0 Version k Version 2k ... ... update 1 update 2 All these results are poor, because of no assumption on the structure of data update m Linked Data Structure: assumption on Data to be maintained { access operation: find a node following pointers from an entry node update operation: create/delete a node or change the info of an accessed node Linked Data Structure entry node entry node node information field pointer field access pointers Assume out-degree (in-degree) is const A Typical Example: Binary Search Tree access operation: find a node following pointers from the root node update operation: create/delete a node or change the info of an accessed node access pointer Linked Data Structure entry node node information field left right tree out-degree (in-degree) is constant An Example on Binary Search Tree insert (E), insert (C), insert (M), insert (O ), insert (A), insert (I), insert (G), insert (K), insert (J), delete (M), delete (E), delete (A) insert (E) insert (C) E version 1 insert (M) E C E C ver. 2 insert (O) M insert (A) insert (I) E C E insert (G) insert (K) M C insert (J) ver. 3 O A M ver. 4 K G delete (M) delete (E) E C A C A K G J G J J ver. 11 O I J J ver. 10 K O I O I J C ver. 9 K G J delete (A) ver. 12 O I Fat Node Method array for access pointers vers. 1-10 [Driscoll et al 1989] O(1) time vers. 11-12 node E information stamp 2 information stamp 10 O(log m) time per node right left 10 stamp 10 2 stamp 2 stamp 3 3 5 C M 11 6 4 O 12 A 7 I 8 10 G K J 9 10 10 Node-Copying Method node [Driscoll et al 1989] containing no more than one version except enough (constant) #pointer fields Make a new copy of node if some field is full. predecessor predecessor information left nill information information stamp 10 predecessor stamp 15 stamp 13 right left right stamp 10 stamp 13 stamp 14 right left stamp 17 nill newest Node-Copying Method (the Same Example) array for access pointers 3-9 vers. 1-2 10 E E O(1) time 11-12 E 6 O(1) amortized time and space per node 2 C 5 C M 12 M 8 A I I 10 K 7 G 9 J K 4 O Partial Persistence ● Given linked data structure on a pointer-machine ● Suppose ≦p (constant) nodes point to any node ● Store reverse pointers for most recent version ● ● Allow ≦p (time, field, value) modification (by an update) in a node When update changes a field: ● ● ● If node not full, just add modification Else: copy node-with-modification, recourse on reverse pointers # full latest nodes ⇒ O(1) amortized overhead Other Persistence only the latest version is allowed to be updated any past version is allowed to be updated; generate branching in version relationship also merging two version is allowed Full Persistence ● Given a linked data structure on a pointer-machine ● Version list = pre-order traversal of version tree ● List-order DS: O(1) time/op ● ● ● insert node after specified node ● Order query: is node x before y? Space for up to 2p modifications in a node When a node is full, split into two roughly half-full nodes (like Btrees), called node-splitting method (similar to node-copying method) ● [Driscoll et al 1989] # full nodes ⇒ O(1) amortized cost ● Linked list of nodes representing DS node ● Second phase to update reverse pointers Full Persistence Open Problem ● [Brodal-NJC 1996] O(1) worst-case partial persistence ● OPEN: O(1) worst-case full persistence? ● O(lglgn) fully persistent array ⇒ any RAM DS ● OPEN: Matching lower bound? What about partial? Confluent Persistence ● Functional data structures ● e.g. deques with concatenation in O(1)/op ● Eneral transformation: ● d(v) = depth of node v in a version DAG ● e(v) = 1+lg(# paths from root to v) ● Overhead: lg(#updates)+maxv e(v) ● Poor when e(v)=2#updates – Can make exponential-size DS this way – In this case still exponentially better than nonpersistence Open Problem OPEN: When can you do better? Lists with split and concatenate – Trees – General pointer machine – Arrays – Retroactivity (maintain the current version for changes on the past version) Data for traffic flow Time 11:35 Version1 Time 11:40 Version2 update 1 Time 11:50 Version3 update 2 Time 11:55 Version4 update 3 now Whoops! Update 1 was wrong! Update 1’ is the correct one. What is the correct current version? update 1’ update 2 update 3 ??? now Retroactivity ● Traditional DS formed by sequence of updates ● Allow changes to that sequence ● Maintain linear timeline ● Operations: ● Insert(t,”op()”): retroactively do op. at time t ● Delete(t): retroactivity undo op. at time t ● Query(t,”op()”): execute op. query at time t ● Partial retroactivity: Query onlyy in present (last t) ● Full retroactivity: Query at any time Retroactivity ● Easy cases: ● Commutative updates: xоy≡yоx – ● Insert(t,x) can just do x in present Invertible updates: xоx-1≡φ – Delete(t) can just do x-1 in present Rollback Method ● ● Rollback method: retroative op. at r time units in past with factor-r overhead, via logging (undo persistence) Lower bound: Ω(r) can be necessary! ● ● DS maintains two values, X & Y, initially 0 – set X(x): X←x – add Y(Δ): Y←Y+Δ – mul XY(): Y←X*Y – query (): return Y add Y(an), mul XY(), add Y(an-1), mul XY(),..., add Y(a0) computes the polynomial anxn+an-1xn-1+...+a0 Rollback Method ● ● ● ● add Y(an), mul XY(), add Y(an-1), mul XY(),..., add Y(a0) computes the polynomial anxn+an-1xn-1+...+a0 Insert (t=0, “set X(x)”) changes x value It is known that computing an arbitrary polynomial requires Ω(n) time over any field, even with any infinite subset such as integers, independent of preprocessing of ai in worst case. In “history-independent algebraic decision tree,” the same result holds for the integer RAM and generalized real RAM models. Open Problem ● ● Cell-probe lower bound: ● DS maintains n words; arithmetic updates (+/·) ● Compute FFT in ● Changing wi has lower bound OPEN: cell-probe lower bound ? Partially Retroactive ● Priority queues: insert(t,k), delete-min() ● (not commut.) ● Partial retroactivity in O(lgn)/op. ● Assume keys only inserted once ● L view: insert=rightward ray; delete-min=upward Priority queues key value insert=rightward ray; delete-min=upward Q0 time now Priority queues Bridge: subset of keys of Qnow key value bridge Qnow time now Insert(t,”insert(k)”)/Delete(t,”delete-min()”) ● Insert(t,”insert(k)”): insert key k into Qt={elements at time t} ● ● Find the largest element that was deleted after time t Delete(t,”delete-min()”): undo the delete-min at time t ● Identical to insert(k) right after deleting it Insert (t, “insert(k)”) key value bridge Qnow Insert (t, insert(k)) time now Insert (t, “insert(k)”) key value I≧t = {elements inserted at time t or later} D≧t = {elements deleted at time t or later } (not in Qnow) bridge Qnow Insert (t, insert(k)) time Lemma The element to be inserted into Qnow is the maximum element in D≧t (= the maximum element in I≧t’ -Qnow for the last bridge t’ before t) . now Insert(t,”delete-min(k)”)/ Delete(t,”insert(k)”) ● Insert(t,”delete-min(k)”): delete the min key at time t ● ● Find the min value of Qt', t' is the first bridge after t Delete(t,”insert(k)”): undo the insertion at time t ● Identical to delete-min() right after inserting it Insert (t, delete-min) key value bridge Qnow Insert (t, delete-min) time now Insert (t, delete-min) key value bridge Qnow Insert (t, delete-min) time now Lemma The element to be removed fromQnow is the minimum element in Qt’ for the first bridge t’ after t (Qt’ = the maximum element in I≦t’ ∩Qnow) . Priority queues Reduced to ordinal tasks: (A) Find the last (first) bridge before (after) time t; (B) Find the max (min) element in I≧t’ -Qnow (I≦t’ ∩Qnow) ● O(lgm) worst-case, m is the total number of updates, present or retroactive, performed on the priority queue. Partial Persistence Modification box