Advanced Data Structures
Lecture 7
Temporal Data Structures
Zhuang Bingbing
Temporal Data Structures
● Viewing and/or modifying the data structure at various points in the past is allowed.
● Two models of temporal data structures
● Persistence: branching-universe model of time travel.
  – Making changes in the past creates a new branch of the data structure.
● Retroactivity: round-trip time travel.
  – Making changes in the past changes the present data.
Persistence
● Keep all versions of the data structure available for updates and queries.
● Four levels
① Partial Persistence: versions linearly ordered
  – Query any previous version, but only update the latest version.
② Full Persistence: branching tree
  – Query any version, and update any version.
③ Confluent Persistence: DAG
  – Combine input of p>1 previous versions to output a new single version.
④ Functional Persistence
  – Revisions create new nodes.
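As a small illustration of "revisions create new nodes", the following sketch builds a stack out of immutable cells. The names Cell, push, pop and to_list are hypothetical and chosen only for this example; it is one possible instance of functional persistence, not the lecture's own code.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """One immutable stack cell; it is never modified after creation."""
    head: object
    tail: Optional["Cell"]


def push(stack, item):
    """Return a new version; the old version 'stack' stays valid and is shared."""
    return Cell(item, stack)


def pop(stack):
    """Return (top item, previous version) without destroying anything."""
    return stack.head, stack.tail


def to_list(stack):
    """Collect the items of one version, top first."""
    items = []
    while stack is not None:
        items.append(stack.head)
        stack = stack.tail
    return items


# Two branches created from the same version v1 (None is the empty stack).
v1 = push(None, "a")
v2 = push(v1, "b")
v3 = push(v1, "c")
```

Both v2 and v3 share the cell of v1, so every version remains available for queries and for further updates.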
Persistence (hierarchy of the levels)
● Functional Persistence
● Confluent Persistence: use combinators to append a new combined version.
● Full Persistence: do not use combinators.
● Partial Persistence: only update the most recent version.
● Ephemeral: an update destroys the current version.
[Figure: Ephemeral data for traffic flow. Version 1 (time 11:35), Version 2 (time 11:40) and Version 3 (time 11:50) are destroyed; only Version 4 (time 11:55, now) exists and receives the update (write).]
Partial Persistence
[Figure: Data for traffic flow. Version 1 (time 11:35), Version 2 (time 11:40), Version 3 (time 11:50) and Version 4 (time 11:55) can all be accessed (read); only the latest version (now) can be updated (write).]
Partial Persistence
Three naïve methods [Overmars 1981]
n: size of each version, m: number of updates (number of versions)
1. Store every version explicitly: best for access, but Ω(n) time per update.
   [Figure: Version 0, v1, v2, ..., vm stored as separate copies]
2. Store only the sequence of updates: Ω(i) time for access to vi, best for update.
   [Figure: Version 0 followed by update 1, update 2, ..., update m]
3. Store all updates and every k-th version: O(k) time for access, Ω(n/k) time for update.
   [Figure: Version 0, Version k, Version 2k, ... with update 1, update 2, ..., update m in between]
All these results are poor because they make no assumption about the structure of the data.
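The third naïve method can be sketched as follows, assuming the data is a fixed-length array and every update overwrites one position. The class and method names are hypothetical. For simplicity access() copies the checkpoint before replaying, so this sketch only shows the bookkeeping (at most k−1 logged updates are replayed), not the exact bounds.

```python
class CheckpointedHistory:
    """Naive partial persistence: log every update, snapshot every k-th version."""

    def __init__(self, initial, k):
        self.k = k
        self.checkpoints = {0: list(initial)}  # version number -> full copy
        self.log = []                          # log[i] creates version i + 1
        self.latest = list(initial)

    def update(self, index, value):
        """Set position 'index' to 'value' in the latest version."""
        self.log.append((index, value))
        self.latest[index] = value
        version = len(self.log)
        if version % self.k == 0:
            # Full copy of size n, paid once every k updates.
            self.checkpoints[version] = list(self.latest)
        return version

    def access(self, version):
        """Rebuild a version from the nearest earlier checkpoint."""
        base = (version // self.k) * self.k
        state = list(self.checkpoints[base])
        for index, value in self.log[base:version]:  # fewer than k updates
            state[index] = value
        return state


history = CheckpointedHistory([0, 0, 0], k=2)
history.update(0, 5)   # version 1
history.update(1, 7)   # version 2 (checkpoint)
history.update(0, 9)   # version 3
```

Choosing k trades access time against update cost, which is the trade-off stated in method 3.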
Linked Data Structure: assumption on the data to be maintained
● access operation: find a node by following pointers from an entry node
● update operation: create/delete a node, or change the info of an accessed node
[Figure: a linked data structure. Access pointers lead to the entry nodes; each node has an information field and pointer fields.]
Assume the out-degree (in-degree) of every node is constant.
A Typical Example: Binary Search Tree
● access operation: find a node by following pointers from the root node
● update operation: create/delete a node, or change the info of an accessed node
[Figure: a binary search tree as a linked data structure. The access pointer leads to the entry node (the root); each node has an information field and left and right pointer fields.]
Tree ⇒ out-degree (in-degree) is constant.
An Example on Binary Search Tree
insert(E), insert(C), insert(M), insert(O), insert(A), insert(I), insert(G), insert(K), insert(J), delete(M), delete(E), delete(A)
[Figure: the binary search tree after each operation, from version 1 (after insert(E)) to version 12 (after delete(A)).]
Fat Node Method [Driscoll et al 1989]
● Array for access pointers (vers. 1-10, vers. 11-12): O(1) time
● Each node keeps the values of its fields (information, left, right) together with version stamps: O(log m) time per node
[Figure: fat-node representation of the example tree with nodes E, C, M, O, A, I, G, K, J; the information and pointer fields carry version stamps.]
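A fat node can be sketched as a node whose fields each keep a stamped history, searched by binary search. This is a minimal sketch assuming stamps arrive in non-decreasing order (partial persistence); the class name, field names and the values written below are hypothetical and are not read off the figure.

```python
import bisect


class FatNode:
    """Each field keeps all of its values, tagged with the version that wrote them."""

    def __init__(self):
        self.stamps = {}  # field name -> sorted list of version stamps
        self.values = {}  # field name -> values, parallel to self.stamps[field]

    def write(self, field, version, value):
        """Record a new value; versions must be non-decreasing per field."""
        stamps = self.stamps.setdefault(field, [])
        values = self.values.setdefault(field, [])
        if stamps and stamps[-1] == version:
            values[-1] = value  # same version: overwrite in place
        else:
            stamps.append(version)
            values.append(value)

    def read(self, field, version):
        """Return the value with the largest stamp <= version, or None."""
        stamps = self.stamps.get(field, [])
        i = bisect.bisect_right(stamps, version)  # O(log m) per node
        return self.values[field][i - 1] if i else None


node = FatNode()
node.write("left", 2, "C")     # illustrative stamps and values
node.write("right", 3, "M")
node.write("right", 10, "O")
```

Reading any field at version v costs one binary search over that field's stamps, which is the per-node O(log m) term on the slide.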
Node-Copying Method [Driscoll et al 1989]
● A node contains no more than one version of its information, plus a sufficient (constant) number of extra pointer fields.
● Make a new copy of the node if some field is full.
[Figure: copies of a node linked by predecessor pointers. Each copy has an information field and left/right pointer fields with version stamps (10, 13, 14, 15, 17); unused pointers are nil, and the last copy is the newest.]
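Node-Copying Method (the Same Example)
● Array for access pointers (vers. 1-2, 3-9, 10, 11-12): O(1) time
● O(1) amortized time and space per node
[Figure: node-copying representation of the example tree; nodes E, C, M and others appear in several copies, with version stamps on the pointer fields.]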
Partial Persistence
● Given a linked data structure on a pointer machine
● Suppose ≦p (constant) nodes point to any node
● Store reverse pointers for the most recent version
● Allow ≦p (time, field, value) modifications (by updates) in a node
● When an update changes a field:
  – If the node is not full, just add the modification
  – Else: copy the node with the modification applied, and recurse on the reverse pointers
● Counting the full latest nodes (potential argument) ⇒ O(1) amortized overhead
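The update rule above can be sketched for the simplest case: a singly linked list, so p = 1, with room for one modification per node. The classes and method names are hypothetical, and only the value field is updated by the user; pointer changes happen only when a copy has to be linked in.

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value    # original fields
        self.next = next_node
        self.mod = None       # at most one (version, field, new_value)
        self.back = None      # reverse pointer, kept for the latest version only

    def get(self, field, version):
        """Read a field as seen by the given version."""
        if self.mod and self.mod[1] == field and self.mod[0] <= version:
            return self.mod[2]
        return getattr(self, field)


class PersistentList:
    def __init__(self, values):
        head = None
        for v in reversed(values):
            node = Node(v, head)
            if head is not None:
                head.back = node
            head = node
        self.roots = [head]   # roots[v] = entry node of version v

    def set_value(self, index, new_value):
        """Update the latest version; return the new version number."""
        latest = len(self.roots) - 1
        node = self.roots[latest]
        for _ in range(index):
            node = node.get("next", latest)
        version = latest + 1
        self.roots.append(self.roots[latest])
        self._write(node, "value", new_value, version)
        return version

    def _write(self, node, field, new_value, version):
        if node.mod is None:              # not full: just add the modification
            node.mod = (version, field, new_value)
            return
        # Full: copy the node with its latest field values and the new one.
        copy = Node(node.get("value", version), node.get("next", version))
        setattr(copy, field, new_value)
        copy.back = node.back
        if copy.next is not None:
            copy.next.back = copy
        if node.back is None:             # the copied node was the entry node
            self.roots[version] = copy
        else:                             # recurse on the reverse pointer
            self._write(node.back, "next", copy, version)

    def to_list(self, version):
        out, node = [], self.roots[version]
        while node is not None:
            out.append(node.get("value", version))
            node = node.get("next", version)
        return out


plist = PersistentList([1, 2, 3])
for new_value in (20, 200, 2000, 5):
    plist.set_value(1, new_value)     # creates versions 1 to 4
```

In this run the second and fourth updates hit a full node and copy it, and the fourth update also copies the head, so the entry pointer of version 4 changes.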
Other Persistence
● Partial: only the latest version is allowed to be updated
● Full: any past version is allowed to be updated; this generates branching in the version relationship
● Confluent: merging two versions is also allowed
Full Persistence
● Given a linked data structure on a pointer machine
● Version list = pre-order traversal of version tree
● List-order DS: O(1) time/op
  – Insert node after specified node
  – Order query: is node x before y?
● Space for up to 2p modifications in a node
● When a node is full, split it into two roughly half-full nodes (like B-trees); this is called the node-splitting method (similar to the node-copying method) [Driscoll et al 1989]
● Counting the full nodes ⇒ O(1) amortized cost
● Linked list of nodes representing DS node
● Second phase to update reverse pointers
Full Persistence
Open Problem
● [Brodal-NJC 1996] O(1) worst-case partial persistence
● OPEN: O(1) worst-case full persistence?
● O(lg lg n) fully persistent array ⇒ any RAM DS
● OPEN: Matching lower bound? What about partial?
Confluent Persistence
● Functional data structures
  – e.g. deques with concatenation in O(1)/op
● General transformation:
  – d(v) = depth of node v in the version DAG
  – e(v) = 1 + lg(# paths from root to v)
  – Overhead: lg(# updates) + max_v e(v)
  – Poor when e(v) = 2^(# updates)
    – Can make an exponential-size DS this way
    – In this case still exponentially better than nonpersistence
Open Problem
OPEN: When can you do better?
– Lists with split and concatenate
– Trees
– General pointer machine
– Arrays
Retroactivity
(maintain the current version under changes to past versions)
[Figure: Data for traffic flow. Version 1 (time 11:35), Version 2 (time 11:40), Version 3 (time 11:50) and Version 4 (time 11:55, now), produced by update 1, update 2 and update 3.]
Whoops! Update 1 was wrong! Update 1’ is the correct one.
What is the correct current version?
[Figure: the same timeline with update 1’, update 2, update 3, leading to an unknown (???) version at now.]
Retroactivity
● Traditional DS formed by a sequence of updates
● Allow changes to that sequence
● Maintain linear timeline
● Operations:
  – Insert(t, "op()"): retroactively do op. at time t
  – Delete(t): retroactively undo op. at time t
  – Query(t, "op()"): execute query op. at time t
● Partial retroactivity: Query only in present (last t)
● Full retroactivity: Query at any time
Retroactivity
● Easy cases:
  – Commutative updates: x∘y ≡ y∘x
    – Insert(t, x) can just do x in present
  – Invertible updates: x∘x⁻¹ ≡ ∅
    – Delete(t) can just do x⁻¹ in present
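Both easy cases can be seen in a running sum, where every update is add(delta): additions commute and each has an inverse. The sketch below is a hypothetical partially retroactive accumulator; the class and method names are assumptions made for this example.

```python
class RetroactiveSum:
    """Partially retroactive accumulator: updates are add(delta), queries see the present."""

    def __init__(self):
        self.ops = {}    # time -> delta, the timeline of updates
        self.total = 0   # present value

    def insert(self, t, delta):
        """Insert(t, "add(delta)"): commutative, so apply it in the present."""
        self.ops[t] = delta
        self.total += delta

    def delete(self, t):
        """Delete(t): invertible, so apply the inverse update in the present."""
        delta = self.ops.pop(t)
        self.total -= delta

    def query(self):
        return self.total


acc = RetroactiveSum()
acc.insert(10, 4)
acc.insert(20, 6)
acc.insert(5, 1)    # retroactive: earlier than both existing updates
acc.delete(10)      # retroactively undo the update made at time 10
```

No replay of the timeline is needed in either operation, so each costs O(1) here.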
Rollback Method
● Rollback method: a retroactive op. r time units in the past costs a factor-r overhead, via logging (undo persistence)
● Lower bound: Ω(r) can be necessary!
  – DS maintains two values, X & Y, initially 0
    – setX(x): X←x
    – addY(Δ): Y←Y+Δ
    – mulXY(): Y←X*Y
    – query(): return Y
● addY(a_n), mulXY(), addY(a_{n-1}), mulXY(), ..., addY(a_0) computes the polynomial a_n x^n + a_{n-1} x^{n-1} + ... + a_0
Rollback Method
● addY(a_n), mulXY(), addY(a_{n-1}), mulXY(), ..., addY(a_0) computes the polynomial a_n x^n + a_{n-1} x^{n-1} + ... + a_0
● Insert(t=0, "setX(x)") changes the value of x
● It is known that evaluating an arbitrary polynomial requires Ω(n) time over any field, and even over any infinite subset such as the integers, independent of preprocessing of the a_i, in the worst case.
● In the "history-independent algebraic decision tree" model, the same result holds for the integer RAM and generalized real RAM models.
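The X/Y example can be run with a simple rollback scheme: log the state before every operation, and on a retroactive insert roll back to the logged state and redo the later operations. This is a minimal sketch under that assumption; the class name and the operation names as strings are hypothetical.

```python
import bisect


class RollbackXY:
    """Maintains X and Y (initially 0) with retroactive insertion by rollback."""

    def __init__(self):
        self.times = []         # sorted times of the operations
        self.ops = []           # ops[i] = (name, argument) at self.times[i]
        self.states = [(0, 0)]  # states[i] = (X, Y) before ops[i]; last = present

    @staticmethod
    def _apply(state, op):
        x, y = state
        name, arg = op
        if name == "setX":
            return (arg, y)
        if name == "addY":
            return (x, y + arg)
        if name == "mulXY":
            return (x, x * y)
        raise ValueError(name)

    def insert(self, t, name, arg=None):
        """Insert(t, op): returns how many operations had to be (re)executed."""
        i = bisect.bisect_right(self.times, t)
        self.times.insert(i, t)
        self.ops.insert(i, (name, arg))
        del self.states[i + 1:]        # roll back to the state just before t
        for op in self.ops[i:]:        # redo the new op and all later ones
            self.states.append(self._apply(self.states[-1], op))
        return len(self.ops) - i

    def query(self):
        return self.states[-1][1]      # present value of Y


ds = RollbackXY()
# Coefficients of 2x^2 + 3x + 4, entered as addY / mulXY at times 1..5.
for t, (name, arg) in enumerate(
        [("addY", 2), ("mulXY", None), ("addY", 3), ("mulXY", None), ("addY", 4)],
        start=1):
    ds.insert(t, name, arg)
value_at_zero = ds.query()             # X is still 0
redone = ds.insert(0, "setX", 5)       # retroactively set x = 5 at time 0
value_at_five = ds.query()
```

The single retroactive setX at time 0 forces every later operation to be redone, which is the factor-r behaviour the lower bound says cannot be avoided in general.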
Open Problem
● Cell-probe lower bound:
  – DS maintains n words; arithmetic updates (+/·)
  – Compute FFT in ...
  – Changing w_i has lower bound ...
● OPEN: cell-probe lower bound?
Partially Retroactive
● Priority queues: insert(t,k), delete-min()
  – (not commutative)
● Partial retroactivity in O(lg n)/op.
● Assume keys only inserted once
● L view: insert = rightward ray; delete-min = upward ray
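Priority queues
[Figure: L view. Horizontal axis: time (up to now); vertical axis: key value. insert = rightward ray; delete-min = upward ray. Label: Q0.]
Priority queues
Bridge: a time t at which Q_t is a subset of the keys of Q_now
[Figure: the same L view with the bridges and Q_now marked.]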
Insert(t, "insert(k)") / Delete(t, "delete-min()")
● Insert(t, "insert(k)"): insert key k into Q_t = {elements at time t}
  – Find the largest element that was deleted after time t
● Delete(t, "delete-min()"): undo the delete-min at time t
  – Identical to insert(k) right after deleting it
Insert(t, "insert(k)")
[Figure: L view (time vs. key value) showing Insert(t, insert(k)), the bridges and Q_now.]
Insert(t, "insert(k)")
I_{≥t} = {elements inserted at time t or later}
D_{≥t} = {elements deleted at time t or later} (not in Q_now)
Lemma. The element to be inserted into Q_now is the larger of k and the maximum element in D_{≥t} (= the maximum element in I_{≥t’} − Q_now for the last bridge t’ before t, with k counted among the inserted elements).
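The lemma for Insert(t, "insert(k)") can be checked on small timelines by brute force. The sketch below replays the whole timeline, so it costs O(m lg m) per call and is only a reference check, not the O(lg m) structure. The function names and the dictionary layout of the timeline are assumptions for this example, and it assumes that keys are inserted only once and that delete-min never runs on an empty queue.

```python
import heapq


def replay(timeline):
    """Run all ops in time order; return (Q_now, {deleted key: deletion time})."""
    heap, deleted = [], {}
    for t in sorted(timeline):
        name, key = timeline[t]
        if name == "insert":
            heapq.heappush(heap, key)
        else:                                   # "delete-min"
            deleted[heapq.heappop(heap)] = t
    return set(heap), deleted


def key_entering_q_now(timeline, t, k):
    """Lemma: Insert(t, "insert(k)") adds max(k, max of D_{>=t}) to Q_now."""
    _, deleted = replay(timeline)
    later = [key for key, when in deleted.items() if when >= t]
    return max([k] + later)


# Hypothetical timeline: time -> (operation, key).
timeline = {
    1: ("insert", 5),
    2: ("insert", 3),
    3: ("delete-min", None),   # removes 3
    4: ("insert", 8),
    5: ("delete-min", None),   # removes 5
}
q_now, _ = replay(timeline)
predicted = key_entering_q_now(timeline, 2.5, 4)
q_after, _ = replay({**timeline, 2.5: ("insert", 4)})
```

Here the retroactively inserted key 4 is itself deleted at time 5, and key 5 survives into Q_now instead, as the lemma predicts.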
Insert(t, "delete-min()") / Delete(t, "insert(k)")
● Insert(t, "delete-min()"): delete the min key at time t
  – Find the min value of Q_t’, where t’ is the first bridge after t
● Delete(t, "insert(k)"): undo the insertion at time t
  – Identical to delete-min() right after inserting it
Insert(t, "delete-min()")
[Figure: L view (time vs. key value) showing Insert(t, delete-min), the bridges and Q_now.]
Lemma. The element to be removed from Q_now is the minimum element in Q_t’ for the first bridge t’ after t (Q_t’ = I_{≤t’} ∩ Q_now).
Priority queues
Reduced to two ordinary tasks:
(A) Find the last (first) bridge before (after) time t;
(B) Find the max (min) element in I_{≥t’} − Q_now (I_{≤t’} ∩ Q_now).
● O(lg m) worst case, where m is the total number of updates, present or retroactive, performed on the priority queue.
[Figure: Partial Persistence, modification box.]