Data Structures on Event Graphs
Bernard Chazelle, Princeton University
Wolfgang Mulzer, Institut für Informatik, FU Berlin

It's the data
Data can be
- huge
- corrupted
- low-entropy
- expensive
- …
Rethink classical algorithms from a data-oriented perspective.
We study a model that represents temporal locality of the data.

A concrete problem: successor search
Given: an ordered universe U of n elements x1, x2, …, x10.
Goal: maintain a subset S of U supporting successor queries.
Operations: Insert(xi), Delete(xi), Successor(xi).
Also known as the Union-Split-Find Problem.
Can be solved in O(log log n) time on a pointer machine. [van Emde Boas, Kaas, Zijlstra 77]
This is optimal. [Mehlhorn, Näher, Alt 88], [Pătraşcu, Thorup 06]

Event graphs
Given: an ordered universe U of n elements and a labeled, connected, undirected graph G.
[Figure: event graph with nodes labeled Ix0, Sx2, Dx2, Ix5, Sx7, Ix7, Dx9, Ix9, drawn next to the universe x1, …, x10]
- G is labeled with operations Ixi, Dxi, Sxi.
- G can be preprocessed.
- G is known in advance.
- The adversary walks on G to perform the operations.
- Similar to Markov chains.

Decorated graphs
The walk of the adversary induces a walk on a much bigger graph.
Decorated graph dec(G): directed graph with vertex set V(G) × Pow(U).
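A minimal sketch of how dec(G) arises from G on a tiny instance. It rests on assumptions the slides do not spell out: a decorated node (v, S) records S after the operation at v has been executed, the walk starts with S empty, and element xi is encoded as the integer i. The function names are hypothetical.

```python
from collections import deque

def apply_op(op, s):
    """Apply one labeled operation to the frozenset s; queries leave s unchanged."""
    kind, x = op
    if kind == "I":
        return s | {x}
    if kind == "D":
        return s - {x}
    return s  # "S": a successor query does not modify the set

def reachable_decorated_nodes(labels, edges, start):
    """BFS over the part of dec(G) reachable when the walk starts at `start` with S empty."""
    adj = {v: [] for v in labels}
    for u, v in edges:  # G is undirected
        adj[u].append(v)
        adj[v].append(u)
    first = (start, apply_op(labels[start], frozenset()))
    seen = {first}
    queue = deque([first])
    while queue:
        v, s = queue.popleft()
        for w in adj[v]:
            nxt = (w, apply_op(labels[w], s))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical three-node path labeled Ix1, Ix2, Dx1.
labels = {0: ("I", 1), 1: ("I", 2), 2: ("D", 1)}
edges = [(0, 1), (1, 2)]
nodes = reachable_decorated_nodes(labels, edges, start=0)
print(len(nodes))  # 5 decorated nodes arise from 3 nodes of G
```

The second component ranges over subsets of U, so this explicit enumeration is only feasible for very small universes.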
A node (v, S) of dec(G) represents the current node v of G together with the current set S.
[Figure: the event graph with example decorated nodes (Sx2, ∅), (Dx2, ∅), (Ix5, {x5, x9}), (Ix9, {x9})]

If dec(G) is available, we can perform all operations in constant time.
But: the size of dec(G) is exponential.

Questions:
- What can we say about the structure of dec(G)?
- What can we deduce about dec(G), given G?
- In which cases can dec(G) be compressed efficiently?

The structure of decorated graphs
dec(G) contains a unique strongly connected component that has no exit and is reachable from every other node.
[Figure: strongly connected components C1, C2, C3, C4 of dec(G)]
This component is called the unique sink.

Theorem: Given a node v ∈ V(G) and a set S ⊆ U, we can decide in time O(|V(G)| + |E(G)|) whether (v, S) lies in the unique sink.
Proof idea: We show that for every node in the unique sink there exists a unique certificate in G (a certifying walk). A modified graph search in G can be used to find a certifying walk for (v, S), if it exists.

Can the decorated graph be compressed?
Consider the case that G is a path.
[Figure: path with nodes labeled Ix7, Sx7, Ix0, Sx2, Dx2, Ix9, Dx9, Ix5]
Theorem: If G is a path, the successor problem can be solved in O(1) time per operation with O(n^(1+ε)) space on a word RAM, where n = |V|.
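To make the path setting concrete, the sketch below replays a walk on a path-shaped event graph against a plain sorted list. It is a baseline for checking answers, not the O(1) structure of the theorem. Assumptions: the path order is the order in which the labels appear in the slide text, element xi is encoded as the integer i, Successor(x) returns the smallest element of S that is at least x, and the walk itself is made up for illustration.

```python
import bisect

def replay_walk(labels, walk):
    """Execute the operations met along `walk` and collect the successor answers."""
    s = []        # current set S, kept as a sorted list
    answers = []  # one entry per successor query, None if there is no successor
    for node in walk:
        kind, x = labels[node]
        i = bisect.bisect_left(s, x)
        present = i < len(s) and s[i] == x
        if kind == "I" and not present:
            s.insert(i, x)
        elif kind == "D" and present:
            s.pop(i)
        elif kind == "S":
            answers.append(s[i] if i < len(s) else None)
    return s, answers

# Path labels taken from the slide text, read left to right (assumed order).
labels = [("I", 7), ("S", 7), ("I", 0), ("S", 2),
          ("D", 2), ("I", 9), ("D", 9), ("I", 5)]
# A hypothetical walk: consecutive entries are adjacent nodes of the path.
walk = [0, 1, 2, 3, 4, 5, 6, 5, 4, 3]
final_set, answers = replay_walk(labels, walk)
print(final_set, answers)
```

Each step here costs O(log n) for the binary search plus the list update, which is the cost the theorem's pointer-based approach avoids.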
Proof (path case): Maintain S in a doubly linked list. Each node in G has a pointer to its predecessor or successor in S. Use this pointer to answer the queries. Only those pointers that will be relevant next need to be maintained. Use a lookup table to update them.

Example
[Figure: path with nodes …, Dx1, Sx5, Dx3, Ix7, Dx2, Sx8, Ix2, Dx9, … and elements x1, x3, x5, x7, x10]

Reducing the space requirement
A naïve implementation uses two lookup tables per node to update the pointers → O(n^2) space usage.
Can be improved to O(n^(1+ε)) space.
Approach: use spatial decomposition and bootstrapping to compress the lookup tables (cf. [Crochemore et al, 2008]).

What about randomization?
We assumed an adversary. But: what if the walk on the path is random?
Theorem: If the requests are generated by a random walk on a path, the successor problem can be solved in O(1) expected time per operation with O(n) space on a word RAM, where n = |V|.
Proof (sketch): Subdivide the path into segments of √n nodes.
The random walk requires Ω(n) steps to leave a segment. Build the quadratic data structure for the next segment once the walk enters it. Use overlapping segments and deamortization techniques to make it work.

What about more complicated graphs?
What if G is a tree, a grid, or something more complicated?
[Figure: examples of more complicated event graphs]
The path approach no longer works.
We conjecture that in this case the O(log log n) bound from van Emde Boas trees is optimal (but we do not know).

Conclusion and open problems
- A new way to model request sequences to a data structure.
- Can be applied to any data structuring problem.
- More algorithmic questions on decorated graphs, e.g., can we estimate the size of the unique sink efficiently?
- Can we prove lower bounds for the successor problem on general event graphs?

Thank you!