Augmenting Data Structures
Advanced Algorithms & Data Structures, Lecture Theme 07 – Part II
Prof. Dr. Th. Ottmann, Summer Semester 2006

Examples for Augmenting DS
• Dynamic order statistics: augmenting binary search trees by size information
• D-dimensional range trees: recursive construction of (static) d-dim range trees
• Min-augmented dynamic range trees: augmenting 1-dim range trees by min-information
• Interval trees
• Priority search trees

Interval Trees (CLR Version)
Problem: Given a set R of intervals that changes under insertions and deletions, construct a data structure that stores R, can be updated in O(log n) time, and can find, for any given query interval i, an interval in R that overlaps i (returning Nil if there is no such interval) in O(log n) time.
Idea: Store the set of intervals in an appropriately augmented balanced binary search tree and design an algorithm for the new operation.
Interval-search(T, i): Report an interval stored in T that overlaps the query interval i, if such an interval exists, and Nil otherwise.

Observation
Let i = [low(i), high(i)] and i' = [low(i'), high(i')] be two closed intervals. Then i and i' overlap if and only if
    low(i) ≤ high(i') and low(i') ≤ high(i).
Any two intervals satisfy the interval trichotomy, that is, exactly one of the following holds:
a) i and i' overlap
b) high(i) < low(i')
c) high(i') < low(i)

Interval Trichotomy
(Figure: the cases in which intervals I1 and I2, with endpoints labeled x1, y1 and x2, y2, overlap, and the cases in which they do not overlap.)

Interval tree (CLR Version)
Each node v stores an interval int(v) and the maximum upper endpoint of all intervals stored in the subtree rooted at v. The interval tree is a search tree on the lower endpoints of the intervals.
(Figure: example interval tree. Nodes by level, each given as interval / max:
 level 1: [15,23] / 33
 level 2: [6,10] / 10, [17,19] / 33
 level 3: [5,8] / 8, [8,9] / 9, [16,21] / 21, [27,33] / 33
 level 4: [0,3] / 3, [25,30] / 30, [29,30] / 30
 level 5: [19,20] / 20, [26,26] / 26)
Max(v) is the maximum value of all right endpoints in the subtree rooted at v.
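
The overlap test and the max-annotated search tree described above can be tried out with a small Python sketch. It is only an illustration under stated assumptions: the tree is a plain, unbalanced binary search tree keyed on the lower endpoint (so the O(log n) bounds are not guaranteed), deletion and rotations are left out, and the names `Node`, `insert`, `overlaps` and `interval_search` are made up for this example. The search loop uses the max field to decide whether the left subtree can still contain an overlapping interval.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Interval = Tuple[float, float]  # closed interval (low, high)


def overlaps(a: Interval, b: Interval) -> bool:
    """Closed intervals overlap iff low(a) <= high(b) and low(b) <= high(a)."""
    return a[0] <= b[1] and b[0] <= a[1]


@dataclass
class Node:
    interval: Interval
    max_high: float                    # largest upper endpoint in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], iv: Interval) -> Node:
    """Unbalanced BST insert on the lower endpoint (assumption: no rebalancing)."""
    if root is None:
        return Node(iv, iv[1])
    if iv[0] < root.interval[0]:
        root.left = insert(root.left, iv)
    else:
        root.right = insert(root.right, iv)
    # Every node on the insertion path gains iv in its subtree.
    root.max_high = max(root.max_high, iv[1])
    return root


def interval_search(root: Optional[Node], query: Interval) -> Optional[Interval]:
    """Return some stored interval overlapping query, or None."""
    x = root
    while x is not None and not overlaps(query, x.interval):
        if x.left is not None and x.left.max_high >= query[0]:
            x = x.left      # the left subtree may still contain an overlap
        else:
            x = x.right     # nothing on the left reaches low(query)
    return x.interval if x is not None else None


# The intervals listed in the example figure, inserted level by level.
example = [(15, 23), (6, 10), (17, 19), (5, 8), (8, 9), (16, 21), (27, 33),
           (0, 3), (25, 30), (29, 30), (19, 20), (26, 26)]
root = None
for iv in example:
    root = insert(root, iv)

print(interval_search(root, (22, 25)))   # (15, 23)
print(interval_search(root, (11, 14)))   # None
```

A balanced variant would additionally have to recompute `max_high` for the nodes touched by each rotation.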

Maintaining max-information
Max-information can be maintained during updates and rebalancing operations (rotations):
    max(x) = max(high(int(x)), max(left(x)), max(right(x)))

Finding an interval in T that overlaps interval i
Interval-search(T, i)
    x ← root(T)
    while x ≠ Nil and i does not overlap int(x) do
        if left(x) ≠ Nil and max(left(x)) ≥ low(i)
            then x ← left(x)
            else x ← right(x)
    return x
(Figure: the example interval tree from the previous slide.)
Observation: if int(x) does not overlap i, the search always proceeds in a safe direction, that is, into a subtree that contains an overlapping interval whenever the tree contains one at all.
Interval-search can be carried out in time O(height(T)).

Interval Trees: Point-set Variant
Problem: Given a set R of intervals that changes under insertions and deletions, construct a data structure that stores R, can be updated in O(log n) time, and can find, for any given query interval i, an interval in R that overlaps i (returning Nil if there is no such interval) in O(log n) time.
Solution: Map intervals to points and store the points in an appropriately augmented tree. An interval i is mapped to the point (l, r) with l = low[i] and r = high[i].

Max-augmented Range Tree
Store the intervals as points in sorted x-order in a leaf search tree. Store the max y-coordinates at the internal nodes.
(Figure: leaf search tree on the x-coordinates of the points (2, 5), (3, 4), (4, 5), (8, 13), (10, 15), (11, 12), (14, 17), (15, 22), (17, 18), (21, 28), combined with a max-tournament tree on their y-coordinates.)

Interval Search
Interval-Search(T, i)
/* Find an interval in tree T that overlaps i */
    p = root[T]
    while p is not a leaf do
        if max-y[left[p]] ≥ low[i]
            then p = left[p]
            else p = right[p]
    /* Now p is a leaf storing interval i' */
    if i and i' overlap then return "found" else return "not found"

Correctness Proof
Case 1: We go right, so low[i] > max-y[left[p]]. Every interval in the left subtree of p then ends before i begins, so none of them can overlap i.

Correctness Proof (continued)
Case 2: We go left, so low[i] ≤ max-y[left[p]]. If T[left[p]] does not contain an interval i' that overlaps i, then T[right[p]] cannot contain such an interval either. To see this, let i' be an interval in the left subtree with high[i'] = max-y[left[p]]. Since high[i'] ≥ low[i] and i' does not overlap i, we must have high[i] < low[i']. Every interval in the right subtree starts at or after low[i'] and therefore after i ends. (Here we utilize the fact that the intervals are sorted according to their x-coordinates!)

Interval Tree: Summary
An interval tree for a set of n intervals [l1, r1], …, [ln, rn] on the line is a dynamic max-augmented range tree for the set of points P = {(l1, r1), …, (ln, rn)}.
Interval trees can be used to carry out the following operations:
• Interval-Insert(T, i) inserts the interval i into the tree T.
• Interval-Delete(T, i) removes the interval i from the tree T.
• Interval-Search(T, i) returns a pointer to a node storing an interval i' that overlaps i, or Nil if no such interval is stored in T.

3-Sided Range Queries
(Figure: a point set plotted with axes labeled Salary and Age, with a 3-sided query range.)
Goal: Report all k points in the query range in O(log n + k) time.
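
To make the goal concrete, here is a deliberately naive Python baseline that answers a 3-sided query by scanning all points in O(n) time. It assumes the query range is bounded on the left, on the right and from below and is open upward, that is x_lo ≤ x ≤ x_hi and y ≥ y_min. The function name, the (age, salary) reading of the two coordinates and the sample data are all invented for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def three_sided_query_naive(points: List[Point], x_lo: float, x_hi: float,
                            y_min: float) -> List[Point]:
    """Report every point with x_lo <= x <= x_hi and y >= y_min by a linear scan."""
    return [(x, y) for (x, y) in points if x_lo <= x <= x_hi and y >= y_min]


# Hypothetical (age, salary) records.
staff = [(25, 30000), (31, 52000), (38, 41000), (45, 70000), (52, 48000)]

# Everyone aged 30 to 50 who earns at least 45000.
print(three_sided_query_naive(staff, 30, 50, 45000))   # [(31, 52000), (45, 70000)]
```

The scan looks at every point regardless of the output size k, and the O(log n + k) goal is meant to avoid exactly that.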

Priority Search Trees
Two data structures in one:
● Search tree on the points' x-coordinates
● Heap on the points' y-coordinates
Example point set: { (2, 12), (3, 4), (4, 11), (5, 3), (8, 5), (11, 21), (14, 7), (15, 2), (17, 30), (21, 8), (33, 33) }
(Figure: leaf search tree on the x-coordinates of the example points, shown first empty and then with the points stored at its internal nodes in max-heap order on y; the root stores (33, 33).)

3-Sided Range Queries on a Priority Search Tree
• Query procedure: Inspect all nodes on the two bounding paths and report the points that match the query. For every tree between the two bounding paths, apply the following strategy:
  • Inspect the root.
  • If this reports a point, recursively visit the children of the root.
(Figure: priority search tree with the two bounding paths in red, the subtrees between them in blue, and the subtrees outside them in yellow.)
O(log n) time to query the red paths.
O(log n + k) time to query the blue subtrees.

Correctness of the Query Procedure
• Observations:
  • We never report a point that is not in the query range.
  • Points in the yellow subtrees cannot match the query.
  • Points in the blue subtrees that are not reported cannot match the query.

Insertion into a Priority Search Tree
Insertion procedure:
1. Insert a new leaf based on the point's x-coordinate.
2. Insert the point down the tree, based on its y-coordinate.
(Figure: the example priority search tree during an insertion.)

Deletion from a Priority Search Tree
Deletion procedure:
1. Search for the point and delete it.
2. Fill the gap by pulling up points according to their y-values.
(Figure: the example priority search tree during a deletion.)

Priority Search Trees: Observations
Insertion and deletion of a point in a priority search tree T of n nodes can be carried out in time O(height(T)).
Priority search trees support north-grounded range reporting if the heap structure is a max-heap, and south-grounded range reporting if the heap structure is a min-heap.
Keeping the height of the leaf search tree underlying a priority search tree at O(log n) for n stored points requires rebalancing!
To obtain O(log n) algorithms for insertion and deletion of points, one must use a rebalancing scheme with constant restructuring cost per update, since a single rotation in a priority search tree can itself take O(height(T)) time.
A PST storing n points requires O(n) space.

Rotations in a Priority Search Tree
(Figure: rotation at nodes x and y. Before the rotation, x is the parent and stores point p1, and y stores point p2. After the rotation, y is the parent and stores p1, and the point to be stored at x is still open.)
• Push p2 to the appropriate child of y.
• Store p1 at y.
• Propagate the point with maximal y-coordinate from the appropriate child of x.

Priority Search Trees: Summary
• Theorem: There exists a data structure to represent a dynamically changing set S of points in two dimensions with the following properties:
  • The data structure can be updated in O(log n) time after every insertion into or deletion from S.
  • The data structure allows us to answer 3-sided range queries in O(log n + k) time.
  • The data structure occupies O(n) space.
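
The interplay of search-tree order on x and heap order on y can be illustrated with a short Python sketch. Note that this is a simplified static variant and not the layout used in the slides: each node stores the point with the largest y among its points and splits the remaining points at a median x-value, rather than sitting on top of a leaf search tree. It assumes pairwise distinct x-coordinates, supports no insertion, deletion or rotation, and the names `PSTNode`, `build` and `query` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class PSTNode:
    point: Point                       # point with the largest y in this subtree
    split: float                       # x <= split lies left, x > split lies right
    left: Optional["PSTNode"] = None
    right: Optional["PSTNode"] = None


def build(points: List[Point]) -> Optional[PSTNode]:
    """Static construction (assumption: all x-coordinates are distinct)."""
    return _build(sorted(points))


def _build(pts: List[Point]) -> Optional[PSTNode]:
    if not pts:
        return None
    top = max(range(len(pts)), key=lambda k: pts[k][1])   # heap winner
    rest = pts[:top] + pts[top + 1:]                      # still sorted by x
    if not rest:
        return PSTNode(pts[top], pts[top][0])
    mid = (len(rest) - 1) // 2
    return PSTNode(pts[top], rest[mid][0],
                   _build(rest[:mid + 1]), _build(rest[mid + 1:]))


def query(node: Optional[PSTNode], x_lo: float, x_hi: float, y_min: float,
          out: Optional[List[Point]] = None) -> List[Point]:
    """Report all points with x_lo <= x <= x_hi and y >= y_min (open upward)."""
    if out is None:
        out = []
    if node is None or node.point[1] < y_min:
        return out                     # max-heap order: nothing below qualifies
    if x_lo <= node.point[0] <= x_hi:
        out.append(node.point)
    if x_lo <= node.split:             # the left part may intersect the x-range
        query(node.left, x_lo, x_hi, y_min, out)
    if x_hi > node.split:              # the right part may intersect the x-range
        query(node.right, x_lo, x_hi, y_min, out)
    return out


# The example point set from the slides.
pts = [(2, 12), (3, 4), (4, 11), (5, 3), (8, 5), (11, 21), (14, 7),
       (15, 2), (17, 30), (21, 8), (33, 33)]
tree = build(pts)
print(sorted(query(tree, 3, 15, 5)))   # [(4, 11), (8, 5), (11, 21), (14, 7)]
```

With the max-heap order used here the query range is open upward. Storing the smallest y at each node and flipping the comparison against the threshold would give the downward-open counterpart.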