Algorithms and Data Structures
(Augmenting Data Structures)
Prof. Th. Ottmann
Lecture 9. Page 1
Augmentation Process
Augmentation is a process of extending a data structure in order to support additional
functionality. It consists of four steps:
1. Choose an underlying data structure.
2. Determine the additional information to be maintained in the underlying data
structure.
3. Verify that the additional information can be maintained for the basic modifying
operations on the underlying data structure.
4. Develop new operations.
Examples for Augmenting DS
• Dynamic order statistics: Augmenting binary search trees by size information
• D-dimensional range trees: Recursive construction of (static) d-dim range trees
• Min-augmented dynamic range trees: Augmenting 1-dim range trees by min-information
• Interval trees
• Priority search trees
Dynamic Order Statistics
• Problem: Given a set S of numbers that changes under insertions and deletions, construct a data structure to store S that can be updated in O(log n) time and that can report the k-th order statistic for any k in O(log n) time.
[Figure: example set S = {5, 51, 13, 48, 22, 85, 34, 7, 14}.]
Binary Search Trees and Order Statistics
[Figure: a binary search tree storing the keys 1, 5, 7, 13, 17, 18, 19, 25, 33, 37, 49.]
• Retrieving an element with a given rank: For a given i, find the i-th smallest key in the set.
• Determining the rank of an element: For a given (pointer to a) key k, determine the rank of k in the set of keys.
Augmenting the Data Structure
• Every node v stores two pieces of information:
  • Its key
  • The number of its descendants, counting v itself (the size of the subtree with root v)
[Figure: example tree, each node labeled key (size): 33 (11), 17 (4), 9 (2), 4 (1), 21 (1), 51 (6), 48 (1), 92 (4), 73 (2), 124 (1), 81 (1).]
How To Determine The Rank of an Element
• Find the rank of key x in the tree with root node v:

Rank(v, x)
  if x = key(v)
    then return 1 + size(left(v))
  if x < key(v)
    then return Rank(left(v), x)
    else return 1 + size(left(v)) + Rank(right(v), x)
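The Rank procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the Node class and the size helper are hypothetical names, an empty subtree is taken to have size 0, and x is assumed to be stored in the tree.

```python
class Node:
    """BST node augmented with the size of its subtree (hypothetical helper)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + size(left) + size(right)

def size(v):
    """Subtree size; an empty subtree has size 0 (assumption)."""
    return v.size if v is not None else 0

def rank(v, x):
    """Rank of key x in the subtree rooted at v; x is assumed to be present."""
    if x == v.key:
        return 1 + size(v.left)
    if x < v.key:
        return rank(v.left, x)
    # x lies in the right subtree: skip v and everything in its left subtree.
    return 1 + size(v.left) + rank(v.right, x)

# The example tree with keys 4, 9, 17, 21, 33, 48, 51, 73, 81, 92, 124.
root = Node(33,
            Node(17, Node(9, Node(4)), Node(21)),
            Node(51, Node(48), Node(92, Node(73, None, Node(81)), Node(124))))
```

For instance, rank(root, 73) adds 5 at the root, 2 at node 51 and 1 at node 73 itself.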
How to Find the k-th Order Statistic
• Find (a pointer to) the node containing the k-th smallest key in the subtree rooted at node v.

Select(v, k)
  if k = size(left(v)) + 1
    then return v
  if k ≤ size(left(v))
    then return Select(left(v), k)
    else return Select(right(v), k - 1 - size(left(v)))
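A corresponding Python sketch of Select, again with a hypothetical Node class and size helper, and assuming 1 ≤ k ≤ size(v):

```python
class Node:
    """BST node augmented with the size of its subtree (hypothetical helper)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + size(left) + size(right)

def size(v):
    """Subtree size; an empty subtree has size 0 (assumption)."""
    return v.size if v is not None else 0

def select(v, k):
    """Node holding the k-th smallest key below v; assumes 1 <= k <= size(v)."""
    s = size(v.left)
    if k == s + 1:
        return v
    if k <= s:
        return select(v.left, k)
    # Skip the s keys in the left subtree and the key of v itself.
    return select(v.right, k - 1 - s)

# The example tree with keys 4, 9, 17, 21, 33, 48, 51, 73, 81, 92, 124.
root = Node(33,
            Node(17, Node(9, Node(4)), Node(21)),
            Node(51, Node(48), Node(92, Node(73, None, Node(81)), Node(124))))
```

Each call descends one level, so the running time is proportional to the height of the tree.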
Maintaining Subtree Sizes Under Insertions
• Insert operation
  • Insert the new node as in a standard binary search tree.
  • Add 1 to the subtree size of every ancestor of the new node.
[Figure: inserting key 64 into the example tree. The new leaf 64 gets size 1, and the sizes of its ancestors grow by one: 73 (2 → 3), 92 (4 → 5), 51 (6 → 7), 33 (11 → 12).]
Maintaining Subtree Sizes Under Deletions
• Delete operation
  • Delete the node as in a standard binary search tree.
  • Subtract 1 from the subtree size of every ancestor of the deleted node.
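Maintaining Subtree Sizes Under Rotations
[Figure: a rotation. Before: the root has size s1 and children of sizes s2 and s3; the child of size s2 has subtrees of sizes s4 and s5. After: the new root has size s1 and children of sizes s4 and s5 + s3 + 1; the latter node has subtrees of sizes s5 and s3.]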
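A right rotation that repairs the size fields in constant time could look like this in Python. It is a sketch only: the names are hypothetical, and the left rotation is symmetric.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + size(left) + size(right)

def size(v):
    return v.size if v is not None else 0

def rotate_right(y):
    """Rotate the left child of y above y and return the new subtree root."""
    x = y.left
    y.left = x.right
    x.right = y
    # The new root covers exactly the nodes the old root covered (s1).
    x.size = y.size
    # y keeps its right subtree (s3) and receives x's former right subtree (s5).
    y.size = size(y.left) + size(y.right) + 1
    return x

# Sizes before the rotation: s1 = 5, s2 = 3, s3 = s4 = s5 = 1.
old_root = Node(10, Node(5, Node(2), Node(7)), Node(15))
new_root = rotate_right(old_root)
```

Only the two nodes taking part in the rotation change their size, so a rotation still costs O(1).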
Dynamic Order Statistics—Summary
• Theorem: There exists a data structure to represent a dynamically changing set S of numbers with the following properties:
  • The data structure can be updated in O(log n) time after every insertion into or deletion from S.
  • The data structure allows us to determine the rank of an element or to find the element with a given rank in O(log n) time.
  • The data structure occupies O(n) space.
4-Sided Range Queries
Goal: Build a static data structure of size O(n log n) that can answer 4-sided range queries in O(log^2 n + k) time, where k is the number of reported points.
Orthogonal d-dimensional Range Search
Build a static data structure for a set P of n points in d-space that supports d-dim range queries:
d-dim range query: Let R be a d-dim orthogonal hyperrectangle, given by d ranges [x1, x1′], …, [xd, xd′]:
Find all points p = (p1, …, pd) ∈ P such that x1 ≤ p1 ≤ x1′, …, xd ≤ pd ≤ xd′.
Special cases:
1-dim range query: an interval [x1, x1′] on the line.
2-dim range query: an axis-parallel rectangle [x1, x1′] × [x2, x2′] in the plane.
1-dim Range Search
Standard binary search trees also support 1-dim range queries:
[Figure: a binary search tree on the keys 12, 18, 21, 23, 30, 37, 42, 49, 55, 61, 68, 74, 80, 81, 90, 99.]
1-dim Range Search
Leaf-search-tree:
[Figure: a leaf-search tree for the same key set. The keys are stored at the leaves; the internal nodes hold routers.]
1-dim Range Tree
A 1-dim range tree is a leaf-search tree for the x-values (points on the line).
Internal nodes have routers guiding the search to the leaves: We choose the maximal
x-value in left subtree as router.
Range search: In order to find all points in a given range [l, r] search for the boundary
values l and r.
This is a forked path; report all leaves of subtrees rooted at nodes v in between the
two search paths whose parents are on the search path.
[Figure: the selected subtrees. Below the split node the search paths to l and r separate; the selected subtrees hang between the two paths.]
Canonical Subsets
The canonical subset of node v, P(v), is the subset of points of P stored at the leaves
of the subtree rooted at v.
If v is a leaf, P(v) is the point stored at this leaf.
If v is the root, P(v) = P.
Observations:
For each query range [l, r] the set of points with x-coordinates falling into this range
is the disjoint union of O(log n) canonical subsets of P.
A node v is called an umbrella node for the range [l, r], if the x-coordinates of all points in its canonical subset P(v) fall into the range, but this does not hold for the parent of v.
All k points stored at the leaves of a tree rooted at node v, i.e. the k points in a
canonical subset P(v), can be reported in time O(k).
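The decomposition of a query range into canonical subsets can be sketched in Python. This is an illustration under assumptions, not the construction from the slides: every node additionally stores the smallest and largest key below it (the hypothetical fields lo and hi), which replaces the explicit forked search along the routers, and the tree is built statically from a sorted list of distinct keys.

```python
class Node:
    def __init__(self, lo, hi, left=None, right=None):
        self.lo, self.hi = lo, hi              # smallest / largest key below this node
        self.left, self.right = left, right    # both None for a leaf

def build(keys):
    """Balanced leaf-search tree over a sorted, non-empty list of distinct keys."""
    if len(keys) == 1:
        return Node(keys[0], keys[0])
    mid = (len(keys) + 1) // 2
    left, right = build(keys[:mid]), build(keys[mid:])
    return Node(left.lo, right.hi, left, right)

def umbrella_nodes(v, l, r):
    """Umbrella nodes for [l, r] below v, from left to right."""
    if v.hi < l or v.lo > r:
        return []                              # canonical subset misses the range
    if l <= v.lo and v.hi <= r:
        return [v]                             # canonical subset lies inside the range
    return umbrella_nodes(v.left, l, r) + umbrella_nodes(v.right, l, r)

def leaves(v):
    """All keys of the canonical subset P(v), in O(|P(v)|) time."""
    if v.left is None:
        return [v.lo]
    return leaves(v.left) + leaves(v.right)

def range_query(root, l, r):
    return [k for u in umbrella_nodes(root, l, r) for k in leaves(u)]

keys = [12, 18, 21, 23, 30, 37, 42, 49, 55, 61, 68, 74, 80, 81, 90, 99]
root = build(keys)
```

Only nodes on the two search paths are split further, so the number of umbrella nodes is O(log n) and the query takes O(log n + k) time.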
1-dim Range Tree: Summary
Let P be a set of n points in 1-dim space.
P can be stored in a balanced binary leaf-search tree such that the following holds:
Construction time: O(n log n)
Space requirement: O(n)
Insertion of a point: O(log n) time
Deletion of a point: O(log n) time
1-dim-range-query: Reporting all k points falling into a given query range can be
carried out in time O(log n + k).
The performance of 1-dim range trees does not depend on the chosen balancing
scheme!
2-dim Range Tree: The Primary Structure
• Static binary leaf-search tree over x-coordinates of points.
• Every leaf represents a vertical slab of the plane.
• Every internal node represents a slab that is the union of the slabs of its children.
Answering 2-dim Range Queries
• Normalize queries to end on slab boundaries.
• Query decomposes into O(log n) subqueries.
• Every subquery is a 1-dimensional range query on y-coordinates of all points in the slab of the corresponding node. (x-coordinates do not matter!)
[Figure: the selected subtrees between the search paths to l and r below the split node.]
2-dim Range Tree
A 2-dimensional range tree for storing a set P of n points in the x-y-plane is:
• A 1-dim range tree Tx for the x-coordinates of points.
• Each node v of Tx has a pointer to a 1-dim range tree Ty(v) storing all points which fall into the interval Ix(v). That is: Ty(v) is a 1-dim range tree based on the y-coordinates of all points p ∈ P with px ∈ Ix(v).
[Figure: Tx is a leaf-search tree on the x-coordinates of the points; a node v of Tx covers the x-interval Ix(v) and points to Ty(v), a leaf-search tree on the y-coordinates of the points in Ix(v).]
2-dim Range Tree
A 2-dim range tree on a set of n points in the plane requires O(n log n) space.
A point p is stored in all associated range trees Ty(v) for all nodes v on the search path to px in Tx.
Hence, for each depth d, each point p occurs in only one associated search structure Ty(v) for a node v of depth d in Tx. Since Tx has depth O(log n), every point is stored O(log n) times.
The 2-dim range tree can be constructed in time O(n log n). (Presort the points on y-coordinates!)
The 2-Dimensional Range Tree
• Primary structure: Leaf-search tree on x-coordinates of points
• Every node stores a secondary structure: Balanced binary search tree on y-coordinates of points in the node’s slab.
Every point is stored in secondary structures of O(log n) nodes.
Space: O(n log n)
Answering Queries
• Every 2-dimensional range query decomposes into O(log n) 1-dimensional range queries
• Each such query takes O(log n + k′) time, where k′ is the number of points it reports
• Total query complexity: O(log^2 n + k)
2-dim Range Query
Let P be a set of points in the plane stored in a 2-dim range tree and let a 2-dim range R defined by the two intervals [x, x′], [y, y′] be given. Then all k points of P falling into the range R can be reported as follows:
1. Determine the O(log n) umbrella nodes for the range [x, x′], i.e. determine the canonical subsets of P that together contain exactly the points with x-coordinates in the range [x, x′]. (This is a 1-dim range query on the x-coordinates.)
2. For each umbrella node v obtained in 1, use the associated 1-dim range tree Ty(v) in order to select the subset of P(v) consisting of the points with y-coordinates in the range [y, y′]. (This is a 1-dim range query for each of the O(log n) canonical subsets obtained in 1.)
Time to report all k points in the 2-dim range R: O(log^2 n + k).
Query time can be reduced to O(log n + k) by a technique known as fractional cascading.
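The two query steps can be sketched in Python under simplifying assumptions. All names are hypothetical. A y-sorted array searched with bisect stands in for the associated 1-dim range tree Ty(v), and each node stores the smallest and largest x-coordinate of its slab. Because every node sorts its own points, this sketch builds in O(n log^2 n) time rather than the O(n log n) obtained by presorting on y.

```python
from bisect import bisect_left, bisect_right

class Node:
    def __init__(self, pts_by_x):
        """pts_by_x: non-empty list of (x, y) points sorted by x (distinct x assumed)."""
        self.xlo, self.xhi = pts_by_x[0][0], pts_by_x[-1][0]
        # Secondary structure: the points of the slab sorted by y-coordinate.
        self.by_y = sorted(pts_by_x, key=lambda p: p[1])
        self.ys = [p[1] for p in self.by_y]
        if len(pts_by_x) == 1:
            self.left = self.right = None
        else:
            mid = (len(pts_by_x) + 1) // 2
            self.left, self.right = Node(pts_by_x[:mid]), Node(pts_by_x[mid:])

def query(v, x1, x2, y1, y2):
    """All points p with x1 <= p.x <= x2 and y1 <= p.y <= y2 below v."""
    if v.xhi < x1 or v.xlo > x2:
        return []
    if x1 <= v.xlo and v.xhi <= x2:
        # Umbrella node: answer a 1-dim range query on the y-coordinates.
        i, j = bisect_left(v.ys, y1), bisect_right(v.ys, y2)
        return v.by_y[i:j]
    return query(v.left, x1, x2, y1, y2) + query(v.right, x1, x2, y1, y2)

points = [(2, 12), (3, 4), (4, 11), (5, 3), (8, 5),
          (11, 21), (14, 7), (15, 2), (17, 30), (21, 8)]
tree = Node(sorted(points))
```

Each of the O(log n) umbrella nodes costs one O(log n) binary search plus the size of its output, which gives the O(log^2 n + k) bound.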
The 3-Dimensional Range Tree
• Primary structure: Search tree on x-coordinates of points
• Every node stores a secondary structure: 2-dimensional range tree on points in the node’s slab.
Every point is stored in secondary structures of O(log n) nodes.
Space: O(n log^2 n)
Answering Queries
• Every 3-dimensional range query decomposes into O(log n) 2-dimensional range queries
• Each such query takes O(log^2 n + k′) time
• Total query complexity: O(log^3 n + k)
d-Dimensional Range Queries
• Primary structure: Search tree on x-coordinates
• Secondary structures: (d - 1)-dimensional range trees
• Space requirement: O(n log^(d-1) n)
• Query time: O(log^d n + k)
Updates are difficult!
Insertion or deletion of a point p in a 2-dim range tree requires:
1. Insertion or deletion of p into the primary range tree Tx according to the x-coordinate of p
2. For each node v on the search path to the leaf storing p in Tx, insertion or deletion of p in the associated secondary range tree Ty(v).
Keeping the primary range tree balanced is difficult, except for the case d = 1!
Rotations in the primary tree may require completely rebuilding the associated range trees along the search path!
Range Trees–Summary
• Theorem: There exists a data structure to represent a static set S of n points in d dimensions with the following properties: The data structure allows us to answer range queries in O(log^d n + k) time. The data structure occupies O(n log^(d-1) n) space.
• Note: The query complexity can be reduced to O(log^(d-1) n + k), for d ≥ 2, using a very beautiful technique called fractional cascading.
minXinRectangle Queries
Problem: Given a set P of points that changes under insertions and deletions, construct a data structure to store P that can be updated in O(log n) time and that can find the point with minimal x-coordinate in a given range [l, r] below a given threshold y0 in O(log n) time.
[Figure: the query minXinRectangle(l, r, y0) asks for the leftmost point with x-coordinate in [l, r] and y-coordinate at most y0.]
Assumption: All points have pairwise different x-coordinates
Min-augmented Range Tree
Two data structures in one:
• Leaf-search tree on x-coordinates of points
• Min-tournament tree on y-coordinates of points
[Figure: example for the points (2, 12), (3, 4), (4, 11), (5, 3), (8, 5), (11, 21), (14, 7), (15, 2), (17, 30), (21, 8). Each internal node stores a router for the x-coordinates and a min-field holding the smallest y-coordinate in its subtree.]
minXinRectangle(l, r, y0)
1. Search for the boundary values l, r.
2. Find the leftmost umbrella node with a min-field ≤ y0.
3. From this node, repeatedly proceed to the left son of the current node, if its min-field is ≤ y0, and to the right son, otherwise, until a leaf is reached.
4. Return the point at the leaf.
[Figure: the search paths to l and r below the split node.]
minXinRectangle(l, r, y0) can be found in time O(height of tree).
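The query can be sketched in Python on a statically built tree. The assumptions are loud ones: all names are hypothetical, updates and rebalancing are omitted, and every node stores the smallest and largest x-coordinate below it (xlo, xhi) in place of a router, so umbrella nodes can be recognized without an explicit search for l and r.

```python
class Node:
    def __init__(self, pts):
        """pts: non-empty list of (x, y) points sorted by x (distinct x assumed)."""
        self.xlo, self.xhi = pts[0][0], pts[-1][0]
        if len(pts) == 1:
            self.left = self.right = None
            self.point, self.min_y = pts[0], pts[0][1]
        else:
            mid = (len(pts) + 1) // 2
            self.left, self.right = Node(pts[:mid]), Node(pts[mid:])
            self.point = None
            # Min tournament: the smaller y-coordinate of the two children wins.
            self.min_y = min(self.left.min_y, self.right.min_y)

def min_x_in_rectangle(v, l, r, y0):
    """Leftmost point with l <= x <= r and y <= y0 below v, or None."""
    if v.xhi < l or v.xlo > r or v.min_y > y0:
        return None                    # no candidate in this subtree
    if l <= v.xlo and v.xhi <= r:
        # Umbrella node with min-field <= y0: walk down to the leftmost hit.
        while v.left is not None:
            v = v.left if v.left.min_y <= y0 else v.right
        return v.point
    found = min_x_in_rectangle(v.left, l, r, y0)
    if found is not None:
        return found
    return min_x_in_rectangle(v.right, l, r, y0)

points = [(2, 12), (3, 4), (4, 11), (5, 3), (8, 5),
          (11, 21), (14, 7), (15, 2), (17, 30), (21, 8)]
tree = Node(sorted(points))
```

Subtrees that cannot contain a qualifying point are cut off in O(1), so only the two boundary paths and one downward path are visited.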
Updates
Insert operation
• Insert the new node as in a standard binary leaf-search tree.
• Adjust min-fields of every ancestor of the new node by playing a min tournament for each node and its sibling along the search path.
Delete operation: Similar
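Maintaining min-fields under Rotations
[Figure: a rotation. Before: the root has min-field s1 and children with min-fields s2 and s3; the child with min-field s2 has subtrees with min-fields s4 and s5. After: the new root has min-field s1 and children with min-fields s4 and min{s5, s3}; the latter node has subtrees with min-fields s5 and s3.]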
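In Python, a right rotation that repairs the min-fields might be sketched as follows. The names are hypothetical, routers are left out for brevity, and the left rotation is symmetric.

```python
class Node:
    def __init__(self, min_y, left=None, right=None):
        # For a leaf, min_y is the y-coordinate of its point.
        self.min_y, self.left, self.right = min_y, left, right

def rotate_right(y):
    """Rotate the left child of y above y and return the new subtree root."""
    x = y.left
    y.left, x.right = x.right, y
    # The new root covers the same leaves as the old root (s1).
    x.min_y = y.min_y
    # y now covers the subtrees with min-fields s5 and s3.
    y.min_y = min(y.left.min_y, y.right.min_y)
    return x

# Leaves with y-coordinates 4, 9 and 6, so s4 = 4, s5 = 9, s3 = 6, s2 = s1 = 4.
a, b, c = Node(4), Node(9), Node(6)
old_root = Node(4, Node(4, a, b), c)
new_root = rotate_right(old_root)
```

As with subtree sizes, only the two rotated nodes are touched, so the min-fields survive any rotation-based balancing scheme at O(1) extra cost per rotation.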
Min-augmented Range Trees–Summary
• Theorem: There exists a data structure to represent a dynamic set S of n points in the plane with the following properties: The data structure supports updates and minXinRectangle(l, r, y0) queries in O(log n) time. The data structure occupies O(n) space.
• Note: The data structure can be based on an arbitrary scheme of balanced binary leaf search trees.