Western Michigan University
Department of Computer Science
CS 6310 - Advanced Data Structure
Mehdi Mohammadi
March 2015





Orthogonal Range Trees
Higher-Dimensional Segment Trees
Other Systems of Building Blocks
Range-Counting and the Semigroup Model
KD-Trees and Related Structures

Orthogonal Range-Searching problem
◦ Input: a (d-dimensional) box, a set of points
◦ Output: all the points in the set that lie in that box

Applications
◦ Geometric applications
◦ Database Queries
 Select Emp from T where 50K < Salary < 75K AND age > 50 AND salesAmount > 500K AND 2011 < salesYear < 2015
 A 4-d orthogonal range query (one dimension per attribute: Salary, age, salesAmount, salesYear)
◦ Preprocessing for queries

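General situation
◦ Set of data points $p_1, \dots, p_n$
 $p_i = (p_{i1}, \dots, p_{id})$
◦ d-dimensional query interval $[a_1, b_1[ \times \dots \times [a_d, b_d[$
◦ Return all points $p_i$ contained in that interval:
 $a_1 \le p_{i1} < b_1, \dots, a_d \le p_{id} < b_d$
 in output-sensitive time $O(f_d(n) + k)$, where $k$ is the number of points returned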

Structure
◦ Build a balanced search tree on the first coordinates of the data points
 Each node has an associated interval and stores the points whose first coordinate falls into that interval
 At each node, recursively build a range tree on the remaining d-1 coordinates of those points

Query:
◦ Find the O(log n) nodes whose intervals together correspond to $[a_1, b_1[$
◦ In each of those nodes, perform a (d-1)-dimensional range search for $[a_2, b_2[ \times \dots \times [a_d, b_d[$
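
The structure and query above can be sketched for d = 2 as follows. This is a minimal illustration, not the lecture's implementation: the names `Node`, `build` and `query` are hypothetical, and the secondary structure at each node is simplified to a list sorted by second coordinate that is searched with `bisect`.

```python
from bisect import bisect_left

class Node:
    """Node of the first-coordinate tree; covers the x-sorted slice pts[lo:hi]."""
    def __init__(self, pts, lo, hi):
        self.xmin, self.xmax = pts[lo][0], pts[hi - 1][0]
        # Associated structure: this node's points sorted by second coordinate.
        self.by_y = sorted(pts[lo:hi], key=lambda p: p[1])
        self.ys = [p[1] for p in self.by_y]
        self.left = self.right = None
        if hi - lo > 1:
            mid = (lo + hi) // 2
            self.left = Node(pts, lo, mid)
            self.right = Node(pts, mid, hi)

def build(points):
    pts = sorted(points)
    return Node(pts, 0, len(pts)) if pts else None

def query(node, a1, b1, a2, b2, out):
    """Append to out every point with a1 <= x < b1 and a2 <= y < b2."""
    if node is None or node.xmax < a1 or node.xmin >= b1:
        return  # node interval is disjoint from [a1, b1[
    if a1 <= node.xmin and node.xmax < b1:
        # Node belongs to the decomposition of [a1, b1[: 1-d search on y.
        i = bisect_left(node.ys, a2)
        j = bisect_left(node.ys, b2)
        out.extend(node.by_y[i:j])
        return
    query(node.left, a1, b1, a2, b2, out)
    query(node.right, a1, b1, a2, b2, out)

points = [(0, 1), (1, 5), (2, 8), (3, 3), (5, 0), (6, 4), (7, 6), (8, 7), (9, 9)]
root = build(points)
found = []
query(root, 1, 8, 3, 7, found)
print(sorted(found))  # [(1, 5), (3, 3), (6, 4), (7, 6)]
```

Each point is stored once per level of the first tree, so this 2-d sketch uses O(n log n) space.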

Example 2-d
◦ (0,1), (1,5), (2,8), (3,3), (5,0), (6,4), (7,6), (8,7), (9,9)
[Figure: 1-d range tree on the second coordinate for the node with points {(0,1), (1,5), (2,8)}; its nodes carry the subsets {(0,1)}, {(1,5)}, {(1,5), (2,8)} and {(2,8)}]

Theorem: Orthogonal range trees are a static data structure supporting d-dimensional orthogonal range queries in a set of n d-dimensional points
◦ Query time
 Output-sensitive
 $O((\log n)^d + k)$ if the output consists of k points
◦ Building time
 $O(n(\log n)^d)$
◦ Space requirement
 $O(n(\log n)^{d-1})$

Fractional Cascading
◦ When we make a sequence of searches in different but related sets, we can reuse the result of the search in one set to speed up the search in the next set.

Algorithm
◦ For each node, sort the points of its associated interval by second coordinate
◦ Link each point on this list to
 The same point on the list of the left or right lower neighbor
 The point with the next smaller second coordinate, if the point itself is missing on that side
 Or the first point on that list, if there is no point with a smaller second coordinate

Fractional Cascading Search
◦ We have a search tree for the first coordinate
 We have to select the nodes that form the canonical interval decomposition of the query interval in the first coordinate
◦ Attached to each node is a structure for the search in the second coordinate
 These structures are linked together for fractional cascading
◦ So we need to search only in the set associated with the first node
 Then reuse that information in all later searches
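
A minimal 2-d sketch of this search follows, under stated assumptions: the point set is non-empty, all names (`Node`, `build`, `query`, `to_left`, `to_right`) are hypothetical, and the links store, for every position in a node's list, the matching position in each child's list (a "first not smaller" convention, which differs slightly from the "next smaller" links described above). Only the root list is binary-searched; every later position is obtained by following a link.

```python
from bisect import bisect_left

class Node:
    def __init__(self):
        self.left = self.right = None
        self.pts = []       # points of this node, sorted by second coordinate
        self.to_left = []   # to_left[i]: position in left.pts matching position i
        self.to_right = []  # same for the right child

def _build(pts, lo, hi):
    node = Node()
    node.xmin, node.xmax = pts[lo][0], pts[hi - 1][0]
    if hi - lo == 1:
        node.pts = [pts[lo]]
        return node
    mid = (lo + hi) // 2
    node.left, node.right = _build(pts, lo, mid), _build(pts, mid, hi)
    L, R = node.left.pts, node.right.pts
    i = j = 0
    while i < len(L) or j < len(R):
        # Record how far each child list has been consumed so far.
        node.to_left.append(i)
        node.to_right.append(j)
        if j == len(R) or (i < len(L) and L[i][1] <= R[j][1]):
            node.pts.append(L[i])
            i += 1
        else:
            node.pts.append(R[j])
            j += 1
    node.to_left.append(i)   # links for the position past the end
    node.to_right.append(j)
    return node

def build(points):
    root = _build(sorted(points), 0, len(points))
    root.ys = [p[1] for p in root.pts]  # only the root list is ever searched
    return root

def _report(node, a1, b1, i, j, out):
    if i >= j or node.xmax < a1 or node.xmin >= b1:
        return
    if a1 <= node.xmin and node.xmax < b1:
        out.extend(node.pts[i:j])  # positions i..j-1 hold a2 <= y < b2
        return
    _report(node.left, a1, b1, node.to_left[i], node.to_left[j], out)
    _report(node.right, a1, b1, node.to_right[i], node.to_right[j], out)

def query(root, a1, b1, a2, b2):
    i = bisect_left(root.ys, a2)  # the only binary searches of the query
    j = bisect_left(root.ys, b2)
    out = []
    _report(root, a1, b1, i, j, out)
    return out

points = [(0, 1), (1, 5), (2, 8), (3, 3), (5, 0), (6, 4), (7, 6), (8, 7), (9, 9)]
root = build(points)
print(sorted(query(root, 1, 8, 3, 7)))  # [(1, 5), (3, 3), (6, 4), (7, 6)]
```

Because each child list is a subsequence of its parent's merged list, the recorded positions are exactly what a fresh binary search in the child would return, which removes one logarithmic factor from the 2-d query.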

Theorem: Orthogonal range trees with fractional cascading are a static data structure that supports d-dimensional orthogonal range queries in a set of n d-dimensional points (d > 1)
◦ Query time
 $O((\log n)^{d-1} + k)$ if the output consists of k points
◦ Building time
 $O(n(\log n)^{d-1})$
◦ Space requirement
 $O(n(\log n)^{d-1})$


The inverse of the orthogonal range-searching problem
Input:
◦ A set of n ranges (d-dimensional intervals)
◦ A query point

Output:
◦ All ranges that contain that point

Solvable by a generalization of the segment tree
◦ It is defined recursively

Main structure:
◦ A balanced search tree whose keys are the first coordinates of the d-dimensional intervals
◦ Each node of that tree contains a (d-1)-dimensional segment tree
◦ The (d-1)-dimensional segment tree associated with node p stores all intervals for which p is part of the canonical interval decomposition of their first coordinate

Query
◦ Follow the search path of the first coordinate of the query point
◦ At each node on that path, perform a (d-1)-dimensional query with the remaining coordinates in the segment tree associated with the node

Theorem: The d-dimensional segment tree is a static data structure that lists all d-dimensional intervals containing a given query point
◦ Build time: $O(n(\log n)^d)$
◦ Space: $O(n(\log n)^d)$
◦ Query time: $O((\log n)^d + k)$ if there are k such intervals
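
The recursive definition can be sketched as below. The class name `SegmentTree`, the implicit node numbering (children of node v are 2v and 2v+1) and the dictionary `store` are illustrative assumptions; the sketch also assumes a non-empty list of boxes, each a tuple of half-open `(lo, hi)` pairs with `lo < hi`.

```python
from bisect import bisect_left, bisect_right

class SegmentTree:
    """Segment tree on coordinate `dim`; each node holds a tree on dim + 1."""
    def __init__(self, boxes, dim=0):
        self.last = (dim == len(boxes[0]) - 1)
        self.dim = dim
        self.keys = sorted({e for box in boxes for e in box[dim]})
        self.m = len(self.keys) - 1   # number of elementary intervals
        self.store = {}               # node id -> boxes attached to that node
        for box in boxes:
            lo = bisect_left(self.keys, box[dim][0])
            hi = bisect_left(self.keys, box[dim][1])
            self._attach(1, 0, self.m, lo, hi, box)
        if not self.last:
            # Replace each node's list by a segment tree on the next coordinate.
            self.store = {v: SegmentTree(bs, dim + 1)
                          for v, bs in self.store.items()}

    def _attach(self, v, l, r, lo, hi, box):
        # Node v covers elementary intervals l..r-1, i.e. [keys[l], keys[r][.
        if lo <= l and r <= hi:
            self.store.setdefault(v, []).append(box)  # canonical node
            return
        mid = (l + r) // 2
        if lo < mid:
            self._attach(2 * v, l, mid, lo, hi, box)
        if hi > mid:
            self._attach(2 * v + 1, mid, r, lo, hi, box)

    def query(self, point):
        """Return all boxes that contain point."""
        idx = bisect_right(self.keys, point[self.dim]) - 1
        if idx < 0 or idx >= self.m:
            return []
        out = []
        v, l, r = 1, 0, self.m
        while True:  # follow the search path of this coordinate
            s = self.store.get(v)
            if s is not None:
                out.extend(s if self.last else s.query(point))
            if r - l == 1:
                return out
            mid = (l + r) // 2
            if idx < mid:
                v, r = 2 * v, mid
            else:
                v, l = 2 * v + 1, mid

boxes = [((0, 4), (0, 4)), ((2, 6), (1, 3)), ((5, 9), (2, 8))]
tree = SegmentTree(boxes)
print(sorted(tree.query((3, 2))))  # [((0, 4), (0, 4)), ((2, 6), (1, 3))]
```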


Improvement: S-tree using fractional cascading
Algorithm
◦ Input: rectangles $[a_i, b_i[ \times [c_i, d_i[$ for i = 1, …, n

1. Create a balanced search tree T1 for $\{a_1, b_1, a_2, b_2, \dots, a_n, b_n\}$
2. Attach an empty secondary balanced tree T2(v) to each node v of T1
3. For i = 1 to n
◦ 3.1 Start from the root of T1 and put it on a stack
◦ 3.2 Repeat as long as the stack is not empty
 Take the current node v from the stack
 Insert $c_i$ and $d_i$ as keys into the tree T2(v)
 If intervalOf(v) is not contained in $[a_i, b_i[$, check v's left and right children
 Put each child whose interval intersects $[a_i, b_i[$ on the stack

4. For each i = 1, …, n
◦ 4.1 For all nodes v that belong to the canonical interval decomposition of $[a_i, b_i[$ in T1
 Insert the rectangle $[a_i, b_i[ \times [c_i, d_i[$ into the segment tree T2(v)

5. for each node v of T1
◦ Create pointers from each leaf of T2(v) to the
corresponding leaves of T2(v->left) and T2(v->right)

6. for each node v of T1
◦ For each node w of T2(v) create a pointer to the next
node above w in T2(v) that has some rectangle
associated with it.

Theorem: The S-tree is a static data structure that keeps track of a set of n rectangles and, for a given query point, lists all rectangles containing that point
◦ Space: $O(n(\log n)^2)$
◦ Query time: $O(\log n + k)$ if k rectangles are reported

Canonical interval decomposition
◦ Decompose an interval into a union of a small number of building blocks

To answer a query on an interval
◦ Decompose the query interval into a union of building blocks
◦ Execute the query on those building blocks
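
As a small illustration of the decomposition step, the sketch below takes the building blocks to be the node intervals of an implicit balanced tree over the index range `[l, r[`. The function name `canonical_blocks` and the midpoint split are assumptions made for the example.

```python
def canonical_blocks(l, r, lo, hi):
    """Decompose [lo, hi[ into node intervals of a balanced tree over [l, r[."""
    if hi <= l or r <= lo:
        return []            # node interval is disjoint from the query
    if lo <= l and r <= hi:
        return [(l, r)]      # node interval is a building block of the query
    mid = (l + r) // 2
    return canonical_blocks(l, mid, lo, hi) + canonical_blocks(mid, r, lo, hi)

print(canonical_blocks(0, 16, 3, 11))  # [(3, 4), (4, 8), (8, 10), (10, 11)]
```

Only two nodes per level can be cut partially by the query, so the number of blocks is O(log n) for a tree over n positions.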

A building-block query requires
◦ Decomposing the query
◦ Reconstructing the answer from the answers for the building blocks
◦ A structure that answers the query for a fixed block
◦ A representation of each interval as a union of a small number of blocks

Choice of building blocks: a tradeoff
◦ Reducing every interval query to a small number of blocks requires many building blocks
 For each block we have to build a structure to answer queries

Bentley and Maurer (1980) proposal
◦ Use an r-level structure for the system of blocks
 Interpreted as writing numbers to the base $n^{1/r}$
 Top-level blocks: intervals $[a\,n^{1-1/r},\ b\,n^{1-1/r}]$ with $0 \le a < b \le n^{1/r}$, giving $O(n^{2/r})$ blocks
 Level $j$: $O(n^{(j+1)/r})$ blocks

Using r-level blocking, we obtain a structure that performs d-dimensional orthogonal range searching
◦ Query time: $O(r^d \log n + k)$
◦ Preprocessing time: $O(r^d\, n^{1+(2d-2)/r} \log n)$
◦ Query time is output sensitive for large r and n.

The range-counting problem asks just for the number of points in a range
◦ We do not need output-sensitive time complexity

Use an orthogonal range tree
◦ Instead of concatenating lists, just add up numbers
◦ Generalization: give weights to the points

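In the 1-dimensional version, just ask for the number of keys in an interval
[Figure: a search tree whose nodes are labeled with the number of keys below them (9 at the root, then 4 and 5, and so on)]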

All operations in $O(n(\log n)^d)$ for a set of n points

Difference from range searching
◦ The structure can be made dynamic
 Insertion, deletion and rebalancing
 By contrast, range-searching trees have large associated trees at their nodes
◦ Lower bounds for the operations can be proved: $\Omega((\log n)^d)$

In the semigroup version
◦ A commutative semigroup (S,+) is specified
◦ Each point is assigned a weight from S
◦ Return the semigroup sum of the weights of the keys in an interval
 Obtained directly from the canonical interval decomposition
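
A 1-dimensional sketch of the semigroup version follows. It is static, uses an array-based tree as one possible system of canonical blocks, and never relies on an identity element, since a semigroup need not have one. The class name `SemigroupRangeTree` and the convention of returning `None` for an empty range are assumptions of the example.

```python
from bisect import bisect_left

class SemigroupRangeTree:
    """Static 1-d structure: semigroup sum of the weights of keys in [a, b[."""
    def __init__(self, weighted_keys, op):
        items = sorted(weighted_keys, key=lambda kw: kw[0])
        self.keys = [k for k, _ in items]
        self.op = op                      # commutative, associative operation
        n = self.n = len(items)
        # Leaves sit at positions n..2n-1; node v combines its two children.
        self.tree = [None] * n + [w for _, w in items]
        for v in range(n - 1, 0, -1):
            self.tree[v] = op(self.tree[2 * v], self.tree[2 * v + 1])

    def query(self, a, b):
        """Combine the block sums covering [a, b[; None if no key is inside."""
        l = bisect_left(self.keys, a) + self.n
        r = bisect_left(self.keys, b) + self.n
        acc = None
        while l < r:
            if l & 1:   # l is a right child: its block lies inside the range
                acc = self.tree[l] if acc is None else self.op(acc, self.tree[l])
                l += 1
            if r & 1:   # the block just left of r lies inside the range
                r -= 1
                acc = self.tree[r] if acc is None else self.op(acc, self.tree[r])
            l >>= 1
            r >>= 1
        return acc

points = [(0, 1), (1, 5), (2, 8), (3, 3), (5, 0), (6, 4), (7, 6), (8, 7), (9, 9)]
counter = SemigroupRangeTree([(x, 1) for x, _ in points], lambda s, t: s + t)
print(counter.query(2, 7))   # 4 keys: 2, 3, 5, 6
largest = SemigroupRangeTree(points, max)
print(largest.query(2, 7))   # 8, the largest weight among those keys
```

Range counting is the special case where every weight is 1 and the operation is integer addition; swapping in `max` shows that the same decomposition serves any commutative semigroup.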

KD-Trees: another structure to support orthogonal range searching
◦ Easy to understand and implement
◦ Unsatisfactory performance
 Query time, 2-dimensional: KD-Tree $O(n^{1/2} + k)$, orthogonal range tree $O((\log n)^2 + k)$
 Query time, d-dimensional: KD-Tree $O(n^{1-1/d} + k)$, orthogonal range tree $O((\log n)^d + k)$

At each node, make one comparison to decide whether to enter the left or the right subtree
◦ Different levels compare against different coordinates
 At the root compare against x
 At the second level compare against y, and so on

Building KD-Tree

KD-Tree range query
◦ Starting at the root, descend into each node whose node interval intersects the query region
◦ Stop descending a branch when the intersection is empty

In two dimensions, the query time can be as large as $\Omega(\sqrt{n})$
◦ Even in a completely balanced tree with distinct keys
◦ This bound cannot be improved
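
Construction and range query can be sketched as follows. This is a simplified variant that stores one point in every node and prunes by comparing the query box with the node's splitting coordinate, rather than keeping explicit node intervals; the names `KDNode`, `build_kd` and `range_query` are hypothetical. Re-sorting at every level makes this build O(n (log n)^2), slower than the O(n log n) achievable with a more careful construction.

```python
class KDNode:
    def __init__(self, point, axis, left, right):
        self.point, self.axis = point, axis
        self.left, self.right = left, right

def build_kd(points, depth=0):
    """Split on coordinate depth mod d at the median; recurse on both halves."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return KDNode(pts[mid], axis,
                  build_kd(pts[:mid], depth + 1),
                  build_kd(pts[mid + 1:], depth + 1))

def range_query(node, box, out):
    """box holds one half-open (lo, hi) pair per coordinate."""
    if node is None:
        return
    if all(lo <= x < hi for x, (lo, hi) in zip(node.point, box)):
        out.append(node.point)
    lo, hi = box[node.axis]
    c = node.point[node.axis]
    if lo <= c:   # left subtree holds coordinates <= c
        range_query(node.left, box, out)
    if c < hi:    # right subtree holds coordinates >= c
        range_query(node.right, box, out)

points = [(0, 1), (1, 5), (2, 8), (3, 3), (5, 0), (6, 4), (7, 6), (8, 7), (9, 9)]
root = build_kd(points)
found = []
range_query(root, [(1, 8), (3, 7)], found)
print(sorted(found))  # [(1, 5), (3, 3), (6, 4), (7, 6)]
```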

Theorem: KD-Trees are a static data structure that supports d-dimensional orthogonal range queries in a set of n d-dimensional points
◦ Output-sensitive query time $O(n^{1-1/d} + k)$ if the output consists of k points
◦ Can be built in $O(n \log n)$ time
◦ Needs $O(n)$ space
Thank you
for your attention