Download PR Quadtrees

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Linked list wikipedia , lookup

Interval tree wikipedia , lookup

Red–black tree wikipedia , lookup

B-tree wikipedia , lookup

Binary search tree wikipedia , lookup

Quadtree wikipedia , lookup

Transcript
Organizing Spatial Data
PR Quadtrees 1
Spatial data records include a sense of location as an attribute.
Typically location is represented by coordinate data (in 2D or 3D).
If we are to search spatial data using the locations as key values, we need data structures
that efficiently represent selecting among more than two alternatives during a search.
One approach for 2D data is to employ quadtrees, in which each internal node can have
up to four children, each representing a different region obtained by decomposing the
coordinate space.
There are a variety of such quadtrees, many of which are described in:
The Quadtree and Related Hierarchical Data Structures, Hanan Samet
ACM Computing Surveys, June 1984
CS@VT
Data Structures & Algorithms
©2000-2010 McQuain
Spatial Decomposition
PR Quadtrees 2
In binary search trees, the structure of the tree depends not only upon what data values are
inserted, but also in what order they are inserted.
In contrast, the structure of a Point-Region quadtree is determined entirely by the data
values it contains, and is independent of the order of their insertion.
In effect, each node of a PR quadtree represents a particular region in a 2D coordinate
space.
Internal nodes have exactly 4 children (some may be empty), each representing a
different, congruent quadrant of the region represented by their parent node.
Internal nodes do not store data.
Leaf nodes hold a single data value. Therefore, the coordinate space is partitioned as
insertions are performed so that no region contains more than a single point.
PR quadtrees represent points in a finitely-bounded coordinate space.
CS@VT
Data Structures & Algorithms
©2000-2010 McQuain
Coordinate Space Partitioning
PR Quadtrees 3
128
Consider the collection of points
in a 256 x 256 coordinate space:
C
A
(100, 125)
B
(25, -30)
C
(-55, 80)
D
(125, -60)
E
(80, 80)
F
(-80, -8)
G
(-12, -112)
H
(-48, -112)
J
(16, 72)
K
(60, 100)
L
(48, 48)
M
(36, 8)
N
(4, 60)
P
(28, 30)
CS@VT
A
K
J
N
E
L
P
-128
M
F
128
B
D
H
G
-128
The subdivision of the coordinate space shows how it will be
partitioned as the points are added to a PR quadtree.
Data Structures & Algorithms
©2000-2010 McQuain
PR Quadtree Insertion
PR Quadtrees 4
Obviously inserting the first point, A, just results in the creation
of a leaf node holding A.
A
(100, 125)
B
(25, -30)
Inserting B causes the partitioning of the original coordinate
space into four quadrants, and the replacement of the root with
an internal node with two nonempty children:
(-128,-128) to (128,128)
NW
NE
SE
SW
(-128,0) to (0,128)
(0,0) to (128,128)
(0,-128) to (128,0)
(-128,-128) to (0,0)
none
A(100, 125)
B(25, -30)
none
The display above shows the SW and NE corners of the regions logically represented by
each node, and the data values stored in the leaf nodes.
In an implementation, nodes would not store information defining their regions explicitly,
nor would empty leaf nodes probably be allocated.
CS@VT
Data Structures & Algorithms
©2000-2010 McQuain
Leaf Splitting During Insertion
PR Quadtrees 5
Inserting C does not cause any additional partitioning of the
coordinate space since it naturally falls into an empty leaf:
(-128,-128) to (128,128)
A
(100, 125)
B
(25, -30)
C
(-55, 80)
D
(125, -60)
●
NW
NE
(-128,0) to (0,128)
(0,0) to (128,128)
C(-55, 80)
A(100, 125)
Inserting D will cause the
partitioning of the SE
quadrant in order to separate
B and D:
CS@VT
SE
SW
(0,-128) to (128,0)
NW NE
●
●
SE
SW
(0,-64) to (64,0)
(64,-64) to (128,0)
B(25, -30)
D(125, -60)
Data Structures & Algorithms
©2000-2010 McQuain
Multiple Splitting
PR Quadtrees 6
Suppose the value E(80, 80) is now inserted into the tree. It falls in the same region as
the point A, (0, 0) to (128, 128).
128
A
However, dividing that region creates three empty
regions, and the region (64, 64) to (128, 128) in
which both A and E lie.
E
128
So, that region must be partitioned again. This separates A and E into two separate
regions (see illustration on the slide "Coordinate Space Partitioning").
If it had not, then the region in which they both occurred would be partitioned again, and
again if necessary, until they are separated.
CS@VT
Data Structures & Algorithms
©2000-2010 McQuain
Multiple Splitting
PR Quadtrees 7
Inserting E results in the following tree:
(-128,-128) to (128,128)
●
NW
(0,0) to (128,128)
(-128,-128) to (0,0)
●
C(-55, 80)
NW
NE
●
●
SE
SW
●
●
CS@VT
NE
SE
SE
(100, 125)
B
(25, -30)
C
(-55, 80)
D
(125, -60)
E
(80, 80)
SW
(-128,-128) to (128,0)
(64,64) to (128,128)
NW
NE
A
SW
NW NE
●
●
SE
SW
(0,-64) to (64,0)
(64,-64) to (128,0)
B(25, -30)
D(125, -60)
(96,96) to (128,128)
(64,64) to (96,96)
A(100, 125)
E(80, 80)
Data Structures & Algorithms
©2000-2010 McQuain
PR Quadtree Insertion
PR Quadtrees 8
Insertion proceeds recursively, descending until the appropriate leaf node (possibly
empty) is found, and then partitioning and descending until there is no more than one
point within the region represented by each leaf.
It is possible for a single insertion to add many levels to the relevant subtree, if points lie
close enough together.
Of course, it is also possible for an insertion to require no splitting whatsoever.
The shape of the tree is entirely independent of the order in which the data elements are
added to it.
CS@VT
Data Structures & Algorithms
©2000-2010 McQuain
PR Quadtree Deletion
PR Quadtrees 9
Deletion always involves removing a leaf node. Consider deleting A from the following
tree:
(-128,-128) to (128,128)
●
NW
CS@VT
NE
(-128,0) to (0,128)
(0,0) to (128,128)
C(-55, 80)
A(100, 125)
SE
SW
(0,-128) to (128,0)
NW NE
●
●
SE
SW
(0,-64) to (64,0)
(64,-64) to (128,0)
B(25, -30)
D(125, -60)
Data Structures & Algorithms
©2000-2010 McQuain
PR Quadtree Deletion
PR Quadtrees 10
Deletion always involves removing a leaf node. Consider deleting A from the following
tree:
(-128,-128) to (128,128)
●
●
NW
(-128,0) to (0,128)
(0,0) to (128,128)
C(-55, 80)
A(100, 125)
Setting the parent's pointer to null
removes the leaf node from the tree,
and no further action is required…
CS@VT
NE
SE
SW
(0,-128) to (128,0)
NW NE
●
●
SE
SW
(0,-64) to (64,0)
(64,-64) to (128,0)
B(25, -30)
D(125, -60)
Data Structures & Algorithms
©2000-2010 McQuain
PR Quadtree Deletion
PR Quadtrees 11
On the other hand, deleting a leaf node may cause its parent to "underflow":
Consider deleting B:
(-128,-128) to (128,128)
●
●
NW
NE
SE
SW
(0,-128) to (128,0)
(-128,0) to (0,128)
●
C(-55, 80)
NW NE
Now, the parent node has only one
(nonempty) child, and hence there is no
reason that it must be split…
●
●
SE
SW
(0,-64) to (64,0)
(64,-64) to (128,0)
B(25, -30)
D(125, -60)
… so, we can "contract" the branch by
replacing the parent (internal) node with
the remaining child (leaf)…
CS@VT
Data Structures & Algorithms
©2000-2010 McQuain
PR Quadtree Deletion
PR Quadtrees 12
Contracting the branch results in:
(-128,-128) to (128,128)
●
●
NW
NE
SE
(-128,0) to (0,128)
C(-55, 80)
SW
(64,-64) to (128,0)
D(125, -60)
Of course, if the root node of this tree did not have another child, then the branch
contraction could continue…
CS@VT
Data Structures & Algorithms
©2000-2010 McQuain
PR Quadtree Achilles' Heel
PR Quadtrees 13
If a data point that is inserted lies very close to another data point in the tree, it is possible
that many levels of partitioning will be required in order to separate them.
A
A
B
The minimum height of a PR quadtree is can be as large as
C
B
  2s  
log 
 
  d 
where s is the length of a side of the "world" and d is the minimum distance between any
two data points in the tree.
CS@VT
Data Structures & Algorithms
©2000-2010 McQuain
PR Quadtree Using Buckets
PR Quadtrees 14
The problem of "stalky" PR quadtree branches can be alleviated by allowing each leaf
node to store more than one data object, making the leaf a "bucket".
For example, if the quadtree leaf can store 5 data elements then it does not have to split
until we have 6 data points that fall within its region.
This complicates the implementation of the tree slightly, but can substantially reduce the
cost of search operations.
If buckets are used, then the minimum height of a PR quadtree is can be as large as
  2s  
log 
 
d
 

where s is the length of a side of the "world" and d is the minimum side of a square that
contains more data elements than a bucket can hold.
CS@VT
Data Structures & Algorithms
©2000-2010 McQuain