Spatial Query Processing
• Spatial DBs do not have a set of operators that are considered to be basic elements in a query evaluation.
• Spatial DBs handle a large set of complex data, which are not sorted in a dimension.
• Complex algorithms are needed for evaluating spatial predicates.
• It is not possible to assume that the computational cost in the query processing is only associated with I/O.
Spatial Operations
• Update operations
• Selection operations:
  – Point query (PQ): given a query point p, find all objects O that contain it:
      PQ(p) = { O | {p} ∩ O.G ≠ Ø }
  – Range or region query (WQ): given a query polygon P, find all objects O that intersect P. When P is rectangular, it is called a window query.
      WQ(P) = { O | O.G ∩ P.G ≠ Ø }
• Spatial aggregation: a variant of the nearest-neighbor search. Given an object o’, find the objects o that have minimum distance to o’.
      NNQ(o’) = { o | ∀o’’: dist(o’.G, o.G) ≤ dist(o’.G, o’’.G) }
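The three selection operations above can be sketched in a few lines of Python. This is a minimal illustration, not an indexed implementation: it assumes every geometry O.G is an axis-aligned rectangle, scans all objects linearly, and excludes the query object itself from its own nearest-neighbor result. The names `Rect`, `point_query`, `window_query`, and `nearest_neighbors` are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle standing in for an object's geometry O.G (assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains_point(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax

    def intersects(self, other):
        # Closed rectangles: touching boundaries count as intersecting.
        return (self.xmin <= other.xmax and other.xmin <= self.xmax and
                self.ymin <= other.ymax and other.ymin <= self.ymax)

    def distance_to(self, other):
        """Smallest Euclidean distance between two rectangles (0 if they intersect)."""
        dx = max(other.xmin - self.xmax, self.xmin - other.xmax, 0.0)
        dy = max(other.ymin - self.ymax, self.ymin - other.ymax, 0.0)
        return hypot(dx, dy)


def point_query(objects, x, y):
    """PQ(p): names of all objects whose geometry contains the point (x, y)."""
    return [name for name, g in objects.items() if g.contains_point(x, y)]


def window_query(objects, window):
    """WQ(P): names of all objects whose geometry intersects the window."""
    return [name for name, g in objects.items() if g.intersects(window)]


def nearest_neighbors(objects, query_name):
    """NNQ(o'): names of the objects at minimum distance from the query object."""
    q = objects[query_name]
    others = {n: g for n, g in objects.items() if n != query_name}
    best = min(q.distance_to(g) for g in others.values())
    return [n for n, g in others.items() if q.distance_to(g) == best]


# Hypothetical data for illustration only.
objects = {
    "a": Rect(0, 0, 2, 2),
    "b": Rect(1, 1, 3, 3),
    "c": Rect(5, 5, 6, 6),
}
```

For example, `point_query(objects, 1.5, 1.5)` returns the two overlapping rectangles `a` and `b`, while `nearest_neighbors(objects, "c")` returns only `b`.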
Spatial Operations
• Spatial join: the join is one of the most important operators in relational databases. When two tables R and S are joined on a spatial predicate θ, the join is called a spatial join. A variant of this operator in GIS is the map overlay, which combines two sets of spatial objects to create a new set. The boundaries of the new objects are determined by the nonspatial attributes assigned by the overlay operation. For example, if the operation assigns the same value of a nonspatial attribute to two adjacent objects, they are merged.
      R ⋈θ S = { (o, o’) | o ∈ R, o’ ∈ S, θ(o.G, o’.G) }
  Some spatial predicates are: intersection, northeast, distance, overlap, meets, adjacent, contains, and so on.
Techniques of Query Processing
• Selection:
  – Unsorted data and no index
  – Spatial indexing
  – Rank = (selectivity - 1) / differential cost
      selectivity(p) = cardinality(output(p)) / cardinality(input(p))
      differential cost is the cost of evaluating the predicate p on one tuple.
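As a rough illustration of how the rank can be used, the sketch below orders a set of selection predicates by ascending rank, reading the formula as rank = (selectivity - 1) / differential cost. It assumes the predicates are independent and that cost is charged per input tuple; the selectivities and costs are invented numbers, and `Predicate`, `order_by_rank`, and `expected_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Predicate:
    name: str
    selectivity: float  # cardinality(output) / cardinality(input)
    cost: float         # differential cost: cost of evaluating it on one tuple

    @property
    def rank(self):
        return (self.selectivity - 1.0) / self.cost


def order_by_rank(predicates):
    """Cheap, highly selective predicates get the lowest rank and run first."""
    return sorted(predicates, key=lambda p: p.rank)


def expected_cost(predicates, n_tuples=1.0):
    """Expected total cost of applying the predicates in the given order."""
    total, remaining = 0.0, n_tuples
    for p in predicates:
        total += remaining * p.cost      # every surviving tuple pays this predicate
        remaining *= p.selectivity       # only a fraction survives to the next one
    return total


# Hypothetical predicates: one cheap nonspatial test and two costly spatial ones.
preds = [
    Predicate("distance < 50", selectivity=0.2, cost=50.0),
    Predicate("area > 20", selectivity=0.5, cost=10.0),
    Predicate("nombre = 'camping'", selectivity=0.1, cost=1.0),
]
ordered = order_by_rank(preds)
```

With these numbers the nonspatial predicate is evaluated first, which matches the heuristic of applying nonspatial selections before spatial ones.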
Techniques of Query Processing
• Nearest neighbor:
  One approach to this type of query combines two distance measures, search pruning criteria, and a search algorithm.
  Min-distance(P,R) is zero if P is inside R or on its boundary. If P is outside R, min-distance(P,R) is the Euclidean distance between P and the closest point on the boundary of R.
  Min-max distance(P,R) is the smallest, over the faces of R that contain the vertex of R closest to P, of the distance from P to the farthest point on that face. The construction of the R-tree guarantees that there is an object O inside R such that distance(O,P) ≤ min-max distance(P,R).
  Some search pruning strategies are:
  – An MBR M can be eliminated if there is another MBR M’ such that min-distance(P,M) > min-max distance(P,M’)
  – An MBR M can be eliminated if there is an object O such that distance(P,O) < min-distance(P,M)
  – An object O can be eliminated if there is an MBR M such that distance(P,O) > min-max distance(P,M)
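The two distance measures and the first pruning rule can be sketched as follows. This is an illustration under stated assumptions: a rectangle is given by its low and high corner, points and corners are tuples of equal length, and the function names are hypothetical. The min-max distance is computed per dimension: take the nearer face along that dimension, find the farthest point on it, and keep the smallest such distance.

```python
from math import sqrt


def min_distance(p, lo, hi):
    """Distance from point p to the rectangle [lo, hi]; 0 if p is inside or on the boundary."""
    s = 0.0
    for pi, l, h in zip(p, lo, hi):
        if pi < l:
            s += (l - pi) ** 2
        elif pi > h:
            s += (pi - h) ** 2
    return sqrt(s)


def min_max_distance(p, lo, hi):
    """Upper bound on the distance from p to the nearest object enclosed by the MBR."""
    # Per dimension: coordinate of the nearer and of the farther rectangle side.
    near = [l if pi <= (l + h) / 2 else h for pi, l, h in zip(p, lo, hi)]
    far = [l if pi >= (l + h) / 2 else h for pi, l, h in zip(p, lo, hi)]
    far_sq = [(pi - f) ** 2 for pi, f in zip(p, far)]
    total_far = sum(far_sq)
    # For face k: nearer side in dimension k, farther side in every other dimension.
    best = min(total_far - far_sq[k] + (p[k] - near[k]) ** 2
               for k in range(len(p)))
    return sqrt(best)


def can_prune_mbr(p, mbr, other_mbr):
    """First pruning rule: drop mbr if min-distance(P,M) > min-max distance(P,M')."""
    return min_distance(p, *mbr) > min_max_distance(p, *other_mbr)
```

For the point (0, 0) and the rectangle with corners (1, 1) and (3, 2), the min-distance is the distance to the corner (1, 1) and the min-max distance is the distance to (1, 2), the farthest point on the nearer vertical face.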
Techniques of Query Processing
• Join: a join is defined as the cross product followed by a selection condition. This is especially expensive for spatial databases.
  A spatial join is evaluated with a filter step followed by a refinement step. The following algorithms concentrate on the spatial operations over rectangles (MBRs).
JOIN Algorithms
• Nested loop
      for all tuples f ∈ F
        for all tuples r ∈ R
          if overlap(f.Geom, r.Geom)
            then add <f,r> to result
  If F occupies M pages with pf tuples per page, and R occupies N pages with pr tuples per page, the cost of comparing every pair of tuples is prohibitive. With B buffers in memory, one can load B-2 pages of F at a time, leaving one buffer for scanning R and one for the results <f,r>.
  An alternative is to use each tuple in F as a window query over an indexed R.
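The nested loop and the effect of the buffers can be sketched as below. The join works on MBRs given as `(xmin, ymin, xmax, ymax)` tuples, which is an assumption of this example. The two page-count functions use the standard textbook estimates for tuple-at-a-time and block nested loops; they count page reads only and ignore the CPU cost of the overlap tests. All function names are hypothetical.

```python
from math import ceil


def overlap(a, b):
    """True if two MBRs (xmin, ymin, xmax, ymax) intersect; touching counts."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def nested_loop_join(F, R):
    """F and R are lists of (oid, mbr); returns every overlapping (f_oid, r_oid) pair."""
    result = []
    for f_id, f_geom in F:
        for r_id, r_geom in R:
            if overlap(f_geom, r_geom):
                result.append((f_id, r_id))
    return result


def tuple_nested_loop_page_reads(M, N, pf):
    """Read F once and scan all N pages of R for each of the M * pf tuples of F."""
    return M + M * pf * N


def block_nested_loop_page_reads(M, N, B):
    """Read F once in blocks of B-2 pages and scan R once per block."""
    if B < 3:
        raise ValueError("need at least 3 buffers: F block, R page, output")
    return M + ceil(M / (B - 2)) * N
```

With M = 1000, N = 500, pf = 100 and B = 102, the tuple-at-a-time estimate is about 50 million page reads, while the block variant needs 6000.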
JOIN Algorithms
• Tree matching: both tables are indexed.
      SJ(R1, R2: nodes)
        forall er2 in R2 [
          forall er1 in R1 [
            if overlap(er1.rect, er2.rect) then [
              if R1 and R2 are leaf pages then
                output(er1.oid, er2.oid)
              else if R1 is leaf page then [
                ReadPage(er2.ptr); SJ(er1.ptr, er2.ptr) ]
              else if R2 is leaf page then [
                ReadPage(er1.ptr); SJ(er1.ptr, er2.ptr) ]
              else [
                ReadPage(er1.ptr), ReadPage(er2.ptr)
                SJ(er1.ptr, er2.ptr) ]
            ]
          ]
        ]
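A runnable version of the tree-matching idea might look like the sketch below. It keeps both trees in memory, so the ReadPage calls disappear. When the trees have different heights, it makes one assumption the pseudocode leaves open: the leaf entry is wrapped in a one-entry leaf node and joined against the subtree of the other entry, which avoids reporting a pair more than once. `Entry`, `Node`, and `spatial_join` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    rect: tuple              # MBR as (xmin, ymin, xmax, ymax)
    oid: object = None       # object id, set in leaf entries
    child: "Node" = None     # child node, set in inner entries


@dataclass
class Node:
    is_leaf: bool
    entries: list = field(default_factory=list)


def overlap(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def spatial_join(n1, n2, out):
    """Append to out every (oid1, oid2) whose MBRs overlap, descending both trees."""
    for e2 in n2.entries:
        for e1 in n1.entries:
            if not overlap(e1.rect, e2.rect):
                continue  # prune: nothing below these two entries can overlap
            if n1.is_leaf and n2.is_leaf:
                out.append((e1.oid, e2.oid))
            elif n1.is_leaf:
                # Assumption: join just this leaf entry with the subtree under e2.
                spatial_join(Node(True, [e1]), e2.child, out)
            elif n2.is_leaf:
                spatial_join(e1.child, Node(True, [e2]), out)
            else:
                spatial_join(e1.child, e2.child, out)
    return out


# Hypothetical trees: tree A has two levels, tree B is a single leaf.
leaf1 = Node(True, [Entry((0, 0, 1, 1), oid="a1"), Entry((2, 2, 3, 3), oid="a2")])
leaf2 = Node(True, [Entry((10, 10, 11, 11), oid="a3")])
root_a = Node(False, [Entry((0, 0, 3, 3), child=leaf1),
                      Entry((10, 10, 11, 11), child=leaf2)])
root_b = Node(True, [Entry((0.5, 0.5, 2.5, 2.5), oid="b1"),
                     Entry((10.5, 10.5, 12, 12), oid="b2"),
                     Entry((20, 20, 21, 21), oid="b3")])
pairs = spatial_join(root_a, root_b, [])
```

The pruning comes from the overlap test on inner entries: the subtree under `leaf2` is never compared with `b1`, because their enclosing rectangles do not intersect.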
JOIN Algorithms
• Partition-Based Spatial Merge Join
• Filter step: given two relations F and R:
  – For each tuple in F and R, form the key-pointer element consisting of the unique id OID and the MBR. Call these relations Fkp and Rkp.
  – If both relations Fkp and Rkp fit in main memory, the operation can be processed with a plane-sweep algorithm.
  – If the relations do not fit in memory, partition both relations into P parts.
• Partition: the partition must satisfy the following constraints:
  – For each part Fkp_i, every element of Rkp that overlaps an element of Fkp_i lies in Rkp_i.
  – Both Fkp_i and Rkp_i fit in main memory.
Plane sweep: intersection of polygons
[Figure: plane sweep over the polygons; sweep line labeled l]
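For the in-memory case of the filter step, a plane sweep over the MBRs can be sketched as follows. This is a simplified variant, not necessarily the exact algorithm the slides have in mind: rectangles from both relations are visited in order of their left edge, and each new rectangle is checked only against the rectangles of the other relation that the sweep line still crosses. Inputs are lists of `(oid, (xmin, ymin, xmax, ymax))`, and the function name is hypothetical.

```python
def plane_sweep_join(F, R):
    """Return every (f_oid, r_oid) whose MBRs overlap, sweeping along the x axis."""
    # Each event: (xmin, side, oid, mbr); side 0 marks F, side 1 marks R.
    events = sorted(
        [(mbr[0], 0, oid, mbr) for oid, mbr in F] +
        [(mbr[0], 1, oid, mbr) for oid, mbr in R],
        key=lambda e: (e[0], e[1]),
    )
    active = ([], [])  # rectangles of F and of R still crossed by the sweep line
    result = []
    for x, side, oid, mbr in events:
        other = 1 - side
        # Drop rectangles of the other relation that end before the sweep line.
        active[other][:] = [(o, m) for o, m in active[other] if m[2] >= x]
        for o, m in active[other]:
            # x-extents already overlap; only the y-extents need checking.
            if m[1] <= mbr[3] and mbr[1] <= m[3]:
                result.append((oid, o) if side == 0 else (o, oid))
        active[side].append((oid, mbr))
    return result


# Hypothetical key-pointer relations Fkp and Rkp.
Fkp = [("f1", (0, 0, 2, 2)), ("f2", (5, 5, 6, 6))]
Rkp = [("r1", (1, 1, 3, 3)), ("r2", (2, 0, 4, 1)), ("r3", (7, 7, 8, 8))]
candidates = plane_sweep_join(Fkp, Rkp)
```

Each pair is reported once, at the moment the rectangle with the larger left edge enters the sweep. The output is a candidate set that the refinement step would still have to check against the exact geometries.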
Optimization
In a traditional DB, the computational cost of a query is defined in terms of I/O. In a spatial DB, in contrast, the system deals with complex data, which makes the definition of a query plan and its optimization more relevant.
The query optimizer generates different evaluation plans and selects one. Often the selected plan is not the best one, but at least it is not the worst. The activities of the optimizer can be classified into logical transformation and dynamic programming.
Schema of Query Optimizer
[Figure: block diagram of the query optimizer. Labels: Query; Parser; SQL Grammar; Abstract Data Types; Optimizer; Logical Transformation; Decomposition; Dynamic Programming; Heuristic Rule; Nonspatial; Spatial; Hybrid; Architecture Specification; System Catalog (selectivity, Index, CPU, Bfr); Cost Function; Nonspatial Evaluation; Merge; Spatial]
Query Optimizer
• Parsing: before the optimizer can operate, the high-level declarative statement must be scanned by a parser. In a traditional DB, the data types and functions are fixed and parsers are relatively simple. Spatial DBs are extended with user-defined types, so their parsers are more complicated.
      SELECT L.nombre
      FROM Lago L, Servicio Fa
      WHERE Area(L.Geometry) > 20 AND
            Fa.nombre = 'camping' AND
            Distance(Fa.Geometry, L.Geometry) < 50
Query Tree
      π L.nombre
        σ Area(L.Geometry) > 20
          σ Fa.nombre = 'camping'
            ⋈ Distance(Fa.Geometry, L.Geometry) < 50
              Lago L
              Servicio Fa
Query Optimizer
• Logical transformation: the strategy derived directly from the parser can be very inefficient. The join operation is very expensive, and its cost grows with the size of its inputs. Thus, it is better to reduce the size of the inputs before the join. One option is to move the selection on the nonspatial attribute down in the query tree.
      π L.nombre
        σ Area(L.Geometry) > 20
          ⋈ Distance(Fa.Geometry, L.Geometry) < 50
            Lago L
            σ Fa.nombre = 'camping'
              Servicio Fa
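The effect of moving the nonspatial selection below the join can be illustrated with a small experiment. The sketch below uses invented data, treats lakes and facilities as points with a precomputed area, and counts how many times the distance predicate is evaluated under each plan. Everything here (data, tuple layout, function names) is a hypothetical stand-in for the Lago and Servicio tables.

```python
from math import hypot

# Hypothetical rows: lakes as (nombre, area, x, y), facilities as (nombre, x, y).
lakes = [
    ("Azul", 30.0, 0.0, 0.0),
    ("Verde", 10.0, 100.0, 0.0),
    ("Negro", 25.0, 0.0, 200.0),
]
facilities = [
    ("camping", 10.0, 10.0),
    ("hotel", 5.0, 5.0),
    ("camping", 100.0, 20.0),
    ("camping", 300.0, 300.0),
]


def distance(lake, facility):
    return hypot(lake[2] - facility[1], lake[3] - facility[2])


def plan_join_first():
    """Join every lake with every facility, then apply both selections."""
    calls, out = 0, []
    for l in lakes:
        for f in facilities:
            calls += 1  # one evaluation of the distance predicate
            if distance(l, f) < 50 and f[0] == "camping" and l[1] > 20:
                out.append(l[0])
    return out, calls


def plan_selection_pushed_down():
    """Filter facilities on the nonspatial attribute first, then join."""
    camping = [f for f in facilities if f[0] == "camping"]
    calls, out = 0, []
    for l in lakes:
        for f in camping:
            calls += 1
            if distance(l, f) < 50 and l[1] > 20:
                out.append(l[0])
    return out, calls
```

Both plans return the same lakes, but the second one evaluates the distance predicate on 9 pairs instead of 12, because the hotel row never reaches the join.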
Transformations
• In this step, the tree is mapped onto equivalent trees by using a set of formal rules inherited from relational algebra.
• The trees are enumerated, and heuristics filter out candidates that are obviously not recommended. The general rule in this case is "move the nonspatial operators SELECT and PROJECT down in the tree." For each alternative it is possible to define the rank.
      Rank = (selectivity - 1) / differential cost
      selectivity(p) = cardinality(output(p)) / cardinality(input(p))
  The space of alternatives is generated with rules of relational algebra based on notions of commutativity, associativity and distributivity.
Equivalence Rules
• Selection
      σ_{c1 ∧ c2 ∧ … ∧ cn}(R) ≡ σ_c1(σ_c2(…(σ_cn(R))…))    All nonspatial selections are moved to the right.
      σ_c1(σ_c2(R)) ≡ σ_c2(σ_c1(R))    Nonspatial selection is applied before spatial selection.
• Projection
      If a1, …, an are sets of attributes such that ai ⊆ ai+1 for i = 1, …, n-1, then
      π_a1(R) ≡ π_a1(π_a2(…(π_an(R))…))
Equivalence Rules
• Cross Product and Join
      Commutativity:
      R ⋈ S ≡ S ⋈ R
      Associativity:
      R ⋈ (S ⋈ T) ≡ (R ⋈ S) ⋈ T
      Implication:
      (R ⋈ T) ⋈ S ≡ (T ⋈ R) ⋈ S
• Selection, Projection and Join
      If the selection condition c involves only attributes retained by the projection operator:
      π_a(σ_c(R)) ≡ σ_c(π_a(R))
Equivalence Rules
• Selection, Projection and Join
      If a selection condition c involves attributes that appear only in R and not in S, then:
      σ_c(R ⋈ S) ≡ σ_c(R) ⋈ S
      Projection can be pushed through the join:
      π_a(R ⋈ S) ≡ π_a1(R) ⋈ π_a2(S)
      where a1 is the subset of a that appears in R, and a2 is the subset of a that appears in S. This holds when the join condition uses only attributes in a.
Query Optimizer
• Dynamic programming: the technique that selects an evaluation plan. The selection is carried out with the goal of minimizing the computational cost. The factors to consider are:
  – Access cost
  – Storage cost
  – CPU cost
  – Communication cost
• Catalogs: they keep the information needed for computing the cost.
• Cost function:
      Cost = Expression(records-examined) + K * Expression(pages-read)
  where K is the weighting factor between CPU and I/O cost.
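A toy version of the cost-based choice is sketched below. It assumes the two expressions are simply the raw counts of records examined and pages read, which the slides do not state, and the candidate plans and their numbers are invented. `Plan`, `plan_cost`, and `cheapest` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    records_examined: int  # stands in for the CPU work
    pages_read: int        # stands in for the I/O work


def plan_cost(plan, k):
    """Cost = records examined + K * pages read (identity expressions assumed)."""
    return plan.records_examined + k * plan.pages_read


def cheapest(plans, k):
    """Pick the candidate plan with the lowest estimated cost."""
    return min(plans, key=lambda p: plan_cost(p, k))


# Hypothetical candidates: one plan reads more pages, the other examines more records.
plans = [
    Plan("A", records_examined=10_000, pages_read=500),
    Plan("B", records_examined=40_000, pages_read=100),
]
```

With K = 10 plan A is cheaper (15,000 against 41,000), while with K = 100 the choice flips to plan B (50,000 against 60,000), which shows how the weighting factor decides whether CPU or I/O dominates.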
Execution Plan
Example:
      SELECT F.nombre
      FROM Bosque F, Rios R
      WHERE Overlap(F.Geometry, R.Geometry) AND
            Intersect(F.Geometry, :WINDOW)

      π F.nombre (on-the-fly)
        σ Intersect(F.Geometry, :WINDOW) (R-tree index)
          ⋈ Overlap(F.Geometry, R.Geometry) (tree-matching join)
            Bosque F
            Rios R