Query Processing and Advanced Queries
Query Processing (2)
Review: Query Processing
SQL query
→ parse → parse tree
→ convert → logical query plan (l.q.p.)
→ query rewrite → “improved” l.q.p.
→ estimate result sizes → l.q.p. + sizes
→ consider physical plans → {P1, P2, …}
→ estimate costs → {(P1, C1), (P2, C2), …}
→ pick best → Pi
→ execute → answer
CMPT 454: Database Systems II – Query Processing (2)
Parsing
Grammar for SQL
The following grammar describes a simple subset of SQL.
Queries
<Query> ::= SELECT <SelList> FROM <FromList> WHERE <Condition> ;
Selection lists
<SelList> ::= <Attribute>, <SelList>
<SelList> ::= <Attribute>
From lists
<FromList> ::= <Relation>, <FromList>
<FromList> ::= <Relation>
Parsing
Grammar for SQL
Conditions
<Condition> ::= <Condition> AND <Condition>
<Condition> ::= <Attribute> IN (<Query>)
<Condition> ::= <Attribute> = <Attribute>
<Condition> ::= <Attribute> LIKE <Pattern>
Syntactic categories <Relation> and <Attribute> are not defined by grammar rules, but by the database schema.
Syntactic category <Pattern> is defined as some regular expression.
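The grammar can be turned into a small recursive-descent parser. The following Python sketch is one possible reading of it, not the parser of any real system. The names tokenize, Parser and parse are hypothetical. It assumes uppercase keywords, treats any non-keyword identifier as a valid <Attribute> or <Relation> (no schema check), accepts any quoted string as a <Pattern>, and requires the terminating semicolon only at the top level, not inside a subquery.

```python
import re

# Assumed token shapes: quoted patterns, identifiers/keywords, punctuation.
TOKEN = re.compile(r"\s*('[^']*'|[A-Za-z_]\w*|[=,;()])")
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "IN", "LIKE"}


def tokenize(sql):
    tokens, pos = [], 0
    sql = sql.strip()
    while pos < len(sql):
        match = TOKEN.match(sql, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def expect(self, literal):
        if self.peek() != literal:
            raise SyntaxError(f"expected {literal!r}, found {self.peek()!r}")
        self.i += 1

    def name(self, category):
        # <Attribute> and <Relation> come from the schema; here any
        # non-keyword identifier is accepted (assumption).
        tok = self.peek()
        is_identifier = tok is not None and re.fullmatch(r"[A-Za-z_]\w*", tok)
        if not is_identifier or tok.upper() in KEYWORDS:
            raise SyntaxError(f"expected {category}, found {tok!r}")
        self.i += 1
        return (category, tok)

    def query(self):
        self.expect("SELECT")
        sel = self.name_list("<SelList>", "<Attribute>")
        self.expect("FROM")
        frm = self.name_list("<FromList>", "<Relation>")
        self.expect("WHERE")
        return ("<Query>", sel, frm, self.condition())

    def name_list(self, list_category, item_category):
        head = self.name(item_category)
        if self.peek() == ",":
            self.expect(",")
            return (list_category, head,
                    self.name_list(list_category, item_category))
        return (list_category, head)

    def condition(self):
        # The left-recursive AND rule is handled with a loop.
        left = self.simple_condition()
        while self.peek() == "AND":
            self.expect("AND")
            left = ("<Condition>", left, "AND", self.simple_condition())
        return left

    def simple_condition(self):
        attr = self.name("<Attribute>")
        op = self.peek()
        if op == "IN":
            self.expect("IN")
            self.expect("(")
            sub = self.query()
            self.expect(")")
            return ("<Condition>", attr, "IN", sub)
        if op == "=":
            self.expect("=")
            return ("<Condition>", attr, "=", self.name("<Attribute>"))
        if op == "LIKE":
            self.expect("LIKE")
            pattern = self.peek()
            if pattern is None or not pattern.startswith("'"):
                raise SyntaxError("expected a quoted pattern")
            self.i += 1
            return ("<Condition>", attr, "LIKE", ("<Pattern>", pattern))
        raise SyntaxError(f"expected IN, = or LIKE, found {op!r}")


def parse(sql):
    parser = Parser(tokenize(sql))
    tree = parser.query()
    parser.expect(";")
    if parser.peek() is not None:
        raise SyntaxError("trailing input after ';'")
    return tree
```

Calling parse on a query string returns the parse tree as nested tuples, with the syntactic category as the first element of each node.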
Example: A SQL Query
StarsIn (movieTitle, movieYear, starName)
MovieStar (name, address, gender, birthdate)
Goal: find the movies with stars born in 1960
SELECT movieTitle
FROM StarsIn, MovieStar
WHERE starName = name AND birthdate LIKE '%1960';
A Parse Tree
<Query>
  SELECT
  <SelList>
    <Attribute>: movieTitle
  FROM
  <FromList>
    <Relation>: StarsIn
    ,
    <FromList>
      <Relation>: MovieStar
  WHERE
  <Condition>
    <Condition>
      <Attribute>: starName
      =
      <Attribute>: name
    AND
    <Condition>
      <Attribute>: birthdate
      LIKE
      <Pattern>: '%1960'
Conversion to Logical Query Plan
How do we convert a parse tree into a logical query plan, i.e. a relational algebra expression?
Queries with conditions without subqueries are easy:
Form the Cartesian product of all relations in <FromList>.
Apply a selection $\sigma_C$, where C is given by <Condition>.
Finally, apply a projection $\pi_L$, where L is the list of attributes in <SelList>.
Queries involving subqueries are more difficult.
Remove subqueries from conditions and represent them by a two-argument selection in the logical query plan.
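The three steps for subquery-free queries can be sketched in Python. This is a minimal illustration, not a real planner: the function to_logical_plan and the nested-tuple plan representation ("product", "select", "project") are hypothetical, and the condition is kept as an opaque string.

```python
from functools import reduce


def to_logical_plan(sel_list, from_list, condition):
    """Build a relational algebra expression as nested tuples.

    Assumes the query has no subqueries. Plan nodes are hypothetical:
    ("product", left, right), ("select", condition, child),
    ("project", attributes, child); a base relation is just its name.
    """
    # Step 1: Cartesian product of all relations in <FromList>.
    product = reduce(lambda left, right: ("product", left, right), from_list)
    # Step 2: selection with the condition C taken from <Condition>.
    selection = ("select", condition, product)
    # Step 3: projection onto the attribute list L from <SelList>.
    return ("project", tuple(sel_list), selection)


plan = to_logical_plan(
    sel_list=["movieTitle"],
    from_list=["StarsIn", "MovieStar"],
    condition="starName = name AND birthdate LIKE '%1960'",
)
print(plan)
```

Reading the resulting tuple from the outside in gives projection, then selection, then product, which is the same top-down order as an algebraic expression tree.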
An Algebraic Expression Tree
$\pi_{movieTitle}$
  $\sigma_{\text{starName = name AND birthdate LIKE '\%1960'}}$
    $\times$
      StarsIn
      MovieStar
Another SQL Query
StarsIn (movieTitle, movieYear, starName)
MovieStar (name, address, gender, birthdate)
Goal: find the movies with stars born in 1960
SELECT movieTitle
FROM StarsIn
WHERE starName IN (
    SELECT name
    FROM MovieStar
    WHERE birthdate LIKE '%1960');
Another Parse Tree
<Query>
  SELECT
  <SelList>
    <Attribute>: movieTitle
  FROM
  <FromList>
    <Relation>: StarsIn
  WHERE
  <Condition>
    <Attribute>: starName
    IN
    (
    <Query>
      SELECT
      <SelList>
        <Attribute>: name
      FROM
      <FromList>
        <Relation>: MovieStar
      WHERE
      <Condition>
        <Attribute>: birthdate
        LIKE
        <Pattern>: '%1960'
    )
Another Algebraic Expression Tree
$\pi_{movieTitle}$
  $\sigma$ (two-argument selection)
    StarsIn
    <Condition>
      <Attribute>: starName
      IN
      $\pi_{name}$
        $\sigma_{\text{birthdate LIKE '\%1960'}}$
          MovieStar
Algebraic Laws for Query Plans
Introduction
Algebraic laws allow us to transform a relational algebra (RA) expression into an equivalent one.
Two RA expressions are equivalent if, for all database instances, they produce the same answer.
The resulting expression may have a more efficient physical query plan.
Algebraic laws are used in the query rewrite phase.
Algebraic Laws for Query Plans
Introduction
Commutative law:
Order of arguments does not matter.
x + y = y + x
Associative law:
May group two uses of the operator either from the left or from the right.
(x + y) + z = x + (y + z)
Operators that are commutative and associative can be grouped and ordered arbitrarily.
Algebraic Laws for Query Plans
Natural Join, Cartesian Product and Union
$R \bowtie S = S \bowtie R$
$(R \bowtie S) \bowtie T = R \bowtie (S \bowtie T)$
$R \times S = S \times R$
$(R \times S) \times T = R \times (S \times T)$
$R \cup S = S \cup R$
$R \cup (S \cup T) = (R \cup S) \cup T$
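These laws can be spot-checked on small instances. The Python sketch below assumes set semantics and models a tuple as a set of (attribute, value) pairs, so attribute order does not matter. The helpers relation and natural_join and the sample data are hypothetical. A check on one instance illustrates the laws but does not prove them.

```python
def relation(*rows):
    """A relation as a set of tuples; each tuple is a frozenset of
    (attribute, value) pairs, so column order is irrelevant."""
    return frozenset(frozenset(row.items()) for row in rows)


def natural_join(r, s):
    joined = set()
    for t in r:
        for u in s:
            merged = dict(t)
            # Two tuples join when they agree on every shared attribute.
            if all(merged.get(attr, value) == value for attr, value in u):
                merged.update(u)
                joined.add(frozenset(merged.items()))
    return frozenset(joined)


# Sample instances: R(a, b), S(b, c), T(c, d).
R = relation({"a": 1, "b": 2}, {"a": 3, "b": 4})
S = relation({"b": 2, "c": 5}, {"b": 2, "c": 6})
T = relation({"c": 5, "d": 7})

# Commutativity and associativity of the natural join.
assert natural_join(R, S) == natural_join(S, R)
assert natural_join(natural_join(R, S), T) == natural_join(R, natural_join(S, T))

# Commutativity and associativity of set union (same-schema relations).
S2 = relation({"a": 3, "b": 4}, {"a": 9, "b": 9})
T2 = relation({"a": 0, "b": 0})
assert R | S2 == S2 | R
assert R | (S2 | T2) == (R | S2) | T2
```

With no shared attributes, natural_join degenerates to the Cartesian product, so the same function can be used to try the product laws.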
Algebraic Laws for Query Plans
Selection
$\sigma_{p_1 \wedge p_2}(R) = \sigma_{p_1}[\sigma_{p_2}(R)]$
$\sigma_{p_1 \vee p_2}(R) = [\sigma_{p_1}(R)] \cup [\sigma_{p_2}(R)]$
$\sigma_{p_1}[\sigma_{p_2}(R)] = \sigma_{p_2}[\sigma_{p_1}(R)]$
Simple conditions p1 or p2 may be pushed down further than the complex condition.
Algebraic Laws for Query Plans
Bag Union
What about the union of relations with duplicates (bags)?
R = {a,a,b,b,b,c}
S = {b,b,c,c,d}
$R \cup S = ?$
The number of occurrences of a tuple in the result is either the SUM or the MAX of its occurrences in the input relations.
SUM: $R \cup S$ = {a,a,b,b,b,b,b,c,c,c,d}
MAX: $R \cup S$ = {a,a,b,b,b,c,c,d}
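Both bag-union definitions can be tried out with Python's collections.Counter, whose + operator adds multiplicities and whose | operator takes the maximum. The sketch models each tuple as a single character, which is only a convenience for this example.

```python
from collections import Counter

# Bags R and S from the example, one character per tuple.
R = Counter("aabbbc")
S = Counter("bbccd")

union_sum = R + S   # multiplicities are added
union_max = R | S   # multiplicities are the maximum of the two inputs

print("SUM:", sorted(union_sum.elements()))
print("MAX:", sorted(union_max.elements()))
```

The SUM variant corresponds to how SQL's UNION ALL combines its inputs.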
Algebraic Laws for Query Plans
Selection
$\sigma_{p_1 \vee p_2}(R) = \sigma_{p_1}(R) \cup \sigma_{p_2}(R)$
MAX implementation of union makes the rule work.
R = {a,a,b,b,b,c}
p1 satisfied by a, b; p2 satisfied by b, c
$\sigma_{p_1 \vee p_2}(R)$ = {a,a,b,b,b,c}
$\sigma_{p_1}(R)$ = {a,a,b,b,b}
$\sigma_{p_2}(R)$ = {b,b,b,c}
$\sigma_{p_1}(R) \cup \sigma_{p_2}(R)$ = {a,a,b,b,b,c}
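The example can be replayed in Python to see that the law holds under the MAX union and fails under the SUM union. This is a sketch using collections.Counter as the bag type; the helper select is hypothetical.

```python
from collections import Counter


def select(bag, predicate):
    """Bag selection: keep each qualifying tuple with its multiplicity."""
    return Counter({value: n for value, n in bag.items() if predicate(value)})


R = Counter("aabbbc")


def p1(value):
    return value in "ab"   # satisfied by a and b


def p2(value):
    return value in "bc"   # satisfied by b and c


lhs = select(R, lambda v: p1(v) or p2(v))
rhs_max = select(R, p1) | select(R, p2)   # MAX union
rhs_sum = select(R, p1) + select(R, p2)   # SUM union

print(lhs == rhs_max)   # the law holds with MAX
print(lhs == rhs_sum)   # with SUM, b is counted by both selections
```

Under SUM, every tuple that satisfies both p1 and p2 is counted twice on the right-hand side, which is why the law breaks.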
Algebraic Laws for Query Plans
Selection
$\sigma_{p_1 \vee p_2}(R) = \sigma_{p_1}(R) \cup \sigma_{p_2}(R)$
SUM implementation of union makes more sense.
Senators (……)
Reps (……)
T1 = $\pi_{yr,state}$ Senators, T2 = $\pi_{yr,state}$ Reps
T1
Yr   State
97   CA
99   CA
98   AZ
T2
Yr   State
99   CA
99   CA
98   CA
Union?
Use SUM implementation, but then some laws do not hold.
Algebraic Laws for Query Plans
Selection and Set Operations
$\sigma_p(R \cup S) = \sigma_p(R) \cup \sigma_p(S)$
$\sigma_p(R - S) = \sigma_p(R) - S = \sigma_p(R) - \sigma_p(S)$
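A quick check of both laws on small sets, as a sketch under set semantics. The relations hold plain integers instead of tuples, and the helper select and the predicate p are made up for the illustration.

```python
def select(rel, predicate):
    """Set selection: keep the elements that satisfy the predicate."""
    return {t for t in rel if predicate(t)}


R = {1, 2, 3, 4, 5, 6}
S = {4, 5, 6, 7, 8}


def p(t):
    return t % 2 == 0   # sample predicate: even values


# Selection distributes over union.
assert select(R | S, p) == select(R, p) | select(S, p)

# For difference, it is enough to push the selection to the left argument.
assert select(R - S, p) == select(R, p) - S == select(R, p) - select(S, p)
```

The middle form of the difference law is often the cheapest, since S does not have to be filtered at all.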
Algebraic Laws for Query Plans
Selection and Join
p: predicate with only R attributes
q: predicate with only S attributes
m: predicate with attributes from R and S
$\sigma_p(R \bowtie S) = [\sigma_p(R)] \bowtie S$
$\sigma_q(R \bowtie S) = R \bowtie [\sigma_q(S)]$
Algebraic Laws for Query Plans
Selection and Join
$\sigma_{p \wedge q}(R \bowtie S) = [\sigma_p(R)] \bowtie [\sigma_q(S)]$
$\sigma_{p \wedge q \wedge m}(R \bowtie S) = \sigma_m\{[\sigma_p(R)] \bowtie [\sigma_q(S)]\}$
$\sigma_{p \vee q}(R \bowtie S) = [\sigma_p(R) \bowtie S] \cup [R \bowtie \sigma_q(S)]$
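The three derived laws can be tested on a small instance. The Python sketch below assumes set semantics; the helpers relation, natural_join and select, the sample relations R(a, b) and S(b, c), and the predicates p, q and m are all hypothetical.

```python
def relation(*rows):
    """A relation as a set of tuples (frozensets of attribute/value pairs)."""
    return frozenset(frozenset(row.items()) for row in rows)


def natural_join(r, s):
    joined = set()
    for t in r:
        for u in s:
            merged = dict(t)
            # Tuples join when they agree on every shared attribute.
            if all(merged.get(attr, value) == value for attr, value in u):
                merged.update(u)
                joined.add(frozenset(merged.items()))
    return frozenset(joined)


def select(rel, predicate):
    return frozenset(t for t in rel if predicate(dict(t)))


R = relation({"a": 1, "b": 2}, {"a": 3, "b": 2}, {"a": 5, "b": 4})
S = relation({"b": 2, "c": 5}, {"b": 2, "c": 9}, {"b": 4, "c": 1})


def p(t):
    return t["a"] > 1        # uses only R attributes


def q(t):
    return t["c"] < 6        # uses only S attributes


def m(t):
    return t["a"] > t["c"]   # uses attributes from both R and S


joined = natural_join(R, S)
pushed = natural_join(select(R, p), select(S, q))

# sigma_{p AND q}(R join S) = sigma_p(R) join sigma_q(S)
assert select(joined, lambda t: p(t) and q(t)) == pushed

# sigma_{p AND q AND m}(R join S) = sigma_m(sigma_p(R) join sigma_q(S))
assert select(joined, lambda t: p(t) and q(t) and m(t)) == select(pushed, m)

# sigma_{p OR q}(R join S) = (sigma_p(R) join S) union (R join sigma_q(S))
assert select(joined, lambda t: p(t) or q(t)) == (
    natural_join(select(R, p), S) | natural_join(R, select(S, q))
)
```

In the pushed-down form, the join sees two tuples from each input instead of three, which is the effect the rewrite is after.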
Algebraic Laws for Query Plans
Projection
X: set of attributes
Y: set of attributes
XY: X ∪ Y
$\pi_{X}(R) = \pi_{X}[\pi_{XY}(R)]$
May introduce a projection anywhere in an expression tree, as long as it eliminates no attributes needed by an operator above and no attributes that are in the result.
Algebraic Laws for Query Plans
Projection and Selection
X: subset of R attributes
Z: attributes in predicate p (subset of R attributes)
$\pi_{X}(\sigma_p R) = \pi_{X}\{\sigma_p[\pi_{XZ}(R)]\}$
Need to keep attributes for the selection and for the result.
Algebraic Laws for Query Plans
Projection and Join
X: subset of R attributes
Y: subset of S attributes
Z: intersection of R and S attributes
$\pi_{XY}(R \bowtie S) = \pi_{XY}\{[\pi_{XZ}(R)] \bowtie [\pi_{YZ}(S)]\}$
Algebraic Laws for Query Plans
Projection, Selection and Join
$\pi_{XY}\{\sigma_p(R \bowtie S)\} = \pi_{XY}\{\sigma_p[\pi_{XZ'}(R) \bowtie \pi_{YZ'}(S)]\}$
Z' = Z ∪ {attributes used in p}
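The combined law can be illustrated on a small instance. The Python sketch below assumes set semantics, so projection removes duplicates. The helpers, the sample relations R(a, b, e, k) and S(k, c, d, f), and the predicate p are hypothetical. Here X = {a}, Y = {c}, the join attributes are Z = {k}, and p uses b and d, so Z' = {k, b, d}.

```python
def relation(*rows):
    """A relation as a set of tuples (frozensets of attribute/value pairs)."""
    return frozenset(frozenset(row.items()) for row in rows)


def natural_join(r, s):
    joined = set()
    for t in r:
        for u in s:
            merged = dict(t)
            # Tuples join when they agree on every shared attribute.
            if all(merged.get(attr, value) == value for attr, value in u):
                merged.update(u)
                joined.add(frozenset(merged.items()))
    return frozenset(joined)


def select(rel, predicate):
    return frozenset(t for t in rel if predicate(dict(t)))


def project(rel, attrs):
    # Attributes the relation does not have are simply ignored.
    return frozenset(
        frozenset((a, v) for a, v in t if a in attrs) for t in rel
    )


R = relation(
    {"a": 1, "b": 2, "e": "x", "k": 10},
    {"a": 2, "b": 9, "e": "y", "k": 10},
    {"a": 3, "b": 1, "e": "z", "k": 20},
)
S = relation(
    {"k": 10, "c": "u", "d": 5, "f": 0},
    {"k": 20, "c": "v", "d": 0, "f": 1},
)


def p(t):
    return t["b"] < t["d"]   # uses b from R and d from S


X, Y = {"a"}, {"c"}
Z_prime = {"k"} | {"b", "d"}   # join attributes plus attributes used in p

late = project(select(natural_join(R, S), p), X | Y)
early = project(
    select(natural_join(project(R, X | Z_prime), project(S, Y | Z_prime)), p),
    X | Y,
)
assert late == early
```

The early projections drop e from R and f from S before the join, since neither the join, the selection, nor the result needs them.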
What Are Good Transformations?
No transformation is always good
Usually good: early selections/projections