Download Query Processing and Advanced Query Processing and Advanced

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Query Processing and Advanced
Q
Queries
Query Processing (2)
Review: Query Processing
SQL query
parse
parse tree
convert
answer
logical query plan
execute
query rewrite
“improved” ll.q.p
qp
estimate result sizes
l
l.q.p.
+ sizes
i
consider physical plans
Pi
pick best
{(P1 C1) (P2 C2) }
{(P1,C1),(P2,C2)...}
estimate costs
{P1 P2
{P1,P2,…..}
}
CMPT 454: Database Systems II – Query Processing (2)
2 / 26
Parsing
Grammar ffor SQL
The following grammar describes a simple subset of SQL.
Queries
<Query>::= SELECT <SelList> FROM <FromList>
WHERE <Condition> ;
Selection lists
<SelList>::= <Attribute>
<SelList>::
<Attribute>, <SelList>
<SelList>::= <Attribute>
From lists
<FromList>::= <Relation>, <FromList>
<FromList>::= <Relation>
<FromList>::
CMPT 454: Database Systems II – Query Processing (2)
3 / 26
Parsing
Grammar for SQL
Conditions
<Condition>::= <Condition> AND <Condition>
<Condition>::= <Attribute> IN (<Query>)
<Condition>::= <Attribute> = <Attribute>
<Condition>::
<Condition>::= <Attribute> LIKE <Pattern>
Syntactic categories Relation and Attribute are not
defined by grammar rules, but by the database schema.
Syntactic category Pattern defined as some regular
expression.
CMPT 454: Database Systems II – Query Processing (2)
4 / 26
Example: A SQL Query
StarsIn (mo
(movieTitle,
ieTitle mo
movieYear,
ieYear starName)
MovieStar (name, address, gender, birthdate)
Goal: find the movies with stars born in 1960
SELECT movieTitle
FROM StarsIn, MovieStar
WHERE starName = name AND birthdate LIKE ‘%1960’
%1960
CMPT 454: Database Systems II – Query Processing (2)
5 / 26
A Parse Tree
<Query>
SELECT <SelList>
S lLi t
<Attribute>
FROM
<FromList>
F
Li t
WHERE
<Condition>
C diti
<Relation> , <FromList>
movieTitle
StarsIn
<Relation>
MovieStar
<Condition>
<Attribute>
starName
=
AND
<Condition>
<Attribute>
<Attribute>
name
birthdate
CMPT 454: Database Systems II – Query Processing (2)
LIKE <Pattern>
‘%1960’
6 / 26
Conversion to Logical Query Plan
How to convert a parse tree into a logical query plan, i.e.
a relational algebra expression?
Queries with conditions without subqueries are easy:
Form Cartesian product of all relations in <FromList>.
Apply a selection c where C is given by <Condition>.
<Condition>
Finally apply a projection L where L is the list of
tt ib t iin <S
lLi t>
attributes
<SelList>.
Queries involving subqueries are more difficult.
Remove subqueries from conditions and represent them
by a two-argument selection in the logical query plan.
CMPT 454: Database Systems II – Query Processing (2)
7 / 26
An Algebraic Expression Tree
movieTitle
starName=name AND birthdate LIKE ‘%1960’
X
StarsIn
CMPT 454: Database Systems II – Query Processing (2)
MovieStar
8 / 26
Another SQL Query
StarsIn
St
I ((movieTitle,
i Titl movieYear,
i Y
starName)
t N
)
MovieStar (name, address, gender, birthdate)
Goal: find the movies with stars born in 1960
SELECT title
titl
FROM StarsIn
WHERE starName IN (
SELECT name
FROM MovieStar
)
WHERE birthdate LIKE ‘%1960’);
CMPT 454: Database Systems II – Query Processing (2)
9 / 26
Another Parse Tree
<Query>
SELECT <SelList>
FROM
<FromList>
<Attribute>
<Relation>
title
StarsIn
SELECT
<SelList>
FROM
WHERE
<Condition>
<Attribute> IN (
<FromList>
<Attribute>
<Relation>
name
MovieStar
CMPT 454: Database Systems II – Query Processing (2)
)
starName
<Query>
WHERE
<Condition>
<Attribute> LIKE <Pattern>
birthDate
‘%1960’
10 / 26
Another Algebraic Expression Tree
movieTitle

StarsIn
<condition>
<Attribute> IN
starName
name
birthdate LIKE ‘%1960’
MovieStar
CMPT 454: Database Systems II – Query Processing (2)
11 / 26
Algebraic Laws for Query Plans
Introduction
Algebraic laws allow us to transform a Relational
Algebra (RA) expression into an equivalent one.
Two RA expressions
p
are equivalent
q
if,, for all
database instances, they produce the same answer.
The resulting expression may have a more
efficient physical query plan.
Algebraic laws are used in the query rewrite
phase.
p
CMPT 454: Database Systems II – Query Processing (2)
12 / 26
Algebraic Laws for Query Plans
Introduction
Commutative law:
O d off arguments
t d
tt
Order
does nott matter.
x+y=y+x
Associative
A
i ti law:
l
May group two uses of the operator either from the left
right
or the right.
(x + y) + z = x + (y + z)
Operators
O
t
th
thatt are commutative
t ti and
d associative
i ti
can be grouped and ordered arbitrarily.
CMPT 454: Database Systems II – Query Processing (2)
13 / 26
Algebraic Laws for Query Plans
Natural Join,
Join Cartesian Product and Union
R
(R
S
S)
=
T
S
R
=R
(S
T)
RxS=SxR
(R x S) x T = R x (S x T)
RUS=SUR
R U (S U T) = (R U S) U T
CMPT 454: Database Systems II – Query Processing (2)
14 / 26
Algebraic Laws for Query Plans
Selection
p1  p2(R) = p11 [ p22 (R)]
p1 v p2(R) = [ p1 (R)] U [ p2 (R)]
p1 [ p2 (R)] = p2 [ p1 (R)]
Simple conditions p1 or p2 may be pushed down
further than the complex condition.
CMPT 454: Database Systems II – Query Processing (2)
15 / 26
Algebraic Laws for Query Plans
Bag Union
What about the union of relations with duplicates
(b )?
(bags)?
R = {a,a,b,b,b,c}
S = {b,b,c,c,d}
{b b
d}
RUS=?
Number of occurrences either SUM or MAX of
p relations.
occurrences in the imput
SUM: R U S = {a,a,b,b,b,b,b,c,c,c,d}
MAX: R U S = {{a,a,b,b,b,c,c,d}
, , , , , , , }
CMPT 454: Database Systems II – Query Processing (2)
16 / 26
Algebraic Laws for Query Plans
Selection
p1 v p2 (R) = p1(R) U p2(R)
MAX implementation of union makes rule work.
R={a,a,b,b,b,c}
R
{a,a,b,b,b,c}
p1 satisfied by a,b, p2 satisfied by b,c
p1vp2 (R) = {{a,a,b,b,b,c}
bbb }
p1(R) = {a,a,b,b,b}
p2(R) = {b,b,b,c}
p1 (R) U p2 (R) = {a,a,b,b,b,c}
CMPT 454: Database Systems II – Query Processing (2)
17 / 26
Algebraic Laws for Query Plans
Selection
p1 v p2 (R) = p1(R) U p2(R)
SUM implementation of union makes more sense.
Senators (……)
Reps (……)
T1 = yr,state
yr state Senators,
Senators T2 =  yr,state
yr state Reps
T1
Yr
97
99
98
State
CA
CA
AZ
T2
Yr
99
99
98
State
CA
CA
CA
Union?
Use SUM implementation,
p
but then some laws do not hold.
CMPT 454: Database Systems II – Query Processing (2)
18 / 26
Algebraic Laws for Query Plans
Selection and Set Operations
p(R U S) = p(R) U p(S)
p(R - S) = p(R) - S = p(R) - p(S)
CMPT 454: Database Systems II – Query Processing (2)
19 / 26
Algebraic Laws for Query Plans
Selection and Join
p predicate
p:
p
with onlyy R attributes
q: predicate with only S attributes
m: ppredicate with attributes from R and S

p (R
S) = [
q (R
S) = R
p
(R)]
S

[
q
CMPT 454: Database Systems II – Query Processing (2)
(S)]
20 / 26
Algebraic Laws for Query Plans
Selection and Join
ppqq ((R
pqm (R
S)) = [
p ((R)])]
[
S) =
m [(p R)
pvq (R S) =
[(p R) S] U [R
CMPT 454: Database Systems II – Query Processing (2)
(
q ((S)])]
q S)]
(
q S)]
21 / 26
Algebraic Laws for Query Plans
Projection
X: set of attributes
Y: set of attributes
XY: X U Y
xy (R) =
x [y (R)]
May
y introduce p
projection
j
anywhere
y
in an expression
p
tree as long as it eliminates no attributes needed by
an operator above and no attributes that are in result
CMPT 454: Database Systems II – Query Processing (2)
22 / 26
Algebraic Laws for Query Plans
Projection and Selection
X: subset of R attributes
Z: attributes in predicate P (subset of R attributes)
x (pR) =
xz
x {p [ x
(R)
]}
Need to keepp attributes for the selection and
for the result
CMPT 454: Database Systems II – Query Processing (2)
23 / 26
Algebraic Laws for Query Plans
Projection and Selection
X: subset of R attributes
Y: subset of S attributes
Z: intersection of R,S
R S attributes
xy (R
S) =
xy{[xz (R) ]
CMPT 454: Database Systems II – Query Processing (2)
[yz (S) ]}
24 / 26
Algebraic Laws for Query Plans
Projection Selection and Join
Projection,
xy {
{ p (R
S)}
=
xy {
{ p [xz’ (R)
yz’ (S)]}
z’ = z U {attributes used in P
CMPT 454: Database Systems II – Query Processing (2)
}
25 / 26
What Are Good Transformation?
No transformation is always good
Usually good: early selections/projections
CMPT 454: Database Systems II – Query Processing (2)
26 / 26
Related documents