Download An Evolutionary Algorithm for Query Optimization

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Biology and consumer behaviour wikipedia , lookup

Designer baby wikipedia , lookup

Koinophilia wikipedia , lookup

Hybrid (biology) wikipedia , lookup

Public health genomics wikipedia , lookup

Polyploid wikipedia , lookup

Genetic engineering wikipedia , lookup

Genetic testing wikipedia , lookup

Population genetics wikipedia , lookup

Microevolution wikipedia , lookup

Genome (book) wikipedia , lookup

Smith–Waterman algorithm wikipedia , lookup

Gene expression programming wikipedia , lookup

Transcript
An Evolutionary Algorithm for
Query Optimization
in Database
Kayvan Asghari,
Ali Safari Mamaghani
Mohammad Reza Meybodi
International Joint Conferences
on Computer, Information, and Systems Sciences, and Engineering
CISSE 2007
1
Introduction
Optimizing the database queries is one of hard research problems
 If the number of relations is more than five or six relations,
exhaustive search techniques will bear high cost regarding the
memory and time.
 Examples of queries with large number of relations can be found
in:







Deductive database management systems
Expert systems
Engineering database management systems (CAD/CAM)
Decision Support Systems
Data mining
Scientific database management systems
Whatever the reason, database management systems need the use
of query optimizing techniques with low cost in order to
counteract with such complicated queries.
2
Searching Algorithms for Suitable Join Order


Exact algorithms that search all of state space and sometimes they reduce this
space by heuristic methods:
 Dynamic programming method which at first introduced by Selinger et al.
for optimizing the join ordering in System-R.
 Minimum selectivity algorithm
 KBZ algorithm
 AB algorithm
But Exhaustive search techniques like dynamic programming is suitable for
queries with a few relation and we have to use random and evolutionary
methods
Random algorithms have been introduced for showing the inability of exact
algorithms versus large queries.
 Iterative improvement
 Simulated annealing
 Two-phase optimization
 Toured simulated annealing
 Random sampling
 Evolutionary algorithms
3
Evolutionary algorithms




Genetic algorithm has been done by Bennet et al.
Some other works have been done by Steinbrunn et al. that
they have used different coding methods and genetic operators
genetic programming which is introduced by Stillger et al.
CGO genetic optimizer has also been introduced by Mulero et
al.
In our paper a hybrid evolutionary algorithm has been
proposed that uses two methods of genetic algorithm
and learning automata synchronically for searching the
states space of problem.
4
The Definition of Problem

DBMS selects the best query execution plan (qep) from among
execution plans, in a way that query execution bears the low
cost, especially the cost of input/output operations and time of
processing.
 If we show the all of allocated execution plans for responding to
the query with S set, each member qep that belongs to S set has
cost(qep) The purpose of each optimization algorithm is finding
a member like qep0 which belongs to S set, so that:
cost(qep0 )  minqepS cost(qep)
 Processing and Optimizing the join operators in query are
difficult
 Join operator considers two relations as input and combines
their tuples one by one on the basis of a definite criterion and
produce a new relation as output.
 The join operator has associative and commutative features thus
the number of execution plans for responding to a query
increases exponentially when the number of joins among
relations increases.
5
Learning automata


Learning automata approach for learning involves
determination of an optimal action from a set of allowable
actions.
It selects an action from its finite set of actions.
6
Proposed Hybrid Algorithm for Solving
Join Ordering Problem


Combining genetic algorithms and learning automata
Generation, penalty and reward are some of features
of hybrid algorithm.
 In proposed algorithm, unlike classical genetic
algorithm, binary coding or natural permutation
representations aren't used for chromosomes.
 Each chromosome is represented by learning
automata of object migration kind
 Each of genes in chromosome is attributed to one of
automata actions, and is placed in a definite depth of
that action.
7
Learning automata of object migration
kind




In these automata, α  {α1, α2 , ... , αk } is set of allowed
actions of automata.
  {1 ,  2 , ... ,  KN } Is set of states and N is memory depth for
automata.
Now consider the following query:
(A∞C) and (B∞C) and (C∞D) and (D∞E)
An example of a query graph :
B
P2
P
E3
P1
C
P4
D
E
A
8
Learning automata of object migration kind
Display of joins
permutation (p3, p2, p1, p4)
by learning automata based
on Tsetlin automata
connections
9
Fitness function


The purpose of searching the optimized order of query joins is
finding permutation of join operators, so that total cost of
query execution is minimized in this permutation.
One important point in computing fitness function is the
number of references to the disc so we can define the fitness
function of F for an execution plan like qep as follows:
F (qep) 
1
The number of references to disk
Ctotal  C ( R1 )  C ( R2 )  C ( R1R2 )
10
Operators of Hybrid genetic algorithm

Selection operator: The selection used for this
algorithm is roulette wheel.
 Crossover Operator: In this operator, two parent
chromosomes are selected and two genes i and j are
selected randomly in one of the two parent
chromosomes.
 Mutation operator: For executing this operator, we can
use different method which are suitable for work with
permutations. For example in swap mutation, two
actions (genes) from one automata (chromosome) are
selected randomly and replaced with each other.
11
Crossover Operator
12
Mutation operator
13
Penalty and Reward Operator

In each chromosome, evaluating the fitness rate of a
gene which is selected randomly, penalty or reward is
given to that gene.
 As a result of giving penalty or reward, the depth of
gene changes.
 For example, in automata like Tsetlin connections, if
p2 join be in states set {6,7,8,9,10}, and the cost for
p2 join in the second action will be less than average
join costs of chromosome, reward will be given to
this join and it's Certainty will be increased and
moves toward the internal states of that action.
14
An Example of Reward Operator
15
The manner of giving penalty to the join
that is in a state except boundary state
16
The manner of giving penalty to the
join that is in a boundary state
17
Experiment Results
Automata
GA
GALA
Comparison of averaged
cost obtained from
hybrid algorism and
learning automata based
on Tsetline and genetic
algorithm
Cost of Query Execution
70000
60000
50000
40000
30000
20000
10000
0
10
20
30
40
50
60
70
80
90
100
Number of Join Operator
Comparison of
averaged cost obtained
from hybrid algorithm
and learning automata
based on Krinsky and
genetic algorithm
Cost of Query Execution
Automata
GA
GALA
70000
60000
50000
40000
30000
20000
10000
0
10
20
30
40
50
60
70
Number of Join Operator
80
90 100
18
Experiment Results
Automata
GA
GALA
Comparison of averaged
cost obtained from hybrid
algorithm and learning
automata based on Krylov
and genetic algorithm
Cost of Query Execution
70000
60000
50000
40000
30000
20000
10000
0
10
20
30
40
50
60
70
80
90
100
Number of Join Operator
Comparison of averaged
cost obtained from hybrid
algorithm and learning
automata based on Oomen
and genetic algorithm
Cost of Query Execution
Automata
GA
GALA
70000
60000
50000
40000
30000
20000
10000
0
10
20
30
40
50
60
70
Number of Join Operator
80
90 100
19
Experiment Results
Comparison of averaged
cost obtained from
hybrid algorithms based
on different Automata
Cost of Query Execution
GALA - Krinisky
GALA - Oomen
GALA - Krylov
GALA - Tsetlin
70000
60000
50000
40000
30000
20000
10000
0
10
20
30
40
50
60
70
80
90
100
Number of Join Operator
Comparison of averaged
cost obtained from hybrid
algorithms based on
different Automata depth
C ost of Q ue ry E xe cution
GALA - Krinisky
GALA - Oomen
GALA - Krylov
GALA - Tsetlin
43000
38000
33000
28000
23000
1
3
5
7
De pth of Automata
9
20
11