Query Optimization – Seminar 6
1. Introduction
Query optimization plays a vital role in query processing. Query processing consists of the
following stages:
1. Parsing a user query (e.g. in SQL).
2. Translating the parse tree (representing the query) into a relational algebra expression.
3. Optimizing the initial algebraic expression.
4. Choosing, for each relational algebra operator, the evaluation algorithm that incurs the
   least cost for answering the query.
Stages 3 and 4 are the two parts of query optimization. Query optimization is an important and
classical component of a database system. A query written in a high-level, declarative language
such as SQL that requires several algebraic operations can usually be composed and ordered in
several alternative ways. Finding a “good” composition is the job of the optimizer. The optimizer
generates alternative evaluation plans for answering a query and chooses the plan with the least
estimated cost. To estimate the cost of a plan (in terms of I/O, CPU time, memory usage, etc., not
in pounds or dollars), the optimizer uses statistical information available in the database
system catalogue.
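The idea of picking the plan with the least estimated cost can be sketched in a few lines of Python. This is only an illustration, not how any particular optimizer works: the `Plan` class, the two candidate plans, their page and tuple counts, and the I/O and CPU weights are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A hypothetical evaluation plan with rough resource estimates."""
    name: str
    io_pages: int    # estimated number of page reads/writes
    cpu_tuples: int  # estimated number of tuples processed


def estimated_cost(plan, io_weight=1.0, cpu_weight=0.001):
    """Combine I/O and CPU estimates into one number (weights are assumptions)."""
    return io_weight * plan.io_pages + cpu_weight * plan.cpu_tuples


def choose_plan(plans):
    """Return the candidate plan with the least estimated cost."""
    return min(plans, key=estimated_cost)


# Two made-up candidates for the same query.
candidates = [
    Plan("full file scan", io_pages=500_000, cpu_tuples=5_000_000),
    Plan("index lookup", io_pages=4, cpu_tuples=1),
]

best = choose_plan(candidates)
print(best.name, estimated_cost(best))
```

A real optimizer would derive the estimates from catalogue statistics rather than hard-coding them; the sketch only shows the final "compare and choose" step.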
2. Objectives
Generally speaking, the purpose of this seminar is to present and discuss how relational algebra
(RA) operators are evaluated, in other words, how RA operators are implemented by concrete
algorithms. In particular, this seminar compares different evaluation algorithms for the
selection (also known as Restrict) operation. In addition, the exercises explore how the
different physical access methods available to an RDBMS influence the choice of how to evaluate
queries.
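To make the notion of "different evaluation algorithms for selection" concrete, the following Python sketch evaluates an equality selection on a sorted file in two ways, a sequential scan and a binary search over pages, and counts the pages each one reads. It is a toy model under stated assumptions: the file is an in-memory list of pages, keys are unique and sorted, and the function names are hypothetical.

```python
PAGE_SIZE = 10  # records per page (same figure as the exercise in section 3)


def make_pages(sorted_keys, page_size=PAGE_SIZE):
    """Split a sorted list of keys into fixed-size pages."""
    return [sorted_keys[i:i + page_size]
            for i in range(0, len(sorted_keys), page_size)]


def select_eq_scan(pages, target):
    """Equality selection by reading pages in order.

    Returns (found, pages_read). Stops early once the file is past the target.
    """
    reads = 0
    for page in pages:
        reads += 1
        if target in page:
            return True, reads
        if page[0] > target:  # sorted file: target cannot appear later
            break
    return False, reads


def select_eq_binary(pages, target):
    """Equality selection by binary search over pages.

    Returns (found, pages_read). Each probed page counts as one read.
    """
    reads = 0
    lo, hi = 0, len(pages) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        page = pages[mid]
        reads += 1
        if target < page[0]:
            hi = mid - 1
        elif target > page[-1]:
            lo = mid + 1
        else:
            return target in page, reads
    return False, reads


pages = make_pages(list(range(1000)))  # 100 pages of 10 keys each
print(select_eq_scan(pages, 505))
print(select_eq_binary(pages, 505))
```

The scan reads a number of pages proportional to the position of the key, while the binary search reads roughly log2 of the number of pages, which is the kind of difference the exercises ask you to quantify.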
3. Exercise
Consider a relation R(a, b, c, d, e) containing 5,000,000 records (tuples), where each data page
of the relation holds 10 records. R is organized as a sorted file with dense secondary indexes.
Assume that R.a is a candidate key for R, with values lying in the range of 0 to 4,999,999, and
that R is sorted in R.a order. For each of the following relational algebra expressions (i.e.
queries), state which of the following three approaches (evaluation strategies) is most likely to be
the cheapest.
a) Access the sorted file for R directly or using binary search.
b) Use a B+ tree index (clustered) on attribute R.a.
c) Use a hashed index (clustered) on attribute R.a.
1. σ_{a<50,000}(R)
2. σ_{a=50,000}(R)
3. σ_{a 50,000}(R)
where σ denotes the selection operation of relational algebra (e.g. σ_{a<50,000}(R)).
Hints: Try to calculate how many pages there are in R. Calculate the cost of each approach using
the formulas given in the reading material and select the one that gives the least cost.
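As a starting point for the hints, the arithmetic can be sketched in Python for the selections a<50,000 and a=50,000. The cost formulas below are common textbook approximations counted in page I/Os and are assumptions, not necessarily the ones in the reading material; in particular, the B+ tree height of 3, the hash probe cost of 2 pages, and the treatment of a hash index as unusable for range conditions are illustrative guesses. Substitute the formulas and parameters from the reading material before drawing conclusions.

```python
import math

NUM_RECORDS = 5_000_000
RECORDS_PER_PAGE = 10
NUM_PAGES = math.ceil(NUM_RECORDS / RECORDS_PER_PAGE)  # pages in R

BTREE_HEIGHT = 3  # ASSUMPTION: index pages read from root to leaf level
HASH_PROBE = 2    # ASSUMPTION: one bucket page plus one data page


def pages_for(matching_records):
    """Data pages holding the matching records (contiguous, since R is sorted on a)."""
    return math.ceil(matching_records / RECORDS_PER_PAGE)


def cost_sorted_file(matching_records, from_start=False):
    """Scan from the first page, or binary-search to the first matching page."""
    if from_start:
        return pages_for(matching_records)
    # The binary search already reads the first matching page.
    return math.ceil(math.log2(NUM_PAGES)) + pages_for(matching_records) - 1


def cost_btree(matching_records):
    """Descend the clustered B+ tree, then read the matching data pages."""
    return BTREE_HEIGHT + pages_for(matching_records)


def cost_hash(matching_records, equality):
    """Hash index: direct probe for equality; assumed useless for ranges."""
    return HASH_PROBE if equality else NUM_PAGES  # fallback: read all of R


print("pages in R:", NUM_PAGES)
# a < 50,000 matches keys 0..49,999, which start at the beginning of the file.
print("a<50,000:", cost_sorted_file(50_000, from_start=True),
      cost_btree(50_000), cost_hash(50_000, equality=False))
# a = 50,000 matches exactly one record because a is a candidate key.
print("a=50,000:", cost_sorted_file(1), cost_btree(1),
      cost_hash(1, equality=True))
```

Each printed row lists the estimated cost for the sorted file, the B+ tree, and the hashed index in that order, so the cheapest strategy under these assumptions is the smallest number in the row.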