Download The Query Processor - AARYA CLASSES

Q. Query Tree
A query tree is a tree data structure that corresponds to a relational algebra expression.
It represents the input relations of the query as leaf nodes of the tree, and represents
the relational algebra operations as internal nodes.
An execution of the query tree consists of executing an internal node operation whenever its
operands are available and then replacing that internal node by the relation that results from
executing the operation.
The order of execution of operations starts at the leaf nodes, which represent the input database
relations for the query, and ends at the root node, which represents the final operation of the
query. The execution terminates when the root node operation is executed and produces the
result relation for the query.
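The bottom-up execution described above can be sketched in Python. This is only an illustration under stated assumptions: the node classes, the employee and department relations, and their attribute names are hypothetical and do not come from the notes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

Row = Dict[str, object]
Relation = List[Row]


@dataclass
class Leaf:
    """Leaf node: an input relation of the query."""
    name: str
    rows: Relation


@dataclass
class Select:
    """Internal node: selection over one operand."""
    predicate: Callable[[Row], bool]
    child: "Node"


@dataclass
class Project:
    """Internal node: projection over one operand."""
    attrs: List[str]
    child: "Node"


@dataclass
class Join:
    """Internal node: equijoin of two operands on left_attr = right_attr."""
    left_attr: str
    right_attr: str
    left: "Node"
    right: "Node"


Node = Union[Leaf, Select, Project, Join]


def execute(node: Node) -> Relation:
    """Evaluate operands first, then replace the node by its result relation."""
    if isinstance(node, Leaf):
        return node.rows
    if isinstance(node, Select):
        return [r for r in execute(node.child) if node.predicate(r)]
    if isinstance(node, Project):
        out: Relation = []
        for r in execute(node.child):
            row = {a: r[a] for a in node.attrs}
            if row not in out:  # a relation holds no duplicate tuples
                out.append(row)
        return out
    if isinstance(node, Join):
        left, right = execute(node.left), execute(node.right)
        return [{**l, **r} for l in left for r in right
                if l[node.left_attr] == r[node.right_attr]]
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Hypothetical input relations (the leaves of the tree).
employee = [{"name": "Asha", "dno": 5}, {"name": "Ravi", "dno": 4},
            {"name": "Meena", "dno": 5}]
department = [{"dnumber": 5, "dname": "Research"},
              {"dnumber": 4, "dname": "Admin"}]

# Root = projection; below it a selection; below that a join of the two leaves.
tree = Project(["name"],
               Select(lambda r: r["dname"] == "Research",
                      Join("dno", "dnumber",
                           Leaf("employee", employee),
                           Leaf("department", department))))

result = execute(tree)
print(result)
```

The recursion reaches the leaves first and finishes at the root, so the value returned by the root call is the result relation of the query.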
e.g.
Notes By Santosh Sir... www.aaryaclasses.co.in 8976381939/9867211982
Q. Draw Query tree for the following:
Q. Steps in Query Processing
A query expressed in a high-level query language such as SQL must first be scanned, parsed, and
validated. The scanner identifies the query tokens, such as SQL keywords, attribute names,
and relation names, that appear in the text of the query.
The parser checks the query syntax to determine whether it is formulated according to the
syntax rules (rules of grammar) of the query language.
The query must also be validated by checking that all attribute and relation names are valid and
semantically meaningful names in the schema of the particular database being queried. An
internal representation of the query is then created, usually as a tree data structure called a query
tree.
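As a rough illustration of the scanning and validation steps, the sketch below tokenizes a very small SELECT statement and checks its names against a schema. Everything in it is an assumption made for the example: the keyword set, the `SCHEMA` catalog, the `ValidationError` class, and the restriction to queries of the shape SELECT ... FROM one-relation [WHERE ...]. The syntax check is deliberately minimal and is not a real parser.

```python
import re

KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}

# Hypothetical catalog: relation name -> set of attribute names.
SCHEMA = {"employee": {"name", "salary", "dno"}}

TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_][A-Za-z_0-9]*)|(<=|>=|<>|[=<>,*]))")


class ValidationError(Exception):
    """Raised when a name in the query is not found in the schema."""


def scan(text):
    """Scanner: split the query text into (kind, value) tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        number, word, symbol = m.groups()
        if number:
            tokens.append(("NUMBER", number))
        elif word:
            if word.upper() in KEYWORDS:
                tokens.append(("KEYWORD", word.upper()))
            else:
                tokens.append(("NAME", word))
        else:
            tokens.append(("SYMBOL", symbol))
        pos = m.end()
    return tokens


def validate(tokens, schema=SCHEMA):
    """Check that the relation and every attribute name exist in the schema."""
    if not tokens or tokens[0] != ("KEYWORD", "SELECT"):
        raise SyntaxError("query must start with SELECT")
    if ("KEYWORD", "FROM") not in tokens:
        raise SyntaxError("missing FROM clause")
    from_idx = tokens.index(("KEYWORD", "FROM"))
    if from_idx + 1 >= len(tokens) or tokens[from_idx + 1][0] != "NAME":
        raise SyntaxError("expected a relation name after FROM")
    relation = tokens[from_idx + 1][1]
    if relation not in schema:
        raise ValidationError(f"unknown relation: {relation}")
    for kind, value in tokens[1:from_idx] + tokens[from_idx + 2:]:
        if kind == "NAME" and value not in schema[relation]:
            raise ValidationError(f"unknown attribute: {value}")
    return relation


tokens = scan("SELECT name FROM employee WHERE salary > 30000")
print(tokens)
print(validate(tokens))
```

A query such as `SELECT age FROM employee` passes the scanner but fails validation, because `age` is not an attribute of `employee` in the assumed schema.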
It is also possible to represent the query using a graph data structure called a query graph. The
DBMS must then devise an execution strategy or query plan for retrieving the results of the
query from the database files. A query typically has many possible execution strategies, and the
process of choosing a suitable one for processing a query is known as query optimization.
The query optimizer module has the task of producing a good execution plan, and the code
generator generates the code to execute that plan. The runtime database processor has the task
of running (executing) the query code, whether in compiled or interpreted mode, to produce the
query result. If a runtime error results, an error message is generated by the runtime database
processor.
Finding the optimal strategy is usually too time-consuming—except for the simplest of queries.
In addition, trying to find the optimal query execution strategy may require detailed information
on how the files are implemented and even on the contents of the files—information that may not
be fully available in the DBMS catalog. Hence, planning of a good execution strategy may be a
more accurate description than query optimization.
In lower-level navigational database languages, the programmer must choose the query execution strategy while writing a database program. If
a DBMS provides only a navigational language, there is limited need or opportunity for
extensive query optimization by the DBMS; instead, the programmer is given the capability
to choose the query execution strategy.
Q. Assertions
An assertion is a predicate expressing a condition that we wish the database always to satisfy.
Domain constraints and referential-integrity constraints are special forms of assertions. We have
paid substantial attention to these forms of assertion because they are easily tested and apply to a
wide range of database applications. However, there are many constraints that we cannot express
by using only these special forms.
Two examples of such constraints are:
• The sum of all loan amounts for each branch must be less than the sum of all account balances at
the branch.
• Every loan has at least one customer who maintains an account with a minimum balance of
$1000.00.
An assertion in SQL takes the form
    create assertion <assertion-name> check <predicate>
e.g.
    create assertion sum-constraint check
      (not exists (select * from branch
                   where (select sum(amount) from loan
                          where loan.branch-name = branch.branch-name)
                      >= (select sum(balance) from account
                          where account.branch-name = branch.branch-name)))
When an assertion is created, the system tests it for validity. If the assertion is valid, then any
future modification to the database is allowed only if it does not cause that assertion to be
violated. This testing may introduce a significant amount of overhead if complex assertions have
been made. Hence, assertions should be used with great care. The high overhead of testing and
maintaining assertions has led some system developers to omit support for general assertions, or
to provide specialized forms of assertions that are easier to test.
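SQLite is one system that has no create assertion statement, so the sketch below emulates the idea by re-running the assertion predicate after each modification and undoing the change if the predicate fails. It is a minimal sketch, not how any DBMS implements assertions. Assumptions: identifiers use underscores instead of hyphens (hyphens are not valid in unquoted SQL identifiers), the columns `loan_number` and `account_number` and the sample rows are hypothetical, and `guarded_execute` is a helper invented for this example.

```python
import sqlite3

# The predicate of the sum-constraint assertion, written as a query that
# returns 1 when the assertion holds and 0 when it is violated.
ASSERTION = """
SELECT NOT EXISTS (
    SELECT * FROM branch
    WHERE (SELECT SUM(amount) FROM loan
           WHERE loan.branch_name = branch.branch_name)
       >= (SELECT SUM(balance) FROM account
           WHERE account.branch_name = branch.branch_name))
"""


def assertion_holds(conn):
    return bool(conn.execute(ASSERTION).fetchone()[0])


def guarded_execute(conn, sql, params=()):
    """Apply a modification only if the assertion still holds afterwards."""
    conn.execute("SAVEPOINT guard")
    conn.execute(sql, params)
    if assertion_holds(conn):
        conn.execute("RELEASE guard")
        return True
    conn.execute("ROLLBACK TO guard")  # undo the offending modification
    conn.execute("RELEASE guard")
    return False


# isolation_level=None: the sqlite3 module issues no implicit BEGIN,
# so the savepoints above control the transaction explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
CREATE TABLE branch (branch_name TEXT PRIMARY KEY);
CREATE TABLE loan (loan_number INTEGER PRIMARY KEY, branch_name TEXT, amount REAL);
CREATE TABLE account (account_number INTEGER PRIMARY KEY, branch_name TEXT, balance REAL);
INSERT INTO branch VALUES ('Downtown');
INSERT INTO account VALUES (1, 'Downtown', 5000);
INSERT INTO loan VALUES (1, 'Downtown', 3000);
""")

# Loans total 4000 after this insert, still below the 5000 in accounts.
ok_small = guarded_execute(conn, "INSERT INTO loan VALUES (?, ?, ?)",
                           (2, "Downtown", 1000))
# Loans would total 6500, so this insert is rolled back.
ok_large = guarded_execute(conn, "INSERT INTO loan VALUES (?, ?, ?)",
                           (3, "Downtown", 2500))
loan_count = conn.execute("SELECT COUNT(*) FROM loan").fetchone()[0]
print(ok_small, ok_large, loan_count)
```

Every guarded modification re-evaluates the whole predicate, including both aggregate subqueries for every branch, which gives a concrete sense of the overhead mentioned above.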
Q. Aggregation
One limitation of the E-R model is that it cannot express relationships among relationships.
Let us consider the ternary relationship works-on among employee, branch, and job.
Now, suppose we want to record managers for tasks performed by an employee at a branch; that is,
we want to record managers for (employee, branch, job) combinations.
Let us assume that there is an entity set manager.
One alternative for representing this relationship is to create a quaternary relationship manages
between employee, branch, job, and manager.
There is redundant information in the resultant figure, however, since every employee, branch,
job combination in manages is also in works-on. If the manager were a value rather than a
manager entity, we could instead make manager a multivalued attribute of the relationship
works-on. But doing so makes it more difficult to find, for example, employee-branch-job triples
for which a manager is responsible.
The best way to model a situation such as the one just described is to use aggregation.
Aggregation is an abstraction through which relationships are treated as higher level entities.
The following diagram shows a notation for aggregation commonly used to represent the above
situation.
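One way to see what aggregation buys is to map it to tables: works-on becomes a relation keyed by the (employee, branch, job) triple, and manages refers to that triple as a whole through a foreign key. The sketch below uses Python's sqlite3 module. It is a simplification under stated assumptions: entities are stored as plain text names rather than as separate entity tables, underscores replace hyphens in identifiers, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE works_on (
    employee TEXT, branch TEXT, job TEXT,
    PRIMARY KEY (employee, branch, job)
);
-- manages relates a manager to the aggregated works_on relationship,
-- so it can only mention triples that already exist in works_on.
CREATE TABLE manages (
    employee TEXT, branch TEXT, job TEXT, manager TEXT,
    PRIMARY KEY (employee, branch, job, manager),
    FOREIGN KEY (employee, branch, job)
        REFERENCES works_on (employee, branch, job)
);
""")

conn.execute("INSERT INTO works_on VALUES ('Asha', 'Downtown', 'teller')")
conn.execute("INSERT INTO manages VALUES ('Asha', 'Downtown', 'teller', 'Ravi')")

# A manager cannot be recorded for a combination that is not in works_on.
try:
    conn.execute("INSERT INTO manages VALUES ('Meena', 'Uptown', 'auditor', 'Ravi')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Finding the employee-branch-job triples a manager is responsible for
# is a simple selection on manages.
managed = conn.execute(
    "SELECT employee, branch, job FROM manages WHERE manager = 'Ravi'"
).fetchall()
print(rejected, managed)
```

Because manages points at works-on instead of repeating an independent four-way relationship, a managed combination that is not also a works-on combination cannot be stored.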
Q. DBMS System architecture
A database system is partitioned into modules that deal with each of the responsibilities of the
overall system. The functional components of a database system can be broadly divided into the
storage manager and query processor components.
Storage Manager
A storage manager is a program module that provides the interface between the low-level data
stored in the database and the application programs and queries submitted to the system. The
storage manager is responsible for the interaction with the file manager.
The storage manager components include:
• Authorization and integrity manager, which tests for the satisfaction of integrity constraints
and checks the authority of users to access data.
• Transaction manager, which ensures that the database remains in a consistent (correct) state
despite system failures, and that concurrent transaction executions proceed without conflicting.
• File manager, which manages the allocation of space on disk storage and the data structures
used to represent information stored on disk.
• Buffer manager, which is responsible for fetching data from disk storage into main memory, and
deciding what data to cache in main memory. The buffer manager is a critical part of the database
system, since it enables the database to handle data sizes that are much larger than the size of
main memory.
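The buffer manager's job of deciding what to keep in main memory can be sketched as a small block cache. This is a minimal sketch under assumptions that the notes do not state: the replacement policy is least recently used (LRU), a plain dictionary stands in for the disk, and writing modified blocks back to disk is left out.

```python
from collections import OrderedDict


class BufferManager:
    """Keeps at most `capacity` blocks in memory, evicting the least recently used."""

    def __init__(self, disk, capacity):
        self.disk = disk            # block id -> contents (stands in for disk storage)
        self.capacity = capacity
        self.pool = OrderedDict()   # cached blocks, oldest use first
        self.disk_reads = 0

    def get_block(self, block_id):
        if block_id in self.pool:
            # Hit: mark the block as most recently used.
            self.pool.move_to_end(block_id)
            return self.pool[block_id]
        if len(self.pool) >= self.capacity:
            # Pool is full: drop the least recently used block.
            self.pool.popitem(last=False)
        # Miss: fetch the block from "disk" into the pool.
        self.disk_reads += 1
        self.pool[block_id] = self.disk[block_id]
        return self.pool[block_id]


disk = {i: f"block-{i}" for i in range(10)}   # ten blocks on "disk"
bm = BufferManager(disk, capacity=2)          # room for only two in memory
for block_id in [0, 1, 0, 2, 1]:
    bm.get_block(block_id)
print(bm.disk_reads, list(bm.pool))
```

With room for two blocks, the five requests cause four disk reads: the second request for block 0 is served from memory, while block 1 has been evicted by the time it is requested again.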
The storage manager implements several data structures as part of the physical system
implementation:
• Data files, which store the database itself.
• Data dictionary, which stores metadata about the structure of the database, in particular
the schema of the database.
• Indices, which provide fast access to data items that hold particular values.
The Query Processor
The query processor components include:
• DDL interpreter, which interprets DDL statements and records the definitions in the
data dictionary.
• DML compiler, which translates DML statements in a query language into an evaluation
plan consisting of low-level instructions that the query evaluation engine understands.
A query can usually be translated into any of a number of alternative evaluation plans that all
give the same result. The DML compiler also performs query optimization; that is, it picks the
lowest-cost evaluation plan from among the alternatives.
• Query evaluation engine, which executes low-level instructions generated by the
DML compiler.
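Picking the lowest-cost plan among alternatives can be illustrated with a toy cost comparison. The sketch below is hypothetical throughout: the `Plan` class, the two candidate plans for a single selection, and the cost formulas (counted in block accesses) are simplifying assumptions, not the cost model of any particular DBMS.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Plan:
    description: str
    estimated_cost: float  # estimated number of block accesses


def candidate_plans(num_blocks: int, index_height: int, matching_fetches: int) -> List[Plan]:
    """Two alternative plans for the same selection; both give the same result."""
    return [
        # Read every block of the file and test each tuple.
        Plan("linear scan of the whole file", num_blocks),
        # Walk down the index, then fetch one block per matching tuple.
        Plan("index lookup, then fetch matching tuples",
             index_height + matching_fetches),
    ]


def choose_plan(plans: List[Plan]) -> Plan:
    """Query optimization in miniature: pick the cheapest estimated plan."""
    return min(plans, key=lambda p: p.estimated_cost)


# Selective condition: few matching tuples, so the index wins (3 + 5 < 1000).
selective = choose_plan(candidate_plans(num_blocks=1000, index_height=3,
                                        matching_fetches=5))
# Unselective condition: so many fetches that scanning is cheaper (1000 < 1503).
unselective = choose_plan(candidate_plans(num_blocks=1000, index_height=3,
                                          matching_fetches=1500))
print(selective.description)
print(unselective.description)
```

The chosen plan depends on the estimates, which is why the optimizer needs statistics about the stored data and not only the text of the query.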
Q. Data Independence:
A database management system holds a lot of data besides the users' data. A DBMS
comprises three kinds of schemas, which are themselves data about data (metadata). Metadata is
stored along with the database and, once stored, is hard to modify. But as a DBMS
expands, the metadata needs to change over time to satisfy the requirements of users. If all of
this data were highly interdependent, making such changes would become tedious and highly complex.
Metadata is therefore organized in a layered architecture, so that a change to the data at one layer
does not affect the data at another layer.
Logical Data Independence
Logical data is data about the database; that is, it describes how data is managed
inside. An example is a table (relation) stored in the database together with all the constraints
applied to that relation.
Logical data independence is a mechanism that decouples this logical description from the actual
data stored on the disk. If we change the table format, the data residing on disk should not
change.
Physical Data Independence
All schemas are logical, whereas the actual data is stored in bit format on the disk. Physical data
independence is the ability to change the physical data without impacting the schema or logical
data.
For example, changing or upgrading the storage system itself, such as using SSDs instead of
hard disks, should not have any impact on the logical data or schemas.
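Physical data independence can be sketched as a logical table whose storage can be swapped without touching the code that queries it. The classes below (`MemoryStorage`, `FileStorage`, `Table`), the JSON-lines file format, and the employee data are all hypothetical and chosen only to make the point.

```python
import json
import os
import tempfile


class MemoryStorage:
    """Physical layer, variant 1: rows kept in a Python list."""

    def __init__(self):
        self._rows = []

    def append(self, row):
        self._rows.append(dict(row))

    def scan(self):
        return iter(list(self._rows))


class FileStorage:
    """Physical layer, variant 2: one JSON document per line in a file."""

    def __init__(self, path):
        self.path = path
        open(path, "a", encoding="utf-8").close()  # make sure the file exists

    def append(self, row):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(row) + "\n")

    def scan(self):
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                yield json.loads(line)


class Table:
    """Logical layer: named columns, unaware of how rows are stored."""

    def __init__(self, columns, storage):
        self.columns = columns
        self.storage = storage

    def insert(self, *values):
        if len(values) != len(self.columns):
            raise ValueError("wrong number of values")
        self.storage.append(dict(zip(self.columns, values)))

    def select(self, predicate):
        return [row for row in self.storage.scan() if predicate(row)]


def build(storage):
    table = Table(["name", "salary"], storage)
    table.insert("Asha", 50000)
    table.insert("Ravi", 30000)
    return table


def high_earners(table):
    """Application query: written once, against the logical table only."""
    return [r["name"] for r in table.select(lambda r: r["salary"] > 40000)]


in_memory = high_earners(build(MemoryStorage()))
with tempfile.TemporaryDirectory() as folder:
    on_disk = high_earners(build(FileStorage(os.path.join(folder, "employee.jsonl"))))
print(in_memory, on_disk)
```

`high_earners` and the table's column list are identical in both runs; only the storage object passed to `build` changes.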