The Query Processor - AARYA CLASSES
Notes By Santosh Sir... www.aaryaclasses.co.in 8976381939/9867211982

Q. Query Tree
A query tree is a tree data structure that corresponds to a relational algebra expression. It represents the input relations of the query as leaf nodes and the relational algebra operations as internal nodes. Executing the query tree consists of executing an internal node operation whenever its operands are available, and then replacing that internal node with the relation that results from the operation. Execution starts at the leaf nodes, which represent the input database relations for the query, and ends at the root node, which represents the final operation of the query. Execution terminates when the root node operation has been executed and has produced the result relation for the query.
[Figure: example query tree]

Q. Draw Query tree for the following:
[Figures: queries and their query trees]

Q. Steps in Query Processing
A query expressed in a high-level query language such as SQL must first be scanned, parsed, and validated. The scanner identifies the query tokens (such as SQL keywords, attribute names, and relation names) that appear in the text of the query. The parser checks the query syntax to determine whether it is formulated according to the syntax rules (rules of grammar) of the query language. The query must also be validated by checking that all attribute and relation names are valid and semantically meaningful names in the schema of the particular database being queried.

An internal representation of the query is then created, usually as a tree data structure called a query tree. It is also possible to represent the query using a graph data structure called a query graph. The DBMS must then devise an execution strategy, or query plan, for retrieving the results of the query from the database files.
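As an illustration of the query tree used as the internal representation, here is a minimal Python sketch of the bottom-up execution described under "Query Tree". The node classes (Leaf, Select, Project, Join), the execute function, and the sample relations are hypothetical names chosen for this example; they do not come from the notes or from any particular DBMS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

Row = Dict[str, object]
Relation = List[Row]


@dataclass
class Leaf:
    """Leaf node: an input relation of the query."""
    rows: Relation


@dataclass
class Select:
    """Internal node: keep the rows that satisfy a predicate."""
    predicate: Callable[[Row], bool]
    child: "Node"


@dataclass
class Project:
    """Internal node: keep only the named attributes.

    Simplification: duplicate rows are not removed in this sketch.
    """
    attributes: List[str]
    child: "Node"


@dataclass
class Join:
    """Internal node: equijoin of two operands on one shared attribute."""
    attribute: str
    left: "Node"
    right: "Node"


Node = Union[Leaf, Select, Project, Join]


def execute(node: Node) -> Relation:
    """Evaluate the operands first, then return the node's result relation."""
    if isinstance(node, Leaf):
        return node.rows
    if isinstance(node, Select):
        return [r for r in execute(node.child) if node.predicate(r)]
    if isinstance(node, Project):
        return [{a: r[a] for a in node.attributes} for r in execute(node.child)]
    if isinstance(node, Join):
        left, right = execute(node.left), execute(node.right)
        return [{**l, **r} for l in left for r in right
                if l[node.attribute] == r[node.attribute]]
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Hypothetical sample data, not taken from the notes.
employee = Leaf([{"name": "Asha", "dno": 1}, {"name": "Ravi", "dno": 2}])
department = Leaf([{"dno": 1, "dname": "Research"},
                   {"dno": 2, "dname": "Sales"}])

# PROJECT name (SELECT dname = 'Research' (employee JOIN department))
tree = Project(["name"],
               Select(lambda r: r["dname"] == "Research",
                      Join("dno", employee, department)))
result = execute(tree)
```

The recursion reaches the leaves first, so the join runs before the selection and the projection at the root runs last, which matches the leaf-to-root order described above.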
A query typically has many possible execution strategies, and the process of choosing a suitable one for processing a query is known as query optimization. The query optimizer module has the task of producing a good execution plan, and the code generator generates the code to execute that plan. The runtime database processor has the task of running (executing) the query code, whether in compiled or interpreted mode, to produce the query result. If a runtime error results, an error message is generated by the runtime database processor.

Finding the optimal strategy is usually too time-consuming, except for the simplest of queries. In addition, trying to find the optimal query execution strategy may require detailed information on how the files are implemented and even on the contents of the files, information that may not be fully available in the DBMS catalog. Hence, planning of a good execution strategy may be a more accurate description than query optimization.

With a lower-level navigational database language, the programmer must choose the query execution strategy while writing a database program. If a DBMS provides only a navigational language, there is limited need or opportunity for extensive query optimization by the DBMS; instead, the programmer is given the capability to choose the query execution strategy.

Q. Assertions
An assertion is a predicate expressing a condition that we wish the database always to satisfy. Domain constraints and referential-integrity constraints are special forms of assertions. We have paid substantial attention to these forms of assertion because they are easily tested and apply to a wide range of database applications. However, there are many constraints that we cannot express by using only these special forms. Two examples of such constraints are:
•The sum of all loan amounts for each branch must be less than the sum of all account balances at the branch.
•Every loan has at least one customer who maintains an account with a minimum balance of $1000.00.

An assertion in SQL takes the form
    create assertion <assertion-name> check <predicate>
e.g.
    create assertion sum-constraint check
        (not exists (select *
                     from branch
                     where (select sum(amount)
                            from loan
                            where loan.branch-name = branch.branch-name)
                           >= (select sum(balance)
                               from account
                               where account.branch-name = branch.branch-name)))

When an assertion is created, the system tests it for validity. If the assertion is valid, then any future modification to the database is allowed only if it does not cause that assertion to be violated. This testing may introduce a significant amount of overhead if complex assertions have been made. Hence, assertions should be used with great care. The high overhead of testing and maintaining assertions has led some system developers to omit support for general assertions, or to provide specialized forms of assertions that are easier to test.

Q. Aggregation
One limitation of the E-R model is that it cannot express relationships among relationships. Consider the ternary relationship works-on among employee, branch, and job.
[Figure: ternary relationship works-on]
Now, suppose we want to record managers for tasks performed by an employee at a branch; that is, we want to record managers for (employee, branch, job) combinations. Let us assume that there is an entity set manager. One alternative for representing this relationship is to create a quaternary relationship manages among employee, branch, job, and manager.
[Figure: quaternary relationship manages]
There is redundant information in the resultant figure, however, since every (employee, branch, job) combination in manages is also in works-on. If the manager were a value rather than a manager entity, we could instead make manager a multivalued attribute of the relationship works-on.
But doing so makes it more difficult to find, for example, employee-branch-job triples for which a manager is responsible. The best way to model a situation such as the one just described is to use aggregation. Aggregation is an abstraction through which relationships are treated as higher-level entities. The following diagram shows a notation for aggregation commonly used to represent the above situation.
[Figure: E-R diagram with aggregation]

Q. DBMS System architecture
A database system is partitioned into modules that deal with each of the responsibilities of the overall system. The functional components of a database system can be broadly divided into the storage manager and query processor components.

Storage Manager
A storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. The storage manager is responsible for the interaction with the file manager.
The storage manager components include:
•Authorization and integrity manager, which tests for the satisfaction of integrity constraints and checks the authority of users to access data.
•Transaction manager, which ensures that the database remains in a consistent (correct) state despite system failures, and that concurrent transaction executions proceed without conflicting.
•File manager, which manages the allocation of space on disk storage and the data structures used to represent information stored on disk.
•Buffer manager, which is responsible for fetching data from disk storage into main memory, and for deciding what data to cache in main memory. The buffer manager is a critical part of the database system, since it enables the database to handle data sizes that are much larger than the size of main memory.
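To make the buffer manager's role concrete, the following Python sketch caches a fixed number of pages in memory and fetches a page from "disk" only on a miss. The notes do not say how the buffer manager decides what to keep; least-recently-used (LRU) replacement is an assumption made here for illustration, and the BufferManager class, its read_page callback, and the dictionary standing in for disk are all hypothetical.

```python
from collections import OrderedDict
from typing import Callable


class BufferManager:
    """Keeps at most `capacity` pages in memory, evicting by LRU (an assumption)."""

    def __init__(self, capacity: int, read_page: Callable[[int], bytes]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.read_page = read_page          # hypothetical disk-read callback
        self.frames: "OrderedDict[int, bytes]" = OrderedDict()
        self.disk_reads = 0

    def get_page(self, page_id: int) -> bytes:
        if page_id in self.frames:
            # Hit: mark the page as most recently used.
            self.frames.move_to_end(page_id)
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            # Miss with a full buffer: evict the least recently used page.
            # Simplification: no dirty pages, so nothing is written back.
            self.frames.popitem(last=False)
        self.disk_reads += 1
        data = self.read_page(page_id)
        self.frames[page_id] = data
        return data


# Hypothetical "disk" of ten pages, larger than the two-page buffer.
disk = {i: f"page-{i}".encode() for i in range(10)}
buffer = BufferManager(capacity=2, read_page=disk.__getitem__)
for page_id in [0, 1, 0, 2, 0, 1]:
    buffer.get_page(page_id)
```

With a two-page buffer, the six requests above cause four disk reads: the repeated accesses to page 0 are served from memory, while pages 1 and 2 displace each other.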
The storage manager implements several data structures as part of the physical system implementation:
•Data files, which store the database itself.
•Data dictionary, which stores metadata about the structure of the database, in particular the schema of the database.
•Indices, which provide fast access to data items that hold particular values.

The Query Processor
The query processor components include:
•DDL interpreter, which interprets DDL statements and records the definitions in the data dictionary.
•DML compiler, which translates DML statements in a query language into an evaluation plan consisting of low-level instructions that the query evaluation engine understands. A query can usually be translated into any of a number of alternative evaluation plans that all give the same result. The DML compiler also performs query optimization; that is, it picks the lowest-cost evaluation plan from among the alternatives.
•Query evaluation engine, which executes low-level instructions generated by the DML compiler.

Q. Data Independence:
A database management system holds a lot of data besides the users' data. A DBMS comprises three kinds of schemas, which are themselves data about data (metadata). Metadata is stored along with the database and, once stored, is hard to modify. But as a DBMS expands, the metadata needs to change over time to satisfy the requirements of users. If all of this data were tightly interdependent, making such changes would be tedious and highly complex. The metadata is therefore organized in a layered architecture, so that changing data at one layer does not affect the data at another layer.

Logical Data Independence
Logical data is data about the database; that is, it stores information about how data is managed inside. An example is a table stored in the database together with all the constraints that are applied to that relation.
Logical data independence is a mechanism that decouples the logical data from the actual data stored on disk. If we change the table format, the data residing on disk should not change.

Physical Data Independence
All schemas are logical, and the actual data is stored in bit format on disk. Physical data independence is the ability to change the physical data without impacting the schema or the logical data. For example, changing or upgrading the storage system itself, such as using SSDs instead of hard disks, should not have any impact on the logical data or schemas.
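Physical data independence can be observed on a small scale with Python's built-in sqlite3 module. The notes give replacing hard disks with SSDs as the example; this sketch substitutes a different physical-level change that is easy to run, namely adding an index, and checks that the same query returns the same rows before and after. The account table, its columns, and its rows are hypothetical sample data.

```python
import sqlite3

# In-memory database with a hypothetical account table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_number TEXT, branch_name TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [
        ("A-101", "Downtown", 500.0),
        ("A-102", "Perryridge", 400.0),
        ("A-215", "Downtown", 700.0),
    ],
)

# The query is written only against the logical schema.
query = (
    "SELECT account_number, balance FROM account "
    "WHERE branch_name = ? ORDER BY account_number"
)

before = conn.execute(query, ("Downtown",)).fetchall()

# Physical-level change: add an index. The table definition and the
# query text are untouched.
conn.execute("CREATE INDEX idx_account_branch ON account (branch_name)")

after = conn.execute(query, ("Downtown",)).fetchall()
conn.close()
```

The index may change how the rows are located, but the query and its result stay the same, which is the property the section describes.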