CUSTOMER_CODE SMUDE
DIVISION_CODE SMUDE
EVENT_CODE JAN2016
ASSESSMENT_CODE MIT202_JAN2016

QUESTION_TYPE DESCRIPTIVE_QUESTION
QUESTION_ID 13027
QUESTION_TEXT Explain Relational Database Management System and its components.

SCHEME OF EVALUATION

Relational Database Management System: A Relational Database Management System (RDBMS) provides a comprehensive and integrated approach to information management. The relational model provides the basis for a relational database and has three aspects:

1. Structures: Structures consist of a collection of objects or relations that store data. An example of a relation is a table. You can store information in a table and use the table to retrieve and modify data.
2. Operations: Operations are used to manipulate data and structures in a database. When using operations, you must adhere to a predefined set of integrity rules.
3. Integrity rules: Integrity rules are laws that govern the operations allowed on data in a database. They ensure data accuracy and consistency.
(4 Marks)

Relational database components include:
• Table
• Row
• Column
• Field
• Primary key
• Foreign key

A Table is the basic storage structure of an RDBMS and consists of columns and rows. A table represents an entity. For example, the S_DEPT table stores information about the departments of an organization.

A Row is a combination of column values in a table and is identified by a primary key. Rows are also known as records. For example, a row in the table S_DEPT contains information about one department.

A Column is a collection of one type of data in a table. Columns represent the attributes of an object. Each column has a column name and contains values that are bound by the same type and size. For example, a column in the table S_DEPT specifies the names of the departments in the organization.

A Field is the intersection of a row and a column. A field contains one data value. If there is no data in the field, the field is said to contain a NULL value.
A Primary key is a column or a combination of columns that is used to uniquely identify each row in a table. For example, the column containing department numbers in the S_DEPT table is created as a primary key, and therefore every department number is different. A primary key must contain a value. It cannot contain a NULL value.

A Foreign key is a column or set of columns that refers to a primary key in the same table or another table. You use foreign keys to establish the principal connections between, or within, tables. A foreign key must either match a primary key or else be NULL.

Rows are connected logically when required. The logical connections are based upon conditions that define a relationship between corresponding values, typically between a primary key and a matching foreign key. This relational method of linking provides great flexibility because it is independent of physical links between records.
(6 Marks)

QUESTION_TYPE DESCRIPTIVE_QUESTION
QUESTION_ID 13028
QUESTION_TEXT Discuss data replication in a distributed database.

SCHEME OF EVALUATION

The system maintains several identical replicas of the relation. Each replica is stored at a different site, resulting in data replication. The alternative to replication is to store only one copy of the relation. (2 marks)

If relation r is replicated, a copy of relation r is stored at two or more sites. In the most extreme case, we can have full replication, in which a copy is stored at every site in the system. In general, replication enhances the performance of read operations and increases the availability of data to read transactions. (2 marks)

There are a number of advantages and disadvantages to replication:
• Availability: If one of the sites containing relation r fails, then relation r may be found at another site. Thus, the system may continue to process queries involving r despite the failure of one site.
• Increased parallelism: When the majority of accesses to relation r only read the relation, several sites can process queries involving r in parallel. The more replicas of r there are, the greater the chance that the needed data is found at the site where the transaction is executing.
• Increased overhead on update: The system must ensure that all replicas of a relation r are consistent, since otherwise erroneous computations may result. This implies that whenever r is updated, the update must be propagated to all sites containing replicas, resulting in increased overhead.
(2 × 3 = 6)

QUESTION_TYPE DESCRIPTIVE_QUESTION
QUESTION_ID 13030
QUESTION_TEXT Discuss the advantages of data distribution.

SCHEME OF EVALUATION

Data sharing and Distributed Control: If a number of different sites are connected to each other, then a user at one site may be able to access data that is available at another site. For example, in a distributed banking system, it is possible for a user in one branch to access data in another branch. (2 marks)

The primary advantage of accomplishing data sharing by means of data distribution is that each site is able to retain a degree of control over data stored locally. In a distributed system, there is a global database administrator responsible for the entire system. A part of these responsibilities is delegated to the local database administrator for each site. (2 marks)

Reliability and Availability: If one site fails in a distributed system, the remaining sites may be able to continue operating. In particular, if data are replicated at several sites, a transaction needing a particular data item may find it at several sites. (2 marks)

The failure of one site must be detected by the system, and appropriate action may be needed to recover from the failure.
Although recovery from failure is more complex in distributed systems than in a centralized system, the ability of most of the system to continue operating despite the failure of one site results in increased availability. (2 marks)

Speedup Query Processing: If a query involves data at several sites, it may be possible to split the query into subqueries that can be executed in parallel by several sites. Such parallel computation allows for faster processing of a user's query. (2 marks)

QUESTION_TYPE DESCRIPTIVE_QUESTION
QUESTION_ID 118838
QUESTION_TEXT Explain the following statements with example. a. Insert b. Update c. Delete d. Group by e. View

SCHEME OF EVALUATION

Insert
The insertion facility allows new tuples to be inserted into given relations. Attributes which are not specified by the insertion statement are given null values.
Ex: Add a part with P# 14, weight 10, coloured red, with cost and selling price of 20 and 60 respectively.
    INSERT INTO P VALUES (14, 10, 'red', 20, 60)

Update
When columns are to be modified, the SET clause is used. This clause specifies the update to be made to the selected tuples.
Ex:
    UPDATE S SET city = 'Bangalore', turnover = turnover + 20 WHERE S# = 13

Delete
Deletion removes the specified tuples from the database.
Ex:
    DELETE FROM S WHERE S# = 13

Group by
This feature allows one to partition the result into a number of groups such that all rows of a group have the same value in some specified column.
Ex:
    SELECT P#, SUM(qty) FROM SP GROUP BY P#

View
A very important aspect of data definition is the ability to define alternative views of data. The process of specifying an alternative view is very similar to that of framing a query.
Ex:
    CREATE VIEW D50 AS SELECT empno, name, job FROM emp WHERE DNO = 50

QUESTION_TYPE DESCRIPTIVE_QUESTION
QUESTION_ID 118841
QUESTION_TEXT Explain different categories of SQL commands.

SCHEME OF EVALUATION

• SQL commands can be roughly divided into 3 major categories with regard to their functionality.

a.
Data Definition Statements: (04 marks)
• Used to create and maintain the database structure.
• The 2 major DDL statements are CREATE and DROP.
• CREATE DATABASE to create a database.
• DROP DATABASE to remove a database.
• CREATE TABLE to create a table.
• DROP TABLE to drop a table.
• CREATE INDEX to create an index on a column.
• DROP INDEX to drop an index.
• CREATE VIEW to create a view.
• DROP VIEW to drop a view.

b. Data Manipulation Statements: (02 marks)
Used to manipulate data in tables directly or through views. The 4 standard DML statements are:
• SELECT
• DELETE
• INSERT
• UPDATE

c. Data Control: (04 marks)
Covers 3 issues:
i. Recovery and concurrency:
• Concerns the manner in which multiple users operate upon the database.
• The updates of a transaction are made permanent by using COMMIT.
• All updates of a transaction are cancelled by using ROLLBACK.
ii. Security: 2 aspects
• GRANT: grants one or more access rights to perform data manipulation operations on relations.
• VIEW mechanism: a view can be created that hides the sensitive information and defines only that part of a relation which should be visible. A user can then be allowed to access this view.
  Ex: CREATE VIEW LOCAL AS SELECT * FROM SUPPLIER WHERE SUPPLIER.CITY = 'Delhi'
iii. Integrity constraints: for example, one can specify that an attribute of a relation will not take on null values.

QUESTION_TYPE DESCRIPTIVE_QUESTION
QUESTION_ID 118846
QUESTION_TEXT Explain Heuristics in Query Optimization in brief.

SCHEME OF EVALUATION

Heuristics in Query Optimization
Heuristic rules are applied in query optimization to modify the internal representation of a query. This representation is generally a query tree or a query graph data structure. Although some optimization techniques were based on query graphs, this approach is no longer applied, because a query graph cannot show the order of operations, which the query optimizer needs for query execution. So this unit deals mainly with the heuristic optimization of query trees.
A heuristic rule is applied to the initial query expression and produces a heuristically transformed, equivalent query expression. This is done by transforming an initial expression (tree) into an equivalent expression (tree) that is more efficient to execute. Such rules work well in most cases, but an improvement is not guaranteed in every case. Executing the query tree consists of executing an internal node operation whenever its operands are available and then replacing that internal node by the relation that results from executing the operation.

Query trees and query graphs
The query tree is a tree data structure that represents the relational algebra expression in the query optimization process. The leaf nodes of the query tree correspond to the input relations of the query. The internal nodes represent the relational algebra operations. The system executes an internal node operation whenever its operands are available, and the internal node is then replaced by the relation obtained from executing the operation. Execution terminates when the root node is executed and produces the result relation for the query.

A query can also be represented by a query graph. In this case the relations in the query are represented by relation nodes, displayed as single circles. Constant nodes represent constant values and are displayed as double circles or ovals. Selection and join conditions are represented by the graph edges. Finally, the attributes to be retrieved from each relation are displayed in square brackets above each relation. The query graph representation does not give an order for performing the operations, because there is only a single graph corresponding to each query. Hence query trees are preferred over query graphs: the query optimizer needs the order of operations for query execution, which a query graph cannot show.
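The bottom-up execution of a query tree described above can be sketched as follows. This is only an illustrative sketch, not part of the marking scheme: the node classes (Leaf, Select, Project, Join), the execute function, and the emp/dept sample data are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]
Relation = List[Row]


@dataclass
class Leaf:
    """Leaf node: an input relation of the query (hypothetical class)."""
    name: str
    rows: Relation


@dataclass
class Select:
    """Internal node: keeps the rows that satisfy a predicate."""
    predicate: Callable[[Row], bool]
    child: object


@dataclass
class Project:
    """Internal node: keeps only the listed attributes."""
    attributes: Sequence[str]
    child: object


@dataclass
class Join:
    """Internal node: combines rows of two operands that satisfy a condition."""
    condition: Callable[[Row, Row], bool]
    left: object
    right: object


def execute(node) -> Relation:
    """Execute a node once its operands are available (children first)."""
    if isinstance(node, Leaf):
        return node.rows
    if isinstance(node, Select):
        return [r for r in execute(node.child) if node.predicate(r)]
    if isinstance(node, Project):
        out: Relation = []
        for r in execute(node.child):
            projected = {a: r[a] for a in node.attributes}
            if projected not in out:  # drop duplicates, as in a relation
                out.append(projected)
        return out
    if isinstance(node, Join):
        left, right = execute(node.left), execute(node.right)
        return [{**l, **r} for l in left for r in right if node.condition(l, r)]
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Assumed sample data: employees and their departments.
emp = [
    {"empno": 1, "name": "Asha", "dno": 50},
    {"empno": 2, "name": "Ravi", "dno": 60},
    {"empno": 3, "name": "Meena", "dno": 50},
]
dept = [{"dno": 50, "dname": "Sales"}, {"dno": 60, "dname": "Research"}]

# Tree for: names of employees who work in the Sales department.
tree = Project(
    ["name"],
    Select(
        lambda r: r["dname"] == "Sales",
        Join(lambda e, d: e["dno"] == d["dno"], Leaf("emp", emp), Leaf("dept", dept)),
    ),
)

result = execute(tree)  # the root node produces the result relation
print(result)
```

The leaves supply the input relations, each internal node runs only after its children have produced their relations, and the value returned for the root is the result relation of the query.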
Heuristic optimization of query trees:
Two relational algebra expressions are said to be equivalent if they generate relations with the same set of attributes and the same set of tuples, although the attributes may be ordered differently. Hence, while doing heuristic optimization on query trees, many different relational algebra expressions can generally be found that are equivalent and correspond to the same query. A query tree can be drawn for every relational algebra expression, so there can be many different query trees corresponding to the same query. The query parser generates a standard initial query tree corresponding to the SQL query, without optimization. Once this simple standard form of the query tree has been produced, the heuristic query optimizer transforms the initial query tree into a final query tree that is efficient to execute.

General outline of a Heuristic Algebraic Optimization Algorithm
Heuristic rules are generally applied as per the following steps:
1. SELECT operations with conjunctive conditions are broken up into a cascade of SELECT operations.
2. SELECT operations are moved as far down the query tree as is permitted by the attributes involved in the select condition.
3. Leaf nodes of the tree are rearranged by positioning the leaf node relations with the most restrictive SELECT operations so that they are executed first in the query tree representation, and by making sure that the ordering of leaf nodes does not cause CARTESIAN PRODUCT operations.
4. CARTESIAN PRODUCT operations are combined with a subsequent SELECT operation in the tree into a JOIN operation, if the condition represents a join condition.
5. Lists of projection attributes are broken down and moved down the tree as far as possible by creating new PROJECT operations as needed.
6. Subtrees are identified that represent groups of operations that can be executed by a single algorithm.
These steps are applied using general transformation rules that convert relational algebra operations into equivalent ones. During the transformation, the meaning of the operations and the resulting relations must not change. (2.5 marks each)
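As an illustration of these steps, the sketch below compares an unoptimized plan (SELECT over a CARTESIAN PRODUCT) with a heuristically transformed plan in which the conjunctive SELECT is split, the city condition is pushed down to its leaf relation, and the product is combined with the join condition into a JOIN. The helper functions, the attribute names (S.sno, SP.pno and so on) and the sample supplier and shipment rows are assumptions made for the example and are not taken from the marking scheme.

```python
from itertools import product as cartesian


def select(rows, pred):
    """SELECT: keep the rows that satisfy pred."""
    return [r for r in rows if pred(r)]


def project(rows, attrs):
    """PROJECT: keep only attrs and drop duplicate rows."""
    out = []
    for r in rows:
        p = {a: r[a] for a in attrs}
        if p not in out:
            out.append(p)
    return out


def cross(left, right):
    """CARTESIAN PRODUCT of two relations."""
    return [{**l, **r} for l, r in cartesian(left, right)]


def join(left, right, cond):
    """JOIN: pair up only the rows that satisfy cond."""
    return [{**l, **r} for l in left for r in right if cond(l, r)]


# Assumed sample data: suppliers S and shipments SP.
S = [
    {"S.sno": 11, "S.city": "Delhi"},
    {"S.sno": 12, "S.city": "Mumbai"},
    {"S.sno": 13, "S.city": "Bangalore"},
]
SP = [
    {"SP.sno": 11, "SP.pno": 1, "SP.qty": 100},
    {"SP.sno": 12, "SP.pno": 1, "SP.qty": 200},
    {"SP.sno": 11, "SP.pno": 2, "SP.qty": 300},
    {"SP.sno": 13, "SP.pno": 3, "SP.qty": 50},
]


def naive_plan():
    """Initial tree: PROJECT(SELECT(city AND join condition, S x SP))."""
    product_rows = cross(S, SP)
    selected = select(
        product_rows,
        lambda r: r["S.city"] == "Delhi" and r["S.sno"] == r["SP.sno"],
    )
    largest = max(len(product_rows), len(selected))
    return project(selected, ["SP.pno"]), largest


def optimized_plan():
    """Transformed tree: PROJECT(JOIN(SELECT(city, S), SP))."""
    delhi = select(S, lambda r: r["S.city"] == "Delhi")  # steps 1 and 2
    joined = join(delhi, SP, lambda s, sp: s["S.sno"] == sp["SP.sno"])  # step 4
    largest = max(len(delhi), len(joined))
    return project(joined, ["SP.pno"]), largest


naive_result, naive_largest = naive_plan()
optimized_result, optimized_largest = optimized_plan()
print(naive_result == optimized_result, naive_largest, optimized_largest)
```

Both plans return the same relation, so the transformation preserves the meaning of the query, while the largest intermediate relation in this small example shrinks from 12 rows to 2.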