* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download RDBMS Study Material Unit-1
Microsoft SQL Server wikipedia , lookup
Oracle Database wikipedia , lookup
Relational algebra wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Concurrency control wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
ContactPoint wikipedia , lookup
Clusterpoint wikipedia , lookup
US03CBCA01 (Relational Database Management Systems - I) Unit - I CHARUTAR VIDYA MANDAL’S SEMCOM Vallabh Vidyanagar Faculty Name: Ami D. Trivedi Class: SYBCA Subject: US03CBCA01 (Relational Database Management Systems-I) *UNIT – 1 (Introduction to Relational Database Theory and Data Modeling) DATABASE A database can be defined as a collection of coherent, meaningful data. The phrase collection of coherent data needs to have a point of reference to be understood. DBMS The system that would help in managing data in such a database, called Data Base Management System. DBMS is a system that allows inserting, updating, deleting and processing of data. BENEFITS OF DBMS 1. 2. 3. 4. 5. 6. 7. The amount of data redundancy in stored data can be reduced. No more data inconsistencies. Stored data can be shared by a single or multiple users. Standards can be set and followed. Data integrity can be maintained. Security of data can be simply implemented. Data independence can be achieved. DBMS V/S RDBMS DBMS RDBMS In DBMS relationship between two tables In RDBMS relationship between two or file are maintained programmatically. tables or files can be specified at the time of table creation. DBMS does not support client / server Most of the RDBMS support client / architecture. server architecture. DBMS does not support distributed Most of the RDBMS support distributed database. database. In DBMS there is no security of data. IN RDBMS there are multiple levels of security Logging in O/S level Command level Object level Each table is given an extension in DBMS Many tables are grouped in one database in RDBMS DBMS may satisfy less than 7 to 8 rules RDBMS may satisfy more than 7 to 8 of Dr.E.F.codd rules of Dr.E.F.codd Page 1 of 16 US03CBCA01 (Relational Database Management Systems - I) Naming Conventions: Field Record File Unit - I Naming Conventions: Column, Attributes Row, Tuple, Entity Table, Relation, Entity Class *THREE LEVEL ARCHITECTURE PROPOSAL FOR DBMS The Architecture, is divided into three levels : 1. External level 2. Conceptual level And 3. Internal level The view at each of the level is described by a SCHEMA. A Schema is an outline or a plan that describes the records and relationships existing in the view. 1. INTERNAL VIEW 2. CONCEPTUAL VIEW 3. EXTERNAL VIEW 1. INTERNAL VIEW In this view at the lowest level of abstraction, closet to the physical storage methods used. It indicates how data will be stored and describes the data structures and access method to be used by the database. The internal view is expressed by the internal view schema, which contains the definition of the stored record, the method of representing the data fields, and the access of it. 2. CONCEPTUAL OR GLOBAL VIEW At this level, all the database entities and the relationships among them are included. One conceptual view represents the entire DATABASE. And this conceptual view is defined by conceptual view is defined by conceptual schema. It describes all the records and relationships includes in the conceptual view. There is only one conceptual schema per database. The schema also contains the methods of deriving the object in the conceptual view from the object in the internal view. Page 2 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I 3. EXTERNAL OR USER VIEW The external or user view is at the highest level of database abstraction (concept) where only those portions of the database of concern to a user or application program included. Any number of user views may exist for a given global or conceptual view. Each external view is described by means of a scheme called an external schema. The external schema consists of the definition of the logical records and the relationships in the external view. The external view also contains the method of deriving the objects in the external view from the objects in the conceptual view. *INTRODUCTION TO DATA MODELS A data model is a picture or description which depicts how data is to be arranged to serve a specific purpose. The data model depicts what that data items are required, and how that data must look. Some data models, which data records are connected or related within a file structure. These are called record or structural data models. Some data models are used to identify the subjects of corporate data processing these are called entity-relationship data models. Still another type of data model is used for analytic purposes to help the analyst to solidify the semantics associated with critical corporate or business concepts. DATA MODELS can be classified as 1. Hierarchical 2. Network 3. Relational Page 3 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I 1. HIERARCHICAL MODEL The hierarchical data model organizes data in a tree structure. There is a hierarchy of parent and child data segments. This structure implies that a record can have repeating information, generally in the child data segments. Data in a series of records, which have a set of field values attached to it. It collects all the instances of a specific record together as a record type. These record types are the equivalent of tables in the relational model, and with the individual records being the equivalent of rows. To create links between these record types, the hierarchical model uses Parent Child Relationships. These are a 1:N mapping between record types. i.e. it represent a one-to-many relationship between two entities where the two are respectively parent and child For example, an organization might store information about an employee, such as name, employee number, department, salary. The organization might also store information about an employee's children, such as name and date of birth. The employee and children data forms a hierarchy, where the employee data represents the parent segment and the children data represents the child segment. If an employee has three children, then there would be three child segments associated with one employee segment. In a hierarchical database the parent-child relationship is one to many. This restricts a child segment to having only one parent segment. Hierarchical DBMSs were popular from the late 1960s, with the introduction of IBM's Information Management System (IMS) DBMS, through the 1970s. 2. NETWORK MODEL The popularity of the network data model coincided with the popularity of the hierarchical data model. Some data were more naturally modeled with more than one parent per child. So, the network model permitted the modeling of many-to-many relationships in data. In 1971, the Database Task Group of the Conference on Data Systems Languages (DBTG / CODASYL) formally defined the network model. The basic data modeling construct in the network model is the set construct. A set consists of an owner record type, a set name, and a member record type. A member record type can have that role in more than one set, hence the multi parent concept is supported. An owner record type can also be a member or owner in another set. The data model is a simple network, and link and intersection record types (called junction records by IDMS) may exist, as well as sets between them. Thus, the complete network of relationships is represented by several pair wise sets; in each set some (one) record type is owner (at the tail of the network arrow) and one or more record types are members (at the head of the relationship arrow). Usually, a set defines a 1:M relationship, although 1:1 is permitted. The CODASYL network model is based on mathematical set theory. Page 4 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I 3. RELATIONAL MODEL In relational model, it has the advantage of being simple in principle, user can express their queries in a powerful query language. (RDBMS - relational database management system) A database based on the relational model developed by E.F. Codd. A relational database allows the definition of data structures, storage and retrieval operations and integrity constraints. In such a database the data and relations between them are organized in tables. A table is a collection of records and each record in a table contains the same fields. Properties of Relational Tables Values are Atomic(tiny) Each Row is Unique Column Values Are of the Same Kind The Sequence of Columns is Insignificant The Sequence of Rows is Insignificant Each Column Has a Unique Name Certain fields may be designated as keys, which mean that searches for specific values of that field will use indexing to speed them up. Where fields in two different tables take values from the same set, a join operation can be performed to select related records in the two tables by matching values in those fields. Often, but not always, the fields will have the same name in both tables. RDBMS stands for Relational Database Management System. Often RDBMS is referred just as “Database”. RDBMS technology is mature now. That’s why it’s a good idea to use RDBMS for Strong AI development. RDBMS examples MS SQL Server, Oracle, MySQL, MS Access, SyBase, … RELATIONAL DATA MODEL A relational database matches data by using common characteristics found within the data set. The resulting groups of data are organized and are much easier for many people to understand. For example, a data set containing all the real-estate transactions in a town can be grouped by the year each transaction occurred, the sale price, a buyer's last name and so on. Such a grouping uses the relational model (a technical term for this is schema). Hence, such a database is called a "relational database." Page 5 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I The software used to do this grouping is called a relational database management system (RDBMS). The term "relational database" often refers to this type of software TERMINOLOGY The term relational database was originally defined by and is attributed to Edgar Codd at IBM Almaden Research Center in 1970. Relational database theory uses a set of mathematical terms, which are roughly equivalent to SQL database terminology. The table below summarizes some of the most important relational database terms and their SQL database equivalents. A relation is defined as a set of tuples that have the same attributes. A tuple usually represents an object and information about that object. Objects are typically physical objects or concepts. A relation is usually described as a table, which is organized into rows and columns All the data referenced by an attribute are in the same domain and conform to the same constraints. Page 6 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I DOMAIN: A Domain is a pool of values from where one or more attributes (column/field) can draw their actual values. Example - For above table domain name for this field is COUNTRY, if we take values from country field. TUPLE: According to the relational model, every relation or table is made of many Tuples. They are also called row/records. ATTRIBUTES : The term Attributes refers to characteristics. The characteristics of the tuple are reflected by its attributes or fields. This simply means that the column contains will be defined by the attributes of that column. *OPERATIONS ON DATA (DDL, DML) SQL is a language that provides an interface to a rational database system. SQL was developed by IBM in the 1970 for use in system. SQL is often pronounced SEQUEL In common use SQL also encompasses DML, DDL, used for creating and modifying tables and other database structures. DDL (Data Definition Language) DDL is set of SQL commands used to create, modify and delete database structures but not the data. These commands are normally used by a general user, who should be accessing the database via an application. DDL statements are used to define the database structure or schema. Some examples: CREATE - to create objects in the database ALTER - alters the structure of the database DROP - delete objects from the database TRUNCATE - remove all records from a table, including all spaces allocated for the records are removed COMMENT - add comments to the data dictionary GRANT – gives user’s access privilege to database REVOKE – Withdraw access privilege given with the GRANT command. RENAME - rename an object DML (Data Manipulation Language) DML is the area of SQL that allows changing data within the database. DML statements are used for managing data within schema objects. Page 7 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I Some examples: INSERT - insert data into a table UPDATE - updates existing data within a table DELETE - deletes all records from a table, the space for the records remain MERGE - UPSET operation (insert or update) CALL - call a PL/SQL or Java subprogram EXPLAIN PLAN - explain access path to data LOCK TABLE - control concurrency DCL (Data Control Language) SET TRANSACTION – change transaction option like what rollback segment to use GRANT - gives user's access privileges to database REVOKE - withdraw access privileges given with the GRANT command DQL (Data Query Language Statement) SELECT - retrieve data from the a database *RELATIONSHIP AND RELATIONSHIP TYPES A relationship can be defined as an association among ENTITIES. An association of several entities in a r-y model is called relationship. The relationship represents the fact that the LECTURER teaches several STUDENTS and the STUDENT, And STUDENT is taught by several LECTURERS. ENTITIES AND RELATIONSHIP 3 Types of Relationship exists among entities. 1. One-to-One (1:1) 2. One-to-Many (1:M) 3. Many-to-Many (M:M) Page 8 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I 1. ONE-TO-ONE (1:1) A One-to-One relationship is an association between two entities. For example: In University each Department have one head of department. More Over, One member cannot be a head of more than one department. 2. ONE-TO-MANY A One-to-Many relationship exists when one entity is related to more than one entity. For example: A father may have many children but, a child has only one father. More than one subject is taught by one lecturer only. 3. MANY-TO-MANY RELATIONSHIP A many-to-many relationship describes entities that may have many relations in both the directions. For example: One customer may buy many items and one item may be bought by many customers. A student can take many courses in university, and many students can register for given course. Page 9 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I *INTEGRITY CONSTRAINT Integrity Constraints are the set of constructs provided by a data model for specifying conditions that must be satisfied by the data. An Integrity constraint (IC) is a condition specified on a database schema and restricts the data that can be stored in an instance of the database. The integrity rules implicitly or explicitly define the set of consistent database states or changes of states or both. INTEGRITY RULE 1 (Entity Integrity) If attribute A of relation R is a prime attribute of R, then a cannot accepts null values. (i.e concept of primary key) INTEGRITY RULE2 (Referential Integrity) Given two relations R and S, suppose R refers to the relation S via a set of attributes that form the primary key of S and this Set of attributes forms a foreign Key in R. The value of the foreign key in the tuple R must either be equal to PK of a tuple S or be entirely null. *CODD’S RULES There are 12 Codd’s Rules for RDBMS. Rule 1: The Information Rule All data should be presented in table form Rule 2: Guaranteed Access Rule All data should be accessible without ambiguity. This can be accomplished through a combination of the table name, primary key, and column name Rule 3: Systematic Treatment of Null Values A field should be allowed to remain empty. This involves the support of a null value, which is distinct from an empty' string or a number with a value of zero. Of course, this can't apply to primary keys. In addition, most database implementations support the concept of a not-null field constraint that prevents null values in a specific table column Rule 4: Dynamic On-Line Catalog based on the Relational Model A relational database must provide access to its structure through the same tools that are used to access the data. This is usually accomplished by storing the structure definition within special system tables Page 10 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I Rule 5: Comprehensive Data Sublanguage Rule The database must support at least one clearly defined language that includes functionality for data definition, data manipulation, data integrity, and database transaction control. All commercial relational databases use forms of standard SQL (i.e. Structured Query Language) as their supported comprehensive language Rule 6: View Updating Rule Data can be presented in different logical combinations called views. Each view should support the, same full range of data manipulation that has direct access to a table, available. In practice, providing update and delete access to logical views is difficult and is not fully supported by any current database Rule 7: High-level Insert, Update, and Delete Data can be retrieved from a relational database in sets constructed of data from multiple rows and/or multiple tables. This rule states that insert, update, and delete operations should be supported for any retrievable set rather than just for a single row in a single table Rule 8: Physical Data Independence The user is isolated from the physical method of storing and retrieving information from the database. Changes can be made to the underlying architecture (hardware, disk storage methods) without affecting how the user accesses it Rule 9: Logical Data Independence How data is viewed should not be changed when the logical structure (table's structure) of the database changes. This rule is particularly difficult to satisfy. Most databases rely on strong ties between the data viewed and the actual structure of the underlying tables Rule 10: Integrity Independence The database language (like SQL) should support constraints on user input that maintain database integrity. This rule is not fully implemented by most major vendors. At a minimum, all databases do preserve two constraints through SQL. Page 11 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I No component of a primary key can have a null value. If a foreign key is defined in one table, any value in it must exist as a primary key in another table. Rule 11: Distribution Independence A user should be totally unaware of whether or not the database is distributed (whether parts of the database exist in multiple locations). A variety of reasons make this rule difficult to implement. Rule 12: Non subversion Rule There should be no way to modify the database structure other than through the multiple row database language (like SQL). Most databases today support administrative tools that allow some direct manipulation of the data structure. *ENTITY-RELATIONSHIP MODELING The entity-relationship (E-R) data model grew out of the exercise of using commercially available DBMSs to model application databases. Earlier commercial systems were based on the hierarchical and network approach. The entity-relationship model is a generalization of these models. It allows the representation of explicit constraints as well as relationships. Even though the E-R model has some means of describing the physical database model, it is basically useful in the design and communication of the logical database model. In this model, objects of similar structures are collected into an entity set. The relationship between entity sets is represented by a named E-R relationship and is 1: 1, 1: M, or M: N, mapping from one entity set to another. The database structure, employing the E-R model is usually shown pictorially using entity-relationship (E-R) diagrams. The entities and the relationships between them are shown in bellow using the following conventions: An entity set is shown as a rectangle. A diamond represents the relationship among a number of entities, which are connected to the diamond by lines. The attributes, shown as ovals, are connected to the entities or relationships by lines. Diamonds, ovals, and rectangles are labeled. The type of relationship existing between the entities is represented by giving the cardinality of the relationship on the line joining the relationship to the entity. Page 12 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I ENTITY-RELATIONSHIP DIAGRAM DIFFERENT TYPES OF ENTITIES and ATTRIBUTES An entity is an object that is of interest to an organization. Objects of similar types are characterized by the same set of attributes or properties. Such similar objects form an entity set or entity type. Two objects are mutually distinguishable and this fact is represented in the entity set by giving them unique identifiers. Consider an organization such as a hotel. Some of the objects of concern to it are its employees, rooms, guests, restaurants, and menus. These collections of similar entities form the entity sets, EMPLOYEE, ROOM, GUEST_LIST, RESTAU¬RANT, MENUS. Given an entity set, we can determine whether or not an object belongs to it. An object may belong to more than one entity set. For example, an individual may be part of the entity set STUDENT, the entity set PART_TIME-EMPLOYEE, and the entity set PERSON. Entities interact with each other to establish relationships of various kinds. Objects are represented by their attributes and, as objects are interdistinguishable, a subset of these attributes forms a primary key or key for uniquely identifying an instance of an entity. Entity types that have primary keys are called strong entities. The entity set EMPLOYEE would qualify as a strong entity because it has an attribute Employee_Id that uniquely identifies an instance of the entity EMPLOYEE; no two instances of the entity have the same value for the attribute Employee_Id. Below figure shows some examples of strong entities. Only the attributes that form the primary keys are shown. Entities may not be distinguished by their attributes but by their relationship to another entity. Page 13 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I Recall the representation of the entity EMPLOYEE wherein the 1:M association involving the attributes (Dependent-Name, Relationship_to_Employee) is removed as a separate entity, DEPENDENTS. We then establish a relationship, DEDUCTIONS, between the modified entity EMPLOYEE and DEPENDENTS as Strong Entities: Converting an attribute association to a relationship Shown in above figure In this case, the instances of the entity from the set DEPENDENTS are distinguishable only by their relationship with an instance of an entity from the entity set EMPLOYEE. The relationship set DEDUCTIONS is an example of an identifying relationship and the entity set DEPENDENTS is an example of a weak entity. Instances of weak entity sets associated with the same instance of the strong entity must be distinguishable from each other by a subset of the attributes of the weak entity (the subset may be the entire weak entity). This subset of attributes is called the discriminator of the weak entity set. The primary key of a weak entity set is thus formed by using the primary key of the strong entity set to which it is related, along with the discriminator of the weak entity. Page 14 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I A Binary relationship between different entity sets: Figure-1 RELATIONSHIP An association among entities is called a relationship. We looked at a relationship indirectly when we converted a 1 : M association into a strong entity, a weak entity, and a relationship. A collection of relationships of the same type is called a relation-ship set. A relationship is a binary relationship if the number of entity sets involved in the relationship is two. In Figure(1), ENROLLMENT is an example of a binary relationship involving two distinct entity sets. However, the entities need not be from distinct entity sets. Figure - 2 Figure (2) illustrates binary relationships that involve the same entity sets. Page 15 of 16 US03CBCA01 (Relational Database Management Systems - I) Unit - I A marriage, for example, is a relation between a man and woman that is modeled by a relationship set MARRIAGE between two instances of entities derived from the entity set PERSON. TERNARY RELATIONSHIP Figure - 3 A relationship that involves N entities is called an N-ary relationship. In Figure (3), COMPUTING is an example of a ternary relationship involving three entity sets. COMPUTING represents the relationship involving a student using a particular computing system to do the computations for a given course. Disclaimer The study material is compiled by Ami D. Trivedi. The basic objective of this material is to supplement teaching and discussion in the classroom in the subject. Students are required to go for extra reading in the subject through library work. Page 16 of 16