Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
MIS 3053 Database Design & Applications The University of Tulsa Professor: Akhilesh Bajaj RM/SQL Lecture 1 ©Akhilesh Bajaj, 2000, 2002, 2003, 2004. All rights reserved. Goals for Today • Learn the basics of the relational model • Learn the notions of superkeys, candidate keys and primary keys • Get started with relational algebra - Basics of relational algebra - The select operation -The project operation -The assignment operation What is a Data Model? • It is a collection of conceptual tools for describing data, data relationships, data semantics and consistency constraints. • Essentially answers the following questions: - What is data? - How are the data related to each other? - How can we query the data and change it? - How do we ensure that the data represents a correct view of the real world (data integrity)? • Examples of data models include the relational model, the ER model, the network model and the hierarchical model. • The network and hierarchical models are found on older mainframe systems (usually in large organizations). Basics of the Relational Model • Primary model used today for data-processing applications • Database systems like Oracle, Sybase, Informix, etc. support this model. • Has strong theory associated with it. • Proposed in 1970-71 and became mainstream in the early 1980-s onwards. • Still the most robust model out there, many experts predict it will be around for tens of years more. Basics of the Relational Model • A database represents information about the enterprise. • Information is structured, so as to be easy to store, retrieve and modify. • In the relational model, the information is structured by means of a relational schema. • A relational schema consists of the table names (also called relation names), the column names (or attribute names), the primary keys of each table and a few more objects we will study. • In the ER module, we learnt how to capture user requirements in the ER model, and to map this to a simple relational schema. Basics of the Relational Model Example of a relational schema: PROFESSORS(fac_id, f_name, research_interest) COURSES(c_num, c_name, num_credits) TEACH(fac_id, c-num) • Once we have a schema, the actual information is stored in the form of rows (tuples). Each row represents a “piece” or record of information that represents a part of the real world. Important A table can never have duplicate rows. Example of rows: PROFESSORS Fac_id F_name Research_interest 1 Akhilesh Bajaj Information Systems 2 Sudha Ram Database Modeling Rows or records or tuples Basics of the Relational Model • Each attribute in a table has a domain. The domain is the set of values that the attribute can take. This is analogous to domains in the ER model. • An attribute can have only one value at a time in a row, and the value is atomic (cannot be further broken up). • It is possible for attributes to have a null value, provided they are not part of the primary key of the table. Null can mean one of two things, we don’t know which. E.g., a dependent value being null could mean that there are no dependents, OR that there are dependents but we don’t have information on them. Superkeys, Candidate Keys, Primary Keys • A superkey is any set of attributes in a table, such that no 2 rows can have the same values when we consider all the attributes in the superkey together. E.g., (s_id, name, address, GPA, age), (s_id, name), (s_id, address), (s_id, GPA) are all sets of attributes that are examples of superkeys for the table STUDENTS. • A candidate key is a superkey of a table such that no subset of these attributes is also a superkey. E.g., (s_id) is a candidate key of STUDENTS. • A table can have several superkeys and more than one of these may be candidate keys. A primary key is selected from these candidate keys. Usually, if the table represents real-world objects, we designate an id as a primary key. Superkeys, Candidate Keys, Primary Keys All possible attribute sets of a table T All possible superkeys of a table T One set of attributes that make up the primary key of table T All possible candidate keys of a table T Basics of Relational Algebra • Relational algebra is the query language of the relational model. • Recall that in a data model, we need to be able to ask questions of the data also. We did not cover this in the ER model. • Query languages are procedural or non-procedural. • In a procedural language, the user instructs the database system to perform a series of steps. Relational Algebra is procedural. • In a non-procedural language, the user describes the information desired, without giving a specific procedure for obtaining that information. SQL is non procedural. Basics of Relational Algebra • The fundamental operations in relational algebra are: - Select -Project -Assign -Union - Set Difference - Cartesian Product - Rename [There are other operations as well.] • In relational algebra, the input to an operation is one or two relations (or tables) , and the output of an operation is one relation. • Select, project, rename are unary operators (input is one table) • Union, set difference, cartesian product are binary operators (input is two tables). The Select Operation • Operator: SELECT or • Works on: One table • Semantics: End result is the set of rows from the table that satisfy a given condition. • Condition: based on some attribute values and >,<, =, <>, >=, <=, AND, OR, NOT. E.g., name=“Akhilesh” AND research_interest = “MIS” is a condition. E.g., PROFESSORS • Syntax: SELECTCONDITION ( INPUT_TABLE) The above expression yields an output table whose rows satisfy the condition. The columns remain the same. The Project Operation • Operator: PROJECT or P • Works on: One table • Semantics: End result is the set of columns from the table that are specified. Column Specification: column-name1, column-name2, …. • Syntax: PROJECTCOLUMN_NAMES ( INPUT_TABLE) The above expression yields an output table that contains only the columns that are specified. Has 3 columns E.g., PROJECTf_name (PROFESSORS) yields as output a table with only one column: f_name. The Assignment Operation • Operator: ASSIGN or • Works on: Two tables • Semantics: The relational algebra expression on the right is assigned to the table on the left (usually a temporary table). This allows us to write an expression as a sequence of assignments. • Syntax: temp_table1 Relational Algebra Expression E.g., temp1 PROJECTf_name (PROFESSORS) assigns the expression on the right to temp1.