Lecture 2

This lecture introduces some more terminology: primary keys, foreign keys, candidate keys, access keys and some design concepts. There is a brief mention of table structures, constraints and values, some examples of query output, and a first look at SQL.

Relational Database Concepts
• The relational data model was developed by Dr. E. F. Codd (a mathematician) in the late 1960s and early 1970s.
• The theory of normalisation of data is closely linked.
• Databases based on the relational model should be easy to use and understand.
• There should be no need for the user to be aware of the physical structure of the underlying files.
• Most databases developed for commercial use are now based on the relational model.

Data Models
Codd suggests that any data model has three components:
• the data structures
• the integrity constraints
• the data manipulation operators

The Relational Data Model
• Data Structures - domain, attribute, relation, row (tuple), primary key, degree, cardinality.
• Integrity Constraints - entity integrity and referential integrity.
• Data Manipulation Operations - defined through relational algebra and the equivalent relational calculus.

The Beginning of the Relational Model
• In 1969, Dr. Edgar F. Codd published an original paper titled ‘Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks’.
• In 1970, there was a revised version titled ‘A Relational Model of Data for Large Shared Data Banks’.

Dr. Codd’s Relational Theory
He also published
• ‘Relational Completeness of Data Base Sublanguages’
• ‘A Data Base Sublanguage Founded on the Relational Calculus’
• ‘Further Normalisation of the Data Base Relational Model’
• ‘Interactive Support for Non-programmers: The Relational and Network Approaches’
• ‘Extending the Database Relational Model to Capture More Meaning’

Dr. Codd also produced papers relating to
• Multiprogramming
• Natural Language processing

The Relational Model serves as the basis for the theory of data. Dr. Codd introduced predicate logic as a foundation for database management, and he defined both a relational algebra and a relational calculus as a basis for working with data in relational form.

Dr. Codd’s Relational Theory
The original paper (‘Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks’, 1969) contains these sections:
1 A Relational View of Data
2 Some Linguistic Aspects
3 Operations on Relations
4 Expressible, Named and Stored Relations
5 Derivability, Redundancy and Consistency
6 Data Bank Control

Dr. Codd’s Relational Theory
• The relational model described ‘provides a means of describing data in terms of its natural structure’ - no machine representation details.
• The model also provided a basis for constructing a high-level retrieval language with ‘maximal data independence’, which led to the development of SQL.

Meet Dr. E. F. Codd

Relational Data Structure

Relation: EMPLOYEE
Attributes: Empno, Name, Gender, Mgr Empno

Empno  Name   Gender  Mgr Empno
E1     Jones  Male    E65
E6     Smith  Male    E28
E28    Jones  Female  -

Domains:
• Empno: E1 to E125
• Gender: Female, Male
And what about ‘expired’ Empnos?

Heading, Body and Domains

Relation: Employee
Heading (attributes): Empno, Name, Mgrno
Body (values by column):
• Empno: E1, E2, E3
• Name: Red, Brown, Black
• Mgrno: E1, E1 (one row has no entry)
Domains:
• Person Name: Red, Brown, Black, Blue
• Empno: E1, E2, E3, E4

Value Sets and Domains
• Domains in a Relational Database can be extensive and complex.
• A ‘domain’ (a restriction of value or expression) can be applied to the result of a function or of a derived value.
• For example, multiplying a person’s age by the person’s I.D. would not give a realistic value. A domain constraint would ensure that this process, if initiated, would not proceed and would result in an error message being displayed.

Value Sets and Domains
• The arithmetic addition of an I.D. and a date of birth would also give an unrealistic value.
• Domains can be used to limit which attributes can be associated with other attributes. This leads to interesting and complex processes - Rules and Procedures (Ingres) and Triggers and Constraints (Oracle).
• Access has the option of delving into Visual Basic.
• Does anyone know what SQL Server has available?

Relational Data Structures
• The only structure available is a 2-dimensional file of data.
• This is known as a relation or table.
• Each entity corresponds to a table and each attribute to a column (or field) in that table.
• Each entity occurrence corresponds to a row of the table.

Properties of Relations
• Data is held in tables
• There is no order of data in the tables - either in row or attribute
• Primary Key - Foreign Key relationship
• Data Typing including NULLS
• Query Access - insert, update, delete, retrieval
• Indexing on Candidate (and Primary) keys

Some Concepts
• A database system is a computerised record keeping system.
• A database is a collection of structured data files and associated indexes.
• A database user must be able to add, retrieve, insert, update and delete data and files.
• A set is any collection of definite, distinguishable things. Olympians, for instance, are a ‘set’ of people.
• The term ‘distinguishable’ means that, on inspection of any 2 things which fit into a set, it must be possible to decide whether they are identical or different.

Some Concepts
• The term ‘definite’ means that if the set is known, and the thing is known, a decision can be made that (a) the thing belongs to the set or (b) the thing does not belong to the set.
• For the set to be known, it is sufficient if the members are known.

A Relation
• Relations exist between 2 or more things.
• There is a relation between Lleyton Hewitt and tennis.
• There is a relation between Steve Waugh and cricket.
• There is a relation between Tiger Woods and golf.
We could present this as:

Name            Sport
Lleyton Hewitt  Tennis
Steve Waugh     Cricket
Tiger Woods     Golf

and we have a relation of degree 2. We can also have relations of any required degree: 3, 4, 5 ...

A Relation
• This is a table of ‘ordered pairs’ and the relationship is directional: Lleyton Hewitt plays Tennis - Tennis doesn’t play Lleyton Hewitt.
• This is a binary relation.
• The order is horizontal, and is row limited.
• The order of the rows in the table is immaterial to the data.
• In this example (and in any table) the relationship is the set of all ordered pairs.
(Question: what happens to this data if, for instance, Lleyton Hewitt is unable to play tennis?)

Another Relation
We could have this:

Name       Activity    Residence  Date of Death
Smith, J   Doctor      Clayton    22-09-1998
Ellis, T   Blacksmith  Colac      12-10-1976
Werija, K  Lecturer    Caulfield  ???
Brack, S   Premier     Ballarat   ???
This is a relation, or table, of degree 4.
Notice that each row has only 1 entry in each ‘column’ or attribute - this is called the ‘atomic value’.

Strictly Speaking
• A ‘set’ in mathematics has no duplicates.
• A relation is a set, so a relation shouldn’t have duplicates either.
• A relational database consists of tables.
• A table is not a relation, but the only difference is that a table may have duplicate rows (not a good idea).
• Duplicate rows should be avoided and the duplicates erased.
• All relational databases should consist of relations.
• Relations must have unique names.

A Table
• A Table is a named set of rows: an ordered row of one or more column names, together with zero or more unordered rows of data values.
• Tables store data about a specific entity - each row in a Table describes a single occurrence of that entity.
• The SQL Standard defines 3 types of tables - Base tables, Views, and Derived tables.

More on Tables
• Base tables are created and managed with the Create Table, Alter Table and Drop Table statements.
• Views are created and managed with the Create View and Drop View statements.
• Derived tables are created when a query is executed.
• Tables are dependent on a Schema or a Module.

More on Tables
Column:
• A column is a named component of a table.
• It is a set of similar data values which describe the same attribute of an entity.
• A column’s values all belong to the same data type or to the same Domain, and may vary over time.
• A Column value is the smallest unit of data which can be selected from, or updated in, a table.
• Columns are dependent on some table, and are created, altered, and dropped with column definitions in the Create Table and Alter Table statements.

A Primary Key
• McFadden, Hoffer and Prescott define a Primary Key as: An attribute (or combination of attributes) which uniquely identifies each row in a relation (table).
• Richard T. Watson has this to say: The primary key definition block specifies a set of column values comprising the primary key.
Once a Primary Key is defined, the system enforces its uniqueness by checking that the Primary Key of any new row does not already exist in the table.

A Primary Key - What’s That?
• A key - a unique identifier
• ‘A key is said to be nonredundant if every attribute it contains is necessary for the purpose of unique identification - if any attribute of the key were removed, the remaining attributes would not be a unique identifier’

And a Foreign Key?
• McFadden, Hoffer and Prescott’s definition: An attribute (or attributes) in a relation (table) of a database which serves as the Primary Key of another relation (table) in the same database.
• Richard T. Watson says: An attribute (or attributes) that is a Primary Key in the same table, or another table. It is the method of recording relations in a relational database.
• And, both the Primary and Foreign Key(s) should be drawn from the same Domain.

Other Keys
• Candidate Key(s) - a key (an attribute, or attributes) which could be chosen as the Primary Key
• Access Key - an attribute, or attributes, other than the Primary (or Foreign) key on which data will be retrieved from a table, e.g. postcode as in your second tutorial example

SQL - An Introduction
• With SQL, the user does not ‘open’ or ‘close’ tables.
• A user normally has a subset of tables to which access is allowed, and privileges are granted to allow the user to perform some specific functions.
• A query (an access to data in a table or tables) returns the whole result set all at once. All of the required rows are updated, inserted or deleted - or none of the rows are.
• The whole set involved in the ‘transaction’ works, OR the whole ‘transaction’ fails.

A Transaction
• A transaction is a sequence of SQL statements which Oracle treats as a single unit.
• The set of changes is made permanent with the Commit statement.
• Part or all of a transaction can be undone with the Rollback statement.
• A transaction starts with the execution of the first SQL statement in the transaction and ends with either the Commit or Rollback statement.

SQL - An Introduction
Transaction Control
• A transaction in SQL is either completely finished OR it is not done at all.
• No partial results can be produced.
• Work done can be committed (it becomes a permanent part of the database) or it can be rolled back (the database is restored to the state prior to the transaction commencing).
• SQL programmers need to be aware of the need for concurrency control - that is, the sharing of the database contents among transactions (more about this later).

A Transaction
• Oracle guarantees that a transaction has statement-level read consistency (the data stays the same while Oracle is gathering and returning it).
• If a transaction has multiple queries, then each query is individually consistent, but the queries are not necessarily consistent with each other.
• Transaction-level read consistency can be achieved with Set Transaction Read Only (queries only).

SQL - An Introduction
SQL has some very specific rules.
• One is that every table has a structure.
• Another rule is that insertion, updating and deletion of rows in a table can only occur if the rows involved have the same structure as the rest of the rows in the table.
• This reinforces the rule that a table is a set of rows of one particular type.

SQL - An Introduction
A table has no ordering - data is not in ‘ascending or descending’ order or ‘date’ order.
Columns are referenced by name only, not by their relative position in a table.
The columns of a table can be re-arranged, BUT the SQL statements referencing the table are not affected.

Properties of Relations
• Integrity Constraints included in the DBMS
  – Attribute value ranges
  – Referential Integrity
  – Entity Integrity - No part of any Primary key may be null
• Set retention constraints (how long to retain a set of data)
• Domain constraints
• User Defined Rules
• Recovery Procedures (after failure)

Properties of Relations
• No explicit linkage between tables - set up at run time
• Linking or Embedding database operations in a procedural language
• The Database may be distributed across similar or different DBMS’s

A Relational Database

Relation EMP
EMPNUM  NAME    Date of Birth  DEPTNUM
3       JONES   27/11/1967     650
7       ADAMS   14/10/1978     432
11      NGUYEN  9/05/1977      314
18      PHAN    30/06/1969     432

Relation Schema EMP (empnum, name, date_of_birth, deptnum)

Relation DEPT
DEPTNUM  DEPTNAME
650      PRODUCTION
432      INFOSYS
314      FINANCE

Relation Schema DEPT (deptnum, deptname)

More Terminology
The degree of a relation is the number of attributes in that relation.

Degree  Name
1       unary
2       binary
3       ternary
.
n       n-ary

The cardinality is the number of rows in the relation (table).

Primary Keys
A candidate key of a relation is a set of attributes that satisfies two time-independent properties:
• Uniqueness - No two rows of the relation have the same values for the set of attributes forming the candidate key.
• Minimality - No attributes can be discarded from the candidate key without destroying the uniqueness property.
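The two candidate-key properties can be checked mechanically against a sample of rows. What follows is a minimal Python sketch and is not part of the original lecture: the function names `is_unique` and `is_candidate_key` are hypothetical, and the rows are the EMP rows from the example relation. It only tests the rows it is given, whereas the definition requires both properties to hold at all times, so a positive answer here is evidence rather than proof.

```python
from itertools import combinations

# EMP rows from the lecture's example relation.
EMP = [
    {"empnum": 3, "name": "JONES", "dob": "27/11/1967", "deptnum": 650},
    {"empnum": 7, "name": "ADAMS", "dob": "14/10/1978", "deptnum": 432},
    {"empnum": 11, "name": "NGUYEN", "dob": "9/05/1977", "deptnum": 314},
    {"empnum": 18, "name": "PHAN", "dob": "30/06/1969", "deptnum": 432},
]


def is_unique(rows, attrs):
    """Uniqueness: no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, attrs):
    """Uniqueness plus minimality: no proper subset of attrs is itself unique."""
    attrs = tuple(attrs)
    if not is_unique(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_unique(rows, subset):
                return False  # at least one attribute could be discarded
    return True


print(is_candidate_key(EMP, ["empnum"]))          # True
print(is_candidate_key(EMP, ["deptnum"]))         # False: 432 appears twice
print(is_candidate_key(EMP, ["empnum", "name"]))  # False: unique but not minimal
```

On this small sample `name` alone would also pass the check, which is exactly why the definition insists on time independence: a second JONES could be hired tomorrow.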
Empnum  Surname  Given Name  Tax FileNo
E110    Parkes   John        100-100-232
E261    Kimball  John
E311    Hurwitz  Fred        101-111-222

Entity Integrity
• No component of the Primary Key of a base relation is allowed to accept nulls.

Surname  Given Name  Salary
Parkes   John        40,000
Kimball              50,000
Hurwitz  Fred        60,000
Ashton               70,000

What is the Primary Key?

Foreign Key
• A foreign key is an attribute or attribute combination of one relation R2 whose values are required to match those of the primary key of relation R1, where R1 and R2 are not necessarily distinct.
• The foreign key and the corresponding primary key should be defined on the same domain(s).

Employee (Worksfordept is the foreign key)
Empnum  Surname  Worksfordept
E110    Parkes   d1
E261    Kimball  d3
E311    Hurwitz  d2

Dept
Dept  Dname
d1    Pay
d2    Tax
d3    Art

Referential Integrity
If base relation R2 includes a foreign key FK matching the primary key PK of some base relation R1, then every value of FK in R2 must either
(a) be equal to the value of PK in some row of R1, or
(b) be wholly null.
Note that PK and FK may comprise more than one attribute and that R1 and R2 are not necessarily distinct.
(Stated more simply: a foreign key should match a valid primary key value, or the foreign key should be null.)

Recording Design Decisions
Formal design decisions can be recorded in the same graphical notation as an E-R diagram. This is called a data structure diagram and is developed from normalised relations using a few simple steps.

Recording Design Decisions
a) Treat each relation as an entity, represent it as a rectangle and enter its name.
b) Primary and Foreign keys are used to establish the relationships (Note: a foreign key can be part of a composite primary key). If the primary key in one relation exists as the foreign key in another relation, then draw a line linking these two entities.
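The entity integrity and referential integrity rules stated earlier in this lecture can be tried out on the Employee and Dept relations. The sketch below is an illustration under stated assumptions, not the lecture's own material: it uses SQLite through Python's standard `sqlite3` module instead of Oracle, the helper `try_insert` is hypothetical, and the employee rows E400 to E402 are invented for the test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE dept (
    dept  TEXT PRIMARY KEY NOT NULL,   -- entity integrity: key may not be null
    dname TEXT
);
CREATE TABLE employee (
    empnum       TEXT PRIMARY KEY NOT NULL,
    surname      TEXT,
    worksfordept TEXT REFERENCES dept(dept)   -- foreign key, may be null
);
INSERT INTO dept VALUES ('d1', 'Pay'), ('d2', 'Tax'), ('d3', 'Art');
INSERT INTO employee VALUES ('E110', 'Parkes', 'd1'),
                            ('E261', 'Kimball', 'd3'),
                            ('E311', 'Hurwitz', 'd2');
""")


def try_insert(row):
    """Return True if the DBMS accepts the employee row, False if it rejects it."""
    try:
        with conn:  # commits on success, rolls back if an error is raised
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(("E400", "Ashton", "d2")))  # True: d2 is a primary key value in dept
print(try_insert(("E401", "Brown", None)))   # True: a wholly null foreign key is allowed
print(try_insert(("E402", "Black", "d9")))   # False: no dept row has key d9
print(try_insert((None, "Red", "d1")))       # False: null primary key
```

The explicit NOT NULL on each primary key is there because SQLite, unlike most DBMS products, otherwise tolerates nulls in a non-integer primary key column.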
Some E-R Examples

DEPARTMENT (DeptNo, Dname)
EMPLOYEE (Empnum, Ename, Salary, DeptNo)
[Diagram: EMPLOYEE linked to DEPARTMENT]

STUDENT (StudentNo, Name)
UNIT (Unitcode, Title)
RESULT (StudentNo, Unitcode, Result)
[Diagram: STUDENT linked to RESULT, RESULT linked to UNIT]

Open to Interpretation
[Diagram: entities Student, Course, Unit, Text]
There are a number of ‘rules’ in this model, which determine the relationships. They are known as Business Rules.

The Rules?
• A student must be enrolled in 1 Course
• A Course may contain zero, or many students
• A student may be enrolled in many units, but at least 1
• A unit may attract many students (or no students?)
• Each Unit has one prescribed text
• Each text is associated with one unit

Open to Interpretation
[Diagram: entities Customer, Invoice, Line, Product]
• Each Customer may generate one or more Invoices
• Each Invoice is generated by one Customer
• Each Invoice contains one or more lines
• Each line is contained in an Invoice
• Each line references one product
• Each Product may be referenced in one or more lines

Modelling to Processing
So, how do we convert the conceptual design details into software which allows for the entry of data into the appropriate tables, and for further processing so that this data can be used to respond to queries?

Something Different - or, how do we make this happen?

An Introduction to SQL
Some Comments Regarding SQL
In the next few overheads, there will be some terms and explanations which should help you to make the transition from file-based data storage and file processing to the relational database style of storing and processing data.

An Introduction to SQL
Firstly, some plusses for SQL.
1. SQL is the one industry standard for querying databases
2. Other ‘tools’ such as front ends don’t allow the developer to use all of the features of a database
3. Tools provided invariably do not exploit the full functionality of the underlying language
4. An SQL query in a client-server environment can be run in any application language and the result will always be the same

Some SQL Basics
SQL acts as a bridge between
– the user
– the database management system (DBMS)
– the data tables
– the transactions which involve the previous 3 items
SQL also allows the ‘system’ to be administered and managed by a database administrator using the same format: procedural commands and data in tables. (.net ?)
SQL can be embedded into source code from C to Pascal.

Procedural and Non-Procedural Languages
• SQL requires a different approach from that used in other programming languages.
• C, Fortran, Basic, Cobol, Pascal and PL/1 are procedural languages. They are characterised by statements which tell the executing computer what to do, in a structured step-by-step way (even when loops are used).
• SQL is a declarative language - the computer is told what the user wants to achieve, and the computer ‘decides’ how to achieve this requirement correctly. The user sees the results.

SQL Sets
• SQL is a set-oriented language. Many programmers are used to file-oriented languages.
• A set is an unordered collection of items, all of which have the same type and structure.
• These sets become tables in SQL, and are made up of vertical attributes (or columns) and horizontal rows.

SQL - Data Manipulation

Data Retrieval (DML)
SELECT  retrieve data from table

Data Modification (DML)
INSERT  add a single row or copy rows from other table(s)
UPDATE  amend column values
DELETE  delete rows of data

Data Definition - DDL (Oracle)
Creating Tables

    create table emp
      (empno  number(6,0),
       name   varchar2(20),
       salary number(6,0),
       age    number(3,0),
       deptno number(5,0));

A table is defined. Space is reserved. The system catalogue is updated.
(The system catalogue is also known as the Data Dictionary.)

Table and Column Names
• begin with alpha (A-Z)
• less than or equal to 12 characters
• Table names contain (A-Z, 0-9)
• Column names contain (A-Z, 0-9,)

Data Definition (DDL)
Did you notice the entries such as
– Number(5,0)
– Varchar2(20)
– Number(6,0)
in the previous overhead?
These are ‘data types’ and further assist integrity by defining the actual data values which can exist for each attribute.
The size (or number of bytes) of each attribute is also expressed (either explicitly or implicitly).

Overview of SQL

Data Definition (DDL)
Create Table  define table and constraints
Create View   define user view of data
Alter Table   add new columns (Oracle)
Drop Table    delete table
Drop View     delete user view

Overview of SQL

Data Control
Commit    commit changes to the database
Rollback  rollback previous changes

Data Security
Grant   grant access privileges to users
Revoke  revoke access privileges

Relational DBMS Products

IBM Relational Products
DB2/nn  MVS/370, MVS/XA
SQL/DS  VM/CMS, DOS/VSE
QMF     front-end to DB2 and SQL/DS
CSP     application development tool

Numerous other RDBMS
• ORACLE 8, 8i, 9i
• OPENINGRES from ASK Corp. (OSL, ABF)
• AIM/RDB from Fujitsu
• INFORMIX - now in DB2
• VAXSQL/Rdb from DEC
• NonStop SQL from Tandem

Microcomputer versions
• SQL Server (as in MS 2000)
• Quadbase-SQL
• ORACLE
• INGRES
• dBASEV / Visual dBASE
• microSQL
• practically all micro DBMS

Other Oracle Products: Designer2000, Developer2000, Programmer2000, Discoverer2000

Can you explain this?
• 3 people agree to buy an item for $30 and hand over $10 each.
• The salesperson discounts the item by $5, and refunds $1 to each person. Each person has therefore paid $9. (5/3 does not give an even amount.)
• He keeps the remaining $2 as a token of good will.
• Mathematically, 3 x $9 = $27, plus the additional $2 = $29.
• The question is, where is the other $1 ???

And, what are your views on this?
These are quotations:
• Eye Drops Off Shelf
• Wild Cow Injures Farmer with Axe
• Cold Wave Linked to Temperatures

Relax - until the next session