* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Slide 1
Serializability wikipedia , lookup
Oracle Database wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Concurrency control wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
ContactPoint wikipedia , lookup
Clusterpoint wikipedia , lookup
Databases and Statistical Databases Session 4 Mark Viney Australian Bureau of Statistics 5 June 2007 Terms Database ƒ A shared collection of logically related data (and description of this data), designed to meet the information needs of an organisation DataBase Management System (DBMS) ƒ A software system that enables users to define, create and maintain the database and provides controlled access to this database Terms (example) Database Personnel Database ƒ Stock database ƒ Statistical Database ƒ DataBase Management System (DBMS) Oracle ƒ DB2 ƒ Access ƒ MySql ƒ FoxPro ƒ Firebird ƒ Why keep information in databases? Accessibility of data Increased concurrency (reads and writes) ƒ Sharing data ƒ Improved data integrity Improved security ƒ access only to necessary data Relatable ƒ More information from same amount of data Visible Why keep information in databases? (continued) backup and recovery Improved productivity ƒ common tools / common processes Disadvantages of databases Complexity Size Cost of DBMS Need to upgrade versions Additional hardware costs Higher impact of failure www.cableready.net/newsletter/winter99.html Databases used to be solely mainframe commonly on minicomputers increasingly available on microcomputers mostly accessed by SQL Relational Databases entities datatypes ƒ validation ƒ relationships ƒ rules for interaction Database Tables rows and columns fixed number of columns multiple rows (records) columns are of same datatype Structured Query Language - SQL Standard database language that allows:Database creation and relation structures ƒ Basic data management tasks ƒ Both simple and complex queries ƒ SQL - Data Definition - DDL allows creation, modification and deletion of database objects Creation - CREATE CREATE TABLE TAB1 (COL1 NUMBER, COL2 NUMBER); ƒ Modification - ALTER ALTER TABLE TAB1 ADD COL3 NUMBER; ƒ Deletion - DROP DROP TABLE TAB1; ƒ Structured Query Language - SQL Data Manipulation - DML Standard language to allow access the data stored in databases Extraction - SELECT SELECT COL1,COL2 from TAB1; ƒ Loading - INSERT INSERT INTO TAB1 (COL1,COL2) VALUES(7,22); ƒ Manipulation - UPDATE UPDATE TAB1 SET COL2 = COL1 + 2; ƒ Deletion - DELETE DELETE FROM TAB1 WHERE COL2 = 4; ƒ Database Modeling representation of "real world" conceptual model logical model physical model Keys Primary Keys ƒ uniquely identifies a record Foreign Keys ƒ pointer to a Primary Key in another table Indexes May be applied to columns to allow fast data access May be applied to single columns or several columns Direct pointers to rows containing specific values in the indexed column(s) may be unique or non-unique May have more than one index per table Normalisation A technique for producing a set of relations with desirable properties, given the data requirements of an enterprise Normalisation - unnormalised A representation of the data that contains repeating groups Normalization - unnormalised form Normalisation - 1st normal form A relation in which the intersection of each row and column contains one and only one value 1NF Normalization - 1st normal form 1NF Normalisation - 2nd normal form A relation that is in first normal form ƒ every non-primary key attribute is fully functionally dependent on the primary key ƒ 2NF Normalization - 2nd normal form 2NF Normalisation - 3rd normal form A relation that is in first and second normal form and ƒ in which no non-primary key attribute is transitively dependent on the primary key ƒ 3NF Normalization - 3rd normal form 3NF Loading data into databases Bulk loading tool Data Integrity Validation ad-hoc loading Data Extraction Assemble data into usable format Spreadsheet Timeseries Data Cube Publication Data manipulation Inside database ƒ Sophisticated manipulation language - SQL Outside database Timeseries Seasonal Adjustment Chain Volume Measures (Constant Price) ƒ SAS, SPSS ƒ Transactional Integrity the ability to apply rules to the data via database constraints ability to group several discrete data insertion or data manipulation into one logical data change In SQL, controlled via COMMIT and ROLLBACK statements Transactional Integrity database constraints ƒ values must conform to specific rules exist in a specific column belong to a "set" uniqueness If a validation against a constraint fails ƒ the current transaction fails Transactions & Recovery Each transaction is logged by the DBMS Backups taken periodically Data can be recovered to an archived backup ƒ to a point in time ƒ Transaction example COMMIT; transaction 1 INSERT INTO TABLE1 (COL1,COL2) VALUES(7,22); UPDATE TABLE1 SET COL1 = 77 WHERE COL2 = 22; DELETE FROM TABLE1 WHERE COL1 = 7; ROLLBACK; transaction 2 INSERT INTO TABLE2 (COL3,COL4) VALUES('ABC',11); UPDATE TABLE2 SET COL3 = 'XYZ'; DELETE FROM TABLE2 WHERE COL3 = 'xyz'; COMMIT; References Database Systems a Practical Approach to Design, Implementation and Management Thomas Connolly, Carolyn Begg, Anne Strachan (Addison-Wesley) 1999 cartoons - Randy Glassbergen Questions?