Introduction Data Management
Christoph F. Eick

Today
  1. Introduction to Databases
  2. Questionnaire
  3. Course Information
  4. Grading and Other Things

Spring 2003 Schedule COSC 6340
  Exams:
    Undergraduate Material Review Exam: Th., Feb. 13 (in class)
    Midterm Exam: Tu., March 25 (in class)
    Final Exam: Tu., May 6, 11 a.m.
    Qualifying Exam Part2: Fr., May 9, 10:30-noon
  Projects and Graded Homework:
    Project1 (Feb. 15-March 15)
    Project2 (March 30-April 20)
    Homework1 (deadline: Feb. 27; March 11)
    Homework2 (deadline: April 17)
  Last day of lecture: Th., April 24, 2003
  Spring Break: March 4+6

Elements of COSC 6340
  I: Basic Database Management Concepts: review of basic database concepts, techniques, and languages (4 weeks; Chapters 1-5, 7-11, and 18 of the textbook)
  II: Implementation of Relational Operators and Query Optimization (1.5 weeks; Chapters 12+13)
  III: Relational Database Design (1.5 weeks; Chapters 15+16)
  IV: Introduction to KDD and Making Sense of Data (3 weeks; Chapters 1, 2, 6, and 7 of the Han/Kamber book, centering on data warehouses, OLAP, and data mining)
  V: Object-oriented Databases, PL/SQL, Object-relational Database Systems, and SQL3 (1.5 weeks; other material)
  VI: Internet Databases and XML (1 week; Chapter 22 of the textbook and other teaching material)

Textbooks for COSC 6340
  Required Text: Raghu Ramakrishnan and Johannes Gehrke, Database Management Systems, McGraw Hill, Third Edition, 2002 (complication: the chapter numbers in the new edition are different!)
  Recommended: Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2001, ISBN 1-55860-489-8 (4 chapters will be covered)
  Other books with relevant material: Ramez Elmasri and Shamkant Navathe, Fundamentals of Database Systems, Third Edition, Addison Wesley, ISBN 0-8053-1755-4

Schedule for Part1 of COSC 6340
  Jan. 14: Introduction to COSC 6340
  Fast Review of Undergraduate Material (Jan. 16-Feb. 13)
  Jan. 16: Entity-Relationship Data Model
  Jan. 21: Entity-Relationship Data Model
  Jan. 23: Relational Data Model
  Jan. 28: Mapping E/R to Relations
  Jan. 30: Files, B+-trees, and hashing (Chapters 8, 9, 10)
  Feb. 4: Files, B+-trees, and hashing (Chapters 8, 9, 10)
  Feb. 6: Relational Algebra and SQL (very brief!)
  Feb. 11: Transaction Management (Chapter 18)
  Feb. 13: Exam0 (Undergraduate Review Exam)

Why are integrated databases popular?
  Avoidance of uncontrolled redundancy
  Making knowledge accessible that would otherwise not be accessible
  Standardization: uniform representation of data, facilitating import and export
  Reduction of software development effort (through the availability of data management systems)
  [Figure labels: Bookkeeping Device, Integrated Database, Car Salesman]

Popular Topics in Databases
  Efficient algorithms for data collections that reside on disks (or that are distributed over multiple disk drives, multiple computers, or the internet)
  Study of data models (knowledge representation, mappings, theoretical properties)
  Algorithms to run a large number of transactions on a database in parallel; finding efficient implementations for queries that access large databases; database backup and recovery, ...
  Database design
  How to use database management systems as an application programmer / end user
  How to use database management systems as a database administrator
  How to implement database management systems
  Data summarization, knowledge discovery, and data mining
  Special-purpose databases (genomic, geographical, internet, ...)

Data Model
  A Data Model is used to define a Schema.
  A Schema defines a set of database states.
  Current Database State

Schema for the Library Example using the E/R Data Model
  [E/R diagram: entity Person (ssn, name, phone); entity Book (B#, title, author); relationship Check_out (when); cardinality labels (0,35) and (0,1); legend: 1-to-1, 1-to-Many, Many-to-1, Many-to-Many]

Relational Schema for Library Example in SQL/92
  CREATE TABLE Person
    (ssn CHAR(9),
     name CHAR(30),
     phone INTEGER,
     PRIMARY KEY (ssn));
  CREATE TABLE Book
    (B# INTEGER,
     title CHAR(30),
     author CHAR(20),
     PRIMARY KEY (B#));
  CREATE TABLE Checkout
    (book INTEGER,
     person CHAR(9),
     since DATE,
     PRIMARY KEY (book),
     FOREIGN KEY (book) REFERENCES Book,
     FOREIGN KEY (person) REFERENCES Person);

Referential Integrity in SQL/92
  SQL/92 supports all 4 options on deletes and updates:
    Default is NO ACTION (delete/update is rejected)
    CASCADE (also delete all tuples that refer to the deleted tuple)
    SET NULL / SET DEFAULT (sets the foreign key value of the referencing tuple)
  CREATE TABLE Enrolled
    (sid CHAR(20),
     cid CHAR(20),
     grade CHAR(2),
     PRIMARY KEY (sid, cid),
     FOREIGN KEY (sid) REFERENCES Students
       ON DELETE CASCADE
       ON UPDATE SET DEFAULT)

Example of an Internal Schema for the Library Example
  INTERNAL Schema Library12 references Library.
  Book is stored sequentially, index on B# using hashing, index on Author using hashing.
  Person is stored using hashing on ssn.
  Check_out is stored sequentially, index on since using B+-tree.

Modern Relational DBMS
  [Diagram: "Modern DBMS" at the center, surrounded by the following capabilities]
  Transaction concepts; capability of running many transactions in parallel; support for backup and recovery
  Support for Web interfaces, XML, and data exchange
  Support for OO; capability to store operations
  Efficient implementation of queries (query optimization; join, selection, and indexing techniques)
  Support for special data types: long fields, images, HTML links, DNA sequences, spatial information, ...
  Support for data-driven computing
  Support for data mining operations
  Support for OLAP and data warehousing
  Support for higher-level user interfaces: graphical, natural language, form-based, ...
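
The library schema and the ON DELETE options from the referential-integrity slide can be tried out with a short script. What follows is only a minimal sketch, not part of the course material. It assumes Python's standard sqlite3 module as a stand-in for a SQL/92 system, renames the B# column to b_num (a hypothetical name, since # is awkward in identifiers), keys Checkout on its book column, and uses invented sample rows. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

ACTIONS = {"NO ACTION", "CASCADE", "SET NULL", "SET DEFAULT"}


def build_library(on_delete: str) -> sqlite3.Connection:
    """Create the library schema in an in-memory SQLite database.

    on_delete is the referential action attached to Checkout.person.
    """
    if on_delete not in ACTIONS:
        raise ValueError(f"unsupported referential action: {on_delete}")
    conn = sqlite3.connect(":memory:")
    # SQLite ignores foreign keys unless this is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE Person (ssn CHAR(9), name CHAR(30), phone INTEGER,
                             PRIMARY KEY (ssn));
        -- b_num stands in for the slide's B# column (assumed rename).
        CREATE TABLE Book (b_num INTEGER, title CHAR(30), author CHAR(20),
                           PRIMARY KEY (b_num));
        CREATE TABLE Checkout (book INTEGER, person CHAR(9), since DATE,
                               PRIMARY KEY (book),
                               FOREIGN KEY (book) REFERENCES Book,
                               FOREIGN KEY (person) REFERENCES Person
                                   ON DELETE {on_delete});
        -- Invented sample rows, for illustration only.
        INSERT INTO Person VALUES ('123456789', 'Ann', 5551234);
        INSERT INTO Book VALUES (1, 'Sample Title', 'Sample Author');
        INSERT INTO Checkout VALUES (1, '123456789', '2003-02-01');
    """)
    return conn


def delete_person(conn: sqlite3.Connection, ssn: str) -> bool:
    """Try to delete a person; return False if the DBMS rejects the delete."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM Person WHERE ssn = ?", (ssn,))
        return True
    except sqlite3.IntegrityError:
        return False


def checkout_count(conn: sqlite3.Connection) -> int:
    """Number of rows currently in Checkout."""
    return conn.execute("SELECT COUNT(*) FROM Checkout").fetchone()[0]


if __name__ == "__main__":
    for action in ("NO ACTION", "CASCADE", "SET NULL"):
        conn = build_library(action)
        deleted = delete_person(conn, "123456789")
        print(action, "deleted:", deleted, "checkouts left:", checkout_count(conn))
```

With NO ACTION the delete is rejected because a Checkout row still refers to the person. With CASCADE the referring Checkout row is removed along with the person. With SET NULL the Checkout row survives, with its person column set to NULL.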