ITEC 3220A Using and Designing Database Systems
Instructor: Prof. Z. Yang
Course Website: http://people.math.yorku.ca/~zyang/itec3220a.htm
Office: TEL 3049

Course Objective
• Examine databases, trends in database management systems and their application in a wide range of organizational areas
• Provide an overview of database processing, covering both its history and recent trends in database management
• Provide the student with exposure to a range of tools, including a relational DBMS as well as an object-oriented DBMS

Textbook
• Database Systems: Design, Implementation, and Management, 12th Edition, Carlos Coronel & Steven Morris

Marking Scheme
• Final exam (closed book) - 50%
• Midterm (closed book) - 35%
• Assignments (2 assignments) - 15%
• Lecture notes will be made available at: http://people.yorku.ca/~zyang/itec3220a.htm

Schedule
• Week 1 Database concepts and the relational database model
• Week 2 Entity relationship model
• Week 3 Normalization
• Week 4 SQL
• Week 5 SQL + lab
• Week 6 Advanced SQL + lab
• Week 7 Midterm
• Week 8 Database design & case study
• Week 9 Transaction management and concurrency control
• Week 10 Transaction management and concurrency control (Cont’d); Data warehousing
• Week 11 Object-Oriented database
• Week 12 Review

Introduction
Database Systems and Data Models

Basic Definition
• Data: raw facts
  – Constitute building blocks of information
• Information: produced by processing data; reveals the meaning of data
  – Requires context
  – Should be accurate, relevant, and timely to enable good decision making
• Database: shared, integrated computer structure housing:
  – End-user data: raw facts of interest to the end user
  – Metadata: data about data, through which the end-user data are integrated and managed

An Example
• Converting data to information

An Example (Cont’d)
• Metadata

What is a Database Management System (DBMS)
• A collection of programs that manages the database structure and controls access
to the data stored in the database
• Role of the DBMS
  – Intermediary between the user and the database
  – Enables data to be shared
  – Presents the end user with an integrated view of the data
  – Receives and translates application requests into operations required to fulfill the requests
  – Hides database’s internal complexity from the application programs and users

DBMS Manages Interaction

Types of Databases
• Centralized database, distributed database and cloud database
• Single-user database, multiuser database
• Analytical database, business intelligence
  – Data warehouse
  – Online analytical processing

File and File System
• Terminology
  – Data: raw facts
  – Field: group of characters with specific meaning
  – Record: logically connected fields that describe a person, place, or thing
  – File: collection of related records

Example

Problems with File System Data Processing
• Lengthy development times
• Difficulty of getting quick answers
• Complex system administration
• Lack of security and limited data sharing
• Extensive programming

Structural and Data Dependence
• Structural dependence: access to a file is dependent on its own structure
  – All file system programs must be modified to conform to a new file structure
• Structural independence: file structure can be changed without affecting the application’s ability to access the data
• Data dependence
  – Data access changes when data storage characteristics change
• Data independence
  – Data storage characteristics can be changed without affecting the program’s ability to access the data
• Practical significance of data dependence is the difference between the logical and physical data format

Data Redundancy
• Unnecessarily storing the same data at different places
• Islands of information: scattered data locations
  – Increases the probability of having different versions of the same data
• Data anomalies
  – Modification
  – Insertion
  – Deletion

Example

Database Systems
• Logically related data
stored in a single logical data repository
  – Physically distributed among multiple storage facilities
  – DBMS eliminates most of file system’s problems
• Current generation DBMS software:
  – Stores data structures, relationships between structures, and access paths
  – Defines, stores, and manages all access paths and components

Database vs. File Systems

Database Models
• Collection of logical constructs used to represent data structure and relationships within the database
  – Conceptual models: logical nature of data representation
  – Implementation models: emphasis on how the data are represented in the database

Database Models: Historic Overview

Hierarchical and Network Models
Hierarchical Models
• Developed to manage large amounts of data for complex manufacturing projects
• Represented by an upside-down tree which contains segments (equivalent of a file system’s record type)
• Depicts a set of one-to-many (1:M) relationships
Network Models
• Created to represent complex data relationships effectively
• Improved database performance and imposed a database standard
• Allows a record to have more than one parent
• Depicts both one-to-many (1:M) and many-to-many (M:N) relationships

The Relational Model
• Produced an “automatic transmission” database that replaced “standard transmission” databases
• Based on a relation
  – Relation or table: matrix composed of intersecting tuples and attributes
    • Tuple: row
    • Attribute: column
• Describes a precise set of data manipulation constructs

Relational Database Management System (RDBMS)
• Performs basic functions provided by the hierarchical and network DBMS systems
• Makes the relational data model easier to understand and implement
• Hides the complexities of the relational model from the user

Relational Database Model (Cont’d)
• Schema for the table
  – Graphical representation
      AGENT
      AGENT_CODE | AGENT_LNAME | AGENT_FNAME | AGENT_INITIAL | AGENT_AREACODE | AGENT_PHONE
  – Text
description AGENT(AGENT_CODE, AGENT_LNAME, AGENT_FNAME, AGENT_INITIAL, AGENT_AREACODE, AGENT_PHONE)

The Object-Oriented Data Model (OODM) or Semantic Data Model
• Object-oriented database management system (OODBMS)
  – Based on OODM
• Object: contains data and their relationships with operations that are performed on it
  – Basic building block for autonomous structures
  – Abstraction of real-world entity
• Attributes: describe the properties of an object

NoSQL Databases
• Not based on the relational model
• Support distributed database architectures
• Provide high scalability, high availability, and fault tolerance
• Support large amounts of sparse data
• Geared toward performance rather than transaction consistency
• Store data in key-value stores

Hierarchical Model
Advantages
• Promotes data sharing
• Parent/child relationship promotes conceptual simplicity and data integrity
• Database security is provided and enforced by DBMS
• Efficient with 1:M relationships
Disadvantages
• Requires knowledge of physical data storage characteristics
• Navigational system requires knowledge of hierarchical path
• Changes in structure require changes in all application programs
• Implementation limitations
• No data definition
• Lack of standards

Network Model
Advantages
• Conceptual simplicity
• Handles more relationship types
• Data access is flexible
• Data owner/member relationship promotes data integrity
• Conformance to standards
• Includes data definition language (DDL) and data manipulation language (DML)
Disadvantages
• System complexity limits efficiency
• Navigational system yields complex implementation, application development, and management
• Structural changes require changes in all application programs

Relational Model
Advantages
• Structural independence is promoted using independent tables
• Tabular view improves conceptual simplicity
• Ad hoc query capability is based on SQL
• Isolates the end user from physical-level details
• Improves implementation and
management simplicity
Disadvantages
• Requires substantial hardware and system software overhead
• Conceptual simplicity gives untrained people the tools to use a good system poorly
• May promote information problems

Object-Oriented Model
Advantages
• Semantic content is added
• Visual representation includes semantic content
• Inheritance promotes data integrity
Disadvantages
• Slow development of standards caused vendors to supply their own enhancements
  – Compromised widely accepted standard
• Complex navigational system
• Learning curve is steep
• High system overhead slows transactions

NoSQL
Advantages
• High scalability, availability, and fault tolerance are provided
• Uses low-cost commodity hardware
• Supports Big Data
• Key-value model improves storage efficiency
Disadvantages
• Complex programming is required
• There is no relationship support
• There is no transaction integrity support
• In terms of data consistency, it provides an eventually consistent model

Chapter 3
The Relational Database Model

Basic Definition
• Entities and Attributes
  – Entity is a person, place, event, or thing about which data is collected
  – Attributes are characteristics of the entity
• Tables
  – Hold related entities or an entity set
  – Also called relations
  – Comprised of rows and columns

Table Characteristics
• Two-dimensional structure with rows and columns
• Rows (tuples) represent single entity
• Columns represent attributes
• Row/column intersection represents single value
• Tables must have an attribute to uniquely identify each row
• Column values all have same data format
• Each column has range of values called attribute domain
• Order of the rows and columns is immaterial to the DBMS

Example Tables

Terminology for Relational Database
Table-Oriented   Set-Oriented   Record-Oriented
Table            Relation       Record type
Row              Tuple          Record
Column           Attribute      Field

Key
• Consists of one or more attributes that determine other attributes
• Primary key (PK) is an attribute
(or a combination of attributes) that uniquely identifies any given entity (row)
• Key’s role is based on determination
  – If you know the value of attribute A, you can look up (determine) the value of attribute B

Keys (Cont’d)
• Composite key
  – Composed of more than one attribute
• Key attribute
  – Any attribute that is part of a key
• Superkey
  – Any key that uniquely identifies each entity
• Candidate key
  – A superkey without redundancies

Keys (Cont’d)
• Foreign key (FK)
  – An attribute whose values match primary key values in the related table
• Referential integrity
  – FK contains a value that refers to an existing valid tuple (row) in another relation
• Secondary key
  – Key used strictly for data retrieval purposes

Simple Relational Database

Controlled Redundancy
• Makes the relational database work
• Tables within the database share common attributes that enable us to link tables together
• Multiple occurrences of values in a table are not redundant when they are required to make the relationship work
• Redundancy is unnecessary duplication of data

Integrity Rules

Exercises
Table name: TRUCK
Table name: BASE
Table name: TYPE

Exercises (Cont’d)
• For each table, identify the primary key and the foreign keys.
• Do the tables exhibit entity integrity? Explain.
• Do the tables exhibit referential integrity? Explain.
• Identify the TRUCK table’s candidate key(s).
• For each table, identify a superkey and a secondary key.
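
A Small Sketch: Keys and Referential Integrity
The primary key, foreign key, and referential integrity definitions above can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch under stated assumptions: the AGENT table reuses three attribute names from the schema shown earlier, while the CUSTOMER table, its columns, and every row of data are hypothetical and are not taken from the slides or the exercise tables.

```python
import sqlite3

# In-memory database; all rows below are made-up illustration data.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute("""
    CREATE TABLE AGENT (
        AGENT_CODE  INTEGER PRIMARY KEY,   -- primary key: identifies each row
        AGENT_LNAME TEXT NOT NULL,
        AGENT_PHONE TEXT
    )
""")
conn.execute("""
    CREATE TABLE CUSTOMER (                -- hypothetical table, not from the slides
        CUS_CODE   INTEGER PRIMARY KEY,
        CUS_LNAME  TEXT NOT NULL,
        AGENT_CODE INTEGER REFERENCES AGENT (AGENT_CODE)  -- foreign key
    )
""")

conn.execute("INSERT INTO AGENT VALUES (501, 'Agent-A', '555-0101')")
conn.execute("INSERT INTO CUSTOMER VALUES (10010, 'Customer-X', 501)")   # FK matches a PK value
conn.execute("INSERT INTO CUSTOMER VALUES (10011, 'Customer-Y', NULL)")  # a null FK is permitted


def try_insert(sql):
    """Return True if the statement succeeds, False if an integrity rule rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Entity integrity: a duplicate primary key value is rejected.
duplicate_pk_ok = try_insert("INSERT INTO AGENT VALUES (501, 'Agent-B', '555-0102')")
# Referential integrity: an FK value with no matching AGENT row is rejected.
dangling_fk_ok = try_insert("INSERT INTO CUSTOMER VALUES (10012, 'Customer-Z', 999)")

# Controlled redundancy: the shared attribute AGENT_CODE links the two tables.
rows = conn.execute("""
    SELECT CUS_LNAME, AGENT_LNAME
    FROM CUSTOMER JOIN AGENT USING (AGENT_CODE)
""").fetchall()

print(duplicate_pk_ok, dangling_fk_ok, rows)
```

Both rejected inserts come back as False, and the join returns only the customer whose AGENT_CODE matches an existing agent. SQLite is used here purely because it ships with Python; the course's own DBMS may report integrity violations differently.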