* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download No Slide Title
Survey
Document related concepts
Transcript
CIS 507 Database Programming Database Concepts DBMS Models Database A DATABASE is an organized collection of related data McFadden-Hoffer – – a shared collection of logically related data designed to meet the information needs of multiple users in an organization Kroenke – a self-describing collection of integrated records self-describing: in addition to the user’s source data, contains a description of its own structure collection of integrated records: user data, metadata (data about data), indexes to represent relationships among data and improve performance, data about applications that use the database Rob-Coronel – a shared integrated computer structure that houses a collection of end-user data--i.e. raw facts of interest to the end-user metadata through which the data are integrated 2 File Management Systems The predecessor to the modern database management system Problems – Maintenance: Data created, managed, and accessed primarily through 3GL (COBOL, PL/I) – Data dependent: all components sensitive to changes made to data structure or storage and retrieval methods – Data Redundancy: uncontrolled duplication of data – Data Inconsistency Data Anomalies Sharing: inconsistent standards made it impossible to share data 3 DBMS A Database Management System (DBMS) is general purpose software and hardware facility to: – – – – – – Create, delete, reorganize, and manipulate data in a database Store, retrieve, share, and maintain data in a database Maintain relationships between the database components Provide security and procedures relating to privilege and access. The integrity of all the updates and transactions that are carried out. interface for the access, deletion and addition of data and for redefining the relationships within the database. A DBMS is a collection of programs that manages the database structure and controls access to the data stored in the database. 4 DBMS Disadvantages DBMSs are complex; Need for explicit backup and control; Costs associated with development and operation can be substantial; Consolidation of an entire business’ information resources can create a high level of vulnerability. 5 The Database System Environment Hardware: computer, storage, networks, devices Software: OS, DBMS, Applications, Untilities People: – – – – – System Administrator Database Administrator Database Designers Systems Analysts and Programmers End-Users Procedures Data 6 Database Systems Types Number of Users: – – Location: – – Single-user: usually desktop Multi-user: workgroup (small); enterprise (large) Centralized: all data stored in a database at a single site Distributed: database is distributed across several sites Type and Use: – – – Production (transactional): designed to support day-to-day use Decision Support: designed to make tactical and strategic decisions at middle- and high-management levels Decision Support Systems (data warehouse): use of historical data from many sources to make decisions such as pricing, sales forecasts, marketing positioning (e.g. structural estimates for insurance by underwriters) 7 DBMS Functions Data Dictionary Management Data Storage Management--Data Storage Definition Language (DSDL) Data Transformation and Presentation Database Control Language (DBCL) – – Backup and Recovery Management Data Integrity Management Data Access Languages – – Data Definition Language (DDL) Data Manipulation Language (DML) Application Program Interfaces – – Security Management Multi-User Access Control COBOL, C, PASCAL, Visual Basic Administrative Utilities Data Communication Interfaces – queries, reports, email through web browsers 8 Features of a good DBMS Open ended--can be extended Flexible--can be changed Efficient Easy to use Security should be built-in. Data independence 9 Models A database is a model of a user’s model of reality (Kroenke) Many different types of models involved in databases Reality Objects, Properties, Relationships Unique Identifier ANSI/SPARC Conceptual Model External Model Internal Model Physical Model 10 ANSI/SPARC Architecture American National Standards Institute/Standards Planning and Requirements Committee Conceptual Model (Database Administrator View) External Model (end-user views) Internal Model (DBA view) Physical Model (storage view) 11 Conceptual Model Global view of data Enterprise-wide view as seen by DBA Conceptual schema – – basic blueprint for the database design frequently represented with E-R diagrams Hardware and software independent 12 External Model Accessed by – – External Schema – – Application programmer End-user User’s authorized view of the data A subset of the Conceptual Schema or a logical view of the Conceptual Schema Hardware Independent; software dependent 13 Internal Model Implementation of Conceptual Schema – – – – – Hierarchical Model DBMS Network (CODASYL) Model DBMS Relational Model DBMS Object-Oriented Model DBMS Semantic Model DBMS Hardware independent; software dependent 14 Physical Model Description of how data is to be stored – – Definition of physical storage devices Definition of physical access methods Hardware and software dependent 15 Modeling Reality Common Conceptual Modeling Terms: – – – – – Internal Modeling Terms: unique to Internal (implementation) model – – – – Entity: a person, place, event, or thing for which data is to be collected Attributes: properties or characteristics of an entity which describe the entity in the context of interest Identifier: a means of distinguishing one entity from another Entity Class: a collection of all entities of the same type, i.e. entities that have exactly the same properties Relationship: an association among entities in the same or different classes Hierarchical Network (CODASYL) Relational OODBMS Semantic Physical Model: strategy for storage and access is unique to the internal model 16 Relationships One-to-Many – – – INSTRUCTOR Instructor may have many Students Student may have many Instructors One-to-One – – ADVISEE Advisor may have many Advisees Advisee has but one Advisor (our choice) Many-to-Many – ADVISOR FACULTY Faculty is assigned to one office An office is assigned to one faculty STUDENT OFFICE 17 Data Integrity Constraints Measures taken to ensure data is accurate Business Constraints: rules that must be satisfied for the business – Entity Integrity Constraint: there is an attribute of the entity that is used to uniquely identify that entity – example: student id or ss# Static Domain Constraint: a value for a property can only be one of the items in a predefined list – example: managers vacation days shall not exceed 20 example: faculty may only be instructor, assistant professor, associate professor, professor Referential Integrity Constraint: in a one-to-many association, an entity on the many side must be associated (reference) an entity on the one side – example: in an advising relationship, a student’s advisor must be a faculty member 18 Data Independence Physical Data Independence: – Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representation or access methods. Logical Data Independence: – Application programs and terminal activities remain logically unimpaired when information preserving changes of any kind are made to the conceptual design 19 Hierarchical Model Entity Segment Entity Class Segment type Attributes: Fields Identifier Value-bearing field or disk address reference Relationships – Internal Model--1-1 and 1-many in parent- child relationship; some support for two parents for same child – Physical Model--uses child/twin pointer strategy for 1-many Data Access 3GL products--COBOL, PL/I, C, Pascal Commercial Products IMS (DL/I) and Focus 20 DBMS Models - Hierarchical Conceptual Model STUDENT COURSE STUDENT -CLASS Initial Data Collected student id 1000 1005 1000 1010 1005 1010 1030 name doe deer doe ray deer ray jay class cis120 cis121 cis121 cis501 ma120 ma120 his101 description prog1 prog2 prog2 aprog1 cal0 cal0 his1 grade c a d a b c b 21 DBMS Models - Hierarchical Conceptual Model STUDENT COURSE Modified Conceptual Model Parent-Child Associations STUDENT Physical STUDENT -CLASS STUDENTCLASS (Physical) COURSE Logical STUDENTCLASS (Logical) 22 DBMS Models - Hierarchical Modified Conceptual Model Parent-Child Associations STUDENT Physical STUDENTCLASS (Physical) COURSE Student Data Twin 1st Child Logical STUDENTCLASS (Logical) Student-Class Data Twin Course 23 DBMS Models - Hierarchical Modified Conceptual Model Parent-Child Associations STUDENT Physical STUDENTCLASS (Physical) COURSE Course Data Twin 1st Child Logical STUDENTCLASS (Logical) Ref to Physical Twin 24 DBMS Models - Hierarchical Modified Conceptual Model Parent-Child Associations STUDENT Student Data Twin 1st Child Student Data Twin 1st Child COURSE Student Data Physical STUDENTCLASS (Physical) Logical STUDENTCLASS (Logical) Student-Class Data Twin Course Student-Class Data Twin Course Student-Class Data Twin Course Student-Class Data Twin Course Student-Class Data Twin Course Student-Class Data Twin Course Twin 1st Child Student-Class Data Twin Course Student-Class Data Twin Course Student-Class Data Twin Course 25 DBMS Models - Hierarchical Modified Conceptual Model Parent-Child Associations STUDENT Course Data Twin 1st Child Course Data Twin 1st Child COURSE Course Data Physical STUDENTCLASS (Physical) Logical STUDENTCLASS (Logical) Ref to Physical Twin Ref to Physical Twin Ref to Physical Twin Ref to Physical Twin Ref to Physical Twin Ref to Physical Twin Twin Ref to Physical Twin Ref to Physical Twin Ref to Physical Twin 1st Child 26 Hierarchical Model Advantages – – – – – – Common database makes sharing practical Security is provided and enforced Supports some data independence Referential integrity maintained through parent-child relationship Very efficient for data models that are hierarchical (oneto-many) Many hierarchical type applications are on mainframes 27 Hierarchical Model Disadvantages – – – – – – – – Knowledge of physical level required Does not support logical data independence and does not support all physical data independence operations Not all problems are one-to-many types Problems with multiple parent implementation Problems with anomalies for parent deletion Application development in 3GL time-consuming Support programs are not part of the DBMS “System created by programmers for programmers!” 28 Network (CODASYL) Model Entity Record Entity Class Record type Attributes: Data items Identifier Value-bearing field or disk address reference Relationships – Internal Model--1-1 and 1-many in owner-member set relationship; some provide elementary many-to-many relationships – Physical Model--same type records: doubly-linked, ringed structure owners: additional references to first & last associated member in each set members: additional references to associated owner in each set Data Access 3GL products--COBOL, PL/I, C, Pascal Commercial Products DBMS-10, IDMS (Cullinet), IDS (Honeywell), TOTAL, IMAGE, MDBS-III 29 DBMS Models – Network (CODASYL) Conceptual Model STUDENT COURSE STUDENT -CLASS Initial Data Collected student id 1000 1005 1000 1010 1005 1010 1030 name doe deer doe ray deer ray jay class cis120 cis121 cis121 cis501 ma120 ma120 his101 description prog1 prog2 prog2 aprog1 cal0 cal0 his1 grade c a d a b c b 30 DBMS Models – Network (CODASYL) Conceptual Model STUDENT Conceptual Model Set Associations: Owner-Member COURSE STUDENT STUDENT -CLASS COURSE STUDENT -CLASS 31 DBMS Models – Network (CODASYL) Conceptual Model Set1: Owner (Student)Member (Student-Class) STUDENT STUDENT-CLASS Student Data Next Prev Student-Class Data Next 1st Mem Prev Last Mem Assoc Owner 32 DBMS Models – Network (CODASYL) Conceptual Model Set2:Owner (Course)Member (Student-Class) Associations COURSE STUDENT-CLASS Course Data Next Prev Student-Class Data Next 1st Mem Prev Last Mem Assoc Owner 33 DBMS Models – Network (CODASYL) Student Data Next Prev Student Data STUDENT First Mem Next Prev Last Mem First Mem Last Mem COURSE Student Data Student-Class Data Next Student-Class Data Next Prev Next Prev First Mem Last Mem Assoc Owner Prev Assoc Owner STUDENT -CLASS Student-Class Data Next Prev Assoc Owner 34 Network Model Advantages – – – – Can be used to directly implement one-to-one, one-tomany, and (some DBMS models) many-to-many relationships Access, navigation is superior to hierarchical model Enforces referential integrity through owner-member relationship Achieves some physical data independence 35 Network Model Disadvantages – – – – Difficult to design and use Does not support logical data independence Very complex--not for the novice Navigation is achieved at the record level 36 Relational Model Entity Row (Tuple) Entity Class Table (Relation) Attributes: Column (?dimension?) Identifier Value-bearing field or generated value Relationships – Internal Model--1-1 and 1-many relationships – Physical Model--uses foreign key to link parent to child Data Access – – 4GL-- SQL 3GL products--COBOL, PL/I, C, Pascal Commercial Products ACCESS, ORACLE, DB2, SQL/DS, RBASE 500, INGRES, SYBASE 37 Relational Database Structure Unit Table Student Table MIS1100 Information Systems ... 0970000 Joe Bloe ... MIS1150 Business Statistics ... 0970010 Julie King ... 0970012 John Smith ... 0970015 Anne Oether ... 0970035 John Smith ... Unit_Student Relationship Table MIS1100 0970000 ... MIS1100 0970010 ... MIS1100 0970015 ... MIS1150 0970000 ... MIS1150 0970012 ... MIS1150 0970035 ... Data relating to the relationship is stored in the relational table. Recording the semester of enrolment, marks, and the grade for each student along with the relationship places logically related data in one location. 38 Relational Model Advantages User can focus on only the logical view Powerful query capabilities from 4GL—SQL Ad hoc query capability Aggregate processing as opposed to record at a time Standardization of language Creation, management, and data manipulation language Easier to make changes to the logical design without affecting applications (Logical Data Independence) 39 Relational Model Disadvantages More powerful computers are needed because so much is done for the user Ease of use creates a false sense of security in the area of design 40