* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Models - Courses
Oracle Database wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Concurrency control wikipedia , lookup
ContactPoint wikipedia , lookup
Clusterpoint wikipedia , lookup
Lecture 11: Intro to Database Design SIMS 202: Information Organization and Retrieval Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2004 http://www.sims.berkeley.edu/academics/courses/is202/f04/ IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 1 Lecture Overview • Review – Evaluation exercise • • • • • Databases and Database Design Database Life Cycle ER Diagrams Discussion Next Time/Readings IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 2 Lecture Overview • Review – Evaluation exercise • • • • • Databases and Database Design Database Life Cycle ER Diagrams Discussion Next Time/Readings IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 3 What is a Database? IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 4 Files and Databases • File: A collection of records or documents dealing with one organization, person, area or subject (Rowley) – Manual (paper) files – Computer files • Database: A collection of similar records with relationships between the records (Rowley) – Bibliographic, statistical, business data, images, etc. IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 5 Database • A Database is a collection of stored operational data used by the application systems of some particular enterprise (C.J. Date) – Paper “Databases” • Still contain a large portion of the world’s knowledge – File-Based Data Processing Systems • Early batch processing of (primarily) business data – Database Management Systems (DBMS) IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 6 Why DBMS? • History – 50’s and 60’s all applications were custom built for particular needs – File based – Many similar/duplicative applications dealing with collections of business data – Early DBMS were extensions of programming languages – 1970 - E.F. Codd and the Relational Model – 1979 - Ashton-Tate and first Microcomputer DBMS IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 7 File Based Systems Application Delivery List Coal Estimation Just what asked for IS 202 – FALL 2004 File Toys Addresses Naughty Nice Toys Callie’s Birthday 2004-10-05 - SLIDE 8 From File Systems to DBMS • Problems with file processing systems – Inconsistent data – Inflexibility – Limited data sharing – Poor enforcement of standards – Excessive program maintenance IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 9 DBMS Benefits • • • • • • • • • Minimal data redundancy Consistency of data Integration of data Sharing of data Ease of application development Uniform security, privacy, and integrity controls Data accessibility and responsiveness Data independence Reduced program maintenance IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 10 Terms and Concepts • Data independence – Physical representation and location of data and the use of that data are separated • The application doesn’t need to know how or where the database has stored the data, but just how to ask for it • Moving a database from one DBMS to another should not have a material effect on application program • Recoding, adding fields, etc. in the database should not affect applications IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 11 Database Environment CASE Tools Repository IS 202 – FALL 2004 User Interface DBMS Application Programs Database Callie’s Birthday 2004-10-05 - SLIDE 12 Database Components DBMS =============== Design tools Database Database contains: User’s Data Metadata Indexes Application Metadata IS 202 – FALL 2004 Table Creation Form Creation Query Creation Report Creation Procedural language compiler (4GL) ============= Run time Form processor Query processor Report Writer Language Run time Application Programs User Interface Applications Callie’s Birthday 2004-10-05 - SLIDE 13 Types of Database Systems • • • • • PC databases Centralized database Client/server databases Distributed databases Database models IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 14 PC Databases E.g.: Access FoxPro Dbase Etc. IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 15 Centralized Databases Central Computer IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 16 Client Server Databases Client Client Network Database Server Client IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 17 Distributed Databases Location C Location B computer computer computer Homogeneous Databases Location A IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 18 Distributed Databases Client Heterogeneous Or Federated Databases Database Server Remote Comp. Local Network Comm Server Client IS 202 – FALL 2004 Remote Comp. Callie’s Birthday 2004-10-05 - SLIDE 19 Terms and Concepts • A “database application” is an application program (or set of related programs) that is used to perform a series of database activities: – Create • Add new data to the database – Read • Read current data from the database – Update • Update or modify current database data – Delete • Remove current On behalf of database users IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 20 Terms and Concepts • Enterprise – Organization • Entity – Person, Place, Thing, Event, Concept... • Attributes – Data elements (facts) about some entity – Also sometimes called fields or items or domains • Data values – Instances of a particular attribute for a particular entity IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 21 Terms and Concepts • Key – An attribute or set of attributes used to identify or locate records in a file • Primary Key – An attribute or set of attributes that uniquely identifies each record in a file IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 22 Terms and Concepts • Models – (1) Levels or views of the Database • Conceptual, logical, physical – (2) DBMS types • Relational, Hierarchic, Network, Object-Oriented, Object-Relational IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 23 Models (1) Application 1 External Model Application 2 Application 3 Application 4 External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Application 3 Conceptual requirements Conceptual Model Logical Model Internal Model Application 4 Conceptual requirements More later on this… IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 24 Data Models(2): History • Hierarchical Model (1960’s and 1970’s) – Similar to data structures in programming languages Books (id, title) Authors (first, last) IS 202 – FALL 2004 Publisher Subjects Callie’s Birthday 2004-10-05 - SLIDE 25 Data Models(2): History • Network Model (1970’s) – Provides for single entries of data and navigational “links” through chains of data. Authors Subjects Books Publishers IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 26 Data Models(2): History • Relational Model (1980’s) – Provides a conceptually simple model for data as relations (typically considered “tables”) with all data visible pubid Book ID 1 2 3 4 5 Title pubid Introductio The history New stuff ab Another title And yet more IS 202 – FALL 2004 2 4 3 2 1 Author id 1 2 3 4 5 1 2 3 4 Book ID pubname Harper Addison Oxford Que Authorid 1 2 3 4 5 Author name Smith Wynar Jones Duncan Applegate Subid 1 2 3 4 4 2 1 3 2 3 Subid Subject 1 cataloging 2 history 3 stuff Callie’s Birthday 2004-10-05 - SLIDE 27 Data Models(2): History • Object Oriented Data Model (1990’s) – Encapsulates data and operations as “Objects” Books (id, title) Authors (first, last) IS 202 – FALL 2004 Publisher Subjects Callie’s Birthday 2004-10-05 - SLIDE 28 Data Models(2): History • Object-Relational Model (1990’s) – Combines the well-known properties of the Relational Model with such OO features as: • User-defined datatypes • User-defined functions • Inheritance and sub-classing • All of the major enterprise DBMS systems are now Object-Relational or incorporate Object-Relational features IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 29 Lecture Overview • Review – MediaStreams • • • • • Databases and Database Design Database Life Cycle ER Diagrams Discussion Next Time/Readings IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 30 Database System Life Cycle Physical Creation 2 Conversion 3 Design 1 Growth, Change, & Maintenance 6 Integration 4 Operations 5 IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 31 Design • Determination of the needs of the organization • Development of the Conceptual Model of the database – Typically using Entity-Relationship diagramming techniques • Construction of a Data Dictionary • Development of the Logical Model IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 32 Physical Creation • Development of the Physical Model of the Database – Data formats and types – Determination of indexes, etc. • Load a prototype database and test • Determine and implement security, privacy and access controls • Determine and implement integrity constraints IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 33 Conversion • Convert existing data sets and applications to use the new database – May need programs, conversion utilities to convert old data to new formats IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 34 Integration • Overlaps with Phase 3 • Integration of converted applications and new applications into the new database IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 35 Operations • All applications run full-scale • Privacy, security, access control must be in place • Recovery and Backup procedures must be established and used IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 36 Growth, Change, and Maintenance • Change is a way of life – Applications, data requirements, reports, etc. will all change as new needs and requirements are found – The Database and applications and will need to be modified to meet the needs of changes IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 37 Another View of the Life Cycle Integration 4 Operations 5 Design Physical 1 Creation Conversion Growth, 2 Change 3 6 IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 38 Lecture Overview • Review – MediaStreams • • • • • Databases and Database Design Database Life Cycle ER Diagrams Discussion Next Time/Readings IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 39 Database Design Process Application 1 External Model Application 2 Application 3 Application 4 External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Application 3 Conceptual requirements Conceptual Model Logical Model Internal Model Application 4 Conceptual requirements IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 40 Entity • An Entity is an object in the real world (or even imaginary worlds) about which we want or need to maintain information – Persons (e.g.: customers in a business, employees, authors) – Things (e.g.: purchase orders, meetings, parts, companies) Employee IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 41 Attributes • Attributes are the significant properties or characteristics of an entity that help identify it and provide the information needed to interact with it or use it (this is the Metadata for the entities) Birthdate First Middle Last IS 202 – FALL 2004 Age Name Employee SSN Projects Callie’s Birthday 2004-10-05 - SLIDE 42 Relationships • Relationships are the associations between entities • They can involve one or more entities and belong to particular relationship types IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 43 Relationships Student Attends Class Project Supplier IS 202 – FALL 2004 Supplies project parts Part Callie’s Birthday 2004-10-05 - SLIDE 44 Types of Relationships • Concerned only with cardinality of relationship Employee Employee Employee 1 Assigned n Assigned 1 1 m Assigned n Truck Project Project Chen ER notation IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 45 Other Notations Employee Assigned Truck Employee Assigned Project Employee Assigned Project “Crow’s Foot” IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 46 Other Notations Employee Assigned Truck Employee Assigned Project Employee Assigned Project IDEFIX Notation IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 47 More Complex Relationships Manager 1/1/1 Employee 1/n/n Evaluation n/n/1 Project SSN Date Project Employee 4(2-10) Assigned 1 Manages Employee Is Managed By Project 1 Manages n IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 48 Weak Entities • Owe existence entirely to another entity Part# Invoice # Order Invoice# Contains Quantity Order-line Rep# IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 49 Supertype and Subtype Entities Employee Sales-rep Is one of Manages Clerk Sold Other Invoice IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 50 Many to Many Relationships SSN Proj# Proj# Hours Project Assignment Is Assigned Project Assigned Employee IS 202 – FALL 2004 SSN Callie’s Birthday 2004-10-05 - SLIDE 51 Lecture Overview • Review – MediaStreams • • • • • Databases and Database Design Database Life Cycle ER Diagrams Discussion Next Time/Readings IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 52 Discussion • Why use DBMS for web-based system development? • Why Not use IR systems? • Can you use both? • Other Questions? IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 53 Lecture Overview • Review – MediaStreams • • • • • • Databases and Database Design Database Life Cycle ER Diagrams Database Design Discussion Next Time/Readings IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 54 Next Time • Database Design – Normalization and SQL • Readings – Hoffer/McFadden “Logical database Design and the Relational Model” IS 202 – FALL 2004 Callie’s Birthday 2004-10-05 - SLIDE 55