GUS 0262: Fundamentals of GIS
Lecture Presentation 3: Relational Data Model
Jeremy Mennis
Department of Geography and Urban Studies, Temple University

File Structures

A file: "STUDENT". Each column is a field and each row is a record.

ID  Last   First  Grade
3   Smith  Jane   A
1   Wood   Bob    C
2   Kent   Chuck  B
4   Boone  Dan    B

File Structures

Simple: ordering is based on order of entry into the file.
Ordered sequential: ordering is based on numeric or alphabetical order.
Indexed: an index provides pointers to certain positions in the file.

Databases and Data Models

A database is a collection of data files that is structured (organized) to facilitate data storage, manipulation, and retrieval.
A database management system (DBMS) is a software package that performs these database functions.

A data model is a particular way of conceptually organizing multiple data files in a database: hierarchical, network, or relational.
Hierarchical and network data models have generally been replaced by the relational data model.
Relational DBMSs (and their derivatives) dominate the (non-GIS) database market, e.g. Oracle and Informix.

Relational Data Model

Composed of a set of tables, called relations.
Records (rows) in the table are called tuples.
Fields (columns) in the table are called attributes; the set of values an attribute may take is its domain.
We will use the terms tables, records, and fields.

Each record represents a logical entity (e.g. a student) or a relationship.
Each field represents an attribute (property) of the logical entity.

Student
ID  Last   First  Grade  Class
1   Wood   Bob    C      Geog357
2   Kent   Chuck  B      Geog115
3   Smith  Jane   A      Geog357
4   Boone  Dan    B      Geog357

Each table has a primary key: one field (or a combination of fields) that has a unique value for each and every record in the table.

Tables can be related (joined or linked) together based on their keys.
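What a primary key guarantees can be tried out in a few lines. The sketch below is only an illustration: it assumes Python's standard-library sqlite3 module as a stand-in for a relational DBMS (the lecture does not prescribe one), it declares ID as the primary key of the Student table, and the extra record for "Doe" is hypothetical.

```python
import sqlite3

# In-memory database; sqlite3 stands in for any relational DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    "ID INTEGER PRIMARY KEY, Last TEXT, First TEXT, Grade TEXT, Class TEXT)"
)

# Records from the Student table in the slides.
students = [
    (1, "Wood", "Bob", "C", "Geog357"),
    (2, "Kent", "Chuck", "B", "Geog115"),
    (3, "Smith", "Jane", "A", "Geog357"),
    (4, "Boone", "Dan", "B", "Geog357"),
]
conn.executemany("INSERT INTO Student VALUES (?, ?, ?, ?, ?)", students)


def insert_is_accepted(record):
    """Return True if the record is inserted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?, ?)", record)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical record that reuses ID 1: the primary key forbids it.
duplicate_ok = insert_is_accepted((1, "Doe", "Pat", "A", "Geog115"))
# The same hypothetical record with an unused ID is accepted.
fresh_ok = insert_is_accepted((5, "Doe", "Pat", "A", "Geog115"))

row_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(duplicate_ok, fresh_ok, row_count)
```

The DBMS, not the user, enforces uniqueness here, which is what makes the key safe to use when relating tables.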
Student and Class tables, related through the Class field:

Student
ID  Last   First  Grade  Class
1   Wood   Bob    C      Geog357
2   Kent   Chuck  B      Geog115
3   Smith  Jane   A      Geog357
4   Boone  Dan    B      Geog357

Class
Name     #Stud  Instructor
Geog357  48     Mennis
Geog115  120    Brower
Geog20   120    Fountain

ID is the primary key of Student and Name is the primary key of Class. The Class field in Student is a foreign key: it holds values of the primary key of the Class table.

A third table can be related in the same way:

Student
ID  Last   First  Grade  Class
1   Wood   Bob    C      Geog357
2   Kent   Chuck  B      Geog115
3   Smith  Jane   A      Geog357
4   Boone  Dan    B      Geog357

Class
Name     #Stud  Instructor
Geog20   120    Brower
Geog115  120    Mennis
Geog357  48     Mennis

Instructor
Name    Office
Mennis  332
Brower  517

Normal Forms

Normalization is the process of structuring tables and table relationships in a logical way that minimizes data redundancy.
There are 3 rules, or steps, in normalization:
• first normal form
• second normal form
• third normal form

Normal Forms

First normal form: only one value per field for each record.

Student (violates first normal form)
ID  Last   First  Grades  Classes
1   Wood   Bob    C, B    Geog357, Geog20
2   Kent   Chuck  B, D    Geog115, Geog356
3   Smith  Jane   A, B    Geog357, Geog20
4   Boone  Dan    B, A    Geog357, Geog455

Normal Forms

Second normal form: each non-primary-key field must be totally dependent on the entire primary key (and not on only part of the primary key).

Student (primary key: Year, Last, First)
Year  Last   First  Grade  Status
2     Wood   Bob    C      sophomore
4     Kent   Chuck  B      senior
3     Smith  Jane   A      junior
3     Boone  Dan    B      junior

This violates second normal form because Status is dependent only on Year, not on Year/Last/First.

Normal Forms

To resolve the second normal form violation, create separate tables.

Student
Last   First  Grade  Year
Wood   Bob    C      2
Kent   Chuck  B      4
Smith  Jane   A      3
Boone  Dan    B      3

Year (primary key: Year)
Year  Status
1     freshman
2     sophomore
3     junior
4     senior

Normal Forms

Third normal form: every field that is not a primary key must be totally and directly dependent on the primary key (no transitive dependency).

Student
ID  Last   First  Grade  Class    Instructor
1   Wood   Bob    C      Geog357  Mennis
2   Kent   Chuck  B      Geog115  Brower
3   Smith  Jane   A      Geog357  Mennis
4   Boone  Dan    B      Geog357  Mennis

This violates third normal form because Instructor is dependent on Class, not on the primary key ID.

Normal Forms

To resolve the third normal form violation, create separate tables.

Student (Class is dependent on primary key ID)
ID  Last   First  Grade  Class
1   Wood   Bob    C      Geog357
2   Kent   Chuck  B      Geog115
3   Smith  Jane   A      Geog357
4   Boone  Dan    B      Geog357

Class (Instructor is dependent on primary key Name)
Name     #Stud  Instructor
Geog357  48     Mennis
Geog115  120    Brower
Geog20   120    Mennis

Why Normalization?

With Class and Instructor stored together in the Student table:
1. If the instructor of a class changed, every student in that class would have to have their instructor changed.
2. Every time a student changed classes, the instructor would also have to be changed.

Original table
ID  Last   First  Grade  Class    Instructor
1   Wood   Bob    C      Geog357  Mennis
2   Kent   Chuck  B      Geog115  Brower
3   Smith  Jane   A      Geog357  Mennis
4   Boone  Dan    B      Geog357  Mennis

Updated table
ID  Last   First  Grade  Class    Instructor
1   Wood   Bob    C      Geog115  Mennis      <-- logical inconsistency
2   Kent   Chuck  B      Geog115  Brower
3   Smith  Jane   A      Geog357  Knight      <-- logical inconsistency
4   Boone  Dan    B      Geog357  Mennis

These update problems may result in logical inconsistencies in the database.

Why Normalization?

When the table is in third normal form, these logical inconsistencies cannot take place.
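The update problem can be reproduced in a short sketch. It assumes Python's standard-library sqlite3 module and two illustrative layouts that are not named in the slides: a single table called Flat that stores Class and Instructor together, and a normalized pair Student and Class. The replacement instructor Knight is taken from the updated table in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Unnormalized layout (hypothetical table name Flat): Instructor repeats per student.
    CREATE TABLE Flat (ID INTEGER PRIMARY KEY, Last TEXT, Class TEXT, Instructor TEXT);
    INSERT INTO Flat VALUES
        (1, 'Wood', 'Geog357', 'Mennis'),
        (2, 'Kent', 'Geog115', 'Brower'),
        (3, 'Smith', 'Geog357', 'Mennis'),
        (4, 'Boone', 'Geog357', 'Mennis');

    -- Third normal form: Instructor is stored once per class.
    CREATE TABLE Class (Name TEXT PRIMARY KEY, Instructor TEXT);
    INSERT INTO Class VALUES ('Geog357', 'Mennis'), ('Geog115', 'Brower');
    CREATE TABLE Student (ID INTEGER PRIMARY KEY, Last TEXT,
                          Class TEXT REFERENCES Class(Name));
    INSERT INTO Student VALUES
        (1, 'Wood', 'Geog357'), (2, 'Kent', 'Geog115'),
        (3, 'Smith', 'Geog357'), (4, 'Boone', 'Geog357');
""")

# Unnormalized: the change reaches only one of the three Geog357 records.
conn.execute("UPDATE Flat SET Instructor = 'Knight' WHERE ID = 3")
flat_instructors = {
    row[0]
    for row in conn.execute("SELECT Instructor FROM Flat WHERE Class = 'Geog357'")
}

# Normalized: one update to Class is seen by every student through the join.
conn.execute("UPDATE Class SET Instructor = 'Knight' WHERE Name = 'Geog357'")
normalized_instructors = {
    row[0]
    for row in conn.execute(
        "SELECT Class.Instructor FROM Student "
        "JOIN Class ON Student.Class = Class.Name "
        "WHERE Student.Class = 'Geog357'"
    )
}

print(sorted(flat_instructors))        # two different instructors for one class
print(sorted(normalized_instructors))  # a single instructor
```

In the flat table nothing stops Geog357 from having two instructors at once; in the normalized layout there is only one place where the instructor of a class is recorded.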
When an instructor is changed, the change is enforced for all students.
When a student changes classes, the change in instructor is automatically enforced.

Student
ID  Last   First  Grade  Class
1   Wood   Bob    C      Geog357
2   Kent   Chuck  B      Geog115
3   Smith  Jane   A      Geog357
4   Boone  Dan    B      Geog357

Class
Name     #Stud  Instructor
Geog357  48     Mennis
Geog115  120    Brower
Geog20   120    Mennis

Relational Algebra

Operations on the relational data model are defined by relational algebra: join, projection, and selection.

Relational Algebra

Join: match records in both tables based on a common field.

Geography Classes
Class    Instructor
Geog357  Mennis
Geog115  Brower
Geog20   Fountain
Geog435  Karnes

Instructor
Instructor  Office
Mennis      332
Brower      423
Fountain    125
Karnes      312

Result of Join
Class    Instructor  Office
Geog357  Mennis      332
Geog115  Brower      423
Geog20   Fountain    125
Geog435  Karnes      312

Relational Algebra

Projection: reduces one table in the attribute dimension (a selection of a subset of fields, for all records).

Projection: list all Geography classes, but not the instructors.

Geography Classes
Class    Instructor
Geog357  Mennis
Geog115  Brower
Geog20   Fountain
Geog435  Karnes

Result of Projection
Class
Geog357
Geog115
Geog20
Geog435

Relational Algebra

Selection (restriction): reduces one table in the record dimension (a selection of a subset of records, for all fields).
The criterion for selection is called a predicate.

Selection: find Geography classes taught by Mennis.

Geography Classes
Class    Instructor
Geog357  Mennis
Geog115  Brower
Geog20   Fountain
Geog435  Karnes

Result of Selection
Class    Instructor
Geog357  Mennis

SQL

Structured Query Language: the standard formal language for interacting with relational databases.
It is an implementation of, and a language for, relational algebra.

SQL

Basic syntax:

    SELECT <fields> FROM <tables> WHERE <condition>

For example, on the Geography Classes table:

    SELECT Class, Instructor
    FROM "Geography Classes"
    WHERE Instructor = 'Mennis'

Result of Selection
Class    Instructor
Geog357  Mennis

SQL

A query over two tables needs a condition that matches their records on the common field, in addition to the selection criteria:

    SELECT Class, Office
    FROM "Geography Classes", Instructor
    WHERE "Geography Classes".Instructor = Instructor.Name
      AND (Class = 'Geog357'
           OR "Geography Classes".Instructor = 'Karnes'
           OR Office = 125)
    ORDER BY Office

Geography Classes
Class    Instructor
Geog357  Mennis
Geog115  Brower
Geog20   Fountain
Geog435  Karnes

Instructor
Name      Office
Mennis    332
Brower    423
Fountain  125
Karnes    312

Result of SQL Query
Class    Office
Geog20   125
Geog435  312
Geog357  332
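The three relational algebra operations and the final query can be run end to end. This is a minimal sketch assuming Python's standard-library sqlite3 module; the office numbers for Brower and Karnes are as read from the slide tables, and the aliases gc and i and the helper name query are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Geography Classes" (Class TEXT PRIMARY KEY, Instructor TEXT);
    CREATE TABLE Instructor (Name TEXT PRIMARY KEY, Office INTEGER);
    INSERT INTO "Geography Classes" VALUES
        ('Geog357', 'Mennis'), ('Geog115', 'Brower'),
        ('Geog20', 'Fountain'), ('Geog435', 'Karnes');
    INSERT INTO Instructor VALUES
        ('Mennis', 332), ('Brower', 423), ('Fountain', 125), ('Karnes', 312);
""")


def query(sql):
    """Run a SELECT statement and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Selection: a subset of records, all fields.
selection = query("""
    SELECT Class, Instructor
    FROM "Geography Classes"
    WHERE Instructor = 'Mennis'
""")

# Projection: a subset of fields, all records.
projection = query("""
    SELECT Class
    FROM "Geography Classes"
""")

# Join: match records in both tables on the common instructor field.
joined = query("""
    SELECT gc.Class, gc.Instructor, i.Office
    FROM "Geography Classes" AS gc
    JOIN Instructor AS i ON gc.Instructor = i.Name
""")

# Join, selection, projection, and ordering combined, as in the last slide.
combined = query("""
    SELECT gc.Class, i.Office
    FROM "Geography Classes" AS gc
    JOIN Instructor AS i ON gc.Instructor = i.Name
    WHERE gc.Class = 'Geog357' OR gc.Instructor = 'Karnes' OR i.Office = 125
    ORDER BY i.Office
""")

for name, rows in [("selection", selection), ("projection", projection),
                   ("joined", joined), ("combined", combined)]:
    print(name, rows)
```

Writing the join with an explicit ON clause keeps the matching condition separate from the selection predicate, so the OR conditions need no extra parentheses.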