Download Relational Data Model

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Database wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Data vault modeling wikipedia , lookup

Relational algebra wikipedia , lookup

Relational model wikipedia , lookup

Database model wikipedia , lookup

Transcript
GUS: 0262
Fundamentals of GIS
Lecture Presentation 3:
Relational Data Model
Jeremy Mennis
Department of Geography and Urban Studies
Temple University
File Structures
A file: “STUDENT”
field
record
ID Last
First
Grade
3
Smith
Jane
A
1
Wood
Bob
C
2
Kent
Chuck
B
4
Boone
Dan
B
File Structures
Simple
ordering is based on order of entry into the file
Ordered Sequential
ordering is based on numeric or alphabetical
ordering
Indexed
an index provides pointers to certain positions in
the file
Databases and Data Models
A database is a collection of data files that is
structured (organized) to facilitate data storage,
manipulation, and retrieval.
A database management system (DBMS) is a
software package that performs these database
functions
Databases and Data Models
A data model is a particular way of conceptually
organizing multiple data files in a database:
Hierarchical, Network, Relational
Hierarchical and network data models have generally
been replaced by the relational data model.
Relational DBMSs (and their derivatives) dominate the
(non-GIS) database market: Oracle, Informix.
Relational Data Model
Composed of a set of tables called relations
Records (rows) in the table are called tuples
Fields (columns) in the table are called domains
We will use the terms tables, records, and fields
Relational data model
Each record represents a logical entity (e.g. a student)
(or a relationship)
Each field represents an attribute (property) of the
logical entity
Student
ID
Last
First
Grade Class
1
Wood
Bob
C
2
Kent
Chuck
B Geog115
3
Smith
Jane
A Geog357
4
Boone
Dan
B
Geog357
Geog357
Relational data model
Each table has a primary key, one field (or a
combination of fields) that has a unique value for
each and every record in the table
Student
ID
Last
First
Grade Class
1
Wood
Bob
C
Geog357
2
Kent
Chuck
B
Geog115
3
Smith
Jane
A Geog357
4
Boone
Dan
B
Geog357
Relational data model
Tables can be related (joined or linked) together based
on their keys.
Student
ID
Last
Class
First Grade Class
Name
1
Wood
Bob
C Geog357
2
Kent
Chuck
B Geog115
3
Smith
Jane
A Geog357
4
Boone
Dan
B Geog357
#Stud Instructor
Geog357 48
Mennis
Geog115 120 Brower
Geog20
120 Fountain
Relational data model
Primary key
Foreign key
Primary key
Student
ID
Last
Class
First Grade Class
Name
1
Wood
Bob
C Geog357
2
Kent
Chuck
B Geog115
3
Smith
Jane
A Geog357
4
Boone
Dan
B Geog357
#Stud Instructor
Geog357 48
Mennis
Geog115 120 Brower
Geog20
120 Mennis
Relational data model
Student
ID Last
First Grade Class
1
Wood
Bob
C Geog357
2
Kent
Chuck
B Geog115
3
Smith
Jane
A Geog357
4
Boone
Dan
B Geog357
Class
Name
#Stud Instructor
Geog20
120 Brower
Geog115 120 Mennis
Geog357 48
Mennis
Instructor
Name
Office
Mennis
332
Brower
517
Normal Forms
The process of structuring tables and table relationships
in a logical way that minimizes data redundancy.
3 rules or steps in normalization
• first normal form
• second normal form
• third normal form
Normal Forms
First normal form
only one value per field for each record
Violates first
normal form
Student
ID
Last
First
Grades
Classes
1
Wood
Bob
C, B
Geog357, Geog20
2
Kent
Chuck
B, D Geog115, Geog356
3
Smith
Jane
4
Boone
Dan
A, B Geog357, Geog20
B, A Geog357, Geog455
Normal Forms
Second normal form
each non-primary key field must be totally
dependent on the entire primary key (and not on
only part of the primary key)
primary key
Year Last
Student
First Grade Status
2
Wood
Bob
C
sophomore
4
Kent
Chuck
B
senior
3
Smith
Jane
A
junior
3
Boone
Dan
B
junior
Violates second
normal form
because Status is
dependent only on
Year, not on
Year/Last/First
Normal Forms
To resolve second normal form violation - create
separate tables
Student
Year
primary key
primary key
Last
First Grade Year
Year
Status
Wood
Bob
C
2
1
freshman
Kent
Chuck
B
4
2
sophomore
Smith
Jane
A
3
3
junior
Boone
Dan
B
3
4
senior
Normal Forms
Third normal form
every field that is not a primary key must be totally
and directly dependent on the primary key (no
transitive dependency)
Student
ID
Last
First Grade Class Instructor
1
Wood
Bob
C
Geog357 Mennis
2
Kent
Chuck
B
Geog115 Brower
3
Smith
Jane
A Geog357 Mennis
4
Boone
Dan
B Geog357 Mennis
Normal Forms
Third normal form
every field that is not a primary key must be totally
and directly dependent on the primary key (no
transitive dependency)
Student
ID
Last
First Grade Class Instructor
1
Wood
Bob
C
Geog357 Mennis
2
Kent
Chuck
B
Geog115 Brower
3
Smith
Jane
A Geog357 Mennis
4
Boone
Dan
B Geog357 Mennis
Violates third normal
form because
Instructor is
dependent on Class,
not on the primary
key ID
Normal Forms
To resolve third normal form violation - create separate
tables
Student
ID
Last
Class is
dependent on
primary key ID
First Grade Class
Class
Name
1
Wood
Bob
C Geog357
2
Kent
Chuck
B Geog115
3
Smith
Jane
A Geog357
4
Boone
Dan
B Geog357
Instructor is
dependent on
primary key
Name
#Stud Instructor
Geog357 48
Mennis
Geog115 120 Brower
Geog20
120 Mennis
Why Normalization?
1. If the instructor to a class changed - all students with
that class would have to have their instructors changed
2. Every time a student changed a class, the instructor
would also have to be changed
Student
ID Last
First Grade Class Instructor
1
Wood
Bob
C
2
Kent
Chuck
B Geog115 Brower
3
Smith
Jane
A Geog357 Mennis
4
Boone
Dan
B Geog357 Mennis
Geog357 Mennis
Why Normalization?
Original
table
Updated
table
ID Last
First Grade Class Instructor
1
Wood
Bob
C
2
Kent
Chuck
B Geog115 Brower
3
Smith
Jane
A Geog357 Mennis
4
Boone
Dan
B Geog357 Mennis
Student
Geog357 Mennis
ID Last
First Grade Class Instructor
1
Wood
Bob
C
2
Kent
Chuck
B Geog115 Brower
3
Smith
Jane
A Geog357 Knight
4
Boone
Dan
B Geog357 Mennis
Geog115 Mennis
These update
problems may
result in logical
inconsistencies in
the database
Logical
inconsistency
Logical
inconsistency
Why Normalization?
When the table is in third normal form, these logical
inconsistencies cannot take place.
When an instructor is changed, the change is enforced for
all students
When a student changes classes, the change is instructor
is automatically enforced
ID
Last
First Grade Class
Class
Name
1
Wood
Bob
C Geog357
2
Kent
Chuck
B Geog115
3
Smith
Jane
A Geog357
4
Boone
Dan
B Geog357
#Stud Instructor
Geog357 48
Mennis
Geog115 120 Brower
Geog20
120 Mennis
Relational Algebra
Operations on the relational data model are defined by
relational algebra
join
projection
selection
Relational Algebra
Join: Match records in both tables based on a common
field
Geography Classes
Class
Instructor
Instructor
Instructor Office
Result of Join
Class
Instructor Office
Geog357 Mennis
Mennis 332
Geog357 Mennis 332
Geog115 Brower
Brower
Geog115 Brower
Geog20
Fountain 125
Geog20
Karnes
Geog435 Karnes
Fountain
Geog435 Karnes
423
312
423
Fountain 125
312
Relational Algebra
Projection: reduces one table in the attribute
dimension (a selection of a subset of fields, for all
records)
Relational Algebra
Projection: List all Geography classes, but not the
instructors
Geography Classes
Class
Instructor
Result of Projection
Class
Geog357 Mennis
Geog357
Geog115 Brower
Geog115
Geog20
Geog20
Fountain
Geog435 Karnes
Geog435
Relational Algebra
Selection (restriction): reduces one table in the record
dimension (a selection of a subset of records, for all
fields)
Criteria for selection is called a predicate
Relational Algebra
Selection: Find Geography classes taught by Mennis
Geography Classes
Class
Instructor
Geog357 Mennis
Geog115 Brower
Geog20
Fountain
Geog435 Karnes
Result of Selection
Class
Instructor
Geog357 Mennis
SQL
– Structured (Standard) Query Language
– Formal language for interacting with relational
databases
– Implementation and language for relational algebra
SQL
SQL - basic syntax
Geography Classes
Class
Instructor
Geog357 Mennis
SELECT <fields>
FROM <tables>
WHERE <condition>
SELECT Class, Instructor
FROM Geography Classes
WHERE Instructor = “Mennis”
Geog115 Brower
Geog20
Fountain
Geog435 Karnes
Result of Selection
Class
Instructor
Geog357 Mennis
SQL
SELECT
Class, Office
FROM
WHERE
Geography Classes, Instructor
Class = “Geog357” or
Instructor = “Karnes” or
Office = 125
ORDER BY Office
Geography Classes
Class
Instructor
Instructor
Name Office
Geog357 Mennis
Mennis 332
Geog115 Brower
Brower
Geog20
Fountain 125
Fountain
Geog435 Karnes
Karnes
423
312
Result of SQL Query
Class
Office
Geog20
125
Geog435 312
Geog357 332