Chapter 17
Designing Databases
Systems Analysis and Design
Kendall and Kendall
Fifth Edition
Major Topics
Files
Databases
Normalization
Key design
Using the database
Data warehouses
Data mining
Kendall & Kendall
Copyright © 2002 by Prentice Hall, Inc.
17-2
Data Storage Design Objectives
• Data must be available when the user wants it
• Data must have integrity: accurate and consistent
• Efficient data storage, maintenance, and retrieval
• Information retrieval must be purposeful
• Information obtained must be useful for
  • Managing
  • Planning
  • Controlling
  • Decision making
Data Storage Approaches
• Store data in individual files, each unique to a particular application.
  • Can be designed and built quite rapidly.
  • Concerns for data availability and security are minimized.
  • Choose an appropriate file structure according to the system requirements.
• Store data in a database.
  • A formally defined and centrally controlled store of data intended for use in many different applications.
Objectives of Effective Databases
• Ensure data can be shared among users for a variety of applications
• Maintain data that are both accurate and consistent
• Ensure all data required for current and future applications will be readily available
• Allow the database to evolve
• Allow users to construct their personal view of the data without concern for the way the data are physically stored
Efficiency Measures of Database Design
• Time
• Cost for
  • Design and development
  • Operation and maintenance
  • Hardware installation
  • User training
Entities
A distinct collection of data for one person, place, thing, or event
• Entities become files or database tables
[Figure: an entity box labeled Customer]
Attributive Entity
Describes attributes, especially repeating elements
[Figure labels: Book, Subject]
Associative Entity
The relationship line in a many-to-many relationship becomes an
associative entity.
Sometimes called a composite entity or
gerund
Associative Entity Connections
• Links two entities
• Each entity end has a “one” connection
• The associative entity has a “many” connection on each side
• Can only exist between two entities
• Becomes a database table
Key of an Associative Entity
The primary key for each “one” end is a
foreign key on the associative entity
Both foreign keys concatenated together
become the primary key
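The key rule above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student, course, and enrollment tables and all of their columns are hypothetical names, not taken from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Associative entity (hypothetical): one row per student/course pairing.
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course_id  INTEGER NOT NULL REFERENCES course(course_id),
        grade      TEXT,
        PRIMARY KEY (student_id, course_id)  -- both foreign keys concatenated
    );
""")
conn.execute("INSERT INTO student VALUES (1, 'Ana')")
conn.execute("INSERT INTO course VALUES (10, 'Databases')")
conn.execute("INSERT INTO enrollment VALUES (1, 10, 'A')")


def enroll_again():
    """Return True if a duplicate pairing is rejected by the concatenated key."""
    try:
        conn.execute("INSERT INTO enrollment VALUES (1, 10, 'B')")
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_rejected = enroll_again()
enrollment_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
```

Each “one” end contributes its primary key as a foreign key, and the pair together identifies a single enrollment row.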
Relationships
Associations between entities may be
One-to-one
One-to-many
Many-to-many
A single vertical line represents one
A circle represents zero or none
A crow's foot represents many
[Figure: Relationships. Line-end symbols labeled Many, One, and None (a circle, O)]
Ordinality
The ordinality is the minimum number
that can occur in a relationship
If the ordinality is zero, it means that it
is possible to have none of the entity
[Figure: relationship between Item and Order, with a circle (O) on the line]
Entity Subtype
A special one-to-one relationship
It is used to represent additional
attributes, which may not be present on
every record of the first entity
This eliminates null fields on the
primary database table
For example, a company that has
preferred customers, or student interns
[Figure: Student entity with an Internship subtype]
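A minimal sketch of the student intern example, using Python's sqlite3 module. The table layout, column names, and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Subtype table (hypothetical): it reuses the student key as its own
    -- primary key, so each student has at most one internship row.
    CREATE TABLE internship (
        student_id INTEGER PRIMARY KEY REFERENCES student(student_id),
        employer   TEXT NOT NULL
    );
    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO internship VALUES (1, 'Acme');
""")

# Students without an internship simply have no subtype row; the student
# table itself carries no empty employer column.
rows = conn.execute("""
    SELECT s.name, i.employer
    FROM student AS s
    LEFT JOIN internship AS i ON i.student_id = s.student_id
    ORDER BY s.student_id
""").fetchall()
```

The null for the second student appears only in the joined result, not in any stored record.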
Self-Join
A self-join is when a record has a
relationship with another record on the
same file
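As an illustration, a self-join might look like the following sqlite3 sketch, where an employee record points at its manager in the same table. The employee table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        -- Refers back to another record in this same table.
        manager_id  INTEGER REFERENCES employee(employee_id)
    );
    INSERT INTO employee VALUES (1, 'Dana', NULL), (2, 'Eli', 1), (3, 'Fay', 1);
""")

# The table is joined to itself under two aliases: e (employee) and m (manager).
pairs = conn.execute("""
    SELECT e.name, m.name
    FROM employee AS e
    JOIN employee AS m ON e.manager_id = m.employee_id
    ORDER BY e.employee_id
""").fetchall()
```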
Attributes, Records, and Keys
An attribute is a characteristic of an
entity, sometimes called a field
A record is a collection of data items
that have something in common
Keys are data items in a record used to
identify the record
Key Types
Primary key, unique for the record
Secondary key, may not be unique
Concatenated key, a combination of two
or more data items for the key
Foreign key, a data item in one record
that is the key of another record
Types of Files
• Master files
  • Have large records
  • Contain all pertinent information about an entity
• Transaction files
  • Have short records
  • Contain information used to update master files
• Table files
• Work files
• Report files
File Organization
Sequential organization
Linked lists
Hashed file organization
Indexed organization
Indexed-sequential organization
VSAM (Virtual Storage Access Method),
sequential and indexed-sequential files
Databases
Intended to be shared by users
Three types of databases:
Hierarchical databases
Network databases
Relational databases
Normalization
Transformation of complex user views
and data to a set of smaller, stable, and
easily maintainable data structures
Normalization
Creates data that are stored only once
on a file with the exception of key fields
This eliminates redundant data storage
It provides ideal data storage for
database systems
Three Steps of Data Normalization
1. Remove all repeating groups and identify the primary key
2. Ensure that all nonkey attributes are fully dependent on the primary key
3. Remove any transitive dependencies, attributes which are dependent on other nonkey attributes
Normalization
User view
Unnormalized relationship
  Step 1: Remove repeating groups
Normalized relations (1NF)
  Step 2: Remove partial dependencies
Second normal form relations (2NF)
  Step 3: Remove transitive dependencies
Third normal form relations (3NF)
Data Model Diagrams
Used to show relationships between
attributes
An oval represents an attribute
A single arrow line represents one
A double arrow line represents many
[Figure: data model diagram relating Customer Number and Salesperson Number]
First Normal Form (1NF)
Remove any repeating groups
All repeating groups are moved into a
new table
Foreign keys are used to link the tables
When a relation contains no repeating
groups, it is in 1NF
Keys must be included to link the
relations (tables)
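A small sketch of the 1NF step in plain Python. The record layout and all values are made up for illustration; the point is only that the repeating group moves to its own table and carries a foreign key back to its parent.

```python
# Unnormalized record: one salesperson with a repeating group of customers.
# All field names and values here are hypothetical.
unnormalized = [
    {
        "salesperson_no": 1,
        "name": "Waters",
        "customers": [
            {"customer_no": 100, "sales": 500},
            {"customer_no": 200, "sales": 700},
        ],
    },
]


def to_first_normal_form(records):
    """Split each record into a salesperson row plus one row per customer."""
    salespeople, sales = [], []
    for rec in records:
        salespeople.append(
            {"salesperson_no": rec["salesperson_no"], "name": rec["name"]}
        )
        for cust in rec["customers"]:
            # The salesperson key travels along as a foreign key linking the tables.
            sales.append({"salesperson_no": rec["salesperson_no"], **cust})
    return salespeople, sales


salespeople, sales = to_first_normal_form(unnormalized)
```

After the split, no row in either list contains a nested list, which is the 1NF condition described above.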
Second Normal Form (2NF)
Remove any partial dependencies
A partial dependency is when the data
are only dependent on a part of a key
field
A relation is created for the data that
are only dependent on part of the key
and another for data that are
dependent on both parts
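The 2NF split can be sketched in plain Python as follows. The relation is assumed to be keyed on (salesperson_no, customer_no); customer_name depends only on customer_no, while sales depends on both parts. Names and values are hypothetical.

```python
# 1NF relation with a concatenated key (salesperson_no, customer_no).
sales_rows = [
    {"salesperson_no": 1, "customer_no": 100, "customer_name": "Acme", "sales": 500},
    {"salesperson_no": 2, "customer_no": 100, "customer_name": "Acme", "sales": 300},
    {"salesperson_no": 1, "customer_no": 200, "customer_name": "Bolt", "sales": 700},
]


def to_second_normal_form(rows):
    """Move data that depends on only part of the key into its own relation."""
    customers = {}  # customer_no -> customer_name (depends on customer_no alone)
    sales = []      # rows whose nonkey data depend on the whole key
    for row in rows:
        customers[row["customer_no"]] = row["customer_name"]
        sales.append({k: row[k] for k in ("salesperson_no", "customer_no", "sales")})
    return customers, sales


customers, sales = to_second_normal_form(sales_rows)
```

The customer name, which was stored twice for customer 100, now appears once.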
Third Normal Form (3NF)
Remove any transitive dependencies
A transitive dependency is when a
nonkey attribute depends on another
nonkey attribute, so the relation
contains data that are not part
of the entity
The problem with transitive
dependencies is updating the data
A single data item may be present on
many records
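To illustrate the update problem, here is a plain-Python sketch under the assumption that warehouse_location depends on warehouse_no, a nonkey attribute of the customer relation. All names and values are hypothetical.

```python
# 2NF customer relation keyed on customer_no. The warehouse location is
# repeated on every customer served by the same warehouse.
customers = [
    {"customer_no": 100, "name": "Acme", "warehouse_no": 4, "warehouse_location": "Fargo"},
    {"customer_no": 200, "name": "Bolt", "warehouse_no": 4, "warehouse_location": "Fargo"},
]


def to_third_normal_form(rows):
    """Move the transitively dependent attribute into a warehouse relation."""
    warehouses = {}  # warehouse_no -> warehouse_location
    slim = []
    for row in rows:
        warehouses[row["warehouse_no"]] = row["warehouse_location"]
        slim.append({k: row[k] for k in ("customer_no", "name", "warehouse_no")})
    return slim, warehouses


customer_rows, warehouses = to_third_normal_form(customers)

# A single update now covers every customer served by warehouse 4.
warehouses[4] = "Bismarck"


def location_of(customer):
    """Look up a customer's warehouse location through the foreign key."""
    return warehouses[customer["warehouse_no"]]


locations = [location_of(c) for c in customer_rows]
```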
Entity-Relationship Diagrams (ERDs)
and Record Keys
When the relationship is one-to-many,
the primary key of the file at the one
end of the relationship should be
contained as a foreign key on the file at
the many end of the relationship
A many-to-many relationship should be
divided into two one-to-many
relationships with an associative entity
in the middle
Guidelines for Creating Master Files or Database Relations
Each separate entity should have its
own master file or database relation
A specific, nonkey data field should
exist on only one master file or relation
Each master file or relation should have
programs to create, read, update, and
delete records
Integrity Constraints
Help ensure that the database contains
accurate data
Three types:
Entity integrity constraints, which govern
the composition of primary keys
Referential integrity, which governs the
nature of records in a one-to-many
relationship
Domain integrity
Entity Integrity
The primary key cannot have a null value
If the primary key is a composite key, none
of the fields in the key can contain a null
value
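A minimal sqlite3 sketch of entity integrity on a composite key. The order_line table is a hypothetical example. Note one SQLite-specific assumption: NOT NULL is declared explicitly on each key column, because SQLite otherwise tolerates nulls in most primary key columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out on every key column; SQLite does not imply it
# for an ordinary composite PRIMARY KEY.
conn.execute("""
    CREATE TABLE order_line (
        order_no INTEGER NOT NULL,
        item_no  INTEGER NOT NULL,
        quantity INTEGER,
        PRIMARY KEY (order_no, item_no)
    )
""")


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO order_line VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((1, 7, 2)),     # complete key
    try_insert((1, None, 2)),  # null in one part of the composite key
    try_insert((None, 7, 2)),  # null in the other part
]
```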
Referential Integrity
Referential integrity means that all
foreign keys in one table (the child
table) must have a matching record in
the parent table
Referential Integrity
You cannot add a child record whose
foreign key has no matching parent record
You cannot change a primary key that
has matching child table records
  Otherwise the child table would have
  a foreign key for a different record
You cannot delete a record that has
child records
Referential Integrity
A restricted database updates or
deletes a key only if there are no
matching child records
A cascaded database will delete or
update all child records when a parent
record is deleted or changed
The parent triggers the changes
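The restricted and cascaded behaviors can be compared with a short sqlite3 sketch. The customer and orders tables are hypothetical, and SQL's ON DELETE / ON UPDATE clauses with RESTRICT and CASCADE stand in here for the two database behaviors described above.

```python
import sqlite3


def build(action):
    """Create a parent/child pair whose foreign key uses the given action."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript(f"""
        CREATE TABLE customer (customer_no INTEGER PRIMARY KEY);
        CREATE TABLE orders (
            order_no    INTEGER PRIMARY KEY,
            customer_no INTEGER NOT NULL
                REFERENCES customer(customer_no)
                ON DELETE {action} ON UPDATE {action}
        );
        INSERT INTO customer VALUES (1);
        INSERT INTO orders VALUES (500, 1);
    """)
    return conn


def attempt(conn, sql):
    """Return True if the statement succeeds, False if integrity blocks it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


restricted = build("RESTRICT")
cascaded = build("CASCADE")

# A child row needs a matching parent in either kind of database.
orphan_added = attempt(restricted, "INSERT INTO orders VALUES (501, 99)")

# Deleting a parent that has child rows: blocked when restricted,
# propagated to the children when cascaded.
restricted_deleted = attempt(restricted, "DELETE FROM customer WHERE customer_no = 1")
cascaded_deleted = attempt(cascaded, "DELETE FROM customer WHERE customer_no = 1")
orders_left = cascaded.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```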
Domain Integrity
Domain integrity defines rules that
ensure that only valid data are stored
on database records
Domain integrity has two forms:
Check constraints, which are defined at the
table level
Rules, which are defined as separate
objects and may be used within a number
of fields
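A check constraint defined at the table level might look like this sqlite3 sketch. The item table and its allowed values are hypothetical. Rules defined as separate, reusable objects are a feature of some database products and are not shown, since SQLite has no equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE item (
        item_no  INTEGER PRIMARY KEY,
        quantity INTEGER NOT NULL,
        status   TEXT NOT NULL,
        -- Table-level check constraints restrict each field to its valid domain.
        CHECK (quantity >= 0),
        CHECK (status IN ('active', 'discontinued'))
    )
""")


def is_accepted(row):
    """Return True if the row satisfies every check constraint."""
    try:
        conn.execute("INSERT INTO item VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = [
    is_accepted((1, 5, "active")),    # valid
    is_accepted((2, -1, "active")),   # negative quantity
    is_accepted((3, 5, "unknown")),   # status outside the allowed set
]
```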
Guidelines to Retrieve and Present Data
• Choose a relation from the database
• Join two relations together
• Project columns from the relation
• Select rows from the relation
• Derive new attributes
• Index or sort rows
• Calculate totals and performance measures
• Present data
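The steps above map fairly directly onto a single SQL query. The following sqlite3 sketch uses hypothetical customer and orders tables with made-up rows, and marks each step in a comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_no INTEGER PRIMARY KEY, customer_no INTEGER,
                         quantity INTEGER, unit_price REAL);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Bolt');
    INSERT INTO orders VALUES (500, 1, 2, 10.0), (501, 1, 1, 5.0), (502, 2, 4, 2.5);
""")

report = conn.execute("""
    SELECT c.name,                                       -- project columns
           SUM(o.quantity * o.unit_price) AS total       -- derive, then total
    FROM customer AS c                                   -- choose a relation
    JOIN orders AS o ON o.customer_no = c.customer_no    -- join two relations
    WHERE o.quantity > 1                                 -- select rows
    GROUP BY c.customer_no, c.name
    ORDER BY total DESC                                  -- sort rows
""").fetchall()

# Present data: a plain text listing stands in for a report or screen.
for name, total in report:
    print(f"{name}: {total:.2f}")
```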
Denormalization
Process of taking the logical data model
and transforming it into an efficient
physical model
Data Warehouses
• Used to organize information for quick and effective queries
• Data are organized around major subjects
• Data are stored as summarized rather than detailed raw data
• Data cover a much longer time frame than in a traditional transaction-oriented database
Data Warehouses
• Organized for fast queries
• Usually optimized for answering complex queries, known as OLAP
• Allow for easy access via data-mining software called siftware
• Include data from multiple databases that have been “cleaned”
• Usually contain data from outside sources (“overlays”)
Online Analytic Processing (OLAP)
Meant to answer decision makers’ complex
questions by defining a multidimensional
database
Data mining, or knowledge data discovery
(KDD), is the process of identifying
patterns that a human is incapable of
detecting
Data Mining Decision Aids
Statistical analysis
Decision trees
Neural networks
Fuzzy logic
Data visualization
Data Mining Patterns
Associations, patterns that occur
together
Sequences, patterns of actions that take
place over a period of time
Clustering, patterns that develop among
groups of people
Trends, the patterns that are noticed
over a period of time
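As a toy illustration of the association pattern, the sketch below counts how often pairs of items occur together. The shopping baskets are invented data, and simple pair counting is only a stand-in for what real data-mining software does.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: the set of items bought together each time.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk"},
]

pair_counts = Counter()
for basket in baskets:
    # Sort so that each pair is always counted under the same ordering.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

most_common_pair, count = pair_counts.most_common(1)[0]
```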