Chapter 8 Home Page – Welcome!
Chapter Objectives
 Explain data design concepts and data
structures
 Describe file processing systems
 Explain database systems and define the
components of a database management system
(DBMS)
 Describe Web-based data design
2
Chapter Objectives
 Explain data design terminology, including
entities, fields, common fields, records, files,
tables, and key fields
 Describe data relationships, draw an entity-relationship
diagram, define cardinality, and use cardinality notation
 Explain the concept of normalization
 Explain the importance of codes and describe
various coding schemes
3
Chapter Objectives
 Describe relational and object-oriented
database models
 Explain data warehousing and data mining
 Differentiate between logical and physical
storage and records
 Explain data control measures
4
Introduction
 You will develop a physical plan for data
organization, storage, and retrieval
 Begins with a review of data design concepts
and terminology, then discusses file-based
systems and database systems, including Web-based databases
 Concludes with a discussion of data storage
and access, including strategic tools such as
data warehousing and data mining, physical
design issues, logical and physical records, data
storage formats, and data controls
5
Data Design Concepts
 Before constructing an information system, a
systems analyst must understand basic
design concepts, including data structures
and the characteristics of file processing and
database systems, including Web-based
database design
6
Data Design Concepts
 Data Structures
– A file or table contains data about people,
places or events that interact with the system
– File-oriented system
– File processing
– Database system
Figure 8-2
Figure 8-3
7
Data Design Concepts
 Overview of File Processing
– Companies mainly use file processing to
handle large volumes of structured data on a
regular basis
– Although much less common today, file
processing can be more efficient and cost-effective in certain situations
Figure 8-4
8
Data Design Concepts
 Overview of File Processing
– Potential problems
• Data redundancy
• Data integrity
• Rigid data structure
9
Data Design Concepts
 Overview of File Processing
– Various types of files
• Master file
• Table file
• Transaction file
• Work file
• Security file
• History file
10
Data Design Concepts
 The Evolution from File Systems to Database
Systems
– A properly designed database system offers a
solution to the problems of file processing
– Provides an overall framework that avoids
data redundancy and supports a real-time,
dynamic environment
Figure 8-5
11
Data Design Concepts
 The Evolution from File Systems to Database
Systems
– A database management system (DBMS) is
a collection of tools, features, and interfaces
that enables users to add, update, manage,
access, and analyze the contents of a
database
– The main advantage of a DBMS is that it
offers timely, interactive, and flexible data
access
12
Data Design Concepts
 The Evolution from File Systems to Database
Systems
– Advantages
• Scalability
• Better support for client/server systems
• Economy of scale
• Flexible data sharing
• Enterprise-wide application – database
administrator (DBA)
• Stronger standards
13
Data Design Concepts
 The Evolution from File Systems to Database
Systems
– Advantages
• Controlled redundancy
• Better security
• Increased programmer productivity
• Data independence
14
Data Design Concepts
 Database Tradeoffs
– Because DBMSs are powerful, they require
more expensive hardware, software, and data
networks capable of supporting a multi-user
environment
– More complex than a file processing system
– Procedures for security, backup, and
recovery are more complicated and critical
15
DBMS Components
 A DBMS provides an interface between a
database and users who need to access the
data
 In addition to interfaces for users, database
administrators, and related systems, a DBMS
also has a data manipulation language, a
schema and subschemas, and a physical data
repository
Figure 8-6
16
DBMS Components
 Interfaces for Users, Database
Administrators, and Related Systems
– Users
• Query language
• Query by example (QBE)
• SQL (structured query language)
– Database Administrators
• A DBA is responsible for DBMS management
and support
Figure 8-7
17
DBMS Components
 Interfaces for Users, Database
Administrators, and Related Systems
– Related information systems
• A DBMS can support several related information
systems that provide input to, and require
specific data from, the DBMS
18
DBMS Components
 Data Manipulation Language
– A data manipulation language (DML)
controls database operations, including
storing, retrieving, updating, and deleting
data
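The four DML operations named above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the in-memory database, the customer table, and the sample rows are assumptions and do not come from the chapter.

```python
import sqlite3

# Hypothetical in-memory database and table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Storing data (INSERT)
conn.execute("INSERT INTO customer VALUES (?, ?, ?)", (1, "Ana Ruiz", "Austin"))
conn.execute("INSERT INTO customer VALUES (?, ?, ?)", (2, "Ben Cole", "Boston"))

# Retrieving data (SELECT)
rows_after_insert = conn.execute(
    "SELECT customer_id, name, city FROM customer ORDER BY customer_id"
).fetchall()

# Updating data (UPDATE)
conn.execute("UPDATE customer SET city = ? WHERE customer_id = ?", ("Dallas", 1))

# Deleting data (DELETE)
conn.execute("DELETE FROM customer WHERE customer_id = ?", (2,))

rows_final = conn.execute("SELECT customer_id, name, city FROM customer").fetchall()
print(rows_after_insert)
print(rows_final)
```

The `?` placeholders let the DBMS handle the values separately from the SQL text, which is the usual way to pass user-supplied data.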
19
DBMS Components
 Schema
– The complete definition of a database,
including descriptions of all fields, tables,
and relationships, is called a schema
– You also can define one or more subschemas
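One way to picture the difference between a schema and a subschema is a table definition plus a view that exposes only some of its fields. The sketch below uses Python's sqlite3 module; the employee table, the employee_directory view, and the sample row are hypothetical. A SQLite view only limits what is visible through the view; it does not manage user permissions the way a full multi-user DBMS would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Schema: the full definition of the (hypothetical) employee table.
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    department  TEXT NOT NULL,
    salary      REAL NOT NULL
);
-- Subschema-like view: only the fields a directory user needs, no salary.
CREATE VIEW employee_directory AS
    SELECT employee_id, name, department FROM employee;
""")
conn.execute("INSERT INTO employee VALUES (1, 'Ana Ruiz', 'Sales', 52000.0)")

cursor = conn.execute("SELECT * FROM employee_directory")
directory_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()
print(directory_columns)
print(directory_rows)
```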
20
DBMS Components
 Physical Data Repository
– The data dictionary is transformed into a
physical data repository, which also contains
the schema and subschemas
– The physical repository might be centralized,
or distributed at several locations
– ODBC – open database connectivity
– JDBC – Java database connectivity
21
Web-Based Database Design
 The following sections discuss the
characteristics of Web-based design, Internet
terminology, connecting a database to the
Web, and data security on the Web
22
Web-Based Database Design
 Characteristics of Web-Based Design
– In a Web-based design, the Internet serves as
the front end, or interface for the database
management system. Internet technology
provides enormous power and flexibility
– Web-based systems are popular because they
offer ease of access, cost-effectiveness, and
worldwide connectivity
Figure 8-8
23
Web-Based Database Design
 Internet Terminology
– Web browser
– Web page
– HTML (Hypertext Markup Language)
– Tags
– Web server
– Web site
For more information about
HTML, visit
scsite.com/sad7e/more,
locate Chapter 8 and then
the HTML link.
24
Web-Based Database Design
 Internet Terminology
– Intranet
– Extranet
– Protocols
– Web-centric
– Clients
– Servers
25
Web-Based Database Design
 Connecting a Database to the Web
– Database must be connected to the Internet
or intranet
• Middleware
• Adobe ColdFusion
Figure 8-9
Figure 8-10
26
Web-Based Database Design
 Data Security
– Web-based data must be totally secure, yet
easily accessible to authorized users
– To achieve this goal, well-designed systems
provide security at three levels: the database
itself, the Web server, and the
telecommunication links that connect the
components of the system
27
Data Design Terminology
 Definitions
– Entity
– Table or file
– Field
• Attribute
• Common field
– Record
• Tuple
Figure 8-11
28
Data Design Terminology
 Key Fields
– Primary key
• Combination key
• Composite key
• Concatenated key
• Multi-valued key
Figure 8-12
29
Data Design Terminology
 Key Fields
– Candidate key
• Nonkey field
– Foreign key
– Secondary key
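The key types listed above can be sketched in a small table design. The example uses Python's sqlite3 module; the student and grade tables, their fields, and the sample data are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,   -- primary key
    email      TEXT NOT NULL UNIQUE,  -- candidate key that was not chosen as primary key
    last_name  TEXT NOT NULL          -- nonkey field, used as a secondary key below
);
CREATE TABLE grade (
    student_id INTEGER NOT NULL REFERENCES student(student_id),  -- foreign key
    course_id  TEXT NOT NULL,
    grade      TEXT,
    PRIMARY KEY (student_id, course_id)  -- combination (composite) primary key
);
-- Secondary key: a non-unique field indexed so records can be accessed by it.
CREATE INDEX idx_student_last_name ON student(last_name);
""")

conn.execute("INSERT INTO student VALUES (1, 'ana@example.edu', 'Ruiz')")
conn.execute("INSERT INTO student VALUES (2, 'luis@example.edu', 'Ruiz')")

# A secondary key need not be unique: both students share a last name.
matches = conn.execute(
    "SELECT student_id FROM student WHERE last_name = 'Ruiz' ORDER BY student_id"
).fetchall()

# PRAGMA table_info reports each column's position in the primary key (0 = not part of it).
pk_fields = [row[1] for row in conn.execute("PRAGMA table_info(grade)") if row[5] > 0]
print(matches, pk_fields)
```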
30
Data Design Terminology
 Referential Integrity
– Validity checks can help avoid data input
errors
– In a relational database, referential integrity
means that a foreign key value cannot be
entered in one table unless it matches an
existing primary key in another table
– Orphan
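Referential integrity can be demonstrated with Python's sqlite3 module, as a minimal sketch. The customer and orders tables and the `try_sql` helper are hypothetical. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other DBMS products may enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ana Ruiz')")
conn.execute("INSERT INTO orders VALUES (100, 1)")  # matches an existing primary key


def try_sql(sql):
    """Run a statement; return False if the DBMS rejects it as an integrity violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Foreign key 99 has no matching customer, so the insert is rejected.
orphan_insert_ok = try_sql("INSERT INTO orders VALUES (101, 99)")

# Deleting customer 1 would leave order 100 as an orphan, so it is rejected too.
parent_delete_ok = try_sql("DELETE FROM customer WHERE customer_id = 1")

print(orphan_insert_ok, parent_delete_ok)
```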
For more information about
Referential Integrity, visit
scsite.com/sad7e/more,
locate Chapter 8 and then
the Referential Integrity link.
Figure 8-13
31
Entity-Relationship Diagrams
 An entity is a person, place, thing, or event
for which data is collected and maintained
 Entity-relationship diagram (ERD)
 An ERD provides an overall view of the
system, and a blueprint for creating the
physical data structures
For more information about
Entity-Relationship Diagrams, visit
scsite.com/sad7e/more,
locate Chapter 8 and then the
Entity-Relationship Diagrams link.
32
Entity-Relationship Diagrams
 Drawing an ERD
– The first step is to list the entities that you
identified during the fact-finding process and
to consider the nature of the relationships
that link them
– A popular method is to represent entities as
rectangles and relationships as diamond
shapes
Figure 8-14
33
Entity-Relationship Diagrams
 Types of Relationships
– Three types of relationships can exist
between entities
– One-to-one relationship (1:1)
– One-to-many relationship (1:M)
– Many-to-many relationship (M:N)
• Associative entity
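A many-to-many relationship is commonly implemented with an associative entity that turns one M:N relationship into two 1:M relationships. The sketch below assumes hypothetical student, class, and registration tables and uses Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE class   (class_id TEXT PRIMARY KEY, title TEXT);
-- Associative entity: one row per student/class pairing.
CREATE TABLE registration (
    student_id INTEGER REFERENCES student(student_id),
    class_id   TEXT    REFERENCES class(class_id),
    PRIMARY KEY (student_id, class_id)
);
INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO class VALUES ('CS101', 'Intro to Computing'), ('MA201', 'Calculus');
INSERT INTO registration VALUES (1, 'CS101'), (1, 'MA201'), (2, 'CS101');
""")

# One student can register for many classes...
classes_for_ana = [row[0] for row in conn.execute(
    "SELECT class_id FROM registration WHERE student_id = 1 ORDER BY class_id")]

# ...and one class can have many students.
students_in_cs101 = [row[0] for row in conn.execute(
    "SELECT s.name FROM student s "
    "JOIN registration r ON r.student_id = s.student_id "
    "WHERE r.class_id = 'CS101' ORDER BY s.name")]

print(classes_for_ana, students_in_cs101)
```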
Figure 8-15
Figure 8-18
Figure 8-16
Figure 8-17
34
Entity-Relationship Diagrams
 Cardinality
• Cardinality notation
• Crow’s foot notation
• Unified Modeling Language (UML)
Figure 8-19
For more information about
Cardinality, visit
scsite.com/sad7e/more,
locate Chapter 8 and then
the Cardinality link.
Figure 8-20
Figure 8-21
35
Normalization
 Normalization
 Table design
 Involves four stages: unnormalized design,
first normal form, second normal form, and
third normal form
 Most business-related databases must be
designed in third normal form
For more information about
Normalization, visit
scsite.com/sad7e/more,
locate Chapter 8 and then
the Normalization link.
36
Normalization
 Standard Notation Format
– Designing tables is easier if you use a standard
notation format to show a table’s structure,
fields, and primary key
Example: NAME (FIELD 1, FIELD 2, FIELD 3)
37
Normalization
 Repeating Groups and Unnormalized Design
– Repeating group
• Often occur in manual documents prepared by
users
– Unnormalized
Figure 8-22
38
Normalization
 First Normal Form
– A table is in first normal form (1NF) if it
does not contain a repeating group
– To convert, you must expand the table’s
primary key to include the primary key of
the repeating group
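The conversion to 1NF can be sketched in Python. The order data below is invented for illustration: each unnormalized order carries a repeating group of products, and the 1NF version has one row per order/product pair, so the primary key expands from the order number alone to the combination of order number and product number.

```python
# Hypothetical unnormalized design: each order holds a repeating group of products.
unnormalized_orders = [
    {"order_num": 101, "order_date": "2024-03-01",
     "products": [{"product_num": "P1", "qty": 2}, {"product_num": "P2", "qty": 1}]},
    {"order_num": 102, "order_date": "2024-03-02",
     "products": [{"product_num": "P1", "qty": 5}]},
]


def to_first_normal_form(orders):
    """Emit one flat row per occurrence of the repeating group."""
    rows = []
    for order in orders:
        for item in order["products"]:
            rows.append({
                "order_num": order["order_num"],
                "order_date": order["order_date"],
                "product_num": item["product_num"],
                "qty": item["qty"],
            })
    return rows


first_nf_rows = to_first_normal_form(unnormalized_orders)

# The expanded primary key (order_num + product_num) identifies every row uniquely.
primary_keys = [(row["order_num"], row["product_num"]) for row in first_nf_rows]
print(primary_keys)
```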
Figure 8-23
39
Normalization
 Second Normal Form
– To understand second normal form (2NF),
you must understand the concept of
functional dependence
– Field X is functionally dependent on field Y
if the value of field X depends on the value
of field Y
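Functional dependence can be tested against sample data, as in the sketch below. The function name `is_functionally_dependent` and the sample rows are assumptions. A check like this can only disprove a dependency in the data at hand; whether a dependency really holds is a business rule, not something sample rows can prove.

```python
def is_functionally_dependent(rows, dependent, determinant):
    """Return False if any determinant value maps to two different dependent values."""
    seen = {}
    for row in rows:
        key = row[determinant]
        value = row[dependent]
        if key in seen and seen[key] != value:
            return False
        seen[key] = value
    return True


# Hypothetical 1NF order rows.
rows = [
    {"order_num": 101, "order_date": "2024-03-01", "product_num": "P1", "qty": 2},
    {"order_num": 101, "order_date": "2024-03-01", "product_num": "P2", "qty": 1},
    {"order_num": 102, "order_date": "2024-03-02", "product_num": "P1", "qty": 5},
]

# The order date depends on the order number: each order has exactly one date.
date_depends_on_order = is_functionally_dependent(rows, "order_date", "order_num")

# The quantity does not depend on the order number alone: order 101 has two quantities.
qty_depends_on_order = is_functionally_dependent(rows, "qty", "order_num")

print(date_depends_on_order, qty_depends_on_order)
```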
40
Normalization
 Second Normal Form
– A standard process exists for converting a
table from 1NF to 2NF
1. First, create and name a separate table for each
field in the existing primary key
2. Next, create a new table for each possible
combination of the original primary key fields
3. Finally, study the three tables and place each
field with its appropriate primary key
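The three steps above can be sketched in Python for a table whose primary key combines an order number and a product number. The field names and sample rows are hypothetical; dictionaries stand in for tables, with the dictionary key playing the role of the primary key.

```python
# Hypothetical 1NF rows: (order_num, order_date, product_num, description, qty)
first_nf_rows = [
    (101, "2024-03-01", "P1", "Widget", 2),
    (101, "2024-03-01", "P2", "Gadget", 1),
    (102, "2024-03-02", "P1", "Widget", 5),
]

orders = {}       # step 1: table keyed by order_num
products = {}     # step 1: table keyed by product_num
order_lines = {}  # step 2: table keyed by the combination (order_num, product_num)

# Step 3: place each field with the key it fully depends on.
for order_num, order_date, product_num, description, qty in first_nf_rows:
    orders[order_num] = order_date           # order_date depends only on order_num
    products[product_num] = description      # description depends only on product_num
    order_lines[(order_num, product_num)] = qty  # qty depends on the entire key

print(orders)
print(products)
print(order_lines)
```

After the split, each order date and each product description is stored once, so changing a description means updating a single row.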
Figure 8-24
41
Normalization
 Second Normal Form
– Four kinds of problems found in 1NF designs
do not exist in 2NF designs
• Consider the work necessary to change a
particular product’s design
• 1NF tables can contain inconsistent data
• Adding a new product is a problem
• Deleting a product is a problem
42
Normalization
 Third Normal Form
– 3NF design avoids redundancy and data
integrity problems that still can exist in 2NF
designs
– A table design is in third normal form (3NF)
if it is in 2NF and if no nonkey field is
dependent on another nonkey field
Figure 8-25
43
Normalization
 Third Normal Form
– To convert the table to 3NF, you must
remove all fields from the 2NF table that
depend on another nonkey field and place
them in a new table that uses the nonkey
field as a primary key
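The 3NF conversion can be sketched in Python. The customer table below is a hypothetical 2NF design in which the sales rep name depends on the sales rep number, a nonkey field, rather than on the primary key.

```python
# Hypothetical 2NF table: sales_rep_name depends on sales_rep_num, a nonkey field.
customers_2nf = [
    {"customer_num": 1, "customer_name": "Ana Ruiz", "sales_rep_num": 10, "sales_rep_name": "Lee"},
    {"customer_num": 2, "customer_name": "Ben Cole", "sales_rep_num": 10, "sales_rep_name": "Lee"},
    {"customer_num": 3, "customer_name": "Cy Dunn", "sales_rep_num": 20, "sales_rep_name": "Park"},
]

# Remove the dependent field from the original table...
customers_3nf = [
    {field: value for field, value in row.items() if field != "sales_rep_name"}
    for row in customers_2nf
]

# ...and place it in a new table that uses the nonkey field as its primary key.
sales_reps = {row["sales_rep_num"]: row["sales_rep_name"] for row in customers_2nf}

print(customers_3nf)
print(sales_reps)
```

The sales rep number stays in the customer table, where it now acts as a foreign key to the new table.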
Figure 8-26
44
Normalization
 A Normalization Example
– To show the normalization process, consider
the familiar situation in Figure 8-27 which
might depict several entities in a school
advising system: ADVISOR, COURSE, and
STUDENT
Figure 8-27
Figure 8-28
Figure 8-29
Figure 8-30
Figure 8-31
Figure 8-32
Figure 8-33
45
Using Codes During System Design
 Overview of Codes
– Because codes often are used to represent
data, you encounter them constantly in your
everyday life
– They save storage space and costs, reduce
transmission time, and decrease data entry
time
– Can reduce data input errors
Figure 8-34
46
Using Codes During System Design
 Types of Codes
1. Sequence codes
2. Block sequence codes
3. Alphabetic codes
a. Category codes
b. Abbreviation codes – mnemonic codes
Figure 8-35
47
Using Codes During System Design
 Types of codes
4. Significant digit codes
5. Derivation codes
6. Cipher codes
7. Action codes
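As an illustration of one code type from the list, a derivation code combines pieces of several item attributes into a single code. The attributes and lengths chosen below (five ZIP code digits, three letters of the last name, up to four characters of the street number) are assumptions for the sketch, not a scheme defined in the chapter.

```python
def derivation_code(zip_code: str, last_name: str, street_number: str) -> str:
    """Build a hypothetical subscriber code from parts of three attributes."""
    return f"{zip_code[:5]}{last_name[:3].upper()}{street_number[:4]}"


code = derivation_code("45999", "Smith", "1234")
print(code)
```

A scheme like this does not guarantee unique codes on its own: two people with similar names at similar addresses could collide, so a real design would still need a uniqueness check.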
Figure 8-36
Figure 8-37
48
Using Codes During System Design
 Developing a Code
1. Keep codes concise
2. Allow for expansion
3. Keep codes stable
4. Make codes unique
5. Use sortable codes
49
Using Codes During System Design
 Developing a Code
6. Avoid confusing codes
7. Make codes meaningful
8. Use a code for a single purpose
9. Keep codes consistent
50
Steps in Database Design
1. Create the initial ERD
2. Assign all data elements to entities
3. Create 3NF designs for all tables
4. Verify all data dictionary entries
– After creating your final ERD and normalized
table designs, you can transform them into a
database
Figure 8-38
Figure 8-39
Figure 8-40
51
Database Models
 Relational Databases
– The relational model was introduced during
the 1970s and became popular because it
was flexible and powerful
– Because all the tables are linked, a user can
request data that meets specific conditions
– New entities and attributes can be added at
any time without restructuring the entire
database
Figure 8-41
Figure 8-42
52
Database Models
 Object-Oriented Databases
– Many systems developers are using object-oriented database (OODB) design as a natural
extension of the object-oriented analysis
process
• Object Database Management Group (ODMG)
• Each object has a unique object identifier
Figure 8-43
Figure 8-44
53
Data Storage and Access
 Data storage and access involve strategic
business tools
 Strategic tools for data storage and access
– Data warehouse – dimensions
– Data mart
For more information about
Data Warehousing, visit
scsite.com/sad7e/more,
locate Chapter 8 and then
the Data Warehousing link.
Figure 8-45
Figure 8-46
54
Data Storage and Access
 Strategic tools for data storage and access
– Data Mining
• Increase average pages viewed per session
• Increase number of referred customers
• Reduce clicks to close
• Increase checkouts per visit
• Increase average profit per checkout
• Clickstream storage – market basket analysis
For more information about
Data Mining, visit
scsite.com/sad7e/more,
locate Chapter 8 and then
the Data Mining link.
Figure 8-47
55
Data Storage and Access
 Logical and Physical Storage
– Logical storage
• Characters
• Data element or data item
• Logical record
– Physical storage
• Physical record or block
• Buffer
• Blocking factor
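The blocking factor is the number of logical records stored in one physical record (block). The arithmetic can be sketched as follows; the block and record sizes are made-up values, and the calculation assumes fixed-length records with no per-block overhead.

```python
def blocking_factor(block_size_bytes: int, logical_record_bytes: int) -> int:
    """Number of whole logical records that fit in one physical record (block)."""
    if logical_record_bytes <= 0 or block_size_bytes < logical_record_bytes:
        raise ValueError("record must be positive in size and fit within a block")
    return block_size_bytes // logical_record_bytes


def blocks_needed(record_count: int, factor: int) -> int:
    """Number of physical records needed to hold record_count logical records."""
    return -(-record_count // factor)  # ceiling division


factor = blocking_factor(4096, 200)  # hypothetical 4096-byte block, 200-byte records
print(factor, blocks_needed(1000, factor), blocks_needed(1001, factor))
```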
56
Data Storage and Access
 Data Storage Formats
– Binary digits
– Bit
– Byte
– EBCDIC and ASCII
– Unicode - internationalize
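The difference between these coding methods can be seen by encoding the same character in Python. In this sketch, `cp037` is one of several EBCDIC code pages that Python supports and is chosen here only as an example; UTF-8 and UTF-16 are two common Unicode encodings.

```python
letter = "A"
ascii_bytes = letter.encode("ascii")   # ASCII stores "A" as hexadecimal 41
ebcdic_bytes = letter.encode("cp037")  # EBCDIC (code page 037) stores "A" as hexadecimal C1

accented = "é"
utf8_bytes = accented.encode("utf-8")       # two bytes in UTF-8
utf16_bytes = accented.encode("utf-16-be")  # two bytes in big-endian UTF-16

# ASCII has no code for characters outside its 128-character set.
try:
    accented.encode("ascii")
    ascii_can_store_accent = True
except UnicodeEncodeError:
    ascii_can_store_accent = False

print(ascii_bytes.hex(), ebcdic_bytes.hex(), utf8_bytes.hex(), utf16_bytes.hex())
print(ascii_can_store_accent)
```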
Figure 8-48
57
Data Storage and Access
 Data Storage Formats
– Binary
• Binary storage format
• Integer format
• Long integer format
• Other binary formats exist for efficient storage of
exceedingly long numbers
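The space savings of binary storage can be sketched with Python's struct module. The sketch assumes a 16-bit integer format and a 32-bit long integer format, which are common sizes but vary between products, and the number 12345 is an arbitrary example.

```python
import struct

number = 12345

as_characters = str(number).encode("ascii")  # one byte per digit: 5 bytes
as_integer = struct.pack(">h", number)       # 16-bit signed integer: 2 bytes
as_long_integer = struct.pack(">i", number)  # 32-bit signed integer: 4 bytes

# A 16-bit signed integer holds only -32768 through 32767.
try:
    struct.pack(">h", 40000)
    fits_in_integer = True
except struct.error:
    fits_in_integer = False

print(len(as_characters), len(as_integer), len(as_long_integer), fits_in_integer)
```

Here `>` selects big-endian byte order with standard sizes, so the results do not depend on the machine running the code.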
58
Data Storage and Access
 Selecting a Data Storage Format
– In many cases, a user can select a specific data
storage format
– For example, when using Microsoft Office,
you can store documents, spreadsheets, and
databases in Unicode-compatible form by
using the font called Arial Unicode MS
– The best format depends on the situation
59
Data Storage and Access
 Date Fields
– Most date formats now are based on the model
established by the International Organization
for Standardization (ISO)
– Can be sorted easily and used in comparisons
– Absolute date
– Best method depends on how the specific date
will be printed, displayed or used in a
calculation
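Both date approaches can be sketched with Python's datetime module. ISO-style dates (YYYY-MM-DD) sort correctly even as plain text, while an absolute date, a count of days from a fixed reference date, makes calculations simple. The reference date of January 1, 1900 used below is an assumption for the sketch; systems differ in the reference date they use.

```python
from datetime import date

REFERENCE_DATE = date(1900, 1, 1)  # hypothetical reference point for absolute dates


def absolute_date(d: date) -> int:
    """Number of days elapsed since the reference date."""
    return (d - REFERENCE_DATE).days


# ISO format: year first, so text order matches chronological order.
iso_dates = [date(2024, 3, 1).isoformat(), "2023-12-31", "2024-01-15"]
sorted_iso = sorted(iso_dates)

# Absolute dates: subtracting two of them gives the days between the dates.
days_between = absolute_date(date(2024, 3, 1)) - absolute_date(date(2024, 2, 1))

print(sorted_iso, days_between)
```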
Figure 8-49
60
Data Control
 File and database control must include all
measures necessary to ensure that data storage
is correct, complete, and secure
 A well-designed DBMS must provide built-in
control and security features, including
subschemas, passwords, encryption, audit trail
files, and backup and recovery procedures to
maintain data
For more information about
Data Control, visit
scsite.com/sad7e/more,
locate Chapter 8 and then
the Data Control link.
Figure 8-50
61
Data Control
 User ID
 Password
 Permissions
 Encryption
62
Data Control
 Backup
 Recovery procedures
 Audit log files
 Audit fields
63
Chapter Summary
 Files and tables contain data about people,
places, things, or events that affect the
information system
 DBMS designs are more powerful and flexible
than traditional file-oriented systems
64
Chapter Summary
 Data design tasks include creating an initial
ERD; assigning data elements to an entity;
normalizing all table designs; and completing
the data dictionary entries for files, records, and
data elements
 A code is a set of letters or numbers used to
represent data in a system
 The most common database models are
relational and object-oriented
65
Chapter Summary
 Logical storage is information seen through a
user’s eyes, regardless of how or where that
information actually is organized or stored
 Physical storage is hardware-related and
involves reading and writing blocks of binary
data to physical media
 File and database control measures include
limiting access to the data, data encryption,
backup/recovery procedures, audit-trail files,
and internal audit fields
66
Test Yourself
1. List 2 database management system (DBMS)
components.
67
Test Yourself
1. List 2 database management system (DBMS)
components.
1. Interfaces
2. Data manipulation language
3. Schema
4. Physical data repository
68
Test Yourself
2. True/False: Some amount of data
duplication is permissible in file-oriented
systems, but is avoided in database systems.
69
Test Yourself
2. True/False: Some amount of data
duplication is permissible in file-oriented
systems, but is avoided in database systems.
True
70
Test Yourself
3. Match the terms on the left with the correct
definitions on the right.
1. Primary key
2. Secondary key
3. Foreign key
4. Candidate key
a. A non-unique field or combination of
fields that can be used to access
records.
b. A field or combination of fields that
uniquely and minimally identifies a
particular entity.
c. Any field that could serve as a
primary key is called this term.
d. A field in one table that must match a
primary key value in another table to
establish a relationship.
71
Test Yourself
3. Match the terms on the left with the correct
definitions on the right.
1. Primary key: b
2. Secondary key: a
3. Foreign key: d
4. Candidate key: c
72
Test Yourself
4. Describe the relationship by filling in the
diamond and labeling the cardinality (i.e.
1:1, 1:M, M:N)
Student ______ Book
TV Station ______ Movies
Social Security Number ______ Person
73
Test Yourself
4. Describe the relationship by filling in the
diamond and labeling the cardinality (i.e.
1:1, 1:M, M:N)
Student (1) Owns (M) Book
TV Station (M) Shows (N) Movies
Social Security Number (1) Assigned to (1) Person
74
Test Yourself
5. True/False: Cardinality is a graphical model
of the information system that depicts the
relationships among system entities.
75
Test Yourself
5. True/False: Cardinality is a graphical model
of the information system that depicts the
relationships among system entities.
False
76
Test Yourself
6. Briefly explain the following normalization
rules:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
77
Test Yourself
6. Briefly explain the following normalization
rules:
First Normal Form (1NF) - A record is in 1NF if it has no
repeating groups
Second Normal Form (2NF) - To be in 2NF, a record
must be in 1NF, and all nonkey fields must be functionally
dependent on the entire primary key
Third Normal Form (3NF) - In 3NF, all nonkey fields are
functionally dependent on the primary key, the entire
key, and nothing but the key
78
Test Yourself
7. The relationship between an ORDER DATE
and an ORDER # illustrates the concept of
_______ _________.
79
Test Yourself
7. The relationship between an ORDER DATE
and an ORDER # illustrates the concept of
functional dependence.
80
Test Yourself
8. List the 4 steps of database design that are
followed after normalizing record designs or
tables.
81
Test Yourself
8. List the 4 steps of database design that are
followed after normalizing record designs or
tables.
1. Create the initial ERD
2. Assign all data elements to entities
3. Create 3NF designs for all records, taking care
to identify all primary, secondary, and foreign
keys
4. Verify all data dictionary entries
82
Test Yourself
9. A logical record/physical record is the
smallest unit of data that is accessed by the
operating system.
83
Test Yourself
9. A physical record is the smallest unit of data
that is accessed by the operating system.
84
Test Yourself
10. True/False: Date formats now are based on
the model established by the International
Organization for Standardization.
85
Test Yourself
10. True/False: Date formats now are based on
the model established by the International
Organization for Standardization.
True.
86
Systems Analysis & Design
7th Edition
End Chapter 8