Download Chapter 1

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Extensible Storage Engine wikipedia , lookup

Big data wikipedia , lookup

Database wikipedia , lookup

Relational model wikipedia , lookup

Clusterpoint wikipedia , lookup

Functional Database Model wikipedia , lookup

Object-relational impedance mismatch wikipedia , lookup

Database model wikipedia , lookup

Transcript
Chapter 4
The Database Management
System Concept
Fundamentals of Database Management Systems
by
Mark L. Gillenson, Ph.D.
University of Memphis
Presentation by: Amita Goyal Chin, Ph.D.
Virginia Commonwealth University
John Wiley & Sons, Inc.
Chapter Objectives

Discuss the problems encountered in a
nondatabase information systems
environment.

List the five basic principles of the
database concept.

Describe how data can be considered to
be a manageable resource.
4-2
Chapter Objectives

List the three problems created by data
redundancy.

Describe the nature of data redundancy
among many files.

Explain the relationship between data
integration and data redundancy in one
file.
4-3
Chapter Objectives

State the primary defining feature of a
database management system.

Explain why the ability to store multiple
relationships is an important feature of the
database approach.
4-4
Chapter Objectives

Explain why providing support for such
control issues as data security, backup
and recovery, and concurrency is an
important feature of the database
approach.

Explain why providing support for data
independence is an important feature of
the database approach.
4-5
Data Processing Systems

Data was stored in different formats in different
files.

Data was often not shared among different
programs that needed it, necessitating the
duplication of data in redundant files.

Little was understood about file design, resulting
in redundant data within individual files.
4-6
Data Processing Systems

Files often could not be rebuilt after being
damaged by a software error or a
hardware failure.

Data was not secure and was vulnerable
to theft or malicious mischief by people
inside or outside of the company.
4-7
Data Processing Systems

Programs were usually written in such a
manner that if the way that the data was
stored changed, the program had to be
modified to continue working.

Changes in everything from access
methods to tax tables required
programming changes.
4-8
The Database Concept

1. Data as a Manageable Resource

2. Data Integration and Data Redundancy

3. Multiple Relationships

4. Data Control Issues

5. Data Independence
4-9
The Database Concept

Data as a Manageable Resource - The
creation of a data-centric environment in
which a company’s data can truly be
thought of as a significant corporate
resource. A key feature of this environment
is the ability to share data among those
inside and outside of the company who
require access to it.
4-10
The Database Concept

Data Integration and Data Redundancy - The
ability to achieve data integration while at the
same time storing data in a nonredundant
fashion. This, alone, is the central, defining
feature of the database approach.

Multiple Relationships - The ability to store data
representing entities involved in multiple
relationships without introducing data
redundancy or other structural problems.
4-11
The Database Concept

Data Control Issues - The establishment of
an environment that manages certain data
control issues, such as data security,
backup and recovery, and concurrency
control.

Data Independence - The establishment of
an environment that permits a high degree
of data independence.
4-12
Data as a Manageable
Resource

Information systems environment:
 Hardware
 Networks
 Applications
software
 Systems software
 People
 Data
4-13
Data as a Manageable
Resource

Historically, data has not been the highest
priority concern.

As the operational, tactical, and strategic
corporate levels became more dependent
on information systems, data increasingly
became recognized as an important
corporate resource.
4-14
Data as a Manageable
Resource

The corporate community became
increasingly convinced that a firm’s data
could provide a significant competitive
advantage to the firm.

It became clear that data would have to be
managed in an organized way.
4-15
Data as a Manageable
Resource
4-16
Data as a Manageable
Resource

Needed a software utility that could
manage and protect data.

Data could be a critical corporate
resource.

Out of this need was born the database
management system.
4-17
Data as a Manageable
Resource

Reengineering - data is aggressively used
to redesign business processes.

Electronic commerce - a database at the
heart of every web site; allows companies
and customers to conduct business.

ERP systems - collections of application
programs build around a central database.
4-18
Data Integration and Data
Redundancy

Data integration - the ability to tie together
pieces of related data within an
information system.

Data redundancy - the same fact about the
business environment is stored more than
once within an information system.
4-19
Data Redundancy - Problems

Redundant data takes up a great deal of
extra disk space.

If the redundant data has to be updated, it
takes additional time to do so. This can be
a major performance issue.

There is the potential for data integrity
problems.
4-20
Data Integrity

Refers to the accuracy of the data.

Inaccurate data leaves the whole
information system of limited value.
4-21
Data Redundancy,
Data Integrity

When all copies of redundant data are not
updated consistently, a data integrity problem
exists.
4-22
Three Files with Redundant
Data
Sales file
Customer
Number
2746795
Customer
Name
John Jones
Customer
Address
123 Elm Street
Accounts Receivable file
Customer
Customer
Name
Number
2746795
John Jones
Customer
Address
123 Elm Street
Credit file
Customer
Number
2746795
Customer
Address
123 Elm Street
Customer
Name
John Jones
4-23
Three Files with a Data
Integrity Problem
Sales file
Customer
Number
2746795
Customer
Name
John Jones
Customer
Address
456 Oak Street
Accounts Receivable file
Customer
Customer
Name
Number
2746795
John Jones
Customer
Address
456 Oak Street
Credit file
Customer
Number
2746795
Customer
Address
123 Elm Street
Customer
Name
John Jones
4-24
Salesperson
Number
137
186
204
361
Customer
Number
0121
0839
0933
1047
1525
1700
1826
2198
2267
(a) Salesperson file
Salesperson Commission
Name
Percentage
Baker
10
Adams
15
Dickens
10
Carlyle
20
(b) Customer file
Customer
Salesperson
Name
Number
Main St.
137
Hardware
Jane’s Stores
186
ABC Home
137
Stores
Acme
137
Hardware Store
Fred’s Tool
361
Stores
XYZ Stores
361
City Hardware
137
Western
204
Hardware
Central Stores
186
Year Of Hire
1995
2001
1998
2001

General
Hardware
Company
Files
HQ City
New York
Chicago
Los Ange les
Los Ange les
Atlanta
Washington
New York
New York
New York
4-25
General Hardware Company
Combined File
0121 Main St.
Hardware
0839 JaneΥsStores
0933 ABC Home
Stores
1047 Acme
Hardware
Store
1525 FredΥsTool
Stores
1700 XYZ Stores
1826 City Hardware
2198 Western
Hardware
2267 Central Stores
137 New York
137 Baker
10 1995
186 Chicago
186 Adams
137 Los Ange les 137 Baker
15 2001
10 1995
137 Los Ange les 137 Baker
10 1995
361 Atlanta
20 2001
361 Carlyle
361 Washington 361 Carlyle 20 2001
137 New York
137 Baker
10 1995
204 New York
204 Dickens 10 1998
186 New York
186 Adams
15 2001
4-26
Anomalies

Typically occur in poorly structured files.

Problems arise when two different kinds of
data, like salesperson and customer data
are merged into one file.
4-27
Anomalies

Deletion Anomaly - e.g, if you delete a customer and that
record was the only one for a salesperson, the
salesperson’s data is gone.

Insertion Anomaly - e.g., General Hardware cannot add
data about a new salesperson the company just hired
until she is assigned at least one customer.

Update Anomaly - redundant data in the database file
must be updated each place it exists when it changes
4-28
Database Management
System

A software utility for storing and retrieving
data that gives the end-user the
impression that the data is well integrated
even though the data can be stored with
no redundancy at all.
4-29
Multiple Relationships Horizontal Solution
Salesperson Salesperson Commission Year
Number
Name
Percentage
Of
Hire
137
Baker
10
1995
186
Adams
15
2001
204
Dickens
10
1998
361
Carlyle
20
2001
(a) Salesperson file.
Customer
Number
0121
0839
0933
1047
1525
1700
1826
2198
2267
Customer
Name
Main St. Hardware
Jane’s Stores
ABC Home Stores
Acme Hardware Store
Fred’s Tool Stores
XYZ Stores
City Hardware
Western Hardware
Central Stores
(b) Customer file.
Customer
Numbers
0121, 0933, 1047, 1826
0839, 2267
2198
1525, 1700
HQ City
New York
Chicago
Los Ange les
Los Ange les
Atlanta
Washington
New York
New York
New York
4-30
Multiple Relationships Vertical Solution
4-31
Data Control Issues

Data security

Backup and Recovery

Concurrency Control
4-32
Computer Security

A very broad topic
 Protecting the physical hardware
environment
 Defending against hacker attacks
 Encrypting data transmitted over networks
 Educating employees on the importance of
protecting the company’s data
 … and many more
4-33
Backup and Recovery

Data can be lost or corrupted in any of a
variety of ways:
a
disaster such as a fire, a hurricane, or an
earthquake
 hackers
 computer viruses
 poorly written application programs
 unintentional error
4-34
Concurrency Problem

Updates to a database can interfere with
each other in such a way that the resulting
data values will be incorrect.

A database management system must be
designed to protect its databases from
such an eventuality.
4-35
Data Independence

Data Dependence - if for any reason the storage
characteristics of the data had to be changed,
the application program itself had to be modified,
often extensively.

Data Independence - to have a data storage and
programming environment in which as many
types of changes in the data structure as
possible would not require changes in the
application programs that use them.
4-36
Major DBMS Approaches

Hierarchical - navigational

Network - navigational

Relational - became commercially viable in
about 1980.

Object-oriented - useful for a variety of niche
applications.
4-37
Hierarchical & Network

Called navigational approaches because
of the way that programs have to
“navigate” through hierarchies and
networks of data to find needed data.

Developed in the 1960s and

Somewhat similar in structure
4-38
Hierarchical & Network

Suitable only for mainframe computers

were an elegant solution to the
redundancy/integration problem at the time

Complex, difficult to work

Now considered legacy systems
4-39
Relational Database

Became commercially viable in about
1980

Soon became the preferred DBMS
approach and it has remained so ever
since.
4-40
Object-oriented

Has proven useful for a variety of niche
applications

It is interesting to note that some of the
key object-oriented database concepts
have found their way into some of the
mainstream relational DBMSs and some
are described as taking a hybrid
object/relational approach to database.
4-41
“Copyright 2004 John Wiley & Sons, Inc. All rights reserved. Reproduction or
translation of this work beyond that permitted in Section 117 of the 1976
United States Copyright Act without express permission of the copyright owner
is unlawful. Request for further information should be addressed to the
Permissions Department, John Wiley & Sons, Inc. The purchaser may make
back-up copies for his/her own use only and not for distribution or resale. The
Publisher assumes no responsibility for errors, omissions, or damages caused
by the use of these programs or from the use of the information contained
herein.”
4-42