Download Chapter 1

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Big data wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Database wikipedia , lookup

Relational model wikipedia , lookup

Clusterpoint wikipedia , lookup

Functional Database Model wikipedia , lookup

Database model wikipedia , lookup

Transcript
Chapter 11
Data Administration,
Database Administration,
and Data Dictionaries
Fundamentals of Database Management Systems
by
Mark L. Gillenson, Ph.D.
University of Memphis
Presentation by: Amita Goyal Chin, Ph.D.
Virginia Commonwealth University
John Wiley & Sons, Inc.
Chapter Objectives

Define and compare data administration
and database administration.

List and describe the advantages of data
administration.

List and describe the advantages of
database administration.
11-2
Chapter Objectives

List and describe the responsibilities of
data administration.

List and describe the responsibilities of
database administration.
11-3
Chapter Objectives

Explain the concept of metadata.

List and describe such metadata
realizations as passive and active data
dictionaries, relational DBMS catalogs,
and data repositories.
11-4
Database Management

The “people side” of database
management consists of two parts:
 Data Administration
 Database Administration
11-5
Data Administration

A planning and analysis function that is
responsible for:
 setting
data policy and standards
 promoting the company’s data as a
competitive resource
 accounting for the use of data
 providing liaison support to systems analysts
during application development
11-6
Database Administration

More operationally oriented.

Responsible for the day-to-day monitoring and
management of the company’s various active
databases.

Responsible for providing liaison support to
program designers during application development.

Carries out many of the policies set by data
administration.
11-7
Advantages of Data and
Database Administration

Data as a shared corporate resources.
 Efficiency in job specialization.
 Operational management of data.
 Managing externally acquired databases.
 Managing data in the decentralized
environment.
11-8
Data as a Shared Corporate
Resource

Virtually all aspects of business have
become dependent on their information
systems and the data flowing through
them.

Data may well be the most important
corporate resource because it describes
all of the others.
11-9
Data as a Shared
Corporate Resource

The effective use of its data can give a company
a significant competitive advantage.

Data, like other resources, tends to be scarce.

The dedicated departments that manage the
company’s data are the data administration and
database administration departments.
11-10
Efficiency in Job
Specialization

Many of the functions involved in the
management of data are highly
specialized and require specific expertise.

This argues for a full-time staff of
specialists who do nothing but manage a
company’s data and databases.
11-11
Operational Management
of Data

At the operational level, for the day-to-day
management of the company’s production
databases, an independent department
must be responsible.

Data should be managed by an
independent group whose loyalty is to the
overall company and not to any individual
function.
11-12
Managing Externally
Acquired Databases

Some databases are not designed by a
company’s own personnel but are
acquired as part of purchased software
packages.

For example, Enterprise Resource
Planning (ERP) software.
11-13
Managing Data in the
Decentralized Environment

Permits user departments all over the
company to handle some or all of their
information systems needs on their own,
without having to rely on the central
information systems organization.

Decentralization is a fact of life to a greater
or lesser degree in virtually all companies.
11-14
Managing Data in the
Decentralized Environment

Many people are quite content to develop their
own databases on their PCs.

However, there is a very strong argument that
says that data and database administration are
even more important in a decentralized
environment than in a centralized one.

Most companies have a hybrid
centralized/decentralized environment.
11-15
The Responsibilities of Data
Administration








Data coordination.
Data planning.
Data standards.
Liaison to systems analysts and programmers.
Training.
Arbitration of disputes and usage authorization.
Documentation and publicity.
Data’s competitive advantage.
11-16
Data Coordination

In the centralized/decentralized environment,
with data and copies of data scattered among
mainframe computers, local area network
servers, and even PCs, the possibility of
inconsistency and error increases.

Data coordination becomes the job of the data
administrators, by which they maintain a
reasonable amount of control over the
company’s data.
11-17
Data Planning

Begins with determining what data will be
needed for future company business efforts and
the applications that will support them.

Related to strategic data planning is the matter
of what hardware and software will be needed to
support the company’s information systems
operations in the future.
11-18
Data Standards

For example, controlling the way that
attribute names, table names, and other
data related names are formed.

For example, insisting on consistency in
the way the programs that access the
database are written, especially in regard
to the database call instructions.
11-19
Liaison to Systems Analysts
and Programmers

Data administrators are responsible for providing
support to the systems analysts and
programmers in all matters concerning the data
needed by an application.

Data analysts are generally involved in database
design at some level, but the decision of what
that precise level of involvement should be is
dependent on a number of factors.
11-20
Training

Data administration may be responsible
for training all those in the company who
have a reason to understand the
company’s data and, in some cases, the
DBMS environment.
11-21
Data Ownership

Who owns the data?

Since data is a resource of value to the
company, the data “belongs” to the
company’s owners or stockholders.

Practically, data is controlled by its user or
primary user.
11-22
Data Ownership

If ownership has been established and a
new application requires the use of
existing data, then it is the job of data
administration to act as an intermediary
and approach the owner of the data with
the request for data sharing.
11-23
Arbitration of Disputes and
Usage Authorization

If there is a dispute over data sharing, then
the data administration group acts as an
arbitrator between the disagreeing parties.
11-24
Documentation and Publicity

The data management function is
responsible for documenting the data
environment.

Documentation includes:
a
description of the data and the databases,
plus programs, reports, and which people
have access to these items.
11-25
Documentation and Publicity

The data management group should
perform a publicity function, informing
potential users of what data already exists
in the database.

May help employees discover how to
automate more of their work.
11-26
Data’s Competitive Advantage

Data can provide a competitive advantage
for the company.

Data administrators are in a unique
position to understand how the company
“works.”
11-27
Data’s Competitive Advantage

It is the responsibility of the data
administration function to respond to
questions about how the company’s
business procedures can be adjusted or
modified to improve the company’s
operating efficiency.
11-28
The Responsibilities of
Database Administration

DBMS performance monitoring.
 DBMS troubleshooting.
 DBMS usage and security monitoring.
 Data dictionary operations.
 DBMS data and software maintenance.
 Database design.
11-29
Database Performance
Monitoring

One of the key functions performed by
database administration.

It is important to know how fast the various
applications are executing as part of
ensuring that response time requirements
are being met.
11-30
DBMS Troubleshooting

The database administrators should be the
troubleshooting interface.

Make an assessment of what went wrong
and coordinate the appropriate personnel
needed to fix it, including systems
programmers, application programmers,
and the data administrators themselves.
11-31
DBMS Usage and
Security Monitoring

Database administrators keep track of which
applications are running in the database
environment and can track who is accessing the
data in the database at any moment.

Security: making sure that only authorized
personnel access the data.

Usage: the need to maintain records on the
amount of use the various users make of the
database.
11-32
Data Dictionary Operations

The database administration group is
responsible for the operational aspects, as
opposed to the planning aspects, of the
data dictionary and any other metadata
tools.
11-33
DBMS Data and
Software Maintenance

These activities include:
 installing
new versions of the DBMS
 installing “fixes” or “patches” to the DBMS
 performing backup and recovery operations
 any other tasks related to repairing or
upgrading the DBMS or the database.
11-34
Database Design

There is a wide range in database administration
responsibilities for database design.

In the centralized environment, database
administration is responsible for physical
database design and possibly logical database
design.

In the decentralized environment, database
administration role is usually of a consultant.
11-35
Data Dictionaries

The IS function has been so busy developing
and running systems to support all the other
corporate functions that it took a long time
before it could invest the resources to develop
information systems to support itself.
11-36
Data Dictionaries

Metadata - data about data

Data dictionary - a database about data
11-37
An Example of Metadata
Salesperson Salesperson
Number
Name
SALESPERSON
Customer
Customer
Number
Name
CUSTOMER
Commi ssion
Percentage
Year of H ire
Salesperson
Number
Customer Employee Employee
Number
Number
Name
CUSTOMER EMPLOYEE
Product
Product
Number
Name
PRODUCT
Salesperson
Number
SALES
Office
Number
OFFICE
Product
Number
Telephone
HQ Cit y
Office
Number

General
Hardware
Company’s
database
Title
Unit Price
Quantit y
Size
11-38
A Simple Data Dictionary
Table
Table
Disk
Name
Length Number
Salesperson
500
A23
Customer
6,400
A23
Customer Employe e 127,000
A23
Product
83,000
A47
Sales
273,000
A47
Office
600
A47
.
.
.
(a) TABLES table.

Attribute
Attribute Attribute
Name
Type
Length
Salesperson Number
Numeric
3
Salesperson Name
Alphab etic
20
Commission Percentage Numeric
2
Year of Hire
Numeric
4
Customer Number
Numeric
4
Customer Name
Alphab etic
20
HQ City
Alphab etic
15
.
.
.
(b) ATTRIBUTES table.
Metadata stored in the data dictionary database.
11-39
Data Dictionary

In addition to tracking the basic facts about
the represented entities, a data dictionary
must keep track of the relationships
between the entities.
11-40
Data Dictionary
Table
Name
Salesperson
Salesperson
Salesperson
Salesperson
Customer
Customer
Customer
Customer
.
.
.
Attribute
Name
Salesperson Number
Salesperson Name
Commission Percentage
Year of Hire
Customer Number
Customer Name
Salesperson Number
HQ City

Represents the
many-to-many
relationship between
the tables and
attributes shown in
the data dictionary’s
TABLES table and
ATTRIBUTES table.
11-41
Passive and Active
Data Dictionaries

Passive Data Dictionary
 Used
for documentation purposes.
 Data about the entities in the IS environment
are entered into the dictionary and crossreferenced as one-to-many and many-tomany relationships.
 A self-contained database used for
documenting the IS environment.
11-42
Passive and Active
Data Dictionaries

Active Data Dictionary
 Interacts
with the IS environment on a realtime basis.
 Input into the data dictionary
 Output from the data dictionary
11-43
Data Dictionary:
Entities and Attributes

Data-Related Entities





Databases
Tables
Attributes
Web Pages
Software-Related Entities

Application Programs
 Database Management Systems
 Jobs
11-44
Data Dictionary:
Entities and Attributes

Hardware-Related Entities




Outputs



Computers
Disks
Local-Area Networks
Reports
Queries
People
11-45
Data Dictionary:
Relationships

Table (or file) Construction: Which attributes (or
fields) appear in which tables (or files).

Security: Which people have access to which
databases or tables or files.

Impact of Change: Which programs might be
affected by changes to which tables or files.
(Note: this has become much less of an issue
due to the data independence of relational
databases.)
11-46
Data Dictionary:
Relationships

Physical Residence: Which tables or files
are on which disks.

Program Data Requirements: Which
programs use which tables or files.

Responsibility: Which people are
responsible for updating which databases
or tables or files.
11-47
Uses and Users

The heaviest users of the data dictionary will be
IS management and the data administration and
database administration functions under them.

Systems analysts and program designers can
use the data dictionary:


as a source of information about what entities,
attributes, and so forth already exist in the IS
environment.
as a documentation device for new information.
11-48
Relational DBMS Catalogs

A highly active but limited-scope data
dictionary that is very closely tied in to the
operations of the relational DBMS.

Composed of relational tables and may be
queried with standard SQL commands.
11-49
Relational DBMS Catalogs

Database entity data stored includes:
 Databases
 Tables
 Attributes
 Views
 Indexes
 Users
 Disks
11-50
Relational DBMS Catalogs

Do not include such entities as reports and
nonrelational files.

The main purpose of the relational catalog is to
accurately support the relational query optimizer.

Provides a roadmap through the database data
for anyone who wants to query the data or
explore new ways to use the data.
11-51
Data Repositories

A large-scale data dictionary that includes
entity types generated and needed by the
latest IS technologies.

In the CASE environment, the data
repository holds the same types of data
that traditional data dictionaries hold, as
well as CASE-specific data such as
reusable code modules.
11-52
“Copyright 2004 John Wiley & Sons, Inc. All rights reserved. Reproduction or
translation of this work beyond that permitted in Section 117 of the 1976
United States Copyright Act without express permission of the copyright owner
is unlawful. Request for further information should be addressed to the
Permissions Department, John Wiley & Sons, Inc. The purchaser may make
back-up copies for his/her own use only and not for distribution or resale. The
Publisher assumes no responsibility for errors, omissions, or damages caused
by the use of these programs or from the use of the information contained
herein.”
11-53