Stair Principles-Chapter 5 - University of Illinois at Chicago

CHAPTER 5
Organizing Data and Information
The Hierarchy of Data
Database
Collection of data organized to meet users’ needs
Database management system (DBMS)
Software consisting of a group of programs that
manipulate the database and provide an interface between
the database and the application programs
The Hierarchy of Data
Data is generally organized in a hierarchy that begins with the smallest piece of data (a bit) and progresses through the hierarchy to a database.
The Hierarchy of Data
Character
Basic building block of information, represented by a byte (eight bits, each either 0 or 1)
Field
A name, number, or combination of characters that
describes an aspect of a business activity
The Hierarchy of Data
Record
Collection of related fields
File
Collection of related records
Database
Collection of integrated and related files
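The levels of the hierarchy can be sketched with ordinary Python values. This is only an illustration: the employee names, field names, and salary figure are hypothetical and do not come from the slides.

```python
# Hypothetical sample data; all names and values are illustrative only.

# Character: one byte, which is eight bits.
character = "A"
bits = format(ord(character), "08b")  # bit pattern of the byte that encodes "A"

# Field: a group of characters describing one aspect of a business activity.
last_name = "Fiske"

# Record: a collection of related fields.
record = {"employee_id": 1001, "last_name": last_name, "hire_date": "2001-01-05"}

# File: a collection of related records.
personnel_file = [
    record,
    {"employee_id": 1002, "last_name": "Buckley", "hire_date": "1997-02-17"},
]

# Database: a collection of integrated and related files.
database = {
    "personnel": personnel_file,
    "payroll": [{"employee_id": 1001, "salary": 52000}],
}

print(bits)                        # the eight bits of one character
print(len(database["personnel"]))  # number of records in the personnel file
```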
Data Entities, Attributes, and Keys
Entity
Generalized class of people, places, or things for which
data is collected, stored, and maintained
Attribute
Characteristic of an entity
Data item
Specific value of an attribute
Data Entities, Attributes, and Keys
Key
A field or set of fields in a record that is used to identify
the record
Primary key
A field or set of fields that uniquely identifies the record
Secondary key
A field in a record that does not uniquely identify the record
Keys and Attributes
[Figure: key field, attributes, and entities (records)]
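The difference between a primary key and a secondary key can be sketched with Python's built-in `sqlite3` module. The `employee` table, its columns, and its rows are hypothetical. Indexing the secondary key is one common choice, not something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY,"  # primary key: unique for each record
    " last_name TEXT NOT NULL,"
    " department TEXT NOT NULL)"         # used below as a secondary key
)
# An index speeds up lookups on the secondary key without making it unique.
conn.execute("CREATE INDEX idx_employee_department ON employee (department)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Fiske", "Sales"), (2, "Buckley", "Sales"), (3, "Johns", "Audit")],
)

# A primary-key lookup identifies at most one record.
by_primary = conn.execute(
    "SELECT last_name FROM employee WHERE employee_id = 2"
).fetchall()

# A secondary-key lookup may match several records.
by_secondary = conn.execute(
    "SELECT last_name FROM employee WHERE department = 'Sales' ORDER BY employee_id"
).fetchall()

# Reusing an existing primary-key value is rejected by the database.
try:
    conn.execute("INSERT INTO employee VALUES (1, 'Duplicate', 'Sales')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```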
The Traditional Approach to Data Management
[Figure 5.3]
Flaws in the Traditional Approach
Data redundancy
Duplication of data in separate files
Data integrity
The degree to which the data in any one file is accurate
Program-data dependence
Potential for incompatible programs and data between
applications
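A minimal sketch of how data redundancy undermines data integrity, assuming two hypothetical application files that each keep their own copy of a customer's address:

```python
# Two application-specific files, each holding its own copy of the same address.
# The customer number, name, and addresses are hypothetical.
billing_file = {101: {"name": "Acme Co.", "address": "12 Oak St"}}
shipping_file = {101: {"name": "Acme Co.", "address": "12 Oak St"}}

# The customer moves, but only the billing application is updated.
billing_file[101]["address"] = "98 Elm Ave"

# The two copies now disagree, so at least one file is inaccurate.
copies_agree = billing_file[101]["address"] == shipping_file[101]["address"]
print(copies_agree)
```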
The Database Approach to Data Management
Data management in which a pool of related data is shared by multiple application programs
Rather than having separate data files, each application uses a collection of data that is either joined or related in the database.
The Database Approach to Data Management
[Figure 5.4]
Advantages of the Database Approach
Improved strategic use of corporate data
Reduced data redundancy
Improved data integrity
Easier modification and updating
Data and program independence
Advantages of the Database Approach
Better access to data and information
Standardization of data access
A framework for program development
Better overall protection of the data
Shared data and information resources
Disadvantages of the Database Approach
Relatively high cost of purchasing and operating a DBMS in a mainframe operating environment
Specialized staff
Increased vulnerability
Database Considerations
Content
What data is to be collected at what cost?
Access
What data is to be provided to which users when appropriate?
Database Considerations
Logical structure
How is the data to be arranged so that it makes sense to a
given user?
Physical organization
Where is the data to be physically located?
Types of Database Design
Logical design
An abstract model of how the database should be
structured and arranged to meet an organization’s
information needs
Physical design
A model of how the data will be organized and located
within the database
Data Modeling and Entity-Relationship Diagrams
Data model
A map or diagram of entities and their relationships
Enterprise data modeling
Data modeling done at the level of the entire organization
Entity-Relationship (ER) Diagrams
Diagrams that use basic graphical symbols to show the organization of and relationships between data
Relationships include:
One-to-one (1:1)
One-to-many (1:N)
Many-to-many (N:M)
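The three relationship types can be sketched as a small SQLite schema. The customer-ordering tables below are hypothetical, and expressing relationships through foreign keys and a linking table is one common relational technique rather than the only way to model them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);

-- One-to-one (1:1): a UNIQUE foreign key allows at most one profile per customer.
CREATE TABLE credit_profile (
    customer_id INTEGER UNIQUE REFERENCES customer(customer_id),
    credit_limit INTEGER);

-- One-to-many (1:N): one customer places many orders.
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id));

CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);

-- Many-to-many (N:M): a linking table pairs orders with products.
CREATE TABLE order_line (
    order_id INTEGER REFERENCES orders(order_id),
    product_id INTEGER REFERENCES product(product_id),
    PRIMARY KEY (order_id, product_id));

INSERT INTO customer VALUES (1, 'Acme Co.');
INSERT INTO credit_profile VALUES (1, 5000);
INSERT INTO orders VALUES (10, 1), (11, 1);
INSERT INTO product VALUES (100, 'Widget'), (101, 'Gadget');
INSERT INTO order_line VALUES (10, 100), (10, 101), (11, 100);
""")

# 1:N, one customer with several orders.
orders_per_customer = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id = 1"
).fetchone()[0]

# N:M, one order with several products and one product on several orders.
products_in_order_10 = conn.execute(
    "SELECT COUNT(*) FROM order_line WHERE order_id = 10"
).fetchone()[0]
orders_with_widget = conn.execute(
    "SELECT COUNT(*) FROM order_line WHERE product_id = 100"
).fetchone()[0]

# 1:1, a second profile for the same customer violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO credit_profile VALUES (1, 9000)")
    second_profile_rejected = False
except sqlite3.IntegrityError:
    second_profile_rejected = True
```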
An Entity-Relationship Diagram
[Figure 5.5] An ER diagram for a customer ordering database, with entities, attributes, and relationships labeled
Database Models
Hierarchical (tree) models
Network models
Relational models
Hierarchical Database Model
A model in which the data is organized in a
top-down or inverted tree-like structure
[Figure 5.6]
Network Models
An extension of the hierarchical model,
in which a member may have many owners
[Figure 5.7]
Relational Models
Data organized in tabular format (rows and columns)
Relations: Two-dimensional tables into which data elements are placed
Tuple: Each row of a table
Attributes: Columns of the table
Domain: Allowable values for attributes or columns
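The vocabulary above can be illustrated in plain Python. The project table, its column names, and the domain rules are hypothetical and chosen only to make each term concrete.

```python
# Attributes: the named columns of the relation.
attributes = ("project_number", "description", "dept_number")

# Domain: the allowable values for each attribute (hypothetical rules).
domains = {
    "project_number": lambda v: isinstance(v, int) and v > 0,
    "description": lambda v: isinstance(v, str) and len(v) <= 30,
    "dept_number": lambda v: v in {257, 632, 598},
}

# Relation: a two-dimensional table; each tuple is one row.
relation = [
    (155, "Payroll", 257),
    (498, "Widgets", 632),
    (226, "Sales manual", 598),
]


def in_domain(row):
    """Return True if every value in the row lies in its attribute's domain."""
    return all(domains[name](value) for name, value in zip(attributes, row))


all_valid = all(in_domain(row) for row in relation)
bad_row_valid = in_domain((999, "Unknown", 111))  # 111 is not a known department
```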
Relational Models
[Figure 5.8]
Data Manipulations
Selecting
Eliminating rows according to certain criteria
Projecting
Eliminating columns in a table
Data Manipulations
Joining
Combining two or more tables
Linking
Joining tables that share at least one common data element
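The manipulations above map onto SQL queries. The sketch below uses Python's `sqlite3` module with hypothetical `project` and `department` tables. Here `dept_number` is the common data element the two tables share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (
    project_number INTEGER PRIMARY KEY, description TEXT, dept_number INTEGER);
CREATE TABLE department (dept_number INTEGER PRIMARY KEY, dept_name TEXT);
INSERT INTO project VALUES
    (155, 'Payroll', 257), (498, 'Widgets', 632), (226, 'Sales manual', 598);
INSERT INTO department VALUES
    (257, 'Accounting'), (632, 'Manufacturing'), (598, 'Marketing');
""")

# Selecting: keep only the rows that meet a criterion.
selected = conn.execute(
    "SELECT * FROM project WHERE dept_number = 257"
).fetchall()

# Projecting: keep only some of the columns.
projected = conn.execute(
    "SELECT description FROM project ORDER BY project_number"
).fetchall()

# Joining on the shared column combines the two tables.
joined = conn.execute(
    "SELECT p.description, d.dept_name "
    "FROM project AS p JOIN department AS d ON p.dept_number = d.dept_number "
    "ORDER BY p.project_number"
).fetchall()
```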
Data Analysis and Normalization
Data analysis
Evaluation of data to uncover problems with the content of
a database
Anomalies
Problems and irregularities in data
Normalization
Removing anomalies from a database
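One kind of anomaly, the update anomaly, can be shown in a few lines. The order and customer data are hypothetical, and the split into two structures is a simplified stand-in for normalization, not a full treatment of normal forms.

```python
# Unnormalized: the customer's city is repeated on every order row.
orders_flat = [
    {"order_id": 1, "customer": "Acme Co.", "customer_city": "Chicago"},
    {"order_id": 2, "customer": "Acme Co.", "customer_city": "Chicago"},
]

# Update anomaly: the city is changed on one row but not the other.
orders_flat[0]["customer_city"] = "Peoria"
cities_before = {row["customer_city"] for row in orders_flat}
anomaly_present = len(cities_before) > 1

# Normalized: the city is stored once, and orders refer to the customer.
customers = {"Acme Co.": {"city": "Chicago"}}
orders = [
    {"order_id": 1, "customer": "Acme Co."},
    {"order_id": 2, "customer": "Acme Co."},
]

# A single update now reaches every order for that customer.
customers["Acme Co."]["city"] = "Peoria"
cities_after = {customers[order["customer"]]["city"] for order in orders}
```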
Comparison of Database Models
Hierarchical model
Primary advantage: processing efficiency
Network model
More flexible than hierarchical models in terms of
organizing data
Relational database model
Easier to control, more flexible, and more intuitive; by far
the most widely used
Database Characteristics
Amount
Database size depends on the number of records or files it
contains
Volatility
A measure of the changes typically required in a given
period of time
Immediacy
A measure of how rapidly changes must be made to data
Database Management Systems
Group of programs used as an interface between
a database and application programs or a
database and the user
Classified by the type of database model they
support
Hierarchical
Network
Relational
Storing and Retrieving Data
Logical access path
Application requests data from the DBMS
Physical access path
DBMS accesses a storage device to retrieve the data
[Figure 5.14]
Data Control
Concurrency control
Locks out simultaneous access to a record that is being
updated or used by another program
Schema
The logical and physical structure of the data and
relationships among the data in the database
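The locking idea behind concurrency control can be sketched inside a single Python process. This is an analogy only: a real DBMS manages locks internally, and the `balance` record and `deposit` function are hypothetical.

```python
import threading

# A shared "record" that several programs (threads here) want to update.
balance = {"amount": 0}
record_lock = threading.Lock()


def deposit(times):
    """Add 1 to the balance `times` times, locking the record for each update."""
    for _ in range(times):
        with record_lock:  # other threads wait until the lock is released
            balance["amount"] += 1


threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(balance["amount"])  # every update is applied because access was serialized
```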
Providing a User View
User view
Portion of the database a user can access
Subschema
A file that contains a description of a subset of the
database and identifies which users can perform
modification on the data items in that subset
Developed to create different views
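A user view can be approximated with an SQL view, sketched here with `sqlite3`. The `employee` table and the `employee_directory` view are hypothetical. SQLite has no per-user permissions, so this shows only the "subset of the database" part of a subschema, not who may modify what.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY, last_name TEXT, department TEXT, salary INTEGER);
INSERT INTO employee VALUES (1, 'Fiske', 'Sales', 52000), (2, 'Buckley', 'Audit', 61000);

-- The view exposes a subset of the columns and leaves out salary.
CREATE VIEW employee_directory AS
    SELECT employee_id, last_name, department FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY employee_id")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
```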
The Use of Schemas and Subschemas
[Figure 5.15]
Creating and Modifying the Database
Data definition language (DDL)
Collection of instructions and commands used to define and describe data and data relationships in a specific database
[Figure 5.16]
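In SQL-based systems, DDL statements include `CREATE TABLE` and `ALTER TABLE`. The sketch below runs them through Python's `sqlite3` module. The `part` table and its columns are hypothetical and are not taken from Figure 5.16.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define a table: column names, data types, and constraints.
conn.execute(
    "CREATE TABLE part ("
    " part_number INTEGER PRIMARY KEY,"
    " description TEXT NOT NULL,"
    " unit_price REAL CHECK (unit_price >= 0))"
)

# Modify the definition later by adding a column.
conn.execute("ALTER TABLE part ADD COLUMN supplier TEXT")

# PRAGMA table_info is SQLite-specific; the second field of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(part)")]

# The CHECK constraint defined above rejects a negative price.
try:
    conn.execute("INSERT INTO part VALUES (1, 'Bolt', -0.5, 'Acme Co.')")
    negative_price_rejected = False
except sqlite3.IntegrityError:
    negative_price_rejected = True
```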
Creating and Modifying the Database
Data dictionary
A detailed description of all data used in the database
[Figure 5.17]
Data Dictionary
Provides a standard definition of terms and data
elements
Assists programmers in designing and writing
programs
Simplifies database modifications
Data Dictionary
Helps achieve advantages of the database
approach
Reduced data redundancy
Increased data reliability
Faster program development
Easier modification of data and information
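Many DBMSs keep this kind of descriptive information in a system catalog. As a rough sketch, SQLite's `sqlite_master` table and `PRAGMA table_info` can be used to build a small dictionary of tables and columns. The `describe` helper and the sample tables are hypothetical, and a full data dictionary would also record details such as who owns and uses each data element.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
""")


def describe(connection):
    """Return {table name: [column descriptions]} read from SQLite's catalog."""
    dictionary = {}
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
        dictionary[table] = [
            {
                "name": col[1],
                "type": col[2],
                "required": bool(col[3]),
                "primary_key": bool(col[5]),
            }
            for col in connection.execute(f"PRAGMA table_info({table})")
        ]
    return dictionary


data_dictionary = describe(conn)
for table, cols in data_dictionary.items():
    print(table, [c["name"] for c in cols])
```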
Manipulating Data and Generating Reports
Data Manipulation Language (DML)
Contains the commands used to manipulate the database
Allows managers and other database users to access,
modify, and make queries about data contained in the
database to generate reports
Structured Query Language (SQL)
A standardized data manipulation language that has become an integral part of most relational database packages
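The core SQL data manipulation statements are `INSERT`, `UPDATE`, `DELETE`, and `SELECT`. The sketch below runs each through Python's `sqlite3` module and ends with a small summary report. The `sale` table and its figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

# INSERT: add records.
conn.executemany(
    "INSERT INTO sale (region, amount) VALUES (?, ?)",
    [("East", 100.0), ("East", 250.0), ("West", 75.0), ("West", 5.0)],
)

# DELETE: remove records that match a condition.
conn.execute("DELETE FROM sale WHERE amount < 10.0")

# UPDATE: modify existing records.
conn.execute("UPDATE sale SET amount = 80.0 WHERE region = 'West'")

# SELECT: query the data to generate a report of total sales per region.
report = conn.execute(
    "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
).fetchall()

for region, total in report:
    print(f"{region}: {total:.2f}")
```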
Selecting a Database Management System
Begins by analyzing database needs and
characteristics
Performance
Integration
Features
The vendor
Cost
Emerging Database Trends
Distributed databases
Actual data may be spread across several smaller
databases connected via telecommunications devices
Replicated database
Holds a duplicate set of frequently used data
Distributed Database
HCIA p223
HCIA, Inc. uses a distributed database to provide up-to-date information to its customers.
Data Warehouse
A relational database management system designed specifically to support management decision making
[Figure 5.21]
Data Warehouse
Data mart
Subset of a data warehouse
Brings the data warehouse concept to small and
medium-size businesses
On-line analytical processing (OLAP)
Consists of programs used to store and deliver data
warehouse information
Data mining
Automated discovery of patterns and relationships in a
data warehouse
Open Database Connectivity (ODBC)
Standards that help ensure that specific software can be used with any ODBC-compliant database
[Figure 5.22]
Object-Oriented Databases
Databases that store data as objects, which contain both the
data and the processing instructions needed to complete
the database transaction
[Table 5.6]
Image, Hypertext, and Hypermedia Databases
Image databases
Store data in the form of images
Hypertext databases
Allow users to search and manipulate alphanumeric data in
an unstructured way
Hypermedia databases
Allow businesses to search and manipulate multimedia
forms of data
Spatial Data Technology
Involves the use of an object-relational database
Stores and accesses data according to the
locations it describes
Permits spatial queries and analysis
Aspects of Database Administration
Overall design and coordination of the database
Development and maintenance of schemas and
subschemas
Development and maintenance of the data
dictionary
Implementation of the DBMS
Aspects of Database Administration
System and user documentation
User support and training
Overall operation of the DBMS
Testing and maintaining the DBMS
Establishing emergency or failure-recovery
procedures
Database Use, Policies, and Security
What data should users have direct access to?
Under what circumstances can data be transferred from a PC or small computer system to the large mainframe system (uploading)?
Database Use, Policies, and Security
Under what circumstances can data be transferred from a mainframe system to PCs or small computer systems (downloading)?
What procedures are needed to guarantee proper database use?