Introduction to Information Technology
Turban, Rainer and Potter
Chapter 5 Managing Organizational Data and Information
CHAPTER 5
MANAGING ORGANIZATIONAL
DATA AND INFORMATION
By : Kundang K Juman
Learning Objectives
 Discuss traditional data file organization and its
problems
 Explain how a database approach overcomes the
problems associated with traditional file environment,
and discuss the advantages of the database approach
 Describe how the three most common data models
organize data, and the advantages and disadvantages of
each model
 Describe how a multidimensional data model organizes
data
 Distinguish between a data warehouse and a data mart
 Discuss the similarities and differences between data
mining and text mining
Chapter Overview
Basics of Data Arrangement and Access
• The Data Hierarchy
• Storing and Accessing Records
The Traditional File Environment
• Problems with the File Approach
Databases: The Modern Approach
• Locating Data in Databases
• Creating the Database
Database Management Systems
• Logical versus Physical View
• DBMS Components
Logical Data Models
• Hierarchical Model
• Network Model
• Relational Model
• Advantages and Disadvantages of the Three Models
• Emerging Models
• Other Models
Data Warehouse
• Multidimensional Model
• Data Marts
• Data Mining
• Text Mining
Case: FedEx Pinpoints Profitable Customers
 The Problem
 customers are classified as good, bad, or
ugly by the cost of doing business with
them and the profits they return
 keep the good customers, improve the bad
customers, and drop the ugly ones
 easy to identify customers who spend money with
them but difficult to identify customers who are
profitable for them
Case (continued…)
 The Solution
 use a data warehouse, stocked with customer data, that
allows the company to compare the complex mix of
marketing and servicing costs that go into retaining
each individual customer versus the revenues he, she,
or it might bring in
The Results
 “good” customers - expect a phone call if their
shipping volumes falter, which can prevent
defections before they occur
 “bad” customers – can be turned into profitable
customers by charging higher shipping rates
 “ugly” customers – can be ignored
Case (continued…)
What have we learned from this case?
 Organizations can now scrutinize their
customers (or other data) very carefully
with advanced data management and analysis tools
 Customized strategies can be developed to cut
costs, transform the marginal customer into a
profitable customer, and permit more profitable
pricing structures
 Other types of data can give an organization
important feedback about its products, services,
markets, and coming trends
Basics of Data Arrangement
and Access
 The Data Hierarchy
 Field - a logical grouping of characters into a word, a small
group of words, or a complete number
 Record - a logical grouping of related fields
 File - a logical grouping of related records
 Database - a logical grouping of related files
 Entity - a person, place, thing, or event about which
information is maintained
 Attribute - each characteristic or quality describing a
particular entity
 Primary Key - field that uniquely identifies the record
 Secondary Key - field that has some identifying information,
but typically does not identify the record with complete accuracy
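The hierarchy and the two kinds of keys can be sketched in a few lines of Python. This is only an illustration: the student file, its field names, and its values are invented, and a real file system or DBMS would not scan records this way.

```python
# Hypothetical student file; field names and values are invented for illustration.
student_file = [  # file: a grouping of related records
    {"student_id": 1001, "name": "Ana Ruiz", "major": "Finance"},  # record: related fields
    {"student_id": 1002, "name": "Ben Cole", "major": "Marketing"},
    {"student_id": 1003, "name": "Cy Dunn", "major": "Finance"},
]


def find_by_primary_key(records, student_id):
    """Return the single record whose primary key matches, or None."""
    for record in records:
        if record["student_id"] == student_id:
            return record
    return None


def find_by_secondary_key(records, major):
    """Return every record sharing the secondary key value (there may be several)."""
    return [record for record in records if record["major"] == major]


print(find_by_primary_key(student_file, 1002))
print(find_by_secondary_key(student_file, "Finance"))
```

The primary key lookup returns at most one record, while the secondary key lookup can return several, which is why a secondary key does not identify a record uniquely.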
Basics of Data Arrangement
and Access (continued …)
 Storing and Accessing Records
 Indexed Sequential Access Method (ISAM)
» uses an index of key fields to locate individual records
» index - lists the key field of each record and where that
record is physically located in storage
» track index - shows the highest value of the key field that
can be found on a specific track
 Direct File Access Method
» uses the key field to locate the physical address of a
record
» transform algorithm - translates the key field directly
into the record’s storage location on disk
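A rough sketch of the two access methods follows. The tracks, keys, names, slot count, and the remainder-based transform are all assumptions chosen for illustration; real ISAM files and hashing schemes are more elaborate (overflow areas, collision handling).

```python
import bisect

# Hypothetical storage: each "track" holds records sorted by key.
tracks = [
    [(101, "Adams"), (105, "Baker")],
    [(110, "Chen"), (118, "Diaz")],
    [(123, "Evans"), (130, "Ford")],
]
# Track index: the highest key value found on each track.
track_index = [track[-1][0] for track in tracks]


def isam_lookup(key):
    """Use the track index to pick one track, then scan only that track."""
    position = bisect.bisect_left(track_index, key)
    if position == len(tracks):
        return None  # key is larger than anything stored
    for record_key, value in tracks[position]:
        if record_key == key:
            return value
    return None


# Direct access: a transform algorithm maps the key straight to a storage slot.
SLOT_COUNT = 7
slots = [None] * SLOT_COUNT


def transform(key):
    """Toy transform algorithm (an assumption): remainder after division."""
    return key % SLOT_COUNT


def direct_store(key, value):
    slots[transform(key)] = (key, value)  # collisions are ignored in this sketch


def direct_lookup(key):
    entry = slots[transform(key)]
    return entry[1] if entry and entry[0] == key else None
```

The indexed method needs one index search plus a short scan, while the direct method computes the location from the key itself with no index at all.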
Traditional File Environment
The organization has multiple
applications with related data files
Each application has a specific
data file related to it,
containing all the data records
needed by the application
Each application comes
with an associated
application-specific data
file
Traditional File Environment
(continued …)
 Problems with the file approach
 data redundancy - the same piece of information
could be duplicated in several places
 data inconsistency - the various copies of the data no
longer agree
 data isolation - difficulty in accessing data from
different applications
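 security - difficult to enforce, because new applications may be
added to the system on an ad hoc basis
 data integrity - data values must often meet integrity
constraints, which are hard to enforce across separate files
 application/data independence - the applications and
data in computer systems should be independent, but in the
file environment each data file is tied to its application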
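Redundancy and inconsistency are easy to reproduce in miniature. The sketch below assumes two hypothetical application files that each keep their own copy of a student's address; all names and values are invented.

```python
# Two application-specific files, each holding its own copy of the same address.
registrar_file = {1001: {"name": "Ana Ruiz", "address": "12 Oak St"}}
accounting_file = {1001: {"name": "Ana Ruiz", "address": "12 Oak St"}}  # redundancy

# Only the registrar's application learns that the student moved.
registrar_file[1001]["address"] = "98 Elm Ave"


def inconsistent_keys(file_a, file_b, field):
    """List the keys whose copies of `field` no longer agree across the two files."""
    shared = file_a.keys() & file_b.keys()
    return sorted(key for key in shared if file_a[key][field] != file_b[key][field])


print(inconsistent_keys(registrar_file, accounting_file, "address"))
```

Because nothing ties the two copies together, the update in one file silently leaves the other out of date.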
Database : The Modern Approach
 Database Management System
 provides access to all the data
 Example : University administration
[Figure: the Registrar Office (Class Programs), the Accounting Dept. (Accounts Programs), and the Athletics Dept. (Sports Programs) all reach the shared data through the Database Management System. Data shown: Academic Info., Team Data, Employee Data, Tuition Data, Financial Aid, Student Data, Course Data, Registration Data.]
Database : The Modern
Approach (continued …)
 Locating Data in Databases
 Centralized database
» all the related files are in one physical location
» used on large, mainframe computers
» saves the expenses associated with multiple computers
» provides database administrators with the ability to work
on a database as a whole at one location
» files are not accessible except via the centralized host
computer
» recovery from disasters can be more easily accomplished
at a central location
» vulnerable to a single point of failure
» speed problems
Database : The Modern
Approach (continued …)
 Locating Data in Databases (cont’)
 Distributed database
» complete copies of a database, or portions of a
database, are in more than one location, which is
usually close to the user
» replicated database - complete copies of the entire
database are delivered to many locations, primarily to
alleviate the single-point-of-failure problems of a
centralized database as well as to increase user access
responsiveness
» partitioned databases - these are subdivided, a
portion of the entire database in each location
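The difference between replicated and partitioned databases can be sketched as two ways of handing rows to sites. The customer rows, the site list, and the rule of partitioning by city are assumptions made for this example.

```python
# Hypothetical customer rows tagged with the city that uses them most.
database = [
    {"id": 1, "city": "New York", "name": "Acme"},
    {"id": 2, "city": "Chicago", "name": "Birch"},
    {"id": 3, "city": "New York", "name": "Cobalt"},
]
sites = ["New York", "Chicago"]


def replicate(rows, site_names):
    """Replicated database: every site receives a complete copy."""
    return {site: [dict(row) for row in rows] for site in site_names}


def partition(rows, site_names):
    """Partitioned database: each site holds only its own portion."""
    return {site: [row for row in rows if row["city"] == site] for site in site_names}


print({site: len(rows) for site, rows in replicate(database, sites).items()})
print({site: len(rows) for site, rows in partition(database, sites).items()})
```

Replication multiplies storage and update work in exchange for availability; partitioning keeps each row in one place, close to its main users.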
Centralized vs. Distributed
Databases
[Figure: Centralized Database - users in New York, Los Angeles, Chicago, and Kansas City all access one central location. Distributed Database - users in the same cities access database sites located in New York, Los Angeles, Chicago, and Kansas City.]
Database : The Modern
Approach (continued …)
 Creating a Database
 Conceptual design - an abstract model of the
database from the user or business perspective
 Physical design - shows the way a database is
actually arranged on storage devices
 Entity-relationship (ER) modeling
» process of planning the database design
» ER diagram - document of the conceptual data model
» key elements: entity classes, instances, identifiers, relationships
 Normalization
» method for analyzing and reducing a relational database to
its most streamlined form for minimum redundancy,
maximum data integrity, and best processing performance
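As a small illustration of what normalization removes, the sketch below splits one redundant table into three. The enrollment rows and column names are hypothetical, and the split is a hand-written example rather than a general normalization procedure.

```python
# Unnormalized rows: the instructor's office is repeated for every enrollment.
enrollments = [
    {"student": "Ana", "course": "IT101", "instructor": "Lee", "office": "B12"},
    {"student": "Ben", "course": "IT101", "instructor": "Lee", "office": "B12"},
    {"student": "Ana", "course": "FIN200", "instructor": "Stone", "office": "C04"},
]


def normalize(rows):
    """Split the rows into three tables so that each fact is stored only once."""
    instructors = {}    # instructor -> office
    courses = {}        # course -> instructor
    registrations = []  # (student, course) pairs
    for row in rows:
        instructors[row["instructor"]] = row["office"]
        courses[row["course"]] = row["instructor"]
        registrations.append((row["student"], row["course"]))
    return instructors, courses, registrations


instructors, courses, registrations = normalize(enrollments)
print(instructors)
print(courses)
print(registrations)
```

After the split, changing an instructor's office means updating one entry instead of every enrollment row that mentions it.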
Database Management Systems
 A software program (or group of programs) that
provides access to a database
 Permits an organization to store data in one location,
from which it can be updated and retrieved
 Provides access to the stored data by various
application programs
 Provides mechanisms for maintaining the integrity of
stored information, managing security and user
access, recovering information when the system fails,
and accessing various database functions from within
an application written in a third-generation, fourth-generation, or object-oriented language
DBMS (continued …)
 Logical versus Physical View
 Physical view - deals with the actual, physical
arrangement and location of data in the direct
access storage devices (DASD)
 Logical view - represents data in a format that
is meaningful to a user and to the software
programs that process that data
DBMS (continued …)
 DBMS Components
 Data model
» defines the way data are conceptually structured
 Data definition language (DDL)
» defines what types of information are in the database
and how they will be structured
» functions of the DDL
> provide a means for associating related data
> indicate the unique identifiers (or keys) of the
records
> set up security access and change restrictions
DBMS (continued …)
 DBMS Components (cont’)
 Data manipulation language (DML)
» used with third-generation, fourth-generation, or
object-oriented languages to query the contents of
the database, store or update information in the
database, and develop database applications
» Structured query language (SQL) - most popular
relational database language, combining both DML
and DDL features
 Data Dictionary
» stores definitions of data elements and data
characteristics
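The DDL, DML, and data dictionary roles can be seen together in a short SQL session. The sketch uses Python's built-in sqlite3 module with an in-memory database; the employee table and its column names are assumptions made for this example, and SQLite is only one of many DBMSs that accept SQL.

```python
import sqlite3

connection = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define what the table holds and which field is the key.
connection.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)

# DML: store data, then query it.
connection.executemany(
    "INSERT INTO employee (emp_id, name, age) VALUES (?, ?, ?)",
    [(1, "Smith, A.", 43), (2, "Jones, W.", 32)],
)
older = connection.execute(
    "SELECT name FROM employee WHERE age > ? ORDER BY name", (40,)
).fetchall()

# Data dictionary: SQLite exposes the stored column definitions via PRAGMA table_info.
columns = [row[1] for row in connection.execute("PRAGMA table_info(employee)")]

print(older)
print(columns)
```

The same SQL language carries both the definition statement (CREATE TABLE) and the manipulation statements (INSERT, SELECT), which is the sense in which it combines DDL and DML features.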
Logical Data Models
 A manager’s ability to use a database is highly
dependent on how the database is structured
logically and physically.
 In logically structuring a database, businesses
need to consider the characteristics of the data and
how the data will be accessed.
 Three common data models : hierarchical,
network, and relational
 Using these models, database designers can build a
logical or conceptual view of data that can then be
physically implemented into virtually any database
with any DBMS.
Logical Data Models (continued …)
 Hierarchical Database Model
 structures data into an inverted “tree” in which each
record rigidly contains two elements
1st : a single root or master field, often called a key,
which identifies the type, location, or ordering of the records
2nd : a variable number of subordinate fields,
which define the rest of the data within a record
 all fields have only one “parent”, each parent may have
many “children”
 advantage : speed and efficiency
 problem : access to data is predefined before the
programs that use it are written, and each relationship
must be explicitly defined when the database is created
Hierarchical Data Model
Sales (root)
Region: East Coast, Midwest, West Coast
Product Category (repeated under each region): China, Stemware, Flatware
Product (lowest level, shown once per region): Plates, Bowls
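The Sales tree can be sketched as nested Python dictionaries. Attaching Plates and Bowls to the China category is an assumption about the figure, and the helper name find_paths is invented for this example.

```python
# Tree levels: Sales -> region -> product category -> product.
# Placing Plates and Bowls under China is an assumption about the figure.
sales_tree = {
    "Sales": {
        region: {"China": ["Plates", "Bowls"], "Stemware": [], "Flatware": []}
        for region in ("East Coast", "Midwest", "West Coast")
    }
}


def find_paths(tree, target, path=()):
    """Walk down from the root and collect every path that ends at `target`."""
    paths = []
    if isinstance(tree, dict):
        for name, child in tree.items():
            if name == target:
                paths.append(path + (name,))
            paths.extend(find_paths(child, target, path + (name,)))
    else:  # a list of leaf products
        for leaf in tree:
            if leaf == target:
                paths.append(path + (leaf,))
    return paths


for found in find_paths(sales_tree, "Bowls"):
    print(" > ".join(found))
```

Because every node has exactly one parent, Bowls has to be stored once per region, and every search starts from the root and follows the predefined parent-child links.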
Logical Data Models (continued …)
 Network Database Model
 creates relationships among data through a linked-list
structure in which subordinate records (members)
can be linked to more than one data element (owner)
 pointer - explicit link, storage addresses that contain
the location of a related record
 many-to-many relationships are possible
 complexity : for every set of linked data elements, a
pair of pointers must be maintained
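A minimal sketch of owner and member records linked by pointers, assuming hypothetical course and student records. Python object references stand in for the storage addresses a real network DBMS would keep.

```python
class Record:
    """One record with explicit pointers to related records."""

    def __init__(self, name):
        self.name = name
        self.owners = []   # pointers to owner records
        self.members = []  # pointers to member records


def link(owner, member):
    """Maintain the pair of pointers that one owner-member link requires."""
    owner.members.append(member)
    member.owners.append(owner)


# Hypothetical data: a course has many students, and a student takes many courses.
it101 = Record("IT101")
fin200 = Record("FIN200")
ana = Record("Ana")
ben = Record("Ben")
link(it101, ana)
link(it101, ben)
link(fin200, ana)

print([owner.name for owner in ana.owners])
print([member.name for member in it101.members])
```

The member record Ana has two owners, which a strict hierarchy cannot express, and each link costs two pointers that must be kept in step.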
Logical Data Models (continued …)
 Relational Database Model
 based on a simple concept of tables in order to capitalize
on characteristics of rows and columns of data
 relation - table; tuple - row; attribute - column
 select operation - creates a subset consisting of all
records in the file that meet stated criteria
 join operation - combines relational tables to provide the
user with more information than is available in
individual tables
 project operation - creates a subset consisting of columns
in a table, permitting the user to create new tables that
contain only the information required
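The three operations can be sketched over tables held as lists of rows. The column names and the divisions table (including its region values) are assumptions made for this example; a relational DBMS would express the same operations in SQL.

```python
# Hypothetical tables: each row maps an attribute (column) name to a value.
employees = [
    {"name": "Smith, A.", "age": 43, "division": "China"},
    {"name": "Jones, W.", "age": 32, "division": "Stemware"},
    {"name": "Stone, L.", "age": 28, "division": "Flatware"},
]
divisions = [
    {"division": "China", "region": "East Coast"},
    {"division": "Stemware", "region": "Midwest"},
]


def select(rows, predicate):
    """Select: keep the rows that meet the stated criteria."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Project: keep only the named columns of every row."""
    return [{column: row[column] for column in columns} for row in rows]


def join(left, right, column):
    """Join: combine rows from two tables that agree on a shared column."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]


print(select(employees, lambda row: row["age"] > 30))
print(project(employees, ["name"]))
print(join(employees, divisions, "division"))
```

Select narrows the rows, project narrows the columns, and join widens rows with information from a second table.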
Relational Database Model
Smith, A. | Dir. Accounting | 43 | China
Jones, W. | Dir. Total Quality Management | 32 | Stemware
Lee, J. | Dir. Information Technology | 46 | China
Durham, K. | Manager, Production | 35 | Stemware
Stone, L. | Administrative Asst. | 28 | Flatware
Comparing Data Models
MODEL: Hierarchical database
ADVANTAGES: Speed and efficiency in search.
DISADVANTAGES: Access to data is predefined by exclusively hierarchical relationships, predetermined by administrator. Limited search/query flexibility. Not all data is naturally hierarchical.

MODEL: Network database
ADVANTAGES: Many more relationships between data elements can be defined. Greater speed and efficiency than relational database models.
DISADVANTAGES: The most complicated model to design, implement, and maintain. Greater query flexibility than hierarchical model, but less than relational model.

MODEL: Relational database
ADVANTAGES: Conceptual simplicity; no predefined relationships among data. High flexibility in ad hoc querying. New data and records can be added easily.
DISADVANTAGES: Lower processing efficiency and speed. Data redundancy is common, requiring additional maintenance.
Logical Data Models (continued …)
 Emerging Data Models
 Object-oriented database model - stores objects; an object is a small
amount of data put together with all the data needed in
order to perform an operation with that data
» Object - similar to an entity in that it represents a person,
place, or thing, but it also contains all of the data that the object
needs in order to perform an operation
» Attributes - characteristics that describe the state of that object
» Method - an operation, action, or a behavior the object may
undergo
» Messages - from other objects activate operations contained
within the object
» Class - defines all the messages to which its objects will respond, as
well as the way in which objects of this class are implemented
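The object vocabulary maps onto an ordinary class definition. The Account class below is a hypothetical example in plain Python, not an object-oriented DBMS; it only shows where attributes, methods, messages, and classes appear.

```python
class Account:
    """Class: defines the messages its objects respond to and how they are implemented."""

    def __init__(self, owner, balance=0):
        self.owner = owner      # attribute: part of the object's state
        self.balance = balance  # attribute: part of the object's state

    def deposit(self, amount):
        """Method: an operation the object performs on its own data."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount
        return self.balance


account = Account("Ana", 100)  # object: one instance of the class
account.deposit(50)            # message: asks the object to run its deposit method
print(account.balance)
```

The data (balance) and the operation that changes it (deposit) travel together, which is the main contrast with a relational row that holds data only.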
Logical Data Models (continued …)
Emerging Data Models (cont’)
 Object-relational database model - adds new object
storage capabilities to relational database management systems
 Hypermedia database model - stores chunks of
information in a form of nodes connected by links established
by the user
Other Database Models
 Geographical information database - contains locational
data for overlaying on maps or images
 Knowledge database - stores decision rules used to evaluate
situations and help users make decisions like an expert
 Multimedia database - stores data on many media : sounds,
video, images, graphics, animation, and text.
Data Warehouses
A data warehouse is a relational and/or
multidimensional database management
system designed to support management
decision making.
The data in the “warehouse” is stored in a
single, agreed-upon format even when
underlying operational databases store the
data differently.
Data Warehouses
[Figure: Data warehouse framework and views. Operational systems/data (legacy, OLTP, external) feed data preparation (select, extract, transform, integrate, maintain). Prepared data, described by a metadata repository, is loaded into the enterprise data warehouse and target databases (RDB, MDDB), which supply data marts (marketing, risk management, engineering). API middleware connects these to the access applications: EIS/DSS, custom-built applications (4GL tools), production reporting tools, relational query tools, OLAP/ROLAP, data mining, and web browsers.]
Data Warehouses (continued ...)
Data Warehouse Offers Many Business Advantages
 It provides business users with a “customer-centric” view
of the company’s heterogeneous data by helping to
integrate data from sales, service, manufacturing and
distribution, and other customer-related business systems.
 It provides added value to the company’s customers by
allowing them to access better information when data
warehouse is coupled with Internet technology.
 It consolidates data about individual customers and
provides a repository of all customer contacts for
segmentation modeling, customer retention planning, and
cross-sales analysis.
Data Warehouses (continued ...)
Data Warehouse Advantages (cont’)
 It removes barriers among functional areas by
offering a way to reconcile views from multiple
sources, thus providing a look at activities that cross
functional lines.
 It reports on trends across multidivisional and/or
multinational operating units, including trends or
relationships in areas such as merchandising,
production planning, and so forth.
Data Warehouses (continued ...)
 Multidimensional Database Model
 can be the core of data warehouses
 data are stored in arrays
 consists of at least three dimensions
 dimensions are the edges of the cube, and represent
the primary “views” of the business data
 the data are intimately related and can be viewed and
analyzed from different perspectives, which are
called dimensions
 allows for the effective, efficient, and convenient
storage and retrieval of large volumes of data
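A three-dimensional cube can be sketched as a mapping from (product, region, year) to a value. The dimensions, the sales figures, and the helper name total are all assumptions made for this example; real multidimensional databases store the cells in arrays.

```python
# Hypothetical sales cube: (product, region, year) -> units sold.
cube = {
    ("Plates", "East Coast", 2001): 50,
    ("Plates", "Midwest", 2001): 30,
    ("Bowls", "East Coast", 2001): 20,
    ("Plates", "East Coast", 2002): 60,
}
DIMENSIONS = ("product", "region", "year")


def total(cells, **fixed):
    """Sum the cells that match the fixed dimension values, e.g. region='Midwest'."""
    checks = [(DIMENSIONS.index(name), value) for name, value in fixed.items()]
    return sum(
        units
        for key, units in cells.items()
        if all(key[position] == value for position, value in checks)
    )


print(total(cube, product="Plates"))
print(total(cube, region="East Coast", year=2001))
print(total(cube))
```

Fixing one or more dimensions and summing over the rest is how the same data is viewed from different perspectives.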
Data Warehouses (continued ...)
Data Marts
 a scaled-down version of a data warehouse that
focuses on a particular subject area
 usually designed to support the unique business
requirements of a specific department or business
process. Example : Marketing data mart
 takes less time to build, costs less, and is less complex
 the indiscriminate introduction of multiple data marts
with no linkage to each other, or to an enterprise data
warehouse, will cause problems
Data Warehouses (continued ...)
 Data Mining
 provides a means of extracting previously
unknown, predictive information from the base of
accessible data in data warehouses
 discovers hidden patterns, correlations, and
relationships among organizational data
 predicts future trends and behaviors, allowing
businesses to make proactive, knowledge-driven
decisions
 functions of data mining
» classification
» sequencing
» clustering
» forecasting
» association
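Of the functions listed, association is the simplest to sketch: count how often items appear together. The market baskets below are invented, and pair counting is only a toy stand-in for the association-rule algorithms that data mining tools actually use.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets, one set of purchased items per transaction.
baskets = [
    {"plates", "bowls"},
    {"plates", "bowls", "forks"},
    {"plates", "forks"},
    {"bowls"},
]


def pair_support(transactions):
    """Return the fraction of transactions that contain each pair of items."""
    counts = Counter()
    for items in transactions:
        counts.update(combinations(sorted(items), 2))
    return {pair: count / len(transactions) for pair, count in counts.items()}


support = pair_support(baskets)
for pair, share in sorted(support.items()):
    print(pair, share)
```

Pairs with high support are candidates for the hidden relationships the slide mentions, such as products that tend to be bought together.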
What’s in IT for Me?
For Accounting
 Data gathered about each transaction (business
event) in the organization is stored in its
databases
For Finance
 Computerized databases external to the
organization, such as CompuStat or Dow Jones,
provide financial data on organizations in the
same industry
What’s in IT for Me? (continued …)
For Marketing
 Databases including customer name, address,
purchase, amount, etc., help to plan targeted
marketing campaigns and to evaluate the
success of previous campaigns.
 Data mining is critical for many marketing
efforts to remain competitive.
For Production/Operations Management
 Organizational databases are accessed for
determining optimum inventory levels for parts
in a production process
 Information in databases is used to know
when to perform required service on machines
What’s in IT for Me? (continued …)
For Human Resources Management
 Organizational databases contain extensive data
on employees, such as name, address, gender,
race, age, salary, hiring date, current job
descriptions, past job descriptions, and past
performance evaluations
For MIS
 MIS job openings range from data entry and data
storage management to database management
and data analysis