CHAPTER 4
Data Management: Warehousing,
Access, and Visualization
This chapter covers:
MSS foundation
Many new concepts
Object-oriented databases
Intelligent databases
Data warehouse
Data mining
Online analytical processing
Multidimensionality
Internet / Intranet / Web
Fall, 2006
All right Reserved YAO Zhong, School of E&M, BUAA
4-2
4.1 Opening Vignette: Data
Warehouse and DSS at Group Health
Cooperative
Group Health Cooperative is a large nonprofit HMO based in
Seattle, Washington. The company owns hospitals, inpatient
centers, and primary care centers. It also contracts with
many health care providers and acts as an insurance
company. In 1997, it served over 550,000 enrollees.
A stream of 2-3 million records of data is processed monthly.
Managing such a vast amount of data is a very difficult task.
Even more difficult is the use of data for decision support.
Before the use of DSS, costs were escalating, customer
service was ineffective, resources were poorly used, and
the quality of some services was poor.
The company realized that the only solution was to develop
a comprehensive database and DSS approach to cover the
whole spectrum of health services. Such an approach would
allow for effective and efficient data-driven decision
making.
The basic idea was to create a single database (called a data
warehouse) that would support DSS by linking data related
to cost, efficiency of resource use, outcomes, and health
status in a comprehensive corporate information system. The
data come from existing data collection applications (TPS),
such as clinic registration, laboratory, and pharmacy. An
attempt was made to avoid data redundancy and to combine
the data into meaningful information for decision making.
The system was initiated in 1989 and it is constantly being
updated and improved. It is used to generate periodic reports
such as:
– Population-based reports summarized by clinic and by practice
– Productivity reports
– Utilization management reports
– Reports by consumer groups or payer groups
– Statistical reports, such as by age and gender
The data warehouse is also used for many DSS, EIS, and MIS
applications such as:
– Allocating costs down to the level of use by the patient
– Holistic cost approach to answer queries such as how cost
reduction in one area affects costs in other areas
– Cost comparisons for negotiating prices with business
partners
– Comprehensive query system
– Creating an EIS for monitoring key indicators such as cost
per patient day in a hospital
Once the data warehouse was completed and data-based
reporting and access tools were in place, the possibilities
were endless. For example, the number of inpatient days is
an important performance indicator. It was reduced by 7%
by sending patients to outpatient services, resulting in
millions of dollars in savings. Performance measurements
are now generated periodically. Trend analysis is performed
as well as comparisons with other HMOs. Based on these
comparisons, targets and plans for improvement were
established.
An example of the success of the DSS was the winning of a
military contract valued at about $1 billion over a 5-year
period. A database used specifically for the bid on this
contract was created in only 2 days because it was extracted
from the data warehouse. The warehouse was created with
the SAS data warehouse software (from the SAS Institute
Inc.), which is compatible with SAS’s DSS tools. Each
group gets an individually tailored report. Such reports
contain data from many sources, and because of the data
warehouse, it is possible to generate the reports rapidly
and accurately.
This opening vignette demonstrates how a large
company uses a centralized database to support decision
making.
4.2 Data Warehousing, Access,
Analysis, and Visualization
What to do with all the data that organizations collect,
store, and use?
(Information overload!)
Solution:
– Data warehousing
– Data access
– Data mining
– Online analytical processing (OLAP)
– Data visualization
– Data sources
4.3 The Nature and Sources of Data

Data: Items about things, events, activities, and
transactions that are recorded, classified, and stored but
not organized to convey any specific meaning. Data items can
be numeric, alphanumeric, figures, sounds, or images.

Information: Data organized to convey meaning. It either
confirms something the recipient already knows or has
“surprise” value by revealing something not known. The
recipient interprets the meaning and draws inferences and
conclusions. An application processes data items so that the
results are meaningful for an intended action or decision.

Knowledge: Data items organized and processed to
convey understanding, experience, accumulated learning,
and expertise. A set of data items processed to extract
critical implications and to reflect past experience and
expertise provides the recipient with organizational
knowledge and has very high potential value.

A DSS database or data warehouse may include data,
information, and knowledge.

Data forms:
– Documents
– Pictures
– Maps
– Sound
– Animation
– Video

Data can be hard or soft

Internal
Data are stored in one or more places. These data are about
people, products, services, and processes. For example, data
about employees and their pay are usually stored in the
corporate database. Data about equipment and machinery
may be stored in the maintenance department database.

External
There are many sources of external data. They range from
commercial databases to data collected by sensors and
satellites. Data are available on CD-ROM, on the Internet,
as films, and as music or voices. Pictures, diagrams,
atlases, and TV are also sources of data. Government
reports (either computerized or not) are a major source of
external data.

Personal Data
The MSS users or other corporate employees may
contribute their own expertise by creating personal data.
These include subjective estimates of sales, opinions about
what competitors are likely to do, and interpretations of
news articles.
4.4 Data Collection, Problems, and
Quality


Methods for collecting data
Raw data can be collected manually or by instruments and
sensors. Representative data collection methods are:
– time studies (during observation),
– surveys (using questionnaires),
– observation (e.g., using video cameras), and
– soliciting information from experts (e.g., using
interviews)
Data Problems (Table 4.1)
All computer-based systems depend on data. The quality and
integrity of the data are critical for the MSS to avoid the
GIGO (garbage in, garbage out) syndrome. MSS depend on data
because compiled data, which make up information and
knowledge, are at the heart of any decision-making system.
Table 4.1 Data problems

Problem: Data are not correct
Typical cause: Raw data were entered inaccurately
Possible solution: Develop a systematic way to ensure the accuracy of raw data
Typical cause: Data were generated carelessly
Possible solution: Whenever derived data are submitted, carefully monitor both the data values and the manner in which the data were generated

Problem: Data are not timely
Typical cause: The method for generating the data is not rapid enough to meet the need for the data
Possible solution: Modify the system for generating data

Problem: Data are not measured or indexed properly
Typical cause: Raw data are gathered according to a logic or periodicity that is not consistent with the purposes of the analysis
Possible solution: Develop a system for rescaling or recombining the improperly indexed data

Typical cause: Detailed model contains so many coefficients that it is difficult to develop and maintain
Possible solution: Develop simpler or more highly aggregated models

Typical cause: No one ever stored data needed now
Possible solution: Whether or not it is useful now, store data for future use. This may be impractical because of the cost of storing and maintaining data. The data may not be found when they are needed.

Typical cause: Required data never existed
Possible solution: Make an effort to generate the data or to estimate them if they concern the future

Quality: determines usefulness of data
– Intrinsic data quality
– Accessibility data quality
– Representation data quality
– Uniformity
– Version
– Completeness check
– Conformity check
– Genealogy check (drill down)
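The completeness and conformity checks in this list can be illustrated with a minimal Python sketch. The record layout, the field names (`patient_id`, `gender`, `visit_date`), and the allowed gender codes are assumptions made for illustration only; they do not come from the course material.

```python
# Hypothetical records and rules; field names are illustrative only.
REQUIRED_FIELDS = ("patient_id", "gender", "visit_date")
ALLOWED_GENDER = {"m", "f"}


def completeness_errors(record):
    """Return the required fields that are missing or empty."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]


def conformity_errors(record):
    """Return fields whose values fall outside the allowed domain."""
    errors = []
    gender = record.get("gender")
    if gender is not None and gender not in ALLOWED_GENDER:
        errors.append("gender")
    return errors


records = [
    {"patient_id": 1, "gender": "f", "visit_date": "2006-09-01"},
    {"patient_id": 2, "gender": "x", "visit_date": ""},
]

# One (id, completeness problems, conformity problems) entry per record.
report = [(r["patient_id"], completeness_errors(r), conformity_errors(r))
          for r in records]
```

In this sketch the second record fails both checks, which is the kind of result that would be reported before the data are loaded for decision support.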

4.5 The Internet and
Commercial Database Services

For external data:
– The Internet: a major supplier of external data.
Decision makers can access the home pages of vendors, clients, and
competitors, view and download information, or conduct research.
– Commercial data banks: an online database service sells access
to specialized databases. Such a
service can add external information to the MSS in a timely manner
and at a reasonable cost.
All that is necessary to retrieve data from such a service is a
company terminal, modem, telephone, password, and some
service fees.
Several thousand services are currently available, most of which are
accessible via the Internet.
Examples:
– CompuServe and The Source
– CompuStat
– Dow Jones Information Service
– Interactive Data Corporation
– Lockheed Information Service
– Mead Data Central.
Use Web Browsers to:
– Access vital information by employees and customers
– Implement executive information systems
– Implement group decision support systems (GDSS)
– Database management systems provide data in HTML directly on
Web servers
The major relational database management system vendors, whose
products include Informix, Oracle, Sybase, and DB2, have reworked their core
database products to accommodate a world of client/server,
browser/server, and Internet/intranet applications.

In addition, Powersoft is readying the Internet Toolkit for
PowerBuilder 5.0, an add-on to the PowerBuilder Windows
application tool, for developing Web-enabled client/server applications.
Web-site and database integration suppliers include:
– Spider Technology
– Haht Software
– Next Software Inc.
– NetObject Inc.
– OneWave Corp.
4.6 Database Management Systems in
DSS

DBMS: Software program for entering (or adding)
information into a database; updating, deleting,
manipulating, storing, and retrieving information

A DBMS + modeling language to develop DSS

DBMS to handle LARGE amounts of information

A DSS often works with both data and models.
A small DSS can be built with either an enhanced DBMS or a
spreadsheet.
Spreadsheet vs. DBMS?
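The DBMS operations named on this slide (entering, updating, deleting, and retrieving information) can be sketched with Python's built-in `sqlite3` module. SQLite is used here only as a convenient stand-in for a DBMS; the `employees` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()
cur.execute("CREATE TABLE employees (name TEXT PRIMARY KEY, salary REAL)")

# Entering (adding) information
cur.executemany("INSERT INTO employees VALUES (?, ?)",
                [("Ann", 1800.0), ("Bob", 2500.0), ("Chen", 3100.0)])
# Updating
cur.execute("UPDATE employees SET salary = salary + 300 WHERE name = ?",
            ("Ann",))
# Deleting
cur.execute("DELETE FROM employees WHERE name = ?", ("Bob",))
conn.commit()

# Retrieving
rows = cur.execute(
    "SELECT name, salary FROM employees ORDER BY name").fetchall()
conn.close()
```

A spreadsheet could hold the same three rows, but the DBMS adds declarative queries, keys, and transactions, which is one way to frame the spreadsheet versus DBMS question.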


4.7 Database Organization and
Structure



Relational databases
– The dominant form in the DBMS area. Two-dimensional tables
present the logical structure of the data.
– Easy to learn and maintain.
Hierarchical databases
– Data items are organized top-down, with
logical links between related data items. The structure looks like a
tree.
Network databases
– A network form represents the logical structure of the data.
– An advantage is savings in storage space.

Object-oriented databases
– Complex applications, such as CIMS, involve complex
databases.
– None of the three database types above can
satisfy their requirements.
– An OODBMS is based on the principles of OOP. It
combines the characteristics of an OOP language such as C++ or
Smalltalk with a mechanism for data storage
and access. It allows one to analyze data at a conceptual
level that emphasizes the natural relationships between
objects. Abstraction is used to establish inheritance
hierarchies, and object encapsulation allows the
database designer to store both conventional data and
procedural code within the same objects.
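The ideas of inheritance and encapsulation described here can be sketched in Python. This is only an in-memory illustration of how data and procedural code live in the same object; it is not an OODBMS, and the `Part` and `Assembly` classes are hypothetical names chosen for the example.

```python
class Part:
    """Base class: conventional data plus behavior in one object."""

    def __init__(self, part_id, unit_cost):
        self.part_id = part_id
        self.unit_cost = unit_cost

    def cost(self):
        return self.unit_cost


class Assembly(Part):
    """Subclass: inherits from Part and holds related Part objects."""

    def __init__(self, part_id, labor_cost, components):
        super().__init__(part_id, labor_cost)
        self.components = list(components)

    def cost(self):
        # Behavior stored with the data: the total follows the
        # relationships between objects.
        return self.unit_cost + sum(c.cost() for c in self.components)


bolt = Part("bolt", 0.5)
plate = Part("plate", 4.0)
bracket = Assembly("bracket", 2.0, [bolt, bolt, plate])
total = bracket.cost()
```

An OODBMS would additionally persist such objects and their relationships, so that `bracket` and its components could be stored and retrieved as objects rather than flattened into tables.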


Object-oriented databases (see Leung, 2002)
– An OODBMS defines data as objects and encapsulates data
along with their relevant structure and behaviors. The
system uses a hierarchy of classes and subclasses of
objects. Structure, in terms of relationships, and behavior,
in terms of methods and procedures, are contained within
an object.
– An OODBMS is especially useful in distributed DSS for very
complex applications.
Multimedia-based databases
– An MMDBMS manages data in a variety of formats, in
addition to the standard text or numeric fields. These
formats include images such as digitized photos or forms
of bit-mapped graphics such as maps or .pic files, hypertext
images, video clips, sound, and virtual reality.
According to Gartner Group, Inc., at most 15 percent of all
corporate information is digitized. At least 85 percent
resides outside the computer
in documents, maps, photos, images, and videotapes. For
organizations to build applications that take advantage of
these rich data types, the DBMS must accommodate them.
Oracle, Informix, and Sybase store rich multimedia data
types as binary large objects.
Most PC systems (as clients) are capable of supporting the
display or playback of files in these formats. It is logical,
but not easy, to expand database management capabilities
to include these objects into management support systems.
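Storing a multimedia item as a binary large object can be sketched with Python's built-in `sqlite3` module, used here as a stand-in for the commercial products named above. The `media` table and the placeholder bytes are assumptions for illustration; a real application would read the bytes from an image, sound, or video file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (name TEXT, kind TEXT, content BLOB)")

# Stand-in bytes; a real application would read an image or audio file.
fake_image = bytes([0x89, 0x50, 0x4E, 0x47]) + b"\x00" * 12

conn.execute("INSERT INTO media VALUES (?, ?, ?)",
             ("site_map", "image", fake_image))

# The BLOB comes back as the same sequence of bytes that was stored.
name, kind, content = conn.execute(
    "SELECT name, kind, content FROM media WHERE name = ?", ("site_map",)
).fetchone()
conn.close()
```

The database treats the content as opaque bytes, which is why displaying or playing it back is left to the client, as the slide notes.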

Document-based databases
Organizations are drowning in paper. To alleviate
paper storage and shuffling, document-based databases
were developed. These are also known as electronic
document management (EDM) systems. They are used for
information dissemination, form storage and management,
shipment, expert license processing, and workflow
automation.

Distributed databases
The data are physically distributed across different
sites in a computer network but logically belong to
one system. Each connected site has its own
processing capability and can execute local
applications. At the same time, each site can also run
system-wide applications through the communication
subsystem.
4.8 Data Warehousing

Definition: (SAS Institute, 1996)
– Physical separation of a company’s operational and decision
support environments.
– At the heart of many companies lies a store of
operational data, usually derived from critical
mainframe-based online transaction processing (OLTP)
systems, such as order entry applications. These OLTP
systems are built with COBOL and operate in a
Customer Information Control System (CICS)
environment. OLTP systems for financial and inventory
management and control also produce operational data.
In such systems, data access, application logic, and
data presentation logic are tightly coupled,
usually in nonrelational databases. These nonrelational
data stores are not very conducive to data retrieval for
decision support applications. Deriving information for
decision support analysis from an operational data store can
be a self-defeating activity that requires too much time and
programming expertise, and it may negatively affect the
performance of the critical transactional system. Yet
decision support information is equally critical to a company’s
success and must be made accessible to management.
– W. H. Inmon (1992) definition: a data warehouse is a
subject-oriented, integrated, time-variant, and nonvolatile
collection of data in support of management's decision-making process.
– Purpose: to establish a data repository that makes
operational data accessible in a form that is readily
acceptable for decision support and EIS applications.
– As part of this new accessibility, a process must
transform detail-level operational data to a relational
form, which makes them more amenable to decision
support processing.




Only data needed for decision support come from the TPS
(operational environment). As data pass into the data
warehouse, they are transformed and integrated into a
consistent structure.
A data warehouse allows for the storage of metadata, which
can include data summaries that are easier to search and
index.
The data are then placed directly into a data repository at
the current level of detail, where they are eventually
summarized, archived into the older detail data level, or
purged.
Moving DSS information off the mainframe presents a
company with an opportunity to restructure its DSS
strategy. The company can reinvent the way it shapes
and forms its DSS data.





Any EIS requires good summarized data in different
forms. Traditional EIS and DSS applications often failed
because the underlying data were difficult or impossible to
access.
Also, manipulation of the data has traditionally been a part
of the EIS, instead of a preliminary data preparation
activity.
Data warehousing (information warehousing) solves
the data access problem.
A data warehouse combines various data sources into a single
resource for end-user access.
End users perform ad hoc queries, reporting, analysis, and
visualization.


Data warehousing involves combining a variety of
technology vendors’ products into an integrated solution.
There can be several data warehouses in one company.
DW benefits:
– Increases knowledge worker productivity
– Supports all decision makers’ data requirements
– Provides ready access to critical data
– Insulates operational databases from ad hoc processing
– Provides high-level summary information
– Provides drill-down capabilities
– Improves business knowledge
– Provides competitive advantage
– Enhances customer service and satisfaction
– Facilitates decision making
– Helps streamline business processes

DW Architecture and Process
– Two-tier architecture
– Three-tier architecture (Figure 4.3)
Data from internal (legacy) sources and external sources
are extracted, scrubbed, filtered, and summarized by
special software before insertion into the data warehouse.
In a three-tier architecture, the data are then processed again
and deposited in an additional special multidimensional (MD)
database, organized for
easy MD presentation. The DSS and EIS users can
query the new server and perform analysis. In a two-tier
architecture, there is no MD database or server.
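The extract, scrub, filter, and summarize steps can be sketched in a few lines of Python. This is an illustrative toy pipeline, not the special software the slide refers to; the source rows, the `clinic` and `cost` fields, and the cleaning rules are all assumptions.

```python
from collections import defaultdict

# Hypothetical rows extracted from a legacy source.
extracted = [
    {"clinic": " North ", "cost": "120.50"},
    {"clinic": "north", "cost": "80"},
    {"clinic": "South", "cost": "n/a"},     # unusable cost value
    {"clinic": "South", "cost": "200.00"},
]


def scrub(row):
    """Normalize text and convert cost to a number; None if unusable."""
    try:
        cost = float(row["cost"])
    except ValueError:
        return None
    return {"clinic": row["clinic"].strip().lower(), "cost": cost}


scrubbed = [scrub(r) for r in extracted]
filtered = [r for r in scrubbed if r is not None]

# Summarize before loading: total cost per clinic.
totals = defaultdict(float)
for r in filtered:
    totals[r["clinic"]] += r["cost"]
warehouse = dict(totals)
```

In a three-tier setup, a further step would reorganize `warehouse` into a multidimensional store for presentation; in a two-tier setup, users would query the warehouse directly.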
[Figure 4.3: Data warehouse architecture and process. Three-tier data warehouse architecture by McFadden and Watson [1996]. Components shown: legacy system, external system, DB server (data warehouse, repository), EIS/DSS server (multidimensional DB), EIS, DSS.]

Data Warehouse Components
– Large physical database: the actual, physical
database into which all the data for the data warehouse
are gathered, along with the metadata and the
processing logic used to scrub, organize, package, and
preprocess the data for end-user access.
– Logical data warehouse: contains all the metadata,
business rules, and processing logic required to scrub,
organize, package, and preprocess the data. In addition,
it contains the information required to find and access
the actual data, wherever they reside.
– Data mart: a subset of the enterprise-wide
data warehouse. Typically it performs the role of a
departmental, regional, or functional data warehouse. As
part of the iterative data warehouse process, the
organization builds a series of data marts over time and
eventually links them via an enterprise-wide logical data
warehouse.
– Metadata: data about the data stored within the DW.
– Decision support systems (DSS) and executive
information systems (EIS): these are not data
warehouses but applications that use the data warehouse.
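The notion of a data mart as a subset of the enterprise-wide warehouse can be sketched with Python's built-in `sqlite3` module. Modeling the mart as a view is a simplification chosen for this example (a real data mart is often a separate physical store), and the `warehouse_sales` table, the `east_mart` view, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales (region TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO warehouse_sales VALUES (?, ?, ?)", [
    ("east", "A", 100.0), ("east", "B", 50.0), ("west", "A", 70.0),
])

# A regional data mart modeled as a view over the enterprise-wide table.
conn.execute(
    "CREATE VIEW east_mart AS "
    "SELECT product, amount FROM warehouse_sales WHERE region = 'east'")

mart_rows = conn.execute(
    "SELECT product, amount FROM east_mart ORDER BY product").fetchall()
conn.close()
```

The regional users see only their own slice, while the underlying data stay defined once at the enterprise level.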

Data Warehouse Suitability
Data warehousing is most appropriate for organizations
where:
– Data are stored in different systems.
– An information-based approach to management is in use.
– There is a large, diverse customer base.
– The same data are represented differently in different
systems.
– Data are stored in highly technical, difficult-to-decipher
formats.

Characteristics of Data Warehousing
1. Subject-oriented data: data are organized by detailed subject,
with information relevant for decision support.
2. Integrated data: data in different locations may be
encoded differently. For example, gender data may be
encoded 0 and 1 in one place, and m and f in another.
Data from DB2, Oracle, Informix, Sybase, SQL Server,
Access, etc. are all integrated into one DW with a consistent encoding.
3. Time-variant data: data kept for 5-10 years can be used for
trends, forecasting, and comparisons.
4. Nonvolatile data: once entered into the DW, data are
not changed or updated.
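The gender-encoding example under integrated data can be sketched as a small mapping step in Python. The source-system names and the choice of which numeric code stands for which gender are assumptions for illustration; the slide only says that one source uses 0 and 1 and another uses m and f.

```python
# Hypothetical per-source code tables that map local gender codes to one
# warehouse encoding. Which number means which gender is an assumption.
GENDER_MAPS = {
    "clinic_system": {0: "m", 1: "f"},
    "pharmacy_system": {"m": "m", "f": "f"},
}


def integrate(source, record):
    """Return a copy of the record with gender in the warehouse encoding."""
    unified = dict(record)
    unified["gender"] = GENDER_MAPS[source][record["gender"]]
    unified["source"] = source
    return unified


rows = [
    integrate("clinic_system", {"id": 1, "gender": 1}),
    integrate("pharmacy_system", {"id": 2, "gender": "m"}),
]
genders = [r["gender"] for r in rows]
```

Keeping the source name on each integrated row is a design choice that supports tracing a warehouse value back to where it came from.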
4.9 OLAP: Data Access and Mining,
Querying, and Analysis

Online analytical processing (OLAP)
– DSS and EIS computing done by end users in
online systems
– Contrasts with online transaction processing (OLTP)
OLTP handles routine operational transactions, whereas OLAP
is analytical work done by end users. OLAP activities include
generating queries, requesting ad hoc reports,
conducting statistical analyses, and building
multimedia applications. OLAP is facilitated by a data
warehouse; OLTP works with operational databases.

OLAP Activities
– Generating queries
– Requesting ad hoc reports
– Conducting statistical and other analyses
– Developing multimedia applications
[Figure: OLAP architecture. Data are retrieved from the corporate database and staged in an OLAP multidimensional database for retrieval by front-end systems. Components shown: corporate database, OLAP server, multidimensional database, client PCs, spreadsheets, statistical package, Web-enabled OLAP software.]

OLAP uses the data warehouse and a set of tools,
usually with multidimensional capabilities:
– Query tools
– Spreadsheets
– Data mining tools
– Data visualization tools

Two implementation modes:
– MOLAP (multidimensional OLAP): EXPRESS
(MIT) and System W (Comshare)
– ROLAP (relational OLAP): Metaphor
Using SQL for Querying
SQL (Structured Query Language)
– Data language
– English-like, nonprocedural, very user-friendly language
– Free format
Example:
SELECT Name, Salary
FROM Employees
WHERE Salary > 2000
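The query in this example can be run as written against a small table using Python's built-in `sqlite3` module. The `Employees` table name and its two columns come from the slide; the three sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Salary REAL)")

# Hypothetical sample rows.
conn.executemany("INSERT INTO Employees VALUES (?, ?)",
                 [("Li", 1500.0), ("Wang", 2600.0), ("Zhao", 2000.0)])

# The slide's query: names and salaries of employees earning over 2000.
result = conn.execute(
    "SELECT Name, Salary FROM Employees WHERE Salary > 2000"
).fetchall()
conn.close()
```

Only the row with a salary strictly greater than 2000 is returned; an employee earning exactly 2000 does not satisfy the condition.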
4.10 Data Mining


Data mining is a term that covers the following, along with
the software used for them:
– Knowledge discovery in databases
– Knowledge extraction
– Data archaeology
– Data exploration
– Data pattern processing
– Data dredging
– Information harvesting

Major Data Mining
Characteristics and Objectives
– Data are often buried deep
– Client/server architecture
– Sophisticated new tools, including advanced
visualization tools, help to remove the information
“ore”
– The miner is often an end user, empowered by data drills and
other power query tools, with little or no programming skill
– Often involves finding unexpected results
– Tools are easily combined with spreadsheets, etc.
– Parallel processing for data mining

Data Mining Application Areas
– Marketing (retailing and sales)
– Banking
– e-Commerce
– Manufacturing and production
– Brokerage and securities trading
– Insurance
– Computer hardware and software
– Government and defense
– Airlines
– Health care
– Broadcasting
– Law enforcement

Intelligent Data Mining
– Use intelligent search to discover information within
data warehouses that queries and reports cannot
effectively reveal
– Find patterns in the data and infer rules from them
– Use patterns and rules to guide decision making and
forecasting
– Five common types of information that can be
yielded by data mining: 1) associations, 2) sequences,
3) classifications, 4) clusters, and 5) forecasts
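Of the five types of information, associations are the easiest to illustrate in a few lines. The sketch below computes the standard support and confidence measures for a rule over a handful of made-up market baskets; the basket contents are hypothetical, and the slides do not prescribe these particular measures.

```python
# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of baskets that contain every item in itemset."""
    hits = sum(1 for b in baskets if itemset <= b)
    return hits / len(baskets)


def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) for antecedent -> consequent."""
    return support(antecedent | consequent) / support(antecedent)


s = support({"bread", "milk"})        # 2 of 4 baskets
c = confidence({"bread"}, {"milk"})   # 2 of the 3 bread baskets
```

A rule such as "bread implies milk" would be kept only if both measures exceed thresholds chosen by the analyst.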

Main Tools Used in Intelligent Data Mining
– Case-based reasoning: using historical cases, the
case-based reasoning approach can be used to recognize
patterns.
– Neural computing: a machine
learning approach by which historical data can be
examined for pattern recognition. Thus, one can go
through large databases to identify potential customers of
a new product.
(A brief introduction only; not required.)

– Intelligent agents: one of the most promising approaches to
retrieving information from databases, especially external
ones, is the use of intelligent agents. As vast amounts of
information become available through the Internet,
finding the right information is becoming more difficult. An
intelligent agent (IA) is an autonomous agent: a system situated
within and a part of an environment that senses that environment
and acts on it, over time, in pursuit of its own agenda and so as to
effect what it senses in the future.
– Other tools
• Decision trees
• Rule induction
• Data visualization

Technologies often used in data mining
– Association rules
– Clustering
– Artificial neural networks
– Decision trees
– Multivariate regression
4.11 Data Visualization and
Multidimensionality

Data Visualization Technologies
– Digital images
– Geographic information systems
– Graphical user interfaces
– Multidimensions
– Tables and graphs
– Virtual reality
– Presentations
– Animation

Multidimensionality
– 3-D + spreadsheets (OLAP has this)
– Data can be organized the way managers like to see them,
rather than the way that system analysts do
– Different presentations of the same data can be arranged
easily and quickly
– Dimensions: products, salespeople, market segments,
business units, geographical locations, distribution
channels, country, or industry
– Measures: money, sales volume, head count, inventory
profit, actual versus forecast
– Time: daily, weekly, monthly, quarterly, or yearly
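The relationship between dimensions, measures, and time can be sketched with a tiny in-memory cube in Python. The facts, the three dimension names, and the `roll_up` helper are hypothetical; a real OLAP server would precompute and index such aggregates.

```python
from collections import defaultdict

# Hypothetical facts: (product, region, quarter) dimensions with one
# sales measure.
facts = [
    ("A", "east", "Q1", 100), ("A", "west", "Q1", 70),
    ("B", "east", "Q1", 50), ("A", "east", "Q2", 120),
]
DIMS = ("product", "region", "quarter")


def roll_up(by):
    """Total the sales measure over every dimension not named in `by`."""
    idx = [DIMS.index(d) for d in by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[3]
    return dict(totals)


by_product = roll_up(["product"])
by_region_quarter = roll_up(["region", "quarter"])
```

The same facts yield different presentations depending on which dimensions are kept, which is the point of arranging data the way managers like to see them.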

Multidimensionality Limitations
– Extra storage requirements
– Higher cost
– Extra system resource and time consumption
– More complex interfaces and maintenance
Multidimensionality is especially popular in executive
information and support systems
4.12 Geographic Information Systems
(GIS)





– A computer-based system for capturing, storing, checking,
integrating, manipulating, and displaying data using
digitized maps
– Spatially oriented databases
– Useful in marketing, sales, voting estimation, and planned
product distribution
– Available via the Web
– Can be used with GPS
4.13 Virtual Reality





– An environment and/or technology that provides
artificially generated sensory cues sufficient to engender in
the user some willing suspension of disbelief
– Can share data and interact
– Can analyze data by creating a landscape
– Useful in marketing and in prototyping aircraft designs
– VR over the Internet through VRML
4.14 Business Intelligence
(BI) on the Web




– Concept (Greene, 1996): (1) processed information (data ->
information -> intelligence); (2) the decision maker makes
decisions; (3) the impact of the business environment
on business operations.
– Architecture: (1) data warehouse; (2) analysis tools:
OLAP and data mining; (3) EIS.
– Functionality: BI can capture and analyze data from the Web.
– Tools deployed on the Web:
IBM: the database for the Web generation; building and managing
DB2 Warehouse and gaining insights
CA: eBusiness Intelligence
Summary







– Data for decision making come from internal and
external sources
– The database management system is one of the major
components of most management support systems
– Familiarity with the latest developments is critical
– Data contain a gold mine of information if users can dig
it out
– Organizations are warehousing and mining data
– Multidimensional analysis tools and new enterprise-wide system architectures are useful
– OLAP tools are also useful

– New data formats for multimedia DBMS
– Internet and intranets via Web browser interfaces for DBMS
access
– Built-in artificial intelligence methods in DBMS

Recommended references:
– W. H. Inmon, Building the Data Warehouse. China Machine Press (机械工业出版社).
– Wang Shan (王珊) et al., Data Warehouse Technology and Online Analytical Processing (数据仓库技术及联机分析处理). Science Press (科学出版社).
Individual Assignment
– Search the Internet to find out what data and information
sources each of the data banks discussed in
class (Section 4.5) provides.
– What are data warehouses, data mining technology,
intelligent agents, and BI?
 Supplement
Materials about the
data warehouse