Lecture 7
Data and Knowledge Management
J. S. Chou, P.E., Ph.D.
Assistant Professor
Department of Business Administration
National Chung Cheng University
1
Objectives
1. Describe why databases have become so
important to organizations
2. Describe what databases and database
management systems are and how they work
3. Explain how organizations are getting the most
from their investment in database technologies
4. Describe what is meant by knowledge
management and knowledge assets as well as
benefits and challenges of deploying a knowledge
management system
2
Database Technology
• A collection of related data organized in
a way that makes it valuable and useful
• Allows organizations to retrieve, store,
and analyze information easily
• Is vital to an organization’s success in
running operations and making
decisions
3
Database Terminology
Entities
• Things we store information about (e.g.
persons, places, objects, events)
• Have relationships to other entities (e.g.
the entity Student has a relationship to the
entity Grades in a University Student
database)
Attributes
• These are pieces of information about an
entity (e.g. Student ID and Name for the
entity Student)
4
Relationship of DBMS Concepts
to Others?
5
Levels of a Database Management System (DBMS)
Terms and definitions, from the lowest level to the highest:
• Field (lowest level): An individual characteristic of an ENTITY.
Fields are also called attributes or columns, depending on the type of DBMS
• Record: A group of fields or attributes that describes a
single instance of an ENTITY. Records are also called rows, depending on the DBMS
• File: A collection of records or instances for a
given ENTITY. Files are also called tables, depending on the DBMS
• Database (highest level): A collection of files or entities containing
information to support a given system or a particular topic area
6
View of a Database Table or File
[Figure: a database table with callouts for Attribute (one column), Attribute Type, and Record (one row)]
7
File Processing vs Database Approach Summary
File Processing Approach (Old School)
• Storage Media: Sequential tapes or files
• Data: stored in long sequential files
• Organization: redundant data in multiple files
• Efficiency: data embedded to support processing
• Updates: requires multiple updates in many files
• Processing: slower query/faster processing
Database Approach (New School, Today)
• Storage Media: Direct Access Storage Device (DASD)
• Data: stored in related tables
• Organization: redundant data minimized/eliminated
• Efficiency: data stored only in tables
• Updates: requires few or one update for a data field
• Processing: faster query/slower processing
8
Roles in Database Development and Use
Database Administrator (DBA)
• Designs, develops and monitors
performance of databases
• Enforces policy and standards
for data uses and security
Systems Programmer
• Creates business applications
that connect to databases
• Tests the new systems and
databases before use
Systems Analyst
• Defines data requirements
working with a DBA
• Incorporates the database
design into new program
designs
9
Database Systems Activities – Data Entry
Example
• Data is entered from paper employment
applications into a form entry screen
• The entry forms are designed to match
the paper forms for easy entry
• The form data is processed by the entry
program and then stored in the
employment database
[Figure: Employment Applications -> Enter Forms (Form Entry Screen) -> Form Entry Program -> Employment DB]
10
Database Systems Activities – Query
Query – A database function that extracts and displays information
from a database given selection parameters.
SQL (Structured Query Language)
• A language to select and extract data from a database
• The industry standard language for relational databases
QBE (Query by Example)
• A technique that allows a user to design a query on a screen by
dragging the query fields and placing them in their desired locations
Example – Display applicants entered in the last 30 days
• Query parameters are selected in the query request screen
• The database program uses SQL to query and present the result
[Figure: Query Request -> Query Program -> Employment Query]
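The example query above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table name applicants, its columns, and the sample rows are hypothetical and do not come from the lecture.

```python
import sqlite3
from datetime import date, timedelta


def recent_applicants(conn, today, days=30):
    """Return (name, entered_on) for applicants entered in the last `days` days."""
    # ISO dates (YYYY-MM-DD) sort correctly as text, so a string comparison works.
    cutoff = (today - timedelta(days=days)).isoformat()
    rows = conn.execute(
        "SELECT name, entered_on FROM applicants "
        "WHERE entered_on >= ? ORDER BY entered_on",
        (cutoff,),  # the selection parameter supplied by the query request
    )
    return rows.fetchall()


# Hypothetical employment database held in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE applicants ("
    "applicant_id INTEGER PRIMARY KEY, name TEXT, entered_on TEXT)"
)
conn.executemany(
    "INSERT INTO applicants (name, entered_on) VALUES (?, ?)",
    [("Ana", "2024-03-20"), ("Ben", "2024-01-05"), ("Cy", "2024-03-01")],
)

today = date(2024, 3, 31)  # fixed date so the result is repeatable
result = recent_applicants(conn, today)
print(result)
```

The date is passed as a parameter rather than pasted into the SQL text, which is the usual way to keep user-selected query parameters separate from the query itself.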
11
Database Systems Activities – Report
Report – A database function that extracts and formats information
from a database for printing and presentation
Report Generator
• A specialized program that uses SQL to retrieve and manipulate
data (aggregate, transform, or group)
• Reports are designed using standard templates or can be custom
generated to meet informational needs
Example – Report on applicants entered in the last 30 days
• Report parameters are selected in the report request screen
• The database program uses SQL to query and present the result
[Figure: Query Request -> Query Program -> Employment Report]
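A report generator of this kind might look roughly like the sketch below: SQL groups and aggregates the data, and a small formatting step lays it out for printing. The applicants table, the position column, and the layout are assumptions made for illustration.

```python
import sqlite3


def applicants_per_position(conn, cutoff):
    """Aggregate step: count applicants per position entered on or after `cutoff`."""
    return conn.execute(
        "SELECT position, COUNT(*) FROM applicants "
        "WHERE entered_on >= ? GROUP BY position ORDER BY position",
        (cutoff,),
    ).fetchall()


def format_report(rows):
    """Formatting step: lay the aggregated rows out as a fixed-width text report."""
    lines = ["Position".ljust(14) + "Applicants".rjust(10), "-" * 24]
    for position, count in rows:
        lines.append(position.ljust(14) + str(count).rjust(10))
    return "\n".join(lines)


# Hypothetical sample data.
report_conn = sqlite3.connect(":memory:")
report_conn.execute(
    "CREATE TABLE applicants (name TEXT, position TEXT, entered_on TEXT)"
)
report_conn.executemany(
    "INSERT INTO applicants VALUES (?, ?, ?)",
    [
        ("Ana", "Analyst", "2024-03-20"),
        ("Ben", "Clerk", "2024-03-11"),
        ("Cy", "Analyst", "2024-03-02"),
        ("Dee", "Clerk", "2024-01-15"),  # older than the cutoff, so excluded
    ],
)

report_rows = applicants_per_position(report_conn, "2024-03-01")
print(format_report(report_rows))
```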
12
Designing Databases – Data Model
Data Model
• A map or diagram that represents entities and
their relationships
• Used by Database Administrators to design tables
with their corresponding associations
Example: ERD (Entity Relationship Diagram)
13
Designing Databases – Keys
Database Keys
Mechanisms used to identify, select, and maintain one or
more records using an application program, query, or report
Primary Key
A unique attribute type used to identify
a single instance of an entity.
Compound Primary Key
A unique combination of attribute types used to
identify a single instance of an entity
Secondary Key
An attribute that can be used to identify one or more records
within a table with a given value
14
Designing Databases – Keys (Example)
Entities are translated into tables (Students and Grades) and are joined by
common attributes.
• Students table
- Primary Key: Student ID
- Secondary Key: Major
• Grades table
- Compound Primary Key: Student ID, Course ID, Sec No., Term
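The three kinds of key in the Students and Grades example can be tried out in SQLite. The sketch below is one possible rendering, not the lecture's own schema: the column names, sample rows, and the choice of an index to stand in for the secondary key are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,   -- primary key: one row per student
    name       TEXT NOT NULL,
    major      TEXT
);
-- Secondary key: a non-unique attribute indexed for lookups.
CREATE INDEX idx_students_major ON students (major);

CREATE TABLE grades (
    student_id INTEGER NOT NULL,
    course_id  TEXT    NOT NULL,
    sec_no     INTEGER NOT NULL,
    term       TEXT    NOT NULL,
    grade      TEXT,
    PRIMARY KEY (student_id, course_id, sec_no, term)  -- compound primary key
);
""")


def try_insert(conn, sql, params):
    """Return True if the row was stored, False if a key constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


def students_by_major(conn, major):
    """Secondary-key lookup: may return one or more records."""
    rows = conn.execute(
        "SELECT student_id FROM students WHERE major = ? ORDER BY student_id",
        (major,),
    )
    return [row[0] for row in rows]


conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Ana", "MIS"), (2, "Ben", "MIS"), (3, "Cy", "Finance")],
)
add_grade = "INSERT INTO grades VALUES (?, ?, ?, ?, ?)"
first = try_insert(conn, add_grade, (1, "MIS101", 1, "2024S", "A"))
repeat = try_insert(conn, add_grade, (1, "MIS101", 1, "2024S", "B"))      # same key
other_term = try_insert(conn, add_grade, (1, "MIS101", 1, "2024F", "B"))  # new term
print(first, repeat, other_term, students_by_major(conn, "MIS"))
```

The second grade row is rejected because all four key attributes match an existing row, while the third is accepted because the term differs.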
15
Designing Databases - Associations
Associations
• Define the relationships one entity has to another
• Determine necessary key structures to access data
• Come in three relationship types:
- One-to-One
- One-to-Many
- Many-to-Many
Foreign Key
• An attribute that appears as a non-primary
key in one entity (table) and as a primary key
attribute in another entity (table)
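A foreign key can be illustrated with a small SQLite sketch. The majors table and all column names here are hypothetical: major_id is the primary key of majors and appears as a non-primary-key attribute in students. Note that SQLite only enforces foreign keys after the PRAGMA shown below is set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute(
    "CREATE TABLE majors (major_id TEXT PRIMARY KEY, title TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE students (
           student_id INTEGER PRIMARY KEY,
           name       TEXT NOT NULL,
           major_id   TEXT REFERENCES majors (major_id)  -- foreign key
       )"""
)


def add_student(conn, student_id, name, major_id):
    """Return True if stored, False if the foreign key (or another key) rejects it."""
    try:
        conn.execute(
            "INSERT INTO students VALUES (?, ?, ?)", (student_id, name, major_id)
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn.execute("INSERT INTO majors VALUES ('MIS', 'Management Information Systems')")
accepted = add_student(conn, 1, "Ana", "MIS")  # major exists
orphan = add_student(conn, 2, "Ben", "LAW")    # no such major, so rejected
print(accepted, orphan)
```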
16
Designing Databases - Associations
Entity Relationship Diagram (ERD)
• Diagramming tool used to express entity relationships
• Very useful in developing complex databases
Example
• Each Home Stadium has a Team (One-to-One)
• Each Team has Players (One-to-Many)
• Each Team Participates in Games
• For each Player and Game there are Game Statistics
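One way the stadium, team, player, and game entities could be turned into tables is sketched below. Several points are assumptions rather than statements from the lecture: all column names and sample rows are invented, and Team to Game is treated as many-to-many and resolved with a junction table (team_games).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE stadiums (stadium_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE teams (
    team_id    INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    -- UNIQUE makes the stadium-team association one-to-one.
    stadium_id INTEGER NOT NULL UNIQUE REFERENCES stadiums (stadium_id)
);
CREATE TABLE players (
    player_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    team_id   INTEGER NOT NULL REFERENCES teams (team_id)  -- one-to-many
);
CREATE TABLE games (game_id INTEGER PRIMARY KEY, played_on TEXT NOT NULL);
CREATE TABLE team_games (              -- junction table (assumed many-to-many)
    team_id INTEGER NOT NULL REFERENCES teams (team_id),
    game_id INTEGER NOT NULL REFERENCES games (game_id),
    PRIMARY KEY (team_id, game_id)
);
CREATE TABLE game_statistics (         -- one row per player per game
    player_id INTEGER NOT NULL REFERENCES players (player_id),
    game_id   INTEGER NOT NULL REFERENCES games (game_id),
    points    INTEGER NOT NULL,
    PRIMARY KEY (player_id, game_id)
);
INSERT INTO stadiums VALUES (1, 'North Park'), (2, 'South Field');
INSERT INTO teams VALUES (1, 'Hawks', 1), (2, 'Owls', 2);
INSERT INTO players VALUES (1, 'Ana', 1), (2, 'Ben', 1), (3, 'Cy', 2);
INSERT INTO games VALUES (1, '2024-05-01');
INSERT INTO team_games VALUES (1, 1), (2, 1);
INSERT INTO game_statistics VALUES (1, 1, 10), (2, 1, 6), (3, 1, 12);
""")


def team_points(conn, game_id):
    """Total points per team in one game, following the key associations."""
    return conn.execute(
        "SELECT t.name, SUM(s.points) "
        "FROM game_statistics AS s "
        "JOIN players AS p ON p.player_id = s.player_id "
        "JOIN teams AS t ON t.team_id = p.team_id "
        "WHERE s.game_id = ? GROUP BY t.name ORDER BY t.name",
        (game_id,),
    ).fetchall()


totals = team_points(conn, 1)
print(totals)
```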
17
Designing Databases - Associations
18
Designing Databases – Associations
(Example)
19
The Relational Model
The Relational Model
• The most common type of database model used
today in organizations
• Can be viewed as a three-dimensional model, compared to the
traditional two-dimensional database models
- Rows (first dimension)
- Columns (second dimension)
- Relationships (third dimension)
• The third dimension makes this model so powerful
because any row of data can be related to any
other row or rows of data
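The "third dimension" can be seen in a join, where rows of one table are related to rows of another through a shared attribute. The sketch below assumes hypothetical students and grades tables; none of the names or values come from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE grades (
    student_id INTEGER NOT NULL,
    course_id  TEXT NOT NULL,
    grade      TEXT NOT NULL
);
INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO grades VALUES (1, 'MIS101', 'A'), (1, 'ACC201', 'B'), (2, 'MIS101', 'C');
""")


def transcript(conn, student_id):
    """Relate one student row to its grade rows via the shared student_id."""
    return conn.execute(
        "SELECT s.name, g.course_id, g.grade "
        "FROM students AS s JOIN grades AS g ON g.student_id = s.student_id "
        "WHERE s.student_id = ? ORDER BY g.course_id",
        (student_id,),
    ).fetchall()


ana_rows = transcript(conn, 1)
print(ana_rows)
```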
20
The Relational Model - Example
21
The Relational Model - Normalization
Normalization
• A technique to make complex databases more efficient by
eliminating as much redundant data as possible
• Example: Database with redundant data (below)
22
The Relational Model - Normalization
Normalized Database
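As a rough illustration of normalization, the sketch below starts from a hypothetical flat table in which a student's name and major are repeated on every grade row, then splits it so each student fact is stored once. The data and field names are invented for the example.

```python
# Redundant (un-normalized) data: name and major repeat for every grade.
flat_rows = [
    {"student_id": 1, "name": "Ana", "major": "MIS", "course_id": "MIS101", "grade": "A"},
    {"student_id": 1, "name": "Ana", "major": "MIS", "course_id": "ACC201", "grade": "B"},
    {"student_id": 2, "name": "Ben", "major": "Finance", "course_id": "MIS101", "grade": "C"},
]


def normalize(rows):
    """Split flat rows into a students table and a grades table."""
    students = {}  # keyed by student_id, so each student is stored once
    grades = []    # keeps only the key plus grade-specific attributes
    for row in rows:
        students[row["student_id"]] = {"name": row["name"], "major": row["major"]}
        grades.append(
            {
                "student_id": row["student_id"],
                "course_id": row["course_id"],
                "grade": row["grade"],
            }
        )
    return students, grades


students, grades = normalize(flat_rows)
print(students)
print(grades)
```

After the split, changing a student's major means updating one entry in students instead of every row that mentions the student.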
23
The Relational Model – Data Dictionary
Data Dictionary
• Is a document that database designers prepare to help
individuals enter data
• Provides several pieces of information about each
attribute in the database including:
- Name
- Key (is it a key or part of a key)
- Data Type (date, alpha-numeric, numeric, etc.)
- Valid Value (the format or numbers allowed)
• Can be used to enforce Business Rules which are
captured by the database designer to prevent illegal or
illogical values from entering the database. (e.g. who has
authority to enter certain kinds of data)
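A data dictionary can also be kept in machine-readable form and used to check incoming records. The sketch below is an assumption-laden illustration: the attribute names, valid-value rules, and error messages are hypothetical.

```python
import re
from datetime import date

# Hypothetical data dictionary: name -> key role, data type, valid-value rule.
DATA_DICTIONARY = {
    "student_id": {"key": "primary", "type": int,
                   "valid": lambda v: 1 <= v <= 999999},
    "name": {"key": None, "type": str,
             "valid": lambda v: 0 < len(v.strip()) <= 50},
    "birth_date": {"key": None, "type": date,
                   "valid": lambda v: v <= date.today()},
    "major": {"key": "secondary", "type": str,
              "valid": lambda v: re.fullmatch(r"[A-Z]{2,4}", v) is not None},
}


def validate(record):
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    for name, rule in DATA_DICTIONARY.items():
        if name not in record:
            errors.append(f"{name}: missing")
        elif not isinstance(record[name], rule["type"]):
            errors.append(f"{name}: expected {rule['type'].__name__}")
        elif not rule["valid"](record[name]):
            errors.append(f"{name}: value not allowed")
    return errors


good = {"student_id": 1, "name": "Ana", "birth_date": date(2002, 4, 9), "major": "MIS"}
bad = {"student_id": 0, "name": "Ben", "birth_date": date(2001, 5, 17), "major": "mis"}
print(validate(good))
print(validate(bad))
```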
24
Online Transactional Processing (OLTP)
Online Transactional Processing
• The mechanism by which customers, suppliers, and
employees process business transactions for an organization
• These users conduct transactions online through internal
systems and external Websites for processing and storage
Example
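A single business transaction in an OLTP system typically touches several tables and must either complete fully or not at all. The sketch below assumes a hypothetical order-entry case with inventory and orders tables; it relies on sqlite3's connection context manager, which commits on success and rolls back on an exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    "product_id INTEGER PRIMARY KEY, "
    "on_hand INTEGER NOT NULL CHECK (on_hand >= 0))"  # stock may not go negative
)
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, product_id INTEGER NOT NULL, "
    "quantity INTEGER NOT NULL)"
)
with conn:  # commit the starting stock level
    conn.execute("INSERT INTO inventory VALUES (1, 5)")


def place_order(conn, product_id, quantity):
    """Record the order and reduce stock as one all-or-nothing transaction."""
    try:
        with conn:  # commits if both statements succeed, rolls back otherwise
            conn.execute(
                "INSERT INTO orders (product_id, quantity) VALUES (?, ?)",
                (product_id, quantity),
            )
            conn.execute(
                "UPDATE inventory SET on_hand = on_hand - ? WHERE product_id = ?",
                (quantity, product_id),
            )
        return True
    except sqlite3.IntegrityError:  # the CHECK constraint rejected the update
        return False


def on_hand(conn, product_id):
    return conn.execute(
        "SELECT on_hand FROM inventory WHERE product_id = ?", (product_id,)
    ).fetchone()[0]


small_order = place_order(conn, 1, 3)   # enough stock
large_order = place_order(conn, 1, 10)  # not enough stock, nothing is stored
print(small_order, large_order, on_hand(conn, 1))
```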
25
Operational vs Informational
Systems
26
Organizational Use of Databases
Operational
• Department Databases
- Day-to-day department transactions
- Used primarily by departments
Informational
• Data Warehouse
- Extracted department transactions
- Used for business analysis
• Data Mart
- Extracted subset of a data warehouse
- Used for highly specific business analysis
[Figure: Department Databases -> Extract Data -> Data Warehouse -> Extract Data -> Data Mart]
27
Online Analytical Processing (OLAP)
Online Analytical Processing
• Graphical software tools that provide complex analysis
of data stored on a database
• OLAP tools enable users to analyze different
dimensions of data beyond data summary and data
aggregations of normal database queries
• The OLAP Server is the chief component of an OLAP
system which understands how the data is organized and
has special functions for analyzing data
• OLAP can provide time series and trend analysis views
of data, data drill-downs, and the ability to answer
“what-if” and “why” questions as part of its functions
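Analyzing data along different dimensions can be illustrated without an OLAP server. The sketch below uses plain Python on a handful of hypothetical sales facts: summing by year is a roll-up, and adding the region dimension is a drill-down. It is a toy model of the idea, not how an OLAP product is implemented.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions and one measure.
facts = [
    {"year": 2023, "region": "North", "product": "A", "amount": 100},
    {"year": 2023, "region": "North", "product": "B", "amount": 50},
    {"year": 2023, "region": "South", "product": "A", "amount": 70},
    {"year": 2024, "region": "North", "product": "A", "amount": 120},
    {"year": 2024, "region": "South", "product": "B", "amount": 30},
]


def rollup(facts, dimensions):
    """Sum the amount measure for each combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact["amount"]
    return dict(totals)


by_year = rollup(facts, ["year"])                      # summary view
by_year_region = rollup(facts, ["year", "region"])     # drill down into regions
print(by_year)
print(by_year_region)
```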
28
Data Mining
Data Mining
• Is a method companies use to analyze information to
better understand their customers, products, markets, or
any other phase of their business for which they have
data
• With data mining tools you can graphically drill down,
sort or extract data based on certain conditions, and
perform a variety of statistical analyses
• Data mining applications are very powerful and use highly
complex algorithms to analyze and to identify
opportunities
29
Data Warehouse Example
30
Uses of Data Warehousing
31
Data Life Cycle Process Continued
The result - generating knowledge
32
Data Sources
The data life cycle begins with the acquisition of data from data sources.
These sources can be classified as internal, personal, and external.
• Internal Data Sources are usually stored in the corporate database and
are about people, products, services, and processes.
• Personal Data is documentation on the expertise of corporate employees
usually maintained by the employee. It can take the form of:
– estimates of sales
– opinions about competitors
– business rules
– procedures
– etc.
• External Data Sources range from commercial databases to government
reports.
• Internet and Commercial Database Services are accessible through the
Internet.
33
Methods for Collecting Raw Data
The task of data collection is fairly complex, which can create data-quality
problems that require validation and cleansing of the data.
• Collection can take place
– in the field
– from individuals
– via manual methods:
  • time studies
  • surveys
  • observations
  • contributions from experts
– using instruments and sensors
– through transaction processing systems (TPS)
– via electronic transfer
– from a web site (clickstream)
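Validation and cleansing of collected data might look like the following sketch. The field names and the three rules (trim whitespace, lower-case e-mail addresses, reject records that are incomplete or duplicated) are assumptions chosen for illustration.

```python
def cleanse(records, required=("applicant_id", "name", "email")):
    """Return (clean, rejected); rejected pairs each raw record with a reason."""
    clean, rejected, seen_ids = [], [], set()
    for raw in records:
        # Cleansing: strip stray whitespace from every text field.
        rec = {k: v.strip() if isinstance(v, str) else v for k, v in raw.items()}
        if isinstance(rec.get("email"), str):
            rec["email"] = rec["email"].lower()
        # Validation: required fields must be present and non-empty.
        missing = [f for f in required if rec.get(f) in (None, "")]
        if missing:
            rejected.append((raw, "missing: " + ", ".join(missing)))
        elif rec["applicant_id"] in seen_ids:
            rejected.append((raw, "duplicate applicant_id"))
        else:
            seen_ids.add(rec["applicant_id"])
            clean.append(rec)
    return clean, rejected


raw_records = [
    {"applicant_id": 1, "name": " Ana ", "email": "ANA@Example.com "},
    {"applicant_id": 1, "name": "Ana", "email": "ana@example.com"},
    {"applicant_id": 2, "name": "", "email": "ben@example.com"},
]
clean, rejected = cleanse(raw_records)
print(clean)
print([reason for _, reason in rejected])
```

Keeping the rejected records together with a reason, rather than silently dropping them, leaves a trail that can be used to fix problems at the point of collection.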
34
Methods for managing data collection
One way to improve data collection from multiple external sources is to use
a data flow manager (DFM), which takes information from external sources
and puts it where it is needed, when it is needed, in a usable form.
• A DFM consists of
– a decision support system
– a central data request processor
– a data integrity component
– links to external data suppliers
– the processes used by the external data suppliers.
35
Data Quality and Integrity
Data quality (DQ) is an extremely important issue since quality determines
the data’s usefulness as well as the quality of the decisions based on the
data. Data integrity means that data must be accurate, accessible, and up-to-date.
• Intrinsic DQ: Accuracy, objectivity, believability, and
reputation.
• Accessibility DQ: Accessibility and access security.
• Contextual DQ: Relevancy, value added, timeliness,
completeness, amount of data.
• Representation DQ: Interpretability, ease of understanding,
concise representation, consistent representation.
Data quality is the cornerstone of effective business intelligence.
36
Knowledge Management Definitions
Knowledge Management
The process an organization uses to gain the greatest value
from its knowledge assets
Knowledge Assets
All underlying skills, routines, practices, principles, formulas,
methods, heuristics, and intuitions, whether explicit or tacit
Explicit Knowledge
Anything that can be documented, archived, or codified
often with the help of information systems
Tacit Knowledge
The processes and procedures for effectively performing
a particular task, stored in a person's mind
37
Knowledge Management System (KMS)
Best Practices
Procedures and processes that are widely accepted as
being among the most effective and/or efficient
Primary Objective
How to recognize, generate, store, share, and manage this tacit
knowledge (Best Practices) for deployment and use
Technology
Generally not a single technology but instead a collection
of tools that include communication technologies (e.g.
e-mail, groupware, instant messaging), and information
storage and retrieval systems (e.g. database
management system) to meet the Primary Objective
38
Knowledge – Knowledge Management Systems
The goal of knowledge management is for an organization to be aware of
individual and collective knowledge so that it may make the most effective
use of the knowledge it has. Firms recognize the need to integrate both
explicit and tacit knowledge into a formal information system, the Knowledge
Management System (KMS).
• A functioning knowledge management system follows six steps in a cycle,
dynamically refining information over time:
1. Create knowledge.
2. Capture knowledge.
3. Refine knowledge.
4. Store knowledge.
5. Manage knowledge.
6. Disseminate knowledge.
As knowledge is disseminated, individuals develop, create, and
identify new knowledge or update old knowledge, which they
feed back into the system.
39
Knowledge – Knowledge Management Systems (Continued)
[Figure: Knowledge Management Cycle]
40
Knowledge Management – Information Technology
Knowledge management is more than a technology or product; it is a
methodology applied to business practices. However, information technology
is crucial to the success of knowledge management systems.
• Components of Knowledge Management Systems:
– Communication technologies allow users to access needed
knowledge and to communicate with each other.
– Collaboration technologies provide the means to perform group
work.
– Storage and retrieval technologies (database management systems)
are used to store and manage knowledge.
41
Knowledge Management – Integration
Knowledge management systems integration.
42
Benefits and Challenges of
Knowledge Management
43