Data Mining
CS 341, Spring 2007
Lecture 2: Related Concepts (I)
© Prentice Hall

Related Concepts
Goal: Examine some areas which are related to data mining.
• Database Systems
• Decision Support System
• Data Warehousing
• Fuzzy Sets and Logic

Database Systems
• Database
  – A collection of related records
  – The conceptual view: table with rows and columns
• Schema
  – A structural description of the type of facts held in a database
  – E.g. (EmployeeID, Name, Address, Salary, JobNo)
  – Your own example?
• Database models: modeling database structure

Relational Database Models
• Based on the relational model developed by E.F. Codd in 1970
• Data and the relationships between them are organized in tables. Properties of relational tables:
  – Values are atomic
  – Each row is unique
  – Column values are of the same kind
  – The sequence of columns is insignificant
  – The sequence of rows is insignificant
  – Each column has a unique name
• Dominant in commercial data processing systems

Relational Database Model
• Relation: A rectangular table
  – Attribute: A column in the table
  – Tuple: A row in the table

A relation containing product information
ProdID  LocID       Date    Quantity  UnitPrice
123     Dallas      022900  5         25
123     Houston     020100  10        20
150     Dallas      031500  1         100
150     Dallas      031500  5         95
150     Fort Worth  021000  5         80
150     Chicago     012000  20        75
200     Seattle     030100  5         50
300     Rochester   021500  200       5
500     Bradenton   022000  15        20
500     Chicago     012000  10        25

[Figure: A relation containing employee information]
[Figure: A relation containing redundancy]
[Figure: An employee database consisting of three relations]
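
The relation, attribute, and tuple vocabulary can be made concrete with a small sketch. The Python below is only an illustration, not part of the lecture: it stores the product relation as a list of named tuples and checks two of the listed properties. The helper names `has_unique_rows` and `columns_same_kind` are hypothetical, and dates are kept as strings so that leading zeros survive.

```python
from collections import namedtuple

# Attributes (columns) of the product relation from the lecture.
Product = namedtuple("Product", ["ProdID", "LocID", "Date", "Quantity", "UnitPrice"])

# Tuples (rows) of the relation.
rows = [
    Product(123, "Dallas", "022900", 5, 25),
    Product(123, "Houston", "020100", 10, 20),
    Product(150, "Dallas", "031500", 1, 100),
    Product(150, "Dallas", "031500", 5, 95),
    Product(150, "Fort Worth", "021000", 5, 80),
    Product(150, "Chicago", "012000", 20, 75),
    Product(200, "Seattle", "030100", 5, 50),
    Product(300, "Rochester", "021500", 200, 5),
    Product(500, "Bradenton", "022000", 15, 20),
    Product(500, "Chicago", "012000", 10, 25),
]


def has_unique_rows(relation):
    """Property 'each row is unique': no two tuples are identical."""
    return len(set(relation)) == len(relation)


def columns_same_kind(relation):
    """Property 'column values are of the same kind': one Python type per column."""
    return all(len({type(value) for value in column}) == 1 for column in zip(*relation))


unique = has_unique_rows(rows)
same_kind = columns_same_kind(rows)
# Row order is insignificant: the same set of tuples in any order is the same relation.
order_free = set(rows) == set(reversed(rows))
```

Rows 3 and 4 share ProdID, LocID, and Date but differ in Quantity and UnitPrice, so the uniqueness check still holds.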

Relational Operations
• Select: Choose rows
• Project: Choose columns
• Join: Assemble information from two or more relations

[Figure: The SELECT operation]
[Figure: The PROJECT operation]
[Figure: The JOIN operation]
[Figure: Another example of the JOIN operation]
[Figure: An application of the JOIN operation]
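
Since the figures for these operations did not survive, a minimal sketch may help. The Python below implements select, project, and an equality join over relations stored as lists of dictionaries. The function names, the sample rows, and the `JobId` values are assumptions made for illustration; only the employee names come from the SQL examples in this lecture.

```python
def select(relation, predicate):
    """SELECT: choose the rows for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """PROJECT: choose columns, dropping any duplicate rows that result."""
    seen, result = set(), []
    for row in relation:
        reduced = tuple((name, row[name]) for name in attributes)
        if reduced not in seen:
            seen.add(reduced)
            result.append(dict(reduced))
    return result


def join(left, right, attribute):
    """JOIN: combine rows of two relations that agree on the shared attribute."""
    return [{**l, **r} for l in left for r in right if l[attribute] == r[attribute]]


# Hypothetical sample relations.
employees = [
    {"EmplId": "E1", "Name": "Joe E. Baker", "JobId": "J1"},
    {"EmplId": "E2", "Name": "G. Jerry Smith", "JobId": "J2"},
    {"EmplId": "E3", "Name": "Sue A. Burt", "JobId": "J1"},
]
jobs = [
    {"JobId": "J1", "Dept": "Sales"},
    {"JobId": "J2", "Dept": "Personnel"},
]

joined = join(employees, jobs, "JobId")
sales_rows = select(joined, lambda row: row["Dept"] == "Sales")
sales_names = project(sales_rows, ["Name"])
departments = project(joined, ["Dept"])
```

Here `sales_names` holds the names of the employees whose job is in the Sales department, which is the kind of question the JOIN application slide answers by combining relations.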

Structured Query Language (SQL)
• Operations to manipulate tuples
  – insert
  – update
  – delete
  – select

SQL Examples
• select EmplId, Dept
  from ASSIGNMENT, JOB
  where ASSIGNMENT.JobId = JOB.JobId
  and ASSIGNMENT.TermData = '*'
• insert into EMPLOYEE
  values ('43212', 'Sue A. Burt', '33 Fair St.', '444661111')

SQL Examples (continued)
• delete from EMPLOYEE
  where Name = 'G. Jerry Smith'
• update EMPLOYEE
  set Address = '1812 Napoleon Ave.'
  where Name = 'Joe E. Baker'
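
To try the four statements, one option is Python's built-in `sqlite3` module with an in-memory database. This is a sketch under assumptions: the slides do not give the table definitions, so the column lists below (including the `SSN` column name), the starting rows, and the `JobId` values are hypothetical and chosen only so that the statements run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Assumed schema and starting rows (hypothetical; not given in the slides).
cur.executescript("""
    create table EMPLOYEE (EmplId text, Name text, Address text, SSN text);
    create table JOB (JobId text, Dept text);
    create table ASSIGNMENT (EmplId text, JobId text, TermData text);
    insert into EMPLOYEE values ('10001', 'G. Jerry Smith', '1 Main St.', '111223333');
    insert into EMPLOYEE values ('10002', 'Joe E. Baker', '2 Oak St.', '222334444');
    insert into JOB values ('J1', 'Sales');
    insert into JOB values ('J2', 'Personnel');
    insert into ASSIGNMENT values ('10001', 'J1', '*');
    insert into ASSIGNMENT values ('10002', 'J2', '050199');
""")

# select: join ASSIGNMENT with JOB, keeping assignments whose TermData is '*'.
current = cur.execute("""
    select EmplId, Dept
    from ASSIGNMENT, JOB
    where ASSIGNMENT.JobId = JOB.JobId
    and ASSIGNMENT.TermData = '*'
""").fetchall()

# insert: add a new tuple to EMPLOYEE.
cur.execute("insert into EMPLOYEE values ('43212', 'Sue A. Burt', '33 Fair St.', '444661111')")

# delete: remove the tuple for one employee.
cur.execute("delete from EMPLOYEE where Name = 'G. Jerry Smith'")

# update: change one attribute of an existing tuple.
cur.execute("update EMPLOYEE set Address = '1812 Napoleon Ave.' where Name = 'Joe E. Baker'")
conn.commit()

names = [row[0] for row in cur.execute("select Name from EMPLOYEE order by Name")]
address = cur.execute("select Address from EMPLOYEE where Name = 'Joe E. Baker'").fetchone()[0]
```

Note that SQL string literals take straight single quotes; the curly quotes produced by slide software would be rejected by a database engine.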

Maintaining Database Integrity
• Transaction: A sequence of operations that must all happen together
  – Example: transferring money between bank accounts
• Transaction log: A non-volatile record of each transaction's activities, built before the transaction is allowed to execute
  – Commit point: The point at which a transaction has been recorded in the log
  – Roll-back: The process of undoing a transaction

Maintaining database integrity (continued)
• Simultaneous access problems
  – Incorrect summary problem
  – Lost update problem
• Locking = preventing others from accessing data being used by a transaction
  – Shared lock: used when reading data
  – Exclusive lock: used when altering data
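
The bank transfer example can be sketched with Python's `sqlite3` module, whose connection object commits a block on success and rolls it back when an exception escapes. The `ACCOUNT` table, the account identifiers, the balances, and the `transfer` function are all hypothetical, introduced only to show commit and roll-back; the logging and locking themselves are handled inside the database engine and are not shown.

```python
import sqlite3


def transfer(conn, source, target, amount):
    """Move `amount` between two accounts as one transaction: both updates or neither."""
    try:
        with conn:  # commits if the block finishes, rolls back if it raises
            conn.execute(
                "update ACCOUNT set Balance = Balance - ? where Id = ?", (amount, source)
            )
            conn.execute(
                "update ACCOUNT set Balance = Balance + ? where Id = ?", (amount, target)
            )
            (balance,) = conn.execute(
                "select Balance from ACCOUNT where Id = ?", (source,)
            ).fetchone()
            if balance < 0:
                # Abort after both updates ran; roll-back undoes them together.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("create table ACCOUNT (Id text primary key, Balance integer)")
conn.executemany("insert into ACCOUNT values (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

first = transfer(conn, "A", "B", 30)    # reaches its commit point
second = transfer(conn, "A", "B", 500)  # would overdraw A, so it is rolled back
balances = dict(conn.execute("select Id, Balance from ACCOUNT"))
```

After the failed transfer neither account has changed, which is the "must all happen together" guarantee in the definition of a transaction.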

Other Database Models
• Hierarchical model
  » A tree structure, parent-child relationships, 1:m mapping
• Network model
  » More than 1 parent per child, m:m mapping
• Object-oriented model
  » Add database functionality to an object-oriented programming language.

Database Systems
• A relational database management system
  – A software package used to create a database (Oracle, Microsoft SQL Server, MySQL)
• Database applications
  – Human resource management system
  – Sales management system
  – Inventory management system
  – Decision support system

[Figure: The conceptual layers of a database implementation]
[Figure: A database vs. a file]
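
The difference between the 1:m and m:m mappings can be sketched with plain Python dictionaries. This is a rough illustration of the two shapes rather than how any hierarchical or network database product stores data; the record names and the helpers `children` and `members` are hypothetical.

```python
# Hierarchical model: a tree, so every child record has exactly one parent (1:m).
parent_of = {
    "Sales": "Company",
    "Personnel": "Company",
    "Joe E. Baker": "Sales",
}


def children(parent_map, parent):
    """All child records stored under one parent in the tree."""
    return sorted(child for child, p in parent_map.items() if p == parent)


# Network model: a child may have more than one parent (m:m),
# so each child maps to a set of parents instead of a single one.
parents_of = {
    "Joe E. Baker": {"Sales", "Project X"},
    "Sue A. Burt": {"Sales"},
}


def members(parent_sets, parent):
    """All child records linked to one parent in the network."""
    return sorted(child for child, ps in parent_sets.items() if parent in ps)


company_units = children(parent_of, "Company")
sales_staff = members(parents_of, "Sales")
baker_parents = sorted(parents_of["Joe E. Baker"])
```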

Decision Support System
• Computer systems and related tools that assist managers in making decisions and solving problems.
• Built upon database systems; provide specific information needed by management
• More ad hoc and customized information
• DSS may use data mining tools on a data warehouse

What is a Data Warehouse
• The main repository of the organization's historical data
• Contains the raw material for management's decision support system
• The data warehouse is optimized for reporting and analysis
DM: May access data in warehouse.

What is a Data Warehouse
• Subject-oriented
  – Data related to the same event or object are linked together
• Time-variant
  – Changes to the data in the database are tracked and recorded
• Nonvolatile
  – Data in the warehouse are never deleted or changed
• Integrated
  – Contains data from all applications for an organization.
  – Kept consistent

Operational Data vs. Informational Data
• Operational Data: Data used in day to day needs of company.
• Informational Data: Supports other functions such as planning and forecasting.
• Data mining tools often access data warehouses rather than operational data.
DM: May access data in warehouse.

Operational vs. Informational
              Operational Data    Data Warehouse
Application   OLTP                OLAP
Use           Precise Queries     Ad Hoc
Temporal      Snapshot            Historical
Modification  Dynamic             Static
Orientation   Application         Business
Data          Operational Values  Integrated
Size          Gigabits            Terabits
Level         Detailed            Summarized
Access        Often               Less Often
Response      Few Seconds         Minutes
Data Schema   Relational          Star/Snowflake

OLAP
• Online Analytic Processing (OLAP): provides more complex queries than OLTP.
• Online Transaction Processing (OLTP): traditional database/transaction processing.
• Dimensional data; cube view
• Visualization of operations:
  – Slice: examine sub-cube.
  – Dice: rotate cube to look at another dimension.
  – Roll Up/Drill Down
DM: May use OLAP queries.
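
As a rough illustration of the cube operations, the sketch below treats the product relation from earlier in this lecture as a small cube with dimensions ProdID, LocID, and Date and the measure Quantity. The function names `slice_cube` and `roll_up` are hypothetical, and real OLAP servers precompute such aggregates rather than scanning rows as done here.

```python
from collections import defaultdict

columns = ("ProdID", "LocID", "Date", "Quantity", "UnitPrice")
facts = [dict(zip(columns, row)) for row in [
    (123, "Dallas", "022900", 5, 25),
    (123, "Houston", "020100", 10, 20),
    (150, "Dallas", "031500", 1, 100),
    (150, "Dallas", "031500", 5, 95),
    (150, "Fort Worth", "021000", 5, 80),
    (150, "Chicago", "012000", 20, 75),
    (200, "Seattle", "030100", 5, 50),
    (300, "Rochester", "021500", 200, 5),
    (500, "Bradenton", "022000", 15, 20),
    (500, "Chicago", "012000", 10, 25),
]]


def slice_cube(facts, **fixed):
    """Slice: keep the sub-cube where the named dimensions take the given values."""
    return [f for f in facts if all(f[dim] == value for dim, value in fixed.items())]


def roll_up(facts, dimensions, measure="Quantity"):
    """Roll up: total the measure over every dimension not listed in `dimensions`."""
    totals = defaultdict(int)
    for f in facts:
        totals[tuple(f[dim] for dim in dimensions)] += f[measure]
    return dict(totals)


dallas = slice_cube(facts, LocID="Dallas")
by_product = roll_up(facts, ["ProdID"])                   # summarized view
by_product_and_city = roll_up(facts, ["ProdID", "LocID"])  # drill down one level
```

Drilling down is the same call with more dimensions listed, so `by_product_and_city` breaks each product total in `by_product` into per-location totals.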

Fuzzy Sets and Logic
• Fuzzy Set: Set membership function is a real valued function with output in the range [0,1].
• f(x): Probability x is in F.
• 1-f(x): Probability x is not in F.
• EX:
  – T = {x | x is a person and x is tall}
  – Let f(x) be the probability that x is tall
  – Here f is the membership function
DM: Prediction and classification are fuzzy.

[Figure: Fuzzy Sets]
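
A membership function for the set T of tall people might look like the sketch below. The slides do not specify the shape of f, so the linear ramp and the 160 cm, 175 cm, and 190 cm values are assumptions chosen purely for illustration; `crisp_tall` shows the ordinary (non-fuzzy) set for comparison.

```python
def tall_membership(height_cm, low=160.0, high=190.0):
    """f(x): value in [0, 1] for how strongly a person of this height belongs to T."""
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    # Linear ramp between the two assumed cut-offs.
    return (height_cm - low) / (high - low)


def crisp_tall(height_cm, threshold=175.0):
    """Ordinary set membership: a person is either in T (1.0) or not (0.0)."""
    return 1.0 if height_cm >= threshold else 0.0


f_x = tall_membership(182.5)  # partly tall
not_f_x = 1 - f_x             # 1-f(x): membership in "not tall"
```

With the crisp set, heights of 174 cm and 176 cm fall on opposite sides of the threshold; with the fuzzy set they receive nearly the same value, which is the sense in which prediction and classification are fuzzy.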

Classification/Prediction is Fuzzy
[Figure: two panels, Simple and Fuzzy, each showing Accept and Reject regions; axis label: Loan Amnt]

Next Lecture:
• Information Retrieval, Question Answering, Web Search
• Reading assignments: Chapter 2