Virtual University of Pakistan
Data Warehousing
Lecture-29
Brief Intro. to Data Mining
Ahsan Abdullah
Assoc. Prof. & Head
Center for Agro-Informatics Research
www.nu.edu.pk/cairindex.asp
National University of Computers & Emerging Sciences, Islamabad
Email: [email protected]
1
DWH-Ahsan Abdullah
What is Data Mining?: Non-technical view
“There are things that we know that we
know…
there are things that we know that we
don’t know…
there are things that we don’t know we
don’t know.”
Donald Rumsfeld
US Secretary of Defence
What is Data Mining?: Slightly formal
What is Data Mining?: Formal view
Data mining digs out valuable, non-trivial
information from large, multidimensional,
apparently unrelated databases (data sets).
Why Data Mining? Huge volume
Claude Shannon's info. theory
More volume means
less information
Value vs. Volume
[Figure: Value of Data plotted against Volume of Data. Levels listed from top to bottom: Decision (Y/N), Decision Support, Knowledge, Information, Indexed Data, Raw Data.]
Why Data Mining?: Supply & Demand
Data Mining is HOT!
• 10 Hottest Jobs of the year 2025
  Time Magazine, 22 May 2000
• 10 emerging areas of technology
  MIT’s Magazine of Technology Review, Jan/Feb 2001
How is Data Mining different? Traditionally

• Knowledge Discovery (KDD)
• Data Mining (Knowledge-driven exploration)
• Data Warehouses (Data-driven exploration):
• Traditional Database (Transactions):
Data Mining Vs Statistics
Knowledge extraction using statistics
[Figure: Inflation vs. stock index increase. x-axis: Inflation (%), 1.6 to 6; y-axis: Stock increase (%), 0 to 40.]
Q: What will be the stock increase when inflation is 6%?
A: Approximate the non-linear relationship with a straight line y = mx + c and read off the value at 6%.
Hence the answer is 13%.
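The line-fitting step can be sketched in a few lines of Python. This is only an illustration of ordinary least squares: the data points below are hypothetical (the values behind the slide's chart are not available), so the result is not expected to reproduce the 13% figure. The names `fit_line` and `predict` are likewise invented for this sketch.

```python
# Hypothetical data, NOT the values plotted on the slide.
inflation = [1.0, 2.0, 3.0, 4.0, 5.0]          # inflation (%)
stock_rise = [32.0, 24.0, 19.0, 16.0, 15.0]    # stock index increase (%)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + c; returns (m, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


def predict(m, c, x):
    """Read the fitted line off at x."""
    return m * x + c


m, c = fit_line(inflation, stock_rise)
print(f"y = {m:.2f}x + {c:.2f}")
print(f"Predicted stock increase at 6% inflation: {predict(m, c, 6.0):.1f}%")
```

Note that 6% lies outside the hypothetical data range used here, so the prediction is an extrapolation of a straight line through a curved relationship and should be read with caution.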
Failure of regression models
[Figure: chart showing the fitted polynomial below. x-axis: 0 to 35; y-axis: -10000 to 70000.]
y = -0.0127x^6 + 1.5029x^5 - 63.627x^4 + 1190.3x^3 - 9725.3x^2 + 31897x - 29263
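The slide gives no commentary beyond its title. One plausible reading, offered here as an assumption, is that a high-degree polynomial fit behaves erratically, especially outside the fitted range. The sketch below simply evaluates the sixth-degree polynomial printed on the slide at a few points, including x = 40, which lies beyond the chart's x-axis. The function name `poly` is invented for this sketch.

```python
# Coefficients exactly as printed on the slide, highest power first.
COEFFS = [-0.0127, 1.5029, -63.627, 1190.3, -9725.3, 31897.0, -29263.0]


def poly(x):
    """Evaluate the polynomial at x using Horner's rule."""
    result = 0.0
    for coeff in COEFFS:
        result = result * x + coeff
    return result


# x = 0..35 is the range shown on the chart; x = 40 is an extrapolation.
for x in (0, 10, 20, 30, 35, 40):
    print(f"x = {x:>2}  y = {poly(x):,.1f}")
```

Because the printed coefficients are rounded, the computed values may not match the original chart closely; small changes in the coefficients of high powers are strongly amplified at large x.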
Data Mining is…
• Decision Trees
• Neural Networks
• Rule Induction (If ... Then ...)
• Clustering
• Genetic Algorithms
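To make the "If ... Then ..." form of rule induction concrete, here is a minimal sketch of the 1R (One Rule) method, a simple rule learner chosen for illustration and not necessarily the one intended by the lecture. The records, attribute names, and the function `one_r` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy records (not from the lecture): features -> market move.
records = [
    ({"inflation": "low", "rates": "falling"}, "up"),
    ({"inflation": "low", "rates": "rising"}, "up"),
    ({"inflation": "high", "rates": "rising"}, "down"),
    ({"inflation": "high", "rates": "falling"}, "down"),
    ({"inflation": "low", "rates": "rising"}, "up"),
    ({"inflation": "high", "rates": "falling"}, "up"),
]


def one_r(rows):
    """Pick the single attribute whose value -> majority-class rules
    make the fewest errors on the training rows.
    Returns (attribute, {value: predicted_class}, error_count)."""
    best = None
    for attr in rows[0][0].keys():
        counts = defaultdict(Counter)
        for features, label in rows:
            counts[features[attr]][label] += 1
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


attr, rules, errors = one_r(records)
for value, label in rules.items():
    print(f"If {attr} = {value} Then market = {label}")
print(f"Training errors: {errors} of {len(records)}")
```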
Data Mining is NOT ...
• Data warehousing
• Ad Hoc Query / Reporting
• Online Analytical Processing (OLAP)
• Data Visualization
• Software Agents
Data Mining: Business Perspective
• “Knowledge” is worth knowing if it can be used to
increase profit, either by lowering cost or by
raising revenue.
Business questions
• Profiling/Segmentation
• Cross-Service
• Employee retention: