An overview of
The IBM Intelligent Miner for Data
By: Neeraja Rudrabhatla
11/04/1999
Mining Features supported by the Data Miner:
• Association Rules
• Clustering - Demographic, Neural networks
• Predicting classifications - Neural Networks, Decision Trees
• Predicting values
• Discovering sequential patterns
• Discovering similar time sequences
Steps for mining data using the Data Miner:
• Create the data
• Analyze and prepare data for mining
• Mine the data using one or a combination of mining techniques
• Visualize mining results using advanced graphical techniques
Main Window of the Data Miner:
Database used for mining association rules:
Store ID  Customer #  Date (yymmdd)  Transaction #  ItemID
001       0000007     950109         00982          122
001       0000007     950109         00982          125
001       0000007     950109         00982          133
001       0000007     950109         00982          150
001       0000003     950109         00983          153
001       0000003     950109         00983          154
001       0000003     950109         00983          162
001       0000003     950109         00983          166
001       0000005     950109         00984          147
001       0000005     950109         00984          174
001       0000005     950109         00984          191
001       0000005     950109         00984          198
001       0000008     950109         00985          147
001       0000008     950109         00985          174
001       0000008     950109         00985          182
001       0000008     950109         00985          184
001       0000006     950109         00986          174
001       0000006     950109         00986          186
001       0000006     950109         00986          187
001       0000006     950109         00986          188
001       0000002     950109         00987          109
Name Mapping:
ItemID  Name
101     Cream
102     A-Beer
103     B-Beer
104     C-Beer
105     Stout
107     Export
108     Cider
109     Milk
110     Antifreeze
111     Port wine
112     White German
113     Red German wi
114     White French
115     Red French wi
116     White Italian
117     Red Italian w
118     Sherry
119     Champagne
120     Sekt
121     Asti Spumante
122     Crackers
123     Salty biscuit
124     Crisps
125     Cheddar Chees
126     Gouda Cheese
127     Cottage chees
128     Irish Butter
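As a rough illustration of what an association-rule miner computes from the transaction rows listed above, the following Python sketch groups item IDs by transaction number and derives support and confidence for single-item rules. The function names and the thresholds (0.3 and 0.6) are assumptions chosen for this example, and the brute-force pair enumeration is not claimed to be the algorithm the Intelligent Miner itself uses.

```python
from collections import defaultdict
from itertools import combinations

# (transaction number, item ID) pairs copied from the transaction table above.
ROWS = [
    ("00982", 122), ("00982", 125), ("00982", 133), ("00982", 150),
    ("00983", 153), ("00983", 154), ("00983", 162), ("00983", 166),
    ("00984", 147), ("00984", 174), ("00984", 191), ("00984", 198),
    ("00985", 147), ("00985", 174), ("00985", 182), ("00985", 184),
    ("00986", 174), ("00986", 186), ("00986", 187), ("00986", 188),
    ("00987", 109),
]


def build_baskets(rows):
    """Group item IDs by transaction number into one set per transaction."""
    baskets = defaultdict(set)
    for transaction, item in rows:
        baskets[transaction].add(item)
    return list(baskets.values())


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(body, head, baskets):
    """Fraction of the baskets containing body that also contain head."""
    return support(set(body) | set(head), baskets) / support(body, baskets)


def pair_rules(baskets, min_support, min_confidence):
    """List single-item rules body -> head that pass both thresholds."""
    items = sorted(set().union(*baskets))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, baskets)
        if pair_support < min_support:
            continue
        for body, head in ((a, b), (b, a)):
            conf = confidence({body}, {head}, baskets)
            if conf >= min_confidence:
                rules.append((body, head, pair_support, conf))
    return rules


if __name__ == "__main__":
    baskets = build_baskets(ROWS)
    # Thresholds are illustrative assumptions, not values from the slides.
    for body, head, sup, conf in pair_rules(baskets, 0.3, 0.6):
        print(f"{body} -> {head}  support={sup:.2f}  confidence={conf:.2f}")
```

With these thresholds the only item pair that survives is 147 and 174, which appear together in two of the six transactions.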
Results of mining for associations:
Results on the automobile Database:
Another view:
Database used for Clustering:
Gender  Age    Siblings  Income  Type   Product
female  18.02  1         97      red    2
female  13.03  6         490     green  3
male    11.0   3         647     red    4
female  47.5   2         3192    green  5
male    11.07  5         736     blue   6
female  24.0   3         22358   blue   7
female  62.1   0         3936    green  8
female  04.08  1         516     pink   1
female  40.1   0         9478    red    2
female  04.08  0         193     pink   3
female  45.8   5         16984   green  4
male    21.07  0         10428   blue   5
male    07.02  0         960     blue   6
female  42.5   0         10835   pink   7
female  36.9   2         37083   green  8
male    10.03  3         877     blue   1
male    02.03  0         10      blue   2
female  20.0   0         15432   green  3
Clustering - Demographic:
Max #clusters: 9
Accuracy: 5%
Details of Cluster 7:
Detailed pie-chart for attribute Type:
Detailed bar-graph of attribute Age:
Output obtained with Clustering using Neural Networks:
Details of Cluster 6:
Database used for Classification:
Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No
Classification using Decision Tree:
A view of a leaf node of the decision tree:
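To make the idea of choosing a split attribute concrete, the sketch below computes entropy-based information gain for each attribute of the PlayTennis records listed above. Information gain is an assumption made for illustration: the slides do not say which split criterion the Intelligent Miner tree classifier applies, and the function names here are hypothetical.

```python
import math
from collections import Counter

ATTRIBUTES = ("Outlook", "Temperature", "Humidity", "Wind")

# The 14 PlayTennis records from the classification table; last field is the class.
RECORDS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(records, attribute_index):
    """Entropy reduction obtained by partitioning records on one attribute."""
    labels = [record[-1] for record in records]
    partitions = {}
    for record in records:
        partitions.setdefault(record[attribute_index], []).append(record[-1])
    remainder = sum(
        len(part) / len(records) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


def best_split(records):
    """Return the attribute with the highest gain, plus all gains by name."""
    gains = {name: information_gain(records, i) for i, name in enumerate(ATTRIBUTES)}
    return max(gains, key=gains.get), gains


if __name__ == "__main__":
    best, gains = best_split(RECORDS)
    for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
        print(f"{name:12s} {gain:.3f}")
    print("Root split candidate:", best)
```

On these 14 records the sketch ranks Outlook first (gain of about 0.247), followed by Humidity (about 0.152), so under this criterion Outlook would be tested at the root of the tree.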
Classification using neural network:
In-sample: 4
Out-Sample: 1
Accuracy: 80
Error: 10
Learning Rate: 0.1
Momentum: 0.9
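The learning rate and momentum settings above are easier to interpret with a small example. The sketch below applies the classical momentum update to a single weight, using the values 0.1 and 0.9 from the slide. The one-weight quadratic error function is a hypothetical stand-in for a network's error surface, and the update rule is the textbook form, which is assumed here rather than taken from the tool's documentation.

```python
def train_with_momentum(gradient, weight, learning_rate=0.1, momentum=0.9, steps=300):
    """Gradient descent with classical momentum on a single weight.

    Each step blends the previous update (scaled by momentum) with the
    new downhill step (scaled by learning_rate).
    """
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity - learning_rate * gradient(weight)
        weight += velocity
    return weight


def toy_gradient(weight):
    """Gradient of the hypothetical error E(w) = (w - 3)^2, minimised at w = 3."""
    return 2.0 * (weight - 3.0)


if __name__ == "__main__":
    print(train_with_momentum(toy_gradient, weight=0.0))
```

A larger learning rate takes bigger steps per update, while a momentum close to 1 carries more of the previous step forward, which smooths the path but can cause overshooting before the weight settles.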
Viewing the results in bar-graphs:
Database for Value Prediction:
Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     80           High      Weak    No
D2   Sunny     75           High      Strong  No
D3   Overcast  70           High      Weak    Yes
D4   Rain      55           High      Weak    Yes
D5   Rain      32           Normal    Weak    Yes
D6   Rain      35           Normal    Strong  No
D7   Overcast  40           Normal    Strong  Yes
D8   Sunny     60           High      Weak    No
D9   Sunny     20           Normal    Weak    Yes
D10  Rain      67           Normal    Weak    Yes
D11  Sunny     62           Normal    Strong  Yes
D12  Overcast  58           High      Strong  Yes
D13  Overcast  74           Normal    Weak    Yes
D14  Rain      61           High      Strong  No
Results of PlayTennis:
In-sample: 2
Out-sample: 1
One partition of the PlayTennis-Prediction:
Textual Representation of a single partition:
Sequential Patterns Mining and Time Sequence Mining:
• Sequential patterns are used to find predictable patterns of behavior
over a period of time.
(A certain behavior at a given time is likely to produce another
behavior or a sequence of behaviors within a certain time-span)
• Time sequences help find all occurrences of similar subsequences in a
database of time sequences.
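The two ideas above can be sketched in a few lines of Python. The first function counts how many customers show one item followed by another within a time span; the second slides a window over two numeric series and reports windows that lie within a distance threshold of each other. All names, the brute-force search, and the Euclidean distance measure are assumptions made for illustration; the slides do not describe the algorithms the Intelligent Miner uses.

```python
import math


def sequence_support(histories, first, then, max_gap):
    """Fraction of customers whose history has `first` followed by `then`
    within max_gap time units.

    histories maps a customer ID to a list of (time, item) pairs.
    """
    hits = 0
    for events in histories.values():
        found = any(
            a == first and b == then and 0 < tb - ta <= max_gap
            for ta, a in events
            for tb, b in events
        )
        hits += found
    return hits / len(histories)


def similar_subsequences(series_a, series_b, window, epsilon):
    """Return (i, j) offsets where the length-`window` slices starting at
    series_a[i] and series_b[j] are within Euclidean distance epsilon."""
    matches = []
    for i in range(len(series_a) - window + 1):
        for j in range(len(series_b) - window + 1):
            distance = math.sqrt(
                sum(
                    (x - y) ** 2
                    for x, y in zip(series_a[i:i + window], series_b[j:j + window])
                )
            )
            if distance <= epsilon:
                matches.append((i, j))
    return matches


if __name__ == "__main__":
    # Hypothetical purchase histories: customer -> [(day, item), ...]
    histories = {
        "c1": [(1, "A"), (3, "B")],
        "c2": [(1, "A"), (9, "B")],
        "c3": [(2, "B"), (4, "A")],
        "c4": [(1, "A"), (2, "C"), (4, "B")],
    }
    print(sequence_support(histories, "A", "B", max_gap=5))

    # Hypothetical time series.
    a = [1.0, 2.0, 3.0, 2.0, 1.0]
    b = [0.0, 1.1, 2.1, 2.9, 5.0]
    print(similar_subsequences(a, b, window=3, epsilon=0.3))
```

Both functions scan every combination, which is fine for a handful of records but would need indexing or pruning on a real database.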
Sequences:
• Combine several objects into a single object that you can run
• The benefit is that you can combine several steps into one step
• If you combine several functions into a sequence, you need to run only
the sequence, which then runs each of the objects within it
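The behavior described above resembles a composite object: running the sequence runs each member in order. The sketch below models that idea with two hypothetical classes, `Task` and `Sequence`. These names and the `run` method are illustrative assumptions and do not correspond to the Intelligent Miner's actual interface.

```python
class Task:
    """Hypothetical stand-in for a single runnable object, such as a
    preprocessing or mining function."""

    def __init__(self, name, action=None):
        self.name = name
        self.action = action

    def run(self):
        # Execute the optional callable, then report what ran.
        if self.action is not None:
            self.action()
        return [self.name]


class Sequence:
    """Groups several runnable objects so that they can be run in one step."""

    def __init__(self, name, members):
        self.name = name
        self.members = list(members)

    def run(self):
        # Run each member in order; a member may itself be a Sequence.
        executed = []
        for member in self.members:
            executed.extend(member.run())
        return executed


if __name__ == "__main__":
    prepare = Sequence("prepare", [Task("filter records"), Task("discretize income")])
    pipeline = Sequence("full run", [prepare, Task("cluster"), Task("visualize")])
    print(pipeline.run())
```

Because a `Sequence` exposes the same `run` method as a `Task`, sequences can be nested, which is one way to read "combine several objects into a single object that you can run".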
Applications:
The Intelligent Miner offerings are intended for use by Data Analysts and
Business Technologists in the following areas:
• Perform database marketing
• Streamline business and manufacturing processes
• Detect potential cases of fraud
• Support customer relationship management