Classification and Prediction
Ming-Yen Lin, IECS, FCU

Classification
  Predicts categorical class labels (discrete or nominal).
  Constructs a model from training data whose classifying attribute values (the class labels) are known, and uses that model to classify new data.
Prediction
  Models continuous-valued functions; used to predict unknown or missing values.
Classification Problem
Given
  a database D = {t1, t2, ..., tn} and
  a set of classes C = {C1, ..., Cm},
the classification problem is to define a mapping f: D -> C where each ti is assigned to one class.
The mapping actually divides D into equivalence classes.
Prediction is similar, but may be viewed as having an infinite number of classes.
Classification Ex: Grading
If x >= 90 then grade = A.
If 80 <= x < 90 then grade = B.
If 70 <= x < 80 then grade = C.
If 60 <= x < 70 then grade = D.
If x < 60 then grade = F.
(Figure: the corresponding decision tree tests x against 90, 80, 70, and 60; the branches >=90, >=80, >=70, >=60, and <60 lead to the leaves A, B, C, D, and F.)
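To make the mapping f: D -> C concrete, the grading rules above can be written directly as a tiny classifier. This is a minimal Python sketch; the function name and the test scores are illustrative, not part of the original example:

    def grade(x: float) -> str:
        """Map a numeric score to a grade class, following the rules above."""
        if x >= 90:
            return "A"
        elif x >= 80:
            return "B"
        elif x >= 70:
            return "C"
        elif x >= 60:
            return "D"
        else:
            return "F"

    print([grade(s) for s in (95, 83, 71, 65, 40)])  # ['A', 'B', 'C', 'D', 'F']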
Classification Ex: Letter Recognition
View letters as constructed from 5 components.
(Figure: sample shapes classified as Letter A, Letter B, Letter C, Letter D, Letter E, and Letter F.)
Classification Examples
Teachers classify students' grades as A, B, C, D, or F.
Identify mushrooms as poisonous or edible.
Classify a river as flooding or not.
Identify an individual as a certain credit risk.
Speech recognition: identify the spoken "word" from the classified examples (rules).
Pattern recognition: identify a pattern from the classified ones.
Typical applications
  Credit approval, target marketing, medical diagnosis, treatment effectiveness analysis.
Classification—Two Major Steps
Model construction: describe a set of predetermined classes
  Every training record already has a predetermined class (given by the class label attribute).
  Training set: the data used to construct the model.
  The model may be represented as (1) classification rules, (2) decision trees, or (3) mathematical formulae.
Model usage: classify future or unknown objects
  Estimate the accuracy of the model:
    Feed test samples of known class into the model and compare the model's answer with the known class.
    Accuracy rate: the proportion of test set samples correctly classified by the model.
    The test set must be independent of the training set; otherwise the model over-fits.
  If the accuracy is acceptable, use the model to classify data items whose class labels are unknown.
Most common techniques use DTs, NNs, or are based on distances or statistical methods.
Process 1: Model Construction
The training data are fed to a classification algorithm, which produces the classifier (model).

Training data:
  NAME  RANK            YEARS  TENURED
  Mike  Assistant Prof  3      no
  Mary  Assistant Prof  7      yes
  Bill  Professor       2      yes
  Jim   Associate Prof  7      yes
  Dave  Assistant Prof  6      no
  Anne  Associate Prof  3      no

Resulting classifier (model):
  IF rank = 'professor' OR years > 6
  THEN tenured = 'yes'
Process 2: Model Usage
The classifier is applied to testing data to estimate accuracy, and then to unseen data.

Testing data:
  NAME     RANK            YEARS  TENURED
  Tom      Assistant Prof  2      no
  Merlisa  Associate Prof  7      no
  George   Professor       5      yes
  Joseph   Assistant Prof  7      yes

Unseen data: (Jeff, Professor, 4) -> Tenured?
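The rule model from Process 1 can be applied directly to these records. A minimal Python sketch, using the rule and the tables above (the helper name and record layout are illustrative):

    def tenured(rank: str, years: int) -> str:
        # The model learned in Process 1: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
        return "yes" if rank == "professor" or years > 6 else "no"

    test_set = [("Tom", "assistant prof", 2, "no"), ("Merlisa", "associate prof", 7, "no"),
                ("George", "professor", 5, "yes"), ("Joseph", "assistant prof", 7, "yes")]
    correct = sum(tenured(rank, yrs) == label for _, rank, yrs, label in test_set)
    print("accuracy on the test set:", correct / len(test_set))   # 0.75 -- Merlisa is misclassified
    print("unseen (Jeff, Professor, 4):", tenured("professor", 4))  # 'yes'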
Defining Classes
Distance Based
Partitioning Based
Supervised vs. Unsupervised Learning
Supervised learning (classification)
  Supervision: the training data (observations, measurements, etc.) carry labels indicating the class of each observation.
  New data are classified based on the training set.
Unsupervised learning (clustering)
  The class labels of the training data are unknown.
  Given a set of observations or measurements, the goal is to establish the classes or clusters that exist in the data.
Issue (1): Data Preparation
Data cleaning
  Preprocess data in order to reduce noise and handle missing values.
Relevance analysis, i.e., attribute (feature) selection
  Remove irrelevant or redundant attributes.
Data transformation
  Generalize and/or normalize data.
Missing data
  Ignore, or replace with an assumed value.
Issue (2): Evaluating Classification Methods
Predictive accuracy
  Classification accuracy on test data
  Confusion matrix
  OC curve
Speed and scalability
  How long does it take to build the classifier?
  How quickly can the classifier be applied?
Robustness
  Handling noise and missing values
Scalability
  Efficiency on disk-resident databases
Interpretability
  Understanding of the model and the insight it provides
  Goodness measures for the rules:
    decision tree size
    compactness of classification rules
Height Example Data
Output1 and Output2 are two alternative class assignments for the same data; their accuracy can be compared.

  Name       Gender  Height  Output1  Output2
  Kristina   F       1.6m    Short    Medium
  Jim        M       2m      Tall     Medium
  Maggie     F       1.9m    Medium   Tall
  Martha     F       1.88m   Medium   Tall
  Stephanie  F       1.7m    Short    Medium
  Bob        M       1.85m   Medium   Medium
  Kathy      F       1.6m    Short    Medium
  Dave       M       1.7m    Short    Medium
  Worth      M       2.2m    Tall     Tall
  Steven     M       2.1m    Tall     Tall
  Debbie     F       1.8m    Medium   Medium
  Todd       M       1.95m   Medium   Medium
  Kim        F       1.9m    Medium   Tall
  Amy        F       1.8m    Medium   Medium
  Wynette    F       1.75m   Medium   Medium
Classification Performance
True Positive
False Negative
False Positive
True Negative
Confusion Matrix Example
Using the height data example, with Output1 as the correct class and Output2 as the actual assignment:

  Actual      Assignment
  Membership  Short  Medium  Tall
  Short       0      4       0
  Medium      0      5       3
  Tall        0      1       2
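The matrix and the accuracy of Output2 against Output1 can be reproduced from the height table. A small sketch (variable names are illustrative):

    from collections import Counter

    # Output1 = correct class, Output2 = assigned class, taken from the height table above
    output1 = ["Short","Tall","Medium","Medium","Short","Medium","Short","Short",
               "Tall","Tall","Medium","Medium","Medium","Medium","Medium"]
    output2 = ["Medium","Medium","Tall","Tall","Medium","Medium","Medium","Medium",
               "Tall","Tall","Medium","Medium","Tall","Medium","Medium"]

    confusion = Counter(zip(output1, output2))          # (actual, assigned) -> count
    for actual in ("Short", "Medium", "Tall"):
        row = [confusion[(actual, a)] for a in ("Short", "Medium", "Tall")]
        print(actual, row)                              # Short [0,4,0]  Medium [0,5,3]  Tall [0,1,2]
    accuracy = sum(a == b for a, b in zip(output1, output2)) / len(output1)
    print("accuracy:", accuracy)                        # 7/15, about 0.47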
Operating Characteristic Curve
Classification Using Decision Trees
Partitioning based: divide the search space into rectangular regions.
A tuple is placed into a class based on the region within which it falls.
DT approaches differ in how the tree is built: DT induction.
Internal nodes are associated with an attribute, and arcs with values of that attribute.
Algorithms: ID3, C4.5, CART
Decision Tree
Given:
  D = {t1, ..., tn} where ti = <ti1, ..., tih>
  Database schema contains {A1, A2, ..., Ah}
  Classes C = {C1, ..., Cm}
A decision (or classification) tree is a tree associated with D such that
  each internal node is labeled with an attribute Ai,
  each arc is labeled with a predicate that can be applied to the attribute at its parent, and
  each leaf node is labeled with a class Cj.
DT Induction
DT Splits Area
(Figure: the attribute space split into regions, first on Gender (M/F) and then on Height.)
Comparing DTs
(Figure: two trees for the same data, one balanced and one deep.)
Decision Tree Induction
Training dataset (buys_computer):

  age      income  student  credit_rating  buys_computer
  <=30     high    no       fair           no
  <=30     high    no       excellent      no
  31...40  high    no       fair           yes
  >40      medium  no       fair           yes
  >40      low     yes      fair           yes
  >40      low     yes      excellent      no
  31...40  low     yes      excellent      yes
  <=30     medium  no       fair           no
  <=30     low     yes      fair           yes
  >40      medium  yes      fair           yes
  <=30     medium  yes      excellent      yes
  31...40  medium  no       excellent      yes
  31...40  high    yes      fair           yes
  >40      medium  no       excellent      no
Output: A Decision Tree for "buys_computer"
  age?
    <=30     -> student?
                  no  -> no
                  yes -> yes
    31...40  -> yes
    >40      -> credit_rating?
                  excellent -> no
                  fair      -> yes
(an example from Quinlan's ID3)
Decision Tree Induction Algorithm
Basic method (a greedy algorithm)
  The tree is constructed top-down, recursively, in a divide-and-conquer manner.
  At the start, all training examples are at the root.
  Only categorical attributes are handled (continuous-valued attributes are discretized in advance).
  Examples are recursively partitioned based on a selected attribute.
  Attributes are selected on the basis of (1) a heuristic or (2) a statistical measure (e.g., information gain).
Conditions for stopping the partitioning
  All samples at a node belong to the same class.
  There are no remaining attributes to partition on; the leaf's class is then decided by majority vote.
  There are no samples left.
(A minimal sketch of this procedure is given below.)
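A minimal Python sketch of this greedy, top-down, divide-and-conquer procedure, assuming categorical attributes and some attribute-scoring function such as information gain (all names here are illustrative, not from the slides):

    from collections import Counter

    def build_tree(rows, attributes, target, score, parent_majority=None):
        """rows: list of dicts; attributes: candidate split attributes;
        target: the class attribute; score(rows, attr): goodness of splitting on attr."""
        if not rows:                                   # no samples left: fall back to parent's majority class
            return parent_majority
        classes = [r[target] for r in rows]
        majority = Counter(classes).most_common(1)[0][0]
        if len(set(classes)) == 1 or not attributes:   # pure node, or no attributes left (majority vote)
            return majority
        best = max(attributes, key=lambda a: score(rows, a))     # greedy attribute choice
        tree = {"attribute": best, "branches": {}}
        for value in set(r[best] for r in rows):                 # partition on the chosen attribute
            subset = [r for r in rows if r[best] == value]
            rest = [a for a in attributes if a != best]
            tree["branches"][value] = build_tree(subset, rest, target, score, majority)
        return tree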
Decision Tree Induction is often based on Information Theory
Information
(Figure: bowls containing marbles from different classes, used to illustrate how much information an arrangement conveys; discussed below.)
DT Induction
When all the marbles in the bowl are mixed up,
little information is given.
When the marbles in the bowl are all from one
class and those in the other two classes are on
either side, more information is given.
Use this approach with DT Induction !
Information/Entropy
Given probabilities p1, p2, ..., ps whose sum is 1, entropy is defined as:
  H(p1, p2, ..., ps) = sum_i pi * log(1/pi)
Entropy measures the amount of randomness, surprise, or uncertainty.
Goal in classification: no surprise, entropy = 0.
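A small sketch of this definition, using log base 2 (the function name is illustrative):

    import math

    def entropy(probs):
        """H(p1, ..., ps) = sum_i pi * log2(1/pi), with 0 * log(1/0) taken as 0."""
        return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))               # 1.0  (maximum surprise for two classes)
    print(entropy([1.0]))                    # 0.0  (no surprise: the classification goal)
    print(round(entropy([9/14, 5/14]), 3))   # 0.94, as used in the buys_computer example below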
Entropy
(Figure: plots of log(1/p) and of the binary entropy H(p, 1-p) as functions of p.)
Attribute Selection: Information Gain (ID3/C4.5)
Select the attribute with the highest information gain.
Let S contain si tuples of class Ci, for i = 1, ..., m, with s = s1 + ... + sm tuples in total.
The expected information required to classify an arbitrary tuple is
  I(s1, s2, ..., sm) = - sum_{i=1..m} (si/s) * log2(si/s)
The entropy of attribute A with values {a1, a2, ..., av} is
  E(A) = sum_{j=1..v} ((s1j + ... + smj)/s) * I(s1j, ..., smj)
where sij is the number of tuples of class Ci in the subset with A = aj.
The information gained by branching on attribute A is
  Gain(A) = I(s1, s2, ..., sm) - E(A)
Information Gain Example
Class P: buys_computer = "yes" (9 tuples); Class N: buys_computer = "no" (5 tuples).
  I(p, n) = I(9, 5) = 0.940
Compute the entropy for age (using the training data above):

  age      pi  ni  I(pi, ni)
  <=30     2   3   0.971
  31...40  4   0   0
  >40      3   2   0.971

  E(age) = (5/14) I(2,3) + (4/14) I(4,0) + (5/14) I(3,2) = 0.694

Here (5/14) I(2,3) means that "age <= 30" covers 5 of the 14 tuples, of which 2 are 'yes' and 3 are 'no'. Hence
  Gain(age) = I(p, n) - E(age) = 0.246
Similarly,
  Gain(income) = 0.029
  Gain(student) = 0.151
  Gain(credit_rating) = 0.048
so age is chosen as the splitting attribute at the root.
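The gains above can be checked with a short script over the buys_computer table (helper names are illustrative; the printed values match the slide up to rounding):

    import math
    from collections import Counter

    data = [  # (age, income, student, credit_rating, buys_computer), from the training table above
        ("<=30","high","no","fair","no"), ("<=30","high","no","excellent","no"),
        ("31...40","high","no","fair","yes"), (">40","medium","no","fair","yes"),
        (">40","low","yes","fair","yes"), (">40","low","yes","excellent","no"),
        ("31...40","low","yes","excellent","yes"), ("<=30","medium","no","fair","no"),
        ("<=30","low","yes","fair","yes"), (">40","medium","yes","fair","yes"),
        ("<=30","medium","yes","excellent","yes"), ("31...40","medium","no","excellent","yes"),
        ("31...40","high","yes","fair","yes"), (">40","medium","no","excellent","no"),
    ]
    attrs = {"age": 0, "income": 1, "student": 2, "credit_rating": 3}

    def info(labels):                       # I(s1, ..., sm)
        n = len(labels)
        return -sum(c/n * math.log2(c/n) for c in Counter(labels).values())

    def gain(attr):                         # Gain(A) = I(s1, ..., sm) - E(A)
        idx = attrs[attr]
        total = info([row[-1] for row in data])
        e = 0.0
        for value in set(row[idx] for row in data):   # E(A): weighted info of each partition
            part = [row[-1] for row in data if row[idx] == value]
            e += len(part) / len(data) * info(part)
        return total - e

    for a in attrs:
        print(a, gain(a))   # age ~0.247, income ~0.029, student ~0.152, credit_rating ~0.048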
Generating Rules from a Decision Tree
Represent the knowledge as IF-THEN rules.
  One rule is generated for each path from the root to a leaf.
  Each attribute-value pair along the path forms a conjunct of the rule's condition.
  The leaf holds the class prediction.
  Rules are easier for people to understand.
Example
  IF age = "<=30" AND student = "no" THEN buys_computer = "no"
  IF age = "<=30" AND student = "yes" THEN buys_computer = "yes"
  IF age = "31...40" THEN buys_computer = "yes"
  IF age = ">40" AND credit_rating = "excellent" THEN buys_computer = "no"
  IF age = ">40" AND credit_rating = "fair" THEN buys_computer = "yes"
(A sketch of extracting such rules from a tree follows.)
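A minimal sketch of this path-to-rule idea, reusing the nested-dictionary tree representation from the induction sketch earlier (names are illustrative):

    def extract_rules(tree, conditions=()):
        """Walk every root-to-leaf path; the attribute-value pairs on the path
        become the conjuncts, and the leaf becomes the predicted class."""
        if not isinstance(tree, dict):                  # leaf: emit one rule
            cond = " AND ".join(f'{a} = "{v}"' for a, v in conditions) or "TRUE"
            return [f'IF {cond} THEN class = "{tree}"']
        rules = []
        for value, subtree in tree["branches"].items():
            rules += extract_rules(subtree, conditions + ((tree["attribute"], value),))
        return rules

    # e.g. for the buys_computer tree above:
    tree = {"attribute": "age", "branches": {
        "<=30": {"attribute": "student", "branches": {"no": "no", "yes": "yes"}},
        "31...40": "yes",
        ">40": {"attribute": "credit_rating", "branches": {"excellent": "no", "fair": "yes"}}}}
    print("\n".join(extract_rules(tree)))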
Avoiding Overfitting
Overfitting: the induced tree may overfit the training data.
  Too many branches; some may reflect anomalies due to noise or outliers.
  Accuracy on unseen samples is poor.
Two approaches
  Prepruning: halt tree construction early.
    Do not split a node if the split would cause the goodness measure to fall below a threshold.
    Choosing an appropriate threshold is difficult.
  Postpruning: remove branches from a "fully grown" tree.
    This yields a sequence of progressively pruned trees.
    Use a set of data different from the training data to decide which pruned tree is best.
Approaches to Determining the Final Tree Size
Separate training (2/3) and testing (1/3) sets
  Used for data sets with a large number of samples.
Use cross-validation, e.g., 10-fold cross-validation
  Divide the data set into k subsamples.
  Use k-1 subsamples as training data and one subsample as test data (k-fold cross-validation).
  Suitable for data sets of moderate size.
Use all the data for training
  But apply a statistical test (e.g., chi-square) to estimate whether expanding or pruning a node improves the overall distribution.
Use the minimum description length (MDL) principle
  Stop growing the tree when the encoding is minimized.
Information Gain Example (step by step)
YES: 9, NO: 5, total = 14
P(YES) = 9/14 = 0.64, P(NO) = 5/14 = 0.36
I(Y,N) = -(9/14 * log2(9/14) + 5/14 * log2(5/14)) = 0.94

  age      buys_computer
  <=30     no
  <=30     no
  31...40  yes
  >40      yes
  >40      yes
  >40      no
  31...40  yes
  <=30     no
  <=30     yes
  >40      yes
  <=30     yes
  31...40  yes
  31...40  yes
  >40      no

Age (<=30):   YES 2, NO 3;  I(Y,N) = -(2/5 * log2(2/5) + 3/5 * log2(3/5)) = 0.97
Age (31..40): YES 4, NO 0;  I(Y,N) = -(4/4 * log2(4/4) + 0/4 * log2(0/4)) = 0
Age (>40):    YES 3, NO 2;  I(Y,N) = -(3/5 * log2(3/5) + 2/5 * log2(2/5)) = 0.97
E(age) = 5/14 * I(Y,N)age(<=30) + 4/14 * I(Y,N)age(31..40) + 5/14 * I(Y,N)age(>40) = 0.694
Gain(age) = 0.94 - 0.694 = 0.246
Example: Error Rate
Test set: 84 tuples, of which 2/3 (56) go down one subtree of the root and 1/3 (28) go down the other.
The error rate of a leaf is
  Ei = (number of tuples misclassified at the leaf) / (number of tuples reaching the leaf)
for example, E1 = 2/(6+2).
The error rate of the whole tree is the sum over the leaves, weighted by the fraction of test tuples reaching each leaf:
  Etree = 2/3 * (1/7*E1 + 3/7*E2 + 3/7*E3) + 1/3 * (3/4*E4 + 1/4*E5)
Example of Overfitting
Before the split: "Yes": 11, "No": 9, predict Yes.
  Entropy = -(11/20 log2(11/20) + 9/20 log2(9/20)) = 0.993
After the split, one branch has "Yes": 3, "No": 0 (predict Yes) and the other has "Yes": 8, "No": 9 (predict No).
  Average entropy = 17/20 * (-(8/17 log2(8/17) + 9/17 log2(9/17))) = 0.848
The small pure branch (3 samples) lowers the average entropy, but such a branch may simply reflect noise in the training data; this is how overfitting shows up.
Enhancements to the Basic Decision Tree
Allow continuous-valued attributes
  Dynamically define new discrete attributes that partition the continuous attribute values into a set of discrete intervals.
Handle missing attribute values
  Assign the most common value of the attribute, or
  assign a probability to each of the possible values.
Attribute construction
  Create new attributes from existing, sparsely represented ones.
  This reduces fragmentation, repetition, and replication.
DT Issues
Choosing Splitting Attributes
Ordering of Splitting Attributes
Splits
Tree Structure
Stopping Criteria
Training Data
Pruning
ID3
Creates the tree using information theory concepts, trying to reduce the expected number of comparisons.
ID3 chooses the split attribute with the highest information gain:
  Gain(A) = I(s1, s2, ..., sm) - E(A)    (as defined above)
ID3 Example (Output1)
Starting state entropy (logarithms base 10 here):
  4/15 log(15/4) + 8/15 log(15/8) + 3/15 log(15/3) = 0.4384
Gain using gender:
  Female: 3/9 log(9/3) + 6/9 log(9/6) = 0.2764
  Male: 1/6 log(6/1) + 2/6 log(6/2) + 3/6 log(6/3) = 0.4392
  Weighted sum: (9/15)(0.2764) + (6/15)(0.4392) = 0.34152
  Gain: 0.4384 - 0.34152 = 0.09688
Gain using height:
  0.4384 - (2/15)(0.301) = 0.3983
Choose height as the first splitting attribute.
CART, C4.5, CHAID
CART (Classification And Regression Tree)
  Maximize diversity(before) - diversity(after).
  Diversity: a node containing only a single class has low diversity.
    e.g., entropy/information
    compute I(2,3,9)?
C4.5
  Post-pruning.
CHAID
  The tree is "pruned" by halting its construction early (pre-pruning).
  CHAID is restricted to categorical variables.
    Continuous variables must be broken into ranges or replaced with classes such as high, medium, low.
C4.5
ID3 favors attributes with a large number of divisions.
C4.5 is an improved version of ID3 that adds handling of:
  Missing data
  Continuous data
  Pruning
  Rules
GainRatio:
  GainRatio(A) = Gain(A) / SplitInfo(A),
  where SplitInfo(A) = - sum_{j=1..v} (sj/s) * log2(sj/s) and sj is the number of tuples taking value aj of A.
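A small sketch of the GainRatio computation under the SplitInfo definition assumed above; the age split of the buys_computer data is used as the example, and the names are illustrative:

    import math
    from collections import Counter

    def split_info(values):
        """SplitInfo(A): entropy of the sizes of the partitions induced by attribute A."""
        n = len(values)
        return -sum(c/n * math.log2(c/n) for c in Counter(values).values())

    def gain_ratio(gain_a, attribute_values):
        return gain_a / split_info(attribute_values)

    # age partitions the 14 buys_computer tuples into groups of 5, 4, and 5; Gain(age) = 0.246
    ages = ["<=30"] * 5 + ["31...40"] * 4 + [">40"] * 5
    print(round(split_info(ages), 3), round(gain_ratio(0.246, ages), 3))  # 1.577, 0.156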
CART
Creates a binary tree.
Uses entropy.
Formula to choose split point, s, for node t:
PL, PR: probability that a tuple in the training set will be on the left or right side of the tree.
CART Example
At the start, there are six choices for split
point (right branch on equality):
P(Gender)=2(6/15)(9/15)(2/15 + 4/15 + 3/15)=0.224
P(1.6) = 0
P(1.7) = 2(2/15)(13/15)(0 + 8/15 + 3/15) = 0.169
P(1.8) = 2(5/15)(10/15)(4/15 + 6/15 + 3/15) = 0.385
P(1.9) = 2(9/15)(6/15)(4/15 + 2/15 + 3/15) = 0.256
P(2.0) = 2(12/15)(3/15)(4/15 + 8/15 + 3/15) = 0.32
Split at 1.8
Classification in Large Databases
Classification—a classical problem extensively studied
by statisticians and machine learning researchers
Scalability: Classifying data sets with millions of
examples and hundreds of attributes with reasonable
speed
Why decision tree induction in data mining?
relatively faster learning speed (than other classification
methods)
convertible to simple and easy to understand classification
rules
can use SQL queries for accessing databases
comparable classification accuracy with other methods
Scalable Decision Tree Induction in Data Mining
SLIQ (EDBT’96 — Mehta et al.)
builds an index for each attribute and only class list and the
current attribute list reside in memory
SPRINT (VLDB’96 — J. Shafer et al.)
constructs an attribute list data structure
PUBLIC (VLDB’98 — Rastogi & Shim)
integrates tree splitting and tree pruning: stop growing the tree
earlier
RainForest (VLDB’98 — Gehrke, Ramakrishnan &
Ganti)
separates the scalability aspects from the criteria that
determine the quality of the tree
builds an AVC-list (attribute, value, class label)
Presentation of Classification Results
(Figure: screenshot of classification results presented in a data mining tool.)
Decision Tree in SGI/MineSet 3.0
(Figure: screenshot of a decision tree visualization in SGI/MineSet 3.0.)
Tool: Decision Tree
C4.5, the "classic" decision-tree tool, developed by J. R. Quinlan
  http://www.cse.unsw.edu.au/~quinlan
Classification Tree in Excel
EC4.5, a more efficient version of C4.5
  http://www-kdd.di.unipi.it/software
IND, provides Gini and C4.5 decision trees
  http://ic.arc.nasa.gov/projects/bayes-group/ind/INDprogram.html
Regression
Assume the data fit a predefined function; determine the best values for the regression coefficients c0, c1, ..., cn.
Assume an error term: y = c0 + c1*x1 + ... + cn*xn + e
Estimate the error using the mean squared error over the training set:
  MSE = (1/n) * sum_{i=1..n} (yi - (c0 + c1*xi1 + ... + cn*xin))^2
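A minimal sketch of least-squares fitting and the training-set MSE for the one-variable case y = c0 + c1*x + e (the data points are made up purely for illustration):

    def fit_line(xs, ys):
        """Closed-form least-squares estimates of c0, c1 for y = c0 + c1*x + e."""
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        c1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
             sum((x - mean_x) ** 2 for x in xs)
        c0 = mean_y - c1 * mean_x
        return c0, c1

    xs = [1, 2, 3, 4, 5]            # toy data for illustration only
    ys = [1.2, 1.9, 3.2, 3.8, 5.1]
    c0, c1 = fit_line(xs, ys)
    mse = sum((y - (c0 + c1 * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)  # mean squared error on the training set
    print(round(c0, 3), round(c1, 3), round(mse, 4))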
Linear Regression Poor Fit
Classification Using Regression
Division: Use regression function to divide area
into regions.
Prediction: Use regression function to predict a
class membership function. Input includes
desired class.
Division
(Figure: a regression line dividing the attribute space into class regions.)
Prediction
(Figure: regression functions used as class membership functions.)
Classification Using Distance
Place items in class to which they are
“closest”.
Must determine distance between an item
and a class.
Classes represented by
Centroid: Central value.
Medoid: Representative point.
Individual points
Algorithm: KNN
K Nearest Neighbor (KNN)
The training set includes the classes.
Examine the K items nearest to the item to be classified.
The new item is placed in the class with the largest number of these close items.
O(q) for each tuple to be classified (here q is the size of the training set).
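A minimal KNN sketch following this description, using Euclidean distance and majority vote (the toy training set and names are illustrative):

    from collections import Counter

    def knn_classify(training, new_item, k=3):
        """training: list of (feature_vector, class); classify new_item by majority
        vote among its k nearest neighbours (Euclidean distance)."""
        def dist(a, b):
            return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
        neighbours = sorted(training, key=lambda tc: dist(tc[0], new_item))[:k]
        return Counter(cls for _, cls in neighbours).most_common(1)[0][0]

    # toy example using height (in metres) as the single feature
    train = [((1.6,), "Short"), ((1.9,), "Medium"), ((1.88,), "Medium"),
             ((1.7,), "Short"), ((2.2,), "Tall"), ((2.1,), "Tall")]
    print(knn_classify(train, (1.95,), k=3))   # 'Medium' (nearest: 1.9, 1.88, 2.1)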
KNN
KNN Algorithm
Classification Using Neural Networks
Typical NN structure for classification:
  One output node per class.
  The output value is the class membership function value.
Supervised learning
  For each tuple in the training set, propagate it through the NN and adjust the weights on the edges to improve future classification.
Algorithms: Propagation, Backpropagation, Gradient Descent
NN Issues
Number of source nodes
Number of hidden layers
Training data
Number of sinks
Interconnections
Weights
Activation Functions
Learning Technique
When to stop learning
Decision Tree vs. Neural Network
Propagation
(Figure: a tuple's input values propagated through the network to produce outputs.)
NN Propagation Algorithm
Example Propagation
NN Learning
Adjust weights to perform better with the
associated test data.
Supervised: Use feedback from knowledge of
correct classification.
Unsupervised: No knowledge of correct
classification needed.
NN Supervised Learning
Supervised Learning
Possible error values assuming the output from node i is yi but should be di:
Change weights on arcs based on the estimated error.
NN Backpropagation
Propagate changes to the weights backward from the output layer to the input layer.
Delta rule: Δwij = c * xij * (dj - yj)
Gradient descent: a technique to modify the weights in the graph.
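A one-step sketch of the delta rule as stated above (names and numbers are illustrative):

    def delta_rule_update(weights, x, d, y, c=0.1):
        """One delta-rule step for output node j: w_ij <- w_ij + c * x_ij * (d_j - y_j)."""
        return [w + c * xi * (d - y) for w, xi in zip(weights, x)]

    w = [0.5, -0.3]
    print(delta_rule_update(w, x=[1.0, 2.0], d=1, y=0))  # approximately [0.6, -0.1]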
Backpropagation
Backpropagation Algorithm
Gradient Descent
Gradient Descent Algorithm
Output Layer Learning
Hidden Layer Learning
Types of NNs
Different NN structures used for different
problems.
Perceptron
Self Organizing Feature Map
Radial Basis Function Network
Perceptron
Perceptron is one of the simplest NNs.
No hidden layers.
Perceptron Example
Suppose:
  Summation: S = 3*x1 + 2*x2 - 6
  Activation: if S > 0 then output 1, else 0
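This perceptron can be written directly (a minimal sketch):

    def perceptron(x1, x2):
        """Perceptron with weights 3 and 2 and bias -6, step activation."""
        s = 3 * x1 + 2 * x2 - 6          # summation
        return 1 if s > 0 else 0         # activation: output 1 when S > 0

    print(perceptron(1, 1), perceptron(2, 1), perceptron(3, 0))  # 0 1 1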
Self Organizing Feature Map (SOFM)
Also called SOM.
Competitive, unsupervised learning.
Based on observations of how neurons work in the brain:
  Firing impacts the firing of those nearby.
  Neurons far apart inhibit each other.
  Neurons have specific nonoverlapping tasks.
Ex: Kohonen network
Kohonen Network
Kohonen Network
The competitive layer is viewed as a 2D grid.
Similarity between competitive nodes and input nodes:
  Input: X = <x1, ..., xh>
  Weights of node i: <w1i, ..., whi>
  Similarity is defined based on the dot product.
The competitive node most similar to the input "wins", and the winning node's weights (as well as those of surrounding nodes) are increased.
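A minimal single-step sketch of the competitive update described above; neighborhood updates are omitted for brevity, and all names are illustrative:

    def kohonen_step(grid_weights, x, lr=0.2):
        """grid_weights: dict mapping node id -> weight vector.
        The node whose weights are most similar to x (dot product) wins,
        and the winner's weights are moved toward x."""
        def dot(a, b):
            return sum(ai * bi for ai, bi in zip(a, b))
        winner = max(grid_weights, key=lambda n: dot(grid_weights[n], x))
        grid_weights[winner] = [w + lr * (xi - w) for w, xi in zip(grid_weights[winner], x)]
        return winner

    weights = {"n1": [0.9, 0.1], "n2": [0.2, 0.8]}
    print(kohonen_step(weights, [1.0, 0.0]), weights)   # 'n1' wins and moves toward the input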
Radial Basis Function Network
RBF function has Gaussian shape
RBF Networks
Three Layers
Hidden layer – Gaussian activation function
Output layer – Linear activation function
Radial Basis Function Network
(Figure: RBF network architecture with an input layer, a Gaussian hidden layer, and a linear output layer.)
Classification Using Rules
Perform classification using IF-THEN rules.
Classification rule: r = <a, c> (antecedent, consequent).
Rules may be generated from other techniques (DT, NN) or generated directly.
Algorithms: Gen, RX, 1R, PRISM
Generating Rules from DTs
Generating Rules Example
Generating Rules from NNs
1R Algorithm
1R Example
PRISM Algorithm
PRISM Example
Decision Tree vs. Rules
Tree: has an implied order in which the splitting is performed; the tree is created by looking at all classes.
Rules: have no ordering of predicates; only one class needs to be examined to generate its rules.
Advantages and disadvantages of the decision tree approach?