COMP 1942
Classification (Nearest Neighbor Classifier and Neural Network)
TA: Harry Chan
Email: [email protected]
Outline

- Nearest Neighbor Classifier
- Neural Network
Review: NN Classifier

Given a set of objects and their labels, determine the label of a new object o.

- NN Classifier
  - Find o's nearest neighbor.
  - Use the label of this neighbor.
- k-NN Classifier
  - Find o's k nearest neighbors (k-NN query).
  - Use the label that the majority of its neighbors share.
- Data type of variables
  - Input variables: real
  - Output variable: categorical
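The slides use XLMiner's built-in classifier, whose internals are not shown. As a minimal sketch of what the NN and k-NN classifiers do (assuming Euclidean distance and simple majority voting):

```python
from collections import Counter
import math

def knn_predict(train, labels, o, k=1):
    """Predict the label of a new object o from its k nearest neighbors.

    train  -- list of feature vectors (real-valued input variables)
    labels -- categorical label of each training point
    k=1 gives the plain NN classifier; larger k gives the k-NN classifier.
    """
    # Euclidean distance from o to every training point
    dists = sorted((math.dist(x, o), y) for x, y in zip(train, labels))
    # Majority vote among the k nearest neighbors
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

# Toy data: two classes in 2-D
train = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train, labels, (1, 1), k=3))   # → A
```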
Using k-NN Classifier in XLMiner

Two ways to access k-Nearest Neighbors:

- "Add-ins" tab → XLMiner → Classification → k-Nearest Neighbors
- "XLMiner Platform" tab → Classify → k-Nearest Neighbors
Steps

- Step 1: Specify the data range, variables and output variable.
- Step 2: Specify the scoring options, prior class probabilities and partition options.
- Step 3: Specify the output options.
Step 1

[Screenshot: specifying the data source and the variables]
Step 2

[Screenshot: scoring options and the parameter k of the k-NN classifier]

- "Score on specified value of k as above": scoring is done one time, with the given k.
- "Score on best k between 1 and specified value": scoring is done multiple times, with k = 1, 2, …, k.
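XLMiner's "score on best k" option can be sketched as follows, assuming the best k is the one with the smallest % error on the validation partition (the criterion the later results slide shows):

```python
from collections import Counter
import math

def knn_predict(train, labels, o, k):
    """k-NN prediction: majority label among the k nearest training points."""
    dists = sorted((math.dist(x, o), y) for x, y in zip(train, labels))
    return Counter(y for _, y in dists[:k]).most_common(1)[0][0]

def best_k(train, train_labels, valid, valid_labels, max_k):
    """'Score on best k between 1 and specified value': classify the
    validation partition once for each k = 1, ..., max_k and return
    the k with the smallest % error."""
    def pct_error(k):
        wrong = sum(knn_predict(train, train_labels, x, k) != y
                    for x, y in zip(valid, valid_labels))
        return 100.0 * wrong / len(valid)
    return min(range(1, max_k + 1), key=pct_error)

# Toy data: a stray "B" point sits inside the "A" cluster, so k = 1
# mislabels the validation point but a larger k votes it down.
train = [(0, 0), (1, 1), (0, 1), (0.6, 0.6), (4, 4), (5, 5)]
train_labels = ["A", "A", "A", "B", "B", "B"]
valid, valid_labels = [(0.5, 0.5)], ["A"]
print(best_k(train, train_labels, valid, valid_labels, 3))   # → 3
```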
Step 2

[Screenshot: prior probabilities]
Step 3

[Screenshot: output options]
Example 1

- Dataset: Iris.xls
- Input variables: Petal_width, Petal_length, Sepal_width, Sepal_length
- Output variable: Species_name
- Parameters
  - Normalize input data
  - Number of nearest neighbors: 10
- Scoring option
  - Score on best k between 1 and specified value
- Data partition
  - Training data: 50%
  - Validation data: 30%
  - Test data: 20%
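The partition and normalization settings above can be sketched as follows. The split fractions are from the slide; the z-score formula is one common way to "normalize input data", and XLMiner's exact formula may differ:

```python
import random

def partition(rows, seed=0):
    """Randomly split rows into 50% training, 30% validation and
    20% test, the partition used in Example 1."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n = len(rows)
    n_train, n_valid = int(0.5 * n), int(0.3 * n)
    return (rows[:n_train],
            rows[n_train:n_train + n_valid],
            rows[n_train + n_valid:])

def zscore(train_col):
    """Return a function that standardizes a value using the training
    column's mean and standard deviation (an assumed normalization;
    the slide does not show XLMiner's formula)."""
    mean = sum(train_col) / len(train_col)
    var = sum((v - mean) ** 2 for v in train_col) / len(train_col)
    std = var ** 0.5 or 1.0
    return lambda v: (v - mean) / std

train, valid, test = partition(list(range(10)))
print(len(train), len(valid), len(test))   # → 5 3 2
```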
Results

[Screenshot: finding the best k between 1 and the specified value; the k with the smallest % error, k = 5, is the best]
Results

[Screenshot: with k = 5 used, a list of data points with the predicted results]
Outline

- Nearest Neighbor Classifier
- Neural Network
Review: Neural Network

[Diagram: a two-input perceptron; the front computes net from the inputs, the back applies the threshold function to produce the output y]

The front combines the inputs x1, x2 with weights w1, w2 and parameter b:

  net = w1*x1 + w2*x2 + b

The back applies the threshold function:

  y = 1 if net ≥ 0
      0 if net < 0

Example: with w1 = 0.8, w2 = 0.2, b = -0.5 and input x1 = 1, x2 = 0:

  net = 0.8*1 + 0.2*0 + (-0.5) = 0.3 ≥ 0, so y = 1
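The forward pass above is small enough to check directly. A sketch with the slide's values:

```python
def perceptron(x1, x2, w1, w2, b):
    """Forward pass of the two-input perceptron from the slide:
    net = w1*x1 + w2*x2 + b, then the threshold function."""
    net = w1 * x1 + w2 * x2 + b
    return 1 if net >= 0 else 0

# Slide example: w1 = 0.8, w2 = 0.2, b = -0.5, input (x1, x2) = (1, 0)
# net = 0.8*1 + 0.2*0 - 0.5 = 0.3 >= 0, so y = 1
print(perceptron(1, 0, 0.8, 0.2, -0.5))   # → 1
```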
Review: Learning Process

- Let η be the learning rate (a real number).
- Learning is done by
  - wi ← wi + η(d − y)xi   (updating the weight wi)
  - b ← b + η(d − y)   (updating the parameter b)
- where
  - d is the desired output
  - y is the output of our neural network
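The update rules above can be sketched as a training loop. Starting the weights at zero and the choice of η and epoch count are arbitrary choices for this sketch, not from the slides:

```python
def train_perceptron(samples, eta=1.0, epochs=20):
    """Perceptron learning rule from the slide:
        wi <- wi + eta * (d - y) * xi
        b  <- b  + eta * (d - y)
    samples is a list of ((x1, x2), d) pairs; eta is the learning rate.
    Weights start at 0 (an arbitrary choice for this sketch)."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), d in samples:
            net = w1 * x1 + w2 * x2 + b
            y = 1 if net >= 0 else 0   # threshold function
            w1 += eta * (d - y) * x1
            w2 += eta * (d - y) * x2
            b += eta * (d - y)
    return w1, w2, b

# Learn the AND function (linearly separable, so the rule converges)
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w1, w2, b = train_perceptron(samples)
for (x1, x2), d in samples:
    assert (1 if w1 * x1 + w2 * x2 + b >= 0 else 0) == d
```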
Using Neural Network in XLMiner

Two ways to access Neural Network:

- "Add-ins" tab → XLMiner → Classification → Neural Network → Manual Network
- "XLMiner Platform" tab → Classify → Neural Network → Manual Network
Steps

- Step 1: Specify the data range, variables and output variable.
- Step 2: Specify the network architecture, training options, activation function and partition options.
- Step 3: Specify the output options.
Step 1

[Screenshot: a familiar interface for specifying the data and variables]
Step 2

[Screenshot: network architecture configurations]
Step 3

[Screenshot: mining result options (similar to before)]
Example 2

- Dataset: Wine.xls (178 records of wines)
- Step 1: Transform categorical data
  - Data Utilities → Transform Categorical Data → Create category scores…
  - Transform "Type" to the numeric variable "Type_ord"
  - Assign numbers 1, 2, 3, …
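The transform in Step 1 can be sketched as follows, assuming categories are simply numbered 1, 2, 3, … (here in order of first appearance; XLMiner may order them differently):

```python
def category_scores(values):
    """Turn a categorical column into numeric scores 1, 2, 3, ...
    (a sketch of what 'Create category scores' does when turning
    'Type' into 'Type_ord')."""
    mapping = {}
    for v in values:
        # Assign the next number the first time a category is seen
        mapping.setdefault(v, len(mapping) + 1)
    return [mapping[v] for v in values], mapping

types = ["A", "A", "B", "C", "B"]
scores, mapping = category_scores(types)
print(scores)    # → [1, 1, 2, 3, 2]
print(mapping)   # → {'A': 1, 'B': 2, 'C': 3}
```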
Example 2

- Step 2: Neural Network
  - Input variables: all variables except "Type_ord"
  - Output variable: Type_ord
  - Parameters:
    - # hidden layers: 1
    - # nodes: 25
    - # epochs: 1000
  - Data partition
    - Training data: 80%
    - Validation data: 20%
Results

[Screenshot: list of data points with the predicted results]
Results

[Screenshot: training log]