These lecture notes are based on the following paper: B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE Computational Intelligence Magazine, vol. 10, no. 3, pp. 52-60, Aug. 2015.

ENN: Extended Nearest Neighbor Method for Pattern Recognition
Prof. Haibo He, Electrical Engineering, University of Rhode Island, Kingston, RI 02881
Computational Intelligence and Self-Adaptive Systems (CISA) Laboratory
http://www.ele.uri.edu/faculty/he/
Email: [email protected]

Extended Nearest Neighbor for Pattern Recognition
1. Limitations of K-Nearest Neighbors (KNN)
2. "Two-way communication": Extended Nearest Neighbors (ENN)
3. Experimental Analysis
4. Conclusion

Pattern Recognition
ü  Parametric Classifier §  Class-­‐wise density es6ma6on, including naive Bayes, mixture Gaussian, etc. ü  Non-­‐Parametric Classifier Nonparametric nature §  Nearest Neighbors Easy implementa6on §  Neural Network Powerfulness §  Support Vector Machine Robustness Consistency Limitations of traditional KNN
Scale-Sensitive Problem: The class 1 samples dominate their near neighborhood with higher density (i.e., a more concentrated distribution), while the class 2 samples are distributed in regions of lower density (i.e., a more spread-out distribution). Class 2 samples that lie close to the region of class 1 may therefore easily be misclassified.
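To make this concrete, below is a minimal sketch (not from the paper; the sample sizes, means, variances, and function names are illustrative assumptions of mine) showing how a concentrated class absorbs nearby points from a more spread-out class under plain KNN:

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.normal(loc=0.0, scale=1.0, size=(200, 2))   # class 1: concentrated
X2 = rng.normal(loc=3.0, scale=4.0, size=(200, 2))   # class 2: spread out
X = np.vstack([X1, X2])
y = np.array([0] * 200 + [1] * 200)

def knn_predict(X_train, y_train, z, k=5):
    """Classic KNN: majority vote among the k nearest training samples."""
    d = np.linalg.norm(X_train - z, axis=1)
    votes = y_train[np.argsort(d)[:k]]
    return np.bincount(votes).argmax()

# Fresh class-2 samples that happen to fall near the dense class-1 region
# tend to be absorbed by class 1's dominant local density.
Z = rng.normal(loc=3.0, scale=4.0, size=(500, 2))
near = Z[np.linalg.norm(Z, axis=1) < 3.0]            # close to class 1's center
pred = np.array([knn_predict(X, y, z) for z in near])
print(f"{np.mean(pred == 0):.0%} of these class-2 points are misclassified as class 1")
```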
ENN: A New Approach
Define a generalized class-wise statistic for each class i:

T_i = (1 / (n_i · k)) · Σ_{x ∈ S_i} Σ_{r=1}^{k} I_r(x, S),  where I_r(x, S) = 1 if NN_r(x, S) ∈ S_i and 0 otherwise.

Here S_i denotes the set of n_i samples in class i, S = S_1 ∪ … ∪ S_N is the whole training set, and NN_r(x, S) denotes the r-th nearest neighbor of x in S.

T_i measures the coherence of data from the same class: 0 ≤ T_i ≤ 1, with T_i = 1 when all the nearest neighbors of class-i data are also from class i, and T_i = 0 when all the nearest neighbors are from other classes.
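A minimal numpy sketch of this statistic, following the definition above (the function and variable names are my own, not the paper's):

```python
import numpy as np

def class_wise_statistic(X, y, k):
    """Return T[i] for every class label i: the fraction of the k nearest
    neighbors of class-i samples (over the whole set, excluding the sample
    itself) that also belong to class i."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)                # a point is not its own neighbor
    nn = np.argsort(D, axis=1)[:, :k]          # indices of each point's k NNs
    same = (y[nn] == y[:, None])               # which neighbors share the class
    return {c: same[y == c].mean() for c in np.unique(y)}

# e.g. class_wise_statistic(X, y, k=5) on the synthetic data above
```

Averaging the indicator over the n_i class-i samples and their k neighbors is exactly the double sum divided by n_i · k.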
Intra-class coherence: Θ = Σ_{i=1}^{N} T_i, the sum of the generalized class-wise statistics over all N classes.

Given an unknown sample z to be classified, we tentatively assign it to class 1 and to class 2, respectively, to obtain two new sets of generalized class-wise statistics T_i^j, where j = 1, 2 indexes the tentative assignment. The sample z is then classified by the ENN classification rule, the maximum gain of intra-class coherence: z is assigned to the class j whose tentative assignment yields the largest total coherence Σ_i T_i^j, or equivalently the largest gain Σ_i (T_i^j − T_i), since the original T_i do not depend on j.
For N-class classification the same rule reads f_ENN(z) = argmax_{j ∈ {1, …, N}} Σ_{i=1}^{N} T_i^j. To avoid recalculating the generalized class-wise statistics from scratch for every test sample in the testing stage, an equivalent version of ENN is proposed. The equivalent version gives the same result as the original rule but avoids the recalculation of the T_i^j: when z is tentatively added, only the contributions of z's own k nearest neighbors and of the samples that gain z as one of their k nearest neighbors change, so the decision can be computed from these local changes alone.
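A direct sketch of the rule as stated above (my own naive implementation, reusing class_wise_statistic from the earlier sketch; it recomputes the statistics for every tentative label rather than using the paper's faster equivalent version):

```python
def enn_predict(X, y, z, k):
    """Classify z by the maximum gain of intra-class coherence (naive ENN)."""
    Xz = np.vstack([X, z])                  # z's features are fixed ...
    best_class, best_coherence = None, -np.inf
    for j in np.unique(y):
        yj = np.append(y, j)                # ... only its tentative label varies
        Tj = class_wise_statistic(Xz, yj, k)
        coherence = sum(Tj.values())        # total coherence sum_i T_i^j
        if coherence > best_coherence:
            best_class, best_coherence = j, coherence
    return best_class

# e.g. enn_predict(X, y, near[0], k=5) on the earlier synthetic example
```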
How this simple rule works better than KNN
The ENN method makes its prediction in a "two-way communication" style: it considers not only which samples are the nearest neighbors of the test sample, but also which samples consider the test sample as one of their own nearest neighbors.
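The "reverse" direction can be made explicit. The sketch below (again with assumed names, reusing the numpy import from the earlier sketches) lists the training samples that would count the test point among their own k nearest neighbors once it is added:

```python
def reverse_neighbors(X, z, k):
    """Indices of training samples that would count z among their own
    k nearest neighbors once z is added to the data set."""
    Xz = np.vstack([X, z])                  # training set plus the test point
    D = np.linalg.norm(Xz[:, None, :] - Xz[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)             # a point is not its own neighbor
    nn = np.argsort(D, axis=1)[:, :k]       # each point's k nearest neighbors
    z_idx = len(Xz) - 1                     # index of the appended test point
    return np.where((nn[:z_idx] == z_idx).any(axis=1))[0]
```

Only these reverse neighbors, together with z's own k nearest neighbors, change their contribution to the class-wise statistics when z is tentatively added, which is why the equivalent version can classify without recomputing every T_i^j.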
Experimental Results and Analysis
Synthetic Data Set:
A 3-dimensional Gaussian data set with 3 classes is used. Among the four models of class variances considered, the per-class error rates (%) for Models 2 and 3 are:
Model 2 (σ1² = 5, σ2² = 20, σ3² = 5):

         Class 1         Class 2         Class 3
         KNN    ENN      KNN    ENN      KNN    ENN
k = 3    32.0   31.9     39.3   34.4     31.4   30.5
k = 5    31.2   29.7     40.5   33.7     28.6   26.7
k = 7    28.5   28.3     40.8   33.6     25.0   24.3
Model 3 (σ1² = 5, σ2² = 5, σ3² = 20):

         Class 1         Class 2         Class 3
         KNN    ENN      KNN    ENN      KNN    ENN
k = 3    33.2   31.0     27.0   26.8     38.8   33.7
k = 5    30.3   27.3     24.0   23.2     40.2   33.5
k = 7    26.7   25.1     20.8   20.8     40.6   33.0

In both models, ENN's largest gains are on the spread-out class (the one with σ² = 20), exactly where KNN's scale sensitivity causes the most misclassifications; on the concentrated classes the two methods perform similarly.
Real-life Data Sets:
•  MNIST Handwritten Digit Recognition data examples
•  20 data sets from the UCI Machine Learning Repository: a t-test shows that ENN significantly improves the classification performance on 17 of the 20 data sets in comparison with KNN.

Summary: Three versions of ENN
The paper presents three versions of the method: ENN, ENN.V1, and ENN.V2 (their detailed formulations are given in the paper).

Online Resources
Supplementary materials and a Matlab source code implementation are available at: http://www.ele.uri.edu/faculty/he/research/ENN/ENN.html

Conclusion
1. A new ENN classification methodology based on the maximum gain of intra-class coherence.
2. "Two-way communication": ENN considers not only which samples are the nearest neighbors of the test sample, but also which samples consider the test sample as one of their own nearest neighbors.
3. The idea is important and useful for many other machine learning and data mining problems, such as density estimation, clustering, and regression.

Reference: B. Tang and H. He, "ENN: Extended Nearest Neighbor Method for Pattern Recognition," IEEE Computational Intelligence Magazine, vol. 10, no. 3, pp. 52-60, Aug. 2015.