Are we still talking about diversity in classifier ensembles?
Ludmila I. Kuncheva, School of Computer Science, Bangor University, UK

Search on "CLASSIFIER ENSEMBLE DIVERSITY", 10 Sep 2014: 580 publications, 4,594 citations, spread over 335 journals and conferences. Top venues, by number of papers:

MULTIPLE CLASSIFIER SYSTEMS 30
INT JOINT CONF ON NEURAL NETWORKS (IJCNN) 22
PATTERN RECOGNITION 17
NEUROCOMPUTING 14
EXPERT SYSTEMS WITH APPLICATIONS 13
INFORMATION SCIENCES 12
APPLIED SOFT COMPUTING 11
PATTERN RECOGNITION LETTERS 10
INFORMATION FUSION 9
IEEE INT JOINT CONF ON NEURAL NETWORKS 9
KNOWLEDGE-BASED SYSTEMS 7
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 7
INT J OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE 6
MACHINE LEARNING 5
IEEE TRANSACTIONS ON NEURAL NETWORKS 5
JOURNAL OF MACHINE LEARNING RESEARCH 5
APPLIED INTELLIGENCE 4
INTELLIGENT DATA ANALYSIS 4
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION 4
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING 4
NEURAL INFORMATION PROCESSING 4

Where in the world are we? Papers by country: China 140, UK 68, USA 63, Spain 55, Brazil 41, Canada 32, Poland 28, Iran 23, Italy 19, ..., France 11.
Are we still talking about diversity in classifier ensembles? Apparently yes...

That elusive diversity...

A classifier ensemble: the feature values (the object description) are fed to several classifiers, and a "combiner" fuses their outputs into a single class label.

Independent outputs ≠ independent errors. Hence, use ORACLE outputs: for each object, record only whether each classifier is correct or wrong. For a pair of classifiers, the oracle outputs give a 2x2 table:

                          Classifier 2
                      correct      wrong
Classifier 1 correct     a           b
Classifier 1 wrong       c           d

where, for example, b is the number of instances labelled correctly by classifier 1 and mislabelled by classifier 2. Pairwise diversity measures computed from this table include the Q statistic, kappa, correlation (rho), disagreement, double fault, and more.

How many such measures are there? SEVENTY SIX!!! Do we need more "NEW" pairwise diversity measures? Looks like we don't...

Kappa-error diagrams:
- proposed by Margineantu and Dietterich in 1997;
- visualise individual accuracy and diversity in a 2-dimensional plot (pairwise kappa on the horizontal axis, average pairwise error e_ij on the vertical axis);
- have been used to decide which ensemble members can be pruned without much harm to the overall performance.

Example: sonar data (UCI): 260 instances, 60 features, 2 classes; ensemble size L = 11 classifiers; base model: C4.5 decision tree.

[Figure: kappa-error diagrams (kappa vs pairwise error e_ij) on the sonar data; ensemble accuracies: AdaBoost 75.0%, Bagging 77.0%, Random Subspace 80.9%, Random Oracle 83.3%, Rotation Forest 84.7%.]

Kuncheva L.I., A bound on kappa-error diagrams for analysis of classifier ensembles, IEEE Transactions on Knowledge and Data Engineering, 2013, 25(3), 494-501 (DOI: 10.1109/TKDE.2011.234).

From the oracle table, kappa = (observed agreement - chance agreement) / (1 - chance agreement).

[Figure: the kappa-error diagram admits a tight lower bound on the feasible (kappa, error) region.]
[Figure: kappa-error diagrams for simulated ensembles, L = 3.]
[Figure: kappa-error diagrams for real data, L = 11; in total 77,422,500 pairs of classifiers. The gap between the cloud of real classifier pairs and the bound shows room for improvement.]

Is there space for new classifier ensembles?
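As an illustration, the pairwise measures listed above can be computed directly from the a, b, c, d counts of the oracle table. This is a minimal sketch; the function name and the toy oracle outputs are mine, and the Q, rho and kappa formulas follow the standard definitions (it assumes non-degenerate counts, i.e. no zero denominators):

```python
from math import sqrt

def pairwise_diversity(o1, o2):
    # Oracle outputs: 1 = correct label, 0 = wrong label, one entry per object.
    a = sum(1 for x, y in zip(o1, o2) if x == 1 and y == 1)  # both correct
    b = sum(1 for x, y in zip(o1, o2) if x == 1 and y == 0)  # only classifier 1 correct
    c = sum(1 for x, y in zip(o1, o2) if x == 0 and y == 1)  # only classifier 2 correct
    d = sum(1 for x, y in zip(o1, o2) if x == 0 and y == 0)  # both wrong
    n = a + b + c + d
    observed = (a + d) / n                                   # observed agreement
    chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2  # agreement by chance
    return {
        "Q": (a * d - b * c) / (a * d + b * c),
        "rho": (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d)),
        "disagreement": (b + c) / n,
        "double_fault": d / n,
        "kappa": (observed - chance) / (1 - chance),
    }

# Toy pair of oracle outputs over 6 objects: a=2, b=1, c=1, d=2
print(pairwise_diversity([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0]))
```

For this toy pair, Q = 0.6 and kappa = 1/3; the measures agree in sign but not in scale, which is part of why so many of them coexist.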
Looks like yes...

Good and Bad diversity

Diversity is not MONOTONICALLY related to ensemble accuracy.

Majority vote example: 3 classifiers A, B, C; 15 objects; each classifier individually labels 10 of the 15 correctly (individual accuracy = 10/15 = 0.667). P = ensemble accuracy under majority vote:
- independent classifiers: P = 11/15 = 0.733
- identical classifiers: P = 10/15 = 0.667
- dependent classifiers 1: P = 7/15 = 0.467 (bad diversity)
- dependent classifiers 2: P = 15/15 = 1.000 (good diversity)

Now take an ensemble of L = 7 classifiers and one object z_i. Are outputs with 4 correct votes diverse? How about outputs with 3 correct votes? 3 vs 4... can't be more diverse, really. Yet under majority vote the two cases differ sharply: with 4 of 7 correct votes the ensemble is correct (good diversity); with 3 of 7 it is wrong (bad diversity).

Notation: l_i is the number of classifiers with correct output for z_i.

Brown G., L.I. Kuncheva, "Good" and "bad" diversity in majority vote ensembles, Proc. Multiple Classifier Systems (MCS'10), Cairo, Egypt, LNCS 5997, 2010, 124-133.
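Two of the arrangements from the majority-vote example above can be reproduced with oracle outputs. This is a minimal sketch; the concrete 15-object patterns are my own construction, chosen only to match the stated individual accuracy of 10/15:

```python
def majority_accuracy(oracle):
    # oracle[j][i] = 1 if classifier j labels object i correctly, else 0
    n = len(oracle[0])
    correct = sum(1 for i in range(n)
                  if sum(clf[i] for clf in oracle) > len(oracle) / 2)
    return correct / n

# Identical classifiers: each 10/15 correct, majority vote gains nothing.
identical = [[1] * 10 + [0] * 5] * 3

# "Good diversity": same 10/15 individual accuracy, but the wrong votes
# are spread so that every object still receives 2 of 3 correct votes.
good = [
    [1] * 10 + [0] * 5,           # A correct on objects 0-9
    [0] * 5 + [1] * 10,           # B correct on objects 5-14
    [1] * 5 + [0] * 5 + [1] * 5,  # C correct on objects 0-4 and 10-14
]

print(majority_accuracy(identical))  # 10/15: same as each individual
print(majority_accuracy(good))       # 1.0: diversity placed where it helps
```

Same individual accuracies, very different ensembles: only where the disagreements fall relative to the majority decision matters.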
L - l_i is the number of classifiers with wrong output for z_i; p is the mean individual accuracy; N is the number of data points.

Decomposition of the Majority Vote Error:

E_maj = (1 - p)  -  (1/NL) Σ_{z_i: majority correct} (L - l_i)  +  (1/NL) Σ_{z_i: majority wrong} l_i

that is: individual error, minus GOOD diversity, plus BAD diversity.

Good and Bad diversity: an object with l_i = 4 correct votes (of L = 7) contributes L - l_i = 7 - 4 = 3 to good diversity; an object with l_i = 3 contributes l_i = 3 to bad diversity. Note that the diversity quantity is 3 in both cases.

Ensemble Margin

The voting margin for object z_i is the proportion of correct minus wrong votes:

m_i = (l_i - (L - l_i)) / L

POSITIVE: l_i = 4, L = 7 gives m_i = (4 - (7 - 4))/7 = 1/7.
NEGATIVE: l_i = 3, L = 7 gives m_i = (3 - (7 - 3))/7 = -1/7.

Average margin:

m = (1/N) Σ_{i=1..N} m_i = (1/N) Σ_{i=1..N} (l_i - (L - l_i)) / L

Large m corresponds to BETTER ensembles... However, nearly all diversity measures are functions of the average absolute margin (1/N) Σ |m_i| or the average square margin (1/N) Σ m_i^2. There the margin has no sign...

Diversity is not MONOTONICALLY related to ensemble accuracy. So, STOP LOOKING for a monotonic relationship!!!

Conclusions
1. Beware! Overflow of diversity measures!
2. In theory, there is some room for better classifier ensembles.
3. Diversity is not monotonically related to ensemble accuracy, hence larger diversity does not necessarily mean better accuracy. Directly engineered or heuristic? Up to you.
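The decomposition and the margin can be checked numerically on a tiny oracle matrix containing exactly the two objects discussed above (L = 7, with l_1 = 4 and l_2 = 3). A sketch under those assumptions; the function names are mine, and a tied vote (possible only for even L) is counted as majority-wrong here:

```python
def decompose_majority_error(oracle):
    # oracle[j][i] = 1 if classifier j labels object i correctly, else 0
    L, N = len(oracle), len(oracle[0])
    li = [sum(clf[i] for clf in oracle) for i in range(N)]  # correct votes per object
    p = sum(li) / (N * L)                                   # mean individual accuracy
    good = sum(L - l for l in li if l > L / 2) / (N * L)    # good diversity (majority correct)
    bad = sum(l for l in li if l <= L / 2) / (N * L)        # bad diversity (majority wrong)
    return (1 - p) - good + bad, good, bad                  # E_maj = (1 - p) - good + bad

def average_margin(oracle):
    # m_i = (l_i - (L - l_i)) / L = (2 l_i - L) / L, averaged over the N objects
    L, N = len(oracle), len(oracle[0])
    return sum((2 * sum(clf[i] for clf in oracle) - L) / L for i in range(N)) / N

# Seven classifiers, two objects: l_1 = 4 (majority correct), l_2 = 3 (majority wrong)
oracle = [[1, 1], [1, 1], [1, 1], [1, 0], [0, 0], [0, 0], [0, 0]]
e_maj, good, bad = decompose_majority_error(oracle)
print(e_maj, good, bad)        # 0.5, 3/14, 3/14: each object contributes quantity 3
print(average_margin(oracle))  # (1/7 - 1/7) / 2 = 0.0
```

Both objects contribute the same quantity 3, yet one lands in the good-diversity term and one in the bad-diversity term, and the signed margins 1/7 and -1/7 cancel: exactly the sign information that absolute and squared margins throw away.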