Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Machine Learning with WEKA WEKA: A Machine Learning Toolkit The Explorer • • Eibe Frank • • Department of Computer Science, University of Waikato, New Zealand • Classification and Regression Clustering Association Rules Attribute Selection Data Visualization The Experimenter The Knowledge Flow GUI Conclusions WEKA: the bird Copyright: Martin Kramer ([email protected]) 5/22/2017 University of Waikato 2 WEKA: the software • • • Machine learning/data mining software written in Java (distributed under the GNU Public License) Complements “Data Mining” by Witten & Frank Main features: • • • 5/22/2017 Comprehensive set of data pre-processing tools, learning algorithms and evaluation methods Graphical user interfaces (incl. data visualization) Environment for comparing learning algorithms University of Waikato 3 WEKA only deals with “flat” files @relation heart-disease-simplified @attribute age numeric @attribute sex { female, male} @attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina} @attribute cholesterol numeric @attribute exercise_induced_angina { no, yes} @attribute class { present, not_present} @data 63,male,typ_angina,233,no,not_present 67,male,asympt,286,yes,present 67,male,asympt,229,yes,present 38,female,non_anginal,?,no,not_present ... 5/22/2017 University of Waikato 4 WEKA only deals with “flat” files @relation heart-disease-simplified @attribute age numeric @attribute sex { female, male} @attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina} @attribute cholesterol numeric @attribute exercise_induced_angina { no, yes} @attribute class { present, not_present} @data 63,male,typ_angina,233,no,not_present 67,male,asympt,286,yes,present 67,male,asympt,229,yes,present 38,female,non_anginal,?,no,not_present ... 5/22/2017 University of Waikato 5 5/22/2017 University of Waikato 6 Explorer: pre-processing the data • • • • Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary Data can also be read from a URL or from an SQL database (using JDBC) Pre-processing tools in WEKA are called “filters” WEKA contains filters for: • 5/22/2017 Discretization, normalization, resampling, attribute selection, transforming and combining attributes, … University of Waikato 7 5/22/2017 University of Waikato 8 5/22/2017 University of Waikato 9 5/22/2017 University of Waikato 10 5/22/2017 University of Waikato 11 5/22/2017 University of Waikato 12 5/22/2017 University of Waikato 13 5/22/2017 University of Waikato 14 5/22/2017 University of Waikato 15 5/22/2017 University of Waikato 16 5/22/2017 University of Waikato 17 5/22/2017 University of Waikato 18 5/22/2017 University of Waikato 19 5/22/2017 University of Waikato 20 5/22/2017 University of Waikato 21 5/22/2017 University of Waikato 22 5/22/2017 University of Waikato 23 5/22/2017 University of Waikato 24 5/22/2017 University of Waikato 25 5/22/2017 University of Waikato 26 5/22/2017 University of Waikato 27 5/22/2017 University of Waikato 28 Explorer: building “classifiers” • • Classifiers in WEKA are models for predicting nominal or numeric quantities Implemented learning schemes include: • • Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, … “Meta”-classifiers include: • 5/22/2017 Bagging, boosting, stacking, error-correcting output codes, locally weighted learning, … University of Waikato 29 5/22/2017 University of Waikato 30 5/22/2017 University of Waikato 31 5/22/2017 University of Waikato 32 5/22/2017 University of Waikato 33 5/22/2017 University of Waikato 34 5/22/2017 University of Waikato 35 5/22/2017 University of Waikato 36 5/22/2017 University of Waikato 37 5/22/2017 University of Waikato 38 5/22/2017 University of Waikato 39 5/22/2017 University of Waikato 40 5/22/2017 University of Waikato 41 5/22/2017 University of Waikato 42 5/22/2017 University of Waikato 43 5/22/2017 University of Waikato 44 5/22/2017 University of Waikato 45 5/22/2017 University of Waikato 46 5/22/2017 University of Waikato 47 5/22/2017 University of Waikato 48 5/22/2017 University of Waikato 49 5/22/2017 University of Waikato 50 5/22/2017 University of Waikato 51 5/22/2017 University of Waikato 52 Explorer: clustering data • • WEKA contains “clusterers” for finding groups of similar instances in a dataset Implemented schemes are: • • • k-Means, EM, Cobweb, X-means, FarthestFirst Clusters can be visualized and compared to “true” clusters (if given) Evaluation based on loglikelihood if clustering scheme produces a probability distribution 5/22/2017 University of Waikato 53 5/22/2017 University of Waikato 54 5/22/2017 University of Waikato 55 5/22/2017 University of Waikato 56 5/22/2017 University of Waikato 57 5/22/2017 University of Waikato 58 5/22/2017 University of Waikato 59 5/22/2017 University of Waikato 60 5/22/2017 University of Waikato 61 5/22/2017 University of Waikato 62 5/22/2017 University of Waikato 63 5/22/2017 University of Waikato 64 5/22/2017 University of Waikato 65 5/22/2017 University of Waikato 66 5/22/2017 University of Waikato 67 5/22/2017 University of Waikato 68 Explorer: finding associations • WEKA contains an implementation of the Apriori algorithm for learning association rules • • Can identify statistical dependencies between groups of attributes: • • Works only with discrete data milk, butter bread, eggs (with confidence 0.9 and support 2000) Apriori can compute all rules that have a given minimum support and exceed a given confidence 5/22/2017 University of Waikato 69 5/22/2017 University of Waikato 70 5/22/2017 University of Waikato 71 5/22/2017 University of Waikato 72 5/22/2017 University of Waikato 73 5/22/2017 University of Waikato 74 5/22/2017 University of Waikato 75 5/22/2017 University of Waikato 76 Conclusion: try it yourself! WEKA is available at http://www.cs.waikato.ac.nz/ml/weka • Also has a list of projects based on WEKA • 5/22/2017 University of Waikato 77