Machine Learning Software + Intro WEKA
Oliver Brdiczka
Equipe PRIMA, INRIA Rhône-Alpes

Outline
  Machine Learning Software
    MATLAB
    Orange
    Torch3
    R language
    WEKA
    YALE
  Short introduction to WEKA

Machine Learning Software: MATLAB toolboxes
  Many toolboxes for different machine learning areas
  e.g. SPIDER, PRTools (pattern recognition), BNT (Bayesian networks), …
  Requires a MATLAB license (or use Scilab instead)

Machine Learning Software: Orange (University of Ljubljana)
  Focus on data mining + visualization
  C++ components + Python scripting
  GUI
  Linux, MS Windows, Macintosh
  GNU General Public License

Machine Learning Software: TORCH3 (IDIAP)
  (Statistical) machine learning library
  C++ library
  Linux, MS Windows
  BSD license

Machine Learning Software: R language
  Language/environment for statistical computing and graphics (free implementation of the S language)
  C++
  Linux, MS Windows, Macintosh
  GNU General Public License

Machine Learning Software: WEKA (University of Waikato, New Zealand)
  Machine learning/data mining software
  Java-based
  GUI
  GNU General Public License

Machine Learning Software: YALE (University of Dortmund)
  Environment for machine learning experiments (experiment editor)
  Java-based
  GUI
  Integration of WEKA learners
  GNU General Public License

Machine Learning Software: Others…
  Libraries for specific learning algorithms:
    HMM: ghmm (C++), jahmm (Java)
    Graphical Bayesian models: gmtk (C++)
    …

Short introduction to WEKA

WEKA: main features
  Comprehensive set of data preprocessing tools, learning algorithms and evaluation methods
  Graphical user interfaces (incl. data visualization)
  Environment for comparing learning algorithms

WEKA: data format
  ARFF

    @relation weather
    @attribute outlook {sunny, overcast, rainy}
    @attribute temperature numeric
    @attribute humidity numeric
    @attribute windy {TRUE, FALSE}
    @attribute play {yes, no}
    @data
    sunny,85,85,FALSE,no
    sunny,80,90,TRUE,no
    overcast,83,86,FALSE,yes
    rainy,70,96,FALSE,yes
    rainy,68,80,FALSE,yes

WEKA: data import
  Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
  Data can also be read from a URL or from an SQL database (using JDBC)
  Pre-processing tools in WEKA are called “filters”

WEKA: filters
  WEKA contains filters for:
    Discretization, normalization, resampling, attribute selection, transforming and combining attributes, …

WEKA: classifiers
  Classifiers in WEKA are models for predicting nominal or numeric quantities
  Implemented learning schemes include:
    Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …
  “Meta”-classifiers include:
    Bagging, boosting, stacking, error-correcting output codes, locally weighted learning, …

WEKA: clusterers
  WEKA contains “clusterers” for finding groups of similar instances in a dataset
  Implemented schemes are:
    k-Means, EM, Cobweb, FarthestFirst, …
  Clusters can be visualized and compared to “true” clusters (if given)
  Evaluation is based on log-likelihood if the clustering scheme produces a probability distribution

WEKA: API documentation
  Javadoc

WEKA: User Interfaces
  Simple Command Line Interface
  Explorer
    Filters, classifiers, clusterers, visualization
  Experimenter
    Comparing different learning algorithms
  Knowledge Flow
    Graphical programming tool

Conclusion
  Important tool for machine learning problems
  Used by many research groups
  Many extensions are available for WEKA:
    Spectral clustering, time series mining, grid computing, document classification and clustering, vector quantization, rule discovery, parallel processing, …

References (web)
  MATLAB toolboxes:
    SPIDER: http://www.kyb.tuebingen.mpg.de/bs/people/spider/main.html
    PRTools: http://www.prtools.org/
    BNT: http://bnt.sourceforge.net/
  Orange: http://www.ailab.si/orange
  TORCH3: http://www.torch.ch/
  R language: http://www.r-project.org/
  WEKA: http://www.cs.waikato.ac.nz/ml/weka/
  YALE: http://www-ai.cs.uni-dortmund.de/SOFTWARE/YALE/index.html
  Other:
    Jahmm: http://www.run.montefiore.ulg.ac.be/~francois/software/jahmm/
    Ghmm: http://www.ghmm.org/
    Gmtk: http://ssli.ee.washington.edu/~bilmes/gmtk/
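
Appendix: reading the ARFF example
  To make the ARFF layout from the "WEKA: data format" slide concrete, the sketch below parses the weather example by hand in Python, using only the standard library. It is not WEKA's own loader: the function name parse_arff and the constant WEATHER_ARFF are hypothetical names chosen for illustration. It assumes a small subset of the format: nominal and numeric attributes, dense comma-separated rows, no quoted names and no missing values ("?").

```python
import re

# The weather relation from the "WEKA: data format" slide.
WEATHER_ARFF = """\
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
"""


def parse_arff(text):
    """Parse a small ARFF subset (hypothetical helper, not part of WEKA).

    Returns (relation, attributes, rows), where attributes is a list of
    (name, "numeric") or (name, [nominal values]) and rows is a list of dicts.
    """
    relation = None
    attributes = []
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if in_data:
            values = [v.strip() for v in line.split(",")]
            if len(values) != len(attributes):
                raise ValueError(f"expected {len(attributes)} values: {line!r}")
            row = {}
            for (name, kind), value in zip(attributes, values):
                if kind == "numeric":
                    row[name] = float(value)
                elif value in kind:
                    row[name] = value
                else:
                    raise ValueError(f"{value!r} is not a value of {name}")
            rows.append(row)
            continue
        lower = line.lower()
        if lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            match = re.match(r"@attribute\s+(\S+)\s+(.+)", line, re.IGNORECASE)
            if match is None:
                raise ValueError(f"malformed attribute line: {line!r}")
            name, spec = match.group(1), match.group(2).strip()
            if spec.startswith("{") and spec.endswith("}"):
                kind = [v.strip() for v in spec[1:-1].split(",")]  # nominal
            elif spec.lower() in ("numeric", "real", "integer"):
                kind = "numeric"
            else:
                raise ValueError(f"unsupported attribute type: {spec!r}")
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True  # every following line is a data row
    return relation, attributes, rows


relation, attributes, rows = parse_arff(WEATHER_ARFF)
yes_count = sum(1 for row in rows if row["play"] == "yes")
print(relation, len(attributes), len(rows), yes_count)
```

  The header declares each attribute's type once, so each data row can be validated against it: numeric fields are converted to floats and nominal fields are checked against the declared value set.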