Machine Learning Software +
Intro WEKA
Oliver Brdiczka
Equipe PRIMA
INRIA Rhône-Alpes
Outline
Machine Learning Software
MATLAB
Orange
Torch3
R language
WEKA
YALE
Short introduction to WEKA
Machine Learning Software
MATLAB toolboxes:
Many toolboxes for different machine
learning areas
E.g.: SPIDER, PRTools (pattern
recognition), BNT (Bayesian networks) …
Requires a MATLAB license (or use Scilab)
Machine Learning Software
Orange (University of Ljubljana)
Focus on data mining + visualization
C++ components + Python scripting
GUI
Linux, MS Windows, Macintosh
GNU General Public license
Machine Learning Software
TORCH3 (IDIAP)
(Statistical) machine learning library
C++ library
Linux, MS Windows
BSD license
R language
Language/environment for statistical computing
and graphics (free impl. of S language)
Implemented mainly in C, Fortran and R
Linux, MS Windows, Macintosh
GNU General Public license
Machine Learning Software
WEKA (University of Waikato, New Zealand)
Machine learning/data mining software
Java-based
GUI
GNU General Public license
Machine Learning Software
YALE (University of Dortmund)
Environment for machine learning
experiments (Experiment editor)
Java-based
GUI
Integration of WEKA learners
GNU General Public license
Machine Learning Software
Others…
Libraries for specific learning algorithms:
HMM: ghmm (C++), jahmm (Java)
Graphical Bayesian Models: gmtk (C++)
…
Short introduction to WEKA
WEKA: main features
Comprehensive set of data preprocessing tools, learning algorithms
and evaluation methods
Graphical user interfaces (incl. data
visualization)
Environment for comparing learning
algorithms
WEKA: data format ARFF
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
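To make the structure of the format concrete, here is a minimal reader sketch in Python. It is not WEKA's own loader: the function name parse_arff is hypothetical, and it assumes only the subset of ARFF used in the weather example (nominal and numeric attributes, no quoting, no sparse data).

```python
# Minimal ARFF reader: a sketch that handles only the subset of the format
# used by the weather example (no quoting, sparse data, dates or strings).
WEATHER_ARFF = """\
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
"""


def parse_arff(text):
    """Return (relation, attributes, rows) for a simple ARFF document.

    attributes is a list of (name, kind) pairs, where kind is either the
    string "numeric" or a list of the allowed nominal values.
    """
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = []
            for (_, kind), value in zip(attributes, values):
                # Numeric columns become floats, nominal values stay strings.
                row.append(float(value) if kind == "numeric" else value)
            rows.append(row)
            continue
        keyword, _, rest = line.partition(" ")
        keyword = keyword.lower()
        if keyword == "@relation":
            relation = rest.strip()
        elif keyword == "@attribute":
            name, _, spec = rest.strip().partition(" ")
            spec = spec.strip()
            if spec.startswith("{"):
                kind = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif keyword == "@data":
            in_data = True  # every following line is one instance
    return relation, attributes, rows


relation, attributes, rows = parse_arff(WEATHER_ARFF)
print(relation, [name for name, _ in attributes], len(rows))
```

The header declares one attribute per column, and each line after @data is one instance with its values in the same order.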
WEKA: data import
Data can be imported from a file in
various formats: ARFF, CSV, C4.5, binary
Data can also be read from a URL or
from an SQL database (using JDBC)
Pre-processing tools in WEKA are called
“filters”
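As an illustration of what a CSV import involves, the sketch below converts CSV text into ARFF text using only the Python standard library. It is not WEKA's converter: the function name csv_to_arff and the type-guessing rule (a column is numeric if every value parses as a number, otherwise nominal) are assumptions made for this example.

```python
import csv
import io


def csv_to_arff(csv_text, relation):
    """Convert CSV text with a header row into ARFF text (simplified sketch)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    lines = [f"@relation {relation}"]
    for i, name in enumerate(header):
        column = [row[i] for row in data]
        try:
            for value in column:
                float(value)  # raises ValueError for non-numeric text
            spec = "numeric"
        except ValueError:
            # Nominal attribute: distinct values in order of first appearance.
            spec = "{" + ", ".join(dict.fromkeys(column)) + "}"
        lines.append(f"@attribute {name} {spec}")
    lines.append("@data")
    lines.extend(",".join(row) for row in data)
    return "\n".join(lines)


CSV_TEXT = """outlook,temperature,humidity,windy,play
sunny,85,85,FALSE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
"""

arff_text = csv_to_arff(CSV_TEXT, "weather")
print(arff_text)
```

Note that a nominal attribute inferred this way lists only the values that occur in the file, so a hand-written ARFF header can be more complete than an inferred one.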
WEKA: filters
WEKA contains filters for:
Discretization, normalization, resampling,
attribute selection, transforming and
combining attributes, …
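Two of the filter types listed above, normalization and discretization, can be sketched in a few lines of Python. This shows the idea only and is not WEKA's implementation: the function names, the min-max scaling to [0, 1] and the equal-width binning are assumptions chosen for illustration.

```python
def normalize(values):
    """Rescale numeric values linearly to the range [0, 1] (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, bins):
    """Map numeric values to bin indices 0..bins-1 using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / bins
    # The maximum would fall just past the last bin, so clamp it into it.
    return [min(int((v - lo) / width), bins - 1) for v in values]


temperature = [85, 80, 83, 70, 68]  # temperature column of the weather data
scaled = normalize(temperature)
binned = discretize(temperature, 3)
print(scaled)
print(binned)
```

Both functions take a column and return a transformed column of the same length, which is the general shape of an attribute filter: data in, modified data out, before any learning takes place.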
WEKA: classifiers
Classifiers in WEKA are models for predicting
nominal or numeric quantities
Implemented learning schemes include:
Decision trees and lists, instance-based classifiers,
support vector machines, multi-layer perceptrons,
logistic regression, Bayes’ nets, …
“Meta”-classifiers include:
Bagging, boosting, stacking, error-correcting output
codes, locally weighted learning, …
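As a small example of one of the listed scheme families, the sketch below implements an instance-based classifier (1-nearest-neighbour) in plain Python. It does not use WEKA's API: the function name is hypothetical, and using only temperature and humidity with Euclidean distance is a simplification made for this example.

```python
import math


def nearest_neighbour(train, query):
    """Predict the label of the training instance closest to `query`.

    train is a list of (features, label) pairs; distance is Euclidean.
    """
    _, label = min(train, key=lambda pair: math.dist(pair[0], query))
    return label


# (temperature, humidity) -> play, taken from the weather data
train = [
    ((85, 85), "no"),
    ((80, 90), "no"),
    ((83, 86), "yes"),
    ((70, 96), "yes"),
    ((68, 80), "yes"),
]

print(nearest_neighbour(train, (69, 82)))
print(nearest_neighbour(train, (84, 85)))
```

An instance-based learner stores the training data and defers all work to prediction time, in contrast to schemes such as decision trees that build an explicit model first.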
WEKA: clusterers
WEKA contains “clusterers” for finding groups
of similar instances in a dataset
Implemented schemes are:
k-Means, EM, Cobweb, FarthestFirst …
Clusters can be visualized and compared to
“true” clusters (if given)
Evaluation based on log-likelihood if clustering
scheme produces a probability distribution
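The k-Means scheme named above can be sketched as follows. This is a minimal version of the standard Lloyd iteration, not WEKA's implementation: the function name, the deterministic choice of the first k points as initial centroids, and the toy points are assumptions made for this example.

```python
import math


def k_means(points, k, iterations=100):
    """Cluster points with Lloyd's algorithm; returns (centroids, assignment)."""
    centroids = [tuple(p) for p in points[:k]]  # deterministic initialisation
    assignment = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:  # converged: nothing moved
            break
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, assignment


# Hypothetical 2-D points forming two well-separated groups
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, assignment = k_means(points, 2)
print(assignment)
print(centroids)
```

The returned assignment is what would be compared against "true" clusters when class labels are available.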
WEKA: API documentation
javadoc
WEKA: User Interfaces
Simple Command Line Interface
Explorer: filters, classifiers, clusterers, visualization
Experimenter: comparing different learning algorithms
Knowledge Flow: graphical programming tool
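Comparing learning algorithms comes down to evaluating each one on the same data with the same protocol. The sketch below does this with leave-one-out evaluation for two simple learners. It is an illustration only, not how the Experimenter works internally: the learners, the evaluation protocol and the toy dataset are all assumptions made for this example.

```python
import math
from collections import Counter


def majority_class(train, query):
    """Baseline learner: always predict the most frequent training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def nearest_neighbour(train, query):
    """Predict the label of the training instance closest to `query`."""
    return min(train, key=lambda pair: math.dist(pair[0], query))[1]


def leave_one_out_accuracy(learner, data):
    """Hold out each instance in turn, train on the rest, count correct hits."""
    correct = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if learner(train, features) == label:
            correct += 1
    return correct / len(data)


# Hypothetical dataset: five instances of class "a", two of class "b"
data = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((1.1, 1.3), "a"), ((0.8, 0.9), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"),
]

for name, learner in [("majority", majority_class), ("1-NN", nearest_neighbour)]:
    print(name, leave_one_out_accuracy(learner, data))
```

A trivial baseline such as the majority-class learner is a useful reference point: a scheme that does not beat it has learned nothing from the attributes.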
Conclusion
Important tool for machine learning
problems
Used by many research groups
Many extensions are available for WEKA:
Spectral clustering, time series mining, grid
computing, document classification and
clustering, vector quantization, rule
discovery, parallel processing …
References (web)
MATLAB toolboxes:
SPIDER:
http://www.kyb.tuebingen.mpg.de/bs/people/spider/main.html
PRTools:
http://www.prtools.org/
BNT:
http://bnt.sourceforge.net/
Orange
http://www.ailab.si/orange
TORCH3
http://www.torch.ch/
R language
http://www.r-project.org/
WEKA
http://www.cs.waikato.ac.nz/ml/weka/
YALE
http://www-ai.cs.uni-dortmund.de/SOFTWARE/YALE/index.html
Other:
Jahmm:
http://www.run.montefiore.ulg.ac.be/~francois/software/jahmm/
Ghmm: http://www.ghmm.org/
Gmtk: http://ssli.ee.washington.edu/~bilmes/gmtk/