Download Generating Rules through WEKA

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Generating Rules through WEKA
Dr. Alka Arora
1.
Introduction
Weka is open source software under the GNU General Public License. System is developed at
the University of Waikato in New Zealand. “Weka” stands for the Waikato Environment for
Knowledge
Analysis.
The
software
is
freely
available
at
http://www.cs.waikato.ac.nz/ml/weka. The system is written using object oriented language
Java. There are several different levels at which Weka can be used. Weka provides
implementations of state-of-the-art data mining and machine learning algorithms. Weka
contains modules for data preprocessing, classification, clustering and association rule
extraction.
Main features of Weka include:
49 data preprocessing tools
76 classification/regression algorithms
8 clustering algorithms
15 attribute/subset evaluators + 10 search algorithms for feature selection.
3 algorithms for finding association rules
3 graphical user interfaces
– “The Explorer” (exploratory data analysis)
– “The Experimenter” (experimental environment)
– “The KnowledgeFlow” (new process model inspired interface)
1.1
Weka: Download and Installation
Download Weka (the stable version) from http://www.cs.waikato.ac.nz/ml/weka/
– Choose a self-extracting executable (including Java VM)
– (If you are interested in modifying/extending weka there is a developer version
that includes the source code)
After download is completed, run the self extracting file to install Weka, and use the
default set-ups.
1.2
Start the Weka
From windows desktop,
– click “Start”, choose “All programs”, Choose “Weka 3.6” to start Weka
– Then the first interface window appears:
Weka GUI Chooser.
1.3
Weka Application Interfaces
Explorer
– preprocessing, attribute selection, learning, visualiation
Experimenter
– testing and evaluating machine learning algorithms
Knowledge Flow
– visual design of KDD process
– Explorer
Simple Command-line
– A simple interface for typing commands
1.4
WEKA data formats
Data can be imported from a file in various formats:
ARFF (Attribute Relation File Format) has two sections:
– the Header information defines attribute name, type and relations.
– the Data section lists the data records.
CSV: Comma Separated Values (text file)
Data can also be read from a database using ODBC connectivity.
1.4.1
Attribute Relation File Format (arff)
ARFF format of animal dataset is shown below:
@relation Animal_clustered
@attribute Instance_number numeric
@attribute Name {frog,toad,elephant,cat,dog,rabbit,jaguar,whale}
@attribute Metabolism {ectothermic,endothermic}
@attribute Cover {wetskin,hair}
@attribute Dentition {superior,no,complete,the_youngest}
@attribute Reproduction {oviparous,viviparous}
@attribute 'Number of feet' numeric
@data
0,frog,ectothermic,wetskin,superior,oviparous,4
1,toad,ectothermic,wetskin,no,oviparous,4
2,elephant,endothermic,hair,complete,viviparous,4
3,cat,endothermic,hair,complete,viviparous,4 4,dog,endothermic,hair,complete,viviparous,4
5,rabbit,endothermic,hair,complete,viviparous,4
6,jaguar,endothermic,hair,complete,viviparous,4
7,whale,endothermic,hair,the_youngest,viviparous,0
.
2.
Weka: Knowledge Flow GUI
Knowledge Flow is new graphical user interface in WEKA. It is Java-Beans-based interface
for setting up and running machine learning experiments. In this, data sources, classifiers, etc.
are beans and can be connected graphically. Data “flows” through components: e.g.,
“data source” -> “filter” -> “classifier” -> “evaluator”
These layouts can be saved and loaded again later.
2.1
Example
– Type: Classification
– Feature selection: GainRatio; Ranker (top 3)
– Algorithm: ID3
– Training: Weather_nominal.arff
– Test: Weather_nominal.arff
Steps to carry out this experiment in Weka Knowledge Flow are presented here:
References:
[1]
Holmes, A. Donkin, I. H. Witten, WEKA: A Machine Learning Workbench, In
Proceedings of the Second Australian and New Zealand Conference on Intelligent
Information Systems, 357-361,1994.
[2]
Software, available at: http://www.cs.waikato.ac.nz/~ml/
[3]
I. H. Witten, E. Frank, Data Mining: Practical Machine Learning Tools and
Techniques with Java Implementation, Morgan Kaufmann publishers, 1999.
[4]
http://webpages.uncc.edu/~wjiang3/TA/weka/practiceWEKA.pdf
[5]
www.cs.waikato.ac.nz/ml/weka/index_documentation.html
[6]
http://www.cs.utexas.edu/users/ml/tutorials/Weka-tut/index.htm
[7]
http://www.cs.ru.nl/~peterl/teaching/DM/weka.pdf
Related documents