Lab 6: Association Rules
Evgueni N. Smirnov
[email protected]
July 5, 2007
1. Introduction
In this lab you will consider two applications of association rules. The first uses
association-rule mining to learn decision rules. The second uses association-rule mining
to analyze a market-basket dataset. For both applications you will use the implementation
of the Apriori algorithm provided in Weka. Note that this implementation uses an
attribute-value representation of items, which is why you may encounter problems during
the market-basket analysis.
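To make the remark about the attribute-value representation concrete, here is a minimal Python sketch (it is not Weka code). It assumes that each item becomes a binary attribute with an explicit value for "absent", written here as 't' and 'f'; the actual encoding in marketBasket.arff may differ. The toy baskets and the helper name to_attribute_values are made up for illustration.

```python
from collections import Counter

# Hypothetical toy baskets; not taken from marketBasket.arff.
ITEMS = ["beer", "wine", "fish", "dairy"]
baskets = [
    {"beer"},
    {"wine", "fish"},
    {"beer", "dairy"},
    {"fish"},
]


def to_attribute_values(basket, items):
    """Encode one basket as a set of (attribute, value) pairs.

    Assumption: absence is stored as an explicit value 'f', so an
    attribute-value miner sees ('wine', 'f') as an item in its own right.
    """
    return {(item, "t" if item in basket else "f") for item in items}


rows = [to_attribute_values(b, ITEMS) for b in baskets]

# Count how often each attribute-value pair occurs across all rows.
pair_counts = Counter(pair for row in rows for pair in row)

present = sum(n for (_, value), n in pair_counts.items() if value == "t")
absent = sum(n for (_, value), n in pair_counts.items() if value == "f")
print("present pairs:", present, "absent pairs:", absent)
```

Under this assumption the "absent" pairs outnumber the "present" ones, so a miner working on attribute-value pairs can be flooded with rules about products that were not bought. A possible workaround is to encode absent items as missing values, but whether that is needed depends on how the data file is actually encoded.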
2. Decision-Rule Learning Problem
In the previous lab you derived a set of decision rules for the weather problem using the
JRip decision-rule algorithm. In this part of the lab you will use the Weka implementation
of the Apriori algorithm on the same problem. Run the Apriori algorithm on the data file of
the weather problem and analyze the resulting association rules. Compare these rules with
the rules produced by the JRip algorithm. Based on the comparison, derive a simple
modification of the Apriori algorithm that can be applied to decision-rule learning.
The data file for the weather problem is provided in the directory:
C:/Program Files/Weka-3-2/data/
3. Market Basket Problem
Given:
• a set I of 11 items: {fruitveg, freshmeat, dairy, cannedveg, cannedmeat,
  frozenmeal, beer, wine, softdrink, fish, confectionery}.
• a database of 1000 transactions T such that T ⊆ I.
Find:
• interesting association rules that explain customer behaviour.
The data file marketBasket.arff for the market-basket problem is provided on the
course website.
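The problem statement can be pictured with a small Python sketch. The item set I is copied from the list above; the four transactions are hypothetical stand-ins for the 1000 real ones, and the function name support is chosen here for illustration.

```python
# The 11 items of the set I, as listed in the problem statement.
I = frozenset({
    "fruitveg", "freshmeat", "dairy", "cannedveg", "cannedmeat",
    "frozenmeal", "beer", "wine", "softdrink", "fish", "confectionery",
})

# Hypothetical transactions for illustration; the real database has 1000.
transactions = [
    frozenset({"beer", "frozenmeal", "cannedveg"}),
    frozenset({"wine", "confectionery"}),
    frozenset({"beer", "frozenmeal"}),
    frozenset({"fruitveg", "fish"}),
]

# Every transaction T must be a subset of I.
assert all(t <= I for t in transactions)


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


print(support({"beer", "frozenmeal"}, transactions))
```

Representing each transaction as a frozenset makes the subset test T ⊆ I a single comparison and keeps the support computation close to its definition.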
4. Algorithm
As stated above, to mine association rules you will use the implementation of the Apriori
algorithm provided in Weka.
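For orientation, the level-wise idea behind Apriori can be sketched in a few lines of Python. This is a simplified textbook version of the frequent-itemset phase only, not the Weka implementation; the function name, the toy transactions, and the threshold 0.5 are assumptions made for illustration.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support}

    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical toy transactions; not taken from marketBasket.arff.
toy = [
    {"beer", "frozenmeal", "cannedveg"},
    {"beer", "frozenmeal"},
    {"wine", "confectionery"},
    {"beer", "cannedveg"},
]
result = apriori_frequent_itemsets(toy, min_support=0.5)
for itemset, supp in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), supp)
```

The prune step relies on the fact that no superset of an infrequent itemset can be frequent, which is what keeps the number of candidates manageable. Rule generation from the frequent itemsets is a separate phase and is not shown.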
5. Lab Tasks
A. Run the Apriori algorithm on the data file of the weather problem and analyze the
resulting association rules. Compare these rules with the rules produced by the
PRISM algorithm. Based on the comparison, derive a simple modification of
the Apriori algorithm that can be applied to decision-rule learning.
B. Study the data file marketBasket.arff.
C. Run the Apriori algorithm on the data file marketBasket.arff and try to find
interesting association rules. To do so, experiment with the algorithm options
support, confidence, lift, and conviction until you find appropriate values.
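When tuning these options it helps to know what each measure computes. The Python sketch below uses the standard textbook definitions for a rule A → B; the way Weka reports them (for example, counts instead of fractions) may differ. The function name rule_metrics and the toy transactions are assumptions made for illustration.

```python
def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, lift and conviction of the rule A -> B.

    Assumes A and B each occur in at least one transaction.
    """
    a, b = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    supp_a = sum(a <= t for t in transactions) / n
    supp_b = sum(b <= t for t in transactions) / n
    supp_ab = sum((a | b) <= t for t in transactions) / n
    if supp_a == 0 or supp_b == 0:
        raise ValueError("antecedent and consequent must both occur")

    confidence = supp_ab / supp_a   # estimate of P(B | A)
    lift = confidence / supp_b      # 1.0 means A and B look independent
    # Conviction is unbounded when the rule never fails.
    if confidence == 1:
        conviction = float("inf")
    else:
        conviction = (1 - supp_b) / (1 - confidence)
    return {
        "support": supp_ab,
        "confidence": confidence,
        "lift": lift,
        "conviction": conviction,
    }


# Hypothetical toy transactions; not taken from marketBasket.arff.
toy = [
    frozenset({"beer", "frozenmeal", "cannedveg"}),
    frozenset({"beer", "frozenmeal"}),
    frozenset({"wine", "confectionery"}),
    frozenset({"beer", "cannedveg"}),
]
metrics = rule_metrics({"beer"}, {"frozenmeal"}, toy)
print(metrics)
```

A lift or conviction close to 1 suggests that the antecedent tells you little about the consequent, even when the confidence looks high, so comparing several measures is usually more informative than relying on confidence alone.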