Data Mining-Knowledge
Presentation 2
Prof. Sin-Min Lee
Overview

Association rules are useful in that they suggest
hypotheses for future research

Association rules integrated into the generic
actual argument model can assist in identifying the
most plausible claim from given data items in a
forward inference way or the likelihood of missing
data values in a backward inference way
What is data mining? What is knowledge
discovery from databases (KDD)?

Knowledge discovery in databases (KDD)
is the 'non-trivial extraction of
implicit, previously unknown, and
potentially useful information from data'

KDD encompasses a number of different
technical approaches, such as clustering, data
summarization, learning classification rules,
finding dependency networks, analyzing
changes, and detecting anomalies

KDD has only recently emerged because we
only recently have been gathering vast
quantities of data

Examples of KDD studies
Mangasarian et al (1997) Breast Cancer diagnosis. A sample from breast lump
mass is assessed by:

mammography (not sensitive 68%-79%)

data mining from FNA test results and visual inspection (65%-98%)

surgery (100% but invasive, expensive)

Basket analysis. People who buy nappies also buy beer
NBA. National Basketball Association of America. Player pattern profile.

Bhandary et al (1997)
Credit card fraud detection
Stranieri/Zeleznikow (1997) predict family law property outcomes

Rissland and Friedman (1997) discovers a change in the concept of ‘good faith’
in US Bankruptcy cases

Pannu (1995) discovers a prototypical case from a library of cases

Wilkins and Pillaipakkamnatt (1997) predicts the time a case takes to be heard

Veliev et al (1999) association rules for economic analysis
Overview of the process of knowledge discovery in
databases
Raw data -> [Select] -> Target data -> [Pre-process] -> Pre-processed data -> [Transform] -> Transformed data -> [Data mining] -> Patterns -> [Interpret patterns] -> Knowledge
from Fayyad, Piatetsky-Shapiro, Smyth (1996)
Phase 4. Data mining
Finding patterns in data or fitting models to data

Categories of techniques


Predictive (classification: neural networks, rule induction,
linear, multiple regression)

Segmentation (clustering, k-means, k-median)

Summarisation (associations, visualisation)

Change detection/modelling
What Is Association Mining?
• Association rule mining:
  – Finding frequent patterns, associations, correlations, or
    causal structures among sets of items or objects in
    transaction databases, relational databases, and other
    information repositories.
• Applications:
  – Basket data analysis, cross-marketing, catalog design, loss-leader analysis, clustering, classification, etc.
• Examples.
  – Rule form: “Body => Head [support, confidence]”.
  – buys(x, “diapers”) => buys(x, “beers”) [0.5%, 60%]
  – major(x, “CS”) ^ takes(x, “DB”) => grade(x, “A”) [1%, 75%]
More examples
– age(X, “20..29”) ^ income(X, “20..29K”) => buys(X, “PC”) [support = 2%, confidence = 60%]
– contains(T, “computer”) => contains(T, “software”) [1%, 75%]
Association rules are a data mining technique
• An association rule tells us something about the association
between two attributes
• Agrawal et al (1993) developed the first association rule
algorithm, Apriori
• A famous (but unsubstantiated) AR from a hypothetical
supermarket transaction database is: if nappies then beer (80%).
Read this as: when nappies are bought, beer is also bought 80% of
the time
• Association rules have only recently been applied to law with
promising results
• Association rules can automatically discover rules that may
prompt an analyst to think of hypotheses they would not otherwise
have considered
Rule Measures: Support and Confidence
Support and confidence are two independent notions.
[Figure: Venn diagram of two overlapping sets, "Customer buys beer" and "Customer buys diaper"; the overlap is "Customer buys both"]
• Find all the rules X & Y => Z
with minimum confidence and
support
– support, s, probability that a
transaction contains {X ∪ Y ∪ Z}
– confidence, c, conditional
probability that a transaction having
{X ∪ Y} also contains Z

Transaction ID   Items Bought
2000             A,B,C
1000             A,C
4000             A,D
5000             B,E,F

Let minimum support be 50% and minimum confidence be 50%; we have
– A => C (50%, 66.6%)
– C => A (50%, 100%)
Mining Association Rules: An Example

Transaction ID   Items Bought
2000             A,B,C
1000             A,C
4000             A,D
5000             B,E,F

Min. support 50%
Min. confidence 50%

Frequent Itemset   Support
{A}                75%
{B}                50%
{C}                50%
{A,C}              50%

For rule A => C:
support = support({A C}) = 50%
confidence = support({A C})/support({A}) = 66.6%
Two Step Association Rule
Mining
Step 1: Frequent itemset generation – use Support
Step 2: Rule generation – use Confidence
{milk, bread} is a frequent item set.
Folks buying milk, also buy bread.
Is it also true?: “Folks buying bread also buy milk.”
Confidence and support of an association rule
• 80% is the confidence of the rule if nappies then beer (80%). This is
calculated by n2/n1 where:
• n1 = no. of records where nappies were bought
• n2 = no. of records where nappies were bought and beer was also
bought.
• If there are 1000 transactions with nappies, and 800 of those also had
beer, then the confidence is 80%.
• A rule may have a high confidence but not be interesting because it
doesn’t apply to many records in the database. This is measured by the
support: no. of records where nappies were bought with beer / total records.
• Rules that may be interesting have a confidence level and support level
above a user-set threshold
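A small sketch of the two measures follows. The counts 1000 and 800 come from the example above; the total of 200,000 records and the two threshold values are made-up assumptions, chosen only to show a rule with high confidence but low support.

```python
def rule_measures(n_body, n_both, n_total):
    """Return (confidence, support) of a rule 'if body then head'.

    n_body  -- records containing the body (e.g. nappies)
    n_both  -- records containing body and head (nappies and beer)
    n_total -- all records in the database
    """
    confidence = n_both / n_body
    support = n_both / n_total
    return confidence, support

# 1000 and 800 are from the example; 200_000 is a hypothetical total.
conf, sup = rule_measures(n_body=1000, n_both=800, n_total=200_000)

# Hypothetical user-set thresholds.
MIN_CONFIDENCE = 0.5
MIN_SUPPORT = 0.01

is_interesting = conf >= MIN_CONFIDENCE and sup >= MIN_SUPPORT
print(conf, sup, is_interesting)
```

Under these assumed numbers the rule clears the confidence threshold but not the support threshold, so it would be filtered out.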
Association rule screen shot with A-Miner from
Split Up data set
• In 73.4% of cases where the wife's needs are some to high then the
husband's future needs are few to some.
• Prompts an analyst to posit plausible hypotheses, e.g. it may be the case
that the rule reflects the fact that more women than men remain custodial
parents of the children following divorce. The women that have some to
high needs may do so because of their obligation to children.
Mining Frequent Itemsets:
the Key Step
• Find the frequent itemsets: the sets of items that
have minimum support
– A subset of a frequent itemset must also be a frequent
itemset – Apriori principle
• i.e., if {AB} is a frequent itemset, both {A} and {B} should be a
frequent itemset
– Iteratively find frequent itemsets with cardinality from 1
to k (k-itemset)
• Use the frequent itemsets to generate association
rules.
The Apriori Algorithm
• Join Step: Ck is generated by joining Lk-1 with itself
• Prune Step: Any (k-1)-itemset that is not frequent cannot
be a subset of a frequent k-itemset
• Pseudo-code:
    Ck: candidate itemsets of size k
    Lk: frequent itemsets of size k
    L1 = {frequent items};
    for (k = 1; Lk != ∅; k++) do begin
        Ck+1 = candidates generated from Lk;
        for each transaction t in database do
            increment the count of all candidates in Ck+1
            that are contained in t
        Lk+1 = candidates in Ck+1 with min_support
    end
    return ∪k Lk;
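The pseudo-code can be turned into a short runnable sketch. The version below is one possible reading, not the original implementation: it takes `min_count` as an absolute support count, builds candidates by taking unions of two frequent k-itemsets, and prunes any candidate with an infrequent k-subset. The function name `apriori` and the test database (the four-transaction database D used in the worked example in this presentation) are the only inputs assumed.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    # L1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 1
    while current:
        # Join step: unions of two frequent k-itemsets of size k + 1.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k + 1:
                # Prune step: every k-subset must itself be frequent.
                if all(frozenset(s) in current
                       for s in combinations(union, k)):
                    candidates.add(union)
        # Scan the database to count each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1
    return frequent

db = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
result = apriori(db, min_count=2)
for itemset in sorted(result, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), result[itemset])
```

Counting each candidate with a full pass over the transactions is the simplest choice and mirrors the "for each transaction" loop; practical implementations use faster counting structures.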
Association rules in law
• Association rules generators are typically packaged with very expensive
data mining suites. We developed A-Miner (available from authors) for a
PC platform.
• Typically, too many association rules are generated for feasible analysis.
So, our current research involves exploring metrics of interestingness to
restrict the number of rules to those that might be interesting
• In general, structured data is not collected in law as it is in other domains
so very large databases are rare
• Our current research involves 380,000 records from a Legal Aid
organization data base that contains data on client features.
• ArgumentDeveloper is a shell that can be used by judges to structure their
reasoning in a way that will facilitate data collection and reasoning
The Apriori Algorithm: Example
Minimum support count = 2

Database D
TID   Items
100   1 3 4
200   2 3 5
300   1 2 3 5
400   2 5

Scan D to obtain C1
itemset   sup.
{1}       2
{2}       3
{3}       3
{4}       1
{5}       3

L1
itemset   sup.
{1}       2
{2}       3
{3}       3
{5}       3

C2 (generated from L1)
itemset
{1 2}
{1 3}
{1 5}
{2 3}
{2 5}
{3 5}

Scan D to count C2
itemset   sup
{1 2}     1
{1 3}     2
{1 5}     1
{2 3}     2
{2 5}     3
{3 5}     2

L2
itemset   sup
{1 3}     2
{2 3}     2
{2 5}     3
{3 5}     2
Join Operation: Example

L2
itemset   sup
{1 3}     2
{2 3}     2
{2 5}     3
{3 5}     2

L2 join L2
pair            result
{1 3} {1 3}     null
{1 3} {2 3}     {1 2 3}   (infrequent subset {1 2})
{1 3} {2 5}     null
{1 3} {3 5}     {1 3 5}   (infrequent subset {1 5})
{2 3} {2 3}     null
{2 3} {2 5}     {2 3 5}
{2 3} {3 5}     {2 3 5}
{2 5} {2 5}     null
{2 5} {3 5}     {2 3 5}

C3
itemset
{2 3 5}

Scan D

L3
itemset   sup
{2 3 5}   2
Anti-Monotone Property
If a set cannot pass a test, all of its supersets
will fail the same test as well.
If {2 3} does not have minimum support, neither will
{1 2 3}, {2 3 5}, {1 2 3 5}, etc.
If {2 3} occurs only 5 times, can {2 3 5}
occur 8 times?
How to Generate Candidates?
• Suppose the items in Lk-1 are listed in an order
• Step 1: self-joining Lk-1
    insert into Ck
    select p.item1, p.item2, …, p.itemk-1, q.itemk-1
    from Lk-1 p, Lk-1 q
    where p.item1=q.item1, …, p.itemk-2=q.itemk-2, p.itemk-1 < q.itemk-1
• Step 2: pruning
    forall itemsets c in Ck do
        forall (k-1)-subsets s of c do
            if (s is not in Lk-1) then delete c from Ck
Example of Generating Candidates
• L3={abc, abd, acd, ace, bcd}
• Self-joining: L3*L3
– abcd from abc and abd
– acde from acd and ace
Problem of generate-&-test heuristic
• Pruning:
– acde is removed because ade is not in L3
• C4={abcd}
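The self-join and prune steps can be sketched in Python and run on the L3 example. This is an illustrative sketch, not the original implementation: itemsets are represented as sorted tuples so that "the first k-2 items agree and the last item of p is smaller than the last item of q" can be written directly, and the function name `generate_candidates` is a hypothetical choice.

```python
from itertools import combinations

def generate_candidates(prev_frequent):
    """Build candidates of size k from frequent itemsets of size k-1.

    Itemsets are sorted tuples. Returns (joined, pruned): the output of
    the self-join, and what is left of it after the prune step.
    """
    prev = sorted(prev_frequent)
    if not prev:
        return [], []
    prev_set = set(prev)
    size = len(prev[0])  # this is k-1

    # Step 1: self-join. p and q agree on all but the last item,
    # and the last item of p is smaller than the last item of q.
    joined = []
    for p in prev:
        for q in prev:
            if p[:-1] == q[:-1] and p[-1] < q[-1]:
                joined.append(p + (q[-1],))

    # Step 2: prune. Drop a candidate if any (k-1)-subset is not frequent.
    pruned = [c for c in joined
              if all(s in prev_set for s in combinations(c, size))]
    return joined, pruned

L3 = [tuple("abc"), tuple("abd"), tuple("acd"), tuple("ace"), tuple("bcd")]
joined, C4 = generate_candidates(L3)
print(["".join(c) for c in joined])  # candidates after the self-join
print(["".join(c) for c in C4])      # candidates after pruning
```

Keeping the items in a fixed order means each candidate is produced by exactly one pair (p, q), which avoids generating the same itemset several times.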
Association rules can be used for forward and backward inferences in the
generic/actual argument model for sentencing armed robbery.

[Figure: Generic/actual argument model for sentencing armed robbery (diagram dated 19 May, 2001). An argument tree with nodes labelled I.1 to I.7 and I.A to I.D. The top-level inputs are: severity of prior convictions constellation, serious offender status, offender's health, offender's age, seriousness of armed robbery as an offense relative to other offenses, moral culpability of offender, degree of remorse displayed by offender, seriousness of the offence relative to other armed robberies, co-operation, the extent to which retribution, specific deterrence, general deterrence, rehabilitation and community protection are appropriate purposes, offender's plea, reasons to depart from parity with co-offender penalty, and co-offender's penalty. Lower-level inputs include: personal background, psychiatric illness, gambling, personal crisis, cultural adjustment, drug dependence, intellectual disability, prior offence details (name, type, sentence, jurisdiction, date), remarks to police, apology offered, restitution made, degree of violence, degree of planning, extent to which the offender assisted the victim, impact of the crime on victims and on the community, degree of assistance offered to police, police interview, assistance to Crown, value of property stolen, and duration of offence. The output is the penalty: imprisonment, combined custody and treatment order, hospital security order, intensive correction order, suspended sentence, youth training centre detention, community based order, fine, adjournment on conditions, discharge offender, dismiss offence, or defer sentence.]
Forward inference: confidence
• In the sentence actual argument
database the following outcomes
were noted for the inputs suggested:

Imprisonment                            57%
Combined custody and treatment order    0.1%
Hospital security order                 0%
Intensive correction order              12%
Suspended sentence                      2%
Youth training centre detention         10%
Community based order                   16%
Fine                                    0%
Adjournment on conditions               (no value shown)
Discharge offender                      0%
Dismiss offence                         0%
Defer sentence                          (no value shown)
Backward inference: constructing the strongest argument
If all the items you suggest AND
If extremely serious pattern of priors then imprisonment
If very serious pattern of priors then imprisonment
If serious pattern of priors then imprisonment
If not so serious pattern of priors then imprisonment
If no prior convictions then imprisonment
90%
75%
68%
78%
2%
2%
7%
17%
17%
3%
Conclusion
Data mining, or knowledge discovery from databases, has not been
appropriately exploited in law to date.

Association rules are useful in that they suggest hypotheses for future
research.

Association rules integrated into the generic actual argument model
can assist in identifying the most plausible claim from given data items
in a forward inference way, or the likelihood of missing data values in a
backward inference way.
Generating Association Rules
• For each frequent itemset l and each nonempty proper subset s of l, output the rule:
s => (l - s)
if support_count(l) / support_count(s) >= min_conf
where min_conf is the minimum confidence threshold.
l = {2 3 5}; the nonempty proper subsets s of l are {2 3}, {3 5}, {2 5}, {2}, {3}, and {5}.
Candidate rules:
{2 3} => {5}
{3 5} => {2}
{2 5} => {3}
{2} => {3 5}
{3} => {2 5}
{5} => {2 3}
Generating Association Rules
For l = {2 3 5}: if support_count(l) / support_count(s) >= min_conf
(e.g., 75%), then introduce the rule s => (l - s).

itemset   sup.
{1}       2
{2}       3
{3}       3
{4}       1
{5}       3

itemset   sup
{1 2}     1
{1 3}     2
{1 5}     1
{2 3}     2
{2 5}     3
{3 5}     2

itemset   sup
{2 3 5}   2

s = {2 3}, {3 5}, {2 5}, {2}, {3}, {5}
{2 3} => {5} : 2/2 = 100%
{3 5} => {2} : 2/2 = 100%
{2 5} => {3} : 2/3 = 66.7%
{2} => {3 5} : 2/3 = 66.7%
{3} => {2 5} : 2/3 = 66.7%
{5} => {2 3} : 2/3 = 66.7%
With min_conf = 75%, only the first two rules are kept.
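The rule-generation step for l = {2 3 5} can be reproduced with the sketch below. The support counts are copied from the tables above; the function name `rules_from_itemset` and the tuple layout of the returned rules are illustrative assumptions.

```python
from itertools import combinations

# Support counts taken from the example tables.
support_count = {
    frozenset({2}): 3, frozenset({3}): 3, frozenset({5}): 3,
    frozenset({2, 3}): 2, frozenset({2, 5}): 3, frozenset({3, 5}): 2,
    frozenset({2, 3, 5}): 2,
}

def rules_from_itemset(l, support_count, min_conf):
    """Return (body, head, confidence) for every rule s => (l - s)
    whose confidence reaches min_conf."""
    l = frozenset(l)
    rules = []
    # Every nonempty proper subset s of l is a possible rule body.
    for size in range(1, len(l)):
        for s in combinations(sorted(l), size):
            s = frozenset(s)
            conf = support_count[l] / support_count[s]
            if conf >= min_conf:
                rules.append((set(s), set(l - s), conf))
    return rules

rules = rules_from_itemset({2, 3, 5}, support_count, min_conf=0.75)
for body, head, conf in rules:
    print(sorted(body), "=>", sorted(head), conf)
```

No extra database scan is needed at this stage, because every subset of a frequent itemset is itself frequent and its support count was already recorded while mining.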
Presentation of Association Rules (Table Form)
Visualization of Association Rule Using Plane Graph
Visualization of Association Rule Using Rule Graph
A decision tree is a classifier in the form of a tree structure
where each node is either:
• a leaf node, indicating a class of instances, or
• a decision node that specifies some test to be
carried out on a single attribute value, with one branch
and sub-tree for each possible outcome of the test.
A decision tree can be used to classify an instance by
starting at the root of the tree and moving through it until
a leaf node is reached, which provides the classification of the
instance.
Example: Decision making in the London stock market
Suppose that the major factors affecting the London stock
market are:
• what it did yesterday;
• what the New York market is doing today;
• bank interest rate;
• unemployment rate;
• England’s prospect at cricket.
The process of predicting an instance by this decision tree
can also be expressed by answering the questions in the
following order:
Is unemployment high?
YES: The London market will rise today
NO: Is the New York market rising today?
YES: The London market will rise today
NO: The London market will not rise today.
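The sequence of questions above maps directly onto nested conditionals. The sketch below encodes exactly that tree; the function and parameter names are hypothetical, and the other factors listed (yesterday's movement, bank interest rate, cricket) do not appear because the tree never tests them.

```python
def london_market_will_rise(unemployment_high, new_york_rising):
    """Classify one instance by walking the tree from the root."""
    # Root decision node: is unemployment high?
    if unemployment_high:
        return True  # leaf: the London market will rise today
    # Second decision node: is the New York market rising today?
    if new_york_rising:
        return True  # leaf: the London market will rise today
    return False     # leaf: the London market will not rise today

# Enumerate every combination of the two tested attributes.
for unemployment in (True, False):
    for new_york in (True, False):
        print(unemployment, new_york,
              london_market_will_rise(unemployment, new_york))
```

Each `return` corresponds to a leaf node and each `if` to a decision node on a single attribute, which is the structure described in the definition of a decision tree.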
Decision tree induction is a typical inductive approach to learning
knowledge on classification. The key requirements for mining
with decision trees are:
• Attribute-value description: an object or case must be
expressible in terms of a fixed collection of properties or attributes.
• Predefined classes: The categories to which cases are to be
assigned must have been established beforehand (supervised
data).
• Discrete classes: A case does or does not belong to a
particular class, and there must be far more cases than classes.
• Sufficient data: Usually hundreds or even thousands of
training cases.
• “Logical” classification model: A classifier that can be
expressed only as decision trees or a set of production rules
An appeal of market analysis comes from the clarity and
utility of its results, which are in the form of association
rules. There is an intuitive appeal to a market analysis
because it expresses how tangible products and services
relate to each other, how they tend to group together. A
rule like, “if a customer purchases three-way calling, then
that customer will also purchase call waiting” is clear.
Even better, it suggests a specific course of action, like
bundling three-way calling with call waiting into a single
service package. While association rules are easy to
understand, they are not always useful.
The following three rules are examples of real rules
generated from real data:
• On Thursdays, grocery store consumers often
purchase diapers and beer together.
• Customers who purchase maintenance agreements
are very likely to purchase large appliances.
• When a new hardware store opens, one of the most
commonly sold items is toilet rings.
These three examples illustrate the three common types
of rules produced by association rule analysis: the useful,
the trivial, and the inexplicable.
OLAP (Summarization) Display Using MS/Excel 2000
Market-Basket-Analysis (Association)—Ball graph
Display of Association Rules in Rule Plane Form
Display of Decision Tree (Classification Results)
Display of Clustering (Segmentation) Results
3D Cube Browser