Data Mining — Knowledge Presentation 2
Prof. Sin-Min Lee

Overview
• Association rules are useful in that they suggest hypotheses for future research.
• Association rules integrated into the generic/actual argument model can assist in identifying the most plausible claim from given data items (forward inference) or the likelihood of missing data values (backward inference).

What is data mining? What is knowledge discovery from databases (KDD)?
• Knowledge discovery in databases (KDD) is the non-trivial extraction of implicit, previously unknown, and potentially useful information from data.
• KDD encompasses a number of different technical approaches, such as clustering, data summarization, learning classification rules, finding dependency networks, analyzing changes, and detecting anomalies.
• KDD has only recently emerged because we have only recently begun gathering vast quantities of data.

Examples of KDD studies
• Mangasarian et al. (1997): breast cancer diagnosis. A sample from a breast lump mass is assessed by:
  – mammography (not sensitive: 68%–79%)
  – data mining from FNA test results and visual inspection (65%–98%)
  – surgery (100%, but invasive and expensive)
• Basket analysis: people who buy nappies also buy beer.
• NBA (National Basketball Association of America): player pattern profiles.
• Bhandary et al. (1997): credit card fraud detection.
• Stranieri/Zeleznikow (1997): predict family law property outcomes.
• Rissland and Friedman (1997): discovers a change in the concept of 'good faith' in US bankruptcy cases.
• Pannu (1995): discovers a prototypical case from a library of cases.
• Wilkins and Pillaipakkamnatt (1997): predicts the time a case takes to be heard.
• Veliev et al. (1999): association rules for economic analysis.

Overview of the process of knowledge discovery in databases
Raw data → (select) → Target data → (pre-process) → Pre-processed data → (transform) → Transformed data → (data mining) → Patterns → (interpret patterns) → Knowledge
(from Fayyad, Piatetsky-Shapiro, Smyth, 1996)
Data mining
Finding patterns in data or fitting models to data. Categories of techniques:
• Predictive (classification: neural networks, rule induction, linear and multiple regression)
• Segmentation (clustering: k-means, k-median)
• Summarisation (associations, visualisation)
• Change detection/modelling

What Is Association Mining?
• Association rule mining: finding frequent patterns, associations, correlations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories.
• Applications: basket data analysis, cross-marketing, catalog design, loss-leader analysis, clustering, classification, etc.
• Examples:
  – Rule form: "Body => Head [support, confidence]"
  – buys(x, "diapers") => buys(x, "beers") [0.5%, 60%]
  – major(x, "CS") ^ takes(x, "DB") => grade(x, "A") [1%, 75%]
• More examples:
  – age(X, "20..29") ^ income(X, "20..29K") => buys(X, "PC") [support = 2%, confidence = 60%]
  – contains(T, "computer") => contains(T, "software") [1%, 75%]

Association rules are a data mining technique
• An association rule tells us something about the association between two attributes.
• Agrawal et al. (1993) developed the first association rule algorithm, Apriori.
• A famous (but unsubstantiated) association rule from a hypothetical supermarket transaction database is "if nappies then beer (80%)". Read this as: nappies being bought implies beer is bought 80% of the time.
• Association rules have only recently been applied to law, with promising results.
• Association rules can automatically discover rules that may prompt an analyst to think of hypotheses they would not otherwise have considered.

Rule Measures: Support and Confidence
Support and confidence are two independent notions.
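To see that the two notions really are independent, here is a minimal sketch (the basket contents are invented for illustration) in which one rule has maximal confidence but tiny support, while another itemset has huge support yet yields only a weak rule.

```python
# Invented toy basket database: 100 baskets in total.
baskets = (
    [{"caviar", "champagne"}] * 1      # rare, but always bought together
    + [{"bread"}] * 59
    + [{"bread", "milk"}] * 40
)

def support(itemset):
    """Fraction of baskets containing every item in `itemset`."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)

def confidence(body, head):
    """support(body ∪ head) / support(body)."""
    return support(body | head) / support(body)

# caviar => champagne: 100% confidence, but only 1% support.
print(confidence({"caviar"}, {"champagne"}), support({"caviar", "champagne"}))
# bread => milk: bread appears in 99% of baskets, yet confidence is only ~40%.
print(confidence({"bread"}, {"milk"}))
```

Neither measure determines the other, which is why rule mining filters on both thresholds at once.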
• Find all rules X & Y => Z with minimum confidence and support:
  – support, s: the probability that a transaction contains {X, Y, Z}
  – confidence, c: the conditional probability that a transaction containing {X, Y} also contains Z

    Transaction ID   Items Bought
    2000             A, B, C
    1000             A, C
    4000             A, D
    5000             B, E, F

With minimum support 50% and minimum confidence 50%, we have:
  – A => C (support 50%, confidence 66.6%)
  – C => A (support 50%, confidence 100%)

Mining Association Rules — An Example
Min. support 50%, min. confidence 50%.

    Frequent Itemset   Support
    {A}                75%
    {B}                50%
    {C}                50%
    {A, C}             50%

For the rule A => C:
    support = support({A, C}) = 50%
    confidence = support({A, C}) / support({A}) = 66.6%

Two-Step Association Rule Mining
• Step 1: frequent itemset generation — use support.
• Step 2: rule generation — use confidence.
{milk, bread} is a frequent itemset: folks buying milk also buy bread. Is it also true that folks buying bread also buy milk?

Confidence and support of an association rule
• 80% is the confidence of the rule "if nappies then beer (80%)". It is calculated as n2/n1, where:
  – n1 = number of records where nappies were bought
  – n2 = number of records where nappies were bought and beer was also bought
• If there were 1000 transactions for nappies and, of those, 800 also had beer, then the confidence is 80%.
• A rule may have a high confidence but still not be interesting because it doesn't apply to many records in the database. Support = number of records where nappies were bought with beer / total number of records.
• Rules that may be interesting have a confidence level and a support level above a user-set threshold.
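The numbers in the worked example above can be reproduced directly; a minimal sketch over the four transactions (helper names are illustrative, not from any library):

```python
# The four-transaction example database from the slides.
transactions = {
    2000: {"A", "B", "C"},
    1000: {"A", "C"},
    4000: {"A", "D"},
    5000: {"B", "E", "F"},
}

def support(itemset):
    """Fraction of transactions containing every item in `itemset`."""
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)

def confidence(body, head):
    """support(body ∪ head) / support(body)."""
    return support(body | head) / support(body)

print(support({"A", "C"}))        # 0.5   -> A => C has 50% support
print(confidence({"A"}, {"C"}))   # 0.666... -> A => C has 66.6% confidence
print(confidence({"C"}, {"A"}))   # 1.0   -> C => A has 100% confidence
```

Note the asymmetry: A => C and C => A share the same support but differ in confidence, because their bodies have different supports.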
Association rule screenshot with A-Miner from the Split Up data set
• In 73.4% of cases where the wife's needs are some to high, the husband's future needs are few to some.
• This prompts an analyst to posit a plausible hypothesis: e.g. the rule may reflect the fact that more women than men remain custodial parents of the children following divorce. The women who have some to high needs may do so because of their obligations to children.

Mining Frequent Itemsets: the Key Step
• Find the frequent itemsets: the sets of items that have minimum support.
  – A subset of a frequent itemset must also be a frequent itemset (the Apriori principle): if {A, B} is a frequent itemset, both {A} and {B} must be frequent itemsets.
  – Iteratively find frequent itemsets with cardinality from 1 to k (k-itemsets).
• Use the frequent itemsets to generate association rules.

The Apriori Algorithm
• Join step: Ck is generated by joining Lk-1 with itself.
• Prune step: any (k-1)-itemset that is not frequent cannot be a subset of a frequent k-itemset.
• Pseudo-code:
    Ck: candidate itemsets of size k
    Lk: frequent itemsets of size k
    L1 = {frequent items};
    for (k = 1; Lk != ∅; k++) do begin
        Ck+1 = candidates generated from Lk;
        for each transaction t in the database do
            increment the count of all candidates in Ck+1 that are contained in t;
        Lk+1 = candidates in Ck+1 with min_support;
    end
    return ∪k Lk;

Association rules in law
• Association rule generators are typically packaged with very expensive data mining suites. We developed A-Miner (available from the authors) for a PC platform.
• Typically, too many association rules are generated for feasible analysis.
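The Apriori pseudo-code above can be transcribed almost line-for-line into Python. This is a sketch, assuming transactions are given as plain sets of items; it folds the join and prune steps of candidate generation into a single pass using the Apriori principle.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return every frequent itemset (as a frozenset) with its support count."""
    items = {i for t in transactions for i in t}
    # L1: frequent 1-itemsets.
    counts = {frozenset([i]): sum(1 for t in transactions if i in t) for i in items}
    L = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(L)
    k = 1
    while L:
        # Join step: unions of frequent k-itemsets that form (k+1)-itemsets.
        # Prune step: keep a candidate only if all its k-subsets are frequent.
        candidates = set()
        for a in L:
            for b in L:
                union = a | b
                if len(union) == k + 1 and all(
                    frozenset(s) in L for s in combinations(union, k)
                ):
                    candidates.add(union)
        # One scan of the database counts all surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        L = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(L)
        k += 1
    return frequent

db = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
freq = apriori(db, min_support=2)
print(freq[frozenset({2, 3, 5})])   # 2
```

Run on the TID 100–400 database with minimum support 2, this reproduces L1, L2 and L3 = {2 3 5} from the worked example.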
So, our current research involves exploring metrics of interestingness to restrict the number of rules to those that might be interesting.
• In general, structured data is not collected in law as it is in other domains, so very large databases are rare.
• Our current research involves 380,000 records from a Legal Aid organization database that contains data on client features.
• ArgumentDeveloper is a shell that can be used by judges to structure their reasoning in a way that will facilitate data collection and reasoning.

The Apriori Algorithm — Example (min. support = 2)
Database D:
    TID   Items
    100   1 3 4
    200   2 3 5
    300   1 2 3 5
    400   2 5

Scan D for the candidate 1-itemsets C1, and keep those with support >= 2 as L1:
    C1: {1}: 2, {2}: 3, {3}: 3, {4}: 1, {5}: 3
    L1: {1}: 2, {2}: 3, {3}: 3, {5}: 3

Join L1 with itself to form C2, then scan D to count:
    C2: {1 2}: 1, {1 3}: 2, {1 5}: 1, {2 3}: 2, {2 5}: 3, {3 5}: 2
    L2: {1 3}: 2, {2 3}: 2, {2 5}: 3, {3 5}: 2

Join Operation — Example
L2 join L2 yields {1 2 3} (from {1 3} and {2 3}), {1 3 5} (from {1 3} and {3 5}) and {2 3 5} (from {2 3} and {2 5}); the other pairs produce nothing. {1 2 3} and {1 3 5} are pruned because their subsets {1 2} and {1 5} are infrequent, so C3 = {{2 3 5}}. Scanning D gives:
    L3: {2 3 5}: 2

Anti-Monotone Property
If a set cannot pass a test, all of its supersets will fail the same test as well. If {2 3} does not have minimum support, neither do {1 2 3}, {2 3 5}, {1 2 3 5}, etc. If {2 3} occurs only 5 times, {2 3 5} cannot occur 8 times.

How to Generate Candidates?
• Suppose the items in Lk-1 are listed in an order.
• Step 1: self-joining Lk-1
    insert into Ck
    select p.item1, p.item2, …, p.itemk-1, q.itemk-1
    from Lk-1 p, Lk-1 q
    where p.item1 = q.item1, …, p.itemk-2 = q.itemk-2, p.itemk-1 < q.itemk-1
• Step 2: pruning
    forall itemsets c in Ck do
        forall (k-1)-subsets s of c do
            if (s is not in Lk-1) then delete c from Ck

Example of Generating Candidates
• L3 = {abc, abd, acd, ace, bcd}
• Self-joining: L3 * L3
  – abcd from abc and abd
  – acde from acd and ace
• Pruning:
  – acde is removed because ade is not in L3
• C4 = {abcd}
This illustrates the generate-and-test heuristic: candidates are generated first and only then tested against the database.

Association rules can be used for forward and backward inferences in the generic/actual argument model for sentencing armed robbery

[Diagram: generic/actual argument model for sentencing armed robbery (Page 1, 19 May 2001). Input factors include: severity of the prior convictions constellation; serious offender status; the offender's health, age, personal background and plea; seriousness of armed robbery relative to other offences; seriousness of the offence relative to other armed robberies; moral culpability of the offender; degree of remorse; co-operation (degree of assistance to police and to the Crown, restitution, remarks to police, apology offered); degree of violence and of planning; value of property stolen and duration of offence; impact of the crime on victims and on the community; and the extent to which retribution, specific deterrence, general deterrence, rehabilitation and community protection are appropriate sentencing purposes. These feed into the offender's penalty: imprisonment, combined custody and treatment order, hospital security order, intensive correction order, suspended sentence, youth training centre detention, community based order, fine, adjournment on conditions, discharge offender, dismiss offence, or defer sentence. A parallel path records the co-offender's penalty and whether reasons to depart from parity with the co-offender certainly, probably, possibly, or don't exist.]

Forward inference: confidence
In the sentencing actual argument database, the following outcomes were noted for the inputs suggested:
    Imprisonment                            57%
    Combined custody and treatment order    0.1%
    Hospital security order                 0%
    Intensive correction order              12%
    Suspended sentence                      2%
    Youth training centre detention         10%
    Community based order                   16%
    Fine                                    0%
    Adjournment on conditions               0%
    Discharge offender                      0%

Backward inference: constructing the strongest argument
If all the items you suggest hold, AND:
    If extremely serious pattern of priors then imprisonment    90%
    If very serious pattern of priors then imprisonment         75%
    If serious pattern of priors then imprisonment              68%
    If not so serious pattern of priors then imprisonment       78%
    If no prior convictions then imprisonment                   2%

Conclusion
• Data mining, or knowledge discovery from databases, has not been appropriately exploited in law to date.
• Association rules are useful in that they suggest hypotheses for future research.
• Association rules integrated into the generic/actual argument model can assist in identifying the most plausible claim from given data items (forward inference) or the likelihood of missing data values (backward inference).

Generating Association Rules
• For each nonempty subset s of l, output the rule s => (l - s) if support_count(l) / support_count(s) >= min_conf, where min_conf is the minimum confidence threshold.
• For l = {2 3 5}, the nonempty proper subsets s of l are {2 3}, {3 5}, {2 5}, {2}, {3} and {5}. Candidate rules:
    {2 3} => {5}    {3 5} => {2}    {2 5} => {3}
    {2} => {3 5}    {3} => {2 5}    {5} => {2 3}
• With the support counts from the example database ({2}: 3, {3}: 3, {5}: 3, {2 3}: 2, {2 5}: 3, {3 5}: 2, {2 3 5}: 2) and min_conf = 75%:
    {2 3} => {5} : 2/2    {3 5} => {2} : 2/2    {2 5} => {3} : 2/3
    {2} => {3 5} : 2/3    {3} => {2 5} : 2/3    {5} => {2 3} : 2/3

Presentation of Association Rules (Table Form)
Visualization of Association Rules Using a Plane Graph
Visualization of Association Rules Using a Rule Graph

A decision tree is a classifier in the form of a tree structure where each node is either:
• a leaf node, indicating a class of instances, or
• a decision node that specifies some test to be carried out on a single attribute value, with one branch and sub-tree for each possible outcome of the test.
A decision tree can be used to classify an instance by starting at the root of the tree and moving through it until a leaf node is reached, which provides the classification of the instance.
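The rule-generation step above (emit s => l − s for every nonempty proper subset s of a frequent itemset l) can be sketched in a few lines; the support counts are the ones from the TID 100–400 example database, and the function name is illustrative.

```python
from itertools import combinations

# Support counts from the example database (TIDs 100-400).
support_count = {
    frozenset({2}): 3, frozenset({3}): 3, frozenset({5}): 3,
    frozenset({2, 3}): 2, frozenset({2, 5}): 3, frozenset({3, 5}): 2,
    frozenset({2, 3, 5}): 2,
}

def rules_from(l, min_conf):
    """Yield (body, head, confidence) for every rule s => (l - s) meeting min_conf."""
    l = frozenset(l)
    for r in range(1, len(l)):                       # nonempty proper subsets only
        for body in map(frozenset, combinations(l, r)):
            conf = support_count[l] / support_count[body]
            if conf >= min_conf:
                yield body, l - body, conf

for body, head, conf in rules_from({2, 3, 5}, min_conf=0.75):
    print(sorted(body), "=>", sorted(head), f"{conf:.0%}")
```

With min_conf = 75%, only {2 3} => {5} and {3 5} => {2} survive (confidence 2/2 = 100%); the other four candidates have confidence 2/3 ≈ 67% and are dropped.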
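The leaf-node/decision-node definition above can be sketched as a small recursive structure. The tree itself (a weather-style toy) and its attribute names are invented for illustration; only the node format follows the definition.

```python
# A decision tree as nested dicts: a decision node tests one attribute and has
# one branch (sub-tree) per possible outcome; a leaf is just a class label.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": "dont_play",
        "overcast": "play",
        "rain": {"attribute": "windy",
                 "branches": {True: "dont_play", False: "play"}},
    },
}

def classify(node, case):
    """Start at the root and follow branches until a leaf gives the class."""
    while isinstance(node, dict):                        # decision node
        node = node["branches"][case[node["attribute"]]]
    return node                                          # leaf: class label

print(classify(tree, {"outlook": "rain", "windy": False}))  # play
```

Each test touches a single attribute value, so classifying an instance costs at most one lookup per level of the tree.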
Example: decision making in the London stock market
Suppose that the major factors affecting the London stock market are:
• what it did yesterday;
• what the New York market is doing today;
• the bank interest rate;
• the unemployment rate;
• England's prospects at cricket.
The process of predicting an instance with this decision tree can be expressed by answering the questions in the following order:
    Is unemployment high?
        YES: the London market will rise today.
        NO: Is the New York market rising today?
            YES: the London market will rise today.
            NO: the London market will not rise today.

Decision tree induction is a typical inductive approach to learning classification knowledge. The key requirements for mining with decision trees are:
• Attribute-value description: each object or case must be expressible in terms of a fixed collection of properties or attributes.
• Predefined classes: the categories to which cases are to be assigned must have been established beforehand (supervised data).
• Discrete classes: a case does or does not belong to a particular class, and there must be far more cases than classes.
• Sufficient data: usually hundreds or even thousands of training cases.
• "Logical" classification model: a classifier that can be expressed as a decision tree or a set of production rules.

An appeal of market analysis comes from the clarity and utility of its results, which are in the form of association rules. There is an intuitive appeal to market analysis because it expresses how tangible products and services relate to each other and how they tend to group together. A rule like "if a customer purchases three-way calling, then that customer will also purchase call waiting" is clear. Even better, it suggests a specific course of action, such as bundling three-way calling with call waiting into a single service package. While association rules are easy to understand, they are not always useful.
The following three rules are examples of real rules generated from real data:
• On Thursdays, grocery store consumers often purchase diapers and beer together.
• Customers who purchase maintenance agreements are very likely to purchase large appliances.
• When a new hardware store opens, one of the most commonly sold items is toilet rings.
These three examples illustrate the three common types of rules produced by association rule analysis: the useful, the trivial, and the inexplicable.

OLAP (Summarization) Display Using MS/Excel 2000
Market-Basket Analysis (Association) — Ball Graph
Display of Association Rules in Rule Plane Form
Display of Decision Tree (Classification Results)
Display of Clustering (Segmentation) Results
3D Cube Browser