Machine Learning: Mining Association Rules
Luigi Cerulo
Department of Science and Technology, University of Sannio

Understanding association rules
• A typical association rule: {peanut butter, jelly} -> {bread}. It means "if peanut butter and jelly are purchased, then bread is also likely to be purchased".
• In boolean logic terms: onions AND potatoes => burger
• Antecedent (left-hand side) => Consequent (right-hand side)
• Association analysis has a variety of business-related applications, such as marketing promotions, inventory management, and customer relationship management. It is a methodology for discovering interesting relationships hidden in large data sets; the uncovered relationships can be represented in the form of association rules.

Market basket analysis
• Transactions

Table 6.1. An example of market basket transactions.

TID   Items
1     {Bread, Milk}
2     {Bread, Diapers, Beer, Eggs}
3     {Milk, Diapers, Beer, Cola}
4     {Bread, Milk, Diapers, Beer}
5     {Bread, Milk, Diapers, Cola}

Dataset as examples and features

Table 6.2. A binary 0/1 representation of market basket data.

TID   Bread  Milk  Diapers  Beer  Eggs  Cola
1     1      1     0        0     0     0
2     1      0     1        1     1     0
3     0      1     1        1     0     1
4     1      1     1        1     0     0
5     1      1     1        0     0     1

• This representation is a very simplistic view of real market basket data, because it ignores important aspects of the data such as the quantity of items sold or the price paid to purchase them. Methods for handling such non-binary data are explained in Chapter 7.

Association rules discovery
• Association rules are not used for prediction, but rather for unsupervised knowledge discovery in large databases.
• There is no need for the algorithm to be trained, and data does not need to be labeled ahead of time.
• The algorithm is simply unleashed on a dataset in the hope that interesting associations are found.
• How do we measure whether a rule is interesting? Whether or not an association rule is deemed interesting is determined by two statistical measures: support and confidence.
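The mapping from Table 6.1's transaction lists to Table 6.2's binary 0/1 matrix can be sketched in a few lines. (The hands-on part of this lecture uses R's arules package; the Python below is purely illustrative.)

```python
# Table 6.1: market basket transactions, keyed by TID.
transactions = {
    1: {"Bread", "Milk"},
    2: {"Bread", "Diapers", "Beer", "Eggs"},
    3: {"Milk", "Diapers", "Beer", "Cola"},
    4: {"Bread", "Milk", "Diapers", "Beer"},
    5: {"Bread", "Milk", "Diapers", "Cola"},
}

# Fixed column order, as in Table 6.2.
items = ["Bread", "Milk", "Diapers", "Beer", "Eggs", "Cola"]

# One binary 0/1 row per transaction: 1 if the item is in the basket.
binary = {tid: [1 if item in basket else 0 for item in items]
          for tid, basket in transactions.items()}

for tid, row in binary.items():
    print(tid, row)
```

Each printed row matches the corresponding row of Table 6.2.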
Itemset and support count
• Let I = {i1, i2, ..., id} be the set of all items (features) in a market basket dataset, and let T = {t1, t2, ..., tN} be the set of all transactions (examples). Each transaction ti contains a subset of items chosen from I.
• In association analysis, a collection of zero or more items is termed an itemset. If an itemset contains k items, it is called a k-itemset (a k-itemset is an itemset of k items). For instance, {Beer, Diapers, Milk} is a 3-itemset. The null (or empty) set is an itemset that does not contain any items.
• The transaction width is defined as the number of items present in a transaction.
• A transaction tj is said to contain an itemset X if X is a subset of tj. For example, the second transaction shown in Table 6.2 contains the itemset {Bread, Diapers} but not {Bread, Milk}.
• An important property of an itemset is its support count, which refers to the number of transactions that contain the itemset. Mathematically, the support count, σ(X), for an itemset X can be stated as follows:

    σ(X) = |{ti | X ⊆ ti, ti ∈ T}|

where the symbol |·| denotes the number of elements in a set.
• Example: in the data set shown in Table 6.2, the support count for {Beer, Diapers, Milk} is equal to two, because there are only two transactions that contain all three items.
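The support count definition σ(X) = |{ti | X ⊆ ti, ti ∈ T}| translates directly into code. A minimal sketch over the Table 6.1 transactions (Python for illustration only):

```python
# Transactions from Table 6.1.
T = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]

def support_count(X, T):
    """sigma(X): number of transactions t in T with X a subset of t."""
    return sum(1 for t in T if X <= t)

print(support_count({"Beer", "Diapers", "Milk"}, T))  # → 2
```

Only transactions 3 and 4 contain all three items, matching the example in the text.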
Association rule
• An association rule is an implication expression of the form X -> Y, where X and Y are disjoint itemsets, i.e., X ∩ Y = ∅.
• The strength of an association rule can be measured in terms of its support and confidence.
• Support determines how often a rule is applicable to a given data set.
• Confidence determines how frequently items in Y appear in transactions that contain X.

Support and confidence
• The formal definitions of these metrics are:

    Support:     s(X -> Y) = σ(X ∪ Y) / N        (6.1)
    Confidence:  c(X -> Y) = σ(X ∪ Y) / σ(X)     (6.2)

• Example 6.1. Consider the rule {Milk, Diapers} -> {Beer}.
Since the support count for {Milk, Diapers, Beer} is 2 and the total number of transactions is 5, the rule's support is 2/5 = 0.4. The rule's confidence is obtained by dividing the support count for {Milk, Diapers, Beer} by the support count for {Milk, Diapers}. Since there are 3 transactions that contain milk and diapers, the confidence for this rule is 2/3 ≈ 0.67.

• Summary of the example:

    support count σ(X ∪ Y) = 2,  N = 5  =>  Support(X, Y) = 2/5 = 0.4
    support count σ(X) = 3              =>  Confidence(X, Y) = 2/3 ≈ 0.67

• Support is an important measure because a rule that has very low support may occur simply by chance. A low support rule is also likely to be uninteresting from a business perspective, because it may not be profitable to promote items that customers seldom buy together (with the exception of the situation described in Section 6.8). For these reasons, support is often used to eliminate uninteresting rules.
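Example 6.1 can be re-derived in code. A minimal sketch (Python for illustration only) computing equations (6.1) and (6.2) on the Table 6.1 transactions:

```python
# Transactions from Table 6.1.
T = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]

def sigma(X):
    """Support count: number of transactions containing itemset X."""
    return sum(1 for t in T if X <= t)

# The rule {Milk, Diapers} -> {Beer} of Example 6.1.
X, Y = {"Milk", "Diapers"}, {"Beer"}
N = len(T)

support = sigma(X | Y) / N            # s(X -> Y) = sigma(X u Y) / N
confidence = sigma(X | Y) / sigma(X)  # c(X -> Y) = sigma(X u Y) / sigma(X)

print(support)               # → 0.4
print(round(confidence, 2))  # → 0.67
```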
Why use support and confidence?
• Support is an important measure because a rule that has very low support may occur simply by chance.
• Confidence measures the reliability of the inference made by a rule.
• Support is an estimate of the probability P(X ∪ Y), the probability that a transaction contains both X and Y.
• Confidence is an estimate of the conditional probability P(Y | X).

The association rule mining problem
• Given a set of transactions T, find all the rules having support ≥ minsup and confidence ≥ minconf.
• minsup and minconf are the corresponding support and confidence thresholds.
• A brute-force approach for mining association rules is to compute the support and confidence for every possible rule. This approach is prohibitive because there are exponentially many rules that can be extracted from a data set. More specifically, the total number of possible rules for a data set that contains d items (d is the number of items, i.e. features) is

    R = 3^d - 2^(d+1) + 1

The proof of this equation is left as an exercise to the readers (see the exercise on page 405). Even for the small data set shown in Table 6.1, where d = 6, this amounts to 3^6 - 2^7 + 1 = 602 possible rules.

Apriori algorithm
• R. Agrawal and R. Srikant, "Fast algorithms for mining association rules," in Proceedings of the 20th International Conference on Very Large Databases, pp. 487-499, (1994).
• The Apriori principle states that all subsets of a frequent itemset must also be frequent. In other words, if {A, B} is frequent, then {A} and {B} both must be frequent.
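The rule-count formula R = 3^d - 2^(d+1) + 1 given above can be sanity-checked by brute-force enumeration of every rule X -> Y with non-empty, disjoint X and Y (an illustrative sketch, not part of the original slides):

```python
from itertools import combinations

def count_rules_brute_force(d):
    """Enumerate every rule X -> Y over d items, X and Y non-empty and disjoint."""
    items = range(d)
    count = 0
    for i in range(1, d + 1):                  # size of the antecedent X
        for X in combinations(items, i):
            rest = [x for x in items if x not in X]
            for j in range(1, len(rest) + 1):  # size of the consequent Y
                count += sum(1 for _ in combinations(rest, j))
    return count

def count_rules_formula(d):
    return 3**d - 2**(d + 1) + 1

# The closed form matches the enumeration for small d.
for d in range(2, 8):
    assert count_rules_brute_force(d) == count_rules_formula(d)

print(count_rules_formula(6))  # → 602, even for the 6-item data set of Table 6.1
```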
• Recall also that, by definition, the support metric indicates how frequently an itemset appears in the data. Therefore, if we know that {A} does not meet a desired support threshold, there is no reason to consider {A, B} or any itemset containing {A}; it cannot possibly be frequent.

Apriori algorithm
• The Apriori algorithm uses the Apriori principle to exclude potential association rules prior to actually evaluating them.
• It occurs in two phases:
  • Identifying all itemsets that meet a minimum support threshold.
  • Creating rules from these itemsets that meet a minimum confidence threshold.

Apriori algorithm (phase 1)
• The first phase occurs in multiple iterations. Each iteration involves evaluating the support of a set of increasingly large itemsets. E.g., iteration 1 involves evaluating 1-itemsets, iteration 2 evaluates 2-itemsets, and so on.
• The result of each iteration i is a set of all i-itemsets that meet the minimum support threshold.
• All the itemsets from iteration i are combined in order to generate candidate itemsets for evaluation in iteration i + 1.
• If {A}, {B}, and {C} are frequent in iteration 1 while {D} is not frequent, then iteration 2 will consider only {A, B}, {A, C}, and {B, C}, rather than the six pairs that would have been evaluated if sets containing D had not been eliminated a priori.
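The level-wise search of phase 1 can be sketched as follows. This is an illustrative Python implementation (not the arules code used later in the lecture), run on the Table 6.1 transactions with a minimum support count of 3:

```python
from itertools import combinations

# Transactions from Table 6.1.
T = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]
min_count = 3  # minimum support count

def sigma(X):
    return sum(1 for t in T if X <= t)

# Iteration 1: frequent 1-itemsets.
items = sorted(set().union(*T))
frequent = [{frozenset([i]) for i in items if sigma({i}) >= min_count}]

# Iteration k: candidates are unions of frequent (k-1)-itemsets, kept only if
# every (k-1)-subset is frequent (the Apriori principle), then support-checked.
k = 2
while frequent[-1]:
    prev = frequent[-1]
    candidates = {a | b for a in prev for b in prev if len(a | b) == k}
    candidates = {c for c in candidates
                  if all(frozenset(s) in prev for s in combinations(c, k - 1))}
    frequent.append({c for c in candidates if sigma(c) >= min_count})
    k += 1

all_frequent = set().union(*frequent)
print(sorted(sorted(s) for s in all_frequent))
```

On this data the search stops at size 2: {Bread, Milk, Diapers} survives the subset pruning but fails the support check, so no 3-itemset is frequent.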
• Suppose that during iteration 2 it is discovered that {A, B} and {B, C} are frequent, but {A, C} is not. Although iteration 3 would normally begin by evaluating the support for {A, B, C}, this step need not occur at all. Why not? Because {A, C} is infrequent, the Apriori principle guarantees that every superset of it, including {A, B, C}, must also be infrequent.

Apriori algorithm (phase 2)
• When no bigger itemsets exist, the second phase of the Apriori algorithm may begin. Given the set of frequent itemsets, association rules are generated from all possible subsets. E.g., {A, B} would result in candidate rules {A} -> {B} and {B} -> {A}.
• These are evaluated against a minimum confidence threshold, and any rules that do not meet the desired confidence level are eliminated.

Association rules discovery
Finding Patterns – Market Basket Analysis Using Association Rules
• This algorithm, like all learning algorithms, is not without its strengths and weaknesses. Some of these are listed as follows:

Strengths:
• Ideally suited for working with very large amounts of transactional data
• Results in rules that are easy to understand
• Useful for "data mining" and discovering unexpected knowledge in databases

Weaknesses:
• Not very helpful for small datasets
• Takes effort to separate the insight from the common sense
• Easy to draw spurious conclusions from random patterns

• As noted earlier, the Apriori algorithm employs a simple a priori belief as a guideline for reducing the association rule search space: all subsets of a frequent itemset must also be frequent. This heuristic is known as the Apriori property. Using this astute observation, it is possible to dramatically limit the number of rules to search. For example, the set {motor oil, lipstick} can only be frequent if both {motor oil} and {lipstick} occur frequently as well.
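Phase 2 can be sketched the same way: take the frequent itemsets (here hardcoded to the size-2 itemsets that phase 1 finds on Table 6.1 with a minimum support count of 3), generate every candidate rule, and keep only those meeting the confidence threshold. Python for illustration; the threshold 0.8 is an arbitrary choice for the sketch:

```python
from itertools import combinations

# Transactions from Table 6.1.
T = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]

def sigma(X):
    return sum(1 for t in T if set(X) <= t)

# Frequent itemsets of size >= 2 (support count >= 3 on Table 6.1),
# hardcoded here to keep the sketch self-contained.
frequent = [{"Bread", "Milk"}, {"Bread", "Diapers"},
            {"Milk", "Diapers"}, {"Diapers", "Beer"}]
min_conf = 0.8

rules = []
for itemset in frequent:
    for r in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), r):
            X = set(antecedent)
            Y = itemset - X
            conf = sigma(itemset) / sigma(X)  # c(X -> Y) = sigma(X u Y) / sigma(X)
            if conf >= min_conf:
                rules.append((tuple(sorted(X)), tuple(sorted(Y)), conf))

for X, Y, conf in rules:
    print(X, "->", Y, "confidence", conf)
```

With this threshold only {Beer} -> {Diapers} survives (confidence 3/3 = 1.0); all other candidates have confidence 0.75 and are eliminated.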
• Consequently, if either motor oil or lipstick is infrequent, then any set containing these items can be excluded from the search.
• For additional details on the Apriori algorithm, refer to: R. Agrawal and R. Srikant, "Fast algorithms for mining association rules," in Proceedings of the 20th International Conference on Very Large Databases, pp. 487-499, (1994).

Identifying frequently purchased groceries with association rules
• Dataset contained in the arules R package: M. Hahsler, K. Hornik, and T. Reutterer, "Implications of probabilistic data modeling for mining association rules," in Studies in Classification, Data Analysis, and Knowledge Organization: from Data and Information Analysis to Knowledge Engineering, pp. 598-605, (2006).
• The data contain 9,835 transactions recorded over 30 days (about 327 transactions per day, or roughly 30 transactions per hour in a 12-hour business day).

Exploring and preparing the data
• There might be five brands of milk, a dozen different types of laundry detergent, and three brands of coffee.
• We are not interested in associations between different brands of milk or detergent.
• Thus, all brand names can be removed from the purchases. This reduces the number of groceries to a more manageable 169 types, using broad categories such as chicken, frozen meals, margarine, and soda.
• To look at the contents of the sparse matrix, use the inspect() function in combination with vector operators.
• The itemInfo() function shows the column labels (items) of the sparse matrix.

Step 3 – training a model on the data
The apriori() function
• With data preparation taken care of, we can now work at finding the associations among shopping cart items. We will use an implementation of the Apriori algorithm in the arules package we've been using for exploring and preparing the groceries data. You'll need to install and load this package if you have not done so already.
• The following table shows the syntax for creating sets of rules with the apriori() function.
• Although running the apriori() function is straightforward, there can sometimes be a fair amount of trial and error when finding the support and confidence parameters that produce a reasonable number of association rules. If you set these levels too high, you might find no rules at all.

Train the model: minsup = 0.1, minconf = 0.8
• No rules are found. This is not surprising: with minsup = 0.1, in order to generate a rule an item must have appeared in at least 0.1 × 9,835 = 983.5 transactions. Since only eight items appeared this frequently in our data, it's no wonder we didn't find any rules.

Setting the minimum support
• One way to approach the problem of setting support is to think about the minimum number of transactions you would need before you would consider a pattern interesting.
• For instance, you could argue that if an item is purchased twice a day (about 60 times over the 30 days of data), then it may be worth taking a look at.
• From there, it is possible to calculate the support level needed to find only rules matching at least that many transactions.
• Since 60 out of 9,835 equals about 0.006, we'll try setting minsup = 0.006.

Setting the minimum confidence
• Setting the minimum confidence involves a tricky balance.
• On one hand, if confidence is too low, then we might be overwhelmed with a large number of unreliable rules (e.g., rules indicating items commonly purchased with batteries).
• On the other hand, if we set confidence too high, then we will be limited to rules that are obvious or inevitable (e.g., the fact that a smoke detector is always purchased in combination with batteries).
• The appropriate minimum confidence level depends a great deal on the goals of your analysis. If you start with conservative values, you can always reduce them to broaden the search if you aren't finding actionable intelligence.
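The threshold arithmetic from the slides above, reproduced numerically (illustrative Python; the numbers come from the groceries dataset described earlier):

```python
n_transactions = 9835  # transactions in the groceries data

# minsup = 0.1 requires an item in at least 10% of all transactions:
print(round(0.1 * n_transactions, 1))  # → 983.5 -- too demanding for this data

# Working backwards from "an item purchased about twice a day,
# i.e. roughly 60 times over the 30 days of data":
min_transactions = 60
minsup = min_transactions / n_transactions
print(round(minsup, 4))                # → 0.0061, hence the choice minsup = 0.006
```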
• Let's start with minconf = 0.5.

Train the model: minsup = 0.006, minconf = 0.25

Inspect rules

Evaluation of association rules
• Association analysis algorithms have the potential to generate a large number of patterns.
• Sifting through the patterns to identify the most interesting ones is not a trivial task, because "one person's trash might be another person's treasure."
• It is therefore important to establish a set of well-accepted criteria for evaluating the quality of association patterns, based on objective measures and subjective arguments.

Limitations of the support-confidence framework
• The existing association rule mining formulation relies on the support and confidence measures to eliminate uninteresting patterns. The drawback of support was described previously: many potentially interesting patterns involving low support items might be eliminated by the support threshold. The drawback of confidence is more subtle and is best demonstrated with the following example.
• Example 6.3. Suppose we are interested in analyzing the relationship between people who drink tea and coffee. We may gather information about the beverage preferences among a group of 1000 people and summarize their responses into a table such as the one shown in Table 6.8.

Table 6.8. Beverage preferences among a group of 1000 people.

          Coffee   No coffee   Total
Tea       150      50          200
No tea    650      150         800
Total     800      200         1000

• The information given in this table can be used to evaluate the association rule {Tea} -> {Coffee}. At first glance, it may appear that people who drink tea also tend to drink coffee, because the rule's support (15%) and confidence (75%) values are reasonably high.
• This argument would have been acceptable except that the fraction of people who drink coffee, regardless of whether they drink tea, is 80%, while the fraction of tea drinkers who drink coffee is only 75%:

    Support(Tea, Coffee) = 150/1000 = 15%
    Confidence(Tea, Coffee) = 150/200 = 75%
    P(Coffee) = 800/1000 = 0.80
    P(Coffee | Tea) = 0.75

• Thus, knowing that a person is a tea drinker actually decreases her probability of being a coffee drinker from 80% to 75%! The rule {Tea} -> {Coffee} is therefore misleading despite its high confidence value.

Limitations of support/confidence
• The pitfall of confidence can be traced to the fact that the measure ignores the support of the itemset in the rule consequent.
• If the support of coffee drinkers is taken into account, we would not be surprised to find that many of the people who drink tea also drink coffee.
• What is more surprising is that the fraction of tea drinkers who drink coffee is actually less than the overall fraction of people who drink coffee, which points to an inverse relationship between tea drinkers and coffee drinkers.
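The tea/coffee numbers can be verified directly, including the lift value that the next slide defines (illustrative Python; counts taken from Table 6.8):

```python
# Counts from Table 6.8 (1000 people).
n = 1000
tea_and_coffee = 150
tea = 200
coffee = 800

support = tea_and_coffee / n        # s(Tea -> Coffee) = 0.15
confidence = tea_and_coffee / tea   # c(Tea -> Coffee) = P(Coffee | Tea) = 0.75
p_coffee = coffee / n               # overall fraction of coffee drinkers = 0.80

# lift = confidence / support of the consequent
lift = confidence / p_coffee

print(support, confidence, p_coffee)
print(round(lift, 4))  # → 0.9375: less than 1, so tea and coffee are
                       #   negatively correlated despite the high confidence
```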
Interest factor (lift measure)
• Lift is defined as the ratio between the rule's confidence and the support of the itemset in the rule consequent:

    lift(X, Y) = Confidence(X, Y) / Support(Y)

• lift(X, Y) = 1: X and Y are independent
• lift(X, Y) > 1: X and Y are positively correlated
• lift(X, Y) < 1: X and Y are negatively correlated
• Statistical interpretation: the lift metric compares the frequency of a pattern against a baseline frequency computed under the statistical independence assumption:

    lift(X, Y) = P(Y ∩ X) / (P(X) P(Y)) = P(Y|X) P(X) / (P(X) P(Y)) = P(Y|X) / P(Y)

where P(X) P(Y) is the baseline under the independence assumption.

Sort rules by lift

Exercises
1. Explore other interestingness measures with the function interestMeasure() provided by the arules package.