Elective-I

Examination Scheme:
▪ In-semester assessment: 30
▪ End-semester assessment: 70

Text Books:
▪ Data Mining: Concepts and Techniques - Micheline Kamber
▪ Introduction to Data Mining with Case Studies - G. K. Gupta

Reference Books:
▪ Mining the Web: Discovering Knowledge from Hypertext Data - Soumen Chakrabarti
▪ Reinforcement and Systemic Machine Learning for Decision Making - Parag Kulkarni

Market Basket Analysis
▪ Frequent itemset, closed itemset, association rules
▪ Mining multilevel association rules
▪ Constraint-based association rule mining
▪ Apriori algorithm
▪ FP-growth algorithm

Itemset: A set of items. Each transaction is an itemset.
Confidence: A measure of the trustworthiness of a discovered pattern.
Support: A measure of how often the items in an association occur together, expressed as a percentage of all transactions.
Frequent itemset: An itemset that satisfies the minimum support threshold.

Def: Market Basket Analysis (Association Analysis) is a mathematical modeling technique based on the theory that if you buy a certain group of items, you are likely to buy another group of items. It is used to analyze customer purchasing behavior, and it helps increase sales and maintain inventory by focusing on point-of-sale transaction data.
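The terms defined above can be made concrete with a short Python sketch. It assumes the usual formalization: support is the fraction of transactions containing an itemset, and the confidence of a rule X → Y is support(X and Y together) divided by support(X). The `baskets` data and the function names are hypothetical, chosen only for illustration.

```python
# Hypothetical toy baskets, NOT taken from the course material.
baskets = [
    {"tea", "sugar"},
    {"tea", "sugar", "biscuits"},
    {"tea", "biscuits"},
    {"coffee", "sugar"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def is_frequent(itemset, transactions, min_support):
    """True when the itemset meets the minimum support threshold."""
    return support(itemset, transactions) >= min_support


print(support({"tea", "sugar"}, baskets))            # 2 of 4 baskets
print(confidence({"tea"}, {"sugar"}, baskets))       # 2 of the 3 tea baskets
print(is_frequent({"tea", "sugar"}, baskets, 0.5))   # meets a 50% threshold
```

Multiplying the value returned by `support` by 100 gives the percentage form used in the definition above.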
Identify purchase patterns
▪ what items tend to be purchased together
  ▪ obvious: steak-potatoes; diaper-baby lotion
▪ what items are purchased sequentially
  ▪ obvious: house-furniture; car-tires
▪ what items tend to be purchased by season

Categorize customer purchase behavior
▪ purchase profiles
▪ profitability of each purchase profile

Use it for marketing
▪ layout or catalogs
▪ select products for promotion
▪ space allocation

Customer 1: beer, pretzels, potato chips, aspirin
Customer 2: diapers, baby lotion, grapefruit juice, baby food, milk
Customer 3: soda, potato chips, milk
Customer 4: soup, beer, milk, ice cream
Customer 5: soda, coffee, milk, bread
Customer 6: beer, potato chips

Purchase profiles:
▪ beauty conscious
▪ health conscious
▪ sports conscious
▪ smoker
▪ casual drinker
▪ new family
▪ kids' play
▪ pet lover
▪ gardener
▪ automotive
▪ photographer
▪ tv/stereo enthusiast
▪ convenience food
▪ women's fashion
▪ kid's fashion
▪ hobbyist
▪ student/home office
▪ illness (prescription)
▪ illness (over-the-counter)
▪ casual reader
▪ home handyman
▪ men's image conscious
▪ sentimental
▪ seasonal/traditional
▪ homemaker
▪ home comfort
▪ fashion footwear
▪ men's fashion
▪ personal care

Beauty conscious: cotton balls, hair dye, cologne, nail polish

BENEFITS:
▪ simple computations
▪ can be undirected (don't have to have hypotheses before analysis)
▪ different data forms can be analyzed

Itemset: A collection of one or more items
▪ Example: {Milk, Bread, Diaper}
Support count (σ): Frequency of occurrence of an itemset
▪ E.g. σ({Milk, Bread, Diaper}) = 2
Support: Fraction of transactions that contain an itemset. E.g.
s({Milk, Bread, Diaper}) = 2/5
Frequent Itemset: An itemset whose support is greater than or equal to a minsup threshold

Market-Basket transactions

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Given a set of transactions, find rules that will predict the occurrence of an item based on the occurrences of other items in the transaction.

Example of Association Rules:
▪ {Diaper} → {Beer}
▪ {Milk, Bread} → {Eggs, Coke}
▪ {Beer, Bread} → {Milk}
Implication means co-occurrence, not causality.

Example of Rules (s = support, c = confidence):
▪ {Milk, Diaper} → {Beer} (s=0.4, c=0.67)
▪ {Milk, Beer} → {Diaper} (s=0.4, c=1.0)
▪ {Diaper, Beer} → {Milk} (s=0.4, c=0.67)
▪ {Beer} → {Milk, Diaper} (s=0.4, c=0.67)
▪ {Diaper} → {Milk, Beer} (s=0.4, c=0.5)
▪ {Milk} → {Diaper, Beer} (s=0.4, c=0.5)

Observations:
• All the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer}
• Rules originating from the same itemset have identical support but can have different confidence
• Thus, we may decouple the support and confidence requirements
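The observations above can be checked with a small Python sketch that enumerates every binary partition of {Milk, Diaper, Beer} over the five market-basket transactions. The helper names `support_count` and `rules_from` are hypothetical, and the brute-force enumeration is only an illustration, not the Apriori or FP-growth algorithm.

```python
from itertools import combinations

# The five market-basket transactions (TID 1 to 5) from the table above.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset):
    """sigma(itemset): number of transactions containing the whole itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


def rules_from(itemset):
    """Yield (antecedent, consequent, support, confidence) for every
    split of `itemset` into two non-empty parts."""
    items = sorted(itemset)
    total = support_count(items)
    if total == 0:
        return  # the itemset never occurs, so no rule can be scored
    s = total / len(transactions)  # identical for every rule from this itemset
    for size in range(1, len(items)):
        for lhs in combinations(items, size):
            lhs = set(lhs)
            rhs = set(items) - lhs
            c = total / support_count(lhs)  # depends on the antecedent only
            yield lhs, rhs, s, c


for lhs, rhs, s, c in rules_from({"Milk", "Diaper", "Beer"}):
    print(f"{sorted(lhs)} -> {sorted(rhs)}  s={s:.2f}  c={c:.2f}")
```

Because the support is computed once from the full itemset and only the confidence varies with the antecedent, the sketch mirrors the decoupling noted in the last observation: frequent itemsets can be found first, and rules can be scored from them afterwards.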