Using Data Mining Methods to Build Customer Profiles
Gediminas Adomavicius, Alexander Tuzhilin
New York University, USA

2008.11.10
Summarized & Presented by Jungyeon Yang
IDS Lab., Seoul National University
Copyright 2008 by CEBT

Contents
  Introduction
  Building Customer Profiles
  Rule Discovery
  Rule Validation
  Validation Operators
  Case Study – The 1:1 Pro System
  Discussion

Introduction
  The personalization community must deal with
    – Who customers are
    – How they behave
    – How similar they are to others
    – How to extract this knowledge
  A customer profile contains
    – Facts about a customer
    – Rules describing that customer's behavior
  This research focuses on
    – Rule validation
    – Implementing a validation system

Building Customer Profiles
  Data model
    Two basic types of data
    – Factual: who the customer is
    – Transactional: what the customer does
  Profile model
    A complete customer profile has two parts
    – A factual profile: gender, age, etc.
    – A behavioral profile: the customer's actions, derived from the customer's transactional data
  Rule discovery
  Rule validation

Rule Discovery
  Rules that describe customer behavior are discovered with
    – The Apriori algorithm, for association rules
    – CART (Classification and Regression Trees), for classification rules
        - Classification tree: for categorical target values
        - Regression tree: for continuous target values

Rule Validation
  One way to validate rules is to let a domain expert inspect them
    – This approach has a scalability problem
  Solution
    – Use validation operators that let an expert validate large numbers of rules at a time with relatively little input from the expert
  Collective rule validation lets the expert deal with rules common to many customers just once
  The expert chooses various validation operators and applies them successively to the set of rules
  The set of all discovered rules is split into three mutually disjoint sets
    – Accepted rules (Racc)
    – Rejected rules (Rrej)
    – Possibly some remaining unvalidated rules (Runv)
  Validation continues
    – until some predefined percentage of rules is validated, or
    – until validation operators validate only a few rules at a time

Validation Operators
  Similarity-based rule grouping
    – Puts similar rules into groups according to expert-specified similarity criteria
    – Example: under the attribute structure similarity condition, all rules that have the same attribute structure are similar
  Template-based rule filtering
    – Filters rules that match expert-specified rule templates
    – The expert specifies accepting and rejecting templates
    – Examples
        REJECT HEAD = {Store = RiteAid}
          "Reject all rules that have Store = RiteAid in their heads."
          Rule 1 would be rejected
        ACCEPT BODY ⊇ {Product} AND HEAD ⊆ {DayOfWeek, Quantity}
          "Accept all rules that have the attribute Product (possibly among other attributes) in their bodies, that also have heads restricted to the attributes DayOfWeek or Quantity."
          Rules 5 and 7 match
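The two templates above can be read as simple predicates over a rule's body and head. The following is a minimal sketch under stated assumptions, not the 1:1 Pro implementation: the `Rule` class, the function names, and the three sample rules are hypothetical (the slides' numbered rules are not reproduced here), and `HEAD = {Store = RiteAid}` is interpreted as an exact match on the head.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Association rule body => head; each side maps attribute -> value."""
    body: dict
    head: dict


def matches_reject_template(rule):
    # REJECT HEAD = {Store = RiteAid} (assumed to mean an exact head match)
    return rule.head == {"Store": "RiteAid"}


def matches_accept_template(rule):
    # ACCEPT BODY ⊇ {Product} AND HEAD ⊆ {DayOfWeek, Quantity}
    body_ok = "Product" in rule.body
    head_ok = set(rule.head) <= {"DayOfWeek", "Quantity"}
    return body_ok and head_ok


def apply_templates(rules):
    """Split rules into accepted, rejected, and still-unvalidated lists."""
    accepted, rejected, unvalidated = [], [], []
    for rule in rules:
        if matches_reject_template(rule):
            rejected.append(rule)
        elif matches_accept_template(rule):
            accepted.append(rule)
        else:
            unvalidated.append(rule)
    return accepted, rejected, unvalidated


# Hypothetical sample rules, for illustration only.
rules = [
    Rule({"Product": "LemonJuice"}, {"Store": "RiteAid"}),
    Rule({"Product": "WheatBread"}, {"DayOfWeek": "Monday"}),
    Rule({"Store": "GrandUnion"}, {"Product": "AppleJuice"}),
]
accepted, rejected, unvalidated = apply_templates(rules)
```

Rules matched by neither template stay unvalidated, so the expert can apply further operators to them in a later pass.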
  Redundant-rule elimination
    – Eliminates rules that by themselves carry no new information about a customer's behavior
    – Example
        Product = AppleJuice => Store = GrandUnion (support 2%, confidence 100%)
        Assume the customer's factual profile contains the fact "The customer shops only at Grand Union"
        The AppleJuice rule would then be eliminated

Case Study – The 1:1 Pro System
  Short for "One-to-One Profiling System"
  A profiling and validation system
  Input
    – The factual and transactional data stored in a database or files
  Architecture (figure not reproduced here)

Discussion
  "Quality" of generated rules
    – Different experts validate rules differently
  Scalability
    – As the number of attributes grows, Apriori becomes a bottleneck
    – As the number of users grows, validation operators must scale up
  Constraint-based rule generation vs. post-analysis
  Examination of groups of rules
    – The expert can apply validation operators to just a particular group of rules and examine its subgroups

Opinions
  Pros
    – Can handle large numbers of rules
    – Rules are easy to validate using the GUI
    – Provides several ways to validate rules using operators
  Cons
    – Other mining algorithms should be applied
    – Constraints are needed when rules are generated (otherwise too many rules are produced)
    – To use the rules for context-aware services, a system with an intuitive UI and rich functionality is necessary
    – The system should be flexible enough to be applied in many domains
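Returning to the redundant-rule elimination operator described under Validation Operators, the AppleJuice example can be sketched as a check of each rule's head against the customer's factual profile. This is an illustrative sketch only: the tuple layout, the `is_redundant` helper, the `known_facts` dictionary, and the second sample rule are assumptions, not part of the original system.

```python
def is_redundant(head, known_facts):
    """A rule adds nothing new if its whole head is already a known fact."""
    return bool(head) and all(
        known_facts.get(attr) == value for attr, value in head.items()
    )


# Each rule: (body, head, support, confidence).
rules = [
    # The example from the slides: (2%, 100%).
    ({"Product": "AppleJuice"}, {"Store": "GrandUnion"}, 0.02, 1.00),
    # Hypothetical second rule, added only for contrast.
    ({"Product": "AppleJuice"}, {"DayOfWeek": "Saturday"}, 0.02, 0.80),
]

# Factual profile: "The customer shops only at Grand Union".
known_facts = {"Store": "GrandUnion"}

kept = [rule for rule in rules if not is_redundant(rule[1], known_facts)]
```

Only the hypothetical Saturday rule survives, because the store is already fixed by the factual profile and the first rule's head therefore says nothing new.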