CIS 600: Master's Project
Online Trading and Data Mining Based Marketing of IT Books

Supervisor: Dr. Haiping Xu
Student: Tsung-Ta Tu
Student ID: 999-20-1529

Outline
1. Introduction and Motivation
2. Data Mining Technology
3. System Architecture & Demo
4. Analyze and Discuss the Result
5. Conclusion
6. Future Work

Introduction and Motivation
In the Internet era, each e-commerce website maintains a large database of customer transactions, where each transaction consists of the set of items purchased by a customer in one visit. The data in this database is treasure, not garbage: analyzing it can answer important business questions.

Introduction and Motivation (2)
Questions:
(1) How to keep in touch with a growing number of customers?
(2) What are the characteristics, demand patterns, and consumption patterns of the customers?
(3) How to design attractive product bundles that give customers more convenient shopping options?

Data Mining Techniques
(1) Association Rules
(2) Classification
(3) Clustering
(4) Neural Network
(5) Generalization

Association Rules
An association rule is a rule that implies certain association relationships among a set of objects (such as “occur together” or “one implies the other”) in a database. The intuitive meaning of such a rule, of the form X ⇒ Y, is that transactions in the database that contain X tend to contain Y.

Association Rules (2)
The basic process of association rule analysis involves three important concerns:
(1) Choosing the right set of items
(2) Generating rules by deciphering the counts in the co-occurrence matrix
(3) Overcoming the practical limits imposed by thousands or tens of thousands of items appearing in combinations large enough to be interesting

An Example
An example of an association rule is: “75% of transactions that contain diapers also contain beer; 37.5% of all transactions contain both of these items.” Here 75% is called the confidence of the rule, and 37.5% is called the support of the rule.
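The support and confidence figures in the diapers and beer example can be reproduced on a small data set. The sketch below is only an illustration: the eight transactions are hypothetical, chosen so that the numbers match the example, and the helper names `support` and `confidence` are assumptions rather than part of the project's code.

```python
from fractions import Fraction

# Hypothetical toy data: 8 transactions chosen so the results match the example.
transactions = [
    {"diapers", "beer"},
    {"diapers", "beer", "milk"},
    {"diapers", "beer", "bread"},
    {"diapers", "milk"},
    {"beer"},
    {"milk", "bread"},
    {"bread"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of all transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of antecedent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


rule_support = support({"diapers", "beer"}, transactions)
rule_confidence = confidence({"diapers"}, {"beer"}, transactions)

print(float(rule_support))     # 0.375
print(float(rule_confidence))  # 0.75
```

Three of the eight transactions contain both items (support 37.5%), and three of the four transactions containing diapers also contain beer (confidence 75%).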
Jason Manager of IT Book

System Architecture and Skills
Ⅰ. System Architecture (3-Tier):
(1) Server Side: Oracle 9.0.2 Database + Windows XP
(2) Application Side: Tomcat 5.0.18 + Windows XP
(3) Client Side: IE 6.0 + Windows XP
Ⅱ. Skills:
(1) UML
(2) HTML, JavaScript
(3) Java Programming Language (J2SDK)
(5) JSP, Java Servlet
(6) JDBC, Java Bean
(8) Oracle SQL, PL/SQL (Trigger, Procedure, Function)
(9) Oracle Database Management

Use Case Diagram
[Figure: use case diagram for the Customers actor. Use cases shown: Search Books, Check Top10 Books, View Book Information, Create Customer Profile, View Customer Profile, Update Customer Profile, Place order for book, Payment, View Order History. The diagram marks several <<extend>> relationships and one <<include>> relationship involving Payment.]

Use Case Diagram
[Figure: use case diagram for the Manager actor. Use cases shown: Add Book, Update Book Information, Check Books Information, Remove Book, Analyze Association Rules of Books, Add Package for on Sale, Update Package Information, Check on Sale List, Remove Package. The diagram marks several <<extend>> relationships.]

Class Diagram
[Figure: class diagram.]

Display System
[Figure: screen flow of Jason Manager of IT Book. Labels shown: Connect to Jason, Select Book Information, Search Book Information, Book Information, Login, My Profile, Place Order, Shopping Car, Order Information, Manager, Select Classification, Select Book, Profit Association Rule, Promotion.]

Analyze and Discuss the Result
Association rules help us find associations among transactions, but depending on them too heavily ignores other factors that influence customer behavior. For example, the classification and the sales quantity of an item are also important factors to consider.

Analyze and Discuss the Result (2)
Is the most confident rule the best rule? Not necessarily. A rule that predicts A can be worse than simply guessing that A appears in the transaction: if A occurs in 45 percent of the transactions but the rule gives only 33 percent confidence, the rule does worse than random guessing.
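The comparison in this example can be checked with a few lines of arithmetic. This is a minimal sketch using only the two percentages quoted above; the variable names are illustrative assumptions.

```python
# Numbers from the example: the rule predicts A with 33% confidence,
# while A appears in 45% of all transactions.
rule_confidence = 0.33
baseline_a = 0.45

# Ratio of the rule's confidence to the baseline frequency of A.
# A value below 1 means the rule predicts A less often than blind guessing would.
ratio = rule_confidence / baseline_a

print(round(ratio, 2))  # 0.73
print(ratio < 1)        # True
```

Because the ratio is below 1, knowing that the rule's condition holds makes A less likely than its overall frequency, which is why high confidence alone is not enough to rank rules.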
Improvement
Improvement tells how much better a rule is at predicting the result than just assuming the result in the first place. It is given by the following formula:

$$\text{Improvement} = \frac{P(A \wedge B) / P(A)}{P(B)}$$

Improvement (2)
When improvement is greater than 1, the resulting rule is better at predicting the result than random chance. When it is less than 1, the rule is worse than random chance.

The Profit Association Rules
Profit association rules consider not only the basic concepts of association rules but also other influencing factors. The three major portions of a profit association rule are:
(1) Frequency
(2) Quantity
(3) Auxiliary
Each estimate is given a weight, and the weighted estimates are combined to calculate the final value.

Frequency Portion
(1) Support: $P(A \wedge B)$
(2) Confidence: $P(A \wedge B) / P(A)$
(3) Improvement: $[P(A \wedge B) / P(A)] / P(B)$

Quantity Portion
(1) B's sales quantity as a share of the sales quantity of B's classification: $Q(B) / Q(C_B)$
(2) A's sales quantity as a share of the sales quantity of A's classification: $Q(A) / Q(C_A)$
(3) Comparative quantity: $Q(B) / Q(A)$

Auxiliary Portion
A and B have the same author
A and B are in the same classification
Whether A is in the top 10 list
Whether B is in the top 10 list
Etc.

Case Study (1)
Case Study (2)
Case Study (3)

Conclusion
Profit association rules produce an evaluation value that helps the marketing manager make business decisions, including:
(1) Catalog design
(2) What to put on sale
(3) How to design coupons
(4) Cross-marketing

Future Work
Optimize the weight factors of the profit association rule.
Integrate this system into a CRM system (Data Warehouse, Data Mining, Call Center).
Use AI technology to make Jason Manager more like a human being.
Refine domain know-how to deliver business intelligence (BI).

References
R. Agrawal, T. Imielinski, and A. Swami, “Mining association rules between sets of items in large databases,” Proceedings of the ACM-SIGMOD International Conference on Management of Data, Washington, DC, pp. 207-216, 1993.
C. H. Cai, “Mining association rules with weighted items,” Proceedings of the International Database Engineering and Application Symposium, Cardiff, Wales, UK, pp. 68-77, 1998.
A. Gyenesei, “Mining weighted association rules for fuzzy quantitative items,” Technical Report, Turku Centre for Computer Science, no. 346, Finland, 2000.
R. Rastogi and K. Shim, “Mining optimized association rules with categorical and numeric attributes,” IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 1, pp. 29-50, 2002.
P. S. M. Tsai and C. M. Chen, “Mining quantitative association rules in a large database of sales transactions,” Journal of Information Science and Engineering, vol. 17, no. 4, pp. 667-681, 2001.

Thank you