UNIVERSITY OF SOUTH AUSTRALIA
Assignment Cover Sheet – Internal

An Assignment cover sheet needs to be included with each assignment. Please complete all details clearly. If you are submitting the assignment on paper, please staple this sheet to the front of each assignment. If you are submitting the assignment online, please ensure this cover sheet is included at the start of your document. (This is preferable to a separate attachment.) Please check your Course Information Booklet or contact your School Office for assignment submission locations.

Name: Sun Kang
Student ID: 100103553
Email: [email protected]
Course code and title: CIS research methods
School: Computer and Information Science
Program Code: INFT4017
Course Coordinator: Prof. Paul A. Swatman
Tutor: Prof. Paul A. Swatman
Day, Time, Location of Tutorial/Practical:
Due date: 14th June 2009
Assignment number: Assignment 2a
Assignment topic as stated in Course Information Booklet: research proposal
Further Information: (e.g. state if extension was granted and attach evidence of approval, Revised Submission Date)

I declare that the work contained in this assignment is my own, except where acknowledgement of sources is made. I authorise the University to test any work submitted by me, using text comparison software, for instances of plagiarism. I understand this will involve the University or its contractor copying my work and storing it on a database to be used in future to test work submitted by others. I understand that I can obtain further information on this matter at http://www.unisa.edu.au/ltu/students/study/integrity.asp

Note: The attachment of this statement on any electronically submitted assignments will be deemed to have the same authority as a signed statement.
Date: 14th June 2009
Signed: Sun Kang
Date received from student:
Recorded:
Assessment/grade:
Assessed by:
Dispatched (if applicable):

The University of South Australia
INFT 4017 CIS Research Methods
Research proposal: An application of online shopping promotion based on personalization recommendation
Supervisor: Jiuyong Li (John)
Student ID: 100103553
Student Name: Sun Kang

Table of contents
1. Introduction
  1.1 Title
  1.2 Background
  1.3 Motivation
2. Literature review
  2.1 Mature applications
  2.2 Related studies
  2.3 Data mining algorithms
3.0 Research methodology
  3.1 Data collection
  3.2 Data relevance
  3.3 Limitations
4.0 Project planning
  4.1 Expected outcomes
  4.2 Project planning
Reference

Disclaimer
I declare the following part is my own work, unless otherwise referenced, as defined by the University's policy.
Sun Kang

1. Introduction
1.1 Title
An application of online shopping promotion based on personalization recommendation

1.2 Background
Online shopping began in the 1990s and has grown rapidly ever since. It is one of the products of the internet, which has greatly changed our lifestyle. The term 'e-commerce' refers to all transactions conducted online; the shopping websites that consumers normally deal with belong to the B2C (business-to-consumer) category. An online shop promotes certain kinds of products or services to its target users. However, the number of products and services on offer has increased enormously in a short period, so categorization is needed for customers to find the items they need. Search engines and product classification both help customers to find products. Even so, information overload makes customers get lost in the website, and helping customers reach the items they really want has become a new demand on every online shopping website.
Personalization recommendation is a newer approach to this problem. Personalization recommendation, also known as a recommender system, provides useful and important information to the customers of online shopping websites (Zanker & Jessenitschnig 2009). The concept is to provide every customer with a personal store that holds useful and interesting items, all of them generated by the website's software system. The background system built into the online shopping website is the application of personalization recommendation. Three types of personalized recommendation are commonly used in real-world applications (Candillier, Meyer & Fessant 2008): collaborative filtering, content-based filtering and hybrid filtering.

Kim et al. (2002) state that collaborative filtering is the most successful method for personalized recommendation in online shopping. Collaborative filtering is used by Amazon, the world's most successful B2C retailer (Linden, Smith & York 2003), whose collaborative filtering based recommendation system has been giving suggestions to customers for many years. The basic concept of the collaborative filtering algorithm is to find the customers whose preferences are most similar to those of the active customer and to present useful suggestions based on them (Wei, Huang & Fu 2007). Data from multiple agents, viewpoints and sources are gathered together to filter out the information that interests the customer, which usually requires a very large data set.

The data mining process in this project will apply a weighting-scheme association rule mining method to a real-world data set. Association rule mining is one of the earliest and most successful methods in the data mining domain.
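
The collaborative filtering concept described above (find the customer most similar to the active customer, then suggest what that customer liked) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the customer names, the ratings, the use of cosine similarity and the min_rating threshold are hypothetical and are not taken from the cited systems.

```python
from math import sqrt

# Hypothetical 1-5 ratings; customer and item names are illustrative only.
RATINGS = {
    "alice": {"camera": 5, "tripod": 4, "lens": 5},
    "bob": {"camera": 5, "tripod": 5, "lens": 4, "flash": 4},
    "carol": {"novel": 5, "camera": 1, "flash": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts; unrated items count as 0."""
    dot = sum(score * b[item] for item, score in a.items() if item in b)
    norm_a = sqrt(sum(s * s for s in a.values()))
    norm_b = sqrt(sum(s * s for s in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(active, ratings):
    """Return the other customer whose ratings are closest to the active one."""
    others = [name for name in ratings if name != active]
    return max(
        others,
        key=lambda name: cosine_similarity(ratings[active], ratings[name]),
    )


def recommend(active, ratings, min_rating=4):
    """Suggest items the nearest customer rated highly and the active one has not rated."""
    neighbour = ratings[most_similar(active, ratings)]
    return sorted(
        item
        for item, score in neighbour.items()
        if score >= min_rating and item not in ratings[active]
    )


if __name__ == "__main__":
    print(most_similar("alice", RATINGS))
    print(recommend("alice", RATINGS))
```

With this toy data the nearest customer to "alice" is "bob", so the only suggestion is the item that "bob" rated highly and "alice" has not rated. A production system would normally combine several neighbours rather than one, which this sketch leaves out for brevity.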
Put simply, a weighting scheme scores the attributes in the data set differently, and the more important attributes are assigned higher scores. Computing the score for each item and summing the scores gives a total score, which is then applied in association rule mining (Wang, Xin & Coenen 2007). Association rule mining works by searching for relationships between different items in a large data set. It is well known through market basket analysis: customers who bought this also bought that.

1.3 Motivation
Personalized recommendation systems are needed because of information overload, which has become a common problem for all e-commerce companies. Recommender systems have been in use since the 1990s, but many of them failed because of poor customer satisfaction (Wei, Huang & Fu 2007). The main purpose of this research is to improve customer satisfaction by applying new knowledge to the system. Understanding customers better leads to understanding the market better, and knowing what customers really want is important for all internet marketers (Yang & Lai 2006). E-commerce websites such as Amazon have collected large amounts of data about their customers. Finding out which data are more important than others is another purpose of this research.

I have also chosen this topic because of personal experience. Some of the items promoted by e-commerce websites are useless to the customer, and some customers have been ignoring this information for a long time. The recommendations generated by the system should attract the customer's interest. Applying domain knowledge to data mining methods is another reason for developing this application. Assigning a grade from 1 to 5 to different aspects of a particular item is a common technique in marketing. The table below shows an example of how important each aspect is to different customers.

             Performance  Style  Brand  Cost
Customer 1        4         4      5     1
Customer 2        3         5      2     4

Many websites provide ratings and comments on the items in their online store, and this information guides future users. However, the comments on the same item can differ, because different people have different perspectives. In the proposed application, the current user completes a short survey before viewing the online shopping website. The survey helps the system to find a perfectly matching or similar customer in the database, and the ratings and comments from that customer are used to guide the current customer. The purpose of this application is to improve customer satisfaction and the revenue of the online shopping site at the same time.

2. Literature review
2.1 Mature applications
One group of researchers noted that interest in buying products over the internet is declining. The volume of information about products, customers and transactions has created an information overload problem for all e-commerce companies, and recommender systems have been widely used to solve it (Wei, Huang & Fu 2007).

Amazon is the world's largest retailer (Amazon.com 2009). A group of researchers from Amazon published their recommendation algorithm, item-to-item collaborative filtering, which Amazon now uses. Customers who purchase an item are asked to rate it. Based on these ratings, the system builds a table of similar items. When a customer clicks on one item in the table, the most similar items are displayed. The item-to-item algorithm computes the similarity between different items from customers' ratings (Linden, Smith & York 2003).

GroupLens, one of the earliest and most successful recommender systems, gathers and distributes ratings and uses the ratings of some users to predict the preferences of the rest (Resnick et al. 1994). Users who care about net news rate the articles on certain aspects.
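
The item-to-item idea attributed to Amazon earlier in this section (compute the similarity between items from customers' ratings, then show the most similar items) can be sketched as follows. This is only an illustration under stated assumptions: the customers, items and ratings are hypothetical, and cosine similarity with a top_n cut-off is one common choice rather than a description of Amazon's production system.

```python
from math import sqrt

# Hypothetical 1-5 ratings keyed by customer; all names are illustrative only.
RATINGS = {
    "c1": {"phone": 5, "case": 4},
    "c2": {"phone": 4, "case": 5, "charger": 3},
    "c3": {"charger": 4, "novel": 5},
}


def item_vectors(ratings):
    """Invert customer -> item ratings into item -> customer ratings."""
    vectors = {}
    for customer, rated in ratings.items():
        for item, score in rated.items():
            vectors.setdefault(item, {})[customer] = score
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(score * v[key] for key, score in u.items() if key in v)
    norm_u = sqrt(sum(s * s for s in u.values()))
    norm_v = sqrt(sum(s * s for s in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_items(item, ratings, top_n=2):
    """Return up to top_n items most similar to the given item, best first."""
    vectors = item_vectors(ratings)
    target = vectors[item]
    scored = [
        (cosine(target, vector), other)
        for other, vector in vectors.items()
        if other != item
    ]
    # Highest similarity first; ties are broken alphabetically for stability.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for similarity, other in scored[:top_n] if similarity > 0]


if __name__ == "__main__":
    print(similar_items("phone", RATINGS))
```

Because the similarity table depends only on items, it can in principle be computed ahead of time and looked up when a customer views an item; the sketch recomputes it on every call only to stay short.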
Based on the ratings from other users, the system can predict what the current user wants. The system's notable features include openness, ease of use, compatibility, scalability and privacy.

2.2 Related studies
Yang & Lai (2006) point out that the data collected are often insufficient and that more data need to be collected to improve accuracy. The more accurate the data generated by the system, the more useful the recommender system is. For instance, the database should record when a customer moves items into or out of the shopping cart. Learning from customer shopping behaviour thus becomes another important factor in increasing accuracy. Natarajan & Shekar (2005) mention that data that used to be inaccessible can now be recorded to help firms analyse customer patterns better. Interestingness is a newer approach to measuring customer expectations. New technologies allow firms to record new types of data in their databases; mouse movement, for example, shows the part of a webpage that the customer focuses on most.

2.3 Data mining algorithms
A researcher from Poland proposed a new algorithm that applies indirect association rules to web recommendation (Kazienko 2009). The algorithm searches for items that are not directly associated with each other but are all linked to a common group of items. Using this algorithm, items that are not directly related yet are connected to each other can be found and displayed to the customer or user. This approach helps users to find the same thing presented in different forms.

Two South Korean researchers proposed a new algorithm that uses product taxonomy together with collaborative filtering (Cho & Kim 2004). Its purpose is to solve the sparsity and scalability problems that exist in current collaborative filtering recommender systems.
All of the products in the database are represented as a hierarchical tree that groups similar items together. Based on the tree, further processing produces a list of recommendations for the customer.

As mentioned before, association rule mining is one of the most successful data mining methods for personalized recommendation systems. Changchien & Lu (2001) show that integrating a neural network method can improve performance. Clustering is used to divide customers and products into groups, and association rule mining is used to extract rules between the clusters. These rules help e-commerce sites to improve their one-to-one online business.

Markellou et al. (2005) proposed a recommendation approach that integrates semantic annotations. From the results reported in the paper, the system appears to perform well on the given data set. The system applies naïve Bayes classifiers for categorization, and association rule mining is used to generate interesting patterns from customers' online shopping history. The paper focuses on improving recommendations for new users.

Researchers in Singapore proposed an associative classification based recommendation system for B2C websites, using a mobile phone (hand phone) data set (Zhang & Jiao 2007). The concept is to reduce the variation in wording and to enlarge the search scope of the customer requirements. Semantic meaning is used to derive the customer requirements from all of the customers. After the requirements have been derived, the system generalizes rules based on them.

PRES is a personalized recommendation system based on content-based filtering, proposed by two researchers (Meteren & Someren 2000). PRES relies mainly on the user profile and on feedback from other users.
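
Several of the approaches reviewed in this section build on association rules of the market basket kind. Such rules are conventionally scored by support and confidence, and a minimal sketch of those two measures is given here. The baskets, item names and thresholds are hypothetical, and the code shows the standard unweighted measures only; it does not implement the weighted scheme of Wang, Xin & Coenen (2007) or the indirect rules of Kazienko (2009).

```python
from itertools import combinations

# Hypothetical shopping baskets; item names are illustrative only.
BASKETS = [
    {"phone", "case", "charger"},
    {"phone", "case"},
    {"phone", "charger"},
    {"novel", "bookmark"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimated probability of the consequent given the antecedent."""
    base = support(antecedent, baskets)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), baskets) / base


def pair_rules(baskets, min_support=0.5, min_confidence=0.6):
    """List single-item rules x -> y that pass both thresholds."""
    items = sorted(set().union(*baskets))
    rules = []
    for a, b in combinations(items, 2):
        for x, y in ((a, b), (b, a)):
            s = support({x, y}, baskets)
            c = confidence({x}, {y}, baskets)
            if s >= min_support and c >= min_confidence:
                rules.append((x, y, s, c))
    return rules


if __name__ == "__main__":
    for x, y, s, c in pair_rules(BASKETS):
        print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}")
```

Checking every pair of items is workable only for a small catalogue. Algorithms such as Apriori prune candidates using the support threshold, which this sketch omits for brevity.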
Information about the users and about the webpages is stored separately in the database, and the recommender system finds useful information based on the users' preferences. Users need to be members in order to receive personalized recommendations. The system is built in the object-oriented language Java.

3.0 Research methodology
3.1 Data collection
Most of the information on my research topic comes from the online databases of the University of South Australia; Compendex is the one used most. Articles, conference papers and journal papers can be downloaded from these databases. The information I have collected covers two aspects: personalization recommendation and association rule mining.

3.2 Data relevance
A lot of material has been collected from the online databases; however, only some of it is relevant or really helps my research topic. Some information has also been collected from websites. All of the material collected from the internet was published recently and therefore reflects the current status of my research area. I will try to improve the recommendation system using the knowledge collected about this area. Some mature models are introduced in the journals and papers, and these models can serve as references for the application to be developed. Performance can be tested and compared by running the same sample data through other applications and the application to be developed.

3.3 Limitations
Some preparation is needed before the project itself starts. A complete table of how people judge an item on different aspects needs to be created. A large number of people is needed to complete this table, and obtaining responses from so many people is a problem; this could be one of the limitations. Another limitation is overfitting, which is a common problem in data mining applications. After the system is complete, a large number of testers will be required.
Finding testers, and knowing where to find them, becomes a significant issue; one solution could be to look for volunteers. The performance data available from other sources come from real-world applications, all of which are mature projects, so it is hard to compare the new application with them.

4.0 Project planning
4.1 Expected outcomes
The planned outcome of this project is a website built on the J2EE framework, with a system that runs smoothly without major flaws. The recommendation component must work as part of it, because recommendation is the purpose of the website. The new concept introduced in this project should perform better than a standard recommendation system. Another important outcome is the research document, which covers the process of my research and also describes the history and performance of other major systems. The research focuses mainly on improving current personalization recommendation approaches.

4.2 Project planning
All activities need to be well defined in order to finish the whole thesis properly. The activities list shows all the tasks in the project that need to be done and their sequence.

Activities list

Index  Activity                                  Start date  End date  Milestone number
1      Research proposal                         09-03-16    09-6-13   1
2      Finding topic                             09-03-16    09-4-1
3      Discuss topic                             09-4-1      09-4-4
4      Topic define                              09-4-02     09-4-10
5      Literature searching                      09-4-10     09-4-15
6      Literature review                         09-4-15     09-5-16
7      Report writing                            09-5-18     09-6-10
8      Presentation preparation                  09-6-1      09-6-10
9      R code                                    09-7-1      09-8-25
10     R exercise                                09-7-1      09-7-16
11     R coding                                  09-7-17     09-8-15
12     Test the R program with test data set     09-8-17     09-8-25   2
13     Website creating                          09-9-24     10-1-14
14     Environment built                         09-9-24     09-3-30
15     Coding                                    09-10-1     09-12-31  3
16     R code transform                          09-12-31    10-1-14
17     Testing                                   10-3-1      10-3-17
18     Thesis report                             09-7-31     10-6-30   4

[Chart labels: Research proposal; R code; Website creating, testing and thesis report]

Reference:
Candillier, L, Meyer, F & Fessant, F 2008, 'Designing specific weighted similarity measures to improve collaborative filtering systems', Leipzig, Germany.
Changchien, SW & Lu, T-C 2001, 'Mining association rules procedure to support on-line recommendation by customers and products fragmentation', Expert Systems with Applications, vol. 20, no. 4, pp. 325-335.
Cho, YH & Kim, JK 2004, 'Application of Web usage mining and product taxonomy to collaborative recommendations in e-commerce', Expert Systems with Applications, vol. 26, no. 2, pp. 233-246.
Kazienko, P 2009, 'Mining indirect association rules for web recommendation', International Journal of Applied Mathematics and Computer Science, vol. 19, no. 1, pp. 165-186.
Kim, JK, Cho, YH, Kim, WJ, Kim, JR & Suh, JH 2002, 'A personalized recommendation procedure for Internet shopping support', Electronic Commerce Research and Applications, vol. 1, no. 3-4, pp. 301-313.
Linden, G, Smith, B & York, J 2003, 'Amazon.com recommendations: Item-to-item collaborative filtering', IEEE Internet Computing, vol. 7, no. 1, pp. 76-80.
Markellou, P, Mousourouli, I, Sirmakessis, S & Tsakalidis, A 2005, 'Personalized E-commerce recommendations', Beijing, China.
Meteren, RV & Someren, MV 2000, 'Using content-based filtering for recommendation', Machine Learning in the New Information Age: MLnet/ECML2000 Workshop, Spain.
Natarajan, R & Shekar, B 2005, 'Interestingness of association rules in data mining: Issues relevant to e-commerce', Sadhana - Academy Proceedings in Engineering Sciences, vol. 30, no. 2-3, pp. 291-309.
Resnick, P, Iacovou, N, Suchak, M, Bergstrom, P & Riedl, J 1994, 'GroupLens: An open architecture for collaborative filtering of netnews', Chapel Hill, NC, United States.
Wang, YJ, Xin, Q & Coenen, F 2007, 'A novel rule weighting approach in classification association rule mining', Omaha, NE, United States.
Wei, K, Huang, J & Fu, S 2007, 'A survey of E-commerce recommender systems', Chengdu, China.
Yang, T-C & Lai, H 2006, 'Comparison of product bundling strategies on different online shopping behaviors', Electronic Commerce Research and Applications, vol. 5, no. 4, pp. 295-304.
Zanker, M & Jessenitschnig, M 2009, 'Case-studies on exploiting explicit customer requirements in recommender systems', User Modeling and User-Adapted Interaction, vol. 19, no. 1-2 (special issue), pp. 133-166.
Zhang, Y & Jiao, J 2007, 'An associative classification-based recommendation system for personalization in B2C e-commerce applications', Expert Systems with Applications, vol. 33, no. 2, pp. 357-367.