UNIVERSITY OF SOUTH AUSTRALIA
Assignment Cover Sheet – Internal
An Assignment cover sheet needs to be included with each assignment. Please complete all details clearly.
If you are submitting the assignment on paper, please staple this sheet to the front of each assignment. If you are
submitting the assignment online, please ensure this cover sheet is included at the start of your document. (This is
preferable to a separate attachment.)
Please check your Course Information Booklet or contact your School Office for assignment submission locations.
Name: Sun Kang
Student ID: 100103553
Email: [email protected]
Course code and title: CIS research methods
School: Computer and Information Science
Program Code: INFT4017
Course Coordinator: Prof. Paul A. Swatman
Tutor: Prof. Paul A. Swatman
Day, Time, Location of Tutorial/Practical:
Due date: 14th June 2009
Assignment number: Assignment 2a
Assignment topic as stated in Course Information Booklet: research proposal
Further Information: (e.g. state if extension was granted and attach evidence of approval, Revised Submission
Date)
I declare that the work contained in this assignment is my own, except where acknowledgement of sources is made.
I authorise the University to test any work submitted by me, using text comparison software, for instances of plagiarism. I
understand this will involve the University or its contractor copying my work and storing it on a database to be used in future
to test work submitted by others.
I understand that I can obtain further information on this matter at http://www.unisa.edu.au/ltu/students/study/integrity.asp
Note: The attachment of this statement on any electronically submitted assignments will be deemed to have the same
authority as a signed statement.
Date: 14th June 2009
Signed: Sun kang
Date received from student
Recorded:
Assessment/grade
Assessed by:
Dispatched (if applicable):
The University of South Australia
INFT 4017
CIS Research method
Research proposal:
An application of online shopping promotion based
on Personalization recommendation
Supervisor: Jiuyong Li (John)
Student ID: 100103553
Student Name: Sun Kang
Table of contents
1. Introduction
1.1 Title
1.2 Background
1.3 Motivation
2. Literature review
2.1 Mature applications
2.2 Related studies
2.3 Data mining algorithms
3.0 Research methodology
3.1 Data collection
3.2 Data relevance
3.3 Limitations
4.0 Project planning
4.1 Expected outcomes
4.2 Project planning
References
Disclaimer
I declare the following part is my own work, unless otherwise referenced, as defined by
the University’s policy.
Sun Kang
1. Introduction
1.1 Title
An application of online shopping promotion based on Personalization recommendation
1.2 Background
Online shopping began in the 1990s and has grown rapidly ever since. It is one of the
outcomes of the internet, which has changed our lifestyle greatly. The term 'e-commerce'
refers to all online transactions, and the online shopping websites we normally deal with
belong to the B2C (business-to-consumer) category. An online shopping site promotes
certain kinds of products or services to its target users. However, the number of
products and services on offer has increased enormously in a short period, so
categorization is needed for customers to find the items they need. Search engines and
product classification both help customers to find products.
Information overload makes customers get lost in the website, so helping customers to
reach the items they really want has become a new demand for all online shopping
websites. Personalization recommendation is the approach proposed to solve this problem.
Personalization recommendation, also known as a recommender system, provides useful and
important information to the customers of online shopping websites (Zanker &
Jessenitschnig 2009). The concept is to provide every customer with a personal store that
holds useful and interesting items, all of which are generated by the software system of
the website. The background system applied to the online shopping website is the
application of personalization recommendation. Three types of personalized recommendation
are commonly used in real-world applications (Candillier, Meyer & Fessant 2008):
Collaborative filtering
Content-based filtering
Hybrid filtering
Kim et al. (2002) state that collaborative filtering is the most successful method for
personalized recommendation in online shopping applications. Collaborative filtering has
been applied by Amazon, the world's most successful B2C retailer (Linden, Smith & York
2003), whose collaborative filtering based recommendation system has been giving
suggestions to customers for many years. The basic concept of the collaborative filtering
algorithm is to find the customers whose preferences are most similar to those of the
active customer and to present their choices as suggestions to the active customer (Wei,
Huang & Fu 2007). Data from multiple agents, viewpoints and sources are gathered together
to filter the information that is interesting for the customer, so the approach usually
needs a very large data set.
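The basic concept described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the algorithm of any cited system: the customers, items and ratings are invented, and cosine similarity is just one common choice for measuring how alike two customers are.

```python
from math import sqrt

# Hypothetical ratings (1-5 scale): customer -> {item: rating}.
ratings = {
    "alice": {"camera": 5, "tripod": 4, "lens": 5},
    "bob": {"camera": 5, "tripod": 5, "lens": 4, "flash": 5},
    "carol": {"novel": 5, "camera": 1, "bookmark": 4},
}


def cosine(a, b):
    """Cosine similarity of two rating dicts; unrated items count as 0."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(active, ratings, min_rating=4):
    """Suggest items the most similar customer rated highly and the active customer has not rated."""
    others = [c for c in ratings if c != active]
    nearest = max(others, key=lambda c: cosine(ratings[active], ratings[c]))
    seen = ratings[active]
    candidates = [
        item for item, r in ratings[nearest].items()
        if item not in seen and r >= min_rating
    ]
    return sorted(candidates, key=lambda item: -ratings[nearest][item])


print(recommend("alice", ratings))  # ['flash']
```

A real system would use many neighbours rather than one and would need far more data, which is why the approach depends on a very large data set.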
The data mining process in this project will apply association rule mining with a
weighting schema to a data set from the real world. Association rule mining is one of the
earliest and most successful methods in the data mining domain. Simply speaking, a
weighting schema scores the attributes in the data set differently, with the more
important attributes assigned higher scores. Computing the score of each attribute of an
item and summing them gives the item's total score, which is then applied in the
association rule mining (Wang, Xin & Coenen 2007). Association rule mining works by
searching for relationships between different items in a large data set. It is well known
as market basket analysis: customers who bought this also bought that.
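As a rough illustration of the two ideas above, the sketch below computes the standard support and confidence measures for a "bought this, also bought that" rule, and a weighted total score for one item. The baskets, attribute names and weights are invented for the example, and exactly how the total score would enter the mining step is left open here because the proposal does not fix it.

```python
# Hypothetical market baskets: each set is one customer transaction.
baskets = [
    {"phone", "case"},
    {"phone", "case", "charger"},
    {"phone", "charger"},
    {"case"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def weighted_score(scores, weights):
    """Total score of an item: each attribute score times its (assumed) importance weight."""
    return sum(weights[attr] * score for attr, score in scores.items())


# Rule "phone -> case": 2 of the 3 phone baskets also contain a case.
print(support({"phone", "case"}, baskets))       # 0.5
print(confidence({"phone"}, {"case"}, baskets))

# Hypothetical attribute scores (1-5) and importance weights.
scores = {"performance": 4, "style": 4, "brand": 5, "cost": 1}
weights = {"performance": 3, "style": 1, "brand": 2, "cost": 1}
print(weighted_score(scores, weights))           # 27
```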
1.3 Motivation
A personalized recommendation system is needed because of information overload, which has
become a common problem for all e-commerce companies. Recommender systems have been
applied since the 1990s, but many of them failed because of poor customer satisfaction
(Wei, Huang & Fu 2007). The main purpose of this research is to improve customer
satisfaction by applying new knowledge to the system.
Understanding customers better leads to understanding the market better, and knowing what
customers really want is important for all internet marketers (Yang & Lai 2006).
E-commerce websites such as Amazon have collected a large amount of data about their
customers. Finding out which data are more important than others is another purpose of
this research.
I have also chosen this topic because of personal experience. Some of the items promoted
by e-commerce websites are useless to the customer, and such promotions have been ignored
by some customers for a long time. The recommendations generated by the system should
also gain the interest of the customer.
Applying marketing knowledge to data mining methods is another reason for developing this
application. Assigning a grade from 1 to 5 to different aspects of one particular item is
a common technique in the marketing area. The table below shows an example of how
important each aspect is to different customers.
             Performance  Style  Brand  Cost
Customer 1   4            4      5      1
Customer 2   3            5      2      4
Many websites provide ratings and comments for the items in their online store, and this
information is a guide for future users. However, the comments on the same item can
differ, because different people have different perspectives. In the proposed
application, the current user needs to finish a small survey before viewing the online
shopping website. The survey helps the system to find a perfectly matching or similar
customer in the database, and the ratings and comments from that customer are then used
to guide the current customer.
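The matching step could look something like the sketch below. It reuses the example grades given above for Customer 1 and Customer 2; the survey answers, the function name and the choice of Euclidean distance are assumptions made for illustration only.

```python
from math import dist

# Order of the aspects in every profile tuple.
ASPECTS = ("performance", "style", "brand", "cost")

# Importance grades (1-5) of stored customers, taken from the example table.
profiles = {
    "Customer 1": (4, 4, 5, 1),
    "Customer 2": (3, 5, 2, 4),
}


def closest_customer(survey, profiles):
    """Return the stored customer whose grades are nearest (Euclidean distance) to the survey."""
    return min(profiles, key=lambda name: dist(survey, profiles[name]))


# Hypothetical survey answers of the current user, in ASPECTS order.
current_user = (4, 3, 5, 2)
print(closest_customer(current_user, profiles))  # Customer 1
```

The ratings and comments of the returned customer would then be the ones shown to the current user.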
The purpose of this application is to gain customer satisfaction and revenue for the
online shopping site at the same time.
2. Literature review
2.1 Mature applications
A group of researchers mentions that interest in buying products from the internet is
declining. Information about products, customers and transactions has led to an
information overload problem for all e-commerce companies, and recommender systems have
been widely used to solve the problem (Wei, Huang & Fu 2007).
Amazon is the world's largest online retailer (Amazon.com 2009). A group of researchers
from Amazon describe the company's recommendation algorithm, item-to-item collaborative
filtering, which Amazon uses now. Customers rate the items they have bought from Amazon.
Based on these ratings, the system generates a table of similar items. When a customer
clicks on an item, the most similar items from the table are displayed to the customer.
The item-to-item algorithm computes the similarities between different items from the
customers' ratings (Linden, Smith & York 2003).
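The similar-items table described above can be sketched as follows. This is a simplified illustration, not Amazon's production algorithm: the customers, items and ratings are invented, and cosine similarity between item rating vectors is assumed as the similarity measure.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical ratings: customer -> {item: rating}.
ratings = {
    "u1": {"book_a": 5, "book_b": 4},
    "u2": {"book_a": 4, "book_b": 5, "book_c": 1},
    "u3": {"book_c": 5, "book_d": 4},
    "u4": {"book_a": 5, "book_d": 1},
}


def item_vectors(ratings):
    """Invert the data so each item maps to {customer: rating}."""
    vectors = defaultdict(dict)
    for customer, items in ratings.items():
        for item, rating in items.items():
            vectors[item][customer] = rating
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_items_table(ratings):
    """For every item, list all other items from most to least similar."""
    vectors = item_vectors(ratings)
    table = {}
    for item in vectors:
        others = [other for other in vectors if other != item]
        table[item] = sorted(
            others, key=lambda other: cosine(vectors[item], vectors[other]), reverse=True
        )
    return table


table = similar_items_table(ratings)
print(table["book_a"])  # ['book_b', 'book_d', 'book_c']
```

Because the table depends only on item-to-item similarities, it can be built in advance and simply looked up when a customer clicks on an item.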
GroupLens, one of the earliest and most successful recommender systems, gathers and
distributes ratings and uses the ratings of some users to predict the ratings of the rest
(Resnick et al. 1994). Users who read netnews rate the articles they read. Based on the
ratings from other users, the system can predict what the current user wants. The
advanced features of the system include openness, ease of use, compatibility, scalability
and privacy.
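A prediction of this kind can be sketched with the correlation-weighted scheme commonly associated with GroupLens: the active user's average rating is adjusted by how far like-minded (or opposite-minded) users deviate from their own averages. The users, articles and ratings below are invented, and details such as which articles each average is taken over are simplifying assumptions.

```python
from math import sqrt

# Hypothetical ratings of news articles (1-5 scale).
ratings = {
    "ann": {"n1": 5, "n2": 1, "n3": 4},
    "ben": {"n1": 4, "n2": 2, "n3": 5, "n4": 5},
    "cat": {"n1": 1, "n2": 5, "n3": 2, "n4": 1},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation over the articles both users rated."""
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        return 0.0
    mean_a = mean(a[i] for i in shared)
    mean_b = mean(b[i] for i in shared)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in shared)
    var_a = sum((a[i] - mean_a) ** 2 for i in shared)
    var_b = sum((b[i] - mean_b) ** 2 for i in shared)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def predict(active, item, ratings):
    """Active user's mean rating plus the correlation-weighted deviations of the other raters."""
    base = mean(ratings[active].values())
    numerator = denominator = 0.0
    for other, theirs in ratings.items():
        if other == active or item not in theirs:
            continue
        weight = pearson(ratings[active], theirs)
        numerator += weight * (theirs[item] - mean(theirs.values()))
        denominator += abs(weight)
    return base if denominator == 0 else base + numerator / denominator


# "ann" agrees with "ben" and disagrees with "cat", so the prediction for n4 is high.
print(round(predict("ann", "n4", ratings), 2))
```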
2.2 Related studies
Yang and Lai (2006) note that the data collected is often not enough, and they point out
that more data needs to be collected for the sake of accuracy. The more accurate the data
held by the system, the more useful the recommender system is. For instance, a customer
moving items into or out of the shopping cart needs to be recorded in the database.
Learning from customer shopping behavior becomes another important issue in order to
increase accuracy.
Natarajan and Shekar (2005) mention that data which used to be inaccessible can now be
recorded to help firms analyse customer patterns better. Interestingness is a newer
approach to measuring the customer's expectation. New technologies allow firms to record
new types of data in their databases. For example, mouse movement shows which part of the
webpage the customer focuses on most.
2.3 Data mining algorithms
Kazienko (2009), a researcher from Poland, proposed a new algorithm that applies indirect
association rules to web recommendation. The algorithm searches for items that are not
directly associated with each other but are both linked to a common group of items. With
this algorithm, items that are not directly related but are connected through other items
can be found and displayed to the customer or user. This approach helps users to find the
same thing presented in different forms.
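The idea of an indirect association can be illustrated with the simplified sketch below. It is not Kazienko's exact formulation: the sessions are invented, and scoring an indirect link by multiplying the two direct confidences along a connecting item (keeping the best connecting item) is an assumption made for this example.

```python
# Hypothetical browsing or purchase sessions.
sessions = [
    {"laptop", "mouse"},
    {"laptop", "mouse"},
    {"mouse", "mousepad"},
    {"mouse", "mousepad"},
    {"laptop", "bag"},
]


def confidence(x, y, sessions):
    """Direct confidence of x -> y: share of sessions with x that also contain y."""
    with_x = [s for s in sessions if x in s]
    if not with_x:
        return 0.0
    return sum(y in s for s in with_x) / len(with_x)


def indirect_candidates(x, sessions, min_score=0.3):
    """Items never seen together with x, scored through the best connecting item."""
    items = set().union(*sessions)
    scores = {}
    for target in items - {x}:
        if confidence(x, target, sessions) > 0:
            continue  # directly associated, so not an indirect candidate
        best = max(
            (
                confidence(x, via, sessions) * confidence(via, target, sessions)
                for via in items - {x, target}
            ),
            default=0.0,
        )
        if best >= min_score:
            scores[target] = best
    return scores


# "laptop" and "mousepad" never co-occur, but both are linked to "mouse".
print(indirect_candidates("laptop", sessions))
```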
Two South Korean researchers proposed a new algorithm that combines a product taxonomy
with collaborative filtering (Cho & Kim 2004). The purpose of this algorithm is to solve
the problems of sparsity and scalability which exist in current collaborative filtering
recommender systems. All of the products in the database are represented as a
hierarchical tree which puts similar items into the same group. Based on the tree,
further processing produces a list of recommendations for the customer.
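One way to see why a taxonomy helps with sparsity is to roll purchases up from products to their parent categories, as in the sketch below. The products, categories and customers are invented, and a single-level taxonomy is assumed for brevity; the cited work uses a full tree and further processing steps.

```python
from collections import Counter

# Hypothetical one-level taxonomy: product -> parent category.
taxonomy = {
    "running_shoes": "footwear",
    "sandals": "footwear",
    "t_shirt": "apparel",
    "jeans": "apparel",
}


def category_profile(purchases, taxonomy):
    """Count a customer's purchases per category instead of per product."""
    return Counter(taxonomy[product] for product in purchases)


alice = ["running_shoes", "t_shirt"]
bob = ["sandals", "jeans"]

# At product level the two customers share nothing, so they look unrelated.
print(set(alice) & set(bob))  # set()

# At category level their profiles coincide, so a similarity can be computed.
print(category_profile(alice, taxonomy) == category_profile(bob, taxonomy))  # True
```

Working with a few categories instead of many products also shrinks the vectors that have to be compared, which is the scalability side of the argument.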
As mentioned before, association rule mining is one of the most successful data mining
methods for building personal recommendation systems. Changchien and Lu (2001) show in
their report that integrating a neural network method can improve the performance.
Clustering is used to divide the customers and products into different groups, and
association rules are used to extract the rules between different clusters. Those rules
help the e-commerce site to improve its one-to-one online business.
A group of researchers proposed a new recommendation approach that integrates semantic
annotations (Markellou et al. 2005). From the results reported in the paper, the system
seems to perform well on the given data set. The system applies naïve Bayes classifiers
for categorization, and association rule mining is used to generate interesting patterns
from the customers' online shopping history. This paper focuses on improving
recommendations for new users.
Researchers from Singapore proposed an associative classification based recommendation
system for B2C websites, using a mobile phone data set (Zhang & Jiao 2007). The concept
of this paper is to reduce the variation of words and enlarge the search scope of the
customer requirements. Semantic meaning is used to derive the requirements from all of
the customers. After the requirements have been generated, the system generalizes rules
based on them.
PRES is a personalized recommendation system based on content-based filtering. This
approach was proposed by two researchers (Meteren & Someren 2000). PRES relies mainly on
the user profile and on user feedback. Information about the users and about the web
pages is stored separately in the database, and the recommender system finds useful
information based on the users' preferences. Users need to be members in order to get
personalized recommendations. The system is built in Java, an object-oriented language.
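Content-based filtering in general can be sketched as follows. This is not the PRES implementation: the pages are invented, and plain word counts with cosine similarity are assumed as the simplest possible way of comparing a user profile with page content.

```python
from collections import Counter
from math import sqrt

# Hypothetical pages: page id -> text content.
pages = {
    "p1": "digital camera lens review",
    "p2": "camera tripod and lens kit",
    "p3": "garden hose and sprinkler",
}


def vectorize(text):
    """Bag-of-words vector: word -> number of occurrences."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity of two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profile(liked, pages):
    """User profile: summed word counts of the pages the user liked."""
    profile = Counter()
    for page_id in liked:
        profile.update(vectorize(pages[page_id]))
    return profile


def recommend(liked, pages):
    """Rank the pages the user has not liked yet by similarity to the profile."""
    profile = build_profile(liked, pages)
    candidates = [p for p in pages if p not in liked]
    return sorted(
        candidates, key=lambda p: cosine(profile, vectorize(pages[p])), reverse=True
    )


print(recommend(["p1"], pages))  # ['p2', 'p3']
```

Unlike collaborative filtering, nothing here depends on what other users did; only the content of the pages and the user's own profile are used.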
3.0 Research methodology
3.1 Data collection
Most of the information about my research topic comes from the University of South
Australia online databases, of which Compendex is the one used most of the time.
Articles, conference papers and journal papers can be downloaded from the database. The
information I have collected covers two aspects: personalization recommendation and
association rule mining.
3.2 Data relevance
A lot of data has been collected from the online databases; however, only some of it is
related to or really helps my research topic. Some information has also been collected
from websites. All of the data collected from the internet was recently published and
reflects the current status of my research area. I will try to improve the recommendation
system based on the knowledge collected about this area. Some mature models have been
introduced in the journals and papers, and those models can be a reference for the
application which is going to be developed. Performance can be tested by running the same
sample data through other applications and through the application to be developed, and
comparing the results.
3.3 Limitations
Some preparation needs to be done before the project itself starts. A complete table of
how people judge an item from different aspects needs to be created. A large number of
people is needed to complete this table, and how to reach that many people is a problem.
This could be one of the limitations.
Another limitation is overfitting, which is a common problem for data mining
applications.
After the system is complete, a large number of testers is required. How and where to get
testers is a big issue; one solution could be looking for volunteers.
The performance data available from other sources is based on real-world applications,
all of which are considered mature projects, so it is hard to compare this application
with them.
4.0 Project planning
4.1 Expected outcomes
The planning outcome for this project is a website which based on J2EE framework and
system which runs smoothly without any big flaw. The recommendation component
should be working at the same time because which is the purpose of the website. The new
concept which involved in the project should get better performance than standard
recommendation system.
Another important outcome is the research document which including the process of my
research and it also need to provide the history and performance of other major systems.
This research is mainly focus on the improvement of current personalization
recommendation approach.
4.2 Project planning
All of the activities need to be well defined in order to finish the whole thesis
properly. The activities list shows all the tasks in the project that need to be done and
their sequence.
Activities list
Index  Activities                              Start date  End date  Milestone number
1      Research proposal                       09-03-16    09-6-13   1
2      Finding topic                           09-03-16    09-4-1
3      Discuss topic                           09-4-1      09-4-4
4      Topic define                            09-4-02     09-4-10
5      Literature searching                    09-4-10     09-4-15
6      Literature review                       09-4-15     09-5-16
7      Report writing                          09-5-18     09-6-10
8      Presentation preparation                09-6-1      09-6-10
9      R code                                  09-7-1      09-8-25
10     R exercise                              09-7-1      09-7-16
11     R coding                                09-7-17     09-8-15
12     Test the R program with test data set   09-8-17     09-8-25   2
13     Website creating                        09-9-24     10-1-14
14     Environment built                       09-9-24     09-3-30
15     Coding                                  09-10-1     09-12-31  3
16     R code transform                        09-12-31    10-1-14
17     Testing                                 10-3-1      10-3-17
18     Thesis report                           09-7-31     10-6-30
Milestones: Research proposal; R code; Website creating, testing and thesis report
4
References
Candillier, L, Meyer, F & Fessant, F 2008, 'Designing specific weighted similarity
measures to improve collaborative filtering systems', Leipzig, Germany.
Changchien, SW & Lu, T-C 2001, 'Mining association rules procedure to support on-line
recommendation by customers and products fragmentation', Expert Systems with
Applications, vol. 20, no. 4, pp. 325-335.
Cho, YH & Kim, JK 2004, 'Application of Web usage mining and product taxonomy to
collaborative recommendations in e-commerce', Expert Systems with Applications, vol.
26, no. 2, pp. 233-246.
Kazienko, P 2009, 'Mining indirect association rules for web recommendation',
International Journal of Applied Mathematics and Computer Science, vol. 19, no. 1, pp.
165-186.
Kim, JK, Cho, YH, Kim, WJ, Kim, JR & Suh, JH 2002, 'A personalized recommendation
procedure for Internet shopping support', Electronic Commerce Research and
Applications, vol. 1, no. 3-4, pp. 301-313.
Linden, G, Smith, B & York, J 2003, 'Amazon.com recommendations: Item-to-item
collaborative filtering', IEEE Internet Computing, vol. 7, no. 1, pp. 76-80.
Markellou, P, Mousourouli, I, Sirmakessis, S & Tsakalidis, A 2005, 'Personalized
E-commerce recommendations', Beijing, China.
Natarajan, R & Shekar, B 2005, 'Interestingness of association rules in data mining:
Issues relevant to e-commerce', Sadhana - Academy Proceedings in Engineering Sciences,
vol. 30, no. 2-3, pp. 291-309.
Resnick, P, Iacovou, N, Suchak, M, Bergstrom, P & Riedl, J 1994, 'GroupLens: An open
architecture for collaborative filtering of netnews', Chapel Hill, NC, United States.
Meteren, RV & Someren, MV 2000, 'Using content-based filtering for recommendation',
Machine Learning in the New Information Age: MLnet/ECML2000 Workshop, Spain.
Wang, YJ, Xin, Q & Coenen, F 2007, 'A novel rule weighting approach in classification
association rule mining', Omaha, NE, United States.
Wei, K, Huang, J & Fu, S 2007, 'A survey of E-commerce recommender systems',
Chengdu, China.
Yang, T-C & Lai, H 2006, 'Comparison of product bundling strategies on different online
shopping behaviors', Electronic Commerce Research and Applications, vol. 5, no. 4, pp.
295-304.
Zanker, M & Jessenitschnig, M 2009, 'Case-studies on exploiting explicit customer
requirements in recommender systems', User Modeling and User-Adapted Interaction,
vol. 19, no. 1-2 SPEC. ISS., pp. 133-166.
Zhang, Y & Jiao, J 2007, 'An associative classification-based recommendation system for
personalization in B2C e-commerce applications', Expert Systems with Applications, vol. 33, no. 2, pp.
357-367.