University of Technology, Sydney
Faculty of Engineering and Information Technology
Cross-domain Collaborative Recommendation with Exploiting Rich
Side Information Sources of Users and Items
Peng Hao
Supervisor: Guangquan Zhang
Co-supervisor: Jie Lu
Decision Systems and e-Service Intelligence Lab, QCIS
School of Software
October 2014
Abstract
Keen market competition and high customer churn in the telecom industry require telecom
companies to provide personalized products/services to customers. This poses great challenges
for current telecom companies because they lack product/service personalization and
intelligence capabilities. Recommender systems can help telecom companies to implement
product/service personalization and intelligence. However, the products/services in the
telecom industry are complex and present hierarchical tree structures. In addition, uncertain
and incomplete information exists in products/services data, and fuzzy set theory and
techniques are used to handle it. In theory, existing recommender systems cannot measure the
similarity between hierarchical tree structured objects for generating recommendations.
Therefore, a similarity measure on tree structured data is needed from the point of view of both
theory and applications. In this research, 1) after a formalization of the tree structure
modeling methodology is proposed, a comprehensive tree similarity measure method and algorithm
will be developed. 2) A fuzzy similarity measure on tree structured data will be developed.
3) Based on the developed tree similarity measure, a recommendation approach for hierarchical
tree structured items will be developed. 4) A recommender system for business users in the
telecom industry will then be designed. A case-based recommendation approach, which fully
utilizes experience and integrates domain knowledge such as business rules, will be
developed. 5) Finally, a working recommender system prototype will be implemented and
evaluated.
TABLE OF CONTENTS
1. Introduction
2. Research Questions, Objectives and Expected Outcomes
3. Literature Review
3.1 Traditional single domain recommendation
3.1.1 Content-Based Recommendation Techniques
3.1.2 Collaborative Filtering Recommendation Techniques
3.1.3 Knowledge Based Recommendation Techniques
3.1.4 Hybrid Recommendation Techniques
3.2 Cross-domain recommendation
3.2.1 Side information sources of users and items
3.2.2 Transfer learning techniques
3.2.3 Cross-domain collaborative recommendation techniques
4. Significance
5. Research methodology
6. Research timeline
7. Research progress up to date
8. References
1. Introduction
Since the spread of Web 2.0, a huge and increasing amount of complex and heterogeneous
data has been generated online every day, placing a serious burden on people’s ability to
process it. To overcome this information overload problem, recommender systems have
been developed to assist people’s selection and decision making. The recommender system is the
most popular technique for implementing personalization (Burke 2000). It can be defined as
a program that attempts to recommend items to users by predicting a user’s interest in an item
based on various sorts of information. The aim of recommender systems is to provide the right
information about products/services to the right customers, relevant to their needs/interests,
at the right time. This can be achieved by automatically filtering out unrelated products and
suggesting only the relevant ones (Goy, Ardissono & Petrone 2007; Markellou et al. 2005).
There are mainly three types of recommendation techniques: collaborative-filtering,
content-based and knowledge-based (Burke 2002). The collaborative-filtering (CF) recommendation
technique is the most successful and widely used technique for recommender systems (Huang,
Zeng & Chen 2007; Schafer et al. 2007). It helps people make their choices based on the
opinions of other people who share similar interests, and tries to provide the right information
to the right user (Deshpande & Karypis 2004). Content-based (CB) recommendation techniques
recommend items that are similar to the ones a specific user preferred before (Pazzani &
Billsus 2007). Knowledge-based (KB) recommender systems offer items to users based on
knowledge about the users and items (Felfernig et al. 2008). In contrast to collaborative-filtering
and content-based approaches, knowledge-based approaches are in the majority of cases applied
for recommending complex products and services such as consumer goods, technical equipment,
or financial services (Felfernig et al. 2008), which makes them suitable for telecom
products/services. Each recommendation technique has its own merits and drawbacks. A hybrid
recommendation technique can be proposed to gain higher performance and to avoid the
drawbacks of the individual recommendation techniques (Burke 2007a). The most common practice
in existing hybrid recommendation techniques is to combine CF with other recommendation
techniques in an attempt to avoid cold-start, sparseness and/or scalability problems
(Adomavicius & Tuzhilin 2005a; Kim et al. 2006).
Though great progress has been made in single domain recommendation, it is restricted to
offering recommendations only for items belonging to a single domain. There is a strong demand
for joint recommendation in daily life. For example, when a user browses a movie on Netflix,
besides related movies, other types of items provided by different websites that are somehow
related to that movie, such as music, books and videogames, would also be welcome suggestions.
Some recommender systems, such as the e-commerce site Amazon, already offer joint
recommendation of items in different domains. It would be useful to exploit the user’s
evaluations of diverse types of items in order to generate a more general model of the user’s
preferences. In practice, however, a recommender system built for one domain exploits only
users’ preferences in that target domain, and may therefore suffer from the cold-start or data
sparsity problem. Yet there can be dependencies and correlations between preferences in
different domains, so instead of treating each type of item independently, user knowledge
acquired in one domain could be transferred and exploited in several other domains. The data
sparsity problem associated with extremely large-scale recommender systems provides strong
motivation for finding new ways to transfer knowledge from auxiliary data sources.
Recently, with the rapid development of transfer learning techniques, cross-domain
recommendation has received much attention from both researchers and practitioners
(Fernández-Tobías et al. 2012; Li 2011).
From the perspective of transfer learning, existing cross-domain recommendation algorithms
can be classified by their knowledge transfer pattern into three categories:
adaptive knowledge transfer, collective knowledge transfer and integrative knowledge transfer.
Cross-domain recommendation techniques based on adaptive knowledge transfer usually
proceed in two separate steps. First, common knowledge is mined from auxiliary data. Then
the extracted knowledge is adapted to the target data. In contrast, collective knowledge
transfer tries to complete common knowledge extraction and target domain rating prediction
simultaneously. Instead of extracting common knowledge or finding latent common features,
integrative knowledge transfer incorporates auxiliary data directly into the target learning
task. Because integrative knowledge transfer can utilize more interaction between auxiliary
data and target data, it is believed to enable more effective knowledge transfer; however, the
time complexity may also increase. Among all these methods, cross-domain collaborative
filtering is the most widely studied approach for cross-domain recommender systems. It can be
considered as collaborative filtering in a single domain extended to incorporate various types
of additional information from auxiliary domains. Though some representative works have been
conducted in this direction, new and effective algorithms still need to be developed,
especially as abundant additional information sources are emerging.
In addition to the large effort devoted to collaborative filtering with matrix
factorization, another category of approaches, the graph-based approaches, has been well
studied and extensively developed in the field of social networks (Liben‐Nowell & Kleinberg
2007; Tong, Faloutsos & Pan 2006). Researchers in the area of recommender systems have also
exploited those methods in various ways in order to improve collaborative filtering based on
user-item ratings (Gori & Pucci 2007; Jamali & Ester 2009a, 2009b; Yildirim & Krishnamoorthy
2008). The importance of graph-based approaches has grown rapidly with the increasing
availability of additional information that can be incorporated for recommendation. However,
to the best of my knowledge, no existing work studies cross-domain recommendation with
graph-based methods. The bottleneck is how to connect the different graphs, each of which is
built in its own domain. In my research, I will try to develop a graph-based cross-domain
recommendation framework and propose related algorithms.
The rest of this report is organized as follows. Section 2 summarizes the research questions and
lists the objectives and expected outcomes of this research. Section 3 presents a
comprehensive review of the related works. The surveyed areas include traditional single domain
recommendation approaches, different types of side information sources that can be incorporated
into recommendation, typical transfer learning methods and existing techniques developed for
cross-domain recommendation. The significance and innovation of this research are described in
Section 4. Section 5 presents the methodology for achieving the research objectives.
Section 6 outlines the entire timeline of this research with the planned tasks and expected
outcomes for each stage. Finally, Section 7 reports the research progress to date.
2. Research Questions, Objectives and Expected Outcomes
This research aims to develop a new and effective cross-domain collaborative recommender
system to support metadata owners or individual companies in optimizing their
recommendations and improving the quality of their products/services. As multiple information
sources can be exploited to enrich the quantity and quality of knowledge used in the single
domain recommendation scenario, this study focuses on developing efficient and selective
knowledge transfer methods for improving cross-domain collaborative recommendation. To
summarize, the following research questions will be answered by this research:
Q1. How can the domain relatedness among multiple domains be effectively found and established?
Q2. How can the common knowledge among different domains be selectively transferred according
to the corresponding domain relatedness?
Q3. How can a parallel and distributed cross-domain recommender system be built?
This research aims to achieve the following objectives, which are expected to answer the above
research questions:
Objective 1. To discover an explicit link among multiple domains by utilizing user contributed
data or user-item interaction information.
   An explicit link among multiple domains needs to be defined with the help of user
contributed data or user-item interaction information in order to characterize user/item
profiles in different domains. Some existing methods exploit user/item overlap or common
social tags as the explicit link, and the improvements have been shown to be significant (Shi,
Larson & Hanjalic 2011). Based on the explicit domain link, a bridge that brings different
domains together for knowledge transfer can be built.
Objective 2. To discover an implicit link among multiple domains by mining latent common
patterns shared between users or items.
   An implicit link among multiple domains will be mined from either the user side or the item
side. On the user side, the implicit link can be user preferences shared within groups or
friend networks. On the item side, the implicit link can be extracted from the item-item
relevance network of a taxonomy. Based on those implicit links, more hidden knowledge can be
transferred among multiple domains.
Objective 3. To develop a graph-based cross-domain collaborative recommendation
framework and related methods to enhance cross-domain recommendation quality.
   A graph-based cross-domain collaborative recommendation framework will be defined. In
this graph, the concepts of nodes, edges and edge weights need to be defined. The biggest
challenge of this method lies in connecting different graphs together, as each graph is built
in its own domain. A similarity measure between different graph nodes will then be developed.
Objective 4. To develop a new cross-domain collaborative filtering framework and related
algorithms by extending the matrix factorization technique to incorporate side information
of users and items.
   Based on the matrix factorization technique, a cross-domain collaborative filtering
framework will be developed. Various types of user/item contributed information/data will be
integrated into the factorized user/item latent feature matrices to assist the knowledge
transfer.
Objective 5. To develop a parallel and distributed cross-domain recommender system
prototype based on the above proposed algorithms.
   A novel cross-domain recommender system prototype will be developed for practical use. At
the core of this system, the cross-domain recommendation algorithms proposed above will be
applied.
Upon the successful completion of this research, the following outcomes can be expected:
(1) A graph-based cross-domain recommendation framework and relevant algorithms;
(2) A new cross-domain collaborative filtering framework and relevant algorithms;
(3) An effective cross-domain recommender system prototype for application;
(4) Several high quality research papers and a PhD thesis.
3. Literature Review
Cross-domain recommendation can be seen as a process that exploits knowledge common to
multiple domains and uses transfer learning techniques to transfer that knowledge for
recommendation making in a single target domain. This part therefore first introduces a brief
history of traditional single domain recommendation techniques and then focuses on the
cross-domain recommendation problem. In particular, I will show what kinds of knowledge
can be explored to enrich the information sources for recommender systems besides the explicit
user-item ratings. Next, state-of-the-art transfer learning techniques are presented. Finally,
I will describe some existing cross-domain recommendation techniques in detail.
3.1 Traditional single domain recommendation
Recommender systems (RS) attempt to recommend items to users by predicting a user’s interest
in an item based on various sorts of information, including information about similar items,
users with the same preferences and interactions between users and items. Since the spread of
Web 2.0, recommender systems have found many practical applications, as they appeal to
more and more companies, such as Amazon, YouTube and iTunes, which use them to offer
appropriate services and goods to their customers while at the same time improving their sales
performance. In academia, recommender systems started to attract researchers’ attention in the
early nineties. Research in recommender systems grew out of information retrieval and filtering
research (Goldberg et al. 1992; Resnick & Varian 1997). The aim of using recommendation
techniques is to overcome information overload by retrieving the most relevant information
and services from a huge amount of data.
Many techniques have been proposed for single domain recommendation. These
techniques are classified differently according to different criteria. Much research has
investigated the types of recommendation approaches and discussed their various limitations
(Adomavicius & Tuzhilin 2005b, 2011; Burke 2000, 2002; Burke 2007b;
Koren, Bell & Volinsky 2009; Schafer et al. 2007; Schafer, Konstan & Riedl 1999). The
following subsections describe the major single domain RS approaches, namely the
content-based (CB), the collaborative filtering (CF), the knowledge-based (KB) and the hybrid
recommendation approaches, explaining the main idea, advantages and limitations of each
approach.
3.1.1 Content-Based Recommendation Techniques
A content-based recommender system (CBRS) recommends items that are similar to the ones
a specific user preferred before (Pazzani & Billsus 2007). The basic idea of CB
recommendation is:
(1) It first analyses the descriptions of the items preferred by a particular user in order to
find the common attributes (preferences) that can be used to distinguish these items.
The attained preferences are stored in a user profile;
(2) It then compares each item’s attributes with the user profile, so that only the items
that have a high degree of similarity with the user profile are recommended
(Pazzani & Billsus 2007).
Two techniques are widely used in CBRS. One recommends heuristically using traditional
information retrieval methods, such as the cosine similarity measure, while the other
generates recommendations using statistical learning and machine learning methods. The latter
mainly builds models that learn a user’s interests from the user’s historical data, which
behaves like classification. Classification algorithms, such as decision trees, naive Bayes
and k-nearest neighbours, create a function that can estimate the
probability of a user’s interest in an unseen item. The attained probability can be
used to provide users with a sorted list of recommendations (Pazzani & Billsus 2007). Some
examples of CBRS are WebWatcher (Armstrong et al. 1995) and Websail (Chen et al. 2000).
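As an illustration only, the heuristic variant of content-based recommendation can be sketched in Python as follows. The toy catalogue, the attribute weights and the function names are hypothetical, and summing attribute vectors is just one simple way to build a user profile; the sketch is not taken from any of the cited systems.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def build_profile(liked_items, item_features):
    """Step (1): sum the attribute vectors of the items the user preferred."""
    profile = {}
    for item in liked_items:
        for attr, weight in item_features[item].items():
            profile[attr] = profile.get(attr, 0.0) + weight
    return profile

def recommend(liked_items, item_features, top_n=2):
    """Step (2): rank unseen items by similarity to the user profile."""
    profile = build_profile(liked_items, item_features)
    candidates = [i for i in item_features if i not in liked_items]
    scored = [(cosine(profile, item_features[i]), i) for i in candidates]
    # Highest similarity first; ties are broken by item name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored[:top_n]]

# Hypothetical toy catalogue: item -> {attribute: weight}
ITEM_FEATURES = {
    "movie_a": {"sci-fi": 1.0, "action": 1.0},
    "movie_b": {"sci-fi": 1.0, "drama": 1.0},
    "movie_c": {"romance": 1.0, "drama": 1.0},
    "movie_d": {"sci-fi": 1.0, "action": 1.0, "thriller": 1.0},
}

print(recommend(["movie_a"], ITEM_FEATURES))
```

For a user who liked only movie_a, the items sharing the most attributes with the profile are ranked first, which also illustrates the overspecialization tendency of the approach.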
The advantage of CBRS is that it uses the semantic content of items and recommends to a
specific user items that are similar to the preferred items in his/her profile. As a result,
CBRS is able to recommend new items and unpopular items. Furthermore, it can explain its
recommendations by listing the content features on which a recommendation is based. It
does not need information about the preferences of other users to make recommendations,
so it does not suffer from the sparseness problem associated with collaborative filtering.
One of the main limitations of CBRS is the new user problem. It is not able to offer accurate
recommendations to a new user since he/she has few rated items. CBRS also has the
overspecialization problem. It can only recommend items to a user according to the preferred
items in his/her user profile; thus, it cannot recommend items outside the user’s profile.
Additionally, in some cases it may not be desirable for a recommender system to
recommend items that are too similar, such as different news articles that describe the same
event. Another limitation of CBRS is the item content dependency problem. As CBRS makes
recommendations according to the contents of items, it is hard to use a content-based method to
recommend items that cannot be represented as keywords, such as images and movies. CBRS also
cannot distinguish between items that are represented by the same set of content features.
3.1.2 Collaborative Filtering Recommendation Techniques
Collaborative-filtering (CF) recommendation techniques help people make their choices based
on the opinions of other people who share similar interests (Shardanand & Maes 1995). Resnick
& Varian stated that the CF approach is built on the significant assumption that “a good way to
find interesting content is to find other people who have similar interests, and then recommend
titles that those similar users like” (Resnick & Varian 1997). The CF
recommendation approach has been shown to be the most successful and widely used approach for
RS (Herlocker et al. 1999; Huang, Zeng & Chen 2007; Schafer et al. 2007). CF based recommender
systems have been developed and used in many fields, including recommending news (Resnick et
al. 1994), articles, movies, music, products, books (Linden, Smith & York 2003), web pages and
many more. Existing CF algorithms can be mainly divided into three types: the user-based CF,
the item-based CF, and the model-based CF (Schafer et al. 2007).
The user-based algorithms are also known as the nearest neighbour algorithms (Sarwar et al.
2001). These algorithms recommend a new item to a particular user using the ratings that
similar users have given to the same item. As all the items and users’ ratings are stored in
memory, these algorithms are also referred to as memory-based CF.
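A minimal sketch of user-based prediction is given here for illustration. The toy ratings are hypothetical, and cosine similarity over co-rated items with a similarity-weighted average of the k nearest neighbours is only one of several common choices; it is not claimed to be the exact formulation of the cited works.

```python
import math

# Hypothetical toy ratings: user -> {item: rating on a 1-5 scale}
RATINGS = {
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob":   {"i1": 4, "i2": 2, "i3": 5, "i4": 4},
    "carol": {"i1": 1, "i2": 5, "i4": 2},
}

def user_similarity(ratings, a, b):
    """Cosine similarity between two users over the items both have rated."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    dot = sum(ratings[a][i] * ratings[b][i] for i in common)
    norm_a = math.sqrt(sum(ratings[a][i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings[b][i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict_user_based(ratings, user, item, k=2):
    """Similarity-weighted average of the k nearest neighbours' ratings."""
    neighbours = [
        (user_similarity(ratings, user, other), other)
        for other in ratings
        if other != user and item in ratings[other]
    ]
    neighbours.sort(key=lambda pair: (-pair[0], pair[1]))
    top = [(s, o) for s, o in neighbours[:k] if s > 0]
    if not top:
        return None  # no neighbour has rated the item
    weight_sum = sum(s for s, _ in top)
    return sum(s * ratings[o][item] for s, o in top) / weight_sum

print(round(predict_user_based(RATINGS, "alice", "i4"), 2))
```

The prediction for alice on i4 falls between the ratings of bob and carol and lies closer to bob's, because bob's past ratings resemble alice's more. The function returns None when no neighbour has rated the item, which is a small instance of the cold-start problem.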
Another type of memory-based CF approach is the item-based algorithms, which depend
on exploring the relationships between items instead of the relationships between users. They
generate recommendations for a user by finding, for each unrated item, similar items that the
user has rated or seen before (Sarwar et al. 2001). Item-based algorithms have been found
to provide the same quality of service as user-based algorithms but with less online
computation, because the relationships between items are relatively static compared with the
relationships between users (Sarwar et al. 2001). The item-based CF algorithms aim to
recommend to a particular user a new item that has not yet been rated by that user.
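The item-based variant can be sketched in the same spirit. Again the data and function names are hypothetical, and cosine similarity between item rating columns is an assumed choice. The point of the sketch is that similarities are computed between items, and the prediction is a weighted average of the target user's own ratings.

```python
import math

# Hypothetical toy ratings: user -> {item: rating on a 1-5 scale}
RATINGS = {
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob":   {"i1": 4, "i2": 2, "i3": 5, "i4": 4},
    "carol": {"i1": 1, "i2": 5, "i4": 2},
}

def item_similarity(ratings, i, j):
    """Cosine similarity between two items over the users who rated both."""
    users = [u for u in ratings if i in ratings[u] and j in ratings[u]]
    if not users:
        return 0.0
    dot = sum(ratings[u][i] * ratings[u][j] for u in users)
    norm_i = math.sqrt(sum(ratings[u][i] ** 2 for u in users))
    norm_j = math.sqrt(sum(ratings[u][j] ** 2 for u in users))
    return dot / (norm_i * norm_j)

def predict_item_based(ratings, user, item, k=2):
    """Weighted average of the user's ratings on the k most similar items."""
    scored = [
        (item_similarity(ratings, item, rated), rated)
        for rated in ratings[user]
        if rated != item
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    top = [(s, j) for s, j in scored[:k] if s > 0]
    if not top:
        return None  # no similar item was rated by this user
    weight_sum = sum(s for s, _ in top)
    return sum(s * ratings[user][j] for s, j in top) / weight_sum

print(round(predict_item_based(RATINGS, "alice", "i4"), 2))
```

Because the item-item similarities do not depend on the target user, they could be precomputed offline, which is the source of the lower online cost mentioned above.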
The model-based CF algorithms use all or part of the existing ratings as input to build a
model, which is then used to make predictions for individual users. Different machine learning
algorithms can be used for the model building process, such as Bayesian networks (Breese,
Heckerman & Kadie 1998), clustering (Jia, Jin & Liu 2010), the latent semantic model
(Hofmann 2004) and the mixture model (Kleinberg & Sandler 2008; Si & Jin 2003). These
algorithms mainly use a probabilistic approach to compute prediction values for unrated items
(Adomavicius & Tuzhilin 2005a; Schafer et al. 2007). Recently, matrix factorization (MF) has
attracted increasing attention due to its advantages with respect to scalability and accuracy.
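To make the matrix factorization idea concrete, the following is a minimal sketch that learns user and item latent factors by stochastic gradient descent on a regularized squared-error objective. This objective, the hyperparameter values, the toy data and all names are assumptions chosen for illustration; they are not the specific model proposed in this research.

```python
import math
import random

# Hypothetical observed ratings as (user index, item index, rating) triples.
OBSERVED = [
    (0, 0, 5), (0, 1, 3), (0, 2, 4),
    (1, 0, 4), (1, 1, 2), (1, 2, 5), (1, 3, 4),
    (2, 0, 1), (2, 1, 5), (2, 3, 2),
]

def train_mf(ratings, n_users, n_items, rank=2, lr=0.01, reg=0.02,
             epochs=1000, seed=0):
    """Learn factor matrices P (users) and Q (items) so that P[u].Q[i] ~ r."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 1.0) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 1.0) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(rank))
            for f in range(rank):
                p_uf, q_if = P[u][f], Q[i][f]
                # Gradient step on the squared error plus L2 regularization.
                P[u][f] += lr * (err * q_if - reg * p_uf)
                Q[i][f] += lr * (err * p_uf - reg * q_if)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: inner product of user and item latent vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def rmse(P, Q, ratings):
    """Root mean squared error over the given observed ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))

P, Q = train_mf(OBSERVED, n_users=3, n_items=4)
print("training RMSE:", round(rmse(P, Q, OBSERVED), 3))
print("prediction for user 0, item 3:", round(predict(P, Q, 0, 3), 2))
```

Once trained, the model predicts an unobserved entry, such as user 0 on item 3, from the learned factors alone, without storing or scanning the full rating matrix at prediction time.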
The main advantage of CF recommendation techniques is that they work for any type of
item without the need to extract item features. They rely only on the explicit user-item (UI)
rating matrix.
The major limitations of CF methods include the sparseness, scalability and cold-start problems
(Adomavicius & Tuzhilin 2005a; Schafer et al. 2007). Both memory-based CF approaches share
two typical drawbacks. First, it takes a lot of computation to calculate the similarity between
users or items. Second, the accuracy of those approaches depends on the adopted similarity
measure. The cold-start problem refers to the fact that a CF approach is unable to make useful
recommendations for a new user or a new item (Papagelis & Plexousakis 2005; Schafer et al.
2007).
3.1.3 Knowledge Based Recommendation Techniques
Knowledge-based (KB) recommendation techniques offer items to users based on extracted
knowledge about the users and items. Usually, a KBRS retains a functional knowledge base that
describes how a particular item meets a specific user’s requirement, so that recommendation
can be performed based on inferences about the relationship between a user’s need and a
possible recommendation (Burke 2002). The case-based reasoning technique is the most common
example of KBRS (Smyth 2007).
Case-based reasoning systems rely on the idea of using past experience as a primary source
for solving a new problem (Aamodt & Plaza 1994). The process is represented by a four-step
(4Rs) cycle: retrieve, reuse, revise and retain (Aamodt & Plaza 1994). Past problem solutions
are stored in a database as cases. Each case is typically made up of two parts, the
specification part and the solution part. The specification part describes the problem at
hand, whereas the solution part describes the solution that was used to solve this problem. A
new problem is solved by retrieving a case whose specification is similar to the current
problem and then adapting the attained solution to match the current problem. Case-based
recommender systems represent items as cases and generate recommendations by retrieving the
cases most similar to the user’s query or profile. In these systems, items are described in
terms of a well-defined set of features (e.g., price, colour, make,
etc.) (Smyth 2007).
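The retrieve step of this cycle can be sketched as a weighted nearest-case search over feature vectors. The case base, the feature weights, the price range used for normalization and the per-feature similarity functions are all hypothetical choices made for illustration; the reuse, revise and retain steps are not covered.

```python
# Hypothetical case base: each case has a specification and a solution part.
CASES = [
    {"spec": {"price": 200, "colour": "black", "make": "acme"},
     "solution": "phone_basic"},
    {"spec": {"price": 600, "colour": "silver", "make": "acme"},
     "solution": "phone_pro"},
    {"spec": {"price": 900, "colour": "black", "make": "zenith"},
     "solution": "phone_max"},
]

PRICE_RANGE = 1000.0  # assumed maximum price difference, used to normalize

def feature_similarity(name, a, b):
    """Local similarity: scaled distance for price, exact match otherwise."""
    if name == "price":
        return max(0.0, 1.0 - abs(a - b) / PRICE_RANGE)
    return 1.0 if a == b else 0.0

def case_similarity(query, spec, weights):
    """Weighted average of local similarities over the queried features."""
    total_weight = sum(weights[name] for name in query)
    score = sum(
        weights[name] * feature_similarity(name, query[name], spec[name])
        for name in query
    )
    return score / total_weight

def retrieve(query, cases, weights):
    """Return the stored case whose specification best matches the query."""
    return max(cases, key=lambda c: case_similarity(query, c["spec"], weights))

WEIGHTS = {"price": 2.0, "colour": 1.0, "make": 1.0}
best = retrieve({"price": 650, "colour": "black"}, CASES, WEIGHTS)
print(best["solution"])
```

In this example the query does not match any case exactly; the black 900-unit phone wins because it matches the colour and is reasonably close in price, which shows how partial matches are traded off through the weights.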
Case-based RS borrows heavily from the core concepts of retrieval and similarity in case-based
reasoning. Case-based RS can be seen as a special type of content-based recommender system.
There are two important ways in which case-based RS can be distinguished from other types of
content-based systems: (1) the manner in which products are represented; and (2) the way in
which product similarity is assessed (Smyth 2007). First, case-based RS relies on more
structured representations of item content. In existing case-based recommender systems, cases
are usually represented as fixed, predefined feature vectors. Second, case-based recommender
systems use various sophisticated approaches to similarity assessment when judging which cases
to retrieve in response to a user query. Similarity assessment is a key issue for both
case-based reasoning and case-based RS. The existing similarity measures focus on cases
represented as feature vectors.
Knowledge-based recommender systems (KBRS) have their own advantages. As KBRS exploits
deep knowledge about the product/service domain, it is able to support intelligent explanations
and product recommendations that are determined by a set of explicitly defined constraints
(Felfernig et al. 2006; Felfernig et al. 2008). Knowledge-based approaches are in the majority
of cases applied for recommending complex products and services such as consumer goods,
technical equipment, or financial services (Felfernig et al. 2008). KBRS has no cold-start
problem, as a new user can get recommendations based on simple knowledge of his/her
interests. KBRS generates recommendations by computing the similarities between the existing
cases and the user’s request, so it does not require the user to rate or purchase many items in
order to generate good recommendations.
KBRS also has some limitations. For instance, a KBRS needs to retain information about
items, users and functional knowledge for making recommendations. It also suffers from the
scalability problem, as it needs more time and effort to calculate the similarities for a
larger case base compared with other recommendation techniques.
3.1.4 Hybrid Recommendation Techniques
It can be seen that each recommendation technique has its own merits and drawbacks. A hybrid
recommendation technique can be proposed to gain higher performance and to avoid the
drawbacks of the individual recommendation techniques (Burke 2007a). This can be done by
combining the best features of two or more recommendation techniques into one hybrid
approach. The most common practice in existing hybrid recommendation techniques is to
combine the CF recommendation techniques with other RS techniques in an attempt to avoid
cold-start, sparseness and/or scalability problems (Adomavicius & Tuzhilin 2005a; Kim et al.
2006).
3.2 Cross-domain recommendation
Though we have witnessed great success in single domain recommendation with the various
techniques above, e-commerce and leisure Web sites such as Netflix, YouTube, iTunes
and Last.fm can only recommend a single type of item to users in their own domain (e.g.
Last.fm makes personalized recommendations of music artists and compositions, but all are
related to music). There is a great need for joint recommendation. For example, the
e-commerce website Amazon provides various types of items that may meet the users’
preferences. By offering more diverse choices, cross-domain recommendation can
lead to higher user satisfaction and engagement (Adomavicius & Tuzhilin 2005b).
Cross-domain recommendation can have other advantages, such as addressing the cold-start
problem in a single domain and mitigating the data sparsity problem. By exploring the relations
between items in different domains, cross-domain recommendation can offer recommendations to
users in a new, unexplored domain by considering their preferences for items in the known
domains. Before describing the existing cross-domain recommendation techniques, two questions
need to be answered. First, what kinds of auxiliary information, besides users’ explicit
ratings on items, can be explored to improve cross-domain recommendation performance? Second,
how can knowledge be transferred from the source domain to make recommendations in the target
domain using these auxiliary information sources?
3.2.1 Side information sources of users and items
The range of side information beyond the user-item rating matrix is wide and varied. One of the
most commonly used side information sources is attribute information (Bao, Bergman & Thompson
2009; Koenigstein, Dror & Koren 2011; Li et al. 2010), which contains user attributes and item
attributes. User attributes may include information such as the user’s gender, age, and hobbies.
Item attributes reflect properties of the item, such as category or content. Recently, social
networks and user contributed data have played an increasing role in RS, as they can provide
more specific information about users and items.
Different domains may not share the same users/items, which makes it a big challenge to
evaluate the similarity between users/items from different domains. What is common among
different domains for knowledge transfer therefore becomes the first thing to consider. In
this case, side information of users and items can be used to bridge the gap among
different domains. In the remainder of this subsection, we introduce various kinds of side
information of users and items.
3.2.1.1 Social network
A social network is useful for improving recommendation as it provides information in the
form of user-user relationships. The relationship between social users can be directed or
undirected. Three kinds of relationship are widely studied. One is the trust and
distrust relationship (Guha et al. 2004; Leskovec, Huttenlocher & Kleinberg 2010; Ma, Lyu &
King 2009; Ma et al. 2008; Massa & Avesani 2007). It is a directed relation and indicates
whether a user trusts or distrusts another. Another directed social relationship is follow. This
relationship is common on Twitter and reflects the appreciation of a user (follower) for another
user (followee) (Kwak et al. 2010). The last relationship is the friendship used on Facebook.
Friendship is an undirected relationship and can be represented as a symmetric user-user
graph/matrix, which encodes whether two users are friends of each other (Konstas, Stathopoulos
& Jose 2009). It is also possible to extract more complex relationships, such as tie strength and
similarity, between users in a social network by analyzing the link structure and the common
patterns of user behaviour (Backstrom & Leskovec 2011; Gilbert & Karahalios 2009;
Liben-Nowell & Kleinberg 2007).
All the algorithms that integrate the above social relationships are based on the same assumption:
users who hold a positive relationship with each other should also share the same interests in items.
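As a minimal sketch of the representations described above, directed relations (trust, follow) and undirected relations (friendship) can be stored as user-user adjacency matrices. The number of users and the edge lists below are made up purely for illustration.

```python
import numpy as np

n_users = 4  # hypothetical toy network

# Directed relation (trust or follow): entry [a, b] = 1 means a trusts/follows b.
directed_edges = [(0, 1), (1, 2), (3, 0)]
trust = np.zeros((n_users, n_users), dtype=int)
for a, b in directed_edges:
    trust[a, b] = 1

# Undirected relation (friendship): store both directions,
# which yields a symmetric user-user matrix.
friend_edges = [(0, 1), (2, 3)]
friends = np.zeros((n_users, n_users), dtype=int)
for a, b in friend_edges:
    friends[a, b] = 1
    friends[b, a] = 1
```

The friendship matrix equals its own transpose, whereas the trust matrix in general does not.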
3.2.1.2 User contributed data
Today an increasing amount of metadata is left online, as people may hold diverse online
accounts and can freely describe items or express their feelings after buying or using them.
This information is very valuable since it is specific to items and users, and it can be
explored as an extra content information source for improving the recommendation quality of
single-domain CF techniques. In the following, we investigate the usage of the four most commonly
used kinds of user contributed data in RS: tags, geotags, multimedia content and
reviews/comments.
Tag
A tag can be considered as a short plain text that is given to an item by a user (Robu, Halpin
& Shepherd 2009; Sen et al. 2006). The format of tags is not constrained, and they can be
used in different domains for different purposes. As a result, tagging is a powerful mechanism
that enables users to find, organize, and understand online entities. Tags are also an important
information source for enhancing recommendation algorithms. Many algorithms that incorporate
tag information into CF techniques have been developed (Sen, Vig & Riedl 2009; Tso-Sutter,
Marinho & Schmidt-Thieme 2008; Zhen, Li & Yeung 2009).
For cross-domain recommendation, tagging systems offer an alternative way to address the
domain mismatch problem. Because tags are easily comprehended by users in different
recommendation domains, tags can serve as a bridge that helps users in different domains to
better understand otherwise unknown relationships among themselves when evaluating different
items. By using tagging information to find the common characteristics among different domains,
knowledge can be transferred to the target domain to facilitate rating prediction for new
users/items. Much research on cross-domain recommendation focuses on combining CF techniques
and tagging information to improve joint recommendation performance, as will be further
discussed in Section 3.2.3.1.
GeoTag
A geotag is a special class of tag that depicts the location information of users in a social
site, such as the place where a photo was taken or a microblog post was made (Luo et al. 2011).
This kind of information has become popular since more and more mobile devices have a standard
GPS positioning function. Due to the availability of geographic information in the form of
geotags, remarkable progress has been achieved in improving restaurant, activity and
location/travel recommendation (Kurashima et al. 2010; Lu et al. 2010; Luo et al. 2011; Zheng,
Zha & Chua 2012).
If we can build a model of users’ locations and previous activity histories by mining knowledge
from geotags, such as location features and activity-activity relations, we may recommend related
locations and activities to users when they visit specific places or take part in specific
activities. Typical algorithms will be discussed in Section 3.2.3.2.
Multimedia content
Social media such as Flickr, YouTube, Twitter and Facebook have become pervasive. They contain
daily information about people and can be exploited to model user interests more elaborately,
thus contributing to content recommendation and facilitating social-trend-aware
recommendations (Davidson et al. 2010; Roy et al. 2012).
For cross-domain recommendation, building common connections among the disparate social media
on the internet and fusing multimedia content is a big challenge. To the best of our knowledge,
there is little work on using multimedia content to perform cross-domain recommendation, so a
large amount of research remains to be done.
Reviews/comments
Besides tags and geotags, freely written reviews/comments that users publish online after
purchasing a specific item are another important source of community contributions. They
are valuable not only because of their semantics but also because of their sentiment dimension.
As a result, it is not surprising that reviews/comments have become an important type of side
information for improving recommendation performance (Aciar et al. 2007; Jakob et al. 2009;
Levi et al. 2012; Moshfeghi, Piwowarski & Jose 2011).
Predicting the sentiment orientation of reviews/comments, a widely studied field known as
sentiment classification (Liu 2012; Pang, Lee & Vaithyanathan 2002; Ponomareva & Thelwall 2013;
Wu, Tan & Cheng 2009), can be converted to a rating prediction problem. By
using the common words/topics in user generated reviews/comments as the bridge, the
sentiment from the known domain can be propagated to the target domain to decide the sentiment
polarity (positive or negative) of target reviews/comments. The sentiment polarity can then be
converted to rating scores based on some designed mechanisms.
3.2.2 Transfer learning techniques
Traditional machine learning techniques assume that training and test data follow the same
distribution, but this assumption does not hold in many practical applications. In such cases,
we can train a new classifier with plenty of newly labelled data. However, in some
applications it is costly to annotate data manually in order to collect
enough labelled data for re-training. In contrast, we normally have a large amount of old data
at hand, which is related to but different from the new data. It would be a waste not to reuse it.
Transfer learning is a machine learning paradigm that tries to extract useful knowledge
from auxiliary data (source domain) to facilitate the learning task on new data (target domain)
(Pan & Yang 2010). According to Pan and Yang (Pan & Yang 2010), transfer learning can be divided
into three categories, namely inductive transfer learning, transductive transfer learning and
unsupervised transfer learning, based on different settings of source and target domains. We
will describe the corresponding problem setting of each category and introduce typical
algorithms in it.
3.2.2.1 Inductive transfer learning
In the inductive transfer learning setting, the learning task in the target domain is different
from that in the source domain, and inductive transfer learning aims to learn a prediction
function with labelled target domain data and source domain data (Pan & Yang 2010). Depending on
whether labelled data are provided in the source domain, inductive transfer learning behaves like
multi-task learning (Ben-David & Schuller 2003; Evgeniou & Pontil 2007) when labelled source
data are given, or like self-taught learning (Raina et al. 2007) when no labelled source data
are available.
With respect to the ‘what to transfer’ problem, existing inductive transfer learning algorithms
can be summarized into four cases: instance-based transfer, feature-based transfer,
parameter-based transfer and relation-based transfer (Pan & Yang 2010).
Instance-based transfer assumes that some source domain data can be reused together with a few
labelled data in the target domain to train a new model for the target domain. Dai et al. (Dai
et al. 2007) proposed an algorithm called TrAdaBoost, which iteratively reweights the source
domain data in order to pick out ‘good’ samples while down-weighting ‘bad’ ones for training a
classifier on the target domain. Based on the same idea of removing ‘misleading’ examples in the
source domain, different strategies have been adopted and various algorithms have been developed
(Jiang & Zhai 2007; Liao, Xue & Carin 2005; Wu & Dietterich 2004).
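The reweighting idea behind TrAdaBoost can be sketched as a single weight-update round. This is an illustrative sketch, not a full implementation: the weak learner is omitted, and the function name, argument names and toy error vectors are assumptions introduced here. Source instances that the current learner gets wrong are down-weighted by a fixed factor, while misclassified target instances are up-weighted as in AdaBoost.

```python
import numpy as np

def tradaboost_reweight(w_src, w_tgt, err_src, err_tgt, n_rounds):
    """One TrAdaBoost-style weight update.

    err_src / err_tgt hold |h(x) - y| in [0, 1] for each source / target instance.
    """
    n_src = len(w_src)
    # Weighted error of the current weak learner, measured on target data only.
    eps = np.sum(w_tgt * err_tgt) / np.sum(w_tgt)
    eps = min(eps, 0.499)  # the update assumes eps < 1/2
    beta_t = eps / (1.0 - eps)
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_src) / n_rounds))
    new_src = w_src * beta ** err_src       # misclassified source: weight shrinks
    new_tgt = w_tgt * beta_t ** (-err_tgt)  # misclassified target: weight grows
    return new_src, new_tgt

# Toy example: 4 source and 3 target instances, all starting with weight 1.
w_src, w_tgt = np.ones(4), np.ones(3)
err_src = np.array([0.0, 1.0, 0.0, 1.0])
err_tgt = np.array([0.0, 0.0, 1.0])
new_src, new_tgt = tradaboost_reweight(w_src, w_tgt, err_src, err_tgt, n_rounds=10)
```

In a complete boosting loop the weights would be renormalized and a new weak learner trained on the reweighted data in every round.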
Feature-based transfer aims to find a common feature representation for both source and target
domains on which the mismatch between the two domains is decreased. When labelled data in the
source domain are given, one can learn a good representation with labelled source and target data,
similar to feature learning in the multi-task learning setting. Argyriou et al. (Evgeniou & Pontil
2007) proposed to learn a common mapping function for both source and target domains
simultaneously; after projecting both source and target domain data into a low-dimensional
feature space, a classifier can be constructed with labelled data by solving an optimization
problem on that space. Lee et al. (Lee et al. 2007) combine related learning tasks to learn
meta-priors that can be transferred across domains and add weights to features to enable the
learning of a representation. A kernel-based method was proposed for projecting target data in
(Rückert & Kramer 2008).
When no labelled data are given in the source domain, Raina et al. (Raina et al. 2007) proposed to
apply the sparse coding technique to learn high-level features for both domains and, with the help
of those shared high-level features, to learn a representation of the target data. A classifier
can then be built using the learned representation and the corresponding labelled target data.
However, high-level features found in the source domain may sometimes not work well in the target
domain. Under this unsupervised feature learning setting, manifold learning has also been adopted
for inductive transfer learning (Wang & Mahadevan 2008).
Parameter-based transfer assumes that models in related domains may share common parameters
or priors. Unlike multi-task learning, approaches proposed for parameter-based
transfer normally assign larger weights to the loss function of the target domain instead of the
same weights for both source and target domains (Bonilla, Chai & Williams 2008; Evgeniou &
Pontil 2004; Lawrence & Platt 2004). With respect to parameter transfer, Gao et al. (Gao et al.
2008) proposed a locally weighted ensemble learning framework to combine multiple models for
transfer learning, where the weights are dynamically assigned according to a model’s predictive
power on each test example in the target domain.
Relation-based transfer is mainly used to transfer knowledge in relational domains, such as
network data and social network data, where the data are not independent and identically
distributed. To solve this problem, an algorithm called TAMAR, which transfers relational
knowledge with Markov Logic Networks across relational domains, was proposed (Mihalkova, Huynh &
Mooney 2007). Later, the authors also extended TAMAR to the single-entity setting (Mihalkova &
Mooney 2008).
3.2.2.2 Transductive transfer learning
Transductive transfer learning aims at model learning in the target domain when the learning
tasks in both source and target domains are the same and unlabelled target data can be obtained
at training time (Pan & Yang 2010). Under this problem setting, transductive transfer learning is
similar to domain adaptation (Jiang 2008), which is a widely studied subject in the machine
learning and NLP communities.
Under the framework of domain adaptation, the discrepancy between source and target domains
can be caused by different marginal distributions, that is $P(X_S) \neq P(X_T)$,
different conditional distributions, that is $P(Y_S|X_S) \neq P(Y_T|X_T)$, or both. Many
algorithms have been proposed to address these problems. To overcome the marginal distribution
discrepancy, we can apply sampling methods to estimate $P(X_S)$ and $P(X_T)$ separately, based
only on the observed data. Fan et al. (Fan et al. 2005) proposed to estimate the probability
ratio by using various classifiers. A kernel-mean matching (KMM) algorithm was developed to learn
the density ratio between $P(X_S)$ and $P(X_T)$ directly by matching the means of the source
domain data and the target domain data in a reproducing-kernel Hilbert space (Huang et al. 2006).
With respect to different conditional distributions, Pan et al. (Pan, Kwok & Yang 2008) exploited
the Maximum Mean Discrepancy Embedding (MMDE) method, originally designed for dimensionality
reduction, to learn a low-dimensional space that reduces the difference between the distributions
of different domains for transductive transfer learning. However, MMDE may suffer from its
computational burden. Thus, Pan et al. (Pan, Tsang, et al. 2011) further proposed an efficient
feature extraction algorithm, known as Transfer Component Analysis (TCA), to overcome the
drawback of MMDE.
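KMM, MMDE and TCA all build on comparing the means of two samples in a reproducing-kernel Hilbert space. The sketch below only computes that distance, the (biased) squared maximum mean discrepancy, and does not implement any of the three algorithms. The Gaussian kernel, its bandwidth `gamma` and the synthetic samples are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Pairwise squared Euclidean distances, then the Gaussian kernel.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mmd2(xs, xt, gamma=1.0):
    """Biased estimate of the squared MMD between source and target samples."""
    return (rbf_kernel(xs, xs, gamma).mean()
            + rbf_kernel(xt, xt, gamma).mean()
            - 2.0 * rbf_kernel(xs, xt, gamma).mean())

rng = np.random.default_rng(0)
xs = rng.normal(0.0, 1.0, size=(200, 2))       # source sample
xt_near = rng.normal(0.0, 1.0, size=(200, 2))  # same distribution as the source
xt_far = rng.normal(3.0, 1.0, size=(200, 2))   # shifted marginal distribution
```

Calling `mmd2(xs, xt_far)` gives a clearly larger value than `mmd2(xs, xt_near)`, which is the signal that a feature mapping learned for domain adaptation tries to reduce.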
3.2.2.3 Unsupervised transfer learning
In the setting of unsupervised transfer learning, no labelled data are provided in either the
source domain or the target domain. Unsupervised transfer learning is a relatively new topic, so
there are still gaps to be filled.
In (Dai et al. 2008), a new approach called self-taught clustering is proposed, which aims at
clustering a small collection of unlabelled data in the target domain with the help of a large
amount of unlabelled data in the source domain. In particular, self-taught clustering tries to
learn a common feature space across domains, which helps clustering in the target domain.
Similarly, (Wang, Song & Zhang 2008) first applies clustering methods to generate pseudo class
labels for the target unlabelled data. It then applies dimensionality reduction methods to the
target data and labelled source data to reduce the dimensions. These two steps run iteratively to
find the best subspace for the target data.
3.2.3 Cross-domain collaborative recommendation techniques
Matrix factorization is able to handle sparse data while at the same time learning the
latent features. It is also flexible enough to integrate different types of auxiliary data,
enriching the information sources that an algorithm can use. Most state-of-the-art cross-domain
collaborative filtering (CDCF) techniques are therefore factorization-based methods.
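For reference, a basic single-domain matrix factorization can be sketched as follows. The model learns user and item latent vectors from the observed ratings only, which is why it copes with sparse data. The function name, hyperparameter values and toy ratings are assumptions for illustration and are not taken from any of the cited methods.

```python
import numpy as np

def factorize(ratings, k=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Factorize observed (user, item, rating) triples by stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    n_users = 1 + max(u for u, _, _ in ratings)
    n_items = 1 + max(i for _, i, _ in ratings)
    U = 0.1 * rng.standard_normal((n_users, k))  # user latent features
    V = 0.1 * rng.standard_normal((n_items, k))  # item latent features
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - U[u] @ V[i]
            u_old = U[u].copy()
            # Gradient step on the squared error with L2 regularization.
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * u_old - reg * V[i])
    return U, V

# Only six of the nine user-item pairs are observed.
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 5.0), (2, 2, 2.0)]
U, V = factorize(ratings)
rmse = np.sqrt(np.mean([(r - U[u] @ V[i]) ** 2 for u, i, r in ratings]))
```

A missing rating for user `u` and item `i` is then predicted as `U[u] @ V[i]`. The cross-domain methods discussed in this section extend this objective with shared factors, extra regularization terms or additional matrices.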
Several works summarize the development of cross-domain recommendation. A
brief survey (Li 2011) introduces CDCF algorithms along two dimensions: collaborative
filtering domains and knowledge transfer styles. With respect to collaborative filtering domains,
the work (Li 2011) points out that there are three representative domains in practice, which are
system domain, data domain, and temporal domain. With respect to knowledge transfer styles,
the work (Li 2011) mainly focuses on three ways of transfer, namely rating-pattern sharing,
latent-feature sharing, and domain correlating. An extended survey of cross-domain recommendation
(Fernández-Tobías et al. 2012) mainly focuses on relations between domains, including
content-based relations and collaborative filtering based relations. More recently, a more
comprehensive survey (Shi, Larson & Hanjalic 2014) covers the broad topic of how to improve
user-based and model-based CF techniques by exploring various kinds of auxiliary data.
From the perspective of transfer learning, the existing cross-domain recommendation algorithms,
which implement different knowledge transfer patterns, can be classified into three categories:
adaptive knowledge transfer, collective knowledge transfer and integrative knowledge transfer.
In the following parts, representative algorithms in each category are introduced in detail.
3.2.3.1 Adaptive knowledge transfer
Cross-domain recommendation techniques based on adaptive knowledge transfer work in two
separate steps. First, common knowledge is mined from auxiliary data. Then
the extracted knowledge is adapted to the target data.
Codebook transfer (CBT) (Li, Yang & Xue 2009a) is an early cross-domain collaborative
filtering technique. It transfers cluster-level rating patterns from movies to books under the
assumption that there is an underlying correspondence between the user-item rating patterns of
the two domains. An extension called RMGM (Li, Yang & Xue 2009b) combines codebook
construction and codebook expansion in one single step. Considering the existence of various
auxiliary data, the codebook in CBT was extended into multiple codebooks with different
relatedness weights (Moreno et al. 2012). Furthermore, a recent work generalizes the codebook to
include a data-independent rating pattern and a data-dependent rating pattern, which is shown to
be more accurate than sharing only the data-independent common knowledge (Gao et al. 2013).
Cluster-level rating patterns are particularly useful when no explicit
overlap or correspondence can be found between target data and auxiliary data.
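A cluster-level rating pattern can be illustrated with a small sketch: given cluster assignments for users and items, each codebook entry is the average observed rating of one (user cluster, item cluster) block. The cluster labels are assumed to be given here, whereas CBT obtains them by co-clustering the auxiliary matrix, and the masking of missing ratings is a simplification made for this example. All names and the toy matrix are hypothetical.

```python
import numpy as np

def build_codebook(X, mask, user_cluster, item_cluster, k, l):
    """Average the observed ratings inside every (user cluster, item cluster) block."""
    Uind = np.eye(k)[user_cluster]   # n_users x k one-hot membership
    Vind = np.eye(l)[item_cluster]   # n_items x l one-hot membership
    sums = Uind.T @ (X * mask) @ Vind
    counts = Uind.T @ mask @ Vind
    return sums / np.maximum(counts, 1)  # blocks without ratings stay at 0

X = np.array([[5, 5, 1, 0],
              [4, 5, 0, 1],
              [1, 0, 5, 4],
              [0, 2, 4, 5]], dtype=float)
mask = (X > 0).astype(float)             # 0 marks a missing rating
user_cluster = np.array([0, 0, 1, 1])
item_cluster = np.array([0, 0, 1, 1])
B = build_codebook(X, mask, user_cluster, item_cluster, k=2, l=2)
```

The resulting 2 x 2 codebook `B` no longer refers to individual users or items, which is what makes it transferable to a target domain with no overlap.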
Another branch of adaptive knowledge transfer applies constraints through regularization.
The regularization terms restrict the user-specific and item-specific feature matrices factorized
from the target data to be similar to those factorized from the auxiliary data.
CST (Pan et al. 2010) transfers knowledge from auxiliary implicit feedback in the form of
browsing records to target explicit feedback in the form of ratings. More specifically, it
incorporates the coordinate systems (or latent features) extracted from auxiliary data into the
target factorization system via two regularization terms. This work provides a way to deal with
heterogeneous data for cross-domain recommendation; its main drawback is that it requires the
same users and items in both auxiliary data and target data.
3.2.3.2 Collective knowledge transfer
Compared to adaptive knowledge transfer, collective knowledge transfer tries to complete
common knowledge extraction and target domain rating prediction simultaneously.
Collective knowledge transfer methods assume that the same latent features are shared by
auxiliary data and target data. The shared latent features can be either user-specific or
item-specific. Under this assumption, different algorithms have been developed
by fusing various types of user-side or item-side information. CMF (Singh & Gordon 2008)
collectively factorizes one user-item rating matrix and one item-content matrix, sharing the
same item-specific latent features to enable knowledge transfer between the two data sources.
Similar to CMF, a model proposed in the same period (Ma et al. 2008) factorizes one
user-item rating matrix and one user-user social network matrix in order to find shared
user-specific latent features. Later, more complicated models with more matrix factorizations
were put forward. WNMCTF exploits the non-negative matrix factorization (NMF) technique to
collectively factorize one user-item rating matrix, one user demographic matrix and one
item-content matrix, sharing both user-specific and item-specific latent
features to enhance the knowledge transfer. MCF-LF (Zhang, Cao & Yeung 2010), CLP-GP
(Cao, Liu & Yang 2010) and NB-MCF (Chatzis 2013) study multiple user-side auxiliary data
matrices and learn users’ preferences and the similarities between target and auxiliary data
simultaneously, which is shown to be more effective than sharing the latent
features alone. Instead of using the auxiliary data directly, some researchers propose to mine
it further. JMF (Shi, Larson & Hanjalic 2013) collectively factorizes one user-item rating matrix
and one item-item similarity matrix mined from movies’ mood descriptions. LOCABAL (Tang et
al. 2013) collectively factorizes one user-item rating matrix weighted by users’ global
reputations and one weighted user-user social matrix, with the constraint of sharing the same
user-specific latent feature matrix. Instead of transferring all knowledge, STLCF (Lu et al.
2013) selectively transfers high-quality knowledge from multiple user-aligned data sources, which
was shown to be more accurate than transfer without selection.
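The simplest case above, one rating matrix and one item-content matrix that share item-specific latent features, can be sketched as a joint loss. This is a squared-loss simplification written for illustration and not the general formulation of (Singh & Gordon 2008). The names `cmf_loss`, `alpha` (weight of the auxiliary term) and `lam` (regularization strength), as well as the toy matrices, are assumptions.

```python
import numpy as np

def cmf_loss(R, mask, C, U, V, W, alpha=0.5, lam=0.1):
    """Joint loss of two factorizations that share the item factors V."""
    rating_term = np.sum((mask * (R - U @ V.T)) ** 2)      # user-item ratings
    content_term = alpha * np.sum((C - V @ W.T) ** 2)      # item-content matrix
    reg = lam * (np.sum(U**2) + np.sum(V**2) + np.sum(W**2))
    return rating_term + content_term + reg

# Toy factors: 2 users, 3 items, 2 content features, 2 latent dimensions.
U = np.array([[1.0, 0.0], [0.0, 1.0]])
V = np.array([[1.0, 1.0], [0.0, 2.0], [2.0, 0.0]])
W = np.array([[1.0, 0.0], [0.0, 1.0]])
R = U @ V.T            # ratings reproduced exactly by the factors
C = V @ W.T            # content reproduced exactly by the factors
mask = np.ones_like(R)

exact = cmf_loss(R, mask, C, U, V, W)
perturbed = cmf_loss(R, mask, C, U, V + 0.5, W)
```

Because `V` appears in both reconstruction terms, changing the item factors affects the fit to the ratings and to the content matrix at the same time, which is how knowledge flows between the two data sources.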
To handle the heterogeneity of rating representations in different recommender systems, an
algorithm called TCF (Pan, Liu, et al. 2011) was proposed. It jointly factorizes an explicit
feedback matrix of ratings and an implicit feedback matrix of like/dislike data. In particular,
apart from sharing both user-specific and item-specific latent
features, it also uses two inner matrices to represent data-dependent information. Sharing
common knowledge while keeping the data-dependent part separate is a more elaborate strategy
that is applicable to many practical applications. An extension of TCF called iTCF (Pan & Ming
2014) was shown to be more effective than TCF.
3.2.3.3 Integrative knowledge transfer
Instead of extracting common knowledge or finding common latent features, cross-domain
recommendation techniques based on integrative knowledge transfer incorporate auxiliary
data directly into the target learning task, which is related to feature engineering and data
fusion.
FM (Rendle 2012) represents the user-item feedback matrix in a new way, so that the revised
prediction rule can take the interactions between pairs of latent features into account. Since
FM can capture more complex correlations among variables with the revised prediction rule, it is
believed to generate more accurate recommendations. However, the revised prediction rule also
makes the learning and prediction procedures more expensive in terms of time and space complexity.
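The second-order factorization machine prediction rule can be sketched as below. Each training case is encoded as one feature vector `x` (for example a one-hot user, a one-hot item and auxiliary features), and every pair of features interacts through the inner product of their latent vectors. The feature layout and parameter values are assumptions for illustration. The pairwise term is evaluated with the well-known reformulation that avoids the explicit double loop, and a naive version is included for comparison.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine score for one feature vector x."""
    linear = w0 + w @ x
    # Pairwise term via the identity
    # sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2]
    s = V.T @ x
    pairwise = 0.5 * np.sum(s ** 2 - (V ** 2).T @ (x ** 2))
    return linear + pairwise

def fm_predict_naive(x, w0, w, V):
    """Same score computed with an explicit loop over all feature pairs."""
    n = len(x)
    total = w0 + w @ x
    for i in range(n):
        for j in range(i + 1, n):
            total += (V[i] @ V[j]) * x[i] * x[j]
    return total

rng = np.random.default_rng(0)
n_features, k = 6, 3
w0, w, V = 0.1, rng.normal(size=n_features), rng.normal(size=(n_features, k))
# Hypothetical layout: indices 0-2 one-hot user, 3-4 one-hot item, 5 auxiliary feature.
x = np.array([0.0, 1.0, 0.0, 0.0, 1.0, 0.5])
```

Both functions return the same score for `x`. Auxiliary data enter the model simply as extra columns of `x`, which is why this style of method is described as integrative.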
Recently, tags have attracted more and more attention as an important information source and
bridge for making cross-domain recommendation. TagiCoFi (Kamishima, Hamasaki &
Akaho 2009) makes use of social tags to bridge the gap between auxiliary data
and target data. Specifically, in this work, a regularization term,
$$\sum_{u=1}^{n} \sum_{u'=1}^{n} s_{uu'} \left\| U_u - U_{u'} \right\|_F^2$$
is added to the basic user-item rating matrix factorization objective, where $s_{uu'}$ measures
the similarity between users $u$ and $u'$ obtained by mining the social tags, $U_u$ is the
feature vector of user $u$ and, similarly, $U_{u'}$ is the feature vector of user $u'$. SocialMF
(Jamali & Ester 2010) penalizes the distance between a specific user’s feature vector and a
weighted sum of his/her friends’ feature vectors by adding the regularization term
$$\sum_{u=1}^{n} \left\| U_u - \sum_{u' \in G_u} s_{uu'} U_{u'} \right\|_F^2$$
to the general matrix factorization model, where user $u'$ is selected from the neighbour group
$G_u$ of user $u$. Auxiliary data can also be represented as constraints, so incorporating
auxiliary data via constraints is very flexible. As an example, TIF (Pan, Xiang & Yang 2012)
defines score intervals, called uncertain ratings, as a constraint in addition to the basic matrix
factorization objective, which requires that the predicted preference fall in the range given by
the corresponding auxiliary data.
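The two regularization terms discussed above can be evaluated with a few lines of NumPy. This is a sketch of the terms only, not of the full TagiCoFi or SocialMF models. The user feature matrix `U` and the weight matrix `S` are made-up examples, and using the same `S` for both terms is a simplification for this illustration.

```python
import numpy as np

def tag_similarity_reg(U, S):
    """sum_u sum_u' S[u, u'] * ||U[u] - U[u']||^2 (pairwise similarity term)."""
    diff = U[:, None, :] - U[None, :, :]
    return np.sum(S * np.sum(diff ** 2, axis=2))

def social_reg(U, S):
    """sum_u ||U[u] - sum_u' S[u, u'] * U[u']||^2 (weighted-neighbour term)."""
    return np.sum((U - S @ U) ** 2)

# Three users with two latent features each.
U = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
# Row-normalized weights; S[u, u'] = 0 when u' is not a neighbour of u.
S = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0]])

pairwise_penalty = tag_similarity_reg(U, S)
neighbour_penalty = social_reg(U, S)
```

Users 0 and 1 have identical feature vectors and contribute nothing to either term, whereas user 2 differs from both of its neighbours and accounts for the whole penalty.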
4. Significance
Significance 1: the research develops a graph-based cross-domain recommendation framework
and relevant algorithms
The graph-based cross-domain recommendation model proposed in this research considers all
the information in a multipartite graph, such as the definition of nodes, edges and the weights
of the edges that connect any two endpoints. It can describe the three most commonly encountered
entities, users, items and tags, and their affiliations in one graph representation. A spectral
clustering method is performed to identify the relationship between domain-independent (common)
tags and domain-dependent tags. Based on the tag clusters, the users and items that interact with
tags in the same clusters are connected. The preference information can then be transferred
among those connected users. To our knowledge, no prior work has used a graph-based
method for cross-domain recommendation, so our proposed approach can enrich the
classes of cross-domain recommendation algorithms.
Significance 2: the research develops a new cross-domain collaborative filtering framework and
relevant algorithms
A new cross-domain collaborative filtering framework is developed. In our approach, richer
and more diverse auxiliary information from user contributed data or user-item interactions can
be incorporated into the collaborative filtering technique, which by itself does not consider any
context information beyond historical rating scores. Although many research works have already
been conducted in this direction, cross-domain collaborative filtering is a relatively new topic
and many additional information sources are emerging. Our proposed method will be able to
mine the knowledge hidden in those additional information sources more efficiently and fill some
gaps in the cross-domain recommendation community.
Significance 3: the research can directly support metadata providers or individual companies in
improving their recommendation performance and increasing their profits.
A software framework based on the proposed cross-domain recommendation approaches will be
designed, and a working software prototype of such a system will be developed, aiming to
capture customers’ preferences for different items in different domains in a more general
model. The software system will help offer joint recommendation, which in turn is expected to
lead to higher user satisfaction and engagement. This may also open new directions for
recommendation technology.
5. Research methodology
Task 1: Find an explicit domain link to bring different domains together through
interaction-associated information (to achieve objective 1)
Several pioneering works have been proposed to discover explicit linkage among domains.
The most direct method is to utilize the overlap of users or items (Shapira, Rokach & Freilikhman
2013). However, domains are often mutually exclusive, each involving a certain type of product
(e.g., movies, music or books) and a set of users whose identities or identifiers are largely
unique to the domain. As a result, it is difficult to directly extract common characteristics
among users and items from different domains. Recently, a novel approach has been developed based
on user generated tags, which assumes that different users in different domains may use the same
tags to describe items or express their opinions about items (Shi, Larson & Hanjalic 2011).
In addition to the above user generated information sources, other abundant interaction-associated
information, which records the interaction between a user and an item, such as the time when a
user rated an item or the location where a user uploaded a photo or downloaded a mobile
application, can also be analysed and used to find an explicit domain link that bridges different
domains for common knowledge transfer.
Task 2: Connect different tag clusters together based on identical tags, so that more
knowledge transfer channels will be built (to achieve objective 2)
Exploiting social tag information has been a popular way to improve recommender systems in
recent years. Existing works in the cross-domain recommendation community only use identical tags
that appear in different domains as the channel for knowledge transfer, while discarding many
different ones. In fact, those different tags can also be utilized to build bridges for more
efficient knowledge transfer. In our proposed method, we try to explore further relationships
among those social tags.
Step 2.1- To perform tag clustering in each domain and connect different tag clusters in different
domains together through spectral clustering technique and identical tags.
Different recommender systems may have different users and items. However, it is still
possible that some users in different domains use the same tags to annotate items of interest, and
that some items in different domains are tagged with the same tags, which encode their similar
properties. Apart from those identical tags, there are also abundant different tags, called
domain-specific tags, which are attached to specific items by the corresponding users in each
domain. In our approach, we first use clustering techniques to cluster all the domain-specific
tags in each domain into separate clusters. Then we apply spectral clustering to combine those
tag clusters with the corresponding identical tags, so that tag clusters that belong to different
domains can be connected and extra knowledge transfer bridges can be constructed. The whole
process is illustrated in figure 1.
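To make the clustering step more concrete, the following sketch shows the simplest form of spectral clustering, a two-way split of a tag graph using the second eigenvector of the normalized Laplacian. It is only an illustration of the technique, not the procedure proposed in this research: the tag names and the co-occurrence weights in `A` are hypothetical, and a practical system would need more than two clusters.

```python
import numpy as np

def spectral_bipartition(A):
    """Split a connected similarity graph into two groups using the Fiedler vector."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)    # eigenvalues in ascending order
    fiedler = d_inv_sqrt * eigvecs[:, 1]    # eigenvector of the second-smallest eigenvalue
    return (fiedler > 0).astype(int)

tags = ["funny", "comedy", "humor", "dark", "thriller", "suspense"]
# Hypothetical tag co-occurrence counts: two dense groups joined by one weak link.
A = np.array([[0, 5, 4, 0, 0, 0],
              [5, 0, 6, 1, 0, 0],
              [4, 6, 0, 0, 0, 0],
              [0, 1, 0, 0, 5, 4],
              [0, 0, 0, 5, 0, 6],
              [0, 0, 0, 4, 6, 0]], dtype=float)
labels = spectral_bipartition(A)
```

The first three tags receive one label and the last three the other. Which group is labelled 0 or 1 depends on the sign of the eigenvector and carries no meaning.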
Step 2.2- To mine local user similarities and item similarities from those locally connected
domain-specific tag clusters
Based on the connections mined from domain-specific tag clusters in different domains, we can
define local cross-domain user-to-user and item-to-item similarities to make up for the
deficiency of using only identical tags to define global user-to-user and item-to-item
similarities. The local user and item similarities induced by domain-specific tags can be seen as
extra knowledge transfer bridges that bring different domains together in order to share
knowledge more efficiently. This part is shown in figure 2.
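One possible way to define such a local similarity is sketched below: each user's tag usage is aggregated per linked tag cluster, and users are compared by the cosine of these cluster profiles. The mapping from tags to clusters, the tag counts and the choice of cosine similarity are all assumptions made for this example rather than the definition adopted in this research.

```python
import numpy as np

def cluster_profile(tag_counts, tag_to_cluster, n_clusters):
    """Aggregate a user's tag usage counts into counts per tag cluster."""
    profile = np.zeros(n_clusters)
    for tag, count in tag_counts.items():
        if tag in tag_to_cluster:
            profile[tag_to_cluster[tag]] += count
    return profile

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical linked clusters: cluster 0 groups humour-related tags from both domains.
tag_to_cluster = {"comedy": 0, "funny": 0, "satire": 0, "thriller": 1, "noir": 1}
movie_user = {"comedy": 3, "funny": 1}   # tags used in the movie domain
book_user = {"satire": 2}                # tags used in the book domain
other_user = {"noir": 4}

p_movie = cluster_profile(movie_user, tag_to_cluster, 2)
p_book = cluster_profile(book_user, tag_to_cluster, 2)
p_other = cluster_profile(other_user, tag_to_cluster, 2)
```

Here `cosine(p_movie, p_book)` is 1.0 although the two users share no identical tag, while `cosine(p_movie, p_other)` is 0.0. A similarity based on identical tags alone would be zero in both cases.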
Task 3: Mine the relationship among reviews/comments in different domains to
predict the ratings in target domain (to achieve objective 2)
Most recommender systems provide a review/comment function for users to express their
feelings about the quality of, and their satisfaction with, purchased items/services. This form
of interaction information can also be mined to make better recommendations. In our approach, we
analyse the sentiment orientation of target domain reviews using reviews from auxiliary domains
and corresponding sentiment classification techniques. Based on the predicted sentiment
orientation, we can infer item rating scores with some heuristic strategies.
Step 3.1- To identify the sentiment orientation of reviews in target domain with sentiment
classification techniques
Sentiment classification is the task of classifying an opinionated document as expressing a
positive or negative opinion. A review/comment is free-form textual information that contains
the opinion a consumer holds on a specific item, so we can apply cross-domain sentiment
classification techniques to analyse the sentiment orientation of target domain reviews with
reviews collected from auxiliary domains.
Step 3.2- To convert the predicted sentiment levels to rating scores with some heuristic strategies
After obtaining the sentiment orientation of reviews expressed on specific items, we can use
heuristic strategies and regression methods to recover the rating scores with the highest
confidence.
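One very simple heuristic of this kind is a linear mapping from the classifier's positive-class probability to the rating scale. The sketch below is only an example of such a strategy. The function name, the 1 to 5 scale and the linear form are assumptions, and a learned regression could replace the mapping.

```python
def sentiment_to_rating(p_positive, min_rating=1, max_rating=5):
    """Map a positive-class probability in [0, 1] linearly onto a discrete rating scale."""
    if not 0.0 <= p_positive <= 1.0:
        raise ValueError("p_positive must lie in [0, 1]")
    raw = min_rating + p_positive * (max_rating - min_rating)
    return int(round(raw))

lowest = sentiment_to_rating(0.0)    # strongly negative review
neutral = sentiment_to_rating(0.5)   # undecided sentiment
highest = sentiment_to_rating(1.0)   # strongly positive review
```

With the default scale, a strongly negative review maps to the lowest rating, a strongly positive one to the highest, and an undecided one to the middle of the scale.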
Task 4: Use graph-based methods to model the relationship among users and items in
different domains and apply graph mining techniques to make cross-domain
recommendation (to achieve objective 3)
A graph is an abstract representation for organizing data. The importance of graph-based
approaches has grown rapidly with the increasing availability of additional information that can
be incorporated into a graph representation. In the recommendation community, graph-based
representation of recommender data has been shown to successfully encapsulate the relationships
between entities and to facilitate the generation of accurate recommendations. It also allows
for automatic extraction and population of graph-based features, which further improve
recommendation accuracy (Gu, Zhou & Ding 2010; Tiroshi et al. 2013). However, to the best of our
knowledge, most works, which rely on random walk and its variants, are developed for
generating recommendations in a single domain rather than for cross-domain recommendation.
In our proposed approach, we try to develop a graph-based method to generate cross-domain
recommendations. Figure 3 shows a conceptual view of graph-based approaches for
cross-domain recommendation.
Step 4.1- To construct a graph that can represent users and items in each domain and link
multiple graphs together by mining some relationships among users and items
As the most important part of a graph-based approach, the way the graph is constructed, including
the definition of nodes, edges and edge weights, will affect the final result of this kind of method.
In our approach, we propose to use both internal information (e.g. content information such as
users’ occupation and items’ genre) and external information (e.g. social trust networks and social
tagging systems) in a single graph. Two examples are shown in Figure 4. More specifically, in
Figure 4a we build a tripartite graph in each domain, whose nodes represent users, items and tags,
respectively. Its structure reflects not only the intra-relationships among user nodes and among
item nodes, but also the inter-relationships between users and tags and between items and tags.
The tag nodes can act as a bridge to connect different domains. In Figure 4b we construct a
bipartite graph in each domain but, unlike existing methods, our approach also includes
connections between users and between items. At the core of the whole graph is the friendship
relation extracted from the social network, which links the different domains together.
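The tripartite construction with tags as bridges can be sketched as below. This is a minimal illustration, not the proposal's actual construction: the input format (domain, user, item, tag), the unit edge weights, the node naming scheme and the function name build_tripartite_graph are all assumptions made for the example.

```python
from collections import defaultdict


def build_tripartite_graph(tagging_events):
    """Build an undirected weighted graph from (domain, user, item, tag) tuples.

    User and item nodes are namespaced by domain; tag nodes carry no domain,
    so identical tags are shared and act as bridges between the domain graphs.
    """
    graph = defaultdict(lambda: defaultdict(float))

    def add_edge(a, b):
        # Each co-occurrence adds weight 1.0 in both directions (assumed weighting).
        graph[a][b] += 1.0
        graph[b][a] += 1.0

    for domain, user, item, tag in tagging_events:
        user_node = ("user", domain, user)
        item_node = ("item", domain, item)
        tag_node = ("tag", tag)
        add_edge(user_node, item_node)
        add_edge(user_node, tag_node)
        add_edge(item_node, tag_node)
    return graph


# Toy data: two domains that share the tag "sci-fi".
events = [
    ("movies", "alice", "Alien", "sci-fi"),
    ("books", "bob", "Dune", "sci-fi"),
]
graph = build_tripartite_graph(events)
```

In this toy example the node ("tag", "sci-fi") is adjacent to items from both domains, which is the bridging role described above.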
Step 4.2- To apply graph mining techniques to generate cross-domain recommendation
Similar to existing methods, we also exploit random walk with restart to mine relationships in
a multipartite graph that includes users, items and other entities. One of the key issues in
generating recommendations using multipartite graphs is the treatment of the different scales
used for the weights of relationships (edges) between different entities (nodes). To address this
problem, we can apply heuristic normalization methods to bring the edge weights onto
comparable scales.
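Random walk with restart on a weighted graph can be sketched as follows. This is a minimal sketch under stated assumptions: the graph is given as a dense symmetric adjacency matrix, column normalisation is used as one simple way of putting edge weights on a comparable scale, and the restart probability of 0.15 and the function name are illustrative choices rather than values from the proposal.

```python
import numpy as np


def random_walk_with_restart(adjacency, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities of a walk that restarts at `seed`."""
    adjacency = np.asarray(adjacency, dtype=float)
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0.0] = 1.0      # avoid division by zero for isolated nodes
    transition = adjacency / col_sums    # column-normalise: each column sums to 1
    restart_vec = np.zeros(adjacency.shape[0])
    restart_vec[seed] = 1.0
    p = restart_vec.copy()
    for _ in range(max_iter):
        # Follow an edge with probability 1 - restart, jump back to the seed otherwise.
        p_next = (1.0 - restart) * transition @ p + restart * restart_vec
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p
```

Nodes with higher scores are more closely related to the seed node, so the item nodes with the highest scores for a seed user would be the candidates for recommendation.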
Task 5:
Apply matrix factorization and related techniques, incorporating
various side information of users and items, to improve cross-domain recommendation
(to achieve objective 4)
The most successful and widely used technique in recommendation is collaborative filtering (CF),
which is based on the assumption that similar users will have similar preferences on items. This
technique can be applied in various recommendation scenarios because it exploits only the
user-item rating matrix, without considering any content information. As a result, the quality of
its recommendations may be restricted to some extent. Nowadays, various additional information
sources are available in addition to explicit rating scores, and CF can be greatly enhanced if we
make full use of them. In this situation, matrix factorization (MF) has been adopted to exploit
rich side information. In our approach, we intend to develop a cross-domain collaborative
filtering framework that exploits rich side information of users and items.
Step 5.1- To find a useful side information source that can be incorporated into recommendation
framework
For recommender systems, social networks introduce information in the form of user-user
relationships, which may be particularly useful for improving the quality of recommendation.
Several types of social relationship need to be considered in recommendation, and they can be
either directed or undirected. In our approach, we will exploit both directed and undirected
relationships, each in its own model.
In addition to jointly exploiting social networks with MF, other types of side information, such
as social tags and geotags, can also be exploited to improve recommendation performance. In
our approach, we will also investigate the contribution of incorporating rich tag information to
cross-domain recommendation.
Step 5.2- To merge above information with factorized user-specific latent feature matrix and
item-specific latent feature matrix
Based on the user-item rating matrix and the MF technique, we can obtain two matrices in each
domain: a user-specific latent feature matrix and an item-specific latent feature matrix.
From each row of these two matrices we can identify which cluster a user or item belongs to.
This reflects the rating patterns of users on items in each domain and can be further exploited in
Step 5.3. Because the side information of users and items contains rich information about the
relationships among users and items, the factorized user-specific and item-specific latent feature
matrices can be adjusted and enriched by these additional information sources, which will lead
to better recommendation results.
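One way latent feature matrices can be enriched by side information is sketched below for the social-network case. It is an illustrative sketch only, not the model the proposal will develop: it assumes plain stochastic gradient descent, an L2 penalty, and a social term that pulls each user's latent vector towards the average of that user's friends. The function name and all hyperparameter values are assumptions.

```python
import numpy as np


def factorize_with_social(ratings, friends, n_users, n_items, k=2,
                          lr=0.01, reg=0.05, social_reg=0.1, epochs=500, seed=0):
    """Factorise observed ratings into user factors U and item factors V.

    ratings: list of (user, item, rating) triples.
    friends: dict mapping a user index to a list of friend indices (side information).
    """
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, k))
    V = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:
            u_old = U[u].copy()
            err = r - u_old @ V[i]
            # Social term: difference between the user and the friends' average factors.
            if friends.get(u):
                social_pull = u_old - U[friends[u]].mean(axis=0)
            else:
                social_pull = np.zeros(k)
            U[u] += lr * (err * V[i] - reg * u_old - social_reg * social_pull)
            V[i] += lr * (err * u_old - reg * V[i])
    return U, V
```

The predicted rating of user u on item i is then U[u] @ V[i]; tag information could be incorporated in a similar way by adding an analogous term on the item factors.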
Step 5.3- To model the domain relatedness and transfer knowledge selectively from auxiliary
domains
Besides containing rich information about users and items, additional information sources also
provide a way to link different domains together. However, different domains may play different
roles in knowledge transfer, so we need to automatically identify domain relatedness in our
model and selectively transfer knowledge from related domains, using historical rating data
from auxiliary domains to make recommendations in the target domain.
Task 6:
Develop a cross-domain collaborative recommender system for industry (to
achieve objective 5)
Step 6.1- To design a system framework
Based on the cross-domain recommendation approaches proposed above, a framework for the
cross-domain recommender system is designed, as illustrated in Figure 5.
The system consists of four parts: a database system, a knowledge base system, a similarity engine
and a recommendation engine. The database system involves the development of two databases:
a products/services database and a users’ profile and usage records database. The knowledge base
system maintains the business rules. The similarity engine contains the comprehensive similarity
measure model and algorithms for tree structured data. The recommendation engine contains three
main parts: a retrieve module, an adapt module and a recommendation generator. The retrieve
module receives the target customer's profile information and requirements and constructs a tree
structured representation; extracts data from the user profile and usage database, constructs tree
structured cases, and converts their structure type to a form unified with the target user
specification if necessary; and calls the similarity engine to evaluate the similarity degree of the
existing cases and find the K cases most similar to the target user. The adapt module modifies the
retrieved product packages according to the target user's requirements; checks the business rules
and adjusts any improper product; extracts data from the products/services database to find the
most suitable products/services for the target customer; and receives the target user's revision
requirements for the recommendations and adjusts the recommendations accordingly.
The recommendation generator module generates the recommended product packages for the
target customer.
Step 6.2- To implement the cross-domain recommender system
A working cross-domain recommender system prototype based on the framework will be
developed.
Task 7:
Validation of the proposed cross-domain recommendation approaches
The research will be verified on common datasets, such as the MovieLens dataset
(http://grouplens.org/datasets/movielens/), the Jester dataset
(http://goldberg.berkeley.edu/jesterdata/) and the Netflix dataset
(http://www.lifecrunch.biz/archives/207), which will be used to verify the
accuracy and effectiveness of the cross-domain recommendation approaches.
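Prediction accuracy on such datasets is commonly measured with the mean absolute error (MAE) and the root mean squared error (RMSE) between predicted and held-out ratings. The proposal does not name its metrics, so the choice of these two is an assumption; a minimal sketch is:

```python
import math


def mae(predicted, actual):
    """Mean absolute error between two equally long rating sequences."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error; penalises large errors more strongly than MAE."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))
```

Lower values of both metrics indicate more accurate rating predictions.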
6. Research timeline
| Time | Task | Research step | Objective | Progress |
| Semester 1 | 1 | Find an explicit domain link to bring different domains together through interaction-associated information | 1 | Finished |
| Semester 2 | 2 | Step 2.1: To perform tag clustering in each domain and connect different tag clusters in different domains together through spectral clustering technique and identical tags | 2 | Preparing |
| | | Step 2.2: To mine local user similarities and item similarities from those locally connected domain-specific tag clusters | | |
| | 3 | Step 3.1: To identify the sentiment orientation of reviews in target domain with sentiment classification techniques | 2 | |
| | | Step 3.2: To convert the predicted sentiment levels to rating scores with some heuristic strategies | | |
| Semester 3 | 4 | Step 4.1: To construct a graph that can represent users and items in each domain and link multiple graphs together by mining some relationships among users and items | 3 | Doing experiments and analyzing results |
| | | Step 4.2: To apply graph mining techniques to generate cross-domain recommendation | | |
| Semester 4 | 5 | Step 5.1: To find a useful side information source that can be incorporated into recommendation framework | 4 | |
| | | Step 5.2: To merge above information with factorized user-specific latent feature matrix and item-specific latent feature matrix | | |
| | | Step 5.3: To model the domain relatedness and transfer knowledge selectively from auxiliary domains | | |
| Semester 5 | 6, 7 | Step 6: Develop a cross-domain collaborative recommender system for industry | 5 | |
| | | Step 7: validation of the proposed cross-domain recommendation approaches | | |
| Semester 6 | | Step 6: writing the thesis | | |
7. Research progress up to date
One conference paper has been published. In this paper, a new fuzzy domain adaptation method
based on a self-constructing fuzzy neural network is proposed. This approach models the
transferred knowledge supporting the development of the current models granularly in the form
of fuzzy sets and adapts the knowledge using a fuzzy similarity measure to reduce prediction
error in the target domain.
Peng H., Guangquan Z., Vahid B & Zheng Z. 2014, ‘A fuzzy domain adaptation method based
on self-constructing fuzzy neural network’, International Conference on Fuzzy Logic and
Intelligent Technologies in Nuclear Science (FLINS), 2014, pp. 676-681.
8. References
Aamodt, A. & Plaza, E. 1994, 'Case-based reasoning: Foundational issues, methodological variations, and
system approaches', AI communications, vol. 7, no. 1, pp. 39-59.
Aciar, S., Zhang, D., Simoff, S. & Debenham, J. 2007, 'Informed recommender: Basing recommendations
on consumer product reviews', Intelligent Systems, IEEE, vol. 22, no. 3, pp. 39-47.
Adomavicius, G. & Tuzhilin, A. 2005a, 'Toward the next generation of recommender systems: a survey
of the state-of-the-art and possible extensions', IEEE Transactions on Knowledge and Data
Engineering, vol. 17, no. 6, pp. 734-49.
Adomavicius, G. & Tuzhilin, A. 2005b, 'Toward the next generation of recommender systems: A survey
of the state-of-the-art and possible extensions', Knowledge and Data Engineering, IEEE
Transactions on, vol. 17, no. 6, pp. 734-49.
Adomavicius, G. & Tuzhilin, A. 2011, 'Context-aware recommender systems', Recommender systems
handbook, Springer, pp. 217-53.
Armstrong, R., Freitag, D., Joachims, T. & Mitchell, T. 1995, 'Webwatcher: A learning apprentice for the
world wide web', Proc. AAAI Spring Symposium on Information Gathering from Heterogeneous,
Distributed Environments, pp. 6-12.
Backstrom, L. & Leskovec, J. 2011, 'Supervised random walks: predicting and recommending links in
social networks', Proceedings of the fourth ACM international conference on Web search and
data mining, ACM, pp. 635-44.
Bao, X., Bergman, L. & Thompson, R. 2009, 'Stacking recommendation engines with additional
meta-features', Proceedings of the third ACM conference on Recommender systems, ACM, pp. 109-16.
Ben-David, S. & Schuller, R. 2003, 'Exploiting task relatedness for multiple task learning', Learning
Theory and Kernel Machines, Springer, pp. 567-80.
Bonilla, E., Chai, K.M. & Williams, C. 2008, 'Multi-task Gaussian process prediction'.
Breese, J.S., Heckerman, D. & Kadie, C. 1998, 'Empirical analysis of predictive algorithms for
collaborative filtering', Proceedings of the Fourteenth Annual Conference on Uncertainty in
Artificial Intelligence (UAI-98), Morgan Kaufmann, San Francisco, CA, pp. 43-52.
Burke, R. 2000, 'Knowledge-based recommender systems', Encyclopedia of library and information
systems, vol. 69, no. Supplement 32, pp. 175-86.
Burke, R. 2002, 'Hybrid recommender systems: Survey and experiments', User Modeling and
User-Adapted Interaction, vol. 12, no. 4, pp. 331-70.
Burke, R. 2007a, 'Hybrid web recommender systems', Springer-Verlag, pp. 377-408.
Burke, R. 2007b, 'Hybrid web recommender systems', The adaptive web, Springer, pp. 377-408.
Cao, B., Liu, N.N. & Yang, Q. 2010, 'Transfer learning for collective link prediction in multiple
heterogenous domains', Proceedings of the 27th International Conference on Machine Learning
(ICML-10), pp. 159-66.
Chatzis, S. 2013, 'Nonparametric bayesian multitask collaborative filtering', Proceedings of the 22nd
ACM international conference on Conference on information & knowledge management, ACM,
pp. 2149-58.
Chen, Z., Meng, X., Zhu, B. & Fowler, R.H. 2000, 'WebSail: from on-line learning to Web search',
Proceedings of the First International Conference on Web Information Systems Engineering,
2000, vol. 1, pp. 206-13.
Dai, W., Yang, Q., Xue, G.-R. & Yu, Y. 2007, 'Boosting for transfer learning', Proceedings of the 24th
international conference on Machine learning, ACM, pp. 193-200.
Dai, W., Yang, Q., Xue, G.-R. & Yu, Y. 2008, 'Self-taught clustering', Proceedings of the 25th
international conference on Machine learning, ACM, pp. 200-7.
Davidson, J., Liebald, B., Liu, J., Nandy, P., Van Vleet, T., Gargi, U., Gupta, S., He, Y., Lambert, M. &
Livingston, B. 2010, 'The YouTube video recommendation system', Proceedings of the fourth
ACM conference on Recommender systems, ACM, pp. 293-6.
Deshpande, M. & Karypis, G. 2004, 'Item-based top-N recommendation algorithms', ACM Trans. Inf.
Syst., vol. 22, no. 1, pp. 143-77.
Evgeniou, A. & Pontil, M. 2007, 'Multi-task feature learning', Advances in neural information processing
systems, vol. 19, p. 41.
Evgeniou, T. & Pontil, M. 2004, 'Regularized multi-task learning', Proceedings of the tenth ACM
SIGKDD international conference on Knowledge discovery and data mining, ACM, pp. 109-17.
Fan, W., Davidson, I., Zadrozny, B. & Yu, P.S. 2005, 'An improved categorization of classifier's
sensitivity on sample selection bias', Data Mining, Fifth IEEE International Conference on, IEEE,
p. 4 pp.
Felfernig, A., Friedrich, G., Jannach, D. & Zanker, M. 2006, 'An Integrated Environment for the
Development of Knowledge-Based Recommender Applications', International Journal of
Electronic Commerce, vol. 11, no. 2, pp. 11-34.
Felfernig, A., Gula, B., Leitner, G., Maier, M., Melcher, R. & Teppan, E. 2008, 'Persuasion in
Knowledge-Based Recommendation', in H. Oinas-Kukkonen, P. Hasle, M. Harjumaa, K.
Segerståhl & P. Øhrstrøm (eds), Persuasive Technology, vol. 5033, Springer Berlin / Heidelberg,
pp. 71-82.
Fernández-Tobías, I., Cantador, I., Kaminskas, M. & Ricci, F. 2012, 'Cross-domain recommender systems:
A survey of the state of the art', Proceedings of the 2nd Spanish Conference on Information
Retrieval. CERI.
Gao, J., Fan, W., Jiang, J. & Han, J. 2008, 'Knowledge transfer via multiple model local structure
mapping', Proceedings of the 14th ACM SIGKDD international conference on Knowledge
discovery and data mining, ACM, pp. 283-91.
Gao, S., Luo, H., Chen, D., Li, S., Gallinari, P. & Guo, J. 2013, 'Cross-Domain Recommendation via
Cluster-Level Latent Factor Model', Machine Learning and Knowledge Discovery in Databases,
Springer, pp. 161-76.
Gilbert, E. & Karahalios, K. 2009, 'Predicting tie strength with social media', Proceedings of the SIGCHI
Conference on Human Factors in Computing Systems, ACM, pp. 211-20.
Goldberg, D., Nichols, D., Oki, B.M. & Terry, D. 1992, 'Using collaborative filtering to
weave an information tapestry', Commun. ACM, vol. 35, no. 12, pp. 61-70.
Gori, M. & Pucci, A. 2007, 'ItemRank: A Random-Walk Based Scoring Algorithm for Recommender
Engines', IJCAI, vol. 7, pp. 2766-71.
Goy, A., Ardissono, L. & Petrone, G. 2007, 'Personalization in E-Commerce Applications', in P.
Brusilovsky, A. Kobsa & W. Nejdl (eds), The Adaptive Web: Methods and Strategies of Web
Personalization, vol. 4321, Springer Berlin / Heidelberg, pp. 485–520.
Gu, Q., Zhou, J. & Ding, C.H. 2010, 'Collaborative Filtering: Weighted Nonnegative Matrix Factorization
Incorporating User and Item Graphs', SDM, SIAM, pp. 199-210.
Guha, R., Kumar, R., Raghavan, P. & Tomkins, A. 2004, 'Propagation of trust and distrust', Proceedings
of the 13th international conference on World Wide Web, ACM, pp. 403-12.
Herlocker, J.L., Konstan, J.A., Borchers, A. & Riedl, J. 1999, 'An algorithmic framework for performing
collaborative filtering', Proceedings of the 22nd annual international ACM SIGIR conference on
Research and development in information retrieval, ACM, Berkeley, California, United States, pp.
230-7.
Hofmann, T. 2004, 'Latent semantic models for collaborative filtering', ACM Transactions on Information
Systems (TOIS), vol. 22, no. 1, pp. 89-115.
Huang, J., Gretton, A., Borgwardt, K.M., Schölkopf, B. & Smola, A.J. 2006, 'Correcting sample selection
bias by unlabeled data', Advances in neural information processing systems, pp. 601-8.
Huang, Z., Zeng, D. & Chen, H. 2007, 'A comparison of collaborative-filtering recommendation
algorithms for e-commerce', Intelligent Systems, IEEE, vol. 22, no. 5, pp. 68-78.
Jakob, N., Weber, S.H., Müller, M.C. & Gurevych, I. 2009, 'Beyond the stars: exploiting free-text user
reviews to improve the accuracy of movie recommendations', Proceedings of the 1st international
CIKM workshop on Topic-sentiment analysis for mass opinion, ACM, pp. 57-64.
Jamali, M. & Ester, M. 2009a, 'TrustWalker: a random walk model for combining trust-based and
item-based recommendation', Proceedings of the 15th ACM SIGKDD international conference on
Knowledge discovery and data mining, ACM, pp. 397-406.
Jamali, M. & Ester, M. 2009b, 'Using a trust network to improve top-N recommendation', Proceedings of
the third ACM conference on Recommender systems, ACM, pp. 181-8.
Jamali, M. & Ester, M. 2010, 'A matrix factorization technique with trust propagation for
recommendation in social networks', Proceedings of the fourth ACM conference on
Recommender systems, ACM, pp. 135-42.
Jia, R., Jin, M. & Liu, C. 2010, 'A new clustering method for collaborative filtering', 2010 International
Conference on Networking and Information Technology (ICNIT), pp. 488-92.
Jiang, J. 2008, 'A literature survey on domain adaptation of statistical classifiers', URL:
http://sifaka.cs.uiuc.edu/jiang4/domainadaptation/survey.
Jiang, J. & Zhai, C. 2007, 'Instance weighting for domain adaptation in NLP', ACL, vol. 7, Citeseer, pp.
264-71.
Kamishima, T., Hamasaki, M. & Akaho, S. 2009, 'TrBagg: A simple transfer learning method and its
application to personalization in collaborative tagging', Data Mining, 2009. ICDM'09. Ninth
IEEE International Conference on, IEEE, pp. 219-28.
Kim, B., Li, Q., Park, C., Kim, S. & Kim, J. 2006, 'A new approach for combining content-based and
collaborative filters', Journal of Intelligent Information Systems, vol. 27, no. 1, pp. 79-91.
Kleinberg, J. & Sandler, M. 2008, 'Using mixture models for collaborative filtering', Journal of Computer
and System Sciences, vol. 74, no. 1, pp. 49-69.
Koenigstein, N., Dror, G. & Koren, Y. 2011, 'Yahoo! music recommendations: modeling music ratings
with temporal dynamics and item taxonomy', Proceedings of the fifth ACM conference on
Recommender systems, ACM, pp. 165-72.
Konstas, I., Stathopoulos, V. & Jose, J.M. 2009, 'On social networks and collaborative recommendation',
Proceedings of the 32nd international ACM SIGIR conference on Research and development in
information retrieval, ACM, pp. 195-202.
Koren, Y., Bell, R. & Volinsky, C. 2009, 'Matrix factorization techniques for recommender systems',
Computer, vol. 42, no. 8, pp. 30-7.
Kurashima, T., Iwata, T., Irie, G. & Fujimura, K. 2010, 'Travel route recommendation using geotags in
photo sharing sites', Proceedings of the 19th ACM international conference on Information and
knowledge management, ACM, pp. 579-88.
Kwak, H., Lee, C., Park, H. & Moon, S. 2010, 'What is Twitter, a social network or a news media?',
Proceedings of the 19th international conference on World wide web, ACM, pp. 591-600.
Lawrence, N.D. & Platt, J.C. 2004, 'Learning to learn with the informative vector machine', Proceedings
of the twenty-first international conference on Machine learning, ACM, p. 65.
Lee, S.-I., Chatalbashev, V., Vickrey, D. & Koller, D. 2007, 'Learning a meta-level prior for feature
relevance from multiple related tasks', Proceedings of the 24th international conference on
Machine learning, ACM, pp. 489-96.
Leskovec, J., Huttenlocher, D. & Kleinberg, J. 2010, 'Signed networks in social media', Proceedings of
the SIGCHI Conference on Human Factors in Computing Systems, ACM, pp. 1361-70.
Levi, A., Mokryn, O., Diot, C. & Taft, N. 2012, 'Finding a needle in a haystack of reviews: cold start
context-based hotel recommender system', Proceedings of the sixth ACM conference on
Recommender systems, ACM, pp. 115-22.
Li, B. 2011, 'Cross-domain collaborative filtering: A brief survey', Tools with Artificial Intelligence
(ICTAI), 2011 23rd IEEE International Conference on, IEEE, pp. 1085-6.
Li, B., Yang, Q. & Xue, X. 2009a, 'Can Movies and Books Collaborate? Cross-Domain Collaborative
Filtering for Sparsity Reduction', IJCAI, vol. 9, pp. 2052-7.
Li, B., Yang, Q. & Xue, X. 2009b, 'Transfer learning for collaborative filtering via a rating-matrix
generative model', Proceedings of the 26th Annual International Conference on Machine
Learning, ACM, pp. 617-24.
Li, Y., Hu, J., Zhai, C. & Chen, Y. 2010, 'Improving one-class collaborative filtering by incorporating
rich user information', Proceedings of the 19th ACM international conference on Information and
knowledge management, ACM, pp. 959-68.
Liao, X., Xue, Y. & Carin, L. 2005, 'Logistic regression with an auxiliary data source', Proceedings of the
22nd international conference on Machine learning, ACM, pp. 505-12.
Liben‐Nowell, D. & Kleinberg, J. 2007, 'The link‐prediction problem for social networks', Journal of the
American society for information science and technology, vol. 58, no. 7, pp. 1019-31.
Linden, G., Smith, B. & York, J. 2003, 'Amazon.com recommendations: item-to-item collaborative
filtering', Internet Computing, IEEE, vol. 7, no. 1, pp. 76-80.
Liu, B. 2012, 'Sentiment analysis and opinion mining', Synthesis Lectures on Human Language
Technologies, vol. 5, no. 1, pp. 1-167.
Lu, X., Wang, C., Yang, J.-M., Pang, Y. & Zhang, L. 2010, 'Photo2trip: generating travel routes from
geo-tagged photos for trip planning', Proceedings of the international conference on Multimedia,
ACM, pp. 143-52.
Lu, Z., Pan, W., Xiang, E.W., Yang, Q., Zhao, L. & Zhong, E. 2013, 'Selective transfer learning for cross
domain recommendation', SDM, SIAM, pp. 641-9.
Luo, J., Joshi, D., Yu, J. & Gallagher, A. 2011, 'Geotagging in multimedia and computer vision—a
survey', Multimedia Tools and Applications, vol. 51, no. 1, pp. 187-211.
Ma, H., Lyu, M.R. & King, I. 2009, 'Learning to recommend with trust and distrust relationships',
Proceedings of the third ACM conference on Recommender systems, ACM, pp. 189-96.
Ma, H., Yang, H., Lyu, M.R. & King, I. 2008, 'Sorec: social recommendation using probabilistic matrix
factorization', Proceedings of the 17th ACM conference on Information and knowledge
management, ACM, pp. 931-40.
Markellou, P., Mousourouli, I., Sirmakessis, S. & Tsakalidis, A. 2005, 'Personalized e-commerce
recommendations', paper presented to the IEEE International Conference on e-Business
Engineering, 2005. ICEBE 2005., 12-18 Oct. 2005.
Massa, P. & Avesani, P. 2007, 'Trust-aware recommender systems', Proceedings of the 2007 ACM
conference on Recommender systems, ACM, pp. 17-24.
Mihalkova, L., Huynh, T. & Mooney, R.J. 2007, 'Mapping and revising Markov logic networks for
transfer learning', AAAI, vol. 7, pp. 608-14.
Mihalkova, L. & Mooney, R.J. 2008, 'Transfer learning by mapping with minimal target data',
Proceedings of the AAAI-08 workshop on transfer learning for complex tasks.
Moreno, O., Shapira, B., Rokach, L. & Shani, G. 2012, 'Talmud: transfer learning for multiple domains',
Proceedings of the 21st ACM international conference on Information and knowledge
management, ACM, pp. 425-34.
Moshfeghi, Y., Piwowarski, B. & Jose, J.M. 2011, 'Handling data sparsity in collaborative filtering using
emotion and semantic based features', Proceedings of the 34th international ACM SIGIR
conference on Research and development in Information Retrieval, ACM, pp. 625-34.
Pan, S.J., Kwok, J.T. & Yang, Q. 2008, 'Transfer Learning via Dimensionality Reduction', AAAI, vol. 8,
pp. 677-82.
Pan, S.J., Tsang, I.W., Kwok, J.T. & Yang, Q. 2011, 'Domain adaptation via transfer component analysis',
Neural Networks, IEEE Transactions on, vol. 22, no. 2, pp. 199-210.
Pan, S.J. & Yang, Q. 2010, 'A survey on transfer learning', Knowledge and Data Engineering, IEEE
Transactions on, vol. 22, no. 10, pp. 1345-59.
Pan, W., Liu, N.N., Xiang, E.W. & Yang, Q. 2011, 'Transfer learning to predict missing ratings via
heterogeneous user feedbacks', IJCAI Proceedings-International Joint Conference on Artificial
Intelligence, vol. 22, p. 2318.
Pan, W. & Ming, Z. 2014, 'Interaction-Rich Transfer Learning for Collaborative Filtering with
Heterogeneous User Feedbacks', IEEE Intelligent Systems, p. 1.
Pan, W., Xiang, E.W., Liu, N.N. & Yang, Q. 2010, 'Transfer Learning in Collaborative Filtering for
Sparsity Reduction', AAAI, vol. 10, pp. 230-5.
Pan, W., Xiang, E.W. & Yang, Q. 2012, 'Transfer Learning in Collaborative Filtering with Uncertain
Ratings', AAAI.
Pang, B., Lee, L. & Vaithyanathan, S. 2002, 'Thumbs up?: sentiment classification using machine
learning techniques', Proceedings of the ACL-02 conference on Empirical methods in natural
language processing-Volume 10, Association for Computational Linguistics, pp. 79-86.
Papagelis, M. & Plexousakis, D. 2005, 'Qualitative analysis of user-based and item-based prediction
algorithms for recommendation agents', Engineering Applications of Artificial Intelligence, vol.
18, no. 7, pp. 781-9.
Pazzani, M. & Billsus, D. 2007, 'Content-Based Recommendation Systems', in P. Brusilovsky, A. Kobsa
& W. Nejdl (eds), The adaptive web, vol. 4321, Springer Berlin / Heidelberg, pp. 325-41.
Ponomareva, N. & Thelwall, M. 2013, 'Semi-supervised vs. Cross-domain Graphs for Sentiment
Analysis', RANLP, pp. 571-8.
Raina, R., Battle, A., Lee, H., Packer, B. & Ng, A.Y. 2007, 'Self-taught learning: transfer learning from
unlabeled data', Proceedings of the 24th international conference on Machine learning, ACM, pp.
759-66.
Rendle, S. 2012, 'Factorization machines with libFM', ACM Transactions on Intelligent Systems and
Technology (TIST), vol. 3, no. 3, p. 57.
Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P. & Riedl, J. 1994, 'GroupLens: an open architecture
for collaborative filtering of netnews', Proceedings of the 1994 ACM conference on Computer
supported cooperative work, ACM, Chapel Hill, North Carolina, United States, pp. 175-86.
Resnick, P. & Varian, H.R. 1997, 'Recommender systems', Commun. ACM, vol. 40, no. 3, pp. 56-8.
Robu, V., Halpin, H. & Shepherd, H. 2009, 'Emergence of consensus and shared vocabularies in
collaborative tagging systems', ACM Transactions on the Web (TWEB), vol. 3, no. 4, p. 14.
Roy, S.D., Mei, T., Zeng, W. & Li, S. 2012, 'Socialtransfer: cross-domain transfer learning from social
streams for media applications', Proceedings of the 20th ACM international conference on
Multimedia, ACM, pp. 649-58.
Rückert, U. & Kramer, S. 2008, 'Kernel-based inductive transfer', Machine Learning and Knowledge
Discovery in Databases, Springer, pp. 220-33.
Sarwar, B., Karypis, G., Konstan, J. & Riedl, J. 2001, 'Item-based collaborative filtering recommendation
algorithms', ACM, pp. 285-95.
Schafer, J.B., Frankowski, D., Herlocker, J. & Sen, S. 2007, 'Collaborative filtering recommender
systems', The adaptive web, Springer, pp. 291-324.
Schafer, J.B., Konstan, J. & Riedl, J. 1999, 'Recommender systems in e-commerce', Proceedings of the
1st ACM conference on Electronic commerce, ACM, pp. 158-66.
Sen, S., Lam, S.K., Rashid, A.M., Cosley, D., Frankowski, D., Osterhouse, J., Harper, F.M. & Riedl, J.
2006, 'Tagging, communities, vocabulary, evolution', Proceedings of the 2006 20th anniversary
conference on Computer supported cooperative work, ACM, pp. 181-90.
Sen, S., Vig, J. & Riedl, J. 2009, 'Tagommenders: connecting users to items through tags', Proceedings of
the 18th international conference on World wide web, ACM, pp. 671-80.
Shapira, B., Rokach, L. & Freilikhman, S. 2013, 'Facebook single and cross domain data for
recommendation systems', User Modeling and User-Adapted Interaction, vol. 23, no. 2-3, pp.
211-47.
Shardanand, U. & Maes, P. 1995, 'Social information filtering: algorithms for automating “word of
mouth”', paper presented to the Proceedings of the SIGCHI conference on Human factors
in computing systems, Denver, Colorado, United States.
Shi, Y., Larson, M. & Hanjalic, A. 2011, 'Tags as bridges between domains: Improving recommendation
with tag-induced cross-domain collaborative filtering', User Modeling, Adaption and
Personalization, Springer, pp. 305-16.
Shi, Y., Larson, M. & Hanjalic, A. 2013, 'Mining contextual movie similarity with matrix factorization
for context-aware recommendation', ACM Transactions on Intelligent Systems and Technology
(TIST), vol. 4, no. 1, p. 16.
Shi, Y., Larson, M. & Hanjalic, A. 2014, 'Collaborative filtering beyond the user-item matrix: A survey of
the state of the art and future challenges', ACM Computing Surveys (CSUR), vol. 47, no. 1, p. 3.
Si, L. & Jin, R. 2003, 'Flexible mixture model for collaborative filtering', ICML, vol. 3, pp. 704-11.
Singh, A.P. & Gordon, G.J. 2008, 'Relational learning via collective matrix factorization', Proceedings of
the 14th ACM SIGKDD international conference on Knowledge discovery and data mining,
ACM, pp. 650-8.
Smyth, B. 2007, 'Case-Based Recommendation', in P. Brusilovsky, A. Kobsa & W. Nejdl (eds), The
adaptive web, vol. 4321, Springer Berlin / Heidelberg, pp. 342-76.
Tang, J., Hu, X., Gao, H. & Liu, H. 2013, 'Exploiting local and global social context for recommendation',
Proceedings of the Twenty-Third international joint conference on Artificial Intelligence, AAAI
Press, pp. 2712-8.
Tiroshi, A., Berkovsky, S., Kaafar, M.A., Chen, T. & Kuflik, T. 2013, 'Cross social networks interests
predictions based on graph features', Proceedings of the 7th ACM conference on Recommender
systems, ACM, pp. 319-22.
Tong, H., Faloutsos, C. & Pan, J.-Y. 2006, 'Fast random walk with restart and its applications'.
Tso-Sutter, K.H., Marinho, L.B. & Schmidt-Thieme, L. 2008, 'Tag-aware recommender systems by
fusion of collaborative filtering algorithms', Proceedings of the 2008 ACM symposium on Applied
computing, ACM, pp. 1995-9.
Wang, C. & Mahadevan, S. 2008, 'Manifold alignment using Procrustes analysis', Proceedings of the 25th
international conference on Machine learning, ACM, pp. 1120-7.
Wang, Z., Song, Y. & Zhang, C. 2008, 'Transferred dimensionality reduction', Machine learning and
knowledge discovery in databases, Springer, pp. 550-65.
Wu, P. & Dietterich, T.G. 2004, 'Improving SVM accuracy by training on auxiliary data sources',
Proceedings of the twenty-first international conference on Machine learning, ACM, p. 110.
Wu, Q., Tan, S. & Cheng, X. 2009, 'Graph ranking for sentiment transfer', Proceedings of the
ACL-IJCNLP 2009 Conference Short Papers, Association for Computational Linguistics, pp. 317-20.
Yildirim, H. & Krishnamoorthy, M.S. 2008, 'A random walk method for alleviating the sparsity problem
in collaborative filtering', Proceedings of the 2008 ACM conference on Recommender systems,
ACM, pp. 131-8.
Zhang, Y., Cao, B. & Yeung, D.-Y. 2010, 'Multi-domain collaborative filtering', Proceedings of the 3rd
ACM Conference on Recommender Systems, pp. 725-32.
Zhen, Y., Li, W.-J. & Yeung, D.-Y. 2009, 'TagiCoFi: tag informed collaborative filtering', Proceedings of
the third ACM conference on Recommender systems, ACM, pp. 69-76.
Zheng, Y.-T., Zha, Z.-J. & Chua, T.-S. 2012, 'Mining travel patterns from geotagged photos', ACM
Transactions on Intelligent Systems and Technology (TIST), vol. 3, no. 3, p. 56.