Overview of KDDCUP 2011
Nathan Liu
[email protected]
KDDCUP 2011 Music Recommendation
• KDDCUP is the most prominent data mining competition.
• In recent years, there have been a number of contests related to movie recommendation:
– Netflix 2006: predict future ratings
– KDDCUP 2007: how many ratings and who rated what
– CAMRA 2010: context-aware movie recommendation
• KDDCUP 2011 is organized by Yahoo! and provides the first and largest music ratings dataset.
Yahoo Music
KDDCUP 2011
• There are three types of items: songs, artists, and albums.
• Songs and albums are annotated with genres.
• You are given the date, time, and score of each user’s ratings of these different items.
• Challenges:
– Scale: the largest public ratings dataset to date: 1 million users, 0.6 million items, 300 million ratings
– Hierarchical item relations: songs belong to albums, albums belong to artists, and all of them are annotated with genre tags.
– Rich metadata: over 900 genres
– Fine temporal resolution: no previous challenge provided time in addition to date.
• For the project, you will be provided with a small subset of the data, and we will hold a mini internal competition to determine which group obtains the best results.
KDDCUP 2011: Task 1
• The test set consists of held-out ratings from users in the training set. Each rating is time-stamped.
• In the test set, you are given who rated which items at what time.
• You are asked to predict the rating scores (a minimal matrix factorization sketch follows the references).
• Closely related to the Netflix competition, but may require modeling time-of-day effects.
• References:
– Koren. Matrix Factorization Techniques for Recommender Systems (IEEE Computer 2009)
– Koren. Collaborative Filtering with Temporal Dynamics (KDD’09)
– Xiong. Time-Evolving Collaborative Filtering (SDM’10)
– Liu. Online Evolutionary Collaborative Filtering (RECSYS’10)
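The baseline approach for Task 1 is the biased matrix factorization described in the Koren (2009) reference above. Below is a minimal sketch of such a model trained with stochastic gradient descent; the function names, hyperparameter values, and data layout are illustrative assumptions, not part of the provided starter code.

import numpy as np

def train_biased_mf(ratings, n_users, n_items, n_factors=20,
                    lr=0.005, reg=0.02, n_epochs=10):
    """Biased matrix factorization trained with SGD (sketch).

    `ratings` is a list of (user, item, score) triples with 0-based
    integer ids; scores are on the Yahoo! Music 0-100 scale.
    """
    mu = float(np.mean([r for _, _, r in ratings]))  # global mean rating
    bu = np.zeros(n_users)                           # user biases
    bi = np.zeros(n_items)                           # item biases
    P = 0.1 * np.random.randn(n_users, n_factors)    # user latent factors
    Q = 0.1 * np.random.randn(n_items, n_factors)    # item latent factors

    for _ in range(n_epochs):
        for u, i, r in ratings:
            pred = mu + bu[u] + bi[i] + P[u] @ Q[i]
            err = r - pred
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            pu, qi = P[u].copy(), Q[i].copy()        # keep old values for the joint update
            P[u] += lr * (err * qi - reg * pu)
            Q[i] += lr * (err * pu - reg * qi)
    return mu, bu, bi, P, Q

def predict_rating(model, u, i):
    """Predict a single (user, item) score from the learned parameters."""
    mu, bu, bi, P, Q = model
    return mu + bu[u] + bi[i] + P[u] @ Q[i]

The temporal-dynamics references above extend this prediction rule with time-dependent bias terms; a time-of-day variant is sketched at the end of this document.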
KDDCUP 2011: Task 2
• The test set consists of held-out ratings from users in the training set. Time has been removed.
• In the test set, you are given 6 items for each user.
• You are asked to predict which 3 of the 6 were actually rated by the user (see the ranking sketch after the references).
• Closely related to the KDDCUP 2007 “who rated what” task and the CAMRA 2010 weekly recommendation track.
• References:
– Hu. Collaborative Filtering for Implicit Feedback Datasets (ICDM’08)
– Rendle. Bayesian Personalized Ranking from Implicit Feedback (UAI’09)
– Cremonesi. Performance of Recommender Algorithms on Top-N
Recommendation Tasks (RECSYS’10)
– Steck. Training and Testing of Recommender Systems on Data Missing
Not at Random (KDD’10)
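Because exactly 3 of each user’s 6 candidates were rated, any scoring model can be turned into a Task 2 predictor by ranking the 6 candidates and labeling the top 3 as rated. Below is a minimal sketch under that assumption; the function names and the popularity baseline are illustrative, and the scores could equally come from the matrix factorization sketched above or from the implicit-feedback methods referenced here.

import numpy as np

def predict_rated(candidates, score_fn):
    """Label the 3 highest-scoring of each user's 6 candidates as rated.

    `candidates` maps user id -> list of 6 item ids;
    `score_fn(u, i)` returns any real-valued preference score.
    Returns user id -> list of 0/1 labels aligned with the candidate order.
    """
    labels = {}
    for u, items in candidates.items():
        scores = np.array([score_fn(u, i) for i in items])
        top3 = set(np.argsort(-scores)[:3])   # indices of the 3 largest scores
        labels[u] = [1 if k in top3 else 0 for k in range(len(items))]
    return labels

# Naive baseline: score candidates by overall item popularity.
def popularity_scorer(item_counts):
    return lambda u, i: item_counts.get(i, 0)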
For The Project
• We will extract a subset for you to work on.
• We will provide some basic algorithms.
• You can choose to work on one of the two
tasks.
• The minimum requirement is to run thorough experiments with the provided algorithms and write a report on your findings about the different algorithms.
• There are also new things to try….
Things to Try (1): Ensemble
• Same algorithm with different parameter settings
• Different algorithms
• Stacking (see the blending sketch after the references):
– Which meta-learner? Gradient boosted decision trees, linear regression
– Any meta-features? Tail vs. head segmentation strategy
• References:
– Bao et al. Stacking Recommendation Engines with Additional Meta-Features (RECSYS’09)
– Jahrer et al. Combining Predictions for Accurate Recommender Systems (KDD’10)
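As a concrete example of stacking with a linear-regression meta-learner, the sketch below blends the predictions of several base recommenders on a held-out set using scikit-learn’s Ridge regression; the matrix layout and the choice of Ridge are assumptions for illustration.

import numpy as np
from sklearn.linear_model import Ridge

def fit_blend(base_preds, true_ratings, alpha=1.0):
    """Fit a linear meta-learner on held-out data.

    `base_preds` is an (n_ratings, n_models) array whose columns are the
    predictions of the individual recommenders on the same held-out set;
    `true_ratings` contains the corresponding observed scores.
    """
    blender = Ridge(alpha=alpha)
    blender.fit(base_preds, true_ratings)
    return blender

def blend(blender, base_preds_test):
    """Combine the base models' test-set predictions with the learned weights."""
    return blender.predict(base_preds_test)

Meta-features such as a user’s head vs. tail segment can be appended as extra columns of base_preds.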
Things to Try (2): Exploiting Item Relations and Genres
• From social networks of users to networks of items.
• Combine collaborative filtering with genre-based prediction to alleviate sparseness (a back-off sketch follows the references).
• References:
– Ma. Recommender Systems with Social Regularization
(WSDM’11)
– Agarwal. Regression based Latent Factor Models (KDD’09)
– Popescul. Probabilistic Models for Unified Collaborative and Content-Based Recommendation in Sparse-Data Environments (UAI’01)
– Gunawardana. Tied Boltzmann Machines for Cold Start Recommendations (RecSys’08)
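One simple way to exploit the song-album-artist hierarchy is to back off to album, artist, or global averages when a song has too few ratings. The sketch below illustrates such a back-off for the item-mean component of a predictor; the dictionary layout is an assumption about how the hierarchy might be stored, not the official data format.

def hierarchical_item_mean(item_id, item_means, item_counts,
                           album_of, artist_of, album_means, artist_means,
                           global_mean, min_ratings=10):
    """Item-level mean rating with back-off up the song -> album -> artist
    hierarchy when the item itself is too sparse."""
    if item_counts.get(item_id, 0) >= min_ratings:
        return item_means[item_id]
    album = album_of.get(item_id)           # parent album, if any
    if album is not None and album in album_means:
        return album_means[album]
    artist = artist_of.get(item_id)         # parent artist, if any
    if artist is not None and artist in artist_means:
        return artist_means[artist]
    return global_mean

Genre tags can be handled the same way, or fed into a regression-based latent factor model as in the Agarwal (KDD’09) reference.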
Things to Try (3): Temporal Dynamics
• Various possible types of temporal dynamics (a time-of-day bias sketch follows the reference):
– Long-term effect: people getting pickier over time
– Short-term effect: festival mood
– Time-of-day effect: daytime vs. nighttime preference
– Periodicity: every Friday night is party time
• References:
– Koren. Collaborative Filtering with Temporal Dynamics (KDD’09)
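The simplest way to capture a time-of-day effect is an extra bias term indexed by a coarse hour bucket, added to the biased matrix factorization prediction from the Task 1 sketch. Below is a minimal illustration; the bucketing, array names, and the assumption that the bias is user-specific are all illustrative choices.

N_BUCKETS = 4  # e.g. night, morning, afternoon, evening

def time_bucket(hour):
    """Map an hour of day (0-23) to one of N_BUCKETS coarse buckets."""
    return hour * N_BUCKETS // 24

def predict_with_tod(mu, bu, bi, P, Q, b_ut, u, i, hour):
    """Baseline prediction plus a user-specific time-of-day bias.

    `b_ut` is an (n_users, N_BUCKETS) NumPy array of learned offsets; the
    other parameters are the biased-MF parameters from the Task 1 sketch.
    """
    return mu + bu[u] + bi[i] + b_ut[u, time_bucket(hour)] + P[u] @ Q[i]

The extra bias can be trained with the same SGD updates used for the user and item biases.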