Data Mining on Big Data
for Music Recommender Systems
Advisor(s): Fabrice Muhlenbach (UJM, LaHC), Pierre-René Lhérisson (UJM, LaHC /
1Dlab), and Pierre Maret (UJM, LaHC)
Mail: [email protected], [email protected],
[email protected]
Location: Laboratoire Hubert Curien
Team: Connected Intelligence
Summary In today’s world, many goods and services are provided to consumers
through web applications. Given the plethora of information available in e-commerce, it is
necessary to filter this information to keep only the items that might be relevant to the user.
Recommender systems are automated systems of this kind: they can be defined as software tools and
techniques that provide suggestions for items that are most likely of interest to a particular
user [1]. In the cultural field, as in music recommendation [2], using those systems raises the
question of diversity, novelty, and discovery [3]. Human beings are fond of stability, but
they are not against breaking their routine and exploring things outside their comfort zone. In this
context, it is relevant to propose new items that are not too similar to items already used or bought by
the users, in order to expand and enrich their cultural knowledge. This can be done
with a dissimilarity measure computed between cultural items. Moreover, a few years ago,
with the emergence of the word embedding paradigm [4] and the ability to run algorithms on
big data, content-based approaches [5] experienced a resurgence of interest in
recommender systems [6, 7].
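As a minimal sketch of the dissimilarity idea above, assume each cultural item is already represented as a numeric vector. The function names, the choice of cosine dissimilarity, and the thresholds `low` and `high` are illustrative assumptions, not part of the proposal. The sketch keeps candidates that are neither near-duplicates of known items nor too far from them.

```python
import math


def cosine_dissimilarity(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: a zero vector is treated as fully dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def novel_candidates(known, candidates, low=0.2, high=0.6):
    """Keep candidates whose dissimilarity to the nearest known item
    lies in [low, high]; both thresholds are hypothetical values."""
    kept = []
    for name, vec in candidates.items():
        nearest = min(cosine_dissimilarity(vec, k) for k in known)
        if low <= nearest <= high:
            kept.append((name, nearest))
    # Least dissimilar (safest discoveries) first.
    return sorted(kept, key=lambda pair: pair[1])


# Toy data: one known item and three candidates in a 2-dimensional space.
known_items = [[1.0, 0.0]]
candidate_items = {
    "near": [1.0, 0.1],  # almost identical to the known item
    "mid": [1.0, 1.0],   # moderately different
    "far": [0.0, 1.0],   # orthogonal to the known item
}
selected = novel_candidates(known_items, candidate_items)
```

In this toy setting only the moderately different item survives the filter. In practice the band would have to be tuned on real listening data.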
The objective of this master thesis is to study how data mining techniques can help
improve the quality of music recommendations on a streaming platform when the
content associated with music artists is not well structured. This is the case for emerging music
artists from independent record labels (“indie labels”): the artists are not publicly known,
they have little or no publicity, most of them do not have a website with useful
structured content to exploit (e.g., a Wikipedia page), there is very little chance of finding
articles on these music artists in the specialized music press, etc.
This project will be carried out at the Hubert Curien Laboratory (Saint-Etienne, France) on the
data of a real music streaming platform called “1D lab”, developed by the social start-up
company 1D Lab (http://en.1d-lab.eu/).
Expected results
• Theoretical: Data mining techniques used for improving the quality of a music content-based recommender system model.
• Practical:
– Implementation of word embedding techniques (in R/Python) on artist descriptions for improving the similarity measure between music artists.
– Application of this similarity between a list of listened music artists and candidate
music items for recommendation.
– Evaluation of the recommendation relevance with a top-N recommendation protocol when the items are not so popular, like indie music artists [8].
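The practical steps listed above can be sketched as follows. This is only an illustration under stated assumptions: `toy_vectors` stands in for word vectors that would come from a trained embedding model, averaging word vectors is one simple way to embed a description, and `hit_at_n` is a simplified stand-in for a top-N evaluation, not the protocol of [8]. All names are hypothetical.

```python
import math


def embed(text, word_vectors):
    """Average the vectors of the known words in a description.
    Returns None when no word of the description has a vector."""
    vectors = [word_vectors[w] for w in text.lower().split() if w in word_vectors]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_n(listened, candidates, word_vectors, n):
    """Rank candidate artists by their best similarity to any listened artist."""
    profile = [e for e in (embed(d, word_vectors) for d in listened) if e is not None]
    scored = []
    for artist, description in candidates.items():
        vec = embed(description, word_vectors)
        if vec is None or not profile:
            continue  # no usable text: the artist cannot be scored
        scored.append((max(cosine(vec, p) for p in profile), artist))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [artist for _, artist in scored[:n]]


def hit_at_n(recommended, held_out):
    """Return 1 if the held-out artist appears in the top-N list, else 0."""
    return 1 if held_out in recommended else 0


# Toy 2-dimensional word vectors (made-up values, for illustration only).
toy_vectors = {
    "folk": [1.0, 0.0],
    "acoustic": [0.9, 0.1],
    "techno": [0.0, 1.0],
    "electronic": [0.1, 0.9],
}
listened_descriptions = ["acoustic folk"]
candidate_descriptions = {"artist_a": "folk", "artist_b": "techno electronic"}
ranking = top_n(listened_descriptions, candidate_descriptions, toy_vectors, n=1)
```

Averaging hits over many users, each with one held-out artist, gives a hit rate at N. Restricting the held-out artists to rarely played ones would focus the measurement on unpopular items.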
Keywords: data mining, recommender system, music recommendation, content-based recommendation, word embedding, top-N recommendations
References
[1] F. Ricci, L. Rokach, and B. Shapira, “Recommender systems: Introduction and challenges,” in Recommender Systems Handbook, F. Ricci, L. Rokach, and B. Shapira,
Eds. Springer, 2015, pp. 1–34. [Online]. Available:
http://dx.doi.org/10.1007/978-1-4899-7637-6_1
[2] M. Schedl, P. Knees, B. McFee, D. Bogdanov, and M. Kaminskas, “Music recommender
systems,” in Recommender Systems Handbook, F. Ricci, L. Rokach, and B. Shapira,
Eds. Springer, 2015, pp. 453–492. [Online]. Available:
http://dx.doi.org/10.1007/978-1-4899-7637-6_13
[3] E. Pariser, The Filter Bubble: What the Internet Is Hiding from You. New York:
Penguin Press, 2011.
[4] T. Mikolov, Q. V. Le, and I. Sutskever, “Exploiting similarities among languages for
machine translation,” CoRR, vol. abs/1309.4168, 2013. [Online]. Available:
http://arxiv.org/abs/1309.4168
[5] P. Lops, M. de Gemmis, and G. Semeraro, “Content-based recommender systems:
State of the art and trends,” in Recommender Systems Handbook, F. Ricci, L. Rokach,
B. Shapira, and P. B. Kantor, Eds. Springer, 2011, pp. 73–105. [Online]. Available:
http://dx.doi.org/10.1007/978-0-387-85820-3_3
[6] J. Manotumruksa, C. MacDonald, and I. Ounis, “Modelling user preferences using
word embeddings for context-aware venue recommendation,” CoRR, vol. abs/1606.07828,
2016. [Online]. Available: http://arxiv.org/abs/1606.07828
[7] M. G. Ozsoy, “From word embeddings to item recommendation,” CoRR, vol.
abs/1601.01356, 2016. [Online]. Available: http://arxiv.org/abs/1601.01356
[8] P. Cremonesi, P. Garza, E. Quintarelli, and R. Turrin, “Top-N recommendations on
unpopular items with contextual knowledge,” in Proceedings of the 3rd International
Workshop on Context-Aware Recommender Systems, CARS-2011, October 23, 2011,
Chicago, Illinois, USA, 2011, 5 pages. [Online]. Available:
http://ceur-ws.org/Vol-791/paper1.pdf