Music Recommendation: A Data Mining Approach - Graph-RAT

Music Recommendation
A Data Mining Approach
Daniel McEnnis
2nd year PhD
Overview
 High level overview
 Toolkit Improvements
 Experiments
 Evaluation
 Algorithms research
 Data
 Future work
Project Goals
 Integrate social information
 Make algorithms ‘culturally aware’
 Implement existing algorithms
 Systematic evaluation framework
Similarity Algorithms
 Create new relations based on some aspect of similarity
 6 different varieties of similarity
 Each algorithm can use one of 6 distance functions
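As a rough illustration of the idea on this slide, the sketch below builds new "similar to" relations between actors from a pluggable distance function. It is not Graph-RAT's API: the names `cosine_distance`, `similarity_relations`, the `threshold` parameter, and the sample profiles are all assumptions made for the example, and cosine distance is only one plausible choice of distance function.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an all-zero vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def similarity_relations(profiles, distance=cosine_distance, threshold=0.2):
    """Create (actor_a, actor_b, distance) relations for pairs within threshold.

    `profiles` maps an actor id to a feature vector (hypothetical layout).
    """
    relations = []
    for a, b in combinations(sorted(profiles), 2):
        d = distance(profiles[a], profiles[b])
        if d <= threshold:
            relations.append((a, b, d))
    return relations


# Hypothetical listening profiles: play counts over three artists.
profiles = {
    "alice": [5, 0, 1],
    "bob": [4, 0, 1],
    "carol": [0, 3, 0],
}
links = similarity_relations(profiles)
```

Swapping in a different distance function only requires passing another callable as `distance`, which mirrors the slide's point that each similarity algorithm can be paired with any of several distance functions.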
Aggregator Algorithms
 Takes data from one set of actors and moves it to another
 6 different varieties
 Each variety uses one of 7 aggregator functions
 Basic building block of Graph-RAT applications
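A minimal sketch of what an aggregator does, assuming a simple dictionary-based data layout rather than Graph-RAT's own classes: tag weights attached to songs are moved onto the users who listened to them, and the aggregator function (`sum` or a hypothetical `mean` here) decides how multiple values combine.

```python
from collections import defaultdict


def aggregate(links, source_values, combine=sum):
    """Move properties from source actors onto target actors.

    links: iterable of (target, source) pairs, e.g. (user, song).
    source_values: dict of source -> {property: number}.
    combine: aggregator function applied to each property's list of values.
    """
    collected = defaultdict(lambda: defaultdict(list))
    for target, source in links:
        for prop, value in source_values.get(source, {}).items():
            collected[target][prop].append(value)
    return {
        target: {prop: combine(values) for prop, values in props.items()}
        for target, props in collected.items()
    }


def mean(values):
    """Arithmetic mean, used as an alternative aggregator function."""
    return sum(values) / len(values)


# Hypothetical data: tag weights per song and who listened to what.
song_tags = {"song1": {"jazz": 1.0, "swing": 0.5}, "song2": {"jazz": 0.5}}
listens = [("alice", "song1"), ("alice", "song2"), ("bob", "song2")]

by_sum = aggregate(listens, song_tags)
by_mean = aggregate(listens, song_tags, combine=mean)
```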
Graph Triples Census
 Probable novel algorithm
 Proof of Correctness Completed
 Proof of Time Complexity Completed
 Literature review in progress
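The slides do not describe the new algorithm itself, so the sketch below is only the naive textbook baseline for context: a brute-force census over an undirected graph that classifies every triple of nodes by how many of its three possible edges are present. The function name `triple_census` and the sample graph are assumptions for illustration; this is not the author's algorithm and makes no claim about its complexity.

```python
from itertools import combinations


def triple_census(nodes, edges):
    """Count node triples by number of edges present (0, 1, 2, or 3).

    Brute force over all triples, so it runs in O(n^3) time.
    """
    edge_set = {frozenset(edge) for edge in edges}
    counts = [0, 0, 0, 0]
    for a, b, c in combinations(nodes, 3):
        present = sum(
            frozenset(pair) in edge_set for pair in ((a, b), (a, c), (b, c))
        )
        counts[present] += 1
    return counts


# Hypothetical graph: a triangle a-b-c with a pendant node d attached to c.
census = triple_census("abcd", [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
```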
SUCCESS!
 Graph-RAT programming language now functioning
 Graph-RAT integrates social, cultural, personal, and audio data into algorithms
 Includes most commercial algorithms
 Contains primitives for existing academic systems
 Evaluation is entirely automated
 Evaluation is entirely automated
PROBLEMS
Evaluation Exploration
 9 types of music recommendation
 Personalized versus generic
 Open query versus targeted query
 Dynamic versus static data
 New music versus all music
Personalized Radio
 Open query with personalized presentation
 Static data vs dynamic data
 New items prediction vs predict anything
Targeted Search
 Not personalized
 Similarity queries
 Automatically generating targeted lists for a browsing hierarchy
 New music vs all music
 Static vs dynamic data
Personalized Tag Radio
 Create a personalized playlist matching a given query
 New music vs all music
 Static vs dynamic data
Excluded Types
 ‘Top 40’ prediction
 Rendered obsolete by other types
Existing Algorithms
 Item-to-Item collaborative filtering
 7 variations
 User-to-user collaborative filtering
 7 variations
 Associative mining collaborative filtering
 Direct machine learning playlist data
 Direct machine learning audio data
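For readers unfamiliar with the first item on this list, here is a minimal sketch of item-to-item collaborative filtering over play counts, using cosine similarity between item vectors. It is one common formulation, not necessarily any of the 7 variations mentioned above, and the function names and sample data are assumptions for illustration.

```python
import math
from collections import defaultdict


def item_similarities(plays):
    """plays: user -> {item: play count}. Returns (i, j) -> cosine similarity."""
    vectors = defaultdict(dict)  # item -> {user: play count}
    for user, items in plays.items():
        for item, count in items.items():
            vectors[item][user] = count
    norms = {
        item: math.sqrt(sum(c * c for c in users.values()))
        for item, users in vectors.items()
    }
    sims = {}
    for i in vectors:
        for j in vectors:
            if i == j:
                continue
            dot = sum(c * vectors[j][u] for u, c in vectors[i].items() if u in vectors[j])
            if dot > 0:
                sims[(i, j)] = dot / (norms[i] * norms[j])
    return sims


def recommend(user, plays, sims, top_n=3):
    """Score unheard items by similarity to the user's played items."""
    heard = plays[user]
    scores = defaultdict(float)
    for (i, j), sim in sims.items():
        if i in heard and j not in heard:
            scores[j] += sim * heard[i]
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical play counts.
plays = {
    "alice": {"A": 3, "B": 1},
    "bob": {"A": 2, "B": 2, "C": 4},
    "carol": {"B": 1, "C": 1},
}
sims = item_similarities(plays)
picks = recommend("alice", plays, sims)
```

The user-to-user variant follows the same pattern with the roles of users and items swapped.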
Novel Algorithms
 Machine learning over profile data
 Machine learning over cultural and profile data
 Machine learning on different concatenations
 Audio
 Playlist
 Profile
 Cultural
Initial Data
 LiveJournal
 Separating music data is difficult
 No tag info or audio content
 Not enough musical data
 LastFM by User
 No audio content
 Data cleaning is an issue
Current Data
 40’s Jazz Recordings
 1800 annotated recordings from 70 CDs
 Covers nearly all 40’s popular music
 LastFM by Song
 Retrieves tag and user info by song
 Data cleaning on user playcounts needed
Data Cleaning Tags
 Polysemy
 Synonymy
 Disjoint
 Hypernymy
 Hyponymy
 Initial algorithms developed
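The slides do not say how the initial tag-cleaning algorithms work. As a hedged illustration of the synonymy case only, the sketch below normalizes tag spelling and merges counts through a hand-written synonym table. The `SYNONYMS` table, the function names, and the sample tags are hypothetical; polysemy, hypernymy, and hyponymy need more than string normalization and are not handled here.

```python
import re
from collections import Counter

# Hypothetical synonym table mapping variant spellings to a canonical tag.
SYNONYMS = {"hiphop": "hip hop", "hip-hop": "hip hop", "rnb": "r&b"}


def normalize(tag):
    """Lowercase, collapse whitespace, then map known variants to one form."""
    cleaned = re.sub(r"\s+", " ", tag.strip().lower())
    return SYNONYMS.get(cleaned, cleaned)


def merge_tag_counts(tag_counts):
    """Sum the counts of all tags that normalize to the same canonical tag."""
    merged = Counter()
    for tag, count in tag_counts.items():
        merged[normalize(tag)] += count
    return dict(merged)


# Hypothetical raw tag counts for one song.
raw = {"Hip-Hop": 10, "hiphop": 4, "hip  hop": 1, "Jazz": 7}
clean = merge_tag_counts(raw)
```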
Future Work: Programming
 Radically different programming environment
 SQL
 LINQ library package in C#
Future Work: Scalability
 Distributed SQL database implementation
 Just-in-time compilation
 Event-based recalculation of algorithm results
 Parallel execution of algorithms
 Multi-threaded algorithms