Collaborative Data Analysis and Multi-Agent Systems
Robert W. Thomas
CSCE 824
15 APR 2013

Agenda
• Problem Description
• Existing Research Overview
• Limitations of Existing Results
• Future Research Suggestions

Problem Description
• Information Overload
• Divide and Conquer; Reconcile
• Recommender Systems and Social Media
  – Content Filtering
  – Collaborative Filtering
  – Collaborative Data Analysis through Agents

Content Filtering
• Recommendations based on items similar to what the user has preferred previously

Collaborative Filtering (CF)
• Recommendations based on what others in a network prefer
• Different Techniques
  – Memory-Based
  – Model-Based
  – Hybrid

Memory-Based CF
• Similarity Computation
• Prediction and Recommendation Computation
• Top-N Recommendations

Similarity Computation
• Compares Users or Items
• Correlation-Based (Pearson correlation)
  – Users: $w_{u,v} = \dfrac{\sum_{i \in I} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I} (r_{u,i} - \bar{r}_u)^2}\,\sqrt{\sum_{i \in I} (r_{v,i} - \bar{r}_v)^2}}$
  – Items: $w_{i,j} = \dfrac{\sum_{u \in U} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_i)^2}\,\sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_j)^2}}$
• Vector Cosine-Based
  – $w_{i,j} = \cos(\vec{i}, \vec{j}) = \dfrac{\vec{i} \cdot \vec{j}}{\lVert \vec{i} \rVert * \lVert \vec{j} \rVert}$
Two users: u, v
Two items: i, j
$i \in I$ = items both u and v have rated
$\bar{r}_u$ = avg rating of co-rated items of the $u^{th}$ user
$u \in U$ = users who rated both i and j
$\bar{r}_i$ = avg rating of the $i^{th}$ item by those users
R = m x n user-item matrix
$\vec{i}, \vec{j}$ are m-dimensional vectors corresponding to the i and j columns of R

Prediction and Recommendation Computation
• Weighted Sum of Others’ Ratings
  – $P_{a,i} = \bar{r}_a + \dfrac{\sum_{u \in U} (r_{u,i} - \bar{r}_u)\, w_{a,u}}{\sum_{u \in U} |w_{a,u}|}$
  Prediction P for active user a on item i
  $\bar{r}_u$ = avg rating of user u
  $w_{a,u}$ = weight between user a and user u
  $u \in U$ = users who rated item i
• Simple Weighted Average
  – $P_{u,i} = \dfrac{\sum_{n \in N} r_{u,n}\, w_{i,n}}{\sum_{n \in N} |w_{i,n}|}$
  Prediction P for user u on item i
  $n \in N$ = all other rated items for user u
  $w_{i,n}$ = weight between items i and n
  $r_{u,n}$ = rating for user u on item n

Top-N Recommendations
• Item-Based
• User-Based

Model-Based CF
• Bayesian Belief Net
• Clustering
• Regression-Based
• Markov Decision Process (MDP)-Based
• Latent Semantic
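
The memory-based steps from the earlier slides (Pearson similarity between users, then the weighted sum of others' ratings) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `ratings` dictionary, the user and item names, and the helper names `pearson` and `predict` are hypothetical and do not come from the slides.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}. Not from the slides.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 3, "d": 5},
    "carol": {"a": 1, "b": 5, "c": 2, "d": 1},
}

def pearson(ratings, u, v):
    """Pearson correlation w_{u,v} over the items both users have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0  # assumption: treat too little overlap as "no similarity"
    # Averages are taken over the co-rated items only.
    mean_u = sum(ratings[u][i] for i in common) / len(common)
    mean_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common)
    den = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common)) * \
          sqrt(sum((ratings[v][i] - mean_v) ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, a, item):
    """Weighted sum of others' ratings: prediction P_{a,i} for active user a."""
    mean_a = sum(ratings[a].values()) / len(ratings[a])
    num = den = 0.0
    for u, r_u in ratings.items():
        if u == a or item not in r_u:
            continue  # only users (other than a) who rated the item contribute
        w = pearson(ratings, a, u)
        mean_u = sum(r_u.values()) / len(r_u)
        num += (r_u[item] - mean_u) * w
        den += abs(w)
    # Assumption: fall back to the user's own average when no weights exist.
    return mean_a if den == 0 else mean_a + num / den

print(pearson(ratings, "alice", "bob"))
print(predict(ratings, "alice", "d"))
```

In this toy data, alice's ratings move in step with bob's and against carol's, so bob's high rating and carol's low rating of item "d" both push the prediction above alice's own average. Nothing in the sketch clips the result to the rating scale, which a real system would likely want to do.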

Bayesian Belief Net
• Bayesian logic – decision making and inferential statistics
• Simple Bayesian – Memory-Based
  – $class = \arg\max_{j \in classSet} \; p(class_j) \prod_{o} P(X_o = x_o \mid class_j)$
  – Laplace Estimator to avoid a conditional probability of 0
  – $P(X_i = x_i \mid Y = y) = \dfrac{\#(X_i = x_i, Y = y) + 1}{\#(Y = y) + |X_i|}$, where $|X_i|$ is the number of values $X_i$ can take
• Tree Augmented naïve Bayes and naïve Bayes optimized by Extended Logistic Regression (ELR)
  – Require extended training periods to produce results beyond simple Bayesian and Pearson correlation

Clustering
• Cluster: collection of similar objects, dissimilar to objects in other clusters
  – Pearson correlation can be used
• Three Categories
  – Partitioning
  – Density-based
  – Hierarchical
• Often an Intermediate Step

Regression-Based
• Use approximations of ratings to make predictions against a regression model
• Applies to situations where rating vectors have large Euclidean distances but very high Similarity Computation scores

MDP-Based
• Sequential Optimization Problem
• <S,A,R,Pr>
  – S = {states}
  – A = {actions}
  – R = {rewards} for r(s,a,s’)
  – Pr = {transition probabilities} for pr(s,a,s’)
• Partially Observable MDP (POMDP)

Latent Semantic
• Uses statistical modeling to discover additional communities or profiles

Network Trust
• We’re all mad here; I’m mad; you’re mad.
• Opinions of some contacts are valued more than others under certain conditions
• Accounting for this can increase CF accuracy
• Semantic Knowledge
• Social Tie-Strength

Hybrid CF
• CF + Content-Based
• CF + CF
• CF + CF and/or Content-Based

Limitations of Existing Solutions
• Time / Accuracy Trade-offs
• Noisy Data
• Data Sparsity (New User)
• Scalability
• Synonymy
• Gray Sheep
• Shilling Attacks
• Privacy

Future Research Suggestions
• Hybrids
• Semantics
• Trust
• Parallel Processing – Multi-Agent Systems

BACKUP

References
• Su, Xiaoyuan, and Taghi M. Khoshgoftaar. "A survey of collaborative filtering techniques." Advances in Artificial Intelligence 2009 (2009): 4.
• Chen, Wei, and Simon Fong. "Social network collaborative filtering framework and online trust factors: a case study on Facebook." Digital Information Management (ICDIM), 2010 Fifth International Conference on. IEEE, 2010.
• O'Donovan, John, and Barry Smyth. "Trust in recommender systems." Proceedings of the 10th international conference on Intelligent user interfaces. ACM, 2005.
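
As a supplement to the Bayesian Belief Net slide, the simple Bayesian classifier with the Laplace estimator might be sketched as below. This is a rough illustration, not the method used in the cited work: the training rows, the labels, and the function names `train`, `cond_prob`, and `classify` are all hypothetical.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Count classes and per-feature values; rows are tuples of feature values."""
    class_counts = Counter(labels)
    feat_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values = defaultdict(set)           # feature index -> set of observed values
    for x, y in zip(rows, labels):
        for o, x_o in enumerate(x):
            feat_counts[(o, y)][x_o] += 1
            values[o].add(x_o)
    return class_counts, feat_counts, values

def cond_prob(model, o, x_o, y):
    """Laplace estimate: (#(X_o = x_o, Y = y) + 1) / (#(Y = y) + |X_o|)."""
    class_counts, feat_counts, values = model
    return (feat_counts[(o, y)][x_o] + 1) / (class_counts[y] + len(values[o]))

def classify(model, x):
    """Return arg max over classes of p(class) * prod_o P(X_o = x_o | class)."""
    class_counts, _, _ = model
    total = sum(class_counts.values())

    def score(y):
        p = class_counts[y] / total
        for o, x_o in enumerate(x):
            p *= cond_prob(model, o, x_o, y)
        return p

    return max(class_counts, key=score)

# Hypothetical toy data: two observed preferences per row, label = target class.
rows = [("like", "like"), ("like", "dislike"),
        ("dislike", "dislike"), ("dislike", "like")]
labels = ["like", "like", "dislike", "dislike"]
model = train(rows, labels)
print(classify(model, ("like", "dislike")))
```

In the toy data, "like" never co-occurs with the "dislike" class for the first feature, so the unsmoothed estimate would be 0 and would wipe out the whole product; the added 1 in the numerator keeps it positive.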