Survey
Collaborative Data Analysis and Multi-Agent Systems
Robert W. Thomas
CSCE 824
15 APR 2013

Agenda
• Problem Description
• Existing Research Overview
• Limitation of Existing Results
• Future Research Suggestions

Problem Description
• Information Overload
• Divide and Conquer; Reconcile
• Recommender Systems and Social Media
  – Content Filtering
  – Collaborative Filtering
  – Collaborative Data Analysis through Agents

Content Filtering
• Recommendations based on items similar to what has been preferred previously

Collaborative Filtering (CF)
• Recommendations based on what others in a network prefer
• Different Techniques
  – Memory-Based
  – Model-Based
  – Hybrid

Memory-Based CF
• Similarity Computation
• Prediction and Recommendation Computation
• Top-N Recommendations

Similarity Computation
• Compares Users or Items
• Correlation-Based (Pearson correlation)
  – Users: $w_{u,v} = \frac{\sum_{i \in I} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I} (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_{i \in I} (r_{v,i} - \bar{r}_v)^2}}$
  – Items: $w_{i,j} = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_i)^2} \, \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_j)^2}}$
• Vector Cosine-Based
  – $w_{i,j} = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\lVert \vec{i} \rVert * \lVert \vec{j} \rVert}$
Notation:
  Two users: u, v
  Two items: i, j
  $i \in I$ = items both u and v have rated
  $\bar{r}_u$ = avg rating of co-rated items of the $u$th user
  $u \in U$ = users who rated both i and j
  $\bar{r}_i$ = avg rating of the $i$th item by those users
  R = m x n user-item matrix
  $\vec{i}, \vec{j}$ are n dimensional vectors corresponding to i and j column of R

Prediction and Recommendation Computation
• Weighted Sum of Others’ Ratings
  – $P_{a,i} = \bar{r}_a + \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u) \, w_{a,u}}{\sum_{u \in U} |w_{a,u}|}$
  Prediction P for active user a, on item i
  $\bar{r}_u$ = avg rating of user u
  $w_{a,u}$ = weight between user a and user u
  $u \in U$ = users who rated item i
• Simple Weighted Average
  – $P_{u,i} = \frac{\sum_{n \in N} r_{u,n} \, w_{i,n}}{\sum_{n \in N} |w_{i,n}|}$
  Prediction P for user u on item i
  $n \in N$ = all other rated items for user u
  $w_{i,n}$ = weight between items i and n
  $r_{u,n}$ = rating for user u on item n

Top-N Recommendations
• Item-Based
• User-Based

Model-Based CF
• Bayesian Belief Net
• Clustering
• Regression-Based
• Markov Decision Process (MDP)-Based
• Latent Semantic
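As an illustration of the memory-based CF slides, the two steps (user-to-user Pearson similarity, then the weighted-sum prediction) might be sketched in Python as follows. This is a minimal sketch, not a reference implementation. The function names, the dict-of-dicts rating layout, and the toy ratings are assumptions made for the example. It also assumes that each user's average in the prediction step is taken over all of that user's ratings, and that the denominator sums absolute weights.

```python
from math import sqrt


def pearson_user_similarity(ratings, u, v):
    """Pearson correlation w_{u,v} over the items both u and v have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0  # assumption: too little overlap counts as "no similarity"
    # Averages are taken over the co-rated items only, as on the slide.
    mean_u = sum(ratings[u][i] for i in common) / len(common)
    mean_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common)
    den = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common)) * sqrt(
        sum((ratings[v][i] - mean_v) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict_weighted_sum(ratings, a, item):
    """Weighted sum of others' ratings: prediction P_{a,i} for active user a."""
    # Assumption: user averages here are over all of a user's ratings.
    mean_a = sum(ratings[a].values()) / len(ratings[a])
    num = 0.0
    den = 0.0
    for u, user_ratings in ratings.items():
        if u == a or item not in user_ratings:
            continue  # only other users who rated the item contribute
        w = pearson_user_similarity(ratings, a, u)
        mean_u = sum(user_ratings.values()) / len(user_ratings)
        num += (user_ratings[item] - mean_u) * w
        den += abs(w)
    # Fall back to the active user's own average when no neighbor carries weight.
    return mean_a if den == 0 else mean_a + num / den


# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 3, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}

prediction = predict_weighted_sum(ratings, "alice", "D")
print(round(prediction, 2))
```

In this toy data, bob's ratings move in step with alice's and carol's move opposite, so both neighbors push the prediction for item D above alice's own average: bob rated D above his average, and carol, who is negatively correlated, rated it below hers.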
Bayesian Belief Net
• Bayesian logic – decision making and inferential statistics
• Simple Bayesian – Memory-Based
  – $\mathrm{class} = \arg\max_{j \in \mathrm{classSet}} \; p(\mathrm{class}_j) \prod_{o} P(X_o = x_o \mid \mathrm{class}_j)$
  – Laplace Estimator to avoid a conditional probability of 0
  – $P(X_i = x_i \mid Y = y) = \frac{\#(X_i = x_i, Y = y) + 1}{\#(Y = y) + |X_i|}$
• Tree Augmented naïve Bayes and naïve Bayes optimized by Extended Logistic Regression (ELR)
  – Require extended training periods to produce results beyond simple Bayesian and Pearson correlation

Clustering
• Cluster: collection of similar objects, dissimilar to objects in other clusters
  – Pearson correlation can be used
• Three Categories
  – Partitioning
  – Density-based
  – Hierarchical
• Often an Intermediate Step

Regression-Based
• Use approximation of ratings to make predictions against a regression model
• Applies to situations where rating vectors have large Euclidean distances but very high similarity scores

MDP-Based
• Sequential Optimization Problem
• <S,A,R,Pr>
  – S = {states}
  – A = {actions}
  – R = {rewards} for r(s,a,s’)
  – Pr = {transition probabilities} for pr(s,a,s’)
• Partially Observable MDP (POMDP)

Latent Semantic
• Uses statistical modeling to discover additional communities or profiles

Network Trust
• We’re all mad here; I’m mad; you’re mad.
• Opinions of some contacts are valued more than others under certain conditions
• Accounting for this can increase CF accuracy
• Semantic Knowledge
• Social Tie-Strength

Hybrid CF
• CF + Content-Based
• CF + CF
• CF + CF and/or Content-Based

Limitations of Existing Solutions
• Time / Accuracy Trade-Offs
• Noisy Data
• Data Sparsity (New User)
• Scalability
• Synonymy
• Gray Sheep
• Shilling Attacks
• Privacy

Future Research Suggestions
• Hybrids
• Semantics
• Trust
• Parallel Processing
  – Multi-Agent Systems

BACKUP

References
• Su, Xiaoyuan, and Taghi M. Khoshgoftaar. "A survey of collaborative filtering techniques." Advances in Artificial Intelligence 2009 (2009): 4.
• Chen, Wei, and Simon Fong. "Social network collaborative filtering framework and online trust factors: a case study on Facebook." Digital Information Management (ICDIM), 2010 Fifth International Conference on. IEEE, 2010.
• O'Donovan, John, and Barry Smyth. "Trust in recommender systems." Proceedings of the 10th international conference on Intelligent user interfaces. ACM, 2005.