Modeling Techniques for Content Analysis and Recommendation in Social Media Search
Ge Yu
Northeastern University, China

Outline
- Social Media
- Problems and Challenges
- Measuring Relationships
- Further Research Directions

Social Media
- Social media is a new platform for users to exchange information and communicate: blogs, forums, wikis, microblogs, ...
- Users can share their opinions, viewpoints, and experiences on the platform.

Elements of Social Media
- Users: writers, readers, distributors
- Resources (contents): text, pictures, audio, video

Characteristics of Social Media
- Users' features: users' dual roles
- Contents' features: multi-modal
- Contents' features: community (academic, tourism, shopping) and composite (integrated resources, combined resources, single resources, subresources)
- Users and resources' features: multiple relationships
- Users and resources' features: communities
- Users and contents' features: heterogeneous

Search on Social Media
- Entity: users and resources
- Relationship: linkage between users, between resources, and between users and resources
- Social media search: entity-oriented search and relationship-oriented search
- Integrated resource search (for example over academic, tourism, or shopping resources) requires sophisticated analysis and mining of entities and relationships.
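
The entity and relationship vocabulary above can be made concrete with a small data structure. The following is only a minimal sketch, not the system described in the talk: the class names `Entity` and `SocialGraph`, the relation labels, and the keyword matching are all assumptions made for illustration, and every relationship is treated as symmetric for simplicity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str  # "user" or "resource"
    name: str


class SocialGraph:
    """Hypothetical store of entities and the relationships between them."""

    def __init__(self):
        # entity -> set of (relation label, other entity)
        self._edges = defaultdict(set)

    def relate(self, a, relation, b):
        # Stored in both directions, i.e. relations are treated as symmetric.
        self._edges[a].add((relation, b))
        self._edges[b].add((relation, a))

    def entity_search(self, kind, keyword):
        """Entity-oriented search: entities of one kind whose name matches."""
        found = [e for e in self._edges
                 if e.kind == kind and keyword.lower() in e.name.lower()]
        return sorted(found, key=lambda e: e.name)

    def relationship_search(self, entity, relation=None):
        """Relationship-oriented search: links of an entity, optionally filtered."""
        links = self._edges.get(entity, set())
        hits = [(r, e) for r, e in links if relation is None or r == relation]
        return sorted(hits, key=lambda p: (p[0], p[1].name))


ann = Entity("user", "ann")
bob = Entity("user", "bob")
photo = Entity("resource", "sunrise photo")

graph = SocialGraph()
graph.relate(ann, "friend", bob)      # user-user linkage
graph.relate(ann, "publish", photo)   # user-resource linkage
graph.relate(bob, "like", photo)

print(graph.entity_search("resource", "sunrise"))
print(graph.relationship_search(photo))
```

Keeping users and resources in one typed graph is one possible way to serve both search styles from the same structure; a real system would add weights, directions, and richer entity models.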

Social Media Search Framework
[Figure: layered framework diagram. Recoverable labels: User Interface; Results Ranking and User Feedback; Entity Search; Relationship Search; Entity Modeling (content, sentiment, topic, relationship, interest, operation, ...); relationship computing between entities (relativity or similarity); Social Media; Social Media Entity-Relationship Modeling and Searching (searching, content analyzing, recommending)]
- Supporting techniques: tag and context analyzing, integrated resource mining, search intent understanding, multi-modal content analyzing, user social relationship analyzing, ...

Problems and Challenges (1): Understanding Social Media Data
- Social media has "sociality": contents are generated by many associated users.
- The same content updated by different users may carry different meanings, sentiments, and opinions.
- Different contents updated by the same user may be closely related.
- Even for the same event or opinion, different users may express different meanings, sentiments, and opinions.
- How can the relationships among different entities (contents and users) be measured accurately?

Problems and Challenges (2): Understanding Search Intentions
- Search intentions involve both entity search and relationship search.
- An intention is hidden in a sentence, an image, a piece of video, a piece of audio, or a combination of them.
- It is hard to tell what the entities and the relationships are in a search intention.
- How can entities and the relationships among them be extracted accurately from users' search requirements?

Problems and Challenges (3): Dynamic Maintenance of the Search Model
- Social media users and resources are mushrooming.
- Users' search requirements are diverse.
- Entities and the relationships among them have to change along with these changes.
- How can the search model be maintained so that it reflects the dynamic changes of social media and keeps search results as accurate as possible?

Measuring Relationship between Resources
- Open question: content similarity? correlation?
- 1. Tag
- 2. Media context
- 3. Map into a new space: deep learning, manifold alignment, transfer learning, ...
- 4. Sentiment and topic similarity
- 5. User relationship

Measuring Relationship between Resources
- Manifold learning
- Zhu X, Huang Z, Shen H, Zhao X: Linear cross-modal hashing for efficient multimedia search. ACM Multimedia 2013: 143-152

Measuring Relationship between Users
- 1. Social relationship (friends, employment)
- 2. Operations on contents: share, reply, review
- 3. Relationship between the contents operated on by the users: tag, topic, sentiment, content, score similarity?
- Open question: similar interest, preference, ...?

Measuring Relationship between Users
- Friend relationship, trust relationship, interest similarity
- Ma H: On measuring social friend interest similarities in recommender systems. SIGIR 2014: 465-474

Measuring Relationship between Users and Resources
- 1. Direct operation of a user on a resource
- 2. Indirect operation of a user on a resource
[Figure labels: publish, collect, recommend, like, review; similar interest, similar topic; open question: is the resource interesting to the user, or liked by the user?]

Measuring Relationship between Users and Resources
- Measure the preference of a user for a video
- Cui P, Wang Z, Su Z: What Videos Are Similar with You?: Learning a Common Attributed Representation for Video Recommendation. ACM Multimedia 2014: 597-606

An Example: multiple entities and relationship search
- Yao T, Liu Y, Ngo C, Mei T: Unified entity search in social media community. WWW 2013: 1457-1466
- Scenario 1 (friend suggestion): Henry wants to find friends who have interests similar to his own.
- Scenario 2 (image (geo-)tagging): Henry sees a beautiful picture while browsing a webpage and wants to know where it was taken.
- Scenario 3 (personalized image search): Henry wants to search for photos with a sunrise scene.
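
Scenario 1 (friend suggestion) can be illustrated with the interest-similarity idea from the slides on measuring relationships between users. The sketch below is not the unified Bayesian method of Yao et al.; it simply ranks users by the cosine similarity of their tag counts. The user names, the tag data, and the function names `cosine` and `suggest_friends` are invented for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse tag-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def suggest_friends(target, tag_counts, top_n=3):
    """Rank the other users by interest similarity to `target`."""
    scores = [(cosine(tag_counts[target], tags), user)
              for user, tags in tag_counts.items() if user != target]
    # Highest similarity first; ties broken by user name.
    scores.sort(key=lambda p: (-p[0], p[1]))
    return [(user, round(score, 3)) for score, user in scores[:top_n]]


# Hypothetical tag usage per user (tag -> number of tagged resources).
tag_counts = {
    "henry": Counter(sunrise=3, beach=1),
    "ann": Counter(sunrise=2, beach=1),
    "bob": Counter(football=4),
    "eve": Counter(beach=2, city=2),
}

print(suggest_friends("henry", tag_counts, top_n=2))
```

Tags are only one of the signals listed on the slides; topic, sentiment, or score vectors could be plugged into the same similarity function.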

An Example: multiple entities and relationship search
- Construct a multi-level graph that organizes the heterogeneous entities.
- Formulate entity search as a global optimization problem in a unified Bayesian framework.
- Various search applications are then realized efficiently.

Further Research Directions (1): Combining Text and Multimedia Processing Techniques with Social Science Theory
- Social media has "sociality".
- Text and multimedia processing offer a rich set of techniques.
- Social science theories such as social correlation, balance theory, and status theory can be used for mining social media.
- Many researchers have proposed combining these techniques with social science theory.
- The combination is expected to help solve the problems of understanding social media data and search intentions.

Further Research Directions (2): Combining Advanced Data Analysis Techniques with Big Data Processing Approaches
- Social media has the characteristics of big data.
- Advanced data analysis techniques such as data mining and machine learning have achieved many good results.
- Big data processing provides effective approaches, such as parallel and distributed systems, for data storage, update, and maintenance.
- The combination is expected to help solve the problem of dynamically maintaining the search model.

Outline
- Social Media
- Problems and Challenges
- Measuring Relationships
- Further Research Directions
- What We Have Done

Overview
[Figure: layered overview diagram. Recoverable labels: Applications (Public Opinion Analysis, Cross-Media Retrieval, Personalized Recommendation); Techniques (Text Analysis, Information Retrieval, Data Mining, Multimedia Data Processing, Sentiment Analysis); Platform (Multi-Modal Data, Social Media)]

Personalized Sentiment Classification Based on Latent Individuality of Microblog Users (1)
- User A (a girl who enjoys losing weight): "Yoga helps make my body flexible, lean & slim."
  - A: "After work overtime for 3 days, I lose 3 pounds!" (+)
  - B (follower): "I lose 5!" (+)
- User C (grumbles about hard work): "Getting poor feedback on a project where you are getting paid very little money for a lot of work."
  - C: "After work overtime for 3 days, I lose 3 pounds!" (-)
  - D (colleague): "I lose 5!" (-)
- The same sentence carries opposite sentiments. How should sentiment classification be done?

Personalized Sentiment Classification Based on Latent Individuality of Microblog Users (1)
Accepted by IJCAI 2015

Statistics of the datasets

  Statistics item          Weibo      Twitter
  # of posts               43,250     48,563
  # of positive posts      32,060     34,624
  # of negative posts      11,190     13,939
  Size of vocabulary       30,171     23,181
  # of sentiment words      4,495      2,457
  # of topic words         22,758     17,899
  # of syntactic units    314,712    213,590
  # of sentiment units     40,775     42,650
  # of topic units        164,529     98,987

Comparison of different model configurations

  Dataset   Metric   Basic   BOW    Follow   Depend   Full
  Weibo     SEN      .725    .699   .707     .748     .745
  Weibo     SPE      .594    .713   .721     .709     .725
  Weibo     GM       .656    .705   .714     .727     .735
  Twitter   SEN      .747    .799   .800     .823     .835
  Twitter   SPE      .832    .839   .843     .846     .847
  Twitter   GM       .779    .819   .825     .835     .840

Comparison of different model approaches

  Dataset   Metric   SVM    MFP    PSVM   SNM    Co-train   Ours
  Weibo     SEN      .719   .642   .691   .718   .715       .745
  Weibo     SPE      .501   .605   .652   .654   .695       .725
  Weibo     GM       .587   .623   .667   .675   .705       .735
  Twitter   SEN      .514   .654   .704   .790   .743       .835
  Twitter   SPE      .746   .621   .624   .764   .735       .847
  Twitter   GM       .619   .633   .657   .776   .739       .840

Extracting common emotions from blogs based on fine-grained sentiment clustering (2)
- Traditional problem: classifying the sentiment orientation (positive / negative) of a given text, for example: "I seen the movie on Direc TV. I ordered it and I really liked it. I can't wait to get it for blu ray! Excellent work Rob!"
- What are people's typical opinions toward a trending hot topic in social media? Example: common emotions toward Liu Xiang's withdrawal from the Beijing Olympic Games.
- How can we go beyond sentiment orientation classification and aggregate bloggers' opinions in an unsupervised way?

Extracting common emotions from blogs based on fine-grained sentiment clustering (2)
- Use the hot topic word as the query word and collect the results from a blog search engine.
- Each blog search result snippet is represented by a sentiment word vector b = {w_1, w_2, ..., w_m}.
- Suppose {b} is generated by hidden sentiment factors {o}.
- Apply the PLSA model, fitted with the EM algorithm:

  $$P(o \mid b) = \frac{P(b \mid o)\,P(o)}{\sum_{o'} P(b \mid o')\,P(o')}$$

- Measure the sentiment similarity of two snippets as the cosine between their vectors of hidden sentiment factor probabilities, where k is the number of factors:

  $$\mathrm{SentSim}(b_i, b_j) = \frac{\vec{b}_i \cdot \vec{b}_j}{\lVert \vec{b}_i \rVert \, \lVert \vec{b}_j \rVert} = \frac{\sum_{m=1}^{k} P(o_m \mid b_i)\,P(o_m \mid b_j)}{\lVert \vec{b}_i \rVert \, \lVert \vec{b}_j \rVert}, \qquad \lVert \vec{b}_i \rVert = \sqrt{\sum_{l=1}^{k} P(o_l \mid b_i)^2}$$

Extracting common emotions from blogs based on fine-grained sentiment clustering (2)
- Cluster the blog search results based on the underlying emotion similarity between them.
- Extract the common emotion words from each cluster.

  Cluster   Common emotion words
  A         "期望" (expectation), "希望" (hope), "接受" (accept)
  B         "痛苦的" (painful), "心痛" (brokenhearted), "责怪" (blame)
  C         "赞同" (approve), "谢谢" (thank), "陶醉" (intoxicated)
  D         "失误" (mistake), "可怜" (pathetic), "伤害" (hurt)
  E         "眼泪" (tear), "没想到" (unexpected), "危机" (crisis)
  F         "相信" (believe), "了解" (understandable), "勇敢的" (brave)
  G         "失望" (disappointed), "受伤的" (injured), "无奈" (helpless)
  H         "遗憾" (regretful), "慰问" (console), "郁闷" (depressed)

- Knowledge and Information Systems, 2011, 27(2): 281-302

A Novel Approach Based on Multi-View Content Analysis and Semi-supervised Enrichment for Movie Recommendation (3)
[Figure: framework diagram. Recoverable labels: Web; movies; new movies; story synopsis (text); poster, photo (image); music; content analysis; multi-view representation; multi-view represented items; score assignment; single-view profiles; semi-supervised enrichment; enriched profiles; multi-view recommendation]

A Novel Approach Based on Multi-View Content Analysis and Semi-supervised Enrichment for Movie Recommendation (3)
- Recommend movies using the improved user profile.
- Improve the user profile with an improved co-training procedure:
  1. Predict the score of each not-yet-clicked movie on every modality separately.
  2. Add the movies that receive the same score on all modalities to the training set.
  3. Repeat steps 1 and 2 until no new movie is added.

A Novel Approach Based on Multi-View Content Analysis and Semi-supervised Enrichment for Movie Recommendation (3)
[Figure: two result charts. Panel titles: "Comparison of different recommendation methods" (highlighted series: our method) and "Comparison of before and after improvement" (highlighted series: after improving)]
- Journal of Computer Science and Technology, 2013, 28(5): 776-787
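
The three-step profile enrichment loop can be sketched in a few lines. This is only an illustration of the agreement-based loop as stated on the slide, not the authors' improved co-training implementation: the per-modality predictor is replaced by a simple nearest-neighbour stand-in, and the feature values, movie ids, and function names (`nearest_score`, `enrich_profile`) are assumptions.

```python
def nearest_score(features, labeled, view, movie):
    """Stand-in predictor: score of the closest already-scored movie in one view."""
    def dist(other):
        return sum((x - y) ** 2
                   for x, y in zip(features[view][movie], features[view][other]))
    return labeled[min(sorted(labeled), key=dist)]


def enrich_profile(features, labeled):
    """Grow the scored set with movies on which every modality agrees."""
    labeled = dict(labeled)
    views = list(features)
    all_movies = set(next(iter(features.values())))
    while True:
        agreed = {}
        for movie in sorted(all_movies - set(labeled)):
            # Step 1: predict the score on every modality separately.
            predictions = {nearest_score(features, labeled, v, movie) for v in views}
            # Step 2: keep the movie only if all modalities give the same score.
            if len(predictions) == 1:
                agreed[movie] = predictions.pop()
        # Step 3: stop once a pass adds no new movie.
        if not agreed:
            return labeled
        labeled.update(agreed)


# Hypothetical one-dimensional features per modality (view -> movie -> vector).
features = {
    "text":  {"m1": (1.0,), "m2": (0.0,), "m3": (0.9,),  "m4": (0.1,), "m5": (0.6,)},
    "image": {"m1": (1.0,), "m2": (0.0,), "m3": (0.55,), "m4": (0.2,), "m5": (0.4,)},
    "music": {"m1": (1.0,), "m2": (0.0,), "m3": (0.7,),  "m4": (0.3,), "m5": (0.9,)},
}
clicked = {"m1": 5, "m2": 1}  # scores the user has already given

print(enrich_profile(features, clicked))
```

In this toy data, m3 and m4 are added in the first pass, and m5 is added only in the second pass, once m3 has joined the training set, which is why the loop has to repeat. The same mechanism can also propagate an early mistake, so a real system would likely add a confidence threshold.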