Modeling Techniques for Content Analysis and Recommendation in Social Media Search
Ge Yu
Northeastern University, China
Outline
– Social Media
– Problems and Challenges
– Measuring Relationships
– Further Research Directions
Social Media
– A new platform for users to exchange and communicate
– Blog, forum, wiki, microblog, ...
– Users can share their opinions, viewpoints, and experiences on the platform
Elements of Social Media
Users
– Writers, Readers, Distributors
Contents (Resources)
– Text, Pictures, Audio, Video
[Figure: diagram linking Users and Resources]
Characteristics of Social Media
– Users’ features: user’s dual roles
– Contents’ features: multi-modal
– Contents’ features: composite
[Figure: resource hierarchy. Labels: community (academics, tourism, shopping); integrated resources; combined resources; single resources; subresources]
– Users and Resources’ features: multiple relationships
– Users and Resources’ features: communities
– Users and Contents’ features: heterogeneous
[Figure: each feature is illustrated on a diagram of Users and Resources]
Search on Social Media
– Entity: users and resources
– Relationship: linkage between users, between resources, and between users and resources
– Social media search: entity-oriented search and relationship-oriented search
– Integrated resource search (for example academic, tourism, or shopping resources): sophisticated analysis and mining of entities and relationships
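The entity and relationship view above can be illustrated with a small in-memory graph. This is only a sketch under stated assumptions: the class `SocialGraph`, its method names, and the sample users and resources are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class SocialGraph:
    # entity id -> kind ("user" or "resource"); the kinds are illustrative
    entities: dict = field(default_factory=dict)
    # (source id, relation label, target id) triples
    relationships: list = field(default_factory=list)

    def add_entity(self, entity_id, kind):
        self.entities[entity_id] = kind

    def relate(self, source, label, target):
        self.relationships.append((source, label, target))

    def entity_search(self, kind):
        """Entity-oriented search: all entity ids of one kind, sorted."""
        return sorted(e for e, k in self.entities.items() if k == kind)

    def relationship_search(self, entity_id):
        """Relationship-oriented search: all triples that touch an entity."""
        return [r for r in self.relationships if entity_id in (r[0], r[2])]


graph = SocialGraph()
graph.add_entity("alice", "user")
graph.add_entity("bob", "user")
graph.add_entity("photo1", "resource")
graph.relate("alice", "follows", "bob")      # user-user linkage
graph.relate("bob", "publishes", "photo1")   # user-resource linkage

users = graph.entity_search("user")
bob_links = graph.relationship_search("bob")
```

A real system would store richer entity attributes (content, sentiment, topic) and weighted relationships; the sketch only separates the two kinds of query.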
Social Media Search Framework
[Figure: framework diagram. User Interface: results ranking and user feedback. Searching: entity search and relationship search over entities (E) and relationships (R). Social media entity-relationship modeling: entity modeling (content, sentiment, topic, relationship, interest, operation, ...) and relationship computing between entities (relativity or similarity). Functions: modeling, searching, content analyzing, recommending. Supporting techniques: tag and context analyzing, integrated resource mining, search intent understanding, multi-modal content analyzing, user social relationship analyzing, ...]
Problems and Challenges (1)
Understanding Social Media Data
– Social media has “sociality”: contents are generated by many associated users
– The same content updated by different users may have different meanings, sentiments, and opinions
– Different contents updated by the same user may be closely related
– Even for the same event or opinion, different users might express different meanings, sentiments, and opinions
– How can the relationships among different entities (contents and users) be measured accurately?
Problems and Challenges (2)
Understanding Search Intentions
– Search intentions involve entity search and relationship search
– An intention is hidden in a sentence, an image, a piece of video, a piece of audio, and/or their combination
– It is hard to understand what the entities and the relationships are in a search intention
– How can entities and the relationships among them be accurately extracted from users’ search requirements?
Problems and Challenges (3)
Dynamic Maintenance of the Search Model
– Social media users and resources are mushrooming
– Users’ search requirements are diverse
– Entities and the relationships among them have to change along with the changes above
– How can the search model be maintained so that it reflects the dynamic changes of social media and provides search results that are as accurate as possible?
Measuring Relationship between Resources
[Figure: two resources linked by the open questions “content similarity?” and “correlation?”]
Possible bases for the measurement:
1. tag
2. media context
3. map into a new space
4. sentiment, topic similarity
5. user relationship
Techniques: deep learning, manifold alignment, transfer learning, ...
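As one concrete reading of item 1 (tags), the relationship between two resources could be scored by the overlap of their tag sets. The sketch below uses Jaccard similarity, which is an assumption chosen for illustration; the talk does not name a specific tag measure. The function name `tag_similarity` and the sample tags are hypothetical.

```python
def tag_similarity(tags_a, tags_b):
    """Jaccard similarity of two tag collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # two untagged resources carry no tag evidence
    return len(a & b) / len(a | b)


# A photo and a video sharing two of five distinct tags.
photo_tags = ["sunrise", "beach", "travel"]
video_tags = ["sunrise", "mountain", "travel", "hiking"]
score = tag_similarity(photo_tags, video_tags)
```

Because tags are modality-independent, this kind of measure works across a photo and a video without comparing their raw content.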
Measuring Relationship between Resources
Manifold learning
Zhu X, Huang Z, Shen H, Zhao X: Linear cross-modal hashing for efficient multimedia search. ACM Multimedia 2013: 143-152
Measuring Relationship between Users
1. social relationship (friends, employment)
2. operation on contents (share, reply, review, score)
3. relationship between the contents operated on by the users (tag, topic, sentiment, content similarity?)
Open question: similar interest, preference, ...?
Measuring Relationship between Users
– friend relationship
– trust relationship
– interest similarity
Ma H: On measuring social friend interest similarities in recommender systems. SIGIR 2014: 465-474
Measuring Relationship between Users and Resources
1. direct operation of user on resource: publish, collect, like, review, recommend
2. indirect operation of user on resource: similar interest, similar topic
Open questions: interesting? like?
Measuring Relationship between Users and Resources
Measure the preference of a user for a video
Cui P, Wang Z, Su Z: What Videos Are Similar with You?: Learning a Common Attributed Representation for Video Recommendation. ACM Multimedia 2014: 597-606
An Example
Multiple entities and relationship search
Yao T, Liu Y, Ngo C, Mei T: Unified entity search in social media community. WWW 2013: 1457-1466
– Scenario 1 (friend suggestion): Henry wants to find friends who have interests similar to his own.
– Scenario 2 (image (geo-)tagging): Henry sees a beautiful picture when he browses a webpage and he wants to know where this picture was taken.
– Scenario 3 (personalized image search): Henry wants to search for photos with a sunrise scene.
An Example
Multiple entities and relationship search
– Construct a multi-level graph organizing the heterogeneous entities
– Formulate entity search as a global optimization problem in a unified Bayesian framework
– Various search applications are efficiently realized
Further Research Directions (1)
Combining Text and Multimedia Processing Techniques with Social Science Theory
– Social media has “sociality”
– Text and multimedia processing offer rich techniques
– Social science theories such as social correlation, balance theory, and status theory can be used for mining social media
– Many researchers have proposed combining these techniques with social science theory
– The combination is expected to solve the problems of understanding social media data and search intentions
Further Research Directions (2)
Combining Advanced Data Analysis Techniques with Big Data Processing Approaches
– Social media has the characteristics of big data
– Advanced data analysis techniques such as data mining and machine learning have achieved many good results
– Big data processing offers effective approaches, such as parallel and distributed systems, for data storage, update, and maintenance
– The combination is expected to solve the problem of dynamic maintenance of the search model
Outline
– Social Media
– Problems and Challenges
– Measuring Relationships
– Further Research Directions
– What we have done
Overview
[Figure: overview diagram with layers labeled Applications, Techniques, and Platform. Labels: Public Opinion Analysis, Cross-Media Retrieval, Personalized Recommendation; Text Analysis, Information Retrieval, Data Mining, Multimedia Data Process, Sentiment Analysis; Multi-Modal Data, Social Media]
Personalized Sentiment Classification Based on Latent Individuality of Microblog Users (1)
[Figure labels: girl, enjoys losing weight (A); follower (B); grumbles about hard work (C); colleague (D)]
A: Yoga helps make my body flexible, lean & slim.
A: After work overtime for 3 days, I lose 3 pounds! (+)
B: I lose 5! (+)
C: Getting poor feedback on a project where you are getting paid very little money for a lot of work.
C: After work overtime for 3 days, I lose 3 pounds! (-)
D: I lose 5! (-)
– The same sentence, but opposite sentiment
– How to do sentiment classification?
Personalized Sentiment Classification Based on Latent Individuality of Microblog Users (1)

Statistics of the datasets
Statistics items        Weibo      Twitter
# of posts              43,250     48,563
# of positive posts     32,060     34,624
# of negative posts     11,190     13,939
Size of vocabulary      30,171     23,181
# of sentiment words    4,495      2,457
# of topic words        22,758     17,899
# of syntactic units    314,712    213,590
# of sentiment units    40,775     42,650
# of topic units        164,529    98,987

Comparison of different model configurations
Dataset    Metric    Basic    BOW     Follow    Depend    Full
Weibo      SEN       .725     .699    .707      .748      .745
Weibo      SPE       .594     .713    .721      .709      .725
Weibo      GM        .656     .705    .714      .727      .735
Twitter    SEN       .747     .799    .800      .823      .835
Twitter    SPE       .832     .839    .843      .846      .847
Twitter    GM        .779     .819    .825      .835      .840
Comparison of different model approaches
Dataset    Metric    SVM     MFP     PSVM    SNM     Co-train    Ours
Weibo      SEN       .719    .642    .691    .718    .715        .745
Weibo      SPE       .501    .605    .652    .654    .695        .725
Weibo      GM        .587    .623    .667    .675    .705        .735
Twitter    SEN       .514    .654    .704    .790    .743        .835
Twitter    SPE       .746    .621    .624    .764    .735        .847
Twitter    GM        .619    .633    .657    .776    .739        .840

Accepted by IJCAI 2015
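The slides report SEN, SPE, and GM without defining them. Assuming they denote sensitivity, specificity, and their geometric mean (the reported values fit this reading, e.g. sqrt(.745 × .725) ≈ .735 for the full model on Weibo), a minimal sketch of the computation could look as follows. The function name `sen_spe_gm` and the toy labels are hypothetical, and 1 is assumed to mark the positive class.

```python
import math


def sen_spe_gm(y_true, y_pred):
    """Return (sensitivity, specificity, geometric mean) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sen = tp / (tp + fn)  # share of positive posts recognized
    spe = tn / (tn + fp)  # share of negative posts recognized
    return sen, spe, math.sqrt(sen * spe)


# Toy labels: four positive posts and two negative posts.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
sen, spe, gm = sen_spe_gm(y_true, y_pred)
```

The geometric mean penalizes a classifier that does well on only one class, which matters here because both datasets contain far more positive than negative posts.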
Extracting common emotions from blogs based on fine-grained sentiment clustering (2)
Traditional problem: classifying the sentiment orientation of the given text:
“I seen the movie on Direc TV. I ordered it and I really liked it. I can’t wait to get it for blu ray! Excellent work Rob!”
Positive / Negative
What are people’s typical opinions toward a trending hot topic in social media?
Example: common emotions toward Liu Xiang’s withdrawal from the Beijing Olympic Games
How to go beyond just sentiment orientation classification and aggregate bloggers’ opinions in an unsupervised way?
Extracting common emotions from blogs based on fine-grained sentiment clustering (2)
– Use the hot topic word as the query word, and collect the results from the blog search engine
– Each blog search result snippet is represented by a sentiment word vector b = {w1, w2, …, wm}
– Suppose {b} is generated by hidden sentiment factors {o}
– Apply the PLSA model
– Apply the EM algorithm
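$$P(o \mid b) = \frac{P(b \mid o)\, P(o)}{\sum_{o} P(b \mid o)\, P(o)}$$

$$\mathrm{SentSim}(b_i, b_j) = \frac{b_i \cdot b_j}{\lVert b_i \rVert \, \lVert b_j \rVert}$$

$$b_i \cdot b_j = \sum_{m=1}^{k} P(o_m \mid b_i)\, P(o_m \mid b_j)$$

$$\lVert b_i \rVert = \sqrt{\sum_{l=1}^{k} P(o_l \mid b_i)^2}$$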
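The posterior and the similarity above can be checked on toy numbers. This sketch assumes that P(b | o) and P(o) have already been estimated (for instance by EM) and that SentSim is the cosine of the two posterior vectors. The function names and the probabilities are made up for illustration.

```python
import math


def posterior(p_b_given_o, p_o):
    """P(o | b) for one snippet b over k hidden factors, via Bayes' rule."""
    joint = [pb * po for pb, po in zip(p_b_given_o, p_o)]
    total = sum(joint)  # the sum over o of P(b | o) P(o)
    return [j / total for j in joint]


def sent_sim(post_i, post_j):
    """Cosine similarity between two posterior vectors P(o | b)."""
    dot = sum(x * y for x, y in zip(post_i, post_j))
    norm_i = math.sqrt(sum(x * x for x in post_i))
    norm_j = math.sqrt(sum(y * y for y in post_j))
    return dot / (norm_i * norm_j)


# Two hidden sentiment factors with equal priors (illustrative values).
p_o = [0.5, 0.5]
post_1 = posterior([0.08, 0.02], p_o)  # snippet 1 leans toward factor 1
post_2 = posterior([0.03, 0.07], p_o)  # snippet 2 leans toward factor 2
similarity = sent_sim(post_1, post_2)
```

Snippets whose posteriors concentrate on the same hidden factor score close to 1, which is what lets the similarity drive clustering.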
Extracting common emotions from blogs based on fine-grained sentiment clustering (2)
– Cluster the blog search results based on the underlying emotion similarity between them
– Extract the common emotion words from each cluster

Cluster    Common Emotion Words
A          “期望” (expectation), “希望” (hope), “接受” (accept)
B          “痛苦的” (painful), “心痛” (brokenhearted), “责怪” (blame)
C          “赞同” (approve), “谢谢” (thank), “陶醉” (intoxicated)
D          “失误” (mistake), “可怜” (pathetic), “伤害” (hurt)
E          “眼泪” (tear), “没想到” (unexpected), “危机” (crisis)
F          “相信” (believe), “了解” (understandable), “勇敢的” (brave)
G          “失望” (disappointed), “受伤的” (injured), “无奈” (helpless)
H          “遗憾” (regretful), “慰问” (console), “郁闷” (depressed)

Knowledge and Information Systems, 2011, 27(2): 281-302
A Novel Approach Based on Multi-View Content Analysis and Semi-supervised Enrichment for Movie Recommendation (3)
[Figure: recommendation pipeline. Labels: movies and new movies from the Web; views: text (story synopsis), image (poster, photo), music; content analysis; multi-view represent; multi-view represented items; single view score assignment; single view profiles; semi-supervised enrichment; enriched profiles; multi-view recommendation]
Recommend movies using improved users’ profile
Improving the user profile based on improved co-training:
1. Predict the score of not-clicked movies on every modality respectively.
2. Add the movies with the same score on all modalities to the training set.
3. Repeat 1 and 2 until no new movie is added.
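The three steps above can be sketched as a loop. This is an illustration under assumptions, not the authors' implementation: each modality is reduced to a single number per movie, and the per-modality predictor is a nearest-neighbor stand-in (`predict`) for whatever model the paper actually trains. All names and data are hypothetical.

```python
def predict(view, labeled, movie):
    """Stand-in single-view predictor: copy the score of the nearest scored movie."""
    nearest = min(labeled, key=lambda m: abs(view[m] - view[movie]))
    return labeled[nearest]


def enrich_profile(views, labeled, unlabeled):
    """Grow the scored set with movies on which every modality agrees."""
    labeled = dict(labeled)
    pending = set(unlabeled)
    while True:
        agreed = {}
        for movie in sorted(pending):
            # Step 1: predict the score on every modality separately.
            scores = {predict(view, labeled, movie) for view in views.values()}
            # Step 2: keep the movie only if all modalities give the same score.
            if len(scores) == 1:
                agreed[movie] = scores.pop()
        # Step 3: stop once a full pass adds no new movie.
        if not agreed:
            return labeled
        labeled.update(agreed)
        pending.difference_update(agreed)


# One toy feature per movie and modality; m1 and m2 carry user scores.
views = {
    "text": {"m1": 0.1, "m2": 0.9, "m3": 0.2, "m4": 0.4},
    "image": {"m1": 0.2, "m2": 0.8, "m3": 0.1, "m4": 0.9},
}
profile = enrich_profile(views, {"m1": 5, "m2": 1}, ["m3", "m4"])
```

In this toy run m3 joins the profile because both modalities agree on its score, while m4 stays out because the text and image predictions disagree.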
A Novel Approach Based on Multi-View Content Analysis and Semi-supervised Enrichment for Movie Recommendation (3)
[Figure: two result charts. Comparison of different recommendation methods (highlighting “Our method”); comparison of before and after improvement (highlighting “After improving”)]
Journal of Computer Science and Technology, 2013, 28(5), 776-787