Topic Evolution and Social Interactions:
How Authors Effect Research
Ding Zhou, Xiang Ji,
Hongyuan Zha, C. Lee Giles
CIKM’06
Advisor: Prof. Hsin-Hsi Chen
Reporter: Yu-Hui Chang
2008/09/10
“Given a seemingly new topic,
from where does this topic
evolve?”
“What author or authors
cause such a transition between
topics?”
Introduction
• To interpret and understand the changes of
topic dynamics in documents, we look for the
social reasons why a topic evolves and develops
dependencies on other topics.
– Consider an actor au associated with a topic ti at
time k. For some reason, this actor meets and
establishes a social tie with actor av, who is mostly
associated with a new topic tj, and they start to
work on the new topic with a higher probability.
Introduction
• We identify the Markov topic transition matrix
via maximum likelihood estimation of the 1st- and
2nd-order constraints brought about by the
hidden social interactions of authors
Introduction
• Our contributions are:
• (1) a model of the topic dynamics in social documents which
connects the temporal topic dependency with the latent social
interactions;
• (2) a novel method to estimate the Markov transition matrix of
topics based on social interactions of different order;
• (3) the use of the properties of finite state Markov process as
the basis for discovering hierarchical clustering of topics,
where each cluster is a Markov metastable state;
• (4) a new topic-dependent metric for ranking social actors
based on their social impact. We test this metric by applying it
to CiteSeer authors.
Problem Definition
Social Network
Problem Formalization
• Transform the document-word matrix DW into the
document-topic matrix DT using LDA
• Using the document-author matrix DA, a collaboration
matrix A is obtained by setting {αi,j}_{A×A} = A = (DA)^T DA
• Let the author set be Λ
• Each document is described by a pair (a, t), where a ⊆ Λ
is the set of authors on the document and t is the
distribution over topics specifying this document
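The collaboration matrix step can be illustrated with a minimal sketch. It assumes DA is a binary document-author matrix stored as a list of rows; the function name `collaboration_matrix` and the toy data are illustrative and do not come from the paper.

```python
def collaboration_matrix(doc_author):
    """Return A = (DA)^T DA for a document-author matrix DA (list of rows)."""
    n_authors = len(doc_author[0])
    collab = [[0] * n_authors for _ in range(n_authors)]
    for row in doc_author:
        for i in range(n_authors):
            if row[i]:
                for j in range(n_authors):
                    # Entry (i, j) accumulates the product over all documents.
                    collab[i][j] += row[i] * row[j]
    return collab


# Toy data (assumption, not CiteSeer): three documents, three authors.
DA = [
    [1, 1, 0],  # document 0: authors 0 and 1
    [1, 0, 1],  # document 1: authors 0 and 2
    [1, 1, 0],  # document 2: authors 0 and 1
]
A = collaboration_matrix(DA)
print(A)
```

With binary entries, each off-diagonal element of A counts the documents two authors share, and each diagonal element counts an author's documents.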
Multiple orders of social
interactions
Idea: “collaborations bring about new topics”.
Social Interactions & Markov
Topic Transition
Model Estimation &
Markov Metastable State
Discovery
Estimating P(ti|tj) then costs O((NLT + NL^2)(A + A^2)),
which is bounded by O(A^2 NLT)
Markov Metastable State
Discovery
• A Markov chain is called nearly uncoupled if:
– the state space can be decomposed into several
disjoint subsets Ai such that ωπ(Ai|Aj) ≈ 1 for i = j
and ωπ(Ai|Aj) ≈ 0 for i ≠ j.
• Each aggregate in a nearly uncoupled Markov
chain M is called a metastable state of M.
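The nearly-uncoupled condition can be checked numerically. The slide does not define ωπ, so this sketch assumes the usual stationary-weighted form: the probability flow from one subset to another under the stationary distribution π, divided by the π-mass of the source subset. The function names, the row-stochastic convention P[i][j] = P(next = j | current = i), and the toy chain are all assumptions.

```python
def stationary_distribution(P, iterations=200):
    """Approximate the stationary distribution of P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def coupling(P, pi, source, target):
    """Stationary-weighted probability of moving from `source` to `target`."""
    mass = sum(pi[i] for i in source)
    flow = sum(pi[i] * P[i][j] for i in source for j in target)
    return flow / mass


# Toy 4-state chain (assumption) with two tightly knit pairs of states.
P = [
    [0.90, 0.08, 0.01, 0.01],
    [0.08, 0.90, 0.01, 0.01],
    [0.01, 0.01, 0.90, 0.08],
    [0.01, 0.01, 0.08, 0.90],
]
pi = stationary_distribution(P)
stay = coupling(P, pi, [0, 1], [0, 1])   # close to 1: subset is metastable
leave = coupling(P, pi, [0, 1], [2, 3])  # close to 0: subsets barely interact
print(stay, leave)
```

In this toy chain the two subsets {0, 1} and {2, 3} satisfy the condition, so each would count as a metastable state.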
Experiment
Data preparation
• Corpus: CiteSeer
– over 739,135 academic documents
– 418,809 distinct authors (after name
disambiguation)
– 1991 to 2004
– Authors with fewer than 50 publications between
1991 and 2004 are eliminated
Data preparation
• Associate each document with the list of
disambiguated authors
• Perform breadth-first search
– Search the co-authorship graph from several
predefined well-known author seeds until the graph
is completely connected or no new nodes are found.
– Michael Jordan and Jiawei Han are chosen as seeds,
from statistical learning and from data mining and
databases, respectively.
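The seed-based search can be sketched as a standard multi-source breadth-first search. The adjacency-list representation, the function name `bfs_authors`, and the placeholder author names are assumptions made for illustration.

```python
from collections import deque


def bfs_authors(coauthors, seeds):
    """Return every author reachable from the seed authors via co-authorship."""
    visited = set(seeds)
    queue = deque(seeds)
    while queue:
        author = queue.popleft()
        for neighbor in coauthors.get(author, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return visited


# Toy co-authorship graph (assumption); names are placeholders.
coauthors = {
    "seed_a": ["x", "y"],
    "x": ["seed_a", "z"],
    "y": ["seed_a"],
    "z": ["x"],
    "seed_b": ["w"],
    "w": ["seed_b"],
    "isolated": [],
}
reached = bfs_authors(coauthors, ["seed_a", "seed_b"])
print(sorted(reached))
```

The search stops on its own once the queue is empty, which corresponds to the "no new nodes" condition above.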
Discovered topics
• Train a Latent Dirichlet Allocation (LDA) model
with the number of topics set to T = 50
– T is small because we only work on a small subset of
the authors in CiteSeer (3,974 authors out of 418,809).
Discovered topics
Markov topic transition
• We use the properties of finite state Markov process
as the basis for discovering hierarchical clustering of
topics, where each cluster is a Markov metastable
state
[Figure label: After permutation]
Markov topic transition
• Permute the matrix Γ such that Γ is
approximately a block diagonal matrix
– The metastable states have in effect reduced the
original Markov transition process to a new
Markov process with fewer states
– Each diagonal block can be seen as a metastable
state, which is a cluster of topics with tight
intra-transition edges.
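The permutation step can be sketched as follows, assuming the cluster label of each topic is already known. Finding those labels is the hard part and is not shown here. The function name `permute_by_cluster` and the toy matrix are illustrative, not taken from the paper.

```python
def permute_by_cluster(matrix, labels):
    """Reorder rows and columns so states with the same label are adjacent."""
    # sorted() is stable, so states keep their relative order within a cluster.
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    permuted = [[matrix[i][j] for j in order] for i in order]
    return order, permuted


# Toy transition matrix (assumption): states 0 and 2 form one cluster,
# states 1 and 3 the other, so the block structure is hidden at first.
G = [
    [0.90, 0.01, 0.08, 0.01],
    [0.01, 0.90, 0.01, 0.08],
    [0.08, 0.01, 0.90, 0.01],
    [0.01, 0.08, 0.01, 0.90],
]
labels = [0, 1, 0, 1]
order, G_permuted = permute_by_cluster(G, labels)
for row in G_permuted:
    print(row)
```

After reordering, the large entries sit in two 2×2 diagonal blocks, and each block plays the role of one metastable state.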
• We observe that diagonal
elements show the existence
of high self-transition
probabilities
• Both matrices are almost
symmetric, meaning the
pair-wise transition between
topics in the same mTopic
are largely balanced.
[Figure labels: data management, data mining; numerical analysis; machine learning]
• Transitions with probability lower than 0.16 are
hidden from the graph to clarify the major
transitions among the five mTopics.
• mT4 (numerical analysis) has been essential among
these mTopics. There is also a transition to mT5
(statistical methods), which is tightly coupled with
research in mT1 (data management and data mining).
• Results also imply that researchers in mT3 (networks)
will be concerned with mT2 (systems).
Who powers the topic transition
• We define a new metric δ(au), the impact ratio of
author au, which measures the difference between the
estimated P(ti|tj) values obtained with and without au.
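The slide does not give the formula for δ(au), so the following is only a rough sketch of the comparison. It assumes the two transition matrices (estimated with and without the author) are already available and uses the sum of absolute entry-wise differences as a stand-in measure. The function name and both toy matrices are hypothetical.

```python
def transition_difference(p_with, p_without):
    """Sum of absolute entry-wise differences between two transition matrices.

    This is an assumed stand-in for the impact measure, not the paper's formula.
    """
    return sum(
        abs(a - b)
        for row_with, row_without in zip(p_with, p_without)
        for a, b in zip(row_with, row_without)
    )


# Hypothetical estimates of P(ti|tj) with and without one author.
p_with = [[0.7, 0.3], [0.4, 0.6]]
p_without = [[0.9, 0.1], [0.4, 0.6]]
delta = transition_difference(p_with, p_without)
print(delta)
```

Under this stand-in measure, an author whose removal leaves the estimated matrix unchanged gets a value of zero, and larger values indicate a stronger influence on the transitions.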
Conclusion
• We relate social actors to their associated
topics and use them to derive topic trends.
• We model the topic dynamics as a Markov
chain and discover the probabilistic
dependency between topics from the latent
social interactions.
Thanks
Any Questions?