EVOLVING SOCIAL GRAPHS CLUSTERING
Affiliation
Athena Vakali
Department of Informatics, Aristotle University, Thessaloniki, Greece
Synonyms
Evolving Community Detection
Glossary
SM-EC: Sequential Mapping-driven Evolving Clustering
TS-EC: Temporal Smoothing-driven Evolving Clustering
MD-EC: Milestones’ detection-driven Evolving Clustering
IA-EC: Incremental Adaptation-driven Evolving Clustering
Definition
Social graphs: In the current Web 2.0 or Social Web era, users’ intensive engagement in social
networking and content sharing applications results in the formation of a massive number of new
associations daily among the actors involved. The types of such associations vary depending on the
application at hand, and they may correspond to either explicit relationships invoked by users’
actions or implicit ones. Apart from users, implicit associations between other types of social
application “actors” can be inferred, such as between web resources (e.g. images, articles, videos)
annotated with the same metadata. In a given web application’s context, such a complex network,
comprising the involved “actors” that belong to different modes and the associations existing
between these modes, constitutes its corresponding social graph.
Associations formed in the context of social networking applications are often multiway, i.e. they
involve multiple entities (e.g. User A commenting on Post P of User B), and are more precisely
captured in a generalized graph structure (i.e. a hypergraph) whose (hyper)edges connect more
than two nodes. However, for simplicity and depending on the required analysis task, they are
usually projected into simple graph structures G(V,E), where V is the set of vertices and E the set
of edges [1]. Due to the existence of different modes, independent subsets of nodes and edges are
identified within V and E, respectively. For example, in a social tagging network, the modes of
users U, resources R, and metadata (i.e. tags) M can be represented in V with three different sets
of nodes, V = {U; R; M}, while the bipartite associations existing between users-resources,
resources-metadata and users-metadata can be represented with the corresponding sets of edges in E,
i.e. E = {UR; RM; UM}. Depending on the creation process, social graph structures can be weighted /
unweighted, unipartite / bipartite / multipartite, directed / undirected, etc. [2]. Figure 1 shows
an example of transforming a tripartite tag assignment graph into a unipartite graph.
Figure 1: Creation of a tripartite graph based on tag assignments in a social tagging network and its projection
into a unipartite graph. u, r, and m stand for the User, Resource and Metadata modes, respectively, whereas
ER and IR represent explicit and implicit associations between the corresponding modes.
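The projection described above can be illustrated with a minimal Python sketch. The tag assignments, the function names, and the choice to weight a resource-resource edge by the number of shared tags are all assumptions made for illustration; the entry itself does not prescribe a particular projection scheme.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical tag assignments: (user, resource, tag) triples.
assignments = [
    ("u1", "r1", "m1"),
    ("u1", "r2", "m1"),
    ("u2", "r2", "m2"),
    ("u2", "r3", "m2"),
]

def tripartite_edges(triples):
    """Split each (u, r, m) triple into the bipartite edge sets UR, RM, UM."""
    ur, rm, um = set(), set(), set()
    for u, r, m in triples:
        ur.add((u, r))
        rm.add((r, m))
        um.add((u, m))
    return ur, rm, um

def project_resources(rm_edges):
    """Project resource-tag edges onto resources only.

    Assumed scheme: two resources are linked with a weight equal to the
    number of tags they share (an implicit association).
    """
    by_tag = defaultdict(set)
    for r, m in rm_edges:
        by_tag[m].add(r)
    weights = defaultdict(int)
    for resources in by_tag.values():
        for a, b in combinations(sorted(resources), 2):
            weights[(a, b)] += 1
    return dict(weights)

ur, rm, um = tripartite_edges(assignments)
resource_graph = project_resources(rm)
```

Here `ur`, `rm` and `um` play the role of the edge sets UR, RM and UM, and `resource_graph` is a weighted unipartite graph over the resource mode. Projecting onto users or tags instead would follow the same pattern with a different edge set.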
Evolving social graphs: Entities and associations in social web applications undergo several states
across time, since new entities (e.g. users) can appear or old ones disappear, while some associations
may last longer than others, e.g. two users may constantly exchange messages, whereas
others may have communicated on just a single occasion. In order to capture the evolution in social
data, evolving representation structures are needed that model the different data states in
successive time-steps [3]. Figure 2 depicts an evolving social dataset as a 3-layered stream G of
individual successive snapshot graphs, each representing the data associations existing at a given time. The
snapshot layer captures evolving social data as a sequence of (static) graph snapshots. The segment
layer partitions the online graph stream into segments comprising successive similar snapshots and
captures their connectivity in structures such as tensors (i.e. generalized matrices with more than
two dimensions). The stream layer represents the graph stream directly, modeled either as a tensor
updated on each upcoming snapshot, or as a simple matrix (using appropriate edge aggregation) or a
multi-graph, updated either on a new association’s arrival (single update) or on an upcoming graph
snapshot (batch update). For the sake of simplicity, simple graphs are often used instead of tensors,
with their edges aggregating past interactions under a given weight updating scheme.
Figure 2: Layers for evolving social data graph structures [3].
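As an illustration of a simple graph whose edges aggregate past interactions, the following Python sketch folds a sequence of snapshots into one weighted graph with a batch update per snapshot. The exponential decay rule, the `decay` parameter and the function name are assumptions chosen for the example; the entry only states that some weight updating scheme is used.

```python
def aggregate_snapshots(snapshots, decay=0.5):
    """Fold a sequence of snapshot edge lists into one weighted graph.

    Assumed updating scheme: before each new snapshot, every existing
    weight is multiplied by `decay`; each edge observed in the snapshot
    then adds 1 to its weight (batch update per snapshot).
    """
    weights = {}
    for edges in snapshots:
        # Older interactions fade out.
        weights = {edge: w * decay for edge, w in weights.items()}
        for u, v in edges:
            key = (u, v) if u <= v else (v, u)  # undirected edge
            weights[key] = weights.get(key, 0.0) + 1.0
    return weights

# Hypothetical stream of three snapshots.
snapshots = [
    [("a", "b"), ("b", "c")],
    [("a", "b")],
    [("a", "b"), ("c", "d")],
]
graph = aggregate_snapshots(snapshots, decay=0.5)
```

With this scheme the recurring association a-b keeps a high weight, while b-c, seen only in the first snapshot, decays towards zero. A single-update variant would apply the same rule per arriving association rather than per snapshot.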
Evolving social graphs clustering: Social graph clustering or community detection is the process of
identifying clusters or latent communities in a social graph. Given a social graph G = (V; E), a
community C can be coarsely defined as a subgraph of G comprising a set of entities that are
associated with a common element (e.g. a topic, an event, an activity or a cause) of interest [2].
Social data clustering is approached as an evolving process across time, assuming that social data
continuously undergo changes that generate successive correlated clustering states, while aiming to
extract knowledge about the evolution of the identified patterns [3]. A proposed classification [3]
divides evolving social graphs clustering approaches into four seminal categories (given in Table 1):
Sequential Mapping-driven Evolving Clustering (SM-EC), Temporal Smoothing-driven Evolving
Clustering (TS-EC), Milestones’ detection-driven Evolving Clustering (MD-EC), and Incremental
Adaptation-driven Evolving Clustering (IA-EC).
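To make the sequential mapping idea concrete, the sketch below clusters each snapshot independently and then maps the clusters of one time-step onto those of the previous one. It is not the method of [4]: connected components stand in for the static clustering algorithm, and the Jaccard similarity with a `threshold` of 0.3 is an assumed matching rule. All function names are hypothetical.

```python
from collections import defaultdict

def connected_components(edges):
    """Stand-in static clustering: each connected component is a cluster."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, clusters = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        clusters.append(frozenset(comp))
    return clusters

def jaccard(a, b):
    """Overlap of two clusters as |a & b| / |a | b|."""
    return len(a & b) / len(a | b)

def map_clusters(prev, curr, threshold=0.3):
    """Map each current cluster to its most similar previous cluster.

    A cluster with no sufficiently similar predecessor maps to None,
    i.e. it is treated as newly formed.
    """
    mapping = {}
    for c in curr:
        best = max(prev, key=lambda p: jaccard(p, c), default=None)
        if best is not None and jaccard(best, c) >= threshold:
            mapping[c] = best
        else:
            mapping[c] = None
    return mapping

# Two hypothetical snapshots, clustered independently at each time-step.
clusters_t1 = connected_components([("a", "b"), ("b", "c"), ("d", "e")])
clusters_t2 = connected_components([("a", "b"), ("b", "c"), ("c", "f"), ("g", "h")])
mapping = map_clusters(clusters_t1, clusters_t2)
```

In this example the cluster {a, b, c, f} is mapped to its predecessor {a, b, c} (a growth event), while {g, h} has no predecessor and is reported as new.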
Table 1: Proposed classification of evolving social graphs clustering approaches
| representative approaches | data model | updating scheme | evolution aspect | clusters’ temporal dependency | clusters’ re-computation requirement |
|---|---|---|---|---|---|
| SM-EC [4] | snapshot graphs/tensors | batch | tracks individual cluster’s evolution | independent clustering at each time-step | required |
| TS-EC [5,6,7] | snapshot graphs/tensors | batch | captures long-term changes in clusters’ composition/ignores short-term concept drifts | assumes clusters evolve smoothly from the previous model/incorporates deviation as regularization term | required |
| MD-EC [8] | snapshot graphs/tensors and segments | batch | identifies time-points when clustering structure drastically changes | generates a new clustering model on an identified drastic change in data’s structure | required |
| IA-EC [9,10] | snapshot graphs/tensors and change stream | single/batch | identifies individual clusters’ evolution through different new data-dependent adaptation processes | builds new clusters by adapting the previous ones | not required |
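The temporal smoothing entry in Table 1 (deviation from the previous model incorporated as a regularization term) can be sketched as a combined cost. The two cost functions below are deliberately simple assumptions, not those of [5,6,7]: the snapshot cost counts edges cut by the clustering, the temporal cost counts nodes whose label changed, and `alpha` is a hypothetical weight balancing the two.

```python
def snapshot_cost(labels, edges):
    """Number of current edges whose endpoints lie in different clusters."""
    return sum(1 for u, v in edges if labels[u] != labels[v])

def temporal_cost(labels, prev_labels):
    """Number of nodes whose cluster label differs from the previous step."""
    shared = labels.keys() & prev_labels.keys()
    return sum(1 for n in shared if labels[n] != prev_labels[n])

def smoothed_choice(candidates, edges, prev_labels, alpha):
    """Pick the candidate minimising snapshot cost + alpha * temporal cost."""
    return min(
        candidates,
        key=lambda lab: snapshot_cost(lab, edges)
        + alpha * temporal_cost(lab, prev_labels),
    )

# Hypothetical previous clustering and current snapshot.
prev_labels = {"a": 0, "b": 0, "c": 1, "d": 1}
edges_now = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

keep = {"a": 0, "b": 0, "c": 1, "d": 1}  # unchanged clustering
move = {"a": 0, "b": 0, "c": 0, "d": 1}  # node c joins cluster 0

low_alpha = smoothed_choice([keep, move], edges_now, prev_labels, alpha=0.5)
high_alpha = smoothed_choice([keep, move], edges_now, prev_labels, alpha=2.0)
```

With a small `alpha` the clustering that best fits the current snapshot (`move`) is selected, whereas a large `alpha` favours staying close to the previous model (`keep`), which is how short-term drifts end up being ignored.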
Cross-References
• Inferring Social Ties and Communities in Social Networks
• Modeling Social Preferences Based on Social Interactions
• Models of Social Networks
• Managing Social Graph Data: Flow of Retrieval and Maintenance
References
1. Peter Mika. 2007. Ontologies are us: A unified model of social networks and semantics. Web
Semant. 5, 1 (March 2007), 5-15.
2. Symeon Papadopoulos, Yiannis Kompatsiaris, Athena Vakali, and Ploutarchos Spyridonos. 2012.
Community detection in Social Media. Data Min. Knowl. Discov. 24, 3 (May 2012), 515-554.
3. M. Giatsoglou and A. Vakali. 2012. Capturing Social Data Evolution via Graph Clustering. IEEE
Internet Computing. Preprint. DOI: 10.1109/MIC.2012.24.
4. G. Palla, A.-L. Barabási, T. Vicsek. 2007. Quantifying social group evolution. Nature 446:7136, 664-667.
5. D. Chakrabarti, R. Kumar, and A. Tomkins. 2006. Evolutionary clustering. In Proceedings of the
12th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '06).
ACM, New York, NY, USA, 554-560.
6. L. Tang, H. Liu, J. Zhang. 2011. Identifying Evolving Groups in Dynamic Multi-Mode Networks. IEEE
Transactions on Knowledge and Data Engineering, 18 Jul. 2011, 72-85.
7. Y.-R. Lin, J. Sun, P. Castro, R. Konuru, H. Sundaram, and A. Kelliher. 2009. MetaFac: community
discovery via relational hypergraph factorization. In Proceedings of the 15th ACM SIGKDD
international conference on Knowledge discovery and data mining (KDD '09). ACM, New York, NY,
USA, 527-536.
8. J. Sun, C. Faloutsos, S. Papadimitriou, and P.S. Yu. 2007. GraphScope: parameter-free mining of
large time-evolving graphs. In Proceedings of the 13th ACM SIGKDD international conference on
Knowledge discovery and data mining (KDD '07). ACM, New York, NY, USA, 687-696.
9. J. Sun, D. Tao, S. Papadimitriou, P.S. Yu, and C. Faloutsos. 2008. Incremental tensor analysis:
Theory and applications. ACM Trans. Knowl. Discov. Data 2, 3, Article 11 (October 2008), 37
pages.
10. H. Ning, W. Xu, Y. Chi, Y. Gong, and T.S. Huang. 2010. Incremental Spectral Clustering by Efficiently
Updating the Eigen-System. Pattern Recognition, 43(1).