Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
COUPLED HIDDEN MARKOV MODELS FOR USER ACTIVITY IN SOCIAL NETWORKS Vasanthan Raghavan† , Greg ver Steeg‡ , Aram Galstyan‡ , Alexander G. Tartakovsky† † ‡ Department of Mathematics, University of Southern California, Los Angeles, 90089, CA, USA Information Sciences Institute, University of Southern California, Marina del Rey, 90292, CA, USA Email: {vasanthr, tartakov}@usc.edu, {gregv, galstyan}@isi.edu ABSTRACT We consider the problem of developing data-driven probabilistic models describing the activity profile of users in online social network settings. Previous models of user activities have discarded the potential influence of a user’s network structure on his temporal activity patterns. Here we address this shortcoming and suggest an alternative approach based on coupled Hidden Markov Models (HMM), where each user is modeled as a hidden Markov chain, and the coupling between different chains is allowed to account for social influence. We validate the model using a significant corpus of user activity traces on Twitter, and demonstrate that the coupled HMM explains and predicts the observed activity profile more accurately than a renewal process-based model or a conventional uncoupled HMM, provided that the observations are sufficiently long to ensure accurate model learning. Index Terms— Activity Modeling and Prediction, Coupled Hidden Markov Models, Social Network Influence 1. INTRODUCTION Over the last decade, social networking websites such as Facebook, Twitter, etc. have become popular with hundreds of millions of users that engage in various forms of activity on those sites. The enormous user-base has led to an explosion of social multimedia content on social networking websites. The goal of fully exploiting these possibilities to improve the efficiency of multimedia applications requires a fundamental understanding of the individual and collective behavior on social networks at a very large scale. Recent research has focused on understanding the properties of networks induced by social interactions, modeling information diffusion on such networks, characterizing their evolution in time, etc. Another important problem that has attracted significant interest is characterizing individual and collective activity patterns in such settings. Understanding temporal patterns of user activity can be leveraged for a number of important applications, such as efficient resource allocation, This work was supported by the Defense Advanced Research Projects Agency (DARPA) under grant # DARPA-W911NF-12-1-0034 at the University of Southern California. user-specific information dissemination, user classification, etc. Perhaps the simplest model of user activity is given by a Poisson process, where each activity event (e.g., tweeting) occurs independently with a time-independent rate. However, recent empirical evidence from various sources (e-mail logs, web surfing, etc.) suggest that human activity has distinctly non-Poissonian characteristics [1, 2]. In particular, the interevent time distribution, which is known to be exponential for the Poisson process, has been shown to be heavy tailed for a number of different activity types. Different approaches have been put forward to explain the non-homogenous nature of the activity patterns [1, 3, 4, 5, 6, 7]. Despite recent progress, however, open questions remain. Most remarkably, existing studies so far have discarded the role of the social network where the user activity takes place, instead describing each user via an independent stochastic process. On the other hand, it is clear that social interactions on networks affect user activity, and discarding these interactions should generally lead to sub-optimal models. The main contribution of this paper is to develop a computational model of user activity which explicitly takes into account the interaction between users by introducing a coupling between corresponding stochastic processes. Specifically, we propose a coupled Hidden Markov Model to describe interconnected dynamics of user activity. In our model, the individual dynamics of each user is coupled to the aggregated activity profile of his neighbors in the network. While a user’s activity may be preferentially affected by specific neighbors, the predictive power of the model can be substantially improved using the aggregated activity of all the neighbors. The hidden states in our model correspond to different patterns in user activity, similar to the approach suggested in [5]. However, here the state transitions are influenced by the activity of the neighbors, and in turn, the activity of the aggregated set of neighbors is influenced by the state of the given user. We perform a number of experiments with data describing user activity traces on Twitter, and demonstrate that the proposed approach has a better performance both in terms of explaining observed data (model-fitting) and predicting future activity (generalization). In particular, we report statistically significant improvement over two baseline approaches, a re- newal process-based model and a conventional (uncoupled) HMM. 2. RELATED WORK Several models have been proposed in the literature for modeling the temporal activity of users’ communication. Approaches based on simple Poisson processes has been proposed for user participation in an online social network setting in [8] and [9]. To explain the bursty features of human dynamics, [3] suggested the priority queue model. An alternative approach based on cascading Poisson processes was suggested in [4]. Although this model has been shown to be consistent with empirical observations, it is computationally intensive. To overcome this issue, Malmgren et al. [5] suggested a simpler two-state HMM for the activity of users in an email/communication network where the states reflect a measure of the user’s activity. Other work has also stressed the importance of distinguishing active versus inactive users [10, 11]. In addition to one-parameter exponential observation density for user activity utilized in [5], more general twoparameter models such as the Weibull (or stretched exponential) have been proposed for modeling inter-post duration in the context of instant-messaging networks [12], accessing patterns in Internet-media [13], and understanding inter-post dynamics for original content in general online social networks [14]. While the theory of HMMs is well-developed [15, 16], HMMs are ill-suited in settings where multiple processes interact with each other and/or information about the history of the process needed for future inferencing is not reflected in the current state. Coupled HMMs have been used in many such settings including models for complex human actions and behaviors [17], freeway traffic [18], EEG classification [19], spread of infection in social networks [20], etc. 3. MODELING ACTIVITY PROFILE OF TWITTER USERS Let Ti , i = 0, 1, · · · , N denote the time-stamps of a specific user’s tweets over the period of interest. We can equivalently define the inter-tweet duration ∆i as the state of the user of interest. Specifically, Qi = 0 denotes that the user is in an Inactive state between Ti−1 and Ti , whereas Qi = 1 denotes that the user is in an Active state. We also assume that Qi (i ≥ 1) evolves in a time-homogenous Markovian manner and is dependent only on Qi−1 and is conditionally independent of Q0i−2 = [Q0 , · · · , Qi−2 ] given Qi−1 . This is a reasonable first approximation of human behavioral dynamics. The state transition probability matrix P = {P[m, n]} is given as P= ∆i ∼ , f1 (·) f0 (·) if Qi = 1 if Qi = 0, for an appropriate choice of f0 (·) and f1 (·). As mentioned earlier, an exponential model for f· (·) corresponds to a Poisson process assumption under either state. While the exponential model is captured by a single parameter, this simplicity often constrains the model-fit either in the small inter-tweet (bursty) regime or large inter-tweet regime (tails). Two-parameter extensions of the exponential such as the Gamma or Weibull density allow a better fit in these two regimes. While both the Gamma and the Weibull models result in similar modeling performance, the Gamma model allows for simple parameter estimate formulas, whereas the Weibull results in solving for coupled equations in the model parameters. Thus, we will restrict attention to the exponential and Gamma model choices in this work. User of interest One of the main goals of this work is to develop a mathematical model for {∆i } = ∆N 1 = [∆1 , · · · , ∆N ]. Q0 Observations € Along the lines of [5], we start by developing a simplistic two-state HMM for {∆i }. Assumption 1 – Underlying States: We assume that a variable Qi , taking one of two possible values {0, 1}, reflects β0,1 1 − β1,0 with P[m, n] = P(Qi = n|Qi−1 = m), m, n ∈ {0, 1}. The density of the initial state Q0 is denoted as P(Q0 = j) = πj , j = 0, 1. Note that the switching from the Inactive state to the Active state in the HMM paradigm can capture the nocturnal/work-home patterns of individual users without any further explicit modeling [5]. Assumption 2 – Observation Density: In general, Qi is hidden (unobservable) and we can only observe {∆i } (or equivalently, {Ti }). In the Inactive state, {∆i } form samples from a “low”-rate point process, whereas in the Active state, {∆i } form samples from a “high”-rate point process. Specifically, let the probability density function of ∆i be given as ∆i = Ti − Ti−1 , i = 1, 2, · · · , N. 3.1. Influence-Free Hidden Markov Modeling 1 − β0,1 β1,0 Other users € Z1 Q1 Q2 Q3 Δ1 Δ2 Δ3 € € Z2 Z3 Time € € € € € € Fig. 1. Coupled HMM framework for user activity. 3.2. Influence-Driven Hidden Markov Modeling A more sophisticated influence-driven model is developed now by making the following additional assumptions: Assumption 3 – Influence of Neighbors: In addition to Qi−1 , the evolution of Qi is also influenced by the aggregated activity of all the users interacting with the user of interest (“neighbors,” for short). For example, a series of tweets from the neighbors can result in a reply/retweet by the user, or a long period of non-activity could induce the user to initiate a burst of activity. Let the variable Zi (i = 1, · · · , N ) capture the influence of the neighbors’ tweets on the user of interest. Examples of candidate influence structures include: i) a binary indicator function that reflects whether there was a mention of the user between Ti−1 and Ti (or not), ii) the number of such mentions, iii) aggregated or an appropriately weightedactivity of the friends of the user that appear in the user’s Twitter timeline, etc. The coupling between {Qi } and {Zi } is simplified by the i Markovian assumption P(Qi |Qi−1 1 , Z1 ) = P(Qi |Qi−1 , Zi ). In general, to keep computational requirements in inferencing low, it is useful to assume that the evolution of Qi is captured by a summary statistic φ(Zi ) : Zi 7→ [0, 1] such that but only weakly dependent on Zi−1 . Motivated by this thinking, we make the simplistic assumption that i−1 P(Zi |Qi−1 1 , Z1 ) = P(Zi |Qi−1 ). While the above assumption can be justified under certain scenarios, more general influence evolution models need to be considered and the loss in explanatory/predictive power by making the simplistic assumption in (1) needs to be studied carefully. This is the subject of ongoing work. Rephrasing, (1) presumes that user aggregation de-correlates Zi from its past history. Further, let the probability density function of Zi be given as g0 (·) if Qi−1 = 0 Zi ∼ g1 (·) if Qi−1 = 1. Combining the above four assumptions, the joint density of the observations {∆i }, the influence structure {Zi }, and the state {Qi } can be simplified as N N P ∆N 1 , Z1 , Q0 = P Q0 , Z1 , Q1 , ∆1 , · · · , ZN , QN , ∆N P(Qi |Qi−1 , Zi ) = P0 (Qi |Qi−1 ) · (1 − φ(Zi )) = P(Q0 ) + P1 (Qi |Qi−1 ) · φ(Zi ) with Pk [m, n] = Pk (Qi = n|Qi−1 = m) and 1 − p0 p0 1 − p1 P0 = , P1 = q0 1 − q0 q1 € qk Inactive Active p1 1 − q1 . 1 − qk pk € Observations € € € Point process “high” rate Point process “low” rate P(Zi |Qi−1 ) N Y P(Qi |Qi−1 , Zi ) i=1 N Y P(∆i |Qi ). i=1 (2) Zi 1 − pk Hidden States N Y i=1 In particular, the choice φ(Zi ) = 11(Zi > τ ) for a suitable threshold τ implies that the user switches from the transition probability matrix P0 to P1 depending on the magnitude of the influence structure. Influence Structure (1) Fig. 2. Pictorial illustration of state-transition evolution. Assumption 4 – Evolution of Influence Structure: Noting that Zi is a function of the activity of all the neighbors (and not a specific user), we assume that Zi is dependent on Qi−1 , The dependence relations that drive the coupled HMM framework for user activity are illustrated in Figs. 1 and 2. 4. MODEL LEARNING AND INFERENCE 4.1. Learning Model Parameters In this work, we study a conventional (uncoupled) HMM and a coupled HMM with different influence structures. It is of interest to infer the underlying states {Qi } that cannot be observed directly. This task is performed with the aid of the observations {∆i } in the HMM setting, and with the aid of {∆i } and the influence structure {Zi } in the coupled HMM setting. In the HMM setting, a locally optimal choice of model parameters is sought to maximize the likelihood function P(∆N 1 |λ). The model parameters are updated via the BaumWelch algorithm [15]. In the coupled HMM setting, a generalized Baum-Welch algorithm that results in the maximizaN tion of the joint likelihood function P ∆N 1 , Z1 |λ is used to learn the model parameters. The efficacy of the different models learned are then studied in two ways. In the first approach, the model parameters learned via the (generalized) Baum-Welch algorithm are used with a state estimation procedure to estimate the most probable state sequence associated with the observations. For the HMM setting, state estimation is straightforward via the use of the Viterbi algorithm [15]. State estimation in the coupled HMM setting requires a generalized Viterbi algorithm, details of which we omit due to space restriction. The observed inter-tweet durations corresponding to the classified states are compared with the inter-tweet durations obtained with the proposed model(s) via a graphical method such as the Quantile-Quantile (Q-Q) plot. Recall that a Q-Q plot plots the quantiles corresponding to the true observations with the quantiles corresponding to the model(s) [21]. If the proposed model reflects the observations correctly, the quantiles lie on the (reference) straight-line that extrapolates the first and the third quartiles. Discrepancies from the straightline benchmark indicate artifacts introduced by the model(s) not observed in the observations and/or features in the observation not explained by the model(s). In the second approach, the fits of the different models to the data are studied via a more formal metric such as the Akaike Information Criterion (AIC), defined as AIC(n) = 2k − 2 log(L), 4.2. Forecasting Given ∆n1 (and Z1n ), forecasting ∆n+1 is of immense importance in tasks such as resource allocation, advertising, anomaly detection, etc. A simple maximum a posteriori (MAP) predictor of the form e n+1 = arg max f (∆n+1 = y|∆n1 , Z1n ) ∆ MAP y = arg max y P where βei = j k=1 X βei fi (∆n+1 = y) i=0 en+1 |Qn =j)P(Qn+1 =i|Z en+1 ,Qn =j) α en (j)P(Z P α e (j) n j fails when fi (∆n+1 = y) is unimodal with the same mode for all i. This is always the case with exponential observation models (mode is 0) and with Gamma models if ki θi < 1 for all i (mode is 0), which is typically the case with the best modelfits for many users. On the other hand, a conditional mean predictor of the form where k denotes the number of parameters used in the model, n the length of the observation sequence, and L the optimized likelihood function for the observation sequence corresponding to the model. The AIC penalizes models with more parameters and the model that results in the smallest value of AIC is the most suitable model (for the observed data) from the class of models considered. In the HMM setting with kH parameters, the AIC corresponding to {∆i } is given as AIC(n) = 2kH − 2 log P(∆N 1 |λ) H = 2kH − 2 log αN (0) + αN (1) , results in large forecasting errors in the Inactive state if the mean inter-tweet durations in the two states are very disparate. To overcome these problems, we consider a predictor of the form where the converged model parameter estimates from the Baum-Welch algorithm are used in αi (j) = P(∆i1 , Qi = j) using the forward procedure. In the coupled HMM setting with kCH parameters, the corresponding AIC metric is N AIC(n) = 2kCH − 2 log P(∆N (3) 1 |Z1 , λ) , e n+1 is the state estimate using the (generalized) where Q Viterbi algorithm with ∆n1 (and Z1n ) as inputs and study the forecasting performance in the Active state with a Symmetric Mean Absolute Percentage Error (SMAPE) metric: N e i 1 X ∆i − ∆ SMAPE(N ) = · 11(Qi = 1). ei N i=1 ∆i + ∆ CH N where model parameter estimates maximizing P(∆N 1 |Z1 , λ) are to be used in (3). While the converged model parameters from the generalized Baum-Welch algorithm locally maxN imize P(∆N 1 , Z1 |λ), they do not (necessarily) maximize N N P(∆1 |Z1 , λ). Thus, with the model parameters from the generalized Baum-Welch algorithm as initialization, a local (gradient) search in the parameter space is performed to maxN imize P(∆N 1 |Z1 , λ). With these model parameters, an upper bound to AIC(n) is obtained as CH AIC(n) CH ≤ AIC(n) = 2kCH − 2 log α eN (0) + α eN (1) α bN (0) + α bN (1) where α ei (j) = P(∆i1 , Z1i , Qi = j) and α bi (j) are computed using the generalized forward procedure. e n+1 ∆ CM e n+1 = ∆ = E [∆n+1 |∆n1 , Z1n ] = k=1 X βei E [∆n+1 |Qn+1 = i] i=0 k=1 X e n+1 = i|∆n1 E [∆n+1 |Qn+1 = i] 11 Q i=0 The SMAPE metric is seen as a percentage error and is bounded between 0% and 100% with a smaller value indicating a better model for forecasting. 5. NUMERICAL RESULTS The dataset used to illustrate the efficacy of the models proposed in this work is a 30-day long record of Twitter activity described in [22]. This dataset consists of Nt = 652, 522 tweets from Nu = 30, 750 users (with at least one tweet). The time-scale on which the tweets are collected is minutes. More details on the different aspects of the dataset can be obtained at [22]. Since reliable model learning can be accomplished only for users with sufficient activity, we focus on users with a large number of tweets over the data collection period. There were 223 users with over 600 tweets and 115 users with over 1000 tweets. We consider the following models for the activity profile of a user: i) conventional two-state HMM, ii) coupled HMM with a binary influence structure that is set to 1 when there is a mention of the user and 0 otherwise, and iii) coupled HMM with the number of such mentions as the influence structure. While exponential and Gamma densities are considered for the observations, geometric, Poisson and shifted zeta densities are considered for the number of mentions. For a typical user, it is consistently seen that the coupled HMM with a geometric density for the number of mentions and a Gamma density for the observations results in the best fit from the class of models studied. The observation that the two-parameter Gamma density for the observations results in a better fit than that achieved with the exponential density has also been made in [12, 13, 14]. It is also seen that conventional HMMs are competitive with the more sophisticated coupled HMMs for small values of n. In general, the coupled HMM works relative to the HMM as follows. A conventional HMM declares a period as Inactive provided that ∆i is large. On the other hand, the coupled HMM declares a period as Inactive (independent of the nature of Qi−1 or Zi ) provided that ∆i is large, or when ∆i is small and in addition, Zi is also small and Qi−1 = 0. In other words, if the user is in the Inactive state and the influence structure does not suggest a switch to the Active state, a small inter-tweet period is treated as an anomaly rather than as an indicator of change to the Active state. Thus, unlike the HMM setting where the state estimate depends on the magnitude of ∆i , the coupled HMM is less trigger-happy in the sense that it considers the magnitude of ∆i in the context of neighbors’ activity before declaring a state as Active or Inactive. We now study the model-fits across a corpus of 100 users with different numbers of tweets and mentions over their periods of activity. Given that the exponential observation density consistently under-performs relative to a Gamma density, we henceforth focus on the performance of i) Model a — conventional HMM with Gamma density, and ii) Model b — coupled HMM with geometric influence structure and Gamma density. We define the relative AIC and SMAPE gain metrics as ∆AIC = AIC − AIC Model a Model b ∆SMAPE = SMAPE − SMAPE . Model a Model b For all the users, it is observed that a local optimum (to reasonable accuracy) is achieved by the generalized BaumWelch algorithm within 20-30 iterations and independent of the model initializations. Fig. 3(a)-(b) plots the histogram of ∆AIC for the corpus of 100 users with n = 500 and n = 1000, respectively. From Fig. 3, it can be seen that Model b out-performs Model a for a large fraction of the users and this out-performance gets better as n increases. Specifically, the fraction of users for whom the probability that Model a minimizes the information loss (relative to Model b) is less than 1% is 25% with n = 500 and 72% with n = 1000, respectively. The corresponding figures at a relative likelihood of 10% are 33% with n = 500 and 85% with n = 1000. Similarly, Fig. 3(c) plots the histogram of ∆SMAPE for n = 500 and it can again be seen that Model b is better than Model a in terms of predictive power for a large fraction of users. In general, the following conclusions can be made based on our studies: i) Model b would be most useful if there are enough observations and influence structure observations to ensure the accurate learning of the sophisticated model, ii) Model a would be most useful if there are enough observations, but not enough influence structure observations, iii) The simplest choice of an uncoupled HMM with exponential density would be most useful for very limited observations. 6. CONCLUDING REMARKS In this work, we have introduced a new class of coupled Hidden Markov Models (CHMM) to describe temporal patterns of user activity which incorporate the the social effects of influence from the activity of a user’s neighbors. We have shown that the proposed model results in better explanatory and predictive power over existing baseline models such as a renewal process-based model or an uncoupled HMM. While there have been many works on models for user activity in diverse social network settings, our work is the first to incorporate social network influence on a user’s activity. It would be of interest to develop hierarchical social influence driven models for groups of users as well as better understand those facets of a user’s social network that influence him the most. Combining temporal activity patterns with unstructured information such as the topic of discussion could result in better predictive performance than temporal activity alone. 7. REFERENCES [1] A. Vázquez, J. G. Oliveira, Z. Dezsö, K.-I. Goh, I. Kondor, and Albert-Laszlo Barabási, “Modeling bursts and heavy tails in human dynamics,” Phys. Rev. E, vol. 73, no. 3, pp. 036127, Mar. 2006. [2] A. Vázquez, B. Racz, A. Lukacs, and A-L Barabási, “Impact of non-Poissonian activity patterns on spreading processes,” Phys. Rev. Lett., vol. 98, no. 15, pp. 158702, 2007. [3] A-L Barabási, “The origin of bursts and heavy tails in human dynamics,” Nature, vol. 435, pp. 207, 2005. [4] R. D. Malmgren, D. B. Stouffer, A. E. Motter, and L. A. N. Amaral, “A Poissonian explanation for heavy tails in e-mail communication,” Proc. Nat. Acad. Sci., vol. 105, no. 47, pp. 18153–18158, 2008. (a) (b) (c) Fig. 3. Histogram of AIC gain with coupled HMM for a corpus of 100 users with (a) n = 500 and (b) n = 1000 observations. (c) Histogram of SMAPE gain with coupled HMM for n = 500. [5] R. D. Malmgren, J. M. Hofman, L. A. N. Amaral, and D. J. Watts, “Characterizing individual communication patterns,” Proc. ACM SIGKDD Conf., Paris, pp. 607– 615, June-July 2009. [14] L. Guo, E. Tan, S. Chen, X. Zhang, and Y. Zhao, “Analyzing patterns of user content generation in online social networks,” Proc. ACM SIGKDD Conf., Paris, pp. 369–378, June-July 2009. [6] Juliette Stehlé, A. Barrat, and G. Bianconi, “Dynamical and bursty interactions in social networks,” Phys. Rev. E, vol. 81, pp. 035101, Mar. 2010. [15] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proc. IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989. [7] D. Rybski, S. V. Buldyrev, S. Havlin, F. Liljeros, and H. A. Makse, “Communication activity in a social network: relation between long-term correlations and interevent clustering,” Scientific Reports, vol. 2, pp. 560, 2012. [16] J. A. Bilmes, “A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models,” Tech. Rep., Intern. Comput. Sci. Inst., Berkeley, CA, Apr. 1998. [8] L. Guo, S. Chen, Z. Xiao, E. Tan, X. Ding, and X. Zhang, “Measurements, analysis and modeling of BitTorrent-like systems,” Proc. ACM SIGCOMM Internet Meas. Conf., New Orleans, pp. 35–48, Oct. 2005. [9] J. Leskovec, L. Backstrom, R. Kumar, and A. Tomkins, “Microscopic evolution of social networks,” Proc. ACM SIGKDD Conf., Las Vegas, pp. 462–470, Aug. 2008. [17] M. Brand, N. Oliver, and A. Pentland, “Coupled hidden markov models for complex action recognition,” Proc. IEEE Conf. Comput. Vis. Patt. Recog., San Juan, pp. 994–999, June 1997. [18] J. Kwon and K. Murphy, “Modeling freeway traffic with coupled HMMs,” Tech. Rep., University of California, Berkeley, CA, May 2000. [10] D. M. Romero, W. Galuba, S. Asur, and B. A. Huberman, “Influence and passivity in social media,” Soc. Sci. Res. Netw. Work. Pap. Ser., Aug. 2010. [19] S. Zhong and J. Ghosh, “HMMs and coupled HMMs for multi-channel EEG classification,” Proc. Intern. Joint Conf. on Neural Netw., Honolulu, pp. 1154–1159, May 2002. [11] C. Tan, J. Tang, J. Sun, Q. Lin, and F. Wang, “Social action tracking via noise tolerant time-varying factor graphs,” Proc. ACM SIGKDD Conf., Washington, DC, pp. 1049–1058, July 2010. [20] W. Dong, A. Pentland, and K. A. Heller, “Graphcoupled HMMs for modeling the spread of infection,” Proc. Conf. on Uncertainty in Art. Intel., Catalina Is., pp. 227–236, Aug. 2012. [12] Z. Xiao, L. Guo, and J. Tracey, “Understanding instant messaging traffic characteristics,” Proc. IEEE Intern. Conf. Dist. Comput. Sys., Toronto, p. 51, June 2007. [21] R. Gnanadesikan, Methods for Statistical Data Analysis of Multivariate Observations, Wiley-Interscience, 2nd edition, 1997. [13] L. Guo, E. Tan, S. Chen, Z. Xiao, and X. Zhang, “The stretched exponential distribution of Internet media access patterns,” Proc. ACM Symp. Prin. Dist. Comput., Toronto, pp. 283–294, Aug. 2008. [22] S. A. Macskassy, “On the study of social interactions in Twitter,” Proc. Intern. Conf. Weblogs and Soc. Media, Dublin, Ireland, pp. 226–233, June 2012.