Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Group Membership and Diffusion in Virtual Worlds David A. Huffaker, Chun-Yuen Teng, Matthew P. Simmons, Liuling Gong, Lada A. Adamic School of Information, University of Michigan, Ann Arbor, MI Email: [email protected], (chunyuen,mpsimmon,llgong,ladamic)@umich.edu Abstract—Virtual goods continue to emerge in online communities, offering scholars an opportunity to understand how social networks can facilitate the diffusion of innovations. We examine the social ties for over one million user-to-user virtual goods transfers in Second Life, a popular 3D virtual world, and the unique role that groups play in the diffusion of virtual goods. The results show that individuals – especially early adopters – are more likely to adopt a virtual good when they belong to the same groups as previous adopters. We also find that groups exhibit bursty adoption, in which many individuals adopt in short succession. In addition, we show that adoption activity within a group depends on the group’s size and interactivity. Our work provides insights into theories of social influence and homophily. I. I NTRODUCTION Virtual goods continue to become incorporated into social network services, gaming platforms and virtual worlds, allowing users to enhance their online experiences or customize their online identities. These virtual items can be purchased for a small payment or for free, and often manifest as commodities (e.g., fashion, brand name or celebrity gear) or game accessories (e.g., a World of Warcraft shield or a wagon in Zynga’s Farmville) [1], [2]. Virtual goods represent a way to measure information being transmitted over social networks and provide broader insights into our understanding of social influence and diffusion. While previous research in communication studies has demonstrated the important role that social ties play in the spread of information [3], little empirical work has been able to test these propositions on a large-scale. More recent research on diffusion has highlighted the intricate and intertwined roles of social influence (i.e., a friend or opinion leader impacts one’s propensity to adopt an idea or innovation) and homophily (i.e., sharing similar interests with others increase one’s exposure to new innovations and propensity to adopt) [4], but it is unclear which variables best measure these complex social behaviors. This research examines the role of social influence and homophily in large-scale virtual goods adoption with a particular focus on the role that groups play in the process of diffusion. It is motivated by the following research questions: What social factors increase the likelihood that a user will adopt a virtual good? What role does group membership play in the adoption of virtual goods, and to what extent do group characteristics impact adoption? To answer these questions, we examine how virtual goods spread along social networks using a unique data set of online interactions in a popular 3D immersive virtual world called Second Life (SL). SL allows users to create and distribute virtual goods, and uses an in-game currency that can be purchased with real money. Because 95% of the content in SL is actually created by the users, and because an astounding number of paid and zero-cost transactions occur in SL [5], it remains an ideal venue for studying why some virtual goods are adopted by users while others fail to diffuse. Second Life also allows us to examine the unique role of groups in the process of diffusion. Users can join up to 25 formal groups based on common interests or identities. SL users can send messages to other group members, hold events and meetings, and share information about products and locations inside the virtual world. II. T HEORETICAL F RAMEWORK Two popular theoretical frameworks have been used to explain diffusion: (a) theories of social influence ; and (b) theories of homophily. Theories of social influence suggest that exposure to influential individuals increases the likelihood of adopting similar beliefs or attitudes [6]. Friends, peers, family, or more distant leaders, cultural figures or celebrities all have the potential to impact another person’s attitudes or behaviors. Examples in recent research include evidence that obese friends or partners can increase one’s own weight gain [7] or that local social networks can influence others to quit smoking [8]. Research in online settings also show support for theories of social influence: digital information tends to spread within the same social circles [9]; adoption rates increase as friends begin adopting an item [10], [11]; neighbors provide social reinforcement of a particular health behavior [12]. Theories of homophily posit that individuals seek out others who follow the same interests and self-categorization (e.g., auto restoration hobbyists, punk rockers), or belong to the same formal or informal groups (e.g., ethnicity, IT workers) [6]. Homophily not only increases exposure to ideas or innovations as one coalesces with the group, but group members are more likely to adopt because they already have a propensity towards the attitude, behavior or idea. In other words, it is not that hanging with the wrong crowd will increase bad behavior, but that individuals with a propensity for bad behavior might gravitate toward the wrong crowd [13]. Recent online studies lend support for theories of homophily and diffusion: a study of mobile app adoption finds that homophily explains more than half of diffusion [14]; a study of email product recommendations finds that the most popular items represent niche interests, suggesting that there is a low GROUP B GROUP A GROUP C which small groups can rely on strong connections of a few individuals while large groups require continuous changes in membership [19]. Our secondary contribution is to provide more insight into the relationship between group-level social structure and diffusion. III. M ETHODOLOGY A. Data Fig. 1. Conceptual model of how diffusion spreads via social influence and homophily. The origin of the virtual good (in black) influences the adoption of several friends, one of whom is able to spread the asset further because of shared interest among fellow group members. probability of crossing over to individuals who do not share that particular interest [15]. These theories are not independent. Scholars argue that homophily selection results in correlated attitudes and behaviors, making it difficult to determine a causal relationship between social influence and diffusion [4]. Some even argue that social influence and homophily are so highly intercorrelated that they can not be distinguished from one another [16]. This is not unique to offline settings; a study of social influence in Wikipedia shows that users become rapidly similar after their first online communications, and this keeps increasing over time, making it difficult to tease social influence and homophily apart [17]. This research intends to provide further insights into the ways in which we can disentangle social influence and homophily in online interactions. In Second Life, users can distribute virtual goods to one another, communicate through a chat system, add each other to a friend list, or join up to 25 formal groups based on interest or self-categorization. This allows us to study the various types of interaction that users engage in, and measure the extent to which diffusion is related to friendship or shared group membership. We argue that while social ties impact adoption, being a member of a group increases the diffusion of virtual goods. This is illustrated in Figure 1: the node in black shares a virtual good to several friends, but the asset continues to diffuse because nodes in gray are in groups based on shared interests. This even allows the asset to spread across multiple groups with a potential boundary. Since Group C is focused on a completely different topic, the asset does not penetrate. We also argue that not all groups share the same communication patterns or structural signatures, and it is likely that some attributes such as the size and communication patterns should impact diffusion as well. For example, participation equality and dynamism, group size and diversity, maturity and cohesiveness all contribute to seeking and sharing information [18]. The structural signatures necessary for the health and stability of large and small groups also differ, in Our data includes all virtual goods adoptions that took place in SL between November and December 2008 for two types of assets, provided the user still held the asset at the time of data collection. The first, landmarks, are bookmarks that avatars can use to visit a specific location within the virtual world. There may be multiple landmarks with different annotations that correspond to identical coordinates, facilitating the tracking of diffusion of information about a location through different initial landmark instances. The second kind of asset is a gesture, which allows the avatar to execute a sequence of motions, or to emit a particular sound. Landmarks and gestures were selected because they are two types of nondivisible virtual goods that can easily be traced. In addition, we have weekly snapshots of the social network (buddy) graph for SL users, as well as weekly updates on their group affiliations for the same Nov.-Dec. 2008 time period. We also have data on fee transactions that have been described in previous studies [5], [10], [20], but which here are used only to determine whether a user is active in SL during the period in question. Data on individual users includes information on their experience and activity level: when they joined SL, how many days they were active and how much revenue they generated. B. Sample Our goal is to be able to predict which users will adopt which asset, given certain types of information such as the number of their friends who had previously adopted a particular asset, or the number of groups the user shares with previous adopters of the asset. At first it may seem difficult to determine which of hundreds of thousands of users will adopt one of the millions of assets. However, the problem becomes much more scoped once we know that a user’s friend has adopted an asset. In previous work, we showed that a large portion of asset transfers occurs between users who are friends in SL, and users are more likely to adopt after their friends do, even if they do not obtain the asset from the friend directly [10]. We chose our first sample accordingly by examining only adopters and non-adopters who both have at least one adopting friend. This allows us control for a user’s knowing someone who adopted while examining our variables of interest described below. In other words, for each asset we select one adopter who had a friend subsequently adopted the same asset, and then record that adopting friend and as well as one friend who did not adopt. We omit assets where no pair of adopters are friends. In terms of asset adoptions, we see 1,092,094 adoptions, which represent 546,047 unique assets that were exchanged among 235,467 unique users. 1.00 0.20 0.10 0.05 P(X>=x) 0.50 gesture landmark 1 5 10 50 100 500 1000 # of adopters per asset Fig. 2. The distribution of number of adopters for two types of virtual goods. As a control, we also construct a sample where neither adopters nor non-adopters share an adopting friend. For this, we select one adopter per asset, and to balance the sample, we select one non-adopter who actively transacted during the time that the asset was being adopted. This excludes users who may have been non-adopters due to inactivity. However, because there is no record of adopting users subsequently discarding the item. predictive accuracy is compromised by lack of complete information about all adoptions. For our examination of group characteristics conducive to diffusion, we analyze the network characteristics of 61,722 unique groups. We limit our analysis to groups with ten (10) or more members, and for virtual goods with at least five (5) adopters. From Figure 2, we can see that while a few assets enjoy widespread popularity, most were adopted by just a few individuals. The distribution of gestures displays a step function because avatars by default have a certain number of gestures that tend to remain unless discarded from inventory. It is also interesting to note that adopters tend to occupy a distance of two degrees in the buddy graph from the first adopter. C. Variables of Interest Our dependent variables focus on the likelihood of adoption, as well as total adoptions within a group. Our independent variable selection captures many unique social behaviors including relationships with friends, groups and other adopters. We also control for how long users have been in SL, their previous adoption behavior, and the adoption rates of the groups they inhabit. Because SL users can join up to 25 different groups, we develop an aggregated measure of group similarity between adopters and non-adopters. We also present a variety of social network variables used to measure characteristics of these groups. Asset Adoption. This dichotomous outcome variable represents whether a unique asset has been adopted by a particular user. Number of Days in SL: Total number of days a user has spent logged into Second Life. Number of Previous Adoptions: Total number of other virtual assets that the user has adopted in the past. Adopt Time: Ratio of the adopter’s position in the adoption chain (i.e., first adopter, second adopter, nth adopter) divided by the total number of adopters of the asset. For example, t = 0.1 means that the user was among 1/10th of users to adopt. Number of Previously Adopting Friends: Total number of friends that previously adopted the same virtual asset. Friends are users the user has added to their buddy list. Distance to First Adopter: Length of the shortest path from the first adopter of an asset in SL to any other adoptions that occur afterwards. The first adopter is also likely to be the creator. Number of Friends Shared with Previous Adopter: In the sample where the paired adopter and non-adopter both share the same adopting friend, this is a proxy for the strength of tie with the adopting friend. Number of Groups: Total number of SL groups a user belongs to during the sampling frame. Group Similarity: Cosine similarity between the group membership of an adopter and the groups of previous adopters. |Gi · Gj | (1) � Gi �� Gj � We assign the following weight Wk to each group k: Wk = log[(total # users)/(# users in group k)]. When comparing one user against all previous adopters, we take the average pairwise similarity between that user and all previous adopters. Crowding Factor: Percentage of adopters in each of the user’s groups. It is calculated as: Si,j = Crowdingf actor = � k Wk # adopters in Group k # users in Group k (2) Burstiness: Binary variable specifying whether a burst event is detected in the asset adoption time series. Group Adoptions: Total number of assets adopted by all users in the group. The asset can be exchanged with any other user in SL. Group Transfers: Total number of assets transferred between group members. Number of Group Members, N: Total number of users that have joined or were already members of a group during the one-month period. Number of Edges E: Total Number of connections between group members in the social graph or through transactions. Average Degree: E/N. Clustering Coefficient: The number of closed triads as a proportion of all connected triples. It represents how cliquish social ties are within a group. Largest Connected Component (LCC): The LCC is the largest subset of the network by which one can reach any node from any other node within the subset by traversing edges. Change in Size: Positive or negative change in group membership between November and December 2008, normalized by the size by the group membership in November. Number of Active Users: Users in the group who actively communicated or traded during the one-month period. The groups in this sample demonstrate a large amount of variability in the above characteristics. While 36% of groups have at least 10 members, the top 1% of groups have 500 or more members. There is a high rate of overall adoptions by group members (M = 1531.89), as well as within-group transfers (M = 856). There is a high degree of clustering (M = .29), a medium average degree (M = 5.6), and a large connected component (M = .51). These groups also show high variability in their membership turnover, ranging from losing two members in a month to gaining 4,000 members. IV. R ESULTS A. Predicting Virtual Goods Adoption In order to predict adoption, we include attributes of the individual users, their interactions with other adopters, and adoption activity within the groups to which they belong. We rely on two data sets to build our models. In the first test data set, we select two users for each asset who have a friend that has previously adopted it. One user adopts the asset after the friend, and the other user does not. This allows us to utilize a logistic model where asset adoption serves as a binary outcome variable. As shown in Table I, all the variables in the logistic regression are significant. We note that because our sample is very large, some of the variables — while significant — do not explain much of the variance. In order to examine the individual impact of each variable on the model, we rely on cross-validation measures [21]. Cross-validation (CV) randomly selects ten “folds” in the data, removes each fold and re-fits the regression model repeatedly, in order to estimate how well the predictive model performs. Values above 0.5 suggest that the variable(s) is predicting better than a random guess, i.e., that everyone in our balanced data set would have 50% chance of adopting. The model shows that group similarity (CV = 0.63) and crowding factor (0.61) are the most predictive individual variables in determining if an adopter’s friends will adopt. The group similarity measure shows that the more groups that a potential adopter shares with those who previously adopted, the higher the likelihood of adoption (Odds Ratio = 1.26). The crowding factor also reveals that the more adoption activity found in a users’ groups, the more prone its members are to adopt (Odds Ratio = 1.11). The next three most predictive variables represent one’s propensity to adopt virtual goods in general, how embedded an individual is in a particular social circle with the previous adopter, and the social distance from the first adopter. The number of friends that are shared with the previous adopter, a proxy for tie strength, shows a positive impact on adoption (Odds Ratio = 1.40). The more items a user has adopted in the past, the more likely she is to adopt in the future (Odds Ratio = 1.62). The further removed a user is from the first adopter, who is also potentially the creator, the less likely that the former will adopt an asset (Odds Ratio = .93). The length of individual users have been in SL does not affect their likelihood of adoption (Odds Ratio = .99). The number of adopting friends tells us whether the user had more than just one friend who adopted before him. In the full model, having additional friends who have adopted corresponds to a lower likelihood of adopting (Odds Ratio = .64). However, in a simple logistic regression the number of adopting friends is a positive predictor. One interpretation is that both the adopter and non-adopter in the test set have potentially been exposed to the asset through the shared friend who adopted. Once we account for the user’s potential interest in the asset through the group similarity variable, the number of adopting friends actually signals that this user has resisted adoption through multiple potential exposures. Hence he is less likely to eventually adopt. We then apply the same analysis to the second data set, where the sampled adopter and non-adopter do not share a common adopting friend. We find qualitatively consistent results with one important exception. As shown in Table II, the number of adopting friends variable has a much higher CV accuracy in the second sample (0.73 vs. the earlier 0.53), which is expected since the previous sample had guaranteed at least one adopting friend, but this one does not. The variables capturing the connection between groups and adoption also have a higher individual predictive ability, and the combined model achieves a CV accuracy of 0.86. We also measured specifically how much prediction improved when group variables are included. In our first sampled dataset, we find that the full model including group similarity and crowding factor shows roughly 5.5% greater accuracy (CV = 0.68) than the full model without group similarity (CV = 0.63). In our second sampled dataset, the accuracy of the full model with group similarity and crowding factor is still higher (CV = .86) than the model without group similarity and crowding factor (CV = 0.83). TABLE I VARIABLES PREDICTING WHETHER FRIENDS OF AN ADOPTER WILL ADOPT # Days in SL # Adopting Friends # of Groups Distance from 1st Adopter # other Adopted Assets # Friends with Prev Adopter Crowding Factor Group Similarity Combined model Estimate -0.01 -0.44 -0.52 -0.07 0.48 0.34 0.10 0.23 Std. Error 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 z value -10.13 -140.46 -239.48 -17.72 438.47 316.52 225.86 341.87 CV acc. 0.50 0.53 0.53 0.56 0.58 0.58 0.61 0.63 0.68 In sum, the results suggest that group similarity and crowding factor have a strong impact on the likelihood of adoption. To better illustrate this, we plotted the fitted regressions for both group similarity and crowding factor, as well as the number of adopting friends. As Figure 3 shows, the impact of the number of adopting friends has a flatter slope than TABLE II VARIABLES PREDICTING ADOPTION OF A VIRTUAL GOOD WHERE ADOPTERS AND NON - ADOPTERS DO NOT NECESSARILY SHARE AN ADOPTING FRIEND # Days in SL # of Groups # Adopting Friends # Other Adopted Assets Distance from 1st Adopter Crowding Factor Group Similarity Combined model Estimate 0.01 -0.40 3.11 1.04 -8.39 0.21 0.94 Std. Error 0.00 0.00 0.02 0.00 0.04 0.01 0.00 z value 6.04 -113.94 169.93 424.24 -238.68 27.64 341.61 CV acc. 0.50 0.66 0.73 0.74 0.75 0.77 0.78 0.86 in the “middle”. Predicting whether someone adopts an asset might further depend on the overall popularity of that asset. Therefore, we classify the assets by the number of adopters into popular (i.e. more than 100 adopters), less popular (between 6 and 12 adopters), and least popular assets (fewer than 6 adopters). As can be seen in Table III, when we try to predict which of two friends of an adopter will themselves adopt, it is the early adopters who are easiest to predict. However, it is the late adopters who can most easily be differentiated from a non-adopter who does not necessarily have an adopting friend. We believe there are two contributing factors. The first is that groups may be more instrumental in early adoption. However, having an adopting friend is also an important factor in adoption. Therefore, in the case where we control for having an adopting friend, it is the early adopters who are more predictable using information on group affiliation. By contrast, late adopters are more likely to have had a previously adopting friend, and are therefore more predictable when compared to non-adopters who do not have adopting friends. We also observe for both sampling methods that adoption of the least popular assets is most predictable. It is the niche items that are most easily predicted by information on whether one’s friends and groups are adopting. TABLE III C ROSS - VALIDATION ACCURACY FOR PREDICTING ADOPTION BROKEN DOWN BY ADOPTION TIME AND ASSET POPULARITY. F DESIGNATES THAT BOTH THE SAMPLED ADOPTER AND NON - ADOPTER SHARE A FRIEND WHO PREVIOUSLY ADOPTED ; NF DESIGNATES THAT THE SAMPLED ADOPTER AND NON - ADOPTER DO NOT NECESSARILY SHARE AN ADOPTING FRIEND . Fig. 3. Probability of adoption based on group similarity, crowding factor and number of adopting friends among adopters and non-adopters who have a friend who previously adopted the virtual good. All predictors were standardized to the same scale. the impact of these two group-related variables. In the next section, we examine how these patterns differ among early and late adopters. B. Differentiating Early vs. Late Adoption and Item Popularity Groups may differ in their ability to explain early as opposed to late adoption. If groups serve as active distributors of information about an item, they may be instrumental in dispersing early word of a new available item, or they may give late-comers an opportunity to catch up with what they had missed. On the other hand, even if groups are simply a proxy for user interests, it may be that early adopters tend to share more characteristics with one another relative to late adopters [14]. To understand this further, we separate all adoptions into early, middle, and late. Given differing time spans over which different assets are adopted, we normalize the time span from the first to last adoption event. Early adoptions occur in the first 1/5 of the time interval, late adoptions in the last 1/4, and the rest are labeled as being Adoption time EarlyF M iddleF LateF EarlyN F M iddleN F LateN F Popular 0.758 0.682 0.667 0.820 0.852 0.892 Less popular 0.736 0.714 0.671 0.828 0.862 0.880 Least popular 0.787 0.726 0.682 0.841 0.873 0.900 C. Exploring Asset Adoption Through Group Coverage The regressions in the previous sections show a correspondence between sharing groups with adopters and being likely to adopt. This effect did not merely reflect overall engagement with the virtual world and its groups, since the number of groups to which a user belongs is not a significantly predictive variable. Rather, adopters of the same assets are likely to share the same specific groups. To illustrate this further, we computed a minimum number of groups required to “cover” all users who adopt an asset and who are also members of at least one group. Omitting users who are not members of any group reduces the per-asset adoption count by less than 1% for 91% of assets. We then find a minimal subset of groups where the members of those groups cover all adopters of the asset, according to the greedy algorithm developed in [5]. We then create a baseline by calculating the number of groups it would take to cover any random set of users who adopted any asset of a given size. adopters of same asset ● ● ● ● ● ● ● ● ● ● ● ● 0.06 ● ● ● ● ● ● ● 0.04 ● ● ● ● ● ● ● ● ● 10 ● 100 ● ● ● ● 1000 ● ● ● with ≥ 1000 adopters, 2,754 medium assets with between 500 and 1000 adopters, and 5,082 small assets with < 500 adopters. Figure 5 shows that the minimum required number of groups to cover adopters is smaller for those assets which are adopted in bursts. This implies that bursts of adoption tend to occur within groups. This further suggests that groups are either a rapid medium for the diffusion of information, or that they in other ways correspond to pockets of susceptible individuals in the social graph through which information can spread rapidly. 10000 10000 0.12 random set of adopters ●● 0.08 0.10 ●● 0.02 % of group contributed to coverage for largest group Data Source Number of users adopting asset (grouped only) dataset D. Group Membership and Adoption Bursts An asset adoption time series gives the adoption frequency of an asset over a large time scale. A burst in the time series suggests an abnormal behavior in the series where an unexpectedly large number of events, e.g. adoptions, occur within some time window. One might expect friends and members of the same group to adopt during a narrower time window than all adopters of an asset. We therefore ask whether those individuals involved in the bursts are more likely to be linked through friendship or group ties. Here we select two methods of identifying such bursts. The first, a fixed window burst detection, looks simply at whether a high proportion of all adoptions occurs within a short, 7day time window. We identify those assets where 90% of adoptions occur within a week, those where a maximum of between 20% and 90% of adoptions occurred within any single 7-day period, and assets with no such period of concentrated adoption. This definition can be applied regardless of the total number of adoptions, although assets with fewer adoptions are more likely to have a majority of those adoptions occur within a short time window. The second method, elastic window burst detection using a wavelet technique [22], must have a sufficient number of observations for a burst to be detected. We randomly sampled one fifth, or 11,398 such assets adopted at least 250 times, among which 6809 assets (or about 60%) contained a burst event. The assets are then classified into three categories according their total number of adopters, i.e., 3,562 large assets 1000 90% burst No burst Random set of adopters ● ● ● ● ● ● ● 100 ● ● ● ● ● ● ● ● ● ● ● ● ● 10 ● ● ● ● ● ● 1 As can be seen in Figure 5, it takes on average a smaller number of groups to cover users who are adopting the same asset than to cover users who are adopting assets in general. This pattern, consistent with our regression results, suggests that the individuals who choose to adopt an asset are more likely to share the same groups. Furthermore, as Figure 4 shows, adopters can comprise a small but significant portion of an individual group’s membership. As expected, randomly chosen users do not comprise a significant portion of any group. Number of groups required to cover 20% burst Fig. 4. For each asset, we select the group that contains the most adopters of that asset. We then show what fraction of the group is comprised by the asset’s adopters. We compare this against equally sized random sets of adopters. 3−7 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 8−20 20−50 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 50−150 150−400 400−1.1k 1.1k−3k 3k−8k 8k+ Number of users who adopted asset (group members only) Fig. 5. The number of groups required to cover all adopters of a particular asset, provided they are members of least one group. The number of groups required to cover the adopters of an asset is smaller than the number that would be required to cover a randomly assembled group of the same size. In addition to being more likely to share groups, adopters of bursty assets are more likely to be friends. We quantify this by constructing equivalent networks using a configuration model proposed by Chung and Lu [23].The model keeps the degree of each node fixed, but randomly matches the endpoints of the edges. One can then calculate, given the number of friends each adopter has, the number of connections one would expect between adopters. Table IV shows that friendship connections are more common relative to what is expected for assets being adopted in bursts than for those who don’t exhibit bursty adoption. Furthermore, users who adopt in close succession, i.e., those who are part of the same burst event, are more likely to be friends (see Table V). Moreover, it is notable that those who end up holding smaller (i.e. less-widely adopted assets) are more likely to be friends. They may have been brought together by a unique interest, or influenced one another directly. This illustrates that social variables are more important for diffusion in the niche. E. Group Characteristics that Increase Adoption So far we have provided several pieces of evidence that groups play an important role in the diffusion of virtual goods. Yet groups tend to vary in their size, attrition rates and interconnectedness. A second goal in our analysis is to determine what characteristics of groups, if any, impact diffusion. In order to examine this, we performed a linear regression of the total virtual goods transfers among members within a group, as well as a linear regression of the total TABLE IV M ULTIPLE COMPARISON OF THE ACTUAL - TO - EXPECTED RATIOS OF CONNECTIONS BETWEEN ALL - ADOPTERS FOR BURST- ASSETS AND FOR NON - BURST- ASSETS Mean Large Assets Medium Assets Small Assets Non-burst Burst Non-burst Burst Non-burst Burst-assets 7.0040 6.4031 14.2801 19.3747 23.0261 32.4454 99% Conf. int mean diff. [-1.6326, 0.4308] [1.7275, 8.4618] [1.8695, 16.9693] TABLE V M ULTIPLE COMPARISON OF THE ACTUAL - TO - EXPECTED RATIOS OF CONNECTIONS BETWEEN ALL - ADOPTERS AND BETWEEN BURST- ADOPTERS FOR BURST- ASSETS Mean Large Assets Medium Assets Small Assets All-adopters Burst-adopters All-adopters Burst-adopters All-adopters Burst-adopters 6.4031 13.2359 19.3747 54.0735 32.4454 123.3453 99% Conf. int mean diff. [3.5348, 10.1310] [15.4171, 53.9804] Fig. 6. Network visualizations of transaction ties (in black) and social ties (in gray) for lowest (top) and highest (bottom) frequency of adoption for a random sample of groups with 20 members. TABLE VI VARIABLES PREDICTING TOTAL WITHIN - GROUP TRANSFERS [35.5050, 146.2946] adoptions by group members from anywhere in Second Life. As shown in Table VI, we find several group-level attributes to be correlated with the transfer of virtual goods within groups. In particular, the number of group members (β = −.06) is slightly negatively correlated with total transfers. A simple explanation could be that large groups are more easily joined and therefore might be composed of users who are not as active in SL. However, we fail to find a correlation between the size of a group and the percentage of its members who are active, as judged by last login. We also do not find that changes in the size of groups impact the volume of transfers. The number of active users (β = .11) does show a strong positive relationship with transfers. This shows that it is the active group size that matters. Although the total number of social ties between members follows the trend of total number of members in being negatively correlated with transfers (β = −.04), the average in-group degree per member, measuring how connected each member is within the group, is positively related with transfers (β = .17). Further measures of social connectedness, such as the clustering coefficient (β = .21) and the strongly connected components (β = .24) are also positively correlated with transfers. This suggests that groups that show signatures of social interactions are more likely responsible for the spread of assets.This is illustrated in Figure 6, in which low adopting groups (top row) show very few links between members, especially with regard to social ties (in gray). When we examine all adoptions that are made by group members, some of the group characteristics show a different relationship. First, as shown in Table VII, change in group size has a small but significant impact on total adoptions (β = .05). This suggests that as groups grow and membership changes, there are new opportunities for adoption. For example, a new (Intercept) Num of Members Num of Edges Clustering Coefficient Average Degree LCC Change in Size Num of Active Users Estimate -1.902 -0.002 -0.000 12.275 0.326 8.924 -0.002 0.008 Std. Error 0.122 0.000 0.000 0.234 0.011 0.189 0.008 0.001 t value -15.601 -10.106 -8.477 52.549 30.835 47.224 -0.252 16.432 Pr(>|t|) 0.000 0.000 0.000 0.000 0.000 0.000 0.801 0.000 member of the group may come with new ideas or innovations, which are attractive to members who have been exposed to the same items for a period of time. However, there may be some internal boundaries: the results show that average degree has a negative impact on adoption (β = −.22), that as the interconnectedness of group members increases, the less likely new innovations are to be adopted. For example, members may be accustomed to certain types of virtual goods and uninterested in investing time or money in new ones, or certain cultural practices or norms may be in place that lend themselves to some types of innovation but not to others. TABLE VII VARIABLES PREDICTING TOTAL ADOPTIONS FROM GROUP MEMBERS (Intercept) Num of Members Num of Edges Clustering Coefficient Average Degree LCC Change in Size Num of Active Users Estimate 5.048 -0.006 -0.000 9.860 -0.547 26.434 0.133 0.016 Std. Error 0.162 0.000 0.000 0.310 0.014 0.251 0.011 0.001 t value 31.206 -27.383 -2.430 31.809 -38.994 105.413 12.063 23.807 Pr(>|t|) 0.000 0.000 0.015 0.000 0.000 0.000 0.000 0.000 V. C ONCLUSION We explored the role of group membership and structure in the adoption of virtual goods in a large-scale online community. Prior work had focused primarily on the influence of single social ties, or at most the entirety of an individual’s ego network. By contrast, we examine the role of larger social collectives, namely groups, and demonstrate that they are useful in discerning whether a given individual will eventually adopt an asset. We show that when an individual belongs to many of the same groups as other adopters, and conversely, when that individual’s groups are populated by adopters, then the individual is more likely to adopt. These findings are particularly germane for assets that are adopted by a small number of users. Individual groups are also more likely to play a role in bursty adoption, when many individuals adopt in rapid succession, and for adoptions that occur shortly after the asset is first created. From these findings we can infer that groups represent niche interests, and that assets may be more likely to diffuse initially through a small group. Groups might also facilitate rapid diffusion of an asset during a short time period, through oneto-many announcements, events, or other ways in which group members interact. We gain further insight that group structure is conducive to diffusion. Smaller groups tend to show higher overall adoption activity, as well as within-group transfers. Denser, more interconnected groups, are related to exchanges within the group, but not in terms of virtual goods adoption among group members and the rest of the community. There are clear limitations to our approach. Group activity is reflective of both social ties and shared interests, two variables that contribute to adoption, highlighting the complex interplay and potentially confounding nature of social influence and homophily [4], [16]. While we have demonstrated the utility of factoring in information about group membership to predicting adoption events, controlling for variables such as the number of friends in one’s social circle who adopted, the present data analysis does not allow us to distinguish how much of the explanatory power of group membership is due to groups reflecting interests that would make an individual susceptible to adoption, and how much is due to activities within a group promoting adoption. We think this is an interesting avenue for future work. In conclusion, we have demonstrated the important role that groups and group membership play in the diffusion process, and provided evidence that the adoption of virtual goods in Second Life is more likely to occur when users share similar groups. We also show that the communication behaviors and network structure of these groups are tied to the number of adoptions that occurs among group members. These findings provide further support that individual influence is not the sole factor in the diffusion process and that considering larger social collectives such as formal groups can better explain the likelihood of adopting an innovation. ACKNOWLEDGMENT We thank Linden Lab for sharing Second Life data. This work was supported by the Intelligence Community (IC) Postdoctoral Research Fellowship Program, NSF IIS-0746646, and MURI award FA9550-08-1-0265 from the Air Force Office of Scientific Research. We also thank Robert Horowitz. R EFERENCES [1] E. Castronova, D. Williams, C. Shen, R. Ratan, L. Xiong, Y. Huang, and B. Keegan, “As real as real? Macroeconomic behavior in a large-scale virtual world,” New Media & Society, vol. 11, no. 5, pp. 685–707, 2009. [2] T. Malaby, “Parlaying value: Capital in and beyond virtual worlds,” Games and Culture, vol. 1, no. 2, p. 141, 2006. [3] E. Rogers, Diffusion of innovations. Free Pr, 1995. [4] M. Jackson, “An Overview of Social Networks and Economic Applications,” The Handbook of Social Economics., ed. Jess Benhabib, Alberto Bisin and Matthew O. Jackson. Elsevier Press. forthcoming, 2009. [5] E. Bakshy, M. Simmons, D. Huffaker, C. Teng, and L. Adamic, “The Social Dynamics of Economic Activity in a Virtual World,” in Fourth International AAAI Conference on Weblogs and Social Media (ICWSM10), vol. 1001, 2010, p. 48103. [6] P. Monge and N. Contractor, Theories of communication networks. Oxford University Press, USA, 2003. [7] N. Christakis and J. Fowler, “The spread of obesity in a large social network over 32 years,” New England Journal of Medicine, vol. 357, no. 4, p. 370, 2007. [8] ——, “The collective dynamics of smoking in a large social network,” New England journal of medicine, vol. 358, no. 21, pp. 2249–2258, 2008. [9] F. Wu, B. Huberman, L. Adamic, and J. Tyler, “Information flow in social groups,” Physica A: Statistical and Theoretical Physics, vol. 337, no. 1-2, pp. 327–335, 2004. [10] E. Bakshy, B. Karrer, and L. Adamic, “Social influence and the diffusion of user-created content,” in Proceedings of the Tenth ACM Conference on Electronic Commerce. ACM, 2009, pp. 325–334. [11] L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan, “Group formation in large social networks: membership, growth, and evolution,” in KDD’06, 2006, p. 54. [12] D. Centola, “The Spread of Behavior in an Online Social Network Experiment,” Science, vol. 329, no. 5996, p. 1194, 2010. [13] M. McPherson, L. Smith-Lovin, and J. Cook, “Birds of a feather: Homophily in social networks,” Annual Review of Sociology, vol. 27, pp. 415–444, 2001. [14] S. Aral, L. Muchnik, and A. Sundararajan, “Distinguishing influencebased contagion from homophily-driven diffusion in dynamic networks,” Proceedings of the National Academy of Sciences, vol. 106, no. 51, p. 21544, 2009. [15] J. Leskovec, L. Adamic, and B. Huberman, “The dynamics of viral marketing,” ACM Transactions on the Web (TWEB), vol. 1, no. 1, p. 5, 2007. [16] C. Shalizi and A. Thomas, “Homophily and contagion are generically confounded in observational social network studies,” Sociological Methods & Research, vol. 40, no. 2, p. 211, 2011. [17] D. Crandall, D. Cosley, D. Huttenlocher, J. Kleinberg, and S. Suri, “Feedback effects between similarity and social influence in online communities,” in Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2008, pp. 160–168. [18] J. Gastril, The group in society. Sage Publications, 2009. [19] G. Palla, A. Barabási, and T. Vicsek, “Quantifying social group evolution,” Nature, vol. 446, no. 7136, pp. 664–667, 2007. [20] D. Huffaker, M. Simmons, E. Bakshy, and L. Adamic, “Seller activity in a virtual marketplace,” First Monday, vol. 15, no. 7-5, 2010. [21] R. Kohavi, “A study of cross-validation and bootstrap for accuracy estimation and model selection,” in International joint Conference on artificial intelligence, vol. 14. Citeseer, 1995, pp. 1137–1145. [22] Y. Zhu and D. Shasha, “Efficient elastic burst detection in data streams,” in Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2003, pp. 336–345. [23] F. Chung and L. Lu, “Connected components in random graphs with given expected degree sequences,” Annals of combinatorics, vol. 6, no. 2, pp. 125–145, 2002.