An Alternate Method for Comparing Social Networks

Comparing Multiple Social Networks using Multidimensional Scaling

John Stevens, University of Essex

Biographical note

John Stevens is a junior member of staff at the University of Essex in the School of Health and Human Sciences. His current research interests are in social networks and the sociology of the East of England; other research interests include acquired neurological disabilities and independent living. Stevens has multiple sclerosis and works from a wheelchair as he is paralysed on the right side of his body.

1 Introduction

This methodological report describes further developments to a technique that I have developed for comparing three or more social networks using multidimensional scaling. Several criteria may be desired as outcomes when drawing comparisons between different social networks. The criteria to be compared may include the size, connectedness and density of social networks. In previous research I have found no single number that can adequately describe the similarities or differences between social networks (Stevens 2008).

The innovative technique described here illustrates, graphically, the similarities and differences between several networks. The initial research was previously reported in Stevens (2010 and 2011) but is presented here with a clearer and more concise description, together with further developments on the interpretation of the final multidimensional scaling plot. As there is some preference in British Sociology for text or graphics, rather than quantitative measures, a graphical technique has been developed for comparing social networks.

This paper describes the method developed for comparing multiple social networks. This is followed by an exploration of the alternate technique, together with an example of how it can be used to express graphically the comparison of seven social networks.
To conclude, findings are summarised, and the advantages and limitations of this alternate technique are highlighted.

2 Methods

Following a description of social network analysis, social network variables that can be used to differentiate between the social networks are identified. This is followed by a description of the multidimensional scaling technique implemented to display graphically the difference between the social networks. The seven different social network data sets used in this paper are described before detailing the technique for differentiating between social networks.

2.1 Social Network Analysis

The Manchester School drew on Parsons’ (1971) approach to sociology to examine communities by placing an “emphasis on seeing structures as networks of relations” (Scott 2000: pg 7). Barns (1954: pg 137) described his research similarly: “The image I have is of a set of points some of which are joined by lines. The points of the image are people, or sometimes groups and the lines indicate which people interact with each other.” Barns argued that local social systems are best represented as a network of the social relationships that can reach beyond the boundaries of a local area.

Researchers at Harvard University in the 1970s, including Harrison White, established social network analysis as a method of structural analysis. The mixing of mathematical graph theory with sociological and anthropological methods was essential in the work of this group. Some of the methods of analysis developed by the group distinguish egocentric from sociocentric networks, and assess the density of networks, a measure that is used later in this paper (Freeman 1979).

Although the research and observations of both the Manchester and Harvard schools continue to be relevant, the remainder of this section discusses some of the literature published by the post-Manchester school.
In the more recent British literature, Crow and Allen (1994) described communities as “inter-locking social networks of neighbourhood, kinship and friendship […] not dependent on any notion of locality or place” (pg 157). In their analysis of several community studies in the 20th century, they draw extensively on an article by Wellman (1979) which posits that a community can be viewed as a network of ties. These ties can be with family, kin, friends, or people who live in the same neighbourhood. The strength of these ties varies, as do the network structures, with the amount of contact with, and distance from, other individuals.

In 1990, Wellman and Wortley concentrated on the amount of support people received from these ties and the amount of time these same people spent in the community. They concluded that the size of the community, the density of its support ties, and the length of its friendship chains all have a greater effect on a person’s feeling of community than does the physical boundary of that community. This research has influenced my definition of community (Wellman and Wortley 1990).

2.2 Social Network Variables

Traditionally, the most commonly used quantifiable way of comparing social networks was to compare their densities, but this is a very broad, sweeping approach. As a way forward, Faust and Skvoretz (2002) compared the number of triads (cliques of three nodes) between different social networks. For the technique described here, cliques were set aside, because multidimensional scaling was intended as a way of analysing several variables of the social network simultaneously, not just the number of cliques. As described in Stevens (2010), variables were identified that can be used to determine the greatest distance between the social networks. These variables are listed in section 2.5.
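To make the kind of variable under discussion concrete, the following is a minimal sketch, in Python, of how three such variables might be calculated for a small undirected network. It is not the UCINET procedure used in this paper. The function name `network_variables` and the four-person star network are hypothetical, the network is assumed to be connected, and Freeman's degree centralisation is assumed as the centralisation measure, since the text does not specify which one was used. Betweenness is omitted for brevity.

```python
from collections import deque


def network_variables(edges):
    """Basic variables for an undirected, connected network (illustrative only)."""
    # Build an adjacency map from the list of ties.
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    n = len(adj)
    ties = sum(len(nbrs) for nbrs in adj.values()) // 2

    # Density: observed ties divided by the number of possible ties.
    density = 2 * ties / (n * (n - 1))

    # Average shortest path: breadth-first search from every node.
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    average_shortest_path = total / (n * (n - 1))

    # Freeman degree centralisation (an assumed choice of measure).
    degrees = [len(nbrs) for nbrs in adj.values()]
    top = max(degrees)
    centralisation = sum(top - d for d in degrees) / ((n - 1) * (n - 2))

    return {
        "nodes": n,
        "density": density,
        "average_shortest_path": average_shortest_path,
        "degree_centralisation": centralisation,
    }


# Hypothetical star network: person "a" is tied to "b", "c" and "d".
star = network_variables([("a", "b"), ("a", "c"), ("a", "d")])
print(star)
```

For the star network, half of the possible ties are present and every tie runs through one person, so the density is 0.5 and the degree centralisation takes its maximum value of 1.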
These network variables can all be created for a given social network using most standard network programs; UCINET has been used here. For a concise description of the network variables described in this paper, please refer to Scott (2000). No single variable adequately defines a social network, so comparing social networks is a difficult task.

2.3 Multidimensional scaling (MDS)

The application of MDS to social network analysis has been covered extensively by Wassermann and Faust (1994), but they do not cover applications for comparing social networks. The technique developed, which is described here, applies the MDS techniques of PREFSCAL and HICLUS to the variables identified in the previous section. These techniques are further described in Coxon and Davies (1982).

2.4 Data sets

To explain the technique described in this article, an example is shown in the next section which compares seven different social network data sets. The data sets collected can all be described as social networks, ranging in type from longitudinal to cross-sectional, complete to incomplete, large to small, and community to email data sets. The data sets are described in Table 1.

Social network                   Description
Rural housing social network     A small incomplete community social network of a rural housing estate
Urban housing social network     A small incomplete community social network of an urban housing estate
Student social network wave 1    A small largely complete longitudinal social network of self-reported contacts amongst post-graduate sociology students at a UK university (data collected in week 1)
Student social network wave 2    A small largely complete longitudinal social network of self-reported contacts amongst post-graduate sociology students at a UK university (data collected in week 7)
Student social network wave 3    A small largely complete longitudinal social network of self-reported contacts amongst post-graduate sociology students at a UK university (data collected in week 15)
Student email social network     A large complete internal email traffic data set of all students in a university for 1 week
Staff email social network       A large internal email traffic data set of all staff in a university for 1 week

Table 1 – Data sets (Source: Stevens 2008, 2011, 2013)

2.5 Technique

No single variable adequately defines a social network and, therefore, comparing several social networks is a difficult task. For this reason, a metadata table was created; this contained the values of the five most promising network variables for each of the social networks (as identified in Stevens 2008). These are “The number of network nodes”, “The density of the social network”, “The average shortest path”, “The network centralisation” and “The network betweenness”. Please note that at least three social networks should be compared, as I have previously determined in Stevens (2011).

This metadata table was used as the input data to both the PREFSCAL procedure and the HICLUS procedure. It should be noted that if SPSS is used for the analysis, the dendrogram output option should be selected to enable a graphical output to be produced. Lastly, the output of the two procedures was combined and interpreted using the procedure outlined by Coxon and Davies (1982). This technique is presented as a flowchart in Figure 1.

 Metadata table
       |
      \|/
    PREFSCAL
       |
      \|/
     HICLUS
       |
      \|/
Interpret result

Figure 1 – Flowchart of alternate comparison technique

3 An example of comparing multiple social networks

To give an example of how the alternate technique for comparing multiple social networks was applied, I compared seven social networks of differing sizes and genres. To begin, a metadata table was created, as shown in section 3.1. This detailed five network variables for each of the seven data sets: the number of nodes in the social network, the density of the social network, the average shortest path length between nodes, the network centralization and the network betweenness. The multidimensional scaling tools of PREFSCAL and HICLUS were then applied to the metadata table. These processes are described in sections 3.2 and 3.3 respectively. Once the outputs of the two procedures had been combined, a graphical output was produced that enabled the comparison of the social networks. From this, it was possible to interpret the resulting graphs, as shown in section 3.4.

3.1 Metadata table

As discussed, the metadata table (Table 2) was created for the seven social networks to be compared.

Data set                         No of nodes   Density of the social network   Average shortest path   Network % centralization   Betweenness
Rural housing social network     24            0.0451                          3.160                   19.84                      20.404
Urban housing social network     27            0.0116                          1.547                   5.78                       0.604
Student social network wave 1    60            0.0370                          2.924                   14.80                      75.458
Student social network wave 2    65            0.0938                          2.535                   20.12                      91.200
Student social network wave 3    64            0.1796                          2.053                   32.80                      63.373
Student email social network     324           0.0118                          4.442                   18.06                      221.775
Staff email social network       833           0.0051                          3.618                   4.88                       1027.364

Table 2 - Metadata table (Source: Values calculated using UCINET analysing data sets described in Stevens 2008, 2011 and 2013)

3.2 PREFSCAL

This stage involved the application of the MDS technique PREFSCAL; the procedure was implemented in SPSS using the metadata table as the input data. Figure 2 shows the output obtained. This procedure has the notable advantage of greater flexibility, in that the user can select whether to output rows or columns in the model space and can use a rectangular input data matrix. The output of PREFSCAL is similar to that of ALSCAL, with the notable exception that the graph has no major grouping. It should be noted that, because multidimensional scaling by its nature combines many graphs into one graph, it is impossible to put dimensions on any of the plot outputs.

Figure 2 - PREFSCAL output (Final Stress: 0.0000: Penalty 4.5677) (Source: Scaling output of PREFSCAL as implemented in SPSS when computing metadata Table 2)

3.3 Hierarchical clustering

The HICLUS procedure, as implemented in SPSS, was applied next to the metadata table to produce a dendrogram of the hierarchical clustering of the table. This standard output has the advantage that, being graphical, it is straightforward to understand.
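The clustering step can be illustrated with a minimal sketch in Python. It is not the SPSS procedure itself: the distance measure and the absence of standardisation are assumptions, as the settings used are not reported here. The sketch takes the values of Table 2, measures squared Euclidean distance between the unstandardised rows, and merges clusters by average linkage between groups. The names `TABLE_2`, `between_groups` and `average_linkage`, and the short data set labels, are hypothetical.

```python
from itertools import combinations

# Rows transcribed from Table 2. Column order: number of nodes, density,
# average shortest path, network % centralization, betweenness.
TABLE_2 = {
    "rural": (24, 0.0451, 3.160, 19.84, 20.404),
    "urban": (27, 0.0116, 1.547, 5.78, 0.604),
    "wave1": (60, 0.0370, 2.924, 14.80, 75.458),
    "wave2": (65, 0.0938, 2.535, 20.12, 91.200),
    "wave3": (64, 0.1796, 2.053, 32.80, 63.373),
    "student_email": (324, 0.0118, 4.442, 18.06, 221.775),
    "staff_email": (833, 0.0051, 3.618, 4.88, 1027.364),
}


def squared_euclidean(a, b):
    """Squared Euclidean distance (an assumed choice of measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def between_groups(points, c1, c2):
    """Mean distance over all pairs with one member in each cluster."""
    total = sum(squared_euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def average_linkage(points):
    """Agglomerative clustering; returns the merges in the order they occur."""
    clusters = [frozenset([name]) for name in points]
    merges = []
    while len(clusters) > 1:
        # Find the two clusters with the smallest between-groups distance.
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: between_groups(points, pair[0], pair[1]),
        )
        merges.append((sorted(c1), sorted(c2), between_groups(points, c1, c2)))
        clusters = [c for c in clusters if c != c1 and c != c2] + [c1 | c2]
    return merges


merges = average_linkage(TABLE_2)
for left, right, distance in merges:
    print(f"{left} + {right} at {distance:.1f}")
```

Under these assumptions the staff email network is the last to join and the student email network the second to last, which is consistent with Figure 3. Because the values are not standardised, the number of nodes and the betweenness dominate the distances.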
The output of the HICLUS procedure provided complementary results to the PREFSCAL procedure, grouping five of the data sets (waves 1, 2 and 3, the urban and the rural networks) together while differentiating them from the other two data sets.

* * H I E R A R C H I C A L   C L U S T E R   A N A L Y S I S * *

Dendrogram using Average Linkage (Between Groups)

                 Rescaled Distance Cluster Combine

  C A S E        0         5        10        15        20        25
  Label    Num   +---------+---------+---------+---------+---------+

  Urban      2   ─┐
  wave1      3   ─┤
  rural      1   ─┼───┐
  wave2      4   ─┤   ├──────────────────────────────┐
  wave3      5   ─┘   │                              │
  Student    6   ─────┘                              │
  staff      7   ────────────────────────────────────┘

Figure 3 - HICLUS output (Source: Scaling output of HICLUS as implemented in SPSS when computing metadata Table 2)

3.4 Interpreting the results

MDS, by its nature, does not easily allow a scale to be applied to its output, as the procedure folds multiple dimensions into a two-dimensional work space. To interpret the output, I recommend combining the results of the PREFSCAL and HICLUS procedures (Figure 2 and Figure 3), which were created using the same data, through the combination techniques described in Coxon and Davies (1982). This technique has the advantage that it highlights groupings in the data that are statistically significant but that may be undetectable, or appear tenuous, to the casual observer. This is demonstrated in the final output (Figure 4) after the combination, or “folding”, of the five input variables for each network into a single graph.

Figure 4 - Combined output (Final Stress: 0.0000: Penalty 4.5677) (Source: Combined output of PREFSCAL and HICLUS as implemented in SPSS when computing metadata Table 2)

When comparing multiple social networks, the above techniques are successful for showing the differences and similarities between the social networks. In my opinion, this method has the benefit of visually grouping statistically significant variances in the data.
For a more casual observer, the five variables summarised in Table 2 are combined into one graph. In the original data, the student email and staff email networks are outliers from the other networks, in relation to both the number of nodes and betweenness. A minimum of three social networks should be compared using the above method, as I explored in Stevens (2011). An area for further investigation would be to assess the model's accuracy when the minimum number of social networks is compared.

To summarize, comparing social networks using MDS on a metadata table is a method that works effectively when multiple social networks need to be compared. This alternate technique enables three or more social networks to be compared graphically while highlighting minor differences between them.

4 Summary

As stated above, there are a number of factors to be considered when comparing social networks. Others have used various methods and factors for comparing social networks, such as comparing “degree of node”, “density of the social network”, “path length” and the “number of triads” in a network. The technique described here, which proved successful for comparing several social networks, involved creating a metadata table consisting of many social network variables and applying MDS techniques, such as PREFSCAL and HICLUS, to it. When the output of these procedures was combined and analysed, a graphical representation of the differences between several social networks was produced. This technique is most effective when more than three social networks are compared. As no other approaches were compared, because of space limitations, it is impossible to state anything more than that this technique is not perfect, but it does provide a systematic method for comparing multiple social networks.

References

Barns, J. (1954). Class and community in a Norwegian Island Parish. New York, Harper and Row.
Coxon, A. P. M. and P. M. Davies (1982). The user's guide to multidimensional scaling: with special reference to the MDS (X) library of computer programs, Heinemann Educational Books Exeter.
Faust, K. and J. Skvoretz (2002). "Comparing networks across space and time, size and species." Sociological Methodology 32(1): 267-299.
Freeman, L. C. (1979). "Centrality in social networks conceptual clarification." Social Networks 1(3): 215-239.
Parsons, T. (1971). The social system, Psychology Press.
Scott, J. (2000). "Social network analysis." Sociology 22(1): 109-127.
Stevens, J. (2008). "Self Identified Ethnicity and Friendship: Networks Among PostGraduate Students at a British University." The Essex Graduate Journal of Sociology University of Essex 8: 28-34.
Stevens, J. (2010). "Comparing Social networks: Comparing Multiple Social Networks using Multiple Dimensional Scaling." Methodological investigations Online 5(1).
Stevens, J. (2011). "Comparison of education social networks." Essex Student Research Online 3(2).
Stevens, J. (2013). "Comparing communities: social networks in an urban and a rural housing estate in East of England." The Essex Graduate Journal of Sociology University of Essex 13.
Wassermann, S. and K. Faust (1994). Social network analysis: Methods and applications. New York.
Wellman, B. (1979). "The community question: The intimate networks of East Yorkers." American Journal of Sociology: 1201-1231.
Wellman, B. and S. Wortley (1990). "Different strokes from different folks: Community ties and social support." American Journal of Sociology: 558-588.