An Alternate Method for Comparing Social Networks
Comparing Multiple Social Networks using Multidimensional Scaling
John Stevens, University of Essex
Biographical note
John Stevens is a junior member of staff at the University of Essex in the School of
Health and Human Sciences. His current research interests are in social networks
and the sociology of the East of England; other research interests include acquired
neurological disabilities and independent living. Stevens has multiple sclerosis and
works from a wheelchair as he is paralysed on the right side of his body.
1 Introduction
This methodological report describes further developments to a technique, based on
multidimensional scaling, that I have developed for comparing three or more social
networks. Several criteria may be of interest when drawing comparisons between
different social networks, including their size, connectedness and density. In previous
research I have found no single number that can adequately describe the similarities or
differences between social networks (Stevens 2008). The innovative technique
described here illustrates, graphically, the similarities and differences between several
networks. The initial research was previously reported in Stevens (2010 and 2011)
but is presented here with a clearer and more concise description, together with
further developments on the interpretation of the final multidimensional scaling plot.
As there is some preference in British Sociology for text or graphics, rather than
quantitative measures, a graphical technique has been developed for comparing social
networks. This paper describes the method developed for comparing multiple social
networks. This is followed by an exploration of the alternate technique, together with
an example of how it can be used to express graphically the comparison of seven
social networks. To conclude, findings are summarised, and the advantages and
limitations of this alternate technique are highlighted.
2 Methods
Following a description of social network analysis, social network variables that can
be used to differentiate between the social networks are identified. This is followed
by a description of the multidimensional scaling technique implemented to display
graphically the difference between the social networks. The seven different social
network data sets used in this paper are described before detailing the technique for
differentiating between social networks.
2.1 Social Network Analysis
The Manchester School drew on Parsons’ (1971) approach to sociology to examine
communities by placing an “emphasis on seeing structures as networks of
relations” (Scott 2000: pg 7). Barnes (1954: pg 137) described his research similarly: “The
image I have is of a set of points some of which are joined by lines. The points of the
image are people, or sometimes groups and the lines indicate which people interact
with each other.” Barnes argued that local social systems are best represented as a
network of the social relationships that can reach beyond the boundaries of a local
area.
Researchers at Harvard University in the 1970s, including Harrison White, established
social network analysis as a method of structural analysis. The mixing of mathematical
graph theory with sociological and anthropological methods was essential in the
work of this group. Some of the methods of analysis developed by the group
distinguish egocentric from sociocentric networks and assess the density of networks,
a measure that is used later in this paper (Freeman 1979).
Although the research and observations of both the Manchester and Harvard
schools continue to be relevant, the remainder of this section discusses some of the
literature published by the post-Manchester school. In the more recent British
literature, Crow and Allen (1994) described communities as “inter-locking social
networks of neighbourhood, kinship and friendship […] not dependent on any notion
of locality or place” (pg 157). In their analysis of several community studies in the 20th
century, they draw extensively on an article by Wellman (1979) which posits that a
community can be viewed as a network of ties. These ties can be with family, kin,
friends, or people who live in the same neighbourhood. The strength of these ties
varies, as do the network structures, with the amount of contact with, and distance
from, other individuals. In 1990, Wellman and Wortley concentrated on the amount of
support people received from these ties and the amount of time these same people
spent in the community. They concluded that the size of the community, the density
of its support ties, and the length of its friendship chains all have a greater effect on a
person’s feeling of community than does the physical boundary of that community.
This research has influenced my definition of community (Wellman and Wortley
1990).
2.2 Social Network Variables
Traditionally, the most commonly used quantifiable way of comparing social networks
was to compare their densities, but this is a very broad-brush approach. As a way
forward, Faust and Skvoretz (2002) compared the number of triads (cliques of three
nodes) between different social networks. Cliques are not considered here, because the
multidimensional scaling technique is intended to analyse several variables of the
social network simultaneously, not just the number of cliques. As described in
Stevens (2010), variables were identified that can be used to determine the greatest
distance between the social networks; these are listed in section 2.5.
These network variables can all be created for a given social network using most
standard network programs; UCINET has been used here. For a concise description
of the network variables described in this paper, please refer to Scott (2000). No
single variable adequately defines a social network, so comparing social networks is
a difficult task.
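As an illustration of what such network variables measure, the following is a minimal Python sketch that computes the number of nodes, the density, the average shortest path and a centralisation percentage for a small undirected network. It is not the UCINET implementation used in this paper: the function name `network_variables`, the toy star network and the choice of Freeman's degree centralisation are assumptions made for illustration only, and betweenness is omitted for brevity.

```python
from collections import deque


def network_variables(adjacency):
    """Return node count, density, average shortest path and degree centralisation (%).

    `adjacency` maps each node to a list of its neighbours (undirected ties).
    Assumes at least three nodes and at least one tie.
    """
    nodes = list(adjacency)
    n = len(nodes)

    # Each undirected tie appears twice in the adjacency lists.
    ties = sum(len(neighbours) for neighbours in adjacency.values()) // 2
    density = 2 * ties / (n * (n - 1))

    # Breadth-first search from every node gives all shortest path lengths.
    total_length, reachable_pairs = 0, 0
    for source in nodes:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in adjacency[current]:
                if neighbour not in distance:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
        total_length += sum(distance.values())
        reachable_pairs += len(distance) - 1
    average_path = total_length / reachable_pairs

    # Freeman degree centralisation (an assumed choice): 100% for a star network.
    degrees = [len(adjacency[node]) for node in nodes]
    centralisation = 100 * sum(max(degrees) - d for d in degrees) / ((n - 1) * (n - 2))

    return {
        "nodes": n,
        "density": density,
        "average_shortest_path": average_path,
        "centralisation_pct": centralisation,
    }


# Hypothetical toy network: one central person tied to four others.
star = {"a": ["b", "c", "d", "e"], "b": ["a"], "c": ["a"], "d": ["a"], "e": ["a"]}
variables = network_variables(star)
print(variables)
```

For the toy star network this gives 5 nodes, a density of 0.4, an average shortest path of 1.6 and a centralisation of 100%. One row of such values per network is the kind of input the comparison technique works from.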
2.3 Multidimensional scaling (MDS)
The application of MDS to social network analysis has been covered extensively by
Wasserman and Faust (1994), but they do not cover applications for comparing
social networks. The technique described here applies the MDS
techniques of PREFSCAL and HICLUS to the variables identified in the previous
section. These techniques are further described in Coxon and Davies (1982).
2.4 Data sets
To explain the technique described in this article, an example is shown in the next
section which compares seven different social network data sets. The data sets
collected can all be described as social networks, ranging in type from longitudinal to
cross-sectional, complete to incomplete, large to small, and community to email data
sets. The data sets are described in Table 1.
Social network                 Description
Rural housing social network   A small incomplete community social network of a rural housing estate
Urban housing social network   A small incomplete community social network of an urban housing estate
Student social network wave 1  A small, largely complete longitudinal social network of self-reported contacts amongst post-graduate sociology students at a UK university (data collected in week 1)
Student social network wave 2  A small, largely complete longitudinal social network of self-reported contacts amongst post-graduate sociology students at a UK university (data collected in week 7)
Student social network wave 3  A small, largely complete longitudinal social network of self-reported contacts amongst post-graduate sociology students at a UK university (data collected in week 15)
Student email social network   A large complete internal email traffic data set of all students in a university for 1 week
Staff email social network     A large internal email traffic data set of all staff in a university for 1 week
Table 1 – Data sets
(Source: Stevens (2008) (2011) (2013))
2.5 Technique
No single variable adequately defines a social network and, therefore,
comparing several social networks is a difficult task. For this reason, a metadata table
was created; this contained the values of the five most promising network variables
for each of the social networks (as identified in Stevens 2008). These are “The
number of network nodes”, “The density of the social network”, “The average
shortest path”, “The network centralisation” and “The network betweenness”. Please
note that at least three social networks should be compared, as I have
previously determined in Stevens (2011). This metadata table was used as the input
data to both the PREFSCAL procedure and the HICLUS procedure. It should be
noted that if SPSS is used for the analysis, the dendrogram output option should be
selected to enable a graphical output to be produced. Lastly, the output of the two
procedures was combined and interpreted using the procedure outlined by Coxon
and Davies (1982). This technique is presented as a flowchart in Figure 1.
Metadata table
|
\|/
PREFSCAL
|
\|/
HICLUS
|
\|/
Interpret result
Figure 1 – Flowchart of alternate comparison technique
3 An example of comparing multiple social networks
A metadata table was produced; this detailed five network variables across all seven
data sets: the number of nodes in a given social network, the density of the social
network, the average shortest path length between nodes, the network centralisation
and the network betweenness.
To give an example of how the alternate technique for comparing multiple social
networks was applied, I compared seven social networks of differing sizes
and types. To begin, a metadata table was created, as shown in section 3.1. The
multidimensional scaling tools of PREFSCAL and HICLUS were then applied to the
metadata table. These processes are described in sections 3.2 and 3.3 respectively. A
graphical output was consequently produced; it enabled the comparison of social
networks once the outputs of the two procedures had been combined. From this, it
was possible to interpret the resulting graphs, as shown in section 3.4.
3.1 Metadata table
As discussed, the metadata table (Table 2) was created for the seven social networks
to be compared.
Data set                        No of nodes  Density of the social network  Average shortest path  Network % centralisation  Betweenness
Rural housing social network             24                         0.0451                  3.160                     19.84       20.404
Urban housing social network             27                         0.0116                  1.547                      5.78        0.604
Student social network wave 1            60                         0.0370                  2.924                     14.80       75.458
Student social network wave 2            65                         0.0938                  2.535                     20.12       91.200
Student social network wave 3            64                         0.1796                  2.053                     32.80       63.373
Student email social network            324                         0.0118                  4.442                     18.06      221.775
Staff email social network              833                         0.0051                  3.618                      4.88     1027.364
Table 2 - Metadata table
(Source: Values calculated using UCINET analysing data sets described
in Stevens 2008, 2011 and 2013)
3.2 PREFSCAL
The first stage involved the application of the MDS technique PREFSCAL;
this was implemented in SPSS using the metadata table as the input data. Figure 2
shows the output obtained. This procedure has the notable advantage of greater
flexibility, in that the user can select whether to output rows or columns in the
model space and can use a rectangular input data matrix. The output of PREFSCAL is
similar to that of ALSCAL, with the notable exception that the graph has no major
grouping. Readers should note that, because multidimensional scaling by its nature
combines many graphs into one graph, it is impossible to put dimensions on any of
the plot outputs.
Figure 2 - PREFSCAL output
(Final Stress: 0.0000: Penalty 4.5677)
(Source: Scaling output of PREFSCAL as implemented in SPSS when computing
metadata Table 2)
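To give a sense of what a stress value such as the one reported under Figure 2 measures, the sketch below computes Kruskal's Stress-1 for a configuration of plotted points against a matrix of input dissimilarities. This is an assumption-laden illustration: PREFSCAL in SPSS reports a penalised stress whose exact formulation is not given in this paper, so Kruskal's formula is used here only as a familiar stand-in, and the function name `kruskal_stress` and the example points are hypothetical.

```python
import math
from itertools import combinations


def kruskal_stress(points, dissimilarities):
    """Kruskal's Stress-1: sqrt(sum((d_ij - delta_ij)^2) / sum(d_ij^2)).

    d_ij is the distance between plotted points i and j; delta_ij is the
    corresponding input dissimilarity. A value of 0 means the plot reproduces
    the dissimilarities exactly.
    """
    squared_error = 0.0
    squared_distance = 0.0
    for i, j in combinations(range(len(points)), 2):
        plotted = math.dist(points[i], points[j])
        squared_error += (plotted - dissimilarities[i][j]) ** 2
        squared_distance += plotted ** 2
    return math.sqrt(squared_error / squared_distance)


# Hypothetical dissimilarities between three objects (a 3-4-5 triangle).
dissimilarities = [
    [0, 3, 4],
    [3, 0, 5],
    [4, 5, 0],
]
exact_points = [(0, 0), (3, 0), (0, 4)]      # reproduces the dissimilarities
distorted_points = [(0, 0), (3, 0), (0, 3)]  # one point moved

print(kruskal_stress(exact_points, dissimilarities))
print(kruskal_stress(distorted_points, dissimilarities))
```

The first configuration fits the dissimilarities perfectly and so has zero stress; the second does not, and its stress is greater than zero.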
3.3 Hierarchical clustering
The HICLUS procedure, as implemented in SPSS, was applied next
to the metadata table to produce a dendrogram of the hierarchical clustering of the
table. This standard output has the advantage that, being graphical, it is
straightforward to understand. The output of the HICLUS procedure provided
complementary results to the PREFSCAL procedure, grouping five of the data sets
(waves 1, 2 and 3, the urban and the rural networks) together while differentiating
them from the other two data sets.
* * H I E R A R C H I C A L   C L U S T E R   A N A L Y S I S * *
Dendrogram using Average Linkage (Between Groups)
                Rescaled Distance Cluster Combine
   C A S E      0         5        10        15        20        25
Label      Num  +---------+---------+---------+---------+---------+
Urban        2  ─┐
wave1        3  ─┤
rural        1  ─┼───┐
wave2        4  ─┤   ├──────────────────────────────┐
wave3        5  ─┘   │                              │
Student      6  ─────┘                              │
staff        7  ────────────────────────────────────┘
Figure 3 - HICLUS output
(Source: Scaling output of HICLUS as implemented in SPSS when computing
metadata Table 2)
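The grouping described in this section can be approximated outside SPSS. The following is a minimal pure-Python sketch of average-linkage (between groups) agglomerative clustering applied to the values in Table 2. It is a stand-in for illustration, not the HICLUS procedure itself: the use of squared Euclidean distance on unstandardised values is an assumption, and the names `METADATA`, `squared_euclidean` and `average_linkage` are hypothetical.

```python
from itertools import combinations

# Values copied from Table 2: nodes, density, average shortest path,
# centralisation (%), betweenness.
METADATA = {
    "rural": (24, 0.0451, 3.160, 19.84, 20.404),
    "urban": (27, 0.0116, 1.547, 5.78, 0.604),
    "wave1": (60, 0.0370, 2.924, 14.80, 75.458),
    "wave2": (65, 0.0938, 2.535, 20.12, 91.200),
    "wave3": (64, 0.1796, 2.053, 32.80, 63.373),
    "student": (324, 0.0118, 4.442, 18.06, 221.775),
    "staff": (833, 0.0051, 3.618, 4.88, 1027.364),
}


def squared_euclidean(a, b):
    """Squared Euclidean distance between two rows (an assumed distance measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_linkage(table):
    """Return the merges, in order, as (cluster_a, cluster_b, distance) tuples."""

    def linkage(pair):
        # Average of all pairwise distances between members of the two clusters.
        first, second = pair
        total = sum(squared_euclidean(table[p], table[q]) for p in first for q in second)
        return total / (len(first) * len(second))

    clusters = [frozenset([name]) for name in table]
    merges = []
    while len(clusters) > 1:
        first, second = min(combinations(clusters, 2), key=linkage)
        merges.append((first, second, linkage((first, second))))
        clusters = [c for c in clusters if c not in (first, second)] + [first | second]
    return merges


merges = average_linkage(METADATA)
for first, second, distance in merges:
    print(sorted(first), "+", sorted(second), round(distance, 1))
```

Under these assumptions the two housing networks and the three student waves merge with one another first, the student email network joins them next and the staff email network joins last, which is the same ordering as in Figure 3. Because the values are not standardised, the number of nodes and the betweenness dominate the distances; standardising each column first would weight the five variables more evenly and might change the result.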
3.4 Interpreting the results
MDS, by its nature, does not easily allow a scale to be applied to its output, as the
procedure folds multiple dimensions into a two-dimensional space. To interpret
the output, I recommend combining the results of the PREFSCAL and HICLUS
procedures, shown in Figure 2 and Figure 3, using the combination techniques
described in Coxon and Davies (1982); this is possible because both were created from
the same data. This technique has the advantage that it highlights groupings in the data
which may be undetectable, or appear tenuous, to the casual observer
but are statistically significant. This is demonstrated in the final output (Figure 4) after
the combination, or “folding,” of five input variables for each network into a single
graph.
Figure 4 - Combined output
(Final Stress: 0.0000: Penalty 4.5677)
(Source: Combined output of PREFSCAL and HICLUS as implemented in SPSS when
computing metadata Table 2)
When comparing multiple social networks, the above techniques are successful in
showing the differences and similarities between the social networks. In my opinion,
this method has the benefit of visually grouping statistically significant variances in
the data: for a more casual observer, the five variables summarised in Table 2 are
combined into one graph. In the original data, the student and staff email networks are
outliers from the other networks in relation to both the number of nodes and
betweenness. A minimum of three social networks should be compared using the
above method, as I explored in Stevens (2011). An area for further investigation would
be to assess the model's accuracy when the minimum number of social networks is
compared. To summarise, comparing social networks using MDS on a metadata table
is a method that works effectively when multiple social networks need to be
compared. This alternate technique enables three or more social networks to be
compared using this graphical method while highlighting minor differences between
different social networks.
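The observation about outliers in the original data can be checked directly against Table 2. The sketch below standardises the number of nodes and the betweenness and lists the networks that lie above the mean on each. It is an illustrative check only: the use of z-scores with the population standard deviation is an assumption, and the names `z_scores` and `above_mean` are hypothetical.

```python
from statistics import mean, pstdev

# Values copied from Table 2 of this paper.
NODES = {
    "rural": 24, "urban": 27, "wave1": 60, "wave2": 65, "wave3": 64,
    "student_email": 324, "staff_email": 833,
}
BETWEENNESS = {
    "rural": 20.404, "urban": 0.604, "wave1": 75.458, "wave2": 91.200,
    "wave3": 63.373, "student_email": 221.775, "staff_email": 1027.364,
}


def z_scores(column):
    """Standardise one column: (value - mean) / population standard deviation."""
    values = list(column.values())
    centre, spread = mean(values), pstdev(values)
    return {name: (value - centre) / spread for name, value in column.items()}


def above_mean(column):
    """Names of the networks whose value lies above the column mean."""
    return {name for name, z in z_scores(column).items() if z > 0}


for label, column in (("nodes", NODES), ("betweenness", BETWEENNESS)):
    print(label, sorted(above_mean(column)))
# nodes ['staff_email', 'student_email']
# betweenness ['staff_email', 'student_email']
```

On both variables only the two email networks lie above the mean, with the staff email network more than two standard deviations above it.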
4 Summary
As stated above, there are a number of factors to be considered when comparing
social networks. Others have used various methods and factors for comparing
social networks, such as comparing “degree of node”, “density of the social network”,
“path length” and the “number of triads” in a network. The technique described here,
which I found far more successful for comparing several social networks, involves
creating a metadata table consisting of several social network variables and
applying MDS techniques, such as PREFSCAL and HICLUS. When the outputs of
these procedures are combined and analysed, a graphical representation of the
differences between several social networks is produced. This technique is most
effective when more than three social networks are compared. Because space
limitations meant that no other approaches were compared, it is impossible to state
anything more than that this technique is not perfect but does provide a systematic
method for comparing multiple social networks.
References
Barnes, J. (1954). Class and community in a Norwegian Island Parish. New York, Harper
and Row.
Coxon, A. P. M. and P. M. Davies (1982). The user's guide to multidimensional scaling:
with special reference to the MDS (X) library of computer programs, Heinemann
Educational Books Exeter.
Faust, K. and J. Skvoretz (2002). "Comparing networks across space and time, size and
species." Sociological Methodology 32(1): 267-299.
Freeman, L. C. (1979). "Centrality in social networks conceptual clarification." Social
Networks 1(3): 215-239.
Parsons, T. (1971). The social system, Psychology Press.
Scott, J. (2000). "Social network analysis." Sociology 22(1): 109-127.
Stevens, J. (2008). "Self Identified Ethnicity and Friendship: Networks Among
Post-Graduate Students at a British University." The Essex Graduate Journal of Sociology
University of Essex 8: 28-34.
Stevens, J. (2010). "Comparing Social networks: Comparing Multiple Social Networks
using Multiple Dimensional Scaling." Methodological investigations Online 5(1).
Stevens, J. (2011). "Comparison of education social networks." Essex Student
Research Online 3(2).
Stevens, J. (2013). "Comparing communities: social networks in an urban and a rural
housing estate in East of England." The Essex Graduate Journal of Sociology
University of Essex 13.
Wasserman, S. and K. Faust (1994). Social network analysis: Methods and
applications. New York.
Wellman, B. (1979). "The community question: The intimate networks of East
Yorkers." American Journal of Sociology: 1201-1231.
Wellman, B. and S. Wortley (1990). "Different strokes from different folks: Community
ties and social support." American Journal of Sociology: 558-588.