The Network Structure of Sociology Production
James Moody
Duke University
Stanford University Colloquium
March, 2007
Introduction
Outline:
•Networks & Science: Two Questions & 4 networks
•How do scientific fields evolve?
•Where do good ideas come from?
•Data Sources & Methods
•Results
•Where does sociology fit? Journal co-citation networks
•What do sociologists study? Topic networks
•Who produces sociology? Social science collaboration networks
•Discussion
Networks & Science: Two Questions & 4 networks
"Science, carved up into a host of detailed studies that have no
link with one another, no longer forms a solid whole."
Durkheim, 1933
[Figure labels: Stratification, Social Welfare, Organizations, Historical Sociology, Crime, Gender, Health]
Networks & Science: Two Questions & 4 networks
1) How do scientific fields evolve?
a) Is there a coherent logic to the ebb and flow of topics studied?
b) How does the success or failure of ideas depend on the social community
in which it is embedded?
c) (How) Does the evidentiary basis of a field shape its logic of discovery?
The descriptive answer is given by mapping the field in network space.
The analytic answer will come by modeling the emergence, growth and decline
of scientific subfields.
Networks & Science: Two Questions & 4 networks
2) Where do good ideas come from?
a) What is a good idea?
Ideas that change a scientific field. Indexed by (a) citations and (b) the
relevant topography of the networks within which the idea was originally
embedded. Ideas are not inherently good; they are recognized as “good” by
their effect on a field.
b) How do disciplines produce new ideas?
i) Intersection: Good ideas are produced by combining ideas of others in
unique ways (Burt)
ii) Development: Good ideas arise naturally from either the progressive “error
reduction” process of good normal science (Popper) or the accepted practices
of a scientific community (Crane).
iii) Peer Influence & Recognition: Any idea is a good idea if others think so,
and thinking so is influenced by the network (Gould).
iv) Resource competition: Search for prestige conditioned by organizational
structure (Fuchs)
Will model this by examining how citations are affected by field dynamics (and vice
versa).
Networks & Science: Two Questions & 4 networks
Theoretical approaches to scientific development
We are thus left with multiple action frames to guide our understanding:
Truth: Ideas run their error-reduction course (Popper)
Prestige: Actors seek the greatest visibility (Merton)
Resource competition: “To the victor goes the spoils” – Fuchs
Boundary Protection (Gieryn)
Fractal Development (Abbott)
Community Influence (SSK – Collins, etc)
Peer magnification (Gould)
Power (JL Martin)
For entire fields, these mechanisms are largely unknown and
underspecified.
• Need to extend beyond particular lab studies
• Take a large-scale “Satellite” view of science dynamics
• Link action frames to specific patterns in 4 science networks
Networks & Science: Two Questions & 4 networks
Theoretical approaches to scientific development
Four relevant networks:
1. Citation networks – a direct trace of scientific recognition &
production
2. Topic networks – clusters of scientific products related to the same
subject
3. Collaboration networks – “invisible communities” of social
interaction that produces scientific products
4. Research Communities – People linked through common research
topics (Substantively a derivative of 2 & 3)
Networks & Science: Two Questions & 4 networks
Scientific Environments
Evidentiary Basis: How do we array disciplines with respect to evidence?
Two Dimensions: Objectivity & Control
Objectivity is taken from Popper: The extent to which a given knowledge
claim is independent of the knower.
Control refers to the ability of scientists to directly manipulate the object of
study. “Lab Science” with complete ability to control apparatus (and
thus environment) represents the strongest control, while “observation”
represents the weakest.
Cases: Chemistry (Lab Science: High Objectivity & High Control)
Paleontology (Field Science: High Objectivity & Low Control)
Sociology (Social Science: Moderate Objectivity & Low Control)
Cultural Anthropology (Low Objectivity & Low Control)
This approach is very similar to Fuchs (1993)
Networks & Science: Two Questions & 4 networks
[Figure: grid of the four cases (Chemistry, Paleontology, Sociology, Cultural Anthropology) against three networks: Citation (journal citation structure), Topics (subfield evolution), Collaboration (community collaboration & cohesion).]
Networks & Science: Two Questions & 4 networks
Focusing on Sociology as a current case
The field of sociology can thus be thought of as the intersection of multiple
networks.
The shape of these networks differs across scales and over time.
- Differences between local and global visions of the network shape our
perceptions of scientific coherence.
- We tend to perceive coherence in our own specialty fields and
incoherence for the entire discipline.
- A globally federated structure that cannot easily exclude empirical
topics might still be socially coherent if scientific mixing crosscuts empirical problems.
We can see this structure by examining these 4 networks at large scale and
over time.
Data Sources
•Citation Networks
•Compiled from the ISI Web of Science journal citation tables
•Covers 1681 social science journals indexed in 2003
•Will eventually
-fill this series from 1950 to present across all fields.
-Add a sample of paper-level citations to model performance.
•Topic & Collaboration Networks (for Sociology)
•Compiled from Sociological Abstracts
•281,163 papers published between 1963 and 1999
•A sub-sample of “sociology only” papers published in a select set of
non-specialty sociology journals: 35% of the total (~100K)
•Contains information on title, abstract, keywords, author(s), tables,
journal & citation
•Will use similar indexes for Chemistry, Geology and Lit Crit
Where does sociology fit?
•Perennial debates over the existence of a theoretical core
•Rapid growth in the internal diversity of topics sociologists study:
[Figure: Number of ASA Sections by year. y-axis: 0 to 50; x-axis: 1950 to 2010.]
Where does sociology fit?
•Perennial debates over the existence of a theoretical core
•Rapid growth in the number of journals relevant to sociologists:
Where does sociology fit?
This growth & diversity has been seen as evidence for the ultimate
emptiness of sociology as a scientific discipline.
But disciplines are shaped by the connections between ideas, not the
number of ideas.
That is, we recognize fields by who they speak to as much as by what they
speak about.
The clearest empirical trace of this communication is citation.
Disciplines can then be defined as clusters of work that speak more to
each other than to anyone else, which we trace with co-citation
networks.
Where does sociology fit?
Building co-citation networks
Links in a co-citation network are constructed by measuring how similar each
journal is to every other journal.
Similarity is gauged by correlating the pattern of citations received by each
journal from every other journal.
Citing journals (rows) by cited journals (columns):
      AJS  ASR  AER  …  JER
J1     #    #    0       0
J2     #    #    0       0
J3     0    0    #       #
J4     0    #    #       #
 .
JER    0    0    #       #
Comparing across columns tells us whether the two
journals are recognized by others as similar.
Where does sociology fit?
Building co-citation networks
Journal-by-journal similarity matrix:
      AJS   ASR   AER   …   JER
AJS   1.0
ASR   High  1.0
AER   Low   Med   1.0
 .
JER   Low   Low   High      1.0
This creates a valued network of ties between pairs of journals. I use a cosine
similarity score developed in bibliometrics, keeping ties with similarity > 0.45
between journals sharing at least 2% of their citation volume.
Source: Loet Leydesdorff
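A minimal sketch of the similarity step, assuming a small made-up citation matrix in which cell [i][j] counts citations from journal i to journal j. The journal names, the counts, and the function names are hypothetical. Only the 0.45 cutoff comes from the slide, and the 2% citation-volume filter is left out because the slide does not say how it is computed.

```python
import math

def column(matrix, j):
    # Citations received by journal j from every citing journal.
    return [row[j] for row in matrix]

def cosine(u, v):
    # Cosine similarity between two citation profiles.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def cocitation_ties(citations, names, threshold=0.45):
    # Keep a valued tie for each journal pair whose similarity exceeds the cutoff.
    ties = {}
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            sim = cosine(column(citations, a), column(citations, b))
            if sim > threshold:
                ties[(names[a], names[b])] = sim
    return ties

# Hypothetical toy data: two sociology-like and two economics-like journals.
names = ["SocA", "SocB", "EconA", "EconB"]
toy = [
    [5, 9, 1, 0],
    [8, 6, 0, 1],
    [1, 0, 4, 7],
    [0, 1, 6, 5],
]
ties = cocitation_ties(toy, names)
for pair, sim in sorted(ties.items()):
    print(pair, round(sim, 2))
```

In this toy case only the within-field pairs clear the cutoff, which is the pattern the discipline maps rely on.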
Where does sociology fit?
Economics co-citation similarity network
Density = 0.197
N=152
Isolates (not shown): 5
Node size proportional to log(degree)
Where does sociology fit?
Political Science co-citation similarity network
Density = 0.160
N=69
Isolates (not shown): 10
Node size proportional to log(degree)
Where does sociology fit?
Sociology co-citation similarity network
Density = 0.140
N=69
Isolates : 7
Where does sociology fit?
•Sociology “fits” at the center of the social sciences. We are not as internally
cohesive as Economics or Law, but more so than many (anthropology, allied
health fields).
•This represents a tradeoff. We have traded unique dominance of a topic
(markets, politics, mind, space) for diversity & thus centrality.
•Sociology is an interstitial discipline (Abbott, 2004) in at least two senses:
•There is no content topic we can reasonably exclude
•We pull together, and generate, the ideas and topics covered by
specialty disciplines.
•This makes us uniquely positioned to provide insights on many different
empirical questions. How have the topics sociologists study shifted over
time?
How does this look in the Physical Sciences?
What do sociologists study?
How do we capture the internal organization of research problems?
•Could use paper-level citation networks (see Hargens 2000), but data
are difficult & expensive to obtain for large-scale networks.
•Can examine the network of papers formed by the topics they write
about.
•Directly taps scientific content
•Purely endogenous creation of topics that allows new topic areas to
emerge and old ones to die over time
•Tractability: data can be extracted from information held in
Sociological Abstracts
•Multiple levels:
•Coarse grained: Focus solely on keywords (Light 2005)
•Fine grained: Use all information available (title, abstract,
keywords)
What do sociologists study?
A fine grained view
Data Selection & Manipulation:
Index entries contain title, abstract and keywords that summarize the
paper’s content.
•Sample all papers indexed within four 3-year windows between 1970 and
1999.
•Construct a paper-by-word matrix, where the ij cell lists how many
times word j is used to describe paper i.
•Word set is stemmed to get at root words
•A stop-list is used to minimize inclusion of low-information content
words (“the” “and” “is” etc.) or words commonly found in the data
source (“Tables” “Figure” “References”)
•Construct a network by linking the most highly correlated papers
•Use correlation of 0.40 or better
•Ties are treated as valued in the network analyses
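The steps above can be sketched as follows, under loud assumptions: the four paper texts are invented, the stop-list is a tiny illustrative one, and `crude_stem` is a hypothetical stand-in for a real stemmer. Only the 0.40 correlation cutoff and the valued ties come from the slide.

```python
import math
import re
from collections import Counter

# Illustrative stop-list only; the talk's actual list is not given.
STOP_WORDS = {"the", "and", "is", "of", "in", "a", "about",
              "tables", "figure", "references"}

def crude_stem(word):
    # Hypothetical stand-in for a real stemmer: strip a few common suffixes.
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def word_counts(text):
    # One row of the paper-by-word matrix, stored sparsely.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(crude_stem(t) for t in tokens if t not in STOP_WORDS)

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def topic_network(texts, threshold=0.40):
    # Link papers whose word profiles correlate at or above the cutoff.
    counts = [word_counts(t) for t in texts]
    vocab = sorted(set().union(*counts))
    rows = [[c[w] for w in vocab] for c in counts]
    ties = {}
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            r = pearson(rows[i], rows[j])
            if r >= threshold:
                ties[(i, j)] = r  # valued tie
    return ties

texts = [
    "Job networks and job information in labor markets",
    "Labor markets, networks, and information about jobs",
    "Suicide rates and religious integration",
    "Religious integration and suicide",
]
paper_ties = topic_network(texts)
print(sorted(paper_ties))
```

The pairwise loop is quadratic in the number of papers, so a corpus the size of Sociological Abstracts would need a sparse or blocked computation instead.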
What do sociologists study?
A fine grained view
Analysis & Presentation: General approach is “quantitatively inductive”
- Construct a low-dimensional map of the network, using contour
sociograms. These allow for full information in the network structure.
-Use cluster analysis to identify distinct topics
-Use a variant of Moody’s RNM algorithm to cluster the network
This clustering routine:
(a) is efficient: Allows clustering on 10s of thousands of nodes
(b) automatically specifies the optimal number of clusters
(c) allows that some cases can fall ‘between’ clusters
-I set a minimum cluster size of 12 papers published over the 3-year
window.
-Evaluate the clustered papers for content and label the maps.
What do sociologists study?
A fine grained view
Analysis & Presentation: General approach is “quantitatively inductive”
Compare the maps over time qualitatively, looking for general changes
in the frequency & alliance of topics.
Examine shifts in structural indicators of the extent of clustering &
cluster size distributions.
What do sociologists study?
A fine grained view
Example: One-step neighborhood of “More information, better jobs?”
What do sociologists study?
A fine grained view: Content
(all journals)
The cluster content of the topic network has evolved slowly:
•Some clearly central specialties have remained prominent over the entire
period. This includes larger areas such as:
• Class & Stratification
• Race & Ethnicity
• Education
• Gender (Strongest from 1980s on)
• Family (Strongest from the 1980s on)
• Crime
As well as clearly distinct, though numerically smaller bodies of research
related to
• Suicide
• Sociology of Science, Technology & “Reflexive” sociology
• Unions
What do sociologists study?
A fine grained view: Content
(all journals)
The cluster content of the topic network has evolved slowly:
•The clearest change has been the rapid growth of social research on health.
•Dominated by a very large body of research related to HIV/AIDS
•Other areas of relative growth include:
•Family topics were most prominent in the 1980s
•A strong presence of research on sex & sexuality emerged in the 1980s
and 90s
•Relative declines have come in areas such as:
• Groups
• Interaction
• “Radical” studies
• Elite studies
Summary: A move away from basic social processes toward studying social problems, with
a growing uniqueness of theory & method
What do sociologists study?
A fine grained view: Content
(Restricted Sample)
The cluster content of the restricted topic network has evolved similarly to the wider
social science field:
•The subfield structure is less dominated by the purely applied work on
HIV/AIDS in the 90s, but there is still a clear association of topics around
sexuality, health and AIDS.
•Health, Family, Education, Gender, and Race are always prominent and
large.
•The relative prominence of “reflexive sociology” is much higher:
•These topics cannot be published elsewhere, and the resulting tight
cluster looks proportionately larger in the smaller sample.
What do sociologists study?
A fine grained view: Content
We can measure the degree of consensus in words used to describe
papers with:
$C = \sum_i p_i^2$
Where $p_i$ is the proportion of times word i is used
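As a small illustration, the consensus score is the sum of squared word proportions, so it rises when a few words dominate and falls when usage is spread evenly. The function name and the toy word lists below are assumptions made for the example.

```python
from collections import Counter

def word_consensus(words):
    # Sum of squared proportions of each distinct word.
    counts = Counter(words)
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())

concentrated = word_consensus(["class", "class", "class", "gender"])
dispersed = word_consensus(["class", "gender", "race", "health"])
print(concentrated, dispersed)
```

With four equally used words the score is 1/4, the minimum for a four-word vocabulary; if every paper used one word it would be 1.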
What do sociologists study?
A fine grained view: Content
[Figure: Word Consensus Scores, 1970-1999. y-axis: C (x 100), 0.11 to 0.13; x-axis: year, 1965 to 2000; series: Soc Only, All SA Journals.]
What do sociologists study?
A fine grained view
(Core Soc)
[Figure: Proportion of papers falling inside a cluster. y-axis: 0.1 to 1; x-axis: year, 1965 to 2000; series: Total and Restricted samples, each for clusters with Cn > 12 and Cn > 100.]
What do sociologists study?
A fine grained view
We can measure the extent that ties fall within clusters with the
modularity score:
$$M = \sum_s \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^2 \right]$$
Where:
s indexes clusters in the network
$l_s$ is the number of lines in cluster s
$d_s$ is the sum of the degrees of s
L is the total number of lines
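A minimal sketch of the modularity score as defined here: for each cluster, take the share of all lines that fall inside it and subtract the squared share of total degree it holds, then sum over clusters. The example treats ties as unweighted, whereas the talk treats them as valued; the graph and the function name are hypothetical.

```python
def modularity(edges, cluster_of):
    # edges: list of (u, v) pairs; cluster_of: node -> cluster label.
    total_lines = len(edges)
    within = {}   # l_s: lines with both ends inside cluster s
    degree = {}   # d_s: sum of degrees of nodes in cluster s
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            within[cu] = within.get(cu, 0) + 1
    return sum(
        within.get(s, 0) / total_lines - (d / (2 * total_lines)) ** 2
        for s, d in degree.items()
    )

# Two triangles joined by a single bridging line.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_clusters = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
one_cluster = {node: 1 for node in two_clusters}
print(modularity(edges, two_clusters), modularity(edges, one_cluster))
```

Putting every node in one cluster gives a score of zero, so the measure rewards partitions only when ties concentrate within clusters beyond what degree alone would produce.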
What do sociologists study?
A fine grained view
[Figure: Network Modularity, 1970-1999. y-axis: Modularity Score, 0.7 to 0.85; x-axis: year, 1965 to 2000; series: All SA Journals, Soc Only.]
What do sociologists study?
A fine grained view
[Figure: Number of Clusters, 1970-1999. y-axis: Total Number of Clusters, 0 to 500; x-axis: year, 1965 to 2000; series: All Journals, Soc Only.]
What do sociologists study?
A fine grained view
[Figure: Mean Cluster Size, 1970-1999. y-axis: Mean Size of Clusters, 0 to 80; x-axis: year, 1965 to 2000; series: All Journals, Soc Only.]
What do sociologists study?
A fine grained view
The cluster structure of the topic network:
•The vast majority of papers can be assigned to clear clusters, with
slight growth in this proportion over time.
•The number of clusters has increased rapidly, though slightly
slower within core sociology than in the broader field of social
science.
•There has been significant growth in the tails of the
distribution – the size distribution is more skewed in later
periods.
•The modularity of the network has increased over time, though
most of this change is between the 1970 and 1980 periods.
This meshes with our intuition of “separate worlds” in the social
sciences: larger, more distinct topical production of science work.
What do sociologists study?
A fine grained view
Next steps:
1. Build a continuous moving window to fill in the dates from 1960 to
2005.
2. Link clusters across time periods, so we can track exactly the relative
growth and decline of each subfield.
3. Model this growth as a function of connections to other fields, author
composition and disciplinary environment.
4. Build this network’s dual: scientists connected through topics.
What do sociologists study?
A clustered topic structure focused strongly on practical problem solving has a hint of
Durkheim’s concern: Is there any integration across these topic clusters?
We shouldn’t jump too quickly to the fractured conclusion:
• Topic clusters are formed from papers, and papers typically have well
encapsulated ideas. They have a small “maximum digestible unit.”
• Scientific integration is really about how scientists bridge these multiple
topics.
• If authors write and collaborate across these topics, ideas can quickly
disseminate as well.
What is the structure of the collaboration graph? If it is highly clustered, that
would signal potential fragmentation. This leads to the next question: who produces sociology?
Who produces sociology?
Science is typically produced through collaboration, both formally and
informally (Crane 1972, Crane & Small 2000, Friedkin 1998).
The best empirical trace of collaboration for large communities of science is
coauthorship.
•Misses the less intense collaborations recognized in acknowledgements,
discussions, colleagues reading each other’s work
•But should provide the strongest test of a fractionalization hypothesis,
since the set of people we write with should be more like us than the set
of people we have lunch with or discuss work with informally.
•There are differences across subfields in formal collaboration rates,
which, if anything, should magnify the extent of observed fragmentation.
Who produces sociology?
[Figure: Coauthorship Trends in Sociology, Sociological Abstracts and ASR. y-axis: proportion of papers with >1 author, 0 to 0.75; x-axis: Year, 1930 to 2000; series: Sociological Abstracts, ASR.]
Who produces sociology?
[Figure: Distribution of Coauthorship Across Journals, Sociological Abstracts, 1963-1999. y-axis: proportion of papers with >1 author, 0 to 1; x-axis: Coauthorship Rank, 0 to 1100; labeled journals: Child Development, Soc. Forces, J. Health & Soc. Beh., ASR, J. Am. Statistical A., AJS, Acta Politica, Soc. Theory, Signs, J. Soc. History.]
Who produces sociology?
Construct a collaboration network by assigning an edge between any pair of
people who coauthored a paper together.
Example Paths: 3-steps from Stan Wasserman
N=361
Node size proportional to log of degree
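A minimal sketch of this construction, assuming each paper is given as a list of author names. The author labels and function names are hypothetical. The second function collects everyone within a fixed number of coauthorship steps of a starting author, as in the 3-step example.

```python
from collections import deque
from itertools import combinations

def coauthor_graph(papers):
    # One edge between every pair of people who share a paper.
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set())
        for a, b in combinations(set(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def within_steps(graph, start, max_steps=3):
    # Breadth-first search that stops expanding at max_steps.
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_steps:
            continue
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

# Hypothetical author lists; "F" is a sole author and so an isolate.
papers = [["A", "B"], ["B", "C"], ["C", "D"], ["D", "E"], ["F"]]
graph = coauthor_graph(papers)
neighborhood = within_steps(graph, "A", max_steps=3)
print(neighborhood)
```

Sole-authored papers add nodes with no edges, which is how the structurally isolated authors enter the network.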
Who produces sociology?
The simplest summary test for a fragmented network is to measure the extent of
clustering in the network. Watts’ work on the “small-world problem” suggests that if
the collaboration network is a small world network it might be fractured.
Small-world (SW) graphs: C is large and L is small
•High relative probability that a node’s contacts are connected to each other.
•Small relative average distance between nodes
Who produces sociology?
In a highly clustered, ordered
network, a single random
connection will create a shortcut
that lowers L dramatically
Watts demonstrates that Small
world properties can occur in
graphs with a surprisingly small
number of shortcuts
Who produces sociology?
Locally clustered graphs are a good model for coauthorship
when there are many authors on a paper.
Paper 1
Paper 2
Paper 3
Paper 4
Paper 5
Newman (2001) finds that coauthorship among natural scientists
fits a small world model.
I test this model on the sociology coauthorship network,
using all authors from 1963 – 1999.
Who produces sociology?
             Observed   Random
Clustering   0.194      0.206
Distance     9.81       7.57
The sociology network is less clustered than would be expected
by chance and has somewhat longer overall distances.
This suggests that it does not have a small-world structure.
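The two statistics in this comparison can be sketched as below. The clustering function uses the average local clustering coefficient (the Watts-Strogatz definition), which is an assumption, since the slide does not say which definition was used. Distance is the mean shortest path over reachable pairs. The toy graph is hypothetical, and generating the matched random graph for the comparison is not shown.

```python
from collections import deque
from itertools import combinations

def average_clustering(graph):
    # Mean over nodes of the share of neighbor pairs that are themselves linked.
    total = 0.0
    for node, nbrs in graph.items():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += links / (k * (k - 1) / 2)
    return total / len(graph)

def mean_distance(graph):
    # Mean shortest-path length over all reachable ordered pairs.
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Toy graph: a triangle (a, b, c) with one pendant node d attached to c.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(average_clustering(toy_graph), mean_distance(toy_graph))
```

A small-world reading would need clustering well above the random baseline with distance close to it; the observed values show neither.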
Who produces sociology?
The network has a broad Core-periphery structure
[Figure: core-periphery diagram. Region labels: Bicomponent, Component, Unconnected, Structurally Isolated. Counts shown: (68,923), 59,866, 38,823, 29,462.]
Who produces sociology?
Internal Structure of the Coauthorship Core
[Figure: map of the coauthorship core; labeled regions: Health, General Sociology.]
Who produces sociology?
•Strong specialty effects for ever-coauthored
Unlikely:
History & Theory
Sociology of Knowledge
Radical / Marxist Sociology
Feminist / Gender Studies
Likely:
Social psychology
Family
Health & Medicine
Social Problems
Social Welfare
Who produces sociology?
•Weak specialty effects for network embeddedness
•Large number of coauthors increases embeddedness
•Large number of people on any given paper decreases
embeddedness
Summary & discussion
Social Science Citation Structure
•Economics, Law, Psychology, Business/Management, Linguistics
are most cohesive
•They are also “peripheral” in that they speak to a relatively
limited set of problems
•Sociology is at least as cohesive as Political Science, and more
cohesive than fields such as Anthropology, Social Work, Education
or allied health fields that all have more limited empirical domains
•Our position represents a tradeoff between internal cohesion
and external centrality.
Summary & discussion
Scientific Topic Network
•Big-Picture: A general progression towards problem solving and
the specialization of work on theory & methods (Light 2005).
•Fine-grained structure:
•A federated topic structure that has largely retained that form
since the 1970s, though there have been shifts in substantive
topics.
•Key content areas have remained largely constant
•Race, Family, Class, Gender, Science, and Health
•A decrease in focus on general foundation problems
•Group structure, community, interaction
•An increase in work on social problems
•Health & HIV/AIDS -related topics
•Some (minor) evidence for greater homogeneity in topics
discussed
Summary & discussion
Scientific Collaboration Network
•The network is not divided into small research-area based clusters.
•There is no partition that strongly separates scientists.
•This has to imply that authors bridge topic clusters.
•This is good for social cohesion, and probably good for
theoretical cohesion.
•Caveat: There is evidence for a division based on research method,
with largely quantitative work more likely to be coauthored, though
there is no such simple division in the topics network.
Summary & discussion
Combined, these models suggest a discipline that is integrated socially
and locally cohesive topically.
Discipline-wide integration will likely only increase as pressures for
collaboration push more scientists to work together across topics.
However, the perception of disintegration will likely continue:
• because most of us are exposed to work outside our areas only through what
appears in the general journals.
•But almost all of the topical cohesion is due to “normal science”
work occurring in specialty journals.