Extended version of chapter published in
International Encyclopedia of Political Science,
Bertrand Badie, Dirk Berg-Schlosser, Leonardo Morlino (eds),
Sage, 2011
John Gerring
Department of Political Science
Boston University
232 Bay State Road
Boston MA 02215
[email protected]
Craig W. Thomas
Associate Professor
Daniel J. Evans School of Public Affairs
Box 353055
University of Washington
Seattle, WA 98195-3055
[email protected]
A persistent dualism between qualitative and quantitative methods afflicts
contemporary social science. This paper explains that dualism and offers a way forward. The
key methodological issue, we argue, is the comparability of adjacent pieces of evidence
(observations). Quantitative work presumes that observations are highly comparable because
quantitative tools require a single, uniform metric and a precise point estimate of each
observation. Qualitative work, by contrast, need not assume a high level of comparability
among observations because qualitative tools are linguistic and words are open to a variety
of meanings. Differing assumptions about comparability may be grounded in differing views
about the nature of things or the nature of knowledge; but they are not inherently
epistemological or ontological. In this respect, our explanation differs from standard
interpretations of the qual/quant divide.
When you can measure what you are speaking about, and express it in numbers,
you know something about it; but when you cannot measure it, when you cannot
express it in numbers, your knowledge is of a meager and unsatisfactory kind: it
may be the beginning of knowledge, but you have scarcely, in your thoughts,
advanced to the stage of science, whatever the matter may be.
-- Lord Kelvin 1
Not everything that can be counted counts, and not everything that counts can be
counted.
-- Albert Einstein 2
What things should a researcher count, and when should she count them? Are all
things countable worth counting? What is “qualitative” evidence, anyway? Is it simply
non-quantitative evidence? If so, what does “quantitative” mean? What is the relationship
between these two kinds of evidence and causal inference?
Perhaps no division in the social sciences is so persistent, so nettlesome, and so
poorly understood as the division between quantitative and qualitative ways of knowing. The
cleavage can be traced back to the first applications of statistics within the disciplines of
economics, political science, and sociology, and became increasingly acute in the late
twentieth century as quantitative approaches gained in stature, grew in complexity, and
pushed qualitative empirical analysis out of the limelight (Barnes 1925; Becker 1934; Bernard
1928; Gosnell 1933; Hammersley 1989; Jocher 1928; Marshall 1897; McLaughlin 1991; Snow
1959/1993; Stouffer 1931; Teggart 1939/1967; Waller 1934; White 1930; Znaniecki 1934).
During this period, the division between qualitative and quantitative methods became
associated – unfortunately and inappropriately – with the rival epistemological positions of
positivism and interpretivism. Smith (1989: 29) summarizes the now-familiar stand-off:
On the one hand, there are those who argue that only through the
application of quantitative measurements and methods can the social
sciences ever hope to become ‘real’ sciences; on the other hand, there are
those who claim that the subject matter of the social sciences is simply not
amenable to quantification and all attempts to impose such measures and
methods upon social behavior is just so much nonsense. 3
While there have been many attempts to shed light on this persistent division in the social
sciences, work on this question is generated primarily by writers who occupy one of the two
camps. These writers tend to be either strong partisans or visceral opponents of the
“quantitative worldview.” A chronic dualism besets these debates, in large part because the
distinction between quantitative and qualitative forms of descriptive and causal inference
has been folded into the debate between positivism and interpretivism.
Positivists (aka naturalists), usually identified with quantitative methods, present their
perspective as hegemonic: there is, or ought to be, only one logic of inference (Blalock 1982,
1989; Friedman 1953; Goldthorpe 2000; King, Keohane, Verba 1994; Lazarsfeld, Rosenberg
1955; Lieberson 1985; Wilson 1998). 4 The conclusion of these scholars is either that there
are no important distinctions between qualitative and quantitative work or, to the extent that
such distinctions exist, they are to the detriment of qualitative scholarship. “When possible,
quantify,” is the motto of this camp (Benoit 2005). Where quantification is not possible, this
camp encourages qualitative scholars to follow the logic of quantitative reasoning.
Defenders of qualitative work typically emphasize the limits of quantification and the
insights that can be gained through an interpretive approach to social action. Rather than a
unified logic, interpretivists suggest that there might be multiple logics at work in the social
sciences. These multiple logics stem from epistemological or ontological commitments,
which may themselves be culturally prescribed, political, or historical in origin (MacIntyre
1971/1978; Rabinow and Sullivan 1979; Shweder 1996; Shweder and LeVine 1984; Taylor
1962; Winch 1958). Yet, the intent of many of these authors is often as polemical as their
Thus are the lines of battle cast. On the one side, positivists claim there are no
important divisions between qualitative and quantitative work; this is the hegemonic vision
of naturalism. On the other side, interpretivists (not to mention post-structuralists) claim
that the various schools within the social sciences are irreconcilably divided, perhaps even
incommensurable. We believe that these positions are unhelpful and, in certain important
respects, misleading. The methodological tools of social science are neither uniform (with
the same standards applying for both quant and qual) nor dichotomous (with a bright line
separating quant and qual). While we agree with the general sentiment expressed by the
unificationists – and certainly with the goal of scientific cumulation across fields and
methods – we find that these sorts of pronouncements are primarily rhetorical. That is, they
either ignore or condemn differences among well-established traditions of scholarship that
are quite real and eminently justifiable. Yet, these differences are not well accounted for by
the usual dichotomous logic – i.e., quantitative versus qualitative, large-N versus small-N,
numbers versus narrative, positivism versus interpretivism, formal methods versus informal
intuition. Things are more complicated. 5
We present a new way of thinking about these issues that rests upon the key concept
of comparability. At the level of observations, we argue in the first part of the paper that the
principal factor separating qualitative observations from quantitative observations is the
assumed comparability of evidence. Quantitative observations presume a population of
things that can be readily measured, counted, and hence compared. Qualitative observations,
by contrast, presume an empirical field where individual pieces of evidence are not directly
comparable to one another, at least not to the extent that they can be precisely measured. In
this sense, quantitative work is appropriately labeled nomothetic, and qualitative work
idiographic. The key point is that the difference between these two kinds of observations
rests on the presumed comparability of adjacent observations, not (at least not directly) on
the size of N, the style of presentation (numbers or narrative), epistemology, ontology, or
the formal structure of the method.
Our purpose in this paper, it should be stressed, is not to reinforce existing cleavages
in the social sciences. Rather, we wish to draw attention to what we see as the most
important underlying issue in these debates, an issue that has not received much recognition.
We do not pretend that this is the only issue underlying the qualitative/quantitative divide,
but we argue that it is the most fundamental. If analysis is based on comparison, the central
methodological question is what we can reasonably compare, and how precise those
comparisons can be. Here, there are legitimate differences of opinion, and they are not the
sort that can be empirically proven.
What makes a descriptive statement about the world “qualitative” rather than
“quantitative”? The question is so apparently obvious that it is difficult to reflect upon.
Indeed, a recent lexicon focused on qualitative inquiry notes, in the entry for this term, that
“the adjective does not clearly signal a particular meaning” (Schwandt 1997: 129). Rather, it
is used as a “blanket designation for all forms of [hermeneutic] inquiry including
ethnography, case study research, naturalistic inquiry, ethnomethodology, life history
methodology, narrative inquiry, and the like.” In sum, concludes the author of this dictionary
of qualitative methods (with no apparent sense of irony), “‘qualitative research’ is simply not
a very useful term for denoting a specific set of characteristics of inquiry” (Ibid. 130). 6 The
inherent fuzziness of the concepts prompts some scholars to argue that the
qualitative/quantitative debate must be a red herring, for neither term is very specific
(Brodbeck 1968: 573-4; Hammersley 1992). Alternatively, “the transformation of quantity into
quality, or conversely, is a semantic or logical process, not a matter of ontology” (Kaplan
1964: 207). In other words, it is not very important, and a distraction from the real issues of
social science methodology.
Yet, these two concepts, and the attendant debates, refuse to be banished. Books,
articles, courses, institutes, and arguments continue to bear these names. There must be a
there, there, somewhere. One possibility is that the distinction between qual and quant is
simply a matter of numbers versus narrative, counting versus recounting. Wherever one sees
a number, quantitative inquiry is taking place; wherever one espies a word, qualitative inquiry
is at work. By this off-hand definition, all work in the social sciences includes both elements,
a rather unhelpful conclusion. Alternatively, one could calculate a specific ratio of numbers
to words in a given work in order to attain a quantitative/qualitative score. But this does not
seem to be what most authors have in mind when they invoke these categories.
We propose to understand the pervasive qual/quant split in terms of presumptions
about the comparability of evidence. 7 We understand comparable observations as members of
the same population and therefore potential members of the same sample. They are
examples of a similar phenomenon. They are apples, rather than apples and oranges, to use
the time-honored metaphor. Note that comparing apples and oranges is not prohibited;
however, to do so we must adopt a higher-order concept – e.g., fruit – according to which
apples and oranges are similar. Comparison, writes Fredrik Barth (1999: 78), “involves
identifying two forms as ‘variants’ of ‘the same,’ which means constructing an over-arching
category within which the two forms can be included, and compared and contrasted.” This
common-sense meaning of comparability is widely understood and agreed upon. But what
does it mean for items to be comparable within the context of social science research?
Surely, it is more than shared membership in an arbitrary linguistic category.
First, comparable observations must share a set of relevant descriptive attributes
(dimensions). This is what makes them comparable. The observations need not demonstrate
the same values for those attributes. Each observation in a sample may “score” differently
on each attribute in either quantitative or qualitative terms – e.g., high/low, present/absent,
and so forth. But each observation must be score-able on some scale, and the attribute must
mean (roughly or precisely) the same thing across the contexts in which it is being compared.
We label this descriptive comparability, and argue that it is a fundamental feature of conceptual
validity. The defining attributes of a concept must be valid across the designated
observations. Otherwise, we say that a concept is being “stretched” (Collier and Mahon
1993; Sartori 1970). 8
A second kind of comparability refers to the inter-relationship of two factors in a
causal analysis, the cause (or vector of causal factors), X, and the outcome, Y. The specified
X/Y relationship must hold across the chosen observations. We label this causal comparability.
This idea is familiar to statisticians, who often invoke the assumption of unit homogeneity as
part of their models. “For a data set with n observations, unit homogeneity is the assumption
that all units with the same value of the explanatory variables have the same expected value
of the dependent variable” (King, Keohane, Verba 1994: 91).
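The quoted assumption can be written compactly. The following is a sketch of one common way to state it, not a formula taken from the cited text; the unit indices i and j and the value x are notation introduced here for illustration, while X and Y are the cause and outcome defined above.

```latex
% Unit homogeneity: any two units that share the same value x of the
% explanatory variable(s) X have the same expected value of the outcome Y.
\[
  E\left(Y_i \mid X_i = x\right) \;=\; E\left(Y_j \mid X_j = x\right)
  \qquad \text{for all units } i, j \text{ and all values } x .
\]
```

Read this way, causal comparability is a claim about the whole set of chosen observations: wherever X takes the same value, the expected outcome is assumed to be the same.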
Thus, there are two kinds of comparability: descriptive and causal. The first is
presumed in the second. If a sample of observations is assumed to be causally comparable,
then it must also be descriptively comparable. In statistical research, the assumption of unit
homogeneity makes this explicit, but it must also be true more generally, for causal
comparability can exist only in the presence of descriptive comparability.
We propose that the underlying issues in the enduring qual/quant debate are
tethered to the problem of comparability (as defined). To “quantify” an observation is to
formulate it in terms that can be explicitly and precisely compared across a large number of
observations, i.e., where a concept can be expressed on a numerical scale, a metric, a variable
(we use these three words interchangeably). Note that a number by itself (e.g., “70”) is not a
quantitative observation. Its link to the empirical world depends upon its connection to a
metric such as temperature. “Seventy degrees Fahrenheit” is a quantitative observation, while
“70” is not. Similarly, a phone number does not measure anything, for it is not based on a
scale. As such, it is more properly classified as qualitative than quantitative. Qualitative
observations rely primarily on natural language, though they may also include numbers so
long as those numbers are not connected to particular scales. Quantitative observations
combine natural-language words (nouns, verbs, or adjectives) with numbers according to
some pre-assigned metric. It is a question of measurement, which “in the most general
terms, can be regarded as the assignment of numbers to objects (or events or situations) in
accord with some rule” (Kaplan 1964: 177; see also Adcock and Collier 2001; Cicourel 1964:
Let us begin with a discussion of the concept of “precision.” Note that to simply recode a dichotomous natural-language category as a series of binary numbers does not make
it any more precise. Thus, 0/1 is no more precise than “pregnant/not pregnant.” However,
numerical scales offer the possibility of greater precision when the number of categories
surpasses the categories inherent in natural language, as well as in circumstances where these
categories can be understood as positions on a continuous (interval) scale. To say that one
room is “warmer” than another is comparative, but it is less precise than saying that one
room is 70 degrees Fahrenheit and the other is 60 degrees Fahrenheit. Thus, in many
situations the use of a quantitative idiom allows for more precise comparisons across units.
In all situations, the use of a scale is at least as precise as natural language (in the sense that no
precision is lost in the translation of words to numbers). Again, to use a quantitative idiom
does not entail great precision; it entails the possibility of great precision (as well as a more
explicit set of comparisons).
Precision should not be confused with certainty. Qualitative or quantitative statements
may be uttered with more or less confidence. For example, one might say “I would guess
that the room is 70 degrees” or “I would guess that the room is warm.” With quantitative
statements, a mathematical indicator of uncertainty may accompany the point score, as in
World Bank estimates of the quality of governance cross-nationally – where, as it happens,
the standard errors are extraordinarily high (Kaufmann, Kraay, and Zoido-Lobatón
1999). When we say that quantitative observations are more precise we are
referring to the point estimate, not the degree of uncertainty (or dispersion).
As we said, quantitative statements are both more precise (at least potentially) and
more explicit. This is because the very act of creating a numerical scale requires a set of
explicit comparisons and an explicit comparison-set – a domain. Scales cannot be developed
in highly specific contexts. Imagine developing a barometer of a single event, say, the
terrorist attacks that occurred on September 11, 2001. The idea is nonsensical because scales
make sense only relative to classes of events. One can have a terrorism meter, but not a 9/11
meter (unless of course that event is being used as a metric for understanding other events,
in which case it becomes a comparative metric). Weather can be measured precisely because,
for one thing, there is lots of weather to measure, and temperature is thought to have the
same general meaning in many different contexts. Granted, all scales are bounded; there is
no universal scale (a scale applying to everything). Some things, like metaphysics, have no
temperature; the concept of temperature (and whatever scale might be used to measure it)
does not apply in this domain. The point is that relative to words, which are only loosely and
implicitly comparative, scales are precisely and explicitly comparative, and their range is
usually quite broad (otherwise, why bother to develop a systematic scale?).
Now, it is true that some natural-language adjectives, such as “warm,” are explicitly
comparative. But most words are more ambiguous. This is apparent in the extra locutions
that are necessary to render ordinary language comparative. One must clarify “warmer than,”
“more chair-like than,” and so forth; whereas, to append such judgments to a numerical
scale is redundant. (One does not say, “70 degrees F, warmer than 65 degrees F.”) Numerical
scales are already comparative, and no matter what one does with a numerical observation it
cannot lose its precise, explicitly comparative quality.
To be sure, if one labels an object with a noun – e.g., “chair” – one is implicitly (if
not explicitly) comparing it with other objects: non-chairs. Language has this universal
aspect; if we call something X, we imply that other things are not-X, or less-X or more-X.
However, the comparisons are vague. It is unclear, for example, where a chair leaves off and
a stool begins, for few words – and very few key words in social science – have crisp
boundaries. More importantly, most words are multivalent; they have more than one
attribute and consequently can mean more than one thing. Thus, to say that an object is
“not-X” could mean a number of different things, depending on the attribute(s) that are
intended by an author, or understood by the reader. Moreover, a word usually gains meaning
by its context, and this context is undefined in settings other than that which the author is
studying. Additionally, the other objects that are not-X are typically not defined, in which
case the larger population of cases (the domain of the inference) remains implicit. Finally,
words are contingent upon a particular natural language, and this imposes another sort of
contextual boundary against comparison. (By contrast, the number “5” and the operator “=”
mean the same thing everywhere – since the adoption of a uniform language of mathematics
– and they also mean the same thing in all contexts in which they might be employed.)
Frequently, natural-language comparisons are without any obvious comparative
reference-point. The statement “Caesar crossed the Rubicon” is comparative in so many
possible ways that it might be considered non-comparative: he did not cross it; he did not
cross the Tiber; it was not Brutus who crossed the Rubicon; and so forth. If this is
“comparative,” it is in the most minimal sense. Yet, it is important for our purposes to
recognize that comparison is a matter of degree. Qualitative observations can be more or
less comparative, but quantitative observations are almost always more precisely and
explicitly comparative.
One final clarification is in order. We have said that all quantitative statements about
the world invoke a class of events; these form the basis for the metric. However, it does not
follow that quantitative statements about the world are necessarily broader in scope than
qualitative statements. Indeed, the very fuzziness of natural language issues a license to
generalize – for one can avoid saying anything terribly specific – while the exactness of
quantification may rein in the temptation to generalize. It follows that qualitative statements
can be either very restricted in scope (as in our previous example about the singular event of
9/11) or extremely broad. Saying something in words does not affect the scope of the
inference. Saying something with a metric, however, presupposes a class of referents, which
is to say it must make reference to more than one discrete event (and these reference-points
must be fairly precise and explicit).
In principle, any qualitative observation can be converted into quantitative form, as
attested by the plethora of methods designed to perform precisely this function, e.g.,
NUD*IST software (Gahan and Hannibal 1999), Computer Assisted Qualitative Data
Analysis Software (CAQDAS) (e.g., Fielding and Lee 1998), various narrative-based methods
(Abbott 1992; Abell 2004; Büthe 2002; Griffin 1992), as well as more generic forms of
content analysis (Krippendorff 2003). There is no such thing as a non-quantifiable
observation because any single statement that can be made about one phenomenon could
also be made about another phenomenon, thus providing the possibility of some sort of
Yet, it is not clear that one would always want to make this transposition from words
to words-with-numbers (variables). Indeed, there are usually costs associated with this
conversion. The tradeoff may be understood in terms of precision and explicitness on the one
hand, and depth (or richness) on the other. More concisely, the analyst has the option of
describing thinly or describing thickly. Begin with the fact that words are usually multivalent
(Robinson 1954); they generally carry a variety of attributes, some of which may not even be
logically consistent. This is particularly true of key words in social science – e.g., democracy,
justice, corporatism, political party, and so forth. When one of these words is converted into
a measurable variable, which is to say into a precise scale, a researcher is generally forced
to drop one or more of its attributes. For not all of these attributes will be precisely
applicable to the class of phenomena that the concept is now (quite explicitly) intended to
cover. Recall that we are not necessarily describing an expansion in scope, for natural
language can reach as far as mathematical variables. But in making the comparison precise
and explicit, it is usually necessary to narrow the definition of the natural-language concept.
To be sure, it could be that the intension of the natural-language concept is also quite a bit
narrower than the full set of attributes normally (in ordinary language) associated with the
term. An author is free to define a term as she sees fit; qualitative work is not wedded to
ordinary language (Sartori 1984). The point is, in creating a variable one is forced to make
explicit choices about which definitional attributes properly apply to a class of phenomena,
and which do not. This is likely to prompt some narrowing of the semantic options. And
this is why we consider the choice to quantify a concept a move toward thin description.
More explicit comparisons are being made, but they are narrowed down to one or several
dimensions (the chosen attributes of the core concept).
Similarly, if one chooses to particularize, rather than to generalize, natural language is
the obvious vehicle of choice. As we have pointed out, it is inappropriate to construct a scale
when the class of instances under investigation is one or several. A scale presupposes a
population. By contrast, a word can be used in a highly specific context; it does not
presuppose an explicit comparison with other instances. This means that in describing the
singularity of an event one is drawn towards the implements of natural language. The lack of
perfect commensurability between words used in one context and the same words used in
another context allows the researcher the facility to elucidate what is different – categorically
(qualitatively), not marginally (quantitatively) – about that phenomenon. A very high score
on some scale can be (indeed, must be) indicated with a quantitative metric. But a very
different kind of score requires a word, perhaps a series of words.
In short, there are gains and losses in the transposition of words to numbers, and
vice-versa. What is interesting about this classic debate is that both approaches may be described as
“reductionist.” Quantitative studies are often accused of reducing reality, and in the process
distorting that reality to fit the austere requirements of the quantitative format. Each piece of
reality must be sliced up into variables and these must be comparable across all observations.
Qualitative studies are accused of a different kind of reduction, in which a subject is shrunk
down to a highly particular context – the country, neighborhood, or event of special interest.
From our perspective, these contrasting notions of reduction/expansion are best
understood as arguments over comparability. Scholars inclined toward the tools provided by
natural language are often keen to explore a wide variety of different aspects in a particular
setting. They wish to explore multiple dimensions of one thing. (“Dimension” is employed
as a synonym for “variable” in this context.) In Howard Becker’s (1996: 65) words, one is
“trying to find out something about every topic the research touches on, even tangentially.”
Thus, Lizabeth Cohen’s (1991) history of the New Deal follows the narrative of this
extraordinary epoch in one city (Chicago) in extraordinary detail. Clifford Geertz (1980)
focuses on the “theatre state” in Bali, but his analysis touches on virtually all aspects of
Balinese culture, economy, and society. Qualitative analysis is thus often focused inward, like
a vast funnel. Many comparisons are made, but they are all understood as features of the
same general topic, existing in one time and place. Natural language is well suited to this purpose
for it is rich, textured, context-specific, and multivalent. It elucidates a wealth of details
about a person, event, or situation. This is why we find a natural affinity between qualitative
tools and ethnographic, historical, and – more broadly – interpretivist styles of research. By
contrast, scholars inclined toward a numerical understanding of the world are drawn toward
comparisons that are broad and thin. They intend to explore one particular dimension of
many things.
The interesting aspect of this familiar contrast is that both qualitative and
quantitative scholars perceive their work as conforming to the natural bend of the universe.
Qualitative scholars usually assume a case-centered approach. Different aspects of the same
cases can be compared; they go together. Quantitative scholars are drawn toward a
dimensional approach to comparability. A single aspect (dimension) of an entity is assumed to
be comparable across multiple cases (Ragin 1997). While for a qualitative scholar it would
seem natural to explore everything about A, for a quantitative scholar it would seem more
natural to explore one thing about A, B, and C. Underlying scholars’ choice of method are
certain assumptions about cross-case comparability. The tools we choose – words or
numbers – are, in part, the expression of our relative confidence in the ability to compare
across entities in a given research context. 9
The importance of comparison is illustrated in one of the most common defenses of
qualitative work in the interpretive mode. It is often said that qualitative analysis focuses on
human meanings while quantitative analysis focuses on the behavioral components of
human reality – i.e., actions, institutions, or events. From the interpretivist perspective, the
business of social science is the business of construing meaning, not analyzing behavior (Shweder
1996; Znaniecki 1934: 37-40). Yet, one might reasonably inquire, why not study human
meanings quantitatively – i.e., with scalar measures – in addition to studying them
qualitatively?
One rationale is that human meaning is constructed through language (which
establishes the categories by which we understand the world); therefore, it makes sense to
study these linguistic categories through other linguistic categories, rather than the (alien)
categories of numbers. Herbert Blumer (1969: 132) concludes that
the crucial limit to the successful application of variable analysis to human
group life is set by the process of interpretation or definition that goes on in
human groups. This process, which I believe to be the core of human action,
gives a character to human group life that seems to be at variance with the
logical premises of variable analysis.
This conventional defense of qualitative research, however, misses the point we are making.
The problem is not that the analysis of linguistic phenomena must be carried out with
linguistic tools; after all, the contemporary discipline of linguistics is heavily quantitative.
Indeed, quantitative observations are necessarily rooted in language because every scale must
be expressible in a linguistic category; a variable must have a name. What is it, then, that
seems so problematic about collecting quantitative observations about semantic realities?
Why is there no quantitative hermeneutics?
The problem, we suggest, is the core problem of the qual/quant dispute: the
problem of representing human meaning across diverse contexts in an explicit and precise
comparative fashion – i.e., a variable. It is the problem of comparability, not the problem of
language per se, that is at issue. And what makes it so problematic is a basic feature of
human life. The ways in which we make sense of our lived experience are extraordinarily
diverse – through time, across cultures, and across individuals. It is difficult to reduce this
complexity to standard categories, as quantitative knowledge requires. Consider the question
of human happiness (variously understood as welfare or quality of life), which has attracted
increasing attention on the part of psychologists, sociologists, and economists. The following
scale was developed by the United States Bureau of the Census to measure the quality of life:
a) terrible, b) unhappy, c) mostly dissatisfied, d) mixed, e) mostly satisfied, f) mostly pleased,
g) delighted (Caws 1989: 26). The question we quite naturally ask ourselves is whether
human happiness is accurately captured in these categories. This is to say, does a person who
answers (b) unhappy have a lower quality of life than a person who responds with (d) mixed?
There are potential problems of conceptual validity, and the issue is not simply a lack of
sophistication on the part of researchers. More fundamentally, the problem is that human
meanings – such as happiness – are resistant to the terms of uniform comparisons across
subjects. One feels much more comfortable with imprecise comparisons expressed in the
looseness, and contextual specificity, of natural language. We might say, for example, that
“Smith is happy,” implying that across some undefined population Smith’s level of
happiness is, let us say, somewhere above the mean. However, we are probably reluctant to
assign a precise score to Smith’s happiness because such a score would presume a precise
comparison with all others in the population.
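The point can be made concrete with a minimal sketch. It assumes the seven response categories quoted above are coded as the integers 1 through 7; the respondents and their answers are hypothetical, invented only for illustration, and the equal spacing of the scores is an assumption of the sketch rather than a property of the scale. What the sketch shows is that, once responses are coded, every pair of respondents becomes comparable by construction, whether or not the comparison is meaningful.

```python
# The seven response categories quoted above (Caws 1989: 26), in order.
SCALE = [
    "terrible", "unhappy", "mostly dissatisfied", "mixed",
    "mostly satisfied", "mostly pleased", "delighted",
]


def score(response: str) -> int:
    """Map a response to its rank on the scale (1 = terrible, 7 = delighted).

    Equal spacing between adjacent categories is an assumption of this
    sketch; the scale itself does not guarantee it.
    """
    return SCALE.index(response) + 1


def compare(a: str, b: str) -> str:
    """State the comparison that the numeric coding forces on two responses."""
    sa, sb = score(a), score(b)
    if sa == sb:
        return "equal"
    return "lower" if sa < sb else "higher"


# Hypothetical respondents, invented for illustration only.
responses = {"Smith": "mostly satisfied", "Jones": "unhappy", "Lee": "mixed"}

# Coding the answers yields a total ordering of the whole population.
scores = {name: score(answer) for name, answer in responses.items()}
ranking = sorted(scores, key=scores.get)
```

Nothing in the code asks whether "unhappy" means the same thing to Jones that it would mean to Lee. The ordering follows from the coding alone, which is the presumption of precise comparison that the natural-language statement "Smith is happy" avoids.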
We do not wish to imply that all human meanings are un-quantifiable. Nor do we
wish to imply that social scientists should abstain from quantifying difficult and ambiguous
meanings such as human happiness. There may be much to learn from the quantification of
abstract concepts that summarize a great deal of information about human experience. Our
point is simply that the profitability of quantifying varies with the topic, and the core issue is
the comparability of the phenomena across potential observations. Certain topics are more
recalcitrant, and these tend to be tied up with the generation of meaning (values, ideas,
intentions), rather than the observable behavior of individuals and institutions. It is
noteworthy, for example, that some human intentions seem to be more quantifiable than
others, and these tend to be those where we can assume a high level of comparability across
individuals or across cultures. Thus, scholars routinely measure the concept of “economic
voting” because the notion of voters pursuing economic interests seems comparable, and
hence valid, across individuals and populations. By contrast, religious
influences on voting behavior are much more difficult to quantify because they are more
difficult to compare.
According to the standard view, approaches in the social sciences can be understood
as either qualitative or quantitative (or some mix thereof). Scholars differ in their opinions
about the utility of this distinction. Some dismiss it as a red herring; others feel that there is
good justification for the division. We feel that there is no plausible way of discarding the
distinction but that it is greatly misunderstood. The key to this misunderstanding, we have
argued, is to be found in assumptions about comparability.
There is indeed a difference in basic-level assumptions between statements that are
quantitative (i.e., understood through a numerical scale, a metric, or a variable) and those
which are qualitative (expressed in natural language), even if the focus is ostensibly on a
single observation. Quantitative descriptive statements presuppose a class of comparable
cases that can be compared in an explicit and precise manner. To measure a phenomenon, X,
is to impose a very specific metric on it, one that is explicitly comparative (since other
phenomena in this same category are assumed to be score-able). Qualitative descriptive
statements do not make any such presuppositions. There may or may not be an identifiable
class of comparable cases that can be measured along some set of dimensions. Often, the
assumption is quite the reverse – particularizing rather than generalizing. Thus, we argued
that to quantify something is to compare in an explicit and precise manner. To qualify is to
leave such comparisons open; one may or may not engage explicit comparisons with
adjacent cases, and any comparisons made are unlikely to be very precise. While this might
seem to indicate a distinct advantage for quantitative work, we also showed that there are
costs to assuming a quantitative idiom. Not only must the cases be (actually) comparable, but
there is usually some loss of information since words are usually multivalent and metrics are
usually unidimensional (or at most combine several dimensions). It is not clear that we
always gain in analytic leverage by moving from words to numbers. We do, however, make
different sorts of comparisons.
The choice between math and natural language as tools of social science is, therefore,
a highly consequential choice. Methodological tools help us to reconstruct the empirical
world; they are not theory-neutral. In this respect, the division between math and language is
akin to the influence that early anthropologists and linguists assigned to language. Different
languages divide up the world into different packages; they encourage us to visualize things
in different ways. So, arguably, do the different “languages” of mathematics and ordinary
speech. Quantitative tools help us to compare, and hence to generalize; qualitative tools
encourage us to differentiate.
It is quite another thing, however, to disentangle the causal priority of methodologies
and ontologies. Do cultural anthropologists use qualitative tools because they envision a
lumpy universe, or do they see a lumpy universe because they perceive it with qualitative
tools? About all one can say with any degree of confidence is that there is a strong synergy
between methods and ontologies. We presume that they are, at the very least, strongly
reinforcing. This may help to account for the virulence and endurance of this central
cleavage in the social sciences today.
In pointing to the problem of comparability as the key feature of this debate, we
hope to contribute to a demystification of this longstanding distinction, perhaps even to a
reconciliation of the two camps. It is not merely a matter of numbers versus words, or a
debate about what can or cannot be quantified. More fundamentally, the venerable debate
represents disagreements over how precise, explicit, and extensive social-science
comparisons ought to be. Those who resist numerical analysis, like Alasdair
MacIntyre (1971/1978), are dubious about the validity of comparisons. They see no need to
enhance the precision or explicitness of comparisons because they do not seek to compare
in the first place, or they seek a more restricted ambit of comparative reference-points.
Those who embrace quantification are those who, like Gabriel Almond and Sidney Verba
(1963), the authors of the five-nation Civic Culture study that MacIntyre attacked, are more
comfortable with such comparisons. This debate has been engaged at many levels – across
individuals, across levels of government, across cultures, across time-periods – and over
many years.
Abbott, Andrew. 1992. “From Causes to Events: Notes on Narrative Positivism.”
Sociological Methods and Research 20:4 (May) 428-55.
Abell, Peter. 2004. “Narrative Explanation: An Alternative to Variable-Centered
Explanation?” Annual Review of Sociology 30, 287-310.
Adcock, Robert and David Collier. 2001. “Measurement Validity: A Shared Standard for
Qualitative and Quantitative Research.” American Political Science Review 95, 529-46.
Almond, Gabriel A. and Sidney Verba. 1963. The Civic Culture: Political Attitudes and
Democracy in Five Nations. Princeton: Princeton University Press.
Barnes, Harry Elmer (ed). 1925. The History and Prospects of the Social Sciences. New York:
Alfred A. Knopf.
Barth, Fredrik. 1999. “Comparative Methodologies in the Analysis of Anthropological
Data.” In John R. Bowen and Roger Petersen (eds), Critical Comparisons in Politics and
Culture (Cambridge: Cambridge University Press), 78-89.
Beck, Nathaniel. 2004. “Is Causal-Process Observation an Oxymoron?: A Comment on
Brady and Collier, Rethinking Social Inquiry.” Ms.
Becker, Howard S. 1934. “Culture Case Study and Ideal-typical Method.” Social Forces 12:3,
Becker, Howard S. 1996. “The Epistemology of Qualitative Research.” In Richard Jessor,
Anne Colby, and Richard A. Shweder (eds), Ethnography and Human Development: Context
and Meaning in Social Inquiry (Chicago: University of Chicago Press), 53-71.
Benoit, Kenneth. 2005. “How Qualitative Research Really Counts.” Qualitative Methods:
Newsletter of the American Political Science Association Organized Section on Qualitative Methods 3:1
(Spring) 9-12.
Bernard, L.L. 1928. “The Development of Method in Sociology.” The Monist 38 (April)
Black, Max. 1962. Models and Metaphors: Studies in Language and Philosophy. Ithaca: Cornell
University Press.
Blalock, Hubert M., Jr. 1982. Conceptualization and Measurement in the Social Sciences. Beverly
Hills: Sage.
Blalock, Hubert M., Jr. 1989. “The Real and Unrealized Contributions of Quantitative
Sociology.” American Sociological Review 54:3 (June) 447-60.
Blumer, Herbert. 1969. Symbolic Interactionism: Perspective and Method. Englewood Cliffs, NJ:
Brady, Henry E. and David Collier. 2004. Rethinking Social Inquiry: Diverse Tools, Shared
Standards. Lanham, MD: Rowman and Littlefield.
Brodbeck, May (ed). 1968. Readings in the Philosophy of the Social Sciences. New York:
Bryman, Alan. 1984. “The Debate about Quantitative and Qualitative Research: A
Question of Method or Epistemology?” British Journal of Sociology 35:1 (March) 75-92.
Büthe, Tim. 2002. “Taking Temporality Seriously: Modeling History and the Use of
Narratives as Evidence.” American Political Science Review 96:3 (September) 481-93.
Caws, Peter. 1989. “The Law of Quality and Quantity, or What Numbers Can and Can’t
Describe.” In Barry Glassner and Jonathan D. Moreno (eds), The Qualitative-Quantitative
Distinction in the Social Sciences (Boston Studies in the Philosophy of Science, 112) 13-28.
Cicourel, Aaron V. 1964. Method and Measurement in Sociology. Glencoe, IL: Free Press.
Cohen, Lizabeth. 1991. Making a New Deal: Industrial Workers in Chicago, 1919-1939.
Cambridge: Cambridge University Press.
Collier, David and James E. Mahon, Jr. 1993. “Conceptual ‘Stretching’ Revisited: Adapting
Categories in Comparative Analysis.” American Political Science Review 87:4 (December)
Creswell, John W. 1998. Qualitative Inquiry and Research Design. Thousand Oaks: Sage.
DeFelice, E. Gene. 1980. “Comparison Misconceived: Common Nonsense in Comparative
Politics.” Comparative Politics 13:1 (October) 119-26.
Denzin, Norman K. and Yvonna S. Lincoln (eds). 1994. Handbook of Qualitative Research.
Thousand Oaks: Sage.
Elman, Colin. 2005. “Explanatory Typologies in Qualitative Studies of International
Politics.” International Organization 59:2 (Spring) 293-326.
Fielding, N.G. and R.M. Lee. 1998. Computer Analysis and Qualitative Research. Thousand
Oaks: Sage.
Friedman, Milton. 1953. “The Methodology of Positive Economics.” In Essays in Positive
Economics (Chicago: University of Chicago Press), 3-43.
Gahan, C. and M. Hannibal. 1999. Doing Qualitative Research Using QSR NUD*IST.
Thousand Oaks: Sage.
Geertz, Clifford. 1980. Negara: The Theatre State in Bali. Princeton: Princeton University
George, Alexander L. and Andrew Bennett. 2005. Case Studies and Theory Development.
Cambridge: MIT Press.
Goldthorpe, John H. 2000. On Sociology: Numbers, Narratives, and the Integration of Research and
Theory. Oxford: Oxford University Press.
Gosnell, Harold F. 1933. “Statisticians and Political Scientists.” American Political Science
Review 27:3 (June) 392-403.
Griffin, Larry J. 1992. “Temporality, Events, and Explanation in Historical Sociology: An
Introduction.” Sociological Methods and Research 20:4 (May) 403-27.
Hammersley, Martyn. 1989. The Dilemma of Qualitative Method: Herbert Blumer and the Chicago
School. London: Routledge & Kegan Paul.
Hammersley, Martyn. 1992. “Deconstructing the Qualitative-Quantitative Divide.” In Julie
Brannen (ed), Mixing Methods: Qualitative and Quantitative Research (Aldershot, England:
Avebury), 39-55.
Jocher, Katharine. 1928. “The Case Study Method in Social Research.” Social Forces 7, 5125.
Kaplan, Abraham. 1964. The Conduct of Inquiry: Methodology for Behavioral Science. San
Francisco: Chandler Publishing.
Kaufmann, Daniel, Aart Kraay and Pablo Zoido-Lobaton. 1999. “Aggregating Governance
Indicators.” Manuscript, World Bank.
King, Gary, Robert O. Keohane, and Sidney Verba. 1994. Designing Social Inquiry: Scientific
Inference in Qualitative Research. Princeton: Princeton University Press.
Krippendorff, Klaus. 2003. Content Analysis: An Introduction to its Methodology. Newbury Park,
CA: Sage.
Lazarsfeld, Paul F. and Morris Rosenberg. 1955. The Language of Social Research. Glencoe, IL:
Free Press.
Lieberman, Evan S. 2005. “Nested Analysis as a Mixed-Method Strategy for Comparative
Research.” American Political Science Review (August) 435-52.
Lieberson, Stanley. 1985. Making it Count: The Improvement of Social Research and Theory.
Berkeley: University of California Press.
MacIntyre, Alasdair. 1971/1978. “Is a Science of Comparative Politics Possible?” In
Against the Self-Images of the Age: Essays on Ideology and Philosophy (London: Duckworth), 260-79.
Mahoney, James and Dietrich Rueschemeyer (eds). 2003. Comparative Historical Analysis in the
Social Sciences. Cambridge: Cambridge University Press.
Marshall, Alfred. 1897. “The Old Generation of Economists and the New.” Quarterly
Journal of Economics 11:2 (January) 115-35.
McLaughlin, Eithne. 1991. “Oppositional Poverty: The Quantitative/Qualitative Divide
and Other Dichotomies.” The Sociological Review 39 (May): 292-308.
Rabinow, Paul and William M. Sullivan (eds). 1979. Interpretive Social Science: A Reader.
Berkeley: University of California Press.
Ragin, Charles C. 1997. “Turning the Tables: How Case-Oriented Research Challenges
Variable-Oriented Research.” Comparative Social Research 16: 27-42.
Robinson, Richard. 1954. Definition. Oxford: Clarendon Press.
Sartori, Giovanni. 1970. “Concept Misformation in Comparative Politics.” American
Political Science Review 64:4 (December) 1033-46.
Sartori, Giovanni. 1984. “Guidelines for Concept Analysis.” In Social Science Concepts: A
Systematic Analysis (Beverly Hills: Sage) 15-48.
Schwandt, Thomas A. 1997. Qualitative Inquiry: A Dictionary of Terms. Thousand Oaks, CA:
Shweder, Richard A. 1996. “Quanta and Qualia: What is the ‘Object’ of Ethnographic
Method?” In Richard Jessor, Anne Colby, and Richard A. Shweder (eds), Ethnography and
Human Development: Context and Meaning in Social Inquiry (Chicago: University of Chicago
Press), 175-82.
Shweder, Richard A. and Robert A. LeVine (eds). 1984. Culture Theory: Essays on Mind, Self,
and Emotion. Cambridge: Cambridge University Press.
Smith, Charles W. 1989. “The Qualitative Significance of Quantitative Representation.” In
Barry Glassner and Jonathan D. Moreno (eds), The Qualitative-Quantitative Distinction in the
Social Sciences (Boston Studies in the Philosophy of Science, 112) 29-42.
Snow, C.P. 1959/1993. The Two Cultures. Cambridge: Cambridge University Press.
Stouffer, Samuel A. 1931. “Experimental Comparison of a Statistical and a Case History
Technique of Attitude Research.” Publications of the American Sociological Society 25, 154-56.
Strauss, Anselm and Juliet Corbin. 1998. Basics of Qualitative Research: Techniques and Procedures
for Developing Grounded Theory. Thousand Oaks: Sage.
Taylor, Charles. 1962. The Explanation of Behavior. New York: Routledge & Kegan Paul.
Teggart, Frederick J. 1939/1967. Rome and China: A Study of Correlations in Historical Events.
Berkeley: University of California Press.
Urban, Greg. 1999. “The Role of Comparison in the Light of the Theory of Culture.” In
John R. Bowen and Roger Petersen (eds), Critical Comparisons in Politics and Culture
(Cambridge: Cambridge University Press), 90-109.
Van Deth, Jan W. (ed). 1998. Comparative Politics: The Problem of Equivalence. New York:
Waller, Willard. 1934. “Insight and Scientific Method.” American Journal of Sociology 40:3
(November) 285-97.
White, Leonard D. (ed). 1930. The New Social Science. Chicago: University of Chicago Press.
Wilson, Edward O. 1998. Consilience: The Unity of Knowledge. New York: Alfred A. Knopf.
Winch, Peter. 1958. The Idea of a Social Science, and its Relation to Philosophy. London:
Routledge & Kegan Paul.
Zelditch, M., Jr. 1971. “Intelligible Comparisons.” In I. Vallier (ed), Comparative Methods in
Sociology: Essays on Trends and Applications (Berkeley: University of California Press) 267-307.
Znaniecki, Florian. 1934. The Method of Sociology. New York: Rinehart.
Quoted in Kaplan (1964: 172).
The origin of this quote is in dispute. It appears to have been on a sign that hung
outside Einstein’s office at the Institute for Advanced Study. Though it may not have been
written by him, it is nonetheless generally attributed to him, and must have received his
See also Bryman (1984: 76).
Beck (2004: 1) likens this to “the view of the Inquisition that we are all God’s
We speak of a general tendency in the social sciences. As always, there are
exceptions. More subtle treatments of these issues can be found in Brady and Collier (2004),
Elman (2005), George and Bennett (2005), Lieberman (2005), Mahoney and Rueschemeyer
(2003), and elsewhere.
For additional attempts at definition see Creswell (1998: 14-6), Denzin and Lincoln
(1994: 2), Strauss and Corbin (1998: 10-11).
Note that there are a large number of neighboring terms – commensurability,
consonance, equivalence, homogeneity, homology, similarity – all of which we shall treat as
synonyms. For work on these topics see Barth (1999), DeFelice (1980), Urban (1999), Van
Deth (1998), Zelditch (1971).
Descriptive comparability is closely related to measurement validity (Adcock and
Collier 2001). However, we do not wish to presuppose that in order to achieve descriptive
validity a concept must be measurable. Note that problems of conceptual stretching may be
ameliorated by the use of multi-dimensional typologies (Elman 2005), in which circumstance
a single concept is (usually) sub-divided into several concepts, each pertaining to a different
empirical realm.
This is not to deny that many scholars use both words and numbers. The point is
that within a given context the likelihood of choosing one or the other strategy is influenced
by assumptions about case-comparability.
This view is ascribed to the work of Edward Sapir (1884-1936) and Benjamin
Whorf (1897-1941), and is known generally as the Sapir-Whorf hypothesis (Black 1962).