ORGANIZATIONAL BEHAVIOR AND HUMAN DECISION PROCESSES
Vol. 70, No. 2, May, pp. 87–94, 1997
ARTICLE NO. OB972696
General Knowledge Overconfidence: Cross-National Variations,
Response Style, and “Reality”
J. Frank Yates
The University of Michigan
Ju-Whei Lee
Chung Yuan University, Taiwan
and
Julie G. Bush
The University of Michigan
Overconfidence in general knowledge is typically stronger among Asian than among Western subject
groups. The research described here examined the possibility that such differences might be a manifestation
of previously reported extreme response styles on the part of Asian respondents. This hypothesis was
evaluated by comparing overconfidence implicit in directly reported judgments and judgments inferred from
decisions Chinese and American subjects made about wagers in which they could earn actual, material goods.
Contrary to the response style hypothesis, indications of extreme Chinese overconfidence were unaffected by
whether judgments were direct or inferred from decisions. However, American subjects’ inferred judgments
were even more overconfident than their direct judgments. The bias in all subjects’ inferred judgments
indicates that, in disagreement with some interpretations of recent developments in the literature,
overconfidence is indeed a “real,” consequential phenomenon, not a data-analytic artifact. An additional,
serendipitous finding was that the inferred judgments of both Chinese and American subjects were far less
variable than their direct judgments. © 1997 Academic Press
This research was supported by U.S. National Science Foundation Grant SES92-10027 to the University of
Michigan and Grant NSC85-2413-H 033-004 from the R.O.C. National Science Council to Chung Yuan University.
It is our great pleasure to acknowledge the translation assistance of Dawei Liu and the data analysis
contributions of Winston Sieck. We also appreciate the helpful comments on a previous version of this
article provided by two anonymous reviewers.
Address correspondence and reprint requests to J. Frank Yates, Judgment and Decision Laboratory,
Department of Psychology, University of Michigan, 525 East University Avenue, Ann Arbor, MI 48109-1109.
E-mail: [email protected]; or Ju-Whei Lee, Department of Psychology, Chung Yuan University, Chungli,
Taiwan, ROC. E-mail: [email protected].
0749-5978/97 $25.00
Copyright © 1997 by Academic Press
All rights of reproduction in any form reserved.
Suppose a person is presented with two-alternative general knowledge questions like the following: “Which
contains more calories per unit of weight: (a) bread or (b) rice?” The person picks one of the alternatives
and then states a 50–100% probability judgment that the chosen alternative really is correct. Under broad
(though not universal) conditions, such judgments tend to be overconfident in the sense that, on average,
they are higher than the proportions of questions respondents in fact answer correctly (cf. Yates, 1990,
Chapter 4). Beginning with the work of George Wright and Lawrence Phillips (e.g., Wright, Phillips,
Whalley, Choo, Ng, Tan, & Wisudha, 1978), there have been repeated demonstrations of cross-national
variations in general knowledge overconfidence (e.g., Lee, Yates, Shinotsuka, Singh, Onglatco, Yen, Gupta,
& Bhatnagar, 1995; Whitcomb, Önkal, Curley, & Benson, 1995; Yates, Zhu, Ronis, Wang, Shinotsuka, & Toda,
1989; Zhang, 1992). The typical finding, which surprises most people (Yates, Lee, & Shinotsuka, 1996), is
that Asians exhibit even more overconfidence than Westerners. Japanese and Singaporeans are significant
exceptions. It is noteworthy, however, that Japan and Singapore have many distinctive Western
characteristics. Consider, for instance, Japan’s century-old self-conscious effort to learn from the
West’s technological and economic successes (Reischauer & Jansen, 1995). Or take Singapore’s history as a
British colony and international trading center, or its people’s nearly universal knowledge of English,
the language of instruction in its schools (LePoer, 1991).
General knowledge overconfidence, a special form of probability judgment miscalibration, is the subject of
intense scrutiny in current judgment research. Numerous contributors to and mechanisms for the basic overconfidence phenomenon have been proposed and are
under debate (see, for instance, the review and commentaries by Griffin & Varey, 1996; McClelland & Bolger,
1994; and Wallsten, 1996). A variety of explanations
for the origins of cross-national variations in overconfidence have been proposed as well (cf. Lee et al., 1995;
Yates & Lee, 1996; Yates et al., 1996), claims that sometimes are compatible with previously proposed general
accounts but sometimes are not. The research described
here primarily concerns how we might indeed explain
cross-national overconfidence differences. In particular,
it considers the sobering possibility that such variations
do not reflect differences in actual beliefs at all. Instead,
they may be mere instances of distinctions in response
styles, tendencies to express equivalent beliefs in consistently different ways.
Is there any reason to suspect that cross-national
variations in general knowledge overconfidence might,
in fact, be nothing more than response style manifestations? Yes, there is. For some time, cross-cultural researchers have been concerned that people from different cultures might tend to use response scales in
characteristically different ways (see, for example, Jaccard & Wan, 1986). And empirical research has shown
that such concern is sometimes justified (e.g., Chun,
Campbell, & Yoo, 1974; Hui & Triandis, 1989; Zax &
Takahashi, 1967). A study by Stening and Everett
(1984) is especially pertinent to the present issues.
These investigators found that, in comparison to Americans, Japanese, Britons, and Singaporeans, Chinese
respondents in Hong Kong and respondents in various
southeast Asian locations (Malaysia, the Philippines,
Indonesia, and Thailand) were far more likely to use the
highest and lowest response categories of the semantic
differential scales made available to them. Could the
extreme general knowledge probability judgments reported by these latter groups be simply another demonstration of such response styles? The present study
sought to evaluate this proposal. (Bear in mind that
the origin of the response styles themselves would remain the mystery it has always been.)
Fischhoff, Slovic, and Lichtenstein (1977) provided
an excellent model for how one might approach an issue
like the present one. In two experiments, these investigators set out to determine whether their (presumably)
American subjects were willing to make consequential
decisions that were consistent with the extreme confidence they expressed in their written judgments. Specifically, subjects were asked to consider all the instances in which they said that their odds of being
correct for a given general knowledge question were
“50:1 or greater.” The subject was then requested to
consider a game involving such questions, what we will
here call the “Betting Game.” In a given round of the
Betting Game, the actual answer to one of the subject’s
“50:1+” questions is determined. If the subject’s chosen
answer to the question is wrong, the subject pays the
experimenter $1. Another operation is performed during that round as well. An opaque bag is filled with 100
white poker chips and 2 red ones (i.e., with odds 50:1
favoring white chips). The experimenter (or the subject,
if he or she prefers) then draws one chip at random
from the bag. If the chip is red, the experimenter pays
the subject $1. The experimenter in the Fischhoff et al.
studies pointed out to the subject that, if his or her
judgments were well-calibrated, then playing the Betting Game was favorable to the subject. That is because
the subject felt that, for at least some questions, the
odds were greater than 50:1 that his or her chosen
answers were actually correct. Thus, the subject should
expect to win more money than he or she loses. The
subject was then asked whether he or she would, in fact,
be willing to play the Betting Game. Most subjects were.
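The favorability argument for the Betting Game can be checked with a short calculation. The following is a minimal sketch, assuming (as the description above implies) that the answer check and the chip draw are independent events within a round; the function name and its parameters are illustrative and do not come from Fischhoff et al.

```python
from fractions import Fraction

def betting_game_expected_gain(p_correct, white_chips=100, red_chips=2, stake=1):
    """Expected gain, in dollars, from one round of the Betting Game.

    The subject pays `stake` when the chosen answer is wrong and
    receives `stake` when a red chip is drawn from the bag.
    """
    p_wrong = 1 - Fraction(p_correct)
    p_red = Fraction(red_chips, white_chips + red_chips)  # 2/102 = 1/51
    return stake * (p_red - p_wrong)

# Odds of exactly 50:1 in favor of being correct: the break-even point.
break_even = Fraction(50, 51)
gain_at_break_even = betting_game_expected_gain(break_even)
# Absolute certainty: the subject can only win.
gain_if_certain = betting_game_expected_gain(1)
# A subject who is actually right 90% of the time loses on average.
gain_if_overconfident = betting_game_expected_gain(Fraction(9, 10))
```

The game is favorable only when the true probability of a correct answer exceeds 50/51. A subject who reports odds of 50:1 or greater but is correct less often than that should expect to lose money.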
Rightly so, the Fischhoff et al. (1977) results are taken
as evidence that, in the United States, at least, general
knowledge overconfidence is not an instance of an extreme response bias; people are willing to act in a correspondingly extreme manner as well. In the present
study, we adopted the same basic strategy as Fischhoff
et al. in an attempt to determine whether, nevertheless,
extreme Asian overconfidence might be a reflection of
response style. There are two noteworthy features of
the specific approach we employed. First, we made a
comparison between American and Chinese respondents, the latter in Taiwan. The use of Chinese respondents is particularly appropriate because extreme overconfidence has been demonstrated with this Asian
culture more than any other. Second, we “sharpened”
the procedure of Fischhoff et al. in order to make the
more precise comparisons required by the present issue
and to make especially salient the material consequences of subjects’ decisions.
METHOD
Subjects
Eighty-five introductory psychology students at the
University of Michigan participated in the study in exchange for course credit. Their counterparts in Taiwan
were 109 students at Chung Yuan University who were
also taking introductory psychology classes.
Materials
Subjects considered 20 general knowledge questions.
These questions were a subset of those used by Yates
et al. (1989). An illustrative item asked whether the
capital of New Zealand is (a) Auckland or (b) Wellington. All materials, including the additional ones described below, were translated and backtranslated between English and Chinese using the method described
by Brislin (1970).
Procedure
Subjects participated in the experiment in small, noninteracting groups. At the beginning of the session, the
subject was given a covered question booklet and two
answer sheets. The booklet included the 20 general
knowledge questions. Answer Sheet #1 contained 20
sections, each identified with one of the questions, as
illustrated here:
Question 7
a
b
The following example shows how judgments were requested on each line contained in Answer Sheet #2:
Question 7
Probability That My Chosen Answer is Correct (50–100%): ______ %
The subject was instructed in how to perform the
basic general knowledge task. Specifically, the subject
was asked to indicate the preferred answer for each
question by circling the corresponding letter on Answer
Sheet #1 and to enter in the appropriate blank on Answer Sheet #2 his or her probability judgment that
that alternative really was correct. The subject was
told explicitly how to interpret the probability scale: “A
probability of 50% would mean that you think that your
chosen answer is just as likely to be correct as incorrect;
100% would mean that you are absolutely sure that
your chosen answer is right; intermediate probabilities
between 50 and 100% should reflect corresponding degrees of certainty that you have picked the right alternative.” The subject was also told that probabilities
below 50% should never be reported, and why, viz.,
that such a probability would be an indication that the
subject felt that the non-selected option was, in fact,
more likely to be correct than the selected one. After
subjects practiced the procedure and had their questions answered, they completed the basic general
knowledge task as described and were asked to put
away Answer Sheet #2, the one containing their probability judgments.
The experimenter then initiated a wager and pricing
procedure. The subject was informed that he or she
owned the opportunity to play a wager involving one
of the questions he or she answered in the first part of
the session. At the University of Michigan, the subject
was told:
The wager is of the following type:
• If you got the answer to that particular question
correct, you will receive a gift certificate to the Michigan
Union Bookstore valued $2.20.
• If you got the answer to that particular question
incorrect, you do not receive any gift certificate.
The instructions at Chung Yuan University were equivalent, involving a gift certificate for NT$80 (Taiwan
currency) at that university’s bookstore. We did not
establish the equivalence of US$2.20 and NT$80 on the
basis of official exchange rates. That is because such
rates might not be strictly pertinent to university students. Instead, the equivalence was set by comparing
the local costs of identical “index bundles” of commonly
purchased bookstore items, such as pencils, pads of
paper, and transparent tape.
The subject was then told how his or her “Wager
Question” would be selected, the previously answered
question that would be used to determine whether or
not the subject won a gift certificate. The subject was
shown 20 poker chips, numbered 1 through 20, along
with an opaque bag. The experimenter told the subject
that, at the end of the experiment, the subject would
be asked to put all the chips in the bag and then select
one of them at random. The subject’s Wager Question
would be the original question corresponding to that
number.¹
The subjects were also informed that all wagers
would be carried out in private. Specifically, all the
subjects would be asked to leave the experimental room
and return one at a time to play their wagers. Further,
every subject, whether he or she won or lost the wager,
would be given an envelope to take away from the experiment. If a given subject won his or her wager, the
envelope would contain the subject’s gift certificate; if
the subject lost, the envelope would be empty. This procedure was intended to eliminate any concerns subjects
might have had about embarrassment.
¹ This feature of the procedure is a potentially significant departure
from that used by Fischhoff et al. (1977). Several authors (e.g., Gigerenzer, Hoffrage, & Kleinbölting, 1991; Sniezek & Buckley, 1991) have
found that people tend to underestimate the number of previously
considered questions they answered correctly. That is, post hoc aggregate judgments tend to exhibit underconfidence, not the overconfidence manifested in concurrent, item-by-item assessments. The
Fischhoff et al. procedure actually focused on aggregate judgments
whereas the present method examines item-by-item beliefs, the actual judgments of interest. It is something of a puzzle why the Fischhoff et al. (1977) results indicated aggregate overconfidence rather
than underconfidence. It is possible that the Fischhoff et al. procedure
encouraged subjects to try to make bets consistent with their earlier
judgments, a demand characteristic we actively sought to avoid in
the present study, e.g., by denying subjects access to their previously
recorded probability judgments.
Next came the pricing operations, an application of
what is sometimes called the “Marschak bidding procedure” (Becker, DeGroot, & Marschak, 1964). The subject
was told that, “although you own the opportunity to
play this wager for one of the twenty questions, you
might be given the chance to sell that opportunity instead.” The subject was told that, when the time came
to play his or her wager, the experimenter would make
an offer of a price to buy the wager from the subject. The
subject learned that the experimenter’s offered price
would be determined randomly. In particular, the experimenter would select a chip at random from an opaque
bag containing 45 poker chips. Each chip had an
amount of money written on it, from $.00 to $2.20,
in increments of $.05. The amount on the randomly
selected chip would be the price the experimenter offered the subject for his or her wager. Now, before the
experimenter selected the buying price, the subject was
required to state his or her minimum selling price for
the wager. If the experimenter’s buying price happened
to be the same as or greater than the subject’s minimum
selling price, the subject was required to sell the wager
for a bookstore gift certificate valued at the experimenter’s buying price. Otherwise, no sale took place, and
the subject had to play the wager, receiving either a
$2.20 gift certificate or nothing, depending on whether
his or her answer to the given question was correct or
incorrect. (The same procedure, but involving amounts
between NT$0 and NT$80, in increments of NT$2, was
employed at Chung Yuan University.)
This random determination of buying prices implies
that it is in the subject’s interests to report truthfully
his or her actual minimum selling price, what the subject feels the wager is really worth to him or her. Put
another way, it is contrary to the subject’s interests to
hedge, to report a selling price that is either higher or
lower than his or her honest opinion about the wager’s
worth. The experimenter offered the subject graphical
and detailed illustrations and arguments for why this
is so. The experimenter also explained why the subject
should not report a minimum selling price less than
$1.10 (NT$40 in Taiwan).²
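The incentive argument can be illustrated numerically. The sketch below assumes a risk-neutral subject who values the wager at its expected payoff (the same expected-value assumption the analysis relies on later) and restricts stated prices to the same 5-cent grid as the chips. The function names are illustrative and not part of the original procedure.

```python
from fractions import Fraction

PRIZE_CENTS = 220
# The 45 chips: $.00, $.05, ..., $2.20, expressed in cents.
OFFERS = range(0, PRIZE_CENTS + 1, 5)

def expected_payoff(stated_price, p_correct):
    """Expected payoff in cents for a given stated minimum selling price.

    A sale occurs when the random offer is at or above the stated price;
    otherwise the wager is played, worth p_correct * PRIZE_CENTS on average.
    """
    wager_value = Fraction(p_correct) * PRIZE_CENTS
    total = Fraction(0)
    for offer in OFFERS:
        total += offer if offer >= stated_price else wager_value
    return total / len(OFFERS)

def best_prices(p_correct):
    """All grid prices that maximize the expected payoff."""
    payoffs = {price: expected_payoff(price, p_correct) for price in OFFERS}
    top = max(payoffs.values())
    return [price for price, value in payoffs.items() if value == top]
```

For a subject whose belief is 80%, the wager is worth 176 cents and the best grid price is the first one at or above that value, 180 cents. Stating less risks selling too cheaply, and stating more risks refusing a better offer. Because beliefs of at least 50% imply a value of at least 110 cents, prices under $1.10 can never be optimal under these assumptions.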
At that stage in the session, the question that was
to be used in the wager the subject actually owned had
not yet been determined. Thus, the subject was asked
to indicate in advance his or her minimum selling prices
for the wagers involving all 20 of the questions considered previously. The subject did this by reviewing each
of the questions and, on Answer Sheet #1, in the space
to the right of his or her previous indication of (a) or
(b) as the correct alternative, recording the minimum
selling price for the corresponding wager.
² In retrospect, it probably would have been better had the procedure repeatedly reminded the subject to report prices $1.10 (or
NT$40) or above, as was done in the probability judgment procedure.
RESULTS
Of the 85 American subjects who participated in the
study, the data of one were excluded from the analyses
reported below because several of his or her prices violated the instruction that they be at least $1.10. Seven
of the 109 Taiwanese subjects’ data were excluded for
the same reason.
Each subject provided two key dependent variables
for each question considered. The first was the directly
reported probability judgment that his or her chosen
answer was correct, f = P′(Correct). The other dependent variable was the subject’s reported minimum selling price for the wager whose payoff of a $2.20 gift
certificate (NT$80 in Taiwan) depended on the correctness of the subject’s chosen answer, which we can denote
by m. The key analyses reported below could have been
performed directly on f and m. However, they are most
easily interpreted when m is converted to a form similar
to f. We intentionally chose the size of the gift certificate
to be moderate. This allowed us to assume with reasonable confidence that the subject determined m at least
roughly in a manner consistent with the maximization
of subjective expected value; that is because the subject’s utility function for money in the pertinent region
should be fairly linear.³ This assumption implies that,
for a given question, it was approximately the case that,
for an American subject,
SEV(m) = SEV(Wager)
or
m = P*(Correct)($2.20) + [1 − P*(Correct)]($0),
where SEV denotes subjective expected value and
P*(Correct) represents the probability judgment underlying the subject’s evaluation of the wager. (Similar
expressions would hold for the Taiwanese subjects.)
Thus, the second dependent variable that we actually
analyzed was a decision-inferred probability judgment
f* = P*(Correct) = m/$2.20 (or m/NT$80 for subjects
in Taiwan).
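As a minimal sketch of this conversion, a stated minimum selling price can be mapped to a decision-inferred probability by dividing by the prize. The function name is illustrative, and the range check reflects only the fact that a price cannot sensibly exceed the prize.

```python
def inferred_probability(min_selling_price, prize):
    """Decision-inferred judgment f* = m / prize, assuming the price
    equals the subjective expected value of the wager."""
    if not 0 <= min_selling_price <= prize:
        raise ValueError("price must lie between 0 and the prize")
    return min_selling_price / prize

# A US$1.65 price on the US$2.20 wager and an NT$60 price on the
# NT$80 wager imply the same underlying judgment.
f_star_us = inferred_probability(1.65, 2.20)
f_star_tw = inferred_probability(60, 80)
```

Both example prices correspond to an inferred judgment of about .75, which is what makes the American and Taiwanese responses comparable on a common scale.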
³ To the degree that subjects violated subjective expected value
maximization, we should expect the violations to be in the direction
opposite to the elicitation method differences actually observed for
our American subjects. A prototypical finding (e.g., Fishburn & Kochenberger, 1979; Kahneman & Tversky, 1979) is that people tend
to be risk averse when considering positive prospects like those used
here, which would tend to suppress their selling prices for those prospects.
General knowledge overconfidence is normally measured by a certain bias statistic, the difference between
the subject’s mean probability judgment (f̄) and the
proportion of questions the subject actually answered
correctly (d̄),
Bias = f̄ − d̄,
where d is an indicator variable which takes on the
value 1 if a question is answered correctly and 0 otherwise. Here, for each subject, we had two values of Bias,
one based on the subject’s directly reported judgment
f, the other on his or her inferred judgment f*. Now, if
the response bias hypothesis as described previously is
true, then we should have observed a particular kind
of interaction. Specifically, biases based on directly reported judgments should have been substantially
higher for our Chinese subjects than for our American
subjects. But, whereas the Chinese subjects’ biases
based on inferred judgments should have decreased relative to those based on direct judgments, those for the
American subjects should have remained about the
same.
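The two bias values for a single subject can be computed as in the sketch below. The responses are hypothetical numbers invented purely for illustration (four questions rather than twenty) and are not data from the study.

```python
from statistics import mean

PRIZE = 2.20  # value of the gift certificate in US dollars

def bias(judgments, outcomes):
    """Bias = mean judgment minus proportion of correct answers."""
    return mean(judgments) - mean(outcomes)

# Hypothetical subject (illustrative values only).
outcomes = [1, 1, 0, 1]               # d: 1 = correct, 0 = incorrect
direct = [1.0, 0.9, 0.8, 0.7]         # f: reported probabilities
prices = [2.20, 1.76, 1.65, 1.54]     # m: minimum selling prices
inferred = [m / PRIZE for m in prices]  # f* = m / $2.20

direct_bias = bias(direct, outcomes)      # about 0.10
inferred_bias = bias(inferred, outcomes)  # about 0.06
```

Under the response style hypothesis, the gap between the two values should be large for Chinese subjects and near zero for American subjects, which is the interaction the analysis tests.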
Figure 1 shows the mean bias measures we actually
observed, by elicitation method (direct vs. inference)
and subject group (Chinese vs American). Note that
there was indeed a significant interaction, F(1, 184) =
12.86, p = .0004.⁴ Moreover, whereas, consistent with
all previous findings, Chinese subjects’ directly reported judgments were significantly more positively biased than American subjects’ direct judgments, t(184)
= 2.95, p = .004, the difference for inferred judgments
was minimal, t(184) = 0.47, ns. But the character of
the interaction was essentially the opposite of that predicted by the response style hypothesis. That is, it was
Chinese subjects’ overconfidence that was unaffected
by whether judgments were direct or inferred, t(101) =
1.36, ns, while American subjects’ overconfidence increased markedly when judgments were inferred from
their wager decisions, t(83) = 3.72, p < .0005. Thus,
there is no evidence that extreme Chinese general
knowledge overconfidence is a manifestation of an extreme Chinese response style.
FIG. 1. Mean bias (overconfidence) measures by judgment elicitation method and subject group.
⁴ The interaction might have been even stronger had it not been
necessary to eliminate more Chinese than American subjects because
of disallowable minimum prices.
We performed several other analyses also. The first
concerned subjects’ proportions of correct answer
choices. Lichtenstein and Fischhoff (1977) documented
what is now known as the “hard-easy effect.” This is
the phenomenon whereby general knowledge overconfidence depends on the difficulty of the items considered:
overconfidence tends to increase with item difficulty.
Given this empirical generalization, it is important to
control for item difficulty in studies like the present one.
It so happened that the items we used were essentially
equally difficult for our Chinese and American subjects,
with the mean proportions of correct answers being
.68 and .66 for these groups, respectively, F(1, 184) =
1.35, ns.
A question of some interest is the relationship between subjects’ direct judgments and those implicit in
their wager decisions. Correlations between f and f*
provide some insight into this relationship. The means
of these correlations were .68 and .75 for the Chinese
and American subjects, respectively, means that were
not significantly different from each other, t(181) =
1.31, ns, for a comparison of Fisher-transformed correlations. (Correlations for two Chinese subjects and one
American subject could not be calculated because there
was no variance in one of their types of judgments.) As
Fig. 1 implies, the average direct and inferred judgments were essentially the same for the Chinese subjects, t(101) = −1.36, ns, but the latter were much
higher than the former for the American subjects,
t(83) = 3.72, p < .0005.
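A per-subject correlation and its Fisher transformation can be sketched as follows. This is a generic illustration of the standard Pearson correlation and the transform z = atanh(r), not a reproduction of the authors' analysis code. The function returns None when either set of judgments has no variance, mirroring the subjects for whom no correlation could be calculated.

```python
import math

def pearson_r(x, y):
    """Pearson correlation, or None if either variable is constant."""
    if len(set(x)) < 2 or len(set(y)) < 2:
        return None
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher transformation of a correlation with |r| < 1."""
    return math.atanh(r)

# Toy ranks chosen so the correlation comes out at 0.8.
r_example = pearson_r([1, 2, 3, 4], [1, 3, 2, 4])
z_example = fisher_z(r_example)
# A subject who gives the same direct judgment every time.
r_constant = pearson_r([0.7, 0.7, 0.7], [0.5, 0.6, 0.7])
```

Group comparisons are then run on the z values rather than on the raw correlations, because z is approximately normally distributed while r is bounded and skewed.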
For both groups of subjects, we observed an unexpected and marked difference between direct and inferred judgments with respect to their variability. Specifically, the variances of the inferred judgments were
significantly lower than those of the direct judgments
(M = .0204 vs .0268, respectively, for the Chinese subjects, and M = .0202 vs .0290 for the American subjects),
F(1, 184) = 65.55, p < .00001, with no main effect for
subject group and no interaction between subject group
and assessment procedure. Essentially, the inference
procedure made subjects more conservative, discouraging them from reporting extreme judgments. This led
to an improvement in the accuracy dimension known
as “scatter” in the covariance decomposition of the mean
probability or Brier score (Yates, 1990, 1994), for both
subject groups. But this improvement was offset by
deterioration in the “slope” accuracy dimension. In
terms of overall accuracy, as indexed by the Brier score,
the net effect of using an inference rather than a direct-report assessment method was nil for the Chinese subjects and negative for the Americans, t(83) = 3.40, p =
.001, due most likely to the latter subjects’ increased
overconfidence.
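The terms mentioned here can be made concrete with a sketch of the covariance decomposition as it is commonly stated: mean probability score = Var(d) + MinVar(f) + Scat(f) + Bias² − 2·Slope·Var(d). The dictionary keys and the example data are illustrative assumptions, and the sketch requires at least one correct and one incorrect answer.

```python
from statistics import mean, pvariance

def covariance_decomposition(f, d):
    """Components of the mean probability (Brier) score for judgments f
    and outcome indicators d (1 = correct, 0 = incorrect)."""
    f1 = [fi for fi, di in zip(f, d) if di == 1]  # judgments when correct
    f0 = [fi for fi, di in zip(f, d) if di == 0]  # judgments when incorrect
    if not f1 or not f0:
        raise ValueError("need at least one correct and one incorrect answer")
    d_bar = mean(d)
    var_d = d_bar * (1 - d_bar)
    bias = mean(f) - d_bar
    slope = mean(f1) - mean(f0)
    # Scatter: outcome-weighted average of the within-outcome variances.
    scatter = (len(f1) * pvariance(f1) + len(f0) * pvariance(f0)) / len(f)
    min_var = slope ** 2 * var_d
    brier = mean((fi - di) ** 2 for fi, di in zip(f, d))
    reconstructed = var_d + min_var + scatter + bias ** 2 - 2 * slope * var_d
    return {"brier": brier, "bias": bias, "slope": slope, "scatter": scatter,
            "min_var": min_var, "var_d": var_d, "reconstructed": reconstructed}

# Hypothetical judgments and outcomes (illustrative values only).
parts = covariance_decomposition([1.0, 0.9, 0.8, 0.7], [1, 1, 0, 1])
```

Lower scatter reduces the score (better accuracy), whereas a smaller slope raises it. This is how an improvement on one dimension can be cancelled by deterioration on another.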
DISCUSSION
What do the present results indicate about extreme
Asian overconfidence in general knowledge, as represented by the case of Chinese in Taiwan? Is it a reflection of a mere extreme response style? The answer to
this question depends on the standard of comparison.
If response style were the explanation for extreme Chinese overconfidence, then we should expect it to diminish if not disappear when the respondent has an incentive for revealing his or her true opinions, e.g., to make
consequential decisions on the basis of those opinions.
From this absolute standard perspective, we must conclude that extreme Chinese overconfidence is not a response style artifact; Chinese subjects’ overconfidence
was unchanged when they had to use those opinions to
make actual decisions. But suppose the standard of
comparison is the overconfidence manifested in the consequential decisions of Americans? Then we might be
inclined toward the opposite conclusion; the Chinese
and American subjects made essentially the same decisions. So, which conclusion makes more sense? And
what are the theoretical and practical implications of
the findings, regardless of the conclusion that is accepted?
A key argument against the mere response style account is the form of the interaction we observed. Recall
that the most straightforward interpretation of that
account predicts that Chinese overconfidence should
diminish in inferred as compared to direct judgments,
whereas American overconfidence should remain constant. But the data revealed that American overconfidence increased while Chinese overconfidence stayed
at its original high level. This “metric” interpretation
takes seriously the comparability of the literal magnitudes of the subjects’ direct and inferred judgments.
However, suppose the inference procedure, in effect,
overestimated the actual judgments driving subjects’
wager decisions by, say, 7%. Then the proper comparisons should be between biases based on f and f** = f*
− .07, not f and f*. And the nature of the observed
interaction would be as prescribed. Is there any reason
to suspect such overestimation? As noted before, the
tendency for people to be risk averse in decisions involving gains should lead to underestimation, not overestimation. Moreover, the Marschak bidding procedure is
intended to discourage biases of any sort. Nevertheless,
one potential source of overestimation might be subjects’ overgeneralization from real-world negotiations
in which, as a ploy, they purposely inflate their selling
prices. At this time, it is impossible to say definitively
which pressures are more likely to have been present
or stronger, those favoring overestimation or underestimation. Intuitively, however, underestimation or a
“standoff” seems most plausible.
The unanticipated differences in the nature of the
direct and inferred judgments (i.e., in variability) could
be taken as another argument against the mere response bias proposition. Such differences imply that
the inference procedure is more than just an “elicitation
technique.” At a more general level, they suggest that
the processes by which people form the judgments they
articulate explicitly—to others or to themselves—are
different from those that produce the judgments that
drive their personal decisions. An important task for
future work is to determine the precise nature of the
differences. One plausible initial hypothesis is the following: Every judgment that deviates from a baseline
assessment must be predicated on information (i.e.,
cues) that justifies such a deviation. When a judgment
must support a decision that has material consequences
for the judge, the criteria for an extreme deviation
might be stricter than when the judgment will not assume such a role.
The present indication of systematic differences between direct and inferred judgments agrees with the
Bayesian speculation about the need for distinguishing
them, dating back at least as far as Ramsey (1931). It
does not imply that we should pay attention to one
variety of judgment and ignore the other, though. That
is because, in real life, people depend on both. Thus,
some of our decisions are predicated on our unexpressed
beliefs about what the truth is or will be. In other cases,
however, we must decide on the basis of other people’s
articulated opinions, e.g., a physician’s judgment of
whether a lump in a breast is something to worry about,
or a business consultant’s expectation that a bid for a
contract will be successful.
This implicates the practical importance of overconfidence, including its cross-national variations. As demonstrated concretely in this study, overconfidence in
both direct and inferred judgments can be costly in
material terms. It is not difficult to show that, in the
wager situation used here, the average American subject experienced an “overconfidence loss” of $2.20(Bias)
= $2.20(f̄* − d̄) ≈ $.13, and the typical Chinese subject
a bit more. That is, the subjects gained less than they
would have had they not been overconfident by 6%.
With only $2.20 at stake, the losses are trivial. But
with larger amounts, the significance of overconfidence
grows correspondingly.
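Taking the expression above at face value, the loss scales linearly with the amount at stake. The sketch below simply evaluates prize × Bias. The function name and the larger stake are hypothetical, chosen only to show the scaling.

```python
def overconfidence_loss(prize, bias):
    """Expected shortfall from overconfidence, computed as prize * Bias."""
    return prize * bias

# The case described in the text: a $2.20 prize and a bias of 6%.
loss_in_study = overconfidence_loss(2.20, 0.06)
# The same bias applied to a hypothetical $2,200 stake.
loss_large_stake = overconfidence_loss(2200, 0.06)
```

With the $2.20 prize the loss is about 13 cents. With the same 6% bias on a $2,200 stake it would be about $132.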
If another person had to make the same sorts of decisions on the basis of our typical subject’s assessments,
that judgment “consumer” would have suffered the
same losses our subjects incurred themselves. Suppose
that the present results apply beyond the context of
general knowledge to the kinds of events encountered
in practical contexts, e.g., in legal contexts. Then we
should expect similar losses when decision makers
must rely on the overconfident assessments of their
consultants (e.g., when they ask, “If we take this lawsuit
to trial, what are our chances of winning?”). And if,
regardless of their underlying “true” beliefs, multinational collaborators of one group demonstrate more
overconfidence than those of another, the decision
maker should be aware of this. Further, the decision
maker should do something about it, or be prepared to
live with the consequences.
A number of authors (e.g., Erev, Wallsten, & Budescu,
1994; Pfeifer, 1994) have recently suggested that random error, manifesting itself in regression toward the
mean, might have led some researchers to conclude
erroneously that overconfidence is stronger than it actually is—if it exists at all. Some readers have taken
such results to mean that overconfidence is nothing
more than a data-analytic artifact. But data and analyses like those reported here show that at least some
degree of such overconfidence is indeed quite “real”;
people are made demonstrably worse off by it.
REFERENCES
Becker, G. M., DeGroot, M. H., & Marschak, J. (1964). Measuring
utility by a single-response sequential method. Behavioral Science,
9, 226–232.
Brislin, R. W. (1970). Back-translation for cross-cultural research.
Journal of Cross-Cultural Psychology, 1, 185–216.
Chun, K.-T., Campbell, J. B., & Yoo, J. H. (1974). Extreme response
style in cross-cultural research: A reminder. Journal of Cross-Cultural Psychology, 5, 465–480.
Erev, I., Wallsten, T. S., & Budescu, D. V. (1994). Simultaneous over- and underconfidence: The role of error in judgment processes. Psychological Review, 101, 519–527.
Fischhoff, B., Slovic, P., & Lichtenstein, S. (1977). Knowing with
certainty: The appropriateness of extreme confidence. Journal of
Experimental Psychology: Human Perception and Performance,
3, 552–564.
Fishburn, P. C., & Kochenberger, G. A. (1979). Two-piece von Neumann-Morgenstern utility functions. Decision Sciences, 10, 503–518.
Gigerenzer, G., Hoffrage, U., & Kleinbölting, H. (1991). Probabilistic
mental models: A Brunswikian theory of confidence. Psychological
Review, 98, 506–528.
Griffin, D. W., & Varey, C. A. (1996). Towards a consensus on overconfidence. Organizational Behavior and Human Decision Processes,
65, 227–231.
Hui, C. H., & Triandis, H. C. (1989). Effects of culture and response
format on extreme response style. Journal of Cross-Cultural Psychology, 20, 296–309.
Jaccard, J., & Wan, C. K. (1986). Cross-cultural methods for the
study of behavioral decision making. Journal of Cross-Cultural
Psychology, 17, 123–149.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis
of decision under risk. Econometrica, 47, 263–291.
Lee, J.-W., Yates, J. F., Shinotsuka, H., Singh, R., Onglatco, M. L.
U., Yen, N. S., Gupta, M., & Bhatnagar, D. (1995). Cross-national
differences in overconfidence. Asian Journal of Psychology, 1,
63–69.
LePoer, B. L. (Ed.). (1991). Singapore: A country study (2nd ed.).
Washington, DC: Federal Research Division, Library of Congress.
Lichtenstein, S., & Fischhoff, B. (1977). Do those who know more
also know more about how much they know? Organizational Behavior and Human Performance, 20, 159–183.
McClelland, A. G. R., & Bolger, F. (1994). The calibration of subjective
probabilities: Theories and models 1980–94. In G. Wright & P.
Ayton (Eds.), Subjective probability (pp. 453–482). Chichester, England:
Wiley.
Pfeifer, P. E. (1994). Are we overconfident in the belief that probability
forecasters are overconfident? Organizational Behavior and Human Decision Processes, 58, 203–213.
Ramsey, F. P. (1931). Truth and probability. In F. P. Ramsey (Ed.), The
foundations of mathematics and other logical essays (pp. 156–198).
New York: Harcourt, Brace, Jovanovich.
Reischauer, E. O., & Jansen, M. B. (1995). The Japanese today. Cambridge, MA: Belknap Press.
Sniezek, J. A., & Buckley, T. (1991). Confidence depends on level of
aggregation. Journal of Behavioral Decision Making, 4, 263–272.
Stening, B. W., & Everett, J. E. (1984). Response styles in a cross-cultural managerial study. Journal of Social Psychology, 122, 151–156.
Wallsten, T. S. (1996). An analysis of judgment research analyses.
Organizational Behavior and Human Decision Processes, 65, 220–226.
Whitcomb, K. M., Önkal, D., Curley, S. P., & Benson, P. G. (1995).
Probability judgment accuracy for general knowledge: Cross-national differences and assessment methods. Journal of Behavioral
Decision Making, 8, 51–67.
Wright, G. N., Phillips, L. D., Whalley, P. C., Choo, G. T., Ng, K. O.,
Tan, I., & Wisudha, A. (1978). Cultural differences in probabilistic
thinking. Journal of Cross-Cultural Psychology, 9, 285–299.
Yates, J. F. (1990). Judgment and decision making. Englewood Cliffs,
NJ: Prentice Hall.
Yates, J. F. (1994). Subjective probability accuracy analysis. In G.
Wright & P. Ayton (Eds.), Subjective probability (pp. 381–410).
Chichester, England: Wiley.
Yates, J. F., & Lee, J.-W. (1996). Chinese decision making. In M. H.
Bond (Ed.), Handbook of Chinese psychology (pp. 338–351). Hong
Kong: Oxford University Press.
Yates, J. F., Lee, J.-W., & Shinotsuka, H. (1996). Beliefs about overconfidence, including its cross-national variation. Organizational Behavior and Human Decision Processes, 65, 138–147.
Yates, J. F., Zhu, Y., Ronis, D. L., Wang, D.-F., Shinotsuka, H., &
Toda, M. (1989). Probability judgment accuracy: China, Japan, and
the United States. Organizational Behavior and Human Decision
Processes, 43, 145–171.
Zax, M., & Takahashi, S. (1967). Cultural influences on response
style: Comparisons of Japanese and American college students.
Journal of Social Psychology, 71, 3–10.
Zhang, B. (1992). Cultural conditionality in decision making: A prospect of probabilistic thinking. Unpublished doctoral dissertation,
Department of Information Systems, London School of Economics
and Political Science, University of London, London.
Received: November 13, 1996