The Relative Judgment Theory
of the Psychometric Function
Stephen W. Link
Department of Psychology
McMaster University, Hamilton, Canada
ABSTRACT
The theory of relative judgment is based upon sequential comparisons between
a presented stimulus and a mental psychophysical standard. The process of
comparison continues until either of two response thresholds is first exceeded;
then a response is emitted. This sequential theory of judgment provides predictions for both response probability (RP) and response time (RT), and a
fundamental relationship between RP and RT. The application of the theory
to the psychometric function rests on the assumption that a single mental
standard, not necessarily identical to the experimenter's standard, provides the
referent against which individual stimuli are compared. To confirm the predicted relation between RP and RT, three experiments are examined. In all
cases the theoretically predicted relationship is supported by the data. As a
bonus, it is now possible to estimate the expected value of the mental standard from either RP or RT results, compare the estimates, and relate the
estimated value to units of the physical stimulus.
I. INTRODUCTION
History is replete with examples of replicable scientific observations that eventually obtain the status of empirical laws. The psychometric function which
Urban (1910) defined as "A mathematical expression which gives the probability
of a judgment as function of the comparison stimulus [sic]" is an example. Over a
wide range of experimental conditions, methods, and stimuli, psychologists have
discovered that as the difference between a comparison stimulus and a standard
increases, the probability of judging the comparison stimulus to be greater than
the standard also increases. The empirical fact is that this rather smoothly increasing
function of stimulus difference appears to have the form of a cumulative
probability function.
The existence of such a widely found result has tempted statisticians to discover the "best" mathematical description of its form. For the psychometric
function the best description has been claimed to be (Bock & Jones, 1968)
(1) the normal distribution, (2) the log normal, (3) the angular, and (4) the logistic. Of course, each of these proposals has merit, but as Feller (1940) pointed
out, poor fit will not permit great discrimination among these, as well as many
other, functional forms because each will provide a reasonably close fit to empirical results. Nevertheless the pursuit of a mathematical form for the psychometric function has led many statisticians to favor the logistic distribution (cf.
Berkson, 1953; Cox, 1970).
In spite of the arguments in favor of the logistic, there is little agreement
concerning the reason for its frequent appearance in psychological research. A
major impediment to providing a theory consistent with the assumed logistic
form of the psychometric function is raised by the fact that the probabilities that
appear to follow this cumulative probability distribution so nicely are not cumulative probabilities at all. Rather, they are probabilities obtained from formal
choice experiments (Bush, Luce, & Galanter, 1963) and should be considered as
choice probabilities. Thus a theory that accounts for the empirical result must
predict that choice probabilities, when viewed as an increasing function of stimulus difference, should "appear" to match the cumulative logistic distribution.
The many difficulties in finding the best form for the psychometric function
assume less importance if we shed our empirical cloaks, don our theoretical caps,
and view the psychometric function as only one manifestation of discriminative
performance. As a start, we might suspect that if both response probabilities and
response times are measured then response probability and correct and error
response times should vary as a function of stimulus difference. Moreover, these
variables should be related by predictions from a theory of discrimination. A
theory of psychophysical discrimination that predicts all of these response
measures will then permit experimental verification of relations between response
probability and response time. The predictions and experimental analyses
of such a theory are the focus of this chapter.
II. THE THEORY OF RELATIVE JUDGMENT
Relative Judgment Theory (RJT) (Link, 1975; Link & Heath, 1975) provides a
new theoretical basis for two choice discrimination experiments, and as we will
see, a new basis for the psychometric function. The theory postulates that
through experience, such as training or preexposure, a mental standard is established. When the experimenter presents a stimulus to the subject, a psychological
value of the stimulus is compared against the mental standard by subtraction.
During a single trial these differences are accumulated over time until one or the
other of two fixed response thresholds is first exceeded; a response is then
emitted. Because the theory is sequential, immediate relations between choice
probability and response time result. These theoretical relations provide a quite
reasonable account of choice probability and response time for a number of
experiments.
We will assume that in a multistimulus, two-choice experiment the presentation of a stimulus
results in a comparison between random variables representing psychological values $X_i$ for
stimulus $S_i$ ($i = 1, 2, \ldots, n$) and $X_r$ for the subject's mental standard. The process
of comparison occurs during a time unit of size $\Delta t$, for which we define the stationary
difference random variable $d_i = X_i - X_r$ to characterize the result of the comparison
process. For the first $k$ time units, we let the sum of all such differences be
$D_{ik} = \sum_{j=1}^{k} d_{ij}$. The random variable $D_{ik}$ we will take to be a sum of
independent identically distributed random variables that performs a random walk on a
psychological dimension of comparative difference.
As shown in Fig. 1, the random walk performed by $D_{ik}$ is bounded by two response thresholds
equidistant from a zero value for comparative difference. If the random walk first exceeds the
response threshold at $A$, then response $R_A$, let us say the "greater than" response, occurs;
otherwise the response $R_B$ must occur first because the random walk must eventually terminate
at either $A$ or $-A$ (Wald, 1947). The distance between response thresholds is assumed to be
constant during a trial, and equals the total amount of comparative difference needed by the
subject to meet the experimenter's speed and accuracy instructions. The starting value of the
random walk, that is, $D_{i0}$, is labelled $C$ and may be shifted positively to bias responding
toward response $R_A$, or shifted negatively producing bias toward response $R_B$.

[FIG. 1. Random walk of total comparative difference $D_{ik}$ over time, starting at $C$ and bounded by the response thresholds $A$ (respond $R_A$) and $-A$ (respond $R_B$).]
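The accumulation process just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the theory does not fix the distribution of the differences $d_{ij}$, so Gaussian increments are assumed here, and the names `simulate_trial` and `estimate` as well as all parameter values are hypothetical.

```python
import random


def simulate_trial(mu, sigma, A, C=0.0, rng=random):
    """Accumulate differences until the walk leaves the interval (-A, A).

    Assumption: each comparison d_ij is Gaussian with mean mu and
    standard deviation sigma. Returns (response, number_of_steps).
    """
    total = C            # starting value D_i0 = C
    steps = 0
    while -A < total < A:
        total += rng.gauss(mu, sigma)   # one comparison in one time unit
        steps += 1
    return ("A" if total >= A else "B"), steps


def estimate(mu, sigma, A, C=0.0, trials=2000, seed=1):
    """Monte Carlo estimates of P(R_A) and of the mean number of steps."""
    rng = random.Random(seed)
    n_a = 0
    total_steps = 0
    for _ in range(trials):
        response, steps = simulate_trial(mu, sigma, A, C, rng)
        n_a += 1 if response == "A" else 0
        total_steps += steps
    return n_a / trials, total_steps / trials


if __name__ == "__main__":
    # Hypothetical values: positive drift, unbiased start.
    print(estimate(mu=0.2, sigma=1.0, A=3.0))
    # Zero drift with the start shifted toward A: bias toward R_A.
    print(estimate(mu=0.0, sigma=1.0, A=3.0, C=1.5))
```

With a positive mean difference the walk ends at $A$ more often than at $-A$, and shifting the start toward $A$ has a similar effect even when the mean difference is zero.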
The mathematical development of this theory of sequential discrimination
rests on the well known Wald Identity:
where $N$ is the value of $k$ when the random walk stops (at either $A$ or $-A$) and
$M_i(\theta)$ is the moment generating function for $d_i$. From Wald's Identity it follows
that the probability of first responding $R_A$ given stimulus $S_i$ is

$$P_{Ai} = \frac{e^{\theta_i(A+C)} - 1}{e^{\theta_i(A+C)} - e^{-\theta_i(A-C)}} \qquad (i = 1, 2, \ldots, n)$$

where $\theta_i$ is a nonzero solution to $M_i(\theta) = 1$. Algebraic manipulation then gives

$$P_{Ai} = \frac{e^{\theta_i A} - 1}{e^{\theta_i A} - e^{-\theta_i A}} + \frac{1 - e^{-\theta_i C}}{e^{\theta_i A} - e^{-\theta_i A}} \qquad (1)$$
When there is no response bias, a recognized criterion for symmetry of the psychometric
function (Green & Swets, 1966, p. 125), then $C = 0$ and Equation 1 becomes

$$P_{Ai} = \frac{e^{\theta_i A} - 1}{e^{\theta_i A} - e^{-\theta_i A}}.$$

Rewriting this equation yields

$$P_{Ai} = \left(1 + e^{-\theta_i A}\right)^{-1} \qquad (2)$$

which is the defining equation of the distribution function of a standard logistic
variable (evaluated at the point $\theta_i A$).
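As a numerical check on the algebra, the following sketch evaluates the two forms of the response probability and confirms that Equation 1 with $C = 0$ coincides with the logistic form of Equation 2. The function names and the parameter values are hypothetical and serve only as an illustration.

```python
import math


def p_a_first_form(theta, A, C=0.0):
    """Response probability before rearrangement (theta must be nonzero)."""
    num = math.exp(theta * (A + C)) - 1.0
    den = math.exp(theta * (A + C)) - math.exp(-theta * (A - C))
    return num / den


def p_a_eq1(theta, A, C=0.0):
    """Equation 1: an unbiased term plus a term that depends on C."""
    den = math.exp(theta * A) - math.exp(-theta * A)
    unbiased = (math.exp(theta * A) - 1.0) / den
    bias = (1.0 - math.exp(-theta * C)) / den
    return unbiased + bias


def logistic(x):
    """Equation 2 evaluated at x = theta * A."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    A = 2.0  # hypothetical threshold
    for theta in (-0.8, 0.3, 1.5):
        # The two forms agree for any starting point C.
        assert abs(p_a_first_form(theta, A, 0.5) - p_a_eq1(theta, A, 0.5)) < 1e-12
        # With C = 0, Equation 1 is the logistic function of theta * A.
        assert abs(p_a_eq1(theta, A) - logistic(theta * A)) < 1e-12
    print("Equation 1 reduces to Equation 2 when C = 0")
```

A positive starting point $C$ raises $P_{Ai}$ for either sign of $\theta_i$, which is the bias toward $R_A$ described in the text.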
Maximum likelihood estimates of $\theta_i A$ can be obtained by computing
$\ln[P_{Ai}/(1 - P_{Ai})] = \theta_i A$. When the value of $P_{Ai}$ is not too near 0 or 1, a
useful estimate of $\theta_i A$ can be obtained from the (Anscombe, 1956) transform
$\theta_i A = \ln[(n_{Ai} + \tfrac{1}{2})/(n_i - n_{Ai} + \tfrac{1}{2})]$, where $n_{Ai}$ is the
number of greater than responses, and $n_i$ is the total number of responses, to stimulus
$S_i$. Whether the values of $P_{Ai}$ appear to generate a psychometric function consistent
with a cumulative logistic distribution is largely a matter of whether estimates of
$\theta_i A$ are a linear function of the stimulus magnitudes employed by the experimenter. It
is, however, precisely this empirical fact that has led to the adoption of the logistic
distribution as a basis for the psychometric function.
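The two estimators of $\theta_i A$ can be written directly from their definitions. In this minimal sketch the function names are hypothetical and the response counts are invented for illustration; they are not taken from any of the experiments discussed in the chapter.

```python
import math


def logit_estimate(n_a, n):
    """Maximum likelihood estimate: ln[P / (1 - P)] with P = n_a / n.

    Undefined when n_a is 0 or n.
    """
    p = n_a / n
    return math.log(p / (1.0 - p))


def anscombe_estimate(n_a, n):
    """Anscombe (1956) transform: ln[(n_a + 1/2) / (n - n_a + 1/2)]."""
    return math.log((n_a + 0.5) / (n - n_a + 0.5))


if __name__ == "__main__":
    # Hypothetical counts: 180 "greater than" responses in 240 trials.
    print(logit_estimate(180, 240))
    print(anscombe_estimate(180, 240))
    # The half-count correction keeps the estimate finite at the extremes.
    print(anscombe_estimate(240, 240))
```

For moderate proportions the two estimates differ only slightly, while the half-count correction remains defined when every response falls in one category.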
To determine whether the psychometric function results from the accumulation of comparative
difference as postulated by RJT, we examine the predicted relationship between response
probability and response time. In general, the mean decision time equals the average distance
to the response thresholds divided by the rate of drift, that is,

$$EDT_i = [P_{Ai}(A - C) - (1 - P_{Ai})(A + C)] / \mu_i = [A(2P_{Ai} - 1) - C] / \mu_i$$

where $\mu_i = E(X_i - X_r)$ is the expected discrepancy between the psychological values of
$S_i$ and the subject's mental standard. Collecting together all nondecision components of
response time into an average value $K$ yields a predicted mean RT to stimulus $S_i$ of
$ERT_i = EDT_i + K$. For experiments in which response bias is negligible we find that

$$ERT_i = \frac{A}{\mu_i}(2P_{Ai} - 1) + K. \qquad (3)$$

Thus, for fixed response thresholds at $A$ and $-A$, the mean response time depends upon a
probability that can be estimated from choice response data and upon an unknown value
$\mu_i$. When $\mu_i$ is known, $ERT_i$ is a linear function of $(2P_{Ai} - 1)/\mu_i$ having
slope $A$ and intercept $K$.
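A short sketch makes the mean decision time and Equation 3 concrete. The function names and all numerical values below are hypothetical; time is expressed in arbitrary units.

```python
def mean_decision_time(p_a, A, mu, C=0.0):
    """Mean decision time: [A(2P - 1) - C] / mu, with mu nonzero."""
    return (A * (2.0 * p_a - 1.0) - C) / mu


def mean_response_time(p_a, A, mu, K):
    """Equation 3 (no response bias): (A / mu)(2P - 1) + K."""
    return (A / mu) * (2.0 * p_a - 1.0) + K


if __name__ == "__main__":
    A, mu, K, p = 10.0, 0.5, 300.0, 0.9   # hypothetical values

    # The two expressions for the mean decision time agree.
    C = 1.0
    long_form = (p * (A - C) - (1.0 - p) * (A + C)) / mu
    assert abs(long_form - mean_decision_time(p, A, mu, C)) < 1e-9

    # A stimulus with drift -mu and probability 1 - p has the same mean RT.
    assert abs(mean_response_time(p, A, mu, K)
               - mean_response_time(1.0 - p, A, -mu, K)) < 1e-9

    print(mean_response_time(p, A, mu, K))
```

The second check shows why Equation 3 treats stimuli on either side of the mental standard alike: the sign of $2P_{Ai} - 1$ always matches the sign of $\mu_i$, so the decision-time term is never negative.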
While a set of $\mu_i$ ($i = 1, 2, \ldots, n$) that provide a best fit for Equation 3 may be
found by least squares, there are other procedures that are preferable and that involve the
selection of fewer than $n$ parameters. For example, it is known that when $\theta_i = 0$, then
necessarily $\mu_i = 0$. Therefore, when estimates of $\theta_i A$ are a linear function of
stimulus difference, as in the case of a logistic psychometric function, the position on the
abscissa where $\theta_i A = 0$ provides an estimate of the value of a subject's average mental
standard in terms of stimulus magnitude. Thus, when a stimulus magnitude, say $S_r$, generates
psychological values having expected value equal to the expected value of the subject's mental
standard, $E(X_r)$, then $\mu_i = 0$.
For the rather small range of stimulus magnitudes characteristic of psychometric function
experiments an appealing assumption concerning $\mu_i$ is that the drift rate is a similarity
transformation of the difference between $S_i$ and $S_r$. Specifically, let
$\mu_i = m(S_i - S_r)$, where $m$ is a constant, $S_i$ is a stimulus magnitude, and $S_r$ is
the value of the subject's mental standard defined in units of the stimulus. Equation 3 may
then be written as

$$ERT_i = \frac{A}{m} \cdot \frac{2P_{Ai} - 1}{S_i - S_r} + K \qquad (4)$$
where $A/m$ is an unknown slope, and $K$ is the mean nondecision time. One advantage of
Equation 4 over Equation 3 is that only the stimulus magnitude corresponding to the subject's
mental standard need be estimated, and this value can often be determined from the
psychometric function. A second advantage is that only the relative differences between
stimulus magnitudes and $S_r$ need be known in order to assess the least squares fit of
Equation 4. The three experiments examined below use Equation 4 to provide evidence that RJT
provides an excellent account of data.
III. KELLOGG'S EXPERIMENT
In a now classic study of split disk brightness discrimination, Kellogg (1931) required
subjects to determine which side of a split disk was brighter. Four luminances were used to
create seven stimuli with the property that the difference in luminance between the left and
right sides of the disc was an arithmetic progression having a step size of one and ranging
from $-3$ to $3$ $\Delta s$ units. After 4,000 trials Kellogg began recording response times
without the subject's knowledge, and obtained 240 observations for each value of $S_i$.
Kellogg's careful balancing of the stimuli, his attempt to minimize response bias, and the
large number of observations suggest that $\theta_i A$ be estimated using the maximum
likelihood calculation, $\theta_i A = \ln[P_{Ai}/(1 - P_{Ai})]$. These estimates are shown in
Fig. 2 as a function of the luminance difference between the two sides of the disc. A least
squares fit yielded the equation $\theta A = 1.14\,\Delta s + .31$ with a standard error of
.15. Solving this equation for the value of $\Delta s$ at which $\theta A = 0$ gives
$\Delta s = -.27$. Thus, given the standard error, we would expect the average value of the
subject's mental standard to lie in the region $-.40$ to $-.13$ $\Delta s$.

[FIG. 2. Estimates of $\theta A$ as a function of luminance difference $\Delta s$, with the fitted line $\theta A = 1.14\,\Delta s + .31$.]

[FIG. 3. Mean response time (msec) as a function of $(2P_A - 1)\mu_i^{-1}$.]
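The zero-crossing calculation can be reproduced with a small least squares routine. Kellogg's individual estimates are not reproduced in the text, so the points below are synthetic values placed exactly on the reported line; they only illustrate the procedure, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def zero_crossing(slope, intercept):
    """Stimulus value at which the fitted line equals zero."""
    return -intercept / slope


if __name__ == "__main__":
    delta_s = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
    # Synthetic points on the reported line theta*A = 1.14 * delta_s + .31.
    theta_a = [1.14 * s + 0.31 for s in delta_s]
    slope, intercept = fit_line(delta_s, theta_a)
    print(round(zero_crossing(slope, intercept), 2))  # -0.27
```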
To examine the predicted relation between response time and response probability given in
Equation 4, values of $S_i$ were taken to equal values of the visually presented stimulus
difference, i.e., $-3, -2, \ldots, 3$ $\Delta s$. Values of $S_r$ ranging from $-3$ to $3$ in
steps of .01 were defined and for each value the agreement between Equation 4 and the actual
data points was assessed by calculating average squared deviation. A minimum squared deviation
occurred when $S_r$ assumed the value $-.33$ $\Delta s$, a value in good agreement with that
obtained from the estimated position of the mental standard by using the psychometric function.
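The search over candidate values of $S_r$ can be sketched as follows. The data here are synthetic: they are generated without noise from Equations 2 and 4 with assumed parameters, so the search should simply recover the value of $S_r$ that was built in. Nothing below reproduces Kellogg's observations, all names are hypothetical, and skipping candidates that coincide with a stimulus value is an implementation choice rather than something stated in the text.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_sq_dev(s_r, stimuli, p_a, rt):
    """Average squared deviation of the best line for Equation 4 at this S_r."""
    xs = [(2.0 * p - 1.0) / (s - s_r) for s, p in zip(stimuli, p_a)]
    slope, intercept = fit_line(xs, rt)
    return sum((y - (slope * x + intercept)) ** 2
               for x, y in zip(xs, rt)) / len(rt)


def search_s_r(stimuli, p_a, rt, lo=-3.0, hi=3.0, step=0.01):
    """Grid search for the S_r that minimizes the average squared deviation."""
    best = None
    n_steps = int(round((hi - lo) / step))
    for k in range(n_steps + 1):
        s_r = round(lo + k * step, 10)
        if any(abs(s - s_r) < 1e-9 for s in stimuli):
            continue  # Equation 4 is undefined when S_i equals S_r
        err = mean_sq_dev(s_r, stimuli, p_a, rt)
        if best is None or err < best[1]:
            best = (s_r, err)
    return best


# Synthetic, noise-free data generated from assumed parameters.
TRUE_S_R, SLOPE, K = -0.33, 1186.0, 475.0
stimuli = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
p_a = [1.0 / (1.0 + math.exp(-1.14 * (s - TRUE_S_R))) for s in stimuli]
rt = [SLOPE * (2.0 * p - 1.0) / (s - TRUE_S_R) + K for s, p in zip(stimuli, p_a)]

best_s_r, best_err = search_s_r(stimuli, p_a, rt)

if __name__ == "__main__":
    print(best_s_r, best_err)
```

With real data the minimum would not be exact, and the steepness of the error surface around it indicates how sharply the response times constrain $S_r$.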
Figure 3 shows the result of plotting the mean RTs according to Equation 4 and taking
$S_r = -.33$. The linear fit is quite good with slope 1186 and intercept 475 msec. At the
minimum the standard error proved to be only 18 msec. The descent to this minimum was quite
steep as revealed by standard errors of 50 at $S_r = -.40$ and 56 at $S_r = -.27$. Thus the
best fit at $S_r = -.33$ is considerably better than assuming, from calculations employing the
psychometric function, that $S_r = -.27$. The difference between the two estimates of $S_r$ is
well within sampling error, and therefore the theory appears to provide a quite respectable
account of Kellogg's results.
The relationship between mean RT and choice probability given in Equation
4 can be examined without reference to the psychometric function. Even in experiments in which the error rate is rather small, changes in the value of $\mu_i$ can
have a significant effect upon response time. To illustrate the application of
Equation 4 without first examining the psychometric function, data from a
well known card sorting experiment were examined.
IV. THE SHALLICE AND VICKERS EXPERIMENTS
Seven subjects sorted eight card packs containing 40 cards each.
Each card contained two lines, one of 4.5 cm. The other line took eight difference
values ranging from 3.4 to 4.4 cm depending upon the pack it was in. Their positions
were varied along the top of the card, being a constant distance of 1.0 cm apart. In
each pack 20 cards had the longer line on the left and 20 on the right. ... Each subject sorted the pack into two piles depending on whether the long line was on the left
or the right. (Shallice & Vickers, 1964, p. 45)
Let us suppose that subjects consistently compare the line on one side of the
card to the line opposite. If the comparison is from left to right, then whenever
the 4.5 cm line appears on the left, a positive difference results; otherwise, the
difference is negative. The discrepancy obtained is a visual discrepancy which
generates a psychological value that is compared to the subject's mental standard.
In the present experiment, the 4.5 cm line appeared equally often on the left or
right-hand sides of the cards and, therefore, the discrepancies are as often positive as negative. Thus the subject would do well to maintain a mental standard
for the difference equal to zero.
The data reported by Shallice and Vickers were not partitioned according to
whether the long line actually occurred on the left or right hand side. However,
for the eight values of physical difference between lines, corresponding to the
eight decks , we do have available the number of errors and the mean response
time. The data, then, may be considered as defining the positive side of a perfectly
balanced psychometric function experiment. The psychometric function itself is
not of immediate concern, for the error rates varied from .037 to only .007 for
the 11 mm difference. However, the predicted relationship between choice
probability and response time may still be examined.
The lowest panel in Fig. 4 shows the result obtained by applying Equation 4 to the average
data for the experiment described earlier (Experiment 3). The least squares fit yielded an
estimate of the subject's mental standard for physical difference equal to -.02 mm, a value
quite near zero. The slope and intercept values were 48 and 939 msec, respectively, and the
standard error was only 8.3 msec. The rate of convergence to the minimum was quite fast, with
the standard error having values of 15 msec and 18 msec when the subject's standard was taken
to be -.07 or .03 mm, respectively. The quite close fit gives additional evidence in favor of
Equation 4, as do the additional results shown for Shallice and Vickers' Experiments 1 and 2
which were similar to Experiment 3.

[FIG. 4. Mean response time (msec) plotted according to Equation 4 for Shallice and Vickers' Experiment 1 ($S_r = +.17$), Experiment 2 ($S_r = +.07$), and Experiment 3 (lowest panel).]
In each of these experiments a balanced experimental method has provided
estimates of the mental standard as having a value near zero. It might be assumed
then that subjects simply compare one component of the stimulus to the other,
and the discrepancy so obtained is used to drive the decision process. That is,
the subject uses as a standard the value of a standard established by the experimenter. This assumption would be in accord with the many theories of discrimination based upon Thurstonian psychophysical models. However, the
estimated value of a subject's mental standard is generally zero only in "balanced"
discrimination experiments. The next experiment illustrates this point.
V. THE LINK (1971) EXPERIMENT
Four very well practiced subjects were required to respond "same" or "different" depending
upon whether a line presented 200 msec after a fixed standard, $S_0$, was of the same or a
different length from $S_0$. If a comparison line was different from $S_0$, it was always
either longer or shorter depending upon the subject. On 50% of the trials $S_0$ was followed
by a re-presentation of $S_0$; the remaining trials were equally divided among lines which
varied from the standard by 1, 2, 3, or 4 mm. If the subject adopts as a mental standard the
experimenter's standard, then whenever the comparison is of the same length as the
experimenter's standard, the value of $\mu$ will be zero. This would result in long response
times and chance performance. Adopting such a standard would, in this case, lead to rather odd
performances. Alternatively, the mental standard may be displaced from the experimenter's
standard and result in a nonzero value for $\mu$ when $S_0$ is the comparison stimulus.
In addition to varying the stimulus differences, RT deadlines were imposed
on the task. Subjects were instructed at the beginning of each trial to respond
faster than either 260 or 460 msec, or were to be as accurate as possible. Each of
these instructions was presented equally often, and in all respects the experiment
was completely randomized within each block of 240 experimental trials.
Theoretically the effect of the RT deadline instruction should be to force the
subjects to vary the total comparative difference used in the decision process.
With a 260 msec deadline there should be little if any discrimination, because in
order to produce such rapid responses the value of A must be near zero. When a
moderate deadline is in force then the value of A should increase, and when accuracy is required, the value of A should be still greater. The effect on Equation
4 is to increase the slope of the function relating RT and choice performance.
But the subject's mental standard should be found to be identical across the
various RT deadline conditions.
The results shown in Fig. 5 were obtained by a least squares method analogous to that used in
the preceding experiments. It can be seen that for the 260 msec RT deadline the slope was
zero, indicating that performance was rather poor. When the RT deadline was relaxed, the slope
of Equation 4 increased, and when the subjects were to be accurate, the slope increased again.
Within each deadline condition the least squares fit was quite good with standard errors of
1.9 and 1.3 msec for the 460 msec and accuracy conditions respectively. The estimated position
of the average mental standard, labelled $S_r$ in Fig. 5, equalled $S_0 + .74$ for the 460
msec, and $S_0 + .78$ for the accuracy condition.
The conclusion to be drawn is that the subject's mental standard was not the
same as the experimenter's standard but was instead about .75 mm distant from
the experimenter's standard. The adoption of the standard may be an automatic,
learned feature of human discrimination that depends upon the frequency of
presenting various stimuli, but the important fact is that the comparison process
is not accurately characterized in terms of differences from the experimenter's
standard.
It appears that in experiments in which the comparison stimuli are balanced
on either side of the experimenter's standard, both the subject's and experimenter's standards match quite nicely. This result would give support to Thurstone's assumption that the value zero is a standard for comparative judgment.
However, when the balancing is changed, as in Link (1971), we observe a shift of
the subject's mental standard away from the physical standard established by the
experimenter.
In summary, we have shown how the well-known psychometric function can be derived from
relative judgment theory and how the theory relates response time and response probability.
Whereas our analysis has assumed unbiasedness with respect to response thresholds, future
experiments will report results in which response bias is manipulated in order to promote
changes in the response times and in the psychometric function.

[FIG. 5. Mean response time (msec) as a function of $(2P_A - 1)\mu_i^{-1}$ for the accuracy condition ($S_r = S_0 + .78$), the 460 msec deadline ($S_r = S_0 + .74$), and the 260 msec deadline ($A = 0$).]
ACKNOWLEDGMENTS
This project is supported by the National Research Council of Canada and by the Science
and Engineering Research Board of McMaster University. Mr. Norman Wintrip and Miss Julia
Cox have assisted in the collection and analysis of data for this project.
REFERENCES
Anscombe, F. J. On estimating binomial response relations. Biometrika, 1956, 43, 461-464.
Berkson, J. A statistically precise and relatively simple method of estimating the bio-assay with quantal response, based on the logistic function. Journal of the American Statistical Association, 1953, 48, 565-599.
Bock, D. R., & Jones, L. V. The measurement and prediction of judgment and choice. San Francisco: Holden-Day, 1968.
Bush, R. R., Galanter, E., & Luce, R. D. Characterization and classification of choice experiments. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (Vol. 1). New York: Wiley, 1963.
Cox, D. R. The analysis of binary data. London: Methuen, 1970.
Feller, W. On the logistic law of growth and its empirical verifications in Biology. Acta Biotheoretica, 1940, 4, 51-66.
Green, D. M., & Swets, J. A. Signal detection theory and psychophysics. New York: Wiley, 1966.
Kellogg, W. N. The time of judgment in psychometric measures. American Journal of Psychology, 1931, 43, 65-86.
Link, S. W. Applying RT deadlines to discrimination reaction time. Psychonomic Science, 1971, 25, 355-358.
Link, S. W. The relative judgment theory of two choice response time. Journal of Mathematical Psychology, 1975, 12, 114-135.
Link, S. W., & Heath, R. A. A sequential theory of psychological discrimination. Psychometrika, 1975, 40, 77-105.
Shallice, T., & Vickers, D. Theories and experiments on discrimination times. Ergonomics, 1964, 7, 37-49.
Urban, F. M. The method of constant stimuli and its generalizations. Psychological Review, 1910, 27, 229-259.
Wald, A. Sequential analysis. New York: Wiley, 1947.
ATTENTION and PERFORMANCE VII
Edited by Jean Requin
Departement de Psychobiologie Experimentale
Institut de Neurophysiologie et Psychophysiologie
du Centre National de la Recherche Scientifique
Marseille, France
1978
LAWRENCE ERLBAUM ASSOCIATES, PUBLISHERS
Hillsdale, New Jersey
DISTRIBUTED BY THE HALSTED PRESS DIVISION OF JOHN WILEY & SONS
New York, Toronto, London, Sydney