INFORMES TÉCNICOS MIDE UC / TECHNICAL REPORTS MIDE UC
Centro de Medición MIDE UC / Measurement Center MIDE UC
IT1106
Identification of the 1PL Model with Guessing Parameter:
Parametric and Semi-parametric Results
Ernesto San Martín (a,b,c) and Jean-Marie Rolin (d)

(a) Department of Statistics, Pontificia Universidad Católica de Chile, Chile.
(b) Faculty of Education, Pontificia Universidad Católica de Chile, Chile.
(c) Measurement Center MIDE UC, Pontificia Universidad Católica de Chile, Chile.
(d) Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium.
September 28, 2011
Abstract
In this paper, we study the identification of a particular case of the 3PL model, namely when
the discrimination parameters are all constant and equal to 1. We term this model the 1PL-G model.
The identification analysis is performed under three different specifications. The first specification
considers the abilities as unknown parameters. It is proved that the item parameters and the abilities
are identified if a difficulty parameter and a guessing parameter are fixed at zero. The second specification assumes that the abilities are mutually independent and identically distributed according to
a distribution known up to the scale parameter. It is shown that the item parameters and the scale
parameter are identified if a guessing parameter is fixed at zero. The third specification corresponds
to a semi-parametric 1PL-G model, where the distribution G generating the abilities is a parameter
of interest. It is not only shown that, after fixing a difficulty parameter and a guessing parameter at
zero, the item parameters are identified, but also that under those restrictions the distribution G is
not identified. It is finally shown that, after introducing two identification restrictions either on the
distribution G or on the item parameters, the distribution G and the item parameters are identified
provided an infinite quantity of items is available.
Keywords: 3PL model, location-scale distributions, fixed effects, random effects, identified parameter, parameters of interest, Hilbert space.
1 Introduction
For multiple-choice tests it is reasonable to assume that the respondents guess when they believe that they
don’t know the correct response. This type of behavior seems to be prevalent in a low-stakes test, where
students are asked to take a test for which they receive neither grades nor academic credit and thus may be
unmotivated to do well. The solution for this problem is to include a so-called guessing parameter in the
model that would have been used without any guessing, which is commonly the 1PL or the 2PL model.
Three possibilities are found in the literature (Hutchinson, 1991): (1) a fixed value L⁻¹, with L being
the number of response categories; (2) an overall guessing parameter to be estimated from the data, with
the same value for all items; (3) an item-specific guessing parameter. The third possibility is the one used
in the 3PL, the extension of the 2PL with an item-specific guessing parameter; this parameter reflects the
probability of a correct guess. The 3PL was introduced by Birnbaum (1968), and is discussed in most
handbooks of IRT (Hambleton, Swaminathan, and Rogers, 1991; van der Linden and Hambleton, 1997;
Embretson and Reise, 2000; McDonald, 1999; Thissen and Wainer, 2001) and the option is available
in various specialized computer programs (e.g., BILOG, LOGIST, MIRTE, MULTILOG, PARSCALE,
RASCAL, R).
The 3PL is a popular model, but it restricts the guessing parameter to be item dependent, instead of
allowing this parameter to be also person dependent. Several extensions have been proposed. Thus,
for instance, San Martı́n, del Pino, and De Boeck (2006) extend the 3PL model to let the guessing
parameter depend on the ability of the examinee. Wise and Kong (2005) proposed to use response time
to distinguish solution behavior and rapid-guessing behavior: for each item there is a threshold which
is the response time boundary between solution behavior and rapid-guessing behavior. Based on this
research, Wise and DeMars (2006) developed the effort-moderated model. If an examinee’s response
time is longer than the threshold, the model reduces to the 3PL model describing solution behavior.
Otherwise, the model reduces to a constant probability model with the guessing probability being L−1 ,
with L the number of response categories. Cao and Stokes (2008) proposed three IRT models to describe
the results from a group of examinees including both nonguessers and partial guessers. The first model
assumes that the guesser answers questions based on his/her knowledge up to a certain test item, and
guesses thereafter. The second model assumes that the guesser answers relatively easy questions based
on his/her knowledge and guesses randomly on the remaining items. The third model assumes that the
guesser gives less and less effort as he/she proceeds through the test.
In spite of these extensions, the 3PL model still poses basic questions to be investigated, questions which should have an impact on more complex guessing IRT models. Parameter identifiability is one of those basic questions. Identifiability is relevant because it is a necessary condition for ensuring a coherent inference on the parameters of interest. The parameters of interest are related to the sampling distributions which describe the data generating process. If there does not exist a one-to-one relationship between those parameters and the sampling distributions, the parameters of interest are not provided with an empirical meaning. In a sampling theory framework, this fact is made explicit through the impossibility
of obtaining unbiased and/or consistent estimators of unidentified parameters (Koopmans and Reiersøl,
1950; Gabrielsen, 1978; San Martı́n and Quintana, 2002). This limitation seems to be circumvented by
a Bayesian approach because in this set-up it is always possible to compute the posterior distribution of
unidentified parameters (Lindley, 1971; Poirier, 1998; Gelfand and Sahu, 1999; Ghosh, Ghosh, Chen,
and Agresti, 2000). However, taking into account that a statistical model always involves an identified
parametrization (see Theorem 4.3.3 in Florens, Mouchart, and Rolin (1990)), the posterior distribution
of an unidentified parameter updates the identified parameter only and, consequently, does not provide
any empirical information on the unidentified parameter (San Martı́n and González, 2010; San Martı́n,
Jara, Rolin, and Mouchart, 2011).
Thus, either from a Bayesian point of view or from a sampling theory framework, identifiability is an issue that needs to be considered. In both perspectives, an identification analysis should begin by making
explicit the sampling distributions as well as the parameters of interest. Regarding the 3PL model, the psychometric literature has considered three different types of likelihoods:
1. A first likelihood corresponds to the probability distribution of the observations given both the item
parameters (difficulty, discrimination and guessing ones) and the abilities. These parameters are
also considered as the parameters of interest; see, for instance, Swaminathan and Gifford (1986)
and Maris and Bechger (2009).
2. A second likelihood is obtained after integrating out the abilities, which in turn are assumed to be
distributed according to a parametric distribution Gφ known up to a parameter φ. In this case, the
sampling distribution or likelihood is indexed by the item parameters (difficulty, discrimination
and guessing ones) and φ. These parameters are also considered as the parameters of interest. The
abilities are typically obtained in a second step through an empirical Bayes procedure; see, for
instance, Bock and Aitkin (1981) and Bock and Zimowski (1997).
3. A third likelihood is obtained after integrating out the abilities, which in turn are assumed to
be distributed according to an unknown probability distribution G. In this case, the sampling
distribution or likelihood is indexed by the item parameters (difficulty, discrimination and guessing
ones) and G. These parameters are considered as the parameters of interest; see, for instance,
Woods (2006, 2008).
Following the terminology of generalized non-linear mixed models (De Boeck and Wilson, 2004), it
can be said that the first likelihood considers the abilities as fixed-effects, whereas the remaining two
likelihoods consider the abilities as random-effects. This terminology helps to make precise whether or not the abilities are parameters indexing the likelihood function. It can be used irrespective of the estimation procedure, particularly in a Bayesian framework, where the parameters indexing the likelihood are endowed with a prior distribution. For identification purposes, the estimation procedure is irrelevant.
The identification problems corresponding to each of those sampling distributions are quite different.
Some contributions conjecture (Adams, Wilson, and Wang, 1997; Adams and Wu, 2007) that the identification of the parameters indexing the first type of likelihoods implies the identification of the parameters
indexing the second (and, by extension, the third) type of likelihoods. However, in the case of the Rasch
model, it was shown that such relationships are not true (see Sections 3.2 and 4.2 in San Martı́n et al.
(2011)). This result suggests that the identification problems in the context of 3PL models are still open
problems.
This paper focuses its attention on these identification problems. It is mainly motivated by the recent work of Maris and Bechger (2009); there, a likelihood of the first type is considered, where the discrimination parameters are equal to an unknown common value. They showed that the item parameters and the abilities are not identified. Our paper studies the identification problem in the three contexts mentioned above: Section 2 discusses the problem under a fixed-effects specification of the model; Section
3 studies the problem under a parametric random-effects specification, where the distribution generating
the abilities is assumed to be known up to a scale parameter and a location parameter. Finally, in Section
4, the problem is analyzed in a semi-parametric context, where the distribution generating the abilities is considered among the parameters of interest. The main results obtained can be summarized as follows:
1. Under a fixed-effects specification of the model, it is shown that the item parameters and the
abilities are identified if basically two restrictions are imposed: one on the difficulty parameters
and one on the guessing parameters.
2. Under a parametric random-effects specification, it is shown that the item parameters and the scale
parameter are identified if basically one restriction on the guessing parameters is imposed.
3. Under a semi-parametric specification, it is shown that the item parameters are identified if basically two restrictions are imposed: one on the difficulty parameters and one on the guessing
parameters. Using the structure of a specific Hilbert space, it is also shown that the distribution G
is not identified by the observations. However, when an infinite quantity of items is available, it is
proved that G becomes identified.
The paper ends with a discussion.
2 Identification under a fixed-effects specification
The 3PL model, introduced by Birnbaum (1968), is specified as follows: for each person i = 1, . . . , N
and each item j = 1, . . . , J, the probability that person i answers correctly item j is given by
P [Yij = 1 | θi , βj , αj , cj ] = cj + (1 − cj ) Ψ[αj (θi − βj )],
(2.1)
where Ψ(x) = exp(x)/[1 + exp(x)]. This model assumes that if a person i has ability θi , then the
probability that he/she will know the correct answer of the item j is given by Ψ[αj (θi − βj )]; here
αj corresponds to the discrimination parameter of item j, whereas βj is the corresponding difficulty
parameter. It further assumes that if he/she does not know the correct answer, he/she will guess and, with
probability cj , will guess correctly; the parameter cj is accordingly called guessing parameter of item j.
It follows from these assumptions that the probability of a correct response to item j by person i is given
by (2.1). For details, see Birnbaum (1968) and Chapter 4 in Embretson and Reise (2000).
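To fix ideas, here is a minimal numerical sketch of (2.1) (all parameter values are hypothetical and serve only to illustrate the roles of αj, βj and cj):

```python
import numpy as np

def p_3pl(theta, alpha, beta, c):
    """Probability of a correct response under the 3PL model (2.1):
    c + (1 - c) * Psi(alpha * (theta - beta)), with Psi the logistic cdf."""
    psi = 1.0 / (1.0 + np.exp(-alpha * (theta - beta)))
    return c + (1.0 - c) * psi

# Hypothetical item: discrimination 1 (the 1PL-G case studied below),
# difficulty 0.5, guessing parameter 0.2; three ability levels.
print(p_3pl(np.array([-2.0, 0.0, 2.0]), alpha=1.0, beta=0.5, c=0.2))
# Even for very low ability, the probability stays above c = 0.2.
```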
The model is completed by assuming that Yij ’s are mutually independent. The statistical model (or,
likelihood function) describing the data generating process corresponds, therefore, to the family of sampling distributions indexed by the parameters
(θ 1:N , α1:J , β 1:J , c1:J ) ∈ RN × RJ+ × RJ × [0, 1]J ,
where θ 1:N = (θ1 , . . . , θN ), α1:J = (α1 , . . . , αJ ), and similarly for β 1:J and c1:J .
Recently, Maris and Bechger (2009) considered the identifiability of a particular case of the 3PL,
namely when the discrimination parameters αj , with j = 1, . . . , J, are equal to α. The parameter
indeterminacies inherited from the Rasch model and the 2PL model that should be removed, are the
location and scale ones. Maris and Bechger (2009) removed them by fixing α at one and constraining β1
and c1 in such a way that β1 = − ln(1−c1 ). In this specific case, the unidentifiability of the parameters of
interest persists because (θi , βj , cj ) and (ln(exp(θi ) + r), ln(exp(βj ) − r), (cj exp(βj ) − r)/(exp(βj ) −
r)), with a constant r such that
− min{exp(θi ) : i = 1, . . . , N } ≤ r ≤ min{cj exp(βj ) : j = 1, . . . , J},
induce the same probability distribution (2.1) (with αj = 1 for all items j). Thus, “in contrast to the location and scale indeterminacy, this new form of indeterminacy involves not only the ability and the item difficulty parameters, but also the guessing parameter” (Maris and Bechger, 2009, p. 6).
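This indeterminacy is easy to check numerically. The sketch below (with hypothetical values of θi, βj, cj and an admissible r) exploits the algebraic identity cj + (1 − cj)Ψ(θ − β) = (cj e^β + e^θ)/(e^β + e^θ) and shows that the original and transformed parameters induce exactly the same probability:

```python
import numpy as np

def p_1plg(theta, beta, c):
    # c + (1 - c) * Psi(theta - beta), rewritten algebraically as
    # (c * e^beta + e^theta) / (e^beta + e^theta)
    return (c * np.exp(beta) + np.exp(theta)) / (np.exp(beta) + np.exp(theta))

theta, beta, c = 0.3, 1.2, 0.25
r = 0.5   # any r with -exp(theta) <= r <= c * exp(beta) works here
theta2 = np.log(np.exp(theta) + r)
beta2 = np.log(np.exp(beta) - r)
c2 = (c * np.exp(beta) - r) / (np.exp(beta) - r)
print(p_1plg(theta, beta, c))      # original parameters
print(p_1plg(theta2, beta2, c2))   # transformed parameters: same value
```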
The question is under which additional restrictions the 3PL model with constant discriminations is identified.
San Martı́n, González, and Tuerlinckx (2009) considered this problem for the 3PL model with discriminations equal to 1. Following San Martı́n et al. (2006), this model, called 1PL-G model, is specified
as
P [Yij = 1 | θi , βj , cj ] = cj + (1 − cj ) Ψ(θi − βj ).
(2.2)
The data generating process is accordingly described by a family of sampling distributions indexed by
the parameters
(θ 1:N , β 1:J , c1:J ) ∈ RN × RJ × [0, 1]J .
(2.3)
San Martı́n et al. (2009) showed that these parameters are identified by the observations under specific
restrictions which are summarized in the following theorem:
Theorem 2.1 For the statistical model (2.2), the parameters (θ 1:N , β 1:J , c1:J ) are identified by the
observations provided the following conditions hold:
1. At least one item is available.
2. There exist at least two persons whose probabilities of correctly answering all the items are different.
3. c1 and β1 are fixed at 0.
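Theorem 2.1 can be illustrated by a small numerical sketch (hypothetical values). With β1 = 0 and c1 = 0, the abilities are recovered from item 1 by a logit transform; for any further item j, writing the probability as (cj e^{βj} + e^{θi})/(e^{βj} + e^{θi}) yields a 2 × 2 linear system in (e^{βj}, cj e^{βj}) built from two persons, and this system is invertible precisely when the two persons have different success probabilities, which is condition (2):

```python
import numpy as np

# 1PL-G probability, using c + (1 - c) * Psi(theta - beta)
# = (c * e^beta + e^theta) / (e^beta + e^theta).
P = lambda th, b, cc: (cc * np.exp(b) + np.exp(th)) / (np.exp(b) + np.exp(th))

theta = np.array([-0.7, 1.1])          # two persons (condition 2: distinct)
beta_j, c_j = 1.3, 0.25                # a further item j > 1

p1 = P(theta, 0.0, 0.0)                # item 1 with beta_1 = 0, c_1 = 0
theta_hat = np.log(p1 / (1 - p1))      # abilities: a logit transform

pj = P(theta, beta_j, c_j)
# P * a - b = e^theta * (1 - P), with a = e^{beta_j} and b = c_j * e^{beta_j}:
A = np.column_stack([pj, -np.ones(2)])
rhs = np.exp(theta_hat) * (1 - pj)
a, b = np.linalg.solve(A, rhs)         # invertible since the two rows differ
print(theta_hat, np.log(a), b / a)     # recovers theta_i, beta_j and c_j
```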
3 Identification of the Parametric 1PL-G Model

3.1 Random-effects specification of the 1PL-G model
The previous identification results are valid under a fixed-effects specification of the 1PL-G model, that
is, when the abilities are viewed as unknown parameters. However, in modern item response theory, θ
is usually considered as a latent variable and, therefore, its probability distribution is an essential part of
the model; see Thissen (2009). More specifically, the probability distribution generating the individual
abilities θi ’s is assumed to be a location-scale distribution G µ,σ defined as
(
)
x−µ
.
µ,σ
P [θi ≤ x | µ, σ] = G ((−∞, x]) = G (−∞,
] ,
(3.1)
σ
5
where µ ∈ R is the location parameter and σ ∈ R+ is the scale parameter. In applications, G is typically
chosen as a standard normal distribution.
It is also assumed that for each person i, his/her responses Y i = (Yi1 , . . . , YiJ )′ satisfy the Axiom of
Local Independence, namely that Yi1 , . . . , YiJ are mutually independent conditionally on (θi , β 1:J , c1:J ).
The distribution of Yij depends on (θi , βj , cj ) through the function (2.2). It is finally assumed that, conditionally on (θ 1:N , β 1:J , c1:J ), the response patterns Y 1 , . . . , Y N are mutually independent.
The statistical model (or, likelihood function) is obtained after integrating out the random effects
θi ’s. The above-mentioned hypotheses underlying the 1PL-G model imply that the response patterns Y 1 , . . . , Y N are mutually independent given (β 1:J , c1:J , µ, σ), with a common probability distribution defined as
defined as
P[Y i = y i | β 1:J , c1:J , µ, σ] = ∫_R ∏_{j=1}^{J} { P[Yij = 1 | θ, βj , cj ] }^{yij} { P[Yij = 0 | θ, βj , cj ] }^{1−yij} G^{µ,σ}(dθ),   (3.2)
where y i = (yi1 , . . . , yiJ )′ ∈ {0, 1}^J and P[Yij = 1 | θ, βj , cj ] is as defined by (2.2). The parameters of interest are accordingly given by
(β 1:J , c1:J , µ, σ) ∈ RJ × [0, 1]J × R × R+ .
(3.3)
In order to distinguish the probability distribution (3.2) from the conditional probability P [Yij = 1 |
θi , βj , cj ], (3.2) is termed marginal probability.
Under the iid property of the statistical model generating the Y i ’s, the identification of the parameters
of interest by one observation is entirely similar to their identification by an infinite quantity of observations; for a proof, see Theorem 7.6.6 in Florens et al. (1990). Thus, the identification problem to be
studied in this section consists in establishing restrictions (if necessary) under which the mapping
(β 1:J , c1:J , µ, σ) 7−→ P [Y 1 = y 1 | β 1:J , c1:J , µ, σ]
is injective for all y 1 ∈ {0, 1}J , where P [Y 1 = y 1 | β 1:J , c1:J , µ, σ] is given by (3.2).
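As a concrete illustration of this mapping, the following sketch evaluates the marginal probability (3.2) by Gauss–Hermite quadrature, assuming that G is standard normal (so the abilities are N(0, σ²), with µ = 0); the item values are hypothetical. The 2^J pattern probabilities sum to one, which is the multinomial structure exploited in the next subsection:

```python
import numpy as np
from itertools import product

def marginal_pattern_prob(y, beta, c, sigma, n_nodes=60):
    """Marginal probability (3.2) of a response pattern y under the 1PL-G
    model, integrating theta ~ N(0, sigma^2) out by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    theta = np.sqrt(2.0) * sigma * nodes     # change of variables
    w = weights / np.sqrt(np.pi)
    p = c[:, None] + (1 - c[:, None]) / (1 + np.exp(-(theta[None, :] - beta[:, None])))
    like = np.prod(np.where(np.array(y)[:, None] == 1, p, 1 - p), axis=0)
    return float(np.dot(like, w))

beta = np.array([0.0, 0.8, -0.5]); c = np.array([0.0, 0.2, 0.3]); sigma = 1.3
probs = {y: marginal_pattern_prob(y, beta, c, sigma) for y in product([0, 1], repeat=3)}
print(sum(probs.values()))   # ~1.0: the 2^J patterns form a multinomial
```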
3.2 Identification strategy
Following San Martı́n et al. (2009), an identification strategy consists in distinguishing between parameters of interest and identified parameters. In the case of the statistical model (3.2), the parameters of
interest are (β 1:J , c1:J , µ, σ). The probabilities of the 2^J different possible patterns are given by

q_{12···I} = P[Y11 = 1, . . . , Y1,J−1 = 1, Y1J = 1 | β 1:J , c1:J , µ, σ]
q_{12···Ī} = P[Y11 = 1, . . . , Y1,J−1 = 1, Y1J = 0 | β 1:J , c1:J , µ, σ]
. . .
q_{1̄2̄···Ī} = P[Y11 = 0, . . . , Y1,J−1 = 0, Y1J = 0 | β 1:J , c1:J , µ, σ].
The statistical model (3.2) corresponds, therefore, to a multinomial distribution (Y 1 | π) ∼ Mult(2^J , π), where π = (q_{12···I} , q_{12···(I−1)Ī} , . . . , q_{1̄2̄···Ī} ). It is known that the parameter π of a multinomial distribution is identified by Y 1 . We accordingly term the parameter π the identified parameter; the q’s with fewer than J subscripts are linear combinations of its components and, therefore, are identified parameters. Consequently, the identifiability of the parameters of interest follows if an injective relation between them and functions of the identified parameter π is established. This strategy is followed in the rest of this paper. Taking into account that a statistical model always involves an identified parametrization (see Florens et al., 1990; San Martı́n et al., 2009), the restrictions which are introduced to establish an injective relationship between the parameters of interest and the identified parameters are not only sufficient identification conditions, but also necessary conditions.
3.3 Identification when G is known up to a scale parameter
Let us begin the identification analysis of the 1PL-G model when the distribution G generating the
individual abilities is known up to the scale parameter σ. The corresponding identification analysis is
divided into three steps:
STEP 1: It is shown that the difficulty parameters β 1:J are a function of the scale parameter σ, the guessing parameters c1:J and the identified parameters P[Yij = 0 | β 1:J , c1:J , σ], with j = 1, . . . , J.

STEP 2: It is shown that the scale parameter σ is a function of the guessing parameters c1 and c2 , as well as of the identified parameters P[Yi1 = 1, Yi2 = 1 | β 1:J , c1:J , σ], P[Yi1 = 1 | β 1:J , c1:J , σ] and P[Yi2 = 1 | β 1:J , c1:J , σ].

STEP 3: It is finally shown that the difficulty parameters β 2:J and the guessing parameters c2:J are functions of (β1 , c1 ) and identified parameters which in turn depend on π.

Combining these steps, the identification of (β 1:J , c 2:J , σ) by both the observations and c1 is obtained. Therefore, one identification restriction is needed, namely c1 = 0. Let us mention that both STEP 1 and STEP 2 are valid for all conditional specifications of the form

P[Yij = 1 | θi , βj , cj ] = cj + (1 − cj ) F(θi − βj ),   (3.4)

where F is a strictly increasing continuous distribution function with a positive density function on R, and not only for the logistic distribution Ψ. However, STEP 3 depends on the logistic function Ψ. In what follows, these steps are duly detailed.
3.3.1 Step 1 for the parametric 1PL-G model
Let

ωj := P[Yij = 0 | β 1:J , c1:J , σ] = δj p(σ, βj ),   (3.5)

where

p(σ, βj ) = ∫_R {1 − F(σθ − βj )} G(dθ),   (3.6)

with F a strictly increasing continuous distribution function, and δj := 1 − cj . The parameter ωj is
identified because it is a function of the identified parameter π; see Section 3.2.
The function p(σ, βj ) is a continuous function in (σ, βj ) ∈ R+ × R that is strictly increasing in β ∈ R
because F is a strictly increasing continuous function (in particular, the logistic distribution Ψ satisfies
these properties). Furthermore, p(σ, −∞) = 0 and p(σ, +∞) = 1 and, consequently,

0 ≤ ωj ≤ δj for all j = 1, . . . , J.   (3.7)

Therefore, if we define

p̄(σ, ϵ) := inf{β : p(σ, β) > ϵ},   (3.8)

it is clear that

p̄[σ, p(σ, β)] = β for all β.   (3.9)
Taking into account relation (3.7), it follows that βj = p̄ [σ, ωj /δj ]; that is, for all j = 1, . . . , J, the item
parameter βj is a function of the scale parameter σ, of the identified parameter ωj and of the non-guessing
parameter δj (and, by extension, of the guessing parameter cj ).
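STEP 1 can be mimicked numerically: the sketch below (assuming F logistic, G standard normal and hypothetical parameter values) computes p(σ, β) by quadrature and recovers βj by inverting the strictly increasing map β → p(σ, β):

```python
import numpy as np
from scipy.optimize import brentq

nodes, weights = np.polynomial.hermite.hermgauss(80)
w = weights / np.sqrt(np.pi)

def p(sigma, beta):
    """p(sigma, beta) of (3.6) with F logistic and G standard normal."""
    theta = np.sqrt(2.0) * nodes
    return float(np.dot(1.0 - 1.0 / (1.0 + np.exp(-(sigma * theta - beta))), w))

# Given sigma, delta_j and the identified parameter omega_j of (3.5),
# beta_j = pbar(sigma, omega_j / delta_j) is found by root search.
sigma, beta_j, delta_j = 1.3, 0.7, 0.8
omega_j = delta_j * p(sigma, beta_j)
beta_hat = brentq(lambda b: p(sigma, b) - omega_j / delta_j, -20.0, 20.0)
print(beta_j, beta_hat)   # equal up to quadrature/root-finding tolerance
```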
3.3.2 Step 2 for the parametric 1PL-G model
Let

ω12 := P[Yi1 = 0, Yi2 = 0 | β 1:J , c1:J , σ] = δ1 δ2 ∫_R {1 − F(σθ − β1 )} {1 − F(σθ − β2 )} G(dθ).   (3.10)
Using STEP 1, the identified parameter ω12 can be written as a function of σ, δ1 , δ2 , ω1 and ω2 in the following way:

ω12 = φ(σ, δ1 , δ2 , ω1 , ω2 ) := δ1 δ2 ∫_R {1 − F[σθ − p̄(σ, ω1 /δ1 )]} {1 − F[σθ − p̄(σ, ω2 /δ2 )]} G(dθ).
Now, if the distribution function F has a continuous density function f strictly positive on R, then it can be shown that ω12 = φ(σ, δ1 , δ2 , ω1 , ω2 ) is a strictly increasing continuous function of σ and, therefore, σ = φ̄(ω12 , δ1 , δ2 , ω1 , ω2 ), where

φ̄(ω, δ1 , δ2 , ω1 , ω2 ) := inf{σ : φ(σ, δ1 , δ2 , ω1 , ω2 ) > ω}.

In other words, σ becomes a function of the identified parameters ω1 , ω2 and ω12 , as well as of the non-guessing parameters δ1 and δ2 . The details are developed in Appendix A.
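The same numerical strategy extends to STEP 2 (again assuming F logistic, G standard normal and hypothetical values): since φ is strictly increasing in σ, a one-dimensional root search over σ recovers the scale parameter from the identified parameters ω1, ω2 and ω12:

```python
import numpy as np
from scipy.optimize import brentq

nodes, weights = np.polynomial.hermite.hermgauss(80)
w = weights / np.sqrt(np.pi)
F = lambda x: 1.0 / (1.0 + np.exp(-x))   # logistic cdf

def p(sigma, beta):
    theta = np.sqrt(2.0) * nodes
    return float(np.dot(1.0 - F(sigma * theta - beta), w))

def pbar(sigma, eps):   # inverse of beta -> p(sigma, beta), cf. (3.8)
    return brentq(lambda b: p(sigma, b) - eps, -30.0, 30.0)

def phi(s, d1, d2, w1, w2):   # omega_12 as a function of sigma, cf. STEP 2
    theta = np.sqrt(2.0) * nodes
    g1 = 1.0 - F(s * theta - pbar(s, w1 / d1))
    g2 = 1.0 - F(s * theta - pbar(s, w2 / d2))
    return d1 * d2 * float(np.dot(g1 * g2, w))

sigma, b1, b2, d1, d2 = 1.3, 0.0, 0.8, 1.0, 0.8    # hypothetical truth
w1, w2 = d1 * p(sigma, b1), d2 * p(sigma, b2)      # identified parameters
w12 = phi(sigma, d1, d2, w1, w2)                   # identified parameter
# phi is strictly increasing in sigma (Appendix A), so a root search works:
sigma_hat = brentq(lambda s: phi(s, d1, d2, w1, w2) - w12, 0.05, 10.0)
print(sigma, sigma_hat)
```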
3.3.3 Step 3 for the parametric 1PL-G model
This step essentially depends on the logistic distribution Ψ and, consequently, the arguments below are
performed using the conditional probability (2.2). Let J ≥ 3, j ̸= 1 and k ̸= j (with k, j ≤ J). Define the identified parameters p0^J , pj^J and pjk^J as follows:


p0^J := P[⋂_{1≤j≤J} {Yij = 0} | β 1:J , δ 1:J , σ] = ∏_{1≤j≤J} δj × I0^J (β 1:J , σ),

where I0^J (β 1:J , σ) = ∫_R ∏_{1≤j≤J} [exp(βj )/(exp(βj ) + exp(σθ))] G(dθ);

pj^J := P[⋂_{1≤k≤J, k̸=j} {Yik = 0} | β 1:J , δ 1:J , σ] = ∏_{1≤k≤J, k̸=j} δk × {I0^J (β 1:J , σ) + e^{−βj} I1^J (β 1:J , σ)},

where I1^J (β 1:J , σ) = ∫_R e^{σθ} ∏_{1≤j≤J} [exp(βj )/(exp(βj ) + exp(σθ))] G(dθ); and

pjk^J := P[⋂_{1≤r≤J, r̸=j, r̸=k} {Yir = 0} | β 1:J , δ 1:J , σ] = ∏_{1≤r≤J, r̸=j, r̸=k} δr × {I0^J (β 1:J , σ) + (e^{−βj} + e^{−βk}) I1^J (β 1:J , σ) + e^{−βj} e^{−βk} I2^J (β 1:J , σ)},

where I2^J (β 1:J , σ) = ∫_R e^{2σθ} ∏_{1≤j≤J} [exp(βj )/(exp(βj ) + exp(σθ))] G(dθ). Assuming that, for every j = 1, . . . , J, δj > 0 and βj ∈ (−∞, ∞), it follows that

pj^J /p0^J = 1/δj + (e^{−βj}/δj ) g(β 1:J , σ),

pjk^J /p0^J = [1/(δj δk )] {1 + (e^{−βj} + e^{−βk}) g(β 1:J , σ) + e^{−βj} e^{−βk} h(β 1:J , σ)},

where

g(β 1:J , σ) = I1^J (β 1:J , σ)/I0^J (β 1:J , σ)   and   h(β 1:J , σ) = I2^J (β 1:J , σ)/I0^J (β 1:J , σ).
Using these definitions, the following three propositions are established:
Proposition 3.1 Let J ≥ 3 and j ̸= k. Then

rjk^J := pjk^J /p0^J − (pj^J pk^J )/(p0^J p0^J ) = [e^{−βj} e^{−βk}/(δj δk )] k(β 1:J , σ),

where k(β 1:J , σ) = h(β 1:J , σ) − g(β 1:J , σ)² ≥ 0.

The non-negativity of k(β 1:J , σ) follows after noticing that g(β 1:J , σ) = E_{G^{β1:J ,σ}}(e^{σθ}) and h(β 1:J , σ) = E_{G^{β1:J ,σ}}(e^{2σθ}), with

G^{β1:J ,σ}(dθ) := [1/I0^J (β 1:J , σ)] ∏_{1≤j≤J} [exp(βj )/(exp(βj ) + exp(σθ))] G(dθ).
Proposition 3.2 Let J ≥ 3 and j ̸= k, with j ̸= 1. Then

uj := r1k^J /rjk^J = (δj /δ1 ) e^{βj −β1},

and, consequently, uj > 0. Moreover, {uj : 2 ≤ j ≤ J} are identified parameters.
Proposition 3.3 Let J ≥ 3 and j ̸= 1. Then

vj := (pj^J /p0^J ) uj − p1^J /p0^J = (1/δ1 )(e^{βj −β1} − 1),

and, consequently, vj ∈ R. Moreover, {vj : 2 ≤ j ≤ J} are identified parameters.
Propositions 3.2 and 3.3 entail the following identities: for j = 2, . . . , J,

(i) δj = uj δ1 /(vj δ1 + 1),   (ii) βj = β1 + ln(vj δ1 + 1).   (3.11)

These identities require that, for each j = 2, . . . , J, vj δ1 + 1 > 0. By Proposition 3.3, this inequality is equivalent to e^{βj −β1} > 0, which is always true. So, the inequality vj δ1 + 1 > 0 does not restrict the sampling process.
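Propositions 3.1–3.3 and the identities (3.11) admit a direct numerical check. The sketch below (assuming F logistic, G standard normal and hypothetical parameter values; code index 0 plays the role of item 1) computes p0^J, pj^J and pjk^J by quadrature, forms uj and vj, and recovers δj and βj through (3.11):

```python
import numpy as np

nodes, weights = np.polynomial.hermite.hermgauss(80)
w = weights / np.sqrt(np.pi)
theta = np.sqrt(2.0) * nodes             # G standard normal

beta = np.array([0.0, 0.9, -0.4])        # index 0 is the paper's item 1
delta = np.array([1.0, 0.75, 0.6])       # delta_1 = 1, i.e. c_1 = 0
sigma = 1.2

def q0(items):
    """P[all items in `items` answered incorrectly]: the integrand is the
    product of delta_j * e^{beta_j} / (e^{beta_j} + e^{sigma*theta})."""
    f = np.ones_like(theta)
    for j in items:
        f *= delta[j] * np.exp(beta[j]) / (np.exp(beta[j]) + np.exp(sigma * theta))
    return float(np.dot(f, w))

p0 = q0([0, 1, 2])
p = {j: q0([m for m in range(3) if m != j]) for j in range(3)}
r = lambda j, k: (q0([m for m in range(3) if m not in (j, k)]) / p0
                  - p[j] * p[k] / p0**2)   # r_jk of Proposition 3.1
j, k = 1, 2                                # paper indices j = 2, k = 3
u_j = r(0, k) / r(j, k)                    # Proposition 3.2
v_j = (p[j] / p0) * u_j - p[0] / p0        # Proposition 3.3
print(delta[j], u_j * delta[0] / (v_j * delta[0] + 1))   # (3.11.i)
print(beta[j], beta[0] + np.log(v_j * delta[0] + 1))     # (3.11.ii)
```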
3.3.4 Main result for the parametric 1PL-G model
The previous results allow us to show that (β 1:J , c 2:J , σ) are identified by Y 1 provided c1 (or, equivalently, δ1 ) is fixed. As a matter of fact:
1. STEP 1, STEP 2 and (3.11) ensure that

βj = β1 + ln(vj δ1 + 1)
   = p̄(σ, ω1 /δ1 ) + ln(vj δ1 + 1)
   = p̄(φ̄(ω12 , δ1 , δ2 , ω1 , ω2 ), ω1 /δ1 ) + ln(vj δ1 + 1)
   = p̄(φ̄(ω12 , δ1 , u2 , v2 , ω1 , ω2 ), ω1 /δ1 ) + ln(vj δ1 + 1).

That is, for each j = 2, . . . , J, βj is a function of both δ1 and (ω12 , ω1 , ω2 , u2 , v2 , vj ).

2. STEP 2 implies that σ is a function of (δ1 , δ2 , ω1 , ω2 , ω12 ). Applying equality (3.11.i) for j = 2, it follows that σ is a function of (δ1 , u2 , v2 , ω1 , ω2 , ω12 ).

3. STEP 1 implies that β1 is a function of (δ1 , σ, ω1 ). Using the fact that σ is a function of (δ1 , u2 , v2 , ω1 , ω2 , ω12 ), it follows that β1 is a function of (δ1 , u2 , v2 , ω1 , ω2 , ω12 ).

4. Finally, equality (3.11.i) implies that δ 2:J is a function of δ1 and {(uj , vj ) : 2 ≤ j ≤ J}.
We summarize these findings in the following theorem:
Theorem 3.1 For the statistical model (3.2) induced by both the 1PL-G model (2.2) and the abilities
distributed according to a distribution G known up to the scale parameter σ, the parameters of interest
(β 1:J , c 2:J , σ) are identified by Y 1 provided that
1. At least three items are available.
2. The guessing parameter c1 is fixed at 0.
3. cj > 0 for every j = 2, . . . , J.
Moreover, the specification of the model entails that
cj ≤ P [Yij = 1 | β 1:J , c1:J , σ] for every j = 2, . . . , J.
Condition (3) of Theorem 3.1 suggests that the 1PL model is not nested in the 1PL-G model. However, there exists a relationship between these two models. As a matter of fact, under the identification restriction c1 = 0 (equivalently, δ1 = 1), the marginal probability that a person i correctly answers item 1 is given (see relations (3.5) and (3.6)) by
P[Yi1 = 1 | β 1:J , c1:J , σ] = ∫_R [exp(σθ − β1 )/(1 + exp(σθ − β1 ))] G(dθ).
This means that the test (or measurement instrument) must contain an item (labeled with the index 1) such
that each person answers it without guessing. In other words, the identification restriction c1 = 0 implies
a restriction on the design of the test or measurement instrument, an aspect that the practitioner should
carefully consider. A relevant question is the following: how can it be ensured that each person answers a specific item without guessing? By including in the test an open question: each person answers it correctly or incorrectly, but it is not possible to choose an answer by guessing.¹

¹ This suggestion is due to Paul De Boeck.
It should be remarked that βj = β1 for every j = 2, . . . , J if and only if vj = 0 for every j = 2, . . . , J.
In this case, (β1 , δ 2:J , σ) are identified provided that δ1 is fixed at 1. Similarly, δj = δ1 for every
j = 2, . . . , J if and only if vj δ1 + 1 = uj for every j = 2, . . . , J. This last equality implies that δ1 is
identified and, therefore, (β 1:J , δ1 , σ) are identified without additional identification restrictions. Let us
summarize these remarks in the following corollary:
Corollary 3.1 Consider the statistical model (3.2) induced by both the 1PL-G model (2.2) and the abilities distributed according to a distribution G known up to the scale parameter σ.
1. If the items have a common difficulty parameter β1 , then (β1 , c2:J , σ) are identified by one observation if c1 is fixed at 0.
2. If the items have a common guessing parameter c1 , then (β 1:J , c1 , σ) are identified by one observation.
3.3.5 Identification when G is known up to both a location and a scale parameter
If the distribution G generating the abilities is known up to both a location parameter µ and a scale
parameter σ, it is then necessary to impose an identification restriction on the difficulty parameters. As a
matter of fact, let Gµ,σ be a probability distribution given by (3.1). Relation (3.5) is, therefore, rewritten
as
ω̃j := P[Yij = 0 | β 1:J , δ 1:J , σ, µ] = δj ∫_R {1 − F(σθ + µ − βj )} G(dθ)

for all j = 1, . . . , J. Since F is a strictly increasing continuous function, we have that, for all j = 1, . . . , J, βj − µ is a function of (σ, ω̃j , δj ). Following the arguments developed in Sections 3.3.2 and 3.3.3, it follows that (β1 − µ, . . . , βJ − µ, c2 , . . . , cJ , σ) is identified by the observations given c1 . Therefore, under a restriction of the form a′β 1:J = 0 such that 1′J a ̸= 0, with a ∈ R^J known, the parameters (β 1:J , c2:J , µ, σ) are identified by the observations. Typical choices are a = 1J , which leads to restricting the difficulty parameters as ∑_{j=1}^{J} βj = 0; or a = e1 (the first canonical vector of R^J ), which leads to imposing β1 = 0. Summarizing, we establish the following corollary:
Corollary 3.2 For the statistical model (3.2) induced by both the 1PL-G model (2.2) and the abilities
distributed according to a distribution G known up to both the location parameter µ and the scale
parameter σ, the parameters of interest (β 1:J , c 2:J , µ, σ) are identified by Y i provided that
1. At least three items are available.
2. The guessing parameter c1 is fixed at 0.
3. cj > 0 for every j = 2, . . . , J.
4. a′β 1:J = 0 such that 1′J a ̸= 0, with a ∈ R^J known.
Moreover, the specification of the model entails that
cj ≤ P [Yij = 1 | β 1:J , c1:J , σ],
j = 2, . . . , J.
It should be remarked that the identification restriction imposed on β 1:J implies, in particular, that
β1 ̸= βj for each j = 2, . . . , J.
4 Identification of the Semi-parametric 1PL-G Model
Many IRT models are fitted under the assumption that θi is normally distributed with an unknown variance. The use of a distribution for θi (as opposed to treating θi as a fixed effect) is a key feature of the widely used marginal maximum likelihood method. The normal distribution is convenient to work with, especially because it is available in statistical packages such as SAS (PROC NLMIXED) or R (lme4). However, as pointed out by Woods and Thissen (2006) and Woods (2006), there exist specific fields, such as personality and psychopathology, in which the normality assumption is not realistic (for references, see Woods (2006)). In these fields, it could be argued that psychopathology and personality variables are likely to be positively skewed, because most persons in the general population have low pathology, and fewer persons have severe pathology. However, the distribution G of θi is unobservable and, consequently, though a researcher may hypothesize about it, it is not known in advance of an analysis. Therefore, any a priori parametric restriction on the shape of the distribution G could be a misspecification.
These considerations lead to extending parametric IRT models by considering the distribution G as a parameter of interest and, therefore, to estimating it by using nonparametric techniques. Besides the contributions of Woods and Thissen (2006) and Woods (2006, 2008), Bayesian non-parametric methods applied to IRT models should also be mentioned; see, among others, Roberts and Rosenthal (1998); Karabatsos and Walker (2009); Miyazaki and Hoshino (2009). In spite of these developments, it is relevant to investigate whether the item parameters as well as the distribution G of an IRT model (particularly the 1PL-G model) are identified or not by the observations. If such parameters of interest are identified, then a semi-parametric extension of the 3PL model actually provides greater flexibility than does the assumption of a nonnormal parametric form for G. In this section, we consider the identification problem of a semi-parametric extension of the 1PL-G model, which can be viewed as a particular case of the semi-parametric 3PL model.
4.1 Semi-parametric specification of the 1PL-G model
A semi-parametric 1PL-G model is obtained after substituting the parametric hypothesis (3.1) by the following hypothesis:

(θi | G) ∼ iid G,   (4.1)

where G is a probability distribution on (R, B). The rest of the model structure is as specified in Section 3.1.
The statistical model is obtained after integrating out the random effects θi ’s. The response patterns Y 1 , . . . , Y N are mutually independent conditionally on (β 1:J , c1:J , G). The common distribution of Y i is given by

P[Y i = y | β 1:J , c1:J , G] = ∫_R ∏_{j=1}^{J} {cj yj + (1 − cj ) exp[yj (θ − βj )]/(1 + exp(θ − βj ))} G(dθ),   (4.2)

where y ∈ {0, 1}^J . Consequently, the parameters of interest and the corresponding parameter space are

(β 1:J , c1:J , G) ∈ R^J × [0, 1]^J × P(R, B).
4.2 Identification analysis under a finite quantity of items
Similarly to the parametric 1PL-G model (see Section 3.2), the statistical model induced by the semi-parametric 1PL-G model corresponds to a multinomial distribution Mult(2^J , π), where the identified parameter π = (q_{12···I} , q_{12···(I−1)Ī} , . . . , q_{1̄2̄···Ī} ) is given by

q_{12···I} = P[Y11 = 1, . . . , Y1,J−1 = 1, Y1J = 1 | β 1:J , c1:J , G]
q_{12···Ī} = P[Y11 = 1, . . . , Y1,J−1 = 1, Y1J = 0 | β 1:J , c1:J , G]
. . .
q_{1̄2̄···Ī} = P[Y11 = 0, . . . , Y1,J−1 = 0, Y1J = 0 | β 1:J , c1:J , G].   (4.3)
These marginal probabilities are of the form (4.2).
The parameters of interest (β 1:J , c1:J , G) become identified if they can be written as functions of π.
The identification analysis developed in the context of the parametric 1PL-G actually provides insight
for the identification of (β 1:J , c1:J , G). As a matter of fact, in the parametric case, the identification
analysis was based on three steps, each of them involving specific features:

1. STEPS 1 and 2 essentially depend on the parameter σ indexing the distribution generating the random effects θi ’s. More specifically, the item parameters β 1:J are written as a function of c1:J and σ, whereas σ is written as a function of c1 and c2 ; see Sections 3.3.1 and 3.3.2.

2. STEP 3 first establishes that the item parameters (β 2:J , c2:J ) can be written as a function of β1 and c1 . Second, using STEPS 1 and 2, it is concluded that (β 1:J , c2:J , σ) is a function of c1 ; see Section 3.3.3.

Thus, STEPS 1 and 2 critically depend on the parametric hypothesis which is assumed on the distribution generating the random effects θi ’s, whereas a part of STEP 3 does not depend on it. This suggests that STEP 3 can be used in the identification analysis of (β 1:J , c1:J , G).
4.2.1 Identification of the item parameters
More precisely, let J ≥ 3, j ̸= 1 and k ̸= j (with k, j ≤ J). Similarly to Section 3.3.3, define the identified parameters p0^J , pj^J and pjk^J as follows:

p0^J := P[⋂_{1≤j≤J} {Yij = 0} | β 1:J , δ 1:J , G] = ∏_{1≤j≤J} δj × I0^J (β 1:J , G);

pj^J := P[⋂_{1≤k≤J, k̸=j} {Yik = 0} | β 1:J , δ 1:J , G] = ∏_{1≤k≤J, k̸=j} δk × {I0^J (β 1:J , G) + e^{−βj} I1^J (β 1:J , G)};

pjk^J := P[⋂_{1≤r≤J, r̸=j, r̸=k} {Yir = 0} | β 1:J , δ 1:J , G] = ∏_{1≤r≤J, r̸=j, r̸=k} δr × {I0^J (β 1:J , G) + (e^{−βj} + e^{−βk}) I1^J (β 1:J , G) + e^{−βj} e^{−βk} I2^J (β 1:J , G)};

where δ1 := 1 − c1 and

I0^J (β 1:J , G) = ∫_R ∏_{1≤j≤J} [exp(βj )/(exp(βj ) + exp(θ))] G(dθ);
I1^J (β 1:J , G) = ∫_R e^{θ} ∏_{1≤j≤J} [exp(βj )/(exp(βj ) + exp(θ))] G(dθ);
I2^J (β 1:J , G) = ∫_R e^{2θ} ∏_{1≤j≤J} [exp(βj )/(exp(βj ) + exp(θ))] G(dθ).
Assuming that δj > 0 and βj ∈ (−∞, ∞) for each j = 2, . . . , J, it follows, by the same arguments developed in Section 3.3.3, that

δj = uj δ1 /(vj δ1 + 1),   βj = β1 + ln(vj δ1 + 1),   j = 2, . . . , J,   (4.4)

where uj and vj are defined as in Propositions 3.2 and 3.3. Taking into account that the uj ’s and vj ’s are identified parameters, the following theorem can be established:
Theorem 4.1 For the statistical model (4.2) induced by the semi-parametric 1PL-G model, the item parameters (β 2:J , c2:J ) are identified by Y 1 provided that

1. At least three items are available.
2. β1 = 0 and c1 = 0.
3. cj > 0 for each j = 2, . . . , J.

Moreover, under these identification restrictions, the item parameters can be expressed in terms of marginal probabilities; that is,

(i) βj = ln(vj + 1) = ln[(pj^J p1k^J − p1^J pjk^J )/(p0^J pjk^J − pj^J pk^J ) + 1],

(ii) cj = 1 − uj /(vj + 1) = 1 − (p0^J p1k^J − p1^J pk^J )/[pj^J (p1k^J − pk^J ) + pjk^J (p0^J − p1^J )],   (4.5)

for some k ̸= 1, j.
It should be mentioned that equalities (4.5) provide an explicit statistical meaning for the item parameters in terms of the data generating process. However, it is not possible to explain them in narrative terms.
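Nonetheless, (4.5) can be exercised numerically. In the sketch below, G is a deliberately non-normal (right-skewed) two-component normal mixture, integrated by Gauss–Hermite quadrature applied to each component; the mixture and the item values are hypothetical. Under β1 = 0 and c1 = 0, the closed forms (4.5) recover βj and cj from marginal probabilities alone:

```python
import numpy as np

nodes, weights = np.polynomial.hermite.hermgauss(80)
w = weights / np.sqrt(np.pi)
comps = [(2/3, -0.5, 0.7), (1/3, 1.5, 0.4)]   # (weight, mean, sd): skewed G

beta = np.array([0.0, 1.1, -0.6])   # identification restriction: beta_1 = 0
c = np.array([0.0, 0.25, 0.15])     # identification restriction: c_1 = 0
delta = 1 - c

def q0(items):
    """P[all items in `items` answered incorrectly] under the model (4.2)."""
    total = 0.0
    for pi, mu, sd in comps:
        theta = mu + np.sqrt(2.0) * sd * nodes
        f = np.ones_like(theta)
        for j in items:
            f *= delta[j] * np.exp(beta[j]) / (np.exp(beta[j]) + np.exp(theta))
        total += pi * np.dot(f, w)
    return total

p0 = q0([0, 1, 2])
p = {j: q0([m for m in range(3) if m != j]) for j in range(3)}
pjk = {(j, k): q0([m for m in range(3) if m not in (j, k)])
       for j, k in [(0, 1), (0, 2), (1, 2)]}

j, k = 1, 2   # code index 0 is the paper's item 1
vj = (p[j] * pjk[(0, k)] - p[0] * pjk[(j, k)]) / (p0 * pjk[(j, k)] - p[j] * p[k])
uj = (p0 * pjk[(0, k)] - p[0] * p[k]) / (p0 * pjk[(j, k)] - p[j] * p[k])
print(beta[j], np.log(vj + 1))      # (4.5.i): recovers beta_j
print(c[j], 1 - uj / (vj + 1))      # (4.5.ii): recovers c_j
```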
According to Theorem 4.1, the marginal probability that a person i correctly answers the standard item 1 is given by

P[Yi1 = 1 | β 1:J , c1:J , G] = ∫_R [exp(θ)/(1 + exp(θ))] G(dθ).
As in the parametric case (see Section 3.3.4), this means that the test (or measurement instrument) must
contain an item (labeled with 1) such that each person answers it without guessing. Finally, according to
Theorem 4.1, the guessing parameters cj ’s are strictly positive. This suggests that the semi-parametric
1PL model is not nested in the semi-parametric 1PL-G model.
4.2.2 Is the distribution G identified?
Let us now consider the possible identification of the distribution G generating the random effects. Using relationship (4.4), the conditional probability that a person i answers item j incorrectly can be rewritten as follows:

P[Yij = 0 | βj , δj , θi ] = δj exp(βj )/(exp(βj ) + exp(θi )) = uj /[vj + 1/δ1 + exp(θi )/(δ1 exp(β1 ))].   (4.6)
Let K ⊂ {1, . . . , J}. Then the following 2^J − 1 equations are available to analyze the identifiability of the distribution G:

pK := P[⋂_{j∈K} {Yij = 0} | β 1:J , c1:J , G] = ∫_R ∏_{j∈K} {uj /[vj + 1/δ1 + exp(θ)/(δ1 exp(β1 ))]} G(dθ)
   = ∏_{j∈K} uj × ∫_R G(dθ)/∏_{j∈K} [vj + 1/δ1 + exp(θ)/(δ1 exp(β1 ))].
exp(θ)
It should be mentioned that the information provided by the 2^J − 1 marginal probabilities pK ’s is exactly the same as the information provided by the marginal probabilities q’s as defined in (4.3). Taking into account that the uj ’s as well as the pK ’s are identified, and using the identification restrictions established in Theorem 4.1, it follows that, for every subset K ⊂ {1, . . . , J}, except the empty set,

mG (K) := ∫_R G(dθ)/∏_{j∈K} [vj + 1 + exp(θ)]   (4.7)

is identified by the observations. Denote {mG (K) : K ⊂ {1, . . . , J} \ ∅} by mG .
The question is to know whether the identifiability of these 2^J − 1 functionals ensures the identifiability of G; that is, if two distributions G1 and G2 on (R, B) satisfy mG1 = mG2 , is it true that G1 = G2 ? We argue that those 2^J − 1 equations are far from being enough to identify G. As a matter of fact, let us suppose that G has a density function g with respect to a σ-finite measure λ on (R, B), that is, g = dG/dλ. Suppose furthermore that g ∈ L²(R, B, λ). Then (4.7) can equivalently be written as

mg (K) = ∫_R g(θ)dλ(θ)/∏_{j∈K} [vj + 1 + exp(θ)]   ∀ K ⊂ {1, . . . , J}, with K ̸= ∅.   (4.8)
Since vj > −1 for each j, it follows that

0 < ∏_{j∈K} 1/[vj + 1 + exp(θ)] ≤ c   ∀ K ⊂ {1, . . . , J}, with K ̸= ∅,

where c is a real constant. Therefore,

fK (θ) := ∏_{j∈K} 1/[vj + 1 + exp(θ)] ∈ L²(R, B, λ)   for each K ⊂ {1, . . . , J}, with K ̸= ∅.
Define the linear functional T : L²(R, B, λ) −→ R^{2^J −1} as T g = mg , where mg = (mg ({1}), mg ({2}), . . . , mg ({1, . . . , J})) and mg (K) is defined by (4.8) for each set K. Thus, the identifiability of g can be written as follows:

T g1 = T g2 =⇒ g1 = g2 .
Taking into account that T is linear, we conclude that g is identified if and only if

T g = 0 =⇒ g = 0,

that is, if and only if Ker(T ) = {0}. If this were the case, then L²(R, B, λ) would be of finite dimension. As a matter of fact, as is well known, (L²(R, B, λ), (·, ·)), with the inner product (·, ·) defined as

(f, h) = ∫_R f (θ)h(θ) dλ(θ),   for f, h ∈ L²(R, B, λ),

is a Hilbert space. Now, let g ∈ L²(R, B, λ) be such that T g = 0. It follows that g is orthogonal to each fK ; that is,

(g, fK ) = 0   ∀ K ⊂ {1, . . . , J}, with K ̸= ∅

and, therefore, g is orthogonal to N , the span generated by {fK : K ⊂ {1, . . . , J}, K ̸= ∅}. Since N is of finite dimension, it is therefore closed in L²(R, B, λ). Taking into account that g = 0 if and only if (g, f ) = 0 for all f ∈ L²(R, B, λ) (see Theorem 1, Section 4, in Halmos (1951)), Ker(T ) = {0} would force N = L²(R, B, λ), which is impossible because L²(R, B, λ) is an infinite-dimensional linear space.
Summarizing, we obtain the following theorem:
Theorem 4.2 For the statistical model (4.2) induced by the semi-parametric 1PL-G model, assume that
there are at least three items and that cj > 0 for each j = 2, . . . , J. Then the (2J − 1)-dimensional
vector mG is identified by the observations provided that β1 and c1 are fixed, but the distribution G
generating the individual abilities is not identified.
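The non-identifiability asserted in Theorem 4.2 can be made tangible with a discretized sketch: take λ to be Lebesgue measure on [−8, 8], J = 3 items with hypothetical vj values, and perturb a density inside the orthogonal complement of the (finitely many) functionals mg (K) plus the normalization constraint. The two resulting densities differ, yet agree on every mG (K):

```python
import numpy as np

theta = np.linspace(-8.0, 8.0, 4001)
w = np.full_like(theta, theta[1] - theta[0]); w[[0, -1]] /= 2  # trapezoid rule
v = np.array([0.6, 1.4, 2.3])                                  # hypothetical v_j

subsets = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
rows = [np.prod([1.0 / (v[j] + 1 + np.exp(theta)) for j in K], axis=0)
        for K in subsets]
rows.append(np.ones_like(theta))      # normalization as one more functional
R = np.vstack(rows)

# Base density: normal mixed with a uniform, hence bounded away from zero.
g1 = 0.9 * np.exp(-theta**2 / 2) / np.sqrt(2 * np.pi) + 0.1 / 16.0
g1 /= np.dot(g1, w)

# Project an arbitrary perturbation onto the orthogonal complement of the
# 8 linear functionals, then add a small multiple of it to g1.
pert = np.sin(theta) * np.exp(-theta**2 / 8)
Rw = R * w
coef = np.linalg.solve(Rw @ R.T, Rw @ pert)
q = pert - R.T @ coef                 # now (R * w) @ q == 0
eps = 0.5 * g1.min() / np.abs(q).max()
g2 = g1 + eps * q                     # a different nonnegative density

print(np.abs(Rw @ g1 - Rw @ g2).max())   # ~0: identical m_G(K) for every K
print(np.abs(g1 - g2).max() > 0)         # True: yet g1 != g2
```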
4.3 Identification analysis under an infinite quantity of items
The identification arguments previously developed, either in the parametric case or in the semi-parametric case, have a common feature, namely writing the parameters of interest as functions of identified parameters. Now the problem consists in writing the distribution G as a function of identified parameters when an infinite number of items is available. This type of relationship can be developed in a Bayesian framework because, in such a framework, the concept of identification reduces to a condition of measurability in the sense that a parameter is said to be Bayesian identified if and only if it is a measurable function of countably many sampling expectations of statistics (that is, expectations of functions of the data conditionally on the parameters); for details, see Appendix B.
According to Theorem 4.2, for every K ⊂ {1, . . . , J}, with K ̸= ∅, the following functionals are identified by one observation:

mG (K) = ∫_R G(dθ)/∏_{j∈K} [vj + δ1^{-1} + δ1^{-1} exp(θ − β1 )] = ∫_{δ1^{-1}}^{∞} G^{β1 ,δ1}(dx)/∏_{j∈K} [vj + x],

where

X = 1/δ1 + exp(θ)/(δ1 exp(β1 ))

and, therefore,

G^{β1 ,δ1}(x) = G[β1 + ln(δ1 x − 1)].   (4.9)

Note that the support of the random variable X is (δ1^{-1}, ∞). However, by assuming the following four conditions:
H1. For all m, n ∈ N, θ 1:n , β 1:m and c1:m are mutually independent conditionally on (G, H, K).
H2. The item parameters β 1:∞ form an iid process, where H is the common probability distribution.
H3. The item parameters c1:∞ form an iid process, where K is the common probability distribution.
H4. G, H and K are mutually independent;
it can be proved that

∫_{R+} f (x) G^{β1 ,δ1}(dx)   (4.10)

is identified by one observation, for every bounded continuous function f : R+ −→ R, where G^{β1 ,δ1} is defined by (4.9). In particular, since

fn (y) = 1_{(0,x]}(y) + [1 − n(y − x)] 1_{(x, x+1/n)}(y) ↓ 1_{(0,x]}(y)   ∀ x ∈ R+ ,

as n → ∞, the monotone convergence theorem implies that, for every x ∈ R+ , G^{β1 ,δ1}((0, x]), and so G^{β1 ,δ1}, is identified by one observation.
It is interesting to remark how strong the identifiability condition (4.10) is when compared with the identifiability of the functionals mG (K): the first one is a condition valid for every function f ∈ Cb (R+ ), where Cb (R+ ) denotes the set of bounded continuous functions f : R+ −→ R, whereas the second condition is valid only for the set

{∏_{j∈K} [vj + x]^{-1} : K ⊂ {1, . . . , J} \ ∅, J ≥ 3} ⊂ Cb ((δ1^{-1}, ∞)) ⊂ Cb (R+ ).
The following theorem, proved in Appendix C, establishes conditions under which the item parameters
and the latent distribution G are identified in the asymptotic Bayesian model (Y i , β 1:∞ , c1:∞ , G):
Theorem 4.3 Consider an asymptotic Bayesian semi-parametric 1PL-G model obtained when the number of items J → ∞. The item parameters (β 1:∞ , c1:∞ ) and the latent distribution generating the
individual abilities G are b-identified by Y 1 if the following three conditions hold:
1. cj > 0 for every j ∈ N.
2. The difficulty parameters β 1:∞ and the guessing parameters c1:∞ satisfy conditions H1–H4.
3. At least one of the following identifying restrictions hold:
(a) β1 = 0 a.s. and c1 = 0 a.s.
(b) G is a.s. a probability distribution on R such that its mean and variance are known constants.
(c) G is a.s. a probability distribution on R with two known q-quantiles.
It should be mentioned that if a Bayesian non-parametric procedure is implemented for estimating the identified functionals mG , it is known that the identification restrictions (1) or (3) of Theorem 4.3 are easily implemented, but not the identification restriction (2). However, this type of consideration is outside the scope of this paper.
5 Discussion
We have studied the identification problem of a particular case of the 3PL model, namely the 1PL-G
model which assumes that the discrimination parameters are all equal to 1. The identification problem
was studied under three different specifications. The first specification assumes that the individual abilities are unknown parameters. The second specification considers the abilities as mutually independent random variables with a common distribution known up to the scale parameter. In this context, the case where the distribution generating the individual abilities is known up to both the scale parameter and the location parameter is also considered. The third specification corresponds to a semi-parametric 1PL-G model, where the distribution generating the individual abilities is unspecified and, consequently, considered as a parameter of interest.
For the first specification, the parameters of interest are the difficulty parameters, the guessing parameters and the individual abilities. These are identified provided a difficulty parameter and a guessing parameter are fixed at zero. For the second specification, the parameters of interest are the difficulty and guessing parameters, and the scale parameter. It was shown that these parameters are identified by one observation if a guessing parameter is fixed at zero. It should be emphasized that this identification result imposes a design restriction on the test, namely to ensure that the test involves an item that every person will answer without guessing. In the context of educational measurement, this condition may be satisfied if the test involves an open question. In the context of the second specification, the identification problem was also studied when the distribution generating the individual abilities is known up to both the scale and the location parameters. In this context, the parameters of interest are identified if a guessing parameter and a difficulty parameter are fixed at 0.
For the third specification, the parameters of interest are the difficulty parameters, the guessing parameters and the distribution G generating the individual abilities. When at least three items are available,
the item parameters are identified provided a difficulty parameter and a guessing parameter are fixed at
0. However, under these identification restrictions, it was proved that the distribution G is not identified.
This lack of identification jeopardizes the empirical meaning of an estimate for G under a finite number
of items. This result is of practical relevance, especially considering the large amount of research trying
to relax the parametric assumption of G in the IRT literature. In the unrealistic case when an infinite
quantity of items is available, the distribution G and the item parameters become identified if either a difficulty parameter and a guessing parameter are fixed at zero, or two characteristics of G (the first two moments or
two quantiles) are fixed. For an overview of these results, see Table 1.
It should be remarked that the proofs of these identification results consisted in obtaining the corresponding identified parameterizations of the sampling process. Thereafter, identification restrictions
were imposed in such a way that the parameters of interest become identified. This means that the identification restrictions are not only sufficient conditions, but also necessary. On the other hand, these
results show that the identification of the fixed-effects 1PL-G model is not sufficient for obtaining the
identification of both parametric and semi-parametric random-effects specification of the 1PL-G model.
We argue that our identification results are of practical relevance, especially because the 3PL model is widely used to analyze educational data and is actually available in various specialized software programs. In applications, it is well known that the 3PL model has estimation problems, especially for the guessing parameters. Some programs warn against this type of problem; see, for instance, Rizopoulos (2006). Taking into account that the 1PL-G model is a particular case of the 3PL model, our results show that one source of such problems is the lack of parameter identifiability. Furthermore, the results exposed in this paper should be considered as an invitation to investigate the identifiability of more complex guessing IRT models. Finally, these results allow us to propose the 1PL-G model as a well-behaved guessing IRT model in the sense that we provide the identification restrictions in three contexts which are of interest in educational applications. In other words, while the identifiability of the 3PL model remains an open problem, the 1PL-G model should be considered the better-behaved alternative.
Appendix
A Identifiability of the scale parameter σ by ω12 , δ1 , δ2 , ω1 and ω2
To prove that the function ω12 given by (3.10) is a strictly increasing continuous function of σ, we need to study the sign of its derivative with respect to σ. This requires not only using the Implicit Function Theorem (Spivak, 1965), but also assuming regularity conditions allowing derivatives to be taken under the integral sign. We accordingly assume that the cumulative distribution function F has a continuous density function f strictly positive on R. Furthermore, to prove that ω12 is a strictly increasing continuous function of σ, we need to obtain the derivatives under the integral sign of the function p(σ, β) as defined in (3.6) with respect to σ and to β. Consequently, the standard regularity conditions are of the type

∫_R f (σx − β) G(dx) < ∞,   ∫_R |x| f (σx − β) G(dx) < ∞,

under which the Dominated Convergence Theorem can be applied to ensure the validity of performing derivatives under the integral sign.
Table 1: Identification restrictions in different specifications of the 1PL-G model

• Fixed-effects. Parameters of interest: (θ 1:N , β 1:J , c1:J ) ∈ R^N × R^J × [0, 1]^J . Number of items: 1 ≤ J < ∞. Identification restrictions: (i) β1 = 0 and c1 = 0; (ii) N ≥ 2; (iii) P[Yij = 1 | θi , βj , cj ] ̸= P[Yi′ j = 1 | θi′ , βj , cj ] for every j, where i ̸= i′ . Identified parameters: (θ 1:N , β 2:J , c2:J ). Reference: Theorem 2.1.

• Random-effects with (θi | σ) ∼ G^σ . Parameters of interest: (β 1:J , c1:J , σ) ∈ R^J × [0, 1]^J × R+ . Number of items: 3 ≤ J < ∞. Identification restrictions: (i) c1 = 0; (ii) cj > 0 for every j = 2, . . . , J. Identified parameters: (β 1:J , c2:J , σ). Reference: Theorem 3.1.

• Random-effects with (θi | σ, µ) ∼ G^{µ,σ} . Parameters of interest: (β 1:J , c1:J , µ, σ) ∈ R^J × [0, 1]^J × R × R+ . Number of items: 3 ≤ J < ∞. Identification restrictions: (i) c1 = 0; (ii) cj > 0 for every j = 2, . . . , J; (iii) a′β 1:J = 0, with 1′J a ̸= 0 and a known. Identified parameters: (β 2:J , c2:J , µ, σ). Reference: Corollary 3.2.

• Semi-parametric. Parameters of interest: (β 1:J , c1:J , G) ∈ R^J × [0, 1]^J × P(R, B). Number of items: 3 ≤ J < ∞. Identification restrictions: (i) β1 = 0 and c1 = 0; (ii) cj > 0 for every j = 2, . . . , J. Identified parameters: (β 2:J , c2:J , mG ). Reference: Theorems 4.1 and 4.2.

• Semi-parametric. Parameters of interest: (β 1:J , c1:J , G) ∈ R^J × [0, 1]^J × P(R, B). Number of items: J = ∞. Identification restrictions: (i) cj > 0 for every j = 2, . . . , J; (ii) H1–H4; (iii) at least one of the following conditions holds: (iii.1) β1 = 0 and c1 = 0; (iii.2) the variance and the mean of G are a.s. known; (iii.3) two q-quantiles of G are a.s. known. Identified parameters: (β 2:∞ , c2:∞ , G) under (iii.1); or (β 1:∞ , c1:∞ , G) under (iii.2) or (iii.3), except what is known on G. Reference: Theorem 4.3.

Under these regularity conditions, the function p(σ, β) is continuously differentiable under the integral sign on R_0^+ × R and, therefore,
(i) D2 p(σ, β) := ∂p(σ, β)/∂β = ∫_R f (σx − β) G(dx),
(ii) D1 p(σ, β) := ∂p(σ, β)/∂σ = −∫_R x f (σx − β) G(dx).   (A.11)

Thus, p̄(σ, ω) as defined by (3.8) is also continuously differentiable on R_0^+ × (0, 1) and, from (3.9), we obtain that

(i) 1 = ∂p̄[σ, p(σ, β)]/∂β = D2 p̄[σ, p(σ, β)] × D2 p(σ, β),
(ii) 0 = ∂p̄[σ, p(σ, β)]/∂σ = D1 p̄[σ, p(σ, β)] + D2 p̄[σ, p(σ, β)] × D1 p(σ, β),   (A.12)

where D1 p̄(σ, ω) := ∂p̄(σ, ω)/∂σ and D2 p̄(σ, ω) := ∂p̄(σ, ω)/∂ω. Combining (A.11) and (A.12), we obtain that

(i) D2 p̄(σ, ω) = 1/D2 p[σ, p̄(σ, ω)] = 1/∫_R f [σx − p̄(σ, ω)] G(dx),
(ii) D1 p̄(σ, ω) = −D1 p[σ, p̄(σ, ω)]/D2 p[σ, p̄(σ, ω)] = ∫_R x f [σx − p̄(σ, ω)] G(dx) / ∫_R f [σx − p̄(σ, ω)] G(dx) =: E_{σ,ω}(X),   (A.13)

where

P_{σ,ω}[X ∈ dx] = G_{σ,ω}(dx) := f [σx − p̄(σ, ω)] G(dx) / ∫_R f [σx − p̄(σ, ω)] G(dx).

Thanks to the regularity conditions allowing derivatives of p(σ, β) to be taken under the integral sign, and to the fact that F ≤ 1, it can be shown that ω12 is continuously differentiable under the integral sign in σ, β1 and β2 ; therefore the function φ(σ, δ1 , δ2 , ω1 , ω2 ) is continuously differentiable under the integral sign with respect to σ. It remains to show that the derivative w.r.t. σ is strictly positive. Let us consider the sign of one of the two terms of that derivative. Using (A.13.ii), we obtain that

∂/∂σ {1 − F[σx − p̄(σ, ω1 /δ1 )]} = −f [σx − p̄(σ, ω1 /δ1 )] (x − E_{σ, ω1/δ1}(X)).
Therefore, such a second term can be written as

−δ1 δ2 ∫_R f [σθ − p̄(σ, ω1 /δ1 )] {θ − E_{σ, ω1/δ1}(θ)} {1 − F[σθ − p̄(σ, ω2 /δ2 )]} G(dθ)
   = δ1 δ2 ∫_R f [σθ − p̄(σ, ω1 /δ1 )] G(dθ) × C_{σ, ω1/δ1}(θ, F[σθ − p̄(σ, ω2 /δ2 )]).

Now, since F[σθ − p̄(σ, ω2 /δ2 )] is a strictly increasing function of θ, the covariance between θ and F[σθ − p̄(σ, ω2 /δ2 )] (with respect to G_{σ, ω1/δ1}) is strictly positive (if θ is not degenerate). Furthermore,

∫_R f [σx − p̄(σ, ω1 /δ1 )] G(dx)

is clearly strictly positive. The two terms of the derivative of φ(σ, δ1 , δ2 , ω1 , ω2 ) are, therefore, strictly positive.
B Bayesian identification
Whereas the classical or sampling approach considers a statistical model as an indexed family of distributions on the sample space, the Bayesian approach considers a unique probability measure on the product
space “parameters × observations”. This produces two different approaches to the identification: the injectivity of a mapping in a sampling theory approach and the minimal sufficiency of the parameterization
in a Bayesian approach (Kadane, 1974; Picci, 1977). We refer the reader to Section 4.6.2 in Florens et al.
(1990) for the technical conditions required to link these approaches.
In order to define Bayesian identification, it is necessary first to define the concept of sufficient parameter.
Definition B.1 Consider the Bayesian model defined by the joint probability distribution on (Y , ϑ). A function ψ := g(ϑ) of the parameter ϑ is a sufficient parameter for Y if the conditional distribution of the sample Y given ϑ is the same as the conditional distribution of the sample Y given ψ, that is, Y ⊥⊥ ϑ | ψ.
The previous definition implies that the distribution of Y is completely determined by the sufficient parameter ψ, ϑ being redundant. In fact, by definition of conditional independence, Y ⊥⊥ ϑ | ψ implies that E[h(Y ) | ϑ] = E[h(Y ) | ψ] for every measurable function h. Equivalently, by the symmetry of a conditional independence relation, it can also be concluded that ψ is a sufficient parameter if the conditional distribution of the redundant part of ϑ, given the sufficient parameter ψ, is not updated by the sample, that is, E[f (ϑ) | Y , ψ] = E[f (ϑ) | ψ] for every measurable function f .
Because a given problem typically admits many sufficient parameters, one might ask whether a given sufficient parameter $\psi$ still contains redundant information about the sampling process. Suppose that there exists a function of $\psi$, say $\psi_1$, which is also a sufficient parameter for $Y$, that is, $Y \perp\!\!\!\perp \vartheta \mid \psi_1$. It follows that $Y \perp\!\!\!\perp \psi \mid \psi_1$ and, therefore, the sampling process is fully characterized by $\psi_1$, $\psi$ being redundant in the sense that $E[h(Y) \mid \vartheta] = E[h(Y) \mid \psi] = E[h(Y) \mid \psi_1]$ for every measurable function $h$, or that $E[f(\psi) \mid Y, \psi_1] = E[f(\psi) \mid \psi_1]$ for every measurable function $f$. Clearly, $\psi_1$ should be preferred over $\psi$ because it contains less redundant information about the sampling process. In fact, a parameterization that achieves the greatest parameter reduction while retaining all the information about the sampling process should be considered preferable. The definition of such a parameter is formalized next.
For the Bayesian model defined on $(Y, \vartheta)$, $\vartheta_{\min}$ is a minimal sufficient parameter for $Y$ if the following conditions are satisfied: (i) $\vartheta_{\min}$ is a measurable function of $\vartheta$; (ii) $\vartheta_{\min}$ is a sufficient parameter; and (iii) for any other sufficient parameter $\psi$, $\vartheta_{\min}$ is a measurable function of it. From this definition, it follows that $\vartheta_{\min}$ does not contain redundant information about the sampling process because there does not exist a non-injective function of it, say $\phi$, such that $Y \perp\!\!\!\perp \vartheta_{\min} \mid \phi$. These considerations lead to the definition of Bayesian identification (Florens and Rolin, 1984).
Definition B.2 Consider the Bayesian model defined by the joint probability distribution on $(Y, \vartheta)$. The parameter $\vartheta$ is said to be Bayesian identified, or b-identified, if it is a minimal sufficient parameter.

The definition of b-identification depends on the prior distribution through its null sets only. As a matter of fact, if the original prior distribution is replaced by a different one, the Bayesian model changes. However, if both prior distributions have the same null sets, i.e., the same events of probability 0 or 1, the b-identified parameters in both models coincide. We refer the reader to Proposition 4.6.8 in Florens et al. (1990) for a formal proof. An important consequence of this is that unidentified parameters remain unidentified even if proper and concentrated priors are considered. Unidentified parameters can become identified if and only if the prior null sets are changed, which is equivalent to introducing dogmatic constraints.
From an operational point of view, it can be shown that the minimal sufficient parameter is generated by the family of all the sampling expectations $E[h(Y) \mid \vartheta]$, where $h \in L_1(\Omega, \mathcal{Y}, \Pi)$, with $L_1(\Omega, \mathcal{Y}, \Pi)$ being the set of integrable functions defined on the sample space $\Omega$ and measurable w.r.t. the $\sigma$-field $\mathcal{Y}$ generated by $Y$. Let
$$
\sigma\{E[h(Y) \mid \vartheta] : h \in L_1(\Omega, \mathcal{Y}, \Pi)\}
$$
be the $\sigma$-field generated by the sampling expectations. By definition of conditional expectation, this $\sigma$-field is contained in the $\sigma$-field generated by $\vartheta$. Therefore, to show that $\vartheta$ is b-identified by $Y$, it suffices to show that $\vartheta$ is measurable w.r.t. $\sigma\{E[h(Y) \mid \vartheta] : h \in L_1(\Omega, \mathcal{Y}, \Pi)\}$, or equivalently that $\sigma\{\vartheta\} \subset \sigma\{E[h(Y) \mid \vartheta] : h \in L_1(\Omega, \mathcal{Y}, \Pi)\}$ (see Chapter 4 in Florens et al. (1990)). In order to illustrate this point, let us consider the following simple example. Suppose that $(X \mid \mu) \sim \mathcal{N}(\mu, 1)$, where the prior distribution of $\mu$ is absolutely continuous w.r.t. the Lebesgue measure on $\mathbb{R}$. It follows that $E(X \mid \mu)$ is measurable w.r.t. $\sigma\{E[h(X) \mid \mu] : h \in L_1(\mathbb{R}, \mathcal{B}, \Pi)\}$, where $\mathcal{B}$ is the Borel $\sigma$-field. Consequently, $\mu = E(X \mid \mu)$ is measurable w.r.t. $\sigma\{E[h(X) \mid \mu] : h \in L_1(\mathbb{R}, \mathcal{B}, \Pi)\}$ and, therefore, it is b-identified by $X$.
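This example admits a direct Monte Carlo illustration of the joint Bayesian model. The sketch below assumes, purely for illustration, a standard normal prior for $\mu$; it approximates the sampling expectation $E(X \mid \mu)$ by local averaging over prior draws and shows that it reproduces $\mu$ itself, which is the content of the b-identification claim.

```python
# Monte Carlo illustration: mu ~ N(0,1) (illustrative prior), (X | mu) ~ N(mu, 1).
# The sampling expectation E(X | mu) equals mu itself, so sigma{mu} is generated
# by a single sampling expectation and mu is b-identified by X.
import numpy as np

rng = np.random.default_rng(0)
mu = rng.normal(0.0, 1.0, size=200_000)   # draws from the prior
x = rng.normal(mu, 1.0)                   # draws from the sampling model

# Approximate E(X | mu) by averaging x over narrow bins of mu.
for m in np.linspace(-1.5, 1.5, 7):
    sel = np.abs(mu - m) < 0.05
    print(f"mu ~ {m:+.2f}:  E(X | mu) ~ {x[sel].mean():+.3f}")
# Each conditional mean matches the conditioning value of mu.
```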
C Proof of Theorem 4.3
The identification analysis of the parameters of interest $(\beta_{1:\infty}, c_{1:\infty}, G)$ should be done in the asymptotic Bayesian model defined on $(Y_i, \beta_{1:\infty}, c_{1:\infty}, G)$, where $Y_i \in \{0, 1\}^{\mathbb{N}}$ corresponds to the response pattern of person $i$. The corresponding minimal sufficient parameter is given by the following $\sigma$-field:
$$
A^* \doteq \sigma\{E(f \mid \beta_{1:\infty}, c_{1:\infty}, G) : f \in [\sigma(Y_1)]^+\},
$$
where $[\sigma(Y_1)]^+$ denotes the set of positive functions $f$ such that $f = g(Y_1)$, with $g$ a measurable function. Identifying the semi-parametric 1PL-G model amounts to proving, under identification restrictions if necessary, that $(\beta_{1:\infty}, c_{1:\infty}, G)$ is measurable w.r.t. $A^*$. By the Dynkin-Doob Lemma, this is equivalent to proving that, under identification restrictions if necessary,
$$
\sigma(\beta_{1:\infty}, c_{1:\infty}, G) = A^*.
$$
This equality relies on the following steps:

STEP 1: By the same arguments used to establish identity (4.4), it follows that
$$
\sigma(\beta_{2:\infty}, \delta_{2:\infty}) \doteq \sigma(\beta_j : j \ge 2) \vee \sigma(\delta_j : j \ge 2) \subset A^*,
$$
where $\delta_j \doteq 1 - c_j$.
STEP 2: Hypotheses H1, H2, H3 and H4 jointly imply that $\{(u_j, v_j) : 2 \le j < \infty\}$ are iid conditionally on $(\beta_1, \delta_1, K, H)$. By the Strong Law of Large Numbers, it follows that
$$
\overline{W}_{\beta_1, \delta_1}(B) \doteq P[(u_2, v_2) \in B \mid \beta_1, \delta_1, K, H] = \limsup_{m} \frac{1}{m} \sum_{1 \le j \le m} 1\!\!1_{\{(u_j, v_j) \in B\}} \quad \text{a.s.}
$$
for $B \in \mathcal{B}^+ \times \mathcal{B}$. But Propositions 3.2 and 3.3 ensure that $\{u_j : 2 \le j < \infty\}$ and $\{v_j : 2 \le j < \infty\}$ are identified parameters. It follows that $\{(u_j, v_j) : 2 \le j < \infty\}$ is measurable w.r.t. $A^*$ and, consequently, $\overline{W}_{\beta_1, \delta_1}(B)$ is measurable w.r.t. $\overline{A^*}$ for all $B \in \mathcal{B}^+ \times \mathcal{B}$. The upper bar denotes a $\sigma$-field completed with measurable sets; for a definition, see Chapter 2 in Florens et al. (1990).
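This step rests only on the conditional SLLN, which is easy to visualize: empirical frequencies computed from the identified sequence $\{(u_j, v_j)\}$ converge to $\overline{W}_{\beta_1, \delta_1}(B)$. In the sketch below the joint law of $(u_j, v_j)$ is an arbitrary stand-in, since the proof leaves it unspecified.

```python
# SLLN sketch for STEP 2: empirical frequencies of a set B stabilize at
# W-bar(B). The joint law of (u_j, v_j) is an illustrative assumption only.
import numpy as np

rng = np.random.default_rng(2)
m = 200_000
u = rng.lognormal(0.0, 0.5, size=m)       # u_j > 0
v = rng.normal(1.0, 0.4, size=m)          # v_j real

hits = (u < 1.0) & (v > 1.2)              # B: a rectangle in B+ x B
for mcut in (100, 10_000, 200_000):
    print(mcut, hits[:mcut].mean())       # converges to W-bar(B)
```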
STEP 3: Using (4.6), it follows that
$$
E(Y_{ij} \mid \beta_{1:\infty}, c_{1:\infty}, \theta_{1:\infty}) = 1 - \frac{\delta_j e^{\beta_j}}{e^{\beta_j} + e^{\theta_i}}
= 1 - \frac{u_j}{v_j + \dfrac{1}{\delta_1} + \dfrac{\exp(\theta_i)}{\delta_1 \exp(\beta_1)}},
$$
and
$$
\bar{p}_{iJ} \doteq E\bigl[\,\overline{Y}_{iJ} \mid \beta_{1:\infty}, c_{1:\infty}, \theta_{1:\infty}\bigr]
= 1 - \frac{1}{J} \sum_{1 \le j \le J} \frac{u_j}{v_j + \dfrac{1}{\delta_1} + \dfrac{\exp(\theta_i)}{\delta_1 \exp(\beta_1)}}.
$$

STEP 3.A: By the law of large deviations (see Chapter IV, Section 5 in Shiryaev (1995)), it follows that
$$
\overline{Y}_{iJ} - \bar{p}_{iJ} \longrightarrow 0 \quad \text{a.s. conditionally on } (\beta_1, \delta_1, \theta_i) \text{ as } J \to \infty.
$$
But as $J \to \infty$,
$$
\bar{p}_{iJ} \longrightarrow 1 - E\left[\frac{u_j}{v_j + \dfrac{1}{\delta_1} + \dfrac{\exp(\theta_i)}{\delta_1 \exp(\beta_1)}} \,\Bigg|\, \beta_1, \delta_1, \theta_i\right] \quad \text{a.s. conditionally on } (\beta_1, \delta_1, \theta_i)
$$
$$
= \int_{\mathbb{R}^+ \times \mathbb{R}} \left\{1 - \frac{u}{v + \dfrac{1}{\delta_1} + \dfrac{\exp(\theta_i)}{\delta_1 \exp(\beta_1)}}\right\} \overline{W}_{\beta_1, \delta_1}(du, dv)
\doteq p(\beta_1, \delta_1, \theta_i).
$$
Therefore,
$$
\overline{Y}_{iJ} \longrightarrow p(\beta_1, \delta_1, \theta_i) \quad \text{a.s. conditionally on } (\beta_1, \delta_1, \theta_i) \text{ as } J \to \infty.
$$
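This convergence is easy to reproduce by simulation. In the sketch below the item parameters $(\beta_j, \delta_j)$ are drawn iid from arbitrary illustrative distributions; the mean score of a fixed person stabilizes at $p(\beta_1, \delta_1, \theta_i)$, approximated here by the Monte Carlo average of the success probabilities.

```python
# Simulation sketch of STEP 3.A: the person-i mean score over J items
# converges to p(beta_1, delta_1, theta_i). Distributional choices for the
# item parameters are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(1)
theta_i = 0.7                                    # fixed ability
J = 500_000
beta = rng.normal(0.0, 1.0, size=J)              # iid item difficulties beta_j
delta = rng.uniform(0.5, 1.0, size=J)            # delta_j = 1 - c_j

prob = 1.0 - delta * np.exp(beta) / (np.exp(beta) + np.exp(theta_i))
y = rng.random(J) < prob                         # responses Y_ij

limit = prob.mean()                              # Monte Carlo value of the limit
for Jcut in (100, 10_000, 500_000):
    print(Jcut, y[:Jcut].mean(), "->", limit)    # mean score approaches the limit
```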
STEP 3.B: It follows that, for all $g \in C_b([0, 1])$,
$$
g\bigl(\overline{Y}_{iJ}\bigr) \longrightarrow g\bigl(p(\beta_1, \delta_1, \theta_i)\bigr) \quad \text{a.s. and in } L_1 \text{ conditionally on } (\beta_1, \delta_1, \theta_i) \text{ as } J \to \infty.
$$
Then, for all $g \in C_b([0, 1])$,
$$
E\bigl[g\bigl(\overline{Y}_{iJ}\bigr) \mid \beta_{1:\infty}, c_{1:\infty}, G\bigr] \longrightarrow E\bigl[g\bigl(p(\beta_1, \delta_1, \theta_i)\bigr) \mid \beta_{1:\infty}, c_{1:\infty}, G\bigr] \quad \text{a.s.}
$$
$$
= \int_{\mathbb{R}} g\left\{1 - E\left[\frac{u_j}{v_j + \dfrac{1}{\delta_1} + \dfrac{\exp(\theta)}{\delta_1 \exp(\beta_1)}} \,\Bigg|\, \beta_1, \delta_1, \theta\right]\right\} G(d\theta).
$$
By definition of conditional expectation, $E\bigl[g\bigl(\overline{Y}_{iJ}\bigr) \mid \beta_{1:\infty}, c_{1:\infty}, G\bigr]$ is measurable w.r.t. $A^*$. Thus
$$
\int_{\mathbb{R}} g\left\{1 - E\left[\frac{u_j}{v_j + \dfrac{1}{\delta_1} + \dfrac{\exp(\theta)}{\delta_1 \exp(\beta_1)}} \,\Bigg|\, \beta_1, \delta_1, \theta\right]\right\} G(d\theta)
$$
is measurable w.r.t. $\overline{A^*}$; the bar is added because such an integral is the a.s. limit of the sequence $\bigl\{E\bigl[g\bigl(\overline{Y}_{iJ}\bigr) \mid \beta_{1:\infty}, c_{1:\infty}, G\bigr] : J \in \mathbb{N}\bigr\}$.
STEP 3.C: Using the transformation (4.9), it follows that
$$
\int_{\mathbb{R}^+} g[L(x)]\, G_{\beta_1, \delta_1}(dx) \quad \text{is } \overline{A^*}\text{-measurable},
$$
where
$$
L(x) = \int_{\mathbb{R}^+ \times \mathbb{R}} \left(1 - \frac{u}{v + x}\right) \overline{W}_{\beta_1, \delta_1}(du, dv).
$$
The function $L(\cdot)$ is a strictly increasing continuous function from $(\delta_1^{-1}, \infty)$ to $(0, 1)$ that is known because it is measurable w.r.t. $\sigma(\overline{W}_{\beta_1, \delta_1})$; by STEP 2, $\sigma(\overline{W}_{\beta_1, \delta_1}) \subset \overline{A^*}$. In particular, for every function $f \in C_b(\mathbb{R}^+)$, take $g(y) = f[\overline{L}(y)]$, where $\overline{L}(\alpha) = \inf\{x : L(x) \ge \alpha\}$. It follows that
$$
\int_{\mathbb{R}^+} f(x)\, G_{\beta_1, \delta_1}(dx)
$$
is measurable w.r.t. $\overline{A^*}$. Considering
$$
f_n(y) = 1\!\!1_{(0, x]}(y) + [1 - n(y - x)]\, 1\!\!1_{(x,\, x + \frac{1}{n})}(y) \downarrow 1\!\!1_{(0, x]}(y) \quad \forall\, x \in \mathbb{R}^+
$$
as $n \to \infty$, the monotone convergence theorem implies that, for every $x \in \mathbb{R}^+$, $G_{\beta_1, \delta_1}((0, x])$, and hence $G_{\beta_1, \delta_1}$ itself, is measurable w.r.t. $\overline{A^*}$.
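The mechanism of this step, composing $L$ with its generalized inverse, can also be checked numerically. The sketch below builds $L$ from a simulated stand-in for $\overline{W}_{\beta_1, \delta_1}$ (the law of $(u, v)$ is an illustrative assumption) and verifies that the generalized inverse undoes $L$ on $(\delta_1^{-1}, \infty)$, which is precisely what makes $g = f \circ \overline{L}$ recover the integrals of $f$.

```python
# Sketch of STEP 3.C: L is strictly increasing, so its generalized inverse
# L-bar satisfies L-bar(L(x)) = x on (1/delta_1, oo). The (u, v) sample is an
# illustrative stand-in for W-bar.
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(3)
u = rng.lognormal(0.0, 0.3, size=20_000)
v = rng.uniform(0.5, 1.5, size=20_000)
delta1 = 0.8

def L(x):
    # L(x) = integral of (1 - u/(v + x)) dW-bar: strictly increasing in x
    return np.mean(1.0 - u / (v + x))

def L_bar(alpha):
    # generalized inverse: inf{x : L(x) >= alpha}, found by root finding
    return brentq(lambda x: L(x) - alpha, 1.0 / delta1 + 1e-9, 1e9)

for x in (1.5, 2.0, 5.0, 20.0):
    print(x, L_bar(L(x)))                 # recovers x
```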
STEP 4: From STEPS 1 and 3.C, it follows that $(\beta_{2:\infty}, \delta_{2:\infty}, G_{\beta_1, \delta_1})$ is measurable w.r.t. $\overline{A^*}$. By the Dynkin-Doob Lemma, this is equivalent to
$$
\sigma(\beta_{2:\infty}, \delta_{2:\infty}) \vee \sigma(G_{\beta_1, \delta_1}) \subset \overline{A^*}.
$$
However, $\sigma(G_{\beta_1, \delta_1}) \subset \sigma(G) \vee \sigma(\beta_1, \delta_1)$. Therefore, two restrictions should be introduced in order to obtain the equality $\sigma(G_{\beta_1, \delta_1}) = \sigma(G) \vee \sigma(\beta_1, \delta_1)$. Two possibilities can be considered:
1. The first possibility consists in fixing two quantiles of $G$. In fact, let
$$
x_1 = \inf\{x : G_{\beta_1, \delta_1}(x) > q_1\}, \qquad x_2 = \inf\{x : G_{\beta_1, \delta_1}(x) > q_2\}.
$$
Using (4.9), this is equivalent to
$$
\beta_1 + \ln(\delta_1 x_1 - 1) = y_1 = \inf\{y : G(y) > q_1\}, \qquad
\beta_1 + \ln(\delta_1 x_2 - 1) = y_2 = \inf\{y : G(y) > q_2\}.
$$
It follows that
$$
\beta_1 = \ln\left[\frac{x_1 e^{y_2} - x_2 e^{y_1}}{x_2 - x_1}\right], \qquad
\delta_1 = \frac{e^{y_2} - e^{y_1}}{x_1 e^{y_2} - x_2 e^{y_1}};
$$
that is, $\beta_1$ and $\delta_1$ are identified because $x_1$ and $x_2$ depend on $G_{\beta_1, \delta_1}$, which is identified, while $y_1$ and $y_2$ are fixed by the restrictions on $G$.
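The closed-form solution can be verified numerically: pick $\beta_1$ and $\delta_1$, generate two quantile pairs $(x_k, y_k)$ linked through $\beta_1 + \ln(\delta_1 x_k - 1) = y_k$, and recover $\beta_1$ and $\delta_1$ from the displayed formulas. All numerical values below are arbitrary.

```python
# Numerical check of the quantile-based solution for (beta_1, delta_1).
import numpy as np

beta1, delta1 = 0.4, 0.8                   # arbitrary true values
y1, y2 = -0.3, 1.1                         # fixed quantiles of G
x1 = (1.0 + np.exp(y1 - beta1)) / delta1   # matching quantiles of G_{beta1,delta1},
x2 = (1.0 + np.exp(y2 - beta1)) / delta1   # from beta1 + ln(delta1*x - 1) = y

beta1_hat = np.log((x1 * np.exp(y2) - x2 * np.exp(y1)) / (x2 - x1))
delta1_hat = (np.exp(y2) - np.exp(y1)) / (x1 * np.exp(y2) - x2 * np.exp(y1))
print(beta1_hat, delta1_hat)               # recovers (0.4, 0.8)
```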
2. The second possibility consists in fixing the mean and the variance of the distribution of $\exp(\theta)$, namely
$$
E_G(e^{\theta}) = \mu, \qquad V_G(e^{\theta}) = \sigma^2.
$$
Using (4.9), this is equivalent to
$$
m = E_{G_{\beta_1, \delta_1}}(X) = \frac{1}{\delta_1} + \frac{\mu}{\delta_1 e^{\beta_1}}, \qquad
v^2 = V_{G_{\beta_1, \delta_1}}(X) = \frac{\sigma^2}{\delta_1^2 e^{2\beta_1}}.
$$
It follows that
$$
\beta_1 = \ln\left(\frac{m\sigma}{v} - \mu\right), \qquad \delta_1^{-1} = m - \frac{\mu v}{\sigma}.
$$
For instance, if $\mu = 0$ and $\sigma = 1$, then
$$
\beta_1 = \ln\left(\frac{m}{v}\right), \qquad \delta_1 = \frac{1}{m};
$$
that is, $\beta_1$ and $\delta_1$ are identified because $m$ and $v$ depend on $G_{\beta_1, \delta_1}$, which is identified, while $\mu$ and $\sigma$ are fixed by the restrictions.
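The analogous check applies to the moment restrictions: computing $m$ and $v$ from arbitrary $(\beta_1, \delta_1, \mu, \sigma)$ and plugging them into the displayed formulas returns the original values.

```python
# Numerical check of the moment-based solution for (beta_1, delta_1).
import numpy as np

beta1, delta1, mu, sigma = 0.4, 0.8, 1.5, 0.9   # arbitrary test values
m = 1.0 / delta1 + mu / (delta1 * np.exp(beta1))
v = sigma / (delta1 * np.exp(beta1))

print(np.log(m * sigma / v - mu))   # recovers beta_1 = 0.4
print(1.0 / (m - mu * v / sigma))   # recovers delta_1 = 0.8
```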
Acknowledgements: The work developed in this paper was presented at a Symposium on Identification Problems in Psychometrics during the International Meeting of the Psychometric Society IMPS 2009, held in Cambridge (UK) in July 2009. The first author gratefully acknowledges partial financial support from PUENTE Grant 08/2009 of the Pontificia Universidad Católica de Chile. The authors gratefully acknowledge several discussions with Claudio Fernández (Faculty of Mathematics, Pontificia Universidad Católica de Chile) and Paul De Boeck (University of Amsterdam).
References
Adams, R. J., M. R. Wilson, and W. Wang (1997). The Multidimensional Random Coefficients Multinomial Logit Model. Applied Psychological Measurement 21, 1–23.
Adams, R. J. and M. L. Wu (2007). The Mixed-Coefficients Multinomial Logit Model: A Generalized Form of the Rasch Model. In M. von Davier and C. H. Carstensen (Eds.), Multivariate and Mixture Distribution Rasch Models, pp. 57–75. Springer.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord and M. R. Novick (Eds.), Statistical Theories of Mental Test Scores, pp. 395–479. Addison-Wesley.
Bock, R. D. and M. Aitkin (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika 46, 443–459.
Bock, R. D. and M. F. Zimowski (1997). Multiple Group IRT. In W. J. van der Linden and R. K. Hambleton (Eds.), Handbook of Modern Item Response Theory, pp. 433–448. Springer.
Cao, J. and L. Stokes (2008). Bayesian IRT guessing models for partial guessing behaviors. Psychometrika 73, 209–230.
De Boeck, P. and M. Wilson (2004). Explanatory Item Response Models. A Generalized Linear and
Nonlinear Approach. Springer.
Embretson, S. E. and S. P. Reise (2000). Item Response Theory for Psychologists. New Jersey: Lawrence
Erlbaum Associates, Publishers.
Florens, J.-P., M. Mouchart, and J.-M. Rolin (1990). Elements of Bayesian Statistics. Marcel Dekker.
Florens, J.-P. and J.-M. Rolin (1984). Asymptotic sufficiency and exact estimability. In J.-P. Florens,
M. Mouchart, J.-P. Raoult, and L. Simar (Eds.), Alternative Approaches to Time Series Analysis,
pp. 121–142. Publications des Facultés Universitaires Saint-Louis, Bruxelles.
Gabrielsen, A. (1978). Consistency and Identifiability. Journal of Econometrics 8, 261–263.
Gelfand, A. E. and S. K. Sahu (1999). Identifiability, improper priors, and Gibbs sampling for generalized linear models. Journal of the American Statistical Association 94, 247–253.
Ghosh, M., A. Ghosh, M.-H. Chen, and A. Agresti (2000). Noninformative priors for one-parameter item response models. Journal of Statistical Planning and Inference 88, 99–115.
Halmos, P. (1951). Introduction to Hilbert Space, and the Theory of Spectral Multiplicity. Chelsea
Publishing.
Hambleton, R. K., H. Swaminathan, and H. J. Rogers (1991). Fundamentals of Item Response Theory.
Sage Publications.
Hutschinson, T. P. (1991). Ability, Parameter Information, Guessing: Statistical Modelling Applied to
Multiple-Choice Tests. Rumsby Scientific Publishing.
Kadane, J. (1974). The role of identification in Bayesian theory. In S. Fienberg and A. Zellner (Eds.),
Studies in Bayesian Econometrics and Statistics, pp. 175–191. North Holland.
Karabatsos, G. and S. Walker (2009). Coherent psychometric modelling with Bayesian nonparametrics.
British Journal of Mathematical and Statistical Psychology 62, 1–20.
Koopmans, T. C. and O. Reiersøl (1950). The Identification of Structural Characteristics. The Annals of
Mathematical Statistics 21, 165–181.
Lindley, D. V. (1971). Bayesian Statistics: A Review. Society for Industrial and Applied Mathematics.
Maris, G. and T. Bechger (2009). On interpreting the model parameters for the three parameter logistic model. Measurement: Interdisciplinary Research and Perspectives 7, 75–86.
McDonald, R. P. (1999). Test Theory: A Unified Treatment. Lawrence Erlbaum.
Miyazaki, K. and T. Hoshino (2009). A Bayesian semiparametric item response model with Dirichlet
process priors. Psychometrika 74, 375–393.
Picci, G. (1977). Some connections between the theory of sufficient statistics and the identifiability
problem. SIAM Journal on Applied Mathematics 33, 383–398.
Poirier, D. J. (1998). Revising Beliefs in Nonidentified Models. Econometric Theory 14, 483–509.
Rizopoulos, D. (2006). ltm: An R package for Latent Variable Modelling and Item Response Theory
Analyses. Journal of Statistical Software 17(5), 1–25.
Roberts, G. O. and J. Rosenthal (1998). Markov-chain Monte Carlo: Some practical implications of
theoretical results. The Canadian Journal of Statistics 26(1), 5–20.
San Martín, E., G. del Pino, and P. De Boeck (2006). IRT Models for Ability-Based Guessing. Applied Psychological Measurement 30, 183–203.
San Martín, E. and J. González (2010). Bayesian identifiability: Contributions to an inconclusive debate. Chilean Journal of Statistics 1, 69–91.
San Martín, E., J. González, and F. Tuerlinckx (2009). Identified Parameters, Parameters of Interest and Their Relationships. Measurement: Interdisciplinary Research and Perspectives 7, 95–103.
San Martín, E., A. Jara, J.-M. Rolin, and M. Mouchart (2011). On the Bayesian Nonparametric Generalization of IRT-type Models. Psychometrika 76, 385–409.
San Martín, E. and F. Quintana (2002). Consistency and Identifiability Revisited. Brazilian Journal of Probability and Statistics 16, 99–106.
Shiryaev, A. N. (1995). Probability. Second Edition. Springer.
Spivak, M. (1965). Calculus on Manifolds: A Modern Approach to Classical Theorems of Advanced
Calculus. Perseus Book Publishing.
Swaminathan, H. and J. A. Gifford (1986). Bayesian Estimation in the Three-Parameter Logistic Model.
Psychometrika 51, 589–601.
Thissen, D. (2009). On Interpreting the Parameters for any Item Response Model. Measurement: Interdisciplinary Research and Perspectives 7, 104–108.
Thissen, D. and H. Wainer (2001). Item Response Models for Items Scored in Two Categories. Springer.
van der Linden, W. and R. K. Hambleton (1997). Handbook of Modern Item Response Theory. Springer.
Wise, S. L. and C. E. DeMars (2006). An application of item response time: the effort-moderated IRT
model. Journal of Educational Measurement 43, 19–38.
Wise, S. L. and X. Kong (2005). Response time effort: a new measure of examinee motivation in computer-based tests. Applied Psychological Measurement 18, 1–17.
Woods, C. M. (2006). Ramsay-curve item response theory (RC-IRT) to detect and correct for nonnormal
latent variables. Psychological Methods 11, 253–270.
Woods, C. M. (2008). Ramsay-curve item response theory for the three-parameter logistic item response
model. Applied Psychological Measurement 32, 447–465.
Woods, C. M. and D. Thissen (2006). Item Response Theory with Estimation of the Latent Population
Distribution using Spline-Based Densities. Psychometrika 71, 281–301.