INFORMES TÉCNICOS MIDE UC / TECHNICAL REPORTS MIDE UC
Centro de Medición MIDE UC / Measurement Center MIDE UC
IT 1106

Identification of the 1PL Model with Guessing Parameter: Parametric and Semi-parametric Results

ERNESTO SAN MARTÍN (a,b,c) AND JEAN-MARIE ROLIN (d)

a Department of Statistics, Pontificia Universidad Católica de Chile, Chile. b Faculty of Education, Pontificia Universidad Católica de Chile, Chile. c Measurement Center MIDE UC, Pontificia Universidad Católica de Chile, Chile. d Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium.

September 28, 2011

Abstract

In this paper, we study the identification of a particular case of the 3PL model, namely when the discrimination parameters are all constant and equal to 1. We term this model the 1PL-G model. The identification analysis is performed under three different specifications. The first specification considers the abilities as unknown parameters. It is proved that the item parameters and the abilities are identified if a difficulty parameter and a guessing parameter are fixed at zero. The second specification assumes that the abilities are mutually independent and identically distributed according to a distribution known up to the scale parameter. It is shown that the item parameters and the scale parameter are identified if a guessing parameter is fixed at zero. The third specification corresponds to a semi-parametric 1PL-G model, where the distribution G generating the abilities is a parameter of interest. It is not only shown that, after fixing a difficulty parameter and a guessing parameter at zero, the item parameters are identified, but also that under those restrictions the distribution G is not identified. It is finally shown that, after introducing two identification restrictions either on the distribution G or on the item parameters, the distribution G and the item parameters are identified provided an infinite quantity of items is available.

Keywords: 3PL model, location-scale distributions, fixed effects, random effects, identified parameter, parameters of interest, Hilbert space.

1 Introduction

For multiple-choice tests it is reasonable to assume that respondents guess when they believe that they don't know the correct response. This type of behavior seems to be prevalent in low-stakes tests, where students are asked to take a test for which they receive neither grades nor academic credit and thus may be unmotivated to do well. The solution for this problem is to include a so-called guessing parameter in the model that would have been used without any guessing, which is commonly the 1PL or the 2PL model. Three possibilities are found in the literature (Hutchinson, 1991): (1) a fixed value L^{-1}, with L being the number of response categories; (2) an overall guessing parameter to be estimated from the data, with the same value for all items; (3) an item-specific guessing parameter. The third possibility is the one used in the 3PL, the extension of the 2PL with an item-specific guessing parameter; this parameter reflects the probability of a correct guess.
The 3PL was introduced by Birnbaum (1968), and is discussed in most handbooks of IRT (Hambleton, Swaminathan, and Rogers, 1991; van der Linden and Hambleton, 1997; Embretson and Reise, 2000; McDonald, 1999; Thissen and Wainer, 2001), and the option is available in various specialized computer programs (e.g., BILOG, LOGIST, MIRTE, MULTILOG, PARSCALE, RASCAL, R). The 3PL is a popular model, but it restricts the guessing parameter to be item dependent, instead of allowing this parameter to be also person dependent. Several extensions have been proposed. Thus, for instance, San Martín, del Pino, and De Boeck (2006) extend the 3PL model to let the guessing parameter depend on the ability of the examinee. Wise and Kong (2005) proposed to use response time to distinguish solution behavior from rapid-guessing behavior: for each item there is a threshold which is the response time boundary between solution behavior and rapid-guessing behavior. Based on this research, Wise and DeMars (2006) developed the effort-moderated model. If an examinee's response time is longer than the threshold, the model reduces to the 3PL model describing solution behavior. Otherwise, the model reduces to a constant probability model with the guessing probability being L^{-1}, with L the number of response categories. Cao and Stokes (2008) proposed three IRT models to describe the results from a group of examinees including both nonguessers and partial guessers. The first model assumes that the guesser answers questions based on his/her knowledge up to a certain test item, and guesses thereafter. The second model assumes that the guesser answers relatively easy questions based on his/her knowledge and guesses randomly on the remaining items. The third model assumes that the guesser gives less and less effort as he/she proceeds through the test. In spite of these extensions, the 3PL model still raises basic questions that deserve to be investigated and whose answers should have an impact on more complex guessing IRT models. Parameter identifiability is one of those basic questions. Identifiability is relevant because it is a necessary condition for ensuring a coherent inference on the parameters of interest. The parameters of interest are related to the sampling distributions which describe the data generating process. If there does not exist a one-to-one relationship between those parameters and the sampling distributions, the parameters of interest are not endowed with an empirical meaning. In a sampling theory framework, this fact is made explicit through the impossibility of obtaining unbiased and/or consistent estimators of unidentified parameters (Koopmans and Reiersøl, 1950; Gabrielsen, 1978; San Martín and Quintana, 2002). This limitation seems to be circumvented by a Bayesian approach because in this set-up it is always possible to compute the posterior distribution of unidentified parameters (Lindley, 1971; Poirier, 1998; Gelfand and Sahu, 1999; Ghosh, Ghosh, Chen, and Agresti, 2000). However, taking into account that a statistical model always involves an identified parametrization (see Theorem 4.3.3 in Florens, Mouchart, and Rolin (1990)), the posterior distribution of an unidentified parameter updates the identified parameter only and, consequently, does not provide any empirical information on the unidentified parameter (San Martín and González, 2010; San Martín, Jara, Rolin, and Mouchart, 2011). Thus, from either a Bayesian point of view or a sampling theory framework, identifiability is an issue that needs to be considered.
In both perspectives, an identification analysis should begin by making explicit the sampling distributions as well as the parameters of interest. Regarding the 3PL model, the psychometric literature has considered three different types of likelihoods:

1. A first likelihood corresponds to the probability distribution of the observations given both the item parameters (difficulty, discrimination and guessing ones) and the abilities. These parameters are also considered as the parameters of interest; see, for instance, Swaminathan and Gifford (1986) and Maris and Bechger (2009).

2. A second likelihood is obtained after integrating out the abilities, which in turn are assumed to be distributed according to a parametric distribution G_φ known up to a parameter φ. In this case, the sampling distribution or likelihood is indexed by the item parameters (difficulty, discrimination and guessing ones) and φ. These parameters are also considered as the parameters of interest. The abilities are typically obtained in a second step through an empirical Bayes procedure; see, for instance, Bock and Aitkin (1981) and Bock and Zimowski (1997).

3. A third likelihood is obtained after integrating out the abilities, which in turn are assumed to be distributed according to an unknown probability distribution G. In this case, the sampling distribution or likelihood is indexed by the item parameters (difficulty, discrimination and guessing ones) and G. These parameters are considered as the parameters of interest; see, for instance, Woods (2006, 2008).

Following the terminology of generalized non-linear mixed models (De Boeck and Wilson, 2004), it can be said that the first likelihood considers the abilities as fixed effects, whereas the remaining two likelihoods consider the abilities as random effects. This terminology helps to make precise whether or not the abilities are parameters indexing the likelihood function. It can be used regardless of the estimation procedure, particularly in a Bayesian framework, where the parameters indexing the likelihood are endowed with a prior distribution. For identification purposes, the estimation procedure is irrelevant. The identification problems corresponding to each of those sampling distributions are quite different. Some contributions conjecture (Adams, Wilson, and Wang, 1997; Adams and Wu, 2007) that the identification of the parameters indexing the first type of likelihood implies the identification of the parameters indexing the second (and, by extension, the third) type of likelihood. However, in the case of the Rasch model, it was shown that such relationships are not true (see Sections 3.2 and 4.2 in San Martín et al. (2011)). This result suggests that the identification problems in the context of 3PL models are still open. This paper focuses its attention on these identification problems. It is mainly motivated by the recent work of Maris and Bechger (2009), where a likelihood of the first type is considered and the discrimination parameters are equal to an unknown common value. They showed that the item parameters and the abilities are not identified. Our paper studies the identification problem in the three contexts mentioned above: Section 2 discusses the problem under a fixed-effects specification of the model; Section 3 studies the problem under a parametric random-effects specification, where the distribution generating the abilities is assumed to be known up to a scale parameter and a location parameter.
Finally, in Section 4, the problem is analyzed in a semi-parametric context, where the distribution generating the abilities is considered among the parameters of interest. The main results can be summarized as follows:

1. Under a fixed-effects specification of the model, it is shown that the item parameters and the abilities are identified if basically two restrictions are imposed: one on the difficulty parameters and one on the guessing parameters.

2. Under a parametric random-effects specification, it is shown that the item parameters and the scale parameter are identified if basically one restriction on the guessing parameters is imposed.

3. Under a semi-parametric specification, it is shown that the item parameters are identified if basically two restrictions are imposed: one on the difficulty parameters and one on the guessing parameters. Using the structure of a specific Hilbert space, it is also shown that the distribution G is not identified by the observations. However, when an infinite quantity of items is available, it is proved that G becomes identified.

The paper ends with a discussion.

2 Identification under a fixed-effects specification

The 3PL model, introduced by Birnbaum (1968), is specified as follows: for each person i = 1, ..., N and each item j = 1, ..., J, the probability that person i answers correctly item j is given by

P[Y_ij = 1 | θ_i, β_j, α_j, c_j] = c_j + (1 − c_j) Ψ[α_j(θ_i − β_j)],   (2.1)

where Ψ(x) = exp(x)/[1 + exp(x)]. This model assumes that if a person i has ability θ_i, then the probability that he/she will know the correct answer to item j is given by Ψ[α_j(θ_i − β_j)]; here α_j corresponds to the discrimination parameter of item j, whereas β_j is the corresponding difficulty parameter. It further assumes that if he/she does not know the correct answer, he/she will guess and, with probability c_j, will guess correctly; the parameter c_j is accordingly called the guessing parameter of item j. It follows from these assumptions that the probability of a correct response to item j by person i is given by (2.1). For details, see Birnbaum (1968) and Chapter 4 in Embretson and Reise (2000). The model is completed by assuming that the Y_ij's are mutually independent. The statistical model (or likelihood function) describing the data generating process corresponds, therefore, to the family of sampling distributions indexed by the parameters

(θ_{1:N}, α_{1:J}, β_{1:J}, c_{1:J}) ∈ R^N × R_+^J × R^J × [0, 1]^J,

where θ_{1:N} = (θ_1, ..., θ_N), α_{1:J} = (α_1, ..., α_J), and similarly for β_{1:J} and c_{1:J}. Recently, Maris and Bechger (2009) considered the identifiability of a particular case of the 3PL, namely when the discrimination parameters α_j, with j = 1, ..., J, are equal to α. The parameter indeterminacies inherited from the Rasch model and the 2PL model that should be removed are the location and scale ones. Maris and Bechger (2009) removed them by fixing α at one and constraining β_1 and c_1 in such a way that β_1 = −ln(1 − c_1). In this specific case, the unidentifiability of the parameters of interest persists because (θ_i, β_j, c_j) and

(ln(exp(θ_i) + r), ln(exp(β_j) − r), (c_j exp(β_j) − r)/(exp(β_j) − r)),

with a constant r such that −min{exp(θ_i) : i = 1, ..., N} ≤ r ≤ min{c_j exp(β_j) : j = 1, ..., J}, induce the same probability distribution (2.1) (with α_j = 1 for all items j).
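This indeterminacy is easy to check numerically. The following sketch (a minimal illustration; the parameter values and the admissible constant r are hypothetical, chosen only for this example) evaluates the response probabilities (2.1) with α_j = 1 before and after the transformation above, and confirms that they coincide:

```python
import numpy as np

def p_correct(theta, beta, c):
    """1PL-G probability of a correct response: c + (1 - c) * Psi(theta - beta)."""
    return c + (1.0 - c) / (1.0 + np.exp(-(theta - beta)))

rng = np.random.default_rng(0)
theta = rng.normal(size=5)                 # abilities (hypothetical values)
beta = rng.normal(size=3)                  # difficulties
c = rng.uniform(0.1, 0.3, size=3)          # guessing parameters

# any r with -min exp(theta_i) <= r <= min c_j exp(beta_j) works
r = 0.5 * np.min(c * np.exp(beta))

theta_t = np.log(np.exp(theta) + r)        # transformed abilities
beta_t = np.log(np.exp(beta) - r)          # transformed difficulties
c_t = (c * np.exp(beta) - r) / (np.exp(beta) - r)   # transformed guessing

p_orig = p_correct(theta[:, None], beta[None, :], c[None, :])
p_trans = p_correct(theta_t[:, None], beta_t[None, :], c_t[None, :])
print(np.max(np.abs(p_orig - p_trans)))    # ~1e-16: the same sampling distribution
```

The maximum difference is at machine precision: the two parameter configurations generate exactly the same sampling distribution.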
Thus, "in contrast to the location and scale indeterminacy, this new form of indeterminacy involves not only the ability and the item difficulty parameters, but also the guessing parameter" (Maris and Bechger, 2009, p. 6). The question is under which additional restrictions the 3PL with constant discriminations is identified. San Martín, González, and Tuerlinckx (2009) considered this problem for the 3PL model with discriminations equal to 1. Following San Martín et al. (2006), this model, called the 1PL-G model, is specified as

P[Y_ij = 1 | θ_i, β_j, c_j] = c_j + (1 − c_j) Ψ(θ_i − β_j).   (2.2)

The data generating process is accordingly described by a family of sampling distributions indexed by the parameters

(θ_{1:N}, β_{1:J}, c_{1:J}) ∈ R^N × R^J × [0, 1]^J.   (2.3)

San Martín et al. (2009) showed that these parameters are identified by the observations under specific restrictions, which are summarized in the following theorem:

Theorem 2.1 For the statistical model (2.2), the parameters (θ_{1:N}, β_{1:J}, c_{1:J}) are identified by the observations provided the following conditions hold:

1. At least one item is available.
2. There exist at least two persons such that their probabilities of answering correctly all the items are different.
3. c_1 and β_1 are fixed at 0.

3 Identification of the Parametric 1PL-G Model

3.1 Random-effects specification of the 1PL-G model

The previous identification results are valid under a fixed-effects specification of the 1PL-G model, that is, when the abilities are viewed as unknown parameters. However, in modern item response theory, θ is usually considered as a latent variable and, therefore, its probability distribution is an essential part of the model; see Thissen (2009). More specifically, the probability distribution generating the individual abilities θ_i's is assumed to be a location-scale distribution G^{µ,σ} defined as

P[θ_i ≤ x | µ, σ] = G^{µ,σ}((−∞, x]) = G((−∞, (x − µ)/σ]),   (3.1)

where µ ∈ R is the location parameter and σ ∈ R_+ is the scale parameter. In applications, G is typically chosen as a standard normal distribution. It is also assumed that, for each person i, his/her responses Y_i = (Y_i1, ..., Y_iJ)' satisfy the Axiom of Local Independence, namely that Y_i1, ..., Y_iJ are mutually independent conditionally on (θ_i, β_{1:J}, c_{1:J}). The distribution of Y_ij depends on (θ_i, β_j, c_j) through the function (2.2). It is finally assumed that, conditionally on (θ_{1:N}, β_{1:J}, c_{1:J}), the response patterns Y_1, ..., Y_N are mutually independent. The statistical model (or likelihood function) is obtained after integrating out the random effects θ_i's. The above-mentioned hypotheses underlying the 1PL-G model imply that the response patterns Y_1, ..., Y_N are mutually independent given (β_{1:J}, c_{1:J}, µ, σ), with a common probability distribution defined as

P[Y_i = y_i | β_{1:J}, c_{1:J}, µ, σ] = ∫_R ∏_{j=1}^J {P[Y_ij = 1 | θ, β_j, c_j]}^{y_ij} {P[Y_ij = 0 | θ, β_j, c_j]}^{1−y_ij} G^{µ,σ}(dθ),   (3.2)

where y_i = (y_i1, ..., y_iJ)' ∈ {0, 1}^J and P[Y_ij = 1 | θ, β_j, c_j] is as defined by (2.2). The parameters of interest are accordingly given by

(β_{1:J}, c_{1:J}, µ, σ) ∈ R^J × [0, 1]^J × R × R_+.   (3.3)

In order to distinguish the probability distribution (3.2) from the conditional probability P[Y_ij = 1 | θ_i, β_j, c_j], (3.2) is termed a marginal probability.
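To make the marginal probability (3.2) concrete, the following sketch computes it by Gauss-Hermite quadrature when G^{µ,σ} is a normal distribution; the function name marginal_prob and all numerical values are ours, introduced only for illustration:

```python
import numpy as np

def marginal_prob(y, beta, c, mu=0.0, sigma=1.0, n_nodes=61):
    """Marginal probability (3.2) of a response pattern y under the 1PL-G model,
    with the abilities integrated out against G^{mu,sigma} = N(mu, sigma^2)
    via Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    theta = mu + sigma * nodes                  # quadrature points for theta
    w = weights / np.sqrt(2.0 * np.pi)          # normalized N(0, 1) weights
    p = c + (1 - c) / (1 + np.exp(-(theta[:, None] - beta[None, :])))
    lik = np.prod(np.where(y[None, :] == 1, p, 1 - p), axis=1)
    return float(w @ lik)

# e.g., the probability of the pattern (1, 0, 1) for three hypothetical items
print(marginal_prob(np.array([1, 0, 1]),
                    beta=np.array([-0.5, 0.0, 0.5]),
                    c=np.array([0.0, 0.2, 0.2]), sigma=1.3))
```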
Under the iid property of the statistical model generating the Y_i's, the identification of the parameters of interest by one observation is entirely similar to their identification by an infinite quantity of observations; for a proof, see Theorem 7.6.6 in Florens et al. (1990). Thus, the identification problem to be studied in this section consists in establishing restrictions (if necessary) under which the mapping

(β_{1:J}, c_{1:J}, µ, σ) ⟼ P[Y_1 = y_1 | β_{1:J}, c_{1:J}, µ, σ]

is injective for all y_1 ∈ {0, 1}^J, where P[Y_1 = y_1 | β_{1:J}, c_{1:J}, µ, σ] is given by (3.2).

3.2 Identification strategy

Following San Martín et al. (2009), an identification strategy consists in distinguishing between parameters of interest and identified parameters. In the case of the statistical model (3.2), the parameters of interest are (β_{1:J}, c_{1:J}, µ, σ). The probabilities of the 2^J different possible patterns are given by

q_{1 2 ··· J} = P[Y_11 = 1, ..., Y_{1,J−1} = 1, Y_{1J} = 1 | β_{1:J}, c_{1:J}, µ, σ]
q_{1 2 ··· J̄} = P[Y_11 = 1, ..., Y_{1,J−1} = 1, Y_{1J} = 0 | β_{1:J}, c_{1:J}, µ, σ]
...
q_{1̄ 2̄ ··· J̄} = P[Y_11 = 0, ..., Y_{1,J−1} = 0, Y_{1J} = 0 | β_{1:J}, c_{1:J}, µ, σ],

where a bar over an index indicates an incorrect response to the corresponding item. The statistical model (3.2) corresponds, therefore, to a multinomial distribution (Y_1 | π) ∼ Mult(2^J, π), where π = (q_{1 2 ··· J}, q_{1 2 ··· J̄}, ..., q_{1̄ 2̄ ··· J̄}). It is known that the parameter π of a multinomial distribution is identified by Y_1. We accordingly term the parameter π the identified parameter; the q's with fewer than J subscripts are linear combinations of its components and, therefore, are identified parameters as well. Consequently, the identifiability of the parameters of interest follows if an injective relation between them and functions of the identified parameter π is established. This strategy is followed in the rest of this paper. Taking into account that a statistical model always involves an identified parametrization (see Florens et al., 1990; San Martín et al., 2009), the restrictions which are introduced to establish an injective relationship between the parameters of interest and the identified parameters are not only sufficient identification conditions, but also necessary ones.
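As an illustration of the identified parameter π, the following sketch (hypothetical values, reusing the marginal_prob function sketched above) enumerates the 2^J pattern probabilities for J = 3:

```python
import numpy as np
from itertools import product

# The identified parameter pi: marginal probabilities of all 2^J patterns,
# computed with the hypothetical marginal_prob function above (J = 3).
beta = np.array([0.0, 0.4, 1.0])
c = np.array([0.0, 0.2, 0.25])          # c_1 = 0, anticipating the restriction below
patterns = [np.array(y) for y in product([0, 1], repeat=3)]
pi = np.array([marginal_prob(y, beta, c, sigma=1.3) for y in patterns])
print(pi.sum())                          # ~1.0: a point in the 2^J-cell simplex
```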
3.3 Identification when G is known up to a scale parameter

Let us begin the identification analysis of the 1PL-G model when the distribution G generating the individual abilities is known up to the scale parameter σ. The corresponding identification analysis is divided into three steps:

STEP 1: It is shown that the difficulty parameters β_{1:J} are a function of the scale parameter σ, the guessing parameters c_{1:J} and the identified parameters P[Y_ij = 0 | β_{1:J}, c_{1:J}, σ], with j = 1, ..., J.

STEP 2: It is shown that the scale parameter σ is a function of the guessing parameters c_1 and c_2, as well as of the identified parameters P[Y_i1 = 1, Y_i2 = 1 | β_{1:J}, c_{1:J}, σ], P[Y_i1 = 1 | β_{1:J}, c_{1:J}, σ] and P[Y_i2 = 1 | β_{1:J}, c_{1:J}, σ].

STEP 3: It is finally shown that the difficulty parameters β_{2:J} and the guessing parameters c_{2:J} are functions of (β_1, c_1) and identified parameters which in turn depend on π.

Combining these steps, the identification of (β_{1:J}, c_{2:J}, σ) by both the observations and c_1 is obtained. Therefore, one identification restriction is needed, namely c_1 = 0.

Let us mention that both STEP 1 and STEP 2 are valid for all conditional specifications of the form

P[Y_ij = 1 | θ_i, β_j, c_j] = c_j + (1 − c_j) F(θ_i − β_j),   (3.4)

where F is a strictly increasing continuous distribution function with a positive density function on R, and not only for the logistic distribution Ψ. However, STEP 3 depends on the logistic function Ψ. In what follows, these steps are duly detailed.

3.3.1 Step 1 for the parametric 1PL-G model

Let

ω_j := P[Y_ij = 0 | β_{1:J}, c_{1:J}, σ] = δ_j p(σ, β_j),   (3.5)

where

p(σ, β_j) = ∫_R {1 − F(σθ − β_j)} G(dθ),   (3.6)

with F a strictly increasing continuous distribution function, and δ_j := 1 − c_j. The parameter ω_j is identified because it is a function of the identified parameter π; see Section 3.2. The function p(σ, β_j) is a continuous function in (σ, β_j) ∈ R_+ × R that is strictly increasing in β ∈ R because F is a strictly increasing continuous function (in particular, the logistic distribution Ψ satisfies these properties). Furthermore, p(σ, −∞) = 0 and p(σ, +∞) = 1 and, consequently,

0 ≤ ω_j ≤ δ_j for all j = 1, ..., J.   (3.7)

Therefore, if we define

p̄(σ, ε) := inf{β : p(σ, β) > ε},   (3.8)

it is clear that

p̄[σ, p(σ, β)] = β for all β.   (3.9)

Taking into account relation (3.7), it follows that β_j = p̄[σ, ω_j/δ_j]; that is, for all j = 1, ..., J, the item parameter β_j is a function of the scale parameter σ, of the identified parameter ω_j and of the non-guessing parameter δ_j (and, by extension, of the guessing parameter c_j).

3.3.2 Step 2 for the parametric 1PL-G model

Let

ω_{12} := P[Y_i1 = 0, Y_i2 = 0 | β_{1:J}, c_{1:J}, σ] = δ_1 δ_2 ∫_R {1 − F(σθ − β_1)} {1 − F(σθ − β_2)} G(dθ).   (3.10)

Using STEP 1, the identified parameter ω_{12} can be written as a function of σ, δ_1, δ_2, ω_1 and ω_2 in the following way:

ω_{12} = φ(σ, δ_1, δ_2, ω_1, ω_2) := δ_1 δ_2 ∫_R {1 − F[σθ − p̄(σ, ω_1/δ_1)]} {1 − F[σθ − p̄(σ, ω_2/δ_2)]} G(dθ).

Now, if the distribution function F has a continuous density function f strictly positive on R, then it can be shown that ω_{12} = φ(σ, δ_1, δ_2, ω_1, ω_2) is a strictly increasing continuous function of σ and, therefore, σ = φ̄(ω_{12}, δ_1, δ_2, ω_1, ω_2), where

φ̄(ω, δ_1, δ_2, ω_1, ω_2) := inf{σ : φ(σ, δ_1, δ_2, ω_1, ω_2) > ω}.

In other words, σ becomes a function of the identified parameters ω_1, ω_2 and ω_{12}, as well as of the non-guessing parameters δ_1 and δ_2. The details are developed in Appendix A.
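The two inversions underlying STEPS 1 and 2 can be mimicked numerically. The sketch below (assuming F = Ψ and G standard normal; all names and numerical values are hypothetical) recovers σ from the identified parameters ω_1, ω_2 and ω_{12}, exactly as in Sections 3.3.1 and 3.3.2:

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.stats import norm

def p_fn(sigma, beta):
    """p(sigma, beta) of (3.6) with F = Psi (logistic) and G = N(0, 1)."""
    integrand = lambda t: norm.pdf(t) / (1.0 + np.exp(sigma * t - beta))
    return quad(integrand, -np.inf, np.inf)[0]

def omega12_fn(sigma, d1, d2, b1, b2):
    """omega_12 of (3.10): joint probability of two incorrect responses."""
    integrand = lambda t: (norm.pdf(t)
                           / (1.0 + np.exp(sigma * t - b1))
                           / (1.0 + np.exp(sigma * t - b2)))
    return d1 * d2 * quad(integrand, -np.inf, np.inf)[0]

# "True" values generate the identified parameters omega_1, omega_2, omega_12
sigma0 = 1.4
b = np.array([-0.3, 0.8]); d = np.array([1.0, 0.75])      # delta_1 = 1 (c_1 = 0)
om = d * np.array([p_fn(sigma0, b[0]), p_fn(sigma0, b[1])])
om12 = omega12_fn(sigma0, d[0], d[1], b[0], b[1])

def phi(s):
    # STEP 1: invert beta -> p(s, beta), which is strictly increasing in beta
    b1 = brentq(lambda x: p_fn(s, x) - om[0] / d[0], -20.0, 20.0)
    b2 = brentq(lambda x: p_fn(s, x) - om[1] / d[1], -20.0, 20.0)
    return omega12_fn(s, d[0], d[1], b1, b2)

# STEP 2: phi is strictly increasing in sigma (Appendix A), so invert it too
sigma_hat = brentq(lambda s: phi(s) - om12, 0.1, 5.0)
print(sigma_hat)    # ~1.4: the scale parameter is recovered
```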
3.3.3 Step 3 for the parametric 1PL-G model

This step essentially depends on the logistic distribution Ψ and, consequently, the arguments below are performed using the conditional probability (2.2). Let J ≥ 3, j ≠ 1 and k ≠ j (with k, j ≤ J). Define the identified parameters p_0^J, p_j^J and p_{jk}^J as follows:

p_0^J := P[⋂_{1≤j≤J} {Y_ij = 0} | β_{1:J}, δ_{1:J}, σ] = ∏_{1≤j≤J} δ_j × I_0^J(β_{1:J}, σ),

where I_0^J(β_{1:J}, σ) = ∫_R ∏_{1≤j≤J} [exp(β_j)/(exp(β_j) + exp(σθ))] G(dθ);

p_j^J := P[⋂_{1≤k≤J, k≠j} {Y_ik = 0} | β_{1:J}, δ_{1:J}, σ] = ∏_{1≤k≤J, k≠j} δ_k × {I_0^J(β_{1:J}, σ) + e^{−β_j} I_1^J(β_{1:J}, σ)},

where I_1^J(β_{1:J}, σ) = ∫_R e^{σθ} ∏_{1≤j≤J} [exp(β_j)/(exp(β_j) + exp(σθ))] G(dθ); and

p_{jk}^J := P[⋂_{1≤r≤J, r≠j, r≠k} {Y_ir = 0} | β_{1:J}, δ_{1:J}, σ] = ∏_{1≤r≤J, r≠j, r≠k} δ_r × {I_0^J(β_{1:J}, σ) + (e^{−β_j} + e^{−β_k}) I_1^J(β_{1:J}, σ) + e^{−β_j} e^{−β_k} I_2^J(β_{1:J}, σ)},

where I_2^J(β_{1:J}, σ) = ∫_R e^{2σθ} ∏_{1≤j≤J} [exp(β_j)/(exp(β_j) + exp(σθ))] G(dθ).

Assuming that, for every j = 1, ..., J, δ_j > 0 and β_j ∈ (−∞, ∞), it follows that

p_j^J / p_0^J = (1/δ_j) {1 + e^{−β_j} g(β_{1:J}, σ)},
p_{jk}^J / p_0^J = (1/(δ_j δ_k)) {1 + (e^{−β_j} + e^{−β_k}) g(β_{1:J}, σ) + e^{−β_j} e^{−β_k} h(β_{1:J}, σ)},

where

g(β_{1:J}, σ) = I_1^J(β_{1:J}, σ)/I_0^J(β_{1:J}, σ) and h(β_{1:J}, σ) = I_2^J(β_{1:J}, σ)/I_0^J(β_{1:J}, σ).

Using these definitions, the following three propositions are established:

Proposition 3.1 Let J ≥ 3 and j ≠ k. Then

r_{jk}^J := p_{jk}^J/p_0^J − (p_j^J/p_0^J)(p_k^J/p_0^J) = (e^{−β_j} e^{−β_k}/(δ_j δ_k)) k(β_{1:J}, σ),

where k(β_{1:J}, σ) = h(β_{1:J}, σ) − g(β_{1:J}, σ)^2 ≥ 0.

The non-negativity of k(β_{1:J}, σ) follows after noticing that g(β_{1:J}, σ) = E_{G_{β_{1:J},σ}}(e^{σθ}) and h(β_{1:J}, σ) = E_{G_{β_{1:J},σ}}(e^{2σθ}), with

G_{β_{1:J},σ}(dθ) := (1/I_0^J(β_{1:J}, σ)) ∏_{1≤j≤J} [exp(β_j)/(exp(β_j) + exp(σθ))] G(dθ),

so that k(β_{1:J}, σ) is the variance of e^{σθ} under G_{β_{1:J},σ}.

Proposition 3.2 Let J ≥ 3 and j ≠ k, with j ≠ 1. Then

u_j := r_{1k}^J / r_{jk}^J = (δ_j/δ_1) e^{β_j − β_1},

and, consequently, u_j > 0. Moreover, {u_j : 2 ≤ j ≤ J} are identified parameters.

Proposition 3.3 Let J ≥ 3 and j ≠ 1. Then

v_j := (p_j^J/p_0^J) u_j − p_1^J/p_0^J = (1/δ_1)(e^{β_j − β_1} − 1),

and, consequently, v_j ∈ R. Moreover, {v_j : 2 ≤ j ≤ J} are identified parameters.

Propositions 3.2 and 3.3 entail the following identities: for j = 2, ..., J,

(i) δ_j = u_j δ_1/(v_j δ_1 + 1),   (ii) β_j = β_1 + ln(v_j δ_1 + 1).   (3.11)

These identities require that, for each j = 2, ..., J, v_j δ_1 + 1 > 0. By Proposition 3.3, this inequality is equivalent to e^{β_j − β_1} > 0, which is always true. So the inequality v_j δ_1 + 1 > 0 does not restrict the sampling process.

3.3.4 Main result for the parametric 1PL-G model

The previous results allow us to show that (β_{1:J}, c_{2:J}, σ) are identified by Y_1 provided c_1 (or, equivalently, δ_1) is fixed. As a matter of fact:

1. STEP 1, STEP 2 and (3.11) ensure that

β_j = β_1 + ln(v_j δ_1 + 1) = p̄(σ, ω_1/δ_1) + ln(v_j δ_1 + 1) = p̄(φ̄(ω_{12}, δ_1, δ_2, ω_1, ω_2), ω_1/δ_1) + ln(v_j δ_1 + 1),

where, by (3.11.i), δ_2 = u_2 δ_1/(v_2 δ_1 + 1). That is, for each j = 2, ..., J, β_j is a function of both δ_1 and (ω_{12}, ω_1, ω_2, u_2, v_2, v_j).

2. STEP 2 implies that σ is a function of (δ_1, δ_2, ω_1, ω_2, ω_{12}). Applying equality (3.11.i) for j = 2, it follows that σ is a function of (δ_1, u_2, v_2, ω_1, ω_2, ω_{12}).

3. STEP 1 implies that β_1 is a function of (δ_1, σ, ω_1). Using the fact that σ is a function of (δ_1, u_2, v_2, ω_1, ω_2, ω_{12}), it follows that β_1 is a function of (δ_1, u_2, v_2, ω_1, ω_2, ω_{12}).

4. Finally, equality (3.11.i) implies that δ_{2:J} is a function of δ_1 and {(u_j, v_j) : 2 ≤ j ≤ J}.

We summarize these findings in the following theorem:

Theorem 3.1 For the statistical model (3.2) induced by both the 1PL-G model (2.2) and the abilities distributed according to a distribution G known up to the scale parameter σ, the parameters of interest (β_{1:J}, c_{2:J}, σ) are identified by Y_1 provided that

1. At least three items are available.
2. The guessing parameter c_1 is fixed at 0.
3. c_j > 0 for every j = 2, ..., J.

Moreover, the specification of the model entails that c_j ≤ P[Y_ij = 1 | β_{1:J}, c_{1:J}, σ] for every j = 2, ..., J.
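The constructive character of Theorem 3.1 can be checked numerically. The following sketch (hypothetical values, reusing the hypothetical marginal_prob function above; code items 0, 1, 2 stand for the paper's items 1, 2, 3) computes the identified parameters u_j and v_j from pattern probabilities and recovers (β_j, δ_j) through the identities (3.11):

```python
import numpy as np

# Numerical check of Theorem 3.1 via (3.11).
beta = np.array([0.0, 0.6, -0.4]); c = np.array([0.0, 0.15, 0.3]); sigma = 1.2
d1 = 1.0 - c[0]                        # delta_1, fixed by the restriction c_1 = 0

def P(pattern):
    return marginal_prob(np.array(pattern), beta, c, sigma=sigma)

p0 = P([0, 0, 0])                                      # p_0^J
p = {j: p0 + P([int(i == j) for i in range(3)]) for j in range(3)}   # p_j^J
def pjk(j, k):                                         # p_jk^J
    return sum(P([int(i in s) for i in range(3)]) for s in [(), (j,), (k,), (j, k)])
def r(j, k):                                           # r_jk^J of Proposition 3.1
    return pjk(j, k) / p0 - (p[j] / p0) * (p[k] / p0)

for j in (1, 2):
    k = 3 - j                                          # the remaining item
    u = r(0, k) / r(j, k)                              # Proposition 3.2
    v = (p[j] / p0) * u - p[0] / p0                    # Proposition 3.3
    print(beta[j], beta[0] + np.log(v * d1 + 1))       # identity (3.11.ii)
    print(1 - c[j], u * d1 / (v * d1 + 1))             # identity (3.11.i)
```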
Condition (3) of Theorem 3.1 suggests that the 1PL model is not nested in the 1PL-G model. However, there exists a relationship between these two models. As a matter of fact, under the identification restriction c_1 = 0 (equivalently, δ_1 = 1), the marginal probability that a person i correctly answers item 1 is given (see relations (3.5) and (3.6)) by

P[Y_i1 = 1 | β_{1:J}, c_{1:J}, σ] = ∫_R [exp(σθ − β_1)/(1 + exp(σθ − β_1))] G(dθ).

This means that the test (or measurement instrument) must contain an item (labeled with the index 1) such that each person answers it without guessing. In other words, the identification restriction c_1 = 0 implies a restriction on the design of the test or measurement instrument, an aspect that the practitioner should carefully consider. A relevant question is the following: how can it be ensured that each person answers a specific item without guessing? By including in the test an open question: each person answers it correctly or incorrectly, but it is not possible to choose an answer by guessing.¹

It should be remarked that β_j = β_1 for every j = 2, ..., J if and only if v_j = 0 for every j = 2, ..., J. In this case, (β_1, δ_{2:J}, σ) are identified provided that δ_1 is fixed at 1. Similarly, δ_j = δ_1 for every j = 2, ..., J if and only if v_j δ_1 + 1 = u_j for every j = 2, ..., J. This last equality implies that δ_1 is identified and, therefore, (β_{1:J}, δ_1, σ) are identified without additional identification restrictions. Let us summarize these remarks in the following corollary:

Corollary 3.1 Consider the statistical model (3.2) induced by both the 1PL-G model (2.2) and the abilities distributed according to a distribution G known up to the scale parameter σ.

1. If the items have a common difficulty parameter β_1, then (β_1, c_{2:J}, σ) are identified by one observation if c_1 is fixed at 0.
2. If the items have a common guessing parameter c_1, then (β_{1:J}, c_1, σ) are identified by one observation.

3.3.5 Identification when G is known up to both a location and a scale parameter

If the distribution G generating the abilities is known up to both a location parameter µ and a scale parameter σ, it is then necessary to impose an identification restriction on the difficulty parameters. As a matter of fact, let G^{µ,σ} be a probability distribution given by (3.1). Relation (3.5) is, therefore, rewritten as

ω̃_j := P[Y_ij = 0 | β_{1:J}, δ_{1:J}, σ, µ] = δ_j ∫_R {1 − F(σθ + µ − β_j)} G(dθ)

for all j = 1, ..., J. Since F is a strictly increasing continuous function, we have that, for all j = 1, ..., J, β_j − µ is a function of (σ, ω̃_j, δ_j). Following the arguments developed in Sections 3.3.2 and 3.3.3, it follows that (β_1 − µ, ..., β_J − µ, c_2, ..., c_J, σ) is identified by the observations given c_1. Therefore, under a restriction of the form a'β_{1:J} = 0 such that 1'_J a ≠ 0, with a ∈ R^J known, the parameters (β_{1:J}, c_{2:J}, µ, σ) are identified by the observations. Typical choices are a = 1_J, which leads to restricting the difficulty parameters as Σ_{j=1}^J β_j = 0; or a = e_1, the first canonical vector of R^J, which leads to imposing β_1 = 0. Summarizing, we establish the following corollary:

Corollary 3.2 For the statistical model (3.2) induced by both the 1PL-G model (2.2) and the abilities distributed according to a distribution G known up to both the location parameter µ and the scale parameter σ, the parameters of interest (β_{1:J}, c_{2:J}, µ, σ) are identified by Y_i provided that

1. At least three items are available.
2. The guessing parameter c_1 is fixed at 0.
3. c_j > 0 for every j = 2, ..., J.
4. a'β_{1:J} = 0 such that 1'_J a ≠ 0, with a ∈ R^J known.

Moreover, the specification of the model entails that c_j ≤ P[Y_ij = 1 | β_{1:J}, c_{1:J}, σ], j = 2, ..., J. It should be remarked that the identification restriction imposed on β_{1:J} implies, in particular, that β_1 ≠ β_j for each j = 2, ..., J.

¹ This suggestion is due to Paul De Boeck.
4 Identification of the Semi-parametric 1PL-G model

Many IRT models are fitted under the assumption that θ_i is normally distributed with an unknown variance. The use of a distribution for θ_i (as opposed to treating θ_i as a fixed effect) is a key feature of the widely used marginal maximum likelihood method. The normal distribution is convenient to work with, especially because it is available in statistical packages such as SAS (Proc NLMIXED) or R (lme4). However, as pointed out by Woods and Thissen (2006) and Woods (2006), there exist specific fields, such as personality and psychopathology, in which the normality assumption is not realistic (for references, see Woods (2006)). In these fields, it could be argued that psychopathology and personality variables are likely to be positively skewed, because most persons in the general population have low pathology, and fewer persons have severe pathology. However, the distribution G of θ_i is unobservable and, consequently, though a researcher may hypothesize about it, it is not known in advance of an analysis. Therefore, any a priori parametric restriction on the shape of the distribution G could be a misspecification. These considerations lead to extending parametric IRT models by considering the distribution G as a parameter of interest and, therefore, to estimating it by using nonparametric techniques. Besides the contributions of Woods and Thissen (2006) and Woods (2006, 2008), Bayesian non-parametric methods applied to IRT models should also be mentioned; see, among others, Roberts and Rosenthal (1998); Karabatsos and Walker (2009); Miyazaki and Hoshino (2009). In spite of these developments, it is relevant to investigate whether the item parameters as well as the distribution G of an IRT model, particularly the 1PL-G model, are identified or not by the observations. If such parameters of interest are identified, then a semi-parametric extension of the 3PL model actually provides greater flexibility than does the assumption of a nonnormal parametric form for G. In this section, we consider the identification problem of a semi-parametric extension of the 1PL-G model, which can be viewed as a particular case of the semi-parametric 3PL model.

4.1 Semi-parametric specification of the 1PL-G model

A semi-parametric 1PL-G model is obtained after substituting the parametric hypothesis (3.1) by the following hypothesis:

(θ_i | G) iid∼ G,   (4.1)

where G is a probability distribution on (R, B). The rest of the model structure is as specified in Section 3.1. The statistical model is obtained after integrating out the random effects θ_i's. The response patterns Y_1, ..., Y_N are mutually independent conditionally on (β_{1:J}, c_{1:J}, G). The common distribution of Y_i is given by

P[Y_i = y | β_{1:J}, c_{1:J}, G] = ∫_R ∏_{j=1}^J {c_j y_j + (1 − c_j) exp[y_j(θ − β_j)]/(1 + exp(θ − β_j))} G(dθ),   (4.2)

where y ∈ {0, 1}^J. Consequently, the parameters of interest and the corresponding parameter space are

(β_{1:J}, c_{1:J}, G) ∈ R^J × [0, 1]^J × P(R, B).
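When G is not normal, the marginal probability (4.2) can still be approximated, for instance by Monte Carlo. A minimal sketch follows; the function marginal_prob_G, the mixture chosen for G, and all numerical values are hypothetical, introduced only for illustration:

```python
import numpy as np

def marginal_prob_G(y, beta, c, theta_draws):
    """Marginal probability (4.2) when G is available only through draws,
    approximated by Monte Carlo averaging over the abilities."""
    p = c + (1 - c) / (1 + np.exp(-(theta_draws[:, None] - beta[None, :])))
    lik = np.prod(np.where(y[None, :] == 1, p, 1 - p), axis=1)
    return float(lik.mean())

rng = np.random.default_rng(1)
n = 200_000
# a positively skewed G, as in the psychopathology example: a normal mixture
G_draws = np.where(rng.uniform(size=n) < 0.8,
                   rng.normal(-0.5, 0.6, size=n), rng.normal(1.5, 1.0, size=n))
print(marginal_prob_G(np.array([1, 0, 1]), np.array([0.0, 0.5, 1.0]),
                      np.array([0.0, 0.2, 0.2]), G_draws))
```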
4.2 Identification analysis under a finite quantity of items

Similarly to the parametric 1PL-G model (see Section 3.2), the statistical model induced by the semi-parametric 1PL-G model corresponds to a multinomial distribution Mult(2^J, π), where the identified parameter π = (q_{1 2 ··· J}, q_{1 2 ··· J̄}, ..., q_{1̄ 2̄ ··· J̄}) is given by

q_{1 2 ··· J} = P[Y_11 = 1, ..., Y_{1,J−1} = 1, Y_{1J} = 1 | β_{1:J}, c_{1:J}, G]
q_{1 2 ··· J̄} = P[Y_11 = 1, ..., Y_{1,J−1} = 1, Y_{1J} = 0 | β_{1:J}, c_{1:J}, G]
...
q_{1̄ 2̄ ··· J̄} = P[Y_11 = 0, ..., Y_{1,J−1} = 0, Y_{1J} = 0 | β_{1:J}, c_{1:J}, G].   (4.3)

These marginal probabilities are of the form (4.2). The parameters of interest (β_{1:J}, c_{1:J}, G) become identified if they can be written as functions of π. The identification analysis developed in the context of the parametric 1PL-G model actually provides insight for the identification of (β_{1:J}, c_{1:J}, G). As a matter of fact, in the parametric case, the identification analysis was based on three steps, each of them involving specific features:

1. STEPS 1 and 2 essentially depend on the parameter σ indexing the distribution generating the random effects θ_i's. More specifically, the item parameters β_{1:J} are written as a function of c_{1:J} and σ, whereas σ is written as a function of c_1 and c_2; see Sections 3.3.1 and 3.3.2.

2. STEP 3 first establishes that the item parameters (β_{2:J}, c_{2:J}) can be written as a function of β_1 and c_1. Second, using STEPS 1 and 2, it is concluded that (β_{1:J}, c_{2:J}, σ) is a function of c_1; see Section 3.3.3.

Thus, STEPS 1 and 2 critically depend on the parametric hypothesis assumed on the distribution generating the random effects θ_i's, whereas a part of STEP 3 does not depend on it. This suggests that STEP 3 can be used in the identification analysis of (β_{1:J}, c_{1:J}, G).

4.2.1 Identification of the item parameters

More precisely, let J ≥ 3, j ≠ 1 and k ≠ j (with k, j ≤ J). Similarly to Section 3.3.3, define the identified parameters p_0^J, p_j^J and p_{jk}^J as follows:

p_0^J := P[⋂_{1≤j≤J} {Y_ij = 0} | β_{1:J}, δ_{1:J}, G] = ∏_{1≤j≤J} δ_j × I_0^J(β_{1:J}, G);

p_j^J := P[⋂_{1≤k≤J, k≠j} {Y_ik = 0} | β_{1:J}, δ_{1:J}, G] = ∏_{1≤k≤J, k≠j} δ_k × {I_0^J(β_{1:J}, G) + e^{−β_j} I_1^J(β_{1:J}, G)};

p_{jk}^J := P[⋂_{1≤r≤J, r≠j, r≠k} {Y_ir = 0} | β_{1:J}, δ_{1:J}, G] = ∏_{1≤r≤J, r≠j, r≠k} δ_r × {I_0^J(β_{1:J}, G) + (e^{−β_j} + e^{−β_k}) I_1^J(β_{1:J}, G) + e^{−β_j} e^{−β_k} I_2^J(β_{1:J}, G)};

where δ_1 := 1 − c_1 and

I_0^J(β_{1:J}, G) = ∫_R ∏_{1≤j≤J} [exp(β_j)/(exp(β_j) + exp(θ))] G(dθ);
I_1^J(β_{1:J}, G) = ∫_R e^θ ∏_{1≤j≤J} [exp(β_j)/(exp(β_j) + exp(θ))] G(dθ);
I_2^J(β_{1:J}, G) = ∫_R e^{2θ} ∏_{1≤j≤J} [exp(β_j)/(exp(β_j) + exp(θ))] G(dθ).

Assuming that δ_j > 0 and β_j ∈ (−∞, ∞) for each j = 2, ..., J, it follows, by the same arguments developed in Section 3.3.3, that

δ_j = u_j δ_1/(v_j δ_1 + 1),   β_j = β_1 + ln(v_j δ_1 + 1),   j = 2, ..., J,   (4.4)

where u_j and v_j are defined as in Propositions 3.2 and 3.3. Taking into account that the u_j's and v_j's are identified parameters, the following theorem can be established:

Theorem 4.1 For the statistical model (4.2) induced by the semi-parametric 1PL-G model, the item parameters (β_{2:J}, c_{2:J}) are identified by Y_1 provided that

1. At least three items are available.
2. β_1 = 0 and c_1 = 0.
3. c_j > 0 for each j = 2, ..., J.

Moreover, under these identification restrictions, the item parameters can be expressed in terms of marginal probabilities; that is,

(i) β_j = ln(v_j + 1) = ln[(p_j^J p_{1k}^J − p_1^J p_{jk}^J)/(p_0^J p_{jk}^J − p_j^J p_k^J) + 1],
(ii) c_j = 1 − u_j/(v_j + 1) = 1 − (p_0^J p_{1k}^J − p_1^J p_k^J)/(p_{jk}^J (p_0^J − p_1^J) + p_j^J (p_{1k}^J − p_k^J)),   (4.5)

for some k ≠ 1, j. It should be mentioned that equalities (4.5) provide an explicit statistical meaning for the item parameters in terms of the data generating process. However, it is not possible to explain them in narrative terms. According to Theorem 4.1, the marginal probability that a person i correctly answers the standard item 1 is given by

P[Y_i1 = 1 | β_{1:J}, c_{1:J}, G] = ∫_R [exp(θ)/(1 + exp(θ))] G(dθ).
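Equalities (4.5) can be checked numerically. The following sketch (reusing the hypothetical marginal_prob_G function and G_draws above; code items 0, 1, 2 stand for the paper's items 1, 2, 3) recovers β_j and c_j from the marginal probabilities alone, up to Monte Carlo error:

```python
import numpy as np

# Numerical check of (4.5) under the restrictions beta_1 = 0 and c_1 = 0.
beta = np.array([0.0, 0.7, -0.3]); c = np.array([0.0, 0.2, 0.35])

def P(pattern):
    return marginal_prob_G(np.array(pattern), beta, c, G_draws)

p0 = P([0, 0, 0])
p = {j: p0 + P([int(i == j) for i in range(3)]) for j in range(3)}
def pjk(j, k):
    return sum(P([int(i in s) for i in range(3)]) for s in [(), (j,), (k,), (j, k)])

j, k = 1, 2
num_v = p[j] * pjk(0, k) - p[0] * pjk(j, k)            # v_j numerator in (4.5)(i)
den_v = p0 * pjk(j, k) - p[j] * p[k]
print(beta[j], np.log(num_v / den_v + 1))              # recovered beta_j
num_c = p0 * pjk(0, k) - p[0] * p[k]                   # (4.5)(ii)
den_c = pjk(j, k) * (p0 - p[0]) + p[j] * (pjk(0, k) - p[k])
print(c[j], 1 - num_c / den_c)                         # recovered c_j
```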
As in the parametric case (see Section 3.3.4), this means that the test (or measurement instrument) must contain an item (labeled with 1) such that each person answers it without guessing. Finally, according to Theorem 4.1, the guessing parameters c_j's are strictly positive. This suggests that the semi-parametric 1PL model is not nested in the semi-parametric 1PL-G model.

4.2.2 Is the distribution G identified?

Let us now consider the possible identification of the distribution G generating the random effects. Using relationship (4.4), the conditional probability that a person i answers item j incorrectly can be rewritten as follows:

P[Y_ij = 0 | β_j, δ_j, θ_i] = δ_j exp(β_j)/(exp(β_j) + exp(θ_i)) = u_j / [v_j + 1/δ_1 + exp(θ_i)/(δ_1 exp(β_1))].   (4.6)

Let K ⊂ {1, ..., J}. Then the following 2^J − 1 equations are available to analyze the identifiability of the distribution G:

p_K := P[⋂_{j∈K} {Y_ij = 0} | β_{1:J}, c_{1:J}, G] = ∫_R ∏_{j∈K} {u_j / [v_j + 1/δ_1 + exp(θ)/(δ_1 exp(β_1))]} G(dθ)
= ∏_{j∈K} u_j × ∫_R G(dθ) / ∏_{j∈K} [v_j + 1/δ_1 + exp(θ)/(δ_1 exp(β_1))].

It should be mentioned that the information provided by the 2^J − 1 marginal probabilities p_K is exactly the same as the information provided by the marginal probabilities q's as defined in (4.3). Taking into account that the u_j's as well as the p_K's are identified, and using the identification restrictions established in Theorem 4.1 (so that β_1 = 0 and δ_1 = 1), it follows that, for every subset K ⊂ {1, ..., J}, except the empty set,

m_G(K) := ∫_R G(dθ) / ∏_{j∈K} [v_j + 1 + exp(θ)]   (4.7)

is identified by the observations. Denote {m_G(K) : K ⊂ {1, ..., J} \ ∅} by m_G. The question is whether the identifiability of these 2^J − 1 functionals ensures the identifiability of G; that is, if two distributions G_1 and G_2 on (R, B) satisfy m_{G_1} = m_{G_2}, is it true that G_1 = G_2? We argue that those 2^J − 1 equations are far from being enough to identify G. As a matter of fact, let us suppose that G has a density function g with respect to a σ-finite measure λ on (R, B), that is, g = dG/dλ. Suppose furthermore that g ∈ L^2(R, B, λ). Then (4.7) can equivalently be written as

m_g(K) = ∫_R g(θ) dλ(θ) / ∏_{j∈K} [v_j + 1 + exp(θ)]   for all K ⊂ {1, ..., J}, with K ≠ ∅.   (4.8)

Since v_j > −1 for each j, it follows that

0 < ∏_{j∈K} 1/[v_j + 1 + exp(θ)] ≤ c   for all K ⊂ {1, ..., J}, with K ≠ ∅,

where c is a real constant. Therefore,

f_K(θ) := ∏_{j∈K} 1/[v_j + 1 + exp(θ)] ∈ L^2(R, B, λ)

for each K ⊂ {1, ..., J}, with K ≠ ∅. Define the linear functional T : L^2(R, B, λ) → R^{2^J−1} as Tg = m_g, where m_g = (m_g({1}), m_g({2}), ..., m_g({1, ..., J})) and m_g(K) is defined by (4.8) for each set K. Thus, the identifiability of g can be written as follows:

Tg_1 = Tg_2 ⟹ g_1 = g_2.

Taking into account that T is linear, we conclude that g is identified if and only if Tg = 0 ⟹ g = 0, that is, if and only if Ker(T) = {0}. If this were the case, then we would conclude that L^2(R, B, λ) is of finite dimension. As a matter of fact, as is well known, (L^2(R, B, λ), (·,·)), with the inner product (·,·) defined as

(f, h) = ∫_R f(θ) h(θ) dλ(θ), for f, h ∈ L^2(R, B, λ),

is a Hilbert space. Now, let g ∈ L^2(R, B, λ) be such that Tg = 0. It follows that g is orthogonal to each f_K; that is,

(g, f_K) = 0   for all K ⊂ {1, ..., J}, with K ≠ ∅,

and, therefore, g is orthogonal to N, the span generated by {f_K : K ⊂ {1, ..., J}, K ≠ ∅}. Since N is of finite dimension, it is therefore closed in L^2(R, B, λ). Hence Ker(T) = N^⊥, and if Ker(T) = {0}, then N^⊥ = {0} and, N being closed, N = L^2(R, B, λ) (see Theorem 1, Section 4, in Halmos (1951)), which is impossible because L^2(R, B, λ) is an infinite-dimensional linear space. Summarizing, we obtain the following theorem:

Theorem 4.2 For the statistical model (4.2) induced by the semi-parametric 1PL-G model, assume that there are at least three items and that c_j > 0 for each j = 2, ..., J. Then the (2^J − 1)-dimensional vector m_G is identified by the observations provided that β_1 and c_1 are fixed, but the distribution G generating the individual abilities is not identified.
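The finite-rank nature of T makes the non-identification of G easy to exhibit numerically: any direction in Ker(T) can be added to a distribution without changing m_G. A minimal sketch with discrete distributions (the v_j values and the support points are hypothetical; v_1 = 0 reflects the restrictions β_1 = 0 and c_1 = 0):

```python
import numpy as np
from itertools import combinations
from scipy.linalg import null_space

# Two different discrete ability distributions with identical m_G(K) for all
# 2^J - 1 subsets K (J = 3), illustrating Theorem 4.2.
v = np.array([0.0, -0.2, 0.5])
theta = np.linspace(-2.0, 2.0, 10)            # common support points
subsets = [s for r in (1, 2, 3) for s in combinations(range(3), r)]

A = np.array([[np.prod(1.0 / (v[list(K)] + 1.0 + np.exp(t))) for t in theta]
              for K in subsets])              # row K, column i: f_K(theta_i)
A = np.vstack([A, np.ones(len(theta))])       # also match the total mass

z = null_space(A)[:, 0]                       # a direction invisible to every m_G(K)
w1 = np.full(len(theta), 0.1)                 # G_1: uniform weights
w2 = w1 + 0.05 / np.max(np.abs(z)) * z        # G_2: perturbed, still a valid pmf
print(np.max(np.abs(A @ (w1 - w2))))          # ~0: m_{G_1} = m_{G_2}
print(np.max(np.abs(w1 - w2)))                # > 0: but G_1 != G_2
```

Since the matrix A has at most 2^J rows but arbitrarily many columns, its null space is non-trivial whenever the support is rich enough, which is the finite-dimensionality argument above in discrete form.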
4.3 Identification analysis under an infinite quantity of items

The identification arguments previously developed, either in the parametric case or in the semi-parametric case, have a common feature, namely writing the parameters of interest as functions of identified parameters. Now the problem consists in writing the distribution G as a function of identified parameters when an infinite number of items is available. This type of relationship can be developed in a Bayesian framework because, in such a framework, the concept of identification reduces to a condition of measurability, in the sense that a parameter is said to be Bayesian identified if and only if it is a measurable function of countably many sampling expectations of statistics (that is, expectations of functions of the data conditionally on the parameters); for details, see Appendix B. According to Theorem 4.2, for every K ⊂ {1, ..., J}, with K ≠ ∅, the following functionals are identified by one observation:

m_G(K) = ∫_R G(dθ) / ∏_{j∈K} [v_j + δ_1^{-1} + δ_1^{-1} exp(θ − β_1)] = ∫_{δ_1^{-1}}^∞ G^{β_1,δ_1}(dx) / ∏_{j∈K} [v_j + x],

where

X = 1/δ_1 + exp(θ)/(δ_1 exp(β_1))

and, therefore,

G^{β_1,δ_1}(x) = G[β_1 + ln(δ_1 x − 1)].   (4.9)

Note that the support of the random variable X is (δ_1^{-1}, ∞). However, by assuming the following four conditions:

H1. For all m, n ∈ N, θ_{1:n}, β_{1:m} and c_{1:m} are mutually independent conditionally on (G, H, K).
H2. The item parameters β_{1:∞} form an iid process, where H is the common probability distribution.
H3. The item parameters c_{1:∞} form an iid process, where K is the common probability distribution.
H4. G, H and K are mutually independent;

it can be proved that

∫_{R_+} f(x) G^{β_1,δ_1}(dx)   (4.10)

is identified by one observation, for every bounded continuous function f : R_+ → R, where G^{β_1,δ_1} is defined by (4.9). In particular, since

f_n(y) = 1_{(0,x]}(y) + [1 − n(y − x)] 1_{(x,x+1/n)}(y) ↓ 1_{(0,x]}(y)   for all x ∈ R_+, as n → ∞,

the monotone convergence theorem implies that, for every x ∈ R_+, G^{β_1,δ_1}((0, x]), and so G^{β_1,δ_1}, is identified by one observation. It is interesting to remark how strong the identifiability condition (4.10) is when compared with the identifiability of the functionals m_G(K): the first is a condition valid for every function f ∈ C_b(R_+), where C_b(R_+) denotes the set of bounded continuous functions f : R_+ → R, whereas the second condition is valid only for the set

{∏_{j∈K} [v_j + x]^{-1} : K ⊂ {1, ..., J} \ ∅, J ≥ 3} ⊂ C_b((δ_1^{-1}, ∞)).

The following theorem, proved in Appendix C, establishes conditions under which the item parameters and the latent distribution G are identified in the asymptotic Bayesian model (Y_i, β_{1:∞}, c_{1:∞}, G):

Theorem 4.3 Consider an asymptotic Bayesian semi-parametric 1PL-G model obtained when the number of items J → ∞.
The item parameters (β_{1:∞}, c_{1:∞}) and the latent distribution G generating the individual abilities are b-identified by Y_1 if the following three conditions hold:

1. c_j > 0 for every j ∈ N.
2. The difficulty parameters β_{1:∞} and the guessing parameters c_{1:∞} satisfy conditions H1–H4.
3. At least one of the following identifying restrictions holds:
(a) β_1 = 0 a.s. and c_1 = 0 a.s.
(b) G is a.s. a probability distribution on R such that its mean and variance are known constants.
(c) G is a.s. a probability distribution on R with two known q-quantiles.

It should be mentioned that if a Bayesian non-parametric procedure is implemented for estimating the identified functionals m_G, it is known that the identification restrictions (1) or (3) of Theorem 4.3 are easily implemented, but not the identification restriction (2). However, this type of consideration is outside the scope of this paper.

5 Discussion

We have studied the identification problem of a particular case of the 3PL model, namely the 1PL-G model, which assumes that the discrimination parameters are all equal to 1. The identification problem was studied under three different specifications. The first specification assumes that the individual abilities are unknown parameters. The second specification considers the abilities as mutually independent random variables with a common distribution known up to the scale parameter; in this context, the case where the distribution generating the individual abilities is known up to both the scale parameter and the location parameter is also considered. The third specification corresponds to a semi-parametric 1PL-G model, where the distribution generating the individual abilities is unspecified and, consequently, considered as a parameter of interest.

For the first specification, the parameters of interest are the difficulty parameters, the guessing parameters and the individual abilities. These are identified provided a difficulty parameter and a guessing parameter are fixed at zero. For the second specification, the parameters of interest are the difficulty and guessing parameters, and the scale parameter. It was shown that these parameters are identified by one observation if a guessing parameter is fixed at zero. It should be emphasized that this identification result imposes a design restriction on the test, namely to ensure that the test involves an item that any person will answer without guessing. In the context of educational measurement, this condition may be satisfied if the test involves an open question. In the context of the second specification, the identification problem when the distribution generating the individual abilities is known up to both the scale and the location parameters was also studied. In this context, the parameters of interest are identified if a guessing parameter and a difficulty parameter are fixed at 0. For the third specification, the parameters of interest are the difficulty parameters, the guessing parameters and the distribution G generating the individual abilities. When at least three items are available, the item parameters are identified provided a difficulty parameter and a guessing parameter are fixed at 0. However, under these identification restrictions, it was proved that the distribution G is not identified. This lack of identification jeopardizes the empirical meaning of an estimate of G under a finite number of items.
This result is of practical relevance, especially considering the large amount of research trying to relax the parametric assumption on G in the IRT literature. In the unrealistic case when an infinite quantity of items is available, the distribution G and the item parameters become identified if either a difficulty parameter and a guessing parameter are fixed at zero, or two characteristics of G (the first two moments or two quantiles) are fixed. For an overview of these results, see Table 1.

Table 1: Identification restrictions in different specifications of the 1PL-G model

(1) Fixed-effects.
    Parameters of interest: (θ_{1:N}, β_{1:J}, c_{1:J}) ∈ R^N × R^J × [0, 1]^J.
    Identification restrictions: (i) β_1 = 0 and c_1 = 0; (ii) N ≥ 2; (iii) P[Y_ij = 1 | θ_i, β_j, c_j] ≠ P[Y_i'j = 1 | θ_i', β_j, c_j] for every j, where i ≠ i'.
    Number of items: 1 ≤ J < ∞. Identified parameters: (θ_{1:N}, β_{2:J}, c_{2:J}). See Theorem 2.1.

(2) Random-effects with (θ_i | σ) ∼ G^σ.
    Parameters of interest: (β_{1:J}, c_{1:J}, σ) ∈ R^J × [0, 1]^J × R_+.
    Identification restrictions: (i) c_1 = 0; (ii) c_j > 0 for every j = 2, ..., J.
    Number of items: 3 ≤ J < ∞. Identified parameters: (β_{1:J}, c_{2:J}, σ). See Theorem 3.1.

(3) Random-effects with (θ_i | σ, µ) ∼ G^{µ,σ}.
    Parameters of interest: (β_{1:J}, c_{1:J}, µ, σ) ∈ R^J × [0, 1]^J × R × R_+.
    Identification restrictions: (i) c_1 = 0; (ii) c_j > 0 for every j = 2, ..., J; (iii) a'β_{1:J} = 0, with 1'_J a ≠ 0 and a known.
    Number of items: 3 ≤ J < ∞. Identified parameters: (β_{2:J}, c_{2:J}, µ, σ). See Corollary 3.2.

(4) Semi-parametric.
    Parameters of interest: (β_{1:J}, c_{1:J}, G) ∈ R^J × [0, 1]^J × P(R, B).
    Identification restrictions: (i) β_1 = 0 and c_1 = 0; (ii) c_j > 0 for every j = 2, ..., J.
    Number of items: 3 ≤ J < ∞. Identified parameters: (β_{2:J}, c_{2:J}, m_G). See Theorems 4.1 and 4.2.

(5) Semi-parametric.
    Parameters of interest: (β_{1:∞}, c_{1:∞}, G).
    Identification restrictions: (i) c_j > 0 for every j; (ii) H1–H4; (iii) at least one of the following conditions holds: (iii.1) β_1 = 0 and c_1 = 0; (iii.2) the variance and the mean of G are a.s. known; (iii.3) two q-quantiles of G are a.s. known.
    Number of items: J = ∞. Identified parameters: (β_{2:∞}, c_{2:∞}, G) under (iii.1); or (β_{1:∞}, c_{1:∞}, G) under (iii.2) or (iii.3), except what is known on G. See Theorem 4.3.

It should be remarked that the proofs of these identification results consisted in obtaining the corresponding identified parametrizations of the sampling process. Thereafter, identification restrictions were imposed in such a way that the parameters of interest become identified. This means that the identification restrictions are not only sufficient conditions, but also necessary. On the other hand, these results show that the identification of the fixed-effects 1PL-G model is not sufficient for obtaining the identification of either the parametric or the semi-parametric random-effects specification of the 1PL-G model. We argue that our identification results are of practical relevance, especially because the 3PL model is widely used to analyze educational data and is actually available in various specialized software packages. In applications, it is well known that the 3PL has estimation problems, especially for the guessing parameters. Some programs warn against this type of problem; see, for instance, Rizopoulos (2006). Taking into account that the 1PL-G model is a particular case of the 3PL model, our results show that one source of such problems is the lack of parameter identifiability. Furthermore, the results exposed in this paper should be considered as an invitation to investigate the identifiability of more complex guessing IRT models. Finally, these results allow us to propose the 1PL-G model as a well-behaved guessing IRT model, in the sense that we provide the identification restrictions in three contexts which are of interest in educational applications. In other words, until the identifiability of the 3PL model is established, the 1PL-G model should be considered the better-behaved alternative.

Appendix

A Identifiability of the scale parameter σ by ω_{12}, δ_1, δ_2, ω_1 and ω_2

To prove that the function ω_{12} given by (3.10) is a strictly increasing continuous function of σ, we need to study the sign of its derivative with respect to σ. This requires not only the use of the Implicit Function Theorem (Spivak, 1965), but also regularity conditions allowing differentiation under the integral sign. We accordingly assume that the cumulative distribution function F has a continuous density function f strictly positive on R. Furthermore, to prove that ω_{12} is a strictly increasing continuous function of σ, we need the derivatives under the integral sign of the function p(σ, β), as defined in (3.6), with respect to σ and to β. Consequently, the standard regularity conditions are of the type

∫_R f(σx − β) G(dx) < ∞,   ∫_R |x| f(σx − β) G(dx) < ∞,

under which the Dominated Convergence Theorem can be applied to ensure the validity of differentiation under the integral sign.
Under these regularity conditions, the function p(σ, β) is continuously differentiable under the integral sign on R_0^+ × R and, therefore,

(i) D_2 p(σ, β) := ∂p(σ, β)/∂β = ∫_R f(σx − β) G(dx),
(ii) D_1 p(σ, β) := ∂p(σ, β)/∂σ = −∫_R x f(σx − β) G(dx).   (A.11)

Thus, p̄(σ, ε) as defined by (3.8) is also continuously differentiable on R_0^+ × (0, 1) and, from (3.9), we obtain that

(i) 1 = ∂p̄[σ, p(σ, β)]/∂β = D_2 p̄[σ, p(σ, β)] × D_2 p(σ, β),
(ii) 0 = ∂p̄[σ, p(σ, β)]/∂σ = D_1 p̄[σ, p(σ, β)] + D_2 p̄[σ, p(σ, β)] × D_1 p(σ, β),   (A.12)

where D_1 p̄(σ, ω) := ∂p̄(σ, ω)/∂σ and D_2 p̄(σ, ω) := ∂p̄(σ, ω)/∂ω. Combining (A.11) and (A.12), we obtain that

(i) D_2 p̄(σ, ω) = 1/D_2 p[σ, p̄(σ, ω)] = 1/∫_R f[σx − p̄(σ, ω)] G(dx),
(ii) D_1 p̄(σ, ω) = −D_1 p[σ, p̄(σ, ω)]/D_2 p[σ, p̄(σ, ω)] = ∫_R x f[σx − p̄(σ, ω)] G(dx) / ∫_R f[σx − p̄(σ, ω)] G(dx) =: E_{σ,ω}(X),   (A.13)

where

P_{σ,ω}[X ∈ dx] = G_{σ,ω}(dx) := f[σx − p̄(σ, ω)] G(dx) / ∫_R f[σx − p̄(σ, ω)] G(dx).

Thanks to the regularity conditions allowing differentiation of p(σ, β), and to the fact that F ≤ 1, it can be shown that ω_{12} is continuously differentiable under the integral sign in σ, β_1 and β_2; therefore the function φ(σ, δ_1, δ_2, ω_1, ω_2) is continuously differentiable under the integral sign with respect to σ. It remains to show that the derivative w.r.t. σ is strictly positive. Let us consider the sign of one of the two terms of the derivative of φ(σ, δ_1, δ_2, ω_1, ω_2). Using (A.13.ii), we obtain that

∂/∂σ {1 − F[σθ − p̄(σ, ω_1/δ_1)]} = −f[σθ − p̄(σ, ω_1/δ_1)] (θ − E_{σ,ω_1/δ_1}(X)).

Therefore, such a second term can be written as

δ_1 δ_2 ∫_R {−f[σθ − p̄(σ, ω_1/δ_1)]} (θ − E_{σ,ω_1/δ_1}(θ)) {1 − F[σθ − p̄(σ, ω_2/δ_2)]} G(dθ)
= δ_1 δ_2 ∫_R f[σθ − p̄(σ, ω_1/δ_1)] G(dθ) × C_{σ,ω_1/δ_1}(θ, F[σθ − p̄(σ, ω_2/δ_2)]),

where C_{σ,ω_1/δ_1} denotes the covariance with respect to G_{σ,ω_1/δ_1}.
Now, since F[σθ − p̄(σ, ω_2/δ_2)] is a strictly increasing function of θ, the covariance between θ and F[σθ − p̄(σ, ω_2/δ_2)] (with respect to G_{σ,ω_1/δ_1}) is strictly positive (if θ is not degenerate). Furthermore, ∫_R f[σθ − p̄(σ, ω_1/δ_1)] G(dθ) is clearly strictly positive. The two terms of the derivative of φ(σ, δ_1, δ_2, ω_1, ω_2) are, therefore, strictly positive.

B Bayesian identification

Whereas the classical or sampling approach considers a statistical model as an indexed family of distributions on the sample space, the Bayesian approach considers a unique probability measure on the product space "parameters × observations". This produces two different approaches to identification: the injectivity of a mapping in a sampling theory approach and the minimal sufficiency of the parametrization in a Bayesian approach (Kadane, 1974; Picci, 1977). We refer the reader to Section 4.6.2 in Florens et al. (1990) for the technical conditions required to link these approaches. In order to define Bayesian identification, it is necessary first to define the concept of a sufficient parameter.

Definition B.1 Consider the Bayesian model defined by the joint probability distribution on (Y, ϑ). A function ψ := g(ϑ) of the parameter ϑ is a sufficient parameter for Y if the conditional distribution of the sample Y given ϑ is the same as the conditional distribution of Y given ψ, that is, Y ⊥⊥ ϑ | ψ.

The previous definition implies that the distribution of Y is completely determined by the sufficient parameter ψ, ϑ being redundant. In fact, by definition of conditional independence, Y ⊥⊥ ϑ | ψ implies that E[h(Y) | ϑ] = E[h(Y) | ψ] for every measurable function h. Equivalently, by the symmetry of a conditional independence relation, it can also be concluded that ψ is a sufficient parameter if the conditional distribution of the redundant part of ϑ, given the sufficient parameter ψ, is not updated by the sample, that is, E[f(ϑ) | Y, ψ] = E[f(ϑ) | ψ] for every measurable function f. Because of the numerous sufficient parameters in a given problem, one might ask whether a given sufficient parameter ψ still contains redundant information about the sampling process. Suppose that there exists a function of ψ, say ψ_1, which is also a sufficient parameter for Y, that is, Y ⊥⊥ ϑ | ψ_1. It follows that Y ⊥⊥ ψ | ψ_1 and, therefore, the sampling process is fully characterized by ψ_1, ψ being redundant in the sense that

E[h(Y) | ϑ] = E[h(Y) | ψ] = E[h(Y) | ψ_1], for every measurable function h,

or that E[f(ψ) | Y, ψ_1] = E[f(ψ) | ψ_1], for every measurable function f. Clearly, ψ_1 should be preferred over ψ because it contains less redundant information about the sampling process. In fact, a parametrization that achieves the most parameter reduction while retaining all the information about the sampling process should be considered preferable. The definition of such a parameter is formalized next. For the Bayesian model defined on (Y, ϑ), ϑ_min is a minimal sufficient parameter for Y if the following conditions are satisfied: (i) ϑ_min is a measurable function of ϑ, (ii) ϑ_min is a sufficient parameter, and (iii) for any other sufficient parameter ψ, ϑ_min is a measurable function of it. From this definition, it follows that ϑ_min does not contain redundant information about the sampling process because there does not exist a non-injective function of it, say φ, such that Y ⊥⊥ ϑ_min | φ.
These considerations lead to the definition of Bayesian identification (Florens and Rolin, 1984).

Definition B.2 Consider the Bayesian model defined by the joint probability distribution on (Y, ϑ). The parameter ϑ is said to be Bayesian identified, or b-identified, if it is a minimal sufficient parameter.

The definition of b-identification depends on the prior distribution through its null sets only. As a matter of fact, if the original prior distribution is replaced by a different one, the Bayesian model changes. However, if both prior distributions have the same null sets, i.e., the same events of probability 0 or 1, the b-identified parameters in both models coincide; see Proposition 4.6.8 in Florens et al. (1990) for a formal proof. An important consequence is that unidentified parameters remain unidentified even if proper and concentrated priors are considered. Unidentified parameters can become identified if and only if the prior null sets are changed, which is equivalent to introducing dogmatic constraints.

From an operational point of view, it can be shown that the minimal sufficient parameter is generated by the family of all the sampling expectations E[h(Y) | ϑ], where h ∈ L^1(Ω, Y, Π), with L^1(Ω, Y, Π) being the set of integrable functions defined on the sample space Ω and measurable w.r.t. the σ-field Y generated by Y. Let σ{E[h(Y) | ϑ] : h ∈ L^1(Ω, Y, Π)} be the σ-field generated by the sampling expectations. By definition of conditional expectation, this σ-field is contained in the σ-field generated by ϑ. Therefore, to show that ϑ is b-identified by Y, it suffices to show that ϑ is measurable w.r.t. σ{E[h(Y) | ϑ] : h ∈ L^1(Ω, Y, Π)}, or equivalently that σ{ϑ} ⊂ σ{E[h(Y) | ϑ] : h ∈ L^1(Ω, Y, Π)} (see Chapter 4 in Florens et al. (1990)).

In order to illustrate this point, let us consider the following simple example. Suppose that (X | µ) ∼ N(µ, 1), where the prior distribution of µ is absolutely continuous w.r.t. the Lebesgue measure on R. It follows that E(X | µ) is measurable w.r.t. σ{E[h(X) | µ] : h ∈ L^1(R, B, Π)}, where B is the Borel σ-field. Consequently, µ = E(X | µ) is measurable w.r.t. σ{E[h(X) | µ] : h ∈ L^1(R, B, Π)} and, therefore, it is b-identified by X.
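This example can also be mimicked by simulation: the sampling expectation E(X | µ) equals µ itself, so the parameter is recovered as a (trivially measurable) function of a sampling expectation. A sketch, with arbitrary Monte Carlo sizes:

    import numpy as np

    rng = np.random.default_rng(2)

    def sampling_expectation(mu, n=400_000):
        """Monte Carlo approximation of E[X | mu] for (X | mu) ~ N(mu, 1)."""
        return rng.normal(loc=mu, scale=1.0, size=n).mean()

    # E(X | mu) reproduces mu, so mu is measurable w.r.t. the sigma-field
    # generated by the sampling expectations: it is b-identified by X.
    for mu in (-1.5, 0.0, 2.3):
        print(mu, round(sampling_expectation(mu), 3))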
C Proof of Theorem 4.3

The identification analysis of the parameters of interest (β_{1:∞}, c_{1:∞}, G) should be done in the asymptotic Bayesian model defined on (Y_i, β_{1:∞}, c_{1:∞}, G), where Y_i ∈ {0, 1}^ℕ corresponds to the response pattern of person i. The corresponding minimal sufficient parameter is given by the following σ-field:

A* := σ{E(f | β_{1:∞}, c_{1:∞}, G) : f ∈ [σ(Y_1)]^+},

where [σ(Y_1)]^+ denotes the set of positive functions f such that f = g(Y_1), with g a measurable function. The identification of the semi-parametric 1PL-G model amounts to proving, under identification restrictions if necessary, that (β_{1:∞}, c_{1:∞}, G) is measurable w.r.t. A*. By the Dynkin-Doob Lemma, this is equivalent to proving that, under identification restrictions if necessary, σ(β_{1:∞}, c_{1:∞}, G) = A*. This equality relies on the following steps:

STEP 1: By the same arguments used to establish identity (4.4), it follows that

σ(β_{2:∞}, δ_{2:∞}) := σ(β_j : j ≥ 2) ∨ σ(δ_j : j ≥ 2) ⊂ A*,

where δ_j := 1 − c_j.

STEP 2: Hypotheses H1, H2, H3 and H4 jointly imply that {(u_j, v_j) : 2 ≤ j < ∞} are iid conditionally on (β_1, δ_1, K, H). By the Strong Law of Large Numbers, it follows that

W̄_{β_1,δ_1}(B) := P[(u_2, v_2) ∈ B | β_1, δ_1, K, H] = limsup_{m→∞} (1/m) Σ_{1≤j≤m} 1{(u_j, v_j) ∈ B} a.s.

for B ∈ B^+ × B. But Propositions 3.2 and 3.3 ensure that {u_j : 2 ≤ j < ∞} and {v_j : 2 ≤ j < ∞} are identified parameters. It follows that {(u_j, v_j) : 2 ≤ j < ∞} is measurable w.r.t. A* and, consequently, W̄_{β_1,δ_1}(B) is measurable w.r.t. Ā* for all B ∈ B^+ × B. The upper bar denotes a σ-field completed with measurable sets; for a definition, see Chapter 2 in Florens et al. (1990).

STEP 3: Using (4.6), it follows that

E(Y_{ij} | β_{1:∞}, c_{1:∞}, θ_{1:∞}) = 1 − δ_j e^{β_j} / (e^{β_j} + e^{θ_i}) = 1 − u_j / (v_j + 1/δ_1 + e^{θ_i}/(δ_1 e^{β_1}))

and, with Ȳ_{iJ} := J^{-1} Σ_{1≤j≤J} Y_{ij},

p̄_{iJ} := E(Ȳ_{iJ} | β_{1:∞}, c_{1:∞}, θ_{1:∞}) = 1 − (1/J) Σ_{1≤j≤J} u_j / (v_j + 1/δ_1 + e^{θ_i}/(δ_1 e^{β_1})).

STEP 3.A: By a large-deviations argument (see Chapter IV, Section 5 in Shiryaev (1995)), it follows that Ȳ_{iJ} − p̄_{iJ} → 0 a.s. conditionally on (β_1, δ_1, θ_i) as J → ∞. But, as J → ∞,

p̄_{iJ} → 1 − E[u_j / (v_j + 1/δ_1 + e^{θ_i}/(δ_1 e^{β_1})) | β_1, δ_1, θ_i]
      = ∫_{R^+ × R} {1 − u / (v + 1/δ_1 + e^{θ_i}/(δ_1 e^{β_1}))} W̄_{β_1,δ_1}(du, dv)
      =: p(β_1, δ_1, θ_i)

a.s. conditionally on (β_1, δ_1, θ_i). Therefore, Ȳ_{iJ} → p(β_1, δ_1, θ_i) a.s. conditionally on (β_1, δ_1, θ_i) as J → ∞.

STEP 3.B: It follows that, for all g ∈ C_b([0, 1]),

g(Ȳ_{iJ}) → g(p(β_1, δ_1, θ_i)) a.s. and in L^1 conditionally on (β_1, δ_1, θ_i) as J → ∞.

Then, for all g ∈ C_b([0, 1]),

E[g(Ȳ_{iJ}) | β_{1:∞}, c_{1:∞}, G] → E[g(p(β_1, δ_1, θ_i)) | β_{1:∞}, c_{1:∞}, G] = ∫_R g(p(β_1, δ_1, θ)) G(dθ) a.s.

By definition of conditional expectation, E[g(Ȳ_{iJ}) | β_{1:∞}, c_{1:∞}, G] is measurable w.r.t. A*. Thus ∫_R g(p(β_1, δ_1, θ)) G(dθ) is measurable w.r.t. Ā*; the bar is added because such an integral is the a.s. limit of the sequence {E[g(Ȳ_{iJ}) | β_{1:∞}, c_{1:∞}, G] : J ∈ ℕ}.

STEP 3.C: Using the transformation (4.9), it follows that

∫_{R^+} g[L(x)] G_{β_1,δ_1}(dx) is Ā*-measurable, where L(x) := ∫_{R^+ × R} (1 − u/(v + x)) W̄_{β_1,δ_1}(du, dv).

The function L(·) is a continuous, strictly increasing function from (δ_1^{-1}, ∞) to (0, 1) that is known because it is measurable w.r.t. σ(W̄_{β_1,δ_1}); by STEP 2, σ(W̄_{β_1,δ_1}) ⊂ Ā*. In particular, for every function f ∈ C_b(R^+), take g(y) = f[L^{-1}(y)], where L^{-1}(α) := inf{x : L(x) ≥ α} is the generalized inverse of L. It follows that

∫_{R^+} f(x) G_{β_1,δ_1}(dx)

is measurable w.r.t. Ā*. Considering

f_n(y) = 1_{(0,x]}(y) + [1 − n(y − x)] 1_{(x, x+1/n]}(y) ↓ 1_{(0,x]}(y) for every x ∈ R^+, as n → ∞,

the monotone convergence theorem implies that, for every x ∈ R^+, G_{β_1,δ_1}((0, x]), and so G_{β_1,δ_1}, is measurable w.r.t. Ā*.

STEP 4: From STEPS 1 and 3.C, it follows that (β_{2:∞}, δ_{2:∞}, G_{β_1,δ_1}) is measurable w.r.t. Ā*. By the Dynkin-Doob Lemma, this is equivalent to σ(β_{2:∞}, δ_{2:∞}) ∨ σ(G_{β_1,δ_1}) ⊂ Ā*. However, σ(G_{β_1,δ_1}) ⊂ σ(G) ∨ σ(β_1, δ_1). Therefore, two restrictions should be introduced in order to obtain the equality σ(G_{β_1,δ_1}) = σ(G) ∨ σ(β_1, δ_1). Two possibilities can be considered:

1. The first possibility consists in fixing two q-quantiles of G. In fact, let

x_1 = inf{x : G_{β_1,δ_1}(x) > q_1}, x_2 = inf{x : G_{β_1,δ_1}(x) > q_2}.

Using (4.9), this is equivalent to

β_1 + ln(δ_1 x_1 − 1) = y_1 = inf{y : G(y) > q_1},
β_1 + ln(δ_1 x_2 − 1) = y_2 = inf{y : G(y) > q_2}.

It follows that

β_1 = ln[(x_1 e^{y_2} − x_2 e^{y_1}) / (x_2 − x_1)], δ_1 = (e^{y_2} − e^{y_1}) / (x_1 e^{y_2} − x_2 e^{y_1});

that is, β_1 and δ_1 are identified, since x_1 and x_2 depend on G_{β_1,δ_1}, which is identified, whereas y_1 and y_2 are known by assumption.

2. The second possibility consists in fixing the mean and the variance of the distribution of exp(θ), namely E_G(e^θ) = µ and V_G(e^θ) = σ^2. Using (4.9), this is equivalent to

m := E_{G_{β_1,δ_1}}(X) = 1/δ_1 + µ/(δ_1 e^{β_1}), v^2 := V_{G_{β_1,δ_1}}(X) = σ^2/(δ_1^2 e^{2β_1}).

It follows that

β_1 = ln(mσ/v − µ), δ_1^{-1} = m − µv/σ.

For instance, if µ = 0 and σ = 1, then β_1 = ln(m/v) and δ_1 = 1/m; that is, β_1 and δ_1 are identified, since m and v depend on G_{β_1,δ_1}, which is identified.
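Both recoveries of (β_1, δ_1) are easy to verify numerically. The sketch below draws θ from an illustrative normal ability distribution, applies the transformation x = (1 + e^{θ − β_1})/δ_1 (the form consistent with the quantile relation β_1 + ln(δ_1 x − 1) = y above), and recovers the "true" values from two quantiles and from the first two moments; the true values and the distribution of θ are illustrative assumptions, not quantities from the paper.

    import numpy as np

    rng = np.random.default_rng(3)
    beta1, delta1 = 0.7, 0.6                        # "true" values (illustrative)
    theta = rng.normal(0.3, 0.8, size=1_000_000)    # illustrative ability distribution G

    # Transformation consistent with beta1 + ln(delta1 * x - 1) = theta:
    # x = (1 + exp(theta - beta1)) / delta1, a draw from G_{beta1, delta1}.
    x = (1.0 + np.exp(theta - beta1)) / delta1

    # Restriction (iii.3): two known q-quantiles of G.
    q1, q2 = 0.25, 0.75
    y1, y2 = np.quantile(theta, [q1, q2])           # known by assumption
    x1, x2 = np.quantile(x, [q1, q2])               # functionals of the identified G_{beta1, delta1}
    print(np.log((x1 * np.exp(y2) - x2 * np.exp(y1)) / (x2 - x1)),          # ~ beta1
          (np.exp(y2) - np.exp(y1)) / (x1 * np.exp(y2) - x2 * np.exp(y1)))  # ~ delta1

    # Restriction (iii.2): known mean and standard deviation of exp(theta).
    mu, sig = np.exp(theta).mean(), np.exp(theta).std()
    m, v = x.mean(), x.std()
    print(np.log(m * sig / v - mu), 1.0 / (m - mu * v / sig))               # ~ (beta1, delta1)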
Acknowledgements: The work developed in this paper was presented at a Symposium on Identification Problems in Psychometrics at the International Meeting of the Psychometric Society IMPS 2009, held in Cambridge (UK), in July 2009. The first author gratefully acknowledges partial financial support from the PUENTE Grant 08/2009 of the Pontificia Universidad Católica de Chile. The authors gratefully acknowledge several discussions with Claudio Fernández (Faculty of Mathematics, Pontificia Universidad Católica de Chile) and Paul De Boeck (University of Amsterdam).

References

Adams, R. J., M. R. Wilson, and W. Wang (1997). The Multidimensional Random Coefficients Multinomial Logit Model. Applied Psychological Measurement 21, 1–23.

Adams, R. J. and M. L. Wu (2007). The Mixed-Coefficients Multinomial Logit Model: A Generalized Form of the Rasch Model. In M. von Davier and C. H. Carstensen (Eds.), Multivariate and Mixture Distribution Rasch Models, pp. 57–75. Springer.

Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord and M. R. Novick (Eds.), Statistical Theories of Mental Test Scores, pp. 395–479. Addison-Wesley.

Bock, R. D. and M. Aitkin (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika 46, 443–459.

Bock, R. D. and M. F. Zimowski (1997). Multiple Group IRT. In W. J. van der Linden and R. K. Hambleton (Eds.), Handbook of Modern Item Response Theory, pp. 433–448. Springer.

Cao, J. and L. Stokes (2008). Bayesian IRT guessing models for partial guessing behaviors. Psychometrika 73, 209–230.

De Boeck, P. and M. Wilson (2004). Explanatory Item Response Models. A Generalized Linear and Nonlinear Approach. Springer.

Embretson, S. E. and S. P. Reise (2000). Item Response Theory for Psychologists. New Jersey: Lawrence Erlbaum Associates, Publishers.

Florens, J.-P., M. Mouchart, and J.-M. Rolin (1990). Elements of Bayesian Statistics. Marcel Dekker.

Florens, J.-P. and J.-M. Rolin (1984). Asymptotic sufficiency and exact estimability. In J.-P. Florens, M. Mouchart, J.-P. Raoult, and L. Simar (Eds.), Alternative Approaches to Time Series Analysis, pp. 121–142. Publications des Facultés Universitaires Saint-Louis, Bruxelles.

Gabrielsen, A. (1978). Consistency and Identifiability. Journal of Econometrics 8, 261–263.

Gelfand, A. E. and S. K. Sahu (1999). Identifiability, improper priors, and Gibbs sampling for generalized linear models. Journal of the American Statistical Association 94, 247–253.

Ghosh, M., A. Ghosh, M.-H. Chen, and A. Agresti (2000). Noninformative priors for one-parameter item response models. Journal of Statistical Planning and Inference 88, 99–115.

Halmos, P. (1951). Introduction to Hilbert Space, and the Theory of Spectral Multiplicity. Chelsea Publishing.

Hambleton, R. K., H. Swaminathan, and H. J. Rogers (1991). Fundamentals of Item Response Theory. Sage Publications.

Hutschinson, T. P. (1991). Ability, Parameter Information, Guessing: Statistical Modelling Applied to Multiple-Choice Tests. Rumsby Scientific Publishing.
Kadane, J. (1974). The role of identification in Bayesian theory. In S. Fienberg and A. Zellner (Eds.), Studies in Bayesian Econometrics and Statistics, pp. 175–191. North Holland.

Karabatsos, G. and S. Walker (2009). Coherent psychometric modelling with Bayesian nonparametrics. British Journal of Mathematical and Statistical Psychology 62, 1–20.

Koopmans, T. C. and O. Reiersøl (1950). The Identification of Structural Characteristics. The Annals of Mathematical Statistics 21, 165–181.

Lindley, D. V. (1971). Bayesian Statistics: A Review. Society for Industrial and Applied Mathematics.

Maris, G. and T. Bechger (2009). On interpreting the model parameters for the three parameter logistic model. Measurement: Interdisciplinary Research and Perspectives 7, 75–86.

McDonald, R. P. (1999). Test Theory: A Unified Treatment. Lawrence Erlbaum.

Miyazaki, K. and T. Hoshino (2009). A Bayesian semiparametric item response model with Dirichlet process priors. Psychometrika 74, 375–393.

Picci, G. (1977). Some connections between the theory of sufficient statistics and the identifiability problem. SIAM Journal on Applied Mathematics 33, 383–398.

Poirier, D. J. (1998). Revising Beliefs in Nonidentified Models. Econometric Theory 14, 483–509.

Rizopoulos, D. (2006). ltm: An R package for Latent Variable Modelling and Item Response Theory Analyses. Journal of Statistical Software 17(5), 1–25.

Roberts, G. O. and J. Rosenthal (1998). Markov-chain Monte Carlo: Some practical implications of theoretical results. The Canadian Journal of Statistics 26(1), 5–20.

San Martín, E., G. del Pino, and P. De Boeck (2006). IRT Models for Ability-Based Guessing. Applied Psychological Measurement 30, 183–203.

San Martín, E. and J. González (2010). Bayesian identifiability: Contributions to an inconclusive debate. Chilean Journal of Statistics 1, 69–91.

San Martín, E., J. González, and F. Tuerlinckx (2009). Identified Parameters, Parameters of Interest and Their Relationships. Measurement: Interdisciplinary Research and Perspectives 7, 95–103.

San Martín, E., A. Jara, J.-M. Rolin, and M. Mouchart (2011). On the Bayesian Nonparametric Generalization of IRT-type Models. Psychometrika 76, 385–409.

San Martín, E. and F. Quintana (2002). Consistency and Identifiability Revisited. Brazilian Journal of Probability and Statistics 16, 99–106.

Shiryaev, A. N. (1995). Probability. Second Edition. Springer.

Spivak, M. (1965). Calculus on Manifolds: A Modern Approach to Classical Theorems of Advanced Calculus. Perseus Book Publishing.

Swaminathan, H. and J. A. Gifford (1986). Bayesian Estimation in the Three-Parameter Logistic Model. Psychometrika 51, 589–601.

Thissen, D. (2009). On Interpreting the Parameters for any Item Response Model. Measurement: Interdisciplinary Research and Perspectives 7, 104–108.

Thissen, D. and H. Wainer (2001). Item Response Models for Items Scored in Two Categories. Springer.

van der Linden, W. and R. K. Hambleton (1997). Handbook of Modern Item Response Theory. Springer.

Wise, S. L. and C. E. DeMars (2006). An application of item response time: the effort-moderated IRT model. Journal of Educational Measurement 43, 19–38.

Wise, S. L. and X. Kong (2005). Response time effort: a new measure of examinee motivation in computer-based tests. Applied Psychological Measurement 18, 1–17.

Woods, C. M. (2006). Ramsay-curve item response theory (RC-IRT) to detect and correct for nonnormal latent variables. Psychological Methods 11, 253–270.

Woods, C. M. (2008). Ramsay-curve item response theory for the three-parameter logistic item response model. Applied Psychological Measurement 32, 447–465.
Woods, C. M. and D. Thissen (2006). Item Response Theory with Estimation of the Latent Population Distribution using Spline-Based Densities. Psychometrika 71, 281–301.