How Should We Assess the Fit of Rasch-Type Models?
Approximating the Power of Goodness-of-Fit Statistics in Categorical Data Analysis
Alberto Maydeu-Olivares
Rosa Montano
Outline
Introduction
Rasch-Type Models for Binary Data
Rationale of Goodness-of-Fit Statistics
◦ Full Picture
◦ M2, R1 and R2
Estimating the Power
Empirical Comparison of R1, R2 and M2
Numerical Examples
Discussion and Conclusion

Introduction

Two properties of Rasch-Type models
◦ Sufficient statistics
◦ Specific objectivity

Estimation methods
◦ Specific for Rasch-Type models (CML)
◦ General procedures (MML via EM)

Goodness-of-fit testing procedures
◦ Specific to Rasch-Type models
◦ General to IRT or multivariate discrete data models
Introduction

Compare the performance of certain goodness-of-fit statistics for testing Rasch-Type models estimated by MML via EM
◦ Binary data
◦ 1PL (random effects)
R1 and R2 for the 1PL
M2 for multivariate discrete data

Rasch model and 1PL

Fixed effects
◦ The distribution of ability is not specified

Random effects
◦ Specify a standard normal distribution for ability
◦ A less restrictive definition of specific objectivity still holds
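To make the random-effects specification concrete, here is a minimal sketch of how the probability of a binary response pattern could be computed under a 1PL with standard normal ability. The intercept-plus-common-slope parameterization, the function names, and the equally spaced quadrature grid are illustrative assumptions, not details taken from the slides.

```python
import math
from itertools import product

def irf(theta, intercept, slope):
    """Logistic item response function (assumed intercept/slope form)."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * theta)))

def pattern_probability(pattern, intercepts, slope, n_points=61, bound=6.0):
    """Marginal probability of a 0/1 pattern, integrating ability over N(0, 1).

    Uses a simple equally spaced grid with normalized normal-density weights.
    """
    step = 2.0 * bound / (n_points - 1)
    nodes = [-bound + k * step for k in range(n_points)]
    weights = [math.exp(-0.5 * t * t) for t in nodes]
    total_weight = sum(weights)
    prob = 0.0
    for t, w in zip(nodes, weights):
        conditional = 1.0
        for y, a in zip(pattern, intercepts):
            p = irf(t, a, slope)
            conditional *= p if y == 1 else 1.0 - p
        prob += conditional * w / total_weight
    return prob

# Hypothetical 3-item example: all 2^3 pattern probabilities.
intercepts = [0.0, 0.0, 0.0]
probs = {pat: pattern_probability(pat, intercepts, slope=1.0)
         for pat in product((0, 1), repeat=3)}
for pat, p in probs.items():
    print(pat, round(p, 4))
```

The common slope is what makes the item response functions parallel; the pattern probabilities produced this way are the model-implied cell probabilities of the contingency table discussed next.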
Rationale
1. High-dimensional contingency table
     (000) (100) (010) (001) (110) (101) (011) (111)
1      0     1     0     0     0     0     0     0
2      0     0     0     0     0     1     0     0
3      0     0     0     0     1     0     0     0
Rule of thumb: the marginal total for each cell should be > 5
There are C = 2^n cells, where n is the number of items.
For example, for a 20-item test,
C = 2^20 = 1,048,576 cells
To satisfy the > 5 rule of thumb, a sample size of at least 1,048,576 × 5 = 5,242,880 is needed.
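The arithmetic above can be sketched for any test length. The function name and the default of 5 observations per cell are illustrative choices that mirror the rule of thumb on the slide.

```python
def cells_and_min_sample(n_items, min_per_cell=5):
    """Number of response patterns for n binary items, and the sample size
    needed if every cell had to hold at least `min_per_cell` observations."""
    cells = 2 ** n_items
    return cells, cells * min_per_cell

for n in (3, 10, 20):
    cells, n_needed = cells_and_min_sample(n)
    print(f"{n} items: {cells} cells, sample size of at least {n_needed}")
```

The count doubles with every added item, which is why full-table statistics become impractical for tests of realistic length.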
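2. Observed proportion versus probability under the model
[Table: the response-pattern table from step 1 with a "Marginal Total" row added. The cell totals shown are 10, 32, 15, 8, 12, 19, 17 and 21 (134 in all). For the cell with a marginal total of 10, the observed proportion is 0.07 and the probability under the model is 0.11.]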
3. Limited information approach (M2)
Pooling cells of the contingency table




When the order is r = 2, Mr becomes M2
M2 uses the univariate and bivariate information
The degrees of freedom are [formula missing]
It is the statistic of choice for testing IRT models
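As an illustration of what "univariate and bivariate information" means, the sketch below pools a small binary data set into first- and second-order sample proportions. The function names and the toy data are hypothetical. The degrees-of-freedom helper assumes the usual result for binary items from the limited-information literature, df = n(n+1)/2 - q with q estimated parameters; that formula is not stated in this transcript and should be checked against the original paper.

```python
from itertools import combinations

def low_order_moments(data):
    """Univariate and bivariate sample proportions of 'correct' responses."""
    n_obs = len(data)
    n_items = len(data[0])
    univariate = [sum(row[i] for row in data) / n_obs for i in range(n_items)]
    bivariate = {
        (i, j): sum(row[i] * row[j] for row in data) / n_obs
        for i, j in combinations(range(n_items), 2)
    }
    return univariate, bivariate

def m2_df(n_items, n_params):
    """Assumed df for M2 with binary items: n(n+1)/2 moments minus q parameters."""
    return n_items * (n_items + 1) // 2 - n_params

# Hypothetical responses of four examinees to three items.
data = [(1, 0, 0), (1, 0, 1), (1, 1, 0), (0, 0, 0)]
uni, biv = low_order_moments(data)
print(uni)
print(biv)
# Example: 20 items and a 1PL with 20 intercepts plus one common slope (assumed q = 21).
print(m2_df(20, 21))
```

For 20 items this uses 210 pooled moments instead of 1,048,576 cells, which is the practical appeal of the limited-information approach.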
3. Limited information approach (R1 and R2)




R1
◦ Degrees of freedom: n(n-2)
◦ Specific to the assumption of monotone increasing and parallel item response functions
R2
◦ Degrees of freedom: (n(n-2)+2)/2
◦ Specific to the unidimensionality assumption
Estimating the Asymptotic Power Rate

Under a sequence of local alternatives
◦ The noncentrality parameter of the chi-square distribution can be calculated, given the df, for M2, R1 and R2

The Kullback-Leibler discrepancy function can be used
◦ The parameter values that minimize DKL between the “true” model and the null model are the same as those that maximize the likelihood of the null model
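Once the degrees of freedom and the noncentrality parameter are available, the asymptotic power is the probability that a noncentral chi-square variable exceeds the central chi-square critical value. The following is a minimal standard-library sketch of that calculation; the function names and the example values of df and noncentrality are illustrative assumptions, and a statistics library would normally be used instead of these hand-written series.

```python
import math

def chi2_cdf(x, df):
    """Central chi-square CDF via the series for the regularized
    lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term, total, k = 1.0, 1.0, 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))

def critical_value(df, alpha=0.05):
    """Upper-alpha critical value of the central chi-square, by bisection."""
    lo, hi = 0.0, df + 50.0 * math.sqrt(df) + 50.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def power(df, ncp, alpha=0.05):
    """P(noncentral chi-square(df, ncp) > critical value).

    The noncentral CDF is a Poisson(ncp/2) mixture of central chi-squares
    with df + 2j degrees of freedom. Intended for moderate ncp only.
    """
    c = critical_value(df, alpha)
    cdf, weight, j = 0.0, math.exp(-ncp / 2.0), 0
    while True:
        cdf += weight * chi2_cdf(c, df + 2 * j)
        j += 1
        weight *= (ncp / 2.0) / j
        if j > ncp / 2.0 and weight < 1e-14:
            break
    return 1.0 - cdf

# Illustrative values only: the same noncentrality spread over different df.
for df in (1, 10, 50):
    print(df, round(power(df, ncp=7.849), 3))
```

For a fixed noncentrality parameter, power falls as the degrees of freedom grow, which is one reason statistics with different df can rank differently in power.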
Study 1: Accuracy of p-values under correct model



For a chi-square distributed statistic, df = mean and df = ½ variance
In another study, Montano (2009) found that M2 performed better than R1 and that the discrepancies between the empirical and asymptotic rejection rates were not large.
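The mean and half-variance check can be sketched as follows: if a statistic really follows its reference chi-square distribution, both its empirical mean and half its empirical variance should be close to the df. The draws below are generated directly from a chi-square distribution as a stand-in for simulated values of a fit statistic; they are not results from the studies, and the function name is hypothetical.

```python
import random
import statistics

def mean_and_half_variance(values):
    """Two moment-based estimates of df for a chi-square distributed statistic."""
    return statistics.fmean(values), statistics.variance(values) / 2.0

rng = random.Random(2024)
df = 10
# A chi-square(df) variable is a gamma variable with shape df/2 and scale 2.
draws = [rng.gammavariate(df / 2.0, 2.0) for _ in range(50_000)]

mean_est, half_var_est = mean_and_half_variance(draws)
print(f"nominal df = {df}, mean = {mean_est:.2f}, half variance = {half_var_est:.2f}")
```

A large gap between either estimate and the nominal df would suggest that asymptotic p-values are not yet accurate at that sample size.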
Group the sum scores
The degrees of freedom are adjusted accordingly
An iterative procedure
When appropriate score ranges are used, the empirical rejection rates of R1 should closely match the theoretical rejection rates.
The same should be done for R2
Study 2: Asymptotic Power to reject a 2PL
Study 3: Empirical Power to reject a 2PL
Study 4: Asymptotic Power to reject a 3PL
Study 5: Asymptotic Power to reject a multidimensional model
Empirical Example 1: LSAT 7 Data

Agreement in ordering between the value/df ratio and power
Empirical Example 2: Chilean Mathematical Proficiency Data
Discussion and Conclusions
Generally, M2 is more powerful than R1 and R2.
That is, R1 and R2, which were developed specifically for Rasch-type models, are not superior to the general-purpose M2.
