Relationships between Alpha Coefficient and Scale Stability Index

James B. Olsen, Alpine Media Corporation
July 8, 2004

Purpose

This paper describes an empirical investigation of the relationship between two reliability coefficients, each an index of the relationship among a set of item or task scores. These item and task scores may come from standard multiple-choice exams, performance-like tests (scenario or simulation based), and performance tests (intentionally designed to evaluate candidate proficiency levels on authentic, integrated knowledge and skill clusters). The reliability coefficients selected for this comparison are the Cronbach alpha coefficient and the Judd Scale Stability Index (SSI).

Introduction

The Cronbach alpha coefficient is an interclass correlation that measures the average degree of relationship among item or task scores drawn from all possible split halves of the test scores. The test scores could be divided into two comparable groups based on the order of test completion. Alternatively, the test scores could be divided into two groups based on odd and even number indices from the examinee’s identification code. The test scores could also be divided into two groups based on quartile scores, where the first group includes scores in the first and fourth quartiles and the second group includes scores from the second and third quartiles. There are many other possible ways of dividing the scores into two comparable groups. Scores from one group (one split half) are correlated with the scores from the other group (the second split half). The alpha coefficient of reliability is the average of these correlations over all possible split halves of the test scores. Coefficient alpha is also closely related to other interclass correlations such as the KR-20 and KR-21 coefficients.

The Judd Scale Stability Index (SSI) is a coefficient that measures the degree to which a set of item or task scores forms a consistent ordered scale.
“The individual Scale Stability Index (SSI) measures how well a set of responses conforms to a Guttman scale. Individual scores range from 1 (perfect conformity to a Guttman scale) to 0 (responses exactly counter to a Guttman scale).” (Judd, 2004) Items or tasks are ordered by item difficulty (proportion correct across an appropriate sample). Each examinee’s actual response pattern is compared with the Guttman scale pattern that has the same number of correct answer choices. (Note the similarity of this idea to the Rasch model ability measure, where there is a direct one-to-one relationship between the number-correct score and the proficiency or ability metric.) SSI creates a hypothetical edge condition where the expected response pattern, in scale order, shifts from a string of correct answer choices to a string of incorrect choices. For any test items or tasks that do not conform to the standard Guttman scale ordering (the number of correct responses placed in item difficulty order), SSI computes the distance discrepancy per item from this hypothetical edge.

Research Question

Is there a relationship between the Cronbach alpha coefficient and the Judd Scale Stability Index?

Research Procedure

Random samples of item responses to a ten-item test with thirteen examinees were generated by the scale stability index program, and the scale stability index for each sample was recorded. Each of these random samples of examinee responses was also analyzed using the alpha reliability procedure in SPSS. The resulting alpha reliabilities and scale stability indices were then correlated using a Pearson bivariate correlation and two nonparametric correlations, Kendall tau_b and Spearman rho.

Sample

One hundred and seven data sets (10 items x 13 examinees) were analyzed with the scale stability index and alpha reliability procedures. Appendix A includes the data used for this analysis.
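
The two quantities compared in this study can be illustrated on a small examinee-by-item matrix of 0/1 scores. The sketch below is only an illustration under stated assumptions: `cronbach_alpha` uses the common variance-based form of alpha rather than enumerating split halves, and `guttman_mismatches` (a hypothetical helper, not part of the SSI program) simply counts the positions where a response pattern differs from the Guttman pattern with the same number correct. It is not Judd's SSI, whose exact distance computation is not reproduced here.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Alpha from an examinee-by-item score matrix (list of equal-length rows)."""
    k = len(responses[0])  # number of items
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    if k < 2 or total_var == 0:
        raise ValueError("alpha needs at least two items and varying total scores")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def guttman_mismatches(responses):
    """Per examinee: number of positions that differ from the ideal Guttman pattern."""
    n, k = len(responses), len(responses[0])
    p_correct = [sum(row[j] for row in responses) / n for j in range(k)]
    # Easiest item first; tied items keep their original order (an arbitrary choice).
    order = sorted(range(k), key=lambda j: -p_correct[j])
    counts = []
    for row in responses:
        score = sum(row)
        ideal = [1] * score + [0] * (k - score)  # correct on the `score` easiest items
        actual = [row[j] for j in order]
        counts.append(sum(a != b for a, b in zip(actual, ideal)))
    return counts


# Illustrative data only (4 examinees x 3 items), not taken from the study.
perfect = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
mixed = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 1]]

print(cronbach_alpha(perfect), guttman_mismatches(perfect))  # 0.75 [0, 0, 0, 0]
print(cronbach_alpha(mixed), guttman_mismatches(mixed))      # 0.0 [0, 0, 0, 2]
```

In this toy example a single examinee who misses the easiest item but answers a harder one correctly both breaks the Guttman ordering and lowers alpha, which is the kind of shared sensitivity the study examines.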

Results

The SSI scores and the alpha reliability values were correlated 0.900 (significance level < .01, two-tailed test) using the Pearson bivariate parametric correlation, 0.770 (significance level < .01, two-tailed test) using the Kendall tau_b nonparametric correlation, and 0.924 (significance level < .01, two-tailed test) using the Spearman rho nonparametric correlation.

Descriptive statistics for the analysis are provided in Table 1. These statistics are very comparable for the range, minimum, maximum, mean score, standard error of the mean, standard deviation, and variance. Differences were found in the skewness and kurtosis statistics: the alpha reliability showed a larger negative skewness (-.645 versus -.098), and the SSI index showed a larger negative kurtosis (-1.492 versus -.190).

Frequency distributions were computed for the data and are presented in Table 2. These frequency distribution statistics also include multiple measures of central tendency (mean, median, and mode). The measures of central tendency and the percentile values at quartiles and deciles are also very similar.

Table 1 Descriptive Statistics

Statistic                  SSISCORE    ALPHA
N                          107         107
Range                      .37         .487
Minimum                    .55         .405
Maximum                    .92         .892
Mean                       .7619       .74205
Std. Error of Mean         .01083      .010651
Std. Deviation             .11203      .110173
Variance                   .013        .012
Skewness                   -.098       -.645
Std. Error of Skewness     .234        .234
Kurtosis                   -1.492      -.190
Std. Error of Kurtosis     .463        .463

Table 2 Frequency Distribution Analysis

Statistic                  SSISCORE    ALPHA
N Valid                    107         107
N Missing                  0           0
Mean                       .7619       .74205
Median                     .7500       .74700
Mode                       .90         .841(a)
Percentile 10              .6180       .57300
Percentile 20              .6500       .66100
Percentile 25              .6600       .67500
Percentile 30              .6640       .68400
Percentile 40              .7000       .71380
Percentile 50              .7500       .74700
Percentile 60              .8200       .79580
Percentile 70              .8600       .83620
Percentile 75              .8700       .84500
Percentile 80              .8800       .84940
Percentile 90              .9000       .86900

a. Multiple modes exist. The smallest value is shown.

Figure 1 presents a scatter plot of the SSI and alpha reliability comparisons.
This scatter plot shows strong comparability between the SSI and alpha reliability when either estimate is above 0.80, where the estimates lie close to the expected regression diagonal. When the estimates are below 0.80 there is considerably greater scatter, or deviation, from the expected regression diagonal line.

Figure 1 Scatterplot of SSI Scores (Y axis) and Alpha Reliabilities (X axis)

Discussion

This research investigation has shown that there is a consistent and high degree of relationship or correlation between the reliability indices produced by the scale stability index and the alpha reliability procedures. The two indices rest on substantively different assumptions and computational procedures; however, when applied to common data sets of items and examinees, they show striking similarity to each other when analyzed using parametric correlation procedures, nonparametric correlation procedures, descriptive statistics, and frequency distribution comparisons. This analysis shows the comparability as well as the differences between the SSI and alpha reliability values, as illustrated in the scatter plot of values. SSI and alpha reliability appear either to be measuring some things in common or to share something in common with some other construct or variable related to each measure. At the same time, SSI and alpha reliability are different measures of the consistency, stability, or generalizability of scores: they use different assumptions and computational procedures, and they diverge considerably when the reliability estimate from either index is considerably lower than 0.80. The comparability of the Pearson bivariate correlation (0.900) and the Spearman rho nonparametric correlation (0.924) suggests considerable consistency between the SSI and the alpha reliability estimates.
The considerably lower Kendall tau_b coefficient (0.770) and the scatterplot show that SSI and alpha reliability are measuring somewhat different constructs or variables. We need more insight into these similarities and differences. Hopefully, this investigation has provided some information for that further inquiry.

Appendix A

SSI     Alpha
.73     .743
.57     .487
.65     .705
.68     .682
.63     .555
.65     .676
.63     .674
.65     .661
.82     .814
.64     .616
.79     .756
.74     .684
.81     .772
.70     .678
.72     .667
.70     .661
.64     .708
.61     .549
.64     .664
.90     .841
.66     .725
.91     .863
.63     .574
.91     .872
.56     .590
.60     .682
.90     .886
.55     .576
.91     .865
.65     .620
.61     .569
.62     .656
.66     .628
.92     .887
.90     .869
.59     .567
.75     .686
.63     .549
.63     .713
.66     .717
.67     .553
.92     .849
.90     .849
.92     .862
.89     .884
.89     .868
.69     .710
.88     .849
.88     .847
.92     .878
.68     .684
.61     .477
.88     .841
.87     .829
.65     .611
.69     .675
.72     .738
.87     .845
.89     .872
.80     .792
.88     .841
.65     .524
.64     .405
.69     .610
.88     .879
.89     .864
.69     .687
.75     .727
.88     .874
.87     .869
.87     .865
.87     .843
.86     .808
.85     .798
.87     .853
.84     .850
.79     .741
.83     .810
.81     .772
.90     .865
.66     .733
.66     .688
.90     .845
.91     .892
.90     .854
.60     .737
.68     .708
.66     .742
.60     .671
.67     .647
.73     .747
.86     .843
.85     .812
.74     .736
.74     .710
.82     .749
.82     .767
.67     .690
.73     .754
.84     .825
.85     .798
.80     .796
.86     .828
.80     .749
.83     .795
.84     .829
.82     .769
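
The three correlations reported in the Results can, in principle, be recomputed from the Appendix A values. The following is a minimal sketch in plain Python, not the SPSS procedure used in the study. It assumes that the SSI and alpha values are paired by position within each block of the appendix as printed; the helper names (`pearson`, `average_ranks`, `spearman`, `kendall_tau_b`) are illustrative. The paper reports 0.900 (Pearson), 0.924 (Spearman rho), and 0.770 (Kendall tau_b); the sketch is not guaranteed to reproduce those figures to the third decimal.

```python
from math import sqrt

# Appendix A values, paired by position (an assumption about the printed layout).
SSI = [
    .73, .57, .65, .68, .63, .65, .63, .65, .82, .64, .79, .74, .81, .70,
    .72, .70, .64, .61, .64, .90, .66, .91, .63, .91, .56, .60, .90, .55,
    .91, .65, .61, .62, .66, .92, .90, .59, .75, .63, .63, .66, .67, .92,
    .90, .92, .89, .89, .69, .88, .88, .92, .68, .61, .88, .87, .65, .69,
    .72, .87, .89, .80, .88, .65, .64, .69, .88, .89, .69, .75, .88, .87,
    .87, .87, .86, .85,
    .87, .84, .79, .83, .81, .90, .66, .66, .90, .91, .90, .60, .68, .66,
    .60, .67, .73, .86, .85, .74, .74, .82, .82, .67, .73, .84, .85, .80,
    .86, .80, .83, .84, .82,
]
ALPHA = [
    .743, .487, .705, .682, .555, .676, .674, .661, .814, .616, .756, .684, .772, .678,
    .667, .661, .708, .549, .664, .841, .725, .863, .574, .872, .590, .682, .886, .576,
    .865, .620, .569, .656, .628, .887, .869, .567, .686, .549, .713, .717, .553, .849,
    .849, .862, .884, .868, .710, .849, .847, .878, .684, .477, .841, .829, .611, .675,
    .738, .845, .872, .792, .841, .524, .405, .610, .879, .864, .687, .727, .874, .869,
    .865, .843, .808, .798,
    .853, .850, .741, .810, .772, .865, .733, .688, .845, .892, .854, .737, .708, .742,
    .671, .647, .747, .843, .812, .736, .710, .749, .767, .690, .754, .825, .798, .796,
    .828, .749, .795, .829, .769,
]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall tau-b, which adjusts the denominator for ties in x and in y."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from both factors
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    pairs = concordant + discordant
    return (concordant - discordant) / sqrt((pairs + only_x_tied) * (pairs + only_y_tied))


print("Pearson r     :", round(pearson(SSI, ALPHA), 3))
print("Spearman rho  :", round(spearman(SSI, ALPHA), 3))
print("Kendall tau_b :", round(kendall_tau_b(SSI, ALPHA), 3))
```

The tau_b variant is used because SSI is recorded to two decimals and therefore contains many tied values; the simple pairwise loop is adequate for 107 observations.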