NIU Testing Services Report:
Standard Test Analyses
Test Item Analysis Printout
Testing Services’ Test Item Analysis provides a variety of information about each item on your exam. This
information can be used to improve your exams and assess teaching effectiveness. Following is an example
of a Test Item Analysis followed by descriptions of its contents.
• Item No. – Refers to the exam item numbers in sequential order.
• Disc. Index – The Discrimination Index is a measure of how well the item discriminates among the various ability levels within the group of examinees. In many classroom exams the discrimination indices will generally range from +.50 to -.30, although sometimes the index may be +.60 or higher or -.50 or lower. High values indicate that the item is discriminating appropriately between highly competent and less competent examinees. A high discrimination index indicates that the competent examinees answered the item correctly AND that less competent examinees answered the item incorrectly – a desirable characteristic of an exam item. A low discrimination index generally indicates that some competent examinees answered incorrectly and less competent examinees answered correctly – an undesirable characteristic of an exam item. Literature in the field of educational measurement presents more than one algorithm for computing an item’s discrimination index. The index used by NIU’s Testing Services is the point-biserial correlation coefficient between the right/wrong scores on the item across examinees and the total exam scores of the examinees. Like other correlations, the range of possible values for the point-biserial is +1.0 to -1.0. In practice, however, discrimination indices close to +1.0 or -1.0 are extremely rare. Generally, items exhibiting a discrimination index less than +.15 may be good candidates for modification and improvement. Occasionally the discrimination index for some items equals 0.00. This is usually because all examinees answered the item correctly; the ITEM AVG will be 1.00. Such an item is not discriminating among ability levels within the examinee group. (A computational sketch of the Item Diff. and Disc. Index values follows this list.)
• Item Diff. – The Item Difficulty is the number of examinees answering the item correctly divided by the number of examinees (i.e., the proportion of examinees who chose the correct answer). For example, if 7 examinees out of 10 answer an item correctly, the item difficulty is 7/10, or .70.
• Item Weight – Refers to the number of points or weight assigned to each exam item.
• No. Wrong – The number of examinees who answered each exam item incorrectly.
• N % – Number and Percent of examinees, respectively, choosing each response option for an item. The correct response option is indicated by an asterisk "*".
• Omit – This column shows the number and percent of examinees who did not respond to each exam item. Omits near the end of an exam may indicate that insufficient time was allowed.
• N – Number of examinees taking the exam.
• Mean – The arithmetic average of the exam scores. Normally, the mean should fall approximately halfway between the highest possible exam score and the expected chance score.
• Std Dev – The standard deviation is a measure of the dispersion of exam scores around the mean exam score. Normally the SD value would be expected to be around 1/6 of the range between the highest possible exam score and the expected chance score.
• KR20 and KR21 – KR20 and KR21 are estimates of the reliability of an exam. Exam reliability is traditionally measured by computing the correlation between the exam scores from two equivalent exam administrations with the same group of examinees; ideally, the examinees would achieve very similar exam scores on the two administrations. Because reliability is expressed as a correlation, the range of possible reliability estimates is +1.0 to -1.0, with high, positive values indicating reliable exams.
There are numerous ways of assessing the reliability of an exam, including test-retest estimates and parallel/alternate-form estimates. KR20 and KR21 compute a measure of exam reliability from only one exam administration by estimating the strength of the correlation between theoretical halves of the same exam.
Carefully developed standardized achievement exams usually have reliability estimates around .90. The typical classroom exam of 50-75 items might have reliability values averaging in the low .70s. Exam reliability can be improved by improving the discrimination of the exam items or by adding more highly discriminating items to the exam. (A sketch of the KR20, KR21, and SE Measure computations also follows this list.)
• SE Measure – The Standard Error of Measurement is an estimate of the possible amount of error in a set of exam scores. Theoretically, each actual (obtained) exam score is only an estimate of the student's true score or ability. The true score cannot be determined directly as there are a variety of sources of error involved in estimating an examinee’s ability. For this reason, the Standard Error of Measurement is computed to provide a range in which an examinee’s true score is likely to fall. For example, if an examinee's obtained score is 60 and the SE MEAS is 3, the true score is likely to be near 60 but could be as low as 51 or as high as 69 (60 ± 3 Standard Errors of Measurement).
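The Item Diff. and Disc. Index values can be reproduced from a simple right/wrong scoring matrix. The Python sketch below is an illustration only, not Testing Services’ actual scoring program: it computes each item’s proportion correct and the point-biserial correlation between the item’s 0/1 scores and the total exam scores. The array name scores and the function name item_statistics are invented for this example.

import numpy as np

def item_statistics(scores):
    """scores: 2-D array of 0/1 item scores, one row per examinee, one column per item."""
    totals = scores.sum(axis=1)           # total exam score for each examinee
    difficulty = scores.mean(axis=0)      # Item Diff.: proportion answering each item correctly
    disc = np.zeros(scores.shape[1])
    for j in range(scores.shape[1]):
        item = scores[:, j]
        # Disc. Index: point-biserial (Pearson) correlation between the 0/1 item
        # scores and the total exam scores. If every examinee answered the item
        # the same way, the correlation is undefined and the index stays 0.00,
        # as described above.
        if item.std() > 0 and totals.std() > 0:
            disc[j] = np.corrcoef(item, totals)[0, 1]
    return difficulty, disc

# Example: 5 examinees, 3 items
scores = np.array([[1, 1, 0],
                   [1, 0, 0],
                   [1, 1, 1],
                   [0, 0, 0],
                   [1, 1, 1]])
diff, disc = item_statistics(scores)
print("Item Diff.: ", np.round(diff, 2))   # first item: 4 of 5 correct = .80
print("Disc. Index:", np.round(disc, 2))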
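The KR20, KR21, and SE Measure values follow from the same scoring matrix. The sketch below applies the standard Kuder-Richardson formulas and the usual relationship SEM = SD x sqrt(1 - reliability); it is a reconstruction under those assumptions, not NIU’s implementation, and the function name kr20_kr21_sem is invented.

import numpy as np

def kr20_kr21_sem(scores):
    """scores: 2-D array of 0/1 item scores, one row per examinee, one column per item."""
    k = scores.shape[1]                    # number of items
    totals = scores.sum(axis=1)            # total exam scores
    var_total = totals.var()               # variance of the total scores
    p = scores.mean(axis=0)                # item difficulties
    q = 1 - p
    # Kuder-Richardson formula 20 and its simplification, formula 21
    kr20 = (k / (k - 1)) * (1 - (p * q).sum() / var_total)
    m = totals.mean()
    kr21 = (k / (k - 1)) * (1 - m * (k - m) / (k * var_total))
    # Standard Error of Measurement: SD of the exam scores times sqrt(1 - reliability)
    sem = totals.std() * np.sqrt(1 - kr20)
    return kr20, kr21, sem

scores = np.array([[1, 1, 0],
                   [1, 0, 0],
                   [1, 1, 1],
                   [0, 0, 0],
                   [1, 1, 1]])
kr20, kr21, sem = kr20_kr21_sem(scores)
print(round(kr20, 2), round(kr21, 2), round(sem, 2))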
SAMPLE SCORE DISTRIBUTION REPORT
• SCORE: Distribution of weighted test scores; each score is the sum of the weights assigned to the chosen answers
• PERCENT: Conversion of the weighted score to a percentage score (Note: If the maximum score is 100 and equal weights are used, this score is the percent correct)
• FREQUENCY: Indicates the number of individuals obtaining each score (Note: The sum of the FREQUENCY column equals the number of individuals taking the test)
• PERCENTILE: Indicates the percentile level of each score (i.e., the % of individuals scoring equal to or less than each particular score); a computational sketch follows this list
• FREQUENCY GRAPH: The lines of asterisks (" * ") to the right of the percentile distribution are a graphic representation of the frequency distribution (Note: If space permits, the number of asterisks equals the number of students achieving each score)
• MEAN: Mean of the weighted scores
• STANDARD DEVIATION: Standard deviation of the weighted scores
• NO. of EXAMS: Number of examinees
• The cumulative score distribution report uses cumulative scores, rather than current test scores
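As an illustration of how the FREQUENCY and PERCENTILE columns relate to the raw weighted scores, the sketch below tallies each distinct score and reports the percent of examinees scoring equal to or less than it. This is an assumed reconstruction for clarity, not the report generator itself; the function name score_distribution is hypothetical.

from collections import Counter

def score_distribution(scores):
    """scores: list of weighted test scores, one per examinee."""
    n = len(scores)
    freq = Counter(scores)                        # FREQUENCY for each distinct score
    rows = []
    for score in sorted(freq, reverse=True):
        at_or_below = sum(1 for s in scores if s <= score)
        percentile = 100 * at_or_below / n        # PERCENTILE: % scoring equal to or less
        rows.append((score, freq[score], round(percentile, 1)))
    return rows

# Example: seven examinees
print(score_distribution([45, 45, 45, 42, 42, 37, 30]))
# [(45, 3, 100.0), (42, 2, 57.1), (37, 1, 28.6), (30, 1, 14.3)]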
SAMPLE STUDENT ROSTER REPORT
• STUDENT NAME: Last name of each individual followed by his/her first and middle initials
• STUDENT ID: ID number (usually the Z-ID, but it could be any number unique to each individual)
• SEC. ID: Course section number
• GRADE (optional): This column indicates the letter grade assigned to each individual for the current test
• PERCENTILE: Percentile level of the individual’s total score (i.e., the percentage of the class scoring equal to or less than that individual’s score) (Note: This is for the cumulative score, not for the current test)
• TOTAL: Sum of the current test plus previous tests (i.e., the total score)
• 1…2…3…: Individual test scores