PROC ITEM - AN ITEM ANALYSIS PROCEDURE

Cynthia Deitz
Texas Instruments, Incorporated
Dallas, Texas

David A. Smith
Department of Statistics & Computer Science
West Virginia University
Morgantown, West Virginia

ABSTRACT

The procedure ITEM analyzes item data from multiple-response tests which have a single correct response for each item. A KEY statement, which indicates the correct response, is used to compute a score for each individual. The output consists of four parts:

1) Total test statistics, which include the mean, median, standard deviation, homogeneity reliability (KR-21), standard error of measurement, possible low and high score, obtained low and high score, and number of scores with counts of valid, blank, and invalid scores.

2) A test frequency distribution, which indicates the raw score, standard score, percentile rank, percentage of people in the total group who received the given score, frequency, cumulative frequency, and a histogram portraying the frequency of the scores at each score value.

3) An item quintile table, which compares the item response versus the total score distribution for each item. The percentage of responses to each alternative by fifths of the distribution is plotted.

4) Item statistics, which can help determine which items are good and which need improvement or deletion from the examination. A matrix of responses by fifths is printed which shows the frequency of responses within each fifth. Also presented for each alternative are the proportion of responses and the point-biserial correlation.

GENERAL DESCRIPTION

This procedure analyzes item data from multiple-response tests which have a single correct response for each item. A "key" is used to compute a score for each individual by counting the number of correct responses.

Information based on the computed test score includes total test statistics, test frequency distribution, percents, percentiles, and standard score. Each item is then presented as a graph of the percent of correct response by fifths and also as a matrix of the responses by fifths. In addition to the matrix, the proportion of people responding and a point-biserial correlation are calculated for each response.

The output is divided into three parts: 1) Test Statistics, 2) Distribution Characteristics, and 3) Item Statistics.

The Test Statistics include the number of items; the mean, or average, of all scores; and the median, which is the point that divides the raw score distribution such that 50% of the scores are above it and 50% are below it. The standard deviation is a measure of the variability of the scores from the mean. The reliability is a measure of the relationship between test items for the group tested. The standard error indicates the accuracy of the measure in that it reflects how close the observed score is to the true score. Also presented are possible low and high scores, obtained low and high scores, and the number of blank and invalid scores.

The frequency distribution orders the raw scores from high to low. Also presented are standard score (mean=500, standard deviation=100), percent, percentile, frequency, and cumulative frequency. A histogram indicates the distribution of the scores at each value.

The item statistics are useful for observing how well the items performed on a given sample, and they indicate which items could be improved. The difficulty of an item is the proportion (PROP) of subjects who passed it. Ebel (1965) and Lindquist (1951) recommend using items with difficulty ranging from .30 to .70 in order to obtain differential information. The discrimination power of an item is measured by the point-biserial correlation (RPBI). The relationship between the item score and the total test score can be obtained from this statistic. The highest RPBI's would represent the most discriminating items. Since a dichotomous variable is compared to a continuous one, and since teacher-made tests rarely achieve RPBI's above .50, it is rare for a high RPBI to be obtained. Therefore, good items which do discriminate should have RPBI's which fall in the range of .30 to .70.

Also presented is a plot (scattergram) of the correct response by fifths and a matrix of the tabulation of the same. This tabulation, along with the proportion of the group responding to each alternative, provides necessary information for determining which items are good and which are weak. A good item would be one with a high proportion passing it and a high RPBI. An item which has a negative RPBI would indicate that the students with lower scores are passing it and the ones with higher scores are not.

FORMULAS USED

N = number of respondents in sample
X = sum of the correct responses for each respondent

MEAN = SUM(X)/N

MEDIAN = A - 1/2 + (CUMA - N/2)/FREQA
   where A     = lowest score that one half or more of the distribution obtained
         FREQA = frequency of scores at A
         CUMA  = cumulative frequency of scores starting with the highest and summing down to and including A

SIGMA (STANDARD DEVIATION) = SQRT((N*SUM(X**2) - (SUM(X))**2)/(N*(N-1)))

KR-21 (homogeneity reliability) = [K/(K-1)]*[1 - (MEAN*(1 - MEAN/K))/SIGMA**2]
   where K = number of items

STANDARD ERROR OF MEASUREMENT = SIGMA*SQRT[1 - KR_21]

STANDARD SCORE = 100*(X - MEAN)/SIGMA + 500

PERCENTILE = SUM(CUMB)/N
   where CUMB = cumulative frequency of scores starting with the lowest score and summing up to and including the frequency of the score the percentile represents

Adjustments have been made at the upper and lower limits to prevent percentiles less than 1 or greater than 99.

ITEM STATISTICS

PROP = NP/N
   where NP = number of respondents responding to an alternative

RPBI (point-biserial correlation) = ((MP - MEAN)/SIGMA)*SQRT(NP/NQ)
   where MP = mean score of the respondents responding to the alternative
         NP = number responding to the alternative
         NQ = number not responding to the alternative

PROCEDURE STATEMENTS

REQUIRED:

PROC ITEM NR= NQ= ;
   where NR = number of valid responses
         NQ = number of test items

RESPONSE string;
   where string = string of the valid responses (length=NR)

KEY string;
   where string = string of the correct responses (length=NQ)

OPTIONAL:

VAR list;       list of test variables which are items
TITLE title;    title of output
OPTIONS list;   list of options; produces item plots and matrices

ERRORS ON SAS STATEMENTS:

1 = length of response string not equal to number of responses specified
2 = response statement must precede key statement
3 = length of key string not equal to number of questions specified
4 = number of responses not specified on PROC statement
5 = number of questions not specified on PROC statement

ADDITIONAL OUTPUT INFORMATION

If the sample size is large, the scattergram in the frequency distribution section will represent a multiple of scores with one #. This is calculated with respect to the largest frequency. The NOCENTER option is accounted for, as are the LINESIZE and PAGESIZE options. The LINESIZE option will split the frequency and item statistics into 2 pages if the linesize is less than 82.

The mode of the variables is alphanum; therefore, if no variable list is specified, the standard SAS defaults will be in effect.

REFERENCES

Bussell, Rodney R., with Costa, Irene A., Spencer, Richard E., and Aleamoni, Lawrence M., MERMAC Manual, University of Illinois Press, Urbana, Ill., 1971.

Ebel, R. L., Measuring Educational Achievement, Prentice-Hall, New Jersey, 1965.

Lindquist, E. F. (ed.), Educational Measurement, American Council on Education, Washington, D. C., 1951.
SAMPLE SAS STATEMENTS

PROC ITEM NR=5 NQ=10 DATA=ALL;
RESPONSE 01234;
KEY 0123423210;

PROC ITEM NR=5 NQ=10 DATA=ALL;
RESPONSE 01234;
KEY 0123423110;
OPTIONS LS=[illegible];

PROC ITEM NR=5 NQ=10;
RESPONSE 01234;
KEY 0123423123;
VAR Q1-Q10;

SAMPLE OUTPUT

ITEM ANALYSIS
SUMMARY OF TEST STATISTICS

NUMBER OF QUESTIONS          10
MEAN SCORE                    5.000
MEDIAN SCORE                  5.059
STANDARD DEVIATION            1.449
RELIABILITY (KR-21)          -0.211
S.E. OF MEASUREMENT           1.595
POSSIBLE LOW SCORE            0
POSSIBLE HIGH SCORE          10
OBTAINED LOW SCORE            1
OBTAINED HIGH SCORE          10
TOTAL NUMBER OF QUESTIONS  1000
BLANK                        11
INVALID                      70

ITEM ANALYSIS
TEST FREQUENCY DISTRIBUTION

RAW    STANDARD  PER                      CUM
SCORE  SCORE     CENTILE  PERCENT  FREQ   FREQ   EACH # REPRESENTS UP TO 1 SCORE
 10    844       99        1.0       1    100    #
  9    775       98        0.0       0     99
  8    706       97        2.0       2     99    ##
  7    637       92        8.0       8     97    ########
  6    568       76       24.0      24     89    ########################
  5    500       46       34.0      34     65    ##################################
  4    431       21       18.0      18     31    ##################
  3    362        8        8.0       8     13    ########
  2    293        3        3.0       3      5    ###
  1    224        1        2.0       2      2    ##

ITEM ANALYSIS
ITEM STATISTICS

PERCENT OF CORRECT RESPONSE BY FIFTHS

[Plot not recoverable: one row for each fifth (1ST through 5TH), with the percent of correct responses marked by * on a horizontal axis running from 0 to 100 in steps of 10.]

ITEM 1
MATRIX OF RESPONSES BY FIFTHS

[Matrix values not recoverable: one row for each fifth (1ST through 5TH) plus PROP and RPBI rows, with one column for each of the responses A through E and one for OMIT. C IS CORRECT RESPONSE.]
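
CHECKING THE SAMPLE OUTPUT AGAINST THE FORMULAS

As a consistency check, the summary statistics can be recomputed from the sample frequency distribution using the formulas given in the paper. The Python sketch below is an illustration only. The frequencies are the values as read from the printed table (100 scores on a 10-item test), and the choice of A as the first score reached from the top whose cumulative frequency is at least N/2 is one reading of the median definition. Truncating the standard score to an integer is an assumption; it is used because it matches the printed standard scores.

```python
import math

# Frequency of each raw score, as read from the sample frequency
# distribution (assumed transcription of the printed table).
freq = {10: 1, 9: 0, 8: 2, 7: 8, 6: 24, 5: 34, 4: 18, 3: 8, 2: 3, 1: 2}
k = 10  # number of items (NQ=10 in the sample statements)

n = sum(freq.values())
sum_x = sum(x * f for x, f in freq.items())
sum_x2 = sum(x * x * f for x, f in freq.items())

mean = sum_x / n
sigma = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))
kr21 = (k / (k - 1)) * (1 - (mean * (1 - mean / k)) / sigma ** 2)
sem = sigma * math.sqrt(1 - kr21)

# Median: sum frequencies from the highest score downward and stop at the
# first score A whose cumulative frequency CUMA reaches half the sample.
cuma = 0
for a in sorted(freq, reverse=True):
    cuma += freq[a]
    if cuma >= n / 2:
        break
median = a - 0.5 + (cuma - n / 2) / freq[a]

# Standard scores, truncated to integers (assumption, see above).
standard = {x: int(100 * (x - mean) / sigma + 500) for x in freq}

print("N=%d mean=%.3f median=%.3f sigma=%.3f" % (n, mean, median, sigma))
print("KR-21=%.3f SEM=%.3f" % (kr21, sem))
for x in sorted(standard, reverse=True):
    print(x, standard[x])
```

Under these assumptions the sketch gives a mean of 5.000, a standard deviation of 1.449, a KR-21 of -0.211, and a standard error of measurement of 1.595, in agreement with the printed summary, and standard scores running from 844 down to 224. The negative KR-21 arises because the score variance (about 2.10) is smaller than MEAN*(1 - MEAN/K) = 2.5.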