TYPE I ERROR RATE FOR SMALL SAMPLE SIZES USING VARIOUS OPTIONS OF THE CATMOD PROCEDURE

Susan J. Kenny, University of Oklahoma Health Sciences Center, Oklahoma City, OK
J. Paul Costiloe, University of Oklahoma Health Sciences Center, Oklahoma City, OK
Andrew J. Cucchiara, University of Oklahoma Health Sciences Center, Oklahoma City, OK

ABSTRACT

The caution extended by SAS Institute Inc. to users of their categorical model procedure, CATMOD, does not clearly identify the risks involved when analyzing models with small sample size. A SAS program was written and executed to assess these risks by variation of three binomial parameter values under the null hypothesis (P0 = 0.1, 0.3, 0.5) with constant sample sizes of 2 to 120. For sample sizes less than twenty, the four-dimensional sample space for a $2^3$ categorical model was formed and the coordinates of each point were used to construct cell frequencies for a categorical table with two factors and one binomial response. The significance of the interaction term in these tables was determined by use of CATMOD. For each table that yielded a $\chi^2$ value of 3.84 or greater, the binomial probability function of SAS (PROBBNML) was used to calculate the probability associated with that point in the sample space. The cumulative probability of the significant tables represents the true size of the rejection region. For sample sizes greater than twenty, a simulation method was used for each of the three null hypotheses to estimate the size of the critical region as the proportion of the 2000 sampled tables that yielded $\chi^2$ values of 3.84 or greater. The results demonstrate that the Type I error rate is less than the nominal $\alpha$ of 0.05 for sample sizes less than 20 when using the default options of SAS. Conversely, the joint response function produced a critical region that was more often greater than 0.05. Different additions to empty cells had the greatest effect when sample sizes were less than twenty.

INTRODUCTION

Categorical data is information that is classified into discrete categories of nominal or ordinal scales. Observations of subjects under one or more categorical variables can be represented by contingency tables of one or more dimensions. When more than two categorical variables are present, a distinction can be made between independent and dependent variables. Profiles based on independent variables are called population profiles and those based on the dependent variables are called response profiles. Any one subject can then be classified into one population profile and one response profile. This is the type of arrangement that is used to analyze categorical data as a linear model.

The use of linear modeling for the analysis of categorical data is a recent development that is gaining popularity among researchers. The theoretical justification for the linear model approach was developed by Wald (1943), Neyman (1949), and Grizzle, Starmer, and Koch (1969). This increase in use of the linear model approach is in part due to the availability of computer software designed for such analysis. One such computer program is the SAS procedure CATMOD (1985).

Hypotheses about population parameters in a categorical model are tested with the generalized Wald statistic, which is asymptotically distributed as a chi-square ($\chi^2$). This asymptotic property can present problems to researchers, since the behavior of the statistic is not known when computed for small sample sizes. Computational difficulties occur when transformations, such as the logit or the cell proportion, are performed on the response probabilities in the presence of one or more empty cells. Furthermore, the effect that addition of a small value to empty cells has on the Type I error rate has not been investigated.

The purpose of this paper is to determine the Type I error rate of CATMOD when the expected size of the critical region is equal to 0.05 for sample sizes of two to one hundred twenty observations. For sample sizes less than twenty, the assessment is made by summing the probabilities of all points of the sample space that fall into the rejection region. For larger sample sizes, a simulation study is conducted to estimate the Type I error rate. The effect that the addition of various small values to empty cells has on the Type I error rate is also investigated, and a comparison is made between the default logit response function and the joint response function.

METHODS

A categorical model with two independent factor variables and one binomial response variable was used for this investigation. The factor variables, designated GROUP and TREATMENT, each assume values of 1 or 2, while the dependent response variable assumes values of success or failure. This model can be represented as four populations with two response categories. The row margins corresponding to the population sizes were held constant within a sample space so that $n_{1\cdot} = n_{2\cdot} = n_{3\cdot} = n_{4\cdot} = n$, where $1 < n \le 120$. The frequency table for this model is given by the following:

    POPULATIONS (S)              RESPONSE CATEGORY
    GROUP  TREATMENT   S         FAILURE   SUCCESS
      1        1       1          n_11      n_12
      1        2       2          n_21      n_22
      2        1       3          n_31      n_32
      2        2       4          n_41      n_42

The frequencies in such a table are used to calculate estimates of the population parameters $\pi_{ij}$, the proportion of subjects in the $i$th population exhibiting the $j$th level of response. The estimate of $\pi_{ij}$ is $p_{ij}$, where $p_{ij} = n_{ij}/n_{i\cdot}$ and $n_{i\cdot} = n_{i1} + n_{i2}$.

With the enumeration approach, all possible contingency tables were generated for each constant sample size $n$ of the populations. As an illustration, consider the 4 x 4 sample space of contingency tables within GROUP 1 having fixed sample sizes of three observations each.
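The 4 x 4 sample space for one GROUP can be enumerated directly. The Python sketch below is illustrative only and is not the authors' SAS program; all function names are hypothetical. It assumes that the success counts of the two populations are independent binomials with a common success probability, lists every pair of success counts for populations of size n, and converts a point into (FAILURE, SUCCESS) frequency rows.

```python
from itertools import product
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def group_sample_space(n):
    """All (TREATMENT 1 successes, TREATMENT 2 successes) pairs for one GROUP."""
    return list(product(range(n + 1), repeat=2))


def point_probability(point, n, pi0):
    """Probability of one point, assuming independent binomial counts
    that share the null success probability pi0."""
    s1, s2 = point
    return binom_pmf(s1, n, pi0) * binom_pmf(s2, n, pi0)


def frequency_rows(point, n):
    """(FAILURE, SUCCESS) frequencies for the two populations of a GROUP."""
    return [(n - s, s) for s in point]


space = group_sample_space(3)
print(len(space))                      # number of points in the sample space
print(frequency_rows((2, 1), 3))       # frequency rows for the point (2, 1)
print(point_probability((2, 1), 3, 0.3))
```

Because the point probabilities are products of binomial terms, they sum to one over the whole sample space, which gives a quick sanity check on any enumeration of this kind.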
The GROUP 1 sample space is presented in the following display, in which brackets mark the designated point:

    SAMPLE SPACE FOR GROUP 1

    TREATMENT 2    3 |  (0,3)   (1,3)   (2,3)   (3,3)
    SUCCESSES      2 |  (0,2)   (1,2)   (2,2)   (3,2)
    (n_22)         1 |  (0,1)   (1,1)  [(2,1)]  (3,1)
                   0 |  (0,0)   (1,0)   (2,0)   (3,0)
                     +--------------------------------
                          0       1       2       3
                        TREATMENT 1 SUCCESSES (n_12)

The probability of any point of this sample space is the product of independent binomial probabilities. Under a null hypothesis, e.g., $H_0: \pi_{12} = \pi_{22} = \pi = 0.3$, the designated point (2,1) of the GROUP 1 sample space will occur with a probability given by the following:

$$\binom{3}{2}(0.3)^{2}(0.7)^{3-2} \times \binom{3}{1}(0.3)^{1}(0.7)^{3-1} = 0.083349$$

Given the above sample point for GROUP 1, there exists a 4 x 4 companion sample space for GROUP 2, which is presented in the following display:

    SAMPLE SPACE FOR GROUP 2

    TREATMENT 2    3 |  (0,3)   (1,3)   (2,3)   (3,3)
    SUCCESSES      2 |  (0,2)   (1,2)   (2,2)   (3,2)
    (n_42)         1 |  (0,1)   (1,1)   (2,1)  [(3,1)]
                   0 |  (0,0)   (1,0)   (2,0)   (3,0)
                     +--------------------------------
                          0       1       2       3
                        TREATMENT 1 SUCCESSES (n_32)

Under a null hypothesis, as above, $H_0: \pi_{32} = \pi_{42} = \pi = 0.3$, the designated point (3,1) in the GROUP 2 sample space occurs with probability given by the following:

$$\binom{3}{3}(0.3)^{3}(0.7)^{3-3} \times \binom{3}{1}(0.3)^{1}(0.7)^{3-1} = 0.011907$$

For the complete model, the designated points (2,1) and (3,1) in the GROUP 1 and GROUP 2 sample spaces, respectively, define a single point in the four-dimensional sample space of 256 possible points. These two points are combined to produce the following contingency table:

    POPULATIONS (S)              RESPONSE CATEGORY
    GROUP  TREATMENT   S         FAILURE   SUCCESS
      1        1       1            1         2
      1        2       2            2         1
      2        1       3            0         3
      2        2       4            2         1

Under $H_0: \pi_{12} = \pi_{22} = \pi_{32} = \pi_{42} = \pi = 0.3$, the probability of the complete model frequency table is the product of the within-GROUP probabilities. For the above table, this probability is given by the following:

$$\mathrm{Prob}(2,1,3,1) = \mathrm{Prob}(2,1) \times \mathrm{Prob}(3,1) = 0.083349 \times 0.011907 = 0.000992436$$

Each table in the sample space was analyzed using the CATMOD procedure with the following model specification statement:

    PROC CATMOD;
      MODEL RESPONSE = GROUP TREATMNT GROUP*TREATMNT;
      RESPONSE LOGIT;
      RESPONSE JOINT;

As SAS processed each possible table from the sample space, the calculated $\chi^2$ values were read from the output. For each table that produced a $\chi^2$ value greater than 3.84 for the interaction effect, the SAS function PROBBNML was called to calculate the probability of that table under each of three null hypotheses: $H_0: \pi_{12} = \pi_{22} = \pi_{32} = \pi_{42} = \pi$, with $\pi$ = 0.1, 0.3, or 0.5. These probabilities were accumulated over the sample space and thus represent the actual Type I error rate for the test of interaction.

For sample sizes greater than twenty, a simulation method was used to select a random sample of 2000 points from the sample space under the three null hypotheses above. The 2000 points were selected using the UNIFORM function to sample from an array that had element values in proportion to their expected binomial frequency. The 2000 sample space points were used to construct frequencies for the categorical tables. These tables were processed using the same model statement as above. In this manner, the same points were analyzed with both the joint and the logit response functions. The proportion of tables that produced significant values was used as the estimate of the size of the critical region. Three independent estimates of the critical region were obtained and averaged to produce an accurate estimate.

For both enumeration and simulation methods, the Type I error rate obtained from weighted least squares (WLS) estimation with the logit response function was compared with that obtained from the joint response function. Additional analyses were conducted that involved addition of small values of 0.01 or 0.5 to empty cells prior to analysis.

RESULTS

With the enumeration approach and the default logit response function, in which SAS adds 0.5 to empty cells, the size of the rejection region was below 0.05 for all sample sizes when the null parameter was 0.3 or 0.1 (Figure 1). With a null parameter of 0.5, CATMOD approached an alpha size test when the sample size was greater than nine, yet the size of the region was not consistent with increasing sample sizes.

The addition of 0.01 to empty cells when using the logit response function resulted in a critical region that was smaller than that of the default addition of 0.5 (Figure 2). The addition of different small values to empty cells had the greatest effect when sample sizes were less than nine for a null parameter of 0.5 and less than fifteen for a null parameter of 0.3.

In contrast to the logit function, the use of the joint response function with 0.5 added to empty cells yielded a critical region that exceeded the expected size of 0.05 for most values of the null parameter and sample size (Figure 3). Only for the parameter value of 0.1 did the critical region fail to achieve a level of 0.05.

The choice of a constant for addition to empty cells produced variable effects on the Type I error rate when the joint response function was requested. For a null parameter of 0.5, the addition of 0.01 produced a critical region size that was much larger than that associated with the addition of 0.5 for sample sizes greater than seven (Figure 4). The greatest effect of the different additions is seen when the population parameter is 0.1 (Figure 5). For all null parameters, the addition of 0.01 produced a Type I error rate that was generally greater than that for the addition of 0.5.

The results of the simulation approach indicate that when using the logit function, the size of the critical region approached the nominal 0.05 when sample sizes were greater than forty and the null parameter value was 0.3 or 0.5 (Figure 6). An alpha size test is approximated when sample sizes approached 120 with a null parameter of 0.1. For the joint response function, the critical region remained greater than the expected size of 0.05 until sample sizes reached eighty (Figure 7). Since there was little difference in the size of the critical region associated with the addition of the two small values to empty cells for either response function at these sample sizes, no illustration is given.

DISCUSSION

Our results indicate that for the categorical model studied, the CATMOD procedure is conservative for the test of an interaction effect when using the default logit response function with small sample sizes. For sample sizes less than twenty, the Type I error rate was below the expected five percent for most combinations of sample size and null parameter value. The procedure is especially conservative for small values of the null parameter. The value of 0.01 added to empty cells when using the logit response function produced a slightly reduced critical region size compared to that of the SAS default addition of 0.5. The most notable difference between the addition of the two values occurred when sample sizes of the populations were less than nine.

Specification of the joint response function produces critical regions not only larger than those associated with the logit but often greater than the nominal value of 0.05. This would indicate an overly liberal test. Furthermore, the Type I error rate increases as the value of the constant added to empty cells is decreased. The effect of the value added to empty cells becomes more profound when the null parameter is less than 0.5.

Sample size needs to be twenty or greater to achieve an alpha size test when using the logit response function with a null parameter of 0.3 or 0.5. With null parameters of 0.3 or 0.5, the joint response function consistently produced a critical region size greater than 0.05 until sample size exceeded sixty. With a small population proportion, such as 0.1, the sample size needs to be at least sixty, that is, 240 total observations, to achieve an alpha size critical region when using the logit response function, whereas the sample size needs to be forty, that is, 160 total observations, when using the joint response function.

In the CATMOD chapter of the SAS Statistics manual, SAS extends a caution to users that leaves the reader with the impression that all CATMOD results are invalid for small sample sizes. This caution may be misleading to researchers. Our results indicate that caution should be used in interpreting nonsignificant results when using the default logit response function, but that significant results should not be dismissed as invalid. Conversely, significant results obtained when using the joint response function should be viewed with caution, since the size of the critical region is much larger than the nominal value of 0.05 for sample sizes less than sixty.

REFERENCES

Grizzle, J.E., Starmer, C.F. and Koch, G.G. [1969]. Analysis of categorical data by linear models. Biometrics 25, 489-504.

Neyman, J. [1949]. Contribution to the theory of the $\chi^2$ test. Pp. 238-273 in: Proc. Berkeley Symp. Math. Statist. Prob. University of California Press, Berkeley and Los Angeles.

SAS Institute Inc. [1985]. SAS User's Guide: Statistics, Version 5 Edition. Cary, NC: SAS Institute Inc.

Wald, A. [1943]. Tests of statistical hypotheses concerning several parameters when the number of observations is large. Trans. Amer. Math. Soc. 54, 426-482.

FIGURE 1. Enumeration method using the logit response function with SAS default value of 0.5 added to empty cells. (Horizontal axis: sample size of population, 2 to 20; series: $\pi$ value 0.1, 0.3, 0.5.)

FIGURE 2. A comparison between two values added to empty cells when using the logit function. (Horizontal axis: sample size of population; series: null parameter values with the default and with 0.01 added.)

FIGURE 3. Enumeration method using the joint response function with 0.5 added to empty cells. (Horizontal axis: sample size of population; series: $\pi$ value 0.1, 0.3, 0.5.)

FIGURE 4. A comparison between two values added to empty cells when using the joint response function. (Horizontal axis: sample size of population; series: P0 = 0.5 with 0.01 added and with 0.5 added.)

FIGURE 5. A comparison between two values added to empty cells when using the joint response function. (Horizontal axis: sample size of population; series: P0 = 0.1 with 0.01 added and with 0.5 added.)

FIGURE 6. Simulation method using the logit response function with SAS default of 0.5 added to empty cells. (Horizontal axis: sample size of population, 20 to 120.)

FIGURE 7. Simulation method using the joint response function with 0.5 added to empty cells. (Horizontal axis: sample size of population, 20 to 120.)
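The enumeration and simulation procedures described under METHODS can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the authors' SAS program and not a reproduction of CATMOD: the interaction test is a hand-computed Wald chi-square for the saturated logit model (the contrast 1, -1, -1, 1 on the four observed logits), the constant `add` is placed in empty cells only, the joint response function is not covered, and the simulation draws Bernoulli trials directly instead of sampling from an array with the UNIFORM function. All function names are hypothetical, and the numbers produced need not match the paper's figures.

```python
import random
from itertools import product
from math import comb, log

CRITICAL = 3.84  # chi-square cutoff quoted in the paper (1 df, alpha = 0.05)


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def interaction_wald(successes, n, add=0.5):
    """Wald chi-square for the GROUP*TREATMENT contrast on observed logits.

    Assumption: `add` replaces the frequency of an empty cell only; the
    variance of each logit is 1/successes + 1/failures (delta method).
    """
    contrast = (1, -1, -1, 1)
    estimate = 0.0
    variance = 0.0
    for c, s in zip(contrast, successes):
        succ = s if s > 0 else add
        fail = (n - s) if (n - s) > 0 else add
        estimate += c * log(succ / fail)
        variance += 1.0 / succ + 1.0 / fail
    return estimate**2 / variance


def exact_type1_error(n, pi0, add=0.5):
    """Sum the null probabilities of all tables in the rejection region."""
    total = 0.0
    for point in product(range(n + 1), repeat=4):
        if interaction_wald(point, n, add) >= CRITICAL:
            prob = 1.0
            for s in point:
                prob *= binom_pmf(s, n, pi0)
            total += prob
    return total


def simulated_type1_error(n, pi0, add=0.5, tables=2000, seed=0):
    """Proportion of randomly drawn null tables that are rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(tables):
        point = tuple(sum(rng.random() < pi0 for _ in range(n)) for _ in range(4))
        if interaction_wald(point, n, add) >= CRITICAL:
            hits += 1
    return hits / tables


print(exact_type1_error(5, 0.3))        # enumeration, default addition of 0.5
print(exact_type1_error(5, 0.3, 0.01))  # enumeration, 0.01 added to empty cells
print(simulated_type1_error(40, 0.3))   # simulation with 2000 tables
```

Enumeration visits $(n+1)^4$ tables, so it is only practical for small n, which is consistent with the paper's switch to simulation for larger sample sizes.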