TYPE I ERROR RATE FOR SMALL SAMPLE SIZES
USING VARIOUS OPTIONS OF THE CATMOD PROCEDURE
Susan J. Kenny, University of Oklahoma Health Sciences Center, Oklahoma City, OK
J. Paul Costiloe, University of Oklahoma Health Sciences Center, Oklahoma City, OK
Andrew J. Cucchiara, University of Oklahoma Health Sciences Center, Oklahoma City, OK

ABSTRACT

The caution extended by SAS Institute Inc. to users of their categorical model procedure, CATMOD, does not clearly identify the risks involved when analyzing models with small sample size. A SAS program was written and executed to assess these risks by variation of three binomial parameter values under the null hypothesis (P0 = 0.1, 0.3, 0.5) with constant sample sizes of 2 to 120. For sample sizes less than twenty, the 4 dimensional sample space for a 2^3 categorical model was formed and the coordinates of each point used to construct cell frequencies for a categorical table with two factors and one binomial response. The significance of the interaction term in these tables was determined by use of CATMOD. For each table that yielded a χ² value of 3.84 or greater, the binomial probability function of SAS (PROBBNML) was used to calculate the probability associated with that point in the sample space. The cumulative probability of the significant tables represents the true size of the rejection region. For sample sizes greater than twenty, a simulation method was used for each of three null hypotheses to estimate the size of the critical region as the proportion of the sampled 2000 tables which yielded χ² values of 3.84 or greater. The results demonstrate that the Type I error rate is less than the nominal α of 0.05 for sample sizes less than 20 when using the default options of SAS. Conversely, the joint response function produced a critical region that was more often greater than 0.05. Different additions to empty cells had the greatest effect when sample sizes were less than twenty.

INTRODUCTION

Categorical data is information that is classified into discrete categories of nominal or ordinal scales. Observations of subjects under one or more categorical variables can be represented by contingency tables of one or more dimensions. When more than two categorical variables are present, a distinction can be made between independent and dependent variables. Profiles based on independent variables are called population profiles and those based on the dependent variables are called response profiles. Any one subject can then be classified into one population profile and one response profile. This is the type of arrangement that is used to analyze categorical data as a linear model.

The use of linear modeling for the analysis of categorical data is a recent development that is gaining popularity among researchers. The theoretical justification for the linear model approach was developed by Wald (1943), Neyman (1949), and Grizzle, Starmer, and Koch (1969). This increase in use of the linear model approach is in part due to the availability of computer software designed for such analysis. One such computer program is the SAS procedure CATMOD (1985).

Hypotheses about population parameters in a categorical model are tested with the generalized Wald statistic, which is asymptotically distributed as a chi-square (χ²). This asymptotic property can present problems to researchers, since the behavior of the statistic is not known when computed for small sample sizes. Computational difficulties occur when transformations, such as the logit or the cell proportion, are performed on the response probabilities in the presence of one or more empty cells. Furthermore, the effect that addition of a small value to empty cells has on the Type I error rate has not been investigated.

The purpose of this paper is to determine the Type I error rate of CATMOD when the expected size of the critical region is equal to 0.05 for sample sizes of two to one hundred twenty observations. For sample sizes less than twenty, the assessment is made by summing the probabilities of all points of the sample space that fall into the rejection region. For larger sample sizes, a simulation study is conducted to estimate the Type I error rate. The effect that the addition of various small values to empty cells has on the Type I error rate is also investigated and a comparison is made between the default logit response function and the joint response function.

METHODS

A categorical model with two independent factor variables and one binomial response variable was used for this investigation. The factor variables, designated GROUP and TREATMENT, each assume values of 1 or 2, while the dependent response variable assumes values of success or failure. This model can be represented as four populations with two response categories. The row margins corresponding to the population sizes were held constant within a sample space so that $n_{1\cdot} = n_{2\cdot} = n_{3\cdot} = n_{4\cdot} = n$, where $1 < n \le 120$. The frequency table for this model is given by the following:

    POPULATIONS (S)                 RESPONSE CATEGORY
    GROUP   TREATMENT   S           FAILURE   SUCCESS
      1         1       1             n11       n12
      1         2       2             n21       n22
      2         1       3             n31       n32
      2         2       4             n41       n42

The frequencies in such a table are used to calculate estimates of the population parameters $\pi_{ij}$, the proportion of subjects in the $i$th population exhibiting the $j$th level of response. The estimate of $\pi_{ij}$ is $p_{ij}$, where $p_{ij} = n_{ij}/n_{i\cdot}$ and $n_{i\cdot} = n_{i1} + n_{i2}$.

As an illustration, consider the 4 x 4 sample space of contingency tables within GROUP 1 having fixed sample sizes of three observations each. This sample space is presented in the following display:

    SAMPLE SPACE FOR GROUP 1

                      3 |  (0,3)   (1,3)   (2,3)   (3,3)
    TREATMENT 2       2 |  (0,2)   (1,2)   (2,2)   (3,2)
    SUCCESSES (n22)   1 |  (0,1)   (1,1)  [(2,1)]  (3,1)
                      0 |  (0,0)   (1,0)   (2,0)   (3,0)
                        +--------------------------------
                             0       1       2       3
                          TREATMENT 1 SUCCESSES (n12)

The probability of any point of this sample space is the product of independent binomial probabilities. Under a null hypothesis, e.g., $H_0\colon \pi_{12} = \pi_{22} = \pi = 0.3$, the designated point (2,1) of the GROUP 1 sample space will occur with a probability given by the following:

$$\mathrm{Prob}(2,1) = \binom{3}{2}(0.3)^{2}(0.7)^{3-2} \times \binom{3}{1}(0.3)^{1}(0.7)^{3-1} = 0.083349$$

Given the above sample point for GROUP 1, there exists a 4 x 4 companion sample space for GROUP 2 which is presented in the following display:

    SAMPLE SPACE FOR GROUP 2

                      3 |  (0,3)   (1,3)   (2,3)   (3,3)
    TREATMENT 2       2 |  (0,2)   (1,2)   (2,2)   (3,2)
    SUCCESSES (n42)   1 |  (0,1)   (1,1)   (2,1)  [(3,1)]
                      0 |  (0,0)   (1,0)   (2,0)   (3,0)
                        +--------------------------------
                             0       1       2       3
                          TREATMENT 1 SUCCESSES (n32)

Under a null hypothesis, as above $H_0\colon \pi_{32} = \pi_{42} = \pi = 0.3$, the designated point (3,1) in the GROUP 2 sample space occurs with probability given by the following:

$$\mathrm{Prob}(3,1) = \binom{3}{3}(0.3)^{3}(0.7)^{3-3} \times \binom{3}{1}(0.3)^{1}(0.7)^{3-1} = 0.011907$$

For the complete model, the designated points (2,1) and (3,1) in the GROUP 1 and 2 sample spaces, respectively, define a single point in the four dimensional sample space of 256 possible points. These two points are combined to produce the following contingency table:

    POPULATIONS (S)                 RESPONSE CATEGORY
    GROUP   TREATMENT   S           FAILURE   SUCCESS
      1         1       1              1         2
      1         2       2              2         1
      2         1       3              0         3
      2         2       4              2         1

Under $H_0\colon \pi_{12} = \pi_{22} = \pi_{32} = \pi_{42} = \pi = 0.3$, the probability of the complete model frequency table is the product of the within GROUP probabilities. For the above table, this probability is given by the following:

$$\mathrm{Prob}(2,1,3,1) = \mathrm{Prob}(2,1) \times \mathrm{Prob}(3,1) = 0.083349 \times 0.011907 = 0.000992436$$

With this enumeration approach, all possible contingency tables were generated for each constant sample size n of the populations. Each table was analyzed using the CATMOD procedure with the following model specification statement:

    PROC CATMOD;
      MODEL RESPONSE = GROUP TREATMNT GROUP*TREATMNT;
      RESPONSE LOGIT;
      RESPONSE JOINT;

As SAS processed each possible table from the sample space, the calculated χ² values were read from the output. For each table that produced a χ² value greater than 3.84 for the interaction effect, the SAS function PROBBNML was called to calculate the probability of that table under each of three null hypotheses: $H_0\colon \pi_{12} = \pi_{22} = \pi_{32} = \pi_{42} = \pi$, with $\pi$ = 0.1, 0.3, or 0.5. These probabilities were accumulated over the sample space and thus represent the actual Type I error rate for the test of interaction.

For sample sizes greater than twenty, a simulation method was used to select a random sample of 2000 points from the sample space under the three null hypotheses above. The 2000 points were selected using the UNIFORM function to sample from an array that had element values in proportion to their expected binomial frequency.

The 2000 sample space points were used to construct frequencies for the categorical tables. These tables were processed using the same model statement as above. In this manner, the same points were analyzed with both the joint and the logit response functions. The proportion of tables that produced significant values was used as the estimate of the size of the critical region. Three independent estimates of the critical region were obtained and averaged to produce an accurate estimate.

For both enumeration and simulation methods, the Type I error rate obtained from weighted least squares (WLS) estimation with the logit response function was compared with that obtained from the joint response function. Additional analyses were conducted that involved addition of small values of 0.01 or 0.5 to empty cells prior to analysis.

RESULTS

With the enumeration approach and the default logit response function, in which SAS adds 0.5 to empty cells, the size of the rejection region was below 0.05 for all sample sizes when the null parameter was 0.3 or 0.1 (Figure 1). With a null parameter of 0.5, CATMOD approached an alpha size test when the sample size was greater than nine, yet the size of the region was not consistent with increasing sample sizes.

The addition of 0.01 to empty cells when using the logit response function resulted in a critical region that was smaller than that of the default addition of 0.5 (Figure 2). The addition of different small values to empty cells had the greatest effect when sample sizes were less than nine for null parameter of 0.5 and less than fifteen for null parameter of 0.3.
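
The enumeration results for the logit response function rest on the procedure described in METHODS: generate every table, test the interaction, and sum the null probabilities of the tables that reject. A minimal Python sketch of that procedure follows. It is an illustration under stated assumptions, not the authors' SAS program and not a reproduction of CATMOD: it assumes that the Wald test of the GROUP*TREATMENT interaction in this saturated model reduces to the single contrast (+1, -1, -1, +1) on the four logits with the usual large-sample variance, and that the constant `add` replaces only zero counts. All function names are hypothetical.

```python
from itertools import product
from math import comb, log

CONTRAST = (1, -1, -1, 1)  # interaction contrast over populations S = 1..4


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def table_probability(successes, n, p0):
    """Probability of one sample point: product of four independent binomials."""
    prob = 1.0
    for s in successes:
        prob *= binom_pmf(s, n, p0)
    return prob


def wald_interaction_logit(successes, n, add=0.5):
    """Wald chi-square (1 df) for the interaction contrast on the logits.

    Assumption: only zero counts are replaced by `add` (must be positive).
    """
    estimate = 0.0
    variance = 0.0
    for c, s in zip(CONTRAST, successes):
        f = n - s
        s_adj = s if s > 0 else add
        f_adj = f if f > 0 else add
        estimate += c * log(s_adj / f_adj)
        variance += 1.0 / s_adj + 1.0 / f_adj
    return estimate**2 / variance


def enumerated_type1_rate(n, p0, add=0.5, critical=3.84):
    """Sum the null probabilities of all (n + 1)**4 tables that reject."""
    rate = 0.0
    for successes in product(range(n + 1), repeat=4):
        if wald_interaction_logit(successes, n, add) >= critical:
            rate += table_probability(successes, n, p0)
    return rate


if __name__ == "__main__":
    # Sample point (2, 1, 3, 1) with n = 3 and null parameter 0.3, as in METHODS
    print(table_probability((2, 1, 3, 1), 3, 0.3))
    for p0 in (0.1, 0.3, 0.5):
        print(p0, enumerated_type1_rate(5, p0, add=0.5),
              enumerated_type1_rate(5, p0, add=0.01))
```

Because the sketch does not call CATMOD, its rates need not match the figures exactly; it is meant only to make the enumeration logic and the role of the added constant concrete.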

In contrast to the logit function, the use of the joint response function with 0.5 added to empty cells yielded a critical region that exceeded the expected size of 0.05 for most values of the null parameter and sample size (Figure 3). Only for the parameter value of 0.1 did the critical region fail to achieve a level of 0.05.

The choice of a constant for addition to empty cells produced variable effects on the Type I error rate when the joint response function was requested. For null parameter of 0.5, the addition of 0.01 produced a critical region size that was much larger than that associated with the addition of 0.5 for sample sizes greater than seven (Figure 4). The greatest effect of the different additions is seen when the population parameter is 0.1 (Figure 5). For all null parameters, the addition of 0.01 produced a Type I error rate that was generally greater than that for the addition of 0.5.

The results of the simulation approach indicate that when using the logit function, the size of the critical region approached the nominal 0.05 when sample sizes were greater than forty and the null parameter value was 0.3 or 0.5 (Figure 6). An alpha size test is approximated when sample sizes approached 120 with a null parameter of 0.1. For the joint response function, the critical region remained greater than the expected size of 0.05 until sample sizes reached eighty (Figure 7). Since there was little difference in the size of the critical region associated with the addition of the two small values to empty cells for either response function at these sample sizes, no illustration is given.

DISCUSSION

Our results indicate that for the categorical model studied, the CATMOD procedure is conservative for the test of an interaction effect when using the default logit response function with small sample sizes. For sample sizes less than twenty, the Type I error rate was below the expected five percent for most combinations of sample size and null parameter value. The procedure is especially conservative for small values of the null parameter. The value of 0.01 added to empty cells when using the logit response function produced a slightly reduced critical region size compared to that of the SAS default addition of 0.5. The most notable difference between the addition of the two values occurred when sample sizes of the populations were less than nine.

Specification of the joint response function produces critical regions not only larger than those associated with the logit but often greater than the nominal value of 0.05. This would indicate an overly liberal test. Furthermore, the Type I error rate increases as the value of the constant added to empty cells is decreased. The effect of the value added to empty cells becomes more profound when the null parameter is less than 0.5.

Sample size needs to be twenty or greater to achieve an alpha size test when using the logit response function with null parameter of 0.3 or 0.5. With null parameters of 0.3 or 0.5, the joint response function consistently produced a critical region size greater than 0.05 until sample size exceeded sixty. With a small population proportion, such as 0.1, the sample size needs to be at least sixty, that is 240 total observations, to achieve an alpha size critical region when using the logit response function, whereas the sample size needs to be forty, that is 160 total observations, when using the joint response function.

In the CATMOD chapter of the SAS Statistics manual, SAS extends a caution to users that leaves the reader with the impression that all CATMOD results are invalid for small sample sizes. This caution may be misleading to researchers. Our results indicate that caution should be used in interpreting nonsignificant results when using the default logit response function, but that significant results should not be dismissed as invalid. Conversely, significant results obtained when using the joint response function should be viewed with caution, since the size of the critical region is much larger than the nominal value of 0.05 for sample sizes less than sixty.

REFERENCES

Grizzle, J.E., Starmer, C.F. and Koch, G.G. [1969]. Analysis of categorical data by linear models. Biometrics 25, 489-504.

Neyman, J. [1949]. Contribution to the theory of the χ² test. Pp. 238-273 in: Proc. Berkeley Symp. Math. Statist. Prob. University of California Press, Berkeley and Los Angeles.

SAS Institute Inc. [1985]. SAS User's Guide: Statistics, Version 5 Edition. Cary, NC: SAS Institute Inc.

Wald, A. [1943]. Tests of statistical hypotheses concerning several parameters when the number of observations is large. Trans. Amer. Math. Soc. 54, 426-482.

FIGURE 1. ENUMERATION USING THE LOGIT RESPONSE FUNCTION WITH SAS DEFAULT VALUE OF 0.5 ADDED TO EMPTY CELLS
(Horizontal axis: sample size of population, 2 to 20. One curve for each π value: 0.1, 0.3, 0.5.)

FIGURE 2. A COMPARISON BETWEEN TWO VALUES ADDED TO EMPTY CELLS WHEN USING THE LOGIT FUNCTION
(Horizontal axis: sample size of population. Curves: P0 = 0.1, 0.3, and 0.5, each with the default value added and with 0.01 added.)

FIGURE 3. ENUMERATION METHOD USING THE JOINT RESPONSE FUNCTION AND 0.5 ADDED TO EMPTY CELLS
(Horizontal axis: sample size of population. One curve for each π value: 0.1, 0.3, 0.5.)

FIGURE 4. A COMPARISON BETWEEN TWO VALUES ADDED TO EMPTY CELLS WHEN USING THE JOINT RESPONSE FUNCTION
(Horizontal axis: sample size of population. Curves: P0 = 0.5 with 0.01 added and P0 = 0.5 with 0.5 added.)

FIGURE 5. A COMPARISON BETWEEN TWO VALUES ADDED TO EMPTY CELLS WHEN USING THE JOINT RESPONSE FUNCTION
(Horizontal axis: sample size of population. Curves: P0 = 0.1 with 0.01 added and P0 = 0.1 with 0.5 added.)

FIGURE 6. SIMULATION METHOD USING THE LOGIT RESPONSE FUNCTION WITH SAS DEFAULT OF 0.5 ADDED TO EMPTY CELLS
(Horizontal axis: sample size of population, 20 to 120. One curve for each π value: 0.1, 0.3, 0.5.)

FIGURE 7. SIMULATION METHOD USING THE JOINT RESPONSE FUNCTION WITH 0.5 ADDED TO EMPTY CELLS
(Horizontal axis: sample size of population, 20 to 120. One curve for each π value.)
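
The simulation method used for the larger sample sizes can be sketched in the same spirit. The Python sketch below is a stand-in under stated assumptions rather than the authors' program: Python's `random` module replaces the SAS UNIFORM function, each success count is drawn by looking up one uniform number in a cumulative binomial table, and both response functions are approximated by a single (+1, -1, -1, +1) contrast, on the logits for LOGIT and on the cell proportions for JOINT, with `add` replacing only zero counts. The function names and the `seed` parameter are hypothetical.

```python
import random
from bisect import bisect_right
from math import comb, log

CONTRAST = (1, -1, -1, 1)  # interaction contrast over populations S = 1..4


def binomial_cdf(n, p):
    """Cumulative binomial probabilities for k = 0..n."""
    cdf, total = [], 0.0
    for k in range(n + 1):
        total += comb(n, k) * p**k * (1 - p) ** (n - k)
        cdf.append(total)
    return cdf


def draw_successes(cdf, rng):
    """Map one uniform draw to a success count through the cumulative table."""
    return min(bisect_right(cdf, rng.random()), len(cdf) - 1)


def wald_interaction(successes, n, response="logit", add=0.5):
    """Wald chi-square (1 df) for the interaction contrast.

    Assumption: only zero counts are replaced by `add` (must be positive).
    """
    estimate = 0.0
    variance = 0.0
    for c, s in zip(CONTRAST, successes):
        f = n - s
        s_adj = s if s > 0 else add
        f_adj = f if f > 0 else add
        if response == "logit":
            estimate += c * log(s_adj / f_adj)
            variance += 1.0 / s_adj + 1.0 / f_adj
        else:  # "joint": the response function is the cell proportion itself
            total = s_adj + f_adj
            p = s_adj / total
            estimate += c * p
            variance += p * (1.0 - p) / total
    return estimate**2 / variance


def simulated_type1_rate(n, p0, response="logit", add=0.5,
                         n_tables=2000, critical=3.84, seed=0):
    """Proportion of randomly drawn null tables whose statistic rejects."""
    rng = random.Random(seed)
    cdf = binomial_cdf(n, p0)
    rejections = 0
    for _ in range(n_tables):
        successes = [draw_successes(cdf, rng) for _ in range(4)]
        if wald_interaction(successes, n, response, add) >= critical:
            rejections += 1
    return rejections / n_tables


if __name__ == "__main__":
    for response in ("logit", "joint"):
        for p0 in (0.1, 0.3, 0.5):
            # Average three independent estimates, as described in METHODS
            runs = [simulated_type1_rate(40, p0, response, seed=s) for s in (1, 2, 3)]
            print(response, p0, sum(runs) / len(runs))
```

Using the same seed for both response functions draws the same tables for each, which mirrors the paper's statement that the same points were analyzed with both the joint and the logit response functions.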