Think Outside the Box: Analysis of Categorical Data
Margaret Ann Goetz, Quintiles, Inc. Arlington, VA

ABSTRACT
Analysis of categorical data has many applications in pharmaceutical research. Outcomes such as presence or absence of a tumor, or a positive or negative reaction to a particular medication, may be considered an endpoint of interest in clinical trials. While many researchers are familiar with the 2x2 contingency table, there are other analyses to consider when presenting results based on a discrete outcome. This paper will present a brief overview of some of the statistical distributions commonly encountered in these analyses. Discussion of analysis strategies will include an overview of the chi-square test, Fisher’s Exact test, and the Cochran-Mantel-Haenszel test for binomial proportions. In addition, an introduction to van Elteren’s Test for ordinal variables will be presented. The discussion of each test highlights its common applications. Examples using SAS®, along with references for further study, are provided.

INSIDE THE BOX
In the analysis of pharmaceutical data, the basic 2x2 contingency table can be used for some of the most commonly requested presentations of categorical data. Questions such as “how many patients present with a successful response to a particular drug,” or “is there a difference in the number of patients who survive following treatment?” can be presented using the format:

Figure 1
             Trt A    Trt B    Total
Outcome 1    n11      n12      n1+
Outcome 2    n21      n22      n2+
Total        n+1      n+2      N

The outcome is typically a response or some measure of relative success, for which the underlying assumption is of a mutually exclusive categorization (e.g., survival vs death). The counts in each cell are represented by nxy. Row and column marginal totals (n1+, n2+, n+1, n+2) and an overall total sample size (N) are also calculated.

However, in a typical controlled clinical trial, there may be more than 2 treatment arms.
Treatment A, for instance, might be the group receiving study drug at dose level 1, Treatment B might be the group receiving study drug at dose level 2, and a third arm might represent a standard of care or placebo treatment. Many of the familiar statistics applied to 2x2 tables can be readily applied to more than two treatment groups, or more than two response categorizations. This type of table will be referred to as an s x r table, indicating the possibility of more than 2 treatment groups or more than 2 categories of response, although examples will focus primarily on the analysis of a dichotomous outcome (success/failure) when applied to more than two treatment groups.

The following discussions of these techniques and their underlying assumptions are far from exhaustive, but are designed to encourage researchers to think beyond the 2x2 contingency table for analysis of their data. With the exception of van Elteren’s test, this paper will be limited to situations where the response levels of an s x r table need not be ordinal, so as to avoid overextending the discussion.

BEHIND THE BOX: SAMPLING DISTRIBUTIONS
While it is easy to visualize the proportion of patients who fall into each of the 4 cells in a 2x2 contingency table, it is less intuitive but equally important to consider the assumptions behind the sampling distribution which generated these cell counts. In addition, the most common distributions for discrete data can be extended to s x r tables. While a complete exploration would require more extensive discussion, the following is an introduction to the ideas behind applying distributions to s x r tables.

Binomial Distribution: The familiar ‘heads or tails’ coin example is often used to illustrate the binomial distribution.
In a binomial outcome, the underlying assumption is one of independence: each individual under consideration falls into exactly one of two mutually exclusive outcomes or responses, with probabilities often denoted p and 1-p, where p is the probability of success in each of n independent Bernoulli trials, and 1-p, or q, is the probability of failure. The binomial distribution describes the number of successes in n trials. Every individual is assumed to have the same probability p of success. The binomial random variable can be approximated by the normal random variable with mean np and standard deviation (npq)^(1/2), provided npq > 5 and 0.1 ≤ p ≤ 0.9, or min(np, nq) > 10 [Evans, 1993]. This relationship can prove critical to a clinical trial researcher who is considering the analysis of a binomial outcome.

Beyond the 2x2 table: A generalization of the binomial distribution is the multinomial distribution, which allows patients to be categorized into more than two response groupings. The categories must continue to be mutually exclusive and exhaustive, with probabilities pi, i = 1, ..., k. The marginal distributions are also multinomial. When N is large and all variances are large, the multinomial will approximate the multivariate normal distribution (see Zelterman, 1999, for a complete description of the properties of this distribution).

Hypergeometric distribution: The hypergeometric distribution is frequently used where data with small sample sizes are being analyzed, and the number of successes out of a total N is being considered. Most frequently, the hypergeometric distribution is applied in the generation of exact tests of hypothesis for count data using a 2x2 table, in which every possible sample outcome can be enumerated for a particular set of count data [Zelterman, 1999].
The analysis is conditional on fixed row and column marginal totals (n1+, n2+, n+1, n+2, as shown in Figure 1). With these totals fixed, the probability of each possible table can be calculated:

\[
\Pr\{n_{ij}\} = \frac{n_{1+}!\; n_{2+}!\; n_{+1}!\; n_{+2}!}{N!\; n_{11}!\; n_{12}!\; n_{21}!\; n_{22}!}
\]

[Stokes, Davis, and Koch, 1995, p. 23], where N is the overall total sample size.

Beyond the 2x2 table: The multivariate hypergeometric distribution is the extension of the hypergeometric distribution to tables larger than 2x2, and can be used to provide exact inference on an s x r table conditional on marginal totals.

Poisson distribution: The Poisson distribution is often described as the limit of the binomial distribution. It is frequently applied in cases where N is considered large and p is very small (e.g., the rate of a rare disease under study in a large population). The researcher may encounter an application of this distribution in Poisson regression, in which the response counts are assumed to follow a Poisson distribution rather than the normal distribution assumed in linear regression. This model can be fit using PROC GENMOD in SAS, where DIST = POISSON, as well as the log link function, must be specified to model these count data.

ANALYSIS STRATEGIES USING SAS®
As depicted in Figure 1, the analysis of clinical trial data may involve a presumed association between a subject’s random assignment into a treatment group (study drug or placebo) and the outcome of the trial (success or failure). To appropriately determine if such an association is present, certain assumptions for the data and for the analytic techniques being applied must be met: data must be assumed to be drawn using appropriate randomization techniques; distributional assumptions must be met; and sample and individual cell size must be sufficient.

PROC FREQ: In SAS®, PROC FREQ can be used to generate the following test statistics of interest in conjunction with a 2x2 table.
In each instance, given certain assumptions and other criteria as stated, these statistics can be generalized to an s x r table.

Pearson chi-square statistic: This test statistic is based on the difference between the observed and expected values in each cell of a 2x2 crosstabulation. A standard rule of thumb for application of this statistic is that the expected value in each cell should exceed five. The calculation of the difference between observed and expected values can be extended to each cell of an s x r crosstabulation, in which the response levels do not need to be ordinal. In the SAS® output, this statistic is labeled ‘Chi-Square’ and includes the value of the test statistic, degrees of freedom, and the p-value associated with the test statistic.

The following is an example of crosstabulation output for a 2 x 3 table. “Yes” vs “No” is a response indicating a particular outcome of interest.

Figure 2

Frequency|  Treatment Group
Row Pct  |
Col Pct  |       1|       2|       3|  Total
---------+--------+--------+--------+
Yes      |     13 |     12 |     16 |     41
         |  31.71 |  29.27 |  39.02 |
         |  65.00 |  60.00 |  80.00 |
---------+--------+--------+--------+
_No      |      7 |      8 |      4 |     19
         |  36.84 |  42.11 |  21.05 |
         |  35.00 |  40.00 |  20.00 |
---------+--------+--------+--------+
Total          20       20       20       60

The statistical testing associated with the 2 x 3 output above confirms what is clear from a review of the crosstabulation. There is no indication allowing the researcher to reject a null hypothesis of no association between treatment group and outcome for these data. The Pearson chi-square statistic is compared to the critical chi-square value with (s-1) x (r-1) degrees of freedom, and has a relatively small value of 2.003 (p=0.367). In the statistical output associated with Figure 2, there is no issue of small expected cell counts to consider. Fisher’s exact test is associated with a p-value of 0.470.
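As a check on the reported value, the Pearson statistic for Figure 2 can be worked by hand. The symbols O_ij (observed count), E_ij (expected count), and Q_P are introduced here for illustration only; the counts are those of Figure 2.

```latex
\[
E_{ij} = \frac{n_{i+}\, n_{+j}}{N}, \qquad
E_{\mathrm{Yes},j} = \frac{41 \times 20}{60} \approx 13.667, \qquad
E_{\mathrm{No},j} = \frac{19 \times 20}{60} \approx 6.333
\]
\[
Q_P = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
    = \frac{0.444 + 2.778 + 5.444}{13.667} + \frac{0.444 + 2.778 + 5.444}{6.333}
    \approx 0.634 + 1.368 \approx 2.003
\]
```

The degrees of freedom are (2-1) x (3-1) = 2, and all six expected counts exceed five, so the rule of thumb for the chi-square test is satisfied for these data.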
STATISTICS FOR TABLE OF OUTCOME BY TRTGRP

Statistic                      DF     Value     Prob
----------------------------------------------------
Chi-Square                      2     2.003    0.367
Likelihood Ratio Chi-Square     2     2.085    0.353
Mantel-Haenszel Chi-Square      1     1.022    0.312
Fisher’s Exact Test (2-Tail)                   0.470
Phi Coefficient                       0.183
Contingency Coefficient               0.180
Cramer’s V                            0.183

Sample Size = 60

Fisher’s Exact Test: Fisher’s exact test uses the hypergeometric distribution to produce a p-value which is the sum of the probabilities of the observed crosstabulation and of all more extreme tables with the same row and column totals. The use of Fisher’s exact test should be considered if the expected frequencies of each cell in the crosstabulation are not at least five. Note that Fisher’s exact test is always appropriate, even when the sample size is large [Stokes, Davis, and Koch, 1995], and that Fisher’s exact test is considered a non-parametric test [Walker, 1997]. In SAS®, this test and its associated probability value will be printed automatically for all 2x2 tables when the CHISQ option is requested. For 2x2 tables where the expected frequencies of each cell are not at least five, SAS will also output a warning message indicating that chi-square may not be a valid test.

For tables larger than 2x2, use of the EXACT option following the TABLES statement will include Fisher’s exact test as part of the output, which is produced using the network algorithm given by Mehta and Patel [see SAS/STAT User’s Guide, Volume 1, page 889, for reference information on this computational algorithm].
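The two-sided p-value described above can be written compactly. This is a sketch of one common convention, in which "more extreme" means "no more probable than the observed table"; the symbols n^obs (the observed table) and T (the set of all tables sharing the observed row and column totals) are introduced here for illustration, and Pr{n} is the fixed-margin hypergeometric probability of a table.

```latex
\[
p \;=\; \sum_{\substack{\mathbf{n} \in \mathcal{T} \\ \Pr\{\mathbf{n}\} \,\le\, \Pr\{\mathbf{n}^{\mathrm{obs}}\}}} \Pr\{\mathbf{n}\}
\]
```

Because the sum runs over every table in T, the computation grows quickly with sample size and table dimension, which is why an efficient enumeration scheme is needed for tables larger than 2x2.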
Notes on presentation: Briefly, PROC FREQ and the TABLES statement can be used to display the 2x2 crosstabulation table, showing the familiar cell frequencies for the two variables TRTGRP (study drug) and OUTCOME, which are separated by an asterisk:

    proc freq;
      tables outcome*trtgrp;
    run;

If a two-way TABLES statement is requested with no options specified, the default will print cell frequencies, cell percentages of the total frequency, cell percentages of row frequencies, and cell percentages of column frequencies. As certain percentages are not always useful in interpreting an s x r table, percentages can also be suppressed: cell percentages of the total frequency (NOPERCENT option); column percentages (NOCOL option); or row percentages (NOROW option).

The following code was used to generate the 2 x 3 table output in Figure 2. Included in the output request is the EXACT option, which will generate Fisher’s exact test for the 2 x 3 table. Since only cell counts are available rather than a patient-level dataset, the WEIGHT statement is used in conjunction with PROC FREQ to populate the cells with counts:

    data test;
      input trtgrp outcome $ count @@;
      cards;
    1 Yes 13 1 _No 7 2 Yes 12 2 _No 8 3 Yes 16 3 _No 4
    ;
    run;

    proc freq data=test;
      tables outcome*trtgrp / nopercent exact;
      weight count;
    run;

If analysis alone is the priority, or for very large s x r tables, another useful option on the TABLES statement is NOPRINT. This will suppress printing of the table itself, but will continue to generate the statistics requested. Cell frequencies can also be written to a new dataset, for display in a table or other format, using the OUT= option on the TABLES statement; statistics can be written to a dataset using the OUTPUT statement.
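The options just described can be combined as in the following minimal sketch. It assumes a SAS release that supports the OUTPUT statement in PROC FREQ; the dataset names CELLCOUNTS and STATS are hypothetical and chosen only for illustration, and the counts are those of Figure 2.

```sas
/* Illustrative sketch only: CELLCOUNTS and STATS are hypothetical names. */
data test;
  input trtgrp outcome $ count @@;
  datalines;
1 Yes 13  1 _No 7  2 Yes 12  2 _No 8  3 Yes 16  3 _No 4
;
run;

proc freq data=test;
  weight count;
  /* NOPRINT on TABLES suppresses the crosstabulation but not the statistics. */
  /* OUT= writes one row per cell (count and percent) to CELLCOUNTS.          */
  tables outcome*trtgrp / chisq noprint out=cellcounts;
  /* OUTPUT writes the chi-square statistics for the last TABLES request.     */
  output out=stats chisq;
run;

/* Inspect the saved statistics. */
proc print data=stats;
run;
```

Placing NOPRINT on the PROC FREQ statement instead would suppress all displayed output, which is useful when only the output datasets are wanted.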
Other statistics generated by PROC FREQ using the CHISQ option which have not been previously discussed include:
• Mantel-Haenszel chi-square, a measure of significance for the linear relationship between two ordinal variables;
• Likelihood ratio chi-square, which computes chi-square based on maximum likelihood estimation;
• Phi coefficient, contingency coefficient, and Cramer’s V, all measures of association derived from the chi-square statistic.

If response data were ordinal, it would be important to take this characteristic into account when selecting the appropriate test statistic. The Mantel-Haenszel chi-square would be better suited than the Pearson chi-square statistic to detect changes in the means across the levels of the row variable [Stokes, Davis, and Koch, 1995]. For more information on these tests, please see SAS/STAT® Software, page 866.

Cochran-Mantel-Haenszel test: Another extension of the chi-square test and PROC FREQ, the Cochran-Mantel-Haenszel (CMH) test can be used to assess the association between drug treatment and a binomial outcome or response while controlling for strata of interest to the clinical trial. Often, researchers will need to adjust for differences among study centers in a multi-site clinical trial. Other strata of interest could be gender or disease status (active, inactive) at onset of treatment.

Using the following code, Figure 2 can be broken into x individual 2x3 tables, where x represents the number of sites; the dataset must also contain a SITE variable identifying the study center. In this example, each 2x3 table generated represents treatment and response at one of 3 sites.
    proc freq data=test;
      tables site*outcome*trtgrp / cmh nopercent;
      weight count;
    run;

Individual crosstabulations at each site are presented, along with a Summary Statistics section displaying three summary test statistics controlling for site, their degrees of freedom, and associated p-values: the Mantel-Haenszel correlation statistic (labeled ‘Nonzero Correlation’), the ANOVA statistic (labeled ‘Row Mean Scores Differ’), and the general association statistic (labeled ‘General Association’). For a 2x2 table, all three statistics are interpreted in the same manner. For the 2 x r table in Figure 2, the general association statistic will have (s-1) x (r-1) degrees of freedom and is always interpretable because it does not require an ordinal scale for either row or column [SAS/STAT User’s Guide]. Therefore, we can look at the test of general association to determine if there is an association between treatment group and outcome, controlling for site.

If the response variable were on an ordinal scale, another statistic of interest from the CMH option in PROC FREQ would be the ANOVA statistic, which corresponds to a stratum-adjusted ANOVA or Kruskal-Wallis test [SAS/STAT User’s Guide]. The order of variables in the TABLES statement must be taken into consideration when interpreting this test.

Van Elteren’s Test: A lesser-known option for s x r tables with ordinal response data is van Elteren’s test. Similar to the CMH test described above, van Elteren’s test will assess the significance of the difference between treatment groups in the distribution of an ordinal response variable, adjusting for study center. Van Elteren’s test can be applied in the case where there are more than two treatment groups (as in the previous examples) and where the response variable is ordinally scaled (variable NEW_OUT in the following example).
    proc freq data=test;
      tables site*new_out*trtgrp / cmh scores=modridit;
    run;

The code will output individual s x r tables for each study site, followed by overall summary statistics displaying each test statistic controlling for site, their degrees of freedom, and associated p-values: the Mantel-Haenszel correlation statistic (labeled ‘Nonzero Correlation’), the ANOVA statistic (labeled ‘Row Mean Scores Differ’), and the general association statistic (labeled ‘General Association’). Total sample size is also displayed. As with the CMH test for ordinal variables, the order of variables in the TABLES statement must be taken into consideration when interpreting this test. Specifying SCORES=MODRIDIT produces a nonparametric analysis: modified ridit scores represent the expected values of the within-stratum order statistics and are derived from rank scores [SAS/STAT User’s Guide].

For more extensive information on these procedures, and the underlying assumptions, please see the References section.

CONCLUSION
There are many instances where clinical trial data analysis will yield categorical outcomes appropriate for either 2 x 2 or s x r crosstabulation procedures. PROC FREQ can very flexibly present the association between a binary response outcome variable and multiple treatment groups, as was demonstrated above, in general tests of association and in non-parametric exact tests. Additional options in PROC FREQ for ordinal variables, such as SCORES, and for special situations, such as controlling for SITE, are also available to the clinical trials investigator - outside the traditional 2 x 2 ‘box’.

REFERENCES
Agresti A. (1990). Categorical Data Analysis. New York: John Wiley.
Evans M, Hastings N, Peacock B. (1993). Statistical Distributions. 2nd edition. New York: John Wiley.
Zelterman D. (1999). Models for Discrete Data. Oxford: Oxford University Press.
SAS Institute Inc. (1997). SAS Technical Report P243. SAS/STAT® Software: The GENMOD Procedure. Cary, NC: SAS Institute Inc.
SAS Institute Inc. (1997). SAS/STAT® Software: Changes and Enhancements through Release 6.12. Cary, NC: SAS Institute Inc.
SAS Institute Inc. (1990). SAS/STAT® User’s Guide: Volume 1, Version 6. 4th edition. Cary, NC: SAS Institute Inc.
Stokes M, Davis C, Koch G. (1995). Categorical Data Analysis Using the SAS® System. SAS Institute Inc.
Walker G. (1997). Common Statistical Methods for Clinical Research with SAS® Examples. SAS Institute Inc.

Key words: Categorical data, binomial distribution, chi-square statistic, PROC FREQ, Fisher’s Exact Test, Cochran-Mantel-Haenszel.

SAS and SAS/STAT are registered trademarks or trademarks of SAS Institute, Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

Margaret Ann Goetz, M.P.H.
Senior Biostatistician
Quintiles, Inc.
1300 North 17th Street, Suite 300
Arlington, VA 22209-3801
email: [email protected]