Wilfried Karmaus, LCS 829 [email protected] January 2003

Required sample sizes for analyses based on continuous, categorical, and time-to-event outcomes

Download the 4 SAS files and the 2 files needed to set up the programs SSIZE and PS from http://www.msu.edu/course/lcs/829/

You may work in groups, but every student has to provide her/his own printouts and description of the results. For each of the three assignments, please provide the computer output; for the SAS programs, also provide the log.

For TASK 1 and TASK 2 you need to work with SAS. Four SAS files demonstrate how to calculate power for testable hypotheses using PROC GLM and PROC CATMOD and four SAS probability functions: FINV and PROBF, CINV and PROBCHI. The tests of the hypotheses are based on different F- or χ²-tests. The power of, e.g., a particular F-test (the probability that a particular F-statistic will exceed its critical value) is determined by the population means of the groups of interest, their common variance, and their sample sizes.

The SAS analyses are executed in two steps:

Step I: for continuous outcomes, PROC GLM generates the sum of squares for each hypothesis of interest (SSH); for categorical outcomes, PROC CATMOD generates G² for each hypothesis of interest.

Step II: for continuous outcomes, SSH from step I is entered into a program that calculates the statistical power (probability); for categorical outcomes, G² from step I is entered into a program that calculates the statistical power (probability).

TASK 1 – Sample size estimation for a continuous outcome

Step I

In this example, calculate the statistical power for detecting differences in birth weight due to lead (Pb) exposure, statistically controlling for the effects of lower socioeconomic status (SES) and smoking. See the file pow_weight1.sas. The numbers in the table are best guesses from previous studies. Six rows are entered with the following structure: INPUT lead SES $ smoking N weight;
($ indicates a character value: low, high.)

The first row

    0 high 0 200 3300

stands for 'no lead exposure', 'high SES', 'no smoking', a sample of N=200, and an estimated birth weight of 3300 grams. The second row

    0 high 1 100 3100

assumes 'no lead exposure', 'high SES', 'smoking', a sample of N=100 (30% of pregnant women are smokers), and a reduction of birth weight to 3100 grams.

The data are used in PROC GLM. The FREQ N statement in PROC GLM causes each case (each row above) to be duplicated N times. Because N can vary from group to group, unbalanced designs are easy to specify. The CLASS statement in PROC GLM lists the independent variables that are to be treated as categorical. Finally, a MODEL statement is specified.

The results of PROC GLM show the 'Type III SS', the sum of squares due to each hypothesis: effect of lead, ..., and lead*smoking. We use the 'Type III SS' because it represents the sum of squares for each independent variable after controlling for all other predictors (the other independent variables). This is a variable-added-last test: each variable is added last, after statistically controlling for the others.

The printout shows:

    sum of squares due to the lead hypothesis (SSH): 688095.24
    associated degrees of freedom (DFH):             1
    error degrees of freedom (DFE):                  895

Step II

The SAS statements to calculate the power probabilities are in the file pow_weight2.sas. The INPUT statement includes a variable 'NAME' used to identify each hypothesis; in this example these are ‘lead’, ‘smoking’, ‘SES’, and the combined effect of lead and smoking. However, you may restrict the program to one hypothesis, controlled for the other predictors. The '$' that follows NAME indicates a character value.

We enter the values of SSH, DFH, and DFE as defined in step I, and an α-value. In this case, we set α = 0.05. This means that, if the null hypothesis is true, we accept a 5% probability of rejecting it by mistake (type I error).
Additionally, we have to enter SIGMA, the standard deviation of birth weight. In this case we assume 500 grams. SIGMA, as well as the distribution entered in step I, may be varied to see how a different scenario affects the power analysis.

A critical value FCRIT for the null distribution (null hypothesis) is determined with the FINV function from SAS.

“The FINV function returns the pth quantile from the F distribution with numerator degrees of freedom ndf (DFH), denominator degrees of freedom ddf (DFE), and noncentrality parameter nc. The probability that an observation from the F-distribution is less than the quantile is p. This function accepts non-integer degrees of freedom parameters ndf and ddf. The FINV function is the inverse of the PROBF function. If the optional parameter nc is not specified or has the value 0, the quantile from the central F distribution is returned. The noncentrality parameter nc is defined such that if X and Y are normal random variables with means µ and 0, respectively, and variance 1, then X²/Y² has a noncentral F-distribution with nc = µ². For large values of nc, the algorithm could fail; in that case, a missing value is returned.” Copyright (c) 1995, SAS Institute Inc., Cary, NC 27513-2414 USA. All rights reserved.

In order to compute power, it is necessary to compute a noncentrality parameter λ. This parameter measures how far the conjectured population means deviate from the particular null hypothesis. If the 'sample' means are identical to the conjectured population means, then the sum of squares due to the hypothesis, SSH, for any testable hypothesis is directly related to the noncentrality parameter:

    λ = SSH / σ²

where σ² is the conjectured population variance. FNC (λ) is calculated using this formula.

Finally, power is calculated with the PROBF function and the noncentrality parameter FNC. The program uses a loop to calculate the power for different sample sizes, from about n=180 to n=3,600 (do x= 0.2 to 4 by 0.1).
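The calculation described above can be sketched in a few lines of SAS. This is an illustrative sketch for the lead hypothesis only, not the contents of pow_weight2.sas. It assumes a total of N = 900 in step I (inferred from the loop range, since 0.2 × 900 = 180 and 4 × 900 = 3,600), that SSH grows in proportion to the sample size, and that the error degrees of freedom are rescaled accordingly. The data set and variable names (lead_power, n_base, dfe_x) are made up for the illustration.

```sas
/* Sketch of step II for the lead hypothesis; inputs taken from the step I printout */
data lead_power;
  ssh    = 688095.24;  /* Type III SS for lead from PROC GLM                  */
  dfh    = 1;          /* hypothesis degrees of freedom                       */
  dfe    = 895;        /* error degrees of freedom of the step I design       */
  alpha  = 0.05;       /* type I error rate                                   */
  sigma  = 500;        /* assumed SD of birth weight in grams                 */
  n_base = 900;        /* ASSUMPTION: total N in step I, inferred as 180/0.2  */

  do x = 0.2 to 4 by 0.1;
    n        = round(n_base * x);           /* total sample size for this scenario     */
    dfe_x    = n - (n_base - dfe);          /* ASSUMPTION: error df rescaled to n      */
    fnc      = x * ssh / sigma**2;          /* lambda = SSH / sigma**2, scaled by x    */
    fcrit    = finv(1 - alpha, dfh, dfe_x); /* critical value from the central F       */
    lead_pow = 1 - probf(fcrit, dfh, dfe_x, fnc); /* power under the noncentral F      */
    output;
  end;

  keep n fcrit fnc lead_pow;
run;

proc print data=lead_power noobs;
run;
```

Changing sigma or ssh in this sketch shows directly how a different scenario shifts the sample size at which the intended power is reached.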
You have to compute power values with a separate name for each independent variable (lead_pow, SES_pow, ...). With PROC PRINT you can inspect the results. The power is shown for the different independent variables and different sample sizes.

Please mark the intended power and the sample sizes for the different exposures (lead, SES, smoking, combined effect of lead * smoking) in the computer printout.

The data are then used to plot the results with PROC GPLOT and some goptions. This part is not explained further. Apply it as it is and do not change it unless you are familiar with SAS graphics. The sample size is for the total population.

Please mark the intended power and the sample sizes for the different exposures (lead, SES, smoking, combined effect of lead * smoking) in the graph. Please provide one sentence to describe the required sample size for each of the four associations.

Now repeat the sample size estimation for birth weight and lead exposure with SSIZE. Choose “Continuous response variable”. Use:

    alpha = 0.05
    power = 80%
    sigma (σ) = 500 grams (the assumed standard deviation of birth weights)

We expect a difference between the two means of 50 grams (3250 grams for lead exposure and 3300 grams as normal birth weight).

Please provide a short description of your results. What does “example” say about the sample size? Total group or per group?

Please compare the results of the two power estimations for birth weight (SAS and SSIZE) and describe the comparison in one paragraph. Why is the estimated sample size from the SAS program different?

TASK 2 – Sample size estimation for a dichotomous outcome

Step I

The statements to produce lambda are in the program “pow_peak1b.sas”. In this example, you calculate the power for detecting differences in bronchial hyperreactivity (0-1 = no vs. yes) by exposure “expo”, statistically controlling for the effect of lower socioeconomic status (SES). From the power you decide which sample size is required to test the hypothesis.
The following scenario describes the assumptions.

    Expo                              0      1      0      1
    SES                               lo     lo     hi     hi
    Proportion of the total sample    0.3    0.3    0.2    0.2
    Proportion with the disease       0.06   0.08   0.07   0.16

In the file “POW_PEAK1b.sas” the data set “TABLE1” is produced with an INPUT statement: INPUT expo SES $ cellprob PRNONX, where PRNONX stands for the probability of a disease or adverse outcome. The first row

    0 lo 0.3 0.06

stands for 'no exposure', 'low SES', a proportion of 30% of the total sample in this cell, and an estimated probability of bronchial hyperreactivity of 6% (see the table above). The frequencies for the two OUTCOME levels (0-1 = no disease vs. disease) are produced with the OUTPUT statement.

In the example, it is assumed that the association between exposure and bronchial hyperreactivity is confounded by SES. Provide a short paragraph that describes how the association is confounded.

Altogether, 60% of the sample are in the lower SES group and 40% in the higher SES group. The exposed group makes up 50% of the sample, with an expected proportion of hyperreactivity of 12%, i.e. (0.08 + 0.16)/2. For the unexposed group, a proportion of diseased of (0.06 + 0.07)/2 = 0.065 is expected.

The data are used in PROC CATMOD. The WEIGHT POPFREQ1 statement in PROC CATMOD causes each case (each row above) to be duplicated POPFREQ1 times. Because POPFREQ1 varies from group to group, different designs can be specified. We estimate two models: one full model containing SES and expo, and one model with SES only. The difference between the two models indicates whether the model gains significant information when the expo variable is added.

The results of PROC CATMOD show the '-2log Likelihood' for each model after several iterations. You have to compute the difference between the '-2log Likelihood' values of the two models.
    Full model:      '-2log Likelihood' = 55.78
    Only SES model:  '-2log Likelihood' = 54.55
    Difference:      1.23

Under some mild regularity conditions, the difference, called G², is asymptotically distributed as a chi-square random variable. The degrees of freedom (DF) are: DF = DF of the full model - DF of the reduced model (here the SES model) = 1 for the variable “expo”. The statistic lambda is defined as G²:

    λ = G²

TASK 2, Step II

The SAS statements needed to calculate the power probabilities are in the file “pow_peak2b.sas”. You have to enter the DF (in this case 1) and the alpha level (0.05):

    df = 1; alpha = 0.05;

When the sample size N increases or decreases by a multiplicative factor m, then λ is affected in the same way:

    λ(m*N) = m*λ(N)

Based on this relation, the noncentrality parameter can be estimated for different sample sizes with a loop from 2 to 12. The loop produces sample sizes from 2 * 100 (n=200) to 12 * 100 (n=1200).

    do i= 2 to 12 by 1;
    g2popv = 1.23;
    popsize = 100 * i;
    lambda = g2popv * i;

The critical value of the chi-square distribution for the null hypothesis, “CCRIT”, is determined with the CINV function from SAS. Then power is calculated with the PROBCHI function and the noncentrality parameter lambda.

    CCRIT = CINV(1-ALPHA,DF,0);
    POWER = 1 - PROBchi(CCRIT,DF,LAMBDA);
    POWERg = ROUND (POWER, .01);

The results of these calculations are used with PROC GPLOT and some goptions to produce a graph of the statistical power for different sample sizes. This PROC GPLOT step is not explained. Apply it as it is and do not change it unless you are familiar with SAS graphics. Print the graph and mark the total sample size that would be required to test the hypothesis.

Now repeat the sample size estimation for bronchial hyperreactivity and exposure with the computer program PS. Choose “Dichotomous”. In “What do you want to know?” choose “Sample size”. This is not a matched design. This is not a case-control study. You may choose prospective, although this is a cross-sectional study.
The alternative hypotheses are expressed as proportions (see the scenario table in TASK 2, Step I). Use Fisher’s exact test. Use alpha = 0.05 and a power of 0.80. Calculate the proportion that develops bronchial hyperreactivity in the exposed and in the unexposed group (ignore the stratification by SES). Identify the ratio m of exposed to unexposed participants given in the scenario table in TASK 2, Step I. Click Calculate to get the required sample size. Click Graphs to get a graphical presentation. Choose “Power” and a maximum sample of n=800 (type 800 in the lower left box).

The graphical presentation allows you to change some of the assumptions. Say, for instance, that only 10% in the exposed group develop bronchial hyperreactivity. Please provide a printout of the graph for this assessment. How does the graph indicate the sample size? Is it the total group or the group with exposure? (In the graph it is called the “case sample size”. This is an inappropriate description for this example. It would be correct for a case-control approach.)

Please compare the results of the two power estimations (SAS and PS) for “bronchial hyperreactivity” and describe the comparison in one paragraph. Why is the estimated sample size from the SAS program different?

TASK 3 – Sample size estimation for a time-to-event outcome (survival analysis)

You want to investigate whether in utero exposure to PCB increases the waiting time to conception in adult female offspring who attempt to conceive. To determine time-to-pregnancy (TTP) you will apply an ELISA (pregnancy test) to urine samples of the adult female offspring; the samples need to be collected approximately 3 days before the expected next menstrual bleeding.

You have the choice:

(1) To follow up a sample of n women for 9 months. In this case, you have to collect n * 9 urine samples. Additionally, you have to correct for attrition (loss to follow-up) of approximately 20% and start with a larger sample.

(2) To follow up a sample of n women for 5 months.
In this case, you have to collect n * 5 urine samples. Attrition is assumed to be lower (10%) than in option (1), because the women are followed for only 5 months.

The assumption is that the hazard ratio (relative risk) is 0.7. This indicates that the fecundability (or fertility) in the exposed group is reduced by 30%. The median time until conception is assumed to be 3.4 months. The time to enroll all study subjects is 12 months. Use alpha = 0.05 and a power of 0.80. The study will recruit equal numbers of female offspring exposed in utero and of non-exposed offspring.

Calculate the required sample sizes. Please provide a printout of the graph. What is the sample size indicated in the PS program? Is it the total group or the group with exposure? (In the graph it is called the “experimental sample size”.)

How many urine samples do you have to collect for an observation period of 9 months with the appropriate sample size? Adjusting for loss to follow-up, how many women need to be recruited if you use 9 months of follow-up?

How many urine samples do you have to collect for an observation period of 5 months with the appropriate sample size? Adjusting for loss to follow-up, how many women need to be recruited if you use 5 months of follow-up?
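
The bookkeeping for the two follow-up options can be organized in a short SAS DATA step. The sketch below is only an illustration under stated assumptions. The values of n_required are placeholders and must be replaced by the sample sizes you obtain from PS. The attrition adjustment n / (1 - attrition), rounded up, is one common convention assumed here, not a rule given in this assignment. The data set and variable names are hypothetical.

```sas
/* Sketch: urine samples and recruitment targets for both follow-up options */
data followup;
  input months attrition n_required;
  /* n_required: PLACEHOLDER values only; replace with the PS results       */
  urine_samples = n_required * months;             /* n * months, as in the task text */
  n_recruit     = ceil(n_required / (1 - attrition)); /* ASSUMED attrition adjustment */
  datalines;
9 0.20 200
5 0.10 300
;
run;

proc print data=followup noobs;
run;
```

If your instructor prefers a different attrition correction (for example, inflating n by the attrition percentage), only the n_recruit line needs to change.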