Chapter 6 SAMPLE SIZE ISSUES
Ref: Lachin, Controlled Clinical Trials 2:93-113, 1981.

Sample Size Issues
• Fundamental point: a trial must have sufficient statistical power to detect differences of clinical interest
• A high proportion of published negative trials do not have adequate power
  – Freiman et al, NEJM (1978): 50/71 could miss a 50% benefit

Example: How many subjects?
• Compare new treatment (T) with a control (C)
• Previous data suggest a control failure rate $P_C$ ~ 40%
• Investigator believes treatment can reduce $P_C$ by 25%, i.e. $P_T = .30$, $P_C = .40$
• N = number of subjects per group?

• Estimates are only approximate
  – Uncertain assumptions
  – Over-optimism about treatment
  – Healthy screening effect
• Need a series of estimates
  – Try various assumptions
  – Must pick the most reasonable
• Be conservative, yet be reasonable

Statistical Considerations
• Null hypothesis ($H_0$): no difference in the response exists between treatment and control groups
• Alternative hypothesis ($H_A$): a difference of a specified amount ($\delta$) exists between treatment and control
• Significance level ($\alpha$), Type I error: the probability of rejecting $H_0$ given that $H_0$ is true
• Power = $1 - \beta$ ($\beta$ = Type II error): the probability of rejecting $H_0$ given that $H_0$ is not true

Standard Normal Distribution
[Figure not reproduced. Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.]

Standard Normal Table
[Table not reproduced. Ref: Brown & Hollander, 1977.]

Distribution of Sample Means (1)-(4)
[Four figures not reproduced. Ref: Brown & Hollander, 1977.]
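The standard normal table and the distribution of sample means referenced above can also be evaluated numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper names (`upper_tail`, `critical_value`, `sample_mean_dist`) and the numbers in the usage lines are illustrative assumptions, not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def upper_tail(z):
    """P(Z > z) for a standard normal Z (the area a normal table lists)."""
    return 1.0 - STD_NORMAL.cdf(z)


def critical_value(alpha, two_sided=True):
    """z with P(|Z| > z) = alpha (two-sided) or P(Z > z) = alpha (one-sided)."""
    tail = alpha / 2.0 if two_sided else alpha
    return STD_NORMAL.inv_cdf(1.0 - tail)


def sample_mean_dist(mu, sigma, n):
    """Distribution of the mean of n independent N(mu, sigma^2) observations."""
    return NormalDist(mu=mu, sigma=sigma / sqrt(n))


if __name__ == "__main__":
    print(round(critical_value(0.05), 3))                   # two-sided 5% level
    print(round(critical_value(0.05, two_sided=False), 3))  # one-sided 5% level
    print(round(upper_tail(1.96), 4))
    # Hypothetical values: the spread of the sample mean shrinks as n grows.
    for n in (1, 4, 16):
        print(n, sample_mean_dist(100.0, 15.0, n).stdev)
```

The standard deviation of the sample mean is $\sigma/\sqrt{n}$, which is why larger samples separate the null and alternative distributions more clearly.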
Distribution of Test Statistics
• Many have a common form
• $\theta$ = population parameter (e.g. difference in means)
• $\hat\theta$ = sample estimate
• Then $Z = [\hat\theta - E(\hat\theta)] / SE(\hat\theta)$
• And then Z has a Normal(0,1) distribution

• If the statistic z is large enough (e.g. falls into the red area of the scale), we believe this result is too large to have come from a distribution with mean 0 (i.e. $P_C - P_T = 0$)
• Thus we reject $H_0: P_C - P_T = 0$, accepting a 5% chance that a result this extreme could have come from a distribution with no difference

Normal Distribution
[Figure not reproduced. Ref: Brown & Hollander, 1977.]

Two Groups
• $H_0: \mu_C = \mu_T$, i.e. $\mu_C - \mu_T = 0$
$$Z = \frac{\bar X_C - \bar X_T}{\sigma\sqrt{2/n}} \sim N(0,1)$$

Test Statistics
• $\theta$ = population parameter
• $\hat\theta$ = sample estimate
$$Z = \frac{\hat\theta - E(\hat\theta)}{\sqrt{V(\hat\theta)}}$$

Test of Hypothesis
• z = test statistic, $z_\alpha$ = critical value
• Two-sided, e.g. $H_0: P_T = P_C$ vs. $H_A: P_T \ne P_C$
  – Classic test: if $|z| > z_\alpha$, reject $H_0$
  – $\alpha = .05$, $z_\alpha = 1.96$
• One-sided, e.g. $H_A: P_T < P_C$
  – If $z > z_\alpha$, reject $H_0$
  – $\alpha = .05$, $z_\alpha = 1.645$
• Recommend $z_\alpha$ be the same value in both cases (e.g. 1.96)
  – Two-sided $\alpha = .05$, $z_\alpha = 1.96$
  – One-sided $\alpha = .025$, $z_\alpha = 1.96$

Typical Design Assumptions (1)
1. $\alpha$ = .05, .025, .01
2. Power = .80, .90 (should be at least .80 for design)
3. $\delta$ = smallest difference hoped to be detected, e.g. $\delta = P_C - P_T = .40 - .30 = .10$, a 25% reduction

Typical Design Assumptions (2)
Two-sided significance level $\alpha$:   0.05   0.025   0.01
$Z_\alpha$:                              1.96   2.24    2.58

Power $1 - \beta$:                       0.80   0.90    0.95
$Z_\beta$:                               0.84   1.282   1.645

Sample Size Exercise
• How many do I need?
• Next question: what's the question?
• The reason is that sample size depends on the outcome being measured and the method of analysis to be used

One Sample Test for Mean
• $H_0: \mu = \mu_0$ vs. $H_A: \mu = \mu_0 + \delta$
$$1 - \beta = P\{\text{reject } H_0 \mid H_A \text{ is true}\} = P\left\{\frac{\bar Y - \mu_0}{\sigma/\sqrt{n}} > z_\alpha \,\Big|\, \mu_0 + \delta\right\} = P\left\{\frac{\bar Y - (\mu_0 + \delta)}{\sigma/\sqrt{n}} > z_\alpha - \frac{\delta}{\sigma/\sqrt{n}} \,\Big|\, \mu_0 + \delta\right\}$$
• $z_\alpha$ = constant associated with $P\{|Z| > z_\alpha\} = \alpha$ (two-sided)

One Sample Test for Mean
• Under the alternative hypothesis that $\mu = \mu_0 + \delta$, the statistic $\dfrac{\bar Y - (\mu_0 + \delta)}{\sigma/\sqrt{n}}$ follows a standard normal distribution.
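The power expression for the one-sample test above can be evaluated directly. The sketch below is one way to do so, assuming the same approximation as the slide (two-sided $z_\alpha$, counting only rejections in the direction of $\delta$); the function name `one_sample_power` and the design values in the usage loop are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def one_sample_power(delta, sigma, n, alpha=0.05):
    """Approximate power of the one-sample z-test against mu0 + delta.

    Uses the two-sided critical value z_alpha but only counts rejections
    in the direction of delta (delta > 0); the far tail is ignored.
    """
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = delta / (sigma / sqrt(n))  # mean of the test statistic under H_A
    return 1.0 - STD_NORMAL.cdf(z_alpha - shift)


if __name__ == "__main__":
    # Hypothetical design values: delta = 4.5, sigma = 15.
    for n in (50, 100, 117, 150):
        print(n, round(one_sample_power(delta=4.5, sigma=15.0, n=n), 3))
```

Setting this power equal to $1 - \beta$ and solving for n gives the closed-form sample size for the one-sample case.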
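One Sample Test for Mean
$$-z_\beta = z_\alpha - \frac{\delta}{\sigma/\sqrt{n}}$$
$$n = \frac{[z_\alpha + z_\beta]^2 \sigma^2}{\delta^2}$$

Simple Case - Binomial
1. $H_0: P_C = P_T$
2. Test statistic (normal approximation)
$$Z = \frac{\hat p_C - \hat p_T}{\sqrt{\bar p(1 - \bar p)(1/N_C + 1/N_T)}}$$
3. $\bar p = \dfrac{N_C \hat p_C + N_T \hat p_T}{N_C + N_T}$

Sample Size
• Assume $N_T = N_C = N$
• $H_A: \delta = P_C - P_T$

Sample Size Formula (1): Two Proportions, Simpler Case
$$N = \frac{2(Z_\alpha + Z_\beta)^2\, \bar p(1 - \bar p)}{\delta^2}$$
• $Z_\alpha$ = constant associated with $\alpha$: $P\{|Z| > Z_\alpha\} = \alpha$, two-sided (e.g. $\alpha = .05$, $Z_\alpha = 1.96$)
• $Z_\beta$ = constant associated with $1 - \beta$: $P\{Z < Z_\beta\} = 1 - \beta$ (e.g. $1 - \beta = .90$, $Z_\beta = 1.282$)
• Can also solve for $Z_\beta$ (and hence $1 - \beta$) or for $\delta$

Sample Size Formula (2): Two Proportions
$$N = \frac{\left[Z_\alpha\sqrt{2\bar p(1 - \bar p)} + Z_\beta\sqrt{P_C(1 - P_C) + P_T(1 - P_T)}\right]^2}{\delta^2}$$
• $Z_\alpha$ and $Z_\beta$ as defined for formula (1)

Sample Size Formula
• Power: solve for $Z_\beta$, which gives $1 - \beta$ (with $\bar q = 1 - \bar p$)
$$Z_\beta = \frac{\delta}{\sqrt{2\bar p\bar q/N}} - Z_\alpha$$
• Difference detected: solve for $\delta$
$$\delta = (Z_\alpha + Z_\beta)\sqrt{2\bar p\bar q/N}$$

Simple Example (1)
• $H_0: P_C = P_T$
• $H_A: P_C = .40$, $P_T = .30$, $\delta = .40 - .30 = .10$
• Assume $\alpha = .05$ (two-sided), $Z_\alpha = 1.96$; $1 - \beta = .90$, $Z_\beta = 1.282$
• $\bar p = (.40 + .30)/2 = .35$

Simple Example (2)
Thus
a. Using formula (2):
$$N = \frac{\left[1.96\sqrt{2(.35)(.65)} + 1.282\sqrt{(.3)(.7) + (.4)(.6)}\right]^2}{(.4 - .3)^2}$$
N = 476, 2N = 952
b. Using formula (1):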
$$N = \frac{2(1.96 + 1.282)^2 (.35)(.65)}{(.4 - .3)^2} = 478$$
N = 478, 2N = 956

Approximate* Total Sample Size for Comparing Various Proportions in Two Groups with Significance Level ($\alpha$) of 0.05 and Power ($1 - \beta$) of 0.80 and 0.90

True proportions            alpha = 0.05 (two-sided)    alpha = 0.05 (one-sided)
pC          pI              1-beta      1-beta          1-beta      1-beta
(Control)   (Intervention)  0.90        0.80            0.90        0.80
0.60        0.50            1040        780             850         610
0.60        0.40            260         200             210         160
0.60        0.30            120         90              90          70
0.60        0.20            60          50              50          40
0.50        0.40            1040        780             850         610
0.50        0.30            250         190             210         150
0.50        0.25            160         120             130         90
0.50        0.20            110         80              90          60
0.40        0.30            960         720             780         560
0.40        0.25            410         310             330         240
0.40        0.20            220         170             180         130
0.30        0.20            790         590             640         470
0.30        0.15            330         250             270         190
0.30        0.10            170         130             140         100
0.20        0.15            2430        1810            1980        1430
0.20        0.10            540         400             440         320
0.20        0.05            200         150             170         120
0.10        0.05            1170        870             950         690

*Sample sizes are rounded up to the nearest 10

Comparison of Means
• Some outcome variables are continuous
  – Blood pressure
  – Serum chemistry
  – Pulmonary function
• Hypothesis tested by comparison of mean values between groups, or comparison of mean changes

Comparison of Two Means
• $H_0: \mu_C = \mu_T$, i.e. $\mu_C - \mu_T = 0$
• $H_A: \mu_C - \mu_T = \delta$
• Test statistic for sample means from $N(\mu, \sigma)$:
$$Z = \frac{\bar X_C - \bar X_T}{\sqrt{\sigma^2(1/N_C + 1/N_T)}} \sim N(0,1) \text{ under } H_0$$
• Let $N = N_C = N_T$ for design
$$N = \frac{2(Z_\alpha + Z_\beta)^2 \sigma^2}{\delta^2} = \frac{2(Z_\alpha + Z_\beta)^2}{(\delta/\sigma)^2}$$
• Power
$$Z_\beta = \sqrt{N/2}\,(\delta/\sigma) - Z_\alpha$$

Example
$\sigma = 15$, e.g.
IQ
• Set $2\alpha = .05$ (two-sided level .05, $Z_\alpha = 1.96$), $\beta = 0.10$, $1 - \beta = 0.90$
• $H_A: \delta = 0.3\sigma = 0.3 \times 15 = 4.5$, so $\delta/\sigma = 0.3$
• Sample size
$$N = \frac{2(1.96 + 1.282)^2}{(0.3)^2} = \frac{2(10.51)}{(0.3)^2} = \frac{21.02}{(0.3)^2}$$
• N = 234, 2N = 468

Comparing Time to Event Distributions
• Primary efficacy endpoint is the time to an event
• Compare the survival distributions for the two groups
• Measure of treatment effect is the ratio of the hazard rates in the two groups (under exponential survival, the inverse ratio of the medians)
• Must also consider the length of follow-up

Assuming Exponential Survival Distributions
• If $P(T > t) = e^{-\lambda t}$, where $\lambda = \lambda_1$ in group 1 and $\lambda = \lambda_2$ in group 2, let
  – $H_0: \lambda_2 = \lambda_1$, i.e. $\lambda_1/\lambda_2 = 1$
  – $H_A: \lambda_2 \ne \lambda_1$, i.e. $\lambda_1/\lambda_2 \ne 1$
• Then define the effect size by $\Delta = \lambda_1/\lambda_2 = med_2/med_1$, where $med_i = -\ln(.5)/\lambda_i$
• Standardized difference: $\ln(\Delta)/\sqrt{2}$

Time to Failure (1)
• Use a parametric model for sample size
• Common model: exponential
  – $S(t) = e^{-\lambda t}$, $\lambda$ = hazard rate
  – $H_0: \lambda_I = \lambda_C$
  – Estimate N (George & Desu, 1974):
$$N = \frac{2(Z_\alpha + Z_\beta)^2}{[\ln(\lambda_C/\lambda_I)]^2}$$
• Assumes all patients are followed to an event (no censoring)
• Assumes all patients are entered immediately

Assuming Exponential Survival Distributions
• Simple case
• The statistical test is powered by the total number of events observed at the time of the analysis, d:
$$d = \frac{4(Z_\alpha + Z_\beta)^2}{[\ln(\Delta)]^2}, \qquad \Delta = \lambda_C/\lambda_I$$

Converting Number of Events (d) to Required Sample Size (2N)
• $d = 2N \times P(\text{event})$, so $2N = d / P(\text{event})$
• P(event) is a function of the length of total follow-up at the time of analysis and the average hazard rate
• Let
  – AR = accrual rate (patients per year)
  – A = period of uniform accrual ($2N = AR \times A$)
  – F = period of follow-up after accrual is complete
  – A/2 + F = average total follow-up at planned analysis
  – $\bar\lambda$ = average hazard rate
• Then $P(\text{event}) = 1 - P(\text{no event}) = 1 - e^{-\bar\lambda(A/2 + F)}$

Time to Failure (2)
• In many clinical trials
  1. Not all patients are followed to an event (i.e. censoring)
  2. Patients are recruited over some period of time (i.e. staggered entry)
• More general model (Lachin, 1981)
$$N = \frac{(z_\alpha + z_\beta)^2\{g(\lambda_C) + g(\lambda_I)\}}{(\lambda_C - \lambda_I)^2}$$
where $g(\lambda)$ is defined as follows:
1.
Instant recruitment, study censored at time T:
$$g(\lambda) = \frac{\lambda^2}{1 - e^{-\lambda T}}$$
2. Continuous recruiting over (0, T), censored at T:
$$g(\lambda) = \frac{\lambda^3 T}{\lambda T - 1 + e^{-\lambda T}}$$
3. Recruitment over (0, $T_0$), study censored at T ($T > T_0$):
$$g(\lambda) = \frac{\lambda^2}{1 - \left[e^{-\lambda(T - T_0)} - e^{-\lambda T}\right]/(\lambda T_0)}$$

Example
Assume $\alpha = .05$ (2-sided) and $1 - \beta = .90$; $\lambda_C = .3$ and $\lambda_I = .2$; T = 5 years follow-up; $T_0 = 3$
0. No censoring, instant recruiting: N = 128
1. Censoring at T, instant recruiting: N = 188
2. Censoring at T, continual recruitment: N = 310
3. Censoring at T, recruitment to $T_0$: N = 233

Sample Size Adjustment for Non-Compliance (1)
• References:
  1. Schork & Remington (1967) Journal of Chronic Diseases
  2. Halperin et al (1968) Journal of Chronic Diseases
  3. Wu, Fisher & DeMets (1988) Controlled Clinical Trials
• Problem: some patients may not adhere to the treatment protocol
• Impact: dilutes whatever true treatment effect exists

Sample Size Adjustment for Non-Compliance (2)
• Fundamental principle: analyze all subjects randomized
• Called the Intent-to-Treat (ITT) principle
  – Noncompliance will dilute the treatment effect
• A solution: adjust sample size to compensate for the dilution effect (reduced power)
• Definitions of noncompliance
  – Dropout: patient in treatment group stops taking therapy
  – Dropin: patient in control group starts taking experimental therapy

Comparing Two Proportions
  – Assumes event rates will be altered by non-compliance
  – Define $P_T^*$ = adjusted treatment group rate, $P_C^*$ = adjusted control group rate
  – If $P_T < P_C$, then $0 \le P_T \le P_T^* \le P_C^* \le P_C \le 1.0$

Adjusted Sample Size: Simple Model
• Compute unadjusted N
  – Assume no dropins
  – Assume dropout proportion R
  – Thus $P_C^* = P_C$ and $P_T^* = (1 - R)P_T + R\,P_C$
  – Then adjust N:
$$N^* = \frac{N}{(1 - R)^2}$$
  – Example

R      1/(1-R)^2   % Increase
.1     1.23        23%
.25    1.78        78%

Sample Size Adjustment for Non-Compliance: Dropouts & Dropins ($R_O$, $R_I$)
$$N^* = \frac{N}{(1 - R_O - R_I)^2}$$
  – Example

R_O    R_I    1/(1-R_O-R_I)^2   % Increase
.1     .1     1.56              56%
.25    .25    4.0               300% (4 times)

Multiple Response Variables
• Many trials measure several outcomes (e.g.
MILIS, NOTT)
• Must force the investigator to rank them for importance
• Do sample size on a few outcomes (2-3)
• If estimates agree, OK; if not, must seek a compromise

Sample Size Summary
• Ethically, the size of the study must be large enough to achieve the stated goals with reasonable probability (power)
• Sample size estimates are only approximate due to uncertainty in assumptions
• Need to be conservative but realistic

Demo of Sample Size Program
www.biostat.wisc.edu/
• Program covers comparison of proportions, means, and time to failure
• Can vary control group rates or responses, alpha and power, and hypothesized differences
• Program develops a sample size table and a power curve for a particular sample size
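The formulas in this chapter can be sketched in a few lines of Python. This is not the program demonstrated at the URL above; it is a minimal illustration, assuming the rounded constants used on the slides ($Z_\alpha = 1.96$, $Z_\beta = 1.282$). All function names are hypothetical, and the functions return unrounded per-group sizes.

```python
import math

# Constants as used on the slides: two-sided alpha = .05, power = .90.
Z_ALPHA = 1.96
Z_BETA = 1.282


def n_proportions_simple(pc, pt, za=Z_ALPHA, zb=Z_BETA):
    """Formula (1): per-group N using the pooled variance only."""
    p_bar = (pc + pt) / 2.0
    return 2.0 * (za + zb) ** 2 * p_bar * (1.0 - p_bar) / (pc - pt) ** 2


def n_proportions(pc, pt, za=Z_ALPHA, zb=Z_BETA):
    """Formula (2): per-group N with separate null and alternative variances."""
    p_bar = (pc + pt) / 2.0
    null_sd = math.sqrt(2.0 * p_bar * (1.0 - p_bar))
    alt_sd = math.sqrt(pc * (1.0 - pc) + pt * (1.0 - pt))
    return (za * null_sd + zb * alt_sd) ** 2 / (pc - pt) ** 2


def n_means(std_diff, za=Z_ALPHA, zb=Z_BETA):
    """Per-group N for two means; std_diff is delta / sigma."""
    return 2.0 * (za + zb) ** 2 / std_diff ** 2


def n_exponential(lam_c, lam_i, za=Z_ALPHA, zb=Z_BETA):
    """George & Desu per-group N: no censoring, instant entry."""
    return 2.0 * (za + zb) ** 2 / math.log(lam_c / lam_i) ** 2


def g_instant(lam, t):
    """Case 1: instant recruitment, censoring at time t."""
    return lam ** 2 / (1.0 - math.exp(-lam * t))


def g_continuous(lam, t):
    """Case 2: continuous recruitment over (0, t), censoring at t."""
    return lam ** 3 * t / (lam * t - 1.0 + math.exp(-lam * t))


def g_staggered(lam, t, t0):
    """Case 3: recruitment over (0, t0), censoring at t > t0."""
    frac = (math.exp(-lam * (t - t0)) - math.exp(-lam * t)) / (lam * t0)
    return lam ** 2 / (1.0 - frac)


def n_lachin(lam_c, lam_i, g, za=Z_ALPHA, zb=Z_BETA):
    """Lachin (1981) per-group N; g maps a hazard rate to g(lambda)."""
    return (za + zb) ** 2 * (g(lam_c) + g(lam_i)) / (lam_c - lam_i) ** 2


def adjust_for_noncompliance(n, dropout=0.0, dropin=0.0):
    """Inflate N by 1 / (1 - R_O - R_I)^2."""
    return n / (1.0 - dropout - dropin) ** 2


if __name__ == "__main__":
    # Inputs taken from the worked examples in the slides.
    print("two proportions, formula (2):", n_proportions(0.40, 0.30))
    print("two proportions, formula (1):", n_proportions_simple(0.40, 0.30))
    print("two means, delta/sigma = 0.3:", n_means(0.3))
    print("exponential, no censoring:", n_exponential(0.3, 0.2))
    print("Lachin case 1:", n_lachin(0.3, 0.2, lambda lam: g_instant(lam, 5.0)))
    print("Lachin case 2:", n_lachin(0.3, 0.2, lambda lam: g_continuous(lam, 5.0)))
    print("Lachin case 3:", n_lachin(0.3, 0.2, lambda lam: g_staggered(lam, 5.0, 3.0)))
    print("10% dropout inflation:", adjust_for_noncompliance(1.0, dropout=0.1))
```

With the example inputs, the unrounded results fall within one subject of the per-group sizes quoted in the slides (476, 478, 234, 128, 188, 310, 233). In practice one would round up, and then vary the assumptions to produce a series of estimates, as recommended earlier in the chapter.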