Chapter 6
SAMPLE SIZE ISSUES
Ref: Lachin, Controlled Clinical Trials 2:93-113, 1981.
1
Sample Size Issues
• Fundamental Point
Trial must have sufficient statistical power to detect
differences of clinical interest
• High proportion of published negative trials do not
have adequate power
Freiman et al, NEJM (1978)
50/71 could miss a 50% benefit
2
Example: How many subjects?
• Compare new treatment (T) with a control (C)
• Previous data suggest Control Failure Rate (PC) ~ 40%
• Investigator believes treatment can reduce PC by 25% (relative reduction)
i.e. PT = .30, PC = .40
• N = number of subjects/group?
3
• Estimates only approximate
– Uncertain assumptions
– Over optimism about treatment
– Healthy screening effect
• Need series of estimates
– Try various assumptions
– Must pick most reasonable
• Be conservative yet be reasonable
4
Statistical Considerations
Null Hypothesis (H0):
No difference in the response exists between treatment
and control groups
Alternative Hypothesis (Ha):
A difference of a specified amount (δ) exists between
treatment and control
Significance Level (α): Type I Error
The probability of rejecting H0 given that H0 is true
Power = (1 - β): (β = Type II Error)
The probability of rejecting H0 given that H0 is not true
5
Standard Normal Distribution
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
6
Standard Normal Table
7
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
Distribution of Sample Means (1)
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
8
Distribution of Sample Means (2)
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
9
Distribution of Sample Means (3)
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
10
Distribution of Sample Means (4)
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
11
Distribution of Test Statistics
• Many have a common form
• θ = population parameter (e.g.
difference in means)
• $\hat{\theta}$ = sample estimate
• Then
– $Z = [\hat{\theta} - E(\hat{\theta})] / SE(\hat{\theta})$
• And then Z has a Normal (0,1) distribution
12
• If statistic z is large enough (e.g. falls into red area of scale), we believe this result is too large
to have come from a distribution with mean 0 (i.e. Pc - Pt = 0)
• Thus we reject H0: Pc - Pt = 0, accepting a 5% chance that a result this large could have
come from a distribution with no difference
13
Normal Distribution
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
14
Two Groups
[Figure: null and alternative hypotheses for groups C and T]
$$Z = \frac{\bar{X}_C - \bar{X}_T}{\sigma\sqrt{2/n}} \sim N(0,1)$$
15
Test Statistics
θ = population parameter
$\tilde{\theta}$ = sample estimate
$$Z = \frac{\tilde{\theta} - E(\tilde{\theta})}{\sqrt{v(\tilde{\theta})}}$$
16
Test of Hypothesis
• Two sided (classic test)
– e.g. H0: PT = PC
– If |z| > $z_\alpha$, reject H0
– α = .05, $z_\alpha$ = 1.96
vs.
• One sided
– H0: PT < PC
– If z > $z_\alpha$, reject H0
– α = .05, $z_\alpha$ = 1.645
where z = test statistic and $z_\alpha$ = critical value
• Recommend
– $z_\alpha$ be same value both cases (e.g. 1.96)
– two-sided α = .05 or one-sided α = .025; $z_\alpha$ = 1.96 in both
17
Typical Design Assumptions (1)
1. α = .05, .025, .01
2. Power = .80, .90
Should be at least .80 for design
3. δ = smallest difference hope to detect
e.g. δ = PC - PT
= .40 - .30
= .10
25% reduction!
18
Typical Design Assumptions (2)

Two Sided Significance Level
α        Zα
0.05     1.96
0.025    2.24
0.01     2.58

Power
1-β      Zβ
0.80     0.84
0.90     1.282
0.95     1.645
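The constants in this table can be recomputed from the standard normal quantile function. The sketch below is only an illustration using Python's standard library; the helper names `critical_z_alpha` and `critical_z_beta` are made up for this example and do not come from the slides.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def critical_z_alpha(alpha: float, two_sided: bool = True) -> float:
    """Critical value with P{|Z| > z} = alpha (two sided) or P{Z > z} = alpha (one sided)."""
    tail = alpha / 2 if two_sided else alpha
    return STD_NORMAL.inv_cdf(1 - tail)


def critical_z_beta(power: float) -> float:
    """Constant with P{Z < z} = power, where power = 1 - beta."""
    return STD_NORMAL.inv_cdf(power)


for alpha in (0.05, 0.025, 0.01):
    print(f"alpha = {alpha}: Z_alpha = {critical_z_alpha(alpha):.2f}")
for power in (0.80, 0.90, 0.95):
    print(f"power = {power}: Z_beta = {critical_z_beta(power):.3f}")
```

A one-sided test at α = .025 gives the same critical value as a two-sided test at α = .05, which is the reason for the earlier recommendation to use 1.96 in both cases.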
19
Sample Size Exercise
• How many do I need?
• Next question, what’s the question?
• Reason is that sample size depends on
the outcome being measured, and the
method of analysis to be used
20
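One Sample Test for Mean
H0: μ = μ0 vs. HA: μ = μ0 + δ
$$1 - \beta = P\{\text{reject } H_0 \mid H_A \text{ is true}\}$$
$$= P\left\{ \frac{\bar{Y} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha \;\Big|\; \mu = \mu_0 + \delta \right\}$$
$$= P\left\{ \frac{\bar{Y} - (\mu_0 + \delta)}{\sigma/\sqrt{n}} > z_\alpha - \frac{\delta}{\sigma/\sqrt{n}} \;\Big|\; \mu = \mu_0 + \delta \right\}$$
$z_\alpha$ = constant associated with α
P{|Z| > $z_\alpha$} = α two sided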
21
One Sample Test for Mean
Under the alternative hypothesis that μ = μ0 + δ, the test statistic
$$\frac{\bar{Y} - (\mu_0 + \delta)}{\sigma/\sqrt{n}}$$
follows a standard normal distribution.
22
One Sample Test for Mean
$$-z_\beta = z_\alpha - \frac{\delta}{\sigma/\sqrt{n}}$$
$$n = \frac{[z_\alpha + z_\beta]^2 \sigma^2}{\delta^2}$$
23
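As a rough numerical check of the one-sample result n = (zα + zβ)² σ² / δ², the sketch below evaluates it with exact normal quantiles. The function name and the inputs (σ = 15, δ = 4.5) are illustrative assumptions, not values given on these slides.

```python
import math
from statistics import NormalDist


def one_sample_n(delta: float, sigma: float,
                 alpha: float = 0.05, power: float = 0.90) -> float:
    """Sample size for a one-sample test of a mean, two-sided alpha.

    Implements n = (z_alpha + z_beta)^2 * sigma^2 / delta^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # P{|Z| > z_alpha} = alpha
    z_beta = NormalDist().inv_cdf(power)           # P{Z < z_beta} = 1 - beta
    return (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


# Hypothetical inputs: detect a shift of 4.5 when sigma = 15.
n = one_sample_n(delta=4.5, sigma=15)
print(math.ceil(n))  # round up to a whole number of subjects
```

Halving δ quadruples n, since δ enters the formula squared.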
Simple Case - Binomial
1. H0: PC = PT
2. Test Statistic (Normal Approx.)
$$Z = \frac{\hat{p}_C - \hat{p}_T}{\sqrt{\bar{p}(1 - \bar{p})(1/N_C + 1/N_T)}}, \qquad \bar{p} = \frac{N_C \hat{p}_C + N_T \hat{p}_T}{N_C + N_T}$$
3. Sample Size
Assume
• NT = NC = N
• HA: δ = PC - PT
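To make the pooled normal-approximation statistic concrete, the sketch below computes Z = (p̂C − p̂T) / sqrt(p̄(1 − p̄)(1/NC + 1/NT)) from event counts. The function name and the counts (40 of 100 control failures, 30 of 100 treatment failures) are hypothetical, chosen only to echo the PC = .40, PT = .30 example.

```python
import math


def two_proportion_z(events_c: int, n_c: int, events_t: int, n_t: int) -> float:
    """Pooled z statistic for H0: PC = PT (normal approximation)."""
    p_c = events_c / n_c
    p_t = events_t / n_t
    # Pooled proportion: weighted average of the two sample proportions.
    p_bar = (events_c + events_t) / (n_c + n_t)
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n_c + 1 / n_t))
    return (p_c - p_t) / se


z = two_proportion_z(events_c=40, n_c=100, events_t=30, n_t=100)
print(f"z = {z:.3f}, reject H0 at two-sided .05: {abs(z) > 1.96}")
```

With only 100 subjects per group the observed 10-point difference does not reach |z| > 1.96, which is the kind of underpowered result the sample size formulas are meant to prevent.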
24
Sample Size Formula (1)
Two Proportions
Simpler Case
$$N = \frac{2(Z_\alpha + Z_\beta)^2\, \bar{p}(1 - \bar{p})}{\delta^2}$$
• $Z_\alpha$ = constant associated with α
P{|Z| > $Z_\alpha$} = α two sided!
(e.g. α = .05, $Z_\alpha$ = 1.96)
• $Z_\beta$ = constant associated with 1 - β
P{Z < $Z_\beta$} = 1 - β
(e.g. 1 - β = .90, $Z_\beta$ = 1.282)
• Solve for $Z_\beta$ (⇒ 1 - β) or δ
25
Sample Size Formula (2)
Two Proportions
$$N = \frac{\left[ Z_\alpha \sqrt{2\bar{p}(1 - \bar{p})} + Z_\beta \sqrt{P_C(1 - P_C) + P_T(1 - P_T)} \right]^2}{\delta^2}$$
• $Z_\alpha$ = constant associated with α
P{|Z| > $Z_\alpha$} = α two sided!
(e.g. α = .05, $Z_\alpha$ = 1.96)
• $Z_\beta$ = constant associated with 1 - β
P{Z < $Z_\beta$} = 1 - β
(e.g. 1 - β = .90, $Z_\beta$ = 1.282)
26
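Sample Size Formula
Power
• Solve for $Z_\beta$ ⇒ 1 - β
$$Z_\beta = \frac{\delta\sqrt{N}}{\sqrt{2\bar{p}\bar{q}}} - Z_\alpha$$
Difference Detected
• Solve for δ
$$\delta = (Z_\alpha + Z_\beta)\sqrt{\frac{2\bar{p}\bar{q}}{N}}$$
where $\bar{q} = 1 - \bar{p}$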
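The two rearrangements of the simpler formula N = 2(Zα + Zβ)² p̄q̄ / δ² can be sketched as follows: one solves for Zβ (and hence power) at a fixed N, the other for the detectable difference δ. Function names are illustrative, and N = 478 per group is used only because it matches the simpler-formula value in the worked example.

```python
import math
from statistics import NormalDist


def power_two_proportions(n: float, p_c: float, p_t: float,
                          alpha: float = 0.05) -> float:
    """Power for N per group: Z_beta = delta * sqrt(N / (2 p q)) - Z_alpha."""
    p_bar = (p_c + p_t) / 2
    q_bar = 1 - p_bar
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = abs(p_c - p_t) * math.sqrt(n / (2 * p_bar * q_bar)) - z_alpha
    return NormalDist().cdf(z_beta)


def detectable_difference(n: float, p_bar: float,
                          alpha: float = 0.05, power: float = 0.90) -> float:
    """Detectable delta for N per group: (Z_alpha + Z_beta) * sqrt(2 p q / N)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) * math.sqrt(2 * p_bar * (1 - p_bar) / n)


print(f"power at N = 478: {power_two_proportions(478, 0.40, 0.30):.3f}")
print(f"detectable delta at N = 478: {detectable_difference(478, 0.35):.3f}")
```

Note that p̄ itself depends on the rates being compared, so the detectable difference is an approximation that treats p̄ as fixed.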
27
Simple Example (1)
• H0: PC = PT
• HA: PC = .40, PT = .30
δ = .40 - .30 = .10
• Assume
α = .05 (Two sided), $Z_\alpha$ = 1.96
1 - β = .90, $Z_\beta$ = 1.282
• $\bar{p}$ = (.40 + .30)/2 = .35
28
Simple Example (2)
Thus
a.
$$N = \frac{\left[ 1.96\sqrt{2(.35)(.65)} + 1.282\sqrt{(.3)(.7) + (.4)(.6)} \right]^2}{(.4 - .3)^2}$$
N = 476
2N = 952
b.
$$N = \frac{2(1.96 + 1.282)^2 (.35)(.65)}{(.4 - .3)^2} = 478$$
N = 478
2N = 956
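The two calculations in this example can be reproduced with a few lines of code. The sketch below uses the rounded constants from the slides (1.96 and 1.282); the function names are illustrative only.

```python
import math


def n_two_proportions_full(p_c: float, p_t: float,
                           z_alpha: float = 1.96, z_beta: float = 1.282) -> float:
    """Formula (2): null variance under Z_alpha, alternative variance under Z_beta."""
    p_bar = (p_c + p_t) / 2
    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * math.sqrt(p_c * (1 - p_c) + p_t * (1 - p_t))
    return (null_term + alt_term) ** 2 / (p_c - p_t) ** 2


def n_two_proportions_simple(p_c: float, p_t: float,
                             z_alpha: float = 1.96, z_beta: float = 1.282) -> float:
    """Formula (1): pooled variance used for both terms."""
    p_bar = (p_c + p_t) / 2
    return 2 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / (p_c - p_t) ** 2


n_a = n_two_proportions_full(0.40, 0.30)
n_b = n_two_proportions_simple(0.40, 0.30)
print(f"a. N per group = {n_a:.1f}")
print(f"b. N per group = {n_b:.1f}")
```

The unrounded values land within a fraction of a subject of the 476 and 478 quoted above, and the two formulas differ by well under 1% for this example.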
29
Approximate* Total Sample Size for Comparing Various
Proportions in Two Groups with Significance Level (α)
of 0.05 and Power (1-β) of 0.80 and 0.90

True Proportions              α = 0.05 (two-sided)    α = 0.05 (one-sided)
pC          pI                1-β       1-β           1-β       1-β
(Control)   (Intervention)    0.90      0.80          0.90      0.80
0.60        0.50              1040      780           850       610
            0.40              260       200           210       160
            0.30              120       90            90        70
            0.20              60        50            50        40
0.50        0.40              1040      780           850       610
            0.30              250       190           210       150
            0.25              160       120           130       90
            0.20              110       80            90        60
0.40        0.30              960       720           780       560
            0.25              410       310           330       240
            0.20              220       170           180       130
0.30        0.20              790       590           640       470
            0.15              330       250           270       190
            0.10              170       130           140       100
0.20        0.15              2430      1810          1980      1430
            0.10              540       400           440       320
            0.05              200       150           170       120
0.10        0.05              1170      870           950       690

*Sample sizes are rounded up to the nearest 10
30
31
Comparison of Means
• Some outcome variables are continuous
– Blood Pressure
– Serum Chemistry
– Pulmonary Function
• Hypothesis tested by comparison of
mean values between groups, or
comparison of mean changes
32
Comparison of Two Means
• H0: μC = μT ⇔ μC - μT = 0
• HA: μC - μT = δ
• Test statistic for sample means ~ N(μ, σ)
$$Z = \frac{\bar{X}_C - \bar{X}_T}{\sqrt{\sigma^2 (1/N_C + 1/N_T)}} \sim N(0,1) \text{ for } H_0$$
• Let N = NC = NT for design
$$N = \frac{2(Z_\alpha + Z_\beta)^2 \sigma^2}{\delta^2} = \frac{2(Z_\alpha + Z_\beta)^2}{(\delta/\sigma)^2}$$
• Power
$$Z_\beta = \sqrt{N/2}\,(\delta/\sigma) - Z_\alpha$$
33
Example
e.g. IQ, σ = 15
• Set α = .05 (two-sided)
β = 0.10, 1 - β = 0.90
• HA: δ = 0.3σ
⇒ δ = 0.3 x 15 = 4.5, δ/σ = 0.3
• Sample Size
$$N = \frac{2(1.96 + 1.282)^2}{(0.3)^2} = \frac{2(10.51)}{(0.3)^2} = \frac{21.02}{(0.3)^2}$$
• N = 234
⇒ 2N = 468
34
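The IQ example can be checked numerically. The sketch below evaluates N = 2(Zα + Zβ)² / (δ/σ)² with the slide constants, then plugs the rounded N back into the power relation Zβ = sqrt(N/2)(δ/σ) − Zα. The function names are illustrative.

```python
import math


def n_per_group_means(effect_size: float,
                      z_alpha: float = 1.96, z_beta: float = 1.282) -> float:
    """N per group for comparing two means; effect_size is delta / sigma."""
    return 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2


def z_beta_for_n(n: float, effect_size: float, z_alpha: float = 1.96) -> float:
    """Power relation solved for Z_beta at a given N per group."""
    return math.sqrt(n / 2) * effect_size - z_alpha


delta, sigma = 4.5, 15.0
n = n_per_group_means(delta / sigma)
n_rounded = math.ceil(n)
print(f"N per group = {n_rounded}, total = {2 * n_rounded}")
print(f"Z_beta at N = {n_rounded}: {z_beta_for_n(n_rounded, delta / sigma):.3f}")
```

Only the ratio δ/σ matters here, so the same N applies to any outcome with a standardized difference of 0.3.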
35
Comparing Time to Event
Distributions
• Primary efficacy endpoint is the time to an
event
• Compare the survival distributions for the
two groups
• Measure of treatment effect is the ratio of
the hazard rates in the two groups (for exponential
survival, the inverse of the ratio of the medians)
• Must also consider the length of follow-up
36
Assuming Exponential
Survival Distributions
• If P(T > t) = $e^{-\lambda t}$, where λ = λ1 in group 1,
λ = λ2 in group 2, let
$$H_0: \lambda_2 = \lambda_1 \iff \frac{\lambda_1}{\lambda_2} = 1$$
$$H_a: \lambda_2 \neq \lambda_1 \iff \frac{\lambda_1}{\lambda_2} \neq 1$$
• Then define the effect size by
Δ = λ1 / λ2 = med2 / med1
where $med_i = -\ln(.5) / \lambda_i$
• Standard difference
$\ln(\Delta)/\sqrt{2}$
37
Time to Failure (1)
• Use a parametric model for sample size
• Common model - exponential
– S(t) = $e^{-\lambda t}$, λ = hazard rate
– H0: λI = λC
– Estimate N (George & Desu, 1974)
$$N = \frac{2(Z_\alpha + Z_\beta)^2}{[\ln(\lambda_C / \lambda_I)]^2}$$
• Assumes all patients followed to an event
(no censoring)
• Assumes all patients immediately entered
38
Assuming Exponential
Survival Distributions
• Simple case
• The statistical test is powered by the
total number of events observed at the
time of the analysis, d.
$$d = \frac{4(Z_\alpha + Z_\beta)^2}{[\ln(\Delta)]^2}, \qquad \Delta = \frac{\lambda_C}{\lambda_I}$$
39
Converting Number of Events
(d) to Required Sample Size (2N)
• d = 2N x P(event)
⇒ 2N = d/P(event)
• P(event) is a function of the length of total follow-up at time of analysis and the average hazard rate
• Let AR = accrual rate (patients per year)
A = period of uniform accrual (2N = AR x A)
F = period of follow-up after accrual complete
A/2 + F = average total follow-up at planned analysis
λ = average hazard rate
• Then P(event) = 1 – P(no event) = $1 - e^{-\lambda(A/2 + F)}$
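The conversion from required events to required sample size can be sketched in three steps: compute d = 4(Zα + Zβ)² / [ln Δ]², compute P(event) = 1 − exp(−λ(A/2 + F)), then divide. All numeric inputs below (hazards .3 and .2, their simple average as the average hazard, A = 3 years, F = 2 years) are assumptions made for illustration, as are the function names.

```python
import math


def required_events(hazard_ratio: float,
                    z_alpha: float = 1.96, z_beta: float = 1.282) -> float:
    """Total events d = 4 (Z_alpha + Z_beta)^2 / [ln(hazard ratio)]^2."""
    return 4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2


def event_probability(avg_hazard: float, accrual: float, follow_up: float) -> float:
    """P(event) = 1 - exp(-lambda * (A/2 + F)) under uniform accrual."""
    return 1 - math.exp(-avg_hazard * (accrual / 2 + follow_up))


def total_sample_size(events: float, p_event: float) -> float:
    """2N = d / P(event)."""
    return events / p_event


# Hypothetical design values, not taken from the slides.
lam_c, lam_i = 0.30, 0.20
d = required_events(lam_c / lam_i)
p = event_probability(avg_hazard=(lam_c + lam_i) / 2, accrual=3.0, follow_up=2.0)
print(f"events needed: {math.ceil(d)}")
print(f"P(event): {p:.3f}")
print(f"total sample size 2N: {math.ceil(total_sample_size(d, p))}")
```

Lengthening follow-up F raises P(event) and so lowers the number of patients needed to observe the same d events.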
40
Time to Failure (2)
• In many clinical trials
1. Not all patients are followed to an event
(i.e. censoring)
2. Patients are recruited over some period of time
(i.e. staggered entry)
• More General Model (Lachin, 1981)
$$N = \frac{(z_\alpha + z_\beta)^2 \{ g(\lambda_C) + g(\lambda_I) \}}{(\lambda_C - \lambda_I)^2}$$
where g(λ) is defined as follows
41
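1. Instant Recruitment Study Censored At Time T
$$g(\lambda) = \frac{\lambda^2}{1 - e^{-\lambda T}}$$
2. Continuous Recruiting (0,T) & Censored at T
$$g(\lambda) = \frac{\lambda^3 T}{\lambda T - 1 + e^{-\lambda T}}$$
3. Recruitment (0, T0) & Study Censored at T (T > T0)
$$g(\lambda) = \frac{\lambda^2}{1 - \left[ e^{-\lambda (T - T_0)} - e^{-\lambda T} \right] / (\lambda T_0)}$$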
42
Example
Assume α = .05 (2-sided) & 1 - β = .90
λC = .3 and λI = .2
T = 5 years follow-up
T0 = 3
0. No Censoring, Instant Recruiting: N = 128
1. Censoring at T, Instant Recruiting: N = 188
2. Censoring at T, Continual Recruitment: N = 310
3. Censoring at T, Recruitment to T0: N = 233
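The four cases in this example can be recomputed from the exponential-model formulas: N = 2(Zα + Zβ)² / [ln(λC/λI)]² for case 0, and N = (zα + zβ)² {g(λC) + g(λI)} / (λC − λI)² with the three g(λ) definitions for cases 1 to 3. The sketch below uses the rounded constants 1.96 and 1.282; the function names are illustrative.

```python
import math


def g_instant(lam: float, T: float) -> float:
    """Case 1: instant recruitment, all patients censored at time T."""
    return lam ** 2 / (1 - math.exp(-lam * T))


def g_continuous(lam: float, T: float) -> float:
    """Case 2: uniform recruitment over (0, T), censored at T."""
    return lam ** 3 * T / (lam * T - 1 + math.exp(-lam * T))


def g_truncated(lam: float, T: float, T0: float) -> float:
    """Case 3: uniform recruitment over (0, T0), censored at T > T0."""
    tail = (math.exp(-lam * (T - T0)) - math.exp(-lam * T)) / (lam * T0)
    return lam ** 2 / (1 - tail)


def lachin_n(lam_c, lam_i, g, z_alpha=1.96, z_beta=1.282):
    """N per group for the general model, given a one-argument g(lambda)."""
    return (z_alpha + z_beta) ** 2 * (g(lam_c) + g(lam_i)) / (lam_c - lam_i) ** 2


def george_desu_n(lam_c, lam_i, z_alpha=1.96, z_beta=1.282):
    """Case 0: no censoring, instant recruitment."""
    return 2 * (z_alpha + z_beta) ** 2 / math.log(lam_c / lam_i) ** 2


lam_c, lam_i, T, T0 = 0.3, 0.2, 5.0, 3.0
results = {
    "0. no censoring": george_desu_n(lam_c, lam_i),
    "1. instant, censored at T": lachin_n(lam_c, lam_i, lambda l: g_instant(l, T)),
    "2. continual recruitment": lachin_n(lam_c, lam_i, lambda l: g_continuous(l, T)),
    "3. recruitment to T0": lachin_n(lam_c, lam_i, lambda l: g_truncated(l, T, T0)),
}
for label, n in results.items():
    print(f"{label}: N = {n:.1f}")
```

The unrounded values come out at roughly 127.9, 188.3, 310.5 and 233.0, in line with the slide figures. Censoring and slower recruitment both reduce the expected number of events per patient, which is why N rises from case 0 through case 2.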
43
Sample Size Adjustment
for Non-Compliance (1)
• References:
1. Schork & Remington (1967) Journal of Chronic Diseases
2. Halperin et al (1968) Journal of Chronic Diseases
3. Wu, Fisher & DeMets (1988) Controlled Clinical Trials
• Problem
Some patients may not adhere to treatment protocol
• Impact
Dilute whatever true treatment effect exists
44
Sample Size Adjustment
for Non-Compliance (2)
• Fundamental Principle
Analyze All Subjects Randomized
• Called Intent-to-Treat (ITT) Principle
– Noncompliance will dilute treatment effect
• A Solution
Adjust sample size to compensate for dilution effect
(reduced power)
• Definitions of Noncompliance
– Dropout: Patient in treatment group stops taking
therapy
– Dropin: Patient in control group starts taking
experimental therapy
45
Comparing Two Proportions
– Assumes event rates will be altered by
non-compliance
– Define
PT* = adjusted treatment group rate
PC* = adjusted control group rate
If PT < PC, the adjusted rates lie between the original rates on the 0 to 1.0 scale:
0 < PT ≤ PT* ≤ PC* ≤ PC < 1.0
46
Adjusted Sample Size
• Simple Model: compute unadjusted N
– Assume no dropins
– Assume dropout proportion R
– Thus PC* = PC
PT* = (1-R) PT + R PC
– Then adjust N
$$N^* = \frac{N}{(1 - R)^2}$$
– Example
R       1/(1-R)²    % Increase
.1      1.23        23%
.25     1.78        78%
47
Sample Size Adjustment
for Non-Compliance
Dropouts & dropins (R0, RI)
$$N^* = \frac{N}{(1 - R_0 - R_I)^2}$$
– Example
R0      RI      1/(1-R0-RI)²    % Increase
.1      .1      1.56            56%
.25     .25     4.0             300% (4 times)
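The inflation factors in these examples follow directly from N* = N / (1 − R0 − RI)², with RI = 0 in the dropout-only model. The sketch below recomputes them and applies the factor to an unadjusted N; the function names and the choice of N = 476 (borrowed from the two-proportion example) are illustrative assumptions.

```python
import math


def inflation_factor(dropout: float, dropin: float = 0.0) -> float:
    """Multiplier 1 / (1 - R0 - RI)^2 applied to the unadjusted sample size."""
    retained = 1 - dropout - dropin
    if retained <= 0:
        raise ValueError("dropout + dropin must be less than 1")
    return 1 / retained ** 2


def diluted_treatment_rate(p_t: float, p_c: float, dropout: float) -> float:
    """PT* = (1 - R) PT + R PC: dropouts revert to the control rate."""
    return (1 - dropout) * p_t + dropout * p_c


for r0, ri in [(0.10, 0.0), (0.25, 0.0), (0.10, 0.10), (0.25, 0.25)]:
    factor = inflation_factor(r0, ri)
    print(f"R0 = {r0}, RI = {ri}: factor = {factor:.2f}, "
          f"increase = {100 * (factor - 1):.0f}%")

# Hypothetical use: inflate N = 476 per group for 10% dropout.
print(math.ceil(476 * inflation_factor(0.10)))
```

Because the factor grows with the square of the retained fraction, even moderate noncompliance in both arms can multiply the required sample size several times over.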
48
Multiple Response Variables
• Many trials measure several outcomes
(e.g. MILIS, NOTT)
• Must force investigator to rank them for
importance
• Do sample size on a few outcomes (2-3)
• If estimates agree, OK
If not, must seek compromise
49
Sample Size Summary
• Ethically, the size of the study must be
large enough to achieve the stated
goals with reasonable probability
(power)
• Sample size estimates are only
approximate due to uncertainty in
assumptions
• Need to be conservative but realistic
50
Demo of Sample Size Program
www.biostat.wisc.edu/
• Program covers comparison of
proportions, means, & time to failure
• Can vary control group rates or
responses, alpha & power,
hypothesized differences
• Program develops sample size table
and a power curve for a particular
sample size
51