Implementing Adaptive Designs in
Clinical Trials: Risks and Benefits
Christopher Khedouri, Ph.D.*, Thamban
Valappil, Ph.D.*, Mohammed Huque, Ph.D.*
* U.S. Food and Drug Administration, Center for Drug
Evaluation and Research (FDA/CDER)
ASA Joint Statistical Meetings in Seattle,
Washington on August 7, 2006
Disclaimer: Views expressed are those of the
authors and not necessarily those of the FDA
Outline
• Overview of risks & benefits of 3 types of sample size designs in a
non-sequential (NS) two-stage setting
 Fixed sample design (FS): No sample size re-estimation (SSR)
 Variance Re-estimation (VR): SSR based on σ2 only.
 Adaptive design for SSR (AD): SSR based on δ, σ2 and other factors.
• Specific risks of ADs in non-inferiority (NI) trials
 Sample Size (SS) Increases (Due to Chance Variation) and
Inefficiencies
 Reduction of Treatment Effect, Standards
 Bio-creep
 Influence from Confounders
 Operational Biases
 Misclassification Biases (Non-differential or Differential)
• Conclusions: ADs may not be appropriate in a NI trial setting,
especially in Phase III confirmatory trials
Risks & Benefits: Fixed Sample (FS) vs. Variance
Re-estimation (VR) vs. Adaptive Designs (AD) (1)
• FS Risks:
 Lack of flexibility (pre-specification of fixed δT, σ2)
 Loss of efficiency if δT and σ2 not pre-specified correctly.
• FS Benefits:
 Limited operational biases
 Efficiently and reliably implemented.
 Good statistical properties
 Easily interpreted/compared with historical studies
 Widely used/accepted
• VR Risks:
 Pre-specification of fixed δT
 Careful implementation of SSR needed to limit
operational biases
Risks & Benefits: Fixed Sample vs. Variance Re-estimation vs. Adaptive Designs (2)
• VR Benefits
 Flexible to unexpected increases in variance
 Good statistical properties
 More easily implemented than ADs
 Less potential for biases with blinded interim looks
• AD Risks (NI and Superiority)
 Limited use and acceptance in Phase III trials
 Limited advantages due to regulatory restrictions
 Sample size (SS) increase due to chance variation
 Inefficiencies due to large SS increases.
 Logistical constraints (e.g. new enrollment)
 Inconsistencies in Stage I & II
 Statistical vs. clinical significance
 Unequal patient weighting
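
The "unequal patient weighting" risk can be illustrated with a weighted two-stage statistic in the style of Cui et al. ('99), where the stage weights are fixed at the planning stage. This is a minimal sketch under stated assumptions, not the authors' simulation code: the function names and the example sizes (126 stage-1 patients out of a planned 252 per arm) are hypothetical, and the slides do not say which test statistic was used.

```python
from math import sqrt


def weighted_two_stage_z(z1, z2, n1, n_planned):
    """Combine stage-wise z statistics with weights fixed at planning time.

    Assumption: weights are the planned information fractions, so they do
    not change when the stage-2 sample size is re-estimated.
    """
    w1 = n1 / n_planned
    w2 = 1.0 - w1
    return sqrt(w1) * z1 + sqrt(w2) * z2


def stage2_patient_weight_ratio(n1, n_planned, n2_actual):
    """Weight of one stage-2 patient relative to one stage-1 patient.

    A stage-wise z statistic is a sum over n patients scaled by 1/sqrt(n),
    so each patient enters the final statistic with weight sqrt(w / n).
    """
    w1 = n1 / n_planned
    w2 = 1.0 - w1
    return sqrt(w2 / n2_actual) / sqrt(w1 / n1)


# Hypothetical example: stage 2 as planned (126) vs. tripled (378).
as_planned = stage2_patient_weight_ratio(126, 252, 126)
after_increase = stage2_patient_weight_ratio(126, 252, 378)
print(as_planned, after_increase)
```

When stage 2 is enrolled as planned the ratio is 1; once stage 2 is enlarged, each later patient counts for less than each earlier patient, which is the weighting imbalance the bullet refers to.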
Risks & Benefits: Fixed Sample vs. Variance Re-estimation vs. Adaptive Designs (3)
• AD Risks (Additional Risks in NI Studies)
 Unclear comparisons with historical controls.
 “Sub-optimal” new treatments
 Unclear comparisons with other drugs.
 Potential bio-creep issues
 Unclear assay sensitivity
 Operational and misclassification biases
 Inflation of type I error given underlying misclassification biases
 ‘Confirmatory’ vs. ‘Exploratory’
• AD Benefits
 Flexibility to faulty design assumptions affecting 1st stage test
statistic (e.g. assumed δ and σ2)
 Flexibility towards other considerations (e.g. logistical
constraints)
 Increased power over FS and VR designs.
Specific Risks of Adaptive Designs in NI trials:
Sample Size Increases and Inefficiencies (1)
• In NI trials for anti-infective drugs (AIs), “typical
conditions” for SS estimation include:
 Assumption of equal cure rates for treatment and comparator
 Justified overall cure rate and NI margin (e.g. 10%)
 Pre-specified α (two-sided) = .05, β = .10 to .20
 Pre-specified sample size rule (no decreases to initial N)
• Simulations under “typical conditions” compared SS
increases and inefficiencies among FS, VR & ADs based
on Conditional (Proschan et al. ‘94 ) & Unconditional (Cui
et al. ‘99) power:
 First stage data based on 50% of initial N
 Nmax= 2N & 4N used for both ADs
 80% Power, δNI = 10% and δT = 0%.
 Assumed cure rate of 80%
 Treatment effects of +2%, 0% , -2%, -4%, -6%
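
The "typical conditions" above are enough to work out an initial per-arm sample size. The sketch below uses the standard normal-approximation formula for a difference in proportions with equal assumed cure rates; the slides do not state which formula was used, so the formula choice and the function name are assumptions. With an 80% cure rate, a 10% margin, two-sided α = .05 and 80% power it gives 252 per arm.

```python
from math import ceil
from statistics import NormalDist


def ni_sample_size_per_arm(cure_rate, margin, alpha_two_sided=0.05, power=0.80):
    """Per-arm N for a non-inferiority test of two proportions.

    Assumes equal true cure rates in both arms (delta_T = 0) and a
    normal approximation with unpooled variance.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha_two_sided / 2)
    z_beta = z(power)
    variance = 2 * cure_rate * (1 - cure_rate)  # sum of both arms' variances
    return ceil((z_alpha + z_beta) ** 2 * variance / margin ** 2)


n_per_arm = ni_sample_size_per_arm(cure_rate=0.80, margin=0.10)
print(n_per_arm)
```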
Specific Risks of Adaptive Designs in NI trials:
Sample Size Increases and Inefficiencies (2)
Change in Sample Size & Power: FS & ADs for NI

Cure Rate (%)         Fixed      AD: CP(2N)              AD: UCP(2N)             AD: CP(4N)              AD: UCP(4N)
T, C         δT       Power (%)  SS (%)  Power (%)  ΔFS  SS (%)  Power (%)  ΔFS  SS (%)  Power (%)  ΔFS  SS (%)  Power (%)  ΔFS
81, 79       +2%      92         8       95         +2   23      98         +2   17      96         +1   46      98         --
80, 80       +0%      80         16      88         +3   37      93         +3   35      90         --   78      96         --
79, 81       -2%      61         27      72         +1   52      81         +2   63      79         -2   119     91         --
78, 82       -4%      39         42      49         -2   67      57         -1   102     60         -6   164     75         -3
77, 83       -6%      20         57      25         -4   79      30         -2   147     32         -10  207     42         -8

Power=80%, δNI=10%, Correctly Assumed Cure Rates = 80%, NT,C=252, Nmax=2N, 4N
ΔFS: Gain/Loss in Power over Fixed Sample Design (FS) with same ASN
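
The Fixed column can be checked with a closed-form power calculation. The sketch below assumes a Wald-type test with unpooled variance, which the slides do not confirm; the function and variable names are illustrative. Under that assumption, the five cure-rate scenarios with N = 252 per arm round to 92, 80, 61, 39 and 20 percent, in line with the Fixed column.

```python
from math import sqrt
from statistics import NormalDist


def fs_power(p_treat, p_ctrl, n_per_arm, margin=0.10, alpha_two_sided=0.05):
    """Approximate power of a fixed-sample NI test on two proportions.

    NI is declared when the lower two-sided confidence bound for
    (p_treat - p_ctrl) lies above -margin.
    """
    nd = NormalDist()
    se = sqrt((p_treat * (1 - p_treat) + p_ctrl * (1 - p_ctrl)) / n_per_arm)
    z_alpha = nd.inv_cdf(1 - alpha_two_sided / 2)
    return nd.cdf((p_treat - p_ctrl + margin) / se - z_alpha)


# (treatment, comparator) cure rates from the table rows
scenarios = [(0.81, 0.79), (0.80, 0.80), (0.79, 0.81), (0.78, 0.82), (0.77, 0.83)]
powers = [round(100 * fs_power(t, c, 252)) for t, c in scenarios]
print(powers)
```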
Specific Risks of Adaptive Designs in NI trials:
Sample Size Increases and Inefficiencies (3)
• Simulation results indicated that:
 Adaptive designs for SSR (ADs) often required substantial
increases in sample size even when study power was
adequate.
 ADs resulted in especially large SS increases if based on
unconditional power or a large Nmax.
 ADs became less efficient as treatment effects became less
favorable and required larger SS increases.
 ADs (compared to FS design) remained substantially underpowered given an unfavorable δT with a large SS increase.
• Overall, ADs did not prove to be an effective strategy for
salvaging a trial with poor treatment performance!
Note: ADs are not a fix for poorly planned studies
Specific Risks of Adaptive Designs in NI trials:
Reduction of Treatment Effects, Standards (1)
• In NI trials for AIs, placebo controlled trials (PCTs) are rare.
• NI trials must rely upon assay sensitivity, the ability to distinguish an
effective treatment from an ineffective treatment.
• NI trials should adhere to the following steps:
 NI trials must have historical evidence (about the comparator) from
PCTs to show “sensitivity of drug effect”
 NI trials should be carefully planned, and should adhere closely to
the PCTs from which “sensitivity of drug effect” was determined
 NI trials must justify an acceptable NI margin taking into account
historical data and all relevant clinical & statistical considerations
 Conduct of NI trials should adhere to conduct of historical PCTs
• ADs may compromise any of the latter 3 steps of a NI trial
including justification of an appropriate NI margin.
Specific Risks of Adaptive Designs in NI trials:
Reduction of Treatment Effects, Standards (2)
• ADs can reduce the proportion of the 95% CI (of the treatment
difference) due to σ2 and increase the proportion due to δT, which:
 Allows wider (less favorable) margins for δT
 Changes initial assumptions in choosing δNI
 Reduces standards and complicates cross-study comparisons
• Assuming 50% power in our previous simulations, a treatment
3% worse (than comparator) under a FS design could be up to
5.6% worse under an AD with the same lower bound of the 95% CI near
-10%.

Nmax    FS δT    AD-CP δT     AD-UCP δT
2N      3.0      3.9 (30%)    4.5 (50%)
4N      3.0      4.8 (60%)    5.6 (87%)

δT: percentage points worse than comparator; parentheses: increase relative to FS.
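
The FS value of 3.0 follows from a short calculation: at 50% power the expected lower bound of the 95% CI sits on the margin, so the tolerated deficit is the margin minus the CI half-width. The sketch below assumes the variance is evaluated at the planned 80% cure rate in both arms; the function name is illustrative, and the AD values in the table come from the authors' simulations and are not reproduced here.

```python
from math import sqrt
from statistics import NormalDist


def deficit_at_half_power(n_per_arm, cure_rate=0.80, margin=0.10,
                          alpha_two_sided=0.05):
    """True deficit (treatment worse than comparator) giving 50% power.

    Assumption: standard error evaluated at the planned common cure rate.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha_two_sided / 2)
    se = sqrt(2 * cure_rate * (1 - cure_rate) / n_per_arm)
    return margin - z_alpha * se


fs_deficit = deficit_at_half_power(252)
print(round(100 * fs_deficit, 1))  # deficit in percentage points at N = 252

# A larger N narrows the CI, so a larger deficit still reaches 50% power.
print(deficit_at_half_power(504) > fs_deficit)
```

This is the mechanism behind the bullet above: as N grows, less of the margin is taken up by sampling variability and more is left over for a real shortfall in δT.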
• In NI trials for AIs, bio-creep may also be more likely:
Specific Risks of Adaptive Designs in NI trials: Bio-Creep
(Non-inferiority margin = 10%)
[Figure: bar chart of clinical cure rate (%, axis 0 to 80) for Drug A, Drug B, Drug C, Drug D, Placebo and Drug E; bar labels 80%, 70%, 60%, 40%, 50%.]
Specific Risks of Adaptive Designs in NI trials:
Confounders (1)
Assessing a Clinically Meaningful Treatment Benefit Over Placebo
• Reductions of T (standards) and potential bio-creep and
assay sensitivity issues can be especially problematic in AI
studies due to the presence of other confounders.
• These confounders include:
 Lack of placebo-controlled trials
 Old or poorly controlled historical trials
 Lack of Constancy Assumption
 Different Doses /Treatment durations
 Heterogeneous historical population
 Different inclusion/exclusion criteria
 Different endpoints used
 Regional differences /change in clinical practice
 Different concomitant or adjunct therapies
Specific Risks of Adaptive Designs in NI trials:
Confounders (2)
Assessing Treatment vs. Comparator
• Additional confounders exist in assessing the treatment
effect relative to active comparator:
 Heterogeneity in disease characteristics at baseline
 Heterogeneity in patient characteristics.
 Severity of the disease
 Combination therapies (for MRSA, gram+)
 Adjunct Therapies
 Concomitant Medications
 IV to Oral Switch
 Misclassification of outcomes
Specific Risks of Adaptive Designs in NI trials:
Operational Bias (1)
• Unblinded interim looks in ADs (or VR) can lead to unwarranted
“data driven adjustments” that compromise the overall study results.
• Such “adjustments” can be more influential (and problematic) in NI
trials where biases tend to make treatments appear more similar.
• ADs can increase opportunities for such “adjustments” due to the
increased flexibility and reduced transparency in procedures.
• Such “adjustments” in NI trials may include but not be limited to:
 Enrolling subjects from favorable sites/investigators
 Enrolling subjects with favorable baseline/disease characteristics
 Enrolling subjects with high expected cure rates
 Influencing trial conduct/outcome assessment of investigators
Specific Risks of Adaptive Designs in NI trials:
Operational Bias (2)
• Assigning an “independent” statistician to conduct the interim
analysis is no guarantee against operational bias.
 Statistician “independence”
 Statistician and Sponsor interests
 Statistician and Sponsor contact
 Sponsor oversight and authority
• There are many other potential sources for leaks in the data
flow!
• Currently, ADs do not have clearly defined or proven
procedures for implementation that can reliably safeguard
against operational biases.
Specific Risks of Adaptive Designs in NI trials:
Misclassification Bias
• Differential outcome misclassification bias can seriously
inflate the type I error rate in both NI and superiority trials.
• In NI trials, non-differential bias can also seriously inflate
the type I error rate.
• This inflation of the type I error rate becomes larger as N
increases (Kim, Goldberg et al., 2001).
• ADs using a greatly increased N under inferiority would
inflate the overall type I error rate more severely.
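
A small calculation shows why non-differential misclassification inflates the type I error in an NI trial and why the inflation grows with N. This is an illustrative sketch, not the model of Kim, Goldberg et al.: it assumes each outcome is recorded incorrectly with the same hypothetical probability `flip_rate` in both arms and both directions, a true difference sitting exactly on the 10% margin, and a Wald-type test.

```python
from math import sqrt
from statistics import NormalDist


def ni_false_positive_rate(n_per_arm, flip_rate, p_ctrl=0.80, margin=0.10,
                           alpha_two_sided=0.05):
    """Probability of declaring NI when the treatment is truly inferior
    by exactly the margin, with outcomes misclassified at flip_rate."""
    nd = NormalDist()
    p_treat = p_ctrl - margin  # null hypothesis boundary

    def observed(p):
        # Cures recorded as failures and failures recorded as cures.
        return flip_rate + p * (1 - 2 * flip_rate)

    obs_t, obs_c = observed(p_treat), observed(p_ctrl)
    se = sqrt((obs_t * (1 - obs_t) + obs_c * (1 - obs_c)) / n_per_arm)
    z_alpha = nd.inv_cdf(1 - alpha_two_sided / 2)
    return nd.cdf((obs_t - obs_c + margin) / se - z_alpha)


# Hypothetical 10% misclassification at N, 2N and 4N per arm.
rates = [ni_false_positive_rate(n, flip_rate=0.10) for n in (252, 504, 1008)]
print(rates)
```

Misclassification pulls the observed cure rates toward each other, so the observed difference is smaller than the true one; with no misclassification the rate stays at the nominal one-sided .025, and with it the rate rises as N increases.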
Conclusions
• ADs can offer increased power and flexibility but may involve
unwanted SS increases due to chance variation.
• ADs can also become inefficient for larger increases in SS.
• ADs can substantially reduce the allowable δT under a FS
design.
• ADs can reduce assay sensitivity in NI studies.
• ADs can introduce serious operational or misclassification
biases in NI studies where statistical testing is very challenging.
• ADs may not be appropriate in a NI trial setting, especially in Phase
III confirmatory trials