ACTIVE CONTROL EQUIVALENCE TRIALS: GETTING IT RIGHT
Susan S. Ellenberg, Ph.D.
University of Pennsylvania School of Medicine
ASENT Comparative Effectiveness Symposium
Bethesda, MD
March 6, 2010
HOW TO COMPARE EFFECTIVENESS?
Retrospective analysis of observational databases
Prospective cohort study using available populations treated according to physician/patient preference
Historically controlled study
Randomized two-arm trial
Randomized three-arm trial (including placebo arm)
Meta-analysis of comparative studies
2
EFFECT OF RANDOMIZATION
Can assume prognosis is approximately the same on average in each randomized group
For factors that you know are prognostic, and are measuring, you can check on balance across groups
For factors that you don’t know about, randomization allows you not to worry
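
The balancing claim above can be illustrated with a small simulation. This is only a sketch under invented assumptions: the 40% share of good-prognosis patients, the sample size, and the function name `simulate_randomization` are hypothetical and do not come from the talk.

```python
import random

def simulate_randomization(n_patients=10_000, p_good=0.4, seed=0):
    """Randomize patients 1:1 and return the good-prognosis share in each arm."""
    rng = random.Random(seed)
    arms = {"A": [], "B": []}
    for _ in range(n_patients):
        # Prognostic factor the investigators never measure (hypothetical 40% prevalence).
        good_prognosis = rng.random() < p_good
        # Coin-flip assignment that ignores the patient's prognosis.
        arm = rng.choice("AB")
        arms[arm].append(good_prognosis)
    return {arm: sum(flags) / len(flags) for arm, flags in arms.items()}

shares = simulate_randomization()
print(shares)
```

Because assignment ignores prognosis, the two shares should land close to each other and close to `p_good`, even though the factor was never used in the design.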
3
EFFECT OF RANDOMIZATION
[Figure: patients with good prognosis (GP) and poor prognosis (PP) are randomized to Treatment A or Treatment B; each arm contains both GP and PP patients.]
4
NONRANDOMIZED COMPARISONS
Typical approach
―Look at outcomes in individuals treated with different therapies
―Develop a model that includes all known risk factors for outcome of interest that were measured and can be extracted from medical record
―Compare outcomes by treatment, adjusting for all available risk factors
―Draw conclusions as usual based on tests of statistical significance
5
BIG PROBLEM
Factors that we know about usually explain only a limited amount of the variability among subjects
Effects of prognostic factors often dwarf effects of treatment
Unknown prognostic factors associated with choice of treatment may introduce huge biases into nonrandomized comparisons
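
A minimal sketch of how such a bias can arise, using invented numbers: neither treatment affects mortality here, but an unmeasured "frailty" factor drives both the choice of treatment and the outcome. All probabilities and the name `observed_difference` are hypothetical.

```python
import random

def observed_difference(n_patients=20_000, seed=1):
    """Mortality difference (B minus A) when neither treatment has any effect."""
    rng = random.Random(seed)
    deaths = {"A": [], "B": []}
    for _ in range(n_patients):
        frail = rng.random() < 0.5          # unknown prognostic factor
        # Assumption: frail patients are steered toward treatment B.
        p_choose_b = 0.8 if frail else 0.2
        arm = "B" if rng.random() < p_choose_b else "A"
        # Outcome depends on frailty only, never on the treatment received.
        p_death = 0.30 if frail else 0.10
        deaths[arm].append(rng.random() < p_death)
    rate = {arm: sum(d) / len(d) for arm, d in deaths.items()}
    return rate["B"] - rate["A"]

bias = observed_difference()
print(round(bias, 3))
```

Under these assumptions treatment B looks clearly worse than A in the nonrandomized comparison, although the true treatment effect is zero. Adjusting for measured covariates would not help, because frailty is not among them.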
6
EXAMPLE
Level of adherence was measured in a placebo-controlled clinical trial
Those who took at least 80% of their pills did much better than the others
This effect was even stronger in the placebo arm than in the treatment arm!
Adjustment for all known risk factors had some impact, but the difference between adherers and nonadherers to placebo remained highly significant
7
CORONARY DRUG PROJECT
                Clofibrate              Placebo
Adherence       N       % mortality     N       % mortality
<80%            357     24.6            882     28.2
≥80%            708     15.0            1813    15.1
Coronary Drug Project Research Group, NEJM, 1980
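
The adherence gap in each arm can be recomputed from the figures on this slide. The sketch below uses only the N and percent-mortality values shown above. The pooled two-proportion z statistic is an unadjusted illustration chosen for this example and is not the analysis reported by the Coronary Drug Project Research Group.

```python
from math import sqrt

# (N, % mortality) as transcribed from the slide
cdp = {
    "clofibrate": {"low": (357, 24.6), "high": (708, 15.0)},
    "placebo": {"low": (882, 28.2), "high": (1813, 15.1)},
}

def adherence_gap(arm):
    """Mortality difference in percentage points, poor minus good adherers."""
    return cdp[arm]["low"][1] - cdp[arm]["high"][1]

def two_proportion_z(arm):
    """Unadjusted pooled two-proportion z statistic for the adherence gap."""
    (n1, pct1), (n2, pct2) = cdp[arm]["low"], cdp[arm]["high"]
    p1, p2 = pct1 / 100, pct2 / 100
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

for arm in cdp:
    print(arm, round(adherence_gap(arm), 1), round(two_proportion_z(arm), 1))
```

The gap is larger in the placebo arm than in the clofibrate arm, which is the point of the example: taking an inert pill cannot explain it, so adherence must be standing in for prognostic factors that were not captured.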
8
OUTCOMES OF INTEREST
In comparative effectiveness studies many outcomes will be of interest
―Multiple measures of efficacy
―Multiple safety outcomes
―Time on treatment
―Need for concomitant medications
―Cost
Biases may vary in their effect on outcomes
9
BOTTOM LINE
Comparisons of efficacy and safety based on observational data will always be suspect
Large randomized trials will be most reliable mechanism for understanding comparative treatment effects
Trials comparing two or more active treatments must be carefully designed in order to yield interpretable results
10
RANDOMIZED TRIALS TO COMPARE EFFECTS OF ACTIVE TREATMENTS
Interpretation of “similarity” is often difficult
Problems have been well described in the context of investigational drug trials
Comparing two marketed drugs will be complicated in the same way
― One is better than the other: straightforward
― They look about the same: difficult to interpret
11
EVALUATION OF NEW TREATMENT
Superiority trial
―Trial in which the intent is to prove that a new treatment is better than placebo or a standard treatment
Noninferiority trial
―Trial in which the intent is to prove that a new treatment is ABOUT AS GOOD as a standard treatment and can therefore be assumed effective
12
THE PROBLEM
Conclusion of noninferiority requires a critical assumption: that the effect of the active control in this study is as good as or better than it was in earlier studies
Similar to assumptions made in historically controlled studies
Validity of conclusion rests on unverifiable assumption of consistency of effect across studies (and over time)
13
NONINFERIORITY TRIALS: PROBLEM
Treatment X has been compared to placebo in 5 studies, with effects of 10, 4, 16, 0 and 8. The 3 highest effect sizes were significant; drug was approved. Trials all designed similarly, with adequate power, and performed in apparently similar populations.
New treatment Y is compared to treatment X. Outcomes are similar. Is treatment Y effective?
― maybe yes, if X had effect of 8 or more
― maybe no, if X had effect of 4 or less
Without placebo, don’t know effect of X in trial
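
The arithmetic behind this slide can be sketched in a few lines. The five historical effects are taken from the slide. The helper `implied_y_effect` and the assumption that Y and X differ by exactly zero in the new trial are illustrative only.

```python
from statistics import mean

# Effects of X versus placebo in the five earlier studies (from the slide)
historical_x_effects = [10, 4, 16, 0, 8]

def implied_y_effect(x_effect_in_trial, y_minus_x=0.0):
    """Effect of Y versus placebo, given X's unobserved effect in this trial."""
    return x_effect_in_trial + y_minus_x

average_x_effect = mean(historical_x_effects)
# If Y matches X exactly, Y's effect is whatever X happened to achieve this time.
scenarios = {x: implied_y_effect(x) for x in sorted(historical_x_effects)}

print(average_x_effect)
print(scenarios)
```

With similar outcomes for Y and X, the implied effect of Y ranges from 0 to 16 depending on which historical result the current trial resembles, and without a placebo arm there is no way to tell which one applies.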
17
IF WE KNOW A TREATMENT IS EFFECTIVE, WHY DO WE WORRY THAT IT MIGHT BE INEFFECTIVE IN ANY GIVEN STUDY?
18
CONSISTENCY OFTEN DOESN’T HOLD
Pain
Depression
Anxiety
Allergic Rhinitis
GERD
Hypertension
19
OUTCOMES CAN VARY GREATLY
In studies in many areas, particularly those of symptom-relieving treatments, non-inferiority trials are often unreliable, for many reasons
―Symptoms wax and wane
―Widely varying response rates
―Modest effect sizes
―High placebo response rates
―Effect measures are variable
20
PROBLEM IS WELL DOCUMENTED
In many of these areas, it is common to perform 3-arm studies to evaluate a new drug, including both a placebo and an active control
Frequently, active control appears no better than placebo
The 3-arm design allows differentiation between a treatment that didn’t work and a study that didn’t work
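
One way to read a 3-arm result, written as a small decision function. This is a sketch of the logic on this slide only: the function name, the boolean inputs, and the returned labels are hypothetical, and "beats placebo" stands for whatever pre-specified comparison the trial uses.

```python
def interpret_three_arm(new_beats_placebo: bool, control_beats_placebo: bool) -> str:
    """Classify a three-arm (new drug, active control, placebo) trial result."""
    if new_beats_placebo:
        return "new treatment showed an effect in this study"
    if control_beats_placebo:
        # The study detected the established drug, so it could tell active from inactive.
        return "treatment didn't work"
    # Even the established drug looked like placebo, so the study is uninformative.
    return "study didn't work"

print(interpret_three_arm(False, True))
print(interpret_three_arm(False, False))
```

A two-arm trial of the new drug against the active control has no way to separate the last two cases, because the placebo comparison is missing.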
21
IMPLICATION FOR TRIAL DESIGN?
Unfortunately, we don’t know in any given trial why an active drug appears no better than placebo
Not primarily a question of sample size; same pattern in larger studies as smaller studies
Many things about trial populations and environments that we can’t or don’t measure
― Some investigators may interact with subjects in a way to enhance “placebo response”
― Some subjects are inherently nonresponsive to treatments
― Some subjects are responsive to some treatments but not others
22
ASSAY SENSITIVITY
The ability of a study to distinguish between active and inactive treatments is called “assay sensitivity”
Trials in many medical areas do not have this property
Without assay sensitivity, equivalence to active control cannot prove efficacy of new treatment
23
DETERMINING ASSAY SENSITIVITY
Determination of assay sensitivity depends on prior history of placebo-active control comparisons
Determination of appropriate margin depends on prior estimates of effectiveness of active control
Conclusions about effectiveness therefore rely on historically-based assumptions
24
RELEVANCE TO COMPARATIVE EFFECTIVENESS STUDIES
Even if both treatments being compared are known to be effective, they may be ineffective in any given study
Finding of similar outcomes may mean effects really are similar; or that study lacked assay sensitivity
In the latter case, finding of similar outcomes is not informative about relative effectiveness
25
OTHER ISSUES WITH ACTIVE CONTROL TRIALS
Study size
―Negative study should document no clinically meaningful difference, not just no statistically significant difference
Study quality
―Problems such as excessive dropouts and losses to follow-up, missing data and data errors, protocol violations, etc., tend to dilute any true treatment differences
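
The study-size point can be made concrete with a confidence interval for a difference in event rates. Everything numeric below is hypothetical: the event counts, the 5-percentage-point margin for a "clinically meaningful" difference, and the use of a simple Wald interval are assumptions made for illustration.

```python
from math import sqrt

def diff_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Approximate 95% Wald confidence interval for the difference in proportions."""
    p_a, p_b = events_a / n_a, events_b / n_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_a - p_b
    return diff - z * se, diff + z * se

def classify(ci, margin):
    """Return (statistically significant?, meaningful difference ruled out?)."""
    low, high = ci
    significant = low > 0 or high < 0
    within_margin = -margin < low and high < margin
    return significant, within_margin

margin = 0.05                                # hypothetical clinically meaningful difference
small = diff_ci(30, 100, 29, 100)            # 30% vs 29%, 100 patients per arm
large = diff_ci(1500, 5000, 1450, 5000)      # same rates, 5000 patients per arm

print(classify(small, margin))
print(classify(large, margin))
```

Neither trial shows a statistically significant difference, but only the large one has an interval narrow enough to exclude a difference as big as the chosen margin. The small trial is "negative" without documenting similarity.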
26
SOLUTIONS?
Very large studies conducted at many sites
― More confidence that observed effects are representative
― Lack of observed effect less likely to result from lack of assay sensitivity
― Narrow confidence intervals for differences
Sufficient follow-up to observe longer-term effects and (for chronic therapies) rates of discontinuation
Careful attention to study quality
― Randomization
― Meticulous follow-up
― Blinded adjudication of outcomes
27
OTHER ISSUES TO WORRY ABOUT
Multiple outcomes of interest
Results in subgroups
28
TRIALS TO COMPARE EFFECTIVENESS
Specific outcome measures
―Primary efficacy outcome
―Secondary efficacy outcomes
―Duration of benefit
―Need for augmentation of therapy
―Pre-specified safety outcomes
―Newly arising safety outcomes
―Compliance
―Economic considerations
29
TRIALS TO COMPARE EFFECTIVENESS
In a representative, diverse population there will be interest in whether there are differences among subpopulations in responsiveness to therapy
This is where comparative effectiveness and personalized medicine converge
But we will find differences even if there aren’t any
―Probability of finding subgroups in which results differ from overall results is high
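
A rough sense of how high that probability is, assuming the subgroup tests are independent, each uses the same significance level, and there are no true subgroup differences at all. These assumptions and the function name `prob_spurious_subgroup` are introduced here for illustration; real subgroup analyses are usually correlated.

```python
def prob_spurious_subgroup(n_subgroups, alpha=0.05):
    """Chance that at least one of n independent null tests is significant."""
    return 1 - (1 - alpha) ** n_subgroups

for k in (1, 5, 10, 20):
    print(k, round(prob_spurious_subgroup(k), 3))
```

Under these assumptions, ten subgroup comparisons give roughly a 40% chance of at least one "significant" difference, and twenty give roughly 64%.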
30
CLOSING COMMENTS
Comparing available drugs will not be as straightforward as many have implied
Similar outcomes could be due to
―Lack of assay sensitivity
―Sloppy study conduct
―True similarity of effects
Many opportunities for spurious findings
What is the right balance for comparative effectiveness research vs drug discovery and development?
31