ACTIVE CONTROL EQUIVALENCE TRIALS: GETTING IT RIGHT
Susan S. Ellenberg, Ph.D.
University of Pennsylvania School of Medicine
ASENT Comparative Effectiveness Symposium
Bethesda, MD
March 6, 2010

HOW TO COMPARE EFFECTIVENESS?
Retrospective analysis of observational databases
Prospective cohort study using available populations treated according to physician/patient preference
Historically controlled study
Randomized two-arm trial
Randomized three-arm trial (including placebo arm)
Meta-analysis of comparative studies

EFFECT OF RANDOMIZATION
Can assume prognosis is approximately the same on average in each randomized group
For factors that you know are prognostic, and are measuring, you can check on balance across groups
For factors that you don’t know about, randomization allows you not to worry

EFFECT OF RANDOMIZATION
[Diagram: patients with good prognosis (GP) and poor prognosis (PP) are randomized to Treatment A and Treatment B; each arm contains both GP and PP patients]

NONRANDOMIZED COMPARISONS
Typical approach
― Look at outcomes in individuals treated with different therapies
― Develop a model that includes all known risk factors for outcome of interest that were measured and can be extracted from medical record
― Compare outcomes by treatment, adjusting for all available risk factors
― Draw conclusions as usual based on tests of statistical significance

BIG PROBLEM
Factors that we know about usually explain only a limited amount of the variability among subjects
Effects of prognostic factors often dwarf effects of treatment
Unknown prognostic factors associated with choice of treatment may introduce huge biases into nonrandomized comparisons

EXAMPLE
Level of adherence was measured in a placebo-controlled clinical trial
Those who took at least 80% of their pills did much better than the others
This effect was even stronger in the placebo arm than in the treatment arm!
Adjustment for all known risk factors had some impact, but difference between adherers and nonadherers to placebo remained highly significant

CORONARY DRUG PROJECT
                Clofibrate              Placebo
Adherence     N    % mortality      N     % mortality
<80%         357      24.6         882       28.2
>80%         708      15.0        1813       15.1
Coronary Drug Project Research Group, NEJM, 1980

OUTCOMES OF INTEREST
In comparative effectiveness studies many outcomes will be of interest
― Multiple measures of efficacy
― Multiple safety outcomes
― Time on treatment
― Need for concomitant medications
― Cost
Biases may vary in their effect on outcomes

BOTTOM LINE
Comparisons of efficacy and safety based on observational data will always be suspect
Large randomized trials will be the most reliable mechanism for understanding comparative treatment effects
Trials comparing two or more active treatments must be carefully designed in order to yield interpretable results

RANDOMIZED TRIALS TO COMPARE EFFECTS OF ACTIVE TREATMENTS
Interpretation of “similarity” is often difficult
Problems have been well described in the context of investigational drug trials
Comparing two marketed drugs will be complicated in the same way
― One is better than the other: straightforward
― They look about the same: difficult to interpret

EVALUATION OF NEW TREATMENT
Superiority trial
― Trial in which the intent is to prove that a new treatment is better than placebo or a standard treatment
Noninferiority trial
― Trial in which the intent is to prove that a new treatment is ABOUT AS GOOD as a standard treatment and can therefore be assumed effective

THE PROBLEM
Conclusion of noninferiority requires a critical assumption: that the effect of the active control in this study is at least as good as it was in earlier studies
Similar to assumptions made in historically controlled studies
Validity of conclusion rests on unverifiable assumption of consistency of effect across studies (and over time)

NONINFERIORITY TRIALS: PROBLEM
Treatment X has been compared to placebo in 5
studies, with effects of 10, 4, 16, 0 and 8.
The 3 highest effect sizes were significant; drug was approved.
Trials all designed similarly, with adequate power, and performed in apparently similar populations.
New treatment Y is compared to treatment X. Outcomes are similar.
Is treatment Y effective?
― maybe yes, if X had effect of 8 or more
― maybe no, if X had effect of 4 or less
Without placebo, don’t know effect of X in trial

IF WE KNOW A TREATMENT IS EFFECTIVE, WHY DO WE WORRY THAT IT MIGHT BE INEFFECTIVE IN ANY GIVEN STUDY?
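
The Treatment X example can be made concrete with a small sketch. It uses the five historical effects quoted on the slide and asks what a "Y looks the same as X" result would imply under each of them. The function names and the simple additive reasoning (effect of Y = assumed effect of X + observed difference) are illustrative assumptions, not part of the talk, and the sketch ignores sampling error entirely.

```python
from statistics import mean

# Placebo-controlled effects of Treatment X in its five earlier trials
# (values taken from the slide).
HISTORICAL_EFFECTS_OF_X = [10, 4, 16, 0, 8]


def implied_effect_of_y(assumed_effect_of_x, observed_y_minus_x=0.0):
    """Effect of Y versus a placebo arm that was never run.

    assumed_effect_of_x is an assumption: an active-control trial
    without a placebo arm gives no estimate of it.
    """
    return assumed_effect_of_x + observed_y_minus_x


def summarize(effects):
    """Return the mean and range of the historical effects of X."""
    return {"mean": mean(effects), "min": min(effects), "max": max(effects)}


if __name__ == "__main__":
    print(summarize(HISTORICAL_EFFECTS_OF_X))
    # Y looked the same as X in the new trial (observed difference of 0).
    for x_effect in sorted(HISTORICAL_EFFECTS_OF_X):
        y_effect = implied_effect_of_y(x_effect)
        print(f"if X's effect here was {x_effect:>2}, "
              f"Y's implied effect is {y_effect:>4.1f}")
```

The same observed result (no difference between Y and X) is compatible with an implied effect of Y anywhere from 0 to 16, depending on which historical trial the new study happens to resemble.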

CONSISTENCY OFTEN DOESN’T HOLD
Pain
Depression
Anxiety
Allergic Rhinitis
GERD
Hypertension

OUTCOMES CAN VARY GREATLY
In studies in many areas, particularly symptom-relieving treatments, non-inferiority trials are often unreliable, for many reasons:
Symptoms wax and wane
Widely varying response rates
Modest effect sizes
High placebo response rates
Effect measures are variable

PROBLEM IS WELL DOCUMENTED
In many of these areas, common to perform 3-arm studies to evaluate a new drug, including both a placebo and an active control
Frequently, active control appears no better than placebo
Allows differentiation between a treatment that didn’t work and a study that didn’t work

IMPLICATION FOR TRIAL DESIGN?
Unfortunately, we don’t know in any given trial why an active drug appears no better than placebo
Not primarily a question of sample size; same pattern in larger studies as smaller studies
Many things about trial populations and environments that we can’t or don’t measure
― Some investigators may interact with subjects in a way to enhance “placebo response”
― Some subjects are inherently nonresponsive to treatments
― Some subjects are responsive to some treatments but not others

ASSAY SENSITIVITY
The ability of a study to distinguish between active and inactive treatments is called “assay sensitivity”
Trials in many medical areas do not have this property
Without assay sensitivity, equivalence to active control cannot prove efficacy of new treatment

DETERMINING ASSAY SENSITIVITY
Determination of assay sensitivity depends on prior history of placebo-active control comparisons
Determination of appropriate margin depends on prior estimates of effectiveness of active control
Conclusions about effectiveness therefore rely on historically-based assumptions

RELEVANCE TO COMPARATIVE EFFECTIVENESS STUDIES
Even if both treatments being compared are known to be effective, they may be ineffective in any given study
Finding of similar outcomes may mean effects
really are similar, or that study lacked assay sensitivity
In the latter case, finding of similar outcomes is not informative about relative effectiveness

OTHER ISSUES WITH ACTIVE CONTROL TRIALS
Study size
― Negative study should document no clinically meaningful difference, not just no statistically significant difference
Study quality
― Problems such as excessive dropouts and losses to follow-up, missing data and data errors, protocol violations, etc., tend to dilute any true treatment differences

SOLUTIONS?
Very large studies conducted at many sites
― More confidence that observed effects are representative
― Lack of observed effect less likely to result from lack of assay sensitivity
― Narrow confidence intervals for differences
Sufficient follow-up to observe longer-term effects and (for chronic therapies) rates of discontinuation
Careful attention to study quality
― Randomization
― Meticulous follow-up
― Blinded adjudication of outcomes

OTHER ISSUES TO WORRY ABOUT
Multiple outcomes of interest
Results in subgroups

TRIALS TO COMPARE EFFECTIVENESS
Specific outcome measures
― Primary efficacy outcome
― Secondary efficacy outcomes
― Duration of benefit
― Need for augmentation of therapy
― Pre-specified safety outcomes
― Newly arising safety outcomes
― Compliance
― Economic considerations

TRIALS TO COMPARE EFFECTIVENESS
In a representative, diverse population there will be interest in whether there are differences among subpopulations in responsiveness to therapy
This is where comparative effectiveness and personalized medicine converge
But we will find differences even if there aren’t any
― Probability of finding subgroups in which results differ from overall results is high

CLOSING COMMENTS
Comparing available drugs will not be as straightforward as many have implied
Similar outcomes could be due to
― Lack of assay sensitivity
― Sloppy study conduct
― True similarity of effects
Many opportunities for spurious findings
What is the right balance for
comparative effectiveness research vs drug discovery and development?
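
The point about spurious subgroup findings can be illustrated with a rough calculation. The sketch below assumes that each subgroup is tested independently at a fixed significance level and that no true differences exist. The independence assumption, the 0.05 level, and the function name are illustrative choices rather than anything stated in the talk. Real subgroups usually overlap, so the numbers are only indicative.

```python
def prob_spurious_subgroup(n_subgroups, alpha=0.05):
    """Chance that at least one of n independent subgroup tests comes out
    'significant' at level alpha when no true difference exists.

    Independence between subgroup tests is an assumption made here
    to keep the arithmetic simple.
    """
    if n_subgroups < 0:
        raise ValueError("n_subgroups must be non-negative")
    return 1.0 - (1.0 - alpha) ** n_subgroups


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(f"{k:>2} subgroups: "
              f"P(at least one spurious finding) = {prob_spurious_subgroup(k):.3f}")
```

Under these assumptions, examining ten subgroups gives roughly a 40% chance of at least one nominally significant difference even when the treatments are truly identical in every subgroup.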