Design and Analysis of Clinical Trials
Martin L. Lesser, Ph.D.
Biostatistics Unit
Feinstein Institute for Medical Research
North Shore – Long Island Jewish Health System

CME Disclosure Statement
• The North Shore-LIJ Health System adheres to the ACCME's new Standards for Commercial Support. Any individuals in a position to control the content of a CME activity, including faculty, planners and managers, are required to disclose all financial relationships with commercial interests. All identified potential conflicts of interest are thoroughly vetted by North Shore-LIJ for fair balance and scientific objectivity and to ensure appropriateness of patient care recommendations.
• Course Director, Kevin Tracey, has disclosed a commercial interest in Setpoint, Inc. as the cofounder, for stock and consulting support. He has resolved his conflicts by identifying a faculty member to conduct content review of this program who has no conflicts.
• The speaker, Martin L. Lesser, PhD, has no conflicts.

Types of Clinical Trials
Phase I - exploratory; assessment of toxicity; determination of safe dosage; pharmacokinetics
Phase II - evaluation of efficacy in a select group of patients; estimation of treatment effect
Phase III - comparative trial; hypothesis testing
Phase IV - establish new indication; post-marketing surveillance

Phase I Designs
• “3+3” dose escalation design for determining maximum tolerated dose (MTD)
• Fixed multiple dose design (e.g., randomize 5 subjects to each of 5 doses)
• Goal: design should protect subjects from harm, especially in a trial for which safe dosing, pharmacokinetics, and potential toxicities are unknown or poorly understood

X1 = # of DLTs in first cohort of 3
X2 = # of DLTs in second cohort of 3
DLT = dose limiting toxicity
Source: Jovanovic, et al. 2004

Phase II Designs
• Applied to a specific disease entity
• Fixed dose is used
• Simple primary outcome: response, measurement of some parameter
• Single arm, open label (traditional)
• Single arm, blinded evaluator (uncommon)
• Simon 2-stage design
• Randomized Phase II trial (for selection of best therapy)

Simon 2-Stage Optimal Design
• H0: p ≤ p0 vs. HA: p ≥ p1
• Where response rate ≤ p0 is uninteresting and response rate ≥ p1 is the desired target
• Simon’s “Optimal Design”: Observe n1 subjects in stage 1. If response rate r1 ≤ a1/n1, then stop the trial and reject the drug.
• If r1 > a1/n1, then study an additional n2 subjects in stage 2, for a total of n = n1 + n2. If the “total” response rate r ≤ a/n, then reject the drug. If r > a/n, then consider the drug for further testing and Phase III trials.
Simon: Controlled Clin Trials, 10:1-10, 1989.

Simon 2-Stage Optimal Design (cont’d)
• For given α, β, p0, and p1, this design minimizes EN(p0), the expected number of subjects studied under H0.
• Example: Let α=0.05, β=0.20, p0=0.30, p1=0.45
  Stage 1: Enter 27 subjects; stop trial and reject drug if r1 ≤ 9/27. If r1 > 9/27, then go on to Stage 2.
  Stage 2: Enter 54 additional subjects (total=81). If r ≤ 30/81, then reject the drug. If r > 30/81, then trial is favorable toward drug.
  Note: EN(p0) = 41.7. Prob(early termination) = 0.73

Simon 2-Stage Minimax Design
• Similar to the 2-stage optimal design
• Minimizes the maximum total sample size (n) among all two-stage designs that satisfy the α and β constraints
• Minimax design is attractive when subject accrual is low
• Previous example worked with minimax: r1 ≤ 16/46, r ≤ 25/65, EN(p0)=49.6, PET(p0)=0.81 (Optimal design had n=81.)

Phase III Trials

Design Considerations
Purpose of study; What is the question?
- Primary and secondary questions
- Operationalizing the question (definition of response, survival, pain, quality of life, etc.)
Patient population
- Target population, sampling frame
- Inclusion/exclusion criteria
- Comparability of patients, equivalent baseline workups

Design Considerations (continued)
• Treatment Plan
• Blinding
• Use of Placebo Control
• Criteria for evaluation of treatment effect (comparability of patient follow-up)

Design Considerations (continued)
General study design structure for comparative studies
- randomized controls
- concurrent non-randomized controls
- historical controls (Phase II and III)
- patient as own control (cross-over design)

Randomized Controls
Advantages
- Reduces or eliminates bias because chance, alone, determines assignment
- Assures that most statistical methods will be valid
Disadvantages
- Can be expensive, labor intensive
- Patients may refuse randomization, resulting in bias
- Potential ethical problems
- May upset the patient-physician relationship
- Not feasible if contamination is likely

Concurrent Non-Randomized Controls
Advantages
- Useful when randomization is not feasible
- Useful in group or community interventions
- Usually less cost/effort than randomized trials
Disadvantages
- Assignment to treatment may be biased
- May require matching or post-hoc adjustments

Historical Controls
Advantages
- Data already exist
- Relatively inexpensive
- Ethical problems of randomization are avoided
- Often requires fewer patients on new treatment
Disadvantages
- HCs and current group subjects may differ on:
  - Method/criteria for selection
  - Diagnostic and/or follow-up criteria
- Disease epidemiology, etiology, or natural history may have changed
- Difficult to protect against unknown biases
- Some data elements may not be available in the HC era

Patient As Own Control
Advantages
- Reduces variance, often resulting in smaller required sample sizes
Disadvantages
- Only useful in certain disease settings
- May introduce "order" effects
- Nature of intervention may be influenced by results of first study period

Design Considerations (continued)
• Blinding
• Placebo control
• Stratification
• The process of randomization
• Handling dropouts and non-compliance
• Statistical methods for data analysis
• Sample size and power
• Interim analysis and early stopping

Blinding
• Any attempt to make study participants unaware of which treatment is offered
• Is indicated when the occurrence and reporting of outcomes can be easily influenced by knowledge of treatment (subjective responses, behavior change)
• May be either single blind or double blind
• Blinding is not always feasible
• Blinding may be unsuccessful (ability to break the blind)

Placebo Control
• Appropriate when no effective standard treatment exists for the control group
• Makes subjects’ attitudes to the trial as similar as possible in the treatment and control groups
• Major uses:
  - Controls for psychological factors
  - Maintains double blind design
  - Controls for spontaneous disease variability
• Ethical issues:
  - May be unethical to withhold treatment in order to administer placebo

Stratification
- Randomization does not guarantee that prognostic factors will be evenly distributed between treatment groups
- Imbalance can be partly addressed by stratification prior to randomization
- Imbalance can also be addressed by covariate adjustment at the time of analysis

Stratification: An Example

NO STRATIFICATION
          Low Risk    High Risk   Total   Response Rate
Chemo     27 (30%)    62 (70%)      89    25%
RT        56 (60%)    38 (40%)      94    64%
Total     83          100          183
Observed difference is confounded by the prognostic factor

RANDOMIZE WITHIN STRATA
          Low Risk (n=83)   High Risk (n=100)   Response Rate
Chemo     40 (45%)          49 (55%)            25%
RT        43 (46%)          51 (54%)            64%
Observed difference is not confounded by the prognostic factor

The Process of Randomization
- simple randomization
- permuted block randomization
- unbalanced randomization
- randomized consent form

Examples of permuted block randomization
- B=1  AAAABABAAAAAABBB (11 A, 5 B)
- B=4  ABBA AABB BABA BABA (8 A, 8 B)
- B=6  AABABB ABABBA AAAB (9 A, 7 B)

Dropouts and Non-Compliance
Intention to Treat Principle
- analyze as randomized
- evaluates the effect of a treatment "policy"
Analyze as Treated Principle
- exclude dropouts
- adjust for compliance or dose received
- evaluates the effect of the "active ingredient" (but in a possibly biased subset of patients)

Dropouts and Non-Compliance
Examples
- Patients with head and neck cancer randomized to nasogastric feeding tube or good oral nutrition
  - Outcome = weight
  - Some patients "cross over" from NG tube to oral nutrition arm
- Patients with familial polyposis randomized to high fiber or low fiber diets
  - Outcome = number and size of new polyps
  - Some patients do not eat the required amount of high fiber cereal; dose of fiber varies from patient to patient

Example: RCT in Head and Neck Cancer
Assuming Full (100%) Compliance in Group A

Group                      n     µ, σ           Weight Gain (lbs.)
A  NG Feeding Tube         50    µ=8.0, σ=3     7.57 ± 2.84
B  Best Oral Nutrition     50    µ=5.0, σ=3     4.61 ± 3.01
A vs. B: P<0.0001

Example: RCT in Head and Neck Cancer
Assuming 50% Compliance in Group A

Group                                                       n     µ, σ           Weight Gain (lbs.)
A   NG Feeding Tube (all randomized to A)                   50                   6.39 ± 2.74
A1  Pull out NG tube and default to best oral nutrition     25    µ=4.5, σ=3     5.00 ± 2.42
A2  Compliant with NG tube                                  25    µ=8.0, σ=3     7.78 ± 2.34
B   Best Oral Nutrition                                     50    µ=5.0, σ=3     5.66 ± 2.98
A1+B                                                                             5.44 ± 2.81

Comparison           P value
A vs. B (ITT)        p=0.2098
A2 vs. B             p=0.0028
A2 vs. A1+B          p=0.0009
A1 vs. A2 vs. B      p=0.0003

Statistical Methods Commonly Used in the Analysis of Clinical Trials Data
Binary response data
- chi-square, Fisher exact test
- multiple logistic regression
Survival, duration of response, and time until event data
- Kaplan-Meier product limit method
- logrank test, Gehan-Wilcoxon test
- Cox proportional hazards regression
Continuous-type data
- analysis of variance
- ordinary multiple regression

Sample Size Considerations
- Concept of power
- Type of endpoint/outcome variable
- Specification of clinically significant difference of interest
- Estimation and confidence intervals
- Multiple endpoints, Bonferroni correction
- Tables of sample size and power

Patient Flow in Clinical Trials
- Available
- Considered
- Eligible
- Consented
- Enrolled
- Compliant
- Adequately Followed

Sample Size/Power Example
Suppose the response rate using standard therapy (A) is assumed to be 30%. The investigator would like to see an increase in the response rate to at least 50% (with treatment B) in order for it to be considered clinically useful. A trial of A vs. B would require 125 patients in each group in order to have a 90% chance (power) of detecting a difference of this magnitude or larger (two-tailed test, 5% significance level).
Other calculations:
- n=93/group to achieve 80% power to detect response rates of 30% vs. 50%
- n=56/group to achieve 90% power to detect response rates of 30% vs. 60%
- n=42/group to achieve 80% power to detect response rates of 30% vs. 60%
- n=1842/group to achieve 90% power to detect response rates of 30% vs. 35%

Interim Analysis and Early Stopping
Dangers of naive interim analysis
- increases Type I error rate (significance level)
- increases bias with respect to "expected" results
- data lags may influence interim results
Statistically sound stopping rules (i.e., rules that maintain the Type I error rate and desired power)
- group sequential analysis (O'Brien-Fleming, Pocock, Lan-Demets, etc.)
- curtailed sampling "individual" sequential testing
- conditional power
Early stopping depends on formal statistics as well as on other factors

Example: The BHAT Trial (Beta-blocker Heart Attack Trial)
• Randomized, double-blind, placebo-controlled trial to test the effect of propranolol (beta-blocker) on total mortality
• n = 3837 patients randomized to propranolol or placebo
• Trial was stopped 1 year early (on the 6th interim analysis) using the O-F group sequential approach when logrank Z = 2.82 > 2.23

[Figure: O'Brien-Fleming Boundaries Applied to the BHAT Trial]
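
The boundary value of 2.23 quoted for the BHAT trial can be roughly reproduced with the usual O'Brien-Fleming shape, in which the critical value at look k of K equally spaced looks is c·sqrt(K/k). The sketch below is illustrative only. It assumes 7 planned analyses and a two-sided 5% constant of about 2.063 (neither is stated in the slides), and it reads the reported 2.82 as a standardized (Z-scale) logrank statistic. The function and variable names are hypothetical.

```python
import math


def obrien_fleming_bounds(num_looks, final_critical_value):
    """Approximate O'Brien-Fleming critical values for equally spaced looks.

    The boundary at look k is c * sqrt(K / k), where K is the number of
    planned looks and c is the critical value used at the final look.
    """
    if num_looks < 1:
        raise ValueError("num_looks must be at least 1")
    return [
        final_critical_value * math.sqrt(num_looks / k)
        for k in range(1, num_looks + 1)
    ]


# ASSUMPTIONS (not given in the slides): 7 planned, equally spaced analyses
# and a tabulated two-sided 5% O'Brien-Fleming constant of about 2.063.
PLANNED_LOOKS = 7
OBF_CONSTANT = 2.063

bounds = obrien_fleming_bounds(PLANNED_LOOKS, OBF_CONSTANT)
observed_z = 2.82  # statistic reported at the 6th interim analysis

for look, bound in enumerate(bounds, start=1):
    print(f"look {look}: stop if |Z| > {bound:.2f}")

# Index 5 is the 6th look; the reported statistic exceeds that boundary.
print("boundary crossed at look 6:", observed_z > bounds[5])
```

Under these assumptions the early boundaries are very strict and relax toward the final-look constant, which is why an O'Brien-Fleming rule rarely stops a trial on its first few looks while costing little power at the end.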