Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Basket Designs of Phase III Confirmatory Trials Cong Chen Director, BARDS, Merck & Co., Inc., Kenilworth, NJ, USA ASA NJ Chapter Symposium, 6/17/2016 Acknowledgment • Co-authors on the present work: – Robert A. Beckman, Lombardi Comprehensive Cancer Center and Innovation Center for Biomedical Informatics, Georgetown University Medical Center – Zoran Antonijevic, Amgen – Rasika Kalamegham, Genentech – Wen Li, Merck – Nicole Li, Merck • Pathway design subteam, Small Populations workstream, DIA Adaptive Design Scientific Working Group (ADSWG) • During the preparation of the slide deck, we have also received very helpful comments from Eric Rubin and Keaven Anderson from Merck 2 Outline • Background • A general strategy for Phase III basket design – Challenges and recommendations • Pruning and pooling based on same endpoint – Type I error control and operating characteristics • Pruning and pooling based on different endpoints – Type I error control and an application of special interest • Bias in point estimation of treatment effect • Discussions 3 Background 4 Challenges to Oncology Drug Development • The increasing discovery of molecular subtypes of cancer leads to small subgroups that correspond to orphan or “niche” indications, even within larger tumor types • Cancer is becoming largely a collection of diseases defined by molecular subtype with low prevalence • Enrolling enough subjects for confirmatory trials in these indications may be challenging • Exclusive use of “one indication at a time” approaches will not be sustainable • The shift to a molecular view of cancer requires a corresponding paradigm shift in drug development approaches 5 Advanced Approaches • “Umbrella” trials – One tumor type with multiple drugs and predictive biomarkers – Subjects are matched to drugs based on predictive biomarkers – Cooperation among multiple sponsors and hard to pull off – Examples: BATTLE, I-SPY, Lung-MAP • “Basket” or “bucket” trials – Multiple tumor types with one drug and a predictive biomarker – Intends to gain drug approval in multiple tumor types with a common predictive biomarker under the premise that molecular subtype is more fundamental than histology – Single sponsor and easy to pull off 6 The First Basket: Imatinib B2225 186 subjects with 40 different malignancies with known genomic MOA of imatinib target kinases KIT, PDGFRA, or PDGFRB Synovial sarcoma Aggressive fibromatosis Dermatofibrosarcoma protuberans 1/16 (6%) 2/20 (10%) 10/12 (83%) 13 centers in consortium: North America, Europe, Australia Aggressive systemic mastocytosis 1/5 (20%) Imatinib 400- 800 mg BID primary endpoint ORR Hypereosinophilic syndrome Myeloproliferative disorder 6/14 (43%) 4/7 (58%) Led to supplemental indications for these 4 subsets after pooling with other trials and case reports Blumenthal. Innovative trial designs to accelerate the availability of highly effective anti-cancer therapies: an FDA perspective, AACR 2014 7 Phase II Basket Trials to Date • A similar design to Imatinib B2225 was endorsed at a Brookings/Friends Conference in 2011 – Statistical design not published but likely Bayesian • Common features: – Single-arm trials with ORR as primary endpoint – Intend to use pooled population for primary analysis to gain broader indication across tumor types (individual tumor type may not be adequately powered) – Involve possibly transformative medicines in patients with great unmet need and seemingly exceptionally strong scientific rationale – Exploratory and opportunistic in nature, rendering regulatory decision on drug approval a non-binding “review” issue 8 A General Strategy for Phase III Basket Design Pathway Design Subteam • Can we develop a generalizable confirmatory basket design with statistical rigor that follows existing regulatory pathways to increase drug approvability? – Applicable not only to exceptional cases, but to all effective medicines in any line of therapy • This would have multiple benefits: – Increase and accelerate access to effective medicines for patients in niche indications – Provide sponsors with cost-effective options for development in niche indications – Provide health authorities with more robust packages for evaluation of benefit and risk 10 Study Design • Frequentist approach for statistical design and analysis • Randomized and controlled trials (generally preferred) – A shared control group may be used for indications with a common standard of care to reduce sample size • Single arm trials may be appropriate for rare tumors (or common tumors with low prevalence), or drugs that can clearly demonstrate superiority to SOC without using a concurrent control – A high rate of durable objective response may provide evidence for drug approvals – Performance of SOC in biomarker selected population is often less well-known, and patient selection bias is unavoidable 11 General Strategy SELECTION PRUNING (External Data) PRUNING (Internal Data) POOLING 12 Pruning • Initial indications are carefully selected • Prune the “losers” using external data to avoid penalty – Data from Phase I/II trials in same indications if available – Data in same indications for different drugs in same class – When accrual is much slower than expected • Prune the “losers” using trial data at an interim analysis – Pruning is seen as cherry-picking, but it also shares similarity with binding futility analysis – The bar for pruning should be carefully chosen to properly balance the risk and balance, and should also be consulted with regulatory agencies – Net impact on Type I error rate and point estimation is complicated (focus of remaining presentation) 13 Why Pruning? • Histology can affect the validity of a molecular predictive hypothesis, in ways often unpredictable in advance – Vemurafenib is effective for BRAF 600E mutant melanoma, but not for analogous colorectal cancer (CRC) tumors (due to subsequently discovered tissue specific feedback loops) – Any prior assumption on treatment effect across tumor types can be challenged • Pruning serves as an empirical approach to separate “winners” from “losers” (seeing is believing) although it doesn’t guarantee to separate with 100% accuracy – One or more bad indications can lead to a failed study for all indications in a basket 14 Pooling • Only “winners” are pooled in final analysis – Sample sizes of “winners” are subject to increase to ensure adequate power in pooled population – Whether to pool “losers” at interim that later turn out to be “winners” at the end is a subject beyond current scope – In an application of special interest, a surrogate endpoint is analyzed at interim for accelerated approval and a pooled analysis of clinical endpoint is conducted for full approval • Study is positive if pooled analysis of remaining indications is positive for the primary definitive endpoint – Some “losers” may have shown adequate activity for further investigation in a different line of therapy or for combination 15 Consistency of Treatment Effects • Consistency of treatment effects across tumor types in pooled analysis are subject to regulatory review, however no formal hypothesis testing needs to be pre-specified – This approach is consistent with current regulatory practice (e.g., subgroup analysis based on baseline characteristics) • Similar methods developed in subgroup analysis, metaanalysis or analysis of multi-regional clinical trials can be adopted for investigation • Graphic tools (e.g., forest plot and Galbraith plot) may be used to assist with the decision 16 Pruning and Pooling Based on Same Endpoint Type I Error Control • k tumor indications, each with sample size of N and all with 1:1 randomization • An interim analysis is conducted at information fraction t for each tumor indication, and a tumor will not be included in the pooled analysis if P-value>αt • The pooled analysis will be conducted at α* so that the overall Type I error is controlled at α when there is no treatment effect for any tumor (H0) • What is α*? 18 Solving for Adjusted Alpha (α*) • Let Yi1 be the test statistics based on information fraction t, and Yi2 be the test statistics based on the final analysis of data in the i-th cohort (i=1, 2,…,k) • Suppose that m cohorts are included in the final analysis (m≥1), and let Vm be the corresponding test statistics. The probability of a positive outcome in pooled analysis is: Q0(α*|αt, m)= PrH 0 (∩{Yi1> Z1t for i=1,…,m}, ∩{Yj1< Z1t for j=m+1,…k}, Vm > Z1- α*) or Q0(α*|αt, m)= PrH 0 (∩{Yi1> Z1t for i=1,…,m}, Vm > Z1- α*)(1- αt) (k-m) • α* is solved from below, where c(k, m) = k!/((k-m)!m!) 𝑘 𝑚 =1 𝑐(𝑘, 𝑚)Q0(α*| αt, m) = α 19 Designs to Be Compared • D0: No pruning and no change (benchmark) • D1: No increase to sample size after pruning • D2: Sample size in pooled analysis after pruning remains the same as planned for the trial (SS) • D3: Sample size for overall trial remains the same after pruning as planned (SS) Designs Overall Trial Pooled Population D0 SS SS D1 <SS <SS D2 >SS SS D3 SS <SS 20 Correlations Under Different Design Options • Corr(Yi1, Vm)=Corr(Yi1, Yi2)/ 𝑚 • Correlation between Yi1 and Yi2 is largest under D1 and smallest under D2 – D1: Corr(Yi1, Yi2)= 𝑡 – D2: Corr(Yi1 ,Yi2)= tm/k – D3: Corr(Yi1 ,Yi2)= 𝑚𝑡/(𝑚𝑡 + 𝑘 1 − 𝑡 ) • Greater α* (i.e., smaller alpha penalty) is associated with smaller correlation (i.e., the more subjects added after IA the smaller the penalty) 21 α* Under Different Design Options α* decreases with increasing k as expected, but its relationship with αt is complicated with the interplay between cherry-picking and futility stopping. 22 Comparison of Operating Characteristics • k=6 tumor indications with total planned event size (kN) ranging from 150 to 350 – The true treatment effect is –log(0.6), or hazard ratio of 0.6 in a time-to-event trial (e.g., PFS, OS or any endpoint acceptable to regulatory agencies) • Pruning occurs at t=0.5 and the bar is set at αt = 0.4 – An indication is pruned if observed hazard ratio at interim is greater than ~0.87 to 0.91 based on event size range – α* is 0.32% under D1, 0.50% under D2, and 0.43% under D3 – Number of active indications (g) with target effect size ranges from 3 to 6, with remaining ones inactive 23 Study Power When Number of Active Indications Varies D2 is always the most powerful. The relative performance of D0 deteriorates as g decreases. 24 Study Power and Sample Sizes Under Different Pruning and Pooling Strategies Power (%) for a Positive Study Exp. Number of Events Exp. Number of Events for Pooled Population for Overall Study Planned Events Number of Active Tumors (g) D0 D1 D2 D3 D0/D2 D1 D3 D0/D3 D1 D2 200 6 95 85 95 93 200 157 179 200 179 221 200 5 85 75 91 86 200 144 172 200 172 228 200 4 67 62 82 76 200 131 166 200 166 234 200 3 44 45 68 61 200 119 159 200 159 240 300 6 99 96 99 99 300 254 277 300 277 323 300 5 96 81 98 96 300 232 266 300 266 334 300 4 84 81 94 91 300 209 255 300 255 345 300 3 60 64 84 79 300 187 244 300 244 356 Except for g=6, D0 is inferior to D3 of same size for overall study and D2 of same size for pooled population. It is inferior to D1 when half indications are active. 25 Numbers of Tumor Indications and Expected True Treatment Effect in Pooled Population Planned Events Number of Active Tumor Indications 200 Exp. Number of Active Tumor Indications in Pooled Population Exp. Total Number of Tumor Indications in Pooled Population Exp. True Effect in Pooled Population Relative to Δ (%) No Pruning With Pruning No Pruning With Pruning No Pruning With Pruning 6 6 4.7 6 4.7 100 100 200 5 5 3.9 6 4.3 83 91 200 4 4 3.1 6 3.9 67 80 200 3 3 2.4 6 3.6 50 66 300 6 6 5.1 6 5.1 100 100 300 5 5 4.2 6 4.6 83 91 300 4 4 3.4 6 4.2 67 81 300 3 3 2.5 6 3.7 50 68 More inactive tumor indications are pruned than active ones. While exclusion of some active indications under pruning is unfortunate, inclusion of many inactive ones without pruning defies the principle of drug development and will be a major regulatory and reimbursement issue. 26 Application to A Single Arm Basket Trial • Traditionally, Simon’s optimal two-stage design is used to decide whether an investigational therapy is worth further development in Phase III – The design minimizes expected sample size under null – The Type I/II error rates are often high • A trial stops for futility after first stage if the futility bar is met, and continues to completion otherwise, in which case a final analysis will be conducted • How to turn this classical design (or any two-stage design) naturally into a confirmatory basket trial in settings that ORR is an acceptable endpoint for drug approval? 27 A Hypothetical Example • Ten tumor indications, each applying Simon’s optimal two-stage design with (α, β)=(0.1, 0.1) and (p0, p1)=(0.1, 0.3) • The first stage enrolls 12 subjects each and an indication is dropped if there is ≤1 response. Otherwise, the indication continues until 35 subjects are enrolled. – No change in sample size (i.e., D1) – An indication would be declared inactive if ≤5 responses at final analysis based on Simon’s optimal two-stage design • An exact calculation shows that the pooled analysis should be conducted at α*=0.29% to keep the overall alpha at 2.5% – E.g., min empirical RR is ~20%/25%/30% for trial to be positive if pooled from 4/2/1 tumor indications and each further requires >5 responses 28 Pruning and Pooling Based on Different Endpoints Set-up for Multiplicity Adjustment • An intermediate endpoint may be used for filing for accelerated approval or to terminate the study early to save cost and time – Bar for pruning is usually higher (αt smaller) in either case • The correlation between Yi1 (test stat based intermediate endpoint data at interim) and Yi2 (test stat based on clinical endpoint data at final analysis) is different from before • The treatment effect in terms of intermediate endpoint shouldn’t be assumed null under the null hypothesis for clinical endpoint – Meta-analysis of trials showing no treatment effect on clinical endpoint may be used to help find a reasonable range for treatment effect on intermediate endpoint for derivation of an appropriate α* that controls overall alpha in this range 30 Maximum Penalty • The treatment effects on intermediate endpoint are treated as nuisance parameters and the α* is chosen such that it minimizes over (-Inf, Inf) while keeping the Type I error controlled at α – Minimum α* is achieved when effect sizes take same value δ (or at same δ 𝑁𝑡 in general) • The common δ associated with minimum α* is nondegenerate (i.e., 0, -Inf or Inf) due to the complicated interplay between futility stopping and cherry-picking – When δ is very large, none will be pruned (no penalty) – When δ is very small, trial may be terminated early for futility (no chance to even commit a Type I error) 31 Correlation between Yi1 and Yi2 • In general, the correlation may be estimated by a bootstrapping method or any other method – Martingale residuals may be used to construct the correlation when both are time-to-event variables – Estimator of α* converges to true value as long as estimator of correlation is a consistent estimator • In the special case when Yi1 and Yi2 are continuous variables and interim analysis is conducted based on t proportion of planned sample size, the correlation differs from before by a factor of ρ where ρ is correlation between the two endpoints based on same subjects – Next slides are based on this setup for ease of presentation 32 Illustration of α* at Different δ at αt=0.05 and k=6 α* is minimized at a non-degenerate delta and varies with rho 33 Minimum α* Under Different Design Options Minimum α* decreases with increasing k or t as expected and is always less than 2.5%. 34 An Application of Special Interest • A randomized controlled basket trial with 1:1 randomization in 6 tumor indications, each targeting a hazard ratio of 0.5 in PFS with 90% power at 2.5% alpha for accelerated approval – 88 PFS events and 110 subjects planned for each indication – PFS analysis is conducted when all are enrolled (t=1) • D2 is applied to keep total sample size at 660 in pooled population targeting 430 death events for full approval – ~90% power to detect a hazard ratio of 0.7 in OS at 0.8% when ρ is 0.5 (i.e., correlation between PFS at interim and OS at final is 0.5 m/6 if m-tumors are selected due to sample size adjustment) – Observed hazard ratio ~0.79 or lower for a positive trial in pooled population (vs ~0.84 under D0) • Typical Phase III sample size but potential approval in 6 indications 35 Forest Plot of Hypothetical PFS Outcome Tumor 1 Tumor 2 Tumor 3 Tumor 4 Tumor 5 Tumor 6 Tumors 1-5 HR=1 36 Forest Plot of Hypothetical OS Outcome Tumor 1 Tumor 2 Tumor 3 Tumor 4 Tumor 5 Tumors 1-5 HR=1 37 Bias in Point Estimation of Treatment Effect Bias • The fact that a penalty should be paid to control Type I error implies estimate of treatment effect is potentially inflated – The smaller the α* the greater the potential bias • Bias is not a concern if “losers” are truly losers and “winners” are truly winners, but a conservative assumption is that treatment effect for clinical endpoint is the same for all and “winners” are cherry-picked due to random highs • Similarly to what happens in Type I error control, the treatment effect on intermediate endpoint plays a complicated role – When it is very large, none will be pruned (no inflation) – When it is very small, trial may be terminated early for futility – Maximum bias is reached at a non-degenerate point 39 Set-up for Calculation • Again, let Yi1 be the test statistics based on intermediate endpoint data at interim and Yi2 test statistics based on clinical endpoint data at final analysis • Let 𝐼𝑖 (t) be the information Yi1 is based upon where t is the proportion of planned subjects used in interim analysis (t=1 in last example) and let 𝛿𝑖 be the true treatment effect for i-th tumor based on the intermediate endpoint, and – Notice that the expectation of Yi1 is 𝛿𝑖 𝐼𝑖 𝑡 • Let 𝐽𝑖 be the information Yi2 is based upon at final analysis – The corresponding estimated treatment effect is Yi2/ 𝐽𝑖 – The estimated treatment effect in pooled population can be approximated with a weighted average of individual estimate for each tumor by the information units 40 Key Steps in Calculation • Probability of including a subset of m-tumors in final analysis 𝑃𝑟 𝑌11 𝑡 > 𝑍1−𝛼𝑡 , … , 𝑌1𝑚 𝑡 > 𝑍1−𝛼𝑡 , 𝑌1,𝑚+1 𝑡 < 𝑍1−𝛼𝑡 , … , 𝑌1𝑘 𝑡 < 𝑍1−𝛼𝑡 • Bias conditional on a i-th tumor being selected follows from below: – Bias of Yi2 conditional on Yi1 is ρ 𝑌𝑖1 − 𝛿𝑖 𝐼𝑖 𝑡 – Yi1 is a left-truncated normal variable with mean 𝛿𝑖 𝐼𝑖 𝑡 + ϕ 𝑍1−𝛼𝑡 − 𝛿𝑖 𝐼𝑖 𝑡 1 − Φ 𝑍1−𝛼𝑡 − 𝛿𝑖 𝐼𝑖 𝑡 • The overall bias is given below 𝑘 Bias𝑎 𝑠𝑢𝑏𝑠𝑒𝑡 × 𝑃𝑟 𝑎 𝑠𝑢𝑏𝑠𝑒𝑡 𝑜𝑓 𝑚 𝑡𝑢𝑚𝑜𝑟𝑠 𝑚=1 𝑎𝑙𝑙 𝑠𝑢𝑏𝑠𝑒𝑡𝑠 – Proportional to ρ and 1/ 𝐽𝑖 , and a function of 𝑍1−𝛼𝑡 − 𝛿𝑖 𝐼𝑖 𝑡 – An estimate of bias is obtained by replacing 𝛿𝑖 with its estimate 41 Symmetry Revisited Contour plot Contour plotof bias of bias 42 • A randomized basket trial with equal sample size in 2 tumor indications ‒ The interim analysis is conducted after 88 PFS events in each cohort. ‒ To have 430 death events in the pool, sample size changes after pruning (D2). ‒ Maximum bias is achieved when δ1=δ2 ‒ Same conclusion holds for > 2 cohorts due to symmetry (similar to minimum α* calculation) Last Example Revisited and Expanded • A randomized basket trial with equal sample size in 6 tumors –The interim analysis is conducted after 88 PFS events are observed in each cohort –Number of deaths is kept at 430 in pooled population (D2) • Two scenarios –Only those with αt <0.025 (observed hazard ratio <~0.66) are included in pooled population – Irrespective of filing for accelerated approval, which may require αt <0.025, a tumor indication is included in pooled population as long as αt <0.1 (observed hazard ratio <~0.76) • Maximum bias is comparable to difference in empirical bar for a positive trial between pruning and no pruning – For ρ=0.5, max bias in log(hazard ratio) is ~0.07 (or ~0.05 in hazard ratio when true treatment effect is ~0.8) 43 Bias at αt =0.025 in -log(hazard ratio) Bias ~0.05 in hazard ratio when δ=0.5 Maximum bias ~0.07 44 Bias at αt =0.1 in -log(hazard ratio) Bias ~0.02 in hazard ratio when δ=0.5 Curve shifts to left with same max 45 Discussions Summary • The risk of a basket trial is that the trial fails for all indications at once, but a successful trial could provide very significant benefits to patients, drug developers, and health authorities – Sample size adjustment after pruning mitigates the risk – Initial emphasis should be using external data for pruning • A basket trial with pruning of indications based on an interim analysis is feasible and, after paying the maximum penalty, has superior overall performance vs a basket trial without pruning – A special application is to file accelerated approval based on interim analysis and file full approval based on final analysis – Bias in point estimation of treatment effect can be properly corrected if needed 47 Ongoing Research • The pathway design subteam is working with academic statisticians and regulatory agencies to investigate various issues related to statistical design, analysis and reporting, and monitoring of confirmatory basket trials to pave the way for incorporation into regulatory guidance • The general framework is being applied to design and analysis of umbrella trials • The methodology for handling different endpoints is being applied to design and analysis of general adaptive trials • Presentations to clinical and scientific conferences are underway, and multiple manuscripts are under draft 48 Key References • Chen C, Li N, Yuan S, Antonijevic Z, Kalamegham R, Beckman RA. Statistical design and considerations of a phase 3 basket trial for simultaneous investigation of multiple tumor types in one study. To appear in Statistics in Biopharmaceutical Research , 2016. • Yuan S, Chen A, He L, Chen C, Gause CK, Beckman RA. On group sequential enrichment design for basket trials. To appear in Statistics in Biopharmaceutical Research , 2016. • Beckman RA, Antonijevic Z, Kalamegham R, Chen C. Design for a phase 3 basket trial in multiple tumor types based on a putative predictive biomarker. Under review for publication. • Li, W, Chen C, Li N. Estimation of treatment effect in two-stage adaptive trials using an intermediate endpoint, under draft. • • • • • • • • • • • • • • Xiaoyun (Nicole) Li, and Cong Chen. Adaptive Biomarker Population Selection in Phase III Confirmatory Trials. JSM proceedings of ASA Biopharmaceutical Section, 2015. Chen C, Li N, Shentu Y, Pang L, Beckman RA. Adaptive Informational Design of Confirmatory Phase III Trials with an Uncertain Biomarker Effect to Improve the Probability of Success. To appear in Statistics in Biopharmaceutical Research 2016. Magnusson BP, Turnbull BW. Group sequential enrichment design incorporating subgroup selection. Stat Med. 2013;32(16):2695-2714. Heinrich MC, Joensuu H, Demetri GD, Corless CL, Apperley J, Fletcher JA, et al. Phase II, open-label study evaluating the activity of imatinib in treating life-threatening malignancies known to be associated with imatinib-sensitive tyrosine kinases. Clin Cancer Res. 2008;14(9):27172725. Berry SM, Broglio KR, Groshen S, Berry DA. Bayesian hierarchical modeling of patient subpopulations: efficient designs of Phase II oncology clinical trials. Clin Trials. 2013 Oct;10(5):720-34. Simon R, Geyer S, Subramanian J, Roychowdhury S. The Bayesian basket design for genomic variant-driven phase II trials. Seminars in Oncology, DOI: http://dx.doi.org/10.1053/j.seminoncol.2016.01.002. Berry DA. The Brave New World of clinical cancer research: Adaptive biomarker-driven trials integrating clinical practice with clinical research. Molecular Oncology, DOI: http://dx.doi.org/10.1016/j.molonc.2015.02.011. Neuenschwander B, Wandel S, Roychoudhury S and Bailey S. Robust exchangeability designs for early phase clinical trials with multiple strata. Pharmaceutical Statistics, DOI: 10.1002/pst.1730. Barker AD, Sigman CC, Kelloff GJ, Hylton NM, Berry DA, Esserman LJ. I-SPY 2: an adaptive breast cancer trial design in the setting of neoadjuvant chemotherapy. Clin Pharmacol Ther. 2009; 86: 97-100. Herbst RS, Gandara DR, Hirsch FR, Redman MW, LeBlanc M, Mack OC. et al. Lung Master Protocol (Lung-MAP)-A Biomarker-Driven Protocol for Accelerating Development of Therapies for Squamous Cell Lung Cancer: SWOG S1400. Published Online First February 13, 2015; doi: 10.1158/1078-0432.CCR-13-3473. Kim ES, Herbst RS, Wistuba II, Lee JJ, Blumenschein GR, Jr., Tsao, A. et al. The BATTLE trial: personalizing therapy for lung cancer. Cancer Discovery 2011; 1: 44-53. Demetri G, Becker R, Woodcock J, Doroshow J, Nisen P, Sommer J. Alternative trial designs based on tumor genetics/pathway characteristics instead of histology. Issue Brief: Conference on Clinical Cancer Research 2011; http://www.focr.org/conference-clinical-cancer-research-2011. Bauer P, Koenig F, Brannath W, Posch M. Selection and bias - two hostile brothers. Stat Med. 2010 Jan 15;29(1):1-13. doi: 10.1002/sim.3716. Wathen JK,Thall PF, Cook JD and Estey EH. Accounting for patient heterogeneity in phase II clinical trials. Statist. Med. 2008; 27:2802–2815. 49