Experimental Design and Statistical Considerations in Translational Cancer Research (in 15 minutes)
Elizabeth Garrett-Mayer, PhD
Associate Professor of Biostatistics and Epidemiology

Two Parts
- Phase I studies
- Taking markers into the clinic

Phase I Trial Design
- Historically, DOSE FINDING study
- Classic Phase I objective: “What is the highest dose we can safely administer to patients?”
- Translation: Kill the cancer, not the patient
- Assumes monotonic relationship between
  - dose and toxicity
  - dose and efficacy

Classic Phase I Assumption: Efficacy and toxicity both increase with dose
- DLT = dose-limiting toxicity
[Figure: probability of outcome (0.0 to 1.0) versus dose level (1 to 7), with curves for Response and DLT]

Classic Phase I approach: Algorithmic Designs
- “3+3” or “3 by 3”
- Prespecify a set of doses to consider, usually between 3 and 10 doses.
- Treat 3 patients at dose K
  1. If 0 patients experience DLT, escalate to dose K+1
  2. If 2 or more patients experience DLT, de-escalate to level K-1
  3. If 1 patient experiences DLT, treat 3 more patients at dose level K
     A. If 1 of 6 experiences DLT, escalate to dose level K+1
     B. If 2 or more of 6 experience DLT, de-escalate to level K-1
- MTD (maximum tolerated dose) is considered the highest dose at which 1 or 0 out of six patients experiences DLT.
- Confidence in MTD is usually poor.

“Novel” Phase I approaches
- Continual reassessment method (CRM) (O’Quigley et al., Biometrics 1990)
  - Many changes and updates in 20 years
  - Tends to be most preferred by statisticians
- Other Bayesian designs (e.g. EWOC) and model-based designs (Cheng et al., JCO, 2004, v 22)
- Other improvements in algorithmic designs
  - Accelerated titration design (Simon et al. 1999, JNCI)
  - Up-down design (Storer, 1989, Biometrics)

CRM: Bayesian Adaptive Design
- Dose for next patient is determined based on toxicity responses of patients previously treated in the trial
- After each cohort of patients, posterior distribution is updated to give model prediction of optimal dose for a given level of toxicity (DLT rate)
- Find dose that is most consistent with desired DLT rate
- Modifications have been both Bayesian and non-Bayesian.

New paradigm: Targeted Therapy
- How do targeted therapies change the early phase drug development paradigm?
- Not all targeted therapies have toxicity
  - Toxicity may not occur at all
  - Toxicity may not increase with dose
- Targeted therapies may not reach the target of interest

Implications for study design:
- Previous assumptions may not hold
- Does efficacy increase with dose?
- Endpoint (DLT) may no longer be appropriate
- Should we be looking for the MTD?
- What good is phase I if the agent does not hit the target?

Possible Dose-Toxicity & Dose-Efficacy Relationships for Targeted Agent
[Figure: probability of outcome (0.0 to 1.0) versus dose (0 to 12), with curves for Efficacy and Toxicity]

What is a Correlative Study?
- A study that correlates a “marker” with disease
- What is a marker? An innate characteristic of a tumor or tissue
- Examples:

  Marker               Disease
  PSA                  Prostate cancer
  Estrogen receptor    Breast cancer
  SUV from PET         Many cancers
  KIT mutation         GIST

What is it good for?
- Prognostic marker: Predicts outcome (independent of therapy)
- Predictive marker: Predicts response to therapy
- Can be used for
  - Treatment assignment
  - Treatment stratification in clinical trials
  - Surrogate endpoint (?)
  - Targeted therapy development
  - Diagnosis

Mitotic Rate: Prognostic Marker
Figure 3. Recurrence-free survival in 127 patients with completely resected localized gastrointestinal stromal tumor (GIST) based on mitotic rate. DeMatteo et al, Cancer, 112:608-615

HER-2: Predictive Marker
Disease-free survival. Gennari A et al. J Natl Cancer Inst 2007;100:14-20
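Aside on the Phase I part: the “3+3” rules listed under Algorithmic Designs can be written as a short simulation. What follows is a minimal sketch, not a reference implementation. The function names (`next_action`, `simulate_3plus3`), the true DLT probabilities, and the random seed are all hypothetical. The slide does not say what to do when a dose has only three patients and the next dose up is too toxic (or does not exist); the sketch assumes that dose is expanded to six patients before it can be called the MTD.

```python
import random
from collections import Counter


def next_action(n_treated, n_dlt):
    """Decision at the current dose, following the 3+3 rules on the slide."""
    if n_treated == 3:
        if n_dlt == 0:
            return "escalate"
        if n_dlt == 1:
            return "expand"          # treat 3 more at the same dose
        return "de-escalate"
    # Six patients have been treated at this dose.
    return "escalate" if n_dlt <= 1 else "de-escalate"


def simulate_3plus3(dlt_probs, rng):
    """Run one simulated trial; return the 0-based index of the MTD, or None.

    dlt_probs holds hypothetical true DLT probabilities, one per dose level.
    Assumption (not on the slide): a dose is declared the MTD only once six
    patients have been treated there, so a dose with 0 of 3 DLTs is expanded
    to six when the next dose up is too toxic or does not exist.
    """
    treated = [0] * len(dlt_probs)
    dlts = [0] * len(dlt_probs)
    too_toxic = len(dlt_probs)       # lowest dose index found too toxic so far
    k = 0
    while True:
        # Treat a cohort of three patients at dose k.
        treated[k] += 3
        dlts[k] += sum(rng.random() < dlt_probs[k] for _ in range(3))
        action = next_action(treated[k], dlts[k])
        if action == "expand":
            continue
        if action == "escalate":
            if k + 1 < too_toxic:
                k += 1               # move up to an untested dose
            elif treated[k] == 6:
                return k             # 0 or 1 DLT in six, nothing higher to try
            # otherwise loop again to bring this dose up to six patients
        else:
            too_toxic = k
            if k == 0:
                return None          # even the lowest dose is too toxic
            k -= 1
            if treated[k] == 6:
                return k             # lower dose already has <= 1 DLT in six


if __name__ == "__main__":
    rng = random.Random(2024)                      # arbitrary seed
    true_probs = [0.05, 0.10, 0.20, 0.35, 0.50]    # hypothetical values
    n_trials = 2000
    picks = Counter(simulate_3plus3(true_probs, rng) for _ in range(n_trials))
    for dose in [None] + list(range(len(true_probs))):
        print(dose, picks[dose] / n_trials)
```

Repeating the trial many times under one fixed set of true DLT probabilities shows how the selected MTD spreads across dose levels, which is one way to look at the remark that confidence in the MTD is usually poor.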
Lifecycle of a marker
- Analytical development
  - Measurement, logistics etc
  - Sample collection, storage, processing
- Clinical development
  - “Retrospective” connection with outcome
- Clinical validation
  - “Prospective” connection with outcome

Statistical issues during analytical development
- Reproducibility
- Repeat the measurement on the same sample multiple times under otherwise identical conditions
- Suppose binary marker, twice measured
- Results can be summarized in a fourfold (2x2) table

Statistical Significance?
- not good enough!
- p<0.05 shows there is a trend
- need strong agreement, not just a trend

Continuous Measurements
[Figure: three scatterplots of Measurement 1 against Measurement 2, annotated with p = 5.2x10E-11, R-squared = 0.59; p = 3.2x10E-5, R-squared = 0.62; p = 1.2x10E-11, R-squared = 0.92]
DO NOT RELY ON P-VALUES!!

Clinical development of a marker
- Correlate marker(s) with the outcome on a cohort of patients
- Many issues relate to bias
  - Case/control selection
  - Quality/Processing
  - Over-fitting/Lack of validation

What is bias?
- A systematic difference between what we think we observe and what we actually observe
- The more “haphazard” the data collection process, the more chances of bias creeping in
- Buyer beware: Commercial Tissue Microarrays

Why is bias a problem?
- Cannot be “quantified” (within a study)
- Does not diminish with increasing sample sizes

Double dipping
- Use the same data to develop/fine-tune a marker (or model) and evaluate its characteristics
- Most obvious with multivariable analyses (gene signatures etc)
- Might happen in seemingly innocuous circumstances
  - Choosing a cutpoint
  - Not reporting negative markers

VALIDATION!!!
- “cross-validation”: statistical approaches that use the same data but account for double-dipping
- true validation:
  - repeat the study in a new but similar population
  - apply the “model” to a new dataset and test its prediction accuracy

Be critical of your results
- All sorts of biases crept in
- Patients with tissue are unlikely to be a random sample
- No real inclusion/exclusion criteria
- Possibly looked at many markers, many subsets and many thresholds
- Build your marker into a clinical trial

Incorporating markers into clinical trials
- Start as secondary endpoints in a Phase I or II trial
- If Phase I, might be better to have an MTD-cohort and limit the correlative studies to that cohort
- If Phase II and an expensive/invasive marker, consider a two-stage design where marker will be measured only in the second stage
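Back to the reproducibility point (a binary marker measured twice and summarized in a 2x2 table): a small worked example may help show why p<0.05 is “not good enough”. The sketch below is illustrative only. Cohen's kappa is used as one common agreement measure; the slides do not name a specific statistic. The function name `agreement_summary` and the table counts are hypothetical, and the p-value is a plain Pearson chi-square test with 1 degree of freedom and no continuity correction.

```python
import math


def agreement_summary(a, b, c, d):
    """Summarize a 2x2 table of paired binary measurements.

    a: positive on both measurements     b: positive on 1st, negative on 2nd
    c: negative on 1st, positive on 2nd  d: negative on both measurements
    Returns (observed agreement, Cohen's kappa, chi-square p-value).
    Assumes every row and column total is nonzero.
    """
    n = a + b + c + d
    observed = (a + d) / n
    first_pos = (a + b) / n
    second_pos = (a + c) / n
    # Agreement expected by chance if the two measurements were independent.
    expected = first_pos * second_pos + (1 - first_pos) * (1 - second_pos)
    kappa = (observed - expected) / (1 - expected)
    # Pearson chi-square statistic for a 2x2 table (no continuity correction).
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return observed, kappa, p_value


if __name__ == "__main__":
    # Hypothetical counts: 100 samples, each measured twice.
    observed, kappa, p_value = agreement_summary(30, 20, 20, 30)
    print(f"observed agreement = {observed:.2f}")
    print(f"kappa = {kappa:.2f}")
    print(f"p = {p_value:.4f}")
```

With these made-up counts the two measurements agree on only 60 of 100 samples and kappa is 0.2, yet the chi-square test still gives p just under 0.05: a trend, not strong agreement.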