4/27/16

Fangxin Hong, Ph.D.
Sr. Research Scientist
Dept. Biostatistics and Computational Biology
Dana-Farber Cancer Institute

Disclosure:
  Goal: practically useful. Does not try to be comprehensive; does not plan to present the most rigorous methods or the latest developments in the field.
  Scope: Discover or validate markers using archived specimens
    (prospective) Correlative studies embedded in a clinical study protocol, or
    (retrospective) in grants using samples/data from a clinical study.
  => Will not cover pre-clinical studies: no cell-line, no animal experiments.
  Will not distinguish prognostic or predictive biomarkers, or prospective or retrospective studies.
    A retrospective study design may have evidentiary value close to that of a prospective study under certain conditions (Simon et al., 2009).

Outline
  Overview
    What is a biomarker?
    What is a correlative study?
    General principles and guidelines for design, analysis and reporting
  Study types
    Types of sampling
    Types of marker value and outcome data
    Types of tests/analyses
  Study design and analysis with examples
    Power/sample size calculation: approaches and programming code
    Statistical section write-up
    Analysis plan and actual analysis

What is a Biomarker?
  Biomarker
    A characteristic that is objectively measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention.
    (Biomarkers Definitions Working Group, National Institutes of Health, 2001)
    Examples: genetic markers: gene expression, mutation status, SNP, ...; imaging biomarkers: PET +/- ...; molecular biomarkers: circulating tumor cells ...; pharmacologic markers: serum concentration ...; pathogenic biomarkers: EBV viral load ...
  Assay
    A method for determining the presence or quantity of a component
  Test
    A procedure that makes use of an assay for a particular purpose
  Good biomarkers ≠ Good assays ≠ Tests

Types of Clinical Biomarkers:
  Prediagnosis
    • Risk
    • Screening
    • Early detection
  Diagnosis
    • Confirmation
    • Staging
    • Subtyping
  Pretreatment
    • Prognostic
    • Predictive
  Intratreatment
    • Early response or futility
    • Toxicity monitoring
  Posttreatment
    • Early endpoint
    • Recurrence/progression monitoring

What is a Correlative Study?
  Correlative science is a term used to show the relationship between biology (i.e., biomarkers) and clinical outcomes (i.e., disease progression). This is the promise of personalized medicine.
  Addresses the question: What is the correlation between X and Y?
    Cancer and obesity
    QoL score and treatment
    Response and performance status
    Prognosis (progression) and gene mutation (KRAS)
  Timing of correlative studies
    As correlative objectives of a clinical trial/protocol (prospective study)
      This has become less common, as the science is changing rapidly. It is hard to predict which markers will be of interest when the trial completes (usually 5-10 years after concept development).
    Grant application using archived samples from a completed or ongoing clinical trial.

Clinical correlative studies: Design
  Guideline principles
    Defining clear objectives: descriptive or inferential
    Specifying study sample*: all patients or selected patients
    Specifying design and procedures:
      Defining correlative measures: what will be quantified in each sample
      Sampling time points: baseline, mid-treatment, end-treatment, ...
      Choice of outcome measures: ORR, PFS, OS, toxicity
    Specifying methods and considering power and sample size**
    Writing statistical analysis plan: should be outlined for each objective
  * Full-cohort design vs. cohort-sampling design
  ** Depends on whether the objective is descriptive or inferential.

Analysis and Reporting

Type of sampling
  For various reasons, including costs of biomarker evaluation and lack of availability of tissue or of consent for use of tissue, the biomarkers are often evaluated on only a subset of the patients on the study.
  Full-cohort design: use all patients that have bio-specimens (or biomarker data) available (or have a PET scan).
  Cohort sampling design: sample all cases (patients with event) and only a fraction of controls (patients without event).
  If the clinical outcomes of interest are event times, and the event rate is low, then it is well known that a more efficient design is obtained by sampling subjects with observed events (cases) more intensively than subjects with censored event outcomes (non-cases).
    1) Classic case-cohort sampling: all study cases, plus a random subset of controls
    2) Nested case-control (NCC) sampling: all study cases but only a fraction of controls, selected randomly from the risk set of the matched cases.
  The set of patients enrolled on the study will be referred to as the cohort. The subset of the cohort analyzed for biomarkers will be referred to as the sample. The larger set of patients meeting the criteria for entry on the clinical trial will be referred to as the population. The terms "study population" and "target population" could also be used for the cohort and the population.

Common Types of data
  Biomarker
    (1) Numerical measurement, e.g., gene expression, circulating cancer cells
    (2) Categorical (dichotomized) data*, e.g., mutation present/absent, high/low score, SNP
    (3) Percentage outcome, e.g., Ki-67 index, growth pattern
    => Two types: continuous or categorical
  Clinical outcome
    (1) Rate/dichotomized: CR/ORR, x-year PFS rate
    (2) Time-to-event outcome: PFS, DFS, OS
  * Ordinal data usually would be regrouped into dichotomized groups

Type of tests/analyses
                              Clinical outcome
  Biomarker                   Rate/dichotomized                    Time-to-event outcome
  Categorical/dichotomized    Fisher's exact / Chi-square test     Log-rank test
  Continuous                  t-test / Wilcoxon test or similar    Cox regression model

  Note: Testing (power calculation) usually considers one marker at a time; analysis often works on multiple markers, e.g., high-throughput data.
  Grant proposal or protocol correlative objectives: power/sample size calculation, analysis plan
  Manuscripts: analysis and reporting

Categorical biomarker:
E4402: Phase III Trial Comparing Two Different Rituximab Dosing Regimens for Patients with Low Tumor Burden Indolent non-Hodgkin's Lymphoma
Study schema (E4402):
  Step 1 (register)
    Classification/stratification factors:
      · Histology: Follicular vs. Other
      · Age: < 60 years vs. > 60 years
      · Time from diagnosis: < 1 year vs. > 1 year
    Arm I, Induction: Rituximab (1, 2) 375 mg/m2 IV x 4 weekly doses
    Restage at week 13:
      PR/CR: proceed to Step 2
      PD or < PR/CR: discontinue protocol therapy
  Step 2 (randomize) (3)
    Arm A, Rituximab (1) Retreatment:
      · Administer only for progressive disease (within 28 days of CT scans documenting disease progression)
      · 375 mg/m2 IV x 4 weekly doses, then re-enter observation
      · Patients may be retreated on Arm A indefinitely following each progression until they meet the criteria below for rituximab failure (4)
    Arm B (5), Rituximab (1) Scheduled:
      · Single 375 mg/m2 IV dose
      · q 13 weeks
      · Continue to rituximab failure (4)
  For Quality of Life time points, see section 6.3.
  Accrual goal = Approximately 519 total patients (270 randomized follicular lymphoma patients)
  Footnotes:
    1. Dose based on actual body weight.
    2. Rituximab Induction is the first 4 week course of rituximab the patient receives during Step 1. For scheduling according to the protocol study calendar, count from the first dose of the 4 week course of rituximab.
    3. Randomization must be done within 1 week of the week 13 restaging visit.
    4. Rituximab failure is defined as no response (< PR/CR), TTP < 26 weeks (TTP calculation begins on day 1 of: rituximab induction OR rituximab retreatment OR rituximab scheduled), initiation of alternative therapy, or inability to complete therapy. See section 6.2.1 for full definition.
    5. Treatment should start within 14 working days of the week 13 restaging visit.

E4402 LRF grant: design elements
  Aim: To correlate immunoglobulin Fc receptor (FcγR) polymorphisms in FL patients receiving single-agent rituximab with response, response duration, and time to rituximab resistance.
  Sample: (anticipated) 259 out of 408 FL patients will have data on FcγR polymorphisms, and thus be available to correlate with response; 192 (assuming a 74% response rate) will be available to correlate with duration of response (DOR).
  Clinical outcome (the whole trial): response rate 74%; median response duration: 18 months; median time to rituximab resistance: 25 months
    Note: the outcome was known for the whole trial; the numbers might vary within the study sample.
  Marker measurement: FcγRIIIa polymorphism: VV, VF, FF, assuming 20% VV patients and 80% F carriers
  Hypotheses: patients with VV type have a higher response rate and longer DOR than patients who are F carriers.
    => one-sided testing

Power/sample size: response rate
  Fisher's exact test will be used (one-sided significance level of 0.05) to evaluate whether the response rate for VV patients is higher than that for F carriers. With 20% (n=52) VV patients and 80% (n=207) F carriers, the following table lists the differences that can be detected with at least 80% power when the overall response rate lies within the 95% CI of the observed one (ORR: 74%, 95% C.I. 69%, 78%). For example, a difference of 83% vs 65%, 87% vs 71% and 90% vs 75% can be detected between VV patients and F carriers. A logistic regression model will be employed to evaluate the polymorphism effect on response rate, accounting for other patient characteristics.
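
  The constraint behind this statement is that the overall response rate and the two group sizes are fixed, so choosing a response rate for VV patients determines the rate for F carriers. A minimal base-R sketch of that idea follows. It is an illustration only: the helper names rate_other_group() and fisher_power() are hypothetical, the group size is assumed to be round(0.2 * 259), and the power is obtained by enumerating all possible 2x2 tables rather than by any packaged routine.

```r
# Exact power of a one-sided Fisher's exact test when the overall response
# rate and the marker-group sizes are fixed in advance.
# rate_other_group() and fisher_power() are hypothetical helper names.

# Rate in group 2 implied by the overall rate p.all and the group 1 rate p1
rate_other_group <- function(p.all, p1, n1, n2) {
  (p.all * (n1 + n2) - n1 * p1) / n2
}

# Power to conclude "group 1 responds more often than group 2"
fisher_power <- function(p1, p2, n1, n2, alpha = 0.05) {
  x1 <- 0:n1                                      # responder counts, group 1
  x2 <- 0:n2                                      # responder counts, group 2
  m  <- outer(x1, x2, "+")                        # total responders per table
  k  <- matrix(x1, nrow = n1 + 1, ncol = n2 + 1)  # x1 repeated across columns
  # One-sided Fisher p-value: P(X >= x1), X hypergeometric given the margins
  pval <- phyper(k - 1, m, n1 + n2 - m, n1, lower.tail = FALSE)
  # Probability of each (x1, x2) table under the assumed rates
  prob <- outer(dbinom(x1, n1, p1), dbinom(x2, n2, p2))
  sum(prob[pval <= alpha])
}

n  <- 259
n1 <- round(0.2 * n)   # VV patients (assumed 20% of the sample)
n2 <- n - n1           # F carriers
p1 <- 0.83             # assumed response rate in VV patients
p2 <- rate_other_group(0.69, p1, n1, n2)   # implied rate in F carriers
pow <- fisher_power(p1, p2, n1, n2, alpha = 0.05)
round(c(p1 = p1, p2 = p2, power = pow), 3)
```

  Trying a grid of values for p1 and keeping the smallest one whose power reaches the target is one way to arrive at a "detectable difference" of the kind quoted in the paragraph above.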
  Overall response rate    Response rate in VV patients    Response rate in F carriers
                           (HH patients)                   (R carriers)
  69%                      83%                             65%
  74%                      87%                             71%
  78%                      90%                             75%

R code:
    library(desmon)
    b2diff(0.69, 0.2, 259, alpha=0.05)
    b2diff(0.74, 0.2, 259, alpha=0.05)
    b2diff(0.78, 0.2, 259, alpha=0.05)

  Usage: b2diff(p, r, n, alpha=0.025, power=0.8, exact=TRUE)
  b2diff is intended for facilitating power calculations when a new (binary) marker will be measured on an existing sample. In that case the combined sample size n and the overall response rate p are known. r gives the expected proportion of the sample in group 1 (defined by the new marker). b2diff then calculates the response probabilities in the two groups that give a difference for which the two-sample comparison will have the specified power, subject to the overall response probability and sample size constraints. The probabilities are computed for the one-sided tests in each direction (higher response rates in group 1 and lower response rates in group 1).

    > b2diff(0.69, 0.2, 259, alpha=0.05)
    [1] 0.6900000 0.2000000 0.8324440 0.6542170 0.5345053 0.7290615

  80% power at one-sided 0.05 to detect 0.83 (group 1) vs. 0.65 (group 2), or 0.53 (group 1) vs. 0.73 (group 2)

Power/sample size: Duration of response
  Duration of response (DOR) will be estimated using the Kaplan-Meier method, overall with arms combined and by arm if a significant difference is observed. The log-rank test (one-sided significance level of 0.05) will be used to compare DOR among 192 patients (70% of 274 FL responders) enrolled to the maintenance step over 58 months with 24 months of follow-up. For arms combined, assuming 20% (n=38) are VV patients and 80% (n=154) are F carriers, the following table lists the differences in median DOR that can be detected with at least 80% power for a series of possible overall median estimates. For example, there is at least 80% power to detect a difference in median DOR of 27 months vs 16 months with an overall median DOR of 18 months. A Cox proportional hazards regression model will be fit to evaluate the significance of the polymorphism effect on DOR (TTRF) after adjusting for other patient characteristics.
                    Median DOR (months)
  Overall median    VV patients    F carriers
  16                24             14
  18                27             16
  20                31             18

R code:
    library(desmon)   # provides powlgrnk(), as for b2diff() above

    ## 192 pts / 58 months => 3.3 pts/month
    ## control group (low median, F carriers): r2 = 80% of patients
    ## experimental group (VV patients): r1 = 20%
    ## overall median DOR is fixed; assuming a median for one
    ## group => compute the hazard for the other group
    ## (exponential distribution)
    hazG2 = function(med.all, med.g1, r1, r2)
    {
      haz.g1 = log(2)/med.g1
      ## solve r1*S1(med.all) + r2*S2(med.all) = 0.5 for the group 2 hazard
      haz.g2 = (-1)*log((0.5 - r1*exp(-haz.g1*med.all))/r2)/med.all
      haz.g2
    }

    ## compute power (log-rank test) to detect the difference in DOR
    hazG2.pow = function(med.all, med.g1, r1, r2, alp, acc.per=58,
                         acc.rate=3.3, add.fu=24)
    {
      haz.g1 = log(2)/med.g1
      haz.g2 = c()
      for (i in seq_along(med.g1))
      { haz.g2 = c(haz.g2, hazG2(med.all, med.g1[i], r1, r2)) }
      med.g2 = log(2)/haz.g2
      pow = c()
      for (i in seq_along(haz.g1))
      {
        pow.i = powlgrnk(acc.per, acc.rate, add.fu, alpha=alp, p.con=r2,
                         control.rate=haz.g2[i], test.rate=haz.g1[i])
        pow = c(pow, pow.i[1])
      }
      cbind(med.g1, med.g2, pow)
    }

    ## compute power to detect a difference in median for a
    ## series of selected medians for VV patients
    ## (1) overall median 16 months
    med.all = 16
    med.g1 = c(20,21,22,23,24,25,26,27,28,29,30,31)
    r1 = 0.2
    r2 = 0.8
    hazG2.pow(med.all, med.g1, r1, r2, 0.05)
    ##       med.g1   med.g2       pow
    ## power     20 15.17085 0.3924527
    ## power     21 15.00637 0.5156537
    ## power     22 14.85505 0.6324386
    ## power     23 14.71540 0.7337765
    ## power     24 14.58615 0.8152157
    ## power     25 14.46620 0.8764927
    ## power     26 14.35460 0.9200949
    ## power     27 14.25052 0.9497021
    ## power     28 14.15323 0.9690430
    ## power     29 14.06210 0.9812853
    ## power     30 13.97657 0.9888412
    ## power     31 13.89614 0.9934133

    ## (2) overall median 18 months

Alternative approach
  If the analytical subset is known, and the number of events is fixed (or within a range) => calculate the HR that would be detected between two marker groups with a certain power
    Exponential distribution, proportional hazards
    # events; HR (HR < 1)
  Example: A study to confirm prognostic significance of HGAL protein expression.
  It is expected to have 77 failures in the analytical sample. Assuming proportional hazards between the HGAL-positive and HGAL-negative groups, there is 80% power (log-rank test) at a one-sided 0.05 significance level to detect a hazard ratio of 0.49 in FFS. For example, with the observed 5-year FFS rate of 72%, 5-year FFS rates of 76% vs. 57% can be detected between the HGAL-positive and HGAL-negative groups.

R code
    ## total 77 failures
    n.fail = 77
    ## proportion in positive or negative group
    r.pos = 0.8
    r.neg = 0.2
    alpha = 0.05   ## one-sided alpha
    z.alpha = qnorm(alpha, lower.tail=FALSE)
    beta = 0.2     ## beta level (power = 1 - beta)
    z.beta = qnorm(beta, lower.tail=FALSE)
    ## hazard ratio detected between positive and negative groups
    haz.ratio.FFS = exp(-(z.alpha + z.beta)/sqrt(n.fail*r.pos*r.neg))
    ## [1] 0.4924313   ## pos/neg

    ## Overall 5-year FFS is fixed; select possible 5-year FFS rates for one group
    ## => compute the 5-year FFS rate for the other group
    S5.overall = 0.72
    S5.pos = c(0.76, 0.75, 0.74, 0.73)
    S5.neg = (S5.overall - r.pos*S5.pos)/r.neg
    ## [1] 0.56 0.60 0.64 0.68
    ## check which one gives an HR that matches
    haz.ratio = log(S5.pos)/log(S5.neg)   ## pos vs. neg
    ## [1] 0.4733151 0.5631708 0.6746892 0.8160263
    log(0.76)/log(0.57)
    ## [1] 0.4882185

  The HR computed is < 1, with one-sided α and power 1-β. Thus, in our case, HR (pos/neg).
  Important feature of a clinical correlative study: the overall clinical outcome is fixed! Thus, the rates/medians for the two groups are constrained by the overall rates/medians.
  Power/sample size calculation (not the analysis) is mainly used to justify the study; it does not need to be 100% rigorous.

Analysis and reporting
  Practical tips:
    Check biomarker data
      Duplicate biological sample => two marker values
    Baseline patient characteristics: compare patients that are included in the analysis vs. the entire trial cohort
      => Is the study sample a representative subset of the trial cohort?
      If not, this is a matter of generalizability rather than bias.

Continuous biomarker:
E1411: Intergroup Randomized Phase II Four Arm Study In Patients With Previously Untreated Mantle Cell Lymphoma Of Therapy With: Arm A = Rituximab + Bendamustine Followed By Rituximab Consolidation (RB → R); Arm B = Rituximab + Bendamustine + Bortezomib Followed By Rituximab Consolidation (RBV → R), Arm C = Rituximab + Bendamustine Followed By Lenalidomide + Rituximab Consolidation (RB → LR) or Arm D = Rituximab + Bendamustine + Bortezomib Followed By Lenalidomide + Rituximab Consolidation (RBV → LR)
  N=332, randomization ratio: 1:1:1:1

Correlative objectives (prospective study)
  Prognostic biomarker
  Surrogate endpoint

Minimal Residual Disease (MRD) Correlative Studies - write-up
  (aim) To determine whether the number of malignant cells in circulation or in marrow at the end of induction correlates with CR or 2-year PFS,
  (approach) the number of malignant cells will be compared between CR vs. non-CR, and between those who are progression-free at 2 years vs. those who are not, using Wilcoxon rank-sum tests with a one-sided Type I error of 10%.
  (sample) We expect that blood samples will be collected in approximately 90% of patients at baseline, and 90% of these patients will have blood samples at the end of induction (260 samples).
  (power calculation) Assuming 50% of patients will achieve CR after induction, we will have adequate power (90%) to detect an effect size of 0.33 (0.33 standard deviation difference) in number of malignant cells between CR and non-CR with 260 blood samples. Assuming 70% of patients will be progression-free at 2 years, we will have adequate power (90%) to detect an effect size of 0.36 in number of malignant cells between those who are progression-free at 2 years vs. those who are not, with 260 blood samples.

High-throughput continuous biomarker data with dichotomized clinical outcome
  Aim: To identify genes whose expression level correlates with response
    => Identify differentially expressed genes between responders (CR/PR) and non-responders (SD/PD).
  Power/sample size: the study can be powered based on inference for individual genes => two-sample test on mean/median
  Methods: modified t-test (SAM) or linear models (t- or F-statistic); fold-change based method; rank-based method
  Bioconductor has a collection of packages to identify differentially expressed genes and to control for multiple testing, e.g., the false discovery rate (FDR) approach
  Example: Bioconductor packages: Limma, SAM, RankProd, multtest, ...
  Rank Product: based on fold-change, also assesses biological variation
    -- intuition of FC criterion, natural control of FDR
    -- fewer assumptions under the model
    -- increased performance with noisy data and/or low numbers of replicates
    -- application in meta-analysis
    -- New version is coming soon ... (improved computational capacity ...)

Example of grant write-up
  The analysis of gene expression data will be primarily exploratory. After microarray hybridization, the raw CEL files will first be analyzed with the GC-RMA method (for normalization and expression value computation) to obtain gene expression values as a two-way data table. Thereafter, the statistical analysis will be performed using R software and Bioconductor packages.
  To test whether GEP will distinguish FL patients with different responses to single-agent rituximab, we will search for differentially expressed genes between responders (CR/PR) and non-responders (SD/PD). A rank-based method developed by Dr. Hong (RankProd, R package, Bioconductor Project) will be applied, with control of the overall type I error (multiple comparison adjustment) using a false discovery rate (FDR) approach. A two-way unsupervised hierarchical clustering analysis will be applied to both the genes and the samples to illustrate whether GEP can separate responders from non-responders. The top-ranked genes will be assessed by gene ontology and pathway analysis (e.g., Gene Set Enrichment Analysis (GSEA)) for their potential functions in lymphoma progression and treatment.
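
  The general pattern in this write-up (test one gene at a time, then adjust for multiple comparisons with an FDR approach) can be sketched in base R. This is an illustration under stated assumptions, not the RankProd method named above: it swaps in a per-gene Welch t-test and the Benjamini-Hochberg adjustment from p.adjust(), and all data, sample sizes and effect sizes are simulated and hypothetical.

```r
# Per-gene two-sample tests followed by Benjamini-Hochberg FDR control.
# All data are simulated; sizes and the shift of 1.5 are hypothetical.
set.seed(1)
n.genes <- 1000
n.resp  <- 30                      # responders (CR/PR)
n.non   <- 20                      # non-responders (SD/PD)
is.resp <- rep(c(TRUE, FALSE), c(n.resp, n.non))

# log-expression matrix: genes in rows, samples in columns
expr <- matrix(rnorm(n.genes * (n.resp + n.non)), nrow = n.genes)
# Shift the first 50 genes in responders to mimic true differences
expr[1:50, is.resp] <- expr[1:50, is.resp] + 1.5

# Welch t-test for each gene (one marker at a time)
p.raw <- apply(expr, 1, function(x) t.test(x[is.resp], x[!is.resp])$p.value)

# Benjamini-Hochberg adjustment; genes with adjusted p < 0.05 are called
p.bh   <- p.adjust(p.raw, method = "BH")
called <- which(p.bh < 0.05)
length(called)
```

  The same two-step shape applies whichever per-gene statistic is used; only the line that computes p.raw would change.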
Sample Size Planning for Developing Classifiers Using High Dimensional Data
  Aim: To build a classifier (pre-defined diagnostic or prognostic classes)
    (step 1) Identify differentially expressed genes between two classes
    (step 2) Construct classifier using identified genes
  Both of these steps will introduce variability into the resulting classifier, so both must be incorporated in sample size estimation.
  Method:
    Dobbin K, Simon R. Sample size determination in microarray experiments for class comparison and prognostic classification. Biostatistics. 2005 Jan;6(1):27-38.
  Tool:
    http://linus.nci.nih.gov/brb/samplesize/samplesize4GE.html
    The program can be used heuristically for developing classifiers for predicting risk groups with survival data.
    How to use this program with survival data

High-throughput continuous biomarker data with survival outcome
E2496: RANDOMIZED PHASE III TRIAL OF ABVD VS STANFORD V +/- RADIATION THERAPY IN LOCALLY EXTENSIVE AND ADVANCED STAGE HODGKIN'S DISEASE

Study schema (E2496):
  Stratification factors:
    Disease: Locally extensive (stage I-IIA/B with massive mediastinal adenopathy) vs. advanced (stage III-IV)
    Number of adverse risk factors: 0 - 2 vs. 3 - 7
    Time of entry: Before addendum 6 vs. after addendum 6
  Randomize
    ARM A: ABVD (1)
      Minimum of 6 and maximum of 8 cycles. (3)
      Modified involved field radiation will be given ONLY to patients with massive mediastinal disease regardless of stage. (See Section 5.51).
    ARM B: STANFORD V (4)
      12 weeks of chemotherapy.
      Modified involved field radiation will be given to all patients with sites of disease > 5 cm in maximum transverse dimension and/or macroscopic splenic disease. (See Section 5.52).
  Footnotes:
    1 For Arm A repeat every 28 days.
    3 See Section 5.2. Number of cycles will depend on response; 2 cycles will be given beyond best response with a minimum of 6 cycles and a maximum of 8 cycles. If an objective response occurs between cycles 4 and 6, a total of 8 cycles of therapy will be given.
      Patients with residual radiographic abnormalities after 4 cycles will receive 2 additional cycles. If there is no change in residual radiographic abnormalities, these will be interpreted as residual scarring and ABVD will be discontinued. If there is an objective change in radiographic measurements after 2 additional cycles of ABVD, 2 further cycles will be given for a total of 8 cycles.
    4 For patients currently receiving treatment on Arm B (the Stanford chemotherapy regimen), for whom nitrogen mustard is not available, a substitution with cyclophosphamide will be made on weeks 1, 5 and 9. The dose of cyclophosphamide is 650 mg/m2. Dose modification and use of G-CSF should continue as outlined in the protocol.

Grant proposal
  Aim: to identify a gene expression profile for outcome prediction and risk stratification in Hodgkin lymphoma, and to develop and validate this profile into a robust multi-gene classifier suitable for formalin-fixed paraffin-embedded (FFPE) tissues.
  Sample: (fixed) 309 out of 752 eligible cases have gene expression profiles measured on FFPE samples.
  Clinical outcome: Failure-free survival (FFS) and overall survival (OS)
  Marker measurement: expression levels of 259 genes, including those previously reported to be associated with outcome in cHL, were determined by digital expression profiling (NanoString technology) of pretreatment FFPE biopsies.

How to do a power calculation?
  "When constructing a prognostic marker, even if the ultimate goal is to build a multivariate marker, a first step is typically to identify individual genes associated with disease outcome, so that again the study should be powered based on inference for individual genes." (Dobbin and Simon, 2005)
  Which genes?
  "genes that have low variation across the entire set of samples would be difficult to use for prognostic prediction in clinical situations". (Simon, 2002)
  => Based on our previous study using NanoString technology, the variance of log gene expression has a median value of 0.57 (1.15 and 2.01 for the 3rd quartile and 90th percentile, respectively). Therefore, we demonstrate statistical power in gene identification using the median, upper quartile, and 90th percentile from the previous experiment.

What Methods?
  Compare with the sample size formula for binary testing

Power calculation statement
  There are 35 deaths and 81 failures (progression/relapse, or death) observed within 309 cases from E2496. Using the formula by Hsieh and Lavori (2000), the following table lists the hazard ratios of death/failure associated with a one-unit change in log expression that can be detected with 90% power, under a series of possible variances in gene expression. To adjust for the multiple comparison problem (total 235 genes), a small one-sided alpha of 0.001 is used in this calculation. For example, there is 90% power to detect a hazard ratio of 2.10 for death, or 1.62 for failure, with a one-unit change in log expression for a gene with variance 1.0, at a one-sided significance level of 0.001, using the Cox model. Comparing with the hazard ratios demonstrated in established gene-expression based predictors (e.g., Oncotype DX), the study has appropriate power for a similar type of analysis in this patient population.

  Variance                  Hazard ratio
  (log gene expression)     OS      FFS
  0.5                       2.85    1.99
  1.0                       2.10    1.62
  2.0                       1.69    1.41

Analysis (Plan)
  There are many approaches to building such a predictor from high-throughput (biomarker) data. Usually log-transformed and standardized data are used, and the predictor is a linear combination of a few "important" genes.
  1) Parsimonious predictive models using a penalized Cox model on the training cohort.
     The R package "penalized" implements the "elastic-net" method on a Cox regression model. The λ1 and λ2 parameters can be trained by using a leave-one-out cross-validation approach with the log-likelihood as the cross-validation metric. λ1 will be trained first and then λ2 will be trained with respect to the optimal λ1.
     To test whether the modeling process could be sensitive to the number of genes introduced, the model building process will be repeated with the genes associated with OS in univariate analysis at p<0.05, p<0.10, p<0.20, p<0.50 and p<1.0 (all 229 genes).

Analysis (Plan)
  2) Supervised principal components (SPC) technique with a nested K-fold cross-validation method.
     Specifically, the K-fold cross-validation method divides the data into K parts; (K-1) parts are combined as a training set to develop a model and one part is used as a testing set to validate it, and the procedure repeats K times. Within each repetition, we will calculate the univariate score for each gene using the training set and then retain only those features whose score exceeds a threshold. Principal components will be derived from the selected genes in the training set and used in a regression model to predict outcome in the testing set. The threshold will be established as the one that gives the highest average partial log-likelihood. The SPC technique will then be applied to the whole data set and a list of genes will be selected with the established threshold; the principal components derived using only the data from these selected features are used in a multivariate regression model to predict outcome.

So far ...
  Covered: Design (power/sample size), development (write-up) and analysis of individual markers & high-throughput data.

                    Clinical outcome
  Biomarker         Rate/dichotomized                    Time-to-event outcome
  Categorical       Fisher's exact / Chi-square test     Log-rank test
  Continuous        t-test / Wilcoxon test or similar    Cox regression model

  All under full-cohort design, and with inferential objectives
  With descriptive objectives, usually with small sample size (phase I or II trials) => no power statement; could provide confidence intervals

Cohort sampling
  Sample all cases (patients with event) and only a fraction of controls (patients without event). Popular in epidemiology studies.
  Sampling approaches:
    1) Classic case-cohort sampling: all study cases, plus a random subset of controls (independent of characteristics of cases)
    2) Nested case-control (NCC) sampling: all study cases but only a fraction of controls, selected randomly from the risk set of the matched cases
  Example:
    Investigate blood-based markers (e.g., EBV load) as predictors of treatment response and clinical outcomes in HL (E2496).
    Among cases with specimens available, a total of 54 patients relapsed (cases) during the study course. For each case, we will randomly draw 1-2 matched controls based on other baseline risk factors (age, years since diagnosis, gender, ...).
    Matching methods: propensity score matching

Cohort sampling design - analysis approaches
  Weighted analysis often used
  The underlying assumption for weighted analysis is that the study sample is representative of the study cohort, though with a distorted proportion of events (failures) produced by the sampling method (enrichment for failures). Therefore, unbiased estimates can be achieved by assigning different weights to the study sample with the goal of restoring the proportion of treatment failures; in other words, making the sample a "random" selection from the cohort.
    1. Gray (2009) gives details for implementation of weighted analyses for event-stratified subsampling for biomarker studies in clinical trials and epidemiologic cohort studies.
    2. Cai and Zheng (2011) proposed inverse probability weighted (IPW) estimators, where contributions of individuals are weighted inversely proportional to their sampling fractions.

A hypothetical example
  Aim: verify a biomarker in risk prediction (separate high vs low risk patients)
  Trial cohort: 500 patients, with 100 cases, 400 controls
  Biomarker performance: the high-risk group includes 90% of cases and 20% of controls; the low-risk group includes 10% of cases and 80% of controls
  Case-cohort design: sample all 100 cases and only 100 controls

  Trial cohort
    Risk group    # cases    # controls    Event rate
    High          90         80            0.52
    Low           10         320           0.03
    Ratio = 17

  Study sample
    Risk group    # cases    # controls    Event rate
    High          90         20            0.82
    Low           10         80            0.11
    Ratio = 7

  The ratio of failures in the "high" vs. "low" risk group is lower in the study sample relative to the trial cohort, thus a simple log-rank test would under-estimate the difference.

Example of a cohort using case-cohort design
  Figure: Kaplan-Meier estimates of FFS: unweighted analysis (left) and weighted analysis (right)

Closing remarks
  Clinical correlative studies are usually designed to discover or validate markers using archived specimens from a trial
    => Sample size is relatively fixed; clinical outcome is (largely) known.
    => Power/sample size calculation is constrained by trial design and sample size.
  Descriptive or inferential? For clinical trials with small sample size (phase I, II studies), descriptive might be an option.
  One, multiple or high-throughput biomarkers might be involved.
    => Power calculation is mainly based on inference for individual biomarkers, though it is not necessary (or possible) for every biomarker.
  Power/sample size calculation is to justify the study (that it will be able to yield something meaningful).
  Analysis plan (including tests) needs to be prospectively specified for all objectives.
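
  Returning to the hypothetical case-cohort example (500-patient cohort, all 100 cases and 100 of 400 controls sampled), the arithmetic behind the event rates, and the effect of inverse-probability weights, can be checked with a few lines of base R. This is a crude rate-based illustration only; it is not the weighted log-rank or Cox analysis of Gray (2009) or Cai and Zheng (2011), and the variable names are hypothetical.

```r
# Hypothetical example from the slides: 500-patient cohort, 100 cases,
# 400 controls; the case-cohort sample keeps all cases and 100 controls.
cohort <- data.frame(
  risk     = c("High", "Low"),
  cases    = c(90, 10),
  controls = c(80, 320)
)
# Controls are sampled at 100/400 within each risk group: 20 and 80
samp <- transform(cohort, controls = controls * 100 / 400)

event_rate <- function(cases, controls) cases / (cases + controls)

rate.cohort <- event_rate(cohort$cases, cohort$controls)
rate.sample <- event_rate(samp$cases, samp$controls)

# Inverse-probability weights: cases are sampled with probability 1,
# controls with probability 100/400
w.case    <- 1 / 1
w.control <- 1 / (100 / 400)
rate.weighted <- (w.case * samp$cases) /
  (w.case * samp$cases + w.control * samp$controls)

round(rbind(cohort = rate.cohort, sample = rate.sample,
            weighted = rate.weighted), 2)
```

  The unweighted sample rates are inflated in both risk groups and their ratio shrinks from roughly 17 to roughly 7, while the weighted rates reproduce the cohort rates.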