Download Vast population genetic diversity underlies the treatment

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Gene therapy of the human retina wikipedia , lookup

RNA-Seq wikipedia , lookup

Site-specific recombinase technology wikipedia , lookup

No-SCAR (Scarless Cas9 Assisted Recombineering) Genome Editing wikipedia , lookup

Population genetics wikipedia , lookup

Koinophilia wikipedia , lookup

Microevolution wikipedia , lookup

Mutagen wikipedia , lookup

NEDD9 wikipedia , lookup

Epistasis wikipedia , lookup

Oncogenomics wikipedia , lookup

Frameshift mutation wikipedia , lookup

Mutation wikipedia , lookup

Point mutation wikipedia , lookup

Transcript
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
Title:
VastpopulationgeneticdiversityunderliesthetreatmentdynamicsofETV6-RUNX1acute
lymphoblasticleukemia
One-Sentence Summary:
APOBEC and replication-associated mutagenesis contribute to the development of
ETV6-RUNX1 ALL, creating massive leukemic population genetic diversity that results
in clonal differences in susceptibilities to chemotherapy.
Authors:
Veronica Gonzalez-Pena1, Matthew MacKay2, Iwijn De Vlaminck2, John Easton3,
Charles Gawad1,3
Affiliations:
1
Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN,
38105, USA
2
Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, 14850, USA
2
Department of Computational Biology, St. Jude Children’s Research Hospital,
Memphis, TN, 38105, USA
Correspondence: Charles Gawad [email protected]
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
Abstract
Ensemble-averagedgenomeprofilingofdiagnosticsamplessuggeststhatacuteleukemias
harborfewsomaticgeneticalterations.Weusedsingle-cellexomeanderror-corrected
sequencingtosurveythegeneticdiversityunderlyingETV6-RUNX1acutelymphoblastic
leukemia(ALL)athighresolution.Thesurveyuncoveredavastrangeoflow-frequencygenetic
variantsthatwereundetectedinconventionalbulkassays,includingadditionalclone-specific
“driver”RASmutations.Single-cellexomesequencingrevealedAPOBECmutagenesistobe
importantindiseaseinitiationbutnotinprogressionandidentifiedmanymoremutationsper
cellthanpreviouslyfound.Usingthisdata,wecreatedabranchingmodelofETV6-RUNX1ALL
developmentthatrecapitulatesthegeneticfeaturesofpatients.Exposureofleukemic
populationstochemotherapyselectedforspecificclonesinadose-dependentmanner.
Together,thesedatahaveimportantimplicationsforunderstandingthedevelopmentand
treatmentresponseofchildhoodleukemia,andtheyprovideaframeworkforusingpopulation
geneticstodeeplyinterrogatecancerclonalevolution.
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
Maintext:
Introduction
Analogoustoorganismalevolutioninanecosystem,theenvironmentalselectionpressures,
populationsize,andrateofgenomemodificationarecoredeterminantsoftumorevolutionary
dynamics1.Contemporarybulksequencingmethodsthatinterrogatetheensemble-averaged
mutationalprofilesofthegenomesofthousandsofcellspredominantlyidentifythose
mutationspresentinthemostdominanttumorsubclonesatdiagnosis2.However,bulk
sequencingstrategiesarelimitedintheirabilitytocapturethefullgeneticdiversityofa
population,astheydonotidentifylower-frequencyclonesandvariants.Adeeper
understandingofthegeneticdiversitywithintumorsiskeytounderstandingtumorevolution,
includingthatofmalignantcellpopulationsthatmayincreaseinsizeastherelativefitnessof
cloneschangesinresponsetonewselectionpressuresaspatientsundergotreatment.
Ourknowledgeoftheevolutionarydynamicsofacuteleukemiasislargelyderivedfrombulk
sequencingstudies,whichhaveconcludedthatacuteleukemiasharborminimalgenetic
complexitywhencomparedtoothermalignantneoplasms3,4.InuteroacquisitionofanETV6RUNX1translocationhasbeenidentifiedasthemostfrequentinitiatingeventofacute
lymphoblasticleukemia(ALL)5,6,themostcommonchildhoodleukemia.However,anETV6RUNX1translocationisinsufficientforleukemogenesis7;recentgenomicstudieshaveidentified
deletionsofgenesrequiredfornormalB-celldifferentiation8thatharborsignaturesofaberrant
RAGrecombinaseactivityasfrequentcooperatinglesions9.Inaddition,ETV6-RUNX1ALLcells
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
alwaysharborsomaticsingle-nucleotidevariants(SNVs)9,10.Byusingsingle-cellgenomics,we
determinedthatmostofthedeletionsoccurbeforetheacquisitionofSNVs,whichresultinthe
outgrowthofco-dominantclonalpopulations9.Inaddition,analysesoftheSNVsequence
motifsrevealedanimportantmutagenicroleforaberrantcytosinedeaminaseactivitybyan
APOBEC(apolipoproteinBmRNA-editingenzyme,catalyticpolypeptide-like)protein9,10.As
APOBECproteinsmediateinnatedefenseagainstviralinfections11,thisfindingsupportsthe
hypothesisthatenvironmentalexposuretovirusestriggersthetransformingmutagenesis12.ALL
alsoundergoesclonalevolutionbetweenitsinitialdiagnosisandrelapse13,14.However,the
magnitudeofthatevolutionandthechangesinclonalcompositionbetweendiagnosisand
relapseremainsunknown.
SeveralfundamentalquestionsregardingthegenesisandtreatmentresponseofETV6-RUNX1
ALLcouldbemorepreciselyaddressedbystudyingALLgeneticsonapopulationscale.For
example,whyarethereco-dominantclonalpopulations,especiallywhensomeclonesharbor
known“driver”mutations?Whatisthetotalmutationburdenacrossthepopulationof
malignantcells,anddotheunderlyingmutationalprocesseschangeovertime?Howdoesthis
populationgeneticdiversityinfluencetreatmentresponse?
Toaddressthesequestions,wefirstusedsingle-cellexomesequencinganderror-corrected
sequencingtofurtherdissecttheintraclonalgeneticdiversityandtheshiftinmutational
processesthatoccurredduringETV6-RUNX1ALLdevelopment.Theseanalysesrevealed
evidenceofmassivepopulationgeneticdiversity.Then,afterpatientsbegantherapy,we
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
examinedtheevolutionofthoseleukemicclonesbyexposingsamplestostandard
chemotherapydrugs.Together,ourfindingsenhanceourunderstandingofthedevelopment
andtreatmentresponseofETV6-RUNX1ALLatthesingle-celllevelandhavesignificantclinical
implications.
Results
Clone-specific“driver”mutationsidentifiedbysingle-cellexomesequencing
Tofurthercharacterizethegenomicdiversityofclonespreviouslydefinedbysegregating
variantsidentifiedinthebulksampletosinglecells10,weperformedsingle-cellexome
sequencingon3cellsfromeachcloneandon3normalcellsfromthesamepatient(Fig.1A).
Weachievedameansaturatingcoveragebreadthof82%ofthetargetexomewith60million
reads,ascomparedto95%coverageofthetargetexomeinbulksamples(SupplementaryFig.
1).Usingonlythesinglecells,wecalledmutationsbyrequiringatleast2cellstohavethesame
basechangeatthesamegenomicposition.Theinitialclonalstructurehad5high-frequency
clones,withoneofthe2largestclonesharboringanE63KKRASmutation,whereaswecould
notclearlyidentifytransformingalterationsintheotherclones(Fig.1A).Wethenidentifieda
further10to29mutationsperclone(Fig.1B).Usingthenormalcellsasacontrol,weidentified
alowfalse-variantcallrateat7sites,possiblyresultingfromclonalmutationsacquiredin
nonmalignantcells,amplificationartifacts,orsequencingerrors.Wethencomparedthe
mutagenicbase-changepatternoftheearlier,higher-frequencymutationsthatweredetected
inbulkandweresharedbetweenclonestothelater,clone-specificchangesthatweredetected
onlywithsingle-cellexomesequencing.Interestingly,theearlymutationswerestrongly
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
enrichedforC-to-TandC-to-GchangeswithanassociatedAPOBECmotif(TprecedingC),
whereasthemostcommonlatermutationswereA-to-G,A-to-C,andC-to-T,withenrichment
forGfollowingC.Thatsignatureismostconsistentwithreplicationerrors(Fig.1C,D)15.Wealso
foundaKRASG12Smutationthatwasconfinedtoalessabundantclone,aswellasanother
clone-specificNRASG12Dvariant;bothofthesemutationswereC-to-Tchanges(Fig.1A).
Becausethe3activatingRASmutations(G12R,G12S,E63K)wereacquiredindistinctclonesand
notsequentially,wehypothesizedthattheyoccurredoverarelativelyshorttime.Wethen
proposedthatmoreoncogenicmutationswouldbeacquiredslightlylaterthanthedominant
clones,resultingintheirbeingpresentatevenlowerfrequencies.
Deeperinterrogationidentifiesmassivepopulationgeneticdiversity
Tomoredeeplycharacterizeoncogenicmutationsinthebulksamplefromthesamepatient,
weperformederror-correctedtargetedsequencingof50mutationalhotspotsinALL
(SupplementaryTable1).Weidentifiedthesame3RASmutations(G12R,G12S,E63K)that
weredetectedthroughbulkandsingle-cellexomesequencing,alongwith2additionalknown
activatingKRASmutations(G12D,D119N)(Fig.2A)16–19,allofwhichwereC-to-Tsubstitutions.
Wefoundnoevidenceofmutationsintheother37genesthatweexaminedusingaFisher’s
exacttestcutoffof0.01.TheonlybenignlesionfoundinaRASgenewasaG12Gmutationthat
waspartofthedinucleotideG12Dvariant.
ToevaluatewhetherthesefindingswereapplicabletoawiderrangeofETV6-RUNX1leukemias,
weperformederror-correctedsequencingonsamplesfrom6patientswithasingleRAS
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
mutationidentifiedbybulksequencingandon6samplesfrompatientswithnoknownRAS
mutations.WedetectednoRASmutationsinthelatter6samples;however,wefoundamedian
of5activatingRASmutationsinthe6patientsforwhomamutationwasdetectedinthebulk
samples(Fig.2B).OnexaminingtheadditionalRASvariantsmoreclosely,wefoundthe
mutationstobeclusteredatknownRASmutationalhotspots,namelyKRAScodons12,13,119,
and146andNRAScodons12and13.Therewasnoclearpropensityforvariantsatspecific
codonstobesubclonal,supportingtheassertionthatthetimingoftheacquisitiondetermined
whetheramutationbecamedominant,asopposedtothechangeinRASsignalingactivitydue
tothespecificaminoacidchange(Fig.3C).
WehypothesizedthatevenmoresensitivemeasurementswouldidentifyadditionalRAS
mutations.Weperformedlimitedsampledilutionsfollowedbyerror-correctedsequencingof
patientSJETV075,hypothesizingthatsomeRASmutationsthatwerejustbelowourdetection
thresholdwouldrandomlybepresentatslightlyhigherfrequenciesinthedilutesamples.By
thisapproach,weidentified3additionallow-frequencyKRASmutations(G12R,A146T,L19F)
foratotalof9knownactivatingRASmutationsinpatientSJETV075.OurexaminationofallRAS
mutationsinthiscohortrevealedthatofthe33mutationsidentified,31wereC-to-TorC-to-G
changes.The5C-to-GchangeswereenrichedfortheAPOBECmotif,butnosimilarenrichment
wasfoundintheC-to-Tmutations,suggestingthattheyresultedfromnear-targetAPOBEC
activityorreplication-associatedmutagenesis.Thiscontrastswithlungcancer,inwhich
approximately60%ofKRASmutationsareG12CorG12V,resultingfromC-to-Asubstitutions20.
Together,theseobservationsfurthersupportthehypothesisthatAPOBECactivityand
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
replication-associatedmutagenesisaretheunderlyingprocessesdrivingtheevolutionofETV6RUNX1ALL.
Wethenconsideredthevariablesthatwouldresultinsuchalargeintra-patientactivatingRAS
mutationburden.Inourfirstmodel,weproposedthatalargepopulationofcellswasatriskfor
rapidRAS-mediatedexpansionandunderwentawidespreadmutagenesisprocess,resultingin
theconcurrentoutgrowthofmultipleclonescontainingmutationsconferringsimilarfitness.In
oursecondmodel,weproposedthatasmallerpopulationwasatriskfortransformation,witha
lowerglobalmutationrate,butthatRASwasamutationalhotspot,resultingintheacquisition
ofmultipleRASmutationsduringtheperiodofleukemictransformation.IftheRASgeneswere
mutationalhotspotsforETV6-RUNX1ALL,wewouldexpecttofindadditionalbenignmutations
aspassengersoftheactivatingmutations.However,evenourmoresensitiveapproachto
mutationdetection,usingbothsingle-cellanderror-correctedsequencing,foundnoadditional
somaticvariantswithinKRAS,NRAS,orHRAS.
Takentogether,thesefindingssupportthemodelinwhichamassivemutationburdendevelops
withinarelativelyshorttimeinapopulationatriskfortransformation.Tofurthersupportthis
model,wemeasuredthemutationrateinsingleleukemiacellsandusedthenormalcellsfrom
thesamesamplethathadundergonewhole-genomeamplificationinthesamemicrofluidic
chiptocontrolforthebackgroundmutationsandforamplificationandsequencingerrors.By
thisapproach,wemeasuredameanof208codingmutationspercellthatwereabovethe
backgroundrateinthenormalcells(Fig.3A).Comparedtoourpreviousmutationestimation,
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
thisapproachestimatedgreatergeneticdiversity,witheachcellharboringameanof150
codingvariantsthatwerenotdetectedbybulksequencingorintraclonalexomesequencing
(Fig.3B).
Simulationestimatesthesizeandgeneticdiversityofleukemicpopulations
Giventhelargenumberofmutationspercell,wedevisedamodeltoestimatethepopulation
sizeinordertoapproximatethetotalgeneticdiversityacrossalltheclonalpopulationsatthe
timeALLisdiagnosed.Toaccomplishthis,weusedthosevariablesthatcouldbeestimated
fromtheresultsofpreviousstudies,alongwithourcurrentmeasurements,tosimulateETV6RUNX1ALLdevelopment.WeknowthatthediseaseisinitiatedfromasinglecellbyanETV6RUNX1translocation,asallcellsharborthesamebreakpoint5.Furthermore,eachpopulation
acquiresameanof12deletions9.Wealsoknowthatthediseaseisinitiatedinuteroand
developsoverameanof4.7years21;thatlymphocyteprecursorsdivideapproximatelyevery
11.9days22,whereasprimaryleukemiccellsdividemuchmorefrequently;andthatreplication
errorsoccurincodingregionswithafrequencyofapproximatelyonceinevery300humancell
divisions24.Fromourresults,weestimatethateachcellhadacquiredapproximately200coding
SNVs,withmanyofthemarisingfromAPOBECmutagenesisthatoccurredinaburstovera
shorttime,randomlyresultinginthe16activatingRASmutationsidentifiedinourETV6-RUNX1
ALLcohort.CellsharboringETV6-RUNX1,arecurrentdeletion,andaRASmutationhave
significantlyincreasedratesofreplication,resultingintheclinicaldiagnosiswhenchildrenreach
atotalleukemiccellburdenofapproximately1 × 1011cells(Fig.4A).
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
Usingthoseparameters,weinitiated100simulationsinwhichthemutationratesresultedin
eachcellacquiringameanof13deletionsand229SNVsbyameanof28daysafteraburstof
APOBECmutagenesis(SupplementaryTable2,Fig.4B).Definingclonesbasedoncellswiththe
samesomaticmutationprofileincodingregions,weestimatethattherewere330million
clonesintotal(SupplementaryTable2).However,mostcloneswerecreatedwhentheoverall
populationsizewashigh,makingthemrare(Fig.4C).Thenumberofhigh-frequencycloneswas
variableanddependedonthetimeandfrequencyofRASmutationacquisition(Fig.4D).We
alsofoundevidencethatAPOBECmutagenesisandnotreplicationerrorscausedmostofthe
distinctRASmutationsandthatthenumberofuniqueRASmutationsvariedconsiderably
betweensimulations(SupplementaryFig.3).Takentogether,thesesimulationresultsare
consistentwithourexperimentaldataandsupporttheassertionthatthereismassive
populationgeneticdiversityatthetimeapatientisdiagnosedwithALL.
Massivepopulationdiversitydrivestreatmentdynamicsinapatientsample
Withsuchhighpopulationgeneticdiversity,wehypothesizedthatsomeclonesalready
harboredmutationsatdiagnosisthatalteredtheirsusceptibilitytotreatment.Totestthat
hypothesis,weexposedleukemiccellsto5standardALLchemotherapydrugs(mercaptopurine,
vincristine,prednisolone,daunorubicin,asparaginase)andusedexomesequencingtoevaluate
themutationalcompositionafterdrugexposure.Fromthisinitialscreen,weidentified537
putativemutationsinatleastonetreatment.Wethenperformederror-correctedsequencing
ofthosesitesintriplicateforeachtreatmenttoconfirmthatthemutationswerepre-existing.
Wedetected224specificbasechangesatthesamelocationinmorethanonesample
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
(SupplementalTable3)—approximately5timesasmanymutationsasthe41thatweidentified
intheinitialbulksequencing.
Toidentifyresistantclones,wetreatedthecellswithanincreasingdoseofeachchemotherapy
drugandlookedformutationsexhibitingadose-dependentincreaseinfrequency.Thecontrol
cellstreatedwithnodrugorDMSO,aswellasthecellstreatedwiththelowestdoseof
asparaginase,showedanincreaseinadistinctclusterofmutations.Thisclusterincludedthe
highest-frequency146VKRASmutation,alongwith2nonsynonymousTP53mutations(Fig.5,
cluster7).Interestingly,2mutations(FOLH1R281HandRGPD3P816C)werestronglyselected
forinallcontrolandtreatmentsamplesbutwerenotdetectedinthediagnosticsample,
suggestingtheywereselectedforbythecultureconditions.Treatmentwithlow-dose
mercaptopurineorhigherdosesofasparaginaseresultedinreducedexpansionoftheKRAS
A146Vclone.Thisisconsistentwiththeknownpharmacokineticsofasparaginase,whereby
increasingthedoseaftertargetsaturationhasnoeffectoncellkilling25.Treatmentwith
vincristineordoselevel2ofmercaptopurinefurtherdecreasedtheclone(s)inmutationcluster
7anddecreasedthefrequencyofthehighest-frequencymutationsincluster1.Finally,
exposuretoprednisolone,thehighestdosesofmercaptopurine,ordaunorubicinresultedinthe
greatestdecreaseinmutationsinclusters1and7,whereasmutationsinclusters2through4
increasedinfrequencyasaresultofthesetreatments.Treatmentwiththehighestdosesof
daunorubicinkilledallcellularpopulations.Takentogether,thesedatashowthatthe
underlyinggeneticdiversitywhenALLisdiagnoseddoesaffectthetreatmentdynamics.
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
Discussion
Wehavepresentedtheresultsofacombinationoferror-correctedandsingle-cellsequencing
ofETV6-RUNX1ALLcellscollectedatdiagnosisinordertofurtherresolvethetemporalchanges
inclonalstructuresandmutationalprocessesthatoccurduringthedevelopmentofthedisease.
TheshiftintherelativefrequencyofdifferenttypesofcytosinemutationrevealedanAPOBEC
mutagenesispatternthatdecreasesovertime,suggestingthatthisprocessisimportantfor
diseaseprogressionbutisnotrequiredforpersistenceorongoingexpansionofleukemic
clones.Wedetectedanunexpectedlyhighpopulationmutationburden,whichrevealed
differencesinthetreatmentresponseamongtheclonesthatarosefromthepopulationthat
hadpreviouslyundergoneAPOBECandreplication-mediatedmutagenesis.Thisfinding
emphasizesthedynamicnatureofleukemicevolution,astherelativeimportanceofmutations
requiredforcellsurvivalshiftsunderdistinctselectionpressuresaspatientsundergo
treatment.Wehaveintegratedthesefindingswithpreviousknowledgetocreateanewmodel
ofETV6-RUNX1ALL,whichispresentedinFigure6.
ThesenewdetailsoftemporalshiftsinthemutationalprocessesofALLintheperiodleadingup
toapatient'sclinicalpresentationhighlighthowsingle-cellgenomicscanbeusedtotraceback
themutationalhistoriesoftumors.ThedecreaseinAPOBEC-mediatedmutagenesisduring
diseaseprogressionremovesamajorcontributortotheglobalmutationrateinthosepatients.
ThischangeinthemutationratemaybeonereasonwhyALLpatientsarecuredatsuchhigh
ratescomparedtopatientswithothertumortypesinwhichthesourceofthemutagenesis,
suchasamutationthatdecreasesthefidelityoftheDNArepairpathway,doesnotdecrease
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
overtime26.Fromourresults,itisunclearwhateffectongoingexposuretomutagenic
chemotherapyagentshasontheinductionofdrug-resistantclones,buttheeffectcouldbe
significantinpediatricpatientswithALL,mostofwhomundergotreatmentfor2to3years27.
Mutationsthatareconsidereddrivinglesions,suchasvariantsinKRASorTP53,werenot
dominantatdiagnosisorafterundergoingselectionduringdrugtreatment,suggestingthat
combinationsofmutationsmediatetheclinicalbehaviorsofclones.Wealsoobservedtherise
ofclustersofmutations,butitisunclearhowmanydistinctclonalpopulationsthoseclusters
represent.Thisisanareawherehigher-resolutionstudiesusingsingle-cellsequencingto
determinetheco-occurrencepatternsofmutationswillprovideadditionalinsightsintodrug
resistance.Theputativehighnumberanddifferentialdrugsensitivityofleukemicclonesalso
provideinsightsintotheneedforcombinationtherapytocurechildrenwithALL.Althoughthe
sheergeneticdiversityofthepopulationpresentsnewchallengesforthedesignoftargeted
treatmentstrategies,thepresenceofmassivepopulationgeneticdiversityinALLalsoprovides
anopportunitytoprobethosesamplestolearnwhywecanalreadyovercomethepre-existent
resistanceinmostpatientswithALL.Focusingonspecificmutationsthatareselectedforby
particulardrugscouldyieldmechanisticinsightsintodrug-specificresistanceandprovideanew
rationaleforchoosingdrugcombinations.
Insummary,wehaveprovidednewinsightsintothedevelopmentofETV6-RUNX1ALLandits
resistancetotreatmentbystudyingpopulationgenetics.Ourfindingsunderscorethe
importanceofstudyingthemutationburden,size,andmutationrateofthepopulationasa
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
whole,notjustthehighest-frequencyvariantsdetectedbystandardbulksequencing,when
tryingtounderstandandpredicttreatmentresponse.Greatergeneticdiversityhasalso
recentlybeenreportedinotherpremalignantandmalignantstates28,29,suggestingthatthe
studyofcancerpopulationgeneticsisimportantforunderstandingmosttumortypes.
Together,thesestudieshaverevealednewlayersofcomplexityinleukemicevolutionthatneed
tobefullyunderstoodinordertomoreeffectivelyeradicatepremalignantandmalignantALL
cellpopulationswithlesstreatment-relatedtoxicity,andtheyhaveprovidedaframeworkfor
studyingintra-tumorevolutioninawiderangeofmalignantneoplasms.
OnlineMethods
Single-cellexomesequencingandmutationcalling
AmplifiedDNAfrompatient4thathadundergonesingle-cellisolationandwhole-genome
amplificationusingtheFluidigmC1Systemaspreviouslydescribed10wasusedforlibrary
constructionandexomecapturewiththeNexteraRapidCaptureExomeKit(Illumina),usedin
accordancewiththemanufacturer'sinstructions.Exome-enrichedlibrariesthenunderwent
sequencingusing2 ××100readson4flowcellsofaHiSeq2000or2500SequencingSystem
(Illumina).AdaptersweretrimmedfromeachofthecellsbyusingTrimmomatic
(ILLUMINACLIP:nextera_adapters.fa:2:30:10TRAILING:25LEADING:25SLIDINGWINDOW:4:20
MINLEN:30),followedbyalignmentwithBWAusingdefaultparameters.Duplicateswere
markedusingPicard(https://broadinstitute.github.io/picard),andlocalrealignmentfollowed
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
bybasescorerecalibrationwasperformedusingGATK
(https://software.broadinstitute.org/gatk).WethencalledvariantsbyusingGATKandfollowed
thiswithfilteringusingtheparameter“QD<2.0||FS>60.0||MQ<40.0||HaplotypeScore>
13.0||MQRankSum<−12.5||ReadPosRankSum<−8.0”.On-targetcoveragewascalculated
withPicardHsMetrics;thiswasrepeatedaftersubsamplingforanincreasingnumberofreads
usingcustombashscripts.Custombashscriptswerealsousedtoidentifylocationsthathadthe
samemutationcalledinmorethanonecell.GermlineSNPlocationsidentifiedbybulk
sequencingwerethenfilteredout,afterwhichlocationsthatwereidentifiedinanyofthe
normalsinglecellswereremoved.
Error-correctedsequencing
Adapterswithuniqueidentifierswerepreparedaspreviouslydescribed.Aliquotsof250or500
ngofgenomicDNAthenunderwent30minofchemicalfragmentationandstandardlibrary
preparationbyusingtheKAPAHyperPlusKit(KapaBiosystems)withadaptersthatcontained
uniquemolecularidentifiersasdescribed30,using3μgofadapterperreaction(a10:1molar
ratio).PCRamplificationandhybridcapturewereperformedaspreviouslydescribed31.
SequencingwasperformedusingMiSeqV2chemistry,using2 × 150-bpPEreads.Wethen
trimmedthesequencesto125bpwithTrimmomaticandplacedtheuniquemolecular
identifiersintotheheaderbyusingthescripttag_to_header.py30.Readswerealignedusing
BWAALNwithstandardparameters.WesortedandindexedusingPicardthenperformed
consensuscallingbyusingConsensusMaker.pywithparameters–minmem3,–cutoff0.8,and-Ncutoff0.7.UnmappedreadswereremovedwithSAMtools(http://samtools.sourceforge.net/),
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
thenlocalrealignmentwasperformedusingGATKbeforecreatinganmpileupfile.Normaland
TumormpileupfileswerethencomparedusingVarScanSomatic,withsomaticmutations
requiringaP-valueoflessthan10−4,ascomputedusingFisher’sexacttestbyVarScan
(http://varscan.sourceforge.net).Wealsorequiredthegermlinesampletohavefewerthan5
readsandthatnomorethan90%ofvariantreadswereonthesameDNAstrand.Variants
underwentRefSeqannotationwithANNOVAR.
Estimationofmutationrates
Toestimatethemutationrate,wedownsampledeachofthefilesto70millionreads.Wethen
createdmarked,realigned,andbasescore–recalibratedBAMfilesasdescribedabove.Thiswas
followedbyfurthervariantcallingandfilteringusingtheGATKfilteringparametersdetailed
above.Wethensubtractedthosesitesthatwerefoundinthebulkgermlinesequencing.To
subtractthebackgrounderrorrateduetoamplificationerrors,thesomaticmosaicismratesin
normalcells,andmutationmiscalls,wesubtractedthemeanmutationrateinthe3normalcells
fromthatofeachofthesingletumorcellsandplottedthedistributionofthemutationsrates.
Simulation
WedesignedacomputationalmodeltoinvestigatewhenRAG-mediateddeletionsandRAS
mutationsoccur,theclonalburden,andthetimingofdiseaseonset.Themodelisinitiatedwith
asinglecellwithanETV6-RUNX1translocationthatiscapableofdifferentiationanddivides
every12days,definedascelltype0.Ateachtimepoint,cellsmaygainadeletionor,ifdividing
orifAPOBECmutagenesisisactive,anSNV.Type1cellsarecreatedwhentype0cellsgaina
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
specificRAG-mediateddeletion,whichoccurswithprobabilitypRAG_DA.Type1cellsare
consideredtohavedifferentiationarrest,buttheydivideatthesamerateastype0cells.Type
2cellsarecreatedwhentype1cellshaveamutationwithinRAS,whichoccurswithprobability
pMut_IP.Type2cellshavedifferentiationarrestaswellasincreasedproliferation,duplicating
everyday.Thenumberofcellswithineachcelltypewithaspecificamountofmutationsand
deletionsaretracked.Onetimepointisequivalentto1day,andeachcellisstochastically
sampled.
Foreachcellofagiventype,arandomnumberisgeneratedfrom1to1/pDel,wherepDelisthe
probabilityofgainingadeletion.Adeletionoccurswithincellsforwhichthisrandomnumber
equals1.Asecondrandomnumberisgeneratedfrom1:1/pRAG_DAforeachtype0cellthat
acquiresadeletion.Anycellinwhichthissecondnumberis1containsadeletionthatcausesit
totransformintoatype1cell.Thissameprocessiscarriedoutwhencalculatingthemutational
burdenaswellasthetransformationoftype2cellsintotype3cells.
APOBECmutagenesisstartsonafixedday,ifcelltype3hasnotbeengenerated,andcontinues
for2daysaftercelltype3hasformed.MutationsgeneratedfromAPOBECarecalculatedinthe
samewayaspreviouslydescribed,withtheexceptionthateverycellwithincelltypes1and2
undergoesAPOBECmutagenesisandthenumberofmutationscausedbyAPOBECwithina
givencellisdeterminedbyrandomlysamplingfrom1:Burst.Thesimulationcompleteswhen
celltype3hasreached1 × 1011cells.Weestimatethenumberofcellsperclonebythetiming
ofthenewlyacquiredmutationandreplicationrate,accountingforthebranchingofcellssuch
thatthesumofallcellsequalsthenumberofcellsofagiventype.
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
SimulationswererunusingournewRpackage,whichiscalledRepALLandisavailableat
https://github.com/mjdm/RepALL.Weranthesimulationwiththefollowingdefault
parameters:
ProbabilityofgainingaRAG-mediateddeletion,pRAG:.008
Probabilityofcausingdifferentiationarrest,givenadeletion(Type0),pRAG_DA:2e-7
Probabilityofgainingamutationduetoreplicationerror,pBGMut:0.003
APOBECmutagenesisstartday:1290
APOBECdurationafterleukemiccell-typeinitiation:3days
MaximumnumberofAPOBECmutationsgeneratedpercellperday,Burst:75
ProbabilityofgainingaRASmutation,givenamutation(Type1),pMut_IP:16/(3e9*.02)=2.6e7
Primarycellcultureanddrugtreatment
PrimarysampleswerefrompatientsthathadprovidedconsentinstudiesapprovedbytheSt.
JudeIRB.Atthetimeofsamplecollection,mononuclearcellswereisolatedusingFicoll-Paque
(GELifeSciences)followedbycryopreservation.Onevialofcellsfromeachpatientwasthawed
slowlyusingtheThawSTARsystem(MedCision),andthecellswereplacedincultureunderthe
conditionspreviouslydescribed32.Forthelimiteddilutionexperiment,750,000cellswere
platedineachwellofa12-wellplateandweregrowninculturefor3weeks.Fordrug
treatments,thedrugsand350,000cellswereplatedineachwellofa24-wellplate.Inboth
experiments,themediumwaschangedtwiceweeklybycarefullyremovinghalfofitand
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
replacingitwithfreshmedium.Thereplacementmediumincludeda2×drugconcentrationif
thecellsamplewasundergoingchemotherapyexposure.AlldrugswerepurchasedfromSigmaAldrich,andtheconcentrationrangeswerebasedonsolubilitylimitsandpreviouslypublished
data32.Thedrugsusedweremercaptopurine(500,250,125,and62.5μg/mL
ConsensusMaker.pyand90μg/mL),vincristine(810,162,32.4,and6.5μg/mL),daunorubicin
(31,6.2,1.2,and0.2μg/mL),andasparaginase(19,9.5,4.8,and2.4μg/mL).Livecellswere
isolatedbyusingadeadcellremovalkit(Miltenyl).DNAwasextractedusingaDNAUniversal
Kit(ZymoResearch),andlibrarieswerepreparedusingtheHyperPlusKit(KapaBiosciences).
Exomeorcustomcapturewasperformedusingoligonucleotidesandthestandardprotocol
fromIntegratedDNATechnologies.Qualitytrimming,alignment,andmutationcallingwere
performedusingthepipelineoutlinedabove.
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
References
1.
Merlo,L.M.,PepperJ.W.,Reid,B.J.&MaleyC.C.Cancerasanevolutionaryand
ecologicalprocess.Nat.Rev.Cancer6,924–935(2006).
2.
CancerGenomeAtlasResearchNetworketal.TheCancerGenomeAtlasPan-Cancer
analysisproject.Nat.Genet.45,1113–1120(2013).
3.
Kandoth,C.etal.Mutationallandscapeandsignificanceacross12majorcancertypes.
Nature502,333–339(2013).
4.
Lawrence,M.S.etal.Discoveryandsaturationanalysisofcancergenesacross21
tumourtypes.Nature505,495–501(2014).
5.
Golub,T.R.etal.,FusionoftheTELgeneon12p13totheAML1geneon21q22inacute
lymphoblasticleukemia.Proc.Natl.Acad.Sci.U.S.A.92,4917–4921(1995).
6.
Greaves,M.Pre-nataloriginsofchildhoodleukemia.Rev.Clin.Exp.Hematol.7,233–245
(2003).
7.
Mori,H.etal.Chromosometranslocationsandcovertleukemicclonesaregenerated
duringnormalfetaldevelopment.Proc.Natl.Acad.Sci.U.S.A.99,8242–8247(2002).
8.
Mullighan,C.G.etal.,Genome-wideanalysisofgeneticalterationsinacute
lymphoblasticleukaemia.Nature446,758–764(2007).
9.
Papaemmanuil,E.etal.,RAG-mediatedrecombinationisthepredominantdriverof
oncogenicrearrangementinETV6-RUNX1acutelymphoblasticleukemia.Nat.Genet.46,
116–125(2014).
10.
Gawad,C.,Koh,W.&Quake,S.R.Dissectingtheclonaloriginsofchildhoodacute
lymphoblasticleukemiabysingle-cellgenomics.Proc.Natl.Acad.Sci.U.S.A.111,
17947–17952(2014).
11.
Harris,R.S.&Liddament,M.T.RetroviralrestrictionbyAPOBECproteins.Nat.Rev.
Immunol.4,868–877(2004).
12.
Greaves,M.Infection,immuneresponsesandtheaetiologyofchildhoodleukaemia.
Nat.Rev.Cancer6,193–203(2006).
13.
Mullighan,C.G.etal.Genomicanalysisoftheclonaloriginsofrelapsedacute
lymphoblasticleukemia.Science322,1377–1380(2008).
14.
Ma,X.etal.RiseandfallofsubclonesfromdiagnosistorelapseinpediatricB-acute
lymphoblasticleukaemia.Nat.Commun.6,6604(2015).
15.
Helleday,T.,Eshtad,S.&Nik-Zainal,S.Mechanismsunderlyingmutationalsignaturesin
humancancers.Nat.Rev.Genet.15,585–598(2014).
16.
Schubbert,S.etal.GermlineKRASmutationscauseNoonansyndrome.Nat.Genet.38,
331–336(2006).
17.
Scholl,C.etal.SyntheticlethalinteractionbetweenoncogenicKRASdependencyand
STK33suppressioninhumancancercells.Cell137,821–834(2009).
18.
Janakiraman,M.etal.Genomicandbiologicalcharacterizationofexon4KRAS
mutationsinhumancancer.CancerRes.70,5901–5911(2010).
19.
Smith,G.etal.ActivatingK-Rasmutationsoutwith'hotspot'codonsinsporadic
colorectaltumours—implicationsforpersonalisedcancermedicine.Br.J.Cancer102,
693–703(2010).
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
20.
21.
22.
23.
24.
25.
26.
27.
28.
29.
30.
31.
32.
Kerner,G.S.etal.,CommonandrareEGFRandKRASmutationsinaDutchnon-smallcelllungcancerpopulationandtheirclinicaloutcome.PloSOne8,e70346(2013).
Rubnitz,J.E.etal.ProspectiveanalysisofTELgenerearrangementsinchildhoodacute
lymphoblasticleukemia:aChildren'sOncologyGroupstudy.J.Clin.Oncol.26,2186–
2191(2008).
Harbott,J.,Viehmann,S.,Borkhardt,A.,Henze,G.&Lampert,F.IncidenceofTEL/AML1
fusiongeneanalyzedconsecutivelyinchildrenwithacutelymphoblasticleukemiain
relapse.Blood90,4933–4937(1997).
Cooperman,J.,Neely,R.,Teachey,D.T.,Grupp,S.&Choi,J.K.Celldivisionratesof
primaryhumanprecursorBcellsinculturereflectinvivorates.StemCells22,1111–
1120(2004).
Drake,J.W.,Charlesworth,B.,Charlesworth,D.&Crow,J.F.Ratesofspontaneous
mutation.Genetics148,1667–1686(1998).
Asselin,B.L.etal.InvitroandinvivokillingofacutelymphoblasticleukemiacellsbyLasparaginase.CancerRes.49,4363–4368(1989).
Negrini,S.,Gorgoulis,V.G.,&Halazonetis,T.D.Genomicinstability—anevolving
hallmarkofcancer.Nat.Rev.Mol.CellBiol.11,220–228(2010).
Pui,C.H.&Evans,W.E.Treatmentofacutelymphoblasticleukemia.N.Engl.J.Med.
354,166–178(2006).
Sottoriva,A.etal.,ABigBangmodelofhumancolorectaltumorgrowth.Nat.Genet.47,
209–216(2015).
Martincorena,I.etal.,Tumorevolution.Highburdenandpervasivepositiveselectionof
somaticmutationsinnormalhumanskin.Science348,880–886(2015).
Kennedy,S.R.etal.Detectingultralow-frequencymutationsbyDuplexSequencing.Nat.
Protoc.9,2586–2606(2014).
Schmitt,M.W.etal.Sequencingsmallgenomictargetswithhighefficiencyandextreme
accuracy.Nat.Methods12,423–425(2015).
Pieters,R.etal.,Invitrodrugsensitivityofcellsfromchildrenwithleukemiausingthe
MTTassaywithimprovedcultureconditions.Blood76,2327–2336(1990).
Acknowledgements
The authors would like to acknowledge Stephen Quake, Jinghui Zhang, and Jim
Downing for their constructive advice on the project. In addition, the authors would like
to thank the members of the Pediatric Cancer Genome Project, especially Charles
Mullighan and Ching-Hon Pui who lead the Hematological Malignancies Program.
V.G., Y.I., J.E., and C.G. are supported by ALSAC. C.G. is also supported by the
Burroughs Wellcome Fund, Leukemia and Lymphoma Society, Hyundai Hope on
Wheels, and the American Society of Hematology.
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
The data are available in the short read archive accession ID .
Author contributions: C.G., V.G., and M.M. designed research; V.G., J.E., M.M., and
C.G. performed research; C.G., M.M., and I.D. contributed new reagents/analytic tools;
C.G., and M.M. analyzed data; and C.G., V.G., M.M., and I.D. wrote the paper.
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
FigureLegends:
Figure1.Identificationofclone-specific“driver”mutationsbyusingsingle-cellexome
sequencing.(A)TheclonalstructureofanETV6-RUNX1diagnosticpatientsamplethatwas
identifiedbyinterrogatingsinglecellsformutationsfirstdetectedinthebulksamplewas
furtherresolvedbycallingmutationsinthesinglecellsalone.Theclone-specific“driver”RAS
mutationsidentifiedaspossiblecausesoftheclonalexpansionsarenoted.(B)Thenumberof
newmutationsidentifiedineachcloneusingphasingofbulkmutationsand2-cellmutation
calls.(C)Basesubstitutionpatternsseeninshared(early)andclone-specific(later)mutations.
(D)ThesurroundingmotifsinC-to-TmutationsinearlyandlateSNVs,showingthatthestrong
APOBECmotifisonlypresentintheearlymutations.
Figure2.EvidenceforlargepopulationgeneticdiversityinETV6-RUNX1ALL.(A)Error-corrected
sequencingconfirmed3clone-specificactivatingRASmutationsandidentified2additional
lower-frequencyactivatingmutations.(B)SubclonalRASmutationswerealsocommonina
largercohortinwhicheachpatienthadonemutationidentifiedinthebulksamplebuthada
medianof5activatingRASmutations.(C)TheallelefrequencydistributionsofRASmutations
shownoevidenceofpreferentialselectionofspecificaminoacidchanges.(D)Theincreased
sensitivityofmutationdetectionwithlimitingdilutionidentifiesadditionalactivatingRAS
mutations.
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
Figure3.Estimatingsingle-cellmutationrates.(A)Comparisonofthenumberofexome
mutationcallsinleukemiacellstothatinnormalcellsafterremovinggermlineSNVs.
Subtractingthenumberofmutationcallsinthenormalcellsprovidesanestimateofthe
mutationsacquiredbyeachleukemiacell.(B)Increasingtheresolutionofmutationcalls
identifieslower-frequencymutations.Theincreasinggeneticdiversityathigher-resolution
measurementsfurthersupportstheexistenceofmuchhigherpopulationgeneticdiversity.
Figure4.OnehundredsimulationsofthedevelopmentofETV6-RUNX1ALL.(A)Overviewof
ETV6-RUNX1simulationinwhichasinglecellwithanETV6-RUNX1translocationevolvesover
theyears.Cellsthatacquiredeletionsthatcausedifferentiationarrestareatriskfor
transformationbyeitherreplicationorAPOBECmutationsthatcreateanactivatingRAS
mutation.Thecellsexpanduntilthesubjectacquiresatotalleukemiccellburdenof
approximately1 × 1011.(B)Thetimingofdifferentiationarrest,APOBECmutagenesis,
appearanceofthefirstactivatingRASmutantclone,andclinicalpresentationofdisease,using
thedescribedparameters,aresimilartowhatisseeninpatients.(C)Whendefiningaclone
basedonthesomaticcodingmutationprofileofeachcell,thereisaninversecorrelation
betweenthesizeandfrequencyofclones.Thenumberofcloneswiththelargestnumberof
cellsismorevariableacrosssimulationsasaresultofthetimingandfrequencyofactivating
RASmutationsincellsthatalreadyharboranETV6-RUNX1translocationandadeletionthat
causesdifferentiationarrest.(D)Trackingtheclonesizeandfrequencyforeachofthe100
simulationsshowsthevarianceinthenumberofhigh-frequencyclonesacrosssimulations.
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
Figure5.Differentialsensitivityofleukemicpopulationstochemotherapy.Clustersof
mutationsshowingpatternsofresponsetodrugtreatmentanddosage.Cloneswithcluster7
mutations,whichincludesKRASA146VandtwoTP53mutations,expandedwithouttreatment
oruponexposuretolow-doseasparaginasewhencomparedtothediagnosticsamplethatwas
notplacedinculture.Mutationsinclusters8and9wereselectedinallsamplesunderthe
cultureconditionsused.Low-dosemercaptopurineorhigherdosesofasparaginaselimitedthe
expansionofmutationcluster7.Exposuretovincristineoranincreaseddoseof
mercaptopurinefurtherreducedmutationcluster7.Higherdosesofmercaptopurine,aswellas
exposuretoprednisoloneordaunorubicin,furtherdecreasedthefrequencyofmutationsin
clusters1and7whileselectingforcloneswithmutationsinclusters2through4.
Figure6.ModelofETV6-RUNX1evolution.Amassivenumberofclonalpopulationsarepresent
atdiagnosis,withasubsetcontainingvariantsthatallowthemtobecomethemostabundant
(red).Afterselectionpressureschangeduringtreatment,someofthehigh-frequencyclones
contract,whereasothercloneswithdifferentsomaticmutationprofilesundergopositive
selection(dashedline).
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
SUPPLEMENTARYMATERIALS
SupplementaryTable1.ListofALLhotspotmutationlocationsintheerror-corrected
sequencingcapturepanel.
SupplementaryTable2.ListofrecurrentmutationsdetectedinpatientSJETV077.
SupplementaryTable3.StatisticsfromthesimulationofthedevelopmentofETV6-RUNX1ALL.
SupplementaryFigure1.Saturationofsequencingcoverageatincreasingdepth.(A)Exome
sequencingofbulksamplesreachedasaturatingcoveragebreadthof94%at40millionreads.
(B)Singlecellsreachedasaturatingcoveragebreadthof82%at60millionreads.
SupplementaryFigure2.Distributionofduplicatenumbersinerror-correctedsequencing.
Uniquemolecularidentifierfamilysizedistributionsfor(A)germlineand(B)leukemiasamples.
SupplementaryFigure3.EstimatingtherelativecontributionofAPOBECandreplication
mutagenesistoactivatingRASmutations.Thereissignificantvariabilityinthenumberof
activatingRASmutationsproducedbyeachsimulation.MostoftheactivatingRASmutationsin
thesimulationswereproducedbyAPOBEC(A)ratherthanbyreplication-mediated(B)
mutations.
A)
Bulk Mutation Calls Followed By
Single-Cell Interrogation
B)
Cumulative Number of Mutations in Each Clone Identifed
by Bulk Variant Phasing and Two Cell Exome Variant Calls
80
Number of Mutations
Mutations
Cells
Normal Cells
60
Bulk Phasing Shared
Bulk Phasing Clone-Specific
Two Cell Exome Shared
40
Two Cell Exome Clone-Specific
20
0
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
ne
o
Cl
C)
1
ne
o
Cl
2
3
ne
o
Cl
ne
o
Cl
4
ne
o
Cl
5
no
al
rm
Type of Base Changes in Early Shared Bulk
Mutations or Late Clone-Specific Mutations
Fraction of SNVs
1.00
Base Change
A->C:T->G
A->G:T->C
A->T:T->A
C->G:G->C
C->A:G->T
C->T:G->A
0.75
0.50
KRAS E63K
0.25
Exome Sequencing of 3 Cells
per Clone and 3 Normal Cells
Ea
La
te
rly
0.00
Identi cation of Clone-Speci c
“Driver” Mutations
D)
Bulk Shared
Intra-clonal
C to T Mutations
A
C
G
A
G
C
C
A
T
3′
weblogo.berkeley.edu
0
5′
T
G
A
7
7
6
T
T
6
C
5
T
4
A
A
G
C
A
G
4
2
A
3
T
0
5′
G
3
A
C
G
C
1
2
T
KRAS E63K
bits
C
1
1
KRAS G12S
2
5
bits
2
1
NRAS G12R
MLL3 G838S
Mutant Allele Frequency
Location and Variant Frequency of Six
Identified Ras Mutations in a Single Patient
1.000
KRAS
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
E63K
G12S
0.100
NRAS
G12R
0.010
G12D
0
Number of Ras Mutations in Each Sample
Detected by Error-Corrected Sequencing
8
6
KRAS
NRAS
4
2
D119N
G12G
0.001
B)
Number of Mutation
A)
0
100
200
300
400
T
500 566
E
SJ
Position along Coding Region (bp)
C)
Locations and Error-Corrected Frequencies of Ras Mutations in
7 Samples with One Mutation Identified by Standard Sequencing
T
E
SJ
7
7
0
V
T
E
SJ
8
7
0
V
T
E
SJ
3
8
0
V
T
E
SJ
2
9
0
V
8
SU
1
02
Allele Frequency of Ras Mutations Identified in SJETV075
1.000
KRAS A146V
KRAS D119N
NRAS G12D
0.100
0.5000
0.010
0.2500
0.1000
KRAS
NRAS
0.0100
0.0010
Allele Frequency
Mutant Allele Frequency
T
E
SJ
5
7
0
V
D)
1.0000
0.001
1.000
13
58
63
117
119
Amino Acid Position
146
156
KRAS G12S
KRAS G12R
0.001
1.000
0.001
Bulk
Bulk Barcode
Limiting Dilution
0.010
0.010
12
KRAS G12D
0.100
0.100
0.0001
6
2
0
V
NRAS G12R
KRAS A146T
KRAS L19F
A)
B)
Comparison of Mutation Numbers at Increasing Resolution
300
Number of Mutations
*p=0.041
800
700
600
500
400
300
200
m
or
N
ke
m
ia
C
al
C
el
el
l
l
0
*p=0.002
200
150
*p=0.006
100
50
0
100
Le
u
Number of Mutation Calls
Comparison of Normal and Leukemia
Single Cell Mutation Call Numbers
k
l
Bu
k
l
Bu
d
Ph
e
as
o
Tw
l
l
e
C
le
S
g
in
l
l
e
C
B)
Clinical Presenta
Density Plot of Distribution of Timing of Differentiation Arrest, Acquisition of First RAS Mutation,
and Time of Diagnosis After Acquisition of an ETV6-RUNX1 Translocation
APOBEC Mutagenesis
Density
APOBEC Mutagenesis
A)
bioRxiv preprint first posted online Mar. 16, 2017; doi: http://dx.doi.org/10.1101/117614. The copyright holder for this preprint (which was not
peer-reviewed) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
Day First Cell Undergoes
Differentiation Arrest
0.04
Day First RAS Mutation
Acquired
Day of Diagnosis
(100 Billion Cells)
0.02
0.00
0
500
1000
1500
Days After ETV6-RUNX1 Translocation
D)
Number of Leukemic Clones with Increasing Number
of Leukemic Cells Per Clone For Each Simulation
4
0
2
4
6
2
Number of Clones (log10)
8
6
8
Change in Number of Leukemic Clones with Increasing
Number of Leukemic Cells Per Clone after 100 Simulations
Number of Clones (log10)
C)
0
0
1
3
5
7
9
Number of Cells Per Clone (log10)
0
1
3
5
7
9
Number of Cells Per Clone (log10)
Mutations
MDH1B p.V475D
KCNG4 p.T451M
SYT16 p.Q8X
ANK3 p.S2426F
KIAA1024 p.T63M
LACC1 p.E53K,LACC1
CTCF p.D247H,CTCF
PCDHB5 p.F763L
SDHA p.Y215C
IGFN1 p.T2132A
TRIM48 p.Y192H
H2AFV p.Q99R,H2AFV
PDS5B p.R473W
ADH1B p.R272Q
PIK3C2G p.Q118E
PANK3 p.I301F
CEL p.I488T
NBPF10 p.E285A
NBPF10 p.M286I
FAM186A p.H1367Q
HERC2 p.R1211C
HGC6.3 p.M72V
C17orf112 p.R84T
PRRC2A p.R2138Q,PRRC2A
GNB5 p.F27L
PARP4 p.L1168V
DCAF12 p.A379P
DPP4 p.D133N
PTPN11 p.S502L
ARHGAP15 p.F371L
ARHGAP15 p.F371S
HIVEP1 p.Q2002E
KIAA1551 p.Q1577E
CRTAP p.R114C
FAM194B p.E544K
ARHGAP5 p.E1423K,ARHGAP5
SPTA1 p.D455H
UBR1 p.E602K
KLHL24 p.F158L
APC p.S281C,APC
SVIL p.D17H,SVIL
BRCA1 p.S220X,BRCA1
FHOD1 p.E234K
MAN1B1 p.E227K
ISCU p.A126V,ISCU
BRAT1 p.Q333E
SPEG p.E717K
SRRT p.E287Q,SRRT
NRN1 p.I62M
TIGD7 p.S324X
SEC61A1 p.I249M
TTC24 p.Q48X
SPTA1 p.R494T
TSPAN19 p.S127F
TIGD7 p.L282V
SELP p.S557L
MBD6 p.R360C
SLC4A2 p.E766Q,SLC4A2
PRDM9 p.S814R
CR1 p.R1744X,CR1
ZC3H11A p.S805X
RGPD1,RGPD2 p.D1365Y,RGPD2
PPFIBP1 p.L57F,PPFIBP1
KRAS p.A146T,KRAS
MGAT4C p.R199C
ABCB6 p.E220X
SUPT6H p.D1716N
TP53 p.G66C,TP53
TP53 p.N107D,TP53
CLIP1 p.E434Q,CLIP1
RAP1GDS1 p.Q397H,RAP1GDS1
FOLH1 p.R281H,FOLH1
RGPD3 p.R816C
Drug
Treatment
(Dose)
Prednisolone 1
Daunorubicin 1
Daunorubicin 2
Prednisolone 4
Prednisolone 3
Prednisolone 2
Mercaptopurine 4
Mercaptopurine 3
Vincristine 4
Vincristine 2
Mercaptopurine 2
Vincristine 3
Vincristine 1
Asparaginase 4
Asparaginase 2
Asparaginase 3
Mercaptopurine 1
DMSO
No Drug or Vehicle
Asparaginase 1
Diagnosis
12 4
Mutation Clusters
5
79
Allele
Frequency
(%)
40
30
20
10
5
Clonal
Expansion
Rate
Low
Medium
High
Disease Initiation
Diagnosis
Treatment