Genetics: Early Online, published on October 11, 2016 as 10.1534/genetics.116.188268

Positive selection in rapidly evolving plastid-nuclear enzyme complexes

Kate Rockenbach*, Justin C. Havird*, J. Grey Monroe†, Deborah A. Triant‡, Douglas R. Taylor§, Daniel B. Sloan*

* Department of Biology, Colorado State University, Fort Collins, CO 80523
† Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO 80523
‡ Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
§ Department of Biology, University of Virginia, Charlottesville, VA 22904

Copyright 2016.

Running Title: Selection on Plastid-Nuclear Complexes
Keywords: chloroplast, cytonuclear interactions, McDonald-Kreitman test, plastome
Author for Correspondence: Daniel B. Sloan: [email protected]
Colorado State University
1878 Campus Delivery
Fort Collins, CO 80523
970.491.2256

ABSTRACT

Rates of sequence evolution in plastid genomes are generally low, but numerous angiosperm lineages exhibit accelerated evolutionary rates in similar subsets of plastid genes. These genes include clpP1 and accD, which encode components of the caseinolytic protease (CLP) and acetyl-CoA carboxylase (ACCase) complexes, respectively. Whether these extreme and repeated accelerations in rates of plastid genome evolution result from adaptive change in proteins (i.e., positive selection) or simply a loss of functional constraint (i.e., relaxed purifying selection) is a source of ongoing controversy. To address this, we have taken advantage of the multiple independent accelerations that have occurred within the genus Silene (Caryophyllaceae) by examining phylogenetic and population genetic variation in the nuclear genes that encode subunits of the CLP and ACCase complexes. We found that, in species with accelerated plastid genome evolution, the nuclear-encoded subunits in the CLP and ACCase complexes are also evolving rapidly, especially those involved in direct physical interactions with plastid-encoded proteins. A massive
excess of nonsynonymous substitutions between species relative to levels of intraspecific polymorphism indicated a history of strong positive selection (particularly in CLP genes). Interestingly, however, some species are likely undergoing loss of the native (heteromeric) plastid ACCase and putative functional replacement by a duplicated cytosolic (homomeric) ACCase. Overall, the patterns of molecular evolution in these plastid-nuclear complexes are unusual for anciently conserved enzymes. They instead resemble cases of antagonistic co-evolution between pathogens and host immune genes. We discuss a possible role of plastid-nuclear conflict as a novel cause of accelerated evolution.

INTRODUCTION

Plastids carry reduced genomes that reflect an evolutionary history of extensive gene loss and transfer to the nucleus since their ancient endosymbiotic origin roughly one billion years ago (Timmis et al. 2004; Keeling 2010; Gray and Archibald 2012). Many of the proteins encoded by genes that have been transferred to the nuclear genome are trafficked back into the plastid (Gould et al. 2008), where they interact closely with proteins encoded by genes remaining in the plastid genome. These interacting proteins are key not only to photosynthesis, but also to transcription, translation and critical non-photosynthetic metabolic functions of the plastid. The interactions between these gene products create the opportunity for co-evolution between plastid and nuclear genomes. Thus, studying the nuclear genes that contribute to plastid complexes is a valuable tool for understanding the processes underlying plastid genome evolution and cytonuclear co-evolution.
Within angiosperms, most plastid genomes are highly conserved in sequence and structure (Jansen et al. 2007; Wicke et al. 2011), but multiple independent lineages have experienced accelerated rates of amino acid substitution in similar subsets of non-photosynthetic genes (Jansen et al. 2007; Erixon and Oxelman 2008; Greiner et al. 2008b; Guisinger et al. 2008, 2010, 2011; Straub et al. 2011; Sloan et al. 2012a, 2014a; Barnard-Kubow et al. 2014; Weng et al. 2014; Dugas et al. 2015; Williams et al. 2015; Zhang et al. 2016). Several mechanisms have been hypothesized to explain these repeated accelerations including positive selection, reduced effective population size (Ne), altered DNA repair, changes in gene expression, and pseudogenization following gene transfer to the nucleus (see above citations). Distinguishing among these hypotheses has proved challenging, and the ultimate cause or causes of the extreme differences in rates of molecular evolution among genes within plastid genomes remain unclear.

In many cases of extreme plastid genome evolution, accelerations have disproportionately affected nonsynonymous sites, resulting in elevated ratios of nonsynonymous to synonymous substitution rates (dN/dS) (e.g., Erixon and Oxelman 2008; Guisinger et al. 2008; Barnard-Kubow et al.
2014; Sloan et al. 2014a), which indicates that changes in selection are likely involved. In addition, recent studies showed correlated increases in dN/dS between nuclear- and plastid-encoded subunits in ribosomal (Sloan et al. 2014b; Weng et al. 2016) and RNA polymerase complexes (Zhang et al. 2015), providing further evidence for changes in selection pressures. However, these studies could not confidently distinguish between two alternative explanations for increased dN/dS: positive selection and relaxed purifying selection, which can be difficult to disentangle based on sequence divergence data alone. Because these selection pressures can have very different effects on population genetic variation, analyses that combine data on intraspecific polymorphism and interspecific divergence (McDonald and Kreitman 1991) can detect positive selection even in cases where it is not readily identifiable based only on dN/dS (Rausher et al. 2008). However, most studies of accelerated plastid genome evolution and plastid-nuclear co-evolution have not included the necessary intraspecific polymorphism data to perform these analyses.

In contrast to recent analyses of plastid genetic machinery (i.e., ribosomal and RNA polymerase genes; Sloan et al. 2014b; Zhang et al. 2015; Weng et al. 2016), the potential for molecular co-evolution involving nuclear-encoded subunits in other plastid complexes remains largely unexplored. Two such complexes are the caseinolytic protease (CLP), which is an ATP-dependent protease required for proper plastid function (Nishimura and van Wijk 2015), and the heteromeric acetyl-CoA carboxylase (ACCase), which is involved in fatty acid biosynthesis (Sasaki and Nagano 2004; Salie and Thelen 2016). The CLP complex and ACCase each contain a single plastid-encoded subunit (ClpP1 and AccD, respectively) and multiple subunits of nuclear origin. In most angiosperms, the sequences of the clpP1 and accD genes are generally conserved, but they are among the plastid-encoded genes that exhibit elevated rates of sequence evolution in multiple independent lineages. The clpP1 gene, in particular, exhibits recent and dramatically increased rates of nonsynonymous substitutions and indels (e.g., Erixon and Oxelman 2008; Sloan et al. 2014a).
In addition to phylogenetic and population genetic analyses, examining patterns of amino acid substitutions relative to protein structure can help distinguish between relaxed and positive selection. In the model angiosperm Arabidopsis thaliana, the CLP complex is made up of two stacked heptameric rings, comprising nine different types of paralogous and structurally related subunits that are derived from the single subunit found in the ancestral homotetradecameric form of this enzyme (Peltier et al. 2004; Yu and Houry 2007; Olinares et al. 2011). The P-ring is formed entirely of the nuclear-encoded subunits CLPP3, 4, 5, 6 in a 1:2:3:1 stoichiometric ratio, and the R-ring contains the plastid-encoded subunit ClpP1 and the nuclear-encoded subunits CLPR1, 2, 3, 4 in a 3:1:1:1:1 ratio. The CLPP subunits all contain a conserved catalytic Ser-His-Asp triad, which is lacking from the CLPR subunits (Peltier et al. 2004), meaning that ClpP1 is the only catalytic subunit within the R-ring. Other nuclear-encoded subunits such as CLPC, CLPD, CLPF, CLPS, CLPT1, and CLPT2 are physically associated with the core CLP complex and act as adapters, chaperones, and accessory proteins, helping to regulate the proteolytic activity of CLP (Peltier et al. 2004; Nishimura and van Wijk 2015; Nishimura et al. 2015).
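The ring composition described above for Arabidopsis can be written down as a small data structure, which makes it easy to check that each stoichiometry sums to a heptamer and to count catalytic copies per ring. This is only an illustrative sketch: the dictionary names and helper functions are hypothetical, and the rule that subunits named CLPP/ClpP carry the Ser-His-Asp triad simply restates the description in the text.

```python
# Subunit copy numbers per ring, as described for Arabidopsis thaliana in the text.
P_RING = {"CLPP3": 1, "CLPP4": 2, "CLPP5": 3, "CLPP6": 1}
R_RING = {"ClpP1": 3, "CLPR1": 1, "CLPR2": 1, "CLPR3": 1, "CLPR4": 1}


def ring_size(ring):
    """Total number of subunit copies in one ring."""
    return sum(ring.values())


def catalytic_copies(ring):
    """Count copies of subunits that retain the catalytic triad.

    Assumption for this sketch: names starting with "CLPP" (case-insensitive)
    are catalytic, whereas CLPR subunits are not.
    """
    return sum(n for name, n in ring.items() if name.upper().startswith("CLPP"))


for label, ring in (("P-ring", P_RING), ("R-ring", R_RING)):
    print(label, "size:", ring_size(ring), "catalytic copies:", catalytic_copies(ring))
```

Both rings sum to seven copies, and in the R-ring the only catalytic copies are the three ClpP1 subunits, consistent with the description above.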
Most flowering plants contain two different types of ACCase enzymes: a eukaryotic-like homomeric multidomain ACCase in the cytosol and a bacterial-like heteromeric ACCase within the plastids. The heteromeric ACCase consists of proteins encoded by four different genes (ACCA, B, C, D; Sasaki and Nagano 2004; Salie and Thelen 2016). ACCC is a biotin carboxylase. In an ATP-dependent reaction, it carboxylates a biotin molecule attached to ACCB (a biotin carboxyl carrier protein), with bicarbonate serving as the donor of the carboxyl group (White et al. 2005). The nuclear-encoded ACCA and plastid-encoded AccD closely interact with each other and represent the α and β carboxyltransferase subunits, respectively. Each of these subunits first homodimerizes, and then they combine as a hetero-tetramer, forming the functional enzyme that transfers the carboxyl group from biotin to acetyl-CoA (Cronan and Waldrop 2002). Together, these enzymes convert acetyl-CoA to malonyl-CoA within the plastid, which is the first step in the fatty acid biosynthesis pathway (White et al. 2005). In some lineages, the homomeric ACCase has undergone a duplication, and one copy is targeted to the plastid, while the other remains in the cytosol (Konishi and Sasaki 1994; Schulte et al. 1997; Babiychuk et al. 2011; Parker et al. 2014).
The angiosperm tribe Sileneae (Caryophyllaceae) has emerged as a model for studying organelle genomes under divergent rates of sequence evolution (Mower et al. 2007; Erixon and Oxelman 2008; Sloan et al. 2012a, 2012b, 2014a). This group contains multiple lineages with phylogenetically independent accelerations in rates of plastid genome evolution (Sloan et al. 2012a, 2014a). In contrast, closely related Sileneae lineages have largely maintained low ancestral rates of evolution. This rate variation among closely related species presents a powerful contrast to analyze the evolutionary mechanisms responsible for accelerated plastid genome evolution and test for correlated changes in nuclear-encoded counterparts. Here, we use transcriptome sequencing data coupled with structural information to identify variation in nuclear genes both within and among Sileneae species with highly divergent rates of plastid genome evolution. Specifically, we asked if there is evidence of selection on the sequences of nuclear-encoded subunits of the CLP and ACCase complexes in Silene species whose plastid-encoded counterparts have experienced recent accelerations in rates of evolution.

METHODS

Taxon Sampling, mRNA-seq, and Transcriptome Assembly

Silene conica, S. noctiflora, and S. paradoxa were all previously identified as having highly accelerated rates of nonsynonymous substitutions in a subset of plastid genes, with the most dramatic effects observed in clpP1 (Sloan et al. 2012a, 2014a). The accD gene also exhibited increased nonsynonymous substitution rates (albeit much less pronounced) as well as the accumulation of large indels in these species. In contrast, there was little or no evidence of accelerated sequence evolution in photosynthetic genes in these species. Silene latifolia, S. vulgaris, and Agrostemma githago were chosen as representatives of closely related lineages that have maintained low rates of evolution throughout their entire plastid genomes (Sloan et al. 2012a, 2014a). Transcriptomes for these six species (S. conica, S.
latifolia, S. noctiflora, S. paradoxa, S. vulgaris, and A. githago) were taken from previously described datasets (Sloan et al. 2014b) that were each generated from a single individual and assembled with Trinity r20120608 (Grabherr et al. 2011). These datasets were used for all phylogenetic analyses and correspond to NCBI Sequence Read Archive (SRA) accessions SRX353031, SRX353047, SRX353048, SRX353049, SRX353050, and SRX352988. For two genes (CLPC and CLPP6), some sequences were extracted from separate SOAPdenovo-Trans v1.02 (Xie et al. 2014) assemblies of the same reads because the Trinity assemblies were fragmented or complex.

Seeds from 19 geographically dispersed S. conica collections, including ABR which was used for the original S. conica transcriptome referenced above, and one collection of the close relative S. macrodonta (Table S1) were germinated on soil in either July or August 2014 (Fafard 2SV mix supplemented with vermiculite and perlite) and grown under a 16-hr/8-hr light/dark cycle with regular watering and fertilizer treatments in greenhouse facilities at Colorado State University. Plants were grown for 7-9 weeks, and total RNA was extracted from 2-3 leaves of a single individual from each collection using an RNeasy Plant Mini Kit (Qiagen). Rosette leaves were used for all individuals with the exception of the ARZ and PDA samples for which cauline leaves were used. The resulting RNA was sent to the Yale Center for Genome Analysis for Illumina mRNA-seq library preparation and sequencing. For all but two samples, polyA selection was used during library construction, while for the ABR sample of S.
conica and the single S. macrodonta sample, mRNA selection was performed using a Ribo-Zero Plant Leaf rRNA Removal Kit (Illumina) in an effort to capture more organellar transcripts (as part of an unrelated project). Resulting strand-specific Illumina libraries were sequenced on two lanes of an Illumina HiSeq 2500 to generate paired-end 151 bp (2×151) reads. Raw (i.e., non-normalized and untrimmed) reads were then assembled using Trinity r20140717 with default parameters (note that the strand-specificity of the reads was not taken into account during assembly). Transcriptome assembly statistics and numbers of reads were similar among the 20 samples, except for an approximately 50% reduction in the average and total length of assembled transcripts for samples where Ribo-Zero was used (Table S2).

Extraction and Alignment of Orthologous Sequences from Sileneae Species

The focus of our study was the nuclear-encoded components of the plastid CLP and ACCase complexes (Table S3). In addition, sets of gene sequences were obtained from photosystem I (PSI) and the mitochondrial-targeted CLP protease (mtCLP) to serve as a basis for comparison. PSI was selected because it contains subunits from both the nuclear and plastid genomes but, unlike CLP and ACCase, the plastid-encoded subunits have been highly conserved even in Silene species with accelerated rates of evolution in other plastid genes (Sloan et al. 2012a, 2014a). The mtCLP complex was chosen because it is homologous to the plastid CLP but consists entirely of nuclear-encoded subunits and is targeted to a different cellular compartment (the mitochondria). The mtCLP complex has a homotetradecamer core consisting entirely of CLPP2 subunits that interacts with the chaperones CLPX1, 2, 3 (van Wijk 2015).
Additionally, 50 genes with a minimum coding sequence length of 600 bp were selected at random from a published list of single-copy nuclear genes in angiosperms in order to test for global increases in evolutionary rates within the nuclear genome (Table S4; Duarte et al. 2010). Genes that were annotated as being targeted to the mitochondria or plastids were excluded from this random set. The Arabidopsis thaliana sequences for selected genes were obtained through the TAIR database (https://www.arabidopsis.org/) with accession numbers from the literature (Tables S3 and S4; Peltier et al. 2004; Duarte et al. 2010; Olinares et al. 2011; van Wijk 2015).

BLAST+ 2.2.31 (Camacho et al. 2009) was used to run tblastn searches (default settings) with the selected Arabidopsis thaliana amino acid sequences as queries against each of the assembled Sileneae transcriptomes. The top hit in each transcriptome was retrieved with a custom Perl script using BioPerl modules (Stajich et al. 2002). In cases where there were multiple paralogous genes, manual curation aided by exploratory tree-building was performed to identify orthologs. Genes that were absent from the transcriptomes or for which orthologs could not be confidently identified were excluded from further analysis. For the random set, genes were excluded and replaced with another randomly selected gene if one or more species lacked orthologous sequence or had a partially assembled transcript that was less than two-thirds the length of the coding sequence. Extracted sequences were aligned by nucleotide using the MUSCLE algorithm embedded within MEGA v6.0 (Tamura et al. 2013). The longest open reading frame (ORF) in the Agrostemma githago sequence was predicted with the NCBI ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html), and other sequences were trimmed accordingly. Sequences were then realigned by codon using MUSCLE. TargetP v1.1 (Emanuelsson et al. 2000) was used with translated ORFs from A. githago to predict the length of the N-terminal signal peptide (for plastid- and mitochondrial-targeted proteins), which was then removed from all sequences. In a few cases, TargetP was unable to predict a targeting peptide based on the A. githago sequence, so S. latifolia, S.
vulgaris, or A. thaliana was used instead to identify and remove signal peptides. Concatenated sequences for sets of genes in each complex were generated from final alignments with a custom Perl script using BioPerl modules.

Phylogenetic Analysis of Rates of Sequence Evolution

For each gene individually and for the concatenated sets of nuclear-encoded genes of each complex, we conducted multiple analyses of rates of synonymous and nonsynonymous substitutions by using the codeml program within PAML version 4.8 (Yang 2007). An F1×4 codon frequency model was applied in each analysis, and a constrained tree topology was used with the species in Silene subgenus Behenantha (S. conica, S. latifolia, S. noctiflora, and S. vulgaris) collapsed as a polytomy. First, we implemented a free branch model (model = 1 in PAML) to estimate dN/dS for each branch independently. Branches that were identified as having dN/dS values greater than one were then tested for significance using a likelihood ratio test (LRT) that compared the free branch model to a model that constrained dN/dS for the individual branch in question to a value of one. Finally, we classified species/branches into two groups – “fast” and “slow” – based on known rates of plastid genome evolution (Sloan et al. 2012a, 2014a) and estimated separate dN/dS values for each group (model = 2 in PAML). The terminal branches for Silene conica, S. noctiflora and S. paradoxa were assigned to the fast group, while those for S. latifolia, S. vulgaris, and A. githago were assigned to the slow group. The internal branch connecting the common ancestor of Silene to the base of Silene subgenus Behenantha was also included in the slow group. As above, we used an LRT to test for significance in cases in which individual genes or concatenated sequences for entire complexes had an estimated dN/dS value greater than one for the fast group (no such cases were identified for the slow group). For the constrained model in these LRT comparisons, the dN/dS value for the fast group was set to one.
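The likelihood ratio tests described above compare an unconstrained model with one in which a single dN/dS parameter is fixed at one. A minimal sketch of that comparison is given below, assuming that the two log-likelihoods have already been read from codeml output and that the test has one degree of freedom (one parameter is fixed); the degrees of freedom are an assumption here, as the text does not state them. The function name and the example log-likelihood values are hypothetical and are not results from the study.

```python
import math


def lrt_one_df(lnl_free, lnl_constrained):
    """Likelihood ratio test with one degree of freedom.

    lnl_free: log-likelihood of the model in which dN/dS is estimated.
    lnl_constrained: log-likelihood of the model with dN/dS fixed at one.
    Returns (test statistic, p-value).
    """
    # Test statistic: twice the difference in log-likelihoods (floored at zero).
    stat = max(0.0, 2.0 * (lnl_free - lnl_constrained))
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for illustration only.
stat, p = lrt_one_df(lnl_free=-2345.10, lnl_constrained=-2348.35)
print(f"2*delta(lnL) = {stat:.2f}, p = {p:.4f}")
```

Using the closed-form survival function keeps the sketch free of third-party dependencies; for tests with more than one degree of freedom a statistics library would be needed.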
McDonald-Kreitman Tests

McDonald-Kreitman (MK) tests (McDonald and Kreitman 1991) were performed using sequences from the Silene conica population genetic dataset. Sequences were extracted, aligned and trimmed following the same methodology described above for the phylogenetic analysis. The tests were implemented with the web server described by Egea et al. (2008). For each gene, the neutrality index (NI) was calculated by dividing the ratio of nonsynonymous to synonymous polymorphisms within S. conica (Pn/Ps) by the ratio of nonsynonymous to synonymous divergence (Dn/Ds) from a closely related outgroup species (see below) (Rand and Kann 1996). NI values less than one are indicative of positive selection, with statistical significance assessed by a standard contingency-table χ² analysis. We also calculated the direction of selection (DoS) for each gene (Stoletzki and Eyre-Walker 2011). Positive DoS values are indicative of positive selection and an excess of nonsynonymous substitutions. We looked for evidence of selection in sets of related genes by summing polymorphism and divergence counts for genes belonging to the plastid CLP, ACCase, PSI, or mtCLP complexes as well as for the set of random genes. Because summing across contingency tables can introduce statistical bias, we also calculated NI_TG for each combined set of related genes, which is an unbiased estimator of NI (Stoletzki and Eyre-Walker 2011). Two separate sets of analyses were carried out, using either S. latifolia or S. macrodonta as the outgroup. Extracted sequences from the S. conica and S. macrodonta transcriptome assemblies that could not be confidently identified as orthologous were removed from the analysis. Specifically, CLPR2, CLPX1, and one randomly selected gene (AXS2: AT1G08200) showed evidence of recent duplications in the S. conica lineage, leading to apparent chimeric assembly artifacts. Therefore, these genes were not used for MK tests. In addition, all three CLPX genes and 32 of the randomly selected nuclear genes had low coverage and fragmented assemblies in the S. macrodonta dataset, so MK tests for these genes were only performed with S. latifolia as an outgroup. The low coverage for many nuclear genes in the S. macrodonta assembly
was likely related to the use of Ribo-Zero in construction of that library (Table S2).

Analysis of Protein Structure and Position of Substitutions

To gain insight into the functional consequences of amino acid changes observed in fast-evolving Silene species, we mapped substitutions onto plastid CLP and ACCase protein structures. Ancestral Silene sequences were inferred using codeml in PAML with the guide tree containing the five Silene species encoded as a polytomy, with Agrostemma githago and Escherichia coli as outgroups. Partial sequences were excluded when inferring ancestral sequences. For each CLPP and CLPR subunit (including the plastid-encoded ClpP1), changes that were inferred to have occurred in S. conica, S. paradoxa, or S. noctiflora from the ancestral Silene sequence were mapped onto the structure of an individual E. coli CLPP subunit (PDB accession 1YG6; Bewley et al. 2006; Yu and Houry 2007). Likewise, changes in ACCase subunits were also mapped onto solved E. coli structures (PDB accessions 4HR7 and 2F9Y: Bilder et al. 2006; Broussard et al. 2013). Template structures from E. coli were used because no plant CLP or ACCase structures have been solved. While it is likely there have been structural changes within these complexes between bacteria and plants, most of these subunits are anciently conserved and can be reliably aligned (average amino acid identity = 43.3%).

Data Availability

Raw Illumina reads and assembled transcriptome sequences are available via the NCBI SRA and Transcriptome Shotgun Assembly (TSA) database, respectively. Accession numbers are provided in Table S2. Sequence alignments used in PAML, MK, and phylogenetic analyses are provided in Supplementary File S1.
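As an illustration of the McDonald-Kreitman summary statistics defined in the Methods (NI, DoS, and NI_TG), the following sketch computes them from counts of nonsynonymous and synonymous polymorphisms (Pn, Ps) and divergence (Dn, Ds). The analyses in this study were run on the web server of Egea et al. (2008), so this is not the implementation that was used; the function names and example counts are hypothetical, and the sketch does not handle genes with zero counts, which would cause division by zero.

```python
def neutrality_index(pn, ps, dn, ds):
    """NI = (Pn/Ps) / (Dn/Ds); values below one suggest positive selection."""
    return (pn / ps) / (dn / ds)


def direction_of_selection(pn, ps, dn, ds):
    """DoS = Dn/(Dn+Ds) - Pn/(Pn+Ps); positive values suggest positive selection."""
    return dn / (dn + ds) - pn / (pn + ps)


def ni_tg(tables):
    """NI_TG across genes (Stoletzki and Eyre-Walker 2011).

    tables: iterable of (Pn, Ps, Dn, Ds) tuples, one per gene.
    NI_TG = sum(Ds*Pn/(Ps+Ds)) / sum(Ps*Dn/(Ps+Ds)).
    """
    tables = list(tables)
    numerator = sum(ds * pn / (ps + ds) for pn, ps, dn, ds in tables)
    denominator = sum(ps * dn / (ps + ds) for pn, ps, dn, ds in tables)
    return numerator / denominator


# Hypothetical counts (Pn, Ps, Dn, Ds) for two genes; not data from the study.
genes = [(2, 10, 30, 15), (1, 8, 12, 9)]
for counts in genes:
    print(neutrality_index(*counts), direction_of_selection(*counts))
print(ni_tg(genes))
```

Weighting each gene by 1/(Ps + Ds) before summing is what distinguishes NI_TG from an NI calculated on simply pooled counts.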
RESULTS

CLP and ACCase Gene Content in the Tribe Sileneae

We were able to recover most of the expected genes from the Sileneae transcriptomes, as the gene content in these species was largely similar to that of Arabidopsis thaliana. However, we did find that some genes had experienced recent duplications or losses. We identified orthologs of all eight of the nuclear-encoded CLPP and CLPR genes that are targeted to the plastid in Arabidopsis (Figure S1), and we found CLPP5 has been duplicated (with the resulting copies designated as CLPP5a and CLPP5b). The duplication appears to have occurred prior to the divergence between Agrostemma and Silene, but only one of these copies (CLPP5b) was recovered from the Agrostemma transcriptome (Figure S1). As described in the Methods, there may also have been more recent duplications of genes such as CLPR2 in individual species. In addition to the subunits that make up the core proteolytic rings of the plastid CLP complex, we also identified orthologs of the associated chaperones, adapters, and accessory proteins that have been described in Arabidopsis (Table 1; Nishimura and van Wijk 2015). The newly discovered CLPF adapter was also identified in our dataset but was not included in the present analysis (Nishimura et al. 2015). Arabidopsis contains three paralogous chaperone genes that contribute to the plastid CLP complex (CLPC1, CLPC2, and CLPD). We found evidence that multiple copies of CLPC/D also exist in the Sileneae, but the assemblies of these long genes were often fragmented, and we were only able to successfully extract one set of orthologs, which we refer to as CLPC. In addition to these plastid-targeted subunits, we also found orthologs of the mitochondrial-targeted CLP genes that have been identified in Arabidopsis (CLPP2, CLPX1, CLPX2, and CLPX3; van Wijk 2015).

Sileneae genes were successfully identified from each of the three classes of nuclear-encoded subunits of the heteromeric plastid ACCase (ACCA, ACCB, and ACCC), including two divergent copies of ACCB. Two copies of this gene also exist in Arabidopsis (Fukuda et al. 2013), but it was not readily apparent from phylogenetic analysis if there is an orthologous relationship between Sileneae and Arabidopsis copies or if they are the product of independent duplication events (data not shown).
Although all of the heteromeric ACCase genes were identified in this clade, we found evidence of recent gene loss in some of the Silene species, which is described in detail below (see “Gene Loss and Accelerated Evolution of Some Subunits in the ACCase Complex”).

With respect to the homomeric ACCase that is typically targeted to the cytosol, our transcriptome data indicated that Sileneae species express two distinct copies of this gene (Figures S2 and S3) and that one of the resulting proteins has an N-terminal extension that is strongly predicted to act as a plastid-targeting peptide (with a specificity > 0.95 based on TargetP analysis). Duplication of the homomeric ACCase and re-targeting to the plastids has occurred repeatedly and independently during angiosperm evolution (Figure S3). The observed duplication in our dataset preceded the divergence between Agrostemma and Silene, but it was independent from similar duplications in grasses (Konishi and Sasaki 1994) and the Brassicaceae (Schulte et al. 1997; Babiychuk et al. 2011; Parker et al. 2014).

Nuclear-Encoded Components of the Plastid CLP Complex Show Elevated Rates of Amino Acid Substitution in Species with Rapidly Evolving Plastid Genomes

We found that nuclear-encoded CLP genes have dramatically elevated dN/dS values in Silene species with recent accelerations in the evolutionary rates of plastid-encoded clpP1 (Tables 1 and 2). When concatenated, all 13 nuclear-encoded CLP genes had a dN/dS value significantly greater than one for both S. conica and S. noctiflora and nearly equal to one for S. paradoxa (Table 1). In contrast, concatenated CLP genes had dN/dS values between 0.05 and 0.16 for closely related species with typical rates of clpP1 evolution (Table 1). The extreme variance in dN/dS estimates resulted from elevated nonsynonymous substitution rates, whereas synonymous substitution rates were very similar across species (Figure 1).
Rate differences were most pronounced for CLPR subunits, which occupy the same structural ring as the plastid-encoded ClpP1 subunit (van Wijk 2015). All 12 dN/dS estimates for CLPR genes within the “fast” species were greater than one, with eight found to be significantly greater than one (Table 1). The dN/dS estimates for the CLPP genes within these fast species were also highly elevated, but only eight of the 15 were greater than one, and only one was significantly so (Table 1). The adapter gene CLPS was a noticeable outlier, being generally conserved in species regardless of their rates of plastid genome evolution (Figure 2).

Gene Loss and Accelerated Evolution of Some Subunits in the ACCase Complex

The three species with high rates of plastid-encoded accD evolution exhibited varied patterns with respect to nuclear-encoded ACCase genes, including some cases of accelerated evolution and other examples of outright gene loss. Notably, none of the nuclear-encoded ACCase genes were identified in the assembled S. noctiflora transcriptome. We confirmed the loss of these genes by searching a draft assembly of the S. noctiflora nuclear genome (DBS, unpublished data). The nuclear genome assembly contained only pseudogenized fragments of ACCA, and none of the other ACCase genes were detected. The S. paradoxa transcriptome also appeared to lack a full complement of functional nuclear-encoded ACCase genes. Most of the species contained two ACCB paralogs, but we did not detect a copy of ACCB1 in the S. paradoxa transcriptome. In addition, the assembly of the S. paradoxa ACCA transcript was incomplete, covering only 543 nt of the 2133-nt alignment and exhibiting a dN/dS of 0.99 (Table 1). This partial ACCA transcript was aberrantly spliced, resulting in a 9-nt insertion that introduced a premature in-frame stop codon (Figure S4). Therefore, despite being transcribed, ACCA is likely a pseudogene in S. paradoxa. In contrast, ACCB2 and ACCC were both intact with very low dN/dS values in S. paradoxa. Finally, unlike in S. noctiflora and S. paradoxa, all four nuclear-encoded ACCase genes were present and intact in the transcriptomes of S. conica and all three “slow” species.
The ACCA subunit, which interacts directly with the plastid-encoded AccD subunit, was highly divergent in amino acid sequence in S. conica. We initially estimated an elevated dN/dS value of 0.49 for ACCA in S. conica, and that value increased to 0.94 when we analyzed the full length of the gene by excluding the partial S. paradoxa sequence from the alignment (PAML was run with the “cleandata” option, which ignores alignment positions for which gaps are present in any of the sequences). There was a striking difference between this elevated dN/dS in S. conica and the very low values (≤0.06) for ACCA in the slow group species (Table 1). The S. conica ACCB1 gene also exhibited substantially higher dN/dS values than in any of the other species, whereas dN/dS was very low in ACCB2 and ACCC (Table 1). In general, rates of nonsynonymous substitution in the slow species were very low (dN/dS < 0.2) for ACCase genes, with the exception of ACCB2. Interestingly, the dN/dS values for this gene showed a converse pattern, in which dN/dS was elevated in the slow species relative to S. conica and S. paradoxa (Table 1).

Low Rates of Nonsynonymous Substitutions in mtCLP, PSI, and Randomly Selected Genes

For both the PSI and mtCLP concatenated gene sets, there was significantly higher dN/dS in the fast group (Table 2). However, dN/dS values for both PSI and mtCLP were generally low in all species, and the differences between the two species groups were very small, especially in comparison to the differences observed for the plastid CLP complex and some ACCase genes (Table 1; Figure S5). The lower overall dN/dS estimate for the slow species group appeared to be largely driven by the low values for the A. githago branch, which has a disproportionate effect on the estimation of dN/dS in this group because it represents more divergence time and a large fraction of total observed substitutions. The randomly selected nuclear genes showed no significant difference in dN/dS between the fast and slow species groups (Tables 1, 2 and S5). Thus, it does not appear that there is a global elevation of dN/dS in the nuclear genomes of species with rapidly evolving plastid genomes.
McDonald-Kreitman Tests Reveal Excess of Nonsynonymous Divergence Between Species

Despite high levels of observed nonsynonymous divergence in ACCA and the majority of the nuclear genes that encode components of the plastid CLP complex, most of the segregating variants within S. conica are synonymous. Thus, in ACCA and the concatenated set of nuclear-encoded CLP genes, there was a large and highly significant excess of nonsynonymous divergence from the outgroup S. latifolia relative to levels of nonsynonymous and synonymous polymorphism within S. conica (Table 3). The PSI genes had extremely low Dn/Ds values, but the Pn/Ps values were even lower, again resulting in a significant excess of nonsynonymous divergence for the concatenated gene set (Table 3). In contrast, there were no indications of a similar excess in the mtCLP genes or the set of randomly selected nuclear genes, as their concatenated sequences had NI values very close to one (Tables 3 and S6). Repeating the MK analysis with a more closely related outgroup (S. macrodonta) produced similar results (Tables S6 and S7).

Substitutions in Nuclear-Encoded Subunits Preferentially Occur at Interfaces with Plastid-Encoded Subunits within the CLP Complex but not within the ACCase Complex

To investigate the effect of physical interactions between CLP complex subunits on substitution patterns, we mapped observed changes in CLPP and CLPR subunits onto the solved structure of the representative ClpP subunit from E. coli (Figure 3). This analysis showed that numerous substitutions have occurred throughout the entirety of the plastid-encoded ClpP1 subunit in fast-evolving Silene species. These include substitutions in 1) the handle domain which physically interconnects the two heptameric rings and likely stabilizes ring-ring interactions, 2) the head domain which likely stabilizes interactions between subunits within a single ring, and 3) the axial loop regions which form the axial pores and mediate interactions with associated chaperones (Figure 3C) (Yu and Houry 2007). Several individual residues that have been implicated in ring stability or substrate interactions (Wang et al. 1997) were also observed to have undergone changes in ClpP1 (Figure 3).
Given the extreme levels of divergence in the plastid-encoded ClpP1 subunit, we reasoned that substitutions might be non-randomly distributed within interacting nuclear-encoded subunits. Specifically, we predicted that the nuclear-encoded CLPR subunits would have an abundance of changes in the head domain because the ClpP1 subunit assembles with nuclear-encoded CLPR subunits to form the R-ring (Nishimura and van Wijk 2015), and the head domain contains residues that are likely to maintain intra-ring interactions (Yu and Houry 2007). Similarly, we predicted that nuclear-encoded CLPP subunits would have a disproportionate number of substitutions in their handle domains because these subunits form the P-ring, and their handle domains are likely involved in interactions with the highly divergent copies of ClpP1 in the R-ring. As predicted, amino-acid substitutions were significantly overrepresented in the head domains of CLPR subunits and the handle domains of CLPP subunits in all three fast-evolving species (Table 4).

In most plants, both the nuclear- and plastid-encoded CLPP subunits (but not the CLPR subunits) retain the Ser-His-Asp triad that confers protease activity (Peltier et al. 2004). However, we found that in fast-evolving Silene species, many of these subunits have experienced substitutions at the His or Asp positions within this highly conserved triad (whereas the catalytic Ser is universally conserved in our dataset; Table S8).

Relative to the CLP complex, there were fewer changes in the ACCase subunits, and only a fraction of these could be analyzed in a structural context because several regions of ACCase proteins lack structural information (Broussard et al. 2013). In particular, there is a large N-terminal extension of AccD that is highly variable among angiosperms (Greiner et al. 2008a) and absent entirely from E. coli. Likewise, the C-terminal half of ACCA is also unique to plants. In contrast to the pattern observed in the CLP complex, the ACCase changes that could be mapped to the E. coli structure occurred away from protein-protein interfaces and generally did not involve functionally important residues (Figure 4). This was also true for a site at which large insertions are present in the AccD subunit in both S. conica and S. paradoxa (Figure 4B).
DISCUSSION

In numerous angiosperm lineages, a subset of plastid genes, including clpP1 and accD, display accelerated evolutionary rates, but the causes of this recurring phenomenon have remained unclear (Jansen et al. 2007; Erixon and Oxelman 2008; Greiner et al. 2008b; Guisinger et al. 2008, 2010, 2011; Straub et al. 2011; Sloan et al. 2012a, 2014a; Barnard-Kubow et al. 2014; Weng et al. 2014; Williams et al. 2015; Dugas et al. 2015; Blazier et al. 2016; Zhang et al. 2016). We investigated the nuclear genes that contribute to the multisubunit complexes that include ClpP1 and AccD and incorporated population genetic and structural data to distinguish between relaxed purifying selection and positive selection as drivers of elevated dN/dS values.

Our analysis revealed different patterns of selection on the nuclear-encoded CLP and ACCase genes, which may reflect the contrasting evolutionary histories of the plastid-encoded subunits in these two complexes. The patterns of clpP1 sequence divergence in some lineages are truly remarkable and include both structural changes (i.e., indels and loss of introns) and extreme increases in nonsynonymous substitution rates (e.g., Erixon and Oxelman 2008; Sloan et al. 2012a). For example, the amino-acid identity between the clpP1 copies in the fast-evolving species Silene conica and S. noctiflora is only 34%, and many Silene species have elevated dN/dS values (up to 5.9 in S. fruticosa; Erixon and Oxelman 2008). In contrast, slow-evolving species from the same genus such as S. latifolia retain up to 58% identity with free-living cyanobacteria, so these recent accelerations have led to far more divergence in the last few million years than has typically accumulated since the endosymbiotic origins of photosynthetic eukaryotes roughly one billion years ago. The contrasts between fast and slow lineages for accD are far less stark. The increased rates of amino acid substitution in fast lineages are only modest, and most of the sequence change is caused by indels (Sloan et al. 2012a, 2014a).
Furthermore, even in species with typical, slow-evolving plastomes, it is primarily the catalytic C-terminal domain of AccD that is highly conserved, whereas the N-terminal domain, which is plant-specific and has an unknown function, accumulates substantial structural divergence (Greiner et al. 2008a). The evolution of accD is further complicated in some angiosperm lineages by functional transfer to the nucleus (Magee et al. 2010; Rousseau-Gueutin et al. 2013) or by functional replacement with a duplicated and re-targeted copy of the homomeric ACCase (Konishi and Sasaki 1994). In contrast, there is no evidence to our knowledge of functional transfer of clpP1 to the nucleus in green plants.

The complex history of the plastid accD gene in angiosperms is mirrored by the varied evolutionary histories that we observed within Silene for the nuclear-encoded ACCase subunits. One lineage (S. noctiflora) has experienced the outright loss of the heteromeric ACCase complex, and another lineage (S. paradoxa) appears to be undergoing gene loss/pseudogenization with signatures of relaxed selection. However, in a third fast species (S. conica), all the ACCase subunits are retained and one gene shows clear evidence of positive selection. In contrast to this heterogeneity in the evolution of ACCase genes, we found a consistent signal of positive selection throughout nearly all the subunits in the plastid CLP complex in all three fast species. It was especially striking to find dN/dS values significantly greater than one when averaged over more than a dozen nuclear-encoded CLP genes.
Loss of Plastid Heteromeric ACCase

The finding that Silene noctiflora has completely lost the nuclear-encoded heteromeric ACCase genes is consistent with previous observations that the copy of the plastid-encoded accD may be a pseudogene in this species (Sloan et al. 2012a). Although the accD reading frame is intact in S. paradoxa, the gene is highly divergent and contains multiple large insertions, raising questions as to its functionality (Sloan et al. 2014a). These questions extend to the entire ACCase complex, as we found evidence of gene loss/decay in nuclear-encoded S. paradoxa ACCase genes (i.e., the apparent loss of ACCB1 and pseudogenization of ACCA). In at least two angiosperm lineages, the plastid accD gene has been transferred to the nucleus (Magee et al. 2010; Rousseau-Gueutin et al. 2013). However, we found no evidence of a functional nuclear copy of accD in any Silene species examined. Instead, the presence of a duplicated homomeric ACCase that is predicted to be targeted to the plastids may be compensating for the lost or altered function of the heteromeric ACCase, as shown in grasses (Konishi and Sasaki 1994). Because the duplication of the homomeric ACCase appears to have happened long before the divergence of Agrostemma and Silene (Figure S3), the heteromeric and duplicated homomeric ACCases have remained conserved and expressed for more than 20 million years (Sloan et al. 2009) in many lineages within this clade. This raises intriguing questions for future investigation about the respective roles of these enzymes and why functional loss of the heteromeric ACCase has occurred in some lineages, while others have retained all of the subunits and even exhibit evidence of strong positive selection in one case (see below).
Positive Selection Acting on Nuclear-Plastid Enzyme Complexes

We found signatures of intense positive selection acting on the plastid CLP complex. In many genes within the fast species, dN/dS is greater than one (often significantly so; Table 1), which is an especially powerful signature of positive selection because any effects are averaged across the entire length of each gene and likely dampened by purifying selection acting on many residues. Although an increase in the rate of nonsynonymous substitutions can also be indicative of reduced functional importance or even a pseudogene, we can reject that hypothesis based on the population genetic data. The vast majority of the CLP sequence polymorphism that is segregating within S. conica is synonymous (Table 3), meaning that these genes are still functionally constrained and that purifying selection is purging most new nonsynonymous mutations from the population. Instead, the large observed excess of nonsynonymous divergence between species (relative to intraspecific polymorphism) is an indication that a specific subset of amino acid substitutions have been preferentially driven to fixation due to positive selection (McDonald and Kreitman 1991).

Interestingly, some of the observed sequence changes affected residues in the catalytic triad in Silene CLPP subunits (Table S8). Substitutions in the catalytic triad of the plastid-encoded ClpP1 subunit in Acacia have previously been interpreted as evidence of pseudogenization (Williams et al. 2015). However, the strong selection acting on the plastid CLP complex in Silene suggests that both plastid- and nuclear-encoded CLPP subunits may retain an important functional role despite changes in the catalytic triad (Table S8). Although the Ser-His-Asp catalytic site is a widely conserved feature across the diversity of life, some of the same substitutions in this triad have been observed in other atypical serine proteases and functionally related enzymes (Schrag et al. 1991; Ekici et al. 2008; Zeiler et al. 2013). Notably, substitutions in the catalytic triad were only observed in one nuclear-encoded CLPP gene in each of the fast species. It is possible that such changes are tolerable as long as some of the subunits in the P-ring retain the canonical catalytic residues.
Given the evidence of gene loss and relaxed selection on the heteromeric ACCase in S. noctiflora and S. paradoxa, we initially suspected that the elevated dN/dS for ACCA in S. conica (0.94 for the full-length gene) was also due to relaxed selection. However, the population genetic data demonstrated that ACCA and the heteromeric ACCase complex are still functionally constrained in S. conica, as most of the intraspecific polymorphisms were synonymous (Table 3). Therefore, the unusually high rate of fixed ACCA amino acid substitutions in this lineage is most likely the result of positive selection and adaptive evolution.

We randomly selected a set of nuclear genes that are not targeted to the plastids or mitochondria as well as genes from the mtCLP and PSI complexes. These were chosen with the a priori expectation that they would be similar between the fast and slow species, because they either do not interact with organelle-encoded subunits (random genes and mtCLP), or they interact with plastid-encoded subunits that show typical, slow rates of evolution in all of the species (PSI). The mtCLP complex, which is composed solely of nuclear subunits, and the set of random nuclear genes largely supported this expectation. The concatenated mtCLP genes in fast species exhibited very similar (albeit slightly higher) dN/dS levels compared to slow species, and there was no significant difference for the random genes (Tables 1 and 2). Furthermore, the MK tests found no evidence of positive selection for either of these datasets (Tables 3, S6, and S7). In contrast, the nuclear PSI genes did exhibit a significant excess of nonsynonymous divergence (Tables 3 and S7) even though the absolute rate of amino acid substitutions was extremely low. This evidence suggests that, although rare, some of the nonsynonymous substitutions in PSI are adaptive changes that spread under positive selection rather than fixing by drift.
An alternate interpretation for the observed excess of nonsynonymous divergence between species is that there was an ancestral bottleneck, which could have led to an increased frequency of weakly deleterious alleles spreading to fixation because of the reduced efficiency of selection in small populations (Hughes 2007). However, we identified three lines of evidence that lead us to reject this possibility and conclude that our results are, in fact, indicative of strong positive selection. First, if a demographic bottleneck had occurred, we would expect an excess of nonsynonymous divergence across all genes, but we did not observe this in the mtCLP genes or the randomly chosen nuclear genes. Previous studies in Silene also support the conclusion that the observed changes are not the result of genome-wide demographic effects; analysis of 140 cytosolic ribosomal proteins and seven mitochondrial-targeted complex II genes in S. conica and S. noctiflora did not show elevated dN/dS relative to other Silene species (Sloan et al. 2014b; Havird et al. 2015). Second, we observed elevated levels of nonsynonymous divergence across different timescales in our MK tests (Tables 3 and S7) by using two different outgroups: S. latifolia (~5.7 Myr divergence time) and S. macrodonta (~1.8 Myr divergence time; Rautenberg et al. 2012). Therefore, separate bottlenecks at a minimum of two different historical points would have to have taken place. Third, the magnitude of the observed effects is inconsistent with a bottleneck. The relaxed selection associated with a reduced Ne should not increase dN/dS to values significantly above one, which were frequently observed in our dataset (Table 1). Thus, we conclude that, although it is possible that an ancestral bottleneck in S. conica might have contributed to some minor increases in amino acid substitution rates, the massive rate increases (e.g., in CLP genes) are more likely to have been driven by positive selection than by a temporary reduction in Ne.
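The neutrality index and direction-of-selection statistic behind the MK comparisons discussed above can be illustrated with a short Python sketch. It assumes the standard definitions NI = (Pn/Ps)/(Dn/Ds) and DoS = Dn/(Dn + Ds) - Pn/(Pn + Ps) and uses counts taken from Table 3. The function names are hypothetical, and the NITG estimator and the significance tests reported in Table 3 are not reproduced here.

```python
def neutrality_index(ps, pn, ds, dn):
    """NI = (Pn/Ps) / (Dn/Ds); returns None when a ratio is undefined."""
    if ps == 0 or ds == 0 or dn == 0:
        return None
    return (pn / ps) / (dn / ds)


def direction_of_selection(ps, pn, ds, dn):
    """DoS = Dn/(Dn + Ds) - Pn/(Pn + Ps); returns None without any variants."""
    if pn + ps == 0 or dn + ds == 0:
        return None
    return dn / (dn + ds) - pn / (pn + ps)


# (Ps, Pn, Ds, Dn) counts copied from Table 3
# (polymorphism within S. conica, divergence from S. latifolia).
MK_COUNTS = {
    "Plastid CLP (concatenated)": (54, 13, 260, 706),
    "ACCA": (13, 7, 42, 111),
    "Random (concatenated)": (368, 178, 1297, 570),
}

for name, counts in MK_COUNTS.items():
    ni = neutrality_index(*counts)
    dos = direction_of_selection(*counts)
    # NI < 1 and DoS > 0 point to an excess of nonsynonymous divergence.
    print(f"{name}: NI = {ni:.2f}, DoS = {dos:.2f}")
```

With these inputs the sketch gives NI and DoS values that round to those in Table 3: 0.09 and 0.54 for the concatenated plastid CLP genes, 0.20 and 0.38 for ACCA, and 1.10 and -0.02 for the random nuclear genes.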
Antagonistic Co-Evolution and Plastid-Nuclear Conflict

The accelerated amino acid substitution rates in both the nuclear- and plastid-encoded components of CLP and ACCase are very unusual for anciently conserved enzyme complexes, but are similar in many ways to the patterns that result from antagonistic co-evolution between pathogens and host immune genes (Hughes and Nei 1988; Borghans et al. 2004). Selfish interactions and “arms races” can occur within a cell (i.e., intragenomic conflict) when there is opportunity for genetic elements to enhance their own transmission at the expense of organismal-level fitness (Burt and Trivers 2006). Such conflicts are common between the nucleus and cytoplasmic genomes. For example, copies of mitochondrial genomes with large deletions can confer a replication advantage within the cell even if they harm overall fitness by reducing or eliminating the cell’s ability to respire (Taylor et al. 2002; Clark et al. 2012; Phillips et al. 2015). In addition, because most cytoplasmic genomes are inherited maternally, they can benefit from manipulating sexual reproduction to increase female reproduction and fitness (Perlman et al. 2015). Examples of this phenomenon include chimeric ORFs in plant mitochondrial genomes that induce cytoplasmic male sterility (CMS; Ingvarsson and Taylor 2002; Touzet and Budar 2004; Fujii et al. 2011) and numerous bacterial endosymbionts that manipulate sexual reproduction in animal hosts (Werren et al. 2008).
Interestingly, some of the earliest hypotheses about cytonuclear conflict were developed based on observations of differential rates of replication of plastid genomes in heteroplasmic plants (Grun 1976; reviewed in Greiner et al. 2015). Since that point, however, research on cytonuclear conflict in plants has overwhelmingly focused on mitochondria, particularly their role in CMS. Although plastids are often viewed as being relatively benign, in principle, the same evolutionary pressures could apply to these maternally inherited organelles. One possibility is that there are limited pathways available for plastids to exploit in a selfish fashion. For example, the major role of plastids (specifically chloroplasts) is in photosynthesis, and male reproductive tissues are generally non-photosynthetic. However, plastids also perform other important processes (including CLP and ACCase activity). Recent studies have provided support for the possibility of selfish plastid-nuclear interactions within complexes such as the heteromeric ACCase and the CLP complex. In particular, reproductive incompatibilities (including male sterility) between wild and domesticated lines of peas were recently attributed to variation in nuclear- and plastid-encoded components of the heteromeric ACCase (Bogdanova et al. 2015). Plastid-nuclear incompatibilities have also been implicated in male sterility in Oenothera (Stubbe and Steiner 1999). Disrupting synthesis of fatty acids and their derivatives such as jasmonic acid has also been associated with sterility phenotypes (Park et al. 2002). In addition, a recent proteomic analysis in wheat found that plastid-encoded clpP1 was one of the most upregulated genes in the anthers of male-sterile individuals, suggesting that it may play important functional roles in male reproductive tissues (Li et al. 2015).
Although the correlated increases in evolutionary rates and signatures of positive selection in Silene plastid-nuclear complexes could indicate a history of genomic conflict, they are not conclusive evidence that plastid and nuclear genes are locked in an arms race, or even that they are co-evolving in any fashion (Lovell and Robertson 2010). General changes in selection for CLP or ACCase function could simultaneously affect all subunits, without interactions among the subunits being a major source of selection. It is also possible that the overall structure or subunit composition of the CLP complex has been radically disrupted or reorganized. Furthermore, even if co-evolutionary dynamics are at play, they may involve mutually beneficial changes rather than antagonistic interactions (Rand et al. 2004). Adaptive changes in one genome may alter fitness landscapes and facilitate subsequent adaptive changes in the other genome, though it is unclear what forces might trigger such runaway adaptive evolution in these systems. A more conventional model of compensatory change in the nucleus in response to accumulation of deleterious changes in asexual organelle genomes (Rand et al. 2004; Osada and Akashi 2012) seems less likely, particularly for the CLP complex, given the evidence that accelerated rates of sequence evolution in the plastid-encoded clpP1 are driven largely by positive selection (Erixon and Oxelman 2008; Sloan et al. 2012a; Barnard-Kubow et al. 2014).
It is worth noting that the largest increases in substitution rates were found in the nuclear-encoded subunits that interact most directly with plastid-encoded subunits. Specifically, the greatest elevation of dN/dS in the CLP complex occurred in the CLPR subunits (Table 1), which assemble with plastid-encoded ClpP1 subunits to form the proteolytic R-ring (Nishimura and van Wijk 2015). More detailed structural analysis of the CLP complex (Figure 3; Table 4) showed that, even within subunits, there was an enrichment for substitutions in domains that have intimate interactions with ClpP1 (i.e., the head domains of CLPR proteins and handle domains of CLPP proteins; Yu and Houry 2007; Nishimura and van Wijk 2015). For the heteromeric ACCase, we found evidence of positive selection (Table 3) only on the nuclear-encoded ACCA subunit in S. conica, which interacts directly with the plastid-encoded AccD subunit to make up the carboxyltransferase (Sasaki and Nagano 2004). However, our structural analysis did not detect an enrichment of substitutions at the interface between ACCA and AccD (Figure 4), and it is difficult to draw firm conclusions about ACCase structure given the lack of knowledge about the functions and interactions of the plant-specific portions of ACCA and AccD.

Therefore, it appears more likely that plastid-nuclear interactions and co-evolution have played a role in generating positive selection and the observed accelerations in rates of sequence evolution in the CLP complex than in the heteromeric ACCase. We speculate that the mysterious rate accelerations that have occurred repeatedly in clpP1 in Silene and throughout the diversification of flowering plants (and possibly those that have occurred in other plastid genes as well) are the result of antagonistic co-evolution between the plastid and nucleus. An important test of this hypothesis will be to functionally characterize the sequence changes from rapidly evolving lineages, particularly with respect to their phenotypic effects on plastid replication within cells and plant allocation to male vs. female reproductive output.
ACKNOWLEDGEMENTS

We thank Andrea Berardi, Rolland Douzet, Peter Fields, Michael Hood, Andreas König, Arne Saatkamp, the Kew Millennium Seed Bank, the Ornamental Plant Germplasm Center, and the Ville de Nantes Jardin Botanique for collecting/providing seeds. We also thank Cody Kalous and Jessica Hurley for performing RNA extractions and quality control. We are grateful for valuable comments from Stephen Wright and two anonymous reviewers on an earlier version of this manuscript. This research was supported by grants from the National Science Foundation (NSF MCB-1412260 and MCB-1022128). KR is supported by a GAANN graduate fellowship from the U.S. Department of Education (P200A140008) and is a participant in the NSF-funded GAUSSI graduate training program (DGE-1450032). JCH is supported by a National Institutes of Health Postdoctoral Fellowship (F32GM116361).

REFERENCES

Babiychuk E., Vandepoele K., Wissing J., Garcia-Diaz M., Rycke R. De, Akbari H., Joubès J., Beeckman T., Jänsch L., Frentzen M., Montagu M. C. E. Van, Kushnir S., 2011 Plastid gene expression and plant development require a plastidic protein of the mitochondrial transcription termination factor family. Proc. Natl. Acad. Sci. U. S. A. 108: 6674–9.
Barnard-Kubow K. B., Sloan D. B., Galloway L. F., 2014 Correlation between sequence divergence and polymorphism reveals similar evolutionary mechanisms acting across multiple timescales in a rapidly evolving plastid genome. BMC Evol. Biol. 14: 268.
Bewley M. C., Graziano V., Griffin K., Flanagan J. M., 2006 The asymmetry in the mature amino-terminus of ClpP facilitates a local symmetry match in ClpAP and ClpXP complexes. J. Struct. Biol. 153: 113–128.
Bilder P., Lightle S., Bainbridge G., Ohren J., Finzel B., Sun F., Holley S., Al-Kassim L., Spessard C., Melnick M., Newcomer M., Waldrop G. L., 2006 The structure of the carboxyltransferase component of acetyl-CoA carboxylase reveals a zinc-binding motif unique to the bacterial enzyme. Biochemistry 45: 1712–1722.
Blazier J. C., Ruhlman T. A., Weng M.-L., Rehman S. K., Sabir J. S. M., Jansen R. K., 2016 Divergence of RNA polymerase α subunits in angiosperm plastid genomes is mediated by genomic rearrangement. Sci. Rep. 6: 24595.
Bogdanova V. S., Zaytseva O. O., Mglinets A. V., Shatskaya N. V., Kosterin O. E., Vasiliev G. V., 2015 Nuclear-cytoplasmic conflict in pea (Pisum sativum L.) is associated with nuclear and plastidic candidate genes encoding acetyl-CoA carboxylase subunits. PLoS One 10: e0119835.
Borghans J. A. M., Beltman J. B., Boer R. J. De, 2004 MHC polymorphism under host-pathogen coevolution. Immunogenetics 55: 732–9.
Broussard T. C., Price A. E., Laborde S. M., Waldrop G. L., 2013 Complex formation and regulation of Escherichia coli acetyl-CoA carboxylase. Biochemistry 52: 3346–3357.
Burt A., Trivers R., 2006 Genes in Conflict: The Biology of Selfish Genetic Elements. Harvard University Press.
Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T. L., 2009 BLAST+: architecture and applications. BMC Bioinformatics 10: 421.
Clark K. A., Howe D. K., Gafner K., Kusuma D., Ping S., Estes S., Denver D. R., 2012 Selfish little circles: transmission bias and evolution of large deletion-bearing mitochondrial DNA in Caenorhabditis briggsae nematodes. PLoS One 7: e41433.
Cronan J. E., Waldrop G. L., 2002 Multi-subunit acetyl-CoA carboxylases. Prog. Lipid Res. 41: 407–35.
Duarte J. M., Wall P. K., Edger P. P., Landherr L. L., Ma H., Pires J. C., Leebens-Mack J., DePamphilis C. W., 2010 Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels. BMC Evol. Biol. 10: 61.
Dugas D. V., Hernandez D., Koenen E. J. M., Schwarz E., Straub S., Hughes C. E., Jansen R. K., Nageswara-Rao M., Staats M., Trujillo J. T., Hajrah N. H., Alharbi N. S., Al-Malki A. L., Sabir J. S. M., Bailey C. D., 2015 Mimosoid legume plastome evolution: IR expansion, tandem repeat expansions, and accelerated rate of evolution in clpP. Sci. Rep. 5: 16958.
Egea R., Casillas S., Barbadilla A., 2008 Standard and generalized McDonald-Kreitman test: a website to detect selection by comparing different classes of DNA sites. Nucleic Acids Res. 36: W157–62.
Ekici O. D., Paetzel M., Dalbey R. E., 2008 Unconventional serine proteases: variations on the catalytic Ser/His/Asp triad configuration. Protein Sci. 17: 2023–37.
Emanuelsson O., Nielsen H., Brunak S., Heijne G. von, 2000 Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300: 1005–1016.
Erixon P., Oxelman B., 2008 Whole-gene positive selection, elevated synonymous substitution rates, duplication, and indel evolution of the chloroplast clpP1 gene. PLoS One 3: e1386.
Fujii S., Bond C. S., Small I. D., 2011 Selection patterns on restorer-like genes reveal a conflict between nuclear and mitochondrial genomes throughout angiosperm evolution. Proc. Natl. Acad. Sci. U. S. A. 108: 1723–8.
Fukuda N., Ikawa Y., Aoyagi T., Kozaki A., 2013 Expression of the genes coding for plastidic acetyl-CoA carboxylase subunits is regulated by a location-sensitive transcription factor binding site. Plant Mol. Biol. 82: 473–483.
Gould S. B., Waller R. F., McFadden G. I., 2008 Plastid evolution. Annu. Rev. Plant Biol. 59: 491–517.
Grabherr M. G., Haas B. J., Yassour M., Levin J. Z., Thompson D. A., Amit I., Adiconis X., Fan L., Raychowdhury R., Zeng Q., Chen Z., Mauceli E., Hacohen N., Gnirke A., Rhind N., Palma F. di, Birren B. W., Nusbaum C., Lindblad-Toh K., Friedman N., Regev A., 2011 Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29: 644–52.
Gray M. W., Archibald J. M., 2012 Genomics of Chloroplasts and Mitochondria. In: Bock R, Knoop V (Eds.), Genomics of Chloroplasts and Mitochondria, Advances in Photosynthesis and Respiration. Springer Netherlands, Dordrecht, pp. 1–30.
Greiner S., Wang X., Rauwolf U., Silber M. V., Mayer K., Meurer J., Haberer G., Herrmann R. G., 2008a The complete nucleotide sequences of the five genetically distinct plastid genomes of Oenothera, subsection Oenothera: I. Sequence evaluation and plastome evolution. Nucleic Acids Res. 36: 2366–2378.
Greiner S., Wang X., Herrmann R. G., Rauwolf U., Mayer K., Haberer G., Meurer J., 2008b The complete nucleotide sequences of the 5 genetically distinct plastid genomes of Oenothera, subsection Oenothera: II. A microevolutionary view using bioinformatics and formal genetic data. Mol. Biol. Evol. 25: 2019–30.
Greiner S., Sobanski J., Bock R., 2015 Why are most organelle genomes transmitted maternally? Bioessays 37: 80–94.
Grun P., 1976 Cytoplasmic Genetics and Evolution. Columbia University Press, New York.
Guisinger M. M., Kuehl J. V., Boore J. L., Jansen R. K., 2008 Genome-wide analyses of Geraniaceae plastid DNA reveal unprecedented patterns of increased nucleotide substitutions. Proc. Natl. Acad. Sci. U. S. A. 105: 18424–9.
Guisinger M. M., Chumley T. W., Kuehl J. V., Boore J. L., Jansen R. K., 2010 Implications of the plastid genome sequence of Typha (Typhaceae, Poales) for understanding genome evolution in Poaceae. J. Mol. Evol. 70: 149–66.
Guisinger M. M., Kuehl J. V., Boore J. L., Jansen R. K., 2011 Extreme reconfiguration of plastid genomes in the angiosperm family Geraniaceae: rearrangements, repeats, and codon usage. Mol. Biol. Evol. 28: 583–600.
Havird J. C., Whitehill N. S., Snow C. D., Sloan D. B., 2015 Conservative and compensatory evolution in oxidative phosphorylation complexes of angiosperms with highly divergent rates of mitochondrial genome evolution. Evolution.
Hughes A. L., Nei M., 1988 Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature 335: 167–70.
Hughes A. L., 2007 Looking for Darwin in all the wrong places: the misguided quest for positive selection at the nucleotide sequence level. Heredity (Edinb). 99: 364–73.
Ingvarsson P. K., Taylor D. R., 2002 Genealogical evidence for epidemics of selfish genes. Proc. Natl. Acad. Sci. U. S. A. 99: 11265–9.
Jansen R. K., Cai Z., Raubeson L. A., Daniell H., Depamphilis C. W., Leebens-Mack J., Müller K. F., Guisinger-Bellian M., Haberle R. C., Hansen A. K., Chumley T. W., Lee S.-B., Peery R., McNeal J. R., Kuehl J. V., Boore J. L., 2007 Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc. Natl. Acad. Sci. U. S. A. 104: 19369–74.
Keeling P. J., 2010 The endosymbiotic origin, diversification and fate of plastids. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 365: 729–48.
Konishi T., Sasaki Y., 1994 Compartmentalization of two forms of acetyl-CoA carboxylase in plants and the origin of their tolerance toward herbicides. Proc. Natl. Acad. Sci. U. S. A. 91: 3598–601.
Li Y.-Y., Li Y.-Y., Fu Q.-G., Sun H.-H., Ru Z.-G., 2015 Anther proteomic characterization in temperature sensitive Bainong male sterile wheat. Biol. Plant. 59: 273–282.
Lovell S. C., Robertson D. L., 2010 An integrated view of molecular coevolution in protein-protein interactions. Mol. Biol. Evol. 27: 2567–75.
Magee A. M., Aspinall S., Rice D. W., Cusack B. P., Sémon M., Perry A. S., Stefanović S., Milbourne D., Barth S., Palmer J. D., Gray J. C., Kavanagh T. A., Wolfe K. H., 2010 Localized hypermutation and associated gene losses in legume chloroplast genomes. Genome Res. 20: 1700–1710.
McDonald J. H., Kreitman M., 1991 Adaptive protein evolution at the Adh locus in Drosophila. Nature 351: 652–4.
Mower J. P., Touzet P., Gummow J. S., Delph L. F., Palmer J. D., 2007 Extensive variation in synonymous substitution rates in mitochondrial genes of seed plants. BMC Evol. Biol. 7: 135.
Nishimura K., Wijk K. J. van, 2015 Organization, function and substrates of the essential Clp protease system in plastids. Biochim. Biophys. Acta 1847: 915–30.
Nishimura K., Apitz J., Friso G., Kim J., Ponnala L., Grimm B., Wijk K. J. van, 2015 Discovery of a unique Clp component, ClpF, in chloroplasts: a proposed binary ClpF-ClpS1 adaptor complex functions in substrate recognition and delivery. Plant Cell 27: 2677–2691.
Olinares P. D. B., Kim J., Wijk K. J. van, 2011 The Clp protease system; a central component of the chloroplast protease network. Biochim. Biophys. Acta 1807: 999–1011.
Osada N., Akashi H., 2012 Mitochondrial-nuclear interactions and accelerated compensatory evolution: evidence from the primate cytochrome C oxidase complex. Mol. Biol. Evol. 29: 337–46.
Park J.-H., Halitschke R., Kim H. B., Baldwin I. T., Feldmann K. A., Feyereisen R., 2002 A knock-out mutation in allene oxide synthase results in male sterility and defective wound signal transduction in Arabidopsis due to a block in jasmonic acid biosynthesis. Plant J. 31: 1–12.
Parker N., Wang Y., Meinke D., 2014 Natural variation in sensitivity to a loss of chloroplast translation in Arabidopsis. Plant Physiol. 166: 2013–27.
Peltier J.-B., Ripoll D. R., Friso G., Rudella A., Cai Y., Ytterberg J., Giacomelli L., Pillardy J., Wijk K. J. van, 2004 Clp protease complexes from photosynthetic and non-photosynthetic plastids and mitochondria of plants, their predicted three-dimensional structures, and functional implications. J. Biol. Chem. 279: 4768–81.
Perlman S. J., Hodson C. N., Hamilton P. T., Opit G. P., Gowen B. E., 2015 Maternal transmission, sex ratio distortion, and mitochondria. Proc. Natl. Acad. Sci. U. S. A. 112: 10162–8.
Phillips W. S., Coleman-Hulbert A. L., Weiss E. S., Howe D. K., Ping S., Wernick R. I., Estes S., Denver D. R., 2015 Selfish mitochondrial DNA proliferates and diversifies in small, but not large, experimental populations of Caenorhabditis briggsae. Genome Biol. Evol. 7: 2023–37.
Rand D. M., Kann L. M., 1996 Excess amino acid polymorphism in mitochondrial DNA: contrasts among genes from Drosophila, mice, and humans. Mol. Biol. Evol. 13: 735–48.
Rand D. M., Haney R. A., Fry A. J., 2004 Cytonuclear coevolution: The genomics of cooperation. Trends Ecol. Evol. 19: 645–653.
Rausher M. D., Lu Y., Meyer K., 2008 Variation in constraint versus positive selection as an explanation for evolutionary rate variation among anthocyanin genes. J. Mol. Evol. 67: 137–44.
Rautenberg A., Sloan D. B., Aldén V., Oxelman B., 2012 Phylogenetic relationships of Silene multinervia and Silene section Conoimorpha (Caryophyllaceae). Syst. Bot. 37: 226–237.
Rousseau-Gueutin M., Huang X., Higginson E., Ayliffe M., Day A., Timmis J. N., 2013 Potential functional replacement of the plastidic acetyl-CoA carboxylase subunit (accD) gene by recent transfers to the nucleus in some angiosperm lineages. Plant Physiol. 161: 1918–29.
Salie M. J., Thelen J. J., 2016 Regulation and structure of the heteromeric acetyl-CoA carboxylase. Biochim. Biophys. Acta 1861: 1207–13.
Sasaki Y., Nagano Y., 2004 Plant acetyl-CoA carboxylase: structure, biosynthesis, regulation, and gene manipulation for plant breeding. Biosci. Biotechnol. Biochem. 68: 1175–84.
Schrag J. D., Li Y. G., Wu S., Cygler M., 1991 Ser-His-Glu triad forms the catalytic site of the lipase from Geotrichum candidum. Nature 351: 761–4.
Schulte W., Töpfer R., Stracke R., Schell J., Martini N., 1997 Multi-functional acetyl-CoA carboxylase from Brassica napus is encoded by a multi-gene family: indication for plastidic localization of at least one isoform. Proc. Natl. Acad. Sci. U. S. A. 94: 3465–70.
Sloan D. B., Oxelman B., Rautenberg A., Taylor D. R., 2009 Phylogenetic analysis of mitochondrial substitution rate variation in the angiosperm tribe Sileneae. BMC Evol. Biol. 9: 260.
Sloan D. B., Alverson A. J., Wu M., Palmer J. D., Taylor D. R., 2012a Recent acceleration of plastid sequence and structural evolution coincides with extreme mitochondrial divergence in the angiosperm genus Silene. Genome Biol. Evol. 4: 294–306.
Sloan D. B., Alverson A. J., Chuckalovcak J. P., Wu M., McCauley D. E., Palmer J. D., Taylor D. R., 2012b Rapid evolution of enormous, multichromosomal genomes in flowering plant mitochondria with exceptionally high mutation rates. PLoS Biol. 10: e1001241.
Sloan D. B., Triant D. A., Forrester N. J., Bergner L. M., Wu M., Taylor D. R., 2014a A recurring syndrome of accelerated plastid genome evolution in the angiosperm tribe Sileneae (Caryophyllaceae). Mol. Phylogenet. Evol. 72: 82–9.
Sloan D. B., Triant D. A., Wu M., Taylor D. R., 2014b Cytonuclear interactions and relaxed selection accelerate sequence evolution in organelle ribosomes. Mol. Biol. Evol. 31: 673–82.
Stajich J. E., Block D., Boulez K., Brenner S. E., Chervitz S. A., Dagdigian C., Fuellen G., Gilbert J. G. R., Korf I., Lapp H., Lehväslaiho H., Matsalla C., Mungall C. J., Osborne B. I., Pocock M. R., Schattner P., Senger M., Stein L. D., Stupka E., Wilkinson M. D., Birney E., 2002 The Bioperl toolkit: Perl modules for the life sciences. Genome Res. 12: 1611–8.
Stoletzki N., Eyre-Walker A., 2011 Estimation of the neutrality index. Mol. Biol. Evol. 28: 63–70.
Straub S. C. K., Fishbein M., Livshultz T., Foster Z., Parks M., Weitemier K., Cronn R. C., Liston A., 2011 Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing. BMC Genomics 12: 211.
Stubbe W., Steiner E., 1999 Inactivation of pollen and other effects of genome-plastome incompatibility in Oenothera. Plant Syst. Evol. 217: 259–277.
Tamura K., Stecher G., Peterson D., Filipski A., Kumar S., 2013 MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol. Biol. Evol. 30: 2725–9.
Taylor D. R., Zeyl C., Cooke E., 2002 Conflicting levels of selection in the accumulation of mitochondrial defects in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U. S. A. 99: 3690–4.
Timmis J. N., Ayliffe M. A., Huang C. Y., Martin W., 2004 Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes. Nat. Rev. Genet. 5: 123–35.
Touzet P., Budar F., 2004 Unveiling the molecular arms race between two conflicting genomes in cytoplasmic male sterility? Trends Plant Sci. 9: 568–70.
Wang J., Hartling J. A., Flanagan J. M., 1997 The structure of ClpP at 2.3 Å resolution suggests a model for ATP-dependent proteolysis. Cell 91: 447–56.
Weng M.-L., Blazier J. C., Govindu M., Jansen R. K., 2014 Reconstruction of the ancestral plastid genome in Geraniaceae reveals a correlation between genome rearrangements, repeats, and nucleotide substitution rates. Mol. Biol. Evol. 31: 645–59.
Weng M.-L., Ruhlman T. A., Jansen R. K., 2016 Plastid-nuclear interaction and accelerated coevolution in plastid ribosomal genes in Geraniaceae. Genome Biol. Evol. 8: evw115.
Werren J. H., Baldo L., Clark M. E., 2008 Wolbachia: master manipulators of invertebrate biology. Nat. Rev. Microbiol. 6: 741–51.
White S. W., Zheng J., Zhang Y.-M., Rock, 2005 The structural biology of type II fatty acid biosynthesis. Annu. Rev. Biochem. 74: 791–831.
Wicke S., Schneeweiss G. M., dePamphilis C. W., Müller K. F., Quandt D., 2011 The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol. Biol. 76: 273–97.
Wijk K. J. van, 2015 Protein maturation and proteolysis in plant plastids, mitochondria, and peroxisomes. Annu. Rev. Plant Biol. 66: 75–111.
Williams A. V., Boykin L. M., Howell K. A., Nevill P. G., Small I., 2015 The complete sequence of the Acacia ligulata chloroplast genome reveals a highly divergent clpP1 gene. PLoS One 10: e0125768.
Xie Y., Wu G., Tang J., Luo R., Patterson J., Liu S., Huang W., He G., Gu S., Li S., Zhou X., Lam T.-W., Li Y., Xu X., Wong G. K.-S., Wang J., 2014 SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads. Bioinformatics 30: 1660–6.
Yang Z., 2007 PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24: 1586–91.
Yu A. Y. H., Houry W. A., 2007 ClpP: A distinctive family of cylindrical energy-dependent serine proteases. FEBS Lett. 581: 3749–3757.
Zeiler E., List A., Alte F., Gersch M., Wachtel R., Poreba M., Drag M., Groll M., Sieber S. A., 2013 Structural and functional insights into caseinolytic proteases reveal an unprecedented regulation principle of their catalytic triad. Proc. Natl. Acad. Sci. U.S.A. 110: 11302–7.
Zhang J., Ruhlman T. A., Sabir J., Blazier J. C., Jansen R. K., 2015 Coordinated rates of evolution between interacting plastid and nuclear genes in Geraniaceae. Plant Cell 27: 563–73.
Zhang J., Ruhlman T. A., Sabir J., Blazier J. C., Weng M.-L., Park S., Jansen R. K., 2016 Coevolution between nuclear encoded DNA replication, recombination and repair genes and plastid genome complexity. Genome Biol. Evol. 8: evw033.

Table 1. Summary of dN/dS estimates. Values greater than one are highlighted in bold, with underlined text indicating statistical significance based on likelihood ratio tests. Cells containing "--" indicate that the particular gene was not found in the corresponding transcriptome. Only concatenated results are reported for the set of 50 random genes. Individual gene results for this random set are available in Table S5. Note that CLPP5a and CLPP5b are recently duplicated paralogs and that only one of these copies was recovered from the Agrostemma transcriptome. Therefore, this Agrostemma sequence was used in both CLPP5 analyses.

Complex        Gene          A. githago  S. paradoxa  S. conica  S. noctiflora  S. latifolia  S. vulgaris
Plastid CLP    CLPP3         0.09        0.88         2.89       2.31           0.52          0.00
Plastid CLP    CLPP4         0.02        0.36         0.62       1.24           0.21          0.00
Plastid CLP    CLPP5a        0.03        0.52         1.55       0.53           0.00          0.03
Plastid CLP    CLPP5b        0.02        0.94         2.03       1.77           0.13          0.17
Plastid CLP    CLPP6         0.10        0.57         2.84       1.23           0.35          0.09
Plastid CLP    CLPR1         0.04        1.79         8.86       2.40           0.46          0.10
Plastid CLP    CLPR2         0.02        3.65         1.16       2.41           0.06          0.00
Plastid CLP    CLPR3         0.03        8.64         1.24       2.29           0.22          0.07
Plastid CLP    CLPR4         0.01        1.83         2.37       2.02           0.08          0.06
Plastid CLP    CLPC          0.02        0.22         0.47       0.75           0.02          0.03
Plastid CLP    CLPT1         0.10        1.66         1.31       7.31           0.27          0.10
Plastid CLP    CLPT2         0.17        0.67         0.80       1.24           0.31          0.36
Plastid CLP    CLPS          0.02        0.03         0.05       0.39           0.00          0.04
Plastid CLP    Concatenated  0.05        0.87         1.30       1.55           0.16          0.06
ACC            ACCA          0.01        0.99         0.49       --             0.06          0.05
ACC            ACCB1         0.17        --           0.45       --             0.07          0.14
ACC            ACCB2         0.67        0.18         0.11       --             0.42          0.59
ACC            ACCC          0.01        0.08         0.08       --             0.08          0.06
ACC            Concatenated  0.13        0.33         0.21       --             0.16          0.13
Photosystem I  LHCA2         0.03        0.00         0.04       0.04           0.09          0.02
Photosystem I  LHCA3         0.02        0.05         0.13       0.06           0.02          0.07
Photosystem I  PSAK          0.00        0.07         0.28       0.45           0.00          0.00
Photosystem I  PSAL          0.02        0.08         0.18       0.11           0.00          0.00
Photosystem I  PSAN          0.07        0.14         0.09       0.05           0.09          0.00
Photosystem I  PSAO          0.09        0.00         0.20       0.13           0.21          0.05
Photosystem I  PSAP          0.18        0.12         0.03       0.14           0.18          0.16
Photosystem I  Concatenated  0.05        0.05         0.14       0.11           0.08          0.04
MT CLP         CLPP2         0.05        0.48         0.09       0.00           0.12          0.21
MT CLP         CLPX1         0.06        0.21         0.08       0.12           0.07          0.02
MT CLP         CLPX2         0.16        0.13         0.26       0.20           0.22          0.18
MT CLP         CLPX3         0.12        0.52         0.32       0.08           0.31          0.30
MT CLP         Concatenated  0.10        0.28         0.18       0.11           0.16          0.15
Random         Concatenated  0.12        0.15         0.12       0.15           0.13          0.15

Table 2. Summary of dN/dS for constrained sets of fast and slow lineages based on concatenations of genes in each functional complex. The p-values indicate significant differences in rate between the two sets of lineages based on LRTs.

Genes          dN/dS, slow lineages  dN/dS, fast lineages  p-value
Plastid CLP    0.07                  1.25                  0.000
ACC            0.14                  0.26                  0.014
Photosystem I  0.05                  0.10                  0.002
MT CLP         0.11                  0.20                  0.002
Random         0.13                  0.13                  0.894

Table 3. Summary of MK tests using intraspecific polymorphism within S. conica relative to interspecific divergence between S. conica and S. latifolia. For each gene, the neutrality index (NI) and the direction of selection (DoS) were calculated, with NI values less than 1 and DoS values greater than 0 indicative of positive selection (i.e., an excess of nonsynonymous divergence). For concatenations of genes within each complex, NI_TG (an unbiased estimator of NI) is reported in parentheses. Only concatenated results are reported for the set of random genes. Individual gene results for this random set are available in Table S6.

Complex        Gene          Ps   Pn   Ds    Dn   NI           DoS    p-value
Plastid CLP    CLPP3         8    3    16    83   0.07         0.57   0.000
Plastid CLP    CLPP4         18   0    31    51   0.00         0.62   0.000
Plastid CLP    CLPP5a        5    2    4     19   0.08         0.54   0.006
Plastid CLP    CLPP5b        0    0    11    25   --           --     --
Plastid CLP    CLPP6         0    2    11    59   --           -0.16  0.543
Plastid CLP    CLPR1         1    1    22    144  0.15         0.37   0.256
Plastid CLP    CLPR3         3    2    48    120  0.27         0.31   0.128
Plastid CLP    CLPR4         1    1    22    80   0.28         0.28   0.337
Plastid CLP    CLPC          4    1    55    50   0.28         0.28   0.226
Plastid CLP    CLPT1         4    0    10    34   0.00         0.77   0.001
Plastid CLP    CLPT2         7    1    19    39   0.07         0.55   0.002
Plastid CLP    CLPS          3    0    11    2    0.00         0.15   0.467
Plastid CLP    Concatenated  54   13   260   706  0.09 (0.11)  0.54   0.000
ACC            ACCA          13   7    42    111  0.20         0.38   0.001
ACC            ACCB1         0    1    14    11   --           -0.56  0.270
ACC            ACCB2         4    3    10    7    1.07         -0.02  0.939
ACC            ACCC          10   3    22    18   0.37         0.22   0.160
ACC            Concatenated  27   14   88    147  0.31 (0.31)  0.28   0.001
Photosystem I  LHCA2         4    0    19    6    0.00         0.24   0.271
Photosystem I  LHCA3         13   1    28    9    0.24         0.17   0.167
Photosystem I  PSAK          1    0    13    7    0.00         0.35   0.468
Photosystem I  PSAL          5    0    15    5    0.00         0.25   0.211
Photosystem I  PSAN          11   2    18    10   0.33         0.20   0.183
Photosystem I  Concatenated  34   3    93    37   0.22 (0.20)  0.20   0.011
MT CLP         CLPP2         1    1    21    4    5.25         -0.34  0.233
MT CLP         CLPX2         4    5    26    23   1.41         -0.09  0.634
MT CLP         CLPX3         9    3    12    8    0.50         0.15   0.387
MT CLP         Concatenated  14   9    59    35   1.08 (1.05)  -0.02  0.866
Random         Concatenated  368  178  1297  570  1.10 (1.07)  -0.02  0.371

Table 4. The ratio of amino acid substitutions in the head vs. handle regions is higher in CLPR subunits than CLPP subunits. Values represent counts of substitutions summed across all nuclear-encoded genes in each category.

Species            CLPP head:handle  CLPR head:handle  p-value (Fisher's exact test)
Silene conica      85:63             136:61            0.031
Silene noctiflora  50:37             109:36            0.006
Silene paradoxa    49:40             127:22            0.000

Figure 1.
Rates of sequence evolution in nuclear genes coding for subunits of the plastid and mitochondrial CLP complexes. Branch lengths are scaled to the amount of nonsynonymous (dN) and synonymous (dS) divergence per site. Species with rapid rates of plastid genome evolution (Sloan et al. 2014a) are highlighted in red.

Figure 2. Gene-by-gene comparison of dN/dS in fast vs. slow sets of lineages. Points are color-coded by functional complex, and the diameter of each point is proportional to gene length. All points represent individual genes except the “random” point, which is based on the concatenation of all 50 genes in that set. The size of that point is scaled to the average length of each gene rather than the total concatenation length. ACCA and CLPS, which exhibit distinct patterns from the other ACCase and CLP subunits, are labeled individually.

Figure 3. A) ClpP protease structure from E. coli (PDB 1YG6; Yu and Houry 2007) oriented with the two heptameric rings stacked on top of each other (left) or in front and behind (right). One of the 14 (identical) E. coli subunits is highlighted in blue. B) A single E. coli ClpP subunit (as in part A), with the head, handle, and axial-loop domains highlighted. The surface of the catalytic triad is indicated in mesh and the individual residues (Ser, His, and Asp) are labeled. Other important residues are numbered and indicated with stick models, including those that interact with substrates (Phe-17 and Phe-49) or contribute to heptamer stability by forming hydrophobic bonds (Asn-41/Tyr-20, Asn-41/Thr-32, Asp-78/Asn-116, and Asp-171/Try-128) or ion pairs (Arg-118/Glu-141, Arg-170/Glu-134, and Asp-171/His-138). C) Locations of substitutions in Silene species with fast-evolving CLP sequences (S. conica, S. noctiflora, and S. paradoxa). Residues with changes (relative to the inferred ancestral Silene sequence) in one, two, or three species are indicated in yellow, orange, and red, respectively. The black triangle highlights the ClpP1 site at which large insertions have occurred in S. conica (39 amino acids) and S. noctiflora (7 amino acids).

Figure 4. A) Structure of the carboxyltransferase component from the E. coli ACCase (PDB 2F9Y), oriented as in Figure 2C from Bilder et al. (2006). Residues of active sites (Gly-204, Gly-205, Gly-206, Gly-207) are shown as spheres, the four cysteines that form cysteinyl zinc ligands (Cys-27, Cys-30, Cys-46, Cys-49) are shown as stick models, and the helices and sheets that compose the catalytic platform are indicated. B) The same structure as in part A with inferred Silene changes mapped as described in Figure 3. The red arrow indicates the location of an AccD site with large insertions in both S. conica (26 amino acids) and S. paradoxa (58 amino acids). Importantly, this model contains only the portions of ACCA and AccD that can be aligned and mapped to E. coli, as the copies of these genes in plants are roughly twice the length of their counterparts in E. coli owing to large N- or C-terminal extensions of unknown function. C) Structure of the biotin carboxylase (ACCC)-biotin carboxyl carrier protein (ACCB) complex from the E. coli ACCase (PDB 4HR7) as in Figure 2A from Broussard et al. (2013). D) An ACCC monomer complexed with two ACCB monomers as in Figure 2D in Broussard et al. (2013), indicating functional domains and ACCB-ACCC interfaces. The Lys-122 residues that bind biotin and the active site of ACCC (Arg-338) are shown as spheres, while residues that play important roles in ACCB-ACCC interactions are shown as sticks and highlighted with arrows. E) The same structure as part D with inferred Silene changes mapped as described in Figure 3. F) Two ACCC dimers and a single ACCB monomer, showing ACCC-ACCC interfaces as in Figure 4A from Broussard et al. (2013). Residues that play important roles in ACCC-ACCC interactions are shown as sticks and highlighted with arrows. G) The same structure as part F with inferred Silene changes mapped as described in Figure 3.
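
The NI and DoS columns of Table 3 can be recomputed from the four counts in each row. The sketch below is illustrative only and is not the authors' code: it assumes the commonly used definitions NI = (Pn/Ps)/(Dn/Ds) and DoS = Dn/(Dn+Ds) - Pn/(Pn+Ps), and it assumes that the parenthesized estimator for concatenations is the Tarone-Greenland form, sum(Ds*Pn/(Ps+Ds)) / sum(Ps*Dn/(Ps+Ds)). None of these formulas is stated in this section, and the names `MKCounts`, `neutrality_index`, `direction_of_selection`, and `neutrality_index_tg` are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class MKCounts(NamedTuple):
    """Counts for one gene, named after the Table 3 column headers."""
    ps: int  # synonymous polymorphisms
    pn: int  # nonsynonymous polymorphisms
    ds: int  # synonymous fixed differences
    dn: int  # nonsynonymous fixed differences


def neutrality_index(c: MKCounts) -> Optional[float]:
    """NI = (Pn/Ps)/(Dn/Ds); None when the denominator is zero ("--")."""
    if c.ps == 0 or c.dn == 0:
        return None
    return (c.pn * c.ds) / (c.ps * c.dn)


def direction_of_selection(c: MKCounts) -> Optional[float]:
    """DoS = Dn/(Dn+Ds) - Pn/(Pn+Ps); None when either total is zero."""
    if c.dn + c.ds == 0 or c.pn + c.ps == 0:
        return None
    return c.dn / (c.dn + c.ds) - c.pn / (c.pn + c.ps)


def neutrality_index_tg(genes: Sequence[MKCounts]) -> Optional[float]:
    """Assumed Tarone-Greenland estimator combined across several genes."""
    num = 0.0
    den = 0.0
    for c in genes:
        total = c.ps + c.ds
        if total == 0:
            continue  # gene contributes no information
        num += c.ds * c.pn / total
        den += c.ps * c.dn / total
    return num / den if den else None


# Counts copied from the CLPP3 row and the four ACC rows of Table 3.
clpp3 = MKCounts(ps=8, pn=3, ds=16, dn=83)
acc_genes = [
    MKCounts(13, 7, 42, 111),  # ACCA
    MKCounts(0, 1, 14, 11),    # ACCB1
    MKCounts(4, 3, 10, 7),     # ACCB2
    MKCounts(10, 3, 22, 18),   # ACCC
]

print(round(neutrality_index(clpp3), 2))        # 0.07
print(round(direction_of_selection(clpp3), 2))  # 0.57
print(round(neutrality_index_tg(acc_genes), 2)) # 0.31
```

Under these assumed definitions the results agree with the tabulated values for CLPP3 (NI 0.07, DoS 0.57) and with the parenthesized value for the ACC concatenation (0.31). Rows with Ps = 0, such as CLPP6, return `None` for NI, matching the "--" entries.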