Short Film The Origin of Species: Lizards in an Evolutionary Tree Educator Materials

SEQUENCE ALIGNMENT INTRODUCTION USING CLUSTALX

This document can be used to introduce the basic concept of DNA sequence alignment, which is necessary before DNA sequences can be meaningfully compared.

FORMAT OF DNA SEQUENCE INFORMATION

There are different formats for representing DNA sequences. Shown below is a partial sequence from the dog’s cytochrome oxidase subunit I (COI) gene in FASTA format. A FASTA record starts with a “>”, followed by a description of the sequence that runs to the end of the first line, followed by the DNA sequence itself.

>gi|377685879|gb|JN850779.1| Canis lupus familiaris isolate dog_3 cytochrome oxidase subunit I (COI) gene, partial cds; mitochondrial
TACTTTATACTTACTATTTGGAGCATGAGCCGGTATAGTAGGCACTGCCTTGAGCCTCCTCATCCGAGCC
GAACTAGGTCAGCCCGGTACTTTACTAGGTGACGATCAAATTTATAATGTCATYGTAACCGCCCATGCTT…

A file containing FASTA format sequence information may contain multiple sequences one after another.

WHAT SEQUENCES DO WE CHOOSE TO COMPARE?

In modern taxonomic practice, scientists routinely analyze the DNA from specimens they collect to obtain a “DNA barcode,” a short DNA sequence unique to a particular species, which is used to identify the species a specimen belongs to. For animals and many other eukaryotes, different genes have been used for this purpose. One example is the mitochondrial cytochrome oxidase subunit I (COI) gene, which encodes part of an enzyme that is important for cellular respiration; the mitochondrial NADH dehydrogenase subunit 2 (ND2) gene is another. Sequences like these are available from a wide range of species, making it possible to use these gene sequences to explore phylogenetic relationships.

COI and ND2 are good choices for DNA barcoding because, in general, there is little variation in the sequences of organisms within the same species, while there is significant variation in the sequences of organisms from different species. Therefore, they provide a unique sequence signature for a particular species and are suitable for comparing phylogenetic relationships between species.
Because these sequences are so similar within the same species, these genes are not a good choice for studying variations within the same species, or even among species that have very recently speciated. COI sequences also have a low mutation rate among many species of plants and cannot be used for DNA barcoding or phylogenetic comparisons of those species.

www.BioInteractive.org Published March 2014 Updated April 2015

The sequence includes the NADH dehydrogenase subunit 2 gene along with adjacent sequences that include some transfer RNA genes. The ND2 gene is one of several genes that are often used for genetic fingerprinting in animals. It is suitable for this purpose because it is conserved enough that the gene is shared among a diverse group of animals, yet different enough between animals to examine evolutionary relationships by comparing DNA sequences.

WHICH PROGRAM TO USE

To teach the basics of sequence alignment, we recommend ClustalX if you can install software on your computer. To generate a phylogeny, using www.phylogeny.fr is simpler.

• ClustalX has a graphic interface that is intuitive, and it is an excellent tool for illustrating the concept and the process of sequence alignment. ClustalX is a freely available installed program, with its advantage (no reliance on internet) and disadvantage (requires program installation) in the classroom setting. Its algorithm is also a little dated, and there are other programs that do a better job of generating phylogenies; however, it is sufficient as a demonstration of how to generate phylogenetic trees from DNA sequence alignments. The phylogeny generated requires another freely available program, NJplot, to print or view.
• www.Phylogeny.fr is a web-based tool for generating phylogenies. Using the default settings, phylogeny.fr is simple to use, and it uses a different alignment generator called MUSCLE. The website generates a phylogeny that can be saved as different graphic files. However, the display of the alignment is not as intuitive as in ClustalX.
ALIGNMENT TUTORIAL AND TREE GENERATION VIA CLUSTALX

Software and Files

Install ClustalX, which is available at http://www.clustal.org/clustal2/. (For Windows, download clustalx-2.1-win.msi; for Mac OS, download clustalx-2.1-macosx.dmg.) Next, install NJplot, which is available at http://pbil.univ-lyon1.fr/software/njplot.html.

Understanding sequence alignment

Let’s use ClustalX to compare DNA sequences. For this exercise, use the test sequence file test.txt, which contains the three short DNA sequences (test1, test2, and test3) shown below:

>test1
AAGGAAGGAAGGAAGGAAGGAAGG
>test2
AAGGAAGGAATGGAAGGAAGGAAGG
>test3
AAGGAACGGAATGGTAGGAAGGAAGG

Load these sequences into ClustalX by choosing from the menu, File -> Load Sequences, and then selecting test.txt. ClustalX displays these sequences as shown in the illustration on the right (PC version shown).

Before you can compare sequences, you have to “align” them, which means lining up the sequences and sliding them past one another until the best matching pattern is found. Alignment allows you to examine differences between related sequences; such differences reflect evolutionary relationships.

From the menu, choose Alignment -> Do Complete Alignment. When prompted for output file names, use the default names given and click “OK.” The screen changes to look like the illustration on the right.

Notice that it’s a lot easier to see differences among DNA sequences after alignment. You can figure out what kinds of mutations have occurred in each sequence by how it compares to the others (as shown in the labeled illustration). The number of differences among sequences indicates how closely or distantly related the corresponding organisms are.

[Illustration labels: Deletion, Insertion, Substitution]

Based on this information alone, which two sequences do you think are more closely related? To see if your answer was accurate, we can use ClustalX to generate a phylogenetic tree.
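As an aside on the alignment step described above, the idea of sliding two sequences past one another until the best match is found can be sketched in a few lines of Python. This is only an illustration, not what ClustalX does internally: it aligns just two sequences at a time using a basic global (Needleman-Wunsch style) alignment, and the function name global_align and the scores (match +1, mismatch -1, gap -1) are assumptions chosen for simplicity.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Pairwise global alignment with simple, assumed scores."""
    n, m = len(a), len(b)

    def pair(i, j):
        # Score for placing a[i-1] opposite b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # letters paired
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Walk back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# The first two sequences from test.txt.
test1 = "AAGGAAGGAAGGAAGGAAGGAAGG"
test2 = "AAGGAAGGAATGGAAGGAAGGAAGG"

aligned_1, aligned_2, score = global_align(test1, test2)
print(aligned_1)
print(aligned_2)
print("score:", score)
```

With these assumed scores, test1 is printed with a single "-" opposite the extra T in test2, which is the kind of gap that marks an insertion or deletion in an alignment display.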
From the menu, choose Trees -> Draw Tree. This creates a phylogenetic tree file called test1.ph, which can be opened using NJplot.exe. Launch NJplot, then from its menu, choose File -> Open, and select test1.ph.

The result shows that test1 and test2 are on the same branch of the tree, indicating that they are more closely related to each other than to test3.
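The tree result can be cross-checked by hand or with a short script. The sketch below is a simplified stand-in, not ClustalX's method: it reads the three test sequences from FASTA-formatted text and counts the minimum number of insertions, deletions, and substitutions (the edit distance) between each pair. The helper names parse_fasta, edit_distance, and pairwise_distances are hypothetical, and the sequences are embedded as a string rather than read from test.txt so that the example is self-contained.

```python
from itertools import combinations

# Contents of test.txt from the tutorial, embedded for convenience.
TEST_FASTA = """>test1
AAGGAAGGAAGGAAGGAAGGAAGG
>test2
AAGGAAGGAATGGAAGGAAGGAAGG
>test3
AAGGAACGGAATGGTAGGAAGGAAGG
"""


def parse_fasta(text):
    """Return {name: sequence} for FASTA text holding one or more records."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Everything after ">" on the header line is the description.
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            # Sequence data may be wrapped over several lines.
            records[name] += line.upper()
    return records


def edit_distance(a, b):
    """Minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def pairwise_distances(records):
    """Edit distance for every pair of named sequences."""
    return {
        (x, y): edit_distance(records[x], records[y])
        for x, y in combinations(records, 2)
    }


sequences = parse_fasta(TEST_FASTA)
distances = pairwise_distances(sequences)
closest = min(distances, key=distances.get)

for pair, d in distances.items():
    print(pair, d)
print("closest pair:", closest)
```

For these three sequences, the smallest count belongs to test1 and test2, which agrees with the tree shown in NJplot. Real tree-building programs derive distances from the full multiple alignment and apply a clustering method, so this count is only a rough check.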