Submitting an RNA-Seq job at PATRIC

I. Locating the RNA-Seq Service App
1. At the top of any PATRIC page, find the Services tab.
2. Click on RNA-Seq Analysis.
3. This will open the RNA-Seq landing page, where researchers can submit long reads and single or paired read files.

II. Filling in parameters - Strategy
1. PATRIC offers three different RNA-Seq strategies:
• Rockhopper supports various stages of bacterial RNA-Seq data analysis, including aligning sequencing reads to a genome, constructing transcriptome maps, quantifying transcript abundance, testing for differential gene expression and determining operon structures [1].
• Tuxedo uses TopHat2 [2], which allows for variable-length indels with respect to the reference genome and also aligns reads across fusion breaks, which can occur after genomic translocations. TopHat2 combines the ability to identify novel splice sites with direct mapping to known transcripts, producing sensitive and accurate alignments, even for highly repetitive genomes or in the presence of pseudogenes.
• HISAT (hierarchical indexing for spliced alignment of transcripts) [3] is a highly efficient system for aligning reads from RNA sequencing experiments. HISAT uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, employing two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extensions of these alignments. PATRIC provides this service (called Host HISAT2) to allow researchers to analyze RNA-Seq data for a limited number of host genomes that include:
  § Caenorhabditis elegans
  § Danio rerio
  § Drosophila melanogaster
  § Gallus gallus
  § Homo sapiens
  § Macaca mulatta
  § Mus musculus
  § Mustela putorius furo
  § Rattus norvegicus
  § Sus scrofa
2. Clicking on the down arrow that follows the Strategy text box (red arrow) will show the three options. Click on the desired strategy (blue arrow).

III. Filling in parameters - Target Genome
1. Researchers must select a Target Genome to align the reads against. If this genome is a private genome, the search can be narrowed by clicking on the filter icon under the words Target Genome (red arrow). This will open the filter, where Public Genomes can be de-selected (blue arrow).
2. Researchers can also click on the down arrow at the end of the text box (red arrow), which will open a drop-down box where a genome can be selected (blue arrow).
3. Researchers can also start typing the name of a genome. A box below Target Genome will show the closest matches, which can then be selected.

IV. Filling in parameters - Output folder
1. Researchers who have used PATRIC before can click on the down arrow at the end of the Output Folder text box. This will open a drop-down box that shows the folders that exist in the workspace (red arrow).
2. Researchers who have not previously submitted an RNA-Seq job and want to create a new folder to store the results will need to click on the folder icon at the end of the Output Folder text box (red arrow).
3. This will open a pop-up window. To create a new folder, click on the folder icon (red arrow), which will reload the window (black arrow) to show a text box where the new folder can be named (blue arrow). Once the folder has been named, click on OK to finalize it (green arrow).
4. Finally, researchers must name the RNA-Seq job (red arrow).

V. Uploading reads from your computer that are not in the workspace
1. To upload reads that have not previously been uploaded into PATRIC, click on the folder icon that follows the Read File text box (red arrows). This will open a pop-up window (black arrow). To upload new reads, click on the upload icon (blue arrow).
2. Select the file of interest (red arrow) and then click OK (blue arrow).
3. Once a file is selected, researchers must pay particular attention to the Uploads monitor at the bottom of the page, which shows the progress of the upload.
4. The name of the uploaded file will appear in the text box.

VI. Selecting reads that are already in the workspace
1. Clicking on the down arrow that follows the text box (red arrow) will open a drop-down box where files can be selected (blue arrow).
2. Another way to select reads that are already in the workspace is to click on the folder icon that follows the text box (red arrow), which will open a pop-up box where reads can be selected (blue arrow). The selection is completed by clicking OK (blue arrow).
3. Select single or paired-end reads; they will then appear in the text box(es).

VII. Selecting a condition or group that will be linked to a read (Optional)
1. Metadata can be assigned to selected reads. This makes identification easier in some of the downstream tools available on PATRIC. To do this, locate the Condition box and click the On box (red arrow). This makes it possible to name specific conditions.
2. Name the condition (red arrow) and click the plus icon (blue arrow). This will show the name of the condition and the color code assigned to it in the text box (green arrow). As many conditions as desired can be entered.
3. To link the name of the condition or group to the selected reads, click on the down arrow that follows the text box under Condition (red arrow). This will open a drop-down box that shows all possible conditions. Click on the appropriate one (blue arrow).
4. This will autofill the Condition text box with the name of the condition.

VIII. Loading reads into the library and submitting the job
1. Clicking on the arrow icon in any read library box (red arrow) will load the reads (shown together on the same line for paired reads), with their assigned condition, into the selected library.
2. To submit the completed job, click the Submit button (red arrow).
3. If the job was submitted successfully, a message will appear indicating that the job has entered the queue.
4. To check the status of the job, click on the Jobs indicator at the bottom of the PATRIC page.
5. Clicking on Jobs opens the Jobs Status page, where researchers can see the progress of the RNA-Seq job as well as the status of all previously submitted service jobs.

IX. Examining the RNA-Seq job results
1. Once the RNA-Seq job of interest is located and clicked on in the jobs list, the vertical green bar will become populated with the View icon (red arrow). To see the results, click on that icon.
2. This will open the results page for the RNA-Seq job.
3. Summary.txt file. Clicking on the download button opens a text file that summarizes the steps and the different categories of feature alignment.
4. Transcripts.txt file. This text file contains the data on a per-gene basis. The data include the contig, the transcription and translation start and stop sites, the strand, PATRIC and RefSeq locus tags, the functional description, estimates of abundance levels per gene, and the q-value. Estimates of transcript abundance are computed by summing the number of reads for a transcript and dividing that sum by the transcript's length and a normalization factor. The q-value is an adjusted p-value that takes the false discovery rate (FDR) into account.
5. BAM files. PATRIC also provides BAM files. BAM is the compressed binary version of the Sequence Alignment/Map (SAM) format, a compact and indexable representation of nucleotide sequence alignments. BAM files can be uploaded into a genome browser so that researchers can see the alignment of the reads compared to the annotation for the genome in question.
6. Gene_exp.gmx file. The GMX file format is a tab-delimited file format that describes gene sets or other collections of elements. The RNA-Seq gene_exp.gmx file contains a list of the genes and the ratio of their expression between two conditions.

References
1. McClure, R., et al., Computational analysis of bacterial RNA-Seq data. Nucleic Acids Res, 2013. 41(14): p. e140.
2. Kim, D., et al., TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol, 2013. 14(4): p. R36.
3. Kim, D., B. Langmead, and S.L. Salzberg, HISAT: a fast spliced aligner with low memory requirements. Nat Methods, 2015. 12(4): p. 357-60.
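
Supplementary sketch: abundance estimates and q-values in Transcripts.txt

The description of the Transcripts.txt file mentions two derived quantities: an abundance estimate (reads divided by transcript length and a normalization factor) and a q-value (a p-value adjusted for the false discovery rate). The Python sketch below illustrates both under stated assumptions. The function names `abundance` and `benjamini_hochberg` are hypothetical, the normalization factor is treated as a plain number supplied by the caller, and the Benjamini-Hochberg procedure is used only as one common FDR adjustment. The tutorial does not say which normalization or adjustment procedure the PATRIC strategies actually apply.

```python
from typing import List


def abundance(read_count: int, transcript_length: int, norm_factor: float) -> float:
    """Reads for a transcript divided by its length and a normalization factor.

    norm_factor is an assumption here: any positive scaling value chosen by
    the caller (the tutorial does not define how it is derived).
    """
    if transcript_length <= 0 or norm_factor <= 0:
        raise ValueError("transcript_length and norm_factor must be positive")
    return read_count / (transcript_length * norm_factor)


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return FDR-adjusted p-values (q-values) in the original input order.

    Assumption: Benjamini-Hochberg is used as an illustrative FDR adjustment.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(abundance(read_count=200, transcript_length=1000, norm_factor=2.0))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

Genes with a small q-value are the ones whose expression difference is least likely to be a false discovery. The threshold to apply is a choice left to the researcher.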