Submitting an RNA-Seq job at PATRIC
I. Locating the RNA-Seq Service App
1. At the top of any PATRIC page, find the Services tab.
2. Click on RNA-Seq Analysis.
3. This will open up the RNA-Seq landing page where researchers can submit long reads,
single or paired read files.
II. Filling in parameters - Strategy
1. PATRIC offers three different RNA-Seq strategies:
• Rockhopper supports various stages of bacterial RNA-seq data analysis, including
aligning sequencing reads to a genome, constructing transcriptome maps,
quantifying transcript abundance, testing for differential gene expression and
determining operon structures [1].
• The Tuxedo suite includes TopHat2 [2], which allows for variable-length indels with
respect to the reference genome and also aligns reads across fusion breaks, which can
occur after genomic translocations. TopHat2 combines the ability to identify novel splice
sites with direct mapping to known transcripts, producing sensitive and accurate
alignments, even for highly repetitive genomes or in the presence of pseudogenes.
• HISAT (hierarchical indexing for spliced alignment of transcripts) [3] is a highly
efficient system for aligning reads from RNA sequencing experiments. HISAT uses an
indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini
(FM) index, employing two types of indexes for alignment: a whole-genome
FM index to anchor each alignment and numerous local FM indexes for very rapid
extensions of these alignments. PATRIC provides this service (called Host HISAT2) to
allow researchers to analyze RNA-seq data for a limited number of host genomes
that include:
  - Caenorhabditis elegans
  - Danio rerio
  - Drosophila melanogaster
  - Gallus gallus
  - Homo sapiens
  - Macaca mulatta
  - Mus musculus
  - Mustela putorius furo
  - Rattus norvegicus
  - Sus scrofa
2. Clicking on the down arrow that follows the Strategy text box (red arrow) will show
the three options. Click on the desired strategy (blue arrow).
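To make the indexing idea behind HISAT more concrete, the following minimal sketch builds a Burrows-Wheeler transform and a single FM index over a short sequence and counts exact matches of a read by backward search. It is an illustration only, not HISAT's implementation: HISAT combines one whole-genome FM index with many local ones and handles spliced alignment, none of which is shown here. The function names and the example sequence are hypothetical.

```python
from collections import Counter


def bwt(text):
    """Burrows-Wheeler transform of text, with '$' appended as the end marker."""
    text += "$"
    # Naive construction: sort all rotations and keep the last column.
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def build_fm_index(bwt_text):
    """Build the two FM-index tables needed for backward search."""
    counts = Counter(bwt_text)
    # first_col_start[ch]: row where ch first appears in the sorted first column.
    first_col_start = {}
    total = 0
    for ch in sorted(counts):
        first_col_start[ch] = total
        total += counts[ch]
    # occ[ch][i]: number of occurrences of ch in bwt_text[:i].
    occ = {ch: [0] * (len(bwt_text) + 1) for ch in counts}
    for i, ch in enumerate(bwt_text):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return first_col_start, occ


def count_matches(pattern, first_col_start, occ, n):
    """Count exact occurrences of pattern, where n is the BWT length."""
    lo, hi = 0, n  # half-open range of rows whose prefix matches so far
    for ch in reversed(pattern):
        if ch not in first_col_start:
            return 0
        lo = first_col_start[ch] + occ[ch][lo]
        hi = first_col_start[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


genome = "ACGTACGTTACG"  # hypothetical toy reference
transformed = bwt(genome)
first_col_start, occ = build_fm_index(transformed)
print(count_matches("ACG", first_col_start, occ, len(transformed)))
```

Each character of the read narrows the range of matching rows with two table lookups, so the cost of a lookup depends on the read length rather than the genome length.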
III. Filling in parameters - Target Genome
1. Researchers must select a Target Genome to align the reads against. If this genome is a
private genome, the search can be narrowed by clicking on the filter icon under the
words Target Genome (red arrow). This will open the filter where Public Genomes can
be de-selected (blue arrow).
2. Researchers can also click on the down arrow at the end of the text box (red arrow),
which will open up a drop-down box where a genome can be selected (blue arrow).
3. Researchers can also start typing the name of a genome. A box below Target Genome
will show the closest matches that can be selected.
IV. Filling in parameters - Output folder
1. Researchers who have used PATRIC before can click on the down arrow at the end of the
Output Folder text box. This will open a drop-down box that shows the folders that
exist in the workspace (red arrow).
2. Researchers who have not previously submitted an RNA-Seq job and want to create a
new folder to store the results will need to click on the folder icon at the end of the
Output Folder text box (red arrow).
3. This will open a pop-up window. To create a new folder, click on the folder icon (red
arrow), which will reload the window (black arrow) to show a text box where the new
folder can be named (blue arrow). Once the folder has been named, click on OK to
finalize it (green arrow).
4. Finally, researchers must name the RNA-seq job (red arrow).
V. Uploading reads from your computer that are not in the workspace
1. To upload reads that have not previously been uploaded into PATRIC, click on the folder
icon that follows the Read File text box (red arrows). This will open a pop-up window
(black arrow). To upload new reads, click on the upload icon (blue arrow).
2. Select the file of interest (red arrow) and then click OK (blue arrow).
3. Once a file is selected, researchers must pay particular attention to the Uploads
monitor at the bottom of the page, which will show the progress of the upload.
4. The name of the uploaded file will appear in the text box.
V. Uploading reads from your computer that are in the workspace
1. Clicking on the down arrow that follows the text box (red arrow) will open a drop-down
box where files can be selected (blue arrow).
2. Another way to upload reads that are already in the workspace is to click on the folder
icon that follows the text box (red arrow), which will open a pop-up box where reads can
be selected (blue arrow). The upload is completed by clicking OK (blue arrow).
3. Single or paired-end reads should be selected; they will then appear in the text box(es).
VI. Selecting a condition or group that will be linked to a read (Optional)
1. Metadata can be assigned to selected reads. This will make identification easier in some
of the downstream tools available on PATRIC. To do this, locate the Condition box and
click the On box (red arrow). This will make it possible to name specific conditions.
2. Name the condition (red arrow) and click the plus icon (blue arrow). This will show the
name of the condition and a color code assigned to it in the text box (green arrow). As
many conditions as desired can be entered.
3. To link the name of the condition or group to the selected reads, click on the down
arrow that follows the text box under Condition (red arrow). This will open a drop-down
box that shows all possible conditions. Click on the appropriate one (blue arrow).
4. This will autofill the Condition text box with the name of the condition.
VII. Loading the reads into a library and submitting the job
1. Clicking on the arrow icon in any read library box (red arrow) will load the reads (shown
together in the same line for paired reads) with their assigned condition into the
selected library.
2. To submit the completed job, click the Submit button (red arrow).
3. If the job was submitted successfully, a message will appear that indicates that the job
has entered the queue.
4. To check the status of the job, click on the Jobs indicator at the bottom of the
PATRIC page.
5. Clicking on Jobs opens the Jobs Status page, where researchers can see the progression
of the job as well as the status of all the previous service jobs that have been
submitted.
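As a recap of the form fields filled in so far, the sketch below models one RNA-Seq submission as a small Python record and checks it for gaps before submission. This is a hypothetical illustration only: the class names, field names, file paths and genome name are invented here and do not correspond to any PATRIC API or file format. Only the three strategy names and the set of form fields come from this document.

```python
from dataclasses import dataclass, field

# The three strategies named in this document.
STRATEGIES = ("Rockhopper", "Tuxedo", "Host HISAT2")


@dataclass
class ReadLibrary:
    """One entry in the selected libraries (hypothetical structure)."""
    files: tuple          # one file for single reads, two for paired reads
    condition: str = ""   # optional condition label


@dataclass
class RnaSeqJob:
    """The form fields of one submission (hypothetical structure)."""
    strategy: str
    target_genome: str
    output_folder: str
    job_name: str
    libraries: list = field(default_factory=list)

    def problems(self):
        """Return a list of human-readable problems; empty if none are found."""
        issues = []
        if self.strategy not in STRATEGIES:
            issues.append(f"unknown strategy: {self.strategy!r}")
        for name in ("target_genome", "output_folder", "job_name"):
            if not getattr(self, name).strip():
                issues.append(f"{name} is empty")
        if not self.libraries:
            issues.append("no read libraries selected")
        for lib in self.libraries:
            if len(lib.files) not in (1, 2):
                issues.append(f"library needs 1 or 2 files, got {len(lib.files)}")
        return issues


# Hypothetical example values.
job = RnaSeqJob(
    strategy="Rockhopper",
    target_genome="Example genome",
    output_folder="/user@patricbrc.org/home/rnaseq_results",
    job_name="example_job",
    libraries=[
        ReadLibrary(files=("control_R1.fq", "control_R2.fq"), condition="control"),
        ReadLibrary(files=("treated.fq",), condition="treated"),
    ],
)
print(job.problems())
```

Writing the parameters down this way before filling in the form can make it easier to keep conditions and read files matched when a job has many libraries.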
VIII. Examining the RNA-Seq job results
1. Once the RNA-Seq job of interest is located and clicked on in the jobs list, the vertical
green bar will become populated with the View icon (red arrow). To see the results,
click on that icon.
2. This will open the results page for the RNA-seq job.
3. Summary.txt file. Clicking on the download button opens a text file that summarizes
the steps and the different categories of feature alignment.
4. Transcripts.txt file. This text file contains the data on a per-gene basis. This data
includes the contig, the transcription and translation start and stop sites, the strand,
PATRIC and RefSeq locus tags, the functional description, estimates of abundance levels
per gene, and the q-value. The estimate of transcript abundance sums the number of
reads for a transcript and divides that by the transcript's length and a normalization
factor. The q-value is an adjusted p-value, taking into account the false discovery rate
(FDR).
5. BAM files. PATRIC also provides BAM files. BAM is the compressed binary version of
the Sequence Alignment/Map (SAM) format, a compact and index-able representation
of nucleotide sequence alignments. BAM files can be uploaded into a genome browser so
that researchers can see the alignment of the reads compared to the annotation for the
genome in question.
6. Gene_exp.gmx file. The GMX file format is a tab-delimited file format that describes
gene sets or other collections of elements. The RNA-seq gene_exp.gmx file contains a
list of the genes and the ratio of their expression between two conditions.
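The two computed columns described for the Transcripts.txt file can be illustrated with a short sketch. It rests on two assumptions that this document does not confirm: the abundance estimate is read here as read count divided by the product of transcript length and normalization factor, and the FDR adjustment is taken to be the Benjamini-Hochberg procedure, one common choice. The function names and input values are hypothetical.

```python
def abundance(read_count, transcript_length, normalization_factor):
    """Reads for a transcript, divided by its length and a normalization factor.

    Assumption: the two divisors are multiplied together.
    """
    return read_count / (transcript_length * normalization_factor)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: used here as an example of an FDR adjustment; the document
    does not say which procedure produces the reported q-values.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above its own.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical inputs.
print(abundance(read_count=200, transcript_length=1000, normalization_factor=2.0))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Under this adjustment a gene's q-value is never smaller than its p-value, which is why a gene can pass a p-value cutoff yet fail the same cutoff applied to the q-value.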
References
1. McClure, R., et al., Computational analysis of bacterial RNA-Seq data. Nucleic Acids Res,
2013. 41(14): p. e140.
2. Kim, D., et al., TopHat2: accurate alignment of transcriptomes in the presence of
insertions, deletions and gene fusions. Genome Biol, 2013. 14(4): p. R36.
3. Kim, D., B. Langmead, and S.L. Salzberg, HISAT: a fast spliced aligner with low memory
requirements. Nat Methods, 2015. 12(4): p. 357-60.