* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Functional Genomics I: Transcriptomics and
Heritability of IQ wikipedia , lookup
X-inactivation wikipedia , lookup
RNA interference wikipedia , lookup
Epigenetics of diabetes Type 2 wikipedia , lookup
Therapeutic gene modulation wikipedia , lookup
Short interspersed nuclear elements (SINEs) wikipedia , lookup
Oncogenomics wikipedia , lookup
Public health genomics wikipedia , lookup
Epigenetics of neurodegenerative diseases wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Metagenomics wikipedia , lookup
History of genetic engineering wikipedia , lookup
Pathogenomics wikipedia , lookup
Long non-coding RNA wikipedia , lookup
Quantitative trait locus wikipedia , lookup
Essential gene wikipedia , lookup
Microevolution wikipedia , lookup
Polycomb Group Proteins and Cancer wikipedia , lookup
Designer baby wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
Nutriepigenomics wikipedia , lookup
Gene expression programming wikipedia , lookup
Mir-92 microRNA precursor family wikipedia , lookup
Genome evolution wikipedia , lookup
Genome (book) wikipedia , lookup
Genomic imprinting wikipedia , lookup
Ridge (biology) wikipedia , lookup
Minimal genome wikipedia , lookup
Biology and consumer behaviour wikipedia , lookup
ExploringTranscriptomicdata 1. ExploringRNAsequencedatainPlasmodiumfalciparum. Note:Forthisexerciseusehttp://www.plasmodb.org a. Find all genes in P. falciparum that are up-regulated during the later stages of the intraerythrocyticcycle. - Hint:Usethefoldchangesearchforthedataset“Transcriptomeduringintraerythrocytic development (Bartfai et al.)”. For this data set, synchronized Pf3D7 parasites were assayedbyRNA-seqat8time-pointsduringtheiRBCcycle.Wewanttofindgenesthat areup-regulatedinthelatertimepoints(30,35,40hours)usingtheearlytimepoints(5, 10,15,20,25hours)asreference. There are a number of parameters to manipulate in this search. As you modify parametersontheleftsidenotethedynamichelpontherightside.Seescreenshots. - Direction:thedirectionofchangeinexpression.Chooseup-regulated. - Fold Change>= the intensity of difference in expression needed before a gene is returnedbythesearch.Choose12butfeelfreetomodifythis. - Betweeneachgene’sAVERAGEexpressionvalue: Thisparameterappearsonceyouhave chosentwoReferenceSamplesanddefinestheoperationappliedtoreferencesamples. Foldchangeiscalculatedastheratiooftwovalues(expressioninreference)/(expression incomparison).Whenyouchoosemultiplesamplestoserveasreference,wegenerate one number for the fold change calculation by using the minimum, maximum, or average.Chooseaverage - Reference Sample: the samples that will serve as the reference when comparing expressionbetweensamples.choose5,10,15,20,25 - Andit’sAVERAGEexpressionvalue: Thisistheoperationappliedtocomparisonsamples. seeexplanationabove.Chooseaverage - ComparisonSample:thesamplethatyouarecomparingtothereference.Inthiscaseyou areinterestedingenesthatareup-regulatedinlatertimepointschoose30,35,40 b. - For the genes returned by the search, how does the RNA-sequence data compare to microarraydata? Hint:PlasmoDBcontainsdatafromasimilarexperimentthatwasanalyzedbymicroarray insteadofRNAsequencing.Thisexperimentiscalled:Erythrocyticexpressiontimeseries (3D7, DD2, HB3) (Bozdech et al. and Linas et al.) or Pf-iRBC 48hr for shorter column headings.TodirectlycomparethedataforgenesreturnedbytheRNA-seqsearchthat youjustran,addthecolumncalled“Pf-iRBC48hr-Graph”. OPTIONAL: You can also run a fold change search using this experiment to compare resultsonagenomescale.Addasteptoyourstrategyandintersecttheresultsofafold changesearchusingthe“Erythrocyticexpressiontimeseries(3D7,Dd2,HB3)(Bozdechet al.andLinasetal.)”experiment(undermicroarrayevidence).Configureitsimilarlytothe RNA-seq experiment although you will probably need to make the fold change smaller (try2or3)duetothedecreaseddynamicrangeofmicroarrayscomparedtoRNA-seq. 2. ExploringmicroarraydatainTriTrypDB. Note:Forthisexerciseusehttp://www.tritrypdb.org a. Find T. cruzi protein coding genes that are upregulated in amastigotes compared to trypomastigotes.Gotothetranscriptexpressionsectionthenselectmicroarray.Choose the fold change (FC) search for the data set called: Transcriptomes of Four Life-Cycle Stages(Minningetal.). - Selectthedirectionofregulation,yourreferencesampleandyourcomparisonsample.For thefoldchangekeepthedefaultvalue2. - Howmanygenesdidyoufind?Dotheresultsseem plausible? - Are any of these genes also up-regulated in the replicative insect stage (epimastigotes)? How can you find this out? (Hint: add a step and run a microarray search comparing expression of epimastigotestometacyclics). - Dothesegeneshaveorthologsinotherkinetoplastids?(Hint:addastepandrunanortholog transformonyourresults). - How many orthologs exist in L. braziliensis? (Hint: look at the filter table between the strategy panel and your result list. Click on the number in of gene to view results from a specific species). Explore your results. Scan the product descriptions for this list of genes. Did you find anything interesting? Perhaps a GO enrichment analysis would support your ideas. 3. FindinggenesbasedonRNAseqevidenceandinferringfunctionofhypotheticalgenes. Note:Usehttp://plasmodb.orgforthisexercise. a. Find all genes in P. falciparum that are up-regulated at least 50-fold in ookinetes compared to other stages: “Transcriptomes of 7 sexual and asexual life stages (LopezBarragan et al.)”. For this search select “average” for the operation applied on the referencesamples. b. - The above search will give you all genes that are up-regulated by 50 fold in ookinetes comparedtotheotherstages.Despitethehighfoldchange,somegenesinthelistmay behighlyexpressedintheotherstages.Howcanyouremovegenesfromthelistthatare highlyexpressedintheotherstages? Hint:RunasearchforgenesbasedonRNASeqevidencefromthesameexperiment,but thistimeselectthepercentilesearch:P.f.sevenstages-RNASeq(percentile).What minimalpercentilevaluesshouldyouchoose?40–100%? c. Which metabolic pathways are represented in this gene list? Hint: add a step and transform results to pathways. How does this result compare to running a pathways enrichmentonstep2? d. Whathappensifyourevisethefirststepandmodifythefolddifferencetoalowervalue- 10forexample? e. PlasmoDBalsohasanexperimentexamininggeneexpressionduringsexualdevelopment inPlasmodiumberghei(rodentmalaria).Canyoudetermineiftherearegenesthatare up-regulatedinbothhumanandrodentookinetes(comparedtoallotherstages)?Hint: startbydeletingthelaststepyouaddedinthisexercise(transformtopathways).Todo thisclickoneditthendeleteinthepopup.Next,addstepsfortheP.bergheiexperiments “PbergheiANKA5asexualandsexualstagetranscriptomesRNASeq”.Notethatyouwill have to use a nested strategy or by running a separate strategy then combining both strategies. . 4. FindgenesthatareessentialinprocyclicsbutnotinbloodformT.brucei. Note:forthisexerciseusehttp://TriTrypDB.org. - FindthequeryforHighThroughputPhenotyping.Thinkabouthowtosetupthisquery (Hint:youwillhavetosetupatwo-stepstrategy).Rememberyoucanplayaroundwith the parameters but there is no one correct way of setting them up – try the default parametersfirstandselectthe“inducedprocyclics”asthecomparisonsample. - Next add a step and run the same search except this time select the “induced bloodstreamform”samples. - Howdidyoucombinetheresults?Rememberyouwanttofindgenesthatareessential inprocyclicsandnotinbloodform. 5. a. FindingoocystexpressedgenesinT.gondiibasedonmicroarrayevidence. Note:Forthisexerciseusehttp://toxodb.org Findgenesthatareexpressedat10foldhigherlevelsinoneoftheoocyststagesthanin any other stage in the “Oocyst, tachyzoite, and bradyzoite developmental expression profiles (M4) (John Boothroyd)” microarray experiment. In this example, the maximum expression value between genes in the reference and comparison groups was used to determinethefolddifference. b. Addasteptolimitthissetofgenestoonlythoseforwhichallthenon-oocyststagesare expressedbelow50thpercentile…ielikelynotexpressedatthosestages.(Hint:afteryou clickonaddstepfindthesameexperimentundermicroarrayexpressionandchosethe percentilesearch). - Selectthe4non-oocystsamples. - We want all to have less than 50th percentile so set minimum percentile to 0 and maximumpercentileto50. - - c. Sincewewantallofthemtobeinthisrange, chooseALLinthe“MatchesAnyorAllSelected Samples”. Toviewthegraphsinthefinalresulttable, turnonthecolumnscalled“Tg-M4LifeCycle Stages–graph”and“Tg-M4LifeCycleStage %ile-graph”(insidethe“Tg-LifeCycle”Microarray). Revise the first step of this strategy and compare the maximum expression of the referencesamplestotheminimumofthecomparisonsamples. - Does this result convincing?Why? look cleaner/more - Wouldyouconsiderthesegenestobeoocyst specific? Save this strategy so that you can use it for an exercisewearedoinglaterduringthecourse. d. Revisethefirststepofthisstrategytofindgenesthatare3foldhigherinday4oocyststhan anyotherlifecyclestageinthisexperiment. - Doallthesegeneshaveday4oocystsastheglobalmaximumtimepoint? - Notethatwestillhavethesteptolimitthepercentileofnon-oocystsamplesto<=50th percentile.Whathappensifyourevisethissteptoalsoincludetheunsporulatedandday 10 oocyst samples in this percentile range? Do you get more of fewer results back? Why? 6. ComparingRNAabundanceandProteinabundancedata. Note:forthisexerciseusehttp://TriTrypDB.org. InthisexercisewewillcomparethelistofgenesthatshowdifferentialRNAabundancelevels between procyclic and blood form stages in T. brucei with the list of genes that show differentialproteinabundanceinthesesamestages. Findgenesthataredown-regulated2-foldinprocyclicformcells.Gotothesearchpage forGenesbyMicroarrayEvidenceandselectthefoldchangesearchforthe“Expression profilingoffivelifecyclestages(MarilynParsons)”experimentandconfigurethesearch to return protein-coding genes that are down-regulated 2 fold in procyclic form (PCF) relative to the Blood Form reference sample. Since there are two PCF samples, it is reasonabletochoosebothandaveragethem. a. b. Add a step to compare with quantitative protein expression. Select protein expression then “Quantitative Mass Spec Evidence” and the "Quantitative phosphoproteomes of bloodstream and procyclic forms (Tb427) (Urbaniak et al.)" experiment. Configure this searchtoreturngenesthataredown-regulatedinprocyclicformrelativetobloodform. c. d. e. Howmanygenesareintheintersection?Doesthismakesense?Makecertainthatyou setthedirectionscorrectly. Try changing directions and compare up-regulated genes/proteins. (Hint: revise the existing strategy … you might want to duplicate it so you can keep both). When you change one of the steps but not the other do you have any genes in the intersection? Whymightthisbe? Can you think of ways to provide more confidence (or cast a broader net) in the microarray step? (Hint: you could insert steps to restrict based on percentile or add a RNASequencingstepthathasthesamesamples). 7.FindgeneswithevidenceofphosphorylationinintracellularToxoplasmatachyzoites. Forthisexerciseusehttp://www.toxodb.org PhosphorylatedpeptidescanbeidentifiedbysearchingtheappropriateexperimentsintheMass SpecEvidencesearchpage. 7a. Find all genes with evidence of phosphorylation in intracellular tachyzoites. Select the “Infectedhostcell,phosphopeptide-enriched(peptidediscoveryagainstTgME49)”sampleunder theexperimentcalled“Tachyzoitephosphoproteomefrompurifiedparasiteorinfectedhostcell (RH)(Treecketal.)” 7b. Removeallgeneswithphosphorylationevidencefrompurifiedtachyzoites. 7c. Removeallgenesthatarealsopresentinthephosphopeptide-depletedfractions(select bothintracellularandextracellular). 7d. Exploreyourresults.Whatkindsofgenesdidyoufind?Hint:usetheProductdescription wordcolumnorperformaGOenrichmentanalysisofyourresults.Couldyouachievethissame 105geneswithatwostepstrategy?Hint:removedepletedandtachozoiteproteinsinonestep ratherthantwo. 7e. Areanyofthesegeneslikelytobesecreted?Hint:addastepsearchingforgeneswith secretorysignalpeptides. 7f. Pickoneortwoofthehypotheticalgenesinyourresultsandvisittheirgenepages.Can youinferanythingabouttheirfunction?Hint:exploretheproteinandexpressionsections. 7g. Whataboutpolymorphismdata?GobacktoyourstrategyandaddcolumnsforSNPdata found under the population biology section. Explore the gene page for the gene that has the most number of non-synonymous SNPs. Hint: you can sort the columns by clicking on the up/downarrowsnexttothecolumnnames.