Scenario 4 Analysis: Discovery of coregulated genes

What follows is a simulation of the proposed graphical interface. As you go through the simulation, please consider what capabilities you would want to serve your research and annotation interests. A narrative to help you go through the simulation appears in a red-bordered box.

To begin:
1. Click on Slide Show (on the upper toolbar)
2. Click View Show
3. Click the Continue button

How do cells control response to light? What genes are related to the adaptation to high light?

Look for:
• Gene present in Prochlorococcus MED4 (MED4 is naturally adapted to grow in high light)
• Ortholog absent in Prochlorococcus MIT9313 (MIT9313 is naturally adapted to grow in low light)
• Ortholog present in Synechocystis PCC 6803 (reason will become apparent in a moment)
• Synechocystis PCC 6803 ortholog responds to high light (gene turns on by a factor > 2 in response to high light)

The screen shows these buttons: Build set, Display set, Set operation, HELP, Cancel, Variable, Data, Operation, Function, Done.

• Click Build set to start building a new set
• Click Display set to see set you or someone else made
• Click Set operation to see statements to manipulate sets
• Click Variable to give a value to a new or old variable
• Click Data to access results of experiments
• Click Operation to see list of available statements
• Click Function to see list of available manipulations
• Click any red button to get help

Click Build set to begin finding orfs with the desired specifications.

You want to go through all MED4 ORFs. Click Operation to see how to do that.
The Operation menu offers:
• Consider X in set... (loop): click to go through each element of a set
• IF... THEN... OTHERWISE: click to perform actions only under certain conditions

You want to consider each ORF in the set of all MED4 ORFs. Click Consider X in set…

Consider each item in: choose ORFs as the set type and Prochlorococcus MED4 as the database.

Choose set type:
• All nucleotides of
• All open reading frames of
• All amino acid sequences of
• All intergenic regions of
• Human-annotated orfs of
• Private set
• Public set

Choose database:
• Arthrobacter platensis
• Gloeobacter violaceus
• Microcystis aeruginosa
• Nostoc punctiforme
• Nostoc PCC 7120
• Prochlorococcus MED4
• Prochlorococcus MIT9313
• Prochlorococcus S120
• Synechococcus PCC6301
• Synechococcus PCC7942
• Synechococcus WH8102
• Synechocystis PCC 6803
• Thermosynechococcus
• Trichodesmium
• Unicellular
• Filamentous
• All

Script so far:
  Consider each item in All open reading frames of Prochlorococcus MED4 :

You want to consider this MED4 ORF only if an ortholog in MIT9313 doesn’t exist. Click Operation and choose IF... THEN...

The Operation menu now offers:
• Consider X in set... (loop): click to go through each element of a set
• IF... THEN... OTHERWISE: click to perform actions only under certain conditions
• Keep item: click to add item to set

Click Variable or Function to begin specifying the condition to be met. Your condition is that the ortholog of the item in MIT9313 doesn’t exist. Ortholog of… is a function.
The Function menu offers:
• Ortholog of
• Protein product of
• Sequence of
• Upstream region of
• Downstream region of

Ortholog of ( ... The choices shown are Variable, Item, Set and Specify. You want the ortholog of the item (the specific ORF of MED4 being considered)…

... And you want the ortholog of the item in Prochlorococcus MIT9313. Choose it from the same list of organisms under Choose database.

You want the ortholog of the item in Prochlorococcus MIT9313 not to exist. Click a specific operation (=, exists, doesn’t exist) to continue specifying the condition to be met, or click Variable to save the results of the function.

Script so far:
  Consider each item in All open reading frames of Prochlorococcus MED4 :
  IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist

Click a logical operation (AND, OR, BUT NOT) to continue specifying the condition to be met, or IF... THEN... to end the condition.

Let’s pause to see where we are in the task at hand. (Click to proceed)

How do cells control response to light? What genes are related to the adaptation to high light?

Look for:
√ Gene present in Prochlorococcus MED4 (MED4 is naturally adapted to grow in high light)
√ Ortholog absent in Prochlorococcus MIT9313 (MIT9313 is naturally adapted to grow in low light)
• Ortholog present in Synechocystis PCC 6803 (reason will become apparent in a moment)
• Synechocystis PCC 6803 ortholog responds to high light (gene turns on by a factor > 2 in response to high light)

There are more conditions to fulfill, so click AND.

Click Variable or Function to continue specifying the condition to be met. Your condition now is that the ortholog of the item in Synechocystis does exist. Ortholog of… is a function.
Choose Ortholog of from the Function menu again (Ortholog of, Protein product of, Sequence of, Upstream region of, Downstream region of).

You want the ortholog of the item (the specific ORF of MED4 being considered)… so choose Item.

You want the ortholog of the item in Synechocystis PCC 6803. Choose it under Choose database.

You want the ortholog of the item in Synechocystis PCC 6803, this time to exist... but first you need to save the ortholog for later. To do this click Variable (to save results of function).

Type variable name: 6803 ortholog
Give the ortholog a logical name so that you can refer to it later (for now, just click on the box and I’ll do the typing).

Now you can demand that the ortholog of the item in Synechocystis PCC 6803 exists. Click a specific operation (=, exists, doesn’t exist) to continue specifying the condition to be met.

Script so far:
  Consider each item in All open reading frames of Prochlorococcus MED4 :
  IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist
  AND Ortholog of ( Item in Synechocystis PCC 6803 ) (assigned to 6803 ortholog) exists

Click a logical operation (AND, OR, BUT NOT) to continue specifying the condition to be met, or IF... THEN... to end the condition. Still one more condition (2x expression), so click AND.

Click Variable or Function to continue specifying the condition to be met. This time the condition to be met concerns data from a microarray experiment. Click Data.
data for ... The choices shown are Variable, Item, 6803 ortholog and Specify. The data desired concerns the 6803 ortholog.

Choose organism used:
• Microcystis aeruginosa
• Nostoc punctiforme
• Nostoc PCC 7120
• Prochlorococcus MED4
• Prochlorococcus MIT9313
• Prochlorococcus S120
• Synechococcus PCC6301
• Synechococcus PCC7942
• Synechococcus WH8102
• Synechocystis PCC 6803

The data desired concerns the 6803 ortholog and an experiment using Synechocystis PCC 6803.

Choose type: microarray, 2D gel
It’s a microarray experiment…

Choose expt: Hihara1, Suzuki1, Yoshimura1
It’s a microarray experiment, Hihara et al, you think… Mouse over that experiment to see (here click on it). The description shown is: High light vs low light experiment.

The comparison operations offered are <, < or =, =, > or =, >. You want the experimental condition (high light) to be greater than the control…

Value: > +2
You want the experimental condition (high light) to be greater than the control by a factor of 2 (I’ll type it).

Script so far:
  Consider each item in All open reading frames of Prochlorococcus MED4 :
  IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist
  AND Ortholog of ( Item in Synechocystis PCC 6803 ) (assigned to 6803 ortholog) exists
  AND data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1 > +2

Click a logical operation (AND, OR, BUT NOT) to continue specifying the condition to be met, or IF... THEN... to end the condition. (Let’s pause again)

How do cells control response to light? What genes are related to the adaptation to high light?

Look for:
√ Gene present in Prochlorococcus MED4 (MED4 is naturally adapted to grow in high light)
√ Ortholog absent in Prochlorococcus MIT9313 (MIT9313 is naturally adapted to grow in low light)
√ Ortholog present in Synechocystis PCC 6803 (reason will become apparent in a moment)
√ Synechocystis PCC 6803 ortholog responds to high light (gene turns on by a factor > 2 in response to high light)

That’s it! We’ve specified all the conditions, so end the IF segment and specify what to do if all the conditions are met. Click IF... THEN... to end the condition.

THEN ... And if they are met, what you want to do is to save the gene in the set you’re building, a Set operation.
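For readers who think in code, the condition assembled in the IF segment can be sketched in Python. This is only an illustrative sketch, not what the interface generates: the function name light_specific_candidates and the two lookup callables (ortholog_of and high_vs_low_ratio) are hypothetical stand-ins for the knowledge base and for the Hihara1 microarray data, and the organism labels are plain strings chosen for the example.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def light_specific_candidates(
    med4_orfs: Iterable[str],
    ortholog_of: Callable[[str, str], Optional[str]],
    high_vs_low_ratio: Callable[[str], Optional[float]],
    min_ratio: float = 2.0,
) -> List[Tuple[str, str]]:
    """Return (MED4 ORF, 6803 ortholog) pairs that meet all the conditions.

    ortholog_of(gene, organism) is a hypothetical lookup returning the
    ortholog's name, or None if no ortholog exists.
    high_vs_low_ratio(gene) is a hypothetical lookup returning the
    high-light / low-light expression ratio, or None if there is no data.
    """
    selected = []
    for item in med4_orfs:  # "Consider each item in ..."
        # Condition 1: ortholog in MIT9313 doesn't exist.
        if ortholog_of(item, "MIT9313") is not None:
            continue
        # Condition 2: ortholog in PCC 6803 exists (and is saved for later).
        ortholog_6803 = ortholog_of(item, "PCC6803")
        if ortholog_6803 is None:
            continue
        # Condition 3: expression of the 6803 ortholog rises by a factor > 2.
        ratio = high_vs_low_ratio(ortholog_6803)
        if ratio is not None and ratio > min_ratio:
            selected.append((item, ortholog_6803))
    return selected
```

Returning both the MED4 ORF and its 6803 ortholog leaves open which of the two is finally stored in the set.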
The Set operation menu offers:
• ADD TO
• DELETE FROM
• UNION OF
• INTERSECTION OF
• Arithmetic

In other words, you want to ADD the gene to the growing set.

Add ... The choices shown are Variable, Item, 6803 ortholog and Specify. Which gene? I could save the Prochlorococcus gene (the item), but I’ll instead save the gene from PCC 6803 (more known about them).

I need to give the set a logical name (you click, I’ll type).

Type name of set: Light-specific genes

Script so far:
  IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist
  AND Ortholog of ( Item in Synechocystis PCC 6803 ) (assigned to 6803 ortholog) exists
  AND data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1 > +2
  THEN Add 6803 ortholog to Light-specific genes

If I stop here, then all Prochlorococcus genes will be considered, and if the conditions are met, the 6803 ortholog will be saved. That’s what I want, so click Done.

That was a complicated script, so I’ll save it in case I (or someone else) needs to run it again or modify it. The choices are Save results and script, or Save only results.

Equivalent script that bypasses interface:

    (loop for item in (#^Genes ProcMed4)
          as all-orthologous = (all-blast-orthologous-geneIDs item stdevalue)
          as 6803ortholog = (#^Genes Syny6803) :in all-orthologous)
          as light-specific-genes = nil
          when (and (there-are-not-any #'member-geneID-of-gene-frames (#^Genes slotv Proc9313) :in all-orthologous))
                    (there-are-any #'member-geneID-of-gene-frames 6803ortholog)
                    (>= ratio-value (select-matching-geneIDs-from-table Hihara1 item) 2))))
          collect light-specific-genes 6803ortholog)

This is the script that the interface would have produced. It looks for all the world like a computer program. In fact, it is a computer program, written in BioLingua.

Set: Light-specific genes
  Syny6803:sll0990   Formaldehyde dehydrogenase (glutathione dependent)
  Syny6803:slr1332   fabF beta ketoacyl acyl carrier protein synthase
  Syny6803:sll0337   Sensor histidine kinase
  Syny6803:sll0335   Hypothetical
  Syny6803:sll0789   Response regulator (OmpR)
  Syny6803:sll0576   Putative epimerase/hydratase

Here are the results of the program. The genes meeting all the conditions are given along with a brief description and a graphic display of the regions surrounding the genes. (Click to proceed)

What can you do with this set? Certainly one thing of interest is the function of the genes. Clicking on a gene name brings you to the annotation page. Try clicking on slr1332.

Annotation page (menu: Main Menu, Annotate, Options, History, HELP)

Synechocystis PCC 6803: slr1332
  Replicon: Chromosome
  Coordinates: 1670650 (start-ATG) to 1671864 (stop)
  Length = 404 amino acids
  Strand: Direct
  Gene name(s): fabF or fabJ
  Function: beta-ketoacyl-acyl carrier protein synthase
  Activity:
  In vivo activity: exists
  Cyanobacterial orthologs: Syny6803, Nost7120, NostPun, More
  Human A   Human A   Experiment A

You can find out more about this kind of page from Scenarios 1 – 3. For now, click to return to the Set Display page.
Set: Light-specific genes
  Syny6803:sll0990   Formaldehyde dehydrogenase (glutathione dependent)
  Syny6803:slr1332   fabF beta ketoacyl acyl carrier protein synthase
  Syny6803:sll0337   Sensor histidine kinase
  Syny6803:sll0335   Hypothetical
  Syny6803:sll0789   Response regulator (OmpR)
  Syny6803:sll0576   Putative epimerase/hydratase

Another interesting point of attack is the regulation of this set of genes (after all, they were selected as being coregulated by light). Perhaps the upstream regions share a common motif. (Click to continue)

Unfortunately, in three cases, the genes don’t have an upstream region. Evidently these genes are part of operons. We’d like to consider the upstream regions of the operons by adding the first genes of the operon to the set. Add to set... that’s a Set operation. (Click on that)

The Set operation menu offers ADD TO, DELETE FROM, UNION OF, INTERSECTION OF and Arithmetic. Click ADD TO...
Add to set offers:
• Find specific gene
• Add set of genes
• Specify gene
• Click on icon of gene

Click ADD TO and click on the icon (the short black arrow) upstream from sll0990, the first gene without an upstream region.

The short arrow is now named and part of the set: Syny6803:srl7009, trnR: tRNA Arg (UCU). Click on the icon (the long black arrow) upstream from slr1332, the next gene without an upstream region.

slr1331 (Processing protease) is now part of the set. Click on the icon upstream from sll0789, the third gene without an upstream region.
Set: Light-specific genes
  Syny6803:srl7009   trnR tRNA Arg (UCU)
  Syny6803:sll0990   Formaldehyde dehydrogenase (glutathione dependent)
  Syny6803:slr1331   Processing protease
  Syny6803:slr1332   fabF beta ketoacyl acyl carrier protein synthase
  Syny6803:sll0337   Sensor histidine kinase
  Syny6803:sll0335   Hypothetical
  Syny6803:sll0789   Response regulator (OmpR)
  Syny6803:sll0788   Hypothetical protein
  Syny6803:sll0576   Putative epimerase/hydratase

Now the genes presumably at the start of the operons have been added, and it’s time to remove the genes without upstream regions. Click on the radio buttons of the three genes and click Set operation.

The menu offers KEEP (delete others) and DELETE FROM. Click DELETE FROM to remove the three genes from the set.

Set: Light-specific genes
  Syny6803:srl7009   trnR tRNA Arg (UCU)
  Syny6803:slr1331   Processing protease
  Syny6803:sll0337   Sensor histidine kinase
  Syny6803:sll0335   Hypothetical
  Syny6803:sll0788   Hypothetical protein
  Syny6803:sll0576   Putative epimerase/hydratase

Now you have what you want: a set of genes coregulated by light. The game now is to extract their upstream regions and determine if that set contains a common sequence motif. So you need a new set. Click Build set.
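The hand edits just made (ADD TO three times, then DELETE FROM) amount to two ordinary set operations. The sketch below restates them in Python using the gene names from this scenario; the function name replace_with_operon_starts and the mapping operon_first_gene are hypothetical names introduced only for illustration.

```python
# Genes selected by the script, as listed in the Light-specific genes set.
light_specific_genes = {
    "sll0990", "slr1332", "sll0337", "sll0335", "sll0789", "sll0576",
}

# For each gene lacking an upstream region, the gene picked by clicking the
# icon upstream of it (presumably the first gene of its operon).
operon_first_gene = {
    "sll0990": "srl7009",
    "slr1332": "slr1331",
    "sll0789": "sll0788",
}


def replace_with_operon_starts(genes, first_gene_of):
    """Add the operon-start genes, then drop the genes they stand in for."""
    with_starts = genes | set(first_gene_of.values())  # ADD TO
    return with_starts - set(first_gene_of)            # DELETE FROM


edited_set = replace_with_operon_starts(light_specific_genes, operon_first_gene)
print(sorted(edited_set))
```

The set stays at six members: three genes are swapped for their upstream neighbours and three are untouched.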
The set will be based on the upstream regions of the set of light-specific genes. Upstream region of... that’s a Function.

The Function menu now offers:
• Ortholog of
• Protein product of
• Sequence of
• Upstream region of
• Downstream region of
• Common sequences (Meme) of

Upstream region of ( ... The choices shown are Choose variable (6803 ortholog), Set and Specify. You want not the upstream region of a variable (6803 ortholog is the only one you’ve made so far) but rather a set.
Choose set type:
• All open reading frames of
• Human-annotated orfs of
• Private set
• Public set

You have a couple of premade sets available, but you want your own. Click Private set.

Choose set: Light-specific genes
You’ve made only one set. Choose it... and give the result a name (I’ll type it).

Type set name: Upstream light-sp genes

Script so far:
  Upstream region of ( Light-specific genes ) assigned to Upstream light-sp genes

You want to run this new set through a filter that will give you conserved motifs. That’s a Function.
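What Upstream region of... returns is not spelled out in this scenario. One plausible reading, sketched below in Python, is the intergenic stretch between a gene and its nearest neighbour on the upstream side, which would explain why a gene inside an operon can have no upstream region at all. Everything here is an assumption made for illustration: the Gene record, 0-based half-open coordinates, non-overlapping genes, and the strand labels "D" (direct) and "R" (reverse).

```python
from dataclasses import dataclass
from typing import Sequence

COMPLEMENT = str.maketrans("ACGT", "TGCA")


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # 0-based, exclusive (assumed convention)
    strand: str  # "D" for direct, "R" for reverse


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome: str, genes: Sequence[Gene], gene: Gene) -> str:
    """Return the intergenic sequence just upstream of gene ('' if none).

    Assumes genes do not overlap. The result is read on the gene's own strand.
    """
    others = [g for g in genes if g != gene]
    if gene.strand == "D":
        # Upstream lies at lower coordinates, back to the nearest gene end.
        boundary = max((g.end for g in others if g.end <= gene.start), default=0)
        return genome[boundary:gene.start]
    # Reverse strand: upstream lies at higher coordinates.
    boundary = min(
        (g.start for g in others if g.start >= gene.end), default=len(genome)
    )
    return reverse_complement(genome[gene.end:boundary])
```

Under this reading, a gene that directly abuts its upstream neighbour yields an empty string, matching the three genes in the scenario that had no upstream region.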
Statement: Upstream region of Light-specific genes assigned to Upstream light-sp genes

Function menu:
• Ortholog of
• Protein product of
• Sequence of
• Upstream region of
• Downstream region of
• Common sequences (Meme) of

The function you want, Meme, analyzes a set of sequences for statistically overrepresented motifs.

Statement so far: Common sequences of ( Choose set ) assigned to ( Type variable name )
Choose set: Light-specific genes, Upstream light-sp genes, Public set, Premade set

You now have two private sets. You want, of course, the set of upstream regions of the light-specific genes: Upstream light-sp genes.

You click and I’ll type in a logical name.

Statements:
Upstream region of Light-specific genes assigned to Upstream light-sp genes
Common sequences of Upstream light-sp genes assigned to Memed light-sp genes

Click Done to put your plans into action and to display the last defined set.
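The idea of a motif shared by a set of sequences can be illustrated with a deliberately simple Python sketch. This is not the MEME algorithm, which fits a statistical model and tolerates mismatches between sites; the sketch only reports exact k-mers that occur in at least a chosen fraction of the sequences. The function name and the `min_fraction` parameter are hypothetical.

```python
from collections import Counter


def shared_kmers(sequences, k, min_fraction=1.0):
    """Return the k-mers found in at least `min_fraction` of the sequences.

    Each sequence contributes at most one count per k-mer, so a k-mer
    repeated many times in one sequence is not mistaken for a shared one.
    """
    counts = Counter()
    for seq in sequences:
        # A set removes repeats of the same k-mer within one sequence.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    needed = min_fraction * len(sequences)
    return sorted(kmer for kmer, n in counts.items() if n >= needed)
```

Exact matching is far stricter than what a motif finder does: sites that differ at one or two positions would be missed by this sketch, which is one reason a statistical tool is used in practice.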
Set: Memed upstream light-sp genes

Sequence                        Len   Orient.   Pos   E-val
Upstream of Syny6803:srl7009    638   D         509   6.38e-05
Upstream of Syny6803:slr1331    159   D         152   4.38e-05
Upstream of Syny6803:sll0337    138   D          28   1.31e-04
Upstream of Syny6803:sll0335    279   D         183   5.30e-05
Upstream of Syny6803:sll0788    221   D         184   1.09e-05
Upstream of Syny6803:sll0576     79   D          34   3.29e-05

And there you have it! A conserved octameric sequence found in front of all six light-regulated genes (and, gratifyingly, all in the same orientation: Direct rather than Reverse). Is the sequence involved in gene regulation? Only experiments will tell...

Left flank: AAATATGGGA, GGCCCATGGG, AGCTTAAAAA, AAGGGTTAGC, TGCTTTGCCA, TTTTTTGCTT
Motif: GAATGGAA, GACTAGGA, GACTAGAA, GACTGGAG, GACTGGAA, TACTGGGA
Right flank: TTGAGTAGCA, TTCAATGGGT, TTGGCAAAAC, TTAGAGAAGG, ACGGATATTT

What have we done?
• Drawn on complex knowledge base
• Combined multiple tools
• Written a computer program
• Modified set by hand
Did it ourselves

Scenario 4 Analysis: Discovery of coregulated genes
Summary
• The graphical interface facilitates searches using functions, loops and Boolean operations, with few of the complexities of most computer languages
• The graphical interface facilitates searching through experimental data for orfs with desired properties
• The script interface permits interaction between the search and display capabilities of the web site and outside resources.

Reminder: This was a simulation. The underlying language (BioLingua) exists, but not the interface that facilitates access.

End
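The six motif instances listed in the simulated result can be summarized as a consensus sequence. The Python sketch below tallies the bases in each column and takes the most frequent one. The site strings are copied from the result display; the function names are hypothetical, and a majority-vote consensus is only one simple way to summarize a motif.

```python
from collections import Counter

# The six octamer sites shown in the simulated result set.
MOTIF_SITES = [
    "GAATGGAA",
    "GACTAGGA",
    "GACTAGAA",
    "GACTGGAG",
    "GACTGGAA",
    "TACTGGGA",
]


def column_counts(sites):
    """Count the bases observed at each position across aligned sites."""
    if len({len(s) for s in sites}) != 1:
        raise ValueError("all sites must have the same length")
    return [Counter(column) for column in zip(*sites)]


def consensus(sites):
    """Return the most frequent base at each position."""
    return "".join(c.most_common(1)[0][0] for c in column_counts(sites))


def invariant_positions(sites):
    """Return the 1-based positions at which every site has the same base."""
    return [i + 1 for i, c in enumerate(column_counts(sites)) if len(c) == 1]


print(consensus(MOTIF_SITES))            # GACTGGAA
print(invariant_positions(MOTIF_SITES))  # [2, 4, 6]
```

Only three of the eight positions are identical in all six sites, which shows why an exact-match search would not have recovered this motif.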