The Gene Ontology Project: An Introduction
23/05/17
EBI is an Outstation of the European Molecular Biology Laboratory.

There is a lot of biological research output.

Search on mesoderm development…
You get 6752 results! How will you ever find what you want?

Another example…
Microarray data shows changed expression of thousands of genes. How will you spot the patterns?
[Figure: gene tree (Pearson correlation) of microarray expression over time, attacked v. control; gene list: all genes (14010). Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.]

Scientists work hard.
There are lots of papers to read.
more every week.
and more…
more and more and more!
Help!
[Images: http://www.teamtechnology.co.uk/f-scientist.jpg ; http://www.kilbot.com.au/wp-content/shop/careful-scientist.gif]

Ontology is a way to capture knowledge in a written and computable form.
Computable means that the computer finds patterns so we don’t have to.

eBay search (keyword ‘lead’) v.
PubMed search (keyword ‘flower’)

Demo and practical work

The Gene Ontology

This is our browser.
Search on mesoderm development.
Here is mesoderm development.
Definition of mesoderm development.
Gene products involved in mesoderm development.

There are many gene products involved in mesoderm development, but fewer gene products than papers.
You can read papers describing what is known about them.

[Figure: gene tree (Pearson correlation) of microarray expression over time, attacked v. control; gene list: all genes (14010). Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.]

See which processes are upregulated or downregulated.
[Figure: the microarray gene tree over time, attacked v. control, labelled with: Defense response; Immune response; Response to stimulus; Toll regulated genes; JAK-STAT regulated genes; Puparial adhesion; Molting cycle hemocyanin; Amino acid catabolism; Lipid metabolism; Peptidase activity; Protein catabolism. Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.]

Practical work: Search AmiGO
Did you find your favourite gene product or process?

How does the Gene Ontology work?
The Gene Ontology is like a dictionary
term: transcription initiation
id: GO:0006352
definition: Processes involved in the assembly of the RNA polymerase complex at the promoter region of a DNA template resulting in the subsequent synthesis of RNA from that promoter.

The whole system.
[Figure: Clark et al., 2005; relationships shown: is_a, part_of]

An example…
Mitochondrial P450 (CC24 PR01238; MITP450CC24)

Where is it?
Mitochondrial P450: mitochondrial inner membrane
GO cellular component term: GO:0005743

What does it do?
substrate + O2 = CO2 + H2O product
monooxygenase activity
GO molecular function term: GO:0004497

Which process is this?
electron transport
GO biological process term: GO:0006118
[Image: http://ntri.tamuk.edu/cell/mitochondrion/krebpic.html]

The whole system. (Clark et al., 2005; is_a, part_of)

The Gene Ontology is for all species, and that means we have to *bridge* some language barriers.

Same name, same thing?
Bridge of Sighs, Cambridge. http://www.darknessandlight.co.uk/cambridge_photographs.html
Ponte dei Sospiri, Venice. http://www.lockeheemstra.com/italy/bridge-of-sighs-venice.html

In biology…
Tactition, Taction, Tactile sense: ?
Tactition, Taction, Tactile sense: perception of touch ; GO:0050975

Bud initiation?
= tooth bud initiation
= reproductive bud initiation
= branch bud initiation

Demo: Writing an ontology
The car ontology
Demo: The gene ontology

Categorization of gene products using GO is called annotation.
So how does that happen?

Choose your favourite gene. P05147
Find a paper about it. PMID: 2976880
Find the GO term describing its function, process or location of action. GO:0047519
What evidence do they show?
IDA

Write these down…
P05147 GO:0047519 IDA PMID:2976880

Send to the GO Consortium.

Finding annotations in a paper
…for B. napus PERK1 protein (Q9ARH1)
In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein…these kinases have been implicated in early stages of wound response…
PubMed ID: 12374299

Function: protein serine/threonine kinase activity GO:0004674
Component: integral to plasma membrane GO:0005887
Process: response to wounding GO:0009611

Annotation details

Where to get annotations?
• Non-redundant species database
  • Contains all GO annotations for given species + other information.
  • http://www.arabidopsis.org/
• Multispecies database - GOA
  • Contains all GO annotations.
  • http://beta.uniprot.org/

Evidence codes

IDA - inferred from direct assay
• Enzyme assays
• In vitro reconstitution (e.g. transcription)
• Immunofluorescence (for cellular component)
• Cell fractionation (for cellular component)
• Physical interaction/binding

IEP - inferred from expression pattern
• Transcript levels (e.g. Northerns, microarray data)
• Protein levels (e.g. Western blots)

IGC - inferred from genomic context
• Operon structure
• Syntenic regions
• Pathway analysis
• Genome-scale analysis of processes

IGI - inferred from genetic interaction
• "Traditional" genetic interactions such as suppressors, synthetic lethals, etc.
• Functional complementation
• Rescue experiments
• Inference about one gene drawn from the phenotype of a mutation in a different gene.

IMP - inferred from mutant phenotype
• Any gene mutation/knockout
• Overexpression/ectopic expression of wild-type or mutant genes
• Anti-sense experiments
• RNAi experiments
• Specific protein inhibitors
• Polymorphism or allelic variation

IPI - inferred from physical interaction
• 2-hybrid interactions
• Co-purification
• Co-immunoprecipitation
• Ion/protein binding experiments

ISS - inferred from sequence or structural similarity
• Sequence similarity (homologue of/most closely related to)
• Recognized domains
• Structural similarity
• Southern blotting

RCA - inferred from reviewed computational analysis
• Large-scale protein-protein interaction experiments
• Microarray experiments
• Integration of large-scale datasets of several types
• Text-based computation

IEA - inferred from electronic annotation
NAS - non-traceable author statement
ND - no biological data available
TAS - traceable author statement
NR - not recorded

Should we trust electronic annotations? PMID: 15960829

http://www.geneontology.org/GO.indices.shtml

ec2go mapping
!version: $Revision: 1.67 $
!date: $Date: 2008/01/21 11:29:01 $
!Mapping of GO function_ontology "enzymes" to Enzyme Commission Numbers.
!original mapping by Michael Ashburner, Cambridge.
!This version parsed from function.ontology on 2008/01/15 14:01:16
!by Daniel Barrell, EBI, Hinxton
!
EC:1 > GO:oxidoreductase activity ; GO:0016491
EC:1.1 > GO:oxidoreductase activity, acting on CH-OH group of donors ; GO:0016614
EC:1.1.1 > GO:oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor ; GO:0016616
EC:1.1.1.1 > GO:alcohol dehydrogenase activity ; GO:0004022
EC:1.1.1.10 > GO:L-xylulose reductase activity ; GO:0050038
EC:1.1.1.100 > GO:3-oxoacyl-[acyl-carrier-protein] reductase activity ; GO:0004316
EC:1.1.1.101 > GO:acylglycerone-phosphate reductase activity ; GO:0000140
EC:1.1.1.102 > GO:3-dehydrosphinganine reductase activity ; GO:0047560
EC:1.1.1.103 > GO:L-threonine 3-dehydrogenase activity ; GO:0008743
EC:1.1.1.104 > GO:4-oxoproline reductase activity ; GO:0016617

interpro2go mapping
InterPro is a database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to unknown protein sequences.
!date: 2008/01/15 13:01:24
!Mapping of InterPro entries to GO
!Nicola Mulder, Hinxton
!
InterPro:IPR000003 Retinoid X receptor > GO:DNA binding ; GO:0003677
InterPro:IPR000003 Retinoid X receptor > GO:steroid binding ; GO:0005496
InterPro:IPR000003 Retinoid X receptor > GO:regulation of transcription, DNA-dependent ; GO:0006355
InterPro:IPR000003 Retinoid X receptor > GO:nucleus ; GO:0005634
InterPro:IPR000005 Helix-turn-helix, AraC type > GO:transcription factor activity ; GO:0003700
InterPro:IPR000005 Helix-turn-helix, AraC type > GO:intracellular ; GO:0005622
InterPro:IPR000006 Metallothionein, vertebrate > GO:metal ion binding ; GO:0046872
InterPro:IPR000013 Peptidase M7, snapalysin > GO:extracellular region ; GO:0005576
InterPro:IPR000014 PAS > GO:signal transducer activity ; GO:0004871
InterPro:IPR000015 Fimbrial biogenesis outer membrane usher protein > GO:transporter activity ; GO:0005215
InterPro:IPR000018 P2Y4 purinoceptor > GO:purinergic nucleotide receptor activity, G-protein coupled ; GO:0045028
InterPro:IPR000020 Anaphylatoxin/fibulin > GO:extracellular region ; GO:0005576
InterPro:IPR000021 Hok/gef cell toxic protein > GO:membrane ; GO:0016020
InterPro:IPR000022 Carboxyl transferase > GO:ligase activity ; GO:0016874
InterPro:IPR000023 Phosphofructokinase > GO:6-phosphofructokinase activity ; GO:0003872
InterPro:IPR000025 Melatonin receptor > GO:integral to membrane ; GO:0016021
InterPro:IPR000026 Guanine-specific ribonuclease N1 and T1 > GO:endoribonuclease activity ; GO:0004521
InterPro:IPR000028 Chloroperoxidase > GO:peroxidase activity ; GO:0004601

Manual annotation appears in AmiGO.
Manual and electronic annotation appear in QuickGO.

Many species groups annotate. (Clark et al., 2005)
We see the research of one function across all species.

Exercise: Search for your favourite gene and see if the annotation is electronic or manual.
http://www.ebi.ac.uk/ego/

Submit new GO terms: http://www.geneontology.org/

GO slims
[Figure: Clark et al., 2005; relationships shown: is_a, part_of]

Whole genome analysis (J. D. Munkvold et al., 2004)

…analysis of high-throughput data according to GO
[Figure: the microarray gene tree over time, attacked v. control, labelled with: Puparial adhesion; Molting cycle hemocyanin; Amino acid catabolism; Lipid metabolism; Peptidase activity; Protein catabolism; Immune response; Toll regulated genes. Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.]

Making Slims: OBO-Edit
Reapplying slimmed ontology to annotations: AmiGO http://amigo.geneontology.org/
Converting IDs: PICR http://www.ebi.ac.uk/Tools/picr/
GOOSE http://www.berkeleybop.org/goose

2006 Consortium Meeting, St. Croix, U.S. Virgin Islands, March 30 - April 3, 2006

E. coli hub
http://www.geneontology.org
Reactome
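
Appendix sketch: reading the mapping files
The ec2go and interpro2go excerpts in this talk share one line shape: an external identifier, an optional entry name, ">", "GO:" plus the term name, ";", and the GO ID, with "!" starting a comment line. Below is a minimal Python sketch of reading that shape. The function name parse_mapping and the SAMPLE list are illustrative assumptions and not part of any GO Consortium tool; the real files may contain cases this sketch does not handle.

```python
def parse_mapping(lines):
    """Group GO terms by external ID (hypothetical helper, not a GO tool).

    Assumed line shape, taken from the slide excerpts:
        <external id> [entry name] > GO:<term name> ; <GO ID>
    Lines starting with '!' are treated as comments and skipped.
    """
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue
        # Split the external side from the GO side at the first " > GO:".
        left, sep, right = line.partition(" > GO:")
        if not sep or not left:
            continue  # not a mapping line
        # The GO ID follows the last " ; "; everything before it is the term name.
        term, sep, go_id = right.rpartition(" ; ")
        if not sep:
            continue
        external_id = left.split()[0]  # drop the optional entry name
        mapping.setdefault(external_id, []).append((go_id, term))
    return mapping


# Sample lines copied from the ec2go and interpro2go excerpts in the slides.
SAMPLE = [
    '!Mapping of GO function_ontology "enzymes" to Enzyme Commission Numbers.',
    "EC:1.1.1.1 > GO:alcohol dehydrogenase activity ; GO:0004022",
    "InterPro:IPR000003 Retinoid X receptor > GO:DNA binding ; GO:0003677",
    "InterPro:IPR000003 Retinoid X receptor > GO:nucleus ; GO:0005634",
]

result = parse_mapping(SAMPLE)
for external_id, terms in result.items():
    print(external_id, terms)
```

Annotations derived from mappings like these are the electronic kind (evidence code IEA), so a lookup of this sort is one way to see where an electronic annotation for an enzyme or protein family might have come from.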