GO and OBO: an introduction
• What is the Gene Ontology?
• What is OBO?
• OBO-Edit demo & practical
Jane Lomax EMBL-EBI

Gene Ontology
• Built for a very specific purpose: “annotation of genes and proteins in genomic and protein databases”
• Applicable to all species

Evolution of GO
• Original GO created in 2000
• Three databases involved:
  – FlyBase (Drosophila)
  – MGI (Mouse)
  – SGD (S. cerevisiae)
• Used immediately
• Later databases:
  – TAIR (Arabidopsis)
  – TIGR (microbes including prokaryotes)
  – SWISS-PROT (several thousand species inc. human)
  – PSU (P. falciparum)
• Recent additions:
  – ZFIN (zebrafish)
  – PAMGO (plant pathogens)
• GO development traditionally annotation-driven
  – development directed by use
• Terms added as new species annotated
• Terms added on an as-needed basis
• Developed by an international consortium of biologists and computer scientists
  – members from individual databases
  – central office at EBI
• Development involves collaboration with domain experts from different biological fields
  – also formal ontologists
• Resulted in ‘organic’ structure, little formality
• Ontological formality added subsequently
  – philosophical and logical

Growth of GO
[Figure: GO term history 2001 - 2007. Number of terms (axis 0 to 30000) against date (Jan-01 to Jan-07); series: defined terms, undefined terms, obsolete.]

How does GO work?
What information might we want to capture about a gene product?
• What does the gene product do?
• Where and when does it act?
• Why does it perform these activities?

GO structure
• GO terms divided into three parts:
  – cellular component
  – molecular function
  – biological process

Cellular Component
• where a gene product acts
• Enzyme complexes in the component ontology refer to places, not activities.

Molecular Function
• activities or “jobs” of a gene product
• Examples:
  – glucose-6-phosphate isomerase activity
  – insulin binding
  – insulin receptor activity
  – drug transporter activity
• A gene product may have several functions; a function term refers to a single reaction or activity, not a gene product.
• Sets of functions make up a biological process.

Biological Process
• a commonly recognized series of events
• Examples:
  – cell division
  – transcription
  – regulation of gluconeogenesis
  – limb development
  – courtship behavior

Ontology Structure
• Terms are linked by two relationships:
  – is-a
  – part-of
• Ontologies are structured as a hierarchical directed acyclic graph (DAG)
• Terms can have more than one parent and zero, one or more children
[Figure: Directed Acyclic Graph (DAG), multiple parentage allowed. Node labels: cell, membrane, mitochondrial membrane, chloroplast, chloroplast membrane; edge labels: is-a, part-of.]

Open Biomedical Ontologies (OBO)
• GO is a member of OBO
• An umbrella project for grouping different ontologies in the biological/medical field
  – a repository for ontologies with a defined set of standards
• Available from a single source: http://obo.sourceforge.net/

Why do we need OBO?
• GO covers a small area of biology:
  – molecular function of a protein
  – biological function of a protein
  – cellular location of a protein
• Lots of other aspects also need to be captured, e.g.:
  – phenotype
  – anatomy
  – genomic
  – taxonomy
• Many groups develop their own ontologies
  – e.g. plant ontology, anatomies for specific organisms
• No standardisation of ontologies with respect to:
  – format
  – scope
  – relationships
• No way of knowing whether such ontologies already exist
• No mechanism of distribution for other groups
• Creating ontologies takes a lot of work
  – makes sense to reuse existing ontologies where possible
• Improves data integration where a small set of ontologies is used
• Allows ontologies to be made available from a single place
• Ultimate aim: a complete set of integrated ontologies completely covering the biomedical domain

OBO requirements
To be part of OBO, ontologies must:
• Be open, usable by all without any constraint
• Be in a common shared syntax
• Not overlap with other ontologies in OBO
• Have a unique identifier space
• Include text definitions of their terms

OBO requirements: open
• Ontologies can be used by anyone without any constraints, except:
  – original authors must be acknowledged
  – an edited version cannot be released under the same name

OBO requirements: syntax
• Usually the OBO format, same as primary GO format
  – and adaptations of OBO format
• Also accept OWL (Web Ontology Language) format
• Allows the same tools to be applied, facilitating shared software implementations

Anatomy of an OBO term
  unique ID      id: GO:0006094
  term name      name: gluconeogenesis
  ontology       namespace: process
  definition     def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]
  synonym        exact_synonym: glucose biosynthesis
  database ref   xref_analog: MetaCyc:GLUCONEO-PWY
  parentage      is_a: GO:0006006
                 is_a: GO:0006092

OBO requirements: overlapping
• Ontologies can (and should) overlap partially, but large overlap should be avoided
• Idea is that terms from different ontologies can be combined to form new terms
• Striving for accepted standards rather than competition

OBO requirements: id space
• So, for example, the GO identifier is “GO”:
  – no other OBO ontology could use this id space
• Prevents problems where multiple ontologies are used together

OBO requirements: relationships
• In addition, OBO includes an ontology of relationships
  – all ontologies should use these definitions of relationships
• For example:
  – part_of
  – develops_from
  – regulates

What’s available
• demo: http://obo.sourceforge.net/

Editing ontologies
• GO is edited using OBO-Edit
  – stand-alone Java application
  – available for all platforms
  – browse, create or edit any ontology in OBO format

OBO-Edit demo
• Browsing ontologies
  – loading ontologies (including loading multiple ontologies)
  – graph viewer
  – reasoner/single relationship views
  – searching/filtering/rendering
  – help
• Creating/editing ontologies
  – creating a new ontology
  – adding terms
  – copying/moving/deleting terms
  – adding definitions, dbxrefs etc.
  – verification plugin
  – saving ontologies
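
To make the "Anatomy of an OBO term" slide more concrete, here is a minimal Python sketch that reads the gluconeogenesis term as plain "tag: value" lines and collects its is_a parents. The helper name parse_stanza is hypothetical and not part of OBO-Edit or any GO tool. The sketch assumes one tag per line, exactly as shown on the slide, and ignores features of real OBO files such as quoted definitions, comments and escape characters.

```python
from collections import defaultdict

# The example term from the slide, one "tag: value" pair per line.
STANZA = """\
id: GO:0006094
name: gluconeogenesis
namespace: process
def: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol. [http://cancerweb.ncl.ac.uk/omd/index.html]
exact_synonym: glucose biosynthesis
xref_analog: MetaCyc:GLUCONEO-PWY
is_a: GO:0006006
is_a: GO:0006092
"""


def parse_stanza(text):
    """Hypothetical helper: map each tag to the list of its values.

    Repeated tags (such as is_a) are kept in order, because a term
    may have more than one parent.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so values like "GO:0006006" stay intact.
        tag, _, value = line.partition(":")
        fields[tag.strip()].append(value.strip())
    return dict(fields)


term = parse_stanza(STANZA)
parents = term["is_a"]
print(term["id"][0], term["name"][0], "has", len(parents), "is_a parents:", parents)
```

The two is_a lines illustrate the multiple parentage that the DAG structure allows: a single term can sit under more than one parent, which is why the sketch stores every tag as a list rather than a single value.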